Constructive Deep ReLU Neural Network Approximation
| Published in: | Journal of Scientific Computing, Vol. 90, No. 2, p. 75 |
|---|---|
| Main authors: | , , |
| Format: | Journal Article |
| Language: | English |
| Publication details: | New York: Springer US, 01.02.2022 (Springer Nature B.V.) |
| ISSN: | 0885-7474, 1573-7691 |
| DOI: | 10.1007/s10915-021-01718-2 |
| Summary: | We propose an efficient, deterministic algorithm for constructing exponentially convergent deep neural network (DNN) approximations of multivariate, analytic maps $f:[-1,1]^K\to\mathbb{R}$. We address in particular networks with the rectified linear unit (ReLU) activation function; similar results and proofs apply for many other popular activation functions. The algorithm is based on collocating $f$ in deterministic families of grid points with small Lebesgue constants, and on a-priori (i.e., “offline”) emulation of a spectral basis with DNNs to prescribed fidelity. Assuming availability of $N$ function values of a possibly corrupted, numerical approximation $\breve{f}$ of $f$ in $[-1,1]^K$ and a bound on $\|f-\breve{f}\|_{L^\infty([-1,1]^K)}$, we provide an explicit, computational construction of a ReLU DNN which attains accuracy $\varepsilon$ (depending on $N$ and $\|f-\breve{f}\|_{L^\infty([-1,1]^K)}$) uniformly with respect to the inputs. For analytic maps $f:[-1,1]^K\to\mathbb{R}$, we prove exponential convergence of the expression and generalization errors of the constructed ReLU DNNs. Specifically, for every target accuracy $\varepsilon\in(0,1)$ there exists $N$, depending also on $f$, such that the error of the construction algorithm with $N$ evaluations of $\breve{f}$ as input, measured in the norm of $L^\infty([-1,1]^K;\mathbb{R})$, is smaller than $\varepsilon$ up to an additive data-corruption bound $\|f-\breve{f}\|_{L^\infty([-1,1]^K)}$ multiplied by a factor growing slowly with $1/\varepsilon$; the number of nonzero DNN weights grows polylogarithmically with respect to $1/\varepsilon$. The algorithmic construction of the ReLU DNNs which realize the approximations is explicit and deterministic in terms of the function values of $\breve{f}$ in tensorized Clenshaw–Curtis grids in $[-1,1]^K$. We illustrate the proposed methodology by a constructive algorithm for (offline) computation of posterior expectations in Bayesian PDE inversion. |
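In symbols, the error guarantee stated in the summary can be written as follows; the network symbol $\Phi_N^{\varepsilon}$, the slowly growing factor $C(\varepsilon)$, and the exponent $\theta>0$ are shorthand introduced here, not notation taken from the record.

```latex
% Error bound paraphrased from the abstract. \Phi_N^\varepsilon denotes the
% constructed ReLU DNN, C(\varepsilon) the factor growing slowly with
% 1/\varepsilon, and \theta > 0 the polylogarithmic exponent; all three
% symbols are shorthand assumed here.
\[
  \bigl\| f - \Phi_N^{\varepsilon} \bigr\|_{L^\infty([-1,1]^K;\mathbb{R})}
    \le \varepsilon
      + C(\varepsilon)\, \bigl\| f - \breve{f} \bigr\|_{L^\infty([-1,1]^K)},
  \qquad
  \mathrm{size}\bigl(\Phi_N^{\varepsilon}\bigr)
    = \mathcal{O}\bigl( (\log(1/\varepsilon))^{\theta} \bigr).
\]
```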
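The collocation step underlying the construction is classical Chebyshev interpolation in Clenshaw–Curtis points. The following minimal NumPy sketch (our illustration, restricted to one dimension and omitting the paper's ReLU emulation of the spectral basis) shows the exponential decay of the sup-norm interpolation error for an analytic map; the test function `f` is an arbitrary choice, not one from the paper.

```python
import numpy as np

def clenshaw_curtis_points(n):
    """n + 1 Chebyshev-Lobatto (Clenshaw-Curtis) points in [-1, 1]."""
    return np.cos(np.pi * np.arange(n + 1) / n)

def chebyshev_coeffs(values):
    """Coefficients c_k such that sum_k c_k T_k(x) interpolates `values`
    at the Clenshaw-Curtis points (direct O(n^2) cosine transform)."""
    n = len(values) - 1
    j = np.arange(n + 1)
    c = np.empty(n + 1)
    for k in range(n + 1):
        w = np.cos(np.pi * k * j / n)
        w[0] *= 0.5          # trapezoidal weights: endpoints count half
        w[-1] *= 0.5
        c[k] = (2.0 / n) * np.dot(w, values)
    c[0] *= 0.5              # first/last coefficients are also halved
    c[-1] *= 0.5
    return c

def eval_chebyshev(c, x):
    """Evaluate sum_k c_k T_k(x) by the Clenshaw recurrence."""
    b1, b2 = 0.0, 0.0
    for ck in c[:0:-1]:      # c_n, ..., c_1
        b1, b2 = ck + 2.0 * x * b1 - b2, b1
    return c[0] + x * b1 - b2

# Analytic test map on [-1, 1]: the sup-norm error decays exponentially in n.
f = lambda x: np.exp(np.sin(3.0 * x))
for n in (4, 8, 16, 32):
    c = chebyshev_coeffs(f(clenshaw_curtis_points(n)))
    xs = np.linspace(-1.0, 1.0, 2001)
    err = max(abs(f(x) - eval_chebyshev(c, x)) for x in xs)
    print(f"n = {n:2d}   sup-norm error ~ {err:.2e}")
```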