Constructive Deep ReLU Neural Network Approximation

Bibliographic Details
Published in: Journal of Scientific Computing, Vol. 90, No. 2, p. 75
Main authors: Herrmann, Lukas; Opschoor, Joost A. A.; Schwab, Christoph
Format: Journal Article
Language: English
Publication details: New York: Springer US, 01.02.2022 (Springer Nature B.V.)
ISSN: 0885-7474, 1573-7691
Description
Summary: We propose an efficient, deterministic algorithm for constructing exponentially convergent deep neural network (DNN) approximations of multivariate, analytic maps $f\colon[-1,1]^K\to\mathbb{R}$. We address in particular networks with the rectified linear unit (ReLU) activation function; similar results and proofs apply to many other popular activation functions. The algorithm is based on collocating $f$ in deterministic families of grid points with small Lebesgue constants, and on a-priori (i.e., "offline") emulation of a spectral basis with DNNs to prescribed fidelity. Assuming availability of $N$ function values of a possibly corrupted numerical approximation $\breve{f}$ of $f$ in $[-1,1]^K$, and of a bound on $\|f-\breve{f}\|_{L^\infty([-1,1]^K)}$, we provide an explicit, computational construction of a ReLU DNN which attains accuracy $\varepsilon$ (depending on $N$ and $\|f-\breve{f}\|_{L^\infty([-1,1]^K)}$) uniformly with respect to the inputs. For analytic maps $f\colon[-1,1]^K\to\mathbb{R}$, we prove exponential convergence of the expression and generalization errors of the constructed ReLU DNNs. Specifically, for every target accuracy $\varepsilon\in(0,1)$ there exists $N$, depending also on $f$, such that the error of the construction algorithm with $N$ evaluations of $\breve{f}$ as input, measured in the $L^\infty([-1,1]^K;\mathbb{R})$ norm, is smaller than $\varepsilon$ up to an additive data-corruption bound $\|f-\breve{f}\|_{L^\infty([-1,1]^K)}$ multiplied by a factor that grows slowly with $1/\varepsilon$; the number of non-zero DNN weights grows polylogarithmically with respect to $1/\varepsilon$. The algorithmic construction of the ReLU DNNs which realize the approximations is explicit and deterministic in terms of the function values of $\breve{f}$ at tensorized Clenshaw–Curtis grids in $[-1,1]^K$. We illustrate the proposed methodology with a constructive algorithm for (offline) computation of posterior expectations in Bayesian PDE inversion.
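The key classical ingredient of the construction is polynomial collocation at Clenshaw–Curtis points, whose Lebesgue constant grows only logarithmically in the number of nodes. Below is a minimal Python sketch of this one-dimensional collocation step, using the standard barycentric formula for Chebyshev points of the second kind; it illustrates the collocation ingredient under these standard choices, not the paper's ReLU DNN emulation of the spectral basis, and all function names are our own.

```python
import numpy as np

def clenshaw_curtis_nodes(n: int) -> np.ndarray:
    """Clenshaw-Curtis points x_j = cos(pi * j / n), j = 0..n, on [-1, 1]."""
    return np.cos(np.pi * np.arange(n + 1) / n)

def barycentric_weights(n: int) -> np.ndarray:
    """Barycentric weights for these nodes: w_j = (-1)^j, halved at the endpoints."""
    w = (-1.0) ** np.arange(n + 1)
    w[0] *= 0.5
    w[-1] *= 0.5
    return w

def interpolate(x_eval, nodes, values, weights):
    """Evaluate the degree-n interpolant via the second (stable) barycentric formula."""
    x_eval = np.atleast_1d(np.asarray(x_eval, dtype=float))
    diff = x_eval[:, None] - nodes[None, :]
    exact = np.isclose(diff, 0.0)        # evaluation point coincides with a node
    diff[exact] = 1.0                    # dummy value, avoids division by zero
    c = weights / diff
    p = (c @ values) / c.sum(axis=1)
    rows, cols = np.nonzero(exact)
    p[rows] = values[cols]               # return nodal values exactly at the nodes
    return p

# Usage: interpolate an analytic function from N = n + 1 (possibly noisy) samples.
f = lambda x: 1.0 / (1.0 + 25.0 * x**2)  # Runge function, analytic on [-1, 1]
n = 32
xs = clenshaw_curtis_nodes(n)
xq = np.linspace(-1.0, 1.0, 1001)
err = np.max(np.abs(interpolate(xq, xs, f(xs), barycentric_weights(n)) - f(xq)))
print(f"max interpolation error with {n + 1} samples: {err:.2e}")
```

For analytic $f$, the interpolation error in this sketch decays exponentially in $n$, mirroring the exponential expression and generalization rates proved in the paper; in the multivariate setting the paper works with tensorized versions of such grids.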
DOI: 10.1007/s10915-021-01718-2