Hessian regularization of deep neural networks: A novel approach based on stochastic estimators of Hessian trace

Detailed bibliography
Published in: Neurocomputing (Amsterdam), Volume 536, pp. 13-20
Main authors: Liu, Yucong; Yu, Shixing; Lin, Tong
Format: Journal Article
Language: English
Published: Elsevier B.V., 01.06.2023
ISSN: 0925-2312, 1872-8286
Description
Summary:
Highlights:
• Connecting Hessian trace with a generalization error bound.
• Flat minima of loss landscape and stability analysis in dynamical systems.
• Efficient Hessian trace regularization algorithm with Dropout.
• Performance comparison on vision and language tasks.

In this paper, we develop a novel regularization method for deep neural networks by penalizing the trace of the Hessian. This regularizer is motivated by a recent bound on the generalization error. We explain its benefits in finding flat minima and avoiding Lyapunov stability in dynamical systems. We adopt the Hutchinson method, a classical unbiased estimator of the trace of a matrix, and further accelerate its calculation using a Dropout scheme. Experiments demonstrate that our method outperforms existing regularizers and data augmentation methods such as Jacobian regularization, Confidence Penalty, Label Smoothing, Cutout, and Mixup. The code is available at https://github.com/Dean-lyc/Hessian-Regularization.
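As background for the Hutchinson method named in the summary: it estimates the trace of a matrix H as an expectation tr(H) = E[v^T H v] over random probe vectors v with E[v v^T] = I (for example, Rademacher vectors), so a Hessian-trace penalty needs only Hessian-vector products, never the full Hessian. Below is a minimal PyTorch sketch of this estimator; the names estimate_hessian_trace and num_probes are illustrative assumptions, not taken from the paper or its released code, and the paper's Dropout acceleration is not reproduced here.

import torch

def estimate_hessian_trace(loss, params, num_probes=1):
    # First backward pass with create_graph=True so the gradients
    # themselves can be differentiated again (needed for H*v).
    grads = torch.autograd.grad(loss, params, create_graph=True)
    trace_est = 0.0
    for _ in range(num_probes):
        # Rademacher probes: entries are +1 or -1 with equal probability,
        # so E[v v^T] = I and E[v^T H v] = tr(H).
        vs = [torch.randint_like(p, high=2) * 2.0 - 1.0 for p in params]
        # Hessian-vector products: Hv = d(g . v) / d(params).
        # create_graph=True keeps the result differentiable, so the
        # estimate can be added to the training loss and backpropagated.
        hvs = torch.autograd.grad(grads, params, grad_outputs=vs,
                                  create_graph=True)
        trace_est = trace_est + sum((v * hv).sum() for v, hv in zip(vs, hvs))
    return trace_est / num_probes

Schematically, training would then minimize task_loss + lam * estimate_hessian_trace(task_loss, params) for some hypothetical weight lam; each probe costs one extra backward pass, which is why the paper pairs the estimator with a cheaper Dropout scheme.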
DOI: 10.1016/j.neucom.2023.03.017