Hessian regularization of deep neural networks: A novel approach based on stochastic estimators of Hessian trace
| Published in: | Neurocomputing (Amsterdam), Vol. 536, pp. 13-20 |
|---|---|
| Main Authors: | , , |
| Format: | Journal Article |
| Language: | English |
| Published: | Elsevier B.V., 01.06.2023 |
| ISSN: | 0925-2312, 1872-8286 |
Summary:

- Connecting the Hessian trace with a generalization error bound.
- Flat minima of the loss landscape and stability analysis in dynamical systems.
- An efficient Hessian-trace regularization algorithm based on Dropout.
- Performance comparison on vision and language tasks.

In this paper, we develop a novel regularization method for deep neural networks by penalizing the trace of the Hessian. This regularizer is motivated by a recent bound on the generalization error. We explain its benefits in finding flat minima and its role in Lyapunov stability analysis of dynamical systems. We adopt the Hutchinson method, a classical unbiased estimator of the trace of a matrix, and further accelerate its computation with a Dropout scheme. Experiments demonstrate that our method outperforms existing regularizers and data augmentation methods such as the Jacobian regularizer, Confidence Penalty, Label Smoothing, Cutout, and Mixup. The code is available at https://github.com/Dean-lyc/Hessian-Regularization.
| DOI: | 10.1016/j.neucom.2023.03.017 |
|---|---|
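The Hutchinson method named in the abstract can be sketched in a few lines. The sketch below is a minimal NumPy illustration of the classical estimator on an explicit symmetric matrix, not the authors' implementation: in their setting the matrix-vector product would be a Hessian-vector product obtained by automatic differentiation, and the paper's Dropout acceleration is not shown here.

```python
import numpy as np

def hutchinson_trace(matvec, dim, n_samples=1000, seed=None):
    """Estimate tr(A) given only matrix-vector products v -> A @ v.

    Uses Rademacher probe vectors: E[v^T A v] = tr(A) when the entries
    of v are i.i.d. +-1, which makes the estimator unbiased. For a
    neural network, matvec would be a Hessian-vector product.
    """
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(n_samples):
        v = rng.choice([-1.0, 1.0], size=dim)  # Rademacher probe vector
        total += v @ matvec(v)                 # one sample of v^T A v
    return total / n_samples

# Toy check on an explicit symmetric matrix (a stand-in for a Hessian).
rng = np.random.default_rng(0)
B = rng.standard_normal((20, 20))
A = B + B.T  # symmetric, like a Hessian
est = hutchinson_trace(lambda v: A @ v, dim=20, n_samples=5000, seed=1)
print(abs(est - np.trace(A)))  # small stochastic error
```

The variance of the estimate shrinks as 1/n_samples; in practice a handful of probes per training step is typical, since each probe costs one Hessian-vector product (roughly the price of an extra backward pass).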