Hessian regularization of deep neural networks: A novel approach based on stochastic estimators of Hessian trace
| Published in: | Neurocomputing (Amsterdam), Vol. 536, pp. 13–20 |
|---|---|
| Main authors: | , , |
| Format: | Journal Article |
| Language: | English |
| Published: | Elsevier B.V., 01.06.2023 |
| Subjects: | |
| ISSN: | 0925-2312, 1872-8286 |
| Online access: | Get full text |
| Summary: | Highlights: •Connecting the Hessian trace with a generalization error bound. •Flat minima of the loss landscape and stability analysis in dynamical systems. •An efficient Hessian trace regularization algorithm with Dropout. •Performance comparison on vision and language tasks.
In this paper, we develop a novel regularization method for deep neural networks by penalizing the trace of the Hessian. This regularizer is motivated by a recent generalization error bound. We explain its benefits in finding flat minima and relate it to Lyapunov stability in dynamical systems. We adopt the Hutchinson method, a classical unbiased estimator of the trace of a matrix, and further accelerate its calculation using a Dropout scheme. Experiments demonstrate that our method outperforms existing regularizers and data augmentation methods, such as Jacobian, Confidence Penalty, Label Smoothing, Cutout, and Mixup. The code is available at https://github.com/Dean-lyc/Hessian-Regularization. |
|---|---|
| DOI: | 10.1016/j.neucom.2023.03.017 |
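The abstract's key ingredient, the Hutchinson estimator, approximates the trace of a matrix from matrix-vector products alone: tr(A) ≈ (1/m) Σᵢ vᵢᵀ A vᵢ with random Rademacher probes vᵢ. The sketch below is not the paper's implementation; it illustrates the estimator on a plain symmetric NumPy matrix, whereas the paper applies it to Hessian-vector products of a network's loss (obtainable via automatic differentiation) and accelerates it with Dropout. All names here (`hutchinson_trace`, `matvec`) are illustrative.

```python
import numpy as np

def hutchinson_trace(matvec, dim, num_samples=1000, rng=None):
    """Estimate tr(A) as the mean of v^T A v over Rademacher probes v.

    `matvec` computes A @ v; in the paper's setting this would be a
    Hessian-vector product, so A itself is never materialized.
    """
    rng = np.random.default_rng(rng)
    total = 0.0
    for _ in range(num_samples):
        # Rademacher probe: i.i.d. entries in {-1, +1}
        v = rng.integers(0, 2, size=dim) * 2.0 - 1.0
        total += v @ matvec(v)
    return total / num_samples

# Illustration on an explicit symmetric matrix; the estimator only ever
# calls the matrix-vector product.
rng = np.random.default_rng(0)
B = rng.standard_normal((10, 10))
A = (B + B.T) / 2.0
est = hutchinson_trace(lambda v: A @ v, dim=10, num_samples=2000, rng=1)
print(est, np.trace(A))
```

Note that for Rademacher probes the estimate is exact on diagonal matrices (vᵢ² = 1), and its variance depends only on the off-diagonal entries, which is one reason this probe distribution is preferred over Gaussian vectors.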