Hessian regularization of deep neural networks: A novel approach based on stochastic estimators of Hessian trace

Bibliographic Details
Published in: Neurocomputing (Amsterdam), Vol. 536, pp. 13-20
Main Authors: Liu, Yucong; Yu, Shixing; Lin, Tong
Format: Journal Article
Language: English
Published: Elsevier B.V., 01.06.2023
ISSN: 0925-2312, 1872-8286
Description
Summary: Highlights:
• Connecting the Hessian trace with a generalization error bound.
• Flat minima of the loss landscape and stability analysis in dynamical systems.
• An efficient Hessian trace regularization algorithm with Dropout.
• Performance comparison on vision and language tasks.

In this paper, we develop a novel regularization method for deep neural networks by penalizing the trace of the Hessian. This regularizer is motivated by a recent bound on the generalization error. We explain its benefits in finding flat minima and avoiding Lyapunov instability in dynamical systems. We adopt the Hutchinson method, a classical unbiased estimator of the trace of a matrix, and further accelerate its computation with a Dropout scheme. Experiments demonstrate that our method outperforms existing regularizers and data augmentation methods such as Jacobian regularization, Confidence Penalty, Label Smoothing, Cutout, and Mixup. The code is available at https://github.com/Dean-lyc/Hessian-Regularization.
DOI: 10.1016/j.neucom.2023.03.017
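
For illustration: the Hutchinson method named in the summary estimates the trace of a matrix H as tr(H) = E[z^T H z], where z has i.i.d. Rademacher (±1) entries, so each sample needs only a Hessian-vector product rather than the full Hessian. Below is a minimal PyTorch-style sketch of this estimator using Pearlmutter's Hessian-vector-product trick; the function name and hyperparameters are illustrative assumptions, and the paper's Dropout-based acceleration is not reproduced here.

```python
# A minimal sketch of Hutchinson's estimator for the Hessian trace of a loss,
# assuming a differentiable PyTorch model. Names and defaults are illustrative;
# this is not the authors' released implementation.
import torch

def hutchinson_hessian_trace(loss, params, num_samples=1):
    """Unbiased estimate of tr(H), where H is the Hessian of `loss`
    w.r.t. `params`, via E[z^T H z] with Rademacher-distributed z."""
    # First-order gradients, with create_graph=True so we can differentiate again.
    grads = torch.autograd.grad(loss, params, create_graph=True)
    trace_est = 0.0
    for _ in range(num_samples):
        # Rademacher probe vectors: entries are +1 or -1 with equal probability.
        zs = [torch.randint_like(p, high=2) * 2.0 - 1.0 for p in params]
        # Pearlmutter's trick: the gradient of (g . z) w.r.t. params equals H z.
        hvps = torch.autograd.grad(
            sum((g * z).sum() for g, z in zip(grads, zs)),
            params, retain_graph=True, create_graph=True,
        )
        trace_est = trace_est + sum((z * h).sum() for z, h in zip(zs, hvps))
    return trace_est / num_samples

# Hypothetical usage as a regularizer (reg_coef is an assumed hyperparameter):
#   loss = criterion(model(x), y)
#   total = loss + reg_coef * hutchinson_hessian_trace(loss, list(model.parameters()))
#   total.backward()
```

Because the estimate is built with create_graph=True, it remains differentiable and can be added directly to the training loss; the cost is extra backward passes per step, which the paper's Dropout scheme (not shown here) is designed to reduce.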