Why RELU Units Sometimes Die: Analysis of Single-Unit Error Backpropagation in Neural Networks

Bibliographic Details
Published in: Conference Record - Asilomar Conference on Signals, Systems, & Computers, pp. 864-868
Main Authors: Douglas, Scott C.; Yu, Jiutian
Format: Conference Proceeding
Language: English
Published: IEEE, 01.10.2018
ISSN: 2576-2303
Online Access: Full text
Description
Summary: Neural networks in machine learning increasingly use rectified linear units (ReLUs) in early processing layers for better performance. Training these structures sometimes results in "dying ReLU units" with near-zero outputs. We first explore this condition via simulation using the CIFAR-10 dataset and variants of two popular convolutional neural network architectures. Our explorations show that the output activation probability Pr[y > 0] is generally less than 0.5 at system convergence for layers that do not employ skip connections, and this activation probability tends to decrease as one progresses from the input layer to the output layer. Employing a simplified model of a single ReLU unit trained by a variant of error backpropagation, we then perform a statistical convergence analysis to explore the model's evolutionary behavior. Our analysis describes the potentially slower convergence speeds of dying ReLU units, an issue that can occur regardless of how the weights are initialized.
DOI: 10.1109/ACSSC.2018.8645556
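
As an illustration of the dying-ReLU behavior described in the summary, the following minimal Python/NumPy sketch trains a single ReLU unit y = max(0, w·x + b) with plain stochastic gradient descent on a squared-error loss and tracks the empirical activation probability Pr[y > 0]. The data distribution, target, learning rate, and loss here are illustrative assumptions, not the paper's exact single-unit model or its variant of error backpropagation; the sketch only shows the mechanism by which an inactive unit receives zero gradient and can stop updating.

# Minimal sketch (assumed setup, not the paper's exact model): a single
# ReLU unit trained by SGD on a squared-error loss, while tracking the
# empirical activation probability Pr[y > 0].
import numpy as np

rng = np.random.default_rng(0)
d, n_steps, lr = 16, 20000, 0.01

w = rng.normal(scale=1.0 / np.sqrt(d), size=d)  # weight initialization
b = 0.0

active = []                                      # 1 if y > 0 at each step
for t in range(n_steps):
    x = rng.normal(size=d)                       # input sample (assumed Gaussian)
    target = 0.1 * x[0]                          # arbitrary small target (assumption)
    pre = w @ x + b                              # pre-activation
    y = max(pre, 0.0)                            # ReLU output
    active.append(1.0 if y > 0 else 0.0)

    # Backpropagation through the ReLU: the gradient is zero whenever the
    # unit is inactive, so the weights stop moving once the unit "dies".
    if y > 0:
        err = y - target
        w -= lr * err * x
        b -= lr * err

print("empirical Pr[y > 0] over last 1000 steps:", np.mean(active[-1000:]))

Running the sketch and printing the final empirical Pr[y > 0] gives a rough sense of how a unit's activation probability can settle well below 0.5, in the spirit of the layer-wise observations reported in the abstract; the actual values depend entirely on the assumed data and target.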