A learning algorithm with a gradient normalization and a learning rate adaptation for the mini-batch type learning

The development of a high-performance optimization algorithm to solve the learning problem of the neural networks is strongly demanded with the advance of the deep learning. The learning algorithms with gradient normalization mechanisms have been investigated, and their effectiveness has been shown....

Celý popis

Uložené v:
Podrobná bibliografia
Vydané v:2017 56th Annual Conference of the Society of Instrument and Control Engineers of Japan (SICE) s. 811 - 816
Hlavní autori: Ito, Daiki, Okamoto, Takashi, Koakutsu, Seiichi
Médium: Konferenčný príspevok..
Jazyk:English
Vydavateľské údaje: The Society of Instrument and Control Engineers - SICE 01.09.2017
Predmet:
On-line prístup:Získať plný text
Tagy: Pridať tag
Žiadne tagy, Buďte prvý, kto otaguje tento záznam!
Popis
Shrnutí:The development of a high-performance optimization algorithm to solve the learning problem of the neural networks is strongly demanded with the advance of the deep learning. The learning algorithms with gradient normalization mechanisms have been investigated, and their effectiveness has been shown. In the learning algorithms, the adaptation of the learning rate is very important issue. The learning algorithms of the neural networks are classified into the batch learning and the mini-batch learning. In the learning with vast training data, the mini-batch type learning is often used due to the limitation of memory size and the computational cost. The mini-batch type learning algorithms with gradient normalization mechanisms have been investigated. However, the adaptation of the learning rate in the mini-batch type learning algorithm with the gradient normalization has not been investigated well. This study proposes to introduce a new learning rate adaptation mechanism based on sign variation of gradient to a mini-batch type learning algorithm with the gradient normalization. The effectiveness of the proposed algorithm is verified through applications to a learning problem of the multi-layered neural networks and a learning problem of the convolutional neural networks.
DOI:10.23919/SICE.2017.8105654