Research on a learning rate with energy index in deep learning

The stochastic gradient descent algorithm (SGD) is the main optimization method in deep learning, and its performance depends critically on how the learning rate is tuned over time. In this paper, we propose a novel energy-index-based optimization method (EIOM) that automatically adjusts the learning rate during backpropagation. Since a frequently occurring feature is more important than a rarely occurring one, we update features to different extents according to their frequencies. We first define an energy neuron model and then design an energy index that describes the frequency of a feature; the learning rate is then taken as a hyperparameter function of the energy index. To evaluate EIOM empirically, we compare different optimizers on three popular machine learning models: logistic regression, a multilayer perceptron, and a convolutional neural network. The experiments demonstrate the promising performance of EIOM compared with other optimization algorithms.
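The record carries only the abstract, so the paper's actual energy neuron model and energy index are not reproduced here. To make the idea concrete, the sketch below implements a generic per-parameter scheme in which each weight accumulates an "energy" statistic and the learning rate is taken as a function of that energy, so features of different frequencies are updated to different extents. The class name EnergyIndexSGD, the squared-gradient accumulator, and the inverse-square-root scaling are all assumptions for illustration (under them the rule collapses to an AdaGrad-style update), not the paper's definitions; in particular, the abstract does not say whether frequent features should receive larger or smaller steps under EIOM.

```python
import numpy as np

class EnergyIndexSGD:
    """Illustrative per-parameter SGD; NOT the paper's EIOM.

    Each weight keeps a running 'energy' that grows as its feature
    occurs (here: accumulated squared gradient, an assumption), and
    the step size is a function of that energy, so the update
    magnitude depends on feature frequency.
    """

    def __init__(self, params, base_lr=0.01, eps=1e-8):
        self.params = params                  # list of np.ndarray weight tensors
        self.base_lr = base_lr                # global learning-rate hyperparameter
        self.eps = eps                        # numerical safety term
        self.energy = [np.zeros_like(p) for p in params]

    def step(self, grads):
        for p, g, e in zip(self.params, grads, self.energy):
            e += g * g                        # assumed energy index: per-weight gradient energy
            lr = self.base_lr / (np.sqrt(e) + self.eps)  # assumed function of the energy index
            p -= lr * g                       # direction of the scaling is a placeholder
```

Usage would mirror plain SGD: compute grads by backpropagation, then call opt.step(grads) once per mini-batch.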


Detailed bibliography
Published in: Neural networks, Volume 110, pp. 225-231
Main authors: Zhao, Huizhen; Liu, Fuxian; Zhang, Han; Liang, Zhibing
Format: Journal Article
Language: English
Published: United States: Elsevier Ltd, 01.02.2019
Subjects: Deep learning; Energy index; Learning rate; Stochastic gradient algorithm; Convolutional neural network
ISSN: 0893-6080; EISSN: 1879-2782
Author Zhao, Huizhen
Liu, Fuxian
Zhang, Han
Liang, Zhibing
Author_xml – sequence: 1
  givenname: Huizhen
  surname: Zhao
  fullname: Zhao, Huizhen
  email: margeryzhao@outlook.com
– sequence: 2
  givenname: Fuxian
  surname: Liu
  fullname: Liu, Fuxian
– sequence: 3
  givenname: Han
  surname: Zhang
  fullname: Zhang, Han
– sequence: 4
  givenname: Zhibing
  surname: Liang
  fullname: Liang, Zhibing
BackLink https://www.ncbi.nlm.nih.gov/pubmed/30599419 (View this record in MEDLINE/PubMed)
ContentType Journal Article
Copyright 2018 Elsevier Ltd
Copyright © 2018 Elsevier Ltd. All rights reserved.
DOI 10.1016/j.neunet.2018.12.009
DatabaseName PubMed
MEDLINE - Academic
Discipline Computer Science
EISSN 1879-2782
EndPage 231
ExternalDocumentID 30599419
S089360801830340X
Genre Journal Article
ISICitedReferencesCount 29
ISICitedReferencesURI http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=000456713300019&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
ISSN 0893-6080
1879-2782
IsPeerReviewed true
IsScholarly true
Keywords Deep learning
Energy index
Stochastic gradient algorithm
Convolutional neural network
Learning rate
Language English
License Copyright © 2018 Elsevier Ltd. All rights reserved.
PMID 30599419
PQID 2162774683
PQPubID 23479
PageCount 7
PublicationDate 2019-02-01
PublicationPlace United States
PublicationTitle Neural networks
PublicationTitleAlternate Neural Netw
PublicationYear 2019
Publisher Elsevier Ltd
StartPage 225
Title Research on a learning rate with energy index in deep learning
URI https://dx.doi.org/10.1016/j.neunet.2018.12.009
https://www.ncbi.nlm.nih.gov/pubmed/30599419
https://www.proquest.com/docview/2162774683
Volume 110