Research on a learning rate with energy index in deep learning
The stochastic gradient descent algorithm (SGD) is the main optimization solution in deep learning. The performance of SGD depends critically on how learning rates are tuned over time. In this paper, we propose a novel energy index based optimization method (EIOM) to automatically adjust the learning rate in the backpropagation. Since a frequently occurring feature is more important than a rarely occurring feature, we update the features to different extents according to their frequencies. We first define an energy neuron model and then design an energy index to describe the frequency of a feature. The learning rate is taken as a hyperparameter function according to the energy index. To empirically evaluate the EIOM, we investigate different optimizers with three popular machine learning models: logistic regression, multilayer perceptron, and convolutional neural network. The experiments demonstrate the promising performance of the proposed EIOM compared with that of other optimization algorithms.
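The abstract sketches the mechanism: track how often each feature occurs with a per-feature "energy index", and make the learning rate a function of that index, so frequent (more important) features receive larger updates. The record does not reproduce the paper's formulas, so the following is only a minimal sketch of that idea; the moving-average energy update, its `decay` constant, and the mean normalization are illustrative assumptions, not the authors' EIOM. Note that the scaling runs opposite to AdaGrad, which dampens frequently updated parameters.

```python
import numpy as np

def eiom_like_step(w, grad, activity, energy, base_lr=0.01, decay=0.9, eps=1e-8):
    """One SGD step with a frequency-scaled per-feature learning rate.

    Hypothetical reading of the abstract, not the paper's method:
    `energy` is a running estimate of how often each feature fires,
    and higher-energy features take proportionally larger steps.
    """
    # Assumed energy index: exponential moving average of feature activity.
    energy = decay * energy + (1.0 - decay) * activity
    # Frequent features get a larger effective rate, mirroring the claim that
    # "a frequently occurring feature is more important than a rarely occurring feature".
    per_feature_lr = base_lr * energy / (energy.mean() + eps)
    return w - per_feature_lr * grad, energy

# Toy usage: feature 0 always fires, feature 2 fires rarely.
rng = np.random.default_rng(0)
w, energy = rng.normal(size=3), np.zeros(3)
for _ in range(100):
    x = np.array([1.0, rng.random() < 0.5, rng.random() < 0.05], dtype=float)
    grad = (w @ x) * x  # gradient of the quadratic loss 0.5 * (w . x)^2
    w, energy = eiom_like_step(w, grad, activity=(x != 0).astype(float), energy=energy)
print(energy)  # the energy index orders features by how often they occurred
```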
| Published in: | Neural networks, Vol. 110, pp. 225–231 |
|---|---|
| Main Authors: | Zhao, Huizhen; Liu, Fuxian; Zhang, Han; Liang, Zhibing |
| Format: | Journal Article |
| Language: | English |
| Published: | United States: Elsevier Ltd, 01.02.2019 |
| Subjects: | Deep learning; Energy index; Learning rate; Stochastic gradient algorithm; Convolutional neural network |
| ISSN: | 0893-6080 |
| EISSN: | 1879-2782 |
| Online Access: | Get full text |
| Author | Zhang, Han; Liu, Fuxian; Liang, Zhibing; Zhao, Huizhen |
|---|---|
| BackLink | https://www.ncbi.nlm.nih.gov/pubmed/30599419 (View this record in MEDLINE/PubMed) |
| ContentType | Journal Article |
| Copyright | 2018 Elsevier Ltd; Copyright © 2018 Elsevier Ltd. All rights reserved. |
| DOI | 10.1016/j.neunet.2018.12.009 |
| DatabaseName | PubMed; MEDLINE - Academic |
| Discipline | Computer Science |
| EISSN | 1879-2782 |
| EndPage | 231 |
| ExternalDocumentID | 30599419 (PubMed); S089360801830340X (Elsevier PII) |
| Genre | Journal Article |
| ISICitedReferencesCount | 29 |
| ISSN | 0893-6080 1879-2782 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Keywords | Deep learning; Energy index; Stochastic gradient algorithm; Convolutional neural network; Learning rate |
| Language | English |
| License | Copyright © 2018 Elsevier Ltd. All rights reserved. |
| PMID | 30599419 |
| PageCount | 7 |
| PublicationDate | 2019-02-01 |
| PublicationPlace | United States |
| PublicationTitle | Neural networks |
| PublicationTitleAlternate | Neural Netw |
| PublicationYear | 2019 |
| Publisher | Elsevier Ltd |
| StartPage | 225 |
| Title | Research on a learning rate with energy index in deep learning |
| URI | https://dx.doi.org/10.1016/j.neunet.2018.12.009 https://www.ncbi.nlm.nih.gov/pubmed/30599419 https://www.proquest.com/docview/2162774683 |
| Volume | 110 |