Large-scale learning with AdaGrad on Spark
| Published in: | 2015 IEEE International Conference on Big Data (Big Data), pp. 2828-2830 |
|---|---|
| Main authors: | Hadgu, Asmelash Teka; Nigam, Aastha; Diaz-Aviles, Ernesto |
| Format: | Conference Proceeding |
| Language: | English |
| Published: | IEEE, 01.10.2015 |
| DOI: | 10.1109/BigData.2015.7364091 |
| EISBN: | 9781479999262 (1479999261) |
| Subjects: | Adaptive gradient; Aggregates; Distributed machine learning; History; Spark; Sparks; Stochastic processes; Support vector machines; Training |
| Online access: | Get full text: https://ieeexplore.ieee.org/document/7364091 |

| Abstract | Stochastic Gradient Descent (SGD) is a simple yet very efficient online learning algorithm for optimizing convex (and often non-convex) functions and one of the most popular stochastic optimization methods in machine learning today. One drawback of SGD is that it is sensitive to the learning rate hyper-parameter. The Adaptive Sub-gradient Descent, AdaGrad, dynamically incorporates knowledge of the geometry of the data observed in earlier iterations to calculate a different learning rate for every feature. In this work, we implement a distributed version of AdaGrad for large-scale machine learning tasks using Apache Spark. Apache Spark is a fast cluster computing engine that provides similar scalability and fault tolerance properties to MapReduce, but in contrast to Hadoop's two-stage disk-based MapReduce paradigm, Spark's multi-stage in-memory primitives allow user programs to load data into a cluster's memory and query it repeatedly, which makes it ideal for building scalable machine learning applications. We empirically evaluate our implementation on large-scale real-world problems in the machine learning canonical tasks of classification and regression. Comparing our implementation of AdaGrad with the SGD scheduler currently available in Spark's Machine Learning Library (MLlib), we experimentally show that AdaGrad saves time by avoiding manually setting a learning-rate hyperparameter, converges fast and can even achieve better generalization errors. |
|---|---|
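
The per-feature learning rate the abstract refers to is the core of AdaGrad (Duchi et al., 2011). For quick reference — this is the standard published update rule, not a detail quoted from this paper's text — the step for coordinate i at iteration t is:

```latex
% Standard AdaGrad per-coordinate update (Duchi et al., 2011).
% g_{t,i}: i-th coordinate of the (sub)gradient at step t,
% \eta: base step size, \epsilon: small constant for numerical stability.
\theta_{t+1,i} \;=\; \theta_{t,i} \;-\;
  \frac{\eta}{\sqrt{\sum_{\tau=1}^{t} g_{\tau,i}^{2}} + \epsilon}\; g_{t,i}
```

Coordinates that receive large or frequent gradient updates accumulate a large denominator and take smaller steps, while rarely updated features keep larger effective learning rates — which is why no hand-tuned, per-problem step-size schedule is needed.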
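To make the distributed setup concrete, here is a minimal sketch of how one such iteration can be structured on Spark: a cluster-wide `treeAggregate` sums the subgradient over all partitions, then the cheap per-feature AdaGrad step runs on the driver. This is an illustration under stated assumptions (dense features, squared loss, full-batch gradients); `Example` and `train` are hypothetical names, and this is not the authors' MLlib-based implementation.

```scala
import org.apache.spark.rdd.RDD

object AdaGradSketch {

  /** A labeled example: target y and a dense feature vector x. (Hypothetical type.) */
  case class Example(y: Double, x: Array[Double])

  /**
   * Full-batch AdaGrad for least-squares regression: each iteration does one
   * distributed pass to sum gradients, then a per-feature update on the driver.
   */
  def train(data: RDD[Example], dim: Int,
            eta: Double = 1.0, eps: Double = 1e-8,
            iterations: Int = 100): Array[Double] = {
    val w = new Array[Double](dim)       // model weights, kept on the driver
    val g2Sum = new Array[Double](dim)   // running sum of squared gradients per feature
    val n = data.count().toDouble

    for (_ <- 1 to iterations) {
      val bw = data.sparkContext.broadcast(w)

      // Distributed pass: sum the squared-loss gradient (pred - y) * x over all examples.
      val grad = data.treeAggregate(new Array[Double](dim))(
        (acc, ex) => {
          var pred = 0.0
          var i = 0
          while (i < dim) { pred += bw.value(i) * ex.x(i); i += 1 }
          val err = pred - ex.y
          i = 0
          while (i < dim) { acc(i) += err * ex.x(i); i += 1 }
          acc
        },
        (a, b) => { var i = 0; while (i < dim) { a(i) += b(i); i += 1 }; a }
      ).map(_ / n)

      // AdaGrad step: each feature gets its own effective learning rate.
      var i = 0
      while (i < dim) {
        g2Sum(i) += grad(i) * grad(i)
        w(i) -= eta / (math.sqrt(g2Sum(i)) + eps) * grad(i)
        i += 1
      }
      bw.destroy()
    }
    w
  }
}
```

MLlib's own `GradientDescent` (the SGD scheduler the paper compares against) has the same shape — mini-batch sampling plus a `treeAggregate` gradient sum — so a production version of this sketch would likewise sample a mini-batch per iteration and use sparse vectors rather than the dense full-batch pass shown here.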
| Author details: | Asmelash Teka Hadgu, L3S Res. Center, Hannover, Germany (teka@L3S.de); Aastha Nigam, Univ. of Notre Dame, Notre Dame, IN, USA (anigam@nd.edu); Ernesto Diaz-Aviles, IBM Res., Dublin, Ireland (e.diaz-aviles@ie.ibm.com) |
|---|---|