Doubly stochastic algorithms for large-scale optimization

Detailed bibliography
Published in: 2016 50th Asilomar Conference on Signals, Systems and Computers, pp. 1705–1709
Main authors: Koppel, Alec; Mokhtari, Aryan; Ribeiro, Alejandro
Format: Conference paper
Language: English
Published: IEEE, 01.11.2016
DOI: 10.1109/ACSSC.2016.7869673
EISBN: 9781538639542
Subjects: Approximation algorithms; Convergence; Delays; Indexes; Program processors; Radio frequency; Training
Online access: https://ieeexplore.ieee.org/document/7869673
Abstract We consider learning problems over training sets in which both the number of training examples and the dimension of the feature vectors are large. To solve these problems we propose the random parallel stochastic algorithm (RAPSA). We call the algorithm random parallel because it utilizes multiple processors to operate on a randomly chosen subset of blocks of the feature vector. We call the algorithm stochastic because processors choose elements of the training set randomly and independently. Algorithms that are parallel in either of these dimensions exist, but RAPSA is the first attempt at a methodology that is parallel in both the selection of blocks and the selection of elements of the training set. In RAPSA, processors use the randomly chosen functions to compute the stochastic gradient component associated with a randomly chosen block. We show that this type of doubly stochastic approximation method, when executed on an asynchronous parallel computing architecture, exhibits convergence behavior comparable to that of classical stochastic gradient descent on strongly convex functions: for diminishing step sizes, asynchronous RAPSA converges to the minimizer of the expected risk. We illustrate empirical algorithm performance on a linear estimation problem and on binary image classification using the MNIST handwritten digit dataset.
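To make the doubly stochastic update concrete, the following is a minimal serial sketch in Python for the least-squares linear estimation setting mentioned above. Each iteration pairs one randomly chosen coordinate block with one randomly chosen minibatch and applies a diminishing O(1/t) step size. The function name rapsa_sketch, the fixed block partition, and all hyperparameters are illustrative assumptions rather than details from the paper, and RAPSA proper runs such block updates on multiple processors asynchronously instead of serially.

import numpy as np

def rapsa_sketch(X, y, num_blocks=4, batch_size=10, iters=2000, seed=0):
    """Serial sketch of a doubly stochastic block/sample update.

    Each iteration draws one random coordinate block and one random
    minibatch, then updates only that block with a diminishing step
    size. Hyperparameters are illustrative, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    blocks = np.array_split(np.arange(d), num_blocks)  # fixed partition of coordinates
    for t in range(1, iters + 1):
        b = blocks[rng.integers(num_blocks)]       # random block of the feature vector
        idx = rng.integers(n, size=batch_size)     # random elements of the training set
        Xb, yb = X[idx], y[idx]
        residual = Xb @ w - yb
        # Stochastic gradient of the squared loss, restricted to block b.
        grad_b = Xb[:, b].T @ residual / batch_size
        w[b] -= grad_b / t                         # O(1/t) diminishing step size
    return w

if __name__ == "__main__":
    # Synthetic linear estimation problem: recover w_true from noisy measurements.
    rng = np.random.default_rng(1)
    X = rng.normal(size=(500, 20))
    w_true = rng.normal(size=20)
    y = X @ w_true + 0.01 * rng.normal(size=500)
    w_hat = rapsa_sketch(X, y)
    print("estimation error:", np.linalg.norm(w_hat - w_true))

Running the script should show the estimation error shrinking as the iteration count grows, loosely mirroring the convergence to the expected-risk minimizer that the abstract claims for diminishing step sizes.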
Authors
– Koppel, Alec (akoppel@seas.upenn.edu), Dept. of Electr. & Syst. Eng., Univ. of Pennsylvania, Philadelphia, PA, USA
– Mokhtari, Aryan (aryanm@seas.upenn.edu), Dept. of Electr. & Syst. Eng., Univ. of Pennsylvania, Philadelphia, PA, USA
– Ribeiro, Alejandro (aribeiro@seas.upenn.edu), Dept. of Electr. & Syst. Eng., Univ. of Pennsylvania, Philadelphia, PA, USA