Sample size selection in optimization methods for machine learning
This paper presents a methodology for using varying sample sizes in batch-type optimization methods for large-scale machine learning problems. The first part of the paper deals with the delicate issue of dynamic sample selection in the evaluation of the function and gradient. We propose a criterion...
Uloženo v:
| Vydáno v: | Mathematical programming Ročník 134; číslo 1; s. 127 - 155 |
|---|---|
| Hlavní autoři: | , , , |
| Médium: | Journal Article Konferenční příspěvek |
| Jazyk: | angličtina |
| Vydáno: |
Berlin/Heidelberg
Springer-Verlag
01.08.2012
Springer Springer Nature B.V |
| Témata: | |
| ISSN: | 0025-5610, 1436-4646 |
| On-line přístup: | Získat plný text |
| Tagy: |
Přidat tag
Žádné tagy, Buďte první, kdo vytvoří štítek k tomuto záznamu!
|
| Shrnutí: | This paper presents a methodology for using varying sample sizes in batch-type optimization methods for large-scale machine learning problems. The first part of the paper deals with the delicate issue of dynamic sample selection in the evaluation of the function and gradient. We propose a criterion for increasing the sample size based on variance estimates obtained during the computation of a batch gradient. We establish an
complexity bound on the total cost of a gradient method. The second part of the paper describes a practical Newton method that uses a smaller sample to compute Hessian vector-products than to evaluate the function and the gradient, and that also employs a dynamic sampling technique. The focus of the paper shifts in the third part of the paper to
L
1
-regularized problems designed to produce sparse solutions. We propose a Newton-like method that consists of two phases: a (minimalistic) gradient projection phase that identifies zero variables, and subspace phase that applies a subsampled Hessian Newton iteration in the free variables. Numerical tests on speech recognition problems illustrate the performance of the algorithms. |
|---|---|
| Bibliografie: | SourceType-Scholarly Journals-1 ObjectType-Feature-1 content type line 14 ObjectType-Article-2 content type line 23 |
| ISSN: | 0025-5610 1436-4646 |
| DOI: | 10.1007/s10107-012-0572-5 |