Exponentiated Gradient versus Gradient Descent for Linear Predictors
| Published in: | Information and Computation, Volume 132, Issue 1, pp. 1-63 |
|---|---|
| Main authors: | Kivinen, Jyrki; Warmuth, Manfred K. |
| Medium: | Journal Article |
| Language: | English |
| Publication details: | San Diego, CA: Elsevier Inc., 10 January 1997 |
| ISSN: | 0890-5401, 1090-2651 |
| DOI: | 10.1006/inco.1996.2612 |
| Abstract: | We consider two algorithms for on-line prediction based on a linear model. The algorithms are the well-known gradient descent (GD) algorithm and a new algorithm, which we call EG±. They both maintain a weight vector using simple updates. For the GD algorithm, the update is based on subtracting the gradient of the squared error made on a prediction. The EG± algorithm uses the components of the gradient in the exponents of factors that are used in updating the weight vector multiplicatively. We present worst-case loss bounds for EG± and compare them to previously known bounds for the GD algorithm. The bounds suggest that the losses of the two algorithms are in general incomparable, but EG± has a much smaller loss if only a few components of the input are relevant for the predictions. We have performed experiments which show that our worst-case upper bounds are quite tight already on simple artificial data. |
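
For intuition, the two updates described in the abstract can be written down in a few lines. The sketch below is a minimal illustration under stated assumptions, not the paper's reference implementation: the learning rate `eta`, the total weight `U` of the EG± weight pair, the uniform initialization, and the absorption of the constant factor from the squared-error gradient into `eta` are all choices made here for concreteness.

```python
import numpy as np

def gd_update(w, x, y, eta):
    """One GD step: subtract the gradient of the squared error (w.x - y)^2.

    The constant factor from differentiating the square is absorbed into eta.
    """
    y_hat = w @ x
    return w - eta * (y_hat - y) * x

def eg_pm_update(w_pos, w_neg, x, y, eta, U=1.0):
    """One EG+- step: multiplicative update with gradient components in exponents.

    The prediction uses the difference of two non-negative weight vectors,
    which are renormalized so that their total weight stays U.
    """
    y_hat = (w_pos - w_neg) @ x
    g = (y_hat - y) * x                  # gradient of the squared error
    r_pos = w_pos * np.exp(-eta * g)     # exponentiated-gradient factors
    r_neg = w_neg * np.exp(eta * g)
    Z = (r_pos.sum() + r_neg.sum()) / U  # normalization constant
    return r_pos / Z, r_neg / Z

# Toy run on a sparse target, the regime where EG+- is claimed to shine:
# only one of the n input components is relevant to the prediction.
rng = np.random.default_rng(0)
n = 20
target = np.zeros(n)
target[0] = 1.0
w = np.zeros(n)              # GD weights
w_pos = np.full(n, 0.5 / n)  # EG+- starts from uniform weights
w_neg = np.full(n, 0.5 / n)
for _ in range(200):
    x = rng.uniform(-1.0, 1.0, n)
    y = target @ x
    w = gd_update(w, x, y, eta=0.05)
    w_pos, w_neg = eg_pm_update(w_pos, w_neg, x, y, eta=0.5)
```

The normalized, multiplicative form keeps the EG± weights non-negative with a fixed total, which is the mechanism behind the much smaller worst-case loss when only a few input components are relevant.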