A Loss Bound Model for On-Line Stochastic Prediction Algorithms

Bibliographic Details
Published in: Information and Computation, Vol. 119, No. 1, pp. 39–54
Main Author: Yamanishi, K.
Format: Journal Article
Language: English
Published: San Diego, CA: Elsevier Inc., 15 May 1995
ISSN: 0890-5401, 1090-2651
Description
Summary: In this paper, we consider the problem of on-line prediction in which at each time an unlabeled instance is given and then a prediction algorithm outputs a probability distribution over the set of labels rather than {0, 1}-values before it sees the correct label. For this setting, we propose a weighted-average-type on-line stochastic prediction algorithm WA, which can be regarded as a hybrid of the Bayes algorithm and a sequential real-valued parameter estimation method. We derive upper bounds on the instantaneous logarithmic loss and cumulative logarithmic loss for WA in both the example-dependent form and the expected form (the expectation is taken with respect to the fixed target distribution of sequences). Specifically, under some specific parametric assumptions for target rules, we prove that WA is optimal in the sense that the upper bound on the expected cumulative logarithmic loss for WA asymptotically matches Rissanen's coding-theoretic lower bound. Further, we derive an upper bound on the expected cumulative quadratic loss by making use of relationships between the quadratic loss and the logarithmic loss. Throughout the paper, we relate computational learning theory to information theory, most specifically, Rissanen's predictive minimum description length principle, by giving noiseless coding theoretic interpretations to the loss bounds.
DOI: 10.1006/inco.1995.1076
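
Illustration: the weighted-average idea behind WA can be shown with a minimal Python sketch, assuming a finite class of candidate predictors over the label set {0, 1}. The paper's WA additionally performs sequential real-valued parameter estimation over a parametric class, which this toy version omits; the two-expert class, the toy data stream, and all names below are illustrative, not taken from the paper.

import math

# Each "expert" maps an instance x to a probability distribution over
# the labels {0, 1}; both experts here are hypothetical stand-ins.
experts = [
    lambda x: {0: 0.9, 1: 0.1},   # expert favoring label 0
    lambda x: {0: 0.2, 1: 0.8},   # expert favoring label 1
]
weights = [0.5, 0.5]              # uniform prior over the expert class

cumulative_log_loss = 0.0
stream = [(None, 1), (None, 1), (None, 0)]  # toy (instance, label) pairs

for x, y in stream:
    # Weighted-average prediction: mixture of the experts' distributions.
    p = {0: 0.0, 1: 0.0}
    for w, h in zip(weights, experts):
        d = h(x)
        p[0] += w * d[0]
        p[1] += w * d[1]

    # Instantaneous logarithmic loss: -ln p(correct label).
    cumulative_log_loss += -math.log(p[y])

    # Bayesian update of the posterior weights after the label is revealed.
    weights = [w * h(x)[y] for w, h in zip(weights, experts)]
    total = sum(weights)
    weights = [w / total for w in weights]

print(f"cumulative log loss: {cumulative_log_loss:.4f}")

In this sketch the weight update is exactly Bayes' rule over the expert class, and the cumulative logarithmic loss equals the ideal code length -ln P(y_1, ..., y_T) that the mixture assigns to the observed label sequence, which is the noiseless coding interpretation the abstract refers to.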