Designing multi-label classifiers that maximize F measures: State of the art
| Published in: | Pattern recognition Vol. 61; pp. 394 - 404 |
|---|---|
| Main Authors: | , , |
| Format: | Journal Article |
| Language: | English |
| Published: | Elsevier Ltd, 01.01.2017 |
| Subjects: | |
| ISSN: | 0031-3203, 1873-5142 |
| Summary: | Multi-label classification problems usually occur in tasks related to information retrieval, like text and image annotation, and are receiving increasing attention from the machine learning and pattern recognition fields. One of the main issues under investigation is the development of classification algorithms capable of maximizing specific accuracy measures based on precision and recall. We focus on the widely used F measure, defined for binary, single-label problems as the weighted harmonic mean of precision and recall, and later extended to multi-label problems in three ways: macro-averaged, micro-averaged and instance-wise. In this paper we give a comprehensive survey of theoretical results and algorithms aimed at maximizing F measures. We subdivide it according to the two main existing approaches: empirical utility maximization, and decision-theoretic. Under the former approach, we also derive the optimal (Bayes) classifier at the population level for the instance-wise and micro-averaged F, extending recent results about the single-label F. In a companion paper we shall focus on the micro-averaged F measure, for which relatively fewer solutions exist, and shall develop novel maximization algorithms under both approaches.
• We survey classification algorithms that maximize F measures. • We consider the empirical utility maximization and decision-theoretic approaches. • We consider first the single-label F measure. • We then consider the multi-label instance-wise, macro- and micro-averaged F. |
|---|---|
| DOI: | 10.1016/j.patcog.2016.08.008 |
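
The summary names three multi-label extensions of the F measure: instance-wise, macro-averaged and micro-averaged. As a minimal sketch of how they differ, the Python below computes each one from binary label matrices (rows are instances, columns are labels). It is not taken from the paper: the function names (`f_beta`, `instance_f`, `macro_f`, `micro_f`) are hypothetical, and returning 0 when there are no true positives, false positives or false negatives is an assumed convention, since other conventions exist.

```python
from typing import Sequence, Tuple

Matrix = Sequence[Sequence[int]]  # rows = instances, columns = labels, entries 0/1


def f_beta(tp: int, fp: int, fn: int, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall, written in terms of counts."""
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    # Assumed convention: F is 0 when tp = fp = fn = 0 (the ratio is undefined).
    return (1 + b2) * tp / denom if denom > 0 else 0.0


def _counts(true: Sequence[int], pred: Sequence[int]) -> Tuple[int, int, int]:
    """Return (true positives, false positives, false negatives) for one vector pair."""
    tp = sum(1 for t, p in zip(true, pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(true, pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(true, pred) if t == 1 and p == 0)
    return tp, fp, fn


def instance_f(Y: Matrix, P: Matrix, beta: float = 1.0) -> float:
    """Instance-wise F: compute F per instance (row), then average over instances."""
    return sum(f_beta(*_counts(y, p), beta=beta) for y, p in zip(Y, P)) / len(Y)


def macro_f(Y: Matrix, P: Matrix, beta: float = 1.0) -> float:
    """Macro-averaged F: compute F per label (column), then average over labels."""
    cols_true, cols_pred = list(zip(*Y)), list(zip(*P))
    total = sum(f_beta(*_counts(y, p), beta=beta) for y, p in zip(cols_true, cols_pred))
    return total / len(cols_true)


def micro_f(Y: Matrix, P: Matrix, beta: float = 1.0) -> float:
    """Micro-averaged F: pool the counts over all instances and labels, then compute F once."""
    tp = fp = fn = 0
    for y, p in zip(Y, P):
        a, b, c = _counts(y, p)
        tp, fp, fn = tp + a, fp + b, fn + c
    return f_beta(tp, fp, fn, beta=beta)


if __name__ == "__main__":
    # Toy data (made up for illustration): 3 instances, 3 labels.
    Y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0]]
    Y_pred = [[1, 0, 0], [0, 1, 1], [1, 0, 0]]
    print(instance_f(Y_true, Y_pred), macro_f(Y_true, Y_pred), micro_f(Y_true, Y_pred))
```

On this toy data the macro average is lower than the other two, because one label is never predicted correctly and each label carries equal weight regardless of how often it occurs.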