Improving the ϵ-approximate algorithm for Probabilistic Classifier Chains

Bibliographic Details
Published in:	Knowledge and Information Systems, Vol. 62, no. 7, pp. 2709-2738
Main Authors:	Fdez-Díaz, Miriam; Fdez-Díaz, Laura; Mena, Deiner; Montañés, Elena; Quevedo, José Ramón; Coz, Juan José del
Format:	Journal Article
Language:	English
Published:	London: Springer London, 01.07.2020; Springer Nature B.V.
Subjects:	Algorithms; Chains; Classifiers; Computational efficiency; Computer Science; Computer simulation; Computing costs; Computing time; Conditional probability; Data Mining and Knowledge Discovery; Database Management; Heuristic methods; Information Storage and Retrieval; Information Systems and Communication Service; Information Systems Applications (incl. Internet); IT in Business; Optimization; Regular Paper; Searching; Statistical analysis; Multi-label; Inference; Classifier Chains; approximate algorithm
ISSN:	0219-1377, 0219-3116

Description
Summary:	Probabilistic Classifier Chains are a multi-label classification method that has attracted considerable research attention in recent years because they can optimally estimate the entire joint conditional probability of a label combination through the product rule of probability. Their main drawback is that obtaining Bayes optimal predictions requires an exhaustive search: the joint probability must be computed for every possible label combination before the combination with the highest probability is selected. For this reason, several recent works have proposed inference methods that avoid exploring all combinations while, ideally, preserving optimality. Approaches such as greedy search, beam search and Monte Carlo sampling reduce the computational cost, but they do not guarantee Bayes optimal predictions (although, in general, they provide close to optimal solutions). Methods based on heuristic search do provide optimal predictions, but their computational time has not been as good as expected. Among the approaches that provide Bayes optimal predictions, the ϵ-approximate algorithm has been found to be the best, both for its optimality and for its computational time. However, this paper shows, both theoretically and experimentally, that the algorithm sometimes backtracks during the search for optimal predictions, which may prolong the prediction time. The aim of this paper is therefore to improve the algorithm by achieving a more direct search. Specifically, it enhances the criterion for choosing the next node to expand by adding heuristic information, although this enhancement applies only to linear-based models. The experiments confirm that the improved ϵ-approximate algorithm explores fewer nodes and reduces the computational time of the original version.
DOI:	10.1007/s10115-020-01436-5
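
Illustration:	The inference problem described in the summary can be made concrete with a small sketch. It is not the paper's algorithm or its heuristic improvement. It assumes a hypothetical two-label chain whose conditional probabilities (the table P_ONE) are invented for illustration, and it compares exhaustive search, greedy search, and a plain best-first search over partial label combinations. As far as we understand, that best-first search matches the ϵ = 0 case of the ϵ-approximate algorithm, but this is an assumption. All function names are illustrative.

```python
import heapq
from itertools import product

# Hypothetical toy chain: P(y_j = 1 | x, y_1..y_{j-1}), keyed by the label
# prefix already assigned. The numbers are invented for illustration only.
P_ONE = {(): 0.4, (0,): 0.45, (1,): 0.9}
N_LABELS = 2


def cond_prob(prefix, value):
    """Conditional probability of the next label taking `value` after `prefix`."""
    p_one = P_ONE[prefix]
    return p_one if value == 1 else 1.0 - p_one


def joint_prob(labels):
    """Product rule: multiply the conditional probabilities along the chain."""
    prob = 1.0
    for j, value in enumerate(labels):
        prob *= cond_prob(tuple(labels[:j]), value)
    return prob


def exhaustive_search():
    """Score all 2^m label combinations and keep the most probable one."""
    return max(product((0, 1), repeat=N_LABELS), key=joint_prob)


def greedy_search():
    """Fix each label to its locally most probable value; may miss the mode."""
    prefix = ()
    while len(prefix) < N_LABELS:
        best = max((0, 1), key=lambda v: cond_prob(prefix, v))
        prefix += (best,)
    return prefix


def best_first_search():
    """Always expand the most probable partial combination.

    Probabilities never increase along a path, so the first complete
    combination popped from the queue is a joint mode. Returns the
    combination and the number of nodes expanded.
    """
    frontier = [(-1.0, ())]  # negated probabilities turn heapq into a max-heap
    expanded = 0
    while frontier:
        neg_prob, prefix = heapq.heappop(frontier)
        if len(prefix) == N_LABELS:
            return prefix, expanded
        expanded += 1
        for value in (0, 1):
            child_neg_prob = neg_prob * cond_prob(prefix, value)
            heapq.heappush(frontier, (child_neg_prob, prefix + (value,)))
    raise RuntimeError("search space exhausted without reaching a leaf")


if __name__ == "__main__":
    print("exhaustive:", exhaustive_search())
    print("greedy:", greedy_search())
    print("best-first:", best_first_search())
```

On this toy chain, greedy search returns (0, 0) with joint probability 0.33, whereas exhaustive and best-first search both return (1, 1) with joint probability 0.36. The best-first search expands three nodes: it first follows the prefix (0,) and then returns to (1,), a small instance of the backtracking the summary mentions.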