Comparing machines and humans on a visual categorization test

Bibliographic Details
Published in:	Proceedings of the National Academy of Sciences - PNAS Vol. 108; no. 43; p. 17621
Main Authors:	Fleuret, François; Li, Ting; Dubout, Charles; Wampler, Emma K; Yantis, Steven; Geman, Donald
Format:	Journal Article
Language:	English
Published:	United States 25.10.2011
Subjects:	Algorithms; Artificial Intelligence; Humans; Pattern Recognition, Automated - methods; Pattern Recognition, Visual - physiology; Problem Solving
ISSN:	1091-6490

Description
Summary:	Automated scene interpretation has benefited from advances in machine learning, and restricted tasks, such as face detection, have been solved with sufficient accuracy for restricted settings. However, the performance of machines in providing rich semantic descriptions of natural scenes from digital images remains highly limited and hugely inferior to that of humans. Here we quantify this "semantic gap" in a particular setting: We compare the efficiency of human and machine learning in assigning an image to one of two categories determined by the spatial arrangement of constituent parts. The images are not real, but the category-defining rules reflect the compositional structure of real images and the type of "reasoning" that appears to be necessary for semantic parsing. Experiments demonstrate that human subjects grasp the separating principles from a handful of examples, whereas the error rates of computer programs fluctuate wildly and remain far behind that of humans even after exposure to thousands of examples. These observations lend support to current trends in computer vision such as integrating machine learning with parts-based modeling.
DOI:	10.1073/pnas.1109168108