Improving Machine Vision Using Human Perceptual Representations: The Case of Planar Reflection Symmetry for Object Classification


Bibliographic details
Published in: IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 44, No. 1, pp. 228-241
Main authors: Pramod, RT; Arun, SP
Format: Journal Article
Language: English
Publication details: United States, IEEE, 01.01.2022
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN: 0162-8828, 1939-3539, 2160-9292
Description
Summary: Achieving human-like visual abilities is a holy grail for machine vision, yet precisely how insights from human vision can improve machines has remained unclear. Here, we demonstrate two key conceptual advances: First, we show that most machine vision models are systematically different from human object perception. To do so, we collected a large dataset of perceptual distances between isolated objects in humans and asked whether these perceptual data can be predicted by many common machine vision algorithms. We found that while the best algorithms explain ~70 percent of the variance in the perceptual data, all the algorithms we tested make systematic errors on several types of objects. In particular, machine algorithms underestimated distances between symmetric objects compared to human perception. Second, we show that fixing these systematic biases can lead to substantial gains in classification performance. In particular, augmenting a state-of-the-art convolutional neural network with planar/reflection symmetry scores along multiple axes produced significant improvements in classification accuracy (1-10 percent) across categories. These results show that machine vision can be improved by discovering and fixing systematic differences from human vision.
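The summary mentions augmenting network features with reflection symmetry scores along multiple axes, but this record does not say how the authors compute those scores. The following is only a minimal illustrative sketch under stated assumptions: the score is taken to be the Pearson correlation between a grayscale image and its mirror image, and only the horizontal and vertical axes are used. The function names `reflection_symmetry_score` and `augment_with_symmetry` are hypothetical and do not come from the paper.

```python
import numpy as np


def reflection_symmetry_score(image, axis):
    """Correlate an image with its mirror image (an assumed score, not the paper's).

    axis=1 flips left-right (symmetry about the vertical axis);
    axis=0 flips up-down (symmetry about the horizontal axis).
    Returns a value in [-1, 1]; 1 means the image equals its mirror.
    """
    img = np.asarray(image, dtype=float)
    mirrored = np.flip(img, axis=axis)
    a = img - img.mean()
    b = mirrored - mirrored.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    if denom == 0:
        # A constant image is identical to its mirror image.
        return 1.0
    return float((a * b).sum() / denom)


def augment_with_symmetry(features, image):
    """Append symmetry scores for both axes to an existing feature vector."""
    scores = [reflection_symmetry_score(image, axis) for axis in (0, 1)]
    return np.concatenate([np.asarray(features, dtype=float), scores])
```

In this sketch, the augmented vector would be passed to whatever classifier sits on top of the network features; the choice of classifier and of the axes is likewise an assumption here.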
Current affiliation: Massachusetts Institute of Technology, Cambridge, MA 02139, USA
DOI: 10.1109/TPAMI.2020.3008107