Toward modeling visual routines of object segmentation with biologically inspired recurrent vision models

Bibliographic details
Published in:	Journal of vision (Charlottesville, Va.) Vol. 22; no. 14; p. 3773
Main authors:	Goetschalckx, Lore, Zolfaghar, Maryam, Ashok, Alekh K., Govindarajan, Lakshmi N., Linsley, Drew, Serre, Thomas
Format:	Journal Article
Language:	English
Published:	Association for Research in Vision and Ophthalmology, 5 December 2022
Subjects:	Computer Science Computer Vision and Pattern Recognition
ISSN:	1534-7362

Description
Abstract:	A core task of the primate visual system is to organize its retinal input into coherent figural objects. While psychological theories dating back to Ullman (1984) suggest that such object segmentation at least partially relies on feedback, little is known about how these computations are implemented in neural circuits. Here we investigate this question using the neural circuit model of Serre et al. (VSS 2020), which is trained to solve visual tasks by implementing recurrent contextual interactions through horizontal feedback connections. When optimized for contour detection in natural images, the model rivals human performance and exhibits sensitivity to contextual illusions typically associated with primate vision, despite having no explicit constraints to do so. Our goal here is to understand whether the visual routine this feedback model discovers for object segmentation can explain the one used by human observers, as measured in a behavioral experiment where participants judged whether a cue dot fell on the same object silhouette as a fixation dot or on a different one (Jeurissen et al. 2016). To train the model, we built a large natural image dataset of object outlines (N~250K), where each sample included a “fixation” dot on one object. The model learned to segment the target object by adopting an incremental grouping strategy resembling the growth-cone family of psychology models for figure-ground segmentation, through which it achieved near-perfect segmentation accuracy on a validation dataset (F1=.98) and on the novel stimulus set used by Jeurissen et al. (N=22, F1=.98). Critically, the model exhibited a pattern of reaction times similar to that of human observers, indicating that its circuit constraints reflect possible neural substrates for the visual routines of object segmentation in humans. Overall, our work establishes task-optimized models of neural circuits as an interface for generating experimental predictions that link cognitive science theory with exact neural computations.
DOI:	10.1167/jov.22.14.3773
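
Illustrative note: the abstract describes an incremental grouping strategy that starts at a fixation dot and is scored with F1. The toy sketch below is not the authors' recurrent model. It assumes a binary silhouette grid, spreads a label outward from the fixation dot at a uniform speed (a simplification), and uses the number of spreading steps needed to reach the cue dot as a stand-in for reaction time. The names `grouping_steps`, `f1_score`, and `MASK` are hypothetical.

```python
from collections import deque

def grouping_steps(mask, fixation, cue):
    """Spread a label from `fixation` through 4-connected foreground cells.

    Returns the number of spreading steps needed to reach `cue`, or None
    when the cue is not on the same object as the fixation dot.
    """
    rows, cols = len(mask), len(mask[0])
    if not mask[fixation[0]][fixation[1]] or not mask[cue[0]][cue[1]]:
        return None
    steps = {fixation: 0}          # cell -> step at which it was labeled
    queue = deque([fixation])
    while queue:
        r, c = queue.popleft()
        if (r, c) == cue:
            return steps[(r, c)]
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and mask[nr][nc] and (nr, nc) not in steps:
                steps[(nr, nc)] = steps[(r, c)] + 1
                queue.append((nr, nc))
    return None

def f1_score(pred, target):
    """Pixel-wise F1 between two binary grids of equal shape."""
    tp = fp = fn = 0
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif t and not p:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 1.0

# Hypothetical 3x5 silhouette grid: a C-shaped object on the left and a
# separate two-cell object in the right-hand column.
MASK = [
    [1, 1, 1, 0, 0],
    [0, 0, 1, 0, 1],
    [1, 1, 1, 0, 1],
]

near = grouping_steps(MASK, (0, 0), (0, 2))   # cue close to fixation
far = grouping_steps(MASK, (0, 0), (2, 0))    # cue far along the same object
other = grouping_steps(MASK, (0, 0), (1, 4))  # cue on a different object
```

In this sketch the far cue takes more steps to reach than the near cue even though both lie on the same object, which is the qualitative pattern that incremental grouping accounts use to explain longer reaction times for distant cues.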