Unraveling Audiovisual Perception Across Space and Time: A Neuroinspired Computational Architecture

Bibliographic Details
Published in: The European Journal of Neuroscience, Vol. 62, No. 3, e70217
Main Authors: Cuppini, Cristiano; Di Rosa, Eleonore F.; Astolfi, Laura; Monti, Melissa
Format: Journal Article
Language: English
Published: France: Wiley Subscription Services, Inc. / John Wiley and Sons Inc, 01.08.2025
ISSN: 0953-816X, 1460-9568
Description
Summary: Accurate perception of audiovisual stimuli depends crucially on the spatial and temporal properties of each sensory component, with multisensory enhancement only occurring if those components are presented in spatiotemporal congruency. Although spatial localization and temporal detection of audiovisual signals have each been extensively studied, the neural mechanisms underlying their joint influence, particularly in spatiotemporally misaligned contexts, remain poorly understood. Moreover, empirical dissection of their respective contributions to behavioral outcomes proves challenging when spatial and temporal disparities are introduced concurrently. Here, we sought to elucidate the mutual interaction of temporal and spatial offsets on the neural encoding of audiovisual stimuli. To this end, we developed a biologically inspired neurocomputational model that reproduces behavioral evidence of perceptual phenomena observed in audiovisual tasks, i.e., the modality switch effect (temporal realm) and the ventriloquist effect (spatial realm). Tested against the race model, our network successfully simulates multisensory enhancement in reaction times due to the concurrent presentation of cross‐modal stimuli. Further investigation of the mechanisms implemented in the network upheld the centrality of cross‐sensory inhibition in explaining modality switch effects and of cross‐modal and lateral intra‐area connections in regulating the evolution of these effects in space. Finally, the model predicts an amelioration in temporal detection of different modality stimuli with increasing between‐stimuli eccentricity and indicates a plausible reduction in auditory localization bias for increasing interstimulus interval between spatially disparate cues. Our findings provide novel insights into the neural computations underlying audiovisual perception and offer a comprehensive predictive framework to guide future experimental investigations of multisensory integration.
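The race-model test mentioned in the abstract refers to Miller's race-model inequality, a standard benchmark in the multisensory literature: under parallel unisensory processing, the cumulative RT distribution for audiovisual trials must satisfy F_AV(t) ≤ F_A(t) + F_V(t) at every time t, and a violation indicates genuine integration. The sketch below is an illustration of that standard test, not the authors' code; all reaction-time values are synthetic.

```python
import numpy as np

def empirical_cdf(rts, grid):
    """Fraction of reaction times at or below each time point in `grid`."""
    rts = np.sort(np.asarray(rts, dtype=float))
    return np.searchsorted(rts, grid, side="right") / rts.size

def race_model_violation(rt_a, rt_v, rt_av, grid):
    """Max positive deviation of F_AV above the Miller bound F_A + F_V.

    A value > 0 means the audiovisual RT distribution is faster than any
    race of independent unisensory channels can explain.
    """
    bound = np.minimum(empirical_cdf(rt_a, grid) + empirical_cdf(rt_v, grid), 1.0)
    deviation = empirical_cdf(rt_av, grid) - bound
    return float(np.max(deviation))

# Illustrative synthetic RTs (ms): audiovisual responses are strongly
# facilitated relative to either unisensory condition.
rng = np.random.default_rng(0)
rt_a = rng.normal(320, 40, 500)   # auditory-only trials
rt_v = rng.normal(340, 40, 500)   # visual-only trials
rt_av = rng.normal(250, 30, 500)  # audiovisual trials
grid = np.linspace(150, 500, 200)
print(race_model_violation(rt_a, rt_v, rt_av, grid) > 0)  # True: bound violated
```

In practice the deviation is inspected over the fast quantiles of the RT distribution, where race-model violations concentrate.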
Fitted on auditory localization and reaction time task data, our neurocomputational model aims to elucidate the mechanisms underlying multisensory perception in the entire spatiotemporal domain, predicting how spatial and temporal factors interact to modulate sensory perception. A stands for Auditory, V for Visual, AV for Audiovisual, Sw for a trial comprising two sequential stimuli of different sensory modalities (e.g., in SwA, V is followed by A), and Rp for a trial comprising two sequential stimuli of the same sensory modality.
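The caption's trial taxonomy (Sw for modality switch, Rp for modality repetition, suffixed with the modality of the second stimulus) can be made concrete with a small sketch; the function name and the switch-cost note are illustrative assumptions, not the authors' implementation.

```python
def label_trial(first, second):
    """Label a two-stimulus trial by switch/repeat and probed modality.

    `first` and `second` are the modalities of the two sequential stimuli
    ('A' = auditory, 'V' = visual); the suffix is the second (probed) one.
    """
    assert first in ("A", "V") and second in ("A", "V")
    prefix = "Rp" if first == second else "Sw"
    return prefix + second

print(label_trial("V", "A"))  # SwA: a visual stimulus followed by an auditory one
print(label_trial("A", "A"))  # RpA: an auditory stimulus repeated

# The modality switch effect is then the reaction-time cost of switching,
# e.g. mean RT on SwA trials minus mean RT on RpA trials.
```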
Funding: This work was supported by the Ministero dell'Istruzione e del Merito, MNESYS (PE0000006) and PRIN MUR 20207S3NB8.
Associate Editor: Alessandro Treves
DOI: 10.1111/ejn.70217