Unsupervised neural network models of the ventral visual stream

Bibliographic Details
Published in: Proceedings of the National Academy of Sciences - PNAS, Vol. 118, No. 3
Main Authors: Zhuang, Chengxu, Yan, Siming, Nayebi, Aran, Schrimpf, Martin, Frank, Michael C, DiCarlo, James J, Yamins, Daniel L K
Format: Journal Article
Language: English
Published: United States, 19.01.2021
Subjects:
ISSN: 1091-6490
Online Access: Get more information
Abstract: Deep neural networks currently provide the best quantitative models of the response patterns of neurons throughout the primate ventral visual stream. However, such networks have remained implausible as a model of the development of the ventral stream, in part because they are trained with supervised methods requiring many more labels than are accessible to infants during development. Here, we report that recent rapid progress in unsupervised learning has largely closed this gap. We find that neural network models learned with deep unsupervised contrastive embedding methods achieve neural prediction accuracy in multiple ventral visual cortical areas that equals or exceeds that of models derived using today's best supervised methods and that the mapping of these neural network models' hidden layers is neuroanatomically consistent across the ventral stream. Strikingly, we find that these methods produce brain-like representations even when trained solely with real human child developmental data collected from head-mounted cameras, despite the fact that these datasets are noisy and limited. We also find that semisupervised deep contrastive embeddings can leverage small numbers of labeled examples to produce representations with substantially improved error-pattern consistency to human behavior. Taken together, these results illustrate a use of unsupervised learning to provide a quantitative model of a multiarea cortical brain system and present a strong candidate for a biologically plausible computational theory of primate sensory learning.
Author_xml – sequence: 1
  givenname: Chengxu
  orcidid: 0000-0002-9306-9407
  surname: Zhuang
  fullname: Zhuang, Chengxu
  email: chengxuz@stanford.edu
  organization: Department of Psychology, Stanford University, Stanford, CA 94305
– sequence: 2
  givenname: Siming
  orcidid: 0000-0002-3873-8153
  surname: Yan
  fullname: Yan, Siming
  organization: Department of Computer Science, The University of Texas at Austin, Austin, TX 78712
– sequence: 3
  givenname: Aran
  orcidid: 0000-0002-7509-9629
  surname: Nayebi
  fullname: Nayebi, Aran
  organization: Neurosciences PhD Program, Stanford University, Stanford, CA 94305
– sequence: 4
  givenname: Martin
  orcidid: 0000-0001-7766-7223
  surname: Schrimpf
  fullname: Schrimpf, Martin
  organization: Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139
– sequence: 5
  givenname: Michael C
  orcidid: 0000-0002-7551-4378
  surname: Frank
  fullname: Frank, Michael C
  organization: Department of Psychology, Stanford University, Stanford, CA 94305
– sequence: 6
  givenname: James J
  orcidid: 0000-0002-1592-5896
  surname: DiCarlo
  fullname: DiCarlo, James J
  organization: Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139
– sequence: 7
  givenname: Daniel L K
  surname: Yamins
  fullname: Yamins, Daniel L K
  organization: Wu Tsai Neurosciences Institute, Stanford University, Stanford, CA 94305
BackLink https://www.ncbi.nlm.nih.gov/pubmed/33431673 (View this record in MEDLINE/PubMed)
BookMark eNpNj0tLxDAUhYOMOA9du5Mu3XTMzbNZiQzjAwbcOOuStjdYbdOatCP-eyuO4Oo7cD4OnCWZ-c4jIZdA10A1v-m9jWtGQYBRANkJWQA1kCph6OxfnpNljG-UUiMzekbmnAsOSvMFud37OPYYDnXEKvE4BttMGD678J60XYVNTDqXDK-YHNAPP-2kjhPiENC25-TU2SbixZErsr_fvmwe093zw9PmbpeWEvSQVsY4qaQoitIpJSoojVFFxrWVWmoltHCycsBKA4w5XuhSskJPUQp0CiVbkevf3T50HyPGIW_rWGLTWI_dGHMmtGaK8iyb1KujOhYtVnkf6taGr_zvNPsGGWlbBA
CitedBy_id crossref_primary_10_1007_s10462_021_10130_z
crossref_primary_10_1038_s44159_023_00212_w
crossref_primary_10_1063_5_0186054
crossref_primary_10_1038_s41467_025_61399_5
crossref_primary_10_7554_eLife_84357
crossref_primary_10_1016_j_neuroimage_2022_119769
crossref_primary_10_1146_annurev_vision_112823_030616
crossref_primary_10_1038_s41467_021_25939_z
crossref_primary_10_1038_s41598_023_40807_0
crossref_primary_10_1017_S0140525X23001577
crossref_primary_10_1371_journal_pcbi_1012673
crossref_primary_10_1103_PhysRevX_12_031024
crossref_primary_10_3390_brainsci13010061
crossref_primary_10_1007_s10463_023_00887_1
crossref_primary_10_1162_imag_a_00488
crossref_primary_10_1038_s42003_023_05440_7
crossref_primary_10_1371_journal_pcbi_1011506
crossref_primary_10_3389_fcomp_2023_1178450
crossref_primary_10_3390_brainsci11081004
crossref_primary_10_1038_s41467_024_52307_4
crossref_primary_10_1073_pnas_2304085120
crossref_primary_10_7554_eLife_52599
crossref_primary_10_7554_eLife_82502
crossref_primary_10_1073_pnas_2306025121
crossref_primary_10_7554_eLife_88608
crossref_primary_10_1007_s00422_024_00983_2
crossref_primary_10_1038_s41583_023_00705_w
crossref_primary_10_1162_imag_a_00137
crossref_primary_10_1109_TNSRE_2023_3339698
crossref_primary_10_3390_e24050665
crossref_primary_10_1038_s44159_025_00489_z
crossref_primary_10_1371_journal_pcbi_1013271
crossref_primary_10_1126_sciadv_ads6821
crossref_primary_10_1073_pnas_2220642120
crossref_primary_10_1016_j_neunet_2022_06_034
crossref_primary_10_1162_neco_a_01559
crossref_primary_10_1371_journal_pbio_3002366
crossref_primary_10_1016_j_cogsys_2023_101200
crossref_primary_10_1038_s41593_023_01468_4
crossref_primary_10_1017_S0140525X23001516
crossref_primary_10_1038_s43588_024_00746_w
crossref_primary_10_1016_j_tins_2021_12_008
crossref_primary_10_1371_journal_pcbi_1012056
crossref_primary_10_1038_s42256_025_01049_z
crossref_primary_10_3390_electronics13132566
crossref_primary_10_1093_cercor_bhaf106
crossref_primary_10_1016_j_tics_2024_05_001
crossref_primary_10_3758_s13421_024_01580_1
crossref_primary_10_1038_s42256_024_00828_4
crossref_primary_10_1088_1741_2552_ad5d15
crossref_primary_10_1371_journal_pcbi_1011483
crossref_primary_10_1016_j_procs_2022_01_240
crossref_primary_10_1016_j_cogsys_2023_101158
crossref_primary_10_3389_fncom_2022_789253
crossref_primary_10_1371_journal_pcbi_1012600
crossref_primary_10_1016_j_heliyon_2024_e31965
crossref_primary_10_1016_j_jneumeth_2025_110443
crossref_primary_10_1541_ieejeiss_145_514
crossref_primary_10_1038_s41467_024_50821_z
crossref_primary_10_1007_s00422_024_00998_9
crossref_primary_10_1038_s41598_024_78304_7
crossref_primary_10_3389_fninf_2025_1515873
crossref_primary_10_1088_1742_5468_adde43
crossref_primary_10_3389_fcomp_2023_1113609
crossref_primary_10_1016_j_tics_2021_05_010
crossref_primary_10_1038_s41467_025_61458_x
crossref_primary_10_1016_j_isci_2025_112199
crossref_primary_10_1016_j_cell_2024_08_051
crossref_primary_10_1038_s41597_025_04951_8
crossref_primary_10_3758_s13428_023_02206_1
crossref_primary_10_1038_s41467_025_56733_w
crossref_primary_10_1016_j_cobeha_2021_02_006
crossref_primary_10_1016_j_neuroimage_2022_119728
crossref_primary_10_7554_eLife_76096
crossref_primary_10_1016_j_cell_2024_02_036
crossref_primary_10_1038_s41562_025_02220_7
crossref_primary_10_1073_pnas_2408871121
crossref_primary_10_1371_journal_pcbi_1011145
crossref_primary_10_1371_journal_pcbi_1012751
crossref_primary_10_1111_cogs_13400
crossref_primary_10_3389_fncom_2022_929348
crossref_primary_10_1371_journal_pcbi_1011943
crossref_primary_10_1038_s42256_024_00802_0
crossref_primary_10_1016_j_tics_2022_01_006
crossref_primary_10_1017_S0140525X21001357
crossref_primary_10_1038_s41593_023_01442_0
crossref_primary_10_1371_journal_pcbi_1009739
crossref_primary_10_1186_s41235_025_00622_9
crossref_primary_10_7554_eLife_76384
crossref_primary_10_7554_eLife_88608_3
crossref_primary_10_1038_s41467_021_22078_3
crossref_primary_10_1038_s41467_023_38674_4
crossref_primary_10_1016_j_neubiorev_2023_105508
crossref_primary_10_1016_j_conb_2023_102834
crossref_primary_10_1109_ACCESS_2022_3208603
crossref_primary_10_1016_j_tins_2024_02_003
crossref_primary_10_1038_s44271_023_00048_3
crossref_primary_10_1093_cercor_bhac276
crossref_primary_10_1093_cercor_bhac391
crossref_primary_10_1038_s42256_025_01072_0
crossref_primary_10_1007_s11571_023_09989_1
crossref_primary_10_1038_s41467_024_53147_y
crossref_primary_10_1016_j_tins_2022_12_008
crossref_primary_10_1016_j_neuron_2024_04_018
crossref_primary_10_1016_j_cogsys_2024_101244
crossref_primary_10_1371_journal_pcbi_1012019
crossref_primary_10_3390_e23070857
crossref_primary_10_1146_annurev_vision_112823_031607
crossref_primary_10_1111_cogs_13305
crossref_primary_10_1016_j_neubiorev_2024_105650
crossref_primary_10_1073_pnas_2115302119
crossref_primary_10_1371_journal_pcbi_1010878
crossref_primary_10_1016_j_isci_2025_112427
crossref_primary_10_1038_s44159_023_00266_w
crossref_primary_10_1002_hbm_26573
crossref_primary_10_1016_j_cub_2024_11_073
crossref_primary_10_7554_eLife_82566
crossref_primary_10_1016_j_cognition_2023_105621
crossref_primary_10_1016_j_cognition_2024_105920
crossref_primary_10_1016_j_visres_2023_108184
crossref_primary_10_7554_eLife_60830
crossref_primary_10_2183_pjab_98_007
crossref_primary_10_1017_S0140525X22002813
crossref_primary_10_3389_fncom_2022_1030707
crossref_primary_10_3390_biology12101330
crossref_primary_10_1038_s41593_022_01211_5
crossref_primary_10_3389_fncom_2021_686239
crossref_primary_10_1371_journal_pcbi_1011792
crossref_primary_10_1038_s41467_022_28091_4
crossref_primary_10_3390_brainsci12081101
crossref_primary_10_1038_s41598_023_28632_x
crossref_primary_10_1523_JNEUROSCI_1424_22_2022
crossref_primary_10_1093_cercor_bhad416
ContentType Journal Article
Copyright Copyright © 2021 the Author(s). Published by PNAS.
Copyright_xml – notice: Copyright © 2021 the Author(s). Published by PNAS.
DBID CGR
CUY
CVF
ECM
EIF
NPM
7X8
DOI 10.1073/pnas.2014196118
DatabaseName Medline
MEDLINE
MEDLINE (Ovid)
MEDLINE
MEDLINE
PubMed
MEDLINE - Academic
DatabaseTitle MEDLINE
Medline Complete
MEDLINE with Full Text
PubMed
MEDLINE (Ovid)
MEDLINE - Academic
DatabaseTitleList MEDLINE
MEDLINE - Academic
Database_xml – sequence: 1
  dbid: NPM
  name: PubMed
  url: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed
  sourceTypes: Index Database
– sequence: 2
  dbid: 7X8
  name: MEDLINE - Academic
  url: https://search.proquest.com/medline
  sourceTypes: Aggregation Database
DeliveryMethod no_fulltext_linktorsrc
Discipline Sciences (General)
EISSN 1091-6490
ExternalDocumentID 33431673
Genre Research Support, U.S. Gov't, Non-P.H.S
Research Support, Non-U.S. Gov't
Journal Article
Research Support, N.I.H., Extramural
GrantInformation_xml – fundername: NEI NIH HHS
  grantid: P30 EY026877
– fundername: NIMH NIH HHS
  grantid: R01 MH069456
GroupedDBID ---
-DZ
-~X
.55
0R~
123
29P
2AX
2FS
2WC
4.4
53G
5RE
5VS
85S
AACGO
AAFWJ
AANCE
ABBHK
ABOCM
ABPLY
ABPPZ
ABTLG
ABXSQ
ABZEH
ACGOD
ACIWK
ACNCT
ACPRK
AENEX
AEUPB
AEXZC
AFFNX
AFOSN
AFRAH
ALMA_UNASSIGNED_HOLDINGS
BKOMP
CGR
CS3
CUY
CVF
D0L
DCCCD
DIK
DU5
E3Z
EBS
ECM
EIF
F5P
FRP
GX1
H13
HH5
HYE
IPSME
JAAYA
JBMMH
JENOY
JHFFW
JKQEH
JLS
JLXEF
JPM
JSG
JST
KQ8
L7B
LU7
N9A
NPM
N~3
O9-
OK1
PNE
PQQKQ
R.V
RHF
RHI
RNA
RNS
RPM
RXW
SA0
SJN
TAE
TN5
UKR
VQA
W8F
WH7
WOQ
WOW
X7M
XSW
Y6R
YBH
YIF
YIN
YKV
YSK
ZCA
~02
~KM
7X8
IEDL.DBID 7X8
ISICitedReferencesCount 204
ISICitedReferencesURI http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=000609633900029&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
ISSN 1091-6490
IngestDate Fri Sep 05 09:18:54 EDT 2025
Wed Feb 19 02:04:12 EST 2025
IsDoiOpenAccess false
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 3
Keywords unsupervised algorithms
deep neural networks
ventral visual stream
Language English
License Copyright © 2021 the Author(s). Published by PNAS.
LinkModel DirectLink
Notes ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 23
ORCID 0000-0001-7766-7223
0000-0002-1592-5896
0000-0002-3873-8153
0000-0002-9306-9407
0000-0002-7551-4378
0000-0002-7509-9629
OpenAccessLink https://www.pnas.org/content/pnas/118/3/e2014196118.full.pdf
PMID 33431673
PQID 2477260388
PQPubID 23479
ParticipantIDs proquest_miscellaneous_2477260388
pubmed_primary_33431673
PublicationCentury 2000
PublicationDate 2021-01-19
PublicationDateYYYYMMDD 2021-01-19
PublicationDate_xml – month: 01
  year: 2021
  text: 2021-01-19
  day: 19
PublicationDecade 2020
PublicationPlace United States
PublicationPlace_xml – name: United States
PublicationTitle Proceedings of the National Academy of Sciences - PNAS
PublicationTitleAlternate Proc Natl Acad Sci U S A
PublicationYear 2021
SSID ssj0009580
Score 2.7038965
Snippet Deep neural networks currently provide the best quantitative models of the response patterns of neurons throughout the primate ventral visual stream. However,...
SourceID proquest
pubmed
SourceType Aggregation Database
Index Database
SubjectTerms Animals
Child
Datasets as Topic
Humans
Macaca - physiology
Nerve Net - anatomy & histology
Nerve Net - physiology
Neural Networks, Computer
Neurons - physiology
Pattern Recognition, Visual - physiology
Unsupervised Machine Learning
Visual Cortex - anatomy & histology
Visual Cortex - physiology
Title Unsupervised neural network models of the ventral visual stream
URI https://www.ncbi.nlm.nih.gov/pubmed/33431673
https://www.proquest.com/docview/2477260388
Volume 118
WOSCitedRecordID wos000609633900029&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
hasFullText
inHoldings 1
isFullTextHit
isPrint
link http://cvtisr.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwpV07T8MwELaAMrAA5VleMhIDDFZjx4mdCSFExQBVB4q6RX5KDCSBtP39nJNUsCAhscQZ4ii6nM_fne--Q-hKSem45pK4xMYEILUmOpWKeOWc4yaSoqnwfn0S47GczbJJF3Cru7TKlU1sDLUtTYiRDxkHHJgG7pLb6oOErlHhdLVrobGOejFAmaDVYiZ_kO7Klo0goyTlWbSi9hHxsCpUIOumHDSQUvk7vmz2mdHOf79wF213CBPftSrRR2uu2EP9bg3X-Lojmr7ZR7fTol5UwVjUzuLAbAnzijYvHDctcmpcegwQES_bIDCGRxcwhAoT9X6ApqOHl_tH0jVUICahYk5slnkAcFxr49OUW2rA29HgoioAbSLlgvvEespMRhnzsRYmYVrAbcKdT13CDtFGURbuGOHIahAei70WmsPPlfBuEYHvEVlhVZoN0OVKSDkobDiFUIUrF3X-LaYBOmolnVcts0Yex01lfnzyh9mnaIuF_JKIEpqdoZ6H5erO0aZZzt_qz4tGE-A6njx_AV2kvZw
linkProvider ProQuest
openUrl ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Unsupervised+neural+network+models+of+the+ventral+visual+stream&rft.jtitle=Proceedings+of+the+National+Academy+of+Sciences+-+PNAS&rft.au=Zhuang%2C+Chengxu&rft.au=Yan%2C+Siming&rft.au=Nayebi%2C+Aran&rft.au=Schrimpf%2C+Martin&rft.date=2021-01-19&rft.issn=1091-6490&rft.eissn=1091-6490&rft.volume=118&rft.issue=3&rft_id=info:doi/10.1073%2Fpnas.2014196118&rft.externalDBID=NO_FULL_TEXT
thumbnail_l http://covers-cdn.summon.serialssolutions.com/index.aspx?isbn=/lc.gif&issn=1091-6490&client=summon
thumbnail_m http://covers-cdn.summon.serialssolutions.com/index.aspx?isbn=/mc.gif&issn=1091-6490&client=summon
thumbnail_s http://covers-cdn.summon.serialssolutions.com/index.aspx?isbn=/sc.gif&issn=1091-6490&client=summon