Neural collapse under cross-entropy loss

We consider the variational problem of cross-entropy loss with n feature vectors on a unit hypersphere in R^d. We prove that when d ≥ n−1, the global minimum is given by the simplex equiangular tight frame, which justifies the neural collapse behavior. We also prove that, as n→∞ with fixed d, the minim...

Bibliographic details
Published in: Applied and Computational Harmonic Analysis, Volume 59, pp. 224-241
Main authors: Lu, Jianfeng; Steinerberger, Stefan
Format: Journal Article
Language: English
Published: Elsevier Inc, 1 July 2022
ISSN: 1063-5203
Abstract We consider the variational problem of cross-entropy loss with n feature vectors on a unit hypersphere in R^d. We prove that when d ≥ n−1, the global minimum is given by the simplex equiangular tight frame, which justifies the neural collapse behavior. We also prove that, as n→∞ with fixed d, the minimizing points will distribute uniformly on the hypersphere and show a connection with the frame potential of Benedetto & Fickus.
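Illustration (not part of the catalog record): the simplex equiangular tight frame named in the abstract can be written down explicitly, and the sketch below checks its defining properties numerically. It assumes the standard construction x_i = sqrt(n/(n−1)) (e_i − (1/n)·1), which places the n points in R^n on the (n−1)-dimensional subspace orthogonal to the all-ones vector, and it takes the frame potential to be the sum of squared pairwise inner products. The function names are hypothetical, and the code does not evaluate the loss functional from the paper, whose exact form is not given in this record.

```python
import math


def simplex_etf(n):
    """Return n unit vectors in R^n forming a simplex equiangular tight frame.

    Assumed construction: x_i = sqrt(n / (n - 1)) * (e_i - (1/n) * ones).
    The vectors span an (n-1)-dimensional subspace of R^n.
    """
    scale = math.sqrt(n / (n - 1))
    return [
        [scale * ((1.0 if i == j else 0.0) - 1.0 / n) for j in range(n)]
        for i in range(n)
    ]


def dot(u, v):
    """Euclidean inner product of two vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))


def gram(points):
    """Gram matrix of pairwise inner products."""
    return [[dot(u, v) for v in points] for u in points]


def frame_potential(points):
    """Sum of squared inner products over all ordered pairs (i, j)."""
    return sum(g * g for row in gram(points) for g in row)


if __name__ == "__main__":
    n = 5
    G = gram(simplex_etf(n))
    print("diagonal entry:", G[0][0])
    print("off-diagonal entry:", G[0][1], "vs -1/(n-1) =", -1.0 / (n - 1))
    print("frame potential:", frame_potential(simplex_etf(n)), "vs n^2/(n-1) =", n * n / (n - 1))
```

Every point has unit norm and every pair has inner product −1/(n−1), so the frame potential equals n + n/(n−1) = n²/(n−1), which is the tight-frame value n²/d for d = n−1.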
Author Lu, Jianfeng
Steinerberger, Stefan
Author_xml – sequence: 1
  givenname: Jianfeng
  orcidid: 0000-0001-6255-5165
  surname: Lu
  fullname: Lu, Jianfeng
  email: jianfeng@math.duke.edu
  organization: Department of Mathematics, Department of Physics, and Department of Chemistry, Duke University, Box 90320, Durham NC 27708, USA
– sequence: 2
  givenname: Stefan
  surname: Steinerberger
  fullname: Steinerberger, Stefan
  email: steinerb@uw.edu
  organization: Department of Mathematics, University of Washington, Seattle, WA 98195, USA
ContentType Journal Article
Copyright 2022 Elsevier Inc.
DOI 10.1016/j.acha.2021.12.011
Discipline Engineering
Mathematics
EndPage 241
GrantInformation_xml – fundername: NSF
  grantid: DMS-2123224
  funderid: https://doi.org/10.13039/100000001
– fundername: National Science Foundation
  grantid: DMS-2012286; CCF-1934964
  funderid: https://doi.org/10.13039/100000001
– fundername: Alfred P. Sloan Foundation
  funderid: https://doi.org/10.13039/100000879
ISSN 1063-5203
IsPeerReviewed true
IsScholarly true
Keywords Neural collapse
Frame potential
Cross-entropy loss
Neural networks
Language English
ORCID 0000-0001-6255-5165
PageCount 18
PublicationCentury 2000
PublicationDate July 2022
2022-07-00
PublicationDateYYYYMMDD 2022-07-01
PublicationDate_xml – month: 07
  year: 2022
  text: July 2022
PublicationDecade 2020
PublicationTitle Applied and computational harmonic analysis
PublicationYear 2022
Publisher Elsevier Inc
Publisher_xml – name: Elsevier Inc
References
Benedetto, Fickus. Finite normalized tight frames. Adv. Comput. Math. 18 (2003) 357-385. doi:10.1023/A:1021323312367
Bilyk, Glazyrin, Matzke, Park, Vlasiuk. Energy on spheres and discreteness of minimizing measures.
Bondarenko, Radchenko, Viazovska. Optimal asymptotic bounds for spherical designs. Ann. Math. 178 (2013) 443-452. doi:10.4007/annals.2013.178.2.2
Chopra, Hadsell, LeCun. Learning a similarity metric discriminatively, with application to face verification. 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (2005) 539-546.
E, Wojtowytsch. On the emergence of tetrahedral symmetry in the final and penultimate layers of neural network classifiers. (2020)
Ergen, Pilanci. Revealing the structure of deep neural networks via convex duality. (2021)
Fang, He, Long, Su. Exploring deep neural networks via layer-peeled model: minority collapse in imbalanced training. Proc. Natl. Acad. Sci. 118 (2021). doi:10.1073/pnas.2103091118
Hörmander. The spectral function of an elliptic operator. Acta Math. 121 (1968) 193-218. doi:10.1007/BF02391913
Jaffe, Kluger, Lindenbaum, Patsenker, Peterfreund, Steinerberger. The spectral underpinning of word2vec. Front. Appl. Math. Stat. (2022)
Mikolov, Chen, Corrado, Dean. Efficient estimation of word representations in vector space. (2013)
Mixon, Parshall, Pi. Neural collapse with unconstrained features. (2020)
Papyan, Han, Donoho. Prevalence of neural collapse during the terminal phase of deep learning training. Proc. Natl. Acad. Sci. USA 117 (40) (2020) 24652-24663. doi:10.1073/pnas.2015509117
Poggio, Liao. Explicit regularization and implicit bias in deep network classifiers trained with the square loss. (2021)
Schoenberg. Metric spaces and positive definite functions. Trans. Am. Math. Soc. 44 (1938) 522-536. doi:10.1090/S0002-9947-1938-1501980-0
Schoenberg. Positive definite functions on spheres. Duke Math. J. 9 (1942) 96-108. doi:10.1215/S0012-7094-42-00908-6
Wang, Isola. Understanding contrastive representation learning through alignment and uniformity on the hypersphere. (2020)
Zhu, Ding, Zhou, Li, You, Sulam, Qu. A geometric analysis of neural collapse with unconstrained features. (2021)
– ident: 10.1016/j.acha.2021.12.011_br0170
SSID ssj0011459
Score 2.5847576
Snippet We consider the variational problem of cross-entropy loss with n feature vectors on a unit hypersphere in Rd. We prove that when d≥n−1, the global minimum is...
SourceID crossref
elsevier
SourceType Enrichment Source
Index Database
Publisher
StartPage 224
SubjectTerms Cross-entropy loss
Frame potential
Neural collapse
Neural networks
Title Neural collapse under cross-entropy loss
URI https://dx.doi.org/10.1016/j.acha.2021.12.011
Volume 59