A theory of initialisation's impact on specialisation*
| Title: | A theory of initialisation's impact on specialisation* |
|---|---|
| Authors: | Jarvis, Devon; Lee, Sebastian; Carla Juliette Domine, Clementine; Saxe, Andrew M.; Sarao Mannelli, Stefano, 1992 |
| Source: | Journal of Statistical Mechanics: Theory and Experiment. 2025(11) |
| Subject Terms: | analysis of algorithms, online dynamics, deep learning, machine learning |
| Description: | Prior work has demonstrated a consistent tendency in neural networks engaged in continual learning tasks, wherein intermediate task similarity results in the highest levels of catastrophic interference. This phenomenon is attributed to the network's tendency to reuse learned features across tasks. However, this explanation heavily relies on the premise that neuron specialisation occurs, i.e. the emergence of localised representations. Our investigation challenges the validity of this assumption. Using theoretical frameworks for the analysis of neural networks, we show a strong dependence of specialisation on the initial condition. More precisely, we show that weight imbalance and high weight entropy can favour specialised solutions. We then apply these insights in the context of continual learning, first showing the emergence of a monotonic relation between task-similarity and forgetting in non-specialised networks. Finally, we show that specialisation by weight imbalance is beneficial on the commonly employed elastic weight consolidation regularisation technique. |
| File Description: | electronic |
| Access URL: | https://research.chalmers.se/publication/549475 https://research.chalmers.se/publication/549475/file/549475_Fulltext.pdf |
| Database: | SwePub |
| DOI: | 10.1088/1742-5468/ae1214 |
| ISSN: | 1742-5468 |
| Language: | English |