A theory of initialisation's impact on specialisation*

Bibliographic Details
Title: A theory of initialisation's impact on specialisation*
Authors: Jarvis, Devon; Lee, Sebastian; Carla Juliette Domine, Clementine; Saxe, Andrew M.; Sarao Mannelli, Stefano, 1992
Source: JOURNAL OF STATISTICAL MECHANICS-THEORY AND EXPERIMENT. 2025(11)
Subject Terms: analysis of algorithms, online dynamics, deep learning, machine learning
Description: Prior work has demonstrated a consistent tendency in neural networks engaged in continual learning tasks, wherein intermediate task similarity results in the highest levels of catastrophic interference. This phenomenon is attributed to the network's tendency to reuse learned features across tasks. However, this explanation heavily relies on the premise that neuron specialisation occurs, i.e. the emergence of localised representations. Our investigation challenges the validity of this assumption. Using theoretical frameworks for the analysis of neural networks, we show a strong dependence of specialisation on the initial condition. More precisely, we show that weight imbalance and high weight entropy can favour specialised solutions. We then apply these insights in the context of continual learning, first showing the emergence of a monotonic relation between task-similarity and forgetting in non-specialised networks. Finally, we show that specialisation by weight imbalance is beneficial for the commonly employed elastic weight consolidation regularisation technique.
File Description: electronic
Access URL: https://research.chalmers.se/publication/549475
https://research.chalmers.se/publication/549475/file/549475_Fulltext.pdf
Database: SwePub
FullText Text:
  Availability: 0
CustomLinks:
  – Url: https://research.chalmers.se/publication/549475#
    Name: EDS - SwePub (s4221598)
    Category: fullText
    Text: View record in SwePub
  – Url: https://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=EBSCO&SrcAuth=EBSCO&DestApp=WOS&ServiceName=TransferToWoS&DestLinkType=GeneralSearchSummary&Func=Links&author=Jarvis%20D
    Name: ISI
    Category: fullText
    Text: Find this article in Web of Science
    Icon: https://imagesrvr.epnet.com/ls/20docs.gif
    MouseOverText: Find this article in Web of Science
Header DbId: edsswe
DbLabel: SwePub
An: edsswe.oai.research.chalmers.se.640b67d1.8a5f.4015.826f.2744d4b5a31f
RelevancyScore: 1065
AccessLevel: 6
PubType: Academic Journal
PubTypeId: academicJournal
PreciseRelevancyScore: 1064.736328125
IllustrationInfo
Items – Name: Title
  Label: Title
  Group: Ti
  Data: A theory of initialisation's impact on specialisation*
– Name: Author
  Label: Authors
  Group: Au
  Data: Jarvis, Devon; Lee, Sebastian; Carla Juliette Domine, Clementine; Saxe, Andrew M.; Sarao Mannelli, Stefano, 1992
– Name: TitleSource
  Label: Source
  Group: Src
  Data: JOURNAL OF STATISTICAL MECHANICS-THEORY AND EXPERIMENT. 2025(11)
– Name: Subject
  Label: Subject Terms
  Group: Su
  Data: analysis of algorithms; online dynamics; deep learning; machine learning
– Name: Abstract
  Label: Description
  Group: Ab
  Data: Prior work has demonstrated a consistent tendency in neural networks engaged in continual learning tasks, wherein intermediate task similarity results in the highest levels of catastrophic interference. This phenomenon is attributed to the network's tendency to reuse learned features across tasks. However, this explanation heavily relies on the premise that neuron specialisation occurs, i.e. the emergence of localised representations. Our investigation challenges the validity of this assumption. Using theoretical frameworks for the analysis of neural networks, we show a strong dependence of specialisation on the initial condition. More precisely, we show that weight imbalance and high weight entropy can favour specialised solutions. We then apply these insights in the context of continual learning, first showing the emergence of a monotonic relation between task-similarity and forgetting in non-specialised networks. Finally, we show that specialisation by weight imbalance is beneficial for the commonly employed elastic weight consolidation regularisation technique.
– Name: Format
  Label: File Description
  Group: SrcInfo
  Data: electronic
– Name: URL
  Label: Access URL
  Group: URL
  Data: https://research.chalmers.se/publication/549475 ; https://research.chalmers.se/publication/549475/file/549475_Fulltext.pdf
PLink https://erproxy.cvtisr.sk/sfx/access?url=https://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsswe&AN=edsswe.oai.research.chalmers.se.640b67d1.8a5f.4015.826f.2744d4b5a31f
RecordInfo BibRecord:
  BibEntity:
    Identifiers:
      – Type: doi
        Value: 10.1088/1742-5468/ae1214
    Languages:
      – Text: English
    Subjects:
      – SubjectFull: analysis of algorithms
        Type: general
      – SubjectFull: online dynamics
        Type: general
      – SubjectFull: deep learning
        Type: general
      – SubjectFull: machine learning
        Type: general
    Titles:
      – TitleFull: A theory of initialisation's impact on specialisation*
        Type: main
  BibRelationships:
    HasContributorRelationships:
      – PersonEntity:
          Name:
            NameFull: Jarvis, Devon
      – PersonEntity:
          Name:
            NameFull: Lee, Sebastian
      – PersonEntity:
          Name:
            NameFull: Carla Juliette Domine, Clementine
      – PersonEntity:
          Name:
            NameFull: Saxe, Andrew M.
      – PersonEntity:
          Name:
            NameFull: Sarao Mannelli, Stefano
    IsPartOfRelationships:
      – BibEntity:
          Dates:
            – D: 01
              M: 01
              Type: published
              Y: 2025
          Identifiers:
            – Type: issn-print
              Value: 17425468
            – Type: issn-locals
              Value: SWEPUB_FREE
            – Type: issn-locals
              Value: CTH_SWEPUB
          Numbering:
            – Type: volume
              Value: 2025
            – Type: issue
              Value: 11
          Titles:
            – TitleFull: JOURNAL OF STATISTICAL MECHANICS-THEORY AND EXPERIMENT
              Type: main
ResultId 1