Bibliographic Details
| Title: | Uncovering memorization effect in the presence of spurious correlations. |
| Authors: | You, Chenyu, Dai, Haocheng, Min, Yifei, Sekhon, Jasjeet S., Joshi, Sarang, Duncan, James S. |
| Source: | Nature Communications; 7/1/2025, Vol. 16 Issue 1, p1-8, 8p |
| Subject Terms: | MEMORIZATION, MACHINE performance, MINORITIES, NEURONS, MACHINE learning |
| Abstract: | Machine learning models often rely on simple spurious features – patterns in training data that correlate with targets but are not causally related to them, like image backgrounds in foreground classification. This reliance typically leads to imbalanced test performance across minority and majority groups. In this work, we take a closer look at the fundamental cause of such imbalanced performance through the lens of memorization, which refers to the ability to predict accurately on atypical examples (minority groups) in the training set but failing in achieving the same accuracy in the testing set. This paper systematically shows the ubiquitous existence of spurious features in a small set of neurons within the network, providing the first-ever evidence that memorization may contribute to imbalanced group performance. Through three experimental sources of converging empirical evidence, we find the property of a small subset of neurons or channels in memorizing minority group information. Inspired by these findings, we hypothesize that spurious memorization, concentrated within a small subset of neurons, plays a key role in driving imbalanced group performance. To further substantiate this hypothesis, we show that eliminating these unnecessary spurious memorization patterns via a novel framework during training can significantly affect the model performance on minority groups. Our experimental results across various architectures and benchmarks offer new insights on how neural networks encode core and spurious knowledge, laying the groundwork for future research in demystifying robustness to spurious correlation. Spurious feature reliance is a challenge in achieving balanced performance in machine learning models. Here, the authors demonstrate that a small subset of neurons are responsible for memorizing spurious correlations and show that this concentrated memorization contributes to imbalanced performance. [ABSTRACT FROM AUTHOR] |
| Database: | Complementary Index |