Regularly Truncated M-Estimators for Learning With Noisy Labels

The sample selection approach is very popular in learning with noisy labels. As deep networks "learn pattern first" , prior methods built on sample selection share a similar training procedure: the small-loss examples can be regarded as clean examples and used for helping generalization, w...

Full description

Saved in:
Bibliographic Details
Published in:IEEE transactions on pattern analysis and machine intelligence Vol. 46; no. 5; pp. 3522 - 3536
Main Authors: Xia, Xiaobo, Lu, Pengqian, Gong, Chen, Han, Bo, Yu, Jun, Liu, Tongliang
Format: Journal Article
Language:English
Published: United States IEEE 01.05.2024
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects:
ISSN:0162-8828, 1939-3539, 2160-9292, 1939-3539
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:The sample selection approach is very popular in learning with noisy labels. As deep networks "learn pattern first" , prior methods built on sample selection share a similar training procedure: the small-loss examples can be regarded as clean examples and used for helping generalization, while the large-loss examples are treated as mislabeled ones and excluded from network parameter updates. However, such a procedure is arguably debatable from two folds: (a) it does not consider the bad influence of noisy labels in selected small-loss examples; (b) it does not make good use of the discarded large-loss examples, which may be clean or have meaningful information for generalization. In this paper, we propose regularly truncated M-estimators (RTME) to address the above two issues simultaneously . Specifically, RTME can alternately switch modes between truncated M-estimators and original M-estimators . The former can adaptively select small-losses examples without knowing the noise rate and reduce the side-effects of noisy labels in them. The latter makes the possibly clean examples but with large losses involved to help generalization. Theoretically, we demonstrate that our strategies are label-noise-tolerant. Empirically, comprehensive experimental results show that our method can outperform multiple baselines and is robust to broad noise types and levels.
Bibliography:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
content type line 23
ISSN:0162-8828
1939-3539
2160-9292
1939-3539
DOI:10.1109/TPAMI.2023.3347850