Regularly Truncated M-Estimators for Learning With Noisy Labels

Published in: IEEE Transactions on Pattern Analysis and Machine Intelligence, Volume 46, Issue 5, pp. 3522-3536
Main Authors: Xia, Xiaobo; Lu, Pengqian; Gong, Chen; Han, Bo; Yu, Jun; Liu, Tongliang
Format: Journal Article
Language: English
Published: United States, The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.05.2024
ISSN: 0162-8828, 1939-3539, 2160-9292
Description
Abstract: The sample selection approach is very popular in learning with noisy labels. As deep networks "learn patterns first", prior methods built on sample selection share a similar training procedure: the small-loss examples are regarded as clean examples and used to help generalization, while the large-loss examples are treated as mislabeled ones and excluded from network parameter updates. However, such a procedure is arguably debatable on two counts: (a) it does not consider the bad influence of noisy labels within the selected small-loss examples; (b) it does not make good use of the discarded large-loss examples, which may be clean or carry meaningful information for generalization. In this paper, we propose regularly truncated M-estimators (RTME) to address the above two issues simultaneously. Specifically, RTME can alternately switch modes between truncated M-estimators and original M-estimators. The former can adaptively select small-loss examples without knowing the noise rate and reduce the side effects of noisy labels among them. The latter allows the possibly clean examples with large losses to be involved in helping generalization. Theoretically, we demonstrate that our strategies are label-noise-tolerant. Empirically, comprehensive experimental results show that our method can outperform multiple baselines and is robust to broad noise types and levels.
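
To make the alternating mechanism described in the abstract concrete, the following is a minimal, hypothetical sketch (not the authors' implementation) of how switching between a truncated and an original per-example loss could look in PyTorch. The threshold tau and the truncate_every schedule are illustrative assumptions, not parameters taken from the paper.

    import torch
    import torch.nn.functional as F

    def regularly_truncated_loss(logits, labels, epoch,
                                 tau=2.0, truncate_every=2):
        # Per-example cross-entropy losses (no reduction yet).
        per_example = F.cross_entropy(logits, labels, reduction="none")
        if epoch % truncate_every == 0:
            # Truncated-estimator mode: losses above tau are clamped, so
            # large-loss (likely mislabeled) examples receive zero gradient;
            # this selects small-loss examples without knowing the noise rate.
            per_example = torch.clamp(per_example, max=tau)
        # Original-estimator mode (other epochs): every example, including
        # possibly clean large-loss ones, contributes to the update.
        return per_example.mean()

Calling regularly_truncated_loss(model(x), y, epoch) inside a training loop would then toggle between the two modes across epochs, which is the "regular" alternation the abstract refers to.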
DOI: 10.1109/TPAMI.2023.3347850