FR-MIL: Distribution Re-Calibration-Based Multiple Instance Learning With Transformer for Whole Slide Image Classification


Detailed Bibliography
Published in: IEEE Transactions on Medical Imaging, Volume 44, Issue 1, pp. 409-421
Main authors: Chikontwe, Philip; Kim, Meejeong; Jeong, Jaehoon; Jung, Sung Hyun; Go, Heounjeong; Nam, Soo Jeong; Park, Sang Hyun
Format: Journal Article
Language: English
Published: United States: IEEE, 01.01.2025
ISSN: 0278-0062, 1558-254X
Description
Summary: In digital pathology, whole slide images (WSI) are crucial for cancer prognostication and treatment planning. WSI classification is generally addressed using multiple instance learning (MIL), alleviating the challenge of processing billions of pixels and curating rich annotations. Though recent MIL approaches leverage variants of the attention mechanism to learn better representations, they scarcely study the properties of the data distribution itself, i.e., different staining and acquisition protocols resulting in intra-patch and inter-slide variations. In this work, we first introduce a distribution re-calibration strategy that shifts the feature distribution of a WSI bag (instances) using the statistics of the max-instance (critical) feature. Second, we enforce class (bag) separation via a metric loss, assuming that positive bags exhibit larger feature magnitudes than negatives. We also introduce a generative process leveraging Vector Quantization (VQ) for improved instance discrimination, i.e., VQ helps model bag latent factors for improved classification. To model spatial and context information, a position encoding module (PEM) is employed with transformer-based pooling by multi-head self-attention (PMSA). Evaluation on popular WSI benchmark datasets shows that our approach improves over state-of-the-art MIL methods. Further, we validate the general applicability of our method on classic MIL benchmark tasks and on point cloud classification with limited points. Code: https://github.com/PhilipChicco/FRMIL
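The two core ideas named in the abstract, re-calibrating a bag's features with the max-instance statistics and a magnitude-based metric loss separating positive from negative bags, can be sketched as follows. This is a minimal illustration of the abstract's wording only, not the paper's implementation: the function names `recalibrate_bag` and `magnitude_margin_loss`, the choice of subtracting the critical feature as the "shift", and the use of mean-pooled bag norms in the hinge loss are all assumptions for illustration.

```python
import numpy as np

def recalibrate_bag(feats: np.ndarray, scores: np.ndarray) -> np.ndarray:
    """Shift a bag's instance features using the max-instance statistics.

    feats:  (N, D) instance features for one WSI bag.
    scores: (N,) per-instance scores used to select the critical instance.
    """
    critical = feats[np.argmax(scores)]   # max-scoring ("critical") instance feature
    return feats - critical               # bag distribution shifted by its statistics

def magnitude_margin_loss(pos_bag: np.ndarray, neg_bag: np.ndarray,
                          margin: float = 1.0) -> float:
    """Hinge loss encouraging positive bags to have larger feature
    magnitude (L2 norm of the mean-pooled bag) than negative bags."""
    gap = np.linalg.norm(pos_bag.mean(axis=0)) - np.linalg.norm(neg_bag.mean(axis=0))
    return float(max(0.0, margin - gap))
```

After re-calibration the critical instance sits at the origin, so every other instance is expressed relative to it; the hinge loss is zero whenever the positive bag's magnitude exceeds the negative's by at least the margin.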
DOI: 10.1109/TMI.2024.3446716