Multiple instance learning using pathology foundation models effectively predicts kidney disease diagnosis and clinical classification

Bibliographic Details
Published in: Scientific Reports Vol. 15; no. 1; pp. 35298 - 12
Main Authors: Kurata, Yu; Mimura, Imari; Kodera, Satoshi; Abe, Hiroyuki; Yamada, Daisuke; Kume, Haruki; Ushiku, Tetsuo; Tanaka, Tetsuhiro; Takeda, Norihiko; Nangaku, Masaomi
Format: Journal Article
Language: English
Published: London Nature Publishing Group UK 09.10.2025
Nature Publishing Group
Nature Portfolio
ISSN: 2045-2322
Description
Summary: Recently developed pathology foundation models, pretrained on large-scale pathology datasets, have demonstrated excellent performance in various downstream tasks. This study evaluated the utility of pathology foundation models combined with multiple instance learning (MIL) for kidney pathology analysis. We used 242 hematoxylin and eosin-stained whole slide images (WSIs) from the Kidney Precision Medicine Project (KPMP) and Japan-Pathology Artificial Intelligence Diagnostics Project databases as the development cohort, comprising 47 healthy control, 35 acute interstitial nephritis, and 160 diabetic kidney disease (DKD) slides. External validation was performed using 83 WSIs from the University of Tokyo Hospital. Pretrained pathology foundation models were used as patch encoders and compared with an ImageNet-pretrained ResNet50. Using the extracted patch features, we trained MIL models to classify diagnoses. In internal validation, all foundation models outperformed ResNet50, achieving an area under the receiver operating characteristic curve (AUROC) above 0.980. In external validation, the performance of ResNet50 dropped markedly, in contrast to that of all foundation models. Visualization of attention heatmaps confirmed that foundation models accurately recognized diagnostically relevant structures. In the overt proteinuria (albuminuria ≥ 300 mg/gCre or proteinuria ≥ 1000 mg/gCre) prediction task, foundation models also outperformed ResNet50. We successfully integrated pathology foundation models with MIL to achieve robust diagnostic performance.
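Illustrative note: the summary describes encoding each WSI as a bag of patch features and training an MIL model on top of them. The record does not name the MIL variant; the mention of attention heatmaps suggests attention-based pooling, so the sketch below assumes that. All function names, dimensions, and weights are hypothetical and untrained (random), and the patch features stand in for the output of a pretrained encoder. It shows only how per-patch attention weights are formed and how a slide-level prediction over the three diagnoses could follow; it is not the authors' implementation.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_mil_pool(patch_features, v, w):
    """Pool a bag of patch features into one slide-level vector.

    patch_features: N vectors of length D (assumed encoder outputs)
    v: hidden projection, H rows of length D (hypothetical weights)
    w: attention vector of length H (hypothetical weights)
    Returns (slide_vector, attention_weights).
    """
    scores = []
    for x in patch_features:
        hidden = [math.tanh(sum(vi * xi for vi, xi in zip(row, x))) for row in v]
        scores.append(sum(wi * hi for wi, hi in zip(w, hidden)))
    attn = softmax(scores)  # one weight per patch, summing to 1
    dim = len(patch_features[0])
    slide = [sum(a * x[d] for a, x in zip(attn, patch_features)) for d in range(dim)]
    return slide, attn


def classify(slide_vector, class_weights, class_bias):
    """Linear classifier on the pooled vector, returning class probabilities."""
    logits = [
        sum(cw * s for cw, s in zip(row, slide_vector)) + b
        for row, b in zip(class_weights, class_bias)
    ]
    return softmax(logits)


# Toy sizes chosen for readability, not taken from the study.
N_PATCHES, DIM, HIDDEN = 5, 8, 4
CLASSES = ["healthy control", "acute interstitial nephritis", "diabetic kidney disease"]

rng = random.Random(0)
bag = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(N_PATCHES)]
v = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(HIDDEN)]
w = [rng.gauss(0, 1) for _ in range(HIDDEN)]
class_weights = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in CLASSES]
class_bias = [0.0 for _ in CLASSES]

slide_vec, attn = attention_mil_pool(bag, v, w)
probs = classify(slide_vec, class_weights, class_bias)
```

Under this assumption, the `attn` values are what an attention heatmap would display when mapped back to patch positions on the slide.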
DOI: 10.1038/s41598-025-19297-9