International multicenter validation of AI-driven ultrasound detection of ovarian cancer

Bibliographic Details
Title: International multicenter validation of AI-driven ultrasound detection of ovarian cancer
Authors: Filip Christiansen, Emir Konuk, Adithya Raju Ganeshan, Robert Welch, Joana Palés Huix, Artur Czekierdowski, Francesco Paolo Giuseppe Leone, Lucia Anna Haak, Robert Fruscio, Adrius Gaurilcikas, Dorella Franchi, Daniela Fischerova, Elisa Mor, Luca Savelli, Maria Àngela Pascual, Marek Jerzy Kudla, Stefano Guerriero, Francesca Buonomo, Karina Liuba, Nina Montik, Juan Luis Alcázar, Ekaterini Domali, Nelinda Catherine P. Pangilinan, Chiara Carella, Maria Munaretto, Petra Saskova, Debora Verri, Chiara Visenzi, Pawel Herman, Kevin Smith, Elisabeth Epstein
Source: Nat Med
Publisher Information: Springer Science and Business Media LLC, 2025.
Publication Year: 2025
Subject Terms: Ovarian Neoplasms / diagnostic imaging, Ultrasonography / methods, Ultrasound, Artificial Intelligence, Deep Learning, Neural Networks, Computer, Sensitivity and Specificity, Retrospective Studies, Humans, Female, Adult, Middle Aged, Aged, Article
Description: Ovarian lesions are common and often incidentally detected. A critical shortage of expert ultrasound examiners has raised concerns of unnecessary interventions and delayed cancer diagnoses. Deep learning has shown promising results in the detection of ovarian cancer in ultrasound images; however, external validation is lacking. In this international multicenter retrospective study, we developed and validated transformer-based neural network models using a comprehensive dataset of 17,119 ultrasound images from 3,652 patients across 20 centers in eight countries. Using a leave-one-center-out cross-validation scheme, for each center in turn, we trained a model using data from the remaining centers. The models demonstrated robust performance across centers, ultrasound systems, histological diagnoses and patient age groups, significantly outperforming both expert and non-expert examiners on all evaluated metrics, namely F1 score, sensitivity, specificity, accuracy, Cohen’s kappa, Matthews correlation coefficient, diagnostic odds ratio and Youden’s J statistic. Furthermore, in a retrospective triage simulation, artificial intelligence (AI)-driven diagnostic support reduced referrals to experts by 63% while significantly surpassing the diagnostic performance of the current practice. These results show that transformer-based models exhibit strong generalization and above human expert-level diagnostic accuracy, with the potential to alleviate the shortage of expert ultrasound examiners and improve patient outcomes.
Document Type: Article
Other literature type
File Description: application/pdf
Language: English
ISSN: 1546-170X
1078-8956
DOI: 10.1038/s41591-024-03329-4
Access URL: https://pubmed.ncbi.nlm.nih.gov/39747679
https://pergamos.lib.uoa.gr/uoa/dl/object/3489361
Rights: CC BY
Accession Number: edsair.doi.dedup.....7ffddac090cb7f3b012552d1298503cc
Database: OpenAIRE
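Illustrative note: the abstract names a leave-one-center-out cross-validation scheme and eight binary classification metrics. The sketch below shows, under stated assumptions, how such splits and metrics are commonly computed. It is not the authors' code. The function names `leave_one_center_out` and `binary_metrics` and the toy center labels are hypothetical, and the metric formulas are the standard textbook definitions for a 2x2 confusion matrix, which may differ in detail from the study's own evaluation.

```python
from collections import defaultdict
from math import sqrt


def leave_one_center_out(centers):
    """Yield (held_out_center, train_indices, test_indices) per center.

    `centers` holds one center label per sample. Each center is held out
    once as the test set, and all remaining centers form the training set.
    """
    by_center = defaultdict(list)
    for index, center in enumerate(centers):
        by_center[center].append(index)
    for held_out in sorted(by_center):
        test_idx = by_center[held_out]
        train_idx = sorted(
            i for center, idx in by_center.items() if center != held_out for i in idx
        )
        yield held_out, train_idx, test_idx


def binary_metrics(tp, fp, fn, tn):
    """Standard metrics from confusion-matrix counts.

    Assumes every count is positive so that no denominator is zero.
    """
    n = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / n
    f1 = 2 * tp / (2 * tp + fp + fn)
    # Cohen's kappa: observed agreement corrected for chance agreement.
    chance = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2
    kappa = (accuracy - chance) / (1 - chance)
    mcc = (tp * tn - fp * fn) / sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "f1": f1,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "cohen_kappa": kappa,
        "mcc": mcc,
        "diagnostic_odds_ratio": (tp * tn) / (fp * fn),
        "youden_j": sensitivity + specificity - 1,
    }


if __name__ == "__main__":
    # Toy center labels, invented for illustration only.
    toy_centers = ["A", "A", "B", "C", "B"]
    for held_out, train_idx, test_idx in leave_one_center_out(toy_centers):
        print(held_out, train_idx, test_idx)
    # Toy confusion-matrix counts, not results from the study.
    print(binary_metrics(tp=40, fp=10, fn=10, tn=40))
```

Holding out a whole center at a time means each model is scored on data from a site it never saw during training, which is what lets the scheme stand in for external validation.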