Generalizability of Deep Learning Segmentation Algorithms for Automated Assessment of Cartilage Morphology and MRI Relaxometry

Bibliographic Details
Published in: Journal of Magnetic Resonance Imaging, Vol. 57, No. 4, pp. 1029–1039
Main Authors: Schmidt, Andrew M., Desai, Arjun D., Watkins, Lauren E., Crowder, Hollis A., Black, Marianne S., Mazzoli, Valentina, Rubin, Elka B., Lu, Quin, MacKay, James W., Boutin, Robert D., Kogan, Feliks, Gold, Garry E., Hargreaves, Brian A., Chaudhari, Akshay S.
Format: Journal Article
Language: English
Publication Details: Hoboken, USA: John Wiley & Sons, Inc; Wiley Subscription Services, Inc, 1 April 2023
ISSN: 1053-1807, 1522-2586
Description
Summary:
Background: Deep learning (DL)‐based automatic segmentation models can expedite manual segmentation yet require resource‐intensive fine‐tuning before deployment on new datasets. The generalizability of DL methods to new datasets without fine‐tuning is not well characterized.
Purpose: To evaluate the generalizability of DL‐based models by deploying pretrained models on independent datasets varying by MR scanner, acquisition parameters, and subject population.
Study Type: Retrospective, based on prospectively acquired data.
Population: Overall test dataset: 59 subjects (26 females). Study 1: 5 healthy subjects (0 females); Study 2: 8 healthy subjects (8 females); Study 3: 10 subjects with osteoarthritis (8 females); Study 4: 36 subjects with various knee pathology (10 females).
Field Strength/Sequence: 3 T, quantitative double‐echo steady state (qDESS).
Assessment: Four annotators manually segmented knee cartilage. Each reader segmented one of the four qDESS datasets in the test dataset. Two DL models, one trained on qDESS data and another on Osteoarthritis Initiative (OAI)‐DESS data, were assessed. Manual and automatic segmentations were compared by quantifying variations in segmentation accuracy, volume, and T2 relaxation times for superficial and deep cartilage.
Statistical Tests: Dice similarity coefficient (DSC) for segmentation accuracy. Lin's concordance correlation coefficient (CCC), Wilcoxon rank‐sum tests, and root‐mean‐squared error coefficient of variation to quantify manual vs. automatic T2 and volume variations. Bland–Altman plots for manual vs. automatic T2 agreement. A P value < 0.05 was considered statistically significant.
Results: DSCs for the qDESS‐trained model (0.79–0.93) were higher than those for the OAI‐DESS‐trained model (0.59–0.79). T2 and volume CCCs for the qDESS‐trained model (0.75–0.98 and 0.47–0.95, respectively) were higher than the corresponding CCCs for the OAI‐DESS‐trained model (0.35–0.90 and 0.13–0.84). Bland–Altman 95% limits of agreement for superficial and deep cartilage T2 were narrower for the qDESS‐trained model (±2.4 msec and ±4.0 msec) than for the OAI‐DESS‐trained model (±4.4 msec and ±5.2 msec).
Data Conclusion: The qDESS‐trained model may generalize well to independent qDESS datasets regardless of MR scanner, acquisition parameters, and subject population.
Evidence Level: 1
Technical Efficacy: Stage 1
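For readers unfamiliar with the agreement metrics named in the abstract, the following is a minimal Python sketch of how the Dice similarity coefficient, Lin's concordance correlation coefficient, and Bland–Altman 95% limits of agreement are commonly computed. It is not the authors' code. The function names, the flat-list mask representation, and the variance conventions (population moments for the CCC, sample standard deviation for the limits of agreement) are assumptions made for illustration, and the record does not state which conventions the study used.

```python
from statistics import mean, stdev


def dice(mask_a, mask_b):
    """Dice similarity coefficient of two equal-length flat binary masks."""
    a = [bool(v) for v in mask_a]
    b = [bool(v) for v in mask_b]
    overlap = sum(1 for x, y in zip(a, b) if x and y)
    total = sum(a) + sum(b)
    # Assumption: two empty masks are treated as perfect agreement.
    return 2.0 * overlap / total if total else 1.0


def lin_ccc(x, y):
    """Lin's concordance correlation coefficient (population moments)."""
    mx, my = mean(x), mean(y)
    var_x = mean([(xi - mx) ** 2 for xi in x])
    var_y = mean([(yi - my) ** 2 for yi in y])
    cov_xy = mean([(xi - mx) * (yi - my) for xi, yi in zip(x, y)])
    # Penalizes both poor correlation and a systematic offset between x and y.
    return 2.0 * cov_xy / (var_x + var_y + (mx - my) ** 2)


def bland_altman_limits(x, y):
    """Return (bias, lower, upper) 95% limits of agreement for paired data."""
    diffs = [xi - yi for xi, yi in zip(x, y)]
    bias = mean(diffs)
    half_width = 1.96 * stdev(diffs)  # sample SD of the paired differences
    return bias, bias - half_width, bias + half_width
```

With hypothetical paired measurements, `lin_ccc(manual_t2, auto_t2)` approaches 1 only when the automatic values both correlate with and numerically match the manual ones, which is why it is a stricter check than a plain correlation coefficient.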
DOI:10.1002/jmri.28365