Systematic review identifies the design and methodological conduct of studies on machine learning-based prediction models

Bibliographic details
Published in:	Journal of Clinical Epidemiology, Vol. 154, pp. 8–22
Main authors:	Andaur Navarro, Constanza L., Damen, Johanna A.A., van Smeden, Maarten, Takada, Toshihiko, Nijman, Steven W.J., Dhiman, Paula, Ma, Jie, Collins, Gary S., Bajpai, Ram, Riley, Richard D., Moons, Karel G.M., Hooft, Lotty
Format:	Journal Article
Language:	English
Publication details:	United States, Elsevier Inc, 01.02.2023; Elsevier Limited
Subjects:	Algorithms; Calibration; Datasets; Design; Development; Diagnosis; Epidemiology; Humans; Internal Medicine; Learning algorithms; Machine Learning; Medical diagnosis; Medical prognosis; Missing data; Patients; Prediction models; Predictive algorithm; Prognosis; Risk prediction; ROC Curve; Supervised learning; Supervised Machine Learning; Support vector machines; Systematic review; Validation
ISSN:	0895-4356, 1878-5921

Description
Summary:	We sought to summarize the study design, modelling strategies, and performance measures reported in studies on clinical prediction models developed using machine learning techniques. We searched PubMed for articles published between 01/01/2018 and 31/12/2019 that described the development, or the development with external validation, of a multivariable prediction model using any supervised machine learning technique. No restrictions were made based on study design, data source, or predicted patient-related health outcomes. We included 152 studies: 58 (38.2% [95% CI 30.8–46.1]) were diagnostic and 94 (61.8% [95% CI 53.9–69.2]) were prognostic. Most studies reported only the development of prediction models (n = 133, 87.5% [95% CI 81.3–91.8]), focused on binary outcomes (n = 131, 86.2% [95% CI 79.8–90.8]), and did not report a sample size calculation (n = 125, 82.2% [95% CI 75.4–87.5]). The most common algorithms used were support vector machine (n = 86/522, 16.5% [95% CI 13.5–19.9]) and random forest (n = 73/522, 14.0% [95% CI 11.3–17.2]). Values for the area under the receiver operating characteristic curve ranged from 0.45 to 1.00. Calibration metrics were often not reported (n = 494/522, 94.6% [95% CI 92.4–96.3]). Our review revealed that more attention is required to the handling of missing values, methods for internal validation, and reporting of calibration to improve the methodological conduct of studies on machine learning–based prediction models. Registration: PROSPERO, CRD42019161764.
DOI:	10.1016/j.jclinepi.2022.11.015
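
Note on the reported intervals: the summary gives each proportion with a 95% confidence interval but this record does not say how the intervals were computed. The sketch below assumes a Wilson score interval, which happens to reproduce the figures quoted for the diagnostic studies (58 of 152); the function name `wilson_interval` and the use of z = 1.96 are illustrative choices, not taken from the article.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Return the (lower, upper) Wilson score interval for a proportion.

    Assumption: z = 1.96 approximates the 95% level; the article's
    actual method is not stated in this record.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    # Centre of the interval is pulled slightly towards 0.5.
    centre = (p + z**2 / (2 * total)) / denom
    half_width = (
        z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    )
    return centre - half_width, centre + half_width


# Diagnostic studies in the summary: 58 of 152 included studies.
lo, hi = wilson_interval(58, 152)
print(f"{58 / 152:.1%} [95% CI {lo:.1%} to {hi:.1%}]")
# prints: 38.2% [95% CI 30.8% to 46.1%]
```

The same call with other counts from the summary (for example 94 of 152 prognostic studies) can be used to check whether a given interval is consistent with this assumption.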