Evaluation metrics and statistical tests for machine learning

Bibliographic details
Published in:	Scientific Reports, Volume 14, Issue 1, pp. 6086 - 14
Main authors:	Rainio, Oona, Teuho, Jarmo, Klén, Riku
Format:	Journal Article
Language:	English
Published:	London: Nature Publishing Group UK, 13 March 2024; Nature Publishing Group; Nature Portfolio
Subjects:	639/705/117; 639/705/531; Accuracy; Bronchopulmonary infection; Classification; Evaluation metrics; Humanities and Social Sciences; Image processing; Image Processing, Computer-Assisted - methods; Information processing; Learning algorithms; Lung cancer; Machine Learning; Mathematical models; Medical images; multidisciplinary; Neural networks; Neural Networks, Computer; Performance evaluation; Positron emission tomography; Science; Science (multidisciplinary); Statistical analysis; Statistical testing; Supervised Machine Learning; Tomography; X-rays
ISSN:	2045-2322

Description
Summary:	Research on different machine learning (ML) methods has become incredibly popular during the past few decades. However, researchers who are not familiar with statistics might find it difficult to understand how to evaluate the performance of ML models and compare them with each other. Here, we introduce the most common evaluation metrics used for the typical supervised ML tasks, including binary, multi-class, and multi-label classification, regression, image segmentation, object detection, and information retrieval. We explain how to choose a suitable statistical test for comparing models, how to obtain enough values of the metric for testing, and how to perform the test and interpret its results. We also present a few practical examples of comparing convolutional neural networks used to classify X-rays with different lung infections and to detect cancer tumors in positron emission tomography images.
DOI:	10.1038/s41598-024-56706-x