A comprehensive benchmark of machine and deep learning models on structured data for regression and classification
Saved in:
| Title: | A comprehensive benchmark of machine and deep learning models on structured data for regression and classification |
|---|---|
| Authors: | Shmuel, Assaf; Glickman, Oren; Lazebnik, Teddy |
| Source: | Neurocomputing, 655 |
| Keywords: | Benchmark, Deep learning, Gradient boosting models, Adaptive boosting, Classification (of information), Learning systems, Excel, Gradient boosting, Gradient boosting model, Learning models, Machine-learning, Real-world, Scientific researches, Structured data, article, benchmarking, classification, filtration, machine learning, major clinical study |
| Description: | The analysis of tabular datasets is highly prevalent both in scientific research and real-world applications of Machine Learning (ML). Unlike many other ML tasks, Deep Learning (DL) models often do not outperform traditional methods in this area. Previous comparative benchmarks have shown that DL performance is frequently equivalent to or even inferior to models such as Gradient Boosting Machines (GBMs). In this study, we introduce a comprehensive benchmark aimed at better characterizing the types of datasets where DL models excel. Although several important benchmarks for tabular datasets already exist, our contribution lies in the variety and depth of our comparison: we evaluate 111 datasets with 20 different models, including both regression and classification tasks. These datasets vary in scale and include both those with and without categorical variables. Importantly, our benchmark contains a sufficient number of datasets where DL models perform best, allowing for a thorough analysis of the conditions under which DL models excel. Building on the results of this benchmark, we train a model that predicts scenarios where DL models significantly outperform alternative methods, considering only datasets where the performance difference between the two groups is statistically significant. This filtering yields 36 datasets out of the original 111. On this subset, our model achieves 92% accuracy. We present insights derived from this characterization and compare these findings to previous benchmarks. |
| Access URL: | https://urn.kb.se/resolve?urn=urn:nbn:se:hj:diva-69810 https://doi.org/10.1016/j.neucom.2025.131337 |
| Database: | SwePub |
| ISSN: | 0925-2312, 1872-8286 |
| DOI: | 10.1016/j.neucom.2025.131337 |
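The abstract describes keeping only datasets where the DL-versus-GBM performance gap is statistically significant before training the meta-model. A minimal sketch of that kind of paired comparison is shown below, using scikit-learn and SciPy on a stand-in dataset; the paper's actual model configurations, datasets, and significance test are not given in this record, so every concrete choice here is an assumption.

```python
# Illustrative sketch only: compares one GBM against one small neural network
# on a stand-in tabular dataset, then runs a paired significance test over
# matched cross-validation folds. Not the authors' protocol.
from scipy.stats import ttest_rel
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)  # stand-in tabular dataset

# Fixed CV splitter so both models are scored on identical folds,
# which is what makes the paired test below valid.
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=3, random_state=0)

gbm = GradientBoostingClassifier(random_state=0)
mlp = make_pipeline(
    StandardScaler(),  # neural nets are scale-sensitive; GBMs are not
    MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=500, random_state=0),
)

gbm_scores = cross_val_score(gbm, X, y, cv=cv, scoring="accuracy")
mlp_scores = cross_val_score(mlp, X, y, cv=cv, scoring="accuracy")

# Paired t-test over matched folds: in the paper's setup, a dataset would
# enter the 36-dataset subset only if a gap like this is significant.
t_stat, p_value = ttest_rel(gbm_scores, mlp_scores)
print(f"GBM mean accuracy: {gbm_scores.mean():.3f}")
print(f"MLP mean accuracy: {mlp_scores.mean():.3f}")
print(f"paired t-test p-value: {p_value:.4f}")
```

Repeating this comparison per dataset and labeling each one by which model family won would yield the kind of meta-dataset the abstract's predictive model is trained on.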