Microbiome meta-analysis and cross-disease comparison enabled by the SIAMCAT machine learning toolbox
| Published in: | Genome Biology Vol. 22; no. 1; p. 93 |
|---|---|
| Main Authors: | , , , , , , , , |
| Format: | Journal Article |
| Language: | English |
| Published: | London: BioMed Central, 30.03.2021; Springer Nature B.V; BMC |
| ISSN: | 1474-760X, 1474-7596 |
| Summary: | The human microbiome is increasingly mined for diagnostic and therapeutic biomarkers using machine learning (ML). However, metagenomics-specific software is scarce, and overoptimistic evaluation and limited cross-study generalization are prevailing issues. To address these, we developed SIAMCAT, a versatile R toolbox for ML-based comparative metagenomics. We demonstrate its capabilities in a meta-analysis of fecal metagenomic studies (10,803 samples). When naively transferred across studies, ML models lost accuracy and disease specificity, which could however be resolved by a novel training set augmentation strategy. This reveals some biomarkers to be disease-specific, with others shared across multiple conditions. SIAMCAT is freely available from siamcat.embl.de. |
| DOI: | 10.1186/s13059-021-02306-1 |