Integration strategies of multi-omics data for machine learning analysis

Bibliographic details
Published in: Computational and structural biotechnology journal, Volume 19; pp. 3735 - 3746
Main authors: Picard, Milan; Scott-Boyer, Marie-Pier; Bodein, Antoine; Périn, Olivier; Droit, Arnaud
Format: Journal Article
Language: English
Published: Elsevier B.V 01.01.2021
Research Network of Computational and Structural Biotechnology
Elsevier
ISSN:2001-0370, 2001-0370
Description
Summary: Schematic representation of the main strategies for multi-omics dataset integration. A) Early integration concatenates all omics datasets into a single matrix to which a machine learning model can be applied. B) Mixed integration first independently transforms or maps each omics block into a new representation before combining them for downstream analysis. C) Intermediate integration simultaneously transforms the original datasets into common and omics-specific representations. D) Late integration analyses each omics dataset separately and combines their final predictions. E) Hierarchical integration bases the integration of datasets on prior regulatory relationships between omics layers. [Display omitted]

The increased availability of high-throughput technologies has generated an ever-growing volume of omics data that seek to portray many different but complementary biological layers, including genomics, epigenomics, transcriptomics, proteomics, and metabolomics. New insights from these data have been obtained by machine learning algorithms that have produced diagnostic and classification biomarkers. Most biomarkers obtained to date, however, include only one omics measurement at a time and thus do not take full advantage of recent multi-omics experiments that now capture the entire complexity of biological systems. Multi-omics data integration strategies are needed to combine the complementary knowledge brought by each omics layer. We have summarized the most recent data integration methods and frameworks into five different integration strategies: early, mixed, intermediate, late, and hierarchical. In this mini-review, we focus on challenges and existing multi-omics integration strategies, paying special attention to machine learning applications.
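To make the difference between strategies A (early) and D (late) in the summary concrete, the following is a minimal sketch, not the authors' method. It assumes two toy omics blocks measured on the same samples and uses a simple nearest-centroid classifier as a stand-in for any machine learning model. The function names (`fit_centroids`, `class_scores`, `early_integration`, `late_integration`) and the toy values are hypothetical and chosen only for illustration.

```python
import numpy as np


def fit_centroids(X, y):
    """Return the sorted class labels and one mean vector (centroid) per class."""
    classes = np.unique(y)
    centroids = np.stack([X[y == c].mean(axis=0) for c in classes])
    return classes, centroids


def class_scores(X, centroids):
    """Softmax over negative squared distances to each class centroid."""
    d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    z = -d
    z -= z.max(axis=1, keepdims=True)  # stabilise the exponentials
    p = np.exp(z)
    return p / p.sum(axis=1, keepdims=True)


def early_integration(blocks_train, y_train, blocks_test):
    """Strategy A: concatenate all blocks column-wise, then fit one model."""
    X_train = np.hstack(blocks_train)
    X_test = np.hstack(blocks_test)
    classes, centroids = fit_centroids(X_train, y_train)
    return classes[class_scores(X_test, centroids).argmax(axis=1)]


def late_integration(blocks_train, y_train, blocks_test):
    """Strategy D: fit one model per block, then average their class scores."""
    classes = np.unique(y_train)
    per_block = []
    for X_train, X_test in zip(blocks_train, blocks_test):
        _, centroids = fit_centroids(X_train, y_train)
        per_block.append(class_scores(X_test, centroids))
    mean_scores = np.mean(per_block, axis=0)
    return classes[mean_scores.argmax(axis=1)]


# Toy data (assumed values): 4 training samples, 2 test samples, 2 omics blocks.
y_train = np.array([0, 0, 1, 1])
block_a_train = np.array([[0.0, 0.0], [1.0, 0.0], [10.0, 10.0], [11.0, 10.0]])
block_b_train = np.array([[0.0, 0.0, 0.0], [0.0, 1.0, 0.0],
                          [5.0, 5.0, 5.0], [5.0, 6.0, 5.0]])
block_a_test = np.array([[0.5, 0.5], [10.0, 9.0]])
block_b_test = np.array([[0.0, 0.0, 1.0], [5.0, 5.0, 4.0]])

early_pred = early_integration(
    [block_a_train, block_b_train], y_train, [block_a_test, block_b_test]
)
late_pred = late_integration(
    [block_a_train, block_b_train], y_train, [block_a_test, block_b_test]
)
print("early:", early_pred, "late:", late_pred)
```

In this sketch, early integration lets the model see all features at once, so blocks with many features or large scales can dominate the distance computation. Late integration keeps each block's model separate and merges only the final class scores; averaging is one of several possible combination rules.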
DOI:10.1016/j.csbj.2021.06.030