Aligned deep neural network for integrative analysis with high-dimensional input

Bibliographic details
Published in: Journal of biomedical informatics, vol. 144, p. 104434
Main authors: Zhang, Shunqin; Zhang, Sanguo; Yi, Huangdi; Ma, Shuangge
Format: Journal Article
Language: English
Published: United States, 1 August 2023
ISSN: 1532-0480
Description
Abstract: Deep neural network (DNN) techniques have demonstrated significant advantages over regression and some other techniques. In recent studies, DNN-based analysis has been conducted on data with high-dimensional input such as omics measurements. In such analysis, regularization, in particular penalization, has been applied to regularize estimation and distinguish relevant input variables from irrelevant ones. A unique challenge arises from the "lack of information" attributable to high dimensionality of input and limited size of training data. For many data/studies, there exist other data/studies that may be relevant and can potentially provide additional information to boost performance. In this study, we conduct integrative analysis of multiple independent datasets/studies, with the goal of borrowing information across each other and improving overall performance. Significantly different from regression-based integrative analysis (where alignment can be easily achieved based on covariates), alignment across multiple DNNs can be nontrivial. We develop ANNI, an Aligned DNN technique for Integrative analysis with high-dimensional input. Penalization is applied for regularized estimation, selection of important input variables, and, equally importantly, information borrowing across multiple DNNs. An effective computational algorithm is developed. Extensive simulations demonstrate competitive performance of the proposed technique. The analysis of cancer omics data further establishes its practical utility.
DOI: 10.1016/j.jbi.2023.104434