CAEM-GBDT: a cancer subtype identifying method using multi-omics data and convolutional autoencoder network

Bibliographic Details
Published in:Frontiers in bioinformatics Vol. 4; p. 1403826
Main Authors: Shen, Jiquan, Guo, Xuanhui, Bai, Hanwen, Luo, Junwei
Format: Journal Article
Language:English
Published: Switzerland Frontiers Media S.A 15.07.2024
ISSN:2673-7647
Description
Summary:The identification of cancer subtypes plays an important role in medicine, since accurate subtype identification supports both cancer treatment and prognosis. Currently, most methods for cancer subtype identification are based on single-omics data, such as gene expression data. However, multi-omics data capture a wider range of cancer characteristics, which can also improve the accuracy of cancer subtype identification. Therefore, how to extract features from multi-omics data for cancer subtype identification is the main challenge currently faced by researchers. In this paper, we propose a cancer subtype identification method named CAEM-GBDT, which takes gene expression data, miRNA expression data, and DNA methylation data as input and adopts a convolutional autoencoder network to identify cancer subtypes. Through a convolutional encoder layer, the method performs feature extraction on the input data. Within the convolutional encoder layer, a convolutional self-attention module is embedded to recognize higher-level representations of the multi-omics data. The extracted high-level representations from the convolutional encoder are then concatenated with the input to the decoder. A GBDT (Gradient Boosting Decision Tree) is then used for cancer subtype identification. In the experiments, we compare CAEM-GBDT with existing cancer subtype identification methods. Experimental results demonstrate that the proposed CAEM-GBDT outperforms the other methods. The source code is available from GitHub at https://github.com/gxh-1/CAEM-GBDT.git.
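The pipeline described in the summary can be illustrated with a minimal sketch. This is not the authors' implementation (see their GitHub repository for that). It assumes PyTorch and scikit-learn, uses synthetic data in place of real omics matrices, and substitutes a standard multi-head self-attention layer for the paper's convolutional self-attention module. It also does not attempt to reproduce the concatenation of encoder representations with the decoder input. All names, layer sizes, and hyperparameters are hypothetical.

```python
import numpy as np
import torch
from torch import nn
from sklearn.ensemble import GradientBoostingClassifier

torch.manual_seed(0)
rng = np.random.default_rng(0)

# Synthetic stand-ins for the three omics matrices (samples x features).
n_samples = 60
labels = rng.integers(0, 3, size=n_samples)  # three hypothetical subtypes


def fake_omics(n_features):
    # Shift each sample by its label so the toy classes are separable.
    return rng.normal(size=(n_samples, n_features)) + labels[:, None]


gene, mirna, methyl = fake_omics(64), fake_omics(32), fake_omics(32)
x = np.concatenate([gene, mirna, methyl], axis=1).astype(np.float32)  # (60, 128)


class ConvAutoencoder(nn.Module):
    """Toy 1D convolutional autoencoder with self-attention in the encoder."""

    def __init__(self, channels=8, heads=2):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(1, channels, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
        )
        # Assumption: plain multi-head attention stands in for the paper's module.
        self.attention = nn.MultiheadAttention(channels, heads, batch_first=True)
        self.decoder = nn.ConvTranspose1d(
            channels, 1, kernel_size=4, stride=2, padding=1
        )

    def encode(self, inputs):
        h = self.conv(inputs.unsqueeze(1))      # (batch, channels, length / 2)
        seq = h.transpose(1, 2)                 # (batch, length / 2, channels)
        attended, _ = self.attention(seq, seq, seq)
        return (seq + attended).transpose(1, 2)  # residual connection

    def forward(self, inputs):
        return self.decoder(self.encode(inputs)).squeeze(1)


# Train the autoencoder to reconstruct the concatenated omics vector.
model = ConvAutoencoder()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
inputs = torch.from_numpy(x)
for _ in range(50):
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(model(inputs), inputs)
    loss.backward()
    optimizer.step()

# Use the encoder output as features for a gradient boosting classifier.
with torch.no_grad():
    latent = model.encode(inputs).flatten(1).numpy()  # (60, 512)

clf = GradientBoostingClassifier(n_estimators=20, random_state=0)
clf.fit(latent[:40], labels[:40])
accuracy = clf.score(latent[40:], labels[40:])
```

The split into an unsupervised reconstruction stage followed by a separate tree-based classifier mirrors the two-stage design the summary describes. The accuracy obtained on this synthetic data says nothing about performance on real cancer datasets.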
Reviewed by: Lun Hu, Chinese Academy of Sciences (CAS), China
Edited by: Chunhou Zheng, Anhui University, China
Hongdong Li, Central South University, China
DOI:10.3389/fbinf.2024.1403826