DSTPCA: Double-Sparse Constrained Tensor Principal Component Analysis Method for Feature Selection

Bibliographic Details
Published in: IEEE/ACM Transactions on Computational Biology and Bioinformatics, Vol. 18, no. 4, pp. 1481-1491
Main Authors: Hu, Yue; Liu, Jin-Xing; Gao, Ying-Lian; Shang, Junliang
Format: Journal Article
Language: English
Published: United States IEEE 01.07.2021
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN: 1545-5963, 1557-9964
Description
Summary: Identifying differentially expressed genes plays an increasingly important role in biology, so feature selection has attracted much attention in bioinformatics. Principal component analysis, the most popular method, operates on two-dimensional data without considering the spatial geometric structure of the data. The recently proposed tensor robust principal component analysis method performs sparse and low-rank decomposition on three-dimensional tensors and effectively preserves the spatial structure. Building on this approach, the DSTPCA (Double-Sparse Constrained Tensor Principal Component Analysis) method introduces an $L_{2,1}$-norm regularization term. DSTPCA removes redundant noise by imposing double sparse constraints on the objective function to obtain sufficiently sparse results. With the regularization norm introduced into the model, the ADMM (alternating direction method of multipliers) algorithm is used to solve the optimization problem. In the feature selection experiments, more redundant genes were filtered out and more genes closely associated with disease were selected. Experimental results on different datasets indicate that our method outperforms other methods.
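To make the L_{2,1}-norm term mentioned in the summary concrete, the sketch below computes the norm of a plain matrix and applies the row-wise shrinkage step that commonly handles such a term inside ADMM-style solvers. It is an illustration only, not the authors' tensor formulation: the function names, the row-wise convention (rows as genes) and the toy values are all assumptions.

```python
import math

def l21_norm(matrix):
    """Sum of the Euclidean norms of the rows of a matrix (list of rows)."""
    return sum(math.sqrt(sum(x * x for x in row)) for row in matrix)

def prox_l21(matrix, tau):
    """Row-wise shrinkage: proximal operator of tau * (L_{2,1} norm)."""
    result = []
    for row in matrix:
        norm = math.sqrt(sum(x * x for x in row))
        # Rows whose norm is at most tau are set to zero;
        # all other rows are scaled toward zero by (1 - tau / norm).
        scale = max(0.0, 1.0 - tau / norm) if norm > 0 else 0.0
        result.append([scale * x for x in row])
    return result

# Hypothetical toy data: rows stand in for genes, columns for samples.
toy = [[3.0, 4.0], [0.3, 0.4], [0.0, 0.0]]
shrunk = prox_l21(toy, tau=1.0)
```

Because the shrinkage acts on whole rows, weak rows are removed entirely rather than entry by entry, which is why this kind of penalty is often used to discard uninformative features.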
DOI: 10.1109/TCBB.2019.2943459