DSTPCA: Double-Sparse Constrained Tensor Principal Component Analysis Method for Feature Selection
| Published in: | IEEE/ACM transactions on computational biology and bioinformatics Vol. 18; no. 4; pp. 1481 - 1491 |
|---|---|
| Main Authors: | , , , |
| Format: | Journal Article |
| Language: | English |
| Published: | United States; IEEE; 01.07.2021; The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| Subjects: | |
| ISSN: | 1545-5963, 1557-9964 |
| Summary: | The identification of differentially expressed genes plays an increasingly important role in biology, and feature selection has therefore attracted much attention in bioinformatics. Principal component analysis, the most popular method, operates on two-dimensional data and does not consider the spatial geometric structure of the data. The recently proposed tensor robust principal component analysis method performs sparse and low-rank decomposition on three-dimensional tensors and effectively preserves the spatial structure. Based on this approach, the $L_{2,1}$-norm regularization term is introduced to form the DSTPCA (Double-Sparse Constrained Tensor Principal Component Analysis) method. The DSTPCA method removes redundant noise by imposing double sparse constraints on the objective function to obtain sufficiently sparse results. After the regularization norm is introduced into the model, the ADMM (alternating direction method of multipliers) algorithm is used to solve the optimization problem. In the feature selection experiments, more redundant genes were filtered out and more genes closely associated with disease were selected. Experimental results on different datasets indicate that our method outperforms other methods. |
|---|---|
| DOI: | 10.1109/TCBB.2019.2943459 |
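
The summary refers to an $L_{2,1}$-norm regularization term solved with ADMM. As a rough illustration only, the sketch below shows the standard matrix $L_{2,1}$ norm (the sum of the Euclidean norms of the rows) and its proximal operator, the row-wise shrinkage step that commonly appears as one ADMM subproblem for this kind of penalty. It is not the authors' DSTPCA implementation: the function names `l21_norm` and `prox_l21`, the threshold `tau`, and the choice of treating rows as genes on a plain matrix rather than a three-dimensional tensor are all assumptions made for the example.

```python
import numpy as np


def l21_norm(X):
    """Sum of the Euclidean norms of the rows of X (matrix L2,1 norm)."""
    return float(np.sum(np.linalg.norm(X, axis=1)))


def prox_l21(X, tau):
    """Row-wise shrinkage: proximal operator of tau * ||X||_{2,1}.

    Each row is scaled by max(0, 1 - tau / ||row||_2), so rows whose
    norm is at most tau are set to zero as a whole. The parameter tau
    is a hypothetical threshold chosen for illustration.
    """
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    # Guard against division by zero for all-zero rows.
    scale = np.maximum(0.0, 1.0 - tau / np.maximum(norms, 1e-12))
    return scale * X


# Toy matrix: rows stand in for genes (an assumption), columns for samples.
X = np.array([[3.0, 4.0],
              [0.0, 0.0],
              [0.3, 0.4]])

norm_value = l21_norm(X)
shrunk = prox_l21(X, tau=1.0)
# Rows that were shrunk to zero would be the ones dropped as features.
kept_rows = np.flatnonzero(np.linalg.norm(shrunk, axis=1) > 0)
```

Because the shrinkage acts on whole rows, the penalty zeroes out entire features at once, which is why this norm is often used for feature selection.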