Sparse data embedding and prediction by tropical matrix factorization
| Published in: | BMC Bioinformatics Vol. 22; No. 1; pp. 89 - 18 |
|---|---|
| Main authors: | , , , |
| Format: | Journal Article |
| Language: | English |
| Published: | London: BioMed Central, 25 February 2021; BioMed Central Ltd; Springer Nature B.V; BMC |
| Keywords: | |
| ISSN: | 1471-2105, 1471-2105 |
| Online access: | Full text |
| Abstract: | Background: Matrix factorization methods are linear models, with limited capability to model complex relations. In our work, we use the tropical semiring to introduce non-linearity into matrix factorization models. We propose a method called Sparse Tropical Matrix Factorization (STMF) for the estimation of missing (unknown) values in sparse data. Results: We evaluate the efficiency of the STMF method on both synthetic data and biological data in the form of gene expression measurements downloaded from The Cancer Genome Atlas (TCGA) database. Tests on unique synthetic data showed that the STMF approximation achieves a higher correlation than non-negative matrix factorization (NMF), which is unable to recover patterns effectively. On real data, STMF outperforms NMF on six out of nine gene expression datasets. While NMF assumes normal distribution and tends toward the mean value, STMF can better fit extreme values and distributions. Conclusion: STMF is the first work that uses the tropical semiring on sparse data. We show that in certain cases semirings are useful because they consider the structure, which is different and simpler to understand than it is with standard linear algebra. |
|---|---|
| DOI: | 10.1186/s12859-021-04023-9 |
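
To illustrate how a tropical semiring can introduce non-linearity into a matrix product, the following is a minimal sketch, not the authors' STMF implementation. It assumes the max-plus convention (addition replaced by max, multiplication replaced by +); the function name `maxplus_matmul` and the small matrices `U` and `V` are hypothetical and chosen only for illustration.

```python
import numpy as np


def maxplus_matmul(U, V):
    """Max-plus product: result[i, j] = max over k of (U[i, k] + V[k, j])."""
    # Broadcast to shape (m, r, n), then reduce over the shared axis r.
    return np.max(U[:, :, None] + V[None, :, :], axis=1)


# Hypothetical factor matrices, for illustration only.
U = np.array([[0.0, 2.0],
              [1.0, 0.0]])
V = np.array([[3.0, 1.0],
              [0.0, 4.0]])

R_tropical = maxplus_matmul(U, V)  # each entry is decided by one dominant term
R_linear = U @ V                   # standard product sums all terms

print(R_tropical)
print(R_linear)
```

In the max-plus product each entry is determined by the single largest term rather than by a sum over all terms, which is one way to read the abstract's remark that the tropical approach can fit extreme values better than a linear model.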