Structure-preserving deep embedded clustering algorithm for incomplete gene expression data
| Published in: | Chinese Control Conference, pp. 8255-8261 |
|---|---|
| Main authors: | Wang, Zhencheng; Li, Dan |
| Format: | Conference paper |
| Language: | English |
| Published: | Technical Committee on Control Theory, Chinese Association of Automation, 28 July 2024 |
| Subjects: | |
| ISSN: | 1934-1768 |
| Online access: | Get full text |
| Abstract | Missing values are inevitable in gene expression data, so clustering algorithms cannot be applied directly to incomplete gene expression data to identify co-expressed genes. Deep autoencoders are often used for feature learning when clustering incomplete data because of their strong representation-learning ability. Existing deep autoencoder-based clustering algorithms for incomplete data are two-stage algorithms that perform feature learning before clustering, ignoring the correlation between the two tasks. To ensure that the feature representations learned by the network are oriented to the clustering task and that the mapped features preserve the inherent structure of the input data, this paper proposes a deep embedded clustering algorithm for incomplete gene expression data based on a structure-preserving autoencoder. First, the proposed algorithm applies joint optimization to the clustering of incomplete data, iteratively alternating between feature learning and clustering optimization on the imputed data. Second, rather than preserving the geometric structure of the input data only in the feature space where clustering is performed, the algorithm defines Sammon's stress between the inputs and outputs so that the data retain their inherent geometric structure throughout the mapping process. Experimental results on several gene expression datasets show that the proposed algorithm achieves better results in terms of both clustering performance and biological significance. |
|---|---|
| Author | Li, Dan; Wang, Zhencheng |
| Author_xml | – sequence: 1 givenname: Zhencheng surname: Wang fullname: Wang, Zhencheng organization: Dalian University of Technology,School of Control Science and Engineering,Dalian,P. R. China,116024 – sequence: 2 givenname: Dan surname: Li fullname: Li, Dan email: Idan@dlut.edu.cn organization: Dalian University of Technology,School of Control Science and Engineering,Dalian,P. R. China,116024 |
| ContentType | Conference Proceeding |
| DBID | 6IE 6IL CBEJK RIE RIL |
| DOI | 10.23919/CCC63176.2024.10661868 |
| DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings; IEEE Xplore POP ALL; IEEE Xplore All Conference Proceedings; IEEE/IET Electronic Library (IEL) (UW System Shared); IEEE Proceedings Order Plans (POP All) 1998-Present |
| DatabaseTitleList | |
| Database_xml | – sequence: 1 dbid: RIE name: IEEE Electronic Library (IEL) url: https://ieeexplore.ieee.org/ sourceTypes: Publisher |
| DeliveryMethod | fulltext_linktorsrc |
| Discipline | Engineering; Biology |
| EISBN | 9887581585; 9789887581581 |
| EISSN | 1934-1768 |
| EndPage | 8261 |
| ExternalDocumentID | 10661868 |
| Genre | orig-research |
| GroupedDBID | 29B 6IE 6IF 6IK 6IL 6IN AAJGR AAWTH ABLEC ACGFS ADZIZ ALMA_UNASSIGNED_HOLDINGS BEFXN BFFAM BGNUA BKEBE BPEOZ CBEJK CHZPO IEGSK IPLJI M43 OCL RIE RIL |
| ID | FETCH-LOGICAL-i91t-b6688df7746a0e64be0b83c66204a78691bcd773610f01ba661030dd941461493 |
| IEDL.DBID | RIE |
| IngestDate | Wed Aug 27 02:00:20 EDT 2025 |
| IsPeerReviewed | false |
| IsScholarly | true |
| Language | English |
| LinkModel | DirectLink |
| MergedId | FETCHMERGED-LOGICAL-i91t-b6688df7746a0e64be0b83c66204a78691bcd773610f01ba661030dd941461493 |
| PageCount | 7 |
| ParticipantIDs | ieee_primary_10661868 |
| PublicationCentury | 2000 |
| PublicationDate | 2024-July-28 |
| PublicationDateYYYYMMDD | 2024-07-28 |
| PublicationDate_xml | – month: 07 year: 2024 text: 2024-July-28 day: 28 |
| PublicationDecade | 2020 |
| PublicationTitle | Chinese Control Conference |
| PublicationTitleAbbrev | CCC |
| PublicationYear | 2024 |
| Publisher | Technical Committee on Control Theory, Chinese Association of Automation |
| Publisher_xml | – name: Technical Committee on Control Theory, Chinese Association of Automation |
| SSID | ssj0060913 |
| Score | 2.2633147 |
| SourceID | ieee |
| SourceType | Publisher |
| StartPage | 8255 |
| SubjectTerms | Analytical models; Biology; Clustering algorithms; Correlation; Data models; Deep embedded clustering; Gene expression data; Imputation; Joint optimization; Missing value; Representation learning; Structure preserving |
| Title | Structure-preserving deep embedded clustering algorithm for incomplete gene expression data |
| URI | https://ieeexplore.ieee.org/document/10661868 |
| hasFullText | 1 |
| inHoldings | 1 |
| isFullTextHit | |
| isPrint | |
| linkProvider | IEEE |
| openUrl | ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=proceeding&rft.title=Chinese+Control+Conference&rft.atitle=Structure-preserving+deep+embedded+clustering+algorithm+for+incomplete+gene+expression+data&rft.au=Wang%2C+Zhencheng&rft.au=Li%2C+Dan&rft.date=2024-07-28&rft.pub=Technical+Committee+on+Control+Theory%2C+Chinese+Association+of+Automation&rft.eissn=1934-1768&rft.spage=8255&rft.epage=8261&rft_id=info:doi/10.23919%2FCCC63176.2024.10661868&rft.externalDocID=10661868 |
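
The abstract refers to Sammon's stress as the quantity used to preserve geometric structure between the network's inputs and outputs. This record does not give the paper's exact formulation, so the following is only a minimal sketch of the classical Sammon's stress, E = (1 / Σ d*ᵢⱼ) Σ (d*ᵢⱼ − dᵢⱼ)² / d*ᵢⱼ over pairs i < j, where d* and d are pairwise Euclidean distances before and after the mapping. The function names `pairwise_distances` and `sammon_stress`, the `eps` threshold, and the use of NumPy are assumptions made for illustration; handling of missing values and the way the authors combine this term with the clustering loss are not shown.

```python
import numpy as np


def pairwise_distances(x):
    """Euclidean distance between every pair of rows of x (shape n x d)."""
    diff = x[:, None, :] - x[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def sammon_stress(x_in, x_out, eps=1e-12):
    """Classical Sammon's stress between two point sets with matching rows.

    x_in  : points before the mapping (n x d_in)
    x_out : the same points after the mapping (n x d_out)
    Returns 0.0 when all pairwise distances are preserved exactly.
    """
    d_in = pairwise_distances(x_in)
    d_out = pairwise_distances(x_out)

    # Use each unordered pair once (strict upper triangle, i < j).
    iu = np.triu_indices(len(x_in), k=1)
    d_in, d_out = d_in[iu], d_out[iu]

    # Drop pairs of coincident input points to avoid dividing by zero.
    keep = d_in > eps
    d_in, d_out = d_in[keep], d_out[keep]
    if d_in.size == 0:
        raise ValueError("need at least two distinct input points")

    return float(((d_in - d_out) ** 2 / d_in).sum() / d_in.sum())
```

Because each squared error is divided by the original distance, mismatches between points that were close in the input weigh more heavily than mismatches between distant points, which is why the measure is commonly used to protect local structure.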