An Improved K-Means Algorithm for DNA Sequence Clustering
In recent years, billions of DNA and protein sequences have been sequenced. However, few of them have known structures and functions; most remain uncharacterized. One solution is to relate sequences to one another rather than study each new sequence independently of the others. Thus...
| Published in: | Proceedings - International Workshop on Database and Expert Systems Applications, pp. 39-42 |
|---|---|
| Main authors: | Aleb, Nassima; Labidi, Narimane |
| Format: | Conference paper; journal article |
| Language: | English |
| Publisher details: | IEEE, 2015-09-01 |
| ISBN: | 1467375810, 9781467375818 |
| ISSN: | 1529-4188, 2378-3915 |
| Abstract | In recent years, billions of DNA and protein sequences have been sequenced. However, few of them have known structures and functions; most remain uncharacterized. One solution is to relate sequences to one another rather than study each new sequence in isolation. If a sequence S1 can be matched to another sequence S2, or to a group of previously studied sequences, the structure, functions, and phylogenetic classification of S1 can be inferred directly. The purpose of this work is to adapt clustering methods to the specific problem of classifying DNA sequences. We introduce a new method based on K-means for clustering DNA sequences. We begin by explaining and motivating our approach, then present the results obtained. |
|---|---|
| Author | Labidi, Narimane; Aleb, Nassima |
| Author_xml | 1. Aleb, Nassima (naleb@usthb.dz), USTHB-FEI, Univ. of Sci. & Technol. Houari Boumediene, Algiers, Algeria; 2. Labidi, Narimane (narimanelbd@hotmail.com), USTHB-FEI, Univ. of Sci. & Technol. Houari Boumediene, Algiers, Algeria |
| CODEN | IEEPAD |
| ContentType | Conference Proceeding; Journal Article |
| DOI | 10.1109/DEXA.2015.27 |
| DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings; IEEE Xplore POP ALL; IEEE Xplore All Conference Proceedings; IEEE Electronic Library (IEL); IEEE Proceedings Order Plans (POP All) 1998-Present; Nucleic Acids Abstracts; Computer and Information Systems Abstracts; Technology Research Database; ProQuest Computer Science Collection; Advanced Technologies Database with Aerospace; Computer and Information Systems Abstracts Academic; Computer and Information Systems Abstracts Professional |
| DatabaseTitle | Nucleic Acids Abstracts; Computer and Information Systems Abstracts; Technology Research Database; Computer and Information Systems Abstracts – Academic; Advanced Technologies Database with Aerospace; ProQuest Computer Science Collection; Computer and Information Systems Abstracts Professional |
| DatabaseTitleList | Nucleic Acids Abstracts; Computer and Information Systems Abstracts |
| Discipline | Computer Science |
| EISBN | 1467375829 9781467375825 |
| EISSN | 2378-3915 |
| EndPage | 42 |
| ExternalDocumentID | 7406266 |
| Genre | orig-research |
| ISBN | 1467375810 9781467375818 |
| ISICitedReferencesCount | 4 |
| ISSN | 1529-4188 |
| IsPeerReviewed | false |
| IsScholarly | true |
| Language | English |
| PageCount | 4 |
| PublicationDate | 2015-09-01 |
| PublicationTitle | Proceedings - International Workshop on Database and Expert Systems Applications |
| PublicationTitleAbbrev | DEXA |
| PublicationYear | 2015 |
| Publisher | IEEE |
| StartPage | 39 |
| SubjectTerms | Bioinformatics; Classification; Clustering; Clustering algorithms; Clustering methods; Deoxyribonucleic acid; DNA; DNA Sequence Analysis; Expert systems; Gene sequencing; K-means; Proteins; Sequential analysis; Vector quantization; Workshops |
| Title | An Improved K-Means Algorithm for DNA Sequence Clustering |
| URI | https://ieeexplore.ieee.org/document/7406266 https://www.proquest.com/docview/1808651993 https://www.proquest.com/docview/1825456604 |
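
The abstract does not say how sequences are turned into something K-means can cluster. As a minimal sketch only, and not the authors' method, the Python below assumes one common choice: each sequence is mapped to a normalized k-mer frequency vector, and plain Lloyd-style K-means with squared Euclidean distance is run on those vectors. The function names (`kmer_profile`, `kmeans`), the evenly spaced initial centroids, and the toy sequences are all illustrative assumptions.

```python
from itertools import product


def kmer_profile(seq, k=2):
    """Return the normalized k-mer frequency vector of a DNA sequence."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    counts = [0.0] * len(kmers)
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        j = index.get(seq[i:i + k])  # windows with ambiguous bases (e.g. N) are skipped
        if j is not None:
            counts[j] += 1.0
    total = sum(counts)
    return [c / total for c in counts] if total else counts


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, n_clusters, n_iter=100):
    """Plain K-means; returns (labels, centroids).

    Assumption: initial centroids are points taken at evenly spaced
    positions in the input, which keeps this sketch deterministic.
    Practical code would usually use random restarts or k-means++.
    """
    centroids = [list(points[i * len(points) // n_clusters])
                 for i in range(n_clusters)]
    labels = None
    for _ in range(n_iter):
        # Assignment step: nearest centroid for every point.
        new_labels = [
            min(range(n_clusters), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # assignments stable, so converged
            break
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(n_clusters):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


if __name__ == "__main__":
    # Toy data: three AT-rich and three GC-rich sequences (hypothetical).
    seqs = ["ATATATATAT", "TATATATATA", "ATATATTATA",
            "GCGCGCGCGC", "CGCGCGCGCG", "GCGCGCCGCG"]
    labels, _ = kmeans([kmer_profile(s, k=2) for s in seqs], n_clusters=2)
    print(labels)
```

On this toy input the AT-rich and GC-rich sequences end up in separate clusters. Fixed-length k-mer vectors are used here because K-means needs a vector space in which a mean is defined; alignment-based distances would call for a different algorithm, such as K-medoids.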

