An Improved K-Means Algorithm for DNA Sequence Clustering


Bibliographic Details
Published in: Proceedings - International Workshop on Database and Expert Systems Applications, pp. 39-42
Main authors: Aleb, Nasssima; Labidi, Narimane
Format: Conference Paper, Journal Article
Language: English
Published: IEEE, 1 September 2015
ISBN: 1467375810, 9781467375818
ISSN: 1529-4188, 2378-3915
Abstract In recent years, billions of DNA and protein sequences have been sequenced. However, few of them have known structures and functions; most remain unknown. The solution to this problem is to relate sequences to one another rather than to study each new sequence independently of the others. Thus, if we manage to assimilate a sequence S1 to another sequence S2 or to a group of previously studied sequences, we can directly deduce the structure, functions, and phylogenetic classification of S2. The purpose of this work is to adapt clustering methods to the specific problem of classifying DNA sequences. We introduce a new method based on K-means for clustering DNA sequences. We begin by explaining and motivating our approach, then present the results obtained.
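The abstract does not spell out how the improved algorithm works, so the following is only a generic illustration of the baseline idea it builds on: standard K-means applied to DNA sequences. It is a minimal sketch under stated assumptions, not the authors' method. The k-mer frequency representation, the squared Euclidean distance, and the function names `kmer_profile`, `sq_dist`, and `kmeans` are all assumptions introduced here for illustration.

```python
from itertools import product
import random


def kmer_profile(seq, k=2):
    """Return the normalized k-mer frequency vector of a DNA sequence.

    Assumption: sequences are represented by k-mer frequencies over ACGT;
    the record does not say which representation the paper uses.
    """
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = {km: 0 for km in kmers}
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows with ambiguous bases such as N
            counts[window] += 1
    total = sum(counts.values()) or 1  # avoid division by zero on short input
    return [counts[km] / total for km in kmers]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, n_clusters, n_iter=50, seed=0):
    """Plain Lloyd-style K-means; returns (labels, centroids)."""
    rng = random.Random(seed)
    # Initialize centroids as copies of randomly chosen input points.
    centroids = [list(p) for p in rng.sample(points, n_clusters)]
    labels = None
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(n_clusters), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # no change means convergence
            break
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(n_clusters):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # leave an empty cluster's centroid unchanged
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids
```

For example, `kmeans([kmer_profile(s, k=1) for s in sequences], n_clusters=2)` groups sequences by base composition. Plain K-means is sensitive to the initial centroids and to the choice of `n_clusters`, which is the kind of limitation an improved variant would typically target.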
Author Labidi, Narimane
Aleb, Nasssima
Author_xml – sequence: 1
  givenname: Nasssima
  surname: Aleb
  fullname: Aleb, Nasssima
  email: naleb@usthb.dz
  organization: USTHB-FEI, Univ. & Sci. & Technol. Houari Boumediene, Algiers, Algeria
– sequence: 2
  givenname: Narimane
  surname: Labidi
  fullname: Labidi, Narimane
  email: narimanelbd@hotmail.com
  organization: USTHB-FEI, Univ. & Sci. & Technol. Houari Boumediene, Algiers, Algeria
CODEN IEEPAD
ContentType Conference Proceeding
Journal Article
DOI 10.1109/DEXA.2015.27
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Xplore POP ALL
IEEE Xplore All Conference Proceedings
IEEE Electronic Library (IEL)
IEEE Proceedings Order Plans (POP All) 1998-Present
Nucleic Acids Abstracts
Computer and Information Systems Abstracts
Technology Research Database
ProQuest Computer Science Collection
Advanced Technologies Database with Aerospace
Computer and Information Systems Abstracts – Academic
Computer and Information Systems Abstracts Professional
Discipline Computer Science
EISBN 1467375829
9781467375825
EISSN 2378-3915
EndPage 42
ExternalDocumentID 7406266
Genre orig-research
ISBN 1467375810
9781467375818
ISICitedReferencesCount 4
ISSN 1529-4188
IsPeerReviewed false
IsScholarly true
Language English
LinkModel DirectLink
PageCount 4
PublicationCentury 2000
PublicationDate 20150901
PublicationDateYYYYMMDD 2015-09-01
PublicationDate_xml – month: 09
  year: 2015
  text: 20150901
  day: 01
PublicationDecade 2010
PublicationTitle Proceedings - International Workshop on Database and Expert Systems Applications
PublicationTitleAbbrev DEXA
PublicationYear 2015
Publisher IEEE
Publisher_xml – name: IEEE
SourceID proquest
ieee
SourceType Aggregation Database
Publisher
StartPage 39
SubjectTerms Bioinformatics
Classification
Clustering
Clustering algorithms
Clustering methods
Deoxyribonucleic acid
DNA
DNA Sequence Analysis
Expert systems
Gene sequencing
K-means
Proteins
Sequential analysis
Vector quantization
Workshops
Title An Improved K-Means Algorithm for DNA Sequence Clustering
URI https://ieeexplore.ieee.org/document/7406266
https://www.proquest.com/docview/1808651993
https://www.proquest.com/docview/1825456604