A New-Fangled FES-k-Means Clustering Algorithm for Disease Discovery and Visual Analytics


Bibliographic details
Published in: EURASIP journal on bioinformatics & systems biology, Vol. 2010, No. 1, pp. 1-14
Main author: Oyana, Tonny J
Format: Journal Article
Language: English
Published: Cham Springer International Publishing 2010
Springer Nature B.V
Springer
ISSN: 1687-4145, 1687-4153
Description
Summary: The central purpose of this study is to further evaluate the performance of a new algorithm. It provides additional evidence on the Fast, Efficient, and Scalable k-means algorithm (FES-k-means), which was designed to increase the overall efficiency of the original k-means clustering technique. The FES-k-means algorithm uses a hybrid approach that combines the k-d tree data structure, which speeds up the nearest-neighbor query, the original k-means algorithm, and an adaptation rate proposed by Mashor. The algorithm was tested on two real datasets and one synthetic dataset. It was applied twice to all three datasets: once to data trained by the innovative MIL-SOM method and once to the untrained data, in order to evaluate its competence. This two-step approach of training data prior to clustering provides a solid foundation for knowledge discovery and data mining that clustering methods alone do not offer. The benefits of the method are that it produces clusters similar to those of the original k-means method at a much faster rate, as shown by runtime comparison data, and that it supports efficient analysis of large geospatial data, with implications for disease mechanism discovery. From a disease mechanism discovery perspective, it is hypothesized that the linear-like pattern of elevated blood lead levels discovered in the city of Chicago may be spatially linked to the city's water service lines.
DOI: 10.1155/2010/746021
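
The summary describes a hybrid of a k-d tree nearest-neighbor query and the original k-means loop. The sketch below illustrates that general idea only, and it is not the author's FES-k-means implementation. It assumes the k-d tree is built over the current centroids at each iteration, which may differ from the paper, and it leaves out both the Mashor adaptation rate and the MIL-SOM training step. All function names (`build_kdtree`, `nearest`, `kmeans_kdtree`) and the `init` parameter are hypothetical.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def build_kdtree(points, depth=0):
    """Build a k-d tree from (index, point) pairs; returns nested tuples or None."""
    if not points:
        return None
    axis = depth % len(points[0][1])  # cycle through the coordinate axes
    points = sorted(points, key=lambda ip: ip[1][axis])
    mid = len(points) // 2
    return (points[mid], axis,
            build_kdtree(points[:mid], depth + 1),
            build_kdtree(points[mid + 1:], depth + 1))

def nearest(node, query, best=None):
    """Return (squared_distance, index) of the stored point closest to query."""
    if node is None:
        return best
    (idx, point), axis, left, right = node
    d = sq_dist(point, query)
    if best is None or d < best[0]:
        best = (d, idx)
    diff = query[axis] - point[axis]
    near, far = (left, right) if diff < 0 else (right, left)
    best = nearest(near, query, best)
    # Visit the far side only if the splitting plane is closer than the best match.
    if diff ** 2 < best[0]:
        best = nearest(far, query, best)
    return best

def kmeans_kdtree(data, k, iterations=20, seed=0, init=None):
    """Lloyd-style k-means whose assignment step queries a k-d tree of centroids."""
    rng = random.Random(seed)
    centroids = list(init) if init is not None else rng.sample(data, k)
    labels = None
    for _ in range(iterations):
        tree = build_kdtree(list(enumerate(centroids)))
        new_labels = [nearest(tree, p)[1] for p in data]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(data, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, labels
```

A call such as `kmeans_kdtree(points, 3)` takes a list of coordinate tuples and returns the final centroids together with one cluster label per point. Rebuilding the tree each iteration is cheap when the number of centroids is small compared with the number of data points, which is the case the tree lookup is meant to help.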