Divisive hierarchical maximum likelihood clustering

Bibliographic Details
Published in:	BMC bioinformatics Vol. 18; no. Suppl 16; pp. 546 - 147
Main Authors:	Sharma, Alok, López, Yosvany, Tsunoda, Tatsuhiko
Format:	Journal Article
Language:	English
Published:	London BioMed Central 28.12.2017 BioMed Central Ltd BMC
Subjects:	Algorithms; Analysis; Bioinformatics; Biomedical and Life Sciences; Cluster Analysis; Computational Biology/Bioinformatics; Computer Appl. in Life Sciences; Divisive approach; Genomics; Hierarchical clustering; Humans; Leukemia; Life Sciences; Maximum likelihood; Microarrays; Probability
ISSN:	1471-2105

Description
Summary:	Background: Biological data comprises various topologies or a mixture of forms, which makes its analysis extremely complicated. With this data increasing on a daily basis, the design and development of efficient and accurate statistical methods have become absolutely necessary. Specific analyses, such as those related to genome-wide association studies and multi-omics information, are often aimed at clustering sub-conditions of cancers and other diseases. Hierarchical clustering methods, which can be categorized into agglomerative and divisive, have been widely used in such situations. However, unlike agglomerative methods, divisive clustering approaches have consistently proved to be computationally expensive. Results: The proposed clustering algorithm (DRAGON) was verified on mutation and microarray data, and was gauged against standard clustering methods in the literature. Its validation included synthetic and significant biological data. When validated on mixed-lineage leukemia data, DRAGON achieved the highest clustering accuracy with data of four different dimensions. Consequently, DRAGON outperformed previous methods with 3-, 4- and 5-dimensional acute leukemia data. When tested on mutation data, DRAGON achieved the best performance with 2-dimensional information. Conclusions: This work proposes a computationally efficient divisive hierarchical clustering method, which can compete equally with agglomerative approaches. The proposed method turned out to correctly cluster data with distinct topologies. A MATLAB implementation can be extracted from http://www.riken.jp/en/research/labs/ims/med_sci_math/ or http://www.alok-ai-lab.com
DOI:	10.1186/s12859-017-1965-5