Rough Based Symmetrical Clustering for Gene Expression Profile Analysis

Bibliographic details
Published in: IEEE Transactions on Nanobioscience, Vol. 14, No. 4, pp. 360-367
Main authors: Sarkar, Anasua; Maulik, Ujjwal
Format: Magazine Article
Language: English
Published: United States, IEEE, 1 June 2015
ISSN: 1536-1241, 1558-2639
Description
Summary: Identification of coexpressed genes is the central goal in microarray gene expression data analysis. Point symmetry-based clustering is an important unsupervised learning technique for recognizing symmetrical convex or non-convex shaped clusters. To enable fast automatic clustering of large microarray data, this article proposes a distributed, time-efficient, scalable parallel rough set based hybrid approach to the point symmetry-based clustering algorithm. A natural basis for analyzing gene expression data with the symmetry-based algorithm is to group together genes with similar symmetrical patterns of expression. Rough-set theory helps in faster convergence and initial automatic optimal classification, thereby addressing the problem that the number of clusters in microarray data is not known in advance. This new parallel implementation with the K-means algorithm also achieves linear speedup in running time on large microarray datasets. The proposed algorithm is compared with another parallel symmetry-based K-means and a parallel version of existing K-means over four artificial and benchmark microarray datasets. Experiments were also run on three skewed cancer gene expression datasets. Statistical analyses are also performed to establish the significance of this new implementation. The biological relevance of the clustering solutions is also analyzed.
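
The summary does not state the distance measure itself. As a rough illustration only, one point-symmetry distance found in the clustering literature reflects a point through a candidate cluster center and measures how close that reflection lies to actual data points, then scales the result by the Euclidean distance to the center. The sketch below assumes that definition; the function names, the `knear` parameter, and the toy data are hypothetical and are not taken from the article. The rough-set and parallel components described in the summary are not shown.

```python
import math

def euclidean(a, b):
    """Plain Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def point_symmetry_distance(x, center, data, knear=2):
    """Illustrative point-symmetry distance of x to a candidate center.

    Assumes data is non-empty. The reflection of x through the center is
    2*center - x; if the data are symmetric about the center, some data
    points lie close to that reflection and the distance is small.
    """
    reflected = [2.0 * c - xi for xi, c in zip(x, center)]
    # Distances from the reflected point to its knear closest data points.
    nearest = sorted(euclidean(reflected, p) for p in data)[:knear]
    return (sum(nearest) / len(nearest)) * euclidean(x, center)

def assign(data, centers, knear=2):
    """Assign each point to the center with the smallest symmetry distance."""
    return [
        min(range(len(centers)),
            key=lambda k: point_symmetry_distance(x, centers[k], data, knear))
        for x in data
    ]

# Toy expression profiles (hypothetical): four points symmetric about the origin.
toy_data = [(1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0)]
toy_labels = assign(toy_data, centers=[(0.0, 0.0), (5.0, 0.0)])
```

In a K-means style loop, an assignment step like `assign` would alternate with a center update; here the symmetric center at the origin is preferred over the off-center candidate for every toy point.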
DOI: 10.1109/TNB.2015.2421323