Parallel optimal clustering
| Main author: | |
|---|---|
| Medium: | Dissertation |
| Language: | English |
| Published: | ProQuest Dissertations & Theses, 01.01.1997 |
| Subjects: | |
| ISBN: | 0591453363, 9780591453362 |
| Summary: | Cluster analysis is a generic term coined for procedures that are used objectively to group entities based on their similarities and differences. The primary objective of these procedures is to group n items into up to K mutually exclusive clusters so that items within each cluster are relatively homogeneous in nature while the clusters themselves are distinct. Statistical methods attempt to lower the interaction between each observation and the group mean or median. In contrast, optimal clustering techniques not only find the best possible solution but also account for total group interaction. In this research, we develop, implement, and evaluate a parallel algorithm (PGROUPS) to solve clustering problems to optimality. PGROUPS was implemented using one to eight processors on the IBM PowerParallel System (SP-2) at University Computing & Network Services at The University of Georgia using the xlf (f77) Fortran compiler under the AIX Version 3.2 (Unix) operating system. Our programming environment is the Parallel Virtual Machine (PVM), and we used the user mode (the communication channel switch) so that more than one process can be activated on a single processor. PGROUPS is based on the model and serial solution methodology (GROUPS) due to Aronson and Klein (1989) and Klein and Aronson (1991). Prior to developing PGROUPS, the serial solution methodology was enhanced by developing a new heuristic to obtain the initial upper bound and by eliminating the evaluation of lower bounds for some nodes that are already bound fathomed. The two enhancements decreased the overall solution CPU time for GROUPS on the test problems by an average of 27.1%, with a maximum reduction of 53.2%. Test results for PGROUPS indicate near-linear to superlinear speedups (both absolute and relative), with overall average efficiency close to or greater than one. We also achieved even load balancing, with each Slave Process performing an equal share of work. We expect that this parallel algorithm will have a significant bearing on our ability to solve relatively large-scale clustering problems to optimality. Thus, solutions to real-world, complex clustering problems, which could not be solved due to lack of efficient algorithms, can now be attempted. |
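
To make the objective described in the summary concrete, the following is a minimal serial sketch of clustering to optimality by branch and bound. It is not PGROUPS or GROUPS, whose details are not given in this record. It assumes that "total group interaction" means the sum of pairwise dissimilarities between items placed in the same cluster, and the function name `optimal_clustering` and the dissimilarity matrix `d` are hypothetical names chosen for illustration.

```python
from typing import List, Sequence, Tuple


def optimal_clustering(d: Sequence[Sequence[float]], k: int) -> Tuple[float, List[int]]:
    """Assign n items to at most k clusters with minimum total within-cluster interaction.

    Assumption: d[i][j] is a nonnegative, symmetric dissimilarity, and the cost of a
    partition is the sum of d[i][j] over all pairs (i, j) in the same cluster.
    """
    n = len(d)
    best_cost = float("inf")
    best_labels: List[int] = []
    labels = [0] * n

    def branch(i: int, used: int, cost: float) -> None:
        nonlocal best_cost, best_labels
        # Bound: with nonnegative dissimilarities the partial cost can only grow,
        # so a node whose cost already reaches the incumbent is fathomed.
        if cost >= best_cost:
            return
        if i == n:
            best_cost, best_labels = cost, labels.copy()
            return
        # Symmetry breaking: item i joins an existing cluster or opens the next new one.
        for c in range(min(used + 1, k)):
            added = sum(d[i][j] for j in range(i) if labels[j] == c)
            labels[i] = c
            branch(i + 1, max(used, c + 1), cost + added)

    branch(0, 0, 0.0)
    return best_cost, best_labels


# Four points on a line; dissimilarity is absolute distance.
points = [0.0, 1.0, 10.0, 11.0]
d = [[abs(a - b) for b in points] for a in points]
print(optimal_clustering(d, 2))  # (2.0, [0, 0, 1, 1])
```

The search tree grows exponentially with the number of items, which is the kind of cost that motivates tighter bounds and a parallel search such as the one the summary reports.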

