Pardicle: parallel approximate density-based clustering
| Published in: | Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, pp. 560-571 |
|---|---|
| Main authors: | , , , , , |
| Format: | Conference paper |
| Language: | English |
| Published: | Piscataway, NJ, USA: IEEE Press, 16 November 2014; IEEE |
| Series: | ACM Conferences |
| Subjects: | |
| ISBN: | 1479955000, 9781479955008 |
| ISSN: | 2167-4329 |
| Summary: | DBSCAN is a widely used isodensity-based clustering algorithm for particle data, well known for its ability to isolate arbitrarily shaped clusters and to filter out noise. The algorithm is super-linear (O(n log n)) and computationally expensive for large datasets. Given the need for speed, we propose a fast heuristic algorithm for DBSCAN using density-based sampling, which matches the quality of exact algorithms but is more than an order of magnitude faster. Our experiments on massive astrophysics and synthetic datasets (8.5 billion numbers) show that our approximate algorithm is up to 56x faster than exact algorithms with almost identical quality (Omega-Index ≥ 0.99). We develop a new parallel DBSCAN algorithm, which uses dynamic partitioning to improve load balancing and locality. We demonstrate near-linear speedup on shared-memory (15x using 16 cores, single-node Intel® Xeon® processor) and distributed-memory (3917x using 4096 cores, multinode) computers, with a 2x additional performance improvement using Intel® Xeon Phi™ coprocessors. Additionally, existing exact algorithms can achieve up to a 3.4x speedup using dynamic partitioning. |
| DOI: | 10.1109/SC.2014.51 |

