Research on Mini-Batch Affinity Propagation Clustering Algorithm
| Published in: | 2022 IEEE 9th International Conference on Data Science and Advanced Analytics (DSAA) pp. 1 - 10 |
|---|---|
| Main Authors: | , , |
| Format: | Conference Proceeding |
| Language: | English |
| Published: | IEEE, 13.10.2022 |
| Summary: | Clustering is an unsupervised learning task that aims to group data so that items in the same group are more similar to each other than to those in other groups. Affinity propagation (AP) is a clustering algorithm that finds exemplars (representative points) for data points by passing messages among them. The AP algorithm has several drawbacks. First, it is time- and memory-consuming on large-scale datasets because of its O(N²) time and space complexity. Second, AP may produce too many small clusters. Third, AP may have difficulty converging, which increases the time spent on fine-tuning. To achieve better effectiveness and efficiency, in this paper we propose Mini-Batch Affinity Propagation (MBAP). MBAP processes small batches of data serially and obtains clustering results gradually. We also propose MBAP with early stopping (MBAP_ES), which integrates MBAP with a stopping strategy so that clustering can stop early when the model is nearly unchanged. The experiments show the effectiveness and efficiency of MBAP and MBAP_ES in comparison with other AP-based algorithms. |
| DOI: | 10.1109/DSAA54385.2022.10032450 |
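
To make the quadratic cost mentioned in the summary concrete, here is a minimal sketch of plain affinity propagation in its commonly published form (damped responsibility and availability updates), written with NumPy. It is not the MBAP or MBAP_ES method from the paper, whose details this record does not give. The function name `affinity_propagation`, its parameters, the toy data, and the choice of the median off-diagonal similarity as the preference are all assumptions made for illustration.

```python
import numpy as np

def affinity_propagation(S, damping=0.5, n_iter=200):
    """Plain AP on a dense N x N similarity matrix S (hypothetical helper).

    The diagonal of S holds the preferences. R and A below are the two
    N x N message matrices, which is where the quadratic memory goes.
    """
    n = S.shape[0]
    rows = np.arange(n)
    R = np.zeros((n, n))  # responsibilities r(i, k)
    A = np.zeros((n, n))  # availabilities a(i, k)
    for _ in range(n_iter):
        # r(i,k) = s(i,k) - max over k' != k of [a(i,k') + s(i,k')]
        AS = A + S
        best = np.argmax(AS, axis=1)
        first = AS[rows, best]
        AS[rows, best] = -np.inf
        second = AS.max(axis=1)
        R_new = S - first[:, None]
        R_new[rows, best] = S[rows, best] - second
        R = damping * R + (1 - damping) * R_new

        # a(i,k) = min(0, r(k,k) + sum over i' not in {i,k} of max(0, r(i',k)))
        # a(k,k) = sum over i' != k of max(0, r(i',k)), left unclipped
        Rp = np.maximum(R, 0)
        Rp[rows, rows] = R[rows, rows]
        A_new = Rp.sum(axis=0)[None, :] - Rp
        diag = A_new[rows, rows].copy()
        A_new = np.minimum(A_new, 0)
        A_new[rows, rows] = diag
        A = damping * A + (1 - damping) * A_new

    # Points with a positive self-message sum act as exemplars.
    exemplars = np.flatnonzero((A + R)[rows, rows] > 0)
    if exemplars.size == 0:
        return np.full(n, -1), exemplars
    labels = np.argmax(S[:, exemplars], axis=1)
    labels[exemplars] = np.arange(exemplars.size)
    return labels, exemplars

# Toy data (assumed): two well-separated groups of three 2-D points.
X = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)
S = -((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)  # negative squared distance
n = len(X)
S[np.arange(n), np.arange(n)] = np.median(S[~np.eye(n, dtype=bool)])  # preference
labels, exemplars = affinity_propagation(S)
```

Every iteration touches all N² entries of `S`, `R`, and `A`, which is the time and space cost the summary describes. A mini-batch variant would presumably keep these matrices batch-sized, though how MBAP does so is not stated here.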