A Sampling-Based Density Peaks Clustering Algorithm for Large-Scale Data

Bibliographic details
Published in: Pattern Recognition, Vol. 136, Article 109238
Main authors: Ding, Shifei; Li, Chao; Xu, Xiao; Ding, Ling; Zhang, Jian; Guo, Lili; Shi, Tianhao
Format: Journal Article
Language: English
Published: Elsevier Ltd, 1 April 2023
ISSN: 0031-3203, 1873-5142
Abstract
• An improved triangle-inequality-based (TI) search strategy is proposed.
• An approximate local density calculation for representatives is proposed.
• Experiments show that our algorithm takes far less time than DPC and other recently proposed state-of-the-art algorithms.
With the rapid development of information technology, massive amounts of data are being generated. How to discover useful information to support decision-making has become a focus of research. Clustering is regarded as one of the main means of dealing with large-scale data. Density peaks clustering (DPC) is an effective density-based clustering algorithm that is widely applied in numerous fields because of its satisfactory performance. However, the computational complexity of DPC is O(N^2), which makes it poorly suited to large-scale data. To address this issue, a sampling-based density peaks clustering algorithm for large-scale data (SDPC) is proposed. First, a sampling method is used to reduce the number of distance calculations. Second, approximate representatives are identified by an improved TI search strategy, which further accelerates the clustering process. The approximate representatives are then clustered by DPC. Finally, each remaining point is allocated to the same cluster as its nearest representative. Experimental results on both synthetic and real-world datasets show that SDPC is more efficient than DPC while maintaining the same level of clustering performance.
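The pipeline outlined in the abstract (sample representatives, cluster them with DPC, then attach every remaining point to its nearest representative) can be illustrated with a minimal Python sketch. This is not the authors' SDPC: uniform random sampling stands in for the paper's sampling method, and neither the improved TI search strategy nor the approximate local density calculation is reproduced. The function names `dpc` and `sampled_dpc` and the parameters `d_c`, `n_reps`, `n_clusters`, and `seed` are assumptions made for this illustration only.

```python
import numpy as np


def dpc(points, d_c, n_clusters):
    """Plain density peaks clustering with a cutoff kernel (illustrative only)."""
    n = len(points)
    # Full pairwise distance matrix: this is the O(N^2) step that sampling avoids.
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    # Local density: number of other points closer than the cutoff d_c.
    rho = (dist < d_c).sum(axis=1) - 1
    # Visit points from highest to lowest density; ties are broken by index.
    order = np.argsort(-rho, kind="stable")
    delta = np.zeros(n)
    parent = np.full(n, -1)
    for rank, i in enumerate(order):
        if rank == 0:
            delta[i] = dist[i].max()  # convention for the global density peak
            continue
        higher = order[:rank]  # points ranked denser than i
        j = higher[np.argmin(dist[i, higher])]
        delta[i], parent[i] = dist[i, j], j
    # Centers: the points with the largest rho * delta.
    gamma = rho * delta
    gamma[order[0]] = np.inf  # guard: the global peak is always a center
    centers = np.argsort(-gamma, kind="stable")[:n_clusters]
    labels = np.full(n, -1)
    labels[centers] = np.arange(n_clusters)
    # Each non-center inherits the label of its nearest denser neighbour.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels


def sampled_dpc(points, n_reps, d_c, n_clusters, seed=0):
    """Cluster a random sample with DPC, then label all points by nearest sample."""
    rng = np.random.default_rng(seed)
    # Assumption: uniform sampling without replacement picks the representatives.
    rep_idx = rng.choice(len(points), size=n_reps, replace=False)
    reps = points[rep_idx]
    rep_labels = dpc(reps, d_c, n_clusters)
    # Only N * n_reps distances are needed here instead of N * N.
    dist = np.linalg.norm(points[:, None, :] - reps[None, :, :], axis=-1)
    return rep_labels[np.argmin(dist, axis=1)]
```

Called as `sampled_dpc(points, n_reps=40, d_c=1.0, n_clusters=2)` on an `(N, 2)` array, the sketch computes roughly `n_reps * n_reps + N * n_reps` distances instead of `N * N`, which is the kind of saving the abstract attributes to sampling. The values shown for `n_reps` and `d_c` are arbitrary and would need tuning for real data.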
ArticleNumber 109238
Author Shi, Tianhao
Xu, Xiao
Ding, Shifei
Zhang, Jian
Li, Chao
Ding, Ling
Guo, Lili
Author_xml – sequence: 1
  givenname: Shifei
  surname: Ding
  fullname: Ding, Shifei
  organization: School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221116, China
– sequence: 2
  givenname: Chao
  surname: Li
  fullname: Li, Chao
  organization: School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221116, China
– sequence: 3
  givenname: Xiao
  surname: Xu
  fullname: Xu, Xiao
  email: xu_xiao@cumt.edu.cn
  organization: School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221116, China
– sequence: 4
  givenname: Ling
  surname: Ding
  fullname: Ding, Ling
  email: dingsf@cumt.edu.cn, 414211048@qq.com
  organization: College of Intelligence and Computing, Tianjin University, Tianjin, 300350, China
– sequence: 5
  givenname: Jian
  surname: Zhang
  fullname: Zhang, Jian
  organization: School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221116, China
– sequence: 6
  givenname: Lili
  surname: Guo
  fullname: Guo, Lili
  organization: School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221116, China
– sequence: 7
  givenname: Tianhao
  surname: Shi
  fullname: Shi, Tianhao
  email: 2470486977@qq.com
  organization: School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221116, China
CitedBy_id crossref_primary_10_1049_cit2_70050
crossref_primary_10_1111_exsy_13679
crossref_primary_10_3390_sym17081326
crossref_primary_10_3233_JIFS_224234
crossref_primary_10_1109_TKDE_2025_3589794
crossref_primary_10_1007_s11063_024_11444_z
crossref_primary_10_1016_j_neucom_2024_128367
crossref_primary_10_1016_j_engappai_2024_108551
crossref_primary_10_1061_JSUED2_SUENG_1551
crossref_primary_10_1016_j_ins_2024_120685
crossref_primary_10_1016_j_nimb_2024_165453
crossref_primary_10_1109_TMC_2025_3533005
crossref_primary_10_1016_j_eswa_2024_124782
crossref_primary_10_1109_TCSVT_2024_3435383
crossref_primary_10_1016_j_ins_2023_119382
crossref_primary_10_1016_j_neucom_2024_129060
crossref_primary_10_1007_s11222_023_10355_8
crossref_primary_10_1016_j_patcog_2025_111953
crossref_primary_10_1007_s10586_024_04592_3
crossref_primary_10_3390_s24103149
crossref_primary_10_1007_s44336_024_00008_3
crossref_primary_10_1016_j_patcog_2024_110767
crossref_primary_10_3390_info14070414
crossref_primary_10_1007_s13042_024_02104_8
crossref_primary_10_1016_j_ins_2023_120082
crossref_primary_10_1109_TKDE_2024_3401075
crossref_primary_10_1007_s42486_024_00148_x
crossref_primary_10_3390_s24175695
crossref_primary_10_1016_j_engappai_2025_111429
crossref_primary_10_1016_j_ins_2022_12_078
crossref_primary_10_1016_j_eswa_2023_121860
crossref_primary_10_1007_s11227_023_05688_0
crossref_primary_10_1016_j_asoc_2023_110903
crossref_primary_10_1016_j_ins_2024_120811
crossref_primary_10_3390_biomimetics9010003
crossref_primary_10_1016_j_ins_2023_119470
crossref_primary_10_1038_s41598_025_13848_w
Cites_doi 10.1016/j.patcog.2022.108809
10.1007/s10489-021-02278-6
10.1007/s00500-019-04365-w
10.1016/j.patcog.2021.108041
10.1016/j.knosys.2018.09.007
10.1016/j.patcog.2018.05.030
10.1109/TII.2016.2628747
10.1109/TKDE.2016.2609423
10.1016/j.patcog.2022.108745
10.1016/j.eswa.2019.01.074
10.1016/j.patcog.2020.107449
10.1016/j.knosys.2017.07.027
10.1109/TKDE.2019.2903410
10.1007/s13042-017-0648-x
10.1007/s10115-018-1189-7
10.1016/j.ins.2020.08.052
10.1007/s00500-018-3183-0
10.1016/j.ins.2020.11.050
10.1016/j.knosys.2020.106028
10.1109/TKDE.2015.2460735
10.1007/s13042-016-0603-2
10.1007/s10489-020-01926-7
10.1016/j.knosys.2016.02.001
10.1016/j.eswa.2018.07.075
10.1016/j.neucom.2021.05.071
10.1016/j.knosys.2019.06.032
10.1016/j.ins.2019.09.001
10.1016/j.knosys.2019.105088
10.1126/science.1242072
10.1016/j.knosys.2018.05.034
10.1016/j.patcog.2020.107554
10.1109/TNNLS.2019.2909425
10.1016/j.patcog.2020.107452
ContentType Journal Article
Copyright 2022 Elsevier Ltd
Copyright_xml – notice: 2022 Elsevier Ltd
DBID AAYXX
CITATION
DOI 10.1016/j.patcog.2022.109238
DatabaseName CrossRef
DatabaseTitle CrossRef
DatabaseTitleList
DeliveryMethod fulltext_linktorsrc
Discipline Computer Science
EISSN 1873-5142
ExternalDocumentID 10_1016_j_patcog_2022_109238
S0031320322007178
ID FETCH-LOGICAL-c306t-9f0a75c75e9b4a95fc8746b091352c0fd5e1c883c2a28b1d2d880e5faa8a2073
ISICitedReferencesCount 42
ISICitedReferencesURI http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=000900808400008&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
ISSN 0031-3203
IngestDate Sat Nov 29 07:26:21 EST 2025
Tue Nov 18 21:23:25 EST 2025
Fri Feb 23 02:39:24 EST 2024
IsPeerReviewed true
IsScholarly true
Keywords TI search strategy
Large-scale data
Sampling method
Density peaks clustering
Language English
LinkModel OpenURL
MergedId FETCHMERGED-LOGICAL-c306t-9f0a75c75e9b4a95fc8746b091352c0fd5e1c883c2a28b1d2d880e5faa8a2073
ParticipantIDs crossref_citationtrail_10_1016_j_patcog_2022_109238
crossref_primary_10_1016_j_patcog_2022_109238
elsevier_sciencedirect_doi_10_1016_j_patcog_2022_109238
PublicationCentury 2000
PublicationDate April 2023
2023-04-00
PublicationDateYYYYMMDD 2023-04-01
PublicationDate_xml – month: 04
  year: 2023
  text: April 2023
PublicationDecade 2020
PublicationTitle Pattern recognition
PublicationYear 2023
Publisher Elsevier Ltd
Publisher_xml – name: Elsevier Ltd
References Chen, Hu, Fan (bib0022) 2020; 187
Xu, Ding, Xu (bib0013) 2019; 23
189 (2020), 105088. DOI: 10.1016/j.knosys.2019.105088.
Wu, Wilamowski (bib0024) 2017; 13
Almalawi, Fahad, Tari (bib0035) 2016; 28
Bai, Cheng, Liang (bib0025) 2017; 13
Hou, Zhang (bib0030) 2020; 108
Abbas, El-Zoghabi (bib0032) 2020; 109
Qv, Ma, Tong (bib0002) 2022; 129
Xu, Ding, Wang (bib0014) 2020; 200
Xu, Ding, Wang (bib0028) 2021; 554
Xu, Ding, Shi (bib0021) 2018; 158
Zhao, Liang, Dang (bib0033) 2019; 163
Du, Ding, Xue (bib0018) 2019; 59
Wang, Chen, Nie (bib0004) 2022; 130
Arthur, Vassilvitskii (bib0037) 2007
Laohakiat, Sa-ing (bib0008) 2021; 547
Baek, Yoon, Song (bib0003) 2021; 118
Ding, Du, Sun (bib0012) 2017; 133
Zhang, Chen, Yu (bib0026) 2016; 28
Unlu, Xanthopoulos (bib0005) 2019; 125
Liu, Li, Du (bib0027) 2017; 107
Khalili, Dakhilalian, Susilo (bib0006) 2020; 510
Zhang, Ding, Wang (bib0007) 2021; 51
Rodriguez, Laio (bib0011) 2014; 334
Du, Ding, Xu (bib0015) 2018; 9
Du, Ding, Jia (bib0016) 2016; 99
Huang, Wang, Wu (bib0034) 2020; 32
Shi, Ding, Xu (bib0019) 2021; 51
Seyedi, Lotfi, Moradi (bib0017) 2019; 115
Guo, Yang, Chen (bib0010) 2020; 24
Chen, Tang, Bouguila (bib0009) 2018; 83
Fang, Qiu (bib0029) 2020; 107
Pan Y, Pan Z, Wang Y, et al., A new fast search algorithm for exact k-nearest neighbors based on optimal triangle-inequality-based check strategy.
Lotfi, Moradi (bib0031) 2020; 107
Guan, Li, He (bib0020) 2021; 445
Chen, Chen, Wu (bib0001) 2020; 31
Xu, Ding, Du (bib0023) 2018; 9
Xu (10.1016/j.patcog.2022.109238_bib0023) 2018; 9
Du (10.1016/j.patcog.2022.109238_bib0015) 2018; 9
Fang (10.1016/j.patcog.2022.109238_bib0029) 2020; 107
Huang (10.1016/j.patcog.2022.109238_bib0034) 2020; 32
Xu (10.1016/j.patcog.2022.109238_bib0021) 2018; 158
10.1016/j.patcog.2022.109238_bib0036
Chen (10.1016/j.patcog.2022.109238_bib0001) 2020; 31
Wang (10.1016/j.patcog.2022.109238_bib0004) 2022; 130
Zhang (10.1016/j.patcog.2022.109238_bib0026) 2016; 28
Baek (10.1016/j.patcog.2022.109238_bib0003) 2021; 118
Laohakiat (10.1016/j.patcog.2022.109238_bib0008) 2021; 547
Guo (10.1016/j.patcog.2022.109238_bib0010) 2020; 24
Xu (10.1016/j.patcog.2022.109238_bib0013) 2019; 23
Du (10.1016/j.patcog.2022.109238_bib0016) 2016; 99
Khalili (10.1016/j.patcog.2022.109238_bib0006) 2020; 510
Rodriguez (10.1016/j.patcog.2022.109238_bib0011) 2014; 334
Ding (10.1016/j.patcog.2022.109238_bib0012) 2017; 133
Almalawi (10.1016/j.patcog.2022.109238_bib0035) 2016; 28
Du (10.1016/j.patcog.2022.109238_bib0018) 2019; 59
Chen (10.1016/j.patcog.2022.109238_bib0022) 2020; 187
Abbas (10.1016/j.patcog.2022.109238_bib0032) 2020; 109
Unlu (10.1016/j.patcog.2022.109238_bib0005) 2019; 125
Wu (10.1016/j.patcog.2022.109238_bib0024) 2017; 13
Shi (10.1016/j.patcog.2022.109238_bib0019) 2021; 51
Zhang (10.1016/j.patcog.2022.109238_bib0007) 2021; 51
Seyedi (10.1016/j.patcog.2022.109238_bib0017) 2019; 115
Lotfi (10.1016/j.patcog.2022.109238_bib0031) 2020; 107
Zhao (10.1016/j.patcog.2022.109238_bib0033) 2019; 163
Arthur (10.1016/j.patcog.2022.109238_bib0037) 2007
Bai (10.1016/j.patcog.2022.109238_bib0025) 2017; 13
Liu (10.1016/j.patcog.2022.109238_bib0027) 2017; 107
Guan (10.1016/j.patcog.2022.109238_bib0020) 2021; 445
Hou (10.1016/j.patcog.2022.109238_bib0030) 2020; 108
Xu (10.1016/j.patcog.2022.109238_bib0014) 2020; 200
Chen (10.1016/j.patcog.2022.109238_bib0009) 2018; 83
Qv (10.1016/j.patcog.2022.109238_bib0002) 2022; 129
Xu (10.1016/j.patcog.2022.109238_bib0028) 2021; 554
References_xml – volume: 547
  start-page: 404
  year: 2021
  end-page: 426
  ident: bib0008
  article-title: An incremental density-based clustering framework using fuzzy local clustering
  publication-title: Inf. Sci.
– volume: 187
  year: 2020
  ident: bib0022
  article-title: Fast density peak clustering for large scale data based on kNN
  publication-title: Knowledge-Based Syst
– volume: 107
  start-page: 442
  year: 2017
  end-page: 447
  ident: bib0027
  article-title: Parallel implementation of density peaks clustering algorithm based on spark
  publication-title: 7
– volume: 13
  start-page: 1620
  year: 2017
  end-page: 1628
  ident: bib0024
  article-title: A fast density and grid based clustering method for data with arbitrary shapes and noise
  publication-title: IEEE Trans. Ind. Inform.
– volume: 108
  year: 2020
  ident: bib0030
  article-title: Density peaks clustering based on relative density relationship
  publication-title: Pattern Recognit
– volume: 107
  year: 2020
  ident: bib0031
  article-title: Density peaks clustering based on density backbone and fuzzy neighborhood
  publication-title: Pattern Recognit
– volume: 130
  year: 2022
  ident: bib0004
  article-title: Directly solving normalized cut for multi-view data
  publication-title: Pattern Recognit
– volume: 445
  start-page: 401
  year: 2021
  end-page: 418
  ident: bib0020
  article-title: Fast hierarchical clustering of local density peaks via an association degree transfer method
  publication-title: Neurocomputing
– volume: 31
  start-page: 725
  year: 2020
  end-page: 736
  ident: bib0001
  article-title: LABIN: balanced min cut for large-scale data
  publication-title: IEEE Trans. Neural Netw. Learn. Syst.
– volume: 32
  start-page: 1212
  year: 2020
  end-page: 1226
  ident: bib0034
  article-title: Ultra-scalable spectral clustering and ensemble clustering
  publication-title: IEEE Trans. Knowl. Data Eng.
– volume: 24
  start-page: 7395
  year: 2020
  end-page: 7415
  ident: bib0010
  article-title: Grid-based dynamic robust multi-objective brain storm optimization algorithm
  publication-title: Soft Comput
– volume: 163
  start-page: 416
  year: 2019
  end-page: 428
  ident: bib0033
  article-title: A stratified sampling based clustering algorithm for large-scale data
  publication-title: Knowledge-Based Syst
– volume: 510
  start-page: 155
  year: 2020
  end-page: 164
  ident: bib0006
  article-title: Efficient chameleon hash functions in the enhanced collision resistant model
  publication-title: Inf. Sci.
– volume: 125
  start-page: 33
  year: 2019
  end-page: 39
  ident: bib0005
  article-title: Estimating the number of clusters in a dataset via consensus clustering
  publication-title: Expert Syst. Appl.
– volume: 99
  start-page: 135
  year: 2016
  end-page: 145
  ident: bib0016
  article-title: Study on density peaks clustering based on k-nearest neighbors and principal component analysis
  publication-title: Knowledge-Based Syst
– reference: , 189 (2020), 105088. DOI: 10.1016/j.knosys.2019.105088.
– volume: 9
  start-page: 743
  year: 2018
  end-page: 754
  ident: bib0023
  article-title: GDCG: an efficient density peak clustering algorithm based on grid
  publication-title: Int. J. Mach. Learn. Cybern.
– volume: 28
  start-page: 68
  year: 2016
  end-page: 81
  ident: bib0035
  article-title: kNNVWC: an efficient k-nearest neighbors approach based on various-widths clustering
  publication-title: IEEE Trans. Knowl. Data Eng.
– volume: 51
  start-page: 7917
  year: 2021
  end-page: 7932
  ident: bib0019
  article-title: A community detection algorithm based on Quasi-Laplacian centrality peaks clustering
  publication-title: Appl. Intell.
– volume: 334
  start-page: 1492
  year: 2014
  end-page: 1496
  ident: bib0011
  article-title: Clustering by fast search and find of density peaks
  publication-title: Science
– volume: 200
  start-page: 1
  year: 2020
  end-page: 11
  ident: bib0014
  article-title: A robust density peaks clustering algorithm with density-sensitive similarity
  publication-title: Knowledge-Based Syst
– volume: 28
  start-page: 3218
  year: 2016
  end-page: 3230
  ident: bib0026
  article-title: Efficient distributed density peaks for clustering large data sets in mapreduce
  publication-title: IEEE Trans. Knowl. Data Eng.
– volume: 115
  start-page: 314
  year: 2019
  end-page: 328
  ident: bib0017
  article-title: Dynamic graph-based label propagation for density peaks clustering
  publication-title: Expert Syst. Appl.
– reference: Pan Y, Pan Z, Wang Y, et al., A new fast search algorithm for exact k-nearest neighbors based on optimal triangle-inequality-based check strategy.
– volume: 59
  start-page: 285
  year: 2019
  end-page: 309
  ident: bib0018
  article-title: A novel density peaks clustering with sensitivity of local density and density-adaptive metric
  publication-title: Knowl. Inf. Syst.
– volume: 129
  year: 2022
  ident: bib0002
  article-title: Clustering by centroid drift and boundary shrinkage
  publication-title: Pattern Recognit
– volume: 23
  start-page: 5171
  year: 2019
  end-page: 5183
  ident: bib0013
  article-title: A feasible density peaks clustering algorithm with a merging strategy
  publication-title: Soft Comput
– volume: 554
  start-page: 61
  year: 2021
  end-page: 83
  ident: bib0028
  article-title: A fast density peaks clustering algorithm with sparse search
  publication-title: Inf. Sci.
– volume: 83
  start-page: 375
  year: 2018
  end-page: 387
  ident: bib0009
  article-title: A fast clustering algorithm based on pruning unnecessary distance computations in dbscan for high-dimensional data
  publication-title: Pattern Recognit
– volume: 9
  start-page: 1335
  year: 2018
  end-page: 1349
  ident: bib0015
  article-title: Density peaks clustering using geodesic distances
  publication-title: Int. J. Mach. Learn. Cybern.
– volume: 118
  year: 2021
  ident: bib0003
  article-title: Deep self-representative subspace clustering network
  publication-title: Pattern Recognit
– volume: 109
  year: 2020
  ident: bib0032
  article-title: DenMune: Density peak based clustering using mutual nearest neighbors
  publication-title: Pattern Recognit
– volume: 158
  start-page: 65
  year: 2018
  end-page: 74
  ident: bib0021
  article-title: An improved density peaks clustering algorithm with fast finding cluster centers
  publication-title: Knowledge-Based Syst
– volume: 107
  year: 2020
  ident: bib0029
  article-title: Adaptive core fusion-based density peaks clustering for complex data with arbitrary shapes and densities
  publication-title: Pattern Recognit
– volume: 51
  start-page: 2031
  year: 2021
  end-page: 2044
  ident: bib0007
  article-title: Chameleon algorithm based on mutual K-nearest neighbors
  publication-title: Appl. Intell.
– volume: 13
  start-page: 1620
  year: 2017
  end-page: 1628
  ident: bib0025
  article-title: Fast density clustering strategies based on the k-means algorithm
  publication-title: Pattern Recognit
– volume: 133
  start-page: 294
  year: 2017
  end-page: 313
  ident: bib0012
  article-title: An entropy-based density peaks clustering algorithm for mixed type data employing fuzzy neighborhood
  publication-title: Knowledge-Based Syst
– start-page: 1027
  year: 2007
  end-page: 1035
  ident: bib0037
  article-title: k-means++: The advantages of careful seeding
  publication-title: 18
– volume: 130
  year: 2022
  ident: 10.1016/j.patcog.2022.109238_bib0004
  article-title: Directly solving normalized cut for multi-view data
  publication-title: Pattern Recognit
  doi: 10.1016/j.patcog.2022.108809
– volume: 107
  start-page: 442
  year: 2017
  ident: 10.1016/j.patcog.2022.109238_bib0027
  article-title: Parallel implementation of density peaks clustering algorithm based on spark
  publication-title: 7th ICICT
– volume: 51
  start-page: 7917
  year: 2021
  ident: 10.1016/j.patcog.2022.109238_bib0019
  article-title: A community detection algorithm based on Quasi-Laplacian centrality peaks clustering
  publication-title: Appl. Intell.
  doi: 10.1007/s10489-021-02278-6
– volume: 24
  start-page: 7395
  issue: 10
  year: 2020
  ident: 10.1016/j.patcog.2022.109238_bib0010
  article-title: Grid-based dynamic robust multi-objective brain storm optimization algorithm
  publication-title: Soft Comput
  doi: 10.1007/s00500-019-04365-w
– volume: 118
  year: 2021
  ident: 10.1016/j.patcog.2022.109238_bib0003
  article-title: Deep self-representative subspace clustering network
  publication-title: Pattern Recognit
  doi: 10.1016/j.patcog.2021.108041
– volume: 109
  year: 2020
  ident: 10.1016/j.patcog.2022.109238_bib0032
  article-title: DenMune: Density peak based clustering using mutual nearest neighbors
  publication-title: Pattern Recognit
– volume: 163
  start-page: 416
  year: 2019
  ident: 10.1016/j.patcog.2022.109238_bib0033
  article-title: A stratified sampling based clustering algorithm for large-scale data
  publication-title: Knowledge-Based Syst
  doi: 10.1016/j.knosys.2018.09.007
– volume: 83
  start-page: 375
  year: 2018
  ident: 10.1016/j.patcog.2022.109238_bib0009
  article-title: A fast clustering algorithm based on pruning unnecessary distance computations in dbscan for high-dimensional data
  publication-title: Pattern Recognit
  doi: 10.1016/j.patcog.2018.05.030
– volume: 13
  start-page: 1620
  issue: 4
  year: 2017
  ident: 10.1016/j.patcog.2022.109238_bib0024
  article-title: A fast density and grid based clustering method for data with arbitrary shapes and noise
  publication-title: IEEE Trans. Ind. Inform.
  doi: 10.1109/TII.2016.2628747
– volume: 28
  start-page: 3218
  issue: 12
  year: 2016
  ident: 10.1016/j.patcog.2022.109238_bib0026
  article-title: Efficient distributed density peaks for clustering large data sets in mapreduce
  publication-title: IEEE Trans. Knowl. Data Eng.
  doi: 10.1109/TKDE.2016.2609423
– volume: 129
  year: 2022
  ident: 10.1016/j.patcog.2022.109238_bib0002
  article-title: Clustering by centroid drift and boundary shrinkage
  publication-title: Pattern Recognit
  doi: 10.1016/j.patcog.2022.108745
– volume: 125
  start-page: 33
  year: 2019
  ident: 10.1016/j.patcog.2022.109238_bib0005
  article-title: Estimating the number of clusters in a dataset via consensus clustering
  publication-title: Expert Syst. Appl.
  doi: 10.1016/j.eswa.2019.01.074
– volume: 107
  year: 2020
  ident: 10.1016/j.patcog.2022.109238_bib0031
  article-title: Density peaks clustering based on density backbone and fuzzy neighborhood
  publication-title: Pattern Recognit
  doi: 10.1016/j.patcog.2020.107449
– volume: 133
  start-page: 294
  year: 2017
  ident: 10.1016/j.patcog.2022.109238_bib0012
  article-title: An entropy-based density peaks clustering algorithm for mixed type data employing fuzzy neighborhood
  publication-title: Knowledge-Based Syst
  doi: 10.1016/j.knosys.2017.07.027
– volume: 32
  start-page: 1212
  issue: 6
  year: 2020
  ident: 10.1016/j.patcog.2022.109238_bib0034
  article-title: Ultra-scalable spectral clustering and ensemble clustering
  publication-title: IEEE Trans. Knowl. Data Eng.
  doi: 10.1109/TKDE.2019.2903410
– volume: 9
  start-page: 1335
  issue: 8
  year: 2018
  ident: 10.1016/j.patcog.2022.109238_bib0015
  article-title: Density peaks clustering using geodesic distances
  publication-title: Int. J. Mach. Learn. Cybern.
  doi: 10.1007/s13042-017-0648-x
– volume: 59
  start-page: 285
  issue: 2
  year: 2019
  ident: 10.1016/j.patcog.2022.109238_bib0018
  article-title: A novel density peaks clustering with sensitivity of local density and density-adaptive metric
  publication-title: Knowl. Inf. Syst.
  doi: 10.1007/s10115-018-1189-7
– volume: 547
  start-page: 404
  year: 2021
  ident: 10.1016/j.patcog.2022.109238_bib0008
  article-title: An incremental density-based clustering framework using fuzzy local clustering
  publication-title: Inf. Sci.
  doi: 10.1016/j.ins.2020.08.052
– volume: 23
  start-page: 5171
  issue: 13
  year: 2019
  ident: 10.1016/j.patcog.2022.109238_bib0013
  article-title: A feasible density peaks clustering algorithm with a merging strategy
  publication-title: Soft Comput
  doi: 10.1007/s00500-018-3183-0
– volume: 554
  start-page: 61
  year: 2021
  ident: 10.1016/j.patcog.2022.109238_bib0028
  article-title: A fast density peaks clustering algorithm with sparse search
  publication-title: Inf. Sci.
  doi: 10.1016/j.ins.2020.11.050
– volume: 200
  start-page: 1
  year: 2020
  ident: 10.1016/j.patcog.2022.109238_bib0014
  article-title: A robust density peaks clustering algorithm with density-sensitive similarity
  publication-title: Knowledge-Based Syst
  doi: 10.1016/j.knosys.2020.106028
– volume: 28
  start-page: 68
  issue: 1
  year: 2016
  ident: 10.1016/j.patcog.2022.109238_bib0035
  article-title: kNNVWC: an efficient k-nearest neighbors approach based on various-widths clustering
  publication-title: IEEE Trans. Knowl. Data Eng.
  doi: 10.1109/TKDE.2015.2460735
– volume: 9
  start-page: 743
  issue: 5
  year: 2018
  ident: 10.1016/j.patcog.2022.109238_bib0023
  article-title: GDCG: an efficient density peak clustering algorithm based on grid
  publication-title: Int. J. Mach. Learn. Cybern.
  doi: 10.1007/s13042-016-0603-2
– volume: 51
  start-page: 2031
  year: 2021
  ident: 10.1016/j.patcog.2022.109238_bib0007
  article-title: Chameleon algorithm based on mutual K-nearest neighbors
  publication-title: Appl. Intell.
  doi: 10.1007/s10489-020-01926-7
– volume: 99
  start-page: 135
  year: 2016
  ident: 10.1016/j.patcog.2022.109238_bib0016
  article-title: Study on density peaks clustering based on k-nearest neighbors and principal component analysis
  publication-title: Knowledge-Based Syst
  doi: 10.1016/j.knosys.2016.02.001
– volume: 115
  start-page: 314
  year: 2019
  ident: 10.1016/j.patcog.2022.109238_bib0017
  article-title: Dynamic graph-based label propagation for density peaks clustering
  publication-title: Expert Syst. Appl.
  doi: 10.1016/j.eswa.2018.07.075
– volume: 13
  start-page: 1620
  issue: 4
  year: 2017
  ident: 10.1016/j.patcog.2022.109238_bib0025
  article-title: Fast density clustering strategies based on the k-means algorithm
  publication-title: Pattern Recognit
– volume: 445
  start-page: 401
  year: 2021
  ident: 10.1016/j.patcog.2022.109238_bib0020
  article-title: Fast hierarchical clustering of local density peaks via an association degree transfer method
  publication-title: Neurocomputing
  doi: 10.1016/j.neucom.2021.05.071
– volume: 187
  year: 2020
  ident: 10.1016/j.patcog.2022.109238_bib0022
  article-title: Fast density peak clustering for large scale data based on kNN
  publication-title: Knowledge-Based Syst
  doi: 10.1016/j.knosys.2019.06.032
– volume: 510
  start-page: 155
  year: 2020
  ident: 10.1016/j.patcog.2022.109238_bib0006
  article-title: Efficient chameleon hash functions in the enhanced collision resistant model
  publication-title: Inf. Sci.
  doi: 10.1016/j.ins.2019.09.001
– ident: 10.1016/j.patcog.2022.109238_bib0036
  doi: 10.1016/j.knosys.2019.105088
– volume: 334
  start-page: 1492
  issue: 6191
  year: 2014
  ident: 10.1016/j.patcog.2022.109238_bib0011
  article-title: Clustering by fast search and find of density peaks
  publication-title: Science
  doi: 10.1126/science.1242072
– volume: 158
  start-page: 65
  year: 2018
  ident: 10.1016/j.patcog.2022.109238_bib0021
  article-title: An improved density peaks clustering algorithm with fast finding cluster centers
  publication-title: Knowledge-Based Syst
  doi: 10.1016/j.knosys.2018.05.034
– volume: 108
  year: 2020
  ident: 10.1016/j.patcog.2022.109238_bib0030
  article-title: Density peaks clustering based on relative density relationship
  publication-title: Pattern Recognit
  doi: 10.1016/j.patcog.2020.107554
– volume: 31
  start-page: 725
  issue: 3
  year: 2020
  ident: 10.1016/j.patcog.2022.109238_bib0001
  article-title: LABIN: balanced min cut for large-scale data
  publication-title: IEEE Trans. Neural Netw. Learn. Syst.
  doi: 10.1109/TNNLS.2019.2909425
– start-page: 1027
  year: 2007
  ident: 10.1016/j.patcog.2022.109238_bib0037
  article-title: k-means++: The advantages of careful seeding
– volume: 107
  year: 2020
  ident: 10.1016/j.patcog.2022.109238_bib0029
  article-title: Adaptive core fusion-based density peaks clustering for complex data with arbitrary shapes and densities
  publication-title: Pattern Recognit
  doi: 10.1016/j.patcog.2020.107452
SSID ssj0017142
Score 2.577027
Snippet •An improved triangle-inequality-based search strategy is proposed.•An approximate local density calculation of representatives is proposed.•Experiments show...
SourceID crossref
elsevier
SourceType Enrichment Source
Index Database
Publisher
StartPage 109238
SubjectTerms Density peaks clustering
Large-scale data
Sampling method
TI search strategy
Title A Sampling-Based Density Peaks Clustering Algorithm for Large-Scale Data
URI https://dx.doi.org/10.1016/j.patcog.2022.109238
Volume 136
WOSCitedRecordID wos000900808400008&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
journalDatabaseRights – providerCode: PRVESC
  databaseName: Elsevier SD Freedom Collection Journals 2021
  customDbUrl:
  eissn: 1873-5142
  dateEnd: 99991231
  omitProxy: false
  ssIdentifier: ssj0017142
  issn: 0031-3203
  databaseCode: AIEXJ
  dateStart: 19950101
  isFulltext: true
  titleUrlDefault: https://www.sciencedirect.com
  providerName: Elsevier
link http://cvtisr.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwtV09b9swECVap0OXfhdNmxYcugUMRFKyqNFNEySFEQSwB20CRZ8cJa5s2HKRn9-jRH24CdJm6CIYtEgLes_HI_nujpCvPJoFRkvJhsY3zI8gZRoAGAfDweMmg7RCehxeXKg4ji5d7MmmKicQFoW6vY1W_xVqbEOwbejsI-BuB8UG_Iyg4xVhx-s_AT86nGgrEy_m7BtOUVZoXNSRgKBvNofHi63NjVDthizmy3VeXv2stIZjqwlnE8QMkAul7rutl1UWThv54uRG3eH9d1cUZXKVZ5C3-p7cneUvm5Z4a1vivGtpeo6b2dNtPgjZ06w4gyo5k8KTOwZV9k0i99CHVPda63rj4PpohbPOco6LdSGOutt3k2P_MWm1UsJGpXad1KMkdpSkHuUp2RNhEKkB2Rudn8Q_2uOlkPt1Gnn39E1MZSX8u_s09_ssPT9k-oq8cAsIOqqBf02eQPGGvGyKc1Bnq9-SsxHd5QF1PKAVD2jHA9rygCIPaI8H1PLgHZmenkyPz5grm8EMrv9KFmWeDgMTBhClvo6sSC_0h6lNABsI42WzALhRShqhhUr5TMzQhkOQaa20QIv_ngyKZQEfCPUBhDZD_E6kfqZMGvhSG894Unsh_vn3iWzeS2JcSnlb2WSRPITKPmFtr1WdUuUv94fNK0-cW1i7ewny6MGeHx_5S5_I847kB2RQrrfwmTwzv8p8s_7iSPQbZOSEBQ
linkProvider Elsevier
openUrl ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=A+Sampling-Based+Density+Peaks+Clustering+Algorithm+for+Large-Scale+Data&rft.jtitle=Pattern+recognition&rft.au=Ding%2C+Shifei&rft.au=Li%2C+Chao&rft.au=Xu%2C+Xiao&rft.au=Ding%2C+Ling&rft.date=2023-04-01&rft.issn=0031-3203&rft.volume=136&rft.spage=109238&rft_id=info:doi/10.1016%2Fj.patcog.2022.109238&rft.externalDBID=n%2Fa&rft.externalDocID=10_1016_j_patcog_2022_109238