Efficient algorithm for big data clustering on single machine

Bibliographic Details
Published in: CAAI Transactions on Intelligence Technology, Vol. 5, no. 1, pp. 9-14
Main Authors: Alguliyev, Rasim M; Aliguliyev, Ramiz M; Sukhostat, Lyudmila V
Format: Journal Article
Language: English
Published: Beijing The Institution of Engineering and Technology 01.03.2020
John Wiley & Sons, Inc
Wiley
Subjects:
ISSN: 2468-2322, 2468-6557
Abstract: Big data analysis typically requires substantial computing power, which is not always available. This creates a need for clustering algorithms that can process such data on limited hardware. This study proposes a new parallel clustering algorithm, based on the k-means algorithm, that significantly reduces the exponential growth of computations. The proposed algorithm splits a dataset into batches while preserving the characteristics of the initial dataset and increasing the clustering speed. The idea is to compute cluster centroids for each batch and then to cluster those centroids in turn. Each data point is then assigned to the cluster with the nearest of the resulting centroids. Experiments on real large datasets evaluate the effectiveness of the proposed approach, which is compared with k-means and a modification of it. The experiments show that the proposed algorithm is a promising tool for clustering large datasets in comparison with the k-means algorithm.
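The batching idea described in the abstract can be sketched roughly as follows. This is a minimal, sequential illustration and not the authors' implementation: the function names (`batch_kmeans`, `kmeans`, `init_centroids`), the random shuffle used to form batches, and the farthest-point seeding are all assumptions made here to keep the example self-contained and deterministic, since the abstract does not specify them. The paper describes a parallel algorithm; in this sketch the per-batch loop marks where that parallelism would apply.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids):
    """Index of the centroid closest to `point`."""
    return min(range(len(centroids)), key=lambda i: sq_dist(point, centroids[i]))


def init_centroids(points, k):
    """Farthest-point seeding (an assumption, chosen for determinism)."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(far))
    return centroids


def kmeans(points, k, iters=20):
    """Plain Lloyd iterations; returns the k centroids."""
    centroids = init_centroids(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        for i, g in enumerate(groups):
            if g:  # an empty cluster keeps its previous centroid
                centroids[i] = [sum(c) / len(g) for c in zip(*g)]
    return centroids


def batch_kmeans(points, k, n_batches, seed=0):
    """Cluster per batch, cluster the batch centroids, then assign all points."""
    shuffled = list(points)
    # Hypothetical batching rule: a seeded shuffle so each batch roughly
    # mirrors the full dataset. Each batch must hold at least k points.
    random.Random(seed).shuffle(shuffled)
    batches = [shuffled[i::n_batches] for i in range(n_batches)]

    batch_centroids = []
    for batch in batches:  # independent work; could run in parallel
        batch_centroids.extend(kmeans(batch, k))

    final_centroids = kmeans(batch_centroids, k)
    labels = [nearest(p, final_centroids) for p in points]
    return final_centroids, labels
```

A call such as `batch_kmeans(data, k=2, n_batches=4)` returns the final centroids and one label per input point, in the original input order. The second `kmeans` call only ever sees `k * n_batches` points, which is where a scheme of this shape would save work relative to running k-means on the full dataset.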
Author Sukhostat, Lyudmila V
Aliguliyev, Ramiz M
Alguliyev, Rasim M
Author_xml – sequence: 1
  givenname: Rasim M
  surname: Alguliyev
  fullname: Alguliyev, Rasim M
  organization: Institute of Information Technology, Azerbaijan National Academy of Sciences, 9A, B. Vahabzade Street, AZ1141 Baku, Azerbaijan
– sequence: 2
  givenname: Ramiz M
  orcidid: 0000-0001-9795-1694
  surname: Aliguliyev
  fullname: Aliguliyev, Ramiz M
  email: r.aliguliyev@gmail.com
  organization: Institute of Information Technology, Azerbaijan National Academy of Sciences, 9A, B. Vahabzade Street, AZ1141 Baku, Azerbaijan
– sequence: 3
  givenname: Lyudmila V
  orcidid: 0000-0001-9449-7457
  surname: Sukhostat
  fullname: Sukhostat, Lyudmila V
  organization: Institute of Information Technology, Azerbaijan National Academy of Sciences, 9A, B. Vahabzade Street, AZ1141 Baku, Azerbaijan
ContentType Journal Article
Copyright 2020 CAAI Transactions on Intelligence Technology published by John Wiley & Sons Ltd on behalf of Chongqing University of Technology
2020. This work is published under http://creativecommons.org/licenses/by-nc-nd/3.0/ (the "License"). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
DOI 10.1049/trit.2019.0048
EISSN 2468-2322
EndPage 14
ExternalDocumentID oai_doaj_org_article_23f5ece2d80e41059afff3fda2b6097a
10_1049_trit_2019_0048
CIT2BF00088
Genre article
GeographicLocations United States--US
GeographicLocations_xml – name: United States--US
ISICitedReferencesCount 70
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 1
Keywords data analysis
computing powers
single machine
big data clustering
data processing
Big Data
data points
cluster centroids
initial dataset
pattern clustering
clustering algorithms
clustering speed
k-means algorithm
big data analysis
Language English
License Attribution-NonCommercial-NoDerivs
ORCID 0000-0001-9795-1694
0000-0001-9449-7457
OpenAccessLink https://doaj.org/article/23f5ece2d80e41059afff3fda2b6097a
PageCount 6
PublicationCentury 2000
PublicationDate March 2020
PublicationDateYYYYMMDD 2020-03-01
PublicationDate_xml – month: 03
  year: 2020
  text: March 2020
PublicationDecade 2020
PublicationPlace Beijing
PublicationPlace_xml – name: Beijing
PublicationTitle CAAI Transactions on Intelligence Technology
PublicationYear 2020
Publisher The Institution of Engineering and Technology
John Wiley & Sons, Inc
Wiley
Publisher_xml – name: The Institution of Engineering and Technology
– name: John Wiley & Sons, Inc
– name: Wiley
StartPage 9
SubjectTerms Accelerometers
Algorithms
Big Data
big data analysis
big data clustering
C6130 Data handling techniques
Censuses
Centroids
cluster centroids
Clustering
clustering algorithms
clustering speed
computing powers
Data analysis
Data points
Data processing
Datasets
Efficiency
Experiments
initial dataset
k-means algorithm
Massive data points
pattern clustering
Performance evaluation
Personal computers
Research Article
single machine
Standard deviation
Title Efficient algorithm for big data clustering on single machine
URI http://digital-library.theiet.org/content/journals/10.1049/trit.2019.0048
https://onlinelibrary.wiley.com/doi/abs/10.1049%2Ftrit.2019.0048
https://www.proquest.com/docview/3091978168
https://doaj.org/article/23f5ece2d80e41059afff3fda2b6097a
Volume 5
WOSCitedRecordID wos000597164200002&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
hasFullText 1
inHoldings 1
isFullTextHit
isPrint