A Fast Partitional Clustering Algorithm based on Nearest Neighbours Heuristics
| Published in: | Pattern recognition letters, Vol. 112, pp. 198–204 |
|---|---|
| Author: | Ganguly, Debasis (IBM Research Lab, Dublin, Ireland); email: debasis.ganguly1@ie.ibm.com; ORCID: 0000-0003-0050-7138 |
| Format: | Journal Article |
| Language: | English |
| Published: | Amsterdam: Elsevier B.V. / Elsevier Science Ltd, 01.09.2018 |
| Keywords: | Tweet clustering; Inverted index; Scalable K-means; 41A05; 41A10; 65D05; 65D17 |
| Subjects: | Algorithms; Centroids; Clustering; Computation; Data points; Data structures; Datasets; Iterative methods; Partitions; Problem solving |
| ISSN: | 0167-8655 (print); 1872-7344 (electronic) |
| DOI: | 10.1016/j.patrec.2018.07.017 |
| Copyright: | 2018 Elsevier B.V.; Elsevier Science Ltd, Sep 1, 2018 |
| Online access: | Full text: https://dx.doi.org/10.1016/j.patrec.2018.07.017; https://www.proquest.com/docview/2119939238 |
| Highlights | • Investigates K-means clustering for large collections of sparse vectors of high dimensionality. • Proposes utilizing the inverted list data structure to improve the run-time of K-means. • Proposes heuristics for initial centroid selection and centroid updates. • The proposed approach improves the run-time of K-means by up to 35x on a collection of 14M tweets. |
|---|---|
| Abstract | K-means, along with its several variants, is the most widely used family of partitional clustering algorithms. Generally speaking, this family of algorithms starts by initializing a number of data points as cluster centres, and then iteratively refines these cluster centres based on the current partition of the dataset. Given a set of cluster centres, inducing the partition over the dataset involves finding the nearest (or most similar) cluster centre for each data point, which is an O(NK) operation, N and K being the number of data points and the number of clusters, respectively. In our proposed approach, we avoid the explicit computation of these distances for the case of sparse vectors, e.g. documents, by utilizing a fundamental operation, namely TOP(x), which gives a list of the vectors most similar to the vector x. A standard way to store sparse vectors and retrieve the most similar ones given a query vector is with the help of the inverted list data structure. In our proposed method, we first use the TOP(x) function to select cluster centres that are likely to be dissimilar to each other. Second, to obtain the partition during each iteration of K-means, we avoid the explicit computation of the pair-wise similarities between the centroid and the non-centroid vectors. Third, we avoid recomputation of the cluster centroids by adopting a centrality-based heuristic. We demonstrate the effectiveness of our proposed algorithm on the TREC-2011 Microblog dataset, a large collection of about 14M tweets. Our experiments show that our proposed method is about 35x faster and produces more effective clusters in comparison to the standard K-means algorithm. |
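The abstract hinges on a single retrieval primitive, TOP(x), served by an inverted index. The sketch below illustrates how such a primitive can be built over sparse term-weight vectors and how it could drive both dissimilarity-based seeding and the per-iteration assignment step without an explicit O(NK) scan. This is a minimal illustration under simplifying assumptions, not the paper's implementation: similarity is a plain dot product, the seeding and assignment heuristics are simplified stand-ins for the paper's, and all identifiers (`InvertedIndex`, `top`, `seed_centroids`, `assign`, the cutoff `m`) are hypothetical.

```python
# Minimal sketch: inverted-index TOP(x) plus TOP-driven seeding and
# assignment. Illustrative only; not the paper's actual algorithm.
from collections import defaultdict
import heapq


class InvertedIndex:
    def __init__(self):
        self.postings = defaultdict(list)  # term -> [(doc_id, weight), ...]
        self.docs = {}                     # doc_id -> {term: weight}

    def add(self, doc_id, vec):
        self.docs[doc_id] = vec
        for term, w in vec.items():
            self.postings[term].append((doc_id, w))

    def top(self, x, m):
        """TOP(x): the m indexed vectors most similar to sparse vector x.

        Only postings of terms that occur in x are scanned, so the cost
        scales with those terms' document frequencies rather than with
        the full collection size N.
        """
        scores = defaultdict(float)
        for term, w in x.items():
            for doc_id, dw in self.postings.get(term, ()):
                scores[doc_id] += w * dw  # accumulate dot-product terms
        return heapq.nlargest(m, scores.items(), key=lambda kv: kv[1])


def seed_centroids(index, k, m=10):
    """Greedy seeding: accept a document as a new seed only if no
    already-chosen seed occurs in its TOP(x, m) list, so the seeds are
    likely to be mutually dissimilar. May return fewer than k seeds."""
    seeds = []
    for doc_id, vec in index.docs.items():
        neighbours = {d for d, _ in index.top(vec, m)}
        if neighbours.isdisjoint(seeds):
            seeds.append(doc_id)
            if len(seeds) == k:
                break
    return seeds


def assign(index, centroids, m):
    """One assignment step without the explicit O(NK) pairwise scan:
    each centroid retrieves its TOP list from the index and claims the
    documents it scores highest on. Documents retrieved by no centroid
    remain unassigned in this simplified sketch."""
    best = {}  # doc_id -> (similarity, cluster index)
    for c, cvec in enumerate(centroids):
        for doc_id, score in index.top(cvec, m):
            if score > best.get(doc_id, (0.0, -1))[0]:
                best[doc_id] = (score, c)
    return {doc_id: c for doc_id, (_, c) in best.items()}


if __name__ == "__main__":
    idx = InvertedIndex()
    idx.add("d1", {"storm": 0.9, "rain": 0.4})
    idx.add("d2", {"rain": 0.8, "flood": 0.5})
    idx.add("d3", {"match": 1.0, "goal": 0.7})
    seeds = seed_centroids(idx, k=2, m=2)
    print(assign(idx, [idx.docs[s] for s in seeds], m=3))
```

The design point the abstract emphasizes carries over to the sketch: both seeding and assignment are expressed entirely as TOP(x) calls, so their cost is governed by posting-list lengths for the query's terms rather than by N × K explicit distance computations.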