Efficient approximation algorithms for clustering point-sets


Bibliographic details
Published in: Computational geometry : theory and applications, Volume 43, Issue 1, pp. 59-66
Main authors: Xu, Guang; Xu, Jinhui
Format: Journal Article
Language: English
Published: Elsevier B.V 2010
ISSN: 0925-7721
Online access: Get full text
Abstract: In this paper, we consider the problem of clustering a set of $n$ finite point-sets in $d$-dimensional Euclidean space. Unlike the traditional clustering problem (the points clustering problem, in which the objects to be clustered are points), the point-sets clustering problem requires that all points in a single point-set be assigned to the same cluster. This requirement breaks the metric property of the underlying distance function among point-sets and makes the clustering problem considerably harder. We use a number of observations and techniques to overcome this difficulty. For the $k$-center clustering problem on point-sets, we give an $O(m + n \log k)$-time 3-approximation algorithm and an $O(km)$-time $(1+\sqrt{3})$-approximation algorithm, where $m$ is the total number of input points and $k$ is the number of clusters. When $k$ is a small constant, the performance ratio of our algorithm reduces to $(2+\epsilon)$ for any $\epsilon > 0$. For the $k$-median problem on point-sets, we present a polynomial-time $(3+\epsilon)$-approximation algorithm. Our approaches are rather general and can be easily implemented in practice.
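The same-cluster requirement described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' algorithm; it only evaluates a k-center style cost for given centers, under the assumption (not stated in this record) that the cost of a point-set is the distance from its assigned center to its farthest point. The names `set_to_center_distance` and `k_center_cost` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]
PointSet = Sequence[Point]


def set_to_center_distance(point_set: PointSet, center: Point) -> float:
    """Assumed cost of serving a whole point-set from one center:
    the Euclidean distance to the farthest point of the set."""
    return max(math.dist(p, center) for p in point_set)


def k_center_cost(
    point_sets: Sequence[PointSet], centers: Sequence[Point]
) -> Tuple[float, List[int]]:
    """Assign every point-set, as a unit, to the center that minimizes
    its farthest-point distance; return (radius, assignment)."""
    assignment: List[int] = []
    radius = 0.0
    for point_set in point_sets:
        dists = [set_to_center_distance(point_set, c) for c in centers]
        best = min(range(len(centers)), key=dists.__getitem__)
        assignment.append(best)
        radius = max(radius, dists[best])
    return radius, assignment


if __name__ == "__main__":
    centers = [(0.0, 0.0), (10.0, 10.0)]
    # Each point sits exactly on a center, so clustering the points
    # individually would cost 0. As one point-set, both points must share
    # a center, so the cost is the full distance between the two centers.
    straddling = [[(0.0, 0.0), (10.0, 10.0)]]
    print(k_center_cost(straddling, centers))
```

The straddling example shows why point-set clustering is harder than point clustering: a point-set can be far from every center even when each of its points lies on some center.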
Author contact: Xu, Guang (guangxu@cse.buffalo.edu); Xu, Jinhui (jinhui@cse.buffalo.edu)
Copyright: 2009
DOI: 10.1016/j.comgeo.2007.12.002
Discipline: Mathematics
Open access: yes
Peer reviewed: yes
Keywords: K-center clustering; Point-sets; Core-sets; Clustering; K-median clustering
License: http://www.elsevier.com/open-access/userlicense/1.0; https://www.elsevier.com/tdm/userlicense/1.0; https://www.elsevier.com/open-access/userlicense/1.0