Exact memory–constrained UPGMA for large scale speaker clustering

Bibliographic Details
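Published in: Pattern Recognition, Vol. 95, pp. 235-246
Main Authors: Cumani, Sandro; Laface, Pietro
Format: Journal Article
Language: English
Published: Elsevier Ltd, 01.11.2019
Subjects: Cluster quality measures; Clustering; PLDA; PSVM; Reciprocal Nearest Neighbor; Silhouette; Similarity measures; UPGMA
ISSN: 0031-3203, 1873-5142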
Abstract
• We focus on exact hierarchical clustering of large sets of utterances.
• Hierarchical clustering is challenging due to memory constraints.
• We propose an efficient, exact and parallel implementation of UPGMA clustering.
• We extend the Clustering Features concept to speaker recognition scoring functions.
• We assess the efficiency of our method on datasets including 4 million utterances.

This work focuses on clustering large sets of utterances collected from an unknown number of speakers. Since the number of speakers is unknown, we focus on exact hierarchical agglomerative clustering, followed by automatic selection of the number of clusters. Exact hierarchical clustering of a large number of vectors, however, is a challenging task due to memory constraints, which make it ineffective or infeasible for large datasets. We propose an exact, memory–constrained and parallel implementation of average linkage clustering for large scale datasets. Its computational complexity is approximately O(N²), yet it is much faster (up to 40 times in our experiments) than the Reciprocal Nearest Neighbor chain algorithm, which also has O(N²) complexity. We also propose a very fast silhouette computation procedure that determines the set of clusters in linear time. The computational efficiency of our approach is demonstrated on datasets including up to 4 million speaker vectors.
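To make the memory problem concrete: a full pairwise score matrix for N = 4 million vectors holds N(N-1)/2, roughly 8 × 10¹² entries, which is about 32 TB if each score is stored in 4 bytes (the 4-byte width is an assumption for illustration). The sketch below is a textbook, in-memory UPGMA (average linkage) on a small distance matrix. It is not the authors' memory-constrained or parallel method, and it runs in cubic time; the function name `upgma` and the toy data are hypothetical. It only shows the size-weighted row update that defines average linkage.

```python
def upgma(dist):
    """Naive exact UPGMA on a symmetric distance matrix (list of lists).

    Returns the merge list [(cluster_a, cluster_b, distance, new_size), ...].
    Original points have ids 0..n-1; merged clusters get ids n, n+1, ...
    """
    n = len(dist)
    # Distances between active clusters, keyed by cluster id.
    d = {i: {j: dist[i][j] for j in range(n) if j != i} for i in range(n)}
    size = {i: 1 for i in range(n)}
    merges = []
    next_id = n
    while len(d) > 1:
        # Closest pair of active clusters; ties are broken by id order.
        a, b = min(
            ((i, j) for i in d for j in d[i] if i < j),
            key=lambda p: (d[p[0]][p[1]], p),
        )
        dab = d[a][b]
        na, nb = size[a], size[b]
        # Average-linkage update: size-weighted mean of the two old rows.
        new_row = {
            k: (na * d[a][k] + nb * d[b][k]) / (na + nb)
            for k in d
            if k not in (a, b)
        }
        # Retire a and b, then register the merged cluster.
        del d[a], d[b]
        for k in d:
            del d[k][a], d[k][b]
            d[k][next_id] = new_row[k]
        d[next_id] = new_row
        size[next_id] = na + nb
        merges.append((a, b, dab, na + nb))
        next_id += 1
    return merges


# Toy example: four points on a line at positions 0, 1, 5 and 6.
points = [0, 1, 5, 6]
toy = [[abs(p - q) for q in points] for p in points]
print(upgma(toy))  # [(0, 1, 1, 2), (2, 3, 1, 2), (4, 5, 5.0, 4)]
```

The last merge distance, 5.0, equals the mean of the four cross-pair distances (5, 6, 4, 5), which is what average linkage is meant to produce. Because the dictionary of rows grows quadratically with N, this form only works for small inputs.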
Author Laface, Pietro
Cumani, Sandro
Author_xml – sequence: 1
  givenname: Sandro
  orcidid: 0000-0001-6036-0065
  surname: Cumani
  fullname: Cumani, Sandro
  email: sandro.cumani@polito.it
– sequence: 2
  givenname: Pietro
  orcidid: 0000-0003-2841-7695
  surname: Laface
  fullname: Laface, Pietro
  email: pietro.laface@polito.it
ContentType Journal Article
Copyright 2019 Elsevier Ltd
DOI 10.1016/j.patcog.2019.06.018
Discipline Computer Science
EISSN 1873-5142
EndPage 246
ISSN 0031-3203
IsPeerReviewed true
IsScholarly true
Keywords Reciprocal Nearest Neighbor
Cluster quality measures
UPGMA
Similarity measures
Silhouette
Clustering
PLDA
PSVM
Language English
ORCID 0000-0003-2841-7695
0000-0001-6036-0065
PageCount 12
PublicationDate November 2019
PublicationTitle Pattern recognition
Publisher Elsevier Ltd
StartPage 235
URI https://dx.doi.org/10.1016/j.patcog.2019.06.018
Volume 95