Exact memory–constrained UPGMA for large scale speaker clustering
| Published in: | Pattern Recognition, Volume 95, pp. 235-246 |
|---|---|
| Main authors: | Cumani, Sandro; Laface, Pietro |
| Format: | Journal Article |
| Language: | English |
| Published: | Elsevier Ltd, 1 November 2019 |
| Subjects: | Cluster quality measures; Clustering; PLDA; PSVM; Reciprocal Nearest Neighbor; Silhouette; Similarity measures; UPGMA |
| ISSN: | 0031-3203, 1873-5142 |
| Abstract | • We focus on exact hierarchical clustering of large sets of utterances.
• Hierarchical clustering is challenging due to memory constraints.
• We propose an efficient, exact and parallel implementation of UPGMA clustering.
• We extend the Clustering Features concept to speaker recognition scoring functions.
• We assess the efficiency of our method on datasets including 4 million utterances.
This work focuses on clustering large sets of utterances collected from an unknown number of speakers. Since the number of speakers is unknown, we focus on exact hierarchical agglomerative clustering, followed by automatic selection of the number of clusters. Exact hierarchical clustering of a large number of vectors, however, is a challenging task due to memory constraints, which make it ineffective or infeasible for large datasets. We propose an exact memory–constrained and parallel implementation of average linkage clustering for large scale datasets, showing that its computational complexity is approximately $O(N^2)$, yet it is much faster (up to 40 times in our experiments) than the Reciprocal Nearest Neighbor chain algorithm, which also has $O(N^2)$ complexity. We also propose a very fast silhouette computation procedure that, in linear time, determines the set of clusters. The computational efficiency of our approach is demonstrated on datasets including up to 4 million speaker vectors. |
|---|---|
| Author | Laface, Pietro; Cumani, Sandro |
| Author_xml | 1. Cumani, Sandro (ORCID: 0000-0001-6036-0065; email: sandro.cumani@polito.it); 2. Laface, Pietro (ORCID: 0000-0003-2841-7695; email: pietro.laface@polito.it) |
| ContentType | Journal Article |
| Copyright | 2019 Elsevier Ltd |
| DOI | 10.1016/j.patcog.2019.06.018 |
| Discipline | Computer Science |
| EISSN | 1873-5142 |
| EndPage | 246 |
| ISICitedReferencesCount | 4 |
| ISSN | 0031-3203 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Keywords | Reciprocal Nearest Neighbor; Cluster quality measures; UPGMA; Similarity measures; Silhouette; Clustering; PLDA; PSVM |
| Language | English |
| ORCID | 0000-0003-2841-7695 0000-0001-6036-0065 |
| PageCount | 12 |
| PublicationCentury | 2000 |
| PublicationDate | November 2019 2019-11-00 |
| PublicationDateYYYYMMDD | 2019-11-01 |
| PublicationDecade | 2010 |
| PublicationTitle | Pattern recognition |
| PublicationYear | 2019 |
| Publisher | Elsevier Ltd |
| StartPage | 235 |
| SubjectTerms | Cluster quality measures; Clustering; PLDA; PSVM; Reciprocal Nearest Neighbor; Silhouette; Similarity measures; UPGMA |
| Title | Exact memory–constrained UPGMA for large scale speaker clustering |
| URI | https://dx.doi.org/10.1016/j.patcog.2019.06.018 |
| Volume | 95 |
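
The abstract describes average linkage (UPGMA) clustering followed by a silhouette-based choice of the number of clusters. As a rough illustration of those two ideas only, the sketch below implements the naive textbook versions in pure Python. It is not the authors' memory-constrained, parallel method, and its silhouette step is the plain quadratic computation rather than the linear-time procedure the abstract mentions. The function names `upgma`, `mean_silhouette` and `best_cut`, and the toy data, are hypothetical.

```python
import math
from itertools import combinations


def upgma(dist):
    """Naive exact UPGMA on a full symmetric distance matrix (list of lists).

    Returns the merge history as (cluster_a, cluster_b, linkage_distance),
    where each cluster is a tuple of original point indices.
    """
    n = len(dist)
    # Active clusters, keyed by an integer id; values are member index tuples.
    clusters = {i: (i,) for i in range(n)}
    # Average inter-cluster distances, keyed by the sorted pair of cluster ids.
    d = {(i, j): dist[i][j] for i, j in combinations(range(n), 2)}
    merges = []
    next_id = n
    while len(clusters) > 1:
        # Pick the closest pair of active clusters.
        (a, b), best = min(d.items(), key=lambda kv: kv[1])
        na, nb = len(clusters[a]), len(clusters[b])
        merges.append((clusters[a], clusters[b], best))
        # Size-weighted update: the average distance from the merged cluster
        # to any other cluster follows from the two old averages alone.
        new_d = {}
        for c in clusters:
            if c in (a, b):
                continue
            dac = d[(min(a, c), max(a, c))]
            dbc = d[(min(b, c), max(b, c))]
            new_d[(c, next_id)] = (na * dac + nb * dbc) / (na + nb)
        # Drop every entry that involves one of the two merged clusters.
        d = {k: v for k, v in d.items() if a not in k and b not in k}
        d.update(new_d)
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        next_id += 1
    return merges


def mean_silhouette(dist, labels):
    """Mean silhouette over all points (naive, quadratic in the point count)."""
    groups = {}
    for i, lab in enumerate(labels):
        groups.setdefault(lab, []).append(i)
    if len(groups) < 2:
        return 0.0  # the silhouette is undefined for a single cluster
    total = 0.0
    for i, lab in enumerate(labels):
        own = groups[lab]
        if len(own) == 1:
            continue  # points in singleton clusters contribute 0
        a = sum(dist[i][j] for j in own if j != i) / (len(own) - 1)
        b = min(sum(dist[i][j] for j in g) / len(g)
                for other, g in groups.items() if other != lab)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(labels)


def best_cut(dist):
    """Replay the merges and keep the partition with the highest mean silhouette."""
    labels = list(range(len(dist)))
    best_score, best_labels = -math.inf, labels[:]
    for left, right, _ in upgma(dist):
        target = labels[left[0]]
        for i in right:
            labels[i] = target
        score = mean_silhouette(dist, labels)
        if score > best_score:
            best_score, best_labels = score, labels[:]
    return best_labels, best_score


# Toy data (made up for illustration): four points on a line.
points = [0.0, 1.0, 10.0, 11.0]
dist = [[abs(p - q) for q in points] for p in points]
merges = upgma(dist)
labels, score = best_cut(dist)
```

On this toy input the two tight pairs are merged first, and the two-cluster partition has the highest mean silhouette. The sketch holds the full distance table in memory, which is exactly the cost that becomes prohibitive at the scale the abstract targets.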