A probabilistic framework for multiple speaker localization
| Published in: | Proceedings of the ... IEEE International Conference on Acoustics, Speech and Signal Processing (1998) pp. 3962 - 3966 |
|---|---|
| Main Authors: | Youssef Oualil, Mathew Magimai-Doss, Friedrich Faubel, Dietrich Klakow |
| Format: | Conference Proceeding |
| Language: | English |
| Published: | IEEE, 01.05.2013 |
| Subjects: | |
| ISSN: | 1520-6149 |
| Abstract | This paper presents a novel probabilistic framework for localizing multiple speakers with a microphone array. In this framework, the generalized cross correlation function (GCC) of each microphone pair is interpreted as a probability distribution of the time difference of arrival (TDOA) and subsequently approximated as a Gaussian mixture. The distribution parameters are estimated with a weighted expectation maximization algorithm. Then, the joint distribution of the TDOA Gaussian mixtures is mapped to a multimodal distribution in the location space, where each mode represents a potential source location. The approach taken here performs the localization by 1) reducing the search space to some regions that are likely to contain a source and then 2) extracting the actual speaker locations with a numerical optimization algorithm. The effectiveness of the proposed approach is shown using the AV16.3 corpus. |
|---|---|
| Author | Klakow, Dietrich; Faubel, Friedrich; Magimai-Doss, Mathew; Oualil, Youssef |
| Author_xml | 1. Youssef Oualil (youssef.oualil@lsv.uni-saarland.de), Spoken Language Systems, Saarland University, Saarbrücken, Germany; 2. Mathew Magimai-Doss, Idiap Research Institute, CH-1920 Martigny, Switzerland; 3. Friedrich Faubel, Spoken Language Systems, Saarland University, Saarbrücken, Germany; 4. Dietrich Klakow, Spoken Language Systems, Saarland University, Saarbrücken, Germany |
| ContentType | Conference Proceeding |
| DBID | 6IE 6IH CBEJK RIE RIO |
| DOI | 10.1109/ICASSP.2013.6638402 |
| DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings; IEEE Proceedings Order Plan (POP) 1998-present by volume; IEEE Xplore All Conference Proceedings; IEEE/IET Electronic Library; IEEE Proceedings Order Plans (POP) 1998-present |
| DatabaseTitleList | |
| Database_xml | – sequence: 1 dbid: RIE name: IEEE Electronic Library (IEL) url: https://ieeexplore.ieee.org/ sourceTypes: Publisher |
| DeliveryMethod | fulltext_linktorsrc |
| Discipline | Engineering |
| EISBN | 1479903566 9781479903566 |
| EndPage | 3966 |
| ExternalDocumentID | 6638402 |
| Genre | orig-research |
| IEDL.DBID | RIE |
| ISICitedReferencesCount | 5 |
| ISSN | 1520-6149 |
| IngestDate | Wed Aug 27 05:56:45 EDT 2025 |
| IsDoiOpenAccess | false |
| IsOpenAccess | true |
| IsPeerReviewed | false |
| IsScholarly | true |
| Language | English |
| LinkModel | DirectLink |
| MergedId | FETCHMERGED-LOGICAL-i220t-3b2f46250e4a94616252ebfb886c6874b6a567e42f05f39a90feeea76c4b76ee3 |
| OpenAccessLink | http://infoscience.epfl.ch/record/192555 |
| PageCount | 5 |
| ParticipantIDs | ieee_primary_6638402 |
| PublicationCentury | 2000 |
| PublicationDate | 2013-May |
| PublicationDateYYYYMMDD | 2013-05-01 |
| PublicationDate_xml | – month: 05 year: 2013 text: 2013-May |
| PublicationDecade | 2010 |
| PublicationTitle | Proceedings of the ... IEEE International Conference on Acoustics, Speech and Signal Processing (1998) |
| PublicationTitleAbbrev | ICASSP |
| PublicationYear | 2013 |
| Publisher | IEEE |
| Publisher_xml | – name: IEEE |
| SSID | ssj0008748 |
| Score | 1.9137945 |
| SourceID | ieee |
| SourceType | Publisher |
| StartPage | 3962 |
| SubjectTerms | Acoustics; Arrays; Gaussian mixture; Joints; localization; Microphone arrays; Microphones; multiple speaker; Position measurement; Probabilistic logic; Speech; steered response power |
| Title | A probabilistic framework for multiple speaker localization |
| URI | https://ieeexplore.ieee.org/document/6638402 |
| hasFullText | 1 |
| inHoldings | 1 |
| isFullTextHit | |
| isPrint | |
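
The abstract describes treating the GCC of a microphone pair as a probability distribution over the TDOA and approximating it with a Gaussian mixture whose parameters are estimated by weighted expectation maximization. The record does not give the paper's exact formulation, so the following is only a minimal sketch of that one step under stated assumptions: the GCC is sampled on a lag grid, negative values are clipped to zero and the curve is normalized to sum to one, and each grid point enters the EM updates in proportion to that weight. The function name `weighted_em_gmm`, its parameters, the quantile-based initialization, and the synthetic two-peak curve are all hypothetical and chosen for illustration.

```python
import numpy as np

def weighted_em_gmm(lags, weights, n_components=2, n_iter=100, var_floor=1e-6):
    """Fit a 1-D Gaussian mixture to grid points `lags` carrying nonnegative `weights`.

    Assumption: `weights` is a GCC-like curve sampled at `lags`; it is clipped
    at zero and normalized so it can be read as a discrete pdf of the TDOA.
    """
    lags = np.asarray(lags, dtype=float)
    w = np.clip(np.asarray(weights, dtype=float), 0.0, None)
    w = w / w.sum()

    # Initialize the means at evenly spaced quantiles of the weighted CDF.
    cdf = np.cumsum(w)
    q = (np.arange(n_components) + 0.5) / n_components
    means = np.interp(q, cdf, lags)
    total_var = np.sum(w * (lags - np.sum(w * lags)) ** 2)
    variances = np.full(n_components, total_var / n_components + var_floor)
    priors = np.full(n_components, 1.0 / n_components)

    for _ in range(n_iter):
        # E-step: responsibility of each component for each grid point.
        diff = lags[:, None] - means[None, :]
        log_pdf = -0.5 * (diff ** 2 / variances + np.log(2.0 * np.pi * variances))
        log_joint = np.log(priors) + log_pdf
        log_joint -= log_joint.max(axis=1, keepdims=True)  # numerical stability
        resp = np.exp(log_joint)
        resp /= resp.sum(axis=1, keepdims=True)

        # M-step: every grid point counts in proportion to its weight.
        wr = w[:, None] * resp
        nk = wr.sum(axis=0) + 1e-12
        priors = nk / nk.sum()
        means = (wr * lags[:, None]).sum(axis=0) / nk
        diff = lags[:, None] - means[None, :]
        variances = (wr * diff ** 2).sum(axis=0) / nk + var_floor

    return priors, means, variances

# Synthetic stand-in for a GCC with two peaks (two hypothetical sources).
lags = np.linspace(-1.0, 1.0, 401)  # TDOA grid in arbitrary units
gcc = (0.6 * np.exp(-0.5 * ((lags + 0.4) / 0.05) ** 2)
       + 0.4 * np.exp(-0.5 * ((lags - 0.3) / 0.05) ** 2))
priors, means, variances = weighted_em_gmm(lags, gcc, n_components=2)
print(priors, means, np.sqrt(variances))
```

In this sketch each mixture component stands for one candidate TDOA, with its prior reflecting the relative mass of the corresponding GCC peak. Mapping the per-pair mixtures into the location space and the subsequent numerical optimization, as outlined in the abstract, are not covered here.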