A probabilistic framework for multiple speaker localization
This paper presents a novel probabilistic framework for localizing multiple speakers with a microphone array. In this framework, the generalized cross correlation function (GCC) of each microphone pair is interpreted as a probability distribution of the time difference of arrival (TDOA) and subseque...
| Published in: | Proceedings of the ... IEEE International Conference on Acoustics, Speech and Signal Processing (1998), pp. 3962-3966 |
|---|---|
| Main authors: | Oualil, Youssef; Magimai-Doss, Mathew; Faubel, Friedrich; Klakow, Dietrich |
| Format: | Conference paper |
| Language: | English |
| Published: | IEEE, 1 May 2013 |
| Subjects: | |
| ISSN: | 1520-6149 |
| Abstract | This paper presents a novel probabilistic framework for localizing multiple speakers with a microphone array. In this framework, the generalized cross correlation function (GCC) of each microphone pair is interpreted as a probability distribution of the time difference of arrival (TDOA) and subsequently approximated as a Gaussian mixture. The distribution parameters are estimated with a weighted expectation maximization algorithm. Then, the joint distribution of the TDOA Gaussian mixtures is mapped to a multimodal distribution in the location space, where each mode represents a potential source location. The approach taken here performs the localization by 1) reducing the search space to some regions that are likely to contain a source and then 2) extracting the actual speaker locations with a numerical optimization algorithm. The effectiveness of the proposed approach is shown using the AV16.3 corpus. |
|---|---|
| Author | Klakow, Dietrich; Faubel, Friedrich; Magimai-Doss, Mathew; Oualil, Youssef |
| Author_xml | 1. Oualil, Youssef (youssef.oualil@lsv.uni-saarland.de), Spoken Language Systems, Saarland University, Saarbrücken, Germany; 2. Magimai-Doss, Mathew, Idiap Research Institute, CH-1920 Martigny, Switzerland; 3. Faubel, Friedrich, Spoken Language Systems, Saarland University, Saarbrücken, Germany; 4. Klakow, Dietrich, Spoken Language Systems, Saarland University, Saarbrücken, Germany |
| ContentType | Conference Proceeding |
| DOI | 10.1109/ICASSP.2013.6638402 |
| DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings; IEEE Proceedings Order Plan (POP) 1998-present by volume; IEEE Xplore All Conference Proceedings; IEEE/IET Electronic Library (IEL) (UW System Shared); IEEE Proceedings Order Plans (POP) 1998-present |
| Database_xml | – sequence: 1 dbid: RIE name: IEEE/IET Electronic Library (IEL) (UW System Shared) url: https://ieeexplore.ieee.org/ sourceTypes: Publisher |
| DeliveryMethod | fulltext_linktorsrc |
| Discipline | Engineering |
| EISBN | 1479903566; 9781479903566 |
| EndPage | 3966 |
| ExternalDocumentID | 6638402 |
| Genre | orig-research |
| ISICitedReferencesCount | 5 |
| ISSN | 1520-6149 |
| IngestDate | Wed Aug 27 05:56:45 EDT 2025 |
| IsDoiOpenAccess | false |
| IsOpenAccess | true |
| IsPeerReviewed | false |
| IsScholarly | true |
| Language | English |
| LinkModel | DirectLink |
| OpenAccessLink | http://infoscience.epfl.ch/record/192555 |
| PageCount | 5 |
| ParticipantIDs | ieee_primary_6638402 |
| PublicationCentury | 2000 |
| PublicationDate | 2013-May |
| PublicationDateYYYYMMDD | 2013-05-01 |
| PublicationDate_xml | – month: 05 year: 2013 text: 2013-May |
| PublicationDecade | 2010 |
| PublicationTitle | Proceedings of the ... IEEE International Conference on Acoustics, Speech and Signal Processing (1998) |
| PublicationTitleAbbrev | ICASSP |
| PublicationYear | 2013 |
| Publisher | IEEE |
| Publisher_xml | – name: IEEE |
| SourceID | ieee |
| SourceType | Publisher |
| StartPage | 3962 |
| SubjectTerms | Acoustics; Arrays; Gaussian mixture; Joints; localization; Microphone arrays; Microphones; multiple speakers; Position measurement; Probabilistic logic; Speech; steered response power |
| Title | A probabilistic framework for multiple speaker localization |
| URI | https://ieeexplore.ieee.org/document/6638402 |
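
The abstract describes treating each microphone pair's GCC as a probability distribution over the TDOA and fitting a Gaussian mixture to it with a weighted expectation maximization algorithm. The sketch below illustrates that one step only, under stated assumptions: the function name `weighted_em_gmm`, the synthetic two-peak GCC curve, the clipping of negative GCC values, and the initial means are all hypothetical choices made for illustration, since the record does not give the paper's exact weighting scheme or initialization.

```python
import numpy as np

def weighted_em_gmm(x, w, means, n_iter=100, min_var=1e-12):
    """Fit a 1-D Gaussian mixture to grid points x carrying nonnegative weights w.

    Hypothetical helper: the weights play the role of (clipped) GCC values,
    so each grid point counts in proportion to the GCC mass at that delay.
    """
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    w = w / w.sum()                       # normalize to a probability mass over x
    mu = np.asarray(means, dtype=float).copy()
    k = mu.size
    var = np.full(k, np.var(x) / k)       # broad initial components
    pi = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: responsibility of each component for each grid point
        d2 = (x[None, :] - mu[:, None]) ** 2
        dens = pi[:, None] * np.exp(-0.5 * d2 / var[:, None]) \
            / np.sqrt(2.0 * np.pi * var[:, None])
        resp = dens / np.maximum(dens.sum(axis=0, keepdims=True), 1e-300)
        # M-step: the usual GMM updates, with each point weighted by w
        rw = resp * w[None, :]
        nk = np.maximum(rw.sum(axis=1), 1e-300)
        pi = nk / nk.sum()
        mu = (rw @ x) / nk
        var = np.maximum((rw * (x[None, :] - mu[:, None]) ** 2).sum(axis=1) / nk,
                         min_var)
    return pi, mu, var

# Synthetic stand-in for a GCC curve with two sources (delays in seconds).
tau = np.linspace(-1e-3, 1e-3, 401)
gcc = (1.0 * np.exp(-0.5 * ((tau + 3e-4) / 3e-5) ** 2)
       + 0.6 * np.exp(-0.5 * ((tau - 4e-4) / 5e-5) ** 2))
gcc = np.maximum(gcc, 0.0)                # a real GCC can be negative; clipping is an assumption

pi, mu, var = weighted_em_gmm(tau, gcc, means=[-5e-4, 5e-4])
print("mixture weights:", pi)
print("TDOA means (s):", mu)
print("TDOA std devs (s):", np.sqrt(var))
```

In this toy setting each fitted component corresponds to one TDOA candidate for the pair. The mapping of the per-pair mixtures to a multimodal distribution in location space and the subsequent numerical optimization, both mentioned in the abstract, are not shown here.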