A probabilistic framework for multiple speaker localization

Bibliographic Details
Published in: Proceedings of the ... IEEE International Conference on Acoustics, Speech and Signal Processing (1998) pp. 3962-3966
Main Authors: Oualil, Youssef, Magimai-Doss, Mathew, Faubel, Friedrich, Klakow, Dietrich
Format: Conference Proceeding
Language: English
Published: IEEE 01.05.2013
Subjects:
ISSN: 1520-6149
Abstract This paper presents a novel probabilistic framework for localizing multiple speakers with a microphone array. In this framework, the generalized cross correlation function (GCC) of each microphone pair is interpreted as a probability distribution of the time difference of arrival (TDOA) and subsequently approximated as a Gaussian mixture. The distribution parameters are estimated with a weighted expectation maximization algorithm. Then, the joint distribution of the TDOA Gaussian mixtures is mapped to a multimodal distribution in the location space, where each mode represents a potential source location. The approach taken here performs the localization by 1) reducing the search space to some regions that are likely to contain a source and then 2) extracting the actual speaker locations with a numerical optimization algorithm. The effectiveness of the proposed approach is shown using the AV16.3 corpus.
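The abstract's first step, treating a GCC function as a TDOA distribution and fitting a Gaussian mixture with weighted expectation maximization, can be illustrated with a minimal sketch. This record does not give the paper's exact formulation, so everything below is an assumption made for illustration: the function names (`gaussian_pdf`, `weighted_em_gmm`), the clipping of negative GCC values to zero so they can act as weights, the initial variance of 1.0, and the synthetic two-peak GCC curve. It is a generic weighted EM for a 1-D mixture, not the authors' implementation.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def weighted_em_gmm(points, weights, init_means, n_iter=50, min_var=1e-6):
    """Fit a 1-D Gaussian mixture to weighted grid points (hypothetical helper).

    points     -- candidate TDOA values (the lag grid of the GCC)
    weights    -- nonnegative weight per point (here: clipped GCC values)
    init_means -- initial guess for each component mean
    """
    k = len(init_means)
    means = list(init_means)
    variances = [1.0] * k          # assumed initial spread
    priors = [1.0 / k] * k         # uniform initial mixing weights
    total = sum(weights)

    for _ in range(n_iter):
        # E-step: responsibility of each component for each grid point.
        resp = []
        for x in points:
            likes = [priors[j] * gaussian_pdf(x, means[j], variances[j])
                     for j in range(k)]
            norm = sum(likes) or 1e-300   # guard against underflow to zero
            resp.append([like / norm for like in likes])

        # M-step: each statistic is scaled by the point's weight.
        for j in range(k):
            mass = sum(w * r[j] for w, r in zip(weights, resp))
            if mass <= 0.0:
                continue  # component received no weight; keep old parameters
            priors[j] = mass / total
            means[j] = sum(w * r[j] * x
                           for w, r, x in zip(weights, resp, points)) / mass
            var = sum(w * r[j] * (x - means[j]) ** 2
                      for w, r, x in zip(weights, resp, points)) / mass
            variances[j] = max(var, min_var)

    return priors, means, variances


# Synthetic stand-in for a GCC function with two peaks (two speakers).
lags = [-5.0 + 0.05 * i for i in range(201)]
gcc = [1.0 * math.exp(-0.5 * ((t + 2.0) / 0.3) ** 2)
       + 0.6 * math.exp(-0.5 * ((t - 1.5) / 0.3) ** 2) for t in lags]
gcc_weights = [max(0.0, g) for g in gcc]   # assumption: clip negatives to zero

priors, means, variances = weighted_em_gmm(lags, gcc_weights, [-3.0, 3.0])
```

In this toy setting each fitted component mean settles near one GCC peak, which is the sense in which each mixture component stands for one TDOA candidate. Mapping the per-pair mixtures into location space and the subsequent numerical optimization are not covered by this sketch.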
Author Klakow, Dietrich
Faubel, Friedrich
Magimai-Doss, Mathew
Oualil, Youssef
Author_xml – sequence: 1
  givenname: Youssef
  surname: Oualil
  fullname: Oualil, Youssef
  email: youssef.oualil@lsv.uni-saarland.de
  organization: Spoken Language Systems, Saarland University, Saarbrücken, Germany
– sequence: 2
  givenname: Mathew
  surname: Magimai-Doss
  fullname: Magimai-Doss, Mathew
  organization: Idiap Research Institute, CH-1920 Martigny, Switzerland
– sequence: 3
  givenname: Friedrich
  surname: Faubel
  fullname: Faubel, Friedrich
  organization: Spoken Language Systems, Saarland University, Saarbrücken, Germany
– sequence: 4
  givenname: Dietrich
  surname: Klakow
  fullname: Klakow, Dietrich
  organization: Spoken Language Systems, Saarland University, Saarbrücken, Germany
ContentType Conference Proceeding
DBID 6IE
6IH
CBEJK
RIE
RIO
DOI 10.1109/ICASSP.2013.6638402
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Proceedings Order Plan (POP) 1998-present by volume
IEEE Xplore All Conference Proceedings
IEEE Electronic Library (IEL)
IEEE Proceedings Order Plans (POP) 1998-present
DatabaseTitleList
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Electronic Library (IEL)
  url: https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
Discipline Engineering
EISBN 1479903566
9781479903566
EndPage 3966
ExternalDocumentID 6638402
Genre orig-research
GroupedDBID 23M
29P
6IE
6IF
6IH
6IK
6IL
6IM
6IN
AAJGR
AAWTH
ABLEC
ACGFS
ADZIZ
ALMA_UNASSIGNED_HOLDINGS
BEFXN
BFFAM
BGNUA
BKEBE
BPEOZ
CBEJK
CHZPO
IEGSK
IJVOP
IPLJI
M43
OCL
RIE
RIL
RIO
RNS
ID FETCH-LOGICAL-i220t-3b2f46250e4a94616252ebfb886c6874b6a567e42f05f39a90feeea76c4b76ee3
IEDL.DBID RIE
ISICitedReferencesCount 5
ISICitedReferencesURI http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=000329611504024&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
ISSN 1520-6149
IngestDate Wed Aug 27 05:56:45 EDT 2025
IsDoiOpenAccess false
IsOpenAccess true
IsPeerReviewed false
IsScholarly true
Language English
LinkModel DirectLink
MergedId FETCHMERGED-LOGICAL-i220t-3b2f46250e4a94616252ebfb886c6874b6a567e42f05f39a90feeea76c4b76ee3
OpenAccessLink http://infoscience.epfl.ch/record/192555
PageCount 5
ParticipantIDs ieee_primary_6638402
PublicationCentury 2000
PublicationDate 2013-May
PublicationDateYYYYMMDD 2013-05-01
PublicationDate_xml – month: 05
  year: 2013
  text: 2013-May
PublicationDecade 2010
PublicationTitle Proceedings of the ... IEEE International Conference on Acoustics, Speech and Signal Processing (1998)
PublicationTitleAbbrev ICASSP
PublicationYear 2013
Publisher IEEE
Publisher_xml – name: IEEE
SSID ssj0008748
Score 1.9137945
SourceID ieee
SourceType Publisher
StartPage 3962
SubjectTerms Acoustics
Arrays
Gaussian mixture
Joints
localization
Microphone arrays
Microphones
multiple speakers
Position measurement
Probabilistic logic
Speech
steered response power
Title A probabilistic framework for multiple speaker localization
URI https://ieeexplore.ieee.org/document/6638402
WOSCitedRecordID wos000329611504024&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
hasFullText 1
inHoldings 1
isFullTextHit
isPrint