BeamLearning: An end-to-end deep learning approach for the angular localization of sound sources using raw multichannel acoustic pressure data

Bibliographic details
Published in: The Journal of the Acoustical Society of America, Volume 149, Issue 6, p. 4248
Main authors: Pujol, Hadrien; Bavu, Éric; Garcia, Alexandre
Format: Journal Article
Language: English
Published: 1 June 2021
ISSN: 1520-8524
Abstract: Sound source localization using multichannel signal processing has been a subject of active research for decades. In recent years, the use of deep learning in audio signal processing has significantly improved the performance of machine hearing. This has motivated the scientific community to also develop machine learning strategies for source localization applications. This paper presents BeamLearning, a multiresolution deep learning approach that allows the encoding of relevant information contained in unprocessed time-domain acoustic signals captured by microphone arrays. The use of raw data aims at avoiding the simplifying hypotheses on which most traditional model-based localization methods rely. Its benefits are shown for real-time two-dimensional sound source localization tasks in reverberant and noisy environments. Since supervised machine learning approaches require large, physically realistic, precisely labelled datasets, a fast graphics processing unit-based computation of room impulse responses was developed using fractional delays for image source models. A thorough analysis of the network representation and extensive performance tests are carried out using the BeamLearning network with synthetic and experimental datasets. The results demonstrate that the BeamLearning approach significantly outperforms the wideband MUSIC and steered response power-phase transform methods in terms of localization accuracy and computational efficiency in the presence of heavy measurement noise and reverberation.
DOI: 10.1121/10.0005046
Discipline: Physics
Open access: yes (DOI open access: no)
Peer reviewed: yes
Scholarly: yes
Open access link: https://asa.scitation.org/doi/pdf/10.1121/10.0005046
URI: https://www.proquest.com/docview/2550269661
Volume 149
hasFullText
inHoldings 1
isFullTextHit
isPrint
link http://cvtisr.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwpV07T8MwELaAgsTCG_HWIbFGTZzETllQQVQMqGIA1K1y7HOFBElpCvwLfjNn11UHFiSmLFFkOed7fP7uPsYuZBYrmVkbYZFQgYJFHBUObcrKjpEmKYXxfdzP97LfLwaDzkMA3JpAq5z7RO-oTa0dRt6m1JfKBUrOk6vxe-RUo9ztapDQWGatlFIZZ9VysJgWnlP0DyNJE560PYsrzuNM_HK8Ppr0Nv-7ji22EfJI6M5-_DZbwmqHrXk-p2522fc1qrcwO3V0Cd0KsDLRtI7oAQZxDEEuYgTzqeJA6StQOggOwaR6F3yYC22aUFtonAITzND-BhxhfgQT9QWek-gaiCt8BXKwXh8MPMH2Y4LgOKh77Kl3-3hzFwXphUhTPSYioVHGGn31oRMrKNChtIIrkxqRl6nUNjVacmGlyq3lmKsclUx00eGYFobvs5WqrvCAQYcnouSZpZNvMmHzoqRvCiVKk0tppD5k5_ONHpJpu_sKVSGtdbjY6qM_vHPM1rnjm3iE5IS1LB1fPGWr-nP60kzOvGX8AIUkx3g
linkProvider ProQuest
openUrl ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=BeamLearning%3A+An+end-to-end+deep+learning+approach+for+the+angular+localization+of+sound+sources+using+raw+multichannel+acoustic+pressure+data&rft.jtitle=The+Journal+of+the+Acoustical+Society+of+America&rft.au=Pujol%2C+Hadrien&rft.au=Bavu%2C+%C3%89ric&rft.au=Garcia%2C+Alexandre&rft.date=2021-06-01&rft.eissn=1520-8524&rft.volume=149&rft.issue=6&rft.spage=4248&rft.epage=4248&rft_id=info:doi/10.1121%2F10.0005046&rft.externalDBID=NO_FULL_TEXT