BeamLearning: An end-to-end deep learning approach for the angular localization of sound sources using raw multichannel acoustic pressure data
| Published in: | The Journal of the Acoustical Society of America, Volume 149, Issue 6, p. 4248 |
|---|---|
| Main authors: | Pujol, Hadrien; Bavu, Éric; Garcia, Alexandre |
| Format: | Journal Article |
| Language: | English |
| Published: | 1 June 2021 |
| ISSN: | 1520-8524 |
| Abstract | Sound source localization using multichannel signal processing has been a subject of active research for decades. In recent years, the use of deep learning in audio signal processing has significantly improved the performance of machine hearing. This has motivated the scientific community to also develop machine learning strategies for source localization applications. This paper presents BeamLearning, a multiresolution deep learning approach that allows the encoding of relevant information contained in unprocessed time-domain acoustic signals captured by microphone arrays. The use of raw data aims at avoiding the simplifying hypotheses on which most traditional model-based localization methods rely. Its benefits are shown for real-time two-dimensional sound source localization tasks in reverberant and noisy environments. Since supervised machine learning approaches require large, physically realistic, precisely labelled datasets, a fast graphics processing unit-based computation of room impulse responses was developed using fractional delays for image source models. A thorough analysis of the network representation and extensive performance tests are carried out using the BeamLearning network with synthetic and experimental datasets. The results demonstrate that the BeamLearning approach significantly outperforms the wideband MUSIC and steered response power-phase transform methods in terms of localization accuracy and computational efficiency in the presence of heavy measurement noise and reverberation. |
|---|---|
| Author | Bavu, Éric; Pujol, Hadrien; Garcia, Alexandre |
| Author_xml | – sequence: 1 givenname: Hadrien surname: Pujol fullname: Pujol, Hadrien – sequence: 2 givenname: Éric surname: Bavu fullname: Bavu, Éric – sequence: 3 givenname: Alexandre surname: Garcia fullname: Garcia, Alexandre |
| CitedBy_id | crossref_primary_10_1109_ACCESS_2025_3561265 crossref_primary_10_1121_10_0006783 crossref_primary_10_1007_s11042_023_16947_w crossref_primary_10_1145_3586996 crossref_primary_10_3390_electronics14051043 crossref_primary_10_1109_TASLP_2022_3224282 crossref_primary_10_1121_10_0011809 crossref_primary_10_1121_10_0016467 crossref_primary_10_1121_10_0019802 crossref_primary_10_1016_j_apacoust_2024_110488 crossref_primary_10_1109_LSP_2023_3248952 crossref_primary_10_1121_10_0015005 |
| ContentType | Journal Article |
| DBID | 7X8 |
| DOI | 10.1121/10.0005046 |
| DatabaseName | MEDLINE - Academic |
| DatabaseTitle | MEDLINE - Academic |
| DatabaseTitleList | MEDLINE - Academic |
| Database_xml | – sequence: 1 dbid: 7X8 name: MEDLINE - Academic url: https://search.proquest.com/medline sourceTypes: Aggregation Database |
| DeliveryMethod | no_fulltext_linktorsrc |
| Discipline | Physics |
| EISSN | 1520-8524 |
| EndPage | 4248 |
| ID | FETCH-LOGICAL-c2496-6ce70ce11826c1f6303e7f62ad3d65b37cf3dc726f7a5ff2e5a5ea71c892e38d2 |
| IEDL.DBID | 7X8 |
| IngestDate | Fri Sep 05 12:29:50 EDT 2025 |
| IsDoiOpenAccess | false |
| IsOpenAccess | true |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 6 |
| Language | English |
| LinkModel | DirectLink |
| MergedId | FETCHMERGED-LOGICAL-c2496-6ce70ce11826c1f6303e7f62ad3d65b37cf3dc726f7a5ff2e5a5ea71c892e38d2 |
| OpenAccessLink | https://asa.scitation.org/doi/pdf/10.1121/10.0005046 |
| PQID | 2550269661 |
| PQPubID | 23479 |
| PageCount | 1 |
| ParticipantIDs | proquest_miscellaneous_2550269661 |
| PublicationCentury | 2000 |
| PublicationDate | 20210601 |
| PublicationDateYYYYMMDD | 2021-06-01 |
| PublicationDate_xml | – month: 06 year: 2021 text: 20210601 day: 01 |
| PublicationDecade | 2020 |
| PublicationTitle | The Journal of the Acoustical Society of America |
| PublicationYear | 2021 |
| SSID | ssj0005839 |
| Score | 2.4810612 |
| SourceID | proquest |
| SourceType | Aggregation Database |
| StartPage | 4248 |
| Title | BeamLearning: An end-to-end deep learning approach for the angular localization of sound sources using raw multichannel acoustic pressure data |
| URI | https://www.proquest.com/docview/2550269661 |
| Volume | 149 |
| inHoldings | 1 |
| linkProvider | ProQuest |