Learning discriminative representations from integrated features for DOA estimation
| Published in: | Journal of King Saud University. Computer and information sciences Vol. 37; no. 10; pp. 323 - 20 |
|---|---|
| Main authors: | , |
| Format: | Journal Article |
| Language: | English |
| Published: | Cham: Springer International Publishing, 01.12.2025; Springer Nature B.V; Springer |
| Subjects: | |
| ISSN: | 1319-1578, 2213-1248 |
| Summary: | As a fundamental step in array signal processing, accurate direction-of-arrival (DOA) estimation is crucial for speaker localization using microphone arrays. Noise, reverberation, and an unknown number of sources in realistic environments pose significant challenges, making the extraction of discriminative representations a key step in DOA estimation. These representations need to reduce the influence of redundant information unrelated to localization, yet recent methods have largely overlooked this important characteristic. To address these issues, we propose an end-to-end feature integration and discriminative learning network (FID-Net) for multi-source DOA estimation. Specifically, our approach consists of three stages: the feature integration stage, the discriminative learning stage, and the temporal modeling stage. In the feature integration stage, we aim to capture multi-scale spatial information that is critical for localization. In the discriminative learning stage, we introduce a discriminative representation learning strategy and design a mutual information-based loss to guide the network to better capture the differences among diverse features. The discriminative features are further utilized in the temporal modeling stage to enhance the global contextual representation. Experimental results on both simulated and real-world datasets demonstrate the superior performance of the proposed method compared with other advanced methods. |
|---|---|
| DOI: | 10.1007/s44443-025-00356-0 |
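
The summary mentions a mutual information-based loss but does not give its form. As a rough illustration of the underlying quantity only, and not the loss used in FID-Net, the sketch below computes a plug-in estimate of the mutual information I(X; Y) between two discrete variables from paired samples. The function name `mutual_information` and the toy inputs are assumptions introduced here for illustration.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in nats from paired discrete samples.

    Illustrative only: this is the textbook definition applied to empirical
    frequencies, not the loss described in the record above.
    """
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    joint = Counter(zip(xs, ys))  # counts of each (x, y) pair
    px = Counter(xs)              # marginal counts of x
    py = Counter(ys)              # marginal counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi


# Identical variables share all their information: I(X; X) = H(X).
same = mutual_information([0, 0, 1, 1], [0, 0, 1, 1])
# Empirically independent variables share none.
indep = mutual_information([0, 0, 1, 1], [0, 1, 0, 1])
```

In a learned setting, features are continuous and the counts above would have to be replaced by a differentiable estimator; which estimator the paper uses is not stated in this record.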