A Speech Preprocessing Method Based on Perceptually Optimized Envelope Processing to Increase Intelligibility in Reverberant Environments


Bibliographic details
Published in: Applied Sciences, Vol. 11, No. 22, p. 10788
Main authors: Fallah, Ali; van de Par, Steven
Format: Journal Article
Language: English
Published: Basel, MDPI AG, 01.11.2021
ISSN: 2076-3417
Description
Summary: Speech intelligibility in public places can be degraded by environmental noise and reverberation. This study proposes a new near-end listening enhancement (NELE) approach in which a time-varying filter jointly enhances onsets and reduces overlap masking. The optimization requires some look-ahead in the clean speech and prior knowledge of the room impulse response (RIR). The method minimizes a defined cost function so that the spectro-temporal envelope of the reverberant speech is as close as possible to that of the clean speech; within this cost function, speech onsets are given increased weight. This approach differs from the overlap-masking ratio (OMR) and onset enhancement (OE) approaches (Grosse, van de Par, 2017, J. Audio Eng. Soc., Vol. 65 (1/2), pp. 31–41), which consider only previous frames in each time slot when determining the time-variant filtering. Speech reception threshold (SRT) measurements show that the new optimization framework improves speech intelligibility by up to 2 dB more than OE.
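The idea of an onset-weighted envelope cost can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the exponential-decay smearing used in place of a real RIR, the squared-error form of the cost, and the `boost` and `decay` values are all assumptions chosen only to make the concept concrete.

```python
import numpy as np

def onset_weights(env_clean, boost=2.0):
    """Weight 1 everywhere, 1 + boost where the clean envelope rises (assumed onset rule)."""
    rise = np.diff(env_clean, axis=1, prepend=env_clean[:, :1]) > 0
    return 1.0 + boost * rise

def reverberate(env, decay=0.7, taps=8):
    """Smear each band's envelope with an exponentially decaying kernel (crude RIR stand-in)."""
    kernel = decay ** np.arange(taps)
    return np.stack([np.convolve(band, kernel)[: env.shape[1]] for band in env])

def envelope_cost(env_clean, env_reverb, weights):
    """Weighted mean squared difference between reverberant and clean envelopes."""
    return float(np.mean(weights * (env_reverb - env_clean) ** 2))

# Toy envelope: one frequency band (rows) over eight time frames (columns),
# containing two short bursts separated by silence.
clean = np.array([[0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0]])
w = onset_weights(clean)

cost_identical = envelope_cost(clean, clean, w)           # envelopes match exactly
cost_reverb = envelope_cost(clean, reverberate(clean), w)  # decay tail fills the gaps
```

In this sketch the cost is zero when the two envelopes coincide and grows as the decay tail fills the silent frames, which is the overlap-masking effect the summary refers to. A preprocessing filter in the spirit of the summary would be chosen to reduce such a cost, with frames at onsets counting more heavily.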
DOI: 10.3390/app112210788