Speech overlap detection and attribution using convolutive non-negative sparse coding

Bibliographic details
Published in: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 4181-4184
Main authors: Vipperla, R., Geiger, J. T., Bozonnet, S., Dong Wang, Evans, N., Schuller, B., Rigoll, G.
Format: Conference paper
Language: English
Published: IEEE, 1 March 2012
ISBN: 1467300454, 9781467300452
ISSN: 1520-6149
Description
Summary: Overlapping speech is known to degrade speaker diarization performance with impacts on speaker clustering and segmentation. While previous work made important advances in detecting overlapping speech intervals and in attributing them to relevant speakers, the problem remains largely unsolved. This paper reports the first application of convolutive non-negative sparse coding (CNSC) to the overlap problem. CNSC aims to decompose a composite signal into its underlying contributory parts and is thus naturally suited to overlap detection and attribution. Experimental results on NIST RT data show that the CNSC approach gives comparable results to a state-of-the-art hidden Markov model based overlap detector. In a practical diarization system, CNSC based speaker attribution is shown to reduce the speaker error by over 40% relative in overlapping segments.
DOI: 10.1109/ICASSP.2012.6288840