A Novel Uncertainty Decoding Rule With Applications to Transmission Error Robust Speech Recognition


Bibliographic Details
Published in:	IEEE Transactions on Audio, Speech, and Language Processing, Vol. 16, No. 5, pp. 1047-1060
Main authors:	Ion, V., Haeb-Umbach, R.
Format:	Journal Article
Language:	English
Published:	Piscataway, NJ: IEEE (Institute of Electrical and Electronics Engineers), 1 July 2008
Keywords:	Acoustic distortion; Acoustic noise; Applied sciences; Automatic recognition; Automatic speech recognition; Cleaning; Codec; Coding, codes; Communication networks; Conditional independence; Conditioning; Correlation; Data transmission; Decoding; Distributed speech recognition; Distributed system; Error concealment; Error correction; Exact sciences and technology; Hidden Markov models; Information, signal and communications theory; Internet telephony; Man machine dialogue; Network speech recognition; Networks; Packet switching; Probabilistic approach; Robustness; Signal and communications theory; Signal processing; Speech; Speech-based user interfaces; Speech processing; Speech recognition; Telecommunication network; Telecommunications and information theory; Transmission error; Transmission loss; Uncertainty; Uncertainty decoding
ISSN:	1558-7916

Description
Abstract:	In this paper, we derive an uncertainty decoding rule for automatic speech recognition (ASR), which accounts for both corrupted observations and inter-frame correlation. The conditional independence assumption, prevalent in hidden Markov model-based ASR, is relaxed to obtain a clean speech posterior that is conditioned on the complete observed feature vector sequence. This is a more informative posterior than one conditioned only on the current observation. The novel decoding is used to obtain a transmission-error robust remote ASR system, where the speech capturing unit is connected to the decoder via an error-prone communication network. We show how the clean speech posterior can be computed for communication links characterized by either bit errors or packet loss. Recognition results are presented for both distributed and network speech recognition, where in the latter case common voice-over-IP codecs are employed.
DOI:	10.1109/TASL.2008.925879