Learning with noisy supervision for Spoken Language Understanding

Bibliographic Details
Published in: 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 4989-4992
Main Authors: Raymond, C., Riccardi, G.
Format: Conference Proceeding
Language: English
Published: IEEE 01.03.2008
ISBN: 9781424414833, 1424414830
ISSN: 1520-6149
Description
Summary: Data-driven spoken language understanding (SLU) systems need semantically annotated data, which are expensive and time-consuming to produce and prone to human error. Active learning has been successfully applied to automatic speech recognition and utterance classification. In general, corpus annotation for SLU involves tasks such as sentence segmentation, chunking or frame labeling and predicate-argument annotation. In such cases, human annotation errors increase with annotation complexity. We investigate two alternative noise-robust active learning strategies that are either data-intensive or supervision-intensive. The strategies detect likely erroneous examples and significantly improve SLU performance for a given labeling cost. We apply uncertainty-based active learning with conditional random fields to the concept segmentation task for SLU. We perform annotation experiments on two databases, namely ATIS (English) and Media (French). We show that our noise-robust algorithm can improve accuracy by up to 6% (absolute), depending on the noise level and the labeling cost.
DOI: 10.1109/ICASSP.2008.4518778
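
Illustration (not part of the record): the summary mentions uncertainty-based active learning, in which the examples the current model is least sure about are sent to annotators first. The sketch below shows only that generic selection step, under stated assumptions. The function name `select_for_annotation`, the utterance ids, and the confidence values are all hypothetical. The scores stand in for whatever confidence a trained model (for example a conditional random field) would assign to its best labeling. The paper's noise-robust strategies are not reproduced here.

```python
from typing import Dict, List


def select_for_annotation(confidences: Dict[str, float], budget: int) -> List[str]:
    """Pick the `budget` utterances with the lowest model confidence.

    `confidences` maps a (hypothetical) utterance id to an assumed
    model confidence in [0, 1]; lower means more uncertain.
    """
    if budget < 0:
        raise ValueError("budget must be non-negative")
    # Sort by confidence, breaking ties by id so the result is deterministic.
    ranked = sorted(confidences, key=lambda uid: (confidences[uid], uid))
    return ranked[:budget]


# Made-up confidence scores for an unlabeled pool of four utterances.
pool = {"utt1": 0.95, "utt2": 0.40, "utt3": 0.72, "utt4": 0.15}
selected = select_for_annotation(pool, budget=2)
print(selected)
```

In a full active learning loop, the selected utterances would be annotated, added to the training set, and the model retrained before the next selection round. The labeling budget per round is the knob that trades annotation cost against accuracy.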