A Sparse Plus Low-Rank Exponential Language Model for Limited Resource Scenarios


Bibliographic details
Published in: IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 23, No. 3, pp. 494-504
Main authors: Hutchinson, Brian; Ostendorf, Mari; Fazel, Maryam
Format: Journal Article
Language: English
Published: Piscataway: IEEE, 1 March 2015
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN:2329-9290, 2329-9304
Description
Summary: This paper describes a new exponential language model that decomposes the model parameters into one or more low-rank matrices that learn regularities in the training data and one or more sparse matrices that learn exceptions (e.g., keywords). The low-rank matrices induce continuous-space representations of words and histories. The sparse matrices learn multi-word lexical items and topic/domain idiosyncrasies. This model generalizes the standard ℓ1-regularized exponential language model, and has an efficient accelerated first-order training algorithm. Language modeling experiments show that the approach is useful in scenarios with limited training data, including low-resource languages and domain adaptation.
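The decomposition described in the summary can be illustrated with a toy sketch. It assumes one-hot word and history features, so the low-rank part reduces to an inner product between a word vector and a history vector, and the sparse part reduces to a small table of exceptions. The function name `spl_probs`, the vocabulary, and all numeric values are hypothetical and chosen only for illustration; this is not the authors' implementation, and it does not cover training.

```python
import math

def spl_probs(history, vocab, word_vecs, hist_vecs, sparse):
    """Return p(word | history) under a toy sparse-plus-low-rank exponential model.

    score(word, history) = dot(word_vecs[word], hist_vecs[history])  # low-rank part
                         + sparse.get((word, history), 0.0)          # sparse part
    All names and values here are illustrative assumptions.
    """
    scores = []
    for w in vocab:
        low_rank = sum(a * b for a, b in zip(word_vecs[w], hist_vecs[history]))
        scores.append(low_rank + sparse.get((w, history), 0.0))
    # Normalize with a numerically stable softmax over the vocabulary.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return {w: e / z for w, e in zip(vocab, exps)}

# Hypothetical 2-dimensional continuous representations of words and histories.
vocab = ["york", "jersey", "car"]
word_vecs = {"york": [1.0, 0.0], "jersey": [1.0, 0.0], "car": [0.0, 1.0]}
hist_vecs = {"new": [0.5, 0.5], "red": [0.0, 1.0]}

# A single sparse exception that boosts the multi-word item "new york".
sparse = {("york", "new"): 2.0}

probs = spl_probs("new", vocab, word_vecs, hist_vecs, sparse)
baseline = spl_probs("new", vocab, word_vecs, hist_vecs, {})
```

In this sketch the low-rank part alone gives the three words equal scores after "new", and the one sparse entry is what separates "york" from the rest, mirroring the split between regularities and exceptions that the summary describes.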
DOI:10.1109/TASLP.2014.2379593