A Sparse Plus Low-Rank Exponential Language Model for Limited Resource Scenarios
| Published in: | IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 23, No. 3, pp. 494-504 |
|---|---|
| Main authors: | , , |
| Format: | Journal Article |
| Language: | English |
| Publication details: | Piscataway: IEEE, 1 March 2015. The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| Subject: | |
| ISSN: | 2329-9290, 2329-9304 |
| Summary: | This paper describes a new exponential language model that decomposes the model parameters into one or more low-rank matrices that learn regularities in the training data and one or more sparse matrices that learn exceptions (e.g., keywords). The low-rank matrices induce continuous-space representations of words and histories. The sparse matrices learn multi-word lexical items and topic/domain idiosyncrasies. This model generalizes the standard ℓ₁-regularized exponential language model, and has an efficient accelerated first-order training algorithm. Language modeling experiments show that the approach is useful in scenarios with limited training data, including low resource languages and domain adaptation. |
| DOI: | 10.1109/TASLP.2014.2379593 |