Attending From Foresight: A Novel Attention Mechanism for Neural Machine Translation



Detailed bibliography
Published in: IEEE/ACM Transactions on Audio, Speech, and Language Processing, Volume 29, pp. 2606-2616
Main authors: Li, Xintong; Liu, Lemao; Tu, Zhaopeng; Li, Guanlin; Shi, Shuming; Meng, Max Q.-H.
Format: Journal Article
Language: English
Published: Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 2021
ISSN: 2329-9290, 2329-9304
Description
Summary: Machine translation (MT) is an essential task in natural language processing and, more broadly, in artificial intelligence. Statistical machine translation was the dominant approach to MT for decades, but neural machine translation has recently attracted increasing interest because of its appealing model architecture and impressive translation performance. In neural machine translation, an attention model is used to identify the source words aligned with the next target word, i.e., the target foresight word, in order to select the translation context. However, the attention model makes no use of any information about this target foresight word. Previous work proposed improving the attention model by explicitly accessing the target foresight word and demonstrated substantial improvements on word alignment tasks. That approach, however, cannot be applied to machine translation, where the target foresight word is unavailable at decoding time. This paper proposes several novel enhanced attention models that introduce hidden information (such as the part-of-speech tag) of the target foresight word into the translation task. We incorporate the enhanced attention, which exploits hidden information about the target foresight word, into both recurrent and self-attention-based neural translation models, and we theoretically justify that such hidden information can make translation prediction easier. Empirical experiments on four datasets further verify that the proposed attention models deliver significant improvements in translation quality.
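
To make the described mechanism concrete, below is a minimal PyTorch sketch of one way a foresight-aware attention layer could be built. This is not the paper's exact formulation: the module name ForesightAwareAttention, the choice of a part-of-speech tag as the hidden information, the small linear predictor that guesses the foresight word's tag from the current decoder state, and all dimensions are illustrative assumptions made for this example.

import torch
import torch.nn as nn
import torch.nn.functional as F

class ForesightAwareAttention(nn.Module):
    """Additive attention whose query is augmented with an embedding of
    predicted hidden information (here, a POS tag) of the target foresight
    word, i.e., the next word to be generated (illustrative sketch)."""

    def __init__(self, hidden_dim, num_pos_tags, pos_dim=32):
        super().__init__()
        self.pos_embedding = nn.Embedding(num_pos_tags, pos_dim)
        # Hypothetical predictor of the foresight word's POS tag from the
        # current decoder state (stands in for the paper's hidden information).
        self.pos_predictor = nn.Linear(hidden_dim, num_pos_tags)
        self.query_proj = nn.Linear(hidden_dim + pos_dim, hidden_dim)
        self.key_proj = nn.Linear(hidden_dim, hidden_dim)
        self.energy = nn.Linear(hidden_dim, 1)

    def forward(self, decoder_state, encoder_states, src_mask=None):
        # decoder_state: (batch, hidden_dim)
        # encoder_states: (batch, src_len, hidden_dim)
        pos_logits = self.pos_predictor(decoder_state)            # (batch, num_pos_tags)
        pos_probs = F.softmax(pos_logits, dim=-1)
        # Expected POS embedding of the (unknown) foresight word.
        pos_vec = pos_probs @ self.pos_embedding.weight           # (batch, pos_dim)

        # Condition the attention query on the foresight information.
        query = self.query_proj(torch.cat([decoder_state, pos_vec], dim=-1))
        keys = self.key_proj(encoder_states)                      # (batch, src_len, hidden_dim)
        scores = self.energy(torch.tanh(keys + query.unsqueeze(1))).squeeze(-1)
        if src_mask is not None:
            scores = scores.masked_fill(~src_mask, float("-inf"))
        weights = F.softmax(scores, dim=-1)                       # (batch, src_len)
        context = torch.bmm(weights.unsqueeze(1), encoder_states).squeeze(1)
        return context, weights, pos_logits

# Example with illustrative shapes only:
# attn = ForesightAwareAttention(hidden_dim=512, num_pos_tags=45)
# ctx, w, pos_logits = attn(torch.randn(8, 512), torch.randn(8, 30, 512))

During training, the pos_logits could additionally be supervised with the gold POS tag of the next target word via an auxiliary loss; that is one plausible way to inject the foresight information the abstract describes, and the sketch above deliberately leaves it out.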
DOI: 10.1109/TASLP.2021.3097939