Phrase embedding learning from internal and external information based on autoencoder



Bibliographic Details
Published in: Information Processing & Management, Vol. 58, No. 1, p. 102422
Main Authors: Li, Rongsheng, Yu, Qinyong, Huang, Shaobin, Shen, Linshan, Wei, Chi, Sun, Xuewei
Format: Journal Article
Language: English
Published: Oxford: Elsevier Ltd, 01.01.2021
ISSN: 0306-4573, 1873-5371
Description
Summary:
• We propose an autoencoder-based method to generate phrase embeddings.
• The method uses a fully connected network, an LSTM, and attention to obtain phrase embeddings.
• The method can exploit both the external and internal contextual information of phrases.
• The method can learn the order and semantic information of component words.
• The proposed method performs best on phrase similarity and classification tasks.

Phrase embeddings can improve the performance of multiple NLP tasks. Most previous phrase-embedding methods use only the external or only the internal semantic information of phrases, which makes it difficult to cope with data sparseness and limits their semantic representation ability. To address these issues, we propose an autoencoder-based method that combines pre-trained phrase embeddings and component word embeddings into new phrase embeddings through complex non-linear transformations. The method uses both the internal and external semantic information of phrases to generate new phrase embeddings with better semantic expressiveness. It can also generate well-represented phrase embeddings when only pre-trained component word embeddings are available as input, which effectively mitigates data sparseness. We design two models for this method. The first uses an FCNN (fully connected neural network) as both encoder and decoder; we call it AE-F. The second uses an attention mechanism, with parameters shared between the encoder and decoder, to proportionally combine the outputs of an LSTM and an FCNN; we call it AE-ALF. We evaluate both models on phrase similarity and phrase classification tasks using two English datasets and two Chinese datasets. Experimental results show that, with pre-trained phrase embeddings and component word embeddings as input, AE-F and AE-ALF outperform 17 baseline methods and perform similarly to each other. With only pre-trained component word embeddings, they still outperform most baseline methods, and AE-ALF performs better than AE-F.
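To make the described architecture concrete, below is a minimal PyTorch sketch of an AE-F-style autoencoder: the encoder fuses a pre-trained phrase embedding with its component word embeddings, the encoder output serves as the new phrase embedding, and the decoder reconstructs the input. All dimensions, layer sizes, activations, and the training loop are illustrative assumptions; the record above does not give the paper's actual hyperparameters, and AE-ALF's attention-weighted combination of LSTM and FCNN outputs is not shown.

import torch
import torch.nn as nn

class AEF(nn.Module):
    # Fully connected autoencoder (AE-F-style sketch). The encoder maps the
    # concatenation of a pre-trained phrase embedding and its component word
    # embeddings to a new phrase embedding; the decoder reconstructs the input.
    def __init__(self, word_dim=300, max_words=2, phrase_dim=300, hidden_dim=300):
        super().__init__()
        in_dim = phrase_dim + max_words * word_dim   # size of concatenated input
        self.encoder = nn.Sequential(nn.Linear(in_dim, hidden_dim), nn.Tanh())
        self.decoder = nn.Linear(hidden_dim, in_dim)

    def forward(self, phrase_emb, word_embs):
        # phrase_emb: (batch, phrase_dim); word_embs: (batch, max_words, word_dim)
        x = torch.cat([phrase_emb, word_embs.flatten(1)], dim=1)
        z = self.encoder(x)        # new phrase embedding
        return z, self.decoder(z)  # embedding and reconstruction

# Toy training loop on random tensors standing in for real pre-trained embeddings.
model = AEF()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
phrase = torch.randn(32, 300)      # pre-trained phrase embeddings
words = torch.randn(32, 2, 300)    # component word embeddings
target = torch.cat([phrase, words.flatten(1)], dim=1)
for _ in range(100):
    z, x_hat = model(phrase, words)
    loss = nn.functional.mse_loss(x_hat, target)  # reconstruction objective
    opt.zero_grad()
    loss.backward()
    opt.step()

After training, the encoder output z would be used as the phrase representation; an AE-ALF-style variant would additionally run the component words through an LSTM and use parameter-shared attention to weight the LSTM and FCNN outputs.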
DOI:10.1016/j.ipm.2020.102422