A Survey of Cross-lingual Sentiment Analysis: Methodologies, Models and Evaluations

Bibliographic details
Published in:	Data Science and Engineering, Volume 7, Issue 3, pp. 279-299
Main authors:	Xu, Yuemei; Cao, Han; Du, Wanze; Wang, Wenqing
Format:	Journal Article
Language:	English
Publication details:	Singapore: Springer Nature Singapore, 1 September 2022; Springer; Springer Nature B.V
Subjects:	Algorithm Analysis and Problem Complexity; Artificial Intelligence; Chemistry and Earth Sciences; Computer Science; Data mining; Data Mining and Knowledge Discovery; Database Management; Datasets; Global economy; Globalization; Languages; Machine translation; Non-English languages; Physics; Sentiment analysis; State-of-the-art reviews; Statistics for Engineering; Survey Paper; Surveys; Systems and Data Security; United Kingdom; Japan; China; Bilingual word embedding; Cross-lingual Sentiment analysis
ISSN:	2364-1185, 2364-1541

Description
Summary:	Cross-lingual sentiment analysis (CLSA) leverages one or several source languages to help low-resource languages perform sentiment analysis, which alleviates the lack of annotated corpora in many non-English languages. With the development of economic globalization, CLSA has attracted much attention in the field of sentiment analysis, and the last decade has seen a surge of research in this area. Numerous methods, datasets and evaluation metrics have been proposed in the literature, raising the need for a comprehensive and updated survey. This paper fills the gap by reviewing the state-of-the-art CLSA approaches from 2004 to the present. It traces the research context of cross-lingual sentiment analysis and elaborates on the following methods in detail: (1) the early main methods of CLSA, including those based on machine translation and its improved variants, parallel corpora or bilingual sentiment lexicons; (2) CLSA based on cross-lingual word embedding; (3) CLSA based on multi-BERT and other pre-trained models. We further analyze their main ideas, methodologies, shortcomings, etc., and attempt to reach a conclusion on the coverage of languages, datasets and their performance. Finally, we look into the future development of CLSA and the challenges facing the research area.
DOI:	10.1007/s41019-022-00187-3