One-class learning for fake news detection through multimodal variational autoencoders

Bibliographic details
Published in: Engineering applications of artificial intelligence, Volume 122; p. 106088
Main authors: Gôlo, Marcos Paulo Silva; de Souza, Mariana Caravanti; Rossi, Rafael Geraldeli; Rezende, Solange Oliveira; Nogueira, Bruno Magalhães; Marcacini, Ricardo Marcondes
Format: Journal Article
Language: English
Published: Elsevier Ltd, 1 June 2023
ISSN: 0952-1976, 1873-6769
Description
Summary: Machine learning methods to detect fake news typically use textual features and Binary or Multi-class classification. However, accurately labeling a large news set is still a very costly process. On the other hand, one of the prominent approaches is One-Class Learning (OCL). OCL requires only the labeling of fake news, minimizing data labeling efforts. Although we eliminate the need to label non-interest news, the efficiency of OCL algorithms depends directly on the data representation model adopted. Most existing methods in the OCL literature explore representations based on one modality to detect fake news. However, different text features can be the reason for the news to be fake, such as topic or linguistic features. We model this behavior as different modalities for news to represent different textual feature sets. Thus, we present the MVAE-FakeNews, a multimodal method to represent the texts in the fake news detection through OCL that learns a new representation from the combination of promising modalities for news data: text embeddings, topic, and linguistic information. We used real-world fake news datasets in Portuguese and English in the experimental evaluation. Results show that MVAE-FakeNews obtained a better F1-Score and AUC-ROC, outperforming another fourteen methods in three datasets and getting competitive results on the other three. Moreover, our MVAE-FakeNews, with only 3% of labeled fake news, obtained comparable or higher results than other methods. To improve the experimental evaluation, we also propose the Multimodal LIME for OCL to identify how each modality is associated with the fake news class.
DOI: 10.1016/j.engappai.2023.106088