One-class learning for fake news detection through multimodal variational autoencoders

Bibliographic details
Published in: Engineering applications of artificial intelligence, Vol. 122, p. 106088
Main authors: Gôlo, Marcos Paulo Silva; de Souza, Mariana Caravanti; Rossi, Rafael Geraldeli; Rezende, Solange Oliveira; Nogueira, Bruno Magalhães; Marcacini, Ricardo Marcondes
Format: Journal Article
Language: English
Published: Elsevier Ltd, 1 June 2023
ISSN:0952-1976, 1873-6769
Online access: Full text
Description
Abstract: Machine learning methods for fake news detection typically use textual features with binary or multi-class classification. However, accurately labeling a large news set is still a very costly process. One prominent alternative is One-Class Learning (OCL). OCL requires labeling only fake news, which minimizes the data labeling effort. Although this eliminates the need to label non-interest news, the efficiency of OCL algorithms depends directly on the data representation model adopted. Most existing methods in the OCL literature explore representations based on a single modality to detect fake news. However, different textual features, such as topic or linguistic features, can be the reason a news item is fake. We model this behavior by treating different textual feature sets as different modalities of the news. Thus, we present MVAE-FakeNews, a multimodal method for representing texts in fake news detection through OCL that learns a new representation from the combination of promising modalities for news data: text embeddings, topic, and linguistic information. We used real-world fake news datasets in Portuguese and English in the experimental evaluation. Results show that MVAE-FakeNews obtained a better F1-Score and AUC-ROC, outperforming fourteen other methods on three datasets and achieving competitive results on the other three. Moreover, MVAE-FakeNews, with only 3% of labeled fake news, obtained comparable or higher results than other methods. To improve the experimental evaluation, we also propose the Multimodal LIME for OCL to identify how each modality is associated with the fake news class.
DOI:10.1016/j.engappai.2023.106088
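
Illustration: The one-class setting described in the abstract (training only on labeled fake news, with each article represented by several textual modalities) can be sketched roughly as follows. This is not the MVAE-FakeNews method: it replaces the multimodal variational autoencoder with plain concatenation of the modality vectors, and the one-class classifier with a simple centroid-distance rule. All function names and feature values are hypothetical and chosen only for illustration.

```python
import math


def concat_modalities(embedding, topic, linguistic):
    """Join the three per-article feature vectors into one flat vector.

    Assumption: plain concatenation stands in for the learned
    multimodal representation described in the abstract.
    """
    return list(embedding) + list(topic) + list(linguistic)


def fit_one_class(train_vectors, quantile=0.95):
    """Fit a centroid and distance threshold on interest-class vectors only."""
    n = len(train_vectors)
    dim = len(train_vectors[0])
    centroid = [sum(v[i] for v in train_vectors) / n for i in range(dim)]
    distances = sorted(math.dist(v, centroid) for v in train_vectors)
    # Pick the training distance at the requested quantile as the threshold.
    idx = min(n - 1, int(quantile * n))
    return centroid, distances[idx]


def predict_fake(vector, centroid, threshold):
    """Return True when the vector falls inside the learned fake-news region."""
    return math.dist(vector, centroid) <= threshold


# Toy, made-up feature values: only fake news is labeled and used for training.
fake_train = [
    concat_modalities([0.9, 0.8], [0.1, 0.9], [0.7]),
    concat_modalities([1.0, 0.7], [0.2, 0.8], [0.8]),
    concat_modalities([0.8, 0.9], [0.1, 0.9], [0.6]),
    concat_modalities([0.9, 0.9], [0.2, 0.8], [0.7]),
]
centroid, threshold = fit_one_class(fake_train, quantile=1.0)

# An unseen article whose features lie far from the fake-news region.
unseen = concat_modalities([0.1, 0.2], [0.9, 0.1], [0.1])
print(predict_fake(fake_train[0], centroid, threshold))
print(predict_fake(unseen, centroid, threshold))
```

The point of the sketch is only that no non-fake examples are needed at training time: the decision boundary is derived entirely from the labeled fake news, and anything outside it is treated as not belonging to the interest class.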