Open-Domain, Content-based, Multi-modal Fact-checking of Out-of-Context Images via Online Resources

Bibliographic Details
Published in: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (Online), pp. 14920–14929
Main Authors: Abdelnabi, Sahar; Hasan, Rakibul; Fritz, Mario
Format: Conference Proceeding
Language: English
Published: IEEE, 01.06.2022
ISSN: 1063-6919
Description
Summary: Misinformation is now a major problem due to the high risks it poses to our core democratic and societal values and orders. Out-of-context misinformation is one of the easiest and most effective ways for adversaries to spread viral false stories. In this threat, a real image is re-purposed to support other narratives by misrepresenting its context and/or elements. The internet is the go-to way to verify information using different sources and modalities. Our goal is an inspectable method that automates this time-consuming and reasoning-intensive process by fact-checking the image-caption pairing using Web evidence. To integrate evidence and cues from both modalities, we introduce the concept of a 'multi-modal cycle-consistency check': starting from the image/caption, we gather textual/visual evidence, which is then compared against the other paired caption/image, respectively. Moreover, we propose a novel architecture, the Consistency-Checking Network (CCN), that mimics the layered human reasoning across the same and different modalities: the caption vs. textual evidence, the image vs. visual evidence, and the image vs. caption. Our work offers the first step and benchmark for open-domain, content-based, multi-modal fact-checking, and significantly outperforms previous baselines that did not leverage external evidence. (For code, checkpoints, and dataset, see: https://s-abdelnabi.github.io/OoC-multi-modal-fc/.)
DOI: 10.1109/CVPR52688.2022.01452
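
The three-branch comparison described in the summary can be pictured with a minimal, hypothetical PyTorch sketch. This is not the authors' released CCN implementation: the encoder dimensions, attention-based evidence pooling, and fusion classifier below are illustrative assumptions. It only shows the idea of comparing the caption against textual evidence, the image against visual evidence, and the image against its own caption, then fusing the three signals into a pristine-vs-falsified prediction.

```python
# Hypothetical simplification of a three-branch consistency check
# (not the paper's CCN): pre-computed caption, image, and evidence
# embeddings are compared and fused into a binary prediction.
import torch
import torch.nn as nn


class ConsistencyCheckSketch(nn.Module):
    def __init__(self, txt_dim=768, img_dim=2048, hid=512):
        super().__init__()
        # project text and image features into a shared space
        self.txt_proj = nn.Linear(txt_dim, hid)
        self.img_proj = nn.Linear(img_dim, hid)
        # attention over the retrieved evidence items
        self.evid_attn = nn.MultiheadAttention(hid, num_heads=4, batch_first=True)
        # classifier over the three fused consistency signals
        self.classifier = nn.Sequential(
            nn.Linear(3 * hid, hid), nn.ReLU(), nn.Linear(hid, 2)
        )

    def _attend(self, query, evidence):
        # query: (B, hid); evidence: (B, N, hid) -> evidence summary (B, hid)
        ctx, _ = self.evid_attn(query.unsqueeze(1), evidence, evidence)
        return ctx.squeeze(1)

    def forward(self, caption_emb, image_emb, txt_evidence_emb, vis_evidence_emb):
        cap = self.txt_proj(caption_emb)          # (B, hid)
        img = self.img_proj(image_emb)            # (B, hid)
        txt_ev = self.txt_proj(txt_evidence_emb)  # (B, N_t, hid)
        vis_ev = self.img_proj(vis_evidence_emb)  # (B, N_v, hid)

        # branch 1: caption vs. textual evidence gathered from the image
        cap_vs_txt = cap * self._attend(cap, txt_ev)
        # branch 2: image vs. visual evidence gathered from the caption
        img_vs_vis = img * self._attend(img, vis_ev)
        # branch 3: image vs. its paired caption
        img_vs_cap = img * cap

        fused = torch.cat([cap_vs_txt, img_vs_vis, img_vs_cap], dim=-1)
        return self.classifier(fused)             # logits: pristine vs. falsified


if __name__ == "__main__":
    # toy usage with random embeddings standing in for real encoder outputs
    model = ConsistencyCheckSketch()
    logits = model(
        torch.randn(4, 768),       # caption embeddings
        torch.randn(4, 2048),      # image embeddings
        torch.randn(4, 10, 768),   # 10 textual evidence items per sample
        torch.randn(4, 6, 2048),   # 6 visual evidence items per sample
    )
    print(logits.shape)  # torch.Size([4, 2])
```

The element-wise products and attention pooling here are stand-ins for whatever consistency and memory mechanisms the actual CCN uses; the released code at the project page above is the authoritative reference.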