Seeing Far and Clearly: Mitigating Hallucinations in MLLMs with Attention Causal Decoding

Recent advancements in multimodal large language models (MLLMs) have significantly improved performance in visual question answering. However, they often suffer from hallucinations. In this work, hallucinations are categorized into two main types: initial hallucinations and snowball hallucinations....

Bibliographic Details
Published in: Proceedings (IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Online), Vol. 2025, pp. 26147-26159
Authors: Tang, Feilong, Liu, Chengzhi, Xu, Zhongxing, Hu, Ming, Huang, Zile, Xue, Haochen, Chen, Ziyang, Peng, Zelin, Yang, Zhiwei, Zhou, Sijin, Li, Wenxue, Li, Yulong, Song, Wenxuan, Su, Shiyan, Feng, Wei, Su, Jionglong, Lin, Minquan, Peng, Yifan, Cheng, Xuelian, Razzak, Imran, Ge, Zongyuan
Format: Conference Proceedings; Journal Article
Language: English
Published: United States: IEEE, 01.06.2025
ISSN: 1063-6919
Online access: Full text