Seeing Far and Clearly: Mitigating Hallucinations in MLLMs with Attention Causal Decoding

Recent advancements in multimodal large language models (MLLMs) have significantly improved performance in visual question answering. However, they often suffer from hallucinations. In this work, hallucinations are categorized into two main types: initial hallucinations and snowball hallucinations....

Bibliographic details
Published in:	Proceedings (IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Online), Vol. 2025, pp. 26147-26159
Main authors:	Tang, Feilong; Liu, Chengzhi; Xu, Zhongxing; Hu, Ming; Huang, Zile; Xue, Haochen; Chen, Ziyang; Peng, Zelin; Yang, Zhiwei; Zhou, Sijin; Li, Wenxue; Li, Yulong; Song, Wenxuan; Su, Shiyan; Feng, Wei; Su, Jionglong; Lin, Minquan; Peng, Yifan; Cheng, Xuelian; Razzak, Imran; Ge, Zongyuan
Format:	Conference paper; Journal Article
Language:	English
Publication details:	United States, IEEE, 1 June 2025
Subjects:	Data mining; Decoding; Encoding; Heart; Interference; Large language models; Question answering (information retrieval); Registers; Video sequences; Visualization; With extensive experiments, FarSight demonstrates significant hallucination-mitigating performance across different MLLMs on both image and video benchmarks, proving its effectiveness
ISSN:	1063-6919