Poison as Cure: Visual Noise for Mitigating Object Hallucinations in LVMs

1Xiamen University 2Westlake University 3DAMO Academy, Alibaba Group 4Hupan Laboratory
*Corresponding author: wanghuan@westlake.edu.cn


We introduce VAP (visual adversarial perturbation), a method that strategically injects beneficial visual noise to mitigate object hallucination in large vision-language models (LVMs) without altering the base model. VAP consistently improves performance across 8 state-of-the-art LVMs under the POPE hallucination evaluation setting.

Abstract

Large vision-language models (LVMs) extend large language models (LLMs) with visual perception capabilities, enabling them to process and interpret visual information. A major challenge compromising their reliability is object hallucination: LVMs may generate plausible but factually inaccurate information about objects in an image. We propose a novel visual adversarial perturbation (VAP) method to mitigate this hallucination issue. VAP alleviates LVM hallucination by applying strategically optimized visual noise without altering the base model. Our method formulates hallucination suppression as an optimization problem, leveraging adversarial strategies to generate beneficial visual perturbations that enhance the model's factual grounding and reduce parametric knowledge bias. Extensive experimental results demonstrate that our method consistently reduces object hallucinations across 8 state-of-the-art LVMs, validating its efficacy across diverse evaluations.

Detailed Overview of Our Proposed Method

The VAP method generates beneficial visual noise by optimizing three adversarial strategies: (1) maximizing the semantic alignment between the LVM's response and the visual content, preserving the image's semantic consistency; (2) minimizing the response similarity between the original and distorted visual content through noise-induced uncertainty; and (3) minimizing the representation similarity between the original and distorted inputs. Strategies (2) and (3) jointly mitigate parametric knowledge bias, and the resulting optimized noise effectively suppresses object hallucinations.
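The optimization loop described above can be sketched as a projected-gradient ascent over the composite objective. The sketch below is a minimal, hypothetical illustration, not the paper's implementation: `grad_fn` stands in for the (assumed) gradient of the combined alignment-minus-bias objective with respect to the perturbed image, and the step size, noise budget, and iteration count are placeholder values.

```python
import numpy as np

def vap_perturb(image, grad_fn, epsilon=8 / 255, alpha=2 / 255, steps=10):
    """PGD-style sketch of VAP noise optimization (hypothetical interface).

    grad_fn(perturbed_image) is assumed to return the gradient of the
    composite objective -- semantic alignment maximized, response and
    representation similarity to the distorted input minimized -- with
    respect to the perturbed image. We take signed ascent steps and keep
    the noise inside an L-infinity ball of radius epsilon.
    """
    delta = np.zeros_like(image)
    for _ in range(steps):
        g = grad_fn(image + delta)
        delta = delta + alpha * np.sign(g)            # signed ascent step
        delta = np.clip(delta, -epsilon, epsilon)     # project to L-inf ball
        # keep the perturbed image inside the valid pixel range [0, 1]
        delta = np.clip(image + delta, 0.0, 1.0) - image
    return image + delta
```

In practice the gradient would come from backpropagating the three loss terms through the LVM's vision encoder; the sign-step-and-project structure shown here is the standard PGD recipe for bounded adversarial perturbations.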

Method Overview
Illustration of our proposed VAP method.

Experiment Results

Illustration of the effectiveness of VAP on VQA tasks.

BibTeX


@article{zhang2025poison,
  title   = {Poison as Cure: Visual Noise for Mitigating Object Hallucinations in LVMs},
  author  = {Kejia Zhang and Keda Tao and Jiasheng Tang and Huan Wang},
  journal = {arXiv preprint arXiv:2501.19164},
  year    = {2025}
}
