arxiv:2509.22582

Fine-Grained Detection of Context-Grounded Hallucinations Using LLMs

Published on Sep 26

Submitted by Zorik on Oct 3

Technion Israel Institute of Technology

Authors:

Zorik Gekhman ,

Abstract

AI-generated summary: A study evaluates the effectiveness of large language models in identifying context-grounded hallucinations using a newly constructed benchmark and free-form textual descriptions, revealing challenges in distinguishing between missing details and unverifiable information.

Context-grounded hallucinations are cases where model outputs contain information not verifiable against the source text. We study the applicability of LLMs for localizing such hallucinations, as a more practical alternative to existing complex evaluation pipelines. In the absence of established benchmarks for meta-evaluation of hallucination localization, we construct one tailored to LLMs, involving a challenging human annotation of over 1,000 examples. We complement the benchmark with an LLM-based evaluation protocol, verifying its quality in a human evaluation. Since existing representations of hallucinations limit the types of errors that can be expressed, we propose a new representation based on free-form textual descriptions, capturing the full range of possible errors. We conduct a comprehensive study, evaluating four large-scale LLMs, which highlights the benchmark's difficulty, as the best model achieves an F1 score of only 0.67. Through careful analysis, we offer insights into optimal prompting strategies for the task and identify the main factors that make it challenging for LLMs: (1) a tendency to incorrectly flag missing details as inconsistent, despite being instructed to check only facts in the output; and (2) difficulty with outputs containing factually correct information that is absent from the source, and thus not verifiable, because it aligns with the model's parametric knowledge.
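For readers unfamiliar with the metric, the sketch below shows how an F1 score is conventionally derived from precision and recall. The abstract does not say how predicted error descriptions are matched to the human-annotated ones, so the `MatchCounts` fields and the example tallies are hypothetical placeholders, not the paper's protocol or data.

```python
from dataclasses import dataclass


@dataclass
class MatchCounts:
    """Hypothetical tallies for one evaluation run (illustrative, not from the paper)."""
    matched_predictions: int  # predicted descriptions judged to match a gold description
    total_predictions: int    # all descriptions the model produced
    matched_gold: int         # gold descriptions covered by at least one prediction
    total_gold: int           # all gold descriptions in the benchmark sample


def precision_recall_f1(c: MatchCounts) -> tuple[float, float, float]:
    """Return (precision, recall, F1); F1 is the harmonic mean of the first two."""
    precision = c.matched_predictions / c.total_predictions if c.total_predictions else 0.0
    recall = c.matched_gold / c.total_gold if c.total_gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up counts, chosen only to exercise the function.
counts = MatchCounts(matched_predictions=7, total_predictions=10,
                     matched_gold=7, total_gold=9)
p, r, f1 = precision_recall_f1(counts)
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

Because F1 is a harmonic mean, it is pulled toward the lower of precision and recall. Over-flagging, such as reporting missing details as inconsistencies, would therefore be expected to reduce the score through precision even when recall is high.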