Definition
A hallucination is an output generated by an AI model that appears linguistically fluent and confident but contains false, invented, or unsupported information.
The term borrows from psychology, though “confabulation” is technically more accurate: the model does not perceive things that are not there; it generates plausible-sounding output from statistical patterns without verifying factual correctness.
Types
Intrinsic hallucination: output directly contradicts the provided input or context. Example: in a summarization task, the summary states a date or figure that conflicts with the one given in the original document.
Extrinsic hallucination: output includes information that cannot be verified against the provided input; it may or may not be true. Example: the model adds unrequested details that are not grounded in the context.
Factual hallucination: invented facts presented as true. Non-existent citations, false statistics, events that never occurred.
Faithfulness hallucination: in tasks requiring fidelity to a source (RAG, summarization), output diverges from source content.
Why They Occur
LLMs are trained to predict the most probable token given a sequence. They lack an internal model of “truth” or fact-checking mechanisms. If a statistical pattern produces plausible output, the model generates it, regardless of factual correctness.
Factors that increase hallucinations:
- Questions about facts underrepresented in the training data
- Requests for specific details (names, dates, numbers)
- High temperature (more randomness in generation; see the sketch after this list)
- Insufficient or ambiguous context
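Since temperature appears in the list above, a small worked example helps make the mechanism concrete. The sketch below is a minimal illustration with an invented vocabulary and invented logits (they do not come from any real model): it shows how the softmax temperature reshapes the next-token distribution, so that higher temperature makes low-probability, and potentially wrong, continuations more likely to be sampled.

```python
import math
import random

def softmax_with_temperature(logits, temperature):
    """Turn raw next-token logits into probabilities; higher temperature flattens the distribution."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical continuation candidates after "The capital of Australia is ..."
vocab = ["Canberra", "Sydney", "Melbourne", "Vienna"]
logits = [4.0, 2.5, 1.5, 0.2]  # invented scores standing in for what a model might assign

for t in (0.2, 1.0, 1.5):
    probs = softmax_with_temperature(logits, t)
    print(f"temperature={t}:", {w: round(p, 3) for w, p in zip(vocab, probs)})

# Sampling, not truth-checking, decides the output: at higher temperature,
# plausible-but-wrong continuations are drawn noticeably more often.
print(random.choices(vocab, weights=softmax_with_temperature(logits, 1.5), k=5))
```

Nothing in this process consults a source of truth; the choice between continuations is purely statistical, which is exactly why hallucinations arise.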
Mitigation
RAG: provide reference documents in context. Reduces but does not eliminate hallucinations (see the sketch at the end of this section).
Grounding: constrain output to specific sources, require verifiable citations.
Low temperature: reduces randomness, producing more conservative and repeatable output.
Prompt engineering: explicit instructions (“answer only based on provided context”, “if unsure, say so”).
Validation pipeline: automatic verification of output against authoritative sources or business rules.
Human-in-the-loop: human review for critical applications.
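As a rough illustration of how several of these techniques fit together, here is a minimal sketch; the prompt wording, the `[n]` citation format, and the placeholder `call_llm` are assumptions for illustration, not a standard API. It builds a grounded, RAG-style prompt with explicit instructions, then runs a simple validation step that rejects answers citing sources that were never provided.

```python
import re

def build_grounded_prompt(question, documents):
    """Assemble a prompt that constrains the model to the supplied documents
    and asks for citations (grounding + prompt engineering)."""
    context = "\n".join(f"[{i}] {doc}" for i, doc in enumerate(documents, start=1))
    return (
        "Answer only based on the context below and cite sources as [n]. "
        "If the context does not contain the answer, say you don't know.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )

def validate_citations(answer, documents):
    """Validation-pipeline step: flag answers with no citations
    or with citations pointing outside the provided documents."""
    cited = {int(n) for n in re.findall(r"\[(\d+)\]", answer)}
    valid = set(range(1, len(documents) + 1))
    if not cited:
        return False, "no citations found"
    if not cited <= valid:
        return False, f"unknown sources cited: {sorted(cited - valid)}"
    return True, "all citations refer to provided documents"

docs = ["The Eiffel Tower is 330 metres tall.", "It was completed in 1889."]
prompt = build_grounded_prompt("How tall is the Eiffel Tower?", docs)
# answer = call_llm(prompt)  # placeholder for a real model call
answer = "The Eiffel Tower is 330 metres tall [1]."
print(validate_citations(answer, docs))  # (True, 'all citations refer to provided documents')
```

This check is deliberately shallow: it confirms that citations point at real documents, not that the cited text actually supports the claim. For critical applications, stronger checks against authoritative sources and human review remain necessary.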
Common Misconceptions
“Hallucinations are a bug that will be fixed”
No. They are a consequence of the architecture itself. They can be mitigated, not eliminated. Models generate probabilistically plausible output, not verified output.
“If the model is confident, it’s correct”
Linguistic confidence does not correlate with accuracy. Models are trained to produce fluent output, not to be calibrated about their own uncertainty.
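One way to make this concrete is a calibration check: group answers by the confidence the model reports and compare that confidence with how often the answers are actually correct. The sketch below uses invented confidence scores and correctness labels purely for illustration; a real evaluation would use logged model outputs graded against ground truth.

```python
def expected_calibration_error(confidences, correct, n_bins=5):
    """Average gap between stated confidence and observed accuracy per bin;
    a well-calibrated model would score close to 0."""
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        avg_conf = sum(c for c, _ in bucket) / len(bucket)
        accuracy = sum(ok for _, ok in bucket) / len(bucket)
        ece += (len(bucket) / len(confidences)) * abs(avg_conf - accuracy)
    return ece

# Invented data: the model "sounds" sure (high confidence) but is often wrong.
confidences = [0.95, 0.92, 0.90, 0.88, 0.60, 0.97]
correct     = [True, False, False, False, True, True]
print(f"ECE: {expected_calibration_error(confidences, correct):.2f}")  # -> ECE: 0.50
```

A large gap between stated confidence and observed accuracy is exactly why fluent, assertive phrasing should not be read as evidence of correctness.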
“RAG solves the problem”
RAG reduces hallucinations by providing factual context, but the model can still ignore the context, misread it, or mix it with invented information.
Related Terms
- LLM: models subject to hallucinations
- RAG: technique for partial mitigation
- Prompt Engineering: techniques to reduce hallucinations
Sources
- Huang, L. et al. (2023). A Survey on Hallucination in Large Language Models. arXiv
- Ji, Z. et al. (2023). Survey of Hallucination in Natural Language Generation. ACM Computing Surveys