AI will hallucinate if it doesn't have the right training data or if you ask it something it does not know ! Hallucinations and Failure ModesA “hallucination” in an LLM is not a bug in the human sense.
It is the natural result of a system that generates the most statistically likely continuation of text without an internal truth-checking mechanism.
---
1. Probabilistic generationLLMs do not retrieve facts.
They predict the next token based on probability:
P(next token | context)
At each step, the model selects from likely continuations.
This means:
- It is sampling from learned distributions
- It does not “know” truth
- It optimizes for plausibility, not correctness
---
2. Why confidence ≠ correctnessA model can sound extremely confident while being wrong.
Reason:
- Language fluency is learned separately from factual grounding
- High-probability text often resembles correct explanations
- Training rewards coherence, not truth verification
So:
Fluent answer ≠ accurate answer
Confidence is just statistical smoothness in output.
---
3. Distribution gapsThe model is trained on a finite dataset distribution.
When it encounters something outside that distribution:
Unknown / rare / ambiguous input
It still must produce an output.
So it:
- Interpolates from similar patterns
- Blends unrelated concepts
- Generates plausible but incorrect content
This is a major source of hallucinations.
---
4. Mode collapse (generation biasing)Mode collapse happens when the model over-favours certain “safe” or common patterns.
Effects:
- Repetitive answers
- Generic explanations
- Loss of diversity in responses
In extreme cases:
Many different inputs → same type of answer
It reduces variability but can harm accuracy.
---
5. ConfabulationConfabulation is when the model invents details to maintain coherence.
Example behaviour:
- Fills missing facts with plausible ones
- Creates citations, names, or numbers that look real
- Maintains narrative consistency over truth
It is essentially:
"Make something that fits the pattern"
rather than:
"Verify what is true"
---
6. Context poisoningContext poisoning occurs when incorrect or misleading information enters the context window.
Once inside:
Model treats it as ground truth
Causes include:
- User-provided false data
- Incorrect retrieved documents (RAG failures)
- Earlier hallucinated output reused as context
Since the model has no built-in truth filter, it can fully propagate errors.
---
7. Prompt injectionPrompt injection is a security failure mode in instruction-following models.
It happens when malicious or unintended instructions are embedded inside input data.
Example:
- Hidden instructions in retrieved documents
- User tries to override system rules
- Data contains “ignore previous instructions” style payloads
If not properly isolated, the model may:
Follow injected instructions instead of intended system behaviour
This is especially dangerous in RAG systems.
---
Key InsightHallucinations are not random mistakes.
They are structural outcomes of the system:
No grounded truth system + probabilistic language generation = plausible fiction
The model is optimising for:
- Coherence
- Fluency
- Statistical likelihood
Not for:
- Truth
- Verification
- External consistency checking
That gap is where all hallucinations and failure modes arise.