Embeddings and Vector Spaces
Embeddings are the core representation layer in modern AI systems. They convert text tokens into numerical structures that preserve meaning in a way machines can compute.
---
1. What embeddings actually are
An embedding is a mapping from a token (word/subword) to a vector of numbers.
Example:
cat → [0.21, -1.3, 0.88, ...]
dog → [0.19, -1.1, 0.91, ...]
These vectors are learned from data, not manually defined.
Purpose:
- Turn symbols into geometry
- Allow mathematical comparison of meaning
- Enable neural networks to operate on language
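A minimal sketch of that mapping, assuming a hypothetical three-word vocabulary and invented 4-dimensional vectors. Real models learn these values during training and use far more dimensions; the names `vocab`, `embedding_table`, and `embed` are illustrative only.

```python
# Toy embedding table: one row of numbers per token.
# The values below are invented for illustration; real embeddings are learned.
vocab = {"cat": 0, "dog": 1, "banana": 2}

embedding_table = [
    [0.21, -1.30, 0.88, 0.05],   # cat
    [0.19, -1.10, 0.91, 0.10],   # dog
    [-0.70, 0.40, -0.20, 1.20],  # banana
]

def embed(token: str) -> list[float]:
    """Look up the vector for a token (raises KeyError for unknown tokens)."""
    return embedding_table[vocab[token]]

print(embed("cat"))
```

The lookup itself is just indexing; everything interesting lives in the numbers stored in the table.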
---
2. High-dimensional vector spaces
Embeddings live in spaces with hundreds or thousands of dimensions (e.g. 768–12288).
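The dimensions encode abstract features learned from data, although a single dimension rarely maps to one human-readable concept; features are usually spread across many dimensions.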
Why high-dimensional space matters:
- Allows many independent features to coexist
- Reduces interference between meanings
- Enables complex structure to form naturally
Human intuition fails here because we only perceive 3D space.
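One way to get a feel for the "reduced interference" point is to compare random vectors in low and high dimensions. The sketch below assumes random Gaussian vectors as a stand-in for unrelated features (this is an illustration, not how embeddings are trained) and measures how strongly such vectors overlap on average. The function names are hypothetical.

```python
import math
import random

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def mean_abs_cosine(dim, pairs=200, seed=0):
    """Average |cosine| between pairs of random Gaussian vectors of size `dim`."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(pairs):
        a = [rng.gauss(0, 1) for _ in range(dim)]
        b = [rng.gauss(0, 1) for _ in range(dim)]
        total += abs(cosine(a, b))
    return total / pairs

# Compare a space we can picture (3D) with a typical embedding width (768).
for dim in (3, 768):
    print(dim, round(mean_abs_cosine(dim), 3))
```

In 768 dimensions the average overlap is much closer to zero than in 3 dimensions: random directions are nearly orthogonal, which leaves room for many features that barely interfere with each other.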
---
3. Semantic proximity
Meaning is represented by distance.
If two vectors are close, the meanings are related.
Example:
cosine_similarity(cat, dog) → high
cosine_similarity(cat, banana) → low
So similarity is not symbolic — it is geometric.
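The comparison above can be sketched with a few hand-written vectors. The 3-dimensional values here are invented so that "cat" and "dog" point in a similar direction; a real model would supply learned vectors. `rank_neighbours` is a hypothetical helper name.

```python
import math

# Invented 3-dimensional vectors; real embeddings are learned and much larger.
vectors = {
    "cat":    [0.9, 0.8, 0.1],
    "dog":    [0.8, 0.9, 0.2],
    "banana": [0.1, 0.2, 0.9],
}

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_neighbours(word):
    """Return the other words sorted from most to least similar to `word`."""
    others = [w for w in vectors if w != word]
    return sorted(
        others,
        key=lambda w: cosine_similarity(vectors[word], vectors[w]),
        reverse=True,
    )

print(rank_neighbours("cat"))  # → ['dog', 'banana']
```

No rule says that cats and dogs are related; the ranking falls out of where the vectors sit.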
---
4. Why "cat" and "dog" cluster together
Words cluster based on shared contexts.
Example training contexts:
"The ___ is sleeping"
"My ___ ate food"
"The ___ barked/meowed"
Because "cat" and "dog" appear in similar sentence structures:
- They share similar embeddings
- They move closer in vector space
- They form an "animal cluster"
No explicit rule is programmed — it emerges from statistics.
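The statistical effect can be shown with plain co-occurrence counts. This is a count-based stand-in, not how neural embeddings are actually trained, and the six-sentence corpus is invented; it only illustrates that words sharing contexts end up with similar vectors.

```python
from collections import Counter
import math

# Tiny invented corpus; the sentences echo the context templates above.
corpus = [
    "the cat is sleeping",
    "the dog is sleeping",
    "my cat ate food",
    "my dog ate food",
    "the banana is yellow",
    "i peeled my banana",
]

def context_counts(target):
    """Count words that appear in the same sentence as `target`."""
    counts = Counter()
    for sentence in corpus:
        words = sentence.split()
        if target in words:
            counts.update(w for w in words if w != target)
    return counts

def cosine(c1, c2):
    """Cosine similarity between two sparse count vectors."""
    keys = set(c1) | set(c2)
    dot = sum(c1[k] * c2[k] for k in keys)  # Counter returns 0 for missing keys
    n1 = math.sqrt(sum(v * v for v in c1.values()))
    n2 = math.sqrt(sum(v * v for v in c2.values()))
    return dot / (n1 * n2)

cat, dog, banana = (context_counts(w) for w in ("cat", "dog", "banana"))
print(round(cosine(cat, dog), 2), round(cosine(cat, banana), 2))  # → 1.0 0.5
```

In this toy corpus "cat" and "dog" occur in exactly the same contexts, so their count vectors coincide, while "banana" shares only function words with them.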
---
5. Cosine similarity
Cosine similarity measures how aligned two vectors are.
Formula idea:
cos(θ) = (A · B) / (|A| |B|)
Interpretation:
- 1.0 → identical direction (very similar meaning)
- 0.0 → orthogonal directions (unrelated)
- -1.0 → opposite direction (rare in language embeddings)
Why cosine matters:
- Focuses on direction, not magnitude
- Works well in high-dimensional spaces
- Standard metric for semantic search
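A direct transcription of the formula, with Euclidean distance added for contrast to show the "direction, not magnitude" point. The example vectors are arbitrary and chosen so that the three reference values (1, 0, -1) appear, up to floating-point rounding.

```python
import math

def cosine_similarity(a, b):
    """cos(θ) = (A · B) / (|A| |B|)"""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

a = [1.0, 2.0, 3.0]
scaled = [10.0 * x for x in a]   # same direction, 10x the magnitude
orthogonal = [3.0, 0.0, -1.0]    # dot product with `a` is 0
opposite = [-x for x in a]       # reversed direction

print(cosine_similarity(a, scaled))      # direction unchanged, so close to 1
print(euclidean_distance(a, scaled))     # distance grows with magnitude
print(cosine_similarity(a, orthogonal))  # 0 for orthogonal vectors
print(cosine_similarity(a, opposite))    # close to -1
```

Scaling a vector leaves its cosine similarity untouched but changes its Euclidean distance, which is why cosine is a common choice when vector length is not meaningful.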
---
6. Latent space
Latent space is the hidden internal representation space inside neural networks.
Embeddings are part of it, but latent space is broader.
It contains:
- Compressed semantic information
- Abstract features not explicitly defined
- Intermediate representations used for prediction
Key idea:
Raw text → latent space → prediction
Latent space is where "meaning" is internally stored.
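The "raw text → latent space → prediction" flow can be sketched as a tiny forward pass. Everything here is an assumption made for illustration: the weights are invented rather than trained, the two output labels are hypothetical, and real networks stack many such layers.

```python
import math

# All numbers below are invented; a trained network would learn them.
embeddings = {"the": [0.1, 0.0], "cat": [0.9, 0.2], "dog": [0.8, 0.3]}
hidden_weights = [[1.0, -0.5], [0.3, 0.8], [-0.6, 0.4]]  # 2 inputs -> 3 hidden units
output_weights = [[0.7, 0.1, -0.2], [-0.3, 0.9, 0.5]]    # 3 hidden units -> 2 classes
labels = ["animal", "other"]                             # hypothetical output classes

def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def softmax(logits):
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict(text):
    # Assumes the text contains at least one token from `embeddings`.
    tokens = [t for t in text.lower().split() if t in embeddings]
    # Raw text -> embeddings, averaged into one input vector.
    pooled = [sum(embeddings[t][i] for t in tokens) / len(tokens) for i in range(2)]
    # Latent representation: an intermediate vector nobody defined by hand.
    latent = [math.tanh(v) for v in matvec(hidden_weights, pooled)]
    # Prediction: probabilities computed from the latent vector.
    return latent, softmax(matvec(output_weights, latent))

latent, probs = predict("The cat")
print(latent, dict(zip(labels, probs)))
```

The `latent` list is the part that corresponds to latent space: it is neither the input text nor the final answer, only an intermediate representation the prediction is computed from.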
---
7. Why RAG works (Retrieval-Augmented Generation)
RAG uses embeddings to fetch relevant external information.
Process:
User query → embedding → vector search → retrieve documents → LLM generates answer
Why it works:
- Query and documents live in the same vector space
- Similarity search finds semantically related content
- LLM grounds output in retrieved data
So RAG is basically:
Semantic search + language generation
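The pipeline can be sketched end to end without a real model. In this sketch `embed` is a hypothetical stand-in that uses bag-of-words counts (so it matches shared words rather than meaning), the three documents are invented, and no LLM is called: `build_prompt` only assembles the text a model would receive.

```python
import math
from collections import Counter

documents = [
    "Cats sleep for most of the day.",
    "Bananas are rich in potassium.",
    "Dogs bark to communicate with people.",
]

def embed(text):
    """Hypothetical stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().replace(".", "").replace("?", "").split())

def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a)  # Counter returns 0 for missing keys
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, k=1):
    """Vector search: rank documents by similarity to the query embedding."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query):
    """Assemble the text a language model would receive; no model is called here."""
    context = "\n".join(retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

print(build_prompt("Why do dogs bark?"))
```

Swapping `embed` for a learned embedding model is what turns this word-overlap search into semantic search; the rest of the flow keeps the same shape.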
---
8. Why hallucinations happen
Hallucinations occur because the model is not retrieving truth; it is predicting likely text.
Core causes:
- No built-in fact database (unless using RAG/tools)
- It operates on probability, not verification
- Latent space encodes plausibility, not correctness
- Similar patterns can blur together and crowd out exact facts
So the model can produce:
"plausible-sounding but incorrect continuation"
Even if embeddings are semantically close, they are not truth-anchored.
Key distinction:
Similarity ≠ Truth
Probability ≠ Accuracy
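A toy illustration of "Probability ≠ Accuracy". The next-token probabilities below are invented for the example and are not taken from any real model; they simply encode a case where the most frequent association is the wrong answer (Canberra, not Sydney, is the capital of Australia).

```python
import random

# Invented next-token probabilities for the prompt below; not from any real model.
prompt = "The capital of Australia is"
next_token_probs = {"Sydney": 0.55, "Canberra": 0.40, "Melbourne": 0.05}
correct = "Canberra"

def greedy(probs):
    """Pick the most probable token; nothing here checks facts."""
    return max(probs, key=probs.get)

def sample(probs, rng):
    """Draw a token in proportion to its probability."""
    tokens = list(probs)
    return rng.choices(tokens, weights=[probs[t] for t in tokens], k=1)[0]

rng = random.Random(0)
draws = [sample(next_token_probs, rng) for _ in range(1000)]
wrong_rate = sum(token != correct for token in draws) / len(draws)

print(prompt, greedy(next_token_probs))
print("share of sampled continuations that are wrong:", wrong_rate)
```

Both decoding strategies follow the probabilities faithfully, and both produce a fluent, wrong answer most of the time, because nothing in the procedure compares the output against a source of truth.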
---
Key Insight
Embeddings turn language into geometry:
Meaning = position in vector space
Similarity = distance or angle
Reasoning = transformations in latent space
Retrieval = nearest-neighbour search
Errors = statistical plausibility without grounding
That is the foundation of almost all modern AI systems.