This should get you up to speed, in this order:
0. The First AI - Samuel’s Checkers ML System
https://forum.drugs-and-users.org/index.php?topic=7338
Historical grounding.
Shows the origins of machine learning and self-improving systems.
1. Pretraining modern AI / LLMs
https://forum.drugs-and-users.org/index.php?topic=7365
There is a conceptual line of descent from topic 0:
Samuel → early machine learning idea: “systems can improve from data instead of rules”
Later: reinforcement learning (temporal difference learning, policy/value methods)
Much later: deep learning + backprop + Transformers
Then: RLHF (reinforcement learning from human feedback) added after pretraining
If anything, Samuel’s system is closer to modern reinforcement learning agents than to language model pretraining.
2. Basic components of AI and how the data/query flows through to the reply
https://forum.drugs-and-users.org/index.php?topic=7347
High-level architectural overview before deep diving.
Stacks and "subsystems".
3. How Neural Networks Work
https://forum.drugs-and-users.org/index.php?topic=7343
Core foundation.
Everything modern comes from this.
4. The AI Tokenisation Pipeline
https://forum.drugs-and-users.org/index.php?topic=7350
Now the reader understands WHY text must become vectors and embeddings.
5. Transformers
https://forum.drugs-and-users.org/index.php?topic=7344
The real breakthrough architecture behind modern LLMs.
6. LLMs Explained -- A light introduction to LLMs, chatbots, pretraining, and transformers
https://forum.drugs-and-users.org/index.php?topic=7342
Applies the transformer concept to actual LLM systems and chatbot behaviour.
7. RAG - Retrieval Augmented Generation
https://forum.drugs-and-users.org/index.php?topic=7348
Advanced modern extension layer.
Shows how models interface with external knowledge.
Those were the basics. Once you roughly understand them, continue with the following topics:
8. Embeddings and Vector Spaces
https://forum.drugs-and-users.org/index.php?topic=7351
Embeddings tend to get buried inside the tokenisation or neural network topics, but they are absolutely central to modern AI.
Topics:
- What embeddings actually are
- High-dimensional vector spaces
- Semantic proximity
- Why "cat" and "dog" cluster together
- Cosine similarity
- Latent space
- Why RAG works
- Why hallucinations happen
This becomes the bridge between:
Token IDs → Meaning Space
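To make "semantic proximity" and cosine similarity a bit more concrete, here is a minimal sketch. The three-dimensional vectors are made-up toy values, not output from any real embedding model (real embeddings have hundreds or thousands of dimensions), and the names `cosine_similarity` and `toy_embeddings` are just for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy embeddings, invented for this example only.
toy_embeddings = {
    "cat": [0.9, 0.8, 0.1],
    "dog": [0.8, 0.9, 0.2],
    "car": [0.1, 0.2, 0.9],
}

cat_dog = cosine_similarity(toy_embeddings["cat"], toy_embeddings["dog"])
cat_car = cosine_similarity(toy_embeddings["cat"], toy_embeddings["car"])

print(f"cat vs dog: {cat_dog:.2f}")
print(f"cat vs car: {cat_car:.2f}")
```

With these toy numbers "cat" and "dog" point in nearly the same direction while "car" does not, which is the kind of clustering the topic list refers to.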
9. Attention Mechanisms and Self-Attention
https://forum.drugs-and-users.org/index.php?topic=7352
The Transformers topic really deserves to be split up, and attention is the actual revolutionary mechanism.
Topics:
- Query / Key / Value vectors
- Attention weighting
- Context windows
- Token relationships
- Parallel processing vs recurrence
- Why transformers replaced RNNs/LSTMs
Without attention, transformers look like magic.
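As a rough illustration of the Query / Key / Value idea, the sketch below computes scaled dot-product attention for a single query in plain Python. It assumes the query, key and value vectors are already given (in a real transformer they come from learned projections of the token embeddings), and the function names and toy numbers are hypothetical.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Scaled dot-product attention for one query vector."""
    dim = len(query)
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
        for key in keys
    ]
    weights = softmax(scores)
    # The output is a weighted mix of the value vectors.
    output = [
        sum(w * value[i] for w, value in zip(weights, values))
        for i in range(len(values[0]))
    ]
    return weights, output

# Toy example: the query lines up with the first key, not the second.
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0]]
weights, output = attend([3.0, 0.0], keys, values)
print(weights, output)
```

The first token gets most of the attention weight, so the output is dominated by the first value vector. Because every query can be scored against every key independently, this step can be run in parallel across tokens rather than one step at a time as in an RNN.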
10. Training vs Inference
https://forum.drugs-and-users.org/index.php?topic=7353
This is one of the most misunderstood things in AI discussions.
Most people think ChatGPT is "learning while talking."
It usually is not.
Topics:
- Pretraining
- Gradient descent
- Backpropagation
- Weights
- Inference-only operation
- Fine tuning
- RLHF
- Why models are static snapshots
This clears up enormous confusion.
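The difference can be shown with a deliberately tiny sketch: a "model" with a single weight, trained by gradient descent and then used for inference only. This is an assumption-laden toy (one parameter, one training example, hypothetical function names), not how any real LLM is trained, but the split between the two phases is the same in spirit.

```python
def predict(weight, x):
    """Inference: uses the weight but never changes it."""
    return weight * x

def train_step(weight, x, target, learning_rate=0.1):
    """Training: one gradient descent step on the squared error."""
    error = predict(weight, x) - target
    gradient = 2 * error * x  # derivative of (weight*x - target)**2 w.r.t. weight
    return weight - learning_rate * gradient

# Training phase: the weight is updated repeatedly.
weight = 0.0
for _ in range(50):
    weight = train_step(weight, x=1.0, target=3.0)

# Inference phase: the weight is a frozen snapshot.
frozen_weight = weight
answers = [predict(frozen_weight, x) for x in (1.0, 2.0, 5.0)]
print(frozen_weight, answers)
```

No matter how many times `predict` is called, `frozen_weight` stays the same, which is the sense in which a deployed model is a static snapshot.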
11. Context Windows and Memory
https://forum.drugs-and-users.org/index.php?topic=7354
Critical for chatbot understanding.
Topics:
- What context windows are
- Token limits
- Sliding attention windows
- Conversation truncation
- Why models "forget"
- Persistent memory systems
- RAG vs memory
This directly explains chatbot behaviour.
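Conversation truncation is easy to sketch. The example below keeps only the most recent messages that fit into a fixed token budget. It assumes a crude "one word = one token" count and a hypothetical `truncate_history` helper; real chatbots use a proper tokeniser and often more elaborate strategies such as summarising older turns.

```python
def truncate_history(messages, max_tokens):
    """Keep the newest messages whose combined size fits the budget."""
    kept = []
    used = 0
    for message in reversed(messages):   # walk from newest to oldest
        cost = len(message.split())      # crude stand-in for a tokeniser
        if used + cost > max_tokens:
            break                        # everything older is dropped
        kept.append(message)
        used += cost
    return list(reversed(kept))          # restore chronological order

history = [
    "my name is Sam",
    "I like chess",
    "what is my name",
]
visible = truncate_history(history, max_tokens=7)
print(visible)
```

With a budget of 7 "tokens" the first message no longer fits, so the model never sees the name it is being asked about. That is what "forgetting" means here: the text simply fell out of the window.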
12. Hallucinations and Failure Modes
https://forum.drugs-and-users.org/index.php?topic=7355
Very important, as AIs often won't say "I don't know".
Topics:
- Probabilistic generation
- Why confidence ≠ correctness
- Distribution gaps
- Mode collapse
- Confabulation
- Context poisoning
- Prompt injection
Most people fundamentally misunderstand hallucinations.
13. Multi-Modal AI
https://forum.drugs-and-users.org/index.php?topic=7356
Modern systems are no longer just text.
Topics:
- Vision transformers
- Image tokenisation
- Audio embeddings
- Cross-modal embeddings
- Unified latent spaces
- Image generation diffusion models
This connects LLMs to image/video/audio systems.
14. Agents and Tool Use
https://forum.drugs-and-users.org/index.php?topic=7357
Modern frontier AI architecture.
Topics:
- Tool calling
- External APIs
- Planning loops
- Chain-of-thought orchestration
- Autonomous agents
- Memory stores
- Execution environments
This is where systems are heading now.
15. Scaling Laws
https://forum.drugs-and-users.org/index.php?topic=7358
Very important historically.
Topics:
- Why bigger models suddenly worked
- Emergent behaviour
- Parameter scaling
- Data scaling
- Compute scaling
- Why GPT-3 changed everything
This explains why AI progress looked sudden.
16. Quantisation and Model Compression
https://forum.drugs-and-users.org/index.php?topic=7360
The practical consequence of Scaling Laws.
Topics:
- What model weights actually are at the bit level
- FP32 vs FP16 vs INT8 vs INT4
- How precision reduction affects output quality
- GGUF and GGML formats
- llama.cpp and local inference
- Pruning and knowledge distillation
- Why a 7B quantised model can run on a laptop
Scaling Laws explains why models got enormous.
This explains how ordinary hardware runs them anyway.
Directly relevant to anyone self-hosting or running local models.
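The "7B on a laptop" point in the quantisation topic comes down to simple arithmetic: parameters times bits per weight. The sketch below covers the weights only and assumes 1 GB = 10^9 bytes; it ignores activations, the context cache and the small overhead that quantisation formats add for scale factors, so real file sizes will be somewhat larger. The helper name `weight_memory_gb` is hypothetical.

```python
def weight_memory_gb(num_parameters, bits_per_weight):
    """Approximate memory for the weights alone, in GB (10**9 bytes)."""
    total_bytes = num_parameters * bits_per_weight / 8
    return total_bytes / 1e9

seven_billion = 7e9
for name, bits in [("FP32", 32), ("FP16", 16), ("INT8", 8), ("INT4", 4)]:
    print(f"{name}: {weight_memory_gb(seven_billion, bits):.1f} GB")
```

At 32 bits per weight a 7B model needs about 28 GB for its weights, while at 4 bits it needs about 3.5 GB, which fits in the RAM of an ordinary laptop.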
17. Diffusion Models and What They Are For
https://forum.drugs-and-users.org/index.php?topic=7359
Relevant when discussing image generation.
Topics:
- Noise schedules
- Denoising
- Latent diffusion
- Classifier guidance
- Why Stable Diffusion works
A completely different generative approach from autoregressive LLMs (although the denoising network inside can itself be a U-Net or a transformer).
18. The Future of AI - What Is Actually Coming
https://forum.drugs-and-users.org/index.php?topic=7361
By Claude, a prediction only ...
That’s plenty for now!