Author Topic: The basic "Learning AI" curriculum (Read 414 times)

Chip · « **on:** May 27, 2026, 04:14:33 PM »

Whilst not perfect, this should get you up to speed, in this order:

Quote

If only I could find the audience that is as interested as I am and without being overwhelmed and distracted by mathematics and overly complicated explanations ... hey, that might inclund you

However, the optimal audience won't be hanging out at a drug HR site

History: The First AI - Samuel’s Checkers Machine Learning System
https://forum.drugs-and-users.org/index.php?topic=7338

Historical grounding.
Shows the origins of machine learning and self-improving systems.

0. Machine Learning Foundations
https://forum.drugs-and-users.org/index.php?topic=7368

Outlines the various Machine Learning systems and has an evolutionary timeline

See Example Code: Samuel’s Checkers in Python and My Food Preference Model in Psuedocode

1. Pretraining modern AI / LLMs
https://forum.drugs-and-users.org/index.php?topic=7365

There is a conceptual ancestry to The First AI:

Samuel → early machine learning idea: “systems can improve from data instead of rules”

Later: reinforcement learning (temporal difference learning, policy/value methods)

Much later: deep learning + backprop + Transformers

Then: RLHF (reinforcement learning from human feedback) is applied after pretraining, following an initial instruction fine-tuning stage. Pretraining is large-scale next-token prediction on massive text data. RLHF uses human preference signals to optimize the model’s outputs toward more helpful, safe, and conversational behaviour, shaping the surface interaction style rather than core knowledge.

If anything, Samuel’s system is closer to modern reinforcement learning agents, not language model pretraining.

How AI training data, tokens, and “regeneration” actually works
https://forum.drugs-and-users.org/index.php?topic=7395

2. The basic components of Inference (what the user runs with the weights now static from Training) and how the data/query flows through to the reply
https://forum.drugs-and-users.org/index.php?topic=7347

A more detailed but highly educational article I built is at:
https://forum.drugs-and-users.org/index.php?topic=7532

High-level architectural overview before deep diving.
Stacks and "subsystems".

3. How Neural Networks Work
https://forum.drugs-and-users.org/index.php?topic=7343

Core foundation.
Everything modern comes from this.

4. The AI Tokenisation Pipeline
https://forum.drugs-and-users.org/index.php?topic=7350

Introducing tokens as massively multidimensional vectors

Now the reader understands WHY text must become vectors and embeddings.

The Tokenisation Pipeline Re-exaplained because this is so critical to understand, I have revisited it again, and this is easier to understand:
https://forum.drugs-and-users.org/index.php?topic=7394

5. Embeddings and Vector Spaces
https://forum.drugs-and-users.org/index.php?topic=7351

Right now embeddings are probably buried inside tokenisation or neural networks, but embeddings are absolutely central to modern AI.

Topics:

What embeddings actually are
High-dimensional vector spaces
Semantic proximity
Why "cat" and "dog" cluster together
Cosine similarity
Latent space
Why RAG works
Why hallucinations happen

This becomes the bridge between:

Code: [Select]

Token IDs → Meaning Space

6. Transformers
https://forum.drugs-and-users.org/index.php?topic=7344

The real breakthrough architecture behind modern LLMs.

Transformers are built directly on attention mechanisms, which are explained in section 9.

7. LLMs Explained -- A light introduction to LLMs, chatbots, pretraining, and transformers
https://forum.drugs-and-users.org/index.php?topic=7342

Applies the transformer concept to actual LLM systems and chatbot behaviour.

8. RAG - Retrieval Augmented Generation
https://forum.drugs-and-users.org/index.php?topic=7348

Advanced modern extension layer.
Shows how models interface with external knowledge.

It's not that intimidating - all weights and biases are initially checked by humans until it manages to assign them automatically and correctly during the "learning" phase and 90% of the same actual AI code is used to process your query and to also generate your reply.

The RAG and It lets an AI look things up while answering, instead of relying only on what it was trained on.

Those were the basics and once you roughly understand them then continue on with the following topics:

9. Attention Mechanisms and Self-Attention
https://forum.drugs-and-users.org/index.php?topic=7352

Transformers really deserve to be split and Attention is the actual revolutionary mechanism.

Topics:

Query / Key / Value vectors
Attention weighting
Context windows
Token relationships
Parallel processing vs recurrence
Why transformers replaced RNNs (Recurrent Neural Networks) /LSTMs (RNN with a Long Short-Term Memory

Without attention, transformers look like magic.

10. Training vs Inference
https://forum.drugs-and-users.org/index.php?topic=7353

This is one of the most misunderstood things in AI discussions.

Most people think ChatGPT is "learning while talking."

It usually is not.

Topics:

Pretraining
Gradient descent
Backpropagation
Weights
Inference-only operation
Fine tuning — including LoRA and QLoRA (efficient low-rank adaptation)
RLHF
Why models are static snapshots

This clears up enormous confusion.

11. Context Windows and Memory
https://forum.drugs-and-users.org/index.php?topic=7354

Critical for chatbot understanding.

Topics:

What context windows are
Token limits
Sliding attention windows
Conversation truncation
Why models "forget"
Persistent memory systems
RAG vs memory

This directly explains chatbot behaviour.

12. Hallucinations and Failure Modes
https://forum.drugs-and-users.org/index.php?topic=7355

Very important as AIs won't say "I don't know".

Topics:

Probabilistic generation
Why confidence ≠ correctness
Distribution gaps
Mode collapse
Confabulation
Context poisoning
Prompt injection

Most people fundamentally misunderstand hallucinations.

13. Multi-Modal AI
https://forum.drugs-and-users.org/index.php?topic=7356

Modern systems are no longer just text.

Topics:

Vision transformers
Image tokenisation
Audio embeddings
Cross-modal embeddings
Unified latent spaces
Image generation diffusion models

This connects LLMs to image/video/audio systems.

14. Agents and Tool Use
https://forum.drugs-and-users.org/index.php?topic=7357

Modern frontier AI architecture.

Topics:

Tool calling
External APIs
Planning loops
Chain-of-thought orchestration
Autonomous agents
Memory stores
Execution environments

This is where systems are heading now.

15. Scaling Laws
https://forum.drugs-and-users.org/index.php?topic=7358

Very important historically.

Topics:

Why bigger models suddenly worked
Emergent behaviour
Parameter scaling
Data scaling
Compute scaling
Why GPT-3 changed everything

16. Mixture of Experts (MoE)
https://forum.drugs-and-users.org/index.php?topic=7367

How modern frontier models scale without scaling compute proportionally.

Topics:

What an "expert" actually is (a sub-network)
The router / gating mechanism
Sparse activation — only some experts fire per token
Why MoE allows enormous parameter counts on modest compute
MoE vs dense models
Mixtral, GPT-4 (rumoured), Gemini — real-world examples
The trade-offs: memory vs compute

Explains why parameter counts stopped being a simple proxy for capability or cost.

17. Quantisation and Model Compression
https://forum.drugs-and-users.org/index.php?topic=7360

The practical consequence of Scaling Laws.

Topics:

What model weights actually are at the bit level
FP32 vs FP16 vs INT8 vs INT4
How precision reduction affects output quality
GGUF and GGML formats
llama.cpp and local inference
Pruning and knowledge distillation
Why a 7B quantised model can run on a laptop

This explains why AI progress looked sudden.

Scaling Laws explains why models got enormous.
This explains how ordinary hardware runs them anyway.
Directly relevant to anyone self-hosting or running local models.

18. Diffusion Models and What They Are For
https://forum.drugs-and-users.org/index.php?topic=7359

When discussing image generation.

Topics:

Noise schedules
Denoising
Latent diffusion
Classifier guidance
Why Stable Diffusion works

Completely different architecture family from transformers.

19. Critical Defining Factors That Exemplify the Nature of the LLM / Model
https://forum.drugs-and-users.org/index.php?topic=7385

Assigning Their Personality

20. Mainframes: IBM Telum vs GPU AI Architecture — Discussion Thread
https://forum.drugs-and-users.org/index.php?topic=7376

21. The Future of AI — What Is Actually Coming
https://forum.drugs-and-users.org/index.php?topic=7361

By Claude, a prediction only ...

22. What impact will quantum computing have on AI ?
https://forum.drugs-and-users.org/index.php?topic=7400

By ChatGPT

That’s plenty for now !

dopetalk

News:

dopetalk does not endorse any advertised product nor does it accept any liability for it's use or misuse

Author Topic: The basic "Learning AI" curriculum (Read 414 times)

Chip (OP)

The basic "Learning AI" curriculum

Related Topics

Need help or a chat ?

If you need any help or a chat then IM/PM or email me, Chip

dopetalk does not endorse any advertised product nor does it accept any liability for it's use or misuse