Author Topic: Machine Learning Foundations (Read 326 times)

Chip · « **on:** May 28, 2026, 01:15:22 PM »

https://www.youtube.com/embed/9gGnTQTYNaEi=Oy-LzGsslOwbrLSV

Machine Learning Foundations Map

See also ML Example Code: Samuel’s Checkers in Python and My Food Preference Model in Pseudocode

1) Core Goal
Most machine learning comes down to:

Approximating a function f(x) from data
Or choosing actions that maximise reward over time

2) Three Fundamental Paradigms

Neural Computation (Representation Learning)
Perceptron → Deep Nets → Transformers

Learns mappings from inputs to outputs
Optimised via gradient descent
Builds internal representations of data
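
As a minimal sketch of "learn a mapping, optimise it by gradient descent", the snippet below fits a straight line to a few points. The target function f(x) = 2x + 1, the learning rate, and the step count are assumptions picked for illustration; a real network has many more parameters but follows the same loop.

```python
def fit_line(xs, ys, lr=0.05, steps=2000):
    """Fit y ~ w*x + b by gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of (1/n) * sum((w*x + b - y)^2) with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2 * x + 1 for x in xs]  # hypothetical target function f(x) = 2x + 1
w, b = fit_line(xs, ys)
print(round(w, 3), round(b, 3))
```

The learned w and b should land very close to 2 and 1, the values used to generate the data.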

Reward-Driven Learning (Decision Making)
Samuel → Temporal Difference Learning → RL → Deep RL → AlphaZero

Learns via reward signals from environment
Optimises long-term outcomes
Includes exploration and trial-and-error learning

Structure Learning (Unsupervised Inference)
Clustering → PCA → Bayesian methods

Finds hidden structure in data
No labels or rewards required
Learns latent variables, similarity, and density structure
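
One way to picture "finding hidden structure with no labels" is k-means clustering. This is a rough 1-D sketch; the data points and the evenly spaced starting centroids are assumptions for illustration, and library implementations use smarter initialisation.

```python
def kmeans_1d(points, k=2, iterations=10):
    """Plain k-means on 1-D data (assumes k >= 2).

    Centroids start evenly spaced between the smallest and largest point.
    """
    lo, hi = min(points), max(points)
    centroids = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: each centroid moves to the mean of its cluster.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids


points = [0.8, 1.0, 1.2, 4.8, 5.0, 5.2]  # two obvious groups, no labels given
centroids = kmeans_1d(points)
print(centroids)
```

The two centroids settle near 1.0 and 5.0 without the algorithm ever being told which point belongs to which group.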

3) Why These Paradigms Exist

Single-layer perceptrons cannot learn non-linear decision boundaries → needed non-linear models → deep networks
Hand-written rules did not scale to complex tasks → needed learning from experience → RL
Labels are expensive → needed unsupervised structure discovery

4) Modern AI = Hybrid Stack
Most real AI systems combine:

Neural computation (model capacity)
Reward learning (decision optimisation)
Structure learning (data organisation / embeddings)

5) Key Unification Idea
learn representations + optimise objectives + exploit data structure

Machine Learning Evolution Timeline

1) 1950s–1960s: Foundations

Perceptron (Rosenblatt, 1957)

Early supervised learning model
Binary linear classifier
Computes weighted sum of inputs then applies threshold
Learns by adjusting weights when predictions are wrong

Core idea: decision boundaries can be learned from labeled data
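
The perceptron steps above (weighted sum, threshold, adjust weights on mistakes) can be sketched in a few lines. The AND dataset, zero initial weights, and learning rate of 1.0 are assumptions for illustration, not Rosenblatt's original setup.

```python
def predict(weights, bias, x):
    """Weighted sum of inputs followed by a hard threshold."""
    total = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if total > 0 else 0


def train_perceptron(data, epochs=20, lr=1.0):
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, target in data:
            error = target - predict(weights, bias, x)
            if error != 0:
                # Nudge the boundary toward the misclassified example.
                weights = [w + lr * error * xi for w, xi in zip(weights, x)]
                bias += lr * error
                mistakes += 1
        if mistakes == 0:  # every example classified correctly
            break
    return weights, bias


and_data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(and_data)
predictions = [predict(weights, bias, x) for x, _ in and_data]
print(predictions)
```

AND is linearly separable, so the rule is guaranteed to stop making mistakes after a finite number of updates.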

Samuel’s Checkers Program (1959)

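Early reinforcement learning-style system
Plays against a copy of itself (self-play)
Adjusts its position estimates toward values backed up from look-ahead search, rather than from an explicit win/loss reward
Tunes the weights of an evaluation function built from hand-crafted features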

Core idea: systems can improve through experience rather than fixed rules

2) 1960s–1980s: Limits & Theory Gap

Perceptron fails on non-linearly separable problems such as XOR
Neural network research declines
Reinforcement learning exists mainly as Markov Decision Process (MDP) theory
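
The XOR limit is easy to check by brute force. The sketch below tries every weight triple on a small grid (the grid itself is an arbitrary assumption) and counts how many classify AND and XOR correctly with a single threshold unit.

```python
from itertools import product


def classify(w1, w2, b, x):
    """Single threshold unit: fires when the weighted sum is positive."""
    return 1 if w1 * x[0] + w2 * x[1] + b > 0 else 0


def count_solutions(data, grid):
    """Count (w1, w2, b) triples on the grid that get every example right."""
    return sum(
        all(classify(w1, w2, b, x) == t for x, t in data)
        for w1, w2, b in product(grid, repeat=3)
    )


grid = [i / 2 for i in range(-10, 11)]  # -5.0, -4.5, ..., 5.0
and_data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
xor_data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

and_hits = count_solutions(and_data, grid)
xor_hits = count_solutions(xor_data, grid)
print(and_hits > 0, xor_hits)
```

A grid search is only an illustration, but the result holds for all real weights: XOR needs b <= 0, w1 + b > 0, w2 + b > 0 and w1 + w2 + b <= 0. Adding the two middle conditions gives w1 + w2 + 2b > 0, so w1 + w2 + b > -b >= 0, which contradicts the last condition.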

3) 1980s–1990s: Revival

Backpropagation (1986)

Makes training multi-layer neural networks practical
Hidden layers let the network solve XOR
Core mechanism behind modern deep learning
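
Backpropagation is the chain rule applied layer by layer. As a hedged sketch, the code below computes gradients for a tiny 2-2-1 sigmoid network and compares one of them against a finite-difference estimate. The starting weights, the squared-error loss, and the helper names are all assumptions made for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(params, x):
    """2 inputs -> 2 sigmoid hidden units -> 1 sigmoid output."""
    w1, b1, w2, b2 = params
    h = [sigmoid(w1[j][0] * x[0] + w1[j][1] * x[1] + b1[j]) for j in range(2)]
    y = sigmoid(w2[0] * h[0] + w2[1] * h[1] + b2)
    return h, y


def loss(params, x, target):
    _, y = forward(params, x)
    return 0.5 * (y - target) ** 2


def backprop(params, x, target):
    """Gradient of the loss for every parameter, via the chain rule."""
    w1, b1, w2, b2 = params
    h, y = forward(params, x)
    delta_out = (y - target) * y * (1 - y)  # dL/d(output pre-activation)
    # Push the output error back through w2 and the hidden sigmoids.
    delta_hid = [delta_out * w2[j] * h[j] * (1 - h[j]) for j in range(2)]
    grad_w1 = [[delta_hid[j] * x[i] for i in range(2)] for j in range(2)]
    grad_w2 = [delta_out * h[j] for j in range(2)]
    return grad_w1, delta_hid, grad_w2, delta_out


def numeric_grad_w1(params, x, target, j, i, eps=1e-6):
    """Central finite difference for one first-layer weight, used as a check."""
    w1, b1, w2, b2 = params
    up = [row[:] for row in w1]
    down = [row[:] for row in w1]
    up[j][i] += eps
    down[j][i] -= eps
    return (loss((up, b1, w2, b2), x, target)
            - loss((down, b1, w2, b2), x, target)) / (2 * eps)


# Hypothetical starting weights, chosen only for illustration.
params = ([[0.5, -0.4], [0.3, 0.8]], [0.1, -0.2], [0.7, -0.6], 0.05)
x, target = (1.0, 0.0), 1.0
grad_w1, grad_b1, grad_w2, grad_b2 = backprop(params, x, target)
gap = abs(grad_w1[0][0] - numeric_grad_w1(params, x, target, 0, 0))
print(gap)
```

Training is then just repeated small steps against these gradients. The gap between the analytic and numeric gradient should be tiny, which is a common sanity check when writing backprop by hand.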

Temporal Difference Learning (1988)

Bridges Samuel-style learning with formal reinforcement learning
Learns from prediction errors over time
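
"Learning from prediction errors" can be made concrete with the TD(0) update, V(s) += alpha * (r + gamma * V(s') - V(s)). The three-state chain, step size, and discount below are assumptions for a toy example.

```python
def td0_chain(n_states=3, episodes=200, alpha=0.5, gamma=0.9):
    """TD(0) value estimates on a deterministic chain 0 -> 1 -> ... -> terminal.

    The only reward is 1.0 on the final transition into the terminal state.
    """
    values = [0.0] * (n_states + 1)  # last entry is the terminal state, kept at 0
    for _ in range(episodes):
        for s in range(n_states):
            s_next = s + 1
            reward = 1.0 if s_next == n_states else 0.0
            # TD error: gap between the current guess and a one-step-better guess.
            td_error = reward + gamma * values[s_next] - values[s]
            values[s] += alpha * td_error
    return values[:n_states]


values = td0_chain()
print([round(v, 3) for v in values])
```

The estimates approach 0.81, 0.9 and 1.0: each state's value is the discounted value of the state after it, learned without waiting for the end of the episode.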

4) 1990s–2010s: Function Approximation Era

Q-learning becomes widely used in reinforcement learning
Neural networks approximate value functions
TD-Gammon (1992) reaches expert-level backgammon play using TD learning with a neural network
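
A rough tabular Q-learning sketch on a toy corridor is shown below. A table stands in for the neural network mentioned above, and the environment, random behaviour policy, and hyperparameters are all assumptions for illustration.

```python
import random


def q_learning_corridor(n_states=5, episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning on a corridor; the rightmost state is the goal (reward 1).

    Actions: 0 = left, 1 = right. Behaviour is uniformly random, which works
    because Q-learning is off-policy.
    """
    rng = random.Random(seed)
    goal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        state = 0
        for _ in range(200):  # step cap per episode
            action = rng.randrange(2)
            next_state = max(0, state - 1) if action == 0 else state + 1
            reward = 1.0 if next_state == goal else 0.0
            best_next = 0.0 if next_state == goal else max(q[next_state])
            # Move Q(s, a) toward reward + discounted best value of the next state.
            q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
            state = next_state
            if state == goal:
                break
    return q


q = q_learning_corridor()
policy = [max((0, 1), key=lambda a: q[s][a]) for s in range(4)]
print(policy)
```

Even though the agent behaves randomly while learning, the greedy policy read off the table heads right in every state. Deep RL replaces the table `q` with a network that generalises across states.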

5) 2010s: Deep Learning Era

Pretraining (Core LLM Stage)
Pretraining is large-scale self-supervised learning where a model is trained on massive text corpora to predict the next token.

Pretraining is the foundational stage that instruction tuning and RLHF later build on. RLHF itself is a post-pretraining alignment step that fine-tunes model behaviour using human preference rankings.

Learns statistical structure of language and world knowledge
Uses next-token prediction as the training objective
Requires no human labels or explicit instructions
Forms the base capability of modern language models

Core idea: learn general representations from raw data via prediction
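
To show the next-token objective without a neural network, here is a count-based bigram model as a stand-in. Real pretraining uses a large network trained by gradient descent; the toy corpus and function names here are assumptions. The point is that the training targets are just the input text shifted by one position, so no human labels are needed.

```python
import math
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count how often each token follows each context token."""
    counts = defaultdict(Counter)
    # Targets are the inputs shifted by one: the data labels itself.
    for context, target in zip(tokens, tokens[1:]):
        counts[context][target] += 1
    return counts


def next_token_probs(counts, context):
    total = sum(counts[context].values())
    return {tok: c / total for tok, c in counts[context].items()}


def average_loss(counts, tokens):
    """Mean negative log-likelihood of each next token (cross-entropy)."""
    pairs = list(zip(tokens, tokens[1:]))
    return -sum(
        math.log(next_token_probs(counts, c)[t]) for c, t in pairs
    ) / len(pairs)


tokens = "the cat sat on the mat the cat ran".split()
counts = train_bigram(tokens)
probs = next_token_probs(counts, "the")
print(max(probs, key=probs.get), average_loss(counts, tokens))
```

After "the", the model puts 2/3 of its probability on "cat" and 1/3 on "mat". A language model does the same job with a far longer context and learned parameters instead of counts.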

RNNs (Recurrent Neural Networks)
Process sequential data using a hidden memory state across time steps.
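
A minimal sketch of the hidden memory state, using a single scalar recurrent cell. The weights are hand-picked assumptions; a trained RNN would learn them and use vectors rather than scalars.

```python
import math


def rnn_forward(inputs, w_x=0.5, w_h=0.8, bias=0.0):
    """Scalar recurrent cell: h_t = tanh(w_x * x_t + w_h * h_{t-1} + bias)."""
    hidden = 0.0
    states = []
    for x in inputs:
        # The new state mixes the current input with the previous state.
        hidden = math.tanh(w_x * x + w_h * hidden + bias)
        states.append(hidden)
    return states


states = rnn_forward([1.0, 0.0, 0.0])
print([round(s, 3) for s in states])
```

Only the first input is non-zero, yet the later states stay above zero: the hidden state carries a fading trace of what came earlier.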

CNNs (Convolutional Neural Networks)
Process spatial data using learned filters that detect edges, textures, and patterns.
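
The filter idea can be shown in one dimension. This sketch slides a two-element edge filter along a signal; the filter values are set by hand here as an assumption, whereas a CNN would learn them from data.

```python
def conv1d(signal, kernel):
    """Slide the kernel along the signal with no padding (cross-correlation)."""
    k = len(kernel)
    return [
        sum(kernel[j] * signal[i + j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


signal = [0, 0, 0, 1, 1, 1]
edge_filter = [-1, 1]  # responds where the signal steps up
response = conv1d(signal, edge_filter)
print(response)  # [0, 0, 1, 0, 0]
```

The response is non-zero only at the position where the signal jumps from 0 to 1, which is what "detecting an edge" means. Image CNNs apply the same sliding operation in two dimensions.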

Transformers
Use attention mechanisms instead of recurrence/convolution, enabling parallel processing and long-range dependency modeling.
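
A bare-bones sketch of scaled dot-product attention for a single query follows. The vectors are made-up assumptions, and real Transformers add learned projections, multiple heads, and batching on top of this.

```python
import math


def softmax(scores):
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(query, keys, values):
    """Scaled dot-product attention for one query vector."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    output = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return output, weights


keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
values = [[10.0], [20.0], [30.0]]
output, weights = attention([4.0, 0.0], keys, values)
print([round(w, 3) for w in weights], output)
```

Positions 0 and 2 have matching keys, so they receive equal weight no matter how far apart they sit in the sequence. Each score is also computed independently of the others, which is what allows the parallel processing mentioned above.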

CNNs, RNNs, and Transformers scale representation learning
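The perceptron’s weighted-sum unit, with the hard threshold replaced by an activation that gradients can pass through, becomes the basic neuron in deep networks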
Supervised learning dominates early deep learning success

6) 2013–2016: Deep Reinforcement Learning

DQN combines Q-learning with deep neural networks
Learns directly from raw inputs such as images

7) 2017–Present: Self-Play Systems

AlphaZero / MuZero

Self-play reinforcement learning
Deep neural networks combined with search
No human game data required; training games come from self-play

8) 2017–Present: Alignment & RLHF Era

Instruction tuning trains models to follow prompts
RLHF uses human preference rankings to shape outputs
Optimises behaviour: helpfulness, safety, tone, refusal style
Does not primarily add new capabilities; it steers how pretrained capabilities are expressed
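
One common way to turn preference rankings into a training signal is a pairwise (Bradley-Terry style) loss on a reward model's scores. The post does not say which loss is used, so treat this as an assumed, simplified sketch; the scores passed in are hypothetical reward-model outputs.

```python
import math


def preference_loss(reward_chosen, reward_rejected):
    """Pairwise loss: -log(sigmoid(r_chosen - r_rejected))."""
    margin = reward_chosen - reward_rejected
    return math.log(1.0 + math.exp(-margin))


agree = preference_loss(2.0, 0.0)     # model already prefers the human's choice
tie = preference_loss(0.0, 0.0)       # model has no preference
disagree = preference_loss(0.0, 2.0)  # model prefers the rejected answer
print(round(agree, 3), round(tie, 3), round(disagree, 3))
```

The loss is small when the reward model ranks the human-preferred answer higher and large when it ranks it lower. The trained reward model then supplies the reward signal for the reinforcement learning step.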

Two Parallel Streams

Representation Learning Stream
Perceptron → Backpropagation → Deep Learning → Transformers

Reward Learning Stream
Samuel’s Checkers → Temporal Difference Learning → Reinforcement Learning → Deep RL → AlphaZero

Key Insight
Modern AI is the combination of:

Neural computation (representation learning)
Reward-driven learning (decision-making)
Alignment methods (RLHF shaping behaviour)

Core Machine Learning Paradigms (Summary)

Neural computation learns representations via data-driven optimisation.
Reward-driven learning optimises actions via environmental feedback.
Structure learning discovers latent patterns without labels or rewards.

Modern AI systems are typically hybrids combining multiple paradigms.
