dopetalk does not endorse any advertised product nor does it accept any liability for its use or misuse


Our Discord Notification Server invitation link is https://discord.gg/jB2qmRrxyD

Author Topic: Machine Learning Foundations  (Read 22 times)

Online Chip (OP)

Machine Learning Foundations
« on: Yesterday at 01:15:22 PM »


Machine Learning Foundations Map

See also ML Example Code: Samuel’s Checkers in Python and Food Preference in Pseudocode


1) Core Goal
Most machine learning comes down to one of two goals:
  • Approximating a function f(x)
  • Or choosing actions that maximize reward over time


2) Three Fundamental Paradigms

Neural Computation (Representation Learning)
Perceptron → Deep Nets → Transformers 
  • Learns mappings from inputs to outputs
  • Optimised via gradient descent
  • Builds internal representations of data

Reward-Driven Learning (Decision Making)
Samuel → Temporal Difference Learning → RL → Deep RL → AlphaZero 
  • Learns via reward signals from environment
  • Optimises long-term outcomes
  • Includes exploration and trial-and-error learning

Structure Learning (Unsupervised Inference)
Clustering → PCA → Bayesian methods 
  • Finds hidden structure in data
  • No labels or rewards required
  • Learns latent variables, similarity, and density structure
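As a rough illustration of finding structure without labels, here is a minimal 1-D k-means sketch. The function name, the data points and the starting centroids are all made up for the example; k-means is only one of many clustering methods.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Plain k-means on 1-D data: assign points to the nearest centroid, then re-average."""
    centroids = list(centroids)
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Recompute each centroid as the mean of its cluster (keep it if the cluster is empty).
        centroids = [sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)]
    return centroids, clusters

# Hypothetical unlabelled data with two obvious groups.
points = [1.0, 1.2, 0.8, 5.0, 5.3, 4.7]
centroids, clusters = kmeans_1d(points, centroids=[0.0, 6.0])
print(centroids)  # roughly [1.0, 5.0]
```

No labels or rewards are used anywhere: the grouping comes only from distances between the points.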


3) Why These Paradigms Exist
  • Single-layer perceptrons could not represent non-linear decision boundaries → needed non-linear models → deep networks
  • Rule-based systems failed → needed learning from experience → RL
  • Labels are expensive → needed unsupervised structure discovery


4) Modern AI = Hybrid Stack
Most real AI systems combine:
  • Neural computation (model capacity)
  • Reward learning (decision optimisation)
  • Structure learning (data organisation / embeddings)


5) Key Unification Idea
learn representations + optimise objectives + exploit data structure


Machine Learning Evolution Timeline


1) 1950s–1960s: Foundations

Perceptron (Rosenblatt, 1957)
  • Early supervised learning model
  • Binary linear classifier
  • Computes weighted sum of inputs then applies threshold
  • Learns by adjusting weights when predictions are wrong
Core idea: decision boundaries can be learned from labeled data
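The bullets above can be sketched in a few lines of Python. This is a minimal illustration of the classic perceptron learning rule on the AND function, not Rosenblatt's original implementation; the function names, learning rate and epoch count are assumptions.

```python
def predict(weights, bias, x):
    # Weighted sum of inputs, then a hard threshold.
    activation = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if activation > 0 else 0

def train_perceptron(samples, epochs=20, lr=1.0):
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for x, target in samples:
            error = target - predict(weights, bias, x)
            if error != 0:
                # Only adjust weights when the prediction is wrong.
                weights = [w + lr * error * xi for w, xi in zip(weights, x)]
                bias += lr * error
    return weights, bias

# AND is linearly separable, so the rule is guaranteed to converge.
AND_DATA = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(AND_DATA)
predictions = [predict(weights, bias, x) for x, _ in AND_DATA]
print(predictions)  # [0, 0, 0, 1]
```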

Samuel’s Checkers Program (1959)
  • Early reinforcement learning-style system
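  • Plays against itself (self-play)
  • Learns largely by moving its evaluation of a position toward the value found by look-ahead search, rather than from explicit win/loss rewards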
  • Improves a hand-crafted evaluation function over time
Core idea: systems can improve through experience rather than fixed rules


2) 1960s–1980s: Limits & Theory Gap
  • A single-layer perceptron cannot learn problems that are not linearly separable, such as XOR (Minsky and Papert, 1969)
  • Neural network research declines
  • Reinforcement learning exists mainly as theory: Markov decision processes (MDPs) and dynamic programming
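The XOR limitation can be checked by brute force. The sketch below tries every combination of two weights and a bias on a coarse grid (the grid range and step are arbitrary choices for the example) and counts how many reproduce a given truth table with a single threshold unit.

```python
from itertools import product

def threshold_unit(w1, w2, b, x1, x2):
    # A single perceptron-style unit: weighted sum followed by a hard threshold.
    return 1 if w1 * x1 + w2 * x2 + b > 0 else 0

XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}
AND = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}

def count_solutions(table, grid):
    # Count (w1, w2, b) triples that classify every row of the table correctly.
    return sum(
        all(threshold_unit(w1, w2, b, x1, x2) == y for (x1, x2), y in table.items())
        for w1, w2, b in product(grid, repeat=3)
    )

grid = [i / 2 for i in range(-10, 11)]  # -5.0 to 5.0 in steps of 0.5
xor_solutions = count_solutions(XOR, grid)
and_solutions = count_solutions(AND, grid)
print(xor_solutions)  # 0
```

A finer grid would not help: XOR needs b ≤ 0, w1 + b > 0, w2 + b > 0 and w1 + w2 + b ≤ 0 all at once, and adding the two middle inequalities gives w1 + w2 + 2b > 0, which together with the last one forces b > 0, a contradiction. AND, by contrast, has many solutions on the same grid.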


3) 1980s–1990s: Revival

Backpropagation (1986)
  • Makes training multi-layer neural networks practical (popularised by Rumelhart, Hinton and Williams)
  • Lets a network with a hidden layer learn XOR
  • Core mechanism behind modern deep learning
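Here is a small pure-Python sketch of backpropagation on XOR, assuming a 2-3-1 sigmoid network, mean squared error and full-batch gradient descent. The layer size, learning rate, seed and step count are illustrative choices, not anything from the 1986 work.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

XOR_DATA = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0), ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]
HIDDEN = 3  # number of hidden units (an arbitrary small choice)

def init_params(seed=0):
    rng = random.Random(seed)
    return {
        "W1": [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(HIDDEN)],
        "b1": [0.0] * HIDDEN,
        "W2": [rng.uniform(-1, 1) for _ in range(HIDDEN)],
        "b2": 0.0,
    }

def forward(p, x):
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(p["W1"], p["b1"])]
    y = sigmoid(sum(w * hi for w, hi in zip(p["W2"], h)) + p["b2"])
    return h, y

def mean_squared_error(p):
    return sum((forward(p, x)[1] - t) ** 2 for x, t in XOR_DATA) / len(XOR_DATA)

def train_step(p, lr=0.5):
    n = len(XOR_DATA)
    gW1 = [[0.0, 0.0] for _ in range(HIDDEN)]
    gb1, gW2, gb2 = [0.0] * HIDDEN, [0.0] * HIDDEN, 0.0
    for x, t in XOR_DATA:
        h, y = forward(p, x)
        # Error signal at the output unit (chain rule through the sigmoid).
        delta_out = 2.0 * (y - t) * y * (1.0 - y) / n
        gb2 += delta_out
        for j in range(HIDDEN):
            # Propagate the error back through W2 to hidden unit j.
            delta_hidden = delta_out * p["W2"][j] * h[j] * (1.0 - h[j])
            gW2[j] += delta_out * h[j]
            gb1[j] += delta_hidden
            for i in range(2):
                gW1[j][i] += delta_hidden * x[i]
    # Gradient descent update, applied after all gradients are accumulated.
    for j in range(HIDDEN):
        p["W2"][j] -= lr * gW2[j]
        p["b1"][j] -= lr * gb1[j]
        for i in range(2):
            p["W1"][j][i] -= lr * gW1[j][i]
    p["b2"] -= lr * gb2

params = init_params()
loss_before = mean_squared_error(params)
for _ in range(5000):
    train_step(params)
loss_after = mean_squared_error(params)
print(loss_before, loss_after)
```

The loss should fall during training; how close it gets to zero depends on the random initialisation, since small XOR networks can sit on a plateau for a long time.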

Temporal Difference Learning (1988)
  • Bridges Samuel-style learning with formal reinforcement learning
  • Learns from prediction errors over time
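"Learning from prediction errors" can be made concrete with the tabular TD(0) update. This is a sketch on a made-up two-state episode (A → B → end, with reward 1 on the last step); the state names, step size and discount factor are assumptions.

```python
def td0_update(values, state, reward, next_state, alpha=0.1, gamma=0.9):
    # Terminal transitions (next_state is None) bootstrap from a value of zero.
    next_value = 0.0 if next_state is None else values[next_state]
    # TD error: how far the current prediction is from "reward + discounted next prediction".
    td_error = reward + gamma * next_value - values[state]
    values[state] += alpha * td_error
    return td_error

# Hypothetical deterministic episode: (state, reward, next_state).
EPISODE = [("A", 0.0, "B"), ("B", 1.0, None)]
values = {"A": 0.0, "B": 0.0}
for _ in range(500):
    for state, reward, next_state in EPISODE:
        td0_update(values, state, reward, next_state)

print(values)  # approaches {"A": 0.9, "B": 1.0}
```

Note that state A never sees a reward directly. Its value is learned from the prediction at B, which is the bootstrapping idea that links Samuel's approach to later reinforcement learning.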


4) 1990s–2010s: Function Approximation Era
  • Q-learning becomes widely used in reinforcement learning
  • Neural networks approximate value functions
  • TD-Gammon demonstrates strong game-playing performance
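A minimal tabular Q-learning sketch on a made-up four-state corridor is shown here. Two simplifications are assumed: values are stored in a table rather than approximated by a neural network, and every state-action pair is swept in turn instead of being explored with something like epsilon-greedy.

```python
N_STATES = 4        # states 0..3; state 3 is the terminal goal
ACTIONS = (-1, +1)  # move left or move right

def step(state, action):
    # Deterministic corridor: walls clamp the position, reaching the goal pays 1.
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done

def q_learning(sweeps=500, alpha=0.5, gamma=0.9):
    q = {(s, a): 0.0 for s in range(N_STATES - 1) for a in ACTIONS}
    for _ in range(sweeps):
        for (s, a) in list(q):
            next_state, reward, done = step(s, a)
            # Q-learning target uses the best action in the next state.
            best_next = 0.0 if done else max(q[(next_state, b)] for b in ACTIONS)
            q[(s, a)] += alpha * (reward + gamma * best_next - q[(s, a)])
    return q

q = q_learning()
greedy_policy = [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)]
print(greedy_policy)  # [1, 1, 1]: always move right
```

Replacing the table `q` with a neural network that maps a state to action values is, roughly, the step from this era to deep reinforcement learning.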


5) 2010s: Deep Learning Era

Pretraining (Core LLM Stage)
Pretraining is large-scale self-supervised learning where a model is trained on massive text corpora to predict the next token.

Pretraining is the foundational training stage that enables all downstream instruction tuning and RLHF. RLHF is a post-pretraining alignment step that uses human preference rankings to fine-tune model behaviour.

  • Learns statistical structure of language and world knowledge
  • Uses next-token prediction as the training objective
  • Requires no human labels or explicit instructions
  • Forms the base capability of modern language models

Core idea: learn general representations from raw data via prediction
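To show why next-token prediction needs no human labels, here is a toy sketch: the training targets are simply the text shifted by one position. A bigram count table stands in for the neural network, and the corpus and function names are invented for the example.

```python
from collections import Counter, defaultdict

def make_training_pairs(tokens):
    # Self-supervision: each token is the label for the token before it.
    return list(zip(tokens[:-1], tokens[1:]))

def train_bigram(tokens):
    counts = defaultdict(Counter)
    for context, target in make_training_pairs(tokens):
        counts[context][target] += 1
    return counts

def predict_next(counts, context):
    # Most frequent continuation seen in training, or None for an unseen context.
    seen = counts.get(context)
    return seen.most_common(1)[0][0] if seen else None

corpus = "the cat sat on the mat and the cat slept".split()
model = train_bigram(corpus)
print(predict_next(model, "the"))  # cat
```

Real pretraining uses a far longer context than one token and learns a probability distribution with a neural network, but the way the (input, target) pairs are built from raw text is the same idea.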

RNNs (Recurrent Neural Networks)
Process sequential data using a hidden memory state across time steps.
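A minimal sketch of the hidden-state idea, assuming a scalar Elman-style update h_t = tanh(w_x·x_t + w_h·h_{t-1} + b). The weights here are hand-picked for illustration; in a real RNN they are learned and the state is a vector.

```python
import math

def rnn_step(h_prev, x, w_x, w_h, b):
    # New state mixes the current input with the previous state.
    return math.tanh(w_x * x + w_h * h_prev + b)

def run_rnn(inputs, w_x=1.0, w_h=0.5, b=0.0):
    h = 0.0
    states = []
    for x in inputs:
        h = rnn_step(h, x, w_x, w_h, b)
        states.append(h)
    return states

states_a = run_rnn([1.0, 0.0, 0.0])  # a pulse at the first time step
states_b = run_rnn([0.0, 0.0, 0.0])  # no input at all
print(states_a)
print(states_b)
```

The two sequences have identical inputs at steps two and three, yet the states differ because the first sequence still carries a fading trace of the earlier pulse.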

CNNs (Convolutional Neural Networks)
Process spatial data using learned filters that detect edges, textures, and patterns.
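A small sketch of what a filter does, assuming stride 1 and no padding. The vertical-edge kernel below is set by hand for illustration, whereas a CNN would learn its kernel values during training.

```python
def conv2d(image, kernel):
    """Slide the kernel over the image (cross-correlation, as used in deep learning)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

# Toy image: dark on the left, bright on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
vertical_edge = [
    [-1, 1],
    [-1, 1],
]
feature_map = conv2d(image, vertical_edge)
print(feature_map)  # [[0, 2, 0], [0, 2, 0]]
```

The output is non-zero only in the column where the brightness changes, which is what "detecting an edge" means here.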

Transformers
Use attention mechanisms instead of recurrence/convolution, enabling parallel processing and long-range dependency modeling.
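The core attention computation can be sketched in pure Python as scaled dot-product attention. The example keys, values and query are made up, and a real Transformer adds learned projections, multiple heads and much more around this step.

```python
import math

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        # Output is the weighted average of the value vectors.
        outputs.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(values[0]))]
        )
        all_weights.append(weights)
    return outputs, all_weights

keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
queries = [[4.0, 0.0]]  # points in the same direction as the first key
outputs, weights = attention(queries, keys, values)
print(weights[0])
```

Each query is handled independently of the others, so all positions can be processed in parallel, and any position can attend to any other regardless of distance.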

  • CNNs, RNNs, and Transformers scale representation learning
  • The perceptron's weighted-sum unit, with a differentiable activation in place of the hard threshold, becomes the basic neuron in deep networks
  • Supervised learning dominates early deep learning success




6) 2013–2016: Deep Reinforcement Learning
  • DQN combines Q-learning with deep neural networks
  • Learns directly from raw inputs such as images


7) 2017–Present: Self-Play Systems

AlphaZero / MuZero
  • Self-play reinforcement learning
  • Deep neural networks combined with search
  • No human game data required: AlphaZero is given only the rules, and MuZero learns its own model of the game dynamics


8) 2017–Present: Alignment & RLHF Era
  • Instruction tuning trains models to follow prompts
  • RLHF uses human preference rankings to shape outputs
  • Optimises behaviour: helpfulness, safety, tone, refusal style
  • Mainly shapes how the capabilities learned in pretraining are expressed, rather than adding new knowledge
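One common way to turn preference rankings into a training signal is a pairwise (Bradley-Terry style) loss on a reward model's scores. The post does not say which objective is meant, so treat this as one plausible sketch; the function name and the example scores are assumptions.

```python
import math

def preference_loss(reward_chosen, reward_rejected):
    """Pairwise loss: -log(sigmoid(r_chosen - r_rejected))."""
    margin = reward_chosen - reward_rejected
    # -log(sigmoid(m)) equals log(1 + exp(-m)); fine for moderate margins.
    return math.log1p(math.exp(-margin))

# Hypothetical reward-model scores for two responses to the same prompt.
loss_agrees = preference_loss(reward_chosen=2.0, reward_rejected=0.0)
loss_disagrees = preference_loss(reward_chosen=0.0, reward_rejected=2.0)
print(loss_agrees, loss_disagrees)
```

The loss is small when the reward model already scores the human-preferred response higher and large when it ranks the pair the wrong way round. The trained reward model is then used to fine-tune the language model.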


Two Parallel Streams

Representation Learning Stream
Perceptron → Backpropagation → Deep Learning → Transformers

Reward Learning Stream
Samuel’s Checkers → Temporal Difference Learning → Reinforcement Learning → Deep RL → AlphaZero


Key Insight
Modern AI is the combination of:
  • Neural computation (representation learning)
  • Reward-driven learning (decision-making)
  • Alignment methods (RLHF shaping behaviour)


Core Machine Learning Paradigms (Summary)

Neural computation learns representations via data-driven optimisation. 
Reward-driven learning optimises actions via environmental feedback. 
Structure learning discovers latent patterns without labels or rewards. 

Modern AI systems are typically hybrids combining multiple paradigms.
« Last Edit: Yesterday at 09:13:26 PM by Chip »