Transcript of "MATH3346 Data Mining A Machine Learning Framework for Data Mining"
1.
MATH3346 Data Mining
A Machine Learning Framework for Data Mining
Graham.Williams@togaware.com
August 2005

1 A Learning Framework
    The Learning Problem
    Playing Draughts
    Knowledge Representation
    Algorithm
2 Concept Learning
    Example
    Hypotheses
© 2005 Graham.Williams@togaware.com MATH3346 Data Mining: Machine Learning
Reference Book
Machine Learning
Tom Mitchell
1997, McGraw-Hill
ISBN: 0070428077.
2.
What is the Learning Problem?
Learning = Improving with experience at some task
    Improve over task T, with respect to performance measure P, based on experience E.
E.g., Learn to play draughts (checkers)
    T: Play draughts
    P: % of games won in world tournament
    E: opportunity to play against self

A Framework For Learning
From what data do we learn?
    Supervised versus Unsupervised
How to represent the knowledge discovered?
    Group means
    Regression formula
    Decision tree
    Neural Network
How to discover the sentence that best describes data?
    Search through the representation space
Learning to Play Draughts
T: Play draughts
P: Percent of games won in world tournament
What experience?
What exactly should be learned?
How shall it be represented?
What specific algorithm to learn it?

Training Experience
Direct Training: current board → move
Indirect Training: moves → outcome
Teacher: to guide training as a supervisor
No Teacher: the learner proposes boards and measures performance.
A problem: is training experience representative of performance goal?
3.
Choose the Target Function
What is the best move, given the current layout:
    ChooseMove : Board → Move
ChooseMove is difficult to learn.
Evaluate the current board layout:
    V : Board → ℝ
The aim is to learn an evaluation function.

Possible Definition for Target Function V
if b is a final board state that is won, then V(b) = 100
if b is a final board state that is lost, then V(b) = −100
if b is a final board state that is drawn, then V(b) = 0
if b is not a final state in the game, then V(b) = V(b'), where b' is the best final board state that can be achieved starting from b and playing optimally until the end of the game.
This gives correct values, but is not operational: how to make use of this?
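To see why this definition of V is correct but not operational, here is a minimal sketch on a hypothetical toy game tree (not real draughts). It assumes "playing optimally" means both sides play optimally, so our moves maximise the value and the opponent's moves minimise it. The names `TREE`, `FINAL` and `target_value` are made up for illustration. Computing V this way requires searching every line of play to the end of the game, which is infeasible for a real draughts board.

```python
# Hypothetical toy game tree (an assumption for illustration, not draughts).
# Non-final states map to their successor states.
TREE = {
    "start": ["a", "b"],  # our move
    "a": ["a1", "a2"],    # opponent's move
    "b": ["b1", "b2"],    # opponent's move
}

# Final states carry the values from the slide: won, lost, drawn.
WIN, LOSS, DRAW = 100, -100, 0
FINAL = {"a1": WIN, "a2": DRAW, "b1": WIN, "b2": LOSS}


def target_value(state, our_turn=True):
    """V(b): value of the final state reached under optimal play by both sides."""
    if state in FINAL:
        return FINAL[state]
    # Exhaustive search of every continuation: correct, but not operational.
    values = [target_value(s, not our_turn) for s in TREE[state]]
    return max(values) if our_turn else min(values)


print(target_value("start"))  # 0: the best we can force here is a draw
```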
Choose Representation for Target Function
Choice of representation is "everything"... but which one?
    collection of rules?
    neural network?
    decision tree?
    numeric formula?
    polynomial function of board features?
    ...

A Representation for Learned Function
$\hat{V}(b) = w_0 + w_1 \cdot bp(b) + w_2 \cdot rp(b) + w_3 \cdot bk(b) + w_4 \cdot rk(b) + w_5 \cdot bt(b) + w_6 \cdot rt(b)$
    bp(b): number of black pieces on board b
    rp(b): number of red pieces on b
    bk(b): number of black kings on b
    rk(b): number of red kings on b
    bt(b): number of red pieces threatened by black (i.e., which can be taken on black's next turn)
    rt(b): number of black pieces threatened by red
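The linear representation above can be sketched as a weighted sum over the six board features. The weights and feature counts below are made-up illustrative values, not learned ones, and the board is assumed to be already summarised as a dictionary of feature counts (extracting the features from a real board is not shown).

```python
# Feature order follows the slide: bp, rp, bk, rk, bt, rt.
FEATURES = ("bp", "rp", "bk", "rk", "bt", "rt")


def v_hat(weights, features):
    """Learned evaluation: w0 + w1*bp(b) + ... + w6*rt(b).

    weights  -- sequence of 7 numbers (w0 .. w6)
    features -- dict mapping each feature name to its count on board b
    """
    w0, *rest = weights
    return w0 + sum(w * features[name] for w, name in zip(rest, FEATURES))


# Illustrative (assumed) weights and feature counts.
weights = [0.0, 1.0, -1.0, 3.0, -3.0, 0.5, -0.5]
board = {"bp": 8, "rp": 7, "bk": 1, "rk": 0, "bt": 2, "rt": 1}

print(v_hat(weights, board))  # 4.5
```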
4.
Obtaining Training Examples
All we know is the outcome of the game
    $V(b)$: the true target function
    $\hat{V}(b)$: the learned function
    $V_{train}(b)$: the training values (supplied)
A simple and empirically useful rule for estimating training values:
    $V_{train}(b) \leftarrow \hat{V}(\mathrm{Successor}(b))$

Choose Weight Tuning Rule
LMS Weight update rule:
Minimise squared error $E = \sum_{b \in \mathrm{training}} (V_{train}(b) - \hat{V}(b))^2$
Repeat:
    Select a training example b at random
    1 Compute error(b): $error(b) = V_{train}(b) - \hat{V}(b)$
    2 For each board feature $f_i$ (e.g., bp), update weight $w_i$: $w_i \leftarrow w_i + c \cdot f_i \cdot error(b)$
c is some small constant, say 0.1, to moderate the rate of learning
Stochastic gradient-descent search to minimise E
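The LMS loop above can be sketched as follows. This is a minimal illustration under stated assumptions: the training examples are a made-up toy set with two 0/1 features generated from V = 1 + 2·f1 − 3·f2 (not draughts boards), w0 is treated as the weight of a constant feature equal to 1, and the helper names `lms_update` and `train` are hypothetical. With features this small, c = 0.1 keeps the updates stable; larger feature values would need a smaller c.

```python
import random


def v_hat(weights, features):
    """Learned function: w0 + sum_i w_i * f_i."""
    return weights[0] + sum(w * f for w, f in zip(weights[1:], features))


def lms_update(weights, features, v_train, c=0.1):
    """One LMS step: w_i <- w_i + c * f_i * error(b)."""
    error = v_train - v_hat(weights, features)
    updated = [weights[0] + c * error]  # w0 uses a constant feature of 1
    updated += [w + c * f * error for w, f in zip(weights[1:], features)]
    return updated


def train(examples, n_features, steps=5000, c=0.1, seed=0):
    """Repeat: pick a training example at random and apply the LMS update."""
    rng = random.Random(seed)
    weights = [0.0] * (n_features + 1)
    for _ in range(steps):
        features, v_train = rng.choice(examples)
        weights = lms_update(weights, features, v_train, c)
    return weights


# Toy training set (assumed): values follow V = 1 + 2*f1 - 3*f2 exactly.
examples = [((0, 0), 1.0), ((1, 0), 3.0), ((0, 1), -2.0), ((1, 1), 0.0)]
learned = train(examples, n_features=2)
print([round(w, 3) for w in learned])
```

Because the toy values are exactly linear in the features, the learned weights should approach 1, 2 and −3; with noisy training values the weights would only settle near the least-squares fit.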
Design Choices
[Figure: flowchart of the design choices; each stage lists its alternatives]
Determine Type of Training Experience: games against experts, games against self, table of correct moves, ...
Determine Target Function: Board → move, Board → value, ...
Determine Representation of Learned Function: polynomial, linear function of six features, artificial neural network, ...
Determine Learning Algorithm: gradient descent, linear programming, ...
Completed Design
5.
Learning a Concept from Examples: EnjoySport
Concept learning: infer boolean function from examples of input/output
Target concept: Days when Aldo enjoys his water sport

Sky    Temp  Humid   Wind    Water  Forecast  EnjoySport
Sunny  Warm  Normal  Strong  Warm   Same      Yes
Sunny  Warm  High    Strong  Warm   Same      Yes
Rainy  Cold  High    Strong  Warm   Change    No
Rainy  Warm  High    Strong  Cool   Change    Yes

What is the general concept?

Representing Hypotheses
Many possible representations
Here, h is conjunction of constraints on attributes
Each constraint can be
    a specific value (e.g., Water = Warm)
    don't care (e.g., "Water = ?")
    no value allowed (e.g., "Water = ∅")
For example,

Sky    AirTemp  Humid  Wind    Water  Forecast
<Sunny ?        ?      Strong  ?      Same>
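A hypothesis of this form can be sketched as a tuple with one constraint per attribute. The encoding is an assumption made for illustration: the string "?" stands for don't care, `None` stands for the ∅ constraint, and `matches` is a hypothetical helper name.

```python
ANY = "?"    # don't care: any value is acceptable
NONE = None  # no value allowed (the slide's ∅)


def matches(hypothesis, instance):
    """True if every constraint accepts the corresponding attribute value."""
    return all(
        c == ANY or (c is not NONE and c == v)
        for c, v in zip(hypothesis, instance)
    )


# Attribute order: Sky, AirTemp, Humid, Wind, Water, Forecast.
h = ("Sunny", ANY, ANY, "Strong", ANY, "Same")
day1 = ("Sunny", "Warm", "Normal", "Strong", "Warm", "Same")
day3 = ("Rainy", "Cold", "High", "Strong", "Warm", "Change")

print(matches(h, day1))  # True
print(matches(h, day3))  # False: Sky and Forecast do not satisfy h
```

A hypothesis containing ∅ in any position matches no instance at all, which is why it is treated separately from an ordinary value.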
Prototypical Concept Learning Task
Given:
    Instances X: Possible days, each described by the attributes Sky, AirTemp, Humidity, Wind, Water, Forecast
    Hypotheses H: Conjunctions of literals. E.g. <?, Cold, High, ?, ?, ?>.
    Target function (concept) c: EnjoySport : X → {0, 1}
    Training examples D: Positive and negative examples of the target function
        <x1, c(x1)>, ..., <xm, c(xm)>
Determine: A hypothesis h in H such that h(x) = c(x) for all x in D.

Inductive Learning Hypothesis
Any hypothesis found to approximate the target function well over a sufficiently large set of training examples will also approximate the target function well over other unobserved examples.
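The "Determine" condition of the concept learning task, h(x) = c(x) for all x in D, can be checked directly. The sketch below uses the four EnjoySport rows from the earlier table as D, with "?" as the don't-care constraint; the tuple encoding and the helper names `matches` and `consistent` are assumptions made for illustration.

```python
ANY = "?"  # don't care


def matches(hypothesis, instance):
    """h(x): True if the instance satisfies every constraint of h."""
    return all(c == ANY or c == v for c, v in zip(hypothesis, instance))


def consistent(hypothesis, examples):
    """True if h(x) == c(x) for every training example <x, c(x)> in D."""
    return all(matches(hypothesis, x) == label for x, label in examples)


# Training examples D: (Sky, Temp, Humid, Wind, Water, Forecast), EnjoySport.
D = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), True),
    (("Sunny", "Warm", "High", "Strong", "Warm", "Same"), True),
    (("Rainy", "Cold", "High", "Strong", "Warm", "Change"), False),
    (("Rainy", "Warm", "High", "Strong", "Cool", "Change"), True),
]

print(consistent((ANY, "Warm", ANY, "Strong", ANY, ANY), D))  # True
print(consistent((ANY, "Cold", "High", ANY, ANY, ANY), D))    # False
```

Consistency with D says nothing by itself about unseen days; that step relies on the inductive learning hypothesis.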
6.
Instances, Hypotheses, and More-General-Than partial order
[Figure: instances X (x1, x2) shown beside hypotheses H (h1, h2, h3), with the hypotheses ordered from specific to general]
x1 = <Sunny, Warm, High, Strong, Cool, Same>
x2 = <Sunny, Warm, High, Light, Warm, Same>
h1 = <Sunny, ?, ?, Strong, ?, ?>
h2 = <Sunny, ?, ?, ?, ?, ?>
h3 = <Sunny, ?, ?, ?, Cool, ?>

The Learning Problem
How do we search through this generally very large hypothesis space to find the best hypothesis for the task at hand?
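The more-general-than ordering among h1, h2 and h3 can be sketched as an attribute-by-attribute comparison. This is an illustrative sketch, assuming hypotheses are tuples with "?" for don't care and `None` for ∅, and that every attribute has at least two possible values; the helper names are hypothetical.

```python
ANY = "?"  # don't care


def matches(hypothesis, instance):
    """True if the instance satisfies every constraint of the hypothesis."""
    return all(c == ANY or c == v for c, v in zip(hypothesis, instance))


def more_general_or_equal(h_a, h_b):
    """True if every instance that satisfies h_b also satisfies h_a."""
    if None in h_b:
        return True  # h_b accepts no instance, so anything is at least as general
    return all(a == ANY or a == b for a, b in zip(h_a, h_b))


x1 = ("Sunny", "Warm", "High", "Strong", "Cool", "Same")
x2 = ("Sunny", "Warm", "High", "Light", "Warm", "Same")
h1 = ("Sunny", ANY, ANY, "Strong", ANY, ANY)
h2 = ("Sunny", ANY, ANY, ANY, ANY, ANY)
h3 = ("Sunny", ANY, ANY, ANY, "Cool", ANY)

print(more_general_or_equal(h2, h1))  # True
print(more_general_or_equal(h2, h3))  # True
print(more_general_or_equal(h1, h3))  # False
print(more_general_or_equal(h3, h1))  # False: h1 and h3 are not comparable
print(matches(h1, x2), matches(h2, x2), matches(h3, x2))  # False True False
```

The relation is only a partial order: h2 is more general than both h1 and h3, but neither h1 nor h3 is more general than the other.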
Limits on Representational Languages
Consider a 2-dimensional instance space, where instances are represented by (x, y). The choice of representational language affects how well we can learn.
Consider the illustrations from Hastie, Tibshirani, Friedman, The Elements of Statistical Learning.