Introduction to Machine Learning
Reminders: HW6 posted. All HW solutions posted.
A Review: You will be asked to learn the parameters of a probabilistic model of C given A, B, using some data. The parameters model entries of a conditional probability distribution as coin flips:
P(C|A,B):  ¬A,¬B: θ4   ¬A,B: θ3   A,¬B: θ2   A,B: θ1
A Review: Example data and ML estimates: θ4 = P(C|¬A,¬B) = 6/11, θ3 = P(C|¬A,B) = 4/11, θ2 = P(C|A,¬B) = 1/11, θ1 = P(C|A,B) = 3/8.
A B C  #obs
0 0 0   5
0 0 1   6
0 1 0   7
0 1 1   4
1 0 0  10
1 0 1   1
1 1 0   5
1 1 1   3
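To make the count-based estimate concrete, here is a minimal Python sketch (added for this write-up, not part of the original slides; the data encoding is illustrative) that reproduces the ML estimates above as observed frequencies:

```python
# Hypothetical sketch: ML estimates of P(C=1 | A, B) are observed frequencies.
from collections import defaultdict

# (A, B, C, count) rows from the example data table above
data = [(0, 0, 0, 5), (0, 0, 1, 6), (0, 1, 0, 7), (0, 1, 1, 4),
        (1, 0, 0, 10), (1, 0, 1, 1), (1, 1, 0, 5), (1, 1, 1, 3)]

pos = defaultdict(int)   # counts with C = 1, keyed by (A, B)
tot = defaultdict(int)   # total counts, keyed by (A, B)
for a, b, c, n in data:
    tot[(a, b)] += n
    if c == 1:
        pos[(a, b)] += n

for ab in sorted(tot):
    print(ab, f"{pos[ab]}/{tot[ab]}")   # (0,0): 6/11, (0,1): 4/11, (1,0): 1/11, (1,1): 3/8
```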
Maximum Likelihood for BNs: For any BN, the ML parameters of any CPT are obtained as the fraction of matching observed values in the data. Example (Burglar, Earthquake, Alarm network) with N = 1000 observations: E occurs 500 times and B occurs 200 times, so P(E) = 0.5 and P(B) = 0.2. For the Alarm CPT: A|E,B: 19/20; A|B: 188/200; A|E: 170/500; A|¬E,¬B: 1/380, giving
E B  P(A|E,B)
T T  0.95
T F  0.34
F T  0.95
F F  0.003
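As a hedged illustration (not from the slides; the sample format and function name are assumptions), the same frequency-counting idea extends to every CPT in the network:

```python
# Assumed sketch: estimate P(E), P(B), and P(A|E,B) from boolean samples (e, b, a).
from collections import Counter

def estimate_cpts(samples):
    n = len(samples)
    p_e = sum(e for e, _, _ in samples) / n          # P(E) = #E / N
    p_b = sum(b for _, b, _ in samples) / n          # P(B) = #B / N
    a_counts, totals = Counter(), Counter()
    for e, b, a in samples:
        totals[(e, b)] += 1
        a_counts[(e, b)] += a
    p_a = {eb: a_counts[eb] / totals[eb] for eb in totals}   # P(A=1 | E, B)
    return p_e, p_b, p_a
```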
Maximum A Posteriori: Same example data as above; MAP estimates assuming a Beta prior with a = b = 3 (virtual counts 2H, 2T): θ4 = 8/15, θ3 = 6/15, θ2 = 3/15, θ1 = 5/12.
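A minimal sketch (added, not from the slides) of the MAP rule used here: a Beta(a, b) prior contributes a-1 virtual heads and b-1 virtual tails to the counts.

```python
def map_estimate(heads, tails, a=3, b=3):
    # MAP estimate (mode of the Beta posterior): add a-1 and b-1 virtual counts
    return (heads + a - 1) / (heads + tails + (a - 1) + (b - 1))

print(map_estimate(6, 5))   # theta_4 = 8/15
print(map_estimate(3, 5))   # theta_1 = 5/12
```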
Discussion Where are we now? How does it relate to the future of AI? How will you be tested on it?
Motivation: The agent has made observations (data) and must now make sense of them (hypotheses). Hypotheses alone may be important (e.g., in basic science); they support inference (e.g., forecasting); and they let us take sensible actions (decision making). Learning is a basic component of economics, the social and hard sciences, engineering, …
Topics in Machine Learning.
Applications: document retrieval, document classification, data mining, computer vision, scientific discovery, robotics, …
Tasks & settings: classification, ranking, clustering, regression, decision-making; supervised, unsupervised, semi-supervised, active, reinforcement learning.
Techniques: Bayesian learning, decision trees, neural networks, support vector machines, boosting, case-based reasoning, dimensionality reduction, …
What is Learning? Mostly generalization from experience: "Our experience of the world is specific, yet we are able to formulate general theories that account for the past and predict the future" (M.R. Genesereth and N.J. Nilsson, Logical Foundations of AI, 1987). Concepts, heuristics, policies. Supervised vs. unsupervised learning.
Inductive Learning. Basic form: learn a function from examples. f is the unknown target function; an example is a pair (x, f(x)). Problem: find a hypothesis h such that h ≈ f, given a training set D of examples. This is an instance of supervised learning. Classification task: f takes values in {0, 1, …, C}. Regression task: f takes real values.
Inductive learning method: Construct/adjust h to agree with f on the training set (h is consistent if it agrees with f on all examples), e.g., by curve fitting. Note that h = D, i.e., simply caching the data, is a trivial but perhaps uninteresting consistent solution.
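A small illustrative sketch (assumed data, not from the slides): fitting polynomials of increasing degree to the same training set. With one coefficient per data point the fit interpolates the data exactly, the "caching" solution mentioned above.

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([0.1, 0.9, 2.2, 2.8, 4.1])          # noisy samples of an underlying f

for degree in (1, 2, len(x) - 1):
    coeffs = np.polyfit(x, y, degree)             # least-squares polynomial fit
    max_err = np.abs(np.polyval(coeffs, x) - y).max()
    print(degree, max_err)                        # degree 4 drives training error to ~0
```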
Classification Task: The concept f(x) takes on values True and False. An example is positive if f is True, else it is negative. The set X of all examples is the example set. The training set is a subset of X (a small one!).
Logic-Based Inductive Learning: Here, examples (x, f(x)) take on discrete values. Note that the training set does not say whether an observable predicate is pertinent or not.
Rewarded Card Example: Deck of cards, with each card designated by [r,s], its rank and suit, and some cards "rewarded".
Background knowledge KB:
((r=1) ∨ … ∨ (r=10)) ⇒ NUM(r)
((r=J) ∨ (r=Q) ∨ (r=K)) ⇒ FACE(r)
((s=S) ∨ (s=C)) ⇒ BLACK(s)
((s=D) ∨ (s=H)) ⇒ RED(s)
Training set D: REWARD([4,C]), REWARD([7,C]), REWARD([2,S]), ¬REWARD([5,H]), ¬REWARD([J,S])
Possible inductive hypothesis: h ≡ (NUM(r) ∧ BLACK(s) ⇔ REWARD([r,s])). There are several possible inductive hypotheses.
Learning a Logical Predicate (Concept Classifier): Set E of objects (e.g., cards). Goal predicate CONCEPT(x), where x is an object in E, that takes the value True or False (e.g., REWARD). Example: CONCEPT may describe the precondition of an action, e.g., Unstack(C,A); then E is the set of states and CONCEPT(x) ⇔ HANDEMPTY ∈ x ∧ BLOCK(C) ∈ x ∧ BLOCK(A) ∈ x ∧ CLEAR(C) ∈ x ∧ ON(C,A) ∈ x. Learning CONCEPT is a step toward learning an action description.
Learning a Logical Predicate (Concept Classifier): Set E of objects (e.g., cards). Goal predicate CONCEPT(x), where x is an object in E, that takes the value True or False (e.g., REWARD). Observable predicates A(x), B(x), … (e.g., NUM, RED). Training set: values of CONCEPT for some combinations of values of the observable predicates. Find a representation of CONCEPT in the form CONCEPT(x) ⇔ S(A,B,…), where S(A,B,…) is a sentence built with the observable predicates, e.g., CONCEPT(x) ⇔ A(x) ∧ (¬B(x) ∨ C(x)).
Hypothesis Space: A hypothesis is any sentence of the form CONCEPT(x) ⇔ S(A,B,…), where S(A,B,…) is a sentence built using the observable predicates. The set of all hypotheses is called the hypothesis space H. A hypothesis h agrees with an example if it gives the correct value of CONCEPT.
Inductive Learning Scheme (diagram): example set X = {[A, B, …, CONCEPT]} → training set D → hypothesis space H = {CONCEPT(x) ⇔ S(A,B,…)} → inductive hypothesis h.
Size of Hypothesis Space: With n observable predicates, the truth table defining CONCEPT has 2^n entries, and each entry can be filled with True or False. In the absence of any restriction (bias), there are therefore 2^(2^n) hypotheses to choose from. For n = 6, that is about 2×10^19 hypotheses!
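Quick arithmetic check (added, not in the slides):

```python
n = 6
print(2 ** (2 ** n))   # 18446744073709551616, roughly 2 x 10**19
```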
Multiple Inductive Hypotheses:
h1 ≡ NUM(r) ∧ BLACK(s) ⇔ REWARD([r,s])
h2 ≡ BLACK(s) ∧ ¬(r=J) ⇔ REWARD([r,s])
h3 ≡ ([r,s]=[4,C]) ∨ ([r,s]=[7,C]) ∨ ([r,s]=[2,S]) ⇔ REWARD([r,s])
h4 ≡ ¬([r,s]=[5,H]) ∧ ¬([r,s]=[J,S]) ⇔ REWARD([r,s])
All of these agree with all the examples in the training set (see the sketch below), so we need a system of preferences, called an inductive bias, to compare possible hypotheses.
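A minimal sketch (hypothetical encoding, not from the slides) that checks each of the four hypotheses against the five training examples:

```python
# Encode the observable predicates and the four candidate hypotheses as predicates on (rank, suit).
NUM = lambda r: r in range(1, 11)
BLACK = lambda s: s in ('S', 'C')

h1 = lambda r, s: NUM(r) and BLACK(s)
h2 = lambda r, s: BLACK(s) and r != 'J'
h3 = lambda r, s: (r, s) in [(4, 'C'), (7, 'C'), (2, 'S')]
h4 = lambda r, s: (r, s) not in [(5, 'H'), ('J', 'S')]

# Training set D: ((rank, suit), REWARD)
D = [((4, 'C'), True), ((7, 'C'), True), ((2, 'S'), True),
     ((5, 'H'), False), (('J', 'S'), False)]

for name, h in [('h1', h1), ('h2', h2), ('h3', h3), ('h4', h4)]:
    print(name, all(h(r, s) == reward for (r, s), reward in D))   # all True
```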
Notion of Capacity It refers to the ability of a machine to learn any training set without error A machine with too much capacity is like a botanist with photographic memory who, when presented with a new tree, concludes that it is not a tree because it has a different number of leaves from anything he has seen before A machine with too little capacity is like the botanist’s lazy brother, who declares that if it’s green, it’s a tree Good generalization can only be achieved when the right balance is struck between the accuracy attained on the training set and the capacity of the machine
Keep-It-Simple (KIS) Bias. Examples: use far fewer observable predicates than are available in the training set; constrain the learnt predicate, e.g., to use only "high-level" observable predicates such as NUM, FACE, BLACK, and RED, and/or to have simple syntax. Motivation: if a hypothesis is too complex it is not worth learning it (data caching does the job as well), and there are far fewer simple hypotheses than complex ones, hence the hypothesis space is smaller. Einstein: "A theory must be as simple as possible, but not simpler than this." If the bias allows only sentences S that are conjunctions of k << n predicates picked from the n observable predicates, then the size of H is O(n^k), as the sketch below illustrates.
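A tiny check of that claim (added, not from the slides; as an assumption it counts only positive conjunctions of distinct predicates):

```python
from math import comb

n, k = 6, 2
print(comb(n, k))   # 15 hypotheses under the KIS bias, versus 2**(2**6) ~ 2e19 unrestricted
```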
Relation to Bayesian Learning: Recall MAP learning: θ_MAP = argmax_θ P(D|θ) P(θ), where P(D|θ) is the likelihood (~ accuracy on the training set) and P(θ) is the prior (~ inductive bias). Equivalently, θ_MAP = argmin_θ [ -log P(D|θ) - log P(θ) ]: the first term measures accuracy on the training set, and the second is the minimum-length encoding of θ. MAP learning is thus a form of KIS bias.
Supervised Learning Flow Chart (diagram): the target concept (the unknown concept we want to approximate) generates datapoints; the observations we have seen form the training set; the learner, equipped with a hypothesis space (the choice of learning algorithm), produces an inductive hypothesis; its predictions on a test set (observations we will see in the future) give better quantities for assessing performance.
Capacity is Not the Only Criterion: Accuracy on the training set isn't the best measure of performance (diagram: learn a hypothesis from H on the training set D, then test on the example set X).
Generalization Error: A hypothesis h is said to generalize well if it achieves low error on all examples in X (diagram: learn on the training set, test on the example set X).
Assessing Performance of a Learning Algorithm: Samples from all of X are typically unavailable, so take out some of the training set, train on the remaining examples, and test on the excluded instances: cross-validation.
Cross-Validation (diagrams): split the original set of examples D into a training set and a testing set; train a hypothesis from H on the training set; evaluate the hypothesis on the testing set; and compare the true concept against the prediction (e.g., 9/13 correct).
Common Splitting Strategies: k-fold cross-validation and leave-one-out (diagram: the dataset is repeatedly partitioned into train and test portions), as sketched below.
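A minimal sketch of k-fold cross-validation (added, not from the slides; `train` and `test_error` are assumed user-supplied callables). Leave-one-out is the special case k = len(data).

```python
import random

def k_fold_cv(data, k, train, test_error):
    """data: list of examples; train(train_set) -> hypothesis;
    test_error(hypothesis, test_set) -> error rate."""
    data = list(data)
    random.shuffle(data)                      # random partition into k folds
    folds = [data[i::k] for i in range(k)]
    errors = []
    for i in range(k):
        test_set = folds[i]
        train_set = [x for j, fold in enumerate(folds) if j != i for x in fold]
        h = train(train_set)
        errors.append(test_error(h, test_set))
    return sum(errors) / k                    # average held-out error
```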
Cardinal sins of machine learning. Thou shalt not: train on examples in the testing set; nor form assumptions by "peeking" at the testing set, then formulating inductive bias.
Tennis Example: Evaluate a learning algorithm for PlayTennis = S(Temperature, Wind).
Tennis Example (split 1): trained hypothesis PlayTennis = (T = Mild or Cool) ∧ (W = Weak); training errors = 3/10, testing errors = 4/4.
Tennis Example (split 2): trained hypothesis PlayTennis = (T = Mild or Cool); training errors = 3/10, testing errors = 1/4.
Tennis Example (split 3): trained hypothesis PlayTennis = (T = Mild or Cool); training errors = 3/10, testing errors = 2/4.
Tennis Example: Sum of all testing errors across the three splits: 7/12.
How to construct a better learner? Ideas?
Next Time Decision tree learning
