Prof. Mrs. M. P. Atre
Assistant Professor,
PVGCOET
Learning
9/19/2017
AI: first 3 units
• Foundation
• Searching
• Knowledge Representation
Why is learning important?
• So far we have assumed that we know how the world works:
  • Rules of the queens puzzle
  • Rules of chess
  • Knowledge base of logical facts
  • Actions’ preconditions and effects
  • Probabilities in Bayesian networks
• At that point we “just” need to solve or optimize
• In the real world this information is often not immediately available
• AI needs to be able to learn from experience
What is learning?
• Machine learning is the study of how to build computer systems that adapt and improve with experience
• It is a subfield of Artificial Intelligence
• It intersects with
  • cognitive science,
  • information theory,
  • and probability theory, among others
Reasoning and Learning
• AI deals mainly with deductive reasoning
  • Deductive reasoning arrives at answers to queries relating to a particular situation, starting from a set of general axioms
• Learning represents inductive reasoning
  • Inductive reasoning arrives at general axioms from a set of particular instances
Deductive Vs Inductive
• Deductive reasoning (the teacher explains, gives examples, and then students practice)
  • Generalization (or Rule) → Specific Examples or Activities
• Inductive reasoning (the teacher presents students with many examples showing how the concept is used, so that students “notice” the rule)
  • Specific Examples or Activities → Generalization (or Rule)
Classical AI
• Suffers from the knowledge acquisition problem in real-life applications
• Obtaining and updating the knowledge base is costly and prone to errors
• Hence the need for Machine Learning
Machine learning serves to solve the knowledge acquisition bottleneck by obtaining the result from data by induction
Machine learning is particularly attractive because
• Some tasks cannot be defined well except by example
• The working environment of machines may not be known at design time
• Explicit knowledge encoding may be difficult, or the knowledge may not be available
• Environments change over time
• Biological systems learn
Wide range of applications where learning is used
• Data mining and knowledge discovery
• Speech/image/video (pattern) recognition
• Adaptive control
• Autonomous vehicles/robots
• Decision support systems
• Bioinformatics
• WWW
• (Data mining is the practice of examining large pre-existing databases in order to generate new information.)
Defining Learning
• Formally, a computer program is said to learn from experience E with respect to some class of tasks T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E
Thus a learning system is characterized by:
• task T
• experience E, and
• performance measure P
Example 1
• Learning to play chess
  • T: Play chess
  • P: Percentage of games won in a world tournament
  • E: Opportunity to play against itself or other players
Example 2
• Learning to drive a van
  • T: Drive on a public highway using vision sensors
  • P: Average distance traveled before an error (as judged by a human observer)
  • E: Sequence of images and steering actions recorded during human driving
Block diagram of generic learning system
So a learning system consists of
• Goal: defined with respect to the task to be performed by the system
• Model: a mathematical function that maps perceptions to actions
• Learning rules: update the model parameters with new experience so that the performance measure with respect to the goal is optimized
• Experience: a set of perceptions (and possibly the corresponding actions)
Taxonomy of Learning Systems
• Or: classification based on the block diagram above
1. Goal/Task/Target Function:
• Prediction: to predict the desired output for a given input based on previous input/output pairs.
  • E.g., to predict the value of a stock given other inputs such as the market index, interest rates, etc.
• Categorization: to classify an object into one of several categories based on features of the object.
  • E.g., a robotic vision system that categorizes a machine part into one of the categories (spanner, hammer, etc.) based on the part’s dimensions and shape.
• Clustering: to organize a group of objects into homogeneous segments. E.g., a satellite image analysis system that groups land areas into forest, urban, and water body, for better utilization of natural resources.
• Planning: to generate an optimal sequence of actions to solve a particular problem. E.g., an Unmanned Air Vehicle that plans its path to obtain a set of pictures and avoid enemy anti-aircraft guns.
2. Models
• Propositional and FOL rules
• Decision trees
• Linear separators
• Neural networks
• Graphical models
• Temporal models like hidden Markov models
3. Learning Rules
• Often tied to the model of learning used
• Some common rules:
  • gradient descent,
  • least square error,
  • expectation maximization,
  • and margin maximization
4. Experiences
• Learning algorithms use experiences, in the form of perceptions or perception-action pairs, to improve their performance
• The nature of experiences varies with the application:
  • Supervised learning
  • Unsupervised learning
  • Active learning
  • Reinforcement learning
4.1 Supervised learning:
• A teacher or oracle is available
• It provides the desired action corresponding to a perception
• A set of perception-action pairs provides a training set
• Example:
  • an automated vehicle where a set of vision inputs and the corresponding steering actions are available to the learner
4.2 Unsupervised learning:
• No teacher is available
• The learner only discovers persistent patterns in the data, which consists of a collection of perceptions
• Also called exploratory learning
• Example:
  • finding malicious network attacks from a sequence of anomalous data packets
4.3 Active learning:
• Not only is a teacher available,
• the learner also has the freedom to ask the teacher for suitable perception-action example pairs that will help it improve its performance
• Example:
  • a news recommender system that tries to learn a user’s preferences and categorize news articles as interesting or uninteresting to the user
  • The system may present a particular article (about which it is not sure) to the user and ask whether it is interesting or not
4.4 Reinforcement learning:
• A teacher is available,
• but instead of directly providing the desired action corresponding to a perception, the teacher returns reward and punishment to the learner for its action corresponding to a perception
• Example:
  • a robot in an unknown terrain that gets a punishment when it hits an obstacle and a reward when it moves smoothly
Mathematical formulation of the inductive learning problem
• Extrapolate from a given set of examples so that we can make accurate predictions about future examples.
• Supervised versus unsupervised learning
  • We want to learn an unknown function f(x) = y, where x is an input example and y is the desired output.
  • Supervised learning implies we are given a set of (x, y) pairs by a "teacher."
  • Unsupervised learning means we are only given the xs. In either case, the goal is to estimate f.
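One common way to write the supervised case down, offered as a sketch and not as the slides' own notation: the symbols D, N, H, and the estimate \hat{h}, as well as the choice of the training error rate as the fitting criterion, are assumptions made for illustration.

```latex
% Supervised case: the teacher supplies N labelled pairs
D = \{(x_i, y_i)\}_{i=1}^{N}, \qquad y_i = f(x_i)

% Estimate f by the candidate h (from an assumed set H of candidates)
% that disagrees with the fewest training pairs
\hat{h} = \arg\min_{h \in H} \frac{1}{N} \sum_{i=1}^{N} \mathbf{1}\left[ h(x_i) \neq y_i \right]
```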
Inductive Bias
• Inductive learning is an inherently conjectural process, because any knowledge created by generalization from specific facts cannot be proven true; it can only be proven false.
• Hence, inductive inference is falsity preserving, not truth preserving
• To generalize beyond the specific training examples, we need constraints or biases on which f is best.
• That is, learning can be viewed as searching the hypothesis space H of possible f functions
• A bias allows us to choose one f over another
• A completely unbiased inductive algorithm could only memorize the training examples and could not say anything about other, unseen examples
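The point about an unbiased learner can be illustrated with a rote learner. This is a minimal sketch: the class name `RoteLearner`, its methods, and the two toy training pairs are hypothetical and not taken from the slides.

```python
class RoteLearner:
    """Hypothetical learner with no inductive bias: it only memorizes."""

    def __init__(self):
        self.memory = {}

    def fit(self, examples):
        # examples: iterable of (features, label) pairs; features must be hashable
        for features, label in examples:
            self.memory[features] = label
        return self

    def predict(self, features):
        # Returns the stored label, or None when the input was never seen
        return self.memory.get(features)


# Toy data invented for this sketch
training = [
    ((1, 0, 1), "Y"),
    ((0, 1, 0), "N"),
]
learner = RoteLearner().fit(training)
seen = learner.predict((1, 0, 1))    # a memorized example
unseen = learner.predict((1, 1, 1))  # not in the training set
```

On a memorized input the learner answers correctly, but on any new input it has nothing to say, which is why some bias is needed to generalize.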
Two types of biases commonly used in ML
• Machine learning: types of biases
  • Restricted hypothesis space bias
    • Allow only certain types of f functions, not arbitrary ones
  • Preference bias
    • Define a metric for comparing fs so as to determine whether one is better than another
Inductive Learning Framework
Example
• We lend money to people
• We have to predict whether they will pay us back or not
• People have various (say, binary) features:
  • do we know their Address?
  • do they have a Criminal record?
  • high Income?
  • Educated?
  • Old?
  • Unemployed?
• We see examples (Y = paid back, N = not):
+a, -c, +i, +e, +o, +u: Y
-a, +c, -i, +e, -o, -u: N
+a, -c, +i, -e, -o, -u: Y
-a, -c, +i, +e, -o, -u: Y
-a, +c, +i, -e, -o, -u: N
-a, -c, +i, -e, -o, +u: Y
+a, -c, -i, -e, +o, -u: N
+a, +c, +i, -e, +o, -u: N
• The next person is +a, -c, +i, -e, +o, -u. Will we get paid back?
• We want some hypothesis h that predicts whether we will be paid back
+a, -c, +i, +e, +o, +u: Y
-a, +c, -i, +e, -o, -u: N
+a, -c, +i, -e, -o, -u: Y
-a, -c, +i, +e, -o, -u: Y
-a, +c, +i, -e, -o, -u: N
-a, -c, +i, -e, -o, +u: Y
+a, -c, -i, -e, +o, -u: N
+a, +c, +i, -e, +o, -u: N
• Lots of possible hypotheses: we will be paid back if…
  • Income is high (wrong on 2 occasions in the training data)
  • Income is high and no Criminal record (always right in the training data)
  • (Address is known AND ((NOT Old) OR Unemployed)) OR ((NOT Address is known) AND (NOT Criminal record)) (always right in the training data)
• Which one seems best? Anything better?
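The three hypotheses can be checked against the eight training examples mechanically. The following is a minimal sketch; the function and variable names are hypothetical, and the encoding of "+x" as True and "-x" as False is an assumption of this example.

```python
# Training examples copied from the slides: "+x" means the attribute holds,
# "-x" means it does not; the label is Y (paid back) or N (not).
RAW = [
    ("+a -c +i +e +o +u", "Y"),
    ("-a +c -i +e -o -u", "N"),
    ("+a -c +i -e -o -u", "Y"),
    ("-a -c +i +e -o -u", "Y"),
    ("-a +c +i -e -o -u", "N"),
    ("-a -c +i -e -o +u", "Y"),
    ("+a -c -i -e +o -u", "N"),
    ("+a +c +i -e +o -u", "N"),
]


def parse(text):
    """Turn "+a -c ..." into {"a": True, "c": False, ...}."""
    return {token[1]: token[0] == "+" for token in text.split()}


DATA = [(parse(text), label == "Y") for text, label in RAW]


def h_income(p):
    return p["i"]


def h_income_no_record(p):
    return p["i"] and not p["c"]


def h_complex(p):
    return (p["a"] and (not p["o"] or p["u"])) or (not p["a"] and not p["c"])


HYPOTHESES = {
    "income": h_income,
    "income_no_record": h_income_no_record,
    "complex": h_complex,
}


def training_errors(h):
    """Number of training examples on which hypothesis h is wrong."""
    return sum(1 for person, paid_back in DATA if h(person) != paid_back)


errors = {name: training_errors(h) for name, h in HYPOTHESES.items()}
next_person = parse("+a -c +i -e +o -u")
predictions = {name: h(next_person) for name, h in HYPOTHESES.items()}
```

Under this encoding, the two hypotheses that fit the training data perfectly disagree about the next person (the simpler one predicts payment, the complex one does not), so fitting the training data alone does not settle which to trust.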
Occam’s Razor
• Occam’s razor: simpler hypotheses tend to generalize to future data better
• Intuition: given limited training data,
  • it is likely that there is some complicated hypothesis that is not actually good but happens to perform well on the training data
  • it is less likely that there is a simple hypothesis that is not actually good but happens to perform well on the training data
    • There are fewer simple hypotheses
• Computational learning theory studies this in much more depth
Occam’s Razor : a problem-solving principle
• Occam’s razor (also spelled Ockham’s razor) is a principle from philosophy
• Suppose there exist two explanations for an occurrence
• In this case, the simpler one is usually better
• Another way of saying it is that the more assumptions you have to make, the more unlikely the explanation is
Decision trees
high Income?
  no: NO
  yes: Criminal record?
    yes: NO
    no: YES
Constructing a decision tree, one step at a time
address?
  no (-a):
    criminal?
      yes:
        -a, +c, -i, +e, -o, -u: N
        -a, +c, +i, -e, -o, -u: N
      no:
        -a, -c, +i, +e, -o, -u: Y
        -a, -c, +i, -e, -o, +u: Y
  yes (+a):
    criminal?
      yes:
        +a, +c, +i, -e, +o, -u: N
      no:
        income?
          yes:
            +a, -c, +i, +e, +o, +u: Y
            +a, -c, +i, -e, -o, -u: Y
          no:
            +a, -c, -i, -e, +o, -u: N
Address was maybe not the best attribute to start with…
Starting with a different attribute
criminal?
  yes:
    -a, +c, -i, +e, -o, -u: N
    -a, +c, +i, -e, -o, -u: N
    +a, +c, +i, -e, +o, -u: N
  no:
    +a, -c, +i, +e, +o, +u: Y
    +a, -c, +i, -e, -o, -u: Y
    -a, -c, +i, +e, -o, -u: Y
    -a, -c, +i, -e, -o, +u: Y
    +a, -c, -i, -e, +o, -u: N
• Seems like a much better starting point than address
  • Each node is almost completely uniform
  • Almost completely predicts whether we will be paid back
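One way to make "better starting point" concrete is to count how many training examples a single split gets wrong when each branch predicts its majority label. The slides do not name a splitting criterion, so this misclassification count is an assumption of the sketch (tree learners such as ID3 typically use information gain instead), and all names are hypothetical.

```python
from collections import Counter

# Training examples copied from the slides
RAW = [
    ("+a -c +i +e +o +u", "Y"),
    ("-a +c -i +e -o -u", "N"),
    ("+a -c +i -e -o -u", "Y"),
    ("-a -c +i +e -o -u", "Y"),
    ("-a +c +i -e -o -u", "N"),
    ("-a -c +i -e -o +u", "Y"),
    ("+a -c -i -e +o -u", "N"),
    ("+a +c +i -e +o -u", "N"),
]


def parse(text):
    """Turn "+a -c ..." into {"a": True, "c": False, ...}."""
    return {token[1]: token[0] == "+" for token in text.split()}


DATA = [(parse(text), label) for text, label in RAW]


def split_errors(attribute):
    """Examples misclassified if each branch predicts its majority label."""
    errors = 0
    for value in (True, False):
        labels = Counter(
            label for person, label in DATA if person[attribute] == value
        )
        if labels:
            # Everything that is not the majority label in this branch is an error
            errors += sum(labels.values()) - max(labels.values())
    return errors


errors_by_attribute = {attr: split_errors(attr) for attr in "acieou"}
best = min(errors_by_attribute, key=errors_by_attribute.get)
```

With this measure, splitting on address leaves four examples misclassified while splitting on criminal record leaves one, which matches the observation that address was a poor first choice.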
Hypothesis Spaces
• How many distinct decision trees are there with n Boolean attributes?
  • = number of Boolean functions
  • = number of distinct truth tables with 2^n rows
  • = 2^(2^n) distinct decision trees (counting trees that compute the same function as one)
• E.g., with 6 Boolean attributes, there are 18,446,744,073,709,551,616 trees
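The count can be reproduced in a couple of lines. This is a small sketch; the function name is hypothetical.

```python
def count_boolean_functions(n):
    """Number of distinct Boolean functions of n Boolean attributes."""
    # A truth table over n attributes has 2**n rows, and each row
    # can independently be labelled True or False.
    rows = 2 ** n
    return 2 ** rows


count_for_six = count_boolean_functions(6)
```

For n = 6 this gives 2^64, the figure quoted above, which shows how quickly the unrestricted hypothesis space outgrows any realistic training set.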
Different approach: nearest neighbor(s)
• The next person is -a, +c, -i, +e, -o, +u. Will we get paid back?
• Nearest neighbor: simply look at the most similar example in the training data and see what happened there
+a, -c, +i, +e, +o, +u: Y (distance 4)
-a, +c, -i, +e, -o, -u: N (distance 1)
+a, -c, +i, -e, -o, -u: Y (distance 5)
-a, -c, +i, +e, -o, -u: Y (distance 3)
-a, +c, +i, -e, -o, -u: N (distance 3)
-a, -c, +i, -e, -o, +u: Y (distance 3)
+a, -c, -i, -e, +o, -u: N (distance 5)
+a, +c, +i, -e, +o, -u: N (distance 5)
• The nearest neighbor is the second example, so predict N
• k nearest neighbors: look at the k nearest neighbors and take a vote
  • E.g., the 5 nearest neighbors have 3 Ys and 2 Ns, so predict Y
These nearest neighbors are
+a, -c, +i, +e, +o, +u: Y (distance 4)
-a, +c, -i, +e, -o, -u: N (distance 1)
-a, -c, +i, +e, -o, -u: Y (distance 3)
-a, +c, +i, -e, -o, -u: N (distance 3)
-a, -c, +i, -e, -o, +u: Y (distance 3)
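The distances and votes above can be reproduced with a short k-nearest-neighbor sketch. It assumes the distance is the number of attributes on which two people differ (Hamming distance), which is consistent with the distances listed; the function names and the tie-handling (ties keep training order) are assumptions of this example.

```python
from collections import Counter

# Training examples copied from the slides
RAW = [
    ("+a -c +i +e +o +u", "Y"),
    ("-a +c -i +e -o -u", "N"),
    ("+a -c +i -e -o -u", "Y"),
    ("-a -c +i +e -o -u", "Y"),
    ("-a +c +i -e -o -u", "N"),
    ("-a -c +i -e -o +u", "Y"),
    ("+a -c -i -e +o -u", "N"),
    ("+a +c +i -e +o -u", "N"),
]


def parse(text):
    """Turn "+a -c ..." into a tuple of booleans in attribute order."""
    return tuple(token[0] == "+" for token in text.split())


DATA = [(parse(text), label) for text, label in RAW]


def hamming(p, q):
    """Number of attributes on which two people differ."""
    return sum(x != y for x, y in zip(p, q))


def knn_predict(query, k):
    # sorted() is stable, so examples at equal distance keep training order
    neighbours = sorted(DATA, key=lambda example: hamming(example[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


query = parse("-a +c -i +e -o +u")
distances = [hamming(features, query) for features, _ in DATA]
one_nn = knn_predict(query, 1)
five_nn = knn_predict(query, 5)
```

Note that the answer depends on k: the single nearest neighbor says N, while the vote among five says Y.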
Another approach: perceptrons
• Place a weight on every attribute, indicating how important that attribute is (and in which direction it affects things)
• E.g., wa = 1, wc = -5, wi = 4, we = 1, wo = 0, wu = -1
+a, -c, +i, +e, +o, +u: Y (score 1+4+1+0-1 = 5)
-a, +c, -i, +e, -o, -u: N (score -5+1=-4)
+a, -c, +i, -e, -o, -u: Y (score 1+4=5)
-a, -c, +i, +e, -o, -u: Y (score 4+1=5)
-a, +c, +i, -e, -o, -u: N (score -5+4=-1)
-a, -c, +i, -e, -o, +u: Y (score 4-1=3)
+a, -c, -i, -e, +o, -u: N (score 1+0=1)
+a, +c, +i, -e, +o, -u: N (score 1-5+4+0=0)
How to calculate the score?
• wa = 1, wc = -5, wi = 4, we = 1, wo = 0, wu = -1
• The score is the sum of the weights of the attributes marked “+”
• 1) +a, -c, +i, +e, +o, +u: Y
  • The “+” attributes are a, i, e, o, u, so the score is 1+4+1+0-1 = 5
• 2) -a, +c, -i, +e, -o, -u: N
  • The “+” attributes are c, e, so the score is -5+1 = -4
• And so on
• Need to set some threshold above which we predict being paid back (say, 2)
• May care about combinations of things (nonlinearity); generalization: neural networks
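The scoring and thresholding can be sketched as follows. The weights and the threshold of 2 are the ones given in the slides; the function names are hypothetical, and no weights are learned here: the sketch only applies the fixed weights.

```python
# Weights and threshold as given in the slides
WEIGHTS = {"a": 1, "c": -5, "i": 4, "e": 1, "o": 0, "u": -1}
THRESHOLD = 2

# Training examples copied from the slides
RAW = [
    ("+a -c +i +e +o +u", "Y"),
    ("-a +c -i +e -o -u", "N"),
    ("+a -c +i -e -o -u", "Y"),
    ("-a -c +i +e -o -u", "Y"),
    ("-a +c +i -e -o -u", "N"),
    ("-a -c +i -e -o +u", "Y"),
    ("+a -c -i -e +o -u", "N"),
    ("+a +c +i -e +o -u", "N"),
]


def score(text):
    """Sum the weights of the attributes marked "+"."""
    return sum(WEIGHTS[token[1]] for token in text.split() if token[0] == "+")


def predict(text):
    """Predict Y when the score is strictly above the threshold."""
    return "Y" if score(text) > THRESHOLD else "N"


scores = [score(text) for text, _ in RAW]
predictions = [predict(text) for text, _ in RAW]
labels = [label for _, label in RAW]
```

With these weights every training example falls on the correct side of the threshold, so this particular linear rule separates the training data.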
Reinforcement learning (RL)
• Originates from Dynamic Programming (DP)
• Less exact than DP, since it uses experience to change the system’s parameters and/or structure
• There are three routes you can take to work: A, B, C
  • The times you took A, it took: 10, 60, 30 minutes
  • The times you took B, it took: 32, 31, 34 minutes
  • The one time you took C, it took 50 minutes
• What should you do next?
• Exploration vs. exploitation tradeoff
  • Exploration: try under-explored options
  • Exploitation: stick with the options that look best now
• Reinforcement learning is usually studied in MDPs**
  • Take an action, observe the reward and the new state
• **MDPs: Markov Decision Processes are a mathematical framework for modeling sequential decision problems under uncertainty, as well as reinforcement learning problems.
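The route example can be turned into a small sketch of the tradeoff. The travel times are the ones from the slides; the epsilon-greedy rule used here is one simple way to mix exploration and exploitation and is an assumption of this example, not a method named in the slides. The function names and the value of epsilon are hypothetical.

```python
import random
from statistics import mean

# Travel times (minutes) observed so far, as given in the slides
observed = {"A": [10, 60, 30], "B": [32, 31, 34], "C": [50]}


def greedy_route(times):
    """Exploitation: pick the route with the lowest average time so far."""
    return min(times, key=lambda route: mean(times[route]))


def epsilon_greedy_route(times, epsilon, rng):
    """With probability epsilon explore a random route, otherwise exploit."""
    if rng.random() < epsilon:
        return rng.choice(sorted(times))
    return greedy_route(times)


averages = {route: mean(t) for route, t in observed.items()}
exploit = greedy_route(observed)
choice = epsilon_greedy_route(observed, epsilon=0.1, rng=random.Random(0))
```

Pure exploitation picks B, whose average is slightly lower than A's, even though A has produced the single fastest trip and C has only been tried once; occasional exploration is what lets those estimates improve.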
Bayesian approach to learning
• Assume we have a prior distribution over the long-term behavior of A
  • With probability .6, A is a “fast route” which:
    • With prob. .25, takes 20 minutes
    • With prob. .5, takes 30 minutes
    • With prob. .25, takes 40 minutes
  • With probability .4, A is a “slow route” which:
    • With prob. .25, takes 30 minutes
    • With prob. .5, takes 40 minutes
    • With prob. .25, takes 50 minutes
• We travel on A once and see it takes 30 minutes
• P(A is fast | observation) = P(observation | A is fast) * P(A is fast) / P(observation) = .5*.6 / (.5*.6 + .25*.4) = .3 / (.3 + .1) = .75
• Convenient approach for decision theory, game theory
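The Bayesian update for route A can be written as a short sketch. The prior and the travel-time probabilities are the ones from the slides; the dictionary layout and the function name `posterior` are hypothetical.

```python
# Prior over what kind of route A is, and the chance of each travel time
# (in minutes) for each kind, as given in the slides
PRIOR = {"fast": 0.6, "slow": 0.4}
LIKELIHOOD = {
    "fast": {20: 0.25, 30: 0.5, 40: 0.25},
    "slow": {30: 0.25, 40: 0.5, 50: 0.25},
}


def posterior(observed_minutes):
    """Apply Bayes' rule after observing a single trip on route A."""
    joint = {
        kind: PRIOR[kind] * LIKELIHOOD[kind].get(observed_minutes, 0.0)
        for kind in PRIOR
    }
    evidence = sum(joint.values())  # P(observation)
    if evidence == 0:
        raise ValueError("observation has zero probability under the model")
    return {kind: value / evidence for kind, value in joint.items()}


after_30 = posterior(30)
```

A 30-minute trip is possible under both kinds of route but more likely under the fast one, so the belief that A is fast rises from .6 to .75.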
Thank you

More Related Content

What's hot

Chapter 9 morphological image processing
Chapter 9   morphological image processingChapter 9   morphological image processing
Chapter 9 morphological image processing
Ahmed Daoud
 
I. AO* SEARCH ALGORITHM
I. AO* SEARCH ALGORITHMI. AO* SEARCH ALGORITHM
I. AO* SEARCH ALGORITHM
vikas dhakane
 
Arithmetic coding
Arithmetic codingArithmetic coding
Arithmetic coding
Gidey Leul
 
Real time Operating System
Real time Operating SystemReal time Operating System
Real time Operating SystemTech_MX
 
AI Lecture 7 (uncertainty)
AI Lecture 7 (uncertainty)AI Lecture 7 (uncertainty)
AI Lecture 7 (uncertainty)
Tajim Md. Niamat Ullah Akhund
 
Digital Image Processing - Image Compression
Digital Image Processing - Image CompressionDigital Image Processing - Image Compression
Digital Image Processing - Image Compression
Mathankumar S
 
Pentium microprocessor
Pentium microprocessorPentium microprocessor
Pentium microprocessor
tanzidshawon
 
Arithmetic coding
Arithmetic codingArithmetic coding
Arithmetic coding
Vikas Goyal
 
Digital Image Processing: Image Segmentation
Digital Image Processing: Image SegmentationDigital Image Processing: Image Segmentation
Digital Image Processing: Image Segmentation
Mostafa G. M. Mostafa
 
Mobile computing unit2,SDMA,FDMA,CDMA,TDMA Space Division Multi Access,Frequ...
Mobile computing unit2,SDMA,FDMA,CDMA,TDMA  Space Division Multi Access,Frequ...Mobile computing unit2,SDMA,FDMA,CDMA,TDMA  Space Division Multi Access,Frequ...
Mobile computing unit2,SDMA,FDMA,CDMA,TDMA Space Division Multi Access,Frequ...
Pallepati Vasavi
 
4 greedy methodnew
4 greedy methodnew4 greedy methodnew
4 greedy methodnewabhinav108
 
Instruction Set Architecture
Instruction Set ArchitectureInstruction Set Architecture
Instruction Set Architecture
Dilum Bandara
 
Difference among 8085,8086,80186,80286,80386 Microprocessor.pdf
Difference among 8085,8086,80186,80286,80386 Microprocessor.pdfDifference among 8085,8086,80186,80286,80386 Microprocessor.pdf
Difference among 8085,8086,80186,80286,80386 Microprocessor.pdf
Mahbubay Rabbani Mim
 
Introduction to Dynamic Programming, Principle of Optimality
Introduction to Dynamic Programming, Principle of OptimalityIntroduction to Dynamic Programming, Principle of Optimality
Introduction to Dynamic Programming, Principle of Optimality
Bhavin Darji
 
Types of Addressing modes- COA
Types of Addressing modes- COATypes of Addressing modes- COA
Types of Addressing modes- COA
Ruchi Maurya
 
Chapter 5 Image Processing: Fourier Transformation
Chapter 5 Image Processing: Fourier TransformationChapter 5 Image Processing: Fourier Transformation
Chapter 5 Image Processing: Fourier Transformation
Varun Ojha
 
Real Time OS For Embedded Systems
Real Time OS For Embedded SystemsReal Time OS For Embedded Systems
Real Time OS For Embedded SystemsHimanshu Ghetia
 
Lecture 1 mobile and adhoc network- introduction
Lecture 1  mobile and adhoc network- introductionLecture 1  mobile and adhoc network- introduction
Lecture 1 mobile and adhoc network- introductionChandra Meena
 
Propagation mechanisms
Propagation mechanismsPropagation mechanisms
Propagation mechanisms
METHODIST COLLEGE OF ENGG & TECH
 
Heuristic Search in Artificial Intelligence | Heuristic Function in AI | Admi...
Heuristic Search in Artificial Intelligence | Heuristic Function in AI | Admi...Heuristic Search in Artificial Intelligence | Heuristic Function in AI | Admi...
Heuristic Search in Artificial Intelligence | Heuristic Function in AI | Admi...
RahulSharma4566
 

What's hot (20)

Chapter 9 morphological image processing
Chapter 9   morphological image processingChapter 9   morphological image processing
Chapter 9 morphological image processing
 
I. AO* SEARCH ALGORITHM
I. AO* SEARCH ALGORITHMI. AO* SEARCH ALGORITHM
I. AO* SEARCH ALGORITHM
 
Arithmetic coding
Arithmetic codingArithmetic coding
Arithmetic coding
 
Real time Operating System
Real time Operating SystemReal time Operating System
Real time Operating System
 
AI Lecture 7 (uncertainty)
AI Lecture 7 (uncertainty)AI Lecture 7 (uncertainty)
AI Lecture 7 (uncertainty)
 
Digital Image Processing - Image Compression
Digital Image Processing - Image CompressionDigital Image Processing - Image Compression
Digital Image Processing - Image Compression
 
Pentium microprocessor
Pentium microprocessorPentium microprocessor
Pentium microprocessor
 
Arithmetic coding
Arithmetic codingArithmetic coding
Arithmetic coding
 
Digital Image Processing: Image Segmentation
Digital Image Processing: Image SegmentationDigital Image Processing: Image Segmentation
Digital Image Processing: Image Segmentation
 
Mobile computing unit2,SDMA,FDMA,CDMA,TDMA Space Division Multi Access,Frequ...
Mobile computing unit2,SDMA,FDMA,CDMA,TDMA  Space Division Multi Access,Frequ...Mobile computing unit2,SDMA,FDMA,CDMA,TDMA  Space Division Multi Access,Frequ...
Mobile computing unit2,SDMA,FDMA,CDMA,TDMA Space Division Multi Access,Frequ...
 
4 greedy methodnew
4 greedy methodnew4 greedy methodnew
4 greedy methodnew
 
Instruction Set Architecture
Instruction Set ArchitectureInstruction Set Architecture
Instruction Set Architecture
 
Difference among 8085,8086,80186,80286,80386 Microprocessor.pdf
Difference among 8085,8086,80186,80286,80386 Microprocessor.pdfDifference among 8085,8086,80186,80286,80386 Microprocessor.pdf
Difference among 8085,8086,80186,80286,80386 Microprocessor.pdf
 
Introduction to Dynamic Programming, Principle of Optimality
Introduction to Dynamic Programming, Principle of OptimalityIntroduction to Dynamic Programming, Principle of Optimality
Introduction to Dynamic Programming, Principle of Optimality
 
Types of Addressing modes- COA
Types of Addressing modes- COATypes of Addressing modes- COA
Types of Addressing modes- COA
 
Chapter 5 Image Processing: Fourier Transformation
Chapter 5 Image Processing: Fourier TransformationChapter 5 Image Processing: Fourier Transformation
Chapter 5 Image Processing: Fourier Transformation
 
Real Time OS For Embedded Systems
Real Time OS For Embedded SystemsReal Time OS For Embedded Systems
Real Time OS For Embedded Systems
 
Lecture 1 mobile and adhoc network- introduction
Lecture 1  mobile and adhoc network- introductionLecture 1  mobile and adhoc network- introduction
Lecture 1 mobile and adhoc network- introduction
 
Propagation mechanisms
Propagation mechanismsPropagation mechanisms
Propagation mechanisms
 
Heuristic Search in Artificial Intelligence | Heuristic Function in AI | Admi...
Heuristic Search in Artificial Intelligence | Heuristic Function in AI | Admi...Heuristic Search in Artificial Intelligence | Heuristic Function in AI | Admi...
Heuristic Search in Artificial Intelligence | Heuristic Function in AI | Admi...
 

Similar to Learning occam razor

Chapter 6 - Learning data and analytics course
Chapter 6 - Learning data and analytics courseChapter 6 - Learning data and analytics course
Chapter 6 - Learning data and analytics course
gideymichael
 
PREDICT 422 - Module 1.pptx
PREDICT 422 - Module 1.pptxPREDICT 422 - Module 1.pptx
PREDICT 422 - Module 1.pptx
VikramKumar790542
 
ML Unit 1 CS.ppt
ML Unit 1 CS.pptML Unit 1 CS.ppt
ML Unit 1 CS.ppt
AAKANKSHAAGRAWALPC22
 
Machine Learning an Research Overview
Machine Learning an Research OverviewMachine Learning an Research Overview
Machine Learning an Research Overview
Kathirvel Ayyaswamy
 
Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401butest
 
Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401butest
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine Learning
Panimalar Engineering College
 
Incorporating Prior Domain Knowledge Into Inductive Machine ...
Incorporating Prior Domain Knowledge Into Inductive Machine ...Incorporating Prior Domain Knowledge Into Inductive Machine ...
Incorporating Prior Domain Knowledge Into Inductive Machine ...butest
 
Comparative Analysis: Effective Information Retrieval Using Different Learnin...
Comparative Analysis: Effective Information Retrieval Using Different Learnin...Comparative Analysis: Effective Information Retrieval Using Different Learnin...
Comparative Analysis: Effective Information Retrieval Using Different Learnin...
RSIS International
 
Machine learning ppt unit one syllabuspptx
Machine learning ppt unit one syllabuspptxMachine learning ppt unit one syllabuspptx
Machine learning ppt unit one syllabuspptx
VenkateswaraBabuRavi
 
深度学习639页PPT/////////////////////////////
深度学习639页PPT/////////////////////////////深度学习639页PPT/////////////////////////////
深度学习639页PPT/////////////////////////////
alicejiang7888
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine Learning
Dr. Radhey Shyam
 
3171617_introduction_applied machine learning.pptx
3171617_introduction_applied machine learning.pptx3171617_introduction_applied machine learning.pptx
3171617_introduction_applied machine learning.pptx
jainyshah20
 
Artificial intelligence: Simulation of Intelligence
Artificial intelligence: Simulation of IntelligenceArtificial intelligence: Simulation of Intelligence
Artificial intelligence: Simulation of Intelligence
Abhishek Upadhyay
 
Introduction to machine learning-2023-IT-AI and DS.pdf
Introduction to machine learning-2023-IT-AI and DS.pdfIntroduction to machine learning-2023-IT-AI and DS.pdf
Introduction to machine learning-2023-IT-AI and DS.pdf
SisayNegash4
 
Machine Learning and Inductive Inference
Machine Learning and Inductive InferenceMachine Learning and Inductive Inference
Machine Learning and Inductive Inferencebutest
 
Induction and Decision Tree Learning (Part 1)
Induction and Decision Tree Learning (Part 1)Induction and Decision Tree Learning (Part 1)
Induction and Decision Tree Learning (Part 1)butest
 
ML-Chapter_one.pptx
ML-Chapter_one.pptxML-Chapter_one.pptx
ML-Chapter_one.pptx
belay41
 
slides
slidesslides
slidesbutest
 

Similar to Learning occam razor (20)

Chapter 6 - Learning data and analytics course
Chapter 6 - Learning data and analytics courseChapter 6 - Learning data and analytics course
Chapter 6 - Learning data and analytics course
 
PREDICT 422 - Module 1.pptx
PREDICT 422 - Module 1.pptxPREDICT 422 - Module 1.pptx
PREDICT 422 - Module 1.pptx
 
ML Unit 1 CS.ppt
ML Unit 1 CS.pptML Unit 1 CS.ppt
ML Unit 1 CS.ppt
 
Machine Learning an Research Overview
Machine Learning an Research OverviewMachine Learning an Research Overview
Machine Learning an Research Overview
 
Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401
 
Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine Learning
 
Incorporating Prior Domain Knowledge Into Inductive Machine ...
Incorporating Prior Domain Knowledge Into Inductive Machine ...Incorporating Prior Domain Knowledge Into Inductive Machine ...
Incorporating Prior Domain Knowledge Into Inductive Machine ...
 
Comparative Analysis: Effective Information Retrieval Using Different Learnin...
Comparative Analysis: Effective Information Retrieval Using Different Learnin...Comparative Analysis: Effective Information Retrieval Using Different Learnin...
Comparative Analysis: Effective Information Retrieval Using Different Learnin...
 
Machine learning ppt unit one syllabuspptx
Machine learning ppt unit one syllabuspptxMachine learning ppt unit one syllabuspptx
Machine learning ppt unit one syllabuspptx
 
深度学习639页PPT/////////////////////////////
深度学习639页PPT/////////////////////////////深度学习639页PPT/////////////////////////////
深度学习639页PPT/////////////////////////////
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine Learning
 
Eric Smidth
Eric SmidthEric Smidth
Eric Smidth
 
3171617_introduction_applied machine learning.pptx
3171617_introduction_applied machine learning.pptx3171617_introduction_applied machine learning.pptx
3171617_introduction_applied machine learning.pptx
 
Artificial intelligence: Simulation of Intelligence
Artificial intelligence: Simulation of IntelligenceArtificial intelligence: Simulation of Intelligence
Artificial intelligence: Simulation of Intelligence
 
Introduction to machine learning-2023-IT-AI and DS.pdf
Introduction to machine learning-2023-IT-AI and DS.pdfIntroduction to machine learning-2023-IT-AI and DS.pdf
Introduction to machine learning-2023-IT-AI and DS.pdf
 
Machine Learning and Inductive Inference
Machine Learning and Inductive InferenceMachine Learning and Inductive Inference
Machine Learning and Inductive Inference
 
Induction and Decision Tree Learning (Part 1)
Induction and Decision Tree Learning (Part 1)Induction and Decision Tree Learning (Part 1)
Induction and Decision Tree Learning (Part 1)
 
ML-Chapter_one.pptx
ML-Chapter_one.pptxML-Chapter_one.pptx
ML-Chapter_one.pptx
 
slides
slidesslides
slides
 

More from Minakshi Atre

Part1 speech basics
Part1 speech basicsPart1 speech basics
Part1 speech basics
Minakshi Atre
 
Signals&Systems: Quick pointers to Fundamentals
Signals&Systems: Quick pointers to FundamentalsSignals&Systems: Quick pointers to Fundamentals
Signals&Systems: Quick pointers to Fundamentals
Minakshi Atre
 
Unit 4 Statistical Learning Methods: EM algorithm
Unit 4 Statistical Learning Methods: EM algorithmUnit 4 Statistical Learning Methods: EM algorithm
Learning occam razor

  • 2. AI: first 3 units (9/19/2017)
      - Foundation
      - Searching
      - Knowledge Representation
  • 3. Why is learning important?
      - So far we have assumed we know how the world works:
          - Rules of the queens puzzle
          - Rules of chess
          - Knowledge base of logical facts
          - Actions' preconditions and effects
          - Probabilities in Bayesian networks
  • 4.
      - At that point we "just" need to solve/optimize
      - In the real world this information is often not immediately available
      - AI needs to be able to learn from experience
  • 5. What is learning
      - Machine Learning is the study of how to build computer systems that adapt and improve with experience
      - It is a subfield of Artificial Intelligence
      - It intersects with cognitive science, information theory, and probability theory, among others
  • 6. Reasoning and Learning
      - AI deals mainly with deductive reasoning
          - Deductive reasoning arrives at answers to queries relating to a particular situation, starting from a set of general axioms
      - Learning represents inductive reasoning
          - Inductive reasoning arrives at general axioms from a set of particular instances
  • 7. Deductive vs. Inductive
      - Deductive reasoning (the teacher explains, gives examples, and then students practice)
          - Generalization (or rule) → specific examples or activities
      - Inductive reasoning (the teacher presents students with many examples showing how the concept is used, so that students "notice" the rule)
          - Specific examples or activities → generalization (or rule)
  • 8. Classical AI
      - Suffers from the knowledge acquisition problem in real-life applications
      - Obtaining and updating the knowledge base is costly and prone to errors
      - Hence the need for Machine Learning
  • 9. Machine learning serves to solve the knowledge acquisition bottleneck by obtaining the result from data by induction
  • 10. Machine learning is particularly attractive because
      - Some tasks cannot be defined well except by example
      - The working environment of machines may not be known at design time
      - Explicit knowledge encoding may be difficult or unavailable
      - Environments change over time
      - Biological systems learn
  • 11. Wide applications where learning is used
      - Data mining and knowledge discovery
      - Speech/image/video (pattern) recognition
      - Adaptive control
      - Autonomous vehicles/robots
      - Decision support systems
      - Bioinformatics
      - WWW
      - (Data mining is the practice of examining large pre-existing databases in order to generate new information)
  • 12. Defining Learning
      - Formally, a computer program is said to learn from experience E with respect to some class of tasks T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E
  • 13. Thus a learning system is characterized by:
      - task T
      - experience E
      - performance measure P
  • 14. Example 1: Learning to play chess
      - T: Play chess
      - P: Percentage of games won in world tournament
      - E: Opportunity to play against self or other players
  • 15. Example 2: Learning to drive a van
      - T: Drive on a public highway using vision sensors
      - P: Average distance traveled before an error (according to a human observer)
      - E: Sequence of images and steering actions recorded during human driving
  • 16. Block diagram of generic learning system
  • 17. So a learning system consists of
      - Goal: defined with respect to the task to be performed by the system
      - Model: a mathematical function which maps perceptions to actions
      - Learning rules: update the model parameters with new experience so that the performance measure with respect to the goal is optimized
      - Experience: a set of perceptions (and possibly the corresponding actions)
  • 18. Taxonomy of Learning Systems
      - Or: classification based on the above block diagram
  • 19. 1. Goal/Task/Target Function
      - Prediction: to predict the desired output for a given input based on previous input/output pairs
          - E.g., to predict the value of a stock given other inputs like market index, interest rates, etc.
      - Categorization: to classify an object into one of several categories based on features of the object
          - E.g., a robotic vision system that categorizes a machine part into one of the categories spanner, hammer, etc., based on the part's dimensions and shape
  • 20.
      - Clustering: to organize a group of objects into homogeneous segments
          - E.g., a satellite image analysis system which groups land areas into forest, urban and water body, for better utilization of natural resources
      - Planning: to generate an optimal sequence of actions to solve a particular problem
          - E.g., an Unmanned Air Vehicle which plans its path to obtain a set of pictures and avoid enemy anti-aircraft guns
  • 21. 2. Models
      - Propositional and FOL rules
      - Decision trees
      - Linear separators
      - Neural networks
      - Graphical models
      - Temporal models like hidden Markov models
  • 22. 3. Learning Rules
      - Often tied to the model of learning used
      - Some common rules:
          - gradient descent
          - least square error
          - expectation maximization
          - margin maximization
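To make two of the rules named on slide 22 concrete, here is a minimal sketch that combines them: gradient descent used to minimize a least-squares error. The model (a straight line `w * x + b`), the function name `fit_line`, the learning rate, the step count and the toy data are all assumptions chosen for illustration; none of them come from the slides.

```python
def fit_line(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y ~ w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Residuals of the current model on every training pair.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Gradient of (1/n) * sum(error^2) with respect to w and b.
        grad_w = (2.0 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_b = (2.0 / n) * sum(errors)
        # Step against the gradient to reduce the error.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Hypothetical toy data generated from y = 2x + 1.
w, b = fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
print(round(w, 3), round(b, 3))
```

The learning rate has to be small enough for the updates to converge; the value used here was picked to suit this toy data only.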
  • 23. 4. Experiences
      - Learning algorithms use experiences in the form of perceptions or perception-action pairs to improve their performance
      - The nature of the experiences varies with the application:
          - Supervised learning
          - Unsupervised learning
          - Active learning
          - Reinforcement learning
  • 24. 4.1 Supervised learning
      - A teacher or oracle is available
      - It provides the desired action corresponding to a perception
      - A set of perception-action pairs provides a training set
      - Example: an automated vehicle where a set of vision inputs and the corresponding steering actions are available to the learner
  • 25. 4.2 Unsupervised learning
      - No teacher is available
      - The learner only discovers persistent patterns in data consisting of a collection of perceptions
      - Also called exploratory learning
      - Example: finding malicious network attacks in a sequence of anomalous data packets
  • 26. 4.3 Active learning
      - Not only is a teacher available, but the learner also has the freedom to ask the teacher for suitable perception-action example pairs which will help the learner improve its performance
      - Example: a news recommender system which tries to learn a user's preferences and categorize news articles as interesting or uninteresting to the user
          - The system may present a particular article (about which it is not sure) to the user and ask whether it is interesting or not
  • 27. 4.4 Reinforcement learning
      - A teacher is available, but instead of directly providing the desired action corresponding to a perception, the teacher returns reward and punishment to the learner for its action corresponding to a perception
      - Example: a robot in an unknown terrain, where it gets a punishment when it hits an obstacle and a reward when it moves smoothly
  • 28. Mathematical formulation of the inductive learning problem
      - Extrapolate from a given set of examples so that we can make accurate predictions about future examples
      - Supervised versus unsupervised learning:
          - We want to learn an unknown function f(x) = y, where x is an input example and y is the desired output
          - Supervised learning implies we are given a set of (x, y) pairs by a "teacher"
          - Unsupervised learning means we are only given the xs
          - In either case, the goal is to estimate f
  • 29. Inductive Bias
      - Inductive learning is an inherently conjectural process, because any knowledge created by generalization from specific facts cannot be proven true; it can only be proven false
      - Hence, inductive inference is falsity preserving, not truth preserving
  • 30.
      - To generalize beyond the specific training examples, we need constraints or biases on what f is best
      - That is, learning can be viewed as searching the hypothesis space H of possible f functions
  • 31.
      - A bias allows us to choose one f over another
      - A completely unbiased inductive algorithm could only memorize the training examples and could not say anything about other, unseen examples
  • 32. Two types of biases commonly used in ML
      - Restricted hypothesis space bias
          - Allow only certain types of f functions, not arbitrary ones
      - Preference bias
          - Define a metric for comparing fs so as to determine whether one is better than another
  • 35.
      - We lend money to people
      - We have to predict whether they will pay us back or not
      - People have various (say, binary) features:
          - do we know their Address?
          - do they have a Criminal record?
          - high Income?
          - Educated?
          - Old?
          - Unemployed?
  • 36.
      - We see examples (Y = paid back, N = not):
          +a, -c, +i, +e, +o, +u: Y
          -a, +c, -i, +e, -o, -u: N
          +a, -c, +i, -e, -o, -u: Y
          -a, -c, +i, +e, -o, -u: Y
          -a, +c, +i, -e, -o, -u: N
          -a, -c, +i, -e, -o, +u: Y
          +a, -c, -i, -e, +o, -u: N
          +a, +c, +i, -e, +o, -u: N
      - Next person is +a, -c, +i, -e, +o, -u. Will we get paid back?
  • 37.
      - We want some hypothesis h that predicts whether we will be paid back (same eight examples as on slide 36)
  • 38.
      - Lots of possible hypotheses: we will be paid back if…
          - Income is high (wrong on 2 occasions in the training data)
          - Income is high AND no Criminal record (always right in the training data)
          - (Address is known AND ((NOT Old) OR Unemployed)) OR ((NOT Address is known) AND (NOT Criminal record)) (always right in the training data)
      - Which one seems best? Anything better?
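The error counts quoted for the three hypotheses can be checked mechanically. The sketch below encodes the eight training examples and the three rules from the slides; the helper names (`parse`, `training_errors`) and the short labels for the hypotheses are illustrative assumptions.

```python
# The eight training examples: signs for a, c, i, e, o, u, then the label.
RAW = """
+a -c +i +e +o +u Y
-a +c -i +e -o -u N
+a -c +i -e -o -u Y
-a -c +i +e -o -u Y
-a +c +i -e -o -u N
-a -c +i -e -o +u Y
+a -c -i -e +o -u N
+a +c +i -e +o -u N
"""


def parse(line):
    """Turn '+a -c ... Y' into ({'a': True, 'c': False, ...}, True)."""
    *features, label = line.split()
    person = {f[1]: f[0] == "+" for f in features}
    return person, label == "Y"


TRAINING = [parse(line) for line in RAW.strip().splitlines()]

# The three candidate hypotheses, each mapping a person to "will pay back".
HYPOTHESES = {
    "income": lambda p: p["i"],
    "income and no criminal record": lambda p: p["i"] and not p["c"],
    "long rule": lambda p: (p["a"] and (not p["o"] or p["u"]))
    or (not p["a"] and not p["c"]),
}


def training_errors(hypothesis):
    """Count training examples on which the hypothesis is wrong."""
    return sum(hypothesis(person) != paid for person, paid in TRAINING)


errors = {name: training_errors(h) for name, h in HYPOTHESES.items()}

# The new applicant; the label is unknown, so the parsed label is ignored.
next_person, _ = parse("+a -c +i -e +o -u ?")
predictions = {name: h(next_person) for name, h in HYPOTHESES.items()}

print(errors)
print(predictions)
```

Running it also shows that the two hypotheses with zero training errors disagree about the next person, which is why some preference between them is needed.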
  • 39. Occam's Razor
      - Occam's razor: simpler hypotheses tend to generalize to future data better
      - Intuition: given limited training data,
          - it is likely that there is some complicated hypothesis that is not actually good but that happens to perform well on the training data
          - it is less likely that there is a simple hypothesis that is not actually good but that happens to perform well on the training data
              - There are fewer simple hypotheses
      - Computational learning theory studies this in much more depth
  • 40. Occam's Razor: a problem-solving principle
      - Occam's razor (also spelled Ockham's razor) is a principle from philosophy
      - Suppose there exist two explanations for an occurrence
      - In this case, the simpler one is usually better
      - Another way of saying it is that the more assumptions you have to make, the more unlikely the explanation is
  • 41. Decision trees
      - high Income?
          - no: predict NO
          - yes: Criminal record?
              - yes: predict NO
              - no: predict YES
  • 42. Constructing a decision tree, one step at a time
      - Root: all eight training examples, split on address?
          - address = no (2 Y, 2 N), split on criminal?
              - criminal = yes:
                  -a, +c, -i, +e, -o, -u: N
                  -a, +c, +i, -e, -o, -u: N
              - criminal = no:
                  -a, -c, +i, +e, -o, -u: Y
                  -a, -c, +i, -e, -o, +u: Y
          - address = yes (2 Y, 2 N), split on criminal?
              - criminal = yes:
                  +a, +c, +i, -e, +o, -u: N
              - criminal = no (2 Y, 1 N), split on income?
                  - income = yes:
                      +a, -c, +i, +e, +o, +u: Y
                      +a, -c, +i, -e, -o, -u: Y
                  - income = no:
                      +a, -c, -i, -e, +o, -u: N
      - Address was maybe not the best attribute to start with…
  • 43. Starting with a different attribute
      - Root: all eight training examples, split on criminal?
          - criminal = yes:
              -a, +c, -i, +e, -o, -u: N
              -a, +c, +i, -e, -o, -u: N
              +a, +c, +i, -e, +o, -u: N
          - criminal = no:
              +a, -c, +i, +e, +o, +u: Y
              +a, -c, +i, -e, -o, -u: Y
              -a, -c, +i, +e, -o, -u: Y
              -a, -c, +i, -e, -o, +u: Y
              +a, -c, -i, -e, +o, -u: N
      - Seems like a much better starting point than address
          - Each node is almost completely uniform
          - Almost completely predicts whether we will be paid back
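One simple way to compare starting attributes is to count how many training examples a single split would get wrong if each branch predicted its majority label. This scoring rule is an assumption made for illustration: the slides only compare the splits visually, and tree learners such as ID3 use information gain instead. The helper names are hypothetical.

```python
from collections import Counter

# The eight training examples: signs for a, c, i, e, o, u, then the label.
RAW = """
+a -c +i +e +o +u Y
-a +c -i +e -o -u N
+a -c +i -e -o -u Y
-a -c +i +e -o -u Y
-a +c +i -e -o -u N
-a -c +i -e -o +u Y
+a -c -i -e +o -u N
+a +c +i -e +o -u N
"""


def parse(line):
    """Turn '+a -c ... Y' into ({'a': True, 'c': False, ...}, 'Y')."""
    *features, label = line.split()
    return {f[1]: f[0] == "+" for f in features}, label


TRAINING = [parse(line) for line in RAW.strip().splitlines()]


def split_errors(attribute):
    """Errors made by one split on `attribute` with majority-vote leaves."""
    errors = 0
    for value in (True, False):
        labels = [label for person, label in TRAINING if person[attribute] == value]
        if labels:
            majority_count = Counter(labels).most_common(1)[0][1]
            errors += len(labels) - majority_count
    return errors


errors_by_attribute = {attr: split_errors(attr) for attr in "acieou"}
best = min(errors_by_attribute, key=errors_by_attribute.get)
print(errors_by_attribute, best)
```

Under this measure, address leaves both branches evenly mixed while criminal record leaves a single mislabeled example, which matches the observation on slide 43.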
  • 44. Hypothesis Spaces
      - How many distinct decision trees are there with n Boolean attributes?
          - = number of Boolean functions
          - = number of distinct truth tables with 2^n rows
          - = 2^(2^n) distinct decision trees
      - E.g., with 6 Boolean attributes, there are 18,446,744,073,709,551,616 trees
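The count on slide 44 is easy to reproduce, since Python integers have arbitrary precision. The function name below is an illustrative assumption.

```python
def count_boolean_functions(n):
    """Number of distinct Boolean functions of n Boolean attributes: 2^(2^n)."""
    rows = 2 ** n      # one truth-table row per combination of attribute values
    return 2 ** rows   # each row can independently be labeled True or False


for n in range(1, 7):
    print(n, f"{count_boolean_functions(n):,}")
```

The count grows doubly exponentially with n, which is why an unrestricted hypothesis space cannot be searched exhaustively even for a handful of attributes.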
  • 45. Different approach: nearest neighbor(s)
      - Next person is -a, +c, -i, +e, -o, +u. Will we get paid back?
      - Nearest neighbor: simply look at the most similar example in the training data and see what happened there
          +a, -c, +i, +e, +o, +u: Y (distance 4)
          -a, +c, -i, +e, -o, -u: N (distance 1)
          +a, -c, +i, -e, -o, -u: Y (distance 5)
          -a, -c, +i, +e, -o, -u: Y (distance 3)
          -a, +c, +i, -e, -o, -u: N (distance 3)
          -a, -c, +i, -e, -o, +u: Y (distance 3)
          +a, -c, -i, -e, +o, -u: N (distance 5)
          +a, +c, +i, -e, +o, -u: N (distance 5)
  • 46.
      - The nearest neighbor is the second example, so predict N
      - k nearest neighbors: look at the k nearest neighbors and take a vote
      - E.g., the 5 nearest neighbors have 3 Ys and 2 Ns, so predict Y. These nearest neighbors are:
          +a, -c, +i, +e, +o, +u: Y (distance 4)
          -a, +c, -i, +e, -o, -u: N (distance 1)
          -a, -c, +i, +e, -o, -u: Y (distance 3)
          -a, +c, +i, -e, -o, -u: N (distance 3)
          -a, -c, +i, -e, -o, +u: Y (distance 3)
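The distances on slides 45 and 46 are the number of attributes on which two people differ (the Hamming distance). A minimal sketch of the k-nearest-neighbor vote follows; the function names are assumptions, and ties in distance are broken by the order of the training examples, which the slides do not specify.

```python
from collections import Counter

# The eight training examples: signs for a, c, i, e, o, u, then the label.
RAW = """
+a -c +i +e +o +u Y
-a +c -i +e -o -u N
+a -c +i -e -o -u Y
-a -c +i +e -o -u Y
-a +c +i -e -o -u N
-a -c +i -e -o +u Y
+a -c -i -e +o -u N
+a +c +i -e +o -u N
"""

TRAINING = []
for line in RAW.strip().splitlines():
    *features, label = line.split()
    TRAINING.append((features, label))


def hamming(p, q):
    """Number of attributes on which two people differ."""
    return sum(x != y for x, y in zip(p, q))


def knn_predict(query, k):
    """Majority label among the k training examples closest to `query`."""
    # sorted() is stable, so equal distances keep their training order.
    ranked = sorted(TRAINING, key=lambda example: hamming(example[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


query = "-a +c -i +e -o +u".split()
distances = [hamming(features, query) for features, _ in TRAINING]
print(distances)
print(knn_predict(query, k=1), knn_predict(query, k=5))
```

With k = 1 the vote follows the single closest example, while k = 5 pulls in more distant examples and reverses the prediction, as on slide 46.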
  • 47. Another approach: perceptrons
      - Place a weight on every attribute, indicating how important that attribute is (and in which direction it affects things)
      - E.g., wa = 1, wc = -5, wi = 4, we = 1, wo = 0, wu = -1
          +a, -c, +i, +e, +o, +u: Y (score 1+4+1+0-1 = 5)
          -a, +c, -i, +e, -o, -u: N (score -5+1 = -4)
          +a, -c, +i, -e, -o, -u: Y (score 1+4 = 5)
          -a, -c, +i, +e, -o, -u: Y (score 4+1 = 5)
          -a, +c, +i, -e, -o, -u: N (score -5+4 = -1)
          -a, -c, +i, -e, -o, +u: Y (score 4-1 = 3)
          +a, -c, -i, -e, +o, -u: N (score 1+0 = 1)
          +a, +c, +i, -e, +o, -u: N (score 1-5+4+0 = 0)
  • 48. How to calculate the score?
      - Weights: wa = 1, wc = -5, wi = 4, we = 1, wo = 0, wu = -1
      - 1) +a, -c, +i, +e, +o, +u: Y
          - The "+" attributes are a, i, e, o, u, so the score is wa + wi + we + wo + wu = 1 + 4 + 1 + 0 - 1 = 5
      - 2) -a, +c, -i, +e, -o, -u: N
          - The "+" attributes are c and e, so the score is wc + we = -5 + 1 = -4
      - And so on
  • 49.
      - Need to set some threshold above which we predict we will be paid back (say, 2)
      - May care about combinations of things (nonlinearity); the generalization is neural networks
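The scoring rule from slides 47 to 49 can be sketched as follows, using the weights and the threshold of 2 given on the slides. This only evaluates a fixed set of weights; it does not show how a perceptron would learn them, which the slides do not cover. The function names are assumptions.

```python
WEIGHTS = {"a": 1, "c": -5, "i": 4, "e": 1, "o": 0, "u": -1}
THRESHOLD = 2

# The eight training examples: signs for a, c, i, e, o, u, then the label.
RAW = """
+a -c +i +e +o +u Y
-a +c -i +e -o -u N
+a -c +i -e -o -u Y
-a -c +i +e -o -u Y
-a +c +i -e -o -u N
-a -c +i -e -o +u Y
+a -c -i -e +o -u N
+a +c +i -e +o -u N
"""

TRAINING = []
for line in RAW.strip().splitlines():
    *features, label = line.split()
    TRAINING.append((features, label))


def score(features):
    """Sum the weights of the attributes marked '+'."""
    return sum(WEIGHTS[f[1]] for f in features if f[0] == "+")


def predict(features):
    """Predict 'Y' when the score is above the threshold, else 'N'."""
    return "Y" if score(features) > THRESHOLD else "N"


scores = [score(features) for features, _ in TRAINING]
correct = [predict(features) == label for features, label in TRAINING]
print(scores)
print(all(correct))
```

With these weights every Y example scores at least 3 and every N example scores at most 1, so a threshold of 2 separates the training data.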
  • 50. Reinforcement learning (RL)
      - Originates from Dynamic Programming (DP)
      - Less exact than DP, since it uses experience to change the system's parameters and/or structure
      - There are three routes you can take to work: A, B, C
          - The times you took A, it took: 10, 60, 30 minutes
          - The times you took B, it took: 32, 31, 34 minutes
          - The time you took C, it took 50 minutes
  • 51.
      - What should you do next?
      - Exploration vs. exploitation tradeoff
          - Exploration: try to explore under-explored options
          - Exploitation: stick with options that look best now
      - Reinforcement learning is usually studied in MDPs**
          - Take an action, observe the reward and the new state
      - **MDPs: Markov Decision Processes are a mathematical framework for modeling sequential decision problems under uncertainty as well as reinforcement learning problems
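One common way to trade off exploration and exploitation in the route example is an epsilon-greedy rule: with a small probability pick a route at random, otherwise pick the route with the best average so far. The slides do not prescribe this rule; it is swapped in here as an illustration, and the names `mean_time` and `choose_route` are assumptions.

```python
import random

# Observed travel times in minutes for each route, from slide 50.
TIMES = {"A": [10, 60, 30], "B": [32, 31, 34], "C": [50]}


def mean_time(route):
    """Average observed travel time for a route."""
    return sum(TIMES[route]) / len(TIMES[route])


def choose_route(epsilon, rng):
    """Epsilon-greedy choice: explore with probability epsilon, else exploit."""
    if rng.random() < epsilon:
        return rng.choice(sorted(TIMES))   # exploration: any route
    return min(TIMES, key=mean_time)       # exploitation: best average so far


averages = {route: mean_time(route) for route in TIMES}
greedy = choose_route(epsilon=0.0, rng=random.Random(0))
print(averages, greedy)
```

Pure exploitation picks B, but A has the fastest single observation and C has been tried only once, so the averages for A and C are less reliable than the one for B.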
  • 52. Bayesian approach to learning
      - Assume we have a prior distribution over the long-term behavior of A
      - With probability .6, A is a "fast route" which:
          - With prob. .25, takes 20 minutes
          - With prob. .5, takes 30 minutes
          - With prob. .25, takes 40 minutes
      - With probability .4, A is a "slow route" which:
          - With prob. .25, takes 30 minutes
          - With prob. .5, takes 40 minutes
          - With prob. .25, takes 50 minutes
  • 53.
      - We travel on A once and see it takes 30 minutes
      - P(A is fast | observation) = P(observation | A is fast) * P(A is fast) / P(observation) = (.5 * .6) / (.5 * .6 + .25 * .4) = .3 / (.3 + .1) = .75
      - Convenient approach for decision theory, game theory
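The posterior on slide 53 can be reproduced with a direct application of Bayes' rule. The prior and the likelihood tables come from slide 52; the function name and the dictionary layout are illustrative assumptions.

```python
# Prior belief about route A, and the travel-time distribution under each case.
PRIOR = {"fast": 0.6, "slow": 0.4}
LIKELIHOOD = {
    "fast": {20: 0.25, 30: 0.5, 40: 0.25},
    "slow": {30: 0.25, 40: 0.5, 50: 0.25},
}


def posterior(observed_minutes):
    """P(hypothesis | observation) for each hypothesis, by Bayes' rule."""
    # Joint probability P(observation | h) * P(h) for each hypothesis h.
    joint = {h: PRIOR[h] * LIKELIHOOD[h].get(observed_minutes, 0.0) for h in PRIOR}
    evidence = sum(joint.values())  # P(observation)
    if evidence == 0.0:
        raise ValueError("observation has zero probability under the model")
    return {h: joint[h] / evidence for h in joint}


print(posterior(30))
```

An observation of 20 minutes would make the route certainly fast and 50 minutes certainly slow, because each of those times is possible under only one of the two hypotheses.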