Machine Learning and Data Mining
Tilani Gunawardena
Outline
• Supervised learning
• Unsupervised learning
• Reinforcement learning
Data Mining
• Data mining: the process of discovering patterns in data
Machine Learning
• Machine learning
– Grew out of work in AI
– A new capability for computers
• Machine learning is the science of getting computers to learn without being explicitly programmed
• Learning = improving with experience at some task
– Improve over task T
– With respect to performance measure P
– Based on experience E
Machine Learning
• Database mining
– Large datasets from the growth of automation/web
• Ex: web click data, medical records, biology, engineering
• Applications that can't be programmed by hand
– Ex: autonomous helicopters, handwriting recognition, most of NLP, computer vision
• Self-customizing programs
– Ex: Amazon and Netflix product recommendations
• Understanding human learning (the brain, real AI)
Types of Learning
– Supervised learning: learn to predict
• A correct answer is given for each example. The answer can be a numeric variable, a categorical variable, etc.
– Unsupervised learning: learn to understand and describe the data
• Correct answers are not given, just examples (e.g. the same figures as above, without the labels)
– Reinforcement learning: learn to act
• Occasional rewards
[Figure: example instances labeled M and F]
Machine Learning Problems
Algorithms
• The success of a machine learning system also depends on the algorithms.
• The algorithms control the search to find and build the knowledge structures.
• The learning algorithms should extract useful information from training examples.
Algorithms
• Supervised learning
– Prediction
– Classification (discrete labels), regression (real values)
• Unsupervised learning
– Clustering
– Probability distribution estimation
– Finding associations (among features)
– Dimensionality reduction
• Reinforcement learning
– Decision making (robots, chess machines)
Supervised Learning
• The problem of taking a labeled dataset and gleaning information from it so that you can label new datasets
• Learn to predict output from input
• Function approximation
Supervised learning: example 1
• Predict housing prices
Regression: predict a continuous-valued output (price)
Supervised learning: example 2
• Breast cancer (malignant, benign)
This is a classification problem: discrete-valued output (0 or 1)
1 attribute/feature: Tumor Size
Supervised learning: example 3
2 attributes/features: Tumor Size and Age
Q?
1. Input: credit history (# of loans, how much money you make, …)
Output: lend money or not?
2. Input: picture. Output: predict BSc, MSc or PhD
3. Input: picture. Output: predict age
4. Input: large inventory of identical items. Output: predict how many items will sell over the next 3 months
5. Input: customer accounts. Output: hacked or not
Unsupervised Learning
• Find patterns and structure in data
Unsupervised Learning: examples
• Organize computing clusters
– Large data centers: which machines work together?
• Social network analysis
– Given information on which friends you email most / FB friends / Google+ circles
– Can we automatically identify cohesive groups of friends?
• Market segmentation
– Given a customer dataset, group customers into different market segments
• Astronomical data analysis
– Clustering algorithms give interesting and useful theories, e.g. how galaxies are formed
Q?
1. Given emails labeled as spam/not spam, learn a spam filter
2. Given a set of news articles found on the web, group them into sets of articles about the same story
3. Given a database of customer data, automatically discover market segments and group customers into them
4. Given a dataset of patients diagnosed as either having diabetes or not, learn to classify new patients as having diabetes or not
Reinforcement Learning
• Learn to act
• Learning from delayed reward
• The learning signal comes after several steps, from the decisions you have actually made
Algorithms: The Basic Methods
Outline
• Simplicity first: 1R
• Naïve Bayes
Simplicity first
• Simple algorithms often work very well!
• There are many kinds of simple structure, e.g.:
– One attribute does all the work
– All attributes contribute equally & independently
– A weighted linear combination might do
– Instance-based: use a few prototypes
– Use simple logical rules
• Success of method depends on the domain
Inferring rudimentary rules
• 1R: learns a 1-level decision tree
– I.e., rules that all test one particular attribute
• Basic version
– One branch for each value
– Each branch assigns most frequent class
– Error rate: proportion of instances that don’t belong to the
majority class of their corresponding branch
– Choose attribute with lowest error rate
(assumes nominal attributes)
Pseudo-code for 1R
For each attribute,
  For each value of the attribute, make a rule as follows:
    count how often each class appears
    find the most frequent class
    make the rule assign that class to this attribute-value
  Calculate the error rate of the rules
Choose the rules with the smallest error rate
• Note: “missing” is treated as a separate attribute value
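To make the pseudocode concrete, here is a minimal Python sketch of 1R (an illustrative implementation written for these notes, not code from the original slides; it uses the weather data of Table 1 on the next slide):

```python
from collections import Counter

# Weather data (Table 1): (Outlook, Temp, Humidity, Windy) -> Play
weather = [
    ("Sunny", "Hot", "High", False, "No"),      ("Sunny", "Hot", "High", True, "No"),
    ("Overcast", "Hot", "High", False, "Yes"),  ("Rainy", "Mild", "High", False, "Yes"),
    ("Rainy", "Cool", "Normal", False, "Yes"),  ("Rainy", "Cool", "Normal", True, "No"),
    ("Overcast", "Cool", "Normal", True, "Yes"),("Sunny", "Mild", "High", False, "No"),
    ("Sunny", "Cool", "Normal", False, "Yes"),  ("Rainy", "Mild", "Normal", False, "Yes"),
    ("Sunny", "Mild", "Normal", True, "Yes"),   ("Overcast", "Mild", "High", True, "Yes"),
    ("Overcast", "Hot", "Normal", False, "Yes"),("Rainy", "Mild", "High", True, "No"),
]
attributes = ["Outlook", "Temp", "Humidity", "Windy"]

def one_r(rows, n_attrs):
    best = None  # (error_count, attribute_index, rules)
    for a in range(n_attrs):
        # For each value of attribute a, count how often each class appears
        counts = {}
        for row in rows:
            counts.setdefault(row[a], Counter())[row[-1]] += 1
        # Each value predicts its most frequent class (ties broken arbitrarily)
        rules = {v: c.most_common(1)[0][0] for v, c in counts.items()}
        # Errors: instances that don't belong to the majority class of their branch
        errors = sum(sum(c.values()) - max(c.values()) for c in counts.values())
        if best is None or errors < best[0]:
            best = (errors, a, rules)
    return best

errors, attr, rules = one_r(weather, 4)
print(attributes[attr], rules, f"{errors}/{len(weather)} errors")
# -> Outlook {'Sunny': 'No', 'Overcast': 'Yes', 'Rainy': 'Yes'} 4/14 errors
```

Running it picks Outlook with 4/14 errors, matching the totals worked out on the following slides (Humidity also has 4/14 errors; this sketch keeps the first attribute found).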
Evaluating the weather attributes
Outlook Temp Humidity Windy Play
Sunny Hot High False No
Sunny Hot High True No
Overcast Hot High False Yes
Rainy Mild High False Yes
Rainy Cool Normal False Yes
Rainy Cool Normal True No
Overcast Cool Normal True Yes
Sunny Mild High False No
Sunny Cool Normal False Yes
Rainy Mild Normal False Yes
Sunny Mild Normal True Yes
Overcast Mild High True Yes
Overcast Hot Normal False Yes
Rainy Mild High True No
Table 1. Weather data (nominal)
Evaluating the weather attributes
(Counts computed from the weather data in Table 1.)

Attribute    Rules             Errors   Total errors
Outlook      Sunny → No        2/5      4/14
             Overcast → Yes    0/4
             Rainy → Yes       2/5
Temp         Hot → No*         2/4      5/14
             Mild → Yes        2/6
             Cool → Yes        1/4
Humidity     High → No         3/7      4/14
             Normal → Yes      1/7
Windy        False → Yes       2/8      5/14
             True → No*        3/6
* indicates a tie
Table 2. Rules for weather data (nominal)
Dealing with numeric attributes
• Discretize numeric attributes
• Divide each attribute’s range into intervals
– Sort instances according to attribute’s values
– Place breakpoints where the class changes
(the majority class)
– This minimizes the total error
Example: temperature from weather data

Outlook   Temperature  Humidity  Windy  Play
Sunny     85           85        False  No
Sunny     80           90        True   No
Overcast  83           86        False  Yes
Rainy     70           96        False  Yes
Rainy     68           80        False  Yes
Rainy     65           70        True   No
Overcast  64           65        True   Yes
Sunny     72           95        False  No
Sunny     69           70        False  Yes
Rainy     75           80        False  Yes
Sunny     75           70        True   Yes
Overcast  72           90        True   Yes
Overcast  81           75        False  Yes
Rainy     71           91        True   No

Temperature values sorted, with breakpoints (|) wherever the class changes:
64  65  68  69  70  71  72  72  75  75  80  81  83  85
Yes | No | Yes Yes Yes | No No | Yes | Yes Yes | No | Yes Yes | No

The breakpoint at 72 falls between two instances with the same value but different classes. The simplest fix is to move that breakpoint up one example, to 73.5, producing a mixed partition in which No is the majority class:
64  65  68  69  70  71  72  72  75  75  80  81  83  85
Yes | No | Yes Yes Yes | No No Yes | Yes Yes | No | Yes Yes | No
Dealing with numeric attributes

Attribute     Rules                     Errors   Total errors
Temperature   ≤ 64.5 → Yes              0/1      1/14
              > 64.5 and ≤ 66.5 → No    0/1
              > 66.5 and ≤ 70.5 → Yes   0/3
              > 70.5 and ≤ 73.5 → No    1/3
              > 73.5 and ≤ 77.5 → Yes   0/2
              > 77.5 and ≤ 80.5 → No    0/1
              > 80.5 and ≤ 84 → Yes     0/2
              > 84 → No                 0/1
Table 5. Rules for temperature from weather data (overfitting)

64  65  68  69  70  71  72  72  75  75  80  81  83  85
Yes | No | Yes Yes Yes | No No Yes | Yes Yes | No | Yes Yes | No
Breakpoints: 64.5, 66.5, 70.5, 73.5, 77.5, 80.5, and 84
The problem of overfitting
• This procedure is very sensitive to noise
– One instance with an incorrect class label will probably produce a separate interval
• Simple solution: enforce a minimum number of instances in the majority class per interval
Discretization example
• Example (with min = 3):
64  65  68  69  70  71  72  72  75  75  80  81  83  85
Yes No Yes Yes Yes | No No Yes Yes Yes | No Yes Yes No
• Merging adjacent partitions that have the same majority class gives the final result for the temperature attribute:
64  65  68  69  70  71  72  72  75  75  80  81  83  85
Yes No Yes Yes Yes No No Yes Yes Yes | No Yes Yes No
which leads to the rule set
temperature: ≤ 77.5 → Yes
             > 77.5 → No*
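Below is a small Python sketch of this procedure (illustrative code written for these notes, assuming the sorted values above; the helper names are mine). It grows each partition until the majority class occurs at least min_count times, cuts at the next class change (never between equal values), then merges adjacent partitions with the same majority class:

```python
from collections import Counter

temps   = [64, 65, 68, 69, 70, 71, 72, 72, 75, 75, 80, 81, 83, 85]
classes = ["Yes", "No", "Yes", "Yes", "Yes", "No", "No", "Yes", "Yes", "Yes",
           "No", "Yes", "Yes", "No"]

def discretize(values, labels, min_count=3):
    """Greedy partitioning: extend each interval until its majority class
    occurs min_count times, then cut at the next class change (never
    between equal values); finally merge same-majority neighbors."""
    partitions, start = [], 0
    for i in range(1, len(values) + 1):
        majority, count = Counter(labels[start:i]).most_common(1)[0]
        done = i == len(values)
        if done or (count >= min_count and labels[i] != majority
                    and values[i] != values[i - 1]):
            partitions.append((start, i, majority))
            start = i
    # Merge adjacent partitions that share the same majority class
    merged = [partitions[0]]
    for s, e, m in partitions[1:]:
        if m == merged[-1][2]:
            merged[-1] = (merged[-1][0], e, m)
        else:
            merged.append((s, e, m))
    return merged

for s, e, m in discretize(temps, classes):
    print(temps[s], "...", temps[e - 1], "->", m)
# 64 ... 75 -> Yes
# 80 ... 85 -> No       (i.e. one breakpoint at 77.5)
```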
With overfitting avoidance
• Resulting rule set:

Attribute     Rules                     Errors   Total errors
Outlook       Sunny → No                2/5      4/14
              Overcast → Yes            0/4
              Rainy → Yes               2/5
Temperature   ≤ 77.5 → Yes              3/10     5/14
              > 77.5 → No*              2/4
Humidity      ≤ 82.5 → Yes              1/7      3/14
              > 82.5 and ≤ 95.5 → No    2/6
              > 95.5 → Yes              0/1
Windy         False → Yes               2/8      5/14
              True → No*                3/6
Discussion of 1R
• 1R was described in a paper by Holte (1993)
– Contains an experimental evaluation on 16 datasets (using
cross-validation so that results were representative of
performance on future data)
– Minimum number of instances was set to 6 after some
experimentation
– 1R’s simple rules performed not much worse than much more
complex decision trees
• Simplicity first pays off!
Very Simple Classification Rules Perform Well on Most Commonly Used
Datasets
Robert C. Holte, Computer Science Department, University of Ottawa
Statistical modeling
• “Opposite” of 1R: use all the attributes
• Two assumptions: Attributes are
– equally important
– statistically independent (given the class value)
• I.e., knowing the value of one attribute says nothing about the
value of another
(if the class is known)
• Independence assumption is almost never correct!
• But … this scheme works surprisingly well in practice
Probabilities for weather data
Outlook Temperature Humidity Windy Play
Yes No Yes No Yes No Yes No Yes No
Sunny 2 3 Hot 2 2 High 3 4 False 6 2 9 5
Overcast 4 0 Mild 4 2 Normal 6 1 True 3 3
Rainy 3 2 Cool 3 1
Sunny 2/9 3/5 Hot 2/9 2/5 High 3/9 4/5 False 6/9 2/5 9/14 5/14
Overcast 4/9 0/5 Mild 4/9 2/5 Normal 6/9 1/5 True 3/9 3/5
Rainy 3/9 2/5 Cool 3/9 1/5
(Counts and relative frequencies for each attribute value and class, computed from the weather data in Table 1.)
Probabilities for weather data
• A new day:
Outlook  Temp.  Humidity  Windy  Play
Sunny    Cool   High      True   ?
Likelihood of the two classes:
For "yes" = 2/9 × 3/9 × 3/9 × 3/9 × 9/14 = 0.0053
For "no" = 3/5 × 1/5 × 4/5 × 3/5 × 5/14 = 0.0206
Conversion into a probability by normalization:
P("yes") = 0.0053 / (0.0053 + 0.0206) = 0.205
P("no") = 0.0206 / (0.0053 + 0.0206) = 0.795
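The same arithmetic as a short Python sketch (illustrative; the fractions are the relative frequencies from the table above):

```python
# Relative frequencies from the weather data (see the table above)
p_yes = (2/9) * (3/9) * (3/9) * (3/9) * (9/14)   # Sunny, Cool, High, True | yes
p_no  = (3/5) * (1/5) * (4/5) * (3/5) * (5/14)   # Sunny, Cool, High, True | no

print(round(p_yes, 4), round(p_no, 4))           # 0.0053 0.0206
# Normalize so the two posteriors sum to 1
print(round(p_yes / (p_yes + p_no), 3))          # P(yes|E) = 0.205
print(round(p_no  / (p_yes + p_no), 3))          # P(no|E)  = 0.795
```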
Bayes's rule
• Probability of event H given evidence E:
Pr[H | E] = Pr[E | H] · Pr[H] / Pr[E]
• A priori probability of H: Pr[H]
– Probability of the event before evidence is seen
• A posteriori probability of H: Pr[H | E]
– Probability of the event after evidence is seen
Thomas Bayes
Born: 1702 in London, England
Died: 1761 in Tunbridge Wells, Kent, England
From Bayes' "Essay towards solving a problem in the doctrine of chances" (1763)
Naïve Bayes for classification
• Classification learning: what's the probability of the class given an instance?
– Evidence E = instance
– Event H = class value for the instance
• Naïve assumption: evidence splits into parts (i.e. attributes) that are independent:
Pr[H | E] = Pr[E1 | H] · Pr[E2 | H] · … · Pr[En | H] · Pr[H] / Pr[E]
Weather data example
Outlook  Temp.  Humidity  Windy  Play
Sunny    Cool   High      True   ?
Evidence E; probability of class "yes":
Pr[yes | E] = Pr[Outlook = Sunny | yes]
            × Pr[Temperature = Cool | yes]
            × Pr[Humidity = High | yes]
            × Pr[Windy = True | yes]
            × Pr[yes] / Pr[E]
            = (2/9 × 3/9 × 3/9 × 3/9 × 9/14) / Pr[E]
The "zero-frequency problem"
• What if an attribute value doesn't occur with every class value?
(e.g. "Outlook = Overcast" for class "No")
– The probability will be zero: Pr[Outlook = Overcast | no] = 0
– The a posteriori probability will also be zero: Pr[no | E] = 0
(no matter how likely the other values are!)
• Remedy: add 1 to the count for every attribute value-class combination (Laplace estimator)
• Result: probabilities will never be zero!
(also: stabilizes probability estimates)
• Example: adding 1 to each Outlook count for class "No" (Sunny 3+1, Overcast 0+1, Rainy 2+1) gives the probabilities 4/8, 1/8 and 3/8
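A one-line sketch of the Laplace estimator applied to the Outlook counts for class "no" (illustrative code; the counts are the ones from the table above):

```python
counts = {"Sunny": 3, "Overcast": 0, "Rainy": 2}   # Outlook counts for class "no"
n = sum(counts.values())                           # 5 instances of "no"
# Add 1 to every count; the denominator grows by the number of values
laplace = {v: (k + 1) / (n + len(counts)) for v, k in counts.items()}
print(laplace)   # {'Sunny': 0.5, 'Overcast': 0.125, 'Rainy': 0.375} = 4/8, 1/8, 3/8
```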
*Modified probability estimates
• In some cases adding a constant different from 1 might be more appropriate
• Example: attribute Outlook for class "Yes", with m added in equal parts:
Sunny: (2 + m/3) / (9 + m)   Overcast: (4 + m/3) / (9 + m)   Rainy: (3 + m/3) / (9 + m)
• Weights don't need to be equal (but they must sum to 1):
Sunny: (2 + m·p1) / (9 + m)   Overcast: (4 + m·p2) / (9 + m)   Rainy: (3 + m·p3) / (9 + m)
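For instance, with an assumed m = 2 and equal weights p = 1/3, a quick sketch (the value of m here is purely illustrative):

```python
m, p = 2.0, 1/3                                     # assumed prior weight, equal parts
counts = {"Sunny": 2, "Overcast": 4, "Rainy": 3}    # Outlook counts for class "yes"
n = 9                                               # instances of "yes"
m_est = {v: (k + m * p) / (n + m) for v, k in counts.items()}
print(m_est)   # ≈ {'Sunny': 0.242, 'Overcast': 0.424, 'Rainy': 0.333}
```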
Missing values
• Training: the instance is not included in the frequency counts for that attribute value-class combination
• Classification: the attribute is omitted from the calculation
• Example:
Outlook  Temp.  Humidity  Windy  Play
?        Cool   High      True   ?
Likelihood of "yes" = 3/9 × 3/9 × 3/9 × 9/14 = 0.0238
Likelihood of "no" = 1/5 × 4/5 × 3/5 × 5/14 = 0.0343
P("yes") = 0.0238 / (0.0238 + 0.0343) = 41%
P("no") = 0.0343 / (0.0238 + 0.0343) = 59%
Numeric attributes
• Usual assumption: attributes have a normal or Gaussian probability distribution (given the class)
• The probability density function for the normal distribution is defined by two parameters:
– Sample mean: μ = (1/n) Σᵢ xᵢ
– Standard deviation: σ = √( (1/(n−1)) Σᵢ (xᵢ − μ)² )
– Then the density function is f(x) = (1 / (√(2π)·σ)) · e^(−(x−μ)² / (2σ²))
Karl Gauss, 1777-1855, great German mathematician
Statistics for weather data
• Outlook: Sunny 2/9 (yes), 3/5 (no); Overcast 4/9, 0/5; Rainy 3/9, 2/5
• Temperature: yes 64, 68, 69, 70, 72, … (μ = 73, σ = 6.2); no 65, 71, 72, 80, 85, … (μ = 75, σ = 7.9)
• Humidity: yes 65, 70, 70, 75, 80, … (μ = 79, σ = 10.2); no 70, 85, 90, 91, 95, … (μ = 86, σ = 9.7)
• Windy: False 6/9, 2/5; True 3/9, 3/5; Play: 9/14 (yes), 5/14 (no)
• Example density value:
f(temperature = 66 | yes) = (1 / (√(2π)·6.2)) · e^(−(66−73)² / (2·6.2²)) = 0.0340
Classifying a new day
• A new day:
Outlook  Temp.  Humidity  Windy  Play
Sunny    66     90        true   ?
Likelihood of "yes" = 2/9 × 0.0340 × 0.0221 × 3/9 × 9/14 = 0.000036
Likelihood of "no" = 3/5 × 0.0291 × 0.0380 × 3/5 × 5/14 = 0.000136
P("yes") = 0.000036 / (0.000036 + 0.000136) = 20.9%
P("no") = 0.000136 / (0.000036 + 0.000136) = 79.1%
• Missing values during training are not included in the calculation of the mean and standard deviation
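A minimal end-to-end sketch for the numeric case (illustrative code assuming the rounded means and standard deviations from the statistics table above; because of that rounding, the computed posteriors land close to, but not exactly on, the slide's figures):

```python
import math

def gaussian(x, mu, sigma):
    """Normal probability density with mean mu and standard deviation sigma."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)

print(round(gaussian(66, 73, 6.2), 4))  # 0.034, matching the slide's f(temperature=66|yes)

# New day: Outlook=Sunny, Temp=66, Humidity=90, Windy=True
like_yes = (2/9) * gaussian(66, 73, 6.2) * gaussian(90, 79, 10.2) * (3/9) * (9/14)
like_no  = (3/5) * gaussian(66, 75, 7.9) * gaussian(90, 86, 9.7)  * (3/5) * (5/14)

print(f"P(yes) = {like_yes / (like_yes + like_no):.1%}")  # ~21.7% (slide: 20.9%)
print(f"P(no)  = {like_no  / (like_yes + like_no):.1%}")  # ~78.3% (slide: 79.1%)
```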
Naïve Bayes: discussion
• Naïve Bayes works surprisingly well (even if the independence assumption is clearly violated)
• Why? Because classification doesn't require accurate probability estimates as long as the maximum probability is assigned to the correct class
• However: adding too many redundant attributes will cause problems (e.g. identical attributes)
• Note also: many numeric attributes are not normally distributed (→ kernel density estimators)
Naïve Bayes Extensions
• Improvements:
– Select the best attributes (e.g. with greedy search)
– Often works as well or better with just a fraction of all attributes
• Bayesian networks
Summary
• OneR: uses rules based on just one attribute
• Naïve Bayes: uses all attributes and Bayes' rule to estimate the probability of the class given an instance
• Simple methods frequently work well:
– 1R and Naïve Bayes often do just as well, or even better
• But…
– Complex methods can be better (as we will see)

More Related Content

What's hot

Data cubes
Data cubesData cubes
Data cubes
Mohammed
 
Big data lecture notes
Big data lecture notesBig data lecture notes
Big data lecture notes
Mohit Saini
 
Classification Based Machine Learning Algorithms
Classification Based Machine Learning AlgorithmsClassification Based Machine Learning Algorithms
Classification Based Machine Learning Algorithms
Md. Main Uddin Rony
 
Data mining slides
Data mining slidesData mining slides
Data mining slidessmj
 
Data mining
Data miningData mining
Data mining
Akannsha Totewar
 
Unsupervised learning clustering
Unsupervised learning clusteringUnsupervised learning clustering
Unsupervised learning clustering
Arshad Farhad
 
Supervised learning and Unsupervised learning
Supervised learning and Unsupervised learning Supervised learning and Unsupervised learning
Supervised learning and Unsupervised learning
Usama Fayyaz
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine Learning
Rahul Jain
 
K - Nearest neighbor ( KNN )
K - Nearest neighbor  ( KNN )K - Nearest neighbor  ( KNN )
K - Nearest neighbor ( KNN )
Mohammad Junaid Khan
 
Iris data analysis example in R
Iris data analysis example in RIris data analysis example in R
Iris data analysis example in R
Duyen Do
 
Data science presentation
Data science presentationData science presentation
Data science presentation
MSDEVMTL
 
Sequential Pattern Mining and GSP
Sequential Pattern Mining and GSPSequential Pattern Mining and GSP
Sequential Pattern Mining and GSP
Hamidreza Mahdavipanah
 
Supervised and unsupervised learning
Supervised and unsupervised learningSupervised and unsupervised learning
Supervised and unsupervised learning
Paras Kohli
 
NAIVE BAYES CLASSIFIER
NAIVE BAYES CLASSIFIERNAIVE BAYES CLASSIFIER
NAIVE BAYES CLASSIFIER
Knoldus Inc.
 
Data Mining: Concepts and techniques: Chapter 13 trend
Data Mining: Concepts and techniques: Chapter 13 trendData Mining: Concepts and techniques: Chapter 13 trend
Data Mining: Concepts and techniques: Chapter 13 trend
Salah Amean
 
Anomaly detection
Anomaly detectionAnomaly detection
Anomaly detection
Dr. Stylianos Kampakis
 
Data Mining: Concepts and Techniques chapter 07 : Advanced Frequent Pattern M...
Data Mining: Concepts and Techniques chapter 07 : Advanced Frequent Pattern M...Data Mining: Concepts and Techniques chapter 07 : Advanced Frequent Pattern M...
Data Mining: Concepts and Techniques chapter 07 : Advanced Frequent Pattern M...
Salah Amean
 
Machine Learning - Splitting Datasets
Machine Learning - Splitting DatasetsMachine Learning - Splitting Datasets
Machine Learning - Splitting Datasets
Andrew Ferlitsch
 
Knowledge Representation & Reasoning
Knowledge Representation & ReasoningKnowledge Representation & Reasoning
Knowledge Representation & ReasoningSajid Marwat
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessing
Gajanand Sharma
 

What's hot (20)

Data cubes
Data cubesData cubes
Data cubes
 
Big data lecture notes
Big data lecture notesBig data lecture notes
Big data lecture notes
 
Classification Based Machine Learning Algorithms
Classification Based Machine Learning AlgorithmsClassification Based Machine Learning Algorithms
Classification Based Machine Learning Algorithms
 
Data mining slides
Data mining slidesData mining slides
Data mining slides
 
Data mining
Data miningData mining
Data mining
 
Unsupervised learning clustering
Unsupervised learning clusteringUnsupervised learning clustering
Unsupervised learning clustering
 
Supervised learning and Unsupervised learning
Supervised learning and Unsupervised learning Supervised learning and Unsupervised learning
Supervised learning and Unsupervised learning
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine Learning
 
K - Nearest neighbor ( KNN )
K - Nearest neighbor  ( KNN )K - Nearest neighbor  ( KNN )
K - Nearest neighbor ( KNN )
 
Iris data analysis example in R
Iris data analysis example in RIris data analysis example in R
Iris data analysis example in R
 
Data science presentation
Data science presentationData science presentation
Data science presentation
 
Sequential Pattern Mining and GSP
Sequential Pattern Mining and GSPSequential Pattern Mining and GSP
Sequential Pattern Mining and GSP
 
Supervised and unsupervised learning
Supervised and unsupervised learningSupervised and unsupervised learning
Supervised and unsupervised learning
 
NAIVE BAYES CLASSIFIER
NAIVE BAYES CLASSIFIERNAIVE BAYES CLASSIFIER
NAIVE BAYES CLASSIFIER
 
Data Mining: Concepts and techniques: Chapter 13 trend
Data Mining: Concepts and techniques: Chapter 13 trendData Mining: Concepts and techniques: Chapter 13 trend
Data Mining: Concepts and techniques: Chapter 13 trend
 
Anomaly detection
Anomaly detectionAnomaly detection
Anomaly detection
 
Data Mining: Concepts and Techniques chapter 07 : Advanced Frequent Pattern M...
Data Mining: Concepts and Techniques chapter 07 : Advanced Frequent Pattern M...Data Mining: Concepts and Techniques chapter 07 : Advanced Frequent Pattern M...
Data Mining: Concepts and Techniques chapter 07 : Advanced Frequent Pattern M...
 
Machine Learning - Splitting Datasets
Machine Learning - Splitting DatasetsMachine Learning - Splitting Datasets
Machine Learning - Splitting Datasets
 
Knowledge Representation & Reasoning
Knowledge Representation & ReasoningKnowledge Representation & Reasoning
Knowledge Representation & Reasoning
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessing
 

Similar to Machine Learning and Data Mining

3.Classification.ppt
3.Classification.ppt3.Classification.ppt
3.Classification.ppt
KhalilDaiyob1
 
[Webinar] How Big Data and Machine Learning Are Transforming ITSM
[Webinar] How Big Data and Machine Learning Are Transforming ITSM[Webinar] How Big Data and Machine Learning Are Transforming ITSM
[Webinar] How Big Data and Machine Learning Are Transforming ITSM
SunView Software, Inc.
 
mod_02_intro_ml.ppt
mod_02_intro_ml.pptmod_02_intro_ml.ppt
mod_02_intro_ml.pptbutest
 
Part XIV
Part XIVPart XIV
Part XIVbutest
 
Machine Learning: finding patterns Outline
Machine Learning: finding patterns OutlineMachine Learning: finding patterns Outline
Machine Learning: finding patterns Outlinebutest
 
SQT.ppt
SQT.pptSQT.ppt
Lecture02 - Data Mining & Analytics
Lecture02 - Data Mining & AnalyticsLecture02 - Data Mining & Analytics
Lecture02 - Data Mining & AnalyticsPrithwis Mukerjee
 
Debugging Intermittent Issues - A How To
Debugging Intermittent Issues - A How ToDebugging Intermittent Issues - A How To
Debugging Intermittent Issues - A How To
LloydMoore
 
One R (1R) Algorithm
One R (1R) AlgorithmOne R (1R) Algorithm
One R (1R) Algorithm
MLCollab
 
Forecasting demand planning
Forecasting demand planningForecasting demand planning
Forecasting demand planning
ManonmaniA3
 
DM
DMDM
DM
Jaya P
 
Cause and effect diagrams
Cause and effect diagramsCause and effect diagrams
Cause and effect diagrams
Ronald Bartels
 
10NTC - Data Superheroes - DiJulio
10NTC - Data Superheroes - DiJulio10NTC - Data Superheroes - DiJulio
10NTC - Data Superheroes - DiJulio
sarahdijulio
 
Covering (Rules-based) Algorithm
Covering (Rules-based) AlgorithmCovering (Rules-based) Algorithm
Covering (Rules-based) Algorithm
ZHAO Sam
 
I love the smell of data in the morning (getting started with data science) ...
I love the smell of data in the morning (getting started with data science)  ...I love the smell of data in the morning (getting started with data science)  ...
I love the smell of data in the morning (getting started with data science) ...
Troy Magennis
 
lecture 1 Slides.pptx
lecture 1 Slides.pptxlecture 1 Slides.pptx
lecture 1 Slides.pptx
SADAF53170
 
Writing Lab Reports Earth Science Online Each lab re.docx
Writing Lab Reports Earth Science Online Each lab re.docxWriting Lab Reports Earth Science Online Each lab re.docx
Writing Lab Reports Earth Science Online Each lab re.docx
ambersalomon88660
 
Software testing - EXAMPLE
Software testing  - EXAMPLESoftware testing  - EXAMPLE
Software testing - EXAMPLE
priyasoundar
 
introDM.ppt
introDM.pptintroDM.ppt
introDM.ppt
Arumugam Prakash
 

Similar to Machine Learning and Data Mining (20)

3.Classification.ppt
3.Classification.ppt3.Classification.ppt
3.Classification.ppt
 
[Webinar] How Big Data and Machine Learning Are Transforming ITSM
[Webinar] How Big Data and Machine Learning Are Transforming ITSM[Webinar] How Big Data and Machine Learning Are Transforming ITSM
[Webinar] How Big Data and Machine Learning Are Transforming ITSM
 
mod_02_intro_ml.ppt
mod_02_intro_ml.pptmod_02_intro_ml.ppt
mod_02_intro_ml.ppt
 
Part XIV
Part XIVPart XIV
Part XIV
 
Machine Learning: finding patterns Outline
Machine Learning: finding patterns OutlineMachine Learning: finding patterns Outline
Machine Learning: finding patterns Outline
 
SQT.ppt
SQT.pptSQT.ppt
SQT.ppt
 
Lecture02 - Data Mining & Analytics
Lecture02 - Data Mining & AnalyticsLecture02 - Data Mining & Analytics
Lecture02 - Data Mining & Analytics
 
Debugging Intermittent Issues - A How To
Debugging Intermittent Issues - A How ToDebugging Intermittent Issues - A How To
Debugging Intermittent Issues - A How To
 
One R (1R) Algorithm
One R (1R) AlgorithmOne R (1R) Algorithm
One R (1R) Algorithm
 
Presentacion 1
Presentacion 1Presentacion 1
Presentacion 1
 
Forecasting demand planning
Forecasting demand planningForecasting demand planning
Forecasting demand planning
 
DM
DMDM
DM
 
Cause and effect diagrams
Cause and effect diagramsCause and effect diagrams
Cause and effect diagrams
 
10NTC - Data Superheroes - DiJulio
10NTC - Data Superheroes - DiJulio10NTC - Data Superheroes - DiJulio
10NTC - Data Superheroes - DiJulio
 
Covering (Rules-based) Algorithm
Covering (Rules-based) AlgorithmCovering (Rules-based) Algorithm
Covering (Rules-based) Algorithm
 
I love the smell of data in the morning (getting started with data science) ...
I love the smell of data in the morning (getting started with data science)  ...I love the smell of data in the morning (getting started with data science)  ...
I love the smell of data in the morning (getting started with data science) ...
 
lecture 1 Slides.pptx
lecture 1 Slides.pptxlecture 1 Slides.pptx
lecture 1 Slides.pptx
 
Writing Lab Reports Earth Science Online Each lab re.docx
Writing Lab Reports Earth Science Online Each lab re.docxWriting Lab Reports Earth Science Online Each lab re.docx
Writing Lab Reports Earth Science Online Each lab re.docx
 
Software testing - EXAMPLE
Software testing  - EXAMPLESoftware testing  - EXAMPLE
Software testing - EXAMPLE
 
introDM.ppt
introDM.pptintroDM.ppt
introDM.ppt
 

More from Tilani Gunawardena PhD(UNIBAS), BSc(Pera), FHEA(UK), CEng, MIESL

Introduction to data mining and machine learning
Introduction to data mining and machine learningIntroduction to data mining and machine learning
Introduction to data mining and machine learning
Tilani Gunawardena PhD(UNIBAS), BSc(Pera), FHEA(UK), CEng, MIESL
 
Introduction to cloud computing
Introduction to cloud computingIntroduction to cloud computing
MapReduce
MapReduceMapReduce
Cheetah:Data Warehouse on Top of MapReduce
Cheetah:Data Warehouse on Top of MapReduceCheetah:Data Warehouse on Top of MapReduce
Cheetah:Data Warehouse on Top of MapReduce
Tilani Gunawardena PhD(UNIBAS), BSc(Pera), FHEA(UK), CEng, MIESL
 
Interpreting the Data:Parallel Analysis with Sawzall
Interpreting the Data:Parallel Analysis with SawzallInterpreting the Data:Parallel Analysis with Sawzall
Interpreting the Data:Parallel Analysis with Sawzall
Tilani Gunawardena PhD(UNIBAS), BSc(Pera), FHEA(UK), CEng, MIESL
 

More from Tilani Gunawardena PhD(UNIBAS), BSc(Pera), FHEA(UK), CEng, MIESL (20)

BlockChain.pptx
BlockChain.pptxBlockChain.pptx
BlockChain.pptx
 
Introduction to data mining and machine learning
Introduction to data mining and machine learningIntroduction to data mining and machine learning
Introduction to data mining and machine learning
 
Introduction to cloud computing
Introduction to cloud computingIntroduction to cloud computing
Introduction to cloud computing
 
Data analytics
Data analyticsData analytics
Data analytics
 
Hadoop Eco system
Hadoop Eco systemHadoop Eco system
Hadoop Eco system
 
Parallel Computing on the GPU
Parallel Computing on the GPUParallel Computing on the GPU
Parallel Computing on the GPU
 
evaluation and credibility-Part 2
evaluation and credibility-Part 2evaluation and credibility-Part 2
evaluation and credibility-Part 2
 
evaluation and credibility-Part 1
evaluation and credibility-Part 1evaluation and credibility-Part 1
evaluation and credibility-Part 1
 
K Nearest Neighbors
K Nearest NeighborsK Nearest Neighbors
K Nearest Neighbors
 
Decision tree
Decision treeDecision tree
Decision tree
 
kmean clustering
kmean clusteringkmean clustering
kmean clustering
 
Covering algorithm
Covering algorithmCovering algorithm
Covering algorithm
 
Hierachical clustering
Hierachical clusteringHierachical clustering
Hierachical clustering
 
Assosiate rule mining
Assosiate rule miningAssosiate rule mining
Assosiate rule mining
 
Big data in telecom
Big data in telecomBig data in telecom
Big data in telecom
 
Cloud Computing
Cloud ComputingCloud Computing
Cloud Computing
 
MapReduce
MapReduceMapReduce
MapReduce
 
Cheetah:Data Warehouse on Top of MapReduce
Cheetah:Data Warehouse on Top of MapReduceCheetah:Data Warehouse on Top of MapReduce
Cheetah:Data Warehouse on Top of MapReduce
 
Pig Experience
Pig ExperiencePig Experience
Pig Experience
 
Interpreting the Data:Parallel Analysis with Sawzall
Interpreting the Data:Parallel Analysis with SawzallInterpreting the Data:Parallel Analysis with Sawzall
Interpreting the Data:Parallel Analysis with Sawzall
 

Recently uploaded

Chapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptxChapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptx
Mohd Adib Abd Muin, Senior Lecturer at Universiti Utara Malaysia
 
Model Attribute Check Company Auto Property
Model Attribute  Check Company Auto PropertyModel Attribute  Check Company Auto Property
Model Attribute Check Company Auto Property
Celine George
 
Palestine last event orientationfvgnh .pptx
Palestine last event orientationfvgnh .pptxPalestine last event orientationfvgnh .pptx
Palestine last event orientationfvgnh .pptx
RaedMohamed3
 
The basics of sentences session 5pptx.pptx
The basics of sentences session 5pptx.pptxThe basics of sentences session 5pptx.pptx
The basics of sentences session 5pptx.pptx
heathfieldcps1
 
Overview on Edible Vaccine: Pros & Cons with Mechanism
Overview on Edible Vaccine: Pros & Cons with MechanismOverview on Edible Vaccine: Pros & Cons with Mechanism
Overview on Edible Vaccine: Pros & Cons with Mechanism
DeeptiGupta154
 
Honest Reviews of Tim Han LMA Course Program.pptx
Honest Reviews of Tim Han LMA Course Program.pptxHonest Reviews of Tim Han LMA Course Program.pptx
Honest Reviews of Tim Han LMA Course Program.pptx
timhan337
 
Supporting (UKRI) OA monographs at Salford.pptx
Supporting (UKRI) OA monographs at Salford.pptxSupporting (UKRI) OA monographs at Salford.pptx
Supporting (UKRI) OA monographs at Salford.pptx
Jisc
 
Synthetic Fiber Construction in lab .pptx
Synthetic Fiber Construction in lab .pptxSynthetic Fiber Construction in lab .pptx
Synthetic Fiber Construction in lab .pptx
Pavel ( NSTU)
 
The French Revolution Class 9 Study Material pdf free download
The French Revolution Class 9 Study Material pdf free downloadThe French Revolution Class 9 Study Material pdf free download
The French Revolution Class 9 Study Material pdf free download
Vivekanand Anglo Vedic Academy
 
The geography of Taylor Swift - some ideas
The geography of Taylor Swift - some ideasThe geography of Taylor Swift - some ideas
The geography of Taylor Swift - some ideas
GeoBlogs
 
Guidance_and_Counselling.pdf B.Ed. 4th Semester
Guidance_and_Counselling.pdf B.Ed. 4th SemesterGuidance_and_Counselling.pdf B.Ed. 4th Semester
Guidance_and_Counselling.pdf B.Ed. 4th Semester
Atul Kumar Singh
 
The Accursed House by Émile Gaboriau.pptx
The Accursed House by Émile Gaboriau.pptxThe Accursed House by Émile Gaboriau.pptx
The Accursed House by Émile Gaboriau.pptx
DhatriParmar
 
How to Make a Field invisible in Odoo 17
How to Make a Field invisible in Odoo 17How to Make a Field invisible in Odoo 17
How to Make a Field invisible in Odoo 17
Celine George
 
2024.06.01 Introducing a competency framework for languag learning materials ...
2024.06.01 Introducing a competency framework for languag learning materials ...2024.06.01 Introducing a competency framework for languag learning materials ...
2024.06.01 Introducing a competency framework for languag learning materials ...
Sandy Millin
 
Unit 8 - Information and Communication Technology (Paper I).pdf
Unit 8 - Information and Communication Technology (Paper I).pdfUnit 8 - Information and Communication Technology (Paper I).pdf
Unit 8 - Information and Communication Technology (Paper I).pdf
Thiyagu K
 
Embracing GenAI - A Strategic Imperative
Embracing GenAI - A Strategic ImperativeEmbracing GenAI - A Strategic Imperative
Embracing GenAI - A Strategic Imperative
Peter Windle
 
Welcome to TechSoup New Member Orientation and Q&A (May 2024).pdf
Welcome to TechSoup   New Member Orientation and Q&A (May 2024).pdfWelcome to TechSoup   New Member Orientation and Q&A (May 2024).pdf
Welcome to TechSoup New Member Orientation and Q&A (May 2024).pdf
TechSoup
 
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
Levi Shapiro
 
Sha'Carri Richardson Presentation 202345
Sha'Carri Richardson Presentation 202345Sha'Carri Richardson Presentation 202345
Sha'Carri Richardson Presentation 202345
beazzy04
 
Additional Benefits for Employee Website.pdf
Additional Benefits for Employee Website.pdfAdditional Benefits for Employee Website.pdf
Additional Benefits for Employee Website.pdf
joachimlavalley1
 

Recently uploaded (20)

Chapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptxChapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptx
 
Model Attribute Check Company Auto Property
Model Attribute  Check Company Auto PropertyModel Attribute  Check Company Auto Property
Model Attribute Check Company Auto Property
 
Palestine last event orientationfvgnh .pptx
Palestine last event orientationfvgnh .pptxPalestine last event orientationfvgnh .pptx
Palestine last event orientationfvgnh .pptx
 
The basics of sentences session 5pptx.pptx
The basics of sentences session 5pptx.pptxThe basics of sentences session 5pptx.pptx
The basics of sentences session 5pptx.pptx
 
Overview on Edible Vaccine: Pros & Cons with Mechanism
Overview on Edible Vaccine: Pros & Cons with MechanismOverview on Edible Vaccine: Pros & Cons with Mechanism
Overview on Edible Vaccine: Pros & Cons with Mechanism
 
Honest Reviews of Tim Han LMA Course Program.pptx
Honest Reviews of Tim Han LMA Course Program.pptxHonest Reviews of Tim Han LMA Course Program.pptx
Honest Reviews of Tim Han LMA Course Program.pptx
 
Supporting (UKRI) OA monographs at Salford.pptx
Supporting (UKRI) OA monographs at Salford.pptxSupporting (UKRI) OA monographs at Salford.pptx
Supporting (UKRI) OA monographs at Salford.pptx
 
Synthetic Fiber Construction in lab .pptx
Synthetic Fiber Construction in lab .pptxSynthetic Fiber Construction in lab .pptx
Synthetic Fiber Construction in lab .pptx
 
The French Revolution Class 9 Study Material pdf free download
The French Revolution Class 9 Study Material pdf free downloadThe French Revolution Class 9 Study Material pdf free download
The French Revolution Class 9 Study Material pdf free download
 
The geography of Taylor Swift - some ideas
The geography of Taylor Swift - some ideasThe geography of Taylor Swift - some ideas
The geography of Taylor Swift - some ideas
 
Guidance_and_Counselling.pdf B.Ed. 4th Semester
Guidance_and_Counselling.pdf B.Ed. 4th SemesterGuidance_and_Counselling.pdf B.Ed. 4th Semester
Guidance_and_Counselling.pdf B.Ed. 4th Semester
 
The Accursed House by Émile Gaboriau.pptx
The Accursed House by Émile Gaboriau.pptxThe Accursed House by Émile Gaboriau.pptx
The Accursed House by Émile Gaboriau.pptx
 
How to Make a Field invisible in Odoo 17
How to Make a Field invisible in Odoo 17How to Make a Field invisible in Odoo 17
How to Make a Field invisible in Odoo 17
 
2024.06.01 Introducing a competency framework for languag learning materials ...
2024.06.01 Introducing a competency framework for languag learning materials ...2024.06.01 Introducing a competency framework for languag learning materials ...
2024.06.01 Introducing a competency framework for languag learning materials ...
 
Unit 8 - Information and Communication Technology (Paper I).pdf
Unit 8 - Information and Communication Technology (Paper I).pdfUnit 8 - Information and Communication Technology (Paper I).pdf
Unit 8 - Information and Communication Technology (Paper I).pdf
 
Embracing GenAI - A Strategic Imperative
Embracing GenAI - A Strategic ImperativeEmbracing GenAI - A Strategic Imperative
Embracing GenAI - A Strategic Imperative
 
Welcome to TechSoup New Member Orientation and Q&A (May 2024).pdf
Welcome to TechSoup   New Member Orientation and Q&A (May 2024).pdfWelcome to TechSoup   New Member Orientation and Q&A (May 2024).pdf
Welcome to TechSoup New Member Orientation and Q&A (May 2024).pdf
 
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
 
Sha'Carri Richardson Presentation 202345
Sha'Carri Richardson Presentation 202345Sha'Carri Richardson Presentation 202345
Sha'Carri Richardson Presentation 202345
 
Additional Benefits for Employee Website.pdf
Additional Benefits for Employee Website.pdfAdditional Benefits for Employee Website.pdf
Additional Benefits for Employee Website.pdf
 

Machine Learning and Data Mining

  • 2. • Supervised learning • Unsupervised learning • Reinforcement learning Outline 2
  • 3. • Data Mining: Process of discovering patterns in data Data Mining 3
  • 4. • Machine Learning – Grew out of work in AI – New Capability for computers • Machine Learning is a science of getting computers to learn without being explicitly programed • Learning= Improving with experience at some task – Improve over task T – With respect to P – Based on experience E Machine Learning 4
  • 5. • Database Mining – Large datasets from growth of automation/web • Ex: web click data, medical records, biology, engineering • Applications can’t program by hand – Ex: Autonomous helicopter, handwriting recognition, most of NLP, Computer vision • Self- customizing programs – Amazon, Netflix product recommendation • Understand human Learning(brain, real AI) Machine Learning 5
  • 6. Types of Learning – Supervised learning: Learn to predict • correct answer for each example. Answer can be a numeric variable, categorical variable etc. – Unsupervised learning: learn to understand and describe the data • correct answers not given – just examples (e.g. – the same figures as above , without the labels) – Reinforcement learning: learn to act • occasional rewards M M MF F F 6
  • 8. • The success of machine learning system also depends on the algorithms. • The algorithms control the search to find and build the knowledge structures. • The learning algorithms should extract useful information from training examples. Algorithms 8
  • 9. • Supervised learning – Prediction – Classification (discrete labels), Regression (real values) • Unsupervised learning – Clustering – Probability distribution estimation – Finding association (in features) – Dimension reduction • Reinforcement learning – Decision making (robot, chess machine) Algorithms 9
  • 10. • Problem of taking labeled dataset, gleaning information from it so that you can label new data sets • Learn to predict output from input • Function approximation Supervised Learning 10
  • 11. • Predict housing prices R Supervised learning: example 1 Regression: predict continuous valued output(price) 11
  • 12. • Breast Cancer(malignant, benign) Supervised learning: example 2 This is classification problem : Discrete valued output(0 or 1) 12
  • 13. 1 attribute/feature : Tumor Size 13
  • 14. Supervised learning: example 3 2 attributes/features : Tumor Size and Age 14
  • 15. 1. Input: Credit history (# of loans, how much money you make,…) Out put : Lend money or not? 2. Input: picture , Output: Predict Bsc, Msc PhD 3. Input: picture, Output: Predict Age 4. Input: Large inventory of identical items, Output: Predict how many items will sell over the next 3 months 5. Input: Customer accounts, Output: hacked or not Q? 15
  • 16. • Find patterns and structure in data Unsupervised Learning 16
  • 17. Unsupervised Learning-examples • Organize computing clusters – Large data centers: what machines work together? • Social network analysis – Given information which friends you send email most /FB friends/Google+ circles – Can we automatically identify which are cohesive groups of friends 17
  • 18. • Market Segmentation – Customer data set and group customer into different market segments • Astronomical data analysis – Clustering algorithms gives interesting & useful theories ex: how galaxies are formed 18
  • 19. 1. Given email labeled as spam/not spam, learn a spam filter 2. Given set of news articles found on the web, group them into set of articles about the same story 3. Given a database of customer data, automatically discover market segments ad groups customer into different market segments 4. Given a dataset of patients diagnosed as either having diabetes or nor, learn to classify new patients as having diabetes or not Q? 19
  • 20. Reinforcement Learning • Learn to act • Learning from delayed reward • Learning come from after several step ,the decisions that you’ve actually made 20
  • 21. Algorithms: The Basic Methods 21
  • 22. Outline • Simplicity first: 1R • Naïve Bayes 22
  • 23. Simplicity first • Simple algorithms often work very well! • There are many kinds of simple structure, eg: – One attribute does all the work – All attributes contribute equally & independently – A weighted linear combination might do – Instance-based: use a few prototypes – Use simple logical rules • Success of method depends on the domain 23
  • 24. Inferring rudimentary rules • 1R: learns a 1-level decision tree – I.e., rules that all test one particular attribute • Basic version – One branch for each value – Each branch assigns most frequent class – Error rate: proportion of instances that don’t belong to the majority class of their corresponding branch – Choose attribute with lowest error rate (assumes nominal attributes) 24
  • 25. Pseudo-code for 1R For each attribute, For each value of the attribute, make a rule as follows: count how often each class appears find the most frequent class make the rule assign that class to this attribute-value Calculate the error rate of the rules Choose the rules with the smallest error rate • Note: “missing” is treated as a separate attribute value 25
  • 26. Evaluating the weather attributes Outlook Temp Humidity Windy Play Sunny Hot High False No Sunny Hot High True No Overcast Hot High False Yes Rainy Mild High False Yes Rainy Cool Normal False Yes Rainy Cool Normal True No Overcast Cool Normal True Yes Sunny Mild High False No Sunny Cool Normal False Yes Rainy Mild Normal False Yes Sunny Mild Normal True Yes Overcast Mild High True Yes Overcast Hot Normal False Yes Rainy Mild High True No Attribute Table 1. Weather data (Nominal) 26
  • 27. Evaluating the weather attributes Attribute Rules Errors Total errors Outlook Sunny  No 2/5 Overcast  Rainy  Temp Hot  Mild  Cool  Humidity High  Normal  Windy False  True  Outlook Temp Humidity Windy Play Sunny Hot High False No Sunny Hot High True No Overcast Hot High False Yes Rainy Mild High False Yes Rainy Cool Normal False Yes Rainy Cool Normal True No Overcast Cool Normal True Yes Sunny Mild High False No Sunny Cool Normal False Yes Rainy Mild Normal False Yes Sunny Mild Normal True Yes Overcast Mild High True Yes Overcast Hot Normal False Yes Rainy Mild High True No * indicates a tie 27
  • 28. Evaluating the weather attributes Attribute Rules Errors Total errors Outlook Sunny  No 2/5 4/14 Overcast  Yes 0/4 Rainy  Yes 2/5 Temp Hot  No* 2/4 5/14 Mild  Yes 2/6 Cool  Yes 1/4 Humidity High  No 3/7 4/14 Normal  Yes 1/7 Windy False  Yes 2/8 5/14 True  No* 3/6 Outlook Temp Humidity Windy Play Sunny Hot High False No Sunny Hot High True No Overcast Hot High False Yes Rainy Mild High False Yes Rainy Cool Normal False Yes Rainy Cool Normal True No Overcast Cool Normal True Yes Sunny Mild High False No Sunny Cool Normal False Yes Rainy Mild Normal False Yes Sunny Mild Normal True Yes Overcast Mild High True Yes Overcast Hot Normal False Yes Rainy Mild High True No * indicates a tie 28
  • 29. Evaluating the weather attributes Attribute Rules Errors Total errors Outlook Sunny  No 2/5 4/14 Overcast  Yes 0/4 Rainy  Yes 2/5 Temp Hot  No* 2/4 5/14 Mild  Yes 2/6 Cool  Yes 1/4 Humidity High  No 3/7 4/14 Normal  Yes 1/7 Windy False  Yes 2/8 5/14 True  No* 3/6 Outlook Temp Humidity Windy Play Sunny Hot High False No Sunny Hot High True No Overcast Hot High False Yes Rainy Mild High False Yes Rainy Cool Normal False Yes Rainy Cool Normal True No Overcast Cool Normal True Yes Sunny Mild High False No Sunny Cool Normal False Yes Rainy Mild Normal False Yes Sunny Mild Normal True Yes Overcast Mild High True Yes Overcast Hot Normal False Yes Rainy Mild High True No * indicates a tie 29
  • 30. Evaluating the weather attributes Attribute Rules Errors Total errors Outlook Sunny  No 2/5 4/14 Overcast  Yes 0/4 Rainy  Yes 2/5 Temp Hot  No* 2/4 5/14 Mild  Yes 2/6 Cool  Yes 1/4 Humidity High  No 3/7 4/14 Normal  Yes 1/7 Windy False  Yes 2/8 5/14 True  No* 3/6 Outlook Temp Humidity Windy Play Sunny Hot High False No Sunny Hot High True No Overcast Hot High False Yes Rainy Mild High False Yes Rainy Cool Normal False Yes Rainy Cool Normal True No Overcast Cool Normal True Yes Sunny Mild High False No Sunny Cool Normal False Yes Rainy Mild Normal False Yes Sunny Mild Normal True Yes Overcast Mild High True Yes Overcast Hot Normal False Yes Rainy Mild High True No * indicates a tie 30
  • 31. Evaluating the weather attributes Attribute Rules Errors Total errors Outlook Sunny->No 2/5 4/14 Overcast->Yes 0/4 Rainy->Yes 2/5 Temp Hot->No* 2/4 5/14 Mild->Yes 2/6 Cool->Yes 1/4 Humidity High->No 3/7 4/14 Normal->Yes 1/7 Windy False->Yes 2/8 5/14 True->No* 3/6 Table 2. Rules for Weather data (Nominal) 32
  • 32. Dealing with numeric attributes • Discretize numeric attributes • Divide each attribute’s range into intervals – Sort instances according to attribute’s values – Place breakpoints where the class changes (the majority class) – This minimizes the total error 33
  • 33. Example: temperature from weather data 64 65 68 69 70 71 72 72 75 75 80 81 83 85 Y | N Outlook Temperature Humidity Windy Play Sunny 85 85 False No Sunny 80 90 True No Overcast 83 86 False Yes Rainy 70 96 False Yes Rainy 68 80 False Yes Rainy 65 70 True No Overcast 64 65 True Yes Sunny 72 95 False No Sunny 69 70 False Yes Rainy 75 80 False Yes Sunny 75 70 True Yes Overcast 72 90 True Yes Overcast 81 75 False Yes Rainy 71 91 True No 34
  • 34. Example: temperature from weather data 64 65 68 69 70 71 72 72 75 75 80 81 83 85 Yes | No | Yes Yes Yes | No No | Yes | Yes Yes | No | Yes Yes | No Outlook Temperature Humidity Windy Play Sunny 85 85 False No Sunny 80 90 True No Overcast 83 86 False Yes Rainy 70 96 False Yes Rainy 68 80 False Yes Rainy 65 70 True No Overcast 64 65 True Yes Sunny 72 95 False No Sunny 69 70 False Yes Rainy 75 80 False Yes Sunny 75 70 True Yes Overcast 72 90 True Yes Overcast 81 75 False Yes Rainy 71 91 True No 64 65 68 69 70 71 72 72 75 75 80 81 83 85 Yes | No | Yes Yes Yes | No No Yes | Yes Yes | No | Yes Yes | No The simplest fix is to move the breakpoint at 72 up one example, to 73.5, producing a mixed partition in which no is the majority class. 35
  • 35. Evaluating the weather attributes Attribute Rules Errors Total errors Outlook Sunny->No 2/5 Overcast-> Rainy-> Temp Hot-> Mild-> Cool-> Humidity High-> Normal-> Windy False-> True-> Table :Rules for Weather data (Nominal) 36
  • 36. Dealing with numeric attributes Attribute Rules Errors Total errors Temperature <=64.5->Yes 0/1 1/14 >64.5->No 0/1 >66.5 and <=70.5->Yes 0/3 >70.5 and <=73.5->No 1/3 >73.5 and <=77.5->Yes 0/2 >77.5 and <=80.5->No 0/1 >80.5 and <=84->Yes 0/2 >84->No 0/1 Table 5. Rules for temperature from weather data (overfitting) 64 65 68 69 70 71 72 72 75 75 80 81 83 85 Yes | No | Yes Yes Yes | No No Yes | Yes Yes | No | Yes Yes | No Break points :64.5, 66.5, 70.5, 73.5, 77.5, 80.5, and 84 37
  • 37. The problem of overfitting • This procedure is very sensitive to noise – One instance with an incorrect class label will probably produce a separate interval • Simple solution: enforce minimum number of instances in majority class per interval 38
  • 38. Discretization example • Example (with min = 3): • Final result for temperature attribute 64 65 68 69 70 71 72 72 75 75 80 81 83 85 Yes | No | Yes Yes Yes | No No Yes | Yes Yes | No | Yes Yes | No 64 65 68 69 70 71 72 72 75 75 80 81 83 85 Yes No Yes Yes Yes | No No Yes Yes Yes | No Yes Yes No 64 65 68 69 70 71 72 72 75 75 80 81 83 85 Yes | No | Yes Yes Yes | No No Yes | Yes Yes | No | Yes Yes | No which leads to the rule set temperature: ≤ 77.5 → yes > 77.5 → 39
  • 39. With overfitting avoidance • Resulting rule set: Attribute Rules Errors Total errors Outlook Sunny  No 2/5 4/14 Overcast  Yes 0/4 Rainy  Yes 2/5 Temperature  77.5  Yes 3/10 5/14 > 77.5  No* 2/4 Humidity  82.5  Yes 1/7 3/14 > 82.5 and  95.5  No 2/6 > 95.5  Yes 0/1 Windy False  Yes 2/8 5/14 True  No* 3/6 40
  • 40. Discussion of 1R • 1R was described in a paper by Holte (1993) – Contains an experimental evaluation on 16 datasets (using cross-validation so that results were representative of performance on future data) – Minimum number of instances was set to 6 after some experimentation – 1R’s simple rules performed not much worse than much more complex decision trees • Simplicity first pays off! Very Simple Classification Rules Perform Well on Most Commonly Used Datasets Robert C. Holte, Computer Science Department, University of Ottawa 41
  • 41. Statistical modeling • “Opposite” of 1R: use all the attributes • Two assumptions: Attributes are – equally important – statistically independent (given the class value) • I.e., knowing the value of one attribute says nothing about the value of another (if the class is known) • Independence assumption is almost never correct! • But … this scheme works surprisingly well in practice 42
  • 42. Probabilities for weather data Outlook Temperature Humidity Windy Play Yes No Yes No Yes No Yes No Yes No Sunny 2 3 Hot 2 2 High 3 4 False 6 2 9 5 Overcast 4 0 Mild 4 2 Normal 6 1 True 3 3 Rainy 3 2 Cool 3 1 Sunny 2/9 3/5 Hot 2/9 2/5 High 3/9 4/5 False 6/9 2/5 9/14 5/14 Overcast 4/9 0/5 Mild 4/9 2/5 Normal 6/9 1/5 True 3/9 3/5 Rainy 3/9 2/5 Cool 3/9 1/5 Outlook Temp Humidity Windy Play Sunny Hot High False No Sunny Hot High True No Overcast Hot High False Yes Rainy Mild High False Yes Rainy Cool Normal False Yes Rainy Cool Normal True No Overcast Cool Normal True Yes Sunny Mild High False No Sunny Cool Normal False Yes Rainy Mild Normal False Yes Sunny Mild Normal True Yes Overcast Mild High True Yes Overcast Hot Normal False Yes Rainy Mild High True No 43
• 43. Probabilities for weather data
• A new day:
Outlook | Temp. | Humidity | Windy | Play
Sunny | Cool | High | True | ?
Likelihood of the two classes (using the probabilities from the previous slide):
For "yes" = 2/9 × 3/9 × 3/9 × 3/9 × 9/14 = 0.0053
For "no" = 3/5 × 1/5 × 4/5 × 3/5 × 5/14 = 0.0206
Conversion into a probability by normalization:
P("yes") = 0.0053 / (0.0053 + 0.0206) = 0.205
P("no") = 0.0206 / (0.0053 + 0.0206) = 0.795
44
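A quick sanity check of these numbers, as a sketch; the fractions are copied straight from the probability table above rather than computed from the raw data:

```python
from fractions import Fraction as F

# Pr[value | class] for Outlook=Sunny, Temp=Cool, Humidity=High,
# Windy=True, multiplied by the class prior Pr[class]
like_yes = F(2, 9) * F(3, 9) * F(3, 9) * F(3, 9) * F(9, 14)
like_no  = F(3, 5) * F(1, 5) * F(4, 5) * F(3, 5) * F(5, 14)
print(float(like_yes), float(like_no))          # ~0.0053 and ~0.0206
print(float(like_yes / (like_yes + like_no)))   # P("yes") ~ 0.205
```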
• 44. Bayes's rule
• Probability of event H given evidence E:
Pr[H | E] = Pr[E | H] Pr[H] / Pr[E]
• A priori probability of H: Pr[H]
– Probability of the event before evidence is seen
• A posteriori probability of H: Pr[H | E]
– Probability of the event after evidence is seen
Thomas Bayes (born 1702 in London, England; died 1761 in Tunbridge Wells, Kent, England). From Bayes's "Essay towards solving a problem in the doctrine of chances" (1763). 45
• 45. Naïve Bayes for classification
• Classification learning: what's the probability of the class given an instance?
– Evidence E = instance
– Event H = class value for instance
• Naïve assumption: evidence splits into parts (i.e. attributes) that are independent, so
Pr[H | E] = Pr[E1 | H] Pr[E2 | H] … Pr[En | H] Pr[H] / Pr[E]
46
• 46. Weather data example
Evidence E:
Outlook | Temp. | Humidity | Windy | Play
Sunny | Cool | High | True | ?
Probability of class "yes":
Pr[yes | E] = Pr[Outlook = Sunny | yes]
            × Pr[Temperature = Cool | yes]
            × Pr[Humidity = High | yes]
            × Pr[Windy = True | yes]
            × Pr[yes] / Pr[E]
            = (2/9 × 3/9 × 3/9 × 3/9 × 9/14) / Pr[E]
47
• 47. The "zero-frequency problem"
• What if an attribute value doesn't occur with every class value? (e.g. "Outlook = Overcast" for class "no")
– Its probability will be zero: Pr[Outlook = Overcast | no] = 0
– The a posteriori probability will also be zero, no matter how likely the other values are: Pr[no | E] = 0
• Remedy: add 1 to the count for every attribute value-class combination (Laplace estimator)
• Result: probabilities will never be zero! (also: stabilizes probability estimates)
48
• 48. The "zero-frequency problem" (continued)
For the Outlook counts of class "no" the Laplace estimator gives
Outlook | No count | Pr[. | no]
Sunny | 3 + 1 | 4/8
Overcast | 0 + 1 | 1/8
Rainy | 2 + 1 | 3/8
and the other attribute value-class combinations are smoothed in the same way. 49
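The correction itself is one line; a minimal sketch (the function name is mine) reproducing the smoothed Outlook probabilities for class "no":

```python
def laplace(count, class_total, n_values, k=1):
    # add k to each of the n_values counts, so the denominator grows by k * n_values
    return (count + k) / (class_total + k * n_values)

for value, count in [("sunny", 3), ("overcast", 0), ("rainy", 2)]:
    print(value, laplace(count, 5, 3))   # 4/8, 1/8, 3/8 as in the table above
```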
• 49. *Modified probability estimates
• In some cases adding a constant different from 1 might be more appropriate
• Example: attribute Outlook for class "yes", with a parameter m and equal weights of 1/3:
Sunny: (2 + m/3) / (9 + m)
Overcast: (4 + m/3) / (9 + m)
Rainy: (3 + m/3) / (9 + m)
• Weights don't need to be equal (but they must sum to 1):
Sunny: (2 + m·p1) / (9 + m)
Overcast: (4 + m·p2) / (9 + m)
Rainy: (3 + m·p3) / (9 + m)
50
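The same idea as a sketch; m and the priors are free parameters, and the choice m = 3 below is arbitrary, made only to show concrete numbers:

```python
def m_estimate(count, class_total, prior, m):
    # Laplace with k=1 is the special case prior = 1/n_values, m = n_values
    return (count + m * prior) / (class_total + m)

for value, count in [("sunny", 2), ("overcast", 4), ("rainy", 3)]:
    print(value, m_estimate(count, 9, 1 / 3, m=3))   # 0.25, ~0.42, ~0.33
```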
• 50. Missing values
• Training: the instance is not included in the frequency count for that attribute value-class combination
• Classification: the attribute is omitted from the calculation
51
• 51. Missing values: example
Outlook | Temp. | Humidity | Windy | Play
? | Cool | High | True | ?
Likelihood of "yes" = 3/9 × 3/9 × 3/9 × 9/14 = 0.0238
Likelihood of "no" = 1/5 × 4/5 × 3/5 × 5/14 = 0.0343
P("yes") = 0.0238 / (0.0238 + 0.0343) = 41%
P("no") = 0.0343 / (0.0238 + 0.0343) = 59%
52
• 52. Numeric attributes
• Usual assumption: attributes have a normal or Gaussian probability distribution (given the class)
• The probability density function of the normal distribution is defined by two parameters:
– Sample mean: μ = (1/n) Σ xi
– Standard deviation: σ = sqrt((1/(n-1)) Σ (xi - μ)²)
– The density function is then f(x) = (1 / (√(2π) σ)) · exp(-(x - μ)² / (2σ²))
Karl Gauss (1777-1855), great German mathematician. 53
• 53. Statistics for weather data
Nominal attributes (Yes|No counts): Outlook: Sunny 2|3, Overcast 4|0, Rainy 3|2; Windy: False 6|2, True 3|3; Play: 9|5
Numeric attributes per class (values read off the weather table):
Temperature (Yes): 64, 68, 69, 70, 72, 75, 75, 81, 83 → μ = 73, σ = 6.2
Temperature (No): 65, 71, 72, 80, 85 → μ = 75, σ = 7.9
Humidity (Yes): 65, 70, 70, 75, 80, 80, 86, 90, 96 → μ = 79, σ = 10.2
Humidity (No): 70, 85, 90, 91, 95 → μ = 86, σ = 9.7
• Example density value:
f(temperature = 66 | yes) = (1 / (√(2π) · 6.2)) · exp(-(66 - 73)² / (2 · 6.2²)) = 0.0340
54
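The density value can be checked with a few lines of Python (a sketch; gauss is a name chosen here):

```python
import math

def gauss(x, mu, sigma):
    # normal density f(x) with parameters mu and sigma
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)

print(gauss(66, 73, 6.2))   # f(temperature=66 | yes) ~ 0.0340
```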
• 54. Classifying a new day
• A new day:
Outlook | Temp. | Humidity | Windy | Play
Sunny | 66 | 90 | true | ?
• Missing values during training are not included in the calculation of the mean and standard deviation
55
• 55. Classifying a new day: example
Likelihood of "yes" = 2/9 × 0.0340 × 0.0221 × 3/9 × 9/14 = 0.000036
Likelihood of "no" = 3/5 × 0.0291 × 0.0380 × 3/5 × 5/14 = 0.000136
P("yes") = 0.000036 / (0.000036 + 0.000136) = 20.9%
P("no") = 0.000136 / (0.000036 + 0.000136) = 79.1%
56
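Putting the pieces together for this day, reusing gauss from the sketch above. The unrounded class statistics (74.6, 79.1, 86.2) are computed from the weather table; with the rounded values shown on the slide the individual densities come out slightly differently, but the final probabilities match:

```python
# nominal Pr[Sunny | class] and Pr[Windy=true | class] from the counts,
# Gaussian densities for temperature=66 and humidity=90, times the priors
like_yes = 2/9 * gauss(66, 73.0, 6.2) * gauss(90, 79.1, 10.2) * 3/9 * 9/14
like_no  = 3/5 * gauss(66, 74.6, 7.9) * gauss(90, 86.2, 9.7) * 3/5 * 5/14
print(like_yes, like_no)                 # ~0.000036 and ~0.000136
print(like_yes / (like_yes + like_no))   # P("yes") ~ 0.21
```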
• 56. Naïve Bayes: discussion
• Naïve Bayes works surprisingly well (even if the independence assumption is clearly violated)
• Why? Because classification doesn't require accurate probability estimates as long as the maximum probability is assigned to the correct class
• However: adding too many redundant attributes will cause problems (e.g. identical attributes)
• Note also: many numeric attributes are not normally distributed (→ kernel density estimators)
57
• 57. Naïve Bayes extensions
• Improvements:
– select the best attributes (e.g. with greedy search)
– often works as well or better with just a fraction of all attributes
• Bayesian networks
58
• 58. Summary
• OneR: uses rules based on just one attribute
• Naïve Bayes: uses all attributes and Bayes's rule to estimate the probability of the class given an instance
• Simple methods frequently work well
– 1R and Naïve Bayes often do just as well, or even better
• but ... complex methods can be better (as we will see)
59