1
Machine Learning: Lecture 9
Rule Learning /
Inductive Logic Programming /
Association Rules
2
Learning Rules
 One of the most expressive and human-readable representations
for learned hypotheses is sets of production rules (if-then rules).
 Rules can be derived from other representations (e.g., decision
trees) or they can be learned directly. Here, we are concentrating
on the direct method.
 An important aspect of direct rule-learning algorithms is that they
can learn sets of first-order rules which have much more
representational power than the propositional rules that can be
derived from decision trees.
 Rule Learning also allows the incorporation of background
knowledge into the process.
 Learning rules is also useful for the data mining task of association
rules mining.
3
Propositional versus First-Order Logic
 Propositional Logic does not include variables and
thus cannot express general relations among the values
of the attributes.
 Example 1: in Propositional logic, you can write:
IF (Father1=Bob) ^ (Name2=Bob) ^ (Female1=True)
THEN Daughter1,2=True.
This rule applies only to a specific family!
 Example 2: In First-Order logic, you can write:
IF Father(y,x) ^ Female(y) THEN Daughter(x,y)
This rule (which you cannot write in Propositional
Logic) applies to any family!
4
Learning Propositional versus
First-Order Rules
 Both approaches to learning are useful as they address
different types of learning problems.
 Like Decision Trees, Feedforward Neural Nets and IBL
systems, Propositional Rule Learning systems are suited
for problems in which no substantial relationship between
the values of the different attributes needs to be
represented.
 In First-Order Learning Problems, the hypotheses that
must be represented involve relational assertions that can
be conveniently expressed using first-order
representations such as Horn clauses (H <- L1 ^…^ Ln).
5
Learning Propositional Rules:
Sequential Covering Algorithms
Sequential-Covering(Target_attribute, Attributes,
Examples, Threshold)
 Learned_rules <-- { }
 Rule <-- Learn-one-rule(Target_attribute, Attributes,
Examples)
 While Performance(Rule, Examples) > Threshold, do
 Learned_rules <-- Learned_rules + Rule
 Examples <-- Examples - {examples correctly
classified by Rule}
 Rule <-- Learn-one-rule(Target_attribute,
Attributes, Examples)
 Learned_rules <-- sort Learned_rules according to
Performance over Examples
 Return Learned_rules
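Below is a minimal Python sketch of the procedure above (an illustration, not the lecture's code). The representation is an assumption: examples are (attribute-dict, label) pairs, rule objects expose covers(x) and a .prediction label, and Performance is taken to be accuracy over the covered examples.

```python
# A sketch, not the lecture's code: examples are (attribute-dict, label)
# pairs; rule objects expose covers(x) and a .prediction label.

def performance(rule, examples):
    """Accuracy of the rule over the examples it covers."""
    covered = [(x, y) for x, y in examples if rule.covers(x)]
    if not covered:
        return 0.0
    return sum(1 for _, y in covered if y == rule.prediction) / len(covered)

def sequential_covering(examples, learn_one_rule, threshold=0.8):
    all_examples = list(examples)
    learned_rules = []
    rule = learn_one_rule(examples)
    while rule is not None and performance(rule, examples) > threshold:
        learned_rules.append(rule)
        # Remove the examples the new rule classifies correctly.
        examples = [(x, y) for x, y in examples
                    if not (rule.covers(x) and rule.prediction == y)]
        rule = learn_one_rule(examples)
    # Sort so the most accurate rules are tried first at classification time.
    learned_rules.sort(key=lambda r: performance(r, all_examples), reverse=True)
    return learned_rules
```

A matching Learn-one-rule is sketched after the next slide.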
6
Learning Propositional Rules:
Sequential Covering Algorithms
 The algorithm is called a sequential covering algorithm
because it sequentially learns a set of rules that together
cover the whole set of positive examples.
 It has the advantage of reducing the problem of learning a
disjunctive set of rules to a sequence of simpler problems,
each requiring that a single conjunctive rule be learned.
 The final set of rules is sorted so that the most accurate
rules are considered first at classification time.
 However, because it does not backtrack, this algorithm is
not guaranteed to find the smallest or best set of rules --->
Learn-one-rule must be very effective!
7
Learning Propositional Rules:
Learn-one-rule
General-to-Specific Search:
 Consider the most general rule (hypothesis), which
matches every instance in the training set.
 Repeat
 Add the attribute that most improves rule
performance measured over the training set.
 Until the hypothesis reaches an acceptable level of
performance.
General-to-Specific Beam Search (CN2):
 Rather than considering a single candidate at each
search step, keep track of the k best candidates.
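Continuing the sketch above: a hypothetical Rule class (a conjunction of attribute=value tests predicting the majority label of the examples it covers) and a greedy implementation of the general-to-specific search. The representation and the min_gain tie-breaking threshold are assumptions, not part of the lecture.

```python
from collections import Counter

class Rule:
    """Hypothetical rule: a conjunction of attribute=value tests that
    predicts the majority label of the training examples it covers."""
    def __init__(self, conditions, examples):
        self.conditions = conditions  # dict: attribute -> required value
        covered = [y for x, y in examples if self.covers(x)]
        self.prediction = Counter(covered).most_common(1)[0][0] if covered else None

    def covers(self, x):
        return all(x.get(a) == v for a, v in self.conditions.items())

def learn_one_rule(examples, attributes, performance, min_gain=1e-9):
    rule = Rule({}, examples)      # most general rule: matches everything
    improved = True
    while improved:                # repeat until no added test helps
        improved = False
        best = rule
        for a in attributes:
            if a in rule.conditions:
                continue
            for v in {x[a] for x, _ in examples}:
                cand = Rule({**rule.conditions, a: v}, examples)
                if performance(cand, examples) > performance(best, examples) + min_gain:
                    best = cand
        if best is not rule:
            rule, improved = best, True
    return rule
```

For use with the earlier driver, e.g. sequential_covering(data, lambda ex: learn_one_rule(ex, attributes, performance)). The beam-search (CN2) variant would keep the k best candidates at each step instead of only the single best.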
8
Comments and Variations regarding
the Basic Rule Learning Algorithms
 Sequential versus Simultaneous covering: sequential
covering algorithms (CN2) make a larger number of
independent choices than simultaneous covering ones (ID3).
 Direction of the search: CN2 uses a general-to-specific
search strategy. Other systems (GOLEM) use a specific-to-general
search strategy. General-to-specific search has the
advantage of having a single hypothesis from which to start.
 Generate-then-test versus example-driven: CN2 is a
generate-then-test method. Other methods (AQ, CIGOL) are
example-driven. Generate-then-test systems are more robust
to noise.
9
Comments and Variations regarding the
Basic Rule Learning Algorithms, Cont’d
 Post-Pruning: pre-conditions can be removed
from the rule whenever this leads to improved
performance over a set of pruning examples
distinct from the training set.
 Performance measure: different evaluation functions can be
used. Examples: relative frequency (AQ), m-estimate of
accuracy (certain versions of CN2), and entropy
(original CN2).
10
Example: RIPPER (this and the next three slides are
borrowed from E. Alpaydin, Lecture Notes for Introduction
to Machine Learning, 2004, MIT Press (Chapter 9)).
There are two kinds of loop in the Ripper algorithm:
Outer loop: adding one rule at a time to the rule base
Inner loop: adding one condition at a time to the
current rule
Conditions are added to the rule to maximize an
information gain measure.
Conditions are added to the rule until it covers no
negative example.
11
DL: description length of
the rule base
O(N log2 N)
The description length of a rule base
= (the sum of the description lengths
of all the rules in the rule base)
+ (the description length of the instances
not covered by the rule base)
12
13
Ripper Algorithm
 In Ripper, conditions are added to the rule to
 Maximize an information gain measure
• R : the original rule
• R’ : the candidate rule after adding a condition
• N (N’): the number of instances that are covered by R (R’)
• N+ (N’+): the number of true positives in R (R’)
• s : the number of true positives in R and R’ (after adding the
condition)
 Until it covers no negative example
p and n : the number of true and false
positives respectively.
Rule value metric:
rvm(R) = (p - n) / (p + n)
Information gain:
Gain(R', R) = s * ( log2(N'+ / N') - log2(N+ / N) )
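A direct transcription of the two measures into Python (a sketch; variable names follow the slide, with the primed quantities N', N'+ written as N2, N2_pos):

```python
import math

def rule_value_metric(p, n):
    """rvm(R) = (p - n) / (p + n); p, n = true/false positives of rule R."""
    return (p - n) / (p + n)

def ripper_gain(s, N, N_pos, N2, N2_pos):
    """Gain(R', R) = s * (log2(N'+/N') - log2(N+/N)).
    s: true positives covered by both R and R' (after adding the condition)."""
    return s * (math.log2(N2_pos / N2) - math.log2(N_pos / N))
```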
14
Incorporating Background Knowledge into the
Learning Process:
Induction as Inverted Deduction
 Let D be a set of training examples, each of the form <xi,f(xi)>.
Then, learning is the problem of discovering a hypothesis h, such
that the classification f(xi) of each training instance xi follows
deductively from the hypothesis h, the description of xi and any
other background knowledge B known to the system.
Example:
 xi: Male(Bob), Female(Sharon), Father(Sharon, Bob)
 f(xi): Child(Bob, Sharon)
 B: Parent(u,v) <-- Father(v,u)
 we want to find h s.t., (B^h^xi) |-- f(xi).
h1: Child(u,v) <-- Father(v,u)
h2: Child(u,v) <-- Parent(v,u)
15
Learning Sets of First-Order
Rules: FOIL (Quinlan, 1990)
FOIL is similar to the Propositional Rule learning approach
except for the following:
 FOIL accommodates first-order rules and thus needs to
accommodate variables in the rule pre-conditions.
 FOIL uses a special performance measure (FOIL-GAIN)
which takes into account the different variable bindings.
 FOIL seeks only rules that predict when the target literal
is True (instead of predicting when it is True or when it is
False).
 FOIL performs a simple hillclimbing search rather than a
beam search.
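The slide does not spell FOIL-GAIN out; for reference, a sketch of the standard definition from Mitchell's Machine Learning (Chapter 10):

```python
import math

def foil_gain(t, p0, n0, p1, n1):
    """FOIL-Gain of adding literal L to rule R:
    p0, n0 = positive/negative bindings of R;
    p1, n1 = positive/negative bindings of the extended rule R + L;
    t = positive bindings of R still covered after adding L."""
    return t * (math.log2(p1 / (p1 + n1)) - math.log2(p0 / (p0 + n0)))
```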
16
Association Rule Mining (borrowed
from Stan Matwin’s slides)
 Given I = {i1, ..., im}, a set of items
 D, a set of transactions (a database); each transaction T
is a set of items (T ⊆ I)
 Association rule: X => Y, with X ⊆ I, Y ⊆ I, X ∩ Y = ∅
 The support of an itemset is defined as the proportion of
transactions in the data set which contain the itemset.
 The confidence of a rule is defined as
conf(X => Y) = supp(X U Y) / supp(X)
 An itemset is frequent if its support > θ
17
Itemsets and Association Rules
 Itemset = set of items
 k-itemset = itemset containing k items
 Finding association rules in databases:
 Find all frequent (or large) itemsets (those with
support > minS)
 Generate rules that satisfy minimum confidence
 Example of an association rule: People who
buy a computer also buy financial software
(support of 2%; confidence of 60%)
18
Example
 Itemset {milk, bread, butter}
 Support = 1/5 = 0.2
 Rule {Bread, Butter} => {Milk}
 Confidence = 0.2/0.2 = 1

transaction ID   milk   bread   butter   beer
1                1      1       0        0
2                0      0       1        0
3                0      0       0        1
4                1      1       1        0
5                0      1       0        0
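The worked example can be reproduced in a few lines of Python (a sketch; the table is hard-coded, and each transaction is represented as a set of item names):

```python
transactions = [
    {"milk", "bread"},            # 1
    {"butter"},                   # 2
    {"beer"},                     # 3
    {"milk", "bread", "butter"},  # 4
    {"bread"},                    # 5
]

def support(itemset):
    """Fraction of transactions containing the itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(lhs, rhs):
    """Confidence of the rule lhs => rhs."""
    return support(lhs | rhs) / support(lhs)

print(support({"milk", "bread", "butter"}))        # 0.2
print(confidence({"bread", "butter"}, {"milk"}))   # 1.0
```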
19
Apriori Algorithm
 Start with individual items with large support
 At each subsequent step k:
 use the frequent itemsets from step k-1 to generate
candidate itemsets Ck
 compute the support of each candidate in Ck
 prune the candidates whose support is below the threshold θ
Apriori property: All [non-empty] subsets of a
frequent itemset must be frequent
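A compact, illustrative Apriori sketch under the same set-of-items representation as above (candidate generation is the simple join-and-prune version, not an optimized implementation):

```python
from itertools import combinations

def apriori(transactions, theta):
    """Return all itemsets with support > theta."""
    def support(itemset):
        return sum(itemset <= t for t in transactions) / len(transactions)

    items = {frozenset([i]) for t in transactions for i in t}
    current = {s for s in items if support(s) > theta}
    frequent, k = [], 1
    while current:
        frequent.extend(current)
        k += 1
        # Join step: union pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step (Apriori property): every (k-1)-subset must be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in current for s in combinations(c, k - 1))}
        current = {c for c in candidates if support(c) > theta}
    return frequent
```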
20
Apriori Algorithm: Example from
Han & Kamber, Data Mining, p. 232
TID    List of item IDs
T100   I1, I2, I5
T200   I2, I4
T300   I2, I3
T400   I1, I2, I4
T500   I1, I3
T600   I2, I3
T700   I1, I3
T800   I1, I2, I3, I5
T900   I1, I2, I3
21
Apriori Algorithm: Example from
Han & Kamber, Data Mining, p. 232 (Cont’d)
22
From itemsets to association rules
 For each frequent itemset I:
 generate all partitions of I into (s, I-s)
 output the rule s => I-s iff
support_count(I) / support_count(s) > minc (see the sketch below)
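A sketch of this generation step (support_count is assumed to be a function returning absolute support counts over the transaction database):

```python
from itertools import combinations

def rules_from_itemset(I, support_count, minc):
    """Generate rules s => I - s from frequent itemset I whose confidence
    support_count(I) / support_count(s) exceeds minc."""
    I = frozenset(I)
    rules = []
    for r in range(1, len(I)):                     # all non-trivial partitions
        for s in map(frozenset, combinations(I, r)):
            conf = support_count(I) / support_count(s)
            if conf > minc:
                rules.append((s, I - s, conf))     # the rule s => I - s
    return rules
```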