SlideShare a Scribd company logo
MINING FREQUENT PATTERNS,
ASSOCIATION AND CORRELATIONS
Data Mining…
Nadar Saraswathi College Of Arts & Science
Submitted by,
Nagapandiyammal N
W H AT I S F R E Q U E N T PAT T E R N A N A LY S I S ?
 Frequent pattern: a pattern (a set of items, subsequences, substructures, etc.) that
occurs frequently in a data set.
 First proposed by Agrawal, Imielinski, and Swami [AIS93] in the context of
frequent itemsets and association rule mining.
 Motivation: Finding inherent regularities in data
What products were often purchased together?— Beer and diapers?!
What are the subsequent purchases after buying a PC?
What kinds of DNA are sensitive to this new drug?
Can we automatically classify web documents?
I M P O RTA N T F R E Q U E N T PAT T E R N M I N I N G
 Discloses an intrinsic and important property of data sets
 Forms the foundation for many essential data mining tasks
Association, correlation, and causality analysis
Sequential, structural (e.g., sub-graph) patterns
Pattern analysis in spatiotemporal, multimedia, time series, and stream data
Classification: associative classification
Cluster analysis: frequent pattern-based clustering
Data warehousing: iceberg cube and cube-gradient
Semantic data compression: fascicles
Broad applications
C L O S E D PAT T E R N S A N D M A X - PAT T E R N S
 A long pattern contains a combinatorial number of sub patterns,
e.g., {a1, …, a100} contains (1001) + (1002) + … + (110000) =2100 – 1
= 1.27*1030 sub-patterns!
 An item set X is closed if X is frequent and there exists no super-pattern
Y ‫כ‬X, with the same support as X
 An item set X is a max-pattern if X is frequent and there exists no
frequent super-pattern Y ‫כ‬X
 Closed pattern is a lossless compression of freq. patterns
Reducing the # of patterns and rules
SC A LA BLE M ETHOD S FOR M IN IN G
FR EQU EN T PATTER N S
 The downward closure property of frequent patterns
Any subset of a frequent item set must be frequent
If {beer, diaper, nuts} is frequent, so is {beer, diaper}
i.e., every transaction having {beer, diaper, nuts} also contains
{beer, diaper}
Scalable mining methods: Three major approaches
 Apriori
 Frequent pattern growth
 Vertical data format approach
A P R I O R I : A C A N D I D AT E G E N E R AT I O N A N D
T E S T A P P R O A C H
 Apriori pruning principle: If there is any item set which is infrequent, its
superset should not be generated/tested.
 Method:
Initially, scan DB once to get frequent 1-itemset
Generate length (k+1) candidate itemsets from length k
frequent itemsets
Test the candidates against DB
Terminate when no frequent or candidate set can be generated
THE APRIORI ALGORITHM
 Pseudo-code:
Ck: Candidate item set of size k
Lk: frequent item set of size k
L1= {frequent items};
for (k = 1; Lk !=Æ; k++) do begin
Ck+1=candidates generated from Lk;
for each transaction t in database do
increment the count of all candidates in Ck+1
that are contained in t
Lk+1=candidates in Ck+1 with min support
End
return Èk Lk;
IM PORTA N T D ETA ILS OF A PR IOR I
 How to generate candidates?
Step 1: self-joining Lk
Step 2: pruning
How to count supports of candidates?
Example of Candidate-generation
L3={abc, abd, acd, ace, bcd}
Self-joining: L3*L3
abcd from abc and abd
acde from acd and ace
Pruning:
acde is removed because ade is not in L3
C4={abcd}
CHALLENGES OF FREQUENT
PATTERN MINING
 Challenges
 Multiple scans of transaction database
 Huge number of candidates
 Tedious workload of support counting for candidates
 Improving Apriori: general ideas
 Reduce passes of transaction database scans
 Shrink number of candidates
 Facilitate support counting of candidates
SA M PLE FOR FR EQU EN T PATTER N S
 Select a sample of original database, mine frequent
 patterns within sample using Apriori
 Scan database once to verify frequent itemsets found in sample,
only borders of closure of frequent patterns are checked
 Example: check abcd instead of ab, ac, …, etc.
 Scan database again to find missed frequent patterns
T H E E X A M P L E O F A P R I O R I A L G O R I T H M
SUMMARY
 Frequent pattern mining—an important task in data mining
 Scalable frequent pattern mining methods
Apriori (Candidate generation & test)
Projection-based (FP growth, CLOSET+, ...)
Vertical format approach (CHARM, ...)
 Mining a variety of rules and interesting patterns
 Constraint-based mining
 Mining sequential and structured patterns
 Extensions and applications
THE END

More Related Content

Similar to Data Mining

Data Mining: Concepts and Techniques_ Chapter 6: Mining Frequent Patterns, ...
Data Mining:  Concepts and Techniques_ Chapter 6: Mining Frequent Patterns, ...Data Mining:  Concepts and Techniques_ Chapter 6: Mining Frequent Patterns, ...
Data Mining: Concepts and Techniques_ Chapter 6: Mining Frequent Patterns, ...
Salah Amean
 
Mining Frequent Patterns, Association and Correlations
Mining Frequent Patterns, Association and CorrelationsMining Frequent Patterns, Association and Correlations
Mining Frequent Patterns, Association and Correlations
Justin Cletus
 
06FPBasic.ppt
06FPBasic.ppt06FPBasic.ppt
06FPBasic.ppt
KomalBanik
 
06FPBasic.ppt
06FPBasic.ppt06FPBasic.ppt
06FPBasic.ppt
KomalBanik
 
Mining Frequent Itemsets.ppt
Mining Frequent Itemsets.pptMining Frequent Itemsets.ppt
Mining Frequent Itemsets.ppt
NBACriteria2SICET
 
Chapter 6. Mining Frequent Patterns, Associations and Correlations Basic Conc...
Chapter 6. Mining Frequent Patterns, Associations and Correlations Basic Conc...Chapter 6. Mining Frequent Patterns, Associations and Correlations Basic Conc...
Chapter 6. Mining Frequent Patterns, Associations and Correlations Basic Conc...
Subrata Kumer Paul
 
Chapter - 5 Data Mining Concepts and Techniques 2nd Ed slides Han & Kamber
Chapter - 5 Data Mining Concepts and Techniques 2nd Ed slides Han & KamberChapter - 5 Data Mining Concepts and Techniques 2nd Ed slides Han & Kamber
Chapter - 5 Data Mining Concepts and Techniques 2nd Ed slides Han & Kamber
error007
 
Frequent Pattern Analysis, Apriori and FP Growth Algorithm
Frequent Pattern Analysis, Apriori and FP Growth AlgorithmFrequent Pattern Analysis, Apriori and FP Growth Algorithm
Frequent Pattern Analysis, Apriori and FP Growth Algorithm
ShivarkarSandip
 
Cs501 mining frequentpatterns
Cs501 mining frequentpatternsCs501 mining frequentpatterns
Cs501 mining frequentpatterns
Kamal Singh Lodhi
 
Unit 3.pptx
Unit 3.pptxUnit 3.pptx
Unit 3.pptx
AdwaitLaud
 
06FPBasic02.pdf
06FPBasic02.pdf06FPBasic02.pdf
06FPBasic02.pdf
Alireza418370
 
My6asso
My6assoMy6asso
My6asso
ketan533
 
UNIT 3.2 -Mining Frquent Patterns (part1).ppt
UNIT 3.2 -Mining Frquent Patterns (part1).pptUNIT 3.2 -Mining Frquent Patterns (part1).ppt
UNIT 3.2 -Mining Frquent Patterns (part1).ppt
RaviKiranVarma4
 
Chapter 01 Introduction DM.pptx
Chapter 01 Introduction DM.pptxChapter 01 Introduction DM.pptx
Chapter 01 Introduction DM.pptx
ssuser957b41
 
Pattern Discovery Using Apriori and Ch-Search Algorithm
 Pattern Discovery Using Apriori and Ch-Search Algorithm Pattern Discovery Using Apriori and Ch-Search Algorithm
Pattern Discovery Using Apriori and Ch-Search Algorithm
ijceronline
 
Simulation and Performance Analysis of Long Term Evolution (LTE) Cellular Net...
Simulation and Performance Analysis of Long Term Evolution (LTE) Cellular Net...Simulation and Performance Analysis of Long Term Evolution (LTE) Cellular Net...
Simulation and Performance Analysis of Long Term Evolution (LTE) Cellular Net...
ijsrd.com
 

Similar to Data Mining (20)

Data Mining: Concepts and Techniques_ Chapter 6: Mining Frequent Patterns, ...
Data Mining:  Concepts and Techniques_ Chapter 6: Mining Frequent Patterns, ...Data Mining:  Concepts and Techniques_ Chapter 6: Mining Frequent Patterns, ...
Data Mining: Concepts and Techniques_ Chapter 6: Mining Frequent Patterns, ...
 
Mining Frequent Patterns, Association and Correlations
Mining Frequent Patterns, Association and CorrelationsMining Frequent Patterns, Association and Correlations
Mining Frequent Patterns, Association and Correlations
 
06FPBasic.ppt
06FPBasic.ppt06FPBasic.ppt
06FPBasic.ppt
 
06FPBasic.ppt
06FPBasic.ppt06FPBasic.ppt
06FPBasic.ppt
 
Mining Frequent Itemsets.ppt
Mining Frequent Itemsets.pptMining Frequent Itemsets.ppt
Mining Frequent Itemsets.ppt
 
06 fp basic
06 fp basic06 fp basic
06 fp basic
 
Chapter 6. Mining Frequent Patterns, Associations and Correlations Basic Conc...
Chapter 6. Mining Frequent Patterns, Associations and Correlations Basic Conc...Chapter 6. Mining Frequent Patterns, Associations and Correlations Basic Conc...
Chapter 6. Mining Frequent Patterns, Associations and Correlations Basic Conc...
 
Chapter - 5 Data Mining Concepts and Techniques 2nd Ed slides Han & Kamber
Chapter - 5 Data Mining Concepts and Techniques 2nd Ed slides Han & KamberChapter - 5 Data Mining Concepts and Techniques 2nd Ed slides Han & Kamber
Chapter - 5 Data Mining Concepts and Techniques 2nd Ed slides Han & Kamber
 
Frequent Pattern Analysis, Apriori and FP Growth Algorithm
Frequent Pattern Analysis, Apriori and FP Growth AlgorithmFrequent Pattern Analysis, Apriori and FP Growth Algorithm
Frequent Pattern Analysis, Apriori and FP Growth Algorithm
 
05
0505
05
 
Cs501 mining frequentpatterns
Cs501 mining frequentpatternsCs501 mining frequentpatterns
Cs501 mining frequentpatterns
 
Unit 3.pptx
Unit 3.pptxUnit 3.pptx
Unit 3.pptx
 
6 module 4
6 module 46 module 4
6 module 4
 
3. mining frequent patterns
3. mining frequent patterns3. mining frequent patterns
3. mining frequent patterns
 
06FPBasic02.pdf
06FPBasic02.pdf06FPBasic02.pdf
06FPBasic02.pdf
 
My6asso
My6assoMy6asso
My6asso
 
UNIT 3.2 -Mining Frquent Patterns (part1).ppt
UNIT 3.2 -Mining Frquent Patterns (part1).pptUNIT 3.2 -Mining Frquent Patterns (part1).ppt
UNIT 3.2 -Mining Frquent Patterns (part1).ppt
 
Chapter 01 Introduction DM.pptx
Chapter 01 Introduction DM.pptxChapter 01 Introduction DM.pptx
Chapter 01 Introduction DM.pptx
 
Pattern Discovery Using Apriori and Ch-Search Algorithm
 Pattern Discovery Using Apriori and Ch-Search Algorithm Pattern Discovery Using Apriori and Ch-Search Algorithm
Pattern Discovery Using Apriori and Ch-Search Algorithm
 
Simulation and Performance Analysis of Long Term Evolution (LTE) Cellular Net...
Simulation and Performance Analysis of Long Term Evolution (LTE) Cellular Net...Simulation and Performance Analysis of Long Term Evolution (LTE) Cellular Net...
Simulation and Performance Analysis of Long Term Evolution (LTE) Cellular Net...
 

More from NilaNila16

Basic Block Scheduling
Basic Block SchedulingBasic Block Scheduling
Basic Block Scheduling
NilaNila16
 
Affine Array Indexes
Affine Array IndexesAffine Array Indexes
Affine Array Indexes
NilaNila16
 
Software Engineering
Software EngineeringSoftware Engineering
Software Engineering
NilaNila16
 
Web Programming
Web ProgrammingWeb Programming
Web Programming
NilaNila16
 
MapReduce Paradigm
MapReduce ParadigmMapReduce Paradigm
MapReduce Paradigm
NilaNila16
 
Hadoop Distributed File System
Hadoop Distributed File SystemHadoop Distributed File System
Hadoop Distributed File System
NilaNila16
 
Operating system
Operating systemOperating system
Operating system
NilaNila16
 
RDBMS
RDBMSRDBMS
RDBMS
NilaNila16
 
Linear Block Codes
Linear Block CodesLinear Block Codes
Linear Block Codes
NilaNila16
 
Applications of graph theory
                      Applications of graph theory                      Applications of graph theory
Applications of graph theory
NilaNila16
 
Hasse Diagram
Hasse DiagramHasse Diagram
Hasse Diagram
NilaNila16
 
Fuzzy set
Fuzzy set Fuzzy set
Fuzzy set
NilaNila16
 
Recurrence Relation
Recurrence RelationRecurrence Relation
Recurrence Relation
NilaNila16
 
Input/Output Exploring java.io
Input/Output Exploring java.ioInput/Output Exploring java.io
Input/Output Exploring java.io
NilaNila16
 

More from NilaNila16 (14)

Basic Block Scheduling
Basic Block SchedulingBasic Block Scheduling
Basic Block Scheduling
 
Affine Array Indexes
Affine Array IndexesAffine Array Indexes
Affine Array Indexes
 
Software Engineering
Software EngineeringSoftware Engineering
Software Engineering
 
Web Programming
Web ProgrammingWeb Programming
Web Programming
 
MapReduce Paradigm
MapReduce ParadigmMapReduce Paradigm
MapReduce Paradigm
 
Hadoop Distributed File System
Hadoop Distributed File SystemHadoop Distributed File System
Hadoop Distributed File System
 
Operating system
Operating systemOperating system
Operating system
 
RDBMS
RDBMSRDBMS
RDBMS
 
Linear Block Codes
Linear Block CodesLinear Block Codes
Linear Block Codes
 
Applications of graph theory
                      Applications of graph theory                      Applications of graph theory
Applications of graph theory
 
Hasse Diagram
Hasse DiagramHasse Diagram
Hasse Diagram
 
Fuzzy set
Fuzzy set Fuzzy set
Fuzzy set
 
Recurrence Relation
Recurrence RelationRecurrence Relation
Recurrence Relation
 
Input/Output Exploring java.io
Input/Output Exploring java.ioInput/Output Exploring java.io
Input/Output Exploring java.io
 

Recently uploaded

Additional Benefits for Employee Website.pdf
Additional Benefits for Employee Website.pdfAdditional Benefits for Employee Website.pdf
Additional Benefits for Employee Website.pdf
joachimlavalley1
 
Welcome to TechSoup New Member Orientation and Q&A (May 2024).pdf
Welcome to TechSoup   New Member Orientation and Q&A (May 2024).pdfWelcome to TechSoup   New Member Orientation and Q&A (May 2024).pdf
Welcome to TechSoup New Member Orientation and Q&A (May 2024).pdf
TechSoup
 
PART A. Introduction to Costumer Service
PART A. Introduction to Costumer ServicePART A. Introduction to Costumer Service
PART A. Introduction to Costumer Service
PedroFerreira53928
 
How to Break the cycle of negative Thoughts
How to Break the cycle of negative ThoughtsHow to Break the cycle of negative Thoughts
How to Break the cycle of negative Thoughts
Col Mukteshwar Prasad
 
How to Make a Field invisible in Odoo 17
How to Make a Field invisible in Odoo 17How to Make a Field invisible in Odoo 17
How to Make a Field invisible in Odoo 17
Celine George
 
Basic phrases for greeting and assisting costumers
Basic phrases for greeting and assisting costumersBasic phrases for greeting and assisting costumers
Basic phrases for greeting and assisting costumers
PedroFerreira53928
 
Operation Blue Star - Saka Neela Tara
Operation Blue Star   -  Saka Neela TaraOperation Blue Star   -  Saka Neela Tara
Operation Blue Star - Saka Neela Tara
Balvir Singh
 
Phrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXX
Phrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXXPhrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXX
Phrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXX
MIRIAMSALINAS13
 
The approach at University of Liverpool.pptx
The approach at University of Liverpool.pptxThe approach at University of Liverpool.pptx
The approach at University of Liverpool.pptx
Jisc
 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
siemaillard
 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
siemaillard
 
How to Create Map Views in the Odoo 17 ERP
How to Create Map Views in the Odoo 17 ERPHow to Create Map Views in the Odoo 17 ERP
How to Create Map Views in the Odoo 17 ERP
Celine George
 
How to Split Bills in the Odoo 17 POS Module
How to Split Bills in the Odoo 17 POS ModuleHow to Split Bills in the Odoo 17 POS Module
How to Split Bills in the Odoo 17 POS Module
Celine George
 
Introduction to Quality Improvement Essentials
Introduction to Quality Improvement EssentialsIntroduction to Quality Improvement Essentials
Introduction to Quality Improvement Essentials
Excellence Foundation for South Sudan
 
MARUTI SUZUKI- A Successful Joint Venture in India.pptx
MARUTI SUZUKI- A Successful Joint Venture in India.pptxMARUTI SUZUKI- A Successful Joint Venture in India.pptx
MARUTI SUZUKI- A Successful Joint Venture in India.pptx
bennyroshan06
 
Ethnobotany and Ethnopharmacology ......
Ethnobotany and Ethnopharmacology ......Ethnobotany and Ethnopharmacology ......
Ethnobotany and Ethnopharmacology ......
Ashokrao Mane college of Pharmacy Peth-Vadgaon
 
Instructions for Submissions thorugh G- Classroom.pptx
Instructions for Submissions thorugh G- Classroom.pptxInstructions for Submissions thorugh G- Classroom.pptx
Instructions for Submissions thorugh G- Classroom.pptx
Jheel Barad
 
Cambridge International AS A Level Biology Coursebook - EBook (MaryFosbery J...
Cambridge International AS  A Level Biology Coursebook - EBook (MaryFosbery J...Cambridge International AS  A Level Biology Coursebook - EBook (MaryFosbery J...
Cambridge International AS A Level Biology Coursebook - EBook (MaryFosbery J...
AzmatAli747758
 
Supporting (UKRI) OA monographs at Salford.pptx
Supporting (UKRI) OA monographs at Salford.pptxSupporting (UKRI) OA monographs at Salford.pptx
Supporting (UKRI) OA monographs at Salford.pptx
Jisc
 
Template Jadual Bertugas Kelas (Boleh Edit)
Template Jadual Bertugas Kelas (Boleh Edit)Template Jadual Bertugas Kelas (Boleh Edit)
Template Jadual Bertugas Kelas (Boleh Edit)
rosedainty
 

Recently uploaded (20)

Additional Benefits for Employee Website.pdf
Additional Benefits for Employee Website.pdfAdditional Benefits for Employee Website.pdf
Additional Benefits for Employee Website.pdf
 
Welcome to TechSoup New Member Orientation and Q&A (May 2024).pdf
Welcome to TechSoup   New Member Orientation and Q&A (May 2024).pdfWelcome to TechSoup   New Member Orientation and Q&A (May 2024).pdf
Welcome to TechSoup New Member Orientation and Q&A (May 2024).pdf
 
PART A. Introduction to Costumer Service
PART A. Introduction to Costumer ServicePART A. Introduction to Costumer Service
PART A. Introduction to Costumer Service
 
How to Break the cycle of negative Thoughts
How to Break the cycle of negative ThoughtsHow to Break the cycle of negative Thoughts
How to Break the cycle of negative Thoughts
 
How to Make a Field invisible in Odoo 17
How to Make a Field invisible in Odoo 17How to Make a Field invisible in Odoo 17
How to Make a Field invisible in Odoo 17
 
Basic phrases for greeting and assisting costumers
Basic phrases for greeting and assisting costumersBasic phrases for greeting and assisting costumers
Basic phrases for greeting and assisting costumers
 
Operation Blue Star - Saka Neela Tara
Operation Blue Star   -  Saka Neela TaraOperation Blue Star   -  Saka Neela Tara
Operation Blue Star - Saka Neela Tara
 
Phrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXX
Phrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXXPhrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXX
Phrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXX
 
The approach at University of Liverpool.pptx
The approach at University of Liverpool.pptxThe approach at University of Liverpool.pptx
The approach at University of Liverpool.pptx
 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
 
How to Create Map Views in the Odoo 17 ERP
How to Create Map Views in the Odoo 17 ERPHow to Create Map Views in the Odoo 17 ERP
How to Create Map Views in the Odoo 17 ERP
 
How to Split Bills in the Odoo 17 POS Module
How to Split Bills in the Odoo 17 POS ModuleHow to Split Bills in the Odoo 17 POS Module
How to Split Bills in the Odoo 17 POS Module
 
Introduction to Quality Improvement Essentials
Introduction to Quality Improvement EssentialsIntroduction to Quality Improvement Essentials
Introduction to Quality Improvement Essentials
 
MARUTI SUZUKI- A Successful Joint Venture in India.pptx
MARUTI SUZUKI- A Successful Joint Venture in India.pptxMARUTI SUZUKI- A Successful Joint Venture in India.pptx
MARUTI SUZUKI- A Successful Joint Venture in India.pptx
 
Ethnobotany and Ethnopharmacology ......
Ethnobotany and Ethnopharmacology ......Ethnobotany and Ethnopharmacology ......
Ethnobotany and Ethnopharmacology ......
 
Instructions for Submissions thorugh G- Classroom.pptx
Instructions for Submissions thorugh G- Classroom.pptxInstructions for Submissions thorugh G- Classroom.pptx
Instructions for Submissions thorugh G- Classroom.pptx
 
Cambridge International AS A Level Biology Coursebook - EBook (MaryFosbery J...
Cambridge International AS  A Level Biology Coursebook - EBook (MaryFosbery J...Cambridge International AS  A Level Biology Coursebook - EBook (MaryFosbery J...
Cambridge International AS A Level Biology Coursebook - EBook (MaryFosbery J...
 
Supporting (UKRI) OA monographs at Salford.pptx
Supporting (UKRI) OA monographs at Salford.pptxSupporting (UKRI) OA monographs at Salford.pptx
Supporting (UKRI) OA monographs at Salford.pptx
 
Template Jadual Bertugas Kelas (Boleh Edit)
Template Jadual Bertugas Kelas (Boleh Edit)Template Jadual Bertugas Kelas (Boleh Edit)
Template Jadual Bertugas Kelas (Boleh Edit)
 

Data Mining

  • 1. MINING FREQUENT PATTERNS, ASSOCIATION AND CORRELATIONS Data Mining… Nadar Saraswathi College Of Arts & Science Submitted by, Nagapandiyammal N
  • 2. W H AT I S F R E Q U E N T PAT T E R N A N A LY S I S ?  Frequent pattern: a pattern (a set of items, subsequences, substructures, etc.) that occurs frequently in a data set.  First proposed by Agrawal, Imielinski, and Swami [AIS93] in the context of frequent itemsets and association rule mining.  Motivation: Finding inherent regularities in data What products were often purchased together?— Beer and diapers?! What are the subsequent purchases after buying a PC? What kinds of DNA are sensitive to this new drug? Can we automatically classify web documents?
  • 3. I M P O RTA N T F R E Q U E N T PAT T E R N M I N I N G  Discloses an intrinsic and important property of data sets  Forms the foundation for many essential data mining tasks Association, correlation, and causality analysis Sequential, structural (e.g., sub-graph) patterns Pattern analysis in spatiotemporal, multimedia, time series, and stream data Classification: associative classification Cluster analysis: frequent pattern-based clustering Data warehousing: iceberg cube and cube-gradient Semantic data compression: fascicles Broad applications
  • 4. C L O S E D PAT T E R N S A N D M A X - PAT T E R N S  A long pattern contains a combinatorial number of sub patterns, e.g., {a1, …, a100} contains (1001) + (1002) + … + (110000) =2100 – 1 = 1.27*1030 sub-patterns!  An item set X is closed if X is frequent and there exists no super-pattern Y ‫כ‬X, with the same support as X  An item set X is a max-pattern if X is frequent and there exists no frequent super-pattern Y ‫כ‬X  Closed pattern is a lossless compression of freq. patterns Reducing the # of patterns and rules
  • 5. SC A LA BLE M ETHOD S FOR M IN IN G FR EQU EN T PATTER N S  The downward closure property of frequent patterns Any subset of a frequent item set must be frequent If {beer, diaper, nuts} is frequent, so is {beer, diaper} i.e., every transaction having {beer, diaper, nuts} also contains {beer, diaper} Scalable mining methods: Three major approaches  Apriori  Frequent pattern growth  Vertical data format approach
  • 6. A P R I O R I : A C A N D I D AT E G E N E R AT I O N A N D T E S T A P P R O A C H  Apriori pruning principle: If there is any item set which is infrequent, its superset should not be generated/tested.  Method: Initially, scan DB once to get frequent 1-itemset Generate length (k+1) candidate itemsets from length k frequent itemsets Test the candidates against DB Terminate when no frequent or candidate set can be generated
  • 7. THE APRIORI ALGORITHM  Pseudo-code: Ck: Candidate item set of size k Lk: frequent item set of size k L1= {frequent items}; for (k = 1; Lk !=Æ; k++) do begin Ck+1=candidates generated from Lk; for each transaction t in database do increment the count of all candidates in Ck+1 that are contained in t Lk+1=candidates in Ck+1 with min support End return Èk Lk;
  • 8. IM PORTA N T D ETA ILS OF A PR IOR I  How to generate candidates? Step 1: self-joining Lk Step 2: pruning How to count supports of candidates? Example of Candidate-generation L3={abc, abd, acd, ace, bcd} Self-joining: L3*L3 abcd from abc and abd acde from acd and ace Pruning: acde is removed because ade is not in L3 C4={abcd}
  • 9. CHALLENGES OF FREQUENT PATTERN MINING  Challenges  Multiple scans of transaction database  Huge number of candidates  Tedious workload of support counting for candidates  Improving Apriori: general ideas  Reduce passes of transaction database scans  Shrink number of candidates  Facilitate support counting of candidates
  • 10. SA M PLE FOR FR EQU EN T PATTER N S  Select a sample of original database, mine frequent  patterns within sample using Apriori  Scan database once to verify frequent itemsets found in sample, only borders of closure of frequent patterns are checked  Example: check abcd instead of ab, ac, …, etc.  Scan database again to find missed frequent patterns
  • 11. T H E E X A M P L E O F A P R I O R I A L G O R I T H M
  • 12. SUMMARY  Frequent pattern mining—an important task in data mining  Scalable frequent pattern mining methods Apriori (Candidate generation & test) Projection-based (FP growth, CLOSET+, ...) Vertical format approach (CHARM, ...)  Mining a variety of rules and interesting patterns  Constraint-based mining  Mining sequential and structured patterns  Extensions and applications