Data Mining
1. Association Rule for finding an
optimal pattern in the
real-time dataset for
Supermarket
Dr. Ritu Bhargava
Lecturer,
Department of Computer Science
Sophia Girls’ College, (Autonomous)
Ajmer,
drritubhargava92@gmail.com
Chandni Sharma
Chandanisharma101.cs@gmail.com
Ravina Jeswani
dravina8@gmail.com
2. Contents
❑ Introduction
❑ Association Rules
❑ Market Basket Analysis
❑ Paper Presentation
▪ Introduction to Weka
▪ Project Dataset
▪ Original Dataset
▪ Apriori Algorithm
▪ Pseudo Code
▪ Generated rules
▪ Support & Confidence
▪ Visualization
❑ Benefits
❑ Conclusion
❑ References
3. What is Data Mining?
DM → Data Mining
Data mining is the process of analyzing large amounts of data stored in a
data warehouse for useful information. It makes use of:
• AI (Artificial Intelligence) techniques
• Neural networks
• Statistical tools such as cluster analysis
4. What is Association Rule?
Association rules are created by analyzing data for frequent
if/then patterns and using the
criteria of support and confidence to identify the most
important relationships.
E.g.: Eggs → Milk
"If a customer buys a dozen eggs, he is 80% likely to also
purchase milk."
5. Market Basket Analysis
Market Basket Analysis is a modelling
technique based upon the theory that if you
buy a certain group of items, you are more
(or less) likely to buy another group of items.
It works by looking for combinations of items
that occur together frequently in
transactions.
E.g.: Aashirwad aata → Sugar Loose
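The idea of looking for items that occur together frequently can be sketched in a few lines of Python. This is only an illustration, not the method used in the presentation: the transactions and item names below are hypothetical, and `count_pairs` is a helper name introduced for this example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy transactions; the item names are illustrative only.
transactions = [
    {"flour", "sugar", "milk"},
    {"flour", "sugar"},
    {"milk", "eggs"},
    {"flour", "sugar", "eggs"},
]


def count_pairs(transactions):
    """Count how many transactions contain each unordered pair of items."""
    counts = Counter()
    for basket in transactions:
        # Sorting gives every pair one canonical order, e.g. ("flour", "sugar").
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return counts


pair_counts = count_pairs(transactions)
most_common_pair, frequency = pair_counts.most_common(1)[0]
print(most_common_pair, frequency)
```

Pairs with a high count are candidates for rules such as "customers who buy flour also tend to buy sugar".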
6. What is Weka?
Weka is a collection of machine learning
algorithms for data mining tasks. It contains tools
for :
• Data preparation
• Classification
• Regression
• Clustering
• Association rules mining
• Visualization
7. Project Dataset
A data set (or dataset) is a collection of data.
Most commonly a data set corresponds to
the contents of a single database table, or a
single statistical data matrix, where every
column of the table represents a particular
variable, and each row corresponds to a
given member of the data set in question.
Eg: Student dataset
Supermarket dataset
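As a rough sketch of the row-and-column layout described above, supermarket baskets can be turned into a table with one column per item and one row per transaction. The baskets below are made up for illustration, and the names `baskets`, `items` and `table` are assumptions of this example rather than part of the project dataset.

```python
# Hypothetical baskets; each inner list is one transaction.
baskets = [
    ["bread", "milk"],
    ["milk", "eggs"],
    ["bread", "eggs", "milk"],
]

# Columns: every distinct item, in a fixed (alphabetical) order.
items = sorted({item for basket in baskets for item in basket})

# Rows: one per transaction; True means the item is in that basket.
table = [[item in basket for item in items] for basket in baskets]

print(items)
for row in table:
    print(row)
```

Each column is then a variable ("was this item bought?") and each row is one member of the data set, which is the form association rule mining tools typically expect.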
9. Apriori Algorithm
Apriori is an algorithm for frequent item set mining and association rule
learning over transactional databases. The frequent item sets determined by
Apriori can be used to determine association rules which highlight general trends
in the database.
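A minimal sketch of the level-wise idea behind Apriori is given below, assuming transactions small enough to fit in memory. It is not Weka's implementation: the function name `apriori`, the `min_support` parameter and the sample baskets are assumptions made for this illustration.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item of the itemset.
        return sum(itemset <= t for t in transactions) / n

    # Level 1: frequent single items.
    all_items = {item for t in transactions for item in t}
    current = {}
    for item in all_items:
        s = support(frozenset([item]))
        if s >= min_support:
            current[frozenset([item])] = s

    frequent = {}
    k = 1
    while current:
        frequent.update(current)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: a candidate survives only if all its (k-1)-subsets are frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        # Keep the candidates that reach the minimum support.
        scored = {c: support(c) for c in candidates}
        current = {c: s for c, s in scored.items() if s >= min_support}
    return frequent


# Hypothetical sample data for illustration.
sample = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
frequent = apriori(sample, min_support=0.5)
for itemset, s in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), s)
```

The prune step is what keeps the search manageable: any itemset containing an infrequent subset cannot itself be frequent, so it is discarded without counting.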
11. Association Rules
Our main objective is to generate the best rules from the datasets. These rules
would be very helpful for making future decisions. The Associate result is shown
in the figure given below:
12. Rule Basic Measures
Support: denotes the frequency of the rule within the
transactions. A high value means that the rule involves
a large part of the database.
support(X ⇒ Y [s, c]) = p(X ∪ Y)
Confidence: denotes the percentage of transactions
containing X which also contain Y. It is an estimate of the
conditional probability.
confidence(X ⇒ Y [s, c]) = p(Y|X) = sup(X ∪ Y)/sup(X)
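The two measures can be computed directly from a list of transactions. The sketch below is illustrative only: the function names `support` and `confidence` and the ten sample baskets are assumptions chosen so that the rule eggs ⇒ milk mirrors the 80% example from the earlier slide.

```python
import math


def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= set(t) for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate p(consequent | antecedent) as sup(X and Y together) / sup(X)."""
    both = set(antecedent) | set(consequent)
    # Note: raises ZeroDivisionError if the antecedent never occurs.
    return support(transactions, both) / support(transactions, antecedent)


# Hypothetical data: 10 baskets, 5 contain eggs, 4 of those also contain milk.
baskets = [{"eggs", "milk"}] * 4 + [{"eggs", "bread"}] + [{"bread"}] * 5

rule_support = support(baskets, {"eggs", "milk"})
rule_confidence = confidence(baskets, {"eggs"}, {"milk"})
print(rule_support, rule_confidence)
assert math.isclose(rule_confidence, 0.8)
```

Here the rule covers 4 of 10 baskets, and 4 of the 5 baskets with eggs also contain milk, which is where the 80% figure comes from.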
13. Visualization
Weka automatically calculates descriptive statistics for each attribute.
Weka allows you to review the distribution of each attribute easily.
Weka provides a scatter plot visualization to review the pairwise
relationships between attributes.
14. Beneficial for Customers & Organizations
✓ The store layout can be organized to aim
for the best revenues.
✓ You can suggest the next best
product that a customer is likely to
buy.
✓ Improve the allocation of resources.
✓ Better performance of search results
in the case of e-commerce.
✓ Help customers to find products of
interest.
✓ Predict future purchases.
15. Conclusion
The aim of this study was the market basket
analysis of purchases by mining association rules on
value-based data from a supermarket, in order
to give greater insight into the purchasing
behavior of its customers and to discuss the outcomes
of the method.
This gave focus to the analysis, reducing
the search space of the data and lessening the number of
generated patterns.
16. References
1. Agrawal R, Srikant R. Fast Algorithms for Mining Association Rules in Large Databases
(International Conference on Very Large Data Bases. Morgan Kaufmann Publishers Inc, 1994),
pp. 487–499.
2. Karimi-Majd A M, Mahootchi M. A new data mining methodology for generating new
service ideas (Information Systems and e-Business Management, 2015, 13(3)), pp. 421–443.
https://doi.org/10.1007/s10257-014-0267-y
3. Wang J, Li H, Huang J, et al. Association rules mining based analysis of consequential alarm
sequences in chemical processes (Journal of Loss Prevention in the Process Industries,
2016(41)), pp. 178–185. https://doi.org/10.1016/j.jlp.2016.03.022
4. Borgelt C. Frequent item set mining (Wiley Interdisciplinary Reviews: Data Mining &
Knowledge Discovery, 2012, 2(6)), pp. 437–456. https://doi.org/10.1002/widm.1074
5. Han J, Pei J, Yin Y. Mining frequent patterns without candidate generation (ACM SIGMOD
Record, 2000, 29(2)), pp. 1–12. https://doi.org/10.1145/335191.335372