Association Rule Product Recommendation &
Market Basket Analysis Using FP Growth
Prepared By
Md. Sazzad Hossain Hira
Salvi Saba
Sunday, August 21, 2022
Motivation
• If it is known that customers who purchase one product are
likely to purchase another, retailers can market these products
together, or treat purchasers of the first product as target
prospects for the second. This would help increase product sales.
• Market Basket Analysis is a key method for revealing relationships
between products such as bread and butter. It works by searching
for combinations of products that frequently occur together in
transactions. In other words, it enables retailers to recognize
connections between the items that individuals purchase. The
value-added information discovered through Market Basket
Analysis can be used to support decision making.
Problem Statement
• We all want safe roads for a safe journey in our daily lives.
• Lack of awareness, careless driving, natural causes.
• Identifying accident-prone locations will help in taking
precautionary measures beforehand.
Research Questions
• What are the major causes behind road accidents in
Bangladesh?
• Which ML model gives the best result to classify
accident-prone road locations?
• What is the accuracy of the designed model?
Objectives of the research
• The main objective of the project is to help retailers
understand current customers' behavior and predict future
customers’ purchasing behavior. Customer transaction
data can help in understanding customers’ purchasing behavior,
offering the right bundles and promotions, improving sales and
extending the retailer's relationship with customers.
• Specific Objectives:
 To understand the purchasing pattern of the products that make up
customers’ baskets.
 To study the products that customers usually purchase.
 To study the products most likely to be purchased by customers along
with a particular product category.
 To recommend and suggest products to individual customers.
Literature Review
According to P. Pravallika and K. Narendra (2018), Market
Basket Analysis (MBA) is one of the most popular types of data
analysis used in the marketing world. The purpose of Market
Basket Analysis is to determine which products are most
commonly purchased or used by consumers. MBA
analyzes consumer buying habits by finding associations
between the different products that consumers place in their shopping
baskets. In general, MBA is one example of the
application of association rules.
Rakesh Agrawal and Usama Fayyad, as pioneers in data
mining, Association Rule Mining (ARM) and clustering,
developed different algorithms to help users achieve
their objectives.
Literature Review
Hadinata, Waruwu and Hermanto (2022) provided information
on the Apriori and FP-Growth algorithms, which have been
widely used in association rule research, and implemented
these algorithms to predict the sales of goods. As a result of
the analysis, the researchers noted that the FP-Growth algorithm is
more suitable than the Apriori algorithm for analyzing a data set
because of its computation speed and low memory usage:
it does not need to rescan the database at
every step. (Manoj and Tejashree, 2022)
 Recommendation systems are information-filtering systems
that analyze past user behavior data
and try to predict the user's preferences.
Literature Review
Association rules relate to the question of “what goes
with what”. For example, all fruits are placed in one aisle of a
supermarket, and all dairy products are placed together in
another aisle. The purchases of products by
customers at a supermarket are termed ‘transactions’. The
strength of an association rule can be measured by
three parameters, namely support, confidence and
lift (Ansari, 2019).
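The three measures can be illustrated with a short Python sketch. The transactions and function names below are hypothetical and chosen only for illustration; the definitions are the standard ones (support as the fraction of transactions containing an itemset, confidence as the conditional frequency of the consequent given the antecedent, lift as confidence divided by the consequent's support).

```python
# Toy transactions (hypothetical data, for illustration only).
transactions = [
    {"bread", "butter", "jam"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"butter", "milk"},
    {"bread", "butter", "milk"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Share of transactions with the antecedent that also hold the consequent."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


def lift(antecedent, consequent, transactions):
    """Confidence relative to the consequent's baseline support.

    Values above 1 suggest a positive association, below 1 a negative one.
    """
    return confidence(antecedent, consequent, transactions) / support(
        consequent, transactions
    )


# Measures for the rule {bread} -> {butter}.
rule_support = support({"bread", "butter"}, transactions)
rule_confidence = confidence({"bread"}, {"butter"}, transactions)
rule_lift = lift({"bread"}, {"butter"}, transactions)
print(round(rule_support, 3), round(rule_confidence, 3), round(rule_lift, 3))
```

In this toy data bread and butter are each so common on their own that the lift of the rule falls slightly below 1, which shows why confidence alone can overstate a rule's value.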
Data Source
• We collected data from:
 Open-source datasets
 Big shopping chains
Workflow or Working Plan
Fig: Work Flow (stages shown: Planning; Data Collection; Data Validation; Implementation; Data Format Preparation; Implementing Algorithm; Building model for recommending; Deploying model using Flask)
Methodology
The approach we will follow for the Product Recommendation
System:
• Collect and pre-process the raw data
• Convert the pre-processed data into an easily accessible
form using the FP Growth algorithm
• Create a learning model (training)
• Use the previously generated set of association rules to
report recommendations
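The last step, reporting recommendations from an existing rule set, could look roughly like the following Python sketch. The rules, their numbers and the `recommend` function are hypothetical placeholders, not results from this project; the sketch only assumes that each rule has a set of antecedent products, a single consequent product, a confidence and a lift.

```python
# Hypothetical rules: (antecedent, consequent, confidence, lift).
# The numbers are made up for illustration.
rules = [
    (frozenset({"bread", "fruits"}), "jam", 0.70, 1.8),
    (frozenset({"bread"}), "butter", 0.75, 1.2),
    (frozenset({"milk"}), "cereal", 0.40, 1.1),
]


def recommend(basket, rules, top_n=3):
    """Suggest products whose rule antecedent is fully contained in the basket."""
    basket = set(basket)
    best = {}
    for antecedent, consequent, conf, lift in rules:
        # A rule fires only if the customer already holds every antecedent
        # item and does not yet hold the consequent.
        if antecedent <= basket and consequent not in basket:
            # Keep the strongest (confidence, lift) pair seen per product.
            if consequent not in best or (conf, lift) > best[consequent]:
                best[consequent] = (conf, lift)
    ranked = sorted(best, key=lambda product: best[product], reverse=True)
    return ranked[:top_n]


suggestions = recommend({"bread", "fruits"}, rules)
print(suggestions)
```

Ranking by confidence first and lift second is one possible choice; ranking by lift would favor less obvious suggestions over generally popular products.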
Methodology
• Fundamental algorithms such as Apriori and FP Growth collect knowledge about people's preferences
and recognize, for example, that when people buy Bread and Fruits, they are often also interested in
Jam.
• Association rules are the key part of developing a recommendation engine. Association rule mining
produces a number of rules after running on a data set with details of past shopping baskets. Each
rule includes a collection of product names as the antecedent, one product name as the consequent, and a
few quality measures, such as antecedent support, consequent support, support, confidence and lift.
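To make the rule structure concrete, the following Python sketch derives single-consequent rules with the five measures listed above. It counts supports by brute force over a tiny, hypothetical data set rather than using Apriori or FP Growth, so it shows only the shape of the output; the thresholds and all names are assumptions.

```python
from itertools import combinations

# Toy shopping baskets (hypothetical data).
transactions = [
    {"bread", "fruits", "jam"},
    {"bread", "fruits", "jam"},
    {"bread", "fruits"},
    {"bread", "milk"},
    {"milk", "jam"},
]
n = len(transactions)


def support(itemset):
    """Fraction of baskets containing every item of `itemset`."""
    return sum(set(itemset) <= t for t in transactions) / n


def single_consequent_rules(min_support=0.4, min_confidence=0.6):
    """Enumerate rules antecedent -> one product, by brute-force counting."""
    items = sorted(set().union(*transactions))
    rules = []
    for size in range(2, len(items) + 1):
        for itemset in combinations(items, size):
            sup = support(itemset)
            if sup < min_support:
                continue
            for consequent in itemset:
                antecedent = tuple(i for i in itemset if i != consequent)
                a_sup = support(antecedent)
                c_sup = support((consequent,))
                conf = sup / a_sup
                if conf >= min_confidence:
                    rules.append({
                        "antecedent": antecedent,
                        "consequent": consequent,
                        "antecedent_support": a_sup,
                        "consequent_support": c_sup,
                        "support": sup,
                        "confidence": conf,
                        "lift": conf / c_sup,
                    })
    return rules


rules = single_consequent_rules()
for rule in rules:
    print(rule["antecedent"], "->", rule["consequent"], round(rule["lift"], 2))
```

Brute-force enumeration grows exponentially with the number of products, which is the reason a dedicated frequent-itemset algorithm is needed on real transaction data.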
Fig: FP tree generation
Methodology
• Pre-processing the data set
• Handling missing values: nulls are replaced with non-null values, a technique
known as imputation.
• Replacing the numeric values of the day-of-week field with textual values.
• Replacing null values in the days_since_prior_order field with "First Order", since
the very first orders were left blank in the original dataset.
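The pre-processing steps above can be sketched in plain Python as follows. Only `days_since_prior_order` is a field name taken from the dataset description; `order_id`, `order_dow`, the sample rows and the day numbering (0 = Sunday) are assumptions that should be checked against the actual dataset.

```python
# Hypothetical raw order rows; None marks a missing value.
orders = [
    {"order_id": 1, "order_dow": 0, "days_since_prior_order": None},
    {"order_id": 2, "order_dow": 3, "days_since_prior_order": 7.0},
    {"order_id": 3, "order_dow": 6, "days_since_prior_order": 30.0},
]

# Assumed encoding: 0 = Sunday ... 6 = Saturday.
DAY_NAMES = [
    "Sunday", "Monday", "Tuesday", "Wednesday",
    "Thursday", "Friday", "Saturday",
]


def preprocess(rows):
    """Return cleaned copies of the rows; the input rows are left unchanged."""
    cleaned = []
    for row in rows:
        new_row = dict(row)
        # Replace the numeric day of week with its textual name.
        new_row["order_dow"] = DAY_NAMES[row["order_dow"]]
        # Impute missing gaps: a blank value means the customer's first order.
        if row["days_since_prior_order"] is None:
            new_row["days_since_prior_order"] = "First Order"
        cleaned.append(new_row)
    return cleaned


cleaned_orders = preprocess(orders)
for row in cleaned_orders:
    print(row)
```

Note that imputing a text label turns the column into mixed types, so any later numeric analysis of `days_since_prior_order` would need to treat first orders separately.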
Methodology
• Algorithm used: Frequent Pattern (FP) Growth. This algorithm is an
improvement on the Apriori method. Frequent patterns are generated without
the need for candidate generation. The FP Growth algorithm represents the
database in the form of a tree called a frequent pattern tree, or FP tree.
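The idea can be sketched in pure Python as below: an FP tree is built from the transactions, and frequent itemsets are then mined recursively from conditional pattern bases, without generating candidate itemsets. This is a minimal teaching sketch rather than the project's implementation; the class name, function names, sample baskets and `min_count` threshold are all hypothetical.

```python
from collections import defaultdict


class FPNode:
    """One node of the FP tree: an item, its count and links to its relatives."""

    def __init__(self, item, parent):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_tree(weighted_transactions, min_count):
    """Build an FP tree and return its header table (item -> list of nodes)."""
    counts = defaultdict(int)
    for items, weight in weighted_transactions:
        for item in set(items):
            counts[item] += weight
    frequent = {i for i, c in counts.items() if c >= min_count}
    root = FPNode(None, None)
    header = defaultdict(list)
    for items, weight in weighted_transactions:
        # Keep frequent items only, most frequent first (ties broken by name),
        # so that transactions sharing items also share tree prefixes.
        ordered = sorted(
            (i for i in set(items) if i in frequent),
            key=lambda i: (-counts[i], i),
        )
        node = root
        for item in ordered:
            if item not in node.children:
                child = FPNode(item, node)
                node.children[item] = child
                header[item].append(child)
            node = node.children[item]
            node.count += weight
    return header


def fp_growth(weighted_transactions, min_count, suffix=frozenset()):
    """Yield (itemset, count) for each itemset seen at least min_count times."""
    header = build_tree(weighted_transactions, min_count)
    for item, nodes in header.items():
        itemset = suffix | {item}
        yield itemset, sum(node.count for node in nodes)
        # Conditional pattern base: the prefix path above each node of `item`,
        # weighted by that node's count.
        conditional = []
        for node in nodes:
            path, parent = [], node.parent
            while parent is not None and parent.item is not None:
                path.append(parent.item)
                parent = parent.parent
            if path:
                conditional.append((path, node.count))
        yield from fp_growth(conditional, min_count, itemset)


# Toy baskets (hypothetical data); each transaction starts with weight 1.
baskets = [
    ["bread", "butter", "jam"],
    ["bread", "butter"],
    ["bread", "milk"],
    ["butter", "milk"],
    ["bread", "butter", "milk"],
]
frequent_itemsets = dict(fp_growth([(b, 1) for b in baskets], min_count=2))
for itemset, count in frequent_itemsets.items():
    print(sorted(itemset), count)
```

The database is read twice per tree (once to count items, once to insert transactions), and all further mining works on the compact tree structure instead of the original transactions.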
Fig: Sales Distribution Over Days Of A Week
Probable Outcome
• We expect around 80-85% accuracy from the models
• Much faster decision making
• Cost effective system
Scope and limitations of the study
Scope:
Work with larger and more diverse datasets.
Use real datasets to train the model accurately.
Combine different algorithms to obtain more accurate results.
Limitations:
Small dataset
Uncontrolled variables between experiments
Number of features
Methodological nature
Conclusion
• It is very important for any system to be fast and
cost-effective.
• Our research will open a path for others to investigate
this topic further and improve on our system
so that businesses can perform better.