Objective: to analyse customer transaction history and identify the items customers are likely to buy.
Identify relationship patterns in customer data using association rules.
2. Table of Contents
• Scope and objectives
• Introduction
• Modelling process
– Data extraction
– Data cleaning
• Association analysis
• Conclusion
3. Objective & Scope
Objective
• Our main objective was to analyse the data and identify the items customers are likely to buy, based on their transaction history.
• Identify relationship patterns in customer data using association rules.
Scope
• Association rules
• Tools used: RStudio, Microsoft Excel
4. What is Instacart?
• Online grocery ordering app and store.
• Aims to deliver groceries within an hour.
5. Modelling Process
– Data Extraction
The data was extracted from Kaggle. It is an anonymized dataset of
customer orders over time.
6. - Data Cleaning
Raw data is naturally unstructured, so data cleaning (also called cleansing
or scrubbing) is important for further analysis. In the Orders data,
days_since_prior_order contains some missing values, so we first replaced
all missing values with the mode of that column.
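The mode-replacement step described above could be sketched as follows. This is a minimal illustration in Python using only the standard library, not the RStudio code used for the analysis; the function name `fill_missing_with_mode` and the sample values are hypothetical.

```python
from statistics import mode

def fill_missing_with_mode(values):
    """Replace None entries with the most frequent non-missing value."""
    observed = [v for v in values if v is not None]
    if not observed:
        # Nothing to impute from, so return the values unchanged.
        return list(values)
    most_common = mode(observed)
    return [most_common if v is None else v for v in values]

# Hypothetical days_since_prior_order values; None marks a missing entry.
days_since_prior_order = [7.0, None, 30.0, 7.0, 14.0, None, 7.0]
cleaned = fill_missing_with_mode(days_since_prior_order)
```

Mode imputation keeps the filled values on the scale of real observations, but it also inflates the most frequent value, which is worth keeping in mind when reading later summaries of this column.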
10. While the most common basket size is 8 products, the average basket
contains 10 products. To estimate the number of products in future baskets,
the idea is to look at the purchase history of each user, compute the
average number of items in their baskets, and use this number to predict
the number of items in their future baskets.
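The per-user average described above could be computed along these lines. This is a hedged sketch in Python (standard library only), assuming the order data is available as (user_id, order_id, product_id) rows; the function name and the sample rows are hypothetical.

```python
from collections import defaultdict

def average_basket_size_per_user(order_rows):
    """Return {user_id: mean number of items per order} from
    (user_id, order_id, product_id) rows."""
    # Count how many product rows each order contains.
    items_per_order = defaultdict(int)
    for user_id, order_id, _product_id in order_rows:
        items_per_order[(user_id, order_id)] += 1

    # Collect the basket sizes belonging to each user.
    sizes_per_user = defaultdict(list)
    for (user_id, _order_id), size in items_per_order.items():
        sizes_per_user[user_id].append(size)

    return {user: sum(sizes) / len(sizes) for user, sizes in sizes_per_user.items()}

# Hypothetical rows: user 1 has two orders (3 items, then 1 item); user 2 has one.
rows = [
    (1, 10, "banana"), (1, 10, "milk"), (1, 10, "spinach"),
    (1, 11, "banana"),
    (2, 20, "milk"),
]
predicted_sizes = average_basket_size_per_user(rows)
```

The returned mean is left unrounded here; rounding it to a whole number of items would be a separate modelling choice.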
11. Count and list of the 15 most popular products across baskets
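A ranking like this can be produced by counting how often each product appears across all ordered items. The sketch below is illustrative Python (standard library only); the function name `top_products` and the sample data are hypothetical.

```python
from collections import Counter

def top_products(ordered_products, n=15):
    """Return the n most frequently ordered products with their counts."""
    return Counter(ordered_products).most_common(n)

# Hypothetical flat list of product names, one entry per ordered item.
ordered = ["banana", "milk", "banana", "spinach", "milk", "banana"]
top_two = top_products(ordered, n=2)
```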
12. Fresh vegetables and fresh fruits are the best-selling aisles
So we conclude that fruit and vegetable products have a high probability of being ordered
when a customer makes their next purchase.
13. Milk and dairy products are the most frequently reordered by customers
So we conclude that milk and dairy products have a high probability of being ordered
when a customer makes their next purchase.
14. Association Analysis
Association analysis identifies how data items are associated with
each other.
Association rules are created by analysing data patterns and
using the criteria of support and confidence to identify the most
important relationships.
15. Support and Confidence
Support
• Support measures the probability that a collection of items
is bought together.
Confidence
• Confidence measures how likely a customer who buys product
‘A’ is to also buy product ‘B’, written A => B. The
confidence of A => B can be estimated as the probability that
someone buys both A and B divided by the probability
that they buy A.
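The two definitions above can be checked on a small example. The following is a minimal Python sketch (standard library only) that computes support and confidence directly from a list of baskets; the function names and the toy baskets are hypothetical and are not taken from the Instacart data.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for basket in transactions if itemset <= set(basket))
    return hits / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimate confidence of antecedent => consequent as
    support(A and B) / support(A)."""
    support_a = support(transactions, antecedent)
    if support_a == 0:
        return 0.0  # rule is undefined when A never occurs; report 0 here
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support_a

# Hypothetical baskets, for illustration only.
baskets = [
    {"milk", "bread"},
    {"milk", "banana"},
    {"milk", "bread", "banana"},
    {"banana"},
]
sup_milk_bread = support(baskets, {"milk", "bread"})        # 2 of 4 baskets
conf_milk_to_bread = confidence(baskets, {"milk"}, {"bread"})  # 2 of the 3 milk baskets
```

Since every basket containing both A and B also contains A, the confidence of A => B can never be smaller than the support of the combined itemset.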
16. Rule 1: Low support and high confidence
Support=0.003269976
Confidence=0.01
22. Conclusion
Using the association rules (rules 1-3), a customer's next
purchase can be predicted from their purchase history.
The rules can be refined further by adjusting the combination of
support and confidence.
Using the Jaccard index, the affinity between different item
combinations can be calculated, which would help predict
a customer's next purchase.
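As an illustration of the affinity measure mentioned above, the Jaccard index of two items can be taken as the number of baskets containing both items divided by the number of baskets containing at least one of them. The Python sketch below (standard library only) assumes that definition; the function name and the toy baskets are hypothetical.

```python
def jaccard_index(transactions, item_a, item_b):
    """Baskets containing both items divided by baskets containing either."""
    with_a = {i for i, basket in enumerate(transactions) if item_a in basket}
    with_b = {i for i, basket in enumerate(transactions) if item_b in basket}
    union = with_a | with_b
    if not union:
        return 0.0  # neither item appears in any basket
    return len(with_a & with_b) / len(union)

# Hypothetical baskets, for illustration only.
baskets = [
    {"milk", "bread"},
    {"milk", "banana"},
    {"milk", "bread", "banana"},
    {"banana"},
]
milk_bread_affinity = jaccard_index(baskets, "milk", "bread")    # 2 shared / 3 total
milk_banana_affinity = jaccard_index(baskets, "milk", "banana")  # 2 shared / 4 total
```

A value near 1 means the two items almost always appear together, while a value near 0 means they rarely share a basket.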