Objective - to analyse the data to identify items based on the transaction history of customers.
Identify patterns of relationships in customer data using association rules.
The importance of this type of research in the telecom market is to help companies make more profit.
Predicting churn has become known as one of the most important sources of income for telecom companies.
Hence, this research aimed to build a system that predicts customer churn in a telecom company.
These prediction models need to achieve high AUC values. To train and test the model, the sample data is divided into 70% for training and 30% for testing.
A Market Basket Analysis of bakery shop data using the Apriori algorithm and association rule mining. Application and Benefits of Market Basket Analytics in Retail Management
Busy consumers who want home-cooked healthy meals but don't have the time to shop are becoming more attracted to the convenience of meal kit delivery services.
Consumers who subscribe to these services typically receive a box containing fresh, prepared ingredients for one or more meals and the corresponding recipes. They usually receive them once a week and some even have same-day delivery.
As the meal kit sector continues to grow, so does demand for supply chain flexibility and meal kit delivery services.
In order to succeed, subscription meal kit companies not only need to offer good, quality meals and an intuitive user interface, but also must have excellent supply chain management skills.
Many services promise some mix of local, fresh, reduced-calorie, gluten-free, or organic products and may provide hard-to-find items.
Because of the growth of food home deliveries, the distance between stops has been reduced and the number of items delivered to each stop has been increased. This lowers the carrier’s cost resulting in overall lower shipping costs.
The attention to fresh ingredients is often ideal for consumers looking for healthy, tasty meal alternatives with many delivery options. Local farmers also benefit from the increased ongoing demand for their perishable foods.
BigInsights and Text Analytics.
As enterprises seek to gain operational efficiencies and competitive advantage through greater use of analytics, much of the new information they need to analyze is found in text documents and, increasingly, in a wide variety of social media sites and portals. A critical step in gaining insights from this information is extracting core data from huge volumes of text. That data is then available for downstream analytic, mining and machine learning tools. AQL (Annotator Query Language) is a powerful declarative, rule-based language for the extraction of information from text documents.
Customer segmentation is a Project on Machine learning that is developed by using Clustering & clustering is the technique that comes under unsupervised learning of machine learning.
Segmentation groups prospects based on their wants and needs. It helps identify the most valuable customer segments, on the basis of which vendors can improve their return on marketing investment by targeting only those likely to be their best customers.
Market Basket Analysis in SQL Server Machine Learning Services (Luca Zavarella)
Market Basket Analysis is a methodology that allows the identification of the relationships between a large number of products purchased by different consumers. It was born as a Data Mining technique to support cross-selling and shelf placement of products; but it is also used in medical diagnosis, in bioinformatics, in the analysis of society on the basis of personal data, etc. In this session we will see how the new Machine Learning Services allow us to derive insights from this analysis directly in SQL Server, using the programming language R.
Instacart was founded in 2012 by a former Amazon engineer who realized there was a gap between what an online ordering and delivery service was supposed to be and what he was experiencing from grocery delivery services while living in San Francisco. Read more: http://www.infigic.com/instacart-business-model-revenue-how-instacart-works/
Three case studies deploying cluster analysis (Greg Makowski)
Three case studies are discussed that include cluster analysis as a component.
1) Customer description for a credit card attrition model, to describe how to talk to customers.
2) Hotel price optimization. Use clusters to find subsets of similar behavior, and optimize prices within each cluster. Use a neural net as the objective function.
3) Retail supply chain, planning replenishment using 52 week demand curves using thousands of seasonal "profiles" or clusters.
How DoorDash Works - Insights into Business Model (OyeLabs)
DoorDash is an American on-demand prepared-food delivery service founded in 2013 by four Stanford students: Tony Xu, Stanley Tang, Andy Fang, and Evan Moore. Since its inception, DoorDash has been on a roll.
I did this analysis using SAS on a dataset of 5,000 records. I used CART and logistic regression to build a predictive model that identifies customers who are likely to shift to a competitor's network.
Frequent pattern mining is an analytical technique used by businesses and accessible in some self-serve business intelligence solutions. The FP-Growth technique finds frequent patterns, associations, or causal structures in data sets held in various kinds of databases, such as relational databases, transactional databases, and other forms of data repositories.
Data Science - Part VI - Market Basket and Product Recommendation Engines (Derek Kane)
This lecture provides an overview of association analysis, which includes topics such as market basket analysis and product recommendation engines. The first practical example centers around analyzing supermarket retailer product receipts and the second example touches upon the use of the association rules in the political arena.
Predicting online user behaviour using deep learning algorithms (Armando Vieira)
We propose a robust classifier to predict buying intentions based on user behaviour within a large e-commerce website. In this work we compare traditional machine learning techniques with the most advanced deep learning approaches. We show that both Deep Belief Networks and Stacked Denoising Auto-Encoders achieved a substantial improvement by extracting features from high-dimensional data during the pre-training phase. They also prove more convenient for dealing with severe class imbalance.
Adjusting OpenMP PageRank : SHORT REPORT / NOTES (Subhajit Sahu)
For massive graphs that fit in RAM, but not in GPU memory, it is possible to take advantage of a shared-memory system with multiple CPUs, each with multiple cores, to accelerate PageRank computation. If the NUMA architecture of the system is properly taken into account with good vertex partitioning, the speedup can be significant. To take steps in this direction, experiments are conducted to implement PageRank in OpenMP using two different approaches, uniform and hybrid. The uniform approach runs all primitives required for PageRank in OpenMP mode (with multiple threads), while the hybrid approach runs certain primitives (namely sumAt and multiply) in sequential mode.
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ... (Subhajit Sahu)
Abstract — Levelwise PageRank is an alternative method of PageRank computation which decomposes the input graph into a directed acyclic block-graph of strongly connected components and processes them in topological order, one level at a time. This enables calculation of ranks in a distributed fashion without per-iteration communication, unlike the standard method where all vertices are processed in each iteration. It comes, however, with the precondition that the input graph contain no dead ends. Here, the native non-distributed performance of Levelwise PageRank was compared against Monolithic PageRank on a CPU as well as a GPU. To ensure a fair comparison, Monolithic PageRank was also performed on a graph where vertices were split by components. Results indicate that Levelwise PageRank is about as fast as Monolithic PageRank on the CPU, but quite a bit slower on the GPU. The slowdown on the GPU is likely caused by a large submission of small workloads, and is expected to be a non-issue when the computation is performed on massive graphs.
Global Situational Awareness of A.I. and where it's headed (Vikram Sood)
You can see the future first in San Francisco.
Over the past year, the talk of the town has shifted from $10 billion compute clusters to $100 billion clusters to trillion-dollar clusters. Every six months another zero is added to the boardroom plans. Behind the scenes, there’s a fierce scramble to secure every power contract still available for the rest of the decade, every voltage transformer that can possibly be procured. American big business is gearing up to pour trillions of dollars into a long-unseen mobilization of American industrial might. By the end of the decade, American electricity production will have grown tens of percent; from the shale fields of Pennsylvania to the solar farms of Nevada, hundreds of millions of GPUs will hum.
The AGI race has begun. We are building machines that can think and reason. By 2025/26, these machines will outpace college graduates. By the end of the decade, they will be smarter than you or I; we will have superintelligence, in the true sense of the word. Along the way, national security forces not seen in half a century will be unleashed, and before long, The Project will be on. If we're lucky, we'll be in an all-out race with the CCP; if we're unlucky, an all-out war.
Everyone is now talking about AI, but few have the faintest glimmer of what is about to hit them. Nvidia analysts still think 2024 might be close to the peak. Mainstream pundits are stuck on the wilful blindness of “it’s just predicting the next word”. They see only hype and business-as-usual; at most they entertain another internet-scale technological change.
Before long, the world will wake up. But right now, there are perhaps a few hundred people, most of them in San Francisco and the AI labs, that have situational awareness. Through whatever peculiar forces of fate, I have found myself amongst them. A few years ago, these people were derided as crazy—but they trusted the trendlines, which allowed them to correctly predict the AI advances of the past few years. Whether these people are also right about the next few years remains to be seen. But these are very smart people—the smartest people I have ever met—and they are the ones building this technology. Perhaps they will be an odd footnote in history, or perhaps they will go down in history like Szilard and Oppenheimer and Teller. If they are seeing the future even close to correctly, we are in for a wild ride.
Let me tell you what we see.
The Building Blocks of QuestDB, a Time Series Database (Javier Ramirez)
Talk Delivered at Valencia Codes Meetup 2024-06.
Traditionally, databases have treated timestamps just as another data type. However, when performing real-time analytics, timestamps should be first class citizens and we need rich time semantics to get the most out of our data. We also need to deal with ever growing datasets while keeping performant, which is as fun as it sounds.
It is no wonder time-series databases are now more popular than ever before. Join me in this session to learn about the internal architecture and building blocks of QuestDB, an open source time-series database designed for speed. We will also review a history of some of the changes we have gone through over the past two years to deal with late and unordered data, non-blocking writes, read replicas, and faster batch ingestion.
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23... (John Andrews)
SlideShare Description for "Chatty Kathy - UNC Bootcamp Final Project Presentation"
Title: Chatty Kathy: Enhancing Physical Activity Among Older Adults
Description:
Discover how Chatty Kathy, an innovative project developed at the UNC Bootcamp, aims to tackle the challenge of low physical activity among older adults. Our AI-driven solution uses peer interaction to boost and sustain exercise levels, significantly improving health outcomes. This presentation covers our problem statement, the rationale behind Chatty Kathy, synthetic data and persona creation, model performance metrics, a visual demonstration of the project, and potential future developments. Join us for an insightful Q&A session to explore the potential of this groundbreaking project.
Project Team: Jay Requarth, Jana Avery, John Andrews, Dr. Dick Davis II, Nee Buntoum, Nam Yeongjin & Mat Nicholas
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Data and AI
Discussion on Vector Databases, Unstructured Data and AI
https://www.meetup.com/unstructured-data-meetup-new-york/
This meetup is for people working in unstructured data. Speakers will present about related topics such as vector databases, LLMs, and managing data at scale. The intended audience of this group includes roles like machine learning engineers, data scientists, data engineers, software engineers, and PMs. This meetup was formerly the Milvus Meetup, and is sponsored by Zilliz, maintainers of Milvus.
2. Table of Contents
• Scope and objectives
• Introduction
• Modelling process
Data extraction
Data cleansing
• Association analysis
• Conclusion
3. Objective & Scope
Objective
• Our main objective was to analyze our data to identify items based on the transaction history of customers.
• Identify patterns of relationships in customer data using association rules.
Scope
• Association Rules
• Tools used: RStudio, Microsoft Excel
4. What is Instacart?
• Online grocery ordering app and store.
• Aims to deliver groceries in an hour.
5. Modelling Process
– Data Extraction
Data is extracted from Kaggle. This is anonymized data on customer orders over time.
6. - Data Cleaning
The data is naturally unstructured, so data cleaning (or cleansing, scrubbing) is important for further analysis. In the Orders data, days_since_prior_order contains some missing values, so we first replace all missing values with the mode of the column.
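The mode-imputation step described above can be sketched in a few lines. This is an illustrative Python sketch, not the deck's actual code (the deck used RStudio), and the values are invented:

```python
from collections import Counter

# days_since_prior_order with missing entries (None), invented for illustration
days = [7.0, None, 30.0, 7.0, None]

# Find the mode of the observed (non-missing) values
observed = [d for d in days if d is not None]
mode_value = Counter(observed).most_common(1)[0][0]

# Replace every missing value with the mode
cleaned = [mode_value if d is None else d for d in days]
```

The same idea in R would use `which.max(table(x))` to find the mode before filling the NA entries.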
10. While most users have 8 products in their baskets, the average basket contains 10 products.
For determining the number of products in future baskets, the idea is to look at the purchase history of each user, get the average number of items in their baskets, and use this number to predict the number of items in future baskets.
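The prediction heuristic above, using a user's mean historical basket size as the predicted size of their next basket, can be sketched as follows. The user histories here are invented; the deck's own computation was done in R:

```python
# Per-user basket sizes from past orders (invented toy data)
history = {
    "user_1": [8, 10, 9],
    "user_2": [12, 14],
}

# Predicted size of the next basket = rounded mean of past basket sizes
predicted = {user: round(sum(sizes) / len(sizes)) for user, sizes in history.items()}
```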
11. Count and list of the 15 most popular products in baskets
12. Fresh veggies and fresh fruits are the items most often sold, by aisle.
So we conclude that fruit and veggie products have a high probability of being ordered by customers on their next purchase.
13. Milk and dairy products are the items most often reordered by customers.
So we conclude that milk/dairy products have a high probability of being ordered by customers on their next purchase.
14. Association Analysis:
Association identifies how the data items are associated with each other. Association rules are created by analyzing data patterns and using the criteria of support and confidence to identify the most important relationships.
15. Support and Confidence
Support
• Support measures the probability of a collection of items being bought together.
Confidence
• Confidence measures the likelihood that if a customer buys product A, they will also buy product B, written A => B. The confidence of A => B can be estimated as the frequency with which someone buys both A and B divided by the probability that they buy A.
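The two definitions above can be written directly as code. This is a minimal sketch over invented toy baskets (the deck computed these metrics in R on the Instacart data):

```python
# Toy transactions, each a set of items in one basket (invented for illustration)
transactions = [
    {"milk", "bread"},
    {"milk", "eggs"},
    {"bread", "eggs"},
    {"milk", "bread", "eggs"},
]

def support(itemset, transactions):
    """Fraction of baskets that contain every item in the itemset."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Estimated P(consequent | antecedent) = support(A and B) / support(A)."""
    return support(antecedent | consequent, transactions) / support(antecedent, transactions)
```

For example, `support({"milk", "bread"}, transactions)` is 0.5 (two of four baskets), and `confidence({"milk"}, {"bread"}, transactions)` is 0.5 / 0.75, about 0.67.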
16. Rule 1: Low Support and High Confidence
Support = 0.003269976
Confidence = 0.01
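Rules such as Rule 1 are kept or discarded by comparing their support and confidence against thresholds. A brute-force sketch of this filtering over single-item pairwise rules, with invented baskets and thresholds (the deck mined its rules in R, and real miners like Apriori handle larger itemsets efficiently):

```python
from itertools import combinations

# Invented toy baskets and thresholds for illustration
transactions = [
    {"milk", "bread"},
    {"milk", "eggs"},
    {"bread", "eggs"},
    {"milk", "bread", "eggs"},
]
min_support, min_confidence = 0.3, 0.6

def support(itemset):
    """Fraction of baskets containing every item in the itemset."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

# Enumerate directed single-item rules A => B and keep those above both thresholds
rules = []
for a, b in combinations(sorted(set().union(*transactions)), 2):
    for ant, cons in (({a}, {b}), ({b}, {a})):
        s = support(ant | cons)
        c = s / support(ant)
        if s >= min_support and c >= min_confidence:
            rules.append((tuple(ant)[0], tuple(cons)[0], s, c))
```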
22. Conclusion
Using the association rules (rules 1-3), the next purchase of a customer can be predicted based on their purchase history. Rules can be refined further based on combinations of support and confidence. Using the Jaccard index, the affinity between different item combinations can be calculated, which would help in predicting a customer's next purchase.
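The Jaccard-index affinity mentioned in the conclusion can be computed directly from two item sets; the baskets below are invented for illustration:

```python
def jaccard(a, b):
    """Jaccard affinity between two item sets: |A & B| / |A | B|."""
    return len(a & b) / len(a | b)

# Two invented baskets sharing two of four distinct items
affinity = jaccard({"milk", "bread", "eggs"}, {"milk", "bread", "butter"})
```

An affinity near 1 indicates item combinations that frequently co-occur; near 0 indicates little overlap.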