A/B Testing - Customer Experience Platform experimentation using Pearson’s Chi-Squared Test
Aurangzeb Khan
Senior Data Analyst
rana.aurangzeb@hotmail.com
MBA, University of Wollongong, Australia
Abstract: E-commerce giants design and run frequent campaigns on their touch points, including their
websites, to attract more customers. The purpose of this paper is to investigate the effectiveness
of a newly launched web page and to find out whether it results in different consumer behavior
and/or more website visits and conversions. The ‘Chi-Square Test of Independence’ helps us determine
whether the user groups of the old and new web pages differ significantly from each other in
conversion rate.
The Business Problem
As described on Kaggle (https://www.kaggle.com/zhangluyuan/ab-testing), an e-commerce company has
designed a new web page for its website to attract more customers. The company wants to
investigate whether it should implement the new page or keep the old web page.
User groups are often exposed to different situations (for example, before and after a change)
and studied to find out whether there is a significant difference in their behavior, using
metrics such as website visits, click-through rate and conversion rate. Here the ‘control’
consumer group is exposed to the ‘old page’ and the ‘treatment’ consumer group is exposed to the
‘new page’ of the website. The e-commerce company wants to know whether the two consumer groups
differ significantly from each other in conversion rate, and hence in consumer behavior.
Analytical Problem
To determine whether the two user groups exposed to the ‘old page’ vs the ‘new page’ (the user
group being one categorical variable) differ in click-through rate and conversion rate, we
recommend using Pearson’s Chi-Squared Test of Independence. The Chi-Square Test is suitable for
testing the independence of a pair of categorical variables, i.e. the click / no-click behavior
of the consumers against the website page design.
The Chi-Square Test also tells us whether the input variable is significantly associated with the
output variable, and hence helps us choose or drop certain variables when we continue with feature
selection for the analysis.
Formula: Chi-Square
$$\chi^2 = \sum \frac{(F_o - F_e)^2}{F_e}$$
Fo: Observed Frequencies
Fe: Expected Frequencies
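As a minimal sketch of how the formula is applied, the snippet below derives the expected frequencies from the row and column totals and then sums (Fo - Fe)^2 / Fe over all cells. The function name `chi_square_statistic` is an assumption made for illustration, the counts are the sample values that appear in Table 2.0, and no continuity correction is applied here.

```python
import numpy as np

def chi_square_statistic(observed):
    """Pearson chi-square statistic for a contingency table (no correction)."""
    observed = np.asarray(observed, dtype=float)
    row_totals = observed.sum(axis=1, keepdims=True)   # shape (rows, 1)
    col_totals = observed.sum(axis=0, keepdims=True)   # shape (1, cols)
    grand_total = observed.sum()
    # Expected count of each cell under independence:
    # row total * column total / grand total
    expected = row_totals @ col_totals / grand_total
    stat = ((observed - expected) ** 2 / expected).sum()
    return stat, expected

# Sample counts (rows: old page, new page; columns: click, no-click)
observed = [[17489, 127785],
            [17264, 128047]]
stat, expected = chi_square_statistic(observed)
print("Chi-square statistic:", stat)
print("Expected frequencies:\n", expected)
```

The expected frequencies keep the same row and column totals as the observed table, which is a quick way to check the calculation by hand.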
Steps to conduct the Chi-Square Test of Independence
1. Data wrangling & data consolidation in the shape of contingency table
2. Hypothesis Formulation & Decision Rule
3. Data Visualization
4. Test Statistics calculation
5. Conclusion
1. Data wrangling & data consolidation in the shape of a contingency table
The pair of categorical variables, i.e. the user group and the click / no-click outcome, is
displayed in a contingency table to help us visualize the frequency distribution of the variables.
Below is an example of how the overall data should look:
                       Click     No-Click   Click + No-Click
Old Page               17,489    127,785    145,274
New Page               17,264    128,047    145,311
Old Page + New Page    34,753    255,832    290,585
Table 1.0: Sample Format of the Data
For the Chi-Square Test we need the data in the format below:
            Click     No-Click
Old Page    17,489    127,785
New Page    17,264    128,047
Table 2.0: Sample format for Contingency Table
First of all, we need to import the Python libraries, load the data from Kaggle and take a first
look at it.
Python Code:
# Import the necessary libraries
import numpy as np
import pandas as pd
import seaborn as sns
import scipy
import matplotlib.pyplot as plt
# The data has been taken from Kaggle, per the link below
# https://www.kaggle.com/zhangluyuan/ab-testing
df = pd.read_csv('ab_data.csv')
df.head()
Table 3.0: Simple Data Visualization
Let's perform a few steps to check that the data is clean and ready for the Chi-Square Test.
Python Code:
# The control group represents the users of the old page
# The treatment group represents the users of the new page
# Let's see what the data looks like
df.groupby(['group','landing_page']).count()
Table 4.0: Data Aggregation for Visualization
We notice above that some users in the ‘control’ group have visited the ‘new page’, and some
users in the ‘treatment’ group have visited the ‘old page’. These rows are wrongly classified
with respect to our objectives. Instead of repairing them, we keep only the correctly matched
rows (control/old_page and treatment/new_page) with the help of the Python code below.
Python Code:
# From the 'control' group we only need old_page rows
# From the 'treatment' group we only need new_page rows
df_cleaned = df.loc[((df['group'] == 'control') & (df['landing_page'] == 'old_page')) |
                    ((df['group'] == 'treatment') & (df['landing_page'] == 'new_page'))]
df_cleaned.groupby(['group','landing_page']).count()
Table 5.0: Cleansed and consolidated Data for both user groups
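The filtering rule can also be expressed by mapping each group to the page it is supposed to see. The sketch below uses a small made-up frame (the `toy` data and the `expected_page` name are assumptions for illustration, not part of the Kaggle file) to show which rows are kept and which are set aside as mismatched.

```python
import pandas as pd

# Hypothetical mini data set with the same column names as ab_data.csv
toy = pd.DataFrame({
    "user_id": [1, 2, 3, 4, 5],
    "group": ["control", "control", "treatment", "treatment", "treatment"],
    "landing_page": ["old_page", "new_page", "new_page", "old_page", "new_page"],
})

# The page each group is supposed to see
expected_page = toy["group"].map({"control": "old_page", "treatment": "new_page"})

aligned = toy[toy["landing_page"] == expected_page]      # rows to keep
mismatched = toy[toy["landing_page"] != expected_page]   # rows to set aside

print("Kept rows:", len(aligned))
print("Mismatched rows:", len(mismatched))
```

Counting the mismatched rows before dropping them gives a simple check on how much data the cleaning step removes.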
Finding Duplicates
Python Code:
# Count the duplicated user_id values
print(df_cleaned['user_id'].duplicated().sum())
# Show the user_id values that occur more than once
df_cleaned[df_cleaned.duplicated(['user_id'],keep=False)]['user_id']
# Drop the duplicates from the filtered frame
# (calling drop_duplicates on the raw df would bring the mismatched rows back)
df_cleaned = df_cleaned.drop_duplicates(subset='user_id', keep='first')
Preparing the Contingency Table for the Chi-Square Test
### Prepare and arrange the data for the Chi-Square contingency table
# 1) Take out the control group
control = df_cleaned[df_cleaned['group'] == 'control']
# 2) Take out the treatment group
treatment = df_cleaned[df_cleaned['group'] == 'treatment']
# 3A) Click, i.e. the users who converted in the control group
control_click = control.converted.sum()
# 3B) No-click, i.e. the users who did not convert in the control group
control_noclick = control.converted.size - control.converted.sum()
# 4A) Click, i.e. the users who converted in the treatment group
treatment_click = treatment.converted.sum()
# 4B) No-click, i.e. the users who did not convert in the treatment group
treatment_noclick = treatment.converted.size - treatment.converted.sum()
# 5) Create the NumPy array (rows: control, treatment; columns: click, no-click)
Table = np.array([[control_click, control_noclick], [treatment_click, treatment_noclick]])
print(Table)
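The same 2x2 layout can be produced in one step with `pd.crosstab`. The sketch below is an alternative shown on a made-up frame (the `toy` data is an assumption for illustration); on the real data the same call would be applied to `df_cleaned`.

```python
import pandas as pd

# Hypothetical mini data set with the same column names as ab_data.csv
toy = pd.DataFrame({
    "group": ["control", "control", "control",
              "treatment", "treatment", "treatment"],
    "converted": [1, 0, 0, 1, 1, 0],
})

# Rows: group (control, treatment); columns: converted (0, 1)
counts = pd.crosstab(toy["group"], toy["converted"])

# Reorder the columns to click (1) first, then no-click (0)
table = counts[[1, 0]].to_numpy()
print(table)
```

Reordering the columns only changes the presentation; the Chi-Square statistic does not depend on the order of rows or columns.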
2. Hypothesis Formulation & Decision Rule
Null Hypothesis
H0:
Conversion is independent of the user group, i.e. the ‘control’ and ‘treatment’ user groups do not differ in their conversion rate.
Alternative Hypothesis
H1:
Conversion depends on the user group, i.e. the ‘control’ and ‘treatment’ user groups differ in
their conversion rate.
Level of significance
For this test, we assume α = 0.05, i.e. a confidence level of 95%.
Decision Rule
If the p-value is less than or equal to the level of significance (5%), we reject the Null Hypothesis (H0).
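The decision rule can equivalently be stated with a critical value: reject H0 when the test statistic exceeds the (1 - α) quantile of the chi-square distribution. The sketch below assumes a 2x2 table, so the degrees of freedom are (2 - 1) x (2 - 1) = 1; the helper name `reject_h0` is hypothetical.

```python
from scipy.stats import chi2

alpha = 0.05
dof = (2 - 1) * (2 - 1)          # (rows - 1) * (columns - 1) for a 2x2 table

# Statistic value above which H0 is rejected at the 5% level
critical_value = chi2.ppf(1 - alpha, dof)
print("Critical value:", round(critical_value, 3))   # 3.841

def reject_h0(stat, alpha=alpha, dof=dof):
    """Return True when the p-value of `stat` is at most alpha."""
    p_value = chi2.sf(stat, dof)   # upper-tail probability
    return p_value <= alpha
```

Comparing the p-value with α and comparing the statistic with the critical value lead to the same decision.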
3. Data Visualization
Let’s print the multidimensional array that we created in Python:
            Click     No-Click
Old Page    17,471    127,761
New Page    17,274    128,078
Table 6.0: Chi-Square Test Contingency Table
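Before running the test it can help to look at the conversion rate of each group directly. The following sketch copies the counts from Table 6.0 into an array (the name `table_6` is an assumption for illustration) and divides the clicks by the group sizes.

```python
import numpy as np

# Counts copied from Table 6.0 (rows: old page, new page; columns: click, no-click)
table_6 = np.array([[17471, 127761],
                    [17274, 128078]])

group_sizes = table_6.sum(axis=1)               # users per page
conversion_rates = table_6[:, 0] / group_sizes  # clicks / users

for label, rate in zip(["Old Page", "New Page"], conversion_rates):
    print(f"{label}: {rate:.4%}")
```

Both rates come out close to 12%, so any difference between the pages is small in absolute terms; the test that follows checks whether it is statistically significant.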
4. Test Statistics Calculations
To perform the test, let’s import the necessary Python libraries and obtain the following outputs:
1. Test Statistic
2. P-Value
3. Degrees of Freedom
4. Expected Frequencies
Python Code:
import scipy
from scipy import stats
# correction=True applies Yates' continuity correction, which adjusts each observed value by 0.5 towards the corresponding expected value
stat, p, dof, expected = scipy.stats.chi2_contingency(Table, correction=True)
print('\nStat : ', stat)
print('\nP-Value : ', p)
print('\nDegree of Freedom : ', dof)
print('\nObserved Frequencies: ', Table)
print('\nExpected Frequencies: ', expected)
Snapshot 1.0: Chi-Square Test Results
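To see how much the continuity correction matters for a table of this size, the test can be run with and without it. This is a sketch that hard-codes the counts from Table 6.0 (the `observed` array is an assumption for illustration, so the snippet runs without the CSV file).

```python
import numpy as np
from scipy.stats import chi2_contingency

# Counts copied from Table 6.0 (rows: old page, new page; columns: click, no-click)
observed = np.array([[17471, 127761],
                     [17274, 128078]])

# With Yates' continuity correction (applied by SciPy only when dof == 1)
stat_corr, p_corr, dof, _ = chi2_contingency(observed, correction=True)
# Plain Pearson chi-square, without the correction
stat_raw, p_raw, _, _ = chi2_contingency(observed, correction=False)

print("With correction   : stat =", stat_corr, " p =", p_corr)
print("Without correction: stat =", stat_raw, " p =", p_raw)
```

The correction shrinks the statistic slightly and therefore raises the p-value slightly; with samples this large the two versions lead to the same decision.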
Python Code:
# Interpret the p-value
alpha = 1.0 - .95
if p <= alpha:
    print('Dependent (reject H0)')
else:
    print('Independent (fail to reject H0)')
5. Conclusion
The p-value is 22.9%, which is greater than the 5% level of significance (alpha), so we do
not reject the Null Hypothesis.
The users of the old and new pages did not behave significantly differently, and the conversion
rates of the two groups are not significantly different. Hence, the test gives no evidence that
the new web page performs differently from the old one.
Conversion is considered independent of the page because the observed and expected frequencies are
similar; the test found no significant association between the two variables.
THE END

More Related Content

What's hot

Marketing Optimization Augmented Analytics Use Cases - Smarten
Marketing Optimization Augmented Analytics Use Cases - SmartenMarketing Optimization Augmented Analytics Use Cases - Smarten
Marketing Optimization Augmented Analytics Use Cases - SmartenSmarten Augmented Analytics
 
What is FP Growth Analysis and How Can a Business Use Frequent Pattern Mining...
What is FP Growth Analysis and How Can a Business Use Frequent Pattern Mining...What is FP Growth Analysis and How Can a Business Use Frequent Pattern Mining...
What is FP Growth Analysis and How Can a Business Use Frequent Pattern Mining...Smarten Augmented Analytics
 
Model Calibration and Uncertainty Analysis
Model Calibration and Uncertainty AnalysisModel Calibration and Uncertainty Analysis
Model Calibration and Uncertainty AnalysisJ Boisvert-Chouinard
 
Lab report templante for 10th and 9th grade
Lab report templante for 10th and 9th gradeLab report templante for 10th and 9th grade
Lab report templante for 10th and 9th gradeSofia Paz
 
Lab report templete
Lab report templeteLab report templete
Lab report templeteSofia Paz
 
PRM project report
PRM project reportPRM project report
PRM project reportneha singh
 
Multivariate Analysis An Overview
Multivariate Analysis An OverviewMultivariate Analysis An Overview
Multivariate Analysis An Overviewguest3311ed
 
Lecture 1 practical_guidelines_assignment
Lecture 1 practical_guidelines_assignmentLecture 1 practical_guidelines_assignment
Lecture 1 practical_guidelines_assignmentDaria Bogdanova
 
Lobsters, Wine and Market Research
Lobsters, Wine and Market ResearchLobsters, Wine and Market Research
Lobsters, Wine and Market ResearchTed Clark
 
Ch. 4-demand-estimation(2)
Ch. 4-demand-estimation(2)Ch. 4-demand-estimation(2)
Ch. 4-demand-estimation(2)anj134u
 
What is the Independent Samples T Test Method of Analysis and How Can it Bene...
What is the Independent Samples T Test Method of Analysis and How Can it Bene...What is the Independent Samples T Test Method of Analysis and How Can it Bene...
What is the Independent Samples T Test Method of Analysis and How Can it Bene...Smarten Augmented Analytics
 
Conjoint ppt final one
Conjoint ppt final oneConjoint ppt final one
Conjoint ppt final onesaba khan
 

What's hot (15)

Marketing Optimization Augmented Analytics Use Cases - Smarten
Marketing Optimization Augmented Analytics Use Cases - SmartenMarketing Optimization Augmented Analytics Use Cases - Smarten
Marketing Optimization Augmented Analytics Use Cases - Smarten
 
What is FP Growth Analysis and How Can a Business Use Frequent Pattern Mining...
What is FP Growth Analysis and How Can a Business Use Frequent Pattern Mining...What is FP Growth Analysis and How Can a Business Use Frequent Pattern Mining...
What is FP Growth Analysis and How Can a Business Use Frequent Pattern Mining...
 
Model Calibration and Uncertainty Analysis
Model Calibration and Uncertainty AnalysisModel Calibration and Uncertainty Analysis
Model Calibration and Uncertainty Analysis
 
Gordoncorr
GordoncorrGordoncorr
Gordoncorr
 
Lab report templante for 10th and 9th grade
Lab report templante for 10th and 9th gradeLab report templante for 10th and 9th grade
Lab report templante for 10th and 9th grade
 
Lab report templete
Lab report templeteLab report templete
Lab report templete
 
PRM project report
PRM project reportPRM project report
PRM project report
 
Multivariate Analysis An Overview
Multivariate Analysis An OverviewMultivariate Analysis An Overview
Multivariate Analysis An Overview
 
RapidMiner: Nested Subprocesses
RapidMiner:   Nested SubprocessesRapidMiner:   Nested Subprocesses
RapidMiner: Nested Subprocesses
 
Lecture 1 practical_guidelines_assignment
Lecture 1 practical_guidelines_assignmentLecture 1 practical_guidelines_assignment
Lecture 1 practical_guidelines_assignment
 
Lobsters, Wine and Market Research
Lobsters, Wine and Market ResearchLobsters, Wine and Market Research
Lobsters, Wine and Market Research
 
Ch. 4-demand-estimation(2)
Ch. 4-demand-estimation(2)Ch. 4-demand-estimation(2)
Ch. 4-demand-estimation(2)
 
What is the Independent Samples T Test Method of Analysis and How Can it Bene...
What is the Independent Samples T Test Method of Analysis and How Can it Bene...What is the Independent Samples T Test Method of Analysis and How Can it Bene...
What is the Independent Samples T Test Method of Analysis and How Can it Bene...
 
Conjoint ppt final one
Conjoint ppt final oneConjoint ppt final one
Conjoint ppt final one
 
MidTerm memo
MidTerm memoMidTerm memo
MidTerm memo
 

Similar to A/B Testing - Customer Experience Platform experimentation using Pearson’s Chi-Squared Test

Customer Satisfaction Data - Multiple Linear Regression Model.pdf
Customer Satisfaction Data -  Multiple Linear Regression Model.pdfCustomer Satisfaction Data -  Multiple Linear Regression Model.pdf
Customer Satisfaction Data - Multiple Linear Regression Model.pdfruwanp2000
 
Data analysis.pptx
Data analysis.pptxData analysis.pptx
Data analysis.pptxMDPiasKhan
 
Empowerment Technology Lesson 4
Empowerment Technology Lesson 4Empowerment Technology Lesson 4
Empowerment Technology Lesson 4alicelagajino
 
The Ultimate List of 45 Business Process Improvement Tools (Lean Six Sigma & ...
The Ultimate List of 45 Business Process Improvement Tools (Lean Six Sigma & ...The Ultimate List of 45 Business Process Improvement Tools (Lean Six Sigma & ...
The Ultimate List of 45 Business Process Improvement Tools (Lean Six Sigma & ...QuekelsBaro
 
Module_6_-_Datamining_tasks_and_tools_uGuVaDv4iv-2.pptx
Module_6_-_Datamining_tasks_and_tools_uGuVaDv4iv-2.pptxModule_6_-_Datamining_tasks_and_tools_uGuVaDv4iv-2.pptx
Module_6_-_Datamining_tasks_and_tools_uGuVaDv4iv-2.pptxHarshitGoel87
 
I Simply Excel
I Simply ExcelI Simply Excel
I Simply ExcelEric Couch
 
ANOVA is a hypothesis testing technique used to compare the equali.docx
ANOVA is a hypothesis testing technique used to compare the equali.docxANOVA is a hypothesis testing technique used to compare the equali.docx
ANOVA is a hypothesis testing technique used to compare the equali.docxjustine1simpson78276
 
Ab testing 101
Ab testing 101Ab testing 101
Ab testing 101Ashish Dua
 
How quality management can be measured
How quality management can be measuredHow quality management can be measured
How quality management can be measuredselinasimpson1501
 
Setting up an A/B-testing framework
Setting up an A/B-testing frameworkSetting up an A/B-testing framework
Setting up an A/B-testing frameworkAgnes van Belle
 
Philip crosby quality management
Philip crosby quality managementPhilip crosby quality management
Philip crosby quality managementselinasimpson2701
 
Basics of AB testing in online products
Basics of AB testing in online productsBasics of AB testing in online products
Basics of AB testing in online productsAshish Dua
 
Quality management maturity grid
Quality management maturity gridQuality management maturity grid
Quality management maturity gridselinasimpson1601
 
OL 325 Milestone Three Guidelines and Rubric Section
 OL 325 Milestone Three Guidelines and Rubric  Section OL 325 Milestone Three Guidelines and Rubric  Section
OL 325 Milestone Three Guidelines and Rubric SectionMoseStaton39
 
Purpose of quality management system
Purpose of quality management systemPurpose of quality management system
Purpose of quality management systemselinasimpson1801
 
Crimson Publishers-Toward Generating Customized Rehabilitation Plan and Deliv...
Crimson Publishers-Toward Generating Customized Rehabilitation Plan and Deliv...Crimson Publishers-Toward Generating Customized Rehabilitation Plan and Deliv...
Crimson Publishers-Toward Generating Customized Rehabilitation Plan and Deliv...Crimsonpublishers-Rehabilitation
 
Conversion Whitepaper
Conversion WhitepaperConversion Whitepaper
Conversion WhitepaperWSI Ensenada
 

Similar to A/B Testing - Customer Experience Platform experimentation using Pearson’s Chi-Squared Test (20)

Customer Satisfaction Data - Multiple Linear Regression Model.pdf
Customer Satisfaction Data -  Multiple Linear Regression Model.pdfCustomer Satisfaction Data -  Multiple Linear Regression Model.pdf
Customer Satisfaction Data - Multiple Linear Regression Model.pdf
 
Data analysis.pptx
Data analysis.pptxData analysis.pptx
Data analysis.pptx
 
Empowerment Technology Lesson 4
Empowerment Technology Lesson 4Empowerment Technology Lesson 4
Empowerment Technology Lesson 4
 
The Ultimate List of 45 Business Process Improvement Tools (Lean Six Sigma & ...
The Ultimate List of 45 Business Process Improvement Tools (Lean Six Sigma & ...The Ultimate List of 45 Business Process Improvement Tools (Lean Six Sigma & ...
The Ultimate List of 45 Business Process Improvement Tools (Lean Six Sigma & ...
 
Module_6_-_Datamining_tasks_and_tools_uGuVaDv4iv-2.pptx
Module_6_-_Datamining_tasks_and_tools_uGuVaDv4iv-2.pptxModule_6_-_Datamining_tasks_and_tools_uGuVaDv4iv-2.pptx
Module_6_-_Datamining_tasks_and_tools_uGuVaDv4iv-2.pptx
 
I Simply Excel
I Simply ExcelI Simply Excel
I Simply Excel
 
ANOVA is a hypothesis testing technique used to compare the equali.docx
ANOVA is a hypothesis testing technique used to compare the equali.docxANOVA is a hypothesis testing technique used to compare the equali.docx
ANOVA is a hypothesis testing technique used to compare the equali.docx
 
Ab testing 101
Ab testing 101Ab testing 101
Ab testing 101
 
How quality management can be measured
How quality management can be measuredHow quality management can be measured
How quality management can be measured
 
Quality service management
Quality service managementQuality service management
Quality service management
 
Setting up an A/B-testing framework
Setting up an A/B-testing frameworkSetting up an A/B-testing framework
Setting up an A/B-testing framework
 
Philip crosby quality management
Philip crosby quality managementPhilip crosby quality management
Philip crosby quality management
 
Deming quality management
Deming quality managementDeming quality management
Deming quality management
 
Basics of AB testing in online products
Basics of AB testing in online productsBasics of AB testing in online products
Basics of AB testing in online products
 
Quality management maturity grid
Quality management maturity gridQuality management maturity grid
Quality management maturity grid
 
OL 325 Milestone Three Guidelines and Rubric Section
 OL 325 Milestone Three Guidelines and Rubric  Section OL 325 Milestone Three Guidelines and Rubric  Section
OL 325 Milestone Three Guidelines and Rubric Section
 
Purpose of quality management system
Purpose of quality management systemPurpose of quality management system
Purpose of quality management system
 
ABTest-20231020.pptx
ABTest-20231020.pptxABTest-20231020.pptx
ABTest-20231020.pptx
 
Crimson Publishers-Toward Generating Customized Rehabilitation Plan and Deliv...
Crimson Publishers-Toward Generating Customized Rehabilitation Plan and Deliv...Crimson Publishers-Toward Generating Customized Rehabilitation Plan and Deliv...
Crimson Publishers-Toward Generating Customized Rehabilitation Plan and Deliv...
 
Conversion Whitepaper
Conversion WhitepaperConversion Whitepaper
Conversion Whitepaper
 

Recently uploaded

(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Serviceranjana rawat
 
Call Girls In Mahipalpur O9654467111 Escorts Service
Call Girls In Mahipalpur O9654467111  Escorts ServiceCall Girls In Mahipalpur O9654467111  Escorts Service
Call Girls In Mahipalpur O9654467111 Escorts ServiceSapana Sha
 
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...Florian Roscheck
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdfHuman37
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一F sss
 
DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfDBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfJohn Sterrett
 
9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home ServiceSapana Sha
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Callshivangimorya083
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfSocial Samosa
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFAAndrei Kaleshka
 
Data Science Jobs and Salaries Analysis.pptx
Data Science Jobs and Salaries Analysis.pptxData Science Jobs and Salaries Analysis.pptx
Data Science Jobs and Salaries Analysis.pptxFurkanTasci3
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfgstagge
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...soniya singh
 
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...Suhani Kapoor
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...Suhani Kapoor
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort servicejennyeacort
 
办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一
办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一
办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一F La
 
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /WhatsappsBeautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsappssapnasaifi408
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Jack DiGiovanna
 

Recently uploaded (20)

(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
 
Call Girls In Mahipalpur O9654467111 Escorts Service
Call Girls In Mahipalpur O9654467111  Escorts ServiceCall Girls In Mahipalpur O9654467111  Escorts Service
Call Girls In Mahipalpur O9654467111 Escorts Service
 
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
 
DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfDBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdf
 
9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
 
Decoding Loan Approval: Predictive Modeling in Action
Decoding Loan Approval: Predictive Modeling in ActionDecoding Loan Approval: Predictive Modeling in Action
Decoding Loan Approval: Predictive Modeling in Action
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFA
 
Data Science Jobs and Salaries Analysis.pptx
Data Science Jobs and Salaries Analysis.pptxData Science Jobs and Salaries Analysis.pptx
Data Science Jobs and Salaries Analysis.pptx
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdf
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
 
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
 
办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一
办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一
办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一
 
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /WhatsappsBeautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
 

A/B Testing - Customer Experience Platform experimentation using Pearson’s Chi-Squared Test

  • 1. A/B Testing - Customer Experience Platformexperimentation using Pearson’s Chi- Squared Test Aurangzeb Khan Senior Data Analyst rana.aurangzeb@hotmail.com MBA, University of Wollongong, Australia Abstract: E-commerce giants design and run frequent campaigns on their touch points which includes website to attract more and more customers. The purpose of this paper is to investigate the effectiveness of a newly launched web page for consumers and find out if the new page is resulting into different consumer behavior and/or more website visits and conversion. The ‘Chi-Square Test of Independence’ helps us find out if the different user groups of old and new web page are significantly different from each other based on conversion rate or not! The Business Problem As described in Kaggle (Kaggle link is here), an e-commerce company has designed a new web page for a website to attract more customers. The e-commerce company wants to investigate if it should implement the new page or keep the old web page. Many of the times the consumer/user groups are exposed to and/ or studied based on different situation (before and after a change) to find out if there is a significant difference in terms of their performance/consumer behavior using some set of metrics like web site visits, click-through-rate and the conversion rate. The ‘control’ consumer group is exposed to the ‘old page’ and the ‘treatment’ consumer group is exposed to the ‘new page’ of the website. Now the e-commerce company wants to know that if the two consumergroups are significantly different from each other in terms of conversion rate & hence consumer behavior. Analytical Problem To determine if the two user/consumer groups exposed to ‘old page’ vs ‘new page’ (consumer group being the categorical variable) are different in term of click-through-rate and conversion rates we recommend using Pearson’s Chi-Squared Test for Independence. 
The Chi-Square Test is suitable for quantifying the independence of pairs of categorical variables i.e the click and non-click behavior of the consumers against the website page design. Chi Square also tell us if the input variable has significant impact on the output variable and hence will let us choose or drop certain variables when we decide to continue feature selection for the Analysis.
  • 2. Formula : Chi Square Image source: Author Fo: Observed Frequencies Fe: Expected Frequencies Steps to conduct Chi-Square of Independence 1. Data wrangling & data consolidation in the shape of contingency table 2. Hypothesis Formulation & Decision Rule 3. Data Visualization 4. Test Statistics calculation 5. Conclusion 1. Data wrangling & Data consolidation in the shape of contingency table The pairs of categorical variables i.e user group and the click/non-click variables will be displayed in a contingency table to help us visualize the frequency distribution of the variables. Below is an example of how the overall data should look like: Click No-Click Click + No- Click Old Page 17489 127785 145,274 New Page 17264 128047 145,311 Old Page + New Page 34,753 255,832 290,585 Table 1.0: Sample Format of the Data
  • 3. For the Chi-Square Test we need the data in the format below:

                  Click    No-Click
    Old Page     17,489     127,785
    New Page     17,264     128,047

    Table 2.0: Sample format for Contingency Table

    First of all, we need to import the Python libraries used for the data analysis, load the data from Kaggle and take a first look at it.

    Python Code:

        # Import the necessary libraries
        import numpy as np
        import pandas as pd
        import seaborn as sns
        import scipy
        import matplotlib.pyplot as plt

        # The data has been taken from Kaggle per the link below
        # https://www.kaggle.com/zhangluyuan/ab-testing
        df = pd.read_csv('ab_data.csv')
        df.head()

    Table 3.0: Simple Data Visualization
  • 4. Let's perform a few steps to check that the data is clean and ready for the Chi-Square Test.

    Python Code:

        # The control group represents the users of the old page
        # The treatment group represents the users of the new page
        # Let's see what the data looks like
        df.groupby(['group', 'landing_page']).count()

    Table 4.0: Data Aggregation for Visualization

    We can see above that some users in the ‘control’ group visited the ‘new page’, and some users in the ‘treatment’ group visited the ‘old page’. These rows are wrongly classified with respect to our objective. Instead of trying to repair them, we simply pick the correctly classified rows (control/old_page and treatment/new_page) with the Python code below.

    Python Code:

        # From the 'control' group we only need the old page
        # From the 'treatment' group we only need the new page
        df_cleaned = df.loc[
            ((df['group'] == 'control') & (df['landing_page'] == 'old_page'))
            | ((df['group'] == 'treatment') & (df['landing_page'] == 'new_page'))
        ]
        df_cleaned.groupby(['group', 'landing_page']).count()

    Table 5.0: Cleansed and consolidated Data for both user groups
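One way to see the mismatched rows at a glance is to cross-tabulate group against landing_page before filtering. The sketch below uses a small made-up DataFrame that only borrows the column names of ab_data.csv; the rows and the names `toy`, `matched` and `toy_cleaned` are illustrative and not taken from the Kaggle data.

```python
import pandas as pd

# Hypothetical toy data with the same column names as ab_data.csv
toy = pd.DataFrame({
    "user_id":      [1, 2, 3, 4, 5, 6],
    "group":        ["control", "control", "control",
                     "treatment", "treatment", "treatment"],
    "landing_page": ["old_page", "old_page", "new_page",
                     "new_page", "new_page", "old_page"],
    "converted":    [0, 1, 0, 1, 0, 0],
})

# Rows are groups, columns are pages; the control/new_page and
# treatment/old_page cells count the mismatched rows
print(pd.crosstab(toy["group"], toy["landing_page"]))

# Keep only the rows where the group and the page agree
matched = (
    ((toy["group"] == "control") & (toy["landing_page"] == "old_page"))
    | ((toy["group"] == "treatment") & (toy["landing_page"] == "new_page"))
)
toy_cleaned = toy.loc[matched]
print(len(toy) - len(toy_cleaned), "mismatched rows removed")
```

Wrapping each `&` pair in its own parentheses is not strictly required, since `&` binds tighter than `|` in Python, but it makes the intended grouping explicit.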
  • 5. Finding Duplicates

    Python Code:

        # Count the duplicated user_id values
        print(df_cleaned['user_id'].duplicated().sum())

        # Show the user_id of every duplicated row
        df_cleaned[df_cleaned.duplicated(['user_id'], keep=False)]['user_id']

        # Drop the duplicates from the cleaned frame (not from the raw df),
        # so that the group/page filter applied earlier is preserved
        df_cleaned = df_cleaned.drop_duplicates(subset='user_id', keep='first')

    Preparing the Contingency Table for the Chi-Square Test

        ### Prepare and arrange the data for the Chi-Square contingency table

        # 1) Take out the control group
        control = df_cleaned[df_cleaned['group'] == 'control']

        # 2) Take out the treatment group
        treatment = df_cleaned[df_cleaned['group'] == 'treatment']

        # 3A) Click, i.e. the users who converted in the control group
        control_click = control.converted.sum()

        # 3B) No-click, i.e. the users who did not convert in the control group
        control_noclick = control.converted.size - control.converted.sum()

        # 4A) Click, i.e. the users who converted in the treatment group
        treatment_click = treatment.converted.sum()

        # 4B) No-click, i.e. the users who did not convert in the treatment group
        treatment_noclick = treatment.converted.size - treatment.converted.sum()

        # 5) Create the NumPy array (rows: control, treatment; columns: click, no-click)
        Table = np.array([[control_click, control_noclick],
                          [treatment_click, treatment_noclick]])
        print(Table)

    2. Hypothesis Formulation & Decision Rule

    Null Hypothesis H0: Conversion is independent of the user group, i.e. the ‘control’ and ‘treatment’ user groups do not differ in conversion rate.

    Alternative Hypothesis H1: Conversion depends on the user group, i.e. the ‘control’ and ‘treatment’ user groups differ in conversion rate.

    Level of Significance
    For this test, we assume α = 0.05, i.e. a confidence level of 95%.

    Decision Rule
    If the p-value is less than the level of significance (5%), we reject the Null Hypothesis (H0).
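The decision rule above is stated in terms of the p-value. An equivalent way to read it is to compare the test statistic with the critical value of the chi-square distribution at the chosen α. The sketch below assumes a 2 x 2 table (one degree of freedom); the helper name `decide` and the two example statistics are purely illustrative and are not results from the data.

```python
from scipy import stats

alpha = 0.05
dof = 1  # (rows - 1) * (columns - 1) for a 2 x 2 contingency table

# Statistics above this value correspond to p-values below alpha
critical_value = stats.chi2.ppf(1 - alpha, dof)


def decide(stat, alpha=alpha, dof=dof):
    """Return the decision and the p-value for a chi-square statistic."""
    p_value = stats.chi2.sf(stat, dof)  # upper-tail probability
    decision = "reject H0" if p_value < alpha else "fail to reject H0"
    return decision, p_value


print("Critical value:", round(critical_value, 3))
for example_stat in (1.0, 5.0):  # arbitrary illustrative statistics
    print(example_stat, decide(example_stat))
```

Both readings give the same decision, because the p-value drops below α exactly when the statistic rises above the critical value.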
  • 6. 3. Data Visualization
    Let’s print the two-dimensional array that we created in Python:

                  Click    No-Click
    Old Page     17,471     127,761
    New Page     17,274     128,078

    Table 6.0: Chi-Square Test Contingency Table

    4. Test Statistics Calculation
    To perform the test, let’s import the necessary Python libraries and obtain the following parameters:
    1. Test Statistic
    2. P-Value
    3. Degrees of Freedom
    4. Expected Frequencies

    Python Code:

        from scipy import stats

        # With correction=True and one degree of freedom, Yates' continuity
        # correction adjusts each observed value by 0.5 towards the
        # corresponding expected value
        stat, p, dof, expected = stats.chi2_contingency(Table, correction=True)

        print('\nStat : ', stat)
        print('\nP-Value : ', p)
        print('\nDegree of Freedom : ', dof)
        print('\nObserved Frequencies: ', Table)
        print('\nExpected Frequencies: ', expected)

    Snapshot 1.0: Chi-Square Test Results
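To see how much the continuity correction matters at this sample size, the sketch below runs the same SciPy call on the counts from Table 6.0 with and without the correction. The `results` dictionary is an illustrative name; the counts are copied from the table rather than recomputed from ab_data.csv.

```python
import numpy as np
from scipy import stats

# Observed counts from Table 6.0 (rows: old page, new page; columns: click, no-click)
table = np.array([[17471, 127761],
                  [17274, 128078]])

results = {}
for correction in (True, False):
    stat, p, dof, expected = stats.chi2_contingency(table, correction=correction)
    results[correction] = (stat, p, dof)
    print(f"correction={correction}: stat={stat:.4f}, p={p:.4f}, dof={dof}")
```

With counts this large, shifting each cell by 0.5 changes the statistic only slightly, so the decision at α = 0.05 is the same either way.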
  • 7. Python Code:

        # Interpret the p-value at a 95% confidence level
        alpha = 1.0 - 0.95
        if p <= alpha:
            print('Dependent (reject H0)')
        else:
            print('Independent (fail to reject H0)')

    5. Conclusion
    The p-value is 22.9%, which is greater than the 5% level of significance (alpha), so we do not reject the Null Hypothesis. The users of the old and new pages did not behave significantly differently, and the conversion rates of the two groups are not significantly different. Hence, the test gives no evidence that the new web page performs differently from the old one. Conversion is considered independent of the page shown, as the observed and expected frequencies are similar.

    THE END