Chi squared test for digital analytics

Published in: Data & Analytics

  1. Think stats: chi-square test in digital analytics
     Chi-square test for independence FTW!!11one
     Pawel Kapuscinski, pawel@databall.co, @aliendeg
  2. Chi-square test use cases
       • Is gender a factor in the color preference for a car?
       • Comparing the number of sales from the test experience vs. the control experience (A/B or A/B/n test)
       • Comparing sales revenues of each product before and after a change in strategy
       • Is country a factor in pricing plan preference?
       • Is weather a factor in the sales of different products?
  3. Implementing the chi-square test
       1. Identify the two variables of interest from the data table
       2. State the hypotheses
       3. Compute margin summations
       4. Build the contingency table
       5. Compute the observed chi-square value
       6. Compare the observed value to the critical value
     IMPORTANT: Requirements for the chi-squared test
       • The variables under study are each categorical.
       • If sample data are displayed in a
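
Steps 3 to 5 correspond to the standard Pearson chi-square statistic. As a sketch, write $O_{ij}$ for the observed count in row $i$ and column $j$ of an $r \times c$ table, $E_{ij}$ for the count expected under independence, and $N$ for the grand total (these symbols are introduced here and do not appear on the slide):

```latex
E_{ij} = \frac{\left(\sum_{j'=1}^{c} O_{ij'}\right)\left(\sum_{i'=1}^{r} O_{i'j}\right)}{N},
\qquad
\chi^2 = \sum_{i=1}^{r}\sum_{j=1}^{c} \frac{\left(O_{ij} - E_{ij}\right)^2}{E_{ij}},
\qquad
\mathrm{df} = (r - 1)(c - 1)
```

The two sums in the numerator of $E_{ij}$ are the row and column margin totals from step 3.
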
  4. Hypothesis testing steps
       1. State the null (H0) and alternative (H1) hypotheses
       2. Choose the level of significance
       3. Find the critical value
       4. Find the test statistic
       5. Draw your conclusion
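
Steps 2 to 5 can be sketched in R with the base functions `qchisq` (critical value) and `pchisq` (p-value). The helper name `chisq_decision`, the default `alpha = 0.05`, and the numbers in the example call are illustrative assumptions, not taken from the slides.

```r
# Hypothetical helper: compare a chi-square statistic with the critical
# value for a chosen significance level (alpha = 0.05 is an assumed default).
chisq_decision <- function(statistic, df, alpha = 0.05) {
  critical <- qchisq(1 - alpha, df = df)                        # step 3
  p_value  <- pchisq(statistic, df = df, lower.tail = FALSE)    # upper tail
  list(critical  = critical,
       p_value   = p_value,
       reject_h0 = statistic > critical)                        # step 5
}

# Illustrative call: a statistic of 10 on 3 degrees of freedom
res <- chisq_decision(10, df = 3)
print(res)
```

Rejecting H0 when the statistic exceeds the critical value is equivalent to rejecting when the p-value falls below `alpha`.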
  5. Chi-squared distribution plots
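
The plots themselves are not part of this transcript. A minimal sketch of how comparable curves could be drawn in base R follows; the choice of degrees of freedom, the x-range, and the helper name `plot_chisq_densities` are assumptions.

```r
# Degrees of freedom to compare (an assumed selection, not from the slide)
dfs <- c(1, 2, 3, 5, 10)
x   <- seq(0.05, 20, length.out = 400)

# One column of density values per degrees-of-freedom setting
dens <- sapply(dfs, function(k) dchisq(x, df = k))

# Hypothetical helper that draws the curves; call it to produce the plot
plot_chisq_densities <- function() {
  matplot(x, dens, type = "l", lty = 1, col = seq_along(dfs),
          ylim = c(0, 0.5), xlab = "x", ylab = "density",
          main = "Chi-squared densities")
  legend("topright", legend = paste("df =", dfs),
         col = seq_along(dfs), lty = 1)
}
```

Calling `plot_chisq_densities()` draws the curves on the active graphics device. The mass shifts to the right as the degrees of freedom grow, which is why the critical value depends on the table size.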
  6. Dataset: pricing plans sold across the world

     Sold plans   Professional   Team   Business   Enterprise
     USA                  1220    790        500          190
     UK                    950    590        200          120
     Germany               880    420        320           70
     Sweden                340    260        130           60
     Belgium               290    190        110           80
     Poland                910    290        190           40
     Spain                 250    320        220           50
  7. Hypothesis
       H0: The number of sales of each pricing plan is independent of country
       H1: The number of sales of each pricing plan depends on country
  8. Finding the test statistic (manually, in Excel, and in R)
       • Find the critical value (https://www.ma.utexas.edu/users/davis/375/popecol/tables/chisq.html)
       • Compute margin summations (sum the rows and columns)
       • Build the contingency table
       • Compute the observed chi-square value
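
The manual route can be sketched in base R on the slide 6 dataset. The significance level `alpha = 0.05` and all variable names are assumptions; `qchisq` stands in for the lookup in the linked critical-value table.

```r
# Observed counts from the pricing-plan dataset on slide 6
observed <- matrix(
  c(1220, 790, 500, 190,
     950, 590, 200, 120,
     880, 420, 320,  70,
     340, 260, 130,  60,
     290, 190, 110,  80,
     910, 290, 190,  40,
     250, 320, 220,  50),
  ncol = 4, byrow = TRUE,
  dimnames = list(
    c("USA", "UK", "Germany", "Sweden", "Belgium", "Poland", "Spain"),
    c("Professional", "Team", "Business", "Enterprise")))

alpha <- 0.05                                  # assumed significance level

row_tot <- rowSums(observed)                   # margin summations
col_tot <- colSums(observed)
n       <- sum(observed)

expected  <- outer(row_tot, col_tot) / n       # expected counts under H0
statistic <- sum((observed - expected)^2 / expected)
dof       <- (nrow(observed) - 1) * (ncol(observed) - 1)
critical  <- qchisq(1 - alpha, df = dof)

cat("chi-square =", statistic, " df =", dof, " critical =", critical, "\n")
```

The value of `statistic` computed this way should agree with `chisq.test(observed)$statistic`, which is a convenient cross-check on the manual arithmetic.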
  9. Finding the test statistic - results
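  10. R code

      # Sales counts per pricing plan, one row per country
      df <- data.frame(Prof = c(152, 118, 110, 42, 36, 113, 31),
                       Team = c(98, 73, 52, 32, 23, 36, 40),
                       Business = c(62, 25, 40, 16, 13, 23, 27),
                       Enterprise = c(23, 15, 8, 7, 10, 6, 6))
      # Pearson's chi-squared test of independence on the 7 x 4 table
      chisq.test(df)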
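
The object returned by `chisq.test` carries more than the printed summary. The sketch below stores it and inspects a few of its documented components. It uses the counts exactly as given in the R slide, which are not the slide 6 figures; the variable name `result` is an assumption.

```r
# Counts exactly as given in the slide's R code
df <- data.frame(Prof = c(152, 118, 110, 42, 36, 113, 31),
                 Team = c(98, 73, 52, 32, 23, 36, 40),
                 Business = c(62, 25, 40, 16, 13, 23, 27),
                 Enterprise = c(23, 15, 8, 7, 10, 6, 6))

result <- chisq.test(df)

result$statistic            # observed chi-square value
result$parameter            # degrees of freedom: (7 - 1) * (4 - 1) = 18
result$p.value              # compare with the chosen significance level
round(result$expected, 1)   # expected counts under independence
min(result$expected)        # rule of thumb: each expected count at least 5
```

R may warn that the chi-squared approximation may be incorrect when any expected count is below 5, which ties back to the requirements listed on slide 3.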
  11. Drawing a conclusion
      We can reject the null hypothesis (H0) in favor of H1: the number of sales of each pricing plan depends on country.
  12. Learn more
       • http://stats.stackexchange.com
       • www.analyticsvidhya.com
       • www.dartistics.com
       • Measure Slack - http://join.measure.chat
  13. Assignment / homework

      Transactions        mobile   desktop   tablet
      Direct             3490028    538101   526095
      Paid Search        1229227    214050   210811
      Organic Search      862144    401720   193064
      Referral            228352    129927    39693
      Affiliates           38669     31947    12523
      Email                35681     14284     6615
      Social                9013      5196     2070
      Display                231       171       47
      (Other)                 58        82       36
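
As a starting point for the assignment, the table can be entered in R as a matrix. This is only a setup sketch: the object name `transactions` and the dimension labels `channel` and `device` are assumptions, and the test itself is left to the reader.

```r
# Transactions by channel (rows) and device (columns), from the assignment slide
transactions <- matrix(
  c(3490028, 538101, 526095,
    1229227, 214050, 210811,
     862144, 401720, 193064,
     228352, 129927,  39693,
      38669,  31947,  12523,
      35681,  14284,   6615,
       9013,   5196,   2070,
        231,    171,     47,
         58,     82,     36),
  ncol = 3, byrow = TRUE,
  dimnames = list(
    channel = c("Direct", "Paid Search", "Organic Search", "Referral",
                "Affiliates", "Email", "Social", "Display", "(Other)"),
    device  = c("mobile", "desktop", "tablet")))

# Margin summations, the starting point for the manual computation
addmargins(transactions)
```

From here the same steps as in the worked example apply: state the hypotheses, compute expected counts, and compare the statistic with the critical value.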
  14. Questions?
