Practical Statistical Testing
Adrian Cuyugan
18 August 2014
Agenda
I. T-test and Z-test
II. Chi-square Test of Independence
III. I-MR Control Charts
IV. Binary Logistic Regression
V. Data Sources
T-Test and Z-test
Testing the Difference of Means on a Two-Tailed Test
Problem Statement
Is there a significant difference in means between Forecasted Calls and Calls Offered?
Data Overview
Daily forecasted call volume is generated automatically using Blue Pumpkin and is prepared by the Global Workforce Management Team. Staffing, two-year
historical data and other factors are used to produce this forecast.
Calls offered are the actual calls that came into the IVR as initiated by the user.
The data sample is from April to July 2014.
The data has a bimodal shape because weekends have fewer
calls. It is wiser to perform two separate analyses, one on weekdays and
one on weekends, if the underlying problem is highly sensitive.
For the sake of comparing the two groups,
forecasted and offered, the weekdays and weekends are combined.
Another implication of removing the weekends is that the data would be
extremely skewed to the left.
The samples are collected from two different populations over the same
time period, cover less than 10% of the population, and contain more
than 30 observations each, which is enough to perform inference.
Assuming that the data is normally distributed, we can start exploring
the data further.
Hypothesis
H0 – Forecasted Calls = Calls Offered; the difference in means is 0.
HA – Forecasted Calls ≠ Calls Offered.
Exploratory
The difference between the forecasted calls and the
calls offered varies each month.
Even when looked at as a whole, the difference
in the averages is 6.4 calls in favor of forecasted
calls. It is also noticeable that the difference in
the standard deviations is just 0.25 calls in favor
of calls offered.
                  Min.  1st Qu.  Median  Mean   3rd Qu.  Max.  SD     n
Calls Forecasted  13    23       188     155.1  221      278   89.7   122
Calls Offered     7     23.25    177.5   148.7  210.5    287   89.95  122
Variance test
A non-significant p-value is not interpreted as meaning that the
variances are equal, only that there is insufficient evidence to reject
the null hypothesis that the variances are equal.
F = 0.9945
Numerator DF = 121, Denominator DF = 121
95% confidence interval = 0.695185, 1.422628
p-value = 0.9758
T-test
t = 0.5588
df = 241.998
95% CI = -16.22862, 29.08108
Mean of Forecasted Calls = 155.1230
Mean of Calls Offered = 148.6967
p-value = 0.5768
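The F and t statistics above can be approximated from the summary statistics alone. The following is a minimal Python sketch, not the author's code. It assumes the rounded standard deviations from the Exploratory table (89.7 and 89.95), so the last digits may differ slightly from the slide, and it computes the variance ratio, the Welch t statistic and the Welch-Satterthwaite degrees of freedom.

```python
import math

# Summary statistics taken from the slides (SDs are rounded in the source).
mean_forecast, sd_forecast, n_forecast = 155.1230, 89.7, 122
mean_offered, sd_offered, n_offered = 148.6967, 89.95, 122

# F statistic: ratio of the two sample variances.
f_stat = sd_forecast**2 / sd_offered**2

# Welch t statistic for the difference of two independent means.
var_term_forecast = sd_forecast**2 / n_forecast
var_term_offered = sd_offered**2 / n_offered
std_error = math.sqrt(var_term_forecast + var_term_offered)
t_stat = (mean_forecast - mean_offered) / std_error

# Welch-Satterthwaite approximation of the degrees of freedom.
welch_df = (var_term_forecast + var_term_offered) ** 2 / (
    var_term_forecast**2 / (n_forecast - 1)
    + var_term_offered**2 / (n_offered - 1)
)

print(f"F = {f_stat:.4f}, t = {t_stat:.4f}, df = {welch_df:.3f}")
```

The p-values for F and t need the F and t distributions, which the Python standard library does not provide, so they are left out of this sketch.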
Z-test
z = 0.5588
95% CI = -16.11532, 28.96778
Mean of Forecasted Calls = 155.1230
Mean of Calls Offered = 148.6967
p-value = 0.5763
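The Z-test figures can be checked with the normal distribution in the Python standard library. This is a sketch under the same assumption of rounded standard deviations (89.7 and 89.95) from the Exploratory table, so small rounding differences from the slide are expected.

```python
import math
from statistics import NormalDist

# Summary statistics taken from the slides (SDs are rounded in the source).
mean_forecast, sd_forecast, n_forecast = 155.1230, 89.7, 122
mean_offered, sd_offered, n_offered = 148.6967, 89.95, 122

diff = mean_forecast - mean_offered
std_error = math.sqrt(sd_forecast**2 / n_forecast + sd_offered**2 / n_offered)
z_stat = diff / std_error

standard_normal = NormalDist()

# Two-tailed p-value: probability of a z-score at least this extreme
# in either direction.
p_value = 2 * (1 - standard_normal.cdf(abs(z_stat)))

# 95% confidence interval for the difference in means.
z_crit = standard_normal.inv_cdf(0.975)
ci_low = diff - z_crit * std_error
ci_high = diff + z_crit * std_error

print(f"z = {z_stat:.4f}, p = {p_value:.4f}, CI = ({ci_low:.2f}, {ci_high:.2f})")
```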
Results
Statistics Language
With 241.998 degrees of freedom, the probability of observing a t-score
more extreme than 0.5588 in either direction (below -0.5588 or above
0.5588) from independent samples is 0.5768, therefore the null
hypothesis is not rejected.
Since the 95% confidence interval contains 0, this further
supports the non-rejection of the null hypothesis.
Business Language
There is no statistically significant difference between the daily number
of forecasted calls provided by Global Workforce Management and the
number of calls offered to Voice Center. It is expected that,
without any unusual events, the number of agents forecasted to answer
the calls is sufficient to pass the Abandoned % target. This test supports
a passing Abandoned % rate (although what is measured is only
calls abandoned after more than 30 seconds).
Chi-square Test of Independence
Finding association between two categorical variables
Problem Statement
Is there a significant relationship between CSAT Survey Result and Reported
Source?
Hypothesis
H0 – CSAT Survey Result and Source are independent.
HA – CSAT Survey Result and Source are dependent.
Data Overview
A survey is sent to the user after the logged ticket has been marked as
resolved. There are 8 questions included in the survey and these are
scored from 1 to 5, where 5 is the highest.
Since this test can only be done on categorical variables, CSAT Survey
Result is used as a dichotomous response variable that indicates the survey
result as 1 = Positive (success) and 0 = Negative (failure).
The explanatory variable is Reported Source. Service Desk only creates
tickets from two sources: Phone and Email.
The sample covers less than 10% of the population, and every
expected frequency is at least 5. As the observed counts are more
than enough to perform inference, no bootstrap calculation is
done.
Contingency Table
                            CSAT Result
Reported Source  Contents   Pos     Neg     Row Total
Email            Observed   241     50      291
                 Expected   252.5   38.5
                 Row %      82.8%   17.2%   28.9%
                 Col %      27.6%   37.6%
Phone            Observed   632     83      715
                 Expected   620.5   94.5
                 Row %      88.4%   11.6%   71.1%
                 Col %      72.4%   62.4%
Col Total                   873     133     1006
                            86.8%   13.2%
Results
Statistics Language
Chi-square = 5.6
DF = 1
P-value = 0.0180
The probability that a chi-square statistic with one degree of freedom is
more extreme than 5.6 is 0.0180. Since the p-value is less than 0.05, the
null hypothesis of independence is rejected.
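The expected counts and the chi-square statistic can be reproduced from the observed counts in the contingency table. The sketch below is illustrative Python rather than the author's code. It uses the Pearson statistic without a continuity correction (an assumption, chosen because it is what yields 5.6 from these counts), and the closed-form tail probability it relies on holds only for one degree of freedom.

```python
import math

# Observed counts from the contingency table.
# Rows: Email, Phone. Columns: Positive, Negative.
observed = [[241, 50], [632, 83]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count under independence: row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Pearson chi-square statistic (no continuity correction).
chi_square = sum(
    (obs - exp) ** 2 / exp
    for obs_row, exp_row in zip(observed, expected)
    for obs, exp in zip(obs_row, exp_row)
)

# With 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi_square / 2))

print(f"chi-square = {chi_square:.1f}, p-value = {p_value:.4f}")
```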
Business Language
There might be confounding factors that affect the relationship
between the result of the CSAT and the reported source (where the
ticket originated from), such as the resolution time, how the service
desk or the resolver responded to the ticket, or the problem itself. These
cannot be assumed, however, as this test only examines the relationship
between the given variables (CSAT ~ Reported Source).
It is concluded that there is a statistically significant relationship
between CSAT and Reported Source; that is, the CSAT result varies with
the reported source.
Individuals-Moving Range Control Charts
Is it within control?
Problem Statement
Are there any highly unusual events or systemic patterns that spiked
the number of calls received by Voice Center?
Data Overview
When plotted on an individuals chart, the data shows a seasonal
pattern that recurs after every 5 data points; the recurring points are
the weekends. To produce a more sensible observation, this analysis
covers only weekdays. A separate analysis covering weekends can be
done, if needed.
Observations – Nelson Rules
Special Causes
Rule #1 - Two data points fall below the lower control limit, which
is highly unusual on a normal working day. These are
caused by two holidays in the United States: Memorial Day and
Fourth of July.
Common Causes
• Rule #2 - There are no more than 8 consecutive points that fall
below or above the center line.
• Rule #3 - There are no 6 consecutive points that show an increasing or
decreasing trend.
• Rule #4 - Close to oscillation, as the data points are very random.
• Rule #5 - No points are very close to the limits.
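To make the Rule #1 check concrete, here is a minimal Python sketch of how individuals and moving-range control limits are commonly computed. The call counts are hypothetical (the deck's data is not available), with one holiday-like dip added on purpose. The constants 2.66 and 3.267 are the usual I-MR factors for a moving range of two consecutive points.

```python
from statistics import mean

# Hypothetical weekday call counts with one holiday-like dip
# (illustrative only, not the author's data).
calls = [200, 204] * 5 + [100] + [204, 200] * 5

# Moving range: absolute difference between consecutive observations.
moving_ranges = [abs(b - a) for a, b in zip(calls, calls[1:])]

center_line = mean(calls)
mr_bar = mean(moving_ranges)

# Individuals chart limits: center line +/- 2.66 * average moving range.
upper_limit = center_line + 2.66 * mr_bar
lower_limit = center_line - 2.66 * mr_bar

# Moving-range chart upper limit: 3.267 * average moving range.
mr_upper_limit = 3.267 * mr_bar

# Nelson Rule #1: indices of points beyond the control limits.
rule_1_points = [
    i for i, x in enumerate(calls) if x > upper_limit or x < lower_limit
]

print(rule_1_points)  # [10], the index of the dip
```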
Trend
Looking at the whole dataset plotted as a time series, and not just at
whether each data point is within control, we can further test whether
there is an overall trend.
Based on the decomposition trend, there is no obvious pattern.
An additive model is used, as we do not assume that the calls increase as
time progresses.
A 5-day forecast using exponential smoothing can be done, if needed.
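As an illustration of what such a forecast could look like, the sketch below implements simple exponential smoothing in Python. The deck does not say which smoothing variant would be used, so the choice of the simplest form, the smoothing factor alpha = 0.3 and the call counts are all assumptions. Simple exponential smoothing produces a flat forecast, which is consistent with a series that has no obvious trend.

```python
def simple_exponential_smoothing(series, alpha, horizon):
    """Return a flat `horizon`-step forecast from simple exponential smoothing."""
    level = series[0]  # initialise the level with the first observation
    for value in series[1:]:
        # New level: weighted average of the latest value and the old level.
        level = alpha * value + (1 - alpha) * level
    return [level] * horizon


# Hypothetical weekday call counts (illustrative only).
weekday_calls = [210, 198, 205, 220, 201, 215, 208]

forecast = simple_exponential_smoothing(weekday_calls, alpha=0.3, horizon=5)
print(forecast)
```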
Binary Logistic Regression
Predicting the outcome of a binary categorical dependent variable
Problem Statement
What are the odds and the probabilities of a positive CSAT survey result based on the ticket age (resolution time), reported source, VIP user
status, status reason of the ticket and resolution method?
Data Overview
Each case in the dataset is a survey result submitted by a user. The sample is from August 2013 to July 2014. Only tickets that have been
resolved by groups are included in this analysis.
The CSAT survey consists of 9 questions, of which the first 8 can be rated by the respondent from 1 to 5, where 5 is the highest. The last
question is free-form text in which the respondent can provide comments based on how they feel. The result of the survey is the
sum of the eight rated questions. It would be easier to regress the outcome on the question scores, since the total score is computed mathematically from them, but for this
analysis different variables were used to predict the outcome of the survey.
As previously tested, the survey result is dependent on the reported source. In this test, we can determine if this predictor is still significant.
Some data munging was done to produce categorical variables from continuous variables (dummy coding). Continuous variables can also be used as
predictors.
Response
CSAT Result
1 – Positive (Success)
0 – Negative (Failure)
Predictors
Ticket Age      Reported Source  VIP      Status Reason            Resolution Method
1 – < 3 Days    1 – Phone        1 – Yes  1 – First Call Resolved  1 – Service Desk Assisted
2 – < 7 Days    2 – Email        2 – No   2 – Status Call          2 – Remote Control
3 – < 15 Days                             3 – Others               3 – On-site Support
4 – < 30 Days                                                      4 – Self-service
5 – 30+ Days
Model 1
Assessing the fit of the model
The probability that a chi-square statistic with 11 degrees of freedom is more
extreme than 31.56 is 0.0008970423, which is less than 0.05, so the model is a
better fit than an empty model.
With all of the predictors in the model, and all 11 degrees of
freedom used, only the following predictors are significant to the response
variable:
• Ticket Age
• Status Reason
With so many insignificant predictors, a better model can be built, even though this one is a
good fit against an empty model.
Model 2
Assessing the fit of the model
The probability that a chi-square statistic with 6 degrees of freedom is more extreme
than 26.71 is 0.0001640877, which is less than 0.05.
Overall effect
Since the model is a good fit against the null model, we can proceed with further
diagnostics to test the overall effect of the two predictors:
                      Ticket Age  Status Reason
Categories            5           3
Wald test Chi-square  10.2        7.6
df                    4           2
p-value               0.037       0.022
Both of the categorical predictors are significant. The difference
between two categories is also statistically significant: the difference between < 7 Days
and < 15 Days, and between Status Call and Others.
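The p-values quoted for Model 2 and for the two Wald tests can be checked by hand, because the chi-square upper tail has a closed form when the degrees of freedom are even. The helper below is a hypothetical Python function written for this check, not part of the author's analysis, and it deliberately rejects odd degrees of freedom (such as the 11 in Model 1), where this formula does not apply.

```python
import math


def chi_square_upper_tail_even_df(statistic, df):
    """P(X > statistic) for a chi-square variable with an even number of df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only holds for positive even df")
    half = statistic / 2
    # Sum of (x/2)^k / k! for k = 0 .. df/2 - 1.
    terms = sum(half**k / math.factorial(k) for k in range(df // 2))
    return math.exp(-half) * terms


p_model_2 = chi_square_upper_tail_even_df(26.71, 6)
p_ticket_age = chi_square_upper_tail_even_df(10.2, 4)
p_status_reason = chi_square_upper_tail_even_df(7.6, 2)

print(f"{p_model_2:.7f} {p_ticket_age:.3f} {p_status_reason:.3f}")
```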
Dummy coding and base categories
You might notice that < 3 Days and First Call Resolved are both missing from the
generalized linear model summary. This is because R uses these categories as the
base in computing the coefficients: if the ticket has been resolved in less than 3
days, the coefficient is 0. The same applies to Status Reason.
Unlike in a linear regression, these coefficients are hard to interpret because they
are on the logit scale. We can compute the odds ratio of each predictor
category by exponentiating its coefficient, and the probability by
applying the logistic function to the fitted log-odds.
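The two conversions can be sketched in a few lines of Python. The intercept and coefficient below are made-up values for illustration only; the deck does not list the fitted coefficients, and the function names are not from the author's R code.

```python
import math


def odds_ratio(coefficient):
    """Odds ratio of a category relative to the base category."""
    return math.exp(coefficient)


def inverse_logit(log_odds):
    """Convert log-odds (the logit scale) to a probability."""
    return 1 / (1 + math.exp(-log_odds))


# Hypothetical values, NOT taken from the author's model.
intercept = 2.0        # log-odds of a positive survey for the base categories
coef_category = -0.7   # logit coefficient of one non-base category

print(odds_ratio(coef_category))                 # odds relative to the base
print(inverse_logit(intercept))                  # probability for the base
print(inverse_logit(intercept + coef_category))  # probability for the category
```

A coefficient of 0, as for the base categories, gives an odds ratio of exactly 1, which is why the base categories do not need their own row in the model summary.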
Odds Ratio
For each category of the categorical predictors, the
OddsRatio table gives the odds of a positive survey
relative to the base category. This is more interpretable
than the logit coefficients from the previous slide.
TicketAgeClass StatusReason2 Prob
< 3 Days First Call Resolved 100.00%
< 3 Days First Call Resolved 91.58%
< 3 Days First Call Resolved 84.26%
< 3 Days Status Call 100.00%
< 3 Days Status Call 91.58%
< 3 Days Status Call 84.26%
< 3 Days Others 100.00%
< 3 Days Others 91.58%
< 3 Days Others 84.26%
< 7 Days First Call Resolved 92.26%
< 7 Days First Call Resolved 85.44%
< 7 Days Status Call 92.26%
< 7 Days Status Call 85.44%
< 7 Days Others 92.26%
< 7 Days Others 85.44%
< 15 Days First Call Resolved 85.77%
< 15 Days First Call Resolved 74.79%
< 15 Days Status Call 85.77%
< 15 Days Status Call 74.79%
< 15 Days Others 85.77%
< 15 Days Others 74.79%
< 30 Days First Call Resolved 90.17%
< 30 Days First Call Resolved 81.86%
< 30 Days Status Call 90.17%
< 30 Days Status Call 81.86%
< 30 Days Others 90.17%
< 30 Days Others 81.86%
30+ Days First Call Resolved 37.50%
30+ Days Status Call 37.50%
30+ Days Others 37.50%
Since the base categories are < 3
Days and First Call Resolved, the
odds of a positive CSAT survey
are 10 times higher for
tickets that were resolved in
less than 3 days with a
status reason of First Call Resolved
than for tickets
that took longer to resolve
and have a different status
reason.
Probability
The probability of a
success, or positive CSAT survey,
may range because of the
variation in the data. Still, we
have an idea of what the
outcome of the survey would be based on
the significant predictors that we
have finalized, Ticket Age and
Status Reason.
Practical Statistical Testing
Thank You

More Related Content

What's hot

Hypothesis Testing: Central Tendency – Normal (Compare 1:Standard)
Hypothesis Testing: Central Tendency – Normal (Compare 1:Standard)Hypothesis Testing: Central Tendency – Normal (Compare 1:Standard)
Hypothesis Testing: Central Tendency – Normal (Compare 1:Standard)Matt Hansen
 
Hypothesis Testing: Proportions (Compare 1:Standard)
Hypothesis Testing: Proportions (Compare 1:Standard)Hypothesis Testing: Proportions (Compare 1:Standard)
Hypothesis Testing: Proportions (Compare 1:Standard)Matt Hansen
 
Hypothesis Testing: Central Tendency – Normal (Compare 1:1)
Hypothesis Testing: Central Tendency – Normal (Compare 1:1)Hypothesis Testing: Central Tendency – Normal (Compare 1:1)
Hypothesis Testing: Central Tendency – Normal (Compare 1:1)Matt Hansen
 
Hypothesis Testing: Central Tendency – Non-Normal (Compare 1:Standard)
Hypothesis Testing: Central Tendency – Non-Normal (Compare 1:Standard)Hypothesis Testing: Central Tendency – Non-Normal (Compare 1:Standard)
Hypothesis Testing: Central Tendency – Non-Normal (Compare 1:Standard)Matt Hansen
 
Hypothesis Testing: Central Tendency – Non-Normal (Compare 1:1)
Hypothesis Testing: Central Tendency – Non-Normal (Compare 1:1)Hypothesis Testing: Central Tendency – Non-Normal (Compare 1:1)
Hypothesis Testing: Central Tendency – Non-Normal (Compare 1:1)Matt Hansen
 
Hypothesis Testing: Relationships (Overview)
Hypothesis Testing: Relationships (Overview)Hypothesis Testing: Relationships (Overview)
Hypothesis Testing: Relationships (Overview)Matt Hansen
 
Hypothesis Testing: Spread (Compare 2+ Factors)
Hypothesis Testing: Spread (Compare 2+ Factors)Hypothesis Testing: Spread (Compare 2+ Factors)
Hypothesis Testing: Spread (Compare 2+ Factors)Matt Hansen
 
Satisfaction and loyalty
Satisfaction and loyaltySatisfaction and loyalty
Satisfaction and loyaltyTheDataNation
 
Hypothesis Testing: Central Tendency – Non-Normal (Nonparametric Overview)
Hypothesis Testing: Central Tendency – Non-Normal (Nonparametric Overview)Hypothesis Testing: Central Tendency – Non-Normal (Nonparametric Overview)
Hypothesis Testing: Central Tendency – Non-Normal (Nonparametric Overview)Matt Hansen
 
Hypothesis Testing: Proportions (Compare 2+ Factors)
Hypothesis Testing: Proportions (Compare 2+ Factors)Hypothesis Testing: Proportions (Compare 2+ Factors)
Hypothesis Testing: Proportions (Compare 2+ Factors)Matt Hansen
 
MAT 510 Great Stories /newtonhelp.com
MAT 510 Great Stories /newtonhelp.comMAT 510 Great Stories /newtonhelp.com
MAT 510 Great Stories /newtonhelp.combellflower184
 
MAT 510 Inspiring Innovation/tutorialrank.com
 MAT 510 Inspiring Innovation/tutorialrank.com MAT 510 Inspiring Innovation/tutorialrank.com
MAT 510 Inspiring Innovation/tutorialrank.comjonhson139
 
Hypothesis Testing: Relationships (Compare 2+ Factors)
Hypothesis Testing: Relationships (Compare 2+ Factors)Hypothesis Testing: Relationships (Compare 2+ Factors)
Hypothesis Testing: Relationships (Compare 2+ Factors)Matt Hansen
 
MAT 510 Effective Communication - tutorialrank.com
MAT 510  Effective Communication - tutorialrank.comMAT 510  Effective Communication - tutorialrank.com
MAT 510 Effective Communication - tutorialrank.comBartholomew46
 
Basic Statistical Concepts & Decision-Making
Basic Statistical Concepts & Decision-MakingBasic Statistical Concepts & Decision-Making
Basic Statistical Concepts & Decision-MakingPenn State University
 
Mat 510 Enhance teaching / snaptutorial.com
Mat 510 Enhance teaching / snaptutorial.comMat 510 Enhance teaching / snaptutorial.com
Mat 510 Enhance teaching / snaptutorial.comBaileya19
 
MAT 510 Effective Communication - snaptutorial.com
MAT 510 Effective Communication - snaptutorial.comMAT 510 Effective Communication - snaptutorial.com
MAT 510 Effective Communication - snaptutorial.comdonaldzs24
 
MAT80 - White paper july 2017 - Prof. P. Irwing
MAT80 - White paper july 2017 - Prof. P. IrwingMAT80 - White paper july 2017 - Prof. P. Irwing
MAT80 - White paper july 2017 - Prof. P. IrwingPaul Irwing
 
Audit sampling
Audit samplingAudit sampling
Audit samplingzaur2009
 

What's hot (20)

Hypothesis Testing: Central Tendency – Normal (Compare 1:Standard)
Hypothesis Testing: Central Tendency – Normal (Compare 1:Standard)Hypothesis Testing: Central Tendency – Normal (Compare 1:Standard)
Hypothesis Testing: Central Tendency – Normal (Compare 1:Standard)
 
Hypothesis Testing: Proportions (Compare 1:Standard)
Hypothesis Testing: Proportions (Compare 1:Standard)Hypothesis Testing: Proportions (Compare 1:Standard)
Hypothesis Testing: Proportions (Compare 1:Standard)
 
Hypothesis Testing: Central Tendency – Normal (Compare 1:1)
Hypothesis Testing: Central Tendency – Normal (Compare 1:1)Hypothesis Testing: Central Tendency – Normal (Compare 1:1)
Hypothesis Testing: Central Tendency – Normal (Compare 1:1)
 
Hypothesis Testing: Central Tendency – Non-Normal (Compare 1:Standard)
Hypothesis Testing: Central Tendency – Non-Normal (Compare 1:Standard)Hypothesis Testing: Central Tendency – Non-Normal (Compare 1:Standard)
Hypothesis Testing: Central Tendency – Non-Normal (Compare 1:Standard)
 
Hypothesis Testing: Central Tendency – Non-Normal (Compare 1:1)
Hypothesis Testing: Central Tendency – Non-Normal (Compare 1:1)Hypothesis Testing: Central Tendency – Non-Normal (Compare 1:1)
Hypothesis Testing: Central Tendency – Non-Normal (Compare 1:1)
 
Hypothesis Testing: Relationships (Overview)
Hypothesis Testing: Relationships (Overview)Hypothesis Testing: Relationships (Overview)
Hypothesis Testing: Relationships (Overview)
 
Hypothesis Testing: Spread (Compare 2+ Factors)
Hypothesis Testing: Spread (Compare 2+ Factors)Hypothesis Testing: Spread (Compare 2+ Factors)
Hypothesis Testing: Spread (Compare 2+ Factors)
 
Satisfaction and loyalty
Satisfaction and loyaltySatisfaction and loyalty
Satisfaction and loyalty
 
Hypothesis Testing: Central Tendency – Non-Normal (Nonparametric Overview)
Hypothesis Testing: Central Tendency – Non-Normal (Nonparametric Overview)Hypothesis Testing: Central Tendency – Non-Normal (Nonparametric Overview)
Hypothesis Testing: Central Tendency – Non-Normal (Nonparametric Overview)
 
Hypothesis Testing: Proportions (Compare 2+ Factors)
Hypothesis Testing: Proportions (Compare 2+ Factors)Hypothesis Testing: Proportions (Compare 2+ Factors)
Hypothesis Testing: Proportions (Compare 2+ Factors)
 
MAT 510 Great Stories /newtonhelp.com
MAT 510 Great Stories /newtonhelp.comMAT 510 Great Stories /newtonhelp.com
MAT 510 Great Stories /newtonhelp.com
 
MAT 510 Inspiring Innovation/tutorialrank.com
 MAT 510 Inspiring Innovation/tutorialrank.com MAT 510 Inspiring Innovation/tutorialrank.com
MAT 510 Inspiring Innovation/tutorialrank.com
 
Doc 20190909-wa0025
Doc 20190909-wa0025Doc 20190909-wa0025
Doc 20190909-wa0025
 
Hypothesis Testing: Relationships (Compare 2+ Factors)
Hypothesis Testing: Relationships (Compare 2+ Factors)Hypothesis Testing: Relationships (Compare 2+ Factors)
Hypothesis Testing: Relationships (Compare 2+ Factors)
 
MAT 510 Effective Communication - tutorialrank.com
MAT 510  Effective Communication - tutorialrank.comMAT 510  Effective Communication - tutorialrank.com
MAT 510 Effective Communication - tutorialrank.com
 
Basic Statistical Concepts & Decision-Making
Basic Statistical Concepts & Decision-MakingBasic Statistical Concepts & Decision-Making
Basic Statistical Concepts & Decision-Making
 
Mat 510 Enhance teaching / snaptutorial.com
Mat 510 Enhance teaching / snaptutorial.comMat 510 Enhance teaching / snaptutorial.com
Mat 510 Enhance teaching / snaptutorial.com
 
MAT 510 Effective Communication - snaptutorial.com
MAT 510 Effective Communication - snaptutorial.comMAT 510 Effective Communication - snaptutorial.com
MAT 510 Effective Communication - snaptutorial.com
 
MAT80 - White paper july 2017 - Prof. P. Irwing
MAT80 - White paper july 2017 - Prof. P. IrwingMAT80 - White paper july 2017 - Prof. P. Irwing
MAT80 - White paper july 2017 - Prof. P. Irwing
 
Audit sampling
Audit samplingAudit sampling
Audit sampling
 

Viewers also liked

'The Call' Analysis of a Clip.
'The Call' Analysis of a Clip.'The Call' Analysis of a Clip.
'The Call' Analysis of a Clip.Dylan Koolman
 
Predictive Analytics: Business Process Analysis And Optimization a CRM Case S...
Predictive Analytics: Business Process Analysis And Optimization a CRM Case S...Predictive Analytics: Business Process Analysis And Optimization a CRM Case S...
Predictive Analytics: Business Process Analysis And Optimization a CRM Case S...Andreas Freund, PhD
 
Performance monitoring and call tracing in microservice environments
Performance monitoring and call tracing in microservice environmentsPerformance monitoring and call tracing in microservice environments
Performance monitoring and call tracing in microservice environmentsMartin Gutenbrunner
 
Customer Experience Strategy & Operations Transformation
Customer Experience Strategy & Operations TransformationCustomer Experience Strategy & Operations Transformation
Customer Experience Strategy & Operations TransformationMarcel Barrera
 
Metrics that Matter: Focusing on key metrics for an efficient service desk an...
Metrics that Matter: Focusing on key metrics for an efficient service desk an...Metrics that Matter: Focusing on key metrics for an efficient service desk an...
Metrics that Matter: Focusing on key metrics for an efficient service desk an...Freshservice
 
Call Center Process Management 101
Call Center Process Management 101Call Center Process Management 101
Call Center Process Management 101Sarfraz Taj
 
Huawei case analysis call drop
Huawei case analysis call dropHuawei case analysis call drop
Huawei case analysis call dropMuffat Itoro
 
Transforming Customer Experience: From Moments to Journeys
Transforming Customer Experience: From Moments to JourneysTransforming Customer Experience: From Moments to Journeys
Transforming Customer Experience: From Moments to JourneysMcKinsey on Marketing & Sales
 

Viewers also liked (11)

'The Call' Analysis of a Clip.
'The Call' Analysis of a Clip.'The Call' Analysis of a Clip.
'The Call' Analysis of a Clip.
 
Using Analytics to Drive Transformation Strategies
Using Analytics to Drive Transformation StrategiesUsing Analytics to Drive Transformation Strategies
Using Analytics to Drive Transformation Strategies
 
Predictive Analytics: Business Process Analysis And Optimization a CRM Case S...
Predictive Analytics: Business Process Analysis And Optimization a CRM Case S...Predictive Analytics: Business Process Analysis And Optimization a CRM Case S...
Predictive Analytics: Business Process Analysis And Optimization a CRM Case S...
 
Performance monitoring and call tracing in microservice environments
Performance monitoring and call tracing in microservice environmentsPerformance monitoring and call tracing in microservice environments
Performance monitoring and call tracing in microservice environments
 
Case analysis call drop
Case analysis call dropCase analysis call drop
Case analysis call drop
 
Customer Experience Strategy & Operations Transformation
Customer Experience Strategy & Operations TransformationCustomer Experience Strategy & Operations Transformation
Customer Experience Strategy & Operations Transformation
 
Metrics that Matter: Focusing on key metrics for an efficient service desk an...
Metrics that Matter: Focusing on key metrics for an efficient service desk an...Metrics that Matter: Focusing on key metrics for an efficient service desk an...
Metrics that Matter: Focusing on key metrics for an efficient service desk an...
 
Call Center Process Management 101
Call Center Process Management 101Call Center Process Management 101
Call Center Process Management 101
 
Huawei case analysis call drop
Huawei case analysis call dropHuawei case analysis call drop
Huawei case analysis call drop
 
Customer Journey Analytics and Big Data
Customer Journey Analytics and Big DataCustomer Journey Analytics and Big Data
Customer Journey Analytics and Big Data
 
Transforming Customer Experience: From Moments to Journeys
Transforming Customer Experience: From Moments to JourneysTransforming Customer Experience: From Moments to Journeys
Transforming Customer Experience: From Moments to Journeys
 

Similar to Practical Statistical Testing

Project Report for Mostan Superstore.pptx
Project Report for Mostan Superstore.pptxProject Report for Mostan Superstore.pptx
Project Report for Mostan Superstore.pptxChristianahEfunniyi
 
Data Analysis for Graduate Studies Summary
Data Analysis for Graduate Studies SummaryData Analysis for Graduate Studies Summary
Data Analysis for Graduate Studies SummaryKelvinNMhina
 
Statistical ProcessesCan descriptive statistical processes b.docx
Statistical ProcessesCan descriptive statistical processes b.docxStatistical ProcessesCan descriptive statistical processes b.docx
Statistical ProcessesCan descriptive statistical processes b.docxdarwinming1
 
2016 Symposium Poster - statistics - Final
2016 Symposium Poster - statistics - Final2016 Symposium Poster - statistics - Final
2016 Symposium Poster - statistics - FinalBrian Lin
 
Computing Descriptive Statistics © 2014 Argos.docx
 Computing Descriptive Statistics     © 2014 Argos.docx Computing Descriptive Statistics     © 2014 Argos.docx
Computing Descriptive Statistics © 2014 Argos.docxaryan532920
 
Computing Descriptive Statistics © 2014 Argos.docx
Computing Descriptive Statistics     © 2014 Argos.docxComputing Descriptive Statistics     © 2014 Argos.docx
Computing Descriptive Statistics © 2014 Argos.docxAASTHA76
 
Question 1. 1.You are given only three quarterly seasonal indi.docx
Question 1. 1.You are given only three quarterly seasonal indi.docxQuestion 1. 1.You are given only three quarterly seasonal indi.docx
Question 1. 1.You are given only three quarterly seasonal indi.docxteofilapeerless
 
Detecting incorrectly implemented experiments
Detecting incorrectly implemented experimentsDetecting incorrectly implemented experiments
Detecting incorrectly implemented experimentsOptimizely
 
1. You are given only three quarterly seasonal indices and quarter.docx
1. You are given only three quarterly seasonal indices and quarter.docx1. You are given only three quarterly seasonal indices and quarter.docx
1. You are given only three quarterly seasonal indices and quarter.docxjackiewalcutt
 
Sampling and statistical inference
Sampling and statistical inferenceSampling and statistical inference
Sampling and statistical inferenceBhavik A Shah
 
As mentioned earlier, the mid-term will have conceptual and quanti.docx
As mentioned earlier, the mid-term will have conceptual and quanti.docxAs mentioned earlier, the mid-term will have conceptual and quanti.docx
As mentioned earlier, the mid-term will have conceptual and quanti.docxfredharris32
 
Did Something Change? - Using Statistical Techniques to Interpret Service and...
Did Something Change? - Using Statistical Techniques to Interpret Service and...Did Something Change? - Using Statistical Techniques to Interpret Service and...
Did Something Change? - Using Statistical Techniques to Interpret Service and...Joao Galdino Mello de Souza
 
HST- 0621-151 PAPER ANALYSIS OF BANK PRODUCTIVITY USING PANEL CAUSALITY TEST.pdf
HST- 0621-151 PAPER ANALYSIS OF BANK PRODUCTIVITY USING PANEL CAUSALITY TEST.pdfHST- 0621-151 PAPER ANALYSIS OF BANK PRODUCTIVITY USING PANEL CAUSALITY TEST.pdf
HST- 0621-151 PAPER ANALYSIS OF BANK PRODUCTIVITY USING PANEL CAUSALITY TEST.pdfDR BHADRAPPA HARALAYYA
 
Using Investigative Analytics to Speed New Drugs to Market
Using Investigative Analytics to Speed New Drugs to MarketUsing Investigative Analytics to Speed New Drugs to Market
Using Investigative Analytics to Speed New Drugs to MarketCognizant
 
Telesidang 4 bab_8_9_10stst
Telesidang 4 bab_8_9_10ststTelesidang 4 bab_8_9_10stst
Telesidang 4 bab_8_9_10ststNor Ihsan
 
Classification via Logistic Regression
Classification via Logistic RegressionClassification via Logistic Regression
Classification via Logistic RegressionTaweh Beysolow II
 
File 498 Doc 27 03dm Exploratorydataanalysis
File 498 Doc 27 03dm ExploratorydataanalysisFile 498 Doc 27 03dm Exploratorydataanalysis
File 498 Doc 27 03dm Exploratorydataanalysismupa
 

Similar to Practical Statistical Testing (20)

Project Report for Mostan Superstore.pptx
Project Report for Mostan Superstore.pptxProject Report for Mostan Superstore.pptx
Project Report for Mostan Superstore.pptx
 
Data Analysis for Graduate Studies Summary
Data Analysis for Graduate Studies SummaryData Analysis for Graduate Studies Summary
Data Analysis for Graduate Studies Summary
 

Practical Statistical Testing

  • 1. Practical Statistical Testing Adrian Cuyugan 18 August 2014
  • 2. Agenda: I. T-test and Z-test (slide 3); II. Chi-square Test of Independence (slide 6); III. I-MR Control Charts (slide 8); IV. Binary Logistic Regression (slide 11); V. Data Sources (slide 15)
  • 3. T-Test and Z-test: Testing the Difference of Means on a Two-Tailed Test. Problem Statement: Is there a significant difference in means between Forecasted Calls and Calls Offered? Data Overview: The daily forecasted call volume is produced automatically using Blue Pumpkin and is prepared by the Global Workforce Management Team. Staffing, two-year historical data and other factors are used to produce this forecast. Calls offered are the actual calls that came into the IVR as initiated by the user. The data sample is from April to July 2014. The data has a bimodal shape because weekends have fewer calls. It is wiser to perform two separate analyses, one on weekdays and one on weekends, if the underlying problem is highly sensitive. For the sake of comparing the two groups, forecasted and offered, the weekdays and weekends are combined. Another implication of removing the weekends is that the data would be extremely left-skewed. Samples are collected from two different populations over the same time period; each sample is less than 10% of its population and has more than 30 observations, which is enough to perform inference. Assuming that the data is normally distributed, we can start exploring the data further. Hypothesis: H0 – mean Forecasted Calls = mean Calls Offered (difference in means = 0). HA – mean Forecasted Calls ≠ mean Calls Offered.
  • 4. T-Test and Z-test: Testing the Difference of Means on a Two-Tailed Test. Exploratory: The differences between the forecasted calls and the calls offered vary each month. Even when viewed as a whole, the difference in means is 6.4 calls in favor of forecasted calls. It is also noticeable that the difference in standard deviations is just 0.25 calls in favor of calls offered. Calls Forecasted: Min. 13, 1st Qu. 23, Median 188, Mean 155.1, 3rd Qu. 221, Max. 278, SD 89.7, n 122. Calls Offered: Min. 7, 1st Qu. 23.25, Median 177.5, Mean 148.7, 3rd Qu. 210.5, Max. 287, SD 89.95, n 122.
  • 5. T-Test and Z-test: Testing the Difference of Means on a Two-Tailed Test. Hypothesis: H0 – mean Forecasted Calls = mean Calls Offered (difference in means = 0). HA – mean Forecasted Calls ≠ mean Calls Offered. Variance test: F = 0.9945, Numerator DF = 121, Denominator DF = 121, 95% confidence interval = (0.695185, 1.422628), p-value = 0.9758. A non-significant p-value is not interpreted as meaning that the variances are equal, only that there is insufficient evidence to reject the null hypothesis that the variances are equal. T-test: t = 0.5588, df = 241.998, 95% CI = (-16.22862, 29.08108), Mean of Forecasted Calls = 155.1230, Mean of Calls Offered = 148.6967, p-value = 0.5768. Z-test: z = 0.5588, 95% CI = (-16.11532, 28.96778), Mean of Forecasted Calls = 155.1230, Mean of Calls Offered = 148.6967, p-value = 0.5763. Results, Statistics Language: With 241.998 degrees of freedom, the probability of observing a t-score at least as extreme as 0.5588 in either direction from independent samples, if the null hypothesis is true, is 0.5768. This is greater than 0.05, therefore the null hypothesis is not rejected. Since the confidence interval contains 0, this further supports the non-rejection of the null hypothesis. Business Language: The difference between the daily number of calls forecasted by Global Workforce Management and the number of calls offered to Voice Center is not statistically significant. Without any unusual events, the number of agents forecasted to answer the calls is expected to be sufficient to pass the Abandoned % target; this test supports a passing Abandoned % rate (although only calls abandoned after more than 30 seconds are measured).
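The test statistic, degrees of freedom and z-based interval on this slide can be approximately reproduced from the summary statistics alone. The following is a minimal Python sketch using only the standard library, not the author's original computation; the function name `welch_summary` is illustrative, and because the slide reports rounded standard deviations the results agree only to about three decimal places.

```python
import math
from statistics import NormalDist

# Summary statistics as reported on the slides (the SDs are rounded there).
mean_f, sd_f, n_f = 155.1230, 89.70, 122   # Forecasted Calls
mean_o, sd_o, n_o = 148.6967, 89.95, 122   # Calls Offered


def welch_summary(m1, s1, n1, m2, s2, n2):
    """Return (difference in means, standard error, Welch-Satterthwaite df)."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2
    se = math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return m1 - m2, se, df


diff, se, df = welch_summary(mean_f, sd_f, n_f, mean_o, sd_o, n_o)
stat = diff / se                      # same value serves as the t and z statistic

# Two-tailed p-value and 95% interval under the normal (z) approximation.
std_normal = NormalDist()
p_z = 2 * (1 - std_normal.cdf(abs(stat)))
z_crit = std_normal.inv_cdf(0.975)
ci_z = (diff - z_crit * se, diff + z_crit * se)

print(f"statistic = {stat:.4f}, Welch df = {df:.3f}")
print(f"z-test p-value = {p_z:.4f}, 95% CI = ({ci_z[0]:.3f}, {ci_z[1]:.3f})")
```

The t-based p-value and interval need the t distribution, which the standard library does not provide; with about 242 degrees of freedom they differ only slightly from the z-based values, as the slide's two result sets show.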
  • 6. Chi-square Test of Independence: Finding association between two categorical variables. Problem Statement: Is there a significant relationship between CSAT Survey Result and Reported Source? Hypothesis: H0 – CSAT Survey Result and Source are independent. HA – CSAT Survey Result and Source are dependent. Data Overview: A survey is sent to the user after the logged ticket has been marked as resolved. There are 8 questions included in the survey and these are scored from 1 to 5, where 5 is the highest. Since this test can only be done on categorical variables, CSAT Survey Result is used as a dichotomous response variable that indicates the survey result as 1 = Positive (success) and 0 = Negative (failure). The explanatory variable is Reported Source. Service Desk only creates tickets from two sources: Phone and Email. The sample is less than 10% of the population, and every expected frequency is at least 5. As the observed counts are more than enough to perform inference, no bootstrap calculation is done.
  • 7. Chi-square Test of Independence: Finding association between two categorical variables. Hypothesis: H0 – CSAT Survey Result and Source are independent. HA – CSAT Survey Result and Source are dependent. Contingency Table (CSAT Result by Reported Source). Email: Observed 241 Pos, 50 Neg, Row Total 291; Expected 252.5 Pos, 38.5 Neg; Row % 82.8% Pos, 17.2% Neg (28.9% of all tickets); Col % 27.6% Pos, 37.6% Neg. Phone: Observed 632 Pos, 83 Neg, Row Total 715; Expected 620.5 Pos, 94.5 Neg; Row % 88.4% Pos, 11.6% Neg (71.1% of all tickets); Col % 72.4% Pos, 62.4% Neg. Col Total: 873 Pos (86.8%), 133 Neg (13.2%), 1006 overall. Results, Statistics Language: Chi-square = 5.6, DF = 1, P-value = 0.0180. The probability that a chi-square statistic with one degree of freedom is more extreme than 5.6 is 0.0180, which is less than 0.05, so the null hypothesis of independence is rejected. Business Language: There might be confounding factors that affect the relationship between the CSAT result and the reported source (where the ticket originated), such as the resolution time, how the service desk or the resolver responded to the ticket, or the problem itself. These cannot be assumed, however, as this test only examines the relationship between the given variables (CSAT ~ Reported Source). It is concluded that there is a statistically significant relationship between CSAT and Reported Source; that is, the two are dependent.
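The expected counts, chi-square statistic and p-value above can be checked from the observed counts alone. This is a minimal Python sketch with no continuity correction (an assumption that matches the reported value of 5.6); the function name `chi_square_2x2` is illustrative, and the p-value formula used is valid only for one degree of freedom.

```python
import math

# Observed counts from the contingency table: (Positive, Negative).
observed = {"Email": (241, 50), "Phone": (632, 83)}


def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table (df = 1)."""
    rows = list(table.values())
    row_totals = [sum(r) for r in rows]
    col_totals = [sum(c) for c in zip(*rows)]
    total = sum(row_totals)

    expected = []
    stat = 0.0
    for row, row_total in zip(rows, row_totals):
        # Expected count under independence: row total * column total / N.
        exp_row = [row_total * col_total / total for col_total in col_totals]
        expected.append(exp_row)
        stat += sum((o - e) ** 2 / e for o, e in zip(row, exp_row))

    # With 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value, expected


stat, p_value, expected = chi_square_2x2(observed)
print(f"chi-square = {stat:.2f}, p-value = {p_value:.4f}")
for source, exp_row in zip(observed, expected):
    print(source, [round(e, 1) for e in exp_row])
```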
  • 8. Individuals-Moving Range Control Charts: Is it within control? Problem Statement: Are there any highly unusual events that spiked the number of calls, or any systemic pattern in the calls received by Voice Center? Data Overview: When the data is plotted on an individuals chart, a seasonal pattern appears after every 5 data points; these recurring points are the weekends. To produce a more sensible observation, this analysis covers weekdays only; a separate analysis covering weekends can be done if needed.
  • 9. Individuals-Moving Range Control Charts: Is it within control? Observations – Nelson Rules. Special Causes: Rule #1 - Two data points fall below the lower control limit, which is highly unusual on a normal working day. This is caused by two holidays in the United States: Memorial Day and Fourth of July. Common Causes: • Rule #2 - There are no more than 8 consecutive points that fall below or above the center line. • Rule #3 - There are no 6 consecutive points showing an increasing or decreasing trend. • Rule #4 - The data points come close to oscillating, as they are very random. • Rule #5 - There are no points that are very close to the limits.
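The Rule #1 check above depends on how the control limits are drawn. The slides do not show the calculation, so the following Python sketch assumes the conventional individuals-chart limits (center line ± 2.66 × average moving range, with 3.267 × average moving range as the upper limit of the moving-range chart). The `weekday_calls` values are hypothetical and are not the Voice Center data.

```python
def imr_limits(values):
    """Center line and control limits for an I-MR chart (moving range of 2)."""
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    center = sum(values) / len(values)
    mr_bar = sum(moving_ranges) / len(moving_ranges)
    return {
        "center": center,
        "lcl": center - 2.66 * mr_bar,   # 2.66 = 3 / d2, with d2 = 1.128
        "ucl": center + 2.66 * mr_bar,
        "mr_bar": mr_bar,
        "mr_ucl": 3.267 * mr_bar,        # D4 constant for subgroups of size 2
    }


def rule1_violations(values):
    """Indices of points outside the control limits (Nelson Rule #1)."""
    limits = imr_limits(values)
    return [
        i for i, v in enumerate(values)
        if v < limits["lcl"] or v > limits["ucl"]
    ]


# Hypothetical weekday call counts with one holiday-like dip.
weekday_calls = [200, 210, 195, 205, 215, 190, 60, 208, 202, 198]

limits = imr_limits(weekday_calls)
print({name: round(value, 2) for name, value in limits.items()})
print("Rule #1 violations at indices:", rule1_violations(weekday_calls))
```

Note that an extreme point also inflates the average moving range and therefore widens the limits, which is one reason special causes are usually investigated before the limits are fixed.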
  • 10. Individuals-Moving Range Control Charts: Is it within control? Trend: Looking at the whole dataset plotted as a time series, and not just at whether each data point is within control, we can further test whether there is an overall trend. Based on the trend component of the decomposition, there is no obvious pattern. An additive model is used, as we do not assume that the calls increase as time progresses. A 5-day forecast using exponential smoothing can be done, if needed.
  • 11. Binary Logistic Regression: Predicting the outcome of a binary categorical dependent variable. Problem Statement: What are the odds and the probabilities of a positive CSAT survey result given the ticket age (resolution time), reported source, VIP user status, status reason of the ticket and resolution method? Data Overview: Each case in the dataset is a survey result submitted by a user. The sample is from August 2013 to July 2014. Only tickets that have been resolved by groups are included in this analysis. The CSAT survey consists of 9 questions, of which the first 8 are rated by the respondent from 1 to 5, where 5 is the highest. The last question is free-form text in which the respondent can provide comments. The result of the survey is the sum of the eight rated questions. It would be easier to regress the outcome on the questions themselves, since the total score is computed directly from them, but for this analysis different variables were used to predict the outcome of the survey. As previously tested, the survey result is dependent on the reported source. In this test, we can determine whether this predictor is still significant. Some data munging was done to produce categorical variables from continuous variables (dummy coding). Continuous variables can also be predictors. Response: CSAT Result (1 – Positive (Success), 0 – Negative (Failure)). Predictors: Ticket Age (1 – < 3 Days, 2 – < 7 Days, 3 – < 15 Days, 4 – < 30 Days, 5 – 30+ Days); Reported Source (1 – Phone, 2 – Email); VIP (1 – Yes, 2 – No); Status Reason (1 – First Call Resolved, 2 – Status Call, 3 – Others); Resolution Method (1 – Service Desk Assisted, 2 – Remote Control, 3 – On-site Support, 4 – Self-service).
  • 12. Binary Logistic Regression: Predicting the outcome of a binary categorical dependent variable. Model 1, Assessing the fit of the model: The probability that a chi-square statistic with 11 degrees of freedom is more extreme than 31.56 is less than 0.05 (p-value = 0.0008970423), so the model is a significantly better fit than the empty model. With all 11 degrees of freedom used by the predictors, only the following predictors are significant to the response variable: • Ticket Age • Status Reason. Because many of the predictors are insignificant, a better model can be built even though this one is a good fit against an empty model.
  • 13. Binary Logistic Regression: Predicting the outcome of a binary categorical dependent variable. Model 2, Assessing the fit of the model: The probability that a chi-square statistic with 6 degrees of freedom is more extreme than 26.71 is less than 0.05 (p-value = 0.0001640877). Overall effect: Since the model is a good fit against the null model, we can proceed to test the overall effect of each of the two predictors, since Ticket Age has 5 categories and Status Reason has 3 categories. Ticket Age: Wald test Chi-square = 10.2, df = 4, p-value = 0.037. Status Reason: Wald test Chi-square = 7.6, df = 2, p-value = 0.022. Both of the categorical predictors are significant. This means that the difference between two categories is statistically significant: the difference between < 7 Days and < 15 Days, and between Status Call and Others. Dummy coding and base categories: You might notice that < 3 Days and First Call Resolved are both missing from the generalized linear model summary. This is because R uses these categories as the base in computing the coefficients: if the ticket has been resolved in less than 3 days, the coefficient is 0. The same applies to Status Reason. Unlike in a linear regression, these coefficients are hard to interpret directly because they are on the logit (log-odds) scale. We can compute the odds ratio of each predictor category by exponentiating its coefficient, and the probability by applying the logistic function to the fitted logit.
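The p-values reported for the Model 2 fit and for the two Wald tests can be checked by hand, because the chi-square upper-tail probability has a simple closed form when the degrees of freedom are even. The sketch below is a Python illustration, not the R output used for the slides; the function name `chi2_sf_even_df` is hypothetical and the formula shown does not cover odd degrees of freedom such as the 11 used for Model 1.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for a chi-square with even df.

    Uses the closed form exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i          # builds (x/2)^i / i! incrementally
        total += term
    return math.exp(-half) * total


model_2_p = chi2_sf_even_df(26.71, 6)      # model fit against the null model
ticket_age_p = chi2_sf_even_df(10.2, 4)    # Wald test, Ticket Age
status_reason_p = chi2_sf_even_df(7.6, 2)  # Wald test, Status Reason

print(f"Model 2 fit:   p = {model_2_p:.7f}")
print(f"Ticket Age:    p = {ticket_age_p:.3f}")
print(f"Status Reason: p = {status_reason_p:.3f}")
```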
  • 14. Binary Logistic Regression: Predicting the outcome of a binary categorical dependent variable. Odds Ratio: For each category of a predictor, the odds of a positive survey relative to the base category are given by the value in the OddsRatio table; this is more interpretable than the logit coefficients from the previous slide. Since the base categories are < 3 Days and First Call Resolved, the odds of a positive CSAT survey are 10 times higher for tickets that were resolved in less than 3 days with a status reason of First Call Resolved than for tickets that took longer to resolve and have a different status reason. Probability: The probability of a success, or positive CSAT survey, varies because of the variation in the data, but the significant predictors retained in the final model, Ticket Age and Status Reason, give an idea of the likely outcome of the survey. Prob by TicketAgeClass (the same values are listed for each StatusReason2 category: First Call Resolved, Status Call and Others): < 3 Days: 100.00%, 91.58%, 84.26%; < 7 Days: 92.26%, 85.44%; < 15 Days: 85.77%, 74.79%; < 30 Days: 90.17%, 81.86%; 30+ Days: 37.50%.
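The conversion from logit coefficients to odds ratios and probabilities can be sketched as follows. The fitted coefficients are not included in this transcript, so the intercept and coefficient values below are hypothetical placeholders chosen only to illustrate the arithmetic; they are not the model's estimates.

```python
import math


def odds_ratio(coefficient):
    """Odds ratio of a category versus the base category."""
    return math.exp(coefficient)


def inv_logit(logit):
    """Logistic function: converts a log-odds value into a probability."""
    return 1 / (1 + math.exp(-logit))


# Hypothetical values for illustration only (not the fitted model).
intercept = 2.4                      # log-odds for the base categories
coefficients = {
    "TicketAge: < 7 Days": -0.3,     # relative to the base, < 3 Days
    "StatusReason: Status Call": -0.5,  # relative to First Call Resolved
}

# Base case: every dummy variable is 0, so only the intercept contributes.
print(f"P(positive | base categories) = {inv_logit(intercept):.4f}")

for name, coef in coefficients.items():
    # Probability when only this one dummy variable is switched on.
    prob = inv_logit(intercept + coef)
    print(f"{name}: odds ratio = {odds_ratio(coef):.3f}, P(positive) = {prob:.4f}")
```

A coefficient of 0, as for the base categories, gives an odds ratio of 1, and a logit of 0 corresponds to a probability of 50%.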