Ryan T Johnson, 10-11-20
Being Right Starts
By Knowing You're Wrong
Challenges on the path to using data effectively
The biggest challenges to implementing
effective analytics are cultural, not
technical.
Error
Some guy who quotes himself
“I’m willing to bet there is 75% agreement
with this statement in the audience
today.”
How wrong were we?
The least popular topic in a business
What Is Error?
Error:
The difference between what we expected and what we observed
Prediction Error
The difference between what we expected and what we observed
• The deep learning model predicted 10,000 clicks but we got
10,478

• The CNN predicted the directory contained 10 images of
hotdogs but it actually contained 0

• The recommender system estimated the user would give the
movie 5 stars but they gave it 1
Prediction Error?
The difference between what we expected and what we observed
• I told ordering we’d sell 1000 units of the new widget but we
only sold 670

• We expected employee satisfaction to decrease this quarter but
it went up

• The sales rep anticipated $500K in revenue from this account
but actually got $1.5M!

• There are 100 rows in the table that are missing the required ID
field
Error:
Variation from a desired result
Malcolm Baldrige National Quality Award 1988 Recipient Motorola Inc.
KEY QUALITY INITIATIVES
To accomplish its quality and total customer satisfaction goals, Motorola
concentrates on several key operational initiatives. At the top of the list is
"Six Sigma Quality," a statistical measure of variation from a desired
result.
- NIST.gov
Process Improvement
Process Improvement
Defects: Error by another name
Source
Error:
How much room is there for improvement?
How much better can we get at this thing we are doing?
Error:
An opportunity to learn
Ways to Measure Error
We expected to sell 1000 units
of the new widget but actually
sold 670
Expected - Actual = Error = 330 units
We expected $500K in revenue
from this account but saw
$1.5M!
Expected - Actual = -$1,000,000

\( \hat{y} - y \) = -$1,000,000
Error = -1,000,000!
We expected $500K in revenue
from this account but saw
$1.5M!
Absolute error = \( |\hat{y} - y| \) = $1,000,000
We expected $500K in revenue
from this account but saw
$1.5M!
Absolute percent error = \( \left| \frac{\hat{y} - y}{y} \right| \) = 0.67
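The three single-measurement calculations above can be sketched in a few lines of Python. The function names are illustrative assumptions, not something from the talk; the inputs are the widget and revenue examples from the slides.

```python
def signed_error(expected: float, actual: float) -> float:
    """Expected minus actual; the sign shows the direction of the miss."""
    return expected - actual


def absolute_error(expected: float, actual: float) -> float:
    """Size of the miss, ignoring direction."""
    return abs(expected - actual)


def absolute_percent_error(expected: float, actual: float) -> float:
    """Size of the miss relative to the actual value (actual must be nonzero)."""
    return abs(expected - actual) / abs(actual)


# Widget example: expected 1000 units, sold 670.
print(signed_error(1000, 670))

# Revenue example: expected $500K, saw $1.5M.
expected, actual = 500_000, 1_500_000
print(signed_error(expected, actual))
print(absolute_error(expected, actual))
print(round(absolute_percent_error(expected, actual), 2))
```

Dividing by the actual value is what lets errors on very different scales (units versus dollars) be compared side by side.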
We expected $500K in revenue
from this account but saw $1.5M!
Absolute percent error = 0.67
We expected to sell 1000 units but
sold 670
Absolute percent error = 0.49
Repeated Measures of Error
Central tendency and dispersion of the errors
[Figure: Account Revenue Error. Absolute percent error by month, April through September.]
Repeated Measures of Error
Central tendency and dispersion of the errors
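Mean Absolute Error = \( \frac{1}{n} \sum_{i=1}^{n} |\hat{y}_i - y_i| \)

Mean Absolute Percent Error = \( \frac{1}{n} \sum_{i=1}^{n} \left| \frac{\hat{y}_i - y_i}{y_i} \right| \)

Median Absolute Percent Error = \( \mathrm{median}(p_1, p_2, \ldots, p_n) \) where \( p_i = \left| \frac{\hat{y}_i - y_i}{y_i} \right| \)

Standard Deviation

Interquartile Range (IQR)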
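As a rough illustration of these summary measures, here is a minimal Python sketch using only the standard library. The function names and the sample numbers are hypothetical, and the IQR uses the default quartile method of `statistics.quantiles`, which is one of several conventions.

```python
import statistics


def absolute_percent_errors(expected, actual):
    """p_i = |(expected_i - actual_i) / actual_i| for each pair."""
    return [abs((e - a) / a) for e, a in zip(expected, actual)]


def summarize(expected, actual):
    """Central tendency and dispersion of the errors."""
    abs_errors = [abs(e - a) for e, a in zip(expected, actual)]
    apes = absolute_percent_errors(expected, actual)
    q1, _, q3 = statistics.quantiles(apes, n=4)  # quartile cut points
    return {
        "mae": statistics.mean(abs_errors),
        "mape": statistics.mean(apes),
        "mdape": statistics.median(apes),
        "stdev": statistics.stdev(apes),
        "iqr": q3 - q1,
    }


# Hypothetical estimates and outcomes, for illustration only.
expected = [110, 180, 400, 600]
actual = [100, 200, 400, 500]
summary = summarize(expected, actual)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

The mean and median describe a typical miss; the standard deviation and IQR describe how consistent the misses are.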
Repeated Measures of Error
Central tendency and spread
[Figure: Account Revenue Error. Absolute percent error by month, April through September.]
MAPE = 30.167%

IQR = 2.25%
Repeated Measures of Error
Central tendency and spread
[Figure: Account Revenue Estimates. US dollars ($0 to $20,000) by month, October through December.]
Repeated Measures of Error
Central tendency and spread
[Figure: Account Revenue Estimates. US dollars ($0 to $20,000) by month, October through December.]
30% error range applied to each estimate
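One simple way to read "30% error range applied to each estimate" is a band of plus or minus 30% around every new estimate. The sketch below assumes that reading; the function name and the monthly figures are hypothetical, not values from the chart.

```python
def error_range(estimate: float, typical_pct_error: float) -> tuple[float, float]:
    """Return (low, high) by applying a typical percent error to an estimate."""
    low = estimate * (1 - typical_pct_error)
    high = estimate * (1 + typical_pct_error)
    return low, high


# Hypothetical monthly revenue estimates in US dollars.
estimates = {"Oct": 12_000, "Nov": 15_000, "Dec": 18_000}

for month, estimate in estimates.items():
    low, high = error_range(estimate, 0.30)
    print(f"{month}: ${estimate:,.0f} (${low:,.0f} to ${high:,.0f})")
```

This is only an approximation, since the percent error above is measured relative to actuals rather than estimates, but it turns a past error rate into a range that can be shown next to each forecast.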
Repeated Measures of Error
Central tendency and spread
MdAPE = 28.4%

Range = 21.33 to 46.61
What are these products?
Why Measuring Error is
Important
Machine Learning
is useless without error
History of Machine Learning
Error was key
Source
Please consider donating to the Wikimedia Foundation

https://wikimediafoundation.org
History of Machine Learning
Error was key
Source
MAE = \( \frac{1}{n} \sum_{t=1}^{n} |y_t - x_t| \)
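To make the point concrete, here is a toy sketch of a one-parameter model that learns only because it measures its error. It is an illustrative assumption, not a reconstruction of any historical system: the model, data, learning rate, and function names are all made up for the example.

```python
def mean_absolute_error(w, xs, ys):
    """MAE of the model prediction w * x against the observed y."""
    return sum(abs(w * x - y) for x, y in zip(xs, ys)) / len(xs)


def sign(value):
    """Return -1, 0, or 1 depending on the sign of value."""
    return (value > 0) - (value < 0)


def fit(xs, ys, lr=0.125, steps=20):
    """Adjust w step by step in the direction that reduces MAE."""
    w = 0.0
    for _ in range(steps):
        # Subgradient of MAE with respect to w: the error's sign drives the update.
        grad = sum(sign(w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= lr * grad
    return w


# Toy data generated by y = 2 * x.
xs = [1, 2, 3, 2]
ys = [2, 4, 6, 4]

w = fit(xs, ys)
print("MAE before:", mean_absolute_error(0.0, xs, ys))
print("MAE after: ", mean_absolute_error(w, xs, ys))
```

Remove the error term from the update and `w` never moves, which is the sense in which learning depends on error.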
Learning
Error is key
Inputs Decisions
A popular read on this topic: Thinking Fast and Slow by Daniel Kahneman
Learning
Error is key
Inputs Decisions
Error
Actuals
Hindsight bias, Confirmation bias, Gambler’s fallacy, IKEA effect, Dunning-Kruger
effect, Loss aversion, Omission bias, Pareidolia, Zero-sum bias, Belief bias,
Backfire effect, Clustering illusion, Default effect, Contrast effect, Impact bias,
Optimism bias, Present bias, Projection bias, Recency illusion, Salience bias…
Learning
Error is key
[Diagram: many Inputs → Decisions loops, grouped under Marketing, Engineering, and Accounting]
Learning
Error is key
[Diagram: many Inputs → Decisions loops, grouped under Marketing, Engineering, Accounting, HR, IT, Sales, Operations, Leadership, and Product]
Learning
Error is key
[Diagram: Input dollars → Marketing, Analytics, Accounting, HR, IT, Sales, Operations, Leadership, Product, Engineering → Output]
Learning
Error is key
[Diagram: Input dollars → Marketing, Analytics, Accounting, HR, IT, Sales, Operations, Leadership, Product, Engineering → Output; Error: ?]
Your business is a
complex function
you are trying to
optimize. Without
tracking error you’re
just guessing about
how to do so.
Source for image
Applications of Error
Measurement and How To Talk
About Them
Opportunities for Error (Yay!) in Marketing
We want to undertake new brand awareness initiatives this
quarter.
What do we expect our brand awareness to be after these initiatives?
There are a lot of factors to consider so it is not clear.
It sounds like we don’t have as much information as we’d like to make an
estimate. That’s understandable. One of the most useful pieces of information
for making an estimate is how far off we were with our last estimate. So in
order to break this cycle I suggest we begin collecting this information. How
can we ensure the marketing department is comfortable making a really rough
initial estimate?
Example dialogue
Opportunities for Error (Yay!) in Marketing
Throughout the Marketing Funnel
• Impressions, leads, and other volume measures

• Click-thru, lead conversion or other rate measures 

• “What is the error rate on our lead scoring algorithm?”

• Cost per lead, return on marketing spend

• “Can the vendor provide an expected cost per lead?”

• “What change do we expect in marketing return after this new
initiative?”
Opportunities for Error (Yay!) in Sales
Our new sales script drove up trial-to-purchase rates.
It sure did! The analysis showed a 5% increase that doesn’t appear to be due
to random chance. We anticipated a 2% increase, giving us an absolute error of
3%.
Who cares about the error!? We increased purchase rates,
right?
I agree it’s a fantastic improvement. That effort should be applauded. We want
to note the error rate so next time we take on a similar project we will be better
able to anticipate the outcome and the downstream impacts. For example, our
product team is seeing an increase in shipping delays due to the increased
demand.
Example dialogue
Opportunities for Error (Yay!) in Sales
Throughout the Sales Process
• Opportunities, deals, trials, contracts and other volume
measures

• Trial to purchase, Opp conversion, or other rate measures 

• Average order size, average contract length, products per
order, days to close

• “By expanding the team do we think days to close will
decrease or is this necessary just to maintain?”
Opportunities for Error (Yay!) in HR
Our goals this quarter are to boost employee satisfaction as
well as reduce time to placement in recruiting.
Sounds great! In the last survey our employee satisfaction was at 75 and average time to
placement stands at 65 days. What do we estimate these will be once the
initiatives are complete?
Right now the goal is just to produce change in the right
direction. We’ll evaluate how things are going as we get
further into the quarter.
I think it’s great that we want to change these measures and I fully agree we
should continuously check in. To make sure our check-ins help us reach useful
conclusions it’s important to clearly lay out what we expect to happen.
Example dialogue
Opportunities for Error (Yay!) in HR
Our project this quarter focused on reducing “days to fill” error in our estimates for
engineering requisitions. We took the following steps beginning on July 3rd… I’ll turn it
over to our analyst to discuss the results.
For the past year we’ve had a mean error of 34 days when estimating “days to fill”
for these roles. After these new efforts we’ve seen a mean error of 12 days for the
10 engineers we’ve hired. Analysis thus far suggests this reduction is unlikely to be
due to chance. It’s a great improvement.
Based on this outcome, we’ve begun implementing similar changes to the hiring process for
all roles. With this improved ability to estimate we also want to revisit our hiring plan for the
next quarter. It appears some of the open headcount will come much too late to help with
busy season.
Example dialogue
Final Summary
Error is the difference between
our expectations and
observations
Always approach with a sincere desire to improve the company
Error is simple to measure
Get forward momentum by avoiding complicated measures for now
Error is a huge opportunity to
improve… that humans aren’t
great at using
Make it easier by carefully tracking error across the company
Error tracking is a tool for
analysis.
Analytics itself is not a tool.
Analytics is a way of thinking.
T.S. Eliot
We shall not cease from exploration
And the end of all our exploration
Will be to arrive where we started
And know the place for the first time.
GoGuardian Science and Analytics
Thanks to these incredible explorers
Bianca Jacobs, Stephanie Dang, Mike Frantz,
Greg Johnson, Kevin Wecht, Yola Katsargyri,
Tony Woods, Harrison Mamin, Nicole Jeong,
Rosie Abe, Manoj Rawat
Opportunities for Error Everywhere!
Names may vary
Engineering Product/Design Accounting/Finance IT/Support
Expected # of bugs/defects; Expected NPS improvements; Average debtor days; Submitted ticket volume
Expected days to complete; Expected product demand or daily active users; # of current accounts receivable; Ticket completion time
Uptime expectations; Anticipated inventory turnover; Accounts payable process cost
Costs of goods sold estimates; Time to first draft; Budget variance
Rework time; Payment error rate
A good exercise might be taking this table and seeing how many of the proposed measures
are actually tracked at your company. Then ask how many of them have error metrics that are
also tracked?
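A minimal sketch of what tracking error metrics alongside the measures themselves could look like: a shared log of estimates and outcomes, summarized per department and metric. The `Estimate` record, the function name, and every number here are hypothetical, chosen only to illustrate the idea.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Estimate:
    department: str
    metric: str
    expected: float
    actual: float


def mape_by_metric(records):
    """Mean absolute percent error for each (department, metric) pair."""
    grouped = defaultdict(list)
    for r in records:
        ape = abs((r.expected - r.actual) / r.actual)
        grouped[(r.department, r.metric)].append(ape)
    return {key: sum(apes) / len(apes) for key, apes in grouped.items()}


# Hypothetical log entries, for illustration only.
log = [
    Estimate("HR", "days to fill", expected=60, actual=50),
    Estimate("HR", "days to fill", expected=45, actual=50),
    Estimate("IT/Support", "ticket volume", expected=90, actual=100),
]

result = mape_by_metric(log)
for (department, metric), mape in result.items():
    print(f"{department} / {metric}: MAPE = {mape:.0%}")
```

Anything that records an expectation before the outcome is known would serve the same purpose, including a spreadsheet.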
