When in doubt, go live
Techniques for decision making
based on real user behavior
© 2020 ThoughtWorks
Irene Torres
Klaus Fleerkötter
You save time and make better decisions
by establishing shorter feedback loops
from feature idea to feature usage.
Who’s talking?
Irene Torres
Developer @ TW
PhD Neuroscience
Science perspective
Klaus Fleerkötter
Developer @ TW
Information Systems
Techie perspective
What is this talk about?
Specific use cases
that worked for us
Tech & Research
And what is it not...
Extensive coverage of
user research
Software testing
One of Germany’s
biggest online retailers
Top 5 highest traffic
e-commerce sites
(Germany)
Orders: <= 10 per second
Qualified visits:
Ø 1.6 million / day
Examples
Establishing Feedback Loops
[Diagram: PO, Team, Stakeholders, Users]
[Diagram, final build: Delivery Pipeline, Feature Toggle, Shadow Traffic, Lab Test, Focus Group Survey, Visual Report, A/B Test]
Prerequisites
An iterative and incremental development process
Services that can be built independently by cross-functional teams that are structured around business domains
Team roles: Dev, PO, QA, Ops, UX, DA
The Delivery Pipeline
Build, Test, Deploy
Gain situational awareness
Knowing that you went live and nothing’s on fire
[Diagram: Delivery Pipeline; prerequisites: Iterative and Incremental development, Independent Teams]
Feature Toggles
Decouple go-live from deployment
© CC BY 2.0 "Switch" Jon_Callow_Images
if (toggleIsOn) {
    executeNewBehavior();
} else {
    executeOldBehavior();
}
Feature Toggles
Flip for experimentation
© CC BY-ND 2.0 "Off?" Nicholas Liby
Without Recompile?
Without Restart?
Per Request?
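The three questions above can be made concrete with a minimal sketch. Everything named here is an assumption for illustration and not taken from the talk: the `ToggleStore` class, the `new-teaser` toggle name and the `X-Feature-Toggles` header are hypothetical. The idea is that toggle state lives in data rather than in code, so it can be flipped without a recompile or restart, and a single request can override it.

```python
from typing import Dict, Optional


class ToggleStore:
    """In-memory stand-in for a toggle backend (config service, database, ...)."""

    def __init__(self) -> None:
        self._toggles: Dict[str, bool] = {}

    def set(self, name: str, enabled: bool) -> None:
        # Flipping a toggle is a data change: no recompile, no restart.
        self._toggles[name] = enabled

    def is_on(self, name: str, overrides: Optional[Dict[str, bool]] = None) -> bool:
        # A per-request override wins over the stored default.
        if overrides and name in overrides:
            return overrides[name]
        return self._toggles.get(name, False)


def parse_overrides(header_value: str) -> Dict[str, bool]:
    """Parse a hypothetical header value such as 'new-teaser=on,other=off'."""
    overrides: Dict[str, bool] = {}
    for part in header_value.split(","):
        name, separator, state = part.strip().partition("=")
        if separator and name:
            overrides[name] = state.strip().lower() == "on"
    return overrides


def handle_request(store: ToggleStore, headers: Dict[str, str]) -> str:
    """Pick old or new behavior for one request."""
    overrides = parse_overrides(headers.get("X-Feature-Toggles", ""))
    if store.is_on("new-teaser", overrides):
        return "new behavior"
    return "old behavior"
```

In this sketch, `store.set("new-teaser", True)` switches every later request to the new behavior, while a request carrying `X-Feature-Toggles: new-teaser=on` sees it only for that one call.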
While developing, go live
Shadow Traffic
Not just for testing
[Diagram: User, Old Behavior, New Behavior, Team. Run both; the User sees no difference.]
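A minimal sketch of the "run both" idea, under stated assumptions: the function names and the in-memory `report` list are hypothetical, and the new behavior runs synchronously here for simplicity (a real setup would probably run it asynchronously or on mirrored traffic). The user always gets the old result; the new result is only recorded for the team.

```python
from typing import Any, Callable, Dict, List


def handle_with_shadow(
    request: Any,
    old_behavior: Callable[[Any], Any],
    new_behavior: Callable[[Any], Any],
    report: List[Dict[str, Any]],
) -> Any:
    """Serve the old behavior and record what the new behavior would have done."""
    old_result = old_behavior(request)
    try:
        new_result = new_behavior(request)
        report.append({
            "request": request,
            "old": old_result,
            "new": new_result,
            "match": old_result == new_result,
        })
    except Exception as error:
        # A failure in the shadowed code must never reach the user.
        report.append({"request": request, "old": old_result, "error": repr(error)})
    return old_result
```

The `report` entries are what the team looks at: how often the results differ, and whether the new code path fails on real requests.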
Shadow Traffic
Get early feedback
60% 40%
Min 3 items?
Mostly fashion?
Not sold out?
Max 1 of each kind?
Maximize!
Visual Report
Quality of a feature
Assess that the MVP has the correct business rules
● Visual report (e.g. html page)
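As an illustration of such a visual report, here is a small sketch that renders a side-by-side HTML table. The data layout (category mapped to "manual" and "auto" picks) and the function name are assumptions made for this example, not the format used in the talk.

```python
from html import escape
from typing import Dict, List


def render_report(selections: Dict[str, Dict[str, List[str]]]) -> str:
    """Render one table row per category, comparing manual and automatic picks."""
    rows = []
    for category, picks in selections.items():
        manual = ", ".join(picks.get("manual", []))
        auto = ", ".join(picks.get("auto", []))
        rows.append(
            f"<tr><td>{escape(category)}</td>"
            f"<td>{escape(manual)}</td><td>{escape(auto)}</td></tr>"
        )
    header = "<tr><th>Category</th><th>manual</th><th>auto</th></tr>"
    return "<html><body><table>" + header + "".join(rows) + "</table></body></html>"
```

A page like this lets the PO or stakeholders check by eye whether the automatic selection follows the intended business rules.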
[Figure: manual vs. auto selection for Beach pants, Leather bags, Jackets]
Go live without flying blind
A/B Testing
“You want your data to inform, to guide, to improve your business model, to help
you decide on a course of action.” Lean Analytics
Focus on understanding the underlying statistics that drive the calculation of a sample size.
A/B testing ≡ a set of statistical tests that evaluate two independent groups, a control group and a test group
“Independent groups” -> between-subjects design
groups = variants
Control [A]
Test [B]
A/B testing mostly uses statistical hypothesis testing to estimate how likely it is that a change in your website is meaningful.
Null hypothesis (H0): the default state of the world. There is no effect, no difference when you apply changes.
H0: Our <KPIs> remained the “same” in the control group and in the test group
Alternative hypothesis (H1): the changes in the test group had a real effect.
H1: Our users are actively engaged in clicking the button and therefore our A2B increases by 5% in relative terms
[Screenshot of the sample size calculator, annotated:]
Source: https://abtestguide.com/abtestsize/
- Metrics we know
- Effect size: we decide from previous data or knowledge about this variable
- One-sided or two-sided: dependent on the variable and what we are looking for (normally two-sided)
- Accuracy: we can play, but mostly by convention and dependent on traffic
Effect size
The magnitude of the effect, how important the difference is
Improvement that is meaningful for your business
Relative improvement (%) = (Test conversion rate - Control conversion rate) / Control conversion rate × 100
Test conversion rate = 15% × 2% + 2% = 2.3% (± 0.3%)
One-sided or two-sided?
[Figure: Control and Test, mean control, mean test, difference in means]
Is the difference significant enough to reject the null hypothesis?
H0 : 𝝻t = 𝝻c
𝝻t : mean test
𝝻c : mean control
H1 : 𝝻t > 𝝻c (one-sided, directional)
H1 : 𝝻t ≠ 𝝻c (two-sided)
Two-sided tends to be the best option
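To see how the choice between one-sided and two-sided changes the result, here is a sketch of a two-proportion z-test on conversion counts. The z-test is an assumption chosen for illustration; the talk does not name a specific test, and the function name and numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_c: int, n_c: int, conv_t: int, n_t: int):
    """Return (z, one-sided p-value for H1: mu_t > mu_c, two-sided p-value)."""
    rate_c = conv_c / n_c
    rate_t = conv_t / n_t
    # Under H0 both groups share one conversion rate, so pool the data.
    pooled = (conv_c + conv_t) / (n_c + n_t)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n_c + 1 / n_t))
    z = (rate_t - rate_c) / standard_error
    std_normal = NormalDist()
    p_one_sided = 1 - std_normal.cdf(z)
    p_two_sided = 2 * (1 - std_normal.cdf(abs(z)))
    return z, p_one_sided, p_two_sided


# Hypothetical counts: 800 of 40,000 control visits convert, 920 of 40,000 test visits.
z, p_one, p_two = two_proportion_z_test(800, 40_000, 920, 40_000)
```

When the test group is ahead, the two-sided p-value is twice the one-sided one, so the two-sided test is the stricter choice and also catches changes that make things worse.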
Power, significance level & confidence level
Power of a test: the probability of finding an effect when it is really there. It is the complement of the type II error rate (false negatives): power = 1-𝛃
Typical value is 80% (a convention)
Higher power: lower chance to miss a true effect, larger sample size
Source: https://towardsdatascience.com/a-guide-for-selecting-an-appropriate-metric-for-your-a-b-test-9068cccb7fb
Source: https://www.youtube.com/watch?v=CSBCKVQLf8c
Real world: effect present, our study: effect present -> Reject H0 (power, 1-𝛃)
Real world: effect present, our study: effect absent -> Type II error (miss, 𝛃 risk)
Real world: effect absent, our study: effect present -> Type I error (false alarm)
Real world: effect absent, our study: effect absent -> Do not reject H0
Type II error: the probability of missing an effect that is really there (of not detecting it)
Type II error: miss -> probability less than 20% (𝛃)
Power is 1-𝛃 -> 80%
Significance level (𝛂): the probability of detecting an effect that is really not there
Typical value is 5% (a convention), which corresponds to a 95% confidence level
Source: https://towardsdatascience.com/a-guide-for-selecting-an-appropriate-metric-for-your-a-b-test-9068cccb7fb
Type I error: false alarm -> probability less than 5% (𝛂). Confidence level is 1-𝛂: 95%
The significance level 𝛂 is the threshold for the p-value: a result counts as significant when p-value < 𝛂
Real world: effect absent, our study: effect present -> Type I error (false alarm, 𝛂 risk)
Source: https://www.youtube.com/watch?v=CSBCKVQLf8c
Confidence level: the complement of the significance level (1-𝛂). The probability that the value of a parameter falls within a specified range of values
Typical value is 95% (a convention)
Lower significance level (𝛂): higher confidence level, larger sample size
(The significance level 𝛂 is the accepted probability of finding an effect by chance alone when there is none; a result is significant when p-value < 𝛂)
Significance level ~ 0.05 (5%)
P-value < 0.05
Source: https://towardsdatascience.com/a-guide-for-selecting-an-appropriate-metric-for-your-a-b-test-9068cccb7fb
Source: https://abtestguide.com/abtestsize/
Effect size: meaningful for your business
Power and confidence level influence your sample size and the probability of finding a true effect
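How effect size, power and significance level combine into a sample size can be sketched with the textbook normal-approximation formula for comparing two proportions. This is an assumption for illustration: the calculator cited above may use a different formula, so its numbers can differ from this sketch. The function name and defaults are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_variant(
    control_rate: float,
    relative_uplift: float,
    alpha: float = 0.05,
    power: float = 0.80,
    two_sided: bool = True,
) -> int:
    """Visitors needed in each group to detect the uplift (normal approximation)."""
    test_rate = control_rate * (1 + relative_uplift)
    delta = test_rate - control_rate
    std_normal = NormalDist()
    # A two-sided test splits alpha across both tails.
    tail = alpha / 2 if two_sided else alpha
    z_alpha = std_normal.inv_cdf(1 - tail)
    z_power = std_normal.inv_cdf(power)
    pooled = (control_rate + test_rate) / 2
    spread_h0 = sqrt(2 * pooled * (1 - pooled))
    spread_h1 = sqrt(control_rate * (1 - control_rate) + test_rate * (1 - test_rate))
    return ceil(((z_alpha * spread_h0 + z_power * spread_h1) / delta) ** 2)


# Slide example: 2% control conversion rate, 15% relative uplift.
visitors_needed = sample_size_per_variant(0.02, 0.15)
```

Under these assumptions the example needs tens of thousands of visitors per variant. A smaller effect size, higher power or lower significance level each push the number up, which is why low-traffic sites need KPIs with large expected effects.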
Important points
High traffic:
- Choose KPIs wisely, low effect size (e.g. +0.5%)
- Accuracy, minimise risk
- Preferably A/B but also MVT
Low traffic:
- Choose KPIs with high increase (large effect size, e.g. +5%)
- A/B
- Run qualitative tests
Never stop an experiment early, even if you “find” significant results (danger: the false positive rate rises!)
Source: https://www.evanmiller.org/how-not-to-run-an-ab-test.html
https://vwo.com/blog/ab-split-testing-low-traffic-sites/
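The warning about stopping early can be illustrated with a small simulation of A/A experiments, where both groups share the same true conversion rate, so every "significant" result is a false positive. This is a sketch under stated assumptions: the two-proportion z-test, the parameter values and all names are chosen for illustration only.

```python
import random
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def p_value(conv_a: int, conv_b: int, n: int) -> float:
    """Two-sided p-value of a two-proportion z-test with n visitors per group."""
    pooled = (conv_a + conv_b) / (2 * n)
    if pooled <= 0.0 or pooled >= 1.0:
        return 1.0  # no variation yet, nothing to test
    standard_error = sqrt(pooled * (1 - pooled) * 2 / n)
    z = (conv_b - conv_a) / n / standard_error
    return 2 * (1 - STD_NORMAL.cdf(abs(z)))


def false_positive_rates(experiments=300, visitors=1000, peeks=10,
                         rate=0.1, alpha=0.05, seed=1):
    """Return (rate when stopping at any significant peek, rate when testing once)."""
    rng = random.Random(seed)
    step = visitors // peeks
    peeking_hits = 0
    fixed_hits = 0
    for _ in range(experiments):
        conv_a = conv_b = 0
        flagged_at_any_peek = False
        for n in range(1, visitors + 1):
            conv_a += rng.random() < rate
            conv_b += rng.random() < rate
            if n % step == 0 and p_value(conv_a, conv_b, n) < alpha:
                flagged_at_any_peek = True
        peeking_hits += flagged_at_any_peek
        fixed_hits += p_value(conv_a, conv_b, visitors) < alpha
    return peeking_hits / experiments, fixed_hits / experiments
```

Because the last peek coincides with the planned end of the experiment, the peeking rate can never be lower than the fixed-horizon rate; in practice it tends to be noticeably higher, which is the effect described in the Evan Miller article cited above.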
Before development
Focus Group Survey
What is it
A study using inferential statistics to verify a hypothesis.
When
As part of the discovery of a feature, during development
Why
Short feedback loops
Data-driven decisions
Caution! You need experience designing and analysing statistical tests.
The shopteaser survey
Likert Scale: Strongly disagree, Disagree, Neutral, Agree, Strongly agree
[categorical variable that can be transformed to continuous - scale 1-5]
Your research question will drive the design of the experiment and also the analysis of your data
Methodology examples:
- Gave 5s per trial so the answers would be spontaneous
- The first trials were discarded
Things that could go wrong:
- Familiarity bias
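The mapping from Likert answers to a 1-5 scale, together with discarding the first trials, can be sketched as follows. The function name, the number of warm-up trials and the sample answers are assumptions made for this example.

```python
from statistics import mean
from typing import List

LIKERT_SCORES = {
    "Strongly disagree": 1,
    "Disagree": 2,
    "Neutral": 3,
    "Agree": 4,
    "Strongly agree": 5,
}


def score_trials(responses: List[str], warmup_trials: int = 1) -> List[int]:
    """Drop the first trials of a participant and map the rest to the 1-5 scale."""
    kept = responses[warmup_trials:]
    return [LIKERT_SCORES[answer] for answer in kept]


# Hypothetical answers of one participant; the first trial counts as warm-up.
scores = score_trials(["Agree", "Neutral", "Strongly agree"], warmup_trials=1)
average = mean(scores)
```

Whether averaging Likert scores is appropriate depends on the research question; treating the scale as numeric is a modelling choice and not a given.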
During the design phase we also took into account:
● Collect demographic data: you can never have too much data
● Collect feedback at the end of the survey: did they understand the task, did something go wrong?
● Give clear instructions: if you are not there, they cannot ask and will “assume”
Insights from a focus group
[Figure labels: selected, manual]
Lab test
UX Lab tests
UX designers test the design and usability of a feature on a test group.
● Small group of people in person (~5-10 people)
● Web-based remote testing of users
● Qualitative questions
○ e.g. did you like it? Was it easy to find?
Wrapping up
Techniques for faster and better decisions
Delivery Pipeline, Feature Toggle, Shadow Traffic, Lab Test, Focus Group Survey, Visual Report, A/B Test
Prerequisites: Iterative and Incremental development, Independent Team
When is your next release? Could it be earlier?
Do you have a solid hypothesis and measurable KPIs for it?
Which measurements could you be using instead of
assuming the user’s preference?
Which of your meetings in the next 2 weeks could be
replaced by a lean experiment?
Thank you
Irene Torres
Klaus Fleerkötter
Questions?
#talk5-when-in-doubt-go-live

Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...MyIntelliSource, Inc.
 
Implementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureImplementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureDinusha Kumarasiri
 
Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Andreas Granig
 
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfThe Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfkalichargn70th171
 
Engage Usergroup 2024 - The Good The Bad_The Ugly
Engage Usergroup 2024 - The Good The Bad_The UglyEngage Usergroup 2024 - The Good The Bad_The Ugly
Engage Usergroup 2024 - The Good The Bad_The UglyFrank van der Linden
 
buds n tech IT solutions
buds n  tech IT                solutionsbuds n  tech IT                solutions
buds n tech IT solutionsmonugehlot87
 
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样umasea
 
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataBradBedford3
 
XpertSolvers: Your Partner in Building Innovative Software Solutions
XpertSolvers: Your Partner in Building Innovative Software SolutionsXpertSolvers: Your Partner in Building Innovative Software Solutions
XpertSolvers: Your Partner in Building Innovative Software SolutionsMehedi Hasan Shohan
 
Asset Management Software - Infographic
Asset Management Software - InfographicAsset Management Software - Infographic
Asset Management Software - InfographicHr365.us smith
 
What is Binary Language? Computer Number Systems
What is Binary Language?  Computer Number SystemsWhat is Binary Language?  Computer Number Systems
What is Binary Language? Computer Number SystemsJheuzeDellosa
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...MyIntelliSource, Inc.
 
The Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdfThe Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdfPower Karaoke
 
What are the features of Vehicle Tracking System?
What are the features of Vehicle Tracking System?What are the features of Vehicle Tracking System?
What are the features of Vehicle Tracking System?Watsoo Telematics
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...kellynguyen01
 
chapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptchapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptkotipi9215
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackVICTOR MAESTRE RAMIREZ
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideChristina Lin
 

Recently uploaded (20)

Project Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationProject Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanation
 
Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
 
Implementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureImplementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with Azure
 
Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024
 
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfThe Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
 
Engage Usergroup 2024 - The Good The Bad_The Ugly
Engage Usergroup 2024 - The Good The Bad_The UglyEngage Usergroup 2024 - The Good The Bad_The Ugly
Engage Usergroup 2024 - The Good The Bad_The Ugly
 
buds n tech IT solutions
buds n  tech IT                solutionsbuds n  tech IT                solutions
buds n tech IT solutions
 
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
 
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
 
XpertSolvers: Your Partner in Building Innovative Software Solutions
XpertSolvers: Your Partner in Building Innovative Software SolutionsXpertSolvers: Your Partner in Building Innovative Software Solutions
XpertSolvers: Your Partner in Building Innovative Software Solutions
 
Asset Management Software - Infographic
Asset Management Software - InfographicAsset Management Software - Infographic
Asset Management Software - Infographic
 
What is Binary Language? Computer Number Systems
What is Binary Language?  Computer Number SystemsWhat is Binary Language?  Computer Number Systems
What is Binary Language? Computer Number Systems
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
 
The Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdfThe Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdf
 
What are the features of Vehicle Tracking System?
What are the features of Vehicle Tracking System?What are the features of Vehicle Tracking System?
What are the features of Vehicle Tracking System?
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
 
chapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptchapter--4-software-project-planning.ppt
chapter--4-software-project-planning.ppt
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStack
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
 

When in doubt, go live

  • 1. When in doubt, go live Techniques for decision making based on real user behavior © 2020 ThoughtWorks Irene Torres Klaus Fleerkötter
  • 2. You save time and make better decisions by establishing shorter feedback loops from feature idea to feature usage. © 2020 ThoughtWorks
  • 3. Who’s talking? Irene Torres: Developer @ TW, PhD Neuroscience, science perspective. Klaus Fleerkötter: Developer @ TW, Information Systems, techie perspective. © 2020 ThoughtWorks
  • 4. What is this talk about? Specific use cases that worked for us; Tech & Research. And what is it not... Extensive coverage of user research; software testing. © 2020 ThoughtWorks
  • 5. One of Germany’s biggest online retailers Top 5 highest traffic e-commerce sites (Germany) Orders: <= 10 per second Qualified visits: Ø 1.6 million / day Examples © 2020 ThoughtWorks
  • 10. PO An Iterative and Incremental development process © 2020 ThoughtWorks
  • 11. Services that can be built independently by cross-functional teams that are structured around business domains © 2020 ThoughtWorks Dev PO QA Ops UX DA
  • 12. The Delivery Pipeline © 2020 ThoughtWorks Delivery Pipeline Iterative and Incremental development Independent Teams
  • 13. The Delivery Pipeline © 2020 ThoughtWorks Build Test Deploy
  • 14. Gain situational awareness Knowing that you went live and nothing’s on fire © 2020 ThoughtWorks
  • 15. Feature Toggles © 2020 ThoughtWorks Delivery Pipeline Feature Toggle Iterative and Incremental development Independent Teams
  • 16. Feature Toggles Decouple go-live from deployment © 2020 ThoughtWorks © CC BY 2.0 "Switch" Jon_Callow_Images if (toggleIsOn) then { executeNewBehavior() } else { executeOldBehavior() }
  • 17. Feature Toggles Flip for experimentation © CC BY-ND 2.0 "Off?" Nicholas Liby Without Recompile? Without Restart? Per Request? © 2020 ThoughtWorks
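The questions on this slide (without recompile, without restart, per request) can be illustrated with a minimal sketch. It assumes an in-memory toggle store; the names `ToggleStore`, `handle` and the toggle key `"new-behavior"` are hypothetical and not taken from the talk. A production setup would more likely read toggle state from a database or configuration service.

```python
from dataclasses import dataclass, field


@dataclass
class ToggleStore:
    """Hypothetical in-memory toggle state (assumption: a real system would persist this)."""
    enabled: dict = field(default_factory=dict)

    def is_on(self, name: str) -> bool:
        # Unknown toggles default to off, so the old behavior is the fallback.
        return self.enabled.get(name, False)

    def flip(self, name: str, value: bool) -> None:
        self.enabled[name] = value


def execute_old_behavior(request: dict) -> str:
    return f"old:{request['path']}"


def execute_new_behavior(request: dict) -> str:
    return f"new:{request['path']}"


def handle(request: dict, toggles: ToggleStore) -> str:
    # The toggle is evaluated on every request, so a flip takes effect
    # without recompiling or restarting the service.
    if toggles.is_on("new-behavior"):
        return execute_new_behavior(request)
    return execute_old_behavior(request)


toggles = ToggleStore()
before = handle({"path": "/basket"}, toggles)  # toggle is off
toggles.flip("new-behavior", True)             # flipped at runtime
after = handle({"path": "/basket"}, toggles)   # same code path, new behavior
```

Because the check happens inside `handle`, the same store could also decide per request, for example based on a user attribute.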
  • 19. Shadow Traffic © 2020 ThoughtWorks Delivery Pipeline Feature Toggle Shadow Traffic Iterative and Incremental development Independent Teams
  • 20. Shadow Traffic Not just for testing © 2020 ThoughtWorks User Old Behavior New Behavior sees no difference Run both Team
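The "run both, user sees no difference" idea can be sketched as a wrapper around two handlers. This is an illustration under assumptions, not the team's implementation: `with_shadow`, `old_behavior` and `new_behavior` are hypothetical names, and the shadow call runs synchronously here for brevity, whereas a real service would probably run it asynchronously to avoid adding latency.

```python
import logging
from typing import Callable

logger = logging.getLogger("shadow")


def with_shadow(old: Callable[[dict], object],
                new: Callable[[dict], object]) -> Callable[[dict], object]:
    """Serve the old behavior, run the new one on the same request, log the comparison."""
    def handler(request: dict) -> object:
        old_result = old(request)
        try:
            new_result = new(request)
            # The team inspects these logs; the user never sees the new result.
            logger.info("shadow match=%s request=%s", new_result == old_result, request)
        except Exception:
            # A failure in the new behavior must not affect the user.
            logger.exception("shadow call failed for request=%s", request)
        return old_result
    return handler


def old_behavior(request: dict) -> list:
    return request["items"][:3]


def new_behavior(request: dict) -> list:
    return sorted(request["items"])[:3]


handler = with_shadow(old_behavior, new_behavior)
response = handler({"items": ["shirt", "bag", "jacket", "pants"]})
```

The response is always the old result, so differences between the two behaviors only show up in the logs.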
  • 21. Shadow Traffic Get early feedback 60% 40% Min 3 items? Mostly fashion? Not sold out? Max 1 of each kind? Maximize! © 2020 ThoughtWorks
  • 22. Visual Report © 2020 ThoughtWorks Delivery Pipeline Feature Toggle Shadow Traffic Visual Report Iterative and Incremental development Independent Teams
  • 23. Visual Report Quality of a feature © 2020 ThoughtWorks
  • 24. Visual Report Quality of a feature © 2020 ThoughtWorks
  • 25. Visual Report Quality of a feature © 2020 ThoughtWorks
  • 26. Visual Report Quality of a feature © 2020 ThoughtWorks Assess that the MVP has the correct business rules ● Visual report (e.g. HTML page) [Figure labels: Beach pants, Leather bags, Jackets; manual, auto]
  • 27. Go live without flying blind © 2020 ThoughtWorks
  • 28. A/B Testing © 2020 ThoughtWorks Delivery Pipeline Feature Toggle Shadow Traffic A/B Test Visual Report Iterative and Incremental development Independent Teams
  • 29. A/B testing © 2020 ThoughtWorks “You want your data to inform, to guide, to improve your business model, to help you decide on a course of action.” Lean Analytics
  • 30. A/B testing © 2020 ThoughtWorks “You want your data to inform, to guide, to improve your business model, to help you decide on a course of action.” Lean Analytics STATS: Focus on understanding the underlying statistics that drive the calculation of a sample size.
  • 31. A/B testing © 2020 ThoughtWorks “You want your data to inform, to guide, to improve your business model, to help you decide on a course of action.” Lean Analytics STATS: Focus on understanding the underlying statistics that drive the calculation of a sample size. A/B testing ≡ a set of statistical tests that evaluate two independent groups (groups = variants), a control and a test group. “Independent groups” -> between-subjects design
  • 32. A/B testing © 2020 ThoughtWorks Control [A] Test [B]
  • 33. A/B testing A/B testing mostly uses statistical hypothesis testing to calculate the likelihood of a change in your website being meaningful. Null hypothesis (H0): The state of the world. There is no effect, no difference when you apply changes. H0: Our <KPIs> remained the “same” in the control group and in the test group Alternative hypothesis (H1): the changes in the test group had a real effect. H1: Our users are actively engaged in clicking the button and therefore our A2B is relatively increased by 5% © 2020 ThoughtWorks
  • 34. A/B testing © 2020 ThoughtWorks Alternative hypothesis (H1): the changes in the test group had a real effect. H1: Our users are actively engaged in clicking the button and therefore our A2B is relatively increased by 5%
  • 35. A/B testing © 2020 ThoughtWorks Source: https://abtestguide.com/abtestsize/
  • 36. A/B testing © 2020 ThoughtWorks Source: https://abtestguide.com/abtestsize/ Metrics we know
  • 37. A/B testing © 2020 ThoughtWorks Source: https://abtestguide.com/abtestsize/ Metrics we know We decide from previous data or knowledge about this variable [effect size]
  • 38. A/B testing © 2020 ThoughtWorks Source: https://abtestguide.com/abtestsize/ Metrics we know We decide from previous data or knowledge about this variable [effect size] Dependent on the variable and what we are looking for [normally two-sided]
  • 39. A/B testing © 2020 ThoughtWorks Source: https://abtestguide.com/abtestsize/ Metrics we know We decide from previous data or knowledge about this variable [effect size] We can play but mostly by convention and dependent on traffic [accuracy] Dependent on the variable and what we are looking for [normally two-sided]
  • 40. A/B testing © 2020 ThoughtWorks Source: https://abtestguide.com/abtestsize/ Effect size The magnitude of the effect, how important the difference is
  • 41. A/B testing © 2020 ThoughtWorks Source: https://abtestguide.com/abtestsize/ Effect size: the magnitude of the effect, how important the difference is. An improvement that is meaningful for your business. Relative improvement = (Test conversion rate − Control conversion rate) / Control conversion rate × 100. Example: Test conversion rate = (15 / 100) × 2 + 2 = 2.3% (± 0.3%)
  • 42. A/B testing © 2020 ThoughtWorks Source: https://abtestguide.com/abtestsize/ One-sided or two-sided? Is the difference in means between test and control significant enough to reject the null hypothesis? H0 : 𝝻t = 𝝻c (𝝻t : mean test, 𝝻c : mean control)
  • 43. A/B testing © 2020 ThoughtWorks Source: https://abtestguide.com/abtestsize/ One-sided or two-sided? H1 : 𝝻t > 𝝻c (one-sided) directional H1 : 𝝻t ≠ 𝝻c (two-sided) Two-sided tends to be the best option 𝝻t : mean test 𝝻c : mean control
  • 44. A/B testing © 2020 ThoughtWorks Power, significance level & confidence level
  • 45. A/B testing © 2020 ThoughtWorks Power, significance level & confidence level. Power of a test: the probability of finding an effect when it is really there. It is the complement of the type II error rate (false negatives): power = 1-𝛃. Typical value is 80% (a convention). [Figure labels: Power, Chance to miss a true effect, Sample size] Source: https://towardsdatascience.com/a-guide-for-selecting-an-appropriate-metric-for-your-a-b-test-9068cccb7fb
  • 46. A/B testing © 2020 ThoughtWorks Source: https://www.youtube.com/watch?v=CSBCKVQLf8c Real world: effect present; our study: effect present -> Reject H0; our study: effect absent -> Type II error (miss). Real world: effect absent; our study: effect present -> Type I error (false alarm); our study: effect absent -> Reject H1. Type II error: the probability of missing an effect that is really there (the odds of not detecting it)
  • 47. A/B testing © 2020 ThoughtWorks Source: https://www.youtube.com/watch?v=CSBCKVQLf8c Real world: effect present; our study: effect present -> Reject H0 (power 1-𝛃); our study: effect absent -> Type II error (miss) (𝛃 risk). Real world: effect absent; our study: effect present -> Type I error (false alarm); our study: effect absent -> Reject H1. Type II error: miss -> probability less than 20% (𝛃). Power is 1-𝛃 -> 80%. [Figure labels: Power, Chance to miss a true effect, Sample size]
  • 48. A/B testing © 2020 ThoughtWorks Power, significance level & confidence level. Significance level (𝛂): the probability of detecting an effect that is really not there. Typical value is 5%, which corresponds to a 95% confidence level (a convention). Source: https://towardsdatascience.com/a-guide-for-selecting-an-appropriate-metric-for-your-a-b-test-9068cccb7fb
  • 49. A/B testing © 2020 ThoughtWorks Source: https://www.youtube.com/watch?v=CSBCKVQLf8c Real world: effect present; our study: effect present -> Reject H0; our study: effect absent -> Type II error (miss). Real world: effect absent; our study: effect present -> Type I error (false alarm) (𝛂 risk); our study: effect absent -> Reject H1. Type I error: false alarm -> probability less than 5% (𝛂). Confidence level is 1-𝛂: 95%. Significance level 𝛂 is related to the p-value: a result counts as significant when p-value < 𝛂
  • 50. A/B testing © 2020 ThoughtWorks Power, significance level & confidence level. Confidence level: the complement of the significance level (1-𝛂). The probability that the value of a parameter falls within a specified range of values. Typical value is 95% (a convention). The significance level 𝛂 tells you about the probability that the effect you found was just chance; a result counts as significant when p-value < 𝛂. Significance level ~ 0.05 (5%), p-value < 0.05. [Figure labels: Significance level (𝛂), Confidence level, Sample size] Source: https://towardsdatascience.com/a-guide-for-selecting-an-appropriate-metric-for-your-a-b-test-9068cccb7fb
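To make the relation between the p-value and 𝛂 concrete, here is a hedged sketch of a two-proportion z-test with a pooled standard error. The talk does not state which test the team used, so treat the choice of test, the function name `two_proportion_p_value` and the example counts as assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_p_value(conv_control: int, n_control: int,
                           conv_test: int, n_test: int,
                           two_sided: bool = True) -> float:
    """p-value of a z-test comparing the conversion rates of control and test."""
    rate_control = conv_control / n_control
    rate_test = conv_test / n_test
    # Under H0 both groups share one conversion rate, so pool the data.
    pooled = (conv_control + conv_test) / (n_control + n_test)
    std_error = sqrt(pooled * (1 - pooled) * (1 / n_control + 1 / n_test))
    z = (rate_test - rate_control) / std_error
    cdf = NormalDist().cdf
    if two_sided:
        # H1: rates differ in either direction.
        return 2 * (1 - cdf(abs(z)))
    # H1 (one-sided, directional): test rate is higher than control rate.
    return 1 - cdf(z)


ALPHA = 0.05  # significance level, i.e. 95% confidence level

# Made-up counts: 2.0% conversion in control, 2.3% in test.
p_value = two_proportion_p_value(800, 40_000, 920, 40_000)
reject_h0 = p_value < ALPHA
```

The sketch does not guard against a pooled rate of 0 or 1, where the standard error is zero.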
  • 51. A/B testing © 2020 ThoughtWorks Source: https://abtestguide.com/abtestsize/ Meaningful for your business Power and confidence level influence your sample size and the probability of finding a true effect
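How power, significance level and effect size feed into the sample size can be sketched with the common normal-approximation formula for comparing two proportions. This is an assumption for illustration: the calculator referenced on the slides may use a different formula, so its numbers can differ. The function name `sample_size_per_variant` is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_variant(control_rate: float,
                            relative_improvement: float,
                            alpha: float = 0.05,
                            power: float = 0.80,
                            two_sided: bool = True) -> int:
    """Approximate visitors needed per variant to detect the given relative improvement."""
    # Expected test rate, e.g. 2% control with +15% relative gives 2.3%.
    test_rate = control_rate * (1 + relative_improvement)
    normal = NormalDist()
    z_alpha = normal.inv_cdf(1 - alpha / 2) if two_sided else normal.inv_cdf(1 - alpha)
    z_beta = normal.inv_cdf(power)
    variance = control_rate * (1 - control_rate) + test_rate * (1 - test_rate)
    return ceil((z_alpha + z_beta) ** 2 * variance / (test_rate - control_rate) ** 2)


# The 2% control rate and 15% relative improvement from the effect size slide.
baseline = sample_size_per_variant(0.02, 0.15)
# Higher power or a smaller effect size both require more traffic.
more_power = sample_size_per_variant(0.02, 0.15, power=0.90)
smaller_effect = sample_size_per_variant(0.02, 0.05)
```

Comparing `baseline` with `smaller_effect` shows why low-traffic sites are better served by KPIs with a large expected effect.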
  • 52. A/B testing © 2020 ThoughtWorks Important points. High traffic: choose KPIs wisely, low effect size. Low traffic: choose KPIs with high increase (large effect size)
  • 53. A/B testing © 2020 ThoughtWorks Important points. High traffic: choose KPIs wisely, low effect size. Low traffic: choose KPIs with high increase (large effect size). [Figure labels: +5%, +0.5%]
  • 54. A/B testing © 2020 ThoughtWorks Important points. High traffic: choose KPIs wisely, low effect size; accuracy, minimise risk. Low traffic: choose KPIs with high increase (large effect size)
  • 55. A/B testing © 2020 ThoughtWorks Important points. High traffic: choose KPIs wisely, low effect size; preferably AB but also MVT. Low traffic: choose KPIs with high increase (large effect size); AB; run qualitative tests. Never stop an experiment early, even if you “find” significant results (danger: false positives rise!) Source: https://www.evanmiller.org/how-not-to-run-an-ab-test.html https://vwo.com/blog/ab-split-testing-low-traffic-sites/
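Why stopping early inflates false positives can be explored with a small A/A simulation: both variants share the same true conversion rate, so every "significant" result is a false alarm. The sketch below is an illustration under assumptions (the z-test, the number of looks, the traffic per look and all names are hypothetical); it compares checking only at the planned end with stopping at the first significant interim look.

```python
import random
from math import sqrt
from statistics import NormalDist

ALPHA = 0.05
Z_CRITICAL = NormalDist().inv_cdf(1 - ALPHA / 2)  # two-sided threshold


def is_significant(conv_a: int, conv_b: int, n_per_arm: int) -> bool:
    """Two-sided pooled z-test for two arms of equal size."""
    pooled = (conv_a + conv_b) / (2 * n_per_arm)
    if pooled in (0.0, 1.0):
        return False  # no variance, nothing to test
    std_error = sqrt(pooled * (1 - pooled) * 2 / n_per_arm)
    return abs(conv_b - conv_a) / n_per_arm / std_error > Z_CRITICAL


def simulate(experiments: int = 200, looks: int = 10, users_per_look: int = 200,
             rate: float = 0.1, seed: int = 1) -> tuple:
    """Return (false positive rate with peeking, false positive rate at the planned end)."""
    rng = random.Random(seed)
    hits_with_peeking = 0
    hits_at_end = 0
    for _experiment in range(experiments):
        conv_a = conv_b = n_per_arm = 0
        significant_at_any_look = False
        significant_now = False
        for _look in range(looks):
            # Both arms draw from the same rate: there is no real effect.
            conv_a += sum(rng.random() < rate for _ in range(users_per_look))
            conv_b += sum(rng.random() < rate for _ in range(users_per_look))
            n_per_arm += users_per_look
            significant_now = is_significant(conv_a, conv_b, n_per_arm)
            significant_at_any_look = significant_at_any_look or significant_now
        hits_with_peeking += significant_at_any_look
        hits_at_end += significant_now
    return hits_with_peeking / experiments, hits_at_end / experiments


peeking_rate, planned_rate = simulate()
```

By construction `peeking_rate` can never be lower than `planned_rate`, and in runs of this kind it tends to end up noticeably above the nominal 5%.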
  • 57. Focus Group Survey © 2020 ThoughtWorks Delivery Pipeline Feature Toggle Shadow Traffic Focus Group Survey Visual Report Iterative and Incremental development Independent Teams A/B Test
  • 58. Focus Group Survey © 2020 ThoughtWorks Delivery Pipeline Feature Toggle Shadow Traffic Focus Group Survey Visual Report Iterative and Incremental development Independent Teams. What is it: A study using inferential statistics to verify a hypothesis. When: As part of the discovery of a feature, during development. Why: Short feedback loops, data-driven decisions. Caution! You need experience designing and analysing statistical tests.
  • 59. The shopteaser survey © 2020 ThoughtWorks Focus Group Survey
  • 60. Focus Group Survey © 2020 ThoughtWorks The shopteaser survey. Likert Scale [categorical variable]: Strongly disagree, Disagree, Neutral, Agree, Strongly agree. Your research question will drive the design of the experiment and also the analysis of your data. [Figure labels: trial, trial, trial, trial]
  • 61. Focus Group Survey © 2020 ThoughtWorks The shopteaser survey. Likert Scale [categorical variable that can be transformed to continuous - scale 1-5]: Strongly disagree, Disagree, Neutral, Agree, Strongly agree. Things that could go wrong: - Familiarity bias. Methodology examples: - Gave 5s per trial so the answers would be spontaneous - The first trials were discarded
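The transformation from Likert labels to a 1-5 scale, together with discarding the first trials, might look like the sketch below. The number of discarded warm-up trials is not given in the slides, so `warmup_trials=2` is an assumption, as are the function name and the example responses.

```python
from statistics import mean

# Map the categorical Likert labels onto a 1-5 numeric scale.
LIKERT_SCORES = {
    "Strongly disagree": 1,
    "Disagree": 2,
    "Neutral": 3,
    "Agree": 4,
    "Strongly agree": 5,
}


def score_participant(responses: list, warmup_trials: int = 2) -> float:
    """Mean Likert score of one participant, ignoring the first warm-up trials."""
    # Assumption: the first trials are discarded because participants are
    # still getting familiar with the task.
    scored = [LIKERT_SCORES[label] for label in responses[warmup_trials:]]
    return mean(scored)


# Made-up responses of one participant, in trial order.
responses = ["Neutral", "Agree", "Agree", "Strongly agree", "Neutral", "Agree"]
participant_score = score_participant(responses)
```

Per-participant scores like this can then go into whatever inferential test the research question calls for.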
  • 62. During the design phase we also took into account: ● Collect demographic data: there is no such thing as enough data ● Collect feedback at the end of the survey: did they understand the task, did something go wrong? ● Make clear instructions: if you are not there, they cannot ask and will “assume” © 2020 ThoughtWorks Focus Group Survey The shopteaser survey
  • 63. Insights from a focus group The shopteaser survey © 2020 ThoughtWorks [Figure labels: selected, manual]
  • 64. Lab test © 2020 ThoughtWorks Delivery Pipeline Feature Toggle Shadow Traffic Lab Test Focus Group Survey Visual Report A/B Test Iterative and Incremental development Independent Teams
  • 65. UX Lab tests © 2020 ThoughtWorks UX designers test the design and usability of a feature on a test group. ● Small group of people in person (~5-10 people) ● Remote, web-based testing of users ● Qualitative questions ○ e.g. did you like it? Was it easy to find?
  • 66. Wrapping up © 2020 ThoughtWorks
  • 67. Techniques for faster and better decisions PO Delivery Pipeline Feature Toggle Shadow Traffic Lab Test Focus Group Survey Visual Report A/B Test Iterative and Incremental development Independent Teams
  • 68. When is your next release? Could it be earlier? Do you have a solid hypothesis and measurable KPIs for it? Which measurements could you be using instead of assuming the user’s preference? Which of your meetings in the next 2 weeks could be replaced by a lean experiment? © 2020 ThoughtWorks
  • 69. Thank you Irene Torres Klaus Fleerkötter © 2020 ThoughtWorks