C O N F I D E N T I A L
Testing and Tracking
To Improve Conversion Rates
Jonathan Isernhagen
Digital Marketing Fundamentals Workshop Day 2
Digital Travel Summit
4/2/2014
Home Page search widget modification
• Control
• Variant A
• Variant B
• Variant C
Hotel Detail Page “Flexible Dates” tab name
• Control
• Variant A: Removal
• Variant B: “Best Dates”
• Variant C: “Cheapest Dates”
Agenda
1) Examples
2) Program goals
3) Test approach philosophies
a) Eisenberg
b) Anderson
4) Process
a) solicitation
b) prioritization
c) pre-test
d) testing
e) reportage
5) Alternative method: the multi-armed bandit
jonathan.isernhagen@travelocity.com @jon_isernhagen www.linkedin.com/in/isernhagen
What is A/B testing?
Comparing conversion performance between shoppers on:
• Current site (=“control”);
• Modified version(s) (=“variants”).
The version with the highest success/view ratio (=“conversion”) is the winner.
“Success” can be defined in many ways:
1) Some sites aim for visitor pages viewed or time-on-site
2) Transaction sites count booking completion pages
3) Sophisticated testers incorporate all revenue (even of ads)
[Diagram: control page, test variant, completion (“success”) page, other site pages]
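The success/view ratio described on this slide can be sketched in a few lines. This is only an illustration: the `Arm` class, the `best_arm` helper, and all counts are hypothetical and do not come from the presentation.

```python
from dataclasses import dataclass


@dataclass
class Arm:
    """One arm of an A/B test: the control or a variant."""
    name: str
    views: int       # shoppers who saw this version of the page
    successes: int   # shoppers who reached the "success" page

    @property
    def conversion(self) -> float:
        # Conversion = successes / views; an arm with no traffic converts at 0.
        return self.successes / self.views if self.views else 0.0


def best_arm(arms: list[Arm]) -> Arm:
    """Return the arm with the highest raw success/view ratio."""
    return max(arms, key=lambda arm: arm.conversion)


# Hypothetical counts, for illustration only.
arms = [
    Arm("control", views=10_000, successes=300),
    Arm("variant_a", views=10_000, successes=330),
    Arm("variant_b", views=10_000, successes=290),
]
leader = best_arm(arms)
print(leader.name, f"{leader.conversion:.2%}")  # variant_a 3.30%
```

The raw leader is only a starting point; whether the gap is large enough to trust is a separate question of sample size and confidence.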
Why do we A/B Test? Conversion Improvement
Continuous conversion improvement
subject to market competition
≈Building in Venice
Why do we A/B Test? HIPPO* Defense
* Highest Income Person’s Opinion
“If we have data, let’s look at data.
If all we have are opinions, let’s go with mine.”
– Jim Barksdale, Netscape CEO
Why do we A/B test? Causality is Optional
“A British ship’s captain observed the
lack of scurvy among sailors serving
on the naval ships of Mediterranean
countries, where citrus fruit was part
of their rations…
He then gave half his crew limes (the
Treatment group) while the other half
(the Control group) continued with
their regular diet...
While the captain did not realize that
scurvy is a consequence of vitamin C
deficiency, and that limes are rich in
vitamin C, the intervention worked.”
– Ron Kohavi, Practical Guide to Controlled Experiments on the Web
What should we test?
Travelocity relied heavily on “Always Be Testing.”
The book contains a laundry list of every combination of testable
elements on web pages.
It is a good starting point for test brainstorming.
Approaches: Eisenbergs (Monetate)
Credentials: author of “Always Be Testing,” lecturer, consultant
Philosophy:
• understand shopper psychology;
• sustain “scent”;
• note where shoppers bail;
• write formal hypotheses and draw wireframes; and
• use a matrix to prioritize.
Pace: high volume, 30+ tests/month
Reporting: record results for historical memory/learning
Rigor: 80-95% confidence threshold, extensive path analysis
Approaches: Andrew Anderson (Adobe)
Credentials: top consultant at a top A/B test firm; 300+ clients gained an
average 10-40% conversion improvement using his methodology.
Philosophy:
• care about the shopper’s actions, not thoughts
• test with an open mind, no preconceived notions
• start with page element elimination or MVT,
then test permutations
• prioritize:
– low-converting high-traffic pages or
– high-conversion-correlated pages
Pace: limited only by sequential testing, 10-14 days/test
Reporting: five metrics max, revenue per visitor, no path reporting
Rigor: 80% confidence threshold, 5% rise, frequent re-testing.
Comparing the methodologies
Topic area          | Eisenberg / Monetate | Adobe / Andrew Anderson | Sabre Research / Travelocity
Test prioritization | 5 x 5 x 5 product    | Page element            | Effort/impact
Customer intent     | Profile/channel      | Ignore                  | Understand
Design bias         | Univariate           | Multivariate            | Univariate A/B/C…
Contamination       | Embrace/ignore       | Acknowledge             | Accept sometimes
Negative lift       | Unfortunate          | Interesting             | Unfortunate
“Micro-conversion”  | Exciting             | Worthless               | Evaluating
Prone to find       | Occasional big wins  | Constant small wins     | 1 win per 7-10 tests
Test ends           | Eyeball assessment   | Eyeball assessment      | Precalculated size
Error risk          | High                 | Medium                  | Low
Process: test suggestion database
• Test name
• Page name
• Page abbreviation
• Site section
• Date submitted
• Description
• Wireframe
• Effort expected
• Uplift expected
• Submitter
• Current owner
• Status
• Status change comments
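As a rough sketch, one row of such a database could be modeled as a record like the one below. The field names mirror the list above, but the types, the default values, the 1-5 rating scale (borrowed from the solicitation step), and the `change_status` helper are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Suggestion:
    """One row of the test suggestion database (types are assumptions)."""
    test_name: str
    page_name: str
    page_abbreviation: str
    site_section: str
    description: str
    submitter: str
    effort_expected: int          # 1-5 rating supplied by the submitter
    uplift_expected: int          # 1-5 rating supplied by the submitter
    wireframe: str = ""           # path or link to the wireframe image
    current_owner: str = ""
    status: str = "submitted"     # hypothetical initial status
    date_submitted: date = field(default_factory=date.today)
    status_change_comments: list[str] = field(default_factory=list)

    def change_status(self, new_status: str, comment: str) -> None:
        """Move the suggestion to a new status and log the reason."""
        self.status_change_comments.append(
            f"{self.status} -> {new_status}: {comment}"
        )
        self.status = new_status


# Hypothetical entry, for illustration only.
idea = Suggestion(
    test_name="Rename Flexible Dates tab",
    page_name="Hotel Detail Page",
    page_abbreviation="HDP",
    site_section="Hotels",
    description="Rename the tab to see whether more shoppers open it.",
    submitter="jdoe",
    effort_expected=1,
    uplift_expected=3,
)
idea.change_status("prioritized", "Selected for the next wave")
```

Keeping the status history on the record itself is one simple way to preserve the "status change comments" field; a real database would more likely use a separate audit table.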
Process: solicitation
• Request everyone’s ideas (there is no monopoly on good ones)
– Direct them to the online test database (include the link in the message)
• Ask them to:
– run a filtered search for similar ideas on the same page
– fill in the form as completely as possible, including:
• a rating (from 1-5) of the business improvement they expect, and of the expected effort
• a wireframe picture of the change, even if it’s a scanned crayon drawing
• a description of the effect they’re expecting
Process: prioritization
1) Schedule periodic prioritization meetings
2) For each page on the site:
a) Review recent test results and how they affect your thinking
b) Multiply each idea’s projected impact rating by its ease-of-implementation rating
c) Rank in descending order.
d) Select the next wave of tests
3) For each site section/pathway
a) Sample the conversion rate and traffic volume of the prioritized test pages
b) Decide what sensitivity, confidence, and variant count you wish to specify
c) Calculate sample size and minimum test length
d) Adjust test parameters as appropriate
4) Evaluate selected tests’ potential to interfere with each other
5) Ratify final test set and test parameters
6) Assign the developers and testers their tasks
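Steps 2b through 2d amount to a score-and-sort, sketched below. The `Idea` tuple, the backlog entries, and the conversion of a 1-5 effort rating into an ease rating (`6 - effort`) are assumptions for illustration; the slide itself only says to multiply impact by ease.

```python
from typing import NamedTuple


class Idea(NamedTuple):
    name: str
    impact: int   # projected impact, 1 (low) to 5 (high)
    effort: int   # expected effort, 1 (easy) to 5 (hard)


def ease(effort: int) -> int:
    # Assumption: ease is the mirror image of effort on the same 1-5 scale.
    return 6 - effort


def priority(idea: Idea) -> int:
    # Step 2b: projected impact multiplied by ease of implementation.
    return idea.impact * ease(idea.effort)


def next_wave(ideas: list[Idea], size: int) -> list[Idea]:
    # Steps 2c-2d: rank in descending order of score and take the top `size`.
    return sorted(ideas, key=priority, reverse=True)[:size]


# Hypothetical backlog for one page.
backlog = [
    Idea("Rename tab", impact=3, effort=1),              # 3 * 5 = 15
    Idea("Redesign search widget", impact=5, effort=4),  # 5 * 2 = 10
    Idea("Remove tab", impact=2, effort=1),              # 2 * 5 = 10
    Idea("New booking path", impact=5, effort=5),        # 5 * 1 = 5
]
for idea in next_wave(backlog, size=2):
    print(idea.name, priority(idea))
```

Ties keep their original order because Python's sort is stable, so a team that wants a different tie-break (for example, traffic volume) would need to add it to the sort key.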
Avoiding errors
Testing doesn’t reveal The Truth; it enables us to make accurate guesses
about the truth. Those guesses can be wrong in one of two ways:
Truth \ Decision | Reject null                            | Accept null
Null             | Type I error: accepting a bad variant  | Right decision
Alternative      | Right decision                         | Type II error: rejecting a good variant
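To make the reject/accept decision concrete, here is a minimal sketch of a pooled two-proportion z-test. It is one common way to compare two conversion rates, not necessarily the test any particular tool uses; the function names and the 95% default are assumptions.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_p_value(control_views, control_successes,
                           variant_views, variant_successes):
    """Two-sided p-value for the difference between two conversion rates."""
    p_control = control_successes / control_views
    p_variant = variant_successes / variant_views
    # Under the null hypothesis both arms share one pooled conversion rate.
    pooled = (control_successes + variant_successes) / (control_views + variant_views)
    std_err = sqrt(pooled * (1 - pooled) * (1 / control_views + 1 / variant_views))
    z = (p_variant - p_control) / std_err
    return 2 * (1 - NormalDist().cdf(abs(z)))


def reject_null(p_value: float, confidence: float = 0.95) -> bool:
    # Rejecting the null risks a Type I error with probability 1 - confidence;
    # failing to reject it risks a Type II error instead.
    return p_value < 1 - confidence
```

Lowering the confidence threshold makes Type I errors more likely and Type II errors less likely, which is the trade-off the table describes. The sketch assumes both arms have traffic and at least one success or failure overall; otherwise the standard error is zero.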
Test length for statistical significance
Sample size = [2 * Z^2 * Conversion * (1 - Conversion)] / (Conversion * Change)^2
You have to test longer…
• Change: …the smaller the lift you want to detect
• Confidence: …the greater the confidence you want to have
• Conversion: …the closer the page’s conversion is to 50% (for a fixed absolute change; for a fixed relative change, as in the formula above, lower-converting pages need the larger samples)
• Contamination: …the purer you want the results to be.
If you let experiments re-use each other’s traffic, you can get
more data faster.
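The sample-size formula on this slide can be evaluated directly, as in the sketch below. It follows the slide as written, so it has no statistical power term. Treating "Change" as a relative lift and converting the confidence level to a two-sided Z score are assumptions, as are the function and parameter names.

```python
from math import ceil
from statistics import NormalDist


def sample_size(conversion: float, change: float, confidence: float = 0.95) -> int:
    """Visitors needed per variant, per the formula on this slide.

    conversion: baseline conversion rate, e.g. 0.03 for 3%
    change:     smallest relative lift to detect, e.g. 0.10 for 10%
    confidence: e.g. 0.95; converted to a two-sided Z score (an assumption)
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    numerator = 2 * z ** 2 * conversion * (1 - conversion)
    denominator = (conversion * change) ** 2
    return ceil(numerator / denominator)


# Hypothetical page: 3% baseline conversion, 10% relative lift, 95% confidence.
print(sample_size(conversion=0.03, change=0.10))
```

Halving the detectable lift roughly quadruples the required sample, because the change term is squared in the denominator.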
Test length: additional considerations
Beyond statistical minimum sample size, there are other test-sizing
factors to consider:
• Cyclicality: shoppers often demonstrate different conversion behavior
on different days of week, to a different degree across variants.
Inoculate your test by stopping it on a multiple of 7 days.
• Re-shop: the benefits of a superior variant may not express
themselves during the same session. Know the average shop-to-book
incubation period for the tested page and run several weeks longer.
• Interface shock: as above, the control has a home court advantage
by virtue of being well-known to past visitors. Even superior variants
play at a disadvantage until shoppers become accustomed.
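The cyclicality and re-shop considerations can be folded into a simple length calculation, sketched here. Adding the incubation period as extra days is one possible reading of "run several weeks longer"; the function name, parameters, and traffic figures are hypothetical.

```python
from math import ceil


def planned_test_days(sample_per_variant: int, variants: int,
                      daily_visitors: int, incubation_days: int = 0) -> int:
    """Days to run a test, rounded up to whole weeks.

    sample_per_variant: statistical minimum for each arm
    variants:           number of arms, including the control
    daily_visitors:     visitors per day entering the test across all arms
    incubation_days:    average shop-to-book delay for the tested page
    """
    days_for_sample = ceil(sample_per_variant * variants / daily_visitors)
    total_days = days_for_sample + incubation_days
    # Stop on a multiple of 7 so every weekday is represented equally.
    return ceil(total_days / 7) * 7


# Hypothetical test: 4 arms, 12,000 visitors a day, 7-day booking delay.
print(planned_test_days(24_842, variants=4, daily_visitors=12_000,
                        incubation_days=7))  # 21
```

Interface shock is harder to put a number on, so this sketch leaves it out; a team might handle it by excluding the first days of data or by analyzing new visitors separately.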
Process: pre-test
1) Ensure all variants have been fully tested
2) Announce forthcoming tests, with special attention to:
a) Help Desk
b) Site Health
3) Re-communicate your reporting procedure:
a) All results and your interpretations will be posted to the database immediately
b) Interim test results are not scientific and will not be:
i. Open to early speculation
ii. Grounds for terminating tests prematurely or extending them beyond the agreed interval
During the tests
• Keep an eye on site metrics generally and where your tests touch
• Avoid “premature exclamation”
– Observe, but do not report on, the test variants’ performance
– Remember that measures of statistical confidence are valid only at the exact
moment you pre-determined to end the test.
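A small simulation can illustrate why interim readings mislead. The sketch below runs A/A tests, in which both arms share the same true conversion rate, and counts how often a pooled z-test looks "significant" at any of several interim looks versus only at the planned end. All names and parameter values are assumptions chosen for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist


def is_significant(n, control_hits, variant_hits, confidence):
    """Two-sided pooled z-test on two arms of n visitors each."""
    pooled = (control_hits + variant_hits) / (2 * n)
    if pooled in (0.0, 1.0):
        return False  # no variation yet, nothing to test
    std_err = sqrt(pooled * (1 - pooled) * 2 / n)
    z = (variant_hits - control_hits) / n / std_err
    return 2 * (1 - NormalDist().cdf(abs(z))) < 1 - confidence


def false_positive_rates(runs=200, visitors=1000, looks=10,
                         rate=0.10, confidence=0.95, seed=1):
    """Simulate A/A tests (no true difference) and count false alarms."""
    rng = random.Random(seed)
    step = visitors // looks
    peeking = fixed = 0
    for _ in range(runs):
        control = variant = 0
        flagged_at_any_look = False
        for n in range(1, visitors + 1):
            control += rng.random() < rate
            variant += rng.random() < rate
            if n % step == 0 and is_significant(n, control, variant, confidence):
                flagged_at_any_look = True
        peeking += flagged_at_any_look
        fixed += is_significant(visitors, control, variant, confidence)
    return peeking / runs, fixed / runs


any_look_rate, planned_end_rate = false_positive_rates()
print(f"any look: {any_look_rate:.1%}, planned end only: {planned_end_rate:.1%}")
```

Because the final look is one of the interim looks, the any-look rate can never be lower than the planned-end rate. In simulations of this kind it typically comes out well above the nominal 5%.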
Process: reportage
Once each test has run to its pre-determined conclusion trigger:
1) Copy the results from the test tool into the database
2) Complete the after-test fields, including:
a) What do we conclude from this test, especially if the results are significant?
b) What (if any) concerns do we have about its accuracy?
c) What follow-on tests and actions do we recommend?
3) Decide whether the results are interesting/controversial enough to
warrant calling a meeting and/or distributing an explanatory .ppt.
4) Periodically review the relative performance of the various test
groups in your site metrics tool if it permits selection that way.
Testing alternatives: Multi-armed bandit method
Like a hive of bees that always reconnoiters
but sends most bees to known gardens.
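The bee analogy maps naturally onto an epsilon-greedy strategy, one of the simplest bandit algorithms: a small share of traffic always explores, and the rest goes to the best-known arm. The sketch below is an assumption about how that could look, not the specific method the talk had in mind; the names, rates, and epsilon value are hypothetical.

```python
import random


def choose_arm(views, successes, epsilon, rng):
    """Epsilon-greedy choice: explore at random, otherwise exploit the leader."""
    if rng.random() < epsilon:
        return rng.randrange(len(views))  # scout bee: try any garden
    rates = [s / v if v else 0.0 for s, v in zip(successes, views)]
    return max(range(len(rates)), key=rates.__getitem__)  # forager: best known


def run_bandit(true_rates, visitors=10_000, epsilon=0.1, seed=1):
    """Route simulated visitors one at a time and return views per arm."""
    rng = random.Random(seed)
    views = [0] * len(true_rates)
    successes = [0] * len(true_rates)
    for _ in range(visitors):
        arm = choose_arm(views, successes, epsilon, rng)
        views[arm] += 1
        successes[arm] += rng.random() < true_rates[arm]
    return views


# Hypothetical true conversion rates for a control and two variants.
print(run_bandit([0.030, 0.033, 0.029]))
```

Unlike a fixed-split A/B test, the traffic allocation shifts while the test runs, so less traffic is spent on weak variants at the cost of a less clean statistical readout.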
Take-aways
1) Commission a company-wide online test idea database
2) Be clear about your testing methodology:
a) document your hypotheses, be able to tell a narrative
b) know what error risks you’re exposing yourself to
c) quick tests with low confidence/sensitivity can be okay if followed up
3) Ignore your test tool vendor rep
a) “Let’s test longer to see if it goes to confidence.” = disqualifying statement
b) The test tool readout is accurate only on the pre-chosen test day.
4) Consider multi-armed bandit optimization

More Related Content

Similar to Digital travel summit a b testing 2014-04-02

The persuasive architecture of 7jaargarantie.be – 7ansdegarantie.be
The persuasive architecture of 7jaargarantie.be – 7ansdegarantie.beThe persuasive architecture of 7jaargarantie.be – 7ansdegarantie.be
The persuasive architecture of 7jaargarantie.be – 7ansdegarantie.bePeter Coopmans
 
Conversion rate optimization CRO breakfast seminar Stockholm, August 2015
Conversion rate optimization CRO breakfast seminar Stockholm, August 2015  Conversion rate optimization CRO breakfast seminar Stockholm, August 2015
Conversion rate optimization CRO breakfast seminar Stockholm, August 2015 ThisIsNansen
 
Optimizing for Multiple Customer Personas: How a recent test “broke the rules...
Optimizing for Multiple Customer Personas: How a recent test “broke the rules...Optimizing for Multiple Customer Personas: How a recent test “broke the rules...
Optimizing for Multiple Customer Personas: How a recent test “broke the rules...MarketingExperiments
 
How to build an effective conversion program on a budget
 How to build an effective conversion program on a budget  How to build an effective conversion program on a budget
How to build an effective conversion program on a budget Joni Lindgren
 
How to Correctly Use Experimentation in PM by Google PM
How to Correctly Use Experimentation in PM by Google PMHow to Correctly Use Experimentation in PM by Google PM
How to Correctly Use Experimentation in PM by Google PMProduct School
 
Product Experimentation Pitfalls & How to Avoid Them
Product Experimentation Pitfalls & How to Avoid Them Product Experimentation Pitfalls & How to Avoid Them
Product Experimentation Pitfalls & How to Avoid Them Optimizely
 
Product Experimentation Pitfalls & How to Avoid Them
Product Experimentation Pitfalls & How to Avoid Them Product Experimentation Pitfalls & How to Avoid Them
Product Experimentation Pitfalls & How to Avoid Them Optimizely
 
Conversion Rate Optimization at OMS - the 5 Step Conversion Optimization Stra...
Conversion Rate Optimization at OMS - the 5 Step Conversion Optimization Stra...Conversion Rate Optimization at OMS - the 5 Step Conversion Optimization Stra...
Conversion Rate Optimization at OMS - the 5 Step Conversion Optimization Stra...Chris Goward
 
MLconf NYC Claudia Perlich
MLconf NYC Claudia PerlichMLconf NYC Claudia Perlich
MLconf NYC Claudia PerlichMLconf
 
VWO Webinar: Cutting Guesswork About Visitor Behavior with Data Driven Conver...
VWO Webinar: Cutting Guesswork About Visitor Behavior with Data Driven Conver...VWO Webinar: Cutting Guesswork About Visitor Behavior with Data Driven Conver...
VWO Webinar: Cutting Guesswork About Visitor Behavior with Data Driven Conver...VWO
 
Webinar: Common Mistakes in A/B Testing
Webinar: Common Mistakes in A/B TestingWebinar: Common Mistakes in A/B Testing
Webinar: Common Mistakes in A/B TestingOptimizely
 
Data-Driven UI/UX Design with A/B Testing
Data-Driven UI/UX Design with A/B TestingData-Driven UI/UX Design with A/B Testing
Data-Driven UI/UX Design with A/B TestingJack Nguyen (Hung Tien)
 
4 Steps Toward Scientific A/B Testing
4 Steps Toward Scientific A/B Testing4 Steps Toward Scientific A/B Testing
4 Steps Toward Scientific A/B TestingJanessa Lantz
 
Workshop Stanford University - 28th July 2018 on Website Optimization
Workshop Stanford University - 28th July 2018 on Website Optimization  Workshop Stanford University - 28th July 2018 on Website Optimization
Workshop Stanford University - 28th July 2018 on Website Optimization Raj Lal
 
Conversion Rate Optimisation Presentation
Conversion Rate Optimisation PresentationConversion Rate Optimisation Presentation
Conversion Rate Optimisation PresentationMadhouse Associates
 
Competitive analysis a different approach - Dave Sottimano
Competitive analysis a different approach - Dave SottimanoCompetitive analysis a different approach - Dave Sottimano
Competitive analysis a different approach - Dave Sottimanoauexpo Conference
 
12 reasons your site sucks - InvestNI
12 reasons your site sucks - InvestNI12 reasons your site sucks - InvestNI
12 reasons your site sucks - InvestNICraig Sullivan
 
Driving Online Sales - Craig Sullivan, The future of the online marketplace 2...
Driving Online Sales - Craig Sullivan, The future of the online marketplace 2...Driving Online Sales - Craig Sullivan, The future of the online marketplace 2...
Driving Online Sales - Craig Sullivan, The future of the online marketplace 2...Invest Northern Ireland
 
Taming The HiPPO
Taming The HiPPOTaming The HiPPO
Taming The HiPPOGoogle A/NZ
 

Similar to Digital travel summit a b testing 2014-04-02 (20)

The persuasive architecture of 7jaargarantie.be – 7ansdegarantie.be
The persuasive architecture of 7jaargarantie.be – 7ansdegarantie.beThe persuasive architecture of 7jaargarantie.be – 7ansdegarantie.be
The persuasive architecture of 7jaargarantie.be – 7ansdegarantie.be
 
Conversion rate optimization CRO breakfast seminar Stockholm, August 2015
Conversion rate optimization CRO breakfast seminar Stockholm, August 2015  Conversion rate optimization CRO breakfast seminar Stockholm, August 2015
Conversion rate optimization CRO breakfast seminar Stockholm, August 2015
 
Optimizing for Multiple Customer Personas: How a recent test “broke the rules...
Optimizing for Multiple Customer Personas: How a recent test “broke the rules...Optimizing for Multiple Customer Personas: How a recent test “broke the rules...
Optimizing for Multiple Customer Personas: How a recent test “broke the rules...
 
How to build an effective conversion program on a budget
 How to build an effective conversion program on a budget  How to build an effective conversion program on a budget
How to build an effective conversion program on a budget
 
How to Correctly Use Experimentation in PM by Google PM
How to Correctly Use Experimentation in PM by Google PMHow to Correctly Use Experimentation in PM by Google PM
How to Correctly Use Experimentation in PM by Google PM
 
Product Experimentation Pitfalls & How to Avoid Them
Product Experimentation Pitfalls & How to Avoid Them Product Experimentation Pitfalls & How to Avoid Them
Product Experimentation Pitfalls & How to Avoid Them
 
Product Experimentation Pitfalls & How to Avoid Them
Product Experimentation Pitfalls & How to Avoid Them Product Experimentation Pitfalls & How to Avoid Them
Product Experimentation Pitfalls & How to Avoid Them
 
Conversion Rate Optimization at OMS - the 5 Step Conversion Optimization Stra...
Conversion Rate Optimization at OMS - the 5 Step Conversion Optimization Stra...Conversion Rate Optimization at OMS - the 5 Step Conversion Optimization Stra...
Conversion Rate Optimization at OMS - the 5 Step Conversion Optimization Stra...
 
MLconf NYC Claudia Perlich
MLconf NYC Claudia PerlichMLconf NYC Claudia Perlich
MLconf NYC Claudia Perlich
 
VWO Webinar: Cutting Guesswork About Visitor Behavior with Data Driven Conver...
VWO Webinar: Cutting Guesswork About Visitor Behavior with Data Driven Conver...VWO Webinar: Cutting Guesswork About Visitor Behavior with Data Driven Conver...
VWO Webinar: Cutting Guesswork About Visitor Behavior with Data Driven Conver...
 
Webinar: Common Mistakes in A/B Testing
Webinar: Common Mistakes in A/B TestingWebinar: Common Mistakes in A/B Testing
Webinar: Common Mistakes in A/B Testing
 
Data-Driven UI/UX Design with A/B Testing
Data-Driven UI/UX Design with A/B TestingData-Driven UI/UX Design with A/B Testing
Data-Driven UI/UX Design with A/B Testing
 
4 Steps Toward Scientific A/B Testing
4 Steps Toward Scientific A/B Testing4 Steps Toward Scientific A/B Testing
4 Steps Toward Scientific A/B Testing
 
Worst landing page mistakes most common vs most costly
Worst landing page mistakes most common vs most costlyWorst landing page mistakes most common vs most costly
Worst landing page mistakes most common vs most costly
 
Workshop Stanford University - 28th July 2018 on Website Optimization
Workshop Stanford University - 28th July 2018 on Website Optimization  Workshop Stanford University - 28th July 2018 on Website Optimization
Workshop Stanford University - 28th July 2018 on Website Optimization
 
Conversion Rate Optimisation Presentation
Conversion Rate Optimisation PresentationConversion Rate Optimisation Presentation
Conversion Rate Optimisation Presentation
 
Competitive analysis a different approach - Dave Sottimano
Competitive analysis a different approach - Dave SottimanoCompetitive analysis a different approach - Dave Sottimano
Competitive analysis a different approach - Dave Sottimano
 
12 reasons your site sucks - InvestNI
12 reasons your site sucks - InvestNI12 reasons your site sucks - InvestNI
12 reasons your site sucks - InvestNI
 
Driving Online Sales - Craig Sullivan, The future of the online marketplace 2...
Driving Online Sales - Craig Sullivan, The future of the online marketplace 2...Driving Online Sales - Craig Sullivan, The future of the online marketplace 2...
Driving Online Sales - Craig Sullivan, The future of the online marketplace 2...
 
Taming The HiPPO
Taming The HiPPOTaming The HiPPO
Taming The HiPPO
 

Recently uploaded

Top profile Call Girls In Dindigul [ 7014168258 ] Call Me For Genuine Models ...
Top profile Call Girls In Dindigul [ 7014168258 ] Call Me For Genuine Models ...Top profile Call Girls In Dindigul [ 7014168258 ] Call Me For Genuine Models ...
Top profile Call Girls In Dindigul [ 7014168258 ] Call Me For Genuine Models ...gajnagarg
 
Mira Road Housewife Call Girls 07506202331, Nalasopara Call Girls
Mira Road Housewife Call Girls 07506202331, Nalasopara Call GirlsMira Road Housewife Call Girls 07506202331, Nalasopara Call Girls
Mira Road Housewife Call Girls 07506202331, Nalasopara Call GirlsPriya Reddy
 
Best SEO Services Company in Dallas | Best SEO Agency Dallas
Best SEO Services Company in Dallas | Best SEO Agency DallasBest SEO Services Company in Dallas | Best SEO Agency Dallas
Best SEO Services Company in Dallas | Best SEO Agency DallasDigicorns Technologies
 
best call girls in Hyderabad Finest Escorts Service 📞 9352988975 📞 Available ...
best call girls in Hyderabad Finest Escorts Service 📞 9352988975 📞 Available ...best call girls in Hyderabad Finest Escorts Service 📞 9352988975 📞 Available ...
best call girls in Hyderabad Finest Escorts Service 📞 9352988975 📞 Available ...kajalverma014
 
一比一原版(Flinders毕业证书)弗林德斯大学毕业证原件一模一样
一比一原版(Flinders毕业证书)弗林德斯大学毕业证原件一模一样一比一原版(Flinders毕业证书)弗林德斯大学毕业证原件一模一样
一比一原版(Flinders毕业证书)弗林德斯大学毕业证原件一模一样ayvbos
 
Abu Dhabi Escorts Service 0508644382 Escorts in Abu Dhabi
Abu Dhabi Escorts Service 0508644382 Escorts in Abu DhabiAbu Dhabi Escorts Service 0508644382 Escorts in Abu Dhabi
Abu Dhabi Escorts Service 0508644382 Escorts in Abu DhabiMonica Sydney
 
APNIC Policy Roundup, presented by Sunny Chendi at the 5th ICANN APAC-TWNIC E...
APNIC Policy Roundup, presented by Sunny Chendi at the 5th ICANN APAC-TWNIC E...APNIC Policy Roundup, presented by Sunny Chendi at the 5th ICANN APAC-TWNIC E...
APNIC Policy Roundup, presented by Sunny Chendi at the 5th ICANN APAC-TWNIC E...APNIC
 
20240509 QFM015 Engineering Leadership Reading List April 2024.pdf
20240509 QFM015 Engineering Leadership Reading List April 2024.pdf20240509 QFM015 Engineering Leadership Reading List April 2024.pdf
20240509 QFM015 Engineering Leadership Reading List April 2024.pdfMatthew Sinclair
 
Tadepalligudem Escorts Service Girl ^ 9332606886, WhatsApp Anytime Tadepallig...
Tadepalligudem Escorts Service Girl ^ 9332606886, WhatsApp Anytime Tadepallig...Tadepalligudem Escorts Service Girl ^ 9332606886, WhatsApp Anytime Tadepallig...
Tadepalligudem Escorts Service Girl ^ 9332606886, WhatsApp Anytime Tadepallig...meghakumariji156
 
20240510 QFM016 Irresponsible AI Reading List April 2024.pdf
20240510 QFM016 Irresponsible AI Reading List April 2024.pdf20240510 QFM016 Irresponsible AI Reading List April 2024.pdf
20240510 QFM016 Irresponsible AI Reading List April 2024.pdfMatthew Sinclair
 
20240507 QFM013 Machine Intelligence Reading List April 2024.pdf
20240507 QFM013 Machine Intelligence Reading List April 2024.pdf20240507 QFM013 Machine Intelligence Reading List April 2024.pdf
20240507 QFM013 Machine Intelligence Reading List April 2024.pdfMatthew Sinclair
 
一比一原版奥兹学院毕业证如何办理
一比一原版奥兹学院毕业证如何办理一比一原版奥兹学院毕业证如何办理
一比一原版奥兹学院毕业证如何办理F
 
原版制作美国爱荷华大学毕业证(iowa毕业证书)学位证网上存档可查
原版制作美国爱荷华大学毕业证(iowa毕业证书)学位证网上存档可查原版制作美国爱荷华大学毕业证(iowa毕业证书)学位证网上存档可查
原版制作美国爱荷华大学毕业证(iowa毕业证书)学位证网上存档可查ydyuyu
 
在线制作约克大学毕业证(yu毕业证)在读证明认证可查
在线制作约克大学毕业证(yu毕业证)在读证明认证可查在线制作约克大学毕业证(yu毕业证)在读证明认证可查
在线制作约克大学毕业证(yu毕业证)在读证明认证可查ydyuyu
 
哪里办理美国迈阿密大学毕业证(本硕)umiami在读证明存档可查
哪里办理美国迈阿密大学毕业证(本硕)umiami在读证明存档可查哪里办理美国迈阿密大学毕业证(本硕)umiami在读证明存档可查
哪里办理美国迈阿密大学毕业证(本硕)umiami在读证明存档可查ydyuyu
 
一比一原版田纳西大学毕业证如何办理
一比一原版田纳西大学毕业证如何办理一比一原版田纳西大学毕业证如何办理
一比一原版田纳西大学毕业证如何办理F
 
一比一原版(Curtin毕业证书)科廷大学毕业证原件一模一样
一比一原版(Curtin毕业证书)科廷大学毕业证原件一模一样一比一原版(Curtin毕业证书)科廷大学毕业证原件一模一样
一比一原版(Curtin毕业证书)科廷大学毕业证原件一模一样ayvbos
 
Russian Call girls in Abu Dhabi 0508644382 Abu Dhabi Call girls
Russian Call girls in Abu Dhabi 0508644382 Abu Dhabi Call girlsRussian Call girls in Abu Dhabi 0508644382 Abu Dhabi Call girls
Russian Call girls in Abu Dhabi 0508644382 Abu Dhabi Call girlsMonica Sydney
 
pdfcoffee.com_business-ethics-q3m7-pdf-free.pdf
pdfcoffee.com_business-ethics-q3m7-pdf-free.pdfpdfcoffee.com_business-ethics-q3m7-pdf-free.pdf
pdfcoffee.com_business-ethics-q3m7-pdf-free.pdfJOHNBEBONYAP1
 

Recently uploaded (20)

Top profile Call Girls In Dindigul [ 7014168258 ] Call Me For Genuine Models ...
Top profile Call Girls In Dindigul [ 7014168258 ] Call Me For Genuine Models ...Top profile Call Girls In Dindigul [ 7014168258 ] Call Me For Genuine Models ...
Top profile Call Girls In Dindigul [ 7014168258 ] Call Me For Genuine Models ...
 
Mira Road Housewife Call Girls 07506202331, Nalasopara Call Girls
Mira Road Housewife Call Girls 07506202331, Nalasopara Call GirlsMira Road Housewife Call Girls 07506202331, Nalasopara Call Girls
Mira Road Housewife Call Girls 07506202331, Nalasopara Call Girls
 
Best SEO Services Company in Dallas | Best SEO Agency Dallas
Best SEO Services Company in Dallas | Best SEO Agency DallasBest SEO Services Company in Dallas | Best SEO Agency Dallas
Best SEO Services Company in Dallas | Best SEO Agency Dallas
 
best call girls in Hyderabad Finest Escorts Service 📞 9352988975 📞 Available ...
best call girls in Hyderabad Finest Escorts Service 📞 9352988975 📞 Available ...best call girls in Hyderabad Finest Escorts Service 📞 9352988975 📞 Available ...
best call girls in Hyderabad Finest Escorts Service 📞 9352988975 📞 Available ...
 
call girls in Anand Vihar (delhi) call me [🔝9953056974🔝] escort service 24X7
call girls in Anand Vihar (delhi) call me [🔝9953056974🔝] escort service 24X7call girls in Anand Vihar (delhi) call me [🔝9953056974🔝] escort service 24X7
call girls in Anand Vihar (delhi) call me [🔝9953056974🔝] escort service 24X7
 
一比一原版(Flinders毕业证书)弗林德斯大学毕业证原件一模一样
一比一原版(Flinders毕业证书)弗林德斯大学毕业证原件一模一样一比一原版(Flinders毕业证书)弗林德斯大学毕业证原件一模一样
一比一原版(Flinders毕业证书)弗林德斯大学毕业证原件一模一样
 
Abu Dhabi Escorts Service 0508644382 Escorts in Abu Dhabi
Abu Dhabi Escorts Service 0508644382 Escorts in Abu DhabiAbu Dhabi Escorts Service 0508644382 Escorts in Abu Dhabi
Abu Dhabi Escorts Service 0508644382 Escorts in Abu Dhabi
 
APNIC Policy Roundup, presented by Sunny Chendi at the 5th ICANN APAC-TWNIC E...
APNIC Policy Roundup, presented by Sunny Chendi at the 5th ICANN APAC-TWNIC E...APNIC Policy Roundup, presented by Sunny Chendi at the 5th ICANN APAC-TWNIC E...
APNIC Policy Roundup, presented by Sunny Chendi at the 5th ICANN APAC-TWNIC E...
 
20240509 QFM015 Engineering Leadership Reading List April 2024.pdf
20240509 QFM015 Engineering Leadership Reading List April 2024.pdf20240509 QFM015 Engineering Leadership Reading List April 2024.pdf
20240509 QFM015 Engineering Leadership Reading List April 2024.pdf
 
Tadepalligudem Escorts Service Girl ^ 9332606886, WhatsApp Anytime Tadepallig...
Tadepalligudem Escorts Service Girl ^ 9332606886, WhatsApp Anytime Tadepallig...Tadepalligudem Escorts Service Girl ^ 9332606886, WhatsApp Anytime Tadepallig...
Tadepalligudem Escorts Service Girl ^ 9332606886, WhatsApp Anytime Tadepallig...
 
20240510 QFM016 Irresponsible AI Reading List April 2024.pdf
20240510 QFM016 Irresponsible AI Reading List April 2024.pdf20240510 QFM016 Irresponsible AI Reading List April 2024.pdf
20240510 QFM016 Irresponsible AI Reading List April 2024.pdf
 
20240507 QFM013 Machine Intelligence Reading List April 2024.pdf
20240507 QFM013 Machine Intelligence Reading List April 2024.pdf20240507 QFM013 Machine Intelligence Reading List April 2024.pdf
20240507 QFM013 Machine Intelligence Reading List April 2024.pdf
 
一比一原版奥兹学院毕业证如何办理
一比一原版奥兹学院毕业证如何办理一比一原版奥兹学院毕业证如何办理
一比一原版奥兹学院毕业证如何办理
 
原版制作美国爱荷华大学毕业证(iowa毕业证书)学位证网上存档可查
原版制作美国爱荷华大学毕业证(iowa毕业证书)学位证网上存档可查原版制作美国爱荷华大学毕业证(iowa毕业证书)学位证网上存档可查
原版制作美国爱荷华大学毕业证(iowa毕业证书)学位证网上存档可查
 
在线制作约克大学毕业证(yu毕业证)在读证明认证可查
在线制作约克大学毕业证(yu毕业证)在读证明认证可查在线制作约克大学毕业证(yu毕业证)在读证明认证可查
在线制作约克大学毕业证(yu毕业证)在读证明认证可查
 
哪里办理美国迈阿密大学毕业证(本硕)umiami在读证明存档可查
哪里办理美国迈阿密大学毕业证(本硕)umiami在读证明存档可查哪里办理美国迈阿密大学毕业证(本硕)umiami在读证明存档可查
哪里办理美国迈阿密大学毕业证(本硕)umiami在读证明存档可查
 
一比一原版田纳西大学毕业证如何办理
一比一原版田纳西大学毕业证如何办理一比一原版田纳西大学毕业证如何办理
一比一原版田纳西大学毕业证如何办理
 
一比一原版(Curtin毕业证书)科廷大学毕业证原件一模一样
一比一原版(Curtin毕业证书)科廷大学毕业证原件一模一样一比一原版(Curtin毕业证书)科廷大学毕业证原件一模一样
一比一原版(Curtin毕业证书)科廷大学毕业证原件一模一样
 
Russian Call girls in Abu Dhabi 0508644382 Abu Dhabi Call girls
Russian Call girls in Abu Dhabi 0508644382 Abu Dhabi Call girlsRussian Call girls in Abu Dhabi 0508644382 Abu Dhabi Call girls
Russian Call girls in Abu Dhabi 0508644382 Abu Dhabi Call girls
 
pdfcoffee.com_business-ethics-q3m7-pdf-free.pdf
pdfcoffee.com_business-ethics-q3m7-pdf-free.pdfpdfcoffee.com_business-ethics-q3m7-pdf-free.pdf
pdfcoffee.com_business-ethics-q3m7-pdf-free.pdf
 

Digital travel summit a b testing 2014-04-02

  • 1. C O N F I D E N T I A L Testing and Tracking To Improve Conversion Rates Jonathan Isernhagen Digital Marketing Fundamentals Workshop Day 2 Digital Travel Summit 4/2/2014
  • 2. C O N F I D E N T I A L Home Page search widget modification Control
  • 3. C O N F I D E N T I A L Home Page search widget modification Variant A
  • 4. C O N F I D E N T I A L Home Page search widget modification Variant B
  • 5. C O N F I D E N T I A L Home Page search widget modification Variant B Variant C
  • 6. C O N F I D E N T I A L Home Page search widget modification Variant B
  • 7. C O N F I D E N T I A L Hotel Detail Page “Flexible Dates” tab name Control
  • 8. C O N F I D E N T I A L Hotel Detail Page “Flexible Dates” tab name Variant A: Removal
  • 9. C O N F I D E N T I A L Hotel Detail Page “Flexible Dates” tab name Variant B: “Best Dates”
  • 10. C O N F I D E N T I A L Hotel Detail Page “Flexible Dates” tab name Variant C: “Cheapest Dates”
  • 11. C O N F I D E N T I A L Hotel Detail Page “Flexible Dates” tab name Variant C: “Cheapest Dates”
  • 12. C O N F I D E N T I A L Agenda 1) Examples 2) Program goals 3) Test approach philosophies a) Eisenberg b) Anderson 4) Process a) solicitation b) prioritization c) pre-test d) testing e) reportage 5) Alternative method: the multi-armed bandit jonathan.isernhagen@travelocity.com @jon_isernhagen www.linkedin.com/in/isernhagen
  • 13. What is A/B testing? Comparing conversion performance between shoppers on: • Current site (=“control”); • Modified version(s) (=“variants”). Highest Success / View ratio (=“conversion”) is the winner. “Success” can be defined in many ways: 1) Some sites aim for visitor pages viewed or time-on-site 2) Transaction sites count booking completion pages 3) Sophisticated testers incorporate all revenue (even of ads) [Diagram labels: Control page; Test variant; Completion “success” page; Other site pages]
  • 14. Why do we A/B Test? Conversion Improvement Continuous conversion improvement subject to market competition ≈ Building in Venice
  • 15. Why do we A/B Test? HIPPO* Defense “If we have data, let’s look at data. If all we have are opinions, let’s go with mine.” – Jim Barksdale, Netscape CEO (* Highest Income Person’s Opinion)
  • 16. Why do we A/B test? Causality is Optional “A British ship’s captain observed the lack of scurvy among sailors serving on the naval ships of Mediterranean countries, where citrus fruit was part of their rations… He then gave half his crew limes (the Treatment group) while the other half (the Control group) continued with their regular diet... While the captain did not realize that scurvy is a consequence of vitamin C deficiency, and that limes are rich in vitamin C, the intervention worked.” – Ron Kohavi, Practical Guide to Controlled Experiments on the Web
  • 17. Agenda 1) Examples 2) Program goal 3) Test approach philosophies a) Eisenberg b) Anderson 4) Process a) solicitation b) prioritization c) pre-test d) testing e) reportage 5) Alternative method: the multi-armed bandit
  • 18. What should we test? Travelocity relied heavily on “Always Be Testing.” Contains a laundry list of every combination of testable elements on web pages. Good starting point for test brainstorming.
  • 19. Approaches: Eisenbergs (Monetate) Credentials: author of “Always Be Testing,” lecturer, consultant Philosophy: • understand shopper psychology; • sustain “scent;” • note where shoppers bail; • write formal hypotheses/draw wireframe, and; • use matrix to prioritize. Pace: high volume: 30+/month Reporting: record results for historical memory/learning Rigor: 80-95% confidence threshold, extensive path analysis
  • 20. Approaches: Andrew Anderson (Adobe) Credentials: top consultant of top A/B test firm, 300+ clients gained average 10-40% conversion improvement using his methodology. Philosophy: • cares about the shopper’s actions, not thoughts • test with open mind, no preconceived notions • start with page element elimination or MVT, then test permutations • prioritize: – low-converting high-traffic or – high-conversion-correlated pages Pace: limited only by sequential testing, 10-14 days/test Reporting: five metrics max, revenue per visitor, no path reporting Rigor: 80% confidence threshold, 5% rise, frequent re-testing.
  • 21. Comparing the methodologies
      Topic area | Eisenberg / Monetate | Adobe / Andrew | Sabre Research / Travelocity
      Test prioritization | 5 x 5 x 5 product | Page element | Effort/impact
      Customer intent | Profile/channel | Ignore | Understand
      Design bias | Univariate | Multivariate | Univariate A/B/C…
      Contamination | Embrace/ignore | Acknowledge | Accept sometimes
      Negative lift | Unfortunate | Interesting | Unfortunate
      “Micro-conversion” | Exciting | Worthless | Evaluating
      Prone to find | Occasional big wins | Constant small wins | 1 win per 7-10 tests
      Test ends | Eyeball assessment | Eyeball assessment | Precalculated size
      Error risk | High | Medium | Low
  • 22. Agenda 1) Examples 2) Program goal 3) Test approach philosophies a) Eisenberg b) Anderson 4) Process a) solicitation b) prioritization c) pre-test d) testing e) reportage 5) Alternative method: the multi-armed bandit
  • 23. Process: test suggestion database • Test name • Page name • Page abbreviation • Site section • Date submitted • Description • Wireframe • Effort expected • Uplift expected • Submitter • Current owner • Status • Status change comments
  • 24. Process: solicitation • Request everyone’s ideas (there is no monopoly on good ones) – Direct them to the online test database (include the link in the message) • Ask them to: – do filtered search for similar ideas on same page – complete the form as completely as possible, including • Rate (from 1-5) the business improvement they expect, and expected effort • Include a wireframe picture of the change, even if it’s a scanned crayon drawing • Describe the effect they’re expecting
  • 25. Process: prioritization 1) Schedule periodic prioritization meetings 2) For each page on the site: a) Review recent test results and how they affect your thinking b) Multiply projected impact numbers * Ease of implementation numbers c) Rank in descending order. d) Select the next wave of tests 3) For each site section/pathway a) Sample conversion and traffic volume as applies to prioritized test pages b) Decide what sensitivity/confidence /variant count you wish to specify c) Calculate sample size and minimum test length d) Adjust test parameters as appropriate 4) Evaluate selected tests’ potential to interfere with each other 5) Ratify final test set and test parameters 6) Assign the developers and testers their tasks
  • 26. Avoiding errors Testing doesn’t reveal The Truth, it enables us to make accurate guesses about the truth. Those guesses can be wrong in one of two ways:
      Truth \ Decision | Reject null | Accept null
      Null | Type I error (accepting a bad variant) | Right decision
      Alternative | Right decision | Type II error (rejecting a good variant)
  • 27. Test length for statistical significance
      $$\text{Sample size} = \frac{2 \cdot Z^2 \cdot \text{Conversion} \cdot (1 - \text{Conversion})}{(\text{Conversion} \cdot \text{Change})^2}$$
      You have to test longer… • Change: …the smaller the lift you want to detect • Confidence: …the greater the confidence you want to have • Conversion: …the closer the page’s conversion is to 50% • Contamination: …the purer you want the results to be. If you let experiments re-use each others’ traffic, you can get more data faster.
  • 28. Test length: additional considerations Beyond statistical minimum sample size, there are other test-sizing factors to consider: • Cyclicality: shoppers often demonstrate different conversion behavior on different days of week, to a different degree across variants. Inoculate your test by stopping it on a multiple of 7 days. • Re-shop: the benefits of a superior variant may not express themselves during the same session. Know the average shop-to-book incubation period for the tested page and run several weeks longer. • Interface shock: as above, the control has a home court advantage by virtue of being well-known to past visitors. Even superior variants play at a disadvantage until shoppers become accustomed.
  • 29. Process: pre-test 1) Ensure all variants have been fully tested 2) Announce forthcoming tests, with special attention to: a) Help Desk b) Site Health 3) Re-communicate your reporting procedure: a) All results and your interpretations will be immediately posted to database b) Interim test results are not scientific and will not be: i. Subject to early speculation ii. Cause to prematurely terminate tests or extend them beyond agreed interval
  • 30. During the tests • Keep an eye on site metrics generally and where your tests touch • Avoid “premature exclamation” – Observe, but do not report on, the test variants’ performance – Remember that measures of statistical confidence are valid only in the exact moment you pre-determined to end the test.
  • 31. Process: reportage Once each test has run to its pre-determined conclusion trigger: 1) Copy the results from test tool into the database 2) Complete the after-test fields, including: a) What we conclude from this test, especially if results are significant? b) What (if any) concerns we have about its accuracy? c) What follow-on tests and actions you recommend? 3) Decide whether the results are interesting/controversial enough to warrant calling a meeting and/or distributing an explanatory .ppt. 4) Periodically review the relative performance of the various test groups in your site metrics tool if it permits selection that way.
  • 32. Agenda 1) Examples 2) Program goal 3) Test approach philosophies a) Eisenberg b) Anderson 4) Process a) solicitation b) prioritization c) pre-test d) testing e) reportage 5) Alternative test method: the multi-armed bandit
  • 33. Testing alternatives: Multi-armed bandit method Like a hive of bees that always reconnoiter but sends most bees to known gardens.
  • 34. Take-aways 1) Commission a company-wide online test idea database 2) Be clear about your testing methodology: a) document your hypotheses, be able to tell a narrative b) know what error risks you’re exposing yourself to c) quick tests with low confidence/sensitivity can be okay if followed up 3) Ignore your test tool vendor rep a) “Let’s test longer to see if it goes to confidence.” = disqualifying statement b) The test tool readout is accurate only on the pre-chosen test day. 4) Consider multi-armed bandit optimization

Editor's Notes

  1. Hi, my name is Jonathan Isernhagen and I do marketing analytics for Travelocity. Before my current job I did A/B testing, and I’d like to share some of what I learned on that team. Before we dive in, let’s play Guess the Winning Variant:
  2. In a previous incarnation of the desktop site, the search widget extended off the screen on some screen resolutions. We wanted a way to compress it so the “Search Now” button is always above the fold.
  3. 1) Variant A achieves this vertical compression by moving the “Multiple Destinations” link down from where it logically sits by the “one way” and “round trip” radio buttons to the empty space next to the “Compare Surrounding Airports” checkbox.
  4. 1) A common theme of site design is you “make it as simple as it needs to be, then one step simpler.” 2) So variant B is variant A, but with each step numbered so the shopping process is crystal clear.
  5. Variant C is a radical re-design from our information architects which is well-constructed but jarringly different from what users are used to. Time to vote: How many for A, for B, for C? For the Control?
  6. Variant B caused a 5% lift in package conversions, more than offsetting a 1.6% decline in hotel conversion.
  7. Another one from several sites ago: our hotel details page had several tabs, one of which let the user check alternative dates during which their stay might be cheaper. But it was named “flexible dates,” which seems ambiguous without more explanation.
  8. Variant A removes the tab completely.
  9. Variant B speaks directly to customer value, but is still ambiguous as to what makes these dates “better” than the original dates chosen by the consumer.
  10. Variant C speaks most directly to specific customer value, but risks creating the impression that we’re offering something that the property had not committed to provide. Votes: A, B, C, Control?
  11. Winner: Variant C causes 3.5% boost in hotel conversion for all shoppers touching this page.
  12. For our agenda I’d like to discuss: what we hope to accomplish through A/B testing; two different philosophies for choosing test candidates; the steps your process should contain; and a brief explanation of multi-armed banditry.
  13. Just so we’re working from a common definition: A/B testing on a transaction site involves: changing the user experience on a certain page for a randomly-selected set of site visitors and then measuring whether they have a different propensity to reach a goal page, which generally has something to do with transaction confirmation.
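One common way to get the random split described in this note is to hash a stable visitor identifier, so that a returning shopper keeps seeing the same variant. The sketch below is illustrative only: the notes do not say how Travelocity's test tools assigned visitors, and `visitor_id`, `test_name` and the variant labels are hypothetical names.

```python
import hashlib
from typing import Sequence


def assign_variant(visitor_id: str, test_name: str, variants: Sequence[str]) -> str:
    """Deterministically map a visitor to one arm of one test."""
    # Salting the hash with the test name keeps a visitor's bucket in one test
    # independent of the same visitor's bucket in another test.
    digest = hashlib.sha256(f"{test_name}:{visitor_id}".encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % len(variants)
    return variants[bucket]


arms = ["control", "A", "B", "C"]  # hypothetical arm labels
print(assign_variant("visitor-123", "search-widget", arms))
```

Because the assignment depends only on the identifier and the test name, no per-visitor state has to be stored to keep the experience consistent across visits.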
  14. Why do we do this? Our main goal is to improve conversion. Theoretically, if we keep improving and improving conversion it should eventually reach 100%. In reality, our competitors are always improving too, so it’s like construction in Venice, or climbing a down escalator.
  15. A more devious reason is for HiPPO defense, an acronym for the “Highest Paid Person’s Opinion.” I actually think this quote from Jim Barksdale is more clever than it is pushy, but it highlights the fact that in the absence of hard data, experts do what bosses say. When you can say “we can add the dancing monkey, but it will cost us 6.3% conversion,” the burden of proof is now on the boss if he wants to head that way.
  16. 1) A final reason to A/B test is that it works without us knowing why. 2) The famous Limey anecdote tells of a British ship captain who noted a lack of scurvy among sailors of Med-cruising ships and directed half of his crew to eat limes while the others refrained. 3) It worked and was widely adopted though the nutritional mechanism remained unknown.
  17. 1) So what should we test?
  18. 1) During my time on the A/B testing team I had the tremendous good fortune to be tutored by two masters of the game with completely different styles, like the kid in the Bobby Fischer movie. 2) One of these was a team comprised of the Eisenberg brothers and John-whose-name-I-can’t-pronounce, authors of the famous “Always Be Testing,” which is a great resource for laundry lists of testable items.
  19. The Eisenbergs stressed the importance of “scent,” which encapsulates all of the implicit promises made to the consumer which your site is expected to fulfill. If your display ad features a green gecko promising that if you fill out a simple form you can save on car insurance, when site visitors click they had better land on a page with lots of green, probably a gecko, and a form not designed by the actuaries. They also recommended that when choosing a page on which to focus testing efforts, you start from the confirmation page and walk backwards until you reach the first page with a big drop in transmission. Reducing bailout on that page will help you more than anything. They recommended using a matrix in which each test idea receives a rating from 1-5 for its ease of implementation and its perceived probable business impact, then sort the tests in each site area in descending order and prioritize those at the top. They insisted that we do some sort of physical wireframe drawing, no matter how ugly, to illustrate each change, and that we write a formal business hypothesis to explain what we think each tested change will do.
  20. The Eisenbergs were associated with Monetate… Jeffrey sat on its board, and when Adobe found out that we had signed a contract with Monetate, despite already using Adobe’s Test & Target, they hauled out their biggest gun to get us back to exclusivity. Their biggest gun was a statistician/coder/genius named Andrew Anderson, whose testing philosophy could not have been more different. Andrew said, and I’m paraphrasing, “I don’t know your customers, I probably don’t like them, and I certainly don’t presume to know what’s going on in their heads. All I want to watch is the ring of the cash register.” Andrew refrained from complicated prioritization exercises and any test involving subtle changes to text, as when you simplify legal terms and conditions, preferring to make sweeping and sometimes unattractive changes to existing elements on the screen. He would simultaneously multivariate test various combinations of characteristics, such as the sizes, shapes and colors of buttons, following up by pitting the winners against each other in univariate tests. He was untroubled if the results obtained by testing didn’t persist in production. “Customer preferences change over time.” For these reasons, Andrew also avoided “over-reporting,” preferring to read out a maximum of five metrics and not relentlessly drilling into intra-path movements.
  21. Travelocity absorbed both of these philosophies and created a best-of-breeds hybrid model that served us well during a period of heavy testing. We allowed Andrew to perform some of his multivariate tests, but for the most part used the Eisenbergs’ 5 x 5 x 5 method to rank tests within site areas. We tried to understand customer intent so as to be able to articulate why a real customer would prefer the winning variant. We mostly stuck to univariate testing because it’s hard to get to statistical confidence quickly with lots of variants in the test, and that was one of our priorities. Contamination of tests by other tests is a big problem in theory, but a white paper by a former Microsoft employee suggests that it’s not a big problem in practice except for same-page test conflicts, so we adopted that philosophy and chose to expose ourselves to some risk of test contamination. If Andrew noticed that a specific variant performed terribly, he would take that as a sign that he needed to test its opposite. If big black buttons performed terribly, he would reflexively follow up with a test of tiny white buttons. We wouldn’t necessarily make such follow-on tests a priority. You hear the word “micro-conversion” a lot in the testing world and now also in social media anytime a customer performs a useful-seeming action which shows no direct correlation to actual conversion. The Eisenbergs loved these and adopted an incrementalist approach to test success. Andrew and I both thought they were useless. Travelocity never came to a consensus. The Andrew method tended to change around things which were already on the page, while the Eisenberg method was more visionary and could lead you to construct whole pages for a test which hadn’t even been thought of yet. For that reason, the Andrew method was more likely to reveal local-maximum optimizations while the Eisenbergs brought occasional big wins. We did some of both and had a 1-per 7-to-10 average hit rate. Our operations research guys were adamant that we needed to precalculate when the tests end and not just end tests when we liked the results. This put us at odds with, and far ahead of, the entire A/B testing industry. Both of the consulting teams’ methods left them vulnerable to statistical errors.
  22. So how did we implement this hybrid philosophy in the real world?
  23. Our first and most important choice was to implement an online test database. This gave us a single image of all tests which had been suggested, prioritized and performed, where everyone in the company could: review existing test suggestions before submitting their ideas; check back for status without having to send e-mail messages to the test team; read our comments on completed tests and our ongoing hypotheses about each page of the site; and understand the evolution of each page and section of the site over time as successful test variants were implemented. This one decision improved the clarity of our communications while giving us the bandwidth to focus on testing vs. test communication.
  24. We put out a general call with a link to the interface and a request to submit their ideas after quickly searching to make sure the idea had not yet been submitted. The form is simple but we do request that the submitter: rate the predicted business impact and, if the submitter is a developer, how hard it would be to do; and submit an illustration of the change, which most often takes the form of a screen-captured .jpg with crude drawings in Microsoft Paint.
  25. Each week we would host a prioritization meeting during which we would: review recent test results and decide how they impacted our thinking; re-rank all test suggestions in each path in descending order of benefit * ease of implementation; then slot the prioritized tests for each site section with allowances for traffic volumes for significance and development/test resource required. In practice, we would have a test schedule set for three months out, and make adjustments if new results or test suggestions forced changes.
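The ranking step can be sketched in a few lines of Python, assuming each suggestion carries a 1-5 rating for expected impact and a 1-5 rating for ease of implementation, as on the prioritization slide. The `Idea` class, its field names and the sample ratings are hypothetical, not the actual database schema.

```python
from dataclasses import dataclass


@dataclass
class Idea:
    name: str
    site_section: str
    impact: int  # expected business impact, 1 (low) to 5 (high)
    ease: int    # ease of implementation, 1 (hard) to 5 (easy)

    @property
    def score(self) -> int:
        # Projected impact multiplied by ease of implementation.
        return self.impact * self.ease


def rank_ideas(ideas):
    """Group ideas by site section, each group sorted by descending score."""
    ranked = {}
    for idea in sorted(ideas, key=lambda i: i.score, reverse=True):
        ranked.setdefault(idea.site_section, []).append(idea)
    return ranked


ideas = [
    Idea("Number the search steps", "home", impact=4, ease=4),
    Idea("Rename Flexible Dates tab", "hotel", impact=3, ease=5),
    Idea("Redesign search widget", "home", impact=5, ease=1),
]
for section, section_ideas in rank_ideas(ideas).items():
    print(section, [(i.name, i.score) for i in section_ideas])
```

A plain product treats a hard, high-impact idea the same as an easy, modest one, so the meeting still has to apply judgment before slotting tests.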
  26. As business owners we want the test to be, as Charlie Sheen used to say, “A violent torpedo of Truth.” But it’s not. It’s a best guess about a characteristic of a population from a subsample, and it is vulnerable to errors. 1) The “confidence” we specify in the sample size formula is our confidence that we will not say there’s a difference between variant and control when there really is no difference. This is the main error we’re concerned about in a business that’s generally running successfully. 2) The other type of error is that of missing a difference when there is a difference. If we are testing cautiously, we are more accepting of missing good variants than of embracing bad ones.
  27. The formula that governs how long a test needs to run in order to get to statistical confidence looks scary but is actually very simple. The minimum number of site visitors you need is equal to: two, times; the square of the Z statistic that corresponds to the confidence you want, which is 1.96 for 95% confidence, times; the current conversion rate for the tested page, which is the ratio of successful completions to visitors, times (1 - the conversion rate); all divided by the square of (the conversion rate times the minimum amount of change you want to be able to detect). If you think the impact will be obvious, you can set this to 10% and have a much shorter test, but if the real impact is only 5%, this shorter test may miss it. The number that results is the minimum number of visitors you need for the control and each test variant. The conversion rate is a property of the site, but you can play around with the change and confidence variables (as well as the number of test variants) to see what impact they will have on the minimum test length.
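The formula from the slide can be written as a small Python function. This is a sketch under stated assumptions: the Z value is taken as two-sided (which gives about 1.96 at 95% confidence, matching the note), the change is a relative lift, and the parameter names are illustrative. The formula as given has no separate term for the risk of missing a real difference.

```python
import math
from statistics import NormalDist


def sample_size_per_variant(conversion: float, change: float,
                            confidence: float = 0.95) -> int:
    """Minimum visitors needed for the control and for each variant.

    conversion: baseline conversion rate of the page, e.g. 0.05 for 5%
    change:     smallest relative lift to detect, e.g. 0.10 for 10%
    confidence: two-sided confidence level, e.g. 0.95
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n = 2 * z ** 2 * conversion * (1 - conversion) / (conversion * change) ** 2
    return math.ceil(n)


# Hypothetical page converting at 5%: compare a 10% and a 5% detectable lift.
print(sample_size_per_variant(0.05, 0.10))
print(sample_size_per_variant(0.05, 0.05))
```

Because the change is squared in the denominator, halving the detectable lift roughly quadruples the visitors required, which is why the choice of sensitivity dominates test length.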
  28. Other factors also influence the minimum length of time you can run the test: 1) Shoppers often display different conversion behaviors at different times of day and days of week, and those conversion behavior differences may not distribute themselves equally among all variants of the test. In order to eliminate this as a factor, stop your test on a multiple of seven days; 2) Know the average length of time between when a visitor first visits your site to research a purchase and when the purchase is made. The objectively best variant may have beneficial conversion effects which only manifest over time, by bringing someone back to the site to transact days after the first visit. Make sure your test runs long enough to allow this effect to be recorded. 3) Even the best interface changes risk jarring repeat visitors who may become disoriented by the dissonance between their memory of the site and what you’re showing now. Some testers compensate for this effect by only letting new visitors into the test population, but since repeat business matters, your repeat site visitors definitely deserve a vote. It’s probably best to just note the difference between behaviors of the new and repeat visitors for each questionable test variant.
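The first two considerations can be folded into a simple length calculation: work out how many days the traffic needs to fill every arm, pad for the shop-to-book gap, and round up to whole weeks. Treating the incubation period as a flat number of extra days is an assumption of this sketch, and the function and parameter names are hypothetical.

```python
import math


def planned_test_days(visitors_per_variant: int, variant_count: int,
                      daily_visitors: int, incubation_days: int = 0) -> int:
    """Days to fill every arm, padded and rounded up to a multiple of 7."""
    arms = variant_count + 1  # the test variants plus the control
    days_for_sample = math.ceil(visitors_per_variant * arms / daily_visitors)
    padded = days_for_sample + incubation_days
    return math.ceil(padded / 7) * 7


# Hypothetical page: 14,598 visitors per arm, three variants, 5,000 visitors a day.
print(planned_test_days(14598, 3, 5000))
print(planned_test_days(14598, 3, 5000, incubation_days=7))
```

Rounding up rather than to the nearest week keeps the sample at or above the statistical minimum while still giving every day of the week equal weight.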
  29. Once your prioritization is signed off on, you should make sure the test variants are coded and properly tested, then announce to the entire organization which tests are beginning (and ending) with particular attention to the Site Health and Help Desk people, so that they know what to do if people start calling in site errors.
  30. The most important thing to do during the test is to not report out results, or comment on the interim results that other people with access to the testing tool mention.
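A small simulation can illustrate why interim results are risky to act on. The sketch below runs A/A tests, where control and variant truly convert at the same rate, and compares how often a "significant" difference shows up at the planned end against how often it shows up at any interim look. The two-proportion z-test, the 10% conversion rate and the other defaults are assumptions chosen for illustration, not the procedure described in the talk.

```python
import math
import random


def z_score(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    """Two-proportion z statistic using a pooled conversion rate."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return 0.0 if se == 0 else (conv_b / n_b - conv_a / n_a) / se


def false_positive_rates(runs=200, visitors=1000, looks=10, rate=0.10, seed=7):
    """Return (rate judged only at the planned end, rate with peeking allowed)."""
    rng = random.Random(seed)
    z_crit = 1.96  # two-sided 95% confidence
    step = max(1, visitors // looks)
    at_end = at_any_look = 0
    for _ in range(runs):
        conv_a = conv_b = 0
        peeked_significant = False
        for n in range(1, visitors + 1):
            conv_a += rng.random() < rate  # control visitor converts or not
            conv_b += rng.random() < rate  # variant visitor, same true rate
            if n % step == 0 and abs(z_score(conv_a, n, conv_b, n)) > z_crit:
                peeked_significant = True
        final_significant = abs(z_score(conv_a, visitors, conv_b, visitors)) > z_crit
        at_end += final_significant
        at_any_look += peeked_significant or final_significant
    return at_end / runs, at_any_look / runs


end_rate, peek_rate = false_positive_rates()
print(f"judged at planned end: {end_rate:.1%}; judged at any look: {peek_rate:.1%}")
```

The peeking rate can never be lower than the end-of-test rate, and with repeated looks it typically comes out noticeably higher, which is the argument for fixing the end point in advance.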
  31. After a test concludes: copy the results from the test tool into the database; submit conclusions (whether or not any of the variants ran to confidence), anything that could cast doubt on the test’s accuracy, and any recommendations for re-testing, particularly of subgroups of the existing test; decide whether the results are interesting or controversial enough to warrant a meeting among the stakeholders; publish your recommendations to the appropriate site/product governance authority; and periodically check back within your site monitoring tool (if it permits segmentation by test variant membership) to see whether test-witnessed behavioral changes are permanent.
  32. There are a few other topics that should be mentioned.
  33. An alternative to A/B testing is what’s called the “multi-armed bandit” method because it imagines many variations (each represented by a slot machine) which can all be explored. A/B testing assumes that there’s one right answer that will be durable over time, and we should invest in finding it with certainty. This up-front investment is costly in poor performance. Multi-armed banditry devotes most resources to the top-performing variants but keeps some “tries” back for reconnaissance purposes, in essence never stopping testing.
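One simple bandit strategy that matches this description is epsilon-greedy: send most visitors to the variant with the best observed conversion and a small fixed share to a randomly chosen variant. The talk does not name a specific algorithm, so this choice, the 10% exploration share and the simulated conversion rates are all assumptions of the sketch.

```python
import random


class EpsilonGreedyBandit:
    """Mostly exploit the best-looking variant, but keep exploring a little."""

    def __init__(self, variants, epsilon=0.1, seed=None):
        self.epsilon = epsilon  # share of traffic reserved for reconnaissance
        self.rng = random.Random(seed)
        self.views = {v: 0 for v in variants}
        self.successes = {v: 0 for v in variants}

    def rate(self, variant):
        views = self.views[variant]
        return self.successes[variant] / views if views else 0.0

    def choose(self):
        # Explore: occasionally send the visitor to a random variant.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(list(self.views))
        # Exploit: otherwise use the variant with the best observed rate.
        return max(self.views, key=self.rate)

    def record(self, variant, converted):
        self.views[variant] += 1
        self.successes[variant] += int(converted)


true_rates = {"control": 0.05, "variant_b": 0.08}  # hypothetical, for simulation only
bandit = EpsilonGreedyBandit(true_rates, epsilon=0.1, seed=42)
sim = random.Random(0)
for _ in range(5000):
    shown = bandit.choose()
    bandit.record(shown, sim.random() < true_rates[shown])
print(bandit.views)
```

Unlike a fixed-length A/B test, the loop has no stopping point: if shopper preferences drift, the exploration share gives a lagging variant a chance to be noticed again.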
  34. There are a few other topics that should be mentioned.