Lean Startup has become the default methodology for building successful online products, but the conditions change as companies grow. Through examples of real experiments, Hilary—a Senior Product Manager at Skyscanner—explores some of the challenges of experimentation at scale, and reveals how to avoid some of the negative side effects of being overly data-driven.
This deck was presented as part of Canvas Conf 2016 http://canvasconference.co.uk/
11. hypothesis
Based on the insight that travellers
disproportionately choose certain holiday
destinations, we predict that ordering
destinations by popularity instead of price
will cause more people to convert.
@hilcsr
12. two possible experiment outcomes
successful: the hypothesis was correct
failed: the hypothesis was incorrect
14. case study No. 1
Based on the insight that active users of our
flights product are also likely to like our hotels
product, we predict that landing returning
travellers on our hotels homepage instead of
our flights homepage will cause more
travellers to use both products.
15. case study No. 1
20% increase in travellers booking both flight and hotel
Negligible impact on overall flights metrics
16. Skyscanner you are really annoying me with your hotels
and car hire options! You are not called HotelScanner or
CarHireScanner. You are Sky as in aeroplanes!
If you want to diversify into other services then you
should have thought of that before calling your website
SKYscanner. Imagine Compare The Market had limited
themselves by calling it Compare The Car Insurance
Market!
They can now ease into the Travel Insurance market with
no issue because they didnt specify which market they
are comparing. You cant! and please stop defaulting me to
Hotels I'm here for flights!!
17. case study No. 1
20% increase in travellers booking both flight and hotel
Negligible impact on overall flights metrics
…but some complaints and added friction for >95% of travellers
bottom line
Achieved the desired impact, but qualitatively
was not the experience we wanted to provide.
18. We could not measure the downside in the experiment.
It never could have failed.
possible experiment outcomes
invalid: not possible to fail
successful: the hypothesis was correct
failed: the hypothesis was incorrect
19. invalid tests
Observed outcome: surprisingly one-sided results.
Characterised by inability to measure the upside/downside.
Typically occur when you’re in love with your idea.
Can be avoided by taking hard decisions up-front.
Trigger: “If we can’t agree, why don’t we just test it?”
20. case study No. 2
Based on the insight that our “Aha!”
moment is when travellers conduct their
first search on Skyscanner, we predict that
adding search controls to our travel articles
will cause more readers to become active
users of our product.
22. case study No. 2
Negligible increase in percentage of travellers who did a search.
Negligible difference between the variants.
bottom line
Our solution didn’t match the context.
Why did we run this test?
23. The method we chose was not capable of having the impact we wanted.
It never could have succeeded.
possible experiment outcomes
invalid: not possible to fail
successful: the hypothesis was correct
failed: the hypothesis was incorrect
flailed: not possible to succeed
24. flailed tests
Observed outcome: Zilch.
Characterised by mismatch between desired impact and
proposed solution.
Typically occur when over-focusing on ‘MVP’.
Can be avoided through broader brainstorming.
Trigger: “It’s quick and easy. Let’s just try it.”
25. four possible experiment outcomes (not two)
invalid: not possible to fail
successful: the hypothesis was correct
failed: the hypothesis was incorrect
flailed: not possible to succeed
26. “Any time a team attempts to justify
its failures by resorting to learning as
an excuse, it is engaged in
pseudoscience….
We cannot afford to breed a new
pseudoscience around pivots, MVPs,
and the like.”
The Lean Startup (page 279)
27. We cannot afford to breed a new
pseudoscience around being data
driven.
28. Invalid or flailed experiments can
look and feel data driven, but they’re
pure waste.
29. We’re improving our experiment design through
experiment and hypothesis templates,
thought experiments and a culture of peer-review.
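One way an experiment template with a built-in peer-review check might look is sketched below. This is an illustration only, not Skyscanner's actual template: the class `ExperimentPlan`, its field names, and the `preflight` function are all hypothetical. The idea is to flag a design as "invalid" when no downside metric is defined, and as "flailed" when even the team's best-case estimate from a thought experiment falls short of the impact they want.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ExperimentPlan:
    """Hypothetical experiment template; all field names are illustrative."""
    insight: str
    change: str
    success_metric: str
    # Metrics that would reveal harm (the downside); empty means unmeasured.
    guardrail_metrics: List[str] = field(default_factory=list)
    # Uplift needed for the result to matter, as a fraction (0.05 = 5%).
    target_uplift: float = 0.0
    # Honest best-case estimate from a thought experiment (an assumption).
    best_case_uplift: float = 0.0


def preflight(plan: ExperimentPlan) -> str:
    """Classify a design before launch: 'invalid', 'flailed' or 'testable'."""
    if not plan.guardrail_metrics:
        # No way to observe the downside, so the test cannot fail.
        return "invalid"
    if plan.best_case_uplift < plan.target_uplift:
        # Even the best case misses the target, so the test cannot succeed.
        return "flailed"
    return "testable"


# Case study No. 1 as described in the deck, with no downside metric defined.
hotels = ExperimentPlan(
    insight="active flights users are likely to like hotels",
    change="land returning travellers on the hotels homepage",
    success_metric="travellers booking both flight and hotel",
)
print(preflight(hotels))
```

A check like this cannot replace judgement; it only forces the hard questions to be asked before the test runs rather than after.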
30. To avoid waste, we must balance both
science and sensibility.
31. More about what we’re learning at Skyscanner
http://codevoyagers.com
UX comic library from
@steve_cable
Hilary Roberts | @hilcsr
Thank you