7. Science Engineering Product
• What we invest in
• How we structure and integrate the team
• What deliverables we expect
Actions and Reactions
8. Types of Problems
Existing product, existing solution | Existing product, new solution
New product, existing solution | New product, new solution
Science Engineering Product
10. Optimizing Existing Product
Good thing:
• well-defined and controlled environment
Bad thing:
• integration with existing infrastructure
Case N1. Replacing heuristics in an existing product with ML
18. Production Pipeline
[Diagram: production pipeline. Recoverable labels:]
• Training Component (AWS CF + AWS Data Pipeline)
• Serving Component
• Data Collection
• Apache Kafka
• Data Archive: AWS S3
• Data Querying: AWS Athena
• Pre-processing
• Training Data: 7 recent days
• Validation Data: 5% of the last day
• Model Training: scikit-learn
• Model Validation: Passed? Update Model / Report Failure
• Current Model
• Experiments with Challenger Model
• Skyscanner Traffic: 5%, 5%, 90%
ECMLPKDD’2018, https://arxiv.org/pdf/1812.01735.pdf
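The retraining loop on this slide can be sketched in a few lines. This is a minimal illustration and not the pipeline from the talk. It rests on several assumptions that the slide does not confirm: that the 5% validation sample is held out of training, that "Passed?" means the new model scores at least as well as the current one, and that the 5% / 5% / 90% traffic split maps to a challenger, a holdout, and the current model. All function names, bucket names, and the date are hypothetical.

```python
import hashlib
import random
from datetime import date, timedelta


def split_windows(records, today, train_days=7, validation_share=0.05, seed=0):
    """Split (day, payload) records into training and validation sets.

    Assumption: the validation sample (about 5% of the last day) is held
    out of the training set, which covers the 7 most recent days.
    """
    rng = random.Random(seed)
    last_day = today - timedelta(days=1)
    first_day = today - timedelta(days=train_days)
    train, validation = [], []
    for day, payload in records:
        if not first_day <= day <= last_day:
            continue  # outside the training window
        if day == last_day and rng.random() < validation_share:
            validation.append((day, payload))
        else:
            train.append((day, payload))
    return train, validation


def validation_gate(candidate_score, current_score, tolerance=0.0):
    """Decide the pipeline action; assumes a higher score is better."""
    if candidate_score >= current_score - tolerance:
        return "update_model"
    return "report_failure"


def traffic_bucket(user_id, shares=(("current", 90), ("challenger", 5), ("holdout", 5))):
    """Map a user id to a traffic bucket by hashing it into slots 0..99.

    Assumption: the bucket names and their order are illustrative only.
    """
    slot = int(hashlib.sha256(user_id.encode("utf-8")).hexdigest(), 16) % 100
    upper = 0
    for name, share in shares:
        upper += share
        if slot < upper:
            return name
    raise ValueError("shares must sum to 100")


# Hypothetical data: 10 days of records, 100 per day.
today = date(2018, 9, 10)
records = [(today - timedelta(days=d), i) for d in range(1, 11) for i in range(100)]
train, validation = split_windows(records, today)
```

Hashing the user id keeps each user in the same bucket across requests, which makes the comparison between the current model and a challenger easier to interpret.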
22. Iterating over Existing Algorithm
Good thing:
• return on infrastructure investments
Bad thing:
• possibly limited impact
Science Engineering Product
Case N2. Iterating over an existing ML algorithm in an existing product
25. Building a New Product
Good thing:
• fewer dependencies
Bad thing:
• high level of uncertainty
Science Engineering Product
Case N3. Building the first version of a new data product
26. Managing Uncertainty
Levels of uncertainty:
– Is the position right?
– Is the user flow right?
– Is the message right?
– Is the design right?
– Is the algorithm right?
– What is the baseline?
– etc. etc.
27. Managing Uncertainty
+30% engagement
28. Managing Uncertainty
Skyscanner Backpack
29. Managing Uncertainty
Start really simple
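One possible reading of "Start really simple", taken together with "What is the baseline?", is to ship a trivial non-ML baseline before any learned model, so that later gains have something to be measured against. The sketch below is a hypothetical most-popular baseline. The function name, the data, and the item codes are illustrative and do not come from the talk.

```python
from collections import Counter


def most_popular_baseline(interactions, k=3):
    """Return the k items chosen most often across all users.

    interactions is an iterable of (user, item) pairs. Every user gets the
    same list, which is what makes this a baseline and not a model.
    """
    counts = Counter(item for _user, item in interactions)
    return [item for item, _count in counts.most_common(k)]


# Hypothetical interaction log.
interactions = [
    ("u1", "BCN"), ("u2", "BCN"), ("u3", "LIS"),
    ("u1", "LIS"), ("u2", "BCN"), ("u4", "ROM"),
]
top_two = most_popular_baseline(interactions, k=2)
```

A baseline like this takes minutes to build, which leaves room to test the other uncertainties on the slide (position, user flow, message, design) before investing in the algorithm.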
30. Managing Uncertainty
Study previous experience
31. Case N4: New Science in New Products
Existing product, existing solution | Existing product, new solution
New product, existing solution | New product, new solution