Decoding ML/AI
A Practical Guide For Startups To
Drive Growth And Innovation
Denver Startup Week, 2023
PRESENTING SPONSOR
TITLE SPONSORS
TRACK SPONSORS
HEADLINE SPONSORS
PARTNER SPONSORS
MEMBER SPONSORS
- CBRE
- Colorado Sun
- The Commons on Champa
- DAT Software and Analytics
- Denver Place
- Expansive Workspace
- Greenspoon Marder
- Halliburton Labs
- Jake Jabs Center for Entrepreneurship
- Molson Coors
- MSU Denver
- Park Central
- Polsinelli
- Tea with Tae
- Bounce Back
- Caruso
- Credera
- Doyle Group
- Industrious
- Initial Capacity
- MLJ Insurance
- Nexton
- Spectrum
- WeWork
© 2022 Ibotta, Inc. Proprietary and confidential, not to be shared without Ibotta’s express consent.
Argie Angeleas
Group Product Manager
- @ibotta
Taylor Names
Principal ML Engineer
- @ibotta
Matt Reynolds
Principal ML Platform
Engineer - @ibotta
Gartner Predicts That 85%
Of ML Projects Will Fail
Denver, CO - 2023
Edtech Startup
People convert to registered users, BUT
they don’t watch courses
Course CTR (click-through rate) stayed flat
after a strong start
Product
Where To Start?
3 Important Questions
Understand The Problem. Remove Ambiguity. But How?
› What is the problem we are trying to solve?
› Are there business constraints?
› What does a successful outcome look like?
Let The Fun Begin, Step 0
Define The Problem Spaces
People can’t find relevant coursework and therefore they
are not starting courses
People find relevant courses, but the content of each
course does not meet their expectations
People can’t understand how to use our website, and
that’s why they are not starting courses
Great, But What’s Next?
Peel the onion
Is this an ML problem? What is the project value?
What is the business expectation?
Let The Fun Begin Vol. 1
Define The Hypotheses
Secret #1: Don’t Do It In A Dark Room On Your Own
Involve Subject Matter Experts
(Business, Engineering, Data Science, Product)
Define hypotheses that can solve the problem
ML is different from traditional software - it is
hypothesis-based
Let The Fun Begin Vol. 2
Ideate Solutions
Earn trust through outcomes
Let The Fun Begin Vol. 3
Define Your Metrics
Secret #2: Don’t do it in a dark room on your own
Success Metric:
Click through rate
Guardrail Metrics:
› First course completion rate
› Sponsored content click through rate
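To make the metric definitions concrete, here is a minimal sketch of computing the success metric and the two guardrail metrics from an event log. The event schema and field names are assumptions for illustration, not the deck's actual data model:

```python
# Hypothetical event log: one dict per course impression.
events = [
    {"user": "u1", "clicked": True,  "completed_first_course": True,  "sponsored": False},
    {"user": "u2", "clicked": False, "completed_first_course": False, "sponsored": False},
    {"user": "u3", "clicked": True,  "completed_first_course": False, "sponsored": True},
    {"user": "u4", "clicked": True,  "completed_first_course": True,  "sponsored": True},
]

def rate(events, predicate):
    """Fraction of events satisfying `predicate`."""
    if not events:
        return 0.0
    return sum(1 for e in events if predicate(e)) / len(events)

# Success metric: overall click-through rate.
ctr = rate(events, lambda e: e["clicked"])

# Guardrail metrics: watch these for regressions while optimizing CTR.
first_completion_rate = rate(events, lambda e: e["completed_first_course"])
sponsored_events = [e for e in events if e["sponsored"]]
sponsored_ctr = rate(sponsored_events, lambda e: e["clicked"])

print(ctr, first_completion_rate, sponsored_ctr)  # 0.75 0.5 1.0
```

The point of splitting metrics this way: an experiment "wins" only if the success metric moves up while the guardrails hold steady.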
Data Science
Your First Model Will Fail
Before model experimentation starts, there is uncertainty around how effectively the business
problem can be solved
› Should we expect a 2% lift in click through rate or 20%?
Uncertainty around which data will be predictive
› Example hypothesis: student’s geographic location will help predict relevant courses
› Many hypotheses won’t pan out!
Model experimentation process should be optimized for rapid iteration
› Input data + modeling approach = success?
› Don’t over polish a solution that isn’t guaranteed to succeed
How Will I Know If My Model Works?
In A Perfect World, Get Model Predictions In Front Of Real Customers & Measure Business
Impact!
A/B Testing: split customers between treatment (ML powered) and control (status-quo)
Cons:
› Large engineering investment (reliable production-ready solution)
› Risk to business (can we afford to see a significant drop in business metrics in our test population?)
› Slow and difficult to get conclusive results (if you have a small user base)
Is there a lower cost and lower risk alternative for rapid model evaluation?
Yes! Evaluate model performance on historical data
› Commonly referred to as offline testing or backtesting
Estimating Performance On Historical Data
01 Can you directly measure the business impact metric with different model
approaches on past data?
› If yes, do this!
02 What if it is infeasible to measure the business metric on historical data?
› Are there proxy metrics that are well correlated with the critical business impact metrics?
• Example: click-through rate -> how high did the model rank the items that were clicked?
03 Remember to compare performance against a reasonable heuristic!
› Does my model generate more clicks than content sorted in order of general popularity?
04 Don’t over-optimize “offline” performance. Get in front of real users!
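The proxy metric described above ("how high did the model rank items that were clicked") is often computed as mean reciprocal rank. Here is a minimal sketch comparing a stand-in model against the popularity heuristic on hypothetical historical sessions; the course names and both rankers are made up for illustration:

```python
def mean_reciprocal_rank(sessions, ranker):
    """Average 1/rank of the clicked course, where `ranker`
    maps a session to an ordered list of course ids."""
    total = 0.0
    for session in sessions:
        ranking = ranker(session)
        rank = ranking.index(session["clicked"]) + 1  # 1-based position
        total += 1.0 / rank
    return total / len(sessions)

# Hypothetical historical sessions: candidates shown, and which was clicked.
sessions = [
    {"candidates": ["intro", "stats", "ml"], "clicked": "ml"},
    {"candidates": ["intro", "stats", "ml"], "clicked": "stats"},
    {"candidates": ["intro", "stats", "ml"], "clicked": "ml"},
]

# Baseline heuristic: rank by overall popularity (a fixed order here).
popularity = lambda s: ["intro", "stats", "ml"]
# Stand-in "model": here it simply reverses the popularity order.
model = lambda s: ["ml", "stats", "intro"]

print(mean_reciprocal_rank(sessions, popularity))  # clicked items ranked low
print(mean_reciprocal_rank(sessions, model))       # clicked items ranked high
```

If the model's MRR on historical clicks does not beat the popularity heuristic's, there is no reason to pay the engineering cost of an online test yet.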
What Does It
Take To Build An
Accurate Model?
Clean, representative, and predictive input (training) data
› Clean: outliers and erroneous data need to be handled with care
› Representative: are the business scenarios where the model
will add value represented in your training set?
› Predictive: talk with SMEs about what data should be sourced
and transformed
The right model for the right problem
Rules of thumb:
› Tabular data: ensembles of decision trees (XGBoost, LightGBM,
random forest)
› Text/computer vision/audio: open-source pre-trained
neural networks
Model training is less than 10% of the total project effort!
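A minimal sketch of the "clean" step: drop erroneous rows and clip outliers before training. The field name and bounds are hypothetical, chosen only to illustrate the pattern:

```python
def clean_rows(rows, field, lo, hi):
    """Drop rows with a missing/erroneous `field`; clip outliers into [lo, hi]."""
    cleaned = []
    for row in rows:
        value = row.get(field)
        if not isinstance(value, (int, float)):
            continue  # erroneous: drop rather than guess a value
        clipped = min(max(value, lo), hi)  # clip outliers with care
        cleaned.append(dict(row, **{field: clipped}))
    return cleaned

rows = [
    {"user": "u1", "minutes_watched": 12},
    {"user": "u2", "minutes_watched": None},    # erroneous: missing value
    {"user": "u3", "minutes_watched": 90_000},  # outlier: implausibly large
]
print(clean_rows(rows, "minutes_watched", lo=0, hi=600))
```

Whether to drop, clip, or impute is a per-field judgment call; the point is that each decision is explicit and reviewable, not buried in a notebook.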
Is My Model Ready For The Big Leagues?
Am I reasonably confident that my
model will add value & do no harm?
Outperforms a heuristic according to
business metric or reasonable proxy metric
Was my methodology of
evaluation sound?
Always peer review
code/analysis/conclusions!
Is my model able to be reasonably
supported in a production environment?
Overly complex models can be difficult to
productionize and expensive to maintain
If so, SHIP IT!
Machine Learning Infrastructure
OK, I Have A Model, Now What?
You Need To Figure Out The Best Way To Get It Into Production
1
2
3
4
How “fresh” does the
model need to be?
What’s the simplest way to get
the results?
How often do you need to
perform predictions?
Who do you need to
create predictions for?
If you haven’t been talking to the engineers who will use the results, start now
Start with batch predictions
Starting With Batch Predictions Allows You To Avoid All This:
01 Real-time feature store
(features = data used by the model to generate predictions)
02 Model serving code
03 Additional infrastructure
04 Latency concerns
05 Resiliency issues
Follow Basic
Engineering Best
Practices
Software Engineering Teams (Should!) Do This, Copy Them
› Use source control
› Have others review changes
› Be consistent – use naming conventions
› Parameterize for different environments/versions
› Provide a defined interface for your model predictions
› Reduce manual input in pipelines – aim for CI/CD
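Two of the practices above (environment parameterization and a defined prediction interface) can be sketched in a few lines. Every name here is illustrative: the env variable, paths, table names, and the placeholder prediction logic are assumptions, not the deck's actual system:

```python
import os

def load_config():
    """Parameterize by environment instead of hard-coding paths and tables."""
    env = os.environ.get("APP_ENV", "dev")  # hypothetical env variable
    return {
        "dev":  {"model_path": "s3://models/dev/latest",  "output_table": "predictions_dev"},
        "prod": {"model_path": "s3://models/prod/latest", "output_table": "predictions"},
    }[env]

def predict_courses(user_id: str, limit: int = 5) -> list[str]:
    """Defined interface: callers depend on this signature,
    never on model internals or file layouts."""
    # Stand-in for real model inference: placeholder course ids.
    return [f"course_{i}" for i in range(limit)]

config = load_config()
print(config["output_table"], predict_courses("u1", limit=3))
```

Because consumers only see `predict_courses`, you can swap the model, the feature set, or the storage layout without breaking any downstream team.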
What Could This Look Like?
Source Control:
› Consider data scientist model work (generally Jupyter notebooks)
› Scripts for model data generation, training, batch predictions, and storing results
Cloud Storage:
› Standardized model artifact naming; folders by environment and training date
Compute Instance/Serverless:
› Run scripts on a regular schedule or some automated trigger
Database:
› Prediction results, including a model version column
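Here is a minimal sketch of the database piece: a batch job writing prediction results with a model version column, shown with an in-memory SQLite database. The table schema, version string, and scores are assumptions for illustration:

```python
import sqlite3

MODEL_VERSION = "2023-09-18-v1"  # e.g. training date + revision

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE predictions (
        user_id TEXT,
        course_id TEXT,
        score REAL,
        model_version TEXT
    )
""")

# Batch job output: a score for each (user, course) pair.
batch = [("u1", "ml", 0.91), ("u1", "stats", 0.40), ("u2", "intro", 0.77)]
conn.executemany(
    "INSERT INTO predictions VALUES (?, ?, ?, ?)",
    [(u, c, s, MODEL_VERSION) for u, c, s in batch],
)

# The version column lets you attribute outcomes to specific model
# iterations after the fact.
row = conn.execute(
    "SELECT COUNT(*) FROM predictions WHERE model_version = ?", (MODEL_VERSION,)
).fetchone()
print(row[0])  # 3
```

The same pattern applies unchanged to a warehouse table: the version column is what makes later "did v2 beat v1?" analysis possible.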
How can I save time?
Avoid scope creep
(stick to your defined
goals)
Keep it simple
(don’t try to build real-time
out of the gate)
Use managed services
(don’t build everything from
scratch)
Get dedicated staff involved for
end to end support
(project should be
self-sufficient)
Follow best practices
(use standard model
approaches)
Use what you already have
(data engineering, service tools)
Avoid analysis paralysis
(beats baseline = start test)
Hire all-rounders
(no research-only data
scientists)
What Should I Avoid Cutting Corners On?
Get other teams involved early
(upstream and downstream)
Don’t skimp on automation
Regularly review
(both internal for design/code and stakeholders for progress)
Provide an API interface
(reduce coupling)
Basic monitoring
(did training fail? Did my predictions job run?)
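The basic monitoring question ("did my predictions job run?") can be answered with a simple staleness check. This is a generic sketch; the 26-hour window for a daily job is an assumed threshold, not a recommendation from the deck:

```python
import time

def job_is_stale(last_run_epoch, max_age_hours=26):
    """True if the batch prediction job hasn't completed within
    its expected window (26h gives a daily job some slack)."""
    age_hours = (time.time() - last_run_epoch) / 3600
    return age_hours > max_age_hours

# A job that last ran 2 hours ago vs. one that ran 3 days ago.
print(job_is_stale(time.time() - 2 * 3600))   # False: ran recently
print(job_is_stale(time.time() - 72 * 3600))  # True: missed its schedule
```

Wire the `True` case to whatever alerting you already have (email, Slack, pager); the check itself is deliberately trivial so it never becomes its own maintenance burden.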
Collaboration works wonders → business <> engineering <> data science <> product
Utilizing a structured approach to problem definition increases the likelihood of success
Even simple models can create impact that gets the business raving
Conclusions
Thank You
Questions?
