11. What about the rest of us?
● Public solutions are lagging
○ Big Cloud providers aren’t providing end-to-end solutions
○ There is no enterprise solution that goes end-to-end
○ There is no widely-adopted open source solution
● The option set for the rest of us:
○ Buy pieces and combine
■ Requires engineering and money
■ Some pieces of infra didn’t have solutions: feature stores
○ Build
■ Requires engineering and may lead to tech debt with scale
12. Data is at the center of ML Infra
Pipeline stages: Extract Data, Build Features, Train Models, Monitor Models, Serve Models
● Connect to a range of data sources
● Monitor raw and transformed data; monitor for feature drift
● Collect and transform features for testing new model ideas
● Share model outputs as features in other models
● Cache production features for training and validation of point-in-time correctness
● Transform data consistently between inference and training
● Backfill historical features to test new ideas offline (Not easy)
● Validate raw and transformed data (types, ranges, etc.)
● Collect features for many subjects (users, devices, markets, etc.) (duh)
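As an illustration of the "validate raw and transformed data (types, ranges, etc.)" item, here is a minimal sketch in Python. The `FieldRule` class, the `validate_record` function and the field names are hypothetical and do not come from the slides; the sketch only shows the general idea of checking a record against declared types and numeric ranges.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass(frozen=True)
class FieldRule:
    """Hypothetical rule: an expected type plus an optional numeric range."""
    expected_type: type
    min_value: Optional[float] = None
    max_value: Optional[float] = None


def validate_record(record: Dict[str, Any], rules: Dict[str, FieldRule]) -> List[str]:
    """Return a list of problems; an empty list means the record passed."""
    problems: List[str] = []
    for name, rule in rules.items():
        if name not in record:
            problems.append(f"{name}: missing")
            continue
        value = record[name]
        if not isinstance(value, rule.expected_type):
            problems.append(
                f"{name}: expected {rule.expected_type.__name__}, "
                f"got {type(value).__name__}"
            )
            continue
        if rule.min_value is not None and value < rule.min_value:
            problems.append(f"{name}: {value} below minimum {rule.min_value}")
        if rule.max_value is not None and value > rule.max_value:
            problems.append(f"{name}: {value} above maximum {rule.max_value}")
    return problems


# Example rules for an assumed record layout.
RULES = {
    "sms_count": FieldRule(int, min_value=0, max_value=10_000),
    "country": FieldRule(str),
}
```

The same function can be run on raw inputs and on transformed outputs, each with its own rule set.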
13. 1. Start basic
2. Build (or buy) a Feature Service
3. Mature the pieces that are important to your business
15. Defining a Feature Service
Simple Definition: A service for computing and managing ML data
In order of importance…
1. Framework
○ Reusable code
○ Consistency
○ Ease of development
2. Computation Engine
○ Service that builds features
○ Backfills new features for old inferences
3. Cache
○ Stores derived features
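The three parts of that definition can be sketched in a few lines of Python. This is a toy, in-process illustration under stated assumptions, not the service described in the talk: `FeatureService`, `register`, `compute`, `backfill` and the `sms_per_day` feature are all hypothetical names, and a plain dictionary stands in for the cache.

```python
from typing import Any, Callable, Dict, Tuple

FeatureFn = Callable[[Dict[str, Any]], float]


class FeatureService:
    def __init__(self) -> None:
        # Framework: named, reusable feature definitions.
        self._registry: Dict[str, FeatureFn] = {}
        # Cache: (feature name, inference id) -> derived value.
        self._cache: Dict[Tuple[str, str], float] = {}

    def register(self, name: str) -> Callable[[FeatureFn], FeatureFn]:
        """Decorator that adds a feature definition to the registry."""
        def decorator(fn: FeatureFn) -> FeatureFn:
            self._registry[name] = fn
            return fn
        return decorator

    def compute(self, name: str, inference_id: str, raw: Dict[str, Any]) -> float:
        """Computation engine: build a feature once, then serve it from the cache."""
        key = (name, inference_id)
        if key not in self._cache:
            self._cache[key] = self._registry[name](raw)
        return self._cache[key]

    def backfill(self, name: str, history: Dict[str, Dict[str, Any]]) -> int:
        """Compute a feature for old inferences; return how many values were new."""
        added = 0
        for inference_id, raw in history.items():
            if (name, inference_id) not in self._cache:
                self.compute(name, inference_id, raw)
                added += 1
        return added


service = FeatureService()


@service.register("sms_per_day")
def sms_per_day(raw: Dict[str, Any]) -> float:
    return raw["sms_count"] / raw["days_observed"]
```

Registering definitions in one place is what gives training and inference the same transformation code; the cache and the backfill method then reuse those definitions.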
17. Life of a Feature
Diagram labels: Inference; Training (Model Iteration); Training (Feature Iteration); Feature Repository (DynamoDB); Write; Read; And for training
● Calculate and cache features in production
● Use cached features for model development
● And for testing new features
● Calculate features in production
● Train with new features and save them to the cache
● Validate point in time correctness by running training path on previously computed features
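The write, read and validate steps on this slide can be sketched as follows. This is a minimal illustration under assumptions: a dictionary stands in for the feature repository (DynamoDB in the slides), and `build_features`, `infer_and_cache` and `validate_point_in_time` are hypothetical names with a made-up event layout.

```python
from typing import Any, Dict, List, Tuple

# In-memory stand-in for the feature repository: (subject, timestamp) -> features.
Repository = Dict[Tuple[str, int], Dict[str, float]]
Event = Dict[str, Any]


def build_features(events: List[Event], as_of: int) -> Dict[str, float]:
    """Hypothetical transform: only events at or before `as_of` may be used."""
    visible = [e for e in events if e["ts"] <= as_of]
    return {
        "event_count": float(len(visible)),
        "total_amount": float(sum(e["amount"] for e in visible)),
    }


def infer_and_cache(repo: Repository, subject: str,
                    events: List[Event], now: int) -> Dict[str, float]:
    """Inference path: compute features and write them to the repository."""
    features = build_features(events, as_of=now)
    repo[(subject, now)] = features
    return features


def validate_point_in_time(repo: Repository,
                           events_by_subject: Dict[str, List[Event]]
                           ) -> List[Tuple[str, int]]:
    """Training path: recompute every cached row from the current event log
    and return the keys whose recomputed features disagree with the cache."""
    mismatches = []
    for (subject, ts), cached in repo.items():
        if build_features(events_by_subject[subject], as_of=ts) != cached:
            mismatches.append((subject, ts))
    return mismatches
```

An empty result means the training path reproduces what production saw. A mismatch points at data that arrived late or at a transform that changed since inference.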
18. Feature Definition
● Flexible methods for merge, join and concat
● Everything is built on ABCs with automated testing
● As flexible as Python
● Custom one-off transforms
● Features are built on versioned extracts and transforms
● Chain of transformations
● Multiple Features from a single extract
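The code shown on this slide is not part of the transcript, so the following is only a sketch of how versioned extracts and transforms could be built on Python abstract base classes (ABCs). Every class name here (`Extract`, `Transform`, `Feature`, `InMemorySmsExtract`, `KeepDirection`, `Count`) is hypothetical.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List, Sequence


class Extract(ABC):
    version: str = "1"

    @abstractmethod
    def run(self, subject_id: str) -> List[Dict[str, Any]]:
        """Fetch raw rows for one subject."""


class Transform(ABC):
    version: str = "1"

    @abstractmethod
    def apply(self, data: Any) -> Any:
        """Turn the output of the previous step into the next value."""


class Feature:
    """One extract followed by a chain of transforms."""

    def __init__(self, name: str, extract: Extract,
                 transforms: Sequence[Transform]) -> None:
        self.name = name
        self.extract = extract
        self.transforms = list(transforms)

    @property
    def version(self) -> str:
        # The feature version is derived from the versions of its parts.
        parts = [self.extract, *self.transforms]
        return "/".join(f"{type(p).__name__}@{p.version}" for p in parts)

    def compute(self, subject_id: str) -> Any:
        data: Any = self.extract.run(subject_id)
        for transform in self.transforms:
            data = transform.apply(data)
        return data


class InMemorySmsExtract(Extract):
    version = "2"

    def __init__(self, rows_by_subject: Dict[str, List[Dict[str, Any]]]) -> None:
        self.rows_by_subject = rows_by_subject

    def run(self, subject_id: str) -> List[Dict[str, Any]]:
        return list(self.rows_by_subject.get(subject_id, []))


class KeepDirection(Transform):
    def __init__(self, direction: str) -> None:
        self.direction = direction

    def apply(self, data: Any) -> Any:
        return [row for row in data if row["direction"] == self.direction]


class Count(Transform):
    def apply(self, data: Any) -> Any:
        return len(data)


# Multiple features from a single extract.
sms = InMemorySmsExtract(
    {"u1": [{"direction": "in"}, {"direction": "out"}, {"direction": "in"}]}
)
incoming_count = Feature("sms_incoming_count", sms, [KeepDirection("in"), Count()])
total_count = Feature("sms_total_count", sms, [Count()])
```

Because the base classes are abstract, a subclass that forgets to implement `run` or `apply` fails at construction time, which makes the pieces easy to cover with automated tests.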
19. Feature Definition
Defining Features
● Python is approachable and fast enough for our inference needs (<10s)
● Keeps it simple
Versions
● Easy to manage at our stage
● Consistent transforms
● Different versions for different models
Transforms
● Reusable!
● Organized: Filter, Map, Reduce
Testing
● Code works
● Production models don’t break
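One way to read "Organized: Filter, Map, Reduce" together with the versioning and testing points is sketched below. The helper names, the `total_repaid` feature and its two versions are assumptions made for illustration; the pinned-output check is one possible way to make sure a production model's feature version does not silently change.

```python
from functools import reduce
from typing import Any, Callable, Dict, List, Tuple

Row = Dict[str, Any]
Step = Callable[[Any], Any]


def filter_rows(predicate: Callable[[Row], bool]) -> Step:
    return lambda rows: [r for r in rows if predicate(r)]


def map_rows(fn: Callable[[Row], Any]) -> Step:
    return lambda rows: [fn(r) for r in rows]


def reduce_values(fn: Callable[[Any, Any], Any], initial: Any) -> Step:
    return lambda values: reduce(fn, values, initial)


def pipeline(*steps: Step) -> Step:
    """Run the steps in order, feeding each output into the next step."""
    def run(data: Any) -> Any:
        for step in steps:
            data = step(data)
        return data
    return run


# Different versions of the same feature can serve different models.
TRANSFORMS: Dict[Tuple[str, str], Step] = {
    ("total_repaid", "v1"): pipeline(
        filter_rows(lambda r: r["kind"] == "repayment"),
        map_rows(lambda r: r["amount"]),
        reduce_values(lambda a, b: a + b, 0),
    ),
    ("total_repaid", "v2"): pipeline(
        filter_rows(lambda r: r["kind"] == "repayment" and not r["reversed"]),
        map_rows(lambda r: r["amount"]),
        reduce_values(lambda a, b: a + b, 0),
    ),
}

# Fixed input and pinned outputs for every registered version.
GOLDEN_ROWS: List[Row] = [
    {"kind": "repayment", "amount": 50, "reversed": False},
    {"kind": "loan", "amount": 200, "reversed": False},
    {"kind": "repayment", "amount": 30, "reversed": True},
]
GOLDEN_OUTPUTS = {("total_repaid", "v1"): 80, ("total_repaid", "v2"): 50}


def check_golden_outputs() -> None:
    """Fail if any registered version no longer produces its pinned output."""
    for key, expected in GOLDEN_OUTPUTS.items():
        actual = TRANSFORMS[key](GOLDEN_ROWS)
        assert actual == expected, f"{key} changed: {actual} != {expected}"
```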
20. Where we are today
Pipeline stages: Extract Data, Build Features, Train Models, Monitor Models, Serve Models
● Validate input and output data of features
● Store transformed features at the point of inference for records
● Track metrics on features and monitor for drift
● Common Feature Transformation Code
● Features accessible by SQL
● Backfill historical features at specific points in time (100%!!)
● Enable Training on much larger datasets with previously computed features
● Share model outputs as features in other models (learned features)
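Backfilling a new feature "at specific points in time" means computing it for each past inference using only the data that existed at that moment. The sketch below shows one way to do this, assuming each subject's events are stored as a list sorted by timestamp; `events_as_of`, `backfill` and `max_value` are hypothetical names.

```python
from bisect import bisect_right
from typing import Callable, Dict, List, Sequence, Tuple

Event = Tuple[int, float]  # (timestamp, value)


def events_as_of(events: Sequence[Event], ts: int) -> List[Event]:
    """Return the events with timestamp <= ts.

    Assumes `events` is sorted by timestamp in ascending order.
    """
    timestamps = [e[0] for e in events]
    return list(events[:bisect_right(timestamps, ts)])


def backfill(feature_fn: Callable[[List[Event]], float],
             events_by_subject: Dict[str, Sequence[Event]],
             past_inferences: Sequence[Tuple[str, int]]
             ) -> Dict[Tuple[str, int], float]:
    """Compute a new feature for each historical (subject, inference time)."""
    return {
        (subject, ts): feature_fn(events_as_of(events_by_subject[subject], ts))
        for subject, ts in past_inferences
    }


def max_value(events: List[Event]) -> float:
    """Example of a new feature: the largest value seen so far (0.0 if none)."""
    return max((value for _, value in events), default=0.0)
```

Cutting the event list at the inference timestamp is what keeps later data from leaking into features for earlier inferences.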
25. Branch’s ML Problem
● Long Feedback Signals
○ Problem: We make loans and get signal back between 28 and 1 year later
○ Solution: Make it possible to reconstruct
● Feature Drift
○ Problem: The way people use their mobile phones in developing markets changes constantly
○ Solution: Store features and adjust for feature drift
● Many data sources and types
○ Problem: We collect data from a variety of sources and types (raw text, network data, event streams, location, etc.)
○ Solution: Build a system for feature construction that unifies pipelines from different sources and types of transformations
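The slides do not say which drift metric is used. As one common choice, the sketch below computes the Population Stability Index (PSI) between stored baseline values of a feature and recent values. The function name, the bin count and the `eps` floor are assumptions, and both inputs are assumed to be non-empty.

```python
import math
from typing import List, Sequence


def population_stability_index(baseline: Sequence[float],
                               recent: Sequence[float],
                               bins: int = 10,
                               eps: float = 1e-6) -> float:
    """PSI over equal-width bins spanning the baseline's range.

    Values outside the baseline range fall into the first or last bin.
    Empty bins are floored at `eps` so the logarithm is defined.
    """
    lo, hi = min(baseline), max(baseline)
    # Fall back to a width of 1.0 when every baseline value is identical.
    width = (hi - lo) / bins or 1.0

    def proportions(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            index = int((v - lo) / width)
            counts[min(max(index, 0), bins - 1)] += 1
        return [max(c / len(values), eps) for c in counts]

    p = proportions(baseline)
    q = proportions(recent)
    return sum((b - a) * math.log(b / a) for a, b in zip(p, q))
```

A value of zero means the two binned distributions match; larger values mean a larger shift. The alerting threshold is a choice left to the team running the monitor.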
26. What’s Next
● Learned Features
○ Model Storage is easy
○ Model Serving isn’t trivial
● Monitoring
○ Concept drift is one of our primary ML challenges
● Auto ML
○ Input labels and output model for production…
○ You already have the features!