Jim Dowling
CEO, Logical Clocks
June 2021
Breaking the Monolithic ML Pipeline with a Feature Store
MLOps World
The Jellyfish reminds me of Deep Learning….
Rich Sensory Input produces Complex “Intelligent” Behaviour.
The Jellyfish reminds me of Deep Learning….
You can make them more complex by stacking layers...
The Jellyfish reminds me of Deep Learning….
And it has no Brain!
Stateless Model Serving with Rich Input Signals. Versioned libraries prevent train/serve skew.
[Diagram. TRAINING: Images → crop → resize → Training Data → train → CNN Model. SERVING: Stateless App → Images → crop → resize → CNN Model.]
Same image processing libraries used in train/serving.
Stateless Model Serving with Rich Input Signals. Versioned libraries prevent train/serve skew.
[Diagram. TRAINING: Text → tokenize → Tokenized Text → train → NLP Model. SERVING: Stateless App → Text → tokenize → NLP Model.]
Same text processing libraries used in train/serving.
Recommender Model with Only Application State - no user history or context available.
[Diagram. TRAINING: UserIDs, Actions → featurize → Training Data (Not Enough Features) → train → Useless Model. SERVING: Stateless App → UserID, Action → featurize → Useless Model.]
Stateful Model Serving using a Feature Store. Retrieve History and Context to build Feature Vectors.
[Diagram. TRAINING: Feature Store → Select Features → JOIN Features [Dataframes of Features] → Training Data → train → Model. SERVING: Stateless App → UserID, ProdID → Build Feature Vector [Feature Vectors from the Feature Store] → Model.]
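The serving path on this slide can be sketched in a few lines, assuming the online feature store behaves like a key-value lookup. The dictionaries, feature names, and `build_feature_vector` below are hypothetical illustrations and are not the Hopsworks API.

```python
# Hypothetical in-memory stand-in for an online feature store:
# one table per entity, keyed by that entity's primary key.
USER_FEATURES = {
    "u1": {"purchases_30d": 4, "avg_basket": 31.5},
}
PRODUCT_FEATURES = {
    "p9": {"price": 12.0, "views_7d": 250},
}

# The model expects features in a fixed order, decided at training time.
FEATURE_ORDER = ["purchases_30d", "avg_basket", "price", "views_7d"]


def build_feature_vector(user_id, prod_id):
    """Join the IDs sent by the stateless app with stored history/context."""
    row = {}
    row.update(USER_FEATURES[user_id])      # KeyError if the user is unknown
    row.update(PRODUCT_FEATURES[prod_id])   # KeyError if the product is unknown
    return [row[name] for name in FEATURE_ORDER]


vector = build_feature_vector("u1", "p9")
print(vector)
```

The app only has to know the IDs; everything else in the vector comes from state that was precomputed and stored ahead of the request.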
The Feature Store gives our Model a Brain. Our Jellyfish now looks more like an Octopus.
In New Zealand, a captive octopus apparently took a dislike to one of the staff. Every time the person passed the tank, the octopus squirted a jet of water at her. That is more like intelligence!!
The End-to-End ML Pipeline with a Feature Store
[Diagram labels: Input Data; Validate & Test; Feature Engineering; Feature Store; Model Training; Model Serving; Model Monitoring.]
Roles and Responsibilities in an ML Pipeline
[Diagram labels. Stages: Feature Engineering; Feature Store; Model Training; Model Testing; Model Serving; Model Monitoring. Roles: Data Engineers; Data Scientists; ML Engineers; Architects (Governance).]
Refactor the End-to-End ML Pipeline into Feature and Training Pipelines
[Diagram: Data Lake, Warehouse, Kafka → Feature Pipeline → Feature Store → Training Pipeline → Model Registry → Model Serving.]
Orchestrator: Airflow, GitHub Actions, Jenkins
Actions: Code commit, New data, time trigger (e.g., daily)
What Feature Engineering do we typically perform where?
[Diagram labels: Raw Data; Aggregations, Data Validation; Feature Store; Transformations; Training Data; Model Repo; Serving; Input Data.]
Need to ensure no skew between training and serving transformations.
Hopsworks is an Open, Modular Feature Store that can Plug into ML Pipelines
[Diagram labels: Data Engineering; Data Science; Model Serving; Compliance & Regulatory; Feature Store.]
Teams use the tools of their choice, integrated with the Hopsworks Feature Store.
Feature Group
Primary Key    Feature 1    Feature M
0              ...          ...
1              ...          ...
2              ...          ...
...            ...          ...
N              ...          ...

Training Dataset
Primary Key    Feature 1    Feature N    LABEL (CHURN_weekly)
0              ...          ...          1
1              ...          ...          0
2              ...          ...          0
...            ...          ...          ...
N              ...          ...          1

Feature Group A
Primary Key    Feature 1    Feature M
0              ...          ...
1              ...          ...
2              ...          ...
...            ...          ...
N              ...          ...

Feature Group B
Primary Key    Feature 1    Feature J
0              ...          ...
1              ...          ...
2              ...          ...
...            ...          ...
N              ...          ...

[Diagram: Feature Group A and Feature Group B feed the Training Dataset.]
When Model Serving, we retrieve Feature Vectors
Primary Key    Feature 1 ... Feature N    CHURN_weekly
ID             ...                        N/A
From App       From Feature Store         No Label - Predict it
Lookup Features from Feature Store using “ID”
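The relationship between feature groups and a training dataset can be sketched as an inner join on the primary key with the label appended. The data, the choice of an inner join, and `create_training_rows` are assumptions for illustration only; a real feature store performs this join over dataframes.

```python
# Hypothetical feature groups: primary key -> feature values.
feature_group_a = {0: {"f1": 0.2}, 1: {"f1": 0.9}, 2: {"f1": 0.4}}
feature_group_b = {0: {"f2": 10}, 1: {"f2": 35}, 3: {"f2": 7}}
# Hypothetical labels, also keyed by primary key.
churn_weekly = {0: 1, 1: 0, 2: 0}


def create_training_rows(groups, labels, label_name):
    """Inner-join feature groups on the primary key and append the label."""
    shared_keys = set(labels)
    for group in groups:
        shared_keys &= set(group)
    rows = []
    for pk in sorted(shared_keys):
        row = {"pk": pk}
        for group in groups:
            row.update(group[pk])
        row[label_name] = labels[pk]
        rows.append(row)
    return rows


training_rows = create_training_rows(
    [feature_group_a, feature_group_b], churn_weekly, "CHURN_weekly"
)
for row in training_rows:
    print(row)
```

Only primary keys present in every feature group and in the labels survive the join, so keys 2 and 3 are dropped here. At serving time the same feature selection is used, but the label column is the value to predict.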
From Raw Data to Production Models in Hopsworks
[Diagram: Raw Data → Features → Training Data → Models.]
Features: transaction_type, transaction_amount, user_id, user_nationality, user_gender
Feature Groups: transactions_fg, users_fg
Training Datasets: fraud_td (pk join of the feature groups), with Descriptive Statistics, Feature Correlations, Histograms, ... Use for Drift Detection.
Models: fraud_classifier
Provenance Graph of Dependencies
Upstream → Downstream: Feature Groups → Training Datasets → Models
Changes in upstream entities trigger actions that can cause downstream computations to run.
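A provenance graph makes "what must be recomputed after a change" a graph traversal. The sketch below assumes the dependencies shown on the slide and stores them in a plain dictionary; `DOWNSTREAM` and `affected_by` are hypothetical names, not a Hopsworks interface.

```python
from collections import deque

# Hypothetical provenance graph: entity -> entities derived directly from it.
DOWNSTREAM = {
    "transactions_fg": ["fraud_td"],
    "users_fg": ["fraud_td"],
    "fraud_td": ["fraud_classifier"],
    "fraud_classifier": [],
}


def affected_by(changed):
    """Return everything downstream of `changed`, in breadth-first order."""
    seen = []
    queue = deque(DOWNSTREAM.get(changed, []))
    while queue:
        entity = queue.popleft()
        if entity in seen:
            continue
        seen.append(entity)
        queue.extend(DOWNSTREAM.get(entity, []))
    return seen


print(affected_by("users_fg"))
```

An orchestrator could use such a list to decide which training dataset and model jobs to re-run after new data lands in a feature group.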
Breaking the Monolithic Pipeline into Feature Pipelines and Training Pipelines
Feature Pipeline: produces the Feature Groups transactions_fg and users_fg (features: transaction_type, transaction_amount, user_id, user_nationality, user_gender).
Training Pipeline: pk join of the Feature Groups → Training Dataset fraud_td (Descriptive Statistics, Feature Correlations, Histograms, ... Use for Drift Detection) → Model fraud_classifier.
Orchestrate Feature and Training Pipelines with Airflow
[Diagram labels: Data Source; Feature Engineering Notebook/Job; Validate Data + Compute Features; Feature Store; Select Features, File Format and Create Training Data; Run Experiment to Train Model; Validate on Data Slices & Deploy Model.]
Getting Features into the Hopsworks Feature Store
Data arrives at different cadences: Click Data every 5 seconds; Sensor Data every 10 seconds; App logs every hour; Customer Profile once/month.
[Diagram labels. Sources: User clicks; Event Data; Structured Data; Data Lake. Feature Store: Cached Features, stored as commits (Commit-0001, Commit-0002, ..., Commit-0097). External Table: Delta Lake; Snowflake; Amazon S3; Amazon Redshift. Import Existing Features.]
Feature Pipelines
Feature Pipelines: Writing Features (Python or Spark)
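from pyspark.sql import functions as F

df = spark.read.json("abfs://dev@xyz.blob.core.windows.net/d/rain.json")
# feature engineering on your dataframe: min-max scale 'val' into 'precipitation'
# (assumes min/max on the slide meant the column minimum and maximum)
min_val, max_val = df.agg(F.min("val"), F.max("val")).first()
df = df.withColumn("precipitation", (F.col("val") - min_val) / (max_val - min_val))
fg = fs.create_feature_group("rain",
                             version=1,
                             description="Rain features",
                             primary_key=['date', 'location_id'],
                             online_enabled=True)
fg.save(df)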
https://docs.hopsworks.ai/
Feature Pipelines: Schema and Data Versioning for Feature Groups
upsert_df = ...  # create from some source
fg = fs.get_feature_group("rain", 1)
fg.insert(upsert_df)
# Read the state 'as of' the timestamp
fg.read("2020-12-15 09:00:01").show()
# Read changes for the time interval
fg.read_changes("2020-12-14 09:00:01",
                "2020-12-15 09:00:01").show()
[Diagram: feature group rain (v1) has a commit log of (Commit1, Timestamp1), (Commit2, Timestamp2), ..., (Commitn, Timestampn), ending at the latest commit of schema (v1); rain (v2) is a new schema version.]
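The idea behind reading "as of" a timestamp and reading changes for an interval can be sketched with a plain commit log. The commit contents, the interval semantics (start exclusive, end inclusive), and both function bodies are assumptions made for illustration; they do not describe how Hopsworks stores commits.

```python
# Hypothetical commit log for one feature group: (commit timestamp, upserted rows).
# Timestamps use "YYYY-MM-DD HH:MM:SS", so string comparison matches time order.
# Commits are listed oldest first.
COMMITS = [
    ("2020-12-14 08:00:00", {("2020-12-13", 1): {"precipitation": 0.1}}),
    ("2020-12-14 12:00:00", {("2020-12-14", 1): {"precipitation": 0.7}}),
    ("2020-12-15 10:00:00", {("2020-12-13", 1): {"precipitation": 0.2}}),
]


def read_as_of(timestamp):
    """Replay commits up to and including `timestamp`; later upserts win."""
    state = {}
    for committed_at, rows in COMMITS:
        if committed_at <= timestamp:
            state.update(rows)
    return state


def read_changes(start, end):
    """Rows upserted by commits with start < commit time <= end."""
    changes = {}
    for committed_at, rows in COMMITS:
        if start < committed_at <= end:
            changes.update(rows)
    return changes


print(read_as_of("2020-12-15 09:00:01"))
print(read_changes("2020-12-14 09:00:01", "2020-12-15 09:00:01"))
```

Because the third commit happens after the requested timestamp, the as-of read still returns the older precipitation value for the first key.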
Feature Pipelines: Data Validation for Feature Groups
expectation_sales = fs.create_expectation(..,
    rules=[Rule(name="HAS_MIN", level="WARNING", min=0),
           Rule(name="HAS_MAX", level="ERROR", max=1000000)])
economy_fg = fs.create_feature_group(...., expectations=[expectation_sales])
df = ...  # get some dataframe to ingest into the feature store
# Run Data Validation Rules when data is written
economy_fg.insert(df)
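What validation on write could look like is sketched below in plain Python. The `Rule` dataclass, the rule semantics (HAS_MIN checks the column minimum, HAS_MAX the column maximum), and the policy that an ERROR rejects the write while a WARNING only reports are all assumptions for illustration, not the Hopsworks implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rule:
    """Hypothetical stand-in for a validation rule; not the Hopsworks class."""
    name: str                      # "HAS_MIN" or "HAS_MAX"
    level: str                     # "WARNING" or "ERROR"
    min: Optional[float] = None
    max: Optional[float] = None


def validate(values, rules):
    """Return (level, message) for every rule the column violates."""
    failures = []
    for rule in rules:
        if rule.name == "HAS_MIN" and min(values) < rule.min:
            failures.append((rule.level, f"min {min(values)} < {rule.min}"))
        elif rule.name == "HAS_MAX" and max(values) > rule.max:
            failures.append((rule.level, f"max {max(values)} > {rule.max}"))
    return failures


def insert(values, rules):
    """Reject the write on any ERROR; let WARNINGs through but report them."""
    failures = validate(values, rules)
    if any(level == "ERROR" for level, _ in failures):
        raise ValueError(f"validation failed: {failures}")
    return failures


rules = [Rule("HAS_MIN", "WARNING", min=0), Rule("HAS_MAX", "ERROR", max=1000000)]
print(insert([-5, 10, 200], rules))   # warning only, write proceeds
```

Separating `validate` from `insert` keeps the rules reusable: the same checks can run in a notebook before a pipeline is scheduled.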
Feature Pipelines: Real-Time Feature Engineering
from pyspark.sql.functions import avg, window

df_read = spark.readStream.format("kafka")...option("subscribe", KAFKA_TOPIC_NAME).load()
# Deserialise data from Kafka and create streaming query
df_deser = df_read.selectExpr(....).select(...)
# 10 minute window
windowed10mSignalDF = (df_deser
    .selectExpr(...)
    .withWatermark(...)
    .groupBy(window("datetime", "10 minutes"), "cc_num").agg(avg("amount"))
    .select(...))
card_transactions_10m_agg = fs.get_feature_group("card_transactions_10m_agg", version=1)
query_10m = card_transactions_10m_agg.insert_stream(windowed10mSignalDF)
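The aggregation in the streaming query, an average amount per card per 10-minute tumbling window, can be illustrated offline in plain Python. This sketch ignores watermarks and late data, and the event tuples and function names are hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def window_start(ts):
    """Floor a datetime to the start of its 10-minute tumbling window."""
    return ts.replace(minute=ts.minute - ts.minute % 10, second=0, microsecond=0)


def avg_amount_per_window(events):
    """events: iterable of (datetime, cc_num, amount).

    Returns {(window_start, cc_num): average amount}.
    """
    totals = defaultdict(lambda: [0.0, 0])   # running [sum, count] per key
    for ts, cc_num, amount in events:
        bucket = totals[(window_start(ts), cc_num)]
        bucket[0] += amount
        bucket[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


events = [
    (datetime(2021, 6, 1, 9, 1), "card-1", 10.0),
    (datetime(2021, 6, 1, 9, 8), "card-1", 30.0),
    (datetime(2021, 6, 1, 9, 12), "card-1", 50.0),
    (datetime(2021, 6, 1, 9, 3), "card-2", 5.0),
]
for key, value in sorted(avg_amount_per_window(events).items()):
    print(key, value)
```

In a streaming engine the same grouping runs incrementally, and the watermark decides when a window is considered complete and can be written to the feature group.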
Training Pipelines
Training Pipelines: Feature Selection and Training Dataset Creation
feature_join = (rain_fg.select_all()
    .join(temperature_fg.select_all())
    .join(location_fg.select_all()))
td = fs.create_training_dataset("training_dataset",
    version=1,
    data_format="tfrecord",
    description="Training dataset, TfRecords format")
td.save(feature_join)
Training Pipelines: Feature Selection and Training Dataset Creation
feature_join = (rain_fg.select_all()
    .join(temperature_fg.select_all(), on=["date", "location_id"])
    .join(location_fg.select_all()))
sc = fs.get_storage_connector("adls_mystorage")
td = fs.create_training_dataset("training_dataset",
    version=1,
    storage_connector=sc,
    data_format="tfrecord",
    description="Training dataset, TfRecords format",
    splits={'train': 0.7, 'test': 0.2, 'validate': 0.1})
# Saves the train/test/validation files to the filesystem (S3, HDFS, etc)
td.save(feature_join)
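How a `splits` specification might turn into train/test/validate files can be sketched as a seeded shuffle followed by proportional cuts. The function below is a hypothetical illustration; it does not claim to reproduce how Hopsworks assigns rows to splits.

```python
import random


def split_rows(rows, splits, seed=42):
    """Shuffle rows deterministically and cut them by the given fractions."""
    if abs(sum(splits.values()) - 1.0) > 1e-9:
        raise ValueError("split fractions must sum to 1")
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    names = list(splits)
    result, start = {}, 0
    for i, name in enumerate(names):
        if i == len(names) - 1:
            # The last split takes the remainder so rounding never drops a row.
            end = len(shuffled)
        else:
            end = start + round(splits[name] * len(shuffled))
        result[name] = shuffled[start:end]
        start = end
    return result


parts = split_rows(range(100), {"train": 0.7, "test": 0.2, "validate": 0.1})
print({name: len(rows) for name, rows in parts.items()})
```

Fixing the seed makes the split reproducible, which matters when a training dataset version is expected to stay stable across reruns.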
Training Pipelines: Training Dataset Creation with Online Transformation Functions
feature_join = (rain_fg.select_all()
    .join(temperature_fg.select_all(), on=["date", "location_id"])
    .join(location_fg.select_all()))
td = fs.create_training_dataset("precipitation",
    version=1,
    transformation_functions={"date": one_hot_encode,
                              "precipitation": rainfall_normalize},
    data_format="tfrecord",
    description="Training dataset, TfRecords format",
    splits={'train': 0.7, 'test': 0.2, 'validate': 0.1})
# Saves the train/test/validation files to the filesystem (S3, HDFS, etc)
td.save(feature_join)
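A transformation function avoids train/serve skew when its statistics are computed once on the training data and then reused unchanged at serving time. The sketch below shows one hypothetical way `rainfall_normalize` could be built as a min-max scaler; the slide does not show its real body, so the helper names and data are assumptions.

```python
def fit_min_max(values):
    """Compute the statistics once, on the training data only."""
    return {"min": min(values), "max": max(values)}


def make_normalizer(stats):
    """Build one function that both training and serving call."""
    span = stats["max"] - stats["min"]

    def normalize(value):
        if span == 0:
            return 0.0   # constant feature: avoid division by zero
        return (value - stats["min"]) / span

    return normalize


training_precipitation = [0.0, 5.0, 20.0]
rainfall_normalize = make_normalizer(fit_min_max(training_precipitation))

train_features = [rainfall_normalize(v) for v in training_precipitation]
serving_feature = rainfall_normalize(10.0)   # same statistics as training
print(train_features, serving_feature)
```

Recomputing the minimum and maximum from serving traffic instead would silently change the scale of the feature the model sees.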
Training Pipelines: Model Training
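import tensorflow as tf

def train():
    # assumes the training dataset was written as TFRecord files under this path
    dataset = tf.data.TFRecordDataset(tf.io.gfile.glob("s3://path/to/training_data/*"))
    model = ...
    model.compile(..)
    model.fit(..)
    tf.saved_model.save(model, export_path)
    hops.save_model(export_path, "model_name", metrics=metrics)

train()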
Training Pipelines: Model Analysis with the What-If Tool*
* https://examples.hopsworks.ai/ml/plotting/what_if_tool_notebook/
Model Serving in Hopsworks - KFServing
Source: https://github.com/kubeflow/kfserving/tree/master/docs
[Diagram labels: Inference Requests; Kafka; Statistics; Feature Store; Evaluation Store (part of Feature Store); Drift, outliers.]
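Drift detection compares statistics of live inference inputs with the descriptive statistics saved for the training dataset. The check below is a deliberately simple sketch that only compares means; the threshold, the use of a standard-error test, and the function names are assumptions, and production systems typically use richer tests.

```python
from statistics import mean, stdev


def training_statistics(values):
    """Descriptive statistics stored alongside the training dataset."""
    return {"mean": mean(values), "stdev": stdev(values)}


def drifted(train_stats, serving_values, threshold=3.0):
    """Flag drift when the serving mean is far from the training mean.

    Distance is measured in standard errors of the serving-sample mean.
    """
    standard_error = train_stats["stdev"] / len(serving_values) ** 0.5
    if standard_error == 0:
        return mean(serving_values) != train_stats["mean"]
    z = abs(mean(serving_values) - train_stats["mean"]) / standard_error
    return z > threshold


stats = training_statistics([10.0, 12.0, 11.0, 9.0, 13.0])
print(drifted(stats, [10.5, 11.5, 11.0, 10.0]))   # similar distribution
print(drifted(stats, [25.0, 27.0, 26.0, 24.0]))   # shifted distribution
```

Logging inference requests to a stream such as Kafka gives a monitoring job the serving-side values it needs for this comparison.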
Model Serving: Building Feature Vectors before Making Inference Requests
td = fs.get_training_dataset("precipitation", 1)
# dict containing the primary key name/values for the FGs in the TD
input_keys = {"location_id": "59.314781 18.070232"}
# retrieve a single feature vector and apply any online transformation functions
ordered_feature_vector = td.get_serving_vector(input_keys)
Alternatively, any JDBC client can retrieve feature vectors using SQL queries.
Complete MLOps Infrastructure with Hopsworks and its Feature Store
[Diagram labels. Inputs: Code and configuration; Data Lake, Warehouse, Kafka. Stages: Feature Engineering; Model Development; Model Training; Model Registry; Model Deploy; Model Serving; Model Monitoring. Flows: Retrieve Online Features; Log Predictions; Training Data Statistics; Sync; A/B Test. Versioning: Experiment Versioning; Code/Environment Versioning; Feature Versioning/Statistics; Model Versioning & Statistics; Serving Statistics. Storage and search: Feature Store; RonDB; Elasticsearch (Free-text Search).]
www.hopsworks.ai
@logicalclocks
alina@logicalclocks.com
github.com/logicalclocks/hopsworks

More Related Content

What's hot

Hopsworks - The Platform for Data-Intensive AI
Hopsworks - The Platform for Data-Intensive AIHopsworks - The Platform for Data-Intensive AI
Hopsworks - The Platform for Data-Intensive AI
QAware GmbH
 
Kim Hammar - Feature Store: the missing data layer in ML pipelines? - HopsML ...
Kim Hammar - Feature Store: the missing data layer in ML pipelines? - HopsML ...Kim Hammar - Feature Store: the missing data layer in ML pipelines? - HopsML ...
Kim Hammar - Feature Store: the missing data layer in ML pipelines? - HopsML ...
Kim Hammar
 
The Feature Store in Hopsworks
The Feature Store in HopsworksThe Feature Store in Hopsworks
The Feature Store in Hopsworks
Jim Dowling
 
MLOps with a Feature Store: Filling the Gap in ML Infrastructure
MLOps with a Feature Store: Filling the Gap in ML InfrastructureMLOps with a Feature Store: Filling the Gap in ML Infrastructure
MLOps with a Feature Store: Filling the Gap in ML Infrastructure
Data Science Milan
 
Berlin buzzwords 2020-feature-store-dowling
Berlin buzzwords 2020-feature-store-dowlingBerlin buzzwords 2020-feature-store-dowling
Berlin buzzwords 2020-feature-store-dowling
Jim Dowling
 
Hopsworks data engineering melbourne april 2020
Hopsworks   data engineering melbourne april 2020Hopsworks   data engineering melbourne april 2020
Hopsworks data engineering melbourne april 2020
Jim Dowling
 
Hops fs huawei internal conference july 2021
Hops fs huawei internal conference july 2021Hops fs huawei internal conference july 2021
Hops fs huawei internal conference july 2021
Jim Dowling
 
TensorFlow Extended: An End-to-End Machine Learning Platform for TensorFlow
TensorFlow Extended: An End-to-End Machine Learning Platform for TensorFlowTensorFlow Extended: An End-to-End Machine Learning Platform for TensorFlow
TensorFlow Extended: An End-to-End Machine Learning Platform for TensorFlow
Databricks
 
Dowling buso-feature-store-logical-clocks-spark-ai-summit-2020.pptx
Dowling buso-feature-store-logical-clocks-spark-ai-summit-2020.pptxDowling buso-feature-store-logical-clocks-spark-ai-summit-2020.pptx
Dowling buso-feature-store-logical-clocks-spark-ai-summit-2020.pptx
Lex Avstreikh
 
Hopsworks at Google AI Huddle, Sunnyvale
Hopsworks at Google AI Huddle, SunnyvaleHopsworks at Google AI Huddle, Sunnyvale
Hopsworks at Google AI Huddle, Sunnyvale
Jim Dowling
 
Hopsworks hands on_feature_store_palo_alto_kim_hammar_23_april_2019
Hopsworks hands on_feature_store_palo_alto_kim_hammar_23_april_2019Hopsworks hands on_feature_store_palo_alto_kim_hammar_23_april_2019
Hopsworks hands on_feature_store_palo_alto_kim_hammar_23_april_2019
Kim Hammar
 
Asynchronous Hyperparameter Search with Spark on Hopsworks and Maggy
Asynchronous Hyperparameter Search with Spark on Hopsworks and MaggyAsynchronous Hyperparameter Search with Spark on Hopsworks and Maggy
Asynchronous Hyperparameter Search with Spark on Hopsworks and Maggy
Jim Dowling
 
Augmenting Machine Learning with Databricks Labs AutoML Toolkit
Augmenting Machine Learning with Databricks Labs AutoML ToolkitAugmenting Machine Learning with Databricks Labs AutoML Toolkit
Augmenting Machine Learning with Databricks Labs AutoML Toolkit
Databricks
 
mlflow: Accelerating the End-to-End ML lifecycle
mlflow: Accelerating the End-to-End ML lifecyclemlflow: Accelerating the End-to-End ML lifecycle
mlflow: Accelerating the End-to-End ML lifecycle
Databricks
 
Using Crowdsourced Images to Create Image Recognition Models with Analytics Z...
Using Crowdsourced Images to Create Image Recognition Models with Analytics Z...Using Crowdsourced Images to Create Image Recognition Models with Analytics Z...
Using Crowdsourced Images to Create Image Recognition Models with Analytics Z...
Databricks
 
Infrastructure Agnostic Machine Learning Workload Deployment
Infrastructure Agnostic Machine Learning Workload DeploymentInfrastructure Agnostic Machine Learning Workload Deployment
Infrastructure Agnostic Machine Learning Workload Deployment
Databricks
 
ML Platform Q1 Meetup: End to-end Feature Analysis, Validation and Transforma...
ML Platform Q1 Meetup: End to-end Feature Analysis, Validation and Transforma...ML Platform Q1 Meetup: End to-end Feature Analysis, Validation and Transforma...
ML Platform Q1 Meetup: End to-end Feature Analysis, Validation and Transforma...
Fei Chen
 
I want my model to be deployed ! (another story of MLOps)
I want my model to be deployed ! (another story of MLOps)I want my model to be deployed ! (another story of MLOps)
I want my model to be deployed ! (another story of MLOps)
AZUG FR
 
MLflow at Company Scale
MLflow at Company ScaleMLflow at Company Scale
MLflow at Company Scale
Databricks
 
Streaming Inference with Apache Beam and TFX
Streaming Inference with Apache Beam and TFXStreaming Inference with Apache Beam and TFX
Streaming Inference with Apache Beam and TFX
Databricks
 

What's hot (20)

Hopsworks - The Platform for Data-Intensive AI
Hopsworks - The Platform for Data-Intensive AIHopsworks - The Platform for Data-Intensive AI
Hopsworks - The Platform for Data-Intensive AI
 
Kim Hammar - Feature Store: the missing data layer in ML pipelines? - HopsML ...
Kim Hammar - Feature Store: the missing data layer in ML pipelines? - HopsML ...Kim Hammar - Feature Store: the missing data layer in ML pipelines? - HopsML ...
Kim Hammar - Feature Store: the missing data layer in ML pipelines? - HopsML ...
 
The Feature Store in Hopsworks
The Feature Store in HopsworksThe Feature Store in Hopsworks
The Feature Store in Hopsworks
 
MLOps with a Feature Store: Filling the Gap in ML Infrastructure
MLOps with a Feature Store: Filling the Gap in ML InfrastructureMLOps with a Feature Store: Filling the Gap in ML Infrastructure
MLOps with a Feature Store: Filling the Gap in ML Infrastructure
 
Berlin buzzwords 2020-feature-store-dowling
Berlin buzzwords 2020-feature-store-dowlingBerlin buzzwords 2020-feature-store-dowling
Berlin buzzwords 2020-feature-store-dowling
 
Hopsworks data engineering melbourne april 2020
Hopsworks   data engineering melbourne april 2020Hopsworks   data engineering melbourne april 2020
Hopsworks data engineering melbourne april 2020
 
Hops fs huawei internal conference july 2021
Hops fs huawei internal conference july 2021Hops fs huawei internal conference july 2021
Hops fs huawei internal conference july 2021
 
TensorFlow Extended: An End-to-End Machine Learning Platform for TensorFlow
TensorFlow Extended: An End-to-End Machine Learning Platform for TensorFlowTensorFlow Extended: An End-to-End Machine Learning Platform for TensorFlow
TensorFlow Extended: An End-to-End Machine Learning Platform for TensorFlow
 
Dowling buso-feature-store-logical-clocks-spark-ai-summit-2020.pptx
Dowling buso-feature-store-logical-clocks-spark-ai-summit-2020.pptxDowling buso-feature-store-logical-clocks-spark-ai-summit-2020.pptx
Dowling buso-feature-store-logical-clocks-spark-ai-summit-2020.pptx
 
Hopsworks at Google AI Huddle, Sunnyvale
Hopsworks at Google AI Huddle, SunnyvaleHopsworks at Google AI Huddle, Sunnyvale
Hopsworks at Google AI Huddle, Sunnyvale
 
Hopsworks hands on_feature_store_palo_alto_kim_hammar_23_april_2019
Hopsworks hands on_feature_store_palo_alto_kim_hammar_23_april_2019Hopsworks hands on_feature_store_palo_alto_kim_hammar_23_april_2019
Hopsworks hands on_feature_store_palo_alto_kim_hammar_23_april_2019
 
Asynchronous Hyperparameter Search with Spark on Hopsworks and Maggy
Asynchronous Hyperparameter Search with Spark on Hopsworks and MaggyAsynchronous Hyperparameter Search with Spark on Hopsworks and Maggy
Asynchronous Hyperparameter Search with Spark on Hopsworks and Maggy
 
Augmenting Machine Learning with Databricks Labs AutoML Toolkit
Augmenting Machine Learning with Databricks Labs AutoML ToolkitAugmenting Machine Learning with Databricks Labs AutoML Toolkit
Augmenting Machine Learning with Databricks Labs AutoML Toolkit
 
mlflow: Accelerating the End-to-End ML lifecycle
mlflow: Accelerating the End-to-End ML lifecyclemlflow: Accelerating the End-to-End ML lifecycle
mlflow: Accelerating the End-to-End ML lifecycle
 
Using Crowdsourced Images to Create Image Recognition Models with Analytics Z...
Using Crowdsourced Images to Create Image Recognition Models with Analytics Z...Using Crowdsourced Images to Create Image Recognition Models with Analytics Z...
Using Crowdsourced Images to Create Image Recognition Models with Analytics Z...
 
Infrastructure Agnostic Machine Learning Workload Deployment
Infrastructure Agnostic Machine Learning Workload DeploymentInfrastructure Agnostic Machine Learning Workload Deployment
Infrastructure Agnostic Machine Learning Workload Deployment
 
ML Platform Q1 Meetup: End to-end Feature Analysis, Validation and Transforma...
ML Platform Q1 Meetup: End to-end Feature Analysis, Validation and Transforma...ML Platform Q1 Meetup: End to-end Feature Analysis, Validation and Transforma...
ML Platform Q1 Meetup: End to-end Feature Analysis, Validation and Transforma...
 
I want my model to be deployed ! (another story of MLOps)
I want my model to be deployed ! (another story of MLOps)I want my model to be deployed ! (another story of MLOps)
I want my model to be deployed ! (another story of MLOps)
 
MLflow at Company Scale
MLflow at Company ScaleMLflow at Company Scale
MLflow at Company Scale
 
Streaming Inference with Apache Beam and TFX
Streaming Inference with Apache Beam and TFXStreaming Inference with Apache Beam and TFX
Streaming Inference with Apache Beam and TFX
 

Similar to Hopsworks MLOps World talk june 21

Optimization in django orm
Optimization in django ormOptimization in django orm
Optimization in django orm
Denys Levchenko
 
Oracle MAF real life OOW.pptx
Oracle MAF real life OOW.pptxOracle MAF real life OOW.pptx
Oracle MAF real life OOW.pptx
Luc Bors
 
Apple Machine Learning
Apple Machine LearningApple Machine Learning
Apple Machine Learning
Denise Nepraunig
 
A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & PyTorch with B...
A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & PyTorch with B...A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & PyTorch with B...
A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & PyTorch with B...
Databricks
 
Simplify Feature Engineering in Your Data Warehouse
Simplify Feature Engineering in Your Data WarehouseSimplify Feature Engineering in Your Data Warehouse
Simplify Feature Engineering in Your Data Warehouse
FeatureByte
 
Learning keras by building dogs-vs-cats image classifier
Learning keras by building dogs-vs-cats image classifierLearning keras by building dogs-vs-cats image classifier
Learning keras by building dogs-vs-cats image classifier
Jian Wu
 
Viktor Tsykunov: Azure Machine Learning Service
Viktor Tsykunov: Azure Machine Learning ServiceViktor Tsykunov: Azure Machine Learning Service
Viktor Tsykunov: Azure Machine Learning Service
Lviv Startup Club
 
Practical Google App Engine Applications In Py
Practical Google App Engine Applications In PyPractical Google App Engine Applications In Py
Practical Google App Engine Applications In Py
Eric ShangKuan
 
To Study The Tips Tricks Guidelines Related To Performance Tuning For N Hib...
To Study The Tips Tricks  Guidelines Related To Performance Tuning For  N Hib...To Study The Tips Tricks  Guidelines Related To Performance Tuning For  N Hib...
To Study The Tips Tricks Guidelines Related To Performance Tuning For N Hib...
Shahzad
 
Building a Feature Store around Dataframes and Apache Spark
Building a Feature Store around Dataframes and Apache SparkBuilding a Feature Store around Dataframes and Apache Spark
Building a Feature Store around Dataframes and Apache Spark
Databricks
 
Automate ml workflow_transmogrif_ai-_chetan_khatri_berlin-scala
Automate ml workflow_transmogrif_ai-_chetan_khatri_berlin-scalaAutomate ml workflow_transmogrif_ai-_chetan_khatri_berlin-scala
Automate ml workflow_transmogrif_ai-_chetan_khatri_berlin-scala
Chetan Khatri
 
Ember
EmberEmber
Ember
mrphilroth
 
Using AI to create smart application - DroidCon Tel Aviv
Using AI to create smart application - DroidCon Tel AvivUsing AI to create smart application - DroidCon Tel Aviv
Using AI to create smart application - DroidCon Tel Aviv
Sarit Tamir
 
Spark MLlib - Training Material
Spark MLlib - Training Material Spark MLlib - Training Material
Spark MLlib - Training Material
Bryan Yang
 
Deep Learning on iOS #360iDev
Deep Learning on iOS #360iDevDeep Learning on iOS #360iDev
Deep Learning on iOS #360iDev
Shuichi Tsutsumi
 
Hadoop France meetup Feb2016 : recommendations with spark
Hadoop France meetup  Feb2016 : recommendations with sparkHadoop France meetup  Feb2016 : recommendations with spark
Hadoop France meetup Feb2016 : recommendations with spark
Modern Data Stack France
 
KFServing, Model Monitoring with Apache Spark and a Feature Store
KFServing, Model Monitoring with Apache Spark and a Feature StoreKFServing, Model Monitoring with Apache Spark and a Feature Store
KFServing, Model Monitoring with Apache Spark and a Feature Store
Databricks
 
Begin with Machine Learning
Begin with Machine LearningBegin with Machine Learning
Begin with Machine Learning
Narong Intiruk
 
Relevance trilogy may dream be with you! (dec17)
Relevance trilogy  may dream be with you! (dec17)Relevance trilogy  may dream be with you! (dec17)
Relevance trilogy may dream be with you! (dec17)
Woonsan Ko
 

Similar to Hopsworks MLOps World talk june 21 (20)

Optimization in django orm
Optimization in django ormOptimization in django orm
Optimization in django orm
 
Oracle MAF real life OOW.pptx
Oracle MAF real life OOW.pptxOracle MAF real life OOW.pptx
Oracle MAF real life OOW.pptx
 
Apple Machine Learning
Apple Machine LearningApple Machine Learning
Apple Machine Learning
 
A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & PyTorch with B...
A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & PyTorch with B...A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & PyTorch with B...
A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & PyTorch with B...
 
Simplify Feature Engineering in Your Data Warehouse
Simplify Feature Engineering in Your Data WarehouseSimplify Feature Engineering in Your Data Warehouse
Simplify Feature Engineering in Your Data Warehouse
 
Learning keras by building dogs-vs-cats image classifier
Learning keras by building dogs-vs-cats image classifierLearning keras by building dogs-vs-cats image classifier
Learning keras by building dogs-vs-cats image classifier
 
Viktor Tsykunov: Azure Machine Learning Service
Viktor Tsykunov: Azure Machine Learning ServiceViktor Tsykunov: Azure Machine Learning Service
Viktor Tsykunov: Azure Machine Learning Service
 
Practical Google App Engine Applications In Py
Practical Google App Engine Applications In PyPractical Google App Engine Applications In Py
Practical Google App Engine Applications In Py
 
To Study The Tips Tricks Guidelines Related To Performance Tuning For N Hib...
To Study The Tips Tricks  Guidelines Related To Performance Tuning For  N Hib...To Study The Tips Tricks  Guidelines Related To Performance Tuning For  N Hib...
To Study The Tips Tricks Guidelines Related To Performance Tuning For N Hib...
 
Building a Feature Store around Dataframes and Apache Spark
Building a Feature Store around Dataframes and Apache SparkBuilding a Feature Store around Dataframes and Apache Spark
Building a Feature Store around Dataframes and Apache Spark
 
Automate ml workflow_transmogrif_ai-_chetan_khatri_berlin-scala
Automate ml workflow_transmogrif_ai-_chetan_khatri_berlin-scalaAutomate ml workflow_transmogrif_ai-_chetan_khatri_berlin-scala
Automate ml workflow_transmogrif_ai-_chetan_khatri_berlin-scala
 
Ember
EmberEmber
Ember
 
Using AI to create smart application - DroidCon Tel Aviv
Using AI to create smart application - DroidCon Tel AvivUsing AI to create smart application - DroidCon Tel Aviv
Using AI to create smart application - DroidCon Tel Aviv
 
Spark MLlib - Training Material
Spark MLlib - Training Material Spark MLlib - Training Material
Spark MLlib - Training Material
 
Deep Learning on iOS #360iDev
Deep Learning on iOS #360iDevDeep Learning on iOS #360iDev
Deep Learning on iOS #360iDev
 
Hadoop France meetup Feb2016 : recommendations with spark
Hadoop France meetup  Feb2016 : recommendations with sparkHadoop France meetup  Feb2016 : recommendations with spark
Hadoop France meetup Feb2016 : recommendations with spark
 
KFServing, Model Monitoring with Apache Spark and a Feature Store
KFServing, Model Monitoring with Apache Spark and a Feature StoreKFServing, Model Monitoring with Apache Spark and a Feature Store
KFServing, Model Monitoring with Apache Spark and a Feature Store
 
Begin with Machine Learning
Begin with Machine LearningBegin with Machine Learning
Begin with Machine Learning
 
tutorialSCE
tutorialSCEtutorialSCE
tutorialSCE
 
Relevance trilogy may dream be with you! (dec17)
Relevance trilogy  may dream be with you! (dec17)Relevance trilogy  may dream be with you! (dec17)
Relevance trilogy may dream be with you! (dec17)
 

More from Jim Dowling

ARVC and flecainide case report[EI] Jim.docx.pdf
ARVC and flecainide case report[EI] Jim.docx.pdfARVC and flecainide case report[EI] Jim.docx.pdf
ARVC and flecainide case report[EI] Jim.docx.pdf
Jim Dowling
 
PyData Berlin 2023 - Mythical ML Pipeline.pdf
PyData Berlin 2023 - Mythical ML Pipeline.pdfPyData Berlin 2023 - Mythical ML Pipeline.pdf
PyData Berlin 2023 - Mythical ML Pipeline.pdf
Jim Dowling
 
Serverless ML Workshop with Hopsworks at PyData Seattle
Serverless ML Workshop with Hopsworks at PyData SeattleServerless ML Workshop with Hopsworks at PyData Seattle
Serverless ML Workshop with Hopsworks at PyData Seattle
Jim Dowling
 
PyCon Sweden 2022 - Dowling - Serverless ML with Hopsworks.pdf
PyCon Sweden 2022 - Dowling - Serverless ML with Hopsworks.pdfPyCon Sweden 2022 - Dowling - Serverless ML with Hopsworks.pdf
PyCon Sweden 2022 - Dowling - Serverless ML with Hopsworks.pdf
Jim Dowling
 
_Python Ireland Meetup - Serverless ML - Dowling.pdf
_Python Ireland Meetup - Serverless ML - Dowling.pdf_Python Ireland Meetup - Serverless ML - Dowling.pdf
_Python Ireland Meetup - Serverless ML - Dowling.pdf
Jim Dowling
 
Building Hopsworks, a cloud-native managed feature store for machine learning
Building Hopsworks, a cloud-native managed feature store for machine learning Building Hopsworks, a cloud-native managed feature store for machine learning
Building Hopsworks, a cloud-native managed feature store for machine learning
Jim Dowling
 
Real-Time Recommendations with Hopsworks and OpenSearch - MLOps World 2022
Real-Time Recommendations  with Hopsworks and OpenSearch - MLOps World 2022Real-Time Recommendations  with Hopsworks and OpenSearch - MLOps World 2022
Real-Time Recommendations with Hopsworks and OpenSearch - MLOps World 2022
Jim Dowling
 
GANs for Anti Money Laundering
GANs for Anti Money LaunderingGANs for Anti Money Laundering
GANs for Anti Money Laundering
Jim Dowling
 
Invited Lecture on GPUs and Distributed Deep Learning at Uppsala University
Invited Lecture on GPUs and Distributed Deep Learning at Uppsala UniversityInvited Lecture on GPUs and Distributed Deep Learning at Uppsala University
Invited Lecture on GPUs and Distributed Deep Learning at Uppsala University
Jim Dowling
 
Hopsworks in the cloud Berlin Buzzwords 2019
Hopsworks in the cloud Berlin Buzzwords 2019 Hopsworks in the cloud Berlin Buzzwords 2019
Hopsworks in the cloud Berlin Buzzwords 2019
Jim Dowling
 
HopsML Meetup talk on Hopsworks + ROCm/AMD June 2019
HopsML Meetup talk on Hopsworks + ROCm/AMD June 2019HopsML Meetup talk on Hopsworks + ROCm/AMD June 2019
HopsML Meetup talk on Hopsworks + ROCm/AMD June 2019
Jim Dowling
 
Jfokus 2019-dowling-logical-clocks
Jfokus 2019-dowling-logical-clocksJfokus 2019-dowling-logical-clocks
Jfokus 2019-dowling-logical-clocks
Jim Dowling
 
Berlin buzzwords 2018 TensorFlow on Hops
Berlin buzzwords 2018 TensorFlow on HopsBerlin buzzwords 2018 TensorFlow on Hops
Berlin buzzwords 2018 TensorFlow on Hops
Jim Dowling
 
All AI Roads lead to Distribution - Dot AI
All AI Roads lead to Distribution - Dot AIAll AI Roads lead to Distribution - Dot AI
All AI Roads lead to Distribution - Dot AI
Jim Dowling
 
Distributed TensorFlow on Hops (Papis London, April 2018)
Distributed TensorFlow on Hops (Papis London, April 2018)Distributed TensorFlow on Hops (Papis London, April 2018)
Distributed TensorFlow on Hops (Papis London, April 2018)
Jim Dowling
 
End-to-End Platform Support for Distributed Deep Learning in Finance
End-to-End Platform Support for Distributed Deep Learning in FinanceEnd-to-End Platform Support for Distributed Deep Learning in Finance
End-to-End Platform Support for Distributed Deep Learning in Finance
Jim Dowling
 
Scaling TensorFlow with Hops, Global AI Conference Santa Clara
Scaling TensorFlow with Hops, Global AI Conference Santa ClaraScaling TensorFlow with Hops, Global AI Conference Santa Clara
Scaling TensorFlow with Hops, Global AI Conference Santa Clara
Jim Dowling
 
Scaling out Tensorflow-as-a-Service on Spark and Commodity GPUs
Scaling out Tensorflow-as-a-Service on Spark and Commodity GPUsScaling out Tensorflow-as-a-Service on Spark and Commodity GPUs
Scaling out Tensorflow-as-a-Service on Spark and Commodity GPUs
Jim Dowling
 
Odsc workshop - Distributed Tensorflow on Hops
Odsc workshop - Distributed Tensorflow on HopsOdsc workshop - Distributed Tensorflow on Hops
Odsc workshop - Distributed Tensorflow on Hops
Jim Dowling
 

Hopsworks MLOps World talk june 21

  • 1. Jim Dowling CEO, Logical Clocks June 2021 Breaking the Monolithic ML Pipeline with a Feature Store MLOps World
  • 2. The Jellyfish reminds me of Deep Learning…. Rich Sensory Input produces Complex “Intelligent” Behaviour.
  • 3. The Jellyfish reminds me of Deep Learning…. You can make them more complex by stacking layers...
  • 4. The Jellyfish reminds me of Deep Learning…. And it has no Brain!
  • 5. CNN Model Stateless Model Serving with Rich Input Signals. Versioned libraries prevent train/serve skew. Stateless App Images crop resize crop resize Images Training Data Images TRAINING SERVING train Same image processing libraries used in train/serving.
  • 6. NLP Model Stateless Model Serving with Rich Input Signals. Versioned libraries prevent train/serve skew. Stateless App Text tokenize tokenize Tokenized Text Text TRAINING SERVING train Same text processing libraries used in train/serving.
  • 7. Useless Model Recommender Model with Only Application State - no user history or context available. Stateless App UserID, Action featurize featurize Not Enough Features in Training Data UserIDs, Actions TRAINING SERVING train
  • 8. Stateful Model Serving using a Feature Store. Retrieve History and Context to build Feature Vectors. Model Stateless App UserID, ProdID Build Feature Vector JOIN Features Training Data Select Features TRAINING SERVING train [Feature Vectors] Feature Store [Dataframes of Features]
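The serving path on slide 8 (the app sends only IDs, and the feature store supplies history and context) can be sketched in plain Python. This is a minimal illustration, not the Hopsworks API: the in-memory dictionaries, the feature names, and `build_feature_vector` are all hypothetical.

```python
# Hypothetical in-memory stand-in for an online feature store.
# All keys and feature names below are invented for illustration.
USER_FEATURES = {
    "u1": {"purchases_30d": 4, "avg_basket": 31.5},
}
PRODUCT_FEATURES = {
    "p9": {"price": 12.0, "category_id": 3},
}

# The model expects features in the same order that was used for training.
FEATURE_ORDER = ["purchases_30d", "avg_basket", "price", "category_id"]


def build_feature_vector(user_id, prod_id):
    """Join stored history/context onto the IDs sent by the stateless app."""
    row = {}
    row.update(USER_FEATURES[user_id])
    row.update(PRODUCT_FEATURES[prod_id])
    return [row[name] for name in FEATURE_ORDER]


vector = build_feature_vector("u1", "p9")
print(vector)
```

The application stays stateless: it holds no user history itself and only passes `UserID` and `ProdID` to the lookup.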
  • 9. The Feature Store gives our Model a Brain. Our Jellyfish now looks more like an Octopus. In New Zealand, a captive octopus apparently took a dislike to one of the staff. Every time the person passed the tank, the octopus squirted a jet of water at her. That is more like intelligence!!
  • 12. Refactor the End-to-End ML Pipeline into Feature and Training Pipelines Data Lake, Warehouse, Kafka Model Registry Feature Pipeline Model Serving Training Pipeline Feature Store Orchestrator: Airflow, Github Actions, Jenkins Actions: Code commit, New data, time trigger (e.g., daily)
  • 13. What Feature Engineering do we typically perform where? Aggregations, Data Validation Training Data Serving Raw Data Feature Store Model Repo Transformations Input Data Need to ensure no skew between training and serving transformations
  • 14. Data Science Data Engineering Compliance & Regulatory Feature Store Teams use the tools of their choice, integrated with the Hopsworks Feature Store Model Serving Hopsworks is an Open, Modular Feature Store that can Plug into ML Pipelines
  • 15. Feature Group Feature 1 Feature M Primary Key 0 ... ... 1 ... ... 2 ... ... ... ... ... N ... ...
  • 16. Training Dataset Feature 1 LABEL (CHURN_weekly) Feature N Primary Key 0 ... ... 1 1 ... ... 0 2 ... ... 0 ... ... ... ... N ... ... 1 Feature 1 Feature M Primary Key 0 ... ... 1 ... ... 2 ... ... ... ... ... N ... ... Feature 1 Feature J Primary Key 0 ... ... 1 ... ... 2 ... ... ... ... ... N ... ... Feature Group A Feature Group B Training Dataset
  • 17. When Model Serving, we retrieve Feature Vectors Feature 1 CHURN_weekly Feature N Primary Key ID ... ... N/A From App From Feature Store No Label - Predict it Lookup Features from Feature Store using “ID”
  • 18. transaction_type transaction_amount user_id user_nationality user_gender transactions_fg users_fg Feature Groups Training Datasets pk join fraud_td Descriptive Statistics, Feature Correlations, Histograms ... Use for Drift Detection fraud_classifier Models Training Data Features Models Raw Data From Raw Data to Production Models in Hopsworks
  • 19. Provenance Graph of Dependencies Feature Groups Models Training Datasets Changes in upstream entities trigger actions that can cause downstream computations to run Upstream Downstream
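One way to picture slide 19's provenance graph is as a mapping from each entity to the entities computed from it. The sketch below reuses the entity names from the fraud example on slide 18; the `DOWNSTREAM` dictionary and `affected_by` function are hypothetical and only illustrate the idea of propagating a change downstream.

```python
from collections import deque

# Upstream entity -> entities computed from it (names from the fraud example).
DOWNSTREAM = {
    "transactions_fg": ["fraud_td"],
    "users_fg": ["fraud_td"],
    "fraud_td": ["fraud_classifier"],
    "fraud_classifier": [],
}


def affected_by(changed):
    """Return the downstream entities of `changed`, in breadth-first order."""
    seen = {changed}
    order = []
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for child in DOWNSTREAM.get(node, []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


to_rerun = affected_by("users_fg")
print(to_rerun)
```

For a simple chain like this one, breadth-first order is also a valid execution order; a graph with more complex dependencies would need a proper topological sort.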
  • 20. Breaking the Monolithic Pipeline into Feature Pipelines and Training Pipelines transaction_type transaction_amount user_id user_nationality user_gender transactions_fg users_fg Feature Groups Training Datasets pk join fraud_td Descriptive Statistics, Feature Correlations, Histograms ... Use for Drift Detection fraud_classifier Models Feature Pipeline Training Pipeline
  • 21. Orchestrate Feature and Training Pipelines with Airflow Feature Engineering Notebook/Job Validate on Data Slices & Deploy Model Run Experiment to Train Model Select Features, File Format and Create Training Data FEATURE STORE
  • 22. Validate Data + Compute Features 2 Data Source 1 Feature Store 3 Click Data every 5 seconds Sensor Data every 10 seconds App logs every hour Customer Profile once/month User clicks Event Data Structured Data Data Lake Feature Store ... Commit-0002 Commit-0097 Commit-0001 Delta Lake Snowflake Amazon S3 Amazon Redshift External Table Cached Features Import Existing Features 2 Getting Features into the Hopsworks Feature Store
  • 24. Feature Pipelines: Writing Features (Python or Spark)
        df = spark.read.json("abfs://dev@xyz.blob.core.windows.net/d/rain.json")
        # feature engineering on your dataframe (withColumn returns a new dataframe)
        df = df.withColumn('precipitation', (df.val-min)/(max-min))
        fg = fs.create_feature_group("rain", version=1, description="Rain features",
                                     primary_key=['date', 'location_id'], online_enabled=True)
        fg.save(df)
        https://docs.hopsworks.ai/
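The feature engineering step on slide 24 is a min-max normalisation, `(val - min) / (max - min)`. A plain-Python sketch of that formula follows; the sample values and the `min_max_scale` helper are illustrative assumptions, and the slide's Spark dataframe is replaced by a list.

```python
def min_max_scale(values):
    """Scale values linearly into [0, 1] using the observed min and max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is no range to scale over.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


rain_mm = [0.0, 2.5, 10.0, 5.0]  # made-up rainfall readings
precipitation = min_max_scale(rain_mm)
print(precipitation)
```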
  • 25. Feature Pipelines: Schema and Data Versioning for Feature Groups
        upsert_df = ...  # create from some source
        fg = fs.get_feature_group("rain", 1)
        fg.insert(upsert_df)
        # Read the state 'as of' the timestamp
        fg.read("2020-12-15 09:00:01").show()
        # Read changes for the time interval
        fg.read_changes("2020-12-14 09:00:01", "2020-12-15 09:00:01").show()
        [Diagram: rain (v1) holds Commit1 ... Commitn, each with Timestamp1 ... Timestampn, ending at the latest commit of schema (v1); rain (v2) is a separate schema version]
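The "as of" and "changes" reads on slide 25 can be illustrated with a toy commit log. This sketch assumes commits arrive in timestamp order, treats each commit as an upsert keyed by primary key, and treats the change interval as exclusive of the start and inclusive of the end. `CommitLog` is hypothetical and says nothing about how Hopsworks stores commits.

```python
from bisect import bisect_right


class CommitLog:
    """Toy commit log: each commit is a dict of primary key -> value upserts."""

    def __init__(self):
        self._timestamps = []
        self._commits = []

    def insert(self, timestamp, rows):
        # Assumes commits are appended in increasing timestamp order.
        self._timestamps.append(timestamp)
        self._commits.append(rows)

    def read(self, as_of):
        """Replay all commits up to and including `as_of`."""
        state = {}
        end = bisect_right(self._timestamps, as_of)
        for rows in self._commits[:end]:
            state.update(rows)
        return state

    def read_changes(self, start, end):
        """Return rows upserted after `start` and up to and including `end`."""
        lo = bisect_right(self._timestamps, start)
        hi = bisect_right(self._timestamps, end)
        changes = {}
        for rows in self._commits[lo:hi]:
            changes.update(rows)
        return changes


log = CommitLog()
# Timestamps are "YYYY-MM-DD HH:MM:SS" strings, which sort chronologically.
log.insert("2020-12-14 08:00:00", {("2020-12-13", "loc1"): 0.2})
log.insert("2020-12-15 08:00:00", {("2020-12-13", "loc1"): 0.3,
                                   ("2020-12-14", "loc1"): 0.7})

print(log.read("2020-12-14 09:00:01"))
print(log.read_changes("2020-12-14 09:00:01", "2020-12-15 09:00:01"))
```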
  • 26. Feature Pipelines: Data Validation for Feature Groups
        expectation_sales = fs.create_expectation(..,
            rules=[Rule(name="HAS_MIN", level="WARNING", min=0),
                   Rule(name="HAS_MAX", level="ERROR", max=1000000)])
        economy_fg = fs.create_feature_group(...., expectations=[expectation_sales])
        df = ...  # get some dataframe to ingest into the feature store
        # Run Data Validation Rules when data is written
        economy_fg.insert(df)
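To make the WARNING and ERROR levels on slide 26 concrete, here is a small stand-alone sketch. It assumes HAS_MIN means "the smallest value must be at least the bound" and HAS_MAX means "the largest value must be at most the bound", and that an ERROR rejects the write while a WARNING is only reported. The `Rule` dataclass, `ValidationError`, and `validate` are hypothetical and are not the Hopsworks classes.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    name: str     # "HAS_MIN" or "HAS_MAX"
    level: str    # "WARNING" or "ERROR"
    bound: float  # hypothetical field standing in for the slide's min=/max=


class ValidationError(Exception):
    """Raised when an ERROR-level rule is violated."""


def validate(values, rules):
    """Check values against rules; return warnings, raise on ERROR violations."""
    warnings = []
    for rule in rules:
        if rule.name == "HAS_MIN":
            passed = min(values) >= rule.bound
        elif rule.name == "HAS_MAX":
            passed = max(values) <= rule.bound
        else:
            raise ValueError(f"unknown rule: {rule.name}")
        if passed:
            continue
        message = f"{rule.name} violated (bound={rule.bound})"
        if rule.level == "ERROR":
            raise ValidationError(message)
        warnings.append(message)
    return warnings


rules = [Rule("HAS_MIN", "WARNING", 0), Rule("HAS_MAX", "ERROR", 1000000)]
print(validate([10, 500], rules))   # no violations
print(validate([-5, 500], rules))   # one warning, write still proceeds
```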
  • 27. Feature Pipelines: Real-Time Feature Engineering
        df_read = spark.readStream.format("kafka")...option("subscribe", KAFKA_TOPIC_NAME).load()
        # Deserialise data from Kafka and create streaming query
        df_deser = df_read.selectExpr(....).select(...)
        # 10 minute window
        windowed10mSignalDF = (df_deser
            .selectExpr(...)
            .withWatermark(...)
            .groupBy(window("datetime", "10 minutes"), "cc_num").agg(avg("amount"))
            .select(...))
        card_transactions_10m_agg = fs.get_feature_group("card_transactions_10m_agg", version=1)
        query_10m = card_transactions_10m_agg.insert_stream(windowed10mSignalDF)
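The aggregation on slide 27 is an average of `amount` per card number over 10-minute windows. The batch sketch below shows what that grouping computes, using epoch-aligned tumbling windows. It deliberately ignores streaming concerns such as watermarks and late data, and the sample events are invented.

```python
from collections import defaultdict
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=10)
EPOCH = datetime(1970, 1, 1)


def window_start(ts):
    """Round a timestamp down to the start of its 10-minute window."""
    return ts - (ts - EPOCH) % WINDOW


def avg_amount_per_window(events):
    """Average amount per (window start, cc_num) over (ts, cc_num, amount) events."""
    sums = defaultdict(lambda: [0.0, 0])
    for ts, cc_num, amount in events:
        key = (window_start(ts), cc_num)
        sums[key][0] += amount
        sums[key][1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


events = [
    (datetime(2021, 6, 1, 12, 1), "4444", 10.0),
    (datetime(2021, 6, 1, 12, 9), "4444", 30.0),
    (datetime(2021, 6, 1, 12, 11), "4444", 5.0),
]
averages = avg_amount_per_window(events)
print(averages)
```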
  • 29. Training Pipelines: Feature Selection and Training Dataset Creation
        feature_join = (rain_fg.select_all()
            .join(temperature_fg.select_all())
            .join(location_fg.select_all()))
        td = fs.create_training_dataset("training_dataset", version=1, data_format="tfrecord",
                                        description="Training dataset, TfRecords format")
        td.save(feature_join)
  • 30. Training Pipelines: Feature Selection and Training Dataset Creation
        feature_join = (rain_fg.select_all()
            .join(temperature_fg.select_all(), on=["date", "location_id"])
            .join(location_fg.select_all()))
        sc = fs.get_storage_connector("adls_mystorage")
        td = fs.create_training_dataset("training_dataset", version=1, storage_connector=sc,
                                        data_format="tfrecord",
                                        description="Training dataset, TfRecords format",
                                        splits={'train': 0.7, 'test': 0.2, 'validate': 0.1})
        # The train/test/validation files are now saved to the filesystem (S3, HDFS, etc)
        td.save(feature_join)
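The `splits` argument on slide 30 divides the joined rows into train, test, and validation sets by fraction. The sketch below shows one simple way to do that: a seeded shuffle followed by slicing. The actual splitting strategy used by the feature store is not stated on the slide, so `split_rows` is an assumption for illustration only.

```python
import random


def split_rows(rows, splits, seed=0):
    """Shuffle rows deterministically and cut them by the given fractions."""
    if abs(sum(splits.values()) - 1.0) > 1e-9:
        raise ValueError("split fractions must sum to 1")
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    names = list(splits)
    result, start = {}, 0
    for i, name in enumerate(names):
        if i == len(names) - 1:
            end = len(shuffled)  # last split takes whatever remains
        else:
            end = start + round(splits[name] * len(shuffled))
        result[name] = shuffled[start:end]
        start = end
    return result


parts = split_rows(range(100), {'train': 0.7, 'test': 0.2, 'validate': 0.1})
print({name: len(rows) for name, rows in parts.items()})
```

Fixing the seed makes the split reproducible, which matters when a training dataset version is expected to contain the same rows every time it is read.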
  • 31. Training Pipelines: Training Dataset Creation with Online Transformation Functions
        feature_join = (rain_fg.select_all()
            .join(temperature_fg.select_all(), on=["date", "location_id"])
            .join(location_fg.select_all()))
        td = fs.create_training_dataset("precipitation", version=1,
                                        transformation_functions={"date": one_hot_encode,
                                                                  "precipitation": rainfall_normalize},
                                        data_format="tfrecord",
                                        description="Training dataset, TfRecords format",
                                        splits={'train': 0.7, 'test': 0.2, 'validate': 0.1})
        # The train/test/validation files are now saved to the filesystem (S3, HDFS, etc)
        td.save(feature_join)
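Slide 31 attaches transformation functions to the training dataset so that the same functions can be applied when serving. A minimal sketch of why that prevents train/serve skew: the transformation is fitted once on training data and the identical function object is reused for the serving row. The helper names, the `temp` feature, and the sample rows are hypothetical.

```python
def fit_min_max(values):
    """Return a transform that scales by the min/max seen in the training data."""
    lo, hi = min(values), max(values)  # assumes hi > lo

    def transform(v):
        return (v - lo) / (hi - lo)

    return transform


training_rows = [
    {"precipitation": 0.0, "temp": 11.0},
    {"precipitation": 20.0, "temp": 14.0},
]

# One registry of transformations, shared by training and serving.
transformation_functions = {
    "precipitation": fit_min_max([r["precipitation"] for r in training_rows]),
}


def apply_transformations(row):
    """Apply the registered transformation to each feature that has one."""
    return {
        name: transformation_functions.get(name, lambda v: v)(value)
        for name, value in row.items()
    }


train_features = [apply_transformations(r) for r in training_rows]
serving_vector = apply_transformations({"precipitation": 5.0, "temp": 12.0})
print(serving_vector)
```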
  • 32. Training Pipelines: Model Training
        def train():
            dataset = tf.Dataset("s3://path/to/training_data/")
            model = …
            model.compile(..)
            model.fit(..)
            tf.saved_model.save(model, export_path)
            hops.save_model(export_path, "model_name", metrics=metrics)
        train()
  • 33. Training Pipelines: Model Analysis with the What-If Tool* * https://examples.hopsworks.ai/ml/plotting/what_if_tool_notebook/
  • 34. Model Serving in Hopsworks - KFServing Source: https://github.com/kubeflow/kfserving/tree/master/docs Kafka Statistics Feature Store Evaluation Store (part of Feature Store) Inference Requests Drift, outliers
  • 35. Model Serving: Building Feature Vectors before Making Inference Requests
        td = fs.get_training_dataset("precipitation", 1)
        # dict containing the primary key name/values for the FGs in the TD
        input_keys = {"location_id": "59.314781 18.070232"}
        # retrieve a single feature vector and apply any online transformation functions
        ordered_feature_vector = td.get_serving_vector(input_keys)
        Alternatively, any JDBC client can retrieve feature vectors using SQL queries.
  • 36. Complete MLOps Infrastructure with Hopsworks and its Feature Store Code and configuration Data Lake, Warehouse, Kafka Model Registry Feature Engineering Model Serving Model Training Model Deploy Model Monitoring Model Development Retrieve Online Features Log Predictions Training Data Statistics Sync Experiment Versioning Code/Environment Versioning Feature Versioning/Statistics A/B Test Model Versioning & Statistics Serving Statistics Free-text Search Feature Store Elasticsearch RonDB