SlideShare a Scribd company logo
Let’s Talk ML on Flink
Boris Lublinsky
Stavros Kontopoulos
Lightbend
Apache Flink London Meetup (15/11/2017)
Talk’s agenda
• Flink ML Roadmap
• Machine learning & Models
• Model Serving with Flink
• Results and next steps
2
Flink ML
Flink ML Roadmap:
https://docs.google.com/document/d/1afQbvZBTV15qF3vobVWUjxQc49h3
Ud06MIRhahtJ6dw
Model Serving
Online Learning
Incremental learning
Batch api (https://issues.apache.org/jira/browse/FLINK-2396)?
3
How people see ML
4
Reality
5
What is the Model?
6
A model is a function transforming inputs to outputs
- y = f(x),for example:
Linear regression:
y = ac + a1*x1 + … + an*xn
Neural network:
Such a definition of the model allows for an easy
implementation of model’s composition. From the
implementation point of view it is just function
composition
Model learning pipeline
UC Berkeley AMPLab introduced machine learning pipelines as a graph defining the
complete chain of data transformation.
7
Models proliferation
Data Scientist Software engineer
8
Model standardization
9
Model lifecycle considerations
• Models tend to change
• Update frequencies vary greatly - from hourly to
quarterly/yearly
• Model version tracking
• Model release practices
• Model update process 10
Customer SLA
• Response time - time to calculate prediction
• Throughput - predictions per second
• Support for running multiple models
• Model update - how quickly and easily can the model be
updated
• Uptime/reliability 11
Model Governance
Models should be:
• governed by the company’s policies and procedures, laws and regulations and
organization’s goals
• be transparent, explainable, traceable and interpretable for auditors and
regulators.
• have approval and release process.
• have reason code for explanation of decision.
• be versioned. Reasoning for introduction of new version should be clearly
specified. 12
What are we building
We want to build a streaming system allowing to update
models without interruption of execution (dynamically
controlled stream).
13
Model Serving with Apache Flink
Idea: Exploit Flink’s low latency capabilities for serving models. Focus on offline
models loaded from a permanent storage and update them without interruption.
FLIP Proposal:
(https://docs.google.com/document/d/1ON_t9S28_2LJ91Fks2yFw0RYyeZvIvndu8
oGRPsPuk8)
Combines different efforts: https://github.com/FlinkML
● https://github.com/FlinkML/flink-jpmml (https://radicalbit.io/)
● https://github.com/FlinkML/flink-modelServer (Boris Lublinsky)
● https://github.com/FlinkML/flink-tensorflow (Eron Wright)
14
14
Pros
● Live model updating
● No Flink runtime changes
● Latency guarantees
● Not all systems cover the requirements. For example TF
Serving:
○ Metadata not available.
(https://github.com/tensorflow/serving/issues/612)
○ No new models at runtime:
(https://github.com/tensorflow/serving/issues/422)
15
Flink Low Level Join
• Create a state object for one input (or both)
• Update the state upon receiving elements from its input
• Upon receiving elements from the other input, probe the state and
produce the joined result
16
Exporting Model
Different ML packages have different export capabilities. We used
the following:
• For Spark ML export to PMML -
https://github.com/jpmml/jpmml-sparkml
• For Tensorflow:
• Normal graph export
• Saved model format
17
Model representationOn the wire
syntax = "proto3";
// Description of the trained model.
message ModelDescriptor {
// Model name
string name = 1;
// Human readable description.
string description = 2;
// Data type for which this model is applied.
string dataType = 3;
// Model type
enum ModelType {
TENSORFLOW = 0;
TENSORFLOWSAVED = 2;
PMML = 2;
};
ModelType modeltype = 4;
oneof MessageContent {
// Byte array containing the model
bytes data = 5;
string location = 6;
}
} 18
Internal
trait Model {
def score(input : AnyVal) : AnyVal
def cleanup() : Unit
def toBytes() : Array[Byte]
def getType : Long
}
trait ModelFactoryl {
def create(input : ModelDescriptor) :
Model
def restore(bytes : Array[Byte]) : Model
}
Key based join
Flink’s CoProcessFunction allows key-based merge of 2 streams. When using this API,
data is key-partitioned across multiple Flink executors. Records from both streams are
routed (based on key) to the appropriate executor that is responsible for the actual
processing.
19
Partition based join
Flink’s RichCoFlatMapFunction allows merging of 2 streams in parallel (based on
parallelization parameter). When using this API, on the partitioned stream, data from
different partitions is processed by dedicated Flink executor.
20
Monitoring
case class ModelToServeStats(
name: String, // Model name
description: String, // Model descriptor
modelType: ModelDescriptor.ModelType, // Model type
since : Long, // Start time of model usage
var usage : Long = 0, // Number of servings
var duration : Double = .0, // Time spent on serving
var min : Long = Long.MaxValue, // Min serving time
var max : Long = Long.MinValue // Max serving time
)
Model monitoring should provide information about usage, behavior,
performance and lifecycle of the deployed models
21
Queryable state
Queryable state (interactive queries) is an approach, which allows to get more from
streaming than just the processing of data. This feature allows to treat the stream
processing layer as a lightweight embedded database and, more concretely, to directly
query the current state of a stream processing application, without needing to materialize
that state to external databases or external storage first.
22
Queryable state
Limitations
Operator state (non keyed) is not queryable
https://issues.apache.org/jira/browse/FLINK-7771
23
Where are we now?
• Proposed solution does not require modification of Flink proper. It
rather represents a framework build leveraging Flink capabilities.
Although extension of the queryable state will be helpful
• Flink improvement proposal (Flip23) - model serving is close to
completion -
https://docs.google.com/document/d/1ON_t9S28_2LJ91Fks2yF
w0RYyeZvIvndu8oGRPsPuk8).
• Fully functioning prototype of solution supporting PMML and
Tensorflow is currently in the FlinkML Git repo
(https://github.com/FlinkML) 24
Next steps
• Complete Flip23 document
• Integrate Flink Tensorflow and Flink PMML
implementations to provide PMML and Tensorflow
specific model implementations - work in progress
• Look at bringing in additional model implementations
• Look at model governance and management
implementation 25
Model Serving Beyond
Flink
Author: Boris Lublinsky
Download from here:
https://info.lightbend.com/ebook-serving-machine-learning-models-register.html
26
Thank you
Any Questions?
27

More Related Content

What's hot

Mulebatch
MulebatchMulebatch
Mulebatch
shakeela shaik
 
Java spring batch
Java spring batchJava spring batch
M batching
M batchingM batching
M batching
Vasanthii Chowdary
 
work load characterization
work load characterizationwork load characterization
work load characterization
Raghu Golla
 
LWA 2015: The Apache Flink Platform for Parallel Batch and Stream Analysis
LWA 2015: The Apache Flink Platform for Parallel Batch and Stream AnalysisLWA 2015: The Apache Flink Platform for Parallel Batch and Stream Analysis
LWA 2015: The Apache Flink Platform for Parallel Batch and Stream Analysis
Jonas Traub
 
FlowSim_presentation
FlowSim_presentationFlowSim_presentation
FlowSim_presentation
Anderson Paschoalon
 
Grokking Techtalk #38: Escape Analysis in Go compiler
 Grokking Techtalk #38: Escape Analysis in Go compiler Grokking Techtalk #38: Escape Analysis in Go compiler
Grokking Techtalk #38: Escape Analysis in Go compiler
Grokking VN
 
Apache Flink Training Workshop @ HadoopCon2016 - #4 Advanced Stream Processing
Apache Flink Training Workshop @ HadoopCon2016 - #4 Advanced Stream ProcessingApache Flink Training Workshop @ HadoopCon2016 - #4 Advanced Stream Processing
Apache Flink Training Workshop @ HadoopCon2016 - #4 Advanced Stream Processing
Apache Flink Taiwan User Group
 

What's hot (9)

Sharing C++ objects in Linux
Sharing C++ objects in LinuxSharing C++ objects in Linux
Sharing C++ objects in Linux
 
Mulebatch
MulebatchMulebatch
Mulebatch
 
Java spring batch
Java spring batchJava spring batch
Java spring batch
 
M batching
M batchingM batching
M batching
 
work load characterization
work load characterizationwork load characterization
work load characterization
 
LWA 2015: The Apache Flink Platform for Parallel Batch and Stream Analysis
LWA 2015: The Apache Flink Platform for Parallel Batch and Stream AnalysisLWA 2015: The Apache Flink Platform for Parallel Batch and Stream Analysis
LWA 2015: The Apache Flink Platform for Parallel Batch and Stream Analysis
 
FlowSim_presentation
FlowSim_presentationFlowSim_presentation
FlowSim_presentation
 
Grokking Techtalk #38: Escape Analysis in Go compiler
 Grokking Techtalk #38: Escape Analysis in Go compiler Grokking Techtalk #38: Escape Analysis in Go compiler
Grokking Techtalk #38: Escape Analysis in Go compiler
 
Apache Flink Training Workshop @ HadoopCon2016 - #4 Advanced Stream Processing
Apache Flink Training Workshop @ HadoopCon2016 - #4 Advanced Stream ProcessingApache Flink Training Workshop @ HadoopCon2016 - #4 Advanced Stream Processing
Apache Flink Training Workshop @ HadoopCon2016 - #4 Advanced Stream Processing
 

Similar to Apache Flink London Meetup - Let's Talk ML on Flink

Machine Learning At Speed: Operationalizing ML For Real-Time Data Streams
Machine Learning At Speed: Operationalizing ML For Real-Time Data StreamsMachine Learning At Speed: Operationalizing ML For Real-Time Data Streams
Machine Learning At Speed: Operationalizing ML For Real-Time Data Streams
Lightbend
 
Operationalizing Machine Learning: Serving ML Models
Operationalizing Machine Learning: Serving ML ModelsOperationalizing Machine Learning: Serving ML Models
Operationalizing Machine Learning: Serving ML Models
Lightbend
 
Rock Overview
Rock OverviewRock Overview
Rock Overview
Sylvain Joyeux
 
SDN in the Management Plane: OpenConfig and Streaming Telemetry
SDN in the Management Plane: OpenConfig and Streaming TelemetrySDN in the Management Plane: OpenConfig and Streaming Telemetry
SDN in the Management Plane: OpenConfig and Streaming Telemetry
Anees Shaikh
 
Using Spark Mllib Models in a Production Training and Serving Platform: Exper...
Using Spark Mllib Models in a Production Training and Serving Platform: Exper...Using Spark Mllib Models in a Production Training and Serving Platform: Exper...
Using Spark Mllib Models in a Production Training and Serving Platform: Exper...
Databricks
 
MLflow with Databricks
MLflow with DatabricksMLflow with Databricks
MLflow with Databricks
Liangjun Jiang
 
Mlflow with databricks
Mlflow with databricksMlflow with databricks
Mlflow with databricks
Liangjun Jiang
 
Red Hat Summit 2017 - LT107508 - Better Managing your Red Hat footprint with ...
Red Hat Summit 2017 - LT107508 - Better Managing your Red Hat footprint with ...Red Hat Summit 2017 - LT107508 - Better Managing your Red Hat footprint with ...
Red Hat Summit 2017 - LT107508 - Better Managing your Red Hat footprint with ...
Miguel Pérez Colino
 
Streaming analytics state of the art
Streaming analytics state of the artStreaming analytics state of the art
Streaming analytics state of the art
Stavros Kontopoulos
 
OpenConfig: collaborating to enable programmable network management
OpenConfig: collaborating to enable programmable network managementOpenConfig: collaborating to enable programmable network management
OpenConfig: collaborating to enable programmable network management
Anees Shaikh
 
Apache Flink@ Strata & Hadoop World London
Apache Flink@ Strata & Hadoop World LondonApache Flink@ Strata & Hadoop World London
Apache Flink@ Strata & Hadoop World LondonStephan Ewen
 
Flink Forward San Francisco 2019: Massive Scale Data Processing at Netflix us...
Flink Forward San Francisco 2019: Massive Scale Data Processing at Netflix us...Flink Forward San Francisco 2019: Massive Scale Data Processing at Netflix us...
Flink Forward San Francisco 2019: Massive Scale Data Processing at Netflix us...
Flink Forward
 
Flowex - Railway Flow-Based Programming with Elixir GenStage.
Flowex - Railway Flow-Based Programming with Elixir GenStage.Flowex - Railway Flow-Based Programming with Elixir GenStage.
Flowex - Railway Flow-Based Programming with Elixir GenStage.
Anton Mishchuk
 
Flowex: Flow-Based Programming with Elixir GenStage - Anton Mishchuk
Flowex: Flow-Based Programming with Elixir GenStage - Anton MishchukFlowex: Flow-Based Programming with Elixir GenStage - Anton Mishchuk
Flowex: Flow-Based Programming with Elixir GenStage - Anton Mishchuk
Elixir Club
 
Kotlin functional programming basic@Kotlin TW study group
Kotlin  functional programming basic@Kotlin TW study groupKotlin  functional programming basic@Kotlin TW study group
Kotlin functional programming basic@Kotlin TW study group
Julian Yu-Lang Chu
 
OpenDaylight app development tutorial
OpenDaylight app development tutorialOpenDaylight app development tutorial
OpenDaylight app development tutorial
SDN Hub
 
WS-VLAM workflow
WS-VLAM workflowWS-VLAM workflow
WS-VLAM workflowguest6295d0
 
PRETZEL: Opening the Black Box of Machine Learning Prediction Serving Systems
PRETZEL: Opening the Black Box of Machine Learning Prediction Serving SystemsPRETZEL: Opening the Black Box of Machine Learning Prediction Serving Systems
PRETZEL: Opening the Black Box of Machine Learning Prediction Serving Systems
NECST Lab @ Politecnico di Milano
 
Drupal and communication
Drupal and communicationDrupal and communication
Drupal and communication
Peter Arato
 
I want to be a Data DJ!
I want to be a Data DJ!I want to be a Data DJ!
I want to be a Data DJ!
Paul Groth
 

Similar to Apache Flink London Meetup - Let's Talk ML on Flink (20)

Machine Learning At Speed: Operationalizing ML For Real-Time Data Streams
Machine Learning At Speed: Operationalizing ML For Real-Time Data StreamsMachine Learning At Speed: Operationalizing ML For Real-Time Data Streams
Machine Learning At Speed: Operationalizing ML For Real-Time Data Streams
 
Operationalizing Machine Learning: Serving ML Models
Operationalizing Machine Learning: Serving ML ModelsOperationalizing Machine Learning: Serving ML Models
Operationalizing Machine Learning: Serving ML Models
 
Rock Overview
Rock OverviewRock Overview
Rock Overview
 
SDN in the Management Plane: OpenConfig and Streaming Telemetry
SDN in the Management Plane: OpenConfig and Streaming TelemetrySDN in the Management Plane: OpenConfig and Streaming Telemetry
SDN in the Management Plane: OpenConfig and Streaming Telemetry
 
Using Spark Mllib Models in a Production Training and Serving Platform: Exper...
Using Spark Mllib Models in a Production Training and Serving Platform: Exper...Using Spark Mllib Models in a Production Training and Serving Platform: Exper...
Using Spark Mllib Models in a Production Training and Serving Platform: Exper...
 
MLflow with Databricks
MLflow with DatabricksMLflow with Databricks
MLflow with Databricks
 
Mlflow with databricks
Mlflow with databricksMlflow with databricks
Mlflow with databricks
 
Red Hat Summit 2017 - LT107508 - Better Managing your Red Hat footprint with ...
Red Hat Summit 2017 - LT107508 - Better Managing your Red Hat footprint with ...Red Hat Summit 2017 - LT107508 - Better Managing your Red Hat footprint with ...
Red Hat Summit 2017 - LT107508 - Better Managing your Red Hat footprint with ...
 
Streaming analytics state of the art
Streaming analytics state of the artStreaming analytics state of the art
Streaming analytics state of the art
 
OpenConfig: collaborating to enable programmable network management
OpenConfig: collaborating to enable programmable network managementOpenConfig: collaborating to enable programmable network management
OpenConfig: collaborating to enable programmable network management
 
Apache Flink@ Strata & Hadoop World London
Apache Flink@ Strata & Hadoop World LondonApache Flink@ Strata & Hadoop World London
Apache Flink@ Strata & Hadoop World London
 
Flink Forward San Francisco 2019: Massive Scale Data Processing at Netflix us...
Flink Forward San Francisco 2019: Massive Scale Data Processing at Netflix us...Flink Forward San Francisco 2019: Massive Scale Data Processing at Netflix us...
Flink Forward San Francisco 2019: Massive Scale Data Processing at Netflix us...
 
Flowex - Railway Flow-Based Programming with Elixir GenStage.
Flowex - Railway Flow-Based Programming with Elixir GenStage.Flowex - Railway Flow-Based Programming with Elixir GenStage.
Flowex - Railway Flow-Based Programming with Elixir GenStage.
 
Flowex: Flow-Based Programming with Elixir GenStage - Anton Mishchuk
Flowex: Flow-Based Programming with Elixir GenStage - Anton MishchukFlowex: Flow-Based Programming with Elixir GenStage - Anton Mishchuk
Flowex: Flow-Based Programming with Elixir GenStage - Anton Mishchuk
 
Kotlin functional programming basic@Kotlin TW study group
Kotlin  functional programming basic@Kotlin TW study groupKotlin  functional programming basic@Kotlin TW study group
Kotlin functional programming basic@Kotlin TW study group
 
OpenDaylight app development tutorial
OpenDaylight app development tutorialOpenDaylight app development tutorial
OpenDaylight app development tutorial
 
WS-VLAM workflow
WS-VLAM workflowWS-VLAM workflow
WS-VLAM workflow
 
PRETZEL: Opening the Black Box of Machine Learning Prediction Serving Systems
PRETZEL: Opening the Black Box of Machine Learning Prediction Serving SystemsPRETZEL: Opening the Black Box of Machine Learning Prediction Serving Systems
PRETZEL: Opening the Black Box of Machine Learning Prediction Serving Systems
 
Drupal and communication
Drupal and communicationDrupal and communication
Drupal and communication
 
I want to be a Data DJ!
I want to be a Data DJ!I want to be a Data DJ!
I want to be a Data DJ!
 

More from Stavros Kontopoulos

Serverless Machine Learning Model Inference on Kubernetes with KServe.pdf
Serverless Machine Learning Model Inference on Kubernetes with KServe.pdfServerless Machine Learning Model Inference on Kubernetes with KServe.pdf
Serverless Machine Learning Model Inference on Kubernetes with KServe.pdf
Stavros Kontopoulos
 
Online machine learning in Streaming Applications
Online machine learning in Streaming ApplicationsOnline machine learning in Streaming Applications
Online machine learning in Streaming Applications
Stavros Kontopoulos
 
ML At the Edge: Building Your Production Pipeline With Apache Spark and Tens...
ML At the Edge:  Building Your Production Pipeline With Apache Spark and Tens...ML At the Edge:  Building Your Production Pipeline With Apache Spark and Tens...
ML At the Edge: Building Your Production Pipeline With Apache Spark and Tens...
Stavros Kontopoulos
 
Machine learning at scale challenges and solutions
Machine learning at scale challenges and solutionsMachine learning at scale challenges and solutions
Machine learning at scale challenges and solutions
Stavros Kontopoulos
 
Spark Summit EU Supporting Spark (Brussels 2016)
Spark Summit EU Supporting Spark (Brussels 2016)Spark Summit EU Supporting Spark (Brussels 2016)
Spark Summit EU Supporting Spark (Brussels 2016)
Stavros Kontopoulos
 
Voxxed days thessaloniki 21/10/2016 - Streaming Engines for Big Data
Voxxed days thessaloniki 21/10/2016 - Streaming Engines for Big DataVoxxed days thessaloniki 21/10/2016 - Streaming Engines for Big Data
Voxxed days thessaloniki 21/10/2016 - Streaming Engines for Big Data
Stavros Kontopoulos
 
Trivento summercamp masterclass 9/9/2016
Trivento summercamp masterclass 9/9/2016Trivento summercamp masterclass 9/9/2016
Trivento summercamp masterclass 9/9/2016
Stavros Kontopoulos
 
Trivento summercamp fast data 9/9/2016
Trivento summercamp fast data 9/9/2016Trivento summercamp fast data 9/9/2016
Trivento summercamp fast data 9/9/2016
Stavros Kontopoulos
 
Typesafe spark- Zalando meetup
Typesafe spark- Zalando meetupTypesafe spark- Zalando meetup
Typesafe spark- Zalando meetup
Stavros Kontopoulos
 
Cassandra at Pollfish
Cassandra at PollfishCassandra at Pollfish
Cassandra at Pollfish
Stavros Kontopoulos
 

More from Stavros Kontopoulos (10)

Serverless Machine Learning Model Inference on Kubernetes with KServe.pdf
Serverless Machine Learning Model Inference on Kubernetes with KServe.pdfServerless Machine Learning Model Inference on Kubernetes with KServe.pdf
Serverless Machine Learning Model Inference on Kubernetes with KServe.pdf
 
Online machine learning in Streaming Applications
Online machine learning in Streaming ApplicationsOnline machine learning in Streaming Applications
Online machine learning in Streaming Applications
 
ML At the Edge: Building Your Production Pipeline With Apache Spark and Tens...
ML At the Edge:  Building Your Production Pipeline With Apache Spark and Tens...ML At the Edge:  Building Your Production Pipeline With Apache Spark and Tens...
ML At the Edge: Building Your Production Pipeline With Apache Spark and Tens...
 
Machine learning at scale challenges and solutions
Machine learning at scale challenges and solutionsMachine learning at scale challenges and solutions
Machine learning at scale challenges and solutions
 
Spark Summit EU Supporting Spark (Brussels 2016)
Spark Summit EU Supporting Spark (Brussels 2016)Spark Summit EU Supporting Spark (Brussels 2016)
Spark Summit EU Supporting Spark (Brussels 2016)
 
Voxxed days thessaloniki 21/10/2016 - Streaming Engines for Big Data
Voxxed days thessaloniki 21/10/2016 - Streaming Engines for Big DataVoxxed days thessaloniki 21/10/2016 - Streaming Engines for Big Data
Voxxed days thessaloniki 21/10/2016 - Streaming Engines for Big Data
 
Trivento summercamp masterclass 9/9/2016
Trivento summercamp masterclass 9/9/2016Trivento summercamp masterclass 9/9/2016
Trivento summercamp masterclass 9/9/2016
 
Trivento summercamp fast data 9/9/2016
Trivento summercamp fast data 9/9/2016Trivento summercamp fast data 9/9/2016
Trivento summercamp fast data 9/9/2016
 
Typesafe spark- Zalando meetup
Typesafe spark- Zalando meetupTypesafe spark- Zalando meetup
Typesafe spark- Zalando meetup
 
Cassandra at Pollfish
Cassandra at PollfishCassandra at Pollfish
Cassandra at Pollfish
 

Recently uploaded

Providing Globus Services to Users of JASMIN for Environmental Data Analysis
Providing Globus Services to Users of JASMIN for Environmental Data AnalysisProviding Globus Services to Users of JASMIN for Environmental Data Analysis
Providing Globus Services to Users of JASMIN for Environmental Data Analysis
Globus
 
De mooiste recreatieve routes ontdekken met RouteYou en FME
De mooiste recreatieve routes ontdekken met RouteYou en FMEDe mooiste recreatieve routes ontdekken met RouteYou en FME
De mooiste recreatieve routes ontdekken met RouteYou en FME
Jelle | Nordend
 
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.ILBeyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
Natan Silnitsky
 
SOCRadar Research Team: Latest Activities of IntelBroker
SOCRadar Research Team: Latest Activities of IntelBrokerSOCRadar Research Team: Latest Activities of IntelBroker
SOCRadar Research Team: Latest Activities of IntelBroker
SOCRadar
 
Accelerate Enterprise Software Engineering with Platformless
Accelerate Enterprise Software Engineering with PlatformlessAccelerate Enterprise Software Engineering with Platformless
Accelerate Enterprise Software Engineering with Platformless
WSO2
 
Why React Native as a Strategic Advantage for Startup Innovation.pdf
Why React Native as a Strategic Advantage for Startup Innovation.pdfWhy React Native as a Strategic Advantage for Startup Innovation.pdf
Why React Native as a Strategic Advantage for Startup Innovation.pdf
ayushiqss
 
Understanding Globus Data Transfers with NetSage
Understanding Globus Data Transfers with NetSageUnderstanding Globus Data Transfers with NetSage
Understanding Globus Data Transfers with NetSage
Globus
 
How Does XfilesPro Ensure Security While Sharing Documents in Salesforce?
How Does XfilesPro Ensure Security While Sharing Documents in Salesforce?How Does XfilesPro Ensure Security While Sharing Documents in Salesforce?
How Does XfilesPro Ensure Security While Sharing Documents in Salesforce?
XfilesPro
 
How Recreation Management Software Can Streamline Your Operations.pptx
How Recreation Management Software Can Streamline Your Operations.pptxHow Recreation Management Software Can Streamline Your Operations.pptx
How Recreation Management Software Can Streamline Your Operations.pptx
wottaspaceseo
 
Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus Compute wth IRI Workflows - GlobusWorld 2024Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus
 
Cracking the code review at SpringIO 2024
Cracking the code review at SpringIO 2024Cracking the code review at SpringIO 2024
Cracking the code review at SpringIO 2024
Paco van Beckhoven
 
GlobusWorld 2024 Opening Keynote session
GlobusWorld 2024 Opening Keynote sessionGlobusWorld 2024 Opening Keynote session
GlobusWorld 2024 Opening Keynote session
Globus
 
2024 RoOUG Security model for the cloud.pptx
2024 RoOUG Security model for the cloud.pptx2024 RoOUG Security model for the cloud.pptx
2024 RoOUG Security model for the cloud.pptx
Georgi Kodinov
 
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
Globus
 
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...
Globus
 
A Comprehensive Look at Generative AI in Retail App Testing.pdf
A Comprehensive Look at Generative AI in Retail App Testing.pdfA Comprehensive Look at Generative AI in Retail App Testing.pdf
A Comprehensive Look at Generative AI in Retail App Testing.pdf
kalichargn70th171
 
Into the Box 2024 - Keynote Day 2 Slides.pdf
Into the Box 2024 - Keynote Day 2 Slides.pdfInto the Box 2024 - Keynote Day 2 Slides.pdf
Into the Box 2024 - Keynote Day 2 Slides.pdf
Ortus Solutions, Corp
 
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdfDominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
AMB-Review
 
Paketo Buildpacks : la meilleure façon de construire des images OCI? DevopsDa...
Paketo Buildpacks : la meilleure façon de construire des images OCI? DevopsDa...Paketo Buildpacks : la meilleure façon de construire des images OCI? DevopsDa...
Paketo Buildpacks : la meilleure façon de construire des images OCI? DevopsDa...
Anthony Dahanne
 
Webinar: Salesforce Document Management 2.0 - Smarter, Faster, Better
Webinar: Salesforce Document Management 2.0 - Smarter, Faster, BetterWebinar: Salesforce Document Management 2.0 - Smarter, Faster, Better
Webinar: Salesforce Document Management 2.0 - Smarter, Faster, Better
XfilesPro
 

Recently uploaded (20)

Providing Globus Services to Users of JASMIN for Environmental Data Analysis
Providing Globus Services to Users of JASMIN for Environmental Data AnalysisProviding Globus Services to Users of JASMIN for Environmental Data Analysis
Providing Globus Services to Users of JASMIN for Environmental Data Analysis
 
De mooiste recreatieve routes ontdekken met RouteYou en FME
De mooiste recreatieve routes ontdekken met RouteYou en FMEDe mooiste recreatieve routes ontdekken met RouteYou en FME
De mooiste recreatieve routes ontdekken met RouteYou en FME
 
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.ILBeyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
 
SOCRadar Research Team: Latest Activities of IntelBroker
SOCRadar Research Team: Latest Activities of IntelBrokerSOCRadar Research Team: Latest Activities of IntelBroker
SOCRadar Research Team: Latest Activities of IntelBroker
 
Accelerate Enterprise Software Engineering with Platformless
Accelerate Enterprise Software Engineering with PlatformlessAccelerate Enterprise Software Engineering with Platformless
Accelerate Enterprise Software Engineering with Platformless
 
Why React Native as a Strategic Advantage for Startup Innovation.pdf
Why React Native as a Strategic Advantage for Startup Innovation.pdfWhy React Native as a Strategic Advantage for Startup Innovation.pdf
Why React Native as a Strategic Advantage for Startup Innovation.pdf
 
Understanding Globus Data Transfers with NetSage
Understanding Globus Data Transfers with NetSageUnderstanding Globus Data Transfers with NetSage
Understanding Globus Data Transfers with NetSage
 
How Does XfilesPro Ensure Security While Sharing Documents in Salesforce?
How Does XfilesPro Ensure Security While Sharing Documents in Salesforce?How Does XfilesPro Ensure Security While Sharing Documents in Salesforce?
How Does XfilesPro Ensure Security While Sharing Documents in Salesforce?
 
How Recreation Management Software Can Streamline Your Operations.pptx
How Recreation Management Software Can Streamline Your Operations.pptxHow Recreation Management Software Can Streamline Your Operations.pptx
How Recreation Management Software Can Streamline Your Operations.pptx
 
Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus Compute wth IRI Workflows - GlobusWorld 2024Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus Compute wth IRI Workflows - GlobusWorld 2024
 
Cracking the code review at SpringIO 2024
Cracking the code review at SpringIO 2024Cracking the code review at SpringIO 2024
Cracking the code review at SpringIO 2024
 
GlobusWorld 2024 Opening Keynote session
GlobusWorld 2024 Opening Keynote sessionGlobusWorld 2024 Opening Keynote session
GlobusWorld 2024 Opening Keynote session
 
2024 RoOUG Security model for the cloud.pptx
2024 RoOUG Security model for the cloud.pptx2024 RoOUG Security model for the cloud.pptx
2024 RoOUG Security model for the cloud.pptx
 
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...
 
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...
 
A Comprehensive Look at Generative AI in Retail App Testing.pdf
A Comprehensive Look at Generative AI in Retail App Testing.pdfA Comprehensive Look at Generative AI in Retail App Testing.pdf
A Comprehensive Look at Generative AI in Retail App Testing.pdf
 
Into the Box 2024 - Keynote Day 2 Slides.pdf
Into the Box 2024 - Keynote Day 2 Slides.pdfInto the Box 2024 - Keynote Day 2 Slides.pdf
Into the Box 2024 - Keynote Day 2 Slides.pdf
 
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdfDominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
 
Paketo Buildpacks : la meilleure façon de construire des images OCI? DevopsDa...
Paketo Buildpacks : la meilleure façon de construire des images OCI? DevopsDa...Paketo Buildpacks : la meilleure façon de construire des images OCI? DevopsDa...
Paketo Buildpacks : la meilleure façon de construire des images OCI? DevopsDa...
 
Webinar: Salesforce Document Management 2.0 - Smarter, Faster, Better
Webinar: Salesforce Document Management 2.0 - Smarter, Faster, BetterWebinar: Salesforce Document Management 2.0 - Smarter, Faster, Better
Webinar: Salesforce Document Management 2.0 - Smarter, Faster, Better
 

Apache Flink London Meetup - Let's Talk ML on Flink

  • 1. Let’s Talk ML on Flink Boris Lublinsky Stavros Kontopoulos Lightbend Apache Flink London Meetup (15/11/2017)
  • 2. Talk’s agenda • Flink ML Roadmap • Machine learning & Models • Model Serving with Flink • Results and next steps 2
  • 3. Flink ML Flink ML Roadmap: https://docs.google.com/document/d/1afQbvZBTV15qF3vobVWUjxQc49h3 Ud06MIRhahtJ6dw Model Serving Online Learning Incremental learning Batch api (https://issues.apache.org/jira/browse/FLINK-2396)? 3
  • 6. What is the Model? 6 A model is a function transforming inputs to outputs - y = f(x),for example: Linear regression: y = ac + a1*x1 + … + an*xn Neural network: Such a definition of the model allows for an easy implementation of model’s composition. From the implementation point of view it is just function composition
  • 7. Model learning pipeline UC Berkeley AMPLab introduced machine learning pipelines as a graph defining the complete chain of data transformation. 7
  • 10. Model lifecycle considerations • Models tend to change • Update frequencies vary greatly - from hourly to quarterly/yearly • Model version tracking • Model release practices • Model update process 10
  • 11. Customer SLA • Response time - time to calculate prediction • Throughput - predictions per second • Support for running multiple models • Model update - how quickly and easily can the model be updated • Uptime/reliability 11
  • 12. Model Governance Models should be: • governed by the company’s policies and procedures, laws and regulations and organization’s goals • be transparent, explainable, traceable and interpretable for auditors and regulators. • have approval and release process. • have reason code for explanation of decision. • be versioned. Reasoning for introduction of new version should be clearly specified. 12
  • 13. What are we building We want to build a streaming system allowing to update models without interruption of execution (dynamically controlled stream). 13
  • 14. Model Serving with Apache Flink Idea: Exploit Flink’s low latency capabilities for serving models. Focus on offline models loaded from a permanent storage and update them without interruption. FLIP Proposal: (https://docs.google.com/document/d/1ON_t9S28_2LJ91Fks2yFw0RYyeZvIvndu8 oGRPsPuk8) Combines different efforts: https://github.com/FlinkML ● https://github.com/FlinkML/flink-jpmml (https://radicalbit.io/) ● https://github.com/FlinkML/flink-modelServer (Boris Lublinsky) ● https://github.com/FlinkML/flink-tensorflow (Eron Wright) 14 14
  • 15. Pros ● Live model updating ● No Flink runtime changes ● Latency guarantees ● Not all systems cover the requirements. For example TF Serving: ○ Metadata not available. (https://github.com/tensorflow/serving/issues/612) ○ No new models at runtime: (https://github.com/tensorflow/serving/issues/422) 15
  • 16. Flink Low Level Join • Create a state object for one input (or both) • Update the state upon receiving elements from its input • Upon receiving elements from the other input, probe the state and produce the joined result 16
  • 17. Exporting Model Different ML packages have different export capabilities. We used the following: • For Spark ML export to PMML - https://github.com/jpmml/jpmml-sparkml • For Tensorflow: • Normal graph export • Saved model format 17
  • 18. Model representationOn the wire syntax = "proto3"; // Description of the trained model. message ModelDescriptor { // Model name string name = 1; // Human readable description. string description = 2; // Data type for which this model is applied. string dataType = 3; // Model type enum ModelType { TENSORFLOW = 0; TENSORFLOWSAVED = 2; PMML = 2; }; ModelType modeltype = 4; oneof MessageContent { // Byte array containing the model bytes data = 5; string location = 6; } } 18 Internal trait Model { def score(input : AnyVal) : AnyVal def cleanup() : Unit def toBytes() : Array[Byte] def getType : Long } trait ModelFactoryl { def create(input : ModelDescriptor) : Model def restore(bytes : Array[Byte]) : Model }
  • 19. Key based join Flink’s CoProcessFunction allows key-based merge of 2 streams. When using this API, data is key-partitioned across multiple Flink executors. Records from both streams are routed (based on key) to the appropriate executor that is responsible for the actual processing. 19
  • 20. Partition based join Flink’s RichCoFlatMapFunction allows merging of 2 streams in parallel (based on parallelization parameter). When using this API, on the partitioned stream, data from different partitions is processed by dedicated Flink executor. 20
  • 21. Monitoring case class ModelToServeStats( name: String, // Model name description: String, // Model descriptor modelType: ModelDescriptor.ModelType, // Model type since : Long, // Start time of model usage var usage : Long = 0, // Number of servings var duration : Double = .0, // Time spent on serving var min : Long = Long.MaxValue, // Min serving time var max : Long = Long.MinValue // Max serving time ) Model monitoring should provide information about usage, behavior, performance and lifecycle of the deployed models 21
  • 22. Queryable state Queryable state (interactive queries) is an approach, which allows to get more from streaming than just the processing of data. This feature allows to treat the stream processing layer as a lightweight embedded database and, more concretely, to directly query the current state of a stream processing application, without needing to materialize that state to external databases or external storage first. 22
  • 23. Queryable state Limitations Operator state (non keyed) is not queryable https://issues.apache.org/jira/browse/FLINK-7771 23
  • 24. Where are we now? • Proposed solution does not require modification of Flink proper. It rather represents a framework build leveraging Flink capabilities. Although extension of the queryable state will be helpful • Flink improvement proposal (Flip23) - model serving is close to completion - https://docs.google.com/document/d/1ON_t9S28_2LJ91Fks2yF w0RYyeZvIvndu8oGRPsPuk8). • Fully functioning prototype of solution supporting PMML and Tensorflow is currently in the FlinkML Git repo (https://github.com/FlinkML) 24
  • 25. Next steps • Complete Flip23 document • Integrate Flink Tensorflow and Flink PMML implementations to provide PMML and Tensorflow specific model implementations - work in progress • Look at bringing in additional model implementations • Look at model governance and management implementation 25
  • 26. Model Serving Beyond Flink Author: Boris Lublinsky Download from here: https://info.lightbend.com/ebook-serving-machine-learning-models-register.html 26