SlideShare a Scribd company logo
2019 © GridGain Systems
Distributed Machine
and Deep Learning at Scale
with Apache Ignite
Yuriy Babak
2019 © GridGain Systems
Yuriy Babak
Head of ML/DL framework
development at GridGain
Apache Ignite committer
2019 © GridGain Systems2019 © GridGain Systems
Agenda
● Overview of distributed ML/DL
● Model training in distributed environment
● Data preprocessing and building pipelines
● Model updates and online learning
● Some extra features
2019 © GridGain Systems
Distributed machine
learning with Apache Ignite
2019 © GridGain Systems2019 © GridGain Systems
Why do we need distributed ML?
● Data preprocessing
○ Big amount of data on different machines
○ High costs of ETL
● Model training
○ We cannot store all data on single server
○ Data irregularity
● Inference
○ Intensive data flow
○ Data co-located inference
○ Network latency
2019 © GridGain Systems2019 © GridGain Systems
Why do we need distributed ML?
● Data preprocessing
○ Big amount of data on different machines
○ High costs of ETL
● Model training
○ We cannot store all data on single server
○ Data irregularity
● Inference
○ Intensive data flow
○ Data co-located inference
○ Network latency
2019 © GridGain Systems2019 © GridGain Systems
Why do we need distributed ML?
● Data preprocessing
○ Big amount of data on different machines
○ High costs of ETL
● Model training
○ We cannot store all data on single server
○ Data irregularity
● Inference
○ Intensive data flow
○ Data co-located inference
○ Network latency
2019 © GridGain Systems2019 © GridGain Systems
Common approach
Local calculations Result aggregationModel
Model update
2019 © GridGain Systems2019 © GridGain Systems
Ignite partition-based distributed dataset
Partition Data Dataset Context Dataset Data
Upstream Cache Context Cache On-Heap
Learning Env
On-Heap
Durable Stateless Durable Recoverable
double[][] x = …
double[] y = ...
double[][] x = …
double[] y = ...
Partition Based Dataset StructuresSource Data
2019 © GridGain Systems
Model training
2019 © GridGain Systems
Algorithms: Classification
● Logistic Regression
● SVM
● KNN
● ANN
● Decision trees
● Random Forest
● Naive Bayes
#UnifiedAnalytics #SparkAISummit
2019 © GridGain Systems
Algorithms: Regression
• KNN Regression
• Linear Regression
• Decision tree regression
• Random forest
regression
• Gradient-boosted tree
regression
#UnifiedAnalytics #SparkAISummit
2019 © GridGain Systems
Algorithms: Clusterization
• K-means
• GMM
#UnifiedAnalytics #SparkAISummit
2019 © GridGain Systems
Multilayer Perceptron Neural Network
#UnifiedAnalytics #SparkAISummit
2019 © GridGain Systems2019 © GridGain Systems
Ignite ML API
Vectorizer<Integer, BinaryObject, String, Double> vectorizer =
new BinaryObjectVectorizer<Integer>(
"feature1",
“feature2”
).labeled("label");
KNNClassificationTrainer trainer = new KNNClassificationTrainer();
KNNClassificationModel mdl = trainer.fit(ignite, dataCache,
vectorizer).withK(3)
.withDistanceMeasure(new EuclideanDistance())
.withStrategy(NNStrategy.WEIGHTED);
2019 © GridGain Systems
ML Pipelines
with Apache Ignite
2019 © GridGain Systems2019 © GridGain Systems
Preprocessing and Pipelines
2019 © GridGain Systems2019 © GridGain Systems
Preprocessing
● Imputing
● Normalization
● MinMax Scaler
● MaxAbs Scaler
● Standard Scaler
● Binarization
● One-Hot Encoder
● String Encoder
2019 © GridGain Systems2019 © GridGain Systems
Preprocessing and Pipelines
IgniteCache<Integer, Vector> dataCache = TitanicUtils.readPassengers(ignite);
// Extracts "pclass", "sibsp", "parch", "sex", "embarked", "age", "fare".
Vectorizer<Integer, Vector, Integer, Double> vectorizer
= new DummyVectorizer<Integer>(0, 3, 4, 5, 6, 8, 10).labeled(1);
PipelineMdl<Integer, Vector> mdl =
new Pipeline<Integer, Vector, Integer, Double>()
.addVectorizer(vectorizer)
.addPreprocessingTrainer(new EncoderTrainer<Integer, Vector>()
.withEncoderType(EncoderType.STRING_ENCODER)
.withEncodedFeature(1)
.withEncodedFeature(6))
.addPreprocessingTrainer(new ImputerTrainer<Integer, Vector>())
.addPreprocessingTrainer(new MinMaxScalerTrainer<Integer, Vector>())
.addPreprocessingTrainer(new NormalizationTrainer<Integer, Vector>()
.withP(1))
.addTrainer(new DecisionTreeClassificationTrainer(5, 0))
.fit(ignite, dataCache);
2019 © GridGain Systems2019 © GridGain Systems
Bagging, Boosting and Stacking
DatasetTrainer<LogisticRegressionModel, Double> trainer =
new LogisticRegressionSGDTrainer()
.withUpdatesStgy(new UpdatesStrategy<>(new SimpleGDUpdateCalculator(0.2),
SimpleGDParameterUpdate.SUM_LOCAL, SimpleGDParameterUpdate.AVG))
.withMaxIterations(30000)
.withLocIterations(100)
.withBatchSize(10)
.withSeed(123L);
BaggedTrainer<Double> baggedTrainer = TrainerTransformers.makeBagged(trainer,
// ensemble size, subsample ration, feature vector size, features subspace dim
7, 0.7, 2, 2,
new onMajorityPredictionsAggregator());
2019 © GridGain Systems
Extra features
2019 © GridGain Systems2019 © GridGain Systems
Online learning
SVMLinearClassificationTrainer trainer = new SVMLinearClassificationTrainer();
SVMLinearClassificationModel mdl1 = trainer.fit(ignite, dataCache1, vectorizer);
SVMLinearClassificationModel mdl2 = trainer.update(mdl1, ignite, dataCache2,
vectorizer);
2019 © GridGain Systems2019 © GridGain Systems
Model import
2019 © GridGain Systems
TensorFlow on Apache Ignite
• Ignite Dataset
• IGFS Plugin
• Distributed Training
• More info here
#UnifiedAnalytics #SparkAISummit
>>> import tensorflow as tf
>>> from tensorflow.contrib.ignite import IgniteDataset
>>>
>>> dataset = IgniteDataset(cache_name="SQL_PUBLIC_KITTEN_CACHE")
>>> iterator = dataset.make_one_shot_iterator()
>>> next_obj = iterator.get_next()
>>>
>>> with tf.Session() as sess:
>>> for _ in range(3):
>>> print(sess.run(next_obj))
{'key': 1, 'val': {'NAME': b'WARM KITTY'}}
{'key': 2, 'val': {'NAME': b'SOFT KITTY'}}
{'key': 3, 'val': {'NAME': b'LITTLE BALL OF FUR'}}
2019 © GridGain Systems2019 © GridGain Systems
Inference in Ignite ML
2019 © GridGain Systems
Request queue
Response queue
Inference
Service
Inference
Service
Inference
Gateway
2019 © GridGain Systems2019 © GridGain Systems
IMDB with built-in ML
IgniteModelStorageUtil.saveModel(ignite, model, “titanik_model_tree”);
QueryCursor<List<?>> cursor = cache.query(new SqlFieldsQuery("select " +
"survived as truth, " +
"predict('titanik_model_tree', pclass, age, sibsp, parch, fare, case
sex when 'male' then 1 else 0 end) as prediction " +
"from titanik_train"))
2019 © GridGain Systems
2019 © GridGain Systems2019 © GridGain Systems
Apache Ignite with GridGain ML Python API
GridGain ML client library provides user applications the ability to work
with GridGain ML functionality using Py4J as an integration mechanism.
If you want to use ggml in your project, you may install it from PyPI:
$ pip install ggml
2019 © GridGain Systems2019 © GridGain Systems
Apache Ignite with GridGain ML Python API
GridGain ML client library provides user applications the ability to work
with GridGain ML functionality using Py4J as an integration mechanism.
If you want to use ggml in your project, you may install it from PyPI:
$ pip install ggml
NB: available only for Apache Ignite master and for GG 8.7.6 (17 Jul)
2019 © GridGain Systems2019 © GridGain Systems
Apache Ignite with GridGain ML Python API
from ggml.regression import LinearRegressionTrainer
trainer = LinearRegressionTrainer()
model = trainer.fit(x_train, y_train)
r2_score(y_test, model.predict(x_test))
2019 © GridGain Systems2019 © GridGain Systems
Apache Ignite ML Tutorial
https://github.com/apache/ignite/
org.apache.ignite.examples.ml.tutorial
2019 © GridGain Systems
2019 © GridGain Systems2019 © GridGain Systems
Distributed Machine and Deep Learning at
Scale with Apache Ignite
Links:
• http://ignite.apache.org/
• https://medium.com/tensorflow/tensorflow-on-apache-ignite-99f1fc60efeb
• https://github.com/gridgain/ml-python-api
• http://bit.ly/LondonJune5
Email:
● user@ignite.apache.org
● dev@ignite.apache.org
● ybabak@gridgain.com

More Related Content

What's hot

node.js on Google Compute Engine
node.js on Google Compute Enginenode.js on Google Compute Engine
node.js on Google Compute EngineArun Nagarajan
 
Distributed Database DevOps Dilemmas? Kubernetes to the Rescue
Distributed Database DevOps Dilemmas? Kubernetes to the RescueDistributed Database DevOps Dilemmas? Kubernetes to the Rescue
Distributed Database DevOps Dilemmas? Kubernetes to the Rescue
Denis Magda
 
Google professional data engineer exam dumps
Google professional data engineer exam dumpsGoogle professional data engineer exam dumps
Google professional data engineer exam dumps
TestPrep Training
 
Big data on google cloud
Big data on google cloudBig data on google cloud
Big data on google cloud
Tu Pham
 
Apache Spark and Apache Ignite: Where Fast Data Meets the IoT
Apache Spark and Apache Ignite: Where Fast Data Meets the IoTApache Spark and Apache Ignite: Where Fast Data Meets the IoT
Apache Spark and Apache Ignite: Where Fast Data Meets the IoT
Denis Magda
 
#DataUnlimited - Google Big Data Unlimited
#DataUnlimited - Google Big Data Unlimited#DataUnlimited - Google Big Data Unlimited
#DataUnlimited - Google Big Data Unlimited
Audrey Huvet
 
Auto-Pilot for Apache Spark Using Machine Learning
Auto-Pilot for Apache Spark Using Machine LearningAuto-Pilot for Apache Spark Using Machine Learning
Auto-Pilot for Apache Spark Using Machine Learning
Databricks
 
Apache kylin 2.0: from classic olap to real-time data warehouse
Apache kylin 2.0: from classic olap to real-time data warehouseApache kylin 2.0: from classic olap to real-time data warehouse
Apache kylin 2.0: from classic olap to real-time data warehouse
Yang Li
 
Google Cloud Platform for Data Science teams
Google Cloud Platform for Data Science teamsGoogle Cloud Platform for Data Science teams
Google Cloud Platform for Data Science teams
Barton Rhodes
 
Quantum Computing: The next new technology in computing
Quantum Computing: The next new technology in computingQuantum Computing: The next new technology in computing
Quantum Computing: The next new technology in computing
Data Con LA
 
Introduction to Google Cloud Platform for Big Data - Trusted Conf
Introduction to Google Cloud Platform for Big Data - Trusted ConfIntroduction to Google Cloud Platform for Big Data - Trusted Conf
Introduction to Google Cloud Platform for Big Data - Trusted Conf
In Marketing We Trust
 
GCP Gaming 2016 Keynote Seoul, Korea
GCP Gaming 2016 Keynote Seoul, KoreaGCP Gaming 2016 Keynote Seoul, Korea
GCP Gaming 2016 Keynote Seoul, Korea
Chris Jang
 
[Giovanni Galloro] How to use machine learning on Google Cloud Platform
[Giovanni Galloro] How to use machine learning on Google Cloud Platform[Giovanni Galloro] How to use machine learning on Google Cloud Platform
[Giovanni Galloro] How to use machine learning on Google Cloud Platform
MeetupDataScienceRoma
 
BigQuery for Beginners
BigQuery for BeginnersBigQuery for Beginners
BigQuery for Beginners
Better&Stronger
 
AI on Spark for Malware Analysis and Anomalous Threat Detection
AI on Spark for Malware Analysis and Anomalous Threat DetectionAI on Spark for Malware Analysis and Anomalous Threat Detection
AI on Spark for Malware Analysis and Anomalous Threat Detection
Databricks
 
BigQuery ML - Machine learning at scale using SQL
BigQuery ML - Machine learning at scale using SQLBigQuery ML - Machine learning at scale using SQL
BigQuery ML - Machine learning at scale using SQL
Márton Kodok
 
Google Cloud Platform: Prototype ->Production-> Planet scale
Google Cloud Platform: Prototype ->Production-> Planet scaleGoogle Cloud Platform: Prototype ->Production-> Planet scale
Google Cloud Platform: Prototype ->Production-> Planet scale
Idan Tohami
 
Wayne State University & DataStax: World's best data modeling tool for Apache...
Wayne State University & DataStax: World's best data modeling tool for Apache...Wayne State University & DataStax: World's best data modeling tool for Apache...
Wayne State University & DataStax: World's best data modeling tool for Apache...
DataStax Academy
 
Kubernetes on GCP
Kubernetes on GCPKubernetes on GCP
Kubernetes on GCP
Fotios Tragopoulos
 
Introduction to Google Cloud Platform
Introduction to Google Cloud PlatformIntroduction to Google Cloud Platform
Introduction to Google Cloud Platform
Sujai Prakasam
 

What's hot (20)

node.js on Google Compute Engine
node.js on Google Compute Enginenode.js on Google Compute Engine
node.js on Google Compute Engine
 
Distributed Database DevOps Dilemmas? Kubernetes to the Rescue
Distributed Database DevOps Dilemmas? Kubernetes to the RescueDistributed Database DevOps Dilemmas? Kubernetes to the Rescue
Distributed Database DevOps Dilemmas? Kubernetes to the Rescue
 
Google professional data engineer exam dumps
Google professional data engineer exam dumpsGoogle professional data engineer exam dumps
Google professional data engineer exam dumps
 
Big data on google cloud
Big data on google cloudBig data on google cloud
Big data on google cloud
 
Apache Spark and Apache Ignite: Where Fast Data Meets the IoT
Apache Spark and Apache Ignite: Where Fast Data Meets the IoTApache Spark and Apache Ignite: Where Fast Data Meets the IoT
Apache Spark and Apache Ignite: Where Fast Data Meets the IoT
 
#DataUnlimited - Google Big Data Unlimited
#DataUnlimited - Google Big Data Unlimited#DataUnlimited - Google Big Data Unlimited
#DataUnlimited - Google Big Data Unlimited
 
Auto-Pilot for Apache Spark Using Machine Learning
Auto-Pilot for Apache Spark Using Machine LearningAuto-Pilot for Apache Spark Using Machine Learning
Auto-Pilot for Apache Spark Using Machine Learning
 
Apache kylin 2.0: from classic olap to real-time data warehouse
Apache kylin 2.0: from classic olap to real-time data warehouseApache kylin 2.0: from classic olap to real-time data warehouse
Apache kylin 2.0: from classic olap to real-time data warehouse
 
Google Cloud Platform for Data Science teams
Google Cloud Platform for Data Science teamsGoogle Cloud Platform for Data Science teams
Google Cloud Platform for Data Science teams
 
Quantum Computing: The next new technology in computing
Quantum Computing: The next new technology in computingQuantum Computing: The next new technology in computing
Quantum Computing: The next new technology in computing
 
Introduction to Google Cloud Platform for Big Data - Trusted Conf
Introduction to Google Cloud Platform for Big Data - Trusted ConfIntroduction to Google Cloud Platform for Big Data - Trusted Conf
Introduction to Google Cloud Platform for Big Data - Trusted Conf
 
GCP Gaming 2016 Keynote Seoul, Korea
GCP Gaming 2016 Keynote Seoul, KoreaGCP Gaming 2016 Keynote Seoul, Korea
GCP Gaming 2016 Keynote Seoul, Korea
 
[Giovanni Galloro] How to use machine learning on Google Cloud Platform
[Giovanni Galloro] How to use machine learning on Google Cloud Platform[Giovanni Galloro] How to use machine learning on Google Cloud Platform
[Giovanni Galloro] How to use machine learning on Google Cloud Platform
 
BigQuery for Beginners
BigQuery for BeginnersBigQuery for Beginners
BigQuery for Beginners
 
AI on Spark for Malware Analysis and Anomalous Threat Detection
AI on Spark for Malware Analysis and Anomalous Threat DetectionAI on Spark for Malware Analysis and Anomalous Threat Detection
AI on Spark for Malware Analysis and Anomalous Threat Detection
 
BigQuery ML - Machine learning at scale using SQL
BigQuery ML - Machine learning at scale using SQLBigQuery ML - Machine learning at scale using SQL
BigQuery ML - Machine learning at scale using SQL
 
Google Cloud Platform: Prototype ->Production-> Planet scale
Google Cloud Platform: Prototype ->Production-> Planet scaleGoogle Cloud Platform: Prototype ->Production-> Planet scale
Google Cloud Platform: Prototype ->Production-> Planet scale
 
Wayne State University & DataStax: World's best data modeling tool for Apache...
Wayne State University & DataStax: World's best data modeling tool for Apache...Wayne State University & DataStax: World's best data modeling tool for Apache...
Wayne State University & DataStax: World's best data modeling tool for Apache...
 
Kubernetes on GCP
Kubernetes on GCPKubernetes on GCP
Kubernetes on GCP
 
Introduction to Google Cloud Platform
Introduction to Google Cloud PlatformIntroduction to Google Cloud Platform
Introduction to Google Cloud Platform
 

Similar to Big Data London v 11.0 I 'Distributed Machine & Deep Learning at Scale with Apache ignite' - Yury Babak

Big Data Helsinki v 3 | "Distributed Machine and Deep Learning at Scale with ...
Big Data Helsinki v 3 | "Distributed Machine and Deep Learning at Scale with ...Big Data Helsinki v 3 | "Distributed Machine and Deep Learning at Scale with ...
Big Data Helsinki v 3 | "Distributed Machine and Deep Learning at Scale with ...
Dataconomy Media
 
Continuous Machine and Deep Learning with Apache Ignite
Continuous Machine and Deep Learning with Apache IgniteContinuous Machine and Deep Learning with Apache Ignite
Continuous Machine and Deep Learning with Apache Ignite
Denis Magda
 
OSDC 2018 | Apache Ignite - the in-memory hammer for your data science toolki...
OSDC 2018 | Apache Ignite - the in-memory hammer for your data science toolki...OSDC 2018 | Apache Ignite - the in-memory hammer for your data science toolki...
OSDC 2018 | Apache Ignite - the in-memory hammer for your data science toolki...
NETWAYS
 
On Cloud Nine: How to be happy migrating your in-memory computing platform to...
On Cloud Nine: How to be happy migrating your in-memory computing platform to...On Cloud Nine: How to be happy migrating your in-memory computing platform to...
On Cloud Nine: How to be happy migrating your in-memory computing platform to...
Stephen Darlington
 
GDG Cloud Southlake #16: Priyanka Vergadia: Scalable Data Analytics in Google...
GDG Cloud Southlake #16: Priyanka Vergadia: Scalable Data Analytics in Google...GDG Cloud Southlake #16: Priyanka Vergadia: Scalable Data Analytics in Google...
GDG Cloud Southlake #16: Priyanka Vergadia: Scalable Data Analytics in Google...
James Anderson
 
Nexxworks bootcamp ML6 (27/09/2017)
Nexxworks bootcamp ML6 (27/09/2017)Nexxworks bootcamp ML6 (27/09/2017)
Nexxworks bootcamp ML6 (27/09/2017)
Karel Dumon
 
IRJET- Secure Skyline Queries over the Encrypted Data
IRJET- Secure Skyline Queries over the Encrypted DataIRJET- Secure Skyline Queries over the Encrypted Data
IRJET- Secure Skyline Queries over the Encrypted Data
IRJET Journal
 
“Building consistent and highly available distributed systems with Apache Ign...
“Building consistent and highly available distributed systems with Apache Ign...“Building consistent and highly available distributed systems with Apache Ign...
“Building consistent and highly available distributed systems with Apache Ign...
Tom Diederich
 
Using Grid Technologies in the Cloud for High Scalability
Using Grid Technologies in the Cloud for High ScalabilityUsing Grid Technologies in the Cloud for High Scalability
Using Grid Technologies in the Cloud for High Scalability
mabuhr
 
The Steady State Reduce Spikiness from GPU Utilization with Apache MXNet (inc...
The Steady State Reduce Spikiness from GPU Utilization with Apache MXNet (inc...The Steady State Reduce Spikiness from GPU Utilization with Apache MXNet (inc...
The Steady State Reduce Spikiness from GPU Utilization with Apache MXNet (inc...
Amazon Web Services
 
Modern-Application-Design-with-Amazon-ECS
Modern-Application-Design-with-Amazon-ECSModern-Application-Design-with-Amazon-ECS
Modern-Application-Design-with-Amazon-ECS
Amazon Web Services
 
Bring Intelligent Motion Using Reinforcement Learning Engines | SIGGRAPH 2019...
Bring Intelligent Motion Using Reinforcement Learning Engines | SIGGRAPH 2019...Bring Intelligent Motion Using Reinforcement Learning Engines | SIGGRAPH 2019...
Bring Intelligent Motion Using Reinforcement Learning Engines | SIGGRAPH 2019...
Intel® Software
 
DevOps and Machine Learning (Geekwire Cloud Tech Summit)
DevOps and Machine Learning (Geekwire Cloud Tech Summit)DevOps and Machine Learning (Geekwire Cloud Tech Summit)
DevOps and Machine Learning (Geekwire Cloud Tech Summit)
Jasjeet Thind
 
Building Deep Learning Applications with TensorFlow and SageMaker on AWS - Te...
Building Deep Learning Applications with TensorFlow and SageMaker on AWS - Te...Building Deep Learning Applications with TensorFlow and SageMaker on AWS - Te...
Building Deep Learning Applications with TensorFlow and SageMaker on AWS - Te...
Amazon Web Services
 
BOS K8S Meetup - Finetuning LLama 2 Model on GKE.pdf
BOS K8S Meetup - Finetuning LLama 2 Model on GKE.pdfBOS K8S Meetup - Finetuning LLama 2 Model on GKE.pdf
BOS K8S Meetup - Finetuning LLama 2 Model on GKE.pdf
MichaelOLeary82
 
Scaling AutoML-Driven Anomaly Detection With Luminaire
Scaling AutoML-Driven Anomaly Detection With LuminaireScaling AutoML-Driven Anomaly Detection With Luminaire
Scaling AutoML-Driven Anomaly Detection With Luminaire
Databricks
 
Data Summer Conf 2018, “Apache Ignite + Apache Spark RDDs and DataFrames inte...
Data Summer Conf 2018, “Apache Ignite + Apache Spark RDDs and DataFrames inte...Data Summer Conf 2018, “Apache Ignite + Apache Spark RDDs and DataFrames inte...
Data Summer Conf 2018, “Apache Ignite + Apache Spark RDDs and DataFrames inte...
Provectus
 
C* for Deep Learning (Andrew Jefferson, Tracktable) | Cassandra Summit 2016
C* for Deep Learning (Andrew Jefferson, Tracktable) | Cassandra Summit 2016C* for Deep Learning (Andrew Jefferson, Tracktable) | Cassandra Summit 2016
C* for Deep Learning (Andrew Jefferson, Tracktable) | Cassandra Summit 2016
DataStax
 
WML OpenPOWER presentation
WML OpenPOWER presentationWML OpenPOWER presentation
WML OpenPOWER presentation
Ganesan Narayanasamy
 
RAPIDS cuGraph – Accelerating all your Graph needs
RAPIDS cuGraph – Accelerating all your Graph needsRAPIDS cuGraph – Accelerating all your Graph needs
RAPIDS cuGraph – Accelerating all your Graph needs
Connected Data World
 

Similar to Big Data London v 11.0 I 'Distributed Machine & Deep Learning at Scale with Apache ignite' - Yury Babak (20)

Big Data Helsinki v 3 | "Distributed Machine and Deep Learning at Scale with ...
Big Data Helsinki v 3 | "Distributed Machine and Deep Learning at Scale with ...Big Data Helsinki v 3 | "Distributed Machine and Deep Learning at Scale with ...
Big Data Helsinki v 3 | "Distributed Machine and Deep Learning at Scale with ...
 
Continuous Machine and Deep Learning with Apache Ignite
Continuous Machine and Deep Learning with Apache IgniteContinuous Machine and Deep Learning with Apache Ignite
Continuous Machine and Deep Learning with Apache Ignite
 
OSDC 2018 | Apache Ignite - the in-memory hammer for your data science toolki...
OSDC 2018 | Apache Ignite - the in-memory hammer for your data science toolki...OSDC 2018 | Apache Ignite - the in-memory hammer for your data science toolki...
OSDC 2018 | Apache Ignite - the in-memory hammer for your data science toolki...
 
On Cloud Nine: How to be happy migrating your in-memory computing platform to...
On Cloud Nine: How to be happy migrating your in-memory computing platform to...On Cloud Nine: How to be happy migrating your in-memory computing platform to...
On Cloud Nine: How to be happy migrating your in-memory computing platform to...
 
GDG Cloud Southlake #16: Priyanka Vergadia: Scalable Data Analytics in Google...
GDG Cloud Southlake #16: Priyanka Vergadia: Scalable Data Analytics in Google...GDG Cloud Southlake #16: Priyanka Vergadia: Scalable Data Analytics in Google...
GDG Cloud Southlake #16: Priyanka Vergadia: Scalable Data Analytics in Google...
 
Nexxworks bootcamp ML6 (27/09/2017)
Nexxworks bootcamp ML6 (27/09/2017)Nexxworks bootcamp ML6 (27/09/2017)
Nexxworks bootcamp ML6 (27/09/2017)
 
IRJET- Secure Skyline Queries over the Encrypted Data
IRJET- Secure Skyline Queries over the Encrypted DataIRJET- Secure Skyline Queries over the Encrypted Data
IRJET- Secure Skyline Queries over the Encrypted Data
 
“Building consistent and highly available distributed systems with Apache Ign...
“Building consistent and highly available distributed systems with Apache Ign...“Building consistent and highly available distributed systems with Apache Ign...
“Building consistent and highly available distributed systems with Apache Ign...
 
Using Grid Technologies in the Cloud for High Scalability
Using Grid Technologies in the Cloud for High ScalabilityUsing Grid Technologies in the Cloud for High Scalability
Using Grid Technologies in the Cloud for High Scalability
 
The Steady State Reduce Spikiness from GPU Utilization with Apache MXNet (inc...
The Steady State Reduce Spikiness from GPU Utilization with Apache MXNet (inc...The Steady State Reduce Spikiness from GPU Utilization with Apache MXNet (inc...
The Steady State Reduce Spikiness from GPU Utilization with Apache MXNet (inc...
 
Modern-Application-Design-with-Amazon-ECS
Modern-Application-Design-with-Amazon-ECSModern-Application-Design-with-Amazon-ECS
Modern-Application-Design-with-Amazon-ECS
 
Bring Intelligent Motion Using Reinforcement Learning Engines | SIGGRAPH 2019...
Bring Intelligent Motion Using Reinforcement Learning Engines | SIGGRAPH 2019...Bring Intelligent Motion Using Reinforcement Learning Engines | SIGGRAPH 2019...
Bring Intelligent Motion Using Reinforcement Learning Engines | SIGGRAPH 2019...
 
DevOps and Machine Learning (Geekwire Cloud Tech Summit)
DevOps and Machine Learning (Geekwire Cloud Tech Summit)DevOps and Machine Learning (Geekwire Cloud Tech Summit)
DevOps and Machine Learning (Geekwire Cloud Tech Summit)
 
Building Deep Learning Applications with TensorFlow and SageMaker on AWS - Te...
Building Deep Learning Applications with TensorFlow and SageMaker on AWS - Te...Building Deep Learning Applications with TensorFlow and SageMaker on AWS - Te...
Building Deep Learning Applications with TensorFlow and SageMaker on AWS - Te...
 
BOS K8S Meetup - Finetuning LLama 2 Model on GKE.pdf
BOS K8S Meetup - Finetuning LLama 2 Model on GKE.pdfBOS K8S Meetup - Finetuning LLama 2 Model on GKE.pdf
BOS K8S Meetup - Finetuning LLama 2 Model on GKE.pdf
 
Scaling AutoML-Driven Anomaly Detection With Luminaire
Scaling AutoML-Driven Anomaly Detection With LuminaireScaling AutoML-Driven Anomaly Detection With Luminaire
Scaling AutoML-Driven Anomaly Detection With Luminaire
 
Data Summer Conf 2018, “Apache Ignite + Apache Spark RDDs and DataFrames inte...
Data Summer Conf 2018, “Apache Ignite + Apache Spark RDDs and DataFrames inte...Data Summer Conf 2018, “Apache Ignite + Apache Spark RDDs and DataFrames inte...
Data Summer Conf 2018, “Apache Ignite + Apache Spark RDDs and DataFrames inte...
 
C* for Deep Learning (Andrew Jefferson, Tracktable) | Cassandra Summit 2016
C* for Deep Learning (Andrew Jefferson, Tracktable) | Cassandra Summit 2016C* for Deep Learning (Andrew Jefferson, Tracktable) | Cassandra Summit 2016
C* for Deep Learning (Andrew Jefferson, Tracktable) | Cassandra Summit 2016
 
WML OpenPOWER presentation
WML OpenPOWER presentationWML OpenPOWER presentation
WML OpenPOWER presentation
 
RAPIDS cuGraph – Accelerating all your Graph needs
RAPIDS cuGraph – Accelerating all your Graph needsRAPIDS cuGraph – Accelerating all your Graph needs
RAPIDS cuGraph – Accelerating all your Graph needs
 

More from Dataconomy Media

Data Natives Paris v 10.0 | "Blockchain in Healthcare" - Lea Dias & David An...
Data Natives Paris v 10.0 | "Blockchain in Healthcare" - Lea Dias & 	David An...Data Natives Paris v 10.0 | "Blockchain in Healthcare" - Lea Dias & 	David An...
Data Natives Paris v 10.0 | "Blockchain in Healthcare" - Lea Dias & David An...
Dataconomy Media
 
Data Natives Frankfurt v 11.0 | "Competitive advantages with knowledge graphs...
Data Natives Frankfurt v 11.0 | "Competitive advantages with knowledge graphs...Data Natives Frankfurt v 11.0 | "Competitive advantages with knowledge graphs...
Data Natives Frankfurt v 11.0 | "Competitive advantages with knowledge graphs...
Dataconomy Media
 
Data Natives Frankfurt v 11.0 | "Can we be responsible for misuse of data & a...
Data Natives Frankfurt v 11.0 | "Can we be responsible for misuse of data & a...Data Natives Frankfurt v 11.0 | "Can we be responsible for misuse of data & a...
Data Natives Frankfurt v 11.0 | "Can we be responsible for misuse of data & a...
Dataconomy Media
 
Data Natives Munich v 12.0 | "How to be more productive with Autonomous Data ...
Data Natives Munich v 12.0 | "How to be more productive with Autonomous Data ...Data Natives Munich v 12.0 | "How to be more productive with Autonomous Data ...
Data Natives Munich v 12.0 | "How to be more productive with Autonomous Data ...
Dataconomy Media
 
Data Natives meets DataRobot | "Build and deploy an anti-money laundering mo...
Data Natives meets DataRobot |  "Build and deploy an anti-money laundering mo...Data Natives meets DataRobot |  "Build and deploy an anti-money laundering mo...
Data Natives meets DataRobot | "Build and deploy an anti-money laundering mo...
Dataconomy Media
 
Data Natives Munich v 12.0 | "Political Data Science: A tale of Fake News, So...
Data Natives Munich v 12.0 | "Political Data Science: A tale of Fake News, So...Data Natives Munich v 12.0 | "Political Data Science: A tale of Fake News, So...
Data Natives Munich v 12.0 | "Political Data Science: A tale of Fake News, So...
Dataconomy Media
 
Data Natives Vienna v 7.0 | "Building Kubernetes Operators with KUDO for Dat...
Data Natives Vienna v 7.0  | "Building Kubernetes Operators with KUDO for Dat...Data Natives Vienna v 7.0  | "Building Kubernetes Operators with KUDO for Dat...
Data Natives Vienna v 7.0 | "Building Kubernetes Operators with KUDO for Dat...
Dataconomy Media
 
Data Natives Vienna v 7.0 | "The Ingredients of Data Innovation" - Robbert de...
Data Natives Vienna v 7.0 | "The Ingredients of Data Innovation" - Robbert de...Data Natives Vienna v 7.0 | "The Ingredients of Data Innovation" - Robbert de...
Data Natives Vienna v 7.0 | "The Ingredients of Data Innovation" - Robbert de...
Dataconomy Media
 
Data Natives Cologne v 4.0 | "The Data Lorax: Planting the Seeds of Fairness...
Data Natives Cologne v 4.0  | "The Data Lorax: Planting the Seeds of Fairness...Data Natives Cologne v 4.0  | "The Data Lorax: Planting the Seeds of Fairness...
Data Natives Cologne v 4.0 | "The Data Lorax: Planting the Seeds of Fairness...
Dataconomy Media
 
Data Natives Cologne v 4.0 | "How People Analytics Can Reveal the Hidden Aspe...
Data Natives Cologne v 4.0 | "How People Analytics Can Reveal the Hidden Aspe...Data Natives Cologne v 4.0 | "How People Analytics Can Reveal the Hidden Aspe...
Data Natives Cologne v 4.0 | "How People Analytics Can Reveal the Hidden Aspe...
Dataconomy Media
 
Data Natives Amsterdam v 9.0 | "Ten Little Servers: A Story of no Downtime" -...
Data Natives Amsterdam v 9.0 | "Ten Little Servers: A Story of no Downtime" -...Data Natives Amsterdam v 9.0 | "Ten Little Servers: A Story of no Downtime" -...
Data Natives Amsterdam v 9.0 | "Ten Little Servers: A Story of no Downtime" -...
Dataconomy Media
 
Data Natives Amsterdam v 9.0 | "Point in Time Labeling at Scale" - Timothy Th...
Data Natives Amsterdam v 9.0 | "Point in Time Labeling at Scale" - Timothy Th...Data Natives Amsterdam v 9.0 | "Point in Time Labeling at Scale" - Timothy Th...
Data Natives Amsterdam v 9.0 | "Point in Time Labeling at Scale" - Timothy Th...
Dataconomy Media
 
Data Natives Hamburg v 6.0 | "Interpersonal behavior: observing Alex to under...
Data Natives Hamburg v 6.0 | "Interpersonal behavior: observing Alex to under...Data Natives Hamburg v 6.0 | "Interpersonal behavior: observing Alex to under...
Data Natives Hamburg v 6.0 | "Interpersonal behavior: observing Alex to under...
Dataconomy Media
 
Data Natives Hamburg v 6.0 | "About Surfing, Failing & Scaling" - Florian Sch...
Data Natives Hamburg v 6.0 | "About Surfing, Failing & Scaling" - Florian Sch...Data Natives Hamburg v 6.0 | "About Surfing, Failing & Scaling" - Florian Sch...
Data Natives Hamburg v 6.0 | "About Surfing, Failing & Scaling" - Florian Sch...
Dataconomy Media
 
Data NativesBerlin v 20.0 | "Serving A/B experimentation platform end-to-end"...
Data NativesBerlin v 20.0 | "Serving A/B experimentation platform end-to-end"...Data NativesBerlin v 20.0 | "Serving A/B experimentation platform end-to-end"...
Data NativesBerlin v 20.0 | "Serving A/B experimentation platform end-to-end"...
Dataconomy Media
 
Data Natives Berlin v 20.0 | "Ten Little Servers: A Story of no Downtime" - A...
Data Natives Berlin v 20.0 | "Ten Little Servers: A Story of no Downtime" - A...Data Natives Berlin v 20.0 | "Ten Little Servers: A Story of no Downtime" - A...
Data Natives Berlin v 20.0 | "Ten Little Servers: A Story of no Downtime" - A...
Dataconomy Media
 
Big Data Frankfurt meets Thinkport | "The Cloud as a Driver of Innovation" - ...
Big Data Frankfurt meets Thinkport | "The Cloud as a Driver of Innovation" - ...Big Data Frankfurt meets Thinkport | "The Cloud as a Driver of Innovation" - ...
Big Data Frankfurt meets Thinkport | "The Cloud as a Driver of Innovation" - ...
Dataconomy Media
 
Thinkport meets Frankfurt | "Financial Time Series Analysis using Wavelets" -...
Thinkport meets Frankfurt | "Financial Time Series Analysis using Wavelets" -...Thinkport meets Frankfurt | "Financial Time Series Analysis using Wavelets" -...
Thinkport meets Frankfurt | "Financial Time Series Analysis using Wavelets" -...
Dataconomy Media
 
Big Data Helsinki v 3 | "Federated Learning and Privacy-preserving AI" - Oguz...
Big Data Helsinki v 3 | "Federated Learning and Privacy-preserving AI" - Oguz...Big Data Helsinki v 3 | "Federated Learning and Privacy-preserving AI" - Oguz...
Big Data Helsinki v 3 | "Federated Learning and Privacy-preserving AI" - Oguz...
Dataconomy Media
 
Big Data Helsinki v 3 | "What you should know about PSD2 APIs?" - Joonas Tomperi
Big Data Helsinki v 3 | "What you should know about PSD2 APIs?" - Joonas TomperiBig Data Helsinki v 3 | "What you should know about PSD2 APIs?" - Joonas Tomperi
Big Data Helsinki v 3 | "What you should know about PSD2 APIs?" - Joonas Tomperi
Dataconomy Media
 

More from Dataconomy Media (20)

Data Natives Paris v 10.0 | "Blockchain in Healthcare" - Lea Dias & David An...
Data Natives Paris v 10.0 | "Blockchain in Healthcare" - Lea Dias & 	David An...Data Natives Paris v 10.0 | "Blockchain in Healthcare" - Lea Dias & 	David An...
Data Natives Paris v 10.0 | "Blockchain in Healthcare" - Lea Dias & David An...
 
Data Natives Frankfurt v 11.0 | "Competitive advantages with knowledge graphs...
Data Natives Frankfurt v 11.0 | "Competitive advantages with knowledge graphs...Data Natives Frankfurt v 11.0 | "Competitive advantages with knowledge graphs...
Data Natives Frankfurt v 11.0 | "Competitive advantages with knowledge graphs...
 
Data Natives Frankfurt v 11.0 | "Can we be responsible for misuse of data & a...
Data Natives Frankfurt v 11.0 | "Can we be responsible for misuse of data & a...Data Natives Frankfurt v 11.0 | "Can we be responsible for misuse of data & a...
Data Natives Frankfurt v 11.0 | "Can we be responsible for misuse of data & a...
 
Data Natives Munich v 12.0 | "How to be more productive with Autonomous Data ...
Data Natives Munich v 12.0 | "How to be more productive with Autonomous Data ...Data Natives Munich v 12.0 | "How to be more productive with Autonomous Data ...
Data Natives Munich v 12.0 | "How to be more productive with Autonomous Data ...
 
Data Natives meets DataRobot | "Build and deploy an anti-money laundering mo...
Data Natives meets DataRobot |  "Build and deploy an anti-money laundering mo...Data Natives meets DataRobot |  "Build and deploy an anti-money laundering mo...
Data Natives meets DataRobot | "Build and deploy an anti-money laundering mo...
 
Data Natives Munich v 12.0 | "Political Data Science: A tale of Fake News, So...
Data Natives Munich v 12.0 | "Political Data Science: A tale of Fake News, So...Data Natives Munich v 12.0 | "Political Data Science: A tale of Fake News, So...
Data Natives Munich v 12.0 | "Political Data Science: A tale of Fake News, So...
 
Data Natives Vienna v 7.0 | "Building Kubernetes Operators with KUDO for Dat...
Data Natives Vienna v 7.0  | "Building Kubernetes Operators with KUDO for Dat...Data Natives Vienna v 7.0  | "Building Kubernetes Operators with KUDO for Dat...
Data Natives Vienna v 7.0 | "Building Kubernetes Operators with KUDO for Dat...
 
Data Natives Vienna v 7.0 | "The Ingredients of Data Innovation" - Robbert de...
Data Natives Vienna v 7.0 | "The Ingredients of Data Innovation" - Robbert de...Data Natives Vienna v 7.0 | "The Ingredients of Data Innovation" - Robbert de...
Data Natives Vienna v 7.0 | "The Ingredients of Data Innovation" - Robbert de...
 
Data Natives Cologne v 4.0 | "The Data Lorax: Planting the Seeds of Fairness...
Data Natives Cologne v 4.0  | "The Data Lorax: Planting the Seeds of Fairness...Data Natives Cologne v 4.0  | "The Data Lorax: Planting the Seeds of Fairness...
Data Natives Cologne v 4.0 | "The Data Lorax: Planting the Seeds of Fairness...
 
Data Natives Cologne v 4.0 | "How People Analytics Can Reveal the Hidden Aspe...
Data Natives Cologne v 4.0 | "How People Analytics Can Reveal the Hidden Aspe...Data Natives Cologne v 4.0 | "How People Analytics Can Reveal the Hidden Aspe...
Data Natives Cologne v 4.0 | "How People Analytics Can Reveal the Hidden Aspe...
 
Data Natives Amsterdam v 9.0 | "Ten Little Servers: A Story of no Downtime" -...
Data Natives Amsterdam v 9.0 | "Ten Little Servers: A Story of no Downtime" -...Data Natives Amsterdam v 9.0 | "Ten Little Servers: A Story of no Downtime" -...
Data Natives Amsterdam v 9.0 | "Ten Little Servers: A Story of no Downtime" -...
 
Data Natives Amsterdam v 9.0 | "Point in Time Labeling at Scale" - Timothy Th...
Data Natives Amsterdam v 9.0 | "Point in Time Labeling at Scale" - Timothy Th...Data Natives Amsterdam v 9.0 | "Point in Time Labeling at Scale" - Timothy Th...
Data Natives Amsterdam v 9.0 | "Point in Time Labeling at Scale" - Timothy Th...
 
Data Natives Hamburg v 6.0 | "Interpersonal behavior: observing Alex to under...
Data Natives Hamburg v 6.0 | "Interpersonal behavior: observing Alex to under...Data Natives Hamburg v 6.0 | "Interpersonal behavior: observing Alex to under...
Data Natives Hamburg v 6.0 | "Interpersonal behavior: observing Alex to under...
 
Data Natives Hamburg v 6.0 | "About Surfing, Failing & Scaling" - Florian Sch...
Data Natives Hamburg v 6.0 | "About Surfing, Failing & Scaling" - Florian Sch...Data Natives Hamburg v 6.0 | "About Surfing, Failing & Scaling" - Florian Sch...
Data Natives Hamburg v 6.0 | "About Surfing, Failing & Scaling" - Florian Sch...
 
Data NativesBerlin v 20.0 | "Serving A/B experimentation platform end-to-end"...
Data NativesBerlin v 20.0 | "Serving A/B experimentation platform end-to-end"...Data NativesBerlin v 20.0 | "Serving A/B experimentation platform end-to-end"...
Data NativesBerlin v 20.0 | "Serving A/B experimentation platform end-to-end"...
 
Data Natives Berlin v 20.0 | "Ten Little Servers: A Story of no Downtime" - A...
Data Natives Berlin v 20.0 | "Ten Little Servers: A Story of no Downtime" - A...Data Natives Berlin v 20.0 | "Ten Little Servers: A Story of no Downtime" - A...
Data Natives Berlin v 20.0 | "Ten Little Servers: A Story of no Downtime" - A...
 
Big Data Frankfurt meets Thinkport | "The Cloud as a Driver of Innovation" - ...
Big Data Frankfurt meets Thinkport | "The Cloud as a Driver of Innovation" - ...Big Data Frankfurt meets Thinkport | "The Cloud as a Driver of Innovation" - ...
Big Data Frankfurt meets Thinkport | "The Cloud as a Driver of Innovation" - ...
 
Thinkport meets Frankfurt | "Financial Time Series Analysis using Wavelets" -...
Thinkport meets Frankfurt | "Financial Time Series Analysis using Wavelets" -...Thinkport meets Frankfurt | "Financial Time Series Analysis using Wavelets" -...
Thinkport meets Frankfurt | "Financial Time Series Analysis using Wavelets" -...
 
Big Data Helsinki v 3 | "Federated Learning and Privacy-preserving AI" - Oguz...
Big Data Helsinki v 3 | "Federated Learning and Privacy-preserving AI" - Oguz...Big Data Helsinki v 3 | "Federated Learning and Privacy-preserving AI" - Oguz...
Big Data Helsinki v 3 | "Federated Learning and Privacy-preserving AI" - Oguz...
 
Big Data Helsinki v 3 | "What you should know about PSD2 APIs?" - Joonas Tomperi
Big Data Helsinki v 3 | "What you should know about PSD2 APIs?" - Joonas TomperiBig Data Helsinki v 3 | "What you should know about PSD2 APIs?" - Joonas Tomperi
Big Data Helsinki v 3 | "What you should know about PSD2 APIs?" - Joonas Tomperi
 

Recently uploaded

Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
mikeeftimakis1
 
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionGenerative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Aggregage
 
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdfSAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
Peter Spielvogel
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
Prayukth K V
 
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdfUni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems S.M.S.A.
 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
Adtran
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
Laura Byrne
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Paige Cruz
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
Safe Software
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Albert Hoitingh
 
20240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 202420240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 2024
Matthew Sinclair
 
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
Neo4j
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
91mobiles
 
Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1
DianaGray10
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
DanBrown980551
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
Alan Dix
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
Aftab Hussain
 

Recently uploaded (20)

Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
 
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionGenerative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to Production
 
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdfSAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
 
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdfUni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdf
 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
 
20240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 202420240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 2024
 
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
 
Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
 

Big Data London v 11.0 I 'Distributed Machine & Deep Learning at Scale with Apache ignite' - Yury Babak

  • 1. 2019 © GridGain Systems Distributed Machine and Deep Learning at Scale with Apache Ignite Yuriy Babak
  • 2. 2019 © GridGain Systems Yuriy Babak Head of ML/DL framework development at GridGain Apache Ignite committer
  • 3. 2019 © GridGain Systems2019 © GridGain Systems Agenda ● Overview of distributed ML/DL ● Model training in distributed environment ● Data preprocessing and building pipelines ● Model updates and online learning ● Some extra features
  • 4. 2019 © GridGain Systems Distributed machine learning with Apache Ignite
  • 5. 2019 © GridGain Systems2019 © GridGain Systems Why do we need distributed ML? ● Data preprocessing ○ Big amount of data on different machines ○ High costs of ETL ● Model training ○ We cannot store all data on single server ○ Data irregularity ● Inference ○ Intensive data flow ○ Data co-located inference ○ Network latency
  • 6. 2019 © GridGain Systems2019 © GridGain Systems Why do we need distributed ML? ● Data preprocessing ○ Big amount of data on different machines ○ High costs of ETL ● Model training ○ We cannot store all data on single server ○ Data irregularity ● Inference ○ Intensive data flow ○ Data co-located inference ○ Network latency
  • 7. 2019 © GridGain Systems2019 © GridGain Systems Why do we need distributed ML? ● Data preprocessing ○ Big amount of data on different machines ○ High costs of ETL ● Model training ○ We cannot store all data on single server ○ Data irregularity ● Inference ○ Intensive data flow ○ Data co-located inference ○ Network latency
  • 8. 2019 © GridGain Systems2019 © GridGain Systems Common approach Local calculations Result aggregationModel Model update
  • 9. 2019 © GridGain Systems2019 © GridGain Systems Ignite partition-based distributed dataset Partition Data Dataset Context Dataset Data Upstream Cache Context Cache On-Heap Learning Env On-Heap Durable Stateless Durable Recoverable double[][] x = … double[] y = ... double[][] x = … double[] y = ... Partition Based Dataset StructuresSource Data
  • 10. 2019 © GridGain Systems Model training
  • 11. 2019 © GridGain Systems Algorithms: Classification ● Logistic Regression ● SVM ● KNN ● ANN ● Decision trees ● Random Forest ● Naive Bayes #UnifiedAnalytics #SparkAISummit
  • 12. 2019 © GridGain Systems Algorithms: Regression • KNN Regression • Linear Regression • Decision tree regression • Random forest regression • Gradient-boosted tree regression #UnifiedAnalytics #SparkAISummit
  • 13. 2019 © GridGain Systems Algorithms: Clusterization • K-means • GMM #UnifiedAnalytics #SparkAISummit
  • 14. 2019 © GridGain Systems Multilayer Perceptron Neural Network #UnifiedAnalytics #SparkAISummit
  • 15. 2019 © GridGain Systems2019 © GridGain Systems Ignite ML API Vectorizer<Integer, BinaryObject, String, Double> vectorizer = new BinaryObjectVectorizer<Integer>( "feature1", “feature2” ).labeled("label"); KNNClassificationTrainer trainer = new KNNClassificationTrainer(); KNNClassificationModel mdl = trainer.fit(ignite, dataCache, vectorizer).withK(3) .withDistanceMeasure(new EuclideanDistance()) .withStrategy(NNStrategy.WEIGHTED);
  • 16. 2019 © GridGain Systems ML Pipelines with Apache Ignite
  • 17. 2019 © GridGain Systems2019 © GridGain Systems Preprocessing and Pipelines
  • 18. 2019 © GridGain Systems2019 © GridGain Systems Preprocessing ● Imputing ● Normalization ● MinMax Scaler ● MaxAbs Scaler ● Standard Scaler ● Binarization ● One-Hot Encoder ● String Encoder
  • 19. 2019 © GridGain Systems2019 © GridGain Systems Preprocessing and Pipelines IgniteCache<Integer, Vector> dataCache = TitanicUtils.readPassengers(ignite); // Extracts "pclass", "sibsp", "parch", "sex", "embarked", "age", "fare". Vectorizer<Integer, Vector, Integer, Double> vectorizer = new DummyVectorizer<Integer>(0, 3, 4, 5, 6, 8, 10).labeled(1); PipelineMdl<Integer, Vector> mdl = new Pipeline<Integer, Vector, Integer, Double>() .addVectorizer(vectorizer) .addPreprocessingTrainer(new EncoderTrainer<Integer, Vector>() .withEncoderType(EncoderType.STRING_ENCODER) .withEncodedFeature(1) .withEncodedFeature(6)) .addPreprocessingTrainer(new ImputerTrainer<Integer, Vector>()) .addPreprocessingTrainer(new MinMaxScalerTrainer<Integer, Vector>()) .addPreprocessingTrainer(new NormalizationTrainer<Integer, Vector>() .withP(1)) .addTrainer(new DecisionTreeClassificationTrainer(5, 0)) .fit(ignite, dataCache);
  • 20. 2019 © GridGain Systems2019 © GridGain Systems Bagging, Boosting and Stacking DatasetTrainer<LogisticRegressionModel, Double> trainer = new LogisticRegressionSGDTrainer() .withUpdatesStgy(new UpdatesStrategy<>(new SimpleGDUpdateCalculator(0.2), SimpleGDParameterUpdate.SUM_LOCAL, SimpleGDParameterUpdate.AVG)) .withMaxIterations(30000) .withLocIterations(100) .withBatchSize(10) .withSeed(123L); BaggedTrainer<Double> baggedTrainer = TrainerTransformers.makeBagged(trainer, // ensemble size, subsample ration, feature vector size, features subspace dim 7, 0.7, 2, 2, new onMajorityPredictionsAggregator());
  • 21. 2019 © GridGain Systems Extra features
  • 22. 2019 © GridGain Systems2019 © GridGain Systems Online learning SVMLinearClassificationTrainer trainer = new SVMLinearClassificationTrainer(); SVMLinearClassificationModel mdl1 = trainer.fit(ignite, dataCache1, vectorizer); SVMLinearClassificationModel mdl2 = trainer.update(mdl1, ignite, dataCache2, vectorizer);
  • 23. 2019 © GridGain Systems2019 © GridGain Systems Model import
  • 24. 2019 © GridGain Systems TensorFlow on Apache Ignite • Ignite Dataset • IGFS Plugin • Distributed Training • More info here #UnifiedAnalytics #SparkAISummit >>> import tensorflow as tf >>> from tensorflow.contrib.ignite import IgniteDataset >>> >>> dataset = IgniteDataset(cache_name="SQL_PUBLIC_KITTEN_CACHE") >>> iterator = dataset.make_one_shot_iterator() >>> next_obj = iterator.get_next() >>> >>> with tf.Session() as sess: >>> for _ in range(3): >>> print(sess.run(next_obj)) {'key': 1, 'val': {'NAME': b'WARM KITTY'}} {'key': 2, 'val': {'NAME': b'SOFT KITTY'}} {'key': 3, 'val': {'NAME': b'LITTLE BALL OF FUR'}}
  • 25. 2019 © GridGain Systems2019 © GridGain Systems Inference in Ignite ML 2019 © GridGain Systems Request queue Response queue Inference Service Inference Service Inference Gateway
  • 26. 2019 © GridGain Systems2019 © GridGain Systems IMDB with built-in ML IgniteModelStorageUtil.saveModel(ignite, model, “titanik_model_tree”); QueryCursor<List<?>> cursor = cache.query(new SqlFieldsQuery("select " + "survived as truth, " + "predict('titanik_model_tree', pclass, age, sibsp, parch, fare, case sex when 'male' then 1 else 0 end) as prediction " + "from titanik_train")) 2019 © GridGain Systems
  • 27. 2019 © GridGain Systems2019 © GridGain Systems Apache Ignite with GridGain ML Python API GridGain ML client library provides user applications the ability to work with GridGain ML functionality using Py4J as an integration mechanism. If you want to use ggml in your project, you may install it from PyPI: $ pip install ggml
  • 28. 2019 © GridGain Systems2019 © GridGain Systems Apache Ignite with GridGain ML Python API GridGain ML client library provides user applications the ability to work with GridGain ML functionality using Py4J as an integration mechanism. If you want to use ggml in your project, you may install it from PyPI: $ pip install ggml NB: available only for Apache Ignite master and for GG 8.7.6 (17 Jul)
  • 29. 2019 © GridGain Systems2019 © GridGain Systems Apache Ignite with GridGain ML Python API from ggml.regression import LinearRegressionTrainer trainer = LinearRegressionTrainer() model = trainer.fit(x_train, y_train) r2_score(y_test, model.predict(x_test))
  • 30. 2019 © GridGain Systems2019 © GridGain Systems Apache Ignite ML Tutorial https://github.com/apache/ignite/ org.apache.ignite.examples.ml.tutorial 2019 © GridGain Systems
  • 31. 2019 © GridGain Systems2019 © GridGain Systems Distributed Machine and Deep Learning at Scale with Apache Ignite Links: • http://ignite.apache.org/ • https://medium.com/tensorflow/tensorflow-on-apache-ignite-99f1fc60efeb • https://github.com/gridgain/ml-python-api • http://bit.ly/LondonJune5 Email: ● user@ignite.apache.org ● dev@ignite.apache.org ● ybabak@gridgain.com