SlideShare a Scribd company logo
1 of 25
Production Grade
Data Science for Hadoop
Villu Ruusmann
Openscoring OÜ
About openscoring.io
"Standards-based, Open-source Middleware for
Predictive Analytics Applications"
aka
"Rapid deployment of R and Scikit-Learn models on JVM"
2/25
"From Laboratory to Factory"
3/25
From grams (to kilograms) to megagrams
Scaling the vessel / Hardware
Scaling the chemical reaction / Software and processes
4/25
Scalability through re-engineering
5/25
Broader objectives
● Platform
○ Portability of applications
● Application
○ Central governance and dissemination of models
● Model
○ "Decisioning as a Service"
● Decision
○ Traceability, reproducibility, explainability
6/25
R and
Scikit-Learn
PMML
PFA
Java and C
Domain-Specific Languages
General-Purpose Languages
7/25
The PMML connection
8/25
X model producers
Y model consumers
(ten years into past & future)
1 model API
9/25
Model API pipeline
Conversion Deployment
Ephermeal:
Persistent, asset-like:
Conversion Maintenance Deployment
10/25
Conversion into PMML
Capturing and expressing the essentials of the modeling
workflow using PMML vocabulary:
Input → Feature vector → Response vector → Output
Connecting stable data schemas
11/25
12/25
13/25
Standardized representation
R
ada, cforest, gbm,
randomForest, xgb.Booster
Scikit-Learn
AdaBoostClassifier,
BaggingClassifier,
ExtraTreesClassifier,
GradientBoostingClassifier,
RandomForestClassifier
<MiningModel function="classification">
<Segmentation
multipleModelMethod="weightedAverage"
>
<Segment id="1" weight="1">
<True/>
<TreeModel>
<Node>
...
</Node>
</TreeModel>
</Segment>
...
</Segmentation>
</MiningModel>
14/25
Supra-standardized representation
Model model = MiningModelUtil.createClassifierEnsemble(
MultipleModelMethodType.WEIGHTED_AVERAGE,
Arrays.asList(
PMMLUtil.loadModel("xgboost.pmml"),
PMMLUtil.loadModel("keras-mlp.pmml"),
PMMLUtil.loadModel("sklearn-rf.pmml")
),
Arrays.asList(5d/9d, 2d/9d, 2d/9d)
);
PMMLUtil.storeModel(model, "kaggle-submission.pmml");
15/25
State machine for model maintenance
Enhanced
Enhanced
Enhanced
StandardizedRaw Enhanced Optimized
16/25
Model as a function
Target = f(Active1, Active2, .., Activen)
Outputfeature = ffeature(Target)
17/25
Model metadata API
Evaluator evaluator = getEvaluator();
List<FieldName> activeFields = evaluator.getActiveFields();
for(FieldName activeField : activeFields){
DataField dataField = evaluator.getDataField(activeField);
// Inspect data type, operational type, value space etc.
}
18/25
Model evaluation API
Evaluator evaluator = getEvaluator();
while(!done){
Map<FieldName, ?> arguments = readInRecord();
Map<FieldName, ?> results = evaluator.evaluate(arguments);
writeOutRecord(results);
}
19/25
http://github.com/jpmml/jpmml-${framework}
Volume
Velocity
{ REST }
20/25
JPMML-Cascading
Evaluator evaluator = getEvaluator();
PMMLPlanner pmmlPlanner = new PMMLPlanner(evaluator);
pmmlPlanner.setHeadName("input");
pmmlPlanner.setTailName("output");
FlowDef flowDef = ...;
flowDef.addAssemblyPlanner(pmmlPlanner);
21/25
JPMML-Pig
grunt> REGISTER jpmml-pig-distributable-1.0.jar;
grunt> DEFINE my_udf org.jpmml.pig.PMMLFunc('model.pmml');
grunt> output = FOREACH input GENERATE my_udf(*);
22/25
JPMML-Spark
Evaluator evaluator = getEvaluator();
PMMLFunction pmmlFunction = new PMMLFunction(evaluator);
JavaRDD<Row> input = ...;
JavaRDD<Row> output = input.map(pmmlFunction);
23/25
Thoughts on API-driven architecture
● Based on a relevant standard
○ "Conventions over configuration"
● High(est) abstraction level
○ Productivity
○ Maintainability
● End-to-end value proposition
○ Separation of concerns
24/25
Q&A
villu@openscoring.io
http://openscoring.io
http://github.com/jpmml
25/25

More Related Content

What's hot

Building Custom Machine Learning Algorithms With Apache SystemML
Building Custom Machine Learning Algorithms With Apache SystemMLBuilding Custom Machine Learning Algorithms With Apache SystemML
Building Custom Machine Learning Algorithms With Apache SystemMLJen Aman
 
Big Data Day LA 2016/ Use Case Driven track - From Clusters to Clouds, Hardwa...
Big Data Day LA 2016/ Use Case Driven track - From Clusters to Clouds, Hardwa...Big Data Day LA 2016/ Use Case Driven track - From Clusters to Clouds, Hardwa...
Big Data Day LA 2016/ Use Case Driven track - From Clusters to Clouds, Hardwa...Data Con LA
 
How to Optimize Hortonworks Apache Spark ML Workloads on Modern Processors
How to Optimize Hortonworks Apache Spark ML Workloads on Modern Processors How to Optimize Hortonworks Apache Spark ML Workloads on Modern Processors
How to Optimize Hortonworks Apache Spark ML Workloads on Modern Processors DataWorks Summit/Hadoop Summit
 
Vertex AI: Pipelines for your MLOps workflows
Vertex AI: Pipelines for your MLOps workflowsVertex AI: Pipelines for your MLOps workflows
Vertex AI: Pipelines for your MLOps workflowsMárton Kodok
 
Scaling Data Science on Big Data
Scaling Data Science on Big DataScaling Data Science on Big Data
Scaling Data Science on Big DataDataWorks Summit
 
Munich Re: Driving a Big Data Transformation
Munich Re: Driving a Big Data TransformationMunich Re: Driving a Big Data Transformation
Munich Re: Driving a Big Data TransformationDataWorks Summit
 
Feature Store as a Data Foundation for Machine Learning
Feature Store as a Data Foundation for Machine LearningFeature Store as a Data Foundation for Machine Learning
Feature Store as a Data Foundation for Machine LearningProvectus
 
Streaming analytics state of the art
Streaming analytics state of the artStreaming analytics state of the art
Streaming analytics state of the artStavros Kontopoulos
 
Lessons learned processing 70 billion data points a day using the hybrid cloud
Lessons learned processing 70 billion data points a day using the hybrid cloudLessons learned processing 70 billion data points a day using the hybrid cloud
Lessons learned processing 70 billion data points a day using the hybrid cloudDataWorks Summit
 
ML Workshop 1: A New Architecture for Machine Learning Logistics
ML Workshop 1: A New Architecture for Machine Learning LogisticsML Workshop 1: A New Architecture for Machine Learning Logistics
ML Workshop 1: A New Architecture for Machine Learning LogisticsMapR Technologies
 
Lessons Learned from Using Spark for Evaluating Road Detection at BMW Autonom...
Lessons Learned from Using Spark for Evaluating Road Detection at BMW Autonom...Lessons Learned from Using Spark for Evaluating Road Detection at BMW Autonom...
Lessons Learned from Using Spark for Evaluating Road Detection at BMW Autonom...Databricks
 
Paris FOD Meetup #5 Hortonworks Presentation
Paris FOD Meetup #5 Hortonworks PresentationParis FOD Meetup #5 Hortonworks Presentation
Paris FOD Meetup #5 Hortonworks PresentationAbdelkrim Hadjidj
 
Hadoop Summit EU 2013: Parallel Linear Regression, IterativeReduce, and YARN
Hadoop Summit EU 2013: Parallel Linear Regression, IterativeReduce, and YARNHadoop Summit EU 2013: Parallel Linear Regression, IterativeReduce, and YARN
Hadoop Summit EU 2013: Parallel Linear Regression, IterativeReduce, and YARNJosh Patterson
 
Paris FOD Meetup #5 Cognizant Presentation
Paris FOD Meetup #5 Cognizant PresentationParis FOD Meetup #5 Cognizant Presentation
Paris FOD Meetup #5 Cognizant PresentationAbdelkrim Hadjidj
 
Operationalizing Edge Machine Learning with Apache Spark with Nisha Talagala ...
Operationalizing Edge Machine Learning with Apache Spark with Nisha Talagala ...Operationalizing Edge Machine Learning with Apache Spark with Nisha Talagala ...
Operationalizing Edge Machine Learning with Apache Spark with Nisha Talagala ...Databricks
 

What's hot (20)

Data Science Crash Course
Data Science Crash CourseData Science Crash Course
Data Science Crash Course
 
Building Custom Machine Learning Algorithms With Apache SystemML
Building Custom Machine Learning Algorithms With Apache SystemMLBuilding Custom Machine Learning Algorithms With Apache SystemML
Building Custom Machine Learning Algorithms With Apache SystemML
 
NextGenML
NextGenML NextGenML
NextGenML
 
Big Data Day LA 2016/ Use Case Driven track - From Clusters to Clouds, Hardwa...
Big Data Day LA 2016/ Use Case Driven track - From Clusters to Clouds, Hardwa...Big Data Day LA 2016/ Use Case Driven track - From Clusters to Clouds, Hardwa...
Big Data Day LA 2016/ Use Case Driven track - From Clusters to Clouds, Hardwa...
 
Building a Scalable Data Science Platform with R
Building a Scalable Data Science Platform with RBuilding a Scalable Data Science Platform with R
Building a Scalable Data Science Platform with R
 
How to Optimize Hortonworks Apache Spark ML Workloads on Modern Processors
How to Optimize Hortonworks Apache Spark ML Workloads on Modern Processors How to Optimize Hortonworks Apache Spark ML Workloads on Modern Processors
How to Optimize Hortonworks Apache Spark ML Workloads on Modern Processors
 
Data Science Crash Course
Data Science Crash CourseData Science Crash Course
Data Science Crash Course
 
Vertex AI: Pipelines for your MLOps workflows
Vertex AI: Pipelines for your MLOps workflowsVertex AI: Pipelines for your MLOps workflows
Vertex AI: Pipelines for your MLOps workflows
 
Scaling Data Science on Big Data
Scaling Data Science on Big DataScaling Data Science on Big Data
Scaling Data Science on Big Data
 
Munich Re: Driving a Big Data Transformation
Munich Re: Driving a Big Data TransformationMunich Re: Driving a Big Data Transformation
Munich Re: Driving a Big Data Transformation
 
Feature Store as a Data Foundation for Machine Learning
Feature Store as a Data Foundation for Machine LearningFeature Store as a Data Foundation for Machine Learning
Feature Store as a Data Foundation for Machine Learning
 
Streaming analytics state of the art
Streaming analytics state of the artStreaming analytics state of the art
Streaming analytics state of the art
 
Lessons learned processing 70 billion data points a day using the hybrid cloud
Lessons learned processing 70 billion data points a day using the hybrid cloudLessons learned processing 70 billion data points a day using the hybrid cloud
Lessons learned processing 70 billion data points a day using the hybrid cloud
 
ML Workshop 1: A New Architecture for Machine Learning Logistics
ML Workshop 1: A New Architecture for Machine Learning LogisticsML Workshop 1: A New Architecture for Machine Learning Logistics
ML Workshop 1: A New Architecture for Machine Learning Logistics
 
Lessons Learned from Using Spark for Evaluating Road Detection at BMW Autonom...
Lessons Learned from Using Spark for Evaluating Road Detection at BMW Autonom...Lessons Learned from Using Spark for Evaluating Road Detection at BMW Autonom...
Lessons Learned from Using Spark for Evaluating Road Detection at BMW Autonom...
 
Paris FOD Meetup #5 Hortonworks Presentation
Paris FOD Meetup #5 Hortonworks PresentationParis FOD Meetup #5 Hortonworks Presentation
Paris FOD Meetup #5 Hortonworks Presentation
 
Hadoop Summit EU 2013: Parallel Linear Regression, IterativeReduce, and YARN
Hadoop Summit EU 2013: Parallel Linear Regression, IterativeReduce, and YARNHadoop Summit EU 2013: Parallel Linear Regression, IterativeReduce, and YARN
Hadoop Summit EU 2013: Parallel Linear Regression, IterativeReduce, and YARN
 
DevOps for DataScience
DevOps for DataScienceDevOps for DataScience
DevOps for DataScience
 
Paris FOD Meetup #5 Cognizant Presentation
Paris FOD Meetup #5 Cognizant PresentationParis FOD Meetup #5 Cognizant Presentation
Paris FOD Meetup #5 Cognizant Presentation
 
Operationalizing Edge Machine Learning with Apache Spark with Nisha Talagala ...
Operationalizing Edge Machine Learning with Apache Spark with Nisha Talagala ...Operationalizing Edge Machine Learning with Apache Spark with Nisha Talagala ...
Operationalizing Edge Machine Learning with Apache Spark with Nisha Talagala ...
 

Viewers also liked

Representing TF and TF-IDF transformations in PMML
Representing TF and TF-IDF transformations in PMMLRepresenting TF and TF-IDF transformations in PMML
Representing TF and TF-IDF transformations in PMMLVillu Ruusmann
 
R, Scikit-Learn and Apache Spark ML - What difference does it make?
R, Scikit-Learn and Apache Spark ML - What difference does it make?R, Scikit-Learn and Apache Spark ML - What difference does it make?
R, Scikit-Learn and Apache Spark ML - What difference does it make?Villu Ruusmann
 
Pattern: An Open Source Project for Migrating Predictive Models from SAS
Pattern: An Open Source Project for Migrating Predictive Models from SASPattern: An Open Source Project for Migrating Predictive Models from SAS
Pattern: An Open Source Project for Migrating Predictive Models from SASDataWorks Summit
 
On the representation and reuse of machine learning (ML) models
On the representation and reuse of machine learning (ML) modelsOn the representation and reuse of machine learning (ML) models
On the representation and reuse of machine learning (ML) modelsVillu Ruusmann
 
Kafka Summit SF Apr 26 2016 - Generating Real-time Recommendations with NiFi,...
Kafka Summit SF Apr 26 2016 - Generating Real-time Recommendations with NiFi,...Kafka Summit SF Apr 26 2016 - Generating Real-time Recommendations with NiFi,...
Kafka Summit SF Apr 26 2016 - Generating Real-time Recommendations with NiFi,...Chris Fregly
 
Operationalizing analytics to scale
Operationalizing analytics to scaleOperationalizing analytics to scale
Operationalizing analytics to scaleLooker
 
Product Update: EDB Postgres Platform 2017
Product Update: EDB Postgres Platform 2017Product Update: EDB Postgres Platform 2017
Product Update: EDB Postgres Platform 2017EDB
 
Apache NiFi- MiNiFi meetup Slides
Apache NiFi- MiNiFi meetup SlidesApache NiFi- MiNiFi meetup Slides
Apache NiFi- MiNiFi meetup SlidesIsheeta Sanghi
 
Real time Analytics with Apache Kafka and Apache Spark
Real time Analytics with Apache Kafka and Apache SparkReal time Analytics with Apache Kafka and Apache Spark
Real time Analytics with Apache Kafka and Apache SparkRahul Jain
 
Apache Storm 0.9 basic training - Verisign
Apache Storm 0.9 basic training - VerisignApache Storm 0.9 basic training - Verisign
Apache Storm 0.9 basic training - VerisignMichael Noll
 
Apache Kafka 0.8 basic training - Verisign
Apache Kafka 0.8 basic training - VerisignApache Kafka 0.8 basic training - Verisign
Apache Kafka 0.8 basic training - VerisignMichael Noll
 
Introduction to Kafka and Zookeeper
Introduction to Kafka and ZookeeperIntroduction to Kafka and Zookeeper
Introduction to Kafka and ZookeeperRahul Jain
 
Hadoop & HDFS for Beginners
Hadoop & HDFS for BeginnersHadoop & HDFS for Beginners
Hadoop & HDFS for BeginnersRahul Jain
 

Viewers also liked (15)

Representing TF and TF-IDF transformations in PMML
Representing TF and TF-IDF transformations in PMMLRepresenting TF and TF-IDF transformations in PMML
Representing TF and TF-IDF transformations in PMML
 
R, Scikit-Learn and Apache Spark ML - What difference does it make?
R, Scikit-Learn and Apache Spark ML - What difference does it make?R, Scikit-Learn and Apache Spark ML - What difference does it make?
R, Scikit-Learn and Apache Spark ML - What difference does it make?
 
Pattern: An Open Source Project for Migrating Predictive Models from SAS
Pattern: An Open Source Project for Migrating Predictive Models from SASPattern: An Open Source Project for Migrating Predictive Models from SAS
Pattern: An Open Source Project for Migrating Predictive Models from SAS
 
Yace 3.0
Yace 3.0Yace 3.0
Yace 3.0
 
On the representation and reuse of machine learning (ML) models
On the representation and reuse of machine learning (ML) modelsOn the representation and reuse of machine learning (ML) models
On the representation and reuse of machine learning (ML) models
 
Kafka Summit SF Apr 26 2016 - Generating Real-time Recommendations with NiFi,...
Kafka Summit SF Apr 26 2016 - Generating Real-time Recommendations with NiFi,...Kafka Summit SF Apr 26 2016 - Generating Real-time Recommendations with NiFi,...
Kafka Summit SF Apr 26 2016 - Generating Real-time Recommendations with NiFi,...
 
Operationalizing analytics to scale
Operationalizing analytics to scaleOperationalizing analytics to scale
Operationalizing analytics to scale
 
Product Update: EDB Postgres Platform 2017
Product Update: EDB Postgres Platform 2017Product Update: EDB Postgres Platform 2017
Product Update: EDB Postgres Platform 2017
 
Integrating Apache Spark and NiFi for Data Lakes
Integrating Apache Spark and NiFi for Data LakesIntegrating Apache Spark and NiFi for Data Lakes
Integrating Apache Spark and NiFi for Data Lakes
 
Apache NiFi- MiNiFi meetup Slides
Apache NiFi- MiNiFi meetup SlidesApache NiFi- MiNiFi meetup Slides
Apache NiFi- MiNiFi meetup Slides
 
Real time Analytics with Apache Kafka and Apache Spark
Real time Analytics with Apache Kafka and Apache SparkReal time Analytics with Apache Kafka and Apache Spark
Real time Analytics with Apache Kafka and Apache Spark
 
Apache Storm 0.9 basic training - Verisign
Apache Storm 0.9 basic training - VerisignApache Storm 0.9 basic training - Verisign
Apache Storm 0.9 basic training - Verisign
 
Apache Kafka 0.8 basic training - Verisign
Apache Kafka 0.8 basic training - VerisignApache Kafka 0.8 basic training - Verisign
Apache Kafka 0.8 basic training - Verisign
 
Introduction to Kafka and Zookeeper
Introduction to Kafka and ZookeeperIntroduction to Kafka and Zookeeper
Introduction to Kafka and Zookeeper
 
Hadoop & HDFS for Beginners
Hadoop & HDFS for BeginnersHadoop & HDFS for Beginners
Hadoop & HDFS for Beginners
 

Similar to Production Grade Data Science for Hadoop

Accelerating SAP transformations with Micro Focus
Accelerating SAP transformations with Micro FocusAccelerating SAP transformations with Micro Focus
Accelerating SAP transformations with Micro FocusChristian Schuetz
 
Confluent Partner Tech Talk with QLIK
Confluent Partner Tech Talk with QLIKConfluent Partner Tech Talk with QLIK
Confluent Partner Tech Talk with QLIKconfluent
 
Atul_Oracle Functional Distribution and WMS Consultant
Atul_Oracle Functional Distribution and WMS ConsultantAtul_Oracle Functional Distribution and WMS Consultant
Atul_Oracle Functional Distribution and WMS ConsultantAtul Kumar
 
Oracle Forms - stay or move on ? Webinar by Kumaran Systems
Oracle Forms - stay or move on ? Webinar by Kumaran SystemsOracle Forms - stay or move on ? Webinar by Kumaran Systems
Oracle Forms - stay or move on ? Webinar by Kumaran SystemsKumaran Systems Inc
 
[DSC Europe 22] Reproducibility and Versioning of ML Systems - Spela Poklukar
[DSC Europe 22] Reproducibility and Versioning of ML Systems - Spela Poklukar[DSC Europe 22] Reproducibility and Versioning of ML Systems - Spela Poklukar
[DSC Europe 22] Reproducibility and Versioning of ML Systems - Spela PoklukarDataScienceConferenc1
 
Extending open source and hybrid cloud to drive OT transformation - Future Oi...
Extending open source and hybrid cloud to drive OT transformation - Future Oi...Extending open source and hybrid cloud to drive OT transformation - Future Oi...
Extending open source and hybrid cloud to drive OT transformation - Future Oi...John Archer
 
Scott A Frantz resume
Scott A Frantz resumeScott A Frantz resume
Scott A Frantz resumeScott Frantz
 
Responding to Fukushima: Real-Time, Interactive, Beyond Design Basis Modeling...
Responding to Fukushima: Real-Time, Interactive, Beyond Design Basis Modeling...Responding to Fukushima: Real-Time, Interactive, Beyond Design Basis Modeling...
Responding to Fukushima: Real-Time, Interactive, Beyond Design Basis Modeling...GSE Systems, Inc.
 
Clipper: A Low-Latency Online Prediction Serving System: Spark Summit East ta...
Clipper: A Low-Latency Online Prediction Serving System: Spark Summit East ta...Clipper: A Low-Latency Online Prediction Serving System: Spark Summit East ta...
Clipper: A Low-Latency Online Prediction Serving System: Spark Summit East ta...Spark Summit
 
(Technologies) AI, Machine Learning, Predictive Analytics, IIOT, Cloud,Web-fr...
(Technologies) AI, Machine Learning, Predictive Analytics, IIOT, Cloud,Web-fr...(Technologies) AI, Machine Learning, Predictive Analytics, IIOT, Cloud,Web-fr...
(Technologies) AI, Machine Learning, Predictive Analytics, IIOT, Cloud,Web-fr...Farhan Tariq
 
CV_SyedShoeb_2015
CV_SyedShoeb_2015CV_SyedShoeb_2015
CV_SyedShoeb_2015Syed Shoeb
 
A Hybrid Cloud MultiCloud Approach to Streamline Supply Chain Data Flow
A Hybrid Cloud MultiCloud Approach to Streamline Supply Chain Data FlowA Hybrid Cloud MultiCloud Approach to Streamline Supply Chain Data Flow
A Hybrid Cloud MultiCloud Approach to Streamline Supply Chain Data Flowjagada7
 
Pure App + Patterns + Prolifics = Feeding Change
Pure App + Patterns + Prolifics = Feeding Change Pure App + Patterns + Prolifics = Feeding Change
Pure App + Patterns + Prolifics = Feeding Change Prolifics
 

Similar to Production Grade Data Science for Hadoop (20)

Accelerating SAP transformations with Micro Focus
Accelerating SAP transformations with Micro FocusAccelerating SAP transformations with Micro Focus
Accelerating SAP transformations with Micro Focus
 
VTT Wireless access - More radio capacity with less energy
VTT Wireless access - More radio capacity with less energyVTT Wireless access - More radio capacity with less energy
VTT Wireless access - More radio capacity with less energy
 
Sip@iPLM 2016
Sip@iPLM 2016 Sip@iPLM 2016
Sip@iPLM 2016
 
Manigandan_narasimhan_resume
Manigandan_narasimhan_resumeManigandan_narasimhan_resume
Manigandan_narasimhan_resume
 
Confluent Partner Tech Talk with QLIK
Confluent Partner Tech Talk with QLIKConfluent Partner Tech Talk with QLIK
Confluent Partner Tech Talk with QLIK
 
Atul_Oracle Functional Distribution and WMS Consultant
Atul_Oracle Functional Distribution and WMS ConsultantAtul_Oracle Functional Distribution and WMS Consultant
Atul_Oracle Functional Distribution and WMS Consultant
 
Oracle Forms - stay or move on ? Webinar by Kumaran Systems
Oracle Forms - stay or move on ? Webinar by Kumaran SystemsOracle Forms - stay or move on ? Webinar by Kumaran Systems
Oracle Forms - stay or move on ? Webinar by Kumaran Systems
 
[DSC Europe 22] Reproducibility and Versioning of ML Systems - Spela Poklukar
[DSC Europe 22] Reproducibility and Versioning of ML Systems - Spela Poklukar[DSC Europe 22] Reproducibility and Versioning of ML Systems - Spela Poklukar
[DSC Europe 22] Reproducibility and Versioning of ML Systems - Spela Poklukar
 
Extending open source and hybrid cloud to drive OT transformation - Future Oi...
Extending open source and hybrid cloud to drive OT transformation - Future Oi...Extending open source and hybrid cloud to drive OT transformation - Future Oi...
Extending open source and hybrid cloud to drive OT transformation - Future Oi...
 
Harvinder Singh-Resume
Harvinder Singh-ResumeHarvinder Singh-Resume
Harvinder Singh-Resume
 
Scott A Frantz resume
Scott A Frantz resumeScott A Frantz resume
Scott A Frantz resume
 
Top Line Strategies - MS xRM
Top Line Strategies - MS xRMTop Line Strategies - MS xRM
Top Line Strategies - MS xRM
 
Responding to Fukushima: Real-Time, Interactive, Beyond Design Basis Modeling...
Responding to Fukushima: Real-Time, Interactive, Beyond Design Basis Modeling...Responding to Fukushima: Real-Time, Interactive, Beyond Design Basis Modeling...
Responding to Fukushima: Real-Time, Interactive, Beyond Design Basis Modeling...
 
Clipper: A Low-Latency Online Prediction Serving System: Spark Summit East ta...
Clipper: A Low-Latency Online Prediction Serving System: Spark Summit East ta...Clipper: A Low-Latency Online Prediction Serving System: Spark Summit East ta...
Clipper: A Low-Latency Online Prediction Serving System: Spark Summit East ta...
 
(Technologies) AI, Machine Learning, Predictive Analytics, IIOT, Cloud,Web-fr...
(Technologies) AI, Machine Learning, Predictive Analytics, IIOT, Cloud,Web-fr...(Technologies) AI, Machine Learning, Predictive Analytics, IIOT, Cloud,Web-fr...
(Technologies) AI, Machine Learning, Predictive Analytics, IIOT, Cloud,Web-fr...
 
CV_SyedShoeb_2015
CV_SyedShoeb_2015CV_SyedShoeb_2015
CV_SyedShoeb_2015
 
manikandan_16_05_2015
manikandan_16_05_2015manikandan_16_05_2015
manikandan_16_05_2015
 
A Hybrid Cloud MultiCloud Approach to Streamline Supply Chain Data Flow
A Hybrid Cloud MultiCloud Approach to Streamline Supply Chain Data FlowA Hybrid Cloud MultiCloud Approach to Streamline Supply Chain Data Flow
A Hybrid Cloud MultiCloud Approach to Streamline Supply Chain Data Flow
 
Mallikarjuna_Resume
Mallikarjuna_ResumeMallikarjuna_Resume
Mallikarjuna_Resume
 
Pure App + Patterns + Prolifics = Feeding Change
Pure App + Patterns + Prolifics = Feeding Change Pure App + Patterns + Prolifics = Feeding Change
Pure App + Patterns + Prolifics = Feeding Change
 

More from DataWorks Summit/Hadoop Summit

Unleashing the Power of Apache Atlas with Apache Ranger
Unleashing the Power of Apache Atlas with Apache RangerUnleashing the Power of Apache Atlas with Apache Ranger
Unleashing the Power of Apache Atlas with Apache RangerDataWorks Summit/Hadoop Summit
 
Enabling Digital Diagnostics with a Data Science Platform
Enabling Digital Diagnostics with a Data Science PlatformEnabling Digital Diagnostics with a Data Science Platform
Enabling Digital Diagnostics with a Data Science PlatformDataWorks Summit/Hadoop Summit
 
Double Your Hadoop Performance with Hortonworks SmartSense
Double Your Hadoop Performance with Hortonworks SmartSenseDouble Your Hadoop Performance with Hortonworks SmartSense
Double Your Hadoop Performance with Hortonworks SmartSenseDataWorks Summit/Hadoop Summit
 
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...DataWorks Summit/Hadoop Summit
 
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...DataWorks Summit/Hadoop Summit
 
Mool - Automated Log Analysis using Data Science and ML
Mool - Automated Log Analysis using Data Science and MLMool - Automated Log Analysis using Data Science and ML
Mool - Automated Log Analysis using Data Science and MLDataWorks Summit/Hadoop Summit
 
The Challenge of Driving Business Value from the Analytics of Things (AOT)
The Challenge of Driving Business Value from the Analytics of Things (AOT)The Challenge of Driving Business Value from the Analytics of Things (AOT)
The Challenge of Driving Business Value from the Analytics of Things (AOT)DataWorks Summit/Hadoop Summit
 
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...DataWorks Summit/Hadoop Summit
 

More from DataWorks Summit/Hadoop Summit (20)

Running Apache Spark & Apache Zeppelin in Production
Running Apache Spark & Apache Zeppelin in ProductionRunning Apache Spark & Apache Zeppelin in Production
Running Apache Spark & Apache Zeppelin in Production
 
State of Security: Apache Spark & Apache Zeppelin
State of Security: Apache Spark & Apache ZeppelinState of Security: Apache Spark & Apache Zeppelin
State of Security: Apache Spark & Apache Zeppelin
 
Unleashing the Power of Apache Atlas with Apache Ranger
Unleashing the Power of Apache Atlas with Apache RangerUnleashing the Power of Apache Atlas with Apache Ranger
Unleashing the Power of Apache Atlas with Apache Ranger
 
Enabling Digital Diagnostics with a Data Science Platform
Enabling Digital Diagnostics with a Data Science PlatformEnabling Digital Diagnostics with a Data Science Platform
Enabling Digital Diagnostics with a Data Science Platform
 
Revolutionize Text Mining with Spark and Zeppelin
Revolutionize Text Mining with Spark and ZeppelinRevolutionize Text Mining with Spark and Zeppelin
Revolutionize Text Mining with Spark and Zeppelin
 
Double Your Hadoop Performance with Hortonworks SmartSense
Double Your Hadoop Performance with Hortonworks SmartSenseDouble Your Hadoop Performance with Hortonworks SmartSense
Double Your Hadoop Performance with Hortonworks SmartSense
 
Hadoop Crash Course
Hadoop Crash CourseHadoop Crash Course
Hadoop Crash Course
 
Data Science Crash Course
Data Science Crash CourseData Science Crash Course
Data Science Crash Course
 
Apache Spark Crash Course
Apache Spark Crash CourseApache Spark Crash Course
Apache Spark Crash Course
 
Dataflow with Apache NiFi
Dataflow with Apache NiFiDataflow with Apache NiFi
Dataflow with Apache NiFi
 
Schema Registry - Set you Data Free
Schema Registry - Set you Data FreeSchema Registry - Set you Data Free
Schema Registry - Set you Data Free
 
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
 
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
 
Mool - Automated Log Analysis using Data Science and ML
Mool - Automated Log Analysis using Data Science and MLMool - Automated Log Analysis using Data Science and ML
Mool - Automated Log Analysis using Data Science and ML
 
How Hadoop Makes the Natixis Pack More Efficient
How Hadoop Makes the Natixis Pack More Efficient How Hadoop Makes the Natixis Pack More Efficient
How Hadoop Makes the Natixis Pack More Efficient
 
HBase in Practice
HBase in Practice HBase in Practice
HBase in Practice
 
The Challenge of Driving Business Value from the Analytics of Things (AOT)
The Challenge of Driving Business Value from the Analytics of Things (AOT)The Challenge of Driving Business Value from the Analytics of Things (AOT)
The Challenge of Driving Business Value from the Analytics of Things (AOT)
 
Breaking the 1 Million OPS/SEC Barrier in HOPS Hadoop
Breaking the 1 Million OPS/SEC Barrier in HOPS HadoopBreaking the 1 Million OPS/SEC Barrier in HOPS Hadoop
Breaking the 1 Million OPS/SEC Barrier in HOPS Hadoop
 
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
 
Backup and Disaster Recovery in Hadoop
Backup and Disaster Recovery in Hadoop Backup and Disaster Recovery in Hadoop
Backup and Disaster Recovery in Hadoop
 

Recently uploaded

Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your BudgetHyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your BudgetEnjoy Anytime
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptxLBM Solutions
 
Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2Hyundai Motor Group
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraDeakin University
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?XfilesPro
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksSoftradix Technologies
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 

Recently uploaded (20)

Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your BudgetHyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptxVulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
 
Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning era
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 

Production Grade Data Science for Hadoop