SlideShare a Scribd company logo
1 of 20
Download to read offline
DavidArnu
RapidMiner
Graphical Data Analytic Workflows and
Cross-Platform Optimization
Graphical Data Workflows and Cross-Platform Optimization 1
Project Overview
Graphical Data Workflows and Cross-Platform Optimization 2
• Graphically design data processing workflows and data
analytics tasks with minimal or no programming overhead
• Real-time, interactive machine learning and data mining
tools
• Distributed Complex Event Forecasting
Life Sciences Use Case
Studying the effect of drug synergiesin cancer
from in-silico simulations to in-vivo experimentsand back
• Challenges:
• Huge CPU, memory requirements + output data to be processed
• Too manysimulations, too few promising ones
• Train a ML model to classify promising simulations and kill non-promising ones
• Learn which genes drive evolution of other genes and which to monitor
Graphical Data Workflows and Cross-Platform Optimization 3
Financial Use Case
Goal: Train ML models that extract valid rules used to perform:
• Real-time suggestion and forecast of investmentopportunities
• Systemic risk (i.e., great linkage between major market participants)
prediction
• Forecast price swings
PredictingPrice Swings,SystemicRisk and
ForecastingInvestmentOpportunities
Graphical Data Workflows and Cross-Platform Optimization 4
Maritime Use Case
Maritime Situational Awareness (MSA),
Monitoring Ship Movement and Detecting Illegal
Activities
Graphical Data Workflows and Cross-Platform Optimization 5
Challenges:
• Large amountof ships to monitor
• Many different data sources are
available
• Complex event classification
(patterns of movement or other
behavior)
Graphical Data Analytic Workflows
Graphical Data Workflows and Cross-Platform Optimization 6
Graphical Editor
7
• Simply design Streaming Analytics Workflows
• Upon execution one job is created and deployed
to the connected streaming cluster
Graphical Data Workflows and Cross-Platform Optimization
Inner Workings
8Graphical Data Workflows and Cross-Platform Optimization
Supported Clusters:
Apache Flink
Apache Spark (structured)Streaming
Available Operations
Streaming analytics operations
Synopsis Data Engine
Custom Online Machine Learning engines (running on Flink
and AKKA)
• Connections to financial service providers
Capabilities
9Graphical Data Workflows and Cross-Platform Optimization
Code-Free development
Platform and back-end independent
Pluggable connection management
Easy to share and collaborate
Benefits
10Graphical Data Workflows and Cross-Platform Optimization
https://youtu.be/9SKcM70Bi2U
Link to Demo Video
Graphical Data Workflows and Cross-Platform Optimization 11
Cross-Platform Optimization
Graphical Data Workflows and Cross-Platform Optimization 12
Optimizer Component
In a multi-cluster set-up the optimal stream execution
can depend on
Available resourcesper cluster
Data location
Software performance and implementation details
An optimized process layout can greatly enhance the
performance
Optimizer Component
13Graphical Data Workflows and Cross-Platform Optimization
Streaming Optimization Operator
14Graphical Data Workflows and Cross-Platform Optimization
Optimizer Response
15
Split connection
(process flow
control)
Split connection
(Streaming Graph)
Graphical Data Workflows and Cross-Platform Optimization
Streaming Optimization Operator
16Graphical Data Workflows and Cross-Platform Optimization
Optimized Workflow
17Graphical Data Workflows and Cross-Platform Optimization
Optimized Workflow
18
Split connection
(process flow
control)
Split connection (Streaming Graph)
Graphical Data Workflows and Cross-Platform Optimization
Conclusion
• What we have seen:
• Project use cases and goals
• Graphical editor
• Cross-Platform optimization
• What‘s next:
• Job and Data Monitoring
• Better integration of HPC systems
• Refinements and deployment of the use cases
Graphical Data Workflows and Cross-Platform Optimization 19
Thank you!
http://www.infore-project.eu
Graphical Data Workflows and Cross-Platform Optimization 20

More Related Content

What's hot

Graph Data: a New Data Management Frontier
Graph Data: a New Data Management FrontierGraph Data: a New Data Management Frontier
Graph Data: a New Data Management FrontierDemai Ni
 
Real-Time Analytics with MemSQL and Spark
Real-Time Analytics with MemSQL and SparkReal-Time Analytics with MemSQL and Spark
Real-Time Analytics with MemSQL and SparkSingleStore
 
Modern Machine Learning Infrastructure and Practices
Modern Machine Learning Infrastructure and PracticesModern Machine Learning Infrastructure and Practices
Modern Machine Learning Infrastructure and PracticesWill Gardella
 
The More the Merrier: Scaling Model Building Infrastructure at Zendesk
The More the Merrier: Scaling Model Building Infrastructure at ZendeskThe More the Merrier: Scaling Model Building Infrastructure at Zendesk
The More the Merrier: Scaling Model Building Infrastructure at ZendeskDatabricks
 
Splunk .conf2011: Search Language: Intermediate
Splunk .conf2011: Search Language: IntermediateSplunk .conf2011: Search Language: Intermediate
Splunk .conf2011: Search Language: IntermediateErin Sweeney
 
Improving Reporting Performance
Improving Reporting PerformanceImproving Reporting Performance
Improving Reporting PerformanceDhiren Gala
 
JulianRodriguez-TechResume
JulianRodriguez-TechResumeJulianRodriguez-TechResume
JulianRodriguez-TechResumeJulian Rodriguez
 
Observability and its application
Observability and its applicationObservability and its application
Observability and its applicationThao Huynh Quang
 
New features in Zen 5.3 end-to-end network analytics platform
New features in Zen 5.3 end-to-end network analytics platformNew features in Zen 5.3 end-to-end network analytics platform
New features in Zen 5.3 end-to-end network analytics platformSysMech
 
"Lessons learned using Apache Spark for self-service data prep in SaaS world"
"Lessons learned using Apache Spark for self-service data prep in SaaS world""Lessons learned using Apache Spark for self-service data prep in SaaS world"
"Lessons learned using Apache Spark for self-service data prep in SaaS world"Pavel Hardak
 
ECL-Watch: A Big Data Application Performance Tuning Tool in the HPCC Systems...
ECL-Watch: A Big Data Application Performance Tuning Tool in the HPCC Systems...ECL-Watch: A Big Data Application Performance Tuning Tool in the HPCC Systems...
ECL-Watch: A Big Data Application Performance Tuning Tool in the HPCC Systems...HPCC Systems
 

What's hot (13)

Graph Data: a New Data Management Frontier
Graph Data: a New Data Management FrontierGraph Data: a New Data Management Frontier
Graph Data: a New Data Management Frontier
 
Real-Time Analytics with MemSQL and Spark
Real-Time Analytics with MemSQL and SparkReal-Time Analytics with MemSQL and Spark
Real-Time Analytics with MemSQL and Spark
 
Modern Machine Learning Infrastructure and Practices
Modern Machine Learning Infrastructure and PracticesModern Machine Learning Infrastructure and Practices
Modern Machine Learning Infrastructure and Practices
 
The More the Merrier: Scaling Model Building Infrastructure at Zendesk
The More the Merrier: Scaling Model Building Infrastructure at ZendeskThe More the Merrier: Scaling Model Building Infrastructure at Zendesk
The More the Merrier: Scaling Model Building Infrastructure at Zendesk
 
Splunk .conf2011: Search Language: Intermediate
Splunk .conf2011: Search Language: IntermediateSplunk .conf2011: Search Language: Intermediate
Splunk .conf2011: Search Language: Intermediate
 
Improving Reporting Performance
Improving Reporting PerformanceImproving Reporting Performance
Improving Reporting Performance
 
JulianRodriguez-TechResume
JulianRodriguez-TechResumeJulianRodriguez-TechResume
JulianRodriguez-TechResume
 
CapMonPosterweb
CapMonPosterwebCapMonPosterweb
CapMonPosterweb
 
Observability and its application
Observability and its applicationObservability and its application
Observability and its application
 
New features in Zen 5.3 end-to-end network analytics platform
New features in Zen 5.3 end-to-end network analytics platformNew features in Zen 5.3 end-to-end network analytics platform
New features in Zen 5.3 end-to-end network analytics platform
 
cametrics-report-final
cametrics-report-finalcametrics-report-final
cametrics-report-final
 
"Lessons learned using Apache Spark for self-service data prep in SaaS world"
"Lessons learned using Apache Spark for self-service data prep in SaaS world""Lessons learned using Apache Spark for self-service data prep in SaaS world"
"Lessons learned using Apache Spark for self-service data prep in SaaS world"
 
ECL-Watch: A Big Data Application Performance Tuning Tool in the HPCC Systems...
ECL-Watch: A Big Data Application Performance Tuning Tool in the HPCC Systems...ECL-Watch: A Big Data Application Performance Tuning Tool in the HPCC Systems...
ECL-Watch: A Big Data Application Performance Tuning Tool in the HPCC Systems...
 

Similar to Graphical Data Analytic Workflows and Cross-Platform Optimization

Building a Real-Time Security Application Using Log Data and Machine Learning...
Building a Real-Time Security Application Using Log Data and Machine Learning...Building a Real-Time Security Application Using Log Data and Machine Learning...
Building a Real-Time Security Application Using Log Data and Machine Learning...Sri Ambati
 
Thomas Weise, Apache Apex PMC Member and Architect/Co-Founder, DataTorrent - ...
Thomas Weise, Apache Apex PMC Member and Architect/Co-Founder, DataTorrent - ...Thomas Weise, Apache Apex PMC Member and Architect/Co-Founder, DataTorrent - ...
Thomas Weise, Apache Apex PMC Member and Architect/Co-Founder, DataTorrent - ...Dataconomy Media
 
Big Data Berlin v8.0 Stream Processing with Apache Apex
Big Data Berlin v8.0 Stream Processing with Apache Apex Big Data Berlin v8.0 Stream Processing with Apache Apex
Big Data Berlin v8.0 Stream Processing with Apache Apex Apache Apex
 
Peek into Neo4j Product Strategy and Roadmap
Peek into Neo4j Product Strategy and RoadmapPeek into Neo4j Product Strategy and Roadmap
Peek into Neo4j Product Strategy and RoadmapNeo4j
 
The path to success with Graph Database and Graph Data Science
The path to success with Graph Database and Graph Data ScienceThe path to success with Graph Database and Graph Data Science
The path to success with Graph Database and Graph Data ScienceNeo4j
 
Danny Bickson - Python based predictive analytics with GraphLab Create
Danny Bickson - Python based predictive analytics with GraphLab Create Danny Bickson - Python based predictive analytics with GraphLab Create
Danny Bickson - Python based predictive analytics with GraphLab Create PyData
 
The path to success with graph database and graph data science_ Neo4j GraphSu...
The path to success with graph database and graph data science_ Neo4j GraphSu...The path to success with graph database and graph data science_ Neo4j GraphSu...
The path to success with graph database and graph data science_ Neo4j GraphSu...Neo4j
 
Innovate2010 jazz keynote
Innovate2010 jazz keynoteInnovate2010 jazz keynote
Innovate2010 jazz keynoteoslc
 
Den Datenschatz heben und Zeit- und Energieeffizienz steigern: Mathematik und...
Den Datenschatz heben und Zeit- und Energieeffizienz steigern: Mathematik und...Den Datenschatz heben und Zeit- und Energieeffizienz steigern: Mathematik und...
Den Datenschatz heben und Zeit- und Energieeffizienz steigern: Mathematik und...Joachim Schlosser
 
Splunk App for Stream for Enhanced Operational Intelligence from Wire Data
Splunk App for Stream for Enhanced Operational Intelligence from Wire DataSplunk App for Stream for Enhanced Operational Intelligence from Wire Data
Splunk App for Stream for Enhanced Operational Intelligence from Wire DataSplunk
 
Data Summer Conf 2018, “Monitoring AI with AI (RUS)” — Stepan Pushkarev, CTO ...
Data Summer Conf 2018, “Monitoring AI with AI (RUS)” — Stepan Pushkarev, CTO ...Data Summer Conf 2018, “Monitoring AI with AI (RUS)” — Stepan Pushkarev, CTO ...
Data Summer Conf 2018, “Monitoring AI with AI (RUS)” — Stepan Pushkarev, CTO ...Provectus
 
Your Roadmap for An Enterprise Graph Strategy
Your Roadmap for An Enterprise Graph StrategyYour Roadmap for An Enterprise Graph Strategy
Your Roadmap for An Enterprise Graph StrategyNeo4j
 
Considerations for Abstracting Complexities of a Real-Time ML Platform, Zhenz...
Considerations for Abstracting Complexities of a Real-Time ML Platform, Zhenz...Considerations for Abstracting Complexities of a Real-Time ML Platform, Zhenz...
Considerations for Abstracting Complexities of a Real-Time ML Platform, Zhenz...HostedbyConfluent
 
Roadmap for Enterprise Graph Strategy
Roadmap for Enterprise Graph StrategyRoadmap for Enterprise Graph Strategy
Roadmap for Enterprise Graph StrategyNeo4j
 
DataOps: Control-M's role in data pipeline orchestration
DataOps: Control-M's role in data pipeline orchestrationDataOps: Control-M's role in data pipeline orchestration
DataOps: Control-M's role in data pipeline orchestrationpzjnjr6rsg
 

Similar to Graphical Data Analytic Workflows and Cross-Platform Optimization (20)

Building a Real-Time Security Application Using Log Data and Machine Learning...
Building a Real-Time Security Application Using Log Data and Machine Learning...Building a Real-Time Security Application Using Log Data and Machine Learning...
Building a Real-Time Security Application Using Log Data and Machine Learning...
 
Thomas Weise, Apache Apex PMC Member and Architect/Co-Founder, DataTorrent - ...
Thomas Weise, Apache Apex PMC Member and Architect/Co-Founder, DataTorrent - ...Thomas Weise, Apache Apex PMC Member and Architect/Co-Founder, DataTorrent - ...
Thomas Weise, Apache Apex PMC Member and Architect/Co-Founder, DataTorrent - ...
 
Big Data Berlin v8.0 Stream Processing with Apache Apex
Big Data Berlin v8.0 Stream Processing with Apache Apex Big Data Berlin v8.0 Stream Processing with Apache Apex
Big Data Berlin v8.0 Stream Processing with Apache Apex
 
Peek into Neo4j Product Strategy and Roadmap
Peek into Neo4j Product Strategy and RoadmapPeek into Neo4j Product Strategy and Roadmap
Peek into Neo4j Product Strategy and Roadmap
 
Flow-ABriefExplanation
Flow-ABriefExplanationFlow-ABriefExplanation
Flow-ABriefExplanation
 
The path to success with Graph Database and Graph Data Science
The path to success with Graph Database and Graph Data ScienceThe path to success with Graph Database and Graph Data Science
The path to success with Graph Database and Graph Data Science
 
Danny Bickson - Python based predictive analytics with GraphLab Create
Danny Bickson - Python based predictive analytics with GraphLab Create Danny Bickson - Python based predictive analytics with GraphLab Create
Danny Bickson - Python based predictive analytics with GraphLab Create
 
The path to success with graph database and graph data science_ Neo4j GraphSu...
The path to success with graph database and graph data science_ Neo4j GraphSu...The path to success with graph database and graph data science_ Neo4j GraphSu...
The path to success with graph database and graph data science_ Neo4j GraphSu...
 
About CDAP
About CDAPAbout CDAP
About CDAP
 
Innovate2010 jazz keynote
Innovate2010 jazz keynoteInnovate2010 jazz keynote
Innovate2010 jazz keynote
 
The ZDLC Brief
The ZDLC BriefThe ZDLC Brief
The ZDLC Brief
 
Den Datenschatz heben und Zeit- und Energieeffizienz steigern: Mathematik und...
Den Datenschatz heben und Zeit- und Energieeffizienz steigern: Mathematik und...Den Datenschatz heben und Zeit- und Energieeffizienz steigern: Mathematik und...
Den Datenschatz heben und Zeit- und Energieeffizienz steigern: Mathematik und...
 
Shikha fdp 62_14july2017
Shikha fdp 62_14july2017Shikha fdp 62_14july2017
Shikha fdp 62_14july2017
 
Splunk App for Stream for Enhanced Operational Intelligence from Wire Data
Splunk App for Stream for Enhanced Operational Intelligence from Wire DataSplunk App for Stream for Enhanced Operational Intelligence from Wire Data
Splunk App for Stream for Enhanced Operational Intelligence from Wire Data
 
Data Summer Conf 2018, “Monitoring AI with AI (RUS)” — Stepan Pushkarev, CTO ...
Data Summer Conf 2018, “Monitoring AI with AI (RUS)” — Stepan Pushkarev, CTO ...Data Summer Conf 2018, “Monitoring AI with AI (RUS)” — Stepan Pushkarev, CTO ...
Data Summer Conf 2018, “Monitoring AI with AI (RUS)” — Stepan Pushkarev, CTO ...
 
Monitoring AI with AI
Monitoring AI with AIMonitoring AI with AI
Monitoring AI with AI
 
Your Roadmap for An Enterprise Graph Strategy
Your Roadmap for An Enterprise Graph StrategyYour Roadmap for An Enterprise Graph Strategy
Your Roadmap for An Enterprise Graph Strategy
 
Considerations for Abstracting Complexities of a Real-Time ML Platform, Zhenz...
Considerations for Abstracting Complexities of a Real-Time ML Platform, Zhenz...Considerations for Abstracting Complexities of a Real-Time ML Platform, Zhenz...
Considerations for Abstracting Complexities of a Real-Time ML Platform, Zhenz...
 
Roadmap for Enterprise Graph Strategy
Roadmap for Enterprise Graph StrategyRoadmap for Enterprise Graph Strategy
Roadmap for Enterprise Graph Strategy
 
DataOps: Control-M's role in data pipeline orchestration
DataOps: Control-M's role in data pipeline orchestrationDataOps: Control-M's role in data pipeline orchestration
DataOps: Control-M's role in data pipeline orchestration
 

More from Big Data Value Association

Data Privacy, Security in personal data sharing
Data Privacy, Security in personal data sharingData Privacy, Security in personal data sharing
Data Privacy, Security in personal data sharingBig Data Value Association
 
Key Modules for a trsuted and privacy preserving personal data marketplace
Key Modules for a trsuted and privacy preserving personal data marketplaceKey Modules for a trsuted and privacy preserving personal data marketplace
Key Modules for a trsuted and privacy preserving personal data marketplaceBig Data Value Association
 
GDPR and Data Ethics considerations in personal data sharing
GDPR and Data Ethics considerations in personal data sharingGDPR and Data Ethics considerations in personal data sharing
GDPR and Data Ethics considerations in personal data sharingBig Data Value Association
 
Intro - Three pillars for building a Smart Data Ecosystem: Trust, Security an...
Intro - Three pillars for building a Smart Data Ecosystem: Trust, Security an...Intro - Three pillars for building a Smart Data Ecosystem: Trust, Security an...
Intro - Three pillars for building a Smart Data Ecosystem: Trust, Security an...Big Data Value Association
 
Three pillars for building a Smart Data Ecosystem: Trust, Security and Privacy
Three pillars for building a Smart Data Ecosystem: Trust, Security and PrivacyThree pillars for building a Smart Data Ecosystem: Trust, Security and Privacy
Three pillars for building a Smart Data Ecosystem: Trust, Security and PrivacyBig Data Value Association
 
Market into context - Three pillars for building a Smart Data Ecosystem: Trus...
Market into context - Three pillars for building a Smart Data Ecosystem: Trus...Market into context - Three pillars for building a Smart Data Ecosystem: Trus...
Market into context - Three pillars for building a Smart Data Ecosystem: Trus...Big Data Value Association
 
BDV Skills Accreditation - Future of digital skills in Europe reskilling and ...
BDV Skills Accreditation - Future of digital skills in Europe reskilling and ...BDV Skills Accreditation - Future of digital skills in Europe reskilling and ...
BDV Skills Accreditation - Future of digital skills in Europe reskilling and ...Big Data Value Association
 
BDV Skills Accreditation - Big Data skilling in Emilia-Romagna
BDV Skills Accreditation - Big Data skilling in Emilia-Romagna BDV Skills Accreditation - Big Data skilling in Emilia-Romagna
BDV Skills Accreditation - Big Data skilling in Emilia-Romagna Big Data Value Association
 
BDV Skills Accreditation - EIT labels for professionals
BDV Skills Accreditation - EIT labels for professionalsBDV Skills Accreditation - EIT labels for professionals
BDV Skills Accreditation - EIT labels for professionalsBig Data Value Association
 
BDV Skills Accreditation - Recognizing Data Science Skills with BDV Data Scie...
BDV Skills Accreditation - Recognizing Data Science Skills with BDV Data Scie...BDV Skills Accreditation - Recognizing Data Science Skills with BDV Data Scie...
BDV Skills Accreditation - Recognizing Data Science Skills with BDV Data Scie...Big Data Value Association
 
BDV Skills Accreditation - Objectives of the workshop
BDV Skills Accreditation - Objectives of the workshopBDV Skills Accreditation - Objectives of the workshop
BDV Skills Accreditation - Objectives of the workshopBig Data Value Association
 
BDV Skills Accreditation - Welcome introduction to the workshop
BDV Skills Accreditation - Welcome introduction to the workshopBDV Skills Accreditation - Welcome introduction to the workshop
BDV Skills Accreditation - Welcome introduction to the workshopBig Data Value Association
 
BDV Skills Accreditation - Definition and ensuring of digital roles and compe...
BDV Skills Accreditation - Definition and ensuring of digital roles and compe...BDV Skills Accreditation - Definition and ensuring of digital roles and compe...
BDV Skills Accreditation - Definition and ensuring of digital roles and compe...Big Data Value Association
 
BigDataPilotDemoDays - I BiDaaS Application to the Manufacturing Sector Webinar
BigDataPilotDemoDays - I BiDaaS Application to the Manufacturing Sector WebinarBigDataPilotDemoDays - I BiDaaS Application to the Manufacturing Sector Webinar
BigDataPilotDemoDays - I BiDaaS Application to the Manufacturing Sector WebinarBig Data Value Association
 
BigDataPilotDemoDays - I-BiDaaS Application to the Financial Sector Webinar
BigDataPilotDemoDays - I-BiDaaS Application to the Financial Sector WebinarBigDataPilotDemoDays - I-BiDaaS Application to the Financial Sector Webinar
BigDataPilotDemoDays - I-BiDaaS Application to the Financial Sector WebinarBig Data Value Association
 
Virtual BenchLearning - DeepHealth - Needs & Requirements for Benchmarking
Virtual BenchLearning - DeepHealth - Needs & Requirements for BenchmarkingVirtual BenchLearning - DeepHealth - Needs & Requirements for Benchmarking
Virtual BenchLearning - DeepHealth - Needs & Requirements for BenchmarkingBig Data Value Association
 
Virtual BenchLearning - I-BiDaaS - Industrial-Driven Big Data as a Self-Servi...
Virtual BenchLearning - I-BiDaaS - Industrial-Driven Big Data as a Self-Servi...Virtual BenchLearning - I-BiDaaS - Industrial-Driven Big Data as a Self-Servi...
Virtual BenchLearning - I-BiDaaS - Industrial-Driven Big Data as a Self-Servi...Big Data Value Association
 
Policy Cloud Data Driven Policies against Radicalisation - Technical Overview
Policy Cloud Data Driven Policies against Radicalisation - Technical OverviewPolicy Cloud Data Driven Policies against Radicalisation - Technical Overview
Policy Cloud Data Driven Policies against Radicalisation - Technical OverviewBig Data Value Association
 
Policy Cloud Data Driven Policies against Radicalisation - Participatory poli...
Policy Cloud Data Driven Policies against Radicalisation - Participatory poli...Policy Cloud Data Driven Policies against Radicalisation - Participatory poli...
Policy Cloud Data Driven Policies against Radicalisation - Participatory poli...Big Data Value Association
 

More from Big Data Value Association (20)

Data Privacy, Security in personal data sharing
Data Privacy, Security in personal data sharingData Privacy, Security in personal data sharing
Data Privacy, Security in personal data sharing
 
Key Modules for a trsuted and privacy preserving personal data marketplace
Key Modules for a trsuted and privacy preserving personal data marketplaceKey Modules for a trsuted and privacy preserving personal data marketplace
Key Modules for a trsuted and privacy preserving personal data marketplace
 
GDPR and Data Ethics considerations in personal data sharing
GDPR and Data Ethics considerations in personal data sharingGDPR and Data Ethics considerations in personal data sharing
GDPR and Data Ethics considerations in personal data sharing
 
Intro - Three pillars for building a Smart Data Ecosystem: Trust, Security an...
Intro - Three pillars for building a Smart Data Ecosystem: Trust, Security an...Intro - Three pillars for building a Smart Data Ecosystem: Trust, Security an...
Intro - Three pillars for building a Smart Data Ecosystem: Trust, Security an...
 
Three pillars for building a Smart Data Ecosystem: Trust, Security and Privacy
Three pillars for building a Smart Data Ecosystem: Trust, Security and PrivacyThree pillars for building a Smart Data Ecosystem: Trust, Security and Privacy
Three pillars for building a Smart Data Ecosystem: Trust, Security and Privacy
 
Market into context - Three pillars for building a Smart Data Ecosystem: Trus...
Market into context - Three pillars for building a Smart Data Ecosystem: Trus...Market into context - Three pillars for building a Smart Data Ecosystem: Trus...
Market into context - Three pillars for building a Smart Data Ecosystem: Trus...
 
BDV Skills Accreditation - Future of digital skills in Europe reskilling and ...
BDV Skills Accreditation - Future of digital skills in Europe reskilling and ...BDV Skills Accreditation - Future of digital skills in Europe reskilling and ...
BDV Skills Accreditation - Future of digital skills in Europe reskilling and ...
 
BDV Skills Accreditation - Big Data skilling in Emilia-Romagna
BDV Skills Accreditation - Big Data skilling in Emilia-Romagna BDV Skills Accreditation - Big Data skilling in Emilia-Romagna
BDV Skills Accreditation - Big Data skilling in Emilia-Romagna
 
BDV Skills Accreditation - EIT labels for professionals
BDV Skills Accreditation - EIT labels for professionalsBDV Skills Accreditation - EIT labels for professionals
BDV Skills Accreditation - EIT labels for professionals
 
BDV Skills Accreditation - Recognizing Data Science Skills with BDV Data Scie...
BDV Skills Accreditation - Recognizing Data Science Skills with BDV Data Scie...BDV Skills Accreditation - Recognizing Data Science Skills with BDV Data Scie...
BDV Skills Accreditation - Recognizing Data Science Skills with BDV Data Scie...
 
BDV Skills Accreditation - Objectives of the workshop
BDV Skills Accreditation - Objectives of the workshopBDV Skills Accreditation - Objectives of the workshop
BDV Skills Accreditation - Objectives of the workshop
 
BDV Skills Accreditation - Welcome introduction to the workshop
BDV Skills Accreditation - Welcome introduction to the workshopBDV Skills Accreditation - Welcome introduction to the workshop
BDV Skills Accreditation - Welcome introduction to the workshop
 
BDV Skills Accreditation - Definition and ensuring of digital roles and compe...
BDV Skills Accreditation - Definition and ensuring of digital roles and compe...BDV Skills Accreditation - Definition and ensuring of digital roles and compe...
BDV Skills Accreditation - Definition and ensuring of digital roles and compe...
 
BigDataPilotDemoDays - I BiDaaS Application to the Manufacturing Sector Webinar
BigDataPilotDemoDays - I BiDaaS Application to the Manufacturing Sector WebinarBigDataPilotDemoDays - I BiDaaS Application to the Manufacturing Sector Webinar
BigDataPilotDemoDays - I BiDaaS Application to the Manufacturing Sector Webinar
 
BigDataPilotDemoDays - I-BiDaaS Application to the Financial Sector Webinar
BigDataPilotDemoDays - I-BiDaaS Application to the Financial Sector WebinarBigDataPilotDemoDays - I-BiDaaS Application to the Financial Sector Webinar
BigDataPilotDemoDays - I-BiDaaS Application to the Financial Sector Webinar
 
Virtual BenchLearning - Data Bench Framework
Virtual BenchLearning - Data Bench FrameworkVirtual BenchLearning - Data Bench Framework
Virtual BenchLearning - Data Bench Framework
 
Virtual BenchLearning - DeepHealth - Needs & Requirements for Benchmarking
Virtual BenchLearning - DeepHealth - Needs & Requirements for BenchmarkingVirtual BenchLearning - DeepHealth - Needs & Requirements for Benchmarking
Virtual BenchLearning - DeepHealth - Needs & Requirements for Benchmarking
 
Virtual BenchLearning - I-BiDaaS - Industrial-Driven Big Data as a Self-Servi...
Virtual BenchLearning - I-BiDaaS - Industrial-Driven Big Data as a Self-Servi...Virtual BenchLearning - I-BiDaaS - Industrial-Driven Big Data as a Self-Servi...
Virtual BenchLearning - I-BiDaaS - Industrial-Driven Big Data as a Self-Servi...
 
Policy Cloud Data Driven Policies against Radicalisation - Technical Overview
Policy Cloud Data Driven Policies against Radicalisation - Technical OverviewPolicy Cloud Data Driven Policies against Radicalisation - Technical Overview
Policy Cloud Data Driven Policies against Radicalisation - Technical Overview
 
Policy Cloud Data Driven Policies against Radicalisation - Participatory poli...
Policy Cloud Data Driven Policies against Radicalisation - Participatory poli...Policy Cloud Data Driven Policies against Radicalisation - Participatory poli...
Policy Cloud Data Driven Policies against Radicalisation - Participatory poli...
 

Recently uploaded

Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Strongerpanagenda
 
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security ObservabilityGlenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security Observabilityitnewsafrica
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfIngrid Airi González
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterMydbops
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...Wes McKinney
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical InfrastructureVarsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructureitnewsafrica
 
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxGenerative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxfnnc6jmgwh
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesKari Kakkonen
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch TuesdayIvanti
 
Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Kaya Weers
 
Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...itnewsafrica
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Hiroshi SHIBATA
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality AssuranceInflectra
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 

Recently uploaded (20)

Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
 
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security ObservabilityGlenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdf
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL Router
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical InfrastructureVarsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
 
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxGenerative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch Tuesday
 
Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)
 
Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 

Graphical Data Analytic Workflows and Cross-Platform Optimization

  • 1. DavidArnu RapidMiner Graphical Data Analytic Workflows and Cross-Platform Optimization Graphical Data Workflows and Cross-Platform Optimization 1
  • 2. Project Overview Graphical Data Workflows and Cross-Platform Optimization 2 • Graphically design data processing workflows and data analytics tasks with minimal or no programming overhead • Real-time, interactive machine learning and data mining tools • Distributed Complex Event Forecasting
  • 3. Life Sciences Use Case Studying the effect of drug synergiesin cancer from in-silico simulations to in-vivo experimentsand back • Challenges: • Huge CPU, memory requirements + output data to be processed • Too manysimulations, too few promising ones • Train a ML model to classify promising simulations and kill non-promising ones • Learn which genes drive evolution of other genes and which to monitor Graphical Data Workflows and Cross-Platform Optimization 3
  • 4. Financial Use Case Goal: Train ML models that extract valid rules used to perform: • Real-time suggestion and forecast of investmentopportunities • Systemic risk (i.e., great linkage between major market participants) prediction • Forecast price swings PredictingPrice Swings,SystemicRisk and ForecastingInvestmentOpportunities Graphical Data Workflows and Cross-Platform Optimization 4
  • 5. Maritime Use Case Maritime Situational Awareness (MSA), Monitoring Ship Movement and Detecting Illegal Activities Graphical Data Workflows and Cross-Platform Optimization 5 Challenges: • Large amountof ships to monitor • Many different data sources are available • Complex event classification (patterns of movement or other behavior)
  • 6. Graphical Data Analytic Workflows Graphical Data Workflows and Cross-Platform Optimization 6
  • 7. Graphical Editor 7 • Simply design Streaming Analytics Workflows • Upon execution one job is created and deployed to the connected streaming cluster Graphical Data Workflows and Cross-Platform Optimization
  • 8. Inner Workings 8Graphical Data Workflows and Cross-Platform Optimization
  • 9. Supported Clusters: Apache Flink Apache Spark (structured)Streaming Available Operations Streaming analytics operations Synopsis Data Engine Custom Online Machine Learning engines (running on Flink and AKKA) • Connections to financial service providers Capabilities 9Graphical Data Workflows and Cross-Platform Optimization
  • 10. Code-Free development Platform and back-end independent Pluggable connection management Easy to share and collaborate Benefits 10Graphical Data Workflows and Cross-Platform Optimization
  • 11. https://youtu.be/9SKcM70Bi2U Link to Demo Video Graphical Data Workflows and Cross-Platform Optimization 11
  • 12. Cross-Platform Optimization Graphical Data Workflows and Cross-Platform Optimization 12 Optimizer Component
  • 13. In a multi-cluster set-up the optimal stream execution can depend on Available resourcesper cluster Data location Software performance and implementation details An optimized process layout can greatly enhance the performance Optimizer Component 13Graphical Data Workflows and Cross-Platform Optimization
  • 14. Streaming Optimization Operator 14Graphical Data Workflows and Cross-Platform Optimization
  • 15. Optimizer Response 15 Split connection (process flow control) Split connection (Streaming Graph) Graphical Data Workflows and Cross-Platform Optimization
  • 16. Streaming Optimization Operator 16Graphical Data Workflows and Cross-Platform Optimization
  • 17. Optimized Workflow 17Graphical Data Workflows and Cross-Platform Optimization
  • 18. Optimized Workflow 18 Split connection (process flow control) Split connection (Streaming Graph) Graphical Data Workflows and Cross-Platform Optimization
  • 19. Conclusion • What we have seen: • Project use cases and goals • Graphical editor • Cross-Platform optimization • What‘s next: • Job and Data Monitoring • Better integration of HPC systems • Refinements and deployment of the use cases Graphical Data Workflows and Cross-Platform Optimization 19
  • 20. Thank you! http://www.infore-project.eu Graphical Data Workflows and Cross-Platform Optimization 20