PROTEUS
Scalable Online Machine Learning for Predictive Analytics and Real-Time Interactive Visualization
BONAVENTURA DEL MONTE
RESEARCHER @DFKI GMBH
PH.D. STUDENT @TU BERLIN
EUROPRO WORKSHOP, EDBT 2017
This project is funded by the European Union (Horizon 2020).
[Figure: the five Vs of Big Data: Value, Velocity, Variety, Veracity, Volume]
PROTEUS is an EU H2020-funded research project that aims to design,
develop, and provide an open-source, ready-to-use Big Data solution able to
perform real-time interactive analytics and predictive analysis through
massive online machine learning, dealing efficiently with extremely large
historical data sets and data streams.
CONTENTS
1. PROJECT DETAILS
2. VALIDATION SCENARIO
3. HYBRID PROCESSING ENGINE
4. SCALABLE ONLINE MACHINE LEARNING
5. REAL-TIME INTERACTIVE VISUAL ANALYTICS
6. CONCLUSION
Project Consortium
Project details
• Expected Outcomes
  • Hybrid processing
    • Batch & stream processing engine
    • Declarative language for batch & stream analytics
  • Scalable online machine learning
    • SOLMA library
  • Real-time interactive visual analytics
    • Web charts library
    • Incremental engine for interactive analytics
• Business Impact
  • Validation in a realistic industrial use case
Hot Strip Mill: Big Data scenario
System Architecture
Hybrid Processing
• Smoother processing of data streams and historical data in the same Flink job
• A declarative language for batch and streaming analytics
• ETL and ML pipelines expressed in a unified language are holistically optimized
[Figure: pipeline stages "Gather and clean sensor data", "PCA", "Train ML Model"; data sets D1, D2, D3]
"Bridging the Gap: Towards Optimizations across Linear and Relational Algebra": Andreas Kunft, Alexander Alexandrov,
Asterios Katsifodimos, Volker Markl. BeyondMR workshop @SIGMOD 2016.
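As a rough illustration of the hybrid idea on this slide (not PROTEUS or Flink code), the sketch below runs a bounded historical data set and a live stream through the same cleaning and model-update logic. All names (`clean`, `RunningMeanModel`, `run_hybrid`) are hypothetical, and the "model" is only a running mean standing in for a real learner.

```python
from itertools import chain
from typing import Iterable, Iterator, Optional


def clean(readings: Iterable[Optional[float]]) -> Iterator[float]:
    """Drop missing sensor readings (a stand-in for a real cleaning step)."""
    for r in readings:
        if r is not None:
            yield r


class RunningMeanModel:
    """Toy 'model': the running mean of all values seen so far."""

    def __init__(self) -> None:
        self.n = 0
        self.mean = 0.0

    def update(self, x: float) -> None:
        self.n += 1
        self.mean += (x - self.mean) / self.n


def run_hybrid(historical, stream) -> RunningMeanModel:
    """Process data-at-rest first, then data-in-motion, with one pipeline."""
    model = RunningMeanModel()
    for x in clean(chain(historical, stream)):
        model.update(x)
    return model


historical = [1.0, None, 3.0]      # bounded, stored data
stream = iter([5.0, None, 7.0])    # stands in for an unbounded source
model = run_hybrid(historical, stream)
print(model.n, model.mean)
```

The point of the sketch is only that one code path serves both inputs; a real engine would additionally optimize and parallelize the pipeline.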
Scalable Online Machine Learning
• ML challenge: distributed data streams
  • The state of the art in machine learning for Big Data is dominated by offline learning algorithms that process data-at-rest
  • Many current data sources are streams (online, data-in-motion): sensors, social networks, clickstreams, etc.
  • In online learning, the algorithm sees each data item only once. In the traditional meaning of "online", data is processed sequentially, one item at a time, but for many epochs. Evaluation: prequential.
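Prequential evaluation is commonly described as "test-then-train": each arriving item is first used to test the current model and only then used to update it. The sketch below shows that loop with a minimal perceptron; it is an illustration under assumptions, not SOLMA code, and the names `OnlinePerceptron` and `prequential_accuracy` are hypothetical.

```python
from typing import Iterable, List, Tuple


class OnlinePerceptron:
    """Minimal online binary classifier (labels are +1 / -1)."""

    def __init__(self, n_features: int, lr: float = 1.0) -> None:
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict(self, x: List[float]) -> int:
        score = sum(wi * xi for wi, xi in zip(self.w, x)) + self.b
        return 1 if score >= 0 else -1

    def update(self, x: List[float], y: int) -> None:
        # Perceptron rule: change the weights only on a mistake.
        if self.predict(x) != y:
            self.w = [wi + self.lr * y * xi for wi, xi in zip(self.w, x)]
            self.b += self.lr * y


def prequential_accuracy(model, stream: Iterable[Tuple[List[float], int]]) -> float:
    """Test-then-train: each item is used for testing first, then for training."""
    correct = 0
    seen = 0
    for x, y in stream:
        correct += int(model.predict(x) == y)  # test on the not-yet-seen item
        model.update(x, y)                     # then learn from it
        seen += 1
    return correct / seen if seen else 0.0


# Toy stream: the label is the sign of the first feature. The short pattern
# is repeated only to simulate a longer stream.
pattern = [([1.0, 0.5], 1), ([-1.0, 0.2], -1), ([2.0, -0.3], 1), ([-2.0, 0.1], -1)]
stream = pattern * 5
acc = prequential_accuracy(OnlinePerceptron(n_features=2), stream)
print(f"prequential accuracy: {acc:.2f}")
```

Because every item is scored before the model has learned from it, the resulting accuracy needs no separate hold-out set.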
Real-time Interactive Visual Analytics
• How to interactively visualize Big Data?
  • Incremental Analytics engine: incremental partial results in ~O(1)
  • Visualization layer: SSR-enabled web-based library seamlessly connected to the Incremental Analytics engine
https://github.com/proteus-h2020/proteic
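One way to picture "incremental partial results in ~O(1)" is an aggregate that is updated in constant time and memory per item and can be read at any moment. The sketch below uses Welford's algorithm for a running mean and variance; it is an illustrative assumption, not the PROTEUS engine, and `RunningStats` is a hypothetical name.

```python
class RunningStats:
    """Welford's algorithm: mean and variance in O(1) time and memory per item."""

    def __init__(self) -> None:
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # sum of squared deviations from the current mean

    def update(self, x: float) -> None:
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self) -> float:
        """Sample variance of the items seen so far (0.0 for fewer than two)."""
        return self._m2 / (self.n - 1) if self.n > 1 else 0.0


stats = RunningStats()
for value in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    stats.update(value)
    # A partial result is available after every item, e.g. to refresh a chart.
    partial = (stats.n, stats.mean, stats.variance)
```

The cost of each update does not depend on how much data has already been seen, which is what keeps a chart responsive as the stream grows.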
Conclusions
• PROTEUS is an EU H2020 international research project
• PROTEUS will contribute to the Big Data ecosystem with:
  • An innovative hybrid engine for processing both data-at-rest and data-in-motion
  • SOLMA: a new library for scalable online machine learning
  • Big Data visualization guidelines: new ways of presenting and working with Big Data
  • Real-time interactive visualization technology: incremental engine & web-based library
• PROTEUS will validate its innovations in a realistic industrial scenario
• PROTEUS will provide full-scale evaluation and impact assessment, including benchmarks, KPIs and anonymized datasets
  • Specific metrics for the ArcelorMittal use case
  • Generic indicators on the advancements in scalable machine learning, hybrid computation and real-time interactive visual analytics
Thanks for your attention!
Questions?
• Contact us:
  • Bonaventura Del Monte
  • bonaventura dot delmonte at dfki dot de
  • www.dfki.berlin
www.proteus-bigdata.com
www.github.com/proteus-h2020
Extra Slides
Apache Flink 101
• Massively parallel dataflow engine with unified batch and stream processing
• Rich set of operators (including native iteration)
• Flink Optimizer
  • Inspired by optimizers of parallel database systems
  • Physical optimization follows a cost-based approach
• Memory Management
  • Flink manages its own memory
  • Never breaks the JVM heap
Scalable Online Machine Learning
• PROTEUS contribution: SOLMA
  • User-friendly
  • Extensible
  • Basic scalable stream sketches that make it possible to query the stream
  • Iterative algorithms for approximating the outcome of offline computation
  • Ready-to-use (supervised & unsupervised) online ML algorithms in Apache Flink
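To make "stream sketches that make it possible to query the stream" concrete, here is a minimal Count-Min sketch, a well-known structure for approximate frequency queries over a stream in fixed memory. The slides do not say which sketches SOLMA provides, so this is an assumed example and not SOLMA's API; the class name and its parameters are hypothetical.

```python
import hashlib


class CountMinSketch:
    """Approximate item counts in fixed memory; estimates never undercount."""

    def __init__(self, width: int = 256, depth: int = 4) -> None:
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, item: str, row: int) -> int:
        # One hash function per row, derived by salting the item with the row id.
        digest = hashlib.sha256(f"{row}:{item}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % self.width

    def add(self, item: str, count: int = 1) -> None:
        for row in range(self.depth):
            self.table[row][self._index(item, row)] += count

    def estimate(self, item: str) -> int:
        # Collisions can only inflate a cell, so the minimum is the best guess.
        return min(self.table[row][self._index(item, row)]
                   for row in range(self.depth))


sketch = CountMinSketch()
for event in ["sensor-a", "sensor-b", "sensor-a", "sensor-c", "sensor-a"]:
    sketch.add(event)
print(sketch.estimate("sensor-a"))
```

Memory use depends only on `width` and `depth`, not on the length of the stream, at the price of possibly overestimating counts when items collide.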

PROTEUS H2020

  • 1. PROTEUS Scalable Online Machine Learning for Predictive Analytics and Real-Time Interactive Visualization BONAVENTURA DEL MONTE RESEARCHER @DFKI GMBH PH.D. STUDENT @TU BERLIN EUROPRO WORKSHOP, EDBT 2017This project is funded by the European Union. Horizon 2020
  • 3. 3
  • 4. 4 PROTEUS is a EU H2020 funded research project which aims to design, develop, and provide an open-source ready-to-use Big Data solution, able to perform real-time interactive analytics and predictive analysis through massive online machine learning, efficiently dealing with extremely large historical data and data stream
  • 5. CONTENTS 1. PROJECT DETAILS 2. VALIDATION SCENARIO 3. HYBRID PROCESSING ENGINE 4. SCALABLE ONLINE MACHINE LEARNING 5. REAL-TIME INTERACTIVE VISUAL ANALYTICS 6. CONCLUSION
  • 7. 7 Project details  Expected Outcomes  Hybrid processing  Batch & Stream processing engine  Declarative Language for batch & streams analytics  Scalable Online machine Learning  SOLMA Library  Real-time interactive Visual Analytics  Web charts library  Incremental engine for interactive analytics  Business Impact  Validation in realistic industrial use case
  • 8. 8 Hot Strip Mill: Big Data scenario
  • 10.  Smoother processing of data stream and historical data in the same Flink job  A declarative language for batch and streaming analytics  ETL and ML pipelines expressed in an unified language are holistically optimized 10 Hybrid Processing Gather and clean sensor data PCA Train ML Model D3 D1 D2 Bridging the Gap: Towards Optimizations across Linear and Relational Algebra": Andreas Kunft, Alexander Alexandrov, Asterios Katsifodimos, Volker Markl. BeyondMR workshop @SIGMOD 2016.
  • 11. Scalable Online Machine Learning. ML challenge: distributed data streams. The current state of the art in machine learning algorithms for Big Data is dominated by offline learning algorithms that process data-at-rest. Many current data sources are streaming (online, data-in-motion): sensors, social networks, clickstreams, etc. In online learning, the algorithms see the data only once. The traditional meaning of online is that data is processed sequentially, one item at a time, but for many epochs: prequential evaluation.
  • 12. Real-time Interactive Visual Analytics. How to interactively visualize Big Data? Incremental Analytics engine: incremental partial results in ~O(1). Visualization Layer: SSR-enabled web-based library seamlessly connected to the Incremental Analytics engine: https://github.com/proteus-h2020/proteic
  • 13. Conclusions. PROTEUS is an EU H2020 international research project. PROTEUS will contribute to the Big Data ecosystem with: an innovative hybrid engine for processing both data-at-rest and data-in-motion; SOLMA, a new library for scalable online machine learning; Big Data visualization guidelines (new ways of presenting and working with Big Data); real-time interactive visualization technology (incremental engine & web-based library). PROTEUS will validate its innovations in a realistic industrial scenario. PROTEUS will provide full-scale evaluation and impact assessment, including benchmarks, KPIs, and anonymized datasets: specific metrics for the ArcelorMittal use case, and generic indicators on the advancements in scalable machine learning, hybrid computation, and real-time interactive visual analytics.
  • 14. Thanks for your attention! Questions? Contact us: Bonaventura Del Monte, bonaventura dot delmonte at dfki dot de. www.dfki.berlin, www.proteus-bigdata.com, www.github.com/proteus-h2020
  • 16. Apache Flink 101. Massively parallel data flow engine with unified batch and stream processing. Rich set of operators (including native iteration). Flink Optimizer: inspired by optimizers of parallel database systems; physical optimization follows a cost-based approach. Memory management: Flink manages its own memory; never breaks the JVM heap.
  • 17. Scalable Online Machine Learning. PROTEUS contribution: SOLMA. User-friendly. Extensible. Basic scalable stream sketches that enable querying the stream. Iterative algorithms for approximating the outcome of offline computation. Ready-to-use (supervised & unsupervised) online ML algorithms in Apache Flink.
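
The prequential ("test-then-train") evaluation mentioned on the Scalable Online Machine Learning slides can be illustrated with a minimal, single-machine sketch. Everything below is an assumption made for illustration: the `OnlinePerceptron` class, the `prequential_accuracy` helper, and the synthetic stream are hypothetical and are not part of SOLMA or Apache Flink. The point is only the evaluation order: each item is first used to test the current model and then used for exactly one training step.

```python
import random


class OnlinePerceptron:
    """Hypothetical minimal online learner (illustration only, not SOLMA)."""

    def __init__(self, n_features, lr=1.0):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict(self, x):
        # Linear score thresholded at zero; returns class 0 or 1.
        score = sum(wi * xi for wi, xi in zip(self.w, x)) + self.b
        return 1 if score >= 0 else 0

    def learn_one(self, x, y):
        # Single perceptron update; the weights change only on a mistake.
        error = y - self.predict(x)  # -1, 0 or +1
        if error != 0:
            self.w = [wi + self.lr * error * xi for wi, xi in zip(self.w, x)]
            self.b += self.lr * error


def prequential_accuracy(model, stream):
    """Test-then-train: predict each item first, then train on it once."""
    correct = 0
    seen = 0
    for x, y in stream:
        if model.predict(x) == y:  # test on the not-yet-seen item
            correct += 1
        model.learn_one(x, y)      # then do one training step with it
        seen += 1
    return correct / seen if seen else 0.0


def synthetic_stream(n, seed=0):
    """Assumed toy stream: label is 1 when x0 + x1 > 0, else 0."""
    rng = random.Random(seed)
    for _ in range(n):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        yield x, (1 if x[0] + x[1] > 0 else 0)


if __name__ == "__main__":
    model = OnlinePerceptron(n_features=2)
    print(prequential_accuracy(model, synthetic_stream(2000)))
```

Because every item is scored before the model has seen it, the running accuracy is an estimate on unseen data without needing a separate hold-out set, which is what makes this scheme a natural fit for streams.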

Editor's Notes

  1. As you have probably learned over the last couple of years, big data is not just a huge quantity of heterogeneous data that is complex to analyze and is ingested at a high rate into your data processing system. At the end of the day, what really matters is how much you can capitalize on it.
  2. However, if you are going to start a new big data project, you will face a zoo of technologies, and the final choice, which strictly depends on the use case, is biased by the knowledge of the IT guys leading the project.
  3. However, if you need to deal with large historical data as well as data streams, and to perform predictive analysis and real-time interactive analytics, then you may consider PROTEUS, as it is an open-source, ready-to-use big data solution offering such capabilities. I will show that in the next slides.
  4. The presentation goes as follows
  5. Research partners, pure IT companies and ArcelorMittal
  6. Our validation scenario deals with predicting anomalies in the coils produced through the so-called hot strip mill process. The scenario comes from our partner ArcelorMittal, a leader in the steelmaking industry. To do this, we need to perform analytics on both streaming data and historical data.
  7. Three main subsystems: a hybrid processing engine for large historical data and data streams, powered by an enhanced version of Apache Flink (a distributed dataflow system that handles batch and stream data in a single engine); a library for scalable online machine learning built on top of our processing engine; and the visualization stack, which queries the SOLMA library and the engine in real time.
  8. The ML challenge we are facing deals with data streams: we need online machine learning, which suits stream processing better than traditional batch ML. Online machine learning algorithms see data items one by one. Generally speaking, the algorithm first predicts the class of the item and then performs a single training step on the model. This is called prequential evaluation.
  9. Our real-time interactive visual analytics stack tries to answer the question: how to interactively visualize big data? The answer is through incremental partial results that update the charts, and an SSR-enabled web chart library.
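
To make the idea of incremental partial results with roughly constant cost per update more concrete, here is a small sketch of a running aggregate. It uses Welford's online algorithm for mean and variance, which is our choice for illustration; the slides do not say which aggregates or algorithms the PROTEUS incremental engine actually uses, and the `RunningStats` class and its method names are hypothetical. Each new sensor value updates the state in O(1) time and memory and immediately yields a partial result that a chart could redraw from.

```python
class RunningStats:
    """Hypothetical O(1)-per-item aggregate (Welford's online algorithm)."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, value):
        # Constant-time update: no past values are stored or rescanned.
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (value - self.mean)
        return self.snapshot()

    def snapshot(self):
        # Partial result available after every item (population variance).
        variance = self._m2 / self.count if self.count else 0.0
        return {"count": self.count, "mean": self.mean, "variance": variance}


if __name__ == "__main__":
    stats = RunningStats()
    for reading in [1.0, 2.0, 3.0, 4.0]:
        # In a real deployment each snapshot would be pushed to the chart.
        print(stats.update(reading))
```

The design choice here is that the aggregate keeps only a fixed-size state, so the cost of refreshing a visualization does not grow with the amount of data already processed.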