© 2016 MapR Technologies · MapR Confidential
CEP - A Simplified Enterprise Architecture
for Real-time Stream Processing
Mathieu Dumoulin, Data Engineer (mdumoulin@mapr.com, @lordxar)
Mathieu Dumoulin
• Living in Tokyo, Japan for the last 3 years
• Data Engineer for MapR Professional Services
• Other jobs: Data Scientist, Search Engineer
• Connect with me:
–Read my blog posts:
https://www.mapr.com/blog/author/mathieu-dumoulin
–Twitter: @Lordxar
–Email: mdumoulin@mapr.com
Content Summary
1. Complex Event Processing
2. Streaming Architecture
3. Rules Engines for CEP
4. Simplified Hadoop-based CEP Architecture
5. Live Demo
6. Does it scale?
7. Conclusion
Complex Event Processing (CEP)
Some terminology:
• Event: Data with a timestamp (a log event, a transaction, ...)
• Event processing: Track and analyze streaming event data
• Complex event processing: Identify meaningful events and
respond to them as quickly as possible, usually over a sliding
window on the stream of event data.
CEP is just a fancy way to do
business rules on streaming data
IoT: Needs some CEP in There Somewhere
CEP in Action
The power of CEP comes from being able to detect complex
situations that could not be detected from any individual data
point on its own.
(Diagram: Window opened, Motion sensor, Light turned on, Door opened)
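A minimal sketch of the idea, in Python. The rule itself ("a window opened and motion was seen within 60 seconds, with no door opened in that time") is a hypothetical example built from the event names on this slide, and the event names, the 60-second window and the function name are all assumptions, not something from the talk.

```python
from collections import deque

WINDOW_SECONDS = 60  # assumed window length, not from the talk


def detect_intrusion(events, window_seconds=WINDOW_SECONDS):
    """Yield the timestamp at which the hypothetical pattern completes:
    a window opening and motion inside the time window, with no door
    opening inside the same window."""
    recent = deque()  # (timestamp, kind) pairs still inside the window
    for ts, kind in events:  # events are assumed sorted by timestamp
        recent.append((ts, kind))
        # Evict events that have fallen out of the sliding time window.
        while recent and recent[0][0] < ts - window_seconds:
            recent.popleft()
        kinds = {k for _, k in recent}
        if {"window_opened", "motion"} <= kinds and "door_opened" not in kinds:
            yield ts


events = [
    (0, "door_opened"),
    (10, "light_on"),
    (100, "window_opened"),
    (130, "motion"),
]
alerts = list(detect_intrusion(events))
```

No single event here is suspicious. The alert only appears once the events are looked at together over a window.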
Actually, CEP Has Been Around For a While
Taken from the March 2010 issue of the Dutch Java Magazine (source)
Technology Has Been Holding Rule Engines Back
• Rule engines are not new
– First papers from the 90s, many implementations in the early 2000s
• The engine runs in-memory on a single node
– A few GB of memory (or less) was a severe limitation
– Single core CPU can only do so much
• Need modern stream messaging (Kafka, MapR Streams)
– Need persistence
– Need speed
• No standard, no dominant sponsor
– The 90s and early 2000s were dominated by Microsoft
– OSS had not come of age in enterprise IT
CEP in a Modern Enterprise Data Pipeline
Source: Oracle / Rittman Mead Information Management Reference Architecture
Modern Streaming Architecture
• Build flexible systems
– More efficient and easier to build
– Decouples dependencies
• Better model the way business processes take place.
• More value now
– Aggregates data from many sources once
– Serves data to one or many projects immediately
• More value later
– Run batch analytics on the data later
– Reprocess the data with different algorithms later
Kafka-esque Messaging for Rule Engines
• Stream Persistence is a key feature
• CEP is only one use case
– Support batch analytics and ad-hoc analysis from the same data
stream
• Compensate for current rule engine limitations
– Enables hot replacement for fault-tolerance
– Enables simple horizontal scaling by partitioning data and rules
• Convergence
– Run this use case on your existing, standard, big data technology
– Use OSS frameworks and Open APIs
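A rough sketch of why stream persistence enables hot replacement, in Python. A plain list stands in for a persisted topic, and `CountingEngine` is a hypothetical stand-in for a stateful rule engine session. Neither is a real Kafka, MapR Streams or Drools API.

```python
class CountingEngine:
    """Hypothetical stand-in for a stateful engine: counts events per sensor."""

    def __init__(self):
        self.counts = {}
        self.offset = 0  # next stream offset to consume

    def consume(self, log, upto=None):
        """Consume events from the current offset up to `upto` (or the end)."""
        end = len(log) if upto is None else upto
        for event in log[self.offset:end]:
            self.counts[event] = self.counts.get(event, 0) + 1
        self.offset = end


# The persisted stream; a list stands in for a topic.
log = ["s1", "s2", "s1", "s3", "s1"]

primary = CountingEngine()
primary.consume(log, upto=3)  # the primary handles 3 events, then "fails"

replacement = CountingEngine()  # fresh engine with no state
replacement.consume(log)        # replay the persisted stream from offset 0
```

Because the events are still in the stream, the replacement rebuilds the in-memory state on its own, without any state transfer from the failed engine.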
"Most CEP in IoT [...] is custom coded [...] rather than [using a] general purpose stream platform."
Roy Schulte, vice president, Gartner
See: Complex Event Processing and The Future Of Business Decisions
by David Luckham and W. Roy Schulte
Custom Coded CEP: The Good and The Bad
The Good:
• Made to order with a modern framework
• “No limit” to potential for performance and scalability
• Fit-for-purpose technology
The Bad:
• Engineers aren’t business domain experts
• Lots of work to build from scratch every time
• Changes to logic are a pain point (from the business side)
• Lack of available talent/organizational capability
Declarative Makes Sense For Business
Manage complex behavior through simple rules
working together, executed by a rules engine.
An Open Source Rule Engine: Drools
Drools is a business rule management system (BRMS) with a
forward- and backward-chaining, inference-based rules engine.
• Project homepage: http://www.drools.org/
• Developer: Red Hat
• Enterprise supported version available
– JBoss Enterprise BRMS
• Enhanced implementation of the Rete algorithm
– A state of the art algorithm for rules engines
• Has a GUI Rules Editor: Workbench
An Open Source Rule Engine:
(Diagram labels: Domain Expert, Rules Editor, Production Memory (Rules), Working Memory (Facts), Pattern Matcher, Agenda, Actions)
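To make the pieces in the diagram concrete, here is a toy forward-chaining loop in Python. It is a naive sketch, not the Rete algorithm and not how Drools is implemented. The rule format (`name`, `when`, `then`) and the traffic-flavoured facts are assumptions made up for illustration.

```python
def run_rules(rules, facts):
    """Naive forward chaining: keep firing rules until no new fact appears."""
    working_memory = set(facts)  # the facts known so far
    fired = []                   # names of rules that fired, in order
    changed = True
    while changed:
        changed = False
        # "Agenda": rules whose conditions all hold and whose conclusion is new.
        agenda = [
            r for r in rules
            if r["when"] <= working_memory and r["then"] not in working_memory
        ]
        for rule in agenda:
            if rule["then"] not in working_memory:
                working_memory.add(rule["then"])  # the rule's action: assert a fact
                fired.append(rule["name"])
                changed = True
    return working_memory, fired


# "Production memory": the rules, as a domain expert might state them.
rules = [
    {"name": "congestion", "when": {"many_cars", "low_speed"}, "then": "congested"},
    {"name": "reroute", "when": {"congested"}, "then": "suggest_detour"},
]
memory, fired = run_rules(rules, {"many_cars", "low_speed"})
```

The second rule fires only because the first one added a new fact. That chaining is what lets simple rules combine into complex behavior.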
CEP in Drools: Stateful Session and Sliding Window
STATELESS Session / STATELESS Session
Rule: Is the ball red?
Rule: Are there 2+ red balls in the last 4 balls I’ve seen?
CEP in Drools: Stateful Session + Sliding Window
STATELESS Session, Rule: Is the ball red?
STATEFUL Session, Rule: Are there 2+ red balls in the last 4 balls I’ve seen?
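The difference between the two rules can be sketched in plain Python. This is not Drools code: `is_red` and `RedBallWindow` are hypothetical names, and a `deque` with `maxlen=4` stands in for a length-based sliding window.

```python
from collections import deque


def is_red(ball):
    """Stateless rule: the answer depends only on the current ball."""
    return ball == "red"


class RedBallWindow:
    """Stateful rule: remembers the last `size` balls between calls."""

    def __init__(self, size=4, threshold=2):
        self.window = deque(maxlen=size)  # old balls drop out automatically
        self.threshold = threshold

    def insert(self, ball):
        """Add a ball and report whether the window now holds 2+ red balls."""
        self.window.append(ball)
        return sum(1 for b in self.window if is_red(b)) >= self.threshold


session = RedBallWindow()
balls = ["red", "blue", "blue", "blue", "red", "red"]
results = [session.insert(b) for b in balls]
```

The first red ball has already left the window by the time the next two arrive, so the rule fires only on the last ball.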
Streaming Architecture for CEP
(Diagram labels: Sensors - Real-time Data; Producer; Distributed Cluster (Kafka, MapR); Consumer Server (edge node, cluster node); Integrate with other systems)
The Case for CEP on Streaming Architecture
• Decouple rules maintenance from code and infrastructure
– Manage the cluster separately
– The application code may need only minimal maintenance
• Rules maintenance in the hands of the business domain experts
– Easily supports multiple projects & teams
• Data is persisted in the stream (input and output)
– Open to new use cases
• Send data back to the stream
– Integrate with other downstream use cases
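A small Python sketch of the consumer side described above, under loud assumptions: Python lists stand in for the input and output topics, and `speed_rule` is a made-up business rule. A real deployment would use a Kafka or MapR Streams client and a rule engine session instead.

```python
import json


def speed_rule(event):
    """Hypothetical business rule: flag readings above 100 km/h."""
    if event["speed_kmh"] > 100:
        return {"sensor": event["sensor"], "alert": "speeding"}
    return None


def run_consumer(input_topic, output_topic, rule):
    """Read each message, apply the rule, and send any alert back to a stream.
    Input messages stay in place, so other consumers can still read them."""
    for raw in input_topic:
        alert = rule(json.loads(raw))
        if alert is not None:
            output_topic.append(json.dumps(alert, sort_keys=True))


input_topic = [
    json.dumps({"sensor": "A", "speed_kmh": 80}),
    json.dumps({"sensor": "B", "speed_kmh": 120}),
]
output_topic = []
run_consumer(input_topic, output_topic, speed_rule)
```

The application code only moves messages in and out. The rule is passed in as a separate piece, so it could change without touching the consumer loop.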
But Does It Scale? Yes, But Only to a Point
• Drools and other rule engines are in-memory and the
memory is not distributed
– This is only a technical limitation that can be
overcome (Ex: Alluxio, Apache Ignite)
• Streams make it easy to provide reasonable
fault-tolerance and quick disaster recovery
• Run multiple servers, split rules logically, fan out data
into multiple topics
• A single session can handle 100K+ events/sec. How
much scale is needed?
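One way to picture "fan out data into multiple topics" is key-based partitioning, sketched here in Python. The `intersection` field, the partition count and the use of CRC32 as the hash are all assumptions for illustration. Lists stand in for topics, each of which would feed its own rule engine server.

```python
import zlib


def partition_for(key, num_partitions):
    """Stable partition index for a key (CRC32 is an arbitrary choice here)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def fan_out(events, num_partitions):
    """Route each event to one of `num_partitions` topics by its key, so all
    events for the same key reach the same engine instance."""
    topics = [[] for _ in range(num_partitions)]
    for event in events:
        index = partition_for(event["intersection"], num_partitions)
        topics[index].append(event)
    return topics


events = [
    {"intersection": "shibuya", "cars": 12},
    {"intersection": "ginza", "cars": 7},
    {"intersection": "shibuya", "cars": 15},
    {"intersection": "ueno", "cars": 3},
]
topics = fan_out(events, 3)
```

Keeping all events for one key on one server matters because each engine's working memory is local. A window rule over one intersection needs to see all of that intersection's events.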
Live Demo: Smart City Traffic Management
● Try out integration with Spark Streaming and Flink
● Run serious performance benchmarks
● Deploy into production
Recap
• It’s not Rule Engine vs. Spark and Flink Stream processing
– It’s Rules + Stream Processing
– Spark, Flink, Java are just implementation choices
• Focus on business value from applying rules to data
– Think of benefits of SQL vs. Java, C++, Scala, …
• Great use case for a Streaming Architecture and microservices
An in-depth blog post on this talk topic will be available on
the MapR blog: https://www.mapr.com/blog/author/mathieu-dumoulin
Suggested Reading
● Get Ted & Ellen’s book and many
more for free:
○ https://www.mapr.com/ebooks/
● More great blog content
about CEP and IoT applications
○ Eric Bruno on LinkedIn
○ Karzel et al. on InfoQ
Q & A
@mapr
mdumoulin@mapr.com
@lordxar
Engage with us!
mapr-technologies

Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 

CEP - simplified streaming architecture - Strata Singapore 2016

  • 1. CEP - A Simplified Enterprise Architecture for Real-time Stream Processing. Mathieu Dumoulin, Data Engineer (mdumoulin@mapr.com, @lordxar). © 2016 MapR Technologies, MapR Confidential.
  • 2. Mathieu Dumoulin
      • Living in Tokyo, Japan for the last 3 years
      • Data Engineer for MapR Professional Services
      • Other jobs: Data Scientist, Search Engineer
      • Connect with me:
        – Read my blog posts: https://www.mapr.com/blog/author/mathieu-dumoulin
        – Twitter: @Lordxar
        – Email: mdumoulin@mapr.com
  • 3. Content Summary
      1. Complex Event Processing
      2. Streaming Architecture
      3. Rules Engines for CEP
      4. Simplified Hadoop-based CEP Architecture
      5. Live Demo
      6. Does it scale?
      7. Conclusion
  • 4. Complex Event Processing (CEP)
      Some terminology:
      • Event: Data with a timestamp (a log event, a transaction, ...)
      • Event processing: Track and analyze streaming event data
      • Complex event processing: Identify meaningful events and respond to them as quickly as possible, usually over a sliding window on the stream of event data.
      CEP is just a fancy way to do business rules on streaming data
  • 5. IoT: Needs some CEP in There Somewhere
  • 6. CEP in Action
      The power of CEP comes from being able to detect complex situations that could not be detected from any individual data directly.
      (Diagram labels: Window opened, Motion Sensor, Light turned on, Door opened)
  • 7. Actually, CEP Has Been Around For a While
      Taken from March 2010 issue of the Dutch Java Magazine (source)
  • 8. Technology Has Been Holding Rule Engines Back
      • Rule engines are not new
        – First papers from the 90’s, many implementations in early 2000’s
      • Engine is running in-memory on single node
        – A few GB of memory (or less) was a severe limitation
        – Single core CPU can only do so much
      • Need modern stream messaging (Kafka, MapR Streams)
        – Need persistence
        – Need speed
      • No standard, no dominant sponsor
        – 90’s and early 2000 dominated by Microsoft
        – OSS had not come of age in enterprise IT
  • 9. CEP in a Modern Enterprise Data Pipeline
      Source: Oracle / Rittman Mead Information Management Reference Architecture
  • 10. Modern Streaming Architecture
      • Build flexible systems
        – More efficient and easier to build
        – Decouples dependencies
      • Better model the way business processes take place.
      • More value now
        – Aggregates data from many sources once
        – Serves data to one or many projects immediately
      • More value later
        – Run batch analytics on the data later
        – Reprocess the data with different algorithms later
  • 11. Kafka-esque Messaging for Rule Engines
      • Stream Persistence is a key feature
      • CEP is only one use case
        – Support batch analytics and Ad-hoc analysis from the same data stream
      • Compensate for Current Rule Engine limitations
        – Enables Hot Replacement for fault-tolerance
        – Enables simple horizontal scaling by partitioning data and rules
      • Convergence
        – Run this use case on your existing, standard, big data technology
        – Use OSS frameworks and Open APIs
  • 12. Roy Schulte, vice president, Gartner: "Most CEP in IoT [...] is custom coded [...] rather than [using a] general purpose stream platform."
      See: Complex Event Processing and The Future Of Business Decisions by David Luckham and W. Roy Schulte
  • 13. Custom Coded CEP: The Good and The Bad
      The Good:
      • Made to order with a modern framework
      • “No limit” to potential for performance and scalability
      • Fit to purpose technology
      The Bad:
      • Engineers aren’t business domain experts
      • Lots of work to build from scratch every time
      • Changes to logic are a pain point (from business side)
      • Lack of available talent/organizational capability
  • 14. Declarative Makes Sense For Business
      Manage complex behavior through simple rules working together, executed by a rules engine.
  • 15. An Open Source Rule Engine:
      Drools is a business rule management system (BRMS) with a forward and backward chaining inference based rules engine.
      • Project homepage: http://www.drools.org/
      • Developer: Red Hat
      • Enterprise supported version available
        – JBoss Enterprise BRMS
      • Enhanced implementation of the Rete algorithm
        – A state of the art algorithm for rules engines
      • Has a GUI Rules Editor: Workbench
  • 16. An Open Source Rule Engine:
      (Diagram labels: Production Memory (Rules), Working Memory (Facts), Pattern Matcher, Agenda, Domain Expert, Rules Editor, Actions)
  • 17. CEP in Drools: Stateful Session and Sliding Window
      (Diagram labels: STATELESS Session, STATELESS Session)
      – Rule: Is the ball red?
      – Rule: Are there 2+ red balls in the last 4 balls I’ve seen?
  • 18. CEP in Drools: Stateful Session + Sliding Window
      (Diagram labels: STATEFUL Session, STATELESS Session)
      – Rule: Is the ball red?
      – Rule: Are there 2+ red balls in the last 4 balls I’ve seen?
  • 19. Streaming Architecture for CEP
      (Diagram labels: Sensors - Real-time Data Producer; Distributed Cluster (Kafka, MapR); Consumer Server (Edge node, cluster node); Integrate with other systems)
  • 20. The Case for CEP on Streaming Architecture
      • Decouple rules maintenance from code and infrastructure
        – Manage the cluster separately
        – The application code may need only minimal maintenance
      • Rules maintenance in the hands of the business domain experts
        – Easily supports multiple projects & teams
      • Data is persisted in the stream (input and output)
        – Open to new use cases
      • Send data back to the stream
        – Integrate with other downstream use cases
  • 21. But Does It Scale? Yes, But Only to a Point
      • Drools and other rule engines are in-memory and the memory is not distributed
        – This is only a technical limitation that can be overcome (Ex: Alluxio, Apache Ignite)
      • Streams make it easy to provide reasonable fault-tolerance and quick disaster recovery
      • Run multiple servers, split rules logically, fan out data into multiple topics
      • A single session can handle 100K+ events/sec. How much scale is needed?
  • 22. Live Demo: Smart City Traffic Management
  • 23.
      ● Try out integration with Spark Streaming and Flink
      ● Run serious performance benchmarks
      ● Deploy into production
  • 24. Recap
      • It’s not Rule Engine vs. Spark and Flink Stream processing
        – It’s Rules + Stream Processing
        – Spark, Flink, Java are just an implementation choice
      • Focus on business value from applying rules to data
        – Think of benefits of SQL vs. Java, C++, Scala, …
      • Great use case for a Streaming Architecture and microservices
      An in-depth blog post on this talk topic will be available on MapR blog: https://www.mapr.com/blog/author/mathieu-dumoulin
  • 25. Suggested Reading
      ● Get Ted & Ellen’s book and many more for free:
        ○ https://www.mapr.com/ebooks/
      ● More great blog content about CEP and IoT applications
        ○ Eric Bruno on Linkedin
        ○ Karzel et al. on InfoQ
  • 26. Q & A
      @mapr, mdumoulin@mapr.com, @lordxar
      Engage with us! mapr-technologies
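
Slides 17 and 18 contrast a rule that only needs the current fact ("Is the ball red?") with a rule that needs memory of recent facts ("Are there 2+ red balls in the last 4 balls I've seen?"). The difference can be sketched in plain Python. This is an illustration only, not Drools code: `is_red`, `LastFourWindow`, and the ball colours are names assumed for the example.

```python
from collections import deque


def is_red(ball: str) -> bool:
    """Stateless rule: the answer depends only on the current ball."""
    return ball == "red"


class LastFourWindow:
    """Stateful rule over a count-based sliding window (hypothetical helper)."""

    def __init__(self, size: int = 4, threshold: int = 2) -> None:
        # deque(maxlen=...) drops the oldest ball automatically on append.
        self.window = deque(maxlen=size)
        self.threshold = threshold

    def insert(self, ball: str) -> bool:
        """Add one ball and report whether the rule fires for the current window."""
        self.window.append(ball)
        return sum(is_red(b) for b in self.window) >= self.threshold


balls = ["red", "blue", "blue", "red", "blue", "blue", "blue", "red"]
session = LastFourWindow()
fired = [session.insert(b) for b in balls]
```

The stateless check can run on any server with no coordination, while the windowed rule only gives the right answer if all the relevant balls reach the same session, which is why the talk pairs stateful sessions with a persistent, partitioned stream.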

Editor's Notes

  1. It’s just not true that ML solves all problems. ML seeks to make predictions, which is very useful. But most business processes don’t need a prediction at every step; they are more like a series of steps with conditionals arranged in a DAG.
  2. Rules need to be: independent; easily updated (add, change, delete); applied only to the minimum set of relevant data; open to contributions from business domain experts.
  3. Integrate Flink/Spark Streaming with Drools. Performance and scalability testing. Flink brings lots of benefits “for free”: state is saved automatically by checkpoints; fault recovery for Drools state is simplified; record-at-a-time processing is a good model for adding data to a KieSession.
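
The record-at-a-time model in note 3 can be sketched as a loop that inserts each event into a stateful session and then evaluates the rules. The sketch below uses the sensor events from the "CEP in Action" slide. Everything in it is assumed for illustration: `ToySession` is a stand-in and not the Drools `KieSession` API, and the event names, the 60-second window, and the "possible_intrusion" rule are hypothetical.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    timestamp: float  # seconds; events are assumed to arrive in timestamp order
    kind: str         # hypothetical event names, e.g. "window_opened"


class ToySession:
    """Stand-in for a stateful rule session; not the Drools KieSession API."""

    def __init__(self, window_seconds: float = 60.0) -> None:
        self.window_seconds = window_seconds
        self.facts = deque()

    def insert(self, event: Event) -> None:
        """Add one record, then evict facts that fell out of the time window."""
        self.facts.append(event)
        cutoff = event.timestamp - self.window_seconds
        while self.facts and self.facts[0].timestamp < cutoff:
            self.facts.popleft()

    def fire_rules(self) -> list:
        """Hypothetical rule: window opened plus motion, with no door opened."""
        kinds = {e.kind for e in self.facts}
        if {"window_opened", "motion"} <= kinds and "door_opened" not in kinds:
            return ["possible_intrusion"]
        return []


def process(stream, session):
    """Record-at-a-time loop: insert each event, fire rules, collect alerts."""
    alerts = []
    for event in stream:
        session.insert(event)
        for alert in session.fire_rules():
            alerts.append((event.timestamp, alert))
    return alerts


stream = [
    Event(0.0, "door_opened"),
    Event(10.0, "light_on"),
    Event(100.0, "window_opened"),
    Event(120.0, "motion"),
]
alerts = process(stream, ToySession(window_seconds=60.0))
```

No single event here is suspicious on its own; the alert comes from the combination of events still inside the window. With a longer window the earlier "door_opened" event would still be in memory and the same rule would stay silent.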