© HAZELCAST | CONFIDENTIAL | 1
A Systematic Literature Review and Meta-Analysis of Event Streaming in Academia
John DesJardins @johmdesjardins
(study by Fawaz Ghali)
Study by
✦ Fawaz Ghali @fawazghali
✦ Developer Advocate at Hazelcast
✦ PhD in Computer Science
✦ 46+ peer-reviewed publications
✦ Google Scholar i10-index: 19
✦ Basically, I know this field well.
This picture is derived from Greek mythology: the blind giant Orion carried his servant Cedalion on his shoulders to act as the giant's eyes.
Where do you stand?
Standing on the shoulders of giants (Google Scholar)
90% of the Research Literature is useless (said someone)
Literature Research (Framework + Results) + Best Practices = Success
Part 1: Framework: Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA)
[Figure: PRISMA flow diagram]
Stages: Identification, Screening, Eligibility, Included
Box labels: search term "stream-processing"; 12 digital libraries; 560 articles; duplicates removed; removed if no real-time; removed if before 2021; assessed; further identified with ML; qualitative; quantitative
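The exclusion rules named in the PRISMA diagram (duplicates removed, removed if before 2021, removed if no real-time) can be sketched as a small filtering pipeline. This is only an illustration, not the tooling used in the study: the `Record` fields, the placeholder DOIs, and the order in which the rules are applied are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """One search hit; all field names are hypothetical."""
    doi: str
    year: int
    real_time: bool  # assumed flag: does the paper address real-time processing?


def screen(records, min_year=2021):
    """Apply the exclusion rules named on the slide, in one possible order."""
    counts = {"identified": len(records)}

    # Remove duplicates (the same DOI returned by more than one library).
    unique = list({r.doi: r for r in records}.values())
    counts["after_dedup"] = len(unique)

    # Remove articles published before the cut-off year.
    recent = [r for r in unique if r.year >= min_year]
    counts["after_year_filter"] = len(recent)

    # Remove articles that do not address real-time processing.
    included = [r for r in recent if r.real_time]
    counts["included"] = len(included)
    return included, counts


# Made-up sample data; the DOIs are placeholders, not real papers.
sample = [
    Record("10.0000/a", 2022, True),
    Record("10.0000/a", 2022, True),   # duplicate hit from a second library
    Record("10.0000/b", 2019, True),   # published before the cut-off
    Record("10.0000/c", 2021, False),  # not about real-time processing
    Record("10.0000/d", 2021, True),
]
included, counts = screen(sample)
print(counts)
```

Keeping a count after every step mirrors how a PRISMA diagram reports the number of records remaining at each stage.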
Part 2: Results
✦ DOI: 10.5455/jjcit.71-1646838830: A Novel True-real-time Spatiotemporal Data Stream Processing Framework
✦ scalability (Sc), data analytics (DA), multiple event types (MET), prediction tools (PT), data storage (DS), real-time (Rt), performance evaluation (PE) and stream processing (SP)
A Survey of Distributed Data Stream Processing Frameworks
✦ DOI: 10.1109/ACCESS.2019.2946884
Quantitative Analysis for SQL over Streaming
[Chart: # of publications including citations (y-axis, 0 to 3500) by period: 2010-2012, 2012-2014, 2014-2016, 2016-2018, 2018-2020, 2020-2022]
Quantitative Analysis for ML over Streaming
To Be Added / Coming Soon!
Part 3: Best Practices
John DesJardins - Background in Streaming Analytics
✦ Software AG - Worked in Advanced Technologies Team - involved in onboarding/enablement for acquired technologies:
• 2010 - acquired RTM Group - Java-based streaming engine, later launched as webMethods Business Events
• 2013 - acquired Apama Complex Event Processing
✦ Cloudera - 2015 - 2018
• Apache Spark and Kafka Streams
✦ Hazelcast - 2018 to present
• Hazelcast Jet, unified into Hazelcast Platform in 2021
Gigascale Stream Processing Engine
https://hazelcast.com/blog/billion-events-per-second-with-millisecond-latency-streaming-analytics-at-giga-scale/
Part 3: Best Practices - Streaming Architectures
✦ Do Your Homework - Research on Technologies/Papers, not just Vendors & Analysts, Hacker News, etc. - but also look for emerging technologies/projects
✦ Favor Modern, Distributed Architectures - Best Suited to Cloud/Cloud-Native
✦ Favor Simpler Architectures - Easier to Operate, More Resilient
✦ Ensure Completeness of Capabilities in Your Architecture, such as:
➢ Stream Processing with Windowing, Joins, etc.
➢ Data Pipelines
➢ Connectors & CDC
➢ SQL for stored and streaming data
➢ Fast, Integrated Storage for Stateful Streaming
➢ Ability to Integrate/Execute ML
✦ Run Your Own Benchmarks & POC - Always “Trust but Verify”
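As an illustration of the "Stream Processing with Windowing" capability in the checklist above, here is a minimal sketch of a tumbling (fixed, non-overlapping) window count. It runs over an in-memory list and is not tied to any engine's API; the function name, the event shape, and the sample keys are assumptions. A real stream processor would also have to deal with unbounded input, late or out-of-order events, and state recovery.

```python
from collections import defaultdict


def tumbling_window_counts(events, window_ms):
    """Count events per (window_start, key) using fixed, non-overlapping windows.

    `events` is an iterable of (timestamp_ms, key) pairs.
    """
    counts = defaultdict(int)
    for timestamp_ms, key in events:
        # Align the timestamp down to the start of its window.
        window_start = timestamp_ms - (timestamp_ms % window_ms)
        counts[(window_start, key)] += 1
    return dict(counts)


# Made-up sample events: (timestamp in milliseconds, key).
events = [
    (100, "sensor-a"),
    (900, "sensor-a"),
    (1100, "sensor-b"),
    (1900, "sensor-a"),
]
result = tumbling_window_counts(events, window_ms=1000)
print(result)
```

A small reference implementation like this is also handy when running your own benchmarks or POC: its output can serve as the expected result to check an engine's windowed aggregation against.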
Anti-Pattern for Real-time Stream Processing
Data Lake Pattern
✦ Complex Architecture
✦ Long Lag from Data Birth to Value Creation
✦ Often There Are Changes to Data as It Is Consolidated
✦ Doesn’t Support Zero Downtime
✦ Costly to Implement and Operate
[Diagram labels: Data Is Born; Batch Loads; Data Lakes; Ingest "Raw"; Refine & Enrich; Cleansed; Real-time Analytics & Machine Learning; Action Taken, Value Created]
[Diagram labels: Source (x3); Data Processing; Live Events Analytics (x4); Sink or Client App]
Unified Architecture
Scale Ingest/Data/Compute/ML - Together
[Diagram labels: Event Stream Input; four nodes, each with Streaming Ingest and Queries, Logic & Machine Learning; MLOps Runtimes]
Both Compute & Data Are Partitioned. Compute Is Partition Aware.
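The idea that compute is partition aware can be sketched with a toy cluster: each key hashes to a partition, each node owns a set of partitions, and an event is processed on the node that already holds that key's state. This is a conceptual sketch only, not Hazelcast's implementation; the partition count, the CRC32 hash, and the `Node` and `route` names are all assumptions made for the example.

```python
import zlib

PARTITION_COUNT = 8  # hypothetical value chosen for the sketch


def partition_for(key: str, partition_count: int = PARTITION_COUNT) -> int:
    """Map a key to a partition with a stable hash (CRC32, an arbitrary choice)."""
    return zlib.crc32(key.encode("utf-8")) % partition_count


class Node:
    """A toy cluster member that owns some partitions and their state."""

    def __init__(self, owned_partitions):
        self.owned = set(owned_partitions)
        self.state = {}  # key -> running total, stored next to the compute

    def process(self, key, value):
        # The update touches only local state: no remote lookup is needed.
        self.state[key] = self.state.get(key, 0) + value


def route(nodes, key, value):
    """Send the event to the node that owns the key's partition."""
    pid = partition_for(key)
    owner = next(n for n in nodes if pid in n.owned)
    owner.process(key, value)
    return owner


# Two nodes that split the eight partitions between them.
nodes = [Node(range(0, 4)), Node(range(4, 8))]
for key, value in [("card-1", 10), ("card-2", 5), ("card-1", 7)]:
    route(nodes, key, value)
```

Because the same key always lands on the same owner, both events for `card-1` update one local running total, which is the data-locality effect the slide refers to.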
Hazelcast Advantages
✦ Partitioning of Both Compute & Data = Maximizing Parallel & Distributed Architecture
✦ Data-Aware Processing = Data Locality – Less Movement of Data
✦ Collaborative Worksharing = Simplifies Scalability
✦ In-Memory Optimized = Further Drive Down Latency
✦ Simple Peer to Peer Architecture, Cloud-Native
✦ Multi-Region DR, Live Job Upgrades – Do Not Require Restarts – ZERO DOWNTIME
Additional Unique Features:
✦ Easier to Operate across Dev & Ops
✦ Great Developer Experience
✦ Cooperative Multi-threading - Efficiently Uses Multi-Core Processors
Future work
✦ Blog/Paper - with details on this Systematic Literature Review & additional quantitative findings - Dr. Fawaz Ghali, Hazelcast
✦ Work to get existing paper peer reviewed
• Hazelcast Jet: Low-latency Stream Processing at the 99.99th Percentile
https://arxiv.org/abs/2103.10169
✦ Additional research and papers - focused on the value of a Unified approach to real-time stream processing, along with machine learning, highlighting:
• Benefits of combining compute with data, as well as partitioned, data-aware processing:
⁃ Reducing latency
⁃ Increasing throughput
⁃ Improving resilience
• Work with researchers on applying these technologies to problems in areas such as healthcare imaging
• Papers around our work on next generation architecture - another order of magnitude
Hazelcast Viridian Serverless
▪ Try Us TODAY!
▪ Create a Cluster in Seconds
▪ Free 2GiB of in-memory storage
▪ Declarative API
▪ Support for streaming SQL
▪ Cross-Region replication capabilities
▪ Maximum throughput and low latency
Join the Hazelcast Developer Community
✦ Slack: Join the Hazelcast Community on Slack to chat with users and developers of Hazelcast.
hazelcastcommunity.slack.com
✦ Local events: Speak at or meet the Hazelcast Local community
✦ Twitch live streams: Live sessions, interviews, office hours
https://www.twitch.tv/thehazelcast
✦ Community Blog: Write about your experience or read how other developers use Hazelcast
✦ Hazelcast Heroes: Become a Hazelcast Hero
✦ Email: developer-relations@hazelcast.com
✦ Contribute:
https://github.com/hazelcast
Thank you!
Q&A
Special thanks to the work of Fawaz Ghali –
Developer Advocate
@fawazghali

 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsSnow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
 

Event Streaming in Academia With John Desjardins | Current 2022

  • 1. A Systematic Literature Review and Meta-Analysis of Event Streaming in Academia. John DesJardins @johmdesjardins (study by Fawaz Ghali)
  • 2. Study by
      ✦ Fawaz Ghali @fawazghali
      ✦ Developer Advocate at Hazelcast
      ✦ PhD in Computer Science
      ✦ 46+ peer-reviewed publications
      ✦ Google Scholar i10-index: 19
      ✦ Basically, I know this field well.
  • 3. This picture is derived from Greek mythology: the blind giant Orion carried his servant Cedalion on his shoulders to act as the giant's eyes.
  • 4. Where do you stand? Standing on the shoulders of giants (Google Scholar). 90% of the Research Literature is useless (said someone).
  • 5. Literature Research (Framework + Results) + Best Practices = Success
  • 6. Part 1: Framework: Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA)
  • 7. [PRISMA flow diagram. Stages: Identification, Screening, Eligibility, Included. Labels: (stream-processing); 560 articles; 12 digital libraries; duplicates removed; removed if before 2021; removed if no real-time; assessed; further identified with ML; qualitative; quantitative.]
  • 8. Part 2: Results ✦ DOI: 10.5455/jjcit.71-1646838830: A Novel True-real-time Spatiotemporal Data Stream Processing Framework
  • 9. Part 2: Results
  • 12. ✦ scalability (Sc), data analytics (DA), multiple event types (MET), prediction tools (PT), data storage (DS), real-time (Rt), performance evaluation (PE) and stream processing (SP)
  • 13. A Survey of Distributed Data Stream Processing Frameworks ✦ DOI: 10.1109/ACCESS.2019.2946884
  • 14. Quantitative Analysis for SQL over Streaming [Chart: # of publications including citations, in two-year periods from 2010-2012 through 2020-2022; axis scale 0 to 3500]
  • 15. Quantitative Analysis for ML over Streaming: To Be Added / Coming Soon!
  • 16. Part 3: Best Practices. John DesJardins - Background in Streaming Analytics
      ✦ Software AG - Worked in Advanced Technologies Team - involved in onboarding/enablement for acquired technologies:
          • 2010 - acquired RTM Group - Java-based streaming engine, later launched as webMethods Business Events
          • 2013 - acquired Apama Complex Event Processing
      ✦ Cloudera - 2015 - 2018
          • Apache Spark and Kafka Streams
      ✦ Hazelcast - 2018 to present
          • Hazelcast Jet, unified into Hazelcast Platform in 2021
      Gigascale Stream Processing Engine: https://hazelcast.com/blog/billion-events-per-second-with-millisecond-latency-streaming-analytics-at-giga-scale/
  • 17. Part 3: Best Practices - Streaming Architectures
      ✦ Do Your Homework - Research on Technologies/Papers, not just Vendors & Analysts, Hacker News, etc. - but also look for emerging technologies/projects
      ✦ Favor Modern, Distributed Architectures - Best Suited to Cloud/Cloud-Native
      ✦ Favor Simpler Architectures - Easier to Operate, More Resilient
      ✦ Ensure Completeness of Capabilities in Your Architecture, such as:
          ➢ Stream Processing with Windowing, Joins, etc.
          ➢ Data Pipelines
          ➢ Connectors & CDC
          ➢ SQL for stored and streaming data
          ➢ Fast, Integrated Storage for Stateful Streaming
          ➢ Ability to Integrate/Execute ML
      ✦ Run Your Own Benchmarks & POC - Always “Trust but Verify”
  • 18. Anti-Pattern for Real-time Stream Processing: Data Lake Pattern
      ✦ Complex Architecture
      ✦ Long Lag from Data Birth to Value Creation
      ✦ Often There Are Changes to Data as It Is Consolidated
      ✦ Doesn’t Support Zero Downtime
      ✦ Costly to Implement and Operate
      [Diagram labels: Ingest “Raw”; Refine & Enrich; Cleansed; Real-time Analytics & Machine Learning; Data Lakes; Batch Loads; Data Is Born; Action Taken, Value Created]
  • 19. Unified Architecture: Scale Ingest/Data/Compute/ML - Together. Both Compute & Data Are Partitioned. Compute Is Partition Aware.
      [Diagram labels: Source (x3); Data Processing; Live Events Analytics (x4); Sink or Client App; Streaming Ingest Queries, Logic & Machine Learning (x4); Event Stream Input; MLOps Runtimes]
  • 20. Hazelcast Advantages
      ✦ Partitioning of Both Compute & Data = Maximizing Parallel & Distributed Architecture
      ✦ Data-Aware Processing = Data Locality – Less Movement of Data
      ✦ Collaborative Worksharing = Simplifies Scalability
      ✦ In-Memory Optimized = Further Drive Down Latency
      ✦ Simple Peer to Peer Architecture, Cloud-Native
      ✦ Multi-Region DR, Live Job Upgrades – Do Not Require Restarts – ZERO DOWNTIME
      Additional Unique Features:
      ✦ Easier to Operate across Dev & Ops
      ✦ Great Developer Experience
      ✦ Cooperative Multi-threading - Efficiently Uses Multi-Core Processors
  • 21. Future work
      ✦ Blog/Paper - with details on this Systematic Literature Review & additional quantitative findings - Dr. Fawaz Ghali, Hazelcast
      ✦ Work to get existing paper peer reviewed
          • Hazelcast Jet: Low-latency Stream Processing at the 99.99th Percentile https://arxiv.org/abs/2103.10169
      ✦ Additional research and papers - focused on the value of a Unified approach to real-time stream processing, along with machine learning, highlighting:
          • Benefits of combining compute with data, as well as partitioned, data-aware processing.
              ⁃ Reducing latency
              ⁃ Increasing throughput
              ⁃ Improving resilience
          • Work with researchers on applying these technologies to problems in areas such as healthcare imaging
          • Papers around our work on next generation architecture - another order of magnitude
  • 22. Hazelcast Viridian Serverless
      ▪ Try Us TODAY!
      ▪ Create a Cluster in Seconds
      ▪ Free 2GiB of in-memory storage
      ▪ Declarative API
      ▪ Support for streaming SQL
      ▪ Cross-Region replication capabilities
      ▪ Maximum throughput and low latency
  • 23. Join the Hazelcast Developer Community
      ✦ Slack: Join the Hazelcast Community on Slack to chat with users and developers of Hazelcast. hazelcastcommunity.slack.com
      ✦ Local events: Speak at or meet the Hazelcast Local community
      ✦ Twitch live streams: Live sessions, interviews, office hours https://www.twitch.tv/thehazelcast
      ✦ Community Blog: Write about your experience or read how other developers use Hazelcast
      ✦ Hazelcast Heroes: Become a Hazelcast Hero
      ✦ Email: developer-relations@hazelcast.com
      ✦ Contribute: https://github.com/hazelcast
  • 24. Thank you! Q&A. Special thanks to the work of Fawaz Ghali – Developer Advocate @fawazghali
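
Slide 17 lists stream processing with windowing among the capabilities an architecture should cover. As a rough illustration only, the sketch below groups timestamped events into fixed-size (tumbling) windows and counts events per key in each window. It is plain Python, not the Hazelcast API; the function name `tumbling_window_counts`, the `(timestamp, key)` event shape, and the window size are all assumptions made for this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical event shape for this sketch: (timestamp in milliseconds, key).
Event = Tuple[int, str]


def tumbling_window_counts(
    events: Iterable[Event], window_ms: int
) -> Dict[int, Dict[str, int]]:
    """Count events per key within fixed, non-overlapping windows.

    Returns a mapping from window start timestamp to per-key counts.
    """
    if window_ms <= 0:
        raise ValueError("window_ms must be positive")
    counts: Dict[int, Dict[str, int]] = defaultdict(lambda: defaultdict(int))
    for timestamp, key in events:
        # Align the timestamp down to the start of its window.
        window_start = (timestamp // window_ms) * window_ms
        counts[window_start][key] += 1
    # Convert nested defaultdicts to plain dicts for predictable output.
    return {start: dict(per_key) for start, per_key in counts.items()}


if __name__ == "__main__":
    sample = [(0, "a"), (400, "b"), (999, "a"), (1000, "a"), (1500, "b")]
    print(tumbling_window_counts(sample, window_ms=1000))
```

A production engine would additionally need to handle late and out-of-order events, which this sketch ignores.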
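
Slides 19 and 20 state that both compute and data are partitioned and that compute is partition aware, which gives data locality. One way to picture that idea, under stated assumptions, is a toy store that hashes each key to a partition and runs a function where the value lives instead of moving the value to the caller. The class `PartitionedStore`, its method names, the round-robin ownership scheme, and the partition count of 8 are hypothetical and chosen for illustration; this is not how Hazelcast is implemented.

```python
import zlib
from typing import Callable, Dict, List


class PartitionedStore:
    """Toy key-value store split into partitions, each owned by one member."""

    def __init__(self, members: List[str], partition_count: int = 8) -> None:
        if not members:
            raise ValueError("at least one member is required")
        self.partition_count = partition_count
        # Assign partitions to members round-robin (an assumption of this sketch).
        self.owner: Dict[int, str] = {
            p: members[p % len(members)] for p in range(partition_count)
        }
        self.data: Dict[int, Dict[str, int]] = {
            p: {} for p in range(partition_count)
        }

    def partition_of(self, key: str) -> int:
        # A stable hash keeps a key in the same partition across runs.
        return zlib.crc32(key.encode("utf-8")) % self.partition_count

    def put(self, key: str, value: int) -> None:
        self.data[self.partition_of(key)][key] = value

    def execute_on_key(self, key: str, fn: Callable[[int], int]) -> str:
        """Apply fn to the value in its own partition; return the owning member."""
        partition = self.partition_of(key)
        current = self.data[partition].get(key, 0)
        self.data[partition][key] = fn(current)
        return self.owner[partition]


if __name__ == "__main__":
    store = PartitionedStore(["member-1", "member-2"])
    store.put("sensor-1", 10)
    ran_on = store.execute_on_key("sensor-1", lambda v: v + 5)
    print(ran_on, store.data[store.partition_of("sensor-1")]["sensor-1"])
```

Because the key alone determines the partition, the work can be routed to the member that already holds the data, so only the function and its result cross the network in a real cluster.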