IN-MEMORY STREAM PROCESSING WITH HAZELCAST JET
Nazarii Cherkas | Hazelcast
nazarii@hazelcast.com
https://twitter.com/n_cherkas
Brief Agenda
• Why Stream Processing?
• What's special about Streaming Data
• Challenges when processing the Infinite Stream
• Hazelcast Jet: The modern Stream Processing Engine
• Overview and Key Concepts
• Infinite Stream Processing
• Fault Tolerance
• Jet Performance
• Summary
© 2018 Hazelcast Inc.
About me
• 7+ years of experience in different positions, from Java Engineer to Team Lead
• Solutions Architect at Hazelcast: I solve our users' problems and interact with the community
Why Stream Processing?
Streaming Data is everywhere
What's special about Streaming Data
• Infinite data sets
• Small size of data record
• Near real-time insights
• Variance in throughput and variance in disorder
Definitions of Stream Processing
“...a type of data processing that is designed with infinite data sets in
mind...”
“...processing of data in motion, or in other words, computing on data
directly as it is produced or received…”
“...a technique to process the data on-the-fly, prior to its storage...”
https://jet.hazelcast.org/use-cases/real-time-stream-processing/
https://data-artisans.com/what-is-stream-processing
https://www.oreilly.com/ideas/the-world-beyond-batch-streaming-101
Stream vs Batch Processing
https://aws.amazon.com/streaming-data/
| | Batch processing | Stream processing |
| Data scope | Queries or processing over all or most of the data in the dataset | Queries or processing over data within a rolling time window, or on just the most recent data record |
| Data size | Large batches of data | Individual records or micro batches consisting of a few records |
| Responsiveness | Latencies in minutes to hours | Requires latency in the order of seconds or milliseconds |
| Analyses | Complex analytics | Aggregates, approximation algorithms and simple response functions |
Layers of Stream Processing
Challenges of Stream Processing
• Distributed system coordination
• Notion of time
• Memory management
• Fault-tolerance
Hazelcast Jet: In-Memory Streaming and
Fast Batch Processing
What is Hazelcast Jet
https://github.com/hazelcast/hazelcast-jet/
Apache License 2.0
Hazelcast Jet use cases
• Low-latency Stream processing and analytics
• Fast Batch processing and ETL
• Distributed java.util.stream
• Implementing event sourcing and CQRS
• Data processing microservice architectures
Hazelcast Jet: Architecture Overview
[Architecture diagram. Layer labels: High-Level APIs, Connectors, Processing, Core. Component labels: Pipeline API, java.util.stream, Core API, Batch Readers and Writers, Streaming Readers and Writers, Batch Processing, Stream Processing, Fault-Tolerance, Networking, Deployment, Data Structures and Partition Management, Execution Engine, Cluster Management with Cloud Discovery SPI, Java Client]
Talk is cheap, show me the Word Count Demo
The Word Count problem is the “Hello, World” of stream processing
• Input
• The text of a book in a single file
• A stop list of words to ignore, e.g. “this”, “that”, “of”
• Output
• The top N most frequent words in the book, saved as key -> value pairs
https://es.wikiquote.org/wiki/Linus_Torvalds
https://github.com/ncherkas/hazelcast-jet-demos
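The Word Count problem described above can be sketched on a single machine with plain java.util.stream. This is only an illustration of the logic (tokenize, drop stop words, count, take the top N); it is not the code from the demo repository and does not use the Jet API. The class name, method name and sample input are made up for the example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class WordCountSketch {
    // Splits a line on any run of non-word characters.
    private static final Pattern NON_WORD = Pattern.compile("\\W+");

    /** Returns the n most frequent words that are not in the stop list, most frequent first. */
    public static List<Map.Entry<String, Long>> topWords(List<String> lines, Set<String> stopWords, int n) {
        Map<String, Long> counts = lines.stream()
                .flatMap(line -> Arrays.stream(NON_WORD.split(line.toLowerCase())))
                .filter(word -> !word.isEmpty() && !stopWords.contains(word))
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
        // Order by count (descending); ties are broken alphabetically so the result is deterministic.
        return counts.entrySet().stream()
                .sorted(Map.Entry.<String, Long>comparingByValue().reversed()
                        .thenComparing(Map.Entry.<String, Long>comparingByKey()))
                .limit(n)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Hypothetical input standing in for the book and the stop list.
        List<String> book = Arrays.asList("The jet engine", "the Jet stream");
        Set<String> stopWords = new HashSet<>(Arrays.asList("the"));
        topWords(book, stopWords, 2)
                .forEach(e -> System.out.println(e.getKey() + " -> " + e.getValue()));
    }
}
```

In a distributed engine the same steps would be spread across cluster members, with the counting step partitioned by word.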
Key Concepts
Directed Acyclic Graph (DAG)
Jet Cluster
Job Execution
[Diagram labels: Hazelcast Member, Hazelcast Member, Hazelcast Member]
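To make the DAG idea concrete, here is a minimal sketch of a directed acyclic graph of named processing steps, with a topological ordering that fails when the graph contains a cycle. It is plain Java, not the Jet Core API; the class, its methods and the vertex names (source, tokenize, accumulate, sink) are assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DagSketch {
    // Adjacency list: vertex name -> names of downstream vertices.
    private final Map<String, List<String>> edges = new LinkedHashMap<>();

    public DagSketch vertex(String name) {
        edges.putIfAbsent(name, new ArrayList<>());
        return this;
    }

    public DagSketch edge(String from, String to) {
        vertex(from).vertex(to);
        edges.get(from).add(to);
        return this;
    }

    /** Kahn's algorithm: orders vertices so every edge points forward; throws if there is a cycle. */
    public List<String> topologicalOrder() {
        Map<String, Integer> inDegree = new LinkedHashMap<>();
        edges.keySet().forEach(v -> inDegree.put(v, 0));
        edges.values().forEach(targets -> targets.forEach(t -> inDegree.merge(t, 1, Integer::sum)));

        Deque<String> ready = new ArrayDeque<>();
        inDegree.forEach((v, degree) -> { if (degree == 0) ready.add(v); });

        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String v = ready.remove();
            order.add(v);
            for (String t : edges.get(v)) {
                // A vertex becomes ready once all of its upstream vertices have been emitted.
                if (inDegree.merge(t, -1, Integer::sum) == 0) ready.add(t);
            }
        }
        if (order.size() != edges.size()) throw new IllegalStateException("graph has a cycle");
        return order;
    }

    public static void main(String[] args) {
        DagSketch dag = new DagSketch()
                .edge("source", "tokenize")
                .edge("tokenize", "accumulate")
                .edge("accumulate", "sink");
        System.out.println(dag.topologicalOrder());
    }
}
```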
Jet APIs
• Pipeline API
• The first choice for using Jet: build rich data pipelines over a variety of sources and sinks
• Distributed java.util.stream
• Entry-level usage: simple transform-aggregate operations on IMap, JCache and IList
• Core DAG API
• Low-level API for fine-grained tuning and integration
Sources and Sinks
| Resource | Infinite? | Replayable? | Checkpointing? | Distributed? | Data Locality |
| IList | ❌ | ✅ | ❌ | ❌ | ❌ |
| IMap, ICache | ❌ | ✅ | ❌ | ✅ | Src ✅, Sink ❌ |
| Remote IMap, ICache | ❌ | ✅ | ❌ | ✅ | ❌ |
| Event Journal | ✅ | ✅ | ✅ | ✅ | ❌ |
| Remote Event Journal | ✅ | ✅ | ✅ | ✅ | ❌ |
| HDFS | ❌ | ✅ | ❌ | ✅ | ✅ |
| Kafka | ✅ | ✅ | ✅ | ✅ | ❌ |
| Files | ❌ | ✅ | ❌ | ❌ | ✅ |
| File Watcher | ✅ | ❌ | ❌ | ❌ | ✅ |
| TCP Socket | ✅ | ❌ | ❌ | ❌ | ❌ |
| Application Log | N/A | N/A | ❌ | ❌ | ✅ |
Infinite Stream Processing with Jet
Jet Streaming Demo
Flight Telemetry
Processing a near real-time Flight Telemetry Stream from ADS-B Exchange
- https://www.adsbexchange.com/
• Filter out planes outside of defined airports
• Detect whether the plane is ascending, descending or staying at the same level
• Based on the plane type and phase of the flight, calculate the maximum noise levels near an airport and estimate CO2 emissions for a region
https://github.com/ncherkas/hazelcast-jet-demos
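One way the ascending/descending detection could be approached is to fit a linear trend to recent altitude readings and look at the sign of the slope. The sketch below is plain Java and is not taken from the demo repository; the class, the method names and the threshold value are assumptions made for illustration.

```java
public class VerticalTrendSketch {
    /** Least-squares slope of altitude over time (altitude units per time unit); 0 for fewer than 2 points. */
    public static double slope(long[] times, double[] altitudes) {
        int n = times.length;
        if (n < 2) return 0;
        double meanT = 0, meanA = 0;
        for (int i = 0; i < n; i++) {
            meanT += times[i];
            meanA += altitudes[i];
        }
        meanT /= n;
        meanA /= n;
        double covariance = 0, variance = 0;
        for (int i = 0; i < n; i++) {
            covariance += (times[i] - meanT) * (altitudes[i] - meanA);
            variance += (times[i] - meanT) * (times[i] - meanT);
        }
        // All readings at the same instant: no trend can be computed.
        return variance == 0 ? 0 : covariance / variance;
    }

    /** Classifies the slope; the threshold absorbs small fluctuations in level flight. */
    public static String classify(double slope, double threshold) {
        if (slope > threshold) return "ASCENDING";
        if (slope < -threshold) return "DESCENDING";
        return "LEVEL";
    }

    public static void main(String[] args) {
        long[] seconds = {0, 1, 2};
        double[] altitudeFeet = {1000, 1100, 1200};
        System.out.println(classify(slope(seconds, altitudeFeet), 10));
    }
}
```

In a streaming job the readings would come from a sliding window per aircraft, so the classification is refreshed as new telemetry arrives.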
Jet Streaming Demo
Dashboard Pipeline
Pipeline transformations
• Time-agnostic transformations
• Filter
• Map
• FlatMap
• Aggregation and Grouping
• Built-in count, various kinds of averages, min/max, linear trend and many more
• Co-Aggregation
• Hash-Join
Windowing
Example: 30-second Window Sliding by 10 Seconds
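The sliding-window example can be sketched in plain Java as follows. With a 30-second window sliding by 10 seconds, every event falls into three overlapping windows. The sketch assumes each window covers the half-open interval [end - windowSize, end) and is keyed by its end time; this convention, the class name and the method name are illustrative assumptions, not the Jet API.

```java
import java.util.Map;
import java.util.TreeMap;

public class SlidingWindowSketch {
    /**
     * Counts events per sliding window. The map key is the window end (exclusive),
     * in the same unit as the timestamps; a window covers [end - windowSize, end).
     */
    public static Map<Long, Long> countPerWindow(long[] timestamps, long windowSize, long slideBy) {
        Map<Long, Long> counts = new TreeMap<>();
        for (long ts : timestamps) {
            // The earliest window containing ts ends at the next slide boundary after ts.
            long firstEnd = Math.floorDiv(ts, slideBy) * slideBy + slideBy;
            for (long end = firstEnd; end - windowSize <= ts; end += slideBy) {
                counts.merge(end, 1L, Long::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Event timestamps in milliseconds; 30-second window sliding by 10 seconds.
        long[] events = {5_000, 12_000, 31_000};
        countPerWindow(events, 30_000, 10_000)
                .forEach((end, count) -> System.out.println("window ending at " + end + " ms: " + count));
    }
}
```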
Watermarks to handle Late Events
A watermark makes an educated guess that “from this point on there will be no more items with a timestamp less than this”
Watermarks in Jet
Predefined Watermark Policies
• With Fixed Lag
• Limiting Lag and Delay
• Limiting Lag and Lull
• Limiting Timestamp and Wall-Clock Lag
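As an illustration of the simplest of these policies, the sketch below keeps the watermark a fixed lag behind the highest timestamp seen so far and flags any event older than the current watermark as late. This is a plain-Java interpretation of the "fixed lag" idea under that assumption, not Jet's implementation; the class and method names are hypothetical.

```java
public class FixedLagWatermarkSketch {
    private final long lag;
    private long topTimestamp = Long.MIN_VALUE;

    public FixedLagWatermarkSketch(long lag) {
        this.lag = lag;
    }

    /** Observes an event; returns true if it is on time, false if it is behind the current watermark. */
    public boolean observe(long timestamp) {
        boolean onTime = timestamp >= currentWatermark();
        topTimestamp = Math.max(topTimestamp, timestamp);
        return onTime;
    }

    /** The watermark trails the highest observed timestamp by the configured lag. */
    public long currentWatermark() {
        return topTimestamp == Long.MIN_VALUE ? Long.MIN_VALUE : topTimestamp - lag;
    }

    public static void main(String[] args) {
        FixedLagWatermarkSketch watermark = new FixedLagWatermarkSketch(5_000);
        long[] timestamps = {10_000, 7_000, 4_000, 20_000, 14_999};
        for (long ts : timestamps) {
            System.out.println(ts + (watermark.observe(ts) ? " on time" : " late"));
        }
    }
}
```

A larger lag tolerates more disorder at the cost of higher latency, because window results can only be emitted once the watermark has passed the window end.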
Fault Tolerance
Jet Processing Fault Tolerance
• The cluster elects a Coordinator Member, which takes care of job coordination among the cluster members
• Jet achieves fault tolerance in streaming jobs by taking snapshots of the internal processing state
• The Coordinator Member detects the failure of another member and restarts the job using the new topology
• When the Coordinator Member crashes, the cluster elects a new one
Distributed Snapshots
Technique first described in a paper by Chandy and Lamport in 1985
Jet Processing Guarantees
• At-Least-Once
• Exactly-Once
• At-Most-Once (fault tolerance is turned off)
Performance
Hazelcast Jet Performance
Key Design Decisions
• DAG to Model Computations
• In-Memory Data Locality
• Partition Mapping Affinity
• SP/SC (Single-Producer, Single-Consumer) Queues
• Cooperative Multithreading (Green Threads)
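The cooperative multithreading idea in the list above can be sketched as follows: each unit of work does a small, non-blocking slice per call and reports whether it is finished, so a single thread can interleave many of them without context switches. The names below are illustrative and the code is a simplified model under that assumption, not Jet's execution engine.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class CooperativeSketch {
    /** A unit of work that must return quickly and never block. */
    public interface Tasklet {
        /** Does a small slice of work; returns true when the tasklet is finished. */
        boolean call();
    }

    /** Runs all tasklets on the calling thread, round-robin, until each one reports completion. */
    public static void runAll(List<Tasklet> tasklets) {
        List<Tasklet> active = new ArrayList<>(tasklets);
        while (!active.isEmpty()) {
            for (Iterator<Tasklet> it = active.iterator(); it.hasNext(); ) {
                if (it.next().call()) {
                    it.remove();
                }
            }
        }
    }

    /** Hypothetical tasklet that records one log entry per slice and finishes after the given number of steps. */
    public static Tasklet counting(String name, int steps, List<String> log) {
        int[] done = {0};
        return () -> {
            log.add(name + ":" + done[0]);
            return ++done[0] == steps;
        };
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        runAll(Arrays.asList(counting("a", 2, log), counting("b", 3, log)));
        System.out.println(log); // [a:0, b:0, a:1, b:1, b:2]
    }
}
```

The trade-off is that a tasklet which blocks or runs for a long time stalls every other tasklet sharing its thread.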
Jet Streaming Performance
https://jet.hazelcast.org/performance/
Jet Throughput
Running in Production
Running Jet in Production
• Docker images - https://github.com/hazelcast/hazelcast-jet-docker
• Cluster Management: Mesos, YARN
• Cluster Discovery
• Cloud Providers: AWS, Microsoft Azure, GCP, PCF, Heroku
• Kubernetes
• Consul, Eureka, Zookeeper
Summary
Why you should consider using Hazelcast Jet
• High Performance | Industry Leading
• Out-of-the-box integration with Hazelcast IMDG | Source, Sink, Enrichment
• Easy to start with and integrate | Zero dependencies, developer friendly
• Simple to deploy | Embedded 10MB jar or Client-Server
• Works in every Cloud | Same as Hazelcast IMDG
• For Developers by Developers | Code it
Questions?
Version 0.6 is the current release, with 0.7 coming in Q3 2018 and 1.0 targeted for this year
http://jet.hazelcast.org
https://groups.google.com/forum/#!forum/hazelcast-jet
https://gitter.im/hazelcast/hazelcast
126© 2018 Hazelcast Inc.

More Related Content

What's hot

Multi-tenant Hadoop - the challenge of maintaining high SLAS
Multi-tenant Hadoop - the challenge of maintaining high SLASMulti-tenant Hadoop - the challenge of maintaining high SLAS
Multi-tenant Hadoop - the challenge of maintaining high SLASDataWorks Summit
 
Accelerating query processing with materialized views in Apache Hive
Accelerating query processing with materialized views in Apache HiveAccelerating query processing with materialized views in Apache Hive
Accelerating query processing with materialized views in Apache HiveDataWorks Summit
 
Analyzing the World's Largest Security Data Lake!
Analyzing the World's Largest Security Data Lake!Analyzing the World's Largest Security Data Lake!
Analyzing the World's Largest Security Data Lake!DataWorks Summit
 
Cloud-native Semantic Layer on Data Lake
Cloud-native Semantic Layer on Data LakeCloud-native Semantic Layer on Data Lake
Cloud-native Semantic Layer on Data LakeDatabricks
 
Why and how to leverage the simplicity and power of SQL on Flink
Why and how to leverage the simplicity and power of SQL on FlinkWhy and how to leverage the simplicity and power of SQL on Flink
Why and how to leverage the simplicity and power of SQL on FlinkDataWorks Summit
 
Kyligence Cloud 4 - An Overview
Kyligence Cloud 4 - An OverviewKyligence Cloud 4 - An Overview
Kyligence Cloud 4 - An OverviewSamanthaBerlant
 
Insights into Real World Data Management Challenges
Insights into Real World Data Management ChallengesInsights into Real World Data Management Challenges
Insights into Real World Data Management ChallengesDataWorks Summit
 
The Future of Data Warehousing, Data Science and Machine Learning
The Future of Data Warehousing, Data Science and Machine LearningThe Future of Data Warehousing, Data Science and Machine Learning
The Future of Data Warehousing, Data Science and Machine LearningModusOptimum
 
GDPR-focused partner community showcase for Apache Ranger and Apache Atlas
GDPR-focused partner community showcase for Apache Ranger and Apache AtlasGDPR-focused partner community showcase for Apache Ranger and Apache Atlas
GDPR-focused partner community showcase for Apache Ranger and Apache AtlasDataWorks Summit
 
Disrupting Insurance with Advanced Analytics The Next Generation Carrier
Disrupting Insurance with Advanced Analytics The Next Generation CarrierDisrupting Insurance with Advanced Analytics The Next Generation Carrier
Disrupting Insurance with Advanced Analytics The Next Generation CarrierDataWorks Summit/Hadoop Summit
 
Journey to the Data Lake: How Progressive Paved a Faster, Smoother Path to In...
Journey to the Data Lake: How Progressive Paved a Faster, Smoother Path to In...Journey to the Data Lake: How Progressive Paved a Faster, Smoother Path to In...
Journey to the Data Lake: How Progressive Paved a Faster, Smoother Path to In...DataWorks Summit
 
How to Use Innovative Data Handling and Processing Techniques to Drive Alpha ...
How to Use Innovative Data Handling and Processing Techniques to Drive Alpha ...How to Use Innovative Data Handling and Processing Techniques to Drive Alpha ...
How to Use Innovative Data Handling and Processing Techniques to Drive Alpha ...DataWorks Summit
 
Securing and governing a multi-tenant data lake within the financial industry
Securing and governing a multi-tenant data lake within the financial industrySecuring and governing a multi-tenant data lake within the financial industry
Securing and governing a multi-tenant data lake within the financial industryDataWorks Summit
 
Innovation in the Enterprise Rent-A-Car Data Warehouse
Innovation in the Enterprise Rent-A-Car Data WarehouseInnovation in the Enterprise Rent-A-Car Data Warehouse
Innovation in the Enterprise Rent-A-Car Data WarehouseDataWorks Summit
 
Designing data pipelines for analytics and machine learning in industrial set...
Designing data pipelines for analytics and machine learning in industrial set...Designing data pipelines for analytics and machine learning in industrial set...
Designing data pipelines for analytics and machine learning in industrial set...DataWorks Summit
 
The Hidden Value of Hadoop Migration
The Hidden Value of Hadoop MigrationThe Hidden Value of Hadoop Migration
The Hidden Value of Hadoop MigrationDatabricks
 
Unified Data Access with Gimel
Unified Data Access with GimelUnified Data Access with Gimel
Unified Data Access with GimelAlluxio, Inc.
 
YugaByte + PKS CloudFoundry Meetup 10/15/2018
YugaByte + PKS CloudFoundry Meetup 10/15/2018YugaByte + PKS CloudFoundry Meetup 10/15/2018
YugaByte + PKS CloudFoundry Meetup 10/15/2018AlanCaldera
 
Northwestern Mutual Journey – Transform BI Space to Cloud
Northwestern Mutual Journey – Transform BI Space to CloudNorthwestern Mutual Journey – Transform BI Space to Cloud
Northwestern Mutual Journey – Transform BI Space to CloudDatabricks
 
Intelligent Integration OOW2017 - Jeff Pollock
Intelligent Integration OOW2017 - Jeff PollockIntelligent Integration OOW2017 - Jeff Pollock
Intelligent Integration OOW2017 - Jeff PollockJeffrey T. Pollock
 

What's hot (20)

Multi-tenant Hadoop - the challenge of maintaining high SLAS
Multi-tenant Hadoop - the challenge of maintaining high SLASMulti-tenant Hadoop - the challenge of maintaining high SLAS
Multi-tenant Hadoop - the challenge of maintaining high SLAS
 
Accelerating query processing with materialized views in Apache Hive
Accelerating query processing with materialized views in Apache HiveAccelerating query processing with materialized views in Apache Hive
Accelerating query processing with materialized views in Apache Hive
 
Analyzing the World's Largest Security Data Lake!
Analyzing the World's Largest Security Data Lake!Analyzing the World's Largest Security Data Lake!
Analyzing the World's Largest Security Data Lake!
 
Cloud-native Semantic Layer on Data Lake
Cloud-native Semantic Layer on Data LakeCloud-native Semantic Layer on Data Lake
Cloud-native Semantic Layer on Data Lake
 
Why and how to leverage the simplicity and power of SQL on Flink
Why and how to leverage the simplicity and power of SQL on FlinkWhy and how to leverage the simplicity and power of SQL on Flink
Why and how to leverage the simplicity and power of SQL on Flink
 
Kyligence Cloud 4 - An Overview
Kyligence Cloud 4 - An OverviewKyligence Cloud 4 - An Overview
Kyligence Cloud 4 - An Overview
 
Insights into Real World Data Management Challenges
Insights into Real World Data Management ChallengesInsights into Real World Data Management Challenges
Insights into Real World Data Management Challenges
 
The Future of Data Warehousing, Data Science and Machine Learning
The Future of Data Warehousing, Data Science and Machine LearningThe Future of Data Warehousing, Data Science and Machine Learning
The Future of Data Warehousing, Data Science and Machine Learning
 
GDPR-focused partner community showcase for Apache Ranger and Apache Atlas
GDPR-focused partner community showcase for Apache Ranger and Apache AtlasGDPR-focused partner community showcase for Apache Ranger and Apache Atlas
GDPR-focused partner community showcase for Apache Ranger and Apache Atlas
 
Disrupting Insurance with Advanced Analytics The Next Generation Carrier
Disrupting Insurance with Advanced Analytics The Next Generation CarrierDisrupting Insurance with Advanced Analytics The Next Generation Carrier
Disrupting Insurance with Advanced Analytics The Next Generation Carrier
 
Journey to the Data Lake: How Progressive Paved a Faster, Smoother Path to In...
Journey to the Data Lake: How Progressive Paved a Faster, Smoother Path to In...Journey to the Data Lake: How Progressive Paved a Faster, Smoother Path to In...
Journey to the Data Lake: How Progressive Paved a Faster, Smoother Path to In...
 
How to Use Innovative Data Handling and Processing Techniques to Drive Alpha ...
How to Use Innovative Data Handling and Processing Techniques to Drive Alpha ...How to Use Innovative Data Handling and Processing Techniques to Drive Alpha ...
How to Use Innovative Data Handling and Processing Techniques to Drive Alpha ...
 
Securing and governing a multi-tenant data lake within the financial industry
Securing and governing a multi-tenant data lake within the financial industrySecuring and governing a multi-tenant data lake within the financial industry
Securing and governing a multi-tenant data lake within the financial industry
 
Innovation in the Enterprise Rent-A-Car Data Warehouse
Innovation in the Enterprise Rent-A-Car Data WarehouseInnovation in the Enterprise Rent-A-Car Data Warehouse
Innovation in the Enterprise Rent-A-Car Data Warehouse
 
Designing data pipelines for analytics and machine learning in industrial set...
Designing data pipelines for analytics and machine learning in industrial set...Designing data pipelines for analytics and machine learning in industrial set...
Designing data pipelines for analytics and machine learning in industrial set...
 
The Hidden Value of Hadoop Migration
The Hidden Value of Hadoop MigrationThe Hidden Value of Hadoop Migration
The Hidden Value of Hadoop Migration
 
Unified Data Access with Gimel
Unified Data Access with GimelUnified Data Access with Gimel
Unified Data Access with Gimel
 
YugaByte + PKS CloudFoundry Meetup 10/15/2018
YugaByte + PKS CloudFoundry Meetup 10/15/2018YugaByte + PKS CloudFoundry Meetup 10/15/2018
YugaByte + PKS CloudFoundry Meetup 10/15/2018
 
Northwestern Mutual Journey – Transform BI Space to Cloud
Northwestern Mutual Journey – Transform BI Space to CloudNorthwestern Mutual Journey – Transform BI Space to Cloud
Northwestern Mutual Journey – Transform BI Space to Cloud
 
Intelligent Integration OOW2017 - Jeff Pollock
Intelligent Integration OOW2017 - Jeff PollockIntelligent Integration OOW2017 - Jeff Pollock
Intelligent Integration OOW2017 - Jeff Pollock
 

Similar to In-Memory Stream Processing with Hazelcast Jet @MorningAtLohika

In-Memory Stream Processing with Hazelcast Jet @JEEConf
In-Memory Stream Processing with Hazelcast Jet @JEEConfIn-Memory Stream Processing with Hazelcast Jet @JEEConf
In-Memory Stream Processing with Hazelcast Jet @JEEConfNazarii Cherkas
 
Hadoop Application Architectures tutorial at Big DataService 2015
Hadoop Application Architectures tutorial at Big DataService 2015Hadoop Application Architectures tutorial at Big DataService 2015
Hadoop Application Architectures tutorial at Big DataService 2015hadooparchbook
 
In-Memory Stream Processing with Hazelcast Jet @MorningAtLohika

  • 1. IN-MEMORY STREAM PROCESSING WITH Nazarii Cherkas | Hazelcast nazarii@hazelcast.com https://twitter.com/n_cherkas
  • 2. Brief Agenda
      • Why Stream Processing?
      • What's special about Streaming Data
      • Challenges when processing the Infinite Stream
      • Hazelcast Jet: The modern Stream Processing Engine
      • Overview and Key Concepts
      • Infinite Stream Processing
      • Fault Tolerance
      • Jet Performance
      • Summary
    © 2018 Hazelcast Inc.
  • 3-5. About me
      • 7+ years of experience in different positions, from Java Engineer to Team Lead
      • Solutions Architect at Hazelcast: I solve our users' problems and interact with the community
  • 6. Why Stream Processing?
  • 7. Streaming Data is everywhere
  • 8-12. What's special about Streaming Data
      • Infinite data sets
      • Small size of data record
      • Near real-time insights
      • Variance in throughput and variance in disorder
  • 13-16. Definitions of Stream Processing
      • “...a type of data processing that is designed with infinite data sets in mind...”
      • “...processing of data in motion, or in other words, computing on data directly as it is produced or received…”
      • “...a technique to process the data on-the-fly, prior to it’s storage...”
    Sources: https://jet.hazelcast.org/use-cases/real-time-stream-processing/ https://data-artisans.com/what-is-stream-processing https://www.oreilly.com/ideas/the-world-beyond-batch-streaming-101
  • 17-21. Stream vs Batch Processing (source: https://aws.amazon.com/streaming-data/)
      • Data scope. Batch processing: queries or processing over all or most of the data in the dataset. Stream processing: queries or processing over data within a rolling time window, or on just the most recent data record.
      • Data size. Batch processing: large batches of data. Stream processing: individual records or micro batches consisting of a few records.
      • Responsiveness. Batch processing: latencies in minutes to hours. Stream processing: requires latency in the order of seconds or milliseconds.
      • Analyses. Batch processing: complex analytics. Stream processing: aggregates, approximation algorithms and simple response functions.
  • 22. Layers of Stream Processing
  • 23-27. Challenges of Stream Processing
      • Distributed system coordination
      • Notion of time
      • Memory management
      • Fault-tolerance
  • 28. Hazelcast Jet: In-Memory Streaming and Fast Batch Processing
  • 29-30. What is Hazelcast Jet [diagram labels: Source, Sink] https://github.com/hazelcast/hazelcast-jet/ Apache License 2.0
  • 31-36. Hazelcast Jet use cases
      • Low-latency Stream processing and analytics
      • Fast Batch processing and ETL
      • Distributed java.util.stream
      • Implementing event sourcing and CQRS
      • Data processing microservice architectures
  • 37-39. Hazelcast Jet: Architecture Overview [diagram; layer labels: High-Level APIs, Connectors, Processing, Core; boxes: Pipeline API, java.util.stream, Core API, Batch Readers and Writers, Streaming Readers and Writers, Batch Processing, Stream Processing, Fault-Tolerance, Execution Engine, Data Structures and Partition Management, Networking, Deployment, Cluster Management with Cloud Discovery SPI, Java Client]
  • 40. Talk is cheap, show me the Word Count Demo
    The Word Count problem is the “Hello, World” of the Land of Stream Processing
      • Input
          • Text of a book in a single file
          • Stop-list of words to ignore, e.g. “this”, “that”, “of”
      • Output
          • Top N word occurrences in the book, saved as key -> value pairs
    https://es.wikiquote.org/wiki/Linus_Torvalds https://github.com/ncherkas/hazelcast-jet-demos
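The Word Count task on the slide above can be sketched in plain Java with java.util.stream. This is not the code from the demo repository and it does not use the Jet API; the class and method names (`WordCountSketch`, `topWords`) are hypothetical, and the tokenizer is an assumption: it lower-cases the text and splits on `\W+`, which only treats ASCII letters, digits and underscore as word characters.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical helper, single-JVM only: illustrates the shape of the problem, not Jet itself.
final class WordCountSketch {

    /** Returns the n most frequent words, skipping stop words; ties are broken alphabetically. */
    static Map<String, Long> topWords(String text, Set<String> stopWords, int n) {
        // Tokenize, drop empty tokens and stop words, then count occurrences per word.
        Map<String, Long> counts = Arrays.stream(text.toLowerCase(Locale.ROOT).split("\\W+"))
                .filter(word -> !word.isEmpty())
                .filter(word -> !stopWords.contains(word))
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));

        // Sort by count (descending), then by word, and keep the first n entries in that order.
        return counts.entrySet().stream()
                .sorted(Map.Entry.<String, Long>comparingByValue(Comparator.reverseOrder())
                        .thenComparing(Map.Entry.comparingByKey()))
                .limit(n)
                .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue,
                        (a, b) -> a, LinkedHashMap::new));
    }
}
```

The filter, group and count steps are the same logical stages a distributed engine would spread across cluster members; here they all run in one thread.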
  • 41. Key Concepts
  • 42-47. Key concepts: Directed Acyclic Graph (DAG)
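To make the DAG idea concrete, here is a minimal plain-Java sketch of a computation graph: vertices are processing steps, edges say where data flows, and the acyclic property means the steps can always be put in a valid order. This is not Jet's Core DAG API; `DagSketch` and the vertex names used in the usage note are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of a directed acyclic graph of processing steps.
final class DagSketch {
    // Adjacency list: vertex name -> names of downstream vertices.
    private final Map<String, List<String>> edges = new LinkedHashMap<>();

    DagSketch vertex(String name) {
        edges.putIfAbsent(name, new ArrayList<>());
        return this;
    }

    DagSketch edge(String from, String to) {
        vertex(from);
        vertex(to);
        edges.get(from).add(to);
        return this;
    }

    /** Orders vertices so that every edge points forward (Kahn's algorithm); throws on a cycle. */
    List<String> topologicalOrder() {
        Map<String, Integer> inDegree = new LinkedHashMap<>();
        edges.keySet().forEach(v -> inDegree.put(v, 0));
        edges.values().forEach(targets -> targets.forEach(t -> inDegree.merge(t, 1, Integer::sum)));

        Deque<String> ready = new ArrayDeque<>();
        inDegree.forEach((v, degree) -> { if (degree == 0) ready.add(v); });

        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String v = ready.remove();
            order.add(v);
            for (String target : edges.get(v)) {
                if (inDegree.merge(target, -1, Integer::sum) == 0) {
                    ready.add(target);
                }
            }
        }
        if (order.size() != edges.size()) {
            throw new IllegalStateException("cycle detected: not a DAG");
        }
        return order;
    }
}
```

For example, `new DagSketch().edge("source", "tokenize").edge("tokenize", "count").edge("count", "sink").topologicalOrder()` yields the four steps in data-flow order.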
  • 48-49. Key concepts: Jet Cluster
  • 50-52. Key concepts: Job Execution [diagram labels: three Hazelcast Members]
  • 53-56. Jet APIs
      • Pipeline API: the first choice for using Jet. Build rich data pipelines on a variety of sources and sinks
      • Distributed java.util.stream: entry-level usage, simple transform-aggregate operations on IMap, JCache and IList
      • Core DAG API: low-level API for fine-grained tuning and integration
  • 57-60. Sources and Sinks
    Columns: Infinite? / Replayable? / Checkpointing? / Distributed? / Data Locality
      • IList: ❌ / ✅ / ❌ / ❌ / ❌
      • IMap, ICache: ❌ / ✅ / ❌ / ✅ / Src ✅ Sink ❌
      • Remote IMap, ICache: ❌ / ✅ / ❌ / ✅ / ❌
      • Event Journal: ✅ / ✅ / ✅ / ✅ / ❌
      • Remote Event Journal: ✅ / ✅ / ✅ / ✅ / ❌
      • HDFS: ❌ / ✅ / ❌ / ✅ / ✅
      • Kafka: ✅ / ✅ / ✅ / ✅ / ❌
      • Files: ❌ / ✅ / ❌ / ❌ / ✅
      • File Watcher: ✅ / ❌ / ❌ / ❌ / ✅
      • TCP Socket: ✅ / ❌ / ❌ / ❌ / ❌
      • Application Log: N/A / N/A / ❌ / ❌ / ✅
  • 61. Infinite Stream Processing with Jet
  • 62-65. Jet Streaming Demo: Flight Telemetry
    Processing a near real-time Flight Telemetry Stream from ADS-B Exchange - https://www.adsbexchange.com/
      • Filter out planes outside of defined airports
      • Detect whether the plane is ascending, descending or staying at the same level
      • Based on the plane type and phase of the flight, calculate the maximum noise levels near an airport and estimate CO2 emissions for a region
    https://github.com/ncherkas/hazelcast-jet-demos
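One way to detect whether a plane is ascending, descending or level is to fit a linear trend to its recent altitude samples and look at the sign of the slope. The sketch below is an assumption about how such a step could work, not the demo's actual code; `VerticalDirectionSketch`, the units (seconds, feet) and the threshold parameter are all hypothetical.

```java
// Hypothetical sketch: classify vertical movement from the least-squares slope of altitude over time.
final class VerticalDirectionSketch {

    enum Direction { ASCENDING, DESCENDING, LEVEL }

    /** Least-squares slope of altitude (feet) over time (seconds), i.e. feet per second. */
    static double slope(long[] timesSec, double[] altitudesFt) {
        int n = timesSec.length;
        if (n < 2 || n != altitudesFt.length) {
            throw new IllegalArgumentException("need at least two (time, altitude) samples");
        }
        double sumX = 0, sumY = 0, sumXY = 0, sumXX = 0;
        for (int i = 0; i < n; i++) {
            sumX += timesSec[i];
            sumY += altitudesFt[i];
            sumXY += timesSec[i] * altitudesFt[i];
            sumXX += (double) timesSec[i] * timesSec[i];
        }
        double denominator = n * sumXX - sumX * sumX;
        if (denominator == 0) {
            throw new IllegalArgumentException("all samples share the same timestamp");
        }
        return (n * sumXY - sumX * sumY) / denominator;
    }

    /** Slopes within +/- thresholdFtPerSec are treated as level flight. */
    static Direction classify(long[] timesSec, double[] altitudesFt, double thresholdFtPerSec) {
        double s = slope(timesSec, altitudesFt);
        if (s > thresholdFtPerSec) return Direction.ASCENDING;
        if (s < -thresholdFtPerSec) return Direction.DESCENDING;
        return Direction.LEVEL;
    }
}
```

In a streaming job this calculation would typically run per aircraft over a sliding window of recent samples, so that the classification follows the plane as new telemetry arrives.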
  • 66-74. Jet Streaming Demo: Dashboard Pipeline
  • 76-79. Pipeline transformations
      • Time-agnostic transformations: Filter, Map, Flatmap
      • Aggregation and Grouping: built-in count, various kinds of averages, min/max, linear trends and many more
      • Co-Aggregation
      • Hash-Join
  • 84-87. Windowing Example: 30-second Window Sliding by 10 Seconds
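With a 30-second window sliding by 10 seconds, every event belongs to three overlapping windows. The plain-Java sketch below shows the arithmetic, assuming windows are aligned to multiples of the slide step and cover the half-open range [end - 30 s, end). It does not use Jet's windowing API, and `SlidingWindowSketch` and its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: assign event timestamps to sliding windows identified by their end time.
final class SlidingWindowSketch {
    static final long WINDOW_SIZE_MS = 30_000;
    static final long SLIDE_BY_MS = 10_000;

    /** End timestamps of all windows [end - size, end) that contain the given event timestamp. */
    static List<Long> windowEndsFor(long timestampMs) {
        List<Long> ends = new ArrayList<>();
        // The first window end strictly after the timestamp, aligned to the slide step.
        long firstEnd = Math.floorDiv(timestampMs, SLIDE_BY_MS) * SLIDE_BY_MS + SLIDE_BY_MS;
        for (long end = firstEnd; end <= timestampMs + WINDOW_SIZE_MS; end += SLIDE_BY_MS) {
            ends.add(end);
        }
        return ends;
    }

    /** Counts events per window, keyed by window end timestamp. */
    static Map<Long, Integer> countPerWindow(long... timestampsMs) {
        Map<Long, Integer> counts = new TreeMap<>();
        for (long ts : timestampsMs) {
            for (long end : windowEndsFor(ts)) {
                counts.merge(end, 1, Integer::sum);
            }
        }
        return counts;
    }
}
```

An event at 25 s falls into the windows ending at 30 s, 40 s and 50 s. An engine can avoid recomputing each overlapping window from scratch by keeping partial results per 10-second slice and combining three slices per window.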
  • 88-90. Watermarks to handle Late Events
    A watermark makes an educated guess that “from this point on there will be no more items with timestamp less than this”
  • 91. Watermarks in Jet: Predefined Watermark Policies
      • With Fixed Lag
      • Limiting Lag and Delay
      • Limiting Lag and Lull
      • Limiting Timestamp and Wall-Clock Lag
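The simplest of these ideas, a watermark that trails the highest timestamp seen so far by a fixed lag, can be sketched in a few lines of plain Java. This illustrates the general technique only and is not Jet's implementation; `FixedLagWatermarkSketch` and its methods are hypothetical names.

```java
// Hypothetical sketch of a fixed-lag watermark: watermark = highest observed timestamp - lag.
final class FixedLagWatermarkSketch {
    private final long lagMs;
    private long topTimestampMs = Long.MIN_VALUE;

    FixedLagWatermarkSketch(long lagMs) {
        this.lagMs = lagMs;
    }

    /**
     * Observes one event. Returns true if the event is on time, false if it is late,
     * i.e. its timestamp is already behind the watermark established by earlier events.
     */
    boolean observe(long eventTimestampMs) {
        boolean late = eventTimestampMs < currentWatermark();
        topTimestampMs = Math.max(topTimestampMs, eventTimestampMs);
        return !late;
    }

    /** Current watermark; Long.MIN_VALUE until the first event has been observed. */
    long currentWatermark() {
        return topTimestampMs == Long.MIN_VALUE ? Long.MIN_VALUE : topTimestampMs - lagMs;
    }
}
```

The lag is the trade-off knob: a larger lag tolerates more disorder in the stream but delays the moment a window can be considered complete and emitted.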
  • 92. Fault Tolerance
  • 93. Jet Processing Fault Tolerance: the Cluster elects a Coordinator Member, which takes care of Job coordination among the Cluster Members
  • 94. Jet Processing Fault Tolerance: Jet achieves fault tolerance in streaming jobs by taking snapshots of the internal processing state
  • 95. Jet Processing Fault Tolerance: the Coordinator Member detects the failure of another Member and restarts the Job using the new topology
  • 96. Jet Processing Fault Tolerance: when the Coordinator Member crashes, the Cluster elects a new one
  • 97. Distributed Snapshots Technique first described in a paper by Chandy and Lamport in 1985
  • 102. Jet Processing Guarantees
  • 103. Jet Processing Guarantees • At-Least Once
  • 104. Jet Processing Guarantees • At-Least Once • Exactly Once
  • 105. Jet Processing Guarantees • At-Least Once • Exactly Once • At-Most Once (fault tolerance is turned off)
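One way to see the difference between exactly-once and at-least-once is to look at how a processor with several inputs reacts to snapshot barriers. The sketch below is a simplified, single-threaded illustration of barrier-based snapshotting in plain Java, not Jet's implementation: the processor keeps a running sum as its state, and the `align` flag (a hypothetical name) decides whether items arriving after a barrier on one input are held back until the barrier has arrived on every input.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Simplified illustration of snapshot barriers; it does not use the Jet API. */
class BarrierAlignmentSketch {
    private final boolean align;                 // true models exactly-once, false at-least-once
    private final boolean[] barrierSeen;         // per input: barrier of the current snapshot received?
    private final List<Queue<Integer>> held = new ArrayList<>();
    private final List<Long> snapshots = new ArrayList<>();
    private int barrierCount;
    private long sum;                            // the processor's state

    BarrierAlignmentSketch(int inputCount, boolean align) {
        this.align = align;
        this.barrierSeen = new boolean[inputCount];
        for (int i = 0; i < inputCount; i++) {
            held.add(new ArrayDeque<>());
        }
    }

    void onItem(int input, int value) {
        if (align && barrierSeen[input]) {
            held.get(input).add(value);          // belongs after the snapshot: hold it back
        } else {
            sum += value;
        }
    }

    void onBarrier(int input) {
        if (barrierSeen[input]) {
            throw new IllegalStateException("duplicate barrier on input " + input);
        }
        barrierSeen[input] = true;
        if (++barrierCount < barrierSeen.length) {
            return;                              // still waiting for barriers on other inputs
        }
        snapshots.add(sum);                      // all inputs reached the barrier: save the state
        barrierCount = 0;
        for (int i = 0; i < barrierSeen.length; i++) {
            barrierSeen[i] = false;
            Queue<Integer> q = held.get(i);
            while (!q.isEmpty()) {
                sum += q.poll();                 // release the items that were held back
            }
        }
    }

    long sum() { return sum; }

    List<Long> snapshots() { return snapshots; }
}
```

With alignment, the saved state contains only items that arrived before the barrier on each input, so replaying the post-barrier items after a restart counts each item once. Without alignment, the saved state may already include some post-barrier items, which a replay would then count again; that is the at-least-once behavior.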
  • 107. Hazelcast Jet Performance Key Design Decisions
  • 108. Hazelcast Jet Performance Key Design Decisions • DAG to Model Computations
  • 109. Hazelcast Jet Performance Key Design Decisions • DAG to Model Computations • In-Memory Data Locality
  • 110. Hazelcast Jet Performance Key Design Decisions • DAG to Model Computations • In-Memory Data Locality • Partition Mapping Affinity
  • 111. Hazelcast Jet Performance Key Design Decisions • DAG to Model Computations • In-Memory Data Locality • Partition Mapping Affinity • SP/SC Queues
  • 112. Hazelcast Jet Performance Key Design Decisions • DAG to Model Computations • In-Memory Data Locality • Partition Mapping Affinity • SP/SC Queues • Cooperative Multithreading (Green Threads)
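Cooperative multithreading means that many small tasks share one worker thread: each task does a little work, returns without blocking, and is called again later. A minimal plain-Java sketch of that scheduling loop follows; the `Tasklet` interface here is a hypothetical stand-in defined for this example, not Jet's own type.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Plain-Java illustration of cooperative multithreading; it does not use the Jet API. */
class CooperativeWorkerSketch {

    /** Hypothetical tasklet: performs a small slice of work and never blocks. */
    interface Tasklet {
        /** @return true once the tasklet has finished all of its work */
        boolean call();
    }

    /** Runs all tasklets on the calling thread, round-robin, until every one is done. */
    static void runAll(List<Tasklet> tasklets) {
        List<Tasklet> active = new ArrayList<>(tasklets);
        while (!active.isEmpty()) {
            for (Iterator<Tasklet> it = active.iterator(); it.hasNext(); ) {
                if (it.next().call()) {
                    it.remove();
                }
            }
        }
    }

    /** Tasklet that appends its name to the log once per call, for the given number of calls. */
    static Tasklet counting(String name, int steps, List<String> log) {
        int[] remaining = {steps};
        return () -> {
            if (remaining[0] > 0) {
                log.add(name);
                remaining[0]--;
            }
            return remaining[0] == 0;
        };
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        runAll(Arrays.asList(counting("a", 2, log), counting("b", 3, log)));
        System.out.println(log);
    }
}
```

Since the tasklets interleave on a single thread, no context switch or lock is needed between them; the trade-off is that one tasklet that blocks or runs for a long time stalls all the others on that thread.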
  • 113. Jet Streaming Performance https://jet.hazelcast.org/performance/
  • 114. Jet Throughput https://jet.hazelcast.org/performance/
  • 115. Running in Production
  • 116. Running Jet in Production • Docker images - https://github.com/hazelcast/hazelcast-jet-docker
  • 117. Running Jet in Production • Docker images - https://github.com/hazelcast/hazelcast-jet-docker • Cluster Management: Mesos, YARN
  • 118. Running Jet in Production • Docker images - https://github.com/hazelcast/hazelcast-jet-docker • Cluster Management: Mesos, YARN • Cluster Discovery • Cloud Providers: AWS, Windows Azure, GCP, PCF, Heroku • Kubernetes • Consul, Eureka, ZooKeeper
  • 119. Summary Why you should consider using Hazelcast Jet
  • 120. Summary Why you should consider using Hazelcast Jet • High Performance | Industry Leading
  • 121. Summary Why you should consider using Hazelcast Jet • High Performance | Industry Leading • Out-of-box integration with Hazelcast IMDG | Source, Sink, Enrichment
  • 122. Summary Why you should consider using Hazelcast Jet • High Performance | Industry Leading • Out-of-box integration with Hazelcast IMDG | Source, Sink, Enrichment • Easy to start with and integrate | Zero dependencies, developer friendly
  • 123. Summary Why you should consider using Hazelcast Jet • High Performance | Industry Leading • Out-of-box integration with Hazelcast IMDG | Source, Sink, Enrichment • Easy to start with and integrate | Zero dependencies, developer friendly • Simple to deploy | Embedded 10MB jar or Client-Server
  • 124. Summary Why you should consider using Hazelcast Jet • High Performance | Industry Leading • Out-of-box integration with Hazelcast IMDG | Source, Sink, Enrichment • Easy to start with and integrate | Zero dependencies, developer friendly • Simple to deploy | Embedded 10MB jar or Client-Server • Works in every Cloud | Same as Hazelcast IMDG
  • 125. Summary Why you should consider using Hazelcast Jet • High Performance | Industry Leading • Out-of-box integration with Hazelcast IMDG | Source, Sink, Enrichment • Easy to start with and integrate | Zero dependencies, developer friendly • Simple to deploy | Embedded 10MB jar or Client-Server • Works in every Cloud | Same as Hazelcast IMDG • For Developers by Developers | Code it
  • 126. Questions? Version 0.6 is the current release, with 0.7 coming in Q3 2018 and 1.0 targeted for this year http://jet.hazelcast.org https://groups.google.com/forum/#!forum/hazelcast-jet https://gitter.im/hazelcast/hazelcast

Editor's Notes

  1. TODO: add contacts !!! TODO: what’s written? :)
  3. !!! Mention about conferences TODO: add contacts !!! TODO: what’s written? :)
  4. - the answer is that the streaming data [definition of term] is everywhere and it’s usually about … - all these examples of data are generated all the time and usually come with some important real-time insights that require the processing here and now TODO: too much, remove gaming activities
  5. - fraud detection - alerts generation - variance in throughput -> auto-scaling - disorder -> e.g., a plane full of people taking their phones out of airplane mode after having used them offline for the entire flight - disorder -> producer parallelism and retries – specific to the tools that are used, due to the internals, especially when using batching
  10. - let’s try to understand what Stream Processing is - the key things: on the fly, prior to its storage; infinite data set in mind; data in motion
  12. - the key things: on the fly, prior to its storage; infinite data set in mind; data in motion
  14. K-means, HyperLogLog… How is it different from classic Batch Processing, where we run periodic jobs to handle our data? TODO: review and maybe come up with own points TODO: combine 1 & 2
  16. TODO: review and maybe come up with own points
  19. 1. Architecturally, a stream processing system usually consists of the following 2 layers. 2. Now let’s see how a typical Stream Processing system looks in practice. … 3. However, all this doesn’t come for free: there are multiple challenges to solve when you are processing the Infinite Stream [moving to the next slide].
  20. - problems: how to form the cluster, how to coordinate, and how to control the required level of consistency
  24. - how to solve these problems? - next slide -
  25. - Hazelcast Jet is one of the products that aim to solve such problems
  26. Architecturally, Jet consists of the following layers
  27. TODO: where is DAG API here? A Jet Member is also a fully functional Hazelcast IMDG Member, and a Jet Cluster is also a Hazelcast IMDG Cluster. Hazelcast IMDG provides: a layer of cluster management, deployment, data partitioning and networking; an in-memory store for Jet processing state; shared state to connect multiple Jet Jobs; remote data caching; an enrichment data source.
  29. TODO: Add a client App and make animations.
  30. - Hazelcast Jet is one of the products that aim to solve such problems
  31. TODO: unify orange color among slides! TODO: animation
  37. Uses Hazelcast IMDG clustering under the hood. Peer-to-peer communication. Members can be either set statically or discovered automatically. Elastically scales up or down. Topologies: Embedded, Client-Server.
  39. Unit of work described by a DAG, which is submitted to the cluster for execution. Asynchronous, distributed. Submitted to each running member. *Scales up/down when adding or removing members. Embeds a JAR with the source code, if needed.
  42. [DON’T MAKE MUCH INTRO???] 1) Time-agnostic transformations: context propagation for map, flatMap and filter. 2) Aggregation and Grouping: transformation of a set of input values sharing the same key into a single output value; built-in aggregate operations for count, different kinds of averages, min/max, linear trends and many more; easy to implement your own aggregations. 3) Co-Aggregation: groupBy over the items from more than one contributing stream; like a JOIN with GROUP BY in SQL; typical use case: collecting stats over user activity coming from several streams. 4) Hash-Join: join of one finite stream with another, possibly infinite, stream; optimized for data enrichment, where each item of the primary stream gets enriched with data resolved by a hashtable lookup; to optimize performance, the entire enriching stream is replicated on each Jet member.
  50. Automatic dependent surveillance–broadcast (ADS–B) is a surveillance technology in which an aircraft determines its position via satellite navigation and periodically broadcasts it, enabling it to be tracked. The information can be received by air traffic control ground stations as a replacement for secondary surveillance radar, as no interrogation signal is needed from the ground. It can also be received by other aircraft to provide situational awareness and allow self-separation. ADS–B is "automatic" in that it requires no pilot or external input. It is "dependent" in that it depends on data from the aircraft's navigation system.
  52. TODO: more info plus diagram
  67. Co-Aggregation – join page visits, user data and payments
  68. TODO: must be “Event time” on axis - commutative and associative operations
  69. TODO: must be “Event time” on axis
  75. TODO: Add a client App and make animations.
  76. TODO: ANIMATIONS!!! TODO: add a final step – when the snapshot is completed - due to parallelism, in most cases a processor receives data from more than one upstream processor
  81. TODO: animations
  85. Why it’s worth considering Jet for your next stream processing task
  87. TODO: Key Competitive Differentiators?
  91. TODO: Key Competitive Differentiators? Mention that this is an open product, e.g. it’s easy to implement a connector
  92. TODO: add resources