SlideShare a Scribd company logo
@nicolas_frankel
Introduction to Data
Streaming
@nicolas_frankel
• Former developer, team lead,
architect, blah-blah
• Developer Advocate
• Curious about Kubernetes
Me, myself and I
@nicolas_frankel
Hazelcast
HAZELCAST IMDG is an operational,
in-memory, distributed computing
platform that manages data using
in-memory storage, and performs
parallel execution for breakthrough
application speed and scale.
HAZELCAST JET is the ultra fast,
application embeddable, 3rd
generation stream processing
engine for low latency batch
and stream processing.
@nicolas_frankel
• Why streaming?
• Streaming approaches
• Hazelcast Jet
• Open Data
• General Transit Feed Specification
• The demo!
• Q&A
Schedule
@nicolas_frankel
Data was neatly stored in SQL
databases
In a time before our time…
@nicolas_frankel
• Analytics
• Supermarket sales in the last hour?
• Reporting
• Banking account annual closing
The need for Extract Transform Load
@nicolas_frankel
• Constraints
• Joints
• Normal forms
What SQL really means
@nicolas_frankel
• Normalized vs. denormalized
• Correct vs. fast
Writes vs. reads
@nicolas_frankel
• Different actors
• With different needs
• Using the same database?
The need for ETL
@nicolas_frankel
The batch model
1. Extract
2. Transform
3. Load
@nicolas_frankel
Batches are everywhere!
@nicolas_frankel
• Scheduled at regular intervals
• Daily
• Weekly
• Monthly
• Yearly
• Run in a specific amount of time
Properties of batches
@nicolas_frankel
• When the execution time overlaps
the next execution schedule
• When the space taken by the data
exceeds the storage capacity
• When the batch fails mid-execution
• etc.
Oops
@nicolas_frankel
• Parallelize everything
• Map - Reduce
• Hadoop
• NoSQL
• Schema on Read vs. Schema on Write
Big data!
@nicolas_frankel
• Keep a cursor
• And only manage “chunks” of data
• What about new data coming in?
Or chunking?
@nicolas_frankel
Event-Driven Programming
“Programming paradigm in which the flow of the
program is determined by events such as user
actions (mouse clicks, key presses), sensor outputs, or
messages from other programs or threads”
-- Wikipedia
@nicolas_frankel
Event Sourcing
“Event sourcing persists the state of a business entity
such an Order or a Customer as a sequence of state-
changing events. Whenever the state of a business
entity changes, a new event is appended to the list of
events. Since saving an event is a single operation, it is
inherently atomic. The application reconstructs an
entity’s current state by replaying the events.”
-- https://microservices.io/patterns/data/event-sourcing.html
@nicolas_frankel
• Ordered append-only log
• e.g. MySQL binlog
Database internals
@nicolas_frankel
Make everything event-based!
@nicolas_frankel
• Memory-friendly
• Easily processed
• Pull vs. push
• Very close to real-time
• Keeps derived data in-sync
Benefits
@nicolas_frankel
From finite datasets to infinite
@nicolas_frankel
Streaming is smart ETL
Processing
Ingest
In-Memory
Operational
Storage
Combine
Join, Enrich,
Group, Aggregate
Stream
Windowing, Event-
Time
Processing
Compute
Distributed and
Parallel
Computation
Transform
Filter, Clean,
Convert
Publish
In-Memory,
Subscriber
Notifications
Notify if response
time is 10% over 24
hour average, second
by second
@nicolas_frankel
• Real-time dashboards
• Decision making
• Recommendations
• Stats (gaming, infrastructure
monitoring)
• Prediction - often based on
algorithmic prediction
• Push stream through ML model
• Complex Event Processing
Use Case: Analytics and Decision Making
@nicolas_frankel
• Kafka
• Pulsar
Persistent event-storage systems
@nicolas_frankel
• Distributed
• On-disk storage
• Messages sent and read from a
topic
• Publish-subscribe
• Queue
• Consumer can keep track of the
offset
Kafka
@nicolas_frankel
• Apache Flink
• Amazon Kinesis
• IBM Streams
• Hazelcast Jet
• Apache Beam
• Abstraction over some of the above
• …
In-memory stream processing engines
@nicolas_frankel
• Apache 2 Open Source
• Single JAR
• Leverages Hazelcast IMDG
• Unified batch/streaming API
• (Hazelcast Jet Enterprise)
Hazelcast Jet
@nicolas_frankel
Hazelcast Jet
@nicolas_frankel
• Declaration (code) that defines and
links sources, transforms, and
sinks
• Platform-specific SDK (Pipeline API
in Jet)
• Client submits pipeline to the
Stream Processing Engine (SPE)
Concept: Pipeline
@nicolas_frankel
• Running instance of pipeline in SPE
• SPE executes the pipeline
• Code execution
• Data routing
• Flow control
• Parallel and distributed execution
Concept: Job
@nicolas_frankel
Imperative model
final String text = "...";
final Map<String, Long> counts = new HashMap<>();
for (String word : text.split("W+")) {
Long count = counts.get(word);
counts.put(count == null ? 1L : count + 1);
}
@nicolas_frankel
Declarative model
Map<String, Long> counts = lines.stream()
.map(String::toLowerCase)
.flatMap(
line -> Arrays.stream(line.split("W+"))
)
.filter(word -> !word.isEmpty())
.collect(Collectors.groupingBy(
word -> word, Collectors.counting())
);
@nicolas_frankel
• Multiple nodes
• Scalable storage and performance
• Elasticity
• Data stored, partitioned and
replicated
• No single point of failure
What Distributed Means to Hazelcast
@nicolas_frankel
Distributed Parallel Processing
Pipeline p = Pipeline.create();
p.drawFrom(Sources.<Long, String>map(BOOK_LINES))
.flatMap(line -> traverseArray(line.getValue().split("W+")))
.filter(word -> !word.isEmpty())
.groupingKey(wholeItem())
.aggregate(counting())
.drainTo(Sinks.map(COUNTS));
Data
Sink
Data
Source
from aggrmap filter to
Translate declarative code to a Directed Acyclic Graph
@nicolas_frankel
Node 1
Distributed Parallel Processing
read cmb
map
+
filter
acc sink
read cmb
map
+
filter
acc
Node 2
read cmb
map
+
filter
acc
sinkread cmb
map
+
filter
acc
Data
Source
Data
Sink
sink
sink
@nicolas_frankel
« Open data is the idea that some
data should be freely available to
everyone to use and republish as
they wish, without restrictions from
copyright, patents or other
mechanisms of control. »
--https://en.wikipedia.org/wiki/Open_data
Open Data
@nicolas_frankel
• France:
• https://www.data.gouv.fr/fr/
• Switzerland:
• https://opendata.swiss/en/
• European Union:
• https://data.europa.eu/euodp/en/data/
Some Open Data initiatives
@nicolas_frankel
1. Access
2. Format
3. Standard
4. Data correctness
Challenges
@nicolas_frankel
• Download a file
• Access it interactively through a
web-service
Access
@nicolas_frankel
In general, Open Data means Open
Format
• PDF
• CSV
• XML
• JSON
• etc.
Format
@nicolas_frankel
• Let’s pretend the format is XML
• Which grammar is used?
• A shared standard is required
• Congruent to a domain
Standard
@nicolas_frankel
Data correctness
"32.TA.66-43","16:20:00","16:20:00","8504304"
"32.TA.66-44","24:53:00","24:53:00","8500100"
"32.TA.66-44","25:00:00","25:00:00","8500162"
"32.TA.66-44","25:02:00","25:02:00","8500170"
"32.TA.66-45","23:32:00","23:32:00","8500170"
@nicolas_frankel
General Transit Feed Specification
”The General Transit Feed Specification (GTFS) […]
defines a common format for public transportation
schedules and associated geographic information.
GTFS feeds let public transit agencies publish their
transit data and developers write applications that
consume that data in an interoperable way.”
@nicolas_frankel
GTFS static model
Filename Required Defines
agency.txt Required Transit agencies with service represented in this dataset.
stops.txt Required Stops where vehicles pick up or drop off riders. Also defines stations and station entrances.
routes.txt Required Transit routes. A route is a group of trips that are displayed to riders as a single service.
trips.txt Required
Trips for each route. A trip is a sequence of two or more stops that occur during a specific
time period.
stop_times.txt Required Times that a vehicle arrives at and departs from stops for each trip.
calendar.txt
Conditionally
required
Service dates specified using a weekly schedule with start and end dates. This file is required
unless all dates of service are defined in calendar_dates.txt.
calendar_dates.txt
Conditionally
required
Exceptions for the services defined in the calendar.txt. If calendar.txt is omitted, then
calendar_dates.txt is required and must contain all dates of service.
fare_attributes.txt Optional Fare information for a transit agency's routes.
@nicolas_frankel
GTFS static model
Filename Required Defines
fare_rules.txt Optional Rules to apply fares for itineraries.
shapes.txt Optional Rules for mapping vehicle travel paths, sometimes referred to as route alignments.
frequencies.txt Optional
Headway (time between trips) for headway-based service or a compressed representation of fixed-
schedule service.
transfers.txt Optional Rules for making connections at transfer points between routes.
pathways.txt Optional Pathways linking together locations within stations.
levels.txt Optional Levels within stations.
feed_info.txt Optional Dataset metadata, including publisher, version, and expiration information.
translations.txt Optional Translated information of a transit agency.
attributions.txt Optional Specifies the attributions that are applied to the dataset.
@nicolas_frankel
GTFS dynamic model
@nicolas_frankel
• Open Data
• GTFS static available as
downloadable .txt files
• GTFS dynamic available as a REST
endpoint
Use-case: Swiss Public Transport
@nicolas_frankel
The available data model
Where’s the position?!
@nicolas_frankel
• Source: web service
• Split into trip updates
• Enrich with static trip data
• Enrich with static stop times data
• Transform hours into timestamp
• Enrich with static location data
• Sink: Hazelcast IMDG
The dynamic data pipeline
@nicolas_frankel
Architecture overview
@nicolas_frankel
@nicolas_frankel
Recap
• Streaming has a lot of benefits
• Leverage Open Data
• It’s the Wild West out there
• No standards
• Real-world data sucks!
• But you can get cool stuff done
@nicolas_frankel
• https://blog.frankel.ch/
• @nicolas_frankel
• https://jet.hazelcast.org/
• https://bit.ly/opendataswiss
• https://bit.ly/gtransportfs
• https://bit.ly/jet-train
Thanks a lot!

More Related Content

What's hot

Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to Streaming
Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to StreamingBravo Six, Going Realtime. Transitioning Activision Data Pipeline to Streaming
Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to Streaming
Yaroslav Tkachenko
 
C*ollege Credit: CEP Distribtued Processing on Cassandra with Storm
C*ollege Credit: CEP Distribtued Processing on Cassandra with StormC*ollege Credit: CEP Distribtued Processing on Cassandra with Storm
C*ollege Credit: CEP Distribtued Processing on Cassandra with Storm
DataStax
 
Introduction to Real-Time Data Processing
Introduction to Real-Time Data ProcessingIntroduction to Real-Time Data Processing
Introduction to Real-Time Data Processing
Apache Apex
 
OLAP Basics and Fundamentals by Bharat Kalia
OLAP Basics and Fundamentals by Bharat Kalia OLAP Basics and Fundamentals by Bharat Kalia
OLAP Basics and Fundamentals by Bharat Kalia
Bharat Kalia
 
Pulsar - Real-time Analytics at Scale
Pulsar - Real-time Analytics at ScalePulsar - Real-time Analytics at Scale
Pulsar - Real-time Analytics at Scale
Tony Ng
 
Advanced Analytics using Apache Hive
Advanced Analytics using Apache HiveAdvanced Analytics using Apache Hive
Advanced Analytics using Apache Hive
Murtaza Doctor
 
ksqlDB - Stream Processing simplified!
ksqlDB - Stream Processing simplified!ksqlDB - Stream Processing simplified!
ksqlDB - Stream Processing simplified!
Guido Schmutz
 
Location Analytics - Real-Time Geofencing using Kafka
Location Analytics - Real-Time Geofencing using Kafka Location Analytics - Real-Time Geofencing using Kafka
Location Analytics - Real-Time Geofencing using Kafka
Guido Schmutz
 
Riddles of Streaming - Code Puzzlers for Fun & Profit (Nick Dearden, Confluen...
Riddles of Streaming - Code Puzzlers for Fun & Profit (Nick Dearden, Confluen...Riddles of Streaming - Code Puzzlers for Fun & Profit (Nick Dearden, Confluen...
Riddles of Streaming - Code Puzzlers for Fun & Profit (Nick Dearden, Confluen...
confluent
 
Streaming Analytics @ Uber
Streaming Analytics @ UberStreaming Analytics @ Uber
Streaming Analytics @ Uber
Xiang Fu
 
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
DataWorks Summit
 
Leveraging Kafka for Big Data in Real Time Bidding, Analytics, ML & Campaign ...
Leveraging Kafka for Big Data in Real Time Bidding, Analytics, ML & Campaign ...Leveraging Kafka for Big Data in Real Time Bidding, Analytics, ML & Campaign ...
Leveraging Kafka for Big Data in Real Time Bidding, Analytics, ML & Campaign ...
Helena Edelson
 
Arbitrary Stateful Aggregations using Structured Streaming in Apache Spark
Arbitrary Stateful Aggregations using Structured Streaming in Apache SparkArbitrary Stateful Aggregations using Structured Streaming in Apache Spark
Arbitrary Stateful Aggregations using Structured Streaming in Apache Spark
Databricks
 
A Deep Dive into Stateful Stream Processing in Structured Streaming with Tath...
A Deep Dive into Stateful Stream Processing in Structured Streaming with Tath...A Deep Dive into Stateful Stream Processing in Structured Streaming with Tath...
A Deep Dive into Stateful Stream Processing in Structured Streaming with Tath...
Databricks
 
Confluent real time_acquisition_analysis_and_evaluation_of_data_streams_20190...
Confluent real time_acquisition_analysis_and_evaluation_of_data_streams_20190...Confluent real time_acquisition_analysis_and_evaluation_of_data_streams_20190...
Confluent real time_acquisition_analysis_and_evaluation_of_data_streams_20190...
confluent
 
Streaming SQL
Streaming SQLStreaming SQL
Introduction to Real-time data processing
Introduction to Real-time data processingIntroduction to Real-time data processing
Introduction to Real-time data processing
Yogi Devendra Vyavahare
 
Building real time data-driven products
Building real time data-driven productsBuilding real time data-driven products
Building real time data-driven products
Lars Albertsson
 
Parallel Sequence Generator
Parallel Sequence GeneratorParallel Sequence Generator
Parallel Sequence Generator
Rim Moussa
 
Big Data LDN 2017: Processing Fast Data With Apache Spark: the Tale of Two APIs
Big Data LDN 2017: Processing Fast Data With Apache Spark: the Tale of Two APIsBig Data LDN 2017: Processing Fast Data With Apache Spark: the Tale of Two APIs
Big Data LDN 2017: Processing Fast Data With Apache Spark: the Tale of Two APIs
Matt Stubbs
 

What's hot (20)

Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to Streaming
Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to StreamingBravo Six, Going Realtime. Transitioning Activision Data Pipeline to Streaming
Bravo Six, Going Realtime. Transitioning Activision Data Pipeline to Streaming
 
C*ollege Credit: CEP Distribtued Processing on Cassandra with Storm
C*ollege Credit: CEP Distribtued Processing on Cassandra with StormC*ollege Credit: CEP Distribtued Processing on Cassandra with Storm
C*ollege Credit: CEP Distribtued Processing on Cassandra with Storm
 
Introduction to Real-Time Data Processing
Introduction to Real-Time Data ProcessingIntroduction to Real-Time Data Processing
Introduction to Real-Time Data Processing
 
OLAP Basics and Fundamentals by Bharat Kalia
OLAP Basics and Fundamentals by Bharat Kalia OLAP Basics and Fundamentals by Bharat Kalia
OLAP Basics and Fundamentals by Bharat Kalia
 
Pulsar - Real-time Analytics at Scale
Pulsar - Real-time Analytics at ScalePulsar - Real-time Analytics at Scale
Pulsar - Real-time Analytics at Scale
 
Advanced Analytics using Apache Hive
Advanced Analytics using Apache HiveAdvanced Analytics using Apache Hive
Advanced Analytics using Apache Hive
 
ksqlDB - Stream Processing simplified!
ksqlDB - Stream Processing simplified!ksqlDB - Stream Processing simplified!
ksqlDB - Stream Processing simplified!
 
Location Analytics - Real-Time Geofencing using Kafka
Location Analytics - Real-Time Geofencing using Kafka Location Analytics - Real-Time Geofencing using Kafka
Location Analytics - Real-Time Geofencing using Kafka
 
Riddles of Streaming - Code Puzzlers for Fun & Profit (Nick Dearden, Confluen...
Riddles of Streaming - Code Puzzlers for Fun & Profit (Nick Dearden, Confluen...Riddles of Streaming - Code Puzzlers for Fun & Profit (Nick Dearden, Confluen...
Riddles of Streaming - Code Puzzlers for Fun & Profit (Nick Dearden, Confluen...
 
Streaming Analytics @ Uber
Streaming Analytics @ UberStreaming Analytics @ Uber
Streaming Analytics @ Uber
 
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
 
Leveraging Kafka for Big Data in Real Time Bidding, Analytics, ML & Campaign ...
Leveraging Kafka for Big Data in Real Time Bidding, Analytics, ML & Campaign ...Leveraging Kafka for Big Data in Real Time Bidding, Analytics, ML & Campaign ...
Leveraging Kafka for Big Data in Real Time Bidding, Analytics, ML & Campaign ...
 
Arbitrary Stateful Aggregations using Structured Streaming in Apache Spark
Arbitrary Stateful Aggregations using Structured Streaming in Apache SparkArbitrary Stateful Aggregations using Structured Streaming in Apache Spark
Arbitrary Stateful Aggregations using Structured Streaming in Apache Spark
 
A Deep Dive into Stateful Stream Processing in Structured Streaming with Tath...
A Deep Dive into Stateful Stream Processing in Structured Streaming with Tath...A Deep Dive into Stateful Stream Processing in Structured Streaming with Tath...
A Deep Dive into Stateful Stream Processing in Structured Streaming with Tath...
 
Confluent real time_acquisition_analysis_and_evaluation_of_data_streams_20190...
Confluent real time_acquisition_analysis_and_evaluation_of_data_streams_20190...Confluent real time_acquisition_analysis_and_evaluation_of_data_streams_20190...
Confluent real time_acquisition_analysis_and_evaluation_of_data_streams_20190...
 
Streaming SQL
Streaming SQLStreaming SQL
Streaming SQL
 
Introduction to Real-time data processing
Introduction to Real-time data processingIntroduction to Real-time data processing
Introduction to Real-time data processing
 
Building real time data-driven products
Building real time data-driven productsBuilding real time data-driven products
Building real time data-driven products
 
Parallel Sequence Generator
Parallel Sequence GeneratorParallel Sequence Generator
Parallel Sequence Generator
 
Big Data LDN 2017: Processing Fast Data With Apache Spark: the Tale of Two APIs
Big Data LDN 2017: Processing Fast Data With Apache Spark: the Tale of Two APIsBig Data LDN 2017: Processing Fast Data With Apache Spark: the Tale of Two APIs
Big Data LDN 2017: Processing Fast Data With Apache Spark: the Tale of Two APIs
 

Similar to BruJUG - Introduction to data streaming

JUG Tirana - Introduction to data streaming
JUG Tirana - Introduction to data streamingJUG Tirana - Introduction to data streaming
JUG Tirana - Introduction to data streaming
Nicolas Fränkel
 
BigData conference - Introduction to stream processing
BigData conference - Introduction to stream processingBigData conference - Introduction to stream processing
BigData conference - Introduction to stream processing
Nicolas Fränkel
 
Devclub.lv - Introduction to stream processing
Devclub.lv - Introduction to stream processingDevclub.lv - Introduction to stream processing
Devclub.lv - Introduction to stream processing
Nicolas Fränkel
 
Big Data Berlin v8.0 Stream Processing with Apache Apex
Big Data Berlin v8.0 Stream Processing with Apache Apex Big Data Berlin v8.0 Stream Processing with Apache Apex
Big Data Berlin v8.0 Stream Processing with Apache Apex
Apache Apex
 
Thomas Weise, Apache Apex PMC Member and Architect/Co-Founder, DataTorrent - ...
Thomas Weise, Apache Apex PMC Member and Architect/Co-Founder, DataTorrent - ...Thomas Weise, Apache Apex PMC Member and Architect/Co-Founder, DataTorrent - ...
Thomas Weise, Apache Apex PMC Member and Architect/Co-Founder, DataTorrent - ...
Dataconomy Media
 
[DSC Europe 23] Pramod Immaneni - Real-time analytics at IoT scale
[DSC Europe 23] Pramod Immaneni - Real-time analytics at IoT scale[DSC Europe 23] Pramod Immaneni - Real-time analytics at IoT scale
[DSC Europe 23] Pramod Immaneni - Real-time analytics at IoT scale
DataScienceConferenc1
 
Stream processing - Apache flink
Stream processing - Apache flinkStream processing - Apache flink
Stream processing - Apache flink
Renato Guimaraes
 
Apache Big Data 2016: Next Gen Big Data Analytics with Apache Apex
Apache Big Data 2016: Next Gen Big Data Analytics with Apache ApexApache Big Data 2016: Next Gen Big Data Analytics with Apache Apex
Apache Big Data 2016: Next Gen Big Data Analytics with Apache Apex
Apache Apex
 
Big data serving: Processing and inference at scale in real time
Big data serving: Processing and inference at scale in real timeBig data serving: Processing and inference at scale in real time
Big data serving: Processing and inference at scale in real time
Itai Yaffe
 
Kafka streams decoupling with stores
Kafka streams decoupling with storesKafka streams decoupling with stores
Kafka streams decoupling with stores
Yoni Farin
 
Intro to Apache Apex - Next Gen Platform for Ingest and Transform
Intro to Apache Apex - Next Gen Platform for Ingest and TransformIntro to Apache Apex - Next Gen Platform for Ingest and Transform
Intro to Apache Apex - Next Gen Platform for Ingest and Transform
Apache Apex
 
The End of a Myth: Ultra-Scalable Transactional Management
The End of a Myth: Ultra-Scalable Transactional ManagementThe End of a Myth: Ultra-Scalable Transactional Management
The End of a Myth: Ultra-Scalable Transactional Management
Ricardo Jimenez-Peris
 
Next Gen Big Data Analytics with Apache Apex
Next Gen Big Data Analytics with Apache Apex Next Gen Big Data Analytics with Apache Apex
Next Gen Big Data Analytics with Apache Apex
DataWorks Summit/Hadoop Summit
 
Hadoop Summit SJ 2016: Next Gen Big Data Analytics with Apache Apex
Hadoop Summit SJ 2016: Next Gen Big Data Analytics with Apache ApexHadoop Summit SJ 2016: Next Gen Big Data Analytics with Apache Apex
Hadoop Summit SJ 2016: Next Gen Big Data Analytics with Apache Apex
Apache Apex
 
Putting the Micro into Microservices with Stateful Stream Processing
Putting the Micro into Microservices with Stateful Stream ProcessingPutting the Micro into Microservices with Stateful Stream Processing
Putting the Micro into Microservices with Stateful Stream Processing
confluent
 
Event Driven Services Part 3: Putting the Micro into Microservices with State...
Event Driven Services Part 3: Putting the Micro into Microservices with State...Event Driven Services Part 3: Putting the Micro into Microservices with State...
Event Driven Services Part 3: Putting the Micro into Microservices with State...
Ben Stopford
 
February 2014 HUG : Tez Details and Insides
February 2014 HUG : Tez Details and InsidesFebruary 2014 HUG : Tez Details and Insides
February 2014 HUG : Tez Details and Insides
Yahoo Developer Network
 
Real time analytics on deep learning @ strata data 2019
Real time analytics on deep learning @ strata data 2019Real time analytics on deep learning @ strata data 2019
Real time analytics on deep learning @ strata data 2019
Zhenxiao Luo
 
Drinking from the Firehose - Real-time Metrics
Drinking from the Firehose - Real-time MetricsDrinking from the Firehose - Real-time Metrics
Drinking from the Firehose - Real-time Metrics
Samantha Quiñones
 
SnappyData, the Spark Database. A unified cluster for streaming, transactions...
SnappyData, the Spark Database. A unified cluster for streaming, transactions...SnappyData, the Spark Database. A unified cluster for streaming, transactions...
SnappyData, the Spark Database. A unified cluster for streaming, transactions...
SnappyData
 

Similar to BruJUG - Introduction to data streaming (20)

JUG Tirana - Introduction to data streaming
JUG Tirana - Introduction to data streamingJUG Tirana - Introduction to data streaming
JUG Tirana - Introduction to data streaming
 
BigData conference - Introduction to stream processing
BigData conference - Introduction to stream processingBigData conference - Introduction to stream processing
BigData conference - Introduction to stream processing
 
Devclub.lv - Introduction to stream processing
Devclub.lv - Introduction to stream processingDevclub.lv - Introduction to stream processing
Devclub.lv - Introduction to stream processing
 
Big Data Berlin v8.0 Stream Processing with Apache Apex
Big Data Berlin v8.0 Stream Processing with Apache Apex Big Data Berlin v8.0 Stream Processing with Apache Apex
Big Data Berlin v8.0 Stream Processing with Apache Apex
 
Thomas Weise, Apache Apex PMC Member and Architect/Co-Founder, DataTorrent - ...
Thomas Weise, Apache Apex PMC Member and Architect/Co-Founder, DataTorrent - ...Thomas Weise, Apache Apex PMC Member and Architect/Co-Founder, DataTorrent - ...
Thomas Weise, Apache Apex PMC Member and Architect/Co-Founder, DataTorrent - ...
 
[DSC Europe 23] Pramod Immaneni - Real-time analytics at IoT scale
[DSC Europe 23] Pramod Immaneni - Real-time analytics at IoT scale[DSC Europe 23] Pramod Immaneni - Real-time analytics at IoT scale
[DSC Europe 23] Pramod Immaneni - Real-time analytics at IoT scale
 
Stream processing - Apache flink
Stream processing - Apache flinkStream processing - Apache flink
Stream processing - Apache flink
 
Apache Big Data 2016: Next Gen Big Data Analytics with Apache Apex
Apache Big Data 2016: Next Gen Big Data Analytics with Apache ApexApache Big Data 2016: Next Gen Big Data Analytics with Apache Apex
Apache Big Data 2016: Next Gen Big Data Analytics with Apache Apex
 
Big data serving: Processing and inference at scale in real time
Big data serving: Processing and inference at scale in real timeBig data serving: Processing and inference at scale in real time
Big data serving: Processing and inference at scale in real time
 
Kafka streams decoupling with stores
Kafka streams decoupling with storesKafka streams decoupling with stores
Kafka streams decoupling with stores
 
Intro to Apache Apex - Next Gen Platform for Ingest and Transform
Intro to Apache Apex - Next Gen Platform for Ingest and TransformIntro to Apache Apex - Next Gen Platform for Ingest and Transform
Intro to Apache Apex - Next Gen Platform for Ingest and Transform
 
The End of a Myth: Ultra-Scalable Transactional Management
The End of a Myth: Ultra-Scalable Transactional ManagementThe End of a Myth: Ultra-Scalable Transactional Management
The End of a Myth: Ultra-Scalable Transactional Management
 
Next Gen Big Data Analytics with Apache Apex
Next Gen Big Data Analytics with Apache Apex Next Gen Big Data Analytics with Apache Apex
Next Gen Big Data Analytics with Apache Apex
 
Hadoop Summit SJ 2016: Next Gen Big Data Analytics with Apache Apex
Hadoop Summit SJ 2016: Next Gen Big Data Analytics with Apache ApexHadoop Summit SJ 2016: Next Gen Big Data Analytics with Apache Apex
Hadoop Summit SJ 2016: Next Gen Big Data Analytics with Apache Apex
 
Putting the Micro into Microservices with Stateful Stream Processing
Putting the Micro into Microservices with Stateful Stream ProcessingPutting the Micro into Microservices with Stateful Stream Processing
Putting the Micro into Microservices with Stateful Stream Processing
 
Event Driven Services Part 3: Putting the Micro into Microservices with State...
Event Driven Services Part 3: Putting the Micro into Microservices with State...Event Driven Services Part 3: Putting the Micro into Microservices with State...
Event Driven Services Part 3: Putting the Micro into Microservices with State...
 
February 2014 HUG : Tez Details and Insides
February 2014 HUG : Tez Details and InsidesFebruary 2014 HUG : Tez Details and Insides
February 2014 HUG : Tez Details and Insides
 
Real time analytics on deep learning @ strata data 2019
Real time analytics on deep learning @ strata data 2019Real time analytics on deep learning @ strata data 2019
Real time analytics on deep learning @ strata data 2019
 
Drinking from the Firehose - Real-time Metrics
Drinking from the Firehose - Real-time MetricsDrinking from the Firehose - Real-time Metrics
Drinking from the Firehose - Real-time Metrics
 
SnappyData, the Spark Database. A unified cluster for streaming, transactions...
SnappyData, the Spark Database. A unified cluster for streaming, transactions...SnappyData, the Spark Database. A unified cluster for streaming, transactions...
SnappyData, the Spark Database. A unified cluster for streaming, transactions...
 

More from Nicolas Fränkel

SnowCamp - Adding search to a legacy application
SnowCamp - Adding search to a legacy applicationSnowCamp - Adding search to a legacy application
SnowCamp - Adding search to a legacy application
Nicolas Fränkel
 
Un CV de dévelopeur toujours a jour
Un CV de dévelopeur toujours a jourUn CV de dévelopeur toujours a jour
Un CV de dévelopeur toujours a jour
Nicolas Fränkel
 
Zero-downtime deployment on Kubernetes with Hazelcast
Zero-downtime deployment on Kubernetes with HazelcastZero-downtime deployment on Kubernetes with Hazelcast
Zero-downtime deployment on Kubernetes with Hazelcast
Nicolas Fränkel
 
jLove - A Change-Data-Capture use-case: designing an evergreen cache
jLove - A Change-Data-Capture use-case: designing an evergreen cachejLove - A Change-Data-Capture use-case: designing an evergreen cache
jLove - A Change-Data-Capture use-case: designing an evergreen cache
Nicolas Fränkel
 
ADDO - Your own Kubernetes controller, not only in Go
ADDO - Your own Kubernetes controller, not only in GoADDO - Your own Kubernetes controller, not only in Go
ADDO - Your own Kubernetes controller, not only in Go
Nicolas Fränkel
 
TestCon Europe - Mutation Testing to the Rescue of Your Tests
TestCon Europe - Mutation Testing to the Rescue of Your TestsTestCon Europe - Mutation Testing to the Rescue of Your Tests
TestCon Europe - Mutation Testing to the Rescue of Your Tests
Nicolas Fränkel
 
OSCONF Jaipur - A Hitchhiker's Tour to Containerizing a Java application
OSCONF Jaipur - A Hitchhiker's Tour to Containerizing a Java applicationOSCONF Jaipur - A Hitchhiker's Tour to Containerizing a Java application
OSCONF Jaipur - A Hitchhiker's Tour to Containerizing a Java application
Nicolas Fränkel
 
GeekcampSG 2020 - A Change-Data-Capture use-case: designing an evergreen cache
GeekcampSG 2020 - A Change-Data-Capture use-case: designing an evergreen cacheGeekcampSG 2020 - A Change-Data-Capture use-case: designing an evergreen cache
GeekcampSG 2020 - A Change-Data-Capture use-case: designing an evergreen cache
Nicolas Fränkel
 
JavaDay Istanbul - 3 improvements in your microservices architecture
JavaDay Istanbul - 3 improvements in your microservices architectureJavaDay Istanbul - 3 improvements in your microservices architecture
JavaDay Istanbul - 3 improvements in your microservices architecture
Nicolas Fränkel
 
OSCONF Hyderabad - Shorten all URLs!
OSCONF Hyderabad - Shorten all URLs!OSCONF Hyderabad - Shorten all URLs!
OSCONF Hyderabad - Shorten all URLs!
Nicolas Fränkel
 
OSCONF Koshi - Zero downtime deployment with Kubernetes, Flyway and Spring Boot
OSCONF Koshi - Zero downtime deployment with Kubernetes, Flyway and Spring BootOSCONF Koshi - Zero downtime deployment with Kubernetes, Flyway and Spring Boot
OSCONF Koshi - Zero downtime deployment with Kubernetes, Flyway and Spring Boot
Nicolas Fränkel
 
JOnConf - A CDC use-case: designing an Evergreen Cache
JOnConf - A CDC use-case: designing an Evergreen CacheJOnConf - A CDC use-case: designing an Evergreen Cache
JOnConf - A CDC use-case: designing an Evergreen Cache
Nicolas Fränkel
 
London In-Memory Computing Meetup - A Change-Data-Capture use-case: designing...
London In-Memory Computing Meetup - A Change-Data-Capture use-case: designing...London In-Memory Computing Meetup - A Change-Data-Capture use-case: designing...
London In-Memory Computing Meetup - A Change-Data-Capture use-case: designing...
Nicolas Fränkel
 
Java.IL - Your own Kubernetes controller, not only in Go!
Java.IL - Your own Kubernetes controller, not only in Go!Java.IL - Your own Kubernetes controller, not only in Go!
Java.IL - Your own Kubernetes controller, not only in Go!
Nicolas Fränkel
 
London Java Community - An Experiment in Continuous Deployment of JVM applica...
London Java Community - An Experiment in Continuous Deployment of JVM applica...London Java Community - An Experiment in Continuous Deployment of JVM applica...
London Java Community - An Experiment in Continuous Deployment of JVM applica...
Nicolas Fränkel
 
OSCONF - Your own Kubernetes controller: not only in Go
OSCONF - Your own Kubernetes controller: not only in GoOSCONF - Your own Kubernetes controller: not only in Go
OSCONF - Your own Kubernetes controller: not only in Go
Nicolas Fränkel
 
vKUG - Migrating Spring Boot apps from annotation-based config to Functional
vKUG - Migrating Spring Boot apps from annotation-based config to FunctionalvKUG - Migrating Spring Boot apps from annotation-based config to Functional
vKUG - Migrating Spring Boot apps from annotation-based config to Functional
Nicolas Fränkel
 
Tech talks - 3 performance improvements
Tech talks - 3 performance improvementsTech talks - 3 performance improvements
Tech talks - 3 performance improvements
Nicolas Fränkel
 
AllTheTalks.online - A Streaming Use-Case: And Experiment in Continuous Deplo...
AllTheTalks.online - A Streaming Use-Case: And Experiment in Continuous Deplo...AllTheTalks.online - A Streaming Use-Case: And Experiment in Continuous Deplo...
AllTheTalks.online - A Streaming Use-Case: And Experiment in Continuous Deplo...
Nicolas Fränkel
 
ING Meetup - Migrating Spring Boot Config Annotations to Functional with Kotlin
ING Meetup - Migrating Spring Boot Config Annotations to Functional with KotlinING Meetup - Migrating Spring Boot Config Annotations to Functional with Kotlin
ING Meetup - Migrating Spring Boot Config Annotations to Functional with Kotlin
Nicolas Fränkel
 

More from Nicolas Fränkel (20)

SnowCamp - Adding search to a legacy application
SnowCamp - Adding search to a legacy applicationSnowCamp - Adding search to a legacy application
SnowCamp - Adding search to a legacy application
 
Un CV de dévelopeur toujours a jour
Un CV de dévelopeur toujours a jourUn CV de dévelopeur toujours a jour
Un CV de dévelopeur toujours a jour
 
Zero-downtime deployment on Kubernetes with Hazelcast
Zero-downtime deployment on Kubernetes with HazelcastZero-downtime deployment on Kubernetes with Hazelcast
Zero-downtime deployment on Kubernetes with Hazelcast
 
jLove - A Change-Data-Capture use-case: designing an evergreen cache
jLove - A Change-Data-Capture use-case: designing an evergreen cachejLove - A Change-Data-Capture use-case: designing an evergreen cache
jLove - A Change-Data-Capture use-case: designing an evergreen cache
 
ADDO - Your own Kubernetes controller, not only in Go
ADDO - Your own Kubernetes controller, not only in GoADDO - Your own Kubernetes controller, not only in Go
ADDO - Your own Kubernetes controller, not only in Go
 
TestCon Europe - Mutation Testing to the Rescue of Your Tests
TestCon Europe - Mutation Testing to the Rescue of Your TestsTestCon Europe - Mutation Testing to the Rescue of Your Tests
TestCon Europe - Mutation Testing to the Rescue of Your Tests
 
OSCONF Jaipur - A Hitchhiker's Tour to Containerizing a Java application
OSCONF Jaipur - A Hitchhiker's Tour to Containerizing a Java applicationOSCONF Jaipur - A Hitchhiker's Tour to Containerizing a Java application
OSCONF Jaipur - A Hitchhiker's Tour to Containerizing a Java application
 
GeekcampSG 2020 - A Change-Data-Capture use-case: designing an evergreen cache
GeekcampSG 2020 - A Change-Data-Capture use-case: designing an evergreen cacheGeekcampSG 2020 - A Change-Data-Capture use-case: designing an evergreen cache
GeekcampSG 2020 - A Change-Data-Capture use-case: designing an evergreen cache
 
JavaDay Istanbul - 3 improvements in your microservices architecture
JavaDay Istanbul - 3 improvements in your microservices architectureJavaDay Istanbul - 3 improvements in your microservices architecture
JavaDay Istanbul - 3 improvements in your microservices architecture
 
OSCONF Hyderabad - Shorten all URLs!
OSCONF Hyderabad - Shorten all URLs!OSCONF Hyderabad - Shorten all URLs!
OSCONF Hyderabad - Shorten all URLs!
 
OSCONF Koshi - Zero downtime deployment with Kubernetes, Flyway and Spring Boot
OSCONF Koshi - Zero downtime deployment with Kubernetes, Flyway and Spring BootOSCONF Koshi - Zero downtime deployment with Kubernetes, Flyway and Spring Boot
OSCONF Koshi - Zero downtime deployment with Kubernetes, Flyway and Spring Boot
 
JOnConf - A CDC use-case: designing an Evergreen Cache
JOnConf - A CDC use-case: designing an Evergreen CacheJOnConf - A CDC use-case: designing an Evergreen Cache
JOnConf - A CDC use-case: designing an Evergreen Cache
 
London In-Memory Computing Meetup - A Change-Data-Capture use-case: designing...
London In-Memory Computing Meetup - A Change-Data-Capture use-case: designing...London In-Memory Computing Meetup - A Change-Data-Capture use-case: designing...
London In-Memory Computing Meetup - A Change-Data-Capture use-case: designing...
 
Java.IL - Your own Kubernetes controller, not only in Go!
Java.IL - Your own Kubernetes controller, not only in Go!Java.IL - Your own Kubernetes controller, not only in Go!
Java.IL - Your own Kubernetes controller, not only in Go!
 
London Java Community - An Experiment in Continuous Deployment of JVM applica...
London Java Community - An Experiment in Continuous Deployment of JVM applica...London Java Community - An Experiment in Continuous Deployment of JVM applica...
London Java Community - An Experiment in Continuous Deployment of JVM applica...
 
OSCONF - Your own Kubernetes controller: not only in Go
OSCONF - Your own Kubernetes controller: not only in GoOSCONF - Your own Kubernetes controller: not only in Go
OSCONF - Your own Kubernetes controller: not only in Go
 
vKUG - Migrating Spring Boot apps from annotation-based config to Functional
vKUG - Migrating Spring Boot apps from annotation-based config to FunctionalvKUG - Migrating Spring Boot apps from annotation-based config to Functional
vKUG - Migrating Spring Boot apps from annotation-based config to Functional
 
Tech talks - 3 performance improvements
Tech talks - 3 performance improvementsTech talks - 3 performance improvements
Tech talks - 3 performance improvements
 
AllTheTalks.online - A Streaming Use-Case: And Experiment in Continuous Deplo...
AllTheTalks.online - A Streaming Use-Case: And Experiment in Continuous Deplo...AllTheTalks.online - A Streaming Use-Case: And Experiment in Continuous Deplo...
AllTheTalks.online - A Streaming Use-Case: And Experiment in Continuous Deplo...
 
ING Meetup - Migrating Spring Boot Config Annotations to Functional with Kotlin
ING Meetup - Migrating Spring Boot Config Annotations to Functional with KotlinING Meetup - Migrating Spring Boot Config Annotations to Functional with Kotlin
ING Meetup - Migrating Spring Boot Config Annotations to Functional with Kotlin
 

Recently uploaded

一比一原版(UMN毕业证)明尼苏达大学毕业证如何办理
一比一原版(UMN毕业证)明尼苏达大学毕业证如何办理一比一原版(UMN毕业证)明尼苏达大学毕业证如何办理
一比一原版(UMN毕业证)明尼苏达大学毕业证如何办理
dakas1
 
KuberTENes Birthday Bash Guadalajara - Introducción a Argo CD
KuberTENes Birthday Bash Guadalajara - Introducción a Argo CDKuberTENes Birthday Bash Guadalajara - Introducción a Argo CD
KuberTENes Birthday Bash Guadalajara - Introducción a Argo CD
rodomar2
 
How to write a program in any programming language
How to write a program in any programming languageHow to write a program in any programming language
How to write a program in any programming language
Rakesh Kumar R
 
E-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian Companies
E-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian CompaniesE-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian Companies
E-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian Companies
Quickdice ERP
 
在线购买加拿大英属哥伦比亚大学毕业证本科学位证书原版一模一样
在线购买加拿大英属哥伦比亚大学毕业证本科学位证书原版一模一样在线购买加拿大英属哥伦比亚大学毕业证本科学位证书原版一模一样
在线购买加拿大英属哥伦比亚大学毕业证本科学位证书原版一模一样
mz5nrf0n
 
8 Best Automated Android App Testing Tool and Framework in 2024.pdf
8 Best Automated Android App Testing Tool and Framework in 2024.pdf8 Best Automated Android App Testing Tool and Framework in 2024.pdf
8 Best Automated Android App Testing Tool and Framework in 2024.pdf
kalichargn70th171
 
Oracle Database 19c New Features for DBAs and Developers.pptx
Oracle Database 19c New Features for DBAs and Developers.pptxOracle Database 19c New Features for DBAs and Developers.pptx
Oracle Database 19c New Features for DBAs and Developers.pptx
Remote DBA Services
 
原版定制美国纽约州立大学奥尔巴尼分校毕业证学位证书原版一模一样
原版定制美国纽约州立大学奥尔巴尼分校毕业证学位证书原版一模一样原版定制美国纽约州立大学奥尔巴尼分校毕业证学位证书原版一模一样
原版定制美国纽约州立大学奥尔巴尼分校毕业证学位证书原版一模一样
mz5nrf0n
 
Enums On Steroids - let's look at sealed classes !
Enums On Steroids - let's look at sealed classes !Enums On Steroids - let's look at sealed classes !
Enums On Steroids - let's look at sealed classes !
Marcin Chrost
 
UI5con 2024 - Boost Your Development Experience with UI5 Tooling Extensions
UI5con 2024 - Boost Your Development Experience with UI5 Tooling ExtensionsUI5con 2024 - Boost Your Development Experience with UI5 Tooling Extensions
UI5con 2024 - Boost Your Development Experience with UI5 Tooling Extensions
Peter Muessig
 
What next after learning python programming basics
What next after learning python programming basicsWhat next after learning python programming basics
What next after learning python programming basics
Rakesh Kumar R
 
Using Xen Hypervisor for Functional Safety
Using Xen Hypervisor for Functional SafetyUsing Xen Hypervisor for Functional Safety
Using Xen Hypervisor for Functional Safety
Ayan Halder
 
UI5con 2024 - Keynote: Latest News about UI5 and it’s Ecosystem
UI5con 2024 - Keynote: Latest News about UI5 and it’s EcosystemUI5con 2024 - Keynote: Latest News about UI5 and it’s Ecosystem
UI5con 2024 - Keynote: Latest News about UI5 and it’s Ecosystem
Peter Muessig
 
All you need to know about Spring Boot and GraalVM
All you need to know about Spring Boot and GraalVMAll you need to know about Spring Boot and GraalVM
All you need to know about Spring Boot and GraalVM
Alina Yurenko
 
Fundamentals of Programming and Language Processors
Fundamentals of Programming and Language ProcessorsFundamentals of Programming and Language Processors
Fundamentals of Programming and Language Processors
Rakesh Kumar R
 
Microservice Teams - How the cloud changes the way we work
Microservice Teams - How the cloud changes the way we workMicroservice Teams - How the cloud changes the way we work
Microservice Teams - How the cloud changes the way we work
Sven Peters
 
Everything You Need to Know About X-Sign: The eSign Functionality of XfilesPr...
Everything You Need to Know About X-Sign: The eSign Functionality of XfilesPr...Everything You Need to Know About X-Sign: The eSign Functionality of XfilesPr...
Everything You Need to Know About X-Sign: The eSign Functionality of XfilesPr...
XfilesPro
 
Top Benefits of Using Salesforce Healthcare CRM for Patient Management.pdf
Top Benefits of Using Salesforce Healthcare CRM for Patient Management.pdfTop Benefits of Using Salesforce Healthcare CRM for Patient Management.pdf
Top Benefits of Using Salesforce Healthcare CRM for Patient Management.pdf
VALiNTRY360
 
Transform Your Communication with Cloud-Based IVR Solutions
Transform Your Communication with Cloud-Based IVR SolutionsTransform Your Communication with Cloud-Based IVR Solutions
Transform Your Communication with Cloud-Based IVR Solutions
TheSMSPoint
 
WWDC 2024 Keynote Review: For CocoaCoders Austin
WWDC 2024 Keynote Review: For CocoaCoders AustinWWDC 2024 Keynote Review: For CocoaCoders Austin
WWDC 2024 Keynote Review: For CocoaCoders Austin
Patrick Weigel
 

Recently uploaded (20)

一比一原版(UMN毕业证)明尼苏达大学毕业证如何办理
一比一原版(UMN毕业证)明尼苏达大学毕业证如何办理一比一原版(UMN毕业证)明尼苏达大学毕业证如何办理
一比一原版(UMN毕业证)明尼苏达大学毕业证如何办理
 
KuberTENes Birthday Bash Guadalajara - Introducción a Argo CD
KuberTENes Birthday Bash Guadalajara - Introducción a Argo CDKuberTENes Birthday Bash Guadalajara - Introducción a Argo CD
KuberTENes Birthday Bash Guadalajara - Introducción a Argo CD
 
How to write a program in any programming language
How to write a program in any programming languageHow to write a program in any programming language
How to write a program in any programming language
 
E-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian Companies
E-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian CompaniesE-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian Companies
E-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian Companies
 
在线购买加拿大英属哥伦比亚大学毕业证本科学位证书原版一模一样
在线购买加拿大英属哥伦比亚大学毕业证本科学位证书原版一模一样在线购买加拿大英属哥伦比亚大学毕业证本科学位证书原版一模一样
在线购买加拿大英属哥伦比亚大学毕业证本科学位证书原版一模一样
 
8 Best Automated Android App Testing Tool and Framework in 2024.pdf
8 Best Automated Android App Testing Tool and Framework in 2024.pdf8 Best Automated Android App Testing Tool and Framework in 2024.pdf
8 Best Automated Android App Testing Tool and Framework in 2024.pdf
 
Oracle Database 19c New Features for DBAs and Developers.pptx
Oracle Database 19c New Features for DBAs and Developers.pptxOracle Database 19c New Features for DBAs and Developers.pptx
Oracle Database 19c New Features for DBAs and Developers.pptx
 
原版定制美国纽约州立大学奥尔巴尼分校毕业证学位证书原版一模一样
原版定制美国纽约州立大学奥尔巴尼分校毕业证学位证书原版一模一样原版定制美国纽约州立大学奥尔巴尼分校毕业证学位证书原版一模一样
原版定制美国纽约州立大学奥尔巴尼分校毕业证学位证书原版一模一样
 
Enums On Steroids - let's look at sealed classes !
Enums On Steroids - let's look at sealed classes !Enums On Steroids - let's look at sealed classes !
Enums On Steroids - let's look at sealed classes !
 
UI5con 2024 - Boost Your Development Experience with UI5 Tooling Extensions
UI5con 2024 - Boost Your Development Experience with UI5 Tooling ExtensionsUI5con 2024 - Boost Your Development Experience with UI5 Tooling Extensions
UI5con 2024 - Boost Your Development Experience with UI5 Tooling Extensions
 
What next after learning python programming basics
What next after learning python programming basicsWhat next after learning python programming basics
What next after learning python programming basics
 
Using Xen Hypervisor for Functional Safety
Using Xen Hypervisor for Functional SafetyUsing Xen Hypervisor for Functional Safety
Using Xen Hypervisor for Functional Safety
 
UI5con 2024 - Keynote: Latest News about UI5 and it’s Ecosystem
UI5con 2024 - Keynote: Latest News about UI5 and it’s EcosystemUI5con 2024 - Keynote: Latest News about UI5 and it’s Ecosystem
UI5con 2024 - Keynote: Latest News about UI5 and it’s Ecosystem
 
All you need to know about Spring Boot and GraalVM
All you need to know about Spring Boot and GraalVMAll you need to know about Spring Boot and GraalVM
All you need to know about Spring Boot and GraalVM
 
Fundamentals of Programming and Language Processors
Fundamentals of Programming and Language ProcessorsFundamentals of Programming and Language Processors
Fundamentals of Programming and Language Processors
 
Microservice Teams - How the cloud changes the way we work
Microservice Teams - How the cloud changes the way we workMicroservice Teams - How the cloud changes the way we work
Microservice Teams - How the cloud changes the way we work
 
Everything You Need to Know About X-Sign: The eSign Functionality of XfilesPr...
Everything You Need to Know About X-Sign: The eSign Functionality of XfilesPr...Everything You Need to Know About X-Sign: The eSign Functionality of XfilesPr...
Everything You Need to Know About X-Sign: The eSign Functionality of XfilesPr...
 
Top Benefits of Using Salesforce Healthcare CRM for Patient Management.pdf
Top Benefits of Using Salesforce Healthcare CRM for Patient Management.pdfTop Benefits of Using Salesforce Healthcare CRM for Patient Management.pdf
Top Benefits of Using Salesforce Healthcare CRM for Patient Management.pdf
 
Transform Your Communication with Cloud-Based IVR Solutions
Transform Your Communication with Cloud-Based IVR SolutionsTransform Your Communication with Cloud-Based IVR Solutions
Transform Your Communication with Cloud-Based IVR Solutions
 
WWDC 2024 Keynote Review: For CocoaCoders Austin
WWDC 2024 Keynote Review: For CocoaCoders AustinWWDC 2024 Keynote Review: For CocoaCoders Austin
WWDC 2024 Keynote Review: For CocoaCoders Austin
 

BruJUG - Introduction to data streaming

Editor's Notes

  1. Real-time (latency-sensitive) operations combined with analytics Count usages per CC in last 10 secs, fraud if > 10 Real-time querying Based on analytics, prediction Fraud detection ran overnight has low value Complex event processing Pattern detection (if A and B -> C) SPE runs this at scale Valuable: IOT support. Machine analytics/predictions - fits into AI Without streaming?
  2. SP - make use of multi-processor multi-node runtime, Minimize costs: data shuffling, context switching Jet Does Distributed Parallel Processing 1/ Execution plan 2/ Execute it in parallel How can be computation parallelized? Task parallelism - make use of multiprocessor machines Continuous - run in parallel and exchange data MR - just two steps
  3. SP - make use of multi-processor multi-node runtime, Minimize costs: data shuffling, context switching 1/ Data parallelism - distribute data partitions among available resources DAG deployed to all cluster members = more cores What can be parallelized? Source - if partitioned, can read in parallel Map/filter - can run in parallel Extend the edges Shuffing / moving data is expensive Keeps data local, even reads it locally when co-located with the data source
  4. @startuml class FeedMessage class FeedHeader { gtfs_realtime_version: string timestamp: uint64 } enum Incrementality { FULL_DATASET DIFFERENTIAL } class FeedEntity { id: String is_deleted: boolean } class TripUpdate { timestamp: uint64 delay: int32 } class VehiclePosition { current_stop_sequence: uint32 stop_id: string timestamp: uint64 } enum VehicleStopStatus { INCOMING_AT STOPPED_AT IN_TRANSIT_TO } enum CongestionLevel { UNKNOWN_CONGESTION_LEVEL RUNNING_SMOOTHLY STOP_AND_GO CONGESTION SEVERE_CONGESTION } class Alert enum Cause { UNKNOWN_CAUSE OTHER_CAUSE TECHNICAL_PROBLEM STRIKE DEMONSTRATION ACCIDENT HOLIDAY WEATHER MAINTENANCE CONSTRUCTION POLICE_ACTIVITY MEDICAL_EMERGENCY } enum Effect { NO_SERVICE REDUCED_SERVICE SIGNIFICANT_DELAYS DETOUR ADDITIONAL_SERVICE MODIFIED_SERVICE OTHER_EFFECT UNKNOWN_EFFECT STOP_MOVED } class TimeRange { start: uint64 end: uint64 } class Position { latitude: float longitude: float bearing: float odometer: double speed: float } class TripDescriptor { trip_id: String route_id: String direction_id: uint32 start_time: string start_date: string } class VehicleDescriptor { id: string label: string license_plate: string } class StopTimeUpdate { stop_sequence: uint32 stop_id: string } class StopTimeEvent { delay: uint32 time: int64 uncertainty: int32 } enum ScheduleRelationship { SCHEDULED SKIPPED NO_DATA } class TripDescriptor { trip_id: string route_id: string direction_id: uint32 start_time: string start_date: string } enum ScheduleRelationship2 as "ScheduleRelationship" { SCHEDULED ADDED UNSCHEDULED CANCELED } class EntitySelector { agency_id: string route_id: string route_type: int32 stop_id: string } class Translation { text: string language: string } FeedMessage -up-> "1" FeedHeader: header FeedMessage -down-> "*" FeedEntity: entity FeedHeader -right-> "1" Incrementality FeedEntity --> "0..1" TripUpdate FeedEntity -left-> "0..1" VehiclePosition FeedEntity -right-> "0..1" Alert TripUpdate --> "1" TripDescriptor: trip TripUpdate -left-> "0..1" VehicleDescriptor: vehicle TripUpdate --> "*" StopTimeUpdate StopTimeUpdate -left-> "0..1" StopTimeEvent: arrival StopTimeUpdate -left-> "0..1" StopTimeEvent: departure StopTimeUpdate --> "0..1" ScheduleRelationship TripDescriptor -right-> "0..1" ScheduleRelationship2 VehiclePosition --> "0..1" TripDescriptor: trip VehiclePosition --> "0..1" VehicleDescriptor: vehicle VehiclePosition -left-> "0..1" Position: vehicle VehiclePosition -up-> "0..1" VehicleStopStatus: current_status VehiclePosition -up-> "0..1" CongestionLevel Alert --> "*" TimeRange: active_period Alert --> "1..*" EntitySelector: informed_entity Alert -up-> "0..1" Cause Alert -up-> "0..1" Effect Alert -right-> "0..1" TranslatedString: url Alert -right-> "1" TranslatedString: header_text Alert -right-> "1" TranslatedString: description_text EntitySelector --> "0..1" TripDescriptor: trip TranslatedString --> "1..*" Translation note left of FeedMessage: Root message hide empty members @enduml
  5. node "Hazelcast Jet" as jet { database "Hazelcast IMDG" as imdg artifact "Load reference data Job" as staticjob artifact "Load dynamic data Job" as dynamicjob folder "Reference data files" as refdata { file trips.txt file routes.txt } } component "Reference data loader" <<Loader>> as staticloader component "Dynamic data loader" <<Loader>> as dynamicloader component "Web application" <<Spring Boot>> as webapp cloud { interface "Open Data endpoint" as ws } staticloader --> staticjob: Send job staticjob --> refdata: Read files staticjob --> imdg: Store JSON dynamicloader --> dynamicjob: Send job dynamicjob -right-> ws: Call REST endpoint dynamicjob --> imdg: Store JSON webapp -left-> imdg: Register to changes