SlideShare a Scribd company logo
Fire-fighting Java
big data problems
BIG
DATA
WHAT
THE ?!?!?!!?
WHOOPS
Format
Look at a real-life fire
Understand root cause
Put out the fire
Avoiding the fire
Alex Holmes
@grep_alex
You are skewed
Sprint 14:18 100%
Back The Boss More
My query has been running for
over 4 hours!!!
Our CEO needs this data
right now…
10110101010010
10001001101111
10110101010010
10001001101111
10110101010010
10001001101111
10110101010010
10001001101111
zip code, car status
Distributed FS
Spark
Storage
general-purpose distributed
data processing
distributed storage, e.g.
HDFS, S3, Azure, …
Root cause
94043,speeding
94043,stopped
94103,under_limit
IoT data: zip code, car status
94043,Mountain View
94103,San Francisco
Reference data: zip code, city
94043,under_limit
94043,stopped
94043,speeding
IoT data: zip code, car status
Shuffle (hash join)
Normal distribution
Runtime
Sequential
Parallel
Zip code
Number of cars
Skew
94043,speeding
94043,stopped
94103,under_limit
94043,speeding,Mountain View
94043,stopped,Mountain View
94043,under_limit,Mountain View
94043,stopped,Mountain View
94043,speeding,Mountain View
94103,under_limit,San Francisco
IoT data: zip code, car status
94043,Mountain View
94103,San Francisco
Reference data: zip code, city
94043,under_limit
94043,stopped
94043,speeding
IoT data: zip code, car status
Shuffle (hash join)
Skew
Runtime
Sequential
Parallel
The fix(es)
#1: understand your data
#2: filter-out the skew
#3: only include fields you need in
the result before the join
#4: broadcast join
JVM JVM
94043,Mountain View
94103,San Francisco
94043,Mountain View
94103,San Francisco
94043,Mountain View
94103,San Francisco
1) replicate the small
dataset to each JVM,
and cache in memory
#4: broadcast join
JVM JVM
94043,Mountain View
94103,San Francisco
94043,Mountain View
94103,San Francisco
94043,Mountain View
94103,San Francisco
1) replicate the small
dataset to each JVM,
and cache in memory
94043,speeding
94043,stopped
94103,under_limit
94043,under_limit
94043,stopped
94043,speeding
2) process large data
set and join with small
in-memory data set
#4: broadcast join
this explicitly tells Spark to do a broadcast join . .
. Spark may automatically do this for you if one
dataset is < 10MB
(spark.sql.autoBroadcastJoinThreshold)
How do we know whether Spark is
performing a hash or broadcast join?
#4: not a broadcast join
scala> iotData.join(zipCodes).where($”zip" === $"zip").explain
== Physical Plan ==
*SortMergeJoin [zip2#17], [zip#9], Inner
:- *Sort [zip2#17 ASC NULLS FIRST], false, 0
: +- Exchange hashpartitioning(zip2#17, 200)
#4: broadcast join
scala> iotData.join(broadcast(zipCodes)).where($”zip” ===
$"zip").explain
== Physical Plan ==
*BroadcastHashJoin [zip2#17], [zip#9], Inner, BuildRight
#4: broadcast join
By removing the shuffle, we remove the
impact data skew has on our processing
Only works if one data set fits entirely in
memory
Other advanced options
Filtered broadcast join
Salting + filtered salting
Bin packing
Takeaways
Spark isn’t a database
Check for skew - build join key histograms
Heavily filter + project your data
Avoid or minimize the network
Why can’t you just change?
Sprint 14:18 100%
Back The Boss More
Build me a dashboard to view
the top zip codes in real-time.
Queue
IoT front end
Analytics DB
Web app
Queue
IoT front end
Analytics DB
Web app
what data
format do you
use?
Queue
IoT front end
Analytics DB
Web app
Distributed FS
Schema
{
"type": "record",
"name": "IotData",
"fields" : [
{"name": "zip", “type": "string"},
{"name": "status", "type": "long"},
{"name": "created", "type": “long"}
]
}
Schema update
{
"type": "record",
"name": “IotData",
"fields" : [
{"name": "status", "type": "long"},
{"name": “zip", "type": "string"},
{"name": "created", "type": "long"}
]
}
Swap fields
Sprint 14:18 100%
Back Ops More
The IoT front end
deployment is complete.
Sprint 14:18 100%
Back Ops More
The analytics job is failing with an
java.io.EOFException!
Root cause
Fields are serialized in schema order
4
48
2
52
51
52
57
10
field=“zip”, contents = “94043”
field=“status”, contents = 1L
field=“created”, contents = 2L
bytes
{
"type": "record",
"name": "IotData",
"fields" : [
{"name": “zip", “type": "string"},
{"name": "status", "type": "long"},
{"name": "created", "type": “long"}
]
}
Serialized fields are reordered if you reorder them
in the schema
4
48
2
52
51
52
57
10
field=“zip”, contents = “94043”
field=“status”, contents = 1L
field=“created”, contents = 2L
bytes
52
2
57
10
48
51
4
52
field=“status”, contents = 1L
field=“zip”, contents = “94043”
field=“created”, contents = 2L
bytes
“zip” and “status”
reversed
Original
Avro serialization
Fields written in schema order
Field names aren’t included
Schema typically isn’t included
Avro deserialization
Fields read in schema order
Readers need the writers schema
Writers schema isn’t typically in message
Queue
IoT front end
Analytics DB
Web app
Only the producer is updated
Consumer doesn’t have
the updated schema
The fix
Put out the fire …
Update the consumer process with the new
schema
Skip over records you can’t process, or catch
exception and attempt with old schema
Avoid the fire…
Never change the schema
Include the schema in each message
Schema registry
Use a different data format
Queue
Application
Analytics DB
Confluent
schema registry
tag each message with
a schema version
push message schema
to registry
1
fetch message schema
from registry using
tag
3
deserialize the message using
the schema from the
registry
4
Avro supported evolutions
Deleting optional fields
Re-ordering fields
Changing field names
https://avro.apache.org/docs/1.8.2/
spec.html#Schema+Resolution
It could have been worse…
{
"type": "record",
"name": "IotData",
"fields" : [
{"name": "status", "type": "long"},
{"name": "created", "type": “long"}
]
}
{
"type": "record",
"name": "IotData",
"fields" : [
{"name": "created", "type": “long”},
{"name": “status", "type": "long"}
]
}
the following schema change wouldn’t have
resulted in any runtime errors …
Takeaways
Avro is brittle without a registry
Understand how schema evolution works with your
data format
Have a strategy for schema changes
Enforce schema evolution rules (Confluence schema
registry does this)
Test schema evolution in QA
Stop the firehose!!!
Sprint 14:18 100%
Back The Boss More
Our customers are complaining
that our portal isn’t showing
current data…what’s going on?
Kafka
Spark streaming
Receiver
Spark
batches of data
results pushed to
external system
DB
Root cause
Spike in data
By default, Spark streaming will
perform an unbounded pull from
“current” to “latest”
Which means that it may take a
long time to catch up - if at all
The fix
Put out the fire …
Skip data + restart from latest
Give your application and database more
resources*
Avoid the fire…
Know your spike load and tune your cluster + DB
to handle it
Limit the amount of Kafka data pulled in each
batch (spark.streaming.kafka.maxRatePerPartition)
Automatically skip over spikes
Over-provision or implement a lambda architecture
Takeaways
Measure and alert on lag (wall clock - event
time)
Load-test beyond your expected max rate
before going to production
Have a strategy to handle unexpected spikes
I just wanted to take a look…
Near-real time
processing Cassandra
Web app
I wonder what the
data looks like...
SELECT *
FROM top_zips
LIMIT 1;
Sprint 14:18 100%
Back SRE More
Our production database is
down!!! *^?@!#
Root cause
Node
Node Node
perform a token
scan
Client
Flash back to my 2015 J1 talk…
V VVV V VVVVV V
KKKKKKKKKKK
V VVV V VVVVV V
KKKKKKKKKKK
tombstone markers indicate that the
column has been deleted
deletes in Cassandra
are soft; deleted
columns are marked
with tombstones
these tombstoned
columns slow-down
reads
Node
Read all the data, keeping
deleted data in-memory until
you find the first non-deleted
record
Node
Read all the data, keeping
deleted data in-memory until
you find the first record
OOM!
The fix
Put out the fire …
Bounce Cassandra
Don’t run that query again
Avoid the fire …
Don’t treat Cassandra as a OLAP database
Don’t allow users to run arbitrary queries
Instead use Spark, Splunk, or a relational/
OLAP database
Takeaways
Cassandra isn’t a data warehouse
You will bring it to its knees if you treat it like one
Learn Cassandra schema design patterns to
avoid tombstones impacting your reads when
working with heavy delete workloads
What happened?
What’s worse than the issues we’ve seen?
Not knowing they happened
Or your customer discovering it
before you
Root cause
We don’t think of
failure as a first-class citizen
The fix
Update your SDLC to measure,
test and mitigate failure
Todo In Progress Done
Measure
everything
(latencies,
errors, TPS)
Soak tests
Load testing
beyond
expected
peaks
Define
runbook
Review
runbook +
alerts with
ops/SRE
Verify metrics
are live +
correct
Chaos
monkey
Have a plan for
extended
outages, or huge
data spikes
Takeaways
Measure failure
Anticipate failure
Try and break your systems (before production)
Alert when you fail
Mitigate for it
Learn and improve from it
Thank you!

More Related Content

What's hot

Compiled Websites with Plone, Django, Xapian and SSI
Compiled Websites with Plone, Django, Xapian and SSICompiled Websites with Plone, Django, Xapian and SSI
Compiled Websites with Plone, Django, Xapian and SSI
Wojciech Lichota
 
From DataFrames to Tungsten: A Peek into Spark's Future-(Reynold Xin, Databri...
From DataFrames to Tungsten: A Peek into Spark's Future-(Reynold Xin, Databri...From DataFrames to Tungsten: A Peek into Spark's Future-(Reynold Xin, Databri...
From DataFrames to Tungsten: A Peek into Spark's Future-(Reynold Xin, Databri...
Spark Summit
 
Elasticsearch quick Intro (English)
Elasticsearch quick Intro (English)Elasticsearch quick Intro (English)
Elasticsearch quick Intro (English)
Federico Panini
 
Strata NYC 2015: Sketching Big Data with Spark: randomized algorithms for lar...
Strata NYC 2015: Sketching Big Data with Spark: randomized algorithms for lar...Strata NYC 2015: Sketching Big Data with Spark: randomized algorithms for lar...
Strata NYC 2015: Sketching Big Data with Spark: randomized algorithms for lar...
Databricks
 
Boosting Documents in Solr by Recency, Popularity, and User Preferences
Boosting Documents in Solr by Recency, Popularity, and User PreferencesBoosting Documents in Solr by Recency, Popularity, and User Preferences
Boosting Documents in Solr by Recency, Popularity, and User Preferences
Lucidworks (Archived)
 
Solr and Elasticsearch, a performance study
Solr and Elasticsearch, a performance studySolr and Elasticsearch, a performance study
Solr and Elasticsearch, a performance study
Charlie Hull
 
Data Analysis With Pandas
Data Analysis With PandasData Analysis With Pandas
Data Analysis With Pandas
Stephan Solomonidis
 
Why Your Database Queries Stink -SeaGl.org November 11th, 2016
Why Your Database Queries Stink -SeaGl.org November 11th, 2016Why Your Database Queries Stink -SeaGl.org November 11th, 2016
Why Your Database Queries Stink -SeaGl.org November 11th, 2016
Dave Stokes
 
Side by Side with Elasticsearch & Solr, Part 2
Side by Side with Elasticsearch & Solr, Part 2Side by Side with Elasticsearch & Solr, Part 2
Side by Side with Elasticsearch & Solr, Part 2
Sematext Group, Inc.
 
Patterns of Streaming Applications
Patterns of Streaming ApplicationsPatterns of Streaming Applications
Patterns of Streaming Applications
C4Media
 
Patterns and Operational Insights from the First Users of Delta Lake
Patterns and Operational Insights from the First Users of Delta LakePatterns and Operational Insights from the First Users of Delta Lake
Patterns and Operational Insights from the First Users of Delta Lake
Databricks
 
Spark Cassandra Connector Dataframes
Spark Cassandra Connector DataframesSpark Cassandra Connector Dataframes
Spark Cassandra Connector Dataframes
Russell Spitzer
 
The Lonesome LOD Cloud
The Lonesome LOD CloudThe Lonesome LOD Cloud
The Lonesome LOD Cloud
Ruben Verborgh
 
How to use Parquet as a basis for ETL and analytics
How to use Parquet as a basis for ETL and analyticsHow to use Parquet as a basis for ETL and analytics
How to use Parquet as a basis for ETL and analytics
Julien Le Dem
 
Building a data processing pipeline in Python
Building a data processing pipeline in PythonBuilding a data processing pipeline in Python
Building a data processing pipeline in Python
Joe Cabrera
 
Analyzing Log Data With Apache Spark
Analyzing Log Data With Apache SparkAnalyzing Log Data With Apache Spark
Analyzing Log Data With Apache Spark
Spark Summit
 
OrientDB vs Neo4j - Comparison of query/speed/functionality
OrientDB vs Neo4j - Comparison of query/speed/functionalityOrientDB vs Neo4j - Comparison of query/speed/functionality
OrientDB vs Neo4j - Comparison of query/speed/functionality
Curtis Mosters
 
Benchmarking Solr Performance at Scale
Benchmarking Solr Performance at ScaleBenchmarking Solr Performance at Scale
Benchmarking Solr Performance at Scale
thelabdude
 
New developments in open source ecosystem spark3.0 koalas delta lake
New developments in open source ecosystem spark3.0 koalas delta lakeNew developments in open source ecosystem spark3.0 koalas delta lake
New developments in open source ecosystem spark3.0 koalas delta lake
Xiao Li
 
Building a real time big data analytics platform with solr
Building a real time big data analytics platform with solrBuilding a real time big data analytics platform with solr
Building a real time big data analytics platform with solr
Trey Grainger
 

What's hot (20)

Compiled Websites with Plone, Django, Xapian and SSI
Compiled Websites with Plone, Django, Xapian and SSICompiled Websites with Plone, Django, Xapian and SSI
Compiled Websites with Plone, Django, Xapian and SSI
 
From DataFrames to Tungsten: A Peek into Spark's Future-(Reynold Xin, Databri...
From DataFrames to Tungsten: A Peek into Spark's Future-(Reynold Xin, Databri...From DataFrames to Tungsten: A Peek into Spark's Future-(Reynold Xin, Databri...
From DataFrames to Tungsten: A Peek into Spark's Future-(Reynold Xin, Databri...
 
Elasticsearch quick Intro (English)
Elasticsearch quick Intro (English)Elasticsearch quick Intro (English)
Elasticsearch quick Intro (English)
 
Strata NYC 2015: Sketching Big Data with Spark: randomized algorithms for lar...
Strata NYC 2015: Sketching Big Data with Spark: randomized algorithms for lar...Strata NYC 2015: Sketching Big Data with Spark: randomized algorithms for lar...
Strata NYC 2015: Sketching Big Data with Spark: randomized algorithms for lar...
 
Boosting Documents in Solr by Recency, Popularity, and User Preferences
Boosting Documents in Solr by Recency, Popularity, and User PreferencesBoosting Documents in Solr by Recency, Popularity, and User Preferences
Boosting Documents in Solr by Recency, Popularity, and User Preferences
 
Solr and Elasticsearch, a performance study
Solr and Elasticsearch, a performance studySolr and Elasticsearch, a performance study
Solr and Elasticsearch, a performance study
 
Data Analysis With Pandas
Data Analysis With PandasData Analysis With Pandas
Data Analysis With Pandas
 
Why Your Database Queries Stink -SeaGl.org November 11th, 2016
Why Your Database Queries Stink -SeaGl.org November 11th, 2016Why Your Database Queries Stink -SeaGl.org November 11th, 2016
Why Your Database Queries Stink -SeaGl.org November 11th, 2016
 
Side by Side with Elasticsearch & Solr, Part 2
Side by Side with Elasticsearch & Solr, Part 2Side by Side with Elasticsearch & Solr, Part 2
Side by Side with Elasticsearch & Solr, Part 2
 
Patterns of Streaming Applications
Patterns of Streaming ApplicationsPatterns of Streaming Applications
Patterns of Streaming Applications
 
Patterns and Operational Insights from the First Users of Delta Lake
Patterns and Operational Insights from the First Users of Delta LakePatterns and Operational Insights from the First Users of Delta Lake
Patterns and Operational Insights from the First Users of Delta Lake
 
Spark Cassandra Connector Dataframes
Spark Cassandra Connector DataframesSpark Cassandra Connector Dataframes
Spark Cassandra Connector Dataframes
 
The Lonesome LOD Cloud
The Lonesome LOD CloudThe Lonesome LOD Cloud
The Lonesome LOD Cloud
 
How to use Parquet as a basis for ETL and analytics
How to use Parquet as a basis for ETL and analyticsHow to use Parquet as a basis for ETL and analytics
How to use Parquet as a basis for ETL and analytics
 
Building a data processing pipeline in Python
Building a data processing pipeline in PythonBuilding a data processing pipeline in Python
Building a data processing pipeline in Python
 
Analyzing Log Data With Apache Spark
Analyzing Log Data With Apache SparkAnalyzing Log Data With Apache Spark
Analyzing Log Data With Apache Spark
 
OrientDB vs Neo4j - Comparison of query/speed/functionality
OrientDB vs Neo4j - Comparison of query/speed/functionalityOrientDB vs Neo4j - Comparison of query/speed/functionality
OrientDB vs Neo4j - Comparison of query/speed/functionality
 
Benchmarking Solr Performance at Scale
Benchmarking Solr Performance at ScaleBenchmarking Solr Performance at Scale
Benchmarking Solr Performance at Scale
 
New developments in open source ecosystem spark3.0 koalas delta lake
New developments in open source ecosystem spark3.0 koalas delta lakeNew developments in open source ecosystem spark3.0 koalas delta lake
New developments in open source ecosystem spark3.0 koalas delta lake
 
Building a real time big data analytics platform with solr
Building a real time big data analytics platform with solrBuilding a real time big data analytics platform with solr
Building a real time big data analytics platform with solr
 

Similar to Fire-fighting java big data problems

Getting started with Splunk Breakout Session
Getting started with Splunk Breakout SessionGetting started with Splunk Breakout Session
Getting started with Splunk Breakout Session
Splunk
 
Splunk app for stream
Splunk app for stream Splunk app for stream
Splunk app for stream csching
 
Open Security Operations Center - OpenSOC
Open Security Operations Center - OpenSOCOpen Security Operations Center - OpenSOC
Open Security Operations Center - OpenSOC
Sheetal Dolas
 
Spark Based Distributed Deep Learning Framework For Big Data Applications
Spark Based Distributed Deep Learning Framework For Big Data Applications Spark Based Distributed Deep Learning Framework For Big Data Applications
Spark Based Distributed Deep Learning Framework For Big Data Applications
Humoyun Ahmedov
 
Berlin Buzz Words - Apache Drill by Ted Dunning & Michael Hausenblas
Berlin Buzz Words - Apache Drill by Ted Dunning & Michael HausenblasBerlin Buzz Words - Apache Drill by Ted Dunning & Michael Hausenblas
Berlin Buzz Words - Apache Drill by Ted Dunning & Michael Hausenblas
MapR Technologies
 
Making Apache Spark Better with Delta Lake
Making Apache Spark Better with Delta LakeMaking Apache Spark Better with Delta Lake
Making Apache Spark Better with Delta Lake
Databricks
 
Data Streaming Technology Overview
Data Streaming Technology OverviewData Streaming Technology Overview
Data Streaming Technology Overview
Dan Lynn
 
Getting started with Splunk - Break out Session
Getting started with Splunk - Break out SessionGetting started with Splunk - Break out Session
Getting started with Splunk - Break out Session
Georg Knon
 
Getting started with Splunk
Getting started with SplunkGetting started with Splunk
Getting started with Splunk
Splunk
 
Writing Continuous Applications with Structured Streaming in PySpark
Writing Continuous Applications with Structured Streaming in PySparkWriting Continuous Applications with Structured Streaming in PySpark
Writing Continuous Applications with Structured Streaming in PySpark
Databricks
 
A Deep Dive into Structured Streaming: Apache Spark Meetup at Bloomberg 2016
A Deep Dive into Structured Streaming:  Apache Spark Meetup at Bloomberg 2016 A Deep Dive into Structured Streaming:  Apache Spark Meetup at Bloomberg 2016
A Deep Dive into Structured Streaming: Apache Spark Meetup at Bloomberg 2016
Databricks
 
Data encoding and Metadata for Streams
Data encoding and Metadata for StreamsData encoding and Metadata for Streams
Data encoding and Metadata for Streams
univalence
 
SplunkLive! Frankfurt 2018 - Data Onboarding Overview
SplunkLive! Frankfurt 2018 - Data Onboarding OverviewSplunkLive! Frankfurt 2018 - Data Onboarding Overview
SplunkLive! Frankfurt 2018 - Data Onboarding Overview
Splunk
 
SplunkLive! - Getting started with Splunk
SplunkLive! - Getting started with SplunkSplunkLive! - Getting started with Splunk
SplunkLive! - Getting started with Splunk
Splunk
 
ETL 2.0 Data Engineering for developers
ETL 2.0 Data Engineering for developersETL 2.0 Data Engineering for developers
ETL 2.0 Data Engineering for developers
Microsoft Tech Community
 
DBA Fundamentals Group: Continuous SQL with Kafka and Flink
DBA Fundamentals Group: Continuous SQL with Kafka and FlinkDBA Fundamentals Group: Continuous SQL with Kafka and Flink
DBA Fundamentals Group: Continuous SQL with Kafka and Flink
Timothy Spann
 
SplunkLive! Munich 2018: Data Onboarding Overview
SplunkLive! Munich 2018: Data Onboarding OverviewSplunkLive! Munich 2018: Data Onboarding Overview
SplunkLive! Munich 2018: Data Onboarding Overview
Splunk
 
Learning the basics of Apache NiFi for iot OSS Europe 2020
Learning the basics of Apache NiFi for iot OSS Europe 2020Learning the basics of Apache NiFi for iot OSS Europe 2020
Learning the basics of Apache NiFi for iot OSS Europe 2020
Timothy Spann
 
Deep Dive into Stateful Stream Processing in Structured Streaming with Tathag...
Deep Dive into Stateful Stream Processing in Structured Streaming with Tathag...Deep Dive into Stateful Stream Processing in Structured Streaming with Tathag...
Deep Dive into Stateful Stream Processing in Structured Streaming with Tathag...
Databricks
 

Similar to Fire-fighting java big data problems (20)

Getting started with Splunk Breakout Session
Getting started with Splunk Breakout SessionGetting started with Splunk Breakout Session
Getting started with Splunk Breakout Session
 
Splunk app for stream
Splunk app for stream Splunk app for stream
Splunk app for stream
 
Open Security Operations Center - OpenSOC
Open Security Operations Center - OpenSOCOpen Security Operations Center - OpenSOC
Open Security Operations Center - OpenSOC
 
Spark Based Distributed Deep Learning Framework For Big Data Applications
Spark Based Distributed Deep Learning Framework For Big Data Applications Spark Based Distributed Deep Learning Framework For Big Data Applications
Spark Based Distributed Deep Learning Framework For Big Data Applications
 
Berlin Buzz Words - Apache Drill by Ted Dunning & Michael Hausenblas
Berlin Buzz Words - Apache Drill by Ted Dunning & Michael HausenblasBerlin Buzz Words - Apache Drill by Ted Dunning & Michael Hausenblas
Berlin Buzz Words - Apache Drill by Ted Dunning & Michael Hausenblas
 
Making Apache Spark Better with Delta Lake
Making Apache Spark Better with Delta LakeMaking Apache Spark Better with Delta Lake
Making Apache Spark Better with Delta Lake
 
Data Streaming Technology Overview
Data Streaming Technology OverviewData Streaming Technology Overview
Data Streaming Technology Overview
 
Getting started with Splunk - Break out Session
Getting started with Splunk - Break out SessionGetting started with Splunk - Break out Session
Getting started with Splunk - Break out Session
 
Getting started with Splunk
Getting started with SplunkGetting started with Splunk
Getting started with Splunk
 
Writing Continuous Applications with Structured Streaming in PySpark
Writing Continuous Applications with Structured Streaming in PySparkWriting Continuous Applications with Structured Streaming in PySpark
Writing Continuous Applications with Structured Streaming in PySpark
 
A Deep Dive into Structured Streaming: Apache Spark Meetup at Bloomberg 2016
A Deep Dive into Structured Streaming:  Apache Spark Meetup at Bloomberg 2016 A Deep Dive into Structured Streaming:  Apache Spark Meetup at Bloomberg 2016
A Deep Dive into Structured Streaming: Apache Spark Meetup at Bloomberg 2016
 
Data encoding and Metadata for Streams
Data encoding and Metadata for StreamsData encoding and Metadata for Streams
Data encoding and Metadata for Streams
 
SplunkLive! Frankfurt 2018 - Data Onboarding Overview
SplunkLive! Frankfurt 2018 - Data Onboarding OverviewSplunkLive! Frankfurt 2018 - Data Onboarding Overview
SplunkLive! Frankfurt 2018 - Data Onboarding Overview
 
SplunkLive! - Getting started with Splunk
SplunkLive! - Getting started with SplunkSplunkLive! - Getting started with Splunk
SplunkLive! - Getting started with Splunk
 
ETL 2.0 Data Engineering for developers
ETL 2.0 Data Engineering for developersETL 2.0 Data Engineering for developers
ETL 2.0 Data Engineering for developers
 
DBA Fundamentals Group: Continuous SQL with Kafka and Flink
DBA Fundamentals Group: Continuous SQL with Kafka and FlinkDBA Fundamentals Group: Continuous SQL with Kafka and Flink
DBA Fundamentals Group: Continuous SQL with Kafka and Flink
 
SplunkLive! Munich 2018: Data Onboarding Overview
SplunkLive! Munich 2018: Data Onboarding OverviewSplunkLive! Munich 2018: Data Onboarding Overview
SplunkLive! Munich 2018: Data Onboarding Overview
 
Learning the basics of Apache NiFi for iot OSS Europe 2020
Learning the basics of Apache NiFi for iot OSS Europe 2020Learning the basics of Apache NiFi for iot OSS Europe 2020
Learning the basics of Apache NiFi for iot OSS Europe 2020
 
Deep Dive into Stateful Stream Processing in Structured Streaming with Tathag...
Deep Dive into Stateful Stream Processing in Structured Streaming with Tathag...Deep Dive into Stateful Stream Processing in Structured Streaming with Tathag...
Deep Dive into Stateful Stream Processing in Structured Streaming with Tathag...
 
My Master's Thesis
My Master's ThesisMy Master's Thesis
My Master's Thesis
 

Recently uploaded

Advanced Flow Concepts Every Developer Should Know
Advanced Flow Concepts Every Developer Should KnowAdvanced Flow Concepts Every Developer Should Know
Advanced Flow Concepts Every Developer Should Know
Peter Caitens
 
Webinar: Salesforce Document Management 2.0 - Smarter, Faster, Better
Webinar: Salesforce Document Management 2.0 - Smarter, Faster, BetterWebinar: Salesforce Document Management 2.0 - Smarter, Faster, Better
Webinar: Salesforce Document Management 2.0 - Smarter, Faster, Better
XfilesPro
 
How Recreation Management Software Can Streamline Your Operations.pptx
How Recreation Management Software Can Streamline Your Operations.pptxHow Recreation Management Software Can Streamline Your Operations.pptx
How Recreation Management Software Can Streamline Your Operations.pptx
wottaspaceseo
 
Vitthal Shirke Microservices Resume Montevideo
Vitthal Shirke Microservices Resume MontevideoVitthal Shirke Microservices Resume Montevideo
Vitthal Shirke Microservices Resume Montevideo
Vitthal Shirke
 
Explore Modern SharePoint Templates for 2024
Explore Modern SharePoint Templates for 2024Explore Modern SharePoint Templates for 2024
Explore Modern SharePoint Templates for 2024
Sharepoint Designs
 
Globus Compute Introduction - GlobusWorld 2024
Globus Compute Introduction - GlobusWorld 2024Globus Compute Introduction - GlobusWorld 2024
Globus Compute Introduction - GlobusWorld 2024
Globus
 
Developing Distributed High-performance Computing Capabilities of an Open Sci...
Developing Distributed High-performance Computing Capabilities of an Open Sci...Developing Distributed High-performance Computing Capabilities of an Open Sci...
Developing Distributed High-performance Computing Capabilities of an Open Sci...
Globus
 
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
Juraj Vysvader
 
Lecture 1 Introduction to games development
Lecture 1 Introduction to games developmentLecture 1 Introduction to games development
Lecture 1 Introduction to games development
abdulrafaychaudhry
 
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoamOpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
takuyayamamoto1800
 
Corporate Management | Session 3 of 3 | Tendenci AMS
Corporate Management | Session 3 of 3 | Tendenci AMSCorporate Management | Session 3 of 3 | Tendenci AMS
Corporate Management | Session 3 of 3 | Tendenci AMS
Tendenci - The Open Source AMS (Association Management Software)
 
top nidhi software solution freedownload
top nidhi software solution freedownloadtop nidhi software solution freedownload
top nidhi software solution freedownload
vrstrong314
 
BoxLang: Review our Visionary Licenses of 2024
BoxLang: Review our Visionary Licenses of 2024BoxLang: Review our Visionary Licenses of 2024
BoxLang: Review our Visionary Licenses of 2024
Ortus Solutions, Corp
 
Large Language Models and the End of Programming
Large Language Models and the End of ProgrammingLarge Language Models and the End of Programming
Large Language Models and the End of Programming
Matt Welsh
 
Cracking the code review at SpringIO 2024
Cracking the code review at SpringIO 2024Cracking the code review at SpringIO 2024
Cracking the code review at SpringIO 2024
Paco van Beckhoven
 
Strategies for Successful Data Migration Tools.pptx
Strategies for Successful Data Migration Tools.pptxStrategies for Successful Data Migration Tools.pptx
Strategies for Successful Data Migration Tools.pptx
varshanayak241
 
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.ILBeyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
Natan Silnitsky
 
Providing Globus Services to Users of JASMIN for Environmental Data Analysis
Providing Globus Services to Users of JASMIN for Environmental Data AnalysisProviding Globus Services to Users of JASMIN for Environmental Data Analysis
Providing Globus Services to Users of JASMIN for Environmental Data Analysis
Globus
 
SOCRadar Research Team: Latest Activities of IntelBroker
SOCRadar Research Team: Latest Activities of IntelBrokerSOCRadar Research Team: Latest Activities of IntelBroker
SOCRadar Research Team: Latest Activities of IntelBroker
SOCRadar
 
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital TransformationWSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2
 

Recently uploaded (20)

Advanced Flow Concepts Every Developer Should Know
Advanced Flow Concepts Every Developer Should KnowAdvanced Flow Concepts Every Developer Should Know
Advanced Flow Concepts Every Developer Should Know
 
Webinar: Salesforce Document Management 2.0 - Smarter, Faster, Better
Webinar: Salesforce Document Management 2.0 - Smarter, Faster, BetterWebinar: Salesforce Document Management 2.0 - Smarter, Faster, Better
Webinar: Salesforce Document Management 2.0 - Smarter, Faster, Better
 
How Recreation Management Software Can Streamline Your Operations.pptx
How Recreation Management Software Can Streamline Your Operations.pptxHow Recreation Management Software Can Streamline Your Operations.pptx
How Recreation Management Software Can Streamline Your Operations.pptx
 
Vitthal Shirke Microservices Resume Montevideo
Vitthal Shirke Microservices Resume MontevideoVitthal Shirke Microservices Resume Montevideo
Vitthal Shirke Microservices Resume Montevideo
 
Explore Modern SharePoint Templates for 2024
Explore Modern SharePoint Templates for 2024Explore Modern SharePoint Templates for 2024
Explore Modern SharePoint Templates for 2024
 
Globus Compute Introduction - GlobusWorld 2024
Globus Compute Introduction - GlobusWorld 2024Globus Compute Introduction - GlobusWorld 2024
Globus Compute Introduction - GlobusWorld 2024
 
Developing Distributed High-performance Computing Capabilities of an Open Sci...
Developing Distributed High-performance Computing Capabilities of an Open Sci...Developing Distributed High-performance Computing Capabilities of an Open Sci...
Developing Distributed High-performance Computing Capabilities of an Open Sci...
 
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
 
Lecture 1 Introduction to games development
Lecture 1 Introduction to games developmentLecture 1 Introduction to games development
Lecture 1 Introduction to games development
 
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoamOpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
 
Corporate Management | Session 3 of 3 | Tendenci AMS
Corporate Management | Session 3 of 3 | Tendenci AMSCorporate Management | Session 3 of 3 | Tendenci AMS
Corporate Management | Session 3 of 3 | Tendenci AMS
 
top nidhi software solution freedownload
top nidhi software solution freedownloadtop nidhi software solution freedownload
top nidhi software solution freedownload
 
BoxLang: Review our Visionary Licenses of 2024
BoxLang: Review our Visionary Licenses of 2024BoxLang: Review our Visionary Licenses of 2024
BoxLang: Review our Visionary Licenses of 2024
 
Large Language Models and the End of Programming
Large Language Models and the End of ProgrammingLarge Language Models and the End of Programming
Large Language Models and the End of Programming
 
Cracking the code review at SpringIO 2024
Cracking the code review at SpringIO 2024Cracking the code review at SpringIO 2024
Cracking the code review at SpringIO 2024
 
Strategies for Successful Data Migration Tools.pptx
Strategies for Successful Data Migration Tools.pptxStrategies for Successful Data Migration Tools.pptx
Strategies for Successful Data Migration Tools.pptx
 
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.ILBeyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
 
Providing Globus Services to Users of JASMIN for Environmental Data Analysis
Providing Globus Services to Users of JASMIN for Environmental Data AnalysisProviding Globus Services to Users of JASMIN for Environmental Data Analysis
Providing Globus Services to Users of JASMIN for Environmental Data Analysis
 
SOCRadar Research Team: Latest Activities of IntelBroker
SOCRadar Research Team: Latest Activities of IntelBrokerSOCRadar Research Team: Latest Activities of IntelBroker
SOCRadar Research Team: Latest Activities of IntelBroker
 
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital TransformationWSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
 

Fire-fighting java big data problems