SlideShare a Scribd company logo
Journeys from Kafka to
Parquet
DataWorks Summit, Barcelona 2019
21/03/2019
Largest e-commerce
platform in NL and BE
Gábor Hermann
• data engineer @ bol.com
• Previously at research institute
• MTA SZTAKI, Budapest
• Now measuring, recommendations
This talk
• The problem: sinking data from stream to files
• Solution 1. windowing
• Solution 2. bucketing sink
• Solution 3. closing files at checkpoints
• Solution 4. daily/hourly batch job
• Conclusion
Problem
click data stream
(Kafka)
Problem
click data stream
(Kafka)
10,000 event/sec
Problem
HDFS
click data stream
(Kafka)
10,000 event/sec
(immutable)
files
Problem
HDFS
click data stream
(Kafka)
10,000 event/sec
(immutable)
files
?
Batch vs streaming
Batch vs streaming
batch
Batch vs streaming
batch streaming
Problem
Problem
Expectation
Problem
Expectation Reality
Requirements
Requirements
• Scalable solution
• 10,000 message/sec
Requirements
• Scalable solution
• 10,000 message/sec
• Exactly-once
• Data consumers should not deduplicate
!"DISTINCT
Requirements
• Scalable solution
• 10,000 message/sec
• Exactly-once
• Data consumers should not deduplicate
• Files in event-time
• Consumers should not worry about late events
!"late events
Requirements
• Scalable solution
• 10,000 message/sec
• Exactly-once
• Data consumers should not deduplicate
• Files in event-time
• Consumers should not worry about late events
• Columnar format (Parquet)
• Optimize for reading, not for writing
!"
slow loading
Columnar format (Parquet)
CUSTOMER ID VISITED
PRODUCT
PRODUCT
CATEGORY
45182584 370000004333 Books
45182584 300000053536 Games
11538222 857334358658 Electronics
79245368 370000004333 Books
11538222 370000004333 Books
11538222 942000033234 Electronics
78438133 370000004333 Books
Columnar format (Parquet)
CUSTOMER ID VISITED
PRODUCT
PRODUCT
CATEGORY
45182584 370000004333 Books
45182584 300000053536 Games
11538222 857334358658 Electronics
79245368 370000004333 Books
11538222 370000004333 Books
11538222 942000033234 Electronics
78438133 370000004333 Books
Row-oriented
Columnar format (Parquet)
CUSTOMER ID VISITED
PRODUCT
PRODUCT
CATEGORY
45182584 370000004333 Books
45182584 300000053536 Games
11538222 857334358658 Electronics
79245368 370000004333 Books
11538222 370000004333 Books
11538222 942000033234 Electronics
78438133 370000004333 Books
Row-oriented
Columnar format (Parquet)
CUSTOMER ID VISITED
PRODUCT
PRODUCT
CATEGORY
45182584 370000004333 Books
45182584 300000053536 Games
11538222 857334358658 Electronics
79245368 370000004333 Books
11538222 370000004333 Books
11538222 942000033234 Electronics
78438133 370000004333 Books
Row-oriented Column-oriented
CUSTOMER ID VISITED
PRODUCT
PRODUCT
CATEGORY
45182584 370000004333 Books
45182584 300000053536 Games
11538222 857334358658 Electronics
79245368 370000004333 Books
11538222 370000004333 Books
11538222 942000033234 Electronics
78438133 370000004333 Books
Columnar format (Parquet)
CUSTOMER ID VISITED
PRODUCT
PRODUCT
CATEGORY
45182584 370000004333 Books
45182584 300000053536 Games
11538222 857334358658 Electronics
79245368 370000004333 Books
11538222 370000004333 Books
11538222 942000033234 Electronics
78438133 370000004333 Books
CUSTOMER ID VISITED
PRODUCT
PRODUCT
CATEGORY
45182584 370000004333 Books
45182584 300000053536 Games
11538222 857334358658 Electronics
79245368 370000004333 Books
11538222 370000004333 Books
11538222 942000033234 Electronics
78438133 370000004333 Books
Row-oriented Column-oriented
Columnar format (Parquet)
CUSTOMER ID VISITED
PRODUCT
PRODUCT
CATEGORY
45182584 370000004333 Books
45182584 300000053536 Games
11538222 857334358658 Electronics
79245368 370000004333 Books
11538222 370000004333 Books
11538222 942000033234 Electronics
78438133 370000004333 Books
Row-oriented
!"slow loading
Columnar format (Parquet)
CUSTOMER ID VISITED
PRODUCT
PRODUCT
CATEGORY
45182584 370000004333 Books
45182584 300000053536 Games
11538222 857334358658 Electronics
79245368 370000004333 Books
11538222 370000004333 Books
11538222 942000033234 Electronics
78438133 370000004333 Books
CUSTOMER ID VISITED
PRODUCT
PRODUCT
CATEGORY
45182584 370000004333 Books
45182584 300000053536 Games
11538222 857334358658 Electronics
79245368 370000004333 Books
11538222 370000004333 Books
11538222 942000033234 Electronics
78438133 370000004333 Books
Row-oriented Column-oriented
Columnar format (Parquet)
CUSTOMER ID VISITED
PRODUCT
PRODUCT
CATEGORY
45182584 370000004333 Books
45182584 300000053536 Games
11538222 857334358658 Electronics
79245368 370000004333 Books
11538222 370000004333 Books
11538222 942000033234 Electronics
78438133 370000004333 Books
CUSTOMER ID VISITED
PRODUCT
PRODUCT
CATEGORY
45182584 370000004333 Books
45182584 300000053536 Games
11538222 857334358658 Electronics
79245368 370000004333 Books
11538222 370000004333 Books
11538222 942000033234 Electronics
78438133 370000004333 Books
Row-oriented Column-oriented
Columnar format (Parquet)
CUSTOMER ID VISITED
PRODUCT
PRODUCT
CATEGORY
45182584 370000004333 Books
45182584 300000053536 Games
11538222 857334358658 Electronics
79245368 370000004333 Books
11538222 370000004333 Books
11538222 942000033234 Electronics
78438133 370000004333 Books
Column-oriented
!"❤
fast loading
Requirements
• Scalable solution
• 10 000 message/sec
• Exactly-once
• Data consumers should not deduplicate
• Files in event-time
• Consumers should not worry about late events
• Columnar format (Parquet)
• Optimize for reading, not for writing
Requirements
• Scalable solution
• 10 000 message/sec
• Exactly-once
• Data consumers should not deduplicate
• Files in event-time
• Consumers should not worry about late events
• Columnar format (Parquet)
• Optimize for reading, not for writing
Apache Flink?
Requirements
• Scalable solution ✅
• 10 000 message/sec
• Exactly-once ✅
• Data consumers should not deduplicate
• Files in event-time ✅
• Consumers should not worry about late events
• Columnar format (Parquet) ✅
• Optimize for reading, not for writing
Apache Flink?
Problem
HDFS
click data stream
(Kafka)
10,000 event/sec
(immutable)
files
?
Problem
HDFS
click data stream
(Kafka)
10,000 event/sec
(immutable)
files
Solution 1.
windowing
Windowing
Kafka Flink HDFS
■ 17-00.parquet18:00-18:59
19:00-19:59
■ 17-00.parquet18:00-18:59
19:00-19:59
Windowing
Kafka Flink HDFS
■ 17-00.parquet
■ 18-00.parquet
19:00-19:59
18:00-18:59
Windowing
Kafka Flink HDFS
Handling failures?
Handling failures?
• Out-of-the-box!
Handling failures?
• Out-of-the-box!
• But…
Handling failures?
• Out-of-the-box!
• But…
• Too much memory
Handling failures?
• Out-of-the-box!
• But…
• Too much memory
Kafka Flink HDFS
■ 16-00.parquet18:00-19:59
Handling failures?
• Out-of-the-box!
• But…
• Too much memory
Kafka Flink HDFS
■ 16-00.parquet18:00-19:59
Handling failures?
• Out-of-the-box!
• But…
• Too much memory
Kafka Flink HDFS
■ 16-00.parquet18:00-19:59
OUT OF MEMORY
Solution 2.
bucketing sink
Bucketing sink
• Writes data to file “buckets” based on time
Bucketing sink
• Writes data to file “buckets” based on time
Kafka Flink HDFS
■ 17-00.parquet
■ 18-00.parquet
■ 19-00.parquet
18:00-18:59
19:00-19:59
Handling failures?
Kafka Flink HDFS
■ 17-00.parquet
■ 18-00.parquet
■ 19-00.parquet
18:00-18:59
19:00-19:59
Step 1
Handling failures?
Kafka Flink HDFS
■ 17-00.parquet
■ 18-00.parquet
■ 19-00.parquet
18:00-18:59
19:00-19:59
Step 2 (checkpoint)
■ 17-00.parquet
■ 18-00.parquet
■ 19-00.parquet
18:00-18:59
19:00-19:59
Step 3 (failure)
Handling failures?
Kafka Flink HDFS
■ 17-00.parquet
■ 18-00.parquet
■ 19-00.parquet
18:00-18:59
19:00-19:59
Step 3 (failure)
Handling failures?
Kafka Flink HDFS
■ 17-00.parquet
■ 18-00.parquet
■ 19-00.parquet
18:00-18:59
19:00-19:59
Step 4 (recovery)
Handling failures?
Kafka Flink HDFS
But…
• Truncating is not supported on HDFS
• Already fixed (Flink 1.6.0): copy instead of truncate
• Cannot flush Parquet files
Solution 3.
closing files at checkpoint
■ 17-00
■ 18-00/
■ part-0.parquet
■ 19-00/
■ part-0.parquet
18:00-18:59
19:00-19:59
Step 1
Closing files at checkpoints
Kafka Flink HDFS
Closing files at checkpoints
Kafka Flink HDFS
18:00-18:59
19:00-19:59
■ 17-00
■ 18-00/
■ part-0.parquet
■ part-1.parquet
■ 19-00/
■ part-0.parquet
■ part-1.parquet
Step 2 (checkpoint)
18:00-18:59
19:00-19:59
■ 17-00
■ 18-00/
■ part-0.parquet
■ part-1.parquet
■ 19-00/
■ part-0.parquet
■ part-1.parquet
Step 3 (failure)
Closing files at checkpoints
Kafka Flink HDFS
18:00-18:59
19:00-19:59
■ 17-00
■ 18-00/
■ part-0.parquet
■ part-1.parquet
■ 19-00/
■ part-0.parquet
■ part-1.parquet
Step 3 (failure)
Closing files at checkpoints
Kafka Flink HDFS
18:00-18:59
19:00-19:59
■ 17-00
■ 18-00/
■ part-0.parquet
■ part-1.parquet
■ 19-00/
■ part-0.parquet
■ part-1.parquet
Step 4 (recovery)
Closing files at checkpoints
Kafka Flink HDFS
But…
• We need to change Flink bucketing sink code
But…
• We need to change Flink bucketing sink code
• Was also fixed in 1.6.0: StreamingFileSink can close files on checkpoints
• Kudos to Flink community!
But…
• We need to change Flink bucketing sink code
• Was also fixed in 1.6.0: StreamingFileSink can close files on checkpoints
• Kudos to Flink community!
• A lot of files
• Small files on HDFS is bad
Streaming fault-tolerance
WHY YOU SO COMPLICATED?
Solution 4.
daily/hourly batch job
■ 17-00.parquet
■ 18-00.parquet
18:00-18:59
Hourly batch job
Kafka Flink HDFS
■ 17-00.parquet
■ 18-00.parquet
18:00-18:59
Hourly batch job
Kafka Flink HDFS
Hourly batch job
Kafka Flink HDFS
■ 17-00.parquet
■ 18-00.parquet
18:00-18:59
■ 17-00.parquet
■ 18-00.parquet
■ 19-00.parquet
19:00-19:59
Hourly batch job
Kafka Flink HDFS
19:00-19:59 ■ 17-00.parquet
■ 18-00.parquet
■ 19-00.parquet
Hourly batch job
Kafka Flink HDFS
19:00-19:59 ■ 17-00.parquet
■ 18-00.parquet
■ 19-00.parquet
Hourly batch job
Kafka Flink HDFS
Sticking to batch processing…
It’s not perfect
• Reprocessing data
It’s not perfect
• Reprocessing data
• Does not support “semi-real-time”
It’s not perfect
• Reprocessing data
• Does not support “semi-real-time”
• Optimization
• Kafka time-based indexing
Conclusion
Proper solution?
• Use a database instead of files, or
• Use a different tool (e.g. Kafka Streams), or
• Write small files and merge them in the end, or
• Skip late events
• e.g. accept 5 minutes late, but not 12 hours
Support real-time?
• Kappa-architecture
• Streaming-only
• Lambda-architecture
• Batch system + streaming system
• Late events in daily batches + 5-minute files dropping late events
Streaming is not trivial
Streaming is not trivial (yet)
Streaming is not trivial (yet)
• Keep it simple!
• Understand the system!
?Gábor Hermann
ghermann@bol.com
@GbrHrmnn
Blogpost
https://tinyurl.com/kafka-to-parquet

More Related Content

What's hot

Handle Large Messages In Apache Kafka
Handle Large Messages In Apache KafkaHandle Large Messages In Apache Kafka
Handle Large Messages In Apache Kafka
Jiangjie Qin
 
Lambda Architecture with Spark, Spark Streaming, Kafka, Cassandra, Akka and S...
Lambda Architecture with Spark, Spark Streaming, Kafka, Cassandra, Akka and S...Lambda Architecture with Spark, Spark Streaming, Kafka, Cassandra, Akka and S...
Lambda Architecture with Spark, Spark Streaming, Kafka, Cassandra, Akka and S...
Helena Edelson
 
Apache Flink Adoption @ Shopify
Apache Flink Adoption @ ShopifyApache Flink Adoption @ Shopify
Apache Flink Adoption @ Shopify
KevinLam737856
 
REST API Pentester's perspective
REST API Pentester's perspectiveREST API Pentester's perspective
REST API Pentester's perspective
SecuRing
 
ELK introduction
ELK introductionELK introduction
ELK introduction
Waldemar Neto
 
Lessons Learned Developing and Managing Massive (300TB+) Apache Spark Pipelin...
Lessons Learned Developing and Managing Massive (300TB+) Apache Spark Pipelin...Lessons Learned Developing and Managing Massive (300TB+) Apache Spark Pipelin...
Lessons Learned Developing and Managing Massive (300TB+) Apache Spark Pipelin...
Spark Summit
 
How to Lock Down Apache Kafka and Keep Your Streams Safe
How to Lock Down Apache Kafka and Keep Your Streams SafeHow to Lock Down Apache Kafka and Keep Your Streams Safe
How to Lock Down Apache Kafka and Keep Your Streams Safe
confluent
 
Elk
Elk Elk
Real-Life Use Cases & Architectures for Event Streaming with Apache Kafka
Real-Life Use Cases & Architectures for Event Streaming with Apache KafkaReal-Life Use Cases & Architectures for Event Streaming with Apache Kafka
Real-Life Use Cases & Architectures for Event Streaming with Apache Kafka
Kai Wähner
 
Apache Flink Adoption at Shopify
Apache Flink Adoption at ShopifyApache Flink Adoption at Shopify
Apache Flink Adoption at Shopify
Yaroslav Tkachenko
 
Large Scale Graph Analytics with JanusGraph
Large Scale Graph Analytics with JanusGraphLarge Scale Graph Analytics with JanusGraph
Large Scale Graph Analytics with JanusGraph
DataWorks Summit
 
The Top 5 Apache Kafka Use Cases and Architectures in 2022
The Top 5 Apache Kafka Use Cases and Architectures in 2022The Top 5 Apache Kafka Use Cases and Architectures in 2022
The Top 5 Apache Kafka Use Cases and Architectures in 2022
Kai Wähner
 
Architecture Sustaining LINE Sticker services
Architecture Sustaining LINE Sticker servicesArchitecture Sustaining LINE Sticker services
Architecture Sustaining LINE Sticker services
LINE Corporation
 
Common issues with Apache Kafka® Producer
Common issues with Apache Kafka® ProducerCommon issues with Apache Kafka® Producer
Common issues with Apache Kafka® Producer
confluent
 
Log management with ELK
Log management with ELKLog management with ELK
Log management with ELK
Geert Pante
 
Dissecting our Legacy: The Strangler Fig Pattern with Debezium, Apache Kafka ...
Dissecting our Legacy: The Strangler Fig Pattern with Debezium, Apache Kafka ...Dissecting our Legacy: The Strangler Fig Pattern with Debezium, Apache Kafka ...
Dissecting our Legacy: The Strangler Fig Pattern with Debezium, Apache Kafka ...
HostedbyConfluent
 
Spark Summit EU talk by Sebastian Schroeder and Ralf Sigmund
Spark Summit EU talk by Sebastian Schroeder and Ralf SigmundSpark Summit EU talk by Sebastian Schroeder and Ralf Sigmund
Spark Summit EU talk by Sebastian Schroeder and Ralf Sigmund
Spark Summit
 
What Is ELK Stack | ELK Tutorial For Beginners | Elasticsearch Kibana | ELK S...
What Is ELK Stack | ELK Tutorial For Beginners | Elasticsearch Kibana | ELK S...What Is ELK Stack | ELK Tutorial For Beginners | Elasticsearch Kibana | ELK S...
What Is ELK Stack | ELK Tutorial For Beginners | Elasticsearch Kibana | ELK S...
Edureka!
 
API Security Best Practices and Guidelines
API Security Best Practices and GuidelinesAPI Security Best Practices and Guidelines
API Security Best Practices and Guidelines
WSO2
 
Stream processing using Kafka
Stream processing using KafkaStream processing using Kafka
Stream processing using Kafka
Knoldus Inc.
 

What's hot (20)

Handle Large Messages In Apache Kafka
Handle Large Messages In Apache KafkaHandle Large Messages In Apache Kafka
Handle Large Messages In Apache Kafka
 
Lambda Architecture with Spark, Spark Streaming, Kafka, Cassandra, Akka and S...
Lambda Architecture with Spark, Spark Streaming, Kafka, Cassandra, Akka and S...Lambda Architecture with Spark, Spark Streaming, Kafka, Cassandra, Akka and S...
Lambda Architecture with Spark, Spark Streaming, Kafka, Cassandra, Akka and S...
 
Apache Flink Adoption @ Shopify
Apache Flink Adoption @ ShopifyApache Flink Adoption @ Shopify
Apache Flink Adoption @ Shopify
 
REST API Pentester's perspective
REST API Pentester's perspectiveREST API Pentester's perspective
REST API Pentester's perspective
 
ELK introduction
ELK introductionELK introduction
ELK introduction
 
Lessons Learned Developing and Managing Massive (300TB+) Apache Spark Pipelin...
Lessons Learned Developing and Managing Massive (300TB+) Apache Spark Pipelin...Lessons Learned Developing and Managing Massive (300TB+) Apache Spark Pipelin...
Lessons Learned Developing and Managing Massive (300TB+) Apache Spark Pipelin...
 
How to Lock Down Apache Kafka and Keep Your Streams Safe
How to Lock Down Apache Kafka and Keep Your Streams SafeHow to Lock Down Apache Kafka and Keep Your Streams Safe
How to Lock Down Apache Kafka and Keep Your Streams Safe
 
Elk
Elk Elk
Elk
 
Real-Life Use Cases & Architectures for Event Streaming with Apache Kafka
Real-Life Use Cases & Architectures for Event Streaming with Apache KafkaReal-Life Use Cases & Architectures for Event Streaming with Apache Kafka
Real-Life Use Cases & Architectures for Event Streaming with Apache Kafka
 
Apache Flink Adoption at Shopify
Apache Flink Adoption at ShopifyApache Flink Adoption at Shopify
Apache Flink Adoption at Shopify
 
Large Scale Graph Analytics with JanusGraph
Large Scale Graph Analytics with JanusGraphLarge Scale Graph Analytics with JanusGraph
Large Scale Graph Analytics with JanusGraph
 
The Top 5 Apache Kafka Use Cases and Architectures in 2022
The Top 5 Apache Kafka Use Cases and Architectures in 2022The Top 5 Apache Kafka Use Cases and Architectures in 2022
The Top 5 Apache Kafka Use Cases and Architectures in 2022
 
Architecture Sustaining LINE Sticker services
Architecture Sustaining LINE Sticker servicesArchitecture Sustaining LINE Sticker services
Architecture Sustaining LINE Sticker services
 
Common issues with Apache Kafka® Producer
Common issues with Apache Kafka® ProducerCommon issues with Apache Kafka® Producer
Common issues with Apache Kafka® Producer
 
Log management with ELK
Log management with ELKLog management with ELK
Log management with ELK
 
Dissecting our Legacy: The Strangler Fig Pattern with Debezium, Apache Kafka ...
Dissecting our Legacy: The Strangler Fig Pattern with Debezium, Apache Kafka ...Dissecting our Legacy: The Strangler Fig Pattern with Debezium, Apache Kafka ...
Dissecting our Legacy: The Strangler Fig Pattern with Debezium, Apache Kafka ...
 
Spark Summit EU talk by Sebastian Schroeder and Ralf Sigmund
Spark Summit EU talk by Sebastian Schroeder and Ralf SigmundSpark Summit EU talk by Sebastian Schroeder and Ralf Sigmund
Spark Summit EU talk by Sebastian Schroeder and Ralf Sigmund
 
What Is ELK Stack | ELK Tutorial For Beginners | Elasticsearch Kibana | ELK S...
What Is ELK Stack | ELK Tutorial For Beginners | Elasticsearch Kibana | ELK S...What Is ELK Stack | ELK Tutorial For Beginners | Elasticsearch Kibana | ELK S...
What Is ELK Stack | ELK Tutorial For Beginners | Elasticsearch Kibana | ELK S...
 
API Security Best Practices and Guidelines
API Security Best Practices and GuidelinesAPI Security Best Practices and Guidelines
API Security Best Practices and Guidelines
 
Stream processing using Kafka
Stream processing using KafkaStream processing using Kafka
Stream processing using Kafka
 

Similar to Journeys from Kafka to Parquet

Polyglot, fault-tolerant event-driven programming with kafka, kubernetes and ...
Polyglot, fault-tolerant event-driven programming with kafka, kubernetes and ...Polyglot, fault-tolerant event-driven programming with kafka, kubernetes and ...
Polyglot, fault-tolerant event-driven programming with kafka, kubernetes and ...
Natan Silnitsky
 
Kostas Tzoumas - Stream Processing with Apache Flink®
Kostas Tzoumas - Stream Processing with Apache Flink®Kostas Tzoumas - Stream Processing with Apache Flink®
Kostas Tzoumas - Stream Processing with Apache Flink®
Ververica
 
Debunking Common Myths in Stream Processing
Debunking Common Myths in Stream ProcessingDebunking Common Myths in Stream Processing
Debunking Common Myths in Stream Processing
Kostas Tzoumas
 
Caterpillar’s move to the cloud: cutting edge tools for a cutting-edge business
Caterpillar’s move to the cloud: cutting edge tools for a cutting-edge businessCaterpillar’s move to the cloud: cutting edge tools for a cutting-edge business
Caterpillar’s move to the cloud: cutting edge tools for a cutting-edge business
DataWorks Summit
 
Real Time Big Data
Real Time Big DataReal Time Big Data
Real Time Big Data
InfoFarm
 
Polyglot, Fault Tolerant Event-Driven Programming with Kafka, Kubernetes and ...
Polyglot, Fault Tolerant Event-Driven Programming with Kafka, Kubernetes and ...Polyglot, Fault Tolerant Event-Driven Programming with Kafka, Kubernetes and ...
Polyglot, Fault Tolerant Event-Driven Programming with Kafka, Kubernetes and ...
Natan Silnitsky
 
An Introduction to time series with Team Apache
An Introduction to time series with Team ApacheAn Introduction to time series with Team Apache
An Introduction to time series with Team Apache
Patrick McFadin
 
Apache Flink at Strata San Jose 2016
Apache Flink at Strata San Jose 2016Apache Flink at Strata San Jose 2016
Apache Flink at Strata San Jose 2016
Kostas Tzoumas
 
10 Lessons Learned from using Kafka in 1000 microservices - ScalaUA
10 Lessons Learned from using Kafka in 1000 microservices - ScalaUA10 Lessons Learned from using Kafka in 1000 microservices - ScalaUA
10 Lessons Learned from using Kafka in 1000 microservices - ScalaUA
Natan Silnitsky
 
Exactly-Once Financial Data Processing at Scale with Flink and Pinot
Exactly-Once Financial Data Processing at Scale with Flink and PinotExactly-Once Financial Data Processing at Scale with Flink and Pinot
Exactly-Once Financial Data Processing at Scale with Flink and Pinot
Flink Forward
 
Polyglot, fault-tolerant event-driven programming with kafka, kubernetes and ...
Polyglot, fault-tolerant event-driven programming with kafka, kubernetes and ...Polyglot, fault-tolerant event-driven programming with kafka, kubernetes and ...
Polyglot, fault-tolerant event-driven programming with kafka, kubernetes and ...
Natan Silnitsky
 
Debunking Six Common Myths in Stream Processing
Debunking Six Common Myths in Stream ProcessingDebunking Six Common Myths in Stream Processing
Debunking Six Common Myths in Stream Processing
Kostas Tzoumas
 
Akka, Spark or Kafka? Selecting The Right Streaming Engine For the Job
Akka, Spark or Kafka? Selecting The Right Streaming Engine For the JobAkka, Spark or Kafka? Selecting The Right Streaming Engine For the Job
Akka, Spark or Kafka? Selecting The Right Streaming Engine For the Job
Lightbend
 
High-performance 32G Fibre Channel Module on MDS 9700 Directors:
High-performance 32G Fibre Channel Module on MDS 9700 Directors:High-performance 32G Fibre Channel Module on MDS 9700 Directors:
High-performance 32G Fibre Channel Module on MDS 9700 Directors:
Tony Antony
 
Apache Pinot Case Study: Building Distributed Analytics Systems Using Apache ...
Apache Pinot Case Study: Building Distributed Analytics Systems Using Apache ...Apache Pinot Case Study: Building Distributed Analytics Systems Using Apache ...
Apache Pinot Case Study: Building Distributed Analytics Systems Using Apache ...
HostedbyConfluent
 
Greyhound - Powerful Functional Kafka Library - Devtalks reimagined
Greyhound - Powerful Functional Kafka Library - Devtalks reimaginedGreyhound - Powerful Functional Kafka Library - Devtalks reimagined
Greyhound - Powerful Functional Kafka Library - Devtalks reimagined
Natan Silnitsky
 
Unified Stream and Batch Processing with Apache Flink
Unified Stream and Batch Processing with Apache FlinkUnified Stream and Batch Processing with Apache Flink
Unified Stream and Batch Processing with Apache Flink
DataWorks Summit/Hadoop Summit
 
Extending the Yahoo Streaming Benchmark + MapR Benchmarks
Extending the Yahoo Streaming Benchmark + MapR BenchmarksExtending the Yahoo Streaming Benchmark + MapR Benchmarks
Extending the Yahoo Streaming Benchmark + MapR Benchmarks
Jamie Grier
 
Consistency and Completeness: Rethinking Distributed Stream Processing in Apa...
Consistency and Completeness: Rethinking Distributed Stream Processing in Apa...Consistency and Completeness: Rethinking Distributed Stream Processing in Apa...
Consistency and Completeness: Rethinking Distributed Stream Processing in Apa...
Guozhang Wang
 
Debunking Common Myths in Stream Processing
Debunking Common Myths in Stream ProcessingDebunking Common Myths in Stream Processing
Debunking Common Myths in Stream Processing
DataWorks Summit/Hadoop Summit
 

Similar to Journeys from Kafka to Parquet (20)

Polyglot, fault-tolerant event-driven programming with kafka, kubernetes and ...
Polyglot, fault-tolerant event-driven programming with kafka, kubernetes and ...Polyglot, fault-tolerant event-driven programming with kafka, kubernetes and ...
Polyglot, fault-tolerant event-driven programming with kafka, kubernetes and ...
 
Kostas Tzoumas - Stream Processing with Apache Flink®
Kostas Tzoumas - Stream Processing with Apache Flink®Kostas Tzoumas - Stream Processing with Apache Flink®
Kostas Tzoumas - Stream Processing with Apache Flink®
 
Debunking Common Myths in Stream Processing
Debunking Common Myths in Stream ProcessingDebunking Common Myths in Stream Processing
Debunking Common Myths in Stream Processing
 
Caterpillar’s move to the cloud: cutting edge tools for a cutting-edge business
Caterpillar’s move to the cloud: cutting edge tools for a cutting-edge businessCaterpillar’s move to the cloud: cutting edge tools for a cutting-edge business
Caterpillar’s move to the cloud: cutting edge tools for a cutting-edge business
 
Real Time Big Data
Real Time Big DataReal Time Big Data
Real Time Big Data
 
Polyglot, Fault Tolerant Event-Driven Programming with Kafka, Kubernetes and ...
Polyglot, Fault Tolerant Event-Driven Programming with Kafka, Kubernetes and ...Polyglot, Fault Tolerant Event-Driven Programming with Kafka, Kubernetes and ...
Polyglot, Fault Tolerant Event-Driven Programming with Kafka, Kubernetes and ...
 
An Introduction to time series with Team Apache
An Introduction to time series with Team ApacheAn Introduction to time series with Team Apache
An Introduction to time series with Team Apache
 
Apache Flink at Strata San Jose 2016
Apache Flink at Strata San Jose 2016Apache Flink at Strata San Jose 2016
Apache Flink at Strata San Jose 2016
 
10 Lessons Learned from using Kafka in 1000 microservices - ScalaUA
10 Lessons Learned from using Kafka in 1000 microservices - ScalaUA10 Lessons Learned from using Kafka in 1000 microservices - ScalaUA
10 Lessons Learned from using Kafka in 1000 microservices - ScalaUA
 
Exactly-Once Financial Data Processing at Scale with Flink and Pinot
Exactly-Once Financial Data Processing at Scale with Flink and PinotExactly-Once Financial Data Processing at Scale with Flink and Pinot
Exactly-Once Financial Data Processing at Scale with Flink and Pinot
 
Polyglot, fault-tolerant event-driven programming with kafka, kubernetes and ...
Polyglot, fault-tolerant event-driven programming with kafka, kubernetes and ...Polyglot, fault-tolerant event-driven programming with kafka, kubernetes and ...
Polyglot, fault-tolerant event-driven programming with kafka, kubernetes and ...
 
Debunking Six Common Myths in Stream Processing
Debunking Six Common Myths in Stream ProcessingDebunking Six Common Myths in Stream Processing
Debunking Six Common Myths in Stream Processing
 
Akka, Spark or Kafka? Selecting The Right Streaming Engine For the Job
Akka, Spark or Kafka? Selecting The Right Streaming Engine For the JobAkka, Spark or Kafka? Selecting The Right Streaming Engine For the Job
Akka, Spark or Kafka? Selecting The Right Streaming Engine For the Job
 
High-performance 32G Fibre Channel Module on MDS 9700 Directors:
High-performance 32G Fibre Channel Module on MDS 9700 Directors:High-performance 32G Fibre Channel Module on MDS 9700 Directors:
High-performance 32G Fibre Channel Module on MDS 9700 Directors:
 
Apache Pinot Case Study: Building Distributed Analytics Systems Using Apache ...
Apache Pinot Case Study: Building Distributed Analytics Systems Using Apache ...Apache Pinot Case Study: Building Distributed Analytics Systems Using Apache ...
Apache Pinot Case Study: Building Distributed Analytics Systems Using Apache ...
 
Greyhound - Powerful Functional Kafka Library - Devtalks reimagined
Greyhound - Powerful Functional Kafka Library - Devtalks reimaginedGreyhound - Powerful Functional Kafka Library - Devtalks reimagined
Greyhound - Powerful Functional Kafka Library - Devtalks reimagined
 
Unified Stream and Batch Processing with Apache Flink
Unified Stream and Batch Processing with Apache FlinkUnified Stream and Batch Processing with Apache Flink
Unified Stream and Batch Processing with Apache Flink
 
Extending the Yahoo Streaming Benchmark + MapR Benchmarks
Extending the Yahoo Streaming Benchmark + MapR BenchmarksExtending the Yahoo Streaming Benchmark + MapR Benchmarks
Extending the Yahoo Streaming Benchmark + MapR Benchmarks
 
Consistency and Completeness: Rethinking Distributed Stream Processing in Apa...
Consistency and Completeness: Rethinking Distributed Stream Processing in Apa...Consistency and Completeness: Rethinking Distributed Stream Processing in Apa...
Consistency and Completeness: Rethinking Distributed Stream Processing in Apa...
 
Debunking Common Myths in Stream Processing
Debunking Common Myths in Stream ProcessingDebunking Common Myths in Stream Processing
Debunking Common Myths in Stream Processing
 

More from DataWorks Summit

Data Science Crash Course
Data Science Crash CourseData Science Crash Course
Data Science Crash Course
DataWorks Summit
 
Floating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisFloating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache Ratis
DataWorks Summit
 
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiTracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
DataWorks Summit
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...
DataWorks Summit
 
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
DataWorks Summit
 
Managing the Dewey Decimal System
Managing the Dewey Decimal SystemManaging the Dewey Decimal System
Managing the Dewey Decimal System
DataWorks Summit
 
Practical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExamplePractical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist Example
DataWorks Summit
 
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at Uber
DataWorks Summit
 
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixScaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
DataWorks Summit
 
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiBuilding the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
DataWorks Summit
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability Improvements
DataWorks Summit
 
Security Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureSecurity Framework for Multitenant Architecture
Security Framework for Multitenant Architecture
DataWorks Summit
 
Presto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EnginePresto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything Engine
DataWorks Summit
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
DataWorks Summit
 
Extending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudExtending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google Cloud
DataWorks Summit
 
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiEvent-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
DataWorks Summit
 
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerSecuring Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
DataWorks Summit
 
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
DataWorks Summit
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near You
DataWorks Summit
 
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkBig Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
DataWorks Summit
 

More from DataWorks Summit (20)

Data Science Crash Course
Data Science Crash CourseData Science Crash Course
Data Science Crash Course
 
Floating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisFloating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache Ratis
 
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiTracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...
 
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
 
Managing the Dewey Decimal System
Managing the Dewey Decimal SystemManaging the Dewey Decimal System
Managing the Dewey Decimal System
 
Practical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExamplePractical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist Example
 
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at Uber
 
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixScaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
 
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiBuilding the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability Improvements
 
Security Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureSecurity Framework for Multitenant Architecture
Security Framework for Multitenant Architecture
 
Presto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EnginePresto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything Engine
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
 
Extending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudExtending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google Cloud
 
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiEvent-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
 
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerSecuring Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
 
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near You
 
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkBig Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
 

Recently uploaded

The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
Laura Byrne
 
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Ramesh Iyer
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance
 
ODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User GroupODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User Group
CatarinaPereira64715
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Product School
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
Guy Korland
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
Sri Ambati
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
Prayukth K V
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
Product School
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
Product School
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
Product School
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
Frank van Harmelen
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
Safe Software
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
Product School
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
Product School
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Thierry Lestable
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
Jemma Hussein Allen
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
DanBrown980551
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
91mobiles
 
Search and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical FuturesSearch and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical Futures
Bhaskar Mitra
 

Recently uploaded (20)

The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
 
ODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User GroupODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User Group
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
 
Search and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical FuturesSearch and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical Futures
 

Journeys from Kafka to Parquet