SlideShare a Scribd company logo
Kafka Tiered
Storage Using
Azure Blobs
Towards running Kafka
cost-effectively in Azure
Sumant Tambe
Sr. Software Engineer, Core Eng, LinkedIn
March 4, 2022
Agenda
1 Cost of Running Kafka in Azure
2 Tiered Storage
3 KIP-405 Kafka Tiered Storage
Architecture
4 Operationalizing Tiered
Storage Using Azure Blobs
5 Demo
Cost of Running Kafka In Azure
• To minimize the migration and schedule risks LinkedIn migrated
Kafka to Azure as is*
• Kafka is using remote managed disks in LinkedIn Azure
deployments
• LinkedIn's common Kafka VM in Azure
• 64 core general-purpose VM
• RAID0 array of remote managed SSDs for throughput
• No RAID mirroring needed as managed disk are replicated
• Azure SKUs with local disks more expensive than a similar disk-less
SKUs with managed disks attached to them
• Kafka is throughput-sensitive and not latency-sensitive
• The size of the VM local disk is also a limitation.
• Cluster size has a cap
*https://engineering.linkedin.com/blog/2019/building-next-infra
Cost of Running Kafka In Azure
• Standard and premium SSDs cost more than managed HDDs as
they offer better SLAs
• Better bandwidth and P99 latency guarantees
• For example, Bursting support in Azure Standard and Premium
SSD
• The connectivity SLA to the VM improves as they use
more expensive disks.
• Standard SSDs have transaction costs
• Large number of small writes tend to be more expensive
than small number of large writes
• Costs include VM, disk, transactions, and bursts
• Each read/write IO unit up to 256 KB is a billable IOP
What is Tiered Storage
Kafka storage in two persistent layers--local
disk for fresh data and remote store for older
data.
Remote Store
Local Disk
Page
Cache
Kafka Tiered Storage Implementations In The Wild
• KIP-405
• Led by Uber
• HDFS and S3 (Uber)
• Azure Blob Storage (LinkedIn and Microsoft)
• possibly more...
• Confluent Tiered Storage
• Amazon S3, Google Cloud Storage, and Pure Storage FlashBlade (as of version 6.0.0)
• Not open-source
Why Use Kafka Tiered Storage?
Kafka storage in two persistent layers--local disk for fresh data and remote store for older data
Pros Cons
Use cheaper storage for older data Potentially higher latency for older data
Potentially increase retention at lower cost More complex due to additional system dependencies
Reduce coupling between broker compute and storage Compacted topics are not supported
Facilitate independent scaling of compute and storage
Faster cluster maintenance operations. New brokers
catch up to ISR much faster due to smaller local tier.
KIP-405*: Kafka Tiered Storage Architecture
• Kafka Improvement Proposal—KIP-405
• Remote Log Manager
• (Pluggable) Remote Storage Manager
• Stores data and indexes
• (Pluggable) Remote Log Metadata Manager
• Stores startOffset, endOffset, etc. Per
segment
• Manages lifecycle of segments
• Kafka internals interact with
RemoteLogManager only
* Image Source: https://cwiki.apache.org/confluence/display/KAFKA/KIP-405%3A+Kafka+Tiered+Storage
Producers Consumers
(Oblivious to tiering)
KIP-405: Kafka Tiered Storage Architecture
• RemoteStorageManager and
RemoteLogMetadataManager are separate
• This allows stronger consistency
guarantees for metadata than data.
• The cost of using an object store for
metadata may be too high in cloud
environment due to frequent
r/w access.
KIP-405 Topic-Partition Offsets Due to Tiering
• ListOffset API now offers
• EARLIEST, LATEST, and EARLIEST_LOCAL_TIMESTAMP
Local Local
Remote
Earliest
Offset
Earliest Local
Offset
Last Stable
Offset
Log End
Offset
Remote Log
End Offset
Newer Older
KIP-405 Pluggable Remote Storage Manager (RSM)
This interface provides storing and fetching remote log segment data and indexes
• The API and the implementation operate at the segment level
• Segment details don’t leak outside Kafka
• API
• CopyLogSegmentData (copies data and index)
• FetchLogSegment
• FetchIndex
• DeleteLogSegment
KIP-405 Pluggable Remote Log Metadata Manager (RLMM)
This interface provides storing and fetching
remote log segment metadata with strongly
consistent semantics.
• RemoteLogMetadataManager
• Create/Read/Update API for
RemoteLogSegmentMetadata
Copy
Segment
Started
Copy
Segment
Finished
Delete
Segment
Started
Delete
Segment
Finished
Can read the
segment in this
state only
Basic Remote Log Segment Life-Cycle in the Leader
• ReplicaManager interacts with RemoteLogManager
• When leader copies the log segments to remote tier and clean up the expired remote log
segments
• RemoteLogManager interacts with RemoteLogMetadataManager and
RemoteStoreManager to change the state of the log segment
Remote Log
Metadata
Manager
Remote
Storage
Manager
Add
Remote
Log
Segment
Metadata
Copy
Log
Segment
Data +
index
Update
Remote
Log
Segment
Metadata
Time
Time
COPY_STARTED COPY_FINISHED
Fetch
Log
Segment
+ index
Fetch
Log
Segment
Fetch
Log
Segment
DELETE_STARTED DELETE_FINISHED
Update
Remote
Log
Segment
Metadata
Update
Remote
Log
Segment
Metadata
Delete
Log
Segment
Data +
index
Read 0-N times
Data and Metadata Movement in RSM and RLMM
1. Leader copies log segments with
the auxiliary state (includes time
index, offset index, and producer-id
snapshots) to the remote storage.
2. Leader publishes remote log
segment metadata about
beginning and completion of log
segment copy.
3. Follower fetches the messages from
the leader.
4. Follower consumes the required
remote log segment metadata
The TopicBasedRemoteLogMetadataManager (RLMM)
Must provide strongly consistent view of the metadata
• Read-your-own-writes
• All nodes read the metadata updates in the same order (log append order)
The internal topic name is __remote_log_metadata
High Durability configs
• Replication factor = 4
• min.insync.replicas = 2 (tolerate one planned and one unplanned failure in the cluster)
• unclean.leader.election.enable=false
• Producer sends data with Acks=all
• Currently not a Compact topic. Grows indefinitely.
Work-In-Progress Remote Storage Manager for Azure Blob Storage
Possible Enhancements
RemoteLogMetadataManager and RemoteStorageManager
• Currently, stateless mapping of RemoteLogSegmentId to remote store
container/bucket
• The address space of the remote log segments is assumed to be well-known
(global) for ALL the segments
• However, If multiple storage accounts are needed for throughput or capacity
reasons, currently there's no way to carry the "extra" metadata (location, e.g.)
• Ideally the "extra" metadata can be opaque: byte[]
Operationalizing Kafka Tiered Storage with Azure Blob Storage
How to size brokers' local storage?
• % of data stored locally
• Depends on the maximum time the remote store may be unavailable
• (as of this writing) Azure guarantees 99.9% r+w availability of LRS, ZRS, and GRS
accounts. Equivalent to exactly one incidence of 8hr:45min downtime in year.
• Ensure 9+ hours of local storage headroom all the time. In practice, may be 18 hrs or a
day.
Determining the local retention time of the individual topics
• 0.5 hours is the current proposal
Monitoring Tiered Storage
• How much local storage is remaining, etc. using Cruise-Control
Operationalizing Kafka Tiered Storage with Azure Blob Service
Local segment size
• Large writes and reads to/from the remote store
• Read latency increases. Bootstrapping consumer FetchRequest may expire.
Remember, no cache as of this writing
• 100MB suggested in Confluent's implementation of tiered storage. 256 MB in
LinkedIn without tiering.
The __remote_log_metadata topic retention.ms longer than all topic in the cluster
• May be compact retention policy?
A Quick Demo
1. Produce and Consume
2. Write segments to Azure Blob Storage
3. Fetch from Azure Blob Storag and resume
locally
4. Browse Azure Blob Store contents
5. Peek __remote_log_metadata
Thank you

More Related Content

What's hot

Serverless Kafka and Spark in a Multi-Cloud Lakehouse Architecture
Serverless Kafka and Spark in a Multi-Cloud Lakehouse ArchitectureServerless Kafka and Spark in a Multi-Cloud Lakehouse Architecture
Serverless Kafka and Spark in a Multi-Cloud Lakehouse Architecture
Kai Wähner
 
The Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization OpportunitiesThe Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization Opportunities
Databricks
 
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the CloudAmazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Noritaka Sekiyama
 
How to build a streaming Lakehouse with Flink, Kafka, and Hudi
How to build a streaming Lakehouse with Flink, Kafka, and HudiHow to build a streaming Lakehouse with Flink, Kafka, and Hudi
How to build a streaming Lakehouse with Flink, Kafka, and Hudi
Flink Forward
 
ClickHouse in Real Life. Case Studies and Best Practices, by Alexander Zaitsev
ClickHouse in Real Life. Case Studies and Best Practices, by Alexander ZaitsevClickHouse in Real Life. Case Studies and Best Practices, by Alexander Zaitsev
ClickHouse in Real Life. Case Studies and Best Practices, by Alexander Zaitsev
Altinity Ltd
 
Spark with Delta Lake
Spark with Delta LakeSpark with Delta Lake
Spark with Delta Lake
Knoldus Inc.
 
Kafka at Peak Performance
Kafka at Peak PerformanceKafka at Peak Performance
Kafka at Peak Performance
Todd Palino
 
Building Reliable Lakehouses with Apache Flink and Delta Lake
Building Reliable Lakehouses with Apache Flink and Delta LakeBuilding Reliable Lakehouses with Apache Flink and Delta Lake
Building Reliable Lakehouses with Apache Flink and Delta Lake
Flink Forward
 
YugaByte DB Internals - Storage Engine and Transactions
YugaByte DB Internals - Storage Engine and Transactions YugaByte DB Internals - Storage Engine and Transactions
YugaByte DB Internals - Storage Engine and Transactions
Yugabyte
 
InfluxDB IOx Tech Talks: Query Engine Design and the Rust-Based DataFusion in...
InfluxDB IOx Tech Talks: Query Engine Design and the Rust-Based DataFusion in...InfluxDB IOx Tech Talks: Query Engine Design and the Rust-Based DataFusion in...
InfluxDB IOx Tech Talks: Query Engine Design and the Rust-Based DataFusion in...
InfluxData
 
kafka
kafkakafka
噛み砕いてKafka Streams #kafkajp
噛み砕いてKafka Streams #kafkajp噛み砕いてKafka Streams #kafkajp
噛み砕いてKafka Streams #kafkajp
Yahoo!デベロッパーネットワーク
 
Bootstrapping state in Apache Flink
Bootstrapping state in Apache FlinkBootstrapping state in Apache Flink
Bootstrapping state in Apache Flink
DataWorks Summit
 
ksqlDB - Stream Processing simplified!
ksqlDB - Stream Processing simplified!ksqlDB - Stream Processing simplified!
ksqlDB - Stream Processing simplified!
Guido Schmutz
 
Practical learnings from running thousands of Flink jobs
Practical learnings from running thousands of Flink jobsPractical learnings from running thousands of Flink jobs
Practical learnings from running thousands of Flink jobs
Flink Forward
 
Apache kafka
Apache kafkaApache kafka
Apache kafka
Kumar Shivam
 
Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...
Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...
Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...
confluent
 
ストリーム処理におけるApache Avroの活用について(NTTデータ テクノロジーカンファレンス 2019 講演資料、2019/09/05)
ストリーム処理におけるApache Avroの活用について(NTTデータ テクノロジーカンファレンス 2019 講演資料、2019/09/05)ストリーム処理におけるApache Avroの活用について(NTTデータ テクノロジーカンファレンス 2019 講演資料、2019/09/05)
ストリーム処理におけるApache Avroの活用について(NTTデータ テクノロジーカンファレンス 2019 講演資料、2019/09/05)
NTT DATA Technology & Innovation
 
ksqlDB: A Stream-Relational Database System
ksqlDB: A Stream-Relational Database SystemksqlDB: A Stream-Relational Database System
ksqlDB: A Stream-Relational Database System
confluent
 
Introduction to Kafka Streams
Introduction to Kafka StreamsIntroduction to Kafka Streams
Introduction to Kafka Streams
Guozhang Wang
 

What's hot (20)

Serverless Kafka and Spark in a Multi-Cloud Lakehouse Architecture
Serverless Kafka and Spark in a Multi-Cloud Lakehouse ArchitectureServerless Kafka and Spark in a Multi-Cloud Lakehouse Architecture
Serverless Kafka and Spark in a Multi-Cloud Lakehouse Architecture
 
The Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization OpportunitiesThe Parquet Format and Performance Optimization Opportunities
The Parquet Format and Performance Optimization Opportunities
 
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the CloudAmazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
 
How to build a streaming Lakehouse with Flink, Kafka, and Hudi
How to build a streaming Lakehouse with Flink, Kafka, and HudiHow to build a streaming Lakehouse with Flink, Kafka, and Hudi
How to build a streaming Lakehouse with Flink, Kafka, and Hudi
 
ClickHouse in Real Life. Case Studies and Best Practices, by Alexander Zaitsev
ClickHouse in Real Life. Case Studies and Best Practices, by Alexander ZaitsevClickHouse in Real Life. Case Studies and Best Practices, by Alexander Zaitsev
ClickHouse in Real Life. Case Studies and Best Practices, by Alexander Zaitsev
 
Spark with Delta Lake
Spark with Delta LakeSpark with Delta Lake
Spark with Delta Lake
 
Kafka at Peak Performance
Kafka at Peak PerformanceKafka at Peak Performance
Kafka at Peak Performance
 
Building Reliable Lakehouses with Apache Flink and Delta Lake
Building Reliable Lakehouses with Apache Flink and Delta LakeBuilding Reliable Lakehouses with Apache Flink and Delta Lake
Building Reliable Lakehouses with Apache Flink and Delta Lake
 
YugaByte DB Internals - Storage Engine and Transactions
YugaByte DB Internals - Storage Engine and Transactions YugaByte DB Internals - Storage Engine and Transactions
YugaByte DB Internals - Storage Engine and Transactions
 
InfluxDB IOx Tech Talks: Query Engine Design and the Rust-Based DataFusion in...
InfluxDB IOx Tech Talks: Query Engine Design and the Rust-Based DataFusion in...InfluxDB IOx Tech Talks: Query Engine Design and the Rust-Based DataFusion in...
InfluxDB IOx Tech Talks: Query Engine Design and the Rust-Based DataFusion in...
 
kafka
kafkakafka
kafka
 
噛み砕いてKafka Streams #kafkajp
噛み砕いてKafka Streams #kafkajp噛み砕いてKafka Streams #kafkajp
噛み砕いてKafka Streams #kafkajp
 
Bootstrapping state in Apache Flink
Bootstrapping state in Apache FlinkBootstrapping state in Apache Flink
Bootstrapping state in Apache Flink
 
ksqlDB - Stream Processing simplified!
ksqlDB - Stream Processing simplified!ksqlDB - Stream Processing simplified!
ksqlDB - Stream Processing simplified!
 
Practical learnings from running thousands of Flink jobs
Practical learnings from running thousands of Flink jobsPractical learnings from running thousands of Flink jobs
Practical learnings from running thousands of Flink jobs
 
Apache kafka
Apache kafkaApache kafka
Apache kafka
 
Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...
Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...
Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...
 
ストリーム処理におけるApache Avroの活用について(NTTデータ テクノロジーカンファレンス 2019 講演資料、2019/09/05)
ストリーム処理におけるApache Avroの活用について(NTTデータ テクノロジーカンファレンス 2019 講演資料、2019/09/05)ストリーム処理におけるApache Avroの活用について(NTTデータ テクノロジーカンファレンス 2019 講演資料、2019/09/05)
ストリーム処理におけるApache Avroの活用について(NTTデータ テクノロジーカンファレンス 2019 講演資料、2019/09/05)
 
ksqlDB: A Stream-Relational Database System
ksqlDB: A Stream-Relational Database SystemksqlDB: A Stream-Relational Database System
ksqlDB: A Stream-Relational Database System
 
Introduction to Kafka Streams
Introduction to Kafka StreamsIntroduction to Kafka Streams
Introduction to Kafka Streams
 

Similar to Kafka tiered-storage-meetup-2022-final-presented

Consensus in Apache Kafka: From Theory to Production.pdf
Consensus in Apache Kafka: From Theory to Production.pdfConsensus in Apache Kafka: From Theory to Production.pdf
Consensus in Apache Kafka: From Theory to Production.pdf
Guozhang Wang
 
Storage Requirements and Options for Running Spark on Kubernetes
Storage Requirements and Options for Running Spark on KubernetesStorage Requirements and Options for Running Spark on Kubernetes
Storage Requirements and Options for Running Spark on Kubernetes
DataWorks Summit
 
Managing storage on Prem and in Cloud
Managing storage on Prem and in CloudManaging storage on Prem and in Cloud
Managing storage on Prem and in Cloud
Howard Marks
 
Inter connect2016 yss1841-cloud-storage-options-v4
Inter connect2016 yss1841-cloud-storage-options-v4Inter connect2016 yss1841-cloud-storage-options-v4
Inter connect2016 yss1841-cloud-storage-options-v4
Tony Pearson
 
Spark volume requirements 2018
Spark volume requirements 2018Spark volume requirements 2018
Spark volume requirements 2018
Rachit Arora
 
Building Effective Near-Real-Time Analytics with Spark Streaming and Kudu
Building Effective Near-Real-Time Analytics with Spark Streaming and KuduBuilding Effective Near-Real-Time Analytics with Spark Streaming and Kudu
Building Effective Near-Real-Time Analytics with Spark Streaming and Kudu
Jeremy Beard
 
Sql Start! 2020 - SQL Server Lift & Shift su Azure
Sql Start! 2020 - SQL Server Lift & Shift su AzureSql Start! 2020 - SQL Server Lift & Shift su Azure
Sql Start! 2020 - SQL Server Lift & Shift su Azure
Marco Obinu
 
Hive spark-s3acommitter-hbase-nfs
Hive spark-s3acommitter-hbase-nfsHive spark-s3acommitter-hbase-nfs
Hive spark-s3acommitter-hbase-nfs
Yifeng Jiang
 
From cache to in-memory data grid. Introduction to Hazelcast.
From cache to in-memory data grid. Introduction to Hazelcast.From cache to in-memory data grid. Introduction to Hazelcast.
From cache to in-memory data grid. Introduction to Hazelcast.
Taras Matyashovsky
 
Cloud computing UNIT 2.1 presentation in
Cloud computing UNIT 2.1 presentation inCloud computing UNIT 2.1 presentation in
Cloud computing UNIT 2.1 presentation in
RahulBhole12
 
Azure Databricks – Customer Experiences and Lessons Denzil Ribeiro Madhu Ganta
Azure Databricks – Customer Experiences and Lessons Denzil Ribeiro Madhu GantaAzure Databricks – Customer Experiences and Lessons Denzil Ribeiro Madhu Ganta
Azure Databricks – Customer Experiences and Lessons Denzil Ribeiro Madhu Ganta
Databricks
 
Architectures, Frameworks and Infrastructure
Architectures, Frameworks and InfrastructureArchitectures, Frameworks and Infrastructure
Architectures, Frameworks and Infrastructureharendra_pathak
 
Simplifying Hadoop with RecordService, A Secure and Unified Data Access Path ...
Simplifying Hadoop with RecordService, A Secure and Unified Data Access Path ...Simplifying Hadoop with RecordService, A Secure and Unified Data Access Path ...
Simplifying Hadoop with RecordService, A Secure and Unified Data Access Path ...
Cloudera, Inc.
 
Real time data pipline with kafka streams
Real time data pipline with kafka streamsReal time data pipline with kafka streams
Real time data pipline with kafka streams
Yoni Farin
 
Running Galera Cluster on Microsoft Azure
Running Galera Cluster on Microsoft AzureRunning Galera Cluster on Microsoft Azure
Running Galera Cluster on Microsoft Azure
Codership Oy - Creators of Galera Cluster
 
A Closer Look at Apache Kudu
A Closer Look at Apache KuduA Closer Look at Apache Kudu
A Closer Look at Apache Kudu
Andriy Zabavskyy
 
MongoDB World 2018: Solving Your Backup Needs Using MongoDB Ops Manager, Clou...
MongoDB World 2018: Solving Your Backup Needs Using MongoDB Ops Manager, Clou...MongoDB World 2018: Solving Your Backup Needs Using MongoDB Ops Manager, Clou...
MongoDB World 2018: Solving Your Backup Needs Using MongoDB Ops Manager, Clou...
MongoDB
 
Microsoft Azure Veri Servisleri
Microsoft Azure Veri ServisleriMicrosoft Azure Veri Servisleri
Microsoft Azure Veri Servisleri
Önder Değer
 
Amazon Aurora Getting started Guide -level 0
Amazon Aurora Getting started Guide -level 0Amazon Aurora Getting started Guide -level 0
Amazon Aurora Getting started Guide -level 0
kartraj
 
Fundamentals and Architecture of Apache Kafka
Fundamentals and Architecture of Apache KafkaFundamentals and Architecture of Apache Kafka
Fundamentals and Architecture of Apache Kafka
Angelo Cesaro
 

Similar to Kafka tiered-storage-meetup-2022-final-presented (20)

Consensus in Apache Kafka: From Theory to Production.pdf
Consensus in Apache Kafka: From Theory to Production.pdfConsensus in Apache Kafka: From Theory to Production.pdf
Consensus in Apache Kafka: From Theory to Production.pdf
 
Storage Requirements and Options for Running Spark on Kubernetes
Storage Requirements and Options for Running Spark on KubernetesStorage Requirements and Options for Running Spark on Kubernetes
Storage Requirements and Options for Running Spark on Kubernetes
 
Managing storage on Prem and in Cloud
Managing storage on Prem and in CloudManaging storage on Prem and in Cloud
Managing storage on Prem and in Cloud
 
Inter connect2016 yss1841-cloud-storage-options-v4
Inter connect2016 yss1841-cloud-storage-options-v4Inter connect2016 yss1841-cloud-storage-options-v4
Inter connect2016 yss1841-cloud-storage-options-v4
 
Spark volume requirements 2018
Spark volume requirements 2018Spark volume requirements 2018
Spark volume requirements 2018
 
Building Effective Near-Real-Time Analytics with Spark Streaming and Kudu
Building Effective Near-Real-Time Analytics with Spark Streaming and KuduBuilding Effective Near-Real-Time Analytics with Spark Streaming and Kudu
Building Effective Near-Real-Time Analytics with Spark Streaming and Kudu
 
Sql Start! 2020 - SQL Server Lift & Shift su Azure
Sql Start! 2020 - SQL Server Lift & Shift su AzureSql Start! 2020 - SQL Server Lift & Shift su Azure
Sql Start! 2020 - SQL Server Lift & Shift su Azure
 
Hive spark-s3acommitter-hbase-nfs
Hive spark-s3acommitter-hbase-nfsHive spark-s3acommitter-hbase-nfs
Hive spark-s3acommitter-hbase-nfs
 
From cache to in-memory data grid. Introduction to Hazelcast.
From cache to in-memory data grid. Introduction to Hazelcast.From cache to in-memory data grid. Introduction to Hazelcast.
From cache to in-memory data grid. Introduction to Hazelcast.
 
Cloud computing UNIT 2.1 presentation in
Cloud computing UNIT 2.1 presentation inCloud computing UNIT 2.1 presentation in
Cloud computing UNIT 2.1 presentation in
 
Azure Databricks – Customer Experiences and Lessons Denzil Ribeiro Madhu Ganta
Azure Databricks – Customer Experiences and Lessons Denzil Ribeiro Madhu GantaAzure Databricks – Customer Experiences and Lessons Denzil Ribeiro Madhu Ganta
Azure Databricks – Customer Experiences and Lessons Denzil Ribeiro Madhu Ganta
 
Architectures, Frameworks and Infrastructure
Architectures, Frameworks and InfrastructureArchitectures, Frameworks and Infrastructure
Architectures, Frameworks and Infrastructure
 
Simplifying Hadoop with RecordService, A Secure and Unified Data Access Path ...
Simplifying Hadoop with RecordService, A Secure and Unified Data Access Path ...Simplifying Hadoop with RecordService, A Secure and Unified Data Access Path ...
Simplifying Hadoop with RecordService, A Secure and Unified Data Access Path ...
 
Real time data pipline with kafka streams
Real time data pipline with kafka streamsReal time data pipline with kafka streams
Real time data pipline with kafka streams
 
Running Galera Cluster on Microsoft Azure
Running Galera Cluster on Microsoft AzureRunning Galera Cluster on Microsoft Azure
Running Galera Cluster on Microsoft Azure
 
A Closer Look at Apache Kudu
A Closer Look at Apache KuduA Closer Look at Apache Kudu
A Closer Look at Apache Kudu
 
MongoDB World 2018: Solving Your Backup Needs Using MongoDB Ops Manager, Clou...
MongoDB World 2018: Solving Your Backup Needs Using MongoDB Ops Manager, Clou...MongoDB World 2018: Solving Your Backup Needs Using MongoDB Ops Manager, Clou...
MongoDB World 2018: Solving Your Backup Needs Using MongoDB Ops Manager, Clou...
 
Microsoft Azure Veri Servisleri
Microsoft Azure Veri ServisleriMicrosoft Azure Veri Servisleri
Microsoft Azure Veri Servisleri
 
Amazon Aurora Getting started Guide -level 0
Amazon Aurora Getting started Guide -level 0Amazon Aurora Getting started Guide -level 0
Amazon Aurora Getting started Guide -level 0
 
Fundamentals and Architecture of Apache Kafka
Fundamentals and Architecture of Apache KafkaFundamentals and Architecture of Apache Kafka
Fundamentals and Architecture of Apache Kafka
 

More from Sumant Tambe

Systematic Generation Data and Types in C++
Systematic Generation Data and Types in C++Systematic Generation Data and Types in C++
Systematic Generation Data and Types in C++
Sumant Tambe
 
Tuning kafka pipelines
Tuning kafka pipelinesTuning kafka pipelines
Tuning kafka pipelines
Sumant Tambe
 
New Tools for a More Functional C++
New Tools for a More Functional C++New Tools for a More Functional C++
New Tools for a More Functional C++
Sumant Tambe
 
C++ Coroutines
C++ CoroutinesC++ Coroutines
C++ Coroutines
Sumant Tambe
 
C++ Generators and Property-based Testing
C++ Generators and Property-based TestingC++ Generators and Property-based Testing
C++ Generators and Property-based Testing
Sumant Tambe
 
Reactive Stream Processing in Industrial IoT using DDS and Rx
Reactive Stream Processing in Industrial IoT using DDS and RxReactive Stream Processing in Industrial IoT using DDS and Rx
Reactive Stream Processing in Industrial IoT using DDS and Rx
Sumant Tambe
 
RPC over DDS Beta 1
RPC over DDS Beta 1RPC over DDS Beta 1
RPC over DDS Beta 1
Sumant Tambe
 
Remote Log Analytics Using DDS, ELK, and RxJS
Remote Log Analytics Using DDS, ELK, and RxJSRemote Log Analytics Using DDS, ELK, and RxJS
Remote Log Analytics Using DDS, ELK, and RxJS
Sumant Tambe
 
Property-based Testing and Generators (Lua)
Property-based Testing and Generators (Lua)Property-based Testing and Generators (Lua)
Property-based Testing and Generators (Lua)
Sumant Tambe
 
Reactive Stream Processing for Data-centric Publish/Subscribe
Reactive Stream Processing for Data-centric Publish/SubscribeReactive Stream Processing for Data-centric Publish/Subscribe
Reactive Stream Processing for Data-centric Publish/Subscribe
Sumant Tambe
 
Reactive Stream Processing Using DDS and Rx
Reactive Stream Processing Using DDS and RxReactive Stream Processing Using DDS and Rx
Reactive Stream Processing Using DDS and Rx
Sumant Tambe
 
Fun with Lambdas: C++14 Style (part 2)
Fun with Lambdas: C++14 Style (part 2)Fun with Lambdas: C++14 Style (part 2)
Fun with Lambdas: C++14 Style (part 2)
Sumant Tambe
 
Fun with Lambdas: C++14 Style (part 1)
Fun with Lambdas: C++14 Style (part 1)Fun with Lambdas: C++14 Style (part 1)
Fun with Lambdas: C++14 Style (part 1)
Sumant Tambe
 
An Extensible Architecture for Avionics Sensor Health Assessment Using DDS
An Extensible Architecture for Avionics Sensor Health Assessment Using DDSAn Extensible Architecture for Avionics Sensor Health Assessment Using DDS
An Extensible Architecture for Avionics Sensor Health Assessment Using DDS
Sumant Tambe
 
Overloading in Overdrive: A Generic Data-Centric Messaging Library for DDS
Overloading in Overdrive: A Generic Data-Centric Messaging Library for DDSOverloading in Overdrive: A Generic Data-Centric Messaging Library for DDS
Overloading in Overdrive: A Generic Data-Centric Messaging Library for DDS
Sumant Tambe
 
Standardizing the Data Distribution Service (DDS) API for Modern C++
Standardizing the Data Distribution Service (DDS) API for Modern C++Standardizing the Data Distribution Service (DDS) API for Modern C++
Standardizing the Data Distribution Service (DDS) API for Modern C++Sumant Tambe
 
Communication Patterns Using Data-Centric Publish/Subscribe
Communication Patterns Using Data-Centric Publish/SubscribeCommunication Patterns Using Data-Centric Publish/Subscribe
Communication Patterns Using Data-Centric Publish/Subscribe
Sumant Tambe
 
C++11 Idioms @ Silicon Valley Code Camp 2012
C++11 Idioms @ Silicon Valley Code Camp 2012 C++11 Idioms @ Silicon Valley Code Camp 2012
C++11 Idioms @ Silicon Valley Code Camp 2012
Sumant Tambe
 
Retargeting Embedded Software Stack for Many-Core Systems
Retargeting Embedded Software Stack for Many-Core SystemsRetargeting Embedded Software Stack for Many-Core Systems
Retargeting Embedded Software Stack for Many-Core Systems
Sumant Tambe
 
Ph.D. Dissertation
Ph.D. DissertationPh.D. Dissertation
Ph.D. Dissertation
Sumant Tambe
 

More from Sumant Tambe (20)

Systematic Generation Data and Types in C++
Systematic Generation Data and Types in C++Systematic Generation Data and Types in C++
Systematic Generation Data and Types in C++
 
Tuning kafka pipelines
Tuning kafka pipelinesTuning kafka pipelines
Tuning kafka pipelines
 
New Tools for a More Functional C++
New Tools for a More Functional C++New Tools for a More Functional C++
New Tools for a More Functional C++
 
C++ Coroutines
C++ CoroutinesC++ Coroutines
C++ Coroutines
 
C++ Generators and Property-based Testing
C++ Generators and Property-based TestingC++ Generators and Property-based Testing
C++ Generators and Property-based Testing
 
Reactive Stream Processing in Industrial IoT using DDS and Rx
Reactive Stream Processing in Industrial IoT using DDS and RxReactive Stream Processing in Industrial IoT using DDS and Rx
Reactive Stream Processing in Industrial IoT using DDS and Rx
 
RPC over DDS Beta 1
RPC over DDS Beta 1RPC over DDS Beta 1
RPC over DDS Beta 1
 
Remote Log Analytics Using DDS, ELK, and RxJS
Remote Log Analytics Using DDS, ELK, and RxJSRemote Log Analytics Using DDS, ELK, and RxJS
Remote Log Analytics Using DDS, ELK, and RxJS
 
Property-based Testing and Generators (Lua)
Property-based Testing and Generators (Lua)Property-based Testing and Generators (Lua)
Property-based Testing and Generators (Lua)
 
Reactive Stream Processing for Data-centric Publish/Subscribe
Reactive Stream Processing for Data-centric Publish/SubscribeReactive Stream Processing for Data-centric Publish/Subscribe
Reactive Stream Processing for Data-centric Publish/Subscribe
 
Reactive Stream Processing Using DDS and Rx
Reactive Stream Processing Using DDS and RxReactive Stream Processing Using DDS and Rx
Reactive Stream Processing Using DDS and Rx
 
Fun with Lambdas: C++14 Style (part 2)
Fun with Lambdas: C++14 Style (part 2)Fun with Lambdas: C++14 Style (part 2)
Fun with Lambdas: C++14 Style (part 2)
 
Fun with Lambdas: C++14 Style (part 1)
Fun with Lambdas: C++14 Style (part 1)Fun with Lambdas: C++14 Style (part 1)
Fun with Lambdas: C++14 Style (part 1)
 
An Extensible Architecture for Avionics Sensor Health Assessment Using DDS
An Extensible Architecture for Avionics Sensor Health Assessment Using DDSAn Extensible Architecture for Avionics Sensor Health Assessment Using DDS
An Extensible Architecture for Avionics Sensor Health Assessment Using DDS
 
Overloading in Overdrive: A Generic Data-Centric Messaging Library for DDS
Overloading in Overdrive: A Generic Data-Centric Messaging Library for DDSOverloading in Overdrive: A Generic Data-Centric Messaging Library for DDS
Overloading in Overdrive: A Generic Data-Centric Messaging Library for DDS
 
Standardizing the Data Distribution Service (DDS) API for Modern C++
Standardizing the Data Distribution Service (DDS) API for Modern C++Standardizing the Data Distribution Service (DDS) API for Modern C++
Standardizing the Data Distribution Service (DDS) API for Modern C++
 
Communication Patterns Using Data-Centric Publish/Subscribe
Communication Patterns Using Data-Centric Publish/SubscribeCommunication Patterns Using Data-Centric Publish/Subscribe
Communication Patterns Using Data-Centric Publish/Subscribe
 
C++11 Idioms @ Silicon Valley Code Camp 2012
C++11 Idioms @ Silicon Valley Code Camp 2012 C++11 Idioms @ Silicon Valley Code Camp 2012
C++11 Idioms @ Silicon Valley Code Camp 2012
 
Retargeting Embedded Software Stack for Many-Core Systems
Retargeting Embedded Software Stack for Many-Core SystemsRetargeting Embedded Software Stack for Many-Core Systems
Retargeting Embedded Software Stack for Many-Core Systems
 
Ph.D. Dissertation
Ph.D. DissertationPh.D. Dissertation
Ph.D. Dissertation
 

Recently uploaded

NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
Amil Baba Dawood bangali
 
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
Dr.Costas Sachpazis
 
Investor-Presentation-Q1FY2024 investor presentation document.pptx
Investor-Presentation-Q1FY2024 investor presentation document.pptxInvestor-Presentation-Q1FY2024 investor presentation document.pptx
Investor-Presentation-Q1FY2024 investor presentation document.pptx
AmarGB2
 
H.Seo, ICLR 2024, MLILAB, KAIST AI.pdf
H.Seo,  ICLR 2024, MLILAB,  KAIST AI.pdfH.Seo,  ICLR 2024, MLILAB,  KAIST AI.pdf
H.Seo, ICLR 2024, MLILAB, KAIST AI.pdf
MLILAB
 
Standard Reomte Control Interface - Neometrix
Standard Reomte Control Interface - NeometrixStandard Reomte Control Interface - Neometrix
Standard Reomte Control Interface - Neometrix
Neometrix_Engineering_Pvt_Ltd
 
Student information management system project report ii.pdf
Student information management system project report ii.pdfStudent information management system project report ii.pdf
Student information management system project report ii.pdf
Kamal Acharya
 
Fundamentals of Electric Drives and its applications.pptx
Fundamentals of Electric Drives and its applications.pptxFundamentals of Electric Drives and its applications.pptx
Fundamentals of Electric Drives and its applications.pptx
manasideore6
 
ASME IX(9) 2007 Full Version .pdf
ASME IX(9)  2007 Full Version       .pdfASME IX(9)  2007 Full Version       .pdf
ASME IX(9) 2007 Full Version .pdf
AhmedHussein950959
 
Nuclear Power Economics and Structuring 2024
Nuclear Power Economics and Structuring 2024Nuclear Power Economics and Structuring 2024
Nuclear Power Economics and Structuring 2024
Massimo Talia
 
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&BDesign and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Sreedhar Chowdam
 
Hierarchical Digital Twin of a Naval Power System
Hierarchical Digital Twin of a Naval Power SystemHierarchical Digital Twin of a Naval Power System
Hierarchical Digital Twin of a Naval Power System
Kerry Sado
 
power quality voltage fluctuation UNIT - I.pptx
power quality voltage fluctuation UNIT - I.pptxpower quality voltage fluctuation UNIT - I.pptx
power quality voltage fluctuation UNIT - I.pptx
ViniHema
 
Cosmetic shop management system project report.pdf
Cosmetic shop management system project report.pdfCosmetic shop management system project report.pdf
Cosmetic shop management system project report.pdf
Kamal Acharya
 
DESIGN A COTTON SEED SEPARATION MACHINE.docx
DESIGN A COTTON SEED SEPARATION MACHINE.docxDESIGN A COTTON SEED SEPARATION MACHINE.docx
DESIGN A COTTON SEED SEPARATION MACHINE.docx
FluxPrime1
 
一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理
一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理
一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理
bakpo1
 
ethical hacking-mobile hacking methods.ppt
ethical hacking-mobile hacking methods.pptethical hacking-mobile hacking methods.ppt
ethical hacking-mobile hacking methods.ppt
Jayaprasanna4
 
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
thanhdowork
 
一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理
一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理
一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理
ydteq
 
Architectural Portfolio Sean Lockwood
Architectural Portfolio Sean LockwoodArchitectural Portfolio Sean Lockwood
Architectural Portfolio Sean Lockwood
seandesed
 
AP LAB PPT.pdf ap lab ppt no title specific
AP LAB PPT.pdf ap lab ppt no title specificAP LAB PPT.pdf ap lab ppt no title specific
AP LAB PPT.pdf ap lab ppt no title specific
BrazilAccount1
 

Recently uploaded (20)

NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
 
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
 
Investor-Presentation-Q1FY2024 investor presentation document.pptx
Investor-Presentation-Q1FY2024 investor presentation document.pptxInvestor-Presentation-Q1FY2024 investor presentation document.pptx
Investor-Presentation-Q1FY2024 investor presentation document.pptx
 
H.Seo, ICLR 2024, MLILAB, KAIST AI.pdf
H.Seo,  ICLR 2024, MLILAB,  KAIST AI.pdfH.Seo,  ICLR 2024, MLILAB,  KAIST AI.pdf
H.Seo, ICLR 2024, MLILAB, KAIST AI.pdf
 
Standard Reomte Control Interface - Neometrix
Standard Reomte Control Interface - NeometrixStandard Reomte Control Interface - Neometrix
Standard Reomte Control Interface - Neometrix
 
Student information management system project report ii.pdf
Student information management system project report ii.pdfStudent information management system project report ii.pdf
Student information management system project report ii.pdf
 
Fundamentals of Electric Drives and its applications.pptx
Fundamentals of Electric Drives and its applications.pptxFundamentals of Electric Drives and its applications.pptx
Fundamentals of Electric Drives and its applications.pptx
 
ASME IX(9) 2007 Full Version .pdf
ASME IX(9)  2007 Full Version       .pdfASME IX(9)  2007 Full Version       .pdf
ASME IX(9) 2007 Full Version .pdf
 
Nuclear Power Economics and Structuring 2024
Nuclear Power Economics and Structuring 2024Nuclear Power Economics and Structuring 2024
Nuclear Power Economics and Structuring 2024
 
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&BDesign and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
 
Hierarchical Digital Twin of a Naval Power System
Hierarchical Digital Twin of a Naval Power SystemHierarchical Digital Twin of a Naval Power System
Hierarchical Digital Twin of a Naval Power System
 
power quality voltage fluctuation UNIT - I.pptx
power quality voltage fluctuation UNIT - I.pptxpower quality voltage fluctuation UNIT - I.pptx
power quality voltage fluctuation UNIT - I.pptx
 
Cosmetic shop management system project report.pdf
Cosmetic shop management system project report.pdfCosmetic shop management system project report.pdf
Cosmetic shop management system project report.pdf
 
DESIGN A COTTON SEED SEPARATION MACHINE.docx
DESIGN A COTTON SEED SEPARATION MACHINE.docxDESIGN A COTTON SEED SEPARATION MACHINE.docx
DESIGN A COTTON SEED SEPARATION MACHINE.docx
 
一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理
一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理
一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理
 
ethical hacking-mobile hacking methods.ppt
ethical hacking-mobile hacking methods.pptethical hacking-mobile hacking methods.ppt
ethical hacking-mobile hacking methods.ppt
 
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
 
一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理
一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理
一比一原版(UofT毕业证)多伦多大学毕业证成绩单如何办理
 
Architectural Portfolio Sean Lockwood
Architectural Portfolio Sean LockwoodArchitectural Portfolio Sean Lockwood
Architectural Portfolio Sean Lockwood
 
AP LAB PPT.pdf ap lab ppt no title specific
AP LAB PPT.pdf ap lab ppt no title specificAP LAB PPT.pdf ap lab ppt no title specific
AP LAB PPT.pdf ap lab ppt no title specific
 

Kafka tiered-storage-meetup-2022-final-presented

  • 1. Kafka Tiered Storage Using Azure Blobs Towards running Kafka cost-effectively in Azure Sumant Tambe Sr. Software Engineer, Core Eng, LinkedIn March 4, 2022
  • 2. Agenda 1 Cost of Running Kafka in Azure 2 Tiered Storage 3 KIP-405 Kafka Tiered Storage Architecture 4 Operationalizing Tiered Storage Using Azure Blobs 5 Demo
  • 3. Cost of Running Kafka In Azure • To minimize the migration and schedule risks LinkedIn migrated Kafka to Azure as is* • Kafka is using remote managed disks in LinkedIn Azure deployments • LinkedIn's common Kafka VM in Azure • 64 core general-purpose VM • RAID0 array of remote managed SSDs for throughput • No RAID mirroring needed as managed disk are replicated • Azure SKUs with local disks more expensive than a similar disk-less SKUs with managed disks attached to them • Kafka is throughput-sensitive and not latency-sensitive • The size of the VM local disk is also a limitation. • Cluster size has a cap *https://engineering.linkedin.com/blog/2019/building-next-infra
  • 4. Cost of Running Kafka In Azure • Standard and premium SSDs cost more than managed HDDs as they offer better SLAs • Better bandwidth and P99 latency guarantees • For example, Bursting support in Azure Standard and Premium SSD • The connectivity SLA to the VM improves as they use more expensive disks. • Standard SSDs have transaction costs • Large number of small writes tend to be more expensive than small number of large writes • Costs include VM, disk, transactions, and bursts • Each read/write IO unit up to 256 KB is a billable IOP
  • 5. What is Tiered Storage Kafka storage in two persistent layers--local disk for fresh data and remote store for older data. Remote Store Local Disk Page Cache
  • 6. Kafka Tiered Storage Implementations In The Wild • KIP-405 • Led by Uber • HDFS and S3 (Uber) • Azure Blob Storage (LinkedIn and Microsoft) • possibly more... • Confluent Tiered Storage • Amazon S3, Google Cloud Storage, and Pure Storage FlashBlade (as of version 6.0.0) • Not open-source
  • 7. Why Use Kafka Tiered Storage? Kafka storage in two persistent layers--local disk for fresh data and remote store for older data Pros Cons Use cheaper storage for older data Potentially higher latency for older data Potentially increase retention at lower cost More complex due to additional system dependencies Reduce coupling between broker compute and storage Compacted topics are not supported Facilitate independent scaling of compute and storage Faster cluster maintenance operations. New brokers catch up to ISR much faster due to smaller local tier.
  • 8. KIP-405*: Kafka Tiered Storage Architecture • Kafka Improvement Proposal—KIP-405 • Remote Log Manager • (Pluggable) Remote Storage Manager • Stores data and indexes • (Pluggable) Remote Log Metadata Manager • Stores startOffset, endOffset, etc. Per segment • Manages lifecycle of segments • Kafka internals interact with RemoteLogManager only * Image Source: https://cwiki.apache.org/confluence/display/KAFKA/KIP-405%3A+Kafka+Tiered+Storage Producers Consumers (Oblivious to tiering)
  • 9. KIP-405: Kafka Tiered Storage Architecture • RemoteStorageManager and RemoteLogMetadataManager are separate • This allows stronger consistency guarantees for metadata than data. • The cost of using an object store for metadata may be too high in cloud environment due to frequent r/w access.
  • 10. KIP-405 Topic-Partition Offsets Due to Tiering • ListOffset API now offers • EARLIEST, LATEST, and EARLIEST_LOCAL_TIMESTAMP Local Local Remote Earliest Offset Earliest Local Offset Last Stable Offset Log End Offset Remote Log End Offset Newer Older
  • 11. KIP-405 Pluggable Remote Storage Manager (RSM) This interface provides storing and fetching remote log segment data and indexes • The API and the implementation operate at the segment level • Segment details don’t leak outside Kafka • API • CopyLogSegmentData (copies data and index) • FetchLogSegment • FetchIndex • DeleteLogSegment
  • 12. KIP-405 Pluggable Remote Log Metadata Manager (RLMM) This interface provides storing and fetching remote log segment metadata with strongly consistent semantics. • RemoteLogMetadataManager • Create/Read/Update API for RemoteLogSegmentMetadata Copy Segment Started Copy Segment Finished Delete Segment Started Delete Segment Finished Can read the segment in this state only
  • 13. Basic Remote Log Segment Life-Cycle in the Leader • ReplicaManager interacts with RemoteLogManager • When leader copies the log segments to remote tier and clean up the expired remote log segments • RemoteLogManager interacts with RemoteLogMetadataManager and RemoteStoreManager to change the state of the log segment Remote Log Metadata Manager Remote Storage Manager Add Remote Log Segment Metadata Copy Log Segment Data + index Update Remote Log Segment Metadata Time Time COPY_STARTED COPY_FINISHED Fetch Log Segment + index Fetch Log Segment Fetch Log Segment DELETE_STARTED DELETE_FINISHED Update Remote Log Segment Metadata Update Remote Log Segment Metadata Delete Log Segment Data + index Read 0-N times
  • 14. Data and Metadata Movement in RSM and RLMM 1. Leader copies log segments with the auxiliary state (includes time index, offset index, and producer-id snapshots) to the remote storage. 2. Leader publishes remote log segment metadata about beginning and completion of log segment copy. 3. Follower fetches the messages from the leader. 4. Follower consumes the required remote log segment metadata
  • 15. The TopicBasedRemoteLogMetadataManager (RLMM) Must provide strongly consistent view of the metadata • Read-your-own-writes • All nodes read the metadata updates in the same order (log append order) The internal topic name is __remote_log_metadata High Durability configs • Replication factor = 4 • min.insync.replicas = 2 (tolerate one planned and one unplanned failure in the cluster) • unclean.leader.election.enable=false • Producer sends data with Acks=all • Currently not a Compact topic. Grows indefinitely.
  • 16. Work-In-Progress Remote Storage Manager for Azure Blob Storage
  • 17. Possible Enhancements RemoteLogMetadataManager and RemoteStorageManager • Currently, stateless mapping of RemoteLogSegmentId to remote store container/bucket • The address space of the remote log segments is assumed to be well-known (global) for ALL the segments • However, If multiple storage accounts are needed for throughput or capacity reasons, currently there's no way to carry the "extra" metadata (location, e.g.) • Ideally the "extra" metadata can be opaque: byte[]
  • 18. Operationalizing Kafka Tiered Storage with Azure Blob Storage How to size brokers' local storage? • % of data stored locally • Depends on the maximum time the remote store may be unavailable • (as of this writing) Azure guarantees 99.9% r+w availability of LRS, ZRS, and GRS accounts. Equivalent to exactly one incidence of 8hr:45min downtime in year. • Ensure 9+ hours of local storage headroom all the time. In practice, may be 18 hrs or a day. Determining the local retention time of the individual topics • 0.5 hours is the current proposal Monitoring Tiered Storage • How much local storage is remaining, etc. using Cruise-Control
  • 19. Operationalizing Kafka Tiered Storage with Azure Blob Service Local segment size • Large writes and reads to/from the remote store • Read latency increases. Bootstrapping consumer FetchRequest may expire. Remember, no cache as of this writing • 100MB suggested in Confluent's implementation of tiered storage. 256 MB in LinkedIn without tiering. The __remote_log_metadata topic retention.ms longer than all topic in the cluster • May be compact retention policy?
  • 20. A Quick Demo 1. Produce and Consume 2. Write segments to Azure Blob Storage 3. Fetch from Azure Blob Storag and resume locally 4. Browse Azure Blob Store contents 5. Peek __remote_log_metadata