2
Apps
?
Streaming records + deltas
Search queries
Vector search
Real-time analytics
SQL
3
Apps
Streaming records + deltas
Search queries
Vector search
Real-time analytics
1. Sharding → scalability
Sharding choices and tradeoffs
4
Value-dependent mapping?
Data + indexes together?
no yes
yes no
✅ Opportunities for larger read I/Os
❌ Coordination overheads balloon as ingest latency drops
Can documents change?
yes no
❌ Unable to support most search and analytics apps
Doc sharding
❗ Small read I/Os
✅ Efficient streaming ingest
✅ Consistent indexes
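As a rough illustration of doc sharding, the sketch below places each document by its ID alone, so the placement does not depend on field values and a document's data and index entries can stay on one shard. The names `shard_for`, `route`, and `NUM_SHARDS` are hypothetical and are not taken from the talk.

```python
import hashlib
from typing import Dict, Tuple

NUM_SHARDS = 8  # hypothetical shard count, chosen only for this example


def shard_for(doc_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a document ID to a shard, ignoring the document's field values."""
    digest = hashlib.sha256(doc_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def route(doc_id: str, fields: Dict[str, object]) -> Tuple[int, Dict[str, object]]:
    """Return (shard, fields); updates to a document always land on the same shard."""
    return shard_for(doc_id), fields
```

Because the mapping is not value-dependent, changing a field never moves the document, which is one way to read the "efficient streaming ingest" and "consistent indexes" points on the slide.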
5
Apps
Streaming records + deltas
Search queries
Vector search
Real-time analytics
1. Doc sharding → scalability + streaming ingest
6
2. Isolation between ingest and query work
7
2. Post-ingest replication → isolation + elasticity
App B
App A
Leader
Follower
8
3. Disaggregated storage → efficiency + elasticity
App B
App A
What technology for disaggregated storage?
9
Cold (AWS S3)
👍 Cheapest $/GB
👍 Highly durable
👍 Built-in RPC API
👎 High/unpredictable latency
👎 Expensive $/IOPS
Hot (EBS or NVMe)
👍 Cheapest $/IOPS
👍 Low latency
👍+👎 Build/run your own RPC service
👎 More expensive $/GB
Cloud-native search + analytics
1. Doc sharding with indexes – Converged indexing
→ Scalability
→ Streaming ingest
2. Post-ingest replication – Compute:compute separation
→ Isolation
→ Compute elasticity
3. Disaggregated hot storage – Compute:storage separation
→ High disk utilization
→ Compute elasticity
→ Storage elasticity
10
RocksDB replication at Rockset
Shared
hot storage
1 Rockset shard ≈ 1 RocksDB instance
Data stream
Rockset stores data for each shard in RocksDB
Writes are flushed to storage after the memtable is full
SSD
Query
Ingest
App
Memtable
12
Document:RocksDB-key mapping is M:N
13
Shared
hot storage
Ingest turns logical update into physical deltas
Data stream
Ingesting one document applies a delta to many RocksDB keys
Deltas are merged lazily by RocksDB
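To make the "one document, many keys" point concrete, here is a minimal sketch that fans a single document out into row, column, and inverted-index style entries. The key layout (`R/`, `C/`, `S/` prefixes) and the function name `document_to_deltas` are assumptions made for illustration; they are not Rockset's actual key encoding.

```python
from typing import Dict, List, Tuple


def document_to_deltas(doc_id: str, doc: Dict[str, object]) -> List[Tuple[str, str]]:
    """Turn one logical document update into several physical key/value writes."""
    deltas = []
    for field, value in sorted(doc.items()):
        # Row-style entry: all fields of a document sort next to each other.
        deltas.append((f"R/{doc_id}/{field}", repr(value)))
        # Column-style entry: all values of a field sort next to each other.
        deltas.append((f"C/{field}/{doc_id}", repr(value)))
        # Inverted-index-style entry: the key itself says which doc has this value.
        deltas.append((f"S/{field}/{value!r}/{doc_id}", ""))
    return deltas
```

With this layout a document with two fields produces six writes, which is the kind of write amplification the slide attributes to ingest.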
SSD
Query
Ingest
App
Memtable
14
Shared
hot storage
Fine-grained RocksDB replication
Data stream App
Leader/follower replication makes fresh data available in all RocksDB instances
● Replication stream sends data and metadata changes
● Applying memtable updates takes 6× to 10× less CPU than ingest
● Followers don’t run compaction
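The bullets above can be sketched as a toy leader/follower pair: the leader does the expensive work of turning a document into key/value updates, and followers only apply the already-computed updates in sequence order. The classes `Leader` and `Follower`, the sequence-number check, and the key format are all hypothetical; this is not the RocksDB or Rockset replication API.

```python
class Follower:
    """Applies pre-computed memtable updates; does no parsing or indexing."""

    def __init__(self):
        self.memtable = {}
        self.applied_seq = 0

    def apply(self, seq, updates):
        # Updates must arrive in order so the follower mirrors the leader.
        if seq != self.applied_seq + 1:
            raise ValueError("gap in replication stream")
        self.memtable.update(updates)
        self.applied_seq = seq


class Leader:
    """Ingests documents and forwards the resulting key/value updates."""

    def __init__(self, followers):
        self.memtable = {}
        self.seq = 0
        self.followers = list(followers)

    def ingest(self, doc_id, doc):
        # Stand-in for the expensive step (parsing, index key generation).
        updates = {f"{field}/{doc_id}": value for field, value in doc.items()}
        self.seq += 1
        self.memtable.update(updates)
        for follower in self.followers:
            follower.apply(self.seq, updates)
```

The design point the sketch tries to capture is that followers never repeat the ingest computation, which is consistent with the lower CPU cost quoted on the slide.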
SSD
Optional
Query
Ingest Query
App
Memtable Memtable
15
Shared
hot storage
Compute:compute separation for vector search
Data stream App
IVF ANN (inverted file approximate nearest neighbor)
● Leader periodically builds Voronoi cell decomposition of vector space
● Cell membership can be updated in real-time
● Follower queries are executed the same as an inverted index lookup
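A minimal sketch of the IVF idea, assuming the centroids have already been computed (on the slide, that is the leader's periodic job). Vectors are assigned to the cell of their nearest centroid, and a query scans only the `nprobe` closest cells. The class `IVFIndex` and its parameters are hypothetical names for this example.

```python
import math


class IVFIndex:
    """Toy inverted-file index: cell ID -> {doc_id: vector}."""

    def __init__(self, centroids):
        self.centroids = list(centroids)
        self.cells = {i: {} for i in range(len(self.centroids))}

    def _cells_by_distance(self, vec):
        # Cell IDs ordered from nearest centroid to farthest.
        return sorted(
            range(len(self.centroids)),
            key=lambda i: math.dist(vec, self.centroids[i]),
        )

    def add(self, doc_id, vec):
        # Cell membership is a cheap per-vector update.
        self.cells[self._cells_by_distance(vec)[0]][doc_id] = vec

    def search(self, query, k=1, nprobe=1):
        # Look up the posting lists of the nearest cells, then rank candidates.
        candidates = []
        for cell in self._cells_by_distance(query)[:nprobe]:
            candidates.extend(self.cells[cell].items())
        candidates.sort(key=lambda item: math.dist(query, item[1]))
        return [doc_id for doc_id, _ in candidates[:k]]
```

The search path is a keyed lookup of a few cells followed by a scan, which is why it can be executed much like an inverted index lookup. Results are approximate: a true neighbor sitting in an unprobed cell is missed.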
SSD
IVF
partitioning
Ingest Query
Memtable Memtable
16
IVF
cells
Shared hot storage
RocksDB is a log-structured merge tree (LSM)
1. Writes are buffered in RAM in a “memtable”
2. Many megabytes of values are written to a new file at once
3. Files are immutable, and store keys in sorted order
4. “Compaction” creates new files by merging several old files, making things more sorted and removing duplicates
Big async writes
Disk
Memory
18
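The four steps of the LSM slide can be sketched in a few lines. This toy `TinyLSM` class is hypothetical and leaves out everything that makes RocksDB practical (write-ahead log, levels, bloom filters, deletes); it only shows buffering, flushing to immutable sorted files, and compaction.

```python
class TinyLSM:
    def __init__(self, memtable_limit=2):
        self.memtable = {}   # step 1: writes are buffered in memory
        self.files = []      # immutable sorted files, oldest first
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Steps 2 and 3: write the whole memtable at once as a sorted, immutable file.
        if self.memtable:
            self.files.append(tuple(sorted(self.memtable.items())))
            self.memtable = {}

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        # Newer files shadow older ones. A real store would binary-search each file.
        for sorted_file in reversed(self.files):
            for k, v in sorted_file:
                if k == key:
                    return v
        return None

    def compact(self):
        # Step 4: merge old files into one new file, keeping the newest value per key.
        merged = {}
        for sorted_file in self.files:  # oldest to newest, so newer values win
            merged.update(sorted_file)
        self.files = [tuple(sorted(merged.items()))] if merged else []
```

Note that a flush is one large sequential write, while `get` may touch several files with small reads, which matches the "big writes, small reads" framing later in the deck.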
Rockset uses indexes to accelerate queries
Query
fragment
Cost-based
optimizer
Column scan
Index lookup
Filter
Fetch other cols in matching rows
● Large reads
● Bandwidth limited
● Small reads
● Latency limited
● IOPS limited
19
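As a back-of-the-envelope sketch of the choice the optimizer faces, the function below compares a bandwidth-limited column scan with an IOPS-limited index lookup. The cost formulas, the one-read-per-matching-row assumption, and the name `choose_access_path` are simplifications invented for this example; the talk does not describe the actual cost model.

```python
def choose_access_path(total_rows, matching_rows, bytes_per_value,
                       bandwidth_bytes_per_s, iops):
    """Pick the cheaper plan under a deliberately crude cost model."""
    # Column scan: read every value of the column sequentially.
    scan_seconds = total_rows * bytes_per_value / bandwidth_bytes_per_s
    # Index lookup: assume one small random read per matching row.
    lookup_seconds = matching_rows / iops
    return "index lookup" if lookup_seconds < scan_seconds else "column scan"
```

Under this model a highly selective filter favors the index path, while a filter that matches most rows favors the scan, which is the trade-off the two branches of the slide describe.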
Big writes
Small reads
Write to S3
Read from SSD
20
Shared
hot storage
Write to S3 + read from SSD
Data stream App
● (Big) writes go to AWS S3
○ High BW
○ Ensures durability
● Reads go to 1-copy SSD
○ Low latency
○ High IOPS
○ Efficient small reads
○ High space utilization
SSD
Query
Ingest Query
App
S3
Memtable Memtable
21
Challenge: Cache miss to S3 can be 1,000× slower
● Cold misses
○ Synchronous prefetch on file creation
○ Prefetch from periodic S3 list
● Capacity misses
○ Cluster auto-scaling
● Software restart for upgrades
○ Dual-head serving during rolls
● Cluster resizing
○ Double-reading with rendezvous hashing
● Failure recovery
○ Whole-cluster recovery with rendezvous hashing
22
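Rendezvous (highest-random-weight) hashing, mentioned twice on the slide above, can be sketched as follows: every node gets a pseudo-random score per key, and the highest score owns the key. The helper `owners_during_resize` is one possible reading of "double-reading" (try the owner under the new node list, then the owner under the old one); both function names and that interpretation are assumptions, not details confirmed by the talk.

```python
import hashlib


def _score(node, key):
    digest = hashlib.sha256(f"{node}:{key}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def owner(key, nodes):
    """Return the node with the highest score for this key."""
    return max(nodes, key=lambda node: _score(node, key))


def owners_during_resize(key, old_nodes, new_nodes):
    """Nodes to try while a resize is in flight: new owner first, then old owner."""
    candidates = [owner(key, new_nodes), owner(key, old_nodes)]
    # Drop the duplicate when the resize did not move this key.
    return list(dict.fromkeys(candidates))
```

A useful property is that adding a node only moves the keys the new node wins, and removing a node only moves the keys it owned, so most cached files stay where they are during a resize.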
IT HAS BEEN 6 DAYS SINCE THE LAST CACHE MISS
Rockset hot storage is a near-perfect S3 cache
● Cold misses
○ Synchronous prefetch on file creation
○ Prefetch from periodic S3 list
● Capacity misses
○ Cluster auto-scaling
● Software restart for upgrades
○ Dual-head serving during rolls
● Cluster resizing
○ Second-chance reads with recent old configs
● Failure recovery
○ Whole-cluster recovery with rendezvous hashing
99.9999% cache hit rate
https://rockset.com/blog/separate-compute-storage-rocksdb/
23
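To see why the hit rate matters so much when a miss can be 1,000× slower, the small calculation below expresses mean read cost in units of one SSD hit. The function name `mean_read_cost` is hypothetical, and treating every miss as exactly 1,000× a hit is a simplification of the slide's "can be 1,000× slower".

```python
def mean_read_cost(hit_rate, miss_penalty=1000.0):
    """Average cost per read, where a cache hit costs 1 and a miss costs miss_penalty."""
    return hit_rate * 1.0 + (1.0 - hit_rate) * miss_penalty
```

At a 99.9999% hit rate the average works out to about 1.001 hit-units, so misses add roughly 0.1% to mean read cost. At a 99% hit rate the same formula gives about 10.99, an order of magnitude worse.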
Rockset
● Real-time indexing
● Cloud-native efficiency
● Full-featured SQL
24
rockset.com/index-conf

Isolating Streaming Ingest and Queries Using RocksDB
