Demystifying the Distributed Database Landscape
A survey of technologies in 2021
Peter Corless
Director of Technical Advocacy
ScyllaDB
+ Listen to & share user stories
+ Write blogs & case studies
+ Play (and design) strategy & roleplaying games
3
Distributed Database Landscape 2021
SQL
+ Distributed SQL
+ NewSQL
NoSQL
+ Key-value
+ Document
+ Wide-column
+ Graph
Multi-model
+ SQL + NoSQL
+ Multiple NoSQL
Production Environments
+ On-premises
+ Co-location
+ Public cloud
+ Private cloud
+ Hybrid cloud
+ Multicloud
+ Edge
+ IoT / Embedded
Business / Use Models
+ Open Source License
+ Enterprise License
+ OEM License
+ Service Agreements
Use Cases
+ OLTP
+ OLAP
+ HTAP
+ Time Series
4
This Next Tech Cycle
The wave of innovation we’re currently riding.
Hardware, software, and methodologies are all co-evolving to create this next tech cycle.
5
This Next Tech Cycle
Timeline: 2000, 2010, 2020, 2025+
Transistor Count and Core Count
+ Pentium 4 (2000): 42M transistors, 1 core
+ Pentium D (2005): 228M transistors, 2 cores
+ Xeon Nehalem-EX (2010): 2.3B transistors, 8 cores
+ SPARC M7 (2015): 10B transistors, 32 cores
+ Epyc Rome (2019): 39B transistors, 64 cores
+ Epyc Genoa (2022): ~60B? transistors, 96 cores
+ Epyc Bergamo (2023): ~80B? transistors, 128 cores
Zettabyte Era (1 ZB = 10^21 bytes)
+ 2 ZB data stored (2010)
+ 1.2 ZB IP traffic (2016)
+ 64 ZB data stored (2020)
+ ~180 ZB data stored (2025)
Broadband Speeds
+ 1.5 Mbps (2002)
+ 16 Mbps (2008)
+ 105 Mbps (2014)
+ 1 Gbps (2018)
+ 3 Gbps (2021)
Wireless Services
+ 3G (2002)
+ 4G (2014)
+ 5G (2018)
Public Cloud
+ AWS (2006)
+ GCP (2008)
+ Azure (2010)
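To put the data-stored figures from the timeline in perspective, the implied compound annual growth rate can be worked out directly. This is a back-of-the-envelope sketch, not part of the original slides; the helper name `cagr` is just an illustration, and the 2025 value is the slide's own projection.

```python
def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate between two values `years` apart."""
    return (end / start) ** (1 / years) - 1


# Figures taken from the timeline: 2 ZB stored (2010), 64 ZB (2020), ~180 ZB (2025, projected).
stored_2010_to_2020 = cagr(2, 64, 10)
stored_2020_to_2025 = cagr(64, 180, 5)

print(f"2010-2020: {stored_2010_to_2020:.1%} per year")
print(f"2020-2025: {stored_2020_to_2025:.1%} per year (projected)")
```

A 32x increase over ten years works out to roughly 41% growth per year, since 32^(1/10) is the square root of 2.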
7
Hardware Still Vertically Scaling
+ Compute
+ From >100 cores to >1,000 cores per server
+ From multicore CPUs → full System on a Chip (SoC) designs (CPU, GPU, Cache, Memory)
+ Memory
+ Terabyte-scale RAM per server
+ DDR5 — 4600 MHz in 2020, 8000 MHz by 2024
+ DDR6 — 9600 MHz by 2025
+ Persistent memory — memory mode
+ Storage
+ Petabyte-scale storage per server
+ NVMe 2.0 [2021] — separation of base and transport
+ Persistent memory — app direct (storage) mode
8
Methodologies Still Evolving
+ Agile [c. 2000]
+ CI/CD = CI [1991] + CD [2009]
+ DevOps [2009]
+ Chaos Monkey [2011]
+ Kubernetes [2014]
+ GitOps [2017]
+ DevSecOps [2018]
How It Started
How It’s Going
How It Evolved
9
Hybrid & Multicloud is Now-ish
10
Poll Question
How much data do you have under management in your own transactional database systems?
+ <1 terabyte
+ 1 to 50 terabytes
+ 50-100 terabytes
+ >100 terabytes
11
The Distributed Database Landscape
Here there be monstrous databases!
12
DB-Engines.com
+ 381 databases
+ Some are distributed databases
+ Others are not distributed databases
+ Some are SQL
+ Some are NoSQL
+ Some support both SQL + NoSQL
+ Some support multiple NoSQL types
+ Some are… not easily classifiable
+ A huge industry with some well-known names
+ But popularity (by itself) ≠ fitness for your use case
13
Top 100 Databases
+ Narrowing field helps scope analysis
+ Still results in wide variety of databases
+ Many SQL
+ Many NoSQL
+ ScyllaDB is in the Top 100!
14
Top 100 Databases
(and Database-like systems)
on DB-Engines.com
[as of November 2021]
+ 49 SQL
+ 32 NoSQL
+ 5 Both SQL + NoSQL
+ 5 Search Engines
+ 6 Time Series
+ 3 Others
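As a quick sanity check on the breakdown above, the six categories can be tallied and expressed as shares of the Top 100. The counts are the ones listed on the slide; the dictionary name is arbitrary.

```python
# Category counts as listed for the DB-Engines Top 100 (November 2021).
top_100 = {
    "SQL": 49,
    "NoSQL": 32,
    "Both SQL + NoSQL": 5,
    "Search Engines": 5,
    "Time Series": 6,
    "Others": 3,
}

total = sum(top_100.values())  # should account for all 100 entries

for category, count in top_100.items():
    print(f"{category:<18}{count:>4}{count / total:>7.0%}")
print(f"{'Total':<18}{total:>4}")
```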
Top 100 Databases
Are these all really
distributed databases?
15
16
“Well…”
17
What do you mean by a “Distributed Database”?
+ Clustering & Distribution Strategies
+ Local clustering — multiple nodes in the same datacenter share updates
+ Cross-cluster updates — multiple clusters can share data between them
+ Multi-datacenter clustering — geographically, even globally dispersed, but same logical cluster
+ Node Roles, High Availability & Failover Strategies
+ Primary-replica (Active-passive; writes to primary only; read-only replicas; “hot standby” modes)
+ Peer-to-peer, leaderless (Active-Active, multi primaries; can write to any replica; no SPOF)
+ Load balancing (client side or service in front of database)
+ Data Replication & Sharding Strategies
+ Replication Factors & Consistency Levels
+ Horizontal Scalability: Manual vs. Auto-sharding
+ Topology Awareness: Rack-awareness, Datacenter-awareness
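The interplay between replication factors and consistency levels can be made concrete with the standard quorum-overlap rule: with RF replicas, a read that contacts R replicas is guaranteed to overlap a write acknowledged by W replicas whenever R + W > RF. The sketch below is illustrative only; the function names are hypothetical and are not taken from any of the databases discussed in this deck.

```python
def quorum(replication_factor: int) -> int:
    """Smallest majority of replicas: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def reads_see_latest_write(rf: int, write_acks: int, read_acks: int) -> bool:
    """True when every read set must overlap every write set (R + W > RF)."""
    return read_acks + write_acks > rf


rf = 3
q = quorum(rf)  # 2 of 3 replicas

# Quorum writes combined with quorum reads always overlap on at least one replica.
print(reads_see_latest_write(rf, write_acks=q, read_acks=q))  # True

# Writing to one replica and reading from one replica may miss the latest write.
print(reads_see_latest_write(rf, write_acks=1, read_acks=1))  # False
```

Lower settings trade that guarantee for latency and availability, which is why tunable systems expose the choice per operation.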
18
The Short List: Systems of Interest
SQL + NewSQL
+ PostgreSQL
+ CockroachDB
NoSQL
+ MongoDB
+ Redis
+ ScyllaDB
19
PostgreSQL — distributed SQL
+ Clustering & Distribution Strategies
+ Local clustering — multiple nodes in the same datacenter share updates
+ Cross-cluster updates — multiple clusters can share data between them
+ Multi-datacenter clustering — geographically, even globally dispersed, but same logical cluster
+ Node Roles, High Availability & Failover Strategies
+ Primary-replica (Active-passive; writes to primary only; read-only replicas; “hot standby” modes)
+ Peer-to-peer, leaderless (Active-Active, multi primaries; can write to any replica; no SPOF)
+ Load balancing (client side or service in front of database)
+ Data Replication & Sharding Strategies
+ Replication Factors & Consistency Levels
+ Horizontal Scalability: Manual Sharding vs. Auto-sharding
+ Topology Awareness: Rack-awareness, Datacenter-awareness
Part of base offering
Can be added, but not part of base
20
CockroachDB — NewSQL
+ Clustering & Distribution Strategies
+ Local clustering — multiple nodes in the same datacenter share updates
+ Cross-cluster updates — multiple clusters can share data between them
+ Multi-datacenter clustering — geographically, even globally dispersed, but same logical cluster
+ Node Roles, High Availability & Failover Strategies
+ Primary-replica (Active-passive; writes to primary only; read-only replicas; “hot standby” modes)
+ Peer-to-peer, leaderless (Active-Active, multi primaries; can write to any replica; no SPOF)
+ Load balancing (client side or service in front of database)
+ Data Replication & Sharding Strategies
+ Replication Factors & Consistency Levels
+ Horizontal Scalability: Manual vs. Auto-sharding
+ Topology Awareness: Rack-awareness*, Datacenter-awareness
* Can be manually configured using localities
Part of base offering
Can be added, but not part of base
21
MongoDB — the leading document store
+ Clustering & Distribution Strategies
+ Local clustering — multiple nodes in the same datacenter share updates
+ Cross-cluster updates — multiple clusters can share data between them
+ Multi-datacenter clustering — geographically, even globally dispersed, but same logical cluster
+ Node Roles, High Availability & Failover Strategies
+ Primary-replica (Active-passive; writes to primary only; read-only replicas; “hot standby” modes)
+ Peer-to-peer, leaderless (Active-Active, multi primaries; can write to any replica; no SPOF)
+ Load balancing (client side or service in front of database)
+ Data Replication & Sharding Strategies
+ Replication Factors & Consistency Levels
+ Horizontal Scalability: Manual vs. Auto-sharding
+ Topology Awareness: Rack-awareness, Datacenter-awareness
Part of base offering
Can be added, but not part of base
22
Redis — key-value in-memory DB/cache
+ Clustering & Distribution Strategies
+ Local clustering — multiple nodes in the same datacenter share updates
+ Cross-cluster updates — multiple clusters can share data between them
+ Multi-datacenter clustering — geographically, even globally dispersed, but same logical cluster*
+ Node Roles, High Availability & Failover Strategies
+ Primary-replica (Active-passive; writes to primary only; read-only replicas; “hot standby” modes)
+ Peer-to-peer, leaderless (Active-Active, multi primaries; can write to any replica; no SPOF)*
+ Load balancing (client side or service in front of database)
+ Data Replication & Sharding Strategies
+ Replication Factors & Consistency Levels (e.g., strong locally; causal consistency in active-active*)
+ Horizontal Scalability: Manual vs. Auto-sharding
+ Topology Awareness: Rack-awareness, Datacenter-awareness
* Redis Enterprise feature
Part of base offering
Can be added, but not part of base
23
ScyllaDB
+ Clustering & Distribution Strategies
+ Local clustering — multiple nodes in the same datacenter share updates
+ Cross-cluster updates — multiple clusters can share data between them
+ Multi-datacenter clustering — geographically, even globally dispersed, but same logical cluster
+ Node Roles, High Availability & Failover Strategies
+ Primary-replica (Active-passive; writes to primary only; read-only replicas; “hot standby” modes)
+ Peer-to-peer, leaderless (Active-Active, multi primaries; can write to any replica; no SPOF)
+ Load balancing (client side or service in front of database*)
+ Data Replication & Sharding Strategies
+ Replication Factors & Consistency Levels
+ Horizontal Scalability: Manual vs. Auto-sharding
+ Topology Awareness: Rack-awareness, Datacenter-awareness
Part of base offering
* For DynamoDB-compatible API
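The "auto-sharding" and "rack-awareness" rows that recur in the comparisons above can be illustrated generically. The following is a toy consistent-hash ring that places replicas on distinct racks where possible. It is a sketch under stated assumptions, not how any of the five systems actually implements placement: the `Ring` class, the `token` helper, the node names, and the rack names are all hypothetical.

```python
import bisect
import hashlib


def token(key: str) -> int:
    """Map a string to a ring position (stable across runs and machines)."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class Ring:
    """Toy consistent-hash ring with rack-aware replica selection."""

    def __init__(self, nodes: dict[str, str], vnodes: int = 16):
        self.rack_of = nodes  # node name -> rack name
        # Each node owns several virtual positions to smooth the distribution.
        self.ring = sorted(
            (token(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self.tokens = [t for t, _ in self.ring]

    def replicas(self, key: str, rf: int) -> list[str]:
        """Walk clockwise from the key's token, preferring racks not yet used."""
        start = bisect.bisect_right(self.tokens, token(key))
        ordered: list[str] = []
        for i in range(len(self.ring)):
            node = self.ring[(start + i) % len(self.ring)][1]
            if node not in ordered:
                ordered.append(node)
        chosen: list[str] = []
        racks: set[str] = set()
        for node in ordered:  # first pass: at most one replica per rack
            if len(chosen) < rf and self.rack_of[node] not in racks:
                chosen.append(node)
                racks.add(self.rack_of[node])
        for node in ordered:  # second pass: fill up if there are fewer racks than rf
            if len(chosen) < rf and node not in chosen:
                chosen.append(node)
        return chosen


ring = Ring({"n1": "rack-a", "n2": "rack-a", "n3": "rack-b", "n4": "rack-c"})
print(ring.replicas("user:42", rf=3))  # three nodes, one per rack
```

With manual sharding, the application or operator maintains the key-to-node mapping by hand; with a scheme like this, placement follows from the hash, so adding a node only moves the keys adjacent to its tokens.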
24
But for now, let’s move on...
25
Where are Distributed Databases Headed Next?
Time to read the tea leaves
26
The Trend for SQL
+ Google Trends interest in “SQL” is at 25% of its 2004 level
+ Book citations for “SQL” peaked in 2008 and were down to 28% of that rate by 2019
+ Back to 1994 levels of interest, basically
+ Still dwarfs other database terms like “NoSQL” or “NewSQL” or “RDBMS”
+ No single term or technology sums up the distributed database market anymore
27
Where are Distributed Databases Going?
+ Cambrian Explosion will Continue — “What is a database anyway?”
+ Distributed Databases of all kinds
+ Distributed Streaming — “Kafka as a database?” (kSQL says “Yes!”)
+ Distributed Ledgers — “Blockchains/DAGs as a database?”
+ Further fragmentation of the market
+ NoSQL + SQL increasingly blending
+ Evolution of NoSQL back to SQL assumptions
+ Adding back Strong Consistency, Schema Constraints, Strict Typing
28
Further Trends in Distributed Databases
+ Elasticity — Faster provisioning/decommissioning, autoscaling
+ Uncoupling Compute from Storage — Tiered Storage, Plug-in Storage
+ Data over Time
+ Built for Event Streaming, Time Series
+ Data over Space
+ Geospatial queries, Geoindexing
+ Geographic / political boundaries — GDPR, data localization regulatory compliance
29
Database as a Development Platform
+ Increasing Focus on Developer Enablement and Developer Experience (DX)
+ APIs for extensibility: extensions, plugins, modules, add-ons, integration layers
+ Database Specific: PostgreSQL extensions, Redis modules
+ Cross-industry: GraphQL, OpenAPI (Swagger), etc.
+ AI/ML integration and incorporation into databases
+ “Building models where your data resides” — Martin Heller (Apr 2021)
+ Amazon Redshift ML
+ BigQuery ML
+ Oracle, Db2, Microsoft SQL Server
30
Data Teaming
+ Tighter Coupling of Data Engineering + Data Sciences + Operations
+ Repairing rifts of the past decade
+ Bridging huge divides between people and systems
+ From “Data Pipelining” (production-oriented) to...
+ “Data Supply Chains” (consumption-oriented)
+ Like “Software Supply Chain,” but for data and data products.
31
We Need Different Kinds of “Easy”
+ Specializing databases to run in the cloud (and cloud-only)
+ Providing “concierge” services
+ Ecosystem: can integrate into cloud vendor’s (or partners’) offerings
+ Managed for you — at a price
+ Making Open Source databases easier to run at the infrastructure level
+ Making self-managed operations simpler
+ Flexibility: can run on premises or in the cloud
+ Self-service model — so long as you have the skillz
32
Hope You Enjoyed Your Trip!
http://slack.scylladb.com/
33
Thanks
People who helped educate me
+ Kostja Osipov
+ Serge Leontiev
Disclaimer
Any errors, omissions, misinterpretations, misrepresentations or misunderstandings are purely my own.
Please send suggestions and corrections to peter@scylladb.com
Q&A
34
United States
2445 Faber St, Suite #200
Palo Alto, CA USA 94303
Israel
Maskit 4
Herzliya, Israel 4673304
www.scylladb.com
@scylladb
Learn NoSQL for free!
university.scylladb.com
@petercorless

More Related Content

What's hot

The True Cost of NoSQL DBaaS Options
The True Cost of NoSQL DBaaS OptionsThe True Cost of NoSQL DBaaS Options
The True Cost of NoSQL DBaaS OptionsScyllaDB
 
Numberly on Joining Billions of Rows in Seconds: Replacing MongoDB and Hive w...
Numberly on Joining Billions of Rows in Seconds: Replacing MongoDB and Hive w...Numberly on Joining Billions of Rows in Seconds: Replacing MongoDB and Hive w...
Numberly on Joining Billions of Rows in Seconds: Replacing MongoDB and Hive w...ScyllaDB
 
Eliminating Volatile Latencies Inside Rakuten’s NoSQL Migration
Eliminating  Volatile Latencies Inside Rakuten’s NoSQL MigrationEliminating  Volatile Latencies Inside Rakuten’s NoSQL Migration
Eliminating Volatile Latencies Inside Rakuten’s NoSQL MigrationScyllaDB
 
TechTalk: Reduce Your Storage Footprint with a Revolutionary New Compaction S...
TechTalk: Reduce Your Storage Footprint with a Revolutionary New Compaction S...TechTalk: Reduce Your Storage Footprint with a Revolutionary New Compaction S...
TechTalk: Reduce Your Storage Footprint with a Revolutionary New Compaction S...ScyllaDB
 
Building Event Streaming Architectures on Scylla and Kafka
Building Event Streaming Architectures on Scylla and KafkaBuilding Event Streaming Architectures on Scylla and Kafka
Building Event Streaming Architectures on Scylla and KafkaScyllaDB
 
Introducing Project Alternator - Scylla’s Open-Source DynamoDB-compatible API
Introducing Project Alternator - Scylla’s Open-Source DynamoDB-compatible APIIntroducing Project Alternator - Scylla’s Open-Source DynamoDB-compatible API
Introducing Project Alternator - Scylla’s Open-Source DynamoDB-compatible APIScyllaDB
 
Workshop - How to benchmark your database
Workshop - How to benchmark your databaseWorkshop - How to benchmark your database
Workshop - How to benchmark your databaseScyllaDB
 
How to Build a Scylla Database Cluster that Fits Your Needs
How to Build a Scylla Database Cluster that Fits Your NeedsHow to Build a Scylla Database Cluster that Fits Your Needs
How to Build a Scylla Database Cluster that Fits Your NeedsScyllaDB
 
ScyllaDB: What could you do with Cassandra compatibility at 1.8 million reque...
ScyllaDB: What could you do with Cassandra compatibility at 1.8 million reque...ScyllaDB: What could you do with Cassandra compatibility at 1.8 million reque...
ScyllaDB: What could you do with Cassandra compatibility at 1.8 million reque...Data Con LA
 
The Do’s and Don’ts of Benchmarking Databases
The Do’s and Don’ts of Benchmarking DatabasesThe Do’s and Don’ts of Benchmarking Databases
The Do’s and Don’ts of Benchmarking DatabasesScyllaDB
 
WEBINAR - Introducing Scylla Open Source 3.0: Materialized Views, Secondary I...
WEBINAR - Introducing Scylla Open Source 3.0: Materialized Views, Secondary I...WEBINAR - Introducing Scylla Open Source 3.0: Materialized Views, Secondary I...
WEBINAR - Introducing Scylla Open Source 3.0: Materialized Views, Secondary I...ScyllaDB
 
FireEye & Scylla: Intel Threat Analysis Using a Graph Database
FireEye & Scylla: Intel Threat Analysis Using a Graph DatabaseFireEye & Scylla: Intel Threat Analysis Using a Graph Database
FireEye & Scylla: Intel Threat Analysis Using a Graph DatabaseScyllaDB
 
Powering a Graph Data System with Scylla + JanusGraph
Powering a Graph Data System with Scylla + JanusGraphPowering a Graph Data System with Scylla + JanusGraph
Powering a Graph Data System with Scylla + JanusGraphScyllaDB
 
How to achieve no compromise performance and availability
How to achieve no compromise performance and availabilityHow to achieve no compromise performance and availability
How to achieve no compromise performance and availabilityScyllaDB
 
Cisco: Cassandra adoption on Cisco UCS & OpenStack
Cisco: Cassandra adoption on Cisco UCS & OpenStackCisco: Cassandra adoption on Cisco UCS & OpenStack
Cisco: Cassandra adoption on Cisco UCS & OpenStackDataStax Academy
 
Scylla Summit 2016: Analytics Show Time - Spark and Presto Powered by Scylla
Scylla Summit 2016: Analytics Show Time - Spark and Presto Powered by ScyllaScylla Summit 2016: Analytics Show Time - Spark and Presto Powered by Scylla
Scylla Summit 2016: Analytics Show Time - Spark and Presto Powered by ScyllaScyllaDB
 
Critical Attributes for a High-Performance, Low-Latency Database
Critical Attributes for a High-Performance, Low-Latency DatabaseCritical Attributes for a High-Performance, Low-Latency Database
Critical Attributes for a High-Performance, Low-Latency DatabaseScyllaDB
 
Addressing the High Cost of Apache Cassandra
Addressing the High Cost of Apache CassandraAddressing the High Cost of Apache Cassandra
Addressing the High Cost of Apache CassandraScyllaDB
 
Scylla Summit 2016: Graph Processing with Titan and Scylla
Scylla Summit 2016: Graph Processing with Titan and ScyllaScylla Summit 2016: Graph Processing with Titan and Scylla
Scylla Summit 2016: Graph Processing with Titan and ScyllaScyllaDB
 
Cassandra Community Webinar: Apache Spark Analytics at The Weather Channel - ...
Cassandra Community Webinar: Apache Spark Analytics at The Weather Channel - ...Cassandra Community Webinar: Apache Spark Analytics at The Weather Channel - ...
Cassandra Community Webinar: Apache Spark Analytics at The Weather Channel - ...DataStax Academy
 

What's hot (20)

The True Cost of NoSQL DBaaS Options
The True Cost of NoSQL DBaaS OptionsThe True Cost of NoSQL DBaaS Options
The True Cost of NoSQL DBaaS Options
 
Numberly on Joining Billions of Rows in Seconds: Replacing MongoDB and Hive w...
Numberly on Joining Billions of Rows in Seconds: Replacing MongoDB and Hive w...Numberly on Joining Billions of Rows in Seconds: Replacing MongoDB and Hive w...
Numberly on Joining Billions of Rows in Seconds: Replacing MongoDB and Hive w...
 
Eliminating Volatile Latencies Inside Rakuten’s NoSQL Migration
Eliminating  Volatile Latencies Inside Rakuten’s NoSQL MigrationEliminating  Volatile Latencies Inside Rakuten’s NoSQL Migration
Eliminating Volatile Latencies Inside Rakuten’s NoSQL Migration
 
TechTalk: Reduce Your Storage Footprint with a Revolutionary New Compaction S...
TechTalk: Reduce Your Storage Footprint with a Revolutionary New Compaction S...TechTalk: Reduce Your Storage Footprint with a Revolutionary New Compaction S...
TechTalk: Reduce Your Storage Footprint with a Revolutionary New Compaction S...
 
Building Event Streaming Architectures on Scylla and Kafka
Building Event Streaming Architectures on Scylla and KafkaBuilding Event Streaming Architectures on Scylla and Kafka
Building Event Streaming Architectures on Scylla and Kafka
 
Introducing Project Alternator - Scylla’s Open-Source DynamoDB-compatible API
Introducing Project Alternator - Scylla’s Open-Source DynamoDB-compatible APIIntroducing Project Alternator - Scylla’s Open-Source DynamoDB-compatible API
Introducing Project Alternator - Scylla’s Open-Source DynamoDB-compatible API
 
Workshop - How to benchmark your database
Workshop - How to benchmark your databaseWorkshop - How to benchmark your database
Workshop - How to benchmark your database
 
How to Build a Scylla Database Cluster that Fits Your Needs
How to Build a Scylla Database Cluster that Fits Your NeedsHow to Build a Scylla Database Cluster that Fits Your Needs
How to Build a Scylla Database Cluster that Fits Your Needs
 
ScyllaDB: What could you do with Cassandra compatibility at 1.8 million reque...
ScyllaDB: What could you do with Cassandra compatibility at 1.8 million reque...ScyllaDB: What could you do with Cassandra compatibility at 1.8 million reque...
ScyllaDB: What could you do with Cassandra compatibility at 1.8 million reque...
 
The Do’s and Don’ts of Benchmarking Databases
The Do’s and Don’ts of Benchmarking DatabasesThe Do’s and Don’ts of Benchmarking Databases
The Do’s and Don’ts of Benchmarking Databases
 
WEBINAR - Introducing Scylla Open Source 3.0: Materialized Views, Secondary I...
WEBINAR - Introducing Scylla Open Source 3.0: Materialized Views, Secondary I...WEBINAR - Introducing Scylla Open Source 3.0: Materialized Views, Secondary I...
WEBINAR - Introducing Scylla Open Source 3.0: Materialized Views, Secondary I...
 
FireEye & Scylla: Intel Threat Analysis Using a Graph Database
FireEye & Scylla: Intel Threat Analysis Using a Graph DatabaseFireEye & Scylla: Intel Threat Analysis Using a Graph Database
FireEye & Scylla: Intel Threat Analysis Using a Graph Database
 
Powering a Graph Data System with Scylla + JanusGraph
Powering a Graph Data System with Scylla + JanusGraphPowering a Graph Data System with Scylla + JanusGraph
Powering a Graph Data System with Scylla + JanusGraph
 
How to achieve no compromise performance and availability
How to achieve no compromise performance and availabilityHow to achieve no compromise performance and availability
How to achieve no compromise performance and availability
 
Cisco: Cassandra adoption on Cisco UCS & OpenStack
Cisco: Cassandra adoption on Cisco UCS & OpenStackCisco: Cassandra adoption on Cisco UCS & OpenStack
Cisco: Cassandra adoption on Cisco UCS & OpenStack
 
Scylla Summit 2016: Analytics Show Time - Spark and Presto Powered by Scylla
Scylla Summit 2016: Analytics Show Time - Spark and Presto Powered by ScyllaScylla Summit 2016: Analytics Show Time - Spark and Presto Powered by Scylla
Scylla Summit 2016: Analytics Show Time - Spark and Presto Powered by Scylla
 
Critical Attributes for a High-Performance, Low-Latency Database
Critical Attributes for a High-Performance, Low-Latency DatabaseCritical Attributes for a High-Performance, Low-Latency Database
Critical Attributes for a High-Performance, Low-Latency Database
 
Addressing the High Cost of Apache Cassandra
Addressing the High Cost of Apache CassandraAddressing the High Cost of Apache Cassandra
Addressing the High Cost of Apache Cassandra
 
Scylla Summit 2016: Graph Processing with Titan and Scylla
Scylla Summit 2016: Graph Processing with Titan and ScyllaScylla Summit 2016: Graph Processing with Titan and Scylla
Scylla Summit 2016: Graph Processing with Titan and Scylla
 
Cassandra Community Webinar: Apache Spark Analytics at The Weather Channel - ...
Cassandra Community Webinar: Apache Spark Analytics at The Weather Channel - ...Cassandra Community Webinar: Apache Spark Analytics at The Weather Channel - ...
Cassandra Community Webinar: Apache Spark Analytics at The Weather Channel - ...
 

Similar to Demystifying the Distributed Database Landscape

Demystifying the Distributed Database Landscape (DevOps) (1).pdf
Demystifying the Distributed Database Landscape (DevOps) (1).pdfDemystifying the Distributed Database Landscape (DevOps) (1).pdf
Demystifying the Distributed Database Landscape (DevOps) (1).pdfScyllaDB
 
Etu L2 Training - Hadoop 企業應用實作
Etu L2 Training - Hadoop 企業應用實作Etu L2 Training - Hadoop 企業應用實作
Etu L2 Training - Hadoop 企業應用實作James Chen
 
Hadoop Demystified + MapReduce (Java and C#), Pig, and Hive Demos
Hadoop Demystified + MapReduce (Java and C#), Pig, and Hive DemosHadoop Demystified + MapReduce (Java and C#), Pig, and Hive Demos
Hadoop Demystified + MapReduce (Java and C#), Pig, and Hive DemosLester Martin
 
Schemaless Databases
Schemaless DatabasesSchemaless Databases
Schemaless DatabasesDan Gunter
 
Evolution of the DBA to Data Platform Administrator/Specialist
Evolution of the DBA to Data Platform Administrator/SpecialistEvolution of the DBA to Data Platform Administrator/Specialist
Evolution of the DBA to Data Platform Administrator/SpecialistTony Rogerson
 
Distributed Database Design Decisions to Support High Performance Event Strea...
Distributed Database Design Decisions to Support High Performance Event Strea...Distributed Database Design Decisions to Support High Performance Event Strea...
Distributed Database Design Decisions to Support High Performance Event Strea...StreamNative
 
NO SQL: What, Why, How
NO SQL: What, Why, HowNO SQL: What, Why, How
NO SQL: What, Why, HowIgor Moochnick
 
Apache Spark 101 - Demi Ben-Ari - Panorays
Apache Spark 101 - Demi Ben-Ari - PanoraysApache Spark 101 - Demi Ben-Ari - Panorays
Apache Spark 101 - Demi Ben-Ari - PanoraysDemi Ben-Ari
 
Microsoft Openness Mongo DB
Microsoft Openness Mongo DBMicrosoft Openness Mongo DB
Microsoft Openness Mongo DBHeriyadi Janwar
 
Managing Big Data: An Introduction to Data Intensive Computing
Managing Big Data: An Introduction to Data Intensive ComputingManaging Big Data: An Introduction to Data Intensive Computing
Managing Big Data: An Introduction to Data Intensive ComputingCollin Bennett
 
5 Factors When Selecting a High Performance, Low Latency Database
5 Factors When Selecting a High Performance, Low Latency Database5 Factors When Selecting a High Performance, Low Latency Database
5 Factors When Selecting a High Performance, Low Latency DatabaseScyllaDB
 
Big data and hadoop overvew
Big data and hadoop overvewBig data and hadoop overvew
Big data and hadoop overvewKunal Khanna
 
Managing Big Data (Chapter 2, SC 11 Tutorial)
Managing Big Data (Chapter 2, SC 11 Tutorial)Managing Big Data (Chapter 2, SC 11 Tutorial)
Managing Big Data (Chapter 2, SC 11 Tutorial)Robert Grossman
 
عصر کلان داده، چرا و چگونه؟
عصر کلان داده، چرا و چگونه؟عصر کلان داده، چرا و چگونه؟
عصر کلان داده، چرا و چگونه؟datastack
 
The modern analytics architecture
The modern analytics architectureThe modern analytics architecture
The modern analytics architectureJoseph D'Antoni
 
Big Data: An Overview
Big Data: An OverviewBig Data: An Overview
Big Data: An OverviewC. Scyphers
 
Building a modern data warehouse
Building a modern data warehouseBuilding a modern data warehouse
Building a modern data warehouseJames Serra
 

Similar to Demystifying the Distributed Database Landscape (20)

Demystifying the Distributed Database Landscape (DevOps) (1).pdf
Demystifying the Distributed Database Landscape (DevOps) (1).pdfDemystifying the Distributed Database Landscape (DevOps) (1).pdf
Demystifying the Distributed Database Landscape (DevOps) (1).pdf
 
Etu L2 Training - Hadoop 企業應用實作
Etu L2 Training - Hadoop 企業應用實作Etu L2 Training - Hadoop 企業應用實作
Etu L2 Training - Hadoop 企業應用實作
 
Hadoop Demystified + MapReduce (Java and C#), Pig, and Hive Demos
Hadoop Demystified + MapReduce (Java and C#), Pig, and Hive DemosHadoop Demystified + MapReduce (Java and C#), Pig, and Hive Demos
Hadoop Demystified + MapReduce (Java and C#), Pig, and Hive Demos
 
Schemaless Databases
Schemaless DatabasesSchemaless Databases
Schemaless Databases
 
Evolution of the DBA to Data Platform Administrator/Specialist
Evolution of the DBA to Data Platform Administrator/SpecialistEvolution of the DBA to Data Platform Administrator/Specialist
Evolution of the DBA to Data Platform Administrator/Specialist
 
Distributed Database Design Decisions to Support High Performance Event Strea...
Distributed Database Design Decisions to Support High Performance Event Strea...Distributed Database Design Decisions to Support High Performance Event Strea...
Distributed Database Design Decisions to Support High Performance Event Strea...
 
NO SQL: What, Why, How
NO SQL: What, Why, HowNO SQL: What, Why, How
NO SQL: What, Why, How
 
Apache Spark 101 - Demi Ben-Ari - Panorays
Apache Spark 101 - Demi Ben-Ari - PanoraysApache Spark 101 - Demi Ben-Ari - Panorays
Apache Spark 101 - Demi Ben-Ari - Panorays
 
Microsoft Openness Mongo DB
Microsoft Openness Mongo DBMicrosoft Openness Mongo DB
Microsoft Openness Mongo DB
 
NOSQL
NOSQLNOSQL
NOSQL
 
Managing Big Data: An Introduction to Data Intensive Computing
Managing Big Data: An Introduction to Data Intensive ComputingManaging Big Data: An Introduction to Data Intensive Computing
Managing Big Data: An Introduction to Data Intensive Computing
 
Architecting Your First Big Data Implementation
Architecting Your First Big Data ImplementationArchitecting Your First Big Data Implementation
Architecting Your First Big Data Implementation
 
5 Factors When Selecting a High Performance, Low Latency Database
5 Factors When Selecting a High Performance, Low Latency Database5 Factors When Selecting a High Performance, Low Latency Database
5 Factors When Selecting a High Performance, Low Latency Database
 
Big data and hadoop overvew
Big data and hadoop overvewBig data and hadoop overvew
Big data and hadoop overvew
 
Managing Big Data (Chapter 2, SC 11 Tutorial)
Managing Big Data (Chapter 2, SC 11 Tutorial)Managing Big Data (Chapter 2, SC 11 Tutorial)
Managing Big Data (Chapter 2, SC 11 Tutorial)
 
عصر کلان داده، چرا و چگونه؟
عصر کلان داده، چرا و چگونه؟عصر کلان داده، چرا و چگونه؟
عصر کلان داده، چرا و چگونه؟
 
The modern analytics architecture
The modern analytics architectureThe modern analytics architecture
The modern analytics architecture
 
Big Data: An Overview
Big Data: An OverviewBig Data: An Overview
Big Data: An Overview
 
Nosql
NosqlNosql
Nosql
 
Building a modern data warehouse
Building a modern data warehouseBuilding a modern data warehouse
Building a modern data warehouse
 

More from ScyllaDB

Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
What Developers Need to Unlearn for High Performance NoSQL
What Developers Need to Unlearn for High Performance NoSQLWhat Developers Need to Unlearn for High Performance NoSQL
What Developers Need to Unlearn for High Performance NoSQLScyllaDB
 
Low Latency at Extreme Scale: Proven Practices & Pitfalls
Low Latency at Extreme Scale: Proven Practices & PitfallsLow Latency at Extreme Scale: Proven Practices & Pitfalls
Low Latency at Extreme Scale: Proven Practices & PitfallsScyllaDB
 
Dissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasDissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasScyllaDB
 
Beyond Linear Scaling: A New Path for Performance with ScyllaDB
Beyond Linear Scaling: A New Path for Performance with ScyllaDBBeyond Linear Scaling: A New Path for Performance with ScyllaDB
Beyond Linear Scaling: A New Path for Performance with ScyllaDBScyllaDB
 
Dissecting Real-World Database Performance Dilemmas




Demystifying the Distributed Database Landscape

  • 1. Demystifying the Distributed Database Landscape A survey of technologies in 2021
  • 2. Peter Corless, Director of Technical Advocacy, ScyllaDB + Listen to & share user stories + Write blogs & case studies + Play (and design) strategy & roleplaying games
  • 3. Distributed Database Landscape 2021. SQL: Distributed SQL, NewSQL. NoSQL: Key-value, Document, Wide-column, Graph. Multi-model: SQL + NoSQL, Multiple NoSQL. Production Environments: On-premises, Co-location, Public cloud, Private cloud, Hybrid cloud, Multicloud, Edge, IoT / Embedded. Business / Use Models: Open Source License, Enterprise License, OEM License, Service Agreements. Use Cases: OLTP, OLAP, HTAP, Time Series.
  • 4. This Next Tech Cycle: The wave of innovation we’re currently riding.
  • 5. Hardware, software, and methodologies are all co-evolving to create this next tech cycle.
  • 6. This Next Tech Cycle (timeline chart, 2000 to 2025+). Transistor count and core count: 42M, Pentium 4 (2000), 1 core; 228M, Pentium D (2005), 2 cores; 2.3B, Xeon Nehalem-EX (2010), 8 cores; 10B, SPARC M7 (2015), 32 cores; 39B, Epyc Rome (2019), 64 cores; ~60B?, Epyc Genoa (2022), 96 cores; ~80B?, Epyc Bergamo (2023), 128 cores. Zettabyte Era (1 ZB = 10^21 bytes): 2 ZB data stored (2010); 1.2 ZB IP traffic (2016); 64 ZB data stored (2020); ~180 ZB data stored (2025). Broadband speeds: 1.5 Mbps (2002); 16 Mbps (2008); 105 Mbps (2014); 1 Gbps (2018); 3 Gbps (2021). Wireless services: 3G (2002); 4G (2014); 5G (2018). Public cloud: AWS (2006); GCP (2008); Azure (2010).
  • 7. Hardware Still Vertically Scaling. Compute: from >100 cores to >1,000 cores per server; from multicore CPUs → full System on a Chip (SoC) designs (CPU, GPU, Cache, Memory). Memory: terabyte-scale RAM per server; DDR5, 4600 MHz in 2020, 8000 MHz by 2024; DDR6, 9600 MHz by 2025; persistent memory, memory mode. Storage: petabyte-scale storage per server; NVMe 2.0 [2021], separation of base and transport; persistent memory, app direct (storage) mode.
  • 8. Methodologies Still Evolving (How It Started, How It Evolved, How It’s Going): Agile [c. 2000]; CI/CD = CI [1991] + CD [2009]; DevOps [2009]; Chaos Monkey [2011]; Kubernetes [2014]; GitOps [2017]; DevSecOps [2018].
  • 10. Poll Question: How much data do you have under management in your own transactional database systems? <1 terabyte; 1 to 50 terabytes; 50-100 terabytes; >100 terabytes.
  • 11. The Distributed Database Landscape: Here there be monstrous databases!
  • 12. DB-Engines.com: 381 databases. Some are distributed databases; others are not. Some are SQL; some are NoSQL; some support both SQL + NoSQL; some support multiple NoSQL types; some are… not easily classifiable. A huge industry with some well-known names, but popularity (by itself) ≠ fitness for your use case.
  • 13. Top 100 Databases: Narrowing the field helps scope the analysis, but it still results in a wide variety of databases: many SQL, many NoSQL. ScyllaDB is in the Top 100!
  • 14. Top 100 Databases (and database-like systems) on DB-Engines.com [as of November 2021]: 49 SQL; 32 NoSQL; 5 both SQL + NoSQL; 5 search engines; 6 time series; 3 others.
  • 15. Are these all really distributed databases?
  • 17. What do you mean by a “Distributed Database?” Clustering & Distribution Strategies: local clustering (multiple nodes in the same datacenter share updates); cross-cluster updates (multiple clusters can share data between them); multi-datacenter clustering (geographically, even globally dispersed, but the same logical cluster). Node Roles, High Availability & Failover Strategies: primary-replica (active-passive; writes to primary only; read-only replicas; “hot standby” modes); peer-to-peer, leaderless (active-active, multi-primary; can write to any replica; no SPOF); load balancing (client side or a service in front of the database). Data Replication & Sharding Strategies: replication factors & consistency levels; horizontal scalability (manual vs. auto-sharding); topology awareness (rack-awareness, datacenter-awareness).
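The deck lists “replication factors & consistency levels” without spelling out how they interact. As a rough illustration only, here is the usual quorum arithmetic for replicated systems. The function names are hypothetical and are not the API of any database named in these slides.

```python
def quorum(replication_factor: int) -> int:
    """Smallest number of replicas that forms a strict majority."""
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    return replication_factor // 2 + 1


def read_overlaps_write(replication_factor: int, write_acks: int, read_acks: int) -> bool:
    """True when every read set must share at least one replica with every write set.

    This is the common rule of thumb W + R > RF; it says nothing about
    failure handling, repair, or clock skew in a real system.
    """
    return write_acks + read_acks > replication_factor


if __name__ == "__main__":
    rf = 3
    q = quorum(rf)
    print(f"RF={rf}, quorum={q}")
    print("quorum writes + quorum reads overlap:", read_overlaps_write(rf, q, q))
    print("single-replica writes + reads overlap:", read_overlaps_write(rf, 1, 1))
```

With a replication factor of 3, quorum reads and quorum writes (2 + 2 > 3) always overlap, while single-replica reads and writes (1 + 1) do not. That is the basic trade between consistency and latency that a consistency level exposes.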
  • 18. The Short List: Systems of Interest. SQL + NewSQL: PostgreSQL, CockroachDB. NoSQL: MongoDB, Redis, ScyllaDB.
  • 19. PostgreSQL: distributed SQL. Clustering & Distribution Strategies: local clustering (multiple nodes in the same datacenter share updates); cross-cluster updates (multiple clusters can share data between them); multi-datacenter clustering (geographically, even globally dispersed, but the same logical cluster). Node Roles, High Availability & Failover Strategies: primary-replica (active-passive; writes to primary only; read-only replicas; “hot standby” modes); peer-to-peer, leaderless (active-active, multi-primary; can write to any replica; no SPOF); load balancing (client side or a service in front of the database). Data Replication & Sharding Strategies: replication factors & consistency levels; horizontal scalability (manual sharding vs. auto-sharding); topology awareness (rack-awareness, datacenter-awareness). Legend: part of base offering; can be added, but not part of base.
  • 20. CockroachDB: NewSQL. Clustering & Distribution Strategies: local clustering (multiple nodes in the same datacenter share updates); cross-cluster updates (multiple clusters can share data between them); multi-datacenter clustering (geographically, even globally dispersed, but the same logical cluster). Node Roles, High Availability & Failover Strategies: primary-replica (active-passive; writes to primary only; read-only replicas; “hot standby” modes); peer-to-peer, leaderless (active-active, multi-primary; can write to any replica; no SPOF); load balancing (client side or a service in front of the database). Data Replication & Sharding Strategies: replication factors & consistency levels; horizontal scalability (manual vs. auto-sharding); topology awareness (rack-awareness*, datacenter-awareness). * Can be manually configured using localities. Legend: part of base offering; can be added, but not part of base.
  • 21. MongoDB: the leading document store. Clustering & Distribution Strategies: local clustering (multiple nodes in the same datacenter share updates); cross-cluster updates (multiple clusters can share data between them); multi-datacenter clustering (geographically, even globally dispersed, but the same logical cluster). Node Roles, High Availability & Failover Strategies: primary-replica (active-passive; writes to primary only; read-only replicas; “hot standby” modes); peer-to-peer, leaderless (active-active, multi-primary; can write to any replica; no SPOF); load balancing (client side or a service in front of the database). Data Replication & Sharding Strategies: replication factors & consistency levels; horizontal scalability (manual vs. auto-sharding); topology awareness (rack-awareness, datacenter-awareness). Legend: part of base offering; can be added, but not part of base.
  • 22. Redis: key-value in-memory DB/cache. Clustering & Distribution Strategies: local clustering (multiple nodes in the same datacenter share updates); cross-cluster updates (multiple clusters can share data between them); multi-datacenter clustering (geographically, even globally dispersed, but the same logical cluster*). Node Roles, High Availability & Failover Strategies: primary-replica (active-passive; writes to primary only; read-only replicas; “hot standby” modes); peer-to-peer, leaderless (active-active, multi-primary; can write to any replica; no SPOF)*; load balancing (client side or a service in front of the database). Data Replication & Sharding Strategies: replication factors & consistency levels (e.g., strong locally; causal consistency in active-active*); horizontal scalability (manual vs. auto-sharding); topology awareness (rack-awareness, datacenter-awareness). * Redis Enterprise feature. Legend: part of base offering; can be added, but not part of base.
  • 23. ScyllaDB. Clustering & Distribution Strategies: local clustering (multiple nodes in the same datacenter share updates); cross-cluster updates (multiple clusters can share data between them); multi-datacenter clustering (geographically, even globally dispersed, but the same logical cluster). Node Roles, High Availability & Failover Strategies: primary-replica (active-passive; writes to primary only; read-only replicas; “hot standby” modes); peer-to-peer, leaderless (active-active, multi-primary; can write to any replica; no SPOF); load balancing (client side or a service in front of the database*). Data Replication & Sharding Strategies: replication factors & consistency levels; horizontal scalability (manual vs. auto-sharding); topology awareness (rack-awareness, datacenter-awareness). * For DynamoDB-compatible API. Legend: part of base offering.
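The “auto-sharding” and “replication factor” items in the comparison slides can be pictured with a generic consistent-hashing ring. The sketch below is an assumption-laden illustration and not the partitioning scheme of PostgreSQL, CockroachDB, MongoDB, Redis, or ScyllaDB. The `HashRing` class, the node names, and the virtual-node count are all hypothetical.

```python
import bisect
import hashlib


def _token(value: str) -> int:
    """Map a string to a 64-bit position on the ring (MD5 chosen only for illustration)."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Toy consistent-hash ring with virtual nodes (hypothetical helper, not a real driver)."""

    def __init__(self, nodes, vnodes=64):
        # Each physical node owns several points on the ring to smooth the distribution.
        self._ring = sorted(
            (_token(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._tokens = [token for token, _ in self._ring]

    def replicas(self, key, replication_factor):
        """Walk clockwise from the key's position and collect distinct nodes."""
        start = bisect.bisect_right(self._tokens, _token(key))
        found = []
        for step in range(len(self._ring)):
            node = self._ring[(start + step) % len(self._ring)][1]
            if node not in found:
                found.append(node)
                if len(found) == replication_factor:
                    break
        return found

    def owner(self, key):
        """The first replica is treated as the key's primary owner."""
        return self.replicas(key, 1)[0]


if __name__ == "__main__":
    ring = HashRing(["node-a", "node-b", "node-c"])
    print(ring.owner("user:42"), ring.replicas("user:42", 2))
```

The point of the ring is that keys are placed by hash rather than by an operator-maintained mapping, so adding a node only moves the keys adjacent to its tokens. Manual sharding, by contrast, leaves that mapping and its rebalancing to the application or the operator.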
  • 24. But for now, let’s move on...
  • 25. Where are Distributed Databases Headed Next? Time to read the tea leaves.
  • 26. The Trend for SQL: Google Trends interest in “SQL” is at 25% of its 2004 rate. Book citations for “SQL” peaked in 2008 and were down to 28% of that rate by 2019, basically back to 1994 levels of interest. “SQL” still dwarfs other database terms like “NoSQL,” “NewSQL,” or “RDBMS.” No single term or technology sums up the distributed database market anymore.
  • 27. Where are Distributed Databases Going? The Cambrian Explosion will continue (“What is a database anyway?”): distributed databases of all kinds; distributed streaming (“Kafka as a database?” kSQL says “Yes!”); distributed ledgers (“Blockchains/DAGs as a database?”); further fragmentation of the market. NoSQL and SQL are increasingly blending: evolution of NoSQL back to SQL assumptions; adding back strong consistency, schema constraints, strict typing.
  • 28. Further Trends in Distributed Databases. Elasticity: faster provisioning/decommissioning, autoscaling. Uncoupling compute from storage: tiered storage, plug-in storage. Data over time: built for event streaming, time series. Data over space: geospatial queries, geoindexing; geographic / political boundaries (GDPR, data localization regulatory compliance).
  • 29. Database as a Development Platform. Increasing focus on developer enablement and developer experience (DX). APIs for extensibility (extensions, plugins, modules, add-ons, integration layers): database-specific (PostgreSQL extensions, Redis modules); cross-industry (GraphQL, OpenAPI (Swagger), etc.). AI/ML integration and incorporation into databases: “Building models where your data resides” (Martin Heller, Apr 2021); Amazon Redshift ML; BigQuery ML; Oracle, Db2, Microsoft SQL Server.
  • 30. Data Teaming. Tighter coupling of data engineering + data sciences + operations: repairing rifts of the past decade; bridging huge divides between people and systems. From “data pipelining” (production-oriented) to “data supply chains” (consumption-oriented): like “software supply chain,” but for data and data products.
  • 31. We Need Different Kinds of “Easy.” Specializing databases to run in the cloud (and cloud-only): providing “concierge” services; ecosystem that can integrate into the cloud vendor’s (or partners’) offerings; managed for you, at a price. Making open source databases easier to run at the infrastructure level: making self-managed operations simpler; flexibility to run on premises or in the cloud; self-service model, so long as you have the skillz.
  • 32. Hope You Enjoyed Your Trip! http://slack.scylladb.com/
  • 33. Thanks. People who helped educate me: Kostja Osipov, Serge Leontiev. Disclaimer: Any errors, omissions, misinterpretations, misrepresentations or misunderstandings are purely my own. Please send suggestions and corrections to peter@scylladb.com
  • 35. United States: 2445 Faber St, Suite #200, Palo Alto, CA USA 94303. Israel: Maskit 4, Herzliya, Israel 4673304. www.scylladb.com, @scylladb. Learn NoSQL for free! university.scylladb.com. @petercorless