Materialized Views and
Secondary Indexes in Scylla:
They are finally here!
Piotr Sarna
Software Engineer @ScyllaDB
Presenter bio
Piotr is a software engineer very keen on open-source projects
and C++. He previously developed an open-source distributed
file system and had a brief adventure with the Linux kernel during
an apprenticeship at Samsung Electronics. Piotr graduated from
the University of Warsaw with an MSc in Computer Science.
Agenda
▪ Introduction
▪ Materialized Views
▪ Secondary Indexes
▪ Filtering
▪ Summary
Introduction
Why finally?
▪ Materialized views
• experimental in 2.0, production-ready since 3.0
▪ Secondary indexes
• experimental in 2.1, production-ready since 3.0
▪ Filtering
• production-ready since 3.0
Materialized Views
Before Materialized Views
▪ How to query by something other than primary key
columns?
Before Materialized Views
CREATE TABLE t (p int, c1 int, c2 int, v int, PRIMARY KEY (p, c1, c2));
▪ Querying for a regular column v:
• CREATE TABLE t2 (v int, p int, c1 int, c2 int, PRIMARY KEY(v, p, c1, c2));
• SELECT * FROM t2 WHERE v = 7;
▪ Querying for a non-prefix part of the primary key:
• CREATE TABLE t2 (c1 int, p int, c2 int, PRIMARY KEY(c1, p, c2));
• SELECT * FROM t2 WHERE c1 = 7;
Before Materialized Views
▪ Manual denormalization - problems
• updating the base table may require read-before-write
• there may be multiple denormalization tables for a table
• what if one of the writes fails?
• what if somebody forgets to write to one of the denormalized parts?
Read before write
CREATE TABLE base_table (
p int,
c int,
v int,
PRIMARY KEY (p, c)
);
p | c | v
---+---+---
0 | 1 | 8
UPDATE base_table
SET v = 9
WHERE p = 0 AND c = 1;
CREATE TABLE denormalized (
v int,
p int,
c int,
PRIMARY KEY (v, p, c)
);
v | p | c
---+---+---
8 | 0 | 1
DELETE FROM denormalized
WHERE v = 8 AND p = 0 AND c = 1; -- how do we know it’s 8?
INSERT INTO denormalized (v, p, c)
VALUES (9, 0, 1);
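The read-before-write problem can be sketched in a few lines of Python. This is only an illustration: the dictionary and set below are hypothetical in-memory stand-ins for base_table and denormalized, and write_base is a made-up helper, not a Scylla API.

```python
# Hypothetical in-memory stand-ins for the two tables on the slide.
base_table = {}         # (p, c) -> v
denormalized = set()    # {(v, p, c)}


def write_base(p, c, v):
    """Update base_table and keep denormalized in sync by hand."""
    old_v = base_table.get((p, c))           # read-before-write: look up the old value
    if old_v is not None and old_v != v:
        denormalized.discard((old_v, p, c))  # DELETE the stale denormalized row
    base_table[(p, c)] = v                   # UPDATE base_table
    denormalized.add((v, p, c))              # INSERT the new denormalized row


write_base(0, 1, 8)
write_base(0, 1, 9)
print(sorted(denormalized))  # [(9, 0, 1)]
```

Without the lookup of old_v, the writer has no way to know that the row to delete is the one with v = 8.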
Materialized Views
▪ Let Scylla denormalize a table for you
• view updates are generated automatically and transparently
• read-before-write is performed when needed
• useful statistics are exposed
Materialized Views
CREATE TABLE base_table (
p int,
c int,
v int,
PRIMARY KEY (p, c)
);
p | c | v
---+---+---
0 | 1 | 8
CREATE MATERIALIZED VIEW
view_table AS
SELECT * FROM base_table
WHERE v IS NOT NULL
AND p IS NOT NULL
AND c IS NOT NULL
PRIMARY KEY(v, p, c);
v | p | c
---+---+---
8 | 0 | 1
Materialized Views
▪ Materialized view’s partition key can be a subset of primary key
parts and/or a regular column
• currently limited to a single regular column
• the whole base primary key must be included in view’s primary key
• all primary key fields must be restricted with IS NOT NULL
▪ Each table is allowed to have multiple views
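The three restrictions above can be expressed as a small checker. This is a sketch under stated assumptions: check_view_key and its parameters are hypothetical names for illustration, and the function only mirrors the rules listed on this slide, not Scylla's actual validation code.

```python
def check_view_key(base_key, base_regular, view_key, not_null):
    """Return a list of rule violations for a proposed view primary key.

    base_key: base primary key columns; base_regular: base regular columns;
    view_key: view primary key columns; not_null: columns restricted with IS NOT NULL.
    """
    problems = []
    # The whole base primary key must be included in the view's primary key.
    missing = [col for col in base_key if col not in view_key]
    if missing:
        problems.append(f"base key columns missing from view key: {missing}")
    # At most one regular column may become part of the view's key.
    regular_in_key = [col for col in view_key if col in base_regular]
    if len(regular_in_key) > 1:
        problems.append(f"more than one regular column in view key: {regular_in_key}")
    # Every view primary key column must be restricted with IS NOT NULL.
    unrestricted = [col for col in view_key if col not in not_null]
    if unrestricted:
        problems.append(f"columns lacking IS NOT NULL: {unrestricted}")
    return problems


ok = check_view_key(["p", "c"], ["v"], ["v", "p", "c"], {"v", "p", "c"})
bad = check_view_key(["p", "c"], ["v"], ["v", "p"], {"v"})
print(ok)        # []
print(len(bad))  # 2
```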
More examples
CREATE TABLE base_table (
p int,
c int,
v1 int,
v2 int,
v3 int,
v4 int,
v5 int,
PRIMARY KEY (p, c)
);
p | c | v1 | v2 | v3 | v4 | v5
---+---+----+----+----+----+----
0 | 1 | 8 | 9 | 10 | 11 | 12
CREATE MATERIALIZED VIEW
view_table AS
SELECT c, p FROM base_table
WHERE c IS NOT NULL
AND p IS NOT NULL
PRIMARY KEY(c, p);
c | p
---+---
1 | 0
More examples
CREATE TABLE base_table (
p1 int,
p2 int,
c1 int,
c2 int,
v1 int,
v2 int,
v3 int,
PRIMARY KEY ((p1, p2), c1, c2)
);
p1 | p2 | c1 | c2 | v1 | v2 | v3
----+----+----+----+----+----+----
0 | 1 | 2 | 3 | 8 | 9 | 10
CREATE MATERIALIZED VIEW
view_table AS
SELECT c2, p1, p2, c1, v2 FROM
base_table
WHERE c2 IS NOT NULL
AND p1 IS NOT NULL
AND p2 IS NOT NULL
AND c1 IS NOT NULL
PRIMARY KEY(c2, p1, p2, c1);
c2 | p1 | p2 | c1 | v2
----+----+----+----+----
3 | 0 | 1 | 2 | 9
Challenges
▪ View rows must be eventually consistent with their base counterparts
• all updates should be propagated - inserts, updates, deletes
• updates should not be lost in case of temporary failures/restarts
▪ Cluster must not be overloaded with MV updates - backpressure
• each base write may trigger multiple independent updates
• so can streaming
▪ Views created on an existing table should fill themselves
with existing base data - view building
Consistency - synchronous model
[Diagram: participants C, B, V1, V2; messages w(p: 1, v: 10), d(v: 5), w(v: 10, p: 1); responses r: Ok, r: Ok]
Consistency - asynchronous model
[Diagram: participants C, B, V1, V2; messages w(p: 1, v: 10), d(v: 5), w(v: 10, p: 1); responses r: Ok, r: Ok]
Consistency
[Diagram shown in three steps: the same participants and messages]
solution: hinted handoff
Hinted handoff for materialized views
▪ Failed updates are stored on base node as hints
▪ They will be resent once the paired node is available
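A minimal sketch of the idea, assuming a single base replica paired with a single view replica. The ViewReplica and BaseReplica classes and their methods are hypothetical and exist only for illustration; ordering and timestamps are ignored here.

```python
from collections import deque


class ViewReplica:
    """Toy view replica holding (v, p) rows; can be marked unavailable."""

    def __init__(self):
        self.rows = set()
        self.up = True

    def apply(self, update):
        if not self.up:
            raise ConnectionError("view replica is down")
        op, row = update
        if op == "w":
            self.rows.add(row)
        else:  # "d"
            self.rows.discard(row)


class BaseReplica:
    """Toy base replica that stores failed view updates as hints."""

    def __init__(self, paired_view):
        self.view = paired_view
        self.hints = deque()

    def send_view_update(self, update):
        try:
            self.view.apply(update)
        except ConnectionError:
            self.hints.append(update)   # keep the failed update on the base node

    def replay_hints(self):
        while self.hints:
            try:
                self.view.apply(self.hints[0])
            except ConnectionError:
                return                  # paired node still unavailable, retry later
            self.hints.popleft()


view = ViewReplica()
base = BaseReplica(view)
view.up = False
base.send_view_update(("d", (5, 1)))    # d(v: 5) for p = 1
base.send_view_update(("w", (10, 1)))   # w(v: 10, p: 1)
pending = len(base.hints)
view.up = True
base.replay_hints()
```

Both updates fail while the view replica is down, wait as hints on the base side, and reach the view once the paired node is back.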
View building
▪ Views created on existing tables will be incrementally built from
existing data
▪ Progress can be tracked via system tables:
• system.views_builds_in_progress
• system.built_views
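Incremental building with tracked progress can be pictured as follows. This is a simplified sketch: build_view, the batch size, and the progress dictionary are assumptions made for illustration, with the dictionary standing in for the progress records kept in the system tables.

```python
def build_view(base_rows, view, progress, batch_size=2):
    """Process one batch of existing base rows and record how far we got.

    base_rows maps (p, c) -> v; view is a set of (v, p, c) rows.
    Returns True once every base row has been processed.
    """
    keys = sorted(base_rows)
    start = progress.get("next", 0)
    for p, c in keys[start:start + batch_size]:
        v = base_rows[(p, c)]
        if v is not None:               # mirrors WHERE v IS NOT NULL
            view.add((v, p, c))
    progress["next"] = min(start + batch_size, len(keys))
    return progress["next"] == len(keys)


base_rows = {(0, 1): 8, (0, 2): None, (1, 1): 9}
view, progress = set(), {}
first_done = build_view(base_rows, view, progress)    # processes two rows
second_done = build_view(base_rows, view, progress)   # resumes from saved progress
```

Because the position is saved after every batch, an interrupted build can resume instead of starting over.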
Backpressure
▪ a single user write can trigger multiple MV updates
▪ backpressure prevents overloading the cluster with them
• base replicas report their load to the coordinator
• coordinator is allowed to delay serving new user writes to lower the pressure
▪ there’s a whole presentation about the topic by Nadav Har’El
Public design document: https://docs.google.com/document/d/1J6GeLBvN8_c3SbLVp8YsOXHcLc9nOLlRY7pC6MH3JWo
Streaming
▪ Efficient way of sending data from one node to another
▪ Moves data directly to sstables of the target node,
bypassing the full write path
▪ Used under the hood of several cluster operations, e.g.:
• bootstrap
• repair
• rebuild
▪ In some cases, streamed data should generate materialized view
updates to ensure consistency
• unconditionally during node repair
• when the view is not yet completely built
▪ Affected sstables are stored and used to generate MV updates
Secondary Indexes
Before Secondary Indexes
▪ Searching on non-partition columns
• full table scan + client-side filtering
• schema redesign + manual denormalization
• using materialized views
Secondary Indexes
Global
▪ based on materialized views
▪ reading - scalable
▪ writing - distributed
▪ low cardinality = wide partitions
▪ high cardinality = no problem
Local
▪ require custom code
▪ reading - doesn’t scale
▪ writing - fast, local operation
▪ low cardinality = wide local
partitions
▪ high cardinality = too many lookups
Secondary Indexes
CREATE TABLE base_table (
p int,
c int,
v int,
PRIMARY KEY (p, c)
);
p | c | v
---+---+---
0 | 1 | 8
SELECT * FROM base_table WHERE v = 8;
SELECT * FROM base_table WHERE c = 1;
CREATE INDEX ON base_table(v);
v | token | p | c
---+-------+---+---
8 | 0x123 | 0 | 1
CREATE INDEX ON base_table(c);
c | token | p
---+-------+---
1 | 0x123 | 0
Global Secondary Indexes
Global secondary indexes
▪ Receive a query that may need indexes
▪ Check whether a matching index exists
▪ Execute the index query, retrieve matching base primary keys
▪ Execute the base query using the retrieved primary keys
▪ Return query results
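The steps above can be sketched with two dictionaries. Everything here is a hypothetical in-memory stand-in: base_table, index_on_v, and select_where are illustration names, and the token column stored in the real index table is left out for brevity.

```python
# Base rows keyed by primary key (p, c).
base_table = {(0, 1): {"v": 8}, (0, 2): {"v": 3}, (5, 1): {"v": 8}}

# Index table: indexed value -> list of base primary keys.
index_on_v = {}
for key, row in base_table.items():
    index_on_v.setdefault(row["v"], []).append(key)

indexes = {"v": index_on_v}


def select_where(column, value):
    """Answer SELECT * FROM base_table WHERE column = value via an index."""
    index = indexes.get(column)                 # check whether a matching index exists
    if index is None:
        raise ValueError(f"no index on {column}; filtering would be needed")
    keys = index.get(value, [])                 # step 1: index query returns base keys
    return [dict(p=p, c=c, **base_table[(p, c)]) for p, c in keys]  # step 2: base query


result = select_where("v", 8)
print(result)  # [{'p': 0, 'c': 1, 'v': 8}, {'p': 5, 'c': 1, 'v': 8}]
```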
Secondary index paging
▪ Rows in the index table are small
• only the indexed column, base primary keys and token are stored, all with size limits
• it’s nearly impossible for 100 index rows to hit the query size limit
▪ Base rows may be much bigger
• even a single row may exceed the query size limit
• not to mention 100 of them
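The sketch below illustrates why the base query may return fewer rows than the page size. It is a toy model under stated assumptions: read_base_page and query_with_paging are hypothetical names, row sizes are given as plain integers, and the rule that at least one row is always returned is an assumption made so the loop always advances.

```python
def read_base_page(keys, row_sizes, page_size, size_limit):
    """Fetch rows for keys until page_size rows or size_limit bytes is reached.

    Returns (fetched_keys, short_read).
    """
    fetched, total = [], 0
    for key in keys[:page_size]:
        size = row_sizes[key]
        if fetched and total + size > size_limit:
            return fetched, True        # short read: stopped before page_size rows
        fetched.append(key)
        total += size
    return fetched, False


def query_with_paging(index_keys, row_sizes, page_size, size_limit):
    """Yield (page, short_read) pairs, resuming after each short read."""
    position = 0
    while position < len(index_keys):
        window = index_keys[position:position + page_size]
        page, short_read = read_base_page(window, row_sizes, page_size, size_limit)
        position += len(page)
        yield page, short_read


index_keys = [0, 1, 2, 3, 4, 5]
row_sizes = {key: 40 for key in index_keys}
pages = list(query_with_paging(index_keys, row_sizes, page_size=100, size_limit=100))
print(pages)  # [([0, 1], True), ([2, 3], True), ([4, 5], False)]
```

The index side hands over all keys at once, while the base side stops early whenever the size limit is hit, so the next page has to continue from the first key that was not served.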
Secondary Index Paging
[Diagram: participants C, I, B; labels: 100 rows, page_size=100, 100 keys, page_size=100, allow_short_read]
Secondary Index Paging
[Diagram: participants C, I, B; labels: 3 rows, short_read=true, page_size=100, 100 keys, page_size=100, allow_short_read]
Secondary Indexes vs Materialized Views
Secondary Indexes
▪ transparent - the same table is used for querying
▪ may be more efficient with storage
▪ creating/deleting them is easier and more straightforward
▪ can cooperate with filtering
▪ uses 2-step query to join results
Materialized Views
▪ querying doesn’t involve two steps, which influences performance
▪ more flexible with primary keys and complicated schemas
▪ denormalizes existing data
Filtering
Filtering
> SELECT * FROM base_table WHERE v = 8;
Cannot execute this query as it might involve data filtering and thus may have unpredictable
performance. If you want to execute this query despite the performance unpredictability, use
ALLOW FILTERING.
Filtering
▪ Query restrictions that may need filtering
• non-key fields (WHERE v = 1)
• parts of primary keys that are not prefixes (WHERE pk = 1 and c2 = 3)
• partition keys with something other than an equality relation (WHERE pk >= 1)
• clustering keys with a range restriction and then by other conditions
(WHERE pk =1 and c1 > 2 and c2 = 3)
Coordinator-side filtering
▪ Coordinator node retrieves all data from nodes
▪ Filtering is applied
▪ Only matching rows are returned to the client
▪ Easily extensible with optimizations (pre-filtering on data nodes
can be implemented and added)
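Coordinator-side filtering can be modelled in a few lines. This is a sketch only: coordinator_filter is a hypothetical helper, and each inner list stands in for the rows fetched from one node.

```python
def coordinator_filter(replica_rows, predicate):
    """Return the matching rows plus the number of rows read to find them."""
    rows_read = 0
    matching = []
    for rows in replica_rows:           # data retrieved from every node
        for row in rows:
            rows_read += 1
            if predicate(row):          # filtering is applied on the coordinator
                matching.append(row)
    return matching, rows_read          # only matching rows go back to the client


node_a = [{"p": 0, "c": 1, "v": 8}, {"p": 0, "c": 2, "v": 3}]
node_b = [{"p": 5, "c": 1, "v": 8}]
matching, rows_read = coordinator_filter([node_a, node_b], lambda row: row["v"] == 8)
print(len(matching), rows_read)  # 2 3
```

Every row travels to the coordinator even when it is discarded, which is the cost that pre-filtering on data nodes would reduce.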
Query selectivity
Low selectivity queries:
▪ return almost all rows (e.g. 70%)
▪ good candidate for filtering
High selectivity queries:
▪ return only a few rows (e.g. 1)
▪ bad candidate for filtering
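A quick calculation shows why. The table size of 1000 rows is an arbitrary assumption for illustration, and rows_read_per_result is a hypothetical helper.

```python
def rows_read_per_result(total_rows, matching_rows):
    """Rows a filtering query has to scan for each row it returns."""
    return total_rows / matching_rows


low_selectivity = rows_read_per_result(1000, 700)   # 70% of rows match
high_selectivity = rows_read_per_result(1000, 1)    # a single row matches
print(round(low_selectivity, 2), high_selectivity)  # 1.43 1000.0
```

When 70% of rows match, little of the scan is wasted. When one row matches, nearly all of the work is thrown away, and an index or a materialized view fits better.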
Filtering alternatives
▪ Materialized views and their alternatives
▪ Secondary indexes and their alternatives
Filtering + indexes
Combining filtering with indexes
CREATE TABLE t (p int, c1 int, c2 int, v1 int, v2 int, PRIMARY KEY(p, c1, c2));
CREATE INDEX ON t(c2);
▪ SELECT * FROM t WHERE c2 = 3 and v1 = 1 and v2 = 3 ALLOW FILTERING;
• extract rows using index on c2
• filter rows that match (v1 == 1 and v2 == 3)
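The two steps can be sketched as follows, with hypothetical in-memory stand-ins for table t and its index on c2. The sample rows and the query helper are made up for illustration.

```python
rows = [
    {"p": 0, "c1": 1, "c2": 3, "v1": 1, "v2": 3},
    {"p": 0, "c1": 2, "c2": 3, "v1": 1, "v2": 4},
    {"p": 1, "c1": 1, "c2": 5, "v1": 1, "v2": 3},
]
base = {(r["p"], r["c1"], r["c2"]): r for r in rows}

# Index on c2: indexed value -> list of base primary keys.
index_c2 = {}
for key, r in base.items():
    index_c2.setdefault(r["c2"], []).append(key)


def query(c2, v1, v2):
    """SELECT * FROM t WHERE c2 = ? and v1 = ? and v2 = ? ALLOW FILTERING."""
    candidates = [base[k] for k in index_c2.get(c2, [])]   # extract rows using index on c2
    return [r for r in candidates if r["v1"] == v1 and r["v2"] == v2]  # filter the rest


result = query(3, 1, 3)
print(result)  # [{'p': 0, 'c1': 1, 'c2': 3, 'v1': 1, 'v2': 3}]
```

The index narrows the candidates to two rows, so the filter only has to examine those instead of the whole table.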
Multiple indexing
CREATE TABLE t (p int, c1 int, c2 int, v1 int, v2 int, PRIMARY KEY(p, c1, c2));
CREATE INDEX ON t(c2);
CREATE INDEX ON t(v1);
▪ SELECT * FROM t WHERE c2 = 3 and v1 = 1 and v2 = 3 ALLOW FILTERING;
• extract rows using index on c2
• filter rows that match (v1 == 1 and v2 == 3)
Key prefix optimizations
CREATE TABLE t (p int, c1 int, c2 int, v1 int, v2 int, PRIMARY KEY(p, c1, c2));
CREATE INDEX ON t(c2);
CREATE INDEX ON t(v1);
▪ SELECT * FROM t WHERE p = 0 and c1 = 1 and v2 = 7 ALLOW FILTERING;
• extract rows only from partition p=0 and sliced by c1=1
• filter rows that match (v2 == 7)
Key prefix optimizations
CREATE TABLE t (p int, c1 int, c2 int, v1 int, v2 int, PRIMARY KEY(p, c1, c2));
CREATE INDEX ON t(c2);
CREATE INDEX ON t(v1);
▪ SELECT * FROM t WHERE p = 0 and c1 = 1 and v1 = 7 ALLOW FILTERING;
• extract rows from index v1, including p=0 and c1=1 in index query restrictions
• no filtering needed!
Future: selectivity statistics
Having selectivity statistics for every index would help with:
▪ identifying data model problems
• was indexing the right choice for the use case? Would filtering fit better?
▪ choosing the best index to query from in multiple index queries
• one index is used to retrieve results from the base replica
• remaining restrictions are filtered
• which combination is the best?
Conclusions
Conclusions
▪ As of 3.0, the following features are going GA:
• materialized views
• secondary indexes
• filtering support
Future plans
▪ MV repair
▪ optimized multi-index support
▪ add replica-side filtering optimizations
▪ as always - optimize even further
Thank You
Any Questions?
Please stay in touch
sarna@scylladb.com

Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
What Developers Need to Unlearn for High Performance NoSQL
What Developers Need to Unlearn for High Performance NoSQLWhat Developers Need to Unlearn for High Performance NoSQL
What Developers Need to Unlearn for High Performance NoSQL
 
Low Latency at Extreme Scale: Proven Practices & Pitfalls
Low Latency at Extreme Scale: Proven Practices & PitfallsLow Latency at Extreme Scale: Proven Practices & Pitfalls
Low Latency at Extreme Scale: Proven Practices & Pitfalls
 
Dissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasDissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance Dilemmas
 
Beyond Linear Scaling: A New Path for Performance with ScyllaDB
Beyond Linear Scaling: A New Path for Performance with ScyllaDBBeyond Linear Scaling: A New Path for Performance with ScyllaDB
Beyond Linear Scaling: A New Path for Performance with ScyllaDB
 
Dissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasDissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance Dilemmas
 
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...Database Performance at Scale Masterclass: Workload Characteristics by Felipe...
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...
 
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...
 
Database Performance at Scale Masterclass: Driver Strategies by Piotr Sarna
Database Performance at Scale Masterclass: Driver Strategies by Piotr SarnaDatabase Performance at Scale Masterclass: Driver Strategies by Piotr Sarna
Database Performance at Scale Masterclass: Driver Strategies by Piotr Sarna
 
Replacing Your Cache with ScyllaDB
Replacing Your Cache with ScyllaDBReplacing Your Cache with ScyllaDB
Replacing Your Cache with ScyllaDB
 
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear Scalability
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear ScalabilityPowering Real-Time Apps with ScyllaDB_ Low Latency & Linear Scalability
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear Scalability
 
7 Reasons Not to Put an External Cache in Front of Your Database.pptx
7 Reasons Not to Put an External Cache in Front of Your Database.pptx7 Reasons Not to Put an External Cache in Front of Your Database.pptx
7 Reasons Not to Put an External Cache in Front of Your Database.pptx
 
Getting the most out of ScyllaDB
Getting the most out of ScyllaDBGetting the most out of ScyllaDB
Getting the most out of ScyllaDB
 
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a Migration
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a MigrationNoSQL Database Migration Masterclass - Session 2: The Anatomy of a Migration
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a Migration
 
NoSQL Database Migration Masterclass - Session 3: Migration Logistics
NoSQL Database Migration Masterclass - Session 3: Migration LogisticsNoSQL Database Migration Masterclass - Session 3: Migration Logistics
NoSQL Database Migration Masterclass - Session 3: Migration Logistics
 
NoSQL Data Migration Masterclass - Session 1 Migration Strategies and Challenges
NoSQL Data Migration Masterclass - Session 1 Migration Strategies and ChallengesNoSQL Data Migration Masterclass - Session 1 Migration Strategies and Challenges
NoSQL Data Migration Masterclass - Session 1 Migration Strategies and Challenges
 
ScyllaDB Virtual Workshop
ScyllaDB Virtual WorkshopScyllaDB Virtual Workshop
ScyllaDB Virtual Workshop
 
DBaaS in the Real World: Risks, Rewards & Tradeoffs
DBaaS in the Real World: Risks, Rewards & TradeoffsDBaaS in the Real World: Risks, Rewards & Tradeoffs
DBaaS in the Real World: Risks, Rewards & Tradeoffs
 
Build Low-Latency Applications in Rust on ScyllaDB
Build Low-Latency Applications in Rust on ScyllaDBBuild Low-Latency Applications in Rust on ScyllaDB
Build Low-Latency Applications in Rust on ScyllaDB
 
NoSQL Data Modeling 101
NoSQL Data Modeling 101NoSQL Data Modeling 101
NoSQL Data Modeling 101
 

Recently uploaded

(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...gurkirankumar98700
 
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...soniya singh
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...harshavardhanraghave
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsAlberto González Trastoy
 
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfThe Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfkalichargn70th171
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxbodapatigopi8531
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdfWave PLM
 
Project Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationProject Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationkaushalgiri8080
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comFatema Valibhai
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVshikhaohhpro
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...OnePlan Solutions
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerThousandEyes
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfkalichargn70th171
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantAxelRicardoTrocheRiq
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...MyIntelliSource, Inc.
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...kellynguyen01
 
Test Automation Strategy for Frontend and Backend
Test Automation Strategy for Frontend and BackendTest Automation Strategy for Frontend and Backend
Test Automation Strategy for Frontend and BackendArshad QA
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software DevelopersVinodh Ram
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsArshad QA
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...OnePlan Solutions
 

Recently uploaded (20)

(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
 
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfThe Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptx
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf
 
Project Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationProject Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanation
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service Consultant
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
 
Test Automation Strategy for Frontend and Backend
Test Automation Strategy for Frontend and BackendTest Automation Strategy for Frontend and Backend
Test Automation Strategy for Frontend and Backend
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software Developers
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview Questions
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
 

Materialized Views and Secondary Indexes in Scylla: They are finally here!

  • 1. Materialized Views and Secondary Indexes in Scylla: They are finally here! Piotr Sarna Software Engineer @ScyllaDB
  • 2. Presenter bio Piotr is a software engineer very keen on open-source projects and C++. He previously developed an open-source distributed file system and had a brief adventure with Linux kernel during an apprenticeship at Samsung Electronics. Piotr graduated from University of Warsaw with MSc in Computer Science.
  • 3. Agenda ▪ Introduction ▪ Materialized Views ▪ Secondary Indexes ▪ Filtering ▪ Summary
  • 5. Why finally? ▪ Materialized views • experimental in 2.0 ▪ Secondary indexes • experimental in 2.1 ▪ Filtering
  • 6. Why finally? ▪ Materialized views • experimental in 2.0, production-ready since 3.0 ▪ Secondary indexes • experimental in 2.1, production-ready since 3.0 ▪ Filtering • production-ready since 3.0
  • 8. Before Materialized Views ▪ How to query by something else other than primary key columns?
  • 9. Before Materialized Views CREATE TABLE t (p int, c1 int, c2 int, v int, PRIMARY KEY (p, c1, c2));
  • 10. Before Materialized Views CREATE TABLE t (p int, c1 int, c2 int, v int, PRIMARY KEY (p, c1, c2)); ▪ Querying for a regular column v: • CREATE TABLE t2 (v int, p int, c1 int, c2 int, PRIMARY KEY(v, p, c1, c2)); • SELECT * FROM t2 WHERE v = 7; ▪ Querying for a non-prefix part of the primary key: • CREATE TABLE t2 (c1 int, p int, c2 int, PRIMARY KEY(c1, p, c2)); • SELECT * FROM t2 WHERE c1 = 7;
  • 11. Before Materialized Views ▪ Manual denormalization - problems • updating the base table may require read-before-write • there may be multiple denormalization tables for a table • what if one of the writes fails? • what if somebody forgets to write to one of the denormalized parts?
  • 12. Read before write CREATE TABLE base_table ( p int, c int, v int, PRIMARY KEY (p, c) ); CREATE TABLE denormalized ( v int, p int, c int, PRIMARY KEY (v, p, c) );
  • 13. Read before write CREATE TABLE base_table ( p int, c int, v int, PRIMARY KEY (p, c) ); p | c | v ---+---+--- 0 | 1 | 8 CREATE TABLE denormalized ( v int, p int, c int, PRIMARY KEY (v, p, c) ); v | p | c ---+---+--- 8 | 0 | 1
  • 14. Read before write CREATE TABLE base_table ( p int, c int, v int, PRIMARY KEY (p, c) ); p | c | v ---+---+--- 0 | 1 | 8 UPDATE base_table SET v = 9 WHERE p = 0 AND c = 1; CREATE TABLE denormalized ( v int, p int, c int, PRIMARY KEY (v, p, c) ); v | p | c ---+---+--- 8 | 0 | 1 DELETE FROM denormalized WHERE v = 8 AND p = 0 AND c = 1; -- how do we know it’s 8? INSERT INTO denormalized (v, p, c) VALUES (9, 0, 1);
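The read-before-write problem from the slides above can be sketched in a few lines of Python. This is a toy in-memory model, not Scylla code: the two containers stand in for `base_table` and `denormalized`, and `update_v` is a hypothetical helper name chosen for illustration.

```python
# Toy in-memory model of manual denormalization (illustration only).
# base maps (p, c) -> v; denormalized is a set of (v, p, c) rows.
base = {(0, 1): 8}
denormalized = {(8, 0, 1)}


def update_v(p, c, new_v):
    """Set v for base row (p, c) and keep the denormalized copy in sync."""
    # Read-before-write: the old v is part of the denormalized row's key,
    # so it must be read from the base table before the stale row can be removed.
    old_v = base.get((p, c))
    if old_v is not None:
        denormalized.discard((old_v, p, c))
    base[(p, c)] = new_v
    denormalized.add((new_v, p, c))


update_v(0, 1, 9)
print(base)          # {(0, 1): 9}
print(denormalized)  # {(9, 0, 1)}
```

Without the initial read, the writer would have no way of knowing that the stale denormalized row is keyed by `v = 8`.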
  • 15. Materialized Views ▪ Let Scylla denormalize a table for you • view updates are generated automatically and transparently • read-before-write is performed when needed • useful statistics are exposed
  • 16. Materialized Views CREATE TABLE base_table ( p int, c int, v int, PRIMARY KEY (p, c) ); p | c | v ---+---+--- 0 | 1 | 8 CREATE MATERIALIZED VIEW view_table AS SELECT * FROM base_table WHERE v IS NOT NULL AND p IS NOT NULL AND c IS NOT NULL PRIMARY KEY(v, p, c); v | p | c ---+---+--- 8 | 0 | 1
  • 17. Materialized Views ▪ Materialized view’s partition key can be a subset of primary key parts and/or a regular column • currently limited to a single regular column • the whole base primary key must be included in view’s primary key • all primary key fields must be restricted with IS NOT NULL ▪ Each table is allowed to have multiple views
  • 18. More examples CREATE TABLE base_table ( p int, c int, v1 int, v2 int, v3 int, v4 int, v5 int, PRIMARY KEY (p, c) ); p | c | v1 | v2 | v3 | v4 | v5 ---+---+----+----+----+----+---- 0 | 1 | 8 | 9 | 10 | 11 | 12 CREATE MATERIALIZED VIEW view_table AS SELECT c, p FROM base_table WHERE c IS NOT NULL AND p IS NOT NULL PRIMARY KEY(c, p); c | p ---+--- 1 | 0
  • 19. More examples CREATE TABLE base_table ( p1 int, p2 int, c1 int, c2 int, v1 int, v2 int, v3 int, PRIMARY KEY ((p1, p2), c1, c2) ); p1 | p2 | c1 | c2 | v1 | v2 | v3 ----+----+----+----+----+----+---- 0 | 1 | 2 | 3 | 8 | 9 | 10 CREATE MATERIALIZED VIEW view_table AS SELECT c2, p1, p2, c1, v2 FROM base_table WHERE c2 IS NOT NULL AND p1 IS NOT NULL AND p2 IS NOT NULL AND c1 IS NOT NULL PRIMARY KEY(c2, p1, p2, c1); c2 | p1 | p2 | c1 | v2 ----+----+----+----+---- 3 | 0 | 1 | 2 | 9
  • 20. Challenges ▪ View rows must be eventually consistent with their base counterparts • all updates should be propagated - inserts, updates, deletes • updates should not be lost in case of temporary failures/restarts ▪ Cluster must not be overloaded with mv updates - backpressure • each base write may trigger multiple independent updates • so can streaming ▪ Views created on an existing table should fill themselves with existing base data - view building
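  • 21. Consistency - synchronous model [sequence diagram; participants: C, B, V1, V2; messages: w(p: 1, v: 10), d(v: 5), w(v: 10, p: 1); replies: r: Ok, r: Ok]
  • 22. Consistency - asynchronous model [sequence diagram; participants: C, B, V1, V2; messages: w(p: 1, v: 10), d(v: 5), w(v: 10, p: 1); replies: r: Ok, r: Ok]
  • 23. Consistency [sequence diagram; participants: C, B, V1, V2; messages: w(p: 1, v: 10), d(v: 5), w(v: 10, p: 1); replies: r: Ok, r: Ok]
  • 24. Consistency [sequence diagram; participants: C, B, V1, V2; messages: w(p: 1, v: 10), d(v: 5), w(v: 10, p: 1); replies: r: Ok, r: Ok]
  • 25. Consistency [sequence diagram; participants: C, B, V1, V2; messages: w(p: 1, v: 10), d(v: 5), w(v: 10, p: 1); replies: r: Ok, r: Ok] solution: hinted handoff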
  • 26. Hinted handoff for materialized views ▪ Failed updates are stored on base node as hints ▪ They will be resent once the paired node is available
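The hinted handoff idea can be illustrated with a toy Python model. It is a sketch under stated assumptions, not Scylla's implementation: `ViewReplica`, `BaseReplica`, `send_view_update` and `replay_hints` are hypothetical names, hints are kept in memory rather than on disk, and the replay is triggered by hand.

```python
# Toy model of hinted handoff for view updates (illustration only).
from collections import defaultdict, deque


class ViewReplica:
    def __init__(self):
        self.available = True
        self.rows = set()

    def apply(self, update):
        if not self.available:
            raise ConnectionError("view replica is down")
        op, row = update
        if op == "insert":
            self.rows.add(row)
        else:  # "delete"
            self.rows.discard(row)


class BaseReplica:
    def __init__(self):
        # Hints are kept per paired view replica, in arrival order.
        self.hints = defaultdict(deque)

    def send_view_update(self, name, replica, update):
        try:
            replica.apply(update)
        except ConnectionError:
            # The update failed: store it as a hint instead of dropping it.
            self.hints[name].append(update)

    def replay_hints(self, name, replica):
        queue = self.hints[name]
        while queue and replica.available:
            replica.apply(queue[0])  # resend the oldest hint first
            queue.popleft()          # forget it only after it was applied


base = BaseReplica()
view = ViewReplica()
view.rows.add((5, 1))                 # stale view row (v=5, p=1)
view.available = False                # paired view replica goes down
base.send_view_update("V1", view, ("delete", (5, 1)))
base.send_view_update("V1", view, ("insert", (10, 1)))
view.available = True                 # replica is back
base.replay_hints("V1", view)
print(view.rows)  # {(10, 1)}
```

Replaying hints in arrival order keeps the delete of the old view row ahead of the insert of the new one.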
  • 27. View building ▪ Views created on existing tables will be incrementally built from existing data ▪ Progress can be tracked via system tables: • system.views_builds_in_progress • system.built_views
  • 28. Backpressure ▪ a single user write can trigger multiple mv updates ▪ backpressure prevents overloading the cluster with them • base replicas report their load to the coordinator • coordinator is allowed to delay serving new user writes to lower the pressure ▪ there’s a whole presentation about the topic by Nadav Har’El. Public design document: https://docs.google.com/document/d/1J6GeLBvN8_c3SbLVp8YsOXHcLc9nOLlRY7pC6MH3JWo
  • 29. Streaming ▪ Efficient way of sending data from one node to another ▪ Moves data directly to sstables of the target node, bypassing the full write path ▪ Used under the hood of several cluster operations, e.g.: • bootstrap • repair • rebuild
  • 30. Streaming ▪ In some cases, streamed data should generate materialized view updates to ensure consistency • unconditionally during node repair • when the view is not yet completely built ▪ Affected sstables are stored and used to generate MV updates
  • 32. Before Secondary Indexes ▪ Searching on non-partition columns • full table scan + client-side filtering • schema redesign + manual denormalization • using materialized views
  • 33. Secondary Indexes Global ▪ based on materialized views ▪ reading - scalable ▪ writing - distributed ▪ low cardinality = wide partitions ▪ high cardinality = no problem Local ▪ require custom code ▪ reading - doesn’t scale ▪ writing - fast, local operation ▪ low cardinality = wide local partitions ▪ high cardinality = too many lookups
  • 34. Secondary Indexes CREATE TABLE base_table ( p int, c int, v int, PRIMARY KEY (p, c) ); p | c | v ---+---+--- 0 | 1 | 8 CREATE INDEX ON base_table(v); v | token | p | c ---+-------+---+--- 8 | 0x123 | 0 | 1 CREATE INDEX ON base_table(c); c | token | p ---+-------+--- 1 | 0x123 | 0
  • 35. Secondary Indexes CREATE TABLE base_table ( p int, c int, v int, PRIMARY KEY (p, c) ); p | c | v ---+---+--- 0 | 1 | 8 SELECT * FROM base_table WHERE v = 8; SELECT * FROM base_table WHERE c = 1; CREATE INDEX ON base_table(v); v | token | p | c ---+-------+---+--- 8 | 0x123 | 0 | 1 CREATE INDEX ON base_table(c); c | token | p ---+-------+--- 1 | 0x123 | 0
  • 37. Global secondary indexes ▪ Receive a query that may need indexes ▪ Check whether a matching index exists ▪ Execute the index query, retrieve matching base primary keys ▪ Execute the base query using mentioned primary keys ▪ Return query results
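The two-step lookup listed above can be sketched as a toy Python model. It is an illustration only: the dictionaries stand in for the base table and the index table, `select_where_v` is a hypothetical name, and the token column that the real index table stores is left out for brevity.

```python
# Toy model of a global secondary index lookup (illustration only).
base_table = {
    (0, 1): {"v": 8},
    (0, 2): {"v": 3},
    (4, 1): {"v": 8},
}

# Index on v: indexed value -> list of base primary keys (p, c).
index_on_v = {}
for (p, c), row in base_table.items():
    index_on_v.setdefault(row["v"], []).append((p, c))


def select_where_v(value):
    # Step 1: query the index table for the matching base primary keys.
    keys = index_on_v.get(value, [])
    # Step 2: query the base table using those primary keys.
    return [{"p": p, "c": c, **base_table[(p, c)]} for (p, c) in keys]


print(select_where_v(8))
# [{'p': 0, 'c': 1, 'v': 8}, {'p': 4, 'c': 1, 'v': 8}]
```

The client only ever names `base_table`; the index lookup happens behind the scenes, at the cost of a second read.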
  • 38. Secondary index paging ▪ Rows in the index table are small • only the indexed column, base primary keys and token are stored, all with size limits • it’s near impossible for 100 index rows to hit the query size limit ▪ Base rows may be much bigger • even a single row may exceed the query size limit • not to mention 100 of them
  • 41. Secondary Indexes vs Materialized Views Secondary indexes ▪ transparent - the same table is used for querying ▪ may be more efficient with storage ▪ creating/deleting them is easier and more straightforward ▪ can cooperate with filtering ▪ uses 2-step query to join results Materialized views ▪ querying doesn’t involve two steps, which influences performance ▪ more flexible with primary keys and complicated schemas ▪ denormalizes existing data
  • 43. Filtering > SELECT * FROM base_table WHERE v = 8; Cannot execute this query as it might involve data filtering and thus may have unpredictable performance. If you want to execute this query despite the performance unpredictability, use ALLOW FILTERING.
  • 44. Filtering ▪ Query restrictions that may need filtering • non-key fields (WHERE v = 1) • parts of primary keys that are not prefixes (WHERE pk = 1 and c2 = 3) • partition keys with something other than an equality relation (WHERE pk >= 1) • clustering keys with a range restriction and then by other conditions (WHERE pk = 1 and c1 > 2 and c2 = 3)
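The four cases above can be expressed as a small Python check. This is a deliberately simplified sketch, not Scylla's actual rule set: `needs_filtering` is a hypothetical function, it ignores secondary indexes, and it models only equality (`=`) and range operators.

```python
# Simplified check for the four cases on the slide (illustration only).
def needs_filtering(restrictions, partition_key, clustering_key):
    """restrictions maps column name -> operator, e.g. {"pk": "=", "c1": ">"}."""
    key_columns = set(partition_key) | set(clustering_key)

    # Case 1: a restriction on a non-key column needs filtering.
    if any(col not in key_columns for col in restrictions):
        return True

    # Case 3: partition key columns may only be restricted by equality,
    # and (assumption) either all of them are restricted or none.
    pk_ops = [restrictions.get(col) for col in partition_key]
    if any(op not in (None, "=") for op in pk_ops):
        return True
    restricted = [op for op in pk_ops if op is not None]
    if restricted and len(restricted) != len(pk_ops):
        return True
    prefix_open = len(restricted) == len(pk_ops)

    # Cases 2 and 4: clustering columns must form a prefix of equalities,
    # optionally ending with a single range restriction.
    for col in clustering_key:
        op = restrictions.get(col)
        if op is None:
            prefix_open = False   # gap: later columns are not a prefix
        elif not prefix_open:
            return True           # restricted after a gap or after a range
        elif op != "=":
            prefix_open = False   # range: nothing may follow it
    return False


pk, ck = ["pk"], ["c1", "c2"]
print(needs_filtering({"v": "="}, pk, ck))                          # True
print(needs_filtering({"pk": "=", "c2": "="}, pk, ck))              # True
print(needs_filtering({"pk": ">="}, pk, ck))                        # True
print(needs_filtering({"pk": "=", "c1": ">", "c2": "="}, pk, ck))   # True
print(needs_filtering({"pk": "=", "c1": "=", "c2": ">"}, pk, ck))   # False
```

The last call shows the shape that needs no filtering: the whole partition key by equality, then a clustering key prefix that ends in at most one range.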
  • 45. Coordinator-side filtering ▪ Coordinator node retrieves all data from nodes ▪ Filtering is applied ▪ Only matching rows are returned to the client ▪ Easily extensible with optimizations (pre-filtering on data nodes can be implemented and added)
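A minimal Python sketch of coordinator-side filtering follows. It is a toy model, not Scylla code: each inner list stands in for the rows held by one data node, and `coordinator_select` is a hypothetical name.

```python
# Toy model of coordinator-side filtering (illustration only).
replicas = [
    [{"p": 0, "c": 1, "v": 8}, {"p": 0, "c": 2, "v": 3}],
    [{"p": 4, "c": 1, "v": 8}],
]


def coordinator_select(replicas, predicate):
    rows_read = 0
    matching = []
    for replica_rows in replicas:
        for row in replica_rows:   # every row travels to the coordinator
            rows_read += 1
            if predicate(row):     # filtering is applied on the coordinator
                matching.append(row)
    return matching, rows_read     # only matching rows go to the client


rows, rows_read = coordinator_select(replicas, lambda r: r["v"] == 8)
print(len(rows), "of", rows_read, "rows returned")  # 2 of 3 rows returned
```

The ratio of rows returned to rows read shows how much of the work was useful: the more rows a query discards, the more expensive filtering is relative to its result.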
  • 46. Query selectivity Low selectivity queries: ▪ return almost all rows (e.g. 70%) High selectivity queries: ▪ return only a few rows (e.g. 1)
  • 47. Query selectivity Low selectivity queries: ▪ return almost all rows (e.g. 70%) ▪ good candidate for filtering High selectivity queries: ▪ return only a few rows (e.g. 1) ▪ bad candidate for filtering
  • 48. Filtering alternatives ▪ Materialized views and their alternatives ▪ Secondary indexes and their alternatives
  • 50. Combining filtering with indexes CREATE TABLE t (p int, c1 int, c2 int, v1 int, v2 int, PRIMARY KEY(p, c1, c2)); CREATE INDEX ON t(c2); ▪ SELECT * FROM t WHERE c2 = 3 and v1 = 1 and v2 = 3 ALLOW FILTERING;
  • 51. Combining filtering with indexes CREATE TABLE t (p int, c1 int, c2 int, v1 int, v2 int, PRIMARY KEY(p, c1, c2)); CREATE INDEX ON t(c2); ▪ SELECT * FROM t WHERE c2 = 3 and v1 = 1 and v2 = 3 ALLOW FILTERING; • extract rows using index on c2 • filter rows that match (v1 == 1 and v2 == 3)
  • 52. Multiple indexing CREATE TABLE t (p int, c1 int, c2 int, v1 int, v2 int, PRIMARY KEY(p, c1, c2)); CREATE INDEX ON t(c2); CREATE INDEX ON t(v1); ▪ SELECT * FROM t WHERE c2 = 3 and v1 = 1 and v2 = 3 ALLOW FILTERING;
  • 53. Multiple indexing CREATE TABLE t (p int, c1 int, c2 int, v1 int, v2 int, PRIMARY KEY(p, c1, c2)); CREATE INDEX ON t(c2); CREATE INDEX ON t(v1); ▪ SELECT * FROM t WHERE c2 = 3 and v1 = 1 and v2 = 3 ALLOW FILTERING; • extract rows using index on c2 • filter rows that match (v1 == 1 and v2 == 3)
  • 54. Key prefix optimizations CREATE TABLE t (p int, c1 int, c2 int, v1 int, v2 int, PRIMARY KEY(p, c1, c2)); CREATE INDEX ON t(c2); CREATE INDEX ON t(v1); ▪ SELECT * FROM t WHERE p = 0 and c1 = 1 and v2 = 7 ALLOW FILTERING; • extract rows only from partition p=0 and sliced by c1=1 • filter rows that match (v2 = 7)
  • 55. Key prefix optimizations CREATE TABLE t (p int, c1 int, c2 int, v1 int, v2 int, PRIMARY KEY(p, c1, c2)); CREATE INDEX ON t(c2); CREATE INDEX ON t(v1); ▪ SELECT * FROM t WHERE p = 0 and c1 = 1 and v1 = 7 ALLOW FILTERING; • extract rows from index v1, including p=0 and c1=1 in index query restrictions • no filtering needed!
  • 56. Future: selectivity statistics Having selectivity statistics for every index would help with: ▪ identifying data model problems • was indexing the right choice for the use case? Would filtering fit better? ▪ choosing the best index to query from in multiple index queries • one index is used to retrieve results from the base replica • remaining restrictions are filtered • which combination is the best?
  • 58. Conclusions ▪ As of 3.0, the following features are going GA: • materialized views • secondary indexes • filtering support
  • 59. Future plans ▪ MV repair ▪ optimized multi-index support ▪ add replica-side filtering optimizations ▪ as always - optimize even further
  • 60. Thank You Any Questions? Please stay in touch sarna@scylladb.com