SlideShare a Scribd company logo
Finding Logic Bugs in Database
Management Systems
Manuel Rigger
ETH Zurich, Switzerland
13/06/2020
@RiggerManuel @ast_eth https://people.inf.ethz.ch/suz/
2
Goal
Aim: Detect logic bugs in DBMS
3
Logic Bugs
DatabaseDBMS
✓
row1 φ
row2 φ
row1 φ
row2 φ
row3 ¬φ
SELECT *
FROM …
WHERE φ
4
Logic Bugs
DatabaseDBMS
 Logic Bugs are bugs that cause the DBMS
to return an incorrect result set
row1 φ
row2 φ
row3 ¬φ
row1 φ
SELECT *
FROM …
WHERE φ
5
Logic Bugs
DatabaseDBMS
 Might go unnoticed by
users (unlike crash bugs)
row1 φ
row2 φ
row3 ¬φ
row1 φ
SELECT *
FROM …
WHERE φ
6
Is the Problem Solved?
SQLite (~150,000 LOC) has 662 times
as much test code as source code
SQLite is extensively fuzzed (e.g., by
Google’s OS-Fuzz Project)
SQLite’s test cases achieve
100% branch test coverage
Anomaly testing (out-of-memory,
I/O error, power failures)
https://www.sqlite.org/testing.html
7
Tested Database Management Systems
PostgreSQL
Over 400 bugs in popular and
widely-used DBMS!
8
Bug-Hunting Challenge
22 bugs were classified as P1, and
6 as P2 → over 100 T-shirts!
https://pingcap.com/community-cn/tidb-bug-hunting/
9
Automatic Testing Core Challenges
1. Effective test case
Generate a
Database
Generate a
Query
Validate the
Query’s Result
10
Automatic Testing Core Challenges
1. Effective test case
Generate a
Database
Generate a
Query
Validate the
Query’s Result
2. Test oracle
11
Automatic Testing Core Challenges
1. Effective test case
Generate a
Database
Generate a
Query
Validate the
Query’s Result
2. Test oracle
Query and database generation
is not the focus of this talk
12
Heuristic Database Generation
Lower and upper limits
for all statements
CREATE TABLE t0(c0 INT UNIQUE);
CREATE TABLE t1(c0 INT, c1 TEXT);
Generate
Tables
Select Other
Actions
13
Heuristic Database Generation
INSERT INTO t1(c0, c1) VALUES (0, '');
UPDATE t1 SET c0 = 3;
CREATE INDEX i0 ON t1(c0, c1) WHERE c0 > 0;
INSERT INTO t0(c0) VALUES (0), (0);
CREATE TABLE t0(c0 INT UNIQUE);
CREATE TABLE t1(c0 INT, c1 TEXT);
Generate
Tables
Select Other
Actions
14
Heuristic Database Generation
INSERT INTO t1(c0, c1) VALUES (0, '');
UPDATE t1 SET c0 = 3;
CREATE INDEX i0 ON t1(c0, c1) WHERE c0 > 0;
INSERT INTO t0(c0) VALUES (0), (0);
CREATE TABLE t0(c0 INT UNIQUE);
CREATE TABLE t1(c0 INT, c1 TEXT);
Statements can fail or be redundant
Generate
Tables
Select Other
Actions
15
Random Expression Generation
IS
NOT
t0.c0 1
t0.c0 IS NOT 1;
We randomly generate
expressions used in
WHERE and other clauses
16
Automatic Testing Core Challenges
1. Effective test case
Generate a
Database
Generate a
Query
Validate the
Query’s Result
2. Test oracle
The focus of this talk are test
oracles for finding logic bugs!
17
Test Oracle
Test oracle: For a given input, determines
whether the system works as expected
18
Fuzzing
DatabaseDBMSSQLsmith
SELECT …
SEGMENTATION FAULT
Fuzzing cannot find logic bugs
19
Differential Testing
Query
Generator
RS1 = RS2 = RS3?
RS1
RS2
RS3
20
Differential Testing: RAGS (Slutz 1998)
“[Differential testing] proved to be extremely useful,
but only for the small set of common SQL”
21
Problem 1: Different SQL Dialects
DBMS-
specific SQL
Common
SQL Core
“We are unable to use Postgres as an
oracle because CockroachDB has slightly
different semantics and SQL support, and
generating queries that execute identically
on both is tricky […].” – Cockroach Labs
22
Problem 2: No Ground Truth
https://github.com/pingcap/tidb/issues/15743
DBMS are affected
by common bugs
https://github.com/pingcap/tidb/issues/15743
23
Other approaches
• Solver-based (Khalek et al. 2008, 2010)
• Performance (Jung et al. 2019)
• Estimated cost accuracy (Gu et al. 2012)
24
Automatic Testing Core Challenges
1. Effective test case
Generate a
Database
Generate a
Query
Validate the
Query’s Result
2. Test oracle
The problem of testing DBMS to find logic
bugs has not yet been well addressed
25
Overview
Pivoted Query
Synthesis (PQS)
Non-optimizing
Reference Engine
Construction
(NoREC)
Ternary Logic
Query Partitioning
(TLP)
26
Overview
Pivoted Query
Synthesis (PQS)
Non-optimizing
Reference Engine
Construction
(NoREC)
Ternary Logic
Query Partitioning
(TLP)
Generate a query for which it is
ensured that a randomly-selected
row, the pivot row, is fetched
~100
bugs
27
Overview
Pivoted Query
Synthesis (PQS)
Non-optimizing
Reference Engine
Construction
(NoREC)
Ternary Logic
Query Partitioning
(TLP)
Detecting optimization bugs by
rewriting the query so that it
cannot be optimized
>150
bugs
28
Overview
Pivoted Query
Synthesis (PQS)
Non-optimizing
Reference Engine
Construction
(NoREC)
Ternary Logic
Query Partitioning
(TLP)
Partition the query into
several partitioning queries,
which are then composed
>150
bugs
29
SQLancer (Synthesized Query Lancer)
https://github.com/sqlancer/
30
go-sqlancer by PingCAP
https://github.com/chaos-mesh/go-sqlancer
31
Non-optimizing
Reference Engine
Construction
(NoREC)
Non-optimizing Reference Engine Construction
32
Goal
Easy-to-realize technique that can
find bugs in the query optimizer
33
Optimization Bugs
Database
row1 φ
Optimizer
row1 φ
row2 φ
row3 ¬φ
SELECT *
FROM …
WHERE φ
DBMS
34
Optimization Bugs

row1
Optimizer
Optimizer
row1
row2
-O3
-O0
DBMS provide limited control
over optimizations
≠
SELECT *
FROM …
WHERE φ
DBMS
DBMS
35
Idea
Idea: Rewrite the query so that
the DBMS cannot optimize it
36
Idea
Query
Generator
Optimizer
Optimizer
Optimized
Query
Translation
Step
Unoptimized
Query

row1
row1
row2
≠
It is unobvious how the
translation step should look like
DBMS
DBMS
37
Idea
Query
Generator
Optimizer
Optimizer
Optimized
Query
Translation
Step
Unoptimized
Query

row1
row1
row2
≠
Insight: Checking the number of rows that are
fetched is sufficient to detect most bugs
DBMS
DBMS
38
Translation Step
SELECT *
FROM …
WHERE φ
row1 φ
row2 φ
SELECT φ
FROM …
TRUE φ
TRUE φ
FALSE ¬φ
φ evaluates
to TRUE for
two rows
φ evaluates
to TRUE for
two rows
=
✓
Optimized
Query
Unoptimized
Query
DBMS
DBMS
39
Translation Step
SELECT *
FROM …
WHERE φ
row1 φ
SELECT φ
FROM …
TRUE φ
TRUE φ
FALSE ¬φ
φ evaluates
to TRUE for
one row
φ evaluates
to TRUE for
two rows
≠

Optimized
Query
Unoptimized
Query
DBMS
DBMS
40
Translation Step
SELECT *
FROM …
WHERE φ
SELECT φ
FROM …
Optimized
Query
Unoptimized
Query
The result set contains
the original rows
The result set contains
TRUE and FALSE values
41
Translation Step
SELECT *
FROM …
WHERE φ
SELECT φ
FROM …
Optimized
Query
Unoptimized
Query
Most optimization aim to reduce
the amount of data that
is processed (e.g., using indexes)
42
Translation Step
SELECT *
FROM …
WHERE φ
SELECT φ
FROM …
Optimized
Query
Unoptimized
Query
The predicate must be
evaluated on every row
43
Translation Step
The translated query
cannot be efficiently
optimized by the DBMS
SELECT *
FROM …
WHERE φ
SELECT φ
FROM …
Optimized
Query
Unoptimized
Query
The predicate must be
evaluated on every row
44
Counting Implementation
SELECT COUNT(*)
FROM …
WHERE φ
SELECT SUM(count) FROM (
SELECT φ IS TRUE
as count
FROM <tables>
);
Optimized
Query
Unoptimized
Query
1
DBMS
DBMS
2
≠

45
Example SQLite3
Optimized
Query
Unoptimized
Query
CREATE TABLE t0(c0 UNIQUE);
INSERT INTO t0 VALUES (-1) ;
SELECT *
FROM t0
WHERE t0.c0 GLOB '-*';
SELECT t0.c0 GLOB '-*'
FROM t0;
c0
-1
t0
TRUE
{}
Result
should
contain
one row

46
Evaluation: Found Bugs
Closed
DBMS Fixed Verified Intended Duplicate
SQLite 110 0 6 0
MariaDB 1 5 0 1
PostgreSQL 5 2 1 0
CockroachDB 28 7 0 1
We found 159 bugs, 141 of
which have been fixed!
47
Evaluation: Found Bugs
Closed
DBMS Fixed Verified Intended Duplicate
SQLite 110 0 6 0
MariaDB 1 5 0 1
PostgreSQL 5 2 1 0
CockroachDB 28 7 0 1
We concentrated on testing SQLite
48
Evaluation: Test Oracles
Crash
DBMS Logic Error Release Debug
SQLite 39 30 15 26
MariaDB 5 0 1 0
PostgreSQL 0 4 2 1
CockroachDB 7 24 4 0
49
Evaluation: Test Oracles
Crash
DBMS Logic Error Release Debug
SQLite 39 30 15 26
MariaDB 5 0 1 0
PostgreSQL 0 4 2 1
CockroachDB 7 24 4 0
We found 51 optimization bugs!
50
Evaluation: Test Oracles
Crash
DBMS Logic Error Release Debug
SQLite 39 30 15 26
MariaDB 5 0 1 0
PostgreSQL 0 4 2 1
CockroachDB 7 24 4 0
Error and crash bugs seem to be
more common/easier to find
51
Discussion and Questions
52
Ternary Logic
Partitioning (TLP)
Ternary Logic Partitioning (TLP)
53
Challenge
NoREC is mostly limited to testing
WHERE clauses
54
Query Partitioning
Query
Generator
Q
RS(Q)
55
Query Partitioning
Query
Generator
Q
RS(Q)
Q denotes the (randomly-generated) original query
56
Query Partitioning
Query
Generator
Q
RS(Q)
RS(Q) denotes the original query’s result set
57
Query Partitioning
Query
Generator
Q
RS(Q)
RS(Q’1)
RS(Q’2)
RS(Q’3)
RS(Q’n)
Q’1
Q’2
Q’3
Q’n
Partition the result set
58
Query Partitioning
Query
Generator
Q
RS(Q) RS(Q’)
RS(Q’1)
RS(Q’2)
RS(Q’3)
RS(Q’n)
Q’1
Q’2
Q’3
Q’n
♢
Q’
Combine the results so that
RS(Q)=RS(Q’)
59
Query Partitioning
Query
Generator
Q
RS(Q) RS(Q’)
RS(Q’1)
RS(Q’2)
RS(Q’3)
RS(Q’n)
Q’1
Q’2
Q’3
Q’n
♢
Q’
✓
=
60
Query Partitioning
Query
Generator
Q
RS(Q) RS(Q’)
RS(Q’1)
RS(Q’2)
RS(Q’3)
RS(Q’n)
Q’1
Q’2
Q’3
Q’n
♢
Q’
≠

61
Query Partitioning
Query
Generator
Q
RS(Q) RS(Q’)
RS(Q’1)
RS(Q’2)
RS(Q’3)
RS(Q’n)
Q’1
Q’2
Q’3
Q’n
♢
Q’
Partitioning queries
62
Query Partitioning
Query
Generator
Q
RS(Q) RS(Q’)
RS(Q’1)
RS(Q’2)
RS(Q’3)
RS(Q’n)
Q’1
Q’2
Q’3
Q’n
♢
Q’
Partitions
63
Query Partitioning
Query
Generator
Q
RS(Q) RS(Q’)
RS(Q’1)
RS(Q’2)
RS(Q’3)
RS(Q’n)
Q’1
Q’2
Q’3
Q’n
♢
Q’
Composition operator
64
How to Realize This Idea?
Key challenge: find a valid partitioning
strategy that stresses the DBMS
65
Ternary Logic
Consider a predicate φ and a given row r. Exactly one of the following
must hold:
• φ
• NOT φ
• φ IS NULL
ternary predicate variants
66
Ternary Logic
Consider a predicate φ and a given row r. Exactly one of the following
must hold:
• φ
• NOT φ
• φ IS NULL
φ
NOT φ
φ IS NULL
67
Example: MySQL
c0
0
c0
-0
t0 t1
SELECT * FROM t0, t1
WHERE t0.c0 = t1.c0;

How did we find
this bug using TLP? t0.c0 t0.c1
68
Example: MySQL
t0.c0 t0.c1
0 -0
SELECT * FROM t0, t1 WHERE t0.c0=t1.c0
UNION ALL
SELECT * FROM t0, t1 WHERE NOT (t0.c0=t1.c0)
UNION ALL
SELECT * FROM t0, t1 WHERE (t0.c0=t1.c0) IS NULL;
SELECT * FROM t0, t1;
t0.c0 t0.c1
≠
69
Ternary Logic Partitioning
Query
Generator
Q
RS(Q) RS(Q’)
Q’p
Q’p IS NULL
♢
Q’
Q ¬p
RS(Q’p)
RS(Q ¬p)
RS(Q’p IS NULL)
SELECT * FROM t0, t1;
70
Ternary Logic Partitioning
Query
Generator
Q
RS(Q) RS(Q’)
Q’p
Q’p IS NULL
♢
Q’
Q ¬p
RS(Q’p)
RS(Q ¬p)
RS(Q’p IS NULL)
SELECT * FROM t0, t1 WHERE t0.c0=t1.c0
71
Ternary Logic Partitioning
Query
Generator
Q
RS(Q) RS(Q’)
Q’p
Q’p IS NULL
♢
Q’
Q ¬p
RS(Q’p)
RS(Q ¬p)
RS(Q’p IS NULL)
SELECT * FROM t0, t1 WHERE NOT (t0.c0=t1.c0)
72
Ternary Logic Partitioning
Query
Generator
Q
RS(Q) RS(Q’)
Q’p
Q’p IS NULL
♢
Q’
Q ¬p
RS(Q’p)
RS(Q ¬p)
RS(Q’p IS NULL)
SELECT * FROM t0, t1 WHERE (t0.c0=t1.c0) IS NULL;
73
Ternary Logic Partitioning
Query
Generator
Q
RS(Q) RS(Q’)
Q’p
Q’p IS NULL
♢
Q’
Q ¬p
RS(Q’p)
RS(Q ¬p)
RS(Q’p IS NULL)
UNION ALL
74
Scope
• WHERE
• GROUP BY
• HAVING
• DISTINCT queries
• Aggregate functions
75
Testing WHERE Clauses
Q Q’ptern ♢(Q’p, Q’¬p, Q’p IS NULL)
SELECT <columns>
FROM <tables>
[<joins>]
SELECT <columns>
FROM <tables>
[<joins>]
WHERE ptern
Q′p ⊎ Q′¬p ⊎ Q′p IS NULL
76
Testing WHERE Clauses
Q Q’ptern ♢(Q’p, Q’¬p, Q’p IS NULL)
SELECT <columns>
FROM <tables>
[<joins>]
SELECT <columns>
FROM <tables>
[<joins>]
WHERE ptern
Q′p ⊎ Q′¬p ⊎ Q′p IS NULL
The multiset addition can be
implemented using UNION ALL
77
Testing HAVING Clauses
Q Q’ptern ♢(Q’p, Q’¬p, Q’p IS NULL)
SELECT <columns>
FROM <tables>
[<joins>]
[WHERE …]
[GROUP BY …]
SELECT <columns>
FROM <tables>
[<joins>]
[WHERE …]
[GROUP BY …]
HAVING ptern
Q′p ⊎ Q′¬p ⊎ Q′p IS NULL
78
Testing DISTINCT Clauses
Q Q’ptern ♢(Q’p, Q’¬p, Q’p IS NULL)
SELECT DISTINCT
<columns>
FROM <tables>
[<joins>]
SELECT [DISTINCT]
<columns>
FROM <tables>
[<joins>]
WHERE ptern
Q′p ∪ Q′¬p ∪ Q′p IS NULL
79
Testing DISTINCT Clauses
Q Q’ptern ♢(Q’p, Q’¬p, Q’p IS NULL)
SELECT DISTINCT
<columns>
FROM <tables>
[<joins>]
SELECT [DISTINCT]
<columns>
FROM <tables>
[<joins>]
WHERE ptern;
Q′p ∪ Q′¬p ∪ Q′p IS NULL
The set union can be
implemented using UNION
80
Testing GROUP BY Clauses
Q Q’ptern ♢(Q’p, Q’¬p, Q’p IS NULL)
SELECT <columns>
FROM <tables>
[<joins>]
GROUP BY <columns>
SELECT <columns>
FROM <tables>
[<joins>]
WHERE ptern
GROUP BY <columns>
Q′p ∪ Q′¬p ∪ Q′p IS NULL
81
Testing Self-decomposable Aggregate Functions
Q Q’ptern ♢(Q’p, Q’¬p, Q’p IS NULL)
SELECT MAX(<e>)
FROM <tables>
[<joins>]
SELECT MAX(<e>)
FROM <tables>
[<joins>]
WHERE ptern;
MAX(Q′p ⊎ Q′¬p ⊎ Q′p IS NULL)
A partition is an intermediate
result, rather than
a subset of the result set
82
Bug Example: CockroachDB
SET vectorize=experimental_on;
CREATE TABLE t0(c0 INT);
CREATE TABLE t1(c0 BOOL) INTERLEAVE IN PARENT t0(rowid);
INSERT INTO t0(c0) VALUES (0);
INSERT INTO t1(rowid, c0) VALUES(0, TRUE);
NULL 0
SELECT MAX(aggr) FROM (
SELECT MAX(t1.rowid) as aggr FROM t1 WHERE '+' >= t1.c0 UNION ALL
SELECT MAX(t1.rowid) as aggr FROM t1 WHERE NOT('+' >= t1.c0) UNION ALL
SELECT MAX(t1.rowid) as aggr FROM t1 WHERE ('+' >= t1.c0) IS NULL
);
SELECT MAX(t1.rowid)
FROM t1;
≠
83
Testing Decomposable Aggregate Functions
Q Q’ptern ♢(Q’p, Q’¬p, Q’p IS NULL)
SELECT AVG(<e>)
FROM <tables>
[<joins>];
SELECT SUM(<e>) as s,
COUNT(<e>) as s
FROM <tables>
[<joins>];
SUM(s(Q′p ⊎ Q′¬p ⊎ Q′p IS NULL))
SUM(c(Q′p ⊎ Q′¬p ⊎ Q′p IS NULL))
We did not consider non-decomposable
aggregate functions (e.g., GROUP_CONCAT())
84
Evaluation: Found Bugs
Closed
DBMS Fixed Verified Intended Duplicate
SQLite 4 0 0 0
MySQL 1 6 3 0
CockroachDB 22 9 0 0
TiDB 26 35 0 1
DuckDB 71 1 0 2
We found 175 bugs, 123 of
which have been fixed!
85
Evaluation: Found Bugs
Closed
DBMS Fixed Verified Intended Duplicate
SQLite 4 0 0 0
MySQL 1 6 3 0
CockroachDB 22 9 0 0
TiDB 26 35 0 1
DuckDB 71 1 0 2
The total number of bugs
reflects our testing focus
86
Evaluation: Found Bugs
Query Partitioning Oracle
DBMS WHERE Aggregate GROUP BY HAVING DISTINCT Error Crash
SQLite 0 3 0 0 1 0 0
CockroachDB 3 3 0 1 0 22 2
TiDB 29 0 1 0 0 27 4
MySQL 7 0 0 0 0 0 0
DuckDB 21 4 1 2 1 13 19
87
Evaluation: Found Bugs
Query Partitioning Oracle
DBMS WHERE Aggregate GROUP BY HAVING DISTINCT Error Crash
SQLite 0 3 0 0 1 0 0
CockroachDB 3 3 0 1 0 22 2
TiDB 29 0 1 0 0 27 4
MySQL 7 0 0 0 0 0 0
DuckDB 21 4 1 2 1 13 19
The WHERE oracle is the simplest, but most effective oracle
88
Evaluation: Found Bugs
Query Partitioning Oracle
DBMS WHERE Aggregate GROUP BY HAVING DISTINCT Error Crash
SQLite 0 3 0 0 1 0 0
CockroachDB 3 3 0 1 0 22 2
TiDB 29 0 1 0 0 27 4
MySQL 7 0 0 0 0 0 0
DuckDB 21 4 1 2 1 13 19
The other oracles found interesting, but fewer bugs
89
Evaluation: Found Bugs
Query Partitioning Oracle
DBMS WHERE Aggregate GROUP BY HAVING DISTINCT Error Crash
SQLite 0 3 0 0 1 0 0
CockroachDB 3 3 0 1 0 22 2
TiDB 29 0 1 0 0 27 4
MySQL 7 0 0 0 0 0 0
DuckDB 21 4 1 2 1 13 19
We implemented these oracles only for DBMS
for which our bug-finding efforts saturated
90
Comparison to NoREC
TLP
• Aggregate functions,
DISTINCT, GROUP BY,
HAVING, WHERE clauses
• UNION, UNION ALL
• Bugs in (unoptimized)
JOINs and operators
NoREC
• Bugs in the unoptimized
query
91
Discussion and Questions
92
Pivoted Query
Synthesis (PQS)
Pivoted Query Synthesis (PQS)
93
Inherent Limitation of Metamorphic Testing
SQL Query Result Set
Execute
Derived SQL
Query
Derive
Metamorphic approaches cannot establish a ground truth
94
Idea: PQS
Idea: Divide-and-conquer
approach for testing DBMS
that provides a ground truth
95
Idea: PQS Automatic testing approach
considering only a single row
Column0 Column1 Column2
… … …
Valuei,0 Valuei,1 Valuei,2
… … …
Pivot Row
96
Idea: PQS
Generate a
Database
Generate a
Query
Validate the
Query’s Result
Generate query that is
guaranteed to fetch the pivot row
97
Idea: PQS
Generate a
Database
Generate a
Query
Validate the
Query’s Result
Validate that the pivot row is
contained in the result set
98
Intuition
• Simpler conceptually and implementation-wise
• Similar effectiveness as checking all rows
Column0 Column1 Column2
… … …
Valuei,0 Valuei,1 Valuei,2
… … …
✓
✓
✓
SELECT * FROM <tables>
WHERE φ
99
Approach
Randomly
Generate
Database
Select
Pivot Row
Generate
Query for the
Pivot Row
Verify that
the Pivot Row
is contained
100
Approach
Randomly
Generate
Database
Select
Pivot Row
Generate
Query for the
Pivot Row
Verify that
the Pivot Row
is contained
101
Approach
Randomly
Generate
Database
Select
Pivot Row
Generate
Query for the
Pivot Row
Verify that
the Pivot Row
is contained
102
Approach
Randomly
Generate
Database
Select
Pivot Row
Generate
Query for the
Pivot Row
Verify that
the Pivot Row
is contained
103
Example: SQLite3 Bug
c0
0
1
2
3
NULL
t0
CREATE TABLE t0(c0);
CREATE INDEX i0 ON t0(1) WHERE c0 NOT NULL;
INSERT INTO t0 (c0) VALUES (0), (1), (2), (3), (NULL);
Pivot row
104
Example: SQLite3 Bug
c0
0
1
2
3
NULL
t0
CREATE TABLE t0(c0);
CREATE INDEX i0 ON t0(1) WHERE c0 NOT NULL;
INSERT INTO t0 (c0) VALUES (0), (1), (2), (3), (NULL);
SELECT c0 FROM t0
WHERE t0.c0 IS NOT 1;
Expected to fetch the pivot row
105
Example: SQLite3 Bug
c0
0
1
2
3
NULL
t0
CREATE TABLE t0(c0);
CREATE INDEX i0 ON t0(1) WHERE c0 NOT NULL;
INSERT INTO t0 (c0) VALUES (0), (1), (2), (3), (NULL);
SELECT c0 FROM t0
WHERE t0.c0 IS NOT 1;
0
2
3
The pivot row is not contained in the result set!

106
Approach
Randomly
Generate
Database
Select
Pivot Row
Generate
Query for the
Pivot Row
Verify that
the Pivot Row
is contained
How do we generate this query?
107
How do we Generate Queries?
Generate a predicate that yields
TRUE for the pivot row
SELECT c0 FROM t0
WHERE
108
Random Expression Generation
IS
NOT
t0.c0 1
t0.c0 IS NOT 1;
We implemented an
expression evaluator for
each node!
109
Random Expression Generation
IS
NOT
t0.c0 1
c0
0
1
2
3
NULL
t0
Evaluate the tree based
on the pivot row
110
Random Expression Generation
IS
NOT
t0.c0 1
c0
0
1
2
3
NULL
t0
Column references return the
values from the pivot row
NULL
111
Random Expression Generation
IS
NOT
t0.c0 1
c0
0
1
2
3
NULL
t0
Constant nodes return their
assigned literal values
NULL 1
112
Random Expression Generation
IS
NOT
t0.c0 1
c0
0
1
2
3
NULL
t0
NULL 1 Compound nodes
compute their result
based on their children
TRUE
113
t0.c0 IS NOT 1;
Query Synthesis
SELECT c0 FROM t0
WHERE
What if the expression does
not evaluate to TRUE?
114
Random Expression Rectification
switch (result) {
case TRUE:
result = randexpr;
case FALSE:
result = NOT randexpr;
case NULL:
result = randexpr IS NULL;
}
115
Implementation Effort
• Literal evaluator
• Simpler than PL AST Interpreters → No mutable state
• Simpler than query engines → only a single row needs to be considered
• Operators are implemented naively
• The performance of the DBMS is the bottleneck
• Higher implementation effort for functions (e.g. printf) and
complex operators
116
Bugs Overview
DBMS Fixed Verified
SQLite 65 0
MySQL 15 10
PostgreSQL 5 4
99 bugs!
117
Oracles
DBMS Logic Error Crash
SQLite 46 17 2
MySQL 14 10 1
PostgreSQL 1 7 1
61 logic bugs
118
Discussion and Questions
119
Comparison
Property PQS NoREC TLP
WHERE ✓ ✓ ✓
Aggregates, HAVING, …   ✓
Ground truth ✓  
Implementation effort Moderate Very Low Low
120
Discussion: Industry Adoption
https://github.com/chaos-mesh/go-sqlancer
121
@RiggerManuel @ast_eth @sqlancer_dbmsSummary
Goal: Logic Bugs NoREC: Derive Unoptimized Query TLP: Partition Result Set
PQS: Check Pivot Row Containment Evaluation: Over 400 bugs in DBMS

More Related Content

What's hot

J unit스터디슬라이드
J unit스터디슬라이드J unit스터디슬라이드
J unit스터디슬라이드
ksain
 
Pyconie 2012
Pyconie 2012Pyconie 2012
Pyconie 2012
Yaqi Zhao
 

What's hot (15)

J unit스터디슬라이드
J unit스터디슬라이드J unit스터디슬라이드
J unit스터디슬라이드
 
Java Testing With Spock - Ken Sipe (Trexin Consulting)
Java Testing With Spock - Ken Sipe (Trexin Consulting)Java Testing With Spock - Ken Sipe (Trexin Consulting)
Java Testing With Spock - Ken Sipe (Trexin Consulting)
 
Decompiling Java - SCAM2009 Presentation
Decompiling Java - SCAM2009 PresentationDecompiling Java - SCAM2009 Presentation
Decompiling Java - SCAM2009 Presentation
 
(automatic) Testing: from business to university and back
(automatic) Testing: from business to university and back(automatic) Testing: from business to university and back
(automatic) Testing: from business to university and back
 
What is new in JUnit5
What is new in JUnit5What is new in JUnit5
What is new in JUnit5
 
TDD Training
TDD TrainingTDD Training
TDD Training
 
Testing, Learning and Professionalism — 20171214
Testing, Learning and Professionalism — 20171214Testing, Learning and Professionalism — 20171214
Testing, Learning and Professionalism — 20171214
 
PVS-Studio and static code analysis technique
PVS-Studio and static code analysis techniquePVS-Studio and static code analysis technique
PVS-Studio and static code analysis technique
 
Testing: ¿what, how, why?
Testing: ¿what, how, why?Testing: ¿what, how, why?
Testing: ¿what, how, why?
 
Pyconie 2012
Pyconie 2012Pyconie 2012
Pyconie 2012
 
Software Engineering - RS4
Software Engineering - RS4Software Engineering - RS4
Software Engineering - RS4
 
Grails unit testing
Grails unit testingGrails unit testing
Grails unit testing
 
Python testing
Python  testingPython  testing
Python testing
 
Migrating to JUnit 5
Migrating to JUnit 5Migrating to JUnit 5
Migrating to JUnit 5
 
nosymbols - defcon russia 20
nosymbols - defcon russia 20nosymbols - defcon russia 20
nosymbols - defcon russia 20
 

Similar to Finding Logic Bugs in Database Management Systems

Mutant Tests Too: The SQL
Mutant Tests Too: The SQLMutant Tests Too: The SQL
Mutant Tests Too: The SQL
DataWorks Summit
 
Рахманов Александр "Что полезного в разборе дампов для .NET-разработчиков?"
Рахманов Александр "Что полезного в разборе дампов для .NET-разработчиков?"Рахманов Александр "Что полезного в разборе дампов для .NET-разработчиков?"
Рахманов Александр "Что полезного в разборе дампов для .NET-разработчиков?"
Yulia Tsisyk
 
Optimizing Tcl Bytecode
Optimizing Tcl BytecodeOptimizing Tcl Bytecode
Optimizing Tcl Bytecode
Donal Fellows
 
Testing Zen
Testing ZenTesting Zen
Testing Zen
day
 
Scaling MySQL Strategies for Developers
Scaling MySQL Strategies for DevelopersScaling MySQL Strategies for Developers
Scaling MySQL Strategies for Developers
Jonathan Levin
 
libinjection: new technique in detecting SQLi attacks, iSEC Partners Open Forum
libinjection: new technique in detecting SQLi attacks, iSEC Partners Open Forumlibinjection: new technique in detecting SQLi attacks, iSEC Partners Open Forum
libinjection: new technique in detecting SQLi attacks, iSEC Partners Open Forum
Nick Galbreath
 
MySQL Goes to 8! FOSDEM 2020 Database Track, January 2nd, 2020
MySQL Goes to 8!  FOSDEM 2020 Database Track, January 2nd, 2020MySQL Goes to 8!  FOSDEM 2020 Database Track, January 2nd, 2020
MySQL Goes to 8! FOSDEM 2020 Database Track, January 2nd, 2020
Geir Høydalsvik
 

Similar to Finding Logic Bugs in Database Management Systems (20)

Java Micro-Benchmarking
Java Micro-BenchmarkingJava Micro-Benchmarking
Java Micro-Benchmarking
 
IoT Best Practices: Unit Testing
IoT Best Practices: Unit TestingIoT Best Practices: Unit Testing
IoT Best Practices: Unit Testing
 
Mutant Tests Too: The SQL
Mutant Tests Too: The SQLMutant Tests Too: The SQL
Mutant Tests Too: The SQL
 
Unit Testing like a Pro - The Circle of Purity
Unit Testing like a Pro - The Circle of PurityUnit Testing like a Pro - The Circle of Purity
Unit Testing like a Pro - The Circle of Purity
 
Рахманов Александр "Что полезного в разборе дампов для .NET-разработчиков?"
Рахманов Александр "Что полезного в разборе дампов для .NET-разработчиков?"Рахманов Александр "Что полезного в разборе дампов для .NET-разработчиков?"
Рахманов Александр "Что полезного в разборе дампов для .NET-разработчиков?"
 
Optimizing Tcl Bytecode
Optimizing Tcl BytecodeOptimizing Tcl Bytecode
Optimizing Tcl Bytecode
 
PHP - Introduction to Advanced SQL
PHP - Introduction to Advanced SQLPHP - Introduction to Advanced SQL
PHP - Introduction to Advanced SQL
 
Fast and Furious: Handling Edge Computing Data With Oracle 19c Fast Ingest an...
Fast and Furious: Handling Edge Computing Data With Oracle 19c Fast Ingest an...Fast and Furious: Handling Edge Computing Data With Oracle 19c Fast Ingest an...
Fast and Furious: Handling Edge Computing Data With Oracle 19c Fast Ingest an...
 
Testing Zen
Testing ZenTesting Zen
Testing Zen
 
Oracle SQL Tuning for Day-to-Day Data Warehouse Support
Oracle SQL Tuning for Day-to-Day Data Warehouse SupportOracle SQL Tuning for Day-to-Day Data Warehouse Support
Oracle SQL Tuning for Day-to-Day Data Warehouse Support
 
Scaling MySQL Strategies for Developers
Scaling MySQL Strategies for DevelopersScaling MySQL Strategies for Developers
Scaling MySQL Strategies for Developers
 
JIT vs. AOT: Unity And Conflict of Dynamic and Static Compilers
JIT vs. AOT: Unity And Conflict of Dynamic and Static Compilers JIT vs. AOT: Unity And Conflict of Dynamic and Static Compilers
JIT vs. AOT: Unity And Conflict of Dynamic and Static Compilers
 
How and what to unit test
How and what to unit testHow and what to unit test
How and what to unit test
 
libinjection: new technique in detecting SQLi attacks, iSEC Partners Open Forum
libinjection: new technique in detecting SQLi attacks, iSEC Partners Open Forumlibinjection: new technique in detecting SQLi attacks, iSEC Partners Open Forum
libinjection: new technique in detecting SQLi attacks, iSEC Partners Open Forum
 
Oracle Database In-Memory Option in Action
Oracle Database In-Memory Option in ActionOracle Database In-Memory Option in Action
Oracle Database In-Memory Option in Action
 
In Memory Database In Action by Tanel Poder and Kerry Osborne
In Memory Database In Action by Tanel Poder and Kerry OsborneIn Memory Database In Action by Tanel Poder and Kerry Osborne
In Memory Database In Action by Tanel Poder and Kerry Osborne
 
10 sql tips
10 sql tips10 sql tips
10 sql tips
 
Five Database Mistakes and how to fix them -- Confoo Vancouver
Five Database Mistakes and how to fix them -- Confoo VancouverFive Database Mistakes and how to fix them -- Confoo Vancouver
Five Database Mistakes and how to fix them -- Confoo Vancouver
 
Introduction to SQL Antipatterns
Introduction to SQL AntipatternsIntroduction to SQL Antipatterns
Introduction to SQL Antipatterns
 
MySQL Goes to 8! FOSDEM 2020 Database Track, January 2nd, 2020
MySQL Goes to 8!  FOSDEM 2020 Database Track, January 2nd, 2020MySQL Goes to 8!  FOSDEM 2020 Database Track, January 2nd, 2020
MySQL Goes to 8! FOSDEM 2020 Database Track, January 2nd, 2020
 

More from PingCAP

[Paper Reading] QAGen: Generating query-aware test databases
[Paper Reading] QAGen: Generating query-aware test databases[Paper Reading] QAGen: Generating query-aware test databases
[Paper Reading] QAGen: Generating query-aware test databases
PingCAP
 

More from PingCAP (20)

[Paper Reading] Efficient Query Processing with Optimistically Compressed Has...
[Paper Reading] Efficient Query Processing with Optimistically Compressed Has...[Paper Reading] Efficient Query Processing with Optimistically Compressed Has...
[Paper Reading] Efficient Query Processing with Optimistically Compressed Has...
 
[Paper Reading]Orca: A Modular Query Optimizer Architecture for Big Data
[Paper Reading]Orca: A Modular Query Optimizer Architecture for Big Data[Paper Reading]Orca: A Modular Query Optimizer Architecture for Big Data
[Paper Reading]Orca: A Modular Query Optimizer Architecture for Big Data
 
[Paper Reading]KVSSD: Close integration of LSM trees and flash translation la...
[Paper Reading]KVSSD: Close integration of LSM trees and flash translation la...[Paper Reading]KVSSD: Close integration of LSM trees and flash translation la...
[Paper Reading]KVSSD: Close integration of LSM trees and flash translation la...
 
[Paper Reading]Chucky: A Succinct Cuckoo Filter for LSM-Tree
[Paper Reading]Chucky: A Succinct Cuckoo Filter for LSM-Tree[Paper Reading]Chucky: A Succinct Cuckoo Filter for LSM-Tree
[Paper Reading]Chucky: A Succinct Cuckoo Filter for LSM-Tree
 
[Paper Reading]The Bw-Tree: A B-tree for New Hardware Platforms
[Paper Reading]The Bw-Tree: A B-tree for New Hardware Platforms[Paper Reading]The Bw-Tree: A B-tree for New Hardware Platforms
[Paper Reading]The Bw-Tree: A B-tree for New Hardware Platforms
 
[Paper Reading] QAGen: Generating query-aware test databases
[Paper Reading] QAGen: Generating query-aware test databases[Paper Reading] QAGen: Generating query-aware test databases
[Paper Reading] QAGen: Generating query-aware test databases
 
[Paper Reading] Leases: An Efficient Fault-Tolerant Mechanism for Distribute...
[Paper Reading]  Leases: An Efficient Fault-Tolerant Mechanism for Distribute...[Paper Reading]  Leases: An Efficient Fault-Tolerant Mechanism for Distribute...
[Paper Reading] Leases: An Efficient Fault-Tolerant Mechanism for Distribute...
 
[Paper reading] Interleaving with Coroutines: A Practical Approach for Robust...
[Paper reading] Interleaving with Coroutines: A Practical Approach for Robust...[Paper reading] Interleaving with Coroutines: A Practical Approach for Robust...
[Paper reading] Interleaving with Coroutines: A Practical Approach for Robust...
 
[Paperreading] Paxos made easy (by sen han)
[Paperreading]  Paxos made easy (by sen han)[Paperreading]  Paxos made easy (by sen han)
[Paperreading] Paxos made easy (by sen han)
 
[Paper Reading] Generalized Sub-Query Fusion for Eliminating Redundant I/O fr...
[Paper Reading] Generalized Sub-Query Fusion for Eliminating Redundant I/O fr...[Paper Reading] Generalized Sub-Query Fusion for Eliminating Redundant I/O fr...
[Paper Reading] Generalized Sub-Query Fusion for Eliminating Redundant I/O fr...
 
[Paper Reading] Steering Query Optimizers: A Practical Take on Big Data Workl...
[Paper Reading] Steering Query Optimizers: A Practical Take on Big Data Workl...[Paper Reading] Steering Query Optimizers: A Practical Take on Big Data Workl...
[Paper Reading] Steering Query Optimizers: A Practical Take on Big Data Workl...
 
The Dark Side Of Go -- Go runtime related problems in TiDB in production
The Dark Side Of Go -- Go runtime related problems in TiDB  in productionThe Dark Side Of Go -- Go runtime related problems in TiDB  in production
The Dark Side Of Go -- Go runtime related problems in TiDB in production
 
TiDB DevCon 2020 Opening Keynote
TiDB DevCon 2020 Opening Keynote TiDB DevCon 2020 Opening Keynote
TiDB DevCon 2020 Opening Keynote
 
Chaos Practice in PingCAP
Chaos Practice in PingCAPChaos Practice in PingCAP
Chaos Practice in PingCAP
 
TiDB at PayPay
TiDB at PayPayTiDB at PayPay
TiDB at PayPay
 
Paper Reading: FPTree
Paper Reading: FPTreePaper Reading: FPTree
Paper Reading: FPTree
 
Paper Reading: Smooth Scan
Paper Reading: Smooth ScanPaper Reading: Smooth Scan
Paper Reading: Smooth Scan
 
Paper Reading: Flexible Paxos
Paper Reading: Flexible PaxosPaper Reading: Flexible Paxos
Paper Reading: Flexible Paxos
 
Paper reading: Cost-based Query Transformation in Oracle
Paper reading: Cost-based Query Transformation in OraclePaper reading: Cost-based Query Transformation in Oracle
Paper reading: Cost-based Query Transformation in Oracle
 
Paper reading: HashKV and beyond
Paper reading: HashKV and beyondPaper reading: HashKV and beyond
Paper reading: HashKV and beyond
 

Recently uploaded

LIGA(E)11111111111111111111111111111111111111111.ppt
LIGA(E)11111111111111111111111111111111111111111.pptLIGA(E)11111111111111111111111111111111111111111.ppt
LIGA(E)11111111111111111111111111111111111111111.ppt
ssuser9bd3ba
 
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptxCFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
R&R Consult
 
Fruit shop management system project report.pdf
Fruit shop management system project report.pdfFruit shop management system project report.pdf
Fruit shop management system project report.pdf
Kamal Acharya
 
Digital Signal Processing Lecture notes n.pdf
Digital Signal Processing Lecture notes n.pdfDigital Signal Processing Lecture notes n.pdf
Digital Signal Processing Lecture notes n.pdf
AbrahamGadissa
 
Hall booking system project report .pdf
Hall booking system project report  .pdfHall booking system project report  .pdf
Hall booking system project report .pdf
Kamal Acharya
 

Recently uploaded (20)

Water Industry Process Automation and Control Monthly - May 2024.pdf
Water Industry Process Automation and Control Monthly - May 2024.pdfWater Industry Process Automation and Control Monthly - May 2024.pdf
Water Industry Process Automation and Control Monthly - May 2024.pdf
 
Online resume builder management system project report.pdf
Online resume builder management system project report.pdfOnline resume builder management system project report.pdf
Online resume builder management system project report.pdf
 
Architectural Portfolio Sean Lockwood
Architectural Portfolio Sean LockwoodArchitectural Portfolio Sean Lockwood
Architectural Portfolio Sean Lockwood
 
İTÜ CAD and Reverse Engineering Workshop
İTÜ CAD and Reverse Engineering WorkshopİTÜ CAD and Reverse Engineering Workshop
İTÜ CAD and Reverse Engineering Workshop
 
Introduction to Casting Processes in Manufacturing
Introduction to Casting Processes in ManufacturingIntroduction to Casting Processes in Manufacturing
Introduction to Casting Processes in Manufacturing
 
LIGA(E)11111111111111111111111111111111111111111.ppt
LIGA(E)11111111111111111111111111111111111111111.pptLIGA(E)11111111111111111111111111111111111111111.ppt
LIGA(E)11111111111111111111111111111111111111111.ppt
 
ASME IX(9) 2007 Full Version .pdf
ASME IX(9)  2007 Full Version       .pdfASME IX(9)  2007 Full Version       .pdf
ASME IX(9) 2007 Full Version .pdf
 
Cloud-Computing_CSE311_Computer-Networking CSE GUB BD - Shahidul.pptx
Cloud-Computing_CSE311_Computer-Networking CSE GUB BD - Shahidul.pptxCloud-Computing_CSE311_Computer-Networking CSE GUB BD - Shahidul.pptx
Cloud-Computing_CSE311_Computer-Networking CSE GUB BD - Shahidul.pptx
 
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptxCFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
 
Natalia Rutkowska - BIM School Course in Kraków
Natalia Rutkowska - BIM School Course in KrakówNatalia Rutkowska - BIM School Course in Kraków
Natalia Rutkowska - BIM School Course in Kraków
 
CME397 Surface Engineering- Professional Elective
CME397 Surface Engineering- Professional ElectiveCME397 Surface Engineering- Professional Elective
CME397 Surface Engineering- Professional Elective
 
Danfoss NeoCharge Technology -A Revolution in 2024.pdf
Danfoss NeoCharge Technology -A Revolution in 2024.pdfDanfoss NeoCharge Technology -A Revolution in 2024.pdf
Danfoss NeoCharge Technology -A Revolution in 2024.pdf
 
WATER CRISIS and its solutions-pptx 1234
WATER CRISIS and its solutions-pptx 1234WATER CRISIS and its solutions-pptx 1234
WATER CRISIS and its solutions-pptx 1234
 
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
 
Courier management system project report.pdf
Courier management system project report.pdfCourier management system project report.pdf
Courier management system project report.pdf
 
Fruit shop management system project report.pdf
Fruit shop management system project report.pdfFruit shop management system project report.pdf
Fruit shop management system project report.pdf
 
Top 13 Famous Civil Engineering Scientist
Top 13 Famous Civil Engineering ScientistTop 13 Famous Civil Engineering Scientist
Top 13 Famous Civil Engineering Scientist
 
Digital Signal Processing Lecture notes n.pdf
Digital Signal Processing Lecture notes n.pdfDigital Signal Processing Lecture notes n.pdf
Digital Signal Processing Lecture notes n.pdf
 
Hall booking system project report .pdf
Hall booking system project report  .pdfHall booking system project report  .pdf
Hall booking system project report .pdf
 
Introduction to Machine Learning Unit-4 Notes for II-II Mechanical Engineering
Introduction to Machine Learning Unit-4 Notes for II-II Mechanical EngineeringIntroduction to Machine Learning Unit-4 Notes for II-II Mechanical Engineering
Introduction to Machine Learning Unit-4 Notes for II-II Mechanical Engineering
 

Finding Logic Bugs in Database Management Systems