Project Gemini - a fuzzing tool used by Scylla to guarantee that data, once written, is always safe and sound

Project Gemini
Roy Dahan, QA manager

Presenter
Roy Dahan, QA manager
Testing and managing the Scylla QA group for the last 3 years.
Managed testing teams in the field of data and storage for the
last 10 years.
Usually delivers the bad news during the release process.

Project Gemini
Testing tool designed to detect data integrity issues like data loss and
data corruption.
Gemini accomplishes this by applying random testing to a system under
test and validating the results against a test oracle.
Started by Pekka Enberg, Larisa Ustalov, Henrik Johansson, and Alex
Bykov in 2017.
Implemented in the Go programming language.

The Need
Data integrity issues are rare, hard to find, and hard to debug.
■ Existing tools focus on availability, stress, load, and performance.
■ They are limited to certain types of schemas with specific field types.
■ They are fragile to schema changes during testing.
■ When they do detect an issue, it is hard to debug or reproduce.

How Does Gemini Work?
[Diagram: Gemini runs against a System Under Test and a Test Oracle]

1. Generate a schema to be used during the test.
2. Generate random CQL operations and apply them to both clusters at the same time.
3. Query both clusters and compare the results of each query.
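
The three steps above can be sketched in Go. This is only an illustration of the idea, not Gemini's actual code: the two clusters are replaced by in-memory maps, and the names `store`, `op`, `randomOp`, `apply`, `check`, and `run` are hypothetical.

```go
package main

import (
	"fmt"
	"math/rand"
)

// store is a stand-in for a cluster: a map from partition key to value.
// A real run would talk to two CQL clusters instead (assumption for this sketch).
type store map[int]int

// op is one randomly generated write operation.
type op struct {
	del   bool // true for DELETE, false for INSERT/UPDATE
	pk    int
	value int
}

// randomOp generates a write against a small key space so that keys collide often.
func randomOp(r *rand.Rand) op {
	return op{del: r.Intn(4) == 0, pk: r.Intn(16), value: r.Int()}
}

// apply executes one operation on a store.
func apply(s store, o op) {
	if o.del {
		delete(s, o.pk)
		return
	}
	s[o.pk] = o.value
}

// check reads pk from both stores and reports any difference.
func check(sut, oracle store, pk int) error {
	got, gotOK := sut[pk]
	want, wantOK := oracle[pk]
	if gotOK != wantOK || got != want {
		return fmt.Errorf("pk=%d: test has (%d,%t), oracle has (%d,%t)", pk, got, gotOK, want, wantOK)
	}
	return nil
}

// run applies n random operations to both stores and validates after each one.
// It stops at the first mismatch so both stores stay available for inspection.
func run(seed int64, n int, sut, oracle store) error {
	r := rand.New(rand.NewSource(seed))
	for i := 0; i < n; i++ {
		o := randomOp(r)
		apply(sut, o)
		apply(oracle, o)
		if err := check(sut, oracle, o.pk); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	fmt.Println(run(25, 1000, store{}, store{}))
}
```

Because both stores here are correct by construction, `run` finds no mismatch; a mismatch would only appear if the system under test lost or corrupted a write.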

■ Schema generation is random (a “seed” is supported so a test can be repeated).
■ Random values are generated for every column in every table according to the schema.
■ Many threads run in parallel, each responsible for a specific partition key range.
■ Each thread generates either a write operation or a read operation.
■ Write operations are fairly simple: INSERT / UPDATE / DELETE.
■ Read operations, which are used to validate the data, are more complex.
For example:
● SELECT a, b FROM tab WHERE pk = ? AND ck = ?
● SELECT a, b FROM tab WHERE pk = ? AND ck > ? LIMIT ?
● SELECT b FROM tab WHERE token(pk) >= ? LIMIT ?
● SELECT b FROM tab WHERE token(pk) >= ? AND c = ? LIMIT ? ALLOW FILTERING
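
A minimal Go sketch of two of the points above: a seed makes random generation repeatable, and each worker draws keys only from its own partition key range. How Gemini actually derives per-thread randomness and ranges is not shown in these slides, so `pkRange`, `splitRanges`, `keysFor`, and the `seed + worker` scheme are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"math/rand"
)

// pkRange is a half-open range [lo, hi) of partition key values owned by one worker.
type pkRange struct{ lo, hi int }

// splitRanges divides [0, total) into contiguous, non-overlapping ranges,
// one per worker. It assumes total >= workers so that no range is empty.
func splitRanges(total, workers int) []pkRange {
	ranges := make([]pkRange, 0, workers)
	for w := 0; w < workers; w++ {
		ranges = append(ranges, pkRange{lo: w * total / workers, hi: (w + 1) * total / workers})
	}
	return ranges
}

// keysFor returns n pseudo-random partition keys inside r for one worker.
// The generator is seeded from the run seed and the worker index (a
// hypothetical scheme), so the same inputs always yield the same keys.
func keysFor(seed int64, worker int, r pkRange, n int) []int {
	rng := rand.New(rand.NewSource(seed + int64(worker)))
	keys := make([]int, n)
	for i := range keys {
		keys[i] = r.lo + rng.Intn(r.hi-r.lo)
	}
	return keys
}

func main() {
	for w, r := range splitRanges(1000, 4) {
		fmt.Println(w, r, keysFor(25, w, r, 3))
	}
}
```

Running the program twice prints the same keys, which is what makes a failing run repeatable from its seed.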

Usage Example
gemini -d --duration 10800s --warmup 1800s -c 100 -m mixed -f \
  --non-interactive --cql-features normal --test-cluster=10.0.180.52 \
  --outfile /tmp/gemini-l0-c0d89088-f15f-436b-acf2-73fbab0b7f55.log \
  --seed 25 --oracle-cluster=10.0.60.205

Usage by QA
Integrated with Scylla-Cluster-Tests (aka SCT)
- Deployment of clusters (SUT & test Oracle)
- Deployment of a client running Gemini.
- Triggering “Nemesis” on the SUT.
- Searching the nodes for errors, coredumps, stalls, etc.
- Analyzing the Gemini final output.
- Sending full report.
If Gemini detects any difference between the SUT and the test oracle,
it stops and leaves both systems in place for further investigation.

Failed Test Example
Status FAILED
read_ops 117
write_errors 0
errors Validation failed: row count differ (test has 2100 rows, oracle has 2201 rows, test is missing rows: [
  pk0=76, pk1=419622209, pk2=31, pk3=922835259, ck0=6644405980324451.754, ck1=1998-01-15 18:02:25 +0000 UTC
  pk0=111, pk1=632366503, pk2=-48, pk3=1222579647, ck0=3345274792944728.080, ck1=2018-02-08 01:22:14 +0000 UTC
  pk0=108, pk1=1643207977, pk2=-30, pk3=1878114379, ck0=3722122018497478.686, ck1=1974-07-21 15:32:05 +0000 UTC
  pk0=90, pk1=278797784, pk2=-69, pk3=1755546, ck0=7809203802197026.969, ck1=2021-03-10 07:40:04 +0000 UTC
  pk0=-49, pk1=1911330670, pk2=110, pk3=640637430, ck0=4592421251461013.628, ck1=1995-11-22 21:00:36 +0000 UTC
  pk0=-118, pk1=554193011, pk2=-38, pk3=292494436, ck0=3338362084821289.559, ck1=1970-02-01 19:24:15 +0000 UTC
  pk0=-26, pk1=1180095760, pk2=115, pk3=1114905090, ck0=4413492767842183.832, ck1=2017-05-11 09:33:37 +0000 UTC
  pk0=-28, pk1=449180670, pk2=-120, pk3=1733204278, ck0=7825973161922347.653, ck1=2023-03-03 17:34:43 +0000 UTC
  pk0=109, pk1=1404802328, pk2=116, pk3=1207752519, ck0=4901967462462222.815, ck1=1992-10-03 05:38:14 +0000 UTC
  pk0=100, pk1=351103930, pk2=20, pk3=956746865, ck0=1678423284211121.674, ck1=2021-10-12 07:13:52 +0000 UTC
  pk0=-47, pk1=658485119, pk2=19, pk3=968667022, ck0=3345274792944728.080, ck1=2018-02-08 01:22:14 +0000 UTC
  pk0=71, pk1=1718518478, pk2=-57, pk3=720416914, ck0=3338362084821289.559, ck1=1970-02-01
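
The report above lists rows that the oracle returned and the test cluster did not. A rough Go sketch of how such a list could be computed follows. It is not taken from Gemini's source: rows are reduced to unique string keys, and `missingRows` and `validate` are hypothetical names.

```go
package main

import (
	"fmt"
	"sort"
)

// missingRows returns the keys present in oracle but absent from test, sorted.
// Each key stands for one row's primary key and is assumed to be unique.
func missingRows(test, oracle []string) []string {
	seen := make(map[string]struct{}, len(test))
	for _, k := range test {
		seen[k] = struct{}{}
	}
	var missing []string
	for _, k := range oracle {
		if _, ok := seen[k]; !ok {
			missing = append(missing, k)
		}
	}
	sort.Strings(missing)
	return missing
}

// validate returns an error when the row counts differ or when the oracle
// holds rows that the test cluster did not return.
func validate(test, oracle []string) error {
	missing := missingRows(test, oracle)
	if len(test) != len(oracle) || len(missing) > 0 {
		return fmt.Errorf("validation failed: test has %d rows, oracle has %d rows, test is missing rows: %v",
			len(test), len(oracle), missing)
	}
	return nil
}

func main() {
	// Made-up keys, used only to exercise the sketch.
	test := []string{"pk0=1,ck0=1", "pk0=2,ck0=1"}
	oracle := []string{"pk0=1,ck0=1", "pk0=2,ck0=1", "pk0=3,ck0=1"}
	fmt.Println(validate(test, oracle))
}
```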

Thank you
Stay in touch
Any questions?
Roy Dahan
roy@scylladb.com
