Turning a Search Engine into a Relational Database

Matthias Wahl, Developer

Outline
• Introduction
• Relations on Lucene
• The How and Why
• Crate in further detail
• Query engine
• [Demo]

Crate.io
The company
• Founded in 2013 in Dornbirn, Austria
• Offices in Dornbirn, Berlin, San Francisco
• Team of 14 people (with and without strong Austrian dialect)
• Won the TechCrunch Disrupt Startup Battlefield

SQL Database
TABLES
• Table == Tuple Store
• Primary-Key -> Tuple
• Index == B-Tree
• allowing for equality and range queries in O(log(N))
• sorting
• Query Planner + Engine for LOCAL query execution

LUCENE INDEX
• Inverted Index
• equality queries
• range queries
• fulltext search with analyzed queries
• NO Sorting
• Stored Fields
• docValues / FieldCache
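
To make the inverted-index bullets concrete, here is a minimal Python sketch. It is a toy illustration, not Lucene's actual data structure: the class name `TinyInvertedIndex` and its methods are hypothetical. It maps each term to the documents containing it, which serves equality and range lookups but gives no direct way to read a document's value back for sorting.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict


class TinyInvertedIndex:
    """Toy inverted index: term -> list of document ids (hypothetical helper)."""

    def __init__(self):
        self.postings = defaultdict(list)

    def add(self, doc_id, terms):
        # Register the document under every distinct term it contains.
        for term in set(terms):
            self.postings[term].append(doc_id)

    def equals(self, term):
        # Equality query: a single dictionary lookup.
        return sorted(self.postings.get(term, []))

    def in_range(self, low, high):
        # Range query: binary search over the sorted term dictionary,
        # then union the postings of every term in [low, high].
        # (A real index keeps the terms sorted instead of re-sorting per query.)
        terms = sorted(self.postings)
        start = bisect_left(terms, low)
        stop = bisect_right(terms, high)
        hits = set()
        for term in terms[start:stop]:
            hits.update(self.postings[term])
        return sorted(hits)


index = TinyInvertedIndex()
index.add(1, [3])
index.add(2, [5])
index.add(3, [5, 9])
print(index.equals(5))       # documents whose field contains the term 5
print(index.in_range(4, 9))  # documents with any term between 4 and 9
```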

SQL TABLE
LUCENE INDEX
CREATE TABLE t (
id int primary key,
name string,
marks array(float),
text string index using fulltext
) clustered into 5 shards
with (number_of_replicas=1)
• 1 TABLE
• S shards, each with R replicas
• metadata in cluster state
• 1 SHARD
• 1 Lucene Index (inverted index + stored documents/fields)
• field mappings
• field caches

SQL TABLE
LUCENE INDEX
Components
• Inverted Index
• Translog (WAL)
• "Tuple Store" - Stored Fields
• Lucene Field Data
• DocValues (on disk)

SQL TABLE
LUCENE INDICES
• Differences to Relational Databases
• DISTRIBUTED
• 2 different indices needed for all operations
• inverted index not suited for all kinds of queries
• persistence is expensive
• limited schema altering
• no pull based database cursor (yet)

Crate
Features
• Distributed SQL Database written in Java (7)
• accessible via HTTP & TCP (TCP for Java clients only)
• Graphical Admin Interface
• Blob Storage
• Plugin Infrastructure
• Clients available in Java, Python, Ruby, PHP, Scala, Node.js, Erlang
• Runs on Docker, AWS, GCE, Mesos, …

Crate
SQL
• subset of ANSI SQL with extensions
• arrays and nested objects
• different types
• Information Schema
• Cluster and Node State exposed via tables
• Partitioned Tables
• speaking JDBC, ODBC, SQLAlchemy, ActiveRecord, PHP-PDO/DBAL

Crate
SQL
• Common relational Operators:
• Projection
• Grouping (incl. HAVING)
• Aggregations
• Sorting
• Limit/Offset
• WHERE-clause
• Import/Export

Crate
SQL … NOT
• JOINS underway
• no subselects, foreign key constraints yet
• no sessions, no client cursor
• no transactions

CRATE -
A RELATIONAL DATABASE
• Relational Algebra
• SQL statement
• Tree of Relational Operators
• Mostly Tables == Leaves
• ES
• Single table operations only
• Not a simple SQL wrapper around the ES Query DSL
SELECT
  id,
  substr(name, 4),
  id % 2 as "EVEN",
  text,
  marks
FROM t
WHERE
  name IS NOT NULL
  AND
  match(text, 'full')
ORDER BY id DESC
LIMIT 10;

Querying Crate
• Query Engine
• node based query execution
• directly to Lucene indices
• circumventing ES query execution
SELECT
  id,
  substr(name, 4),
  text,
  marks
FROM t
WHERE
  name IS NOT NULL
  AND
  match(text, 'full')
ORDER BY id DESC
LIMIT 10;

SQL TABLE
LUCENE INDEX
INSERT INTO t (id, name, marks, text)
VALUES (
  42,
  format('%d - %s', 42, 'Zaphod'),
  [1.5, 4.6],
  'this is a quite full text!')
ON DUPLICATE KEY UPDATE
  name='DUPLICATE';
• INSERT INTO
• insert values are validated against their configured types
• types are guessed for new columns
• primary key and routing values extracted
• JSON _source is created:
• {"id": 42, "name": "42 - Zaphod", "marks": [1.5, 4.6], "text": "this is a quite full text!"}
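
The INSERT steps above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not Crate's implementation: `prepare_insert`, `SCHEMA`, and the use of Python types as column types are all hypothetical.

```python
import json

# Hypothetical schema for table t: column name -> expected Python type.
SCHEMA = {"id": int, "name": str, "marks": list, "text": str}


def prepare_insert(row, schema, primary_key):
    """Validate a row, guess types for new columns, and build the JSON source."""
    schema = dict(schema)  # work on a copy so the caller's schema is untouched
    for column, value in row.items():
        expected = schema.get(column)
        if expected is None:
            # New column: guess its type from the first value seen.
            schema[column] = type(value)
        elif not isinstance(value, expected):
            raise TypeError(f"column {column!r} expects {expected.__name__}")
    if primary_key not in row:
        raise ValueError(f"missing primary key column {primary_key!r}")
    # The primary key doubles as the routing value in this sketch.
    return row[primary_key], json.dumps(row), schema


pk, source, schema = prepare_insert(
    {"id": 42, "name": "42 - Zaphod", "marks": [1.5, 4.6],
     "text": "this is a quite full text!"},
    SCHEMA,
    "id",
)
print(pk, source)
```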

SQL TABLE
LUCENE INDEX
{
  "id": 42,
  "name": "42 - Zaphod",
  "marks": [1.5, 4.6],
  "text": "this is a quite full text!"
}
• INSERT INTO
• request is routed by "id" column to node containing shard
• row stored on shard
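
Routing by the "id" column can be pictured as hashing the routing value onto one of the table's shards. The sketch below is an assumption-laden illustration: `shard_for` is a hypothetical helper, and CRC32 stands in for whatever hash function Crate/Elasticsearch actually uses.

```python
import zlib


def shard_for(routing_value, num_shards):
    """Map a routing value to a shard number (hypothetical helper).

    CRC32 is only a stand-in hash for this sketch; the real system
    may use a different function.
    """
    digest = zlib.crc32(str(routing_value).encode("utf-8"))
    return digest % num_shards


# Table t was created with "clustered into 5 shards".
print(shard_for(42, 5))
```

The same routing value always lands on the same shard, which is what lets a later primary-key lookup go straight to the node holding the row.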

SQL TABLE
LUCENE INDEX
_uid: "default#42"
_routing: 42
_source: {…}
_type: "default"
_timestamp: 1435096992201
_version: 0
id: 42 (docvalues)
name: "42 - Zaphod" (sorted set)
marks: 1.5, 4.6
ft: "this is a quite full text!" (indexed)
_size: 82
_field_names: _uid, _routing, _source, _type, _timestamp, _version, id, name, marks, ft, _size

Querying Crate
• Sorting and Grouping
• inverted index not enough
• per document values (DocValues)
SELECT
  id,
  substr(name, 4),
  text,
  marks
FROM t
WHERE
  name IS NOT NULL
  AND
  match(text, 'full')
ORDER BY id DESC
LIMIT 10;
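
Why per-document values are needed for ORDER BY can be shown with a small sketch. It is a conceptual illustration only: `build_doc_values` and `top_n_desc` are hypothetical names, and a plain dict stands in for Lucene's column-oriented DocValues storage.

```python
import heapq


def build_doc_values(rows, column):
    """Column-oriented view: document id -> value of one column."""
    return {doc_id: row[column] for doc_id, row in rows.items()}


def top_n_desc(matching_doc_ids, doc_values, n):
    """ORDER BY <column> DESC LIMIT n over the documents a query matched."""
    # The inverted index only told us *which* documents match; the sort key
    # for each of them has to be read from the per-document values.
    return heapq.nlargest(n, matching_doc_ids, key=lambda doc: doc_values[doc])


rows = {0: {"id": 7}, 1: {"id": 42}, 2: {"id": 3}}
doc_values = build_doc_values(rows, "id")
print(top_n_desc([0, 1, 2], doc_values, 2))
```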

Querying Crate
• "Simple" SELECT - QTF (Query Then Fetch)
• Extract fields to SELECT
• Route to shards / Lucene indices
• Open and keep Lucene Reader in query context
• Only collect Doc/Row identifier (and all necessary fields for sorting)
• merge separate results on handler
• apply limit/offset
• fetch all fields
• evaluate expressions
• return Object[][]
SELECT
  id,
  substr(name, 4),
  text,
  marks
FROM t
WHERE
  name IS NOT NULL
  AND
  match(text, 'full')
ORDER BY id DESC
LIMIT 10;
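
The query-then-fetch steps listed above can be sketched as follows. This is a single-process simulation under simplifying assumptions, not Crate's engine: shards are plain lists of dicts, and `query_phase` and `query_then_fetch` are hypothetical names.

```python
import heapq


def query_phase(shard_rows, predicate, sort_key, limit, offset=0):
    """Per shard: collect only (sort value, doc id) for matching rows."""
    hits = [(row[sort_key], doc_id)
            for doc_id, row in enumerate(shard_rows) if predicate(row)]
    # Each shard must return limit + offset candidates, sorted descending.
    return heapq.nlargest(limit + offset, hits)


def query_then_fetch(shards, predicate, sort_key, columns, limit, offset=0):
    """Handler: merge shard results, apply limit/offset, then fetch fields."""
    candidates = []
    for shard_id, rows in enumerate(shards):
        for key, doc_id in query_phase(rows, predicate, sort_key, limit, offset):
            candidates.append((key, shard_id, doc_id))
    candidates.sort(reverse=True)            # merge (ORDER BY ... DESC)
    page = candidates[offset:offset + limit]  # apply limit/offset
    # Fetch phase: only now read the full rows for the surviving doc ids.
    return [[shards[s][d][c] for c in columns] for _, s, d in page]


shards = [
    [{"id": 1, "name": "a"}, {"id": 5, "name": None}, {"id": 9, "name": "c"}],
    [{"id": 7, "name": "d"}, {"id": 3, "name": "e"}],
]
result = query_then_fetch(
    shards, lambda row: row["name"] is not None, "id", ["id", "name"], limit=2)
print(result)
```

Collecting only identifiers and sort keys in the first phase keeps the data shipped to the handler small; full rows are read only for the rows that make it into the final page.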

Querying Crate
• INTERNAL PAGING
• Problems with big result sets / high offsets
• Need to fetch LIMIT + OFFSET from every shard
• Execution starts at TOP Relation
• trickles down to tables (Lucene Indices)
• Hybrid of push and pull based data flow
SELECT
  id,
  substr(name, 4),
  text,
  marks
FROM t
WHERE
  name IS NOT NULL
  AND
  match(text, 'full')
ORDER BY id DESC
LIMIT 1
OFFSET 10000000;
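
A quick back-of-the-envelope calculation shows why high offsets hurt. The helper `rows_to_collect` is hypothetical, and the figure is an upper bound that assumes every shard holds at least LIMIT + OFFSET matching rows; the 5 shards come from the CREATE TABLE statement shown earlier.

```python
def rows_to_collect(num_shards, limit, offset):
    """Upper bound on rows each shard, and all shards together, must return."""
    per_shard = limit + offset
    return per_shard, per_shard * num_shards


per_shard, total = rows_to_collect(num_shards=5, limit=1, offset=10_000_000)
print(per_shard, total)  # 10000001 50000005
```

Returning a single row can therefore force the handler to merge tens of millions of candidates, which is the motivation for paging internally instead of materializing everything at once.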

Querying Crate
• GROUP BY - AGGREGATIONS
• Aggregation Framework developed in parallel to Elasticsearch aggregations
• ES - 2 phase aggregations (HyperLogLog, Moving Averages, Percentiles …)
• online algorithms on partial data (mergeable) necessary
• https://github.com/elastic/elasticsearch/issues/4915
SELECT
  avg(temp) as avg,
  stddev(temp) as stddev,
  max(temp) as max,
  min(temp) as min,
  count(distinct temp),
  date_trunc('year', date) as year
FROM t
WHERE temp IS NOT NULL
GROUP BY date_trunc('year', date)
ORDER BY avg DESC
LIMIT 10;
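
"Mergeable" means that each shard can compute a small partial state which is later combined without revisiting the rows. A minimal sketch, assuming a population standard deviation and a naive sum-of-squares formula; `PartialStats` is a hypothetical class, not part of Crate:

```python
import math
from dataclasses import dataclass


@dataclass
class PartialStats:
    """Partial aggregation state for avg, stddev, min and max."""
    count: int = 0
    total: float = 0.0
    total_sq: float = 0.0
    minimum: float = math.inf
    maximum: float = -math.inf

    def add(self, value):
        # Online update: one value at a time, constant memory.
        self.count += 1
        self.total += value
        self.total_sq += value * value
        self.minimum = min(self.minimum, value)
        self.maximum = max(self.maximum, value)

    def merge(self, other):
        # Combine two partial states into a new one.
        return PartialStats(
            self.count + other.count,
            self.total + other.total,
            self.total_sq + other.total_sq,
            min(self.minimum, other.minimum),
            max(self.maximum, other.maximum),
        )

    def avg(self):
        return self.total / self.count

    def stddev(self):
        # Population standard deviation; the sum-of-squares form is simple
        # but numerically naive for large or nearly equal values.
        mean = self.avg()
        return math.sqrt(max(self.total_sq / self.count - mean * mean, 0.0))


shard_a, shard_b = PartialStats(), PartialStats()
for temp in [1, 2, 2]:
    shard_a.add(temp)
for temp in [2, 3, 7]:
    shard_b.add(temp)
merged = shard_a.merge(shard_b)
print(merged.count, merged.avg(), merged.minimum, merged.maximum)
```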

Querying Crate
• split in 3 phases
• partial aggregation executed on each shard in parallel
• partial result distributed to "Reduce" nodes by hashing the group keys
• final aggregation on handler/reducer
• merge on handler
SELECT
  avg(temp) as avg,
  max(temp) as max,
  min(temp) as min,
  date_trunc('year', date) as year
FROM t
GROUP BY date_trunc('year', date)
ORDER BY avg DESC
LIMIT 10;
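
The three phases can be simulated in one process to show how the data moves. This is a sketch under assumptions, not Crate's code: rows are `(year, temp)` pairs, only `avg` is computed, CRC32 stands in for the real hash, and all function names are hypothetical.

```python
import zlib
from collections import defaultdict


def partial_aggregate(shard_rows):
    """Phase 1 (per shard): group key -> [sum, count]."""
    state = defaultdict(lambda: [0.0, 0])
    for key, temp in shard_rows:
        state[key][0] += temp
        state[key][1] += 1
    return state


def reducer_for(key, num_reducers):
    """Pick the reducer that owns a group key (CRC32 is a stand-in hash)."""
    return zlib.crc32(str(key).encode("utf-8")) % num_reducers


def distributed_avg(shards, num_reducers, limit):
    # Phase 2: ship each partial state to the reducer owning its group key,
    # where the partial states for that key are merged.
    reducers = [defaultdict(lambda: [0.0, 0]) for _ in range(num_reducers)]
    for shard_rows in shards:
        for key, (total, count) in partial_aggregate(shard_rows).items():
            target = reducers[reducer_for(key, num_reducers)][key]
            target[0] += total
            target[1] += count
    # Phase 3 (handler): finalize, merge reducer outputs, sort and limit.
    rows = [(key, total / count)
            for reducer in reducers
            for key, (total, count) in reducer.items()]
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows[:limit]


shards = [
    [(2013, 10.0), (2014, 25.0)],
    [(2013, 30.0), (2015, 5.0)],
]
print(distributed_avg(shards, num_reducers=2, limit=10))
```

Because every group key is owned by exactly one reducer, the handler never has to combine two partial states for the same key; it only merges already-final rows.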

Querying Crate
SELECT
  avg(temp) as avg,
  max(temp) as max,
  min(temp) as min,
  date_trunc('year', date) as year
FROM t
GROUP BY date_trunc('year', date)
ORDER BY avg DESC
LIMIT 10;
[Diagram: partial results flow from the Shards to the REDUCER nodes and on to the HANDLER]

Querying Crate
• Row Authority by hashing
• split huge datasets
• expensive intermediate aggregation states possible (COUNT DISTINCT)
SELECT
  avg(temp) as avg,
  max(temp) as max,
  min(temp) as min,
  date_trunc('year', date) as year
FROM t
GROUP BY date_trunc('year', date)
ORDER BY avg DESC
LIMIT 10;
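
COUNT DISTINCT is the classic example of an expensive intermediate state: an exact answer requires each shard to ship the set of values it has seen, not just a counter. A minimal sketch with hypothetical names, using exact sets rather than an approximate structure:

```python
def partial_distinct(shard_rows):
    """Per shard: group key -> set of distinct values seen on that shard."""
    state = {}
    for key, value in shard_rows:
        state.setdefault(key, set()).add(value)
    return state


def merge_distinct(partials):
    """Reducer: union the per-shard sets, then count."""
    merged = {}
    for partial in partials:
        for key, values in partial.items():
            merged.setdefault(key, set()).update(values)
    return {key: len(values) for key, values in merged.items()}


shards = [
    [(2013, 1.0), (2013, 2.0)],
    [(2013, 2.0), (2014, 5.0)],
]
print(merge_distinct(partial_distinct(rows) for rows in shards))
```

The state grows with the number of distinct values, so assigning each group key to a single reducer by hashing keeps any one node from having to hold the sets for all groups.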

FINALLY
• GETTING RELATIONAL…
• still in transition
• more relational operators to come
• JOINs are underway
• CROSS JOINS already "work"

Thank YOU
matthias@crate.io
jobs@crate.io
