OQGraph at MySQL Users Conference 2011

OQGRAPH
Graphs and Hierarchies in Plain SQL

Antony T Curtis <atcurtis@gmail.com>

graph@openquery.com
http://openquery.com/graph

Hierarchies / Trees

● Trees typically have a single "root" node.
● All child nodes have only one parent.

Examples:
● Menu structures.
● Organisation charts.
● Filesystem directories.

OQGRAPH computation engine © 2009-2011 Open Query

Graphs / Networks
● Nodes connected by Edges.
● Edges may be directional.
● Edges may have a "weight" / "cost" attribute.
● Directed graphs may have bi-directional edges.
● Unconnected sets of nodes may exist on the same graph.
● There need not be a "root" node.

Examples:
● "Social Graphs" / friend relationships.
● Decision / State graphs.
● Airline routes

Problem Solving

Trees
● Does Dilbert report to the PHB?
● How many people report to manager X?
● How many people are between the CEO and employee Y?

Networks
● What is the quickest air route to MLA from SJC?
● What is the shortest path of decisions to get to state #11 from state #5?
● Playing "Six Degrees of Kevin Bacon"


RDBMS with Hierarchies and Graphs

● Not always a particularly good fit.
● Various tree models exist; each with limitations:
○ Adjacency model
■ Either uses fixed max depth or recursive queries.
■ Oracle has CONNECT BY PRIOR
■ SQL99 has WITH RECURSIVE...UNION...
○ Nested set
■ complex
■ recursive queries to find path to root.
○ Materialised path
■ Ugly and not relational.
■ Can be quite effective when used correctly.
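
As a rough illustration of the adjacency model with a recursive query, the sketch below uses Python's built-in sqlite3 module as a stand-in database (SQLite accepts WITH RECURSIVE). The staff table, its columns and the sample rows are hypothetical and are not part of OQGRAPH or of this talk.

```python
import sqlite3

# Hypothetical adjacency-model table: each row points at its parent (manager).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE staff (id INTEGER PRIMARY KEY, name TEXT, manager_id INTEGER)"
)
conn.executemany(
    "INSERT INTO staff VALUES (?, ?, ?)",
    [(1, "CEO", None), (2, "PHB", 1), (3, "Dilbert", 2),
     (4, "Wally", 2), (5, "Carol", 1)],
)


def reports_to(conn, employee, manager):
    """True if `manager` appears anywhere on the chain above `employee`."""
    row = conn.execute(
        """
        WITH RECURSIVE chain(id) AS (
            SELECT manager_id FROM staff WHERE id = ?
            UNION
            SELECT s.manager_id FROM staff s JOIN chain c ON s.id = c.id
        )
        SELECT COUNT(*) FROM chain WHERE id = ?
        """,
        (employee, manager),
    ).fetchone()
    return row[0] > 0


def count_reports(conn, manager):
    """Number of direct and indirect reports below `manager`."""
    row = conn.execute(
        """
        WITH RECURSIVE sub(id) AS (
            SELECT id FROM staff WHERE manager_id = ?
            UNION
            SELECT s.id FROM staff s JOIN sub ON s.manager_id = sub.id
        )
        SELECT COUNT(*) FROM sub
        """,
        (manager,),
    ).fetchone()
    return row[0]


print(reports_to(conn, 3, 2))   # does Dilbert report to the PHB?
print(count_reports(conn, 1))   # how many people sit below the CEO?
```

Each question needs its own recursive query, which is the kind of overhead the tree models above try to work around.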

Further reading: http://dev.mysql.com/tech-resources/articles/hierarchical-data.html


What is OQGRAPH?

● Implemented as a storage engine.
○ for MySQL
○ for Drizzle
○ for MariaDB
● Original concept by Arjen Lentz
● Mk. II implementation by
○ Antony Curtis
○ Arjen Lentz @openquery
● Mk. III dev. on LaunchPad
● Licensing
○ GPLv2+


OQGRAPH: A Computation Engine

● It is not a general purpose data engine.
○ unlike MyISAM, InnoDB, PBXT or MEMORY.
● Looks like an ordinary table.
● Has a very different internal architecture.
● It does not operate in terms of
○ storing data for later retrieval.
○ having indexes on data.

● May be regarded as a "magic view" or "table function".


Getting OQGRAPH
MariaDB - available as a plugin.
● Included in mainline MariaDB 5.2 builds.
○ INSTALL PLUGIN oqgraph SONAME 'oqgraph_engine';
● Or build it for yourself.
○ All MySQL/MariaDB storage engines should be built with
same debug/compile flags for correct behaviour.
● Check with SHOW PLUGINS and SHOW STORAGE ENGINES.
● 64bit Windows build is currently unstable.
MySQL 5.0 does not support plugins, so OQGRAPH must be compiled in.
● Binaries available from ourdelta.org
● Included in '-sail' builds since 5.0.87-d10
○ SHOW GLOBAL VARIABLES LIKE 'have_oqgraph';
Drizzle
● Basic port has been done.


Anatomy of an OQGRAPH table

CREATE TABLE db.tblname (
latch SMALLINT UNSIGNED NULL,
origid BIGINT UNSIGNED NULL,
destid BIGINT UNSIGNED NULL,
weight DOUBLE NULL,
seq BIGINT UNSIGNED NULL,
linkid BIGINT UNSIGNED NULL,
KEY (latch, origid, destid) USING HASH,
KEY (latch, destid, origid) USING HASH
) ENGINE=OQGRAPH;

Note: Mk.3 has a few additional options, discussed later.

OQGRAPH Mk.II - Inserting data

● Only insert directed edges into its memory store.
● Edge weights are optional and default to 1.0.
● Undirected edges may be represented as two directed
edges, in opposite directions.

INSERT INTO foo (origid,destid) VALUES
(1,2), (2,3), (2,4),
(4,5), (3,6), (5,6);
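
A minimal sketch of the insertion rules above, written in Python only to generate the row list: it fills in the default weight of 1.0 and, for an undirected edge, emits the reverse direction as a second row. The helper names directed_rows and insert_statement are hypothetical, not part of OQGRAPH.

```python
def directed_rows(edges, undirected=False, default_weight=1.0):
    """Expand (orig, dest) or (orig, dest, weight) tuples into directed rows."""
    rows = []
    for edge in edges:
        orig, dest = edge[0], edge[1]
        weight = edge[2] if len(edge) > 2 else default_weight
        rows.append((orig, dest, weight))
        if undirected:
            # An undirected edge becomes two directed edges, one each way.
            rows.append((dest, orig, weight))
    return rows


def insert_statement(table, rows):
    """Render one multi-row INSERT in the style used on this slide."""
    values = ", ".join(f"({o},{d},{w})" for o, d, w in rows)
    return f"INSERT INTO {table} (origid,destid,weight) VALUES {values};"


rows = directed_rows([(1, 2), (2, 3, 2.5)], undirected=True)
print(insert_statement("foo", rows))
```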


Selecting Edges

SELECT * FROM foo;
+-------+--------+--------+--------+------+--------+
| latch | origid | destid | weight | seq  | linkid |
+-------+--------+--------+--------+------+--------+
|  NULL |      1 |      2 |      1 |    0 |   NULL |
|  NULL |      2 |      3 |      1 |    1 |   NULL |
|  NULL |      2 |      4 |      1 |    2 |   NULL |
|  NULL |      4 |      5 |      1 |    3 |   NULL |
|  NULL |      3 |      6 |      1 |    4 |   NULL |
|  NULL |      5 |      6 |      1 |    5 |   NULL |
+-------+--------+--------+--------+------+--------+


Now, it's time for some magic.
(shortest path calculation)

● SELECT * FROM foo
WHERE latch=1 AND origid=1 AND destid=6;
+-------+--------+--------+--------+------+--------+
| latch | origid | destid | weight | seq  | linkid |
+-------+--------+--------+--------+------+--------+
|     1 |      1 |      6 |   NULL |    0 |      1 |
|     1 |      1 |      6 |      1 |    1 |      2 |
|     1 |      1 |      6 |      1 |    2 |      3 |
|     1 |      1 |      6 |      1 |    3 |      6 |
+-------+--------+--------+--------+------+--------+

● SELECT GROUP_CONCAT(linkid ORDER BY seq) AS path
FROM foo WHERE latch=1 AND origid=1 AND destid=6 \G

path: 1,2,3,6
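
To make the result rows less magical, here is a rough Python sketch of the same calculation: Dijkstra's algorithm over the six sample edges, returning (seq, linkid, weight) tuples shaped like the rows above. It is an illustration under the assumption that every edge has the default weight of 1.0, not the engine's actual code, and shortest_path_rows is a hypothetical name.

```python
import heapq

# The six directed edges inserted earlier; weight defaults to 1.0.
EDGES = [(1, 2), (2, 3), (2, 4), (4, 5), (3, 6), (5, 6)]


def shortest_path_rows(edges, origid, destid):
    """Return (seq, linkid, weight) rows along the cheapest path, or []."""
    graph = {}
    for edge in edges:
        weight = edge[2] if len(edge) > 2 else 1.0
        graph.setdefault(edge[0], []).append((edge[1], weight))

    dist = {origid: 0.0}
    prev = {}                      # node -> (previous node, edge weight)
    heap = [(0.0, origid)]
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > dist.get(node, float("inf")):
            continue               # stale queue entry
        if node == destid:
            break
        for nxt, weight in graph.get(node, []):
            new_cost = cost + weight
            if new_cost < dist.get(nxt, float("inf")):
                dist[nxt] = new_cost
                prev[nxt] = (node, weight)
                heapq.heappush(heap, (new_cost, nxt))

    if destid not in dist:
        return []                  # no route between the two nodes

    # Walk back from the destination, then number the steps from the origin.
    steps = []
    node = destid
    while node != origid:
        parent, weight = prev[node]
        steps.append((node, weight))
        node = parent
    steps.append((origid, None))   # the first row carries no edge weight
    steps.reverse()
    return [(seq, linkid, weight) for seq, (linkid, weight) in enumerate(steps)]


rows = shortest_path_rows(EDGES, 1, 6)
print(",".join(str(linkid) for _seq, linkid, _weight in rows))
```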


Other computations,
● Which paths lead to node 4?
SELECT GROUP_CONCAT(linkid) AS list
FROM foo WHERE latch=1 AND destid=4 \G

list: 1,2,4

● Where can I get to from node 4?
SELECT GROUP_CONCAT(linkid) AS list
FROM foo WHERE latch=1 AND origid=4 \G

list: 6,5,4
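
Both questions are reachability searches: one follows the edges forwards from node 4, the other follows them backwards. The Python sketch below reproduces the two node sets for the sample graph. It is a stand-in for illustration, the function name reachable is hypothetical, and the order in which the engine lists the nodes is not modelled.

```python
from collections import deque

# The six directed edges inserted earlier.
EDGES = [(1, 2), (2, 3), (2, 4), (4, 5), (3, 6), (5, 6)]


def reachable(edges, start, reverse=False):
    """Nodes reachable from `start`; with reverse=True, nodes that reach it."""
    graph = {}
    for orig, dest in edges:
        if reverse:
            orig, dest = dest, orig   # walk the edges backwards
        graph.setdefault(orig, []).append(dest)

    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


print(sorted(reachable(EDGES, 4, reverse=True)))  # nodes leading to node 4
print(sorted(reachable(EDGES, 4)))                # nodes reachable from node 4
```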


Other computations, continued.

● See docs for latch 0 and latch NULL
● latch 1 : Dijkstra's shortest path.
○ O((V + E) log V)
● latch 2 : Breadth-first search.
○ O(V+E)
● Other algorithms possible
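
The two algorithms can disagree once weights are not uniform: breadth-first search counts hops, while Dijkstra minimises total weight. The Python sketch below shows this on the sample graph with one hypothetical weight of 10 placed on the edge 3 -> 6. That weight is an assumption made for illustration, and the code does not model what the engine returns for each latch value.

```python
import heapq
from collections import deque

# Sample graph; the weight of 10.0 on 3 -> 6 is hypothetical.
WEIGHTED = [(1, 2, 1.0), (2, 3, 1.0), (3, 6, 10.0),
            (2, 4, 1.0), (4, 5, 1.0), (5, 6, 1.0)]


def bfs_hops(edges, src, dst):
    """Fewest edges from src to dst, ignoring weights. O(V + E)."""
    graph = {}
    for orig, dest, _weight in edges:
        graph.setdefault(orig, []).append(dest)
    hops = {src: 0}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return hops[node]
        for nxt in graph.get(node, []):
            if nxt not in hops:
                hops[nxt] = hops[node] + 1
                queue.append(nxt)
    return None


def dijkstra_cost(edges, src, dst):
    """Lowest total weight from src to dst. O((V + E) log V) with a heap."""
    graph = {}
    for orig, dest, weight in edges:
        graph.setdefault(orig, []).append((dest, weight))
    best = {src: 0.0}
    heap = [(0.0, src)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == dst:
            return cost
        if cost > best[node]:
            continue               # stale queue entry
        for nxt, weight in graph.get(node, []):
            new_cost = cost + weight
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                heapq.heappush(heap, (new_cost, nxt))
    return None


print(bfs_hops(WEIGHTED, 1, 6))       # 3 hops via 1, 2, 3, 6
print(dijkstra_cost(WEIGHTED, 1, 6))  # 4.0 via 1, 2, 4, 5, 6
```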


Joins make it prettier,
● INSERT INTO people VALUES
(1,'pearce'), (2,'hunnicut'), (3,'potter'),
(4,'hoolihan'), (5,'winchester'), (6,'mulcahy');

● SELECT GROUP_CONCAT(name ORDER BY seq) path
FROM foo
JOIN people ON (foo.linkid = people.id)
WHERE latch=1 AND origid=1 AND destid=6 \G

path: pearce,hunnicut,potter,mulcahy
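
The join itself is ordinary SQL: linkid is simply a node id that can be matched against any lookup table. The sketch below replays it with Python's sqlite3 module, using a hypothetical path_rows table to stand in for the four rows the OQGRAPH query returned, since SQLite has no OQGRAPH engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Same lookup table as on the slide.
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [(1, "pearce"), (2, "hunnicut"), (3, "potter"),
     (4, "hoolihan"), (5, "winchester"), (6, "mulcahy")],
)

# Hypothetical stand-in for the shortest-path result rows (seq, linkid).
conn.execute("CREATE TABLE path_rows (seq INTEGER, linkid INTEGER)")
conn.executemany(
    "INSERT INTO path_rows VALUES (?, ?)",
    [(0, 1), (1, 2), (2, 3), (3, 6)],
)

names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM path_rows "
        "JOIN people ON path_rows.linkid = people.id "
        "ORDER BY seq"
    )
]
path = ",".join(names)
print(path)
```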


In brief: OQGRAPH Mk. II

● Behaviour similar to MEMORY engine:
○ Table-level locking for normal tables
○ No locking for temporary tables
○ No persistence
○ No transactions
● Insert performance O(N log N)

This means...
○ It’s usable for menus & more, up to, say, a few million edges.
○ Inserts get very slow when there are a lot of edges.
○ You can use the --init-file option to copy/load on startup.
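
One way to approach the reload, sketched in Python: generate a small SQL file that refills the graph table from a persistent copy of the edges, and pass that file to the server's --init-file option. The table names mydb.foo and mydb.foo_edges are hypothetical, and the sketch assumes the engine accepts INSERT ... SELECT. The statement is kept on a single line as a precaution, because older servers are documented as requiring one statement per line in this file.

```python
def render_init_file(graph_table, source_table):
    """Build the text of an init file that reloads edges at server startup.

    Both table names should be database-qualified, because no default
    database is selected when the server reads the file.
    """
    statements = [
        f"INSERT INTO {graph_table} (origid,destid,weight) "
        f"SELECT origid,destid,weight FROM {source_table};",
    ]
    return "\n".join(statements) + "\n"


# Hypothetical names: mydb.foo is the OQGRAPH table, mydb.foo_edges is a
# persistent table (for example InnoDB) holding the same edges.
text = render_init_file("mydb.foo", "mydb.foo_edges")
print(text, end="")
```

The generated text would then be saved to a file and referenced from the server configuration or command line.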


First Look: OQGRAPH Mk. III

Features:
● Similar core graph implementation.
● Uses existing tables as a source for edge data.
● Does not impose any strict structure on the donor table.
● Efficient Judy sparse bitmaps for node traversal data.

Notes:
● Tables are read-only and only read from the backing table.
● The table must be in the same schema as the backing table.
● Current implementation is not of release quality yet.
● But it works!


Tree of Life, with Mk.III
Load the tol.sql schema,

Create tol_link backing store table,
create table tol_link (
source int unsigned not null,
target int unsigned not null,
primary key (source, target),
key (target) ) engine=innodb;

Populate it with all the edges we need:
INSERT INTO tol_link (source,target)
SELECT parent,id FROM tol WHERE parent IS NOT NULL
UNION ALL
SELECT id,parent FROM tol WHERE parent IS NOT NULL;
Query OK, 178102 rows affected (14.66 sec)

Direct download: http://bazaar.launchpad.net/~openquery-core/oqgraph/trunk/view/head:/examples/tree-of-life/tol.sql


Tree of Life, cont.

Creating the OQGRAPH MkIII table:
CREATE TABLE tol_tree (
latch SMALLINT UNSIGNED NULL,
origid BIGINT UNSIGNED NULL,
destid BIGINT UNSIGNED NULL,
weight DOUBLE NULL,
seq BIGINT UNSIGNED NULL,
linkid BIGINT UNSIGNED NULL,
KEY (latch, origid, destid) USING HASH,
KEY (latch, destid, origid) USING HASH
) ENGINE=OQGRAPH
data_table='tol_link' origid='source' destid='target';
select count(*) from tol_tree\G
count(*): 178102

Tree of Life - finding H.Sapiens

SELECT GROUP_CONCAT(name ORDER BY seq
SEPARATOR ' -> ') AS path
FROM tol_tree JOIN tol ON (linkid=id)
WHERE latch=1 AND origid=1 AND destid=16421 \G

path: Life on Earth -> Eukaryotes -> Unikonts ->
Opisthokonts -> Animals -> Bilateria -> Deuterostomia ->
Chordata -> Craniata -> Vertebrata -> Gnathostomata ->
Teleostomi -> Osteichthyes -> Sarcopterygii -> Terrestrial
Vertebrates -> Tetrapoda -> Reptiliomorpha -> Amniota ->
Synapsida -> Eupelycosauria -> Sphenacodontia ->
Sphenacodontoidea -> Therapsida -> Theriodontia ->
Cynodontia -> Mammalia -> Eutheria -> Primates ->
Catarrhini -> Hominidae -> Homo -> Homo sapiens
1 row in set (2.13 sec)

We want your feedback!!!1one!

● Very easy to use...
But do feel free to ask us for help/advice.

● OpenQuery created friendlist_graph for Drupal 6.
○ Addition to the existing friendlist module.
○ Enables easy social networking in Drupal.
○ Peter Lieverdink (@cafuego) did this in about 30 minutes

● We would like to know how you are using OQGRAPH!
○ You could be doing something really cool...


Links and support
● Binaries & Packages
○ http://mariadb.com (MariaDB 5.2 & above) < easiest to begin
○ http://ourdelta.org (MySQL 5.0)
● Source collaboration
○ http://launchpad.net/maria (in /storage/oqgraph)
○ http://launchpad.net/oqgraph
○ Development Mk3 source is currently at https://code.launchpad.net/~atcurtis/ourdelta/oqgraph-v3
● Info, Docs, Support, Licensing, Engineering
○ http://openquery.com/graph
○ This presentation: http://goo.gl/UrybZ

Thank you!
Antony Curtis & Arjen Lentz
graph@openquery.com
