OQGraph @ SCaLE 11x 2013

OpenQuery Graph engine for MariaDB

OQGraph @ SCaLE 11x 2013 Presentation Transcript

  • OQGraph 3 for MariaDB
    Graphs and Hierarchies in Plain SQL
    http://goo.gl/gqr7b
    Antony T Curtis <atcurtis@gmail.com>
    graph@openquery.com
    http://openquery.com/graph
  • Graphs / Networks
    ● Nodes connected by Edges.
    ● Edges may be directional.
    ● Edges may have a "weight" / "cost" attribute.
    ● Directed graphs may have bi-directional edges.
    ● Unconnected sets of nodes may exist on same graph.
    ● There need not be a "root" node.
    Examples:
    ● "Social Graphs" / friend relationships.
    ● Decision / State graphs.
    ● Airline routes
    OQGRAPH computation engine © 2009-2013 Open Query
  • RDBMS with Hierarchies and Graphs
    ● Not always a particularly good fit.
    ● Various tree models exist, each with limitations:
      ○ Adjacency model
        ■ Either uses a fixed max depth or recursive queries.
        ■ Oracle has CONNECT BY PRIOR
        ■ SQL99 has WITH RECURSIVE...UNION...
      ○ Nested set
        ■ Complex.
        ■ Recursive queries to find path to root.
      ○ Materialised path
        ■ Ugly and not relational.
        ■ Can be quite effective when used correctly.
    Further reading: http://dev.mysql.com/tech-resources/articles/hierarchical-data.html
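As an illustration of the adjacency model with a recursive query, here is a minimal sketch using Python's bundled sqlite3 module rather than MariaDB. The `category` table, its rows, and the `path_to_root` helper are made up for this example and do not come from the slides.

```python
import sqlite3

# In-memory database with a hypothetical adjacency-model table:
# each row stores only a pointer to its parent.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE category (
        id     INTEGER PRIMARY KEY,
        parent INTEGER REFERENCES category(id),  -- NULL marks a root
        name   TEXT NOT NULL
    );
    INSERT INTO category (id, parent, name) VALUES
        (1, NULL, 'electronics'),
        (2, 1,    'televisions'),
        (3, 2,    'lcd'),
        (4, 1,    'portable'),
        (5, 4,    'mp3 players');
""")

# Start at one node and climb one level per recursion step.
PATH_TO_ROOT = """
    WITH RECURSIVE ancestors(id, parent, name, depth) AS (
        SELECT id, parent, name, 0 FROM category WHERE id = ?
        UNION ALL
        SELECT c.id, c.parent, c.name, a.depth + 1
        FROM category AS c
        JOIN ancestors AS a ON c.id = a.parent
    )
    SELECT name FROM ancestors ORDER BY depth DESC
"""

def path_to_root(node_id):
    """Return the names from the root down to node_id."""
    return [row[0] for row in conn.execute(PATH_TO_ROOT, (node_id,))]

print(" -> ".join(path_to_root(3)))
```

The recursion depth is not fixed in advance, which is the advantage over the fixed-max-depth variant that needs one self-join per level.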
  • What is OQGRAPH?
    ● Implemented as a storage engine.
      ○ Original concept by Arjen Lentz
    ● Mk. 2 implementation 2008
      ○ GPLv2+
      ○ Bundled with MariaDB 5.2+
      ○ Boost Graph Library
    ● Mk. 3 implementation
      ○ GPLv2+
      ○ Bundled with MariaDB 10.0 (soon)
    ● Easy to enable
      ○ INSTALL PLUGIN oqgraph SONAME 'ha_oqgraph';
  • OQGRAPH: A Computation Engine
    ● It is not a general purpose data engine.
      ○ unlike MyISAM, InnoDB or MEMORY.
    ● Looks like an ordinary table.
    ● Has a very different internal architecture.
    ● It does not operate in terms of
      ○ storing data for later retrieval.
      ○ having indexes on data.
    ● May be regarded as a "magic view" or "table function".
  • OQGRAPH: A Computation Engine
    [Diagram: MySQL Server architecture. Layers shown: communications, session and thread management; SQL parser, DDL and DML handling, tables, views and stored procedure engine; lock services, buffers and caches, logging, utilities and runtime libraries; query optimizer and execution engine; built-in and run-time loaded plug-ins, with OQGraph alongside InnoDB.]
  • What's new in OQGRAPH 3
    Features:
    ● Judy array bitmaps for Graph coloring.
    ● Uses existing tables for edge data.
    ● Much lower memory cost per query.
    ● Does not impose any strict structure on the source table.
    ● Can handle significantly larger graphs than OQGRAPHv2.
      ○ 100K+ index reads per second are possible.
      ○ Millions of edges are possible.
    ● All edges of graph need not fit in memory.
      ○ Only Judy bitmap array must be held in RAM.
    Notes:
    ● Tables are read-only and only read from the backing table.
    ● Table must be in same schema as the backing table.
    ● Table must have appropriate indexes.
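To make the "only the bitmap must be held in RAM" point concrete, here is a rough sketch in Python. It is not OQGRAPH's implementation: a plain dense bitmap stands in for the Judy array, and the hypothetical `fetch_out_edges` callback stands in for an index read against the backing table. The edge rows are the small sample graph used later in the deck.

```python
class Bitmap:
    """Dense one-bit-per-node visited map (a stand-in for a Judy bitmap)."""

    def __init__(self, size):
        self._bits = bytearray((size + 7) // 8)

    def test(self, i):
        return bool(self._bits[i >> 3] & (1 << (i & 7)))

    def set(self, i):
        self._bits[i >> 3] |= 1 << (i & 7)


def traverse(start, max_node_id, fetch_out_edges):
    """Breadth-first walk that keeps only the bitmap and the current frontier.

    Edges are pulled on demand through fetch_out_edges(node), so the full
    edge list never has to be loaded at once.
    """
    visited = Bitmap(max_node_id + 1)
    visited.set(start)
    frontier = [start]
    order = []
    while frontier:
        next_frontier = []
        for node in frontier:
            order.append(node)
            for dest in fetch_out_edges(node):
                if not visited.test(dest):
                    visited.set(dest)
                    next_frontier.append(dest)
        frontier = next_frontier
    return order


# Hypothetical backing "table": in a real server this lookup would be an
# index read on the edge table, not a scan of a Python list.
EDGE_ROWS = [(1, 2), (2, 3), (2, 4), (4, 5), (3, 6), (5, 6)]

def fetch_out_edges(node):
    return [dest for orig, dest in EDGE_ROWS if orig == node]

print(traverse(1, 6, fetch_out_edges))
```

One bit per node keeps the per-query state small even for millions of nodes, which is the property the feature list is pointing at.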
  • Anatomy of an OQGRAPH 3 table
    CREATE TABLE db.tblname (
      latch  SMALLINT UNSIGNED NULL,
      origid BIGINT UNSIGNED NULL,
      destid BIGINT UNSIGNED NULL,
      weight DOUBLE NULL,
      seq    BIGINT UNSIGNED NULL,
      linkid BIGINT UNSIGNED NULL,
      KEY (latch, origid, destid) USING HASH,
      KEY (latch, destid, origid) USING HASH
    ) ENGINE=OQGRAPH
      data_table='link'   -- data table
      origid='source'     -- column name
      destid='target'     -- column name
      weight='weight';    -- optional column name
  • OQGRAPH - Data source
    ● Edges are directed edges.
    ● Edge weights are optional and default to 1.0.
    ● Undirected edges may be represented as two directed edges, in opposite directions.
    CREATE TABLE foo (
      origid INT UNSIGNED NOT NULL,
      destid INT UNSIGNED NOT NULL,
      PRIMARY KEY (origid, destid),
      KEY (destid)
    );
    INSERT INTO foo (origid, destid) VALUES
      (1,2), (2,3), (2,4), (4,5), (3,6), (5,6);
  • OQGRAPH - Data source, cont.
    Creating the OQGRAPH table:
    CREATE TABLE foo_graph (
      latch  SMALLINT UNSIGNED NULL,
      origid BIGINT UNSIGNED NULL,
      destid BIGINT UNSIGNED NULL,
      weight DOUBLE NULL,
      seq    BIGINT UNSIGNED NULL,
      linkid BIGINT UNSIGNED NULL,
      KEY (latch, origid, destid) USING HASH,
      KEY (latch, destid, origid) USING HASH
    ) ENGINE=OQGRAPH
      data_table='foo' origid='origid' destid='destid';
  • Selecting Edges
    MariaDB [foo]> select * from foo_graph;
    +-------+--------+--------+--------+------+--------+
    | latch | origid | destid | weight | seq  | linkid |
    +-------+--------+--------+--------+------+--------+
    |  NULL |      1 |      2 |      1 | NULL |   NULL |
    |  NULL |      2 |      3 |      1 | NULL |   NULL |
    |  NULL |      2 |      4 |      1 | NULL |   NULL |
    |  NULL |      3 |      6 |      1 | NULL |   NULL |
    |  NULL |      4 |      5 |      1 | NULL |   NULL |
    |  NULL |      5 |      6 |      1 | NULL |   NULL |
    +-------+--------+--------+--------+------+--------+
    6 rows in set (0.38 sec)
  • Now, it's time for some magic. (shortest path calculation)
    ● SELECT * FROM foo_graph WHERE latch=1 AND origid=1 AND destid=6;
      +-------+--------+--------+--------+------+--------+
      | latch | origid | destid | weight | seq  | linkid |
      +-------+--------+--------+--------+------+--------+
      |     1 |      1 |      6 |   NULL |    0 |      1 |
      |     1 |      1 |      6 |      1 |    1 |      2 |
      |     1 |      1 |      6 |      1 |    2 |      3 |
      |     1 |      1 |      6 |      1 |    3 |      6 |
      +-------+--------+--------+--------+------+--------+
    ● SELECT GROUP_CONCAT(linkid ORDER BY seq) AS path FROM foo_graph WHERE latch=1 AND origid=1 AND destid=6 \G
      path: 1,2,3,6
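For readers who want to see what a latch=1 query computes, the following is a small stand-alone Python sketch of Dijkstra's algorithm over the same six edges, with every edge given the default weight of 1.0. The function names are invented for this illustration; OQGRAPH itself does this work inside the server.

```python
import heapq

# The six directed edges inserted into table foo on the earlier slide.
EDGES = [(1, 2), (2, 3), (2, 4), (4, 5), (3, 6), (5, 6)]


def build_adjacency(edges, default_weight=1.0):
    """Map each node to a list of (neighbour, weight) pairs."""
    adjacency = {}
    for origid, destid in edges:
        adjacency.setdefault(origid, []).append((destid, default_weight))
        adjacency.setdefault(destid, [])
    return adjacency


def shortest_path(adjacency, origid, destid):
    """Dijkstra's algorithm; returns the node list, or [] if unreachable."""
    dist = {origid: 0.0}
    prev = {}
    heap = [(0.0, origid)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry, a shorter route was already found
        if node == destid:
            break
        for neighbour, weight in adjacency.get(node, []):
            candidate = d + weight
            if candidate < dist.get(neighbour, float("inf")):
                dist[neighbour] = candidate
                prev[neighbour] = node
                heapq.heappush(heap, (candidate, neighbour))
    if destid not in dist:
        return []
    path = [destid]
    while path[-1] != origid:
        path.append(prev[path[-1]])
    path.reverse()
    return path


adjacency = build_adjacency(EDGES)
print(",".join(str(n) for n in shortest_path(adjacency, 1, 6)))
```

The route through nodes 4 and 5 also reaches node 6 but takes four hops instead of three, so it is not chosen.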
  • Other computations
    ● Which paths lead to node 4?
      SELECT GROUP_CONCAT(linkid) AS list FROM foo_graph WHERE latch=1 AND destid=4 \G
      list: 1,2,4
    ● Where can I get to from node 4?
      SELECT GROUP_CONCAT(linkid) AS list FROM foo_graph WHERE latch=1 AND origid=4 \G
      list: 6,5,4
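The two open-ended queries above amount to reachability searches, one following edges forwards and one backwards. A minimal Python sketch over the same sample edges (the `reachable` helper is hypothetical, not an OQGRAPH API):

```python
from collections import deque

# Same directed edges as table foo.
EDGES = [(1, 2), (2, 3), (2, 4), (4, 5), (3, 6), (5, 6)]


def reachable(edges, start, reverse=False):
    """Return the sorted nodes reachable from start (start included).

    With reverse=True every edge is flipped, which answers
    "which nodes can reach start?" instead.
    """
    adjacency = {}
    for origid, destid in edges:
        if reverse:
            origid, destid = destid, origid
        adjacency.setdefault(origid, []).append(destid)

    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return sorted(seen)


print("leads to 4:", reachable(EDGES, 4, reverse=True))
print("from 4:    ", reachable(EDGES, 4))
```

The sets match the lists on the slide; only the ordering differs, because the slide's GROUP_CONCAT has no ORDER BY.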
  • Other computations, continued.
    ● See docs for latch 0 and latch NULL
    ● latch 1: Dijkstra's shortest path.
      ○ O((V + E) log V)
    ● latch 2: Breadth-first search.
      ○ O(V + E)
    ● Other algorithms possible
  • Joins make it prettier
    ● INSERT INTO people VALUES
        (1,'pearce'), (2,'hunnicut'), (3,'potter'),
        (4,'hoolihan'), (5,'winchester'), (6,'mulcahy');
    ● SELECT GROUP_CONCAT(name ORDER BY seq) path
        FROM foo_graph JOIN people ON (foo_graph.linkid = people.id)
        WHERE latch=1 AND origid=1 AND destid=6 \G
      path: pearce,hunnicut,potter,mulcahy
  • Tree of Life
    Load the tol.sql schema.
    Create the tol_link backing store table:
    CREATE TABLE tol_link (
      source INT UNSIGNED NOT NULL,
      target INT UNSIGNED NOT NULL,
      PRIMARY KEY (source, target),
      KEY (target)
    ) ENGINE=innodb;
    Populate it with all the edges we need:
    INSERT INTO tol_link (source, target)
      SELECT parent, id FROM tol WHERE parent IS NOT NULL
      UNION ALL
      SELECT id, parent FROM tol WHERE parent IS NOT NULL;
    Query OK, 178102 rows affected (46.35 sec)
    Records: 178102 Duplicates: 0 Warnings: 0
    Direct download: http://bazaar.launchpad.net/~openquery-core/oqgraph/trunk/view/head:/examples/tree-of-life/tol.sql
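The INSERT ... UNION ALL above stores each parent/child link twice, once in each direction, so the tree can be walked both up and down. A tiny sketch of the same statement, run against Python's bundled sqlite3 instead of MariaDB; the three-row `tol` table and its ids are made up for the example and are not taken from tol.sql.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical miniature of the tol table (ids invented for this sketch).
    CREATE TABLE tol (
        id     INTEGER PRIMARY KEY,
        parent INTEGER,
        name   TEXT NOT NULL
    );
    INSERT INTO tol (id, parent, name) VALUES
        (1, NULL, 'Life on Earth'),
        (2, 1,    'Eukaryotes'),
        (3, 2,    'Unikonts');

    CREATE TABLE tol_link (
        source INTEGER NOT NULL,
        target INTEGER NOT NULL,
        PRIMARY KEY (source, target)
    );
    CREATE INDEX tol_link_target ON tol_link (target);

    -- One directed edge parent -> child and one child -> parent per link.
    INSERT INTO tol_link (source, target)
        SELECT parent, id FROM tol WHERE parent IS NOT NULL
        UNION ALL
        SELECT id, parent FROM tol WHERE parent IS NOT NULL;
""")

links = conn.execute(
    "SELECT source, target FROM tol_link ORDER BY source, target"
).fetchall()
print(links)
```

Two parent links produce four edge rows, which is why the slide reports roughly twice as many rows as the tree has non-root nodes.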
  • Tree of Life, cont.
    Creating the OQGRAPH table:
    CREATE TABLE tol_tree (
      latch  SMALLINT UNSIGNED NULL,
      origid BIGINT UNSIGNED NULL,
      destid BIGINT UNSIGNED NULL,
      weight DOUBLE NULL,
      seq    BIGINT UNSIGNED NULL,
      linkid BIGINT UNSIGNED NULL,
      KEY (latch, origid, destid) USING HASH,
      KEY (latch, destid, origid) USING HASH
    ) ENGINE=OQGRAPH
      data_table='tol_link' origid='source' destid='target';
  • Tree of Life - finding H. sapiens
    SELECT GROUP_CONCAT(name ORDER BY seq SEPARATOR ' -> ') AS path
      FROM tol_tree JOIN tol ON (linkid=id)
      WHERE latch=1 AND origid=1 AND destid=16421 \G
    path: Life on Earth -> Eukaryotes -> Unikonts -> Opisthokonts -> Animals -> Bilateria -> Deuterostomia -> Chordata -> Craniata -> Vertebrata -> Gnathostomata -> Teleostomi -> Osteichthyes -> Sarcopterygii -> Terrestrial Vertebrates -> Tetrapoda -> Reptiliomorpha -> Amniota -> Synapsida -> Eupelycosauria -> Sphenacodontia -> Sphenacodontoidea -> Therapsida -> Theriodontia -> Cynodontia -> Mammalia -> Eutheria -> Primates -> Catarrhini -> Hominidae -> Homo -> Homo sapiens
  • Internet Movie DataBase (IMDB)
    Transform and load the movie database (this takes a long time).
    CREATE TABLE `entity` (
      `id` int(11) NOT NULL AUTO_INCREMENT,
      `type` enum('ACTOR','MOVIE','TV MOVIE','TV MINI','TV SERIES','VIDEO MOVIE','VIDEO GAME','VOICE','ARCHIVE') NOT NULL,
      `name` varchar(128) COLLATE utf8_unicode_ci NOT NULL,
      PRIMARY KEY (`id`),
      UNIQUE KEY `type` (`type`,`name`) USING BTREE
    ) ENGINE=InnoDB;
    CREATE TABLE `link` (
      `rel_id` int(11) NOT NULL AUTO_INCREMENT,
      `link_from` int(11) NOT NULL,
      `link_to` int(11) NOT NULL,
      PRIMARY KEY (`rel_id`),
      KEY `link_from` (`link_from`,`link_to`),
      KEY `link_to` (`link_to`)
    ) ENGINE=InnoDB;
  • Degrees of N!xau
    Graph of movies: approximately 3.7 million nodes with 9 million edges.
    Tables are about 1GB and InnoDB is configured with a 512MB buffer pool.
    MariaDB [imdb]> SELECT
        -> GROUP_CONCAT(name ORDER BY seq SEPARATOR ' -> ') AS path
        -> FROM movie_graph JOIN entity ON (id=linkid)
        -> WHERE latch=1
        -> AND origid=(SELECT a.id FROM entity a
        -> WHERE name='Kevin Bacon')
        -> AND destid=(SELECT b.id FROM entity b WHERE name='N!xau')\G
    *************************** 1. row ***************************
    path: Kevin Bacon -> The Air Up There (1994) -> Fanyana H. Sidumo -> The Gods Must Be Crazy (1981) -> N!xau
    1 row in set (3 min 9.67 sec)
    -- again
    *************************** 1. row ***************************
    path: Kevin Bacon -> The Air Up There (1994) -> Fanyana H. Sidumo -> The Gods Must Be Crazy (1981) -> N!xau
    1 row in set (1 min 7.13 sec)
    Each query requires approximately 7.8 million secondary key reads.
  • Degrees of N!xau
    Graph of approximately 3.7 million nodes with 30 million edges.
    Tables are about 3.5GB and InnoDB is configured with a 512MB buffer pool.
    MariaDB [imdb]> SELECT
        -> GROUP_CONCAT(name ORDER BY seq SEPARATOR ' -> ') AS path
        -> FROM imdb_graph JOIN entity ON (id=linkid)
        -> WHERE latch=1
        -> AND origid=(SELECT a.id FROM entity a
        -> WHERE name='Kevin Bacon')
        -> AND destid=(SELECT b.id FROM entity b WHERE name='N!xau')\G
    *************************** 1. row ***************************
    path: Kevin Bacon -> The 45th Annual Golden Globe Awards (1988) -> Richard Attenborough -> In Darkest Hollywood: Cinema and Apartheid (1993) -> N!xau
    1 row in set (10 min 6.55 sec)
    -- again
    *************************** 1. row ***************************
    path: Kevin Bacon -> The 45th Annual Golden Globe Awards (1988) -> Richard Attenborough -> In Darkest Hollywood: Cinema and Apartheid (1993) -> N!xau
    1 row in set (8 min 29.66 sec)
    Each query requires approximately 16.6 million secondary key reads.
  • We want your feedback!
    ● Very easy to use... But do feel free to ask us for help/advice.
    ● OpenQuery created friendlist_graph for Drupal 6.
      ○ Currently based on OQGraph v2
      ○ Addition to the existing friendlist module.
      ○ Enables easy social networking in Drupal.
      ○ Peter Lieverdink (@cafuego) did this in about 30 minutes
    ● We would like to know how you are using OQGRAPH!
      ○ You could be doing something really cool...
  • Links and support
    ● Binaries & Packages
      ○ http://mariadb.com (MariaDB 10.0 soon)
    ● Source collaboration
      ○ https://launchpad.net/oqgraph
      ○ https://code.launchpad.net/~oqgraph-dev/maria/10.0-oqgraph3
    ● Info, Docs, Support, Licensing, Engineering
      ○ http://openquery.com/graph
      ○ This presentation: http://goo.gl/gqr7b
    Thank you!
    Antony Curtis & Arjen Lentz
    graph@openquery.com