OQGraph 3 for MariaDB
   Graphs and Hierarchies in Plain SQL
             http://goo.gl/gqr7b




Antony T Curtis <atcurtis@gmail.com>


                        graph@openquery.com
                        http://openquery.com/graph
Graphs / Networks
     ● Nodes connected by Edges.
     ● Edges may be directional.
     ● Edges may have a "weight" / "cost" attribute.
     ● Directed graphs may have bi-directional edges.
     ● Unconnected sets of nodes may exist on same graph.
     ● There need not be a "root" node.




   Examples:
    ● "Social Graphs" / friend relationships.
    ● Decision / State graphs.
    ● Airline routes
OQGRAPH computation engine © 2009-2013 Open Query
RDBMS with Hierarchies and Graphs

     ● Not always a particularly good fit.
     ● Various tree models exist; each with limitations:
        ○ Adjacency model
           ■ Either uses fixed max depth or recursive queries.
           ■ Oracle has CONNECT BY PRIOR
           ■ SQL99 has WITH RECURSIVE...UNION...
        ○ Nested set
           ■ complex
           ■ recursive queries to find path to root.
        ○ Materialised path
           ■ Ugly and not relational.
           ■ Can be quite effective when used correctly.

                                              Further reading: http://dev.mysql.com/tech-resources/articles/hierarchical-data.html
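
To make the adjacency model concrete, here is a minimal sketch of a recursive query that finds the path from a node to the root. It uses Python's built-in sqlite3 module only so that the example is self-contained; the `category` table, its rows and the `path_to_root` helper are hypothetical and not part of the original deck.

```python
import sqlite3

# Hypothetical adjacency-model table: each row stores a pointer to its parent.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE category (id INTEGER PRIMARY KEY, parent INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO category (id, parent, name) VALUES (?, ?, ?)",
    [(1, None, "electronics"), (2, 1, "televisions"), (3, 2, "lcd"), (4, 2, "plasma")],
)


def path_to_root(conn, node_id):
    """Walk parent pointers with WITH RECURSIVE ... UNION ALL, root first."""
    rows = conn.execute(
        """
        WITH RECURSIVE ancestors(id, parent, name, depth) AS (
            SELECT id, parent, name, 0 FROM category WHERE id = ?
            UNION ALL
            SELECT c.id, c.parent, c.name, a.depth + 1
            FROM category AS c JOIN ancestors AS a ON c.id = a.parent
        )
        SELECT name FROM ancestors ORDER BY depth DESC
        """,
        (node_id,),
    ).fetchall()
    return [name for (name,) in rows]


print(path_to_root(conn, 3))  # ['electronics', 'televisions', 'lcd']
```

The recursion stops at the root because its parent is NULL and matches no row. Without recursive queries, the same walk needs one self-join per level, which is where the fixed maximum depth comes from.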

What is OQGRAPH?

    ● Implemented as a storage engine.
       ○ Original concept by Arjen Lentz
    ● Mk. 2 implementation 2008
       ○ GPLv2+
       ○ Bundled with MariaDB 5.2+
       ○ Boost Graph Library
    ● Mk. 3 implementation
       ○ GPLv2+
       ○ Bundled with MariaDB 10.0 (soon)
    ● Easy to enable
         ○ INSTALL PLUGIN oqgraph SONAME 'ha_oqgraph';




OQGRAPH: A Computation Engine

     ● It is not a general purpose data engine.
        ○ unlike MyISAM, InnoDB or MEMORY.
     ● Looks like an ordinary table.
     ● Has a very different internal architecture.
     ● It does not operate in terms of
        ○ storing data for later retrieval.
        ○ having indexes on data.

     ● May be regarded as a "magic view" or "table function".




OQGRAPH: A Computation Engine

   [Figure: MySQL Server architecture diagram. Blocks shown: Communications,
   Session and Thread Management; Management Services, Logging, Utilities and
   Runtime Libraries; DDL, DML, Tables, Views, Lock Management; SQL Parser and
   SQL Stored Procedure Engine; Buffers and Caches; Query Optimizer and
   Execution Engine; built-in and run-time loaded plug-ins, including OQGraph
   and InnoDB.]


What's new in OQGRAPH 3
   Features:
    ● Judy array bitmaps for Graph coloring.
    ● Uses existing tables for edge data.
    ● Much lower memory cost per query.
    ● Does not impose any strict structure on the source table.
    ● Can handle significantly larger graphs than OQGRAPHv2.
       ○ 100K+ index reads per second are possible.
       ○ Millions of edges are possible.
    ● All edges of graph need not fit in memory.
       ○ Only Judy bitmap array must be held in RAM.
   Notes:
    ● OQGRAPH tables are read-only; data is only read from the backing table.
    ● The OQGRAPH table must be in the same schema as the backing table.
    ● Table must have appropriate indexes.
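
The memory claim above can be pictured with a small sketch: a traversal that keeps one "visited" bit per node in RAM and fetches edges on demand. This is an illustration of the idea only, not OQGRAPH's actual implementation. A plain `bytearray` stands in for the Judy bitmap, and `reachable`, `fetch_neighbours` and `EDGES` are hypothetical names.

```python
from collections import deque


def reachable(start, fetch_neighbours, max_node_id):
    """Breadth-first search that keeps only one visited bit per node in memory."""
    visited = bytearray(max_node_id // 8 + 1)  # stand-in for a Judy bitmap

    def seen(n):
        return visited[n >> 3] & (1 << (n & 7))

    def mark(n):
        visited[n >> 3] |= 1 << (n & 7)

    order, queue = [], deque([start])
    mark(start)
    while queue:
        node = queue.popleft()
        order.append(node)
        for nxt in fetch_neighbours(node):  # in a database: one index lookup
            if not seen(nxt):
                mark(nxt)
                queue.append(nxt)
    return order


# Hypothetical backing data: a small directed edge list of (origid, destid) pairs.
EDGES = [(1, 2), (2, 3), (2, 4), (4, 5), (3, 6), (5, 6)]


def fetch_neighbours(node):
    # Simulates "SELECT destid FROM backing_table WHERE origid = node".
    return [d for (o, d) in EDGES if o == node]


print(reachable(1, fetch_neighbours, 6))  # [1, 2, 3, 4, 6, 5]
```

The edge list never has to be loaded as a whole; only the bitmap and the current queue live in memory, which is why the index on the backing table matters so much.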

Anatomy of an OQGRAPH 3 table
   CREATE TABLE db.tblname (
     latch SMALLINT UNSIGNED NULL,
     origid BIGINT UNSIGNED NULL,
     destid BIGINT UNSIGNED NULL,
     weight DOUBLE NULL,
     seq BIGINT UNSIGNED NULL,
     linkid BIGINT UNSIGNED NULL,
     KEY (latch, origid, destid) USING HASH,
     KEY (latch, destid, origid) USING HASH
   ) ENGINE=OQGRAPH
     data_table='link'       -- data table
     origid='source'         -- column name
     destid='target'         -- column name
     weight='weight';        -- optional column name
OQGRAPH - Data source
     ● Edges are directed edges.
     ● Edge weights are optional and default to 1.0.
     ● Undirected edges may be represented as two directed
       edges, in opposite directions.

   CREATE TABLE foo (
      origid INT UNSIGNED NOT NULL,
      destid INT UNSIGNED NOT NULL,
      PRIMARY KEY(origid, destid),
      KEY (destid)
   );
   INSERT INTO foo (origid,destid) VALUES
   (1,2), (2,3), (2,4),
   (4,5), (3,6), (5,6);
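
As a rough in-memory picture of the three rules above, the sketch below builds an adjacency map from the same six edges. The `build_graph` helper and its `undirected` flag are hypothetical and exist only for illustration; OQGRAPH itself reads the edges from the backing table.

```python
from collections import defaultdict


def build_graph(edges, undirected=False):
    """Return {origid: {destid: weight}}; a missing weight defaults to 1.0."""
    graph = defaultdict(dict)
    for edge in edges:
        origid, destid = edge[0], edge[1]
        weight = edge[2] if len(edge) > 2 else 1.0
        graph[origid][destid] = weight
        if undirected:
            # An undirected edge is stored as two directed edges.
            graph[destid][origid] = weight
    return dict(graph)


FOO = [(1, 2), (2, 3), (2, 4), (4, 5), (3, 6), (5, 6)]
directed = build_graph(FOO)
both_ways = build_graph(FOO, undirected=True)

print(directed[2])   # {3: 1.0, 4: 1.0}
print(both_ways[6])  # {3: 1.0, 5: 1.0}
```

In the directed graph node 6 has no outgoing edges, so it does not appear as a key at all; in the two-way version every edge is present in both directions.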


OQGRAPH - Data source, cont.

   Creating the OQGRAPH table:
   CREATE TABLE foo_graph (
     latch SMALLINT UNSIGNED NULL,
     origid BIGINT UNSIGNED NULL,
     destid BIGINT UNSIGNED NULL,
     weight DOUBLE NULL,
     seq BIGINT UNSIGNED NULL,
     linkid BIGINT UNSIGNED NULL,
     KEY (latch, origid, destid) USING HASH,
     KEY (latch, destid, origid) USING HASH
   ) ENGINE=OQGRAPH
     data_table='foo' origid='origid' destid='destid';




Selecting Edges

   MariaDB [foo]> select * from foo_graph;
   +-------+--------+--------+--------+------+--------+
   | latch | origid | destid | weight | seq  | linkid |
   +-------+--------+--------+--------+------+--------+
   |  NULL |      1 |      2 |      1 | NULL |   NULL |
   |  NULL |      2 |      3 |      1 | NULL |   NULL |
   |  NULL |      2 |      4 |      1 | NULL |   NULL |
   |  NULL |      3 |      6 |      1 | NULL |   NULL |
   |  NULL |      4 |      5 |      1 | NULL |   NULL |
   |  NULL |      5 |      6 |      1 | NULL |   NULL |
   +-------+--------+--------+--------+------+--------+
   6 rows in set (0.38 sec)




Now, it's time for some magic.
   (shortest path calculation)

      ● SELECT * FROM foo_graph
        WHERE latch=1 AND origid=1 AND destid=6;
        +-------+--------+--------+--------+------+--------+
        | latch | origid | destid | weight | seq  | linkid |
        +-------+--------+--------+--------+------+--------+
        |     1 |      1 |      6 |   NULL |    0 |      1 |
        |     1 |      1 |      6 |      1 |    1 |      2 |
        |     1 |      1 |      6 |      1 |    2 |      3 |
        |     1 |      1 |      6 |      1 |    3 |      6 |
        +-------+--------+--------+--------+------+--------+


      ● SELECT GROUP_CONCAT(linkid ORDER BY seq) AS path
        FROM foo_graph WHERE latch=1 AND origid=1 AND destid=6 \G
        path: 1,2,3,6
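
For readers who want to see what the latch=1 query computes, here is a plain-Python sketch of Dijkstra's algorithm over the same six edges. It is an illustration only, not the engine's code; `shortest_path` and `FOO` are hypothetical names, and every edge is given the default weight of 1.0.

```python
import heapq

FOO = [(1, 2), (2, 3), (2, 4), (4, 5), (3, 6), (5, 6)]


def shortest_path(edges, origid, destid):
    """Dijkstra's algorithm over directed edges with a default weight of 1.0."""
    graph = {}
    for o, d in edges:
        graph.setdefault(o, []).append((d, 1.0))

    dist = {origid: 0.0}
    prev = {}
    heap = [(0.0, origid)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == destid:
            break
        if cost > dist.get(node, float("inf")):
            continue  # stale heap entry
        for nxt, weight in graph.get(node, []):
            new_cost = cost + weight
            if new_cost < dist.get(nxt, float("inf")):
                dist[nxt] = new_cost
                prev[nxt] = node
                heapq.heappush(heap, (new_cost, nxt))

    if destid not in dist:
        return None  # no path exists
    path = [destid]
    while path[-1] != origid:
        path.append(prev[path[-1]])
    return path[::-1]


print(shortest_path(FOO, 1, 6))  # [1, 2, 3, 6]
```

The other route, 1 -> 2 -> 4 -> 5 -> 6, has four edges instead of three, so it loses. Because the edges are directed, asking for a path from 6 back to 1 returns None.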


Other computations,
     ● Which paths lead to node 4?
        SELECT GROUP_CONCAT(linkid) AS list
        FROM foo_graph WHERE latch=1 AND destid=4 \G

        list: 1,2,4


     ● Where can I get to from node 4?
        SELECT GROUP_CONCAT(linkid) AS list
        FROM foo_graph WHERE latch=1 AND origid=4 \G

        list: 6,5,4
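
Both questions are reachability searches, one following edges backwards and one forwards. The sketch below reproduces the two node sets in plain Python using breadth-first search; `reachable_from` and its `reverse` flag are hypothetical names, and the result is sorted, whereas the engine's row order may differ.

```python
from collections import deque

FOO = [(1, 2), (2, 3), (2, 4), (4, 5), (3, 6), (5, 6)]


def reachable_from(edges, start, reverse=False):
    """Breadth-first search; with reverse=True, follow edges backwards."""
    neighbours = {}
    for o, d in edges:
        src, dst = (d, o) if reverse else (o, d)
        neighbours.setdefault(src, []).append(dst)
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbours.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return sorted(seen)


print(reachable_from(FOO, 4, reverse=True))  # nodes with a path to 4: [1, 2, 4]
print(reachable_from(FOO, 4))                # nodes reachable from 4: [4, 5, 6]
```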




Other computations, continued.

     ● See docs for latch 0 and latch NULL
     ● latch 1 : Dijkstra's shortest path.
        ○ O((V + E) log V)
     ● latch 2 : Breadth-first search.
        ○ O(V+E)
     ● Other algorithms possible




Joins make it prettier,
     ● INSERT INTO people VALUES
       (1,'pearce'), (2,'hunnicut'), (3,'potter'),
       (4,'hoolihan'), (5,'winchester'), (6,'mulcahy');


     ● SELECT GROUP_CONCAT(name ORDER BY seq) path
       FROM foo_graph
       JOIN people ON (foo_graph.linkid = people.id)
       WHERE latch=1 AND origid=1 AND destid=6 \G

        path: pearce,hunnicut,potter,mulcahy


Tree of Life
 Load the tol.sql schema,

 Create tol_link backing store table,
 CREATE TABLE tol_link (
   source INT UNSIGNED NOT NULL,
   target INT UNSIGNED NOT NULL,
   PRIMARY KEY (source, target),
   KEY (target) ) ENGINE=innodb;

 Populate it with all the edges we need:
 INSERT INTO tol_link (source,target)
 SELECT parent,id FROM tol WHERE parent IS NOT NULL
 UNION ALL
 SELECT id,parent FROM tol WHERE parent IS NOT NULL;
 Query OK, 178102 rows affected (46.35 sec)
 Records: 178102 Duplicates: 0 Warnings: 0
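
The UNION ALL inserts every parent/child link twice, once in each direction, so the tree can be walked both up and down. Here is a tiny sketch of the same statement run against sqlite3 from Python; the three-row `tol` table is a hypothetical stand-in for the real data set, and only the `id`, `parent` and `name` columns are assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, tiny stand-in for the tol table (only id, parent and name).
conn.execute("CREATE TABLE tol (id INTEGER PRIMARY KEY, parent INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO tol VALUES (?, ?, ?)",
    [(1, None, "Life on Earth"), (2, 1, "Eukaryotes"), (3, 2, "Unikonts")],
)
conn.execute(
    "CREATE TABLE tol_link (source INTEGER NOT NULL, target INTEGER NOT NULL,"
    " PRIMARY KEY (source, target))"
)
# Same statement as on the slide: one row per direction for every link.
conn.execute(
    """
    INSERT INTO tol_link (source, target)
    SELECT parent, id FROM tol WHERE parent IS NOT NULL
    UNION ALL
    SELECT id, parent FROM tol WHERE parent IS NOT NULL
    """
)
links = conn.execute(
    "SELECT source, target FROM tol_link ORDER BY source, target"
).fetchall()
print(links)  # [(1, 2), (2, 1), (2, 3), (3, 2)]
```

Two parent links become four edge rows. The root contributes nothing on its own because its parent is NULL.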

                 Direct download: http://bazaar.launchpad.net/~openquery-core/oqgraph/trunk/view/head:/examples/tree-of-life/tol.sql

Tree of Life, cont.

   Creating the OQGRAPH table:
   CREATE TABLE tol_tree (
     latch SMALLINT UNSIGNED NULL,
     origid BIGINT UNSIGNED NULL,
     destid BIGINT UNSIGNED NULL,
     weight DOUBLE NULL,
     seq BIGINT UNSIGNED NULL,
     linkid BIGINT UNSIGNED NULL,
     KEY (latch, origid, destid) USING HASH,
     KEY (latch, destid, origid) USING HASH
   ) ENGINE=OQGRAPH
     data_table='tol_link' origid='source' destid='target';




Tree of Life - finding H.Sapiens

   SELECT
      GROUP_CONCAT(name ORDER BY seq SEPARATOR ' -> ') AS path
      FROM tol_tree JOIN tol ON (linkid=id)
      WHERE latch=1 AND origid=1 AND destid=16421 \G

   path: Life on Earth -> Eukaryotes -> Unikonts ->
   Opisthokonts -> Animals -> Bilateria ->
   Deuterostomia -> Chordata -> Craniata -> Vertebrata
   -> Gnathostomata -> Teleostomi -> Osteichthyes ->
   Sarcopterygii -> Terrestrial Vertebrates ->
   Tetrapoda -> Reptiliomorpha -> Amniota -> Synapsida
   -> Eupelycosauria -> Sphenacodontia ->
   Sphenacodontoidea -> Therapsida -> Theriodontia ->
   Cynodontia -> Mammalia -> Eutheria -> Primates ->
   Catarrhini -> Hominidae -> Homo -> Homo sapiens

Internet Movie DataBase (IMDB)
 Transform and load the movie database (this takes a long time)
 CREATE TABLE `entity` (
   `id` int(11) NOT NULL AUTO_INCREMENT,
   `type` enum('ACTOR','MOVIE','TV MOVIE','TV MINI','TV SERIES',
               'VIDEO MOVIE','VIDEO GAME','VOICE','ARCHIVE') NOT NULL,
   `name` varchar(128) COLLATE utf8_unicode_ci NOT NULL,
   PRIMARY KEY (`id`),
   UNIQUE KEY `type` (`type`,`name`) USING BTREE
 ) ENGINE=InnoDB;

 CREATE TABLE `link` (
   `rel_id` int(11) NOT NULL AUTO_INCREMENT,
   `link_from` int(11) NOT NULL,
   `link_to` int(11) NOT NULL,
   PRIMARY KEY (`rel_id`),
   KEY `link_from` (`link_from`,`link_to`),
   KEY `link_to` (`link_to`)
 ) ENGINE=InnoDB;




Degrees of N!xau
 Graph of movies: approximately 3.7 million nodes with 9 million edges. Tables are
 about 1GB and InnoDB is configured with a 512MB buffer pool.
 MariaDB [imdb]> SELECT
               -> GROUP_CONCAT(name ORDER BY seq SEPARATOR ' -> ') AS path
               -> FROM movie_graph JOIN entity ON (id=linkid)
               -> WHERE latch=1
               -> AND origid=(SELECT a.id FROM entity a
               ->               WHERE name='Kevin Bacon')
               -> AND destid=(SELECT b.id FROM entity b
               ->               WHERE name='N!xau')\G
 *************************** 1. row ***************************
 path: Kevin Bacon -> The Air Up There (1994) -> Fanyana H. Sidumo ->
 The Gods Must Be Crazy (1981) -> N!xau
 1 row in set (3 min 9.67 sec)
 --again
 *************************** 1. row ***************************
 path: Kevin Bacon -> The Air Up There (1994) -> Fanyana H. Sidumo ->
 The Gods Must Be Crazy (1981) -> N!xau
 1 row in set (1 min 7.13 sec)
 Each query requires approximately 7.8 million secondary key reads.




Degrees of N!xau
 Graph of approximately 3.7 million nodes with 30 million edges. Tables are about
 3.5GB and InnoDB is configured with a 512MB buffer pool.
 MariaDB [imdb]> SELECT
               -> GROUP_CONCAT(name ORDER BY seq SEPARATOR ' -> ') AS path
               -> FROM imdb_graph JOIN entity ON (id=linkid)
               -> WHERE latch=1
               -> AND origid=(SELECT a.id FROM entity a
               ->               WHERE name='Kevin Bacon')
               -> AND destid=(SELECT b.id FROM entity b
               ->               WHERE name='N!xau')\G
 *************************** 1. row ***************************
 path: Kevin Bacon -> The 45th Annual Golden Globe Awards (1988) ->
 Richard Attenborough -> In Darkest Hollywood: Cinema and Apartheid
 (1993) -> N!xau
 1 row in set (10 min 6.55 sec)
 --again
 *************************** 1. row ***************************
 path: Kevin Bacon -> The 45th Annual Golden Globe Awards (1988) ->
 Richard Attenborough -> In Darkest Hollywood: Cinema and Apartheid
 (1993) -> N!xau
 1 row in set (8 min 29.66 sec)
 Each query requires approximately 16.6 million secondary key reads.
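
The timings and key-read counts quoted for the two IMDB graphs can be turned into a rough throughput figure. The sketch below is simple arithmetic on the numbers shown on the slides, assuming the stated read count applies equally to the first and the repeated run of each query; `reads_per_second` is a hypothetical helper.

```python
def reads_per_second(key_reads, minutes, seconds):
    """Average secondary key reads per second for one query run."""
    return key_reads / (minutes * 60 + seconds)


runs = {
    "9M edges, first run": reads_per_second(7.8e6, 3, 9.67),
    "9M edges, second run": reads_per_second(7.8e6, 1, 7.13),
    "30M edges, first run": reads_per_second(16.6e6, 10, 6.55),
    "30M edges, second run": reads_per_second(16.6e6, 8, 29.66),
}
for label, rate in runs.items():
    print(f"{label}: about {rate:,.0f} reads/s")
```

Under that assumption, the repeated run on the smaller graph works out to a little over 100K reads per second, while the larger graph, whose tables are several times the size of the buffer pool, stays near 30K reads per second on both runs.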

We want your feedback!

     ● Very easy to use...
         But do feel free to ask us for help/advice.

     ● OpenQuery created friendlist_graph for Drupal 6.
        ○ Currently based on OQGraph v2
          ○ Addition to the existing friendlist module.
          ○ Enables easy social networking in Drupal.
          ○ Peter Lieverdink (@cafuego) did this in about 30 minutes

     ● We would like to know how you are using OQGRAPH!
       ○ You could be doing something really cool...



Links and support
    ● Binaries & Packages
         ○ http://mariadb.com (MariaDB 10.0 soon)
    ● Source collaboration
       ○ https://launchpad.net/oqgraph
         ○ https://code.launchpad.net/~oqgraph-dev/maria/10.0-oqgraph3
    ● Info, Docs, Support, Licensing, Engineering
         ○ http://openquery.com/graph
         ○ This presentation: http://goo.gl/gqr7b




                                     Thank you!
                                     Antony Curtis & Arjen Lentz
                                     graph@openquery.com

OQGraph @ SCaLE 11x 2013

  • 1. OQGraph 3 for MariaDB Graphs and Hierarchies in Plain SQL http://goo.gl/gqr7b Antony T Curtis <atcurtis@gmail.com> graph@openquery.com http://openquery.com/graph
  • 2. Graphs / Networks ● Nodes connected by Edges. ● Edges may be directional. ● Edges may have a "weight" / "cost" attribute. ● Directed graphs may have bi-directional edges. ● Unconnected sets of nodes may exist on same graph. ● There need not be a "root" node. Examples: ● "Social Graphs" / friend relationships. ● Decision / State graphs. ● Airline routes OQGRAPH computation engine © 2009-2013 Open Query
  • 3. RDBMS with Heirarchies and Graphs ● Not always a particularly good fit. ● Various tree models exist; each with limitations: ○ Adjacency model ■ Either uses fixed max depth or recursive queries. ■ Oracle has CONNECT BY PRIOR ■ SQL99 has WITH RECURSIVE...UNION... ○ Nested set ■ complex ■ recursive queries to find path to root. ○ Materialised path ■ Ugly and not relational. ■ Can be quite effective when used correctly. Further reading: http://dev.mysql.com/tech-resources/articles/hierarchical-data.html OQGRAPH computation engine © 2009-2013 Open Query
  • 4. What is OQGRAPH? ● Implemented as a storage engine. ○ Original concept by Arjen Lentz ● Mk. 2 implementation 2008 ○ GPLv2+ ○ Bundled with MariaDB 5.2+ ○ Boost Graph Library ● Mk. 3 implementation ○ GPLv2+ ○ Bundled with MariaDB 10.0 (soon) ● Easy to enable ○ INSTALL PLUGIN oqgraph SONAME ‘ha_oqgraph’; OQGRAPH computation engine © 2009-2013 Open Query
  • 5. OQGRAPH: A Computation Engine ● It is not a general purpose data engine. ○ unlike MyISAM, InnoDB or MEMORY. ● Looks like an ordinary table. ● Has a very different internal architecture. ● It does not operate in terms of ○ storing data for later retrieval. ○ having indexes on data. ● May be regarded as a "magic view" or "table function". OQGRAPH computation engine © 2009-2013 Open Query
  • 6. OQGRAPH: A Computation Engine MySQL Server Communications, Session and Thread Management DDL, DML, Management Tables, SQL Parser and SQL Views, Lock Services, Management Stored Procedure Engine Buffers Logging, and Utilities and Caches Runtime Libraries Query Optimizer and Execution Engine built in and run-time loaded plug ins OQGraph InnoDB OQGRAPH computation engine © 2009-2013 Open Query
  • 7. What's new in OQGRAPH 3 Features: ● Judy array bitmaps for Graph coloring. ● Uses existing tables for edge data. ● Much lower memory cost per query. ● Does not impose any strict structure on the source table. ● Can handle significantly larger graphs than OQGRAPHv2. ○ 100K+ index reads per second are possible. ○ Millions of edges are possible. ● All edges of graph need not fit in memory. ○ Only Judy bitmap array must be held in RAM. Notes: ● Tables are read-only and only read from the backing table. ● Table must be in same schema as the backing table. ● Table must have appropriate indexes. OQGRAPH computation engine © 2009-2013 Open Query
  • 8. Anatomy of an OQGRAPH 3 table CREATE TABLE db.tblname ( latch SMALLINT UNSIGNED NULL, origid BIGINT UNSIGNED NULL, destid BIGINT UNSIGNED NULL, weight DOUBLE NULL, seq BIGINT UNSIGNED NULL, linkid BIGINT UNSIGNED NULL, KEY (latch, origid, destid) USING HASH, KEY (latch, destid, origid) USING HASH ) ENGINE=OQGRAPH data_table='link' -- data table origid='source' -- column name destid='target' -- column name weight='weight'; -- optional column name ; OQGRAPH computation engine © 2009-2013 Open Query
  • 9. OQGRAPH - Data source
      ● Edges are directed.
      ● Edge weights are optional and default to 1.0.
      ● Undirected edges may be represented as two directed edges, in opposite directions.
      CREATE TABLE foo (
        origid INT UNSIGNED NOT NULL,
        destid INT UNSIGNED NOT NULL,
        PRIMARY KEY (origid, destid),
        KEY (destid)
      );
      INSERT INTO foo (origid, destid)
        VALUES (1,2), (2,3), (2,4), (4,5), (3,6), (5,6);
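The three rules on this slide (directed edges, a default weight of 1.0, and undirected edges stored as two opposite directed edges) can be sketched in Python as follows. This only models the data, it does not talk to MariaDB; `build_adjacency` and its `undirected` flag are hypothetical names introduced for illustration.

```python
from collections import defaultdict

DEFAULT_WEIGHT = 1.0  # the slide states that a missing weight defaults to 1.0


def build_adjacency(rows, undirected=False):
    """Build {origid: {destid: weight}} from (origid, destid[, weight]) rows."""
    adjacency = defaultdict(dict)
    for row in rows:
        origid, destid = row[0], row[1]
        has_weight = len(row) > 2 and row[2] is not None
        weight = row[2] if has_weight else DEFAULT_WEIGHT
        adjacency[origid][destid] = weight
        if undirected:
            # An undirected edge is stored as two opposite directed edges.
            adjacency[destid][origid] = weight
    return {node: dict(targets) for node, targets in adjacency.items()}


# Same rows as the INSERT INTO foo statement on the slide.
rows = [(1, 2), (2, 3), (2, 4), (4, 5), (3, 6), (5, 6)]
directed = build_adjacency(rows)
both_ways = build_adjacency(rows, undirected=True)
print(directed)
print(sum(len(targets) for targets in both_ways.values()))
```

In the directed form node 6 has no outgoing edges at all, which is why nothing can be reached from it until the reverse edges are added.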
  • 10. OQGRAPH - Data source, cont.
      Creating the OQGRAPH table:
      CREATE TABLE foo_graph (
        latch  SMALLINT UNSIGNED NULL,
        origid BIGINT UNSIGNED NULL,
        destid BIGINT UNSIGNED NULL,
        weight DOUBLE NULL,
        seq    BIGINT UNSIGNED NULL,
        linkid BIGINT UNSIGNED NULL,
        KEY (latch, origid, destid) USING HASH,
        KEY (latch, destid, origid) USING HASH
      ) ENGINE=OQGRAPH
        data_table='foo' origid='origid' destid='destid';
  • 11. Selecting Edges
      MariaDB [foo]> select * from foo_graph;
      +-------+--------+--------+--------+------+--------+
      | latch | origid | destid | weight | seq  | linkid |
      +-------+--------+--------+--------+------+--------+
      |  NULL |      1 |      2 |      1 | NULL |   NULL |
      |  NULL |      2 |      3 |      1 | NULL |   NULL |
      |  NULL |      2 |      4 |      1 | NULL |   NULL |
      |  NULL |      3 |      6 |      1 | NULL |   NULL |
      |  NULL |      4 |      5 |      1 | NULL |   NULL |
      |  NULL |      5 |      6 |      1 | NULL |   NULL |
      +-------+--------+--------+--------+------+--------+
      6 rows in set (0.38 sec)
  • 12. Now, it's time for some magic. (shortest path calculation)
      ● SELECT * FROM foo_graph WHERE latch=1 AND origid=1 AND destid=6;
      +-------+--------+--------+--------+------+--------+
      | latch | origid | destid | weight | seq  | linkid |
      +-------+--------+--------+--------+------+--------+
      |     1 |      1 |      6 |   NULL |    0 |      1 |
      |     1 |      1 |      6 |      1 |    1 |      2 |
      |     1 |      1 |      6 |      1 |    2 |      3 |
      |     1 |      1 |      6 |      1 |    3 |      6 |
      +-------+--------+--------+--------+------+--------+
      ● SELECT GROUP_CONCAT(linkid ORDER BY seq) AS path
          FROM foo_graph
          WHERE latch=1 AND origid=1 AND destid=6 \G
        path: 1,2,3,6
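The result set on this slide can be reproduced outside the server with a small Dijkstra implementation over the same six edges. This is a minimal sketch for illustration, not OQGRAPH's code (which the deck says is built on the Boost Graph Library in Mk. 2); the function name `shortest_path` and the tuple layout `(seq, linkid, weight)` are assumptions chosen to mirror the `seq`, `linkid` and `weight` columns.

```python
import heapq

# The six edges of the foo table, each with the default weight 1.0.
EDGES = [(1, 2, 1.0), (2, 3, 1.0), (2, 4, 1.0),
         (4, 5, 1.0), (3, 6, 1.0), (5, 6, 1.0)]


def shortest_path(edges, origid, destid):
    """Dijkstra's algorithm; returns [(seq, linkid, weight), ...] or []."""
    graph = {}
    for orig, dest, weight in edges:
        graph.setdefault(orig, []).append((dest, weight))

    dist = {origid: 0.0}
    prev = {}                      # node -> (predecessor, weight of last hop)
    heap = [(0.0, origid)]
    done = set()
    while heap:
        cost, node = heapq.heappop(heap)
        if node in done:
            continue
        done.add(node)
        if node == destid:
            break
        for dest, weight in graph.get(node, []):
            new_cost = cost + weight
            if dest not in dist or new_cost < dist[dest]:
                dist[dest] = new_cost
                prev[dest] = (node, weight)
                heapq.heappush(heap, (new_cost, dest))

    if destid not in dist:
        return []                  # destination is unreachable

    # Walk predecessors back to the origin, then reverse.
    path = []
    node = destid
    while node != origid:
        pred, weight = prev[node]
        path.append((node, weight))
        node = pred
    path.append((origid, None))    # the origin row carries no edge weight
    path.reverse()
    return [(seq, linkid, weight) for seq, (linkid, weight) in enumerate(path)]


rows = shortest_path(EDGES, 1, 6)
print(",".join(str(linkid) for _, linkid, _ in rows))
```

The first row has no weight, matching the NULL in the `weight` column for `seq` 0, and asking for the reverse route (6 to 1) yields no rows because every edge is directed.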
  • 13. Other computations,
      ● Which paths lead to node 4?
        SELECT GROUP_CONCAT(linkid) AS list
          FROM foo_graph
          WHERE latch=1 AND destid=4 \G
        list: 1,2,4
      ● Where can I get to from node 4?
        SELECT GROUP_CONCAT(linkid) AS list
          FROM foo_graph
          WHERE latch=1 AND origid=4 \G
        list: 6,5,4
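Leaving out `origid` or `destid` turns the query into a reachability question. A minimal Python sketch of both directions over the same example edges is shown here; `closure` and its `reverse` flag are hypothetical names, and the results are compared as sets because the slide's `GROUP_CONCAT` has no `ORDER BY`.

```python
EDGES = [(1, 2), (2, 3), (2, 4), (4, 5), (3, 6), (5, 6)]


def closure(edges, start, reverse=False):
    """Nodes reachable from start, or nodes that can reach start if reverse."""
    neighbours = {}
    for orig, dest in edges:
        if reverse:
            orig, dest = dest, orig    # follow edges backwards
        neighbours.setdefault(orig, []).append(dest)
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in neighbours.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


leads_to_4 = closure(EDGES, 4, reverse=True)   # compare with destid=4
from_4 = closure(EDGES, 4)                     # compare with origid=4
print(sorted(leads_to_4), sorted(from_4))
```

Both sets include node 4 itself, as the lists on the slide do.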
  • 14. Other computations, continued.
      ● See docs for latch 0 and latch NULL.
      ● latch 1: Dijkstra's shortest path.
         ○ O((V + E) log V)
      ● latch 2: Breadth-first search.
         ○ O(V + E)
      ● Other algorithms are possible.
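The O(V + E) bound for breadth-first search comes from dequeuing each vertex once and examining each outgoing edge once. The Python sketch below counts edge examinations on the example graph to make that visible; it is an illustration of the algorithm named on the slide, not the engine's implementation, and `bfs_hops` is a hypothetical name.

```python
from collections import deque

EDGES = [(1, 2), (2, 3), (2, 4), (4, 5), (3, 6), (5, 6)]


def bfs_hops(edges, origid):
    """Return ({node: hop count from origid}, number of edges examined)."""
    graph = {}
    for orig, dest in edges:
        graph.setdefault(orig, []).append(dest)
    hops = {origid: 0}
    queue = deque([origid])
    examined = 0
    while queue:
        node = queue.popleft()          # each vertex is dequeued once
        for dest in graph.get(node, []):
            examined += 1               # each edge is examined once
            if dest not in hops:
                hops[dest] = hops[node] + 1
                queue.append(dest)
    return hops, examined


hops, examined = bfs_hops(EDGES, 1)
print(hops, examined)
```

With unit weights the hop counts equal the Dijkstra distances, so BFS gives the same shortest path lengths without the priority queue that accounts for the extra log V factor.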
  • 15. Joins make it prettier,
      ● INSERT INTO people VALUES
          (1,'pearce'), (2,'hunnicut'), (3,'potter'),
          (4,'hoolihan'), (5,'winchester'), (6,'mulcahy');
      ● SELECT GROUP_CONCAT(name ORDER BY seq) path
          FROM foo_graph
          JOIN people ON (foo_graph.linkid = people.id)
          WHERE latch=1 AND origid=1 AND destid=6 \G
        path: pearce,hunnicut,potter,mulcahy
  • 16. Tree of Life
      Load the tol.sql schema, then create the tol_link backing store table:
      CREATE TABLE tol_link (
        source INT UNSIGNED NOT NULL,
        target INT UNSIGNED NOT NULL,
        PRIMARY KEY (source, target),
        KEY (target)
      ) ENGINE=innodb;
      Populate it with all the edges we need:
      INSERT INTO tol_link (source, target)
        SELECT parent, id FROM tol WHERE parent IS NOT NULL
        UNION ALL
        SELECT id, parent FROM tol WHERE parent IS NOT NULL;
      Query OK, 178102 rows affected (46.35 sec)
      Records: 178102 Duplicates: 0 Warnings: 0
      Direct download: http://bazaar.launchpad.net/~openquery-core/oqgraph/trunk/view/head:/examples/tree-of-life/tol.sql
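The `UNION ALL` on this slide turns each parent pointer into two directed edges, one down the tree and one back up. A minimal Python sketch of the same expansion on a hypothetical four-node tree follows (the `parents` mapping is made up for illustration). Since every row with a non-NULL parent contributes exactly two edges, the 178102 rows reported on the slide imply half that many non-root rows in `tol`.

```python
def both_directions(parents):
    """Expand {child: parent} into edges parent->child and child->parent."""
    down = [(parent, child)
            for child, parent in parents.items() if parent is not None]
    up = [(child, parent)
          for child, parent in parents.items() if parent is not None]
    return down + up               # same shape as the UNION ALL on the slide


# Hypothetical four-node tree: node 1 is the root and has no parent.
parents = {1: None, 2: 1, 3: 1, 4: 2}
edges = both_directions(parents)

# Two edge rows per non-root node, so halve the row count from the slide.
non_root_nodes = 178102 // 2
print(edges, non_root_nodes)
```

Storing both directions is what lets a later query walk from any node up towards the root as well as down from it.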
  • 17. Tree of Life, cont.
      Creating the OQGRAPH table:
      CREATE TABLE tol_tree (
        latch  SMALLINT UNSIGNED NULL,
        origid BIGINT UNSIGNED NULL,
        destid BIGINT UNSIGNED NULL,
        weight DOUBLE NULL,
        seq    BIGINT UNSIGNED NULL,
        linkid BIGINT UNSIGNED NULL,
        KEY (latch, origid, destid) USING HASH,
        KEY (latch, destid, origid) USING HASH
      ) ENGINE=OQGRAPH
        data_table='tol_link' origid='source' destid='target';
  • 18. Tree of Life - finding H. sapiens
      SELECT GROUP_CONCAT(name ORDER BY seq SEPARATOR ' -> ') AS path
        FROM tol_tree JOIN tol ON (linkid=id)
        WHERE latch=1 AND origid=1 AND destid=16421 \G
      path: Life on Earth -> Eukaryotes -> Unikonts -> Opisthokonts ->
        Animals -> Bilateria -> Deuterostomia -> Chordata -> Craniata ->
        Vertebrata -> Gnathostomata -> Teleostomi -> Osteichthyes ->
        Sarcopterygii -> Terrestrial Vertebrates -> Tetrapoda ->
        Reptiliomorpha -> Amniota -> Synapsida -> Eupelycosauria ->
        Sphenacodontia -> Sphenacodontoidea -> Therapsida -> Theriodontia ->
        Cynodontia -> Mammalia -> Eutheria -> Primates -> Catarrhini ->
        Hominidae -> Homo -> Homo sapiens
  • 19. Internet Movie DataBase (IMDB)
      Transform and load the movie database (this takes a long time):
      CREATE TABLE `entity` (
        `id` int(11) NOT NULL AUTO_INCREMENT,
        `type` enum('ACTOR','MOVIE','TV MOVIE','TV MINI','TV SERIES',
                    'VIDEO MOVIE','VIDEO GAME','VOICE','ARCHIVE') NOT NULL,
        `name` varchar(128) COLLATE utf8_unicode_ci NOT NULL,
        PRIMARY KEY (`id`),
        UNIQUE KEY `type` (`type`,`name`) USING BTREE
      ) ENGINE=InnoDB;
      CREATE TABLE `link` (
        `rel_id` int(11) NOT NULL AUTO_INCREMENT,
        `link_from` int(11) NOT NULL,
        `link_to` int(11) NOT NULL,
        PRIMARY KEY (`rel_id`),
        KEY `link_from` (`link_from`,`link_to`),
        KEY `link_to` (`link_to`)
      ) ENGINE=InnoDB;
  • 20. Degrees of N!xau
      Graph of movies: approximately 3.7 million nodes with 9 million edges.
      Tables are about 1GB and InnoDB is configured with a 512MB buffer pool.
      MariaDB [imdb]> SELECT
          -> GROUP_CONCAT(name ORDER BY seq SEPARATOR ' -> ') AS path
          -> FROM movie_graph JOIN entity ON (id=linkid)
          -> WHERE latch=1
          -> AND origid=(SELECT a.id FROM entity a
          -> WHERE name='Kevin Bacon')
          -> AND destid=(SELECT b.id FROM entity b WHERE name='N!xau')\G
  • 21. Degrees of N!xau
      Graph of movies: approximately 3.7 million nodes with 9 million edges.
      Tables are about 1GB and InnoDB is configured with a 512MB buffer pool.
      MariaDB [imdb]> SELECT
          -> GROUP_CONCAT(name ORDER BY seq SEPARATOR ' -> ') AS path
          -> FROM movie_graph JOIN entity ON (id=linkid)
          -> WHERE latch=1
          -> AND origid=(SELECT a.id FROM entity a
          -> WHERE name='Kevin Bacon')
          -> AND destid=(SELECT b.id FROM entity b WHERE name='N!xau')\G
      *************************** 1. row ***************************
      path: Kevin Bacon -> The Air Up There (1994) -> Fanyana H. Sidumo -> The Gods Must Be Crazy (1981) -> N!xau
      1 row in set (3 min 9.67 sec)
      --again
      *************************** 1. row ***************************
      path: Kevin Bacon -> The Air Up There (1994) -> Fanyana H. Sidumo -> The Gods Must Be Crazy (1981) -> N!xau
      1 row in set (1 min 7.13 sec)
      Each query requires approximately 7.8 million secondary key reads.
  • 22. Degrees of N!xau
      Graph of approximately 3.7 million nodes with 30 million edges.
      Tables are about 3.5GB and InnoDB is configured with a 512MB buffer pool.
      MariaDB [imdb]> SELECT
          -> GROUP_CONCAT(name ORDER BY seq SEPARATOR ' -> ') AS path
          -> FROM imdb_graph JOIN entity ON (id=linkid)
          -> WHERE latch=1
          -> AND origid=(SELECT a.id FROM entity a
          -> WHERE name='Kevin Bacon')
          -> AND destid=(SELECT b.id FROM entity b WHERE name='N!xau')\G
      *************************** 1. row ***************************
      path: Kevin Bacon -> The 45th Annual Golden Globe Awards (1988) -> Richard Attenborough -> In Darkest Hollywood: Cinema and Apartheid (1993) -> N!xau
      1 row in set (10 min 6.55 sec)
      --again
      *************************** 1. row ***************************
      path: Kevin Bacon -> The 45th Annual Golden Globe Awards (1988) -> Richard Attenborough -> In Darkest Hollywood: Cinema and Apartheid (1993) -> N!xau
      1 row in set (8 min 29.66 sec)
      Each query requires approximately 16.6 million secondary key reads.
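The timings and key-read counts on the two "Degrees of N!xau" slides can be turned into average read rates, which makes them comparable with the "100K+ index reads per second" figure quoted earlier in the deck. The Python sketch below is plain arithmetic on the numbers shown; the key-read counts are the approximate values from the slides, so the results are rough, and `reads_per_second` is a hypothetical helper.

```python
def reads_per_second(key_reads, minutes, seconds):
    """Average secondary key reads per second for one query run."""
    return key_reads / (minutes * 60 + seconds)


# 9 million edge graph: about 7.8 million key reads per query.
small_first = reads_per_second(7_800_000, 3, 9.67)
small_again = reads_per_second(7_800_000, 1, 7.13)

# 30 million edge graph: about 16.6 million key reads per query.
large_first = reads_per_second(16_600_000, 10, 6.55)
large_again = reads_per_second(16_600_000, 8, 29.66)

for label, rate in [("9M edges, first run", small_first),
                    ("9M edges, again", small_again),
                    ("30M edges, first run", large_first),
                    ("30M edges, again", large_again)]:
    print(f"{label}: {rate:,.0f} reads/s")
```

On these figures only the repeated run on the smaller graph exceeds 100K reads per second. The larger graph stays at roughly a quarter to a third of that rate, which is at least consistent with its 3.5GB of tables not fitting in the 512MB buffer pool, although the deck does not state the cause.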
  • 23. We want your feedback!
      ● Very easy to use... But do feel free to ask us for help/advice.
      ● OpenQuery created friendlist_graph for Drupal 6.
         ○ Currently based on OQGraph v2
         ○ Addition to the existing friendlist module.
         ○ Enables easy social networking in Drupal.
         ○ Peter Lieverdink (@cafuego) did this in about 30 minutes
      ● We would like to know how you are using OQGRAPH!
         ○ You could be doing something really cool...
  • 24. Links and support
      ● Binaries & Packages
         ○ http://mariadb.com (MariaDB 10.0 soon)
      ● Source collaboration
         ○ https://launchpad.net/oqgraph
         ○ https://code.launchpad.net/~oqgraph-dev/maria/10.0-oqgraph3
      ● Info, Docs, Support, Licensing, Engineering
         ○ http://openquery.com/graph
         ○ This presentation: http://goo.gl/gqr7b
      Thank you!
      Antony Curtis & Arjen Lentz
      graph@openquery.com