GraphREL: A Decomposition-Based and Selectivity-Aware Relational
Framework for Processing Sub-graph Queries

Sherif Sakr
School of Computer Science and Engineering
University of New South Wales
http://www.cse.unsw.edu.au/~ssakr/

BIT Seminars ’09 - Free University of Bolzano, Italy
16 November 2009


 S. Sakr (CSE, UNSW)                 BIT Seminars’09                 16 November 2009   1 / 40
Outline

    Previous Work: Pathfinder - Relational XQuery Compiler.


    Current Work: GraphREL - General Graph Query Processor.


    Future Work: Scalable Graph Query Processing for New Generation
    of Database Applications.




Pathfinder: A Relational XQuery Processor

    XQuery Expression
      -> Pathfinder
      -> Relational Algebra
           -> MIL Code Generator -> MIL Scripts -> Monet DBMS
           -> SQL Code Generator -> SQL Scripts -> Conventional RDBMS

                         http://pathfinder-xquery.org/
Pathfinder: A Relational XQuery Processor
    [Figure: Pathfinder architecture with cardinality estimation. An XML
    Document and an XQuery Expression enter Pathfinder, which produces
    Relational Algebra + Special Properties [VLDB’04]. Other labelled
    components: Estimation Rules, XQuery Estimator, Cardinality Properties,
    Translation Templates, SQL Generator, Cardinality Properties Aware SQL
    Scripts, XPath Accelerator Encoding Tuples + Statistical Guide,
    Statistical Histograms, Conventional RDBMS, System Administrator,
    Relational Results, XML Serializer, XML. Further citations shown in the
    figure: [VLDB’08], [SIGMOD’07], [IJWIS’09], [JDM’09].]
GraphREL: Motivations

    Graphs are among the most complex and general forms of data
    structures.
    They are now widely used to model complex structured and
    schemaless data such as social networks, chemical compounds,
    biological pathways, spatial databases, the semantic web and
    business process models.
    Retrieving the graphs that contain a query graph from a large graph
    database is a key performance issue in all of these graph-based
    applications.
    The success of any graph database application depends directly
    on the efficiency of its graph indexing and query processing
    mechanisms.
    RDBMSs have repeatedly proven efficient, scalable and successful
    at hosting different kinds of data.
Preliminaries: Graph Data Model

    In labelled graphs, vertices represent entities and edges represent
    the relationships between them.
    The attributes associated with these entities and relationships are
    called labels.
    A graph database D is a collection of member graphs
    D = {g1, g2, ..., gn}, where each member graph gi is denoted as
    (V, E, Lv, Le):
           V is the set of vertices.
           E ⊆ V × V is the set of edges, each joining two distinct vertices.
           Lv is the set of vertex labels.
           Le is the set of edge labels.
    Labelled graphs are classified according to the direction of their edges
    into two main classes:
       1   Directed labelled graphs such as XML, RDF and traffic networks.
       2   Undirected labelled graphs such as social networks and chemical
           compounds.
Preliminaries: Graph Queries

In principle, queries in graph databases can be broadly classified into the
following main categories:
     Subgraph queries: search for a specific pattern in the graph
     database. The pattern can be either a small graph or a graph
     in which some parts are uncertain, e.g., vertices with wildcard
     labels.

     Supergraph queries: search for the graph database members whose
     whole structure is contained in the input query.

     Similarity (approximate matching) queries: find graphs that are
     similar, but not necessarily isomorphic, to a given query graph.



Preliminaries: Subgraph Search Queries

    Given a graph database D = {g1, g2, ..., gn} and a graph query q, a
    subgraph search returns the answer set A = {gi | q ⊆ gi, gi ∈ D}.

    A graph q is a sub-graph of a graph database member gi if the
    vertices and edges of q form a subset of the vertices and edges of gi.

    Formally, g1(V1, E1, Lv1, Le1) is a sub-graph of
    g2(V2, E2, Lv2, Le2) if and only if:
       1   For every distinct vertex x ∈ V1 with a label vl ∈ Lv1, there is a
           distinct vertex y ∈ V2 with the same label vl ∈ Lv2.

       2   For every distinct edge ab ∈ E1 with a label el ∈ Le1, there is a
           distinct edge ab ∈ E2 with the same label el ∈ Le2.
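
The definition above can be checked directly on small graphs. The following is a minimal brute-force sketch, not GraphREL's relational method: it assumes a hypothetical in-memory layout (one dict of vertex labels, one dict of directed edge labels) and tries every injective assignment of query vertices to graph vertices. The function name and the toy graph are illustrative only.

```python
from itertools import permutations

def is_subgraph(q_vertices, q_edges, g_vertices, g_edges):
    """Return True if q occurs in g under an injective, label-preserving mapping.

    *_vertices: dict vertex id -> vertex label
    *_edges:    dict (source id, destination id) -> edge label
    """
    q_ids = list(q_vertices)
    # Each permutation assigns distinct graph vertices to the query vertices.
    for image in permutations(g_vertices, len(q_ids)):
        mapping = dict(zip(q_ids, image))
        if any(q_vertices[v] != g_vertices[mapping[v]] for v in q_ids):
            continue  # some vertex label does not match
        if all(g_edges.get((mapping[s], mapping[d])) == label
               for (s, d), label in q_edges.items()):
            return True  # every query edge has an equally labelled image
    return False

# Toy graph (illustrative): six labelled vertices, eight labelled edges.
g_vertices = {1: "A", 2: "A", 3: "D", 4: "A", 5: "C", 6: "B"}
g_edges = {(1, 2): "n", (1, 3): "m", (2, 3): "n", (4, 3): "x",
           (5, 4): "x", (6, 5): "y", (5, 2): "z", (1, 6): "m"}

found = is_subgraph({"a": "C", "b": "A"}, {("a", "b"): "x"}, g_vertices, g_edges)
missing = is_subgraph({"a": "C", "b": "A"}, {("a", "b"): "y"}, g_vertices, g_edges)
```

The search is exponential in the query size, which is one reason an index-backed relational evaluation is attractive for large graph databases.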



Preliminaries: Subgraph Search Queries


    [Figure: An example graph database and graph query. (a) Sample graph
    database with three member graphs g1, g2 and g3; (b) Graph query q.
    Vertices carry labels such as A, B, C and D; edges carry labels such as
    e, f, m, n, x, y and z.]




Our Approach: GraphREL

    Relational encoding of graph data.


    SQL translation of sub-graph search queries.
          Filtering phase.


          Optional verification phase.


    Partitioned B-tree Indexes.


    Statistical Summaries.


    Decomposition-Based and Selectivity-Aware SQL Translation.


Relational Encoding of Graph Data


    The starting point of our relational framework is to find an efficient
    and suitable encoding for each graph member gi in the graph
    database D.


    We use the Vertex-Edge mapping scheme for storing directed
    labelled graphs with the following structure:


          Vertices(graphID, vertexID, vertexLabel)


          Edges(graphID, sVertex, dVertex, edgeLabel)
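
A minimal sketch of this encoding, assuming SQLite as a stand-in relational engine (the slides do not name one here). The helper name `encode_graphs`, the column types and the toy graph are assumptions made for illustration.

```python
import sqlite3

def encode_graphs(conn, graphs):
    """Store graphs in the Vertices/Edges relations of the Vertex-Edge mapping scheme.

    graphs: dict graphID -> (vertices, edges), where vertices maps
    vertexID -> label and edges maps (sVertex, dVertex) -> label.
    """
    conn.execute("CREATE TABLE IF NOT EXISTS Vertices "
                 "(graphID INTEGER, vertexID INTEGER, vertexLabel TEXT)")
    conn.execute("CREATE TABLE IF NOT EXISTS Edges "
                 "(graphID INTEGER, sVertex INTEGER, dVertex INTEGER, edgeLabel TEXT)")
    for graph_id, (vertices, edges) in graphs.items():
        conn.executemany("INSERT INTO Vertices VALUES (?, ?, ?)",
                         [(graph_id, v, label) for v, label in vertices.items()])
        conn.executemany("INSERT INTO Edges VALUES (?, ?, ?, ?)",
                         [(graph_id, s, d, label) for (s, d), label in edges.items()])
    conn.commit()

# Toy data (made up for illustration): one graph, three vertices, two edges.
graphs = {1: ({1: "A", 2: "C", 3: "D"}, {(1, 2): "e", (2, 3): "m"})}
conn = sqlite3.connect(":memory:")
encode_graphs(conn, graphs)
vertex_rows = conn.execute("SELECT * FROM Vertices ORDER BY vertexID").fetchall()
edge_rows = conn.execute("SELECT * FROM Edges ORDER BY sVertex").fetchall()
```

Each vertex becomes one row of Vertices and each directed edge one row of Edges, so the whole database lives in two relations regardless of how many member graphs it holds.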




Relational Encoding of Graph Data


    The two sample graphs g1 (vertices 1-6) and g2 (vertices 1-5) are
    encoded as follows.

    Vertices Table

        graphID   vertexID   vertexLabel
        1         1          A
        1         2          A
        1         3          D
        1         4          A
        1         5          C
        1         6          B
        2         1          A
        2         2          C
        2         3          D
        2         4          C
        2         5          B

    Edges Table

        graphID   sVertex   dVertex   edgeLabel
        1         1         2         n
        1         1         3         m
        1         2         3         n
        1         4         3         x
        1         5         4         x
        1         6         5         y
        1         5         2         z
        1         1         6         m
        2         1         2         e
        2         2         3         m
        2         4         3         m
        2         4         2         n
        2         5         4         x
        2         1         5         f

SQL Translation of Graph Queries

    Filtering Phase: a sub-graph query q consisting of a set of vertices
    QV of size m and a set of edges QE of size n is evaluated
    using the following SQL translation template:
     SELECT DISTINCT V1.graphID, Vi.vertexID
     FROM Vertices AS V1, ..., Vertices AS Vm, Edges AS E1, ..., Edges AS En
     WHERE
         ∀ i = 2..m (V1.graphID = Vi.graphID)
     AND ∀ j = 1..n (V1.graphID = Ej.graphID)
     AND ∀ i = 1..m (Vi.vertexLabel = QVi.vertexLabel)
     AND ∀ j = 1..n (Ej.edgeLabel = QEj.edgeLabel)
     AND ∀ j = 1..n (Ej.sVertex = Vf.vertexID AND Ej.dVertex = Vt.vertexID);
    Here Vf and Vt are the aliases of the source and destination vertices
    of the query edge QEj.


    Verification Phase: an optional phase which is used to verify that
    each vertex in the set of filtered vertices for each candidate graph is
    distinct. It is applied only if more than one vertex of the set of query
    vertices QV have the same label. This can be easily achieved using
    their vertex ID.
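
The filtering template can be instantiated mechanically. The sketch below is one possible rendering, assuming SQLite as the engine and a hypothetical helper `filter_sql`; it selects every matched vertexID (the template writes this as Vi.vertexID) and inlines labels for readability, where production code would bind them as parameters. The rows are the g1 and g2 encodings from the previous slide.

```python
import sqlite3

def filter_sql(vertex_labels, edges):
    """Build the filtering-phase SQL for a query graph with at least one vertex.

    vertex_labels: label of query vertex i (1-based position in the list).
    edges: list of (source index, destination index, edge label), 1-based.
    """
    m, n = len(vertex_labels), len(edges)
    select = ["V1.graphID"] + [f"V{i}.vertexID" for i in range(1, m + 1)]
    tables = [f"Vertices AS V{i}" for i in range(1, m + 1)]
    tables += [f"Edges AS E{j}" for j in range(1, n + 1)]
    conds = [f"V1.graphID = V{i}.graphID" for i in range(2, m + 1)]
    conds += [f"V1.graphID = E{j}.graphID" for j in range(1, n + 1)]
    conds += [f"V{i}.vertexLabel = '{lab}'" for i, lab in enumerate(vertex_labels, 1)]
    for j, (src, dst, lab) in enumerate(edges, 1):
        conds += [f"E{j}.edgeLabel = '{lab}'",
                  f"E{j}.sVertex = V{src}.vertexID",
                  f"E{j}.dVertex = V{dst}.vertexID"]
    return (f"SELECT DISTINCT {', '.join(select)} FROM {', '.join(tables)} "
            f"WHERE {' AND '.join(conds)}")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Vertices (graphID INTEGER, vertexID INTEGER, vertexLabel TEXT)")
conn.execute("CREATE TABLE Edges (graphID INTEGER, sVertex INTEGER, dVertex INTEGER, edgeLabel TEXT)")
conn.executemany("INSERT INTO Vertices VALUES (?, ?, ?)", [
    (1, 1, "A"), (1, 2, "A"), (1, 3, "D"), (1, 4, "A"), (1, 5, "C"), (1, 6, "B"),
    (2, 1, "A"), (2, 2, "C"), (2, 3, "D"), (2, 4, "C"), (2, 5, "B")])
conn.executemany("INSERT INTO Edges VALUES (?, ?, ?, ?)", [
    (1, 1, 2, "n"), (1, 1, 3, "m"), (1, 2, 3, "n"), (1, 4, 3, "x"),
    (1, 5, 4, "x"), (1, 6, 5, "y"), (1, 5, 2, "z"), (1, 1, 6, "m"),
    (2, 1, 2, "e"), (2, 2, 3, "m"), (2, 4, 3, "m"), (2, 4, 2, "n"),
    (2, 5, 4, "x"), (2, 1, 5, "f")])

# Query graph: A -e-> C -m-> D (all labels distinct, so no verification phase).
sql = filter_sql(["A", "C", "D"], [(1, 2, "e"), (2, 3, "m")])
rows = conn.execute(sql).fetchall()
```

A query with m vertices and n edges yields m + n table references, which is what makes the translation of larger queries long and join-heavy.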

Partitioned B-tree Indexes

    Partitioned B-tree indexing is a slight variant of the B-tree indexing
    structure.
    The main idea is to use low-selectivity leading columns (columns with
    few distinct values) to maintain partitions within the associated B-tree.
    In labelled graphs, the number of distinct vertex and edge labels is
    generally far smaller than the number of vertices and edges respectively.
    For example, an index defined on the columns
    (vertexLabel, graphID) can reduce the access cost of a sub-graph query
    with only one label to one disk page. In contrast, an index
    defined on the columns (graphID, vertexLabel) requires
    scanning a large number of disk pages.
    Partitioned B-tree indexes on the high-selectivity attributes
    achieve fixed execution times that no longer depend on the
    size of the whole graph database.
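
The column-order point can be illustrated with an ordinary composite index. This sketch assumes SQLite, which offers plain B-tree indexes rather than partitioned B-trees, so it only shows the effect of putting the label column first; the index name and the generated rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Vertices (graphID INTEGER, vertexID INTEGER, vertexLabel TEXT)")
# Label first, graph second: all index entries for one label are adjacent.
conn.execute("CREATE INDEX idx_label_graph ON Vertices (vertexLabel, graphID)")
# Synthetic rows: 50 graphs with 10 vertices each, labels cycling through A-D.
conn.executemany("INSERT INTO Vertices VALUES (?, ?, ?)",
                 [(g, v, "ABCD"[(g + v) % 4]) for g in range(1, 51) for v in range(1, 11)])

# Ask the planner how it would answer a single-label lookup.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT graphID FROM Vertices WHERE vertexLabel = ?", ("D",)
).fetchall()
plan_text = " ".join(row[-1] for row in plan)  # last column holds the plan detail
```

With the label as the leading column, the planner can answer the lookup from one contiguous range of the index instead of visiting entries for every graph.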
Limitations of SQL-Based Translation Approach

    An obvious problem of the SQL translation template is that it
    involves a large number of conjunctive SQL predicates and join
    operations between the encoding tables.

    Most relational query engines fail to execute the SQL translations
    of medium-sized or large sub-graph queries because the queries are
    too long and too complex (which does not necessarily mean they are
    too expensive).

    Therefore, we need a decomposition mechanism that divides this large
    and complex SQL translation query into a sequence of intermediate
    queries.
          Applying this decomposition mechanism blindly may lead to inefficient
          execution plans with very large, unneeded and expensive
          intermediate results.
          We use statistical summary information to achieve an efficient
          decomposition process.
Statistical Summaries

      In general, one of the most effective techniques for optimizing the
      execution times of SQL queries is to select the relational execution
      plan based on accurate selectivity information for the query predicates.
      We construct three Markov tables to store the frequency of occurrence
      of the distinct vertex labels, the distinct edge labels and the
      connections between pairs of vertices (edges).
 Markov Table summary of      Markov Table summary of      Markov Table summary of
 vertex labels                edge labels                  pair-wise edge connections

 Vertex Label   Frequency     Edge Label   Frequency       Edge Connection   Frequency
 A              100           a            40              ab                3
 B              200           c            5               ac                15
 C              38            e            28              ae                45
 D              4             l            54              ec                14
 E              50            m            140             em                103
 L              6             n            3               la                5
 M              10            o            20              pc                18
 N              250           p            15              px                45
 O              3             x            8               xy                25
 P              40            y            60              xz                2
 R              55            z            15              za                1
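
A minimal sketch of how such summaries could be computed from the encoded rows. The slides do not define an edge connection precisely; this sketch assumes that a connection "xy" is an edge labelled x whose destination vertex is the source of an edge labelled y in the same graph. The function name is hypothetical, and the sample rows are the g2 encoding from the earlier slide.

```python
from collections import Counter

def markov_tables(vertices, edges):
    """Count label frequencies.

    vertices: list of (graphID, vertexID, label)
    edges:    list of (graphID, sVertex, dVertex, label)
    """
    vertex_freq = Counter(label for _, _, label in vertices)
    edge_freq = Counter(label for _, _, _, label in edges)
    # Assumption: connection "xy" = an x-edge followed by a y-edge via a shared vertex.
    outgoing = {}
    for g, s, _, label in edges:
        outgoing.setdefault((g, s), []).append(label)
    conn_freq = Counter(first + second
                        for g, _, d, first in edges
                        for second in outgoing.get((g, d), []))
    return vertex_freq, edge_freq, conn_freq

vertices = [(2, 1, "A"), (2, 2, "C"), (2, 3, "D"), (2, 4, "C"), (2, 5, "B")]
edges = [(2, 1, 2, "e"), (2, 2, 3, "m"), (2, 4, 3, "m"),
         (2, 4, 2, "n"), (2, 5, 4, "x"), (2, 1, 5, "f")]
vertex_freq, edge_freq, conn_freq = markov_tables(vertices, edges)
```

The three counters play the role of the three summary tables; in a relational setting they could equally be produced with GROUP BY queries over Vertices and Edges.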

Decomposition-Based and Selectivity-Aware SQL
Translation

    Identifying the pruning points.

    Calculating the number of partitions.

    Decomposed SQL translation.
          Blind Single-Level Decomposition.

          Pruned Single-Level Decomposition.

          Pruned Multi-Level Decomposition.


    Selectivity-aware Annotations.


Decomposition-Based and Selectivity-Aware SQL
Translation
    Identifying the pruning points
          Each vertex label, edge label or edge connection with a low frequency
          is considered a pruning point in our relational evaluation mechanism.

          Given a query graph q, we first check the structure of q against our
          summary Markov tables to identify the possible pruning points; NPP
          denotes their number.

    Calculating the number of partitions
          A sub-graph query q requires NJP join operations.

          Assume that the relational query engine can evaluate at most MJP
          join operations in one query.

          The number of partitions (NOP) is then computed as NOP = NJP / MJP.
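
The two calculations can be sketched as follows. The frequency cutoff `threshold` is a hypothetical parameter, since the slides say only "low frequency", and rounding NJP / MJP up to a whole number of partitions is likewise an assumption. The sample frequencies are taken from the vertex-label summary table.

```python
import math

def pruning_points(query_items, frequency, threshold):
    """Return the query labels or edge connections whose summary frequency is at most threshold.

    Items missing from the summary count as frequency 0 (assumption).
    """
    return [item for item in query_items if frequency.get(item, 0) <= threshold]

def number_of_partitions(njp, mjp):
    """NOP = NJP / MJP, rounded up so every join falls in some partition (assumption)."""
    return math.ceil(njp / mjp)

vertex_freq = {"A": 100, "D": 4, "N": 250}
points = pruning_points(["A", "D", "N"], vertex_freq, threshold=10)
npp = len(points)
nop = number_of_partitions(njp=20, mjp=8)
```

Here only label D falls under the cutoff, so NPP = 1, and 20 joins with at most 8 per query give NOP = 3.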



Decomposition-Based and Selectivity-Aware SQL
Translation
    Blind Single-Level Decomposition
          If NPP = 0 ⇒ we blindly decompose the query q into NOP partitions.
          Each partition is translated into an intermediate evaluation step Si.
          The final evaluation step joins all intermediate evaluation steps and
          adds the conjunctive conditions of the partitions’ connectors.
    Pruned Single-Level Decomposition
          If NPP >= NOP ⇒ we distribute the pruning points across the
          NOP intermediate partitions.
          This ensures balanced, effective pruning of all intermediate results.
    Pruned Multi-Level Decomposition
          If 0 < NPP < NOP ⇒ we distribute the pruning points across a first
          level of intermediate results of the NOP partitions. An intermediate
          collective pruned step IPS is constructed by joining all the pruned
          first-level intermediate results.
          IPS is used as an entry pruning point for the remaining (NOP − NPP)
          non-pruned partitions in a hierarchical multi-level fashion.
          Each pruning point can be used to prune more than one partition (if
          possible).
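
The case analysis above reduces to a comparison of NPP and NOP. A minimal sketch follows; the function names are hypothetical, and the round-robin assignment of pruning points to partitions is an assumption, as the slides do not say how the points are distributed.

```python
def choose_decomposition(npp, nop):
    """Pick the decomposition strategy from the number of pruning points and partitions."""
    if npp == 0:
        return "blind single-level"
    if npp >= nop:
        return "pruned single-level"
    return "pruned multi-level"

def distribute(points, nop):
    """Assign pruning points to partitions round-robin (assumed policy)."""
    partitions = [[] for _ in range(nop)]
    for index, point in enumerate(points):
        partitions[index % nop].append(point)
    return partitions

strategy = choose_decomposition(npp=1, nop=3)
assignment = distribute(["p1", "p2", "p3", "p4"], nop=2)
```

In the multi-level case some partitions receive no pruning point of their own, which is why they are evaluated against the collective pruned step IPS instead.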
Decomposition-Based and Selectivity-Aware SQL
Translation
    [Figure: Selectivity-aware decomposition process. (a) NPP > NOP: the
    query graph is split into partitions S1 and S2, which are translated
    into S1-SQL and S2-SQL and combined in FES-SQL. (b) NPP < NOP: the
    query graph is split into partitions S1, S2 and S3, which are translated
    into S1-SQL, S2-SQL and S3-SQL and combined in FES-SQL.]

Decomposition-Based and Selectivity-Aware SQL
Translation
    Selectivity-aware Annotations

          For any given SQL query, there are a large number of alternative
          execution plans. These alternative execution plans may differ
          significantly in their use of system resources or response time.

          We use the statistical summary information to give hints to
          the query optimizer by injecting additional selectivity information for
          the individual query predicates into the SQL translations of the graph
          queries.

            SELECT fieldlist FROM tablelist
            WHERE Pi SELECTIVITY Si
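
A minimal sketch of generating such annotated predicates. It assumes the selectivity of a label predicate is the label's frequency divided by the total frequency in the summary table, which the slides do not state, and the helper name `annotate` is hypothetical. The generated string uses the SELECTIVITY clause exactly as written above; an engine without such a clause will reject it.

```python
def annotate(predicates, frequency, total):
    """Append a SELECTIVITY hint to each predicate.

    predicates: list of (sql_predicate, summary_key) pairs.
    Selectivity = frequency[summary_key] / total (assumed estimate).
    """
    parts = []
    for sql, key in predicates:
        selectivity = frequency[key] / total
        parts.append(f"{sql} SELECTIVITY {selectivity:.6f}")
    return " AND ".join(parts)

# Vertex-label frequencies from the summary table.
vertex_freq = {"A": 100, "B": 200, "C": 38, "D": 4, "E": 50, "L": 6,
               "M": 10, "N": 250, "O": 3, "P": 40, "R": 55}
total = sum(vertex_freq.values())
where = annotate([("V1.vertexLabel = 'D'", "D"),
                  ("V2.vertexLabel = 'A'", "A")], vertex_freq, total)
```

The rare label D receives a much smaller selectivity than the common label A, which is the signal the optimizer needs to start the plan from the D predicate.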




Experimental Results: Performance and Scalability

    [Figure: The scalability of GraphREL. Both panels plot Execution Time
    (ms, logarithmic axis) against Query Size (Q4, Q8, Q12, Q16, Q20).
    (a) Synthetic Dataset, with series D2kV10E20L40M50, D10kV10E20L40M50,
    D50kV30E40L90M150 and D100kV30E40L90M150.
    (b) DBLP Dataset, with series 1MB, 10MB, 50MB and 100MB.]

Experimental Results: The effect of using Partitioned
B-tree Indexes and Selectivity Injections

    [Figure: The speedup improvement for the relational evaluation of
    sub-graph queries using partitioned B-tree indexes and selectivity-aware
    annotations. Both panels compare the Synthetic and DBLP datasets over
    Query Size (Q4, Q8, Q12, Q16, Q20).
    (a) Partitioned B-tree indexes: y-axis Percentage of Improvement (%),
    0 to 100.
    (b) Injection of selectivity annotations: y-axis Execution Times (ms),
    0 to 40.]

QBP: An Application of GraphREL

    Many of today’s information systems are driven by explicit process
    models.
    A business process is a set of coordinated activities that achieve a
    specific business objective.
    As the number of process models grows rapidly, it becomes crucial
    for business process designers to be able to search their
    repository for models efficiently.
    QBP is a query processor for business process models.
    QBP is based on a new visual query language for business processes
    called BPMN-Q. The language addresses process definitions and
    extends the standard BPMN notation for modeling business
    processes in its concrete syntax.
    A BPMN-Q query is treated as a graph that is matched against
    process graph(s).
QBP: An Application of GraphREL

    [Figure: example business process model for a real-estate credit
    application. Labelled elements: Customer applies for real-estate credit;
    Check credit rating (Credit Rating [accepted], Credit Rating [rejected]);
    Check real-estate construction document (Const. Doc [valid],
    Const. Doc. [invalid]); Check land register record (Record [present],
    Record [absent]); All OK; Prepare contract; Offer loan protection
    insurance; Offer residence insurance; Reject application.]




QBP: Application Architecture

    [Figure: QBP application architecture. Labelled components: Business
    Process Designers; BPMN-Q Query Editor; BPMN-Q Query; Semantic Query
    Expander; Semantically expanded queries; SQL-Based Query Processor;
    SQL Script; Query Results; Result Process Models; Model Editor; Updates;
    RDBMS (Relational Business Process Repository); Translation Middleware;
    source notations BPMN, BPEL, UML ADs, EPC and others.]


BPMN-Q Query Constructs


 Anonymous Activity      Indicates an unknown activity in a query. It resembles an
                         activity but is distinguished by the @ sign at the
                         beginning of its label.

 Generic Node            Indicates an unknown node in a process. It can evaluate to
                         any node type.

 Generic Split           Refers to any type of split gateway.

 Generic Join            Refers to any type of join gateway.

 Negative Sequence Flow  States that two nodes A and B are not directly related by
                         a sequence flow.

 Path                    States that there must be a path from A to B. A query
                         usually returns all paths.

 Negative Path           States that there is no path between two nodes A and B.
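
The Negative Sequence Flow, Path and Negative Path constructs can be read as simple predicates over the sequence-flow edges of a process model. The following is a minimal Python sketch of that reading, not the BPMN-Q implementation. The toy model in FLOWS and the function names are assumptions made for illustration, and the sketch only tests whether a path exists, whereas a BPMN-Q path query usually returns all paths.

```python
from collections import deque

# Hypothetical sequence-flow edges of a toy process model (assumed for illustration).
FLOWS = {
    "Customer applies for real-estate credit": [
        "Check credit rating",
        "Check land register record",
    ],
    "Check credit rating": ["Reject application"],
    "Check land register record": ["Reject application"],
    "Reject application": [],
}


def has_sequence_flow(flows, a, b):
    """True if b is a direct successor of a (a single sequence flow)."""
    return b in flows.get(a, [])


def has_path(flows, a, b):
    """True if b is reachable from a over one or more sequence flows (BFS)."""
    seen = set()
    queue = deque(flows.get(a, []))
    while queue:
        node = queue.popleft()
        if node == b:
            return True
        if node in seen:
            continue
        seen.add(node)
        queue.extend(flows.get(node, []))
    return False


def negative_sequence_flow(flows, a, b):
    """Negative Sequence Flow: a and b are not directly connected."""
    return not has_sequence_flow(flows, a, b)


def negative_path(flows, a, b):
    """Negative Path: there is no path from a to b."""
    return not has_path(flows, a, b)


start = "Customer applies for real-estate credit"
end = "Reject application"
print(has_path(FLOWS, start, end))
print(negative_sequence_flow(FLOWS, start, end))
print(negative_path(FLOWS, end, start))
```

In this toy model the start and end activities are connected only through intermediate checks, so the Path and the Negative Sequence Flow constraints can both hold for the same pair of nodes.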




QBP: An Application of GraphREL

    (a) BPMN-Q Query Example:
          "Customer applies for real-estate credit"  //  "Reject application"
          (the two activities are connected by a path edge, drawn as //)

    (b) BPMN-Q Query Match:
          "Customer applies for real-estate credit"
          "Check credit rating"
          "Check real-estate construction document"
          "Check land register record"
          "Reject application"
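
Because the process repository is relational, a path edge such as the one in query (a) could in principle be evaluated with a recursive SQL query over an edge table. The sketch below is an illustration only, not the SQL that BPMN-Q or GraphREL actually generates. It assumes an Edges(graphID, sVertex, dVertex, edgeLabel) table in the style of the Vertex-Edge encoding, uses SQLite through Python's sqlite3 module, and fills the table with made-up vertex IDs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Edges (graphID INTEGER, sVertex INTEGER, "
    "dVertex INTEGER, edgeLabel TEXT)"
)
# Hypothetical data: in model 1 vertex 4 is reachable from vertex 1,
# in model 2 it is not.
conn.executemany(
    "INSERT INTO Edges VALUES (?, ?, ?, ?)",
    [
        (1, 1, 2, "flow"), (1, 2, 3, "flow"), (1, 3, 4, "flow"),
        (2, 1, 2, "flow"), (2, 3, 4, "flow"),
    ],
)

# Recursive CTE: collect every vertex reachable from :src within each model.
# UNION (rather than UNION ALL) removes duplicates, so cycles terminate.
PATH_SQL = """
WITH RECURSIVE reach(graphID, vertex) AS (
    SELECT graphID, dVertex FROM Edges WHERE sVertex = :src
    UNION
    SELECT e.graphID, e.dVertex
    FROM Edges AS e
    JOIN reach AS r ON e.graphID = r.graphID AND e.sVertex = r.vertex
)
SELECT DISTINCT graphID FROM reach WHERE vertex = :dst ORDER BY graphID
"""

# IDs of the process models that contain a path from vertex 1 to vertex 4.
matches = [row[0] for row in conn.execute(PATH_SQL, {"src": 1, "dst": 4})]
print(matches)
```

A real query would first resolve the activity labels to vertex IDs through the vertex table; that step is left out here to keep the sketch short.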



QBP: Use Cases

    Searching the structure of the process models.


    Compliance checking.


    Detecting design anomalies.


    Discovery of frequent process patterns/anti-patterns.




QBP: An Application of GraphREL




                         http://bpmnq.sourceforge.net/
Conclusions

    GraphREL is a purely relational framework for storing and querying graph
    data.

    In principle, GraphREL has the following advantages:
          It can reside on any relational database system and exploit that
          system's well-known, mature query optimization techniques as well as
          its efficient and scalable query processing techniques.

          It requires no offline or pre-processing steps, so it incurs no
          up-front time cost.

          It handles both static and dynamic (frequently updated) graph
          databases well.

          The selectivity annotations in the SQL evaluation scripts help the
          relational query optimizer select the most efficient execution plan
          and efficiently prune the graph database members that are not
          required.


Future Work: Large-Scale Graph Query Processing
(e.g., Social Networks)




Future Work: Parallel Processing / MapReduce
(HadoopDB)




Future Work: Storing and Querying Hypergraphs




References

    [CIDR’03] G. Graefe. Sorting And Indexing With Partitioned
    B-Trees.

    [VLDB’04] T. Grust, S. Sakr, and J. Teubner. XQuery on SQL
    Hosts.

    [SIGMOD’07] T. Grust, M. Mayr, J. Rittinger, S. Sakr, and J.
    Teubner. A SQL:1999 Code Generator for the Pathfinder XQuery
    Compiler.

    [VLDB’08] J. Teubner, T. Grust, S. Maneth, and S. Sakr.
    Dependable Cardinality Forecasts for XQuery.




References

    [IJWIS’08] S. Sakr. Algebraic-Based XQuery Cardinality
    Estimation.

    [DASFAA’09] S. Sakr. GraphREL: A Decomposition-Based and
    Selectivity-Aware Relational Framework for Processing Sub-graph
    Queries.

    [UNISCON’09] S. Sakr. Storing and Querying Graph Data Using
    Efficient Relational Processing Techniques.

    [JDM’09] S. Sakr. Purely Relational Implementation of an XQuery
    Processor.




The End




                        Thank You




GraphREL: A Relational Framework for Processing Sub-graph Queries

  • 1. GraphREL: A Decomposition-Based and Selectivity-Aware Relational Framework for Processing Sub-graph Queries Sherif Sakr School of Computer Science and Engineering University of New South Wales . http://www.cse.unsw.edu.au/∼ssakr/ BIT Seminars ’09 - Free University of Bolzano, Italy 16 November 2009 S. Sakr (CSE, UNSW) BIT Seminars’09 16 November 2009 1 / 40
  • 2. Outline Previous Work: Pathfinder - Relational XQuery Compiler. Current Work: GraphREL - General Graph Query Processor. Future Work: Scalable Graph Query Processing for New Generation of Database Applications. S. Sakr (CSE, UNSW) BIT Seminars’09 16 November 2009 2 / 40
  • 3. Outline Previous Work: Pathfinder - Relational XQuery Compiler. Current Work: GraphREL - General Graph Query Processor. Future Work: Scalable Graph Query Processing for New Generation of Database Applications. S. Sakr (CSE, UNSW) BIT Seminars’09 16 November 2009 3 / 40
  • 4. Pathfinder: A Relational XQuery Processor XQuery Expression Pathfinder Relational Algebra MIL Code Generator SQL Code Generator MIL Scripts SQL Scripts Monet DBMS Conventional RDBMS http://pathfinder-xquery.org/ S. Sakr (CSE, UNSW) BIT Seminars’09 16 November 2009 4 / 40
  • 5. Pathfinder: A Relational XQuery Processor XML XQuery Document Expression Pathfinder Relational Algebra + Special Properties [VLDB’04] Estimation Rules Translation Templates [VLDB’08] XPath Accelerator Cardinality Properties Encoding Tuples XQuery Estimator SQL Generator + Statistical Guide [SIGMOD’07] Statistical Guide [IJWIS’09] Cardinality Properties Aware [JDM’09] Statistical Histograms SQL Scripts Relational Results XML XML Conventional RDBMS Serializer Statistical Histograms S. Sakr (CSE, UNSW) System Administrator BIT Seminars’09 16 November 2009 5 / 40
  • 6. Outline Previous Work: Pathfinder - Relational XQuery Compiler. Current Work: GraphREL - General Graph Query Processor. Future Work: Scalable Graph Query Processing for New Generation of Database Applications. S. Sakr (CSE, UNSW) BIT Seminars’09 16 November 2009 6 / 40
  • 7. GraphREL: Motivations Graphs are among the most complicated and general form of data structures. Recently, they have been widely used to model many complex structured and schemaless data such as social networks, chemical compounds, biological pathways, spatial databases, semantic web and business process models. Retrieving related graphs containing a query graph from a large graph database is a key performance issue in all of these graph-based applications. The success of any graph database application is directly dependent on the efficiency of the graph indexing and query processing mechanisms. RDBMSs have repeatedly shown that they are very efficient, scalable and successful in hosting different kinds of data. S. Sakr (CSE, UNSW) BIT Seminars’09 16 November 2009 7 / 40
  • 8. Preliminaries: Graph Data Model In labelled graphs, vertices and edges represent the entities and the relationships between them respectively. The attributes associated with these entities and relationships are called labels. A graph database D is a collection of member graphs D = {g1 , g2 , ...gn } where each member graph gi is denoted as (V , E , Lv , Le ). V is the set of vertices. E ⊆ V × V is the set of edges joining two distinct vertices. Lv is the set of vertex labels. Le is the set of edge labels. labelled graphs are classified according to the direction of their edges into two main classes: 1 Directed-labelled graphs such as XML, RDF and traffic networks. 2 Undirected-labelled graphs such as social networks and chemical compounds. S. Sakr (CSE, UNSW) BIT Seminars’09 16 November 2009 8 / 40
  • 9. Preliminaries: Graph Queries In principle, queries in graph databases can be broadly classified into the follow- ing main categories: Subgraph queries: this category searches for a specific pattern in the graph database. The pattern can be either a small graph or a graph where some parts of it are uncertain, e.g., vertices with wildcard labels. Supergraph queries: this category searches for the graph database members of which their whole structures are contained in the input query. Similarity (Approximate Matching) queries: this category finds graphs which are similar, but not necessarily isomorphic to a given query graph. S. Sakr (CSE, UNSW) BIT Seminars’09 16 November 2009 9 / 40
  • 10. Preliminaries: Subgraph Search Queries Given a graph database D = {g1 , g2 , ..., gn } and a graph query q, it returns the query answer set A = {gi |q ⊆ gi , gi ∈ D}. A graph q is described as a sub-graph of another graph database member gi if the set of vertices and edges of q form subset of the vertices and edges of gi . Formally, g1 (V1 , E1 , Lv 1 , Le1 ) is defined as sub-graph of g2 (V2 , E2 , Lv 2 , Le2 ) if and only if: 1 For every distinct vertex x ∈ V1 with a label vl ∈ Lv 1 , there is a distinct vertex y ∈ V2 with a label vl ∈ Lv 2 . 2 For every distinct edge edge ab ∈ E1 with a label el ∈ Le1 , there is a distinct edge ab ∈ E2 with a label el ∈ Le2 . S. Sakr (CSE, UNSW) BIT Seminars’09 16 November 2009 10 / 40
  • 11. Preliminaries: Subgraph Search Queries A A z f e x x C A m A A A B C n m C f e A e x x x C z A n B mA xB x C n C A n y z n x x n e x n x C m C D D Dn m D C m D D D A D x x f m f m Ax x D A B B g2 g1 g2 g3 g3 qq (a) Sample graph database (b) Graph query Figure: An example graph database and graph query S. Sakr (CSE, UNSW) BIT Seminars’09 16 November 2009 11 / 40
  • 12. Our Approach: GraphREL Relational encoding of graph data. SQL translation of sub-graph search queries. Filtering phase. Optional verification phase. Partitioned B-tree Indexes. Statistical Summaries. Decomposition-Based and Selectivity-Aware SQL Translation. S. Sakr (CSE, UNSW) BIT Seminars’09 16 November 2009 12 / 40
  • 13. Relational Encoding of Graph Data The starting point of our relational framework is to find an efficient and suitable encoding for each graph member gi in the graph database D. We use the Vertex-Edge mapping scheme for storing directed labelled graphs with the following structure:
        Vertices(graphID, vertexID, vertexLabel)
        Edges(graphID, sVertex, dVertex, edgeLabel)
  • 14. Relational Encoding of Graph Data Figure: graphs g1 and g2 with their relational encoding.
        Vertices Table
        graphID  vertexID  vLabel
        1        1         A
        1        2         A
        1        3         D
        1        4         A
        1        5         C
        1        6         B
        2        1         A
        2        2         C
        2        3         D
        2        4         C
        2        5         B

        Edges Table
        graphID  sVertex  dVertex  eLabel
        1        1        2        n
        1        1        3        m
        1        2        3        n
        1        4        3        x
        1        5        4        x
        1        6        5        y
        1        5        2        z
        1        1        6        m
        2        1        2        e
        2        2        3        m
        2        4        3        m
        2        4        2        n
        2        5        4        x
        2        1        5        f
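
The Vertex-Edge mapping can be tried out with a few lines of Python and SQLite. This is only a minimal sketch: the choice of SQLite, the column types, the dictionary and list layout of the input graph, and the helper name `encode_graph` are all assumptions made for illustration, not part of GraphREL. The data is graph g2 from slide 14.

```python
import sqlite3

# Schema following the Vertex-Edge mapping scheme; column types are assumed.
SCHEMA = """
CREATE TABLE Vertices (graphID INTEGER, vertexID INTEGER, vertexLabel TEXT);
CREATE TABLE Edges (graphID INTEGER, sVertex INTEGER, dVertex INTEGER, edgeLabel TEXT);
"""

# Graph g2 from slide 14. The in-memory layout is an assumption of this sketch.
G2_VERTICES = {1: "A", 2: "C", 3: "D", 4: "C", 5: "B"}
G2_EDGES = [(1, 2, "e"), (2, 3, "m"), (4, 3, "m"), (4, 2, "n"), (5, 4, "x"), (1, 5, "f")]


def encode_graph(conn, graph_id, vertices, edges):
    """Store one directed labelled graph as rows of Vertices and Edges."""
    conn.executemany(
        "INSERT INTO Vertices (graphID, vertexID, vertexLabel) VALUES (?, ?, ?)",
        [(graph_id, vertex_id, label) for vertex_id, label in vertices.items()],
    )
    conn.executemany(
        "INSERT INTO Edges (graphID, sVertex, dVertex, edgeLabel) VALUES (?, ?, ?, ?)",
        [(graph_id, src, dst, label) for src, dst, label in edges],
    )


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
encode_graph(conn, 2, G2_VERTICES, G2_EDGES)

vertex_count = conn.execute("SELECT COUNT(*) FROM Vertices").fetchone()[0]
edge_count = conn.execute("SELECT COUNT(*) FROM Edges").fetchone()[0]
# Outgoing edges of vertex 4 in g2, ordered by destination vertex.
out_of_4 = conn.execute(
    "SELECT dVertex, edgeLabel FROM Edges "
    "WHERE graphID = 2 AND sVertex = 4 ORDER BY dVertex"
).fetchall()
print(vertex_count, edge_count, out_of_4)
```

Each graph contributes one row per vertex and one row per edge, so adding or removing a graph touches only the rows that carry its graphID.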
  • 15. SQL Translation of Graph Queries Filtering Phase: a sub-graph query q consisting of a set of vertices QV of size m and a set of edges QE of size n is evaluated using the following SQL translation template:
        SELECT DISTINCT V1.graphID, Vi.vertexID
        FROM Vertices AS V1, ..., Vertices AS Vm, Edges AS E1, ..., Edges AS En
        WHERE ∀_{i=2}^{m} (V1.graphID = Vi.graphID)
          AND ∀_{j=1}^{n} (V1.graphID = Ej.graphID)
          AND ∀_{i=1}^{m} (Vi.vertexLabel = QVi.vertexLabel)
          AND ∀_{j=1}^{n} (Ej.edgeLabel = QEj.edgeLabel)
          AND ∀_{j=1}^{n} (Ej.sVertex = Vf.vertexID AND Ej.dVertex = Vf.vertexID);
    Verification Phase: an optional phase that verifies that the filtered vertices of each candidate graph are distinct. It is applied only if more than one vertex in the set of query vertices QV has the same label. This is easily checked using the vertex IDs.
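
The filtering-phase template can be instantiated mechanically from a query graph. The sketch below is an illustration under stated assumptions rather than GraphREL's actual translator: the function names `translate_filter_query` and `find_candidates`, the input layout, and the use of SQLite are hypothetical, and the generated query returns only graph IDs (the verification phase is left out). The data is g1 and g2 from slide 14.

```python
import sqlite3

VERTEX_ROWS = [(1, 1, "A"), (1, 2, "A"), (1, 3, "D"), (1, 4, "A"), (1, 5, "C"),
               (1, 6, "B"), (2, 1, "A"), (2, 2, "C"), (2, 3, "D"), (2, 4, "C"),
               (2, 5, "B")]
EDGE_ROWS = [(1, 1, 2, "n"), (1, 1, 3, "m"), (1, 2, 3, "n"), (1, 4, 3, "x"),
             (1, 5, 4, "x"), (1, 6, 5, "y"), (1, 5, 2, "z"), (1, 1, 6, "m"),
             (2, 1, 2, "e"), (2, 2, 3, "m"), (2, 4, 3, "m"), (2, 4, 2, "n"),
             (2, 5, 4, "x"), (2, 1, 5, "f")]


def translate_filter_query(query_vertices, query_edges):
    """Build the filtering SQL for a query graph.

    query_vertices: {query vertex id: label}
    query_edges:    [(source query vertex id, destination query vertex id, label)]
    Returns (sql, params); labels are passed as bound parameters.
    """
    alias = {qid: f"V{i}" for i, qid in enumerate(query_vertices, start=1)}
    first = alias[next(iter(query_vertices))]
    tables = [f"Vertices AS {a}" for a in alias.values()]
    conds, params = [], []
    # All vertex aliases must come from the same graph as V1.
    for a in list(alias.values())[1:]:
        conds.append(f"{first}.graphID = {a}.graphID")
    # Vertex label predicates.
    for qid, label in query_vertices.items():
        conds.append(f"{alias[qid]}.vertexLabel = ?")
        params.append(label)
    # One Edges alias per query edge: same graph, label, and both end points.
    for j, (src, dst, label) in enumerate(query_edges, start=1):
        e = f"E{j}"
        tables.append(f"Edges AS {e}")
        conds.append(f"{first}.graphID = {e}.graphID")
        conds.append(f"{e}.edgeLabel = ?")
        params.append(label)
        conds.append(f"{e}.sVertex = {alias[src]}.vertexID")
        conds.append(f"{e}.dVertex = {alias[dst]}.vertexID")
    sql = (f"SELECT DISTINCT {first}.graphID FROM " + ", ".join(tables)
           + " WHERE " + " AND ".join(conds))
    return sql, params


def find_candidates(conn, query_vertices, query_edges):
    sql, params = translate_filter_query(query_vertices, query_edges)
    return sorted(row[0] for row in conn.execute(sql, params))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Vertices (graphID INTEGER, vertexID INTEGER, vertexLabel TEXT)")
conn.execute("CREATE TABLE Edges (graphID INTEGER, sVertex INTEGER, dVertex INTEGER, edgeLabel TEXT)")
conn.executemany("INSERT INTO Vertices VALUES (?, ?, ?)", VERTEX_ROWS)
conn.executemany("INSERT INTO Edges VALUES (?, ?, ?, ?)", EDGE_ROWS)

# Query: a B vertex with an x edge to a C vertex, which has an m edge to a D vertex.
path_matches = find_candidates(conn, {1: "B", 2: "C", 3: "D"}, [(1, 2, "x"), (2, 3, "m")])
print(path_matches)
```

A query with m vertices and n edges produces m + n table aliases in the FROM clause, which is why the number of joins grows quickly with the query size.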
  • 16. Partitioned B-tree Indexes Partitioned B-tree indexing is a slight variant of the B-tree indexing structure. The main idea is to use low-selectivity leading columns to maintain partitions within the associated B-tree. In labelled graphs, the number of distinct vertex and edge labels is generally far smaller than the number of vertices and edges, respectively. For example, an index defined on the columns (vertexLabel, graphID) can reduce the access cost of a sub-graph query with only one label to one disk page. In contrast, an index defined on the columns (graphID, vertexLabel) requires scanning a large number of disk pages. Partitioned B-tree indexes on the high-selectivity attributes achieve fixed execution times that no longer depend on the size of the whole graph database.
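
The column-order argument can be illustrated with an ordinary composite index. The sketch below assumes SQLite, whose composite indexes are plain B-trees and not an implementation of partitioned B-trees; it only shows that an index led by vertexLabel lets a single-label lookup be answered from the index. The index name `idx_vlabel_graph` is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Vertices (graphID INTEGER, vertexID INTEGER, vertexLabel TEXT)")
conn.executemany(
    "INSERT INTO Vertices VALUES (?, ?, ?)",
    [(1, 1, "A"), (1, 2, "A"), (1, 3, "D"), (1, 4, "A"), (1, 5, "C"), (1, 6, "B"),
     (2, 1, "A"), (2, 2, "C"), (2, 3, "D"), (2, 4, "C"), (2, 5, "B")],
)

# Label first, graph ID second: all entries for one label sit next to each other.
conn.execute("CREATE INDEX idx_vlabel_graph ON Vertices (vertexLabel, graphID)")

LOOKUP = "SELECT DISTINCT graphID FROM Vertices WHERE vertexLabel = ? ORDER BY graphID"

# Ask SQLite how it would evaluate the lookup; the last column holds the plan text.
plan = conn.execute("EXPLAIN QUERY PLAN " + LOOKUP, ("D",)).fetchall()
plan_text = " ".join(str(row[-1]) for row in plan)

graphs_with_d = [row[0] for row in conn.execute(LOOKUP, ("D",))]
print(plan_text)
print(graphs_with_d)
```

With the columns in the opposite order, (graphID, vertexLabel), the entries for one label are spread across every graph's range of the index, which is the situation the slide describes as requiring many page reads.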
  • 17. Limitations of SQL-Based Translation Approach An obvious problem of the SQL translation template is that it involves a large number of conjunctive SQL predicates and join operations between the encoding tables. Most relational query engines will certainly fail to execute the SQL translations of medium-sized or large sub-graph queries because they are too long and too complex (this does not mean they must consequently be too expensive). Therefore, we need a decomposition mechanism to divide this large and complex SQL translation query into a sequence of intermediate queries. Applying this decomposition mechanism blindly may lead to inefficient execution plans with very large, unnecessary and expensive intermediate results. We use statistical summary information to achieve an efficient decomposition process.
  • 18. Statistical Summaries In general, one of the most effective techniques for optimizing the execution times of SQL queries is to select the relational execution plan based on accurate selectivity information for the query predicates. We construct three Markov tables to store information about the frequency of occurrence of the distinct vertex labels, the distinct edge labels, and the connections between pairs of edges.
        Vertex Label  Frequency  |  Edge Label  Frequency  |  Edge Connection  Frequency
        A             100        |  a           40         |  ab               3
        B             200        |  c           5          |  ac               15
        C             38         |  e           28         |  ae               45
        D             4          |  l           54         |  ec               14
        E             50         |  m           140        |  em               103
        L             6          |  n           3          |  la               5
        M             10         |  o           20         |  pc               18
        N             250        |  p           15         |  px               45
        O             3          |  x           8          |  xy               25
        P             40         |  y           60         |  xz               2
        R             55         |  z           15         |  za               1
    Left: Markov table summary of vertex labels. Middle: Markov table summary of edge labels. Right: Markov table summary of pair-wise edge connections.
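
Looking up the labels of a query in these summaries is a simple dictionary exercise. The sketch below uses the frequencies shown on slide 18; the cut-off `threshold`, the function name `pruning_points`, and the decision to skip labels that are missing from a table are assumptions, since the slides do not say how "low frequency" is defined.

```python
# Frequencies copied from the three Markov tables on slide 18.
VERTEX_FREQ = {"A": 100, "B": 200, "C": 38, "D": 4, "E": 50, "L": 6,
               "M": 10, "N": 250, "O": 3, "P": 40, "R": 55}
EDGE_FREQ = {"a": 40, "c": 5, "e": 28, "l": 54, "m": 140, "n": 3,
             "o": 20, "p": 15, "x": 8, "y": 60, "z": 15}
CONNECTION_FREQ = {"ab": 3, "ac": 15, "ae": 45, "ec": 14, "em": 103, "la": 5,
                   "pc": 18, "px": 45, "xy": 25, "xz": 2, "za": 1}


def pruning_points(vertex_labels, edge_labels, connections, threshold):
    """Return (kind, label, frequency) for every query feature whose summary
    frequency is at or below the threshold, most selective first.

    The threshold is an assumed tuning knob. Labels absent from a table are
    skipped, because a missing entry is not known to mean frequency zero.
    """
    sources = (
        ("vertex", vertex_labels, VERTEX_FREQ),
        ("edge", edge_labels, EDGE_FREQ),
        ("connection", connections, CONNECTION_FREQ),
    )
    points = []
    for kind, labels, table in sources:
        for label in labels:
            freq = table.get(label)
            if freq is not None and freq <= threshold:
                points.append((kind, label, freq))
    return sorted(points, key=lambda point: point[2])


# Hypothetical query features: three vertex labels, two edge labels, one connection.
points = pruning_points(["A", "D", "C"], ["m", "x"], ["xz"], threshold=10)
print(points)
```

The length of the returned list plays the role of NPP, the number of pruning points available to the decomposition step.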
  • 19. Decomposition-Based and Selectivity-Aware SQL Translation
      • Identifying the pruning points.
      • Calculating the number of partitions.
      • Decomposed SQL translation: Blindly Single-Level Decomposition, Pruned Single-Level Decomposition, Pruned Multi-Level Decomposition.
      • Selectivity-aware annotations.
  • 20. Decomposition-Based and Selectivity-Aware SQL Translation
      • Identifying the pruning points: each vertex label, edge label or edge connection with low frequency is considered a pruning point in our relational evaluation mechanism. Given a query graph q, we first check the structure of q against our summary Markov tables to identify the possible pruning points; NPP denotes their number.
      • Calculating the number of partitions: suppose a sub-graph query q requires NJP join operations, and the relational query engine can evaluate up to MJP join operations in one query. The number of partitions is then computed as NOP = NJP / MJP.
  • 21. Decomposition-Based and Selectivity-Aware SQL Translation
      • Blindly Single-Level Decomposition: if NPP = 0, we blindly decompose the query q into NOP partitions. Each partition is translated into an intermediate evaluation step Si. The final evaluation step joins all intermediate evaluation steps and adds the conjunctive conditions of the partitions' connectors.
      • Pruned Single-Level Decomposition: if NPP >= NOP, we distribute the pruning points across the NOP intermediate partitions. This ensures balanced, effective pruning of all intermediate results.
      • Pruned Multi-Level Decomposition: if 0 < NPP < NOP, we distribute the pruning points across a first level of intermediate results of the NOP partitions. An intermediate collective pruned step IPS is constructed by joining all the pruned first-level intermediate results. IPS is then used as an entry pruning point for the remaining (NOP − NPP) non-pruned partitions in a hierarchical multi-level fashion. Each pruning point can be used to prune more than one partition (if possible).
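
The choice between the three decomposition strategies depends only on NPP and NOP, so it can be written down as a small decision function. This is a sketch under assumptions: the slides give NOP as NJP/MJP without saying how fractions are handled, so rounding up is assumed here, and the function names and returned strings are illustrative.

```python
import math


def number_of_partitions(njp, mjp):
    """NOP from the number of joins the query needs (NJP) and the number of
    joins the engine handles per query (MJP). Rounding up is an assumption."""
    if njp < 0 or mjp <= 0:
        raise ValueError("NJP must be non-negative and MJP must be positive")
    return math.ceil(njp / mjp)


def choose_strategy(npp, nop):
    """Map the number of pruning points (NPP) and partitions (NOP) to one of
    the three decomposition strategies named on the slide."""
    if npp == 0:
        return "blindly single-level"
    if npp >= nop:
        return "pruned single-level"
    return "pruned multi-level"


# Hypothetical numbers: a query needing 20 joins on an engine limited to 8 per query.
nop = number_of_partitions(20, 8)
print(nop, choose_strategy(0, nop), choose_strategy(3, nop), choose_strategy(1, nop))
```

Testing NPP = 0 first keeps the blind case separate from the multi-level case, which would otherwise also match when no pruning point exists.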
  • 22. Decomposition-Based and Selectivity-Aware SQL Translation Figure: Selectivity-aware decomposition process. (a) NPP > NOP: intermediate steps S1-SQL and S2-SQL feed the final evaluation step FES-SQL. (b) NPP < NOP: intermediate steps S1-SQL, S2-SQL and S3-SQL feed FES-SQL.
  • 23. Decomposition-Based and Selectivity-Aware SQL Translation Selectivity-aware Annotations: for any given SQL query, there are a large number of alternative execution plans. These alternative plans may differ significantly in their use of system resources or in response time. We use the statistical summary information to give hints to the query optimizer by injecting additional selectivity information for the individual query predicates into the SQL translations of the graph queries:
        SELECT fieldlist FROM tablelist WHERE Pi SELECTIVITY Si
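
Attaching the annotation to a predicate is plain string construction. The sketch below follows the `WHERE Pi SELECTIVITY Si` shape from the slide, but how Si is derived is an assumption: it is taken here as the label frequency divided by a total count, and the function names and the six-decimal formatting are hypothetical. Whether a particular database engine accepts the clause is not something the slides state.

```python
def annotate(predicate, frequency, total):
    """Append a SELECTIVITY hint to one predicate.

    Selectivity is estimated as frequency / total, which is an assumption of
    this sketch and not a formula given on the slides.
    """
    if total <= 0 or not 0 <= frequency <= total:
        raise ValueError("need 0 <= frequency <= total and total > 0")
    selectivity = frequency / total
    return f"{predicate} SELECTIVITY {selectivity:.6f}"


def annotated_where(predicates):
    """Join (predicate, frequency, total) triples into one WHERE clause."""
    return "WHERE " + " AND ".join(annotate(p, f, t) for p, f, t in predicates)


# Hypothetical counts: label D occurs 5 times among 1000 vertices,
# edge label x occurs 8 times among 400 edges.
where_clause = annotated_where([
    ("V1.vertexLabel = 'D'", 5, 1000),
    ("E1.edgeLabel = 'x'", 8, 400),
])
print(where_clause)
```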
  • 24. Experimental Results: Performance and Scalability Figure: The scalability of GraphREL. (a) Synthetic Dataset; (b) DBLP Dataset. Axes: Execution Time (ms) against Query Size (Q4, Q8, Q12, Q16, Q20). Legend: D2kV10E20L40M50 (1MB), D10kV10E20L40M50 (10MB), D50kV30E40L90M150 (50MB), D100kV30E40L90M150 (100MB).
  • 25. Experimental Results: The effect of using Partitioned B-tree Indexes and Selectivity Injections Figure: The speedup improvement for the relational evaluation of sub-graph queries using partitioned B-tree indexes and selectivity-aware annotations. (a) Partitioned B-tree indexes; (b) Injection of selectivity annotations. Axis labels: Percentage of Improvement (%), Execution Times (ms), Query Size (Q4, Q8, Q12, Q16, Q20). Series: Synthetic, DBLP.
  • 26. QBP: An Application of GraphREL Many of today's information systems are driven by explicit process models. A business process is a set of coordinated activities that achieve a specific business objective. With the rapid and incremental increase in the number of process models, it becomes crucial for business process designers to be able to search their repository of models efficiently. QBP is a query processor for business process models. QBP is based on a new visual query language for business processes called BPMN-Q. The language addresses process definitions and extends the standard BPMN notation for modeling business processes as its concrete syntax. A BPMN-Q query is considered to be a graph that is matched against one or more process graphs.
  • 27. QBP: An Application of GraphREL Figure: business process model for a real-estate credit application. Activities: Customer applies for real-estate credit, Check credit rating, Check real-estate construction document, Check land register record, Offer loan protection insurance, Offer residence insurance, Prepare contract, Reject application. Data states: Credit Rating [accepted] / [rejected], Const. Doc. [valid] / [invalid], Record [present] / [absent]. Condition: All OK.
  • 28. QBP: Application Architecture Figure: architecture diagram. Users: Business Process Model Designers. Components: BPM-Q Query Editor, Semantic Query Expander, SQL-Based Query Processor, Process Models Editor, RDBMS with the Relational Business Process Repository, Translation Middleware (UML ADs, BPMN, BPEL, EPC, ...). Labelled flows: BPM-Q query, semantically expanded queries, SQL script, query results, result, updates.
  • 29. BPMN-Q Query Constructs
      • Anonymous Activity: indicates unknown activities in a query. It resembles an activity but is distinguished by the @ sign at the beginning of the label.
      • Generic Node: indicates an unknown node in a process. It could evaluate to any node type.
      • Generic Split: refers to any type of split gateway.
      • Generic Join: refers to any type of join gateway.
      • Negative Sequence Flow: states that two nodes A and B are not directly related by sequence flow.
      • Path: states that there must be a path from A to B. A query usually returns all paths.
      • Negative Path: states that there is no path between two nodes A and B.
  • 30. QBP: An Application of GraphREL Figure: (a) BPMN-Q Query Example: Customer applies for real-estate credit // Reject application. (b) BPMN-Q Query Match: Customer applies for real-estate credit, Check credit rating, Check real-estate construction document, Check land register record, Reject application.
  • 31. QBP: Use Cases
      • Searching the structure of the process models.
      • Compliance checking.
      • Detecting design anomalies.
      • Discovery of frequent process patterns/anti-patterns.
  • 32. QBP: An Application of GraphREL http://bpmnq.sourceforge.net/
  • 33. Conclusions GraphREL is a purely relational framework to store and query graph data. In principle, GraphREL has the following advantages:
      • It can reside on any relational database system and exploits its well-known, mature query optimization techniques as well as its efficient and scalable query processing techniques.
      • It requires no offline or pre-processing steps and therefore incurs no time cost for them.
      • It can handle both static and dynamic (frequently updated) graph databases very well.
      • The selectivity annotations for the SQL evaluation scripts give the relational query optimizer the ability to select the most efficient execution plans and to prune the graph database members that are not required.
  • 35. Future Work: Large Scale Graph Query Processing (e.g., Social Networks)
  • 36. Future Work: Parallel Processing / MapReduce (HadoopDB)
  • 37. Future Work: Storing and Querying Hypergraphs
  • 38. References
      [CIDR’03] G. Graefe. Sorting and Indexing with Partitioned B-Trees.
      [VLDB’04] T. Grust, S. Sakr, and J. Teubner. XQuery on SQL Hosts.
      [SIGMOD’07] T. Grust, M. Mayr, J. Rittinger, S. Sakr, and J. Teubner. A SQL:1999 Code Generator for the Pathfinder XQuery Compiler.
      [VLDB’08] J. Teubner, T. Grust, S. Maneth, and S. Sakr. Dependable Cardinality Forecasts for XQuery.
  • 39. References
      [IJWIS’08] S. Sakr. Algebraic-Based XQuery Cardinality Estimation.
      [DASFAA’09] S. Sakr. GraphREL: A Decomposition-Based and Selectivity-Aware Relational Framework for Processing Sub-graph Queries.
      [UNISCON’09] S. Sakr. Storing and Querying Graph Data Using Efficient Relational Processing Techniques.
      [JDM’09] S. Sakr. Purely Relational Implementation of an XQuery Processor.
  • 40. The End Thank You