© Adam Perer
                                           BudgetMatch




Budget-Match: Cost Effective Subgraph Matching on Large Networks
Matthias Bröcheler, Andrea Pugliese & V.S. Subrahmanian
[Figure: example social network. People (e.g. Bob, Mark, John, Peter, Francis, Jennifer, Ashley, Jessie, Melissa, Emily, Jon) are linked by "friend" edges, have "attended" and "organized" edges to events (e.g. Halloween 2008, Pizza Feast, Peter's Bday party, Sylvester 2009, Homecoming 09, Fundraiser for School, Chill-out Night, Spring Break Trip), and have "likes" edges to movies (e.g. The Godfather, Star Wars IV, Titanic, Pulp Fiction, Inception, Harry Potter, Toy Story, The Lion King). Movies have "type" edges to genres (Sci-Fi, Drama, Thriller, Comedy, Mystery, Family).]
Linked Data - RDF

                   500 million users




50M tweets / day


Subgraph Matching Queries

[Figure: query graph. Constant nodes: Peter, Francis, Drama. Variable nodes: ?p, ?u, ?f, ?b. Edge labels: organized, attended (x2), friend, likes (x2), type.]


Prior Work
 Systems (Storage, Index, Query answering)
    -  Jena, Sesame, RDF-3X, YARS, DOGMA, COSI, Hexastore, column stores, etc.
    -  AllegroGraph, Neo4J, OWLIM, etc.
 Query Optimization
    -  Stocker (WWW’08) and others
    -  Similar to RDBMS with schema discovery
      •  Selectivity estimation, query plan search and join ordering


Network Characteristics


Network Statistics
 Most real-world networks have power-law
  degree distributions
    -  Hence average statistics are not helpful
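
To make the point about averages concrete, here is a small sketch. The degree sequence is synthetic (a Zipf-like rule in which the node of rank r gets degree top_degree // r); that rule and the function names are assumptions made for illustration, not data or code from the talk.

```python
from statistics import mean, median

def zipf_like_degrees(num_nodes: int, top_degree: int) -> list[int]:
    """Synthetic heavy-tailed degrees: the node of rank r gets top_degree // r (at least 1)."""
    return [max(1, top_degree // rank) for rank in range(1, num_nodes + 1)]

def degree_summary(degrees: list[int]) -> dict[str, float]:
    """Report the mean next to the median and the maximum degree."""
    return {"mean": mean(degrees), "median": median(degrees), "max": max(degrees)}

degrees = zipf_like_degrees(num_nodes=10_000, top_degree=10_000)
summary = degree_summary(degrees)
# Share of nodes whose degree lies below the mean.
below_mean = sum(d < summary["mean"] for d in degrees) / len(degrees)
print(summary, below_mean)
```

In such a sequence most nodes sit well below the mean while a few hubs sit orders of magnitude above it, so a cost model built on the mean describes neither group well.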



[Figure: two degree-distribution plots with the mean marked on each; one is labeled "Long tail".]


 Subgraph Matching
  On networks with power-law degree
   distributions, subgraph matching
   algorithms will visit high-degree nodes
   when using static cost models
     -  Statistics won’t help us avoid those nodes
     -  Existing subgraph matching cost models
         are static


[Figure: the example social network again, with a "?" marked on it.]


BudgetMatch
 IDEA: Use a dynamic cost model which
   updates its cost estimates as it learns
   more about the network
  -  Assigns an initial cost estimate
    •  Fixed or based on average statistics
  -  Processes nodes using its current cost
      estimate as a budget for processing
  -  If budget is exceeded, processing is
      aborted and the cost estimate updated
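
The abort-and-update step above can be sketched as follows. This is a minimal illustration, assuming that "cost" means the number of neighbors read for a node; the names `expand_with_budget`, `process_node` and `growth`, and the toy adjacency map, are hypothetical and not taken from the talk.

```python
from itertools import islice
from typing import Iterable, Optional

def expand_with_budget(neighbors: Iterable[str], budget: int) -> Optional[list[str]]:
    """Read at most `budget` neighbors; return None (abort) if there are more."""
    taken = list(islice(neighbors, budget + 1))  # one extra item reveals an overrun
    return None if len(taken) > budget else taken

def process_node(neighbors_of, node: str, cost: int, growth: int = 5):
    """Try to process `node` within its current cost estimate.

    Returns (neighbors, cost) on success, or (None, raised cost) when the
    budget was exceeded, so the caller can defer the node and retry later.
    """
    result = expand_with_budget(neighbors_of(node), cost)
    if result is None:
        return None, cost * growth  # abort and update the estimate
    return result, cost

# Toy data (assumed): one hub with 12 neighbors and one low-degree node.
adjacency = {"hub": [f"n{i}" for i in range(12)], "leaf": ["a", "b"]}

def lookup(node: str):
    return iter(adjacency[node])

print(process_node(lookup, "leaf", cost=5))  # fits within the budget
print(process_node(lookup, "hub", cost=5))   # aborted, estimate raised
```

Reading one item past the budget means an aborted attempt costs only slightly more than the budget itself, regardless of how large the hub is.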


 BudgetMatch
  Depth-first search query answering
    algorithm
     -  Memory efficient
     -  Parallelizable
  Based on the DOGMA query answering
    algorithm
     -  ISWC’09
  Provably correct
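
For readers unfamiliar with depth-first query answering, the sketch below shows a plain depth-first matcher over triple patterns. It is neither DOGMA nor BudgetMatch: it uses no index and no budgets. The toy graph, the edge directions in the query, and all function names are assumptions made for illustration.

```python
from typing import Iterator, Optional

Triple = tuple[str, str, str]

def is_var(term: str) -> bool:
    return term.startswith("?")

def match(graph: set[Triple], patterns: list[Triple],
          binding: Optional[dict[str, str]] = None) -> Iterator[dict[str, str]]:
    """Yield every variable binding under which all patterns occur in graph."""
    binding = binding or {}
    if not patterns:
        yield dict(binding)
        return
    first, rest = patterns[0], patterns[1:]
    # Substitute variables that are already bound into the first pattern.
    wanted = tuple(binding.get(term, term) for term in first)
    for triple in graph:
        extended = dict(binding)
        ok = True
        for want, have in zip(wanted, triple):
            if is_var(want):
                # Bind the variable, or check it against an earlier binding.
                if extended.setdefault(want, have) != have:
                    ok = False
                    break
            elif want != have:
                ok = False
                break
        if ok:
            yield from match(graph, rest, extended)  # go one pattern deeper

# Assumed toy graph using names from the example network.
graph = {
    ("Peter", "organized", "Peter's bday party"),
    ("Francis", "attended", "Peter's bday party"),
    ("Jennifer", "attended", "Peter's bday party"),
    ("Ashley", "attended", "Peter's bday party"),
    ("Francis", "friend", "Mark"),
    ("Francis", "friend", "John"),
    ("Mark", "likes", "Titanic"),
    ("Mark", "likes", "Star Wars IV"),
    ("Jennifer", "likes", "Titanic"),
    ("Titanic", "type", "Drama"),
    ("Star Wars IV", "type", "Sci-Fi"),
}
query = [
    ("Peter", "organized", "?p"),
    ("Francis", "attended", "?p"),
    ("?u", "attended", "?p"),
    ("Francis", "friend", "?f"),
    ("?f", "likes", "?b"),
    ("?u", "likes", "?b"),
    ("?b", "type", "Drama"),
]
answers = list(match(graph, query))
print(answers)
```

Because the recursion keeps only one partial binding per level, memory use grows with the query size rather than with the number of candidate matches, which is the sense in which a depth-first strategy is memory efficient.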


Example Query

[Figure: query graph. Constant nodes: Peter, Francis, Drama. Variable nodes: ?p, ?u, ?f, ?b. Edge labels: organized, attended (x2), friend, likes (x2), type.]


 BudgetMatch Example I
   Peter     c = 5    R = {peter}
   Francis   c = 5    R = {francis}
   Drama     c = 5    R = {drama}
   ?p        c = 5    R = {}
   ?u        c = 5    R = {}
   ?f        c = 5    R = {}
   ?b        c = 5    R = {}
   Query edges shown: organized, attended (x2), friend, likes (x2), type
   ANS = {}
   θ = {}





BudgetMatch Example II
   Peter     c = 5    R = {peter}
   Francis   c = 5    R = {francis}    R’ = {}
   Drama     c = 5    R = {drama}
   ?p        c = 5    R = {}           R’ = {Peter’s bday party, Homecoming 09, Silvester 2009}
   ?u        c = 5    R = {}
   ?f        c = 5    R = {}           R’ = {Mark, John}
   ?b        c = 5    R = {}
   Query edges shown: organized, attended, likes (x2), type
   ANS = {}
   θ = {}


BudgetMatch Example III
   Peter     c = 5    R = {peter}
   Francis   c = 5    R’ = {}
   Drama     c = 25   R = {drama}      R’ = {drama}
   ?p        c = 5    R = {Peter’s bday party, Silvester 2009}
   ?u        c = 5    R = {}
   ?f        c = 5    R = {Mark, John}
   ?b        c = 5    R = {}
   Query edges shown: attended, likes (x2), type
   ANS = {}
   θ = {}


BudgetMatch Example IV
   Peter     c = 5    R = {peter}
   Francis   c = 5    R’ = {}
   Drama     c = 25   R = {drama}
   ?p        c = 5    R = {Peter’s bday party, Silvester 2009}
   ?u        c = 5    R = {}
   ?f        c = 5    R = {}
   ?b        c = 25   R = {Titanic, Star Wars IV}    R’ = {Titanic, Star Wars IV}
   Query edges shown: attended, likes, type
   ANS = {}
   θ = {?f/Mark}


BudgetMatch Example V
   Peter     c = 5    R = {peter}
   Francis   c = 5    R’ = {}
   Drama     c = 25   R = {drama}
   ?p        c = 5    R = {}
   ?u        c = 5    R = {Francis, Jennifer, Ashley}    R’ = {}
   ?f        c = 5    R = {}
   ?b        c = 25   R = {Titanic, Star Wars IV}        R’ = {Titanic}
   Query edges shown: type
   ANS = {}
   θ = {?f/Mark, ?p/Peter’s bday party, ?u/Jennifer}
BudgetMatch


BudgetMatch Example VI
  [Figure: query graph annotated with per-vertex state]
     -  ?p:      c = 5,  R = {}
     -  Francis: c = 5,  R’ = {}
     -  ?u:      c = 5,  R = {}
     -  ?f:      c = 5,  R = {}
     -  Peter:   c = 5,  R = {peter}
     -  ?b:      c = 25, R = {}
     -  Drama:   c = 25, R = {}
  ANS = {θ}
  θ = {?f/Mark, ?p/Peter’s bday party, ?u/Jennifer, ?b/Titanic}


 Cost Initialization & Update
  Initialize cost
     -  Constant initial cost
     -  Using average degree statistics
  Cost estimate update
     -  Multiply by a constant
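
The two steps on this slide can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the default constant of 5, and the update factor of 5 are assumptions, chosen only because they reproduce the c = 5 and c = 25 values shown in the example slides. The immediate retry loop is also a simplification of whatever scheduling the real algorithm uses after an abort.

```python
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def initial_cost(avg_degree: Optional[float] = None, constant: float = 5.0) -> float:
    """Initial cost estimate: an average-degree statistic if one is supplied,
    otherwise a constant (the value 5 is an assumption)."""
    return float(avg_degree) if avg_degree is not None else constant


def update_cost(cost: float, factor: float = 5.0) -> float:
    """Cost estimate update: multiply by a constant (the factor 5 is an assumption)."""
    return cost * factor


def retrieve_within_budget(candidates: Iterable[T], budget: float) -> Optional[List[T]]:
    """Collect candidates, aborting with None once more than `budget` items are seen."""
    collected: List[T] = []
    for item in candidates:
        if len(collected) >= budget:
            return None  # budget exceeded: abort processing of this vertex
        collected.append(item)
    return collected


# Hypothetical vertex with 7 neighbors, processed under a growing budget.
cost = initial_cost()
neighbors = [f"movie{i}" for i in range(7)]
attempts = 1
result = retrieve_within_budget(neighbors, cost)
while result is None:
    cost = update_cost(cost)
    result = retrieve_within_budget(neighbors, cost)
    attempts += 1
print(cost, attempts, len(result))  # prints: 25.0 2 7
```

With a budget of 5 the retrieval is aborted after the fifth neighbor. After one update the budget is 25 and all seven neighbors are returned.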






 Budget Assignment






 Experiments
  Evaluated on a network with 1.12 billion
    edges
     -  Delicious social network crawl (partial)
  Used Neo4J as storage engine
     -  Custom batch loading, degree lookup
  Compared against DOGMA algorithm
  Evaluated on a set of 9 diverse benchmark
    queries (5-12 edges)



 Query Times (Warm Cache)
  [Figure: query time in ms (logarithmic scale, axis ticks from 1 to 10000) for Query1 (5E), Query2 (7E), Query3 (6E), Query4 (9E), Query5 (9E), Query6 (7E), Query7 (12E), Query8 (12E), Query9 (11E)]
  Configurations compared:
     1)  assignBudget4, λ=100
     2)  assignBudget4, λ=500
     3)  assignBudget4, λ=2000
     4)  assignBudget4, λ=15000
     5)  assignBudget4, λ=100, statistics
     6)  assignBudget2, λ=500
     7)  assignBudget3, λ=500
     8)  assignBudget1, λ=500
     9)  SN-3: DOGMA+statistics


 Query Times (Cold Cache)
  [Figure: query time in ms (axis ticks from 1 to 100000) for Query1 (5E), Query2 (7E), Query3 (6E), Query4 (9E), Query5 (9E), Query6 (7E), Query7 (12E), Query8 (12E), Query9 (11E)]
  Configurations compared:
     1)  assignBudget4, λ=100
     2)  assignBudget4, λ=500
     3)  assignBudget4, λ=2000
     4)  assignBudget4, λ=15000
     5)  assignBudget4, λ=100, statistics
     6)  assignBudget2, λ=500
     7)  assignBudget3, λ=500
     8)  assignBudget1, λ=500
     9)  SN-3: DOGMA+statistics


 Comparison
  Compared configuration 4 against
      -  Neo4J subgraph matching (SN-1)
      -  DOGMA without statistics (SN-2)
      -  DOGMA with statistics (SN-3)

                   SN-1       SN-2         SN-3

     Cold Cache   12,867 x    12 x         11 x

     Warm Cache   44,794 x    18 x         14 x



 DOGMA Index
  [Figure: DOGMA index tree built over an example graph of bills, amendments, sponsors, and terms of office. Labels: Graph Locality, Disk Pages.]


 COSI Architecture
  [Figure: architecture diagram. Labels: Graph Data; load; Partition Graph (automatic); Distribute data / Dispatch query; Forward query; Exchange Data / Answer Queries (complexity hidden); Client; Receive query - Return results; Query answer. The client's example query has vertices A, B, C and variables ?X, ?Y, ?Z.]


 COSI Partitioning
     Key Theorem
      Suppose vertex retrieval and inter-node communication
       costs are uniform across storage nodes. The partition
       of the DB that minimizes query execution time coincides
       with the partition that minimizes edge cut cost
       in the graph (V, V × V) with weight function
      w(u,v)= (E(u,v))+ (E(v,u)).

       SO MIN EDGE-CUTS IN COMPLETE GRAPHS IS
         CLOSELY RELATED TO MINIMIZING QUERY
         EXECUTION TIME.
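
To make the objective in the theorem concrete, here is a minimal Python sketch of the edge cut cost of a given partition. It treats the weights w(u,v) as already computed and supplied once per unordered vertex pair. The function name, the data layout, and the toy numbers are assumptions for illustration and are not taken from COSI.

```python
from typing import Dict, Hashable, Tuple

Pair = Tuple[Hashable, Hashable]


def edge_cut_cost(weights: Dict[Pair, float], partition: Dict[Hashable, int]) -> float:
    """Sum w(u, v) over all vertex pairs whose endpoints are assigned
    to different storage nodes."""
    return sum(
        w for (u, v), w in weights.items() if partition[u] != partition[v]
    )


# Hypothetical weights on a complete graph over three vertices.
weights = {("a", "b"): 3.0, ("b", "c"): 1.0, ("a", "c"): 0.5}

# Keeping the heavily weighted pair (a, b) on one storage node gives the lower cut.
together = edge_cut_cost(weights, {"a": 0, "b": 0, "c": 1})
split = edge_cut_cost(weights, {"a": 0, "b": 1, "c": 1})
print(together, split)  # prints: 1.5 3.5
```

Under the theorem's uniformity assumption, the partition with the smaller cut cost is the one expected to give the lower query execution time.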


 Further Information
   COSI: Cloud Oriented Subgraph Identification
     in Massive Social Networks
     Matthias Bröcheler, Andrea Pugliese and V.S. Subrahmanian, The
      2010 International Conference on Advances in Social Networks
      Analysis and Mining
     - Patent Pending -
   DOGMA: A Disk-Oriented Graph Matching
    Algorithm
     Matthias Broecheler,  Andrea Pugliese,  V.S. Subrahmanian,
      Proceedings of the 8th International Semantic Web Conference
     - Patent Pending -








dogma.umiacs.umd.edu


 Conclusion
  Dynamic cost models are beneficial for
    networks with heavy-tailed distributions
  Developed BudgetMatch query answering
    algorithm which dynamically updates cost
    estimations during execution.
  BudgetMatch yields huge improvements over
    standard static approaches for some queries







Questions?
Comments?


Budget-Match: Cost Effective Subgraph Matching on Large Networks

  • 1. © Adam Perer BudgetMatch Budget-Match: Cost Effective Subgraph Matching on Large Networks Matthias Bröcheler, Andrea Pugliese & V.S. Subrahmanian
  • 2. likes type BudgetMatch friend friend Star Sci-Fi Bob Mark Wars IV friend The attended likes friend friend Titanic likes Godfather likes attended Halloween Pizza John 2008 Peter likes attended Feast organized likes type friend attended organized organized attended Francis Jennifer Peter‘s Drama attended friend likes Bday party attended organized attended Ashley likes attended Pulp Sylvester attended type Home- friend Fiction 2009 organized coming attended 09 type attended Fundraiser organized Gone for School Bob with the Thriller Jessie attended Chill-out wind Night Alice friend attended likes Goodbye type likes Mrs. attended organized Doubtfire organized Spring Inception type Melissa friend Break Trip Alice attended likes likes Comedy likes friend organized likes Harry likes Emily The Lion Potter type King likes Jon likes type Mystery Toy type type likes Story Family
  • 4. BudgetMatch 500 million users 50M tweets / day
  • 5. BudgetMatch Subgraph Matching Queries attended ?p Francis friend organi zed attended Peter ?u ?f likes likes type ?b Drama
  • 6. BudgetMatch Prior Work  Systems (Storage, Index, Query answering) -  Jena, Sesame, RDF-3X, YARS, DOGMA, COSI, Hexastore, column stores, etc -  AllegroGraph, Neo4J, OWLIM, etc  Query Optimization -  Stocker (WWW’08) and others -  similar to RDBMS with schema discovery •  Selectivity estimation, query plan search and join ordering 6
  • 7. likes type BudgetMatch friend friend Star Sci-Fi Bob Mark Wars IV friend The attended likes friend friend Titanic likes Godfather likes attended Halloween Pizza John 2008 Peter likes attended Feast organized likes type friend attended organized organized attended Francis Jennifer Peter‘s Drama attended friend likes Bday party attended organized attended Ashley likes attended Pulp Sylvester attended type Home- friend Fiction 2009 organized coming attended 09 type attended Fundraiser organized Gone for School Bob with the Thriller Jessie attended Chill-out wind Night Alice friend attended likes Goodbye type likes Mrs. attended organized Doubtfire organized Spring Inception type Melissa friend Break Trip Alice attended likes likes Comedy likes friend organized likes Harry likes Emily The Lion Potter type King likes Jon likes type Mystery Toy type type likes Story Family
  • 9. BudgetMatch Network Statistics  Most real world networks have power-law degree distributions -  Hence average statistics are not helpful mean mean Long tail 9
  • 10. BudgetMatch Subgraph Matching  On networks with power-law degree distributions, subgraph matching algorithms will visit high degree nodes when using static cost models -  Statistics won’t help us avoid those -  Existing subgraph matching cost models are static 10
  • 11. likes type BudgetMatch friend friend Star Sci-Fi Bob Mark Wars IV friend The attended likes friend friend Titanic likes Godfather likes attended Halloween Pizza John 2008 Peter attended Feast type ? likes organized likes friend attended organized organized attended Francis Jennifer Peter‘s Drama attended friend likes Bday party attended organized attended Ashley likes attended Pulp Sylvester attended type Home- friend Fiction 2009 organized coming attended 09 type attended Fundraiser organized Gone for School Bob with the Thriller Jessie attended Chill-out wind Night Alice friend attended likes Goodbye type likes Mrs. attended organized Doubtfire organized Spring Inception type Melissa friend Break Trip Alice attended likes likes Comedy likes friend organized likes Harry likes Emily The Lion Potter type King likes Jon likes type Mystery Toy type type likes Story Family
  • 12. BudgetMatch BudgetMatch  IDEA: Use a dynamic cost model which updates its cost estimates as it learns more about the network -  Assigns an initial cost estimate •  Fixed or based on average statistics -  Processes nodes using its current cost estimate as a budget for processing -  If budget is exceeded, processing is aborted and the cost estimate updated
  • 13. BudgetMatch BudgetMatch  Depth first search query answering algorithm -  Memory efficient -  Parallelizable  Based on the DOGMA query answering algorithm -  ISWC’09  Provably correct 13
  • 14. BudgetMatch Example Query attended ?p Francis friend organi zed attended Peter ?u ?f likes likes type ?b Drama
  • 15. likes type BudgetMatch friend friend Star Sci-Fi Bob Mark Wars IV friend The attended likes friend friend Titanic likes Godfather likes attended Halloween Pizza John 2008 Peter likes attended Feast organized likes type friend attended organized organized attended Francis Jennifer Peter‘s Drama attended friend likes Bday party attended organized attended Ashley likes attended Pulp Sylvester attended type Home- friend Fiction 2009 organized coming attended 09 type attended Fundraiser organized Gone for School Bob with the Thriller Jessie attended Chill-out wind Night Alice friend attended likes Goodbye type likes Mrs. attended organized Doubtfire organized Spring Inception type Melissa friend Break Trip Alice attended likes likes Comedy likes friend organized likes Harry likes Emily The Lion Potter type King likes Jon likes type Mystery Toy type type likes Story Family
  • 16. BudgetMatch BudgetMatch Example I attended c= 5 c=5 R = {} ?p Francis R = {francis} friend organi zed attended c=5 Peter ?u R = {} ?f likes c=5 c=5 R = {} R = {peter} likes type ?b Drama c=5 c=5 ANS = R = {} R = {drama} {} θ = {} 16
  • 17. BudgetMatch BudgetMatch Example II c= 5, R = {}, c=5 R’= {Peter’s bday party, Homecoming ?p Francis R = {francis} 09, Silvester 2009} R’= {} organi zed attended c=5 Peter ?u R = {} ?f likes c=5 c=5 R = {} R = {peter} likes R’ = {Mark, John} type ?b Drama c=5 c=5 ANS = R = {} R = {drama} {} θ = {}
  • 18. BudgetMatch BudgetMatch Example III c= 5, R = {Peter’s bday party, c=5 Silvester 2009} ?p Francis R’= {} attended c=5 Peter ?u R = {} ?f likes c=5 c=5 R = {Mark, John} R = {peter} likes type ?b Drama c=5 c = 25 ANS = R = {} R = {drama} {} R’ = {drama} θ = {}
  • 19. BudgetMatch BudgetMatch Example IV c= 5, R = {Peter’s bday party, c=5 Silvester 2009} ?p Francis R’= {} attended c=5 Peter ?u R = {} ?f c=5 c=5 R = {} R = {peter} likes type ?b Drama c = 25 R = {Titanic, Star Wars IV} c = 25 ANS = R’ = {Titanic, Star Wars IV} R = {drama} {} θ = {?f/Mark}
  • 20. BudgetMatch BudgetMatch Example V c= 5, c=5 R = {} ?p Francis R’= {} c=5 Peter ?u R = {Francis, ?f Jennifer, Ashley} c=5 c=5 R’= {} R = {} R = {peter} type ?b Drama c = 25 R = {Titanic, Star Wars IV} c = 25 ANS = R’ = {Titanic} R = {drama} {} θ = {?f/Mark, ?p/Peter’s bday party, ?u/Jennifer}
  • 21. BudgetMatch BudgetMatch Example VI c= 5, c=5 R = {} ?p Francis R’= {} c=5 Peter ?u R = {} ?f c=5 c=5 R = {} R = {peter} c = 25 R = {}} ?b Drama = 25 c ANS = {θ} R = {} θ = {?f/Mark, ?p/Peter’s bday party, ?u/ Jennifer, ?b/Titanic}
  • 22. Cost Initialization & Update. Initialize cost: either a constant initial cost, or an estimate from average degree statistics. Cost estimate update: multiply by a constant.
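A minimal sketch of the two initialization options and the multiplicative update named on this slide. The function names, the default constant of 5, and the update factor of 5 are assumptions for illustration; they were chosen only to match the c = 5 and c = 25 values in the example slides, and the slides do not state the actual constants used.

```python
from statistics import mean


def initial_cost(degrees=None, constant=5.0):
    """Return an initial cost estimate for a vertex.

    With no statistics, fall back to a constant (hypothetical default 5.0).
    With a non-empty list of observed degrees, use their average instead.
    """
    if degrees:
        return mean(degrees)
    return constant


def update_cost(cost, factor=5.0):
    """Raise a cost estimate by multiplying it by a constant factor.

    The factor 5.0 is an assumption, not a value given in the slides.
    """
    return cost * factor


if __name__ == "__main__":
    c = initial_cost()                 # constant initialization
    print(c, update_cost(c))           # estimate before and after one update
    print(initial_cost([2, 4, 6]))     # statistics-based initialization
```

Keeping initialization and update as separate functions makes it easy to swap the constant start for a statistics-based one without touching the update rule.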
  • 24. Experiments. Evaluated on a network with 1.12 billion edges (partial crawl of the Delicious social network). Used Neo4J as the storage engine, with custom batch loading and degree lookup. Compared against the DOGMA algorithm. Evaluated on a set of 9 diverse benchmark queries (5-12 edges).
  • 25. Query Times (Warm Cache). [Figure: bar chart of query time in ms on a logarithmic scale (1 to 10000) for Query1 (5E), Query2 (7E), Query3 (6E), Query4 (9E), Query5 (9E), Query6 (7E), Query7 (12E), Query8 (12E), Query9 (11E). Configurations: 1) assignBudget4, λ=100; 2) assignBudget4, λ=500; 3) assignBudget4, λ=2000; 4) assignBudget4, λ=15000; 5) assignBudget4, λ=100, statistics; 6) assignBudget2, λ=500; 7) assignBudget3, λ=500; 8) assignBudget1, λ=500; 9) SN-3: DOGMA+statistics.]
  • 28. Query Times (Cold Cache). [Figure: bar chart of query time in ms (axis from 1 to 100000) for Query1 (5E), Query2 (7E), Query3 (6E), Query4 (9E), Query5 (9E), Query6 (7E), Query7 (12E), Query8 (12E), Query9 (11E). Configurations: 1) assignBudget4, λ=100; 2) assignBudget4, λ=500; 3) assignBudget4, λ=2000; 4) assignBudget4, λ=15000; 5) assignBudget4, λ=100, statistics; 6) assignBudget2, λ=500; 7) assignBudget3, λ=500; 8) assignBudget1, λ=500; 9) SN-3: DOGMA+statistics.]
  • 29. Comparison. Compared configuration 4 against Neo4J subgraph matching (SN-1), DOGMA without statistics (SN-2), and DOGMA with statistics (SN-3). Cold cache: SN-1 12,867x; SN-2 12x; SN-3 11x. Warm cache: SN-1 44,794x; SN-2 18x; SN-3 14x.
  • 30. DOGMA Index. [Figure: graph locality. Numbered graph partitions (1 to 4) are mapped to disk pages, illustrated on an example graph of people, bills, and amendments with edge labels such as sponsor, amendmentTo, hasRole, forOffice, subject, gender, and term.]
  • 31. COSI Architecture. [Figure: a client loads graph data and submits a query (constants A, B, C; variables ?X, ?Y, ?Z). Labels: receive query, return results; partition graph (automatic); distribute data, dispatch query; exchange data, answer queries (complexity hidden); forward query; query answer.]
  • 32. COSI Partitioning: Key Theorem. Suppose vertex retrieval and inter-node communication are uniform across storage nodes. The partition of DB that minimizes query execution time coincides with the partition that minimizes edge cut cost in the graph (V, V × V) with weight function w(u,v) = (E(u,v)) + (E(v,u)). So minimizing edge cuts in complete graphs is closely related to minimizing query execution time.
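The edge cut cost that the theorem refers to can be sketched as follows, assuming the weights w(u,v) are already available as a dictionary and a partition is a mapping from each vertex to a storage node. The function name, the data layout, and the toy weights are hypothetical and do not come from the COSI paper.

```python
def edge_cut_cost(weights, partition):
    """Sum the weights of all vertex pairs placed on different storage nodes.

    weights:   dict mapping a pair (u, v) to its weight w(u, v)
    partition: dict mapping each vertex to a storage node identifier
    """
    return sum(
        w for (u, v), w in weights.items()
        if partition[u] != partition[v]
    )


if __name__ == "__main__":
    # Toy weights, invented for illustration only.
    weights = {("a", "b"): 3.0, ("b", "c"): 1.0, ("c", "d"): 2.0}
    good = {"a": 0, "b": 0, "c": 1, "d": 1}   # cuts only the pair (b, c)
    bad = {"a": 0, "b": 1, "c": 1, "d": 0}    # cuts (a, b) and (c, d)
    print(edge_cut_cost(weights, good), edge_cut_cost(weights, bad))
```

Under the theorem's uniformity assumption, a partitioner would prefer the first assignment because it has the lower cut cost.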
  • 33. Further Information. COSI: Cloud Oriented Subgraph Identification in Massive Social Networks. Matthias Bröcheler, Andrea Pugliese, and V.S. Subrahmanian. The 2010 International Conference on Advances in Social Networks Analysis and Mining. Patent pending. DOGMA: A Disk-Oriented Graph Matching Algorithm. Matthias Broecheler, Andrea Pugliese, and V.S. Subrahmanian. Proceedings of the 8th International Semantic Web Conference. Patent pending.
  • 35. Conclusion. Dynamic cost models are beneficial for networks with heavy-tailed distributions. Developed the BudgetMatch query answering algorithm, which dynamically updates cost estimates during execution. BudgetMatch yields huge improvements over standard static approaches for some queries.
  • 36. Questions? Comments?