Flexible querying of graph data

          Graph processing room
          FOSDEM, 2 Feb 2013


                Petra Selmer
           petra.selmer.uk@gmail.com
       http://www.dcs.bbk.ac.uk/~lselm01/
Introduction

       I shall be presenting my PhD topic which involves
       a declarative query language allowing for the
       flexible querying of graph-structured data with
       complex paths.




Agenda

     Who (am I)?
     Why (the motivation)?
     Some background info
     What (is the query language and what
      can it do)?
     Illustrative examples
     How (is it done)?

Who?

     Petra Selmer
     Part-time PhD student:
       Birkbeck College, University of London
       Prof. Alexandra Poulovassilis
       Dr. Peter T. Wood
     Software Architect:
       University College London’s Institute of Neurology
        (Wellcome Trust Centre for Neuroimaging)




Why?

     Amount of graph-structured data is
      growing fast
     The structure of this data is
      becoming more complex, especially
      when multiple, heterogeneous data
      sources are integrated together
     The structure of the data is also
      always subject to change...

Why?
     Users of such systems may not be familiar with the underlying data
      structure (available paths, etc.)
     The user may not be able to obtain meaningful answers (or indeed,
      any answers) from the data IF the querying system is limited to exact
      matching of users’ queries
     Also, the user may wish to explore the data by starting from a set of
      initial answers and proceeding from there
     The user may additionally wish to derive some intelligence from the
      connections....

        [Figure: the data, the query, the user]
Background: Ontologies

 Currently part of the Semantic Web stack (Tim Berners-Lee, RDF,
  triple stores)
 Models a domain of interest: inferences, reasoning...
 It can be thought of as a “schema” for graph data
 The following inference rules are included (among
  others):
     Subclass: ‘History’, ‘Languages’ are subclasses of
      ‘Humanities’
     Subproperty, Domain, Range...




What?
 Data model: G = (V, E)
   Very general model
   V : vertices (or nodes); each labelled with some
    constant
   E : directed, labelled edges; labels drawn from the
    alphabet Σ ∪ {type}
 The query language is called Flex-It (it is
  declarative)
 The language is based on conjunctive regular path
  queries
 Two operators, APPROX and RELAX, may be applied to
    conjuncts of the original query

What?
 Conjunctive regular path queries:
   The paths to be traversed in the graph are specified by a
    regular expression
 A single regular path query conjunct: (X, R, Y)
   X, Y: either constants or variables
   R: the regular expression
 “Conjunctive”: multiple conjuncts are joined; e.g. (X, R1, Y), (Y,
    R2, Z), (Z, R3, A)
     A variable shared between conjuncts (here Y and Z) must bind to the same node in each


         Example graph:  N1 --n--> N2 --n--> N3 --p--> N4
           1) (N1, n+, ?Y):   Y = N2, N3
           2) (N1, n*p, ?Y):  Y = N4
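The two example conjuncts can be reproduced with a minimal Python sketch. It assumes the edges run left to right as drawn, and it uses hand-built automata for n+ and n*p rather than a regular-expression compiler. The names EDGES, N_PLUS, N_STAR_P and evaluate are illustrative only and are not part of Flex-It.

```python
from collections import deque

# Edges of the example graph: N1 -n-> N2 -n-> N3 -p-> N4
EDGES = [("N1", "n", "N2"), ("N2", "n", "N3"), ("N3", "p", "N4")]

# Hand-built automata (start state is always 0): state -> [(label, next_state)]
N_PLUS = {"delta": {0: [("n", 1)], 1: [("n", 1)]}, "final": {1}}   # n+
N_STAR_P = {"delta": {0: [("n", 0), ("p", 1)]}, "final": {1}}      # n*p


def evaluate(start, nfa, edges):
    """Return the nodes reachable from `start` by a path whose labels the NFA accepts."""
    out = {}
    for src, label, dst in edges:
        out.setdefault(src, []).append((label, dst))
    seen = {(start, 0)}
    queue = deque(seen)
    answers = set()
    while queue:
        node, state = queue.popleft()
        if state in nfa["final"]:
            answers.add(node)
        # Follow a graph edge and an automaton transition with the same label.
        for label, dst in out.get(node, []):
            for wanted, nxt in nfa["delta"].get(state, []):
                if wanted == label and (dst, nxt) not in seen:
                    seen.add((dst, nxt))
                    queue.append((dst, nxt))
    return answers


print(sorted(evaluate("N1", N_PLUS, EDGES)))    # ['N2', 'N3']
print(sorted(evaluate("N1", N_STAR_P, EDGES)))  # ['N4']
```

The search runs over pairs (graph node, automaton state), which is the same idea as the product automaton mentioned later in the talk.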
What?
 Approximation allows for the approximate matching
  of labels in the path
 Edit operations may be applied to the edge labels in
  the path denoted by the regular expression:
      Edit operations: insertions, deletions, inversions,
       substitutions and transpositions of labels
      Each operation has a ‘cost’: usually 1
 Example:
      Query conjunct: (X, a*.b, Y)
      R = a*.b [answers returned at cost 0]
      R’ = p.a*.b (insertion of ‘p’) [answers returned at cost 1]
      R’’ = p.a*.b- (inversion of ‘b’) [answers returned at cost 2]
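The costs in this example can be checked with a small Python sketch. It simplifies heavily: it compares two concrete label sequences instead of working on the regular expression itself, it uses the standard dynamic-programming edit distance with insertion, deletion and substitution only, and it treats an inversion as replacing b by the distinct label b-. Transpositions are not modelled. The function name edit_cost is hypothetical.

```python
def edit_cost(query_labels, path_labels, cost=1):
    """Minimum total cost of insertions, deletions and substitutions
    that turn `query_labels` into `path_labels` (each operation costs `cost`)."""
    m, n = len(query_labels), len(path_labels)
    dist = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        dist[i][0] = i * cost          # delete all remaining query labels
    for j in range(1, n + 1):
        dist[0][j] = j * cost          # insert all remaining path labels
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            same = query_labels[i - 1] == path_labels[j - 1]
            dist[i][j] = min(
                dist[i - 1][j] + cost,                       # deletion
                dist[i][j - 1] + cost,                       # insertion
                dist[i - 1][j - 1] + (0 if same else cost),  # match / substitution
            )
    return dist[m][n]


# a.b is one word of the language a*.b
print(edit_cost(["a", "b"], ["a", "b"]))         # 0
print(edit_cost(["a", "b"], ["p", "a", "b"]))    # 1 (insertion of p)
print(edit_cost(["a", "b"], ["p", "a", "b-"]))   # 2 (insertion of p, b replaced by b-)
```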


What?
      Relaxation applies inference rules from an
      ontology (if one exists).
       The query conditions are logically relaxed using the
        ontology that describes the data
       Relaxation operations: subclass, subproperty, domain
        and range
       Each operation has a ‘cost’ – usually 1
      Example:
       We have an ontology:
         Humanities (superclass)
         Languages and History (subclasses of Humanities)
       Assume our query allows Languages to be relaxed
         Languages is relaxed to its superclass Humanities:
         Instances of Languages are returned at cost 0
         Instances of History are returned at cost 1
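The subclass example above can be sketched in Python as follows. This is an illustration under assumptions, not the Flex-It implementation: only the subclass rule is modelled, each step up the hierarchy costs 1, and the instance names lang_item and hist_item are invented for the sketch.

```python
# Ontology from the example: child class -> parent class
SUBCLASS_OF = {"Languages": "Humanities", "History": "Humanities"}

# Hypothetical instance data (not from the talk)
INSTANCES = {"Languages": ["lang_item"], "History": ["hist_item"]}


def descendants(cls):
    """Return `cls` together with all of its direct and indirect subclasses."""
    found = {cls}
    changed = True
    while changed:
        changed = False
        for child, parent in SUBCLASS_OF.items():
            if parent in found and child not in found:
                found.add(child)
                changed = True
    return found


def relaxed_instances(cls, step_cost=1):
    """Return (instance, cost) pairs, relaxing `cls` one superclass at a time."""
    best = {}
    cost, current = 0, cls
    while current is not None:
        for c in sorted(descendants(current)):
            for inst in INSTANCES.get(c, []):
                best.setdefault(inst, cost)   # keep the cheapest cost seen
        current = SUBCLASS_OF.get(current)    # move up to the superclass
        cost += step_cost
    return sorted(best.items(), key=lambda pair: (pair[1], pair[0]))


print(relaxed_instances("Languages"))
# [('lang_item', 0), ('hist_item', 1)]
```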

What?

      Answers are ranked according to how
       closely they match the original query;
       higher-cost answers have a lower ranking
      All answers at a certain distance d are
       ranked the same and returned before
       answers at a higher distance
      We allow for incremental execution: exact
       answers returned first; then answers at
       distance 1; ...
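The ranking behaviour can be illustrated with a short Python generator that hands back answers in batches of increasing distance. The answer pairs and distances are the ones reported in the APPROX example of this talk; the function name ranked_batches and the use of a heap are choices made for this sketch, not a description of the prototype.

```python
import heapq
from itertools import groupby


def ranked_batches(scored_answers):
    """Yield (distance, answers) batches in order of increasing distance."""
    heap = [(cost, answer) for answer, cost in scored_answers]
    heapq.heapify(heap)
    # Pop lazily so that a caller can stop after the first few batches.
    ordered = (heapq.heappop(heap) for _ in range(len(heap)))
    for cost, group in groupby(ordered, key=lambda pair: pair[0]):
        yield cost, [answer for _, answer in group]


scored = [
    (("ep23", "Journalist"), 2),
    (("ep22", "AirTravelAssistant"), 1),
    (("ep24", "AssistantEditor"), 2),
]

for distance, answers in ranked_batches(scored):
    print(distance, answers)
# 1 [('ep22', 'AirTravelAssistant')]
# 2 [('ep23', 'Journalist'), ('ep24', 'AssistantEditor')]
```

Because the function is a generator, a caller can request the next batch only when the user asks for more answers.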
Example – ‘Lifelong learner metadata’

     [Figure: graph of lifelong learner metadata; surviving labels: sc, History]
 Query: “What work positions can I reach, having a degree in English?”
        Y = the episode; Z = the job
     (?Y, ?Z) ←
        (?X, type, University),
        (?X, qualif.type, EnglishStudies),
        (?X, prereq+, ?Y),
        (?Y, type, Work),
        (?Y, job.type, ?Z)
      No results from User 2 will be returned, even though that user's timeline is relevant!
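Exact evaluation of this conjunctive query can be sketched in Python by treating each path expression as a binary relation and joining the conjuncts on their shared variables. The data graph here is hypothetical (the talk's figure is not reproduced), as are the node names e1, e2, e3, q1, j2, j3 and the job names Librarian and Teacher. The helpers rel, compose, plus and solve are illustrative names.

```python
# Hypothetical data graph: (source, label, target)
EDGES = [
    ("e1", "type", "University"), ("e1", "qualif", "q1"),
    ("q1", "type", "EnglishStudies"), ("e1", "prereq", "e2"),
    ("e2", "type", "Work"), ("e2", "job", "j2"), ("j2", "type", "Librarian"),
    ("e2", "prereq", "e3"),
    ("e3", "type", "Work"), ("e3", "job", "j3"), ("j3", "type", "Teacher"),
]


def rel(label):
    """All (source, target) pairs connected by an edge with this label."""
    return {(s, d) for s, l, d in EDGES if l == label}


def compose(r1, r2):
    """Path concatenation r1.r2."""
    return {(a, c) for a, b in r1 for b2, c in r2 if b == b2}


def plus(r):
    """One or more repetitions of r (transitive closure)."""
    closure = set(r)
    while True:
        bigger = closure | compose(closure, r)
        if bigger == closure:
            return closure
        closure = bigger


def solve(conjuncts, head):
    """Join the conjuncts on shared variables; terms starting with '?' are variables."""
    bindings = [{}]
    for subj, pairs, obj in conjuncts:
        extended = []
        for binding in bindings:
            for s, d in pairs:
                new, ok = dict(binding), True
                for term, value in ((subj, s), (obj, d)):
                    if term.startswith("?"):
                        ok = ok and new.setdefault(term, value) == value
                    else:
                        ok = ok and term == value
                if ok:
                    extended.append(new)
        bindings = extended
    return sorted({tuple(b[v] for v in head) for b in bindings})


query = [
    ("?X", rel("type"), "University"),
    ("?X", compose(rel("qualif"), rel("type")), "EnglishStudies"),
    ("?X", plus(rel("prereq")), "?Y"),
    ("?Y", rel("type"), "Work"),
    ("?Y", compose(rel("job"), rel("type")), "?Z"),
]
print(solve(query, ["?Y", "?Z"]))
# [('e2', 'Librarian'), ('e3', 'Teacher')]
```

If the prereq edges in this hypothetical graph were labelled next instead, the exact query would return nothing, which is the situation the slide describes.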
 Allowing query approximation can yield some answers:
      Replacing the edge label prereq by next, at an edit cost of 1, we get this variant of the
       query:
        (?Y, ?Z) ←
           (?X, type, University),
           (?X, qualif.type, EnglishStudies),
           APPROX(?X, prereq+, ?Y),
           (?Y, type, Work),
           (?Y, job.type, ?Z)
  prereq+ can be approximated by next.prereq* at edit distance 1:
      Result: Y = ep22, Z = AirTravelAssistant
  next.prereq* can be approximated by next.next.prereq*, now at edit distance 2:
    Results:
       Y = ep23, Z = Journalist
       Y = ep24, Z = AssistantEditor
   Query: “What jobs are open to me if I study English, or something similar, at University?”
     (?Y, ?Z) ←
         (?X, type, University), (?X, qualif, ?D),
         RELAX (?D, type, EnglishStudies),
         APPROX (?X, prereq+, ?Y),
         (?Y, type, Work), (?Y, job.type, ?Z)
        In addition to the answers (from User 2) obtained by the previous query, we now also have
         answers from the timeline of User 3
        prereq+ can be approximated by next.prereq* (distance 1) and EnglishStudies can be relaxed,
         via Languages, to Humanities (distance 2), which encompasses History
          Result: Y = ep32, Z = PersonalAssistant (distance of 3 from original query)
        next.prereq* can be approximated by next.next.prereq* (distance 2), with
         EnglishStudies again relaxed to Humanities (distance 2)
            Results: (both at distance 4 from the original query)
              Y = ep33, Z = Author
              Y = ep34, Z = AssociateEditor
How?
      Theory
       Construction of a weighted non-deterministic finite
        automaton (NFA) to represent the regular expression
         We add new states and transitions to the NFA to represent the
          approximation and relaxation operations
       Formation of a product automaton: NFA with data
        graph G
       We perform a lowest-cost path traversal of the product
        automaton, then construct the query tree, perform joins, etc.
       Polynomial time complexity
       Correctness of algorithms proven
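The steps above can be sketched in Python for a single APPROX conjunct. This is a simplified illustration, not the algorithm from the thesis: Dijkstra's algorithm stands in for the lowest-cost traversal, the product of automaton and graph is explored on the fly rather than built up front, and only substitution and insertion of labels are modelled (no deletion, inversion, transposition or relaxation). The graph with nodes A to D is hypothetical, as are the names PREREQ_PLUS and approx_answers.

```python
import heapq

# Hypothetical data graph: A -next-> B, B -prereq-> C, B -next-> D
EDGES = [("A", "next", "B"), ("B", "prereq", "C"), ("B", "next", "D")]

# Hand-built NFA for prereq+ : state -> [(label, next_state)]
PREREQ_PLUS = {
    "start": 0,
    "delta": {0: [("prereq", 1)], 1: [("prereq", 1)]},
    "final": {1},
}


def approx_answers(start, nfa, edges, edit_cost=1):
    """Return {node: lowest cost} for nodes reachable from `start`
    by a path that matches the NFA after substitutions/insertions."""
    out = {}
    for src, label, dst in edges:
        out.setdefault(src, []).append((label, dst))
    first = (start, nfa["start"])
    best = {first: 0}
    heap = [(0, start, nfa["start"])]
    answers = {}
    while heap:
        cost, node, state = heapq.heappop(heap)
        if cost > best[(node, state)]:
            continue  # stale heap entry
        if state in nfa["final"] and node not in answers:
            answers[node] = cost  # first pop of a node in a final state is its cheapest
        moves = []
        for label, dst in out.get(node, []):
            # Insertion: follow the edge without advancing the automaton.
            moves.append((dst, state, cost + edit_cost))
            for wanted, nxt in nfa["delta"].get(state, []):
                # Exact match costs 0; a different label is a substitution.
                step = 0 if wanted == label else edit_cost
                moves.append((dst, nxt, cost + step))
        for dst, nxt, new_cost in moves:
            if new_cost < best.get((dst, nxt), float("inf")):
                best[(dst, nxt)] = new_cost
                heapq.heappush(heap, (new_cost, dst, nxt))
    return answers


print(approx_answers("A", PREREQ_PLUS, EDGES))
# {'B': 1, 'C': 1, 'D': 2}
```

In this toy graph there is no exact answer for (A, prereq+, ?Y). B and C are found at cost 1 by reading the first prereq as next, and D at cost 2, mirroring the way next.prereq* and next.next.prereq* appear at distances 1 and 2 in the examples.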



How?

      Implementation of prototype
        Graph database: DEX (http://www.sparsity-technologies.com/dex)
        Programming language: C#
      Further work
        New flexible operation combining APPROX and
         RELAX → FLEX
        Optimisation!




Any questions?

     Thank you for your attention!

                      petra.selmer.uk@gmail.com
Fosdem 2013 petra selmer flexible querying of graph data

  • 1. Flexible querying of graph data Graph processing room FOSDEM, 2 Feb 2013 Petra Selmer petra.selmer.uk@gmail.com http://www.dcs.bbk.ac.uk/~lselm01/
  • 2. Introduction: I shall be presenting my PhD topic, which involves a declarative query language allowing for the flexible querying of graph-structured data with complex paths.
  • 3. Agenda: Who (am I)? Why (the motivation)? Some background info. What (is the query language and what can it do)? Illustrative examples. How (is it done)?
  • 4. Who? Petra Selmer. Part-time PhD student: Birkbeck College, University of London; Prof. Alexandra Poulovassilis; Dr. Peter T. Wood. Software Architect: University College London’s Institute of Neurology (Wellcome Trust Centre for Neuroimaging).
  • 5. Why? The amount of graph-structured data is growing fast. The structure of this data is becoming more complex, especially when multiple, heterogeneous data sources are integrated. The structure of the data is also always subject to change...
  • 6. Why? Users of such systems may not be familiar with the underlying data structure (available paths etc.). The user may not be able to obtain meaningful answers (or indeed, any answers) from the data if the querying system is limited to exact matching of users’ queries. Also, the user may wish to explore the data by starting from a set of initial answers and proceeding from there. The user may additionally wish to derive some intelligence from the connections. [Figure: the data, the query, the user]
  • 7. Background: Ontologies. Currently part of the Semantic Web stack (Tim Berners-Lee, RDF, triple stores). An ontology models a domain of interest: inferences, reasoning... It can be thought of as a “schema” for graph data. The following inference rules are included (among others): Subclass (‘History’ and ‘Languages’ are subclasses of ‘Humanities’), Subproperty, Domain, Range...
  • 8. What? Data model: G = (V, E), a very general model. V: vertices (or nodes), each labelled with some constant. E: directed, labelled edges, with labels drawn from the alphabet Σ ∪ {‘type’}. The query language is called Flex-It (it is declarative). The basis is that of conjunctive regular path queries. There are two operators, APPROX and RELAX, which may be applied to the original query.
  • 9. What? Conjunctive regular path queries: the paths to be traversed in the graph are expressed with a regular expression. A single regular path query conjunct: (X, R, Y), where X and Y are either constants or variables and R is the regular expression. “Conjunctive”: joining multiple conjuncts, e.g. (X, R1, Y), (Y, R2, Z), (Z, R3, A); the Y’s are matched, the Z’s are matched etc. Example graph: N1 -n-> N2 -n-> N3 -p-> N4. 1) (N1, n+, ?Y): Y = N2, N3. 2) (N1, n*p, ?Y): Y = N4.
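As a rough illustration of how a single regular path query conjunct could be evaluated, the sketch below runs the two example queries over the four-node example graph. It is a minimal sketch, not the Flex-It implementation: the automata for n+ and n*p are written out by hand, and the names `EDGES`, `N_PLUS`, `N_STAR_P` and `evaluate` are hypothetical.

```python
from collections import deque

# Example graph from the slide: N1 -n-> N2 -n-> N3 -p-> N4
EDGES = [("N1", "n", "N2"), ("N2", "n", "N3"), ("N3", "p", "N4")]

# Hand-built automata (hypothetical encoding):
# (start state, accepting states, {(state, label): next states})
N_PLUS = (0, {1}, {(0, "n"): {1}, (1, "n"): {1}})      # n+
N_STAR_P = (0, {1}, {(0, "n"): {0}, (0, "p"): {1}})    # n*p


def evaluate(source, nfa, edges):
    """Return every node reachable from source along a path whose labels the automaton accepts."""
    start, accepting, delta = nfa
    seen = {(source, start)}
    queue = deque(seen)
    answers = set()
    while queue:
        node, state = queue.popleft()
        if state in accepting:
            answers.add(node)
        # Follow each outgoing edge whose label the automaton can read in this state.
        for src, label, dst in edges:
            if src != node:
                continue
            for nxt in delta.get((state, label), ()):
                if (dst, nxt) not in seen:
                    seen.add((dst, nxt))
                    queue.append((dst, nxt))
    return answers
```

Calling `evaluate("N1", N_PLUS, EDGES)` gives the bindings for ?Y in query 1, and `evaluate("N1", N_STAR_P, EDGES)` those for query 2.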
  • 10. What? Approximation allows for the approximate matching of labels in the path. Edit operations are applied to the edge labels in the path denoted by the regular expression. Edit operations: insertions, deletions, inversions, substitutions and transpositions of labels. Each operation has a ‘cost’: usually 1. Example: Query conjunct: (X, a*.b, Y). R = a*.b [answers returned at cost 0]. R’ = p.a*.b (insertion of ‘p’) [answers returned at cost 1]. R’’ = p.a*.b- (insertion of ‘p’ plus inversion of ‘b’) [answers returned at cost 2].
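The costs in the a*.b example can be reproduced with a small lowest-cost search. The sketch below is an assumption-laden simplification: it takes the sequence of labels along one path and computes the cheapest set of edits under which a hand-built automaton for a*.b accepts it. Inversion is treated like a substitution (the label b- is read where b was expected), transpositions are left out, and `approx_cost`, `DELTA`, `START` and `ACCEPTING` are hypothetical names.

```python
import heapq

# Hypothetical automaton for a*.b: state 0 loops on 'a' and moves to accepting state 1 on 'b'.
START, ACCEPTING = 0, {1}
DELTA = {(0, "a"): {0}, (0, "b"): {1}}


def approx_cost(path_labels, unit_cost=1):
    """Minimum total edit cost under which the automaton accepts path_labels."""
    heap = [(0, 0, START)]  # (cost so far, labels consumed, automaton state)
    settled = set()
    while heap:
        cost, i, state = heapq.heappop(heap)
        if (i, state) in settled:
            continue
        settled.add((i, state))
        if i == len(path_labels) and state in ACCEPTING:
            return cost
        if i < len(path_labels):
            label = path_labels[i]
            # Exact match: follow a transition carrying the same label, at no cost.
            for nxt in DELTA.get((state, label), ()):
                heapq.heappush(heap, (cost, i + 1, nxt))
            # Insertion: the path carries an extra label that the expression lacks.
            heapq.heappush(heap, (cost + unit_cost, i + 1, state))
            # Substitution (also used here for inversion): read a different label.
            for (src, other), targets in DELTA.items():
                if src == state and other != label:
                    for nxt in targets:
                        heapq.heappush(heap, (cost + unit_cost, i + 1, nxt))
        # Deletion: skip a label of the expression without consuming the path.
        for (src, _), targets in DELTA.items():
            if src == state:
                for nxt in targets:
                    heapq.heappush(heap, (cost + unit_cost, i, nxt))
    return None
```

For instance, the label sequences `["a", "a", "b"]`, `["p", "a", "b"]` and `["p", "a", "b-"]` correspond to the cost 0, cost 1 and cost 2 cases of the example.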
  • 11. What? Relaxation is applied by using inference rules from an ontology (if one exists). It is achieved by applying logical relaxation of the query conditions using the data’s ontology definition. Relaxation operations: subclass, subproperty, domain and range. Each operation has a ‘cost’: usually 1. Example: We have an ontology with Humanities (superclass) and Languages and History (subclasses of Humanities). Assume our query states Languages may be relaxed. Languages is relaxed to Humanities: instances of Languages will be returned at cost 0; instances of History will be returned at cost 1.
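The subclass case of relaxation can be sketched as a walk up the class hierarchy. This is only an illustration under stated assumptions: each class has at most one superclass, only the subclass rule is modelled (not subproperty, domain or range), and `SUBCLASS_OF`, `ancestors` and `relaxation_cost` are hypothetical names.

```python
# Ontology from the example, stored as child -> parent (single inheritance assumed).
SUBCLASS_OF = {"Languages": "Humanities", "History": "Humanities"}


def ancestors(cls):
    """Yield cls itself, then its superclasses, nearest first."""
    while cls is not None:
        yield cls
        cls = SUBCLASS_OF.get(cls)


def relaxation_cost(query_class, instance_class, unit_cost=1):
    """Number of superclass steps the query class must be relaxed to cover the instance's class."""
    instance_ancestors = set(ancestors(instance_class))
    for steps, relaxed in enumerate(ancestors(query_class)):
        if relaxed in instance_ancestors:
            return steps * unit_cost
    return None  # no amount of subclass relaxation reaches this class
```

With the query class Languages, an instance of Languages needs no relaxation, while an instance of History is reached only after one step up to Humanities.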
  • 12. What? Answers are ranked according to how closely they match the original query; higher-cost answers have a lower ranking. All answers at a certain distance d are ranked the same and returned before answers at a higher distance. We allow for incremental execution: exact answers are returned first, then answers at distance 1, and so on.
  • 13. Example – ‘Lifelong learner metadata’ [figure not reproduced; surviving labels: sc, History]
  • 15. Query: “What work positions can I reach, having a degree in English?” Y = the episode; Z = the job. (?Y, ?Z) ← (?X, type, University), (?X, qualif.type, EnglishStudies), (?X, prereq+, ?Y), (?Y, type, Work), (?Y, job.type, ?Z)
  • 16. Query: “What work positions can I reach, having a degree in English?” Y = the episode; Z = the job. (?Y, ?Z) ← (?X, type, University), (?X, qualif.type, EnglishStudies), (?X, prereq+, ?Y), (?Y, type, Work), (?Y, job.type, ?Z). No results from User 2 will be returned... even though it is relevant!
  • 17. Allowing query approximation can yield some answers. Replacing the edge label prereq by next, at an edit cost of 1, we get this variant of the query: (?Y, ?Z) ← (?X, type, University), (?X, qualif.type, EnglishStudies), APPROX(?X, prereq+, ?Y), (?Y, type, Work), (?Y, job.type, ?Z). prereq+ can be approximated by next.prereq* at edit distance 1. Result: Y = ep22, Z = AirTravelAssistant
  • 18. Allowing query approximation can yield some answers. Replacing the edge label prereq by next, at an edit cost of 1, we get this variant of the query: (?Y, ?Z) ← (?X, type, University), (?X, qualif.type, EnglishStudies), APPROX(?X, prereq+, ?Y), (?Y, type, Work), (?Y, job.type, ?Z). next.prereq* can be approximated by next.next.prereq*, now at edit distance 2. Results: Y = ep23, Z = Journalist; Y = ep24, Z = AssistantEditor
  • 19. [figure not reproduced; surviving labels: sc, History]
  • 20. Query: “What jobs are open to me if I study English, or something similar, at University?” (?Y, ?Z) ← (?X, type, University), (?X, qualif, ?D), RELAX (?D, type, EnglishStudies), APPROX (?X, prereq+, ?Y), (?Y, type, Work), (?Y, job.type, ?Z). In addition to the answers (from User 2) obtained by the previous query, we now also have answers from the timeline of User 3. prereq+ can be approximated by next.prereq* (distance 1) and EnglishStudies can be relaxed, via Languages, to Humanities (distance 2), encompassing History. Result: Y = ep32, Z = PersonalAssistant (distance of 3 from the original query)
  • 21. Query: “What jobs are open to me if I study English, or something similar, at University?” (?Y, ?Z) ← (?X, type, University), (?X, qualif, ?D), RELAX (?D, type, EnglishStudies), APPROX (?X, prereq+, ?Y), (?Y, type, Work), (?Y, job.type, ?Z). next.prereq* can be approximated by next.next.prereq* (distance 2), with EnglishStudies again relaxed to Humanities (distance 2). Results (both at distance 4 from the original query): Y = ep33, Z = Author; Y = ep34, Z = AssociateEditor
  • 22. How? Theory: Construction of a weighted non-deterministic finite automaton (NFA) to represent the regular expression. We add new states and transitions to the NFA to represent the approximation and relaxation operations. Formation of a product automaton: NFA with data graph G. We perform a lowest-cost path traversal of the product automaton; construct the query tree, do joins etc. Polynomial time complexity. Correctness of algorithms proven.
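The product-automaton idea can be sketched as a lowest-cost search over (graph node, automaton state) pairs, which also yields answers in order of increasing distance. Everything here is illustrative rather than the prototype's code: the three-edge data graph is made up, the weighted automaton for prereq+ carries only one kind of approximation (reading next in place of prereq at cost 1), and `EDGES`, `DELTA` and `ranked_answers` are hypothetical names.

```python
import heapq

# Hypothetical data graph: (source, label, target)
EDGES = [("u", "next", "e1"), ("e1", "prereq", "e2"), ("u", "prereq", "e3")]

# Weighted automaton for prereq+, extended with cost-1 transitions that read
# 'next' in place of 'prereq': (state, label) -> [(cost, next state)]
DELTA = {
    (0, "prereq"): [(0, 1)],
    (1, "prereq"): [(0, 1)],
    (0, "next"): [(1, 1)],
    (1, "next"): [(1, 1)],
}
START, ACCEPTING = 0, {1}


def ranked_answers(source):
    """Yield (node, cost) pairs in non-decreasing cost, exploring the product lazily."""
    heap = [(0, source, START)]  # (cost so far, graph node, automaton state)
    settled = set()
    reported = set()
    while heap:
        cost, node, state = heapq.heappop(heap)
        if (node, state) in settled:
            continue
        settled.add((node, state))
        # The first time a node is reached in an accepting state, its cost is minimal.
        if state in ACCEPTING and node not in reported:
            reported.add(node)
            yield node, cost
        # A product transition pairs a graph edge with an automaton transition on its label.
        for src, label, dst in EDGES:
            if src == node:
                for step_cost, nxt in DELTA.get((state, label), ()):
                    heapq.heappush(heap, (cost + step_cost, dst, nxt))
```

Because the function is a generator, a caller can stop after the exact answers or keep pulling answers at distance 1, 2 and so on, which mirrors the incremental execution described earlier.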
  • 23. How? Implementation of prototype: Graph database: DEX (http://www.sparsity-technologies.com/dex). Programming language: C#. Further work: New flexible operation combining APPROX and RELAX: FLEX. Optimisation!
  • 24. Any questions? Thank you for your attention! petra.selmer.uk@gmail.com