Ontological Conjunctive Query Answering over large, semi-structured knowledge bases
Ontological Conjunctive Query Answering is today seeing renewed interest in knowledge systems that allow for expressive inferences. Most notably in the Semantic Web domain, this problem is known as Ontology-Based Data Access. Given a knowledge base made of factual knowledge (very often a relational database) and universal knowledge (an ontology), the problem consists in checking whether there is an answer to a conjunctive query in the knowledge base. This problem has been successfully studied in the past; however, the emergence of large, semi-structured knowledge bases and the increasing interest in non-relational databases have slightly changed its nature.
This presentation highlights the following aspects. First, we introduce the problem and the way we have chosen to address it. We then discuss how the size of the knowledge base impacts our approach. Next, we introduce the ALASKA platform, a framework for performing knowledge representation & reasoning operations over heterogeneously stored data. Finally, we present preliminary results obtained by comparing the efficiency of existing storage systems when storing knowledge bases of different sizes on disk, and we discuss future implications.


Transcript

  • 1. Ontological Conjunctive Query Answering over Large, Semi-Structured Knowledge Bases. Bruno Paiva Lima da Silva (bplsilva@gmail.com), GraphIK Research Team, LIRMM. FOSDEM 2012, February 5th.
  • 2. 1 Introduction, 2 Research Problem, 3 ALASKA platform, 4 Tests & Results, 5 Current & Future work, 6 Questions.
  • 3. About me. Bruno PAIVA LIMA DA SILVA, 2nd year PhD student @ GraphIK Research Team (http://www2.lirmm.fr/graphik). The GraphIK team is located at LIRMM, Montpellier, France. Research topics: knowledge representation (interrogation of knowledge bases), record linkage & argumentation problems.
  • 6. Ontological Conjunctive Query Answering. Problem: Ontological Conjunctive Query Answering (OCQA) [also known as Ontology-Based Data Access (OBDA)]. Given: a knowledge base (KB), made of factual knowledge and an ontology (universal knowledge), and a (Boolean) conjunctive query. OCQA consists in verifying whether there is an answer to the query in the KB (that is, whether the query can be deduced from the KB).
  • 9. Example. Let us describe the problem through a quick example. Factual knowledge: Alice and Bob are animals. Alice is a clownfish. Bob is a parrot. Ontology: “A clownfish is a fish.” “A fish swims.” “A parrot is a bird.” “A bird flies.” Query #1: Is there a clownfish? Yes, Alice.
  • 16. Example. Query #2: Is there an animal who flies? Ontology: “A clownfish is a fish.” “A fish swims.” “A parrot is a bird.” “A bird flies.” Factual knowledge: Alice and Bob are animals. Alice is a clownfish. Bob is a parrot. Facts added by applying the ontology rules, one at a time: Alice is a fish. Alice swims. Bob is a bird. Bob flies. Answer: Yes, Bob.
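The step-by-step enrichment in this example can be sketched in a few lines of Java. This is only an illustration, not the ALASKA implementation: the class name `ForwardChainingSketch`, the string encoding of atoms such as `animal(Bob)`, and the restriction to unary rules with one rule per body predicate are all assumptions made for brevity.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ForwardChainingSketch {

    // Unary rules "body(x) -> head(x)", taken from the ontology of the example.
    static final Map<String, String> RULES = new LinkedHashMap<>();
    static {
        RULES.put("clownfish", "fish");
        RULES.put("fish", "swims");
        RULES.put("parrot", "bird");
        RULES.put("bird", "flies");
    }

    // Applies the rules until no new atom can be added (the base is saturated).
    static Set<String> saturate(Set<String> facts) {
        Set<String> kb = new LinkedHashSet<>(facts);
        boolean changed = true;
        while (changed) {
            changed = false;
            for (String atom : new LinkedHashSet<>(kb)) { // iterate over a copy
                int open = atom.indexOf('(');
                String predicate = atom.substring(0, open);
                String argument = atom.substring(open); // e.g. "(Alice)"
                String head = RULES.get(predicate);
                if (head != null && kb.add(head + argument)) {
                    changed = true;
                }
            }
        }
        return kb;
    }

    // Query #2: names x such that animal(x) and flies(x) are both in the base.
    static List<String> flyingAnimals(Set<String> kb) {
        List<String> names = new ArrayList<>();
        for (String atom : kb) {
            if (atom.startsWith("animal(")) {
                String argument = atom.substring("animal".length()); // "(Bob)"
                if (kb.contains("flies" + argument)) {
                    names.add(argument.substring(1, argument.length() - 1));
                }
            }
        }
        return names;
    }

    public static void main(String[] args) {
        Set<String> facts = new LinkedHashSet<>();
        facts.add("animal(Alice)");
        facts.add("animal(Bob)");
        facts.add("clownfish(Alice)");
        facts.add("parrot(Bob)");
        System.out.println(flyingAnimals(saturate(facts))); // prints [Bob]
    }
}
```

The query is only evaluated after saturation, which is why the original facts alone (where nobody "flies") are not enough to answer it.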
  • 17. First-Order Logic. We use a decidable subset of First-Order Logic (FOL) to represent the problem. Definitions: Terms: Alice, Bob. Predicates: flies(x), swims(x), friend(x,y), between(x,y,z). Atoms: parrot(Bob), friend(Alice,Bob). Rules: ∀x [hypothesis] bird(x) → [conclusion] flies(x). According to this formalism, we have: factual knowledge: conjunction of atoms; ontology: set of rules; conjunctive query: conjunction of atoms.
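As an illustration of this formalism, the animal example from the earlier slides could be written as follows. The predicate names animal, clownfish and fish are assumptions chosen to mirror the English sentences; only parrot, bird, flies and swims appear in the slide itself.

```latex
\begin{align*}
F &= \mathit{animal}(\mathit{Alice}) \land \mathit{animal}(\mathit{Bob})
     \land \mathit{clownfish}(\mathit{Alice}) \land \mathit{parrot}(\mathit{Bob}) \\
O &= \{\, \forall x\,(\mathit{clownfish}(x) \to \mathit{fish}(x)),\;
          \forall x\,(\mathit{fish}(x) \to \mathit{swims}(x)), \\
  &\qquad \forall x\,(\mathit{parrot}(x) \to \mathit{bird}(x)),\;
          \forall x\,(\mathit{bird}(x) \to \mathit{flies}(x)) \,\} \\
Q_2 &= \exists x\,(\mathit{animal}(x) \land \mathit{flies}(x))
\end{align*}
```

Here $Q_2$ is Boolean: it asks only whether some $x$ exists, and $x = \mathit{Bob}$ is a witness once the rules of $O$ have been applied to $F$.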
  • 20. Equivalences. Depending on the chosen set of rules, our problem is semantically equivalent to other problems that are being, or have already been, studied in the literature. If O is empty, our problem becomes equivalent to the entailment problem in the RDF language. If O is a set of ∀-rules, we enter the scope of RDFS, Datalog and Conceptual Graphs (CGs). [“if x has a car, then x has a driving licence”] If O is a set of ∀∃-rules, we obtain an equivalence to the problems found in Datalog± and CGs with rules. [“if x is a human, there exists y, another human, who is its parent”]
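The two bracketed examples can be written as formulas to make the difference between the rule classes visible. The predicate names below are assumptions, and the first rule is simplified to unary predicates so that no variable is needed for the car or the licence.

```latex
\begin{align*}
\text{($\forall$-rule)} \quad
  & \forall x\,\bigl(\mathit{hasCar}(x) \to \mathit{hasDrivingLicence}(x)\bigr) \\
\text{($\forall\exists$-rule)} \quad
  & \forall x\,\bigl(\mathit{human}(x) \to
      \exists y\,(\mathit{human}(y) \land \mathit{parent}(y, x))\bigr)
\end{align*}
```

The existential variable $y$ in the second rule stands for an individual that may not be named anywhere in the facts, so applying such a rule can introduce new terms; the formula as written does not encode that $y$ is "another" human (that is, $y \neq x$).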
  • 21. Deduction. F |= Q iff there is a substitution S mapping every term of the query to a term of the facts, such that every atom of the query, once S is applied, is an atom of the facts. Problem: finding substitutions (also known as ENTAILMENT). {F, O} |= Q iff such a substitution S exists once the facts have been enriched by O. Problem: applying rules, then finding substitutions (also known as RULE-ENTAILMENT).
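A minimal backtracking search for such a substitution might look like the sketch below. It is not the algorithm used in ALASKA; the class name `HomomorphismSketch`, the encoding of an atom as a string array `[predicate, term1, ..., termN]`, and the convention that query variables start with `?` are assumptions made for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class HomomorphismSketch {

    // Assumed convention: query variables start with '?', everything else is a constant.
    static boolean isVariable(String term) {
        return term.startsWith("?");
    }

    // Returns a substitution mapping the query into the facts, or null if none exists.
    static Map<String, String> findSubstitution(List<String[]> query, List<String[]> facts) {
        Map<String, String> substitution = new HashMap<>();
        return extend(query, 0, facts, substitution) ? substitution : null;
    }

    private static boolean extend(List<String[]> query, int index,
                                  List<String[]> facts, Map<String, String> substitution) {
        if (index == query.size()) {
            return true; // every query atom has been mapped to a fact
        }
        for (String[] fact : facts) {
            List<String> bound = new ArrayList<>(); // variables bound by this attempt
            if (unify(query.get(index), fact, substitution, bound)
                    && extend(query, index + 1, facts, substitution)) {
                return true;
            }
            for (String variable : bound) {
                substitution.remove(variable); // backtrack
            }
        }
        return false;
    }

    // Tries to map one query atom onto one fact, extending the substitution.
    private static boolean unify(String[] queryAtom, String[] fact,
                                 Map<String, String> substitution, List<String> bound) {
        if (queryAtom.length != fact.length || !queryAtom[0].equals(fact[0])) {
            return false;
        }
        for (int i = 1; i < queryAtom.length; i++) {
            String term = queryAtom[i];
            if (isVariable(term)) {
                String image = substitution.get(term);
                if (image == null) {
                    substitution.put(term, fact[i]);
                    bound.add(term);
                } else if (!image.equals(fact[i])) {
                    return false;
                }
            } else if (!term.equals(fact[i])) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String[]> facts = Arrays.asList(
                new String[] {"animal", "Alice"},
                new String[] {"animal", "Bob"},
                new String[] {"flies", "Bob"});
        List<String[]> query = Arrays.asList(
                new String[] {"animal", "?x"},
                new String[] {"flies", "?x"});
        System.out.println(findSubstitution(query, facts)); // prints {?x=Bob}
    }
}
```

The search first tries `?x = Alice`, fails on `flies(?x)`, undoes the binding and succeeds with `?x = Bob`. Each candidate check corresponds to one of the elementary lookups whose cost depends on the storage system.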
  • 22. Rule application. There are two distinct methods for applying rules. Forward chaining (seen in the example): the knowledge base is enriched by applying the rules; queries are evaluated against the facts (homomorphism computation) once no more information can be added (the base is saturated). Backward chaining: the initial query is decomposed/rewritten according to the rules of the ontology; the resulting queries are then evaluated against the knowledge base, which is left unmodified.
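To contrast with forward chaining, here is a minimal sketch of backward chaining on the animal ontology of the earlier example. It is an illustration under strong assumptions: the class name `BackwardChainingSketch` is hypothetical, rules are unary (`body(x) -> head(x)`), and the query is a single atom asking whether some individual satisfies a predicate.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;

class BackwardChainingSketch {

    // Rules as {body, head} pairs, read "body(x) -> head(x)".
    static final String[][] RULES = {
        {"clownfish", "fish"}, {"fish", "swims"}, {"parrot", "bird"}, {"bird", "flies"}
    };

    // Rewrites the goal: all predicates p such that p(x) entails goal(x), goal included.
    static Set<String> rewrite(String goal) {
        Set<String> result = new LinkedHashSet<>();
        Deque<String> todo = new ArrayDeque<>();
        todo.add(goal);
        while (!todo.isEmpty()) {
            String current = todo.poll();
            if (result.add(current)) {
                for (String[] rule : RULES) {
                    if (rule[1].equals(current)) {
                        todo.add(rule[0]); // the rule body is another way to reach the goal
                    }
                }
            }
        }
        return result;
    }

    // Boolean query "is there an x with goal(x)?", evaluated on the unmodified facts.
    static boolean holds(String goal, Set<String> facts) {
        for (String predicate : rewrite(goal)) {
            for (String fact : facts) {
                if (fact.startsWith(predicate + "(")) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Set<String> facts = new HashSet<>(Arrays.asList(
                "animal(Alice)", "animal(Bob)", "clownfish(Alice)", "parrot(Bob)"));
        System.out.println(rewrite("flies"));      // prints [flies, bird, parrot]
        System.out.println(holds("flies", facts)); // prints true
    }
}
```

No fact is ever inserted here: the cost is moved from writing to the store to evaluating several rewritten queries against it.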
  • 23. Elementary operations. The efficiency of the substitution-finding and rule-application steps depends on the efficiency of some elementary operations. Finding substitutions (homomorphism): retrieving a term in the knowledge base; retrieving the adjacent terms (neighbourhood) of a given term; checking the existence of an atom with given terms. Rule application: finding substitutions; inserting new pieces of information from time to time (and not all at once).
  • 26. Overview. Until very recently... Factual knowledge = RDBMS. However, different new factors have appeared, changing the nature of the problem...
  • 27. New factors. Semi-structured data (Abiteboul, 1997): knowledge bases with “irregular, partial or implicit structure”, “very large schema”, “schema is ignored”, “schema evolving rapidly”, “difficult distinction between schema and data”, etc. Emergence of semi-structured knowledge bases over the web. KBs can now be very large (see the Semantic Web). For our work: large → does not fit in main memory.
  • 28. State of the art. What we already know about the subject... Relational databases (RDBs) handle data stored in secondary memory very well, however: using SQL for querying is not the best solution, as it relies on joins, which become very costly on larger queries; homomorphism algorithms use SQL statements for elementary operations, so their complexity also depends on the size of the tables. Graph homomorphism algorithms work very well with graphs stored in memory, but they have not yet been tested on graph databases (GDBs).
  • 30. Objectives. Three different approaches to this problem exist in the literature: 1. approximate and probabilistic algorithms; 2. algorithm optimization; 3. analysis of storage methods. We try to show that items 2 and 3 are tightly correlated. How? By investigating different storage models (RDBs, GDBs & triple stores) and their internal data structures; by using an abstract architecture to compare their efficiency on elementary operations; by writing an efficient algorithm for deduction.
  • 31. ALASKA platform. ALASKA: Abstract Logic-based Architecture Storage systems & Knowledge base Analysis. Its goal is to make it possible to perform OCQA in a logical, generic manner over existing, heterogeneous storage systems. Graph to RDB, RDB to graph, all using an intermediate translation into logic.
  • 32. Features & details. Multi-layered architecture: the program goes from higher-level operations down to disk I/O functions. Classes and interfaces ensure that all connected storage systems expose the same methods, using a common datatype (based on FOL). Written in Java: very easy to plug several pieces of code in, although with a significant loss in speed and efficiency. Systems already connected (non-definitive list): triple stores, TSs (Jena, Sesame); RDBs (MySQL, Sqlite); GDBs (DEX, Neo4j). All layers below the application layer work as the lower-level part of OCQA (and other KR problems) computation.
  • 33. Class diagram. [Figure: Class diagram of ALASKA architecture. Application layer (1): KRR operations. Abstract layer (2): interfaces IFact, IAtom, ITerm; classes Predicate, Atom, Term. Translation layer (3): GDB connectors, RDB connectors, TS connectors. Data layer (4): GDB, RDB, TS.]
  • 34. ALASKA for OCQA. Our current goal is to use ALASKA to verify the efficiency of the connected systems on elementary operations. Storage tests: measuring the time and size when storing smaller, then larger knowledge bases on disk. Querying tests: measuring the time that each system takes to answer a set of queries using different algorithms/query engines. Once both tests are done, there will be a result analysis stage: Is the best system for storage also the best for querying? Is there a system that performs excellently on a certain task?
  • 35. Storage algorithm.
      input = getInputManager();
      fact = new XFact(DB location);  // A fact is created or loaded. X ∈ {DEX, Sqlite, Neo4j, MySQL, etc.}
      atoms = input.parse(content);   // Content is parsed, an atom iterator is returned.
      fact.store(atoms);              // Atoms are added to the fact according to the storage type.
  • 36. Storage algorithm. Case for a graph database:
      Algorithm 1: KB to Hypergraph
      Input: A, an atom iterator
      Output: a boolean value
      begin
        g ← empty graph;
        foreach Atom a in A do
          foreach Term t_i in a.terms do
            if t_i is a constant term then t_i ← c:t_i;
            else t_i ← v:t_i;
            if no node with label t_i exists in g then add a node with label t_i to g;
          add hyperedge (t_1, ..., t_n) with label a.predicate to g;
        return true;
      end
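A small in-memory sketch of Algorithm 1 is given below. It does not use any graph database API: the class `HypergraphStoreSketch`, the array encoding of atoms, and the rule that terms starting with an upper-case letter are constants are all assumptions made so that the example stays self-contained.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class HypergraphStoreSketch {

    final Set<String> nodes = new LinkedHashSet<>();     // labels such as "c:Bob" or "v:x"
    final List<String[]> hyperedges = new ArrayList<>(); // [predicate, node1, ..., nodeN]

    // Assumption of this sketch: a term starting with an upper-case letter is a constant.
    static String label(String term) {
        return (Character.isUpperCase(term.charAt(0)) ? "c:" : "v:") + term;
    }

    // Stores one atom given as [predicate, term1, ..., termN].
    void store(String[] atom) {
        String[] edge = new String[atom.length];
        edge[0] = atom[0];
        for (int i = 1; i < atom.length; i++) {
            edge[i] = label(atom[i]);
            nodes.add(edge[i]); // a Set only adds the node if it does not exist yet
        }
        hyperedges.add(edge);
    }

    public static void main(String[] args) {
        HypergraphStoreSketch g = new HypergraphStoreSketch();
        g.store(new String[] {"parrot", "Bob"});
        g.store(new String[] {"friend", "Alice", "Bob"});
        g.store(new String[] {"friend", "Alice", "x"});
        System.out.println(g.nodes);             // prints [c:Bob, c:Alice, v:x]
        System.out.println(g.hyperedges.size()); // prints 3
    }
}
```

Each term becomes one node no matter how many atoms mention it, while every atom becomes its own hyperedge; in a real graph database the "node exists" check is itself one of the elementary operations being measured.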
  • 37. Storage algorithm. Case for a relational database:
      Algorithm 2: KB to RDB
      Input: A, an atom iterator
      Output: a boolean value
      begin
        foreach Atom a in A do
          p ← a.predicate;
          if no table with name p exists then
            create table with name p;
          foreach Term t_i in a.terms do
            if t_i is a constant term then t_i ← c:t_i;
            else t_i ← v:t_i;
          insert (t_1, ..., t_n) into table p;
        return true;
      end
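Algorithm 2 can be illustrated by generating the SQL text it would issue. The sketch below deliberately does not connect to a database; the class `RelationalStoreSketch`, the column names `t1`, `t2`, ... and the upper-case test for constants are assumptions, and a real connector would more likely use prepared statements and validate predicate names before using them as table names.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class RelationalStoreSketch {

    private final Set<String> tables = new LinkedHashSet<>(); // tables created so far
    final List<String> statements = new ArrayList<>();        // SQL text, in execution order

    // Assumption of this sketch: a term starting with an upper-case letter is a constant.
    static String label(String term) {
        return (Character.isUpperCase(term.charAt(0)) ? "c:" : "v:") + term;
    }

    // Stores one atom given as [predicate, term1, ..., termN]: one table per predicate.
    void store(String[] atom) {
        String table = atom[0];
        int arity = atom.length - 1;
        if (tables.add(table)) { // first atom with this predicate: create its table
            StringBuilder columns = new StringBuilder();
            for (int i = 1; i <= arity; i++) {
                if (i > 1) columns.append(", ");
                columns.append("t").append(i).append(" TEXT");
            }
            statements.add("CREATE TABLE " + table + " (" + columns + ")");
        }
        StringBuilder values = new StringBuilder();
        for (int i = 1; i <= arity; i++) {
            if (i > 1) values.append(", ");
            // Single quotes are doubled so that the literal stays well-formed.
            values.append("'").append(label(atom[i]).replace("'", "''")).append("'");
        }
        statements.add("INSERT INTO " + table + " VALUES (" + values + ")");
    }

    public static void main(String[] args) {
        RelationalStoreSketch db = new RelationalStoreSketch();
        db.store(new String[] {"parrot", "Bob"});
        db.store(new String[] {"friend", "Alice", "Bob"});
        db.store(new String[] {"friend", "Bob", "x"});
        for (String sql : db.statements) {
            System.out.println(sql);
        }
    }
}
```

With this layout, a conjunctive query over several predicates turns into a join between the corresponding tables, which is where the join cost mentioned in the state of the art comes from.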
  • 38. Workflow. [Figure: Testing protocol workflow for storing a knowledge base in RDF. Layer (1): input RDF file, RDF parser, input manager. Layer (2): IFact manager. Layer (3): IFact to RDB translation, IFact to GDB translation. Layer (4): relational DB, triple store, graph DB.]
  • 39. Input elements. For our tests, we use knowledge bases from the SP2B project: presented in 2008 at ISWC; initially a SPARQL benchmark; defines a set of queries that covers all SPARQL specifications; also features a knowledge base generator, inspired by the DBLP structure. The generator is able to create bases of any size while maintaining the same structure.
  • 42. Preliminary results. Using our platform, we have evaluated the insertion efficiency of different storage systems. [Figure: the knowledge base is transformed into an IFact, which is then stored in a relational database, a triple store and a graph database.]
  • 43. Issues. Results have shown that our method was not really appropriate for the size of the knowledge bases we aim to work with. Parsing issues: more or less memory is used by our program depending on the parsing method used; higher memory consumption at parsing = less memory available for the storage system. Transaction sizes: beyond a certain size, it is impossible to store all the information at once (most systems went to swap). Creation of an atom buffer: information is processed in pieces, parsed then stored in smaller transactions. Garbage collection: GC overhead limit errors on almost every storage system for bases beyond 20M triples. Recycling Java objects became mandatory: setting/resetting object attributes instead of creating/destroying objects.
  • 44. Changes & improvements. The algorithm was then changed to the following version:
      input = getInputManager();
      fact = new XFact(DB location);
      input.store(fact, content);  // Calling the store method makes the parser create the atom buffer (array).
    An event is thrown when the parser finishes parsing a statement. Event handling method:
      if (buffer is full) { fact.store(buffer); position = 0; }  // The fact now only stores N (buffer size) atoms at a time.
      buffer[position].setPredicate(stmtPredicate);
      buffer[position].setTerms([stmtSubject, stmtObject]);      // Atoms in buffer are now recycled (number of atoms created/destroyed = buffer size).
      position++;
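The buffering and recycling idea can be sketched as runnable Java. This is not the ALASKA code: `BufferedStoreSketch`, `RecordingFact` and `onStatement` are hypothetical names, and the back end is replaced by a stand-in that only records what it receives. The final `flush()` for a partially filled buffer is an addition of this sketch; the slide does not show how the last atoms are stored.

```java
import java.util.ArrayList;
import java.util.List;

class BufferedStoreSketch {

    // Mutable atom, so that instances can be recycled instead of re-created.
    static final class Atom {
        String predicate;
        String subject;
        String object;
    }

    // Stand-in for a storage back end: records atoms and transaction sizes.
    static final class RecordingFact {
        final List<Integer> transactionSizes = new ArrayList<>();
        final List<String> stored = new ArrayList<>();

        void store(Atom[] buffer, int count) {
            for (int i = 0; i < count; i++) {
                Atom a = buffer[i];
                stored.add(a.predicate + "(" + a.subject + "," + a.object + ")");
            }
            transactionSizes.add(count);
        }
    }

    private final Atom[] buffer;
    private final RecordingFact fact;
    private int position = 0;

    BufferedStoreSketch(RecordingFact fact, int bufferSize) {
        this.fact = fact;
        this.buffer = new Atom[bufferSize];
        for (int i = 0; i < bufferSize; i++) {
            buffer[i] = new Atom(); // the only atoms ever allocated
        }
    }

    // Called each time the parser has finished parsing one statement (triple).
    void onStatement(String subject, String predicate, String object) {
        if (position == buffer.length) {
            flush(); // buffer is full: store it in one transaction
        }
        Atom atom = buffer[position++]; // reuse an existing object
        atom.predicate = predicate;
        atom.subject = subject;
        atom.object = object;
    }

    // Stores whatever is buffered; must also be called once at the end of the input.
    void flush() {
        if (position > 0) {
            fact.store(buffer, position);
            position = 0;
        }
    }

    public static void main(String[] args) {
        RecordingFact fact = new RecordingFact();
        BufferedStoreSketch store = new BufferedStoreSketch(fact, 2);
        for (int i = 0; i < 5; i++) {
            store.onStatement("s" + i, "p", "o" + i);
        }
        store.flush();
        System.out.println(fact.transactionSizes); // prints [2, 2, 1]
    }
}
```

Because the same atom objects are overwritten for every batch, the number of live atoms stays equal to the buffer size regardless of how many triples the input contains, which is what relieves the garbage collector.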
  • 45. New results.
  • 46. Interrogation tests. The next step in the project will be to perform interrogation tests. Using the platform and a Datalog-to-SQL algorithm, we aim to evaluate the querying performance of the selected storage systems. For GDBs: comparing the efficiency of each system using the same algorithm. For RDBs: comparing the efficiency of our algorithm against the native SQL interface.
  • 51. Workflow. The workflow of the tests is detailed below. [Figure: the problem F |= Q is submitted to the abstract architecture, which addresses the relational DB (Q → SQL) and the graph DB (Q → Graph query). One test results table per system lists query size (... terms) and time (... s): rows BT and SQL for the relational DB, rows BT and Graph for the graph DB.]
  • 52. Current & future work. Currently, we focus on finding (really) difficult queries to check the algorithms' behaviour in these cases. But some questions in this field are still open: Traversal queries: can they enhance homomorphism computation? How? Real-world KBs vs. generated KBs. Can we integrate a constraint-solving program for computing homomorphisms?
  • 53. Questions. Thank you! Questions & comments...
