Jmora.di.oeg.3x1e
  • References here. To do: Halevy, Wache, Kossmann, Corcho (Haas and Arens are already there), Calvanese98, and the last two boxes, which I cannot think about now. Also all of those in the boxes on the left in query distribution.

Jmora.di.oeg.3x1e Presentation Transcript

  • 1. Query Planning for Semantic Information Integration José Mora, Óscar Corcho {jmora, ocorcho}@fi.upm.es Facultad de Informática Universidad Politécnica de Madrid Campus de Montegancedo s/n 28660 Boadilla del Monte, Madrid, Spain
  • 2. General Scenario – Semantic Information Integration. Information is distributed in several databases, and we need a global schema against which the user will write queries. When the global schema is an ontology it presents additional advantages: explicit semantics and inference, with integration happening at the semantic level. An ontology is an explicit, formal specification of a shared conceptualization; it provides a shared vocabulary which can be used to model a domain. Ontologies can be defined according to different languages, which differ in expressiveness, complexity and even decidability. The DL-Lite family was born as a group of DLs with reduced expressiveness for efficient query answering; this evolved to the OWL2 profiles EL and QL. Sources may also have their own local ontologies, which ease integration so much that some authors proposed a “semantic upgrade” of the sources. Creating mappings automatically would be desirable, but it is not trivial (e.g. PayGo from Google). Reference, cited on the slide as [Wache01 – a], [Wache01 – b] and [Wache01 – c]: H. Wache et al., “Ontology-based integration of information - a survey of existing approaches,” in: Ontologies and Information Sharing, vol. 2001, 108-117.
  • 3. Scenario - Subproblems
    Schema definition:
      • GAV: straightforward reformulation; source changes affect the system
      • LAV: easy to add & remove sources; global schema has to be stable
      • GLAV: pros of both, cons of none; harder to manage
      • Simple mappings: “simple” to generate automatically; non-constructive for integration
    Query distribution:
      • Ad-hoc approaches: PayGo (large-scale, mapping based); OBSERVER (semantic, mapping based); Battré, Quilitz (semantic, SPARQL based); Siberski (semantic, SPARQL, preferences); Networked Graphs (semantic, ad-hoc)
      • Rewriting: Bucket; Inverse rules; PICSEL
      • Path search: Bleiholder; Wang
      • Planning: SIMS; Planning-by-rewriting; HTN
      • Reasoning: Calvanese; Pérez-Urbina; SoftFacts
      • Many others
    Yes/No options: Materialization; Update information; Semantic description; Quality description
    Disparities: Lexical; Syntax; Paradigm; Terms; Concepts; Pragmatics
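The GAV trade-off named on this slide (straightforward reformulation, but source changes affect the system) can be illustrated with a minimal Python sketch. The relation names (`Water`, `src1_rivers`, `src2_lakes`, `src3_aquifers`) and the helpers `unfold_gav` and `add_source` are hypothetical and do not come from the slides; only unary relations are considered.

```python
# Hypothetical GAV mappings: each global relation is defined as a union of source relations.
GAV_MAPPINGS = {
    "Water": ["src1_rivers", "src2_lakes"],
    "Aquifer": ["src3_aquifers"],
}

def unfold_gav(global_relation, mappings):
    """Reformulate a query over one global relation by substituting its definition."""
    return list(mappings.get(global_relation, []))

def add_source(mappings, global_relation, source_relation):
    """Under GAV, adding a source means editing the definition of each global relation it feeds."""
    updated = {name: list(sources) for name, sources in mappings.items()}
    updated.setdefault(global_relation, []).append(source_relation)
    return updated

# Reformulation is a plain lookup; a new source changes the global definitions.
water_sources = unfold_gav("Water", GAV_MAPPINGS)
extended = add_source(GAV_MAPPINGS, "Water", "src4_springs")
```

Under LAV the direction is reversed: each source is described in terms of the global schema, so sources can be added without touching existing definitions, at the cost of a harder rewriting step.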
  • 4. State of the Art - Solutions [Diagram grouping existing systems by approach. Planning: SIMS, ISI (web services), Planning-by-rewriting, HTN. Distribute queries: DARQ, Battré, Siberski (preferences). Rewriting: Bucket, Inverse Rules, PICSEL. Reasoning: Calvanese, Pérez-Urbina, SoftFacts (fuzzy). Ontology based: OBSERVER. Path oriented: Bleiholder, Wang. Other labels: Semantic Databases; physical vs logical search; search for sources; search for concepts; search for concepts and sources]
  • 5. Work – Base: REQUIEM • Base: REQUIEM by Pérez-Urbina • Ontology as the global schema (DL ELHIO¬) • EL: description logic similar to DL-Lite (retains someValuesFrom) • H: role inclusions • I: inverse roles • O: basic concepts like {a} • ¬: allows negative inclusions • Rewrites to datalog queries by saturation • Logical search but not physical search (∃! local schema) [Diagram labels: Query, clausification, Clauses, prune, Clause tree, saturation, unfolding, Datalog program, Mediator, Set of queries]
  • 6. Clausification [Pérez-Urbina2010] Asunción Gómez Pérez
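As a rough illustration of what clausification produces, the sketch below turns three simple kinds of inclusion axioms into datalog-style clauses. It is not REQUIEM's implementation: the tuple encoding of axioms, the function name `clausify` and the role names are assumptions made for this example, and existential (someValuesFrom) axioms, which require function symbols, are left out.

```python
def clausify(axiom):
    """Turn one inclusion axiom, given as a (kind, sub, sup) tuple, into a clause string."""
    kind, sub, sup = axiom
    if kind == "SubClassOf":          # A ⊑ B   becomes  B(x) :- A(x)
        return f"{sup}(x) :- {sub}(x)"
    if kind == "SubRoleOf":           # P ⊑ S   becomes  S(x, y) :- P(x, y)
        return f"{sup}(x, y) :- {sub}(x, y)"
    if kind == "SubRoleOfInverse":    # P ⊑ S⁻  becomes  S(y, x) :- P(x, y)
        return f"{sup}(y, x) :- {sub}(x, y)"
    raise ValueError(f"unsupported axiom kind: {kind}")

# Hypothetical axioms; the first one mirrors a clause used in the example slides.
axioms = [
    ("SubClassOf", "Groundwater", "Continental_Water"),
    ("SubRoleOf", "flowsInto", "connectedTo"),
    ("SubRoleOfInverse", "feeds", "fedBy"),
]
clauses = [clausify(a) for a in axioms]
```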
  • 7. Work – previous work • My previous work: modification of REQUIEM • Ontology partially covered by the information source → prune • This pruning makes the process more efficient • Futile queries are not generated, so there are fewer queries in the result [Diagram labels: Query, clausification, Clauses, prune, Clause tree, saturation, unfolding, Datalog program, Mediator, Set of queries]
  • 8. Results - Efficiency • Checked time for naïve and greedy modes • Global and first modes for ontology pruning • Only one ontology, several mapping files [Bar chart: time in ms (axis 0 to 3000) for R2OO-BCN-GF, R2OO-BCN-NG, R2OO-EGM-GF, R2OO-EGM-NG, R2OO-Atlas-GF, R2OO-Atlas-NG, PU-G, PU-N]
  • 9. Results – Effectiveness – # of Clauses (~queries) (1/2) • Checked the number of clauses at several stages of the algorithm: after parsing the initial ontology; after pruning the clauses with the information relevant for the query; after saturating the clauses; after unfolding the clauses; after pruning again (only performed in greedy mode) • Checked naïve and greedy modes for inference • Checked global and first modes for ontology pruning • Only one ontology, several mapping files providing different coverages
  • 10. Results – Effectiveness – # of Clauses (~queries) (2/2) [Chart: number of clauses (axis 0 to 2500) at each stage: after parsing, after pruning (i), after saturation, after unfolding, after pruning (ii)]
  • 11. Example. Query: Q(x) :- Water(x) [Diagram of the ontology; node labels: Water, Freshwater, Seawater, Continental Water, Groundwater, Ground Stream, Aquifer, Surfacewater, Running Water, Transition Water, Still Water, Upwelling, Hydrographic phenomenon, Collector, Punctual, Junction, Hydronym, Mouth. Bold: mapped predicates] • Continental_Water(x) :- Groundwater(x) • Groundwater(x) :- Ground_Stream(x) • Continental_Water(x) :- Ground_Stream(x)
  • 12. After Pruning
    Algorithm in REQUIEM:
      • Q(x) :- Water(x)
      • Water(x) :- Freshwater(x)
      • Water(x) :- Seawater(x)
      • Water(x) :- Continental_Water(x)
      • Continental_Water(x) :- Groundwater(x)
      • Continental_Water(x) :- Surfacewater(x)
      • Groundwater(x) :- Ground_Stream(x)
      • Groundwater(x) :- Aquifer(x)
      • Surfacewater(x) :- Running_Water(x)
      • Surfacewater(x) :- Transition_Water(x)
      • Surfacewater(x) :- Upwelling(x)
      • Surfacewater(x) :- Still_Water(x)
    New algorithm (presenting now):
      • Q(x) :- Water(x)
      • Water(x) :- Freshwater(x)
      • Water(x) :- Continental_Water(x)
      • Continental_Water(x) :- Groundwater(x)
      • Continental_Water(x) :- Surfacewater(x)
      • Groundwater(x) :- Ground_Stream(x)
      • Groundwater(x) :- Aquifer(x)
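The pruning shown on this slide can be approximated with a small fixpoint computation. This is a sketch under assumptions, not the authors' algorithm: the set of mapped predicates (`MAPPED`) is a guess, since the bold marking of the original diagram is lost in the transcript, and the function name `prune` is hypothetical. Clauses are encoded as (head, body) pairs over unary predicates.

```python
# The twelve clauses listed for the REQUIEM algorithm, as (head, body) pairs.
CLAUSES = [
    ("Q", "Water"),
    ("Water", "Freshwater"),
    ("Water", "Seawater"),
    ("Water", "Continental_Water"),
    ("Continental_Water", "Groundwater"),
    ("Continental_Water", "Surfacewater"),
    ("Groundwater", "Ground_Stream"),
    ("Groundwater", "Aquifer"),
    ("Surfacewater", "Running_Water"),
    ("Surfacewater", "Transition_Water"),
    ("Surfacewater", "Upwelling"),
    ("Surfacewater", "Still_Water"),
]

# ASSUMPTION: predicates covered by the mappings (not recoverable from the transcript).
MAPPED = {"Freshwater", "Surfacewater", "Ground_Stream", "Aquifer"}

def prune(clauses, mapped):
    """Keep only clauses whose body predicate can be retrieved from the sources."""
    retrievable = set(mapped)
    changed = True
    while changed:  # a head becomes retrievable once some clause derives it from a retrievable body
        changed = False
        for head, body in clauses:
            if body in retrievable and head not in retrievable:
                retrievable.add(head)
                changed = True
    return [(head, body) for head, body in clauses if body in retrievable]

pruned = prune(CLAUSES, MAPPED)
for head, body in pruned:
    print(f"{head}(x) :- {body}(x)")
```

With the assumed mappings, the clauses over Seawater and the four subclasses of Surfacewater are dropped, which leaves the seven clauses listed for the new algorithm.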
  • 13. After saturating
    Algorithm in REQUIEM:
      • Q(x) :- Water(x)
      • Water(x) :- Freshwater(x)
      • Water(x) :- Seawater(x)
      • Water(x) :- Continental_Water(x)
      • Continental_Water(x) :- Groundwater(x)
      • Continental_Water(x) :- Surfacewater(x)
      • Groundwater(x) :- Ground_Stream(x)
      • Groundwater(x) :- Aquifer(x)
      • Surfacewater(x) :- Running_Water(x)
      • Surfacewater(x) :- Transition_Water(x)
      • Surfacewater(x) :- Upwelling(x)
      • Surfacewater(x) :- Still_Water(x)
    New algorithm (presenting now); non retrievable predicates have been removed through inference:
      • Q(x) :- Freshwater(x)
      • Q(x) :- Freshwater(x)
      • Continental_Water(x) :- Ground_Stream(x)
      • Continental_Water(x) :- Aquifer(x)
      • Continental_Water(x) :- Surfacewater(x)
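To show where the process is heading, the sketch below rewrites the example query into queries over mapped predicates only. It is a simplified stand-in: for this purely unary hierarchy a top-down traversal gives the same effect as saturation followed by unfolding, whereas REQUIEM uses resolution-based saturation. The mapped predicates (`MAPPED`) are an assumption, and `unfold` is a hypothetical helper name.

```python
# The seven clauses kept by the new algorithm after pruning, as (head, body) pairs.
PRUNED = [
    ("Q", "Water"),
    ("Water", "Freshwater"),
    ("Water", "Continental_Water"),
    ("Continental_Water", "Groundwater"),
    ("Continental_Water", "Surfacewater"),
    ("Groundwater", "Ground_Stream"),
    ("Groundwater", "Aquifer"),
]

# ASSUMPTION: predicates covered by the mappings (not recoverable from the transcript).
MAPPED = {"Freshwater", "Surfacewater", "Ground_Stream", "Aquifer"}

def unfold(query_pred, clauses, mapped):
    """Return the mapped predicates that can answer query_pred, following clauses top-down."""
    answers, seen, stack = [], set(), [query_pred]
    while stack:
        pred = stack.pop()
        if pred in seen:
            continue
        seen.add(pred)
        if pred in mapped:
            answers.append(pred)
        for head, body in clauses:
            if head == pred:
                stack.append(body)
    return sorted(answers)

# One conjunctive query per mapped predicate that contributes answers to Q.
queries = [f"Q(x) :- {pred}(x)" for pred in unfold("Q", PRUNED, MAPPED)]
```

Each resulting query mentions only predicates that the sources can answer, which is the kind of output a mediator could distribute.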
  • 14. Work – current work • @ISI: integration w/ GAV mediator, DQP, OGSA-DAI • Other mediators should be straightforward • Real tests (w/ schemas and data): not done (yet) • Always open to suggestions for future (remote) collaboration [Diagram labels: Query, clausification, Clauses, prune, Clause tree, saturation, unfolding, Datalog program, Mediator, Set of queries]
  • 15. End. Questions, comments, proposals, suggestions, … all feedback is welcome.
  • 16. Data Integration Working Group in the Ontology Engineering Group OEG Facultad de Informática Universidad Politécnica de Madrid Campus de Montegancedo sn 28660 Boadilla del Monte, Madrid http://www.oeg-upm.net Phone: 34.91.3367439, 34.91.3366605 Fax: 34.91.3524819
  • 17. Semantic e-Science • Data Integration • Ontology-based DB access: R2O and ODEMapster • Semantic Grid • S-OGSA Architecture • WS-DAIOnt-RDF(S) OGF standard • RDF(S) Grid Access Bridge [Diagram: RDF(S) Grid Access Bridge Architecture. Web Service Tier with upper service layer (Repository SelectorService), intermediate service layer (RepositoryService) and lower service layer (Resource, Class, Property, Statement, Container, List and Alt Services); RDFSConnector; RDF(S) Storage Layer with Sesame, Jena and Atlas connectors over Sesame, Jena and Atlas RDF storage]
  • 18. General scenario. Several PhD students working in a shared general scenario at UPM • Jose Mora – Query plans • Freddy Priyatna – Multi-RDB2RDF • Victor Saquicela – Automatic WS semantic annotation • Carlos Buil – Distributed SPARQL queries • Jean-Paul Calbimonte – Multi-SensorNetwork2RDF
  • 19. R2O++ - Freddy Priyatna [Diagram labels: R2O Mapping Document, R2O Parser, R2O Mapping objects, Unfolder, R2O Properties, SQL Query, Query evaluator, Result Set, R2O Postprocessor, Triples, Jena Model, Model Writer, RDF Document, DB]
  • 20. Semantic Streaming Data Access – Jean Paul Calbimonte [Diagram: a Client and Distributed Query Processing connected through a Semantic Integrator. Query path: q → query reconciliation (O-O mapping) → qr → query canonisation (R2O mappings) → Qc, with labels SPARQLSTR (Og), SPARQLSTR (O1 O2 On), SNEEql (S1 S2 Sn), SNEEql’ (S1 S2 Sn). Data path: Dc → data decanonisation → dr → data reconciliation → d, with labels [tuple l1 l2 l3], [triple O1 O2 On], [triple Og]]
  • 21. Semantic Annotation of RESTful Services – Victor Saquicela [Diagram labels: User, Internet, Web applications & API (example: SpellingSuggestions), input, output, Syntactic description, Semantic annotation, Repository]
  • 22. SparqlDQP – Carlos Buil
  • 23. Ontology Engineering Group Prof. Dr. Asunción Gómez-Pérez, Dr. Oscar Corcho Facultad de Informática Universidad Politécnica de Madrid Campus de Montegancedo sn 28660 Boadilla del Monte, Madrid http://www.oeg-upm.net {asun,ocorcho}@fi.upm.es Phone: 34.91.3367439, 34.91.3366605 Fax: 34.91.3524819 Presenter: Jose Mora (jmora@fi.upm.es)
  • 24. People • Director: A. Gómez-Pérez • Research Group (37 people): 2 Full Professors; 4 Associate Professors; 1 Assistant Professor; 3 Postdocs; 17 PhD Students; 8 MSc Students; 2 Software Engineers • Management (4): 2 Project Managers; 1 System Administrator; 1 Secretary • 50+ Past Collaborators • 10+ visitors
  • 25. Research Areas [Timeline diagram of research areas: Ontological Engineering; Natural Language Processing; (Social) Semantic Web; Semantic e-Science (Data Integration, Semantic Grid); Internet of Things. Years shown: 1995, 1997, 2000, 2004, 2008]
  • 26. Research projects [Timeline chart covering 1999 to 2013. Legend: Company, EU Project Coordinators, Spanish Projects, EU Project Participation. Labels: Katalyx; IGN/RAE/AMPER/XMEDIA; WHO/IGN Group PLATA; España Virtual/mIO!/Buscamedia; REIMDOC (FIT); Red/Gis4Gov/11811/UPnP/UpGrid/Autores3.0/WEBn+1; ContentWeb; Servicios Semánticos; GeoBuddies; 12 Ac. Especiales/Complementarias; HA98-0002; HF02-0013; MKBEEM; OntoWeb; Esperonto; PIKON; Knowledge Web; OntoGrid; SEEMP; NeOn; Marie Curie; ADMIRE; SemSorGrid4Env; DynaLearn; SEALS; MONNET]
  • 27. Ontological Engineering • METHONTOLOGY & WebODE • NeOn Methodology for building Networks of Ontologies • Ontology Scheduling • Ontology Requirement Specification • Ontology Reuse • Non Ontological Resource Reuse and Reengineering • Ontology Localization • Ontology Mapping • Ontology Design Patterns • Ontology Change Propagation [Diagram: NeOn Methodology scenarios, numbered 1 to 9. Knowledge resources: ontological resources (O. design patterns, O. repositories and registries; Flogic, RDF(S), OWL) and non ontological resources (glossaries, dictionaries, lexicons, classification schemas, taxonomies, thesauri). Activities: Ontological Resource Reuse, Non Ontological Resource Reuse, Ontology Design Pattern Reuse, O. Aligning, O. Merging, Alignments, Ontological Resource Reengineering, Non Ontological Resource Reengineering, O. Specification, O. Conceptualization, O. Formalization, O. Implementation, Ontology Restructuring (Pruning, Extension, Specialization, Modularization), O. Localization. Ontology Support Activities: Knowledge Acquisition (Elicitation); Documentation; Configuration Management; Evaluation (V&V); Assessment]
  • 28. Ontologies and Natural Language Processing (NLP) • LIR – Linguistic Information Repository • Multilingual ontologies & Label Translator • Lexico-Syntactic Patterns for automatic ontology building (Sp, En, Ge) [Screenshot of the LIR editor: lexical entries fleuve, rivière and river (part of speech: noun), with panels for synonyms, translations, lexicalization information (main entry, scientific name, grammatical number, term type), sense, source (IATE, http://iate.europa.eu/iatediff/Search...) and definitions (source: Britannica Online, http://www.britannica.com/...). Definition shown (en): stream of water of considerable volume and length that flows into the sea. Note shown (en, source http://www.cnrtl.fr/): fleuve and rivière are usually considered synonyms; however, the use of fleuve should be avoided when the stream does not flow into the sea]
  • 29. (Social) Semantic Web • Semantic Web Framework • Semantic Portals • Semantic Wikis • Annotation and Browsing Tools: Web content; Multimedia content in home environments • NeOn Methodology for building Large Scale Semantic Web Applications • Benchmarking Semantic Web Technologies • Evolution of folksonomies and ontologies
  • 30. Internet of Things • Topics • Large-scale data integration • Mobile devices • Legacy DB • Sensor networks • Sensor networks • Ubiquitous computing • User generated content • Large-scale data integration for mobile applications exploiting user-generated content
  • 31. Semantic e-Science • Data Integration • Ontology-based DB access: R2O and ODEMapster • Semantic Grid • S-OGSA Architecture • WS-DAIOnt-RDF(S) OGF standard • RDF(S) Grid Access Bridge [Diagram: RDF(S) Grid Access Bridge Architecture. Web Service Tier with upper service layer (Repository SelectorService), intermediate service layer (RepositoryService) and lower service layer (Resource, Class, Property, Statement, Container, List and Alt Services); RDFSConnector; RDF(S) Storage Layer with Sesame, Jena and Atlas connectors over Sesame, Jena and Atlas RDF storage]
  • 32. Collaboration with other research groups: Univ. of Wien, DFKI, Univ. of NR & ALS, Univ. of Augsburg, KSL. Stanford Univ., Univ. of Amsterdam, Univ. of Innsbruck, Univ. of Karlsruhe, Free Univ. of Amsterdam, Univ. of Koblenz, Univ. of Hannover, Univ. of Brasilia, Univ. of Mannheim, Univ. of Bielefeld, Free Univ. of Brussels, Forschungszentrum Informatik, Univ. of Galway (DERI), Univ. of Zurich, Ústav Informatiky, Open University, Oxford University, Academy of Sciences, Univ. of Manchester, Univ. of Liverpool, Univ. of Sheffield, Univ. of Aberdeen, Univ. of Tel Aviv, Univ. of Edinburgh, CNR, Univ. of Southampton, Univ. of Trento, INRIA, Univ. of Hull, Univ. of Athens, Univ. of Bolzano, TUC