Optique presentation

Optique aims to provide a semantic end-to-end connection between users and data sources, to enable users to rapidly formulate intuitive queries using familiar vocabularies and conceptualisations, and to return timely answers from large-scale and heterogeneous data sources.

Published in: Technology

  1. Ian Horrocks, Information Systems Group, Department of Computer Science, University of Oxford
  3. What is Big Data?
     “a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications” (Wikipedia)
  7. Case Study: Energy Services
     • Service centres responsible for remote monitoring and diagnostics of 1,000s of gas/steam turbines
     • Engineers use a variety of data for visualization, diagnostics and trend detection:
       • several TB of time-stamped sensor data
       • several GB of event data
       • data grows at 30 GB per day
     Service Requests
     • 1,000 requests per center per year
     • 80% of time used on data gathering
     • Potential saving: €50,000,000/year
     Diagnostic Functionality
     • 2–6 p/m to add new function
     • New diagnostics → better exploitation of data
     • Potential saving: incalculable
  10. Case Study: Exploration
      • Develop stratigraphic models of unexplored areas
      • Geologists & geophysicists use data from previous operations in nearby locations
      • 1,000 TB of relational data
        • using diverse schemata
        • spread over 1,000s of tables
        • and multiple databases
      Data Access
      • 900 geologists & geophysicists
      • 30-70% of time on data gathering
      • 4-day turnaround for new queries
      • Potential saving: €70,000,000/year
      Data Exploitation
      • Better use of experts' time
      • Data analysis “most important factor” for drilling success
      • Potential value: > €10bn/project
  12. Data Access Problem
      Solution: OBDA (ontology-based data access)
  15. Objectives
      • Provide semantic end-to-end connection between users and data sources
      • Enable users to rapidly formulate intuitive queries using familiar vocabularies and conceptualisations
      • Return timely answers from large scale and heterogeneous data sources
  19. Solution
      Query rewriting:
      • uses ontology & mappings
      • computationally hard
      • ontology & mappings small
      Query evaluation:
      • independent of ontology & mappings
      • computationally tractable
      • data sets very large
      Other features: support for query formulation
  20. Query Formulation
  27. Solution
      Query rewriting:
      • uses ontology & mappings
      • computationally hard
      • ontology & mappings small
      Query evaluation:
      • independent of ontology & mappings
      • computationally tractable
      • data sets very large
      Other features: “Bootstrapping” ontology & mappings
  28. Solution
      Bootstrapping: [diagram; component labels: Metadata propagator, OWL Vocabulary, SOTA Ontology, Ontology Alignment, OWL Ontology, Extended OWL Ontology, Direct Mappings, Direct Mapping Extractor]
  29. Solution
      Query rewriting:
      • uses ontology & mappings
      • computationally hard
      • ontology & mappings small
      Query evaluation:
      • independent of ontology & mappings
      • computationally tractable
      • data sets very large
      Other features: IT-expert oversees O&M (ontology & mappings) management
  30. Solution
      Query rewriting:
      • uses ontology & mappings
      • computationally hard
      • ontology & mappings small
      Query evaluation:
      • independent of ontology & mappings
      • computationally tractable
      • data sets very large
      Other features: Adapter to support streaming data
  33. Stream Adapter
      • Goal:
        • Support for data
          • generated by sensors
          • historical data
      • Challenges:
        • Time aware OBDA
          • Queries
          • Ontologies
          • Mappings
          • Data
      • STARQL query language
        • Temporalised SPARQL
  34. Solution
      Query rewriting:
      • uses ontology & mappings
      • computationally hard
      • ontology & mappings small
      Query evaluation:
      • independent of ontology & mappings
      • computationally tractable
      • data sets very large
      Other features: Distributed query execution
  35. Thank you for listening. Any questions?
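
The split between query rewriting and query evaluation on the Solution slides can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the class names, the tables `gas_units` and `steam_units`, the `SUBCLASSES` and `MAPPINGS` dictionaries, and the helper functions are hypothetical, and the expansion shown is a toy subclass unfolding, not the rewriting algorithm used in Optique.

```python
import sqlite3

# Hypothetical toy ontology: each class lists its direct subclasses.
SUBCLASSES = {
    "Turbine": ["GasTurbine", "SteamTurbine"],
}

# Hypothetical mappings: ontology class -> SQL query returning one "id" column.
MAPPINGS = {
    "GasTurbine": "SELECT id FROM gas_units",
    "SteamTurbine": "SELECT serial AS id FROM steam_units",
}


def expand(cls):
    """Return cls followed by all of its transitive subclasses."""
    found = [cls]
    for sub in SUBCLASSES.get(cls, []):
        found.extend(expand(sub))
    return found


def rewrite(cls):
    """Rewrite 'all instances of cls' into one SQL query over the sources.

    Only the (small) ontology and mappings are consulted here; no data is read.
    """
    parts = [MAPPINGS[c] for c in expand(cls) if c in MAPPINGS]
    if not parts:
        raise ValueError(f"no mapping covers class {cls!r}")
    return " UNION ".join(parts)


def demo():
    """Evaluate the rewritten query on a small in-memory database."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE gas_units (id TEXT);
        CREATE TABLE steam_units (serial TEXT);
        INSERT INTO gas_units VALUES ('g1'), ('g2');
        INSERT INTO steam_units VALUES ('s1');
        """
    )
    # Evaluation is plain SQL: the database never sees the ontology.
    rows = db.execute(rewrite("Turbine")).fetchall()
    db.close()
    return sorted(row[0] for row in rows)
```

In this sketch `rewrite("Turbine")` touches only the ontology and mappings, while `demo()` hands the resulting SQL to the database, which mirrors the division of labour the slides describe.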

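The Stream Adapter slide calls for time-aware queries over sensor data. The following Python sketch shows one common way to make a query time-aware, a sliding window over time-stamped readings. It is not STARQL and makes no claim about STARQL syntax or semantics; the function names `sliding_windows` and `rising`, the window parameters, and the sample readings are all hypothetical.

```python
def sliding_windows(readings, width, slide):
    """Yield (window_end, values) over (timestamp, value) pairs.

    A window ending at time t holds the values whose timestamp ts
    satisfies t - width < ts <= t. Windows advance by `slide`.
    """
    ordered = sorted(readings)
    if not ordered:
        return
    end = ordered[0][0]
    last = ordered[-1][0]
    while end <= last:
        yield end, [v for ts, v in ordered if end - width < ts <= end]
        end += slide


def rising(values):
    """True if the window holds at least two strictly increasing values."""
    return len(values) > 1 and all(a < b for a, b in zip(values, values[1:]))


# Hypothetical sensor readings: (timestamp, temperature).
SAMPLE = [(0, 10), (1, 12), (2, 15), (3, 11)]

# Window end times at which the temperature rose throughout the window.
ALERTS = [end for end, vals in sliding_windows(SAMPLE, width=2, slide=1)
          if rising(vals)]
```

The same function works on stored historical readings and on a buffer of recent sensor values, which are the two kinds of data the slide lists under its goal.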