Querying Heterogeneous Datasets on the Linked Data Web

The growing number of datasets published on the Web as linked data brings both opportunities for high data availability and challenges inherent to querying data in a semantically heterogeneous and distributed environment. Approaches used for querying siloed databases fail at Web-scale because users don't have an a priori understanding of all the available datasets. This article investigates the main challenges in constructing a query and search solution for linked data and analyzes existing approaches and trends.

  1. Querying Heterogeneous Datasets on the Linked Data Web: Challenges, Approaches, and Trends. André Freitas, Edward Curry, João G. Oliveira, Seán O’Riain. Digital Enterprise Research Institute, www.deri.ie. © Copyright 2009 Digital Enterprise Research Institute. All rights reserved.
  2. IEEE Internet Computing: A. Freitas, E. Curry, J. G. Oliveira, and S. O’Riain, “Querying Heterogeneous Datasets on the Linked Data Web: Challenges, Approaches, and Trends,” IEEE Internet Computing, vol. 16, no. 1, pp. 24-33, 2012. http://doi.ieeecomputersociety.org/10.1109/MIC.2011.141 http://andrefreitas.org
  3. Motivation
  4. Querying Data over the Web: The figure shows (a) a natural language query over two search engines; (b) the corresponding SPARQL representation; and (c) the semantic gap between the user’s information needs and the data representation.
  5. Expressivity-Usability Trade-Off: Expressivity-usability trade-off for querying over structured data. The blue dots indicate that an ideal query mechanism for linked data must provide both high expressivity and high usability.
  6. Challenges
  7. Challenges: The analysis examines existing approaches from the perspective of the usability-expressivity trade-off. This focus guides the categorization and analysis of existing challenges, approaches, and trends.
  8. Challenge Dimensions
     - Query Expressivity: Ability to query datasets by referencing elements in data model structure, as well as to operate over the data (aggregate results, express conditional statements, etc.)
     - Usability: Easy-to-operate, intuitive, and task-efficient query interface
     - Vocabulary-level Semantic Matching: Ability to semantically match user query terms to dataset vocabulary-level terms
  9. Challenge Dimensions
     - Entity Reconciliation: Matches entities expressed in the query to semantically equivalent dataset entities
     - Semantic Tractability: Ability to answer queries not supported by explicit dataset statements. For example, “Is Natalie Portman an Actress?” can be supported by the statement “Natalie Portman starred Star Wars,” instead of an explicit statement “Natalie Portman occupation Actress,” which might not be present in dataset
  10. Approaches
  11. Approaches
     - Information Retrieval approaches: Entity-centric search; Structure search
     - Natural Language approaches: Question Answering; Semantic best-effort natural language interfaces
  12. Entity-Centric Search: e.g., Sindice
  13. Structure Search: e.g., Semplore
  14. Question Answering: e.g., FREyA
  15. Semantic Best-Effort/NL: e.g., Treo
  16. Comparative Analysis (Approaches)
  17. Addressing the Challenges: The functionality analysis of existing approaches provides insights into how the major challenges should be addressed. This set of strategic functionalities defines the set of trends.
  18. Linked Data Web
  19. Trends
  20. Trends
     - Complementary Search and Query Services
     - User Interaction and Feedback Mechanisms
     - Semantic Best-Effort Query Model
     - Natural Language Processing Techniques
     - Distributional Semantic Model
     - External Knowledge Sources for Semantic Enrichment
     - Integrated Entity Reconciliation Techniques
  21. IEEE Internet Computing: A. Freitas, E. Curry, J. G. Oliveira, and S. O’Riain, “Querying Heterogeneous Datasets on the Linked Data Web: Challenges, Approaches, and Trends,” IEEE Internet Computing, vol. 16, no. 1, pp. 24-33, 2012. http://doi.ieeecomputersociety.org/10.1109/MIC.2011.141 http://andrefreitas.org
  22. Further Reading
     - A. Freitas, E. Curry, J. G. Oliveira, and S. O’Riain, “A Distributional Structured Semantic Space for Querying RDF Graph Data,” International Journal of Semantic Computing, vol. 5, no. 4, pp. 433-462, 201
     - S. O’Riain, E. Curry, and A. Harth, “XBRL and Open Data for Global Financial Ecosystems: A Linked Data Approach,” International Journal of Accounting Information Systems, vol. 13, no. 2, pp. 141-162, 2012.
     - A. Freitas, E. Curry, and S. O’Riain, “A Distributional Approach for Terminology-Level Semantic Search on the Linked Data Web,” in 27th ACM Symposium On Applied Computing (SAC 2012), 2012.
     - A. Freitas, J. G. Oliveira, S. O’Riain, and E. Curry, “A Multidimensional Semantic Space for Data Model Independent Queries over RDF Data,” in Fifth IEEE International Conference on Semantic Computing (ICSC 2011)
     - A. Freitas, T. Knap, S. O’Riain, and E. Curry, “W3P: Building an OPM based provenance model for the Web,” Future Generation Computer Systems, vol. 27, no. 6, pp. 766-774, Jun. 2011.
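
The vocabulary-level semantic matching dimension from slide 8 can be illustrated with a toy sketch. It is not the method used by any of the systems named in the slides. The dataset terms, the synonym table, the function name `match_term`, and the threshold are all hypothetical, and plain string similarity from Python's standard library stands in for a real semantic relatedness measure.

```python
from difflib import SequenceMatcher

# Hypothetical dataset vocabulary and hand-written synonym table (illustration only).
DATASET_TERMS = ["spouse", "birthPlace", "occupation", "starring"]
SYNONYMS = {"wife": "spouse", "husband": "spouse", "job": "occupation"}


def match_term(query_term, vocabulary=DATASET_TERMS, threshold=0.6):
    """Map a user query term to a dataset vocabulary term, or None if nothing is close."""
    term = query_term.lower()
    # Step 1: a known synonym maps directly to a vocabulary term.
    if term in SYNONYMS:
        return SYNONYMS[term]
    # Step 2: fall back to character-level similarity, a crude stand-in
    # for a semantic relatedness score.
    scored = [(SequenceMatcher(None, term, v.lower()).ratio(), v) for v in vocabulary]
    score, best = max(scored)
    return best if score >= threshold else None


if __name__ == "__main__":
    for word in ["wife", "starred", "xyz"]:
        print(word, "->", match_term(word))
```

A lookup table and string similarity only cover terms that are listed or spelled alike; the distributional semantic model and external knowledge sources listed under Trends on slide 20 are aimed at the cases this sketch misses.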
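
The semantic tractability example from slide 9 can be sketched in a few lines. This is a minimal illustration under stated assumptions, not an implementation from the article: the triple set, the extra statement that Star Wars is a film, the rule itself, and the function name `supports_occupation` are all hypothetical.

```python
# Toy dataset of (subject, predicate, object) statements.
# The first triple mirrors the slide's example; the second is an assumption
# added so the rule below has something to check.
TRIPLES = {
    ("Natalie Portman", "starred", "Star Wars"),
    ("Star Wars", "type", "Film"),
}


def supports_occupation(triples, person, occupation):
    """Return True if the occupation is stated explicitly or implied by a 'starred' statement."""
    # Case 1: the explicit statement exists in the dataset.
    if (person, "occupation", occupation) in triples:
        return True
    # Case 2 (hypothetical rule): starring in something typed as a Film
    # is taken as support for the occupation Actress or Actor.
    if occupation not in {"Actress", "Actor"}:
        return False
    return any(
        s == person and p == "starred" and (o, "type", "Film") in triples
        for (s, p, o) in triples
    )


if __name__ == "__main__":
    print(supports_occupation(TRIPLES, "Natalie Portman", "Actress"))
    print(supports_occupation(TRIPLES, "Natalie Portman", "Director"))
```

The query is answered without an explicit “Natalie Portman occupation Actress” statement, which is the situation the slide describes. A hand-written rule like this does not scale across heterogeneous datasets, so it should be read only as an illustration of the challenge.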
