We present Ontop, an open-source Ontology-Based Data Access (OBDA) system for querying relational data sources through a conceptual representation of the domain of interest, provided as an ontology to which the data sources are mapped. Key features of Ontop are its solid theoretical foundations; its virtual approach to OBDA, which avoids materializing triples and is implemented through query rewriting; extensive optimizations exploiting all elements of the OBDA architecture; its compliance with all relevant W3C recommendations (including SPARQL queries, R2RML mappings, and OWL 2 QL and RDFS ontologies); and its support for all major relational databases.
A tutorial on how to create mappings using Ontop, how inference (OWL 2 QL and RDFS) plays a role in answering SPARQL queries in Ontop, and how Ontop's support for on-the-fly SQL query translation enables scenarios of semantic data access and data integration.
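To make the virtual OBDA idea concrete, here is a toy sketch in plain Python (not Ontop's actual API or mapping syntax): a hypothetical mapping associates an ontology class with a SQL query, so a SPARQL-style class pattern is rewritten into SQL and evaluated directly on the relational source, without materializing any triples.

```python
import sqlite3

# Toy illustration of virtual OBDA: a mapping template turns a
# SPARQL-style class pattern into SQL, so no triples are materialized.
MAPPINGS = {  # hypothetical mapping: ontology class -> SQL source
    ":Student": "SELECT id, name FROM person WHERE role = 'student'",
}

def rewrite(class_iri):
    """Rewrite the pattern `?x a <class_iri>` into a SQL query."""
    return MAPPINGS[class_iri]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER, name TEXT, role TEXT)")
conn.executemany("INSERT INTO person VALUES (?, ?, ?)",
                 [(1, "Ada", "student"), (2, "Alan", "staff")])

sql = rewrite(":Student")  # SPARQL: SELECT ?x WHERE { ?x a :Student }
rows = conn.execute(sql).fetchall()
print(rows)  # only Ada matches the :Student mapping
```

Real systems like Ontop do far more (R2RML mappings, ontology-aware rewriting, SQL optimization), but the principle is the same: queries over the ontology are answered by translated SQL over the original database.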
Reasoning with Big Knowledge Graphs: Choices, Pitfalls and Proven Recipes (Ontotext)
This presentation will provide a brief introduction to logical reasoning and an overview of the most popular semantic schema and ontology languages: RDFS and the profiles of OWL 2.
While automatic reasoning has always inspired the imagination, numerous projects have failed to deliver on their promises. The typical pitfalls related to ontologies and symbolic reasoning fall into three categories:
- Over-engineered ontologies. The selected ontology language and modeling patterns can be too expressive. This can make the results of inference hard to understand and verify, which in turn makes the KG hard to evolve and maintain. It can also impose performance penalties far greater than the benefits.
- Inappropriate reasoning support. Many inference algorithms and implementation approaches work well with taxonomies and conceptual models of a few thousand concepts, but cannot cope with KGs of millions of entities.
- Inappropriate data layer architecture. One example is reasoning over a virtual KG, which is often infeasible.
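As a minimal sketch of the kind of inference RDFS provides (not Ontotext's engine, and a deliberately tiny hypothetical schema), subclass reasoning amounts to closing `rdfs:subClassOf` transitively; over millions of entities, materializing all such inferred types is exactly where the scalability pitfalls above bite.

```python
# Minimal sketch of RDFS subclass reasoning over a toy schema:
# infer every type an instance has by following rdfs:subClassOf.
SUBCLASS_OF = {            # hypothetical schema: class -> superclass
    "Dog": "Mammal",
    "Mammal": "Animal",
}

def all_types(direct_type):
    """Return the direct type plus all inferred supertypes."""
    types = [direct_type]
    while types[-1] in SUBCLASS_OF:
        types.append(SUBCLASS_OF[types[-1]])
    return types

print(all_types("Dog"))  # ['Dog', 'Mammal', 'Animal']
```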
Introduction to Text Mining and Topic Modelling (David Paule)
A brief introduction to Text Mining and Topic Modelling given at the Urban Big Data Centre (University of Glasgow).
Want to know more? Visit my website davidpaule.es
This presentation introduces NoSQL databases: their types with examples, how they differ from 40-year-old relational database management systems, their usage, and why and when we should use them.
Dmitry Kan, Principal AI Scientist at Silo AI and host of the Vector Podcast [1], will give an overview of the landscape of vector search databases and their role in NLP, along with the latest news and his view on the future of vector search. Further, he will share how he and his team participated in the Billion-Scale Approximate Nearest Neighbor Challenge and improved recall by 12% over a baseline FAISS.
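For context on the recall figure, here is a hedged sketch of how recall is typically measured in ANN benchmarks such as this challenge (the exact evaluation protocol may differ): the fraction of the true nearest neighbours that the approximate index actually returns.

```python
# Sketch of recall@k as commonly used in ANN benchmarking.
def recall_at_k(approx_ids, true_ids):
    """Fraction of the true nearest neighbours found by the ANN index."""
    return len(set(approx_ids) & set(true_ids)) / len(true_ids)

# Hypothetical query: the index returned 8 of the 10 true neighbours.
score = recall_at_k([1, 2, 3, 4, 5, 6, 7, 8, 90, 91], list(range(1, 11)))
print(score)  # 0.8
```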
Presented at https://www.meetup.com/open-nlp-meetup/events/282678520/
YouTube: https://www.youtube.com/watch?v=RM0uuMiqO8s&t=179s
Follow Vector Podcast to stay up to date on this topic: https://www.youtube.com/@VectorPodcast
"Managing the Complete Machine Learning Lifecycle with MLflow" (Databricks)
Machine Learning development brings many new complexities beyond the traditional software development lifecycle. Unlike in traditional software development, ML developers want to try multiple algorithms, tools, and parameters to get the best results, and they need to track this information to reproduce work. In addition, developers need to use many distinct systems to productionize models. To address these problems, many companies are building custom “ML platforms” that automate this lifecycle, but even these platforms are limited to a few supported algorithms and to each company’s internal infrastructure.
In this session, we introduce MLflow, a new open-source project from Databricks that aims to design an open ML platform where organizations can use any ML library and development tool of their choice to reliably build and share ML applications. MLflow introduces simple abstractions to package reproducible projects, track results, and encapsulate models that can be used with many existing tools, accelerating the ML lifecycle for organizations of any size.
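The tracking idea described above can be sketched in plain Python (this is not MLflow's real API, just a toy illustration of the concept): each run records its parameters and metrics so that results can later be compared and reproduced.

```python
import json

# Toy run tracker illustrating the experiment-tracking concept
# that MLflow implements (names and API are hypothetical).
class Run:
    def __init__(self, run_id):
        self.data = {"run_id": run_id, "params": {}, "metrics": {}}

    def log_param(self, key, value):
        self.data["params"][key] = value

    def log_metric(self, key, value):
        self.data["metrics"][key] = value

    def save(self):
        # Serialize the run so it can be stored and compared later.
        return json.dumps(self.data, sort_keys=True)

run = Run("run-001")
run.log_param("algorithm", "logreg")
run.log_param("C", 1.0)
run.log_metric("accuracy", 0.93)
print(run.save())
```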
NLTK: Natural Language Processing made easy (outsider2)
The Natural Language Toolkit (NLTK), an open-source library which simplifies the implementation of Natural Language Processing (NLP) in Python, is introduced. It is useful for getting started with NLP, and also for research and teaching.
An analysis of TensorFlow
- What is TensorFlow?
- Background
- DistBelief
- Tutorial: logistic regression
- TensorFlow internals
- Tutorial: CNN, RNN
- Benchmarks
- Other open-source frameworks
- If you are considering TensorFlow
- Installation
- References
https://telecombcn-dl.github.io/2017-dlsl/
Winter School on Deep Learning for Speech and Language. UPC BarcelonaTech ETSETB TelecomBCN.
The aim of this course is to train students in methods of deep learning for speech and language. Recurrent Neural Networks (RNN) will be presented and analyzed in detail to understand the potential of these state of the art tools for time series processing. Engineering tips and scalability issues will be addressed to solve tasks such as machine translation, speech recognition, speech synthesis or question answering. Hands-on sessions will provide development skills so that attendees can become competent in contemporary data analytics tools.
MongoDB? Elasticsearch? Are these technologies meant to compete with each other?
This is the story of an encounter between these two products. In this talk you will learn how MongoDB and Elastic can complement each other, and how to get the best of both worlds: "the right tool for the right job".
Finally, we will present the main architecture patterns for integrating these two technologies.
Application of Ontology in Semantic Information Retrieval by Prof Shahrul Azman (Khirulnizam Abd Rahman)
Application of Ontology in Semantic Information Retrieval
by Prof Shahrul Azman from FSTM, UKM
Presentation for MyREN Seminar 2014
Berjaya Hotel, Kuala Lumpur
27 November 2014
Ontology Learning from Text
Ontology construction ‘Layer Cake’
Knowledge representation and knowledge management systems
Subtasks in ontology learning
Most Popular Ontology Learning Tools
Introduction to the Data Web, DBpedia and the Life-cycle of Linked Data (Sören Auer)
Over the past 4 years, the Semantic Web activity has gained momentum with the widespread publishing of structured data as RDF. The Linked Data paradigm has therefore evolved from a practical research idea into a very promising candidate for addressing one of the biggest challenges of computer science: the exploitation of the Web as a platform for data and information integration. To translate this initial success into a world-scale reality, a number of research challenges need to be addressed: the performance gap between relational and RDF data management has to be closed, coherence and quality of data published on the Web have to be improved, provenance and trust on the Linked Data Web must be established, and generally the entrance barrier for data publishers and users has to be lowered. This tutorial will discuss approaches for tackling these challenges. As an example of a successful Linked Data project we will present DBpedia, which leverages Wikipedia by extracting structured information and making this information freely accessible on the Web. The tutorial will also outline some recent advances in DBpedia, such as the mappings wiki, DBpedia Live, as well as the recently launched DBpedia benchmark.
Vector databases are a new vertical of databases used to index and measure the similarity between different pieces of data. While they work well with structured data, Vector Similarity Search (VSS) really shines when comparing similarity in unstructured data, such as vector embeddings of images, audio, or long pieces of text.
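At its core, VSS ranks items by the similarity of their embedding vectors to a query vector. A minimal brute-force sketch in plain Python (real vector databases use approximate nearest-neighbour indexes instead of scanning; the embeddings below are made up for illustration):

```python
import math

# Brute-force cosine-similarity search over toy 2-d "embeddings".
def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

embeddings = {              # hypothetical embeddings of unstructured items
    "cat photo": [0.9, 0.1],
    "dog photo": [0.8, 0.3],
    "tax form":  [0.1, 0.9],
}

def nearest(query):
    """Return the item whose embedding is most similar to the query."""
    return max(embeddings, key=lambda k: cosine(embeddings[k], query))

print(nearest([0.95, 0.05]))  # cat photo
```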
For efficient and innovative use of big data, it is important to integrate multiple databases across domains. For example, various public databases have been developed in the life sciences, and finding novel scientific results by combining them is an essential technique. In social and business areas, open data strategies in many countries promote the diversity of public data, and combining big data with open data is a major challenge. In short, dataset diversity is a problem that big data must solve.
Ontologies provide systematized knowledge for integrating multiple datasets across domains together with their semantics. Linked Data also provides techniques for interlinking datasets based on Semantic Web technologies. We consider that combinations of ontologies and Linked Data, grounded in ontological engineering, can contribute to solving the diversity problem in big data.
In this talk, I discuss how ontological engineering could be applied to big data, with some trial examples.
Fine-tune and deploy Hugging Face NLP models (OVHcloud)
Are you currently managing AI projects that require a lot of GPU power?
Are you tired of managing the complexity of your infrastructures, GPU instances and your Kubeflow yourself?
Need flexibility for your AI platform or SaaS solution?
OVHcloud innovates in AI by offering simple and turnkey solutions to train your models and put them into production.
This talk is an introduction to the vector search engine Weaviate. You will learn how storing data as vectors enables semantic search and automatic data classification. Topics such as the underlying vector storage mechanism, and how the pre-trained language vectorization model enables it, are touched on. In addition, the presentation includes live demos to show the power of Weaviate and how you can get started with your own datasets. No prior technical knowledge is required; all concepts are illustrated with real use case examples and live demos.
Most data is unstructured. Additionally, data is often stored without context, meaning, or relation to concepts in the real world. This makes all this data difficult to index, classify, and search through. While this is traditionally solved by manual effort or expensive machine learning models, Weaviate takes another approach to the problem. Weaviate is a vector search engine which stores data as vectors and automatically adds context and meaning to new data. This makes it possible to search through the data without exact keyword matches. Moreover, data can be classified automatically.
Weaviate is completely open source, has a built-in machine learning model, has a graph-like data model, is completely API-based, and is cloud-native. Weaviate offers a GraphQL API next to RESTful endpoints to interact with the data in an intuitive manner. Additionally, Python, Go, and JavaScript clients are available to facilitate interaction between Weaviate and your applications. GraphQL and client examples will be shown in the presentation.
https://www.eventbrite.com/e/talk-by-paco-nathan-graph-analytics-in-spark-tickets-17173189472
Big Brains meetup hosted by BloomReach, 2015-06-04
Case study / demo of a large-scale graph analytics project, leveraging GraphX in Apache Spark to surface insights about open source developer communities — based on data mining of their email forums. The project works with any Apache email archive, applying NLP and machine learning techniques to analyze message threads, then constructs a large graph. Graph analytics, based on concise Scala coding examples in Spark, surface themes and interactions within the community. Results are used as feedback for respective developer communities, such as leaderboards, etc. As an example, we will examine analysis of the Spark developer community itself.
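The graph construction described above can be sketched without Spark or GraphX: build an interaction graph from (sender, replied-to) pairs mined from a mailing-list archive, then rank contributors. A toy pure-Python version, with made-up data:

```python
from collections import Counter

# Toy sketch of the email-graph leaderboard idea (plain Python, not GraphX).
replies = [  # hypothetical (sender, replied_to) pairs mined from threads
    ("alice", "bob"), ("carol", "alice"), ("bob", "alice"),
    ("dave", "alice"), ("alice", "carol"), ("bob", "dave"),
]

# Count how many interactions each community member participates in.
interactions = Counter()
for sender, receiver in replies:
    interactions[sender] += 1
    interactions[receiver] += 1

leaderboard = [name for name, _ in interactions.most_common(2)]
print(leaderboard)  # ['alice', 'bob']
```

At scale this becomes a distributed graph with NLP-derived edge weights, which is where GraphX earns its keep.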
Many Linked Data datasets model elements of their domains in the form of lists: a countable number of ordered resources. When publishing these lists in RDF, an important concern is making them easy to consume. Therefore, a well-known recommendation is to find an existing list modelling solution and reuse it. However, a specific domain model can be implemented in different ways, and vocabularies may provide alternative solutions. In this paper, we argue that a wrong decision could have a significant impact in terms of performance and, ultimately, the availability of the data. We take the case of RDF lists and make the hypothesis that the efficiency of retrieving sequential linked data depends primarily on how it is modelled (the triple-store invariance hypothesis). To demonstrate this, we survey different solutions for modelling sequences in RDF and propose a pragmatic approach for assessing their impact on data availability. Finally, we derive good (and bad) practices on how to publish lists as linked open data. By doing this, we sketch the foundations of an empirical, task-oriented methodology for benchmarking linked data modelling solutions.
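To see why the modelling choice matters for retrieval, consider the classic `rdf:List` pattern: a chain of nodes linked by `rdf:first`/`rdf:rest`, so reading the list in order costs one hop per element. A toy sketch (hypothetical node identifiers, plain dictionaries standing in for a triple store):

```python
# An rdf:List is a linked chain: each node holds (rdf:first, rdf:rest).
triples = {  # hypothetical chain encoding the list (a, b, c)
    "b0": ("a", "b1"),
    "b1": ("b", "b2"),
    "b2": ("c", "rdf:nil"),
}

def read_rdf_list(head):
    """Traverse the chain, counting the hops needed to recover the list."""
    items, hops = [], 0
    while head != "rdf:nil":
        first, rest = triples[head]
        items.append(first)
        hops += 1
        head = rest
    return items, hops

print(read_rdf_list("b0"))  # (['a', 'b', 'c'], 3)
```

Index-based alternatives (e.g. numbered membership properties) trade this sequential traversal for direct access, which is exactly the kind of difference the paper's benchmarks aim to measure.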
Towards a rebirth of data science (Data Fellas, Andy Petrella)
Nowadays, Data Science is buzzing all over the place. But what is a so-called Data Scientist?
Some will argue that a Data Scientist is a person able to report and present insights from a data set. Others will say that a Data Scientist can handle a high throughput of values and expose them in services. Yet another definition includes the capacity to create meaningful visualizations of the data.
However, we are entering an age where velocity is key. Not only is the velocity of your data high, but the time to market is shortened. Hence, the time separating the moment you receive a set of data and the moment you are able to deliver added value is crucial.
In this talk, we'll review the legacy Data Science methodologies and what they meant in terms of delivered work and results. Afterwards, we'll move towards the different concepts, techniques, and tools that Data Scientists will have to learn and appropriate in order to accomplish their tasks in the age of Big Data.
The talk closes with the Data Fellas view on a solution to these challenges, in particular thanks to the Spark Notebook and the Shar3 product we develop.
What is a distributed data science pipeline? How, with Apache Spark and friends (Andy Petrella)
What was a data product before the world changed and became so complex?
Why distributed computing and data science are the solution.
What problems does that add?
How to solve most of them using the right technologies, such as Spark Notebook, Spark, Scala, Mesos and so on, in an accompanying framework.
"SPARQL Cheat Sheet" is a short collection of slides intended to act as a guide for SPARQL developers. It includes the syntax and structure of SPARQL queries, common SPARQL prefixes and functions, and help with RDF datasets.
The "SPARQL Cheat Sheet" is intended to accompany the SPARQL By Example slides available at http://www.cambridgesemantics.com/2008/09/sparql-by-example/ .
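As a taste of what such a cheat sheet covers, here is a representative SPARQL query assembled in Python: standard prefix declarations followed by a basic graph pattern (the `rdf`, `rdfs`, and `foaf` prefixes are real W3C/FOAF namespaces; the query body is illustrative).

```python
# Common SPARQL prefixes plus a small helper to prepend them to a query.
PREFIXES = {
    "rdf":  "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "foaf": "http://xmlns.com/foaf/0.1/",
}

def build_query(body):
    header = "\n".join(f"PREFIX {p}: <{iri}>" for p, iri in PREFIXES.items())
    return f"{header}\n{body}"

query = build_query(
    "SELECT ?name WHERE { ?person rdf:type foaf:Person ; foaf:name ?name } LIMIT 10"
)
print(query)
```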
An investigation of how PostgreSQL and its latest capabilities (JSONB data type, GIN indices, Full Text Search) can be used to store, index and perform queries on structured Bibliographic Data such as MARC21/MARCXML, breaking the dependence on proprietary and arcane or obsolete software products.
Talk presented at FOSDEM 2016 in Brussels on 31/01/2016. This is a very practical & hands-on presentation with example code which is certainly not optimal ;)
Complex queries in a distributed multi-model database (Max Neunhöffer)
A multi-model database is a document store, a graph database, and a key/value store at once. To allow convenient and powerful querying, such a database needs a query language that understands all three data models and allows mixing them in a single query. For example, it should be possible to find some documents in a collection according to some criteria, then follow some edges in a graph in which the documents represent vertices, and finally join the results with documents from yet another collection.
In this talk I will explain how a query engine for such a language works, giving an overview of the life of a query: parsing, translation into an execution plan, the optimisation phase, and finally execution. I will show what distributed query execution plans look like, how the query optimiser reasons about them, and how distributed execution works.
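The three-step example query above (filter documents, traverse edges, join with another collection) can be sketched as a plain Python pipeline (this is a toy model, not any real engine's execution plan; collection names and data are made up):

```python
# Toy multi-model query: documents + edges + a second document collection.
people = {"p1": {"name": "Ada", "age": 36}, "p2": {"name": "Alan", "age": 41}}
knows  = [("p1", "p2")]                      # edge collection
cities = {"p2": {"city": "London"}}          # second document collection

def pipeline():
    # 1. FILTER: documents matching a criterion
    start = [k for k, doc in people.items() if doc["age"] < 40]
    # 2. TRAVERSE: follow 'knows' edges from those vertices
    reached = [dst for src, dst in knows if src in start]
    # 3. JOIN: combine with documents from another collection
    return [{**people[k], **cities[k]} for k in reached if k in cities]

print(pipeline())  # [{'name': 'Alan', 'age': 41, 'city': 'London'}]
```

A real engine would represent these steps as execution-plan nodes that the optimiser can reorder and distribute across servers.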
Similar to Ontop: Answering SPARQL Queries over Relational Databases
Paketo Buildpacks: the best way to build OCI images? DevopsDa... (Anthony Dahanne)
Buildpacks have existed for more than 10 years! At first, they were used to detect and build an application before deploying it on certain PaaS platforms. Then, with their latest generation, the Cloud Native Buildpacks (a CNCF incubating project), we have been able to create Docker (OCI) images. Are they a good alternative to a Dockerfile? What are the Paketo buildpacks? Which communities support them, and how?
Come find out during this ignite session.
Multiply Your Crypto Portfolio with the Innovative Features of Advanced Crypt... (Hivelance Technology)
Cryptocurrency trading bots are computer programs designed to automate buying, selling, and managing cryptocurrency transactions. These bots use advanced algorithms and machine learning techniques to analyze market data, identify trading opportunities, and execute trades on behalf of their users. By automating the decision-making process, crypto trading bots can react to market changes faster than human traders.
Hivelance, a leading provider of cryptocurrency trading bot development services, stands out as a premier choice for crypto traders and developers. Hivelance has a team of seasoned cryptocurrency experts and software engineers who deeply understand the crypto market and the latest trends in automated trading. Hivelance leverages the latest technologies and tools in the industry, including advanced AI and machine learning algorithms, to create highly efficient and adaptable crypto trading bots.
Software Engineering, Software Consulting, Tech Lead. Spring Boot, Spring Cloud, Spring Core, Spring JDBC, Spring Security, Spring Transaction, Spring MVC, Log4j, REST/SOAP web services.
Experience our free, in-depth three-part Tendenci Platform Corporate Membership Management workshop series! In Session 1 on May 14th, 2024, we began with an Introduction and Setup, mastering the configuration of your Corporate Membership Module settings to establish membership types, applications, and more. Then, on May 16th, 2024, in Session 2, we focused on binding individual members to a Corporate Membership and Corporate Reps, teaching you how to add individual members and assign Corporate Representatives to manage dues, renewals, and associated members. Finally, on May 28th, 2024, in Session 3, we covered questions and concerns, addressing any queries or issues you may have.
For more Tendenci AMS events, check out www.tendenci.com/events
Accelerate Enterprise Software Engineering with Platformless (WSO2)
Key takeaways:
- Challenges of building platforms and the benefits of platformless.
- Key principles of platformless, including API-first, cloud-native middleware, platform engineering, and developer experience.
- How Choreo enables the platformless experience.
- How key concepts like application architecture, domain-driven design, zero trust, and cell-based architecture are inherently a part of Choreo.
- Demo of an end-to-end app built and deployed on Choreo.
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart... (Globus)
The Earth System Grid Federation (ESGF) is a global network of data servers that archives and distributes the planet's largest collection of Earth system model output for thousands of climate and environmental scientists worldwide. Many of these petabyte-scale data archives are located in proximity to large high-performance computing (HPC) or cloud computing resources, but the primary workflow for data users consists of transferring data and applying computations on a different system. As part of the ESGF 2.0 US project (funded by the United States Department of Energy Office of Science), we developed pre-defined data workflows, which can be run on demand, capable of applying many data reduction and data analysis operations to the large ESGF data archives, transferring only the resulting analysis products (e.g., visualizations, smaller data files). In this talk, we will showcase a few of these workflows, highlighting how Globus Flows can be used for petabyte-scale climate analysis.
Providing Globus Services to Users of JASMIN for Environmental Data Analysis (Globus)
JASMIN is the UK’s high-performance data analysis platform for environmental science, operated by STFC on behalf of the UK Natural Environment Research Council (NERC). In addition to its role in hosting the CEDA Archive (NERC’s long-term repository for climate, atmospheric science & Earth observation data in the UK), JASMIN provides a collaborative platform to a community of around 2,000 scientists in the UK and beyond, providing nearly 400 environmental science projects with working space, compute resources and tools to facilitate their work. High-performance data transfer into and out of JASMIN has always been a key feature, with many scientists bringing model outputs from supercomputers elsewhere in the UK, to analyse against observational or other model data in the CEDA Archive. A growing number of JASMIN users are now realising the benefits of using the Globus service to provide reliable and efficient data movement and other tasks in this and other contexts. Further use cases involve long-distance (intercontinental) transfers to and from JASMIN, and collecting results from a mobile atmospheric radar system, pushing data to JASMIN via a lightweight Globus deployment. We provide details of how Globus fits into our current infrastructure, our experience of the recent migration to GCSv5.4, and of our interest in developing use of the wider ecosystem of Globus services for the benefit of our user community.
A Comprehensive Look at Generative AI in Retail App Testing.pdfkalichargn70th171
Traditional software testing methods are being challenged in retail, where customer expectations and technological advancements continually shape the landscape. Enter generative AI—a transformative subset of artificial intelligence technologies poised to revolutionize software testing.
How to Position Your Globus Data Portal for Success Ten Good PracticesGlobus
Science gateways allow science and engineering communities to access shared data, software, computing services, and instruments. Science gateways have gained a lot of traction in the last twenty years, as evidenced by projects such as the Science Gateways Community Institute (SGCI) and the Center of Excellence on Science Gateways (SGX3) in the US, The Australian Research Data Commons (ARDC) and its platforms in Australia, and the projects around Virtual Research Environments in Europe. A few mature frameworks have evolved with their different strengths and foci and have been taken up by a larger community such as the Globus Data Portal, Hubzero, Tapis, and Galaxy. However, even when gateways are built on successful frameworks, they continue to face the challenges of ongoing maintenance costs and how to meet the ever-expanding needs of the community they serve with enhanced features. It is not uncommon that gateways with compelling use cases are nonetheless unable to get past the prototype phase and become a full production service, or if they do, they don't survive more than a couple of years. While there is no guaranteed pathway to success, it seems likely that for any gateway there is a need for a strong community and/or solid funding streams to create and sustain its success. With over twenty years of examples to draw from, this presentation goes into detail for ten factors common to successful and enduring gateways that effectively serve as best practices for any new or developing gateway.
Enhancing Research Orchestration Capabilities at ORNL.pdfGlobus
Cross-facility research orchestration comes with ever-changing constraints regarding the availability and suitability of various compute and data resources. In short, a flexible data and processing fabric is needed to enable the dynamic redirection of data and compute tasks throughout the lifecycle of an experiment. In this talk, we illustrate how we easily leveraged Globus services to instrument the ACE research testbed at the Oak Ridge Leadership Computing Facility with flexible data and task orchestration capabilities.
top nidhi software solution freedownloadvrstrong314
This presentation emphasizes the importance of data security and legal compliance for Nidhi companies in India. It highlights how online Nidhi software solutions, like Vector Nidhi Software, offer advanced features tailored to these needs. Key aspects include encryption, access controls, and audit trails to ensure data security. The software complies with regulatory guidelines from the MCA and RBI and adheres to Nidhi Rules, 2014. With customizable, user-friendly interfaces and real-time features, these Nidhi software solutions enhance efficiency, support growth, and provide exceptional member services. The presentation concludes with contact information for further inquiries.
First Steps with Globus Compute Multi-User EndpointsGlobus
In this presentation we will share our experiences around getting started with the Globus Compute multi-user endpoint. Working with the Pharmacology group at the University of Auckland, we have previously written an application using Globus Compute that can offload computationally expensive steps in the researcher's workflows, which they wish to manage from their familiar Windows environments, onto the NeSI (New Zealand eScience Infrastructure) cluster. Some of the challenges we have encountered were that each researcher had to set up and manage their own single-user globus compute endpoint and that the workloads had varying resource requirements (CPUs, memory and wall time) between different runs. We hope that the multi-user endpoint will help to address these challenges and share an update on our progress here.
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...Juraj Vysvader
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I didn't get rich from it but it did have 63K downloads (powered possible tens of thousands of websites).
Developing Distributed High-performance Computing Capabilities of an Open Sci...Globus
COVID-19 had an unprecedented impact on scientific collaboration. The pandemic and its broad response from the scientific community has forged new relationships among public health practitioners, mathematical modelers, and scientific computing specialists, while revealing critical gaps in exploiting advanced computing systems to support urgent decision making. Informed by our team’s work in applying high-performance computing in support of public health decision makers during the COVID-19 pandemic, we present how Globus technologies are enabling the development of an open science platform for robust epidemic analysis, with the goal of collaborative, secure, distributed, on-demand, and fast time-to-solution analyses to support public health.
We describe the deployment and use of Globus Compute for remote computation. This content is aimed at researchers who wish to compute on remote resources using a unified programming interface, as well as system administrators who will deploy and operate Globus Compute services on their research computing infrastructure.
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoamtakuyayamamoto1800
In this slide, we show the simulation example and the way to compile this solver.
In this solver, the Helmholtz equation can be solved by helmholtzFoam. Also, the Helmholtz equation with uniformly dispersed bubbles can be simulated by helmholtzBubbleFoam.
How Does XfilesPro Ensure Security While Sharing Documents in Salesforce?XfilesPro
Worried about document security while sharing them in Salesforce? Fret no more! Here are the top-notch security standards XfilesPro upholds to ensure strong security for your Salesforce documents while sharing with internal or external people.
To learn more, read the blog: https://www.xfilespro.com/how-does-xfilespro-make-document-sharing-secure-and-seamless-in-salesforce/
SOCRadar Research Team: Latest Activities of IntelBrokerSOCRadar
The European Union Agency for Law Enforcement Cooperation (Europol) has suffered an alleged data breach after a notorious threat actor claimed to have exfiltrated data from its systems. Infamous data leaker IntelBroker posted on the even more infamous BreachForums hacking forum, saying that Europol suffered a data breach this month.
The alleged breach affected Europol agencies CCSE, EC3, Europol Platform for Experts, Law Enforcement Forum, and SIRIUS. Infiltration of these entities can disrupt ongoing investigations and compromise sensitive intelligence shared among international law enforcement agencies.
However, this is neither the first nor the last activity of IntekBroker. We have compiled for you what happened in the last few days. To track such hacker activities on dark web sources like hacker forums, private Telegram channels, and other hidden platforms where cyber threats often originate, you can check SOCRadar’s Dark Web News.
Stay Informed on Threat Actors’ Activity on the Dark Web with SOCRadar!
Ontop: Answering SPARQL Queries over Relational Databases
Guohui Xiao
Faculty of Computer Science, Free University of Bozen-Bolzano, Italy
February 12, 2016
Stanford University, CA, USA
2. Introduction Overview of Ontop SPARQL Query Answering in Ontop Use Cases Recent Progresses and Future
About Me
• Guohui Xiao, PhD
• Assistant Professor at KRDB Research Centre for Knowledge and Data,
Free University of Bozen-Bolzano, Italy
• Education
PhD in Computer Science, Vienna University of Technology, Austria
MSc and BSc in Mathematics, Peking University, China
• Research interests:
Artificial intelligence, Knowledge representation
Description logics, Ontology, Semantic Web
Ontology-based Data Access
Implementation and Optimization of reasoning systems
• Ontop team leader
• Current project: Optique (Scalable End-user Access to Big Data), EU FP7
G. Xiao (FUB) Ontop: Answering SPARQL Queries over Relational Databases 1/56
Outline
1 Introduction
2 Overview of Ontop
3 SPARQL Query Answering in Ontop
4 Use Cases
5 Recent Progresses and Future
We are Living in the Era of Big Data
[Infographic: Data Never Sleeps 2.0]
The Problem: information access
How to formulate the right question to obtain the right answer in the ocean of Big Data.
How much time is spent searching for data?
Engineers in industry spend a significant amount of their time searching for the data they require for their core tasks. For example, in the oil & gas industry, 30–70% of engineers' time is spent looking for data and assessing its quality (Crompton, 2008).
Example: Statoil Exploration
Experts in geology and geophysics develop stratigraphic models of unexplored areas on the basis of data acquired from previous operations at nearby locations.
Facts:
• 1,000 TB of relational data
• using diverse schemata
• spread over 2,000 tables, across multiple individual databases
Data Access for Exploration:
• 900 experts in Statoil Exploration.
How much time/money is spent searching for data?
A user query at Statoil
Show all Norwegian wellbores with some additional attributes (wellbore id,
.....................). Limit to all wellbores with ... and show attributes like
............................................... Limit to all wellbores with ... in .................
and show key attributes in a table. After connecting to ... we could for instance
limit further to cores in ... with ...... and where it is larger than a given value,
for instance ..... We could also find out whether there are cores in ..... which are
not stored in .... (based on .....) and where there could be .......... value. Some
of the missing data we possibly own, other not.
SELECT [...]
FROM
db_name.table1 table1,
db_name.table2 table2a,
db_name.table2 table2b,
db_name.table3 table3a,
db_name.table3 table3b,
db_name.table3 table3c,
db_name.table3 table3d,
db_name.table4 table4a,
db_name.table4 table4b,
db_name.table4 table4c,
db_name.table4 table4d,
db_name.table4 table4e,
db_name.table4 table4f,
db_name.table5 table5a,
db_name.table5 table5b,
db_name.table6 table6a,
db_name.table6 table6b,
db_name.table7 table7a,
db_name.table7 table7b,
db_name.table8 table8,
db_name.table9 table9,
db_name.table10 table10a,
db_name.table10 table10b,
db_name.table10 table10c,
db_name.table11 table11,
db_name.table12 table12,
db_name.table13 table13,
db_name.table14 table14,
db_name.table15 table15,
db_name.table16 table16
WHERE [...]
table2a.attr1='keyword' AND
table3a.attr2=table10c.attr1 AND
table3a.attr6=table6a.attr3 AND
table3a.attr9='keyword' AND
table4a.attr10 IN ('keyword') AND
table4a.attr1 IN ('keyword') AND
table5a.kinds=table4a.attr13 AND
table5b.kinds=table4c.attr74 AND
table5b.name='keyword' AND
(table6a.attr19=table10c.attr17 OR
(table6a.attr2 IS NULL AND
table10c.attr4 IS NULL)) AND
table6a.attr14=table5b.attr14 AND
table6a.attr2='keyword' AND
(table6b.attr14=table10c.attr8 OR
(table6b.attr4 IS NULL AND
table10c.attr7 IS NULL)) AND
table6b.attr19=table5a.attr55 AND
table6b.attr2='keyword' AND
table7a.attr19=table2b.attr19 AND
table7a.attr17=table15.attr19 AND
table4b.attr11='keyword' AND
table8.attr19=table7a.attr80 AND
table8.attr19=table13.attr20 AND
table8.attr4='keyword' AND
table9.attr10=table16.attr11 AND
table3b.attr19=table10c.attr18 AND
table3b.attr22=table12.attr63 AND
table3b.attr66='keyword' AND
table10a.attr54=table7a.attr8 AND
table10a.attr70=table10c.attr10 AND
table10a.attr16=table4d.attr11 AND
table4c.attr99='keyword' AND
table4c.attr1='keyword' AND
table11.attr10=table5a.attr10 AND
table11.attr40='keyword' AND
table11.attr50='keyword' AND
table2b.attr1=table1.attr8 AND
table2b.attr9 IN ('keyword') AND
table2b.attr2 LIKE 'keyword%' AND
table12.attr9 IN ('keyword') AND
table7b.attr1=table2a.attr10 AND
table3c.attr13=table10c.attr1 AND
table3c.attr10=table6b.attr20 AND
table3c.attr13='keyword' AND
table10b.attr16=table10a.attr7 AND
table10b.attr11=table7b.attr8 AND
table10b.attr13=table4b.attr89 AND
table13.attr1=table2b.attr10 AND
table13.attr20='keyword' AND
table13.attr15='keyword' AND
table3d.attr49=table12.attr18 AND
table3d.attr18=table10c.attr11 AND
table3d.attr14='keyword' AND
table4d.attr17 IN ('keyword') AND
table4d.attr19 IN ('keyword') AND
table16.attr28=table11.attr56 AND
table16.attr16=table10b.attr78 AND
table16.attr5=table14.attr56 AND
table4e.attr34 IN ('keyword') AND
table4e.attr48 IN ('keyword') AND
table4f.attr89=table5b.attr7 AND
table4f.attr45 IN ('keyword') AND
table4f.attr1='keyword' AND
table10c.attr2=table4e.attr19 AND
(table10c.attr78=table12.attr56 OR
(table10c.attr55 IS NULL AND
table12.attr17 IS NULL))
At Statoil, it takes up to 4 days to formulate a query like this in SQL.
Statoil loses up to €50,000,000 per year because of this!
Challenges Accessing Big Data
This is what happens: [cartoon figure]
Need for Abstraction
We need to facilitate access to data
• by abstracting away from how the data is stored, and
• by making use of high-level views on the data, so-called ontologies.
Ontology Based Data Access Framework
[Figure: the OBDA framework. An ONTOLOGY (= global vocabulary + conceptual view) is connected by MAPPINGS (which specify how to populate the ontology) to external, heterogeneous DATA SOURCES; queries are posed against the ontology, and results are returned at the conceptual level.]
Logical transparency in accessing data: the user
• does not know where and how the data is stored;
• can only see a conceptual view of the data.
Ontop
• Ontop is a platform to query databases through ontologies, relying on semantic technologies.
• Compliant with the relevant W3C standards.
• Supports all major relational DBs (Oracle, DB2, PostgreSQL, MySQL, etc.).
• Open-source, released under the Apache license.
• Development of Ontop:
  development started 6 years ago
  already well established: 200+ topics on the mailing list, 2,300+ downloads in the last 10 months
  currently being developed in the context of the EU project Optique
Architecture of Ontop
Figure: Architecture of the Ontop system. The core is the SPARQL query answering engine Quest. The API layer exposes the OWL-API, the Sesame Storage And Inference Layer (SAIL) API, and an R2RML API, with an OWL parser (OWL-API), a SPARQL parser (Sesame API), and JDBC for database access. Inputs: relational databases, R2RML mappings, OWL 2 QL ontologies, and SPARQL queries. The application layer comprises Protégé, the Optique Platform, and the Sesame Workbench & SPARQL endpoint.
Databases
Ontop supports standard relational database engines via JDBC.
• commercial databases: DB2, Oracle, MS SQL Server
• open-source databases: PostgreSQL, MySQL, H2, HSQL
• federated databases, e.g., Teiid (http://teiid.jboss.org) or Exareme (http://www.exareme.org), to support multiple data sources (e.g., relational databases, XML, CSV, and Web Services)
Example: Hospital Database
Table: tbl_patient

  pid | name   | type  | stage
  ----+--------+-------+------
  1   | 'Mary' | false | 4
  2   | 'John' | true  | 1

type:
• false for Non-Small Cell Lung Carcinoma (NSCLC)
• true for Small Cell Lung Carcinoma (SCLC)

stage:
• NSCLC: 1–6 for stages I, II, III, IIIa, IIIb, and IV, respectively;
• SCLC: 1 and 2 for stages Limited and Extensive, respectively.
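The example table is easy to reproduce for experimentation. A minimal sketch in Python with SQLite, assuming the underlying table is named tbl_patient and the boolean type column is stored as 0/1 (SQLite has no native boolean type):

```python
import sqlite3

# In-memory database reproducing the example table
# (tbl_patient is the assumed underlying name).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE tbl_patient (
        pid   INTEGER PRIMARY KEY,
        name  TEXT,
        type  BOOLEAN,   -- 0 = NSCLC (false), 1 = SCLC (true)
        stage INTEGER
    )
""")
conn.executemany(
    "INSERT INTO tbl_patient VALUES (?, ?, ?, ?)",
    [(1, "Mary", 0, 4), (2, "John", 1, 1)],
)
rows = conn.execute("SELECT pid, name FROM tbl_patient ORDER BY pid").fetchall()
print(rows)  # [(1, 'Mary'), (2, 'John')]
```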
Ontology
• Ontop uses RDFS and OWL 2 QL as ontology languages.
• OWL 2 QL is based on the DL-Lite family of lightweight description logics, which guarantees first-order (FO) rewritability of query answering.
Example
:NSCLC rdfs:subClassOf :LungCancer .
:SCLC rdfs:subClassOf :LungCancer .
:LungCancer rdfs:subClassOf :Neoplasm .
:hasNeoplasm rdfs:domain :Patient .
:hasNeoplasm rdfs:range :Neoplasm .
:hasName a owl:DatatypeProperty .
:hasStage a owl:ObjectProperty .
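Because of the first three axioms, every :NSCLC or :SCLC instance is also a :LungCancer and a :Neoplasm. A minimal sketch of computing such transitive superclasses (illustrative only, not Ontop's rewriting machinery):

```python
# Transitive closure of rdfs:subClassOf over the example axioms.
sub_class_of = {
    ":NSCLC": {":LungCancer"},
    ":SCLC": {":LungCancer"},
    ":LungCancer": {":Neoplasm"},
}

def superclasses(cls):
    """All (transitive) superclasses of cls."""
    result, todo = set(), [cls]
    while todo:
        for sup in sub_class_of.get(todo.pop(), ()):
            if sup not in result:
                result.add(sup)
                todo.append(sup)
    return result

print(sorted(superclasses(":NSCLC")))  # [':LungCancer', ':Neoplasm']
```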
Mappings
Ontop supports two mapping languages:
• W3C RDB2RDF mapping language R2RML
• Ontop native mapping language
Example (Mappings in Ontop native mapping language)
:db1/{pid} a :Patient .
← SELECT pid FROM tbl_patient

:db1/neoplasm/{pid} a :NSCLC .
← SELECT pid FROM tbl_patient WHERE type = false

:db1/neoplasm/{pid} a :SCLC .
← SELECT pid FROM tbl_patient WHERE type = true

:db1/{pid} :hasName {name} .
← SELECT pid, name FROM tbl_patient

:db1/{pid} :hasNeoplasm :db1/neoplasm/{pid} .
← SELECT pid FROM tbl_patient

:db1/neoplasm/{pid} :hasStage :stage-IIIa .
← SELECT pid FROM tbl_patient WHERE stage = 4 AND type = false
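Since Ontop can import and export R2RML, the first mapping above (populating :Patient) might look roughly as follows in R2RML Turtle; the prefix IRIs and the IRI template base are assumptions for illustration:

```turtle
@prefix rr: <http://www.w3.org/ns/r2rml#> .
@prefix :   <http://example.org/hospital#> .

<#PatientMap>
    rr:logicalTable [ rr:tableName "tbl_patient" ] ;
    rr:subjectMap [
        rr:template "http://example.org/db1/{pid}" ;
        rr:class :Patient
    ] .
```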
Queries
• Ontop supports essentially all features of SPARQL 1.0 as well as the OWL 2 QL
entailment regime of SPARQL 1.1.
• Implementation of other features of SPARQL 1.1 (e.g., aggregates, property path queries, negation) is work in progress.
The following SPARQL query retrieves the names of all patients who have a neoplasm (tumor) at stage IIIa.
SELECT ?name WHERE {
  ?p a :Patient ;
     :hasName ?name ;
     :hasNeoplasm ?tumor .
  ?tumor a :Neoplasm ;
         :hasStage :stage-IIIa .
}
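Under the virtual approach, such a query is answered by translating it into SQL over the database rather than by materializing triples. The sketch below shows a plausible unfolding for this query (the table name, the 0/1 boolean encoding, and the exact SQL produced are assumptions; Ontop's actual translation may differ):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl_patient (pid INTEGER, name TEXT, type BOOLEAN, stage INTEGER)"
)
conn.executemany(
    "INSERT INTO tbl_patient VALUES (?, ?, ?, ?)",
    [(1, "Mary", 0, 4), (2, "John", 1, 1)],
)

# The only mapping producing :hasStage :stage-IIIa is the one with
# "stage = 4 AND type = false", so the unfolded SQL reduces to:
unfolded_sql = """
    SELECT name FROM tbl_patient
    WHERE stage = 4 AND type = 0   -- type = false, i.e. NSCLC
"""
print(conn.execute(unfolded_sql).fetchall())  # [('Mary',)]
```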
Ontop Core API
• The core of Ontop is the SPARQL query answering engine Quest.
• We will explain the details in the next section.
API layer of Ontop
System developers can use Ontop as a Java library
• OWL API is a reference implementation for creating, manipulating, and serializing
OWL ontologies. We extended the OWLReasoner Java interface to support
SPARQL query answering.
• Sesame is a de facto standard framework for processing RDF data. Ontop implements the Sesame Storage And Inference Layer (SAIL) API, supporting inferencing and querying over relational databases.
• Available as Maven artifacts from central repository.
Application Layer of Ontop
• Command line interface
• Protégé plugin
• Sesame Workbench and SPARQL Endpoint
• Optique Platform
• Stardog
Ontop Protégé plugin
The Ontop Protégé plugin provides a graphical interface for:
• editing mappings
• executing SPARQL queries
• checking (in)consistency of the ontology
• bootstrapping ontologies and mappings from the database
• importing and exporting R2RML mappings
• materializing RDF triples, etc.
Mapping Editor in Protégé
SPARQL query answering in Protégé
Ontop plugin available from the Protégé plugin repository
Sesame workbench and SPARQL endpoint
• The Sesame OpenRDF Workbench is a web application for administering Sesame repositories.
• We extended the Workbench to create and manage Ontop repositories.
• Such repositories can then be used as standard HTTP SPARQL endpoints.
• Currently Ontop supports only Sesame v2; support for v4 is in progress.
Screenshot of the Ontop Sesame Workbench
Ontop in the Optique Architecture
Stardog
• Stardog is a commercial triplestore developed by Complexible, Inc.
• Since version 4, released in November 2015, Stardog has integrated Ontop code to support SPARQL queries over virtual RDF graphs.
• The Virtual Graph feature is available only in the enterprise edition.
Outline
1 Introduction
2 Overview of Ontop
3 SPARQL Query Answering in Ontop
4 Use Cases
5 Recent Progresses and Future
Conceptual Framework of Query Answering by Query Rewriting
Figure: conceptual framework of query answering by query rewriting. An ontological query q is rewritten w.r.t. the ontology (Rewriting), unfolded w.r.t. the mappings into an SQL query (Unfolding), and evaluated over the data sources (Evaluation); the relational answer is then translated back into the ontological answer (Result Translation).
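The four steps of this framework can be sketched in a few lines of Python. All names below (the class hierarchy, the table, and the helpers `rewrite`, `unfold`, `answer`) are illustrative stand-ins, not the Ontop API; SQLite plays the role of the relational source, with booleans encoded as 0/1.

```python
# Minimal sketch of the rewrite-unfold-evaluate pipeline
# (hypothetical data and helpers, not Ontop internals).
import sqlite3

# Ontology: subclass axioms. Rewriting expands a query over a class
# into a union over the class and all of its subclasses.
SUBCLASSES = {":Neoplasm": [":NSCLC", ":SCLC"]}

# Mappings: class -> SQL query producing its instances.
MAPPINGS = {
    ":NSCLC": "SELECT pid FROM tbl_patient WHERE type = 0",
    ":SCLC":  "SELECT pid FROM tbl_patient WHERE type = 1",
}

def rewrite(cls):
    """Rewriting: expand the queried class w.r.t. the ontology."""
    return [cls] + SUBCLASSES.get(cls, [])

def unfold(classes):
    """Unfolding: replace each class by its SQL definition (UNION)."""
    parts = [MAPPINGS[c] for c in classes if c in MAPPINGS]
    return " UNION ".join(parts)

def answer(cls, conn):
    sql = unfold(rewrite(cls))           # rewriting + unfolding
    rows = conn.execute(sql).fetchall()  # evaluation over the source
    # result translation: build IRIs from the relational answer
    return {f":db1/neoplasm/{pid}" for (pid,) in rows}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl_patient (pid INTEGER, type INTEGER)")
conn.executemany("INSERT INTO tbl_patient VALUES (?, ?)",
                 [(1, 0), (2, 1)])
print(answer(":Neoplasm", conn))  # both patients, found via subclasses
```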
Ontop Workflow
Figure: the Ontop workflow. Off-line, the Reasoner classifies the ontology, and the Mapping-Optimiser combines the mappings, the DB integrity constraints, and the classified ontology into a T-mapping. On-line, the Query Rewriter and the SPARQL-to-SQL Translator transform a SPARQL query into an SQL query.
• The off-line stage (start-up time) processes the ontology, the mappings, and the
database integrity constraints.
• The on-line stage answers SPARQL queries by rewriting them into SQL queries.
Offline Stage
The offline stage can be thought of as consisting of three phases:
• ontology classification
• T-mapping construction
• T-mapping optimization
Example
• New axioms in the classified ontology
:NSCLC rdfs:subClassOf :Neoplasm .
:SCLC rdfs:subClassOf :Neoplasm .
• Inferred mappings after T-mapping construction
:db1/neoplasm/{pid} a :Neoplasm .
← SELECT pid FROM tbl_patient WHERE type = false
:db1/neoplasm/{pid} a :Neoplasm .
← SELECT pid FROM tbl_patient WHERE type = true
• Optimized T-mapping
:db1/neoplasm/{pid} a :Neoplasm .
← SELECT pid FROM tbl_patient WHERE type = false OR type = true
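The optimization step above, which merges mappings with the same triple pattern over the same table into a single SQL query with OR-ed conditions, can be sketched as follows (illustrative code, not Ontop internals; the triple representation is an assumption).

```python
# T-mapping optimization sketch: group mappings by (head, base query)
# and merge their WHERE conditions with OR.
from collections import defaultdict

def optimize_tmappings(mappings):
    """mappings: list of (head, base_query, condition) triples."""
    grouped = defaultdict(list)
    for head, base, cond in mappings:
        grouped[(head, base)].append(cond)
    return [(head, f"{base} WHERE {' OR '.join(conds)}")
            for (head, base), conds in grouped.items()]

tm = [
    (":db1/neoplasm/{pid} a :Neoplasm",
     "SELECT pid FROM tbl_patient", "type = false"),
    (":db1/neoplasm/{pid} a :Neoplasm",
     "SELECT pid FROM tbl_patient", "type = true"),
]
# The two mappings collapse into a single one with an OR-ed condition.
print(optimize_tmappings(tm))
```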
Online Stage
During query execution (the on-line stage), Ontop transforms an input SPARQL query
into an optimized SQL query using the T-mappings and the database integrity constraints.
Optimizing the generated SQL queries
Structural optimizations
• pushing joins inside unions,
• pushing functions as high as possible in the query tree,
• eliminating sub-queries.
Semantic query optimizations
semantic analysis of SQL queries to reduce their size and complexity:
• removing redundant self-joins,
• detecting unsatisfiable or trivially valid (true) conditions.
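One instance of the "trivially valid condition" case can be checked mechanically: a disjunction of equalities over a NOT NULL column is always true when the compared values cover the column's declared domain. A minimal sketch (hypothetical helper, not Ontop code):

```python
# A disjunction "col = v1 OR ... OR col = vn" over a NOT NULL column
# is trivially valid iff {v1..vn} covers the column's domain, and can
# then be dropped from the WHERE clause.
def is_trivially_valid(equality_values, column_domain):
    return column_domain <= equality_values

BOOL = {"false", "true"}
# "type = false OR type = true" over a NOT NULL boolean column:
print(is_trivially_valid({"false", "true"}, BOOL))  # True
```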
Example of SQL translation and optimization
• Consider a SPARQL query
SELECT ?x WHERE { ?x a :Neoplasm ; :hasStage :stage-IIIa . }
• Non-optimized generated SQL query
SELECT Q1.x FROM (( SELECT concat(":db1/neoplasm/", pid) AS x
FROM tbl_patient WHERE type = false OR type = true) Q1
JOIN (SELECT concat(":db1/neoplasm/", pid) AS x
FROM tbl_patient WHERE stage = 4 AND type = false) Q2
ON Q1.x = Q2.x)
• SQL query after the structural optimization
SELECT concat(":db1/neoplasm/", Q.pid) AS x FROM
(SELECT T1.pid
FROM tbl_patient T1 JOIN tbl_patient T2 ON T1.pid = T2.pid
WHERE (T1.type = false OR T1.type = true)
AND T2.stage = 4 AND T2.type = false) Q
• SQL query after the self-join elimination
SELECT concat(":db1/neoplasm/", Q.pid) AS x FROM
(SELECT pid FROM tbl_patient WHERE type = false AND stage = 4) Q
• SQL query after the second structural optimization
SELECT concat(":db1/neoplasm/", pid) AS x FROM tbl_patient
WHERE type = false AND stage = 4
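A quick way to convince oneself that these rewritings preserve answers is to run the first and the last SQL query over a small sample table. SQLite is used as a stand-in here, booleans are encoded as 0/1, and the sample rows are invented for illustration.

```python
# Check that the unoptimized and fully optimized SQL queries return
# the same answers on sample data (SQLite stand-in, booleans as 0/1).
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl_patient (pid INTEGER, type INTEGER, stage INTEGER)")
conn.executemany("INSERT INTO tbl_patient VALUES (?, ?, ?)",
                 [(1, 0, 4), (2, 1, 4), (3, 0, 2)])

unoptimized = """
SELECT Q1.x FROM
  (SELECT ':db1/neoplasm/' || pid AS x
     FROM tbl_patient WHERE type = 0 OR type = 1) Q1
  JOIN
  (SELECT ':db1/neoplasm/' || pid AS x
     FROM tbl_patient WHERE stage = 4 AND type = 0) Q2
  ON Q1.x = Q2.x
"""
optimized = """
SELECT ':db1/neoplasm/' || pid AS x
  FROM tbl_patient WHERE type = 0 AND stage = 4
"""
a = sorted(conn.execute(unoptimized).fetchall())
b = sorted(conn.execute(optimized).fetchall())
assert a == b == [(":db1/neoplasm/1",)]
```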
Outline
1 Introduction
2 Overview of Ontop
3 SPARQL Query Answering in Ontop
4 Use Cases
5 Recent Progresses and Future
Statoil Use Case
• Optique Use Case Partner
• Main reference: “Ontology Based Access to Exploration Data at Statoil”
[Kharlamov, Hovland, et al., 2015, ISWC In-use Track].
• Exploration domain
• Improve the efficiency of the information gathering routine for geologists at Statoil
• Efficient, creative data collection from multiple data sources
• ⇒ separate slides for this use case
Siemens Use Case
• Optique Use Case Partner
• Main reference: “How Semantic Technologies Can Enhance Data Access at
Siemens Energy” [Kharlamov, Solomakhina, et al., 2014, ISWC In-use Track]
• ⇒ separate slides for this use case
EPNet Use Case
• EPNet Project (ERC Advanced Grant EPNet “Production and distribution of
food during the Roman Empire: Economics and Political Dynamics”,
ERC-2013-ADG 340828).
• Main reference: “Ontology-Based Data Integration in EPNet: Production and
Distribution of Food During the Roman Empire” [Calvanese, Liuzzo, et al., 2016,
J. of Eng. Appl. of AI]
• Ontology-based data integration: integrating multiple data sources.
• Linking three datasets: the EPNet relational repository, the Epigraphic Database
Heidelberg, and the Pleiades dataset.
EMSec Use Case
• EMSec (Echtzeitdienste für die Maritime Sicherheit, Real-time Services for Maritime Security)
is a project funded by the German BMBF (Federal Ministry of Education and Research)
• Geo-spatial support by Ontop-spatial (developed as a fork of Ontop)
• Sextant for visualizing linked geospatial data
• Use case paper “Ontology-based Data Access for Maritime Security” is under
submission
IBM Research Ireland Use Case
• Main reference: “Data Access Linking and Integration with DALI: building a
Safety Net for an Ocean of City Data” [Lopez et al., 2015, ISWC In-use Track]
• Smarter Cities Technology Centre, IBM Research, Ireland
Electronic Health Records Use Case
• Main reference: “Validating an ontology-based algorithm to identify patients with
Type 2 Diabetes Mellitus in Electronic Health Records” [Rahimi et al., 2014, Int.
J. of Medical Informatics]
• Medicine, The University of New South Wales, Australia
Electronic Health Records Use Case (Cont.)
Use Cases
• More use cases are listed at https://github.com/ontop/ontop/wiki/UseCases
• Unfortunately, we are not able to track all use cases of Ontop.
Outline
1 Introduction
2 Overview of Ontop
3 SPARQL Query Answering in Ontop
4 Use Cases
5 Recent Progresses and Future
Recent Progresses
More recent lines of research on Ontop include
• formalization of SPARQL in the context of OBDA [Rodriguez-Muro and Rezk,
2015, J. Web Semantics] [Kontchakov et al., 2014, ISWC]
• OWL 2 QL entailment regime [Kontchakov et al., 2014, ISWC]
• SWRL rule language with a limited form of recursion handled by SQL Common
Table Expressions [Xiao et al., 2014, RR]
• owl:sameAs for cross-linked datasets [Calvanese, Giese, et al., 2015, ISWC]
• Expressive ontologies beyond OWL 2 QL by rewriting and approximation with the
help of the mapping layer [Botoeva et al., 2016, AAAI]
• System description of Ontop [Calvanese, Cogrel, et al., 2016, Semantic Web J.,
to appear]
Beyond OWL 2 QL (AAAI 16 paper)
• Framework for Rewriting and Approximation of OBDA specifications
Figure: an OBDA specification (T, M, S) is transformed into a new specification (T′, M′, S).
Rewriting: the new specification is equivalent to the original one w.r.t. query
answering (query-inseparable).
Approximation: the new specification is a sound approximation of the original one
w.r.t. query answering.
Beyond OWL 2 QL (II)
Beyond OWL 2 QL (III)
WIP: OBDA beyond Relational Databases
Figure: the generalized Ontop pipeline. OFF-LINE: (a) mapping parsing, (b) ontology parsing, and (c) mapping compilation turn the mapping file and the ontology file into a mapping M, an ontology T, and a T-mapping MT. ON-LINE: (0) SPARQL parsing, (1) rewriting, (2) unfolding w.r.t. the mappings, (3) structural/semantic optimization, (4) normalization/decomposition, (5) RA-to-native query translation, (6) evaluation, and (7) post-processing transform a SPARQL query string into a SPARQL result via relational-algebra queries and native queries.
• The NoSQL movement
• In fact, most components of Ontop are SQL-independent.
• We are working on OBDA over non-relational data sources.
• We are currently targeting MongoDB.
Future
• To further improve performance, we will investigate data-dependent optimizations.
• Support larger fragments of SPARQL (e.g., aggregation, negation, and path
queries) and R2RML (e.g., named graphs).
• For end-users, improve the GUI and extend the utilities to make Ontop even
more user-friendly.
• Go beyond relational databases and support other kinds of data sources (e.g.,
graph and document databases).
• Continue building the community.
References I
Kharlamov, Evgeny, Nina Solomakhina, Özgür Lütfü Özçep, Dmitriy Zheleznyakov, Thomas Hubauer, Steffen Lamparter,
Mikhail Roshchin, Ahmet Soylu, and Stuart Watson (2014). “How Semantic Technologies Can Enhance Data Access at
Siemens Energy”. In: The Semantic Web – ISWC 2014 – 13th International Semantic Web Conference, Riva del Garda,
Italy, October 19–23, 2014. Proceedings, Part I, pp. 601–619.
Kontchakov, Roman, Martin Rezk, Mariano Rodriguez-Muro, Guohui Xiao, and Michael Zakharyaschev (2014). “Answering
SPARQL Queries over Databases under OWL 2 QL Entailment Regime”. In: Proc. of the 13th Int. Semantic Web
Conference (ISWC 2014). Vol. 8796. Lecture Notes in Computer Science. doi:10.1007/978-3-319-11964-9_35, pp. 552–567.
Rahimi, Alireza, Siaw-Teng Liaw, Jane Taggart, Pradeep Ray, and Hairong Yu (2014). “Validating an ontology-based
algorithm to identify patients with Type 2 Diabetes Mellitus in Electronic Health Records”. In: Int. J. of Medical
Informatics 83.10. doi:10.1016/j.ijmedinf.2014.06.002, pp. 768–778.
Xiao, Guohui, Martin Rezk, Mariano Rodriguez-Muro, and Diego Calvanese (2014). “Rules and Ontology Based Data
Access”. In: Proc. 8th Int. Conference on Web Reasoning and Rule Systems (RR 2014). Ed. by Marie-Laure Mugnier and
Roman Kontchakov. Lecture Notes in Computer Science. Springer.
Calvanese, Diego, Martin Giese, Dag Hovland, and Martin Rezk (2015). “Ontology-based Integration of Cross-linked
Datasets”. In: Proc. of the 14th Int. Semantic Web Conference (ISWC). Lecture Notes in Computer Science. Springer.
Kharlamov, Evgeny, Dag Hovland, et al. (2015). “Ontology Based Access to Exploration Data at Statoil”. In: The Semantic
Web - ISWC 2015 - 14th International Semantic Web Conference, Bethlehem, PA, USA, October 11-15, 2015,
Proceedings, Part II, pp. 93–112.
Lopez, Vanessa, Martin Stephenson, Spyros Kotoulas, and Pierpaolo Tommasi (2015). “Data Access Linking and Integration
with DALI: Building a Safety Net for an Ocean of City Data”. In: The Semantic Web - ISWC 2015 - 14th International
Semantic Web Conference, Bethlehem, PA, USA, October 11-15, 2015, Proceedings, Part II, pp. 186–202.
Rodriguez-Muro, Mariano and Martin Rezk (2015). “Efficient SPARQL-to-SQL with R2RML Mappings”. In: J. of Web
Semantics 33. doi:10.1016/j.websem.2015.03.001, pp. 141–169.
References II
Botoeva, Elena, Diego Calvanese, Valerio Santarelli, Domenico Fabio Savo, Alessandro Solimando, and Guohui Xiao (2016).
“Beyond OWL 2 QL in OBDA: Rewritings and Approximations”. In: Proc. of the 30th AAAI Conf. on Artificial
Intelligence (AAAI). AAAI Press.
Calvanese, Diego, Benjamin Cogrel, Sarah Komla-Ebri, Roman Kontchakov, Davide Lanti, Martin Rezk,
Mariano Rodriguez-Muro, and Guohui Xiao (2016). “Ontop: Answering SPARQL Queries over Relational Databases”. In:
Semantic Web Journal.
Calvanese, Diego, Pietro Liuzzo, Alessandro Mosca, Jose Remesal, Martin Rezk, and Guillem Rull (2016). “Ontology-Based
Data Integration in EPNet: Production and Distribution of Food During the Roman Empire”. In: Engineering Applications
of Artificial Intelligence.