Slides of my PhD presentation @ Eurecom, presenting our work on publishing and consuming geo-spatial data and government data using Semantic Web technologies.
1. Publishing and Consuming Geo-spatial and Government Data on the Semantic Web
Ghislain Auguste Atemezing
Multimedia Department, Eurecom
Supervisor:
Dr. Raphaël Troncy, MM Department, Eurecom
2. Thesis Context
§ Open Government Data benefits
Ø Transparency in decision making
Ø Better governance or e-governance
Ø Ecosystem of added-value applications
§ Barriers and challenges
Ø Heterogeneity of data formats
Ø Variety of access methods
Ø Lack of nomenclature
In this thesis we explore how Semantic Web technologies can be used for better integration and consumption of geo-spatial data.
G.A. Atemezing - Publishing and Consuming Geo-Spatial & Government Data on the Semantic Web
3. Outline
§ Introduction
§ Research Questions
§ Contributions
Ø Semantic publishing of geospatial data
Ø Visualizations of Government Linked Data
Ø Best Practices for metadata in vocabularies
§ Conclusion
§ Publications
4. Geospatial data: why it matters
“80% of needs for decisions from public authorities have
a geospatial component”.
(Philippe Grelot, IGN-France)
5. GeoData on the LOD Cloud
http://lod-cloud.net/versions/2014-08-30/
Linking Open Data cloud diagram 2014, by Max Schmachtenberg, Christian Bizer,
Anja Jentzsch and Richard Cyganiak.
http://lod-cloud.net/
In 2011: 19.43% → 31 geo-datasets in LOD
http://lod-cloud.net/state/
6. French IGN - Reference Geodata
« ..describes the French national territory and the
occupation of its land, elaborates and updates perpetual
inventory of the forest resources »
ü Different databases with overlapping information:
BD ORTHO, BD PARCELLAIRE, POINT ADRESSE, BD ALTI 25m,
BD TOPO; etc.
ü CRS: LAMBERT93 or RGF93
Q: “Give me all the bridges within a 2 km radius of the Eiffel Tower”?
A: Not straightforward
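To make the question concrete: once the bridges are available as WGS84 latitude/longitude points, the radius filter itself is short. The sketch below is only an illustration under that assumption. It uses the haversine formula on a spherical Earth, and the coordinates are approximate values chosen for the example, not IGN data. The hard part on the slide (data in Lambert93, spread over several databases) is not addressed here.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; a spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_radius(center, features, radius_km):
    """Return the names of features whose (lat, lon) lies within radius_km of center."""
    return [
        name
        for name, (lat, lon) in features.items()
        if haversine_km(center[0], center[1], lat, lon) <= radius_km
    ]


# Approximate, illustrative coordinates (assumptions, not taken from IGN data).
EIFFEL_TOWER = (48.8584, 2.2945)
BRIDGES = {
    "Pont d'Iéna": (48.8597, 2.2920),
    "Pont de Bir-Hakeim": (48.8556, 2.2876),
    "Pont Neuf": (48.8570, 2.3413),
}

nearby = within_radius(EIFFEL_TOWER, BRIDGES, 2.0)
print(nearby)
```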
7. Research Questions
2015/04/10
§ How to efficiently represent and store geospatial data on the Web to ensure interoperable applications?
Ø The publication of real datasets
Ø The interlinking with other datasets having overlapping coverage
§ What are the best options for a user to discover, browse and interact with semantic content?
Ø Can we propose a generic model for visualizing semantic content?
§ What mechanisms help preserve high-quality structured data on the Web?
Ø How to improve the reuse of vocabularies for better interoperability
Ø How to detect incompatibility between data and metadata
8. Part 1: Semantic Publication of Geo-spatial Data
9. Modeling geospatial objects
3 main components:
§ CRS
§ Features
§ Geometry
10. Coordinate Reference Systems (CRS)
§ A representation of the locations of geographic
features within a common geographic framework
§ Each CRS is defined by its measurement framework
Ø Geographic: spherical coordinates (unit: decimal degrees)
Ø Planimetric: 2D planar surface (unit: meters) + a map projection
Ø Additional measurement properties: ellipsoid, datum, standard
parallels, central meridians, etc.
§ Situation:
Ø Several hundred Geographic Coordinate Systems (WGS84, ETRS89, etc.)
Ø Several thousand Projected Coordinate Systems (UTM, Lambert93, etc.),
http://resources.esri.com/help/9.3/arcgisserver/apis/rest/pcs.html
11. CRS in France (Metropolitan + overseas)
§ 10 CRSs covering the territory (mostly RGF93)
§ 10 projections (Lambert93 is the most used)
§ 3 different ellipsoids to define the CRSs
The Web only uses WGS84, so publishers need converters for their geodata.
12. CRS Converters and Limitations
§ Circé: converter software from IGN
Ø Allows conversion between some French CRSs and WGS84
Ø Closed source, no open service
§ Existing Web tools (e.g., world coordinate converter)
Ø No open service, closed source
Ø Conversion algorithms usually do not match those of any national mapping authority.
§ Contribution: a Web-based service to perform conversion between various CRSs.
Ø Integrating such a converter eases the publication of different types of geometries on the Web, regardless of their coordinate systems.
13. Developing a REST converter service
§ WGS 84 <-> Lambert 93
§ WGS 84 <-> UTM
§ International communication
Ø Much regionally published data relies on a local CRS
§ A medium
Ø Interpret the coordinates between these CRSs
§ A converter service
Ø Open services for the community
Ø RESTful Web Service
Good accuracy of the results
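As an illustration of what the WGS 84 to Lambert 93 direction involves, here is a minimal sketch of the forward Lambert conformal conic projection (two standard parallels, Snyder's formulation) with the published Lambert-93 parameters on the GRS80 ellipsoid. It is not the code of the converter service described on the slide. It also assumes that WGS84 and RGF93 coordinates can be treated as coincident, which ignores the small datum difference a production converter would have to handle.

```python
import math

# GRS80 ellipsoid, used by RGF93.
A = 6378137.0                  # semi-major axis in metres
F = 1 / 298.257222101          # flattening
E = math.sqrt(2 * F - F * F)   # first eccentricity

# Lambert-93 definition: standard parallels, origin, false origin.
PHI1, PHI2 = math.radians(44.0), math.radians(49.0)
PHI0, LAMBDA0 = math.radians(46.5), math.radians(3.0)
FALSE_EASTING, FALSE_NORTHING = 700000.0, 6600000.0


def _m(phi):
    return math.cos(phi) / math.sqrt(1 - (E * math.sin(phi)) ** 2)


def _t(phi):
    es = E * math.sin(phi)
    return math.tan(math.pi / 4 - phi / 2) / ((1 - es) / (1 + es)) ** (E / 2)


# Cone constant and scaling terms derived from the two standard parallels.
N = (math.log(_m(PHI1)) - math.log(_m(PHI2))) / (math.log(_t(PHI1)) - math.log(_t(PHI2)))
BIG_F = _m(PHI1) / (N * _t(PHI1) ** N)
RHO0 = A * BIG_F * _t(PHI0) ** N


def wgs84_to_lambert93(lat_deg, lon_deg):
    """Project latitude/longitude in decimal degrees to Lambert-93 (x, y) in metres."""
    phi, lam = math.radians(lat_deg), math.radians(lon_deg)
    rho = A * BIG_F * _t(phi) ** N
    theta = N * (lam - LAMBDA0)
    x = FALSE_EASTING + rho * math.sin(theta)
    y = FALSE_NORTHING + RHO0 - rho * math.cos(theta)
    return x, y


# The projection origin maps onto the false origin by construction.
print(wgs84_to_lambert93(46.5, 3.0))
```

Deriving the cone constant from the standard parallels, rather than hard-coding precomputed constants, keeps the sketch checkable against the projection definition alone.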
14. Vocabularies for Modeling Features
§ Authority list of terms (e.g. Foursquare)
Ø No semantics at all!
§ SKOS Categories (e.g. GeoNames)
Ø Classes are skos:ConceptScheme, codes are skos:Concept
§ Domain specific ontologies (OrdS, GeoLD)
Ø Interconnected subdomain ontologies (transport,
hydrography, etc.)
§ Data driven ontologies (LGD, GeOnto)
Ø Deeper taxonomy to structure the ontology
Ø Many classes
15. Modeling Geometry: State of the Art
§ Point (lat/long)
Ø WGS 84 vocabulary described by W3C
§ Rectangle (“bounding box”)
Ø Geopolitical Vocabulary (FAO)
§ Points in a List
Ø Sequence of points (LinkedGeoData)
Ø An object is “formedBy” a ListOfPoints (GeoLinkedData.es)
§ Literals WKT datatype in RDF
Ø Ordnance Survey (UK), GeoSPARQL embedding CRS in literals
§ More structured representation of complex geometry
Ø NeoGeo Vocabulary (GeoVocamp), http://geovocab.org/
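To illustrate the "CRS embedded in literals" option, the sketch below builds a GeoSPARQL-style WKT literal in which an optional CRS URI precedes the geometry text. The helper name and the sample coordinates are assumptions made for this example. Axis order for a given CRS is not checked here and is left to the caller.

```python
GEOSPARQL_WKT_DATATYPE = "http://www.opengis.net/ont/geosparql#wktLiteral"
EPSG_URI = "http://www.opengis.net/def/crs/EPSG/0/{code}"


def point_wkt_literal(x, y, epsg_code=None):
    """Return a Turtle-style typed literal for a point.

    When epsg_code is given, the CRS URI is embedded in front of the WKT text;
    when it is omitted, GeoSPARQL readers assume the default CRS84 (lon lat).
    """
    wkt = f"POINT({x} {y})"
    if epsg_code is not None:
        wkt = f"<{EPSG_URI.format(code=epsg_code)}> {wkt}"
    return f'"{wkt}"^^<{GEOSPARQL_WKT_DATATYPE}>'


# Illustrative values only: a point in Lambert-93 (EPSG:2154) and one in the default CRS.
print(point_wkt_literal(648237.3, 6862271.7, 2154))
print(point_wkt_literal(2.2945, 48.8584))
```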
16. Reusing Existing Ontologies (GeOnto)
§ Ontology for geographic objects (POI)
Ø Output of a French (ANR) research project
Ø Obtained from NLP tools
§ Classes in French
Ø rdfs:labels in FR & EN
Ø No rdfs:comments
Ø Few owl:ObjectProperty
Ø 783 classes
§ Overlap with other vocabs
Ø Need for alignment
17. Aligning GeOnto with existing Ontologies
§ Alignment of GeOnto with 5 ontologies and 2 simple taxonomies
Ø LGD, DBpedia, Schema.org, GeoNames, bdtopo
Ø Foursquare, Google Places
§ Goal: finding owl:equivalentClass
Ø Tool: Silk framework
Ø Metrics: LevenshteinDistance, Jaro
Ø Labels: @en labels of the classes
Ø Aggregation function: mean
§ Manual validation
Ø For « rdfs:subClassOf »
Ø Specific alignments with GeoNames codes
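The label comparison listed above can be sketched outside Silk as follows: a normalized Levenshtein similarity and a Jaro similarity are computed on lowercased labels and averaged. This is a plain-Python approximation for illustration, not Silk's implementation. The normalization of the edit distance and the lowercasing are assumptions made here.

```python
def levenshtein(a, b):
    """Edit distance between two strings (insertions, deletions, substitutions)."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        current = [i]
        for j, cb in enumerate(b, 1):
            current.append(min(previous[j] + 1, current[j - 1] + 1, previous[j - 1] + (ca != cb)))
        previous = current
    return previous[-1]


def jaro(s1, s2):
    """Jaro similarity in [0, 1]."""
    if s1 == s2:
        return 1.0
    if not s1 or not s2:
        return 0.0
    window = max(max(len(s1), len(s2)) // 2 - 1, 0)
    matched1, matched2 = [False] * len(s1), [False] * len(s2)
    matches = 0
    for i, c in enumerate(s1):
        for j in range(max(0, i - window), min(len(s2), i + window + 1)):
            if not matched2[j] and s2[j] == c:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len(s1)):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order // 2
    return (matches / len(s1) + matches / len(s2) + (matches - transpositions) / matches) / 3


def label_similarity(label_a, label_b):
    """Mean of normalized Levenshtein similarity and Jaro similarity on lowercased labels."""
    a, b = label_a.lower(), label_b.lower()
    longest = max(len(a), len(b))
    lev_sim = 1.0 if longest == 0 else 1 - levenshtein(a, b) / longest
    return (lev_sim + jaro(a, b)) / 2


print(label_similarity("Bridge", "bridge"), label_similarity("bridge", "church"))
```

Candidate pairs scoring above a chosen threshold would then go to manual validation, as on the slide.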
18. Alignment Process (GeoNames): Results
§ High precision > 80%
§ BUT P(Schema.org) = 50%
Process:
§ Silk
Ø Look for SKOS codes that match GeOnto classes
Ø Verify the links < 70%
Ø Generate « sameAs » links
§ SPARQL endpoint
Ø Use SPARQL « CONSTRUCT » to generate a new graph.
§ Alignment file
Ø Export the RDF file
Vocab/taxonomy     #Classes   #Classes aligned
Bdtopo             237        153 (64.65%)
GeoNames           699        287 (41.06%)
Google Place (*)   126        41 (32.54%)
Schema.org (*)     296        52 (17.57%)
LGD                1294       178 (13.76%)
Foursquare         359        46 (12.81%)
DBpedia            366        42 (11.48%)
GeOnto entities are more specific to France
19. Proposal: Checklist for modeling geodata
§ Complex Geometry Coverage
Ø Need to publish more data with complex geometries
Ø Reuse and extend suitable ontologies (NeoGeo, GeoSPARQL)
§ Features MUST be connected to Geometry
Ø Sometimes this may require two namespaces
§ Serialization in other GIS formats
Ø Provide serialization in other GIS formats (GML, WKT, KML, etc.)
§ Structured Representation
Ø Use of structured representation for complex geometry
Ø This covers some of the Use Cases at IGN
20. Proposal: Vocabulary for complex Geometry
§ Extend and reuse existing vocabularies
§ Available at http://data.ign.fr/def/geometrie
@prefix ngeo: <http://geovocab.org/geometry#>.
@prefix sf: <http://www.opengis.net/ont/sf#>.
[…]
geom:Geometry a owl:Class;
    rdfs:comment "Primitive géométrique non instanciable, racine de l'ontologie des primitives géométriques. Une géométrie est associée à un système de coordonnées et un seul."@fr;
    rdfs:label "Géométrie"@fr, "Geometry"@en;
    owl:equivalentClass [ a owl:Restriction;
        owl:onClass ignf:CoordinatesSystem;
        owl:onProperty geom:crs;
        owl:qualifiedCardinality "1"^^xsd:nonNegativeInteger ];
    rdfs:subClassOf ngeo:Geometry;
    rdfs:subClassOf sf:Geometry.
21. Associating CRS to Geometries
§ A vocabulary for describing CRS
Ø Subset of ISO 19111 model
Ø Available at http://data.ign.fr/def/ignf
§ A dataset for French CRS
Ø Converted to RDF from XML data published by IGN France
Ø E.g., “Lambert 2 étendu”: http://data.ign.fr/id/ignf/crs/NTFLAMB2E
22. Overview of the vocabularies and relationships
23. A workflow for publishing Linked(Geo)Data
§ DATALIFT: a set of modular tools for “lifting” raw data in RDF.
24. Publishing French Reference Geodata in RDF
§ GEOFLA® dataset on French administrative units in RDF
Ø Features are of type http://data.ign.fr/geofla
§ French gazetteer dataset
Ø Features are of type http://data.ign.fr/topo
Mapping results with external datasets
Dataset             Dataset           #Mappings                                    Tool
GEOFLA              INSEE             37,020                                       SILK
GEOFLA              DBPEDIA-FR        23,252 (communes) and 93 departments         LIMES
GEOFLA              NUTS              105 links (14 comm., 75 depts, 16 regions)   SILK
GEOFLA              GADM              70 links (10 comm., 51 depts, 9 regions)     SILK
FRENCH GAZETTEER    LINKED GEODATA    654 links (lgdo:Amenity)                     LIMES
25. French Open Addresses in RDF Mappings
BANO2RDF addresses per city, matched against LGData amenities
LGData amenities           Paris (248052)        Marseille (401404)    Lyon (89061)
                           #matched  %matched    #matched  %matched    #matched  %matched
Shop (778680)              21171     2.71        8556      1.098       3049      0.391
Restaurant (260675)        13567     5.204       2654      1.018       1882      0.721
PostOffice (87731)         971       1.106       555       0.632       173       0.197
School (318287)            883       0.277       411       0.129       197       0.061
Parking (250516)           735       0.293       625       0.24        210       0.083
PlaceOfWorship (357445)    272       0.076       193       0.053       31        0.008
PublicBuilding (26735)     97        0.362       64        0.239       21        0.078
Building (22283)           5         0.022       12        0.053       0         0
26. Contribution to the LOD Cloud (FrLOD)
§ 340 million RDF triples (10% of DBpedia 2014)
§ Part of our work is being reused by the W3C/OGC Spatial Data Working Group
27. Why Visualizations Matter
“Don’t ask what you can do for the Semantic Web; ask what the Semantic Web can do for you!”
(D. Karger, MIT CSAIL)
1. How to bridge the gap between traditional InfoVis tools and Semantic Web technologies?
2. How can the Semantic Web help in visualization?
“If you use our Linked Data, please let us know, or we might switch off!”
(Ordnance Survey)
28. Part 2: Generating Visualizations for Linked Data
29. Visualization Categories in Government Portals
§ Study of applications consuming Open Data
Ø 13 applications from UK (7), USA (3) and France (3)
Ø Domain: education, health, transport, government,
city, housing, criminality, foreign aid
§ Different dimensions
Ø Platform (web, mobile), data sources, which views are
available (maps, charts, timeline, etc.)
Ø URL policy for identifying data objects
Ø Licenses for the application / for the data
Ø Commercial / non-commercial
30. Relevant Features in Visualization Tools
§ Data format given as input (csv, xml, shp, rdf, etc.)
§ Data access (API, dump, etc.)
§ Language code
§ Type of view
§ External libraries
§ License
§ Metadata: author, organization
31. Describing an Application: Opendatacom
Scope/Domain: Department for Communities and Local
Government, datasets access
Description: visualize available datasets (finance, housing,
deprivation, geography) by authorities or postcode.
On the dashboard, it provides graphs showing the national
distribution of a district and how the values for this local
authority compare with others in England.
Supported Platform: Web
URL Policy: http://{domain}/id/{...} with redirection to the
corresponding document at: http://{domain}/doc/{...}.
Hampshire County Council is:
http://opendatacommunities.org/id/county-council/hampshire
Data Sources: 36 datasets from DCLG, Administrative
Geography and Postcodes from Ordnance Survey.
Type of View: Graph, Map views.
Visualization Tools: google visualization API, raphael.js
License: Open Government license [OGL]
Business Value: Non commercial
32. DVIA: A vocabulary to describe Applications
[Diagram: classes of the DVIA vocabulary with their properties]
dvia:Application: dct:title, dvia:description, dvia:keyword, dvia:url, dct:creator, dvia:businessValue, dvia:scope, dvia:view
dctype:Software
dvia:Platform: dct:title, dvia:system, dvia:preferredNavigator, dvia:alternativeNavigator
dvia:VisualTool: dct:title, dct:description, dvia:accessUrl, dvia:downloadUrl
dcat:Dataset: dct:title, dcat:accessURL, dct:references, dcat:keyword
org:Organization
Relations between classes: dvia:consumes, dvia:platform
Prefixes:
@prefix dct: <http://purl.org/dc/terms/>.
@prefix dcat: <http://www.w3.org/ns/dcat#>.
@prefix dctype: <http://purl.org/dc/dcmitype/>.
@prefix org: <http://www.w3.org/ns/org#>.
@prefix dvia: <http://data.eurecom.fr/ontology/dvia#>.
33. DVIA in Real World Datasets
§ 4 applications reusing DVIA
Ø Used to describe 22 past hackathon events in Europe, with 889 applications from Apps4Europe.
a. The script will fetch the event or application information using an AJAX request.
b. After the server has responded with the event/application information, the information is displayed on the page in human-readable form by modifying the DOM of the page. Furthermore, the same information is also presented in computer-readable format by embedding RDFa in the DOM.
The embeddable script is non-optimal in the sense that the event or application information is loaded only after the actual page (where the script is embedded) has loaded. This causes some additional delay before the page is fully rendered, but the problem can be alleviated by showing loading indicators.
CSS styling for the events and applications is provided using Bootstrap. All CSS rules have been made specific to the container div-element of the plugin, in order to prevent the CSS from conflicting with the CSS of the page. In the future, another solution called shadow DOM could be used to prevent conflicts between different components of the page, but at the moment the browser support is not satisfactory. Since the event and application information is directly embedded on the page using DOM, the event organizer can add his own CSS styling to the plugin by overriding the provided CSS rules.
Fig 9. A screenshot of a test event displayed on an event organizer’s page. The content inside the red square is injected after page load by the embeddable script. Note! The red square is not visible in the actual page, it was added only for visualization purposes.
Ø Implementation of a universal JavaScript plugin to embed RDFa in organizers' event pages
Ø An extension for WordPress uses DVIA.
34. Developing Web Applications: from specific to generic approach
§ Scenario 1
Ø Known datasets, known vocabularies → specific SPARQL queries
Ø Visualizations: dataset specific
§ Example
Ø Datasets on schools in France
Ø Vocabularies: geo vocab, data cube, geometrie
Ø Application: PerfectSchool
35. Developing Web Applications: from specific to generic approach
§ Scenario 2
Ø Unknown datasets, known domains, so domain-specific SPARQL queries
Ø Visualizations: domain specific
§ Example
Ø Endpoints of geodatasets
Ø Domain: geospatial
Ø Application: GeoRDFviz
36. Developing Web Applications: from specific to generic approach
§ Scenario 3
Ø Unknown datasets, unknown domains, so generic SPARQL queries
Ø Visualizations: adapted to specific domains
Ø Any endpoints
Ø Multiple domains: geodata, statistics, persons, cross-domains, etc.
Ø Application: ?????
Related work on configuring Semantic Web widgets by data mapping [1]
Application: efficient search for Semantic News demonstrator in Cultural Heritage Dataset
Tool: ClioPatria
…but the “method [does] not apply to create interfaces on top of arbitrary SPARQL endpoints”
[1] Hildebrand, Michiel, and Jacco Van Ossenbruggen. "Configuring semantic web interfaces by data mapping." Visual Interfaces to the Social and the Semantic Web (VISSW 2009) 443 (2009): 96.
37. Our Proposal
Linked Data Visualization Wizard (LDVizWiz)
38. Requirements of LDVizWiz (LDViz-”Wise”)
§ Predefined categories associated with visual elements
§ Built on top of RDF standards
Ø e.g., SPARQL queries; Semantic Web technologies
§ Reuse existing visualization libraries
Ø e.g., Google Maps, Google Charts, D3.js, etc.
§ Reuse the On-line Library of Information Visualization (OLIVE)
§ Targets non-“RDF/SPARQL speakers”
§ Input: datasets published as LOD
39. Mapping Categories and vocabularies
§ Geographic information
Ø geo: vocab, schema:Place, etc.
§ Temporal information
Ø Time, interval ontologies
§ Event information
Ø lode, event, sport, etc.
§ Agent/Person
Ø foaf:Person/foaf:Agent
§ Organization information
Ø ORG vocabulary, foaf:Organization
§ Statistics information
Ø Data cube, SDMX model
§ Knowledge information
Ø Schemas, classifications using SKOS vocabulary
40. LDVizWiz Workflow
41. Step 1: Categories Detection
§ Detection of main categories in datasets
Ø ASK SPARQL queries on predefined categories
Ø Uses well-known vocabularies in LOV
Ø Condition the type of visual elements
Detection
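A possible shape for this detection step is sketched below: one ASK query per category, each probing for a class or property of a well-known vocabulary. The graph patterns are assumptions chosen for illustration, since the actual LDVizWiz queries are not shown on the slides. The function that sends a query to an endpoint is passed in as a parameter, so the sketch makes no network call itself.

```python
PREFIXES = {
    "geo": "http://www.w3.org/2003/01/geo/wgs84_pos#",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "org": "http://www.w3.org/ns/org#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "qb": "http://purl.org/linked-data/cube#",
    "time": "http://www.w3.org/2006/time#",
    "lode": "http://linkedevents.org/ontology/",
}

# Hypothetical probe patterns, one per category named on the slides.
CATEGORY_PATTERNS = {
    "GEO DATA": "?s geo:lat ?lat ; geo:long ?long .",
    "EVENT DATA": "?s a lode:Event .",
    "TIME DATA": "?s a time:Interval .",
    "SKOS DATA": "?s a skos:Concept .",
    "ORG DATA": "?s a org:Organization .",
    "PERSON DATA": "?s a foaf:Person .",
    "STAT DATA": "?s a qb:Observation .",
}


def build_ask_query(category):
    """Return the ASK query (with only the prefixes it needs) for a category."""
    pattern = CATEGORY_PATTERNS[category]
    used = [p for p in PREFIXES if f"{p}:" in pattern]
    header = "\n".join(f"PREFIX {p}: <{PREFIXES[p]}>" for p in used)
    return f"{header}\nASK {{ {pattern} }}"


def detect_categories(ask):
    """ask: callable taking an ASK query string and returning True or False."""
    return [c for c in CATEGORY_PATTERNS if ask(build_ask_query(c))]


# Stand-in for an endpoint that holds geographic and person data.
def fake_endpoint(query):
    return "geo:lat" in query or "foaf:Person" in query


print(detect_categories(fake_endpoint))
```

In a real setting, `ask` would wrap an HTTP request to a SPARQL endpoint and parse the boolean result.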
42. Experiment: Categories Detection
Ø 444 endpoints (*) analyzed, 278 good answers (62.61%) using ASK queries.
Ø Few taxonomies in SKOS, many GEO DATA
Category       Number   %
GEO DATA       97       21.84%
EVENT DATA     16       3.60%
TIME DATA      27       6.08%
SKOS DATA      2        0.45%
ORG DATA       48       10.81%
PERSON DATA    59       13.28%
STAT DATA      29       6.6%
§ Applications
Ø Automatic detection of endpoint categories
Ø More “trustable” than human tagging
Ø Map detected categories to “suitable” visual elements for the visualizations (e.g., timeline + maps for event data)
(*) All the endpoints retrieved from sparqles.org
Detection
43. Step 2: Properties Aggregation
§ Build candidate properties for visualization
Ø For pop-up menus
Ø For facet browsing
Ø For charts display
§ Retrieve properties from external datasets
Ø So-called “enriched properties”
§ Goal: exploit the “connectors” between graphs
§ “Connectors” are used to enrich a given graph
Ø e.g., owl:sameAs, rdfs:seeAlso, skos:exactMatch
Detection Aggregation
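The idea of following connectors can be sketched with triples held in memory: starting from a resource, collect the targets of its connector links, then gather the properties those targets have in an external graph. The resource names and values below are made up for the example. A real implementation would query remote endpoints instead of Python lists.

```python
CONNECTORS = {"owl:sameAs", "rdfs:seeAlso", "skos:exactMatch"}


def enriched_properties(resource, local_graph, external_graph):
    """Return sorted (property, value) pairs found on resources linked via connectors."""
    targets = {o for s, p, o in local_graph if s == resource and p in CONNECTORS}
    return sorted((p, o) for s, p, o in external_graph if s in targets)


# Toy triples (hypothetical identifiers and values).
local_graph = [
    ("ex:commune1", "rdfs:label", "Commune 1"),
    ("ex:commune1", "owl:sameAs", "ext:placeA"),
]
external_graph = [
    ("ext:placeA", "ext:population", "toy-value"),
    ("ext:placeA", "ext:mayor", "toy-name"),
    ("ext:placeB", "ext:population", "other-value"),
]

print(enriched_properties("ex:commune1", local_graph, external_graph))
```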
44. Step 3: Publication
§ Visualization Generator
Ø Recommend the visual elements based on categories
Ø Transform ASK queries to SELECT or CONSTRUCT queries for input to the visual library.
§ Visualization Publisher
Ø Export the description of the visualization in RDF
Ø Add metadata for the visualization (charts) and the steps used to create it.
Ø e.g., dcat:Dataset, prov:wasDerivedFrom, chart vocabulary, void:ExampleResource.
Detection Aggregation Publication
45. Current Implementation
§ Lightweight JavaScript version as a “proof of concept”
§ URL: http://semantics.eurecom.fr/datalift/rdfViz/apps/
46. Part 3: Best Practices for metadata in vocabularies
47. W3C Government Linked Data Best Practices
§ 10 best practices to help governments worldwide access and reuse their data by taking advantage of Linked Data mechanisms.
§ 4 steps to reach 5★-rated datasets on TimBL's scale.
(1) Dataset selection
(2) Dataset preparation
(3) Dataset conversion in RDF
(4) Dataset publication and advertisement
Four steps corresponding to the best practices to publish Linked Data by Governments.
48. LOV in a Nutshell: http://lov.okfn.org/dataset/lov/
§ A curated list of vocabularies
Ø More than 495 vocabularies
Ø Each of them described by vocabulary-of-a-friend (voaf)
Ø Provides a dump in N3 of the different versions of a vocabulary
Ø Near-linear growth, starting from 75 vocabularies
49. Focus on vocabularies: Disambiguating Vocabulary Prefixes
Goal: align services against Linked Open Vocabularies to harmonize and manage vocabularies’ namespaces
§ Global namespaces
Ø With good practices to recommend a prefix
Ø Have a more transparent list of built-in prefixes
Ø All the services understand each other with prefixes
Ø Some de facto prefixes emerging: rdfs:, foaf:, rdf:, owl:, skos:
50. LOV vs PREFIX.CC: Alignment Findings
Findings during the alignment process
Category              Number   Share
lov-able vocabs       227      14%
Intersect-prefixes    188      11%
vocabs in LOV         321      19%
vocabs in prefix.cc   925      56%
More than 200 prefixes in prefix.cc are vocabularies
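One way to picture such an alignment is to compare two prefix-to-namespace maps and sort the prefixes into agreements, conflicts, and entries present in only one service. The bucket names below are chosen for this sketch and do not reproduce the categories of the slide, whose exact definitions are not given. The `geom` and `toy` entries are made-up examples.

```python
def align_prefixes(lov, prefix_cc):
    """Compare two prefix -> namespace maps and bucket the prefixes."""
    shared = lov.keys() & prefix_cc.keys()
    return {
        "agree": sorted(p for p in shared if lov[p] == prefix_cc[p]),
        "conflict": sorted(p for p in shared if lov[p] != prefix_cc[p]),
        "only_lov": sorted(lov.keys() - prefix_cc.keys()),
        "only_prefix_cc": sorted(prefix_cc.keys() - lov.keys()),
    }


lov = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "geom": "http://example.org/a#",  # hypothetical namespace
}
prefix_cc = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "geom": "http://example.org/b#",  # hypothetical conflicting namespace
    "toy": "http://example.org/toy#",  # hypothetical entry
}

print(align_prefixes(lov, prefix_cc))
```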
51. Vocabulary Search and Ranking
Goal: ranking vocabularies based on Information Content metrics
§ Metrics
Ø Information Content Metric (IC): value of information associated with a given entity
Ø Partition Information Content Metric (PIC)
Ø Proposed a ranking based on IC and PIC
52. Ranking Algorithm
1. Candidate terms selection in LOV
2. Grouping terms by namespace & weight assignment
3. Compute IC score
4. Compute PIC score
5. Output ranking
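The slides do not spell out the formulas, so the sketch below rests on two stated assumptions: IC of a term is the negative base-2 logarithm of its relative frequency (the usual information-theoretic definition), and PIC of a vocabulary is the sum of the IC values of its terms. The weight assignment (wf) of the grouping step is left out, and the term counts are toy values.

```python
import math
from collections import defaultdict


def information_content(freq):
    """IC(term) = -log2(relative frequency of the term). Assumed definition."""
    total = sum(freq.values())
    return {term: -math.log2(count / total) for term, count in freq.items()}


def partition_information_content(freq):
    """PIC(namespace) = sum of IC over the terms of that namespace. Assumed definition."""
    pic = defaultdict(float)
    for term, value in information_content(freq).items():
        namespace = term.split(":", 1)[0]
        pic[namespace] += value
    return dict(pic)


def rank_vocabularies(freq):
    """Return namespaces ordered by decreasing PIC."""
    pic = partition_information_content(freq)
    return sorted(pic, key=pic.get, reverse=True)


# Toy usage counts for prefixed terms (not real LOV statistics).
term_counts = {"dcterms:title": 8, "dcterms:creator": 4, "foaf:name": 4}

print(partition_information_content(term_counts))
print(rank_vocabularies(term_counts))
```

Under this definition a rarely used term carries more information than a frequent one, which is why the interlinking use case on a later slide refers to terms by their IC value.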
53. Ranking Algorithm
§ dcterms: http://purl.org/dc/terms/
§ Candidate terms: 53 (39 properties + 14 classes)
§ wf = 1 + 2 + 3 = 6
§ PIC = 1724.844
§ foaf: http://xmlns.com/foaf/0.1/
§ Candidate terms: 35 (26 properties + 9 classes)
§ wf = 1 + 2 + 3 = 6
§ PIC = 1033.197
PIC(dcterms) > PIC(foaf)
54. Ranking Algorithm
§ Top-15 terms (IC value)
§ Top-15 vocabs (PIC value)
55. Applications of the Ranking Metrics
§ Vocabulary life-cycle management
Ø Helps assess the use of terms and vocabulary updates
Ø Monitoring the use of owl:deprecated or http://www.w3.org/2003/06/sw-vocab-status/ns#term_status
§ Semantic Web applications
Ø Vocabularies with higher PIC can be proposed to a user first, e.g. for choosing properties to display in a faceted browsing interface
§ Interlinking datasets
Ø Generate sameAs links between resources based on vocabulary terms with lower IC values
56. Licenses Compatibility
Goal: reasoning on licenses for checking compatibilities between vocabularies and datasets
§ Approach:
Ø License retrieval both for the dataset and for the vocabularies used in the dataset.
Ø Deontic logic [1] to compute compatibility using an RDF representation of licenses
[Pie chart: distribution of licenses. Slice shares as extracted: 9%, 3%, 47%, 1%, 3%, 2%, 3%, 3%, 5%, 5%, 2%, 2%, 9%, 6%. Legend entries:]
Ø Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported
Ø Creative Commons Attribution-ShareAlike 3.0 Unported
Ø Creative Commons Attribution 3.0 Unported
Ø Creative Commons Public Domain Mark 1.0
Ø Licence Ouverte / Open Licence
Ø Creative Commons Attribution-ShareAlike 3.0 United States
Ø Creative Commons Zero Public Domain Dedication
Ø Apache License Version 2.0
Ø ODC Public Domain Dedication and Licence (PDDL)
Ø ISA Open Metadata License 1.1
Ø W3C Software Notice and License
Ø Creative Commons Attribution-ShareAlike 3.0 United States
Ø Creative Commons Attribution-NoDerivs 3.0 Unported
Ø Creative Commons Attribution-NonCommercial-ShareAlike 2.0 Generic
[1] Governatori, Guido, et al. "One License to Compose Them All." The Semantic Web–ISWC 2013. Springer Berlin Heidelberg, 2013. 151-166.
57. Our Proposal: LIVE Framework
[Diagram: LIVE framework]
§ Input: dataset D
§ Licenses retrieval module
Ø Retrieve vocabularies used in the dataset (LOV)
Ø Retrieve licenses for selected vocabularies
§ Licenses compatibility module
Ø Input: vocabularies and data licenses
Ø Check consistency of licensing information for dataset D
§ Output, e.g.: “Warning: licenses are not compatible”
http://www.eurecom.fr/~atemezin/licenseChecker/
58. The Logic
§ RDF licenses: http://purl.org/NET/rdflicense
§ Logic of deontic rules:
Ø constructive account of basic deontic modalities (obligation,
prohibition, permission)
Ø compute the set of all conclusions for each license and then
check whether incompatible conclusions are obtained.
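As a much simplified picture of this check, each license can be reduced to three sets of actions (permissions, obligations, prohibitions), and two licenses flagged as incompatible when one permits or obliges an action that the other prohibits. This set-based sketch is an assumption made for illustration and is not the deontic logic of LIVE. The license names and action labels are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class License:
    name: str
    permissions: frozenset = frozenset()
    obligations: frozenset = frozenset()
    prohibitions: frozenset = frozenset()


def conflicts(a, b):
    """Actions that one license permits or obliges while the other prohibits."""
    allowed_a = a.permissions | a.obligations
    allowed_b = b.permissions | b.obligations
    return (allowed_a & b.prohibitions) | (allowed_b & a.prohibitions)


def compatible(a, b):
    return not conflicts(a, b)


# Hypothetical licenses for a vocabulary and a dataset.
vocab_license = License(
    name="Toy-Attribution",
    permissions=frozenset({"Reproduction", "Distribution", "CommercialUse"}),
    obligations=frozenset({"Attribution"}),
)
data_license = License(
    name="Toy-NonCommercial",
    permissions=frozenset({"Reproduction", "Distribution"}),
    obligations=frozenset({"Attribution"}),
    prohibitions=frozenset({"CommercialUse"}),
)

print(compatible(vocab_license, data_license), conflicts(vocab_license, data_license))
```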
59. LIVE Evaluation
LIVE provides the compatibility in less than 5 seconds for 7 datasets
LIVE retrieves 48 vocabularies in less than 14 seconds
60. Conclusions
§ We have contributed to managing vocabulary metadata
Ø Disambiguating prefixes between vocabularies
Ø Improving vocabulary search and ranking
Ø Providing a license compatibility framework
Ø Contributing to a Best Practices standard
§ We have presented models for geodata
Ø For better handling of complex geometries
Ø Supporting *all* CRS
Ø For easy querying in SPARQL
§ We have proposed a generic visualization wizard
Ø Based on predefined categories
Ø Targeted at lay users
61. Future Work
§ Managing Data updates and versioning
Ø Versioning: integrate Memento protocol?
Ø Spatio-temporal evolution
§ Multiple representation: need for metadata?
Ø Level of detail
Ø Geometry modeling rules and reasoning
§ Tracking Provenance of geodata
Ø To ensure quality of published dataset
Ø To ensure trust from application consumers
62. Future Work
§ Visualizations
Ø Extend categories and vocabularies for detection
Ø Provide templates for generating “mash-ups” to combine domains, a mash-up widget generator
Ø Investigate the “importance” of a category in a dataset
Ø Provide a user evaluation
§ Metadata management
Ø Publish a list of common recommended prefixes
Ø Foster and support current efforts towards a more sustainable governance of vocabularies.
Ø Compare (P)IC with other graph-based rankings (e.g. PageRank)
Ø Investigate the dependency ranking between vocabularies
63. Publications
§ Pierre-Yves Vandenbussche, G.A. Atemezing, Maria Poveda, Bernard Vatant: Linked Open Vocabularies (LOV): a gateway to reusable semantic vocabularies on the Web. Semantic Web Journal, under review, 2015.
§ G.A. Atemezing, Raphael Troncy: Modeling visualization tools and applications on the Web. Semantic Web Journal, under review, 2015.
§ G.A. Atemezing et al.: Transforming meteorological data into linked data. In Semantic Web Journal, Special Issue on Linked Dataset descriptions, 2012. IOS Press.
§ Guido Governatori, Ho-Pun Lam, Antonino Rotolo, Serena Villata, G.A. Atemezing and Fabien Gandon: LIVE: a Tool for Checking Licenses Compatibility between Vocabularies and Data. (ISWC 2014, Demo Track)
§ G.A. Atemezing and Raphael Troncy: Information content based ranking metric for linked open vocabularies. (SEMANTICS 2014)
§ Ahmad Assaf, G.A. Atemezing, Raphael Troncy and Elena Cabrio: What are the important properties of an entity? Comparing users and knowledge graph point of view. In 11th Extended Semantic Web Conference (Demo Track, ESWC 2014)
64. Publications
§ Francois Scharffe, G.A. Atemezing, Raphael Troncy, Fabien Gandon, Serena Villata, Bénédicte Bucher, Faycal Hamdi, Laurent Bihanic, Gabriel Kepeklian, Franck Cotton, Jerome Euzenat, Zhengjie Fan, Pierre-Yves Vandenbussche and Bernard Vatant: Enabling linked-data publication with the datalift platform. (AAAI, W10: Semantic Cities, 2012)
§ G.A. Atemezing and Raphael Troncy: Vers une meilleure interopérabilité des données géographiques françaises sur le Web de données. (IC 2012)
§ Houda Khrouf, G.A. Atemezing, Thomas Steiner, Giuseppe Rizzo and Raphael Troncy: Confomaton: A conference enhancer with social media from the cloud. (ESWC 2012, Demo Track)
§ Bernadette Hyland, G.A. Atemezing and Boris Villazón-Terrazas (editors): Best Practices for Publishing Linked Data. W3C Working Group Note published on January 9, 2014. URL: http://www.w3.org/TR/ld-bp/
§ Bernadette Hyland, G.A. Atemezing, Michael Pendleton, Biplav Srivastava (editors): Linked Data Glossary. W3C Working Group Note published on June 27, 2013. URL: www.w3.org/TR/ld-glossary/
65. Thank you
for your attention!
Credits layout
Mariella Sabatino: mll.sabatino@gmail.com