Linked Data Publication Pipelines for Agri-Related use cases

LSWT2019 Talk by Marcin Krystek, Computer Systems Architect @ Poznan Supercomputing and Networking Center (PSNC)

Published in: Technology

  1. LINKED DATA PUBLICATION PIPELINES FOR AGRI-RELATED USE CASES
     Raul Palma (1), Soumya Brahma (1), Marcin Krystek (1), Karel Charvát (2), Raitis Berzins (2)
     (1) Poznan Supercomputing and Networking Center, Poland
     (2) WirelessInfo, Czech Republic
     7th Leipzig Semantic Web Day, 22nd May 2019
     This document is part of a project that has received funding from the European Union’s Horizon 2020 research and innovation programme under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu. This project is part of BDV PPP.
  2. Linked data publication
     • Linked Data (LD) is becoming an increasingly popular method for publishing data on the Web
        • Improves data accessibility for both humans and machines, e.g., for finding, reuse and integration
        • Makes it possible to discover more useful data through links (and inferencing), and to exploit data with semantic queries
     • Growing number of datasets in the LOD cloud (http://lod-cloud.net/)
        • 1,234 datasets with 16,136 links (as of June 2018)
     • Coverage of the LOD cloud
        • Large cross-domain datasets (DBpedia, Freebase, etc.)
        • Variable domain coverage (e.g., Geography, Government, Bioinformatics)
     • What about agriculture?
        • “Just” a few datasets (e.g., AGRIS bibliographic records, AGROVOC thesaurus and other thesauri such as NALT)
        • Farming and agri-activities related data?
  3. Why is Linked Data relevant in Agriculture: Farming context
     • Farm management
        • Multiple activities and stakeholders
        • Multiple applications, tools and devices
        • Multiple data sources, types and formats
     • Challenge
        • To combine and integrate those different and heterogeneous data sources in order to make economically and environmentally sound decisions
  4. Data Integration in relevant projects (context)
     • Data integration challenges have been the focus of relevant projects
     • FOODIE (EU FP7, ICT CIP, 2014-2017) aimed at building an open and interoperable cloud-based platform addressing, among others, the integration of data relevant to farming production, including their geospatial dimension, as well as their publication as Linked Data.
     • SDI4Apps (EU FP7, ICT CIP, 2014-2017) aimed at building a cloud-based framework with an open API for data integration, focusing on the development of six pilot apps and drawing along the lines of INSPIRE, Copernicus and GEOSS.
     • DataBio aims at showcasing the benefits of Big Data technologies in raw material production from agriculture and other sectors for the bioeconomy industry, deploying an interoperable platform on top of the existing partners’ infrastructure.
     • DataBio also aims at delivering solutions for big data management, including i) the storage and querying of various big data sources; ii) the harmonization and integration of a large variety of data from many sources, using Linked Data as a federated layer.
  5. Linked data principles & guidelines
     • Simple set of principles & technologies
        • URI, HTTP, RDF, SPARQL
     • Involves a set of (common) tasks
        • Datasets identification
        • Model specification
        • RDF data generation
        • Linking
        • Exploiting
     • Guidelines referenced on the slide: Hyland et al.; Hausenblas et al.; Villazón-Terrazas et al.; Best Practices for Publishing Linked Data; 5-star deployment scheme for Linked Open Data
  6. Implementing Linked Data publication pipelines
  7. Implementing Linked Data publication pipelines
     • Goal: to define and deploy (semi-)automatic processes that carry out the steps needed to transform and publish different input datasets as Linked Data.
     • A pipeline connects different data processing components to carry out the transformation of data into RDF and their linking, and includes the mapping specifications to process the input datasets.
     • Each pipeline is configured to support a specific input dataset type (same format, model and delivery form).
     • Principles
        • Pipelines can be directly re-executed and re-applied (e.g., to extended/updated datasets)
        • Pipelines must be easily reusable
        • Pipelines must be easily adapted for new input datasets
        • Pipeline execution should be as automatic as possible. The final target is fully automated processes.
        • Pipelines should support both (mostly) static data and data streams (e.g., sensor data)
     • The resulting datasets, available as Linked Data, will provide an integrated view over the initial (disconnected and heterogeneous) datasets, in compliance with any privacy and access control needs
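The idea of a pipeline as a chain of re-executable processing components can be sketched roughly as follows. This is a minimal illustration in plain Python, not the DataBio implementation: the step functions (`normalise_keys`, `drop_empty`), the record layout and the crop values are all hypothetical; only the field names Pivovarka and Predni come from the slides.

```python
from typing import Callable, Dict, Iterable, List

Record = Dict[str, str]
Step = Callable[[Iterable[Record]], Iterable[Record]]


def make_pipeline(steps: List[Step]) -> Step:
    """Chain processing components into one re-executable pipeline."""
    def run(records: Iterable[Record]) -> Iterable[Record]:
        for step in steps:
            records = step(records)
        return records
    return run


def normalise_keys(records: Iterable[Record]) -> Iterable[Record]:
    # Hypothetical cleaning step: lower-case the column names.
    for rec in records:
        yield {key.lower(): value for key, value in rec.items()}


def drop_empty(records: Iterable[Record]) -> Iterable[Record]:
    # Hypothetical cleaning step: skip records that have no non-empty value.
    for rec in records:
        if any(value.strip() for value in rec.values()):
            yield rec


# The same pipeline object is applied to two batches (crop values are invented).
pipeline = make_pipeline([normalise_keys, drop_empty])
batch_2017 = [{"NAME": "Pivovarka", "CROP": "wheat"}, {"NAME": "", "CROP": ""}]
batch_2018 = [{"NAME": "Predni", "CROP": "barley"}]

print(list(pipeline(batch_2017)))
print(list(pipeline(batch_2018)))
```

Because the pipeline is just a composed function over an iterable, the same object can be re-applied to an updated dataset or fed from a stream, which is the property the principles above ask for.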
  8. Use cases (examples)
     • UC1: Publication of farm-related linked data from agri pilots in the DataBio project, particularly Pilot 8 [B1.4] Cereals and biomass crops.
        • Goal: query and access heterogeneous agricultural data sources from Rostenice farm via an integrated layer in order to make informed decisions and discover new knowledge
     • UC2: Publication of open EU/national datasets relevant for agri-food pilots as Linked Data
        • Goal: provide access to multiple, isolated data sources relevant for agri-pilots, and identify links with farm datasets, from a single integrated layer
     • UC3: Publication of sensor data as linked data on the fly from Pilot 9 [B2.1] Machinery management
        • Goal: provide access to sensor data integrated with other farm data and other relevant datasets
  9. Datasets identification and collection
     • UC1 datasets
        • Farm data (Rostenice holding) that holds information about each field name with the associated cereal crop classifications, arranged by year.
        • Data about the field boundaries, crop map and yield potential of most of the fields in Rostenice farm, Czech Republic.
        • Yield records from two fields (Pivovarka, Predni) within the pilot farm that were harvested in 2017/2018.
     • Source data types: shapefiles
  10. Datasets identification and collection
     • UC2 datasets
        • Czech datasets
           • Czech LPIS data showing the actual land boundaries.
           • Czech erosion zones (strongly (SEO) and moderately (MEO) erosion-endangered soil zones)
           • Czech water bodies (e.g., the restricted area near water bodies has a 25 m buffer according to the nitrate directive).
           • Data about soil types from all over the Czech Republic.
        • Polish datasets
           • Polish LPIS data showing the cadastral land boundaries from all over the country.
        • European datasets
           • FADN (Farm Accountancy Data Network) data about the income of agricultural holdings and the impacts of the Common Agricultural Policy from all EU member states
           • Various open European geospatial datasets, including
              • (part of) Open Land Use (OLU)
              • (part of) Open Transport Map (OTM)
              • Smart Points of Interest (SPOI)
              • (part of) Urban Atlas (pan-European comparable land use and land cover data for Large Urban Zones)
              • (part of) CORINE Land Cover
              • HILUCS (Hierarchical INSPIRE Land Use Classification System)
        • Experimental sample dataset from the review platform Yelp (global coverage). The data cover the geographical location of a business, plus review and reviewer information.
     • Source data types: shapefiles, JSON, CSV, relational databases
  11. Datasets identification and collection
     • UC3 dataset
        • Sensor data collected and stored by the Senslog platform, including the readings of IoT devices on tractors.
        • The update rate is high, with readings arriving frequently
     • Source data types: relational databases
  12. Data models
     • Various ontologies/vocabularies were selected and reused/extended for the representation of data
        • For farm-related data, the INSPIRE-based FOODIE ontology was selected and extended as needed
        • For the land parcel and cadastral data (for the Czech Republic and Poland), erosion-endangered soil zones, water buffer and soil type classification, the FOODIE ontology and in some cases its extensions were also used for modelling the classes and properties.
        • For FADN, the main ontologies used were the Data Cube Vocabulary and its SDMX ISO standard extensions, which were very effective in aligning such multidimensional data. Moreover, the Data Cube Vocabulary builds on well-known RDF vocabularies such as SKOS, SCOVO, VoID, FOAF, Dublin Core, etc.
        • For the Yelp dataset, various ontologies such as review, FOAF, schema.org, POI, etc. were used to represent the classes and properties identified from the input data sources.
        • For some other datasets (e.g., corine, hilucs, olu, otm, urban atlas), simple ontologies/vocabularies were generated in line with standards and are available at https://github.com/FOODIE-cloud/ontology.
        • For sensor data: Semantic Sensor Network (SSN) along with the SOSA (Sensor, Observation, Sample, and Actuator) ontology for describing sensors and their observations, the involved procedures, the studied features of interest, etc. Additionally, the Data Cube Vocabulary and its SDMX ISO standard extensions for multidimensional data
  13. Data model for farming data
     • (FOODIE) farming data model principles
        • Application vocabulary covering the different categories of information dealt with by the farm management tools/apps (in FOODIE)
        • In line with existing standards and best practices
     • Resulting model*
        • Builds on the INSPIRE AF specification for agricultural data, and
        • the INSPIRE specification for themes in Annex I for geospatial data, based on
        • ISO/OGC standards for geographical information
        • Created as a UML model
     • Challenge: transform the model into an OWL ontology
     *Consulted with experts from various institutions, e.g., EU DG JRC, EU Global Navigation Satellite Systems Agency (GSA), Czech Ministry of Agriculture, Global Earth Observation System of Systems (GEOSS), German Kuratorium für Technik und Bauwesen in der Landwirtschaft (KTBL).
  14. Transformation of model into OWL ontology
     • ShapeChange implements the ISO 19150-2 standard rules for mapping ISO geographic information UML models to OWL ontologies.
     • Semi-automatic process: besides the transformation configuration, additional pre- and post-processing tasks were needed
     • Reference: Palma R., Reznik T., Esbri M., Charvat K., Mazurek C., An INSPIRE-based vocabulary for the publication of Agricultural Linked Data. Proceedings of the OWLED Workshop, collocated with ISWC-2015, Bethlehem PA, USA, October 11-15, 2015
  15. FOODIE ontology - overview
     • Overall structure (ShapeChange output)
        • UML featureTypes and dataTypes modelled as classes, and their attributes as datatype or object properties
        • UML codeLists modelled as classes/concepts, and their attributes as concept members
        • Cardinality restrictions defined on properties (exactly, min, max)
        • Datatype property ranges defined according to model/mappings
        • Object property ranges defined according to model/mappings
        • Object property inverseOf relations defined
     [Figures: Top hierarchy; FeatureType hierarchy; Codelist hierarchy; Datatype hierarchy]
  16. FOODIE ontology – main classes overview
     • The INSPIRE AF data model lacked a feature at a more detailed level than Site, which is already part of that model.
     • Main concept: Plot
        • Represents a continuous area of agricultural land with one type of crop species, cultivated by one user in one farming mode
        • Two kinds of data associated:
           • metadata information
           • agro-related information
     • Next level: Management Zone
        • Enables a more precise description of the land characteristics in a fine-grained area
     [Diagram: classes foodie:Plot, INSPIRE-AF:Site, foodie:Alert, foodie:Intervention, foodie:CropSpecies, foodie:ManagementZone, foodie:ProductionType, foodie:SoilNutrients; properties containsPlot, containsManagementZone, interventionPlot, speciesPlot, alertPlot, plotAlert, production, zoneNutrients]
  17. FOODIE ontology – main classes overview
     • The Intervention is the basic feature type for any kind of (farming) application with explicitly defined geometry, e.g., tillage or pruning.
     • Has multiple indirect associations with different concepts
     [Diagram: classes foodie:Intervention, foodie:Treatment, foodie:TreatmentPlan, foodie:Product, foodie:ProductPreparation, foodie:ActiveIngredients, foodie:FormOfTreatmentValue, foodie:TreatmentPurposeValue; relations is-a, plan, productPlan, planProduct, preparationProduct, preparation, productTreatment, treatmentProduct, preparationPlan, ingredientProduct, formOfTreatment, treatmentPurpose]
  18. RDF data generation
     • Different tools were deployed, configured and used for the generation of RDF data, depending on the source data type
        • D2RQ: mainly for relational databases
        • Geotriples: mainly for shapefiles
        • R2MLprocessor: mainly for JSON and CSV data sources
     • All these tools require a mapping file (in RDF) specifying how to map the data source elements to the target ontology concepts and properties.
     • Mapping specifications use R2RML (RDB to RDF Mapping Language) and/or extensions (e.g., RDF Mapping Language (RML))
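To make the role of the mapping file more concrete, here is a small sketch of mapping-driven RDF generation. It is an illustration only: it uses a plain Python dictionary instead of R2RML/RML, it is not how D2RQ or Geotriples work internally, and the namespace `http://example.org/farm/`, the class and property names, and the sample rows are hypothetical (only the two field names come from the slides).

```python
import csv
import io

EX = "http://example.org/farm/"  # hypothetical namespace, not the FOODIE one
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Stand-in for a mapping file: subject IRI template, target class,
# and one predicate per source column.
MAPPING = {
    "subject_template": EX + "plot/{id}",
    "class": EX + "Plot",
    "columns": {"name": EX + "name", "crop": EX + "crop"},
}


def escape(value):
    # Escape the characters that N-Triples string literals require.
    return (value.replace("\\", "\\\\").replace('"', '\\"')
                 .replace("\n", "\\n").replace("\r", "\\r"))


def rows_to_ntriples(rows, mapping):
    """Yield one N-Triples line per generated triple."""
    for row in rows:
        subject = mapping["subject_template"].format(**row)
        yield f"<{subject}> <{RDF_TYPE}> <{mapping['class']}> ."
        for column, predicate in mapping["columns"].items():
            if row.get(column):  # skip empty cells
                yield f'<{subject}> <{predicate}> "{escape(row[column])}" .'


# Invented sample data; the second row has no crop value.
source = io.StringIO("id,name,crop\n1,Pivovarka,wheat\n2,Predni,\n")
triples = list(rows_to_ntriples(csv.DictReader(source), MAPPING))
print("\n".join(triples))
```

Keeping the mapping as data rather than code is what lets the same transformation be re-run on a new dataset of the same type by swapping only the mapping. A real mapping would also need to percent-encode values placed into IRIs, which this sketch omits.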
  19. RDF data generation
     • The mapping file also specifies the connection details for the source dataset
     • Based on the mapping file, the data source (e.g., database content, shapefile, JSON, CSV, etc.) is either
        • i) dumped to an RDF file; or
        • ii) transformed on the fly as a virtual RDF graph (e.g., for data streaming)
     • RDF files were loaded into the Virtuoso triplestore (FOODIE endpoint)
  20. Linking the generated RDF datasets
     • In order to link the resulting RDF datasets with other datasets, we follow different approaches:
        • Apply existing tools like Silk or LIMES to discover equivalence relations
        • For other relations, use queries (e.g., geospatial) to access the integrated data as needed.
     • In our experiments with equivalence relations, however, we also had to add some links manually
     • We found issues in handling large datasets in Silk, especially those accessed via a SPARQL endpoint that we cannot control
     • There were recent optimizations of LIMES and a new tool for geospatial linking (from Leipzig) that we plan to test
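The discovery of equivalence relations can be illustrated with a deliberately naive sketch: compare the labels of two datasets and propose an `owl:sameAs` link when they are similar enough. This is not the algorithm used by Silk or LIMES, which offer far richer comparison and blocking strategies; the IRIs, labels and the 0.9 threshold are hypothetical.

```python
from difflib import SequenceMatcher

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def similarity(a, b):
    # Case-insensitive string similarity in the range [0, 1].
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def discover_links(source, target, threshold=0.9):
    """Return (source_iri, target_iri, score) for label pairs above threshold."""
    links = []
    for s_iri, s_label in source.items():
        for t_iri, t_label in target.items():
            score = similarity(s_label, t_label)
            if score >= threshold:
                links.append((s_iri, t_iri, round(score, 2)))
    return links


# Hypothetical IRIs and labels, for illustration only.
dataset_a = {"http://example.org/a/1": "Hotel Praha",
             "http://example.org/a/2": "Pension Brno"}
dataset_b = {"http://example.org/b/9": "hotel praha",
             "http://example.org/b/7": "Camping Olomouc"}

for s, t, score in discover_links(dataset_a, dataset_b):
    print(f"<{s}> <{OWL_SAME_AS}> <{t}> .  # score {score}")
```

The nested loop compares every pair, so the cost grows with the product of the dataset sizes. That quadratic growth is one plausible reason why linking large datasets is hard without the indexing and blocking that dedicated tools provide.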
  21. Triplestore statistics (examples)
     Dataset name | Graph in FOODIE endpoint | Source | Triples
     OLU** | http://w3id.org/foodie/olu# | Transformed from PostgreSQL | 127,926,060
     SPOI | http://www.sdi4apps.eu/poi.rdf | Provided by WRLS (also available in FOODIE endpoint) | 407,629,170
     NUTS | http://nuts.geovocab.org/ | Open Source (available in FOODIE endpoint) | 316,238
     OTM*** | http://w3id.org/foodie/otm# | Transformed from PostgreSQL | 154,340,785
     Yelp academic dataset | http://data.yelp.com/academic_dataset# | Transformed from JSON | 86,348,185
     LPIS data (CZ) | http://w3id.org/foodie/open/cz/pLPIS_180616_WGS# | Transformed from shapefile | 24,491,282
     FADN | http://ec.europa.eu/agriculture/FADN/{dataset}# | Transformed from CSV | 25,457,255
     Pilot 8 farm data | private | Transformed from shapefile | 1,569,439
     Total: over 1 billion triples!
     • The FOODIE triplestore is one of the largest semantic repositories related to agriculture, and has been recognized by the EC Innovation Radar as an “arable farming data integrator for Smart Farming”
  22. Exploiting the Linked Data – querying triplestore
     • SPARQL endpoint: https://www.foodie-cloud.org/sparql
  23. Exploiting the Linked Data – querying sensor data as virtual RDF graph
     • SPARQL endpoint: http://senslogrdf.foodie-cloud.org/sparql
     • SNORQL search endpoint: http://senslogrdf.foodie-cloud.org/snorql/
     • Web-based visualization: http://senslogrdf.foodie-cloud.org/
  24. Exploiting the Linked Data – querying examples (1-2 linked datasets)
     • Get information on Points Of Interest in a given polygon (one dataset)

       SELECT *
       FROM <http://www.sdi4apps.eu/poi.rdf>
       WHERE {
         ?Resource rdfs:label ?Label .
         ?Resource poi:class ?POI_Class .
         ?Resource geo:asWKT ?Coordinates .
         FILTER(bif:st_intersects (?Coordinates, bif:st_geomFromText("POLYGON((6.11553983198 54.438016608357, 6.95050076948 47.230985358357, 13.36651639448 47.626493170857, 14.99249295698 54.701688483357, 6.11553983198 54.438016608357))"))) .
       }

     • Get SPOIs for a given Open Land Use parcel (linking 2 datasets)

       SELECT *
       FROM <http://www.sdi4apps.eu/poi.rdf>
       WHERE {
         ?Resource rdfs:label ?Label .
         ?Resource poi:class ?POI_Class .
         ?Resource geo:asWKT ?Coordinates .
         FILTER(bif:st_intersects (?Coordinates, bif:st_geomFromText(?coordinates))) .
         {
           SELECT bif:st_astext(?x) as ?coordinates
           FROM <http://w3id.org/foodie/olu#>
           WHERE {
             olu-instance: geo:hasGeometry ?geometry.
             ?geometry geo:asWKT ?x
           }
         }
       }
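Queries like these can be sent to a SPARQL endpoint over HTTP. The sketch below builds such a request with the Python standard library, following the SPARQL 1.1 Protocol convention of passing the query in a `query` parameter. It uses the FOODIE endpoint URL given on the slides and a simplified query written with a full IRI, because the slides do not show the prefix declarations. Whether the endpoint is still reachable is not something this sketch assumes, so the network call is left commented out.

```python
import urllib.parse
import urllib.request

ENDPOINT = "https://www.foodie-cloud.org/sparql"  # endpoint named on the slides

# Simplified query (an assumption for illustration): labels from the SPOI graph.
QUERY = """
SELECT ?Resource ?Label
FROM <http://www.sdi4apps.eu/poi.rdf>
WHERE { ?Resource <http://www.w3.org/2000/01/rdf-schema#label> ?Label }
LIMIT 5
"""


def build_request(endpoint, query):
    """Build a SPARQL 1.1 Protocol GET request asking for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"})


request = build_request(ENDPOINT, QUERY)
print(request.full_url)

# To actually run the query (needs network access and a live endpoint):
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

Note that the `bif:` functions used in the slide queries are Virtuoso built-ins, so those queries are not portable to other triplestores as written.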
  25. Exploiting the Linked Data – querying examples (3+ linked datasets)
     • Show all the land parcels (OLU) that have hotels (SPOI) and that lie no more than 50 meters away from a major highway (OTM) (linking 3 datasets)

       SELECT DISTINCT ?olu ?hilucs ?source ?municode ?specificLandUse
       FROM <http://w3id.org/foodie/olu#>
       WHERE {
         ?olu a olu:LandUse .
         ?olu geo:hasGeometry ?geometry .
         ?olu olu:hilucsLandUse ?hilucs .
         ?olu olu:geometrySource ?source .
         OPTIONAL {?olu olu:municipalCode ?municode} .
         OPTIONAL {?olu olu:specificLandUse ?specificLandUse} .
         ?geometry geo:asWKT ?coordinatesOLU .
         FILTER(bif:st_within(bif:st_geomFromText(?coordinatesPOI),?coordinatesOLU)).
         {
           SELECT DISTINCT ?Resource, ?Label, bif:st_astext(?coordinatesPOIa) as ?coordinatesPOI
           FROM <http://www.sdi4apps.eu/poi.rdf>
           WHERE {
             ?Resource rdfs:label ?Label .
             ?Resource poi:class <http://gis.zcu.cz/SPOI/Ontology#lodging> .
             ?Resource geo:asWKT ?coordinatesPOIa .
             FILTER(bif:st_within(?coordinatesPOIa,bif:st_geomFromText(?coordinatesOTM),0.00045)) .
             {
               SELECT bif:st_astext(?x) as ?coordinatesOTM
               FROM <http://w3id.org/foodie/otm#>
               WHERE {
                 ?roadlink a otm:RoadLink .
                 ?roadlink otm:roadName ?name.
                 ?roadlink otm:functionalRoadClass ?class.
                 ?roadlink otm:centerLineGeometry ?geometry .
                 ?geometry geo:asWKT ?x .
                 FILTER(bif:st_intersects (?x, bif:st_geomFromText("POLYGON((14.426647 50.0751251,14.426647 50.07685089,14.4305469 50.07685089,14.43054696 50.0751251,14.426647 50.0751251))"))) .
                 FILTER(STRSTARTS(STR(?class),"firstClass") ) .
               }
             }
           }
         }
       }
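The slide speaks of 50 meters while the query passes 0.00045 as the tolerance. The slide does not state the unit of that number. One reading, offered here only as an assumption, is that it is 50 m expressed in decimal degrees of latitude. The arithmetic below checks that reading using the common approximation of about 111.32 km per degree of latitude; how the triplestore actually interprets the tolerance should be confirmed in its documentation.

```python
import math

METERS_PER_DEGREE_LAT = 111_320.0  # approximate length of one degree of latitude


def meters_to_degrees_lat(meters):
    """Convert a north-south distance in meters to degrees of latitude."""
    return meters / METERS_PER_DEGREE_LAT


def meters_to_degrees_lon(meters, latitude_deg):
    """Convert an east-west distance in meters to degrees of longitude."""
    # A degree of longitude shrinks with the cosine of the latitude.
    return meters / (METERS_PER_DEGREE_LAT * math.cos(math.radians(latitude_deg)))


# 50 m as degrees of latitude, rounded to five decimals.
print(round(meters_to_degrees_lat(50), 5))

# The same distance east-west at roughly the latitude of the query polygon.
print(meters_to_degrees_lon(50, 50.076))
```

Under this assumption the tolerance matches 50 m in the north-south direction only. At about 50 degrees north the same 50 m spans a larger number of degrees of longitude, so a single degree-based tolerance is an approximation rather than an exact 50 m radius.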
  26. Exploiting the Linked Data – search & navigate
     • Faceted search endpoint: http://www.foodie-cloud.org/fct
  27. Exploiting the Linked Data – visualisation
     • Map visualisation: http://ng.hslayers.org/examples/foodie-zones/
  28. Exploiting the Linked Data – visualisation
     • Map visualisation: http://ng.hslayers.org/examples/produce-3d/
     • Object information on click
  29. Exploiting the Linked Data – visualisation
     • Map visualisation: http://ng.hslayers.org/examples/olu_spoi
     • OLU polygons colored by the number of SPOI that lie inside them
  30. Exploiting the Linked Data – visualisation
     • Map visualisation: http://app.hslayers.org/project-databio/land
     • Usage scenarios:
        • Find and show buffer zones around water bodies (the user specifies the distance), which define the areas within the fields with limited/restricted application of agrochemicals
        • Select farms/fields based on the ID_UZ attribute from the public CZ LPIS database and search EO data over all fields
        • Visualize crop species based on the farm data (needs parcels with crop types, which are not available from open LPIS data)
        • Select fields with different soil types
        • Select all fields with a certain crop within a maximum distance from a certain point (e.g., for logistics or distribution of biomass)
        • Show/select erosion zones for a specific farm ID
  31. Exploiting the Linked Data – visualisation
     • Map visualisation: http://app.hslayers.org/project-databio/land
  32. Exploiting the Linked Data – visualisation
     • Metaphactory: http://foodie.metaphacts.cloud/resource/Start
  33. Exploiting the Linked Data – visualisation
     • Metaphactory: http://foodie.metaphacts.cloud/resource/Start
  34. Exploiting the Linked Data – visualisation
     • Metaphactory: http://foodie.metaphacts.cloud/resource/Start
  35. Exploiting the Linked Data – visualisation
     • Metaphactory: http://foodie.metaphacts.cloud/resource/Start
  36. Security of RDF Data
     • SPARQL endpoints are web services capable of providing read-only access to a back-end graph DBMS.
     • SPARQL endpoints are sometimes purpose-specific, so their access privileges must be limited to some basic operations over the graph.
     • The privileges provided by a given Virtuoso SPARQL endpoint include specific user identities with specific database roles and privileges.
     • Virtuoso offers three methods for securing SPARQL endpoints:
        • Digest Authentication via SQL accounts
        • OAuth protocol-based authentication
        • WebID protocol-based authentication
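From the client side, the first of these methods (HTTP Digest authentication) might be used as sketched below with the Python standard library. The endpoint URL, user name and password are placeholders and not real DataBio or FOODIE credentials; the sketch only prepares an opener and performs no network call.

```python
import urllib.request

# Assumptions: all three values are placeholders for illustration.
ENDPOINT = "https://example.org/sparql-auth"
USER, PASSWORD = "demo", "secret"


def make_digest_opener(endpoint, user, password):
    """Return an opener that answers HTTP Digest challenges for endpoint."""
    passwords = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    # realm=None means "use these credentials for any realm at this URL".
    passwords.add_password(None, endpoint, user, password)
    handler = urllib.request.HTTPDigestAuthHandler(passwords)
    return urllib.request.build_opener(handler)


opener = make_digest_opener(ENDPOINT, USER, PASSWORD)

# opener.open(ENDPOINT + "?query=...") would send the authenticated request;
# it is left out here because the endpoint above does not exist.
```

With Digest authentication the password itself is not sent over the wire, only a hash derived from a server challenge. Which SQL account the request maps to, and therefore which graphs it may read, is decided on the server side.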
  37. Linked data publication technologies overview
     • Used technologies:
        • D2RQ, Geotriples and R2MLProcessor for transforming input datasets into RDF or exposing them as virtual RDF graphs
        • RDF for the representation of data
        • FOODIE (farming ontology) providing the underlying vocabulary and relations, plus a number of other ontologies/vocabularies (existing and generated)
        • Virtuoso for storing the semantic data
        • Silk/LIMES for discovery of links
        • SPARQL for querying semantic data
        • Hslayers NG for visualisation of data
        • Metaphactory for visualisation of data
  38. Thank you for your attention!
     Raul Palma <rpalma@man.poznan.pl>
