Building Knowledge Graphs for the Agri-Food sector
1. Building a Knowledge Graph for Agri-Food Sector
Dr. Raul Palma
Head of Data Analytics and Semantics Department
Poznan Supercomputing and Networking Center
Modeling Sustainability Workshop
Knowledge Graph Conference
3rd May 2021
2. The agri-food context
• Farm management
• Multiple activities and stakeholders
• Multiple data sources, types and formats
• Multiple applications, tools and devices
Schematic overview of relationships between farm management
and its environment (Sörensen, et al., 2010)
3. Data & modeling challenges in agri-food sector
Source: Accenture
The rapid advances of IoT technologies, AI and Big Data, among
others, have boosted the adoption of smart farming practices.
This has led to an explosion of data, generated by a wide range
of different systems and platforms that rarely interoperate.
• The lack of integrated data access, in turn, hinders the full potential of value
creation and decision support based on all the available data
Some of the key challenges hampering a seamless exchange and
integration of the data produced/collected by those systems are:
• Availability of data in different formats and represented according to different
models
• Heterogeneity of data models and semantics used to represent data
• Lack of related standards dominating this space
• Insufficient interoperability mechanisms enabling the connection of existing
agri-food data models
4. The project(s) behind
Data integration challenges in agri-food sector have been addressed in various key EU projects
…solutions for (big) data mgmt., including
the harmonization and integration of a large
variety of data from many sources
…open and interoperable cloud-based
solution addressing the integration of
data relevant to farming production
5. Knowledge Graphs
KGs provide a flexible and efficient solution to
address many of those challenges.
• They can provide an integrated view over
(initially) disconnected and heterogeneous
datasets through the interlinking of different entities,
typically by applying Linked Data principles
• They improve data accessibility for both humans & machines
• They enable the discovery of new knowledge
• They can do so in compliance with any privacy and access
control needs.
http://lod-cloud.net/
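The interlinking step can be illustrated with a minimal sketch. It assumes two small datasets that share nothing but human-readable labels, and it links matching entities with skos:exactMatch. The namespaces, the `link_by_label` helper and the label-matching strategy are illustrative assumptions, not the method used in the projects.

```python
# Hypothetical namespaces, used for illustration only.
LOCAL = "http://example.org/crops/"
EXTERNAL = "http://example.org/vocabulary/"
SKOS_EXACT_MATCH = "http://www.w3.org/2004/02/skos/core#exactMatch"


def normalise(label):
    """Lower-case a label and collapse whitespace so near-identical labels match."""
    return " ".join(label.lower().split())


def link_by_label(local, external):
    """Link entities of two datasets whose labels match after normalisation.

    Both arguments map an entity IRI to its label.
    Returns the links as N-Triples lines.
    """
    index = {normalise(label): iri for iri, label in external.items()}
    triples = []
    for iri, label in sorted(local.items()):
        match = index.get(normalise(label))
        if match is not None:
            triples.append(f"<{iri}> <{SKOS_EXACT_MATCH}> <{match}> .")
    return triples


local_crops = {LOCAL + "1": "Winter  Wheat", LOCAL + "2": "Maize"}
vocabulary = {EXTERNAL + "c_1": "winter wheat"}
links = link_by_label(local_crops, vocabulary)
```

Real link discovery usually needs more than exact label matching (synonyms, languages, hierarchy), so this only shows the shape of the output: one triple per discovered link.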
7. Use case: national crop data access & monitoring
Goal:
• To enable access to AgroDataCube (AGC), a large collection of both open and derived data from
the Netherlands for use in agri-food applications (by Wageningen Environmental Research), and
• To connect it with other open and widely used EU vocabularies (e.g., AGROVOC, Eurostat).
AGC exposes a REST API with various resources:
• Fields (crop registration datasets), Altitude, Meteo, Soil, NDVI
• Data is returned in GeoJSON format
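As a rough sketch of what turning such a GeoJSON response into Linked Data can look like, the snippet below converts one polygon feature into N-Triples using GeoSPARQL terms. The property names `fieldid` and `crop_name`, the base IRI and the sample feature are assumptions made for illustration; the actual AGC response schema and the mapping used in the project may differ.

```python
import json

GEO = "http://www.opengis.net/ont/geosparql#"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
BASE = "http://example.org/field/"  # hypothetical base IRI

# Hypothetical feature; real AGC property names may differ.
SAMPLE = json.loads("""
{
  "type": "Feature",
  "properties": {"fieldid": 42, "crop_name": "Maize"},
  "geometry": {
    "type": "Polygon",
    "coordinates": [[[0, 0], [1, 0], [1, 1], [0, 0]]]
  }
}
""")


def polygon_to_wkt(geometry):
    """Serialise a GeoJSON Polygon geometry as a WKT string."""
    rings = []
    for ring in geometry["coordinates"]:
        points = ", ".join(f"{point[0]} {point[1]}" for point in ring)
        rings.append(f"({points})")
    return "POLYGON (" + ", ".join(rings) + ")"


def feature_to_ntriples(feature):
    """Map one GeoJSON feature to N-Triples lines.

    Literal escaping is omitted for brevity, so labels must not
    contain quotes or backslashes.
    """
    props = feature["properties"]
    label = props["crop_name"]
    field = f"<{BASE}{props['fieldid']}>"
    geom = f"<{BASE}{props['fieldid']}/geometry>"
    wkt = polygon_to_wkt(feature["geometry"])
    return [
        f'{field} <{RDFS_LABEL}> "{label}" .',
        f"{field} <{GEO}hasGeometry> {geom} .",
        f'{geom} <{GEO}asWKT> "{wkt}"^^<{GEO}wktLiteral> .',
    ]


triples = feature_to_ntriples(SAMPLE)
```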
Insights from this data may be relevant for
• Organisations collecting or validating agri-related indicators, e.g.,
paying agencies advisors looking for granular views of crops;
• Advisory organizations, e.g., looking to gain insights into the
distribution of crops in their region;
• Researchers, e.g., looking to find potential demonstration farms;
• Producers, e.g., looking to identify clusters of crops, etc.
8. Visualize and exploit the linked data
Demo app: http://metaphactory.foodie-cloud.org/resource/:AGROVOC-crops
10. Use case: Farm Productivity and Sustainability Benchmarking
Goal:
• To enable benchmarking of the productivity and sustainability
performance of farms;
• To monitor and compare different conditions and parameters
affecting such indicators, and
• To collect the data & integrate it in a unified layer accessible by decision support systems (DSS)
Such information is relevant for:
• Organisations collecting or validating agri-related indicators, e.g.,
paying agencies advisors who need a complete view at different
levels and to identify poorly performing regions;
• Advisory organizations, e.g., looking to gain insights into agri-indicators
in their region;
• Researchers, e.g., looking for regions with poor performance, or with
particular conditions;
• Producers, e.g., interested in identifying high-demand regions and
their challenges to customize their offer, etc.
11. Visualize and exploit the data
Measure: Total outcome, farm income, economic size, …
Type of Farming: Fieldcrops, Horticulture, Wine, …
12. Use case: Farm machinery management
This use case collects real-time telematics data from machinery in the field.
The data is collected by machinery sensors and is
stored and managed by SensLog.
SensLog is a web-based sensor data management system that:
• receives measured data (observations) directly from sensor devices;
• stores sensor data in the SensLog data model, implemented in an RDBMS;
• can pre-process and/or analyze data;
• publishes data through web services.
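To make the step from relational sensor data to Linked Data concrete, here is a minimal sketch that reads observations from a relational table and emits N-Triples using the W3C SOSA vocabulary. The `observations` table, its columns and the base IRI are hypothetical stand-ins and do not reflect the actual SensLog data model; the choice of SOSA is also an assumption of this example.

```python
import sqlite3

SOSA = "http://www.w3.org/ns/sosa/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD = "http://www.w3.org/2001/XMLSchema#"
BASE = "http://example.org/senslog/"  # hypothetical base IRI


def demo_connection():
    """Create an in-memory database with a hypothetical observations table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE observations ("
        "obs_id INTEGER PRIMARY KEY, sensor_id INTEGER, "
        "value REAL, time_stamp TEXT)"
    )
    conn.execute(
        "INSERT INTO observations VALUES (1, 7, 12.5, '2021-05-03T10:00:00Z')"
    )
    conn.commit()
    return conn


def observations_to_ntriples(conn):
    """Map every row of the observations table to four SOSA triples."""
    rows = conn.execute(
        "SELECT obs_id, sensor_id, value, time_stamp "
        "FROM observations ORDER BY obs_id"
    )
    lines = []
    for obs_id, sensor_id, value, time_stamp in rows:
        obs = f"<{BASE}observation/{obs_id}>"
        lines.append(f"{obs} <{RDF_TYPE}> <{SOSA}Observation> .")
        lines.append(f"{obs} <{SOSA}madeBySensor> <{BASE}sensor/{sensor_id}> .")
        lines.append(f'{obs} <{SOSA}hasSimpleResult> "{value}"^^<{XSD}double> .')
        lines.append(f'{obs} <{SOSA}resultTime> "{time_stamp}"^^<{XSD}dateTime> .')
    return lines


ntriples = observations_to_ntriples(demo_connection())
```

For production-size databases, a declarative mapping layer is usually preferable to hand-written string formatting; the sketch only shows the row-to-triples idea.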
13. Visualize and exploit the linked data
SPARQL endpoint: http://senslogrdf.foodie-cloud.org/sparql
SNORQL search endpoint: http://senslogrdf.foodie-cloud.org/snorql/
Web-based visualization: http://senslogrdf.foodie-cloud.org/
Demo app: http://metaphactory.foodie-cloud.org/resource/:senslog-data
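A SPARQL endpoint like the one listed above can be queried over HTTP following the SPARQL 1.1 Protocol. The sketch below builds such a request with the Python standard library. The query is deliberately generic because the vocabulary of the published data is not shown here, and the endpoint may no longer be online, so `run_query` is defined but not called.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "http://senslogrdf.foodie-cloud.org/sparql"  # from the slide; availability not guaranteed

# Generic query that makes no assumption about the vocabulary in use.
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 5"


def build_request(endpoint, query):
    """Build a SPARQL Protocol GET request asking for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def run_query(endpoint, query, timeout=30):
    """Send the query and return the result bindings (requires network access)."""
    request = build_request(endpoint, query)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    return payload["results"]["bindings"]


request = build_request(ENDPOINT, QUERY)
```

Calling `run_query(ENDPOINT, QUERY)` would return a list of dictionaries, one per result row, keyed by variable name.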
15. Agriculture Information Model - AIM
AIM aims to establish the basis of a common agricultural data
space, enabling the interoperation of different systems and the
integrated analysis of the data those systems produce.
AIM follows a modular approach in a layered architecture. It:
• is realized as a suite of ontologies with corresponding JSON-LD
contexts and associated SHACL shapes;
• is implemented in line with best practices, reusing existing
standards and well-scoped models;
• establishes alignments between base models to enable their
interoperability and the integration of existing data.
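The role of a JSON-LD context can be shown with a deliberately simplified sketch: the context maps plain JSON keys to IRIs, so that ordinary JSON produced by a system can be read as Linked Data. The context and document below are hypothetical and are not taken from AIM, and `expand_keys` covers only flat term-to-IRI mappings, not the full JSON-LD expansion algorithm.

```python
def expand_keys(document):
    """Replace top-level keys with the IRIs given in the document's @context.

    Simplified on purpose: handles only flat, top-level terms.
    As in JSON-LD, keys without a mapping are dropped.
    """
    context = document.get("@context", {})
    expanded = {}
    for key, value in document.items():
        if key == "@context":
            continue
        if key.startswith("@"):
            # JSON-LD keywords such as @id and @type are kept as they are.
            expanded[key] = value
            continue
        mapping = context.get(key)
        if isinstance(mapping, dict):
            # Expanded term definition, e.g. {"@id": "..."}.
            mapping = mapping.get("@id")
        if mapping is not None:
            expanded[mapping] = value
    return expanded


# Hypothetical document and context, for illustration only.
document = {
    "@context": {
        "name": "http://example.org/vocab/name",
        "area": {"@id": "http://example.org/vocab/areaInHectares"},
    },
    "@id": "http://example.org/farm/1",
    "name": "Demo farm",
    "area": 12,
    "internalNote": "not mapped, so it is dropped",
}
expanded = expand_keys(document)
```

In practice a full JSON-LD processor would be used for this, and SHACL shapes would then validate the resulting graph.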
Palma R., Roussaki I., Döhmen T., et al. (2021). “Agriculture Information
Model” in D. D. Bochtis, et al. (Eds.). Information and Communication
Technologies for Agriculture—Theme III: Decision. Springer (TBP)
https://w3id.org/demeter/
https://github.com/rapw3k/DEMETER/tree/master/models
17. Visualization & exploitation
https://metaphacts.com/
Key features:
• Knowledge Graph Asset Management
• Rapid Application Building
• End-user oriented interaction
18. Future work & references
Extend pipelines for other preconfigured data types: soil data, weather data, etc.
Extend implementation with additional capabilities:
• Enrichment, link discovery, etc.
• Additional pre-processing/post-processing methods
• Integrate additional tools to handle other data sources/cases (e.g., non-SQL databases, SPARQL transformations, etc.)
Triplestore endpoint: https://www.foodie-cloud.org/sparql
Faceted search: https://www.foodie-cloud.org/fct/
Demo applications: http://metaphactory.foodie-cloud.org/
Linked data pipelines Web service: https://dpi-enabler-demeter.apps.paas-dev.psnc.pl/api/swagger/
Linked data pipelines CLI source: https://git.man.poznan.pl/stash/projects/DEM/repos/pipelines/browse
AIM ontology: https://github.com/rapw3k/DEMETER/tree/master/models
Read more: https://blog.metaphacts.com/a-knowledge-graph-for-the-agri-food-sector
https://www.slideshare.net/rapw3k/presentations
Farm management is a complex process that involves multiple activities carried out by farmers and other stakeholders, who have to manage multiple and heterogeneous data sources collected and generated through various applications, services and devices.
This process, however, has become even more complex in recent years. In particular
Such data integration challenges
The approach we adopted to address the data integration challenges in those projects was through the use of knowledge graphs.
However, as we created and reused more and more KGs to cover multiple use cases in multiple projects, we noticed a recurring process for getting things up and running. This led us to design and implement "Linked Data pipelines", which automate, as far as possible, the steps needed to transform and publish input datasets from various heterogeneous sources as Linked Data.
Linked Data pipelines connect different data processing components that carry out the transformation of data into RDF and their linking. They are:
• Re-executable
• Re-usable
• Adaptable
• (Semi-)automatic
For both:
• Static data (mostly)
• Dynamic data (e.g., IoT data)
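The pipeline idea described above can be sketched as an ordered list of small processing steps that are applied one after another. All step names, the record fields and the base IRI are hypothetical; the sketch only illustrates how chaining independent components keeps a pipeline re-executable, re-usable and adaptable, and it is not the project's implementation.

```python
BASE = "http://example.org/"  # hypothetical base IRI


def drop_incomplete(records):
    """Pre-processing step: keep only records with both an id and a crop."""
    return [r for r in records if r.get("id") is not None and r.get("crop")]


def to_triples(records):
    """Transformation step: turn each record into a (subject, predicate, object) triple."""
    return [
        (f"{BASE}field/{r['id']}", f"{BASE}vocab/crop", r["crop"])
        for r in records
    ]


def link_crops(triples):
    """Linking step: replace known crop names with IRIs from a lookup table."""
    lookup = {"wheat": f"{BASE}crop/wheat"}
    return [(s, p, lookup.get(o.lower(), o)) for s, p, o in triples]


# Adapting the pipeline means editing this list, not the steps themselves.
PIPELINE = [drop_incomplete, to_triples, link_crops]


def run_pipeline(records, steps=PIPELINE):
    """Run every step in order; the input is copied, so runs are repeatable."""
    data = list(records)
    for step in steps:
        data = step(data)
    return data


records = [{"id": 1, "crop": "Wheat"}, {"id": 2}]
result = run_pipeline(records)
```

Because each step is a plain function from a list to a list, the same steps can be recombined for a different input dataset, and running the pipeline twice on the same input gives the same output.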
so that it can interoperate with other services, and be connected with datasets