Webinar@AIMS: Big Data challenges and solutions in agricultural and environmental research

Agricultural and environmental researchers traditionally work with large data sets and have, over time, developed their own ways of handling scenarios involving massive data. Current developments in ICT and (big) data science potentially provide innovative and more effective ways to do this. However, there are numerous barriers and pitfalls, sometimes unknown to ICT professionals, that make initiatives less successful than they could be. The presentation provides an overview of the current state of play regarding the position of Big Data in agro-environmental research, experiences from several projects, and a (non-exhaustive) summary of do’s and don’ts and challenges for successfully applying Big Data technologies in this domain.

Published in: Science

  1. 1. Big Data challenges and solutions in agricultural and environmental research. Big Data Europe – AIMS webinar, 17 December 2015. Rob Lokers, Alterra, Wageningen UR, The Netherlands
  2. 2. Outline: Historic perspective (agricultural & environmental modelling); Expectations for the (near) future; Some Big Data examples from the agri-food domain; Big Data challenges in agri-environmental research; Expectations versus reality in 2015
  3. 3. Timeline slide. Disciplines: Crop science, Animal science, Food science, Economics. Periods: 1960-1980, 1980-2000, 2000-2010, 2010-2015. Developments shown for each discipline: Institutional data collection; First computer models; Institutional applications; IT improvements (metadata, semantics). Developments shown across disciplines: Integrated modelling frameworks; Open data across sectors.
  4. 4. Crop science, Animal science, Food science, Economics. 2010-2015: Open data across sectors; IT improvements (metadata, semantics) in each discipline. 2015-2020: BIG DATA: one massive linked data pool across disciplines and strong computational capabilities. Computational capabilities: • Amazon • Microsoft Azure • Google Earth Engine • EC research infrastructures. New data sources: • Remote sensing • Crowd sourcing • Rapid phenotyping / Omics • Social media. Potential to solve problems on agriculture, nutrition, food security, climate change?
  5. 5. Data analysis and integration, Models, Artificial Intelligence, Linked Open Data, Semantic web technologies, ... Policy options, Products, Services, Costs, Benefits, Scenarios, Impact Assessments, Decision Support Systems, Integrated models, ... Decision domain (policy/industry) Process of data based value creation and roles involved Policy makers/industry/societal stakeholders Wisdom Knowledge info + application Information data + added meaning (Big) Data raw material Knowledge domain (science / consultants) Interests (economic, social, environmental), values, preferences, trade-offs, risks, intangibles, ethics, ... Databases, Satellites, Sensor networks, Social media, Citizen Observatories, ... Open (data) standards, (meta)data repositories, Business development, Visualization tools and methods, Contextualization, Knowledge brokerage, ...
  6. 6. Food Security example: Monitoring Agricultural ReSources (MARS) Wisdom Knowledge Information Data Owned and operated by EC-JRC Crop forecasts at EU level needed to take rapid decisions on Common Agricultural Policy instruments during the year Provide information on vulnerability in specific food insecure areas In support of: ● European Common Agricultural Policy on commodities & subsidies (focus on Europe, Asia) ● Food aid (focus on Africa) Monitoring weather and crop conditions of current growing season (early warning, extreme events)
  7. 7. Example: Monitoring Agricultural ReSources (MARS) Wisdom Knowledge Information Data weather archives live data streams crop, soil databases Models
  8. 8. Example: Monitoring Agricultural ReSources (MARS) Wisdom Knowledge Information Data weather archives live data streams crop, soil databases Models Rescaling, interpolations GIS Crop models
  9. 9. Example: Monitoring Agricultural ReSources (MARS) Wisdom Knowledge Information Data weather archives live data streams crop, soil databases Models Statistical tools Decision support Data mining & reporting
  10. 10. Example: Monitoring Agricultural ReSources (MARS) Wisdom Knowledge Information Data weather archives live data streams crop, soil databases Models Policy & decision making
  11. 11. Big Data technologies Technologies currently used (agri-environmental research) RDBMS, geo-databases ● but also file-based, Excel etc. Various “old & proven” programming languages (esp. for modelling, data processing) ● Fortran, C/C++, Java etc. Remote sensing: dedicated tools & environments for processing and analysis ● ENVI, R, GDAL etc. GIS & spatial analysis packages Harmonized information / data models (but still per discipline) Local, optimized solutions for computing and storage
  12. 12. Big Data technologies Experimental technologies (ICT research for agriculture): High Performance clusters / grids ● E.g. parallelization of modelling and analysis software RDF databases ● Linked Data applications linking sources of metadata, bibliographical data, statistical data Vocabularies and ontologies ● Annotation of (meta)data for improved discovery Semantic technologies NLP algorithms However: agro-environmental research seems to be a “wicked domain” with specific challenges
  13. 13. Challenge - Variety Wisdom Knowledge Information Data Wisdom Knowledge Information Data Wisdom Knowledge Information Data Velocity Variety Variety Volume Climatology Agronomy Soil Science
  14. 14. Challenge - Variety Agro-environmental research = Interdisciplinary: ● different targeted objectives ● different data formats ● different schemas, vocabularies etc. ● different levels of standardization ● different granularities Example: relevant domains for agricultural impact assessments ● Agronomy ● Climate ● Water/irrigation management ● Economy etc...
  15. 15. Challenge - Variety Semantic alignment can be problematic Different domains use different semantics to describe the same knowledge Semantics may be in different, non-recognized standards or not existing Ontology alignment tools usually do not work Manual alignment is resource-consuming and requires multidisciplinary experts No fitting vocabularies and ontologies to effectively annotate datasets Example: climate data – temperature ● Modelled differently in different vocabularies/ontologies ● Not specific enough to characterize data
  16. 16. Challenge - Variety Commonly used vocabularies usually do not fit scientific requirements Many seem to be designed for annotation of bibliographic data Often not complete, extended in response to requirements of the owner / maintainer Unbalanced, e.g. level of detail and granularity Example: tree species vs. climate variables in Agrovoc Quercus Quercus Robur Fagaceae Fagus Quercus Ilex temperature Air temperature Measure Interest rate Body temperature
  17. 17. Food production example: Smart Farming: Monitoring, planning & control. Crop or animal level: Genome sequences, Feed uptake, Performance, Manure, Temperature, Activity, Heart rate, pH, Antibodies, Biomarkers, Medicine use, ... Farm level: Size, Location, Performance, Manure, Water, Energy, Nutrition, Health management, ... Environmental level: Distance to ..., Public health, Living environment, Mineral cycles, Healthy products, Disease risks, Economic figures, Environmental issues, ... Market level: Market prices, Logistics, Regulations, ... Supporting sustainable food production and contributing to the realization of (inter)national policy agendas.
  18. 18. Challenge - Veracity Wisdom Knowledge Information Data Wisdom Knowledge Information Data Wisdom Knowledge Information Data Velocity Variety Variety Volume Climatology Agronomy Soil Science
  19. 19. Timeline slide (repeated). Disciplines: Crop science, Animal science, Food science, Economics. Periods: 1960-1980, 1980-2000, 2000-2010, 2010-2015. Developments shown for each discipline: Institutional data collection; First computer models; Institutional applications; IT improvements (metadata, semantics). Developments shown across disciplines: Integrated modelling frameworks; Open data across sectors.
  20. 20. Challenge - Veracity In science, trust is essential, in two ways: ● Trust of end users in the quality of (meta)data ● Trust of data providers regarding end users Data accessibility is generally low ● stored in silos ● only exchanged among peers in known networks ● hardly documented ● This has a lot to do with culture and policies in research organisations regarding data management
  21. 21. Challenge - Veracity Publication of datasets ● A lot of data is inaccessible, for various reasons and either intentional or unintentional. ● Some data is accessible, but might be ● not provided through standardized interfaces ● not easily discoverable Example: weather data (station observations)
  22. 22. Challenge - Veracity Documentation of datasets ● No incentive to provide metadata ● End of project activity, with no benefits for scientist ● Boring work ● No clear perspective on end user requirements ● Metadata schemas are generally considered too complex Metadata quality is generally poor ● Minimal amount of metadata provided ● No, non-standardized or irrelevant annotations Example: Forestry Clearinghouse
  23. 23. Expectations versus reality in 2015... Promise: new technologies will make our life much easier RDF databases Semantic technologies Grid computing, cloud storage solutions etc. Reality: we keep working with the old stuff Alignment and integration are hard to accomplish New technologies prove to be too immature for the real world Production systems are still developed using what we know works well (e.g. RDBMS, legacy models and data formats) Successful innovative initiatives use hybrid solutions, often built on “proven” technologies ● Limited to metadata level or small & medium sized data, with limited domain coverage
  24. 24. Expectations versus reality in 2015... Promise: “Googlification” of scientific data provision Transparent access to big, distributed, heterogeneous datasets “Magical” semantic (and linguistic) query processing Tools seamlessly transform heterogeneous data to model data input, information and knowledge for decision making Reality: we struggle getting ourselves into shape Attempts are mainly successful on metadata level and bibliographic sources (genetics might be an exception) Cumbersome first attempts to harmonize big heterogeneous data streams Custom-built data collection and processing chains still remain dominant
  25. 25. Source: Gartner (August 2015)
  26. 26. Thank you for your attention
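
Slide 15 notes that temperature is modelled differently in different vocabularies and that manual alignment is often what remains when ontology alignment tools fail. A minimal Python sketch of such a hand-maintained crosswalk is given here. The discipline names, term labels, units, and the shared concept names are all hypothetical illustrations; none of them are taken from the presentation or from a specific real vocabulary.

```python
# Hypothetical crosswalk maintained by domain experts:
# (source discipline, local term) -> (shared concept, unit of the source values).
CROSSWALK = {
    ("climatology", "tas"): ("air_temperature", "K"),
    ("agronomy", "TMEAN"): ("air_temperature", "degC"),
    ("soil_science", "soil_temp_10cm"): ("soil_temperature", "degC"),
}


def to_celsius(value, unit):
    """Convert a temperature value to degrees Celsius."""
    if unit == "K":
        return value - 273.15
    if unit == "degC":
        return value
    raise ValueError(f"unsupported unit: {unit}")


def harmonize(records):
    """Map records onto shared concepts; return (aligned, unmapped).

    Records whose (source, term) pair is not in the crosswalk are not
    guessed at; they are returned separately for manual review.
    """
    aligned, unmapped = [], []
    for rec in records:
        key = (rec["source"], rec["term"])
        if key not in CROSSWALK:
            unmapped.append(rec)
            continue
        concept, unit = CROSSWALK[key]
        aligned.append({
            "concept": concept,
            "value_degC": round(to_celsius(rec["value"], unit), 2),
        })
    return aligned, unmapped


records = [
    {"source": "climatology", "term": "tas", "value": 293.15},
    {"source": "agronomy", "term": "TMEAN", "value": 20.0},
    {"source": "economics", "term": "temperature", "value": 1.0},
]
aligned, unmapped = harmonize(records)
```

The sketch keeps unmapped terms apart instead of matching them by label, which reflects the slide's point that a generic term such as "temperature" is not specific enough to characterize the data.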
