NextGEOSS: The Next Generation European Data Hub and Cloud Platform for Earth Observation



European Geosciences Union
General Assembly 2018
Wolfgang Ksoll
Vienna, 9th of April 2018

Published in: Science

  1. 1. NextGEOSS: The Next Generation European Data Hub and Cloud Platform for Earth Observation Bente Bye, Wolfgang Ksoll, Nuno Catarino, Marie-Francoise Voidrot, Erwin Goor, Julian Meyer-Arnek, Pedro Goncalves, and Nuno Grosso EGU 2018, Vienna, 9th of April 2018
  2. 2. Agenda ● The project NextGEOSS ● Data flow ● Data sources ● Data consumers: pilots ● CKAN ● Experiences - lessons learned
  3. 3. The NextGEOSS User Experience
  4. 4. NextGEOSS at a glance User Feedback Mechanism Enabling users to efficiently deliver and find fit-for-purpose GEOSS data and information Advanced Discovery Tools Increased discoverability of Earth observations and related information for thematic areas.
  5. 5. NextGEOSS at a glance Community Enhancement Developing solutions with the communities for the communities, creating relevant tools tailored to meet community specific needs.
  6. 6. NextGEOSS at a glance Open, Inclusive, and Agile Development Strategy The NextGEOSS approach and methodology are aligned with the EU openness policies and the GEO open data sharing policy. Multiple releases allow extensive collaboration.
  7. 7. NextGEOSS at a glance NextGEOSS Project Facts H2020 project* Who: 27 partners from 13 countries Period: 2016 - 2020 Budget: 10M EURO *NextGEOSS is a winning answer to the H2020 SC5-20-2016 call
  8. 8. [Architecture diagram] Open Source Technology – Earth Observation Science – Benefits Management – Sustainability ● Data flow: data sources -> harvesting connectors -> metadata -> CKAN data hub -> interfaces -> pilots/apps/cloud ● Data sources: open data, Copernicus Sentinel 1-5 (marine, land, atmosphere), citizen and commercial providers; resources / raw data ● Harvesting connectors to providers: 1 - Sentinel, 2 - GOME-2, 3 - Proba-V, 4 - CMEMS, … ● Metadata + tags: Dublin Core, DCAT, GeoDCAT-AP, ISO, iTag ● CKAN data hub: CKAN core, SolrCloud search, ZooKeeper, RDF extension, SPARQL extension ● Interfaces: web access/GUI, CKAN API, OpenSearch, Data Discovery Guide, Data Ingestion Guide, NiMMbus user feedback ● Consumers: pilots/apps/cloud, other data cube apps, GEO DAB ● Innovative research: agriculture/forestry, biodiversity, space + security, cold regions, air pollution, disaster risk reduction ● Business: territorial planning, food security, smart cities, energy, grid operation, solar mapping ● Business innovation/sustainability: business cases, market study, VCM, benefit assessment, sustainability report, Sustainable Development Goals of the UN ● End-user communities: civil society, government, business ● Work packages shown: WP2.1, WP2.2, WP3, WP4+5+6+7, WP8 (diagram labels also include "Requirements" and "Harvest")
  9. 9. User Feedback with NiMMbus (UofB, Barcelona) Example of integrating an external program: 1. User feedback for a particular dataset starts in the NextGEOSS data hub 2. NiMMbus is called as an external program (see login) 3. The feedback form is filled in within NiMMbus 4. In the last step, the feedback data are available in both NiMMbus and NextGEOSS
  10. 10. Data Sources ● Sources ○ Satellites: Copernicus, Sentinel, ESA (photos, radar, laser, …) ○ In situ ○ Civil society ● All open data ● The EU wants to update the PSI directive to cover not only public-service data but also publicly funded data in public transport, energy, research (open access data) ● These sources produce a tremendous amount of data in real time (10,000 datasets a day) ● How can the data be searched to support transforming data into knowledge?
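To put the figure of 10,000 datasets a day into perspective, the following minimal sketch converts it into a time budget per dataset and checks whether a harvester can sustain that rate. The function names, the per-dataset harvest time, and the worker count are illustrative assumptions, not values from the project.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def seconds_per_dataset(datasets_per_day):
    """Average time budget per dataset if harvesting runs around the clock."""
    return SECONDS_PER_DAY / datasets_per_day


def keeps_up(datasets_per_day, seconds_to_harvest_one, workers=1):
    """Return True if the daily harvesting capacity covers the daily inflow.

    seconds_to_harvest_one and workers are hypothetical tuning values.
    """
    capacity = workers * SECONDS_PER_DAY / seconds_to_harvest_one
    return capacity >= datasets_per_day


if __name__ == "__main__":
    print(seconds_per_dataset(10_000))         # time budget in seconds
    print(keeps_up(10_000, 20.0))              # single worker, 20 s per dataset
    print(keeps_up(10_000, 20.0, workers=3))   # three parallel workers
```

At 10,000 datasets a day a single sequential harvester has under nine seconds per dataset, so slower connectors need parallel workers or batching.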
  11. 11. Pilots ● Innovative Pilots: IP1 Agricultural Monitoring, IP2 Biodiversity, IP3 Space & Security, IP4 Cold Regions, IP5 Air Pollution in Mega Cities, IP6 Disaster Risk Reduction ● Business Pilots: BP1 Territorial Planning, BP2 Food Security, BP3 Smart Cities, BP4.1/2 Energy* ● NOA ● Pilots are numerically intensive applications in the cloud ● Time series are calculated from the sources, e.g. in agriculture for crop performance optimization ● Machine learning is applied, e.g. in the Biodiversity pilot ● The Smart Cities pilot brings together data from different stakeholders such as civil society, government and business ● The Energy pilots create e.g. solar maps for cities ● The Air Pollution pilot measures e.g. NOx in cities
  12. 12. CKAN - Open Source and Standard Metadata ● Open source on GitHub ( ● Harvesting metadata ● Searchable via web GUI or API (OpenSearch, RDF), tagging with iTag ● Metadata standards: ○ “Normal” standards: Dublin Core, DCAT, GeoDCAT, ISO ○ Community metadata standards, e.g. Essential Variables: Biodiversity, Climate, Ocean ● [Diagram] CKAN core (WP2.1) with SolrCloud search, ZooKeeper, RDF extension and SPARQL extension; metadata + tags (Dublin Core, DCAT, GeoDCAT-AP, ISO, iTag); interfaces: web access/GUI, CKAN API, OpenSearch, Data Discovery Guide, Data Ingestion Guide
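As a rough illustration of searching a CKAN catalogue through its API, the sketch below builds a request URL for CKAN's standard `package_search` action and reads dataset titles out of the JSON reply. The base URL is a placeholder, not the NextGEOSS address, and the helper names are made up for this example; no network request is sent.

```python
import json
from urllib.parse import urlencode

# Hypothetical base URL; replace with the address of a real CKAN instance.
BASE_URL = "https://catalogue.example.org"


def build_search_url(base_url, query, rows=10, start=0):
    """Build a URL for CKAN's package_search action (Action API v3)."""
    params = urlencode({"q": query, "rows": rows, "start": start})
    return f"{base_url.rstrip('/')}/api/3/action/package_search?{params}"


def extract_titles(response_text):
    """Return (total_count, titles) from a package_search JSON response."""
    payload = json.loads(response_text)
    if not payload.get("success"):
        raise ValueError("CKAN reported an unsuccessful request")
    result = payload["result"]
    # Fall back to the dataset's machine name when no title is set.
    titles = [d.get("title", d.get("name")) for d in result["results"]]
    return result["count"], titles


if __name__ == "__main__":
    print(build_search_url(BASE_URL, "sentinel", rows=5))
    # Hand-written sample reply in the shape CKAN returns, for illustration only.
    sample = '{"success": true, "result": {"count": 1, "results": [{"name": "demo", "title": "Demo dataset"}]}}'
    print(extract_titles(sample))
```

The `rows` and `start` parameters allow paging through large result sets, which matters when a catalogue holds many harvested datasets.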
  13. 13. Experiences - Lessons learned ● Faster discovery across many sources and easier access through a single point of access ● Because of the large number of datasets, harvesters have to be designed carefully to keep up with the load ● The IT architecture has to be scalable ● End-to-end tests (sources - data hub - application in the cloud) bring quality assurance to the individual parts ● Few sources offer standards-based metadata -> connectors have to be programmed individually ● Metadata are not standardized enough (spelling, language, meaning), which causes quality issues ● Traditional metadata standards like Dublin Core, DCAT, GeoDCAT or ISO are not enough; some communities need special metadata (Essential Biodiversity Variables, Climate, Ocean, …) ● Licenses: public domain (German: gemeinfrei) is free of rights, so no license is possible. But administrations are creative in inventing licences (>100) for open data. Sources therefore often do not state their licence: too complicated ● See also: Theoretical Availability versus Practical Accessibility: The Critical Role of Metadata Management in Open Data Portals
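The metadata quality issues listed above (spelling, language, meaning) can be made concrete with a small sketch of tag normalization as a connector might apply it during harvesting. The synonym table and function names are hypothetical and only illustrate the idea; this is not the project's actual harvester code.

```python
import unicodedata

# Hypothetical synonym table: maps variant spellings or languages
# to one canonical tag. A real table would be curated per community.
CANONICAL_TAGS = {
    "sentinel2": "sentinel-2",
    "sentinel 2": "sentinel-2",
    "landwirtschaft": "agriculture",
}


def normalize_tag(raw):
    """Unify Unicode form, case and whitespace, then apply the synonym table."""
    text = unicodedata.normalize("NFKC", raw).strip().lower()
    text = " ".join(text.split())  # collapse runs of whitespace
    return CANONICAL_TAGS.get(text, text)


def normalize_tags(raw_tags):
    """Normalize a list of tags, dropping empties and duplicates, keeping order."""
    seen = []
    for tag in raw_tags:
        norm = normalize_tag(tag)
        if norm and norm not in seen:
            seen.append(norm)
    return seen


if __name__ == "__main__":
    print(normalize_tags([" Sentinel 2", "sentinel-2", "Landwirtschaft", "Agriculture", ""]))
```

Spelling and language variants can be handled this way, but differences in meaning still need community-agreed vocabularies.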
  14. 14. Questions? Wolfgang Ksoll Twitter: @Nextgeoss Facebook: Nextgeoss YouTube: Nextgeoss
  15. 15. Backup: Pilot Examples
  16. 16. IP1: Time Series Analysis for Agricultural Monitoring Pilot Scope • Scale up Time Series analysis tools to huge amounts of HR EO-data • SAT EO-data & in-situ data Pilot Objectives • Extend Proba-V MEP & Copernicus Global Land Time Series Viewer with Sent-2 derived VGT indices • REST and/or WPS end-points → WP3 • Extend prototype of Agro STAC (Spatial Temporal Catalogue for Agronomy) from FP-7 SIGMA → towards operations • Temporal and attribute accuracy on WM(T)S: guidelines and prototype Challenges • Integrate with processing chains & data on public clouds • Transfer to operations (in-situ)
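As a hedged illustration of what a vegetation index time series involves, the sketch below computes NDVI, a widely used vegetation index defined as (NIR - Red) / (NIR + Red), per observation date. The slide does not name the indices the pilot uses, so the choice of NDVI, the function names, and the reflectance values are assumptions for illustration only.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index from NIR and red reflectance.

    Returns None when both bands are zero (index undefined).
    """
    denom = nir + red
    if denom == 0:
        return None
    return (nir - red) / denom


def ndvi_series(observations):
    """Turn (iso_date, nir, red) tuples into a date-sorted NDVI time series.

    Observations with an undefined index are skipped.
    """
    series = [(date, ndvi(nir, red)) for date, nir, red in observations]
    return sorted((d, v) for d, v in series if v is not None)


if __name__ == "__main__":
    # Made-up reflectance values, not real Sentinel-2 or Proba-V data.
    obs = [("2018-04-09", 0.5, 0.1), ("2018-03-01", 0.3, 0.3), ("2018-03-15", 0.0, 0.0)]
    for date, value in ndvi_series(obs):
        print(date, round(value, 3))
```

ISO-formatted date strings sort chronologically, which keeps the example free of date parsing.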
  17. 17. BP2: Crop Monitoring supporting Food Security Pilot Scope • Use of Sentinel-2 for crop monitoring in collaboration with industry • Data fusion between Proba-V 100 m and Sentinel-2 Pilot Objectives • Deploy and run HR processing chain for Vegetation Parameters on public cloud: on-demand & subscription • Develop dynamic dashboard: integration of time series analysis • Demonstrations & training for users from the agro and insurance sectors Challenges • Convenient & scalable processing of large amounts of Sentinel-2 data • Data analytics • Data fusion of Proba-V and Sentinel-2
  18. 18. IP2: Biodiversity Pilot Scope • Remote-sensing Essential Biodiversity Variables (RS-EBVs) for habitat mapping and monitoring Pilot Objectives • Demonstrate the value of a European Data Hub for the creation of RS-EBVs, leading to a GEOhub for EBVs that links the key policy/user network groups (GEO-BON, CBD and IPBES) with the space agencies. • Demonstrate the use of the European Data Hub for high-resolution RS-EBVs in habitat mapping (distribution, suitability and probability) in order to support the European Environment Agency (EEA) and its Topic Centre for Biological Diversity (ETC-BD). The integration of EO data with in-situ observations (vegetation relevés) will play an important role. Challenges • Incorporation of several RS-EBVs (e.g. phenology) to improve the distribution mapping of EUNIS habitats. • How far can the different aspects of the developed habitat modelling method (data & models) be integrated into the Cloud Sandbox Solution?
  19. 19. IP5: Air Pollution, Urban Growth, Health Risks in Megacities Pilot Scope • Analysis of air pollution trends, urban growth rates and health risk indicators for megacities by integrating EO data with the NextGEOSS infrastructure • New inputs from Sentinel-3, -5P, CAMS, WDC/RSAT Pilot Objectives • Develop a multi-sensor approach to analyse air pollution variability in megacities linked to urban growth rates • Develop a tool to analyse local trends and health risks using the NextGEOSS infrastructure • Exploit Copernicus data and services (Sentinel-3, -5P, CAMS) • Strengthen the link to the health community Challenges • Integrate with Copernicus data hubs and processing chains
  20. 20. BP3: Smart Cities Pilot Scope • Pilot based on work developed in the ESPRESSO H2020 support action. Smart cities use ISO 37120; the pilot examines how that standard maps onto the SDGs for EO and how smart city sensors can be integrated into in-situ EO Pilot Objectives • Mapping ISO 37120 and SDG, sensor integration in GC Challenges • Sensor standards in Smart Cities and standards in in-situ EO