Open data distributed on Amazon’s cloud service
Roope Tervo, Mikko Visa
Finnish Meteorological Institute
• Finnish Meteorological Institute
opened its data in 2013.
• Basically everything that FMI has
property rights was opened.
• Both (near) real-time and historical
and climatological data.
• Data is provided in freely in machine
readable format.
9/12/2017 2
FMI Open Data
https://en.ilmatieteenlaitos.fi/open-data
Open data distributed on Amazon’s cloud service
FMI Open Data Portal follows INSPIRE requirements.
FMI Open Data
Data Portal
Meta data
Services
The very same data portal works as Open Data and
INSPIRE portal.
9/12/2017 4
ISO19115 WFS WMS
CSW
Grid Series
Observations
Time Series
Observations
Data
Models O&M
Simple
Feature
GRIB
NetCDF GeoTiff
Open data distributed on Amazon’s cloud service
FMI Open Data
Registration
o Registration is required to use View and Download
Services
o Working email address is the only mandatory
information
o After registration the user gets an API key which have to
be added into all requests
o GET parameter fmi-apikey=…&
o Header fmi-apikey; …
o Part of url http://wms.fmi.fi/fmi-apikey/…/wms?
o One can create several API keys with one email
9/12/2017 5Open data distributed on Amazon’s cloud service
FMI Open Data
Usage Limits
With one API key it’s allowed to
o do at most 20 000 requests per day to Download Service
o do at most 10 000 requests per day to View Service
o do at most 600 requests per 5 minutes to both services
o If all observations from one time step is calculated to as one,
little over 17 000 new data sets are published daily
o So, with one API key it’s allowed load everything once
o View service can be used for testing but can not be used as a
back end for popular clients
9/12/2017 6Open data distributed on Amazon’s cloud service
And a little over
830 000 data
downloads
per day
(9,6 req/s)
At the moment
about 11 700
registered users
Some Experiences
9/12/2017 7Open data distributed on Amazon’s cloud service
Motivation
Increase appropriability of
weather and climate data
• Data is valuable only when it
reach relevant audience
• Maximal coverage requires
several different channels
and services
 FMI joins Amazon’s Public
Datasets
9/12/2017
Open Source Software @ Finnish Meteorological
Institute | Roope Tervo, Harri Pietarila, Mikko
Rauhala
8
FMI OpenData on AWS
• FMI OpenData is also distributed on
Amazon Web Services (AWS) Cloud platform
• 2-years pilot
• Started in May 2017
• Hirlam surface and pressure levels in the first stage
• The objective is to
• increase the utility and effective use of weather and climate data
• support public-private-partnership
• Specially convenient for users who need the whole model data
• i.e for post-processing or generating map visualizations
• Licence: CC BY 4.0
9/12/2017 Open data distributed on Amazon’s cloud service 9
FMI OpenData on AWS
• Content
• Hirlam Surface
• Hirlam Pressure Levels
• Coverage: Europe
• Grid resolution: 7,5 km
• Updates: 4 times a day
• Time range: 54 hours (from model
run start)
• Time step: 1 hour
• Archive kept during the pilot
• Parameters:
http://en.ilmatieteenlaitos.fi/open-
data-on-aws-s3
9/12/2017 Open data distributed on Amazon’s cloud service 10
FMI OpenData on AWS
• Access through buckets:
• Surface data: fmi-opendata-rcrhirlam-surface-grib
• Pressure level data: fmi-opendata-rcrhirlam-pressure-grib
• Browse bucket content:
• http://fmi-opendata-rcrhirlam-surface-grib.s3-website-eu-west-
1.amazonaws.com/
• http://fmi-opendata-rcrhirlam-pressure-grib.s3-website-eu-west-
1.amazonaws.com/
• Public Amazon SNS topics are available for every new object added to
the Amazon S3:
• arn:aws:sns:eu-west-1:916174725480:new-fmi-opendata-rcrhirlam-surface-grib
• arn:aws:sns:eu-west-1:916174725480:new-fmi-opendata-rcrhirlam-pressure-
grib
9/12/2017 Open data distributed on Amazon’s cloud service 11
FMI OpenData on AWS
See documentation:
http://en.ilmatieteenlaitos.fi/open-data-on-aws-s3
9/12/2017 Open data distributed on Amazon’s cloud service 12
Requests by Data Format
in FMI Open Data Portal
9/12/2017 Open data distributed on Amazon’s cloud service 13
Requests and
ip-addresses in
July 2017
Hirlam Requests in binary
format by Channel
9/12/2017 Open data distributed on Amazon’s cloud service 14
Hirlam requests
in July 2017
Point requests are
still by far the most
popular type
For binary data
downloaders S3 is
already the most
popular channel
(in three months!)
Some Experiences
9/12/2017 16Open data distributed on Amazon’s cloud service
9/12/2017 Open data distributed on Amazon’s cloud service 17
“Use cases of cloud computing and requirements towards data distribution within a private meteorological service provider”
https://meetingorganizer.copernicus.org/EMS2017/Orals/25520/210111
www.fmi.fi
https://github.com/fmidev
https://en.ilmatieteenlaitos.fi/open-data
http://en.ilmatieteenlaitos.fi/open-data-on-aws-s3
http://roopetervo.com
http://www.slideshare.net/tervo

FMI Open Data on S3

  • 1.
    Open data distributedon Amazon’s cloud service Roope Tervo, Mikko Visa Finnish Meteorological Institute
  • 2.
    • Finnish MeteorologicalInstitute opened its data in 2013. • Basically everything that FMI has property rights was opened. • Both (near) real-time and historical and climatological data. • Data is provided in freely in machine readable format. 9/12/2017 2 FMI Open Data https://en.ilmatieteenlaitos.fi/open-data Open data distributed on Amazon’s cloud service
  • 3.
    FMI Open DataPortal follows INSPIRE requirements. FMI Open Data Data Portal Meta data Services The very same data portal works as Open Data and INSPIRE portal. 9/12/2017 4 ISO19115 WFS WMS CSW Grid Series Observations Time Series Observations Data Models O&M Simple Feature GRIB NetCDF GeoTiff Open data distributed on Amazon’s cloud service
  • 4.
    FMI Open Data Registration oRegistration is required to use View and Download Services o Working email address is the only mandatory information o After registration the user gets an API key which have to be added into all requests o GET parameter fmi-apikey=…& o Header fmi-apikey; … o Part of url http://wms.fmi.fi/fmi-apikey/…/wms? o One can create several API keys with one email 9/12/2017 5Open data distributed on Amazon’s cloud service
  • 5.
    FMI Open Data UsageLimits With one API key it’s allowed to o do at most 20 000 requests per day to Download Service o do at most 10 000 requests per day to View Service o do at most 600 requests per 5 minutes to both services o If all observations from one time step is calculated to as one, little over 17 000 new data sets are published daily o So, with one API key it’s allowed load everything once o View service can be used for testing but can not be used as a back end for popular clients 9/12/2017 6Open data distributed on Amazon’s cloud service
  • 6.
    And a littleover 830 000 data downloads per day (9,6 req/s) At the moment about 11 700 registered users Some Experiences 9/12/2017 7Open data distributed on Amazon’s cloud service
  • 7.
    Motivation Increase appropriability of weatherand climate data • Data is valuable only when it reach relevant audience • Maximal coverage requires several different channels and services  FMI joins Amazon’s Public Datasets 9/12/2017 Open Source Software @ Finnish Meteorological Institute | Roope Tervo, Harri Pietarila, Mikko Rauhala 8
  • 8.
    FMI OpenData onAWS • FMI OpenData is also distributed on Amazon Web Services (AWS) Cloud platform • 2-years pilot • Started in May 2017 • Hirlam surface and pressure levels in the first stage • The objective is to • increase the utility and effective use of weather and climate data • support public-private-partnership • Specially convenient for users who need the whole model data • i.e for post-processing or generating map visualizations • Licence: CC BY 4.0 9/12/2017 Open data distributed on Amazon’s cloud service 9
  • 9.
    FMI OpenData onAWS • Content • Hirlam Surface • Hirlam Pressure Levels • Coverage: Europe • Grid resolution: 7,5 km • Updates: 4 times a day • Time range: 54 hours (from model run start) • Time step: 1 hour • Archive kept during the pilot • Parameters: http://en.ilmatieteenlaitos.fi/open- data-on-aws-s3 9/12/2017 Open data distributed on Amazon’s cloud service 10
  • 10.
    FMI OpenData onAWS • Access through buckets: • Surface data: fmi-opendata-rcrhirlam-surface-grib • Pressure level data: fmi-opendata-rcrhirlam-pressure-grib • Browse bucket content: • http://fmi-opendata-rcrhirlam-surface-grib.s3-website-eu-west- 1.amazonaws.com/ • http://fmi-opendata-rcrhirlam-pressure-grib.s3-website-eu-west- 1.amazonaws.com/ • Public Amazon SNS topics are available for every new object added to the Amazon S3: • arn:aws:sns:eu-west-1:916174725480:new-fmi-opendata-rcrhirlam-surface-grib • arn:aws:sns:eu-west-1:916174725480:new-fmi-opendata-rcrhirlam-pressure- grib 9/12/2017 Open data distributed on Amazon’s cloud service 11
  • 11.
    FMI OpenData onAWS See documentation: http://en.ilmatieteenlaitos.fi/open-data-on-aws-s3 9/12/2017 Open data distributed on Amazon’s cloud service 12
  • 12.
    Requests by DataFormat in FMI Open Data Portal 9/12/2017 Open data distributed on Amazon’s cloud service 13 Requests and ip-addresses in July 2017
  • 13.
    Hirlam Requests inbinary format by Channel 9/12/2017 Open data distributed on Amazon’s cloud service 14 Hirlam requests in July 2017
  • 14.
    Point requests are stillby far the most popular type For binary data downloaders S3 is already the most popular channel (in three months!) Some Experiences 9/12/2017 16Open data distributed on Amazon’s cloud service
  • 15.
    9/12/2017 Open datadistributed on Amazon’s cloud service 17 “Use cases of cloud computing and requirements towards data distribution within a private meteorological service provider” https://meetingorganizer.copernicus.org/EMS2017/Orals/25520/210111
  • 16.

Editor's Notes

  • #6 - Registration is not very convenient for users and is against open data principles but it provides us a useful information about the usage and makes it easier for us to prevent misusage of the portal
  • #9 We have done well but this is not enough
  • #11 weather forecast model Covers europe Updates 4 times a day Two days ahead Archive is provided
  • #14 The pilot have been going on for 3 months now How well has it succeed? Before showing actual numbers I want to show some context This chart show how requests are divided between different data models FMI provides the same data in different formats and data models Three columns on the right side of the graphs are point data requests and the invisible column on the left side is grid data requests (in binaray format) So: most of the requests are point data requests But there are still almost 40 thousand grid data requests from 259 different ip addresses during July
  • #15 This chart shows how Hirlam weather forecast data is divided between FMI data portal and S3 The blue chart shows number of requests, based on that S3 is far more popular But data in S3 is divided in a way that every parameter (like temperature or precipitation) is in separate files and from FMI data portal one can download everything with one request If we assume that every S3 user downloads everything, we can compare a popularity of these channels /shown in red column) and we can see that we get about 13 thousand requests more on S3 than on FMI data portal