Assessing the quality of volunteered weather observations to provide high-resolution weather maps

Irene Garcia-Marti
Gerard van der Schrier
Jan-Willem Noteboom
7th November 2019
Motivation
› Weather observations are crucial!
› Spatial sparsity is a challenge for high-res weather forecasts
› Increasing number of weather-related citizen science projects
– WOW, Wunderground, Netatmo, Meteoclimatic
6 November 2019, Koninklijk Nederlands Meteorologisch Instituut
1st September 2019: 1,400 million observations and 17K stations worldwide
WOW-NL
› 2015: KNMI partner of WOW
› Contributors: 400+ CWS
› Data NL+BE: 3.7M obs/month
› Devices: semi-professional
– Manufacturers: Davis, Oregon Scientific, Ventus, Alecto…
– Expected “reasonable” quality of the observations
WOW-NL
› Quality not only related to device:
– Good with respect to what? What variables are (not) properly monitored?
– Local processes: radiation, shadowing, siting
› Classical challenges of citizen science data:
– Gaps in data
– Noisy observations
– Discretization
– Scale of the phenomena
What is the quality of WOW-NL data?
1. Quality control
2. Interpolated maps
Preprocessing
[Diagram labels: WOW, JSON, CSV; 11.6M observations; 65 features; feature engineering; SVF]
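The JSON-to-CSV step named on this slide could look roughly like the sketch below. It is only an illustration: the field names (`station_id`, `time`, `temperature`) and the function `observations_to_csv` are hypothetical, and the actual WOW export format and the 65 features are not described in the slides.

```python
import csv
import io
import json


def observations_to_csv(json_text, fieldnames):
    """Flatten a JSON array of observation records into CSV text."""
    records = json.loads(json_text)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames,
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    for record in records:
        # Missing fields become empty cells, so gaps stay visible downstream.
        writer.writerow({name: record.get(name, "") for name in fieldnames})
    return buffer.getvalue()


# Hypothetical records: the second one lacks a temperature reading.
sample = (
    '[{"station_id": "A1", "time": "2019-09-01T12:00Z", "temperature": 18.4},'
    ' {"station_id": "B2", "time": "2019-09-01T12:00Z"}]'
)
csv_text = observations_to_csv(sample, ["station_id", "time", "temperature"])
```

Keeping missing values as empty cells, rather than dropping the record, leaves the "gaps in data" problem explicit for the later quality control step.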
Quality control
› Narrowing down variables
– Not reinventing the wheel:
▪ Temperature:
• (Napoly, 2018)
• (Meier, 2017)
• (Lussana, Titan project)
– QC based on (Napoly, 2018)
▪ Feasible to implement on WOW-NL
▪ Levels have been compacted
Each of the 11.6M observations goes through this workflow and is labeled with a quality level
Overview of the quality of WOW-NL
(Each square represents 10K observations)
M0: incorrect metadata (not shown)
M1: insufficient Z-score (presence of outliers)
M2: insufficient day/month coverage
M3: insufficient (Pearson) correlation
M4: OK
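The level scheme above can be sketched as a cascade of checks. This is a simplified illustration, not the (Napoly, 2018) procedure or the compacted WOW-NL implementation: the function `label_series`, the use of a reference series (for example a network median), and all thresholds (`z_max`, `min_coverage`, `min_corr`) are assumptions made for the example.

```python
import math
import statistics


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either series is constant."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy) if sxx > 0 and syy > 0 else 0.0


def label_series(metadata_ok, values, reference,
                 z_max=3.0, min_coverage=0.8, min_corr=0.9):
    """Label each observation of one station with a level M0..M4.

    values: station readings, None for gaps; reference: same-length series.
    Thresholds are illustrative assumptions, not values from the talk.
    """
    labels = [None] * len(values)
    present = [i for i, v in enumerate(values) if v is not None]
    if not present:
        return labels
    if not metadata_ok:                      # M0: incorrect metadata
        for i in present:
            labels[i] = "M0"
        return labels
    # M1: flag outliers by the z-score of the residual against the reference.
    residuals = [values[i] - reference[i] for i in present]
    mean = statistics.fmean(residuals)
    sd = statistics.stdev(residuals) if len(residuals) > 1 else 0.0
    kept = []
    for i, res in zip(present, residuals):
        if sd > 0 and abs(res - mean) / sd > z_max:
            labels[i] = "M1"
        else:
            kept.append(i)
    # M2: coverage, M3: correlation, M4: passed every check.
    if len(kept) / len(values) < min_coverage:
        level = "M2"
    elif pearson([values[i] for i in kept],
                 [reference[i] for i in kept]) < min_corr:
        level = "M3"
    else:
        level = "M4"
    for i in kept:
        labels[i] = level
    return labels


# 24 hourly values that follow the reference, with one spike at hour 5.
reference = [10.0 + 0.5 * h for h in range(24)]
station = list(reference)
station[5] += 10.0
labels = label_series(True, station, reference)
```

Each observation receives the level at which it stopped, so a count per level gives the kind of overview shown on this slide.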
Interpolation
› Setting up a baseline:
– Kriging (or GP)
▪ Ordinary: no {external drift, trend}
▪ Temporal resolution: hourly
Radiation bias?
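As a rough illustration of the ordinary kriging baseline, the sketch below estimates a value at one target location from a handful of stations for a single hour. The exponential variogram and its parameters (`sill`, `range_km`), the station coordinates, and the temperatures are all assumptions for the example; the slides do not state which variogram model or software was used.

```python
import math


def variogram(h, sill=1.0, range_km=25.0):
    """Exponential variogram without nugget (assumed model and parameters)."""
    return sill * (1.0 - math.exp(-h / range_km))


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def ordinary_kriging(points, values, target):
    """Return (estimate, weights) at target; the weights sum to one."""
    n = len(points)
    # Variogram matrix bordered by ones: the unbiasedness constraint.
    a = [[variogram(math.dist(points[i], points[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [variogram(math.dist(p, target)) for p in points] + [1.0]
    weights = solve(a, b)[:n]      # last unknown is the Lagrange multiplier
    return sum(w * z for w, z in zip(weights, values)), weights


# Hypothetical stations (km coordinates) and temperatures for one hour.
stations = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
temps = [12.0, 13.5, 11.0, 14.2]
estimate, weights = ordinary_kriging(stations, temps, (5.0, 5.0))
```

For hourly maps, a routine like this would be run once per hour and per grid cell. No external drift or trend term is included, which matches the "ordinary" baseline named above.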
Why is the quality of citizen science weather data important?
› If good enough:
– Open the door for new research:
▪ Fine-grained interpolated layers
▪ Nowcasting / hi-res weather
▪ Gridded weather variables
▪ Study local phenomena
– Governance level:
▪ Lower cost for administration
▪ Better weather forecast for underrepresented areas
▪ Bottom-up initiatives might work in developing regions
Not reinventing the wheel
› Met offices have enough knowledge to develop QCs
– Does it make sense?
– Collaboration: “the more, the merrier”
› Re-use pieces of software:
– Corrections: radiation, air pressure
– QCs for other variables: precipitation, wind
› Ideally: work together to build a pipeline with mechanistic, corrective, and statistical filters
Challenges and opportunities ahead
[Figure labels: “Big data problem!”, “We are here”]
Imperfect data, but volume is difficult to ignore
› Big Data problem
– Paradigm shift: code → data
– Adoption of cloud technologies
› Data analysis:
– Machine learning for QC: outlier detection, add external drifts
– Coupling volunteered data with weather models
– Study local patterns: urban wind dynamics, urban rainfall, effect of SVF on outliers
– Assess other quality control schemes
Questions? ☺
Thanks!