Hello Sensor
1
Agenda
1. Weather Underground Introduction
2. Making Your Own PWS
3. Data Ingestion & QC
4. API
2
Weather Underground Intro
3
What is Weather Underground?
● Web
● Flagship app
● Storm
● WunderStation
● PWS Network
● API
4
Web
● Powered by 200k+
weather stations
● Visually engaging
● Provides low-level
weather data
5
Flagship App
● The most hyperlocal forecasts
● Data presented in a stunningly
simple interface
6
Storm
● The best app for the worst
weather
● Highest resolution radar
● Severe weather alerts
7
WunderStation
● Personalized weather
dashboard
● Features your own PWS data
8
PWS Network
● There are about 12k
government-provided
weather stations
● We fill in the gaps with
over 200k Personal
Weather Stations
9
Making Your Own PWS
10
What is a Weather Station
Traditional stations
Qualitative reporting (crowd reports)
Image recognition
Phone Sensors
Car sensors
Maker Station
11
Weather-hungry data monsters
To serve globally we need more data
- Engage with local met offices (if they exist)
- Engage with education/maker community
More data, better data = better forecasts.
12
Roll your own
Open source weather stations make IoT and
weather more available/flexible for local
needs
Can be part of an education program
13
What does it take
1. Sensor (temperature, precipitation, humidity, UV, etc.)
2. Controller (Arduino, Particle, etc.)
3. Memory and/or transmitter (flash, WiFi, cellular)
4. Power (solar, battery, mains)
14
Station challenges
Hardware:
1. Power (limits everything)
2. Transmit (expensive power budget item)
3. Durability (usually moving parts)
4. Sensors (minor technical issues)
5. Controller (very low requirements)
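To see why power and transmit dominate the list above, a rough battery-life estimate helps. This is a minimal sketch; every number in it (current draw, wake time, battery capacity) is an illustrative assumption and not a measurement from any station in this talk.

```python
def average_current_ma(active_ma, active_s, sleep_ma, period_s):
    """Average current over one report cycle: awake for active_s, asleep otherwise."""
    sleep_s = period_s - active_s
    return (active_ma * active_s + sleep_ma * sleep_s) / period_s


def battery_life_days(capacity_mah, avg_ma):
    """Ideal battery life in days, ignoring self-discharge and temperature effects."""
    return capacity_mah / avg_ma / 24.0


# Assumed values: 80 mA while the radio is on for 5 s, 0.1 mA asleep, 2000 mAh battery.
fast = average_current_ma(active_ma=80.0, active_s=5.0, sleep_ma=0.1, period_s=60.0)
slow = average_current_ma(active_ma=80.0, active_s=5.0, sleep_ma=0.1, period_s=600.0)

print(f"report every 1 min:  {battery_life_days(2000.0, fast):.1f} days")
print(f"report every 10 min: {battery_life_days(2000.0, slow):.1f} days")
```

Under these assumptions almost all of the energy goes to the awake/transmit window, so reporting less often stretches battery life far more than picking a lower-power controller would.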
15
Station challenges
Biggest contributor to data variation:
Enclosure design
The Effectiveness of the ASOS, MMTS, Gill, and CRS Air Temperature Radiation Shields: K. G. Hubbard, X. Lin, and E. A. Walter-Shea
16
Tiny WiFi
Tiny WiFi-connected station
Limited battery life
Used to monitor a terrarium
17
Ol' Faithful
Good reliability, online over a year
Solar and battery powered
Enclosure made from
~$6 garden supplies
Particle Photon (WiFi)
SparkFun Weather Shield
-HTU21D humidity sensor
-MPL3115A2 pressure sensor
18
Cell-o there
Particle Electron: cell radio + microcontroller
BMP280: temperature and pressure sensor (humidity requires the BME280 variant)
Enclosure made from a painted soda cup
Data is good if kept in shade; however,
no venting = heat buildup
OK as a proof of concept, needs refinement
19
Data Ingestion & QC
20
Ingestion
Rapidfire
● Ingests and stores data reported as often as one observation
every 2 seconds
● Stores the latest data in a current-conditions file and records history
at resolutions as fine as one observation every 5 seconds
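The two Rapidfire behaviors above (always update current conditions, thin out the history) can be sketched as follows. This is only an illustration: the `StationStore` class, its in-memory dictionaries, and the field names are hypothetical, and the real storage layer is not described in these slides.

```python
class StationStore:
    """Keeps the latest observation per station plus a rate-limited history."""

    def __init__(self, history_interval_s=5):
        self.history_interval_s = history_interval_s
        self.current = {}  # station_id -> (timestamp_s, observation)
        self.history = {}  # station_id -> list of (timestamp_s, observation)

    def ingest(self, station_id, timestamp_s, observation):
        # Current conditions always reflect the newest report.
        self.current[station_id] = (timestamp_s, observation)
        # History keeps a record only if enough time has passed since the last one.
        records = self.history.setdefault(station_id, [])
        if not records or timestamp_s - records[-1][0] >= self.history_interval_s:
            records.append((timestamp_s, observation))


store = StationStore()
for ts in range(0, 14, 2):  # one report every 2 seconds: 0, 2, ..., 12
    store.ingest("KCASANFR0", ts, {"temp_f": 60.0 + ts / 10})

print(store.current["KCASANFR0"][0])
print([ts for ts, _ in store.history["KCASANFR0"]])
```

The station ID is a made-up placeholder. With reports every 2 seconds and a 5-second history interval, history ends up holding roughly every third report while current conditions track every one.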
21
Quality Control (QC)
Before QC
22
Quality Control (QC)
After QC
23
Quality Control (QC)
24
The QC Checks
● Range Check
● Stuck Sensor Check
● Neighbor Check
25
Range Check
Have these readings ever happened on Earth?
Temperature < -130°F or > 135°F.
Dew Point < -90°F or > 90°F.
Wind Speed < 0 mph or > 279 mph.
Wind Direction < 0° or > 360°.
Pressure < 846 hPa or > 1100 hPa.
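The limits above map naturally onto a lookup table. The sketch below is one possible shape for such a check, not the production implementation; the field names are hypothetical, and the pressure bounds are assumed to be in hPa (millibars), since typical sea-level pressure is about 1013 hPa.

```python
# Limits from the slide; field names are hypothetical.
LIMITS = {
    "temp_f": (-130.0, 135.0),
    "dewpoint_f": (-90.0, 90.0),
    "wind_mph": (0.0, 279.0),
    "wind_dir_deg": (0.0, 360.0),
    "pressure_hpa": (846.0, 1100.0),  # assumed unit: hPa (millibars)
}


def range_check(obs):
    """Return the names of fields whose values fall outside LIMITS."""
    failed = []
    for field, (low, high) in LIMITS.items():
        value = obs.get(field)
        if value is None:
            continue  # a missing field is not judged by this check
        if value < low or value > high:
            failed.append(field)
    return failed


good = {"temp_f": 72.5, "dewpoint_f": 55.0, "wind_mph": 8.0,
        "wind_dir_deg": 270.0, "pressure_hpa": 1013.2}
bad = dict(good, temp_f=180.0, wind_mph=-3.0)

print(range_check(good))
print(range_check(bad))
```

Returning the list of failing fields, instead of a single pass/fail flag, lets a caller drop one bad sensor while keeping the rest of the observation.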
26
Stuck Sensor Check
Has the temperature changed in the past 6 hours?
● by at least 0.1°F
● lack of change is often an indication of
other stuck sensors as well
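A stuck-sensor test along these lines could compare the highest and lowest temperature seen in the last 6 hours. This is a minimal sketch; the `(timestamp, temperature)` tuple layout and the decision to pass stations with fewer than two readings are assumptions.

```python
def is_stuck(readings, now_s, window_s=6 * 3600, min_change_f=0.1):
    """readings: list of (unix_time_s, temp_f) pairs.

    Returns True when the temperature spread inside the window is below
    min_change_f, i.e. the sensor has not moved by at least 0.1 °F.
    """
    recent = [temp for ts, temp in readings if now_s - window_s <= ts <= now_s]
    if len(recent) < 2:
        return False  # assumption: too little data to call the sensor stuck
    # Round to avoid float noise such as 70.1 - 70.0 == 0.0999999...
    spread = round(max(recent) - min(recent), 3)
    return spread < min_change_f


now = 6 * 3600
flat = [(i * 600, 70.0) for i in range(37)]                     # never changes
moving = [(i * 600, 70.0 + 0.1 * (i % 2)) for i in range(37)]   # toggles by 0.1 °F

print(is_stuck(flat, now))
print(is_stuck(moving, now))
```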
27
Neighbor Check
Is the temperature of this station similar to the majority of stations nearby?
● collect sensors within 15 km of the current sensor
● find clusters divided by 3°F
● determine majority cluster(s)
● throw out statistical outliers
Most essential customer-facing check
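One way to read the steps above: gather neighbors within 15 km, sort their temperatures, start a new cluster wherever two consecutive values differ by more than 3°F, and accept the station only if it sits with a largest cluster. The sketch below follows that interpretation; the gap-based clustering, the haversine distance, and the dictionary field names are all assumptions, since the slides do not give the exact algorithm.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometers."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def cluster_by_gap(temps, gap_f=3.0):
    """Split sorted temperatures wherever consecutive values differ by more than gap_f."""
    clusters = []
    for t in sorted(temps):
        if clusters and t - clusters[-1][-1] <= gap_f:
            clusters[-1].append(t)
        else:
            clusters.append([t])
    return clusters


def neighbor_check(station, others, radius_km=15.0, gap_f=3.0):
    """True if the station's temperature fits a majority cluster of nearby stations."""
    nearby = [o["temp_f"] for o in others
              if haversine_km(station["lat"], station["lon"], o["lat"], o["lon"]) <= radius_km]
    if not nearby:
        return True  # assumption: no neighbors means nothing to contradict the reading
    clusters = cluster_by_gap(nearby, gap_f)
    largest = max(len(c) for c in clusters)
    majority = [c for c in clusters if len(c) == largest]
    t = station["temp_f"]
    return any(c[0] - gap_f <= t <= c[-1] + gap_f for c in majority)


neighbors = [
    {"lat": 37.78, "lon": -122.42, "temp_f": 60.0},
    {"lat": 37.76, "lon": -122.43, "temp_f": 61.0},
    {"lat": 37.77, "lon": -122.41, "temp_f": 62.0},
    {"lat": 37.79, "lon": -122.42, "temp_f": 75.0},  # nearby outlier
    {"lat": 38.50, "lon": -122.42, "temp_f": 90.0},  # too far away to count
]
print(neighbor_check({"lat": 37.77, "lon": -122.42, "temp_f": 61.5}, neighbors))
print(neighbor_check({"lat": 37.77, "lon": -122.42, "temp_f": 75.0}, neighbors))
```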
28
Neighbor Check
29
The Next Step - QC on Ingest
● Current QC
○ runs on a 15-minute cycle, so bad observations can linger on the site
and apps until the next run
○ written in multi-threaded C++ code that is difficult to maintain and extend
● IBM Streams + QC
○ clean obs all the time
○ written in single-threaded Python with better performance, stability, and
extensibility, plus third-party libraries like Spark and support for modern
technologies like JSON and REST
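The difference between a 15-minute batch cycle and QC on ingest can be illustrated with a plain Python generator that annotates each observation the moment it arrives. This is a stand-in for a stream operator, not the IBM Streams API; the function name, the check list, and the observation fields are hypothetical.

```python
def qc_on_ingest(observations, checks):
    """Yield each observation annotated with QC results as soon as it arrives.

    checks: list of (name, predicate) pairs; a predicate returns True when
    the observation passes.
    """
    for obs in observations:
        failures = [name for name, check in checks if not check(obs)]
        yield dict(obs, qc_passed=not failures, qc_failures=failures)


checks = [
    ("temp_range", lambda o: -130.0 <= o["temp_f"] <= 135.0),
]
incoming = iter([
    {"station": "A", "temp_f": 71.2},
    {"station": "B", "temp_f": 212.0},  # fails the range check
])

results = list(qc_on_ingest(incoming, checks))
for r in results:
    print(r["station"], r["qc_passed"], r["qc_failures"])
```

Because each record carries its own QC verdict, downstream consumers never see an unchecked observation, which is the point of moving QC from a periodic job to the ingest path.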
30
API
31
200,000+ Personal Weather Stations
2.2 billion forecast locations | 180M consumers / month
32
33
Uptime: 99.95%
Latency: ~25 ms
Autoscale to 20B requests per day
Scalability
Average: tens of billions of requests per day
Global Coverage
(US East, US West, EU, Asia)
Partial deployments
Versioned artifacts and rollbacks
Faster code to prod:
Less dependency between teams
Your favorite tech /
language here
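To put the uptime and autoscale figures on this slide in more familiar units, the short calculation below converts 99.95% uptime into hours of allowed downtime per year and 20B requests per day into an average per-second rate. It assumes a 365-day year and evenly spread traffic, so real peaks would be higher.

```python
HOURS_PER_YEAR = 365 * 24
SECONDS_PER_DAY = 24 * 60 * 60


def downtime_hours_per_year(uptime_percent):
    """Hours per year a service may be down at the given uptime percentage."""
    return (100.0 - uptime_percent) / 100.0 * HOURS_PER_YEAR


def average_requests_per_second(requests_per_day):
    """Average request rate, assuming traffic is spread evenly over the day."""
    return requests_per_day / SECONDS_PER_DAY


downtime = downtime_hours_per_year(99.95)
rate = average_requests_per_second(20e9)
print(f"{downtime:.2f} hours of downtime per year")
print(f"{rate:,.0f} requests per second on average")
```

That works out to roughly 4.4 hours of downtime per year and about 231,000 requests per second.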
34
Architecture: Storage Polyglot
Real time data
and caching
Historical weather data
Data Migration
Gateway Data
Analytics
Archives
Images
Videos
Analytics
Informatica
Drupal
35
Thank you!
36
Questions?
37

Weather Underground - PWS, Data Ingestion and APIs
