Forecasting Fine-Grained Air Quality
Based on Big Data
Date: 2015/10/15
Author: Yu Zheng, Xiuwen Yi, Ming Li, Ruiyuan Li, Zhangqing Shan, Eric Chang, Tianrui Li
Source: KDD '15
Advisor: Jia-ling Koh
Speaker: LIN, CI-JIE
Outline
Introduction
Method
Experiment
Conclusion
Introduction
• People are increasingly concerned about air pollution, which affects human health and sustainable development around the world
• There is a rising demand for predictions of future air quality, which can inform people's decision making
Challenges
• Multiple complex factors vs. insufficient and inaccurate data
• Urban air quality changes significantly over location and time
• Inflection points and sudden changes
[Figure: A) Monitoring stations; B) Distribution of the max-min gaps; C) AQI of different stations changing over time of day, with inflection points marked. AQI legend: Good [0-50), Moderate [50-100), Unhealthy for sensitive [100-150), Unhealthy [150-200), Very Unhealthy [200-300)]
Introduction
• Goal: construct a real-time air quality forecasting system that uses data-driven models to predict fine-grained air quality over the following 48 hours (the first 6, 7-12, 13-24, and 25-48 hours)
Architecture of our system
Framework
[Figure: framework of the model. Components: Temporal Predictor, Spatial Predictor, Prediction Aggregator, and Inflection Predictor. Inputs: local data (recent AQI, recent meteorology, weather forecast, shape features) and spatial neighbor data (AQI, recent meteorology). Other labels: selected factors, threshold. Output: final AQI.]
Temporal Predictor (TP)
• Predicts the air quality of a station mainly from its own historical and forecast conditions (local data)
• A linear regression models the local change of air quality
• One model is trained for each of the next six hours; for each later time interval (from 7 to 48 hours), two models are trained to predict its maximum and minimum values
[Figure: timeline from the past hours t_c-h+1, ..., t_c-1, t_c to the prediction targets t_c+1, ..., t_c+6 and the intervals t_c+7 to t_c+12, t_c+13 to t_c+24, and t_c+25 to t_c+48]
Features
• The AQIs of the past h hours at the station
• The local meteorology (such as sunny, overcast, cloudy, foggy, humidity, wind speed, and wind direction) at the current time t_c
• Time of day and day of the week
• The weather forecasts (including sunny/overcast/cloudy, wind speed, and wind direction) for the time interval to be predicted
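The target layout described above (one value per hour for the next six hours, then a maximum and a minimum for each later interval) can be sketched as follows. This is only an illustration: the names `build_instance` and `INTERVALS` are hypothetical, the series is assumed to be hourly AQI readings of one station, and only the past-AQI part of the feature set is built. The linear regression fit itself is left out.

```python
from typing import Dict, List, Tuple

# Later intervals (hours after t_c); each gets a max target and a min target.
INTERVALS: List[Tuple[int, int]] = [(7, 12), (13, 24), (25, 48)]


def build_instance(aqi: List[float], t_c: int, h: int) -> Tuple[List[float], Dict[str, float]]:
    """Return (past-AQI features, regression targets) for the current index t_c."""
    if t_c - h + 1 < 0 or t_c + 48 >= len(aqi):
        raise ValueError("not enough history or future data around t_c")
    # Features: the AQIs of the past h hours, ending at t_c.
    features = aqi[t_c - h + 1 : t_c + 1]
    targets: Dict[str, float] = {}
    # One target per hour for the next six hours.
    for k in range(1, 7):
        targets[f"h+{k}"] = aqi[t_c + k]
    # A maximum and a minimum target for each later interval.
    for lo, hi in INTERVALS:
        window = aqi[t_c + lo : t_c + hi + 1]
        targets[f"max[{lo}-{hi}]"] = max(window)
        targets[f"min[{lo}-{hi}]"] = min(window)
    return features, targets


# Toy hourly series, only to show the shapes involved.
series = [float(v) for v in range(100)]
features, targets = build_instance(series, t_c=10, h=3)
```

Each key in `targets` corresponds to one separately trained model, giving 6 + 2 × 3 = 12 models per station in this sketch.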
Spatial Predictor (SP)
• Models the spatial correlation of air pollution
• Predicts the air quality of a location from the status of other locations, consisting of AQIs and meteorological data
• Multiple spatial predictors are trained, one for each future time interval
• Two major steps:
  ◦ Spatial partition and aggregation
  ◦ Prediction based on a neural network
Spatial partition and aggregation
• Partition the space around the location into regions by using three circles with different diameters
• For each region, calculate the average AQI for a given kind of air pollutant; do the same for temperature and humidity
• Each region then has only one set of aggregated air quality readings and meteorology
[Figure: A) Spatial partition; B) Spatial aggregation]
Spatial Predictor
• Features of SP
  ◦ The AQIs of the past three hours (AQI_i)
  ◦ Meteorological features (M_i), including the wind speed and direction, at the current time t_c
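A minimal sketch of the partition-and-aggregation step, assuming that regions are the rings between three concentric circles centered on the target location. The circle sizes in `RADII_KM` are made-up values (the slides give none), the names `ring_index` and `aggregate_by_ring` are hypothetical, and station positions are given as offsets in kilometers from the target. Any further subdivision of the rings is not modeled here.

```python
import math
from typing import Dict, List, Optional, Tuple

# Assumed circle radii in kilometers; the slides do not state the actual sizes.
RADII_KM: Tuple[float, float, float] = (30.0, 100.0, 300.0)


def ring_index(dx_km: float, dy_km: float,
               radii: Tuple[float, ...] = RADII_KM) -> Optional[int]:
    """Index of the innermost circle containing the point, or None if outside all."""
    distance = math.hypot(dx_km, dy_km)
    for i, radius in enumerate(radii):
        if distance <= radius:
            return i
    return None


def aggregate_by_ring(stations: List[Tuple[float, float, float]],
                      radii: Tuple[float, ...] = RADII_KM) -> Dict[int, float]:
    """Average the AQI of the stations in each ring.

    stations holds (dx_km, dy_km, aqi) tuples relative to the target location.
    Rings without stations are absent from the result.
    """
    sums: Dict[int, float] = {}
    counts: Dict[int, int] = {}
    for dx, dy, aqi in stations:
        idx = ring_index(dx, dy, radii)
        if idx is None:
            continue  # station lies outside the largest circle
        sums[idx] = sums.get(idx, 0.0) + aqi
        counts[idx] = counts.get(idx, 0) + 1
    return {idx: sums[idx] / counts[idx] for idx in sums}


neighbors = [(10.0, 0.0, 100.0), (0.0, 20.0, 120.0), (50.0, 0.0, 80.0), (0.0, 400.0, 60.0)]
region_aqi = aggregate_by_ring(neighbors)
```

The same averaging would apply to temperature and humidity, so that each region contributes a single aggregated reading to the neural network input.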
Prediction Aggregator (PA)
• The prediction aggregator dynamically integrates the predictions that the spatial and temporal predictors have made for a location
• Feature set:
  ◦ Wind speed, wind direction, humidity, sunny, cloudy, overcast, and foggy
  ◦ The predictions generated by the spatial and temporal predictors
  ◦ The corresponding ΔAQI (from the ground truth)
• A Regression Tree (RT) is trained to model the dynamic combination of these factors and predictions
Prediction Aggregator (PA)
[Figure: example regression tree. Internal nodes split on Spatial, Temporal, Foggy, Humidity, and Wind speed; the leaves are linear models LM1 to LM8.]
LM 3:
AQI = 0.666×Spatial + 0.1627×Temporal + 0.001×isSunnyCloudyOvercast + 0.002×Foggy - 0.001×Wind_Dir_SE - 0.022×Wind_Dir_NE - 0.003×WinSpeed - 0.0003×Humidity - 0.0452
LM 2:
AQI = 0.186×Spatial + 2.52×Temporal + 0.001×SunnyCloudyOvercast + 0.002×Foggy - 0.001×Wind_Dir_SE - 0.09×Wind_Dir_NE - 0.007×WinSpeed - 0.001×Humidity + 0.399
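To make the role of the leaves concrete, the sketch below evaluates the two leaf models LM 2 and LM 3 with the coefficients shown above. The function `leaf_predict` and the dictionary layout are hypothetical. The code assumes that `isSunnyCloudyOvercast` and `SunnyCloudyOvercast` name the same indicator, and that inputs are on whatever scale the tree was trained on, which the slide does not state. The routing from the root to a leaf is not reproduced, because the tree structure cannot be recovered from the slide.

```python
from typing import Dict

# Coefficients copied from the LM 3 and LM 2 leaf models on the slide.
LM3: Dict[str, float] = {
    "Spatial": 0.666, "Temporal": 0.1627, "SunnyCloudyOvercast": 0.001,
    "Foggy": 0.002, "Wind_Dir_SE": -0.001, "Wind_Dir_NE": -0.022,
    "WinSpeed": -0.003, "Humidity": -0.0003,
}
LM3_INTERCEPT = -0.0452

LM2: Dict[str, float] = {
    "Spatial": 0.186, "Temporal": 2.52, "SunnyCloudyOvercast": 0.001,
    "Foggy": 0.002, "Wind_Dir_SE": -0.001, "Wind_Dir_NE": -0.09,
    "WinSpeed": -0.007, "Humidity": -0.001,
}
LM2_INTERCEPT = 0.399


def leaf_predict(coeffs: Dict[str, float], intercept: float, x: Dict[str, float]) -> float:
    """Evaluate one linear leaf model; features missing from x count as 0."""
    return intercept + sum(weight * x.get(name, 0.0) for name, weight in coeffs.items())


# Same inputs, different leaves: LM 3 leans on the spatial prediction,
# LM 2 leans on the temporal prediction.
sample = {"Spatial": 1.0, "Temporal": 1.0}
lm3_value = leaf_predict(LM3, LM3_INTERCEPT, sample)
lm2_value = leaf_predict(LM2, LM2_INTERCEPT, sample)
```

The differing weights on `Spatial` and `Temporal` are what the slides mean by a dynamic combination: which predictor dominates depends on the leaf that the current weather conditions select.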
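Inflection Predictor
• The air quality of a location can change sharply within a few hours
• Such sudden changes are too infrequent to be predicted by the general models
• The inflection predictor (IP) is invoked only to handle sudden changes
• The system therefore needs to know when to invoke the IP model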
Inflection Predictor
1. Select the sudden drop instances D_i from the historical data D
   • An instance is selected when its AQI is higher than 200 and decreases by more than a threshold in the next few hours
2. Find surpassing ranges and categories
[Figure: A) Select sudden drop instances D_i from D (sets D, D_i, D_t); B) Distributions of a continuous feature in D_i and D - D_i, over ranges a1 to a4; C) Distributions of a discrete feature in D_i and D - D_i, over categories c1 to c4]
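Step 1 can be sketched as a scan over an hourly AQI series. The function name `select_sudden_drops` and the parameters `drop_threshold` and `horizon` are hypothetical; the slides give only the AQI level of 200 and do not state the drop threshold or how many hours "the next few hours" covers.

```python
from typing import List


def select_sudden_drops(aqi: List[float], drop_threshold: float, horizon: int,
                        level: float = 200.0) -> List[int]:
    """Return the indices t where aqi[t] > level and the AQI falls by more
    than drop_threshold at some point within the next `horizon` hours."""
    selected: List[int] = []
    for t in range(len(aqi) - horizon):
        if aqi[t] <= level:
            continue  # only heavily polluted hours are candidates
        future_min = min(aqi[t + 1 : t + horizon + 1])
        if aqi[t] - future_min > drop_threshold:
            selected.append(t)
    return selected


hourly = [250.0, 240.0, 120.0, 110.0, 210.0, 205.0, 200.0, 190.0]
drop_indices = select_sudden_drops(hourly, drop_threshold=100.0, horizon=2)
```

In this toy series only the first two hours qualify: both exceed 200 and are followed by a fall of 130 within two hours, while the later hours above 200 decline too slowly.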
Inflection Predictor (IP)
E = \max\left( \frac{|x_1|}{|D_i|} - \frac{|x_2|}{|D - D_i|} \right) \times \frac{\Delta|x_1|}{\Delta|x_2|}
D_t = x_1 \cup x_2 is the collection of instances retrieved by a set of surpassing ranges and categories (x_1 are the retrieved instances from D_i, x_2 those from D - D_i)
3. Select surpassing ranges and categories as thresholds
   • There are multiple surpassing ranges and categories, and some of them may not be discriminative enough
   • We need to find a set of surpassing ranges and categories to use as thresholds, with which we can retrieve as many instances from D_i as possible while involving as few instances from D - D_i as possible
   • The problem can be solved by using Simulated Annealing
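The selection in step 3 can be illustrated with a generic simulated annealing loop over subsets of candidate ranges and categories. This is a sketch under several assumptions, not the authors' procedure: each candidate is represented by the set of instance ids it retrieves, a chosen subset retrieves the union of those sets, and the objective is simplified to the fraction of D_i retrieved minus the fraction of D - D_i retrieved (the ratio term of E is left out). The names `score` and `anneal` and the cooling parameters are hypothetical.

```python
import math
import random
from typing import FrozenSet, List, Set, Tuple


def score(chosen: Set[int], rules: List[FrozenSet[int]],
          drops: FrozenSet[int], n_total: int) -> float:
    """Fraction of D_i retrieved minus fraction of D - D_i retrieved."""
    retrieved: Set[int] = set()
    for r in chosen:
        retrieved |= rules[r]
    x1 = len(retrieved & drops)   # retrieved sudden-drop instances
    x2 = len(retrieved - drops)   # retrieved ordinary instances
    return x1 / len(drops) - x2 / (n_total - len(drops))


def anneal(rules: List[FrozenSet[int]], drops: FrozenSet[int], n_total: int,
           steps: int = 2000, t0: float = 1.0, cooling: float = 0.995,
           seed: int = 0) -> Tuple[Set[int], float]:
    """Search for a high-scoring subset of rule indices."""
    rng = random.Random(seed)
    current: Set[int] = set()
    current_score = score(current, rules, drops, n_total)
    best, best_score = set(current), current_score
    temp = t0
    for _ in range(steps):
        candidate = set(current)
        candidate ^= {rng.randrange(len(rules))}  # flip one rule in or out
        cand_score = score(candidate, rules, drops, n_total)
        delta = cand_score - current_score
        # Always accept improvements; accept worse moves with a
        # probability that shrinks as the temperature cools.
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, current_score = candidate, cand_score
            if current_score > best_score:
                best, best_score = set(current), current_score
        temp *= cooling
    return best, best_score


# Toy data: instances 0..9, of which 0, 1, 2 are sudden drops.
rules = [frozenset({0, 1, 2}), frozenset({3, 4, 5, 6}), frozenset({2, 7})]
drops = frozenset({0, 1, 2})
best, best_score = anneal(rules, drops, n_total=10)
```

Accepting some worse moves early on is what lets the search leave a poor subset; tracking `best` separately means the returned subset is never worse than the empty starting point.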
Ranges/categories      |x1|/|Di|   |x2|/|D-Di|   ∆|x1|/∆|x2|   E
WinSpeed: 13.9-max     0.130       0.031         0.065         0.006
Humidity: 1-40         0.380       0.173         0.128         0.026
Downpour               0.382       0.174         0.714         0.149
Wind Northwest         0.478       0.263         0.078         0.017
Sunny                  0.643       0.405         0.084         0.020
Moderate rainy         0.680       0.437         0.087         0.020
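As a quick check of how the columns relate, the snippet below recomputes E for each row as (|x1|/|Di| - |x2|/|D-Di|) × ∆|x1|/∆|x2| and compares it with the reported value. The names `ROWS` and `e_score` are hypothetical; the numbers are copied from the table. The recomputed values agree with the reported E to within about 0.002, which is consistent with the inputs having been rounded to three decimals.

```python
from typing import List, Tuple

# (range or category, |x1|/|Di|, |x2|/|D-Di|, delta ratio, reported E)
ROWS: List[Tuple[str, float, float, float, float]] = [
    ("WinSpeed:13.9-max", 0.130, 0.031, 0.065, 0.006),
    ("Humidity:1-40", 0.380, 0.173, 0.128, 0.026),
    ("Downpour", 0.382, 0.174, 0.714, 0.149),
    ("Wind Northwest", 0.478, 0.263, 0.078, 0.017),
    ("Sunny", 0.643, 0.405, 0.084, 0.020),
    ("Moderate rainy", 0.680, 0.437, 0.087, 0.020),
]


def e_score(frac_x1: float, frac_x2: float, delta_ratio: float) -> float:
    """Coverage gap between D_i and D - D_i, scaled by the delta ratio."""
    return (frac_x1 - frac_x2) * delta_ratio


for name, frac_x1, frac_x2, delta_ratio, reported in ROWS:
    print(f"{name:20s} recomputed={e_score(frac_x1, frac_x2, delta_ratio):.4f} reported={reported:.3f}")
```

Downpour stands out because its delta ratio is far larger than that of the other rows, even though its coverage gap is similar to that of low humidity.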
Inflection Predictor (IP)
4. Train an inflection predictor with D_t
   • The features used by the inflection predictor to determine the specific drop values are the same as those of the temporal predictor
   • The inflection predictor is based on an RT
   • The output of the inflection predictor is a delta of AQI to be added to the final result
Datasets
Results
City        1-6h          7-12h         13-24h        25-48h        Sudden changes
            p      e      p      e      p      e      p      e      p      e
Beijing     0.750  30     0.62   64     0.53   78.3   0.496  81.1   0.300  78.3
Tianjin     0.746  31     0.634  62.1   0.595  67.4   0.579  68.6   0.437  70.9
Guangzhou   0.805  13     0.748  23.9   0.714  26.8   0.681  29.5   0.477  54.6
Shenzhen    0.838  8.4    0.764  17.6   0.728  20     0.689  22.8   0.575  45.3
p = 1 - \frac{\sum_i |\hat{y}_i - y_i|}{\sum_i y_i}, \qquad e = \frac{\sum_i |\hat{y}_i - y_i|}{n}
where \hat{y}_i is the predicted AQI, y_i is the ground truth, and n is the number of instances.
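The two metrics can be computed directly from paired predictions and ground-truth values. The sketch below assumes the hatted term in the formula is the prediction; the function name `accuracy_and_error` is hypothetical.

```python
from typing import List, Tuple


def accuracy_and_error(pred: List[float], truth: List[float]) -> Tuple[float, float]:
    """Return (p, e): p = 1 - sum|pred - truth| / sum(truth), e = sum|pred - truth| / n."""
    if len(pred) != len(truth) or not truth:
        raise ValueError("pred and truth must be non-empty and of equal length")
    abs_err = sum(abs(p_i - y_i) for p_i, y_i in zip(pred, truth))
    p = 1.0 - abs_err / sum(truth)
    e = abs_err / len(truth)
    return p, e


p, e = accuracy_and_error([90.0, 110.0, 200.0], [100.0, 100.0, 200.0])
```

Because p normalizes by the total observed AQI while e is an absolute average, a city with lower pollution levels can show a small e and still a modest p, which is worth keeping in mind when comparing cities in the table.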
Conclusion
• We report on a real-time air quality forecasting system that uses data-driven models to predict fine-grained air quality over the following 48 hours
• It achieves an accuracy of 0.75 for the first 6 hours and 0.6 for the following 7-12 hours in Beijing
• It predicts sudden changes of air quality much better than baseline methods
Thanks for listening
如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样
yqqaatn0
 
DERIVATION OF MODIFIED BERNOULLI EQUATION WITH VISCOUS EFFECTS AND TERMINAL V...
DERIVATION OF MODIFIED BERNOULLI EQUATION WITH VISCOUS EFFECTS AND TERMINAL V...DERIVATION OF MODIFIED BERNOULLI EQUATION WITH VISCOUS EFFECTS AND TERMINAL V...
DERIVATION OF MODIFIED BERNOULLI EQUATION WITH VISCOUS EFFECTS AND TERMINAL V...
Wasswaderrick3
 
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...
Ana Luísa Pinho
 
Hemoglobin metabolism_pathophysiology.pptx
Hemoglobin metabolism_pathophysiology.pptxHemoglobin metabolism_pathophysiology.pptx
Hemoglobin metabolism_pathophysiology.pptx
muralinath2
 
原版制作(carleton毕业证书)卡尔顿大学毕业证硕士文凭原版一模一样
原版制作(carleton毕业证书)卡尔顿大学毕业证硕士文凭原版一模一样原版制作(carleton毕业证书)卡尔顿大学毕业证硕士文凭原版一模一样
原版制作(carleton毕业证书)卡尔顿大学毕业证硕士文凭原版一模一样
yqqaatn0
 
Observation of Io’s Resurfacing via Plume Deposition Using Ground-based Adapt...
Observation of Io’s Resurfacing via Plume Deposition Using Ground-based Adapt...Observation of Io’s Resurfacing via Plume Deposition Using Ground-based Adapt...
Observation of Io’s Resurfacing via Plume Deposition Using Ground-based Adapt...
Sérgio Sacani
 
bordetella pertussis.................................ppt
bordetella pertussis.................................pptbordetella pertussis.................................ppt
bordetella pertussis.................................ppt
kejapriya1
 

Recently uploaded (20)

Nucleic Acid-its structural and functional complexity.
Nucleic Acid-its structural and functional complexity.Nucleic Acid-its structural and functional complexity.
Nucleic Acid-its structural and functional complexity.
 
BLOOD AND BLOOD COMPONENT- introduction to blood physiology
BLOOD AND BLOOD COMPONENT- introduction to blood physiologyBLOOD AND BLOOD COMPONENT- introduction to blood physiology
BLOOD AND BLOOD COMPONENT- introduction to blood physiology
 
Salas, V. (2024) "John of St. Thomas (Poinsot) on the Science of Sacred Theol...
Salas, V. (2024) "John of St. Thomas (Poinsot) on the Science of Sacred Theol...Salas, V. (2024) "John of St. Thomas (Poinsot) on the Science of Sacred Theol...
Salas, V. (2024) "John of St. Thomas (Poinsot) on the Science of Sacred Theol...
 
Deep Software Variability and Frictionless Reproducibility
Deep Software Variability and Frictionless ReproducibilityDeep Software Variability and Frictionless Reproducibility
Deep Software Variability and Frictionless Reproducibility
 
platelets_clotting_biogenesis.clot retractionpptx
platelets_clotting_biogenesis.clot retractionpptxplatelets_clotting_biogenesis.clot retractionpptx
platelets_clotting_biogenesis.clot retractionpptx
 
Chapter 12 - climate change and the energy crisis
Chapter 12 - climate change and the energy crisisChapter 12 - climate change and the energy crisis
Chapter 12 - climate change and the energy crisis
 
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...
 
S.1 chemistry scheme term 2 for ordinary level
S.1 chemistry scheme term 2 for ordinary levelS.1 chemistry scheme term 2 for ordinary level
S.1 chemistry scheme term 2 for ordinary level
 
nodule formation by alisha dewangan.pptx
nodule formation by alisha dewangan.pptxnodule formation by alisha dewangan.pptx
nodule formation by alisha dewangan.pptx
 
erythropoiesis-I_mechanism& clinical significance.pptx
erythropoiesis-I_mechanism& clinical significance.pptxerythropoiesis-I_mechanism& clinical significance.pptx
erythropoiesis-I_mechanism& clinical significance.pptx
 
Nutraceutical market, scope and growth: Herbal drug technology
Nutraceutical market, scope and growth: Herbal drug technologyNutraceutical market, scope and growth: Herbal drug technology
Nutraceutical market, scope and growth: Herbal drug technology
 
general properties of oerganologametal.ppt
general properties of oerganologametal.pptgeneral properties of oerganologametal.ppt
general properties of oerganologametal.ppt
 
Seminar of U.V. Spectroscopy by SAMIR PANDA
 Seminar of U.V. Spectroscopy by SAMIR PANDA Seminar of U.V. Spectroscopy by SAMIR PANDA
Seminar of U.V. Spectroscopy by SAMIR PANDA
 
如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样
如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样
如何办理(uvic毕业证书)维多利亚大学毕业证本科学位证书原版一模一样
 
DERIVATION OF MODIFIED BERNOULLI EQUATION WITH VISCOUS EFFECTS AND TERMINAL V...
DERIVATION OF MODIFIED BERNOULLI EQUATION WITH VISCOUS EFFECTS AND TERMINAL V...DERIVATION OF MODIFIED BERNOULLI EQUATION WITH VISCOUS EFFECTS AND TERMINAL V...
DERIVATION OF MODIFIED BERNOULLI EQUATION WITH VISCOUS EFFECTS AND TERMINAL V...
 
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...
 
Hemoglobin metabolism_pathophysiology.pptx
Hemoglobin metabolism_pathophysiology.pptxHemoglobin metabolism_pathophysiology.pptx
Hemoglobin metabolism_pathophysiology.pptx
 
原版制作(carleton毕业证书)卡尔顿大学毕业证硕士文凭原版一模一样
原版制作(carleton毕业证书)卡尔顿大学毕业证硕士文凭原版一模一样原版制作(carleton毕业证书)卡尔顿大学毕业证硕士文凭原版一模一样
原版制作(carleton毕业证书)卡尔顿大学毕业证硕士文凭原版一模一样
 
Observation of Io’s Resurfacing via Plume Deposition Using Ground-based Adapt...
Observation of Io’s Resurfacing via Plume Deposition Using Ground-based Adapt...Observation of Io’s Resurfacing via Plume Deposition Using Ground-based Adapt...
Observation of Io’s Resurfacing via Plume Deposition Using Ground-based Adapt...
 
bordetella pertussis.................................ppt
bordetella pertussis.................................pptbordetella pertussis.................................ppt
bordetella pertussis.................................ppt
 

Forecasting fine grained air quality based on big data

  • 1. Forecasting Fine-Grained Air Quality Based on Big Data
      Date: 2015/10/15
      Authors: Yu Zheng, Xiuwen Yi, Ming Li, Ruiyuan Li, Zhangqing Shan, Eric Chang, Tianrui Li
      Source: KDD '15
      Advisor: Jia-ling Koh
      Speaker: LIN, CI-JIE
  • 3. Introduction
      • People are increasingly concerned about air pollution, which affects human health and sustainable development around the world
      • There is rising demand for predictions of future air quality, which can inform people's decision making
  • 4. Challenges
      • Multiple complex factors vs. insufficient and inaccurate data
      • Urban air quality varies significantly over location and time
      • Inflection points and sudden changes
      [Figure: A) Monitoring stations; B) Distribution of the max-min gaps; C) AQI of different stations changing over time of day, with inflection points marked. AQI legend: Good [0-50), Moderate [50-100), Unhealthy for sensitive groups [100-150), Unhealthy [150-200), Very Unhealthy [200-300)]
  • 5. Introduction
      • Goal: construct a real-time air quality forecasting system that uses data-driven models to predict fine-grained air quality over the following 48 hours (hours 1-6, 7-12, 13-24, and 25-48)
  • 8. Framework
      [Diagram. Components: Temporal Predictor, Spatial Predictor, Prediction Aggregator, Inflection Predictor. Labeled inputs: Local Data (shape features, recent meteorology, weather forecast, recent AQI); Spatial Neighbor Data (recent AQI, recent meteorology); selected factors; threshold. The predictors each emit an AQI value, and the output of the system is the final AQI.]
  • 10. Temporal Predictor (TP)
      • Predicts a station's air quality mainly from its own historical and future conditions (local)
      • A linear regression is employed to model the local change of air quality
      • Train one model for each of the next six hours, and two models for each later time interval (from 7 to 48 hours) to predict the interval's maximum and minimum values
      [Timeline: past hours t_c-h+1, ..., t_c-2, t_c-1, t_c; hourly targets t_c+1, t_c+2, ..., t_c+6; interval targets t_c+7 to t_c+12, t_c+13 to t_c+24, and t_c+25 to t_c+48]
  • 11. Features
      • The AQIs of the past h hours at the station
      • The local meteorology (such as sunny, overcast, cloudy, foggy, humidity, wind speed, and wind direction) at the current time t_c
      • Time of day and day of the week
      • The weather forecasts (including sunny/overcast/cloudy, wind speed, and wind direction) for the time interval to be predicted
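The per-hour training scheme of the temporal predictor can be illustrated with a minimal sketch. It is deliberately simplified and is not the authors' implementation: it uses only the current AQI as a feature (the slides also list past AQIs, meteorology, time, and weather forecasts), it fits only the six hourly models (not the max/min interval models), and the names `fit_line`, `fit_hourly_models`, and `forecast` as well as the synthetic series are assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a * x + b with a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x


def fit_hourly_models(aqi, horizons=range(1, 7)):
    """Fit one linear model per forecast hour (t_c+1 ... t_c+6)."""
    models = {}
    for k in horizons:
        xs = aqi[:-k]  # AQI at time t
        ys = aqi[k:]   # AQI at time t + k
        models[k] = fit_line(xs, ys)
    return models


def forecast(models, current_aqi):
    """Apply each hourly model to the most recent AQI reading."""
    return {k: a * current_aqi + b for k, (a, b) in models.items()}


# Synthetic hourly series for illustration only (not data from the paper).
series = [50.0 + 2.0 * t for t in range(40)]
models = fit_hourly_models(series)
prediction = forecast(models, series[-1])
```

Keeping a separate model per horizon means each model is fit against its own target, rather than feeding one model's output back in as the next step's input.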
  • 13. Spatial Predictor (SP)
      • Models the spatial correlation of air pollution
      • Predicts the air quality from the status of other locations, consisting of AQIs and meteorological data
      • Train multiple spatial predictors corresponding to different future time intervals
      • Two major steps:
          • Spatial partition and aggregation
          • Prediction based on a Neural Network
  • 14. Spatial partition and aggregation
      • Partition the space around the station into regions by using three circles with different diameters
      • Calculate the average AQI in each region for a given kind of air pollutant; do the same for temperature and humidity
      • Each region then has only one set of aggregated air quality readings and meteorology
      [Figure: A) Spatial partition; B) Spatial aggregation]
  • 15. Spatial Predictor
      • Features of SP
          • The AQI of the past three hours (AQI_i)
          • Meteorological features (M_i), including the wind speed and direction, at the current time t_c
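The partition-and-aggregation step from slide 14 can be sketched as follows. This is a hedged illustration only: the radii, the planar kilometre coordinates, and the function names `ring_index` and `aggregate_by_ring` are assumptions, and the sketch partitions by distance alone because the slide mentions only the three circles. It averages AQI; temperature and humidity would be averaged the same way.

```python
import math


def ring_index(distance_km, radii_km):
    """Return the index of the innermost circle containing the point, or None."""
    for i, radius in enumerate(radii_km):
        if distance_km <= radius:
            return i
    return None


def aggregate_by_ring(target, stations, radii_km=(30.0, 100.0, 300.0)):
    """Average the AQI of neighbor stations per ring around `target`.

    `target` is an (x, y) pair and `stations` is a list of (x, y, aqi)
    tuples, all in planar kilometres. The radii are hypothetical values.
    """
    sums = [0.0] * len(radii_km)
    counts = [0] * len(radii_km)
    for x, y, aqi in stations:
        distance = math.hypot(x - target[0], y - target[1])
        i = ring_index(distance, radii_km)
        if i is not None:  # stations beyond the largest circle are ignored
            sums[i] += aqi
            counts[i] += 1
    # A ring with no stations has no aggregate reading.
    return [s / c if c else None for s, c in zip(sums, counts)]


# Made-up neighbor stations for illustration.
neighbors = [(10, 0, 80), (0, 20, 100), (50, 0, 60), (0, 200, 40), (500, 0, 999)]
ring_aqi = aggregate_by_ring((0, 0), neighbors)
```

Aggregating per region gives the neural network a fixed-size input regardless of how many stations surround a location.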
  • 17. Prediction Aggregator (PA)
      • The prediction aggregator dynamically integrates the predictions that the spatial and temporal predictors have made for a location
      • Feature set
          • Wind speed, wind direction, humidity, sunny, cloudy, overcast, and foggy
          • The predictions generated by the spatial and temporal predictors
          • The corresponding ΔAQI (from the ground truth)
      • Train a Regression Tree (RT) to model the dynamic combination of these factors and predictions
  • 18. Prediction Aggregator (PA)
      [Figure: example regression tree. Internal nodes split on Spatial, Temporal, Foggy, Humidity, and Wind speed; the split values shown include 0.003, -0.001, -0.08, -0.14, 54.5, 6.62, and Foggy = 0 / = 1. The leaves are linear models LM1 to LM8.]
      LM 2: AQI = 0.186×Spatial + 2.52×Temporal + 0.001×SunnyCloudyOvercast + 0.002×Foggy - 0.001×Wind_Dir_SE - 0.09×Wind_Dir_NE - 0.007×WinSpeed - 0.001×Humidity + 0.399
      LM 3: AQI = 0.666×Spatial + 0.1627×Temporal + 0.001×isSunnyCloudyOvercast + 0.002×Foggy - 0.001×Wind_Dir_SE - 0.022×Wind_Dir_NE - 0.003×WinSpeed - 0.0003×Humidity - 0.0452
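To make the role of the leaf models concrete, the sketch below evaluates LM 2 and LM 3 using the coefficients printed on slide 18. The routing from the root of the tree to a leaf is not reproduced, because the tree layout cannot be recovered from the transcript. The dictionary layout, the shared key `SunnyCloudyOvercast` for both models, the `bias` key, and the function name `leaf_predict` are assumptions for illustration, and the input values are made up.

```python
# Coefficients copied from slide 18; "bias" holds the constant term.
LM2 = {
    "Spatial": 0.186, "Temporal": 2.52, "SunnyCloudyOvercast": 0.001,
    "Foggy": 0.002, "Wind_Dir_SE": -0.001, "Wind_Dir_NE": -0.09,
    "WinSpeed": -0.007, "Humidity": -0.001, "bias": 0.399,
}
LM3 = {
    "Spatial": 0.666, "Temporal": 0.1627, "SunnyCloudyOvercast": 0.001,
    "Foggy": 0.002, "Wind_Dir_SE": -0.001, "Wind_Dir_NE": -0.022,
    "WinSpeed": -0.003, "Humidity": -0.0003, "bias": -0.0452,
}


def leaf_predict(model, features):
    """Evaluate one leaf's linear model; missing features count as 0."""
    total = model["bias"]
    for name, weight in model.items():
        if name != "bias":
            total += weight * features.get(name, 0.0)
    return total


# Hypothetical inputs: both predictors output 1.0, all other features are 0.
example = {"Spatial": 1.0, "Temporal": 1.0}
lm2_value = leaf_predict(LM2, example)
lm3_value = leaf_predict(LM3, example)
```

The two leaves weight the predictors very differently (LM 3 leans on the spatial prediction, LM 2 on the temporal one), which is what lets the tree combine them dynamically depending on the conditions.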
  • 20. Inflection Predictor
      • The air quality of a location can change sharply within a few hours
      • Such sudden changes are too infrequent to be predicted well
      • The inflection predictor (IP) is invoked to handle sudden changes
      • We need to know when to invoke the IP model
      [Figure, as on slide 4: AQI of different stations changing over time of day, with inflection points marked]
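  • 21. Inflection Predictor
      1. Select the sudden-drop instances D_i from the historical data D
          • An instance qualifies when its AQI is greater than 200 and drops by more than a threshold within the next few hours
      2. Find surpassing ranges and categories, i.e. ranges of a continuous feature and categories of a discrete feature where the distribution over D_i surpasses the distribution over D - D_i
      [Figure: A) Select sudden drop instances D_i from D; B) Distributions of a continuous feature over D_i and D - D_i, with ranges a1 to a4; C) Distributions of a discrete feature over D_i and D - D_i, with categories c1 to c4]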
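Step 1 on slide 21 can be sketched as a simple filter over an hourly AQI series. Only the 200 cutoff comes from the slide; the drop threshold `min_drop`, the look-ahead `window`, the function name `select_sudden_drops`, and the sample series are hypothetical values chosen for illustration.

```python
def select_sudden_drops(aqi, high=200.0, min_drop=100.0, window=3):
    """Return indices t where aqi[t] > high and the AQI then falls by more
    than `min_drop` at some point within the next `window` hours."""
    selected = []
    for t in range(len(aqi) - 1):
        if aqi[t] <= high:
            continue
        future = aqi[t + 1 : t + 1 + window]
        if aqi[t] - min(future) > min_drop:
            selected.append(t)
    return selected


# Made-up hourly readings for illustration.
readings = [120, 250, 240, 110, 90, 210, 205, 200]
drop_indices = select_sudden_drops(readings)
```

The selected indices play the role of D_i; all remaining instances form D - D_i.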
  • 22. Inflection Predictor (IP)
      3. Select surpassing ranges and categories as thresholds
          • There are multiple surpassing ranges and categories, and some of them may not be discriminative enough
          • We need a set of surpassing ranges and categories to use as thresholds, one that retrieves as many instances from D_i as possible while including as few instances from D - D_i as possible
          • D_t = x_1 ∪ x_2 is the collection of instances retrieved by a set of surpassing ranges and categories, where x_1 is the part drawn from D_i and x_2 the part drawn from D - D_i
          • Objective: $E = \max\left(\frac{|x_1|}{|D_i|} - \frac{|x_2|}{|D - D_i|}\right) \times \frac{\Delta|x_1|}{\Delta|x_2|}$
          • The problem can be solved by using Simulated Annealing
      [Figure: D_t shown inside D, overlapping D_i; the overlap is x_1 and the remainder is x_2]
  • 23. Inflection Predictor (IP)
      Ranges/categories     |x1|/|Di|   |x2|/|D-Di|   Δ|x1|/Δ|x2|   E
      WinSpeed:13.9-max     0.130       0.031         0.065         0.006
      Humidity:1-40         0.380       0.173         0.128         0.026
      Downpour              0.382       0.174         0.714         0.149
      Wind Northwest        0.478       0.263         0.078         0.017
      Sunny                 0.643       0.405         0.084         0.020
      Moderate rainy        0.680       0.437         0.087         0.020
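As a check on how the table relates to the objective on slide 22, the sketch below recomputes E for each row as (|x1|/|Di| - |x2|/|D-Di|) × Δ|x1|/Δ|x2| and picks the row with the largest value. It approximately reproduces the printed E column (the inputs are themselves rounded), and it is not the simulated-annealing search itself. The names `rows`, `score`, and `best` are assumptions for illustration.

```python
# (range or category, |x1|/|Di|, |x2|/|D-Di|, delta|x1|/delta|x2|), from slide 23.
rows = [
    ("WinSpeed:13.9-max", 0.130, 0.031, 0.065),
    ("Humidity:1-40", 0.380, 0.173, 0.128),
    ("Downpour", 0.382, 0.174, 0.714),
    ("Wind Northwest", 0.478, 0.263, 0.078),
    ("Sunny", 0.643, 0.405, 0.084),
    ("Moderate rainy", 0.680, 0.437, 0.087),
]


def score(hit_rate, false_rate, delta_ratio):
    """Coverage of D_i minus coverage of D - D_i, scaled by the delta ratio."""
    return (hit_rate - false_rate) * delta_ratio


scores = {name: score(a, b, c) for name, a, b, c in rows}
best = max(scores, key=scores.get)
```

The largest value falls on the Downpour row, which matches the highest E printed in the table.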
  • 24. Inflection Predictor (IP)
      4. Train an inflection predictor with D_t
          • The features used by the inflection predictor to determine the specific drop values are the same as those of the temporal predictor
          • The inflection predictor is based on a regression tree (RT)
          • The output of the inflection predictor is a delta of AQI that is added to the final result
  • 27. Results
      City        1-6h          7-12h         13-24h        25-48h        Sudden changes
                  p      e      p      e      p      e      p      e      p      e
      Beijing     0.750  30     0.62   64     0.53   78.3   0.496  81.1   0.300  78.3
      Tianjin     0.746  31     0.634  62.1   0.595  67.4   0.579  68.6   0.437  70.9
      Guangzhou   0.805  13     0.748  23.9   0.714  26.8   0.681  29.5   0.477  54.6
      Shenzhen    0.838  8.4    0.764  17.6   0.728  20     0.689  22.8   0.575  45.3
      Metrics: $p = 1 - \frac{\sum_i |\hat{y}_i - y_i|}{\sum_i y_i}$ and $e = \frac{\sum_i |\hat{y}_i - y_i|}{n}$, where $\hat{y}_i$ is the prediction, $y_i$ is the ground truth, and $n$ is the number of instances
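The two metrics reported in the results table can be computed as in the following sketch: p is one minus the summed absolute error divided by the summed ground truth, and e is the mean absolute error. The function names and the sample values are assumptions for illustration, not data from the paper.

```python
def accuracy_p(predicted, actual):
    """p = 1 - sum(|prediction - truth|) / sum(truth)."""
    abs_error = sum(abs(p - a) for p, a in zip(predicted, actual))
    return 1.0 - abs_error / sum(actual)


def mean_abs_error_e(predicted, actual):
    """e = sum(|prediction - truth|) / n."""
    abs_error = sum(abs(p - a) for p, a in zip(predicted, actual))
    return abs_error / len(actual)


# Made-up AQI values for illustration.
predicted = [90.0, 110.0, 150.0]
actual = [100.0, 100.0, 160.0]
p_value = accuracy_p(predicted, actual)
e_value = mean_abs_error_e(predicted, actual)
```

Because p normalizes by the total ground-truth AQI, the same absolute error e gives a lower p where AQI levels are lower, so p and e are best read together when comparing cities.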
  • 31. Conclusion
      • The paper reports on a real-time air quality forecasting system that uses data-driven models to predict fine-grained air quality over the following 48 hours
      • In Beijing, it achieves an accuracy of 0.75 for the first 6 hours and about 0.6 for hours 7-12
      • It predicts sudden changes of air quality much better than baseline methods

Editor's Notes

  1. WinSpeed:13.9-max or Humidity:1-40 or Downpour