2013 ASPRS Track, Ozone Modeling for the Contiguous United States by Michael Tuffly



Ozone (O3) is a powerful oxidizer (i.e., it readily takes electrons from other substances). Ozone in the upper atmosphere is considered beneficial because it filters harmful UV rays from the sun. However, ground-level concentrations of ozone affect animal and plant health. In animals, one effect of ground-level ozone is lung tissue damage, resulting in respiratory complications. In plants, excess ozone can cause excessive water loss, thereby emulating drought conditions. Ozone disrupts the stomata cells in plant leaves so that they do not function properly; that is, the stomata cells do not close completely, resulting in excess water loss (Smith et al. 2008). Anthropogenic ozone can be created via internal combustion engines and coal-fired power plants.
Using data collected from the Environmental Protection Agency (EPA) CASTnet site for the period 1990 to 2010, I use spatial interpolation techniques to create an ozone concentration surface for the contiguous United States.

Published in: Technology, Health & Medicine

  1. Modeling Ground Ozone for the Contiguous United States
     By Michael Tuffly, Ph.D., ERIA Consultants, LLC
     GIS in the Rockies 2013, Cable Center, Denver, Colorado, 10/9/2013
     http://www.eriaconsultants.com
     mtuffly@eriaconsultants.com
  2. What is Ozone
     • Chemically, it is a molecule containing three oxygen atoms (triatomic oxygen, O3).
     • Ozone is a powerful oxidizer (i.e., it readily takes electrons from other substances).
     • Examples of oxidation:
       • Rust on metal objects
       • Fire
     • “Oxidation is an increase in the oxidation number or a real or apparent loss of one or more electrons.” (Miller 1981)
     Miller, G. T. 1981. Chemistry: A Basic Introduction, Second Edition. Wadsworth Publishing Company, Belmont, California, USA.
  3. Ozone’s Location
     • Ozone located in the lower stratosphere (20 to 50 km in elevation) is beneficial to life on earth.
     • In the lower stratosphere, ozone molecules form a protective layer that filters out much of the high-energy solar ultraviolet radiation.
     • 3 O2 + ultraviolet radiation → 2 O3
  4. Ground Ozone
     • Ozone at ground level can be an issue for the health of plants and animals.
     • One way ground ozone is formed is via a reaction of nitrogen oxides (NOx), volatile organic compounds (VOCs), and sunlight.
     • The primary sources of NOx are internal combustion engines (i.e. cars) and coal-fired power plants.
     • There are many sources of VOCs:
       • Methane, CFC, benzene, methylene chloride, etc.
     • VOCs have a high vapor pressure and correspondingly low boiling points.
     • Low boiling points allow VOCs to escape to the atmosphere.
  5. Some Effects of Ground Ozone
     • In animals:
       • Lung tissue damage can result from inhalation of ozone
     • In plants:
       • Leaf surface damage (oxidation)
       • Disruption in stomata cell functions
       • Causing excessive water loss, emulating drought conditions (Smith et al. 2008)
     Smith, G. C., J. W. Coulston, and B. M. O'Connell. 2008. Ozone Bioindicators and Forest Health: A Guide to the Evaluation, Analysis and Interpretation of the Ozone Injury Data in the Forest Inventory and Analysis Program. United States Department of Agriculture, Forest Service General Technical Report 34.
  6. Other ways ozone can be formed
     • Lightning (natural) (small contributor)
     • Shorts in electrical equipment (anthropogenic)
       • Provides that unique smell (very small contributor)
     • Ozone is also used as a replacement for chlorine (potentially a high contributor, but really unknown)
       • In swimming pools
       • In sewage treatment plants
       • In domestic water supply as a disinfectant
  7. Modeling Ozone
     • Source ozone data are from EPA CASTNET
       • ftp://ftp.epa.gov/castnet/data/
     • Data are from a single year, 2010
       • In the summer months during the “Ozone Activity Envelope” (OAE)
       • June to August, from 1:00 PM to 5:00 PM
       • Base data for ozone are recorded every hour
     • Only 73 ground ozone collection sites were used
       • This is part of a larger study over a ten-year time period.
       • These 73 sites were the only sites consistent from 2002 to 2011.
     • Five variables were extracted from these data for the OAE and averaged:
       • Ozone (ppb)
       • Wind speed (m/s)
       • Relative humidity (% * 100)
       • Solar radiation (watts per m²)
       • Temperature (degrees C * 10)
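The OAE filtering and averaging described on this slide could be sketched as follows. This is only an illustration: the record layout (timestamp and value pairs), the function name `oae_mean`, and the choice to treat readings stamped 13:00 through 17:00 as inclusive are all assumptions, not the actual CASTNET file format or the author's processing script.

```python
from datetime import datetime
from statistics import mean

def oae_mean(records, year=2010):
    """Average hourly values that fall inside the Ozone Activity Envelope.

    records: iterable of (datetime, value) pairs (hypothetical layout).
    Assumes June-August readings stamped 13:00-17:00 inclusive.
    """
    kept = [
        value
        for stamp, value in records
        if stamp.year == year
        and 6 <= stamp.month <= 8
        and 13 <= stamp.hour <= 17
    ]
    if not kept:
        raise ValueError("no records inside the envelope")
    return mean(kept)

# Made-up hourly ozone readings (ppb) for one site
sample = [
    (datetime(2010, 7, 1, 12), 40.0),  # before 1:00 PM, excluded
    (datetime(2010, 7, 1, 13), 50.0),
    (datetime(2010, 7, 1, 17), 60.0),
    (datetime(2010, 9, 1, 14), 90.0),  # September, excluded
]
print(oae_mean(sample))
```

The same function would be applied per site and per variable (ozone, wind speed, relative humidity, solar radiation, temperature) to get one averaged value each.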
  8. Modeling Methods
     • Four different modeling methods were investigated:
       • Ordinary Kriging
       • Generalized Linear Model (GLM)
       • Inverse Distance Weighting (IDW)
       • Geographically Weighted Regression (GWR)
     • Results for all four modeling methods were:
       • Compared with a set of sample data not used in model creation, via the Mean Squared Error Predicted (MSEP) method.
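The slides do not spell out the MSEP formula. A minimal sketch, assuming the usual definition (the mean of squared differences between held-out observations and model predictions), might look like this; the function name `msep` is illustrative.

```python
def msep(observed, predicted):
    """Mean squared error of prediction over held-out points."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two equal-length, non-empty sequences")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return sum(squared) / len(squared)

# Made-up held-out ozone values (ppb) and model predictions
print(msep([50.0, 60.0], [48.0, 63.0]))
```

A lower value means the model predicts the withheld sites more closely, which is how the four methods can be ranked against each other.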
  9. Autocorrelation
     • First, we need to know whether the data are autocorrelated.
     • If the data are autocorrelated, then we can use:
       • IDW
       • Kriging
     • Results from Moran's I (a test for autocorrelation) (Moran 1950):
       • Data have a strong positive autocorrelation
       • Data points that are close together have similar values
       • Index = 0.421; p-value = 0
     • If the data were not autocorrelated:
       • Our best estimate using IDW or Kriging would be the mean for the whole study site.
     Moran, P. A. P. 1950. Notes on continuous stochastic phenomena. Biometrika 37: 17-23.
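For readers who want to see what the index measures, here is a small sketch of global Moran's I. The slides do not say which spatial weights were used, so inverse-distance weights are an assumption here, and the sample coordinates and values are made up. No significance test is included.

```python
import math

def morans_i(points, values):
    """Global Moran's I with inverse-distance weights (an assumed weighting)."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    cross = 0.0   # weighted sum of cross-products of deviations
    w_sum = 0.0   # sum of all weights
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            w = 1.0 / math.dist(points[i], points[j])
            cross += w * dev[i] * dev[j]
            w_sum += w
    return (n / w_sum) * cross / sum(d * d for d in dev)

# Two clusters of similar values: nearby points resemble each other
clustered = morans_i([(0, 0), (1, 0), (10, 0), (11, 0)], [1.0, 1.0, 9.0, 9.0])
# Alternating values: nearby points differ
alternating = morans_i([(0, 0), (1, 0), (2, 0), (3, 0)], [1.0, 9.0, 1.0, 9.0])
print(clustered > 0, alternating < 0)
```

A positive index, like the 0.421 reported on the slide, indicates that nearby sites tend to have similar values.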
  10. IDW
      • Called a deterministic function
        • Using the same input parameters will give the same results.
      • Data need to be spatially autocorrelated
      • Three basic parameters are required:
        • Number of nearest neighbors
        • Power
        • Study area boundary
      • Useful for continuous data (e.g. rainfall, elevation)
      • Not useful for categorical, binary, or ordinal data
  11. Identifying IDW Parameters
      • Cross validation:
        • Remove one data point at one location
        • Calculate a new value for that point using the neighboring points
        • Repeat this for all points
        • Calculate the mean squared error and variance
      • Mean Squared Error Predicted (MSEP) gives:
        • The best number of nearest neighbors
        • The best power
      • A smaller number of nearest neighbors produces good local estimates but poor global estimates.
      • A larger number of nearest neighbors produces good global estimates but poor local estimates.
      • Need to balance between local and global estimates.
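  12. IDW

      $$x = \frac{\sum_{i=1}^{n} Z_i / D_i^{y}}{\sum_{i=1}^{n} 1 / D_i^{y}}$$

      where x is the estimated value, Z_i is the value at neighbor i, D_i is the distance to neighbor i, n is the number of nearest neighbors, and y is some exponent, usually 1 or 2.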
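The IDW formula can be written out as a short function. This is a sketch rather than the author's implementation: the function name `idw`, the default of 8 neighbors, and the handling of a target that coincides with a sample are illustrative choices.

```python
import math

def idw(target, points, values, k=8, power=2):
    """Inverse distance weighted estimate at `target` from the k nearest points."""
    # Pair each sample with its distance to the target and keep the k closest
    nearest = sorted(
        (math.dist(target, p), z) for p, z in zip(points, values)
    )[:k]
    if nearest[0][0] == 0.0:
        return nearest[0][1]  # target sits on a sample; avoid dividing by zero
    num = sum(z / d ** power for d, z in nearest)
    den = sum(1.0 / d ** power for d, z in nearest)
    return num / den

# Two made-up samples at distances 1 and 2 from the target
estimate = idw((0, 0), [(1, 0), (2, 0)], [10.0, 20.0], k=2, power=2)
print(estimate)
```

With power 2 the closer sample gets four times the weight of the farther one, so the estimate lands nearer to 10 than to 20. Raising the power makes the surface more local; lowering it smooths the surface toward the neighborhood mean.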
  13. Distance is calculated using the Pythagorean Theorem
      • a² + b² = c²
      • For the distance from A to x (side c):
        • 1.58² + 1.58² = c²
        • 2.4964 + 2.4964 = 4.9928
        • 4.9928^0.5 ≈ 2.23
      [Diagram: points A, B, C, and D, with triangle sides a, b, and c]
  14. [Figure: two panels for Year = 2010 showing the cross-validation MSE of the residuals (out.mse and out2.mse, roughly 35 to 60) against num_neighbors (0 to 70, with 8 marked), one panel for Power = 1 and one for Power = 2; annotated values 43.3 and 41.8]
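The leave-one-out search over the number of neighbors and the power, which is what the plotted curves summarize, could be sketched like this. It is not the author's R code: the helper names, the candidate grids, and the sample data are made up for illustration, and the compact `idw` helper assumes no duplicate locations.

```python
import math

def idw(target, points, values, k, power):
    """Compact IDW estimate from the k nearest samples (no duplicate locations)."""
    nearest = sorted((math.dist(target, p), z) for p, z in zip(points, values))[:k]
    num = sum(z / d ** power for d, z in nearest)
    den = sum(1.0 / d ** power for d, z in nearest)
    return num / den

def loocv_mse(points, values, k, power):
    """Remove each point in turn, re-estimate it from the rest, average the squared errors."""
    errors = []
    for i in range(len(points)):
        rest_pts = points[:i] + points[i + 1:]
        rest_vals = values[:i] + values[i + 1:]
        pred = idw(points[i], rest_pts, rest_vals, k, power)
        errors.append((values[i] - pred) ** 2)
    return sum(errors) / len(errors)

def best_parameters(points, values, ks, powers):
    """Return (mse, k, power) for the combination with the lowest cross-validated MSE."""
    scored = [(loocv_mse(points, values, k, p), k, p) for k in ks for p in powers]
    return min(scored)

# Made-up site coordinates and ozone values (ppb)
sites = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 2.0)]
ozone = [10.0, 12.0, 11.0, 13.0, 20.0]
print(best_parameters(sites, ozone, ks=[2, 3, 4], powers=[1, 2]))
```

Plotting `loocv_mse` against `k` for each power would give curves of the kind shown in the figure, with the minimum marking the preferred number of neighbors.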
  15. Ordinary Kriging
      • (Krige 1951) (Matheron 1962)
      • A stochastic (non-deterministic) interpolation process
        • Estimates at an unobserved location are based upon the weighted average of values at observed locations
      • Weights are based upon:
        • The distance separating points
        • The function for the variogram
      • A variogram is used to identify key Kriging parameters:
        • Sill, range, nugget, and covariance
      • Assumes an unknown stationary mean.
        • A stationary mean means that the mean over the area behaves predictably (e.g. Gaussian).
      • Considered unbiased:
        • Mean residuals sum to zero
        • Variance of the error is minimized
      • BLUE
        • Best Linear Unbiased Estimator (Isaaks and Srivastava 1989)
      Isaaks, E. H., and R. Srivastava. 1989. An Introduction to Applied Spatial Statistics. Oxford, UK: Oxford University Press.
      Krige, D. G. 1951. A statistical approach to some basic mine valuation problems on the Witwatersrand. Journal of the Chemical, Metal and Mining Society of South Africa 52 (6): 119-139.
      Matheron, G. 1962. Traite de geostatistique appliquee. Editions Technip.
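To make the "weighted average with variogram-based weights" idea concrete, here is a minimal ordinary kriging sketch. It assumes a simple linear variogram γ(h) = h purely for illustration (the study fitted spherical, Gaussian, and exponential models instead), and the solver and function names are hypothetical. The unbiasedness condition appears as the extra row and column that force the weights to sum to one.

```python
import math

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def ordinary_kriging(target, points, values, gamma):
    """Return (estimate, weights) at `target` for variogram function `gamma`."""
    n = len(points)
    # Variogram between every pair of samples, bordered by the sum-to-one constraint
    a = [[gamma(math.dist(p, q)) for q in points] + [1.0] for p in points]
    a.append([1.0] * n + [0.0])
    b = [gamma(math.dist(p, target)) for p in points] + [1.0]
    solution = solve(a, b)
    weights = solution[:n]  # the last entry is the Lagrange multiplier
    return sum(w * z for w, z in zip(weights, values)), weights

def linear_variogram(h):
    return h  # illustrative assumption, not a fitted model

estimate, weights = ordinary_kriging((1, 0), [(0, 0), (2, 0)], [10.0, 20.0], linear_variogram)
print(estimate, weights)
```

Because the target sits midway between the two samples, each sample receives the same weight. Swapping in a fitted variogram changes only the `gamma` argument.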
  16. R output from Variogram
      Estimates: Nugget = 15, Sill = 30, Range = 1,100,000

      Model                               Nugget    Sill       Range     AICC
      Spherical (Least Squares Estimate)  7.7377    47.48165   1100000   125.5306
      Gaussian (Least Squares Estimate)   13.6845   52.25631   1100000   128.4038
      Exponential                         9.2776    71.61078   1100000   132.1289

      The spherical and Gaussian AICC values are less than 3 units apart, so there is no meaningful difference between the two models.
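The three variogram shapes compared above can be written as small functions. The slides do not show which R package or parameterization produced the output, so two assumptions are made here and should be checked against the software actually used: the reported sill is treated as the total sill (nugget included), and the Gaussian and exponential models use the "practical range" convention with a factor of 3.

```python
import math

def spherical(h, nugget, sill, rng):
    """Spherical model: reaches the sill exactly at the range."""
    if h == 0:
        return 0.0
    if h >= rng:
        return sill
    r = h / rng
    return nugget + (sill - nugget) * (1.5 * r - 0.5 * r ** 3)

def gaussian(h, nugget, sill, rng):
    """Gaussian model: approaches the sill asymptotically (about 95% at the range)."""
    if h == 0:
        return 0.0
    return nugget + (sill - nugget) * (1.0 - math.exp(-3.0 * (h / rng) ** 2))

def exponential(h, nugget, sill, rng):
    """Exponential model: approaches the sill asymptotically (about 95% at the range)."""
    if h == 0:
        return 0.0
    return nugget + (sill - nugget) * (1.0 - math.exp(-3.0 * h / rng))

# Evaluate each model at half the range using the parameters from the slide
half = 550000
print(spherical(half, 7.7377, 47.48165, 1100000))
print(gaussian(half, 13.6845, 52.25631, 1100000))
print(exponential(half, 9.2776, 71.61078, 1100000))
```

The nugget is the jump at very short distances, the sill is the plateau, and the range is the distance beyond which sites are treated as effectively uncorrelated.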
  17. Graphic R Output
      [Figure: Year = 2010, Krig Raw Data. Ozone values (0 to 70) plotted against distance in meters (0 to 1e+06), with fitted Gaussian (Gau), spherical (Sph), and exponential (Exp) curves; annotations mark the range, Sill 52.7, and Nugget 13.6]
  18. Number of Nearest Neighbors
      [Figure: Kriging Cross Validation, Gaussian Model. var(crossidw$resid) (about 35 to 41) plotted against No. of Neighbors (5 to 30)]
  19. Generalized Linear Models (GLM)
      • Similar to linear regression
      • Different from IDW and Kriging
        • Needs predictor input variables
        • Solar radiation and relative humidity proved to be significant predictor variables.
        • Need to create the solar radiation and relative humidity surfaces via IDW as input into the GLM equation.
      • The GLM equation is: 45.35 + (SR * 0.0332) + (RH * -0.235), where SR is solar radiation and RH is relative humidity
        • R² = 0.58
      • The GLM describes the “Large Scale Variability”
      • The “Small Scale Variability” is computed by calculating the differences between the observed values and the (GLM) predicted values.
      • Adding the “Large Scale Variability” to the “Small Scale Variability” can produce a good predictive surface.
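The large-scale plus small-scale decomposition can be sketched with the coefficients from the slide. The function names and the input values below are made up, and the inputs are assumed to be in whatever units the model was fitted with (the slides do not state the scaling). The residual surface is presumably interpolated with IDW, since the results slide labels this method "GLM + IDW"; that interpolation step is left out here.

```python
def glm_ozone(solar_radiation, relative_humidity):
    """Large-scale trend from the fitted equation on the slide."""
    return 45.35 + solar_radiation * 0.0332 + relative_humidity * -0.235

def residual(observed, solar_radiation, relative_humidity):
    """Small-scale variability: observed value minus the GLM prediction."""
    return observed - glm_ozone(solar_radiation, relative_humidity)

def combined_estimate(trend, interpolated_residual):
    """Final surface value: trend plus the (separately interpolated) residual."""
    return trend + interpolated_residual

# Made-up inputs for one site
trend = glm_ozone(500.0, 40.0)
small_scale = residual(55.0, 500.0, 40.0)
print(trend, small_scale, combined_estimate(trend, small_scale))
```

At an observed site the trend and its own residual add back to the observed value; away from the sites, the residual comes from interpolating the residuals of nearby sites.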
  20. Geographically Weighted Regression (GWR)
      • A powerful modeling method that includes:
        • Linear regression
        • Space
      • In a nutshell, GWR creates a series of local linear equations based upon the spatial parameters of the independent variables:
        • Kernel function
          • Fixed search radius
          • Variable (number of neighbors) (a.k.a. adaptive)
        • Bandwidth method (fixed radius)
          • Cells located within the search radius will have the same coefficients.
          • Best if sample points are located in a systematic pattern (e.g. on a grid with fixed distances).
        • Bandwidth method (adaptive or variable search radius)
          • One that uses the number of nearest neighbors from user input
          • One that uses a cross validation method which attempts to minimize the collinearity
          • Best if sample points are randomly located in the study area.
      • A sample point will be used multiple times to construct multiple linear equations
      • Each cell may contain different regression coefficients
      • Each linear equation (fixed radius or adaptive) uses the same global predictor variables as the GLM
        • Solar radiation and relative humidity proved to be the best global independent variables.
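The idea of fitting a separate, distance-weighted regression at each location can be illustrated with a deliberately simplified sketch. It uses a single predictor (the study used two, solar radiation and relative humidity), a Gaussian kernel, and an adaptive bandwidth set to the distance of the k-th nearest sample. All three are assumptions made for brevity, and the function name is hypothetical.

```python
import math

def local_fit(target, points, x, y, k):
    """Weighted least-squares line y = a + b*x fitted around `target`.

    Assumes k >= 2 and no duplicate locations, so the bandwidth is positive.
    """
    dists = [math.dist(target, p) for p in points]
    bandwidth = sorted(dists)[k - 1]  # adaptive: distance to the k-th nearest sample
    w = [math.exp(-0.5 * (d / bandwidth) ** 2) for d in dists]
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

# Made-up sites where the response follows y = 2 + 3x everywhere
sites = [(0, 0), (1, 0), (2, 0), (3, 0)]
predictor = [1.0, 2.0, 4.0, 7.0]
response = [5.0, 8.0, 14.0, 23.0]
intercept, slope = local_fit((0, 0), sites, predictor, response, k=3)
print(intercept, slope)
```

When the relationship is the same everywhere, every location recovers the same coefficients. When it varies across space, calling `local_fit` for each cell gives a different intercept and slope per cell, which is the behavior the slide describes.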
  21. Results

      Test                       Residuals Autocorrelated   MSE     MSE New Points
      GLM + IDW                  No                         0.54    196.06
      GWR using AICC and 25 nn   No                         21.98   265.09
      GWR using CV               No                         38.43   241.2
      IDW                        No                         0.6     204.45
      Kriging                    No                         6.48    191.86

      Data Issues
      1) There should be more data points to create and test the models.
      2) Data points should be better distributed over the study area (e.g. there are no points in Oregon, Idaho, etc., and few points in the center of the nation).
      3) IDW predictions at the observed points should not differ from the observed values, so the MSE there should be zero. The nonzero values are likely due to cell size and rounding errors.
      4) The variables temperature and wind speed were tested in the GWR model. Tests using these covariates included both the CV method and the number of nearest neighbors method. Results were very poor and are not shown here.
  22. Take Home Message Final
      • Statistical models are an abstraction of reality.
      • No statistical model is perfect (e.g. errors).
      • Some models are better than others (Crawley 2007).
      • The correct model can never be known with complete certainty (Crawley 2007).
      • The simpler the model, the better it is (Crawley 2007).
      • Models should include the Principle of Parsimony (Occam’s Razor)
        • Use the fewest number of variables
        • The correct explanation is the simplest explanation
      • Make sure that the assumptions of the model are followed.
        • Are the data IID?
        • Are the data spatially autocorrelated?
      • Are the input variables correct?
        • Errors in measurement
        • Using temperature when solar radiation is a better independent variable.
      • How were the data collected?
        • Random sample, systematic, etc.
        • Is there bias in the sample data?
      • Always ask yourself: does this model make sense?
        • Is the model predicting something where it should not?
        • Example: a fish population on land.
      Crawley, M. J. 2007. The R Book. Imperial College London at Silwood Park, UK.
  23. Final Quote
      “Son you're going to drive me to drinking… if you don’t stop driving that hot rod Lincoln.” 1971.