Olga Ivina PhD thesis presentation short

  1. Conformal prediction of air pollution concentrations for the Barcelona Metropolitan Region. PhD thesis summary. Olga Ivina, University of Girona, GRECS research group, CIBER de Epidemiología y la Salud Pública. November 22, 2012.
  2. Outline
     - Introduction
       - Air pollution and its effects
       - Air pollution exposure assessment
       - Conformal predictors for air pollution problem
     - Objectives
     - Methods and data
       - Kriging
       - Conformal predictors
       - Computing
       - Data
     - Results
       - Ordinary kriging and RRCM models in default setting
       - Kernelisation: a Gaussian kernel
       - Kernelisation: other kernels
       - Comparison of models
     - Discussion
     - Conclusion
       - Conformal predictors and geostatistics
       - Future research
  3. Air pollution and its effects (Introduction)
     Air pollution is a problem of growing concern all over the world. There is a large body of scientific evidence for the hazardous effects of air pollution on people's health and well-being, as well as on the general ecological condition of our planet.
     In people, air pollution is associated with adverse health outcomes in both adults and children. Children are especially susceptible to pollution: they are affected from the very first stages of their lives onward. Linked outcomes (to name a few):
     - preterm birth and low birth weight
     - asthma aggravation, cough and bronchitis
     - allergies: hay fever, rhinitis, ...
     - excess risk of mortality
  4. Air pollution and its effects - 2 (Introduction)
     Adults are affected by pollution as well. In adults, pollution is linked to both long-term and short-term health effects (to name a few):
     - respiratory: COPD, asthma, chronic bronchitis
     - lung cancer
     - cardiovascular morbidity
     - mortality: cancer, all-cause, cardiopulmonary, non-accidental, ...
     Special factors of impact: a person's socioeconomic status (SES) and geographical location.
  5. Air pollution and its effects - 3 (Introduction)
     Figure: Global air pollution map produced by Envisat's SCIAMACHY. Authors: S. Beirle, U. Platt and T. Wagner, University of Heidelberg's Institute for Environmental Physics.
  6. Air pollution and its effects - 4 (Introduction)
     The main contributor to air pollution in urban areas is traffic. Two traffic-related "criteria" air pollutants are considered in this study:
     - nitrogen dioxide (NO2)
     - particulate matter (PM10)
     NO2 effects:
     - short-term: respiratory effects and asthma aggravation
     - long-term: risk of coronary heart disease and fatal events
     PM10 effects:
     - short-term: aggravation of respiratory and cardiovascular diseases, premature death, ...
     - long-term: development of heart and lung diseases, premature death, ...
  7. Air pollution exposure assessment (Introduction)
     Problem: direct measurements of pollution are not always available.
     A large number of models aim to predict pollution at a given location. The main classes are:
     - proximity models
     - geostatistical models
     - land use regression (LUR) models
     - dispersion models
     - integrated meteorological emission (IME) models
     - hybrid models
  8. Conformal predictors for air pollution problem (Introduction)
     Problem: existing methods for air pollution exposure assessment may lack confidence in their predictions.
     To tackle this problem, this research suggests using a newly developed approach: conformal predictors. A conformal predictor is a "confidence predictor" in which the confidence level of the prediction is chosen ad hoc. Its predictions are always valid, which follows from the definition of a conformal predictor.
  9. Conformal predictors for air pollution problem - 2 (Introduction)
     A conformal predictor is defined by some nonconformity measure, and it has two major desiderata:
     - validity of predictions
     - efficiency of predictions
     Conformal predictors are flexible: they can be based on almost any underlying statistical algorithm.
     In air pollution modeling, if a regression-based algorithm such as LUR or kriging is used, the regression residuals serve as the nonconformity measure.
  10. Objectives
      This dissertation has two major objectives:
      1. To demonstrate the capacity of conformal predictors as a method for spatial environmental modeling.
      2. To provide valid estimates of nitrogen dioxide and fine particulate matter for the Barcelona Metropolitan Region.
  11. Kriging (Methods and data)
      Kriging is a spatial interpolation method. It predicts a quantity of interest at an unobserved point on the basis of a set of observed points, and it also provides an estimate of the error variance (called the "kriging variance").
      It was first introduced in 1951 by the South African engineer D. G. Krige in his master's thesis on the estimation of a mineral ore body. The method has been developed further since: nowadays the term "kriging" stands for a set of methods such as ordinary kriging, simple kriging, co-kriging, Bayesian kriging, etc.
      In its simplest form, a kriging estimate at an unobserved location is a linear combination of the observed data. The coefficients of this combination depend on the spatial structure of the data and on the spatial covariance.
  12. Kriging - 2 (Methods and data)
      The most common form of kriging is ordinary kriging. It is used when the mean of the second-order stationary process is unknown. It is based on the geostatistical concept of the variogram and on the related covariance function.
      Let there be $n$ neighboring observed locations $x_1, \ldots, x_n$ and an unobserved location $x_0$ in a spatial domain $D$. Let $\{Z(x) : x \in D\}$ denote the process, and let it have a variogram $\gamma(h)$. Then the ordinary kriging estimate $Z^*_{OK}(x_0)$ at the unobserved point $x_0$ takes the following analytical form:
      $$Z^*_{OK}(x_0) = \sum_{\alpha=1}^{n} \omega_\alpha Z(x_\alpha), \tag{1}$$
      where $\omega_\alpha$ are the kriging weights. Ordinary kriging provides best linear unbiased (BLUE) estimates of a random field, together with an error variance estimate (the kriging variance).
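The weighted sum in equation (1) can be illustrated with a minimal sketch. It assumes the standard ordinary kriging system written in terms of the variogram (weights constrained to sum to one, with one Lagrange multiplier). The exponential variogram, its parameters, the station coordinates and the concentrations are all hypothetical, chosen only for illustration, and are not taken from the thesis.

```python
import math


def exponential_variogram(h, sill=1.0, range_=2.0):
    """Illustrative variogram model (an assumption, not the fitted model)."""
    return sill * (1.0 - math.exp(-h / range_))


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def ordinary_kriging(points, values, target, gamma=exponential_variogram):
    """Return (estimate, kriging variance, weights) at the target location."""
    n = len(points)
    # Variogram values between stations, bordered by the unbiasedness
    # constraint that the weights sum to one.
    a = [[gamma(math.dist(points[i], points[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [gamma(math.dist(p, target)) for p in points] + [1.0]
    solution = solve(a, b)
    weights, mu = solution[:n], solution[n]
    estimate = sum(w * z for w, z in zip(weights, values))
    variance = sum(w * g for w, g in zip(weights, b[:n])) + mu
    return estimate, variance, weights


# Hypothetical station coordinates (km) and concentrations (µg/m3).
stations = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
observed = [40.0, 50.0, 60.0]
estimate, variance, weights = ordinary_kriging(stations, observed, (0.3, 0.3))
print(estimate, variance, weights)
```

At a monitoring station itself the sketch returns the observed value with zero kriging variance, which is the exact-interpolation property of kriging without a nugget effect.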
  13. New methods. Conformal predictors (Methods and data)
      How does it work? We are given pairs of observations $(x_i, y_i)$, where $x_i$ is an object and $y_i$ is a label. Then
      $$\mathbf{Z} := \mathbf{X} \times \mathbf{Y} \tag{2}$$
      denotes the example space; $\mathbf{Z}$ is a measurable space. Given an incomplete data sequence $(x_1, y_1), (x_2, y_2), \ldots, (x_{n-1}, y_{n-1}) \in \mathbf{Z}^*$, the aim is to predict a label $y_n$ for an object $x_n$. An operator
      $$D : \mathbf{Z}^* \times \mathbf{X} \to \mathbf{Y} \tag{3}$$
      is then called a simple predictor (e.g., an ordinary kriging predictor).
  14. New methods. Conformal predictors - 2 (Methods and data)
      The prediction can be described as
      $$\hat{y}_n = D(x_1, y_1, x_2, y_2, \ldots, x_{n-1}, y_{n-1}; x_n), \qquad \hat{y}_n \in \mathbf{Y}. \tag{4}$$
      Let us allow the predictor to output prediction sets $Y_n \subseteq \mathbf{Y}$ large enough to provide confidence in the prediction. This means that the true value of $y_n$ falls in $Y_n$ with a given level of confidence, which is chosen and supplied to the predictor ad hoc.
      A conformal predictor is a confidence predictor defined by some nonconformity measure. Given the measure, a conformal predictor outputs the prediction set under the assumption that the new example conforms with the observed ones.
  15. New methods. Conformal predictors - 3 (Methods and data)
      The ridge regression confidence machine (RRCM) is a regression-based conformal predictor. Its underlying algorithm is the ridge regression procedure (Hoerl and Kennard, 1970).
      Suppose $X_n$ is the $n \times p$ matrix of objects (independent variables) and $Y_n$ is the vector of labels (dependent variable). Then the ridge regression estimate of the parameters $\omega$ used by RRCM takes the form
      $$\omega = (X_n^{\top} X_n + a I_p)^{-1} X_n^{\top} Y_n, \tag{5}$$
      where $a$ is the ridge factor; $a = 0$ yields the standard least squares estimate. The nonconformity scores for this predictor are the absolute regression residuals: $|e_i| := |y_i - \hat{y}_i|$.
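Equation (5) and the residual-based nonconformity scores can be illustrated with a small sketch for $p = 2$, where each row of $X_n$ is $(1, x_i)$ and the $2 \times 2$ inverse is written out by hand. The function names and the data are hypothetical. Note that, taken literally, (5) also shrinks the intercept, and the sketch follows that reading.

```python
def ridge_estimate(xs, ys, a):
    """Ridge estimate (X'X + a I)^-1 X'Y for rows X_i = (1, x_i)."""
    s00 = float(len(xs)) + a                 # sum of 1*1, plus ridge factor
    s01 = sum(xs)                            # sum of 1*x_i
    s11 = sum(x * x for x in xs) + a         # sum of x_i^2, plus ridge factor
    t0 = sum(ys)                             # X'Y, first component
    t1 = sum(x * y for x, y in zip(xs, ys))  # X'Y, second component
    det = s00 * s11 - s01 * s01
    w0 = (s11 * t0 - s01 * t1) / det
    w1 = (s00 * t1 - s01 * t0) / det
    return w0, w1


def nonconformity_scores(xs, ys, w):
    """Absolute residuals |y_i - yhat_i| of the fitted line."""
    return [abs(y - (w[0] + w[1] * x)) for x, y in zip(xs, ys)]


# Hypothetical data lying exactly on y = 2 + 3x.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [2.0, 5.0, 8.0, 11.0]
print(ridge_estimate(xs, ys, 0.0))   # least squares case, a = 0
print(ridge_estimate(xs, ys, 1.0))   # coefficients shrink for a > 0
```

With $a = 0$ the sketch reproduces the least squares line, and increasing $a$ shrinks the coefficient vector toward zero.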
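  16. New methods. Conformal predictors - 4 (Methods and data)
      Given a chosen significance level (roughly, the probability of error that must not be exceeded), an RRCM predictor outputs a set of labels $y$ for $y_n$. The set is built from the sets
      $$S_i := \{y : \alpha_i(y) \ge \alpha_n(y)\} = \{y : |a_i + b_i y| \ge |a_n + b_n y|\}, \tag{6}$$
      where $\alpha_i(y) = |a_i + b_i y|$ is the nonconformity score of example $i$ when the candidate label $y$ is assigned to $x_n$, and $a_i$ and $b_i$ are the components of the vectors $A$ and $B$. A candidate label $y$ enters the prediction set when the fraction of indices $i$ with $y \in S_i$ exceeds the significance level.
      Unlike kriging, which outputs point predictions, RRCM outputs prediction sets. These sets can take the form of a point, an interval, a ray, a union of two rays, the whole real line, or the empty set. Usually the set is an interval.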
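The construction of a prediction set can be sketched by brute force: try each candidate label on a grid, refit the underlying regression with the candidate included, and keep the candidate if enough examples are at least as nonconforming as the new one. This is a grid-based stand-in for illustration, not the exact RRCM algorithm, which obtains the same kind of set analytically from the vectors $A$ and $B$. The one-parameter ridge model without intercept, the function names and the data are all assumptions.

```python
def ridge_slope(xs, ys, a):
    """One-parameter ridge estimate: sum(x*y) / (sum(x^2) + a)."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + a)


def p_value(xs, ys, x_new, y_candidate, a):
    """Fraction of examples at least as nonconforming as the new one."""
    xs_aug = xs + [x_new]
    ys_aug = ys + [y_candidate]
    w = ridge_slope(xs_aug, ys_aug, a)
    # Nonconformity scores: absolute residuals, new example last.
    scores = [abs(y - w * x) for x, y in zip(xs_aug, ys_aug)]
    return sum(s >= scores[-1] for s in scores) / len(scores)


def prediction_set(xs, ys, x_new, grid, epsilon, a=1.0):
    """Candidate labels whose p-value exceeds the significance level."""
    return [y for y in grid if p_value(xs, ys, x_new, y, a) > epsilon]


# Hypothetical training data, roughly y = 2x with small noise.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0]
grid = [i * 0.5 for i in range(61)]          # candidate labels 0.0 .. 30.0
region = prediction_set(xs, ys, 10.0, grid, epsilon=0.2)
print(min(region), max(region))
```

Lowering the significance level can only enlarge the set, so the sets for different confidence levels are nested.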
  17. New methods. Conformal predictors - 5 (Methods and data)
      When the number of parameters $p$ is large, computation is hard. The "kernel trick" is a method that helps to deal with high-dimensional data. It also makes it possible to account for nonlinearity in RRCM.
      A kernel is a similarity measure that operates in a feature space. Given an input space $\mathbf{X}$ with a dot product and an operator $\Phi$ that maps $\mathbf{X}$ to a feature space $\mathcal{H}$,
      $$\Phi : \mathbf{X} \to \mathcal{H}, \qquad x \mapsto \mathbf{x} := \Phi(x),$$
      a kernel is defined as follows. For $x_\alpha, x_\beta \in \mathbf{X}$:
      $$k(x_\alpha, x_\beta) = \langle \Phi(x_\alpha), \Phi(x_\beta) \rangle \tag{7}$$
  18. New methods. Conformal predictors - 6 (Methods and data)
      Any conventional covariance function for kriging can be used as a kernel for RRCM. This research uses three (positive definite) kernels:
      - a dot product kernel (default)
      - a radial basis Gaussian kernel
      - an inhomogeneous polynomial kernel of the second degree
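The three kernels listed above can be written down in a few lines. The sketch below uses common textbook parametrizations, which are assumptions: the slides do not state the bandwidth `sigma` of the Gaussian kernel or the constant in the polynomial kernel (taken as 1 here). For two-dimensional inputs it also checks definition (7) numerically for the polynomial kernel, using an explicit feature map `poly2_features` introduced only for this illustration.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def linear_kernel(x, z):
    """Dot product kernel."""
    return dot(x, z)


def gaussian_kernel(x, z, sigma=1.0):
    """Radial basis Gaussian kernel; sigma is an assumed bandwidth."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


def poly2_kernel(x, z):
    """Inhomogeneous polynomial kernel of degree 2: (<x, z> + 1)^2."""
    return (dot(x, z) + 1.0) ** 2


def poly2_features(x):
    """Explicit feature map for poly2_kernel on 2-D inputs."""
    x1, x2 = x
    r = math.sqrt(2.0)
    return (x1 * x1, x2 * x2, r * x1 * x2, r * x1, r * x2, 1.0)


x, z = (1.0, 2.0), (3.0, -1.0)
# The kernel value equals the dot product in the 6-D feature space,
# without the feature vectors ever being needed to evaluate the kernel.
print(poly2_kernel(x, z), dot(poly2_features(x), poly2_features(z)))
```

The point of the check is that the kernel evaluates a dot product in a six-dimensional feature space while only ever touching the two-dimensional inputs.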
  19. Computing (Methods and data)
      All computational work was done in R.
      - Kriging: geoR package, function krige.conv.
      - RRCM: PredictiveRegression package, function iidpred.
      - "Kernel trick": self-developed functions (based on the PredictiveRegression package) for RRCM in "dual form" and for implementing the kernels.
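The slides do not spell out the "dual form". A standard way to write it, given here as a sketch rather than as the thesis's own derivation, starts from the ridge estimate (5) and uses a matrix identity that holds for $a > 0$. The symbols $K$ and $k(x)$ are introduced here for illustration only.

```latex
\begin{align*}
\omega &= (X_n^{\top} X_n + a I_p)^{-1} X_n^{\top} Y_n
        = X_n^{\top} (X_n X_n^{\top} + a I_n)^{-1} Y_n, \qquad a > 0, \\
\hat{y}(x) &= x^{\top} \omega = k(x)^{\top} (K + a I_n)^{-1} Y_n, \\
K_{ij} &= \langle x_i, x_j \rangle, \qquad k(x)_i = \langle x_i, x \rangle .
\end{align*}
```

In this form the objects enter only through dot products, so each $\langle \cdot, \cdot \rangle$ can be replaced by a kernel as in (7), and the matrix to invert is $n \times n$ rather than $p \times p$.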
  20. Data (Methods and data)
      The data for this study were kindly provided by XVPCA (Network for Monitoring and Forecasting of Air Pollution) of the Generalitat de Catalunya.
      Mean annual concentrations of two criteria pollutants, NO2 and PM10, are provided for the Barcelona Metropolitan Region, together with the geographical coordinates of the monitoring stations (Mercator, UTM 31).
      Time frames:
      - NO2: 1998-2009, excluding 2003
      - PM10: 2001-2009, excluding 2003
  21. Data - 2 (Methods and data)
      There are 49 monitoring stations over the area in total.
      The Barcelona Metropolitan Region (BMR) covers about 3200 km2 and has over 5 million inhabitants.
      About 107 million trips are made in the BMR every week, 54.1% of them by motorized transport.
  22. Data - 3 (Methods and data)

      Table 1. Mean annual nitrogen dioxide concentrations: available observations for each year

      Year          1998  1999  2000  2001  2002  2004  2005  2006  2007  2008  2009
      Observations    24    25    25    25    25    24    22    24    25    25    24

      Table 2. Mean annual particulate matter concentrations: available observations for each year

      Year          2001  2002  2004  2005  2006  2007  2008  2009
      Observations    22    24    28    28    29    30    33    36
  23. Data - 4 (Methods and data)
      Two major drawbacks, or limiting factors, of the data set:
      - Size: the number of observations for each year and pollutant is small.
      - Distribution: the measurement sites lie quite far apart from one another and are spread unevenly over the region.
      Also, the data are annual means; more frequent observations were unavailable for this study.
  24. Ordinary kriging and RRCM modeling results (Results)
  25. Ordinary kriging and RRCM modeling results - 2 (Results)
  26. Ordinary kriging and RRCM modeling results - 3 (Results)
  27. Kernelisation: a Gaussian kernel (Results)
  28. Kernelisation: a Gaussian kernel - 2 (Results)
  29. Kernelisation: a Gaussian kernel - 3 (Results)
  30. Comparison of the RRCM models (Results)
  31. Comparison of the RRCM models - 2 (Results)
  32. Comparison of the RRCM models - 3 (Results)

      Table: Comparison of models for different ridge factors (µg/m3)

      Kernel        linear iid              RBF                     polynomial
      Ridge factor  0.01    1       2       0.01    1       2       0.01    1       2
      2001          64.46   64.44   67.13   71.08   63.11   66.06   71.95   74.63   77.24
      2002          43.43   42.46   45.54   47.41   42.91   45.05   50.44   53.17   55.82
      2004          47.26   39.17   34.59   51.48   39.29   35.19   34.66   37.00   39.51
      2005          39.65   45.14   49.28   35.50   47.60   51.91   51.44   54.76   57.76
      2006          47.68   45.40   48.63   55.51   46.09   48.86   52.48   55.27   57.86
      2007          91.43   94.02   96.45   85.40   94.09   96.65   99.83   102.11  104.29
      2008          49.48   50.90   52.58   45.42   55.27   58.21   55.60   57.26   58.91
      2009          28.42   27.32   29.01   29.16   26.11   27.79   32.26   33.67   35.09
  33. Comparison of the RRCM models - 4 (Results)
  34. Comparison of the RRCM models - 5 (Results)
  35. Comparison of the RRCM models - 6 (Results)

      Table: Comparison of models for different ridge factors (µg/m3)

      Kernel        linear iid              RBF                     polynomial
      Ridge factor  0.01    1       2       0.01    1       2       0.01    1       2
      1998          76.08   72.33   68.27   65.81   72.37   68.37   65.27   64.71   65.99
      1999          66.31   60.11   61.44   67.68   60.57   60.39   65.32   68.20   70.87
      2000          51.69   55.27   57.89   50.91   52.90   55.63   61.89   64.19   66.38
      2001          36.25   41.30   44.90   35.32   38.65   42.36   49.54   52.34   54.95
      2002          52.12   46.57   49.51   47.78   51.44   57.38   54.51   56.99   59.37
      2004          53.65   59.11   62.46   53.89   56.95   60.41   67.06   69.36   71.60
      2005          78.75   84.77   88.57   79.44   82.18   86.14   94.41   96.94   99.43
      2006          61.79   66.39   69.78   61.24   63.82   67.38   74.90   77.36   79.76
      2007          47.01   49.35   53.13   48.15   47.11   51.04   57.15   59.91   62.48
      2008          46.96   50.15   53.58   47.45   48.04   51.55   57.63   60.21   62.63
      2009          55.59   55.17   53.89   48.38   54.35   52.68   52.79   55.19   57.57
  36. Efficiency of predictions (Discussion)
      Kriging predictions are smooth and vary little; note that they are made for mean annual data. The error estimates, however, are large for nitrogen dioxide and small for airborne particles, which reflects the properties of the substances: NO2 is known to be generally more variable than PM10.
      Kriging intervals can be derived by assuming that the data are Gaussian. This assumption is common but not always correct. RRCM makes no assumption about the data distribution other than that the data are iid.
      Two factors help boost the efficiency of RRCM predictions: kernels and the ridge factor. The latter is chosen by brute force (the method of successive approximations).
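Under the Gaussian assumption, a kriging interval is usually formed as the kriging estimate plus or minus a normal quantile times the square root of the kriging variance. The sketch below shows that textbook construction. The function name and the numbers are hypothetical, and the slides do not say that the intervals in the thesis were computed in exactly this way.

```python
from statistics import NormalDist


def gaussian_kriging_interval(estimate, kriging_variance, confidence=0.95):
    """Symmetric interval: estimate +/- z * sqrt(kriging variance)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    half_width = z * kriging_variance ** 0.5
    return estimate - half_width, estimate + half_width


# Hypothetical kriging output: estimate 50 µg/m3, kriging variance 25.
lower, upper = gaussian_kriging_interval(50.0, 25.0)
print(lower, upper)
```

The coverage of such an interval depends on the Gaussian assumption holding, whereas the RRCM prediction sets rely only on the iid assumption.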
  37. Conformal predictors and geostatistics (Conclusion)

      Table: Comparison of OK and RRCM

      OK                                        RRCM
      point predictions                         prediction sets (usually intervals)
      regression algorithm                      regression algorithm
      Gaussianity assumption                    iid assumption
      estimates error variance                  -
      uses variogram and covariance function    uses any appropriate kernel
        to approximate it
      -                                         ridge factor
      may lack confidence                       confidence level is chosen and guaranteed
  38. Future research (Conclusion)
      - Extend the existing data set for the BMR
      - Provide additional validation for the methods
      - Test these models on data for other cities
      - Develop conformal predictors on the basis of other popular air pollution exposure modeling algorithms (land use regression, dispersion models, etc.)
  39. Selected references
      - V. Vovk, A. Gammerman, G. Shafer, Algorithmic learning in a random world, Springer (2005).
      - V. Vovk, I. Nouretdinov, A. Gammerman, On-line predictive linear regression, The Annals of Statistics (2009).
      - H. Wackernagel, Multivariate geostatistics: an introduction with applications, Springer (2003).
      - B. Schölkopf, A. J. Smola, Learning with kernels: support vector machines, regularization, optimization, and beyond, MIT Press (2002).
      - A. Lertxundi-Manterola, M. Saez, Modelling of nitrogen dioxide (NO2) and fine particulate matter (PM10) air pollution in the metropolitan areas of Barcelona and Bilbao, Spain, Environmetrics (2009).
  40. Selected references - 2
      - A. Hoerl, R. Kennard, Ridge regression: Biased estimation for nonorthogonal problems, Technometrics 12.1 (1970).
      - P. Diggle, P. Ribeiro Jr., Model-Based Geostatistics, Springer (2007).
      - P. Ribeiro Jr., P. Diggle, geoR: a package for geostatistical analysis, R-NEWS 1.2 (2001).
      - N. Cressie, Statistics for spatial data, Wiley (1993).
      - M. Jerrett et al., A review and evaluation of intraurban air pollution exposure models, Journal of exposure analysis and environmental epidemiology (2005).