IMPACTS OF COVID-19 ON URBAN AIR
POLLUTION IN LONDON
LAURENT JOSE LACAZE SANTOS
Supervisor: Dr. JAMES HAWORTH
Department of Civil, Environmental & Geomatic Engineering
University College London | UCL
Year of Submission: 2020
Submitted in partial fulfilment of the requirements for the degree of MSc in Geospatial Sciences,
Geographic Information Science and Computing
September, 2020
Abstract
The development of models to assess air pollution exposure is a major topic for sustainable
development in urban areas. This study aims to compare emissions of nitrogen dioxide (NO2)
particles in the pre-Covid-19 period and during the nationwide Covid-19 lockdown in inner
London.
The inventory of pollution data is provided by a network of monitoring stations used to assess
and monitor pollutant particles. The study applies exploratory data analysis and spatial
kriging interpolation to estimate pollution at unsampled locations and provide insightful
information for pollution control policy. The study demonstrates that coupling pollution data
analysis with predictive spatial modelling improves the understanding of the impact of
Covid-19 on NO2 emissions.
Several conclusions are drawn. Firstly, the lockdown has substantially changed the spatial
concentration of pollutants in the study area. The analysis also demonstrates that NO2 levels
exhibit temporal patterns during the lockdown. The study concludes that the lockdown has
had a positive effect on air quality in London. Looking forward, opportunities for future
research are presented.
Acknowledgments
I am grateful to the many people who in diverse ways have been involved in the completion of
this dissertation. In particular, I would like to express my gratitude to my supervisor, James
Haworth, who provided important advice throughout the study and introduced and equipped
me with the techniques of spatial-temporal data analysis and big data analytics.
I must also thank Tao Cheng, professor in geoinformatics at UCL, for all the skills and
ongoing academic mentorship in the field of spatial analysis and geocomputation. To Mohamed
Ibrahim, PhD researcher at SpaceTimeLab, for all the support and debugging with R
programming.
I am also grateful to all professors of the Department of Civil, Environmental and Geomatic
Engineering (CEGE) who have been generous with their time and expertise.
This dissertation is dedicated to my wife Alessandra, my children Nicole and Thomas, to my
father, Jose Roberto, who was always anxious about updates, to my mother Renee, and my
brother Francois.
CONTENTS
ABSTRACT
ACKNOWLEDGMENTS
LIST OF FIGURES
LIST OF TABLES
ABBREVIATIONS
CHAPTER 1. BACKGROUND AND OBJECTIVES
1.1 AIM AND RESEARCH QUESTIONS
CHAPTER 2. LITERATURE REVIEW
2.1 SPATIAL AIR POLLUTION MODELLING
2.2 AIR POLLUTION EXPOSURE MONITORING IN GREATER LONDON
2.3 EXPLORATORY SPATIAL DATA ANALYSIS
2.3.1 Spatial Dependence and Autocorrelation
2.3.2 Correlation in Temporal Data
2.4 GEOSTATISTICAL INTERPOLATION
2.4.1 Spatial Interpolation Methodologies
2.4.2 Kriging
2.4.3 Spatial Variance
CHAPTER 3. DATA
CHAPTER 4. METHODOLOGY
4.1 EMPIRICAL FRAMEWORK
CHAPTER 5. RESULTS
5.1 DATA PATTERNS AND CHARACTERISTICS
5.2 SPATIAL PATTERNS
5.3 TEMPORAL PATTERNS
5.3.1 Temporal Autocorrelation
5.4 SPACE-TIME SEMIVARIOGRAM
5.5 SPATIAL INTERPOLATION
5.5.1 Semivariogram Modelling
5.5.2 Spatial Prediction with Kriging
CHAPTER 6. ANALYSIS AND DISCUSSION
CHAPTER 7. RECOMMENDATIONS FOR FURTHER STUDIES
REFERENCES
List of Figures
Figure 2.1 Sample empirical semivariogram
Figure 2.2 The semivariance parameters
Figure 3.1 Map of the 71 monitoring stations inside Inner London Authority Boundary
Figure 4.1 Outlook of the methodological approach
Figure 5.1 Frequency of daily mean NO2 by season (in µg/m³)
Figure 5.2 Frequency of daily mean NO2 (in µg/m³)
Figure 5.3 Outlook of daily concentration of NO2 (in µg/m³)
Figure 5.4 Pairwise Scatterplots of GRS and Mean NO2
Figure 5.5 Network of Monitoring Stations and Mean NO2 by Season
Figure 5.6 Daily concentration of NO2 (in µg/m³) by days of week and season
Figure 5.7 Daily concentration of NO2 (in µg/m³) by month in 2019 (top) and by season (bottom)
Figure 5.8 Intraday concentration of NO2 (in µg/m³) by season and the period of lockdown
Figure 5.9 Temporal autocorrelation
Figure 5.10 Temporal autocorrelation function (ACF) for 2019 and lockdown period
Figure 5.11 Temporal autocorrelation function (ACF) lag plots for 2019 and lockdown period in 2020
Figure 5.12 Partial temporal autocorrelation function (PACF) for 2019 and lockdown period
Figure 5.13 2D semivariogram for NO2 in Inner London
Figure 5.14 3D semivariogram for NO2 in Inner London
Figure 5.15 Semivariogram cloud plot for NO2 data
Figure 5.16 Empirical semivariogram for NO2 data
Figure 5.17 Directional variogram
Figure 5.18 Semivariance with fitted model function
Figure 5.19 Spatial Prediction with Kriging
List of Tables
Table 2.1 Urban Air Pollution modelling methodologies
Table 2.2 The Environment Research Group Urban Pollution Modelling Methods
Table 2.3 Interpolation methods
Table 2.4 The main forms of linear Kriging
Table 5.1 Summary of statistics of daily NO2 (µg/m³) in Inner London Boundary
Table 5.2 Statistics of daily NO2 (µg/m³)
Table 5.3 Fitting accuracy of the semivariance modelling
Table 5.4 Fitting accuracy of the semivariance modelling
Abbreviations
ACDC Air Quality Data Commons
ACF Autocorrelation function
ADMS Atmospheric Dispersion Modelling System
AIO Area of interest
BLUE Best linear unbiased estimate
CMAQ Community Multi-scale Air Quality Model
CREA Centre for Research on Energy and Clean Air
EDA Exploratory data analysis
ERG The Environmental Research Group
ESDA Exploratory spatial data analysis
ESTDA Exploratory spatio-temporal data analysis
EU European Union
GIS Geographic information systems
GLA Greater London Authority
GRS Geographic reference system
IDW Inverse distance weighting
KCL King’s College London
LAEI London Atmospheric Emissions Inventory
LAQM Local Air Quality Management
LAQN Breathe London Air Quality monitoring network
LHEM London hybrid exposure model
LISA spatial local indicators of spatial association
MAQS Mayor’s Air Quality Strategy
OK Ordinary Kriging
ONS Office for National Statistics
PACF Partial autocorrelation function
PMCC Pearson’s product moment correlation
QQ plot Quantile-quantile plot
TFL Transport for London
UCL University College London
UK United Kingdom
ULEZ Ultra low emission zone
WHO World Health Organisation
Chapter 1. Background and Objectives
The development of models to assess air pollution exposure within cities is a major topic for
sustainable development in urban areas. People travel and are exposed to air pollutants
differently. They might use different modes of transport to get to work and school at different
times. Variability in air pollution is intrinsically associated with trends in emission components
and exposes populations at different spatial and temporal scales.
Poor air quality has long been recognized as having adverse effects on health. Improving the
understanding of these effects requires air monitoring systems and modelling predictions,
especially in urban areas where pollutant concentrations coincide with high population
densities. Health and epidemiological studies provide sufficient evidence of a causal
relationship between air quality and health effects such as asthma, impaired lung function, total
and cardiovascular mortality and cardiovascular morbidity (Beevers et al, 2013).
In London, network monitoring stations have been used to assess and monitor pollutant
particles. Information on emissions can provide a representation of pollutant concentrations
with data brought together from a relatively small number of sites. Modelling air pollution is
an effective way to understand how pollution affects urban areas and provide insights to
ground emission control measures. This calls for theory and methods to gain a better
understanding of the observed spatial and temporal processes on pollutant data.
1.1 Aim and Research Questions
This study aims to compare the exposure to nitrogen dioxide (NO2) particles in ordinary
circumstances and during the enforced Covid-19 lockdown period in 2020 in London, in order
to provide insights into NO2 emissions at the local level.
The work also aims to capture the temporal variation of emissions in order to represent the
short-term variations of concentration across London in 2019 and compare them to the disruptive
event of the Covid-19 pandemic in 2020, with important implications for future emission control
and urban environmental strategies in predicting outdoor human exposure.
A key contribution of this work is to provide insightful information for pollution control
strategies. The results can be used on geo-referenced data to understand human exposure to
NO2 at different places and provide evidence to recommend policy improvements for
pollution control and sustainable development.
The main question of the dissertation can be summarised as follows:
• How does NO2 pollution change at different spatial and temporal scales with the
disruptive economic and environmental event of Covid-19 in London?
The following sub-questions can also be stated:
• What are the usual NO2 concentrations in London in an ordinary year compared
to the enforced lockdown period in 2020?
• How do NO2 emissions during the lockdown differ from typical patterns?
Chapter 2. Literature Review
In this section, we will introduce urban air pollution modelling methodologies and their main
characteristics. Recent studies on the impact of the Covid-19 enforced lockdown are also
mentioned.
The chapter also aims to introduce the main governance scheme for ambient air quality in
London as well as the main research projects that study urban air pollution in the city.
Finally, the section explores the methodologies of exploratory data analysis and the
techniques of geostatistical interpolation.
2.1 Spatial Air Pollution Modelling
Ambient concentrations of air pollutants at potentially harmful levels in urban areas have
been the subject of scientific studies seeking to understand the characteristics and driving forces
of atmospheric pollution (Jerret et al., 2005; Elliot et al, 2000; Steinle et al, 2013). There is
worldwide concern about air quality in intraurban areas, and methodologies for modelling and
monitoring air concentrations of pollutants have been developed to provide a reference for
formulating and designing preventive measures (Jerret et al., 2005).
Air pollution in cities is subject to high spatial and temporal variability. Exposure results
from the relationships and interactions between environmental and human systems (Steinle et
al, 2013). Large cities with high population densities are especially affected, and every
individual has unique activity patterns that result in differing exposures. Therefore,
accurately measuring exposure at fine spatial and temporal scales is of crucial importance.
As a result, the level of exposure and the impact of air pollution have been subject to
assessment and control policies in the UK, such as the UK National Air Quality Strategy
(NAQS) and the London Environment Strategy (Mayor of London, 2018).
To assess and control pollution in urban environments, a network of air quality
monitoring stations is usually used. With the inventory of pollutants provided by monitoring
stations, a range of methods have been used to measure human exposure to air pollution. The
process of monitoring pollutant particles has been defined by Zartarian et al. (2007) as “… the
process of estimating or measuring magnitude, frequency and duration of exposure to an agent
(…)” (p.58).
The methods vary in their sophistication and attempt to develop exposure models capable of
identifying small-area variations in pollution. Simple measures such as proximity to road traffic
can serve as a proxy to better capture the intraurban variability in pollutant concentrations.
More sophisticated techniques apply dispersion, atmospheric and time-activity models with
geographic information systems (GIS) capabilities (Jerret et al, 2005). The latest advances in
technology enable the tracking of individuals while simultaneously measuring pollutant
concentrations with individual monitor sensors (Steinle et al., 2013).
In the literature, diverse models to assess air pollution exposure within cities are reviewed (Jerret
et al., 2005; Elliot et al, 2000). In broad terms, methodologies can be grouped into four
classes, namely: (i) spatial regression, (ii) geostatistical interpolation, (iii) dispersion models
and (iv) hybrid models. Table 2.1 presents their main characteristics.
Table 2.1
Urban Air Pollution modelling methodologies
Methodologies and their applications:
Spatial regression
• Models the concentration of pollutants as a function of predictor variables.
• Establishes a statistical relationship between pollutants and variables such as surrounding
land use, population density, traffic pattern and meteorological data.
Geostatistical interpolation
• From data collected at a set of monitoring stations, estimates the concentration of
pollutants in neighbouring areas.
• Interpolation is based on pure spatial or spatial-temporal modelling.
Dispersion models
• Simulate pollution fate and transport with atmospheric data and time-activity models
(e.g. KCLurban [1] and CMAQ-urban [2]).
Hybrid models
• Combine personal or household exposure monitoring with one or two different methods
(e.g. LHEM [3]).
[1] Reference: http://www.erg.kcl.ac.uk/research/home/modelling-pollution-in-london.html [Accessed 18 July 2020].
[2] Reference: http://www.erg.kcl.ac.uk/research/home/modelling-pollution-in-london.html [Accessed 18 July 2020].
[3] Reference: https://pubs.acs.org/doi/abs/10.1021/acs.est.6b01817 [Accessed 18 July 2020].
Spatial regression seeks to predict pollution concentrations at a given location based on
surrounding land use and traffic characteristics. The method uses the measured pollution
concentration yi at location s as the response variable, and independent variables xi within areas
called buffers as predictors of the measured concentrations (Jerret et al, 2005). The regression
modelling aims to predict pollution surfaces as a function of exogenous independent variables
at any spatio-temporal resolution.
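The regression idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the six sites, the two buffer predictors (a traffic intensity and a green-space share) and the NO2 values are hypothetical and synthetic, and the helper names `fit_ols` and `predict` are not taken from the study.

```python
import numpy as np

# Hypothetical buffer predictors for six monitoring sites (illustrative values only):
# column 0 = traffic intensity inside the buffer, column 1 = share of green space.
X = np.array([
    [12.0, 0.10],
    [8.0, 0.25],
    [15.0, 0.05],
    [5.0, 0.40],
    [10.0, 0.20],
    [3.0, 0.55],
])

# Synthetic NO2 concentrations built from an exact linear rule so that the
# fitted coefficients can be checked: y = 20 + 2 * traffic - 10 * green.
y = 20.0 + 2.0 * X[:, 0] - 10.0 * X[:, 1]


def fit_ols(predictors, response):
    """Return least-squares coefficients [intercept, b1, b2, ...]."""
    design = np.column_stack([np.ones(len(predictors)), predictors])
    coef, *_ = np.linalg.lstsq(design, response, rcond=None)
    return coef


def predict(coef, new_predictors):
    """Predict the response at unsampled locations from their buffer predictors."""
    new_predictors = np.atleast_2d(np.asarray(new_predictors, dtype=float))
    return coef[0] + new_predictors @ coef[1:]


coef = fit_ols(X, y)
pred = predict(coef, [7.0, 0.30])  # prediction for one unsampled location
```

Because the synthetic data follow the linear rule exactly, the fit recovers the intercept and slopes used to generate them; with real monitoring data the residuals would also need to be inspected.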
Interpolation techniques also rely on pollutant data derived from monitoring stations. The
aim is to estimate the concentration of pollutants at sites other than the
stations. By means of a grid imposed over the study area, a continuous surface of pollutants
can be obtained. The most common geostatistical interpolation technique is kriging, which
models spatial dependence to develop continuous surfaces of pollution. It provides the best
linear unbiased estimate (BLUE) of the variable’s value at any point of the study
area (Burrough and McDonnel, 1998). The predicted values and their standard errors, called
kriging variance, quantify the degree of uncertainty in spatial predictions at any site. Other
methods such as splines, inverse distance weighting and Thiessen triangulation, which rely on
deterministic algorithms, are also commonly applied as interpolation methods but do not offer
means to estimate errors (Jerret et al, 2005).
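To make the grid-based idea concrete, the sketch below implements inverse distance weighting, one of the deterministic methods mentioned above, rather than kriging. The four station coordinates, the NO2 values, the 3 x 3 grid and the function name `idw_interpolate` are assumptions chosen for illustration only.

```python
import numpy as np


def idw_interpolate(stations, values, targets, power=2.0):
    """Inverse distance weighting: weighted mean with weights 1 / distance**power."""
    stations = np.asarray(stations, dtype=float)
    values = np.asarray(values, dtype=float)
    targets = np.atleast_2d(np.asarray(targets, dtype=float))
    # Pairwise distances with shape (n_targets, n_stations).
    dist = np.linalg.norm(targets[:, None, :] - stations[None, :, :], axis=2)
    estimates = np.empty(len(targets))
    for k, row in enumerate(dist):
        exact = row == 0.0
        if exact.any():
            # A target that coincides with a station takes the measured value.
            estimates[k] = values[exact][0]
        else:
            weights = 1.0 / row ** power
            estimates[k] = np.sum(weights * values) / np.sum(weights)
    return estimates


# Hypothetical stations at the corners of a 2 x 2 square, with made-up NO2 values.
stations = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
no2 = [40.0, 30.0, 50.0, 20.0]

# Regular grid imposed over the study area.
gx, gy = np.meshgrid(np.linspace(0.0, 2.0, 3), np.linspace(0.0, 2.0, 3))
grid = np.column_stack([gx.ravel(), gy.ravel()])
surface = idw_interpolate(stations, no2, grid).reshape(gx.shape)
```

The result is a continuous surface bounded by the observed values, but, as noted in the text, the method gives no estimate of prediction error, which is the main reason to prefer kriging when uncertainty matters.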
Dispersion models generally rely on deterministic process assumptions and require the use
of meteorological conditions and topography in conjunction with emission data. They aim to
offer a more realistic representation of the problem. Meteorological data provide information
about wind speed and direction, ambient temperature, solar radiation and atmospheric stability
(Gualtieri and Tartaglia, 1998). Once calibrated, the model computes the pollution levels
across the extent of the study area. Beevers et al (2013) explain that the main advantage of the
dispersion modelling-based approach lies in its ability to disaggregate by composite and source
origin in order to predict past and future air quality as well as to assess the impact of prevention
measures.
Hybrid models combine two modelling approaches, usually personal or
regional monitoring with other air pollution exposure methods. Personal exposure assessment
is evolving quickly and latest advances in technology enable the tracking of individuals while
simultaneously measuring pollutant concentrations (Steinle et al, 2013).
Overall, the dispersion models are considered more reliable than the others but require a
substantial amount of data on emissions and meteorology (Jerret et al., 2005).
Finally, it is worth mentioning that intraurban pollution is a major public health issue, as recent
studies suggest that air pollution may play an important role in helping understand
and combat the spread of the Covid-19 pandemic. Particles of pollution might help carry the
virus further afield, suggesting a link between death rate and the spread of diseases (The
Guardian, 2020).
The Centre for Research on Energy and Clean Air (CREA) has also studied the impact that
the enforced Covid-19 lockdown has had on air pollution levels in 12 big cities around the
world. The studies indicate that nitrogen dioxide (NO2) levels fell by about
27% ten days after governments issued stay-at-home orders, compared with the same period
in 2017-19. Another component, particulate matter (PM), declined by an average of about 5%
in a group of 12 big cities for which data are readily available (The Economist, 2020).
2.2 Air Pollution Exposure Monitoring in Greater London
With a population of more than 8 million people according to the 2011 census (ONS, 2012),
London is one of the largest cities in the world.
In November 1999, The Greater London Authority Act received Royal Assent to provide the
governance framework for the Mayor of London, leading to the publishing of the Mayor’s
Air Quality Strategy (MAQS) for Greater London. The MAQS aims to meet the requirements
of the Local Air Quality Management (LAQM), an important part of the Government’s
strategy to meet both the UK air quality objectives and the EU limit values (Oxley et al, 2009).
LAQM requires all local authorities to carry out regular assessments of air quality in their
boundaries.
The London Atmospheric Emissions Inventory (LAEI) [4] is the key tool for air quality analysis
and policy development in London. Provided by the Greater London Authority (GLA), it is a
regularly updated database of pollutant emissions. LAEI data derive from two networks of
air quality monitoring stations: 1. Regulatory air quality monitoring sites, managed by
GLA and 2. Breathe London Air Quality monitoring network (LAQN), provided by Transport
for London (TFL).
The Environmental Research Group (ERG) [5], part of the School of Population Health &
Environmental Sciences at King’s College London, has since the publishing of the MAQS
developed and provided air quality research and information in London and the United
Kingdom (UK). The main outputs of the ERG consist of a hybrid model and dispersion
modelling systems. Table 2.2 depicts the main characteristics of the methodologies applied
by the ERG to urban air pollution in London.
[4] Reference: https://data.london.gov.uk/air-quality/
[5] Reference: https://www.kcl.ac.uk/lsm/research/divisions/aes/research/ERG
Table 2.2
The Environmental Research Group Urban Pollution Modelling Methods
Dispersion Models: KCLurban
• Is a dispersion modelling system using the Atmospheric Dispersion Modelling System
(ADMS) dispersion model and the road source model from the London Atmospheric
Emissions Inventory (LAEI).
• Gives annual mean air quality predictions on a regular 20 m x 20 m grid.
Dispersion Models: Community Multi-scale Air Quality Model (CMAQ-urban)
• Is deterministic (uses fundamental physics and chemistry) and runs over a much larger
model domain.
• Predicts hourly concentrations coupling road models with regional site monitoring
stations.
Hybrid Models: London Hybrid Exposure Model (LHEM)
• Uses anonymous activity data provided by Transport for London, and advanced air
pollution and micro-environmental modelling.
Source: adapted from Beevers et al (2013).
Both dispersion models (i.e. KCLurban and CMAQ-urban) establish the spatio-temporal patterns
of NOx-NO2, PM10 and PM2.5. KCLurban uses the ADMS dispersion model as well as intraurban
road traffic and meteorological data. Beevers et al (2013) explain that KCLurban has been
used in air quality decision making in London under the Mayor’s Air Quality Strategy.
KCLurban gives annual mean air quality predictions of pollutants on a regular 20 m x
20 m grid.
For its part, the CMAQ-urban model is deterministic and predicts hourly concentrations of
pollutants across London on a 20 m x 20 m grid.
The London Hybrid Exposure Model (LHEM) combines a dispersion model at small spatial
and temporal (hourly) scales with detailed space-time-activity data collected by TFL. The hybrid
model makes it possible to estimate the exposure misclassification associated with using
estimates of average concentration at the home post code and to increase the understanding of
interactions between exposure and vulnerable sub-groups (Beevers et al, 2013). It aims to
provide a more accurate measurement of pollutant exposure by considering individual-level
data with details from approximately 200,000 journeys.
LHEM compares hourly concentrations of NOx, NO2, O3, CO, PM10 and PM2.5 with measured
hourly concentrations at 42 automatic monitoring sites. The monitoring sites are those of the
London Air Quality Network (ibid, 2013).
2.3 Exploratory Spatial Data Analysis
A fundamental task prior to any data analysis is to examine the structure and the
characteristics of the dataset. Initial examination of the data, visually or using descriptive
statistics, is a powerful way to better understand the data on hand.
This pre-modelling exploration is helpful to understand patterns before setting up any
statistical modelling of spatial process. This initial examination provides useful information
of variables and a framework to tackle spatial problems. The goal is to develop an
understanding of the data by revealing key relationships and processes. This initial step aims
to generate insights and is helpful for modelling data that vary across space.
Haworth (2018) contends that Exploratory Data Analysis (EDA) focuses on the analysis of
datasets to explore their characteristics and draw inferences. Here, data visualisation and
basic descriptive statistics are the common techniques to generate evidence from empirical
data. According to Cheng and Haworth (2019), the objectives of this initial phase of
data examination include:
• maximising insight into a dataset
• uncovering underlying structure
• extracting important variables
• detecting outliers and anomalies
• testing underlying assumptions.
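Two of the objectives listed above, summarising a dataset and detecting outliers, can be illustrated with a minimal Python sketch. The ten daily NO2 values are hypothetical, and the interquartile-range rule with a multiplier of 1.5 is a common convention assumed here, not a choice documented in the study.

```python
import numpy as np

# Hypothetical daily mean NO2 values in µg/m³ (illustrative only).
daily_no2 = np.array([31.0, 28.5, 35.2, 30.1, 29.8, 33.4, 95.0, 27.9, 32.6, 30.7])


def summarise(x):
    """Basic descriptive statistics of a one-dimensional sample."""
    q1, median, q3 = np.percentile(x, [25, 50, 75])
    return {
        "n": int(x.size),
        "mean": float(x.mean()),
        "median": float(median),
        "q1": float(q1),
        "q3": float(q3),
        "min": float(x.min()),
        "max": float(x.max()),
    }


def iqr_outliers(x, k=1.5):
    """Values lying more than k interquartile ranges outside the quartiles."""
    q1, q3 = np.percentile(x, [25, 75])
    iqr = q3 - q1
    mask = (x < q1 - k * iqr) | (x > q3 + k * iqr)
    return x[mask]


stats = summarise(daily_no2)
outliers = iqr_outliers(daily_no2)
```

In this made-up sample the single very high reading is flagged, which is the kind of anomaly that would then be checked against the station record before modelling.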
Although many of these methods are applicable to non-spatial data, considering the spatial
dimension is essential when the data is geographical (Fotheringham et al., 2000).
Exploratory spatial data analysis (ESDA) is the extension of EDA to spatial data. ESDA
combines tools of EDA with maps and measures of spatial data.
2.3.1 Spatial Dependence and Autocorrelation
Location establishes context. Comparing attributes and distances of objects with those of
other objects in close proximity is powerful to generate insight from the data (De Smith et al.,
2007).
Dependence refers to any statistical relationship between two variables, whereas the
correlation of an attribute with itself is called autocorrelation.
Classical statistical inference usually assumes that the observations under study are
independent. This assumption is usually not applicable to geographical data
since geographic attributes or units are tied together, a phenomenon termed spatial
dependence.
In spatial analysis, the core measure of spatial dependence is Pearson’s Product Moment
Correlation Coefficient (PMCC), presented in equation 1. It forms the basis for many of the
correlation measures used in spatial (and time series) analysis (Haworth, 2018).
$$
r_{XY} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2}\,\sqrt{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}} \tag{1}
$$
where:
n is the sample size
Xi, Yi are the individual sample points indexed with i
X̄ and Ȳ are the sample means of Xi and Yi
PMCC results can be interpreted the following way:
+1 = perfect positive correlation
0 = no correlation
-1 = perfect negative correlation
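The three reference cases above can be reproduced with a short Python sketch that follows equation 1 term by term. The function name `pearson_r` and the small input vectors are hypothetical, chosen only so that the results are easy to verify by hand.

```python
import numpy as np


def pearson_r(x, y):
    """Pearson's product moment correlation, following equation 1 term by term."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()  # deviations of X from its sample mean
    dy = y - y.mean()  # deviations of Y from its sample mean
    return float(np.sum(dx * dy) / (np.sqrt(np.sum(dx ** 2)) * np.sqrt(np.sum(dy ** 2))))


x = [1.0, 2.0, 3.0, 4.0, 5.0]
r_positive = pearson_r(x, [2.0, 4.0, 6.0, 8.0, 10.0])  # Y rises linearly with X
r_negative = pearson_r(x, [10.0, 8.0, 6.0, 4.0, 2.0])  # Y falls linearly with X
r_none = pearson_r(x, [2.0, 1.0, 0.0, 1.0, 2.0])       # symmetric V shape, no linear trend
```

The third case is a reminder that a coefficient of zero only rules out a linear relationship: the V-shaped values clearly depend on X, yet the coefficient does not detect it.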
Spatial autocorrelation describes how an attribute is distributed over space. That is, to what
extent the value of the attribute in one spatial area depends on the values of the attribute in
neighbouring zones (Goodchild, 1986).
To assess the significance of the autocorrelation coefficient, two main strategies are usually
employed (Haworth, 2018):
• Adjacency based measures: applied with spatial weight matrix
• Distance based measures: use distance between locations to define proximity.
The former technique is applied to adjacent spatial areas. The latter is a function of
the distance between observations, typically dealing with point of interest data [6]. Local
measures of autocorrelation and clustering in areal data are usually assessed with local
indicators of spatial association (LISAs). These techniques are not used in the present study
since the sites that monitor pollution are too sparse in the spatial domain (i.e. the inner London
boundary) and spatial aggregation cannot be achieved meaningfully.
2.3.2 Correlation in Temporal Data
Correlation in temporal data is explored based on observations over time frames. A time
series is a set of observations on quantitative variables collected over time. A univariate time
series is a sequence of measurements of the same variable collected over time. Time-series
and spatial data analysis are used separately to examine whether the data is correlated and
stationary in time and space. In conjunction with a spatial component, the series becomes
a space-time series of values for a quantitative variable over time [7].
The autocorrelation function (ACF) for a series gives correlations between the series and
temporally lagged values of the series for different lags. It calculates the correlation of a variable
with a lagged version of itself (i.e. autocorrelation). The ACF is used to identify the
possible structure of time series data. For example, if a time series exhibits significant
autocorrelation then its previous values can be used to predict its future values.
The ACF of the series gives correlations between xt and xt-h for h = 1, 2, 3, and so on (Penn
State University, 2020). The ACF between xt and xt-h equals:
$$ \frac{\mathrm{Covariance}(x_t, x_{t-h})}{\mathrm{Std.Dev.}(x_t)\,\mathrm{Std.Dev.}(x_{t-h})} = \frac{\mathrm{Covariance}(x_t, x_{t-h})}{\mathrm{Variance}(x_t)} \qquad (2) $$
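A minimal Python sketch of the sample ACF in equation 2 is given below. It is an illustration under stated assumptions rather than the routine used in the study: the name `sample_acf` is hypothetical, and the estimator divides the lag-h sum of cross-products by the lag-0 sum of squares, which is the usual sample convention.

```python
def sample_acf(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag (illustrative helper).

    Uses the common estimator: lag-h cross-products of deviations from
    the overall mean, divided by the lag-0 sum of squared deviations.
    """
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(d * d for d in dev)  # proportional to Variance(x_t)
    acf = []
    for h in range(max_lag + 1):
        # Proportional to Covariance(x_t, x_{t-h}).
        ch = sum(dev[t] * dev[t - h] for t in range(h, n))
        acf.append(ch / c0)
    return acf
```

The value at lag 0 is always 1. For a series of daily means, `sample_acf(daily_means, 10)` would return the coefficients for the first ten lags, where `daily_means` is a placeholder for the reader's own list of values.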
[6] In this study, spatial autocorrelation is measured from spatial points formed by monitoring stations. Therefore, the
technique of semivariance modelling is applied.
[7] Time series also violate the assumption of data independence in classical statistics, as more recent information is usually
more useful than less recent information in forecasting.
Most time series are not stationary; that is, they violate the assumption that the mean is the same
for all t (which denotes that the values are independent of time). They may exhibit
seasonal patterns, as for example in transport flows and weather.
A partial correlation function (PACF) is a conditional correlation. It is the correlation
between two variables under the assumption that we know and take into account the values
of some other set of variables. It is a measure that shows how much more information each
additional variable provides (Haworth, 2018).
For example, in a regression problem with y as the response variable and x1, x2, x3 as the
predictor variables, the partial correlation between y and x3 can be calculated as follows:
$$ \frac{\mathrm{Covariance}(y, x_3 \mid x_1, x_2)}{\sqrt{\mathrm{Variance}(y \mid x_1, x_2)\,\mathrm{Variance}(x_3 \mid x_1, x_2)}} \qquad (3) $$
The partial correlation between y and x3 is the correlation between the variables determined
taking into account how both y and x3 are related to x1 and x2 (Penn State University, 2020).
2.4 Geostatistical Interpolation
Given a spatial framework, an area can be modelled as a function of an attribute variable. The
variation of a spatial phenomenon over a continuous geographical scale is sometimes
modelled from spatially dispersed point data, as in the case of urban sensor stations.
Bivand et al. (2008) define geostatistical data as “those that could in principle be measured
anywhere, but that typically come as measurements at a limited number of observation
locations (…)” (p. 191). By extension, geostatistics can be defined as the analysis of spatial
variation of an attribute by means of a function with geostatistical data.
Spatial interpolation aims to create a surface, usually referred to as a grid, from spaced point
data, allowing predictions of a variable at spatial areas based on neighbouring observations
or distances between points.
The modelling approach in geostatistics concerns the analysis of random fields Z(s), with Z
random and s the non-random spatial index. Measurements on Z are available at a limited
number of sample locations, and prediction (interpolation) of Z at non-observed locations s0
is carried out by means of a spatial correlation function (i.e. the semivariance function).
The collection of geostatistical data with a temporal domain also enables the modelling of
temporal variability in conjunction with spatial data, a technique known as spatio-temporal
interpolation [8].
Typical problems where interpolation methodologies are applied include the creation of digital
elevation models, environmental analysis such as air quality or soil pollution, the estimation
of spatial averages from continuous, spatially correlated data, and house prices (Haworth,
2018). Other problems include monitoring network optimization, where observation locations
are to be located or removed.
2.4.1 Spatial Interpolation Methodologies
The common process to apply spatial interpolation is composed of three basic steps:
(1) observations of a phenomenon are recorded at point locations (e.g. monitoring stations);
(2) a grid (a raster layer) is overlaid on the area of interest; and (3) the value of each grid cell is
estimated using some function of the observed points.
There are a number of techniques for creating grids (De Smith et al., 2007). In broad terms,
they can be classified in two main strategies:
1. Deterministic methods: the values at unsampled (grid) points are computed as a
simple linear weighted average of neighboring measured data points within a given
neighborhood under consideration,
2. Probabilistic methods: fit a model to the data. Regionalised variation is determined
by modeling the semivariance and using the fitted function in the interpolation
process.
Table 2.3 presents the main characteristics of the most used methods of spatial interpolation.
[8] This technique is not applied in this study.
Table 2.3
Interpolation methods

| Method | Strategy | Advantages | Disadvantages |
|---|---|---|---|
| Nearest Neighbours | Each point is given the average value of the k nearest points to it. | Distribution free; computationally simple. | Based on the sample data, so some neighbours may be far away; considers all neighbours equally and not based on distance. |
| Distance decay | Applies a mathematical function which is used to weight observations based on their distance from the point to be estimated. Inverse distance is usually applied to decrease similarity with distance. | Computationally efficient. | Sensitive to outliers and sampling configuration (clustered and isolated points). |
| Inverse distance weighting (IDW) | Calculates the value at a point as a weighted sum of surrounding points, where the weight is proportional to the inverse of the distance from the point. | Conceptually simple; easy to apply; not computationally intensive if used efficiently. | Deterministic: based purely on prior assumptions of the distance decay relationship; does not take the spatial distribution of the data into account; cannot be ‘trained’. |
| Kriging | Based on semi-variogram modelling, which describes how variance increases as a function of distance between observations (distance decay). A function is fitted to the semi-variogram and used to weight distances between points. | Fits a model to the data, rather than relying on prior assumptions; uncertainty in the predictions can be quantified (if the assumptions are correct); very flexible framework, as many functions can be used to model the semivariogram. | Computationally intensive for large regions; more complicated than other methods; requires training to be used correctly. |

Source: adapted from Haworth (2018).
In deterministic problems, the estimation process involves the use of a simple linear
expression in order to compute grid values (equation 4):
$$ z_j = \sum_{i=1}^{n} \lambda_i z_i \qquad (4) $$
where zj is the z-value to be estimated for location j, the λi are a set of estimated weights and
the zi are the known (measured) values at points (xi,yi). As zj is a simple weighted average an
additional constraint is required ensuring that the sum of the weights adds up to 1. To tackle
the interpolation problem, we must determine the optimum weights to be used (Bivand et al.,
2008).
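To make the weighted average in equation 4 concrete, the sketch below uses inverse distance weighting, one of the deterministic methods in Table 2.3, and normalises the weights so that they sum to 1. It is an illustrative example only: the function name `idw_estimate` and the default power of 2 are assumptions, not values taken from the study.

```python
import math


def idw_estimate(samples, target, power=2.0):
    """Inverse distance weighted estimate at `target`.

    samples: list of (x, y, z) measured points.
    target:  (x, y) location to estimate.
    power:   distance-decay exponent (assumed value, not from the study).
    """
    raw = []
    for x, y, z in samples:
        d = math.hypot(x - target[0], y - target[1])
        if d == 0.0:
            return z  # target coincides with a measured point
        raw.append((1.0 / d ** power, z))
    total = sum(w for w, _ in raw)
    # Dividing by the total enforces the constraint that weights sum to 1.
    return sum((w / total) * z for w, z in raw)
```

With two stations at equal distance from the target, for instance `idw_estimate([(0, 0, 10), (2, 0, 20)], (1, 0))`, both receive a weight of 0.5 and the estimate is their mean.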
Among the many factors that influence the quality of interpolation, the distribution of
observations plays a major role. Interpolation methods often assume data points are subject
to error. Regularly spaced data may be subject to bias due to intrinsic frequencies in the data,
spacing and directional effects (De Smith et al., 2007).
2.4.2 Kriging
Kriging is a geostatistical method based on statistical models that calculate the relationships
among data from measured points. The model assumes that the distance or direction between
sample points reflects a spatial correlation in the study area. This model is used to interpolate,
or predict, values at unsampled locations, in much the same way as with deterministic
interpolation of continuous spatial phenomena (De Smith et al., 2007). The interpolated
values are modelled by a Gaussian process governed by prior covariances (Chen et al., 2010).
Kriging has several forms, including simple kriging, ordinary kriging (OK) and universal kriging
(table 2.4).
Table 2.4
The main forms of linear Kriging

| Kriging Form | Mean | Drift Model | Prerequisite |
|---|---|---|---|
| Simple Kriging | Known | None | Covariance |
| Ordinary Kriging | Unknown | Constant | Variogram |
| Universal Kriging | Unknown | Function of coordinates | Variogram |
| Kriging with external drift | Unknown | External variable | Variogram |

Source: Chiles and Delfiner, 2012, p. 148.
By analyzing the sample data, it is possible to derive a general model that describes how the
sample values vary with distance and direction (i.e. isotropy and anisotropy). This model may
be then used to interpolate, or predict, values at unsampled locations, in much the same way
as with deterministic interpolation.
De Smith et al. (2007) point to the fact that kriging cannot deal with duplicate observations
(i.e. data that share the same location) because they are perfectly correlated, leading to
singular covariance matrices.
The modelling procedure is based on semivariogram models.
The general formula for predicting values is as follows (Haworth, 2018):
Z(x, y) = m(x, y) + e1(x, y) + e2(x, y) (5)
where:
Z(x, y) is the value to be predicted at location x,y,
m(x,y) is a deterministic model of z at locations x,y. In ordinary Kriging, this is a mean value,
e1(x,y) is the statistical variation from m(x,y). This part is modelled using the semivariogram.
e2(x,y) is a random error component used for residual analysis.
2.4.3 Spatial Variance
In interpolation, spatial variance is modelled by means of the variogram or semivariogram.
Spatial variance refers to the amount of variability in a phenomenon over distance. The
semivariogram plots spatial semivariance as a function of distance.
In standard statistics problems, correlation is any statistical relationship or association
between two random variables. It measures the degree to which a pair of variables are related
and is commonly studied by means of a scatterplot graphic.
With spatial problems, the correlation of variables at locations s1 and s2 cannot be estimated
as only a single pair is available (Bivand et al., 2008). Additionally, we might examine whether the
point data satisfy the stationarity assumption, which relies on the property that the mean,
variance and autocorrelation structure do not change over time.
Spatial analysis packages such as gstat compute the squared differences between all pairs of values
in the dataset, and then allocate these to lag classes based on the distance between the pair.
The procedure computes a set of semivariance values for distance lags, h, increasing in steps
from 0 to a value less than the greatest distance between point pairs. A variogram graph (also
known as an empirical semivariogram) represents this set of values plotted against the separation
distance, h (see an example in figure 2.1).
Figure 2.1
Sample empirical semivariogram
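The procedure described above (squared differences between all pairs, allocated to distance lag classes) can be sketched in a few lines of Python. This is an illustrative example and not the gstat implementation: the function name `empirical_semivariogram`, the fixed lag width and the returned tuple layout are assumptions made for the example.

```python
import math


def empirical_semivariogram(points, lag_width, n_lags):
    """Empirical semivariogram from point data (illustrative helper).

    points: list of (x, y, z) observations.
    Returns one (mean distance, semivariance, pair count) tuple per
    non-empty lag class, for lag classes of width `lag_width`.
    """
    sq_sums = [0.0] * n_lags
    dist_sums = [0.0] * n_lags
    counts = [0] * n_lags
    for i in range(len(points)):
        xi, yi, zi = points[i]
        for j in range(i + 1, len(points)):
            xj, yj, zj = points[j]
            d = math.hypot(xi - xj, yi - yj)
            k = int(d // lag_width)  # index of the lag class
            if k < n_lags:
                sq_sums[k] += (zi - zj) ** 2
                dist_sums[k] += d
                counts[k] += 1
    result = []
    for k in range(n_lags):
        if counts[k]:
            # Semivariance is half the mean squared difference.
            gamma = sq_sums[k] / (2 * counts[k])
            result.append((dist_sums[k] / counts[k], gamma, counts[k]))
    return result
```

Plotting the semivariance against the mean distance of each lag class gives the kind of graph shown in the figure. The double loop visits every pair once, so the cost grows with the square of the number of points.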
Semivariogram modelling allows values at unknown locations to be estimated: the fitted model
provides weights that may be used in an interpolation process using a relatively simple equation (De Smith et
al., 2007). It involves fitting a mathematical function to an empirical semivariogram by
minimizing the sum of squared errors. The goal is to draw a curve through the points that
minimizes the residual error between each point and the model.
There are three parameters to adjust in semivariance modelling: sill, range and nugget (figure
2.2).
Sill (s): is the semivariance value at which the model levels off, i.e. the height of the
plateau [9].
Nugget (n): is a zero or non-zero intercept with the y-axis in the model that has been
fitted.
Range (r): is the approximate distance at which spatial correlation between data point
pairs ceases. At this distance the increase in semivariance levels off and the plateau is reached.
[9] Non-transitive variograms are ones in which the sill is not reached.
Figure 2.2
The semivariance parameters
Source: Esri (2019)
To fit a model to the empirical semivariogram some steps need to be taken (Bivand et al.,
2008).
1. Choose a suitable model (such as exponential or gaussian), with or without nugget.
2. Choose suitable initial values for partial sill(s), range(s), and possibly nugget.
3. Fit this model, using one of the fitting criteria.
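The three steps above can be illustrated with a small Python sketch. It assumes an exponential model of the form nugget + partial sill × (1 − exp(−h / range)) and fits it by an exhaustive search over candidate parameter values, minimising the unweighted sum of squared errors. This is a deliberately simple stand-in: dedicated packages typically use weighted least squares with iterative optimisation, and all function names here are hypothetical.

```python
import math


def exponential_model(h, nugget, partial_sill, range_):
    """Exponential semivariogram model (assumed parameterisation)."""
    if h == 0:
        return 0.0
    return nugget + partial_sill * (1.0 - math.exp(-h / range_))


def sum_squared_errors(params, empirical):
    """Unweighted SSE between the model and (distance, semivariance) pairs."""
    return sum((gamma - exponential_model(h, *params)) ** 2
               for h, gamma in empirical)


def fit_by_grid_search(empirical, nuggets, partial_sills, ranges):
    """Try every combination of candidate values; keep the lowest SSE.

    Returns (sse, (nugget, partial_sill, range)).
    """
    best = None
    for n in nuggets:
        for s in partial_sills:
            for r in ranges:
                err = sum_squared_errors((n, s, r), empirical)
                if best is None or err < best[0]:
                    best = (err, (n, s, r))
    return best
```

The candidate lists play the role of the initial values in step 2. A finer grid, or a proper optimiser started from the best grid point, would be needed for a real fit.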
Chapter 3. Data
Individuals in urban areas are exposed to a large variety of pollutant mixes. Air quality is
affected by pollutants such as nitrogen oxides (NOx), particulate matter (PM), carbon
monoxide (CO) and ground level ozone (O3). These substances interact, react and create
heterogeneous pollutant mixes (Jerret et al., 2005).
At present, urban sensor stations allow different pollutants to be monitored at an increasing
temporal resolution. In this study, the main pollutant analysed is nitrogen dioxide
(NO2), a secondary pollutant formed mainly from nitrogen oxide (NO) with a surface lifetime
of around 1 day. It is normally measured with passive sensors and is in general more spatially
homogeneous than NO, the predominant species in vehicle exhaust (ibid., 2005). Nitrogen
dioxide levels are expressed as the number of micrograms in every cubic metre of air
(µg/m3).
The primary data is obtained from a broad variety of sensors deployed by stakeholders,
including citizen scientists and community advocates, expert researchers, government
agencies, sensor manufacturers and others, and provided by Air Quality Data Commons
(AQDC), an organization which seeks to accelerate solutions to air pollution by standardizing
and sharing air quality data. In brief, a pollutant is sampled at multiple fixed-site monitors
acting as a proxy for “true” human exposure.
Pollution data is stored on an hourly basis with a timestamp value as epoch time (as stored
from the sensors). The epoch time is then used by the downstream applications of the AQDC
online platform. The dataset of pollutant data available for London is composed of monitoring
stations of the London Air Quality Network (LAQN) positioned within the Greater London
Authority boundary and licensed under the terms of the Open Government Licence.
The data was downloaded using the SQL editor for large datasets on the AQDC platform. A
dataset of over 1 million records from 101 stations across London for the period of December
21st 2018 (the start of the winter season) to June 6th 2020 (the last available data prior to the
study) was downloaded in CSV format.
The present study adopts the definition of astronomical seasons which uses the dates of
equinoxes and solstices to mark the beginning and end of the seasons. Monday to Friday are
termed as weekdays, Saturday and Sunday as weekend.
As the density of monitoring stations in outer London falls and becomes less regular, a subset
of sensors data within inner London boundary has been selected (Figure 3.1). This strategy
aims to increase accuracy and reliability of the interpolation technique to be applied in the
study.
The study area lies within the Greater London Authority boundary and comprises a network of
71 fixed monitoring stations covering 123 sq mi (319 km2) with a population
of 3,535,700 inhabitants (ONS, 2011).
Figure 3.1
Map of the 71 monitoring stations inside Inner London Authority Boundary
Chapter 4. Methodology
There are a number of methodologies to investigate human exposure to air pollutants. The
literature provides good examples of techniques such as geostatistical interpolation, land-use
regression models, dispersion models and hybrid models which combine space-time-activity
data, personal measurements and air quality models (Jerret et al., 2005; Elliot et al.,
2000; Steinle et al., 2013). The present study applies exploratory data analysis and an
interpolation technique for estimating pollution at unsampled locations.
Due to the irregular distribution and density of the monitoring stations (i.e. 71 stations within
the inner London boundary) and the spatial variability of the data measurements, a 2D
probabilistic interpolation is the preferred choice of the study. Kriging, a geostatistical
model, is therefore the methodological approach chosen. The method assumes that the data
point values represent a sample of a continuous spatial phenomenon, in this case NO2
pollutant emissions.
4.1 Empirical framework
In order to address the research questions, the study is empirically designed into a few
stages as presented in Figure 4.1.
Figure 4.1
Outlook of the methodological approach
[Flow diagram: 1. Examination of data patterns and characteristics; 2. Modelling of spatial correlation; 3. Spatial interpolation; 4. Model diagnostics and discussions]
Step 1: The first section aims to explore the dataset characteristics and generate hypotheses
by means of graphical methods such as histograms, box-plots and scatterplots.
Exploratory spatial data analysis (ESDA) strategies are applied to examine spatial and
temporal patterns in the dataset. Time-series analysis with metrics such as week, month, season
and time, and spatial analysis are used separately to study the patterns of pollution. This
way, some inferences about the dataset can be raised. The goal is to develop an
understanding of the data by revealing key relationships and processes. The
objectives of the first part of the study include:
• generate insight into the dataset
• uncover underlying structure
• detect outliers
• identify underlying assumptions.
In this section, a data-driven approach is applied by making a list of data components and
then transforming them into graphics. To explore the data, a methodological pathway is
taken. First, the data is tidied, which means storing it in a regular form in accordance with
the semantics of the commands applied. Once the data is tidied, a data transformation process
is applied. Transformation includes narrowing in on observations of interest, creating new
variables where needed and calculating summary statistics. Finally, visualization techniques
are applied to drive knowledge generation.
Step 2: The second stage aims to study the spatial correlation in the dataset through the process of
semivariance modelling. More precisely, the spatial correlation is modelled by means of the
semivariogram (here, the word variogram is used synonymously with semivariogram).
In geostatistics the spatial correlation is modelled by the variogram instead of a
covariogram. The semivariogram is used for spatial interpolation of the observed process
based on point observations, in this case pollution monitoring stations.
This step applies a process commonly known as exploratory variogram analysis, which
means exploring spatial correlation in the data. The experimental semivariogram is
calibrated and then fitted with a suitable model that describes how the sample values
vary with distance. To fit a variogram model to the empirical semivariogram, the following steps
are taken:
1. Study the spatial correlation by means of a semivariogram and verify directional
dependence.
2. Choose a suitable semivariogram model taking into account accuracy measures
and suitable initial values for partial sill(s), range(s), and nugget.
Step 3: This stage explores the spatial prediction of unknown quantities of NO2 based on the
previous step. The model is used to interpolate, or predict, values at unsampled locations
taking into account the form of the pollutant and its variance. The following procedures
are taken:
1. Based on the semivariance model, interpolate sites onto a regular grid of (x, y)
locations.
2. Apply kriging function to compute predictions of value data.
3. Display the results on a grid map.
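The core of the procedure above, predicting a value at one grid location from a semivariance model, can be sketched with ordinary kriging in plain Python. This is a simplified illustration and not the gstat routine used in the study: the function names are hypothetical, the variogram is passed in as any function of distance, and the linear system is solved with a small Gaussian elimination written for the example.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ordinary_kriging(points, target, variogram):
    """Ordinary kriging prediction at `target`.

    points:    list of (x, y, z) observations at distinct locations.
    variogram: function gamma(h) giving semivariance at distance h.
    Returns (prediction, kriging variance, weights).
    """
    n = len(points)
    # Semivariances between all pairs of observations, plus the
    # Lagrange-multiplier row and column enforcing sum(weights) = 1.
    a = [[variogram(math.hypot(p[0] - q[0], p[1] - q[1])) for q in points]
         + [1.0] for p in points]
    a.append([1.0] * n + [0.0])
    b = [variogram(math.hypot(p[0] - target[0], p[1] - target[1]))
         for p in points] + [1.0]
    sol = solve_linear(a, b)
    weights, mu = sol[:n], sol[n]
    prediction = sum(w * p[2] for w, p in zip(weights, points))
    variance = sum(w * g for w, g in zip(weights, b[:n])) + mu
    return prediction, variance, weights
```

Calling `ordinary_kriging` for every cell centre of a regular grid gives the interpolated surface, and the second return value gives the prediction uncertainty at each cell. Duplicate locations make the system singular, which is the limitation noted in section 2.4.2.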
Step 4: The last stage deals with the results of the interpolation process. It provides a discussion of
the results and a comparative review of the different timeframes analysed. It also draws some
conclusions about the impact of Covid-19 on NO2 pollutant data and the implications for
policies to improve air quality in London.
Chapter 5. Results
To perform the study, the dataset was first imported into R. A number of procedures were
carried out to store the data in a consistent form, ensuring it is organized in a
way that matches the semantics of the original dataset while getting it into a format suitable
for up-front work [10].
To explore the data, a methodological pathway has been applied. First, the data was tidied,
which means storing it in a regular form in accordance with the semantics of the commands to be
applied. Initial queries could then be performed by transforming the dataset:
narrowing in on observations of interest, creating new variables and calculating
summary statistics. Finally, visualisation techniques were applied.
In the initial phases, unusual observations (data that do not seem to fit the pattern) were
identified. An asymmetric distribution of NO2 measures was noted and outliers were eliminated
by applying the z-score technique. This technique measures an observation's variability
and identifies outliers based on the number of standard deviations above or below the mean
at which each data point lies.
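The z-score technique can be sketched as follows. The study does not state which cut-off was used, so the threshold of 3 standard deviations below is an assumption, as is the function name `remove_outliers_zscore`.

```python
import statistics


def remove_outliers_zscore(values, threshold=3.0):
    """Drop values lying more than `threshold` standard deviations
    from the mean (threshold is an assumed default, not from the study).
    """
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample standard deviation
    return [v for v in values if abs(v - mean) / sd <= threshold]
```

Because the mean and standard deviation are themselves affected by extreme values, a single pass can miss outliers in small samples; applying the filter per station or per season would be a design choice for the analyst.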
5.1 Data Patterns and Characteristics
An initial check was conducted to verify whether the outlier detection and treatment led to a closer
approximation to normality in the distribution of NO2. Figure 5.1 displays the frequency of
daily mean NO2 by season (for analysis purposes, the period of the Covid-19 lockdown is
treated in the same manner as a season).
It can be noted that the shape of the data dispersion (after the initial transformation) is
close to the normal (i.e. Gaussian) distribution in all seasons.
[10] A procedure usually referred to as data tidying.
Figure 5.1
Frequency of daily mean NO2 by season (in µg/m3)
Following the initial statistics on the dataset, table 5.1 presents a summary of daily NO2 from
December 21st 2018 to June 6th 2020.
Table 5.1
Summary of statistics of daily NO2 (µg/m3) in Inner London Boundary

| Statistic | Value |
|---|---|
| Min. | 20.18 |
| 1st Qu. | 30.94 |
| Median | 35.99 |
| Mean | 36.66 |
| 3rd Qu. | 40.85 |
| Max. | 65.35 |

Note: Data from December 21st 2018 to June 6th 2020
To better study how the distribution diverges from normality, a quantile-quantile (QQ) plot
provides a good way to describe the distribution of the data. The straight red line is the
theoretical normal distribution (if the data were normally distributed, the points would fall on this
line). Figure 5.2 displays the distribution of daily concentration of NO2 in 2019 and during
the enforced lockdown period established on March 23rd, 2020. It can be seen that the
distribution diverges from normal at its upper tail. This suggests that the NO2 recorded at
some stations or in certain periods is much higher than the average. This trend is
more evident in 2019 than during the lockdown. Some hypotheses can be raised:
• Some stations have systematically higher pollutant concentration than others.
• There is a temporal trend in the data: i.e. in certain seasons, on certain days or at certain
times, the pollutants are systematically higher than at other times.
Figure 5.2
Frequency of daily mean NO2 (in µg/m3)
A closer look at the spatial and temporal patterns helps to answer some of these questions.
Finally, figure 5.3 provides an outlook of the daily exposure variation of NO2. The graph
presents the daily mean and separates the data by weekdays and weekends. It also highlights
the beginning of the official lockdown in the UK.
The displayed line is a smoothed local regression and helps to visualize the downward
trend in pollution, in particular during the lockdown period. It also shows that, in general,
weekdays present a higher daily mean of NO2. The plot also suggests seasonality in the data,
from winter and autumn to spring and summer. Covariation analysis would provide a
better clarification of these hypotheses.
Figure 5.3
Outlook of daily concentration of NO2 (in µg/m3)
5.2 Spatial Patterns
To explore how the data vary in space, a matrix of scatterplots showing the relationship between
mean NO2 per station, latitude and longitude is drawn in figure 5.4. The first row shows
longitude on the y-axis, and latitude and mean NO2 on the x-axis in columns 2 and 3
respectively.
From the graph, there is no clear evidence of a relationship between longitude, latitude and
mean NO2. To a certain extent, this
is expected since a geographic reference system (i.e. longitude and latitude) has little effect
on pollution in urban areas.
Figure 5.4
Pairwise scatterplots of GRS and mean NO2
A better analysis is to concentrate on the spatial scale of the study area. Figure
5.5 provides the context of the mean NO2 across the study area. The maps are split by
season and the density index indicates the mean NO2. A partial spectral scale has been applied
and the breaks are based on the distribution of the data.
The maps draw attention to the low density and irregular location of the station network across
London. Some regions in the inner east and south have a low density of monitoring stations. The
plot also confirms seasonality in the data: autumn and winter are the seasons
most affected by pollution. A better picture is provided by the temporal analysis in the next
section.
The lockdown period, on the other hand, displays a better picture. London seems to be positively
affected by the lockdown in terms of urban pollution. This can be explained by better road
traffic conditions. Congestion is characterised by slow-moving stop-and-go traffic and is very
costly in terms of pollution.
Figure 5.5
Network of Monitoring Stations and Mean NO2 by Season.
Local measures of autocorrelation and clustering are usually assessed with spatial adjacency
and spatial weight matrices and local indicators of spatial association (LISAs). These
techniques have not been applied in the study as the network of monitoring sites is too
sparse in the spatial domain (i.e. inner London boundary). Consequently, spatial aggregation
with areal data cannot be sensibly achieved. Autocorrelation in point data is presented in
section 5.5 (Spatial Interpolation).
5.3 Temporal Patterns
To examine how NO2 particles vary in temporal dimensions, it is necessary first to pull out
individual parts from the timestamp value (as stored from the sensor) with arithmetic for date-
time components. Dates and times have been aggregated and classified by season, days of the
week, month and time components.
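As an illustration of pulling date-time components out of an epoch timestamp, the Python sketch below converts a value to a calendar date and derives the hour, month, day of the week and a weekend flag. It assumes the epoch values are in seconds and interpreted as UTC, which is not confirmed by the source; the function name `time_components` and the returned keys are hypothetical.

```python
from datetime import datetime, timezone

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def time_components(epoch_seconds):
    """Split an epoch timestamp (assumed seconds, UTC) into components."""
    dt = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return {
        "date": dt.date().isoformat(),
        "hour": dt.hour,
        "month": dt.month,
        "day_of_week": DAY_NAMES[dt.weekday()],
        # Saturday and Sunday are termed weekend, as defined in Chapter 3.
        "is_weekend": dt.weekday() >= 5,
    }
```

Season labels based on astronomical seasons would be added in the same way, by comparing the date against the equinox and solstice dates of each year.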
The objective is to explain the relationships between variables and to analyse causal
relationships between different time aggregations. Several analyses have been carried out to
spot temporal patterns of NO2 data.
To further explore patterns of mean NO2 by season, figure 5.6 presents a box-plot
of the daily concentration of the pollutant by day of the week and season [11].
The box-plot confirms some patterns. Weekends present lower concentrations of the pollutant
than weekdays. This can be explained by less urban congestion on weekends.
During the lockdown, the same pattern is also observed. In winter and autumn, the seasons
with higher concentrations of pollutants, the distribution is positively skewed, which indicates
higher levels of pollution on some days.
Figure 5.6
Daily concentration of NO2 (in µg/m3) by days of week and season
Figure 5.7 outlines a similar approach to analyse temporal patterns. By means of rectangles,
the tile graph plots the daily mean value of NO2 by month and season.
Visually, it can be noticed that the period ranging from October to December is the most
polluted. The average seasonal variation of NO2 reveals that the period comprising March to
June is the least polluted. Overall, warmer months and seasons present less pollution
components.
[11] The box-plot is a useful descriptive statistic with order-based indicators, such as the median, quartiles and extreme values.
Figure 5.7
Daily concentration of NO2 (in µg/m3) by month in 2019 (top) and by season (bottom)
In addition to seasonal or monthly average emission rates, it is important to capture short-term
variation of concentrations. Intraday intervals can also be computed with hourly resolution
(figure 5.8). The diurnal fluxes of NO2 reveal a homogeneous trend. At times of commuting
to work, roughly between 6:00 and 10:00 am, and when workers head home, fluxes of NO2
rise. Interestingly, this trend is also present in the period of the lockdown. Spring 2019 and
lockdown 2020, which cover coincident months, display a very similar pattern.
Dynamics in journey times are clearly present in all seasons but, in particular, the evening peak
is less pronounced during the lockdown.
Figure 5.8
Intraday concentration of NO2 (in µg/m3) by season and the period of lockdown
Finally, table 5.2 summarises daily NO2 for different time periods.
Table 5.2
Statistics of daily NO2 (µg/m3)

| Period | Min | 1st Qu. | Median | Mean | 3rd Qu. | Max. | Std. Dev. |
|---|---|---|---|---|---|---|---|
| Winter (2018-2019) | 20.18 | 30.84 | 38.27 | 38.45 | 46.04 | 64.44 | 10.25022 |
| Spring (2019) | 20.92 | 29.08 | 33.11 | 33.63 | 36.97 | 52.46 | 6.644549 |
| Summer (2019) | 24.76 | 30.12 | 34.71 | 36.31 | 39.27 | 64.25 | 8.590492 |
| Autumn (2019) | 26.14 | 34.79 | 38.84 | 39.75 | 43.82 | 58.54 | 6.988886 |
| Full year 2019 | 20.18 | 31.16 | 36.29 | 36.98 | 41.22 | 64.44 | 8.41272 |
| SPLA of Covid-19 lockdown (1) | 20.92 | 28.99 | 33.18 | 33.86 | 37.87 | 52.46 | 6.804802 |
| Lockdown 2020 (2) | 19.68 | 28.10 | 30.95 | 32.34 | 35.10 | 53.49 | 6.212479 |

(1) Period in 2019 corresponding to the period of the Covid-19 lockdown in 2020.
(2) Period from March 23rd to June 6th, 2020.
5.3.1 Temporal Autocorrelation
To quantify the extent to which observations of pollution that are close in time are more similar
than distant observations, we investigate temporal autocorrelation in the data. For the length
of the series, we can calculate the autocorrelation between one day (denoted as dt) and the
previous day (dt-1).
Figure 5.9 displays the temporal autocorrelation of NO2 in London averaged for all
stations. The first plot presents the time series of the pollutant and the second a scatter plot with
the daily level on the x-axis and the level of the previous day on the y-axis. The value of the
autocorrelation coefficient is 0.609 (shown on the plot), which demonstrates that the daily mean
of NO2 is positively autocorrelated.
Figure 5.9
Temporal autocorrelation
A temporal aggregation of daily NO2 mean has been applied to the temporal autocorrelation
function (the dependence of one day on the previous days). The ACF of full year (FY) 2019
and the lockdown period are presented in figure 5.10. The blue lines indicate bounds for
statistical significance.
The plot shows nonsignificant autocorrelation from the third lag on for 2019. The ACF for
the lockdown differs from 2019 in two main aspects: 1. the autocorrelation is significant for
a larger number of lags and 2. there is an unstable trend with positive and negative values,
suggesting a kind of seasonality for the days of the week.
Figure 5.10
Temporal autocorrelation function (ACF) for 2019 and lockdown period
To further investigate whether the series has temporal dependency, we can draw lag plots as
an exploratory tool to give a visual impression of the dependence (figure 5.11). The lag plots
for the first 10 lags have been calculated. We can see that there is an increasing scatter with
progressive lags (seen from the clustering around the slopes). A principal point can be drawn
from the plots: the slopes, and therefore the correlations, are clearly positive for 2019 and
unclear for some lags in the lockdown. Lower levels of NO2 seem to have an impact on the temporal
autocorrelation of NO2, which alternates between positive and negative correlations.
Figure 5.11
Temporal autocorrelation function (ACF) lag plots for 2019 and lockdown period in 2020
The same temporal aggregation of daily NO2 mean has been applied to the partial
autocorrelation function (PACF). PACF measures the autocorrelation at a particular lag after
accounting for autocorrelation at all lower lags.
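One common way to obtain the PACF from the ACF is the Durbin-Levinson recursion, sketched below in Python. This is offered as an illustration of the definition above and is not necessarily the algorithm behind the figures in this study; the function names `acf_values` and `pacf_values` are hypothetical.

```python
def acf_values(series, max_lag):
    """Sample autocorrelations for lags 0..max_lag (illustrative helper)."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - h] for t in range(h, n)) / c0
            for h in range(max_lag + 1)]


def pacf_values(series, max_lag):
    """Partial autocorrelations for lags 1..max_lag via Durbin-Levinson."""
    rho = acf_values(series, max_lag)
    prev = []   # coefficients phi_{k-1,1..k-1} from the previous step
    pacf = []
    for k in range(1, max_lag + 1):
        if k == 1:
            phi_kk = rho[1]
            current = [phi_kk]
        else:
            num = rho[k] - sum(prev[j] * rho[k - 1 - j] for j in range(k - 1))
            den = 1.0 - sum(prev[j] * rho[j + 1] for j in range(k - 1))
            phi_kk = num / den
            # Update the lower-order coefficients, then append phi_kk.
            current = [prev[j] - phi_kk * prev[k - 2 - j]
                       for j in range(k - 1)] + [phi_kk]
        pacf.append(phi_kk)
        prev = current
    return pacf
```

The first partial autocorrelation always equals the lag-1 autocorrelation, since there are no lower lags to account for.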
Overall, both time periods present alternating positive and negative values decaying to zero.
Significant correlations are present only at the first lag, followed by correlations that are not
significant (figure 5.12).
Figure 5.12
Partial temporal autocorrelation function (PACF) for 2019 and lockdown period
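The idea of "accounting for autocorrelation at all lower lags" can be made concrete with the Durbin-Levinson recursion, which derives the partial autocorrelations from the ordinary ones. The sketch below is an illustration only: the input series is hypothetical, and the recursion is one standard way of obtaining the PACF, not necessarily the routine used to produce figure 5.12.

```python
def acf(series, max_lag):
    """Sample autocorrelation of `series` for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t + k] for t in range(n - k)) / c0
        for k in range(max_lag + 1)
    ]

def pacf(series, max_lag):
    """Partial autocorrelations for lags 0..max_lag (Durbin-Levinson)."""
    r = acf(series, max_lag)
    pac = [1.0]
    phi_prev = []  # coefficients of the AR(k-1) fit from the previous step
    for k in range(1, max_lag + 1):
        num = r[k] - sum(phi_prev[j] * r[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * r[j + 1] for j in range(k - 1))
        phi_kk = num / den  # partial autocorrelation at lag k
        phi = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)]
        phi.append(phi_kk)
        pac.append(phi_kk)
        phi_prev = phi
    return pac

# Hypothetical daily NO2 means (not thesis data).
daily_no2 = [38, 41, 37, 35, 30, 29, 33, 36, 40, 42, 39, 34, 31, 28, 32, 37]
r = acf(daily_no2, 5)
pac = pacf(daily_no2, 5)
print([round(p, 3) for p in pac])
```

At lag 1 the partial and ordinary autocorrelations coincide; from lag 2 onwards the PACF removes the part of the correlation already explained by shorter lags.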
5.4 Space-time Semivariogram
The space-time semivariogram calculates the semivariance at intervals in space and time. It
aims to examine how semivariance varies with increasing spatial and temporal separations
between observations. Figure 5.13 displays a two-dimensional (2D) spatio-temporal variogram
of NO2 observations, with the x-axis representing the spatial distance lag and the y-axis the
time lags aggregated in days, for two periods: full year 2019 and the lockdown period in 2020.
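A minimal sketch of how such a space-time semivariance surface can be computed is given below: every pair of observations is assigned to a spatial distance band and a temporal lag, and half the mean squared difference is taken per cell. The station coordinates, values, band width and maximum time lag are all hypothetical choices made for illustration.

```python
import math
from collections import defaultdict

def st_semivariogram(obs, space_width, max_time_lag):
    """obs: list of (x, y, day, value).

    Returns {(space_bin, time_lag): semivariance}, where space_bin is the
    index of the distance band and time_lag is the separation in days.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for i in range(len(obs)):
        xi, yi, ti, zi = obs[i]
        for j in range(i + 1, len(obs)):
            xj, yj, tj, zj = obs[j]
            dt = abs(ti - tj)
            if dt > max_time_lag:
                continue
            h = math.hypot(xi - xj, yi - yj)
            key = (int(h // space_width), dt)
            sums[key] += (zi - zj) ** 2
            counts[key] += 1
    return {k: sums[k] / (2 * counts[k]) for k in sums}

# Three hypothetical stations observed on three days (not thesis data).
stations = {(0, 0): [40, 42, 38], (500, 0): [35, 36, 33], (1500, 0): [30, 31, 29]}
obs = [(x, y, day, z) for (x, y), zs in stations.items() for day, z in enumerate(zs)]

gamma = st_semivariogram(obs, space_width=1000, max_time_lag=2)
for key in sorted(gamma):
    print(key, round(gamma[key], 2))
```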
It can be seen that the semivariance increases rapidly with increasing temporal lags in 2019.
The increase with spatial lags is less dramatic. A different pattern is observed for the
lockdown: the increase in semivariance with increasing temporal lag is less intense and less
homogeneous. Overall, the lockdown presents lower semivariance values.
Figure 5.13
2D semivariogram for NO2 in Inner London
FY 2019
Lockdown 2020
A 3D version of the plot is shown in figure 5.14. Again, a less homogeneous pattern with
lower semivariance values can be noted in the lockdown.
Figure 5.14
3D semivariogram for NO2 in Inner London
FY 2019 Lockdown 2020
5.5 Spatial Interpolation
In geostatistics the spatial correlation is modelled by the variogram instead of a correlogram
or covariogram. A large variety of geographic information system (GIS) packages and
geostatistical software are available. Gstat¹² has been selected for the present
study since it offers considerable flexibility in the modelling and display process.
The first step is to examine the data distribution to investigate normality and trends. In kriging
interpolation, a normal distribution is required in order to provide a best unbiased predictor
of values at unsampled points. Where necessary, the values may have to be transformed prior
to analysis. Figure 5.1 (in the first section of the chapter) outlines the frequency of daily mean
NO2 by season, which shows a distribution close to normal in all seasons. Figure
5.2 also displays the distribution of the daily concentration of NO2 in 2019 and during the
enforced lockdown period. It can be seen that in both periods the distribution does not diverge
substantially from the straight line that indicates a normal distribution. Thus, no
transformation needs to be carried out.
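The straight-line check described above corresponds to a normal quantile-quantile (Q-Q) plot. As a rough numerical companion to the visual inspection, the sketch below pairs sorted values with standard normal quantiles and reports their correlation; a value close to 1 suggests the points lie near a straight line. The sample values and the plotting position (i + 0.5) / n are assumptions made for this illustration.

```python
from statistics import NormalDist

def qq_points(values):
    """Pair each sorted value with the matching standard normal quantile."""
    n = len(values)
    ordered = sorted(values)
    normal = NormalDist()  # mean 0, standard deviation 1
    theoretical = [normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, ordered))

def pearson(pairs):
    """Pearson correlation of a list of (x, y) pairs."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical daily NO2 means (not thesis data).
sample = [22, 27, 30, 31, 33, 34, 35, 36, 37, 38, 40, 42, 45, 49]
points = qq_points(sample)
qq_r = pearson(points)
print(round(qq_r, 3))
```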
Prior to modelling the empirical variogram, the spatial correlation is explored with
a variogram cloud. The variogram cloud is obtained by plotting all possible squared
differences of observation pairs against their separation distance. Figure 5.15 depicts the
semivariogram cloud for the full year (FY) 2019 and the lockdown period in 2020. The plots
show a lot of scatter, but some increase in the maximum values can be noted for distances
increasing up to 2,000 m in 2019. Variation in the lockdown is less regular and consistent
than in 2019.
¹² Full information on Gstat is available at https://cran.r-project.org/web/packages/gstat/index.html.
Figure 5.15
Semivariogram cloud plot for NO2 data
FY 2019
Lockdown period 2020
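The construction of a variogram cloud can be sketched in a few lines: every pair of stations contributes one point, with the separation distance on the x-axis and the squared difference of the values on the y-axis. The example below halves the squared difference, following the semivariance convention, and uses four hypothetical stations rather than the actual monitoring network.

```python
import math

def variogram_cloud(stations):
    """stations: list of (x, y, value).

    Returns one (distance, half squared difference) tuple per station pair.
    """
    cloud = []
    for i in range(len(stations)):
        xi, yi, zi = stations[i]
        for j in range(i + 1, len(stations)):
            xj, yj, zj = stations[j]
            h = math.hypot(xi - xj, yi - yj)
            cloud.append((h, 0.5 * (zi - zj) ** 2))
    return cloud

# Hypothetical stations: coordinates in metres, NO2 value (not thesis data).
stations = [(0, 0, 40), (300, 400, 36), (1200, 0, 30), (1200, 900, 33)]
cloud = variogram_cloud(stations)
for h, g in sorted(cloud):
    print(round(h, 1), g)
```

With n stations the cloud holds n(n-1)/2 points, which is why the plots show so much scatter even for a modest network.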
To measure how variance increases as a function of the distance between stations, the
empirical semivariogram has been calculated based on successive distance bands (figure 5.16)
with a cutoff value of 8,000 metres.
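The binning into distance bands can be illustrated as follows. Pairs further apart than the cutoff are ignored, and within each band the semivariance is half the mean squared difference of the paired values. The band width of 1,000 m and the four stations are hypothetical; only the 8,000 m cutoff is taken from the text.

```python
import math

def empirical_semivariogram(stations, width, cutoff):
    """stations: list of (x, y, value).

    Returns (band midpoint, semivariance, number of pairs) for each
    non-empty distance band up to `cutoff`.
    """
    n_bins = math.ceil(cutoff / width)
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for i in range(len(stations)):
        xi, yi, zi = stations[i]
        for j in range(i + 1, len(stations)):
            xj, yj, zj = stations[j]
            h = math.hypot(xi - xj, yi - yj)
            if h > cutoff:
                continue  # pair is beyond the cutoff distance
            b = min(int(h // width), n_bins - 1)
            sums[b] += (zi - zj) ** 2
            counts[b] += 1
    return [
        ((b + 0.5) * width, sums[b] / (2 * counts[b]), counts[b])
        for b in range(n_bins)
        if counts[b] > 0
    ]

# Hypothetical stations: coordinates in metres, NO2 value (not thesis data).
stations = [(0, 0, 40), (300, 400, 36), (1200, 0, 30), (1200, 900, 33)]
result = empirical_semivariogram(stations, width=1000, cutoff=8000)
for midpoint, gamma, n_pairs in result:
    print(midpoint, round(gamma, 3), n_pairs)
```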
Figure 5.16
Empirical semivariogram for NO2 data
FY 2019
Enforced lockdown period 2020
Figure 5.16 indicates that for the full year 2019, the semivariance stops increasing close to
1,000 m. At this range the semivariance reaches a plateau and levels off.
In 2020, during the enforced lockdown period, the semivariance values do not level off. The
lack of a consistent or fixed pattern in the semivariance values indicates a non-transitive
variogram pattern.
To verify whether the variogram has a constant form in all directions, the lag band can be
divided into a series of discrete directional segments and a separate variogram calculated
for each direction. The results are displayed in figure 5.17. In many instances the
variogram is not of constant form in all directions, indicating a property known as anisotropy.
Figure 5.17
Directional variogram
FY 2019
Enforced lockdown period 2020
Here, directions have been classified into four direction intervals (0, 45, 90 and 135 degrees).
The first plot gives the variogram in the zero direction, which is North; 90 degrees is East.
The semivariance does not present a clear directional pattern, which would have indicated
anisotropy. In both periods, the semivariogram is independent of the direction, indicating
uniform semivariance in all orientations. Thus, anisotropy does not need to be modelled and
the process can be assumed to be isotropic. For the semivariance modelling this
means that the point pairs are merged on the basis of distance, not direction.
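The directional classification can be sketched as follows. Each pair of stations is assigned to the direction interval closest to its azimuth (measured clockwise from North and folded to the range 0 to 180 degrees, since a pair has no orientation), and the semivariance is then computed per direction. The tolerance of 22.5 degrees and the station values are assumptions made for this illustration.

```python
import math
from collections import defaultdict

def direction_class(dx, dy, centres=(0, 45, 90, 135), tol=22.5):
    """Return the direction interval (degrees from North) for offset (dx, dy)."""
    azimuth = math.degrees(math.atan2(dx, dy)) % 180.0
    for c in centres:
        diff = abs(azimuth - c)
        diff = min(diff, 180.0 - diff)  # 179 degrees is close to 0 degrees
        if diff <= tol:
            return c
    return None

def directional_semivariance(stations):
    """stations: list of (x, y, value). Returns {direction: semivariance}."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for i in range(len(stations)):
        xi, yi, zi = stations[i]
        for j in range(i + 1, len(stations)):
            xj, yj, zj = stations[j]
            c = direction_class(xj - xi, yj - yi)
            if c is None:
                continue
            sums[c] += (zi - zj) ** 2
            counts[c] += 1
    return {c: sums[c] / (2 * counts[c]) for c in sums}

# Hypothetical stations on a square grid (not thesis data).
stations = [(0, 0, 40), (0, 1000, 38), (1000, 0, 34), (1000, 1000, 36)]
by_direction = directional_semivariance(stations)
print(by_direction)
```

In this toy example the East-West semivariance is much larger than the North-South one, which is the kind of contrast that would point to anisotropy; the thesis data show no such contrast.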
5.5.1 Semivariogram Modelling
To estimate values of NO2 at unknown locations, the observed pattern of variation with
distance must be modelled. The traditional way of finding a suitable variogram model is to
fit a parametric model to the empirical variogram. The model is then used to calculate the
weights in the kriging equations.
In order to provide better insights from the interpolation process, three different
time frames have been applied to the semivariogram modelling: 1. full year 2019; 2. the same
period last year (SPLY) as the enforced lockdown period in 2020; and 3. the enforced lockdown
period of 2020.
To fit the variogram model to the empirical semivariogram several steps have been taken:
1. Model and test different model functions (e.g. exponential or gaussian)
2. Choose suitable initial values for partial sill(s), range(s), and nugget
3. Fit this model, using one of the fitting criteria.
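The three steps above can be sketched for the exponential model. The example assumes the common parameterisation gamma(h) = nugget + psill * (1 - exp(-h / range)), fixes the nugget at zero, scans a list of candidate ranges and, for each, solves for the partial sill in closed form by ordinary least squares. The function names, the candidate ranges and the synthetic semivariogram are hypothetical; this is not the fitting routine of the software used in the study.

```python
import math

def exponential_model(h, psill, rng, nugget=0.0):
    """Exponential variogram model (assumed parameterisation)."""
    return nugget + psill * (1.0 - math.exp(-h / rng))

def fit_exponential(lags, gammas, candidate_ranges):
    """Least-squares fit with zero nugget.

    Returns (psill, range, sum of squared residuals) for the best candidate.
    """
    best = None
    for rng in candidate_ranges:
        f = [1.0 - math.exp(-h / rng) for h in lags]
        # For a fixed range the model is linear in psill, so OLS is closed form.
        psill = sum(g * fi for g, fi in zip(gammas, f)) / sum(fi * fi for fi in f)
        sse = sum((g - psill * fi) ** 2 for g, fi in zip(gammas, f))
        if best is None or sse < best[2]:
            best = (psill, rng, sse)
    return best

# Synthetic empirical semivariogram generated from known parameters.
lags = [250 * k for k in range(1, 17)]
gammas = [exponential_model(h, psill=0.025, rng=500.0) for h in lags]

psill, rng, sse = fit_exponential(lags, gammas, candidate_ranges=range(100, 2001, 50))
print(round(psill, 4), rng, sse)
```

Comparing the sum of squared residuals across model functions gives a simple, if unweighted, criterion for ranking candidate models.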
Table 5.3 presents the fitting accuracy results for each model function applied to the
modelling process. Residuals were calculated using ordinary least squares.
Table 5.3
Fitting accuracy of the semivariance modelling
Model            Full Year 2019   SPLY of Lockdown                  Lockdown
                                  (March 23rd to June 6th, 2019)    (March 23rd to June 6th, 2020)
Circular         3.506397e-09     2.057457e-09                      2.03075e-08
Exponential      5.607275e-09     3.139532e-09                      2.605411e-07
Gaussian         2.374513e-09     1.895494e-09                      1.983014e-08
Linear           2.910855e-09     1.881238e-09                      1.970009e-08
Matern           5.607275e-09     3.139532e-09                      2.605411e-07
Pentaspherical   3.674069e-09     2.288446e-09                      2.112738e-08
Spherical        3.293529e-09     2.172618e-09                      2.072471e-08
From the fitting accuracy results, the exponential model presents the best fitting accuracy.
Figure 5.18 presents the results of the semivariogram modelling.
Figure 5.18
Semivariance with fitted model function
FY 2019 (exponential)
SPLY of Lockdown (exponential)
Lockdown, 2020 (exponential)
Finally, the sill, range and nugget of the fitted model are presented in table 5.4.
Table 5.4
Sill, range and nugget of the fitted semivariogram models
             Full Year 2019   SPLY of Lockdown                  Lockdown
                              (March 23rd to June 6th, 2019)    (March 23rd to June 6th, 2020)
Nugget (n)   0                0                                 0
Range (r)    516.3186         504.0462                          430.1904
Sill (s)     0.02522716       0.02922275                        0.04954327
5.5.2 Spatial Prediction with Kriging
The interpolation model is applied to predict unknown quantities of NO2 based on sample
data and assumptions regarding its variance and spatial correlation.
To fit the kriging model, the semivariogram produced in the previous section has been used. A
matrix of distances between points has been calculated and applied to the variogram function.
The kriging model applied in the exercise is Ordinary Kriging (OK), one of the most
widespread procedures of this type offered in GIS packages.
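To make the role of the distance matrix and the variogram function explicit, the sketch below solves the ordinary kriging system for a single target location in plain Python. The sill and range are taken from the full-year column of table 5.4 and an exponential model form is assumed; the station coordinates, the values and the small linear solver are hypothetical and for illustration only.

```python
import math

def exp_variogram(h, psill=0.02522716, rng=516.3186, nugget=0.0):
    """Exponential variogram (assumed form); gamma(0) is 0 by definition."""
    if h == 0:
        return 0.0
    return nugget + psill * (1.0 - math.exp(-h / rng))

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def ordinary_kriging(points, values, target, variogram):
    """Return (prediction, weights) at `target` using ordinary kriging."""
    n = len(points)
    dist = lambda p, q: math.hypot(p[0] - q[0], p[1] - q[1])
    # Variogram matrix bordered by ones (unbiasedness constraint).
    a = [[variogram(dist(points[i], points[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [variogram(dist(p, target)) for p in points] + [1.0]
    solution = solve(a, b)
    weights = solution[:n]  # the last entry is the Lagrange multiplier
    return sum(w * z for w, z in zip(weights, values)), weights

# Hypothetical stations (metres) and NO2 values (not thesis data).
points = [(0, 0), (500, 0), (0, 800), (900, 900)]
values = [40.0, 36.0, 31.0, 34.0]

prediction, weights = ordinary_kriging(points, values, (400, 400), exp_variogram)
at_station, _ = ordinary_kriging(points, values, points[0], exp_variogram)
print(round(prediction, 2), [round(w, 3) for w in weights])
```

The weights sum to one, and with a zero nugget the predictor reproduces the observed value at a sampled location.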
Figure 5.19
Spatial Prediction with Kriging
FY 2019
SPLY of Lockdown
Lockdown, 2020
From figure 5.19 we can initially point out that the NO2 interpolation maps for the full year
2019 and for the same period as the lockdown in 2019 are comparatively close. Overall, they
share the same layout of areas with higher and lower pollution levels. Interestingly, during
the enforced lockdown in 2020, the picture is quite different in the central and east zones. The
sharp fall in commuting to work seems to have had a positive effect on air quality
over these areas.
Chapter 6. Analysis and Discussion
In this research, we have applied a geostatistical approach to study NO2 pollutant
concentration in inner London during the Covid-19 lockdown and to compare the results with
pre-lockdown periods. The study demonstrates that coupling pollution data analysis and spatial
modelling can provide insightful information and improve the understanding of the impact of
Covid-19 on NO2 emissions.
There are a number of important lessons to be learned from NO2 emissions during the
Covid-19 lockdown in London. Firstly, the spatial interpolation with kriging demonstrates a
common pattern of NO2 levels in the full year 2019 and in the same period as the lockdown in
2019 (figure 5.19).
The zone comprising the London Congestion Charge Zone in central London and the inner west
and south share the same pattern, standing out as more polluted areas. The lockdown has
substantially changed this picture, with an area of lower levels of pollution encircling
central London. In contrast, the north and some inner east areas change pattern, becoming zones
with higher levels of pollution. In all scenarios, the Ultra Low Emission Zone (ULEZ), which
aims to help improve air quality in central London, persists as a zone of higher levels of
pollution. It can also be inferred that monitoring stations in this area have systematically
higher pollutant concentrations than others (see figure 5.5).
Note further that, prior to the interpolation process with kriging, the empirical semivariance
for 2019 and the lockdown period indicates uniform semivariance in all orientations.
Thus, the semivariance modelling assumes an isotropic pattern for both periods.
The study also demonstrates that NO2 levels exhibit temporal patterns. Weekends present
lower concentrations of NO2 than weekdays. This pattern is also present in the
lockdown period. It can be explained by less road congestion on weekends.
Short-term variations are also present in the data. Dynamics in journey times are clearly
present in all periods. Intraday intervals computed at hourly resolution demonstrate that
diurnal fluxes of NO2 follow a homogeneous trend. Between 6:00 and 10:00 am, fluxes of NO2
rise. Spring 2019 and the lockdown, which cover coincident months, display a very similar
pattern. Evening peaks are less pronounced during the lockdown, which indicates less traffic
congestion resulting from the enforced intervention on mobility.
For the pre-lockdown period, the months from October to December 2019 are the most
polluted and the interval from March to June presents lower NO2 emissions. This suggests that,
in general, warmer months display lower NO2 levels in London.
Clearly, London has been positively affected by the lockdown in terms of urban pollution. NO2
levels in inner London fell from a daily mean of 37 µg/m3 in 2019 to 32 µg/m3 for the period of
March 23rd to June 6th, 2020. Similarly, if we compare the same periods, the data also
demonstrate a decrease in pollution, since the same period as the lockdown in 2019 displays
34 µg/m3. All seasons in 2019 present a higher concentration of NO2: 38 µg/m3 in winter,
34 µg/m3 in spring, 36 µg/m3 in summer and 40 µg/m3 in autumn.
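The size of the reduction can be expressed in relative terms with simple arithmetic. The short sketch below uses only the rounded daily means quoted above, so the resulting percentages are approximate; the function and variable names are illustrative.

```python
def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0

# Rounded daily means of NO2 in ug/m3, as quoted in the text.
fy_2019 = 37.0
same_period_2019 = 34.0
lockdown_2020 = 32.0

vs_full_year = percent_change(fy_2019, lockdown_2020)
vs_same_period = percent_change(same_period_2019, lockdown_2020)
print(round(vs_full_year, 1), round(vs_same_period, 1))
```

On these rounded figures, the lockdown mean is roughly 13.5% below the full-year 2019 mean and roughly 5.9% below the mean for the same period in 2019.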
The results also demonstrate that the overall values of NO2 meet both the UK¹³ and the World
Health Organisation (WHO) air quality guideline limits, which state that NO2 levels above
40 µg/m3 are harmful to people. Nevertheless, days exceeding this limit were also
recorded, with daily levels up to 64 µg/m3 in winter and summer 2019. The Covid-19 lockdown
period also exceeded that limit, with levels up to 53 µg/m3. For this reason, the air pollution
governance scheme of the Mayor of London recognises that London's environment is improving, but
that it still faces major challenges (Mayor of London, 2018).
The empirical results offered by the present study should serve future research, but
limitations must also be noted.
Firstly, the results should be interpreted with caution because of the limited time period
analysed. Future studies would benefit from longer time series in order to reinforce the
analysis.
Secondly, while the network of location-based monitoring stations provides a straightforward
application for the analysis of urban NO2 exposure, it has considerable limitations. Such
fixed-location data estimates ignore the impact of individual mobility patterns. Detailed
individual-level activity data from GPS-enabled monitoring devices would substantially
improve the accuracy of the exposure data and ultimately enhance the analysis of
environmental pollution in London.
Finally, to improve reliability and accuracy of the interpolation process, a more regular
distribution of sensors would be necessary. The network of monitoring stations presents a low
density and an irregular spatial distribution. Spatial correlation analysis would benefit from
a better spatial arrangement of sensors.
¹³ The legislation for air quality objectives was defined by the UK Air Quality Strategy (AQS) for England, Scotland,
Wales and Northern Ireland in January 2000. The UK Government's strategy for air quality management requires all local
authorities to carry out regular assessments of air quality in their areas.
Chapter 7. Recommendations for Further Studies
Looking forward, many opportunities can be explored to provide important insights into the
impact of the Covid-19 pandemic on pollutant emissions in London.
A good research strategy would be to refine the air pollution modelling with other sources of
data and gain a more detailed representation. Air pollution is not associated only with
anthropogenic systems. Meteorological conditions such as wind speed and direction, ambient
temperature, solar radiation and atmospheric stability matter too and would provide a more
accurate measurement of pollutant exposure. Coupling emission inventories and dispersion
modelling can provide valuable insights for pollution monitoring studies.
Another possibility is to apply regression kriging, combining values of pollutants with
additional variables such as distance to roadside locations. The relationship between the
dependent variable and the independent variables could then be modelled using a linear
regression model. Additionally, ambient measurements could also focus on components directly
related to vehicle exhaust, such as black carbon particulate matter (PM) components.
Predictive analysis based on spatio-temporal statistical modelling would also be of great
interest. The main advantage of this technique is that observations taken at other times can be
included in the spatio-temporal interpolation, increasing prediction accuracy. Extending
spatial statistics to include the time dimension implies modelling variability in space and
time, and thus demands greater statistical sophistication than purely spatial or purely
temporal modelling.
Finally, an interesting research path would be to forecast post-Covid-19 scenarios
for ambient air in the city. This would certainly provide important insights for ambient air
policy and governance in London.
References
Beevers, S.D., Kitwiroon, N., Williams, M.L., Kelly, F.J., Anderson, H.R. and Carslaw, D.C., 2013.
Air pollution dispersion models for human exposure predictions in London. Journal of exposure
science & environmental epidemiology, 23(6), pp.647-653.
Bivand, R.S., Pebesma, E.J. and Gómez-Rubio, V., 2008. Applied spatial data
analysis with R (pp. 237-268). New York: Springer.
Burrough, P.A., McDonnell, R.A. and Lloyd, C.D., 2015. Principles of geographical
information systems. Oxford University Press.
Burrough, P.A. and McDonnell, R.A., 1998. Creating continuous surfaces from point data. Principles
of Geographic Information Systems. Oxford University Press, Oxford, UK.
Chen, D., Ou, T., Gong, L., Xu, C.Y., Li, W., Ho, C.H. and Qian, W., 2010. Spatial interpolation of
daily precipitation in China: 1951–2005. Advances in Atmospheric Sciences, 27(6), pp.1221-1232.
Cheng, T, and J. Haworth. 2019. “Spatio-Temporal Data Analysis and Big Data Mining.” University
College London.
Chiles, J.-P., Delfiner, P., 2012. Geostatistics: modeling spatial uncertainty, 2nd ed. ed, Wiley series
in probability and statistics. Wiley, Hoboken, N.J.
De Smith, M.J., Goodchild, M.F. and Longley, P., 2007. Geospatial analysis: a comprehensive guide
to principles, techniques and software tools. Troubador Publishing Ltd.
Elliot, P., Wakefield, J.C., Best, N.G. and Briggs, D.J., 2000. Spatial epidemiology: methods and
applications. Oxford University Press.
Esri (2019), Geostatistics and the Semivariogram. Available from:
https://storymaps.arcgis.com/stories/5ad922df3a724149ab55d8054bec4970 [Accessed 16th
August 2020].
Fotheringham, A. S., Brunsdon, C., & Charlton, M. (2000). Quantitative geography: perspectives on
spatial data analysis. Sage.
Gualtieri, G. and Tartaglia, M., 1998. Predicting urban traffic air pollution: a GIS
framework. Transportation Research Part D: Transport and Environment, 3(5), pp.329-336.
Goodchild, M.S., 1986. Autocorrelation: Concepts and Techniques in Modern Geography. Norwich,
UK: Geo Books.
Haworth, J. (2018). Spatial Analysis and GeoComputation Lecture, Lecture 5: Spatial Interpolation.
Haworth, J. (2018). Spatial Analysis and GeoComputation Lecture, Lecture 2: Exploratory (Spatial)
Data Analysis.
Haworth, J. (2018). Spatial Analysis and GeoComputation: A tutorial guide. Unpublished manuscript.
Jerrett, M., Arain, A., Kanaroglou, P., Beckerman, B., Potoglou, D., Sahsuvaroglu, T., Morrison, J.
and Giovis, C., 2005. A review and evaluation of intraurban air pollution exposure models. Journal of
Exposure Science & Environmental Epidemiology, 15(2), pp.185-204.
Mayor of London (2018). London Environment Strategy: implementation plan. Mayor of London, UK.
Available from: https://www.london.gov.uk/sites/default/files/implementation_plan.pdf [Accessed 28th
August 2020].
ONS (2012). Census- Population and household estimates for England and Wales, March 2011.
Statistical Bulletin, Office for National Statistics, UK. Available at
http://www.ons.gov.uk/ons/dcp171778_270487.pdf
Oxley, T., Valiantis, M., Elshkaki, A. and ApSimon, H.M., 2009. Background, road and urban
transport modelling of air quality limit values (the BRUTAL model). Environmental Modelling &
Software, 24(9), pp.1036-1050.
Pebesma, E. and Heuvelink, G., 2016. Spatio-temporal interpolation using gstat. The R Journal, 8(1),
pp.204-218.
Penn State University (2020), Applied Time Series Analysis. Available from:
https://online.stat.psu.edu/stat510/ [Accessed 21st August 2020].
The Economist, 2020. Air pollution is returning to pre-Covid levels. Available from:
https://www.economist.com/graphic-detail/2020/09/05/air-pollution-is-returning-to-pre-Covid-levels
[Accessed 18th July 2020].
The Guardian, 2020. Coronavirus detected on particles of air pollution. Available from:
https://www.theguardian.com/environment/2020/apr/24/coronavirus-detected-particles-air-pollution
[Accessed 12th September 2020].
Singleton, Spielman, A., and D. Folch. 2018. Urban Analytics. Spatial Analytics; GIS Series.
Steinle, S., Reis, S. and Sabel, C.E., 2013. Quantifying human exposure to air pollution—Moving
from static monitoring to spatio-temporally resolved personal exposure assessment. Science of the
Total Environment, 443, pp.184-193.

More Related Content

Similar to Impacts of covid 19 on urban air pollution in london

Go ions v2_021312
Go ions v2_021312Go ions v2_021312
Go ions v2_021312
Femi Prince
 
TineGeldof_Thesis fysica
TineGeldof_Thesis fysicaTineGeldof_Thesis fysica
TineGeldof_Thesis fysica
Tine Geldof
 
Auralization Methodology for External and Internal Truck Sounds
Auralization Methodology for External and Internal Truck SoundsAuralization Methodology for External and Internal Truck Sounds
Auralization Methodology for External and Internal Truck Sounds
Wiktor Eriksson
 
masteroppgave_larsbrusletto
masteroppgave_larsbruslettomasteroppgave_larsbrusletto
masteroppgave_larsbrusletto
Lars Brusletto
 
Masters Thesis - Exploration Phase_Deepwater Reservoir Data Integration
Masters Thesis - Exploration Phase_Deepwater Reservoir Data IntegrationMasters Thesis - Exploration Phase_Deepwater Reservoir Data Integration
Masters Thesis - Exploration Phase_Deepwater Reservoir Data Integration
Alan Mössinger
 
Thesis - Exploration Phase: Deepwater Carbonate Reservoir Data Integration fo...
Thesis - Exploration Phase: Deepwater Carbonate Reservoir Data Integration fo...Thesis - Exploration Phase: Deepwater Carbonate Reservoir Data Integration fo...
Thesis - Exploration Phase: Deepwater Carbonate Reservoir Data Integration fo...
Alan Mössinger
 
Carrea - Shake-Table Test on a Full-Scale Bridge Reinforced Concrete Column -...
Carrea - Shake-Table Test on a Full-Scale Bridge Reinforced Concrete Column -...Carrea - Shake-Table Test on a Full-Scale Bridge Reinforced Concrete Column -...
Carrea - Shake-Table Test on a Full-Scale Bridge Reinforced Concrete Column -...
Francesco Carrea
 
Final_project_watermarked
Final_project_watermarkedFinal_project_watermarked
Final_project_watermarked
Norbert Naskov
 
Manuscrit de Doctorat_El Abdellaouy Hanane
Manuscrit de Doctorat_El Abdellaouy HananeManuscrit de Doctorat_El Abdellaouy Hanane
Manuscrit de Doctorat_El Abdellaouy Hanane
Elabdellaouy Hanane
 

Similar to Impacts of covid 19 on urban air pollution in london (20)

Go ions v2_021312
Go ions v2_021312Go ions v2_021312
Go ions v2_021312
 
TineGeldof_Thesis fysica
TineGeldof_Thesis fysicaTineGeldof_Thesis fysica
TineGeldof_Thesis fysica
 
Netland thesis
Netland thesisNetland thesis
Netland thesis
 
Auralization Methodology for External and Internal Truck Sounds
Auralization Methodology for External and Internal Truck SoundsAuralization Methodology for External and Internal Truck Sounds
Auralization Methodology for External and Internal Truck Sounds
 
Fulltext01
Fulltext01Fulltext01
Fulltext01
 
masteroppgave_larsbrusletto
masteroppgave_larsbruslettomasteroppgave_larsbrusletto
masteroppgave_larsbrusletto
 
Moukalled et-al-fvm-open foam-matlab
Moukalled et-al-fvm-open foam-matlabMoukalled et-al-fvm-open foam-matlab
Moukalled et-al-fvm-open foam-matlab
 
MyThesis
MyThesisMyThesis
MyThesis
 
Masters Thesis - Exploration Phase_Deepwater Reservoir Data Integration
Masters Thesis - Exploration Phase_Deepwater Reservoir Data IntegrationMasters Thesis - Exploration Phase_Deepwater Reservoir Data Integration
Masters Thesis - Exploration Phase_Deepwater Reservoir Data Integration
 
Thesis - Exploration Phase: Deepwater Carbonate Reservoir Data Integration fo...
Thesis - Exploration Phase: Deepwater Carbonate Reservoir Data Integration fo...Thesis - Exploration Phase: Deepwater Carbonate Reservoir Data Integration fo...
Thesis - Exploration Phase: Deepwater Carbonate Reservoir Data Integration fo...
 
Carrea - Shake-Table Test on a Full-Scale Bridge Reinforced Concrete Column -...
Carrea - Shake-Table Test on a Full-Scale Bridge Reinforced Concrete Column -...Carrea - Shake-Table Test on a Full-Scale Bridge Reinforced Concrete Column -...
Carrea - Shake-Table Test on a Full-Scale Bridge Reinforced Concrete Column -...
 
Master Thesis
Master ThesisMaster Thesis
Master Thesis
 
thesis
thesisthesis
thesis
 
Final_project_watermarked
Final_project_watermarkedFinal_project_watermarked
Final_project_watermarked
 
thesis
thesisthesis
thesis
 
thesis
thesisthesis
thesis
 
Global Illumination Techniquesfor the Computation of High Quality Images in G...
Global Illumination Techniquesfor the Computation of High Quality Images in G...Global Illumination Techniquesfor the Computation of High Quality Images in G...
Global Illumination Techniquesfor the Computation of High Quality Images in G...
 
Manuscrit de Doctorat_El Abdellaouy Hanane
Manuscrit de Doctorat_El Abdellaouy HananeManuscrit de Doctorat_El Abdellaouy Hanane
Manuscrit de Doctorat_El Abdellaouy Hanane
 
Master thesis
Master thesisMaster thesis
Master thesis
 
Final_Report_Tano_Retamales
Final_Report_Tano_RetamalesFinal_Report_Tano_Retamales
Final_Report_Tano_Retamales
 

Recently uploaded

Abortion pills in Doha {{ QATAR }} +966572737505) Get Cytotec
Abortion pills in Doha {{ QATAR }} +966572737505) Get CytotecAbortion pills in Doha {{ QATAR }} +966572737505) Get Cytotec
Abortion pills in Doha {{ QATAR }} +966572737505) Get Cytotec
Abortion pills in Riyadh +966572737505 get cytotec
 
Reconciling Conflicting Data Curation Actions: Transparency Through Argument...
Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...
Reconciling Conflicting Data Curation Actions: Transparency Through Argument...
Bertram Ludäscher
 
obat aborsi Tarakan wa 081336238223 jual obat aborsi cytotec asli di Tarakan9...
obat aborsi Tarakan wa 081336238223 jual obat aborsi cytotec asli di Tarakan9...obat aborsi Tarakan wa 081336238223 jual obat aborsi cytotec asli di Tarakan9...
obat aborsi Tarakan wa 081336238223 jual obat aborsi cytotec asli di Tarakan9...
yulianti213969
 
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Klinik kandungan
 
Client Researchhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh.pptx
Client Researchhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh.pptxClient Researchhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh.pptx
Client Researchhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh.pptx
Stephen266013
 
Abortion Clinic in Kempton Park +27791653574 WhatsApp Abortion Clinic Service...
Abortion Clinic in Kempton Park +27791653574 WhatsApp Abortion Clinic Service...Abortion Clinic in Kempton Park +27791653574 WhatsApp Abortion Clinic Service...
Abortion Clinic in Kempton Park +27791653574 WhatsApp Abortion Clinic Service...
mikehavy0
 
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
wsppdmt
 
Abortion pills in Riyadh Saudi Arabia| +966572737505 | Get Cytotec, Unwanted Kit
Abortion pills in Riyadh Saudi Arabia| +966572737505 | Get Cytotec, Unwanted KitAbortion pills in Riyadh Saudi Arabia| +966572737505 | Get Cytotec, Unwanted Kit
Abortion pills in Riyadh Saudi Arabia| +966572737505 | Get Cytotec, Unwanted Kit
Abortion pills in Riyadh +966572737505 get cytotec
 

Recently uploaded (20)

Abortion pills in Doha {{ QATAR }} +966572737505) Get Cytotec
Abortion pills in Doha {{ QATAR }} +966572737505) Get CytotecAbortion pills in Doha {{ QATAR }} +966572737505) Get Cytotec
Abortion pills in Doha {{ QATAR }} +966572737505) Get Cytotec
 
Digital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham WareDigital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham Ware
 
Reconciling Conflicting Data Curation Actions: Transparency Through Argument...
Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...
Reconciling Conflicting Data Curation Actions: Transparency Through Argument...
 
obat aborsi Tarakan wa 081336238223 jual obat aborsi cytotec asli di Tarakan9...
obat aborsi Tarakan wa 081336238223 jual obat aborsi cytotec asli di Tarakan9...obat aborsi Tarakan wa 081336238223 jual obat aborsi cytotec asli di Tarakan9...
obat aborsi Tarakan wa 081336238223 jual obat aborsi cytotec asli di Tarakan9...
 
Solution manual for managerial accounting 8th edition by john wild ken shaw b...
Solution manual for managerial accounting 8th edition by john wild ken shaw b...Solution manual for managerial accounting 8th edition by john wild ken shaw b...
Solution manual for managerial accounting 8th edition by john wild ken shaw b...
 
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
 
Las implicancias del memorándum de entendimiento entre Codelco y SQM según la...
Las implicancias del memorándum de entendimiento entre Codelco y SQM según la...Las implicancias del memorándum de entendimiento entre Codelco y SQM según la...
Las implicancias del memorándum de entendimiento entre Codelco y SQM según la...
 
Statistics Informed Decisions Using Data 5th edition by Michael Sullivan solu...
Statistics Informed Decisions Using Data 5th edition by Michael Sullivan solu...Statistics Informed Decisions Using Data 5th edition by Michael Sullivan solu...
Statistics Informed Decisions Using Data 5th edition by Michael Sullivan solu...
 
Identify Rules that Predict Patient’s Heart Disease - An Application of Decis...
Identify Rules that Predict Patient’s Heart Disease - An Application of Decis...Identify Rules that Predict Patient’s Heart Disease - An Application of Decis...
Identify Rules that Predict Patient’s Heart Disease - An Application of Decis...
 
Client Researchhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh.pptx
Client Researchhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh.pptxClient Researchhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh.pptx
Client Researchhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh.pptx
 

Impacts of Covid-19 on Urban Air Pollution in London

  • 1. 1 IMPACTS OF COVID-19 ON URBAN AIR POLLUTION IN LONDON LAURENT JOSE LACAZE SANTOS Supervisor: Dr. JAMES HAWORTH Department of Civil, Environmental & Geomatic Engineering University College London | UCL Year of Submission: 2020 Submitted in partial fulfilment of the requirements for the degree of MSc in Geospatial Sciences, Geographic Information Science and Computing September, 2020
  • 2. 2 Abstract The development of models to assess air pollution exposure is a major topic for sustainable development in urban areas. This study aims to compare emissions of nitrogen dioxide (NO2) particles in the pre-Covid-19 period and during the nationwide Covid-19 lockdown in inner London. The inventory of pollution data is provided by a network of monitoring stations used to assess and monitor pollutant particles. The study applies exploratory data analysis and spatial kriging interpolation to estimate pollution at unsampled locations and provide insightful information for pollution control policy. It demonstrates that coupling pollution data analysis with predictive spatial modelling can improve the understanding of the impact of Covid-19 on NO2 emissions. Important conclusions are drawn. Firstly, the lockdown has substantially changed the spatial concentration of pollutants in the study area. Secondly, the analysis demonstrates that NO2 levels exhibit temporal patterns during the lockdown. The study concludes that air quality in London has been positively affected by the lockdown. Looking forward, opportunities for future research are presented.
  • 3. 3 Acknowledgments I am grateful to the many people who, in diverse ways, have been involved in the completion of this dissertation. In particular, I would like to express my gratitude to my supervisor, James Haworth, who provided important advice throughout the study and introduced and equipped me with the techniques of spatial-temporal data analysis and big data analytics. I must also thank Tao Cheng, professor in geoinformatics at UCL, for all the skills and ongoing academic mentorship in the field of spatial analysis and geocomputation, and Mohamed Ibrahim, PhD researcher at SpaceTimeLab, for all his support and debugging with R programming. I am also grateful to all professors of the Department of Civil, Environmental and Geomatic Engineering (CEGE) who have been generous with their time and expertise. This dissertation is dedicated to my wife Alessandra, my children Nicole and Thomas, to my father, Jose Roberto, who was always anxious for updates, to my mother Renee, and my brother Francois.
  • 4. 4 CONTENTS
    ABSTRACT 2
    ACKNOWLEDGMENTS 3
    LIST OF FIGURES 5
    LIST OF TABLES 6
    ABBREVIATIONS 7
    CHAPTER 1. BACKGROUND AND OBJECTIVES 8
      1.1 AIM AND RESEARCH QUESTIONS 8
    CHAPTER 2. LITERATURE REVIEW 10
      2.1 SPATIAL AIR POLLUTION MODELLING 10
      2.2 AIR POLLUTION EXPOSURE MONITORING IN GREATER LONDON 13
      2.3 EXPLORATORY SPATIAL DATA ANALYSIS 15
        2.3.1 Spatial Dependence and Autocorrelation 16
        2.3.2 Correlation in Temporal Data 17
      2.4 GEOSTATISTICAL INTERPOLATION 18
        2.4.1 Spatial Interpolation Methodologies 19
        2.4.2 Kriging 21
        2.4.3 Spatial Variance 22
    CHAPTER 3. DATA 25
    CHAPTER 4. METHODOLOGY 27
      4.1 EMPIRICAL FRAMEWORK 27
    CHAPTER 5. RESULTS 30
      5.1 DATA PATTERNS AND CHARACTERISTICS 30
      5.2 SPATIAL PATTERNS 33
      5.3 TEMPORAL PATTERNS 35
        5.3.1 Temporal Autocorrelation 39
      5.4 SPACE-TIME SEMIVARIOGRAM 41
      5.5 SPATIAL INTERPOLATION 43
        5.5.1 Semivariogram Modelling 47
        5.5.2 Spatial Prediction with Kriging 49
    CHAPTER 6. ANALYSIS AND DISCUSSION 52
    CHAPTER 7. RECOMMENDATIONS FOR FURTHER STUDIES 55
    REFERENCES 56
  • 5. 5 List of Figures
    Figure 2.1 Sample empirical semivariogram 23
    Figure 2.2 The semivariance parameters 24
    Figure 3.1 Map of the 71 monitoring stations inside Inner London Authority Boundary 26
    Figure 4.1 Outlook of the methodological approach 27
    Figure 5.1 Frequency of daily mean NO2 by season (in µg/m³) 31
    Figure 5.2 Frequency of daily mean NO2 (in µg/m³) 32
    Figure 5.3 Outlook of daily concentration of NO2 (in µg/m³) 33
    Figure 5.4 Pairwise Scatterplots of GRS and Mean NO2 34
    Figure 5.5 Network of Monitoring Stations and Mean NO2 by Season 35
    Figure 5.6 Daily concentration of NO2 (in µg/m³) by days of week and season 36
    Figure 5.7 Daily concentration of NO2 (in µg/m³) by month in 2019 (top) and by season (bottom) 37
    Figure 5.8 Intraday concentration of NO2 (in µg/m³) by season and the period of lockdown 38
    Figure 5.9 Temporal autocorrelation 39
    Figure 5.10 Temporal autocorrelation function (ACF) for 2019 and lockdown period 40
    Figure 5.11 Temporal autocorrelation function (ACF) lag plots for 2019 and lockdown period in 2020 40
    Figure 5.12 Partial temporal autocorrelation function (PACF) for 2019 and lockdown period 41
    Figure 5.13 2D semivariogram for NO2 in Inner London 42
    Figure 5.14 3D semivariogram for NO2 in Inner London 43
    Figure 5.15 Semivariogram cloud plot for NO2 data 44
    Figure 5.16 Empirical semivariogram for NO2 data 45
    Figure 5.17 Directional variogram 46
    Figure 5.18 Semivariance with fitted model function 48
    Figure 5.19 Spatial Prediction with Kriging 49
  • 6. 6 List of Tables
    Table 2.1 Urban Air Pollution modelling methodologies 11
    Table 2.2 The Environment Research Group Urban Pollution Modelling Methods 14
    Table 2.3 Interpolation methods 20
    Table 2.4 The main forms of linear Kriging 21
    Table 5.1 Summary of statistics of daily NO2 (µg/m³) in Inner London Boundary 31
    Table 5.2 Statistics of daily NO2 (µg/m³) 38
    Table 5.3 Fitting accuracy of the semivariance modelling 47
    Table 5.4 Fitting accuracy of the semivariance modelling 49
  • 7. 7 Abbreviations
    ACDC: Air Quality Data Commons
    ACF: Autocorrelation function
    ADMS: Atmospheric Dispersion Modelling System
    AIO: Area of interest
    BLUE: Best linear unbiased estimate
    CMAQ: Community Multi-scale Air Quality Model
    CREA: Centre for Research on Energy and Clean Air
    EDA: Exploratory data analysis
    ERG: The Environmental Research Group
    ESDA: Exploratory spatial data analysis
    ESTDA: Exploratory spatio-temporal data analysis
    EU: European Union
    GIS: Geographic information systems
    GLA: Greater London Authority
    GRS: Geographic reference system
    IDW: Inverse distance weighting
    KCL: King’s College London
    LAEI: London Atmospheric Emissions Inventory
    LAQM: Local Air Quality Management
    LAQN: Breathe London Air Quality monitoring network
    LHEM: London hybrid exposure model
    LISA: Local indicators of spatial association
    MAQS: Mayor’s Air Quality Strategy
    OK: Ordinary Kriging
    ONS: Office for National Statistics
    PACF: Partial autocorrelation function
    PMCC: Pearson’s product moment correlation
    QQ plot: Quantile-quantile plot
    TFL: Transport for London
    UCL: University College London
    UK: United Kingdom
    ULEZ: Ultra low emission zone
    WHO: World Health Organisation
  • 8. 8 Chapter 1. Background and Objectives The development of models to assess air pollution exposure within cities is a major topic for sustainable development in urban areas. People travel and are exposed to air pollutants differently. They might use different modes of transport to get to work and school at different times. Variability in air pollution is intrinsically associated with trends in emission components and exposes populations at different spatial and temporal scales. Poor air quality has long been recognised as having adverse effects on health. Improving the understanding of these effects requires air monitoring systems and modelling predictions, especially in urban areas where high pollutant concentrations coincide with high population densities. Health and epidemiological studies provide sufficient evidence of a causal relationship between air quality and health effects such as asthma, impaired lung function, total and cardiovascular mortality and cardiovascular morbidity (Beevers et al, 2013). In London, networks of monitoring stations have been used to assess and monitor pollutant particles. Information on emissions can provide a representation of pollutant concentrations with data brought together from a relatively small number of sites. Modelling air pollution is an effective way to understand how pollution affects urban areas and to provide insights to ground emission control measures. This calls for theory and methods to gain a better understanding of the spatial and temporal processes observed in pollutant data. 1.1 Aim and Research Questions This study aims to compare exposure to nitrogen dioxide (NO2) particles under ordinary circumstances and during the enforced Covid-19 lockdown period of 2020 in London, in order to provide insights into local-level emissions of NO2. 
The work also aims to capture the temporal variation of emissions in order to represent the short-term variations in concentration across London in 2019 and compare them with the disruptive event of the Covid-19 pandemic in 2020, with important implications for future emission control and urban environmental strategies in predicting outdoor human exposure. A key contribution of this work is to provide insightful information for pollution control strategies. The results can be used on geo-referenced data to understand human exposure to
  • 9. 9 NO2 at different places and provide evidence to recommend policy improvements for pollution control and sustainable development. The main question of the dissertation can be summarised as follows:
    • How does NO2 pollution change at different spatial and temporal scales with the disruptive economic and environmental event of Covid-19 in London?
    The following sub-questions can also be stated:
    • What are the usual NO2 concentrations in London in an ordinary year compared to the enforced lockdown period in 2020?
    • How do NO2 emissions during the lockdown differ from typical patterns?
  • 10. 10 Chapter 2. Literature Review In this chapter, we introduce urban air pollution modelling methodologies and their main characteristics. Recent studies on the impact of the enforced Covid-19 lockdown are also mentioned. The chapter also introduces the main governance scheme for ambient air quality in London as well as the main research projects that study urban air pollution in the city. Finally, it explores the methodologies of exploratory data analysis and the techniques of geostatistical interpolation. 2.1 Spatial Air Pollution Modelling Ambient concentrations of air pollutants at potentially harmful levels in urban areas have been the subject of scientific studies seeking to understand the characteristics and driving forces of atmospheric pollution (Jerret et al., 2005; Elliot et al, 2000; Steinle et al, 2013). There is worldwide concern about air quality in intraurban areas, and methodologies for modelling and monitoring the air concentration of pollutants have been developed to provide a reference for formulating and designing preventive measures (Jerret et al., 2005). Air pollution in cities is subject to high spatial and temporal variability. Exposure results from the relationships and interactions between environmental and human systems (Steinle et al, 2013). Large cities with high population densities are especially affected, and every individual has unique activity patterns that result in differing exposures. Therefore, accurately measuring exposure at fine spatial and temporal scales is of crucial importance. As a result, the level of exposure and the impact of air pollution have been subject to assessment and control policies in the UK, such as the UK National Air Quality Strategy (NAQS) and the London Environment Strategy (Mayor of London, 2018). To assess and control pollution in urban environments, a network of air quality monitoring stations is usually used. 
With the inventory of pollutants provided by monitoring stations, a range of methods has been used to measure human exposure to air pollution. The process of monitoring pollutant particles has been defined by Zartarian et al. (2007) as “… the process
  • 11. 11 of estimating or measuring magnitude, frequency and duration of exposure to an agent (…)” (p.58). The methods vary in their sophistication and attempt to develop exposure models capable of identifying small-area variations in pollution. Simple measures such as proximity to road traffic can serve as a proxy to better capture the intraurban variability in pollutant concentrations. More sophisticated techniques apply dispersion, atmospheric and time-activity models with geographic information systems (GIS) capabilities (Jerret et al, 2005). The latest advances in technology enable the tracking of individuals while simultaneously measuring pollutant concentrations with individual monitor sensors (Steinle et al., 2013). The literature reviews diverse models to assess air pollution exposure within cities (Jerret et al., 2005; Elliot et al, 2000). In broad terms, methodologies can be grouped into four classes, namely: (i) spatial regression, (ii) geostatistical interpolation, (iii) dispersion models and (iv) hybrid models. Table 2.1 presents their main characteristics.
    Table 2.1 Urban Air Pollution modelling methodologies
    Methodologies: Applications
    Spatial regression:
      • Models the concentration of pollutants as a function of predictor variables.
      • Establishes a statistical relationship between pollutants and variables such as surrounding land use, population density, traffic patterns and meteorological data.
    Geostatistical interpolation:
      • From data collected at a set of monitoring stations, estimates the concentration of pollutants in neighbouring areas.
      • Interpolation is based on purely spatial or spatio-temporal modelling.
    Dispersion models:
      • Simulate pollution fate and transport with atmospheric data and time-activity models (e.g. KCLurban¹ and CMAQ-urban²).
    Hybrid models:
      • Combine personal or household exposure monitoring with one or two different methods (e.g. LHEM³).
    Spatial regression seeks to predict pollution concentrations at a given location based on surrounding land use and traffic characteristics. The method uses the measured pollution concentration yi at location s as the response variable, with independent variables xi within areas called buffers as predictors of the measured concentrations (Jerret et al, 2005).
    1 Reference: http://www.erg.kcl.ac.uk/research/home/modelling-pollution-in-london.html [Accessed 18 July 2020].
    2 Reference: http://www.erg.kcl.ac.uk/research/home/modelling-pollution-in-london.html [Accessed 18 July 2020].
    3 Reference: https://pubs.acs.org/doi/abs/10.1021/acs.est.6b01817 [Accessed 18 July 2020].
  • 12. 12 The regression modelling aims to predict pollution surfaces as a function of exogenous independent variables at any spatio-temporal resolution. The interpolation technique also relies on pollutant data derived from monitoring stations. The aim is to estimate the concentration of a pollutant at sites other than the stations. By means of a grid imposed over the study area, a continuous surface of pollutants can be obtained. The most common geostatistical interpolation technique is kriging, which models spatial dependence to develop continuous surfaces of pollution. It provides the best linear unbiased estimate (BLUE) of the variable’s value at any point of the study area (Burrough and McDonnel, 1998). The predicted values come with standard errors (the square root of the kriging variance), which quantify the degree of uncertainty in spatial predictions at any site. Other methods such as splines, inverse distance weighting and Thiessen triangulation, which rely on deterministic algorithms, are also commonly applied as interpolation methods but do not offer a means to estimate errors (Jerret et al, 2005). Dispersion models generally rely on assumptions about deterministic processes and require meteorological conditions and topography to be used in conjunction with emission data. They aim to offer a more realistic representation of the problem. Meteorological data provide information about wind speed and direction, ambient temperature, solar radiation and atmospheric stability (Gualtieri and Tartaglia, 1998). Once calibrated, the model computes the pollution levels over the extent of the study area. Beevers et al (2013) explain that the main advantage of the dispersion modelling approach lies in its ability to disaggregate by composite and source origin in order to predict past and future air quality as well as to assess the impact of prevention measures. 
Hybrid models combine two modelling approaches. They usually combine personal or regional monitoring with other air pollution exposure methods. Personal exposure assessment is evolving quickly, and the latest advances in technology enable the tracking of individuals while simultaneously measuring pollutant concentrations (Steinle et al, 2013). Overall, dispersion models are considered more reliable than the others but require a substantial amount of data on emissions and meteorology (Jerret et al., 2005). Finally, it is worth mentioning that intraurban pollution is a major public health issue, as recent studies suggest that air pollution may play an important role in helping to understand and combat the spread of the Covid-19 pandemic. Particles of pollution might help carry the
  • 13. 13 virus further afield, suggesting a link between death rate and the spread of diseases (The Guardian, 2020). The Centre for Research on Energy and Clean Air (CREA) has also studied the impact that the enforced Covid-19 lockdown has had on air pollution levels in 12 big cities around the world. The studies indicate that nitrogen dioxide (NO2) levels fell by about 27% ten days after governments issued stay-at-home orders, compared with the same period in 2017-19. Another component, particulate matter (PM), declined by an average of about 5% in a group of 12 big cities for which data are readily available (The Economist, 2020). 2.2 Air Pollution Exposure Monitoring in Greater London With a population of more than 8 million people according to the 2011 census (ONS, 2012), London is one of the largest cities in the world. In November 1999, the Greater London Authority Act received Royal Assent to provide the governance framework for the Mayor of London, leading to the publication of the Mayor’s Air Quality Strategy (MAQS) for Greater London. The MAQS aims to meet the requirements of Local Air Quality Management (LAQM), an important part of the Government’s strategy to meet both the UK air quality objectives and the EU limit values (Oxley et al, 2009). LAQM requires all local authorities to carry out regular assessments of air quality within their boundaries. The London Atmospheric Emissions Inventory⁴ (LAEI) is the key tool for air quality analysis and policy development in London. Provided by the Greater London Authority (GLA), it is a regularly updated database of pollutant emissions. LAEI data derive from two air quality monitoring networks:
    1. Regulatory air quality monitoring sites, managed by GLA, and
    2. Breathe London Air Quality monitoring network (LAQN), provided by Transport for London (TFL).
The Environmental Research Group⁵ (ERG), part of the School of Population Health & Environmental Sciences at King’s College London, has, since the publication of the MAQS, developed and provided air quality research and information in London and the United Kingdom (UK).
    4 Reference: https://data.london.gov.uk/air-quality/
    5 Reference: https://www.kcl.ac.uk/lsm/research/divisions/aes/research/ERG
  • 14. 14 The main outputs of the ERG consist of a hybrid model and dispersion modelling systems. Table 2.2 depicts the main characteristics of the methodologies applied to urban air pollution in London by the ERG.
    Table 2.2 The Environment Research Group Urban Pollution Modelling Methods
    Dispersion Models
      KCLurban:
        • Is a dispersion modelling system using the Atmospheric Dispersion Modelling System (ADMS) dispersion model and the road source model from the London Atmospheric Emissions Inventory (LAEI).
        • Gives annual mean air quality predictions on a regular 20 x 20 m grid.
      Community Multi-scale Air Quality Model (CMAQ-urban):
        • Is deterministic (uses fundamental physics and chemistry) and runs over a much larger model domain.
        • Predicts hourly concentrations, coupling road models with regional site monitoring stations.
    Hybrid Models
      London Hybrid Exposure Model (LHEM):
        • Uses anonymous activity data provided by Transport for London, and advanced air pollution and micro-environmental modelling.
    Source: adapted from Beevers et al (2013).
    Both dispersion models (i.e. KCLurban and CMAQ-urban) establish the spatio-temporal patterns of NOx-NO2, PM10 and PM2.5. KCLurban uses the ADMS dispersion model as well as intraurban road traffic and meteorological data. Beevers et al (2013) explain that KCLurban has been used in air quality decision making in London under the Mayor’s Air Quality Strategy. KCLurban gives annual mean air quality predictions of pollutants on a regular 20 m x 20 m grid. The CMAQ-urban model, for its part, is deterministic and predicts hourly concentrations of pollutants across London on a 20 m x 20 m grid. The London Hybrid Exposure Model (LHEM) combines a dispersion model at small spatial and temporal (hourly) scales with detailed space-time-activity data collected by TFL. 
The hybrid model makes it possible to estimate the exposure misclassification associated with using estimates of average concentration at the home postcode and to improve the understanding of interactions between exposure and vulnerable sub-groups (Beevers et al, 2013). It aims to provide a more accurate measurement of pollutant exposure by considering individual-level data with details from approximately 200,000 journeys.
  • 15. 15 LHEM compares hourly concentrations of NOx, NO2, O3, CO, PM10 and PM2.5 with hourly concentrations measured at 42 automatic monitoring sites. The monitoring sites are those of the London Air Quality Network (ibid, 2013). 2.3 Exploratory Spatial Data Analysis A fundamental task prior to any data analysis is to examine the structure and characteristics of the dataset. Initial examination of the data, visually or using descriptive statistics, is a powerful way to better understand the data at hand. This pre-modelling exploration helps to reveal patterns before any statistical model of a spatial process is set up. It provides useful information about the variables and a framework for tackling spatial problems. The goal is to develop an understanding of the data by revealing key relationships and processes. This initial step aims to generate insights and is helpful for modelling data that vary across space. Haworth (2018) contends that Exploratory Data Analysis (EDA) focuses on the analysis of datasets to explore their characteristics and draw inferences. Here, data visualisation and basic descriptive statistics are the common techniques for generating evidence from empirical data. According to Cheng and Haworth (2019), the objectives of this initial phase of data examination include:
    • maximising insight into a dataset
    • uncovering underlying structure
    • extracting important variables
    • detecting outliers and anomalies
    • testing underlying assumptions.
    Although many of these methods are applicable to non-spatial data, considering the spatial dimension is essential when the data are geographical (Fotheringham et al., 2000). Exploratory spatial data analysis (ESDA) is the extension of EDA to spatial data. ESDA combines tools of EDA with maps and measures of spatial data.
  • 16. 16 2.3.1 Spatial Dependence and Autocorrelation Location establishes context. Comparing the attributes and distances of objects with those of other objects in close proximity is a powerful way to generate insight from the data (De Smith et al., 2007). Dependence refers to any statistical relationship between two variables. By contrast, the correlation of an attribute with itself is called autocorrelation. Classical statistical inference usually assumes that the observations under study are independent. This assumption is usually not applicable to geographical data, since geographic attributes or units are tied together, a phenomenon termed spatial dependence. In spatial analysis, the core measure of spatial dependence is Pearson’s Product Moment Correlation Coefficient (PMCC), as presented in equation 1. It forms the basis for many of the correlation measures used in spatial (and time series) analysis (Haworth, 2018).
    $$ r_{XY} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2}\,\sqrt{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}} \qquad (1) $$
    where:
    n is the sample size
    X_i, Y_i are the individual sample points indexed with i
    \bar{X}, \bar{Y} are the sample means of X_i and Y_i
    PMCC results can be interpreted in the following way:
    +1 = perfect positive correlation
    0 = no correlation
    -1 = perfect negative correlation
    Spatial autocorrelation describes how an attribute is distributed over space, that is, to what extent the value of the attribute in one spatial area depends on the values of the attribute in neighbouring zones (Goodchild, 1986). To assess the significance of the autocorrelation coefficient, two main strategies are usually employed (Haworth, 2018):
    • Adjacency based measures: applied with spatial weight matrix
  • 17. 17 • Distance based measures: use the distance between locations to define proximity.

The former technique is applied to areas that are spatially adjacent. The latter is a function of the distance between observations and typically deals with point of interest data6. Local measures of autocorrelation and clustering in areal data are usually assessed with local indicators of spatial association (LISAs). These techniques are not used in the present study since the sites that monitor pollution are too sparse in the spatial domain (i.e. the inner London boundary) for spatial aggregation to be meaningful.

2.3.2 Correlation in Temporal Data

Correlation in temporal data is explored based on observations over time frames. A time series is a set of observations on quantitative variables collected over time. A univariate time series is a sequence of measurements of the same variable collected over time. Time-series and spatial data analysis are used separately to examine whether the data are correlated and stationary in time and space. In conjunction with a spatial component, the series becomes a space-time series of values for a quantitative variable over time7.

The autocorrelation function (ACF) gives the correlations between a series and lagged values of the same series for different lags, that is, the correlation of a variable with a lagged version of itself. The ACF is used to identify the possible structure of time series data. For example, if a time series exhibits significant autocorrelation then its previous values can be used to predict its future values. The ACF of the series gives correlations between x_t and x_{t-h} for h = 1, 2, 3, ... (Penn State University, 2020). The ACF between x_t and x_{t-h} equals:

$$ \frac{\mathrm{Cov}(x_t, x_{t-h})}{\mathrm{SD}(x_t)\,\mathrm{SD}(x_{t-h})} = \frac{\mathrm{Cov}(x_t, x_{t-h})}{\mathrm{Var}(x_t)} \qquad (2) $$

The second equality holds for a stationary series, whose standard deviation is the same at every time.

6 In this study, spatial autocorrelation is measured from spatial points formed by monitoring stations. Therefore, semivariance modelling is applied.
7 Time series also violate the assumption of data independence in classical statistics, as more recent information is usually more useful than less recent information in forecasting.
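As an illustration of equation 2, the following is a minimal sketch of the sample autocorrelation at a given lag, written in plain Python. The function name `autocorrelation` and the values in `daily_no2` are hypothetical and are not taken from the study; the estimator divides the lag-h autocovariance by the variance of the whole series, which is one common convention.

```python
from statistics import fmean


def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at a positive integer `lag`.

    Sample form of equation 2: the lag-h autocovariance divided by the
    variance of the whole series (both as sums, so the 1/n factors cancel).
    """
    n = len(series)
    if not 0 < lag < n:
        raise ValueError("lag must be between 1 and len(series) - 1")
    mean = fmean(series)
    variance = sum((x - mean) ** 2 for x in series)
    covariance = sum(
        (series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n)
    )
    return covariance / variance


# Hypothetical daily means (µg/m3), invented for illustration only.
daily_no2 = [38.0, 36.5, 35.0, 37.2, 39.1, 40.3, 38.8, 36.0, 34.5, 35.9]
acf_values = [autocorrelation(daily_no2, h) for h in range(1, 4)]
```

A value close to +1 at lag 1 would indicate that a day's level is a good predictor of the next day's level, while values near 0 indicate little temporal dependence at that lag.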
  • 18. 18 Most time series are not stationary, that is, they violate the assumption that the mean is the same at all times (which denotes that the values are independent of time). They may exhibit temporal seasonal patterns, as for example transport flows and weather do.

A partial autocorrelation function (PACF) is a conditional correlation. It is the correlation between two variables under the assumption that we know and take into account the values of some other set of variables. It is a measure that shows how much more information each additional variable provides (Haworth, 2018). For example, in a regression problem with y as the response variable and x1, x2, x3 as the predictor variables, the partial correlation between y and x3 can be calculated as follows:

$$ \frac{\mathrm{Cov}(y, x_3 \mid x_1, x_2)}{\sqrt{\mathrm{Var}(y \mid x_1, x_2)\,\mathrm{Var}(x_3 \mid x_1, x_2)}} \qquad (3) $$

The partial correlation between y and x3 is the correlation between the variables determined taking into account how both y and x3 are related to x1 and x2 (Penn State University, 2020).

2.4 Geostatistical Interpolation

Given a spatial framework, an area can be modelled as a function of an attribute variable. The variation of a spatial phenomenon over a continuous geographical scale is sometimes modelled from spatially dispersed point data, as in the case of urban sensor stations. Bivand et al. (2008) define geostatistical data as “those that could in principle be measured anywhere, but that typically come as measurements at a limited number of observation locations (…)” (p. 191). By extension, geostatistics can be defined as the analysis of the spatial variation of an attribute by means of a function fitted to geostatistical data. Spatial interpolation aims to create a surface, usually referred to as a grid, from spaced point data, allowing predictions of a variable at spatial areas based on neighbouring observations, or distances between points.

The modelling approach in geostatistics regards the analysis of random fields Z(s), with Z random and s the non-random spatial index.
Measurements of Z are available at a limited number of sample locations, and Z is predicted (interpolated) at non-observed locations s0 by means of a spatial correlation function (i.e. the semivariance function).

  • 19. 19 The collection of geostatistical data with a temporal domain also enables the modelling of temporal variability in conjunction with spatial data, a technique known as spatio-temporal interpolation8.

Typical problems where interpolation methodologies are applied are the creation of digital elevation models, environmental analyses such as air quality or soil pollution, and the estimation of spatial averages from continuous, spatially correlated data such as house prices (Haworth, 2018). Other problems include monitoring network optimization, where observation locations are to be added or removed.

2.4.1 Spatial Interpolation Methodologies

The common process of spatial interpolation is composed of three basic steps:
1. observations of a phenomenon are recorded at point locations (e.g. monitoring stations);
2. a grid (a raster layer) is overlaid on the area of interest; and
3. the value of each grid cell is estimated using some function of the observed points.

There are a number of techniques for creating grids (De Smith et al., 2007). In broad terms, they can be classified into two main strategies:
1. Deterministic methods: the values at unsampled (grid) points are computed as a simple linear weighted average of neighbouring measured data points within a given neighbourhood under consideration.
2. Probabilistic methods: fit a model to the data. Regionalised variation is determined by modelling the semivariance and using the fitted function in the interpolation process.

Table 2.3 presents the main characteristics of the most used methods of spatial interpolation.

8 This technique is not applied in this study.
  • 20. 20 Table 2.3 Interpolation methods

| Method | Strategy | Advantages | Disadvantages |
| Nearest Neighbours | Each point is given the average value of the k nearest points to it. | Distribution free; computationally simple. | Based on the sample data, so some neighbours may be far away; considers all neighbours equally and not based on distance. |
| Distance decay | Applies a mathematical function to weight observations based on their distance from the point to be estimated. Inverse distance is usually applied to decrease similarity with distance. | Computationally efficient. | Sensitive to outliers and sampling configuration (clustered and isolated points). |
| Inverse distance weighting (IDW) | Calculates the value at a point as a weighted sum of surrounding points, where the weight is proportional to the inverse of the distance from the point. | Conceptually simple; easy to apply; not computationally intensive if used efficiently. | Deterministic: based purely on prior assumptions of the distance decay relationship; does not take the spatial distribution of the data into account; cannot be ‘trained’. |
| Kriging | Based on semi-variogram modelling. Describes how variance increases as a function of distance between observations (distance decay). A function is fitted to the semi-variogram, which is used to weight distances between points. | Fits a model to the data, rather than relying on prior assumptions; uncertainty in the predictions can be quantified (if the assumptions are correct); very flexible framework: many functions can be used to model the semivariogram. | Computationally intensive for large regions; more complicated than other methods; requires training to be used correctly. |

Source: adapted from Haworth (2018).
In deterministic problems, the estimation process uses a simple linear expression to compute grid values (equation 4):

$$ z_j = \sum_{i=1}^{n} \lambda_i z_i \qquad (4) $$

where z_j is the z-value to be estimated for location j, the λ_i are a set of estimated weights and the z_i are the known (measured) values at points (x_i, y_i). As z_j is a simple weighted average, an additional constraint is required to ensure that the weights sum to 1. To tackle

  • 21. 21 the interpolation problem, we must determine the optimum weights to be used (Bivand et al., 2008).

Among the many factors that influence the quality of interpolation, the distribution of observations plays a major role. Interpolation methods often assume data points are subject to error. Regularly spaced data may be subject to bias due to intrinsic frequencies in the data, spacing and directional effects (De Smith et al., 2007).

2.4.2 Kriging

Kriging is a geostatistical method based on statistical models that describe the relationship among data from measured points. The model assumes that the distance or direction between sample points reflects a spatial correlation in the study area. This model is used to interpolate, or predict, values at unsampled locations, in much the same way as with deterministic interpolation of continuous spatial phenomena (De Smith et al., 2007). The interpolated values are modelled by a Gaussian process governed by prior covariances (Chen et al., 2010). Kriging comes in several forms, including simple kriging, ordinary kriging (OK) and universal kriging (table 2.4).

Table 2.4 The main forms of linear Kriging

| Kriging Form | Mean | Drift Model | Prerequisite |
| Simple Kriging | Known | None | Covariance |
| Ordinary Kriging | Unknown | Constant | Variogram |
| Universal Kriging | Unknown | Function of coordinates | Variogram |
| Kriging with external drift | Unknown | External variable | Variogram |

Source: Chiles and Delfiner, 2012, p. 148.

By analysing the sample data, it is possible to derive a general model that describes how the sample values vary with distance and direction (i.e. isotropy and anisotropy). This model may then be used to interpolate, or predict, values at unsampled locations, in much the same way as with deterministic interpolation.
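As a point of comparison for the deterministic case of equation 4, the following is a minimal sketch of inverse distance weighting in plain Python. The function name `idw_estimate`, the tuple layout of `stations` and the default `power=2.0` are assumptions made for illustration; the raw weights 1/d^power are normalised so that they sum to 1, as the constraint on equation 4 requires.

```python
from math import hypot


def idw_estimate(stations, target, power=2.0):
    """Inverse distance weighted estimate at `target` = (x, y).

    `stations` is a list of (x, y, value) tuples. Weights are proportional
    to 1 / distance**power and are normalised so that they sum to 1.
    """
    raw_weights = []
    for x, y, value in stations:
        distance = hypot(x - target[0], y - target[1])
        if distance == 0.0:
            return value  # the target coincides with a station
        raw_weights.append((1.0 / distance ** power, value))
    total = sum(weight for weight, _ in raw_weights)
    return sum((weight / total) * value for weight, value in raw_weights)
```

Unlike kriging, the weights here depend only on the assumed distance decay exponent, not on a model fitted to the data.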
  • 22. 22 De Smith et al. (2007) point to the fact that kriging cannot deal with duplicate observations (i.e. data that share the same location) because they are perfectly correlated, leading to singular covariance matrices.

The modelling procedure is based on semivariogram models. The general formula for predicting values is as follows (Haworth, 2018):

$$ Z(x, y) = m(x, y) + e_1(x, y) + e_2(x, y) \qquad (5) $$

where:
Z(x, y) is the value to be predicted at location (x, y),
m(x, y) is a deterministic model of Z at location (x, y); in ordinary kriging, this is a mean value,
e_1(x, y) is the spatially dependent statistical variation around m(x, y); this part is modelled using the semivariogram,
e_2(x, y) is a random error component used for residual analysis.

2.4.3 Spatial Variance

In interpolation, spatial variance is modelled by means of the variogram or semivariogram. Spatial variance refers to the amount of variability in a phenomenon over distance. The semivariogram plots spatial semivariance as a function of distance.

In standard statistics problems, correlation is any statistical relationship or association between two random variables. It measures the degree to which a pair of variables are related and is commonly studied by means of a scatterplot. With spatial problems, the correlation of variables at locations s1 and s2 cannot be estimated, as only a single pair is available (Bivand et al., 2008). Additionally, we might study whether the point data satisfy the stationarity assumption, under which the mean, variance and autocorrelation structure do not change over time.

Spatial analysis packages such as gstat compute the squared differences between all pairs of values in the dataset, and then allocate these to lag classes based on the distance between the pair. The procedure computes a set of semivariance values for distance lags, h, increasing in steps from 0 to a value less than the greatest distance between point pairs. A variogram graph (also

  • 23. 23 known as an empirical semivariogram) represents this set of values plotted against the separation distance, h (see an example in figure 2.1).

Figure 2.1 Sample empirical semivariogram

Semivariogram modelling yields the weights used to estimate values at unknown locations in an interpolation process based on a relatively simple equation (De Smith et al., 2007). It involves fitting a mathematical function to an empirical semivariogram by minimising the sum of squared errors: the goal is to draw a curve through the points that minimises the residual error between each point and the model. There are three parameters to adjust in semivariance modelling: sill, range and nugget (figure 2.2).

Sill (s): the semivariance value of the plateau that is reached once spatial correlation between data point pairs ceases or becomes much less variable9.
Nugget (n): a zero or non-zero intercept with the y-axis in the model that has been fitted.
Range (r): the distance at which the increase in semivariance levels off, that is, the distance at which the sill is reached.

9 Non-transitive variograms are ones in which the sill is not reached.
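The lag-class procedure described in section 2.4.3 can be sketched in plain Python as follows. This is an illustrative reading of the procedure, not the code of any particular package: the function name `empirical_semivariogram`, the tuple layout of `points` and the fixed-width lag classes are assumptions. Each pair of points contributes its squared difference to the lag class of its separation distance, and the semivariance of a class is half the mean squared difference.

```python
from math import hypot


def empirical_semivariogram(points, lag_width, n_lags):
    """Return [(mean distance, semivariance, pair count)] per lag class.

    `points` is a list of (x, y, value) tuples. Pairs separated by
    lag_width * n_lags or more are ignored; empty classes are dropped.
    """
    sums = [[0.0, 0.0, 0] for _ in range(n_lags)]  # distance, sq. diff, count
    for i in range(len(points)):
        xi, yi, zi = points[i]
        for j in range(i + 1, len(points)):
            xj, yj, zj = points[j]
            distance = hypot(xi - xj, yi - yj)
            k = int(distance // lag_width)  # index of the lag class
            if k < n_lags:
                sums[k][0] += distance
                sums[k][1] += (zi - zj) ** 2
                sums[k][2] += 1
    return [
        (d / n, sq / (2 * n), n)  # half the mean squared difference
        for d, sq, n in sums
        if n > 0
    ]
```

Plotting the second element of each tuple against the first gives a graph of the kind shown in figure 2.1.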
  • 24. 24 Figure 2.2 The semivariance parameters
Source: Esri (2019)

To fit a model to the empirical semivariogram, the following steps need to be taken (Bivand et al., 2008):
1. Choose a suitable model (such as exponential or Gaussian), with or without nugget.
2. Choose suitable initial values for partial sill(s), range(s), and possibly nugget.
3. Fit this model, using one of the fitting criteria.
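The three fitting steps above can be sketched as follows, assuming an exponential model in one common parameterisation and an unweighted sum of squared errors as the fitting criterion. The function names and the brute-force grid search are illustrative simplifications; geostatistical packages typically refine initial values with (weighted) least squares instead of searching a grid.

```python
from math import exp


def exponential_model(h, nugget, partial_sill, range_param):
    """Exponential semivariogram: nugget + partial_sill * (1 - exp(-h / range))."""
    if h == 0:
        return 0.0  # the semivariance at zero separation is zero by definition
    return nugget + partial_sill * (1.0 - exp(-h / range_param))


def sum_squared_errors(empirical, nugget, partial_sill, range_param):
    """Unweighted sum of squared errors between model and empirical points."""
    return sum(
        (gamma - exponential_model(h, nugget, partial_sill, range_param)) ** 2
        for h, gamma in empirical
    )


def fit_by_grid_search(empirical, nuggets, partial_sills, ranges):
    """Return the (nugget, partial_sill, range) triple with the lowest SSE.

    `empirical` is a list of (distance, semivariance) pairs; the three
    candidate lists play the role of the initial values in step 2.
    """
    candidates = (
        (n, s, r) for n in nuggets for s in partial_sills for r in ranges
    )
    return min(candidates, key=lambda p: sum_squared_errors(empirical, *p))
```

In this parameterisation the sill equals the nugget plus the partial sill, and the model only approaches it asymptotically, so the practical range is usually taken to be a multiple of `range_param`.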
  • 25. 25 Chapter 3. Data

Individuals in urban areas are exposed to a large variety of pollutant mixes. Air quality is affected by pollutants such as nitrogen oxides (NOx), particulate matter (PM), carbon monoxide (CO) and ground level ozone (O3). These substances interact, react and create heterogeneous pollutant mixes (Jerret et al., 2005). At present, urban sensor stations make it possible to monitor different pollutants at an increasing temporal resolution.

In this study, the main pollutant analysed is nitrogen dioxide (NO2), a secondary pollutant formed mainly from nitric oxide (NO), with a surface lifetime of around 1 day. It is normally measured with passive sensors and is in general more spatially homogeneous than NO, the predominant species in vehicle exhaust (ibid.). Nitrogen dioxide levels are expressed in micrograms per cubic metre of air (µg/m3).

The primary data is obtained from a broad variety of sensors deployed by stakeholders, including citizen scientists and community advocates, expert researchers, government agencies, sensor manufacturers and others, and provided by Air Quality Data Commons (AQDC), an organization which seeks to accelerate solutions to air pollution by standardizing and sharing air quality data. In brief, a pollutant is sampled at multiple fixed-site monitors acting as a proxy for “true” human exposure. Pollution data are stored on an hourly basis, with the timestamp as epoch time (as recorded by the sensors). The epoch time is then used by the downstream applications of the AQDC online platform.

The pollutant dataset available for London is composed of monitoring stations of the London Air Quality Network (LAQN) positioned within the Greater London Authority boundary and licensed under the terms of the Open Government Licence. The data was downloaded using the SQL editor for large datasets on the AQDC platform.

A dataset of over 1 million records from 101 stations across London for the period from December 21st 2018 (the start of the winter season) to June 6th 2020 (the last available data prior to the study) was downloaded in “csv” format. The present study adopts the definition of astronomical seasons, which uses the dates of equinoxes and solstices to mark the beginning and end of the seasons. Monday to Friday are termed weekdays, Saturday and Sunday the weekend.
  • 26. 26 As the density of monitoring stations in outer London falls and becomes less regular, a subset of sensor data within the inner London boundary has been selected (Figure 3.1). This strategy aims to increase the accuracy and reliability of the interpolation technique applied in the study.

The study area lies within the Greater London Authority boundary. It covers 123 sq mi (319 km2), has a population of 3,535,700 inhabitants (ONS, 2011) and is monitored by a network of 71 fixed monitoring stations.

Figure 3.1 Map of the 71 monitoring stations inside Inner London Authority Boundary
  • 27. 27 Chapter 4. Methodology

There are a number of methodologies to investigate human exposure to air pollutants. The literature provides good examples of techniques such as geostatistical interpolation, land-use regression models, dispersion models and hybrid models which combine space-time-activity data, personal measurements and air quality models (Jerret et al., 2005; Elliot et al, 2000; Steinle et al, 2013).

The present study applies exploratory data analysis and an interpolation technique for estimating pollution at unsampled locations. Given the irregular distribution and density of the monitoring stations (i.e. 71 stations within the inner London boundary) and the spatial variability of the measurements, a 2D probabilistic interpolation is the preferred choice of the study. Kriging, a geostatistical model, has therefore been chosen as the methodological approach. The method assumes that the data point values represent a sample of a continuous spatial phenomenon, in this case NO2 pollutant emissions.

4.1 Empirical framework

In order to address the research questions, the study is empirically designed in a few stages, as presented in Figure 4.1.

Figure 4.1 Outlook of the methodological approach
1. Examination of data patterns and characteristics
2. Modelling of spatial correlation
3. Spatial interpolation
4. Model diagnostics and discussions

Step 1: The first section aims to explore the dataset characteristics and generate hypotheses by means of graphical methods such as histograms, box-plots and scatterplots. Exploratory spatial data analysis (ESDA) strategies are applied to examine spatial and temporal patterns in the dataset. Time-series analysis, with metrics such as week, month, season and time, and spatial analysis are used separately to study the patterns of pollution. This way, some inferences about the dataset can be raised. The goal is to develop an understanding of the data by revealing key relationships and processes.

  • 28. 28 The objectives of the first part of the study include:
• generate insight into the dataset
• uncover underlying structure
• detect outliers
• identify underlying assumptions.

In this section, a data-driven approach is applied by making a list of data components and then transforming them into graphics. To explore the data, a methodological pathway is taken. The data is first tidied, which means storing it in a regular form in accordance with the semantics of the commands applied. Once the data is tidied, a data transformation process is applied. Transformation includes narrowing in on observations of interest, possibly creating new variables and calculating summary statistics. Finally, visualization techniques are applied to drive knowledge generation.

Step 2: The second stage aims to study the spatial correlation in the dataset through semivariance modelling. More precisely, the spatial correlation is modelled by means of the semivariogram (here, the word variogram is used synonymously with semivariogram). In geostatistics the spatial correlation is modelled by the variogram instead of a covariogram. The semivariogram is used for spatial interpolation of the observed process based on point observations, in this case pollution monitoring stations.

This step applies a process commonly known as exploratory variogram analysis, which means exploring spatial correlation in the data. The experimental semivariogram is calibrated and then fitted with a suitable model that describes how the sample values vary with distance. To fit a variogram to the empirical semivariogram, the following steps are taken:
1. Study the spatial correlation by means of a semivariogram and verify directional dependence.
2. Choose a suitable semivariogram model taking into account accuracy measures and suitable initial values for partial sill(s), range(s), and nugget.
Step 3: This stage explores the spatial prediction of unknown quantities of NO2 based on the previous step. The model is used to interpolate, or predict, values at unsampled locations

  • 29. 29 taking into account the form of the pollutant and its variance. The following procedures are taken:
1. Based on the semivariance model, interpolate sites onto a regular grid of (x, y) locations.
2. Apply the kriging function to compute predictions of the data values.
3. Display the results on a grid map.

Step 4: The last stage deals with the results of the interpolation process. It provides a discussion of the results and a comparative review of the different timeframes analysed. It also draws some conclusions about the impact of Covid-19 on NO2 pollutant data and the implications for policies to improve air quality in London.
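Step 3 above can be sketched as an ordinary kriging prediction at a single location. This is a minimal illustration under stated assumptions, not the gstat implementation used in the study: it assumes an exponential semivariogram with zero nugget and hypothetical parameter values, and it solves the kriging system with a small Gaussian elimination routine so that only the Python standard library is needed. The system combines the station-to-station semivariances with the constraint that the weights sum to 1.

```python
from math import exp, hypot


def semivariance(h, partial_sill=1.0, range_param=1.0):
    """Exponential semivariogram with zero nugget (an assumed model)."""
    return partial_sill * (1.0 - exp(-h / range_param))


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def ordinary_kriging(stations, target, **model):
    """Return (prediction, weights) at `target` from (x, y, value) stations."""
    n = len(stations)
    # Semivariances between stations, bordered by the unbiasedness constraint.
    a = [
        [semivariance(hypot(xi - xj, yi - yj), **model) for xj, yj, _ in stations]
        + [1.0]
        for xi, yi, _ in stations
    ]
    a.append([1.0] * n + [0.0])
    # Semivariances between each station and the target location.
    b = [
        semivariance(hypot(x - target[0], y - target[1]), **model)
        for x, y, _ in stations
    ] + [1.0]
    solution = solve(a, b)
    weights = solution[:n]  # the last entry is the Lagrange multiplier
    prediction = sum(w * value for w, (_, _, value) in zip(weights, stations))
    return prediction, weights
```

Calling `ordinary_kriging` for every cell centre of a regular grid would produce the interpolated surface of procedure 1 to 3; duplicate station locations must be removed first, since they make the system singular.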
  • 30. 30 Chapter 5. Results

To perform the study, the dataset was first imported into R. A number of procedures have been carried out in view of storing the data in a consistent form, ensuring it is organized in a way that matches the semantics of the original dataset while putting it into a format suitable for up-front work10.

To explore the data, a methodological pathway has been applied. The data was first tidied, which means storing it in a regular form in accordance with the semantics of the commands to be applied. Initial queries could then be performed by making some transformations in the dataset, narrowing in on observations of interest, creating new variables and calculating some summary statistics. Finally, visualisation techniques were applied.

In the initial phases, unusual observations, that is, data that do not seem to fit the pattern, have been identified. An asymmetric distribution of NO2 measures was noted and outliers were eliminated by applying the z-score technique. This technique identifies outliers based on the number of standard deviations above or below the mean at which each data point lies.

5.1 Data Patterns and Characteristics

An initial check was conducted to verify whether the outlier detection and treatment led to a closer approximation to normality in the distribution of NO2. Figure 5.1 displays the frequency of daily mean NO2 by season (for analysis purposes, the period of the Covid-19 lockdown is treated in the same manner as a season). It can be noted that the shape of the data dispersion (after the initial transformation) leads to a distribution close to the normal (i.e. a Gaussian distribution) in all seasons.

10 A procedure usually referred to as data tidying.
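The z-score screening mentioned earlier in this chapter can be sketched as follows. The cut-off of 3 standard deviations is an assumption chosen for illustration, since the threshold used in the study is not stated, and the function name `remove_outliers` is hypothetical.

```python
from statistics import fmean, pstdev


def remove_outliers(values, threshold=3.0):
    """Drop values whose absolute z-score exceeds `threshold`.

    The z-score is the number of (population) standard deviations
    between a value and the mean of all values.
    """
    mean = fmean(values)
    std = pstdev(values)
    if std == 0:
        return list(values)  # all values identical, nothing to remove
    return [v for v in values if abs((v - mean) / std) <= threshold]
```

Because the mean and standard deviation are themselves affected by extreme values, a single pass of this kind can leave some outliers in place; repeating the screening or using robust statistics are possible alternatives.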
  • 31. 31 Figure 5.1 Frequency of daily mean NO2 by season (in µg/m3)

Following the initial statistics on the dataset, table 5.1 presents a summary of daily NO2 from December 21st 2018 to June 6th 2020.

Table 5.1 Summary of statistics of daily NO2 (µg/m3) in Inner London Boundary

| Min. | 1st Qu. | Median | Mean | 3rd Qu. | Max. |
| 20.18 | 30.94 | 35.99 | 36.66 | 40.85 | 65.35 |

Note: Data from December 21st 2018 to June 6th 2020

To better study how the distribution diverges from normality, a quantile-quantile (QQ) plot provides a good way to describe the distribution of the data. The straight red line is the theoretical normal distribution (if the data were normally distributed, the points would fall on this line). Figure 5.2 displays the distribution of the daily concentration of NO2 in 2019 and during the enforced lockdown period established on March 23rd, 2020. It can be seen that the distribution diverges from normal at its upper tail. This suggests that the NO2 recorded at

  • 32. 32 some stations or in certain periods presents much higher values than the average. This trend is more evident in 2019 than during the lockdown. Some hypotheses can be raised:
• Some stations have systematically higher pollutant concentrations than others.
• There is a temporal trend in the data: i.e. in certain seasons, days or times, the pollutants are systematically higher than at other times.

Figure 5.2 Frequency of daily mean NO2 (in µg/m3)

A closer look at the spatial and temporal patterns would help answer some of these questions. Finally, figure 5.3 provides an outlook of the daily exposure variation of NO2. The graph presents the daily mean and separates the data by weekdays and weekends. It also highlights the beginning of the official lockdown in the UK.
  • 33. 33 The displayed line is a smooth local regression and helps to visualize the downward trend in pollution, in particular during the lockdown period. It also shows that, in general, weekdays present a higher daily mean of NO2. The plot also suggests seasonality in the data, with higher levels in Winter and Autumn than in Spring and Summer. Covariation analysis would provide a better clarification of this hypothesis.

Figure 5.3 Outlook of daily concentration of NO2 (in µg/m3)

5.2 Spatial Patterns

To explore how the data vary in space, a matrix of scatterplots showing the relationship between mean NO2 per station, latitude and longitude has been drawn in figure 5.4. The first row shows longitude on the y-axis, and latitude and mean NO2 on the x-axis in columns 2 and 3 respectively. The graph shows no clear evidence of a relationship between longitude, latitude and mean NO2. To a certain extent, this is expected, since position in a geographic reference system (i.e. longitude and latitude) has by itself little effect on pollution in urban areas.
  • 34. 34 Figure 5.4 Pairwise scatterplots of GRS and mean NO2

A better analysis would be to concentrate on the spatial scale of the area of the study. Figure 5.5 provides the context of the mean NO2 across the area of the study. The maps are split by season and the density index indicates the mean NO2. A partial spectral scale has been applied and the breaks are based on the distribution of the data.

The maps draw attention to the low density and irregular location of the station network across London. Some regions in the inner east and south have a low density of monitoring stations. The plot also confirms seasonality in the data: Autumn and Winter are the seasons most affected by pollution. A better picture is provided by the temporal analysis in the next section.

The lockdown period, on the other hand, displays a better picture. London seems to be positively affected by the lockdown in terms of urban pollution. This can be explained by better road traffic conditions. Congestion is characterised by slow moving stop and go traffic and is very costly in terms of pollution.
  • 35. 35 Figure 5.5 Network of Monitoring Stations and Mean NO2 by Season.

Local measures of autocorrelation and clustering are usually assessed with spatial adjacency and spatial weight matrices and local indicators of spatial association (LISAs). These techniques have not been applied in the study as the monitoring sites are located too sparsely in the spatial domain (i.e. the inner London boundary). Consequently, spatial aggregation with areal data cannot be achieved meaningfully. Autocorrelation in point data is presented in section 5.5 (Spatial Interpolation).

5.3 Temporal Patterns

To examine how NO2 varies in temporal dimensions, it is first necessary to pull out individual parts from the timestamp value (as stored by the sensor) using arithmetic on date-time components. Dates and times have been aggregated and classified by season, day of the week, month and time components. The objective is to explain the relationships between variables and to analyse causal relationships between different time aggregations. Several analyses have been carried out to spot temporal patterns in the NO2 data.
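The extraction of date-time components described above can be sketched as follows. The sketch assumes that the epoch timestamps are expressed in seconds and interpreted in UTC, which the source does not specify; the function name `time_components` and the field names are hypothetical. Season assignment is left out because it needs the equinox and solstice dates of each year.

```python
from datetime import datetime, timezone


def time_components(epoch_seconds):
    """Split an epoch timestamp (seconds, UTC assumed) into analysis fields."""
    moment = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return {
        "date": moment.date().isoformat(),
        "month": moment.month,
        "hour": moment.hour,
        "weekday_index": moment.weekday(),  # Monday is 0, Sunday is 6
        # Monday to Friday are weekdays; Saturday and Sunday form the weekend.
        "day_type": "weekend" if moment.weekday() >= 5 else "weekday",
    }
```

Grouping hourly records by `date`, `hour` or `day_type` then gives the daily, intraday and weekday/weekend aggregations used in this section.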
  • 36. 36 To further explore patterns of mean NO2 by season, figure 5.6 presents a box-plot of the daily concentration of the pollutant by day of the week and season11. The box-plot confirms some patterns. Weekends present lower concentrations of the pollutant than weekdays. This can be explained by lower levels of urban congestion at weekends. The same pattern is also observed during the lockdown. In winter and autumn, the seasons with the highest concentration of pollutants, the distribution is positively skewed, which indicates higher levels of pollution on some days.

Figure 5.6 Daily concentration of NO2 (in µg/m3) by days of week and season

Figure 5.7 outlines a similar approach to analysing temporal patterns. By means of rectangles, the tile graph plots the daily mean value of NO2 by month and season. Visually, it can be noticed that the period ranging from October to December is the most polluted. The average seasonal variation of NO2 reveals that the period from March to June is the least polluted. Overall, warmer months and seasons present less pollution.

11 The box-plot is a useful descriptive statistic with order-based indicators such as the median, quartiles and extreme values.
  • 37. 37 Figure 5.7 Daily concentration of NO2 (in µg/m3) by month in 2019 (top) and by season (bottom)

In addition to seasonal or monthly average emission rates, it is important to capture short-term variation in concentrations. Intraday intervals can also be computed with hourly resolution (figure 5.8). The diurnal fluxes of NO2 reveal a homogeneous trend. At commuting times, roughly between 6:00 and 10:00 am and again when workers head home, fluxes of NO2 are high. Interestingly, this trend is also present in the period of the lockdown. Spring 2019 and the lockdown in 2020, which cover coincident months, display very similar patterns. Journey-time dynamics are clearly present in all seasons but the evening peak, in particular, is less pronounced during the lockdown.
  • 38. 38 Figure 5.8 Intraday concentration of NO2 (in µg/m3) by season and the period of lockdown

Finally, table 5.2 summarises daily NO2 for different time periods.

Table 5.2 Statistics of daily NO2 (µg/m3)

| Period | Min | 1st Qu. | Median | Mean | 3rd Qu. | Max. | Std. Dev. |
| Winter (2018-2019) | 20.18 | 30.84 | 38.27 | 38.45 | 46.04 | 64.44 | 10.25022 |
| Spring (2019) | 20.92 | 29.08 | 33.11 | 33.63 | 36.97 | 52.46 | 6.644549 |
| Summer (2019) | 24.76 | 30.12 | 34.71 | 36.31 | 39.27 | 64.25 | 8.590492 |
| Autumn (2019) | 26.14 | 34.79 | 38.84 | 39.75 | 43.82 | 58.54 | 6.988886 |
| Full year 2019 | 20.18 | 31.16 | 36.29 | 36.98 | 41.22 | 64.44 | 8.41272 |
| SPLA of Covid-19 lockdown (1) | 20.92 | 28.99 | 33.18 | 33.86 | 37.87 | 52.46 | 6.804802 |
| Lockdown 2020 (2) | 19.68 | 28.10 | 30.95 | 32.34 | 35.10 | 53.49 | 6.212479 |

(1) period comprising the same period in 2019 of the Covid-19 lockdown in 2020.
(2) period from March 23rd to June 6th, 2020.
  • 39. 39 5.3.1 Temporal Autocorrelation

To quantify the extent to which observations of pollution that are near in time are more similar than observations that are distant in time, we should investigate temporal autocorrelation in the data. For the length of the series, we can calculate the autocorrelation between one day (denoted as d_t) and the previous day (d_{t-1}). Figure 5.9 displays the temporal autocorrelation of NO2 in London averaged over all stations. The first plot presents the time series of the pollutant and the second a scatter plot with the daily level on the x-axis and the level of the previous day on the y-axis. The value of the autocorrelation coefficient is 0.609 (shown on the plot), which demonstrates that the daily mean of NO2 is positively correlated with that of the previous day.

Figure 5.9 Temporal autocorrelation

A temporal aggregation of daily mean NO2 has been applied to the temporal autocorrelation function (the dependence of one day on the previous days). The ACFs of the full year (FY) 2019 and the lockdown period are presented in figure 5.10. The blue lines indicate bounds for statistical significance. The plot shows nonsignificant autocorrelation from the third lag onwards for 2019. The ACF for the lockdown differs from that of 2019 in two main aspects:
1. the autocorrelation is significant for a larger number of lags and
2. there is an unstable trend with positive and negative values, suggesting a kind of seasonality for the days of the week.
  • 40. 40 Figure 5.10 Temporal autocorrelation function (ACF) for 2019 and lockdown period

To further investigate whether the series has temporal dependency, we can draw lag plots as an exploratory tool to give a visual impression of the dependence (figure 5.11). The lag plots for the first 10 lags have been calculated. We can see that there is an increasing scatter with progressive lags (seen from the clustering around the slopes). A principal point can be drawn from the plots: the slopes, and therefore the correlations, are clearly positive for 2019, while for some lags they are unclear in the lockdown. Lower levels of NO2 seem to have an impact on the temporal autocorrelation of NO2, which alternates between positive and negative correlations.

Figure 5.11 Temporal autocorrelation function (ACF) lag plots for 2019 and lockdown period in 2020
  • 41. 41 The same temporal aggregation of daily mean NO2 has been applied to the partial autocorrelation function (PACF). The PACF measures the autocorrelation at a particular lag after accounting for autocorrelation at all lower lags. Overall, both time periods present partial autocorrelations that alternate between positive and negative values and decay to zero. Significant correlations are present only at the first lag, followed by correlations that are not significant (figure 5.12).

Figure 5.12 Partial temporal autocorrelation function (PACF) for 2019 and lockdown period

5.4 Space-time Semivariogram

The space-time semivariogram calculates the semivariance at intervals in space and time. It aims to examine how semivariance varies with increasing spatial and temporal separations between observations. Figure 5.13 displays a spatio-temporal variogram in two dimensions (2D) of NO2 observations, with the x-axis representing the spatial distance lag and the y-axis the time lags aggregated in days, for two periods: full year 2019 and the lockdown period in 2020.
  • 42. 42 It can be seen that the semivariance increases rapidly with increasing temporal lags in 2019. The increase with spatial lags is less dramatic. A different pattern is observed for the lockdown. The increase in semivariance with increasing temporal lag is less intense and less homogeneous. Overall, the lockdown presents lower semivariance values. Figure 5.13 2D semivariogram for NO2 in Inner London (panels: FY 2019; Lockdown 2020) A 3D version of the plot is shown in figure 5.14. Again, a less homogeneous pattern with lower semivariance values can be noted in the lockdown.
  • 43. 43 Figure 5.14 3D semivariogram for NO2 in Inner London (panels: FY 2019; Lockdown 2020) 5.5 Spatial Interpolation In geostatistics the spatial correlation is modelled by the variogram instead of a correlogram or covariogram. A large variety of geographic information system (GIS) packages and geostatistical software is available. Gstat[12] has been selected for the present study since it offers considerable flexibility in the modelling and display process. The first step is to examine the data distribution to investigate normality and trends. In kriging interpolation, a normal distribution is required in order to provide a best unbiased predictor of values at unsampled points. If the data depart from normality, it may be necessary to transform the values prior to analysis. Figure 5.1 (in the first section of the chapter) outlines the frequency of daily mean NO2 by season, which demonstrates a distribution close to normal in all seasons. Figure 5.2 also displays the distribution of daily concentration of NO2 in 2019 and during the enforced lockdown period. It can be seen that in both periods the distribution does not diverge substantially from the straight line indicating a normal distribution. Thus, no transformation needs to be carried out. Prior to modelling the empirical variogram, the spatial correlation is explored with a variogram cloud. The variogram cloud is obtained by plotting all possible squared differences of observation pairs against their separation distance. Figure 5.15 depicts the semivariogram cloud for the full year (FY) 2019 and the lockdown period in 2020. The plots [12] Full information on Gstat is available at https://cran.r-project.org/web/packages/gstat/index.html.
  • 44. 44 show a lot of scatter, but some increase in the maximum values can be noted for distances up to 2,000 m in 2019. Variation in the lockdown is less regular and consistent than in 2019. Figure 5.15 Semivariogram cloud plot for NO2 data (panels: FY 2019; Lockdown period 2020) To measure how variance increases as a function of the distance between stations, the empirical semivariogram has been calculated based on successive distance bands (figure 5.16) with a cutoff value of 8,000 metres.
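The variogram cloud and the banded empirical semivariogram described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the Gstat computation: the cloud stores the half squared difference of each station pair, the 8,000 m cutoff follows the text, and the number of distance bands, the function names and the station values are hypothetical.

```python
import math


def variogram_cloud(points):
    """All station pairs as (distance, 0.5 * squared difference).

    points: list of (x, y, value) tuples, coordinates in metres.
    """
    cloud = []
    for i in range(len(points)):
        xi, yi, zi = points[i]
        for j in range(i + 1, len(points)):
            xj, yj, zj = points[j]
            h = math.hypot(xi - xj, yi - yj)
            cloud.append((h, 0.5 * (zi - zj) ** 2))
    return cloud


def empirical_semivariogram(cloud, cutoff, n_bins):
    """Average the cloud within successive distance bands up to `cutoff`.

    Returns (mean distance, mean semivariance, pair count) per non-empty band.
    """
    width = cutoff / n_bins
    dist_sums = [0.0] * n_bins
    gamma_sums = [0.0] * n_bins
    counts = [0] * n_bins
    for h, g in cloud:
        if h >= cutoff:
            continue                      # pairs beyond the cutoff are ignored
        b = int(h // width)
        dist_sums[b] += h
        gamma_sums[b] += g
        counts[b] += 1
    return [
        (dist_sums[b] / counts[b], gamma_sums[b] / counts[b], counts[b])
        for b in range(n_bins)
        if counts[b] > 0
    ]


# Hypothetical stations (x, y in metres, value in arbitrary units)
stations = [(0.0, 0.0, 1.0), (3000.0, 4000.0, 3.0), (0.0, 6000.0, 2.0)]

cloud = variogram_cloud(stations)
bins = empirical_semivariogram(cloud, cutoff=8000.0, n_bins=4)
for mean_h, mean_gamma, n_pairs in bins:
    print(f"{mean_h:8.1f} m  gamma = {mean_gamma:.3f}  pairs = {n_pairs}")
```

Narrower bands follow the shape of the variogram more closely but leave fewer pairs per band, which matters for a sparse station network.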
  • 45. 45 Figure 5.16 Empirical semivariogram for NO2 data (panels: FY 2019; Enforced lockdown period 2020) Figure 5.16 indicates that for the full year 2019, the semivariance stops increasing at close to 1,000 m. At this range a plateau in the semivariance values is reached and the increase in semivariance levels off. In 2020, during the enforced lockdown period, the semivariance values do not level off. The lack of consistency or of a fixed pattern in the semivariance values indicates a non-transitive variogram pattern. To verify whether the variogram is of constant form in all directions, an alternative approach would be to divide the lag band into a series of discrete segments and then calculate separate
  • 46. 46 variograms for each direction. The results are displayed in figure 5.17. In many instances the variogram is not of constant form in all directions, indicating a property known as anisotropy. Figure 5.17 Directional variogram (panels: FY 2019; Enforced lockdown period 2020) Here, directions have been classified into four direction intervals: 0, 45, 90 and 135 degrees. The first plot gives the variogram in the zero direction, which is North; 90 degrees is East. It is clear that the semivariance does not present a clear directional pattern,
  • 47. 47 which would indicate anisotropy. In both periods, the semivariogram is independent of the direction, indicating a semivariance uniformity in all orientations; thus anisotropy does not need to be modelled and the process can be assumed to be isotropic. For the semivariance modelling this means that the point pairs are merged on the basis of distance, not direction. 5.5.1 Semivariogram Modelling To estimate values of NO2 at unknown locations, the observed pattern of variation with distance must be modelled. The traditional way of finding a suitable variogram model is to fit a parametric model to the empirical variogram. The model is then used to calculate the weights in the kriging equations. In order to facilitate the interpolation process and gain better insights from it, three different time-frames have been applied to the semivariogram modelling: 1. full year 2019; 2. the same period last year (SPLY) as the enforced lockdown period in 2020 and 3. the enforced lockdown period of 2020. To fit the variogram model to the empirical semivariogram several steps have been taken: 1. Model and test different model functions (e.g. exponential or Gaussian) 2. Choose suitable initial values for partial sill(s), range(s), and nugget 3. Fit this model, using one of the fitting criteria. Table 5.3 presents the fitting accuracy results for each model function applied to the modelling process. Residuals were calculated using ordinary least squares.
Table 5.3 Fitting accuracy of the semivariance modelling
Model | Full Year 2019 | SPLY of Lockdown (March 23rd to June 6th, 2019) | Lockdown (March 23rd to June 6th, 2020)
Circular | 3.506397e-09 | 2.057457e-09 | 2.03075e-08
Exponential | 5.607275e-09 | 3.139532e-09 | 2.605411e-07
Gaussian | 2.374513e-09 | 1.895494e-09 | 1.983014e-08
Linear | 2.910855e-09 | 1.881238e-09 | 1.970009e-08
Matern | 5.607275e-09 | 3.139532e-09 | 2.605411e-07
Pentaspherical | 3.674069e-09 | 2.288446e-09 | 2.112738e-08
Spherical | 3.293529e-09 | 2.172618e-09 | 2.072471e-08
  • 48. 48 According to the fitting results, the exponential model presents the best fitting accuracy. Figure 5.18 presents the results of the semivariogram modelling. Figure 5.18 Semivariance with fitted model function (panels: FY 2019, exponential; SPLY of Lockdown, exponential; Lockdown 2020, exponential)
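The fitting steps listed above (choose a model function, choose starting values, fit against the empirical semivariogram) can be sketched as follows. This is a simplified stand-in for the fitting routine, not the one used in the study: it scores candidate models by the unweighted sum of squared residuals over a coarse parameter grid. The model formulas use the parameterisation gamma(h) = nugget + sill * (1 - exp(-h / range)) for the exponential case; the grid values, the function names and the synthetic empirical semivariogram are assumptions made for illustration.

```python
import math


def exponential(h, nugget, psill, rng):
    return nugget + psill * (1.0 - math.exp(-h / rng))


def gaussian(h, nugget, psill, rng):
    return nugget + psill * (1.0 - math.exp(-((h / rng) ** 2)))


def spherical(h, nugget, psill, rng):
    if h >= rng:
        return nugget + psill             # the sill is reached at the range
    r = h / rng
    return nugget + psill * (1.5 * r - 0.5 * r ** 3)


def sse(model, params, empirical):
    """Unweighted sum of squared residuals between model and empirical values."""
    return sum((g - model(h, *params)) ** 2 for h, g in empirical)


def grid_fit(model, empirical, psills, ranges, nugget=0.0):
    """Return (error, partial sill, range) of the best grid candidate."""
    best = None
    for ps in psills:
        for r in ranges:
            err = sse(model, (nugget, ps, r), empirical)
            if best is None or err < best[0]:
                best = (err, ps, r)
    return best


# Synthetic empirical semivariogram (distance in metres, semivariance),
# generated from an exponential model purely for illustration
empirical = [(float(h), exponential(float(h), 0.0, 0.025, 500.0))
             for h in range(250, 8001, 500)]

candidate_psills = [0.02, 0.025, 0.03]
candidate_ranges = [300.0, 500.0, 700.0]

results = {
    name: grid_fit(fn, empirical, candidate_psills, candidate_ranges)
    for name, fn in [("exponential", exponential),
                     ("gaussian", gaussian),
                     ("spherical", spherical)]
}
for name, (err, ps, r) in results.items():
    print(f"{name:12s} SSE = {err:.3e}  sill = {ps}  range = {r}")
```

Under this criterion a lower residual sum indicates a closer fit, so the candidate models can be ranked directly by the reported error.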
  • 49. 49 Finally, the sill, range and nugget from the fitted model are presented in table 5.4.
Table 5.4 Fitting accuracy of the semivariance modelling
Parameter | Full Year 2019 | SPLY of Lockdown (March 23rd to June 6th, 2019) | Lockdown (March 23rd to June 6th, 2020)
Nugget (n) | 0 | 0 | 0
Range (r) | 516.3186 | 504.0462 | 430.1904
Sill (s) | 0.02522716 | 0.02922275 | 0.04954327
5.5.2 Spatial Prediction with Kriging The interpolation model is applied to predict unknown quantities of NO2 based on sample data and assumptions regarding its variance and spatial correlation. To fit the kriging model, the semivariogram produced in the previous section has been used. A matrix of distances between points has been calculated and applied to the variogram function. The kriging model applied in the exercise is Ordinary Kriging (OK), one of the most widespread procedures of this type offered in GIS packages. Figure 5.19 Spatial Prediction with Kriging
  • 50. 50 [Figure 5.19 panels: FY 2019; SPLY of Lockdown; Lockdown, 2020]
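The ordinary kriging step can be illustrated with a minimal sketch that builds the matrix of station-to-station semivariances, adds the unbiasedness constraint and solves for the weights. It is not the Gstat implementation. The sill and range are the FY 2019 values reported in table 5.4 and the exponential form gamma(h) = sill * (1 - exp(-h / range)) is assumed; the station coordinates, the NO2 values, the target location and the function names are hypothetical.

```python
import math


def exp_variogram(h, psill, rng, nugget=0.0):
    """Exponential semivariogram; zero at zero separation."""
    if h == 0.0:
        return 0.0
    return nugget + psill * (1.0 - math.exp(-h / rng))


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ordinary_kriging(stations, target, psill, rng):
    """Return (prediction, kriging variance, weights) at `target`.

    stations: list of (x, y, value); target: (x, y); coordinates in metres.
    """
    n = len(stations)

    def dist(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])

    # Semivariances between stations, bordered by the sum-to-one constraint
    a = [[exp_variogram(dist(stations[i], stations[j]), psill, rng)
          for j in range(n)] + [1.0] for i in range(n)]
    a.append([1.0] * n + [0.0])
    # Semivariances between each station and the prediction location
    b = [exp_variogram(dist(s, target), psill, rng) for s in stations] + [1.0]
    solution = solve(a, b)
    weights, mu = solution[:n], solution[n]   # mu is the Lagrange multiplier
    prediction = sum(w * s[2] for w, s in zip(weights, stations))
    variance = sum(w * g for w, g in zip(weights, b[:n])) + mu
    return prediction, variance, weights


# Hypothetical stations (x, y in metres, NO2 in ug/m3)
stations = [(0.0, 0.0, 38.0), (400.0, 0.0, 34.0),
            (0.0, 300.0, 41.0), (500.0, 500.0, 30.0)]

pred, var, weights = ordinary_kriging(stations, (200.0, 150.0),
                                      psill=0.02522716, rng=516.3186)
print(f"prediction = {pred:.2f}, kriging variance = {var:.5f}")
```

Repeating the prediction over a regular grid of target locations produces an interpolated surface of the kind mapped in figure 5.19. Because the weights are constrained to sum to one, the prediction at a station location reproduces the observed value when the nugget is zero.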
  • 51. 51 From figure 5.19 we can first point out that the NO2 interpolation maps for the full year 2019 and for the same period as the lockdown in 2019 are comparatively close. Overall, they share the same arrangement of areas with higher and lower pollution levels. Interestingly, during the enforced lockdown in 2020, the picture is quite different in the central and east zones. The sharp fall in commuting to work seems to have had a positive effect on air quality over these areas.
  • 52. 52 Chapter 6. Analysis and Discussion In this research, we have applied a geostatistical approach to study NO2 pollutant concentration in inner London during the Covid-19 lockdown and to compare the results to pre-lockdown periods. The study demonstrates that coupling pollution data analysis and spatial modelling can provide insightful information and improve the understanding of the impact of Covid-19 on NO2 emissions. There are a number of important lessons to be learned from NO2 emissions during the Covid-19 lockdown in London. Firstly, the spatial interpolation with kriging demonstrates a common pattern of NO2 levels in the full year 2019 and in the same period as the lockdown in 2019 (figure 5.19). The zone comprising the London Congestion Charge Zone in central London and the inner west and south share the same pattern, standing out as more polluted areas. The lockdown has substantially changed this picture, with an area of lower levels of pollution encircling central London. In contrast, the north and some inner east areas change pattern, becoming zones with higher levels of pollution. In all scenarios, the Ultra Low Emission Zone (ULEZ), which aims to help improve air quality in central London, persists as a zone of higher levels of pollution. It can also be inferred that monitoring stations in this area have systematically higher pollutant concentrations than others (see figure 5.5). Note further that prior to the interpolation process with kriging, the empirical semivariance for 2019 and the lockdown period indicates a semivariance uniformity in all orientations. Thus, the semivariance modelling assumes an isotropic pattern for both periods. The study also demonstrates that NO2 levels exhibit temporal patterns. Weekends present lower concentrations of NO2 than weekdays. This pattern is also present in the lockdown period. This can be explained by less road congestion on weekends. Short-term variations are also present in the data.
Dynamics in journey times are clearly present in all periods. Intraday intervals computed at hourly resolution demonstrate that diurnal fluxes of NO2 have a homogeneous trend. Between 6:00 and 10:00 am, fluxes of NO2 rise. Spring 2019 and the lockdown, which cover coinciding months, display a very similar pattern. Evening peaks are less pronounced during the lockdown, which indicates less traffic congestion resulting from the enforced restrictions on mobility.
  • 53. 53 For the pre-lockdown period, the interval from October to December 2019 is the most polluted and the interval from March to June presents lower NO2 emissions. This suggests that, in general, warmer months display lower NO2 levels in London. Clearly, urban air pollution in London has been positively affected by the lockdown. NO2 levels in inner London fell from a daily mean of 37 µg/m3 in 2019 to 32 µg/m3 for the period of March 23rd to June 6th 2020. Similarly, if we compare the same periods, the data also demonstrate a decrease in pollution, since the same period as the lockdown in 2019 displays 34 µg/m3. All seasons in 2019 present a higher concentration of NO2 than the lockdown period: 38 µg/m3 in winter, 34 µg/m3 in spring, 36 µg/m3 in summer and 40 µg/m3 in autumn. The results also demonstrate that the overall values of NO2 meet both the UK[13] and the World Health Organisation (WHO) air quality guideline limits, which state that NO2 levels above 40 µg/m3 are harmful to people. Nevertheless, days exceeding this limit were also recorded, with daily levels up to 64 µg/m3 in winter and summer 2019. The Covid-19 lockdown period also exceeded that limit, with levels up to 53 µg/m3. For this reason, the Mayor of London's air pollution governance scheme recognises that London’s environment is improving, but that it still faces major challenges (Mayor of London, 2018). The empirical results offered by the present study should serve future research, but limitations must also be noted. Firstly, the results should be interpreted with caution given the limited time period analysed. Future studies would benefit from longer time series in order to reinforce the analysis. Secondly, while the network of location-based monitoring stations provides a straightforward application for the analysis of urban NO2 exposure, it has considerable limitations. Such fixed-location data estimates ignore the impact of individual mobility patterns.
Detailed individual-level activity data from GPS-enabled monitoring devices would substantially improve the accuracy of the exposure data and ultimately enhance the analysis of environmental pollution in London. Finally, to improve the reliability and accuracy of the interpolation process, a more regular distribution of sensors would be necessary. The network of monitoring stations presents a low [13] The legislation for air quality objectives was defined by the UK Air Quality Strategy (AQS) for England, Scotland, Wales and Northern Ireland in January 2000. The UK Government’s strategy for air quality management requires all local authorities to carry out regular assessments of air quality in their areas.
  • 54. 54 density and an irregular spatial distribution. Spatial correlation analysis would benefit from a better spatial arrangement of sensors.
  • 55. 55 Chapter 7. Recommendations for Further Studies Looking forward, many opportunities can be explored to provide important insights into the impact of the Covid-19 pandemic on pollutant emissions in London. A good research strategy would be to refine the air pollution modelling with other sources of data and gain a more detailed representation. Air pollution is not associated only with anthropogenic systems. Meteorological conditions such as wind speed and direction, ambient temperature, solar radiation and atmospheric stability matter too, and including them would provide a more accurate measurement of pollutant exposure. Coupling emission inventories and dispersion modelling can provide valuable insights for pollution monitoring studies. Another possibility is to apply regression kriging, combining values of pollutants with additional variables such as distance to roadside locations. The relationship between the dependent variable and some independent variables could then be modelled using a linear regression model. Additionally, ambient measurements could also focus on components directly related to vehicle exhaust, such as black carbon and particulate matter (PM) components. Predictive analysis based on spatio-temporal statistical modelling would also be of great interest. The main advantage of this technique is that observations taken at other times can be included in the spatio-temporal interpolation, increasing prediction accuracy. Extending spatial statistics to include the time dimension implies modelling variability in space and time, which requires greater statistical sophistication than purely spatial or purely temporal modelling. Finally, an interesting research path would be to forecast post-Covid-19 scenarios for ambient air in the city. This would certainly provide important insights for ambient air policy and governance in London.
  • 56. 56 References
Beevers, S.D., Kitwiroon, N., Williams, M.L., Kelly, F.J., Anderson, H.R. and Carslaw, D.C., 2013. Air pollution dispersion models for human exposure predictions in London. Journal of Exposure Science & Environmental Epidemiology, 23(6), pp.647-653.
Bivand, R.S., Pebesma, E.J. and Gómez-Rubio, V., 2008. Applied spatial data analysis with R (pp. 237-268). New York: Springer.
Burrough, P.A., McDonnell, R.A. and Lloyd, C.D., 2015. Principles of geographical information systems. Oxford University Press.
Burrough, P.A. and McDonnell, R.A., 1998. Creating continuous surfaces from point data. Principles of Geographic Information Systems. Oxford University Press, Oxford, UK.
Chen, D., Ou, T., Gong, L., Xu, C.Y., Li, W., Ho, C.H. and Qian, W., 2010. Spatial interpolation of daily precipitation in China: 1951–2005. Advances in Atmospheric Sciences, 27(6), pp.1221-1232.
Cheng, T. and Haworth, J., 2019. Spatio-Temporal Data Analysis and Big Data Mining. University College London.
Chiles, J.-P. and Delfiner, P., 2012. Geostatistics: modeling spatial uncertainty, 2nd ed. Wiley series in probability and statistics. Wiley, Hoboken, N.J.
De Smith, M.J., Goodchild, M.F. and Longley, P., 2007. Geospatial analysis: a comprehensive guide to principles, techniques and software tools. Troubador Publishing Ltd.
Elliot, P., Wakefield, J.C., Best, N.G. and Briggs, D.J., 2000. Spatial epidemiology: methods and applications. Oxford University Press.
Esri, 2019. Geostatistics and the Semivariogram. Available from: https://storymaps.arcgis.com/stories/5ad922df3a724149ab55d8054bec4970 [Accessed 16th August 2020].
Fotheringham, A.S., Brunsdon, C. and Charlton, M., 2000. Quantitative geography: perspectives on spatial data analysis. Sage.
Gualtieri, G. and Tartaglia, M., 1998. Predicting urban traffic air pollution: a GIS framework. Transportation Research Part D: Transport and Environment, 3(5), pp.329-336.
Goodchild, M.F., 1986. Autocorrelation: Concepts and Techniques in Modern Geography. Norwich, UK: Geo Books.
Haworth, J., 2018. Spatial Analysis and GeoComputation Lecture, Lecture 5: Spatial Interpolation.
Haworth, J., 2018. Spatial Analysis and GeoComputation Lecture, Lecture 2: Exploratory (Spatial) Data Analysis.
Haworth, J., 2018. Spatial Analysis and GeoComputation: A tutorial guide. Unpublished manuscript.
Jerrett, M., Arain, A., Kanaroglou, P., Beckerman, B., Potoglou, D., Sahsuvaroglu, T., Morrison, J. and Giovis, C., 2005. A review and evaluation of intraurban air pollution exposure models. Journal of Exposure Science & Environmental Epidemiology, 15(2), pp.185-204.
Mayor of London, 2018. London Environment Strategy: implementation plan. Mayor of London, UK. Available from: https://www.london.gov.uk/sites/default/files/implementation_plan.pdf [Accessed 28th August 2020].
  • 57. 57
ONS, 2012. Census: Population and household estimates for England and Wales, March 2011. Statistical Bulletin, Office for National Statistics, UK. Available from: http://www.ons.gov.uk/ons/dcp171778_270487.pdf
Oxley, T., Valiantis, M., Elshkaki, A. and ApSimon, H.M., 2009. Background, road and urban transport modelling of air quality limit values (the BRUTAL model). Environmental Modelling & Software, 24(9), pp.1036-1050.
Pebesma, E. and Heuvelink, G., 2016. Spatio-temporal interpolation using gstat. The R Journal, 8(1), pp.204-218.
Penn State University, 2020. Applied Time Series Analysis. Available from: https://online.stat.psu.edu/stat510/ [Accessed 21st August 2020].
The Economist, 2020. Air pollution is returning to pre-Covid levels. Available from: https://www.economist.com/graphic-detail/2020/09/05/air-pollution-is-returning-to-pre-Covid-levels [Accessed 18th July 2020].
The Guardian, 2020. Coronavirus detected on particles of air pollution. Available from: https://www.theguardian.com/environment/2020/apr/24/coronavirus-detected-particles-air-pollution [Accessed 12th September 2020].
Singleton, A., Spielman, S. and Folch, D., 2018. Urban Analytics. Spatial Analytics; GIS Series.
Steinle, S., Reis, S. and Sabel, C.E., 2013. Quantifying human exposure to air pollution - Moving from static monitoring to spatio-temporally resolved personal exposure assessment. Science of the Total Environment, 443, pp.184-193.