Mapping urban air pollution using GIS: a regression-based approach

International Journal of Geographical Information Science
ISSN: 1365-8816 (Print) 1362-3087 (Online) Journal homepage: http://www.tandfonline.com/loi/tgis20
Mapping urban air pollution using GIS: a
regression-based approach
DAVID J. BRIGGS , SUSAN COLLINS , PAUL ELLIOTT , PAUL FISCHER , SIMON
KINGHAM , ERIK LEBRET , KAREL PRYL , HANS VAN REEUWIJK , KIRSTY
SMALLBONE & ANDRE VAN DER VEEN
To cite this article: DAVID J. BRIGGS , SUSAN COLLINS , PAUL ELLIOTT , PAUL FISCHER ,
SIMON KINGHAM , ERIK LEBRET , KAREL PRYL , HANS VAN REEUWIJK , KIRSTY
SMALLBONE & ANDRE VAN DER VEEN (1997) Mapping urban air pollution using GIS: a
regression-based approach, International Journal of Geographical Information Science, 11:7,
699-718, DOI: 10.1080/136588197242158
To link to this article: https://doi.org/10.1080/136588197242158
Published online: 29 Jun 2010.

Int. J. Geographical Information Science, 1997, vol. 11, no. 7, 699–718
Research Article
Mapping urban air pollution using GIS: a regression-based approach
DAVID J. BRIGGS¹, SUSAN COLLINS², PAUL ELLIOTT³, PAUL FISCHER⁵, SIMON KINGHAM⁴,
ERIK LEBRET⁵, KAREL PRYL⁶, HANS VAN REEUWIJK⁵, KIRSTY SMALLBONE⁷
and ANDRE VAN DER VEEN⁵
¹ Nene Centre for Research, Nene College Northampton, Northampton NN2 7AL, England, UK
² Sheffield Centre for Geographic Information and Spatial Analysis, University of Sheffield, Sheffield S10 2TN, England, UK
³ Department of Epidemiology and Public Health, Imperial College Medical School at St. Mary's, London W2 1PG, England, UK
⁴ Institute of Environmental and Policy Analysis, University of Huddersfield, Huddersfield, HD1 1RA, England, UK
⁵ RIVM, 1 Antonievanhoeklaan, 3720 BA Bilthoven, The Netherlands
⁶ National Institute of Hygiene, Warsaw, Poland
⁷ Department of Geography, University of Brighton, Brighton, BN2 4AT, England, UK
(Received 8 June 1996; accepted 23 December 1996)
Abstract. As part of the EU-funded SAVIAH project, a regression-based methodology
for mapping traffic-related air pollution was developed within a GIS
environment. Mapping was carried out for NO2 in Amsterdam, Huddersfield and
Prague. In each centre, surveys of NO2, as a marker for traffic-related pollution,
were conducted using passive diffusion tubes, exposed for four 2-week periods. A
GIS was also established, containing data on monitored air pollution levels, road
network, traffic volume, land cover, altitude and other, locally determined, features.
Data from 80 of the monitoring sites were then used to construct a regression
equation, on the basis of predictor environmental variables, and the resulting
equation used to map air pollution across the study area. The accuracy of the
map was then assessed by comparing predicted pollution levels with monitored
levels at a range of independent reference sites. Results showed that the map
produced extremely good predictions of monitored pollution levels, both for
individual surveys and for the mean annual concentration, with r² ~ 0.79–0.87
across 8–10 reference points, though the accuracy of predictions for individual
survey periods was more variable. In Huddersfield and Amsterdam, further monitoring
also showed that the pollution map provided reliable estimates of NO2
concentrations in the following year (r² ~ 0.59–0.86 for n=20).
1. Introduction
Despite the major improvements in air quality seen in many European cities
over the last 30–40 years, the problem of urban air pollution remains. Levels of
traditional pollutants, such as smoke and sulphur dioxide (SO2), have declined as a
result of industrial restructuring, technological changes and pollution control, but
the rapid growth in road traffic has given rise to new pollutants and new concerns.
Between 1970 and 1990, for example, passenger car transport in Europe increased
by ca. 3.4 per cent per annum, and car ownership in East European countries is
currently rising by 7–12 per cent per year (Stanners and Bordeau 1995). As a result
of these increases, emissions of many pollutants are growing: in the U.K., emissions
of nitrogen dioxide (NO2), of which 45–50 per cent is derived from the transport
sector, rose by 120 per cent between 1970 and 1990, before declining slightly; emissions
of volatile organic compounds (VOCs), of which ca. 40 per cent derives from
transport, rose by 73 per cent (Department of the Environment 1995). For the future,
these trends seem set to continue. In the absence of major shifts in policy, a further
doubling of both passenger and freight transport in Europe is anticipated by the
year 2010 (Stanners and Bordeau 1995). Notwithstanding the effects of improvements
in engine design and fuel technology, this is likely to lead to at least the maintenance
of current emission levels in most West European countries. In East Europe, where
the rates of increase are higher, emissions may be expected to grow.
Against this background, there has been heightening concern about the health
effects of traffic-related pollution. Several factors contribute to this concern. One is
the simple arithmetic of exposure. In Europe, for example, about 70 per cent of the
population is classified as urbanized (UNEP 1993), while Flachsbart (1992) estimates
that between 9.5 and 18 million people spend a considerable part of their working
day at or near roadsides. Equally, urban areas are estimated to account for the
major proportion of emissions. The potential for human exposure to traffic-related
pollution is therefore large. Secondly, there is growing evidence from epidemiological
studies of a relationship between air pollution and respiratory illness and mortality
(e.g., Schwartz 1993, 1994) and of increased levels of respiratory symptoms in people
living close to major roads, or in areas of high traffic density (e.g., Edwards et al.
1994, Ishizaki et al. 1987, Weiland et al. 1994, Wjst et al. 1993). In a study of 1000
adults in Oslo, NILU (1991) also found a positive association between self-reported
symptoms of cough and chronic bronchitis and modelled levels of air pollution at
the place of residence. At the same time, there has been an apparent increase in
levels of respiratory illness, particularly asthma, in vulnerable groups such as children
and the old (Anderson et al. 1994, Burney 1988, Burney et al. 1990, Haahtela
et al. 1990).
In the light of these concerns, there is clearly a need for improved information
on levels of traffic-related air pollution and their potential links to human health.
This information is required for a wide range of purposes: to help investigate the
relationships involved, as inputs to health risk assessment, to assist in establishing
and monitoring air quality standards, and to help evaluate and compare transport
policies and plans. For all these purposes, information is needed not only on the
temporal trends in air pollution (as, for example, provided by data from fixed-site
monitoring stations), but also on geographical variations. Maps are needed, for
example, to identify pollution 'hot-spots', to define at-risk groups, to show changes
in spatial patterns of pollution resulting from policy or other interventions, and to
provide improved estimates of exposure for epidemiological studies.
Mapping urban air pollution nevertheless faces many problems. The complex
geography of emission sources and the equal complexity of dispersion processes in
an urban environment mean that levels of air pollution typically vary over extremely
short distances, often no more than a few tens of metres (e.g., Hewitt 1991). On the
other hand, data on both emission sources and pollution levels are often sparse. As
a result, maps of urban air pollution tend to be highly generalized, and estimates of
exposure to air pollutants subject to serious misclassification.
The development of GIS techniques, however, offers considerable potential to
improve upon this situation. Digital data on urban road networks, for example, are
now becoming increasingly available, providing a valuable data source for pollution
modelling. The spatial analysis and overlay techniques available in GIS also provide
powerful tools for pollution mapping. This paper describes and evaluates a
regression-based approach to air pollution mapping, developed as part of the
EU-funded SAVIAH (Small Area Variations in Air quality and Health) project. This
study was a multi-centre project, involving collaborators in London and Huddersfield
(U.K.), Bilthoven (Netherlands), Prague (Czech Republic) and Warsaw (Poland).
The aim of the study was to develop and validate methods for analysing relationships
between air pollution and health at the small area scale. A description of the overall
SAVIAH study is given by Elliott et al. (1995).
2. Approaches to pollution mapping
Traditionally, two general approaches to air pollution mapping can be identified:
spatial interpolation and dispersion modelling (Briggs 1992). The former uses statistical
or other methods to model the pollution surface, based upon measurements at
monitoring sites. With the development of GIS and geostatistical techniques in recent
years, a wide range of spatial interpolation methods have now become available.
Burrough (1986) divides these into global methods (e.g., trend surface analysis),
which fit a single surface on the basis of the entire data set, and local methods (e.g.,
moving window methods, kriging, spline interpolation) in which a series of local
estimates are made, based on the nearest data points. Recently, particular attention
has tended to focus on kriging in its various forms (e.g., Oliver and Webster 1990,
Myers 1994). Nevertheless, despite a number of studies comparing this with other
techniques (e.g., Abbass et al. 1990, Dubrule 1984, Laslett et al. 1987, von Kuilenburg
et al. 1982, Weber and Englund 1992, Knotters et al. 1995), there is no clear consensus
to suggest that any one approach is universally optimal. Instead, performance of the
various methods tends to vary depending upon the character of the underlying
spatial variation being modelled, and the specific characteristics of the data concerned
(e.g., sampling density, sampling distribution).
A number of these interpolation methods have found applications in pollution
mapping, albeit mainly at a relatively broad, regional scale. Linear interpolation, for
example, has been widely used to derive contour maps of pollution surfaces on the
basis of point measurements (e.g., Archibold and Crisp 1983, Muschett 1981). Kriging
in its various forms has been used to map national patterns of NO2 concentrations
(Campbell et al. 1994), acid precipitation (Venkatram 1988, Schaug et al. 1993) and
ozone concentrations (Lefohn et al. 1988, Liu et al. 1995), and to help design
continental-scale monitoring networks (Haas 1992). Wartenberg (1993) also reports
the use of kriging to estimate and map exposures to groundwater pollution and
microwave radiation. Mapping of traffic-related pollution in urban areas, however,
potentially faces far more severe difficulties. First and foremost is the inherent
complexity of the pollution surfaces involved. Within an urban area, emissions may
derive from a large number of intersecting line sources. The distance decay of
pollution levels away from these sources is also rapid, and greatly affected by local
meteorological and topographical conditions. Marked variations in pollutant levels
can thus occur over distances of less than 100 metres in urban areas (e.g., Hewitt
1991). In contrast, the density and distribution of most monitoring networks is
generally poor. The number of stations monitoring pollutants such as NO2, VOCs
and fine particulates on a routine basis is generally small, and the location of these
stations is often biased towards specific pollution environments. As a consequence,
the existing monitoring networks provide only a limited picture of spatial patterns
of urban air pollution, potentially biased estimates of trends, and poor indications
of human exposure. Even where purpose-designed surveys can be conducted (e.g.,
using passive samplers), constraints of time and cost severely limit the sampling
density which is possible.
The main alternative to interpolation is the use of dispersion modelling techniques.
This involves constructing a dynamic model of the dispersion processes,
taking account of all the main factors which influence the ultimate pollution concentration
field. It is an approach which has been most widely used in relation to point
sources, but several models have also been built for road traffic pollution: examples
include the CALINE models, developed on behalf of the US EPA (Benson 1992),
the CAR model, developed in the Netherlands (Eerens et al. 1992), the Highways
Agency Design Manual for Roads and Bridges (DMRB) model, and the ADMS
model (which has been developed from DMRB by Cambridge Environmental
Research Consultants Ltd).
Dispersion modelling has much to commend it as a basis for air pollution
mapping in that it attempts to reflect the processes of dispersion and can relatively
easily be adapted to new pollutants or areas, without the need for additional monitoring.
On the other hand, it also has a number of important constraints. Amongst
the most serious are the relatively severe data demands of most dispersion models:
typically, data are required not only on the distribution of the road network, but
also traffic volumes and composition, traffic speed, emission factors for all main
classes of vehicle, street characteristics (e.g., road width, building height or type), and
meteorological conditions (e.g., wind speed, wind direction, atmospheric stability,
mixing height). Rarely are these data available for a sufficiently dense network of
locations in an urban area, with the result that considerable data extrapolation often
has to occur. Line dispersion models also provide estimates of pollution concentrations
only within the immediate vicinity of the roadway (up to a distance of only
35 metres in the CAR model, for example, and ca. 200 metres for the CALINE
models). This means that they do not easily provide information on variations in
background concentrations, at greater distances from major roads. In addition,
unlike spatial interpolation techniques, there is as yet little progress either in integrating
dispersion models into GIS, or coupling the two technologies (though the ADMS
model does have a limited interface with GIS).
Against this background, there is clearly a need to develop more practicable
techniques of air pollution mapping, which can make use of the capability offered
by GIS, and extract the maximum amount of information from the different data
sets which are available within urban areas. Regression-mapping offers particular
potential in this respect. This involves using least squares regression techniques to
generate predictive models of the pollution surface, based on a combination of
monitored pollution data and exogenous information. The technique is widely used
for exploratory and explanatory investigations, and is also used to help classify
remote sensing imagery. Regression methods have only rarely been used, however,
for mapping purposes. Examples include the development of regression-based
models of road salt contamination (Mattson and Godfrey 1994), air pollution (Wagner
1995) and soil depth (Knotters et al. 1995).

3. The SAVIAH regression method
3.1. Pollution monitoring
As part of the SAVIAH study, surveys of nitrogen dioxide pollution were carried
out in four study areas: Huddersfield (U.K.), Amsterdam (Netherlands), Prague
(Czech Republic) and Poznan (Poland). NO2 was selected as the pollutant of concern
both because it is considered to provide a good marker of traffic-related pollution
and because of its relative ease of measurement, using low-cost passive sampling
devices. A summary of the study areas and the sampling design is given in table 1;
because of differences in sampling regime, results from the Poznan study are not
reported here.
In each area, a pilot survey was initially undertaken in June 1993, during which
two different samplers, Palmes tubes and Willems badges, were compared (van
Reeuwijk et al. 1995) and sampling strategies assessed. Thereafter, three surveys were
carried out, in October 1993, February/March 1994 and May/June 1994. On each
of these three occasions Palmes tubes were exposed for a period of two weeks at a
fixed set of 80 core monitoring sites in each study area. In each area, 40 so-called
variable sites were also established (i.e., sites which were moved from one survey to
another), to provide further insight into local patterns of variation and to provide
additional validation of the results. In addition, a series of 8–10 reference sites in
each area was monitored continuously on a monthly basis over the study period, to
provide independent estimates of the mean annual pollution concentration and to
act as reference sites for validation purposes. In Huddersfield and Amsterdam only,
monitoring was also continued at a limited number of sites over the following year,
in order to allow the stability of the pollution surface to be analysed.
In each case, duplicate samplers were exposed at each site in order to provide a
measure of precision and to give back-up in the case of damage or loss to one tube.
Despite these precautions, however, loss or damage to ca. 10 per cent of samplers
occurred, resulting in gaps within the full data set. For this reason, estimates of mean
annual pollution at each of the 80 'core' sites were made using a mixed-effect model,
with terms for measurement error and site and survey effects (Lebret et al. 1995).
Significantly, when compared to the measured means at the 'reference' sites, these
were found to provide good estimates of the true mean, with an r² value across all
study areas of 0.95 (n=28) and a standard error of the estimate of 5.02 μg m⁻³.
These modelled results were therefore used in all subsequent analysis. Mean concentrations
for core sites in Huddersfield and Amsterdam are shown in figures 1 and 2.
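The role of the site and survey terms can be illustrated with a minimal Python sketch. It is a simplification and not the mixed-effect model of Lebret et al. (1995): it fits fixed site and survey effects by ordinary least squares, ignores the measurement-error term, and uses invented data. The function name and the sites-by-surveys array layout are assumptions made for illustration.

```python
import numpy as np

def additive_site_means(obs):
    """Estimate a mean concentration per site from an incomplete table.

    obs : 2-D array (sites x surveys), with np.nan where a sampler was lost.
    Fits obs[i, j] ~ site_i + survey_j by least squares on the available
    cells, then averages the fitted values over all surveys for each site.
    """
    n_sites, n_surveys = obs.shape
    site_idx, survey_idx = np.nonzero(~np.isnan(obs))
    y = obs[site_idx, survey_idx]
    # One indicator column per site, followed by one per survey.
    X = np.zeros((y.size, n_sites + n_surveys))
    X[np.arange(y.size), site_idx] = 1.0
    X[np.arange(y.size), n_sites + survey_idx] = 1.0
    # The design is rank deficient (a constant can move between the two
    # groups of terms); lstsq returns one solution, and the fitted values
    # are the same for every solution when sites and surveys are connected.
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    site_eff, survey_eff = coef[:n_sites], coef[n_sites:]
    fitted = site_eff[:, None] + survey_eff[None, :]
    return fitted.mean(axis=1)

# Invented example: three sites, three surveys, one lost sampler.
site_true = np.array([20.0, 30.0, 40.0])
survey_true = np.array([5.0, -5.0, 0.0])
obs = site_true[:, None] + survey_true[None, :]
obs[1, 0] = np.nan
site_means = additive_site_means(obs)
```

Because the survey term is estimated from all sites, a site with a missing survey is not biased towards the periods in which it happened to be measured, which is the practical reason for modelling the means rather than averaging the available values.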
3.2. Development of a regression model
Development and testing of a regression-mapping approach was carried out in
all three centres using ARC/INFO version 7.1. In each centre, a GIS was first
established, containing four main sets of data:
- road traffic (e.g., road network, road type, traffic volume)
- land cover/land use
- altitude
- monitored NO2 concentrations
Additional variables were also added locally, as required. The variables and data
sources used in each centre are summarized in table 2.
Due to variations in data availability, classification systems (e.g., of traffic flow
and land use) and local topography, it was not felt appropriate to use exactly the

Table 1. Summary description of study areas and survey results.

                                                       Concentration (μg m⁻³)
Survey no./date          Device (no.)                  Min     Max     Mean    SD

Amsterdam. Urban area with busy roads bordered by high-rise blocks; area ca. 26 km²;
population 155883 (1992)
  1. June/July 1993      Badges (80), Tubes (20)       12.4    73.5    36.4    12.2
  2. Nov 1993            Tubes (80)                    39.6    72.1    51.8     6.4
  3. Feb/March 1994      Tubes (80)                    28.2    68.2    50.0     7.2
  4. May/June 1994       Tubes (80)                    27.5    70.8    41.3    10.8

Huddersfield. Mixed urban-rural area; area 305 km²; altitude range 80–582 m OD;
population 211300 (1991)
  1. June 1993           Badges (80), Tubes (80)       10.5    88.4    28.2    14.1
  2. Oct. 1993           Tubes (20)                    27.7    78.5    47.6    10.2
  3. Feb. 1994           Tubes (80)                     9.5    51.5    25.6     9.5
  4. May 1994            Tubes (80)                    15.6    69.4    33.1    12.7
                         Tubes (80)

Prague. Mixture of residential, industrial and open space; area ca. 48 km²;
altitude range 172–355 m OD; population 163700 (1991)
  1. June/July 1993      Badges (80), Tubes (20)        5.6    65.8    22.2    13.9
  2. Oct 1993            Tubes (80)                    21.9    65.4    39.6    10.2
  3. Feb 1994            Tubes (80)                    20.0    57.7    33.4     9.6
  4. May 1994            Tubes (80)                    14.8    82.9    36.7    18.0

Figure 1. Monitoring results: mean annual NO2 concentration, Huddersfield, UK.
Figure 2. Monitoring results: mean annual NO2 concentration, Amsterdam, The
Netherlands.

Table 2. Data sets and sources.

Centre        Variable            Definition                              Data source

Huddersfield  Road network        All roads (stored as 10 m grid)         Aerial photography (1:10 k)
              Traffic volume      Mean 18 hour traffic flow               Motorways: automatic counts
                                  (vehicles/hour) for each road           (Department of Transport);
                                  segment                                 A roads: automatic counts (West
                                                                          Yorkshire Highways); manual
                                                                          counts (Kirklees Highways
                                                                          Services); other roads (estimates
                                                                          based on local knowledge)
              Land cover          Land cover class (20 classes,           Aerial photography (1:10 k)
                                  stored as 10 m grid)
              Altitude            Metres OD                               50 m DTM (Yorkshire Water)
              NO2 concentration   Mean NO2 concentration (by survey       Field monitoring
                                  period, and modelled annual mean)
              Sample height       Height of sampler (metres) above        Field measurement
                                  ground surface
              Site exposure       Mean angle to visible horizon at        Field measurement
                                  each monitoring site
              Topographical       Mean difference in altitude between     GIS-GRID based on DTM
              exposure            each pixel and the eight
                                  surrounding pixels

Amsterdam     Road network        All access roads within the study       City highways authority
                                  area
              Road type           Classification of road on basis of      City highways authority
                                  population served
              Distance to road    Distance to nearest road serving        GIS
                                  >25 000 people
              Land cover          Area of built up land                   Planning maps

Prague        Road network        All traffic routes in the study area    Department of Development,
                                                                          Prague Municipal Authority
              Traffic volume      Mean daytime traffic flow               Department of Development,
                                  (vehicles/hour)                         Prague Municipal Authority
              Land cover          Land cover class (6 classes, based      City planning maps
                                  on building density)
              Altitude            Height (metres) above sea level         Topographic maps

same regression procedure in all study areas. Instead, each centre developed its own
equation, subject only to the constraints that (a) it included terms for traffic volume,
land cover and topography, and (b) a similar buffering approach was used. In each centre,
therefore, data on the relevant input variables were first computed for a series of
bands around each monitoring site (to a distance of 300 m) using the GRID routines
in ARC/INFO. These were then entered into a multiple regression analysis using as
the dependent variable either the monitored NO2 values for a specific survey period
(to provide a pollution map for that period alone) or the modelled annual mean
concentration (to provide an annual average pollution map). The equation thus
generated was finally used to compute the predicted pollution level at all unmeasured
sites for a fine grid of points across the study area and the results mapped.
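The banding step can be illustrated with a short Python sketch. It is not the ARC/INFO GRID code used in the study: it is a NumPy stand-in that sums the values of a raster within concentric distance bands around one cell. The function name, the cell-centre distance rule and the toy raster are assumptions made for illustration.

```python
import numpy as np

def ring_sums(grid, row, col, cell_size, band_edges):
    """Sum grid values in concentric distance bands around one cell.

    grid       : 2-D array, e.g. traffic volume per cell (hypothetical input)
    row, col   : indices of the cell containing the monitoring site
    cell_size  : cell edge length in metres
    band_edges : increasing distances in metres, e.g. [0, 40, 300]
    """
    rows, cols = np.indices(grid.shape)
    # Distance from every cell centre to the site's cell centre, in metres.
    dist = np.hypot(rows - row, cols - col) * cell_size
    sums = []
    for inner, outer in zip(band_edges[:-1], band_edges[1:]):
        # Each band is closed at its inner edge and open at its outer edge,
        # so the innermost band includes the site's own cell.
        mask = (dist >= inner) & (dist < outer)
        sums.append(float(grid[mask].sum()))
    return sums

# Toy 10 m raster with a value of 1 in every cell, so each sum is a cell count.
grid = np.ones((61, 61))
bands = ring_sums(grid, 30, 30, 10.0, [0, 20, 40])
```

Repeating this for every monitoring site gives one column of predictor values per band, which is the form in which the variables enter the regression described above.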
As an example, details of the approach used in Huddersfield are presented in
figure 3. In brief, the method was as follows:
1. GIS development. Coverages were compiled in ARC/INFO as outlined in
table 2.
2. Computation of a weighted traffic volume factor (Tvol300) for the 300 metre
buffer around each monitoring site. Daytime traffic volumes (vehicle km/hour) were
estimated for each 20 m zone around each sample point (to a maximum distance of
300 metres) using the FOCALSUM command in ARC/INFO. Results were then
entered into a multiple regression analysis (in SPSS) against the modelled mean
annual NO2 concentrations and different combinations of band width compared.
The best-fit combination (as defined by the r² value) was selected, and weights for
each band determined by examination of the slope coefficients. This gave two bands,
weighted as follows: 0–40 m (weight=15) and 40–300 m (weight=1). These were
thus combined into a compound traffic volume factor (equation (1)).
$$\mathrm{Tvol}_{300} = 15\,\mathrm{Tvol}_{0\text{--}40} + \mathrm{Tvol}_{40\text{--}300} \tag{1}$$
3. Computation of a compound land cover factor (Land300) for the 300 m buffer
around each monitoring site. The area of each land use type (hectares) within each
20 m band around each sample point (to a maximum distance of 300 m) was calculated
using the FOCALSUM command in ARC/INFO. Results were then entered
into a multiple regression analysis (in SPSS) against the residuals from the previous
analysis (step 2). Different combinations of land cover and distance were compared
in terms of the r² value, and the best-fit non-negative combination selected. This
gave a single band (0–300 m) comprising two land use types: high density housing
(HDH0–300) and industry (Ind0–300). Weights were identified by examination of the
slope coefficients, and a compound land use factor computed in equation (2).
$$\mathrm{Land}_{300} = 1.8\,\mathrm{HDH}_{0\text{--}300} + \mathrm{Ind}_{0\text{--}300} \tag{2}$$
4. Stepwise multiple regression analysis was rerun using the two compound
factors (Tvol300 and Land300), together with altitude (variously transformed), topex
(topographical exposure), sitex (site exposure) and sampler height, against the
modelled mean nitrogen dioxide concentrations.
Only variables significant at the 5 per cent significance level were retained. A number
of equations were derived from this procedure, all explaining generally similar
proportions of variation in the monitored NO2 levels. From these, regression equation (3)
was chosen for further analysis, because of its marginally higher r² value,
which was 0.607, and because all variables were significant at the 0.05 level.
$$\mathrm{MeanNO_2} = 11.83 + 0.00398\,\mathrm{Tvol}_{300} + 0.268\,\mathrm{Land}_{300} - 0.0355\,\mathrm{RSAlt} + 6.777\,\mathrm{Sampht} \tag{3}$$

Figure 3. The regression mapping method: Huddersfield, UK.

5. This equation was then used to construct a complete air pollution coverage
for the study area by applying the equation on a cell by cell basis to all locations in
the study area.
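Steps 2 to 5 can be sketched in Python as follows. The coefficients are those of equations (1) to (3); everything else is assumed for illustration: the function names, the 2 x 2 grids of invented predictor values, and the constant sampler height of 2 m applied to every cell (the text does not state which height was used for mapping). RSAlt is passed in as given, since its transformation of altitude is not defined in this section.

```python
import numpy as np

def tvol300(tvol_0_40, tvol_40_300):
    """Equation (1): compound traffic volume factor."""
    return 15.0 * tvol_0_40 + tvol_40_300

def land300(hdh_0_300, ind_0_300):
    """Equation (2): compound land cover factor (areas in hectares)."""
    return 1.8 * hdh_0_300 + ind_0_300

def mean_no2(tvol, land, rsalt, sampht):
    """Equation (3): predicted mean annual NO2 concentration."""
    return (11.83 + 0.00398 * tvol + 0.268 * land
            - 0.0355 * rsalt + 6.777 * sampht)

# Invented predictor grids; in practice each would be a raster of buffer sums.
tvol_near = np.array([[100.0, 0.0], [20.0, 5.0]])    # traffic, 0-40 m band
tvol_far = np.array([[500.0, 50.0], [300.0, 80.0]])  # traffic, 40-300 m band
hdh = np.array([[5.0, 0.0], [2.0, 1.0]])             # high density housing, ha
ind = np.array([[2.0, 0.0], [0.0, 3.0]])             # industry, ha
rsalt = np.array([[10.0, 15.0], [12.0, 11.0]])       # transformed altitude
sampht = np.full((2, 2), 2.0)                        # assumed constant height, m

# NumPy applies the equation element-wise, i.e. cell by cell.
no2_map = mean_no2(tvol300(tvol_near, tvol_far), land300(hdh, ind), rsalt, sampht)
```

The weight of 15 on the inner band means that a given volume of traffic within 40 m of a cell contributes fifteen times as much to the prediction as the same volume between 40 and 300 m.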
In Prague, a broadly similar approach was used. In this case, however, air
pollution data were seen to be skewed, so the data were log-transformed prior to
analysis. In this case, also, no attempt was made to produce a compound traffic
volume; instead separate traffic volume factors were computed for each zone. A
single land cover factor was computed in equation (4), representing the weighted
sum of the areas of each land cover type within the buffer zone around the site, and
equation (5) was thus derived
$$\mathrm{Land} = \sum (\mathrm{area} \times \mathrm{density\ class}) \tag{4}$$
$$\log \mathrm{MeanNO_2} = 3.48 + 1.17\,\mathrm{Tvol}_{60} + 0.125\,\mathrm{Tvol}_{120} + 0.000554\,\mathrm{Land}_{60} - 0.00152\,\mathrm{Alt} \tag{5}$$
where: Tvol60 = traffic volume (1000 vehicle km hr⁻¹) within 60 m of the site
Tvol120 = traffic volume (1000 vehicle km hr⁻¹) within 60–120 m of the site
Land60 = land cover factor within 60 m of the site
Alt = altitude (m)
This equation gave r² = 0.72 for the 80 sample points. Plotting of the residuals
showed one outlier, with a large negative residual. When this was removed and the
regression analysis rerun, equation (6) was obtained
$$\log \mathrm{MeanNO_2} = 3.46 + 1.17\,\mathrm{Tvol}_{60} + 0.110\,\mathrm{Tvol}_{120} + 0.000569\,\mathrm{Land}_{60} - 0.00155\,\mathrm{Alt} \tag{6}$$
The r² value was again 0.72, but the plot of residuals showed a better distribution,
with no outliers. This equation was therefore used for subsequent analysis.
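As a hedged illustration, equation (6) can be evaluated and back-transformed in Python. The base of the logarithm is not stated in the text; the sketch assumes natural logarithms, on the grounds that the intercept then corresponds to roughly 32 μg m⁻³, which lies within the range of concentrations in table 1, whereas base 10 would give a value near 2900. The function name and the input values are invented.

```python
import math

def prague_mean_no2(tvol60, tvol120, land60, alt):
    """Equation (6), back-transformed to a concentration.

    tvol60, tvol120 : traffic volume (1000 vehicle km per hour) within
                      60 m and within 60-120 m of the site
    land60          : land cover factor within 60 m of the site
    alt             : altitude (m)
    Assumes the regression was fitted on natural logarithms.
    """
    log_no2 = (3.46 + 1.17 * tvol60 + 0.110 * tvol120
               + 0.000569 * land60 - 0.00155 * alt)
    return math.exp(log_no2)

# Invented inputs: the same site characteristics at two altitudes.
low_site = prague_mean_no2(0.2, 0.5, 300.0, 200.0)
high_site = prague_mean_no2(0.2, 0.5, 300.0, 350.0)
```

Because the model is linear on the log scale, each term acts multiplicatively on the predicted concentration: under the natural-log assumption, every additional 100 m of altitude scales the prediction by exp(-0.155), or about 0.86.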
In Amsterdam, the lack of data on traffic volume, and the essentially flat nature
of the local terrain, meant that a different approach was used. In this case, road
segments were classified into broad types, based upon the classification used by the
city Highways Department. Three road types were identified, as follows:
RD1 access road for residential areas with >25 000 people
RD2 access road for residential areas with >5000 and <25 000 people
RD3 access road for residential areas with >1000 and <5000 people
The length of each road type within a 0–50 m and 50–200 m buffer was then
computed, in GRID, giving six road variables. Distance to the nearest road serving
>25 000 people (RD1) was also included as a variable. Land cover was defined as
all built up land within 100 m of the site, based on planning maps. The variables
thus computed were then entered into a stepwise regression model, using the modelled
mean NO2 concentration as the dependent variable. The resulting model (with
r² = 0.62) is given in equation (7).
$$\mathrm{MeanNO_2} = 41.64 + 0.5832\,\mathrm{RD1}_{50} + 0.6190\,\mathrm{RD2}_{50} + 0.0723\,\mathrm{RD1}_{200} - 0.0570\,\mathrm{RD2}_{200} - 0.0348\,\mathrm{RD3}_{200} + 0.0133\,\mathrm{RD3}_{50} - 0.0246\,\mathrm{Land}_{100} + 0.0036\,\mathrm{DistRD1} \tag{7}$$
It should be noted that the approach used here differs from that in the other two
centres, in that no attempt was made to derive compound traffic flow variables,
based on traffic volumes in buffer zones around each site. Instead, a large number
of individual traffic-related variables were used, and allowed to enter the regression
model freely. The result is a regression equation which is to some extent counter-intuitive,
in that some variables (e.g., RD2200, RD3200 and Land100) have negative
signs. While the model thus optimizes predictions for this data set, it is less generic
in structure, and may not be readily transferable to other areas.
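The structure of equation (7) can be made explicit with a small Python sketch. The coefficients are those of equation (7); the dictionary keys, the function name and the example site are invented, and the units of the road length, land cover and distance variables are not stated in this section, so the inputs are placeholders only.

```python
# Coefficients of equation (7), keyed by hypothetical variable names.
AMSTERDAM_INTERCEPT = 41.64
AMSTERDAM_COEFFS = {
    "RD1_50": 0.5832,
    "RD2_50": 0.6190,
    "RD1_200": 0.0723,
    "RD2_200": -0.0570,
    "RD3_200": -0.0348,
    "RD3_50": 0.0133,
    "Land100": -0.0246,
    "DistRD1": 0.0036,
}

def amsterdam_mean_no2(site):
    """Evaluate equation (7) for one site; missing variables count as zero."""
    return AMSTERDAM_INTERCEPT + sum(
        coeff * site.get(name, 0.0) for name, coeff in AMSTERDAM_COEFFS.items()
    )

# Terms that lower the prediction as the variable increases.
negative_terms = sorted(name for name, c in AMSTERDAM_COEFFS.items() if c < 0)

# Invented example site with some RD1 and RD2 road in the inner buffer.
example = amsterdam_mean_no2({"RD1_50": 10.0, "RD2_50": 5.0, "Land100": 100.0})
```

Listing the negative terms in this way shows the counter-intuitive feature noted above: more RD2 or RD3 road in the outer buffer, or more built up land, reduces the predicted concentration.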
4. Validation
Examples of the air pollution maps for Huddersfield and Amsterdam are shown
in figures 4 and 5. As these indicate, the maps show considerable local detail, and
clearly pick out zones of high pollution along the main road network.
Validation of the air pollution maps was carried out by comparing the NO2
concentrations predicted from each of the air pollution maps with the measured
concentrations at the 8–10 'reference' sites, using simple least squares regression.
(Note that these sites were not used in the generation of the original regression
equation.) Results of the validation are summarized in table 3 and figure 6. Because
of differences in the number of sample sites available for validation, and the different
structures of the regression models, care is needed in comparing the results. Overall,
however, it is apparent that the regression method gave extremely good predictions
of the pollution levels at the reference sites (as shown by the r² values and mean
deviations).
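A minimal Python sketch of this kind of validation is given below. It regresses measured on predicted concentrations and reports the slope, r² and standard error of the estimate. The direction of the regression, the function name and the eight pairs of values are assumptions made for illustration; they are not the study's data.

```python
import numpy as np

def validate(predicted, measured):
    """Simple least squares regression of measured on predicted values.

    Returns (slope, intercept, r2, see), where see is the standard error
    of the estimate with n - 2 degrees of freedom.
    """
    predicted = np.asarray(predicted, dtype=float)
    measured = np.asarray(measured, dtype=float)
    slope, intercept = np.polyfit(predicted, measured, 1)
    fitted = intercept + slope * predicted
    ss_res = np.sum((measured - fitted) ** 2)
    ss_tot = np.sum((measured - measured.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    see = np.sqrt(ss_res / (len(measured) - 2))
    return slope, intercept, r2, see

# Invented values for eight hypothetical reference sites.
predicted = [25.0, 31.0, 38.0, 44.0, 52.0, 60.0, 35.0, 47.0]
measured = [27.0, 30.0, 41.0, 43.0, 55.0, 58.0, 33.0, 50.0]
slope, intercept, r2, see = validate(predicted, measured)
```

A slope close to one and an intercept close to zero indicate that the map neither over-predicts nor under-predicts systematically; r² alone cannot show this, since it is unchanged by any linear rescaling of the predictions.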
In Huddersfield, for example, the regression equation fitted well to the measured
NO2 concentrations for the 80 sites on which it was based, with an r² value of 0.61.
Examination of the residuals showed some heterogeneity, with a tendency for over-prediction
at low concentrations and under-prediction at high concentrations.
Figure 4. The regression map: mean annual NO2, Huddersfield, UK.

Figure 5. The regression map: mean annual NO2, Amsterdam, The Netherlands.
Table 3. Performance of the regression maps: r² and standard error of estimate for reference
sites.

Centre          Number of sites    r²      Standard error of estimate (μg m⁻³)
Huddersfield    8                  0.82    3.69
Prague          10                 0.87    4.67
Amsterdam       10                 0.79    4.45
Residuals showed no spatial correlation, however, and Moran's I coefficients were
consistently low (ca. 0.08). Although attempts were therefore made to use kriging on
the residuals, no improvement in predictions was obtained.
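For readers unfamiliar with the statistic, the following Python sketch computes Moran's I for a set of residuals. The spatial weighting used in the study is not stated; inverse-distance weights are assumed here, and the function name, coordinates and values are invented for illustration.

```python
import numpy as np

def morans_i(values, coords):
    """Moran's I with inverse-distance weights (an assumed weighting scheme).

    values : residual at each site
    coords : (x, y) position of each site, in any consistent unit
    """
    x = np.asarray(values, dtype=float)
    xy = np.asarray(coords, dtype=float)
    # Pairwise distances between all sites.
    diff = xy[:, None, :] - xy[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Weight is 1 / distance between distinct locations and 0 otherwise,
    # so a site is never its own neighbour.
    w = np.zeros_like(dist)
    off_diagonal = dist > 0
    w[off_diagonal] = 1.0 / dist[off_diagonal]
    z = x - x.mean()
    return (len(x) / w.sum()) * (z @ w @ z) / (z @ z)

# Two tight pairs of sites, 100 units apart.
coords = [(0.0, 0.0), (1.0, 0.0), (100.0, 0.0), (101.0, 0.0)]
clustered = morans_i([0.0, 0.0, 10.0, 10.0], coords)    # neighbours alike
alternating = morans_i([0.0, 10.0, 0.0, 10.0], coords)  # neighbours differ
```

Values near zero, such as those reported above, indicate that nearby residuals are no more alike than distant ones, which leaves kriging of the residuals with little spatial structure to exploit.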
Reflecting this, the r² values for the 'reference' sites were uniformly high in all
centres (0.79–0.87), while the standard error of the estimate ranged from 3.5–4.7 μg
m⁻³ (table 3). As figure 6 also shows, the slope of the relationship between predicted
and measured concentrations is close to unity, albeit with some tendency to under-predict
actual concentrations in Amsterdam and to over-predict in Prague. In
Huddersfield, the relationship between measured and predicted concentrations also
tends to be non-linear, resulting in some over-prediction at low concentrations.
In Huddersfield, predictions for individual surveys were also tested by rerunning
the regression procedure outlined above, using data for surveys 2 to 4 separately
(survey 1 was not analysed, because of the small number of diffusion tubes for which
data were available). Regression models and r² values for the individual survey
periods are shown in table 4. The validity of each of these models was then further

Figure 6. Predicted mean annual concentration from the regression map versus monitored
concentration (1993–1994), all centres.
tested by comparing the predicted and measured values at the 8 reference and 40
'variable' sites. Table 4 includes the resulting r² values. For all three surveys, broadly
similar regression equations are obtained. Notably, for the last period (4), the r² for
the regression equation is relatively high (>0.6) and model coefficients are close to
those derived for the mean annual concentration. In the second and third surveys,
a reduced number of variables enter the equation, and the r² value is lower
(0.36–0.39). All three equations, however, provide good estimates of concentrations
at the 8 reference sites, with r² values between 0.69 and 0.75. Survey 4 also gives a
reasonably good prediction of concentrations at the 40 'variable' sites (r² = 0.52).
Surveys 2 and 3, however, are less strongly predictive of the variable sites, with r²
values of 0.26 and 0.35 respectively. In part, this may reflect the distribution of these
variable sites in these surveys, in that they were specifically selected to examine
variation in background concentrations, and are thus not representative of pollution
conditions across the whole study area. Overall, however, it appears that results of
regression mapping are more variable when applied to individual survey periods.
This should not cause surprise, for the traffic data used in the model refer to long-

Table4.Regressionequationsforindividualsurveyperiods:Hudders®eld.
r
2
CoresitesVariablesitesContinuoussites
PeriodEquation(SurveymeanNO2)(n=80)(n=40)(n=8)
Oct199340´720+0´00487Tvol300+0´250Land3000´3870´2560´729
Feb199411´505+0´00326Tvol300+0´197Land300+4´092Sampht0´3620´3500´743
May19947´011+0´00566Tvol300+0´385Land300+7´674Sampht-0´6110´5240´690
0´052RSalt

D. Briggs et al.714
term mean ¯ ows, and thus do not take account of short-term variations in tra c
conditions; nor is any allowance made for variations in weather conditions.
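
The survey equations in table 4 can be applied directly to the GIS-derived predictors at any location. The sketch below encodes the three equations as coefficient dictionaries and evaluates them for one site. The coefficients are those read from table 4; the names `SURVEY_MODELS` and `predict_no2` and the predictor values for the example site are hypothetical, and real inputs must use the units in which each predictor was defined when the models were fitted.

```python
# Coefficients as given in table 4 (Huddersfield, individual surveys).
SURVEY_MODELS = {
    "Oct 1993": {"intercept": 40.720, "Tvol300": 0.00487, "Land300": 0.250},
    "Feb 1994": {"intercept": 11.505, "Tvol300": 0.00326, "Land300": 0.197,
                 "Sampht": 4.092},
    "May 1994": {"intercept": 7.011, "Tvol300": 0.00566, "Land300": 0.385,
                 "Sampht": 7.674, "RSalt": -0.052},
}


def predict_no2(period, site):
    """Evaluate the table 4 equation for one survey period at one site.

    site: dict mapping predictor names to values; only the predictors
    that enter the chosen equation are read.
    """
    coeffs = SURVEY_MODELS[period]
    total = coeffs["intercept"]
    for name, beta in coeffs.items():
        if name != "intercept":
            total += beta * site[name]
    return total


# Hypothetical predictor values for a single location (not from the paper).
example_site = {"Tvol300": 2000.0, "Land300": 10.0, "Sampht": 2.0, "RSalt": 100.0}

for period in SURVEY_MODELS:
    print(period, round(predict_no2(period, example_site), 2))
```

Keeping the coefficients in a table-like structure makes it easy to see which predictors drop out in the earlier surveys, and to map any of the three equations across a grid of locations.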
As noted earlier, in both Huddersfield and Amsterdam, pollution data were also
available for a subset of 20 sites in the following year (October 1994–September
1995). In Huddersfield, these were measured over 21 consecutive two-week periods; in
Amsterdam, they were measured during four two-week campaigns. Comparing these
measured data with the estimated concentrations from the regression map allowed
the temporal stability of the pollution map to be estimated. Results are summarized
in figures 7 and 8. As can be seen, the correlation between the predicted value from
the map and the mean annual concentration measured in the following year is strong
in both centres (r² = 0.59 in Huddersfield and 0.86 in Amsterdam). In Huddersfield,
however, the map tends to underpredict actual concentrations in the following year.
This reflects the relatively hot summer and still conditions experienced in 1994–5,
which contributed to higher than average pollution levels. Nevertheless, it is apparent
that, notwithstanding the short-term variations in air pollution levels which
undoubtedly occur, the geography of pollution is relatively stable from year to year.
The air pollution maps derived from the regression method thus have longer-term
validity. On this evidence, the pollution maps should provide a basis for estimating
historic exposures, at least over recent years. In the case of health outcomes with a
relatively long lag period, this has considerable significance for environmental
epidemiology. In addition, the maps clearly provide a useful framework for designing
air pollution monitoring systems, and identifying the areas for which individual
monitoring sites can be considered representative.
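
The distinction drawn above between the level of pollution and its spatial pattern can be illustrated numerically. In the sketch below, a hypothetical following year is simulated by adding the same increment to every site: the squared correlation with the map is unchanged, while the mean bias (predicted minus measured) turns negative, meaning the map under-predicts. All values and function names are invented for illustration; this is not a reconstruction of the Huddersfield or Amsterdam data.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy ** 2 / (sxx * syy)


def mean_bias(predicted, measured):
    """Average of predicted minus measured; negative means under-prediction."""
    return sum(p - m for p, m in zip(predicted, measured)) / len(predicted)


# Hypothetical map predictions and measurements (µg m^-3) at five sites.
map_prediction = [22.0, 28.5, 31.0, 36.5, 41.0]
survey_year = [21.0, 29.0, 30.5, 37.5, 40.0]

# Assumed scenario: every site is 4 µg m^-3 higher in the following year.
following_year = [m + 4.0 for m in survey_year]

print(r_squared(map_prediction, survey_year),
      r_squared(map_prediction, following_year))
print(mean_bias(map_prediction, survey_year),
      mean_bias(map_prediction, following_year))
```

A uniform shift of this kind is the simplest case in which a map can under-predict absolute concentrations while still ranking locations correctly, which is the property that matters for exposure classification.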
Figure 7. Predicted NO2 concentrations from the regression map versus observed mean
annual concentration the following year: Huddersfield, UK.

Figure 8. Predicted NO2 concentrations from the regression map versus observed mean
annual concentration the following year: Amsterdam, The Netherlands.
5. Discussion and conclusions
The results of this study clearly illustrate the complex nature of spatial variation
in urban air pollution, and confirm the marked variation in levels of traffic-related
pollution which may occur over small distances (<100 m). At the same time, however,
they also demonstrate that this variation is predictable, and can be modelled to a
high degree of accuracy, using information on emission sources and topography.
The spatial variations in pollution levels seen in this study have considerable
implications, both for air pollution monitoring and management, and for environmental
epidemiology. They mean, for example, that pollution data derived from a
single monitoring site cannot normally be considered representative of more than a
small surrounding area. This clearly limits the extent to which existing monitoring
networks (based as they are on a small number of fixed-site monitoring devices)
can either provide spatially representative measures of levels of compliance with air
pollution legislation, or give meaningful warnings of air pollution hazards across an
urban environment. Equally, epidemiological studies based upon only a small number
of monitoring sites are likely to involve considerable misclassification of exposure.
In both cases, understanding of the geography of pollution within the area of interest
is vital. Significantly, however, this study suggests that the long-term pattern of
pollution within an urban area may be reliably predicted from a relatively small
number of surveys; in this case only four two-week surveys provided good predictions
of mean annual concentrations of NO2. The study also suggests that, whilst average
concentrations of NO2 inevitably vary from year to year, the spatial pattern of
pollution within an urban area remains broadly stable.
The method of GIS-based regression mapping used in this study appears to offer
an effective method for mapping traffic-related pollutants, such as NO2. The maps
produced here consistently gave good predictions of pollution levels at unsampled
points. As with any empirical approach, however, regression mapping has its
limitations. It potentially suffers, for example, from being case- and area-specific. This is
especially likely to be true when models are developed simply through a process of
statistical optimization. As the results for Amsterdam show in this study, the
regression equation may then be somewhat counter-intuitive, and as such may not be
valid outside the specific study area. Nevertheless, where the regression models are
developed from clear underpinning principles, as here in Prague and Huddersfield,
it may be expected that the models would be more generally valid. Recent studies,
to be reported elsewhere, do in fact demonstrate that the Huddersfield regression
model can be applied successfully to other urban areas in the UK. The empirical
nature of regression mapping is also part of its strength, for, unlike formal dispersion
modelling, it can be readily adapted to local circumstances and data availability. It
thus allows optimal use to be made of the available data. In small area studies,
where monitored data are scarce and where the need for high resolution maps is
paramount, GIS-based regression mapping thus offers a powerful tool.
Acknowledgments
The SAVIAH study was a multi-centre project, funded under the EU Third
Framework Programme. It was led by Professor Paul Elliott (Department of
Epidemiology and Public Health, Imperial College School of Medicine at St. Mary’s,
London UK formerly at the London School of Hygiene and Tropical Medicine) and
co-principal investigators were Professor David Briggs (Nene Centre for Research,
Nene College, Northampton, UK, formerly at the University of Huddersfield), Dr
Erik Lebret (Environmental Epidemiology Unit, National Institute of Public Health
and Environmental Protection, Bilthoven, Netherlands), Dr Pawel Gorynski
(National Institute of Hygiene, Warsaw, Poland) and Professor Bohimir Kriz
(Department of Public Health, Charles University Prague). Other members of the
project team were: Marco Martuzzi and Chris Grundy (London School of Hygiene
and Tropical Medicine, London, UK); Susan Collins, Emma Livesely and Kirsty
Smallbone (University of Huddersfield, UK); Caroline Ameling, Gerda Doornbos,
Arnold Dekker, Paul Fischer, L. Gras, Hans van Reeuwijk and Andre van der Veen
(National Institute of Public Health and Environmental Protection, Bilthoven, NL);
Henrik Harssema (Wageningen Agricultural University); Bogdan Wojtyniak and
Irene Szutowicz (National Institute of Hygiene, Warsaw, Poland); Martin Bobak
and Hynek Pikhart (National Institute of Public Health, Prague, CR) and Karel
Pryl (City Development Authority, Prague, CR). All members of this team made
invaluable contributions to all parts of the project, and this paper is a product of
their joint effort and expertise. Thanks are also due to the local authorities and
health authorities in the four study areas, Amsterdam, Huddersfield, Prague and
Poznan, for their assistance in carrying out this research.
References
Abbass, T., El Jallouli, C., Albouy, Y., and Diament, M., 1990, A comparison of surface
® tting algorithms for geophysical data. Terra Nova 2, 467± 75.
Anderson, H. R., Butland, B. K., and Strachan, D. P., 1994, Trends in prevalence and
severity of childhood asthma. British Medical Journal 308, 1600± 4.
Archibold, O. W., and Crisp, P. T., 1983, The distribution of airborne metals in the Illawarra
region of New South Wales, Australia. Applied Geography, 3 (4), 331± 344.

Benson, P. E., 1992, A review of the development and application of the CALINE3 and
CALINE4 models. Atmospheric Environment, 26B, 379± 90.
Briggs, D. J., 1992, Mapping environmental exposure. In: Geographical and Environmental
Epidemiology : Methods for Small-area Studies, edited by P. Elliott, J. Cuzick,
D. English, and R. Stern. (Oxford: Oxford University Press) pp. 158± 76.
Burney, P., 1988, Asthma deaths in England and Wales 1931± 85: evidence for a true increase
in asthma mortality. Journal of Epidemiology and Community Health, 42, 316± 20.
Burney, P., Chinn, S., and Rona, R. J., 1990, Has the prevalence of asthma increased in
children? Evidence from the national study of health and growth 1973± 86. British
Medical Journal, 300, 1306± 10.
Burrough, P. A., 1986, Principles of Geographical Information Systems for Land Resources
Assessment. Monographs on soil and resources survey, no. 12 (Oxford: Clarendon
Press).
Campbell, G. W., Stedman, J. R., and Stevenson, K., 1994, A survey of nitrogen dioxide
concentrations in the United Kingdom using di usion tubes July± December 1991.
Atmospheric Environment, 28 (3), 477± 87.
Department of the Environment, 1995, Digest of Environmental Statistics. No. 17. 1995.
London: HMSO.
Dubrule, O., 1984, Comparing splines and kriging. Computers & Geosciences, 10, 327± 38.
Edwards, J., Walters, S., and Griffiths, R. K., 1994, Hospital admissions for asthma in
preschool children: relationship to major roads in Birmingham, United Kingdom.
Archives of Environmental Health, 49, 223± 7.
Eerens, H., Sliggers, C., and van der Hout, K., 1993, The CAR model: the Dutch method
to determine city street air quality. Atmospheric Environment, 27B, 389± 99.
Elliott, P., Briggs, D., Lebret, E., Gorynski, P., and Kriz, B., 1995, Small area variations
in air quality and health (the SAVIAH study): design and methods. (Abstract).
Epidemiology, 6, S32.
Flachsbart, P. G., 1992, Human exposure to motor vehicle air pollution. In Motor Vehicle
Air Pollution: Public Health Impact and Control Measures, edited by D. T. Mage, and
O. Zali (Geneva: WHO Division of Environment), pp. 85± 113.
Haahtela, T., Lindholm, H., Bjorkstoen, F., Koskenvuo, K., and Laitinen, I. A., 1990,
Prevalence of asthma in Finnish young men. British Medical Journal, 301, 266± 8.
Haas, T. C., 1992, Redesigning continental-scale monitoring networks. Atmospheric
Environment, 26A, 3323± 33.
Hewitt, C. N., 1991, Spatial variations in nitrogen dioxide concentration in an urban area.
Atmospheric Environment, 25B, 429± 34.
Ishizaki, T., Koizumi, K., Ikemori, R., Ishiyama, Y., and Kushibiki, E., 1987, Studies of
prevalence of Japanese cedar pollinosis among the residents in a densely cultivated
area. Annals of Allergy, 58, 265± 70.
Knotters, M., Brus, D. J., and Voshaar, J. H. O., 1995, A comparison of kriging, co-kriging
and kriging combined with regression for spatial interpolation of horizon depth with
censored observations. Geoderma, 67, 227± 46.
Laslett, G. M., McBratney, A. B., Pahl, P. J., and Hutchinson, M. F., 1987, Comparison
of several prediction methods for soil pH. Journal of Soil Science, 38, 325± 70.
Lebret, E., Briggs, D., Collins, S., van Reeuwijk, H., and Fischer, P., 1995, Small area
variations in exposure to NO2 . (Abstract). Epidemiology, 6, S31.
Lefohn, A. S., Knudsen, H. P., and McEvoy, L. R., 1988, The use of kriging to estimate
monthly ozone exposure parameters for the southeastern United States. Environmental
Pollution, 53, 27± 42.
Liu, L. J. S., Rossini, A., and Koutrakis, P., 1995, Development of cokriging models to
predict 1- and 12-hour ozone concentrations in Toronto. (Abstract). Epidemiology,
6, S69.
Mattson, M. D., and Godfrey, P. J., 1994, Identi® cation of road salt contamination using
multiple regression and GIS. Environmental Management, 18, 767± 73.
Muschett, F. D., 1981, Spatial distribution of urban atmospheric particulate concentrations.
Annals of the Association of American Geographers, 71, 552± 65.
Mutius, E., 1993, Road tra c and adverse e ects on respiratory health in children. British
Medical Journal, 307, 596± 600.

Mapping urban air pollution using GIS718
Myers, D. E., 1994, Spatial interpolation: an overview. Geoderma, 62, 17± 28.
NILU, 1991, The health e ects of tra c pollution as measured in the Valerenga area of Oslo.
Summary report (Lillestrom: Norsk Institut fur Luftforskning).
Oliver, M. A., and Webster, R., 1990, Kriging: a method of interpolation for geographical
information systems. International Journal of Geographical Information Systems, 4,
313± 32.
Schaug, J., Iversen, T., and Pedersen, U., 1993, Comparison of measurements and model
results for airborne sulphur and nitrogen components with kriging. Atmospheric
Environment, 27A, 831± 44.
Schwartz, J., 1993, Particulate air pollution and chronic respiratory disease. Environmental
Research, 62, 7± 13.
Schwartz, J., 1994, Air pollution and daily mortality: a review and meta-analysis.
Environmental Research, 64, 36± 52.
Stanners, D., and Bordeau, P., (editors), 1995, Europe’s Environment. The DobrõÂ s Assessment,
(Copenhagen: European Environment Agency).
UNEP 1993, Environmental Data Report 1993± 94 (Oxford: Blackwell ).
van Kuilenburg, J., de Gruijter, J. J., Marsman, B., and Bouma, J., 1982, Accuracy of
spatial interpolation between point data on moisture supply capacity, compared with
estimates from mapping units. Geoderma, 27, 311± 25.
van Reeuwijk, H., Lebret, E., Fischer, P. H., Smallbone, K., Celko, M., and Harssema, H.,
1995, Performance of NO2 passive samplers in ambient air in dense networks,
(Abstract). Epidemiology, 6, S60.
Venkatram, A., 1988, On the use of kriging in spatial analysis of acid precipitation data.
Atmospheric Environment, 22, 1963± 75.
Wagner, E., 1995, Impacts on air pollution in urban areas. Environmental Management,
18, 759± 65.
Wartenberg, D., 1993, Some epidemiologic applications of kriging. In Geostatistics Troia
’92. Vol. 2. (A. Soares, ed.) (New York: Kluwer), Quantitative Geology and Statistics,
5, pp. 911± 22.
Weber, D., and Englund, E., 1992, Evaluation and comparison of spatial interpolators.
Mathematical Geology, 24, 381± 91.
Weiland, S. K., Mundt, K., Ruckmann, A., and Keil, U., 1994, Self-reported wheezing and
allergic rhinitis in children and tra c density on street of residence. AEP, 4, 243± 7.
Wjst, M., Reitmeir, P., Dold, S., Wulff, A., Nicolai, T., von Loeffelholz-Colberg, E. F.,
and von Mutius, E., 1993, Road tra c and adverse e ects on respiratory health in
children. British Medical Journal, 307, 596± 600.
