This document summarizes a talk on estimating the effectiveness of speed cameras. It discusses how sites with high accident rates are selected for remedial treatments like speed cameras. However, simply comparing accident rates before and after can overestimate effectiveness due to regression to the mean. The talk presents methods to account for regression to the mean and trends when evaluating speed camera effectiveness, using empirical Bayes and analyses of transparency data from camera partnerships. Allowing for these factors suggests the true safety benefit of cameras may be lower than initial estimates based only on before-after comparisons.
1. Estimating the Effectiveness of Speed Cameras
Mike Maher
Institute for Transport Studies
University of Leeds
Hong Kong Poly U, 22 Oct 2013
2. Background to the talk
• Speed cameras widely used in UK
• Do they save lives? Or simply make money?
• Unpopular with many motorists
• Long-running controversy
• DfT under pressure to establish their effect
3. A brief history of speed cameras (1)
• First introduced in UK in early 1990s
• Policy decision in December 1998
– “hypothecation”, local camera partnerships set up
• In 2002, cameras made more conspicuous
• In 2004, 3-year evaluation report criticised
– no allowance for regression to mean
• In 2005, 4-year evaluation report
– Appendix H allows for RTM on subset of data
– shows camera effectiveness reduced
4. A brief history of speed cameras (2)
• In 2011, Minister’s letter to English local authorities
– requiring them to publish camera data
– FSCs (fatal or serious collisions), PICs (personal injury collisions) for all years 1990 – 2010
• 2012, Scottish safety camera bulletin criticised
– report advising how to analyse and present data
• June 2013 RAC Foundation report: R Allsop
– guidance on use of transparency data
– proposing a method of analysis of such data
– allows for trend, RTM and estimates camera effect
• Still criticised by anti-camera lobby
5. Outline of the talk
• Remedial safety treatments (eg cameras)
– applied to “problem” sites
– to reduce accidents at the site
• Identification of “problem” sites
– those with high number of accidents in last 3 years
• Evaluation of effect of remedial treatment
– compare after accidents with before accidents
– allow for trend (compare with regional numbers)
• But - not as straightforward as it may seem!
6. North Lanarkshire data
Number of sites Nk with k accidents in 3-year period
k    0     1     2    3    4   5   6   7  8  9  11  13
Nk   7411  1645  341  117  38  26  13  7  2  1  1   1
Sites with at least 4 accidents called “cluster sites”
Earmarked for remedial treatment
7. North Lanarkshire data
               before  after  change
Whole network  3136    2799   -11%
Cluster sites  458     233    -49%
8. North Lanarkshire data
BUT – NO TREATMENT APPLIED!!
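The before totals in the table can be recovered from the site counts Nk, which is a useful check on the figures. A minimal Python sketch; the variable names are illustrative, and the after totals (2799 and 233) are taken from the slide as given rather than derived:

```python
# Number of sites N_k with k accidents in the 3-year before period
# (values copied from the North Lanarkshire table).
sites_by_k = {0: 7411, 1: 1645, 2: 341, 3: 117, 4: 38, 5: 26,
              6: 13, 7: 7, 8: 2, 9: 1, 11: 1, 13: 1}

CLUSTER_THRESHOLD = 4  # "cluster sites" have at least 4 accidents

# Total accidents = sum over k of k * N_k.
before_all = sum(k * n for k, n in sites_by_k.items())
before_cluster = sum(k * n for k, n in sites_by_k.items()
                     if k >= CLUSTER_THRESHOLD)

# After-period totals as reported on the slide.
after_all, after_cluster = 2799, 233


def pct_change(before, after):
    """Percentage change from the before total to the after total."""
    return 100.0 * (after - before) / before


print(before_all, before_cluster)                        # 3136 458
print(round(pct_change(before_all, after_all)))          # -11
print(round(pct_change(before_cluster, after_cluster)))  # -49
```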
9. Regression to the mean
• Bias by selection
• Sites chosen on basis of high Y = m + ε
• Top sites tend to have both:
– high systematic component (mean m)
– high positive random component (error ε)
– systematic component persists …
– … but random component does not
• Exaggerated estimate of treatment effectiveness, unless corrected for
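The selection effect described above can be reproduced with a small simulation in which no treatment is applied at all. The sketch below is an illustration under stated assumptions, not the analysis from the talk: each site's true mean m is drawn from a gamma distribution, the before and after counts are independent Poisson draws with that same mean, and the gamma parameters (shape 0.5, scale 0.65) are hypothetical values chosen only to give a network with mostly zero-accident sites.

```python
import math
import random


def poisson(lam, rng):
    """Draw one Poisson(lam) variate using Knuth's multiplication method."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def simulate(n_sites=50000, shape=0.5, scale=0.65, threshold=4, seed=1):
    """Simulate before/after counts with NO treatment at any site.

    Returns (before_all, after_all, before_selected, after_selected),
    where "selected" sites are those with at least `threshold`
    accidents in the before period.
    """
    rng = random.Random(seed)
    before_all = after_all = before_sel = after_sel = 0
    for _ in range(n_sites):
        m = rng.gammavariate(shape, scale)  # true mean, same in both periods
        y_before = poisson(m, rng)
        y_after = poisson(m, rng)
        before_all += y_before
        after_all += y_after
        if y_before >= threshold:           # site picked because of high Y
            before_sel += y_before
            after_sel += y_after
    return before_all, after_all, before_sel, after_sel


b_all, a_all, b_sel, a_sel = simulate()
print("whole network:", b_all, "->", a_all)
print("selected sites:", b_sel, "->", a_sel)
```

In this sketch the network totals stay roughly level while the selected sites show a large apparent drop, purely because the random component ε does not persist.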
10. Why “regression to the mean”?
• Sir Francis Galton, (1822 – 1911), eugenicist,
biometrician, statistician, observed:
Tall fathers tend to have sons who
are also tall – but who are not as tall
as themselves
13. RTM appears in other places, too
• Golf tournaments:
– the players who score well in the first round
tend, on average, to score well in the second
round too - but not as well as they did in the
first
14. 2013 British Open Golf Tournament
Pos  Name                  Rounds 1-2  Rounds 3-4
1    Miguel Angel Jimenez  139
2    Henrik Stenson        140
3    Lee Westwood          140
4    Tiger Woods           140
5    Dustin Johnson        140
6    Zach Johnson          141
7    Angel Cabrera         141
8    Rafael Cabrera-Bello  141
9    Martin Laird          141
10   Ryan Moore            142
15. 2013 British Open Golf Tournament
Pos  Name                  Rounds 1-2  Rounds 3-4
1    Miguel Angel Jimenez  139         150
2    Henrik Stenson        140         144
3    Lee Westwood          140         145
4    Tiger Woods           140         146
5    Dustin Johnson        140         153
6    Zach Johnson          141         145
7    Angel Cabrera         141         147
8    Rafael Cabrera-Bello  141         150
9    Martin Laird          141         153
10   Ryan Moore            142         151
16. 2013 British Open Golf Tournament
Pos  Name                  Rounds 1-2  Rounds 3-4
1    Miguel Angel Jimenez  139         150
2    Henrik Stenson        140         144
3    Lee Westwood          140         145
4    Tiger Woods           140         146
5    Dustin Johnson        140         153
6    Zach Johnson          141         145
7    Angel Cabrera         141         147
8    Rafael Cabrera-Bello  141         150
9    Martin Laird          141         153
10   Ryan Moore            142         151
Average increase = 7.9
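The average increase can be checked directly from the 2013 scores. A short Python sketch with the score pairs copied from the table (in golf a higher score is worse, so an increase means the leaders played less well in rounds 3-4):

```python
# (rounds 1-2 total, rounds 3-4 total) for the ten leaders after round 2.
scores = [(139, 150), (140, 144), (140, 145), (140, 146), (140, 153),
          (141, 145), (141, 147), (141, 150), (141, 153), (142, 151)]

# Increase in score from the first pair of rounds to the second pair.
increases = [later - earlier for earlier, later in scores]
average_increase = sum(increases) / len(increases)

print(average_increase)  # 7.9
```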
17. 2012 British Open Golf Tournament
Pos  Name              Rounds 1-2  Rounds 3-4
1    Brandt Snedeker   130         147
2    Adam Scott        131         143
3    Tiger Woods       134         143
4    Thorbjorn Olesen  135         145
5    Graeme McDowell   136         142
6    Thomas Aiken      136         143
7    Matt Kuchar       136         144
8    Jason Dufner      136         147
9    Paul Lawrie       136         148
10   Ernie Els         137         136
Average increase = 9.1
18. 2011 British Open Golf Tournament
Pos  Name              Rounds 1-2  Rounds 3-4
1    Darren Clarke     136         139
2    Lucas Glover      136         147
3    Thomas Bjorn      137         142
4    Chad Campbell     137         143
5    Martin Kaymer     137         146
6    Miguel A Jimenez  137         150
7    Dustin Johnson    138         140
8    Davis Love III    138         144
9    George Coetzee    138         146
10   Charl Schwartzel  138         147
Average increase = 7.2
19. 2010 British Open Golf Tournament
Pos  Name                 Rounds 1-2  Rounds 3-4
1    Louis Oosthuizen     132         140
2    Mark Calcavecchia    137         157
3    Lee Westwood         138         141
4    Paul Casey           138         142
5    Jin Jeong            138         146
6    Alejandro Canizares  138         148
7    Retief Goosen        139         142
8    Sean O’Hair          139         143
9    Tom Lehman           139         145
10   Graeme McDowell      139         146
Average increase = 7.3
20. The problem
• Treatment applied at sites with high number of
accidents
– eg k ≥ 4 accidents in before period
– speed cameras: ≥ 8 PICs/km in three years
• Accidents reduce even if nothing done
– bias produced by selection criterion (RTM)
– so need to allow for (or avoid) that in the analysis
– and also allow for other effects: eg trend
21. Problem and possible approaches
• observed before frequency is not a reliable
measure of true frequency
• Empirical Bayes Method (EBM)
– use predictive accident model to estimate µ (µ is a function of site variables: flow, length ..)
– combine observed accidents y with µ to give m
• Use time series data for each camera site
– as for “transparency” data
– before period, selection period, after installation
22. Empirical Bayes Method
• What is the expected value of true mean m
– given the observed value of no. accidents y?
• Bayes’ Theorem
– prior distribution for m from predictive accident model
– combine with observed y
– to give posterior estimate of m
• Depends on the model and its precision
$\hat{m} = \alpha \mu + (1 - \alpha)\, y$, where $\alpha = \left(1 + \mu / K\right)^{-1}$
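The estimate is a weighted average of the model prediction µ and the observed count y, with the weight α set by µ and K. A minimal Python sketch of the formula; the function name and the example values are hypothetical, and reading K as the precision (shape) parameter of the predictive accident model is an assumption based on the usual negative binomial formulation:

```python
def eb_estimate(y, mu, K):
    """Empirical Bayes estimate of the true mean m at one site.

    y  : observed number of accidents in the before period
    mu : number of accidents predicted by the accident model
    K  : precision parameter of the model (assumed interpretation)
    """
    alpha = 1.0 / (1.0 + mu / K)           # weight given to the model
    return alpha * mu + (1.0 - alpha) * y  # shrinks y towards mu


# Hypothetical site: 8 accidents observed, 3 predicted, K = 2.
m_hat = eb_estimate(y=8, mu=3.0, K=2.0)
print(round(m_hat, 2))  # 6.0
```

The estimate always lies between µ and y. A larger K (a more precise model) pushes α towards 1, so the estimate relies more on the model and less on the observed count.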
23. RTM in camera partnerships data
• Asked by DfT to work with UCL and PA on four-year report
– previous Napier / Liverpool EPSRC research
– carry out our analysis on subset of data
– allow for trend and RTM
– see how much apparent effect of cameras is due to RTM, and how much is real
Overall reduction = trend + RTM + camera effect
24. So ….
• Subset of 216 sites for which data available
– urban sites (30 and 40 mph limits)
– traffic flows and number of junctions/km
• Use this data in an existing predictive accident
model to calculate number of accidents µ to be
expected at such a site
• Best estimate of true mean number of accidents
in before period at the site is then:
$\hat{m} = \alpha \mu + (1 - \alpha)\, y$, where $\alpha = \left(1 + \mu / K\right)^{-1}$
25. Results – for FSCs
FSCs/site/year:  before 1.05   after 0.48   (-54%)
Overall reduction = 0.57 = 0.10 (trend) + 0.36 (RTM) + 0.11 (camera)
              54% = 10% (trend) + 34% (RTM) + 10% (camera)
relative to what would have been:
  50% allowing for trend
  19% allowing for trend + RTM
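The two "relative to what would have been" figures can be reproduced from the decomposition. The sketch below assumes they are computed by subtracting the trend component (and then the RTM component) from the before rate to get the rate expected without a camera; that reading is an inference from the numbers, not something the slide states. With the rounded inputs shown, the trend-only figure comes out just under 50%; the slide's 50% was presumably computed from unrounded values.

```python
before, after = 1.05, 0.48            # FSCs per site per year
trend, rtm, camera = 0.10, 0.36, 0.11  # components of the reduction

overall = before - after

# Rate expected in the after period had no camera been installed,
# under two different baselines.
expected_trend_only = before - trend
expected_trend_rtm = before - trend - rtm

effect_trend_only = (expected_trend_only - after) / expected_trend_only
effect_trend_rtm = (expected_trend_rtm - after) / expected_trend_rtm

print(f"overall reduction: {overall:.2f} ({overall / before:.0%})")
print(f"allowing for trend only:  {effect_trend_only:.1%}")
print(f"allowing for trend + RTM: {effect_trend_rtm:.1%}")
```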
26. RAC Foundation method
• Report written by Richard Allsop in June 2013
• Not all partnerships have yet published data
– in varied formats originally
– data from ten partnerships analysed in report
– available on RACF website as .csv files
– now in standard format: one row per camera per year
– annual data for 21 years: 1990 – 2010
– main interest on PICs and FSCs, but also casualties
– trend given by partnership annual totals
27.
28. Data periods
• For each camera, years divided into periods
– before
– site selection period (SSP): 3 years (not specified?)
– transition (year camera installed): specified
– post-installation (camera period)
• If camera installed in mid-2000
– SSP assumed to be 97-99
– before period is then 90-96
– camera period is 01-10
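The division into periods can be sketched as a small helper. This is an illustration only: the function name and dictionary keys are hypothetical, and it assumes, as in the example above, that the SSP is the three full years immediately before the installation year.

```python
def split_periods(install_year, first_year=1990, last_year=2010, ssp_years=3):
    """Assign each year of data to before / ssp / transition / camera.

    Assumes the SSP is the `ssp_years` full years immediately
    preceding the year in which the camera was installed.
    """
    ssp_start = install_year - ssp_years
    periods = {"before": [], "ssp": [], "transition": [], "camera": []}
    for year in range(first_year, last_year + 1):
        if year < ssp_start:
            periods["before"].append(year)
        elif year < install_year:
            periods["ssp"].append(year)
        elif year == install_year:
            periods["transition"].append(year)
        else:
            periods["camera"].append(year)
    return periods


p = split_periods(2000)
print(p["before"][0], p["before"][-1])  # 1990 1996
print(p["ssp"])                         # [1997, 1998, 1999]
print(p["camera"][0], p["camera"][-1])  # 2001 2010
```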
29. Form of model
• accidents y_it Poisson distributed with mean m_it
• m_it proportional to partnership total P_t
• rate factored in SSP by α (RTM effect)
• rate factored in camera period by β
• dummy (0/1) variables to indicate period
– before, SSP or camera
• Poisson regression model to estimate α and β
– with confidence intervals
• or, equivalent but simpler, multinomial model
– split of total accidents between periods
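One way to write the model described above is given below. The notation is assumed for illustration and does not come from the slide: $c_i$ is a site-specific constant, and $S_{it}$ and $C_{it}$ are the 0/1 dummies for site $i$ being in its SSP or its camera period in year $t$.

```latex
y_{it} \sim \mathrm{Poisson}(m_{it}), \qquad
m_{it} = c_i \, P_t \, \alpha^{S_{it}} \, \beta^{C_{it}}

% Log-linear form, with \log P_t entering as an offset:
\log m_{it} = \log c_i + \log P_t + S_{it} \log \alpha + C_{it} \log \beta

% Conditioning on each site's total removes c_i: the split of the total
% between periods is multinomial with probabilities proportional to
% P_B, \; \alpha P_S, \; \beta P_C
% where P_B, P_S, P_C are the partnership totals summed over the site's
% before, SSP and camera years.
```

The last step is the standard result that independent Poisson counts, conditioned on their sum, follow a multinomial distribution, which is why the two formulations give the same estimates of α and β.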
30. Form of data required
For each camera (eg PICs at LCR C1)
                   Before  SSP    After
No. years          9       3      8
Site total         78      31     43
Partnership total  32376   11308  24148
Compare the numbers in each period relative to partnership totals
Approximate camera estimate = (43/24148) / (78/32376) = 0.739
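The approximate estimate for a single camera is a ratio of shares, which the following Python sketch reproduces from the table. The variable names are illustrative. The SSP ratio is computed in the same way as a rough indication of the RTM factor; that second figure is not quoted on the slide.

```python
# Accident counts by period for one camera (PICs at LCR C1)
# and for its partnership as a whole.
site = {"before": 78, "ssp": 31, "after": 43}
partnership = {"before": 32376, "ssp": 11308, "after": 24148}

# Site accidents as a share of partnership accidents in each period;
# dividing by the partnership total is what allows for trend.
share = {period: site[period] / partnership[period] for period in site}

camera_factor = share["after"] / share["before"]  # approximate camera effect
rtm_factor = share["ssp"] / share["before"]       # approximate RTM effect

print(round(camera_factor, 3))  # 0.739
print(round(rtm_factor, 2))     # 1.14
```

A camera factor below 1 indicates fewer accidents than the before-period share would predict, and an SSP factor above 1 indicates the raised level during site selection.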
33. Timing of the SSP?
• longer gap from end of SSP to installation?
• is there an ASBiC period?
– after selection but before installation of camera
– mean rate drops to the before level
• if SSP is earlier, some RTM in the before period
– hence inflates camera benefit
– important to get timing of assumed SSP right
– assuming it is not known
34. Accs/yr (adjusted for trend)
[Figure: schematic of accidents per year against time. Periods marked: Pre-SSP, SSP, ASBiC, Post-installation. Features marked: RTM, Installation of camera, Camera effect.]
35. Plot from all ten partnerships
• 553 cameras in total
• Line up sites by installation date so year 1 is first post-camera year
• Transition year omitted
• Scaled and averaged
• Clear signs of raised level before assumed SSP
• So how to define SSP?
• Leave out 4 years instead of 3?
• Find 3-yrs with max accidents to find “most likely” SSP?
[Figure: plot with periods marked before, SSP?, post-camera]
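The last suggestion, finding the three consecutive pre-installation years with the most accidents, can be sketched as a sliding-window search. The function name and the annual counts below are hypothetical, and the sketch assumes the years are consecutive and in ascending order; ties go to the earliest window.

```python
def most_likely_ssp(years, counts, install_year, window=3):
    """Find the pre-installation window with the most accidents.

    years, counts : parallel lists of calendar years and annual counts
    Only windows that end before install_year are considered.
    Returns (first year of the window, accidents in the window).
    """
    best_start, best_total = None, -1
    for i in range(len(years) - window + 1):
        if years[i + window - 1] >= install_year:
            break  # window would overlap the installation year
        total = sum(counts[i:i + window])
        if total > best_total:
            best_start, best_total = years[i], total
    return best_start, best_total


# Hypothetical annual counts for one camera installed in 2000.
years = list(range(1990, 2011))
counts = [3, 2, 4, 3, 2, 5, 7, 6, 3, 2, 2, 1, 2, 1, 1, 2, 1, 1, 0, 1, 1]
print(most_likely_ssp(years, counts, install_year=2000))  # (1995, 18)
```

In this made-up series the peak window is 1995-97 rather than the assumed 1997-99, the kind of mismatch the plot hints at.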
36. EBM or RACF method?
• EBM more complex
– allows for trend, using national totals
– requires a predictive accident model (PAM)
– and data on flows etc for each camera site
– robust to uncertainty about the timing of the SSP
• RACF method simpler in many respects
– allows for trend (in same way)
– needs long run of annual accident data
– comparison of accidents in before, SSP, camera periods
– estimates obtained by statistical model fitting – eg R
– but no PAM, and no flow data required
– but potentially sensitive to assumption of SSP
37. Summary
• EBM used in DfT 4-year evaluation report
– but requires reliable flow data for each site (and PAM)
– seen as complex
• RACF method has some advantages
– no PAM needed, no flow data needed
– but does need sufficient before accident data
– but arguments about what to assume about SSP
– discussions between Richard Allsop and me ..
– .. criticised by Idris Francis, Dave Finney and others ...
– lots of letters in Local Transport Today
– Allsop revising his recommended method