1. Model Development of
Tank Car Conditional Probability of Release
and Expected Quantity of Release
Xiaonan Zhou & Rapik Saat, Ph.D.
Rail Transportation and Engineering Center
Department of Civil Engineering
University of Illinois at Urbana-Champaign
INFORMS 2013 Annual Conference
October 7th 2013
Minneapolis, MN
2. Slide 2
Outline
• Study Background
– Hazardous Materials Transportation Risk
– Tank Car Conditional Probability of Release (CPR)
– Tank Car Expected Quantity of Release (EQR)
• Study Objectives
• Modelling Process Overview
– Dataset Refinement
– Variable Selection Process
– Model Estimation
• Future Work
3. Slide 3
Hazardous Materials Transportation Risk
R_Hazmat = f(P_A, CPR, EQR, C)
– P_A: Probability of an Accident
– CPR: Conditional Probability of Release
– EQR: Expected Quantity of Release
– C: Consequences of Release
4. Slide 4
Event Tree of Hazmat Release Accidents
[Figure: event tree with stages Probability of Accident, Probability of Release, Quantity of Release and Consequences of Release. Branches: Accident (Derailment/Collision or Non Derailment/Collision); Release or Non Release; consequences Toxic Cloud, Pool Fire, Explosion, Flash Fire.]
5. Slide 5
Definition and Measurement Methods
• Definition
– CPR: Probability of a release given that a tank car is involved in a
derailment or collision
– QR: Quantity of release from a tank car in an accident-caused
release incident
– EQR: Expected QR
• Measurement methods using historical data
– CPR = Number of Cars with a Release / Number of Cars Involved in
Accidents
– QR = Amount of Lading Lost / Tank Car Capacity
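The two measurement formulas above can be sketched in a few lines of Python. This is an illustration only: the `CarRecord` fields and the sample values are hypothetical and are not the TCAD schema, and estimating EQR as the sample mean of QR over cars that released is an assumption consistent with "EQR: Expected QR", not necessarily the authors' estimator.

```python
from dataclasses import dataclass


@dataclass
class CarRecord:
    """One tank car involved in an accident (hypothetical schema, not TCAD's)."""
    released: bool          # did the car lose any lading?
    lading_lost_gal: float  # gallons lost (0 if no release)
    capacity_gal: float     # tank capacity in gallons


def cpr(records):
    """Cars with a release divided by cars involved in accidents."""
    return sum(r.released for r in records) / len(records)


def qr(record):
    """Fraction of the tank capacity that was lost."""
    return record.lading_lost_gal / record.capacity_gal


def eqr(records):
    """Assumption: EQR estimated as the mean QR over cars that released."""
    released = [r for r in records if r.released]
    return sum(qr(r) for r in released) / len(released)


# Made-up records for illustration only.
records = [
    CarRecord(True, 5000.0, 20000.0),
    CarRecord(False, 0.0, 30000.0),
    CarRecord(True, 15000.0, 20000.0),
    CarRecord(False, 0.0, 25000.0),
]
print(cpr(records))  # 0.5
print(eqr(records))  # 0.5
```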
6. Slide 6
Motivation of Model Development
• Increasing concern with the safety of tank cars
transporting hazardous materials
• CPR and QR are two major factors affecting the risk of
hazardous material transportation
• The CPR and EQR models will provide
– Analysis of the performance of existing tank car designs and
safety features
– Predictions of the effects of tank car design modifications
– Guidance for future tank car designs
7. Slide 7
Data Source
• Railway Supply Institute (RSI) and the Association of American
Railroads (AAR) Tank Car Accident Database (TCAD)
• More than 40,000 records of tank cars involved in accidents
have been entered in TCAD since 1970
• The resulting database provides a robust source of information for
quantitative analysis of tank car safety design
8. Slide 8
Dataset Refinement
Filter | Number of Records Removed | Cumulative Number of Records Remaining
Accident occurring before 1980 (DOA) | 18,884 | 26,127
Cars built before 1970 (YCB) | 20,053 | 20,834
Without stub sills (Sill Type) | 10,334 | 20,542
Without shelf couplers (Coupler Type) | 21,962 | 18,424
Other than 100 ton and 110 ton cars (Tonnage) | 10,276 | 18,258
Release due to fire exposure (Fire Cause) | 848 | 18,180
Empty cars | 9,874 | 14,507
Material other than TC128, A515 or A516 | 19,886 | 13,553
Other than spec. 105, 111, 112, 114 or 211 | 5,485 | 13,519
Incomplete / Inconsistent Records | 6,805 | 6,665
Lading Lost (LDL) | - | 1,337
• As of 2010, the TCAD contains records for 45,011 tank cars that have been
involved in a collision or derailment
[Callouts on the slide: CPR, EQR]
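In the table above, the removed counts exceed the drop in the cumulative count (for example, 20,053 removed against a drop from 26,127 to 20,834), which suggests that each removed count is taken against the full database while the remaining count is cumulative. A minimal sketch of that bookkeeping, assuming this interpretation and using hypothetical field names and made-up records:

```python
def refine(records, filters):
    """Apply filters in order.

    For each filter, report how many records it would remove from the
    full dataset on its own and how many remain after applying it
    together with all earlier filters.
    """
    remaining = list(records)
    report = []
    for name, keep in filters:
        removed_alone = sum(1 for r in records if not keep(r))
        remaining = [r for r in remaining if keep(r)]
        report.append((name, removed_alone, len(remaining)))
    return remaining, report


# Hypothetical records and filter rules for illustration only.
records = [
    {"accident_year": 1975, "built_year": 1965, "empty": False},
    {"accident_year": 1985, "built_year": 1965, "empty": False},
    {"accident_year": 1990, "built_year": 1975, "empty": True},
    {"accident_year": 1995, "built_year": 1980, "empty": False},
    {"accident_year": 2005, "built_year": 1990, "empty": False},
]
filters = [
    ("Accident before 1980", lambda r: r["accident_year"] >= 1980),
    ("Car built before 1970", lambda r: r["built_year"] >= 1970),
    ("Empty car", lambda r: not r["empty"]),
]

kept, report = refine(records, filters)
for name, removed, left in report:
    print(f"{name}: removed {removed}, remaining {left}")
```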
9. Slide 9
Variables in the Model
Type of Variable | Name | Value
Tank Design | Material specification | TC128, A515 or A516
Tank Design | Material thickness | 0.437 to 1.033 inches
Tank Design | Presence of jacket | Yes/No
Tank Design | Insulation thickness | 0 to 8 inches
Tank Design | Tank inside diameter | 87 to 121 inches
Tank Design | Cargo tank capacity | 10,422 to 34,500 gallons
Tank Design | Presence of head shield | full, half or none
Tank Design | Pressure rating of top fittings | pressure or non-pressure
Tank Design | Presence of bottom fittings | Yes/No
Tank Design | Presence of external heating coils | Yes/No
Accident | Track type | mainline or yard
Accident | Accident type | derailment or collision
Accident | Number of cars derailed | continuous number
Accident | Train speed | 0 to 78 mph
Accident | Severity of impact | 0 to 78 mph
Accident | Hazard environment | 1 to 8
Commodities | Lading commodity | Lading A, B, C, D…
10. Slide 10
Statistical Approach to Modelling CPR & EQR
• Four main components of a tank car can fail and result
in a release of hazardous materials
– Shell
– Head
– Top Fittings (TF)
– Bottom Fittings (BF)
[Figure: tank car diagram labelling the Head, Shell, Top Fittings and Bottom Fittings]
12. Slide 12
Modelling Component Specific CPR
• The dependent variable observed for CPR is binary
– A release occurred (1)
– A release did not occur (0)
• CPR_i follows a logistic function
CPR_i = e^{L_i(X)} / (1 + e^{L_i(X)})
– i is head, shell, TF or BF
– L_i(X) is the likelihood function for CPR of each component, i.e.
the log-odds of release as a function of the variables X
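The logistic relationship above can be illustrated with a short Python sketch. It assumes that L_i(X) is linear in the explanatory variables, which is what a logistic regression fit would produce; the intercept, coefficients and inputs below are hypothetical and are not the fitted model.

```python
import math


def cpr_from_logit(L):
    """Logistic function e^L / (1 + e^L), written to avoid overflow."""
    if L >= 0:
        return 1.0 / (1.0 + math.exp(-L))
    e = math.exp(L)
    return e / (1.0 + e)


def linear_predictor(intercept, coefs, x):
    """Assumed form of L_i(X): intercept plus a weighted sum of variables."""
    return intercept + sum(b * xi for b, xi in zip(coefs, x))


# Hypothetical coefficients for (tank thickness in inches, train speed in mph).
L_head = linear_predictor(-2.0, [-3.0, 0.05], [0.5, 26.0])
print(cpr_from_logit(L_head))
```

A thicker tank lowers L in this made-up example and a higher speed raises it, so the predicted probability moves in the direction one would expect.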
13. Slide 13
3-Step Modelling Procedure for Component
Specific Likelihood Functions
• Data refinement & variable set selection
• Variable selection using R’s gMCP procedure with a logistic
distribution
– A bi-level variable selection procedure based on a coordinate
descent algorithm, first available in November 2012
– Used 10-fold cross-validation to identify the most influential
variables
• Coefficient estimation & model finalization using SAS’s
PROC LOGISTIC
– Developed models using combined statistical tests
– Selected the best performing model in which coefficients for
variables behave in an intuitive manner
• Hosmer-Lemeshow Test
• Area-Under-The-Curve / Concordance Index
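The concordance index used for model finalization can be computed directly from predicted probabilities and observed outcomes. The sketch below is a plain Python illustration, not SAS's implementation: it counts the share of (release, non-release) pairs in which the released car received the higher predicted probability, with ties counted as one half, a common convention. The sample outcomes and scores are made up.

```python
def concordance_index(y_true, y_score):
    """Area under the ROC curve computed by pairwise comparison."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one release and one non-release")
    total = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                total += 1.0   # concordant pair
            elif p == n:
                total += 0.5   # tied pair
    return total / (len(pos) * len(neg))


# Made-up outcomes (1 = release) and predicted probabilities.
y = [1, 0, 1, 0, 0]
scores = [0.9, 0.3, 0.4, 0.5, 0.4]
print(concordance_index(y, scores))  # 0.75
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation of released and non-released cars.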
14. Slide 14
Example of CPR_Head Modeling Result
• 25 initial variables/interactions
considered
• Variable selection:
– Min CVE ≅ 0.31
– Optimum lambda = 0.0851
– 8 variables/interactions selected
(includes main effects and
interaction terms)
• Model Finalization:
– Concordance Index of 0.7757
– Hosmer-Lemeshow P-value =
0.0599
[Figure: Head Model Cross Validation Error (CVE) Minimization through Coordinate Descent]
[Figure: Head Model ROC Curve]
16. Slide 16
Average Quantity of Lading Lost
for Each Component
Release Source | Average Lading Lost (Percentage of Car Capacity, %)
Head | 63
Shell | 63
Top Fittings | 13
Bottom Fittings | 24
Multiple Sources | 58
17. Slide 17
Distribution of Percentage of
Quantity of Lading Lost for Each Component
[Figure: bar chart. x-axis: Cause of Loss (Head, Shell, Top Fittings, Bottom Fittings, Multiple Sources); y-axis: Frequency (%), 0 to 90; series: percentage of lading lost in the ranges 0-5, 5-20, 20-50, 50-80, 80-100. Data labels in extraction order: 15, 14, 40, 79, 66, 6, 9, 10, 4, 10, 17, 12, 14, 3, 12, 20, 17, 11, 3, 2, 42, 49, 25, 11, 10.]
18. Slide 18
Modelling Component Specific EQR
• gMCP is designed for response variables following Gaussian,
binomial or logistic distributions
• The QR observations lie in the interval (0, 1) and follow a beta
distribution
• The logit transform maps EQR from (0, 1) to the real line:
logitEQR_i = log(EQR_i / (1 - EQR_i)) : (0, 1) → R
EQR_i = e^{logitEQR_i} / (1 + e^{logitEQR_i})
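The logit transform and its inverse can be checked with a short round trip in Python. This is a minimal sketch; the function names are illustrative and the sample value 0.24 is arbitrary.

```python
import math


def logit(q):
    """Map a release fraction in (0, 1) to the real line."""
    if not 0.0 < q < 1.0:
        raise ValueError("value must lie strictly between 0 and 1")
    return math.log(q / (1.0 - q))


def inv_logit(z):
    """Map a real number back to (0, 1); written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


q = 0.24
z = logit(q)
print(z, inv_logit(z))
```

Values of exactly 0 or 1 have no finite logit, which is why the function rejects them.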
19. Slide 19
3-Step Modelling Procedure for Component
Specific LogitEQR Function
• Variable set selection & data refinement
• Variable selection using R’s gMCP procedure with a Gaussian
distribution
– Used BIC to identify the most influential variables
– Used 10-fold cross-validation to identify additional influential variables
• Coefficient estimation & model finalization using SAS's PROC
GLIMMIX
– Used individual P-values to evaluate the inclusion of additional
influential variables
– Selected the best performing model in which coefficients for variables
behave in an intuitive manner
• Normality test P-value for the residuals
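The BIC criterion mentioned in the variable selection step can be illustrated for a Gaussian model, where BIC equals n ln(RSS/n) + k ln(n) up to an additive constant. The sketch below is a generic illustration and not gMCP's internal procedure; the candidate models, residual sums of squares and sample size are hypothetical.

```python
import math


def gaussian_bic(rss, n, k):
    """BIC (up to a constant) for a Gaussian model with k fitted parameters."""
    return n * math.log(rss / n) + k * math.log(n)


# Hypothetical candidates: name -> (residual sum of squares, parameter count).
n = 200
candidates = {
    "3 variables": (420.0, 3),
    "5 variables": (380.0, 5),
    "12 variables": (375.0, 12),
}

scores = {name: gaussian_bic(rss, n, k) for name, (rss, k) in candidates.items()}
best = min(scores, key=scores.get)
print(best)  # 5 variables
```

The 12-variable candidate fits slightly better but pays a larger penalty, so the lower BIC goes to the smaller model in this made-up example.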
20. Slide 20
Example of EQR_Bottom_Fittings Modeling Results
• 25 initial variables/interactions
considered
• Variable selection
– Min CVE ≅ 16.6
– Optimum lambda = 0.5058
– 12 variables/interactions selected
(includes main effects and interaction
terms)
• Model finalization
– Normality test P-value
of the residuals
[Figure: Bottom Fittings Model Cross Validation Error]
[Figure: Bottom Fittings Model Q-Q Plot]
21. Slide 21
Comparison between EQR Observation
Distribution and Prediction Distribution
[Figure: two panels, Observation Distribution and Prediction Distribution]
23. Slide 23
Example of Modeling Results:
Tank Thickness Effects
[Figure: line chart with series Head and Shell. x-axis: Tank Thickness (Inches), 0.4 to 1.0; y-axis: Percent Chance of Lading Loss (%), 0 to 16.]
All Bare Tanks: No Jacket, No Head Shield
(for the sixth car derailed in an 11-car derailment
at a train speed of 26 mph)
Note: Analysis of Class I railroad accidents from 2003 to 2012 in the FRA database indicated that, on average,
11 cars derailed per derailment on mainline or siding, at an average speed of 26 mph
24. Slide 24
Example of Modeling Results:
Head Shield Effects
[Figure: line chart with series No Head Shield, Half-Height HS and Full-Height HS. x-axis: Tank Thickness (Inches), 0.4 to 1.0; y-axis: Percent Chance of Lading Loss (%), 0.0 to 4.0.]
All Cars Jacketed/Insulated
(for the sixth car derailed in an 11-car derailment at
a train speed of 26 mph)
25. Slide 25
Example of Modeling Results:
Jacket/Insulation Effects
[Figure: line chart with series Head - No Jacket, Head w/Jacket, Shell - No Jacket and Shell w/Jacket. x-axis: Tank Thickness (Inches), 0.4 to 1.0; y-axis: Percent Chance of Lading Loss (%), 0 to 16.]
All Cars Unequipped with Head Shields
(for the sixth car derailed in an 11-car derailment at
a train speed of 26 mph)
26. Slide 26
Future Work
• Finalize the model calculator for the CPR model
• Compare the results with the RA-05-02 report
• Build the model calculator for the EQR model
• Analyze how tank thickness, head shield and tank
jacket affect EQR
27. Slide 27
Acknowledgements
• Funding for this research has been provided by the Railway Supply Institute
(RSI) – Association of American Railroads (AAR) Railroad Tank Car Safety
Research & Test Project
• CPR Model was developed by:
– Laura Ghosh
• Industry partnership and support has been provided by:
– Todd Treichel (RSI-AAR)
– Steve Kirkpatrick (Applied Research Associates, Inc.)
• Insight and assistance from UIUC have also been provided by:
– Rapik Saat
– Junho Song
– Chris Barkan
– Lanqing Hua from the Illinois Statistics Office
– Jesus Aguilar Serrano & Chen-Yu Lin
28. Slide 28
References
• http://www.jeffstrainsite.com
• http://www.westchestergov.com/emergserv/jdocs/railroadTank.htm
• Treichel, T. T., J. P. Hughes, C. P. L. Barkan, R. D. Sims, E. A. Phillips
and M. R. Saat (2006). Safety Performance of Tank Cars in Accidents:
Probabilities of Lading Loss (RA-05-02). RSI-AAR Railroad Tank Car
Safety Research and Test Project.
• Barkan, Christopher P.L., Satish V. Ukkusuri and S. Travis Waller
(2007). Optimizing the design of railway tank cars to minimize accident-
caused releases. Computers & Operations Research, Volume 34, Issue
5, Pages 1266-1286, ISSN 0305-0548, 10.1016/j.cor.2005.06.002.
(http://www.sciencedirect.com/science/article/pii/S0305054805001814)
• Ma, S., J. Huang, F. Wei, Y. Xie and K. Fang (2011). Integrative analysis
of multiple cancer prognosis studies with gene expression
measurements. Statistics in Medicine, Volume 30, Issue 28, Pages 3361-3371,
doi: 10.1002/sim.4337. Epub 2011 Aug 25.
(http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3399910/)