Prob-Dist-Toll-Forecast-Uncertainty
December 16, 2015
In [2]: # special IPython command to prepare the notebook for matplotlib
%matplotlib inline
import numpy as np
import pandas as pd
import scipy as sp
import seaborn as sns
import matplotlib.pyplot as plt
Estimating the probability distribution of a travel demand forecast
Authors: John L. Bowman, Dinesh Gopinath, and Moshe Ben-Akiva
0.0.1 Algorithm
1. Identify the variables that induce error in the toll revenue prediction: $x = (x_1, x_2, \ldots, x_k, \ldots, x_K)$
• Simple toll revenue model: the variables that induce error in the toll revenue prediction $r^{(p)}$ are (1) Value of Time, (2) Population, (3) Households, (4) Employment
• Let $x_1$ = Value of Time, $x_2$ = Population, $x_3$ = Households, $x_4$ = Employment. Thus $K = 4$, i.e. a 4-dimensional space of possible outcomes $x_k$
• $x = (x_1, x_2, x_3, x_4)$
2. Obtain the probability distribution of $x_k$ for $k = 1, 2, \ldots, K$. The distribution can be based on (a) direct input or (b) an assumption, e.g. Triangular, Normal, etc. For each dimension $k$, discretize the assumed probability distribution and identify a small set of discrete outcomes $x_k^{n_k}$, where $n_k = 1, 2, 3, \ldots, N_k$ (assign probabilities $p(x_k^{n_k})$ to these discrete outcomes based on reasoning and empirical evidence to approximate $x_k$'s true distribution???):
• Let $N_1 = 4$, $N_2 = 3$, $N_3 = 5$, $N_4 = 4$ and $x_k = \{x_k^1, x_k^2, x_k^3, \ldots, x_k^{N_k}\}$
• $x_1$ discrete outcomes = $\{x_1^1, x_1^2, x_1^3, x_1^4\}$, with $p(x_1^1) + p(x_1^2) + p(x_1^3) + p(x_1^4) = 1$
• $x_2$ discrete outcomes = $\{x_2^1, x_2^2, x_2^3\}$, with $p(x_2^1) + p(x_2^2) + p(x_2^3) = 1$
• $x_3$ discrete outcomes = $\{x_3^1, x_3^2, x_3^3, x_3^4, x_3^5\}$, with $p(x_3^1) + p(x_3^2) + p(x_3^3) + p(x_3^4) + p(x_3^5) = 1$
• $x_4$ discrete outcomes = $\{x_4^1, x_4^2, x_4^3, x_4^4\}$, with $p(x_4^1) + p(x_4^2) + p(x_4^3) + p(x_4^4) = 1$
3. Develop the toll revenue model for the baseline scenario:
• Get the predicted $r^{(p)}_{base}$ for the baseline scenario $x = (x_1^{base}, x_2^{base}, x_3^{base}, x_4^{base})$ from the output of the model
4. Run the toll revenue model once for each variable that induces error in the prediction:
• Get predicted $r^{(p)}_{k=1}$ based on $x = (x_1^{extreme}, x_2^{base}, x_3^{base}, x_4^{base})$
• Get predicted $r^{(p)}_{k=2}$ based on $x = (x_1^{base}, x_2^{extreme}, x_3^{base}, x_4^{base})$
• Get predicted $r^{(p)}_{k=3}$ based on $x = (x_1^{base}, x_2^{base}, x_3^{extreme}, x_4^{base})$
• Get predicted $r^{(p)}_{k=4}$ based on $x = (x_1^{base}, x_2^{base}, x_3^{base}, x_4^{extreme})$
5. Calculate the relative change in predicted toll revenue and in each variable that induces error in the prediction:
• $r_k^{change} = \frac{r^{(p)}_{base} - r^{(p)}_k}{r^{(p)}_k}$, where $k = 1, 2, 3, 4$
• $x_k^{change} = \frac{x_k^{base} - x_k^{extreme}}{x_k^{extreme}}$, where $k = 1, 2, 3, 4$
6. Calculate the elasticity of toll revenue with respect to each variable that induces error in the prediction:
• $e_k^r = \frac{r_k^{change}}{x_k^{change}}$, where $k = 1, 2, 3, 4$
7. Define a set of scenarios $S = \{(x_1^{n_1}, \ldots, x_k^{n_k}, \ldots, x_K^{n_K});\ n_k = 1, 2, 3, \ldots, N_k;\ k = 1, 2, \ldots, K\}$, covering all combinations of the discrete outcomes in all $K = 4$ dimensions
• For the simple example, $S = \{(x_1^1, x_2^1, x_3^1, x_4^1), (x_1^1, x_2^1, x_3^1, x_4^2), \ldots, (x_1^1, x_2^3, x_3^5, x_4^4), \ldots, (x_1^4, x_2^3, x_3^5, x_4^4)\}$
• Number of scenarios in $S$ = $\prod_{k=1}^{K} N_k$. Thus the number of scenarios in the simple example is $4 \times 3 \times 5 \times 4 = 240$, and $s = 1, 2, 3, \ldots, 240$
• Using $s$ as a 1-dimensional index of the members of $S$, refer to a single member of $S$ as $x^{(s)} = (x_1^{(s)}, x_2^{(s)}, \ldots, x_k^{(s)}, \ldots, x_K^{(s)})$. For the simple example: $x^{(s=1)} = (x_1^1, x_2^1, x_3^1, x_4^1)$; $x^{(s=2)} = (x_1^1, x_2^1, x_3^1, x_4^2)$; $x^{(s=240)} = (x_1^4, x_2^3, x_3^5, x_4^4)$
8. Calculate the probability of each scenario. The error variables are mutually independent, so the probability of each scenario is $p(s) = \prod_{k=1}^{K} p(x_k^{(s)})$, $s \in S$. For the simple example:
• $p(s = 1) = \prod_{k=1}^{4} p(x_k^{(s=1)}) = p(x_1^1)\, p(x_2^1)\, p(x_3^1)\, p(x_4^1)$
• $p(s = 2) = \prod_{k=1}^{4} p(x_k^{(s=2)}) = p(x_1^1)\, p(x_2^1)\, p(x_3^1)\, p(x_4^2)$
• $p(s = 240) = \prod_{k=1}^{4} p(x_k^{(s=240)}) = p(x_1^4)\, p(x_2^3)\, p(x_3^5)\, p(x_4^4)$
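9. Calculate the toll revenue for scenario $s$: $r^{(s)} = r^{(p)}_{base} \prod_{k=1}^{K} \left( \frac{x_k^{(s)}}{x_k^{base}} \right)^{e_k^r}$, $s \in S$. For the simple example:
• $r^{(s=1)} = r^{(p)}_{base} \prod_{k=1}^{4} \left( \frac{x_k^{(s=1)}}{x_k^{base}} \right)^{e_k^r} = r^{(p)}_{base} \left( \frac{x_1^1}{x_1^{base}} \right)^{e_1^r} \left( \frac{x_2^1}{x_2^{base}} \right)^{e_2^r} \left( \frac{x_3^1}{x_3^{base}} \right)^{e_3^r} \left( \frac{x_4^1}{x_4^{base}} \right)^{e_4^r}$
• $r^{(s=2)} = r^{(p)}_{base} \prod_{k=1}^{4} \left( \frac{x_k^{(s=2)}}{x_k^{base}} \right)^{e_k^r} = r^{(p)}_{base} \left( \frac{x_1^1}{x_1^{base}} \right)^{e_1^r} \left( \frac{x_2^1}{x_2^{base}} \right)^{e_2^r} \left( \frac{x_3^1}{x_3^{base}} \right)^{e_3^r} \left( \frac{x_4^2}{x_4^{base}} \right)^{e_4^r}$
• $r^{(s=240)} = r^{(p)}_{base} \prod_{k=1}^{4} \left( \frac{x_k^{(s=240)}}{x_k^{base}} \right)^{e_k^r} = r^{(p)}_{base} \left( \frac{x_1^4}{x_1^{base}} \right)^{e_1^r} \left( \frac{x_2^3}{x_2^{base}} \right)^{e_2^r} \left( \frac{x_3^5}{x_3^{base}} \right)^{e_3^r} \left( \frac{x_4^4}{x_4^{base}} \right)^{e_4^r}$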
10. Using the pairs $r^{(s)}$ and $p(s)$, plot the revenue CDF
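The ten steps can be traced end to end on the simple example with K = 4 and (N1, N2, N3, N4) = (4, 3, 5, 4). The sketch below is illustrative only: every numeric input (outcome values, probabilities, base and extreme-run revenues) is a hypothetical placeholder rather than model output, and the variable names are not from the original.

```python
import itertools
import math

# Hypothetical inputs (illustration only, not model output).
x_base = [20.0, 1000.0, 400.0, 600.0]       # base value of each variable
x_extreme = [25.0, 1100.0, 440.0, 660.0]    # perturbed value used in step 4
r_base = 100.0                              # revenue from the baseline run
r_extreme = [110.0, 105.0, 102.0, 108.0]    # revenue from each one-at-a-time run

# Steps 5-6: relative changes and the elasticity of revenue to each variable.
elasticity = []
for k in range(4):
    r_change = (r_base - r_extreme[k]) / r_extreme[k]
    x_change = (x_base[k] - x_extreme[k]) / x_extreme[k]
    elasticity.append(r_change / x_change)

# Step 2: discrete outcomes and their probabilities, N = (4, 3, 5, 4).
outcomes = [
    [15.0, 20.0, 25.0, 30.0],
    [900.0, 1000.0, 1100.0],
    [360.0, 380.0, 400.0, 420.0, 440.0],
    [540.0, 600.0, 660.0, 720.0],
]
probs = [
    [0.2, 0.4, 0.3, 0.1],
    [0.25, 0.5, 0.25],
    [0.1, 0.2, 0.4, 0.2, 0.1],
    [0.2, 0.4, 0.3, 0.1],
]

# Steps 7-9: enumerate every combination, multiply the outcome
# probabilities, and scale the base revenue by the elasticity terms.
scenarios = []
for idx in itertools.product(*(range(len(o)) for o in outcomes)):
    p_s = math.prod(probs[k][n] for k, n in enumerate(idx))
    r_s = r_base * math.prod(
        (outcomes[k][n] / x_base[k]) ** elasticity[k] for k, n in enumerate(idx)
    )
    scenarios.append((r_s, p_s))

# Step 10: sort by revenue and accumulate probability to get the CDF points.
scenarios.sort()
cdf = list(itertools.accumulate(p for _, p in scenarios))
```

With these inputs the enumeration yields 240 scenarios, and the scenario in which every variable sits at its base value reproduces `r_base` exactly.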
0.0.2 Sources of Uncertainty - Toll (Kockelman)
• Estimates of trip generation
• Estimates of land development
• Models: Trip Generation, Trip Distribution, Mode Choice
• Toll-technology adoption rates
• Heterogeneity in Value of Time (VOT) savings
• Network attributes - traffic congestion (low-volume corridors have greater uncertainty in their forecasts)
• Uncertainty in land development patterns
• Demographic and employment projections
• Tolling design - shadow tolls (the government pays the concessionaire an amount based on toll road use, so the road is effectively toll-free for drivers) or user-paid tolls (drivers' willingness to pay is more complex and difficult to understand, which means more forecasting risk)
• Tolling culture of a region, i.e. the degree to which tolls have been used in the past
• Travel demand model imperfections (heterogeneity of VOT is ignored; variable tolls or HOT lanes that are free at certain hours)
• Competitive advantage - a toll on the only bridge vs. a toll on a freeway, where drivers have more route options
• User attributes - toll facilities serving a small market segment of travelers allow more reliable forecasts than those serving heterogeneous populations
• Road location, configurations
• Demand variations over times of day and days of the year also affect forecast reliability
• Bain and Wilkins (2002) - poorly estimated VOTTs, economic downturns, mis-prediction of future land use patterns, lower than predicted time savings, added competition, lower than anticipated truck usage, high variability in traffic volumes
• Economic growth and related changes in income and employment
• Total Demand Model errors
• Model error in elasticity of demand
• Value of time
• Errors in measurement of network times and costs
• Operating speed
• Roadway improvements
1 Texas North Tarrant Express Segment 3A
• Revenue and Transaction Forecast Year = 2018
2018 Revenue and Transactions
• Forecasted 2018 Annual Project Revenue (000’s 2011 Dollars) = 27612
• Forecasted 2018 Daily Transactions = 40086
Truck VOT Calculations
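• SOV VOT - Lognormal distribution with mean = $18.59 and standard deviation = $7.4 (µ = 2.849 and σ = 0.383)
• Coefficient of variation of SOV VOT: $C_v^{sov} = \frac{7.4}{18.59} = 0.398$
• HHM Truck VOT: Mean = $36.48 and Standard deviation = $30.24
• AECOM Truck VOT: Mean = $60.76 and Standard deviation = $51.08
• Average Truck VOT = (HHM Truck VOT + AECOM Truck VOT) / 2 = $48.62
• Standard deviation of Average Truck VOT = SOV coefficient of variation × Average Truck VOT = 0.398 × 48.62 = $19.35 (µ = 3.811 and σ = 0.383, calculations below)
• µ and σ are the parameters of the underlying normal distribution, computed from the mean $m$ and standard deviation $s$ as $\mu = \ln\left( m / \sqrt{1 + s^2/m^2} \right)$ and $\sigma = \sqrt{\ln\left( 1 + s^2/m^2 \right)}$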
In [3]: # Parameters for Truck Lognormal Distribution
m = 48.62  # arithmetic mean of truck VOT
s = 19.35  # arithmetic standard deviation of truck VOT
# Moment matching: mean and std. dev. of the underlying normal distribution
truck_ln_mu = np.log(m/np.sqrt(1+((s**2)/(m**2))))
truck_ln_sigma = np.sqrt(np.log(1+((s**2)/(m**2))))
print('truck_ln_mu = %1.3f' % truck_ln_mu)
print('truck_ln_sigma = %1.3f' % truck_ln_sigma)
truck_ln_mu = 3.811
truck_ln_sigma = 0.383
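As a sanity check on the moment matching, the log-space parameters can be mapped back to the arithmetic mean and standard deviation with the standard lognormal identities mean = exp(µ + σ²/2) and sd = mean × sqrt(exp(σ²) − 1). The sketch below is a minimal round trip; the function names are illustrative and not part of the notebook.

```python
import math

def lognormal_params(mean, sd):
    """Moment-match a lognormal: return (mu, sigma) of the underlying normal."""
    ratio = 1.0 + (sd / mean) ** 2
    return math.log(mean / math.sqrt(ratio)), math.sqrt(math.log(ratio))

def lognormal_moments(mu, sigma):
    """Inverse mapping: arithmetic mean and standard deviation."""
    mean = math.exp(mu + 0.5 * sigma ** 2)
    sd = mean * math.sqrt(math.exp(sigma ** 2) - 1.0)
    return mean, sd

# Truck and SOV value-of-time inputs quoted in the text above.
mu_truck, sigma_truck = lognormal_params(48.62, 19.35)
mu_sov, sigma_sov = lognormal_params(18.59, 7.4)

# Round trip: should recover the truck mean and standard deviation.
mean_back, sd_back = lognormal_moments(mu_truck, sigma_truck)
```

Because the truck standard deviation is built from the SOV coefficient of variation, both distributions end up with essentially the same σ (about 0.383); only µ differs.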
$r^{(p)}_{base}$ = 27612 (000's 2011 Dollars)
Variables: Sources of Uncertainty
• Truck VOT: $x_1$
– Elasticity of Revenue to Truck VOT = 0.994
– $x_1^{base}$ = 60.76 (dollars)
– Probability distribution: Lognormal with mean = $48.62 and std. dev = $19.35 (µ = 3.811 and σ = 0.383)
• Travel Demand: $x_2$
– Elasticity of Revenue to Demand (Transactions as proxy) = 2.57
– $x_2^{base}$ = 40086
– Probability distribution: Normal with µ = 51860.5 and σ = 11774.5
• Car VOT Growth: $x_3$
– Elasticity of Revenue to Car VOT Growth = 0.19
– $x_3^{base}$ = 2.1%
– Probability distribution: Triangular with Min = 0.5%, Mean = 1.05%, and Max = 2.1%
• Truck VOT Growth: $x_4$
– Elasticity of Revenue to Truck VOT Growth = 0.19
– $x_4^{base}$ = 2.5%
– Probability distribution: Triangular with Min = 0.5%, Mean = 1.25%, and Max = 2.5%
Truck VOT Probability Distribution: Lognormal
In [4]: mu = 3.811
sigma = 0.383
low = 1
high = 120
dx_1 = 2 # Length of interval
# Comb points along x axis
x_1 = np.arange(low, high, dx_1)
# Compute y values: lognormal pdf at each value of x
vot_y = (1/(sigma * x_1 * np.sqrt(2 * np.pi))) * np.exp(-0.5 * ((np.log(x_1) - mu)/sigma) ** 2)
# Plot the function
plt.figure(figsize = (16, 8))
plt.stem(x_1, vot_y, markerfmt = ' ') # This draws the intervals
plt.xlabel('$x_1$')
plt.ylabel('$p(x_1)$')
plt.title('Discretized Log-Normal Probability Density')
# Approximate total probability: density times interval width, summed over the grid
area = np.sum(dx_1 * vot_y)
print('Probability Sum = %1.4f' % area)
print('N_1 = %d' % len(x_1))
temp1 = np.array([x_1, vot_y * dx_1])
Probability Sum = 0.9946
N_1 = 60
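The density-times-width sum above comes to 0.9946 rather than 1, largely because the grid stops at 120. An alternative sketch, not the notebook's method, gives each grid point the exact probability of a bin of width dx centered on it, computed from the lognormal CDF, and then renormalizes. The bin edges and the renormalization step are assumptions of this sketch, and `lognorm_cdf` is an illustrative helper.

```python
import math

mu, sigma = 3.811, 0.383
dx = 2
centers = list(range(1, 120, dx))   # same grid as x_1 above

def lognorm_cdf(x):
    """CDF of a lognormal with log-mean mu and log-sd sigma."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - mu) / (sigma * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))

# Probability mass of the bin [c - dx/2, c + dx/2] around each grid point.
mass = [lognorm_cdf(c + dx / 2) - lognorm_cdf(c - dx / 2) for c in centers]

# Mass actually covered by the grid, i.e. P(0 < x <= 120).
covered = sum(mass)

# Renormalize so the discrete probabilities sum to 1.
prob = [m / covered for m in mass]
```

Here `covered` comes out at roughly 0.995, so the shortfall in the printed sum is almost entirely truncation of the upper tail rather than discretization error.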
Travel Demand Probability Distribution: Normal
In [5]: # Mean Transactions = (2035 Transactions + 2018 Transactions)/2
print('Mean Transactions = %s' % ((63635+40086)/2.0))
# Std Dev Transactions = (2035 Transactions - 2018 Transactions)/2
print('Std. Dev Transactions = %s' % ((63635-40086)/2.0))
Mean Transactions = 51860.5
Std. Dev Transactions = 11774.5
In [6]: demand_mean = 51860.5
demand_sd = 11774.5
demand_low = demand_mean - 3 * demand_sd # low end of x axis
demand_high = demand_mean + 3 * demand_sd # high end of x axis
dx_2 = 2000 # Length of interval
# Comb points along x axis
x_2 = np.arange(demand_low, demand_high, dx_2)
# Compute y values: normal pdf at each value of x
demand_y = (1/(demand_sd * np.sqrt(2 * np.pi))) * np.exp(-0.5 * ((x_2 - demand_mean)/demand_sd) ** 2)
# Plot the function
plt.figure(figsize = (16, 8))
plt.stem(x_2, demand_y, markerfmt = ' ') # This draws the intervals
plt.xlabel('$Demand$')
plt.ylabel('$p(Demand)$')
plt.title('Discretized Normal Probability Density')
# Approximate total probability: density times interval width, summed over the grid
area = np.sum(dx_2 * demand_y)
print('Probability Sum = %1.4f' % area)
print('N_2 = %d' % len(x_2))
Probability Sum = 0.9978
N_2 = 36
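Taking the mean as the midpoint of the 2018 and 2035 forecasts and the standard deviation as half their difference places the 2018 forecast exactly one standard deviation below the mean. The short sketch below shows what that implies, using only the normal CDF written with the error function; `norm_cdf` is an illustrative helper, not part of the notebook.

```python
import math

t_2018, t_2035 = 40086, 63635            # forecast daily transactions
demand_mean = (t_2035 + t_2018) / 2.0    # midpoint of the two forecasts
demand_sd = (t_2035 - t_2018) / 2.0      # half the difference

def norm_cdf(x, mean, sd):
    """Normal CDF expressed through the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

# Probability that demand falls below the 2018 forecast (one sd below the mean).
p_below_2018 = norm_cdf(t_2018, demand_mean, demand_sd)

# Probability mass covered by the grid range mean +/- 3 sd.
p_grid = (norm_cdf(demand_mean + 3 * demand_sd, demand_mean, demand_sd)
          - norm_cdf(demand_mean - 3 * demand_sd, demand_mean, demand_sd))
```

Under these assumptions roughly 16% of the probability mass lies below the 2018 forecast, and the ±3σ range covers about 99.7% of the distribution, in line with the printed sum of 0.9978.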
Car VOT Growth Probability Distribution: Triangular
In [7]: min_growth_car = 0.004
mean_growth_car = 0.0105
max_growth_car = 0.022
# np.random.triangular takes (left, mode, right), so mean_growth_car is used as the mode
car_array = np.random.triangular(min_growth_car, mean_growth_car, max_growth_car, size = 100000)
#plt.hist(car_array, bins = 10)
car_val = np.histogram(car_array, bins = 20)
# Relative frequency of each bin
car_y = [float(i)/np.sum(car_val[0]) for i in car_val[0]]
# Use the bin midpoints as the discrete outcomes
x_car = car_val[1]
x_3 = []
for i in range(len(x_car) - 1):
    temp = (x_car[i] + x_car[i+1])/2
    x_3.append(temp)
# Plot triangular distribution
plt.figure(figsize = (16, 8))
plt.stem(x_3, car_y, markerfmt = ' ') # This draws the intervals
plt.xlabel('$Car Growth$')
plt.ylabel('$p(Car Growth)$')
plt.title('Discretized Triangular Probability Density')
print(len(x_3))
print(np.sum(car_y))
20
1.0
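`np.random.triangular` takes (left, mode, right), so the value named `mean_growth_car` acts as the mode of the distribution. The Monte Carlo histogram can also be replaced by exact bin probabilities from the triangular CDF. The sketch below is an alternative under that reading, not the notebook's method: it uses 20 equal-width bins over [left, right] rather than over the sampled range, and all names in it are illustrative.

```python
left, mode, right = 0.004, 0.0105, 0.022
n_bins = 20

def tri_cdf(x):
    """CDF of a triangular distribution with the parameters above."""
    if x <= left:
        return 0.0
    if x >= right:
        return 1.0
    if x <= mode:
        return (x - left) ** 2 / ((right - left) * (mode - left))
    return 1.0 - (right - x) ** 2 / ((right - left) * (right - mode))

# Equal-width bin edges, bin midpoints, and exact bin probabilities.
width = (right - left) / n_bins
edges = [left + i * width for i in range(n_bins + 1)]
x_3_exact = [(edges[i] + edges[i + 1]) / 2 for i in range(n_bins)]
car_y_exact = [tri_cdf(edges[i + 1]) - tri_cdf(edges[i]) for i in range(n_bins)]

# Mean of a triangular distribution: (left + mode + right) / 3.
tri_mean = (left + mode + right) / 3.0
```

With these parameters the distribution's actual mean is about 1.22%, somewhat above the 1.05% passed in as the mode.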
Truck VOT Growth Probability Distribution: Triangular
In [8]: min_growth_truck = 0.004
mean_growth_truck = 0.0125
max_growth_truck = 0.026
# The size argument is cut off in the source; 100000 is assumed to match the car VOT growth cell
truck_array = np.random.triangular(min_growth_truck, mean_growth_truck, max_growth_truck, size = 100000)
#plt.hist(car_array, bins = 10)
truck_val = np.histogram(truck_array, bins = 20)
# Relative frequency of each bin
truck_y = [float(i)/np.sum(truck_val[0]) for i in truck_val[0]]
# Use the bin midpoints as the discrete outcomes
x_truck = truck_val[1]
x_4 = []
for i in range(len(x_truck) - 1):
    temp = (x_truck[i] + x_truck[i+1])/2
    x_4.append(temp)
# Plot triangular distribution
plt.figure(figsize = (16, 8))
plt.stem(x_4, truck_y, markerfmt = ' ') # This draws the intervals
plt.xlabel('$Truck Growth$')
plt.ylabel('$p(Truck Growth)$')
plt.title('Discretized Triangular Probability Density')
print(len(x_4))
print(np.sum(truck_y))
20
1.0
Scenarios
In [9]: S = [[i, j, k, l] for i in x_1 for j in x_2 for k in x_3 for l in x_4]
print(S[0])
print('\n')
print('Number of Scenarios = ' + str(len(S)))
[1, 16537.0, 0.0044911329557659595, 0.0045996835827973384]

Number of Scenarios = 864000
Probability and Revenue Calculations for Scenarios
In [10]: # Constants: Base Revenue
rp_base = 27612
# Constants: Base values of variables
x_1b = 60.76
x_2b = 40086
x_3b = 0.021
x_4b = 0.025
# Constants: Elasticities of variables
e_x1 = 0.994
e_x2 = 2.57
e_x3 = 0.19
e_x4 = 0.19
revenue_S = []
prob_S = []
# Three long lines below are cut off in the source. They are completed from the
# step 9 revenue formula and the densities in In [4] and In [6]; the dx_1 and dx_2
# factors are assumed because 0.9946 * 0.9978 matches the printed sum of 0.9924.
for i in range(len(S)):
    # R(s): base revenue scaled by (x_k / x_k_base) ** elasticity for each variable
    temp_rev = rp_base * (S[i][0]/x_1b)**(e_x1) * (S[i][1]/x_2b)**(e_x2) * (S[i][2]/x_3b)**(e_x3) * (S[i][3]/x_4b)**(e_x4)
    revenue_S.append(temp_rev)
    # Probability calculation:
    # Truck VOT: lognormal density times interval width
    p_x1 = (1/(sigma * S[i][0] * np.sqrt(2 * np.pi))) * np.exp(-0.5 * ((np.log(S[i][0]) - mu)/sigma) ** 2) * dx_1
    # Demand: normal density times interval width
    p_x2 = (1/(demand_sd * np.sqrt(2 * np.pi))) * np.exp(-0.5 * ((S[i][1] - demand_mean)/demand_sd) ** 2) * dx_2
    # Car VOT Growth
    if S[i][2] in x_3:
        cp = x_3.index(S[i][2])
        p_x3 = car_y[cp]
    # Truck VOT Growth
    if S[i][3] in x_4:
        tp = x_4.index(S[i][3])
        p_x4 = truck_y[tp]
    prob_S.append(p_x1 * p_x2 * p_x3 * p_x4)
print('Probability Sum = %0.4f' % np.sum(prob_S))
Probability Sum = 0.9924
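Looping over 864000 scenarios in Python is slow. The same revenue and probability arrays can be built with NumPy broadcasting. The sketch below is a possible restructuring rather than the notebook's code: the small grids and probability masses are hypothetical stand-ins for (x_1, vot_y * dx_1), (x_2, demand_y * dx_2), (x_3, car_y) and (x_4, truck_y), while the base values and elasticities are the constants from the cell above.

```python
import numpy as np

# Hypothetical stand-ins for the four discretized distributions.
x1, p1 = np.array([30.0, 50.0, 70.0]), np.array([0.3, 0.5, 0.2])
x2, p2 = np.array([40000.0, 52000.0]), np.array([0.4, 0.6])
x3, p3 = np.array([0.008, 0.012, 0.018]), np.array([0.3, 0.4, 0.3])
x4, p4 = np.array([0.010, 0.020]), np.array([0.5, 0.5])

rp_base = 27612
base = (60.76, 40086, 0.021, 0.025)
elast = (0.994, 2.57, 0.19, 0.19)

# One factor per variable: (x_k / x_k_base) ** e_k
f1, f2, f3, f4 = [(x / b) ** e for x, b, e in zip((x1, x2, x3, x4), base, elast)]

# Trailing singleton axes make the shapes (n1,1,1,1), (n2,1,1), (n3,1), (n4,),
# which broadcast to the full (n1, n2, n3, n4) scenario grid.
revenue = rp_base * f1[:, None, None, None] * f2[:, None, None] * f3[:, None] * f4
prob = p1[:, None, None, None] * p2[:, None, None] * p3[:, None] * p4

# Flatten in C order: x_1 varies slowest and x_4 fastest, matching
# "for i in x_1 for j in x_2 for k in x_3 for l in x_4".
revenue_S = revenue.ravel()
prob_S = prob.ravel()
```

The flattened arrays can be sorted and accumulated exactly like the list-based versions, without materializing the scenario list `S`.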
In [11]: # Sorting Result based on Revenue
output = (np.array([revenue_S, prob_S])).T
output = output[output[:, 0].argsort()]
In [12]: # Plotting Cumulative Probability Distribution
plt.figure(figsize = (16, 8))
plt.plot(output[:,0], np.cumsum(output[:,1]), linewidth = 2) # Column 0 = revenue, column 1 = probability
# Plotting Predicted Revenue
plt.axvline(x = rp_base, color = 'r')
plt.text(rp_base + 500, 0.1, 'Predicted Revenue for 2018', fontsize = 16)
# Remove Scientific Notation
ax = plt.gca()
ax.get_xaxis().get_major_formatter().set_scientific(False)
plt.xlabel('$Revenue$', fontsize = 16)
plt.ylabel('$p(Revenue)$', fontsize = 16)
plt.title("Cumulative Probability Distribution - Revenue (000's 2011 Dollars)", fontsize = 16)
# Set tick label size
plt.tick_params(axis = 'both', which = 'major', labelsize = 14)
print('Probability Sum = %0.4f' % np.sum(prob_S))
print('Demand std = %d' % demand_sd)
Probability Sum = 0.9924
Demand std = 11774
Percentile Calculation
In [14]: year = 2018
cum_prob = pd.DataFrame({'Revenue': output[:,0], 'Cumulative Probability': np.cumsum(output[:,1])})
# P(Revenue < r) = percentile -> Find r
# Row position = number of scenarios with cumulative probability <= percentile
# (the index after ".shape[" is cut off in the source; ".shape[0]" is assumed)
# 75+
percentile_75 = cum_prob['Revenue'][cum_prob[cum_prob['Cumulative Probability'] <= 0.75].shape[0]]
percentile_85 = cum_prob['Revenue'][cum_prob[cum_prob['Cumulative Probability'] <= 0.85].shape[0]]
percentile_95 = cum_prob['Revenue'][cum_prob[cum_prob['Cumulative Probability'] <= 0.95].shape[0]]
# 25-
percentile_05 = cum_prob['Revenue'][cum_prob[cum_prob['Cumulative Probability'] <= 0.05].shape[0]]
percentile_15 = cum_prob['Revenue'][cum_prob[cum_prob['Cumulative Probability'] <= 0.15].shape[0]]
percentile_25 = cum_prob['Revenue'][cum_prob[cum_prob['Cumulative Probability'] <= 0.25].shape[0]]
# Print values
print('75th Percentile of ' + str(year) + ' Revenue = %0.2f' % percentile_75)
print('85th Percentile of ' + str(year) + ' Revenue = %0.2f' % percentile_85)
print('95th Percentile of ' + str(year) + ' Revenue = %0.2f' % percentile_95)
print('5th Percentile of ' + str(year) + ' Revenue = %0.2f' % percentile_05)
print('15th Percentile of ' + str(year) + ' Revenue = %0.2f' % percentile_15)
print('25th Percentile of ' + str(year) + ' Revenue = %0.2f' % percentile_25)
print(percentile_95)
print(percentile_85)
print(percentile_75)
print(percentile_25)
print(percentile_15)
print(percentile_05)
75th Percentile of 2018 Revenue = 49146.23
85th Percentile of 2018 Revenue = 62173.69
95th Percentile of 2018 Revenue = 91921.62
5th Percentile of 2018 Revenue = 8193.37
15th Percentile of 2018 Revenue = 14082.97
25th Percentile of 2018 Revenue = 18857.58
91921.6153431
62173.6864964
49146.2293139
18857.5804073
14082.965342
8193.3714675
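The percentile lookup can also be written as a small weighted-quantile function. The sketch below is one possible formulation and differs from the cell above in two ways, both assumptions of this sketch: it renormalizes the cumulative probabilities so the CDF ends at 1 (the scenario probabilities sum to 0.9924), and it returns the first revenue whose cumulative probability reaches the requested level. The function name and the tiny data set are hypothetical.

```python
import numpy as np

def weighted_percentile(revenue, prob, q):
    """Smallest revenue whose normalized cumulative probability is >= q."""
    revenue = np.asarray(revenue, dtype=float)
    prob = np.asarray(prob, dtype=float)
    order = np.argsort(revenue)             # sort scenarios by revenue
    cum = np.cumsum(prob[order])
    cum /= cum[-1]                          # renormalize so the CDF ends at 1
    idx = np.searchsorted(cum, q, side='left')
    idx = min(idx, len(cum) - 1)            # guard against round-off near q = 1
    return revenue[order][idx]

# Tiny hypothetical example: four scenarios given in unsorted order.
rev = [300.0, 100.0, 200.0, 400.0]
p = [0.2, 0.1, 0.3, 0.4]
median = weighted_percentile(rev, p, 0.5)
p90 = weighted_percentile(rev, p, 0.9)
```

For the full analysis the same call would take `revenue_S` and `prob_S` directly, for example `weighted_percentile(revenue_S, prob_S, 0.95)`.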