Prob-Dist-Toll-Forecast-Uncertainty
December 16, 2015
In [2]: # special IPython command to prepare the notebook for matplotlib
%matplotlib inline
import numpy as np
import pandas as pd
import scipy as sp
import seaborn as sns
import matplotlib.pyplot as plt
Estimating the probability distribution of a travel demand forecast
Authors: John L. Bowman, Dinesh Gopinath, and Moshe Ben-Akiva
0.0.1 Algorithm
1. Identify variables that induce error in Toll Revenue prediction: $x = (x_1, x_2, ..., x_k, ..., x_K)$
• Simple Toll Revenue Model - Variables that induce error in Toll Revenue Prediction $r^{(p)}$ are: (1) Value of Time, (2) Population, (3) Households, (4) Employment
• Let: $x_1$ = Value of Time, $x_2$ = Population, $x_3$ = Households, $x_4$ = Employment. Thus $K = 4$, i.e. a 4-dimensional space of possible outcomes $x_k$
• $x = (x_1, x_2, x_3, x_4)$
2. Obtain the probability distribution of $x_k$ for $k = 1, 2, \ldots, K$. The distribution can be based on: (a) direct input or (b) an assumption, e.g. Triangular, Normal, etc. For each dimension $k$, discretize the assumed probability distribution and identify a small set of discrete outcomes $x_k^{n_k}$, where $n_k = 1, 2, 3, \ldots, N_k$ (assign probabilities $p(x_k^{n_k})$ to these discrete outcomes, based on reasoning and empirical evidence, to approximate the true distribution of $x_k$):
• Let $N_1 = 4$, $N_2 = 3$, $N_3 = 5$, $N_4 = 4$ and $x_k = \{x_k^1, x_k^2, x_k^3, \ldots, x_k^{N_k}\}$
• $x_1$ discrete outcomes = $\{x_1^1, x_1^2, x_1^3, x_1^4\}$, with $p(x_1^1) + p(x_1^2) + p(x_1^3) + p(x_1^4) = 1$
• $x_2$ discrete outcomes = $\{x_2^1, x_2^2, x_2^3\}$, with $p(x_2^1) + p(x_2^2) + p(x_2^3) = 1$
• $x_3$ discrete outcomes = $\{x_3^1, x_3^2, x_3^3, x_3^4, x_3^5\}$, with $p(x_3^1) + p(x_3^2) + p(x_3^3) + p(x_3^4) + p(x_3^5) = 1$
• $x_4$ discrete outcomes = $\{x_4^1, x_4^2, x_4^3, x_4^4\}$, with $p(x_4^1) + p(x_4^2) + p(x_4^3) + p(x_4^4) = 1$
3. Develop Toll Revenue Model for Baseline Scenario: get the predicted $r^{(p)}_{base}$ for the baseline scenario $x = (x_1^{base}, x_2^{base}, x_3^{base}, x_4^{base})$ from the output of the model
4. Run the Toll Revenue Model once for each variable that induces error in the prediction:
• Get predicted $r^{(p)}_{k=1}$ based on $x = (x_1^{extreme}, x_2^{base}, x_3^{base}, x_4^{base})$
• Get predicted $r^{(p)}_{k=2}$ based on $x = (x_1^{base}, x_2^{extreme}, x_3^{base}, x_4^{base})$
• Get predicted $r^{(p)}_{k=3}$ based on $x = (x_1^{base}, x_2^{base}, x_3^{extreme}, x_4^{base})$
• Get predicted $r^{(p)}_{k=4}$ based on $x = (x_1^{base}, x_2^{base}, x_3^{base}, x_4^{extreme})$
5. Calculate the change in predicted Toll Revenue and in the variables that induce error in the prediction:
• $r_k^{change} = \frac{r^{(p)}_{base} - r^{(p)}_k}{r^{(p)}_k}$, where $k = 1, 2, 3, 4$
• $x_k^{change} = \frac{x_k^{base} - x_k^{extreme}}{x_k^{extreme}}$, where $k = 1, 2, 3, 4$
6. Calculate the Elasticity of Toll Revenue with respect to the variables that induce error in the prediction:
• $e_k^r = \frac{r_k^{change}}{x_k^{change}}$, where $k = 1, 2, 3, 4$
7. Define a set of scenarios: $S = \{(x_1^{n_1}, ..., x_k^{n_k}, ..., x_K^{n_K});\ n_k = 1, 2, 3, ..., N_k;\ k = 1, 2, ..., K\}$, covering all combinations of the discrete outcomes in all $K = 4$ dimensions
• For the simple example, $S = \{(x_1^1, x_2^1, x_3^1, x_4^1), (x_1^1, x_2^1, x_3^1, x_4^2), \ldots, (x_1^1, x_2^3, x_3^5, x_4^4), \ldots, (x_1^4, x_2^3, x_3^5, x_4^4)\}$
• Number of scenarios in $S$ = $\prod_{k=1}^{K} N_k$. Thus the number of scenarios in the simple example is $4 \times 3 \times 5 \times 4 = 240$, and $s = 1, 2, 3, \ldots, 240$
• Using $s$ as a 1-dimensional index of the members of $S$: refer to a single member of $S$ as $x^{(s)} = (x_1^{(s)}, x_2^{(s)}, ..., x_k^{(s)}, ..., x_K^{(s)})$. For the simple example: $x^{(s=1)} = (x_1^1, x_2^1, x_3^1, x_4^1)$; $x^{(s=2)} = (x_1^1, x_2^1, x_3^1, x_4^2)$; $x^{(s=240)} = (x_1^4, x_2^3, x_3^5, x_4^4)$
8. Calculate the probability of each scenario: the error variables are assumed to be mutually independent, so the probability of each scenario is given by $p(s) = \prod_{k=1}^{K} p(x_k^{(s)})$, $s \in S$. Thus for the simple example:
• $p(s = 1) = \prod_{k=1}^{4} p(x_k^{(s=1)}) = p(x_1^1)\,p(x_2^1)\,p(x_3^1)\,p(x_4^1)$
• $p(s = 2) = \prod_{k=1}^{4} p(x_k^{(s=2)}) = p(x_1^1)\,p(x_2^1)\,p(x_3^1)\,p(x_4^2)$
• $p(s = 240) = \prod_{k=1}^{4} p(x_k^{(s=240)}) = p(x_1^4)\,p(x_2^3)\,p(x_3^5)\,p(x_4^4)$
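9. Calculate the Toll Revenue for scenario $s$: $r^{(s)} = r^{(p)}_{base} \prod_{k=1}^{K} \left(\frac{x_k^{(s)}}{x_k^{base}}\right)^{e_k^r}$, $s \in S$. Thus for the simple example:
• $r^{(s=1)} = r^{(p)}_{base} \prod_{k=1}^{4} \left(\frac{x_k^{(s=1)}}{x_k^{base}}\right)^{e_k^r} = r^{(p)}_{base} \left(\frac{x_1^1}{x_1^{base}}\right)^{e_1^r} \left(\frac{x_2^1}{x_2^{base}}\right)^{e_2^r} \left(\frac{x_3^1}{x_3^{base}}\right)^{e_3^r} \left(\frac{x_4^1}{x_4^{base}}\right)^{e_4^r}$
• $r^{(s=2)} = r^{(p)}_{base} \prod_{k=1}^{4} \left(\frac{x_k^{(s=2)}}{x_k^{base}}\right)^{e_k^r} = r^{(p)}_{base} \left(\frac{x_1^1}{x_1^{base}}\right)^{e_1^r} \left(\frac{x_2^1}{x_2^{base}}\right)^{e_2^r} \left(\frac{x_3^1}{x_3^{base}}\right)^{e_3^r} \left(\frac{x_4^2}{x_4^{base}}\right)^{e_4^r}$
• $r^{(s=240)} = r^{(p)}_{base} \prod_{k=1}^{4} \left(\frac{x_k^{(s=240)}}{x_k^{base}}\right)^{e_k^r} = r^{(p)}_{base} \left(\frac{x_1^4}{x_1^{base}}\right)^{e_1^r} \left(\frac{x_2^3}{x_2^{base}}\right)^{e_2^r} \left(\frac{x_3^5}{x_3^{base}}\right)^{e_3^r} \left(\frac{x_4^4}{x_4^{base}}\right)^{e_4^r}$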
10. Using the pairs $r^{(s)}$ and $p(s)$, plot the Revenue CDF
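Steps 5 through 10 can be sketched end to end on a toy problem. All numbers below (base values, extreme runs, outcome grids, probabilities) are hypothetical, and two variables are used instead of four to keep the output small. This is an illustration of the procedure under those assumptions, not the authors' implementation.

```python
import itertools
import numpy as np

# Hypothetical baseline run and one "extreme" run per variable (steps 3-4)
r_base = 1000.0
x_base = [50.0, 200.0]
x_extreme = [40.0, 160.0]
r_extreme = [800.0, 900.0]

# Steps 5-6: relative changes and elasticities
elasticities = []
for k in range(len(x_base)):
    r_change = (r_base - r_extreme[k]) / r_extreme[k]
    x_change = (x_base[k] - x_extreme[k]) / x_extreme[k]
    elasticities.append(r_change / x_change)

# Step 2 (assumed here): discrete outcomes and their probabilities
outcomes = [[40.0, 50.0, 60.0], [160.0, 200.0]]
probs = [[0.25, 0.5, 0.25], [0.4, 0.6]]

# Steps 7-9: enumerate scenarios, then probability and revenue of each
revenue, prob = [], []
for idx in itertools.product(*[range(len(o)) for o in outcomes]):
    p, r = 1.0, r_base
    for k, n in enumerate(idx):
        p *= probs[k][n]  # independence: probabilities multiply
        r *= (outcomes[k][n] / x_base[k]) ** elasticities[k]
    prob.append(p)
    revenue.append(r)

# Step 10: sort by revenue and accumulate probability to get the CDF
order = np.argsort(revenue)
revenue_sorted = np.array(revenue)[order]
cdf = np.cumsum(np.array(prob)[order])
for r, c in zip(revenue_sorted, cdf):
    print('%8.1f  %.3f' % (r, c))
```

The scenario built from the base values reproduces `r_base` exactly, which is a quick sanity check on the revenue formula.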
0.0.2 Sources of Uncertainty - Toll (Kockelman)
• Estimates of trip generation
• Estimates of land development
• Models: Trip Generation, Trip Distribution, Mode Choice
• Toll-technology adoption rates
• Heterogeneity in Value of Time (VOT) savings
• Network attributes - Traffic congestion (low-volume corridors have greater uncertainty in their forecasts)
• Uncertainty in land development patterns
• Demographic and employment projections
• Tolling design - shadow tolls (govt. pays the concessionaire an amount based on toll road use - similar to toll free) or user-paid tolls (drivers' willingness to pay is more complex and difficult to understand - more forecasting risk)
• Tolling culture of a region, i.e. the degree to which tolls have been used in the past
• Travel demand model imperfections (heterogeneity of VOT is ignored; variable tolls or HOT lanes that are free at certain hours)
• Competitive advantage - Toll on the only bridge vs toll on freeway - more options to route
• User attributes - toll facilities serving a small market segment of travelers allow more reliable forecasts than those serving heterogeneous populations
• Road location, configurations
• Demand variations over times of days and days of the year also affect forecast reliability
• Bain and Wilkins (2002) - poorly estimated VOTTs, economic downturns, mis-prediction of future land use patterns, lower than predicted time savings, added competition, lower than anticipated truck usage, high variability in traffic volumes
• Economic growth and related changes in income and employment
• Total Demand Model errors
• Model error in elasticity of demand
• Value of time
• Errors in measurement of network times and costs
• Operating speed
• Roadway improvements
1 Texas North Tarrant Express Segment 3A
• Revenue and Transaction Forecast Year = 2018
2018 Revenue and Transactions
• Forecasted 2018 Annual Project Revenue (000’s 2011 Dollars) = 27612
• Forecasted 2018 Daily Transactions = 40086
Truck VOT Calculations
• SOV VOT - Lognormal distribution with mean = $18.59 and standard deviation = $7.4 (µ = 2.849 and σ = 0.383)
• Coefficient of variation, $C_v^{sov} = \frac{7.4}{18.59} = 0.398$
• HHM Truck VOT: Mean = $36.48 and Standard deviation = $30.24
• AECOM Truck VOT: Mean = $60.76 and Standard deviation = $51.08
• Average Truck VOT = (HHM Truck VOT + AECOM Truck VOT) / 2 = $48.62
• Standard deviation of Average Truck VOT = $C_v^{sov}$ × Average Truck VOT = $19.35 (µ = 3.811 and σ = 0.383, calculations below)
In [3]: # Parameters for Truck Lognormal Distribution
m = 48.62 # mean of Average Truck VOT
s = 19.35 # standard deviation of Average Truck VOT
truck_ln_mu = np.log(m/np.sqrt(1+((s**2)/(m**2))))
truck_ln_sigma = np.sqrt(np.log(1+((s**2)/(m**2))))
print('truck_ln_mu = %1.3f' % truck_ln_mu)
print('truck_ln_sigma = %1.3f' % truck_ln_sigma)
truck_ln_mu = 3.811
truck_ln_sigma = 0.383
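The conversion from a mean and standard deviation to the lognormal parameters µ and σ can be checked by building the distribution and reading its moments back. The sketch below assumes SciPy's parameterization, in which `scipy.stats.lognorm` takes σ as the shape `s` and exp(µ) as `scale`; the variable names are illustrative.

```python
import numpy as np
from scipy import stats

m, s = 48.62, 19.35  # mean and std. dev. of Average Truck VOT from the text
sigma = np.sqrt(np.log(1 + (s / m) ** 2))
mu = np.log(m) - 0.5 * sigma ** 2  # algebraically equal to log(m / sqrt(1 + s^2/m^2))

# SciPy convention: shape s = sigma, scale = exp(mu)
dist = stats.lognorm(s=sigma, scale=np.exp(mu))
print('mu = %.3f, sigma = %.3f' % (mu, sigma))
print('mean = %.2f, std = %.2f' % (dist.mean(), dist.std()))
```

If the round trip returns the original mean and standard deviation, the two parameter formulas are consistent with each other.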
$r^{(p)}_{base}$ = 74754 (000’s 2011 Dollars)
Variables: Sources of Uncertainty
• Truck VOT: $x_1$
– Elasticity of Revenue to Truck VOT = 0.994
– $x_1^{base}$ = $60.76
– Probability distribution: Lognormal with mean = $48.62 and std. dev = $19.35 (µ = 3.811 and σ = 0.383)
• Travel Demand: $x_2$
– Elasticity of Revenue to Demand (Transactions as proxy) = 2.57
– $x_2^{base}$ = 61056
– Probability distribution: Normal with µ = 58871.5 and σ = 2184.5
• Car VOT Growth: $x_3$
– Elasticity of Revenue to Car VOT Growth = 0.19
– $x_3^{base}$ = 2.1%
– Probability distribution: Triangular with Min = 0.5%, Mean = 1.05%, and Max = 2.1%
• Truck VOT Growth: $x_4$
– Elasticity of Revenue to Truck VOT Growth = 0.19
– $x_4^{base}$ = 2.5%
– Probability distribution: Triangular with Min = 0.5%, Mean = 1.25%, and Max = 2.5%
Truck VOT Probability Distribution: Lognormal
In [4]: mu = 3.811
sigma = 0.383
low = 1
high = 120
dx_1 = 2 # Length of interval
# Comb points along x axis
x_1 = np.arange(low, high, dx_1)
# Compute y values: pdf at each value of x
vot_y = (1/(sigma * x_1 * np.sqrt(2 * np.pi))) * np.exp(-0.5 * ((np.log(x_1) - mu)/sigma) ** 2)
# Plot the function
plt.figure(figsize = (16, 8))
plt.stem(x_1, vot_y, markerfmt = ' ') # This draws the intervals
plt.xlabel('$x_1$')
plt.ylabel('$p(x_1)$')
plt.title('Discretized Log-Normal Probability Density')
area = np.sum(dx_1 * vot_y)
print('Probability Sum = %1.4f' % area)
print('N_1 = %d' % len(x_1))
temp1 = np.array([x_1, vot_y * dx_1])
Probability Sum = 0.9946
N_1 = 60
Travel Demand Probability Distribution: Normal
In [5]: # Mean Transactions = (2035 Transactions + 2018 Transactions)/2
print('Mean Transactions = %s' % ((63635+40086)/2.0))
# Std Dev Transactions = (2035 Transactions - 2018 Transactions)/2
print('Std. Dev Transactions = %s' % ((63635-40086)/2.0))
Mean Transactions = 51860.5
Std. Dev Transactions = 11774.5
In [6]: demand_mean = 51860.5
demand_sd = 11774.5
demand_low = demand_mean - 3 * demand_sd # low end of x axis
demand_high = demand_mean + 3 * demand_sd # high end of x axis
dx_2 = 2000 # Length of interval
# Comb points along x axis
x_2 = np.arange(demand_low, demand_high, dx_2)
# Compute y values: pdf at each value of x
# (the end of this line was cut off in the source; it is completed here as the Normal pdf)
demand_y = (1/(demand_sd * np.sqrt(2 * np.pi))) * np.exp(-0.5 * ((x_2 - demand_mean)/demand_sd) ** 2)
# Plot the function
plt.figure(figsize = (16, 8))
plt.stem(x_2, demand_y, markerfmt = ' ') # This draws the intervals
plt.xlabel('$Demand$')
plt.ylabel('$p(Demand)$')
plt.title('Discretized Normal Probability Density')
area = np.sum(dx_2 * demand_y)
print('Probability Sum = %1.4f' % area)
print('N_2 = %d' % len(x_2))
Probability Sum = 0.9978
N_2 = 36
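The cells above approximate the probability of each discrete outcome as pdf × interval width, which is why the sums fall slightly short of 1. One possible alternative, sketched here as an assumption rather than as the notebook's method, is to take the exact mass of a bin centered on each grid point from the CDF and then renormalize for the truncated tails. The names `p_riemann`, `p_exact` and `p_norm` are illustrative.

```python
import numpy as np
from scipy import stats

demand_mean, demand_sd, dx = 51860.5, 11774.5, 2000.0
x = np.arange(demand_mean - 3 * demand_sd, demand_mean + 3 * demand_sd, dx)

dist = stats.norm(loc=demand_mean, scale=demand_sd)
p_riemann = dist.pdf(x) * dx                            # pdf times width, as in the cell above
p_exact = dist.cdf(x + dx / 2) - dist.cdf(x - dx / 2)   # exact mass of each centered bin
p_norm = p_exact / p_exact.sum()                        # renormalize the truncated tails

print(len(x), p_riemann.sum(), p_exact.sum(), p_norm.sum())
```

Renormalizing makes the scenario probabilities sum to 1, so percentiles read from the cumulative distribution do not need a separate correction.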
Car VOT Growth Probability Distribution: Triangular
In [7]: min_growth_car = 0.004
mean_growth_car = 0.0105
max_growth_car = 0.022
car_array = np.random.triangular(min_growth_car, mean_growth_car, max_growth_car, size = 100000)
#plt.hist(car_array, bins = 10)
car_val = np.histogram(car_array, bins = 20)
# Relative frequency of each bin
car_y = [float(i)/np.sum(car_val[0]) for i in car_val[0]]
# Binwidth issue: use bin midpoints as the discrete outcomes
x_car = car_val[1]
x_3 = []
for i in range(len(x_car) - 1):
    temp = (x_car[i] + x_car[i+1])/2
    x_3.append(temp)
# Plot triangular distribution
plt.figure(figsize = (16, 8))
plt.stem(x_3, car_y, markerfmt = ' ') # This draws the intervals
plt.xlabel('$Car Growth$')
plt.ylabel('$p(Car Growth)$')
plt.title('Discretized Triangular Probability Density')
print(len(x_3))
print(np.sum(car_y))
20
1.0
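The cell above estimates the bin probabilities from 100,000 random draws, so the values change from run to run. As a hedged alternative (an assumption, not the notebook's approach), the same 20 bin probabilities can be computed exactly from the triangular CDF. Note that `scipy.stats.triang` expects the mode as a fraction `c` of the support width, and that the bin edges here span the full support, whereas `np.histogram` uses the sample minimum and maximum.

```python
import numpy as np
from scipy import stats

left, mode, right = 0.004, 0.0105, 0.022  # values used in the cell above
dist = stats.triang(c=(mode - left) / (right - left), loc=left, scale=right - left)

edges = np.linspace(left, right, 21)      # 20 equal-width bins over the support
x_mid = 0.5 * (edges[:-1] + edges[1:])    # bin midpoints as discrete outcomes
p_bin = np.diff(dist.cdf(edges))          # exact probability of each bin
print(len(x_mid), p_bin.sum())
```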
Truck VOT Growth Probability Distribution: Triangular
In [8]: min_growth_truck = 0.004
mean_growth_truck = 0.0125
max_growth_truck = 0.026
# NOTE: the size argument was cut off in the source; 100000 is assumed here, matching the car cell
truck_array = np.random.triangular(min_growth_truck, mean_growth_truck, max_growth_truck, size = 100000)
#plt.hist(car_array, bins = 10)
truck_val = np.histogram(truck_array, bins = 20)
# Relative frequency of each bin
truck_y = [float(i)/np.sum(truck_val[0]) for i in truck_val[0]]
# Binwidth issue: use bin midpoints as the discrete outcomes
x_truck = truck_val[1]
x_4 = []
for i in range(len(x_truck) - 1):
    temp = (x_truck[i] + x_truck[i+1])/2
    x_4.append(temp)
# Plot triangular distribution
plt.figure(figsize = (16, 8))
plt.stem(x_4, truck_y, markerfmt = ' ') # This draws the intervals
plt.xlabel('$Truck Growth$')
plt.ylabel('$p(Truck Growth)$')
plt.title('Discretized Triangular Probability Density')
print(len(x_4))
print(np.sum(truck_y))
20
1.0
Scenarios
In [9]: S = [[i, j, k, l] for i in x_1 for j in x_2 for k in x_3 for l in x_4]
print(S[0])
print('\n')
print('Number of Scenarios = ' + str(len(S)))
[1, 16537.0, 0.0044911329557659595, 0.0045996835827973384]
Number of Scenarios = 864000
Probability and Revenue Calculations for Scenarios
In [10]: # Constants: Base Revenue
rp_base = 27612
# Constants: Base values of variables
x_1b = 60.76
x_2b = 40086
x_3b = 0.021
x_4b = 0.025
# Constants: Elasticities of variables
e_x1 = 0.994
e_x2 = 2.57
e_x3 = 0.19
e_x4 = 0.19
revenue_S = []
prob_S = []
# NOTE: three line endings below were cut off in the source. They are restored from the
# formulas used earlier; the dx_1 and dx_2 factors are inferred from the printed
# probability sum (0.9924 = 0.9946 * 0.9978).
for i in range(len(S)):
    # R(s)
    temp_rev = rp_base * (S[i][0]/x_1b)**(e_x1) * (S[i][1]/x_2b)**(e_x2) * (S[i][2]/x_3b)**(e_x3) * (S[i][3]/x_4b)**(e_x4)
    revenue_S.append(temp_rev)
    # Probability calculation:
    # Truck VOT
    p_x1 = (1/(sigma * S[i][0] * np.sqrt(2 * np.pi))) * np.exp(-0.5 * ((np.log(S[i][0]) - mu)/sigma) ** 2) * dx_1
    # Demand
    p_x2 = (1/(demand_sd * np.sqrt(2 * np.pi))) * np.exp(-0.5 * ((S[i][1] - demand_mean)/demand_sd) ** 2) * dx_2
    # Car VOT Growth
    if S[i][2] in x_3:
        cp = x_3.index(S[i][2])
        p_x3 = car_y[cp]
    # Truck VOT Growth
    if S[i][3] in x_4:
        tp = x_4.index(S[i][3])
        p_x4 = truck_y[tp]
    prob_S.append(p_x1 * p_x2 * p_x3 * p_x4)
print('Probability Sum = %0.4f' % np.sum(prob_S))
Probability Sum = 0.9924
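The loop above visits 864,000 scenarios one at a time. Because both the revenue formula and the scenario probability are products of one factor per variable, the same arrays can be built with outer products. The sketch below uses small hypothetical grids and a hypothetical helper named `scenario_arrays`; it shows one possible vectorization, not the notebook's code.

```python
import numpy as np
from functools import reduce

def scenario_arrays(values, probs, base, elasticity, r_base):
    """Return flat arrays of scenario revenue and probability for
    independent variables, ordered like nested for-loops."""
    rev_factors = [(np.asarray(v) / b) ** e
                   for v, b, e in zip(values, base, elasticity)]
    revenue = r_base * reduce(np.multiply.outer, rev_factors)
    prob = reduce(np.multiply.outer, [np.asarray(p) for p in probs])
    return revenue.ravel(), prob.ravel()

# Hypothetical two-variable example
values = [[40.0, 50.0, 60.0], [160.0, 200.0]]
probs = [[0.25, 0.5, 0.25], [0.4, 0.6]]
rev, p = scenario_arrays(values, probs, base=[50.0, 200.0],
                         elasticity=[1.0, 0.5], r_base=1000.0)
print(rev.shape, p.sum())
```

`ravel()` flattens in row-major order, so the last variable varies fastest, which matches the order of the nested list comprehension used to build `S`.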
In [11]: # Sorting Result based on Revenue
output = (np.array([revenue_S, prob_S])).T
output = output[output[:, 0].argsort()]
In [12]: # Plotting Cumulative Probability Distribution
plt.figure(figsize = (16, 8))
plt.plot(output[:,0], np.cumsum(output[:,1]), linewidth = 2) # Selecting array column: array[:, col]
# Plotting Predicted Revenue
plt.axvline(x = rp_base, color = 'r')
# Place the label next to the vertical line (the source hard-coded 74754 + 500 here)
plt.text(rp_base + 500, 0.1, 'Predicted Revenue for 2018', fontsize = 16)
# Remove Scientific Notation
ax = plt.gca()
ax.get_xaxis().get_major_formatter().set_scientific(False)
plt.xlabel('$Revenue$', fontsize = 16)
plt.ylabel('$p(Revenue)$', fontsize = 16)
plt.title("Cumulative Probability Distribution - Revenue (000's 2011 Dollars)", fontsize = 16)
# Set tick label size
plt.tick_params(axis = 'both', which = 'major', labelsize = 14)
print('Probability Sum = %0.4f' % np.sum(prob_S))
print('Demand std = %d' % demand_sd)
Probability Sum = 0.9924
Demand std = 11774
Percentile Calculation
In [14]: year = 2018
cum_prob = pd.DataFrame({'Revenue': output[:,0], 'Cumulative Probability': np.cumsum(output[:,1])})
# P(Revenue < r) = percentile -> Find r
# NOTE: the six lookups below were cut off at ".shape[" in the source;
# the closing "0]]" is an assumed completion.
# 75+
percentile_75 = cum_prob['Revenue'][cum_prob[cum_prob['Cumulative Probability'] <= 0.75].shape[0]]
percentile_85 = cum_prob['Revenue'][cum_prob[cum_prob['Cumulative Probability'] <= 0.85].shape[0]]
percentile_95 = cum_prob['Revenue'][cum_prob[cum_prob['Cumulative Probability'] <= 0.95].shape[0]]
# 25-
percentile_05 = cum_prob['Revenue'][cum_prob[cum_prob['Cumulative Probability'] <= 0.05].shape[0]]
percentile_15 = cum_prob['Revenue'][cum_prob[cum_prob['Cumulative Probability'] <= 0.15].shape[0]]
percentile_25 = cum_prob['Revenue'][cum_prob[cum_prob['Cumulative Probability'] <= 0.25].shape[0]]
# Print values
print('75th Percentile of ' + str(year) + ' Revenue = %0.2f' % percentile_75)
print('85th Percentile of ' + str(year) + ' Revenue = %0.2f' % percentile_85)
print('95th Percentile of ' + str(year) + ' Revenue = %0.2f' % percentile_95)
print('5th Percentile of ' + str(year) + ' Revenue = %0.2f' % percentile_05)
print('15th Percentile of ' + str(year) + ' Revenue = %0.2f' % percentile_15)
print('25th Percentile of ' + str(year) + ' Revenue = %0.2f' % percentile_25)
print(percentile_95)
print(percentile_85)
print(percentile_75)
print(percentile_25)
print(percentile_15)
print(percentile_05)
75th Percentile of 2018 Revenue = 49146.23
85th Percentile of 2018 Revenue = 62173.69
95th Percentile of 2018 Revenue = 91921.62
5th Percentile of 2018 Revenue = 8193.37
15th Percentile of 2018 Revenue = 14082.97
25th Percentile of 2018 Revenue = 18857.58
91921.6153431
62173.6864964
49146.2293139
18857.5804073
14082.965342
8193.3714675
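The percentile lookup can also be written as a small reusable function. The sketch below is an assumption-laden variant, not the notebook's code: it renormalizes the probabilities so they sum to 1 (the notebook's sum is 0.9924) and returns the smallest revenue whose cumulative probability reaches the requested level. The function name `weighted_percentile` and the sample data are hypothetical.

```python
import numpy as np

def weighted_percentile(revenue, prob, q):
    """Smallest revenue r with P(Revenue <= r) >= q, after renormalizing
    the probabilities so that they sum to 1."""
    revenue = np.asarray(revenue, dtype=float)
    prob = np.asarray(prob, dtype=float)
    order = np.argsort(revenue)
    cdf = np.cumsum(prob[order]) / prob.sum()
    idx = np.searchsorted(cdf, q, side='left')
    return revenue[order][min(idx, len(cdf) - 1)]

# Hypothetical scenarios (revenue, probability)
rev = [300.0, 100.0, 200.0, 400.0]
p = [0.2, 0.1, 0.3, 0.4]
print(weighted_percentile(rev, p, 0.5))
```

With the notebook's arrays, a call such as `weighted_percentile(revenue_S, prob_S, 0.75)` would be the analogous lookup; because of the renormalization its result may differ slightly from the values printed above.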
11

More Related Content

What's hot

Advanced Data Visualization Examples with R-Part II
Advanced Data Visualization Examples with R-Part IIAdvanced Data Visualization Examples with R-Part II
Advanced Data Visualization Examples with R-Part II
Dr. Volkan OBAN
 
Advanced Data Visualization in R- Somes Examples.
Advanced Data Visualization in R- Somes Examples.Advanced Data Visualization in R- Somes Examples.
Advanced Data Visualization in R- Somes Examples.
Dr. Volkan OBAN
 
Slides: Perspective click-and-drag area selections in pictures
Slides: Perspective click-and-drag area selections in picturesSlides: Perspective click-and-drag area selections in pictures
Slides: Perspective click-and-drag area selections in pictures
Frank Nielsen
 
A Survey Of R Graphics
A Survey Of R GraphicsA Survey Of R Graphics
A Survey Of R Graphics
Dataspora
 
Interpolation
InterpolationInterpolation
Interpolation
Dmytro Mitin
 
Surface3d in R and rgl package.
Surface3d in R and rgl package.Surface3d in R and rgl package.
Surface3d in R and rgl package.
Dr. Volkan OBAN
 
CLUSTERGRAM
CLUSTERGRAMCLUSTERGRAM
CLUSTERGRAM
Dr. Volkan OBAN
 
maths Individual assignment on differentiation
maths Individual assignment on differentiationmaths Individual assignment on differentiation
maths Individual assignment on differentiation
tenwoalex
 
Seminar PSU 10.10.2014 mme
Seminar PSU 10.10.2014 mmeSeminar PSU 10.10.2014 mme
Seminar PSU 10.10.2014 mme
Vyacheslav Arbuzov
 
Proyecto
ProyectoProyecto
Proyecto
crislogan
 
Num Integration
Num IntegrationNum Integration
Num Integration
muhdisys
 
Intermediate Microeconomic Theory Midterm 2 "Cheat Sheet"
Intermediate Microeconomic Theory Midterm 2 "Cheat Sheet"Intermediate Microeconomic Theory Midterm 2 "Cheat Sheet"
Intermediate Microeconomic Theory Midterm 2 "Cheat Sheet"
Laurel Ayuyao
 
peRm R group. Review of packages for r for market data downloading and analysis
peRm R group. Review of packages for r for market data downloading and analysispeRm R group. Review of packages for r for market data downloading and analysis
peRm R group. Review of packages for r for market data downloading and analysis
Vyacheslav Arbuzov
 
Mosaic plot in R.
Mosaic plot in R.Mosaic plot in R.
Mosaic plot in R.
Dr. Volkan OBAN
 
Click-Trough Rate (CTR) prediction
Click-Trough Rate (CTR) predictionClick-Trough Rate (CTR) prediction
Click-Trough Rate (CTR) prediction
Andrey Lange
 
10. Getting Spatial
10. Getting Spatial10. Getting Spatial
10. Getting Spatial
FAO
 
Pdfcode
PdfcodePdfcode
Megadata With Python and Hadoop
Megadata With Python and HadoopMegadata With Python and Hadoop
Megadata With Python and Hadoop
ryancox
 
Fractal Rendering in Developer C++ - 2012-11-06
Fractal Rendering in Developer C++ - 2012-11-06Fractal Rendering in Developer C++ - 2012-11-06
Fractal Rendering in Developer C++ - 2012-11-06
Aritra Sarkar
 
Julia Set
Julia SetJulia Set
Julia Set
Ashish Kumar
 

What's hot (20)

Advanced Data Visualization Examples with R-Part II
Advanced Data Visualization Examples with R-Part IIAdvanced Data Visualization Examples with R-Part II
Advanced Data Visualization Examples with R-Part II
 
Advanced Data Visualization in R- Somes Examples.
Advanced Data Visualization in R- Somes Examples.Advanced Data Visualization in R- Somes Examples.
Advanced Data Visualization in R- Somes Examples.
 
Slides: Perspective click-and-drag area selections in pictures
Slides: Perspective click-and-drag area selections in picturesSlides: Perspective click-and-drag area selections in pictures
Slides: Perspective click-and-drag area selections in pictures
 
A Survey Of R Graphics
A Survey Of R GraphicsA Survey Of R Graphics
A Survey Of R Graphics
 
Interpolation
InterpolationInterpolation
Interpolation
 
Surface3d in R and rgl package.
Surface3d in R and rgl package.Surface3d in R and rgl package.
Surface3d in R and rgl package.
 
CLUSTERGRAM
CLUSTERGRAMCLUSTERGRAM
CLUSTERGRAM
 
maths Individual assignment on differentiation
maths Individual assignment on differentiationmaths Individual assignment on differentiation
maths Individual assignment on differentiation
 
Seminar PSU 10.10.2014 mme
Seminar PSU 10.10.2014 mmeSeminar PSU 10.10.2014 mme
Seminar PSU 10.10.2014 mme
 
Proyecto
ProyectoProyecto
Proyecto
 
Num Integration
Num IntegrationNum Integration
Num Integration
 
Intermediate Microeconomic Theory Midterm 2 "Cheat Sheet"
Intermediate Microeconomic Theory Midterm 2 "Cheat Sheet"Intermediate Microeconomic Theory Midterm 2 "Cheat Sheet"
Intermediate Microeconomic Theory Midterm 2 "Cheat Sheet"
 
peRm R group. Review of packages for r for market data downloading and analysis
peRm R group. Review of packages for r for market data downloading and analysispeRm R group. Review of packages for r for market data downloading and analysis
peRm R group. Review of packages for r for market data downloading and analysis
 
Mosaic plot in R.
Mosaic plot in R.Mosaic plot in R.
Mosaic plot in R.
 
Click-Trough Rate (CTR) prediction
Click-Trough Rate (CTR) predictionClick-Trough Rate (CTR) prediction
Click-Trough Rate (CTR) prediction
 
10. Getting Spatial
10. Getting Spatial10. Getting Spatial
10. Getting Spatial
 
Pdfcode
PdfcodePdfcode
Pdfcode
 
Megadata With Python and Hadoop
Megadata With Python and HadoopMegadata With Python and Hadoop
Megadata With Python and Hadoop
 
Fractal Rendering in Developer C++ - 2012-11-06
Fractal Rendering in Developer C++ - 2012-11-06Fractal Rendering in Developer C++ - 2012-11-06
Fractal Rendering in Developer C++ - 2012-11-06
 
Julia Set
Julia SetJulia Set
Julia Set
 

Similar to Prob-Dist-Toll-Forecast-Uncertainty

Teknik Simulasi
Teknik SimulasiTeknik Simulasi
Teknik Simulasi
Rezzy Caraka
 
Clustering and Factorization using Apache SystemML by Alexandre V Evfimievski
Clustering and Factorization using Apache SystemML by  Alexandre V EvfimievskiClustering and Factorization using Apache SystemML by  Alexandre V Evfimievski
Clustering and Factorization using Apache SystemML by Alexandre V Evfimievski
Arvind Surve
 
Clustering and Factorization using Apache SystemML by Alexandre V Evfimievski
Clustering and Factorization using Apache SystemML by  Alexandre V EvfimievskiClustering and Factorization using Apache SystemML by  Alexandre V Evfimievski
Clustering and Factorization using Apache SystemML by Alexandre V Evfimievski
Arvind Surve
 
Mit6 094 iap10_lec03
Mit6 094 iap10_lec03Mit6 094 iap10_lec03
Mit6 094 iap10_lec03
Tribhuwan Pant
 
第13回数学カフェ「素数!!」二次会 LT資料「乱数!!」
第13回数学カフェ「素数!!」二次会 LT資料「乱数!!」第13回数学カフェ「素数!!」二次会 LT資料「乱数!!」
第13回数学カフェ「素数!!」二次会 LT資料「乱数!!」
Ken'ichi Matsui
 
1.pdf
1.pdf1.pdf
Contrastive Divergence Learning
Contrastive Divergence LearningContrastive Divergence Learning
Contrastive Divergence Learning
penny 梁斌
 
Regression and Classification with R
Regression and Classification with RRegression and Classification with R
Regression and Classification with R
Yanchang Zhao
 
Vectorization vs Compilation
Vectorization vs CompilationVectorization vs Compilation
Vectorization vs Compilation
Alex Averbuch
 
Computer graphics lab manual
Computer graphics lab manualComputer graphics lab manual
Computer graphics lab manual
Uma mohan
 
NTU ML TENSORFLOW
NTU ML TENSORFLOWNTU ML TENSORFLOW
NTU ML TENSORFLOW
Mark Chang
 
Computer Graphics Unit 1
Computer Graphics Unit 1Computer Graphics Unit 1
Computer Graphics Unit 1
aravindangc
 
Better Visualization of Trips through Agglomerative Clustering
Better Visualization of  Trips through     Agglomerative ClusteringBetter Visualization of  Trips through     Agglomerative Clustering
Better Visualization of Trips through Agglomerative Clustering
Anbarasan S
 
Unit-2 raster scan graphics,line,circle and polygon algorithms
Unit-2 raster scan graphics,line,circle and polygon algorithmsUnit-2 raster scan graphics,line,circle and polygon algorithms
Unit-2 raster scan graphics,line,circle and polygon algorithms
Amol Gaikwad
 
Branch and bounding : Data structures
Branch and bounding : Data structuresBranch and bounding : Data structures
Branch and bounding : Data structures
Kàŕtheek Jåvvàjí
 
Coscup2021-rust-toturial
Coscup2021-rust-toturialCoscup2021-rust-toturial
Coscup2021-rust-toturial
Wayne Tsai
 
Applied machine learning for search engine relevance 3
Applied machine learning for search engine relevance 3Applied machine learning for search engine relevance 3
Applied machine learning for search engine relevance 3
Charles Martin
 
ASFWS 2012 - Obfuscator, ou comment durcir un code source ou un binaire contr...
ASFWS 2012 - Obfuscator, ou comment durcir un code source ou un binaire contr...ASFWS 2012 - Obfuscator, ou comment durcir un code source ou un binaire contr...
ASFWS 2012 - Obfuscator, ou comment durcir un code source ou un binaire contr...
Cyber Security Alliance
 
Econometric Analysis 8th Edition Greene Solutions Manual
Econometric Analysis 8th Edition Greene Solutions ManualEconometric Analysis 8th Edition Greene Solutions Manual
Econometric Analysis 8th Edition Greene Solutions Manual
LewisSimmonss
 
Open GL T0074 56 sm4
Open GL T0074 56 sm4Open GL T0074 56 sm4
Open GL T0074 56 sm4
Roziq Bahtiar
 

Similar to Prob-Dist-Toll-Forecast-Uncertainty (20)

Teknik Simulasi
Teknik SimulasiTeknik Simulasi
Teknik Simulasi
 
Clustering and Factorization using Apache SystemML by Alexandre V Evfimievski
Clustering and Factorization using Apache SystemML by  Alexandre V EvfimievskiClustering and Factorization using Apache SystemML by  Alexandre V Evfimievski
Clustering and Factorization using Apache SystemML by Alexandre V Evfimievski
 
Clustering and Factorization using Apache SystemML by Alexandre V Evfimievski
Clustering and Factorization using Apache SystemML by  Alexandre V EvfimievskiClustering and Factorization using Apache SystemML by  Alexandre V Evfimievski
Clustering and Factorization using Apache SystemML by Alexandre V Evfimievski
 
Mit6 094 iap10_lec03
Mit6 094 iap10_lec03Mit6 094 iap10_lec03
Mit6 094 iap10_lec03
 
第13回数学カフェ「素数!!」二次会 LT資料「乱数!!」
第13回数学カフェ「素数!!」二次会 LT資料「乱数!!」第13回数学カフェ「素数!!」二次会 LT資料「乱数!!」
第13回数学カフェ「素数!!」二次会 LT資料「乱数!!」
 
1.pdf
1.pdf1.pdf
1.pdf
 
Contrastive Divergence Learning
Contrastive Divergence LearningContrastive Divergence Learning
Contrastive Divergence Learning
 
Regression and Classification with R
Regression and Classification with RRegression and Classification with R
Regression and Classification with R
 
Vectorization vs Compilation
Vectorization vs CompilationVectorization vs Compilation
Vectorization vs Compilation
 
Computer graphics lab manual
Computer graphics lab manualComputer graphics lab manual
Computer graphics lab manual
 
NTU ML TENSORFLOW
NTU ML TENSORFLOWNTU ML TENSORFLOW
NTU ML TENSORFLOW
 
Computer Graphics Unit 1
Computer Graphics Unit 1Computer Graphics Unit 1
Computer Graphics Unit 1
 
Better Visualization of Trips through Agglomerative Clustering
Better Visualization of  Trips through     Agglomerative ClusteringBetter Visualization of  Trips through     Agglomerative Clustering
Better Visualization of Trips through Agglomerative Clustering
 
Unit-2 raster scan graphics,line,circle and polygon algorithms
Unit-2 raster scan graphics,line,circle and polygon algorithmsUnit-2 raster scan graphics,line,circle and polygon algorithms
Unit-2 raster scan graphics,line,circle and polygon algorithms
 
Branch and bounding : Data structures
Branch and bounding : Data structuresBranch and bounding : Data structures
Branch and bounding : Data structures
 
Coscup2021-rust-toturial
Coscup2021-rust-toturialCoscup2021-rust-toturial
Coscup2021-rust-toturial
 
Applied machine learning for search engine relevance 3
Applied machine learning for search engine relevance 3Applied machine learning for search engine relevance 3
Applied machine learning for search engine relevance 3
 
ASFWS 2012 - Obfuscator, ou comment durcir un code source ou un binaire contr...
ASFWS 2012 - Obfuscator, ou comment durcir un code source ou un binaire contr...ASFWS 2012 - Obfuscator, ou comment durcir un code source ou un binaire contr...
ASFWS 2012 - Obfuscator, ou comment durcir un code source ou un binaire contr...
 
Econometric Analysis 8th Edition Greene Solutions Manual
Econometric Analysis 8th Edition Greene Solutions ManualEconometric Analysis 8th Edition Greene Solutions Manual
Econometric Analysis 8th Edition Greene Solutions Manual
 
Open GL T0074 56 sm4
Open GL T0074 56 sm4Open GL T0074 56 sm4
Open GL T0074 56 sm4
 

Prob-Dist-Toll-Forecast-Uncertainty

  • 1. Prob-Dist-Toll-Forecast-Uncertainty December 16, 2015 In [2]: # special IPython command to prepare the notebook for matplotlib %matplotlib inline import numpy as np import pandas as pd import scipy as sp import seaborn as sns import matplotlib.pyplot as plt Estimating the probability distribution of a travel demand forecast Authors: John L. Bowman, Dinesh Gopinath, and Moshe Ben-Akiva 0.0.1 Algorithm 1. Identify variables that induce error in Toll Revenue prediction : x = (x1, x2, ..., xk, ..., xK) • Simple Toll Revenue Model - Variables that induce error in Toll Revenue Prediction r(p) are: (1) Value of Time, (2) Population, (3) Households, (4) Employment • Let: x1 = Value of Time, x2 = Population, x3 = Households, x4 = Employment. Thus K = 4, i.e. 4-Dimensional space of possible outcomes xk • x = (x1, x2, x3, x4) 2. Obtain probability distribution of xk for k = 1, 2, . . . , K. Distribution can be based on: (a) Direct input or (b) Assumption, e.g. Triangular, Normal, etc. For each dimension k discretize an assumed probability distribution and identify a small set of discrete outcomes xnk k , where nk = 1, 2, 3, . . . , Nk (assign probabilities p(xnk k ) to these discrete outcomes based on reasoning and empirical evidence to approximate xk’s true distribution???): • Let N1 = 4, N2 = 3, N3 = 5, N4 = 4 and xk = {x1 k, x2 k, x3 k, ..., xNk k } • x1 discrete outcomes = {x1 1, x2 1, x3 1, x4 1}, with p(x1 1) + p(x2 1) + p(x3 1) + p(x4 1) = 1 • x2 discrete outcomes = {x1 2, x2 2, x3 2}, with p(x1 2) + p(x2 2) + p(x3 2) = 1 • x3 discrete outcomes = {x1 3, x2 3, x3 3, x4 3, x5 3}, with p(x1 3) + p(x2 3) + p(x3 3) + p(x4 3) + p(x5 3) = 1 • x4 discrete outcomes = {x1 4, x2 4, x3 4, x4 4}, with p(x1 4) + p(x2 4) + p(x3 4) + p(x4 4) = 1 3. Develop Toll Revenue Model for Baseline Scenario : - Get Predicted r (p) base for Baseline Scenario x = (xbase 1 , xbase 2 , xbase 3 , xbase 4 ) from output of the model 4. 
Run Toll Revenue Model one time for each variable that induce error in prediction : • Get predicted r (p) k=1 based on x = (xextreme 1 , xbase 2 , xbase 3 , xbase 4 ) • Get predicted r (p) k=2 based on x = (xbase 1 , xextreme 2 , xbase 3 , xbase 4 ) • Get predicted r (p) k=3 based on x = (xbase 1 , xbase 2 , xextreme 3 , xbase 4 ) • Get predicted r (p) k=4 based on x = (x1, x2, x3, x4 ) 1
  • 2. 5. Calculate change in Predicted Toll Revenue and variables that induce error in prediction : • rchange k = r (p) base−r (p) k r (p) k , where k = 1, 2, 3, 4 • xchange k = xbase 1 −xk xk , where k = 1, 2, 3, 4 6. Calculate Elasticity of Toll Revenue with respect to variables that induce error in pre- diction : • er k = rchange k xchange k , where k = 1, 2, 3, 4 7. Define a set of scenarios: S = {(xn1 1 , ..., xnk k , ..., xNk K ); nk = 1, 2, 3, ..., Nk; k = 1, 2, ..., K}, covering all combinations of the discrete coutcomes in all K = 4 dimensions • For simple example, S = {$(x 1ˆ1, x 2ˆ1, x 3ˆ1, x 4ˆ1), (x 1ˆ1, x 2ˆ1, x 3ˆ1, x 4ˆ2), . . . , (x 1ˆ1, x 2ˆ3, x 3ˆ5, x 4ˆ4), . . . , (x 1ˆ4, x 2ˆ3, x 3ˆ5, x 4ˆ4) $} • Number of scenarios in S = K k=1 Nk. Thus the number of scenarios in simple example is S = 4 x 3 x 5 x 4 = 240. Thus s = 1, 2, 3, . . . , 240 • Using s as a 1-Dimensional index of the member of S: Refer to a single member of S as x(s) = (x (s) 1 , x (s) 2 , ..., x (s) k , ..., x (s) K ). For simple example: x(s=1) = (x1 1, x1 2, x1 3, x1 4); x(s=2) = (x1 1, x1 2, x1 3, x2 4); x(s=240) = (x4 1, x3 2, x5 3, x4 4) 8. Calculate the probability of each scenario: Error variables are mutually independent, thus the probability of each scenario is given by: p(s) = K k=1 p(x (s) k ), s ∈ S, thus for simple example: • p(s = 1) = 4 k=1 p(x (s=1) k ) = p(x1 1)p(x1 2)p(x1 3)p(x1 4) • p(s = 2) = 4 k=1 p(x (s=2) k ) = p(x1 1)p(x1 2)p(x1 3)p(x2 4) • p(s = 240) = 4 k=1 p(x (s=240) k ) = p(x4 1)p(x4 2)p(x5 3)p(x4 4) 9. 
Calculate Toll Revenue for scenario s, r(s) = r (p) base K k=1 x (s) k xbase k er k , s ∈ S, thus for simple example: • r(s=1) = r (p) base 4 k=1 x (s=1) k xbase k er k = r (p) base x1 1 xbase 1 er 1 x1 2 xbase 2 er 2 x1 3 xbase 3 er 3 x1 4 xbase 4 er 4 • r(s=2) = r (p) base 4 k=1 x (s=2) k xbase k er k = r (p) base x1 1 xbase 1 er 1 x1 2 xbase 2 er 2 x1 3 xbase 3 er 3 x2 4 xbase 4 er 4 • r(s=240) = r (p) base 4 k=1 x (s=240) k xbase k er k = r (p) base x4 1 xbase 1 er 1 x3 2 xbase 2 er 2 x5 3 xbase 3 er 3 x4 4 xbase 4 er 4 10. Using pairs r(s) and p(s) Plot Revenue CDF 0.0.2 Sources of Uncertainty - Toll (Kockleman) • Estimates of trip generation • Estimates of land development • Models: Trip Generation, Trip Distribution, Mode Choice • Toll-technology adoption rates • Hetrogeneity in (VOT) Value of Time savings 2
• Network attributes - Traffic congestion (low-volume corridors have greater uncertainty in their forecasts)
• Uncertainty in land development patterns
• Demographic and employment projections
• Tolling design - shadow tolls (the government pays the concessionaire an amount based on toll road use, so the road is effectively toll free to drivers) or user-paid tolls (drivers' willingness to pay is more complex and difficult to understand, which means more forecasting risk)
• Tolling culture of a region, i.e. the degree to which tolls have been used in the past
• Travel demand model imperfections (heterogeneity of VOT is ignored; variable tolls or HOT lanes that are free at certain hours)
• Competitive advantage - a toll on the only bridge vs. a toll on a freeway, where drivers have more route options
• User attributes - toll facilities serving a small market segment of travelers allow more reliable forecasts than those serving heterogeneous populations
• Road location, configurations
• Demand variations over times of day and days of the year also affect forecast reliability
• Bain and Wilkins (2002) - poorly estimated VOTTs, economic downturns, mis-prediction of future land use patterns, lower than predicted time savings, added competition, lower than anticipated truck usage, high variability in traffic volumes
• Economic growth and related changes in income and employment
• Total Demand Model errors
• Model error in elasticity of demand
• Value of time
• Errors in measurement of network times and costs
• Operating speed
• Roadway improvements
1 Texas North Tarrant Express Segment 3A
• Revenue and Transaction Forecast Year = 2018
2018 Revenue and Transactions
• Forecasted 2018 Annual Project Revenue (000's 2011 Dollars) = 27612
• Forecasted 2018 Daily Transactions = 40086
Truck VOT Calculations
• SOV VOT - Lognormal distribution with mean = $18.59 and standard deviation = $7.4 (µ = 2.849 and σ = 0.383)
• Coefficient of variation of SOV VOT: $C_v^{sov} = 7.4 / 18.59 = 0.398$
• HHM Truck VOT: Mean = $36.48 and Standard deviation = $30.24
• AECOM Truck VOT: Mean = $60.76 and Standard deviation = $51.08
• Average Truck VOT = (HHM Truck VOT + AECOM Truck VOT) / 2 = $48.62
• Standard deviation of Average Truck VOT = (SOV coefficient of variation) × (Average Truck VOT) = 0.398 × $48.62 = $19.35 (µ = 3.811 and σ = 0.383, calculations below)
• With mean $m$ and standard deviation $s$: $\mu = \ln\left(m / \sqrt{1 + s^2/m^2}\right)$ and $\sigma = \sqrt{\ln\left(1 + s^2/m^2\right)}$
In [3]: # Parameters for Truck Lognormal Distribution
m = 48.62  # mean of Average Truck VOT
s = 19.35  # standard deviation of Average Truck VOT
# Convert the mean and standard deviation to the lognormal parameters mu and sigma
truck_ln_mu = np.log(m/np.sqrt(1+((s**2)/(m**2))))
truck_ln_sigma = np.sqrt(np.log(1+((s**2)/(m**2))))
print('truck_ln_mu = %1.3f' % truck_ln_mu)
print('truck_ln_sigma = %1.3f' % truck_ln_sigma)
truck_ln_mu = 3.811
truck_ln_sigma = 0.383
$r_{base}^{(p)} = 74754$ (000's 2011 Dollars)
Variables: Sources of Uncertainty
• Truck VOT: $x_1$
– Elasticity of Revenue to Truck VOT = 0.994
– $x_1^{base} = 60.76$ (dollars)
– Probability distribution: Lognormal with mean = $48.62 and std. dev = $19.35 (µ = 3.811 and σ = 0.383)
• Travel Demand: $x_2$
– Elasticity of Revenue to Demand (Transactions as proxy) = 2.57
– $x_2^{base} = 61056$
– Probability distribution: Normal with µ = 58871.5 and σ = 2184.5
• Car VOT Growth: $x_3$
– Elasticity of Revenue to Car VOT Growth = 0.19
– $x_3^{base}$ = 2.1%
– Probability distribution: Triangular with Min = 0.5%, Mean = 1.05%, and Max = 2.1%
• Truck VOT Growth: $x_4$
– Elasticity of Revenue to Truck VOT Growth = 0.19
– $x_4^{base}$ = 2.5%
– Probability distribution: Triangular with Min = 0.5%, Mean = 1.25%, and Max = 2.5%
Truck VOT Probability Distribution: Lognormal
In [4]: mu = 3.811
sigma = 0.383
low = 1
high = 120
dx_1 = 2 # Length of interval
# Comb points along x axis
x_1 = np.arange(low, high, dx_1)
# Compute y values: lognormal pdf at each value of x
vot_y = (1/(sigma * x_1 * np.sqrt(2 * np.pi))) * np.exp(-0.5 * ((np.log(x_1) - mu)/sigma) ** 2)
# Plot the function
plt.figure(figsize = (16, 8))
plt.stem(x_1, vot_y, markerfmt = ' ') # This draws the intervals
plt.xlabel('$x_1$')
plt.ylabel('$p(x_1)$')
plt.title('Discretized Log-Normal Probability Density')
# Approximate each interval's probability as pdf * interval length
area = np.sum(dx_1 * vot_y)
print('Probability Sum = %1.4f' % area)
print('N_1 = %d' % len(x_1))
temp1 = np.array([x_1, vot_y * dx_1])
Probability Sum = 0.9946
N_1 = 60
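The cell above approximates each interval's probability as the pdf at a grid point times the interval length. As a cross-check, the same 60 intervals can be given their lognormal mass directly by differencing the CDF at the interval edges. The following is a minimal sketch and not part of the original notebook: the helper names `lognorm_cdf` and `bin_masses` are hypothetical, and it assumes intervals of width 2 centered on 1, 3, ..., 119 (edges 0, 2, ..., 120).

```python
import math

def lognorm_cdf(x, mu, sigma):
    """CDF of a lognormal distribution with log-mean mu and log-std sigma."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - mu) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def bin_masses(edges, mu, sigma):
    """Probability mass between each pair of consecutive edges."""
    cdf = [lognorm_cdf(e, mu, sigma) for e in edges]
    return [hi - lo for lo, hi in zip(cdf[:-1], cdf[1:])]

mu, sigma = 3.811, 0.383                  # lognormal parameters from the notebook
edges = [2.0 * i for i in range(61)]      # assumed edges: 0, 2, ..., 120
centers = [(a + b) / 2 for a, b in zip(edges[:-1], edges[1:])]  # 1, 3, ..., 119
masses = bin_masses(edges, mu, sigma)

print('N_1 = %d' % len(centers))
print('Probability Sum = %1.4f' % sum(masses))
```

The masses sum to the lognormal CDF at 120, so any shortfall from 1 is the upper tail beyond the last interval.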
Travel Demand Probability Distribution: Normal
In [5]: # Mean Transactions = (2035 Transactions + 2018 Transactions)/2
print('Mean Transactions = %s' % ((63635+40086)/2.0))
# Std Dev Transactions = (2035 Transactions - 2018 Transactions)/2
print('Std. Dev Transactions = %s' % ((63635-40086)/2.0))
Mean Transactions = 51860.5
Std. Dev Transactions = 11774.5
In [6]: demand_mean = 51860.5
demand_sd = 11774.5
demand_low = demand_mean - 3 * demand_sd # low end of x axis
demand_high = demand_mean + 3 * demand_sd # high end of x axis
dx_2 = 2000 # Length of interval
# Comb points along x axis
x_2 = np.arange(demand_low, demand_high, dx_2)
# Compute y values: normal pdf at each value of x
demand_y = (1/(demand_sd * np.sqrt(2 * np.pi))) * np.exp(-0.5 * ((x_2 - demand_mean)/demand_sd) ** 2)
# Plot the function
plt.figure(figsize = (16, 8))
plt.stem(x_2, demand_y, markerfmt = ' ') # This draws the intervals
plt.xlabel('$Demand$')
plt.ylabel('$p(Demand)$')
plt.title('Discretized Normal Probability Density')
# Approximate each interval's probability as pdf * interval length
area = np.sum(dx_2 * demand_y)
print('Probability Sum = %1.4f' % area)
print('N_2 = %d' % len(x_2))
Probability Sum = 0.9978
N_2 = 36
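The probability sum falls short of 1 because the grid stops at three standard deviations on either side of the mean. One option, sketched here as an assumption rather than something the notebook does, is to rescale the discrete probabilities so that they sum to 1 before combining them with the other variables. The function name `discretize_normal` is hypothetical.

```python
import math

def normal_pdf(x, mean, sd):
    """Normal probability density at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def discretize_normal(mean, sd, dx, n_sd=3.0):
    """Grid from mean - n_sd*sd up to (not including) mean + n_sd*sd with step dx.

    Returns the grid points, the raw pdf*dx weights, and the weights
    rescaled so that they sum to 1.
    """
    low, high = mean - n_sd * sd, mean + n_sd * sd
    n = int(math.ceil((high - low) / dx))
    xs = [low + i * dx for i in range(n)]
    raw = [normal_pdf(x, mean, sd) * dx for x in xs]
    total = sum(raw)
    scaled = [w / total for w in raw]
    return xs, raw, scaled

# Mean, standard deviation, and interval length taken from the notebook
xs, raw, scaled = discretize_normal(51860.5, 11774.5, 2000.0)
print('N_2 = %d' % len(xs))
print('Raw sum = %1.4f' % sum(raw))
print('Rescaled sum = %1.4f' % sum(scaled))
```

Rescaling leaves the shape of the discrete distribution unchanged; it only spreads the truncated tail mass proportionally across the retained points.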
Car VOT Growth Probability Distribution: Triangular
In [7]: min_growth_car = 0.004
mean_growth_car = 0.0105
max_growth_car = 0.022
# np.random.triangular takes (left, mode, right), so mean_growth_car is used as the mode
car_array = np.random.triangular(min_growth_car, mean_growth_car, max_growth_car, size = 100000)
#plt.hist(car_array, bins = 10)
car_val = np.histogram(car_array, bins = 20)
# Convert bin counts to probabilities
car_y = [float(i)/np.sum(car_val[0]) for i in car_val[0]]
# Binwidth issue: use bin midpoints as the discrete outcomes
x_car = car_val[1]
x_3 = []
for i in range(len(x_car) - 1):
    temp = (x_car[i] + x_car[i+1])/2
    x_3.append(temp)
# Plot triangular distribution
plt.figure(figsize = (16, 8))
plt.stem(x_3, car_y, markerfmt = ' ') # This draws the intervals
plt.xlabel('$Car Growth$')
plt.ylabel('$p(Car Growth)$')
plt.title('Discretized Triangular Probability Density')
print(len(x_3))
print(np.sum(car_y))
20
1.0
Truck VOT Growth Probability Distribution: Triangular
In [8]: min_growth_truck = 0.004
mean_growth_truck = 0.0125
max_growth_truck = 0.026
# np.random.triangular takes (left, mode, right), so mean_growth_truck is used as the mode
# The size value is cut off in the source; 100000 is assumed to match the car cell
truck_array = np.random.triangular(min_growth_truck, mean_growth_truck, max_growth_truck, size = 100000)
#plt.hist(car_array, bins = 10)
truck_val = np.histogram(truck_array, bins = 20)
# Convert bin counts to probabilities
truck_y = [float(i)/np.sum(truck_val[0]) for i in truck_val[0]]
# Binwidth issue: use bin midpoints as the discrete outcomes
x_truck = truck_val[1]
x_4 = []
for i in range(len(x_truck) - 1):
    temp = (x_truck[i] + x_truck[i+1])/2
    x_4.append(temp)
# Plot triangular distribution
plt.figure(figsize = (16, 8))
plt.stem(x_4, truck_y, markerfmt = ' ') # This draws the intervals
plt.xlabel('$Truck Growth$')
plt.ylabel('$p(Truck Growth)$')
plt.title('Discretized Triangular Probability Density')
print(len(x_4))
print(np.sum(truck_y))
20
1.0
Scenarios
In [9]: # All combinations of the discrete outcomes of the four variables
S = [[i, j, k, l] for i in x_1 for j in x_2 for k in x_3 for l in x_4]
print(S[0])
print('\n')
print('Number of Scenarios = ' + str(len(S)))
[1, 16537.0, 0.0044911329557659595, 0.0045996835827973384]

Number of Scenarios = 864000
Probability and Revenue Calculations for Scenarios
In [10]: # Constants: Base Revenue
rp_base = 27612
# Constants: Base values of variables
x_1b = 60.76
x_2b = 40086
x_3b = 0.021
x_4b = 0.025
# Constants: Elasticities of variables
e_x1 = 0.994
e_x2 = 2.57
e_x3 = 0.19
e_x4 = 0.19
revenue_S = []
prob_S = []
for i in range(len(S)):
    # R(s): base revenue scaled by each variable's ratio to its base value, raised to its elasticity
    temp_rev = rp_base * (S[i][0]/x_1b)**(e_x1) * (S[i][1]/x_2b)**(e_x2) * (S[i][2]/x_3b)**(e_x3) * (S[i][3]/x_4b)**(e_x4)
    revenue_S.append(temp_rev)
    # Probability calculation:
    # Truck VOT: lognormal pdf times interval length
    p_x1 = (1/(sigma * S[i][0] * np.sqrt(2 * np.pi))) * np.exp(-0.5 * ((np.log(S[i][0]) - mu)/sigma) ** 2) * dx_1
    # Demand: normal pdf times interval length
    p_x2 = (1/(demand_sd * np.sqrt(2 * np.pi))) * np.exp(-0.5 * ((S[i][1] - demand_mean)/demand_sd) ** 2) * dx_2
    # Car VOT Growth: look up the histogram probability of this outcome
    if S[i][2] in x_3:
        cp = x_3.index(S[i][2])
        p_x3 = car_y[cp]
    # Truck VOT Growth: look up the histogram probability of this outcome
    if S[i][3] in x_4:
        tp = x_4.index(S[i][3])
        p_x4 = truck_y[tp]
    # Scenario probability is the product of the four outcome probabilities
    prob_S.append(p_x1 * p_x2 * p_x3 * p_x4)
print('Probability Sum = %0.4f' % np.sum(prob_S))
Probability Sum = 0.9924
In [11]: # Sorting Result based on Revenue
output = (np.array([revenue_S, prob_S])).T
output = output[output[:, 0].argsort()]
In [12]: # Plotting Cumulative Probability Distribution
plt.figure(figsize = (16, 8))
plt.plot(output[:,0], np.cumsum(output[:,1]), linewidth = 2) # Select an array column with array[:, j]
# Plotting Predicted Revenue
plt.axvline(x = rp_base, color = 'r')
plt.text(rp_base + 500, 0.1, 'Predicted Revenue for 2018', fontsize = 16)
# Remove Scientific Notation
ax = plt.gca()
ax.get_xaxis().get_major_formatter().set_scientific(False)
plt.xlabel('$Revenue$', fontsize = 16)
plt.ylabel('$p(Revenue)$', fontsize = 16)
plt.title("Cumulative Probability Distribution - Revenue (000's 2011 Dollars)", fontsize = 16)
# Set tick label size
plt.tick_params(axis = 'both', which = 'major', labelsize = 14)
print('Probability Sum = %0.4f' % np.sum(prob_S))
print('Demand std = %d' % demand_sd)
Probability Sum = 0.9924
Demand std = 11774
Percentile Calculation
In [14]: year = 2018
cum_prob = pd.DataFrame({'Revenue': output[:,0], 'Cumulative Probability': np.cumsum(output[:,1])})
# P(Revenue < r) = percentile -> Find r
# The row count with cumulative probability <= q indexes the first revenue above that percentile
# 75+
percentile_75 = cum_prob['Revenue'][cum_prob[cum_prob['Cumulative Probability'] <= 0.75].shape[0]]
percentile_85 = cum_prob['Revenue'][cum_prob[cum_prob['Cumulative Probability'] <= 0.85].shape[0]]
percentile_95 = cum_prob['Revenue'][cum_prob[cum_prob['Cumulative Probability'] <= 0.95].shape[0]]
# 25-
percentile_05 = cum_prob['Revenue'][cum_prob[cum_prob['Cumulative Probability'] <= 0.05].shape[0]]
percentile_15 = cum_prob['Revenue'][cum_prob[cum_prob['Cumulative Probability'] <= 0.15].shape[0]]
percentile_25 = cum_prob['Revenue'][cum_prob[cum_prob['Cumulative Probability'] <= 0.25].shape[0]]
# Print values
print('75th Percentile of ' + str(year) + ' Revenue = %0.2f' % percentile_75)
print('85th Percentile of ' + str(year) + ' Revenue = %0.2f' % percentile_85)
print('95th Percentile of ' + str(year) + ' Revenue = %0.2f' % percentile_95)
print('5th Percentile of ' + str(year) + ' Revenue = %0.2f' % percentile_05)
print('15th Percentile of ' + str(year) + ' Revenue = %0.2f' % percentile_15)
print('25th Percentile of ' + str(year) + ' Revenue = %0.2f' % percentile_25)
print(percentile_95)
print(percentile_85)
print(percentile_75)
print(percentile_25)
print(percentile_15)
print(percentile_05)
75th Percentile of 2018 Revenue = 49146.23
85th Percentile of 2018 Revenue = 62173.69
95th Percentile of 2018 Revenue = 91921.62
5th Percentile of 2018 Revenue = 8193.37
15th Percentile of 2018 Revenue = 14082.97
25th Percentile of 2018 Revenue = 18857.58
91921.6153431
62173.6864964
49146.2293139
18857.5804073
14082.965342
8193.3714675
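The scenario loop and the percentile lookups can be condensed into a small, self-contained sketch. It is an illustration under stated assumptions, not the notebook's code: the two-variable toy inputs and the helper names `scenario_table` and `weighted_percentile` are made up for the example. Each discrete outcome carries its probability with it, so no list lookup is needed, and a percentile is read here as the smallest revenue whose cumulative probability reaches the target, which is a slightly different convention from the row-count lookup used in the cell above.

```python
import bisect
import itertools

def scenario_table(r_base, variables):
    """Enumerate all scenarios and return (revenue, probability) pairs sorted by revenue.

    variables: list of dicts with keys 'base', 'elasticity', and
    'outcomes', a list of (value, probability) pairs.
    """
    rows = []
    for combo in itertools.product(*(v['outcomes'] for v in variables)):
        revenue, prob = r_base, 1.0
        for v, (x, p) in zip(variables, combo):
            revenue *= (x / v['base']) ** v['elasticity']  # elasticity scaling
            prob *= p                                      # independence assumption
        rows.append((revenue, prob))
    rows.sort()
    return rows

def weighted_percentile(rows, q):
    """Smallest revenue r such that P(Revenue <= r) >= q."""
    cum = list(itertools.accumulate(p for _, p in rows))
    idx = min(bisect.bisect_left(cum, q), len(rows) - 1)
    return rows[idx][0]

# Hypothetical toy inputs: two variables with three and two outcomes
variables = [
    {'base': 60.0, 'elasticity': 1.0,
     'outcomes': [(30.0, 0.25), (60.0, 0.5), (90.0, 0.25)]},
    {'base': 100.0, 'elasticity': 2.0,
     'outcomes': [(50.0, 0.5), (100.0, 0.5)]},
]
rows = scenario_table(1000.0, variables)
median = weighted_percentile(rows, 0.5)
print('Number of Scenarios = %d' % len(rows))
print('Median revenue = %0.2f' % median)
```

With the full notebook inputs, `variables` would hold the four discretized distributions, each outcome paired with its interval probability.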