The document describes an algorithm for estimating the probability distribution of travel demand forecasts that are subject to uncertainty. It involves identifying variables that influence forecast error, determining probability distributions for each variable, defining scenarios that combine the discrete outcomes of each variable, calculating the probability and predicted revenue for each scenario, and plotting the revenue cumulative distribution function. Key variables of uncertainty identified for a toll road project include truck value of time, travel demand, and growth rates of car and truck value of time. Probability distributions assumed for these variables include lognormal, normal, and triangular. The algorithm allows assessment of the uncertainty and risk associated with toll revenue forecasts.
Prob-Dist-Toll-Forecast-Uncertainty
December 16, 2015
In [2]: # special IPython command to prepare the notebook for matplotlib
%matplotlib inline
import numpy as np
import pandas as pd
import scipy as sp
import seaborn as sns
import matplotlib.pyplot as plt
Estimating the probability distribution of a travel demand forecast. Authors: John L. Bowman, Dinesh Gopinath, and Moshe Ben-Akiva
0.0.1 Algorithm
1. Identify the variables that induce error in the Toll Revenue prediction: $x = (x_1, x_2, \ldots, x_k, \ldots, x_K)$
• Simple Toll Revenue Model - the variables that induce error in the Toll Revenue prediction $r^{(p)}$ are: (1) Value of Time, (2) Population, (3) Households, (4) Employment
• Let: $x_1$ = Value of Time, $x_2$ = Population, $x_3$ = Households, $x_4$ = Employment. Thus $K = 4$, i.e. a 4-dimensional space of possible outcomes $x_k$
• $x = (x_1, x_2, x_3, x_4)$
2. Obtain the probability distribution of $x_k$ for $k = 1, 2, \ldots, K$. The distribution can be based on: (a) direct input or (b) an assumption, e.g. Triangular, Normal, etc. For each dimension $k$, discretize the assumed probability distribution and identify a small set of discrete outcomes $x_k^{n_k}$, $n_k = 1, 2, 3, \ldots, N_k$, assigning probabilities $p(x_k^{n_k})$ to these discrete outcomes, based on reasoning and empirical evidence, so that they approximate the true distribution of $x_k$ (a discretization sketch follows step 4 below):
• Let $N_1 = 4$, $N_2 = 3$, $N_3 = 5$, $N_4 = 4$ and $x_k \in \{x_k^1, x_k^2, x_k^3, \ldots, x_k^{N_k}\}$
• $x_1$ discrete outcomes = $\{x_1^1, x_1^2, x_1^3, x_1^4\}$, with $p(x_1^1) + p(x_1^2) + p(x_1^3) + p(x_1^4) = 1$
• $x_2$ discrete outcomes = $\{x_2^1, x_2^2, x_2^3\}$, with $p(x_2^1) + p(x_2^2) + p(x_2^3) = 1$
• $x_3$ discrete outcomes = $\{x_3^1, x_3^2, x_3^3, x_3^4, x_3^5\}$, with $p(x_3^1) + p(x_3^2) + p(x_3^3) + p(x_3^4) + p(x_3^5) = 1$
• $x_4$ discrete outcomes = $\{x_4^1, x_4^2, x_4^3, x_4^4\}$, with $p(x_4^1) + p(x_4^2) + p(x_4^3) + p(x_4^4) = 1$
3. Develop the Toll Revenue Model for the Baseline Scenario: get the predicted $r^{(p)}_{base}$ for the Baseline Scenario $x = (x_1^{base}, x_2^{base}, x_3^{base}, x_4^{base})$ from the output of the model.
4. Run the Toll Revenue Model one time for each variable that induces error in the prediction:
• Get predicted $r^{(p)}_{k=1}$ based on $x = (x_1^{extreme}, x_2^{base}, x_3^{base}, x_4^{base})$
• Get predicted $r^{(p)}_{k=2}$ based on $x = (x_1^{base}, x_2^{extreme}, x_3^{base}, x_4^{base})$
• Get predicted $r^{(p)}_{k=3}$ based on $x = (x_1^{base}, x_2^{base}, x_3^{extreme}, x_4^{base})$
• Get predicted $r^{(p)}_{k=4}$ based on $x = (x_1^{base}, x_2^{base}, x_3^{base}, x_4^{extreme})$
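Step 2 is where modeling judgment enters, so here is a minimal sketch of one way to carry it out (not from the paper; the distribution, bounds, and $N_k$ below are placeholder choices): split the assumed distribution's range into intervals, take interval midpoints as the discrete outcomes, and take CDF differences across the edges as their probability masses.
import numpy as np
from scipy import stats
def discretize(dist, n_k, lo, hi):
    # Split [lo, hi] into n_k intervals; midpoints become the discrete
    # outcomes, CDF differences across the edges their probabilities.
    edges = np.linspace(lo, hi, n_k + 1)
    outcomes = (edges[:-1] + edges[1:]) / 2.0
    probs = np.diff(dist.cdf(edges))
    return outcomes, probs / probs.sum()  # renormalize away the truncated tails
# Placeholder example: N_k = 4 outcomes for a Normal(100, 15) error variable
x_k, p_k = discretize(stats.norm(100, 15), 4, 100 - 3 * 15, 100 + 3 * 15)
print(x_k, p_k)  # p_k sums to 1 by construction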
5. Calculate the change in the predicted Toll Revenue and in the variables that induce error in the prediction:
• $r_k^{change} = \frac{r^{(p)}_{base} - r^{(p)}_k}{r^{(p)}_k}$, where $k = 1, 2, 3, 4$
• $x_k^{change} = \frac{x_k^{base} - x_k^{extreme}}{x_k^{extreme}}$, where $k = 1, 2, 3, 4$
6. Calculate the elasticity of Toll Revenue with respect to each variable that induces error in the prediction:
• $e_k^r = \frac{r_k^{change}}{x_k^{change}}$, where $k = 1, 2, 3, 4$
7. Define a set of scenarios $S = \{(x_1^{n_1}, \ldots, x_k^{n_k}, \ldots, x_K^{n_K});\ n_k = 1, 2, 3, \ldots, N_k;\ k = 1, 2, \ldots, K\}$, covering all combinations of the discrete outcomes in all $K = 4$ dimensions.
• For the simple example, $S = \{(x_1^1, x_2^1, x_3^1, x_4^1), (x_1^1, x_2^1, x_3^1, x_4^2), \ldots, (x_1^1, x_2^3, x_3^5, x_4^4), \ldots, (x_1^4, x_2^3, x_3^5, x_4^4)\}$
• Number of scenarios in $S$ = $\prod_{k=1}^{K} N_k$. Thus the number of scenarios in the simple example is $4 \times 3 \times 5 \times 4 = 240$, and $s = 1, 2, 3, \ldots, 240$
• Using $s$ as a 1-dimensional index of the members of $S$, refer to a single member of $S$ as $x^{(s)} = (x_1^{(s)}, x_2^{(s)}, \ldots, x_k^{(s)}, \ldots, x_K^{(s)})$. For the simple example: $x^{(s=1)} = (x_1^1, x_2^1, x_3^1, x_4^1)$; $x^{(s=2)} = (x_1^1, x_2^1, x_3^1, x_4^2)$; $x^{(s=240)} = (x_1^4, x_2^3, x_3^5, x_4^4)$
8. Calculate the probability of each scenario. The error variables are assumed mutually independent, so the probability of each scenario is $p(s) = \prod_{k=1}^{K} p(x_k^{(s)})$, $s \in S$. For the simple example:
• $p(s=1) = \prod_{k=1}^{4} p(x_k^{(s=1)}) = p(x_1^1)\,p(x_2^1)\,p(x_3^1)\,p(x_4^1)$
• $p(s=2) = \prod_{k=1}^{4} p(x_k^{(s=2)}) = p(x_1^1)\,p(x_2^1)\,p(x_3^1)\,p(x_4^2)$
• $p(s=240) = \prod_{k=1}^{4} p(x_k^{(s=240)}) = p(x_1^4)\,p(x_2^3)\,p(x_3^5)\,p(x_4^4)$
9. Calculate the Toll Revenue for scenario $s$: $r^{(s)} = r^{(p)}_{base} \prod_{k=1}^{K} \left(\frac{x_k^{(s)}}{x_k^{base}}\right)^{e_k^r}$, $s \in S$. For the simple example:
• $r^{(s=1)} = r^{(p)}_{base} \left(\frac{x_1^1}{x_1^{base}}\right)^{e_1^r} \left(\frac{x_2^1}{x_2^{base}}\right)^{e_2^r} \left(\frac{x_3^1}{x_3^{base}}\right)^{e_3^r} \left(\frac{x_4^1}{x_4^{base}}\right)^{e_4^r}$
• $r^{(s=2)} = r^{(p)}_{base} \left(\frac{x_1^1}{x_1^{base}}\right)^{e_1^r} \left(\frac{x_2^1}{x_2^{base}}\right)^{e_2^r} \left(\frac{x_3^1}{x_3^{base}}\right)^{e_3^r} \left(\frac{x_4^2}{x_4^{base}}\right)^{e_4^r}$
• $r^{(s=240)} = r^{(p)}_{base} \left(\frac{x_1^4}{x_1^{base}}\right)^{e_1^r} \left(\frac{x_2^3}{x_2^{base}}\right)^{e_2^r} \left(\frac{x_3^5}{x_3^{base}}\right)^{e_3^r} \left(\frac{x_4^4}{x_4^{base}}\right)^{e_4^r}$
10. Using the pairs $r^{(s)}$ and $p(s)$, plot the Revenue CDF.
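Putting steps 7-10 together, a minimal generic sketch of the scenario-enumeration machinery (the outcome sets, probabilities, elasticities, and base values here are placeholder inputs, not the project's numbers):
import itertools
import numpy as np
def revenue_cdf(outcomes, probs, elasticities, x_base, r_base):
    # outcomes[k]: discrete outcomes of variable k; probs[k]: their masses
    # elasticities[k]: e_k^r from step 6; x_base[k]: baseline value of x_k
    revenues, p = [], []
    for scenario in itertools.product(*[zip(o, q) for o, q in zip(outcomes, probs)]):
        xs, ps = zip(*scenario)
        # Step 9: r(s) = r_base * prod_k (x_k(s)/x_k_base)^(e_k^r)
        revenues.append(r_base * np.prod([(x/xb)**e for x, xb, e in zip(xs, x_base, elasticities)]))
        # Step 8: independence => p(s) = prod_k p(x_k(s))
        p.append(np.prod(ps))
    order = np.argsort(revenues)  # Step 10: sort by revenue, accumulate probability
    return np.array(revenues)[order], np.cumsum(np.array(p)[order])
# Placeholder inputs: two variables with 2 and 3 outcomes => 6 scenarios
r, cum_p = revenue_cdf([[0.9, 1.1], [80, 100, 120]],
                       [[0.5, 0.5], [0.25, 0.5, 0.25]],
                       elasticities = [1.0, 2.0], x_base = [1.0, 100.0], r_base = 1000.0)
print(r, cum_p)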
0.0.2 Sources of Uncertainty - Toll (Kockelman)
• Estimates of trip generation
• Estimates of land development
• Models: Trip Generation, Trip Distribution, Mode Choice
• Toll-technology adoption rates
• Heterogeneity in Value of Time (VOT) savings
• Network attributes - traffic congestion (low-volume corridors have greater uncertainty in their forecasts)
• Uncertainty in land development patterns
• Demographic and employment projections
• Tolling design - shadow tolls (the government pays the concessionaire an amount based on toll road use - similar to toll free) or user-paid tolls (drivers' willingness to pay is more complex and difficult to understand - more forecasting risk)
• Tolling culture of a region, i.e. the degree to which tolls have been used in the past
• Travel demand model imperfections (heterogeneity of VOT is ignored; variable tolls or HOT lanes that are free at certain hours)
• Competitive advantage - a toll on the only bridge vs. a toll on a freeway where drivers have more routing options
• User attributes - toll facilities serving a small market segment of travelers allow more reliable forecasts than those serving heterogeneous populations
• Road location and configuration
• Demand variations over times of day and days of the year also affect forecast reliability
• Bain and Wilkins (2002) - poorly estimated VOTTs, economic downturns, mis-prediction of future land use patterns, lower-than-predicted time savings, added competition, lower-than-anticipated truck usage, high variability in traffic volumes
• Economic growth and related changes in income and employment
• Total demand model errors
• Model error in elasticity of demand
• Value of time
• Errors in measurement of network times and costs
• Operating speed
• Roadway improvements
1 Texas North Tarrant Express Segment 3A
• Revenue and Transaction Forecast Year = 2018
2018 Revenue and Transactions
• Forecasted 2018 Annual Project Revenue (000’s 2011 Dollars) = 27612
• Forecasted 2018 Daily Transactions = 40086
Truck VOT Calculations
• SOV VOT - Lognormal distribution with mean = $18.59 and standard deviation = $7.4 ($\mu = 2.849$ and $\sigma = 0.383$)
• Coefficient of variation: $C_v^{sov} = \frac{7.4}{18.59} = 0.398$
• HHM Truck VOT: Mean = $36.48 and Standard deviation = $30.24
• AECOM Truck VOT: Mean = $60.76 and Standard deviation = $51.08
• $Average\ Truck\ VOT = \frac{HHM\ Truck\ VOT + AECOM\ Truck\ VOT}{2} = \frac{36.48 + 60.76}{2} = \$48.62$
• Standard deviation of Average Truck VOT = $C_v^{sov} \times Average\ Truck\ VOT = 0.398 \times 48.62 = \$19.35$ ($\mu = 3.811$ and $\sigma = 0.383$, calculations below)
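The ($\mu$, $\sigma$) values quoted above follow from the standard lognormal moment-matching identities, stated here for reference: given the mean $m$ and standard deviation $s$ of the lognormal variable itself,
$\mu = \ln\left(\frac{m}{\sqrt{1 + s^2/m^2}}\right), \qquad \sigma = \sqrt{\ln\left(1 + \frac{s^2}{m^2}\right)}$
With $m = 48.62$ and $s = 19.35$ these give $\mu = 3.811$ and $\sigma = 0.383$, matching the In [3] computation below.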
In [3]: # Parameters for Truck Lognormal Distribution
m = 48.62
s = 19.35
truck_ln_mu = np.log(m/np.sqrt(1+((s**2)/(m**2))))
truck_ln_sigma = np.sqrt(np.log(1+((s**2)/(m**2))))
print 'truck_ln_mu = %1.3f' % truck_ln_mu
print 'truck_ln_sigma = %1.3f' % truck_ln_sigma
truck_ln_mu = 3.811
truck_ln_sigma = 0.383
$r^{(p)}_{base}$ = 74754 (000's 2011 Dollars)
Variables: Sources of Uncertainty
• Truck VOT: $x_1$
– Elasticity of Revenue to Truck VOT = 0.994
– $x_1^{base}$ = $60.76
– Probability distribution: Lognormal with mean = $48.62 and std. dev = $19.35 ($\mu = 3.811$ and $\sigma = 0.383$)
• Travel Demand: $x_2$
– Elasticity of Revenue to Demand (Transactions as proxy) = 2.57
– $x_2^{base}$ = 61056
– Probability distribution: Normal with $\mu = 58871.5$ and $\sigma = 2184.5$
• Car VOT Growth: $x_3$
– Elasticity of Revenue to Car VOT Growth = 0.19
– $x_3^{base}$ = 2.1%
– Probability distribution: Triangular with Min = 0.5%, Mean = 1.05%, and Max = 2.1%
• Truck VOT Growth: $x_4$
– Elasticity of Revenue to Truck VOT Growth = 0.19
– $x_4^{base}$ = 2.5%
– Probability distribution: Triangular with Min = 0.5%, Mean = 1.25%, and Max = 2.5%
Truck VOT Probability Distribution: Lognormal
In [4]: mu = 3.811
sigma = 0.383
low = 1
high = 120
dx_1 = 2 # Length of interval
# Comb points along x axis
x_1 = np.arange(low, high, dx_1)
# Compute y values: pdf at each value of x
vot_y = (1/(sigma * x_1 * np.sqrt(2 * np.pi))) * np.exp(-0.5 * ((np.log(x_1) - mu)/sigma) ** 2)
# Plot the function
plt.figure(figsize = (16, 8))
plt.stem(x_1, vot_y, markerfmt = ' ') # This draws the intervals
plt.xlabel('$x_1$')
plt.ylabel('$p(x_1)$')
plt.title('Discretized Log-Normal Probability Density')
area = np.sum(dx_1 * vot_y)
print 'Probability Sum = %1.4f' % area
print 'N_1 = %d' % len(x_1)
temp1 = np.array([x_1, vot_y * dx_1])
Probability Sum = 0.9946
N_1 = 60
Travel Demand Probability Distribution: Normal
In [5]: # Mean Transactions = (2035 Transactions + 2018 Transactions)/2
print 'Mean Transactions = %s' % ((63635+40086)/2.0)
# Std Dev Transactions = (2035 Transactions - 2018 Transactions)/2
print 'Std. Dev Transactions = %s' % ((63635-40086)/2.0)
Mean Transactions = 51860.5
Std. Dev Transactions = 11774.5
In [6]: demand_mean = 51860.5
demand_sd = 11774.5
demand_low = demand_mean - 3 * demand_sd # low end of x axis
demand_high = demand_mean + 3 * demand_sd # high end of x axis
dx_2 = 2000 # Length of interval
# Comb points along x axis
x_2 = np.arange(demand_low, demand_high, dx_2)
# Compute y values: pdf at each value of x
demand_y = (1/(demand_sd * np.sqrt(2 * np.pi))) * np.exp(-0.5 * ((x_2 - demand_mean)/demand_sd) ** 2)
# Plot the function
plt.figure(figsize = (16, 8))
plt.stem(x_2, demand_y, markerfmt = ' ') # This draws the intervals
plt.xlabel('$Demand$')
plt.ylabel('$p(Demand)$')
plt.title('Discretized Normal Probability Density')
area = np.sum(dx_2 * demand_y)
print 'Probability Sum = %1.4f' % area
print 'N_2 = %d' % len(x_2)
Probability Sum = 0.9978
N_2 = 36
Car VOT Growth Probability Distribution: Triangular
In [7]: min_growth_car = 0.004
mean_growth_car = 0.0105
max_growth_car = 0.022
car_array = np.random.triangular(min_growth_car, mean_growth_car, max_growth_car, size = 100000)
#plt.hist(car_array, bins = 10)
car_val = np.histogram(car_array, bins = 20)
car_y = [float(i)/np.sum(car_val[0]) for i in car_val[0]]
# Binwidth issue
x_car = car_val[1]
x_3 = []
for i in range(len(x_car) - 1):
    temp = (x_car[i] + x_car[i+1])/2
    x_3.append(temp)
# Plot triangular distribution
plt.figure(figsize = (16, 8))
plt.stem(x_3, car_y, markerfmt = ' ') # This draws the intervals
plt.xlabel('$Car Growth$')
plt.ylabel('$p(Car Growth)$')
plt.title('Discretized Triangular Probability Density')
print len(x_3)
print np.sum(car_y)
20
1.0
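In [7] discretizes the triangular distribution by Monte Carlo sampling plus a histogram, so car_y carries sampling noise. A deterministic alternative sketch (the *_exact names are mine, and the bin edges are pinned to the distribution's support rather than the sample range) differences the exact triangular CDF instead:
from scipy.stats import triang
lo, mode, hi = min_growth_car, mean_growth_car, max_growth_car
c = (mode - lo) / (hi - lo)                 # scipy's shape parameter: relative mode position
car_dist = triang(c, loc = lo, scale = hi - lo)
edges = np.linspace(lo, hi, 21)             # 20 bins, as in np.histogram(..., bins = 20)
x_3_exact = (edges[:-1] + edges[1:]) / 2.0  # bin midpoints
car_y_exact = np.diff(car_dist.cdf(edges))  # exact probability mass per bin
print(np.sum(car_y_exact))                  # exactly 1.0, no sampling noise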
Truck VOT Growth Probability Distribution: Triangular
In [8]: min_growth_truck = 0.004
mean_growth_truck = 0.0125
max_growth_truck = 0.026
truck_array = np.random.triangular(min_growth_truck, mean_growth_truck, max_growth_truck, size = 100000)
#plt.hist(car_array, bins = 10)
truck_val = np.histogram(truck_array, bins = 20)
truck_y = [float(i)/np.sum(truck_val[0]) for i in truck_val[0]]
# Binwidth issue
x_truck = truck_val[1]
x_4 = []
for i in range(len(x_truck) - 1):
    temp = (x_truck[i] + x_truck[i+1])/2
    x_4.append(temp)
# Plot triangular distribution
plt.figure(figsize = (16, 8))
plt.stem(x_4, truck_y, markerfmt = ' ') # This draws the intervals
plt.xlabel('$Truck Growth$')
plt.ylabel('$p(Truck Growth)$')
plt.title('Discretized Triangular Probability Density')
print len(x_4)
print np.sum(truck_y)
20
1.0
Scenarios
In [9]: S = [[i, j, k, l] for i in x_1 for j in x_2 for k in x_3 for l in x_4]
print S[0]
print '\n'
print 'Number of Scenarios = ' + str(len(S))
[1, 16537.0, 0.0044911329557659595, 0.0045996835827973384]
Number of Scenarios = 864000
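A side note on the list comprehension in In [9]: it materializes all 864,000 scenarios as Python lists at once. An equivalent lazy sketch with itertools.product (same ordering, same count) keeps memory flat if only a streaming pass over the scenarios is needed:
import itertools
print('Number of Scenarios = %d' % (len(x_1) * len(x_2) * len(x_3) * len(x_4)))
for i, j, k, l in itertools.product(x_1, x_2, x_3, x_4):
    pass  # compute revenue and probability for scenario (i, j, k, l) here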
Probability and Revenue Calculations for Scenarios
In [10]: # Constants: Base Revenue
rp_base = 27612
# Constants: Base values of variables
x_1b = 60.76
x_2b = 40086
x_3b = 0.021
x_4b = 0.025
# Constants: Elasticities of variables
e_x1 = 0.994
e_x2 = 2.57
e_x3 = 0.19
e_x4 = 0.19
revenue_S = []
prob_S = []
for i in range(len(S)):
    # R(s)
    temp_rev = rp_base * (S[i][0]/x_1b)**(e_x1) * (S[i][1]/x_2b)**(e_x2) * (S[i][2]/x_3b)**(e_x3) * (S[i][3]/x_4b)**(e_x4)
    revenue_S.append(temp_rev)
    # Probability calculation:
    # Truck VOT: pdf times interval width dx_1 gives the probability mass
    p_x1 = (1/(sigma * S[i][0] * np.sqrt(2 * np.pi))) * np.exp(-0.5 * ((np.log(S[i][0]) - mu)/sigma) ** 2) * dx_1
    # Demand: pdf times interval width dx_2 gives the probability mass
    p_x2 = (1/(demand_sd * np.sqrt(2 * np.pi))) * np.exp(-0.5 * ((S[i][1] - demand_mean)/demand_sd) ** 2) * dx_2
    # Car VOT Growth
    if S[i][2] in x_3:
        cp = x_3.index(S[i][2])
        p_x3 = car_y[cp]
    # Truck VOT Growth
    if S[i][3] in x_4:
        tp = x_4.index(S[i][3])
        p_x4 = truck_y[tp]
    prob_S.append(p_x1 * p_x2 * p_x3 * p_x4)
print 'Probability Sum = %0.4f' % np.sum(prob_S)
Probability Sum = 0.9924
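Because both r(s) and p(s) factorize across the four variables, the In [10] loop can also be written with NumPy outer products. A vectorized sketch (reusing the arrays defined earlier; the _v names are mine) that reproduces revenue_S and prob_S:
import functools
# Per-variable revenue factors (x / x_base) ** elasticity
f1 = (x_1 / x_1b) ** e_x1
f2 = (x_2 / x_2b) ** e_x2
f3 = (np.asarray(x_3) / x_3b) ** e_x3
f4 = (np.asarray(x_4) / x_4b) ** e_x4
# 4-D grids over all scenarios; C-order ravel matches the ordering of S
rev_grid = rp_base * functools.reduce(np.multiply.outer, [f1, f2, f3, f4])
prob_grid = functools.reduce(np.multiply.outer,
                             [vot_y * dx_1, demand_y * dx_2,
                              np.asarray(car_y), np.asarray(truck_y)])
revenue_S_v, prob_S_v = rev_grid.ravel(), prob_grid.ravel()
print('Probability Sum = %0.4f' % np.sum(prob_S_v))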
In [11]: # Sorting Result based on Revenue
output = (np.array([revenue_S, prob_S])).T
output = output[output[:, 0].argsort()]
In [12]: # Plotting Cumulative Probability Distribution
plt.figure(figsize = (16, 8))
plt.plot(output[:,0], np.cumsum(output[:,1]), linewidth = 2) # Selecting array column: array[:, col]
# Plotting Predicted Revenue
plt.axvline(x = rp_base, color = 'r')
plt.text(rp_base + 500, 0.1, 'Predicted Revenue for 2018', fontsize = 16) # anchor the label to the plotted line
# Remove Scientific Notation
ax = plt.gca()
ax.get_xaxis().get_major_formatter().set_scientific(False)
plt.xlabel('$Revenue$', fontsize = 16)
plt.ylabel('$p(Revenue)$', fontsize = 16)
plt.title("Cumulative Probability Distribution - Revenue (000's 2011 Dollars)", fontsize = 16)
# Set tick label size
plt.tick_params(axis = 'both', which = 'major', labelsize = 14)
print 'Probability Sum = %0.4f' % np.sum(prob_S)
print 'Demand std = %d' % demand_sd
Probability Sum = 0.9924
Demand std = 11774
Percentile Calculation
In [14]: year = 2018
cum_prob = pd.DataFrame({'Revenue': output[:,0], 'Cumulative Probability': np.cumsum(output[:,1])})
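The source cell is truncated here, so the rest of the percentile calculation is reconstructed as a sketch (the percentile levels are my illustration, not the original's): read revenue values off the CDF with a searchsorted lookup.
cdf = cum_prob['Cumulative Probability'].values
rev = cum_prob['Revenue'].values
for pct in [0.05, 0.50, 0.95]:
    idx = np.searchsorted(cdf, pct)  # first scenario whose cumulative probability reaches pct
    print('%dth percentile Revenue for %d = %0.0f' % (pct * 100, year, rev[idx]))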