Large-Scale Nonparametric Estimation of Vehicle Travel Time Distributions
1. Data
Model and Fitting
Experimental Results
Large-Scale Nonparametric Estimation
of Vehicle Travel Time Distributions
Rikiya Takahashi, Takayuki Osogami,
and Tetsuro Morimura
{rikiya,osogami,tetsuro}@jp.ibm.com
IBM Research - Tokyo
Route recommendation and traffic simulation
Which route (e.g., A or B) is chosen by a car driver?
Route recommendation: Which route should you select?
Traffic simulation: Which route do you select?
Dijkstra's algorithm for minimizing expected travel-time is inflexible because of
Risk unawareness: Variability of travel-time is not considered.
Unrealistic homogeneity: Everyone takes the same route.
Example of risk-sensitive route choice: ICTE
Instead of the mean, evaluate the Iterated Conditional Tail
Expectation (ICTE) (Osogami, 2011) of travel-time.
Quantiles of the travel-time distribution are utilized.
The value q of CTE_q can differ among drivers.
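As a minimal sketch of the quantity involved, the following Python snippet computes an empirical conditional tail expectation CTE_q from travel-time samples. It assumes the common definition of CTE_q as the mean of the samples in the upper (1 - q) tail. The function name `cte` and the sample values are hypothetical, and the iterated version (ICTE) of Osogami (2011) is not reproduced here.

```python
import math

def cte(samples, q):
    """Empirical conditional tail expectation: mean of the upper (1 - q) tail.

    Assumes 0 <= q < 1; q = 0 reduces to the plain mean.
    """
    if not samples or not (0.0 <= q < 1.0):
        raise ValueError("need non-empty samples and 0 <= q < 1")
    ordered = sorted(samples)
    # Index of the first sample that belongs to the upper tail.
    start = min(int(math.floor(q * len(ordered))), len(ordered) - 1)
    tail = ordered[start:]
    return sum(tail) / len(tail)

# Hypothetical travel times (minutes) on two routes with equal means.
route_a = [10, 10, 10, 10, 10, 10, 10, 10, 10, 10]
route_b = [6, 6, 6, 6, 6, 6, 6, 6, 6, 46]

mean_a, mean_b = cte(route_a, 0.0), cte(route_b, 0.0)
risk_a, risk_b = cte(route_a, 0.9), cte(route_b, 0.9)
```

In this toy case both routes have the same mean, so a mean-minimizing planner is indifferent between them, while a driver with a large q would see route B as much riskier.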
Agenda
What we need: the probability density function (p.d.f.) of
travel-time for every link of a road network.
Main proposal: a data-mining algorithm for interpolating the
p.d.f. for every link.
1. Summary of real data
2. Model and how to fit it
3. Experimental prediction performance
Our road network and travel-time samples
We have a road network and a probe-car dataset:
1.2M intersections and 3.3M links in the Greater Tokyo Area.
3.1M travel-time samples from a total of 58,584 taxis.
Data are sparse, especially in suburban or rural regions.
Figure: Heatmaps based on the total number of travel-time
samples in 24 hours for each link. The green, yellow, or red points
are located on the links that have at least 1, 10, or 100 samples,
respectively.
Distribution of relative travel-time
Histogram of the relative travel-time y
y = (actual travel-time)/(travel-time by legal speed limit)
Modes of P(y) lie roughly between 0 and 2.
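For concreteness, a small Python sketch of this normalization follows. It assumes that the travel-time by legal speed limit equals the link length divided by the speed limit. The function name and the numbers are hypothetical.

```python
def relative_travel_time(actual_seconds, link_length_m, speed_limit_kmh):
    """Return y = (actual travel time) / (travel time at the legal speed limit)."""
    speed_limit_ms = speed_limit_kmh * 1000.0 / 3600.0  # km/h -> m/s
    standard_seconds = link_length_m / speed_limit_ms
    return actual_seconds / standard_seconds

# Hypothetical link: 500 m long with a 60 km/h limit, i.e. 30 s at the limit.
y = relative_travel_time(actual_seconds=45.0, link_length_m=500.0,
                         speed_limit_kmh=60.0)
```

A value of y above 1 means the vehicle was slower than the legal speed limit would allow on that link.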
[Figure: histograms of x = (actual time)/(standard time), plotted over the range 0 to 10, for Link ID '1049171' and Link ID '1539993' in the hourly slots 8:00-8:59, 9:00-9:59, 10:00-10:59, 14:00-14:59, 15:00-15:59, 16:00-16:59, 18:00-18:59, 20:00-20:59, and 22:00-22:59. Each panel is labeled with its number of samples, which ranges from 31 to 103.]
Issues we must solve
Scalability: The road network and the set of travel-time
samples are both large.
Data sparseness: Travel-time samples are limited or missing on
suburban links.
Non-Gaussianity: The distribution of travel-time is not Gaussian.
Multi-modality or heavy tails can occur, so least-squares
(L2-loss) regression is inflexible.
Assumption for solving these issues: connected links have similar
distributions of vehicle velocities, with the similarity depending
on the number of hops between them.
Conditional density estimator of relative travel-time
Conditional p.d.f. of the relative travel-time y
\[
f_e(y) = \frac{\lambda_0 \phi_0(y) + \sum_{i=1}^{m} \lambda_i K(e, e_{\pi[i]}) \phi_i(y)}{\lambda_0 + \sum_{i=1}^{m} \lambda_i K(e, e_{\pi[i]})},
\]
$E_\Phi \triangleq \{e_{\pi[1]}, \cdots, e_{\pi[m]}\}$ : subset of $E$
$\Phi \triangleq \{\phi_0, \phi_1, \cdots, \phi_m\}$ : set of basis density functions
$K(\cdot, \cdot)$ : similarity function between links
$\lambda \triangleq (\lambda_0, \lambda_1, \cdots, \lambda_m)^T$ : vector of link importance
The link-independent terms $\lambda_0$ and $\phi_0(\cdot)$ are introduced for
handling the case $\forall i \in \{1, \cdots, m\},\ K(e, e_{\pi[i]}) \equiv 0$.
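The estimator above can be sketched in a few lines of Python. This is only an illustration under assumptions: the gamma basis densities, their parameters, the importance values, and the similarity values are all hypothetical, and `conditional_density` is a name introduced here. It also assumes λ0 > 0, so that the denominator stays positive when all similarities vanish.

```python
import math

def gamma_pdf(y, shape, scale):
    """Gamma density, used here only as an illustrative basis function."""
    if y <= 0.0:
        return 0.0
    log_p = ((shape - 1.0) * math.log(y) - y / scale
             - math.lgamma(shape) - shape * math.log(scale))
    return math.exp(log_p)

def conditional_density(y, lam0, phi0, lams, sims, phis):
    """Evaluate f_e(y) for one link e.

    lams[i] is the importance of basis link i, sims[i] is K(e, e_pi[i]),
    and phis[i] is the basis density phi_i (a callable).
    """
    numerator = lam0 * phi0(y)
    denominator = lam0
    for lam, sim, phi in zip(lams, sims, phis):
        numerator += lam * sim * phi(y)
        denominator += lam * sim
    return numerator / denominator

phi0 = lambda y: gamma_pdf(y, 2.0, 0.6)
phis = [lambda y: gamma_pdf(y, 3.0, 0.4), lambda y: gamma_pdf(y, 6.0, 0.5)]

# A link similar to both basis links, and an isolated link whose
# similarities are all zero.
f_near = conditional_density(1.0, 0.1, phi0, [1.0, 2.0], [0.8, 0.2], phis)
f_isolated = conditional_density(1.0, 0.1, phi0, [1.0, 2.0], [0.0, 0.0], phis)
```

For the isolated link the estimate falls back to the link-independent density ϕ0, which is the case the λ0 and ϕ0 terms are meant to handle.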
3 steps in estimating the parameters
\[
f_e(y) = \frac{\lambda_0 \phi_0(y) + \sum_{i=1}^{m} \lambda_i K(e, e_{\pi[i]}) \phi_i(y)}{\lambda_0 + \sum_{i=1}^{m} \lambda_i K(e, e_{\pi[i]})}
\]
A) Basis functions $\Phi \triangleq \{\phi_0, \phi_1, \cdots, \phi_m\}$: mixtures of gamma or
log-normal distributions using convex clustering.
B) Link similarity $K(\cdot, \cdot)$: sparse diffusion kernel on a
link-connectivity graph.
C) Link importance $\lambda \triangleq (\lambda_0, \lambda_1, \cdots, \lambda_m)^T$: Kullback-Leibler
Importance Estimation Procedure (KLIEP)
(Sugiyama et al., 2008).
Stability of fitting: each component can be fitted with either
convex optimization or simple matrix multiplication.
A) Fitting nonparametric basis density functions
Mixture of at most L gamma or log-normal distributions
\[
\phi_i(y) = \sum_{\ell=1}^{L} \theta_{i\ell} \psi_\ell(y)
\]
Optimize mixture weights as
\[
\max_{\theta_i} \sum_{y \in Y_i} \log \left[ \sum_{\ell=1}^{L} \theta_{i\ell} \psi_\ell(y) \right].
\]
Figure: Sliding windows for fitting $\psi_1, \cdots, \psi_L$
Convex w.r.t. $\theta_i \triangleq (\theta_{i1}, \cdots, \theta_{iL})^T$
Fast convergence with Sequential
Minimal Optimization (SMO)
(Takahashi, 2011)
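To make the weight-fitting objective concrete, here is a minimal Python sketch that maximizes the same log-likelihood over the probability simplex with the component densities held fixed. Note the substitution: it uses plain EM-style updates, not the SMO solver cited above. The log-normal candidate components and the samples are hypothetical.

```python
import math

def lognormal_pdf(y, mu, sigma):
    """Log-normal density with log-scale mean mu and log-scale std sigma."""
    if y <= 0.0:
        return 0.0
    z = (math.log(y) - mu) / sigma
    return math.exp(-0.5 * z * z) / (y * sigma * math.sqrt(2.0 * math.pi))

def fit_mixture_weights(samples, components, iterations=200):
    """Maximize sum_y log(sum_l theta_l * psi_l(y)) over the simplex.

    Uses EM updates with the component densities psi_l held fixed.
    """
    num = len(components)
    theta = [1.0 / num] * num
    # Precompute psi_l(y) for every sample and component.
    dens = [[psi(y) for psi in components] for y in samples]
    for _ in range(iterations):
        new_theta = [0.0] * num
        for row in dens:
            total = sum(t * d for t, d in zip(theta, row))
            for l in range(num):
                new_theta[l] += theta[l] * row[l] / total
        theta = [v / len(samples) for v in new_theta]
    return theta

def log_likelihood(samples, components, theta):
    """Objective value for a given weight vector."""
    return sum(math.log(sum(t * psi(y) for t, psi in zip(theta, components)))
               for y in samples)

# Hypothetical candidate components and relative travel-time samples.
components = [lambda y, m=m: lognormal_pdf(y, m, 0.3)
              for m in (-0.5, 0.0, 0.5, 1.0)]
samples = [0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 2.5, 2.8, 3.0]
theta = fit_mixture_weights(samples, components)
```

Because only the weights are optimized, the problem is convex, and components that do not help explain the samples tend to receive weights close to zero.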
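B) Link connectivity graph and its Laplacian
Adjacency matrix $A = (a_{ij};\ e_i = (u_i, v_i),\ e_j = (u_j, v_j))$ as
\[
a_{ij} =
\begin{cases}
\dfrac{1}{2} + \dfrac{\Delta^T(e_i)\,\Delta(e_j)}{2\,\|\Delta(e_i)\|\,\|\Delta(e_j)\|} & \text{if } u_i = v_j \lor v_i = u_j \\
0 & \text{otherwise}
\end{cases}
\]
Figure: Values of $\{a_{ij}\}$ when the wide arrow represents $e_i$.
$\Delta(e) \triangleq x_v - x_u$ for $e = (u, v)$ and $x_u, x_v \in \mathbb{R}^2$ : location
$D \triangleq \mathrm{diag}\left(\sum_{j=1}^{|E|} a_{1j}, \cdots, \sum_{j=1}^{|E|} a_{|E|j}\right)$
$H \triangleq D^{-1/2} (A - D) D^{-1/2}$ : negative normalized Laplacian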
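The construction above can be sketched in Python as follows. This is an illustration only: it uses dense nested lists and a quadratic loop, which would not scale to millions of links, and it reads the adjacency weight as one half plus half the cosine of the angle between the two link directions. The node coordinates and links are hypothetical, and leaving zero-degree links with all-zero rows in H is an assumption made here to avoid dividing by zero.

```python
import math

def link_adjacency(links, coords):
    """a_ij = 1/2 + cos(angle between directions)/2 for connected links, else 0."""
    def delta(link):
        (ux, uy), (vx, vy) = coords[link[0]], coords[link[1]]
        return (vx - ux, vy - uy)

    n = len(links)
    a = [[0.0] * n for _ in range(n)]
    for i, (ui, vi) in enumerate(links):
        for j, (uj, vj) in enumerate(links):
            # Connected when link i starts where j ends, or ends where j starts.
            if ui == vj or vi == uj:
                di, dj = delta(links[i]), delta(links[j])
                dot = di[0] * dj[0] + di[1] * dj[1]
                norm = math.hypot(*di) * math.hypot(*dj)
                a[i][j] = 0.5 + 0.5 * dot / norm
    return a

def negative_normalized_laplacian(a):
    """H = D^{-1/2} (A - D) D^{-1/2}, with D the diagonal of row sums of A."""
    n = len(a)
    deg = [sum(row) for row in a]
    h = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Assumption: rows/columns of zero-degree links stay zero.
            if deg[i] > 0.0 and deg[j] > 0.0:
                d_ij = deg[i] if i == j else 0.0
                h[i][j] = (a[i][j] - d_ij) / math.sqrt(deg[i] * deg[j])
    return h

# Hypothetical network: link 0 -> 1 heads east; from node 1 one can continue
# east (1 -> 2), turn north (1 -> 3), or make a U-turn (1 -> 0).
coords = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (2.0, 0.0), 3: (1.0, 1.0)}
links = [(0, 1), (1, 2), (1, 3), (1, 0)]
A = link_adjacency(links, coords)
H = negative_normalized_laplacian(A)
```

In this toy network, going straight on gives weight 1, a right-angle turn gives 0.5, and a U-turn gives 0, so similarity propagates mainly along smoothly continuing roads.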
B) Sparse diffusion kernel as link similarity
The diffusion kernel exp(βH) (Kondor and Lafferty, 2002)
is dense and computationally infeasible, while H is sparse.
Assume that traffic does not diffuse broadly in a short time.
Then β is small and an approximate kernel matrix is
\[
K(\beta, p) = \left( I + \frac{\beta}{p} H \right)^{p} = \sum_{q=0}^{p} \frac{p!\,\beta^{q}}{q!\,(p-q)!\,p^{q}} H^{q},
\]
where p is a resolution hyperparameter in discretization.
The (i, j)-th element of the matrix K (β, p) gives the
similarity value between the edges ei and ej .
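A minimal Python sketch of this approximation follows, with sparse matrices stored as nested dictionaries and the power computed by repeated sparse multiplication. The storage format, the function names, and the entries of the toy matrix H are assumptions made for illustration, not the authors' implementation.

```python
def sparse_matmul(x, y):
    """Multiply two sparse matrices stored as {row: {col: value}} dictionaries."""
    out = {}
    for i, row in x.items():
        acc = {}
        for k, x_ik in row.items():
            for j, y_kj in y.get(k, {}).items():
                acc[j] = acc.get(j, 0.0) + x_ik * y_kj
        out[i] = acc
    return out

def sparse_diffusion_kernel(h, n, beta, p):
    """Return K(beta, p) = (I + (beta / p) * H)^p for an n-by-n sparse H."""
    step = {i: {i: 1.0} for i in range(n)}
    for i, row in h.items():
        for j, value in row.items():
            step[i][j] = step[i].get(j, 0.0) + (beta / p) * value
    kernel = {i: {i: 1.0} for i in range(n)}  # start from the identity
    for _ in range(p):
        kernel = sparse_matmul(kernel, step)
    return kernel

# Hypothetical H for a chain of four links e0 - e1 - e2 - e3.
H = {
    0: {0: -1.0, 1: 0.7},
    1: {0: 0.7, 1: -1.0, 2: 0.5},
    2: {1: 0.5, 2: -1.0, 3: 0.7},
    3: {2: 0.7, 3: -1.0},
}
K1 = sparse_diffusion_kernel(H, n=4, beta=1.0, p=1)
K2 = sparse_diffusion_kernel(H, n=4, beta=1.0, p=2)
```

The stored entries of K(β, p) reach at most p hops away from each link, which is what keeps the kernel sparse for small p: in the chain above, e0 has no stored similarity to e2 when p = 1, gains one when p = 2, and has none to e3 in either case.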
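C) Optimize the link importance with SMO
The vector of link importance λ is optimized with KLIEP as
\[
\max_{\lambda} \sum_{e \in E_+} \sum_{y \in Y[e]} \log \left[ \lambda_0 \phi_0(y) + \sum_{i=1}^{m} \lambda_i K(e, e_{\pi[i]}) \phi_i(y) \right]
\]
\[
\text{s.t.} \quad \sum_{e \in E_+} \sum_{y \in Y[e]} \left[ \lambda_0 + \sum_{i=1}^{m} \lambda_i K(e, e_{\pi[i]}) \right] = n.
\]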
Convex optimization
Equivalent objective to that of convex clustering, with a
variable transformation
Can also be accelerated with SMO
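The slide does not spell out the variable transformation. One possible choice, given here as a sketch under stated assumptions and not as the authors' derivation, is shown below. It assumes that n is the total number of samples, that λ is nonnegative, and that the constants c_i (a symbol introduced here) are positive.

```latex
% Assumed: n = \sum_{e \in E_+} |Y[e]|, \lambda \ge 0, and c_i > 0.
\[
c_i \triangleq \sum_{e \in E_+} |Y[e]| \, K(e, e_{\pi[i]}), \qquad
w_0 \triangleq \lambda_0, \qquad
w_i \triangleq \frac{\lambda_i c_i}{n} \quad (i = 1, \cdots, m).
\]
% The constraint n \lambda_0 + \sum_i \lambda_i c_i = n becomes a simplex constraint:
\[
\sum_{i=0}^{m} w_i = 1 .
\]
% The objective becomes a mixture-weight log-likelihood with rescaled components:
\[
\max_{w} \sum_{e \in E_+} \sum_{y \in Y[e]}
\log \left[ w_0 \phi_0(y) + \sum_{i=1}^{m} w_i \, g_i(e, y) \right],
\qquad
g_i(e, y) \triangleq \frac{n \, K(e, e_{\pi[i]}) \, \phi_i(y)}{c_i} .
\]
```

In this form the problem has the same structure as the mixture-weight fitting in step A, namely a log-likelihood of a weighted sum maximized over the probability simplex.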
Experimental setting
10-fold likelihood cross-validation is used to evaluate
predictive performance.
Performance is evaluated independently for the 24 hourly
datasets.
Hyperparameters are also chosen by validation.
L = 100 and p = 8 (fixed)
r ∈ {1, 1.5, 2, · · · , 3} and β ∈ {1, 2, 3, 4, 5}.
Compare with parametric regression methods assuming a
single log-normal distribution.
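The evaluation protocol can be sketched as follows. This is a simplified illustration: folds are assigned by index modulo k, the samples are hypothetical, and `fit_lognormal` is a single log-normal fitted by maximum likelihood to pooled samples. It stands in for a parametric baseline and is not the regression methods used in the experiments.

```python
import math

def kfold_avg_test_loglik(samples, fit, k=10):
    """Average held-out log-likelihood over k folds.

    `fit` maps a list of training samples to a density function.
    Fold membership is assigned by index modulo k.
    """
    total, count = 0.0, 0
    for fold in range(k):
        train = [y for i, y in enumerate(samples) if i % k != fold]
        test = [y for i, y in enumerate(samples) if i % k == fold]
        density = fit(train)
        total += sum(math.log(density(y)) for y in test)
        count += len(test)
    return total / count

def fit_lognormal(train):
    """Maximum-likelihood single log-normal (illustrative baseline)."""
    logs = [math.log(y) for y in train]
    mu = sum(logs) / len(logs)
    var = sum((v - mu) ** 2 for v in logs) / len(logs)
    sigma = math.sqrt(var)
    return lambda y: math.exp(-0.5 * ((math.log(y) - mu) / sigma) ** 2) / (
        y * sigma * math.sqrt(2.0 * math.pi))

# Hypothetical relative travel times for one hourly dataset.
samples = [0.6, 0.8, 0.9, 1.0, 1.1, 1.2, 1.3, 1.5, 1.7, 2.0,
           0.7, 0.85, 0.95, 1.05, 1.15, 1.25, 1.4, 1.6, 1.9, 2.6]
score = kfold_avg_test_loglik(samples, fit_lognormal, k=10)
```

A higher average test-set log-likelihood indicates a better predictive density, and the same score can be used to pick hyperparameters such as β.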
Time-dependent size of the data
Table: The numbers, N, of travel-time samples, and the numbers,
|E+|, of links that have at least one sample for each time slot.
hour      N          |E+|
0:00-     273,168    69,126
1:00-     185,567    53,018
2:00-     109,662    38,994
3:00-      49,821    25,620
4:00-      22,501    15,484
5:00-      24,433    16,189
6:00-      23,868    16,579
7:00-      62,753    30,025
8:00-     149,906    47,400
9:00-     154,597    47,067
10:00-    131,383    42,445
11:00-    111,664    37,080
12:00-    129,148    41,569
13:00-    133,987    40,083
14:00-    128,288    37,594
15:00-    130,971    36,980
16:00-    134,056    37,794
17:00-    174,748    43,074
18:00-    196,978    45,676
19:00-    162,816    41,468
20:00-    149,438    42,592
21:00-    169,125    47,856
22:00-    169,956    49,328
23:00-    165,835    47,297
Experimental predictive performances
Nonparametric conditional density estimators (CDEs) outperform
the other methods on all of the datasets.
Figure: Average test-set log-likelihood for each hourly dataset,
based on the 10-fold likelihood cross-validations. The plot shows the
avg. test-set log-likelihood against the hour (index of the dataset)
for Euclid-kNN, Nadaraya-Watson, CDE(Gamma), CDE(LogNormal),
CDE(MixGamma), and CDE(MixLogNormal).
Links having complex distributions
$g_e(y)$: single exponential-family approximation of $f_e(y)$
based on moment matching
Cauchy-Schwarz (CS) divergence (Príncipe, 2010)
\[
CS(f, g \mid e) = -\log \frac{\int_y f_e(y)\, g_e(y)\, dy}{\sqrt{\int_y f_e^2(y)\, dy \int_y g_e^2(y)\, dy}}.
\]
Figure: Links having top-1% highest CS divergence scores
(panels: 0:00-, 12:00-, 18:00-).
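As an illustration of the score, the following Python sketch evaluates the CS divergence numerically with the midpoint rule. The bimodal density f, the single log-normal g (chosen so that its log-scale mean and standard deviation approximately match those of f), the integration range, and the function names are all hypothetical.

```python
import math

def cs_divergence(f, g, lower, upper, steps=20000):
    """Cauchy-Schwarz divergence between densities f and g (midpoint rule)."""
    width = (upper - lower) / steps
    fg = ff = gg = 0.0
    for k in range(steps):
        y = lower + (k + 0.5) * width
        fy, gy = f(y), g(y)
        fg += fy * gy
        ff += fy * fy
        gg += gy * gy
    # The common factor `width` cancels in the ratio.
    return -math.log(fg / math.sqrt(ff * gg))

def lognormal_pdf(y, mu, sigma):
    """Log-normal density for y > 0."""
    z = (math.log(y) - mu) / sigma
    return math.exp(-0.5 * z * z) / (y * sigma * math.sqrt(2.0 * math.pi))

# Hypothetical bimodal link density and a unimodal approximation of it.
f = lambda y: 0.6 * lognormal_pdf(y, 0.0, 0.2) + 0.4 * lognormal_pdf(y, 1.2, 0.2)
g = lambda y: lognormal_pdf(y, 0.48, 0.62)

same = cs_divergence(f, f, 1e-6, 10.0)
diff = cs_divergence(f, g, 1e-6, 10.0)
```

The divergence is zero when the two densities coincide and positive otherwise, so links with large scores are those whose estimated density is poorly described by a single unimodal distribution.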
Conclusion and future directions
A novel nonparametric estimator of travel-time
distributions conditioned on the link of a road network.
A) Basis density functions by mixture of gamma or
log-normal distributions
B) Sparse diffusion kernel as link similarity
C) Optimizing link importance with KLIEP and SMO
Future directions
Interpolate p.d.f.s in the time domain as well as the
spatial domain
Incorporate correlation among links
Estimate each driver's preference for realistic simulation