Traffic Data Simulation

Lab 1: Traffic Data from Urban Inductive Loop Detectors and the
Macroscopic Fundamental Diagram
Henri Azélart, Youssef Kitane, Dirk Lauinger, Yann Martinson
November 12, 2018
Abstract
The micro-simulation of downtown Barcelona’s traffic exhibits the patterns expected from theoretical fundamental
diagrams. Two macroscopic fundamental diagrams are constructed: one based on volume-occupancy and
one on production-accumulation. The latter is more informative for perimeter control because it shows how much
traffic each of the network’s four regions could absorb or expel to operate closer to maximum production. The
macroscopic fundamental diagram can be estimated quite well from limited sensor data by selecting either
the longest links or a random subset of links. These results depend crucially on the choice of the scaling parameter.
The partitioning of the network into four regions does not lead to consistently more homogeneous regions in terms
of link occupancy variance: some regions have a higher and some a lower variance than the network as a whole.
Overall, the variance stays essentially the same.
Introduction
In this lab, we analyze loop detector data from a micro-simulator of the road traffic in Barcelona’s city center,
which is partitioned into four regions (Fig. 1). The data are similar in kind but not in quantity to real sensor
measurements. Indeed, the data span 1570 arcs with an average of 2.74 lanes connecting 870 nodes. On each directed
arc, we measure vehicle flow and average occupancy with a time resolution of 90 s over a horizon of two hours. That
is, we have a total of 80 measurements per link. Since information acquisition is costly in general, it is unrealistic
to expect that a traffic network can be fully observed in practice.
The overarching goal of this lab is to characterize the regional traffic flow data with Macroscopic Fundamental
Diagrams (MFDs), which can inform traffic control strategies. First, we characterize the network using all the
information at our disposal and analyze the difference between Volume vs. Occupancy and Production vs. Accumulation
MFDs. Second, we try to estimate regional MFDs using only a subset of the available data. Lastly, we examine
whether the partitioning into four regions leads to more homogeneous regions as compared to the whole network.

Figure 1: Barcelona’s city center is partitioned into four regions. The width of each link is proportional to its
number of lanes.
Contents
1 Congestion Snapshots
2 Volume vs. Occupancy
3 Mean speed vs. Density
4 Production vs. Accumulation
5 MFD Estimation
6 Random link sampling
7 Heterogeneity in the spatial distribution of density

1 Congestion Snapshots
The congestion in the network grows over time (Fig. 2). Region 4 seems to be the most congested area, followed by
regions 1, 3 and 2. Congestion is measured by the occupancy of each link.
Figure 2: Congestion snapshots at (a) 60 min, (b) 90 min, and (c) 120 min

2 Volume vs. Occupancy
A fundamental diagram (FD) characterizes the relationship between the traffic volume (the flow of vehicles per
hour) and the link occupancy for a given link. The macroscopic fundamental diagram (MFD) generalizes this
notion to a collection of interconnected links.
Figure 3: Scatter plot of volume vs occupancy for varying sets of links

Since the MFD is built by averaging volume and occupancy data over all links in a given set, it is not surprising
that the MFDs in Fig. 3 are less scattered than the FD for a single link or the MFD of just two links. The MFD
with two links does not cover the entire range of possible link occupancies: the two randomly selected links
happened to have only three out of 80 samples with an occupancy between 20% and 60%. All plots nevertheless
show the same trend: volume increases with occupancy up to a critical occupancy of about 15%,
after which volume decreases.
Partitioning the network should help to reduce heterogeneity in the data, especially the scatter around the
critical occupancy and in the over-congested part of the MFDs. The idea is to separate a heterogeneous network into
more homogeneous, compact regions. Region 2 is lower in volume and occupancy than the other regions; in fact, it
never reaches an over-congested regime. In contrast, region 4 has the highest critical occupancy and the highest
volume of all regions. Regions 1 and 4 are quite similar in terms of volume and occupancy, and they reach the highest
recorded occupancies. Based on the regional MFDs alone, there is no clear reason to distinguish between these
two regions. The price to pay for partitioning (i.e., clustering) our network is that the regional MFDs are more
scattered than the MFD of the whole network, particularly for regions 3 and 4.
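The averaging step described above can be sketched as follows. This is a minimal illustration, not the code used in the lab: it assumes an unweighted mean over links, and the function names `mfd_points` and `critical_occupancy` as well as the synthetic triangular data are hypothetical.

```python
import numpy as np

def mfd_points(volume, occupancy, link_ids):
    """Average volume and occupancy over a set of links, per time step.

    volume, occupancy: arrays of shape (n_links, n_steps).
    link_ids: indices of the links that form the set.
    Assumption: plain (unweighted) mean over the selected links.
    """
    ids = np.asarray(link_ids)
    return volume[ids].mean(axis=0), occupancy[ids].mean(axis=0)

def critical_occupancy(mean_volume, mean_occupancy):
    """Occupancy at which the averaged volume peaks (a crude estimate)."""
    return mean_occupancy[np.argmax(mean_volume)]

# Synthetic stand-in data, NOT the Barcelona simulation output.
rng = np.random.default_rng(0)
n_links, n_steps = 200, 80
trend = np.linspace(1.0, 60.0, n_steps)  # occupancy in %
occupancy = np.clip(trend + rng.normal(0.0, 5.0, (n_links, n_steps)), 0.0, 100.0)
# Hypothetical triangular FD peaking at 15 % occupancy, plus noise.
free_flow = 40.0 * occupancy
congested = 600.0 * (100.0 - occupancy) / 85.0
volume = np.minimum(free_flow, congested) + rng.normal(0.0, 50.0, occupancy.shape)
volume = np.clip(volume, 0.0, None)

v_one, o_one = mfd_points(volume, occupancy, [0])              # single-link FD
v_all, o_all = mfd_points(volume, occupancy, range(n_links))   # MFD of all links
```

Plotting `v_one` against `o_one` and `v_all` against `o_all` should show the effect discussed above: averaging over many links damps the link-level noise, so the aggregated curve is typically much smoother than the single-link one.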
3 Mean speed vs. Density
Space-mean speed looks roughly inversely proportional to density (Fig. 4). This is plausible: the average
speed decreases as the number of vehicles in the network increases.
The mean speeds range from a few km/h to 60 km/h. This seems reasonable for an urban environment in which
most streets are limited to 50 km/h and the more important roads to 90 km/h. In a very congested setting, which
can occur often in this kind of city, the speed in a traffic jam can be very low.
Indeed, the time series shows that the network becomes more and more congested, with a marked drop in speed
around 20 minutes into the simulation. At the end of the simulation the network is
overall very congested: regions 1, 3, and 4 are congested at roughly the same speed of 5 km/h, while region 2
appears less congested with an average speed of approximately 15 km/h.
Figure 4: Space-mean speed vs Average density (Left) and Time (Right)
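One common way to obtain a network-wide space-mean speed is to divide the total distance travelled per hour by the number of vehicles present. The sketch below assumes this definition and assumes that link flows (veh/h), link densities (veh/km), and link lengths (km) are available; the function name `space_mean_speed` is hypothetical and the report does not state how its speeds were actually computed.

```python
import numpy as np

def space_mean_speed(flow, density, length):
    """Network space-mean speed in km/h for each time step.

    flow:    (n_links, n_steps) link flows in veh/h.
    density: (n_links, n_steps) link densities in veh/km.
    length:  (n_links,) link lengths in km.
    """
    flow = np.asarray(flow, dtype=float)
    density = np.asarray(density, dtype=float)
    length = np.asarray(length, dtype=float)
    production = (flow * length[:, None]).sum(axis=0)       # veh km/h
    accumulation = (density * length[:, None]).sum(axis=0)  # veh
    # Return NaN instead of dividing by zero when the network is empty.
    return np.divide(production, accumulation,
                     out=np.full_like(production, np.nan),
                     where=accumulation > 0)

# Toy example with two links and one time step (made-up numbers).
speed = space_mean_speed(flow=[[600.0], [300.0]],
                         density=[[20.0], [30.0]],
                         length=[1.0, 2.0])
```

In the toy example the production is 600·1 + 300·2 = 1200 veh km/h and the accumulation is 20·1 + 30·2 = 80 veh, which gives a speed of 15 km/h.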
4 Production vs. Accumulation
The regional MFDs of production vs. accumulation (Fig. 5) are very similar in shape and in the ordering of regions
to the volume vs. occupancy MFDs of Section 2. Unlike the latter, however, the production vs. accumulation MFDs
are not normalized by the number of vehicles per region.

Figure 5: Scatter plot of production vs accumulation for varying sets of links
It is thus easier to see the importance of each region in terms of vehicle capacity. Region 4 is clearly the most
used, followed by region 1; regions 2 and 3 are much less important than the other two. The production vs.
accumulation MFDs are useful for controlling inter-regional flows because they show the absolute rather than the
relative capacity of each region to absorb additional vehicles.
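To make the notion of absolute capacity concrete, the following sketch estimates how many vehicles a region could still absorb (positive value) or would need to shed (negative value) to sit at its observed production peak. It is only an illustration under simplifying assumptions: the critical accumulation is taken as the accumulation at the highest observed production, the names `critical_accumulation` and `headroom` are hypothetical, and the numbers are made up rather than taken from the lab.

```python
import numpy as np

def critical_accumulation(accumulation, production):
    """Accumulation (veh) at which the observed production peaks."""
    return float(np.asarray(accumulation)[np.argmax(production)])

def headroom(regions, current):
    """Vehicles each region could absorb (+) or should shed (-).

    regions: dict name -> (accumulation series, production series).
    current: dict name -> current accumulation in veh.
    """
    return {name: critical_accumulation(n, p) - current[name]
            for name, (n, p) in regions.items()}

# Made-up numbers for two hypothetical regions, not results from the lab.
regions = {
    "A": ([100, 200, 300, 400], [1000, 1800, 2000, 1500]),
    "B": ([50, 100, 150], [400, 700, 900]),
}
current = {"A": 400, "B": 100}
result = headroom(regions, current)
```

Here region "A" is past its peak and would have to shed 100 vehicles, while region "B" could absorb 50 more. For a region that never reaches congestion, such as "B", the peak is simply the last observation, so the value is only a lower bound on its true headroom.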

5 MFD Estimation
This section approximates production-accumulation MFDs with polynomials. It is divided into two parts. First, we
estimate the MFD from all available data. Second, we estimate the MFD using information from only
half the links in the network. The links kept for the estimation are those with (i) the highest number of lanes,
(ii) the longest link length, or (iii) the maximum average flow, or (iv) a random selection of links. The first three
MFDs based on incomplete link information are scaled by the ratio of the total link length of the region to the total
link length of the sample. For the MFD based on random link selection, the scaling factor is simply 2. Since the
regional MFD in Fig. 5 is neither a linear nor a symmetric function of accumulation, we estimate it with the
lowest-order polynomial that can represent such functions: a third-order polynomial. The asymmetric part (beyond the
critical accumulation) is not very pronounced for regions 2 and 3, which may lead to ill-conditioned systems
when estimating third-order polynomials.
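A third-order fit of this kind can be sketched with NumPy's `polyfit`. This is a minimal example under stated assumptions, not the lab's implementation: the fit is unconstrained (the intercept is not forced to zero), accumulation is rescaled to [0, 1] before fitting to ease the conditioning issue mentioned above, and the names `fit_cubic_mfd` and `design_condition_number` as well as the test curve are hypothetical.

```python
import numpy as np

def fit_cubic_mfd(accumulation, production):
    """Least-squares cubic fit of production (veh km/h) vs accumulation (veh).

    Returns a prediction function and the coefficients (highest power first)
    expressed in the rescaled variable accumulation / max(accumulation).
    """
    accumulation = np.asarray(accumulation, dtype=float)
    scale = accumulation.max()
    coeffs = np.polyfit(accumulation / scale, production, deg=3)

    def predict(n):
        return np.polyval(coeffs, np.asarray(n, dtype=float) / scale)

    return predict, coeffs

def design_condition_number(x):
    """Condition number of the cubic Vandermonde matrix built from x."""
    return np.linalg.cond(np.vander(np.asarray(x, dtype=float), 4))

# Synthetic asymmetric MFD (made-up coefficients) to exercise the fit.
n = np.linspace(0.0, 1000.0, 50)
p = 8.0 * n - 0.01 * n**2 + 2e-6 * n**3
predict, coeffs = fit_cubic_mfd(n, p)
```

Comparing `design_condition_number(n)` with `design_condition_number(n / n.max())` shows why rescaling helps: raw accumulations raised to the third power span many orders of magnitude, which inflates the condition number of the least-squares system.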
The goal is to find a method that tells us on which links we should install sensors to infer the state of the
entire network. Method (iii), which relies on flow measurements, is questionable because these measurements can
only be obtained after the sensors have already been installed. The approach could make sense in settings where
temporary sensors are used to decide on the best locations for permanent sensors.
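The selection rules and scaling factors described in this section can be sketched as below. The helper names `pick_links`, `length_ratio`, and `scaled_production` are hypothetical, and `length` is assumed to hold whatever link length measure the scaling uses (for example, length times number of lanes); the toy data are made up.

```python
import numpy as np

def pick_links(n_links, score=None, rng=None):
    """Indices of half the links: the top-scoring half, or a random half.

    score: per-link criterion (lanes, length, or mean flow); None means random.
    """
    n_keep = n_links // 2
    if score is None:
        rng = np.random.default_rng() if rng is None else rng
        return rng.choice(n_links, size=n_keep, replace=False)
    return np.argsort(score)[::-1][:n_keep]

def length_ratio(length, keep):
    """Scaling factor: total link length divided by sampled link length."""
    return length.sum() / length[keep].sum()

def scaled_production(flow, length, keep, factor):
    """Production of the sampled links (veh km/h per step) times a factor."""
    return factor * (flow[keep] * length[keep][:, None]).sum(axis=0)

# Toy network: 4 links, 1 time step, identical flows (made-up numbers).
length = np.array([1.0, 2.0, 3.0, 4.0])   # km
flow = np.full((4, 1), 100.0)             # veh/h
keep = pick_links(4, score=length)        # rule (ii): longest links
estimate = scaled_production(flow, length, keep, length_ratio(length, keep))
```

In this toy case the two longest links carry 700 of the 1000 veh km/h, and the length ratio 10/7 scales the estimate back to the true total. With heterogeneous flows the match is no longer exact, which is one way to see why the choice of scaling factor matters so much.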
Figure 6: MFD Estimation Errors
Figure 6 shows that the approaches based on the longest links and on random sampling yield the lowest
estimation errors. They are almost as good as the estimation based on the full data set. Note that the
performance of the estimation depends critically on the scaling factor used. All methods estimate MFDs with the
correct general shape (Fig. 7).

Figure 7: MFD Estimation. Dashed lines represent the MFDs estimated from half the links. Solid lines are the MFDs
estimated from all links. Filled circles represent the scaled data used for MFD estimation. Empty circles are the
original data.

6 Random link sampling
The random selection of links beat all other approaches in Fig. 6. Indeed, it is the only unbiased way to select
links. Figure 8 shows the performance of this approach over 100 random draws of links for the MFD estimation. There
is a 60% chance of obtaining an estimate of the production in each network with a mean error below 2000 veh km/h.
Figure 8: Performance of the random link selection
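A repeated-draw experiment of this kind can be sketched as follows. The error measure here is an assumption: the mean absolute difference, over time steps, between the production estimated from a random half of the links (scaled by 2) and the production of all links. The report does not spell out its exact error definition, and the names `random_sampling_errors` and `share_below` are hypothetical.

```python
import numpy as np

def random_sampling_errors(flow, length, n_draws=100, seed=0):
    """Mean absolute production error (veh km/h) for each random draw.

    flow:   (n_links, n_steps) link flows in veh/h.
    length: (n_links,) link lengths in km.
    """
    rng = np.random.default_rng(seed)
    n_links = flow.shape[0]
    full = (flow * length[:, None]).sum(axis=0)
    errors = np.empty(n_draws)
    for d in range(n_draws):
        keep = rng.choice(n_links, size=n_links // 2, replace=False)
        # Scaling factor 2 because half of the links are sampled.
        estimate = 2.0 * (flow[keep] * length[keep][:, None]).sum(axis=0)
        errors[d] = np.abs(estimate - full).mean()
    return errors

def share_below(errors, threshold):
    """Empirical probability that the error is below the threshold."""
    return float((np.asarray(errors) < threshold).mean())

# Sanity check on a homogeneous toy network: every half is representative.
flow = np.full((10, 5), 200.0)
length = np.full(10, 0.5)
errors = random_sampling_errors(flow, length)
```

Applied to real link data, `share_below(errors, 2000.0)` would give the empirical counterpart of the 60% figure quoted above.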
7 Heterogeneity in the spatial distribution of density
Figure 9 shows the spatial distribution of occupancy for the four clusters and the whole network at times 60 and
90 min. Some regions have a higher and some a lower dispersion of occupancy than the whole network. In particular,
the dispersion in region 2 is much lower than in the whole network (Fig. 10). We used the sample standard deviation
as a measure of dispersion. To decide whether clustering helped to reduce variation, we compare the total variation
of occupancy after clustering with the variance before clustering.
The total variation TV is the sum over regions k of the variance of the link occupancy X_k, weighted by the
region's share L_k / L of the total link length. The total link length L_k of a region is the sum over its links of
link length times number of lanes (Table 1). The total link length of the entire network is denoted by L.
$$TV = \sum_{k=1}^{4} \frac{L_k}{L} \, \mathrm{Var}[X_k]$$
In theory, clustering can yield a total variation smaller than the variance of the entire network because every
region can have a different mean around which the deviations are calculated. At 60 and 90 minutes, the ratio of
total variation to the variance before clustering is 98% and 97%, respectively. This means that clustering has
reduced the spatial heterogeneity in density only very slightly.
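The total variation and its ratio to the network-wide variance can be computed along the following lines. This is a sketch under assumptions: variances are sample variances (`ddof=1`), consistent with the sample standard deviation mentioned above, the weights are lane-weighted link lengths, and the function name `total_variation` and the toy data are hypothetical.

```python
import numpy as np

def total_variation(occupancy, region, lane_length):
    """Length-weighted sum of per-region occupancy variances at one time step.

    occupancy:   (n_links,) link occupancies.
    region:      (n_links,) region label of each link.
    lane_length: (n_links,) link length times number of lanes.
    Note: a region with a single link has an undefined sample variance (NaN).
    """
    total = lane_length.sum()
    tv = 0.0
    for k in np.unique(region):
        mask = region == k
        weight = lane_length[mask].sum() / total          # L_k / L
        tv += weight * np.var(occupancy[mask], ddof=1)    # Var[X_k]
    return tv

# Toy example with two well-separated regions (made-up numbers).
occupancy = np.array([10.0, 20.0, 30.0, 40.0])
region = np.array([1, 1, 2, 2])
lane_length = np.ones(4)
tv = total_variation(occupancy, region, lane_length)
ratio = tv / np.var(occupancy, ddof=1)
```

In the toy example each region has variance 50, so TV = 50, while the whole network has variance 500/3; the ratio of 0.3 reflects a partition that separates low and high occupancies well. A ratio close to 1, as reported above, indicates that the regional means barely differ from the network mean.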

Figure 9: Histogram of spatial occupancy by region

Figure 10: Dispersion of spatial occupancy by region
Table 1: Total link length by region
Region    Total Link Length (km)
1         118.71
2          47.34
3          50.82
4         151.67
All       368.55
Conclusion
With respect to the goals set out in the introduction, we confirm that the traffic data from a micro-simulation
of Barcelona’s city center exhibit the patterns we expect: the volume-occupancy and production-accumulation
macroscopic fundamental diagrams (MFDs) are composed of a free-flow and a congested phase, the critical density
is different for each region on the production-accumulation MFD, and the space-mean speed decreases with
density. When investigating inter-regional flows, for example for perimeter control, the production-accumulation MFD
is useful because it measures the total traffic volume that each region is able to absorb or would need to expel to
operate closer to maximum production.
Concerning the estimation of the regional MFDs based on incomplete information, we find that the scaling
parameter strongly impacts the estimation quality. With the scalings suggested in the lab, we find that the approaches
based on (i) selecting the longest links and (ii) selecting links at random perform best. Surprisingly, the random
approach outperforms all other approaches in terms of mean estimation error. A probabilistic analysis showed that
there is a 60% chance of obtaining an average estimation error of less than 2000 veh km/h.
Lastly and surprisingly, we find that partitioning the network into four regions does not yield regions that are
consistently more homogeneous than the whole network in terms of the standard deviation of link occupancy. Some
regions are in fact more homogeneous, but others are more heterogeneous. Overall, the heterogeneity stays
essentially the same when measured with total variation.
For future work, it would be interesting to check whether flow balance holds in this network. There are a couple of
reasons why it might not, such as parking and departing vehicles. At the moment, we assume that vehicle
inflow and outflow on each link are the same, which may not be the case if parking is considered.