SlideShare a Scribd company logo
USING DECISION TREES IN ORDER TO DETERMINE INTERSECTION DESIGN
RULES
Erwin M. Bezembinder *
Windesheim University of Applied Sciences
Department of Technology
Area Development Research Group
P.O. Box 10090, 8000 GB, Zwolle, The Netherlands
Phone: +31-88-4698436
E-mail: e.bezembinder@windesheim.nl
Luc J.J. Wismans
University of Twente
Faculty of Engineering Technology
Centre for Transport Studies
P.O. Box 217, 7500 AE, Enschede, The Netherlands
Phone: +31-570-666840
E-mail: lwismans@dat.nl
Eric. C. van Berkum
University of Twente
Faculty of Engineering Technology
Centre for Transport Studies
P.O. Box 217, 7500 AE, Enschede, The Netherlands
Phone: +31-53-4894886
E-mail: e.c.vanberkum@utwente.nl
* = Corresponding author.
Submitted: August 1, 2014.
Paper prepared for the 94th
Annual Meeting of the Transportation Research Board, 2015.
Word Count:
Text, excl. references: 4,972
Tables: 5 x 250 = 1,250
Figures: 3 x 250 = 750
Total: 6,972
TRB 2015 Annual Meeting Original paper submittal - not revised by author.
Bezembinder, Wismans and Van Berkum 2
ABSTRACT
1
Road planners frequently face the challenge to determine which intersection design provides the
2
best traffic flow for a particular traffic demand. Many road design manuals provide guidelines
3
for the design and evaluation of different intersection alternatives, however mostly refer to
4
specialized software in which the performances of different design alternatives can be modelled.
5
In a planning stage of the design process, such assessments are undesirable due to time and cost.
6
There is a need for quick design rules which need limited input data. Although some of these
7
rules exist, their usability is limited. In this paper we examine the possibilities to determine
8
intersection design rules by Decision Tree (DT) methods which are trained with data generated
9
by HCM 2010 intersection modelling. The models consider 24 intersection designs varying the
10
main type (all-way stop controlled, two-way stop controlled, signalized and roundabout) and the
11
number and configuration of the entering and exiting lanes. Traffic demand patterns are
12
randomly generated for various sizes of the dataset (5,000 – 5,000,000 cases) represented by 38
13
(independent) demand variables. Different DT methods (CHAID, CRT and QUEST), options
14
(splitting criteria, tree depth) and datasets are tested for their predictive accuracy. The DT models
15
provide accuracy rates between 76% and 96%. The CRT methods seem the most promising, and
16
a further analysis was made concerning the independent variable importance and the possibilities
17
for reducing the trees complexity. An example is shown of a DT which provides straightforward
18
design rules and an predictive accuracy of 85.5%.
19
TRB 2015 Annual Meeting Original paper submittal - not revised by author.
Bezembinder, Wismans and Van Berkum 3
INTRODUCTION
1
Throughout the world, road planners frequently face the challenge to determine which
2
intersection design provides the best traffic flow for a particular location. This intersection
3
design encompasses the choice of the main type such as a signalized intersection or roundabout,
4
but also choices regarding the number and characteristics of the entering and exiting lanes and
5
the type of signal control. In order to determine the best intersection design, road planners
6
initially consult their road design manuals. These manuals provide guidelines for the assessments
7
of different intersection designs, e.g. see [1,2,3,4,5,6,7]. However, most guidelines refer to
8
specialized software in which the performances of different design alternatives can be modeled.
9
This is not workable for planning purposes. At that stage of the design process, rules of thumb,
10
assessment schemes or simple calculation methods are more suitable. Only a very limited
11
amount of these rules exist. In the Netherlands, various rules have been published by the CROW
12
[5,6,7] mostly regarding the choice for a particular roundabout design. In the USA, Stamatiadis
13
et.al. [8,9] did a literate study concerning the evaluation of design alternatives based on both
14
national and (41) state design manuals. They observed a lack of specific guidance at both
15
national and state levels. Similar observation can be made based on design manuals in Germany
16
and the UK [3,4]. The lack of rules can for the most part be explained by the lack of appropriate
17
field data to determine the rules. Alternatively, data generated by intersection models can be
18
used. Using a model provides the opportunity to create a complete dataset. Stamatiadis et.al.
19
[8,9] used CORSIM to analyze the intersection control delay and the critical volume to
20
determine thresholds for different designs. Han et.al. [10] used the HCM 2000 [11] to distinguish
21
between different intersection types based on the major and minor street traffic volumes in order
22
to test and improve HCM 2000 Exhibit 10-15. Vitins and Axhausen [12] also used the HCM
23
2000 to determine intersection design rules. Although these studies provide interesting
24
approaches, they examined a limited number of intersection designs and demand volumes and
25
moreover they conducted a manual analysis of the generated data.
26
In this paper we present an approach in which we use the HCM 2010 [13] in order to
27
generate various datasets which can be used to determine intersection design rules bases on
28
intersection control delay. The datasets will be used as input for a Decision Tree (DT) method.
29
DTs are a classification technique used to categorize samples/instances based on the features of
30
the samples. Although it has not been applied to determine intersection design rules, it has
31
recently proven to be a powerful method which can handle both numerical and categorical data,
32
requires little data preparations, uses a white box model, is easy to understand and interpret,
33
performs well with large datasets, is robust and offers possibilities to validate the model using
34
statistical techniques [14]. Most important, rules can be derived from the resulting DTs.
35
The aim of this research is to determine whether DTs can be used to determine
36
intersection design rules based on demand variables and control delays generated by HCM 2010
37
with sufficient accuracy. Secondly, various DT methods are available and there is no ‘rule’
38
which states which methods is most suitable for this specific situation. Therefore, three main DT
39
methods (CHAID, CRT and QUEST), which based on their characteristics are suitable for the
40
job and various options will be tested for accuracy in order to determine the best DT method.
41
Furthermore, it is tested whether manageable DTs and rules can be generated so they can be used
42
in design manuals.
43
TRB 2015 Annual Meeting Original paper submittal - not revised by author.
Bezembinder, Wismans and Van Berkum 4
Paper outline
1
In the next section the generation of the data sets will be described. Subsequently, the general
2
principle of DTs, the DT methods and the evaluation of the methods in this study will be
3
discussed. Next, the results will be explained followed by the conclusions and considerations for
4
further research.
5
DATA SETS
6
The DT model needs a dataset in which each case (or record or row) contains values for multiple
7
independent (predictor) variables and the corresponding dependent (target) variable. In this
8
research, the independent variables represent the demand flow rates on an intersection, whereas
9
the dependent variable is the preferred intersection design.
10
The dataset will be generated using the HCM 2010 methodology [14]. The HCM 2010 is
11
used to determine the volume weighted average control delay for the intersection based on given
12
demand flow rates and an intersection design. Figure 1 shows an example of the input data for a
13
multilane roundabout. Using the HCM 2010 Roundabout Analysis Methodology with default
14
values, this will result in a control delay for the intersection of 16.6 s/veh.
15
16
17
Figure 1 Intersection design and demand volumes (pcu/h) for a multilane roundabout [13,
18
Exhibit 21-23].
19
20
In order to determine the preferred intersection design for these specific demand flow rates, the
21
control delay for multiple intersection design alternatives is calculated. The intersection design
22
with the lowest value for the control delay will be the preferred intersection design. Basically,
23
this produces one case (or record or row) in the resulting dataset. This process is repeated for a
24
multitude of different demand flow rates, thus generating a dataset for the training and testing of
25
the DT model. Different datasets will be generated in order to test the performance of the DT
26
methods related to the size of the dataset, the use of spatial constraints and the use of multiple
27
(closely related) preferred intersection designs for one set of demand flow rates. This will be
28
discussed in more detail in the subsequent sections, respectively explaining the intersection
29
model, the employed intersection design alternatives and demand flow rates and the selection of
30
the preferred intersection design.
31
TRB 2015 Annual Meeting Original paper submittal - not revised by author.
Bezembinder, Wismans and Van Berkum 5
Intersection model
1
As stated before, we use the HCM 2010 methodology to determine the control delay for each
2
intersection design and set of demand flow rates. The latter is expressed as person car units per
3
hour (pcu/h), thus incorporating freight traffic. The standard duration of the analysis period is 15
4
minutes. The HCM 2010 distinguishes four main types of (ground-level) intersections for non-
5
highways: All-way stop controlled (ASWC) intersections, two-way stop controlled (TWSC)
6
intersections, signalized intersections and roundabouts. For each of these types a different
7
methodology is suggested. For required attributes other than the intersection design attributes
8
and demand flow rates default values as suggested by the HCM 2010 will be used. For signalized
9
intersections, the additionally required signal control settings are determined by using the so-
10
called ‘Quick Estimation Method’ as described in the HCM 2010 [13, Chapter 31]. Although this
11
method does not guarantee an optimal signal timing plan (with a minimal control delay for the
12
intersection), it does provides signal timings and delays when minimal data are available, which
13
is case for this application.
14
Intersection design
15
Since HCM 2010 is used, the possible intersection designs and the available attributes are bound
16
to this model. However, the number of possible designs is still very large. Since the intersection
17
design will be the dependent variable for the DT models, it is wise to limit the number of
18
categories. This is also legitimate, since in practice only a limited number of reasonable
19
intersection designs can and would be applied. Therefore, the number of intersection designs for
20
a four-arm intersection is limited to 24 design types. Table 1 gives an overview of the types used
21
in this study. The first attribute is the main type in accordance with the HCM 2010 type and
22
methodology. Then a distinction is made between attributes for the major and minor road
23
approaches. For reasons of clarity, in this research, the major approaches are always the east- and
24
westbound approaches, whereas the minor approaches are the north- and southbound approaches.
25
For each approach, the configuration of the entry lanes is determined ranging from one shared
26
lane for all movements to two designated lanes per movements. The * used at roundabout lanes
27
is used to indicate a (nonyielding) right-turn bypass lane. Other attributes concern the width of
28
the central reservation (CR) in meters, the number of exiting lanes and the number of opposing
29
circulating lanes in case of a roundabout approach. The rightmost column in the table shows the
30
size category of the design. This attribute was introduced in order to prevent large (and
31
expensive) intersections always to be preferred design, regardless of the traffic volume on the
32
intersection. This will be discussed later on. The size categories are based on an estimation of the
33
required surface area for both the central part of the intersection as well as the approaches. Size
34
category 1 designs have a maximum area of 500 m2
, the other boundaries are 1000, 1500, 2000
35
and 2000+ m2
.
36
37
38
TRB 2015 Annual Meeting Original paper submittal - not revised by author.
Bezembinder, Wismans and Van Berkum 6
Design
type
Main type Major road approaches Minor road approaches Size
cat.
Entry CR Exit Circ. Entry CR Exit Circ.
AW1 AWSC 1:  0.0 1 - 1:  0.0 1 - 1
TW1 TWSC 1:  0.0 1 - 1:  0.0 1 - 1
TW2 1:  5.0 1 - 1:  0.0 1 - 2
TW3 2:  0.0 1 - 1:  0.0 1 - 1
TW4 2:  5.0 1 - 1:  0.0 1 - 2
TW5 2:  5.0 1 - 2:  0.0 1 - 2
SIG1 Signalized 1:  0.0 1 - 1:  0.0 1 - 1
SIG2 2:  0.0 1 - 1:  0.0 1 - 1
SIG3 2:  0.0 1 - 2:  0.0 1 - 2
SIG4 3:  0.0 1 - 2:  0.0 1 - 2
SIG5 3:  0.0 1 - 3:  0.0 1 - 2
SIG6 4:  0.0 2 - 2:  0.0 1 - 3
SIG7 4:  0.0 2 - 3:  0.0 1 - 3
SIG8 4:  0.0 2 - 4:  0.0 2 - 3
SIG9 6:  0.0 2 - 4:  0.0 2 - 4
SIG10 6:  0.0 2 - 6:  0.0 2 - 5
RA1 Roundabout 1:  0.0 1 1 1:  0.0 1 1 3
RA2 1:  0.0 1 2 1:  0.0 1 2 5
RA3 2:  0.0 1 1 1:  0.0 1 1 3
RA4 2:  0.0 2 1 1:  0.0 1 2 5
RA5 2: * 0.0 1 1 1:  0.0 1 1 4
RA6 2: * 0.0 1 1 2: * 0.0 1 1 4
RA7 2:  0.0 2 1 2:  0.0 1 2 5
RA8 3: * 0.0 2 2 2: * 0.0 2 2 5
Table 1 Intersection design types for four-arm intersections.
1
Demand flow rates
2
On a four-arm intersection there are twelve turning movements for car traffic, i.e. left, through
3
and right turning movements for each approach. In this research it is the idea to determine the
4
intersection performance for as much demand flow rates combinations as possible. Suppose that
5
for each turning movement a value of 0 to 1000 pcu/h should be tested. The total number of
6
demand combinations would then be 100012
. Although this number can be reduced because
7
various design are equal once you mirror them, it is still a substantial number of calculations to
8
be performed for each of the 24 intersection designs. Also a very fast model is not able to handle
9
these quantities. The number of combinations could be reduced by reducing the number of
10
values for each turning movement, for example by using only 10 categories. Although this
11
reduces the number of tested demand drastically, it still leaves 1012
combinations. Further
12
reduction could be reached by using proportions for each approach and turning movement per
13
approach. In preparation for this research several approaches for the definition of the traffic
14
demand have been tested. Ultimately, an approach in which a given number of random generated
15
traffic demands will be determined turned out to be most promising. In this approach a random
16
amount of total traffic on the intersection is determined from a range of 0 to 6000 pcu/h. Next,
17
the proportion of traffic on the major approaches will be randomly drawn from a range of 50-
18
100%. The proportion of traffic on the minor approach is then derived from this value. This
19
means that there is never more traffic on the minor road then on the major road. In the most
20
extreme setting, the proportions are 50-50%. Subsequently, the proportions of traffic on the west-
21
bound (major) and north-bound (minor) approach are determined from a range of 0-100%. The
22
values for the other approaches are derived. Then for each approach, the right turn proportion is
23
drawn from a range of 0-100%. The through turn proportion is drawn from the remaining range,
24
while the left turn proportion is equal to the remaining proportion.
25
TRB 2015 Annual Meeting Original paper submittal - not revised by author.
Bezembinder, Wismans and Van Berkum 7
Once all proportions are determined, they can be applied to the total volume in order to
1
determine the flow rates for each turning movement in the intersection. This is the input for the
2
intersection model.
3
Eventually, the demand flow rates are used as independent variables for the DT model.
4
However, as stated before, it is likely that composed variables, such as the total number of left
5
turning traffic from the minor road, are more explanatory for the model. One of the goals of
6
training and testing the DT models is to determine which independent variables are most
7
important for the model. Therefore, a total of 38 independent demand variables will be used in to
8
DT models. Table 2 shows an overview of these variables.
9
10
WB right volume SB right volume Major through & minor left volume Minor percentage
WB through volume SB through volume Major through & left volume Major left percentage
WB left volume SB left volume Minor through & left volume Minor left percentage
NB right volume Total volume Left volume Major through percentage
NB through volume Major volume Through volume Minor through percentage
NB left volume Minor volume WB volume Major through & left percentage
EB right volume Major through volume NB volume Minor through & left percentage
EB through volume Major left volume EB volume Left percentage
EB left volume Minor through volume SB volume Through percentage
Minor left volume Major percentage
Table 2 Independent demand variables.
11
12
Another issue concerns the size of the dataset that will be used to train the DT model. Since this
13
is still a point of research, different dataset sizes will be tested. Datasets with respectively 1,000,
14
10,000, 100,000 and 1,000,000 different demand configurations will be tested. So, for each of
15
these demand sets the performances of 24 different intersection design will be tested.
16
Preferred intersection design(s)
17
Basically, the preferred intersection design will be the intersection design with the lowest
18
intersection control delay. This is the volume weighted average control delay for the whole
19
intersection in s/veh. A preferred design is determined for each set of demand flow rates and size
20
category. The latter was introduced in order to prevent large (and expensive) intersections always
21
to be preferred design, regardless of the traffic volume on the intersection. With five size
22
categories, this will result in five cases (or records or rows) for each configuration of demand
23
flow rates.
24
DT METHODS AND EVALUATION
25
DTs
26
DTs classify instances by sorting them down the tree from the root to some leaf node, which
27
provides the classification of the instance. Each node in the tree specifies a test of some
28
(predictor) attribute, and each branch descending from that node to one of the possible
29
values for this attribute. An instance is classified starting at the root node of the tree,
30
testing the attribute specified by this node, then moving down the tree branch
31
corresponding to the value of the attribute in the given example. If the new node is not a
32
leaf node, but an internal node, the process is repeated for the subtree starting at the new
33
node.
34
TRB 2015 Annual Meeting Original paper submittal - not revised by author.
Bezembinder, Wismans and Van Berkum 8
Figure 2 shows an imaginary example of a simple DT which predicts whether one should apply a
1
signal control.
2
3
Total
volume
<=1250 pcu/h
Area
Rural
Yes
>1250 pcu/h
Speed
limit
Urban
No
30 km/h 50 km/h
No Yes
4
5
Figure 2 Example of a DT.
6
7
As can be seen, a DT can contain both nominal and numeric attributes. When the predicted
8
outcome (target) of the DT model is a real number, it is called a regression tree. When the
9
predicted outcome is the class to which the data belongs, it is called a classification tree. In this
10
study, we are interested in predicting the junction design type, which means we are only looking
11
at classification trees.
12
DT inducers/methods
13
DT inducers are algorithms that automatically construct a DT from a given dataset. Typically the
14
goal is to find the optimal DT by minimizing the generalization error. Induction of an optimal
15
DT from a given dataset is considered to be hard task [14]. As a result, heuristic methods are
16
required for solving the problem. Roughly speaking, these methods can be divided into two
17
groups: top-down and bottom-up with clear preference in the literature to the first group. Most
18
methods consist of two conceptual phases: growing (extending) and pruning (reducing).
19
A variety of top-down (univariate) algorithms is available. Rokach and Maimon [14] give
20
an extensive overview and state that the main algorithms are ID3, C4.5, CART, CHAID and
21
QUEST. The other algorithms are earlier versions, variations or are specifically designed for
22
numerical-valued predictor and/or target attributes, which is not applicable for predicting the
23
intersection design type. It is well known that no algorithm can be the best in all possible
24
domains [14]. For as far as we know, there are no publications concerning DT analysis for
25
intersection design type. The closest domain concerns the analysis of traffic accident severity
26
using DTs [15,16] for which the C4.5 algorithm is used. Most studies analyzing other domains
27
compare multiple algorithms, mostly CART, CHAID and QUEST e.g. see [17,18,19,20]. That
28
will also be done in this study.
29
CHAID stands for Chi-Squared Automatic Interaction Detection. The CHAID algorithm
30
was originally proposed by Kass [21]. The algorithm tests variables for independence using the
31
chi-square test, which determines whether splitting a node generates a statistically improvement
32
TRB 2015 Annual Meeting Original paper submittal - not revised by author.
Bezembinder, Wismans and Van Berkum 9
in purity. Either Pearson or likelihood-ratio chi-square is used for determining the node splitting
1
and class merging.
2
The resulting significance values are adjusted using the Bonferroni method. CHAID allows
3
multiway node splitting (as opposed to binary node splitting), which means that the instance
4
space can be split in more than two classes. CHAID does not perform pruning. Exhaustive
5
CHAID , proposed by Biggs et.al. [22], is a modification of CHAID that uses a more thorough
6
heuristic for finding, at each node, the optimal way of grouping the categories of each predictor.
7
CART (or CRT) stands for Classification and Regression Trees. It was developed by
8
Breiman et.al. [23] and is characterized by the fact that it constructs binary trees. The splits are
9
selected using the Gini or Twoing impurity measure criteria and the obtained tree is pruned by
10
cost complexity pruning.
11
QUEST was proposed by Lim and Shih [24] as a Quick, Unbiased, Efficient Statistical
12
Tree and also constructs binary trees. For nominal predictors, it uses the Pearson chi-squared test
13
for node splitting. QUEST has negligible bias and uses ten-fold cross-validation to prune the
14
trees.
15
Summarizing, the following DT algorithms and settings will be tested in this study:
16
1. CHAID, Pearson chi-square statistic;
17
2. CHAID, Likelihood Ratio chi-square statistic;
18
3. Exhaustive CHAID, Pearson chi-square statistic;
19
4. Exhaustive CHAID, Likelihood Ratio chi-square statistic;
20
5. CRT, Gini impurity measure;
21
6. CRT, Twoing impurity measure;
22
7. QUEST.
23
Evaluation of DT methods
24
In order to determine which DT method is the best, one could compare prediction accuracy,
25
complexity and training time [26]. The most common measure for accuracy is the risk estimate
26
(and its standard error). For categorical dependent variables, the risk estimate is the proportion of
27
cases incorrectly classified. A classification table gives more insight in the specific
28
misclassification cases. A validation method will be used by determining risk estimates for both
29
training and test data. For this the dataset will be split in a training and test set with different
30
proportional segmentations (50%-50%, 67%-33% and 75%-25%).
31
Complexity can be compared by using general statistics such as the tree depth, the
32
number of branches and nodes, the number of cases in leafs and the number of independent
33
variables included. Since we are initially interested in the accuracy of the models, instead of the
34
manageability of the resulting trees, the maximum tree depth is set to the (maximum value) of 20
35
for all methods. In a later stage this number will be reduced to create more manageable trees. For
36
the same reason, no tree pruning is applied at first. The minimum number of cases in the parent
37
and child nodes are set to 100 and 50. The CRT method provides a measure for the (normalized)
38
importance of each independent variable in the model which can be used to reduce the number of
39
input variables. Together with options for the maximum tree depth and pruning, the complexity
40
of the tree can be reduced. This will be done only for the method with the best overall accuracy,
41
before applying these options.
42
Training time is expected to be an issue only for large amounts of data, i.e. containing
43
1,000,000 cases. The differences between the methods are expected to be relatively small.
44
TRB 2015 Annual Meeting Original paper submittal - not revised by author.
Bezembinder, Wismans and Van Berkum 10
RESULTS
1
Accuracy
2
An overview of the prediction accuracy of the different DT algorithms for various sizes of the
3
training set is shown in Table 3. The table shows the overall classification rate, which is the
4
proportion of cases correctly classified. The dependent (target) variable used is the preferred
5
intersection design type as presented in Table 1. Only one design type for each combination of
6
demand variable configuration and size category is included in the dataset. The independent
7
(predictor) variables are all demand variables as mentioned in Table 2 and the size category
8
variable.
9
10
1 2 3 4 5 6 7
CHAID CHAID ECHAID ECHAID CRT CRT QUEST
Cases Std.err. Pearson Likelihood Pearson Likelihood Gini Twoing -
5,000 0.006 77.4% 76.9% 77.6% 77.3% 79.5% 79.5% 76.3%
50,000 0.002 82.8% 82.7% 82.9% 82.6% 84.7% 84.5% 81.0%
500,000 0.000 85.8% 85.9% 85.8% 85.9% 85.7% 87.3% 84.4%
5,000,000 0.000 87.9% 88.1% 87.8% 88.0% 85.2% 86.6% 85.9%
Table 3 Prediction accuracy for different algorithms and training set sizes.
11
12
The overall accuracy in Table 3 ranges from 76.3% to 88.1%. The standard error is, except for a
13
dataset with 5,000 cases, negligible small. Generally, the accuracy increases with an increase of
14
cases in the dataset, with the exception of both CRT methods (5,6) between 500,000 and
15
5,000,000 cases. Although the differences between the accuracy rates for the methods are small
16
(2-3% points), the QUEST method (7) generally has the lowest accuracy and the CRT methods
17
(5,6) have the highest accuracy. For the largest dataset the CHAID methods with a likelihood
18
ratio criterion (2,4) have more accuracy. For the largest dataset, calculation times become an
19
issue, since one method on average takes one hour to calculate.
20
Table 4 shows additional accuracy values for the tested methods, only for the dataset
21
containing 500,000 cases. The upper part of the table shows accuracy rates for the main junction
22
types being; all-way stop controlled (AW), two-way stop controlled (TW) and signalized (SIG)
23
intersections and roundabouts (RA). An intersection design is counted as correctly classified
24
when the main intersection type is correct. So if the observed intersection design type is SIG3
25
and the predicted type is SIG4 it is labeled as correctly classified. It can be seen that trees are
26
much better in predicting roundabouts correctly (range of 95.6%-96.3%) then all-way sign
27
controlled intersections (range of 32.6%-51.2%). Besides the fact that AWSC intersections are
28
more difficult to predict by using the given independent variables, the total number of cases in
29
which the AWSC intersection is the preferred intersection design is limited to 688 (0.1%). DT
30
models tend to underestimate classes with few elements and overestimate classes with a lot of
31
elements. Without using any of the independent variables a model which always predicts a
32
roundabout would have a 50.9% accuracy. This is an issue that should be checked when taking a
33
closer look at the tree. The CRT with Twoing (6) is always the method with the best accuracy.
34
Table 4 also shows the accuracy rates for a validation test which splits the dataset in a
35
training and test set. Only the result of a 75% training and 25% test set size are shown. The
36
results of a 50%-50% and a 67%-33% split show similar though a little less accurate results. The
37
shown values are average values of 10 different runs, since the split is randomly determined.
38
39
40
TRB 2015 Annual Meeting Original paper submittal - not revised by author.
Bezembinder, Wismans and Van Berkum 11
1 2 3 4 5 6 7
CHAID CHAID ECHAID ECHAID CRT CRT QUEST
Cases Pearson Likelih. Pearson Likelih. Gini Twoing -
All All 500,000 85.8% 85.9% 85.8% 85.9% 85.7% 87.3% 84.4%
Classified
by main
type
AW 688 36.9% 39.9% 36.6% 38.3% 45.1% 51.2% 32.6%
TW 81,659 86.2% 86.4% 86.1% 86.1% 85.0% 86.8% 81.4%
SIG 163,380 85.1% 85.1% 85.0% 85.4% 84.9% 87.2% 85.4%
RA 254,273 96.2% 96.1% 96.2% 96.0% 95.6% 96.3% 94.9%
All 500,000 90.7% 90.7% 90.6% 90.7% 90.2% 91.6% 89.3%
Validation Training 375,000 85.4% 85.5% 85.5% 85.5% 85.8% 87.3% 83.8%
Test 125,000 84.4% 84.4% 84.3% 84.5% 85.3% 86.3% 83.3%
Modelled
by size
category
1 100,000 89.6% 89.7% 89.7% 89.8% 91.9% 91.8% 89.6%
2 100,000 71.6% 71.5% 71.3% 71.2% 76.4% 76.9% 67.6%
3 100,000 88.5% 88.5% 88.5% 88.5% 89.9% 90.0% 86.6%
4 100,000 91.6% 91.7% 91.7% 91.7% 93.3% 93.4% 91.6%
5 100,000 87.7% 87.9% 87.8% 87.9% 89.1% 89.3% 86.6%
Table 4 Additional analysis of prediction accuracy for 500,000 dataset.
1
2
All the methods (fortunately) use the size category as the primary variable to split the
3
tree. This is rather logical since the size category is directly linked to the intersection design type
4
as shown in Table 1. The tree depth and model complexity can be reduced by creating separate
5
datasets for each size category and thus perform a DT analysis for each size category separately.
6
The lower part of Table 4 shows the results of these analyses, again performed with the 500,000
7
dataset. Each separate dataset now has a size of 100,000 cases. With the exception of size
8
category 2, the accuracy rates are higher than the ones for the undivided dataset.
9
Independent variables
10
The CRT algorithm provides the possibility to report the independent variable importance to the
11
tree model. Since the CRT Twoing method (6) in almost all cases gives the highest accuracy
12
rates, the results of this model will be used to analyze the mentioned variable importance. Table
13
5 shows the normalized importance of the ten topmost independent variables for the tree models
14
for each size category. The variable with the highest importance value is set to 100%. The model
15
for intersection designs of size category 1 can primarily by classified by the total demand volume
16
on the intersection, followed by the sum of the major left turning volumes and the sum of the
17
major through and left turning volumes. The latter is an important variable for all size categories.
18
Notably is that the total volume is far less important for larger size categories (3-5) and that the
19
table is dominated by absolute values in contrast with relative (percentage) ones.
20
Towards manageable DTs
21
Up till now, the maximum tree depth was set to 20, which produces rather large and complex
22
trees with up to 600 nodes and 16 levels. These cannot be used as a base for decision rules for
23
intersection design. The tree complexity can be reduced by reducing the maximum tree depth
24
and the number of independent variables and initiating tree pruning methods. The task is to
25
reduce the tree complexity without losing to much predictive accuracy.
26
Figure 3 shows a DT for intersection design of size category 3. In order to generate this
27
tree, again the CRT Twoing method is used. This time with a maximum tree depth of 3 levels,
28
only the ten most important independent variables according to Table 5, and pruning.
29
30
TRB 2015 Annual Meeting Original paper submittal - not revised by author.
Bezembinder, Wismans and Van Berkum 12
1 2 3 4 5
Variable % Variable % Variable % Variable % Variable %
1 Total
volume
100.0% Major
through &
left volume
100.0% Through
volume
100.0% Through
volume
100.0% Major
through &
left volume
100.0%
2 Major left
volume
92.5% Through
volume
82.2% Major
through &
left volume
97.1% Major
through &
left volume
95.8% Through
volume
63.4%
3 Major
through &
left volume
92.0% Total
volume
81.0% Major
through &
minor left
volume
79.8% Major
through
volume
73.7% Major
through &
left
percentage
61.2%
4 Left
volume
88.5% Major left
volume
79.9% Major
through
volume
77.4% Major
through &
minor left
volume
73.6% Major
through &
minor left
volume
61.0%
5 Minor
volume
80.0% Major
volume
77.0% EB through
volume
56.0% EB left
volume
71.4% Major
through
volume
59.0%
6 Minor
through &
left volume
72.1% Left
volume
73.4% Through
percentage
51.8% Minor
through &
left
percentage
70.3% EB left
volume
47.8%
7 Major
volume
69.8% Major
through &
minor left
volume
72.5% EB left
volume
44.0% EB through
volume
54.7% Left volume 46.7%
8 Through
volume
61.9% Major
through
volume
65.9% WB
through
volume
29.1% Major left
volume
47.3% Major left
volume
46.5%
9 Minor
through
volume
54.1% Minor
through
volume
51.8% Minor
through
volume
28.1% Minor
volume
41.8% Through
percentage
39.8%
10 Major
through &
minor left
volume
52.0% Minor
through &
left volume
49.9% Major left
volume
28.0% Through
percentage
41.4% Major
volume
34.1%
Table 5 Independent variable importance.
1
2
Size category 3 comprises the intersection design types SIG6, SIG7, SIG, RA1 and RA3. If the
3
total amount of through traffic on the intersection is less than or equal to 1434.5 pcu/h then the
4
RA3 is the advised design type. When the amount of through traffic is greater than the
5
mentioned value, a further split is made for through traffic at 2028.5 pcu/h. Below this limiting
6
value, the advice is still RA3, but with far less accuracy. With more than 2028.5 pcu/h through
7
traffic on the intersection the advice is either SIG7 or SIG6 dependent upon the amount of
8
through traffic on the minor road. The overall accuracy of this drastically compressed tree model,
9
is still 85.5% (compared to 90.0% of the uncompressed model).
10
TRB 2015 Annual Meeting Original paper submittal - not revised by author.
Bezembinder, Wismans and Van Berkum 13
1
Figure 3 DT for intersection designs for size category 3.
2
CONCLUSIONS AND FURTHER RESEARCH
3
In this study we explored the possibilities of using a DT model in order to determine the
4
intersection design type solely based on traffic demand variables. Datasets were generated using
5
HCM 2010 intersection models for all-way sign controlled, two-way sign controlled and
6
signalized intersections and roundabouts, using the control delay of the intersection to determine
7
the preferred intersection design. We examined the predictive accuracy of various different DT
8
methods for several sizes of datasets and observed a satisfying accuracy rate in the range of
9
76.3% to 88.1%. The QUEST method gave the lowest accuracies while the CRT based methods
10
gave the highest values. The accuracy rates can be improved (up to 93.4%) by estimating
11
separate tree models for intersection designs for different size categories.
12
TRB 2015 Annual Meeting Original paper submittal - not revised by author.
Bezembinder, Wismans and Van Berkum 14
The importance of independent demand variables differs over the size categories, although the
1
topmost ten to twenty variables are very similar and are predominantly absolute variables.
2
Calculation times stay within reasonable limits (minutes for 500,000 cases and hours for
3
5,000,000 cases). The accuracy can reasonably be maintained while reducing the tree complexity
4
in order to make it usable for intersection design rules. We showed an example with a tree depth
5
of three levels and seven nodes with an accuracy of 85.5%. Based on our research we can
6
conclude that it is possible to determine accurate intersection design rules by using DTs.
7
An issue that has not been dealt with in this paper concerns the fact that the control delay
8
of various intersection designs may be the same or very close. Then, it might be better to include
9
multiple intersection designs in de dataset for the DT. Preliminary tests with a 10% bandwidth,
10
generating a dataset with 648,066 cases showed that the accuracy rates were reduced to nearly
11
68%. The results at first seem disappointing, since the accuracy rates are quite low. However,
12
although the model should be able to produce a better fit with the provided data, the accuracy
13
rate is reduced due to the fact that when three preferred intersection designs are provided in the
14
dataset for one combination of demand variables and size category, only one will correctly be
15
predicted. The other two are incorrectly classified, this reducing the predicting accuracy. Another
16
measure has to be used to determine its worth, which is an issue for further research.
17
Other topics for further research concern the addition of the C4.5 and/or C5.0 algorithm
18
[27,28], comparing the performance of the resulting intersection design rules based on the DT
19
with existing rules, creating DTs for traffic safety and environmental impact related objectives
20
and examining and incorporating network effects.
21
ACKNOWLEDGMENT
22
This research is part of an ongoing PhD-research concerning the optimization of intersection
23
design in urban traffic networks which is funded by the Netherlands Organisation of Scientific
24
Research.
25
REFERENCES
26
1. AASHTO (2011) A Policy on Geometric Design of Highways and Streets, 6th
Edition,
27
American Association of State Highway and Transportation Officials, Washington D.C.
28
2. Transportation Research Board (2010) Roundabouts: An Information Guide, Second Edition,
29
NCHRP Report 672, Transportation Research Board, Washington D.C.
30
3. FGSV (2007) Richtlinien für die Anlage von Stadtstraßen: RASt 06 (in German), 200,
31
Forschungsgesellschaft für Straßen- und Verkehrswesen, Köln.
32
4. Highways Agency (2012) Design Manual for Roads and Bridges (DMRB), Online version
33
June 2012, Highways Agency, London.
34
5. CROW (2002) Handboek Wegontwerp (in Dutch), Publications 164A-D, CROW, Ede.
35
6. CROW (2012) ASVV 2012 – Aanbevelingen voor verkeersvoorzieningen binnen de
36
bebouwde kom (in Dutch), Publication 723, CROW, Ede.
37
7. CROW (2008) Turborotondes (in Dutch), Publication 257, CROW, Ede.
38
8. Stamatiadis, N., Kirk, A., Agarwal, N. and Jones, C. (2012) Improving Intersection Design
39
Practices – Final Report, Research Report KTC-12-4 / SPR-09-380-1F, Kentucky
40
Transportation Center, University of Kentucky, Lexington.
41
9. Kirk, A., Jones, C. and Stamatiadis, N. (2011) Improving Intersection Design Practices,
42
Transportation Research Record, Volume 2223, 1-8.
43
TRB 2015 Annual Meeting Original paper submittal - not revised by author.
Bezembinder, Wismans and Van Berkum 15
10. Han, L., Li, J-M. and Urbanik, T. (2008) Control-Type Selection at Isolated Intersections
1
Based on Control Delay Under Various Demand Levels, Transportation Research Record,
2
Volume 2071, 109-116.
3
11. Transportation Research Board (2000) Highway Capacity Manual 2000, Transportation
4
Research Board, Washington D.C.
5
12. Vitens, B.J. and K.W. Axhausen (2012) Shape Grammars for Intersection Type Choice in
6
Road Network Generation, paper presented at the 12th
Swiss Transport Research Conference,
7
Ascona, May 2012.
8
13. Transportation Research Board (2010) Highway Capacity Manual 2010, Transportation
9
Research Board, Washington D.C.
10
14. Rokach, L. and Maimon, O. (2008) Data Mining with Decision Trees, Theory and
11
Applications, Series in Machine Perception and Artificial Intelligence – Vol. 69, World
12
Scientific Publishing, Singapore
13
15. Abellán, J., López, G. and Oña, J. de (2013) Analysis of traffic accident severity using
14
Decision Tree Rules via Decision Trees, Journal of Expert Systems with Applications,
15
Volume 40, Issue 15, 6047-6054.
16
16. Oña, J. de, López, G. and Abellán, J. (2013) Extracting decision rules from policy accident
17
reports through decision trees, Journal of Accident Analysis and Prevention, Volume 50,
18
1151-1160.
19
17. Lee, S. and Park, I. (2013) Application of decision tree model for the ground subsidence
20
hazard mapping near abandoned underground coal mines, Journal of Environmental
21
Management 127, 166-176.
22
18. Huang, C.S., Lin, Y.J. and Lin, C.C. (2008) Implementation of Classifiers for Choosing
23
Insurance Policy Using Decision Trees: A Case Study, WSEAS Transactions on Computers,
24
Issue 10, Volume 7, 1679-1689.
25
19. Ture, M., Kurt, I., Kurum, A.T. and Ozdamar, K. (2005) Comparing classification techniques
26
for predicting essential hypertension, Export Systems with Applications 29, 583-588.
27
20. Pal, M. and Mather, P.M. (2003) An assessment of the effectiveness of decision tree methods
28
for land cover classification, Remote Sensing of Environment 86, 554-565.
29
21. Breiman, L., Friedman, J., Olshen, R. and Stone, C. (1984) Classification and Regression
30
Trees, Wadsworth Int. Group.
31
22. Kass, G.V. (1980) An Exploratory Technique for Investigating Large Quantities of
32
Categorical Data, Applied Statistics, 29(2), 119-127.
33
23. Biggs, D., De Ville, B. and Suen, E. (1991) A Method of Choosing Multiway Partitions for
34
Classification and Decision Trees, Journal of Applied Statistics 18(1), 49-62.
35
24. Breiman, L., Friedman, J., Olshen, R. and Stone, C. (1984) Classification and Regression
36
Trees, Wadsworth Int. Group.
37
25. Loh, W.Y. and Shih, X. (1997) Split Selection Methods for Classification Trees, Statistica
38
Sinica 7, 815-840.
39
26. Lim, X., Loh, W.Y. and Shih, X. (2000) A Comparison of Prediction Accuracy, Complexity
40
and Training Time of Thirty-Three Old and New Classification Algorithms, Machine
41
Learning 40, 203-228.
42
27. Quinlan, J.R. (1986) Induction of decision trees, Machine Learning, Volume 1, 81-106.
43
28. Quinlan, J.R. (1993) C4.5: Programs for Machine Learning, Morgan Kaufman, Los Altos.
44
TRB 2015 Annual Meeting Original paper submittal - not revised by author.

More Related Content

What's hot

Buy in for brt over lrt how to inform project planning prioritisation about r...
Buy in for brt over lrt how to inform project planning prioritisation about r...Buy in for brt over lrt how to inform project planning prioritisation about r...
Buy in for brt over lrt how to inform project planning prioritisation about r...
BRTCoE
 
TE004, A Study On Feasible Traffic Operation Alternatives At Signalized Inter...
TE004, A Study On Feasible Traffic Operation Alternatives At Signalized Inter...TE004, A Study On Feasible Traffic Operation Alternatives At Signalized Inter...
TE004, A Study On Feasible Traffic Operation Alternatives At Signalized Inter...
Saurav Barua
 
Unraveling urban traffic flows
Unraveling urban traffic flowsUnraveling urban traffic flows
Unraveling urban traffic flows
Serge Hoogendoorn
 
Short talk impact Covid-19 on supply and demand during the RA webinar
Short talk impact Covid-19 on supply and demand during the RA webinarShort talk impact Covid-19 on supply and demand during the RA webinar
Short talk impact Covid-19 on supply and demand during the RA webinar
Serge Hoogendoorn
 
Beyond Level of Service – Towards a relative measurement of congestion in pla...
Beyond Level of Service – Towards a relative measurement of congestion in pla...Beyond Level of Service – Towards a relative measurement of congestion in pla...
Beyond Level of Service – Towards a relative measurement of congestion in pla...
JumpingJaq
 
Presentatie donderdag
Presentatie donderdagPresentatie donderdag
Presentatie donderdag
Tale Meester
 

What's hot (6)

Buy in for brt over lrt how to inform project planning prioritisation about r...
Buy in for brt over lrt how to inform project planning prioritisation about r...Buy in for brt over lrt how to inform project planning prioritisation about r...
Buy in for brt over lrt how to inform project planning prioritisation about r...
 
TE004, A Study On Feasible Traffic Operation Alternatives At Signalized Inter...
TE004, A Study On Feasible Traffic Operation Alternatives At Signalized Inter...TE004, A Study On Feasible Traffic Operation Alternatives At Signalized Inter...
TE004, A Study On Feasible Traffic Operation Alternatives At Signalized Inter...
 
Unraveling urban traffic flows
Unraveling urban traffic flowsUnraveling urban traffic flows
Unraveling urban traffic flows
 
Short talk impact Covid-19 on supply and demand during the RA webinar
Short talk impact Covid-19 on supply and demand during the RA webinarShort talk impact Covid-19 on supply and demand during the RA webinar
Short talk impact Covid-19 on supply and demand during the RA webinar
 
Beyond Level of Service – Towards a relative measurement of congestion in pla...
Beyond Level of Service – Towards a relative measurement of congestion in pla...Beyond Level of Service – Towards a relative measurement of congestion in pla...
Beyond Level of Service – Towards a relative measurement of congestion in pla...
 
Presentatie donderdag
Presentatie donderdagPresentatie donderdag
Presentatie donderdag
 

Similar to 15 0544

2. visualization in data mining
2. visualization in data mining2. visualization in data mining
2. visualization in data mining
Azad public school
 
Adaptive traffic lights based on traffic flow prediction using machine learni...
Adaptive traffic lights based on traffic flow prediction using machine learni...Adaptive traffic lights based on traffic flow prediction using machine learni...
Adaptive traffic lights based on traffic flow prediction using machine learni...
IJECEIAES
 
RESPONSE SURFACE METHODOLOGY FOR PERFORMANCE ANALYSIS AND MODELING OF MANET R...
RESPONSE SURFACE METHODOLOGY FOR PERFORMANCE ANALYSIS AND MODELING OF MANET R...RESPONSE SURFACE METHODOLOGY FOR PERFORMANCE ANALYSIS AND MODELING OF MANET R...
RESPONSE SURFACE METHODOLOGY FOR PERFORMANCE ANALYSIS AND MODELING OF MANET R...
IJCNCJournal
 
On Tracking Behavior of Streaming Data: An Unsupervised Approach
On Tracking Behavior of Streaming Data: An Unsupervised ApproachOn Tracking Behavior of Streaming Data: An Unsupervised Approach
On Tracking Behavior of Streaming Data: An Unsupervised Approach
Waqas Tariq
 
Spectral opportunity selection based on the hybrid algorithm AHP-ELECTRE
Spectral opportunity selection based on the hybrid algorithm AHP-ELECTRESpectral opportunity selection based on the hybrid algorithm AHP-ELECTRE
Spectral opportunity selection based on the hybrid algorithm AHP-ELECTRE
TELKOMNIKA JOURNAL
 
A Computational Study Of Traffic Assignment Algorithms
A Computational Study Of Traffic Assignment AlgorithmsA Computational Study Of Traffic Assignment Algorithms
A Computational Study Of Traffic Assignment Algorithms
Nicole Adams
 
A Computational Study Of Traffic Assignment Algorithms
A Computational Study Of Traffic Assignment AlgorithmsA Computational Study Of Traffic Assignment Algorithms
A Computational Study Of Traffic Assignment Algorithms
Alicia Buske
 
Control of A Platoon of Vehicles in Vanets using Communication Scheduling Pro...
Control of A Platoon of Vehicles in Vanets using Communication Scheduling Pro...Control of A Platoon of Vehicles in Vanets using Communication Scheduling Pro...
Control of A Platoon of Vehicles in Vanets using Communication Scheduling Pro...
IRJET Journal
 
deep_Visualization in Data mining.ppt
deep_Visualization in Data mining.pptdeep_Visualization in Data mining.ppt
deep_Visualization in Data mining.ppt
PerumalPitchandi
 
06775257
0677525706775257
06775257
Harish Madurai
 
2010 - Winter simulation
2010 - Winter simulation2010 - Winter simulation
2010 - Winter simulation
Newton Narciso Pereira
 
WMNs: The Design and Analysis of Fair Scheduling
WMNs: The Design and Analysis of Fair SchedulingWMNs: The Design and Analysis of Fair Scheduling
WMNs: The Design and Analysis of Fair Scheduling
iosrjce
 
C017641219
C017641219C017641219
C017641219
IOSR Journals
 
Journal paper 1
Journal paper 1Journal paper 1
Journal paper 1
Editor IJCATR
 
New Heuristic Model for Optimal CRC Polynomial
New Heuristic Model for Optimal CRC Polynomial New Heuristic Model for Optimal CRC Polynomial
New Heuristic Model for Optimal CRC Polynomial
IJECEIAES
 
Vivarana literature survey
Vivarana literature surveyVivarana literature survey
Vivarana literature survey
Tharindu Ranasinghe
 
Help the Genetic Algorithm to Minimize the Urban Traffic on Intersections
Help the Genetic Algorithm to Minimize the Urban Traffic on IntersectionsHelp the Genetic Algorithm to Minimize the Urban Traffic on Intersections
Help the Genetic Algorithm to Minimize the Urban Traffic on Intersections
IJORCS
 
Traffic Outlier Detection by Density-Based Bounded Local Outlier Factors
Traffic Outlier Detection by Density-Based Bounded Local Outlier FactorsTraffic Outlier Detection by Density-Based Bounded Local Outlier Factors
Traffic Outlier Detection by Density-Based Bounded Local Outlier Factors
ITIIIndustries
 
TOD
TODTOD
Plant location selection by using MCDM methods
Plant location selection by using MCDM methodsPlant location selection by using MCDM methods
Plant location selection by using MCDM methods
IJERA Editor
 

Similar to 15 0544 (20)

2. visualization in data mining
2. visualization in data mining2. visualization in data mining
2. visualization in data mining
 
Adaptive traffic lights based on traffic flow prediction using machine learni...
Adaptive traffic lights based on traffic flow prediction using machine learni...Adaptive traffic lights based on traffic flow prediction using machine learni...
Adaptive traffic lights based on traffic flow prediction using machine learni...
 
RESPONSE SURFACE METHODOLOGY FOR PERFORMANCE ANALYSIS AND MODELING OF MANET R...
RESPONSE SURFACE METHODOLOGY FOR PERFORMANCE ANALYSIS AND MODELING OF MANET R...RESPONSE SURFACE METHODOLOGY FOR PERFORMANCE ANALYSIS AND MODELING OF MANET R...
RESPONSE SURFACE METHODOLOGY FOR PERFORMANCE ANALYSIS AND MODELING OF MANET R...
 
On Tracking Behavior of Streaming Data: An Unsupervised Approach
On Tracking Behavior of Streaming Data: An Unsupervised ApproachOn Tracking Behavior of Streaming Data: An Unsupervised Approach
On Tracking Behavior of Streaming Data: An Unsupervised Approach
 
Spectral opportunity selection based on the hybrid algorithm AHP-ELECTRE
Spectral opportunity selection based on the hybrid algorithm AHP-ELECTRESpectral opportunity selection based on the hybrid algorithm AHP-ELECTRE
Spectral opportunity selection based on the hybrid algorithm AHP-ELECTRE
 
A Computational Study Of Traffic Assignment Algorithms
A Computational Study Of Traffic Assignment AlgorithmsA Computational Study Of Traffic Assignment Algorithms
A Computational Study Of Traffic Assignment Algorithms
 
A Computational Study Of Traffic Assignment Algorithms
A Computational Study Of Traffic Assignment AlgorithmsA Computational Study Of Traffic Assignment Algorithms
A Computational Study Of Traffic Assignment Algorithms
 
Control of A Platoon of Vehicles in Vanets using Communication Scheduling Pro...
Control of A Platoon of Vehicles in Vanets using Communication Scheduling Pro...Control of A Platoon of Vehicles in Vanets using Communication Scheduling Pro...
Control of A Platoon of Vehicles in Vanets using Communication Scheduling Pro...
 
deep_Visualization in Data mining.ppt
deep_Visualization in Data mining.pptdeep_Visualization in Data mining.ppt
deep_Visualization in Data mining.ppt
 
06775257
0677525706775257
06775257
 
2010 - Winter simulation
2010 - Winter simulation2010 - Winter simulation
2010 - Winter simulation
 
WMNs: The Design and Analysis of Fair Scheduling
WMNs: The Design and Analysis of Fair SchedulingWMNs: The Design and Analysis of Fair Scheduling
WMNs: The Design and Analysis of Fair Scheduling
 
C017641219
C017641219C017641219
C017641219
 
Journal paper 1
Journal paper 1Journal paper 1
Journal paper 1
 
New Heuristic Model for Optimal CRC Polynomial
New Heuristic Model for Optimal CRC Polynomial New Heuristic Model for Optimal CRC Polynomial
New Heuristic Model for Optimal CRC Polynomial
 
Vivarana literature survey
Vivarana literature surveyVivarana literature survey
Vivarana literature survey
 
Help the Genetic Algorithm to Minimize the Urban Traffic on Intersections
Help the Genetic Algorithm to Minimize the Urban Traffic on IntersectionsHelp the Genetic Algorithm to Minimize the Urban Traffic on Intersections
Help the Genetic Algorithm to Minimize the Urban Traffic on Intersections
 
Traffic Outlier Detection by Density-Based Bounded Local Outlier Factors
Traffic Outlier Detection by Density-Based Bounded Local Outlier FactorsTraffic Outlier Detection by Density-Based Bounded Local Outlier Factors
Traffic Outlier Detection by Density-Based Bounded Local Outlier Factors
 
TOD
TODTOD
TOD
 
Plant location selection by using MCDM methods
Plant location selection by using MCDM methodsPlant location selection by using MCDM methods
Plant location selection by using MCDM methods
 

More from uykuyk

dgdfgg
dgdfggdgdfgg
dgdfgg
uykuyk
 
26 hcm 20fhfdh
26 hcm 20fhfdh26 hcm 20fhfdh
26 hcm 20fhfdh
uykuyk
 
___ fghh
  ___ fghh  ___ fghh
___ fghh
uykuyk
 
2010 test
2010 test2010 test
2010 test
uykuyk
 
16 0648
16 064816 0648
16 0648
uykuyk
 
2008 evry0045
2008 evry00452008 evry0045
2008 evry0045
uykuyk
 
2005 0002 02
2005 0002 022005 0002 02
2005 0002 02
uykuyk
 
1980bac2 5e40-4db7-8021-9b6a1c8a1829
1980bac2 5e40-4db7-8021-9b6a1c8a18291980bac2 5e40-4db7-8021-9b6a1c8a1829
1980bac2 5e40-4db7-8021-9b6a1c8a1829
uykuyk
 
34th cycle call
34th cycle call34th cycle call
34th cycle call
uykuyk
 
16 0648 a
16 0648 a16 0648 a
16 0648 a
uykuyk
 
16 3239 c
16 3239 c16 3239 c
16 3239 c
uykuyk
 
34th cycle attachement-a-1
34th cycle attachement-a-134th cycle attachement-a-1
34th cycle attachement-a-1
uykuyk
 
gfdjjf
gfdjjfgfdjjf
gfdjjf
uykuyk
 
jfgjgf
jfgjgfjfgjgf
jfgjgf
uykuyk
 
Img 20210208 0002g
Img 20210208 0002gImg 20210208 0002g
Img 20210208 0002g
uykuyk
 
Le manuel tegggchnicien-photovoltaique
Le manuel tegggchnicien-photovoltaiqueLe manuel tegggchnicien-photovoltaique
Le manuel tegggchnicien-photovoltaique
uykuyk
 
Img 20210208 0002 (1)
Img 20210208 0002 (1)Img 20210208 0002 (1)
Img 20210208 0002 (1)
uykuyk
 
Guide de-la-communication-ecrite-en-anglais
Guide de-la-communication-ecrite-en-anglaisGuide de-la-communication-ecrite-en-anglais
Guide de-la-communication-ecrite-en-anglais
uykuyk
 

More from uykuyk (18)

dgdfgg
dgdfggdgdfgg
dgdfgg
 
26 hcm 20fhfdh
26 hcm 20fhfdh26 hcm 20fhfdh
26 hcm 20fhfdh
 
___ fghh
  ___ fghh  ___ fghh
___ fghh
 
2010 test
2010 test2010 test
2010 test
 
16 0648
16 064816 0648
16 0648
 
2008 evry0045
2008 evry00452008 evry0045
2008 evry0045
 
2005 0002 02
2005 0002 022005 0002 02
2005 0002 02
 
1980bac2 5e40-4db7-8021-9b6a1c8a1829
1980bac2 5e40-4db7-8021-9b6a1c8a18291980bac2 5e40-4db7-8021-9b6a1c8a1829
1980bac2 5e40-4db7-8021-9b6a1c8a1829
 
34th cycle call
34th cycle call34th cycle call
34th cycle call
 
16 0648 a
16 0648 a16 0648 a
16 0648 a
 
16 3239 c
16 3239 c16 3239 c
16 3239 c
 
34th cycle attachement-a-1
34th cycle attachement-a-134th cycle attachement-a-1
34th cycle attachement-a-1
 
gfdjjf
gfdjjfgfdjjf
gfdjjf
 
jfgjgf
jfgjgfjfgjgf
jfgjgf
 
Img 20210208 0002g
Img 20210208 0002gImg 20210208 0002g
Img 20210208 0002g
 
Le manuel tegggchnicien-photovoltaique
Le manuel tegggchnicien-photovoltaiqueLe manuel tegggchnicien-photovoltaique
Le manuel tegggchnicien-photovoltaique
 
Img 20210208 0002 (1)
Img 20210208 0002 (1)Img 20210208 0002 (1)
Img 20210208 0002 (1)
 
Guide de-la-communication-ecrite-en-anglais
Guide de-la-communication-ecrite-en-anglaisGuide de-la-communication-ecrite-en-anglais
Guide de-la-communication-ecrite-en-anglais
 

Recently uploaded

Status of Women in Pakistan.pptxStatus of Women in Pakistan.pptx
Status of Women in Pakistan.pptxStatus of Women in Pakistan.pptxStatus of Women in Pakistan.pptxStatus of Women in Pakistan.pptx
Status of Women in Pakistan.pptxStatus of Women in Pakistan.pptx
MuhammadWaqasBaloch1
 
Jill Pizzola's Tenure as Senior Talent Acquisition Partner at THOMSON REUTERS...
Jill Pizzola's Tenure as Senior Talent Acquisition Partner at THOMSON REUTERS...Jill Pizzola's Tenure as Senior Talent Acquisition Partner at THOMSON REUTERS...
Jill Pizzola's Tenure as Senior Talent Acquisition Partner at THOMSON REUTERS...
dsnow9802
 
Gabrielle M. A. Sinaga Portfolio, Film Student (2024)
Gabrielle M. A. Sinaga Portfolio, Film Student (2024)Gabrielle M. A. Sinaga Portfolio, Film Student (2024)
Gabrielle M. A. Sinaga Portfolio, Film Student (2024)
GabrielleSinaga
 
0624.speakingengagementsandteaching-01.pdf
0624.speakingengagementsandteaching-01.pdf0624.speakingengagementsandteaching-01.pdf
0624.speakingengagementsandteaching-01.pdf
Thomas GIRARD BDes
 
Switching Careers Slides - JoyceMSullivan SocMediaFin - 2024Jun11.pdf
Switching Careers Slides - JoyceMSullivan SocMediaFin -  2024Jun11.pdfSwitching Careers Slides - JoyceMSullivan SocMediaFin -  2024Jun11.pdf
Switching Careers Slides - JoyceMSullivan SocMediaFin - 2024Jun11.pdf
SocMediaFin - Joyce Sullivan
 
在线制作加拿大萨省大学毕业证文凭证书实拍图原版一模一样
在线制作加拿大萨省大学毕业证文凭证书实拍图原版一模一样在线制作加拿大萨省大学毕业证文凭证书实拍图原版一模一样
在线制作加拿大萨省大学毕业证文凭证书实拍图原版一模一样
2zjra9bn
 
IT Career Hacks Navigate the Tech Jungle with a Roadmap
IT Career Hacks Navigate the Tech Jungle with a RoadmapIT Career Hacks Navigate the Tech Jungle with a Roadmap
IT Career Hacks Navigate the Tech Jungle with a Roadmap
Base Camp
 
A Guide to a Winning Interview June 2024
A Guide to a Winning Interview June 2024A Guide to a Winning Interview June 2024
A Guide to a Winning Interview June 2024
Bruce Bennett
 
Leadership Ambassador club Adventist module
Leadership Ambassador club Adventist moduleLeadership Ambassador club Adventist module
Leadership Ambassador club Adventist module
kakomaeric00
 
5 Common Mistakes to Avoid During the Job Application Process.pdf
5 Common Mistakes to Avoid During the Job Application Process.pdf5 Common Mistakes to Avoid During the Job Application Process.pdf
5 Common Mistakes to Avoid During the Job Application Process.pdf
Alliance Jobs
 
lab.123456789123456789123456789123456789
lab.123456789123456789123456789123456789lab.123456789123456789123456789123456789
lab.123456789123456789123456789123456789
Ghh
 
Introducing Gopay Mobile App For Environment.pptx
Introducing Gopay Mobile App For Environment.pptxIntroducing Gopay Mobile App For Environment.pptx
Introducing Gopay Mobile App For Environment.pptx
FauzanHarits1
 
How to Prepare for Fortinet FCP_FAC_AD-6.5 Certification?
How to Prepare for Fortinet FCP_FAC_AD-6.5 Certification?How to Prepare for Fortinet FCP_FAC_AD-6.5 Certification?
How to Prepare for Fortinet FCP_FAC_AD-6.5 Certification?
NWEXAM
 
labb123456789123456789123456789123456789
labb123456789123456789123456789123456789labb123456789123456789123456789123456789
labb123456789123456789123456789123456789
Ghh
 
一比一原版布拉德福德大学毕业证(bradford毕业证)如何办理
一比一原版布拉德福德大学毕业证(bradford毕业证)如何办理一比一原版布拉德福德大学毕业证(bradford毕业证)如何办理
一比一原版布拉德福德大学毕业证(bradford毕业证)如何办理
taqyea
 
Resumes, Cover Letters, and Applying Online
Resumes, Cover Letters, and Applying OnlineResumes, Cover Letters, and Applying Online
Resumes, Cover Letters, and Applying Online
Bruce Bennett
 
Lbs last rank 2023 9988kr47h4744j445.pdf
Lbs last rank 2023 9988kr47h4744j445.pdfLbs last rank 2023 9988kr47h4744j445.pdf
Lbs last rank 2023 9988kr47h4744j445.pdf
ashiquepa3
 
Tape Measure Training & Practice Assessments.pdf
Tape Measure Training & Practice Assessments.pdfTape Measure Training & Practice Assessments.pdf
Tape Measure Training & Practice Assessments.pdf
KateRobinson68
 
thyroid case presentation.pptx Kamala's Lakshaman palatial
thyroid case presentation.pptx Kamala's Lakshaman palatialthyroid case presentation.pptx Kamala's Lakshaman palatial
thyroid case presentation.pptx Kamala's Lakshaman palatial
Aditya Raghav
 
Learnings from Successful Jobs Searchers
Learnings from Successful Jobs SearchersLearnings from Successful Jobs Searchers
Learnings from Successful Jobs Searchers
Bruce Bennett
 

Recently uploaded (20)

Status of Women in Pakistan.pptxStatus of Women in Pakistan.pptx
Status of Women in Pakistan.pptxStatus of Women in Pakistan.pptxStatus of Women in Pakistan.pptxStatus of Women in Pakistan.pptx
Status of Women in Pakistan.pptxStatus of Women in Pakistan.pptx
 
Jill Pizzola's Tenure as Senior Talent Acquisition Partner at THOMSON REUTERS...
Jill Pizzola's Tenure as Senior Talent Acquisition Partner at THOMSON REUTERS...Jill Pizzola's Tenure as Senior Talent Acquisition Partner at THOMSON REUTERS...
Jill Pizzola's Tenure as Senior Talent Acquisition Partner at THOMSON REUTERS...
 
Gabrielle M. A. Sinaga Portfolio, Film Student (2024)
Gabrielle M. A. Sinaga Portfolio, Film Student (2024)Gabrielle M. A. Sinaga Portfolio, Film Student (2024)
Gabrielle M. A. Sinaga Portfolio, Film Student (2024)
 
0624.speakingengagementsandteaching-01.pdf
0624.speakingengagementsandteaching-01.pdf0624.speakingengagementsandteaching-01.pdf
0624.speakingengagementsandteaching-01.pdf
 
Switching Careers Slides - JoyceMSullivan SocMediaFin - 2024Jun11.pdf
Switching Careers Slides - JoyceMSullivan SocMediaFin -  2024Jun11.pdfSwitching Careers Slides - JoyceMSullivan SocMediaFin -  2024Jun11.pdf
Switching Careers Slides - JoyceMSullivan SocMediaFin - 2024Jun11.pdf
 
在线制作加拿大萨省大学毕业证文凭证书实拍图原版一模一样
在线制作加拿大萨省大学毕业证文凭证书实拍图原版一模一样在线制作加拿大萨省大学毕业证文凭证书实拍图原版一模一样
在线制作加拿大萨省大学毕业证文凭证书实拍图原版一模一样
 
IT Career Hacks Navigate the Tech Jungle with a Roadmap
IT Career Hacks Navigate the Tech Jungle with a RoadmapIT Career Hacks Navigate the Tech Jungle with a Roadmap
IT Career Hacks Navigate the Tech Jungle with a Roadmap
 
A Guide to a Winning Interview June 2024
A Guide to a Winning Interview June 2024A Guide to a Winning Interview June 2024
A Guide to a Winning Interview June 2024
 
Leadership Ambassador club Adventist module
Leadership Ambassador club Adventist moduleLeadership Ambassador club Adventist module
Leadership Ambassador club Adventist module
 
5 Common Mistakes to Avoid During the Job Application Process.pdf
5 Common Mistakes to Avoid During the Job Application Process.pdf5 Common Mistakes to Avoid During the Job Application Process.pdf
5 Common Mistakes to Avoid During the Job Application Process.pdf
 
lab.123456789123456789123456789123456789
lab.123456789123456789123456789123456789lab.123456789123456789123456789123456789
lab.123456789123456789123456789123456789
 
Introducing Gopay Mobile App For Environment.pptx
Introducing Gopay Mobile App For Environment.pptxIntroducing Gopay Mobile App For Environment.pptx
Introducing Gopay Mobile App For Environment.pptx
 
How to Prepare for Fortinet FCP_FAC_AD-6.5 Certification?
How to Prepare for Fortinet FCP_FAC_AD-6.5 Certification?How to Prepare for Fortinet FCP_FAC_AD-6.5 Certification?
How to Prepare for Fortinet FCP_FAC_AD-6.5 Certification?
 
labb123456789123456789123456789123456789
labb123456789123456789123456789123456789labb123456789123456789123456789123456789
labb123456789123456789123456789123456789
 
一比一原版布拉德福德大学毕业证(bradford毕业证)如何办理
一比一原版布拉德福德大学毕业证(bradford毕业证)如何办理一比一原版布拉德福德大学毕业证(bradford毕业证)如何办理
一比一原版布拉德福德大学毕业证(bradford毕业证)如何办理
 
Resumes, Cover Letters, and Applying Online
Resumes, Cover Letters, and Applying OnlineResumes, Cover Letters, and Applying Online
Resumes, Cover Letters, and Applying Online
 
Lbs last rank 2023 9988kr47h4744j445.pdf
Lbs last rank 2023 9988kr47h4744j445.pdfLbs last rank 2023 9988kr47h4744j445.pdf
Lbs last rank 2023 9988kr47h4744j445.pdf
 
Tape Measure Training & Practice Assessments.pdf
Tape Measure Training & Practice Assessments.pdfTape Measure Training & Practice Assessments.pdf
Tape Measure Training & Practice Assessments.pdf
 
thyroid case presentation.pptx Kamala's Lakshaman palatial
thyroid case presentation.pptx Kamala's Lakshaman palatialthyroid case presentation.pptx Kamala's Lakshaman palatial
thyroid case presentation.pptx Kamala's Lakshaman palatial
 
Learnings from Successful Jobs Searchers
Learnings from Successful Jobs SearchersLearnings from Successful Jobs Searchers
Learnings from Successful Jobs Searchers
 

15 0544

  • 1. USING DECISION TREES IN ORDER TO DETERMINE INTERSECTION DESIGN RULES Erwin M. Bezembinder * Windesheim University of Applied Sciences Department of Technology Area Development Research Group P.O. Box 10090, 8000 GB, Zwolle, The Netherlands Phone: +31-88-4698436 E-mail: e.bezembinder@windesheim.nl Luc J.J. Wismans University of Twente Faculty of Engineering Technology Centre for Transport Studies P.O. Box 217, 7500 AE, Enschede, The Netherlands Phone: +31-570-666840 E-mail: lwismans@dat.nl Eric. C. van Berkum University of Twente Faculty of Engineering Technology Centre for Transport Studies P.O. Box 217, 7500 AE, Enschede, The Netherlands Phone: +31-53-4894886 E-mail: e.c.vanberkum@utwente.nl * = Corresponding author. Submitted: August 1, 2014. Paper prepared for the 94th Annual Meeting of the Transportation Research Board, 2015. Word Count: Text, excl. references: 4,972 Tables: 5 x 250 = 1,250 Figures: 3 x 250 = 750 Total: 6,972 TRB 2015 Annual Meeting Original paper submittal - not revised by author.
  • 2. Bezembinder, Wismans and Van Berkum 2 ABSTRACT 1 Road planners frequently face the challenge to determine which intersection design provides the 2 best traffic flow for a particular traffic demand. Many road design manuals provide guidelines 3 for the design and evaluation of different intersection alternatives, however mostly refer to 4 specialized software in which the performances of different design alternatives can be modelled. 5 In a planning stage of the design process, such assessments are undesirable due to time and cost. 6 There is a need for quick design rules which need limited input data. Although some of these 7 rules exist, their usability is limited. In this paper we examine the possibilities to determine 8 intersection design rules by Decision Tree (DT) methods which are trained with data generated 9 by HCM 2010 intersection modelling. The models consider 24 intersection designs varying the 10 main type (all-way stop controlled, two-way stop controlled, signalized and roundabout) and the 11 number and configuration of the entering and exiting lanes. Traffic demand patterns are 12 randomly generated for various sizes of the dataset (5,000 – 5,000,000 cases) represented by 38 13 (independent) demand variables. Different DT methods (CHAID, CRT and QUEST), options 14 (splitting criteria, tree depth) and datasets are tested for their predictive accuracy. The DT models 15 provide accuracy rates between 76% and 96%. The CRT methods seem the most promising, and 16 a further analysis was made concerning the independent variable importance and the possibilities 17 for reducing the trees complexity. An example is shown of a DT which provides straightforward 18 design rules and an predictive accuracy of 85.5%. 19 TRB 2015 Annual Meeting Original paper submittal - not revised by author.
  • 3. Bezembinder, Wismans and Van Berkum 3 INTRODUCTION 1 Throughout the world, road planners frequently face the challenge to determine which 2 intersection design provides the best traffic flow for a particular location. This intersection 3 design encompasses the choice of the main type such as a signalized intersection or roundabout, 4 but also choices regarding the number and characteristics of the entering and exiting lanes and 5 the type of signal control. In order to determine the best intersection design, road planners 6 initially consult their road design manuals. These manuals provide guidelines for the assessments 7 of different intersection designs, e.g. see [1,2,3,4,5,6,7]. However, most guidelines refer to 8 specialized software in which the performances of different design alternatives can be modeled. 9 This is not workable for planning purposes. At that stage of the design process, rules of thumb, 10 assessment schemes or simple calculation methods are more suitable. Only a very limited 11 amount of these rules exist. In the Netherlands, various rules have been published by the CROW 12 [5,6,7] mostly regarding the choice for a particular roundabout design. In the USA, Stamatiadis 13 et.al. [8,9] did a literate study concerning the evaluation of design alternatives based on both 14 national and (41) state design manuals. They observed a lack of specific guidance at both 15 national and state levels. Similar observation can be made based on design manuals in Germany 16 and the UK [3,4]. The lack of rules can for the most part be explained by the lack of appropriate 17 field data to determine the rules. Alternatively, data generated by intersection models can be 18 used. Using a model provides the opportunity to create a complete dataset. Stamatiadis et.al. 19 [8,9] used CORSIM to analyze the intersection control delay and the critical volume to 20 determine thresholds for different designs. Han et.al. [10] used the HCM 2000 [11] to distinguish 21 between different intersection types based on the major and minor street traffic volumes in order 22 to test and improve HCM 2000 Exhibit 10-15. Vitins and Axhausen [12] also used the HCM 23 2000 to determine intersection design rules. Although these studies provide interesting 24 approaches, they examined a limited number of intersection designs and demand volumes and 25 moreover they conducted a manual analysis of the generated data. 26 In this paper we present an approach in which we use the HCM 2010 [13] in order to 27 generate various datasets which can be used to determine intersection design rules bases on 28 intersection control delay. The datasets will be used as input for a Decision Tree (DT) method. 29 DTs are a classification technique used to categorize samples/instances based on the features of 30 the samples. Although it has not been applied to determine intersection design rules, it has 31 recently proven to be a powerful method which can handle both numerical and categorical data, 32 requires little data preparations, uses a white box model, is easy to understand and interpret, 33 performs well with large datasets, is robust and offers possibilities to validate the model using 34 statistical techniques [14]. Most important, rules can be derived from the resulting DTs. 35 The aim of this research is to determine whether DTs can be used to determine 36 intersection design rules based on demand variables and control delays generated by HCM 2010 37 with sufficient accuracy. Secondly, various DT methods are available and there is no ‘rule’ 38 which states which methods is most suitable for this specific situation. Therefore, three main DT 39 methods (CHAID, CRT and QUEST), which based on their characteristics are suitable for the 40 job and various options will be tested for accuracy in order to determine the best DT method. 41 Furthermore, it is tested whether manageable DTs and rules can be generated so they can be used 42 in design manuals. 43 TRB 2015 Annual Meeting Original paper submittal - not revised by author.
  • 4. Bezembinder, Wismans and Van Berkum 4 Paper outline 1 In the next section the generation of the data sets will be described. Subsequently, the general 2 principle of DTs, the DT methods and the evaluation of the methods in this study will be 3 discussed. Next, the results will be explained followed by the conclusions and considerations for 4 further research. 5 DATA SETS 6 The DT model needs a dataset in which each case (or record or row) contains values for multiple 7 independent (predictor) variables and the corresponding dependent (target) variable. In this 8 research, the independent variables represent the demand flow rates on an intersection, whereas 9 the dependent variable is the preferred intersection design. 10 The dataset will be generated using the HCM 2010 methodology [14]. The HCM 2010 is 11 used to determine the volume weighted average control delay for the intersection based on given 12 demand flow rates and an intersection design. Figure 1 shows an example of the input data for a 13 multilane roundabout. Using the HCM 2010 Roundabout Analysis Methodology with default 14 values, this will result in a control delay for the intersection of 16.6 s/veh. 15 16 17 Figure 1 Intersection design and demand volumes (pcu/h) for a multilane roundabout [13, 18 Exhibit 21-23]. 19 20 In order to determine the preferred intersection design for these specific demand flow rates, the 21 control delay for multiple intersection design alternatives is calculated. The intersection design 22 with the lowest value for the control delay will be the preferred intersection design. Basically, 23 this produces one case (or record or row) in the resulting dataset. This process is repeated for a 24 multitude of different demand flow rates, thus generating a dataset for the training and testing of 25 the DT model. Different datasets will be generated in order to test the performance of the DT 26 methods related to the size of the dataset, the use of spatial constraints and the use of multiple 27 (closely related) preferred intersection designs for one set of demand flow rates. This will be 28 discussed in more detail in the subsequent sections, respectively explaining the intersection 29 model, the employed intersection design alternatives and demand flow rates and the selection of 30 the preferred intersection design. 31 TRB 2015 Annual Meeting Original paper submittal - not revised by author.
  • 5. Bezembinder, Wismans and Van Berkum 5 Intersection model 1 As stated before, we use the HCM 2010 methodology to determine the control delay for each 2 intersection design and set of demand flow rates. The latter is expressed as person car units per 3 hour (pcu/h), thus incorporating freight traffic. The standard duration of the analysis period is 15 4 minutes. The HCM 2010 distinguishes four main types of (ground-level) intersections for non- 5 highways: All-way stop controlled (ASWC) intersections, two-way stop controlled (TWSC) 6 intersections, signalized intersections and roundabouts. For each of these types a different 7 methodology is suggested. For required attributes other than the intersection design attributes 8 and demand flow rates default values as suggested by the HCM 2010 will be used. For signalized 9 intersections, the additionally required signal control settings are determined by using the so- 10 called ‘Quick Estimation Method’ as described in the HCM 2010 [13, Chapter 31]. Although this 11 method does not guarantee an optimal signal timing plan (with a minimal control delay for the 12 intersection), it does provides signal timings and delays when minimal data are available, which 13 is case for this application. 14 Intersection design 15 Since HCM 2010 is used, the possible intersection designs and the available attributes are bound 16 to this model. However, the number of possible designs is still very large. Since the intersection 17 design will be the dependent variable for the DT models, it is wise to limit the number of 18 categories. This is also legitimate, since in practice only a limited number of reasonable 19 intersection designs can and would be applied. Therefore, the number of intersection designs for 20 a four-arm intersection is limited to 24 design types. Table 1 gives an overview of the types used 21 in this study. The first attribute is the main type in accordance with the HCM 2010 type and 22 methodology. Then a distinction is made between attributes for the major and minor road 23 approaches. For reasons of clarity, in this research, the major approaches are always the east- and 24 westbound approaches, whereas the minor approaches are the north- and southbound approaches. 25 For each approach, the configuration of the entry lanes is determined ranging from one shared 26 lane for all movements to two designated lanes per movements. The * used at roundabout lanes 27 is used to indicate a (nonyielding) right-turn bypass lane. Other attributes concern the width of 28 the central reservation (CR) in meters, the number of exiting lanes and the number of opposing 29 circulating lanes in case of a roundabout approach. The rightmost column in the table shows the 30 size category of the design. This attribute was introduced in order to prevent large (and 31 expensive) intersections always to be preferred design, regardless of the traffic volume on the 32 intersection. This will be discussed later on. The size categories are based on an estimation of the 33 required surface area for both the central part of the intersection as well as the approaches. Size 34 category 1 designs have a maximum area of 500 m2 , the other boundaries are 1000, 1500, 2000 35 and 2000+ m2 . 36 37 38 TRB 2015 Annual Meeting Original paper submittal - not revised by author.
  • 6. Bezembinder, Wismans and Van Berkum 6 Design type Main type Major road approaches Minor road approaches Size cat. Entry CR Exit Circ. Entry CR Exit Circ. AW1 AWSC 1:  0.0 1 - 1:  0.0 1 - 1 TW1 TWSC 1:  0.0 1 - 1:  0.0 1 - 1 TW2 1:  5.0 1 - 1:  0.0 1 - 2 TW3 2:  0.0 1 - 1:  0.0 1 - 1 TW4 2:  5.0 1 - 1:  0.0 1 - 2 TW5 2:  5.0 1 - 2:  0.0 1 - 2 SIG1 Signalized 1:  0.0 1 - 1:  0.0 1 - 1 SIG2 2:  0.0 1 - 1:  0.0 1 - 1 SIG3 2:  0.0 1 - 2:  0.0 1 - 2 SIG4 3:  0.0 1 - 2:  0.0 1 - 2 SIG5 3:  0.0 1 - 3:  0.0 1 - 2 SIG6 4:  0.0 2 - 2:  0.0 1 - 3 SIG7 4:  0.0 2 - 3:  0.0 1 - 3 SIG8 4:  0.0 2 - 4:  0.0 2 - 3 SIG9 6:  0.0 2 - 4:  0.0 2 - 4 SIG10 6:  0.0 2 - 6:  0.0 2 - 5 RA1 Roundabout 1:  0.0 1 1 1:  0.0 1 1 3 RA2 1:  0.0 1 2 1:  0.0 1 2 5 RA3 2:  0.0 1 1 1:  0.0 1 1 3 RA4 2:  0.0 2 1 1:  0.0 1 2 5 RA5 2: * 0.0 1 1 1:  0.0 1 1 4 RA6 2: * 0.0 1 1 2: * 0.0 1 1 4 RA7 2:  0.0 2 1 2:  0.0 1 2 5 RA8 3: * 0.0 2 2 2: * 0.0 2 2 5 Table 1 Intersection design types for four-arm intersections. 1 Demand flow rates 2 On a four-arm intersection there are twelve turning movements for car traffic, i.e. left, through 3 and right turning movements for each approach. In this research it is the idea to determine the 4 intersection performance for as much demand flow rates combinations as possible. Suppose that 5 for each turning movement a value of 0 to 1000 pcu/h should be tested. The total number of 6 demand combinations would then be 100012 . Although this number can be reduced because 7 various design are equal once you mirror them, it is still a substantial number of calculations to 8 be performed for each of the 24 intersection designs. Also a very fast model is not able to handle 9 these quantities. The number of combinations could be reduced by reducing the number of 10 values for each turning movement, for example by using only 10 categories. Although this 11 reduces the number of tested demand drastically, it still leaves 1012 combinations. Further 12 reduction could be reached by using proportions for each approach and turning movement per 13 approach. In preparation for this research several approaches for the definition of the traffic 14 demand have been tested. Ultimately, an approach in which a given number of random generated 15 traffic demands will be determined turned out to be most promising. In this approach a random 16 amount of total traffic on the intersection is determined from a range of 0 to 6000 pcu/h. Next, 17 the proportion of traffic on the major approaches will be randomly drawn from a range of 50- 18 100%. The proportion of traffic on the minor approach is then derived from this value. This 19 means that there is never more traffic on the minor road then on the major road. In the most 20 extreme setting, the proportions are 50-50%. Subsequently, the proportions of traffic on the west- 21 bound (major) and north-bound (minor) approach are determined from a range of 0-100%. The 22 values for the other approaches are derived. Then for each approach, the right turn proportion is 23 drawn from a range of 0-100%. The through turn proportion is drawn from the remaining range, 24 while the left turn proportion is equal to the remaining proportion. 25 TRB 2015 Annual Meeting Original paper submittal - not revised by author.
  • 7. Bezembinder, Wismans and Van Berkum 7 Once all proportions are determined, they can be applied to the total volume in order to 1 determine the flow rates for each turning movement in the intersection. This is the input for the 2 intersection model. 3 Eventually, the demand flow rates are used as independent variables for the DT model. 4 However, as stated before, it is likely that composed variables, such as the total number of left 5 turning traffic from the minor road, are more explanatory for the model. One of the goals of 6 training and testing the DT models is to determine which independent variables are most 7 important for the model. Therefore, a total of 38 independent demand variables will be used in to 8 DT models. Table 2 shows an overview of these variables. 9 10 WB right volume SB right volume Major through & minor left volume Minor percentage WB through volume SB through volume Major through & left volume Major left percentage WB left volume SB left volume Minor through & left volume Minor left percentage NB right volume Total volume Left volume Major through percentage NB through volume Major volume Through volume Minor through percentage NB left volume Minor volume WB volume Major through & left percentage EB right volume Major through volume NB volume Minor through & left percentage EB through volume Major left volume EB volume Left percentage EB left volume Minor through volume SB volume Through percentage Minor left volume Major percentage Table 2 Independent demand variables. 11 12 Another issue concerns the size of the dataset that will be used to train the DT model. Since this 13 is still a point of research, different dataset sizes will be tested. Datasets with respectively 1,000, 14 10,000, 100,000 and 1,000,000 different demand configurations will be tested. So, for each of 15 these demand sets the performances of 24 different intersection design will be tested. 16 Preferred intersection design(s) 17 Basically, the preferred intersection design will be the intersection design with the lowest 18 intersection control delay. This is the volume weighted average control delay for the whole 19 intersection in s/veh. A preferred design is determined for each set of demand flow rates and size 20 category. The latter was introduced in order to prevent large (and expensive) intersections always 21 to be preferred design, regardless of the traffic volume on the intersection. With five size 22 categories, this will result in five cases (or records or rows) for each configuration of demand 23 flow rates. 24 DT METHODS AND EVALUATION 25 DTs 26 DTs classify instances by sorting them down the tree from the root to some leaf node, which 27 provides the classification of the instance. Each node in the tree specifies a test of some 28 (predictor) attribute, and each branch descending from that node to one of the possible 29 values for this attribute. An instance is classified starting at the root node of the tree, 30 testing the attribute specified by this node, then moving down the tree branch 31 corresponding to the value of the attribute in the given example. If the new node is not a 32 leaf node, but an internal node, the process is repeated for the subtree starting at the new 33 node. 34 TRB 2015 Annual Meeting Original paper submittal - not revised by author.
  • 8. Bezembinder, Wismans and Van Berkum 8 Figure 2 shows an imaginary example of a simple DT which predicts whether one should apply a 1 signal control. 2 3 Total volume <=1250 pcu/h Area Rural Yes >1250 pcu/h Speed limit Urban No 30 km/h 50 km/h No Yes 4 5 Figure 2 Example of a DT. 6 7 As can be seen, a DT can contain both nominal and numeric attributes. When the predicted 8 outcome (target) of the DT model is a real number, it is called a regression tree. When the 9 predicted outcome is the class to which the data belongs, it is called a classification tree. In this 10 study, we are interested in predicting the junction design type, which means we are only looking 11 at classification trees. 12 DT inducers/methods 13 DT inducers are algorithms that automatically construct a DT from a given dataset. Typically the 14 goal is to find the optimal DT by minimizing the generalization error. Induction of an optimal 15 DT from a given dataset is considered to be hard task [14]. As a result, heuristic methods are 16 required for solving the problem. Roughly speaking, these methods can be divided into two 17 groups: top-down and bottom-up with clear preference in the literature to the first group. Most 18 methods consist of two conceptual phases: growing (extending) and pruning (reducing). 19 A variety of top-down (univariate) algorithms is available. Rokach and Maimon [14] give 20 an extensive overview and state that the main algorithms are ID3, C4.5, CART, CHAID and 21 QUEST. The other algorithms are earlier versions, variations or are specifically designed for 22 numerical-valued predictor and/or target attributes, which is not applicable for predicting the 23 intersection design type. It is well known that no algorithm can be the best in all possible 24 domains [14]. For as far as we know, there are no publications concerning DT analysis for 25 intersection design type. The closest domain concerns the analysis of traffic accident severity 26 using DTs [15,16] for which the C4.5 algorithm is used. Most studies analyzing other domains 27 compare multiple algorithms, mostly CART, CHAID and QUEST e.g. see [17,18,19,20]. That 28 will also be done in this study. 29 CHAID stands for Chi-Squared Automatic Interaction Detection. The CHAID algorithm 30 was originally proposed by Kass [21]. The algorithm tests variables for independence using the 31 chi-square test, which determines whether splitting a node generates a statistically improvement 32 TRB 2015 Annual Meeting Original paper submittal - not revised by author.
  • 9. Bezembinder, Wismans and Van Berkum 9 in purity. Either Pearson or likelihood-ratio chi-square is used for determining the node splitting 1 and class merging. 2 The resulting significance values are adjusted using the Bonferroni method. CHAID allows 3 multiway node splitting (as opposed to binary node splitting), which means that the instance 4 space can be split in more than two classes. CHAID does not perform pruning. Exhaustive 5 CHAID , proposed by Biggs et.al. [22], is a modification of CHAID that uses a more thorough 6 heuristic for finding, at each node, the optimal way of grouping the categories of each predictor. 7 CART (or CRT) stands for Classification and Regression Trees. It was developed by 8 Breiman et.al. [23] and is characterized by the fact that it constructs binary trees. The splits are 9 selected using the Gini or Twoing impurity measure criteria and the obtained tree is pruned by 10 cost complexity pruning. 11 QUEST was proposed by Lim and Shih [24] as a Quick, Unbiased, Efficient Statistical 12 Tree and also constructs binary trees. For nominal predictors, it uses the Pearson chi-squared test 13 for node splitting. QUEST has negligible bias and uses ten-fold cross-validation to prune the 14 trees. 15 Summarizing, the following DT algorithms and settings will be tested in this study: 16 1. CHAID, Pearson chi-square statistic; 17 2. CHAID, Likelihood Ratio chi-square statistic; 18 3. Exhaustive CHAID, Pearson chi-square statistic; 19 4. Exhaustive CHAID, Likelihood Ratio chi-square statistic; 20 5. CRT, Gini impurity measure; 21 6. CRT, Twoing impurity measure; 22 7. QUEST. 23 Evaluation of DT methods 24 In order to determine which DT method is the best, one could compare prediction accuracy, 25 complexity and training time [26]. The most common measure for accuracy is the risk estimate 26 (and its standard error). For categorical dependent variables, the risk estimate is the proportion of 27 cases incorrectly classified. A classification table gives more insight in the specific 28 misclassification cases. A validation method will be used by determining risk estimates for both 29 training and test data. For this the dataset will be split in a training and test set with different 30 proportional segmentations (50%-50%, 67%-33% and 75%-25%). 31 Complexity can be compared by using general statistics such as the tree depth, the 32 number of branches and nodes, the number of cases in leafs and the number of independent 33 variables included. Since we are initially interested in the accuracy of the models, instead of the 34 manageability of the resulting trees, the maximum tree depth is set to the (maximum value) of 20 35 for all methods. In a later stage this number will be reduced to create more manageable trees. For 36 the same reason, no tree pruning is applied at first. The minimum number of cases in the parent 37 and child nodes are set to 100 and 50. The CRT method provides a measure for the (normalized) 38 importance of each independent variable in the model which can be used to reduce the number of 39 input variables. Together with options for the maximum tree depth and pruning, the complexity 40 of the tree can be reduced. This will be done only for the method with the best overall accuracy, 41 before applying these options. 42 Training time is expected to be an issue only for large amounts of data, i.e. containing 43 1,000,000 cases. The differences between the methods are expected to be relatively small. 44 TRB 2015 Annual Meeting Original paper submittal - not revised by author.
  • 10. Bezembinder, Wismans and Van Berkum 10 RESULTS 1 Accuracy 2 An overview of the prediction accuracy of the different DT algorithms for various sizes of the 3 training set is shown in Table 3. The table shows the overall classification rate, which is the 4 proportion of cases correctly classified. The dependent (target) variable used is the preferred 5 intersection design type as presented in Table 1. Only one design type for each combination of 6 demand variable configuration and size category is included in the dataset. The independent 7 (predictor) variables are all demand variables as mentioned in Table 2 and the size category 8 variable. 9 10 1 2 3 4 5 6 7 CHAID CHAID ECHAID ECHAID CRT CRT QUEST Cases Std.err. Pearson Likelihood Pearson Likelihood Gini Twoing - 5,000 0.006 77.4% 76.9% 77.6% 77.3% 79.5% 79.5% 76.3% 50,000 0.002 82.8% 82.7% 82.9% 82.6% 84.7% 84.5% 81.0% 500,000 0.000 85.8% 85.9% 85.8% 85.9% 85.7% 87.3% 84.4% 5,000,000 0.000 87.9% 88.1% 87.8% 88.0% 85.2% 86.6% 85.9% Table 3 Prediction accuracy for different algorithms and training set sizes. 11 12 The overall accuracy in Table 3 ranges from 76.3% to 88.1%. The standard error is, except for a 13 dataset with 5,000 cases, negligible small. Generally, the accuracy increases with an increase of 14 cases in the dataset, with the exception of both CRT methods (5,6) between 500,000 and 15 5,000,000 cases. Although the differences between the accuracy rates for the methods are small 16 (2-3% points), the QUEST method (7) generally has the lowest accuracy and the CRT methods 17 (5,6) have the highest accuracy. For the largest dataset the CHAID methods with a likelihood 18 ratio criterion (2,4) have more accuracy. For the largest dataset, calculation times become an 19 issue, since one method on average takes one hour to calculate. 20 Table 4 shows additional accuracy values for the tested methods, only for the dataset 21 containing 500,000 cases. The upper part of the table shows accuracy rates for the main junction 22 types being; all-way stop controlled (AW), two-way stop controlled (TW) and signalized (SIG) 23 intersections and roundabouts (RA). An intersection design is counted as correctly classified 24 when the main intersection type is correct. So if the observed intersection design type is SIG3 25 and the predicted type is SIG4 it is labeled as correctly classified. It can be seen that trees are 26 much better in predicting roundabouts correctly (range of 95.6%-96.3%) then all-way sign 27 controlled intersections (range of 32.6%-51.2%). Besides the fact that AWSC intersections are 28 more difficult to predict by using the given independent variables, the total number of cases in 29 which the AWSC intersection is the preferred intersection design is limited to 688 (0.1%). DT 30 models tend to underestimate classes with few elements and overestimate classes with a lot of 31 elements. Without using any of the independent variables a model which always predicts a 32 roundabout would have a 50.9% accuracy. This is an issue that should be checked when taking a 33 closer look at the tree. The CRT with Twoing (6) is always the method with the best accuracy. 34 Table 4 also shows the accuracy rates for a validation test which splits the dataset in a 35 training and test set. Only the result of a 75% training and 25% test set size are shown. The 36 results of a 50%-50% and a 67%-33% split show similar though a little less accurate results. The 37 shown values are average values of 10 different runs, since the split is randomly determined. 38 39 40 TRB 2015 Annual Meeting Original paper submittal - not revised by author.
  • 11. Bezembinder, Wismans and Van Berkum 11 1 2 3 4 5 6 7 CHAID CHAID ECHAID ECHAID CRT CRT QUEST Cases Pearson Likelih. Pearson Likelih. Gini Twoing - All All 500,000 85.8% 85.9% 85.8% 85.9% 85.7% 87.3% 84.4% Classified by main type AW 688 36.9% 39.9% 36.6% 38.3% 45.1% 51.2% 32.6% TW 81,659 86.2% 86.4% 86.1% 86.1% 85.0% 86.8% 81.4% SIG 163,380 85.1% 85.1% 85.0% 85.4% 84.9% 87.2% 85.4% RA 254,273 96.2% 96.1% 96.2% 96.0% 95.6% 96.3% 94.9% All 500,000 90.7% 90.7% 90.6% 90.7% 90.2% 91.6% 89.3% Validation Training 375,000 85.4% 85.5% 85.5% 85.5% 85.8% 87.3% 83.8% Test 125,000 84.4% 84.4% 84.3% 84.5% 85.3% 86.3% 83.3% Modelled by size category 1 100,000 89.6% 89.7% 89.7% 89.8% 91.9% 91.8% 89.6% 2 100,000 71.6% 71.5% 71.3% 71.2% 76.4% 76.9% 67.6% 3 100,000 88.5% 88.5% 88.5% 88.5% 89.9% 90.0% 86.6% 4 100,000 91.6% 91.7% 91.7% 91.7% 93.3% 93.4% 91.6% 5 100,000 87.7% 87.9% 87.8% 87.9% 89.1% 89.3% 86.6% Table 4 Additional analysis of prediction accuracy for 500,000 dataset. 1 2 All the methods (fortunately) use the size category as the primary variable to split the 3 tree. This is rather logical since the size category is directly linked to the intersection design type 4 as shown in Table 1. The tree depth and model complexity can be reduced by creating separate 5 datasets for each size category and thus perform a DT analysis for each size category separately. 6 The lower part of Table 4 shows the results of these analyses, again performed with the 500,000 7 dataset. Each separate dataset now has a size of 100,000 cases. With the exception of size 8 category 2, the accuracy rates are higher than the ones for the undivided dataset. 9 Independent variables 10 The CRT algorithm provides the possibility to report the independent variable importance to the 11 tree model. Since the CRT Twoing method (6) in almost all cases gives the highest accuracy 12 rates, the results of this model will be used to analyze the mentioned variable importance. Table 13 5 shows the normalized importance of the ten topmost independent variables for the tree models 14 for each size category. The variable with the highest importance value is set to 100%. The model 15 for intersection designs of size category 1 can primarily by classified by the total demand volume 16 on the intersection, followed by the sum of the major left turning volumes and the sum of the 17 major through and left turning volumes. The latter is an important variable for all size categories. 18 Notably is that the total volume is far less important for larger size categories (3-5) and that the 19 table is dominated by absolute values in contrast with relative (percentage) ones. 20 Towards manageable DTs 21 Up till now, the maximum tree depth was set to 20, which produces rather large and complex 22 trees with up to 600 nodes and 16 levels. These cannot be used as a base for decision rules for 23 intersection design. The tree complexity can be reduced by reducing the maximum tree depth 24 and the number of independent variables and initiating tree pruning methods. The task is to 25 reduce the tree complexity without losing to much predictive accuracy. 26 Figure 3 shows a DT for intersection design of size category 3. In order to generate this 27 tree, again the CRT Twoing method is used. This time with a maximum tree depth of 3 levels, 28 only the ten most important independent variables according to Table 5, and pruning. 29 30 TRB 2015 Annual Meeting Original paper submittal - not revised by author.
  • 12. Bezembinder, Wismans and Van Berkum 12 1 2 3 4 5 Variable % Variable % Variable % Variable % Variable % 1 Total volume 100.0% Major through & left volume 100.0% Through volume 100.0% Through volume 100.0% Major through & left volume 100.0% 2 Major left volume 92.5% Through volume 82.2% Major through & left volume 97.1% Major through & left volume 95.8% Through volume 63.4% 3 Major through & left volume 92.0% Total volume 81.0% Major through & minor left volume 79.8% Major through volume 73.7% Major through & left percentage 61.2% 4 Left volume 88.5% Major left volume 79.9% Major through volume 77.4% Major through & minor left volume 73.6% Major through & minor left volume 61.0% 5 Minor volume 80.0% Major volume 77.0% EB through volume 56.0% EB left volume 71.4% Major through volume 59.0% 6 Minor through & left volume 72.1% Left volume 73.4% Through percentage 51.8% Minor through & left percentage 70.3% EB left volume 47.8% 7 Major volume 69.8% Major through & minor left volume 72.5% EB left volume 44.0% EB through volume 54.7% Left volume 46.7% 8 Through volume 61.9% Major through volume 65.9% WB through volume 29.1% Major left volume 47.3% Major left volume 46.5% 9 Minor through volume 54.1% Minor through volume 51.8% Minor through volume 28.1% Minor volume 41.8% Through percentage 39.8% 10 Major through & minor left volume 52.0% Minor through & left volume 49.9% Major left volume 28.0% Through percentage 41.4% Major volume 34.1% Table 5 Independent variable importance. 1 2 Size category 3 comprises the intersection design types SIG6, SIG7, SIG, RA1 and RA3. If the 3 total amount of through traffic on the intersection is less than or equal to 1434.5 pcu/h then the 4 RA3 is the advised design type. When the amount of through traffic is greater than the 5 mentioned value, a further split is made for through traffic at 2028.5 pcu/h. Below this limiting 6 value, the advice is still RA3, but with far less accuracy. With more than 2028.5 pcu/h through 7 traffic on the intersection the advice is either SIG7 or SIG6 dependent upon the amount of 8 through traffic on the minor road. The overall accuracy of this drastically compressed tree model, 9 is still 85.5% (compared to 90.0% of the uncompressed model). 10 TRB 2015 Annual Meeting Original paper submittal - not revised by author.
  • 13. Bezembinder, Wismans and Van Berkum 13 1 Figure 3 DT for intersection designs for size category 3. 2 CONCLUSIONS AND FURTHER RESEARCH 3 In this study we explored the possibilities of using a DT model in order to determine the 4 intersection design type solely based on traffic demand variables. Datasets were generated using 5 HCM 2010 intersection models for all-way sign controlled, two-way sign controlled and 6 signalized intersections and roundabouts, using the control delay of the intersection to determine 7 the preferred intersection design. We examined the predictive accuracy of various different DT 8 methods for several sizes of datasets and observed a satisfying accuracy rate in the range of 9 76.3% to 88.1%. The QUEST method gave the lowest accuracies while the CRT based methods 10 gave the highest values. The accuracy rates can be improved (up to 93.4%) by estimating 11 separate tree models for intersection designs for different size categories. 12 TRB 2015 Annual Meeting Original paper submittal - not revised by author.
  • 14. Bezembinder, Wismans and Van Berkum 14 The importance of independent demand variables differs over the size categories, although the 1 topmost ten to twenty variables are very similar and are predominantly absolute variables. 2 Calculation times stay within reasonable limits (minutes for 500,000 cases and hours for 3 5,000,000 cases). The accuracy can reasonably be maintained while reducing the tree complexity 4 in order to make it usable for intersection design rules. We showed an example with a tree depth 5 of three levels and seven nodes with an accuracy of 85.5%. Based on our research we can 6 conclude that it is possible to determine accurate intersection design rules by using DTs. 7 An issue that has not been dealt with in this paper concerns the fact that the control delay 8 of various intersection designs may be the same or very close. Then, it might be better to include 9 multiple intersection designs in de dataset for the DT. Preliminary tests with a 10% bandwidth, 10 generating a dataset with 648,066 cases showed that the accuracy rates were reduced to nearly 11 68%. The results at first seem disappointing, since the accuracy rates are quite low. However, 12 although the model should be able to produce a better fit with the provided data, the accuracy 13 rate is reduced due to the fact that when three preferred intersection designs are provided in the 14 dataset for one combination of demand variables and size category, only one will correctly be 15 predicted. The other two are incorrectly classified, this reducing the predicting accuracy. Another 16 measure has to be used to determine its worth, which is an issue for further research. 17 Other topics for further research concern the addition of the C4.5 and/or C5.0 algorithm 18 [27,28], comparing the performance of the resulting intersection design rules based on the DT 19 with existing rules, creating DTs for traffic safety and environmental impact related objectives 20 and examining and incorporating network effects. 21 ACKNOWLEDGMENT 22 This research is part of an ongoing PhD-research concerning the optimization of intersection 23 design in urban traffic networks which is funded by the Netherlands Organisation of Scientific 24 Research. 25 REFERENCES 26 1. AASHTO (2011) A Policy on Geometric Design of Highways and Streets, 6th Edition, 27 American Association of State Highway and Transportation Officials, Washington D.C. 28 2. Transportation Research Board (2010) Roundabouts: An Information Guide, Second Edition, 29 NCHRP Report 672, Transportation Research Board, Washington D.C. 30 3. FGSV (2007) Richtlinien für die Anlage von Stadtstraßen: RASt 06 (in German), 200, 31 Forschungsgesellschaft für Straßen- und Verkehrswesen, Köln. 32 4. Highways Agency (2012) Design Manual for Roads and Bridges (DMRB), Online version 33 June 2012, Highways Agency, London. 34 5. CROW (2002) Handboek Wegontwerp (in Dutch), Publications 164A-D, CROW, Ede. 35 6. CROW (2012) ASVV 2012 – Aanbevelingen voor verkeersvoorzieningen binnen de 36 bebouwde kom (in Dutch), Publication 723, CROW, Ede. 37 7. CROW (2008) Turborotondes (in Dutch), Publication 257, CROW, Ede. 38 8. Stamatiadis, N., Kirk, A., Agarwal, N. and Jones, C. (2012) Improving Intersection Design 39 Practices – Final Report, Research Report KTC-12-4 / SPR-09-380-1F, Kentucky 40 Transportation Center, University of Kentucky, Lexington. 41 9. Kirk, A., Jones, C. and Stamatiadis, N. (2011) Improving Intersection Design Practices, 42 Transportation Research Record, Volume 2223, 1-8. 43 TRB 2015 Annual Meeting Original paper submittal - not revised by author.
  • 15. Bezembinder, Wismans and Van Berkum 15 10. Han, L., Li, J-M. and Urbanik, T. (2008) Control-Type Selection at Isolated Intersections 1 Based on Control Delay Under Various Demand Levels, Transportation Research Record, 2 Volume 2071, 109-116. 3 11. Transportation Research Board (2000) Highway Capacity Manual 2000, Transportation 4 Research Board, Washington D.C. 5 12. Vitens, B.J. and K.W. Axhausen (2012) Shape Grammars for Intersection Type Choice in 6 Road Network Generation, paper presented at the 12th Swiss Transport Research Conference, 7 Ascona, May 2012. 8 13. Transportation Research Board (2010) Highway Capacity Manual 2010, Transportation 9 Research Board, Washington D.C. 10 14. Rokach, L. and Maimon, O. (2008) Data Mining with Decision Trees, Theory and 11 Applications, Series in Machine Perception and Artificial Intelligence – Vol. 69, World 12 Scientific Publishing, Singapore 13 15. Abellán, J., López, G. and Oña, J. de (2013) Analysis of traffic accident severity using 14 Decision Tree Rules via Decision Trees, Journal of Expert Systems with Applications, 15 Volume 40, Issue 15, 6047-6054. 16 16. Oña, J. de, López, G. and Abellán, J. (2013) Extracting decision rules from policy accident 17 reports through decision trees, Journal of Accident Analysis and Prevention, Volume 50, 18 1151-1160. 19 17. Lee, S. and Park, I. (2013) Application of decision tree model for the ground subsidence 20 hazard mapping near abandoned underground coal mines, Journal of Environmental 21 Management 127, 166-176. 22 18. Huang, C.S., Lin, Y.J. and Lin, C.C. (2008) Implementation of Classifiers for Choosing 23 Insurance Policy Using Decision Trees: A Case Study, WSEAS Transactions on Computers, 24 Issue 10, Volume 7, 1679-1689. 25 19. Ture, M., Kurt, I., Kurum, A.T. and Ozdamar, K. (2005) Comparing classification techniques 26 for predicting essential hypertension, Export Systems with Applications 29, 583-588. 27 20. Pal, M. and Mather, P.M. (2003) An assessment of the effectiveness of decision tree methods 28 for land cover classification, Remote Sensing of Environment 86, 554-565. 29 21. Breiman, L., Friedman, J., Olshen, R. and Stone, C. (1984) Classification and Regression 30 Trees, Wadsworth Int. Group. 31 22. Kass, G.V. (1980) An Exploratory Technique for Investigating Large Quantities of 32 Categorical Data, Applied Statistics, 29(2), 119-127. 33 23. Biggs, D., De Ville, B. and Suen, E. (1991) A Method of Choosing Multiway Partitions for 34 Classification and Decision Trees, Journal of Applied Statistics 18(1), 49-62. 35 24. Breiman, L., Friedman, J., Olshen, R. and Stone, C. (1984) Classification and Regression 36 Trees, Wadsworth Int. Group. 37 25. Loh, W.Y. and Shih, X. (1997) Split Selection Methods for Classification Trees, Statistica 38 Sinica 7, 815-840. 39 26. Lim, X., Loh, W.Y. and Shih, X. (2000) A Comparison of Prediction Accuracy, Complexity 40 and Training Time of Thirty-Three Old and New Classification Algorithms, Machine 41 Learning 40, 203-228. 42 27. Quinlan, J.R. (1986) Induction of decision trees, Machine Learning, Volume 1, 81-106. 43 28. Quinlan, J.R. (1993) C4.5: Programs for Machine Learning, Morgan Kaufman, Los Altos. 44 TRB 2015 Annual Meeting Original paper submittal - not revised by author.