Project104_Group713_ProgressReportI

ENS 491 – Graduation Project (Design)
Progress Report I
Project Title:
Mathematical Interpretation of the Linguistic Scale
What Does “Significantly More” Mean?
Project #104
Group Members:
Dilara ERŞAHİN - 17671
Sarp UZEL - 18184
Emir ÇAKAR - 16732
Berk ÇETİN - 16508
Supervisor: Kemal KILIÇ
Date: 20.11.2016

ABSTRACT
The purpose of this project is to determine the quantitative range corresponding to
individuals’ understanding of qualifiers such as “significantly more important” and
“extremely more important”, which decision makers use while comparing two
criteria. It is motivated by the scales associated with linguistic qualifiers that are commonly
utilized in multi-criteria decision making algorithms such as the Analytic
Hierarchy Process (AHP).
AHP, developed in 1977 by Thomas L. Saaty, uses a scale of 1-9 to assign such
importance relations among different criteria. However, this scale may be partially
responsible for the inconsistency in the comparison matrix that AHP uses to determine
the decision makers’ preferences. The project team’s concern is to find the reason
for this inconsistency and to propose recommendations for the formation of a scale that will
decrease the level of inconsistency in AHP.

Table of Contents
ABSTRACT
INTRODUCTION
OBJECTIVES AND INTENDED RESULTS
LITERATURE REVIEW
  1. SCALES IN AHP
    A. Linear
    B. Power
    C. Root Square
    D. Geometric
    E. Inverse Linear
    F. Asymptotical
    G. Balanced
    H. Logarithmic
  2. ELICITING TECHNIQUES IN MCDM
    A. Analytic Hierarchy Process
    B. Direct Point Allocation
    C. Simple Multi Attribute Rating Technique
    D. SWING Weighting
    E. TRADEOFF Weighting
  3. CONSISTENCY ISSUE
  4. INDIVIDUALIZATION OF SCALES
  5. EXPERIMENTS
    A. Distance from Milan
    B. Games of Chance
    C. Rainfall in November 2001
    D. Different Multi Attribute Methods
    E. Visual Psychophysics of Graphical Elements
THEORETICAL EXPERIMENTS
PILOT PSYCHOPHYSICAL EXPERIMENTS
  1. Speed of a Car
  2. Weight of a Bottle
PROJECT TASKS AND SCHEDULE
CONCLUSION
REFERENCES
APPENDICES
  Appendix A: Implementation of the matrix generator code
  Appendix B: Excel spreadsheet of matrices’ eigenvalues and acceptance/rejection values
  Appendix C: An example of the subset issue
  Appendix D: Minitab Worksheet for paired t-test

INTRODUCTION
With the rise of the Industrial Revolution, which started in the 1760s, and the demand that arose
during World War II for conducting military operations in the most effective way, decision
making became extremely crucial. However, the area of operations research was based on
decision making methods with a single objective in their focus. For instance, during World
War II, England tried to maximize the damaged area while using fewer bombs from its
inventory. But in reality, people try to maximize or minimize more than one objective. This
insufficiency of the existing methods was addressed by the introduction of various
multi-criteria decision making techniques, among which Thomas L. Saaty’s Analytic Hierarchy
Process (AHP) (Saaty 1980) is one of the most popular.
AHP can be described as a method of making decisions when there are multiple
criteria to be considered, and it is a tool for simplifying the procedure of evaluating these
criteria (Budak et al., 2006). In the evaluation of the criteria, a pairwise comparison matrix
has to be formed by using both a linguistic scale and Saaty’s original scale.
Saaty’s proposed scale can create inconsistency in the pairwise comparison matrices,
meaning that the decision to be made may not be accepted as consistent. A more detailed
explanation of the AHP method can be found in section 3.
This report contains a detailed description of the project objective and proposed
solution, the literature review that the project group has done so far, and the technical and
empirical developments. The project group was given the task of searching for solutions to the
inconsistency issue that has arisen in the implementation of the Analytic Hierarchy Process (AHP)
since its development. AHP is a widely used multi-criteria decision making method which is
based on generating comparison matrices through pairwise comparisons. It is well known in
the AHP literature that as the number of criteria increases, decision makers are prone to
become more inconsistent. Our claim is that the inconsistency depends not only on the
decision makers but also on the scale that is used in the AHP method. Our hypothesis is that Saaty’s
linear scale from 1 to 9 is not a good conversion from verbal statements to numerical values.
The linear scale fails to represent the meaning of “significantly more” and other similar
linguistic variables with their corresponding numerical values. The first step for the project
group was to research the AHP literature and gather as much information
as possible within the planned time frame.
The rest of the report is organized as follows. In section 2, we present the goals achieved
up to this stage of the project: the information on the existing scales used in
AHP, the different multi-attribute weighting techniques, the consistency issue, the
individualization of scales, and the experiments conducted in the literature. Section 3 explains
how the theoretical experiment for our scale analysis was conducted, and section 4 describes the
pilot empirical experiments and the resulting issues. Finally, section 5 states
the further tasks to be completed before the finalization of the project and gives the
timetable for the tasks due by Progress Report II.
OBJECTIVES AND INTENDED RESULTS
Our project is based on the issue of inconsistency in the comparison matrix
constructed by the decision maker. A large body of literature attributes this inconsistency to the
decision maker using a flawed approach while assigning the values of the comparison matrix. However, it is
highly probable that the problem is due to a scale that does not match the decision
maker’s opinions. Hence, our project goal is to investigate the role of such arbitrary scales in
the resulting inconsistency. “Clearly one scale may be appropriate for one application (or one
decision maker) and may not be appropriate for another. In this situation, a different scale
could and should be chosen for each application.” (Harker & Vargas, 1987). This issue creates
a need for a new approach to constructing a comparison scale that can ultimately be
used for a variety of problems as well as by all decision making individuals.
If this project succeeds, the project team will contribute to minimizing
the inconsistency that emerges from the comparison matrices of the AHP method.
As a result, linguistic qualifiers such as “significantly more”, which refer to the
importance relations between criteria, will be defined more accurately. So far, we have concluded that Saaty’s Linear
Scale does not completely reflect the importance values that decision makers attach to the criteria, and
we hope to propose a new approach to scales that will be applicable to most
problems requiring the AHP method and suitable for all decision makers.
The proposed solution will not cover the implementation procedures of the AHP method;
the focus will instead be on the theoretical basis of this method, in particular the
rating scales and the comparison steps of the procedure. We will conduct various comparison
experiments with individuals and gather data that will be used in the analysis. In the analysis,
we aim to utilize statistical data analysis techniques and optimization methods. The
optimization will be particularly helpful in determining the optimal scales that would minimize
the inconsistency of the decision makers. Note that even though only AHP is at the core of the
project, we will also incorporate other multi-criteria techniques such as SMART and Trade-Off
in order to discuss and validate the results obtained from the experiments.
At the end of this project, the team’s goal is to achieve a solution that is applicable
to all AHP problems and decision makers, rather than the existing solutions that
are specific to certain multi-criteria problem types. The intended result of this project is
an optimization program for individualized scales.

Although not developed yet, the optimization program should in theory work in the
following manner:
• A linguistic pairwise comparison matrix (LPCM) is taken as input from the decision
maker.
• The corresponding individualized weights for the given matrix are provided as
output.
After the implementation of the optimization, the resulting individualized weights will
be compared with the weights obtained by Saaty’s Linear Scale for the same
LPCM. In order to perform this comparison, the Random Index values for the optimized
individual scale should be computed with the project team’s existing C++ program.
The objective function of the overall optimization analysis can then be defined as minimizing
the deviation between the weights obtained by the individualized scale and by Saaty’s scale.
As the project progresses, the realistic constraints should not be
exceeded. As mentioned in the project’s Proposal, the constraints on this
project are economic, social, and time-related. For the economic constraint, the experiments should
have large sample sizes, so the materials used in these experiments may require a
budget, although the pilot experiments did not need any. For the social constraint,
the project team should be able to contact the subjects when needed, in case the
experiments have to be repeated. Lastly, the realized tasks should follow the project
schedule; so far the project team has had no issues with this constraint, and in fact the
project is running ahead of the planned timetable.

LITERATURE REVIEW
1. SCALES IN AHP
Throughout the years from AHP’s development to the present day, several scales of
judgement have been used in the implementation of this multi-criteria decision
making process. The differences between the scales emerged from the claim that the
linguistic statements may correspond to different numerical values. The verbal statements that
correspond to the parameters of the scales are the same for all scales and are shown in Table 1:
Parameter Order   Verbal Statement
1                 Equally important
2                 Weakly more important
3                 Moderately more important
4                 Moderately plus more important
5                 Strongly more important
6                 Strongly plus more important
7                 Demonstratedly more important
8                 Very, very strongly more important
9                 Extremely more important
Table 1: Verbal statements of linguistic scale
Using a different scale results in a different comparison matrix and thus a different
average eigenvalue and a different Random Index for the inconsistency decision. In our
project, we have studied eight of the judgement scales that are applicable to
AHP: Linear, Power, Root Square, Geometric, Inverse Linear,
Asymptotical, Balanced, and Logarithmic.

Figure 1 : Judgement scales used in AHP
A. Linear
The Linear scale, developed by Thomas L. Saaty in 1977, is the original
scale on which the AHP method is based. The main idea behind this scale is that the linguistic values
have a linear relationship among them, so the corresponding numerical values are
the integers from 1 to 9.
B. Power
The Power scale was developed by P. Harker and L. Vargas in 1987, and is based on
a power-law functional relationship. The linguistic values for this scale correspond to
the squares of the parameters: 1, 4, 9, 16, 25, 36, 49, 64, and 81.
C. Root Square
The Root Square scale was also formed by P. Harker and L. Vargas, in 1987. This
scale assumes a square root relation between the input parameters and the numerical values:
the values of the Linear scale are replaced by their square roots. The linguistic
statements therefore correspond to the numerical values 1, √2, √3, 2, √5,
√6, √7, √8, and 3.
D. Geometric
Freerk A. Lootsma developed the Geometric scale in 1989. Lootsma claimed that a
geometric relation between the linguistic values and the numerical values would form a
more applicable pairwise comparison matrix for the AHP procedure. The linguistic
values correspond to powers of 2 in this scale, so the numerical values are computed as
$2^{x-1}$ for x from 1 to 9: 1, 2, 4, 8, 16, 32, 64, 128, and 256.
E. Inverse Linear
The Inverse Linear scale was first introduced by D. Ma and X. Zheng in 1991, and
explains the relationship between the linguistic and numerical values as being inversely
linear. This scale, like the ones above, is based on the parameters x from 1 to 9, and computes
the corresponding numerical values as $\frac{9}{10-x}$, which results in 1, 1.13, 1.29, 1.5, 1.8, 2.25,
3, 4.5, and 9.
F. Asymptotical
The Asymptotical scale, unlike all the other scales of our study, takes numerical values
between 0 and 1. It was developed by F. J. Dodd and H. A. Donegan in 1995. The
parameters x from 1 to 9 are inserted into the function $\tanh^{-1}\left(\frac{\sqrt{3}\,(x-1)}{14}\right)$, which results in the
numerical values of 0, 0.12, 0.24, 0.36, 0.46, 0.55, 0.63, 0.7, and 0.76.
G. Balanced
The Balanced scale was developed by Ahti A. Salo and Raimo P. Hämäläinen in
1997. Unlike the other scales, the parameters x are first mapped to weights w of
0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, and 0.9. The function $\frac{w}{1-w}$ is then
applied to these weights, and the numerical values are computed as 1, 1.22,
1.5, 1.86, 2.33, 3, 4, 5.67, and 9.
H. Logarithmic
The Logarithmic scale was developed in 2010 by Alessio Ishizaka, Dieter
Balkenborg, and Todd R. Kaplan. By applying the function $\log_2(x + 1)$ to the
parameters x from 1 to 9, the corresponding numerical values are computed to be 1,
1.58, 2, 2.32, 2.58, 2.81, 3, 3.17, and 3.32.
The Power and Geometric scales result in higher values for the criterion with the
highest priority. With the Asymptotical and Root Square scales, the criterion with
the highest priority dominates the others.
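The numerical values listed for the scales above can be reproduced with a short script. The following is a minimal Python sketch, not part of the project's own code: the names `SCALES` and `scale_values` are illustrative, and the mapping from the parameter x to the Balanced weight, w = 0.45 + 0.05x, is an assumption inferred from the listed weights 0.5, 0.55, ..., 0.9. The Asymptotical scale is left out of the sketch.

```python
import math

# Parameters x = 1..9 correspond to the verbal statements in Table 1.
PARAMETERS = range(1, 10)

def balanced(x):
    # Assumed mapping from the parameter x to the weight w (0.5, 0.55, ..., 0.9).
    w = 0.45 + 0.05 * x
    return w / (1 - w)

# Closed forms as described in the text (names are illustrative).
SCALES = {
    "Linear": lambda x: float(x),
    "Power": lambda x: float(x ** 2),
    "Root Square": lambda x: math.sqrt(x),
    "Geometric": lambda x: float(2 ** (x - 1)),
    "Inverse Linear": lambda x: 9 / (10 - x),
    "Balanced": balanced,
    "Logarithmic": lambda x: math.log2(x + 1),
}

def scale_values(name):
    """Return the nine (unrounded) numerical values of the named scale."""
    return [SCALES[name](x) for x in PARAMETERS]

for name in SCALES:
    print(name, [f"{v:.3f}" for v in scale_values(name)])
```

Keeping the scales in a dictionary of functions makes it straightforward to add a further scale later without touching the rest of the script.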
All of the scales stated above generate different values for the maximum eigenvalue ($\lambda_{max}$) of
the computed matrices and different RI values; these values can be found in Figures 2 and 3.
Figure 2: Average $\lambda_{max}$ values for different scales and different matrix sizes

Figure 3 : RI values for different scales, and different matrix sizes
The differences in the $\lambda_{max}$ and RI values for different matrices are expected to result in
different inconsistency levels for the scales. The Geometric, Inverse Linear, and
Logarithmic scales were found to be less sensitive to
inconsistency than the other scales. On the other hand, the Root Square scale was
found to be the most sensitive, or the least tolerant, to inconsistency. The consistency issue
is explained further in the Consistency Issue section.
2. ELICITING TECHNIQUES IN MCDM
Several MCDM methods have been developed over the years, as
decision making has become crucial both in daily life and in critical decisions for
companies. Throughout our project, we have studied other applicable MCDM methods along
with AHP. The scope of our study included AHP, DIRECT, SMART,
SWING, and TRADEOFF. Each of these methods has a different approach to eliciting
the weights of the criteria.

A. Analytic Hierarchy Process
The AHP method, developed by Thomas L. Saaty in 1977, eases the
selection of an alternative when all the alternatives have multiple criteria. The decision
maker (DM) is first asked to arrange the attributes in a hierarchy. The next step is
to compare the importance of the attributes against one another. The comparison is
performed with a pairwise comparison matrix, in which the DM inserts values from the 1-9
Linear scale corresponding to the verbal statements of preference. The weight values can be
computed in three different ways: the original way is the eigenvector method, and the
two heuristic alternatives are the arithmetic mean and geometric mean methods.
a) Eigenvector Method
Although calculating the weights of each criterion with the eigenvector
method takes more time than the heuristic methods, it gives more accurate
weighting results. This approach has three basic steps. The first step is to find
the eigenvalues of the square comparison matrix. Second, the eigenvectors are
calculated using the eigenvalues found in the first step. The last step is to take the
eigenvector that corresponds to the maximum eigenvalue and normalize it. That eigenvector is
the principal eigenvector, which represents the weights of the criteria.
The method is applied to the following example matrix. Suppose A is our
pairwise comparison matrix, generated by using the Linear scale of 1-9.
$$A = \begin{pmatrix} 1 & 1/3 & 5 \\ 3 & 1 & 7 \\ 1/5 & 1/7 & 1 \end{pmatrix}$$
The corresponding eigenvalues and eigenvectors would then be:
$$\lambda = \begin{pmatrix} 3.0649 & 0 & 0 \\ 0 & -0.0324 + 0.4448i & 0 \\ 0 & 0 & -0.0324 - 0.4448i \end{pmatrix}$$
$$W = \begin{pmatrix} 0.3928 & \dots & \dots \\ 0.9140 & \dots & \dots \\ 0.1013 & \dots & \dots \end{pmatrix} \quad \text{so} \quad W_1 = \begin{pmatrix} 0.3928 \\ 0.9140 \\ 0.1013 \end{pmatrix}$$
Note that we do not have to calculate the entries marked “...”, since they do not belong to the
eigenvector of the highest λ value. Now, we need to normalize the values
of $W_1$:
$$W^* = \begin{pmatrix} 0.2790 \\ 0.6491 \\ 0.0719 \end{pmatrix}$$
where $W^*$ denotes the principal eigenvector holding the weights of the criteria. Note how the
values in $W$ are close to the values in $W^*$; this means that $W$ is a good estimator of $W^*$.
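The principal eigenvector of the example matrix can also be checked numerically. The sketch below is an illustration rather than the project's implementation: it swaps the full eigen-decomposition described above for plain power iteration, which only recovers the largest eigenvalue and its eigenvector, and the function name `principal_eigen` is hypothetical.

```python
def principal_eigen(matrix, iterations=100):
    """Approximate the largest eigenvalue and its eigenvector (normalized to
    sum to 1) by power iteration, a stand-in for a full eigen-decomposition."""
    n = len(matrix)
    w = [1.0 / n] * n
    lam = float(n)
    for _ in range(iterations):
        # v = A w
        v = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
        # Because sum(w) == 1, sum(A w) converges to the largest eigenvalue.
        lam = sum(v)
        w = [x / lam for x in v]
    return lam, w

# Example matrix A from the text (Linear scale of 1-9).
A = [[1, 1 / 3, 5],
     [3, 1, 7],
     [1 / 5, 1 / 7, 1]]

lambda_max, weights = principal_eigen(A)
print(round(lambda_max, 4), [round(x, 4) for x in weights])
```

Power iteration is sufficient here because a pairwise comparison matrix has only positive entries, so its largest eigenvalue is real and strictly dominant.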
b) Arithmetic Mean Method
The first step in this approach is to sum all the elements in a column of the
comparison matrix and then normalize each element of the column by dividing it by its
column sum. The decision maker applies the same step to every
column of the comparison matrix. Next, the elements in each row of the
normalized comparison matrix are summed, and each row sum is multiplied by
$\frac{1}{n}$, where n ≥ 1 denotes the size of the square matrix, to create the priority vector.
The elements of the priority vector represent the weights of the
corresponding criteria. This method is also called the normalized principal eigenvector method,
since we are not directly calculating the eigenvector of the comparison matrix, but
relaxing the problem and applying a heuristic to estimate the eigenvector.
Let us perform this method on the same matrix as in the Eigenvector Method section.
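$$A = \begin{pmatrix} 1 & 1/3 & 5 \\ 3 & 1 & 7 \\ 1/5 & 1/7 & 1 \end{pmatrix}, \quad \text{Sum} = \begin{pmatrix} \frac{21}{5} & \frac{31}{21} & 13 \end{pmatrix}$$
After applying the first step, our matrix becomes:
$$A = \begin{pmatrix} 5/21 & 7/31 & 5/13 \\ 15/21 & 21/31 & 7/13 \\ 1/21 & 3/31 & 1/13 \end{pmatrix}, \quad \text{Sum} = \begin{pmatrix} 1 & 1 & 1 \end{pmatrix}$$
The corresponding priority vector would then be:
$$W = \frac{1}{3} \begin{pmatrix} \frac{5}{21} + \frac{7}{31} + \frac{5}{13} \\ \frac{15}{21} + \frac{21}{31} + \frac{7}{13} \\ \frac{1}{21} + \frac{3}{31} + \frac{1}{13} \end{pmatrix} = \begin{pmatrix} 0.2828 \\ 0.6434 \\ 0.0738 \end{pmatrix}$$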
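The column-normalize and row-average steps of the arithmetic mean method can be written in a few lines. This is a minimal sketch using exact fractions so that the intermediate values match the worked example; the function name `arithmetic_mean_weights` is illustrative.

```python
from fractions import Fraction as F

def arithmetic_mean_weights(matrix):
    """Normalize each column by its sum, then average each row."""
    n = len(matrix)
    col_sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    normalized = [[matrix[i][j] / col_sums[j] for j in range(n)]
                  for i in range(n)]
    return [sum(row) / n for row in normalized]

# Same example matrix as in the Eigenvector Method section.
A = [[F(1), F(1, 3), F(5)],
     [F(3), F(1), F(7)],
     [F(1, 5), F(1, 7), F(1)]]

am_weights = arithmetic_mean_weights(A)
print([round(float(w), 4) for w in am_weights])
```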
c) Geometric Mean Method
This method relies on taking the geometric mean of each row
of the comparison matrix, which is why it is called the heuristic geometric weight
calculation. Again, the purpose of such a method is to estimate the eigenvector in order to
determine the weights of the criteria. The first step is to multiply all the elements in a
row of the comparison matrix and take the $n$th root of the product.
The same step is repeated for all rows of the comparison matrix. After the
calculations for each row are finished, the elements of the resulting n×1 vector of
$n$th roots are summed, and the vector is normalized by this sum to
obtain a good estimator of the eigenvector.
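Suppose our comparison matrix A and the step results are as follows:
$$A = \begin{pmatrix} 1 & 1/8 & 1/3 \\ 8 & 1 & 3 \\ 3 & 1/3 & 1 \end{pmatrix}, \quad P = \begin{pmatrix} 0.347 \\ 2.884 \\ 1 \end{pmatrix}, \quad W = \begin{pmatrix} 0.082 \\ 0.682 \\ 0.236 \end{pmatrix}$$
where P holds the $n$th roots of the row products and W, whose entries sum to 1, denotes the principal eigenvector.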
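The geometric mean calculation can be sketched as follows. This is an illustrative Python snippet (the function name `geometric_mean_weights` is not from the project) that returns both the vector of $n$th roots and the normalized weights for the example matrix.

```python
import math

def geometric_mean_weights(matrix):
    """Return (roots, weights): the n-th root of each row product and the
    same vector normalized so that its entries sum to 1."""
    n = len(matrix)
    roots = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(roots)
    return roots, [r / total for r in roots]

# Example matrix from the Geometric Mean Method section.
A = [[1, 1 / 8, 1 / 3],
     [8, 1, 3],
     [3, 1 / 3, 1]]

P, gm_weights = geometric_mean_weights(A)
print([round(p, 3) for p in P], [round(w, 3) for w in gm_weights])
```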
From this procedure the attribute weights are computed. The final step of AHP is
to measure the consistency or inconsistency level of the resulting matrices. The AHP
method accepts a matrix whose consistency index does not exceed 10% of the specified Random Index value.
B. Direct Point Allocation
In the DIRECT method, the decision maker (DM) is supposed to allocate 100 points
among all the criteria, which are also defined by the decision maker. In other words, the DM
gives a score to each criterion, and these scores add up to 100 points.
The criteria should be ranked from the most to the least important, so that the highest score is
assigned to the most important criterion and the less important criteria
receive lower scores. The assigned points are considered to be the attribute weights.

C. Simple Multi Attribute Rating Technique
The SMART method, pioneered by Ward Edwards in 1971, consists of two steps for
eliciting the weights of the criteria. The first step for the DM is to rank the
importance of the possible changes in the criteria from their worst levels to their best
levels. The following step is to make ratio estimates of the importance of each criterion
relative to the one ranked lowest in importance. The second step usually begins by
assigning 10 points to the criterion with the lowest importance and continues by assigning
more than 10 points to each of the remaining criteria according to its relative importance. At the end,
the points are normalized in order to get the attribute weights.
D. SWING Weighting
In the SWING method, which was developed by von Winterfeldt and Ward Edwards in
1986, the decision maker considers a hypothetical alternative in which all criteria are at their
worst levels and assigns 100 points to the criterion that he/she would most want to change to its best
level. Next, the DM decides which criterion should be changed to
its best level second and assigns fewer than 100 points to this criterion;
the same step is repeated until every criterion has been assigned points. Finally, the points
are normalized so that they sum to 1, and the resulting
normalized values are considered to be the attribute weights.
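The final normalization step shared by SMART and SWING is simple enough to show directly. The snippet below is a minimal sketch with made-up inputs: the criterion names and point values are hypothetical and are not data from any experiment.

```python
def normalize_points(points):
    """Turn raw points per criterion into weights that sum to 1."""
    total = sum(points.values())
    return {criterion: p / total for criterion, p in points.items()}

# Hypothetical SWING elicitation: the first-choice criterion receives 100
# points and every later choice receives fewer than 100.
swing_points = {"price": 100, "quality": 80, "delivery time": 40}

swing_weights = normalize_points(swing_points)
print({c: round(w, 3) for c, w in swing_weights.items()})
```

The same helper applies to SMART points, where the least important criterion starts at 10 points, because only the ratios between the points matter after normalization.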
E. TRADEOFF Weighting
In the TRADEOFF method, the decision maker is presented with a situation in
which he/she needs to decide between two alternatives that differ in the values of only two
attributes. The DM is supposed to rank the two attributes in his/her preferred
order of importance, and then change the value of one attribute so that the two
alternatives become equally preferred. Finally, the attribute weights are computed in the same
manner as in AHP.

3. CONSISTENCY ISSUE
Thomas L. Saaty proposed the AHP method with the
eigenvector method for the evaluation of pairwise comparison matrices. Although there are
other heuristic methods for pairwise matrix evaluation, we preferred to use the method Saaty
proposed.
According to the Perron-Frobenius Theorem, a given non-negative square matrix A
has an eigenvalue ($\lambda_{max}$) that is positive and greater than or
equal in modulus to its other eigenvalues. For that maximum eigenvalue, there is always a
corresponding positive eigenvector (w). The eigenvector w is referred to as the matrix’s
Frobenius root for the equation below (Alonso & Lamata, 2006):
$$Aw = \lambda_{max} w$$
After computation of the eigenvalue, a consistency check is needed in order to
accept or reject the pairwise comparison matrix. In order to compute the measure of
consistency, we first need to obtain two different indexes: the consistency index and the random
index. The consistency index is computed as
$$CI = \frac{\lambda_{max} - n}{n - 1}$$
where $\lambda_{max}$ is the maximum eigenvalue of the specific matrix, and n is the dimension
of the n×n matrix. The random index is specific to the scale used and to the size n of the
pairwise comparison matrices. It is tabulated for each judgement scale and is essentially
the average of the CI values of randomly generated matrices of that scale. The RI values for
the different scales we have studied are shown in Figure 3. After obtaining the values of CI
and RI, the consistency ratio is calculated as:
$$CR = \frac{CI}{RI}$$
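Since the random index is described as the average CI of randomly generated matrices, it can be estimated by simulation. The following Python sketch is an illustration under stated assumptions and is not the project team's C++ program: entries above the diagonal are drawn uniformly from the 17 values 1/9, ..., 1/2, 1, 2, ..., 9 of the Linear scale, $\lambda_{max}$ is approximated by power iteration, and all function names, the sample count, and the seed are hypothetical choices.

```python
import random

# The 17 values of the Linear scale and their reciprocals (assumed sampling set).
LINEAR_VALUES = [1 / k for k in range(9, 1, -1)] + [float(k) for k in range(1, 10)]

def lambda_max(matrix, iterations=100):
    """Largest eigenvalue of a positive matrix via power iteration."""
    n = len(matrix)
    w = [1.0 / n] * n
    lam = float(n)
    for _ in range(iterations):
        v = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
        lam = sum(v)              # valid because sum(w) == 1
        w = [x / lam for x in v]
    return lam

def random_reciprocal_matrix(n, values, rng):
    """Random pairwise comparison matrix with a_ji = 1 / a_ij and a_ii = 1."""
    m = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            m[i][j] = rng.choice(values)
            m[j][i] = 1.0 / m[i][j]
    return m

def estimate_random_index(n, values=LINEAR_VALUES, samples=1000, seed=0):
    """Average CI = (lambda_max - n) / (n - 1) over random reciprocal matrices."""
    rng = random.Random(seed)
    total_ci = 0.0
    for _ in range(samples):
        lam = lambda_max(random_reciprocal_matrix(n, values, rng))
        total_ci += (lam - n) / (n - 1)
    return total_ci / samples

print(round(estimate_random_index(3), 3))
```

Passing a different list of values for another judgement scale would give a scale-specific estimate in the same way; the estimate varies with the sample count and the seed.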

For the AHP eigenvector method, the acceptance of a matrix depends on the value
that CR takes. If the CI value exceeds 10% of the RI value, that is, if CR > 0.1, Saaty states that the matrix
cannot be accepted as consistent (Alonso & Lamata, 2006).
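As a worked check of the acceptance rule, the snippet below computes CI and CR for the 3×3 example matrix of the Eigenvector Method section, whose maximum eigenvalue is 3.0649. The random index of 0.58 is an assumption: it is the value commonly quoted for 3×3 matrices on Saaty's Linear scale and is not read from Figure 3. The function name is illustrative.

```python
def consistency_ratio(lambda_max, n, random_index):
    """Return (CI, CR) with CI = (lambda_max - n) / (n - 1) and CR = CI / RI."""
    ci = (lambda_max - n) / (n - 1)
    return ci, ci / random_index

# lambda_max = 3.0649 comes from the eigenvector example; RI = 0.58 is assumed.
ci, cr = consistency_ratio(3.0649, 3, 0.58)
accepted = cr <= 0.1
print(round(ci, 4), round(cr, 4), accepted)
```

Under this assumed RI the example matrix falls inside the 10% limit, so it would be accepted as consistent.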
In order for a matrix to be within the tolerance zone for consistency, two conditions
should be fulfilled: the matrix should possess the transitivity property, and the
$\lambda_{max}$ value should be in the specified interval of acceptance. For an n×n matrix A with
entries $a_{ij}$, transitivity states that the equation $a_{ij} \cdot a_{jk} = a_{ik}$ should be satisfied
(Bozóki & Rapcsák, 2007). This property is violated mostly by the decision makers’ selection
of numerical values from the given scale. A great amount of literature states that
the inconsistency occurs only due to the decision makers’ behaviour.
What we claim as a project team is that the inconsistency is due not only to the decision
makers, but also to the selection of the scale. Hence, our objective is to minimize the
inconsistency as much as possible, either by selecting the most suitable scale from the
literature or by developing a new scale that will enable us to decrease the inconsistency level.
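The transitivity condition can be tested directly on a matrix. The sketch below is illustrative (the function name and the tolerance are assumptions): it lists every index triple (i, j, k) for which $a_{ij} \cdot a_{jk}$ differs from $a_{ik}$.

```python
def transitivity_violations(matrix, tol=1e-9):
    """Return all (i, j, k) with a_ij * a_jk != a_ik, up to a tolerance."""
    n = len(matrix)
    return [(i, j, k)
            for i in range(n) for j in range(n) for k in range(n)
            if abs(matrix[i][j] * matrix[j][k] - matrix[i][k]) > tol]

# A perfectly consistent matrix built from the weights 4 : 2 : 1.
consistent = [[1.0, 2.0, 4.0],
              [0.5, 1.0, 2.0],
              [0.25, 0.5, 1.0]]

# The example matrix of the Eigenvector Method section: a_12 * a_23 = 7/3, not 5.
example = [[1, 1 / 3, 5],
           [3, 1, 7],
           [1 / 5, 1 / 7, 1]]

print(len(transitivity_violations(consistent)),
      len(transitivity_violations(example)))
```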
4. INDIVIDUALIZATION OF SCALES
Although widely used, Saaty’s AHP method has been subject to several criticisms
regarding the consistency issue. Alonso and Lamata (2006) criticized the 10% acceptance
limit for the CR value of the AHP method for being too strict. They suggested that
the decision makers themselves should decide on the level of consistency for their specific
problem at hand. What Dong et al. (2013) criticized is the procedure of letting the decision maker
assign pairwise comparison matrix entries from a numerical scale of 1 to 9. They
claimed that the decision makers should instead form a linguistic pairwise comparison matrix.
In their study, Dong et al. (2013) proposed an individualized scale, which is formed by
taking the verbal statements of preference from the decision makers and converting those
statements into numerical values with a function. The linguistic scale they defined for
AHP is:
Figure 4: Linguistic scale
They proposed two functions: $f_i$, which converts a linguistic entry to a
numerical value, and $f_i^{-1}$, which converts numerical values to linguistic entries. They
stated limits for the $f_i$ function:
$$LB < f_i < UB \quad \text{and} \quad f_i < f_{i+1}$$
where LB is the lower bound and UB is the upper bound. The bounds are calculated as the
minimum and maximum values of $S_i$, for LB and UB respectively, over the Linear, Geometric,
Inverse Linear, and Balanced scales used in AHP.
                 $S_0$    $S_1$    ...   $S_{16}$
Linear           1/9      1/8      ...   9
Geometric        1/256    1/128    ...   256
Inverse Linear   1/9      1/4.5    ...   9
Balanced         1/9      1/5.67   ...   9
Table 2: Linguistic scale values for different judgement scales
The decision maker is asked to form a linguistic pairwise comparison matrix by
assigning values only to the entries above the diagonal. For a 3x3 matrix, the decision
maker’s matrix would look like:
$$\begin{pmatrix} 1 & s_r & e_{rt} \\ \frac{1}{s_r} & 1 & s_t \\ \frac{1}{e_{rt}} & \frac{1}{s_t} & 1 \end{pmatrix}$$
With their individualized scale method, Dong et al. aimed to reduce the inconsistency
level mainly by enforcing the transitivity property in their matrices. Therefore, their method
needs an estimate of the $e_{rt}$ values, written here as $\hat{e}_{rt}$ (Dong et al., 2013):
$$\hat{e}_{rt} = f^{-1}\left(f(s_r) \cdot f(s_t)\right)$$
Then, the objective becomes minimizing the deviation between the estimated values $\hat{e}_{rt}$ and the $e_{rt}$
values, which is stated as:
$$\sum_{r,t=0}^{16} d\left(\hat{e}_{rt}, e_{rt}\right)$$
To sum up, individualized scales aim to minimize the inconsistency level and
enable the selection of the most suitable scale for a specific decision maker.
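The estimation step described above can be illustrated with a minimal sketch. It assumes the Linear scale as the numerical scale and assumes that the inverse function maps a product to the linguistic term with the nearest numerical value; this rounding rule and the names `LINEAR`, `f`, `f_inverse`, and `estimate` are our own illustrative choices, not taken from Dong et al. (2013).

```python
from fractions import Fraction

# Linear scale over the 17 linguistic terms s_0..s_16:
# 1/9, 1/8, ..., 1/2, 1, 2, ..., 9 (end points as in Table 2).
LINEAR = [Fraction(1, 9 - i) for i in range(8)] + [Fraction(k) for k in range(1, 10)]

def f(i, scale=LINEAR):
    """Numerical value attached to the linguistic term s_i."""
    return scale[i]

def f_inverse(value, scale=LINEAR):
    """Index of the term whose value is nearest to `value`.

    Nearest-value rounding is an assumption made for this sketch.
    """
    return min(range(len(scale)), key=lambda i: abs(scale[i] - value))

def estimate(r, t, scale=LINEAR):
    """Index of the estimated term f^-1(f(s_r) * f(s_t))."""
    return f_inverse(f(r, scale) * f(t, scale), scale)

# s_9 has value 2 and s_10 has value 3; their product 6 is the value of s_13.
print(estimate(9, 10))
```

Under these assumptions, a product that falls outside the scale (for example 5 * 5 = 25) is mapped to the extreme term, which is one reason the choice of inverse function matters.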
5. EXPERIMENTS
This project aims to select or develop a scale applicable to AHP, which will minimize
the inconsistency resulting from the Linear scale of Saaty. In order to achieve this goal, the first
step of our project was a review of the literature. The literature on this issue includes
both theoretical and psychophysical experiments.
A. Distance from Milan
Performed by Bernasconi, Choirat, and Seri (2010), this experiment is based on the
estimation of distances from a reference point. A total of 69 subjects performed
the experiment. The natural scale against which the subjects’ results were compared was based on the actual
distances in kilometers between Milan and the cities Naples, Venice, Rome, Turin, and Palermo.
The subjects were asked to compare the cities in pairs and state their estimates of the
ratios of distances. The natural scale is shown in Figure 5.
Figure 5: Natural scale of the experiment Distance from Milan
B. Games of Chance
The Games of Chance experiment was also done by Bernasconi, Choirat, and Seri.
This time, the 69 subjects were asked to estimate the probability ratios of game outcomes
that depend entirely on chance. The games of chance included
in this experiment are drawing cards from a deck of 52 and rolling a 6-sided die. Figure 6
shows the specific probabilities being asked.
Figure 6: Games of Chance experiment’s scale values

C. Rainfall in November 2001
This experiment, again done by Bernasconi, Choirat, and Seri, refers to the amount
of rainfall that occurred in November 2001 in five European countries. This time, the subjects
were asked to estimate the rainfall ratios in pairs for the five cities stated. Figure 7 shows
the basal scale values.
Figure 7: Rainfall values for the scale
Together with those of the Distance from Milan and Games of Chance experiments above, this
experiment’s resulting eigenvalues and eigenvectors were computed. The results
led to the claim that AHP creates a disconnect between the verbal
statements and the scientific numerical values for the decision maker. As stated by
Bernasconi et al.: “We have found that the distortions due to the subjective weighting
function are general and fairly robust across estimation experiments, and have shown that
our method can be applied to obtain greater consistency in the subjects’ ratio assessments”
(2010, p. 710).
D. Different Multi Attribute Methods
An experiment examining the effect of weighting with different multi-attribute
weighting methods (MAWM) such as SMART, DIRECT, SWING, TRADE-OFF, and AHP
was done by Mari Pöyhönen and Raimo P. Hämäläinen in 1991. This experiment was done
online and all of the subjects were students. The experiment was sent to the students via
e-mail. Subjects were asked to select any three alternative jobs they could consider and
two to five criteria for those jobs, so that they could weight each
criterion with the different methods listed above. The reason for leaving the job and
criteria selection up to the subjects was to eliminate the effect of dominating alternatives.
Subjects were given the freedom to use any score they desired to assign weights to criteria
using SMART, DIRECT, SWING, and TRADE-OFF. For AHP, subjects
used both the Linear and the Balanced scale; moreover, for some of the subjects the range of the
criteria was given, while for others it was not. Before weighting the criteria, subjects
ranked the alternative jobs from best to worst, to see whether the methods yield the same results as
the subjects’ preferences. The results showed that:
• Weights differ due to using a limited set of numbers.
• Decision makers do not use numbers to describe the strength of their preferences,
but only the rank of the attributes.
• The rank of attributes calculated with the methods did not correlate perfectly with the
rank of attributes given by a subject or calculated with another method.
• Decision makers tend to interpret the numbers differently from what the
theory assumes.
• Inconsistency is not totally due to human behavior but is also partly connected to the
numbers used in the methods.
• Inconsistency is higher with the 1-9 scale AHP comparisons.
• Properties of the AHP weights strongly depend on the evaluation scale.
• There were no differences between the versions of AHP where attribute ranges are presented or
not.
• A tendency to use integer numbers appears.
• Subjects used the presented evaluation scales without considering the meaning of
the actual numbers.
• DIRECT, SWING, and TRADE-OFF yield similar weight ratios.
• Averages are smaller with balanced scale AHP, compared to the 1-9 scale.
• All methods agree with the subjects’ opinion in approximately the same percentage of cases.
E. Visual Psychophysics of Graphical Elements
The experiment done by Ian Spence in 1990 was based on the psychophysical effects
of visual elements, and was divided into two parts. The experiment took place at the
University of Toronto. Subjects were paid 5 dollars each to participate, both to
motivate them to pay attention to the experiment and to attract more
participants, since larger samples tend to give more accurate estimates.
In the first stage of the experiment, different graphical primitives drawn in Microsoft C were
displayed on a monitor. Seven different graphical elements were used. The size of the
visual elements was not fixed; it was selected at random from a
uniform distribution over the available sizes: large, medium, and small.
The element types were horizontal line, vertical line, pie, disk slices, bars, box,
and cylinder graphs. Subjects in this experiment were asked to move a cursor along a
scale, using the right and left arrow keys, to estimate the proportion between the two divided parts
displayed in each graph. In some scales numerical values were present, but in others
they were hidden. Results showed that there was no effect on the response variables, which
are exponent, accuracy, and latency (Spence, 1990). The presence of a numerical scale
led to more accurate results. Also, varying the stimulus size of the graphs had no effect
on accuracy, but did affect the exponent. Last but not least, subjects made better
estimations when pie charts were displayed.
In the second stage of the experiment, Spence repeated the experiment
described above, but this time the size of the graphs did not vary. All
elements were displayed in medium size. The results show that the variability of the
latency and accuracy responses increased by a factor of 2 or 3 from the first
experiment to the second. The means of the accuracy and latency responses increased a little
compared to the first experiment. However, in the pie chart case the difference was
noticeable. This result was probably because the subjects got used to the experiment
and started making more accurate estimations.
THEORETICAL EXPERIMENTS
The theoretical experiment part of the project refers to computing all possible 3x3
pairwise comparison matrices for the Linear, Power, Square Root, Geometric,
Inverse Linear, Balanced, and Logarithmic scales. Each of these seven scales offers
17 numerical values that can be assigned to one entry. For an $n \times n$ matrix, there are
$17^{n(n-1)/2}$ possible combinations of that matrix’s entries, which gives $17^3 = 4913$
matrices for a 3x3 matrix.
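As an illustration of the enumeration described above, the following Python sketch generates every 3x3 reciprocal matrix whose upper-triangle entries are drawn from a 17-value scale. It is not the team’s C++ generator from Appendix A; the function names and the use of the Linear scale as the example are our own assumptions.

```python
from itertools import product

# Linear scale values for s_0..s_16: 1/9, ..., 1/2, 1, 2, ..., 9.
LINEAR = [1 / (9 - i) for i in range(8)] + [float(k) for k in range(1, 10)]

def count_matrices(n, terms=17):
    """Number of reciprocal n x n matrices: terms ** (n * (n - 1) / 2)."""
    return terms ** (n * (n - 1) // 2)

def reciprocal_matrices(scale, n=3):
    """Yield every n x n reciprocal matrix with upper-triangle entries from `scale`."""
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    for values in product(scale, repeat=len(pairs)):
        m = [[1.0] * n for _ in range(n)]
        for (i, j), v in zip(pairs, values):
            m[i][j] = v          # entry chosen by the decision maker
            m[j][i] = 1.0 / v    # reciprocal entry below the diagonal
        yield m

print(count_matrices(3))
print(sum(1 for _ in reciprocal_matrices(LINEAR)))
```

Only the entries above the diagonal are free, which is why the exponent is n(n-1)/2 rather than n squared.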
A program was developed in the C++ language. All possible
combinations of matrices were generated for each of the scales mentioned above. The
implementation of the matrix generator code is shown in Appendix A. The generated
matrices were used to calculate the maximum eigenvalues, which were then used to compute
the CI and CR values for each matrix generated. The CR value calculation uses the RI values
stated in Figure 3. After calculating each matrix’s CR value, the program selects the
matrices which are in the consistency tolerance zone, 10% of the RI values. The tolerance
interval for the maximum eigenvalue ($\lambda_{max}$) is computed as:
$$n \le \lambda_{max} \le 0.1 \cdot RI \cdot (n - 1) + n$$
The program assigns a value of 1 for the matrices which satisfy the above condition,
and 0 for the remaining rejected matrices. The resulting eigenvalue and 0 or 1 values are
transferred into an Excel spreadsheet, in order to decide whether the accepted matrices of each
scale are subsets of each other or not. The Excel spreadsheet is shown in the Appendix B.
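The acceptance test described above can be sketched in Python as follows. This is a hedged illustration, not the team’s C++ program: the maximum eigenvalue is approximated here by power iteration (our choice of method), and the random index value 0.58 for n = 3 is an assumed, commonly quoted figure that may differ from the RI values in Figure 3.

```python
RI_3 = 0.58  # assumed random index for n = 3; Figure 3 may use a different value

def lambda_max(m, iterations=200):
    """Approximate the principal eigenvalue and weights by power iteration."""
    n = len(m)
    w = [1.0 / n] * n
    lam = float(n)
    for _ in range(iterations):
        v = [sum(m[i][j] * w[j] for j in range(n)) for i in range(n)]
        lam = sum(v)              # w sums to 1, so sum(M w) estimates lambda_max
        w = [x / lam for x in v]  # renormalize the weight vector
    return lam, w

def consistency(m, ri=RI_3):
    """Return (lambda_max, CI, CR, flag); flag is 1 if the matrix is accepted."""
    n = len(m)
    lam, _ = lambda_max(m)
    ci = (lam - n) / (n - 1)
    cr = ci / ri
    # Upper bound of the tolerance interval; a tiny slack absorbs rounding error.
    accepted = 1 if lam <= 0.1 * ri * (n - 1) + n + 1e-12 else 0
    return lam, ci, cr, accepted

consistent = [[1, 2, 4], [1 / 2, 1, 2], [1 / 4, 1 / 2, 1]]
cyclic = [[1, 9, 1 / 9], [1 / 9, 1, 9], [9, 1 / 9, 1]]
print(consistency(consistent)[3], consistency(cyclic)[3])
```

The bound on the maximum eigenvalue is equivalent to requiring CR = CI / RI to be at most 0.1, with CI = (λmax − n) / (n − 1).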
According to the results, out of 4913 matrices, there are 1021 consistent matrices for the Linear
Scale, 1381 for the Power, 1893 for the Geometric, 1251 for the Balanced, 1317 for the
Inverse Linear, 871 for the Logarithmic, and 937 for the Square Root scales. Also, the
analysis of these matrices reveals that the consistent matrices for the Linear Scale are a subset
of the consistent matrices for the Power Scale. For the remaining scales, however, the consistent
matrices are not always subsets of the other scales’ consistent matrices, although they correspond to the same
verbal statements. An example of this issue is shown in Appendix C.
PILOT PSYCHOPHYSICAL EXPERIMENTS
A few of the many psychophysical experiments were mentioned and summarized in
section 2. In order to compare the effects of the scaling technique used in the decision making
process, experiments for which there is a known correct answer should be performed. The correct
answer, in this case, refers to a natural scale of numerical values.
In the AHP methodology, the decision maker can form the
comparison matrices either directly from a numerical scale, or by converting the linguistic
scale into numerical values; the values end up being the same. However, when the
numerical scale is changed to one other than Saaty’s Linear Scale, using the numerical scale
and using the linguistic scale no longer give the same values. Huizingh and Vrolijk (1997) stated that, in their
experimental results, usage of the linguistic scale corresponds to higher
levels of inconsistency, although the difference is not significant. We wanted to approach this
same issue from another point of view. In our pilot experiments we did not compute the
resulting consistency levels, but rather the transitivity errors and the weights corresponding to
the $\lambda_{max}$ values of the matrices. Our hypothesis was that the usage of the linguistic scale and
Saaty’s Linear Scale result in the same values of weights. Rejecting this hypothesis
means that either the linguistic scale or Saaty’s Linear Scale is better, in terms of being closer to
the real weight values formed via the natural scale. So we performed a paired t-test, and
also checked the confidence interval of the average differences in the weight values. A
positive confidence interval means that the Linear Scale results in weights closer to
the natural scale weights, and a negative confidence interval means the reverse.
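A minimal Python sketch of the paired comparison described above is given below. The data are hypothetical placeholders, not the experiment’s measurements, and the direction of the difference (first sample minus second sample) is our assumption about how the sign of the confidence interval is to be read.

```python
from math import sqrt
from statistics import mean, stdev

# Two-sided 95% critical values of Student's t for the two pilot sample sizes.
T_CRIT_95 = {4: 2.776, 9: 2.262}

def paired_t(a, b):
    """Return the paired t statistic and the 95% confidence interval of mean(a - b)."""
    d = [x - y for x, y in zip(a, b)]
    n = len(d)
    se = stdev(d) / sqrt(n)            # standard error of the mean difference
    t = mean(d) / se
    half = T_CRIT_95[n - 1] * se       # half-width of the confidence interval
    return t, (mean(d) - half, mean(d) + half)

# Hypothetical absolute weight errors against the natural scale (illustration only).
err_first = [2, 4, 6, 8, 10]
err_second = [1, 2, 3, 4, 5]
t_stat, interval = paired_t(err_first, err_second)
print(t_stat, interval)
```

If the whole interval lies above zero, the first sample has the larger errors, so the second scale would be the one closer to the natural scale; an interval containing zero gives no significant difference.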
1. Speed of a Car
In this experiment, the subjects were shown videos in which a car passes through the
same two points at different speeds. The videos were produced by speeding up an
original video taken by the group members. There were five videos in total, with speeds
of 25 km/h, 50 km/h, 75 km/h, 100 km/h, and 200 km/h. The subjects were asked
to make paired comparisons of the videos, ten comparisons in total. Each comparison
was requested both in the linguistic form of AHP and in numerical form on the
subject’s individual scale.
Since this was a pilot experiment, run in order to see possible outcomes and possible
problems of the experiment, there were only ten subjects, three females and
seven males. The subjects were all students of Sabancı University, from different
majors, who did not have knowledge of AHP.

The linguistic scale used for our experiment, its notation, and
the corresponding Linear Scale values are shown in Table 3.
Linguistic Scale            Notation   Saaty’s Linear Scale
Equally (fast)              s0         1
Weakly more                 s1         2
Moderately more             s2         3
Moderately plus more        s3         4
Strongly more               s4         5
Strongly plus more          s5         6
Demonstratedly more         s6         7
Very, very strongly more    s7         8
Extremely more              s8         9
Table 3 : The linguistic scale, its notation, and the Linear Scale values
According to the table above, our experiment should have the pairwise comparison
matrix:
Figure 8 : The pairwise comparison matrix of the experiment’s natural scale
Even though the natural scale is totally consistent, its value of $\lambda_{max}$ is 5.000001; the excess
value of 0.000001 is due to the rounding of the values. The above matrix also satisfies the
transitivity property, with an error of 0.1 computed by the formula
$$\sum_{i,j,k}^{n=5} \left| a_{ij} \cdot a_{jk} - a_{ik} \right|,$$
which is within the tolerance zone. This calculation was made by a program coded by the
project team, and the implementation of the code for this matrix is shown in Figures 9 and 10.
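The natural-scale matrix, its weights, and the transitivity error can be sketched in Python as follows. This is not the team’s program from Figures 9 and 10. It assumes that the matrix entry for videos i and j is the ratio of their speeds and that the transitivity error is the sum of the absolute deviations over all index triples; both are our reading of the text.

```python
SPEEDS = [25, 50, 75, 100, 200]  # km/h, as stated for the five videos

def natural_matrix(values):
    """Pairwise comparison matrix with a_ij = value_i / value_j (assumed orientation)."""
    return [[a / b for b in values] for a in values]

def weights(values):
    """For a perfectly consistent ratio matrix the weights are the normalized values."""
    total = sum(values)
    return [v / total for v in values]

def transitivity_error(m):
    """Sum of |a_ij * a_jk - a_ik| over all i, j, k (assumed aggregation)."""
    n = len(m)
    return sum(abs(m[i][j] * m[j][k] - m[i][k])
               for i in range(n) for j in range(n) for k in range(n))

exact = natural_matrix(SPEEDS)
rounded = [[round(x, 2) for x in row] for row in exact]
print(weights(SPEEDS))
print(transitivity_error(exact), transitivity_error(rounded))
```

With exact ratios the error is zero up to floating-point noise; rounding the entries to two decimals, as in a printed matrix, introduces a small nonzero error of the kind reported above.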

Figure 9 : Code implementation for the experiment
Figure 10 : Code implementation result for weight calculation and transitivity error

The data collected from the subjects only include the matrix entries $a_{12}$, $a_{13}$, $a_{14}$,
$a_{15}$, $a_{23}$, $a_{24}$, $a_{25}$, $a_{34}$, $a_{35}$, and $a_{45}$, which correspond to the ten comparisons made. The
subjects were free to assign any real number corresponding to their linguistic statements. The
subjects were expected to be consistent in their assignment of numerical values; however,
this was not the case. The majority of the subjects assigned the same numerical value to
different linguistic scale values, and some of them assigned the same linguistic value to
different numerical values.
The statistical analysis of the subjects’ assignments shows some unexpected
results:
• Only one subject assigned integer values for the linguistic scale
• Two subjects assigned numerical values only between 1 and 3
• One subject assigned a numerical value of 10 for the linguistic statement “extremely
more”
After collection of the data from the subjects, the weights, $\lambda_{max}$ values, and
transitivity errors were calculated for both the subjects’ numerical values and the corresponding Saaty’s Linear Scale
values. The resulting table is shown in Appendix D. The absolute differences between
the weights of the natural scale (N) of the experiment and the other two sets of collected
values (Saaty and Subject) were computed, and the statistical summary is
shown in Figure 11. Although the differences in the means are not significant, the weights
obtained from the subjects’ numerical values are closer to the natural scale values. But this is
only a basic approach to deciding on the better option.
Figure 11 : Statistical Summary of differences between weights of scales

For the computed differences, a paired t-test was performed in order to see which option is
better for obtaining the criteria weights. The statistical results of the paired t-test are shown in
Figure 12. The data show that, at the 95% confidence level, there seems to be no
significant difference between the weight values obtained by Saaty’s Linear Scale and the
individual scales of each subject.
Figure 12 : Statistical summary of the paired t-test
Even though the results are from a pilot experiment of ten subjects, the achieved
results are important for further study in the project. But, of course, a sample size of ten
subjects is not enough to reach an accurate result and decision. For further studies in this
project, the sample size will be increased, and the analyses will be performed again.
The problem that occurred in this experiment was that the subjects were not consistent in
their own assignments of numerical and linguistic scale values. So, for the second
experiment, the subjects may be informed beforehand that the
same numerical values should correspond to the same linguistic scale statements, and vice
versa.
2. Weight of a Bottle
In this experiment, the subjects were blindfolded and asked to hold two bottles filled
to different levels. There were five bottles, containing 100, 200, 300, 400, and
500 mL. As in the Speed of a Car experiment, the subjects were supposed to make pairwise
comparisons using both the linguistic scale of AHP and their own numerical
scale.
This experiment was performed in order to compare results with the previous
experiment. We asked only five subjects to take part in this pilot experiment,
three females and two males, who are again students of Sabancı University from
different majors, without knowledge of AHP.
The analysis for weight calculation, eigenvalue, and transitivity error was not
performed for this experiment. Until now, this experiment was only used in order to
compare the accuracy of the subjects’ pairwise comparison matrices, with respect to the
matrices formed by the natural scale.
The results show that subjects gave more accurate results, meaning that the matrices’
values were closer to the matrices based on the natural scale, for the Weight of a Bottle
experiment than for the Speed of a Car experiment. The difference may be due to
speeding up one video to obtain the four other videos from the original one. So, for the
second run of the experiment, all videos should be recorded separately in the same
way, without creating edited videos.
PROJECT TASKS AND SCHEDULE
The objective of this project was described in detail in section 2. In order to achieve
this goal, or even to come close to it, the remaining tasks planned by the project team
are collecting information on MATLAB coding techniques, developing an individualized
scale optimization using either MATLAB or the C++ language, conducting empirical
experiments on campus via the coded program, and adjusting the developed code if needed
after the experiment results.

For the upcoming Progress Report II, the tasks planned for completion are
collecting information on MATLAB coding techniques, deciding whether to use
MATLAB or the C++ language, and developing the optimization code for individualized scales.
The Gantt chart for the planned schedule until the deadline of Progress Report II is
shown in Figure 13.
Figure 13 : Gantt Chart for the tasks until Progress Report II
CONCLUSION
In the AHP method, the decision maker uses the Linear Scale of 1 to 9 to form the pairwise comparison
matrices. Although it is the most popular scale used in AHP,
there is a mismatch between the numerical values the decision maker would individually
assign and the numerical values available in Saaty’s scale. As a solution,
this project aims to provide an optimized individual scale for each decision maker to
use when assigning numerical values to the verbal statements of the linguistic scale. In order
to achieve the proposed solution, the project team has done wide research on the literature,
carried out theoretical and empirical experiments, and planned the further tasks
that will result in an optimization program. This report gives a detailed and comprehensive
explanation of the tasks realized from the earlier schedule stated in the Project Proposal report.
[Gantt chart content: days to complete, on an axis from 1-Jan to 31-Jan, for three tasks: information collection on MATLAB coding techniques; deciding on which coding language to use; developing the optimization program]

REFERENCES
Alonso, José Antonio, and M. Teresa Lamata. "Consistency In The Analytic Hierarchy
Process: A New Approach." International Journal of Uncertainty, Fuzziness and
Knowledge-Based Systems 14.04 (2006): 445-59. Web. 3 Dec. 2016.
Bernasconi, Michele, Christine Choirat, and Raffaello Seri. "The Analytic Hierarchy Process
and the Theory of Measurement." Management Science 56.4 (2010): 699-711. Web.
Bozóki, Sándor, and Tamás Rapcsák. "On Saaty's and Koczkodaj's Inconsistencies of
Pairwise Comparison Matrices." Journal of Global Optimization 42.2 (2007): 157-75.
Web.
Dong, Yucheng, Wei-Chiang Hong, Yinfeng Xu, and Shui Yu. "Numerical Scales Generated
Individually for Analytic Hierarchy Process." European Journal of Operational
Research 229.3 (2013): 654-62. Web.
Harker, P. T., and L. G. Vargas. "The Theory of Ratio Scale Estimation: Saaty's Analytic
Hierarchy Process." Management Science 33.11 (1987): 1383-403. doi:10.1287/mnsc.33.11.1383
Huizingh, E. K., and H. C. Vrolijk. "A Comparison of Verbal and Numerical Judgments
in the Analytic Hierarchy Process." Organizational Behavior and Human Decision
Processes 70.3 (1997): 237-47. doi:10.1006/obhd.1997.2708
Saaty, Thomas L. "How to Make a Decision: The Analytic Hierarchy Process." European Journal of
Operational Research 48.1 (1990): 9-26. Web. 1 Oct. 2016.
Spence, Ian. "Visual Psychophysics of Simple Graphical Elements." Journal of Experimental
Psychology: Human Perception and Performance 16.4 (1990): 683-92. Web.

APPENDICES
Appendix A : Implementation of the matrix generator code
Appendix B : Excel spreadsheet of matrices’ eigenvalues and acceptance/rejection values

Appendix C : An example of the subset issue
The highlighted row shows that the Power Scale’s (blue) consistent matrix, with a maximum
eigenvalue of 3.44302, is not among the consistent matrices of the Geometric Scale (yellow): the matrix
formed from the same linguistic statements by the Geometric Scale has a maximum eigenvalue of 4.48978.
This holds although the number of consistent matrices is
higher for the Geometric Scale than for the Power Scale.

Appendix D : Minitab Worksheet for paired t-test
