The author
Dr. L N Pattanaik (born 1969) earned his PhD from the Indian
Institute of Technology, Roorkee and his Master of Technology from
IIT-BHU, Varanasi, after a degree in Mechanical Engineering in
1993. He is presently Associate Professor in the Department of
Production Engineering, Birla Institute of Technology (deemed
university), Mesra, Ranchi, INDIA. He has been in the teaching and
research profession since 1997, with interests in the areas of
statistical tools in quality engineering, applications of soft
computing tools, cellular/reconfigurable/lean manufacturing, and
modeling and simulation.
More about the author at https://sites.google.com/site/lnpattana
Preface
Research work often requires one or more analytical tools for
problem solving, optimization, prediction, system modelling, data
analysis or interpretation. In a typical academic research work
or any professional research assignment, at some point the
researcher looks for a suitable analytical tool to progress the work
before arriving at inferences. However, a wrong selection of
tool for a particular problem can be counterproductive and
misdirect hard research effort. The fact is that for each research
problem or situation, one has to judiciously select the analytical
tool best suited to it. But, as the saying goes, if we hold a hammer
then every problem looks like a nail. This crucial research
tool selection should not be influenced by our own expertise
(the hammer), wrong suggestions from colleagues, or the latest trends
in tools found in the literature.
The rationale of this book is to guide the researcher in selecting
the best tool for the research problem by briefly introducing
various tools along with their typical features. After zeroing in
on a particular tool, more of its theory can be learned from
dedicated reference sources. Some illustrative examples using
popular software (MS Excel®, MATLAB®, MINITAB®, SPSS®, SYSTAT®,
LINGO®) are included to help readers in tool selection by bridging
the gap between theory and application. The author is aware that
newer software versions may differ slightly from the illustrations
provided in the book, and hopes the readers will take that in
their stride.
The book attempts to cover most of the
contemporary analytical tools, but giving an exhaustive description
of each tool is neither feasible nor the objective. Emphasis is
apportioned among the tools according to the author's judgment.
Serious readers are advised to do a dry run of the selected tool
using suitable commercial software or a programming tool before
applying it to their research problem. This book can serve as a
reference source for
various academic courses, dissertations, research projects and
assignments. The hypothetical illustrative examples cited in the
book are solved either through software or otherwise. The book
was written over a period of two years, keeping the needs and
circumstances of researchers in mind. The author will be grateful
to hear from readers about its usefulness, any factual errors, and
suggestions for improvement.
Wishing the reader a successful research endeavour.
L N Pattanaik
pattanaikbit@gmail.com
January, 2017
Contents

S. No.                                                              Page
Preface                                                               vi
Acknowledgements                                                    viii
1. INTRODUCTION                                                        1
   1.1 Research definitions and research methodology                   2
   1.2 Analytical Tools for research                                   4
   1.3 How to use this book?                                           6
2. STATISTICAL TOOLS                                                   8
   2.1 Correlation and Scatter plot (Excel® and SPSS®)                10
   2.2 Regression Analysis (Excel®, MINITAB®, SYSTAT® and SPSS®)      13
       2.2.1 Simple or bivariate regression                           13
       2.2.2 Multivariate regression                                  18
       2.2.3 Linear and nonlinear regression                          23
       2.2.4 Logistic regression                                      31
       2.2.5 Poisson regression                                       33
       2.2.6 Robust regression                                        34
       2.2.7 Path analysis                                            38
       2.2.8 Multi-stage regression                                   41
       2.2.9 Stepwise regression                                      44
       2.2.10 Best subsets regression                                 49
   2.3 Hypothesis Testing (Excel®, MINITAB® and SPSS®)                50
       2.3.1 t-test and z-test                                        51
       2.3.2 Chi-square Test                                          59
   2.4 Analysis of variance (ANOVA) and F-test (Excel®, SPSS® and MINITAB®)  64
   2.5 Design of Experiments (DOE) (MINITAB®)                         72
       2.5.1 Taguchi's robust design                                  78
       2.5.2 Response Surface Methodology (RSM)                       81
       2.5.3 Grey Relational Analysis (GRA)                           93
   2.6 Factor Analysis or Principal Component Analysis (SPSS® and SYSTAT®)  99
   2.7 Stratification and Pareto analysis (MINITAB®)                 103
   2.8 Statistical aspects in survey                                 105
3. TOOLS FROM OPERATIONS RESEARCH                                    109
   3.1 Linear Programming (LP) (Excel Solver®)                       110
       3.1.1 Sensitivity Analysis                                    113
       3.1.2 Integer Programming (LINGO®)                            114
   3.2 Nonlinear Programming (NLP) (LINGO®)                          119
   3.3 Modeling and Simulation                                       122
       3.3.1 Programming and software for simulation                 124
       3.3.2 Monte Carlo simulation                                  131
   3.4 Decision analysis and decision tree                           134
   3.5 Multiple Criteria Decision Making (MCDM)                      137
       3.5.1 Analytic Hierarchy Process (AHP)                        139
       3.5.2 Analytic Network Process (ANP)                          146
       3.5.3 TOPSIS                                                  149
Introduction
______________________________________________________
Imagination is the highest form of research
~ Albert Einstein
Let us think of research as a journey that one willingly
starts towards an unknown destination. During the journey, the person
carries a bag containing items such as programming skills, analytical
ability, interpersonal skills, experimental aptitude, expertise in
some tools, a laptop, balm for headaches and so on. This journey is no
doubt full of adventures and pitfalls, and plenty of instances of
unfinished journeys can be found. A successful journey, or piece of
research, depends on the contents of the bag and on any companion or
guide available, such as a thesis supervisor or a senior scientist.
Irrespective of the environment (academia, industry R&D or
research labs), genuine research has to rely on these resources. When a
researcher has just completed a set of experiments with much
hardship and is wondering what to do with the collected data, simply
reporting the data may be a justified research output if the experiment
was innovative in itself. But deeper information and knowledge
can only be obtained by subjecting the experimental results to
various tests and analyses.
This book intends to serve as a guide for the selection of an
appropriate analytical tool for various fields of research. The
different areas of research may include, but are not limited to, all the
disciplines of engineering, the core sciences, management and social
sciences, medicine and biotechnology, pharmacy and so on.
Irrespective of the research area, at some point the researcher may
come across a situation where an analytical tool is required to
proceed with the work. Sometimes one finds published
papers, reports or theses where the tools look completely out of sync
with the problem, or where the work seems to be built around a prominent
tool, undermining the research topic. Ideally, due importance
should be given to both, and the tool has to be suitable for the
research problem. There is no dearth of specialized
reference material on each individual tool, but deciding on and
selecting a particular tool may require an exhaustive search through,
and comprehension of, all potential tools, while still running the risk
of a wrong selection. To mitigate this painstaking effort and to serve
as a handy toolbox at the time of need, the rationale of this book was
conceived. The author attempts to cover most of the contemporary tools
here. As can be understood, it is not feasible to discuss every tool in
great detail. However, the key features and software-based illustrative
examples provided for these tools will certainly help readers map them
onto their own research problems.
1.1 Research Definitions and Research Methodology
By definition, research is "a careful investigation or inquiry
especially through search for new facts in any branch of
knowledge" (Advanced Learner's Dictionary). It is also understood
as a systematized effort to gain new knowledge. It is actually a
journey to discover new facts in a focused area of study. John W.
Best defines research as the "more formal, systematic, intensive
process of carrying on the scientific methods of analysis. It
involves a more systematic structure of investigation, usually
resulting in some sort of formal record of procedures and a report
of results or conclusions". Cook defined research as "an honest
Statistical Tools
______________________________________________________
Torture numbers and they'll confess to anything
~Gregg Easterbrook
Statistics, a body of methods for making
wise decisions in the face of uncertainty
~W.A. Wallis
Statistics makes immense contributions to research work in data
collection, interpretation, inference, analysis, significance testing
and more. From its simplest forms, the measures of central tendency and
dispersion, to the more complex design of experiments and robust
parameter design, researchers have frequently made use of the power of
this science. As it is the most commonly used tool for research data
analysis and inferential statistics, this book gives it first priority
in the discussion. Before explaining the various statistical tools,
some basics of statistics are discussed here.
The data types one may come across in a typical research work
can be broadly classified as follows:
Numerical/Quantitative (data expressed in numerical or
quantitative form)
a) Continuous variable (length in mm, temperature in °C, volume in mm³ etc.)
b) Discrete variable (in whole numbers; missing bolts, no. of flights, count of defects etc.)
Categorical/Qualitative (data expressed qualitatively in
categories)
a) Binary variable (only two options; present/absent,
correct/incorrect, yes/no etc.)
b) Nominal variable (categories having no hierarchical
relation; green/blue/red)
c) Ordinal variable (categories having hierarchical relation;
low/medium/high)
The selection of an appropriate statistical tool depends on the
data type also. For example, logistic regression and chi-square tests
are applicable for categorical data types only. In statistics, the
concepts of Population and Sample are used most often to draw
several inferences. Population represents a data source of nearly
infinite size but of the same kind whereas a random sample is a
small representation of the population which the investigator is
capable of examining. For example, selection of 100 phone
subscribers as a sample from the list of 5 million subscribers as
population. The study of various characteristics of the sample is
feasible but not of the population. Statistics helps here in
predicting or inferring more about the population from the sample
characteristics.
In any preliminary statistical analysis of a data set, the two
characteristics commonly examined are central tendency
(mean, median or mode) and dispersion (range, standard deviation
or variance). These two characteristics are often calculated or
estimated for both the sample and the population in various
applications of statistical tools.
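For readers who prefer to verify these quantities programmatically, the following minimal sketch (in Python, using a small hypothetical sample not taken from the book) computes the common measures of central tendency and dispersion.

import statistics

sample = [3.5, 3.7, 3.5, 3.5, 3.8, 3.1, 3.4, 3.3, 3.5, 3.6]  # hypothetical measurements

# Central tendency
mean = statistics.mean(sample)
median = statistics.median(sample)
mode = statistics.mode(sample)

# Dispersion
data_range = max(sample) - min(sample)
sample_sd = statistics.stdev(sample)       # sample standard deviation (n - 1 divisor)
sample_var = statistics.variance(sample)   # sample variance

print(f"mean={mean:.3f} median={median} mode={mode}")
print(f"range={data_range:.2f} sd={sample_sd:.3f} var={sample_var:.4f}")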
Some of the useful statistical tools that deserve explanation are
listed here.
Correlation and Scatter plot
Regression analysis
Hypothesis testing
t-test/z-test
Chi-square test
F-test and ANOVA
Design of Experiments (DOE)
Taguchi's robust design
Response Surface Methodology (RSM)
Grey Relational Analysis (GRA)
Factor analysis or Principal Component Analysis (PCA)
Stratification and Pareto analysis
Statistical aspects in survey
2.1 Correlation and Scatter Plot
Scatter plots are useful to graphically represent the correlation
between two quantitative variables or numerical continuous data.
Each data pair is represented by a point on the scatter diagram. The
correlation may be linear or non-linear (quadratic, cubic etc.), or
there may be no correlation. Linear correlations can further be interpreted as
positive or negative correlation depending on increase or decrease
of Y-axis value (response/dependent) with that of X-axis
(predictor/independent) respectively as depicted in Figure 2.1. A
linear correlation coefficient (r) is used to measure the strength of
the linear relation between the independent and dependent
variables.
$$ r = \frac{1}{n-1}\sum_{i=1}^{n}\left(\frac{x_i-\bar{x}}{\sigma_x}\right)\left(\frac{y_i-\bar{y}}{\sigma_y}\right), \qquad -1 \le r \le +1 $$
where n is the number of data pairs, σx and σy are the standard deviations
of the variables x and y, and x̄ and ȳ are the mean values of x and y.
Strong positive and strong negative linear correlations are indicated
by r values close to (+1) and (−1) respectively. An r value close to
zero indicates no linear correlation, although a non-linear correlation
may still exist. The coefficient of determination (r²) is useful for
measuring the proportion of the total variation in the response that is
due to the linear relation with the independent variable. In other
words, it is a quantitative measure of how well the regression straight
line fits the scatter plot. Its value lies in the range 0 ≤ r² ≤ 1, and
a higher value indicates a better fit.
Figure 2.2 Scatter diagram and
linear correlation coefficient (MS Excel®)
Using MINITAB®, the Pearson correlation coefficient (r) can also be
obtained as (−0.99602) through 'Stat' >> 'Basic Statistics' >>
'Correlation...'. In SPSS® software, the same data is entered as two
numeric variables, Aluminium and Hardness, before following the path
Analyze > Correlate > Bivariate; the output is produced as in Figure
2.3. The Pearson correlation coefficient of (−0.996) indicates the same
result and interpretation as earlier. It can also be observed from the
output that the Sig. (2-tailed) value is .000, which is the p-value
representing the significance of the correlation. A p-value less than
0.01, or 1%, means the inference is statistically significant at the
99% confidence level.
Correlations
                                    Aluminium    Hardness
Aluminium   Pearson Correlation         1          -.996**
            Sig. (2-tailed)                          .000
            N                          11            11
Hardness    Pearson Correlation       -.996**         1
            Sig. (2-tailed)            .000
            N                          11            11
** Correlation is significant at the 0.01 level (2-tailed).
Figure 2.3 Output from SPSS® software
for the correlation example
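The same Pearson correlation and its two-tailed p-value can also be computed programmatically. The sketch below uses Python with SciPy on hypothetical aluminium-content and hardness values (the actual data of the example are not reproduced in this preview), so the numbers it prints will differ from the −0.996 reported above.

from scipy import stats

aluminium = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 5.5, 6.0]  # hypothetical % content
hardness  = [62, 60, 57, 55, 52, 50, 47, 45, 43, 40, 38]              # hypothetical hardness values

r, p_two_tailed = stats.pearsonr(aluminium, hardness)
print(f"r = {r:.3f}, p-value (2-tailed) = {p_two_tailed:.4f}")
# A p-value below 0.01 would indicate that the correlation is statistically
# significant at the 99% confidence level, as interpreted in the text.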
2.2 Regression Analysis
When the researcher wants to establish a relationship among
independent (predictor or explanatory) variables and one or more
dependent (response or outcome) variables, regression analysis or
ANCOVA (Analysis of Covariance) is an appropriate statistical
tool. It is a statistical technique to determine the linear/nonlinear
mathematical relationship between two or more variables. Data
from experiments or other sources are available as ordered pairs
(y, x) or (y, x1, x2, ...). Using these data, a regression equation is
framed which can be used to predict or forecast outcomes for a known
set of inputs. Regression is also known as curve or line fitting
because the sum of the squared distances of the data points from the
curve or line is minimized (method of least squares). The more the
data, the better the curve fitting and hence the prediction. A
limitation of this statistical tool is that prediction of the response
beyond the range of the experimental data is not permissible. Similar
to correlation analysis, regression also fails to prove any causal
relationship between the independent and dependent variables. Popular
software packages for regression analysis include MINITAB®,
Statistica®, MS Excel®, MATLAB®, SYSTAT®, SPSS®, SAS® etc.
Based on the nature of the data and the number of
independent and dependent variables, the various classes of
regression analysis are
Simple or Bivariate regression
Multiple or multivariate regression
Linear and nonlinear regression
Logistic regression
Poisson regression
Robust regression
Path analysis
Multi-stage regression
Stepwise regression and
Best subsets regression
2.2.1 Simple or Bivariate regression
Regression problems having a single independent variable for one
dependent variable come under bivariate or simple regression
class. Both linear and nonlinear functions can be used to fit the line
or curve to the data.
Illustrative Example:
Mr A conducted some experiments on an Electro Discharge
Machining (EDM) process for his academic research. The 12
collected data pairs (x, y) are tabulated in Table 2.1. He can now
perform a simple regression analysis, using paper and pen or any
software, to find the relationship between the independent variable,
current, and the dependent variable, material removal rate (MRR).
The question is: "Is there a relationship between MRR (output) and
current (input)? If yes, how can it be formulated for future
prediction of the output MRR for a current of 1.63 units?"
First the researcher used MS Excel® to perform the regression
analysis through the following steps:
Step 1: Enter the data in two columns
Step 2: Open the Data tab and find Data Analysis (from the Analysis ToolPak add-in)
Expt. Current MRR Expt. Current MRR
1 1.2 3.5 7 1.3 3.4
2 1.4 3.7 8 1.1 3.3
3 1.6 3.5 9 1.6 3.5
4 1.5 3.5 10 1.8 3.6
5 1.7 3.8 11 1.5 3.5
6 1.1 3.1 12 1.2 3.2
Table 2.1 Experimental data for regression analysis
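As a cross-check on the spreadsheet route, the following minimal Python sketch fits the same least-squares line to the data of Table 2.1 and predicts MRR at a current of 1.63 units; the coefficients should agree, to rounding, with those produced by the MS Excel® regression tool.

from scipy import stats

current = [1.2, 1.4, 1.6, 1.5, 1.7, 1.1, 1.3, 1.1, 1.6, 1.8, 1.5, 1.2]
mrr     = [3.5, 3.7, 3.5, 3.5, 3.8, 3.1, 3.4, 3.3, 3.5, 3.6, 3.5, 3.2]

fit = stats.linregress(current, mrr)           # least-squares line: MRR = a + b * current
print(f"MRR = {fit.intercept:.3f} + {fit.slope:.3f} * current")
print(f"r-squared = {fit.rvalue**2:.3f}")

predicted = fit.intercept + fit.slope * 1.63   # prediction within the data range (1.1 to 1.8)
print(f"Predicted MRR at current 1.63: {predicted:.3f}")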
Figure 2.7 Data and summary output from MS Excel®
The same data, with MRR as the response and current and voltage as
predictors, are entered in a MINITAB® worksheet. The regression
analysis output is shown in Figure 2.8, which gives the linear
regression equation (enclosed in a rectangular box), R-squared (57.3%)
and adjusted R-squared (47.8%), exactly the same as obtained from
MS Excel®. The F-value of 6.04 is found to be more than
F(0.05, 2, 9), read as 4.26 from the table, indicating the significance
of the regression model. However, the high p-value (0.755) for
voltage indicates its insignificance as an independent variable.
Further, the low values of R-squared (57.3%) and adjusted R-squared
Illustrative Example:
A survey was conducted to understand the pattern of high blood
pressure in a group of corporate managers. The independent
variables considered as related to BP are weight, height, age and
sugar level. The collected data for 15 cases are as given in Table
2.3.
This multiple regression analysis, having four independent
variables and BP as the response variable, is attempted in
MINITAB® by entering the data in five columns of the worksheet.
Then, following Stat > Regression, the variables are appropriately
selected as response and predictors. In the Graphs dialog, the
standardized normal plot of the residuals is selected. The output and
normal probability plot are produced in Figure 2.10. As the points lie
approximately on a straight line in the normal plot, the residuals
approximately follow a normal distribution. The regression equation,
representing a hyper-plane in five dimensions, is expressed as
$$ \mathrm{BP} = b_0 + b_1(\mathrm{Weight}) - b_2(\mathrm{Height}) + b_3(\mathrm{Age}) - b_4(\mathrm{Sugar}) $$
(with the numeric coefficient values as reported in the MINITAB® output of Figure 2.10).
Suppose a case with weight 69, height 173, age 35 and sugar
level at 97 is considered for predicting the BP using the regression
analysis, then substituting the values in the equation, BP is
calculated as 103.20. But during such predictions, the numerical
values of the independent variables should be well within the
ranges for each variable as used in model development. For
example, cases like 82 years age, 50 kg weight etc. are beyond the
range of the model and cannot be used for prediction.
From the MINITAB®
output (Figure 2.10) it can be observed that the p-value for the
regression (ANOVA) is 0.000 (less than 0.05), which means that a
significant relationship exists among the variables. However, the
p-values for Height and Sugar are more than 0.05 while those for Age
and Weight are less. This clearly indicates that both Age and Weight
are related to the variation in the response BP while Height and Sugar
are not. This is an important piece of information which can be used
to modify the regression model.
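A similar multiple regression can be run programmatically. The sketch below uses Python's statsmodels on hypothetical placeholder values (Table 2.3 is not reproduced in this preview), so its coefficients and p-values are for illustration only and will not match Figure 2.10.

import numpy as np
import statsmodels.api as sm

# hypothetical predictors: weight (kg), height (cm), age (yr), sugar level
X = np.array([
    [69, 173, 35,  97],
    [80, 168, 52, 110],
    [75, 180, 41, 101],
    [90, 165, 58, 130],
    [62, 175, 29,  90],
    [85, 170, 47, 120],
    [72, 178, 38,  99],
    [95, 163, 61, 140],
])
bp = np.array([103, 128, 112, 142, 98, 133, 108, 150])  # hypothetical response

model = sm.OLS(bp, sm.add_constant(X)).fit()
print(model.summary())                          # coefficients, R-squared, F and p-values
print(model.predict(sm.add_constant(X))[:3])    # fitted BP for the first three cases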
Figure 2.10 Normal probability plot and MINITAB® output
2.2.3 Linear and nonlinear regression
With a set of collected data, either bivariate or multivariate, the
analyst may ponder whether to choose linear or nonlinear
regression analysis. Some important facts about these two types of
regression will help in making this decision:
Transformation of variables to convert the problem into a linear regression, or
Nonlinear regression
Transforming a problem whose scatter plot shows a curve into a
linear regression problem is one method of dealing with nonlinearity
in the data distribution. However, this depends entirely on the
trial-and-error expertise of the analyst. Further, transformation of
the variables also involves transformation of the noise or
disturbance, which affects the underlying assumptions. In spite of
these limitations, transformation of intrinsically linear or
transformably linear functions is practiced in regression analysis.
Intrinsically linear functions are those which can be converted into
linear functions by a suitable transformation. An illustrative example
of transformation is given here.
A bivariate regression analysis is conducted on 10 data points
available in the form (y, x1). From the scatter plot, a curve is
clearly found to exist between the response and the predictor. Using
MINITAB® (Stat > Regression > Fitted Line Plot), first a quadratic
and then a cubic curve is fitted, as shown in Figure 2.11. The R²
value is 85.5 for the quadratic (poor) and 96.5 for the cubic
(better). To further improve the curve fitting, the reciprocal or
inverse of the predictor is used. Figure 2.12 shows the quadratic
curve between the response and the inverse of the original predictor,
with an R² value of 99.9 (best). The regression equation, linear in
its coefficients and quadratic in (1/x), is found as
$$ y = b_0 - b_1\left(\frac{1}{x}\right) + b_2\left(\frac{1}{x}\right)^2 $$
(with the numeric coefficient values as reported in the MINITAB® output).
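The same idea can be reproduced programmatically: the minimal sketch below (hypothetical data, not the 10 points used above) regresses y on the reciprocal of the predictor by fitting a quadratic in 1/x with ordinary least squares.

import numpy as np

x = np.array([1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 5.5])
y = np.array([9.1, 6.8, 5.9, 5.4, 5.1, 4.9, 4.8, 4.7, 4.6, 4.55])  # hypothetical, curved

z = 1.0 / x                               # transformed predictor (1/x)
c2, c1, c0 = np.polyfit(z, y, deg=2)      # least-squares quadratic: y = c0 + c1*z + c2*z^2

fitted = c0 + c1 * z + c2 * z**2
ss_res = np.sum((y - fitted) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
print(f"y = {c0:.3f} + {c1:.3f}*(1/x) + {c2:.3f}*(1/x)^2, R^2 = {1 - ss_res/ss_tot:.4f}")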
Transformation to linear regression can also be conveniently
performed in an MS Excel® spreadsheet. After entering the response
and predictor data in two columns, first a scatter plot is to be
inserted. By selecting Add Trendline… from the scatter plot,
various trendline options for line fitting like exponential, linear,
logarithmic, polynomial, power and moving average will be
available to select from. The mathematical equation for the best fit
Tools from Operations Research
______________________________________________________
Operation research is scientific methodology - analytical,
experimental, quantitative - which by assessing the implication of
various alternative courses of action, provides an improved basis
for management decisions
~ Pocock
In this chapter, some analytical tools are presented that have their
roots in Operations Research (OR) and can solve a variety of problems
related to optimization, decision-making, selection of the best
alternative, assignment, scheduling etc. As the name suggests, these
tools are the outcome of research into optimizing certain operations.
Developed mostly during World War II, OR tools rely on mathematical
formulation and logical deduction to arrive at the best solution for a
given set of constraints or conditions. The major limitation of these
tools is their inability to handle non-quantifiable factors.
Applications of OR, also known as decision analysis tools, can be
found in different domains such as business enterprises, manufacturing
and industrial engineering, the service sector, supply chain
management and so on. These powerful tools have not been utilized
fully in research applications due to a lack of information on the
related software. However, they are used extensively in industry for
handling problems in production planning, personnel scheduling,
finance and budgeting, resource allocation, stock optimization,
portfolio management and inventory control. The ubiquitous MS Excel
Solver® can handle the majority of OR problems easily. The following
popularly used OR tools, having scope in research applications, are
discussed here:
Linear Programming (LP)
Integer Programming (IP)
Binary Integer Programming (BIP)
Mixed Integer Programming (MIP)
Binary Integer Linear Programming (BILP)
Non Linear Programming (NLP)
Modeling and Simulation
Decision analysis and decision tree
Multiple Criteria Decision Making (MCDM)
3.1 Linear Programming (LP)
This is the most popular OR technique for solving optimization or
allocation problems, using either the graphical method or the
analytical Simplex method. Small problems can be handled graphically
(for two variables) or manually, but for problems involving more
variables, computer software such as the MS Excel® Solver add-in,
LINDO®, MATLAB® etc. is useful. Before solving the problem, a
mathematical formulation is required that expresses the objective
function, which is to be optimized (either maximized or minimized),
in linear form in terms of the decision variables. The constraint
relationships among the variables are also expressed in the form of
linear equalities or inequalities.
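For researchers comfortable with a programming route instead of Excel Solver®, the minimal sketch below solves a small hypothetical two-variable LP (illustrative profit and resource figures, not taken from the book) using SciPy's linprog.

from scipy.optimize import linprog

# linprog minimizes, so the maximization objective is negated
c = [-30, -40]                      # profit coefficients of x1, x2 (hypothetical)
A_ub = [[2, 3],                     # machine-hours used per unit of x1, x2
        [4, 2]]                     # labour-hours used per unit of x1, x2
b_ub = [120, 140]                   # available machine-hours and labour-hours

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)], method="highs")
print("Optimal x1, x2:", res.x)
print("Maximum profit:", -res.fun)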
Metaheuristics
______________________________________________________
Heuristic is an algorithm in a clown suit.
~ Steve McConnell
Of course, this is a heuristic, which is a fancy way
of saying that it doesn't work.
~ Mark Dominus
The term metaheuristic was coined by Glover in 1986 by combining
the Greek prefix meta- (metá, beyond or of a higher level) with
heuristic (from the Greek heuriskein or euriskein, to search).
Heuristics are problem-dependent techniques whereas
metaheuristics are problem-independent techniques. Although
metaheuristics are problem-independent, generic techniques, it is
nonetheless necessary to do some restructuring and selection of their
intrinsic parameters in order to adapt the tool to a particular
problem. Further, a problem-specific implementation of a heuristic
algorithm written according to the guidelines of a metaheuristic
framework can also be referred to as a metaheuristic.
Metaheuristics have proved to be viable, and often superior,
alternatives to traditional operations research based methods such as
linear, integer, mixed and dynamic programming. For complex and
large problems, metaheuristics are capable of providing optimal or
near-optimal solutions with a good balance between solution quality
and computing time. However, metaheuristics are criticized for the
lack of a universally applicable design methodology, the lack of
scientific rigour in testing and comparing different implementations,
and the tendency of metaheuristics researchers to create intricate
methods modelled on biological patterns of species and nature.
Nevertheless, metaheuristics are also more flexible than traditional
methods in terms of easy adaptability to any real-life problem. Unlike
conventional optimization methods, restrictions such as integer
variables or linear/nonlinear equations are not a concern. The
solution from a metaheuristic may not be guaranteed to be the global
optimum in all instances, owing to the nature of the search algorithm,
but other advantages such as easy formulation, the ability to handle
complex problems and lower computational time weigh in its favour over
exact methods. These capabilities enable metaheuristics to solve a
majority of large, real-life optimization problems in research and in
practical applications. Several commercial software packages use
metaheuristics as the underlying algorithms in their optimization
engines. For example, genetic algorithms (GA) are used by SimRunner®
of ProModel®, and tabu search, GA and scatter search are used in
OptQuest®. But when a researcher is trying to solve a unique research
problem using any of the metaheuristics, high-level programming in
C/C++, MATLAB®, VBASIC, Java® etc. is a prerequisite, as commercial
software may not be of much help.
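As an indication of the kind of programming involved, the following minimal sketch implements a toy genetic algorithm from scratch in Python; the fitness function, parameter values and operators are illustrative assumptions, not a method prescribed by the book.

import random

def fitness(x):
    # toy objective to maximize: a concave function peaking at x = 3
    return -(x - 3.0) ** 2 + 10.0

def tournament(population):
    # binary tournament selection: pick two individuals, keep the fitter one
    a, b = random.sample(population, 2)
    return a if fitness(a) > fitness(b) else b

POP_SIZE, GENERATIONS, MUT_STD = 30, 100, 0.5
population = [random.uniform(-10.0, 10.0) for _ in range(POP_SIZE)]

for _ in range(GENERATIONS):
    best = max(population, key=fitness)          # elitism: remember the best so far
    offspring = []
    while len(offspring) < POP_SIZE - 1:
        p1, p2 = tournament(population), tournament(population)
        child = 0.5 * (p1 + p2)                  # arithmetic crossover
        child += random.gauss(0.0, MUT_STD)      # Gaussian mutation
        offspring.append(min(10.0, max(-10.0, child)))
    population = offspring + [best]              # carry the elite into the next generation

best = max(population, key=fitness)
print(f"best x = {best:.3f}, fitness f(x) = {fitness(best):.3f}")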
4.1 Combinatorial Optimization (CO) problems
Metaheuristics have often proved efficient in solving combinatorial
optimization (CO) problems in academia and in industrial practice.
Among optimization problems whose solution variables can only take
discrete values, there is a class of problems, called CO problems, in
which the best configuration of the variables is required to achieve
some goal. CO problems can also be formulated as linear or nonlinear
IP, as discussed in Chapter 3, but with much difficulty owing to the
complicated constraints and solution requirements. A CO problem
usually contains a huge number of
Figure 4.13 MATLAB® function and Pareto optimal front
The ten solutions, represented as points on the Pareto front found
by the algorithm, are given numerically in Table 4.2. It can be
observed from this set of optimal solutions that there is no solution
for which both fitness values f1 and f2 are inferior to those of any
other solution.
Sol Objective f1 Objective f2 x1 x2
1 -21922.3342528 -25934.585002590 -29.17623522283 -29.612877082111
2 -15830.45954592 -32222.115144986 -26.280679180414 -31.833098537540
3 -26616.61244151 -20404.906985259 -31.055557150414 -27.340117817865
4 -22936.986794234 -25505.78311401 -29.603773679597 -29.44891360409
5 -23827.324858954 -24027.99389984 -29.968607241193 -28.869354157216
6 -24810.93087209 -22020.181005088 -30.361188706569 -28.04254736906
7 -17624.978807409 -30859.363408263 -27.20162988194 -31.37821172964
8 -21101.651303911 -29680.232361892 -28.820617985159 -30.973687714374
9 -24010.53238651 -22378.82231343 -30.042542052669 -28.19381523936
10 -23122.043593963 -24543.621581016 -29.6803756604 -29.07420385881
Table 4.2 Non-dominated optimal solutions
on the Pareto front and their fitness
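The non-dominance claim can be checked programmatically. The sketch below tests the first five solutions of Table 4.2, assuming (as in MATLAB's multi-objective GA convention) that both f1 and f2 are being minimized.

fronts = [  # (f1, f2) of the first five solutions from Table 4.2
    (-21922.334, -25934.585),
    (-15830.460, -32222.115),
    (-26616.612, -20404.907),
    (-22936.987, -25505.783),
    (-23827.325, -24027.994),
]

def dominates(a, b):
    """a dominates b if a is no worse in both objectives and better in at least one."""
    return a[0] <= b[0] and a[1] <= b[1] and (a[0] < b[0] or a[1] < b[1])

non_dominated = all(
    not dominates(a, b)
    for i, a in enumerate(fronts)
    for j, b in enumerate(fronts)
    if i != j
)
print("All listed solutions are mutually non-dominated:", non_dominated)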
Artificial Intelligence Tools
______________________________________________________
A year spent in artificial intelligence
is enough to make one believe in God.
~ Alan Perlis
The term Artificial Intelligence (AI) was coined by John McCarthy in
1956 (at the Dartmouth Conference, USA). It is considered an extended
branch of computer science that aims to develop intelligent,
human-like decision-making in computers or computer-controlled
machines. Keeping the objective of this book in focus, only those AI
techniques which can be applied to various research problems as
analytical tools are discussed here. A major subset of AI tools comes
under the heading of Soft Computing (SC) tools. SC is a family of
tools that imitate human intelligence with the goal of creating some
human-like capabilities such as learning, reasoning and
decision-making. SC tools are based on Fuzzy Logic (FL), Artificial
Neural Networks (ANN) and Probabilistic Reasoning (PR) techniques such
as Genetic Algorithms (GA). These tools aim to exploit the tolerance
for imprecision, partial truth and uncertainty, similar to human
decision-making, in real-life research problems. Fusion or
hybridization of two or more SC tools is also more effective in some
situations. Apart from these SC tools, AI also encompasses tools such
as Expert Systems (ES) or Knowledge-Based Systems (KBS). The following
list of potential AI tools, which are
applicable to research problems dealing with design, search,
optimization or prediction, will be covered:
1. Fuzzy Logic
2. Artificial Neural Network
3. Genetic Algorithms (Chapter 4)
4. Hybrid of ANN, FL and GA
5.1 Fuzzy Logic
Fuzzy logic is based on the fuzzy set theory introduced by Zadeh
in 1965. In classical mathematics, a crisp set assigns the number 1 or
0 to each element, depending on whether or not the element belongs to
the set. This concept is sufficient for many areas of application, but
it can easily be seen that it lacks flexibility for some practical
applications. A straightforward way to generalize the concept is to
allow additional values between 0 and 1, where the intermediate values
denote gradual membership of the set. The membership function is a
graphical representation of the magnitude of participation of each
input. As shown in Figure 5.1, the input data can be fuzzified using
the membership plots. A rolling force of 170 kN can be expressed as
belonging to the two fuzzy sets LOW and MED with membership values of
0.3 and 0.7 respectively.
Figure 5.1 Fuzzification of a crisp value
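The fuzzification step of Figure 5.1 can be sketched in a few lines of code. The triangular membership breakpoints below are assumptions chosen only so that an input of 170 kN yields memberships of 0.3 (LOW) and 0.7 (MED) as in the text; the actual plots in the book may use different breakpoints.

def tri_membership(x, a, b, c):
    """Triangular membership function with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

LOW = (0.0, 100.0, 200.0)     # assumed breakpoints (kN)
MED = (100.0, 200.0, 300.0)   # assumed breakpoints (kN)

force = 170.0
print(f"mu_LOW({force}) = {tri_membership(force, *LOW):.2f}")   # 0.30
print(f"mu_MED({force}) = {tri_membership(force, *MED):.2f}")   # 0.70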
Hybrid Tools
____________________________________________________
The whole is greater than the sum of its parts
~ Aristotle
Synergy is almost as if a group collectively agrees to subordinate
old scripts and to write a new one
~ Stephen R. Covey
This chapter covers analytical tools that are hybrid in nature.
Although the avenues for combining two or more individual tools for
effective problem solving or other research purposes cannot be
exhausted, only some popular hybrid tools are discussed here.
6.1 Fuzzy MCDM
MCDM problems often involve the subjective views of human beings in
the decision-making process. Hence, fuzzy logic has been found
appropriate for integration with MCDM tools in order to capture the
ambiguity and vagueness of human judgments expressed in linguistic
terms. There are several published approaches for collecting opinions
in fuzzy linguistic terms and converting them into crisp numbers
later. Further, the stage at which the defuzzification occurs also
varies. Keeping the essence of fuzzy logic demands that the
defuzzification process take place at a later stage of the algorithm.
However, as dealing with fuzzy numbers is always more cumbersome than
dealing with crisp ones, some researchers are tempted
to defuzzify them as early as possible.
Based on a review report, more than 400 research application
papers on fuzzy MCDM had appeared in the areas of engineering,
science, business and management up to 2014, which is evidence of its
robustness and potential as an analytical tool. Further, researchers
are able to apply fuzzy MCDM tools to various strategic and
decision-making problems without the help of any software. The
majority of the decision problems reported relate to logistics and
supply chains (location management, supplier selection, and management
of waste, risk, water resources, forests etc.), transportation and
energy planning, product development and other such applications. The
following major fuzzy MCDM (FMCDM) tools are discussed here:
Fuzzy AHP (FAHP)
Fuzzy ANP (FANP)
Fuzzy TOPSIS
Fuzzy VIKOR
Fuzzy ELECTRE
Fuzzy PROMETHEE
Fuzzy MOORA
6.1.1 Fuzzy AHP (FAHP)
The Analytic Hierarchy Process (AHP) was discussed earlier in §3.5.1
as a multi-criteria decision making (MCDM) tool. However, the rating
scales used there for pairwise comparisons of criteria or alternatives
were crisp numbers, which does not reflect the imprecision and
vagueness of human judgment. Fuzzy theory, as described in §5.1,
accommodates this imprecision and partial truth in the input data.
Hence, a hybrid of fuzzy logic with conventional AHP becomes a more
effective MCDM tool for handling the subjective and linguistic inputs
from a survey or feedback. FAHP makes use of triangular fuzzy numbers
(TFN), each expressed as a triplet, to capture the vagueness in the
priorities during pairwise comparisons. A typical set of TFNs that can
be used in FAHP applications is given here.
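As an illustration of how such a scale can be handled in code, the minimal sketch below represents each TFN as a triplet (l, m, u) and forms its reciprocal for the lower triangle of a pairwise comparison matrix; the linguistic scale values shown are a commonly used convention and are not necessarily the exact set given in the book.

# One commonly used TFN scale (assumed values, not necessarily the book's set)
TFN_SCALE = {
    "equal importance":        (1, 1, 1),
    "moderate importance":     (2, 3, 4),
    "strong importance":       (4, 5, 6),
    "very strong importance":  (6, 7, 8),
    "extreme importance":      (8, 9, 9),
}

def reciprocal(tfn):
    # the reciprocal of a TFN (l, m, u) is (1/u, 1/m, 1/l)
    l, m, u = tfn
    return (1.0 / u, 1.0 / m, 1.0 / l)

# e.g. if criterion A is judged of "strong importance" over criterion B:
a_over_b = TFN_SCALE["strong importance"]
b_over_a = reciprocal(a_over_b)
print(a_over_b, b_over_a)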
INDEX
______________________________________________________
A
Adaptive control 207
Adjusted-R² 17
Agent based systems 207
aGPSS 129
AHP 139
Alternate hypothesis 50
Analytical tools 4
ANCOVA 13
ANFIS 265
ANN training 250
ANOVA 64
ANP 146
Ant Colony Optimization 199
Artificial Immune Sys. 206
Artificial Neural Net. 250
Aspiration concept 181
Assignment problem 119

B
Best subsets regression 49
Binary int. linear program. 114
Biological immune system 206
Bivariate regression 13
Box-Behnken Design 81
Branch-and-Bound algo. 114
Buckley's geometric mean 274

C
Categorical variable 31
Causal model 38
Causal relationship 10
Central Composite Design 81
Central Limit Theorem 51
Chi-square Test 59
Chromosome 189
Closeness coefficient 284
Clustering 261
Coefficient of determination 10
Coeff. Mult. Determination 16
Collinearity 44