Chapter 6Techniques for Predictive ModelingBusiness.docx

Chapter 6:
Techniques for Predictive Modeling
Business Intelligence and Analytics: Systems for Decision
Support
(10th Edition)
Business Intelligence and Analytics: Systems for Decision
Support
(10th Edition)
Copyright © 2014 Pearson Education, Inc.
6-‹#›

1
Learning Objectives
Understand the concept and definitions of artificial neural
networks (ANN)
Learn the different types of ANN architectures
Know how learning happens in ANN
Become familiar with ANN applications
Understand the sensitivity analysis in ANN
Understand the concept and structure of support vector
machines (SVM)
(Continued…)
6-‹#›
Learning Objectives
Learn the advantages and disadvantages of SVM compared to
ANN

Understand the concept and formulation of k-nearest neighbor
algorithm (kNN)
Learn the process of applying kNN
Learn the advantages and disadvantages of kNN compared to
ANN and SVM
6-‹#›
Opening Vignette…
Predictive Modeling Helps Better Understand and Manage
Complex Medical Procedures
Situation
Problem
Solution
Results
Answer & discuss the case questions.

6-‹#›
4
Questions for the Opening Vignette
Why is it important to study medical procedures? What is the
value in predicting outcomes?
What factors do you think are the most important in better
understanding and managing healthcare?
What would be the impact of predictive modeling on healthcare
and medicine? Can predictive modeling replace medical or

managerial personnel?
What were the outcomes of the study? Who can use these
results? How can they be implemented?
Search the Internet to locate two additional cases in managing
complex medical procedures.
6-‹#›
Opening Vignette –
A Process Map for Training and Testing Four Predictive Models

6-‹#›
Opening Vignette
The Comparison of Four Models

6-‹#›
Neural Network Concepts
Neural networks (NN): a brain metaphor for information
processing
Neural computing
Artificial neural network (ANN)
Many uses for ANN for
pattern recognition, forecasting, prediction, and classification
Many application areas
finance, marketing, manufacturing, operations, information
systems, and so on

6-‹#›
8
Biological Neural Networks
Two interconnected brain cells (neurons)
6-‹#›

9
Processing Information in ANN
A single neuron (processing element – PE) with inputs and
outputs
6-‹#›

10
Biology Analogy
6-‹#›
11

Application Case 6.1
Neural Networks Are Helping to Save Lives in the Mining
Industry
Questions for
Discussion
How did neural networks help save lives in the mining industry?
What were the challenges, the proposed solution, and the
obtained results?
6-‹#›

Elements of ANN
Processing element (PE)
Network architecture
Hidden layers
Parallel processing
Network information processing
Inputs
Outputs
Connection weights
Summation function
6-‹#›

13
Elements of ANN
Neural Network with One Hidden Layer
6-‹#›
14
Elements of ANN

Summation Function for a Single Neuron (a), and
Several Neurons (b)
6-‹#›
15
Elements of ANN
Transformation (Transfer) Function
Linear function
Sigmoid (logical activation) function [0 1]

Tangent Hyperbolic function [-1 1]
Threshold value?
6-‹#›
16
Neural Network Architectures
Architecture of a neural network is driven by the task it is
intended to address
Classification, regression, clustering, general optimization,

association, ….
Most popular architecture: Feedforward, multi-layered
perceptron with backpropagation learning algorithm
Used for both classification and regression type problems
Others – Recurrent, self-organizing feature maps, Hopfield
networks, …
6-‹#›
17

Feed-Forward Neural Networks
Feed-forward MLP with 1 Hidden Layer
6-‹#›
18
Recurrent Neural Networks

6-‹#›
19
Other Popular ANN Paradigms
Self-Organizing Maps (SOM)
First introduced by the Finnish Professor Teuvo Kohonen
Applies to clustering type problems

6-‹#›
20
Other Popular ANN Paradigms
Hopfield Networks
First introduced by John Hopfield
Highly interconnected neurons
Applies to solving complex computational problems (e.g.,
optimization problems)

6-‹#›
21
Predictive Modeling is Powering the Power Generators
Questions for Discussion
What are the key environmental concerns in the electric power
industry?

What are the main application areas for predictive modeling in
the electric power industry?
How was predictive modeling used to address a variety of
problems in the electric power industry?
6-‹#›
Development Process of an ANN

6-‹#›
23
An MLP ANN Structure for the Box-Office Prediction
Problem

6-‹#›
24
Testing a Trained ANN Model
Data is split into three parts
Training (~60%)
Validation (~20%)
Testing (~20%)
k-fold cross validation
Less bias
Time consuming

6-‹#›
25
AN Learning Process
A Supervised Learning Process
Three-step process:
1.Compute temporary outputs.
2.Compare outputs with desired targets.
3.Adjust the weights and repeat the process.

6-‹#›
26
Backpropagation Learning
Backpropagation of Error for a Single Neuron

6-‹#›
27
Backpropagation Learning
The learning algorithm procedure
Initialize weights with random values and set other network
parameters
Read in the inputs and the desired outputs
Compute the actual output (by working forward through the
layers)
Compute the error (difference between the actual and desired
output)
Change the weights by working backward through the hidden
layers

Repeat steps 2-5 until weights stabilize
6-‹#›
28
Illuminating The Black Box Sensitivity Analysis on ANN
A common criticism for ANN: The lack of
transparency/explainability
The black-box syndrome!
Answer: sensitivity analysis

Conducted on a trained ANN
The inputs are perturbed while the relative change on the output
is measured/recorded
Results illustrate the relative importance of input variables
6-‹#›
29
Sensitivity Analysis on ANN Models
For a good example, see Application Case 6.3

Sensitivity analysis reveals the most important injury severity
factors in traffic accidents
6-‹#›
30
Sensitivity Analysis Reveals Injury Severity Factors in Traffic
Accidents

How does sensitivity analysis shed light on the black box (i.e.,
neural networks)?
Why would someone choose to use a blackbox tool like neural
networks over theoretically sound, mostly transparent statistical
tools like logistic regression?
In this case, how did NNs and sensitivity analysis help identify
injury-severity factors in traffic accidents?
6-‹#›
Support Vector Machines (SVM)
SVM are among the most popular machine-learning techniques.
SVM belong to the family of generalized linear models…

(capable of representing non-linear relationships in a linear
fashion).
SVM achieve a classification or regression decision based on
the value of the linear combination of input features.
Because of their architectural similarities, SVM are also closely
associated with ANN.
6-‹#›
32

Goal of SVM: to generate mathematical functions that map
input variables to desired outputs for classification or
regression type prediction problems.
First, SVM uses nonlinear kernel functions to transform non-
linear relationships among the variables into linearly separable
feature spaces.
Then, the maximum-margin hyperplanes are constructed to
optimally separate different classes from each other based on
the training dataset.
SVM has solid mathematical foundation!
6-‹#›

33
A hyperplane is a geometric concept used to describe the
separation surface between different classes of things.
In SVM, two parallel hyperplanes are constructed on each side
of the separation space with the aim of maximizing the distance
between them.
A kernel function in SVM uses the kernel trick (a method for
using a linear classifier algorithm to solve a nonlinear problem)
The most commonly used kernel function is the radial basis
function (RBF).
6-‹#›

34
Many linear classifiers (hyperplanes) may separate the data
6-‹#›

35
Managing Student Retention with Predictive Modeling
Why is attrition one of the most important issues in higher
education?
How can predictive analytics (ANN, SVM, and so forth) be used
to better manage student retention?
What are the main challenges and potential solutions to the use
of analytics in retention management?
6-‹#›

Application
Case 6.4
Managing Student Retention with Predictive Modeling
6-‹#›

How Does an SVM Work?
Following a machine-learning process, an SVM learns from the
historic cases.
The Process of Building SVM
1. Preprocess the data
Scrub and transform the data.
2. Develop the model.
Select the kernel type (RBF is often a natural choice).
Determine the kernel parameters for the selected kernel type.
If the results are satisfactory, finalize the model; otherwise
change the kernel type and/or kernel parameters to achieve the
desired accuracy level.
3. Extract and deploy the model.
6-‹#›

38
The Process of Building an SVM
6-‹#›

39
SVM Applications
SVMs are the most widely used kernel-learning algorithms for
wide range of classification and regression problems
SVMs represent the state-of-the-art by virtue of their excellent
generalization performance, superior prediction power, ease of
use, and rigorous theoretical foundation
Most comparative studies show its superiority in both
regression and classification type prediction problems.
SVM versus ANN?
6-‹#›

40
k-Nearest Neighbor Method (k-NN)
-demanding, computationally intensive
iterative derivations
k-NN is a simplistic and logical prediction method, that
produces very competitive results
k-NN is a prediction method for classification as well as
regression types (similar to ANN & SVM)
k-NN is a type of instance-based learning (or lazy learning) –
most of the work takes place at the time of prediction (not at
modeling)
k : the number of neighbors used

6-‹#›
41
k-Nearest Neighbor Method (k-NN)
The answer depends on the value of k
6-‹#›

The Process of k-NN Method
6-‹#›
Similarity Measure: The Distance Metric

Numeric versus nominal values?
k-NN Model Parameter
6-‹#›
Number of Neighbors (the value of k)
The best value depends on the data
Larger values reduce the effect of noise but also make

boundaries between classes less distinct
An “optimal” value can be found heuristically
Cross Validation is often used to determine the best value for k
and the distance measure
k-NN Model Parameter
6-‹#›
Efficient Image Recognition and Categorization with kNN

Why is image recognition/classification a worthy but difficult
problem?
How can k-NN be effectively used for image
recognition/classification applications?
6-‹#›
End of the Chapter
Questions, comments

6-‹#›
47
All rights reserved. No part of this publication may be
reproduced, stored in a retrieval system, or transmitted, in any
form or by any means, electronic, mechanical, photocopying,
recording, or otherwise, without the prior written permission of
the publisher. Printed in the United States of America.

6-‹#›
48
Soma
Axon
Axon
Synapse
Synapse
Dendrites
Dendrites
Soma
w

1
w
2
w
n
x
1
x
2
x
n
.
.
.
Y
Y
1
Y
n
Y
2
InputsWeightsOutputs
.
.
.

Neuron (or PE)
n
i
ii
WXS
1
)(Sf
Summation
Transfer
Function
(PE)
(PE)
(PE)
(PE)
(PE)
(PE)
(PE)
Transfer
Function
( f )
Weighted
Sum

(S)
x
1
x
2
x
3
Y
1
Input
Layer
Hidden
Layer
Output
Layer
x
1
x
2
2211
(PE)
(PE)
Y
(PE)

(PE)
w
1
w
1
w
11
w
21
w
12
w
22
w
23
x
1
x
2
Y
1
Y
2
Y
3

2121111
2221212
2323
(a) Single neuron(b) Multiple neurons
PE: Processing Element (or neuron)
X
1
=3
Processing
element (PE)
X
2
=1
X
3
=2
W
1
=
0
.
2

W
2
=0.4
W
3
=
0
.
1
Y=1.2
Summation function:
Transfer function:
Y = 3(0.2) + 1(0.4) + 2(0.1) = 1.2
Y
T
= 1/(1 + e
-1.2
) = 0.77
Y
T
=0.77
INPUT
LAYER
HIDDEN

LAYER
OUTPUT
LAYER
.
.
.
.
.
.
Voted “yes” or
“no” to legalizing
gaming
Predicted
vs. Actual
=
Socio-demographic
Religious
Financial
Other
Input 1
Input 2
Input 3
...
I n p u t
O

u
t
p
u
t
1
2
3
4
5
6
7
...
1
2
3
4
5
6
7
8
9
...
MPAA Rating (5)
(G, PG, PG13, R, NR)

Competition (3)
(High, Medium, Low)
Star Value (3)
(High, Medium, Low)
Genre (10)
(Sci-Fi, Action, ... )
Technical Effects (3)
(High, Medium, Low)
Sequel (2)
(Yes, No)
Number of Screens
(Positive Integer)
Class 1 -FLOP
(BO < 1 M)
Class 2
(1M < BO < 10M)
Class 3
(10M < BO < 20M)
Class 4
(20M < BO < 40M)
Class 5
(40M < BO < 65M)
Class 6
(65M < BO < 100M)
Class 7

(100M < BO < 150M)
Class 8
(150M < BO < 200M)
Class 9 -BLOCKBUSTER
(BO > 200M)
INPUT
LAYER
(27 PEs)
HIDDEN
LAYER I
(18 PEs)
HIDDEN
LAYER II
(16 PEs)
OUTPUT
LAYER
(9 PEs)
Compute
output
Is desired
output
achieved?
Stop
learning
Adjust

weights
Yes
No
ANN
Model
w
1
w
2
w
n
x
1
x
2
x
n
.
.
.
Y
i
Neuron (or PE)

n
i
ii
WXS
1
)(Sf
Summation
Transfer
Function
a(Z
i
–Y
i
)
error
D
1
Systematically
Perturbed
Inputs
Observed
Change in
Outputs

Trained ANN
“the black-box”
X
1
X
2
M
a
x
i
m
u
m
-
m
a
r
g
i
n
h
y
p
e

r
p
l
a
n
e
X
1
X
2
L
1
L
2
L
3
M
a
r
g
i
n
Pre-Process the Data
üScrub the data
“Identify and handle missing,

incorrect, and noisy”
üTransform the data
“Numerisize, normalize and
standardize the data”
Develop the Model
üSelect the kernel type
“Choose from RBF, Sigmoid
or Polynomial kernel types”
üDetermine the kernel values
“Use v-fold cross validation or
employ ‘grid-search’”
Deploy the Model
üExtract the model coefficients
üCode the trained model into
the decision support system
üMonitor and maintain the
model
Training
data
Pre-processed data
Validated SVM model
Prediction
Model
Experimentation
“Training/Testing”

X
Y
X
i
Y
i
k= 3
k= 5
Historic Data
New Data
Parameter Setting
üDistance measure
üValue of “k”
Training Set
Validation Set
Predicting
Classify (or Forecast)
new cases using k
number of most
similar cases

Chapter 6Techniques for Predictive ModelingBusiness.docx

Recommended

Recommended

More Related Content

Similar to Chapter 6Techniques for Predictive ModelingBusiness.docx

Similar to Chapter 6Techniques for Predictive ModelingBusiness.docx (20)

More from mccormicknadine86

More from mccormicknadine86 (20)

Recently uploaded

Recently uploaded (20)

Chapter 6Techniques for Predictive ModelingBusiness.docx