This document provides an introduction to deep learning and fuzzy logic. It discusses neural networks, fuzzy sets, membership functions, fuzzy rules and inference. For fuzzy logic, it covers fuzzification, rule evaluation, aggregation, and defuzzification. It provides examples of fuzzy systems for modeling health based on height and weight. For deep learning, it describes the multilayer perceptron model and how it can be used for classification and regression with supervised learning.
Artificial Intelligence - Problems, State Space Search & Heuristic Search Techniques - Defining the Problems as a State Space Search
Production Systems
Production Characteristics
Production System Characteristics
Issues in the design of Search Programs
Particle swarm optimization (PSO) is an evolutionary computation technique for optimizing problems. It initializes a population of random solutions and searches for optima by updating generations. Each potential solution, called a particle, tracks its best solution and the overall best solution to change its velocity and position in search of better solutions. The algorithm involves initializing particles with random positions and velocities, then updating velocities and positions iteratively based on the particles' local best solution and the global best solution until termination criteria are met. PSO has advantages of being simple, quick, and effective at locating good solutions.
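The update loop described above can be sketched in a few lines; the inertia and acceleration coefficients (w, c1, c2), the box bounds, and the population size below are illustrative assumptions, not values from the document:

```python
import random

def pso(f, dim, n_particles=30, iters=200, w=0.7, c1=1.5, c2=1.5, bounds=(-5.0, 5.0)):
    """Minimize f over a box using a basic particle swarm."""
    lo, hi = bounds
    pos = [[random.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                      # each particle's best position
    pbest_val = [f(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]     # swarm-wide best position
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = random.random(), random.random()
                # Velocity pulled toward the particle's own best and the global best
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val
```

On a smooth test function such as the sphere function, this sketch typically converges close to the optimum within a few hundred iterations.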
At first, the numbers to be stored in a computer system were few and drawn from a narrow range, so a simple table T[0, 1, ..., m − 1], called a direct-address table, could be used to store them. As situations grew more complex, a new idea emerged:
Definition
An associative array, map, symbol table, or dictionary is an abstract data type composed of a collection of tuples {(key, value)}.
This can be seen in the example of dictionaries in any spoken language. The problem became harder when the range of possible values for the keys became unbounded. A new type of data structure was therefore needed to avoid the sparsity problem in the data: the hash table.
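As a sketch of the idea, a minimal separate-chaining hash table (the class name and default bucket count are illustrative choices) maps an unbounded key range onto a small table without the sparsity of direct addressing:

```python
class HashTable:
    """A minimal separate-chaining hash table mapping keys to values."""
    def __init__(self, m=8):
        self.buckets = [[] for _ in range(m)]

    def _bucket(self, key):
        # Compress the unbounded key space into m buckets
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value):
        b = self._bucket(key)
        for i, (k, _) in enumerate(b):
            if k == key:
                b[i] = (key, value)   # overwrite an existing key
                return
        b.append((key, value))

    def get(self, key):
        for k, v in self._bucket(key):
            if k == key:
                return v
        raise KeyError(key)
```

Collisions are resolved by chaining: keys that hash to the same bucket simply share its list.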
This document discusses real-time operating systems (RTOS). It begins by defining an RTOS and distinguishing it from traditional operating systems by its ability to respond to external events in a timely manner. It describes the different types of RTOS based on timing constraints. It then covers key RTOS concepts like preemptive priority scheduling, multitasking, inter-task communication, priority inheritance, and memory management. The document also discusses the Nucleus RTOS and whether RTOS will replace traditional operating systems.
Knapsack problem: given some items, pack the knapsack to get the maximum total value. Each item has some weight and some value. The total weight that we can carry is no more than some fixed number W, so we must consider the weights of items as well as their values.
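A standard way to solve the 0/1 version of this problem is dynamic programming over capacities; this sketch assumes integer weights:

```python
def knapsack(values, weights, W):
    """0/1 knapsack: max total value with total weight <= W (dynamic programming)."""
    # dp[w] = best value achievable with capacity w
    dp = [0] * (W + 1)
    for i in range(len(values)):
        for w in range(W, weights[i] - 1, -1):   # descending: each item used once
            dp[w] = max(dp[w], dp[w - weights[i]] + values[i])
    return dp[W]
```

Iterating capacities in descending order ensures each item is counted at most once, which is what distinguishes 0/1 knapsack from the unbounded variant.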
This presentation talks about Real Time Operating Systems (RTOS). Starting with fundamental concepts of OS, this presentation deep dives into Embedded, Real Time and related aspects of an OS. Appropriate examples are referred with Linux as a case-study. Ideal for a beginner to build understanding about RTOS.
The document discusses problem-solving agents and uninformed search strategies. It introduces problem-solving agents as goal-based agents that try to find sequences of actions that lead to desirable goal states. It then discusses formulating problems by defining the initial state, actions, goal test, and cost function. Several examples of problems are provided, like the Romania tour problem. Uninformed search strategies like breadth-first search, uniform-cost search, and depth-first search are then introduced as strategies that use only the problem definition, not heuristics. Breadth-first search expands nodes in order of shallowest depth first, while depth-first search expands the deepest node in the frontier first.
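The shallowest-first behavior of breadth-first search can be sketched as follows (the `neighbors` callback and dict-based parent tracking are implementation choices, not from the document):

```python
from collections import deque

def bfs(start, goal, neighbors):
    """Breadth-first search; returns a path with the fewest actions, or None."""
    frontier = deque([start])
    parent = {start: None}
    while frontier:
        node = frontier.popleft()          # shallowest node first
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in neighbors(node):
            if nxt not in parent:          # avoid revisiting states
                parent[nxt] = node
                frontier.append(nxt)
    return None
```

Swapping the deque's `popleft` for `pop` would turn this into depth-first search, which expands the deepest node in the frontier first.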
The document discusses various backtracking techniques, including bounding functions, promising functions, and pruning to avoid exploring unnecessary paths. It provides examples of problems that can be solved using backtracking, including n-queens, graph coloring, Hamiltonian circuits, sum-of-subsets, and 0-1 knapsack. Search techniques for backtracking problems include depth-first search (DFS), breadth-first search (BFS), and best-first search combined with branch-and-bound pruning.
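As one concrete instance, n-queens with a promising-function check looks like this (a minimal sketch; the row-by-row placement scheme is a common convention):

```python
def n_queens(n):
    """Count solutions to n-queens with backtracking and a promising check."""
    count = 0
    cols = []                              # cols[r] = column of the queen in row r
    def promising(r, c):
        # No shared column and no shared diagonal with earlier rows
        return all(c != cols[i] and abs(c - cols[i]) != r - i for i in range(r))
    def place(r):
        nonlocal count
        if r == n:
            count += 1
            return
        for c in range(n):
            if promising(r, c):            # prune non-promising branches
                cols.append(c)
                place(r + 1)
                cols.pop()
    place(0)
    return count
```

The promising check is what keeps the search tractable: entire subtrees are skipped as soon as a partial placement conflicts.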
Kevin Knight, Elaine Rich, B. Nair - Artificial Intelligence (2010, Tata McGr...) - JayaramB11
This document discusses the history of chocolate production. It details how cocoa beans are harvested from cocoa trees and then fermented, dried, roasted, and ground into chocolate liquor. The liquor is then further processed through conching and tempering to produce chocolate in its familiar solid form.
The document discusses local search algorithms, including gradient descent, the Metropolis algorithm, simulated annealing, and Hopfield neural networks. It provides details on how each algorithm works, such as gradient descent taking steps proportional to the negative gradient of a function to find a local minimum. The algorithms are compared, with some having similarities in their methods, like maximum cut problem and Hopfield neural networks using state flipping algorithms, and Metropolis and gradient descent using simulated annealing. Advantages and disadvantages of local search algorithms are presented.
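The gradient-descent rule mentioned above, taking steps proportional to the negative gradient, fits in a few lines (the learning rate and iteration count are illustrative):

```python
def gradient_descent(grad, x0, lr=0.1, iters=100):
    """Step opposite the gradient of a function to find a local minimum."""
    x = x0
    for _ in range(iters):
        x = [xi - lr * gi for xi, gi in zip(x, grad(x))]
    return x
```

For f(x) = (x - 3)^2, whose gradient is 2(x - 3), the iterates contract toward the minimizer x = 3 by a constant factor each step.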
A recurrent neural network (RNN) is one of the two broad types of artificial neural network, characterized by the direction of information flow between its layers. In contrast to the uni-directional feedforward neural network, it is a bi-directional artificial neural network, meaning that it allows the output from some nodes to affect subsequent input to the same nodes. Their ability to use internal state (memory) to process arbitrary sequences of inputs makes them applicable to tasks such as unsegmented, connected handwriting recognition or speech recognition. The term "recurrent neural network" is used to refer to the class of networks with an infinite impulse response, whereas "convolutional neural network" refers to the class with a finite impulse response. Both classes of networks exhibit temporal dynamic behavior. A finite-impulse recurrent network is a directed acyclic graph that can be unrolled and replaced with a strictly feedforward neural network, while an infinite-impulse recurrent network is a directed cyclic graph that cannot be unrolled.
Additional stored states, with the storage under direct control of the network, can be added to both infinite-impulse and finite-impulse networks. The storage can also be replaced by another network or graph that incorporates time delays or has feedback loops. Such controlled states are referred to as gated state or gated memory, and are part of long short-term memory networks (LSTMs) and gated recurrent units. (An unrolled finite-impulse network is equivalent to a feedforward neural network, FNN.) Recurrent neural networks are theoretically Turing complete and can run arbitrary programs to process arbitrary sequences of inputs.
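A single recurrent step, producing the new hidden state from the current input and the previous state, can be written directly from the definition (the tanh activation and weight-matrix layout here are an illustrative convention):

```python
import math

def rnn_step(x, h, Wxh, Whh, bh):
    """One recurrent step: new hidden state from input x and previous state h."""
    return [math.tanh(sum(Wxh[i][j] * x[j] for j in range(len(x)))      # input drive
                      + sum(Whh[i][k] * h[k] for k in range(len(h)))    # recurrent drive
                      + bh[i])
            for i in range(len(bh))]
```

Applying this step repeatedly over a sequence, feeding each output state back in as `h`, is exactly the internal memory the summary describes.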
A process represents a program in execution and passes through states such as new, ready, running, and terminated, a minimum of four states. A thread is a path of execution within a process and provides parallelism by dividing a process into multiple threads. Threads share resources such as memory and code with peer threads but have their own program counters and stacks. Threads offer better performance than processes because they have lower overhead and faster context switching.
Managing the memory hierarchy
Static and dynamic memory allocations
Memory allocation to a process
Reuse of memory
Contiguous and non-contiguous memory allocation
Paging
Segmentation
Segmentation with paging
The document discusses classical AI planning and different planning approaches. It introduces state-space planning which searches for a sequence of state transformations, and plan-space planning which searches for a plan satisfying certain conditions. It also discusses hierarchical planning which decomposes tasks into simpler subtasks, and universal classical planning which uses different refinement techniques including state-space and plan-space refinements. Classical planning makes simplifying assumptions but its principles can still be applied to games with some workarounds.
The document discusses greedy algorithms for optimization problems. It provides examples of greedy algorithms for counting money, interval scheduling, and minimizing lateness. For interval scheduling, the greedy algorithm of scheduling jobs in order of earliest finish time is proven to be optimal. For minimizing lateness, the greedy algorithm of scheduling jobs in order of earliest deadline is shown to produce a schedule with no idle time and no inversions.
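The earliest-finish-time rule for interval scheduling is short enough to sketch (jobs are assumed to be (start, finish) pairs):

```python
def interval_scheduling(jobs):
    """Greedy: pick jobs in order of earliest finish time (optimal for this problem)."""
    selected, last_finish = [], float("-inf")
    for start, finish in sorted(jobs, key=lambda j: j[1]):
        if start >= last_finish:           # compatible with everything chosen so far
            selected.append((start, finish))
            last_finish = finish
    return selected
```

Finishing as early as possible leaves the most room for later jobs, which is the core of the exchange argument proving optimality.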
An adaptive, nature-inspired algorithm explained, concretely implemented, and applied to routing protocols in wired and wireless networks. The document discusses how ant colony optimization algorithms can be applied to routing by simulating how ants leave pheromone trails to find the shortest path between their nest and food sources. It provides examples of how ant colony algorithms have been implemented in routing protocols like ABC for wired networks, AntNet for MANETs, and ARA and AntHocNet for wireless ad hoc networks. Evaluation results show these ant-inspired routing protocols can find paths more efficiently than traditional routing protocols like OSPF and perform better than protocols like AODV for packet delivery in mobile wireless networks.
This document discusses particle swarm optimization (PSO), which is an optimization technique inspired by swarm intelligence and the social behavior of bird flocking or fish schooling. PSO uses a population of candidate solutions called particles that fly through the problem hyperspace, with each particle adjusting its position based on its own experience and the experience of neighboring particles. The algorithm iteratively improves the particles' positions to locate the best solution based on fitness evaluations.
The document discusses different types of adversarial search algorithms. It describes min-max algorithm and alpha-beta pruning. Min-max algorithm searches through the game tree recursively to find the optimal move assuming the opponent plays optimally. Alpha-beta pruning improves on min-max by pruning parts of the tree that cannot contain better moves based on the alpha and beta values being passed down the tree.
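A generic sketch of minimax with alpha-beta pruning, parameterized over `moves` and `value` callbacks (both hypothetical interfaces chosen for illustration):

```python
def alphabeta(state, depth, alpha, beta, maximizing, moves, value):
    """Minimax with alpha-beta pruning over a game tree."""
    kids = moves(state)
    if depth == 0 or not kids:
        return value(state)
    if maximizing:
        best = float("-inf")
        for child in kids:
            best = max(best, alphabeta(child, depth - 1, alpha, beta, False, moves, value))
            alpha = max(alpha, best)
            if alpha >= beta:
                break                      # beta cutoff: opponent will avoid this branch
        return best
    best = float("inf")
    for child in kids:
        best = min(best, alphabeta(child, depth - 1, alpha, beta, True, moves, value))
        beta = min(beta, best)
        if alpha >= beta:
            break                          # alpha cutoff
    return best
```

Representing a toy game tree as nested lists (leaves are payoffs) shows the pruning in action: the second subtree below is cut off as soon as a value worse than the first subtree's result appears.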
Best-first search is a heuristic search algorithm that expands the most promising node first. It uses an evaluation function f(n) that estimates the cost to reach the goal from each node n. Nodes are ordered in the fringe by increasing f(n). A* search is a special case of best-first search that uses an admissible heuristic function h(n) and is guaranteed to find the optimal solution.
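A minimal A* over a weighted graph, ordering the frontier by f(n) = g(n) + h(n); the callback-based interface is an illustrative choice:

```python
import heapq

def a_star(start, goal, neighbors, h):
    """A* search: expand the node with the smallest f(n) = g(n) + h(n)."""
    frontier = [(h(start), 0, start, [start])]     # (f, g, node, path)
    best_g = {start: 0}
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return path, g
        for nxt, cost in neighbors(node):
            g2 = g + cost
            if g2 < best_g.get(nxt, float("inf")):   # found a cheaper route to nxt
                best_g[nxt] = g2
                heapq.heappush(frontier, (g2 + h(nxt), g2, nxt, path + [nxt]))
    return None, float("inf")
```

With h(n) = 0 this degenerates to uniform-cost search; any admissible h only reorders the frontier and preserves optimality.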
This document discusses various approaches to real-time scheduling such as clock-driven, weighted round-robin, and priority-driven approaches. It also covers topics like dynamic versus static systems, effective release times and deadlines, optimality and non-optimality of algorithms, challenges in validating timing constraints, and differences between offline and online scheduling.
This document contains the student's responses to 7 questions about operating system concepts related to process synchronization and concurrency control. The questions cover topics like the critical section problem, Peterson's solution, semaphores, monitors, and classical synchronization problems like the bounded buffer problem and readers-writers problem. The student provides definitions and explanations of the key concepts and how they can be implemented using constructs like mutexes, condition variables, load-locked and store-conditional instructions. Specific examples of how synchronization applies in areas like process management and bounded buffers are also discussed.
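The bounded buffer problem mentioned above has a classic semaphore solution; this sketch uses Python's threading primitives as a stand-in for the OS-level constructs discussed:

```python
import threading
from collections import deque

class BoundedBuffer:
    """Producer-consumer bounded buffer using two semaphores and a mutex."""
    def __init__(self, capacity):
        self.buf = deque()
        self.mutex = threading.Lock()
        self.empty = threading.Semaphore(capacity)   # counts free slots
        self.full = threading.Semaphore(0)           # counts filled slots

    def put(self, item):
        self.empty.acquire()          # wait for a free slot
        with self.mutex:              # critical section
            self.buf.append(item)
        self.full.release()           # signal one filled slot

    def get(self):
        self.full.acquire()           # wait for an item
        with self.mutex:
            item = self.buf.popleft()
        self.empty.release()          # signal one free slot
        return item
```

The two semaphores block producers when the buffer is full and consumers when it is empty, while the mutex protects the critical section itself.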
Report about Software Architecture for Robotics, for class of Introduction to Robotics of Prof. Sukhan Lee, of department of computer engineering of Sungkyunkwan University.
Student: Lorran Pegoretti.
Suwon, South Korea, December 2013
CNIT 127 Ch 5: Introduction to heap overflows - Sam Bowne
Slides for a college course at City College San Francisco. Based on "The Shellcoder's Handbook: Discovering and Exploiting Security Holes", by Chris Anley, John Heasman, Felix Lindner, Gerardo Richarte; ASIN: B004P5O38Q.
Instructor: Sam Bowne
Class website: https://samsclass.info/127/127_S17.shtml
This document provides an introduction to fuzzy logic and fuzzy systems. It discusses classical set theory versus fuzzy set theory and membership functions. Types of fuzzy membership functions like triangular, trapezoidal, and Gaussian are shown. The key components of a fuzzy logic controller including fuzzification, fuzzy inference system, and defuzzification are described. Several defuzzification methods such as mean of maxima, centroid, and approximate centroid are explained. Examples of fuzzy applications in areas like washing machines and autonomous vehicles are presented. The document also discusses building fuzzy systems using MATLAB/Simulink and at the command line. Finally, it briefly introduces PID fuzzy controllers.
The document provides an overview of fuzzy logic and approximate reasoning. It discusses fuzzy sets and membership functions, including different types of membership functions like triangular, trapezoidal, and Gaussian. It also covers fuzzy set operations like union, intersection, and complement. T-norm operators for fuzzy intersection are defined. The document serves as an introduction to key concepts in fuzzy logic.
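The triangular membership function and centroid defuzzification described in these summaries can be sketched directly (the sampling-based centroid is a common discrete approximation):

```python
def tri(x, a, b, c):
    """Triangular membership function with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def centroid(xs, mus):
    """Centroid (center of gravity) defuzzification over sampled points."""
    total = sum(mus)
    return sum(x * m for x, m in zip(xs, mus)) / total if total else 0.0
```

For a symmetric triangle the centroid lands on the peak, which is a quick sanity check on any defuzzifier implementation.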
In the last decades, a new model of computation based on quantum mechanics has gained attention in the computer science community. We give an introduction to this model starting from the basics, with no prerequisites. Then, with the help of some simple examples, we see why quantum computers outperform standard ones in certain tasks. We then move to the topic of quantum entanglement and show how sharing quantum information can create a strong, provable correlation among distant parties. With this basic understanding of quantum computation and quantum entanglement, we can already illustrate two interesting cryptographic protocols: quantum key distribution and position verification. Both perform classically impossible tasks: the first allows detecting an intruder intercepting a secret communication, while the second allows certifying somebody's GPS location.
This document provides an overview of fuzzy inference systems and fuzzy logic modeling. It discusses Mamdani and Sugeno fuzzy models. Mamdani models use fuzzy rules with fuzzy outputs while Sugeno models have fuzzy antecedents and crisp polynomial consequent parts. The document also describes fuzzy rule bases, fuzzification, inference engines, and defuzzification methods like centroid of area and bisector of area. Examples of Sugeno models with one and two inputs are presented.
This document provides an overview of fuzzy logic, including its origins, key concepts, and applications. It discusses how fuzzy logic allows for approximate reasoning and decision making under uncertainty by using linguistic variables and fuzzy set theory. Membership functions are used to characterize fuzzy sets and assign degrees of truth between 0 and 1 rather than binary true/false values. Common fuzzy logic operations like intersection, union, and complement are also covered. Finally, some examples of fuzzy logic control systems are presented, including how they are designed using fuzzy rule bases and inference methods like Mamdani and Sugeno.
The document provides an overview of fuzzy logic and fuzzy sets. It discusses how fuzzy logic can handle imprecise data unlike classical binary sets. Membership functions assign degrees of membership values between 0 and 1. Fuzzy logic systems use if-then rules and linguistic variables. An example shows how fuzzy logic is used to estimate project risk levels based on funding and staffing levels. Fuzzy logic has been applied in various domains due to its ability to model human reasoning.
Brief Introduction About Topological Interference Management (TIM) - Pei-Che Chang
This document discusses topological interference management (TIM) techniques for interference channels. TIM exploits interference alignment principles under realistic channel state information assumptions. The key ideas are:
- Focus on canceling strong interference links based on knowledge of the interference pattern
- There is a connection between TIM and the index coding problem
- The goal of TIM is to maximize degrees of freedom (DoF) based on network topology information
- Examples show how transmitting signals over multiple channel uses and exploiting the interference pattern can achieve different DoF values through interference alignment
Waveguides and its Types: Field view and structures - AvishishtKumar
The document discusses rectangular waveguides, including their modes of propagation, fields inside, cutoff frequency, and propagation constant. It explains the transverse electric (TE) and transverse magnetic (TM) modes, showing the electric and magnetic field configurations and equations for each. Examples of electric and magnetic field views are also shown for different TE modes inside the rectangular waveguide.
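The TE_mn cutoff frequency of a rectangular waveguide follows from f_c = (c/2)·sqrt((m/a)^2 + (n/b)^2); here is a small helper, with the WR-90 guide dimensions in the test taken as standard values rather than from the document:

```python
import math

def te_cutoff_hz(m, n, a, b, c=299_792_458.0):
    """Cutoff frequency of the TE_mn mode in an a x b rectangular waveguide (meters)."""
    # Below this frequency the mode cannot propagate (imaginary propagation constant)
    return (c / 2.0) * math.sqrt((m / a) ** 2 + (n / b) ** 2)
```

For the dominant TE10 mode the formula reduces to c/(2a), which is why the broad dimension a alone sets the lower edge of a guide's operating band.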
Fuzzy logic provides a method to formalize reasoning with vague terms by allowing membership functions and degrees of truth rather than binary true/false values. It can be used to model problems involving linguistic variables like "poor", "good", and "excellent".
The document discusses a tipping example to demonstrate fuzzy logic. It defines fuzzy rules for tip amounts based on the quality of service and food. For example, one rule is that if service is poor or food is rancid, the tip should be cheap. Membership functions are then used to evaluate the fuzzy rules and determine appropriate tip amounts based on varying degrees of service and food quality.
Fuzzy logic provides a more intuitive way to model problems involving vague linguistic terms.
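The tipping rules described above can be sketched as a tiny fuzzy system; the membership breakpoints and the representative tip levels (5%, 15%, 25%) are illustrative assumptions, and the crisp output uses a weighted-average shortcut rather than full Mamdani defuzzification:

```python
def tri(x, a, b, c):
    """Triangular membership with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def tip(service, food):
    """Tiny fuzzy tipper on 0-10 inputs; returns a tip percentage."""
    poor_service = tri(service, -5, 0, 5)
    good_service = tri(service, 0, 5, 10)
    excellent    = tri(service, 5, 10, 15)
    rancid_food  = tri(food, -5, 0, 5)
    # Rule strengths: fuzzy OR = max
    cheap    = max(poor_service, rancid_food)   # if service is poor OR food is rancid
    average  = good_service                     # if service is good
    generous = excellent                        # if service is excellent
    # Weighted average of representative tip levels (a Sugeno-style shortcut)
    num = cheap * 5 + average * 15 + generous * 25
    den = cheap + average + generous
    return num / den if den else 15.0
```

Intermediate inputs fire several rules at once with partial strengths, and the output blends the tip levels accordingly.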
This document discusses the process of backpropagation in neural networks. It begins with an example of forward propagation through a neural network with an input, hidden and output layer. It then introduces backpropagation, which uses the calculation of errors at the output to calculate gradients and update weights in order to minimize the overall error. The key steps are outlined, including calculating the error derivatives, weight updates proportional to the local gradient, and backpropagating error signals from the output through the hidden layers. Formulas for calculating each step of backpropagation are provided.
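The forward pass, output-error delta, and backpropagated hidden deltas described above can be sketched on a tiny network; the architecture (2 inputs, 3 hidden sigmoid units, 1 output), learning rate, and XOR task are illustrative choices, not details from the document:

```python
import math, random

def train_xor(epochs=5000, lr=0.5, hidden=3, seed=1):
    """Tiny 2-hidden-1 sigmoid network trained with backpropagation on XOR."""
    random.seed(seed)
    sig = lambda z: 1.0 / (1.0 + math.exp(-z))
    # Weights with bias as last entry, small random init
    W1 = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(hidden)]
    W2 = [random.uniform(-1, 1) for _ in range(hidden + 1)]
    data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]
    for _ in range(epochs):
        for x, t in data:
            # Forward pass
            h = [sig(w[0] * x[0] + w[1] * x[1] + w[2]) for w in W1]
            y = sig(sum(W2[j] * h[j] for j in range(hidden)) + W2[-1])
            # Backward pass: output delta, then hidden deltas via the chain rule
            dy = (y - t) * y * (1 - y)
            dh = [dy * W2[j] * h[j] * (1 - h[j]) for j in range(hidden)]
            # Weight updates proportional to each local gradient
            for j in range(hidden):
                W2[j] -= lr * dy * h[j]
                W1[j][0] -= lr * dh[j] * x[0]
                W1[j][1] -= lr * dh[j] * x[1]
                W1[j][2] -= lr * dh[j]
            W2[-1] -= lr * dy
    return lambda x: sig(sum(W2[j] * sig(W1[j][0] * x[0] + W1[j][1] * x[1] + W1[j][2])
                             for j in range(hidden)) + W2[-1])
```

Training drives the squared error down from its random-initialization level; exact convergence speed depends on the seed and learning rate.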
The document discusses fuzzy logic and artificial neural networks. It provides an overview of fuzzy logic, including fuzzy sets, membership functions, fuzzy linguistic variables, fuzzy rules and fuzzy control. It also covers artificial neural networks, including the biological inspiration from the human brain, basic neuron models, multi-layer feedforward networks, training algorithms like gradient descent, and examples of neural networks solving problems like XOR classification. Hardware implementations on systems like DSpace and Opal RT are also briefly mentioned.
This document discusses various numerical methods for solving systems of ordinary differential equations (ODEs), including:
- Explicit methods like Euler's method can be directly applied to systems of ODEs but may require a very small time-step for stability. Implicit methods require solving a nonlinear system at each step.
- Predictor-corrector methods like Heun's method or Adams-Bashforth/Adams-Moulton methods combine explicit and implicit steps to gain accuracy while maintaining stability.
- Higher order ODEs can be converted to a system of first order ODEs to apply the same methods, with initial value problems (IVPs) readily solved this way but boundary value problems (
Clara_de_Paiva_Master_thesis_presentation_June2016Clara de Paiva
1. The document discusses hierarchical associative memories and sparse codes in neural networks. It describes how hierarchical memories can be organized into multiple correlation matrices to improve network performance.
2. Training involves applying Hebbian learning rules at each layer to update the correlation matrices. Retrieval uses a thresholded decision based on the weighted input to determine neuron outputs.
3. A hierarchical associative memory is proposed with correlation matrices of decreasing size at each layer. Learning proceeds from the top layer down, compressing outputs from the previous layer using an aggregation window to reduce computational requirements.
Most modern devices are made from billions of on /off switches called transistors
We will build a processor in this course!
Transistors made from semiconductor materials:
MOSFET – Metal Oxide Semiconductor Field Effect Transistor
NMOS, PMOS – Negative MOS and Positive MOS
CMOS – Complimentary MOS made from PMOS and NMOS transistors
Transistors used to make logic gates and logic circuits
We can now implement any logic circuit
Can do it efficiently, using Karnaugh maps to find the minimal terms required
Can use either NAND or NOR gates to implement the logic circuit
Can use P- and N-transistors to implement NAND or NOR gates
This document discusses optimization techniques and provides examples to illustrate key concepts in optimization problems. It defines optimization as finding extreme states like minimum/maximum and discusses how it is applied in various fields. It then covers basic definitions like design variables, objective functions, constraints, convexity, local vs global optima. Examples are given to show unconstrained vs constrained problems and illustrate active, inactive and violated constraints. Optimization techniques largely depend on calculus concepts like derivatives and hessian matrix.
Raimundo Soto - Catholic University of Chile
ERF Training on Advanced Panel Data Techniques Applied to Economic Modelling
29 -31 October, 2018
Cairo, Egypt
The document provides an overview of artificial neural networks (ANNs) and the perceptron learning algorithm. It discusses how biological neurons inspire ANNs and how a basic perceptron works using a simple example with inputs, weights, and outputs. The perceptron learning algorithm is then explained, which updates weights based on whether the perceptron's prediction was correct or incorrect on each training example. Finally, the document introduces multilayer perceptrons which can solve non-linearly separable problems by connecting multiple perceptron layers together through a process called backpropagation.
Recursive State Estimation AI for Robotics.pdff20220630
1) Recursive state estimation uses probabilistic methods like Bayes filters to estimate states of a dynamic system from sensor measurements over time. Bayes filters involve prediction of state based on motion model and correction of prediction based on sensor observations using Bayes' rule.
2) An example of applying a Bayes filter to estimate the state of a door being open or closed is given. The robot's belief is updated as it takes actions like pushing the door and receives sensor feedback.
3) Key concepts discussed include belief distributions, probabilistic generative models relating state transitions and measurements, and the Bayes filter algorithm involving prediction and correction steps.
This document discusses deconvolution methods and their applications in analyzing gamma-ray spectra. It provides background on several deconvolution algorithms, including Tikhonov-Miller regularization, Van Cittert, Janson, Gold, Richardson-Lucy, and Muller algorithms. These algorithms aim to improve spectral resolution by mathematically removing instrument smearing effects. They are based on solving systems of linear equations using direct or iterative methods, with regularization techniques to produce stable solutions. Examples are given to illustrate the sensitivity of non-regularized solutions to noise and the need for regularization.
Similar to An Introduction to Fuzzy Logic and Neural Networks (20)
Understanding Inductive Bias in Machine LearningSUTEJAS
This presentation explores the concept of inductive bias in machine learning. It explains how algorithms come with built-in assumptions and preferences that guide the learning process. You'll learn about the different types of inductive bias and how they can impact the performance and generalizability of machine learning models.
The presentation also covers the positive and negative aspects of inductive bias, along with strategies for mitigating potential drawbacks. We'll explore examples of how bias manifests in algorithms like neural networks and decision trees.
By understanding inductive bias, you can gain valuable insights into how machine learning models work and make informed decisions when building and deploying them.
Hierarchical Digital Twin of a Naval Power SystemKerry Sado
A hierarchical digital twin of a Naval DC power system has been developed and experimentally verified. Similar to other state-of-the-art digital twins, this technology creates a digital replica of the physical system executed in real-time or faster, which can modify hardware controls. However, its advantage stems from distributing computational efforts by utilizing a hierarchical structure composed of lower-level digital twin blocks and a higher-level system digital twin. Each digital twin block is associated with a physical subsystem of the hardware and communicates with a singular system digital twin, which creates a system-level response. By extracting information from each level of the hierarchy, power system controls of the hardware were reconfigured autonomously. This hierarchical digital twin development offers several advantages over other digital twins, particularly in the field of naval power systems. The hierarchical structure allows for greater computational efficiency and scalability while the ability to autonomously reconfigure hardware controls offers increased flexibility and responsiveness. The hierarchical decomposition and models utilized were well aligned with the physical twin, as indicated by the maximum deviations between the developed digital twin hierarchy and the hardware.
Low power architecture of logic gates using adiabatic techniquesnooriasukmaningtyas
The growing significance of portable systems to limit power consumption in ultra-large-scale-integration chips of very high density, has recently led to rapid and inventive progresses in low-power design. The most effective technique is adiabatic logic circuit design in energy-efficient hardware. This paper presents two adiabatic approaches for the design of low power circuits, modified positive feedback adiabatic logic (modified PFAL) and the other is direct current diode based positive feedback adiabatic logic (DC-DB PFAL). Logic gates are the preliminary components in any digital circuit design. By improving the performance of basic gates, one can improvise the whole system performance. In this paper proposed circuit design of the low power architecture of OR/NOR, AND/NAND, and XOR/XNOR gates are presented using the said approaches and their results are analyzed for powerdissipation, delay, power-delay-product and rise time and compared with the other adiabatic techniques along with the conventional complementary metal oxide semiconductor (CMOS) designs reported in the literature. It has been found that the designs with DC-DB PFAL technique outperform with the percentage improvement of 65% for NOR gate and 7% for NAND gate and 34% for XNOR gate over the modified PFAL techniques at 10 MHz respectively.
Using recycled concrete aggregates (RCA) for pavements is crucial to achieving sustainability. Implementing RCA for new pavement can minimize carbon footprint, conserve natural resources, reduce harmful emissions, and lower life cycle costs. Compared to natural aggregate (NA), RCA pavement has fewer comprehensive studies and sustainability assessments.
Harnessing WebAssembly for Real-time Stateless Streaming PipelinesChristina Lin
Traditionally, dealing with real-time data pipelines has involved significant overhead, even for straightforward tasks like data transformation or masking. However, in this talk, we’ll venture into the dynamic realm of WebAssembly (WASM) and discover how it can revolutionize the creation of stateless streaming pipelines within a Kafka (Redpanda) broker. These pipelines are adept at managing low-latency, high-data-volume scenarios.
An Introduction to Fuzzy Logic and Neural Networks
1. An Introduction to Deep Learning
In the name of God
Mehrnaz Faraz
Faculty of Electrical Engineering
K. N. Toosi University of Technology
Milad Abbasi
Faculty of Electrical Engineering
Sharif University of Technology
3. Introduction to Fuzzy
• Fuzzy: “Difficult to perceive; indistinct or vague”
– Simplicity and flexibility
– Can handle problems with imprecise data
– More readily customizable in natural language terms
4. Introduction to Fuzzy
• Fuzzy subsets: the figure partitions a variable (Velocity) into four subsets, Slowest (A1), Slow (A2), Fast (A3) and Fastest (A4), each with its own membership function
• Membership function: μ_{A_i}(x): x → [0, 1], i = 1, …, 4
5. Introduction to Fuzzy
• Example: Automatic Braking System
– Is the car close? 0.2 (not very close) → Brakes: 0.2 (slight pressure)
– Is the car close? 0.8 (pretty close) → Brakes: 0.8 (fairly heavy pressure)
6. Introduction to Fuzzy
• Well-known Membership Functions (as plotted on the slide):
– trimf: triangular, parameters a, b, c
– trapmf: trapezoidal, parameters a, b, c, d
– gaussmf: Gaussian, parameters σ (sig), c
– sigmf: sigmoidal
7. Introduction to Fuzzy
• Linguistic variables:
• Weather is quite cold.
• Height is almost tall.
• Speed is very high.
• Weather, Height and Speed are linguistic variables.
• Cold, Tall and High are linguistic values.
8. Introduction to Fuzzy
• Operations with fuzzy sets:
– Complement operation: μ_Ā(x) = 1 − μ_A(x)
– Fuzzy union operation or fuzzy OR: μ_{A∪B}(x) = s[μ_A(x), μ_B(x)]
– Fuzzy intersection operation or fuzzy AND: μ_{A∩B}(x) = t[μ_A(x), μ_B(x)]
9. Introduction to Fuzzy
• Fuzzy union operation (s-norm):
μ_{A∪B}(x) = max[μ_A(x), μ_B(x)]
• Max is an s-norm operator
(The slide plots μ_A(x), μ_B(x), and μ_{A∪B}(x) over x ∈ [0, 4].)
10. Introduction to Fuzzy
• Fuzzy intersection operation (t-norm):
μ_{A∩B}(x) = min[μ_A(x), μ_B(x)]
• Min is a t-norm operator
• Product is another t-norm operator (Mamdani Implication)
(The slide plots μ_A(x), μ_B(x), and μ_{A∩B}(x) over x ∈ [0, 4].)
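A minimal sketch of these pointwise operations on discretely sampled membership grades, using max as the s-norm and min as the t-norm; the sample values are illustrative.

```python
def complement(mu_a):
    """Standard fuzzy complement: 1 - mu_A(x) at each sample."""
    return [1 - a for a in mu_a]

def fuzzy_union(mu_a, mu_b):
    """Fuzzy OR with the max s-norm."""
    return [max(a, b) for a, b in zip(mu_a, mu_b)]

def fuzzy_intersection(mu_a, mu_b):
    """Fuzzy AND with the min t-norm."""
    return [min(a, b) for a, b in zip(mu_a, mu_b)]

mu_a = [0.0, 0.5, 1.0, 0.5]
mu_b = [0.2, 0.8, 0.3, 0.0]
print(fuzzy_union(mu_a, mu_b))         # [0.2, 0.8, 1.0, 0.5]
print(fuzzy_intersection(mu_a, mu_b))  # [0.0, 0.5, 0.3, 0.0]
```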
11. Introduction to Fuzzy
• Other complement operations, where a = μ_A(x):
– Sugeno Class: c_λ(a) = (1 − a) / (1 + λa), λ > −1
– Yager Class: c_w(a) = (1 − a^w)^(1/w), w > 0
12. Introduction to Fuzzy
• Other union operations, where a = μ_A(x) and b = μ_B(x):
– Yager class: s_w(a, b) = min[1, (a^w + b^w)^(1/w)], w ∈ (0, ∞)
– Drastic sum: s_ds(a, b) = a if b = 0; b if a = 0; 1 otherwise
– Algebraic sum: s_as(a, b) = a + b − ab
13. Introduction to Fuzzy
• Other intersection operations:
– Yager class: t_w(a, b) = 1 − min[1, ((1 − a)^w + (1 − b)^w)^(1/w)], w ∈ (0, ∞)
– Drastic product: t_dp(a, b) = a if b = 1; b if a = 1; 0 otherwise
– Algebraic product: t_ap(a, b) = ab
14. Introduction to Fuzzy
• Example: Assume that we want to evaluate the health of a person based on his height and weight.
• The input variables are the crisp numbers of the person's height and weight.
• Output is the percentage of healthiness.
• System structure: Crisp Input → Fuzzifier → Inference Engine (with Rule Base and Data Base) → Defuzzifier → Crisp Output
15. Introduction to Fuzzy
• Step 1: Fuzzification
• Input Membership Functions:
– Weight (50–125 kg): Very Slim, Slim, Medium, Heavy, Very Heavy
– Height (140–200 cm): Very Short, Short, Medium, Tall, Very Tall
16. Introduction to Fuzzy
• Step 2: Rules
• Rules reflect experts' decisions
• Rules can be redundant
• Rules can be adjusted to match the desired behavior
• Rules are tabulated as fuzzy words
• If x is A then y is B
• If x_1 is A_1 and/or x_2 is A_2 … and/or x_n is A_n then y is B
17. Introduction to Fuzzy
• Implications: a fuzzy rule IF FP1 THEN FP2 (written FP1 → FP2) can be evaluated as
– Mamdani (min) implication: μ_QMM(x, y) = min[μ_FP1(x), μ_FP2(y)]
– Mamdani (product) implication: μ_QMP(x, y) = prod[μ_FP1(x), μ_FP2(y)]
18. Introduction to Fuzzy
– Godel implication: μ_QG(x, y) = 1 if μ_FP1(x) ≤ μ_FP2(y); μ_FP2(y) otherwise
– Zadeh implication: μ_QZ(x, y) = max[min(μ_FP1(x), μ_FP2(y)), 1 − μ_FP1(x)]
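The four implication operators can be sketched as functions of the two grades mu1 = μ_FP1(x) and mu2 = μ_FP2(y):

```python
def mamdani_min(mu1, mu2):
    """Mamdani (min) implication."""
    return min(mu1, mu2)

def mamdani_product(mu1, mu2):
    """Mamdani (product) implication."""
    return mu1 * mu2

def godel(mu1, mu2):
    """Godel implication: 1 when the consequent dominates the antecedent."""
    return 1.0 if mu1 <= mu2 else mu2

def zadeh(mu1, mu2):
    """Zadeh implication."""
    return max(min(mu1, mu2), 1.0 - mu1)
```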
19. Introduction to Fuzzy
• Inference Rules:
– Modus Ponens: from "X is A" and "If X is A then Y is B", infer "Y is B"
– Modus Tollens: from "Y is not B" and "If X is A then Y is B", infer "X is not A"
– Hypothetical Syllogism: from "If X is A then Y is B" and "If Y is B then Z is C", infer "If X is A then Z is C"
20. Introduction to Fuzzy
• Rules are tabulated as fuzzy words:
– Healthy (H)
– Somewhat Healthy (SH)
– Less Healthy (LH)
– Unhealthy (U)
– Rule function f = {U, LH, SH, H}
• Output Membership Function: sets U, LH, SH, H centered at f = 0.2, 0.4, 0.6, 0.8 on the decision axis
21. Introduction to Fuzzy
• Fuzzy Rule Table (rows: Height, columns: Weight):

Height \ Weight   Very Slim   Slim   Medium   Heavy   Very Heavy
Very Short        H           SH     LH       U       U
Short             SH          H      SH       LH      U
Medium            LH          H      H        LH      U
Tall              U           SH     H        SH      U
Very Tall         U           LH     H        SH      LH
22. Introduction to Fuzzy
• Step 3: Calculation
• Assume that height = 187 cm and weight = 49 kg
• Reading the input membership functions:
– Weight 49 kg: Very Slim 0.7, Slim 0.3 (all other weight sets 0)
– Height 187 cm: Medium 0.8, Tall 0.2 (all other height sets 0)
23. Introduction to Fuzzy
• Fuzzy Rule Table with the active memberships: only the Medium (0.8) and Tall (0.2) height rows and the Very Slim (0.7) and Slim (0.3) weight columns fire:

Height \ Weight   Very Slim (0.7)   Slim (0.3)
Medium (0.8)      LH                H
Tall (0.2)        U                 SH
24. Introduction to Fuzzy
• Firing strengths are the min of the row and column memberships:

Height \ Weight   Very Slim (0.7)   Slim (0.3)
Medium (0.8)      0.7               0.3
Tall (0.2)        0.2               0.2

μ_Weight = {0.7, 0.3, 0, 0, 0} over {VS, S, M, H, VH}
μ_Height = {0, 0, 0.8, 0.2, 0} over {VS, S, M, T, VT}
25. Introduction to Fuzzy
• Scaled Fuzzified Decision: each fired rule clips its output set, giving degrees
μ_f = {0.2, 0.7, 0.2, 0.3} over f = {U, LH, SH, H}
26. Introduction to Fuzzy
• Defuzzification Methods:
– Centroid (continuous): x* = ∫ x·μ_A(x) dx / ∫ μ_A(x) dx
– Centroid (discrete): x* = Σ_i x_i·μ_A(x_i) / Σ_i μ_A(x_i)
– Bisector: the point that splits the area under μ_A(x) into two equal halves
27. Introduction to Fuzzy
• Defuzzification Methods:
– Middle, Smallest, and Largest of Maximum (MOM, SOM, LOM): take the middle, smallest, or largest x at which μ_A(x) attains its maximum
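The centroid and the maximum-based defuzzifiers can be sketched over a discretely sampled output set; the sample values below are illustrative, and MOM is taken as the mean of the maximizing points.

```python
def centroid(xs, mus):
    """Discrete centroid: sum(x * mu) / sum(mu)."""
    return sum(x * m for x, m in zip(xs, mus)) / sum(mus)

def maxima(xs, mus):
    """Smallest, middle, and largest of maximum (SOM, MOM, LOM)."""
    peak = max(mus)
    pts = [x for x, m in zip(xs, mus) if m == peak]
    return pts[0], sum(pts) / len(pts), pts[-1]

xs  = [0.0, 0.25, 0.5, 0.75, 1.0]
mus = [0.1, 0.4, 0.4, 0.2, 0.0]
print(maxima(xs, mus))  # (0.25, 0.375, 0.5) -> SOM, MOM, LOM
```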
28. Introduction to Fuzzy
• Centroid Method:
FD = Σ_l y_l·w_l / Σ_l w_l,  l = 1, …, 4
where the y_l are the output-set centers {0.2, 0.4, 0.6, 0.8} and the w_l are the clipped degrees {0.2, 0.7, 0.2, 0.3}:
FD = (0.2·0.2 + 0.7·0.4 + 0.2·0.6 + 0.3·0.8) / (0.2 + 0.7 + 0.2 + 0.3) = 0.4857
FD = Final Decision; D = Decision
29. Introduction to Fuzzy
• Step 4: Final Decision
• Assume that the crisp decision index (D) is the centroid: D = 0.4857
• Reading D = 0.4857 back against the output sets places the decision 25% in the SH group and 75% in the LH group
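The whole worked example, min firing of the rule table followed by the weighted centroid, can be reproduced in a few lines; the set names and centers follow the slides.

```python
# Fuzzified inputs for weight = 49 kg and height = 187 cm
weight_mu = {"VS": 0.7, "S": 0.3}
height_mu = {"M": 0.8, "T": 0.2}

# The four rules that fire, from the fuzzy rule table
rule_table = {("M", "VS"): "LH", ("M", "S"): "H",
              ("T", "VS"): "U",  ("T", "S"): "SH"}
centers = {"U": 0.2, "LH": 0.4, "SH": 0.6, "H": 0.8}

# Firing strength = min of the row and column memberships
fired = {}
for (h, w), out in rule_table.items():
    strength = min(height_mu[h], weight_mu[w])
    fired[out] = max(fired.get(out, 0.0), strength)

# Weighted centroid over the output-set centers
fd = sum(centers[o] * s for o, s in fired.items()) / sum(fired.values())
print(round(fd, 4))  # 0.4857
```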
30. Introduction to Fuzzy
• Fuzzy Extension Principle:
– How far is it from Zanjan to Urmia?
• Scaled Distance Among Cities (fuzzy relations P over (x, y) and Q over (y, z)):

P (x \ y)     Tabriz   Shahrekord
Zanjan        0.3      0.9
Shahrekord    1        0
Esfahan       0.95     0.1

Q (y \ z)     Urmia    Ahvaz
Tabriz        0.95     0.1
Shahrekord    0.1      0.9

• Composition: μ_{P∘Q}(x, z) = S_y { t[μ_P(x, y), μ_Q(y, z)] }
31. Introduction to Fuzzy
• Assume that the t-norm is product, and the s-norm is max:
prod[μ_P(Zanjan, Tabriz), μ_Q(Tabriz, Urmia)] = 0.3 × 0.95 = 0.285
prod[μ_P(Zanjan, Shahrekord), μ_Q(Shahrekord, Urmia)] = 0.9 × 0.1 = 0.09
max{0.285, 0.09} = 0.285
• Zanjan is close to Urmia with degree 0.285
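The max-product composition of the two relations can be checked directly; the city names and grades follow the slides.

```python
# Relation P: scaled distances from x to an intermediate city y
P = {("Zanjan", "Tabriz"): 0.3, ("Zanjan", "Shahrekord"): 0.9,
     ("Shahrekord", "Tabriz"): 1.0, ("Shahrekord", "Shahrekord"): 0.0,
     ("Esfahan", "Tabriz"): 0.95, ("Esfahan", "Shahrekord"): 0.1}
# Relation Q: scaled distances from y to the destination z
Q = {("Tabriz", "Urmia"): 0.95, ("Tabriz", "Ahvaz"): 0.1,
     ("Shahrekord", "Urmia"): 0.1, ("Shahrekord", "Ahvaz"): 0.9}

def compose(P, Q, x, z, mids):
    """Sup-star composition: s-norm = max over y, t-norm = product."""
    return max(P[(x, y)] * Q[(y, z)] for y in mids)

mids = ["Tabriz", "Shahrekord"]
print(round(compose(P, Q, "Zanjan", "Urmia", mids), 3))  # 0.285
```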
32. Introduction to Fuzzy
• TSK Fuzzy System, with rules l = 1, 2, …, M:
Rule l: If x_1 is C_1^l and … and x_n is C_n^l then y^l = c_0^l + c_1^l·x_1 + … + c_n^l·x_n
where the C_i^l are fuzzy sets and the c_i^l are constants.
• Output:
f(x) = Σ_l w^l·y^l / Σ_l w^l,  with w^l = Π_{i=1}^{n} μ_{C_i^l}(x_i)
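A minimal TSK sketch with two single-input rules; the Gaussian antecedents and the consequent coefficients below are illustrative choices, since the slide leaves the sets C_i^l and constants c_i^l abstract.

```python
import math

def gauss(x, c, sigma):
    return math.exp(-((x - c) ** 2) / (2 * sigma ** 2))

# Rule l: IF x is C^l THEN y^l = c0^l + c1^l * x
rules = [
    {"c": 0.0, "sigma": 1.0, "coef": (1.0, 0.5)},    # y = 1 + 0.5x near x = 0
    {"c": 4.0, "sigma": 1.0, "coef": (3.0, -0.25)},  # y = 3 - 0.25x near x = 4
]

def tsk(x):
    """Weighted average of the rule consequents, w^l = mu_{C^l}(x)."""
    ws = [gauss(x, r["c"], r["sigma"]) for r in rules]
    ys = [r["coef"][0] + r["coef"][1] * x for r in rules]
    return sum(w * y for w, y in zip(ws, ys)) / sum(ws)

print(tsk(0.0))  # dominated by the first rule, so close to 1.0
```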
33. Introduction to Neural Network
• Neural Network: a neuron sums its weighted inputs (Σ) and passes the result through an activation function f to produce the output
34. Introduction to Neural Network
• Multilayer Perceptron: input signals pass through an input layer, one or more hidden layers (first, second, …), and an output layer
– Supervised learning
– Random initialization
– Deep NN when many hidden layers are stacked
35. Introduction to Neural Network
• MLP:
– Fully connected, so prone to overfitting
– Suitable for:
• Classification prediction problems
• Regression prediction problems
• Tabular datasets
– Contain data in a columnar format; each column (field) must have a name and may only contain data of one type
– Try MLPs on:
• Image data (e.g. the pixels of an image can be reduced down to one long row of data and fed into an MLP)
• Text data
• Time series data
• Other types of data
– Trained with supervised learning
36. Introduction to Neural Network
• Feed Forward: input signals propagate layer by layer from the input layer through the hidden layers to the output layer, where the output is compared with the target T to give the error e
37. Introduction to Neural Network
• Back Propagate: during training, the error e between the output and the target T is propagated backwards from the output layer through the hidden layers to update the weights
40. Introduction to Neural Network
• Consider a network with inputs x_1, x_2, …, x_{n_0}, a hidden layer (Neuron_1^1, Neuron_2^1 with weights w_{ji}^1, nets net_j^1 and outputs o_j^1) and one output neuron (Neuron_1^2 with weights w_{1j}^2, net net_1^2 and output o_1^2). The error is e = T − o_1^2.
• Back Propagate (output-layer weight):
∂E/∂w_{11}^2 = (∂E/∂e)·(∂e/∂o_1^2)·(∂o_1^2/∂net_1^2)·(∂net_1^2/∂w_{11}^2) = e·(−1)·f′(net_1^2)·o_1^1
41. Introduction to Neural Network
• Back Propagate (hidden-layer weight), for the same network:
∂E/∂w_{11}^1 = (∂E/∂e)·(∂e/∂o_1^2)·(∂o_1^2/∂net_1^2)·(∂net_1^2/∂o_1^1)·(∂o_1^1/∂net_1^1)·(∂net_1^1/∂w_{11}^1) = e·(−1)·f′(net_1^2)·w_{11}^2·f′(net_1^1)·x_1
42. Introduction to Neural Network
• Error and Gradient Descent:
E(k) = ½ Σ_i e_i²(k)
w(k+1) = w(k) − η·∂E(k)/∂w(k)
• Exercise: Rewrite the back-propagation equations for a neural network with 2 outputs.
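The chain-rule expressions above can be verified numerically for the 2-2-1 network of the previous slides. The weights and the training pair below are illustrative; the analytic gradients are checked against finite differences of E = ½e².

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(w1, w2, x):
    # Hidden layer: net_j^1 = w_j1*x1 + w_j2*x2 + bias, o_j^1 = f(net_j^1)
    o1 = [sigmoid(w[0] * x[0] + w[1] * x[1] + w[2]) for w in w1]
    # Output neuron: o_1^2 = f(net_1^2)
    o2 = sigmoid(w2[0] * o1[0] + w2[1] * o1[1] + w2[2])
    return o1, o2

def gradients(w1, w2, x, t):
    """Analytic gradients of E = 0.5*(t - o2)^2, per the chain rule above."""
    o1, o2 = forward(w1, w2, x)
    e = t - o2
    d2 = -e * o2 * (1 - o2)                    # dE/dnet_1^2 = e*(-1)*f'(net_1^2)
    g2 = [d2 * o1[0], d2 * o1[1], d2]          # dE/dw_1j^2
    g1 = [[d2 * w2[j] * o1[j] * (1 - o1[j]) * xi
           for xi in (x[0], x[1], 1.0)]        # dE/dw_ji^1
          for j in range(2)]
    return g1, g2

w1 = [[0.2, -0.4, 0.1], [0.7, 0.3, -0.2]]      # illustrative hidden weights (+bias)
w2 = [0.5, -0.6, 0.15]                         # illustrative output weights (+bias)
x, t = [1.0, 0.5], 1.0                         # illustrative training pair
g1, g2 = gradients(w1, w2, x, t)
```

A gradient-descent step then subtracts η times these gradients from the corresponding weights, exactly as in w(k+1) = w(k) − η·∂E(k)/∂w(k).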
43. Introduction to Neural Network
• Popular Activation Functions:
– Sigmoid (Logistic): σ(x) = 1 / (1 + e^(−x)), σ(x) ∈ (0, 1)
• Sigmoids saturate and tend to vanish the gradient
• exp() is a bit compute-expensive
• Sigmoid outputs are not zero-centered
44. Introduction to Neural Network
– Tanh: f(x) = (1 − e^(−2x)) / (1 + e^(−2x)), f(x) ∈ (−1, 1)
• Zero-centered
• Tanh saturates and tends to vanish the gradient
• tanh() is a bit compute-expensive
45. Introduction to Neural Network
– ReLU: f(x) = max(0, x), f(x) ∈ [0, ∞)
• Rectified Linear Unit
• Does not saturate (in the positive range)
• Very computationally efficient
• Converges faster than sigmoid/tanh
• Not zero-centered output
• Saturates (in the negative range)
46. Introduction to Neural Network
– LReLU and PReLU:
LReLU: f(x) = max(0.01x, x)
PReLU: f(x) = max(αx, x), with α a learnable parameter
• Does not saturate
• Computationally efficient
• Converges much faster
• Zero-centered output
47. Introduction to Neural Network
– ELU: f(x) = x for x > 0; α(exp(x) − 1) for x ≤ 0
• Exponential Linear Unit
• Closer to zero-mean output
• Robust to noise compared with LReLU
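The activation functions of the last few slides, gathered in one place; the α values are illustrative where the slides leave them free.

```python
import math

def sigmoid(x):            # range (0, 1)
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x):               # range (-1, 1), zero-centred
    return (1.0 - math.exp(-2.0 * x)) / (1.0 + math.exp(-2.0 * x))

def relu(x):               # range [0, +inf)
    return max(0.0, x)

def lrelu(x):              # leaky ReLU with fixed slope 0.01
    return max(0.01 * x, x)

def prelu(x, alpha):       # parametric ReLU with learnable slope alpha
    return max(alpha * x, x)

def elu(x, alpha=1.0):     # exponential linear unit
    return x if x > 0 else alpha * (math.exp(x) - 1.0)
```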
48. Introduction to Neural Network
• Properties of Activation Functions:
– Nonlinear
– Continuously differentiable
– Range
– Monotonic
– Smooth
• Bipolar and Unipolar:
– Unipolar Sigmoid: f(net) = 1 / (1 + e^(−g·net))
– Bipolar Sigmoid: f(net) = (1 − e^(−g·net)) / (1 + e^(−g·net))
49. Introduction to Neural Network
• Flexible Neural Network: the sigmoid gain g of each neuron is trained along with the weights.
– Unipolar Sigmoid: f(net_j^s(k), g) = 1 / (1 + e^(−g·net_j^s(k)))
– Bipolar Sigmoid: f(net_j^s(k), g) = (1 − e^(−g·net_j^s(k))) / (1 + e^(−g·net_j^s(k)))
– Training: g^s(k+1) = g^s(k) − η_g·∂E(k)/∂g^s(k)
50. Introduction to Neural Network
• Back Propagate (gain of the output neuron), with e = T − o_1^2:
∂E/∂g^2 = (∂E/∂e)·(∂e/∂o_1^2)·(∂o_1^2/∂g^2) = e·(−1)·f*(net_1^2)
where f* denotes the derivative of the activation with respect to its gain.
51. Introduction to Neural Network
• Flexible Neural Network: each neuron's sigmoid parameter a_j^s is also trained.
– Unipolar Sigmoid: f(net_j^s(k), a_j^s(k)) = 2 / (1 + e^(−a_j^s(k)·net_j^s(k)))
– Bipolar Sigmoid: f(net_j^s(k), a_j^s(k)) = (1 − e^(−a_j^s(k)·net_j^s(k))) / (a_j^s(k)·(1 + e^(−a_j^s(k)·net_j^s(k))))
– Training: a^s(k+1) = a^s(k) − η_a·∂E(k)/∂a^s(k)
52. Introduction to Neural Network
• Back Propagate (parameter of a hidden neuron), with e = T − o_1^2:
∂E/∂a_1^1 = (∂E/∂e)·(∂e/∂o_1^2)·(∂o_1^2/∂net_1^2)·(∂net_1^2/∂o_1^1)·(∂o_1^1/∂a_1^1) = e·(−1)·f′(net_1^2)·w_{11}^2·f*(net_1^1)
53. Introduction to Neural Network
• Radial Basis Function (RBF):
– Measures similarity between the input signal and a prototype
– Inputs x_1, …, x_n feed a layer of units with Gaussian activation functions, whose outputs are combined through weights w into the output y
56. Introduction to Neural Network
• Training of RBF Networks, with Gaussian units o_j = exp(−‖x − c_j‖² / (2σ_j²)), output y = Σ_j w_j·o_j, and error e = t − y:
– Centers c_j: c_j(k+1) = c_j(k) − η·∂E/∂c_j(k), with
∂E/∂c_j = (∂E/∂e)·(∂e/∂y)·(∂y/∂o_j)·(∂o_j/∂c_j) = −e·w_j(k)·o_j(k)·(x(k) − c_j(k)) / σ_j²(k)
– Widths σ_j: σ_j(k+1) = σ_j(k) − η·∂E/∂σ_j(k), with
∂E/∂σ_j = −e·w_j(k)·o_j(k)·‖x(k) − c_j(k)‖² / σ_j³(k)
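The two update rules can be sketched for a one-dimensional RBF network; the data, initial parameters, and learning rate below are illustrative, and only the centres and widths are trained (the output weights are held fixed).

```python
import math

def rbf_forward(x, c, s, w):
    """Gaussian hidden units o_j and linear output y = sum(w_j * o_j)."""
    o = [math.exp(-(x - cj) ** 2 / (2.0 * sj ** 2)) for cj, sj in zip(c, s)]
    y = sum(wj * oj for wj, oj in zip(w, o))
    return o, y

def rbf_step(x, t, c, s, w, eta=0.1):
    """One gradient step on the centres c_j and widths sigma_j."""
    o, y = rbf_forward(x, c, s, w)
    e = t - y
    for j in range(len(c)):
        dc = e * w[j] * o[j] * (x - c[j]) / s[j] ** 2       # -dE/dc_j
        ds = e * w[j] * o[j] * (x - c[j]) ** 2 / s[j] ** 3  # -dE/dsigma_j
        c[j] += eta * dc
        s[j] += eta * ds
    return e

c, s, w = [0.0, 1.0], [1.0, 1.0], [1.0, 1.0]   # illustrative initial parameters
errors = [abs(rbf_step(0.5, 2.0, c, s, w)) for _ in range(50)]
```

Repeated steps pull the centres toward the training input and widen the units, so the absolute error shrinks over the 50 iterations.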
58. Introduction to Neural Network
• Feedback Types in Recurrent Neural Networks:
– Local (A)
– Inter-layer (B)
– Global (C)
(The figure shows the three feedback paths on a network with input X(k) and output y(k).)
59. Introduction to Neural Network
• Local feedback activation (input weight w_x, recurrent weights w_{r,i}):
net(k) = w_x·x(k) + Σ_{i=1}^{n_1} w_{r,i}·net(k − i)
60. Introduction to Neural Network
• Local output feedback (input weights w_{x,j}, recurrent weights w_{r,i}):
net(k) = Σ_{j=0} w_{x,j}·x(k − j) + Σ_{i=1} w_{r,i}·y(k − i)
y(k) = f(net(k))
61. Introduction to Neural Network
• The Vanishing Gradient Problem:
– Causes small gradients
– Prevents the weights from updating (the figure shows a weight moving only from 0.29 to 0.28999)
62. Introduction to Neural Network
• The Exploding Gradient Problem:
– Causes large gradients
– The weights move away from their optimum value (the figure shows a weight jumping from 0.29 to 1.872351)
64. Introduction to Neural Network
• Elman Neural Network (feed forward): the hidden output is fed back through context units x_c:
o^1(k) = f^1(net^1(k)) = f^1(w_x^1(k)·x(k) + w_c^1(k)·o^1(k − 1))
x_c(k) = o^1(k − 1)
o^2(k) = f^2(net^2(k)) = f^2(w_y^2·o^1(k))
65. Introduction to Neural Network
• Elman Neural Network (back propagate):
E(k) = ½ Σ_{j=1}^{n_2} e_j²(k)
∂E/∂w_y = (∂E/∂e)·(∂e/∂o^2)·(∂o^2/∂net^2)·(∂net^2/∂w_y) = e·(−1)·f′^2(net^2)·o^1
w_y(k+1) = w_y(k) − η_y·∂E(k)/∂w_y(k)
• In the same way for w_x
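The Elman recurrence can be sketched with scalar weights (the values below are illustrative); because the hidden state is fed back through the context unit x_c(k) = o^1(k − 1), the output at step k still depends on earlier inputs.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def elman_run(xs, wx=0.8, wc=0.5, wy=1.2):
    """Run the Elman forward pass over an input sequence xs."""
    o1_prev, ys = 0.0, []
    for x in xs:
        o1 = sigmoid(wx * x + wc * o1_prev)  # o1(k) = f1(wx*x(k) + wc*o1(k-1))
        y = sigmoid(wy * o1)                 # o2(k) = f2(wy*o1(k))
        o1_prev = o1                         # context unit: x_c(k+1) = o1(k)
        ys.append(y)
    return ys

# The first input keeps influencing later outputs through the context unit
ys = elman_run([1.0, 0.0, 0.0])
```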