Machine Learning and
Neural Networks
Definitions
īēMachine learning investigates the
mechanisms by which knowledge is
acquired through experience
īēMachine Learning is the field that
concentrates on induction algorithms and
on other algorithms that can be said to
``learn.''
Model
īēA model of learning is fundamental in any
machine learning application:
ī‚ˇ who is learning (a computer program)
ī‚ˇ what is learned (a domain)
ī‚ˇ from what the learner is learning (the
information source)
A domain
īšConcept learning is one of the most studied
domain: the learner will try to come up with a
rule useful to separate positive examples
from negative examples.
The information source
  - examples: the learner is given positive and negative examples
  - queries: the learner gets information about the domain by asking questions
  - experimentation: the learner may get information by actively experimenting with the domain
Other components of the model are
• the prior knowledge
  - of the learner about the domain. For example, the learner may know that the unknown concept can be represented in a certain way
• the performance criteria
  - that define how we know that the learner has learned something and how it can demonstrate it. Performance criteria can include:
    - off-line or on-line measures
    - descriptive or predictive output
    - accuracy
    - efficiency
What techniques we will see
īēkNN algorithm
īēWinnow algorithm
īēNaïve Bayes classifier
īēDecision trees
īēReinforcement learning (Rocchio algorithm)
īēGenetic algorithm
k-NN algorithm
īēThe definition of k-nearest neighbors is
trivial:
īšSuppose that each esperience can be
represented as a point in an space For a
particular point in question, find the k points
in the population that are nearest to the point
in question. The class of the majority of the
of these neighbors is the class to the selected
point.
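A minimal sketch of this idea in Python (the function name, the toy data, and the choice of Euclidean distance via math.dist are ours, not from the slides):

```python
from collections import Counter
import math

def knn_classify(point, examples, k=3):
    """Classify `point` by majority vote among its k nearest examples.

    `examples` is a list of (coordinates, class_label) pairs;
    Euclidean distance is used as the default metric.
    """
    neighbors = sorted(examples, key=lambda ex: math.dist(point, ex[0]))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Toy usage: two clusters of already-classified inputs.
examples = [((0, 0), "c1"), ((0, 1), "c1"), ((1, 0), "c1"),
            ((5, 5), "c2"), ((5, 6), "c2"), ((6, 5), "c2")]
print(knn_classify((0.5, 0.5), examples, k=3))  # -> "c1"
```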
k-NN algorithm
[Figure: inputs already classified as c1-c4 scattered around a new input; the majority class among its nearest neighbors is Class 1, so the new input is assigned Class 1.]
k-NN algorithm
īēFinding the k-nearest neighbors reliably
and efficiently can be difficult. Other
metrics that the Euclidean can be used.
īēThe implicit assumption in using any k-
nearest neighbors technique is that items
with similar attributes tend to cluster
together.
k-NN algorithm
īēThe k-nearest neighbors method is most
frequently used to tentatively classify
points when firm class bounds are not
established.
īēThe learning is done using only positive
examples not negative.
k-NN algorithm
īēUsed in
īš Schwab, I., Pohl, W., and Koychev, I. (2000) Learning to recommend from
positive evidence. In: H. Lieberman (ed.) Proceedings of 2000 International
Conference on Intelligent User Interfaces, New Orleans, LA, January 9-12, 2000,
ACM Press, pp. 241-247
Winnow Algorithm
īēIs useful to distinguish binary patterns
into two classes using a threshold S and a
set of weights
īēthe pattern x holds to the class y=1 if
$$\sum_j w_j x_j > s \qquad (1)$$
Winnow Algorithm
īēThe algorithm:
īštake an example (x, y)
īšgenerate the answer of the classifier
īšif the answer is correct do nothing
īšelse apply some correction
īƒĨ
ī€Ŋ
j
j
j x
w
y'
Winnow Algorithm
īēIf y’>y the the weights are too high and
are diminished
īēIf y’<y the the weights are too low and
are corrected
in both cases are corrected only the ones
corresponding to 1
ī€Ŋ
j
x
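The slides leave the correction unspecified; the classic Winnow correction is multiplicative. A sketch under that assumption (the update factor alpha = 2 and the threshold s = n/2 are common choices, not from the slides):

```python
def winnow_train(examples, n, s=None, alpha=2.0, epochs=10):
    """Winnow sketch: binary features x in {0,1}, labels y in {0,1}.

    Classifies x as 1 when sum_j w_j * x_j > s. On a mistake, only the
    weights with x_j == 1 are corrected: divided by alpha when the
    prediction was too high (y' > y), multiplied when too low (y' < y).
    """
    w = [1.0] * n
    if s is None:
        s = n / 2  # a common choice of threshold
    for _ in range(epochs):
        for x, y in examples:
            y_pred = 1 if sum(wj * xj for wj, xj in zip(w, x)) > s else 0
            if y_pred == y:
                continue
            for j in range(n):
                if x[j] == 1:
                    w[j] = w[j] / alpha if y_pred > y else w[j] * alpha
    return w

# Toy usage: target concept is "x[0] OR x[1]".
data = [((1, 0, 0), 1), ((0, 1, 1), 1), ((0, 0, 1), 0), ((0, 0, 0), 0)]
print(winnow_train(data, n=3))
```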
Winnow Algorithm application
īēUsed in
īš M.J. Pazzani “ A framework for Collaborative, Content Based and Demographic
Filtering” Artificial Intelligence Review, Dec 1999
īš R.Armstrong, D. Freitag, T. Joachims, and T. Mitchell " WebWatcher: A Learning
Apprentice for the World Wide Web " 1995.
Naïve Bayes Classifier
īēBayes theorem : given an Hypotesis H,
an Evidence E and a context c
$$P(H \mid E, c) = \frac{P(E \mid H, c) \cdot P(H \mid c)}{P(E \mid c)}$$
Naïve Bayes Classifier
īēSuppose to have a set of objects that can
hold to two categories, y1 and y2,
described using n features x1, x2, â€Ļ, xn.
īēIf
īēthen the object holds to the category y1
1
)
|
(
)
|
(
2
1
ī€ž
x
x
y
P
y
P We drop
the context
Naïve Bayes Classifier
• Using Bayes' theorem, and supposing that all the features are not correlated (conditionally independent given the class):
$$\frac{P(y_1 \mid \mathbf{x})}{P(y_2 \mid \mathbf{x})} = \frac{P(\mathbf{x} \mid y_1)\,P(y_1)}{P(\mathbf{x} \mid y_2)\,P(y_2)} = \frac{P(x_1 \mid y_1)\,P(x_2 \mid y_1)\cdots P(x_n \mid y_1)\,P(y_1)}{P(x_1 \mid y_2)\,P(x_2 \mid y_2)\cdots P(x_n \mid y_2)\,P(y_2)}$$
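A minimal sketch of the resulting classifier in Python (binary features; the Laplace smoothing and all names are our own additions):

```python
from collections import defaultdict

def train_naive_bayes(examples):
    """Estimate class counts and per-class feature counts from
    (features, label) pairs, with features assumed binary and
    conditionally independent given the class, as on the slide."""
    class_counts = defaultdict(int)
    feat_counts = defaultdict(lambda: defaultdict(int))
    for x, y in examples:
        class_counts[y] += 1
        for j, xj in enumerate(x):
            feat_counts[y][j] += xj
    return class_counts, feat_counts

def posterior_ratio(x, class_counts, feat_counts, y1, y2):
    """Return P(y1 | x) / P(y2 | x); the common factor P(x) cancels."""
    def score(y):
        n = class_counts[y]
        p = n / sum(class_counts.values())
        for j, xj in enumerate(x):
            p_j = (feat_counts[y][j] + 1) / (n + 2)  # smoothed P(x_j=1|y)
            p *= p_j if xj else (1 - p_j)
        return p
    return score(y1) / score(y2)

data = [((1, 1), "y1"), ((1, 0), "y1"), ((0, 1), "y2"), ((0, 0), "y2")]
cc, fc = train_naive_bayes(data)
print(posterior_ratio((1, 1), cc, fc, "y1", "y2"))  # > 1 -> choose y1
```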
Naïve Bayes Classifier
īēUsed in:
īš Mladenic, D. (2001) Using text learning to help Web browsing. In: M. Smith, G.
Salvendy, D. Harris and R. J. Koubek (eds.) Usability evaluation and interface
design. Vol. 1, (Proceedings of 9th International Conference on Human-
Computer Interaction, HCI International'2001, New Orleans, LA, August 8-10,
2001) Mahwah, NJ: Lawrence Erlbaum Associates, pp. 893-897.
īš Schwab, I., Pohl, W., and Koychev, I. (2000) Learning to recommend from
positive evidence. In: H. Lieberman (ed.) Proceedings of 2000 International
Conference on Intelligent User Interfaces, New Orleans, LA, January 9-12, 2000,
ACM Press, pp. 241-247, also available at .Self, J. (1986) The application of
machine learning to student modelling. Instr. Science, Instructional Science 14,
327-338.
Naïve Bayes Classifier
īš Bueno D., David A. A. (2001) METIORE: A Personalized Information Retrieval
System. In M. Bauer, P. J. Gmytrasiewicz and J. Vassileva (eds.) User Modeling
2001. Lecture Notes on Artificial Intelligence, Vol. 2109, (Proceedings of 8th
International Conference on User Modeling, UM 2001, Sonthofen, Germany, July
13-17, 2001) Berlin: Springer-Verlag, pp. 188-198.
īš Frasconi P., Soda G., Vullo A., Text Categorization for Multi-page Documents: A
HybridNaive Bayes HMM Approach, ACM JCDL’01, June 24-28, 2001
Decision trees
īēA decision tree is a tree whose internal
nodes are tests (on input patterns) and
whose leaf nodes are categories (of
patterns).
īēEach test has mutually exclusive and
exhaustive outcomes.
Decision trees
[Figure: a decision tree with internal test nodes T1-T4 and leaves labelled with 3 classes; 4 tests (maybe 4 variables).]
Decision trees
īēThe test:
īšmight be multivariate (tests on several
features of the input) or univariate (test only
one feature);
īšmight have two or more outcomes.
īēThe features can be categorical or
numerical.
Decision trees
īēSuppose to have n binary features
īēThe main problem in learning decision
trees is to decide the order of tests on
variables
īēIn order to decide, the average entropy of
each test attribute is calculated and the
lower one is chosen.
Decision trees
īēIf we have binary patterns and a set of
pattern ī‘ it is possible to write the
entropy as
were p(i|ī‘) is the probability that a random
pattern from ī‘ belongs to the class i
)
|
(
log
)
|
(
)
( 2 ī‘
ī‘
ī€­
ī€Ŋ
ī‘ īƒĨ i
p
i
p
H
i
Decision trees
īēWe will approximate the probability p(i|ī‘)
using the number of patterns in ī‘
belonging to the class i divided by the
total number of pattern in ī‘
Decision trees
If a test T has k outcomes, k subsets Ξ1, Ξ2, …, Ξk are considered, with n1, n2, …, nk patterns.
[Figure: a test T branching into outcomes 1, …, j, …, k.]
For each subset it is possible to calculate:
$$H(\Xi_j) = -\sum_i p(i \mid \Xi_j)\,\log_2 p(i \mid \Xi_j)$$
Decision trees
īēThe average entropy over all the ī‘j
again we evaluate p(ī‘j ) has the number of patterns in
ī‘ that outcomes j divided by the total number of
patterns in ī‘
īģ īŊ )
(
)
(
)
( j
j
j
j
T H
p
H
E ī‘
ī‘
ī€­
ī€Ŋ
ī‘ īƒĨ
Decision trees
īēWe calculate the average entropy for all
the test T and chose the lower one.
īēWe write the part of the tree and go head
in order to chose again the test that gives
the lower entropy
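A sketch of the test-selection step in Python (the helper names and toy data are ours; probabilities are estimated by counting, as the slides describe):

```python
import math
from collections import Counter

def entropy(labels):
    """H(Xi) = -sum_i p(i|Xi) * log2 p(i|Xi), with p estimated by counts."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def average_entropy(patterns, labels, test):
    """E{H(T)} = sum_j p(Xi_j) * H(Xi_j) over the subsets induced by `test`."""
    buckets = {}
    for x, y in zip(patterns, labels):
        buckets.setdefault(test(x), []).append(y)
    total = len(labels)
    return sum(len(subset) / total * entropy(subset)
               for subset in buckets.values())

# Toy usage: pick the binary feature whose test gives the lowest
# average entropy (the lambda tests are our own illustration).
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = ["a", "a", "b", "b"]
tests = {f"x{j}": (lambda j: lambda x: x[j])(j) for j in range(2)}
best = min(tests, key=lambda name: average_entropy(X, y, tests[name]))
print(best)  # -> "x0", which separates the classes perfectly
```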
Decision trees
īēThe knowledge in the tree is strongly
dependent from the examples
Reinforcement Learning
īē An agent tries to optimize its interaction
with a dynamic environment using trial
and error.
īēThe agent can make an action u that
applied to the environment changes its
state from x to x’. The agent receives a
reinforcement r.
Reinforcement Learning
īēThere are three parts of a Reinforcement
Learning Problem:
īšThe environment
īšThe reinforcement function
īšThe value function
Reinforcement Learning
īēThe environment
at least partially observable by means of
sensors or symbolic description. The theory is
based on an environment that shows its
“true” state.
Reinforcement Learning
īēThe reinforcement function
a mapping from the couple (state, action) to
the reinforcement value. There are three
classes of reinforcement functions:
ī¸Pure delayed reward: the reinforcements are
all zero except for the terminal state (games,
inverted pendulum)
ī¸Minimum time to goal: cause an agent to
perform actions that generate the shortest path to
a goal state
Reinforcement Learning
ī¸Minimization: the reinforcement is a function of
of limited resources and the agent have to achieve
the goal while minimizing the energy used
Reinforcement Learning
īēThe Value Function:
defines how to choose a “good” action. First
we have to define
ī¸policy (state) action
ī¸value of a state I (following a defined policy)
the optimal policy maximize the value of a state
īƒĨ
T
i
i
r T is the final state
Reinforcement Learning
īēThe Value Function
is a mapping (state) State Value
If the optimal value function is founded the
optimal policy can be extracted.
Reinforcement Learning
īēGiven a state xt
V*(xt) is the optimal state value;
V(xt) is the approximation we have;
where e(xt) is the approximation error
)
(
)
(
)
( *
t
t
t x
V
x
e
x
V ī€Ģ
ī€Ŋ
Reinforcement Learning
īēMoreover
where ī§ is a discount factor that causes
immediate reinforcement to have more
importance than future reinforcements
)
(
)
(
)
( 1
*
*
ī€Ģ
ī€Ģ
ī€Ŋ t
t
t x
V
x
r
x
V ī§
)
(
)
(
)
( 1
ī€Ģ
ī€Ģ
ī€Ŋ t
t
t x
V
x
r
x
V ī§
Reinforcement Learning
īēWe can find
that gives
(**)
ī› ī
)
(
)
(
)
(
)
(
)
(
)
(
)
(
)
(
)
(
)
(
1
1
*
*
1
*
1
*
ī€Ģ
ī€Ģ
ī€Ģ
ī€Ģ
ī€Ģ
ī€Ģ
ī€Ŋ
ī€Ģ
ī€Ģ
ī€Ģ
ī€Ŋ
ī€Ģ
t
t
t
t
t
t
t
t
t
t
x
e
x
V
x
r
x
V
x
e
x
V
x
e
x
r
x
V
x
e
ī§
ī§
ī§
)
(
)
( 1
ī€Ģ
ī€Ŋ t
t x
e
x
e ī§
Reinforcement Learning
īēThe learning process goal is to find an
approximation V(xt) that makes the
equation (**) true for all the state.
The finale state T of a process has a value that is
defined a priori so e(T)=0, so e(T-1)=0 it the (**) is true
and then backwards to the initial state.
Reinforcement Learning
īēAssuming that the function approximator for the
V* is a look-up table (a table with an
approximate state value w for each state) then
it is possible to sweep through the state space
and update the values in the table according to:
ī€¨ ī€Š )
(
)
(
)
,
(
max 1 t
t
t
u
x
V
x
V
u
x
r
w ī€­
ī€Ģ
ī€Ŋ
ī„ ī€Ģ
ī§
Reinforcement Learning
where u is the action performed that causes the transition to the state x_{t+1}. This must be done by using some kind of simulation in order to evaluate $\max_u \big( V(x_{t+1}) \big)$.
Reinforcement Learning
The last equation can be rewritten as
$$e(x_t) = \max_u \big( r(x_t, u) + \gamma\,V(x_{t+1}) \big) - V(x_t)$$
Each update reduces the value of e(x_{t+1}); the learning stops when e(x_{t+1}) = 0.
Rocchio Algorithm
īēUsed in Relevance Feedback in IR
īēWe represent a user profile and the
objects (documents) using the same
space
m represents the user
w represent the objects (documents)
Rocchio Algorithm
īēThe object (document) is matched to the
user using an available matching criteria
(cosine measure)
īēThe user model is updated using
where s is a function of the feedback
w
m
m
w s
s
u ī€Ģ
ī€Ŋ
)
,
,
(
Rocchio Algorithm
īēIt is possible to use a collection of vectors
m to represent the user’s interests
Rocchio and Reinforcement Learning
• The goal is to have the "best" user profile.
• The state is defined by the weight vector of the user profile.
Rocchio Algorithm (IR)
$$Q' = Q + \alpha\,\frac{1}{n_1}\sum_{i=1}^{n_1} R_i \;-\; \beta\,\frac{1}{n_2}\sum_{i=1}^{n_2} S_i$$
where
Q is the vector of the initial query,
R_i are the vectors of the relevant documents,
S_i are the vectors of the irrelevant documents,
α, β are Rocchio's weights.
Rocchio algorithm
īēUsed in
īš Seo, Y.-W. and Zhang, B.-T. (2000) A reinforcement learning agent for
personalized information filtering. In: H. Lieberman (ed.) Proceedings of 2000
International Conference on Intelligent User Interfaces, New Orleans, LA,
January 9-12, 2000, ACM Press, pp. 248-251
īš Balabanovic M. “An Adaptive Web Page Recomandation Service in Proc. Of 1th
International Conference on Autonomous Agents 1997
Genetic Algorithms
īēGenetic algorithms are inspired by natural
evolution. In the natural world, organisms
that are poorly suited for an environment
die off, while those well-suited for it
prosper.
īēEach individual is a bit-string that encodes
its characteristics. Each element of the
string is called a gene.
Genetic Algorithms
īēGenetic algorithms search the space of
individuals for good candidates.
īēThe "goodness" of an individual is
measured by some fitness function.
Search takes place in parallel, with many
individuals in each generation.
Genetic Algorithms
īēThe algorithm consists of looping through
generations. In each generation, a subset
of the population is selected to reproduce;
usually this is a random selection in which
the probability of choice is proportional to
fitness.
Genetic Algorithms
īēReproduction occurs by randomly pairing
all of the individuals in the selection pool,
and then generating two new individuals
by performing crossover, in which the
initial n bits (where n is random) of the
parents are exchanged. There is a small
chance that one of the genes in the
resulting individuals will mutate to a new
value.
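A minimal sketch of one generation in Python (the fitness function, mutation rate, and population sizes are our own illustration):

```python
import random

def crossover(a, b):
    """Exchange the initial n bits of the parents (n chosen at random)."""
    n = random.randint(1, len(a) - 1)
    return a[:n] + b[n:], b[:n] + a[n:]

def mutate(bits, p=0.01):
    """Flip each gene independently with a small probability p."""
    return [g ^ 1 if random.random() < p else g for g in bits]

def next_generation(population, fitness, n_pairs):
    """Fitness-proportional selection, then crossover and mutation.

    A minimal sketch; real GAs add elitism, termination tests, etc.
    """
    weights = [fitness(ind) + 1e-9 for ind in population]  # avoid all-zero
    children = []
    for _ in range(n_pairs):
        a, b = random.choices(population, weights=weights, k=2)
        c1, c2 = crossover(a, b)
        children += [mutate(c1), mutate(c2)]
    return children

# Toy usage: fitness = number of 1-bits ("one-max").
pop = [[random.randint(0, 1) for _ in range(8)] for _ in range(6)]
for _ in range(20):
    pop = next_generation(pop, fitness=sum, n_pairs=3)
print(max(pop, key=sum))
```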
Neural Networks
īēAn artificial network consists of a pool of
simple processing units which
communicate by sending signals to each
other over a large number of weighted
connections.
Artificial Neuron
[Figure: an artificial neuron j with inputs x1, x2, …, xn, weights w1j, w2j, …, wnj, and bias bj.]
$$s_j = \sum_{i=0}^{n} w_{ij}\,x_i + b_j \qquad y_j = f(s_j) \qquad f(s_j) = \frac{1}{1 + e^{-s_j}}$$
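A one-function sketch of this neuron in Python (the toy weights are our own):

```python
import math

def neuron_output(x, w, b):
    """s_j = sum_i w_ij * x_i + b_j, followed by the sigmoid f(s)."""
    s = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-s))

print(neuron_output([1.0, 0.5], [0.4, -0.2], b=0.1))
```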
Neural Networks
īēEach unit performs a relatively simple job:
receive input from neighbors or external sources
and use this to compute an output signal which
is propagated to other units (Test stage).
īēApart from this processing, there is the task of
the adjustment of the weights (Learning stage).
īēThe system is inherently parallel in the sense
that many units can carry out their computations
at the same time.
Neural Networks
1. Learning stage
2. Test stage (working stage)
"Your knowledge is useless!!"
Classification (connections)
As for this pattern of connections, the main distinction we can make is between:
• Feed-forward networks, where the data flow from input to output units is strictly feed-forward. The data processing can extend over multiple layers of units, but no feedback connections or connections between units of the same layer are present.
Classification
īēRecurrent networks that do contain feedback
connections. Contrary to feed-forward networks,
the dynamical properties of the network are
important. In some cases, the activation values
of the units undergo a relaxation process such
that the network will evolve to a stable state in
which these activations do not change anymore.
Classification (connections)
Recurrent Networks
īēIn other applications, the change of the
activation values of the output neurons are
significant, such that the dynamical behavior
constitutes the output of the network.
Classification (Learning)
We can categorise the learning situations into two distinct sorts:
• Supervised learning, in which the network is trained by providing it with input and matching output patterns. These input-output pairs are usually provided by an external teacher.
• Unsupervised learning, in which an (output) unit is trained to respond to clusters of patterns within the input. In this paradigm the system is supposed to discover statistically salient features of the input population. Unlike the supervised learning paradigm, there is no a priori set of categories into which the patterns are to be classified; rather, the system must develop its own representation of the input stimuli.
Perceptron
īēA single layer feed-forward network consists of
one or more output neurons, each of which is
connected with a weighting factor wij to all of the
inputs xi.
xi
b
b
Perceptron
īēIn the simplest case the network has only two
inputs and a single output. The output of the
neuron is:
īēsuppose that the activation function is a
threshold
īƒˇ
īƒ¸
īƒļ
īƒ§
īƒ¨
īƒĻ
ī€Ģ
ī€Ŋ īƒĨ
ī€Ŋ
2
1
i
i
i b
x
w
f
y
īƒŽ
īƒ­
īƒŦ
ī‚Ŗ
ī€­
ī€ž
ī€Ŋ
0
1
0
1
s
if
s
if
f
Perceptron
īēIn this example the simple network (the
neuron) can be used to separate the
inputs in two classes.
īēThe separation between the two classes is
given by
0
2
2
1
1 ī€Ŋ
ī€Ģ
ī€Ģ b
x
w
x
w
Perceptron
[Figure: points of two classes in the (x1, x2) plane, separated by the line w1 x1 + w2 x2 + b = 0.]
Learning in Perceptrons
īēThe weights of the neural networks are
modified during the learning phase
ij
ij
ij
ij
ij
ij
b
t
b
t
b
w
t
w
t
w
ī„
ī€Ģ
ī€Ŋ
ī€Ģ
ī„
ī€Ģ
ī€Ŋ
ī€Ģ
)
(
)
1
(
)
(
)
1
(
Learning in Perceptrons
īēStart with random weights
īēSelect an input couple (x, d(x))
īēif then modify the weight
according with
Note that the weights are not modified if the
network gives the correct answer
i
ij x
x
d
w )
(
ī€Ŋ
ī„
)
(x
d
y ī‚š
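A sketch of this training loop in Python (the toy AND-like problem and the bias update Δb = d(x) are our own additions):

```python
def train_perceptron(examples, n, epochs=10):
    """Perceptron learning rule sketch for the threshold unit above.

    d(x) is the desired output in {-1, +1}; the weights (and bias b)
    change only on a mistake, by delta_w_ij = d(x) * x_i.
    """
    w = [0.0] * n
    b = 0.0
    for _ in range(epochs):
        for x, d in examples:
            s = sum(wi * xi for wi, xi in zip(w, x)) + b
            y = 1 if s > 0 else -1
            if y != d:
                w = [wi + d * xi for wi, xi in zip(w, x)]
                b += d
    return w, b

# Toy usage: a linearly separable AND-like problem.
data = [((0, 0), -1), ((0, 1), -1), ((1, 0), -1), ((1, 1), 1)]
print(train_perceptron(data, n=2))
```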
Convergence theorem
īēIf there exists a set of connection weights
w* which is able to perform the
transformation y = d(x), the perceptron
learning rule will converge to some
solution (which may or may not be the
same as w* ) in a finite number of steps for
any initial choice of the weights.
Linear Units
[Figure: a linear unit j with inputs x1, …, xn and weights w1j, …, wnj.]
$$s_j = \sum_{i=0}^{n} w_{ij}\,x_i + b_j \qquad y_j = s_j$$
The Delta Rule 1
īēThe idea is to make the change of the
weight proportional to the negative
derivative of the error
ij
i
i
ij
ij
w
y
y
E
w
E
w
ī‚ļ
ī‚ļ
ī‚ļ
ī‚ļ
ī€Ŋ
ī‚ļ
ī‚ļ
ī€­
ī€Ŋ
ī„ ī§
The Delta Rule 2
$$\frac{\partial y_i}{\partial w_{ij}} = x_j \qquad \frac{\partial E}{\partial y_i} = -(d_i - y_i) = -\delta_i$$
$$\Delta w_{ij} = \gamma\,\delta_i\,x_j \qquad (1)$$
Backpropagation
īēThe multi-layer networks with a linear
activation can classify only linear
separable inputs or, in case of function
approximation, only linear functions can
be represented.
Backpropagation
[Figure: a two-layer network with inputs x1, x2, …, xn, input-to-hidden weights vjk, hidden units hj, hidden-to-output weights wij, and outputs yi.]
Backpropagation
īēWhen a learning pattern is clamped, the
activation values are propagated to the
output units, and the actual network output
is compared with the desired output
values, we usually end up with an error in
each of the output units. Let's call this
error eo for a particular output unit o. We
have to bring eo to zero.
Backpropagation
īēThe simplest method to do this is the
greedy method: we strive to change the
connections in the neural network in such
a way that, next time around, the error eo
will be zero for this particular pattern. We
know from the delta rule that, in order to
reduce an error, we have to adapt its
incoming weights according to the last
equation (1)
Backpropagation
īēIn order to adapt the weights from input to
hidden units, we again want to apply the
delta rule. In this case, however, we do
not have a value for for the hidden units.
Backpropagation
īēCalculate the activation of the hidden
units
īƒˇ
īƒ¸
īƒļ
īƒ§
īƒ¨
īƒĻ
ī€Ŋ īƒĨ
ī€Ŋ
n
k
k
jk
j x
v
f
h
0
Backpropagation
īēAnd the activation of the output units
īƒˇ
īƒˇ
īƒ¸
īƒļ
īƒ§
īƒ§
īƒ¨
īƒĻ
ī€Ŋ īƒĨ
ī€Ŋ0
j
j
ij
i h
w
f
y
Backpropagation
īēIf we have ī­ pattern to learn the error is
ī€¨ ī€Š
2
0
2
1
2
2
1
2
2
1
īƒĨīƒĨ īƒĨ īƒĨ
īƒĨīƒĨ īƒĨ
īƒĨīƒĨ
īƒē
īƒē
īƒģ
īƒš
īƒĒ
īƒĒ
īƒĢ
īƒŠ
īƒˇ
īƒˇ
īƒ¸
īƒļ
īƒ§
īƒ§
īƒ¨
īƒĻ
īƒˇ
īƒ¸
īƒļ
īƒ§
īƒ¨
īƒĻ
ī€­
ī€Ŋ
īƒē
īƒē
īƒģ
īƒš
īƒĒ
īƒĒ
īƒĢ
īƒŠ
īƒˇ
īƒˇ
īƒ¸
īƒļ
īƒ§
īƒ§
īƒ¨
īƒĻ
ī€­
ī€Ŋ
ī€Ŋ
ī€­
ī€Ŋ
ī€Ŋ
ī­
ī­
ī­
ī­
ī­
ī­
ī­
ī­
ī­
i j
n
k
jk
ij
i
i j
ij
i
i
i
i
k
j
x
v
f
w
f
t
h
w
f
t
y
t
E
Backpropagation
$$\Delta w_{ij} = -\eta \frac{\partial E}{\partial w_{ij}} = \eta \sum_{\mu}\left(t_i^{\mu} - y_i^{\mu}\right) f'(A_i^{\mu})\,h_j^{\mu} = \eta \sum_{\mu} \delta_i^{\mu}\,h_j^{\mu}$$
with
$$\delta_i^{\mu} = \left(t_i^{\mu} - y_i^{\mu}\right) f'(A_i^{\mu})$$
Backpropagation
$$\Delta v_{jk} = -\eta \frac{\partial E}{\partial v_{jk}} = -\eta \sum_{\mu} \frac{\partial E}{\partial h_j^{\mu}}\frac{\partial h_j^{\mu}}{\partial v_{jk}} = \eta \sum_{\mu}\sum_{i}\left(t_i^{\mu} - y_i^{\mu}\right) f'(A_i^{\mu})\,w_{ij}\,f'(A_j^{\mu})\,x_k^{\mu} = \eta \sum_{\mu}\sum_{i} \delta_i^{\mu}\,w_{ij}\,f'(A_j^{\mu})\,x_k^{\mu}$$
Backpropagation
īēThe weight correction is given by :
īƒĨ
ī€Ŋ
ī„
īŽ
ī­
ī­
ī¤
ī¨ n
m
mn x
w
ī€¨ ī€Š ī€¨ ī€Š
ī­
ī­
ī­
ī­
ī¤ m
m
m
m A
f
y
t '
ī€­
ī€Ŋ
ī€¨ ī€ŠīƒĨ
ī€Ŋ
s
s
sm
m
m w
A
f ī­
ī­
ī­
ī¤
ī¤ '
Where
If m is the output layer
If m is an hidden layer
or
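A compact NumPy sketch of these updates for one hidden layer (following the slides' convention of an extra constant input that carries the bias; the XOR toy task, learning rate, and layer sizes are our own):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def backprop_epoch(X, T, V, W, eta=0.5):
    """One batch update implementing the rules above.

    X is (patterns x inputs), T is (patterns x outputs); V maps inputs
    to hidden units, W maps hidden units to outputs. A constant-1
    column carries the biases, so V is (hidden x inputs+1) and W is
    (outputs x hidden+1). For the sigmoid, f'(A) = f(A) * (1 - f(A)).
    """
    Xb = np.hstack([X, np.ones((len(X), 1))])          # append x_0 = 1
    H = sigmoid(Xb @ V.T)                              # hidden outputs h_j
    Hb = np.hstack([H, np.ones((len(H), 1))])          # append h_0 = 1
    Y = sigmoid(Hb @ W.T)                              # network outputs y_i
    delta_out = (T - Y) * Y * (1 - Y)                  # (t_i - y_i) f'(A_i)
    delta_hid = (delta_out @ W[:, :-1]) * H * (1 - H)  # f'(A_j) sum_i d_i w_ij
    W += eta * delta_out.T @ Hb                        # Δw_ij = η Σ δ_i h_j
    V += eta * delta_hid.T @ Xb                        # Δv_jk = η Σ δ_j x_k
    return float(0.5 * np.sum((T - Y) ** 2))           # the error E

# Toy usage: learn XOR with 2 hidden units.
rng = np.random.default_rng(0)
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
T = np.array([[0.], [1.], [1.], [0.]])
V = rng.normal(size=(2, 3))
W = rng.normal(size=(1, 3))
for _ in range(5000):
    err = backprop_epoch(X, T, V, W, eta=2.0)
print(round(err, 4))  # err should shrink towards 0 as XOR is learned
```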
Recurrent Networks
īēWhat happens when we introduce a
cycle? For instance, we can connect a
hidden unit with itself over a weighted
connection, connect hidden units to input
units, or even connect all units with each
other ?
Hopfield Network
īēThe Hopfield network consists of a set of
N interconnected neurons which update
their activation values asynchronously and
independently of other neurons.
īēAll neurons are both input and output
neurons. The activation values are binary
(+1, -1)
Hopfield Network
īēThe state of the system is given by the
activation values y = (y k ).
īēThe net input s k (t +1) of a neuron k at
cycle (t +1) is a weighted sum
īƒĨ
ī‚š
ī€Ģ
ī€Ŋ
ī€Ģ
k
j
k
jk
j b
w
t
y
t
s )
(
)
1
(
Hopfield Network
īēA threshold function is applied to obtain
the output
ī€¨ ī€Š
)
1
(
sgn
)
1
( ī€Ģ
ī€Ŋ
ī€Ģ t
s
t
y k
k
Hopfield Network
īēA neuron k in the net is stable at time t
I.e.
īēA state is state if all the neurons are
stable
ī€¨ ī€Š
)
1
(
sgn
)
( ī€­
ī€Ŋ t
s
t
y k
k
Hopfield Networks
īēIf wjk = wkj the behavior of the system can
be described with an energy function
īēThis kind of network has stable limit points
īƒĨ
īƒĨīƒĨ ī€­
ī€­
ī€Ŋ
ī‚š k
k
k
jk
k
k
j
j y
b
w
y
y
2
1
īĨ
Hopfield net. applications
īēA primary application of the Hopfield
network is an associative memory.
īēThe states of the system corresponding
with the patterns which are to be stored in
the network are stable.
īēThese states can be seen as `dips' in
energy space.
Hopfield Networks
īēIt appears, however, that the network gets
saturated very quickly, and that about
0.15N memories can be stored before
recall errors become severe.
Hopfield Networks
[Figure: an input state is attracted to the nearest stable state of the network.]
Hopfield Networks
īēUsed in
īšChung, Y.-M., Pottenger, W. M., and Schatz, B. R. (1998)
Automatic subject indexing using an associative neural network.
In: I. Witten, R. Akscyn and F. M. Shipman III (eds.)
Proceedings of The Third ACM Conference on Digital Libraries
(Digital Libraries '98), Pittsburgh, USA, June 23-26, 1998, ACM
Press, pp. 59-6
Self Organization
īēThe unsupervised weight adapting
algorithms are usually based on some
form of global competition between the
neurons.
īēApplications of self-organizing networks
are:
S.O. Applications
īēclustering: the input data may be
grouped in `clusters' and the data
processing system has to find these
inherent clusters in the input data.
S.O. Applications
īēvector quantisation: this problem occurs
when a continuous space has to be
discretised. The input of the system is the
n-dimensional vector x, the output is a
discrete representation of the input space.
The system has to find optimal
discretisation of the input space.
S.O. Applications
īēdimensionality reduction: the input data
are grouped in a subspace which has
lower dimensionality than the
dimensionality of the data. The system
has to learn an “optimal” mapping.
S.O. Applications
īēfeature extraction: the system has to
extract features from the input signal. This
often means a dimensionality reduction as
described above.
Self-Organizing Networks
īēLearning Vector Quantization
īēKohonen maps
īēPrincipal Components Networks
īēAdaptive Resonance Theory
Kohonen Maps
īēIn the Kohonen network, the output units
are ordered in some fashion, often in a
two-dimensional grid or array, although
this is application-dependent.
Kohonen Maps
The input x is given to
all the units at the same
time
Kohonen Maps
The weights of the winner unit are updated, together with the weights of its neighbors.
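A minimal sketch of one Kohonen update in NumPy (the Gaussian neighborhood and all parameter values are our own common choices; the slides only say that the winner and its neighbors are updated):

```python
import numpy as np

def som_step(weights, x, eta=0.5, sigma=1.0):
    """One Kohonen update on a grid of units.

    `weights` has shape (rows, cols, dim). The winner is the unit whose
    weight vector is closest to the input x; it and its grid neighbors
    move towards x, weighted by a Gaussian neighborhood function.
    """
    dists = np.linalg.norm(weights - x, axis=2)
    win = np.unravel_index(np.argmin(dists), dists.shape)
    rows, cols = np.indices(dists.shape)
    grid_d2 = (rows - win[0]) ** 2 + (cols - win[1]) ** 2
    h = np.exp(-grid_d2 / (2 * sigma ** 2))       # neighborhood factor
    weights += eta * h[:, :, None] * (x - weights)
    return weights

# Toy usage: a 4x4 map organizing random 2-D points.
rng = np.random.default_rng(0)
weights = rng.random((4, 4, 2))
for _ in range(500):
    som_step(weights, rng.random(2))
print(weights[0, 0])  # weight vectors spread over the unit square
```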
Kohonen Maps
īēUsed in:
īš Fulantelli, G., Rizzo, R., Arrigo, M., and Corrao, R. (2000) An adaptive open
hypermedia system on the Web. In: P. Brusilovsky, O. Stock and C. Strapparava
(eds.) Adaptive Hypermedia and Adaptive Web-Based Systems. Lecture Notes in
Computer Science, (Proceedings of Adaptive Hypermedia and Adaptive Web-
based Systems, AH2000, Trento, Italy, August 28-30, 2000) Berlin: Springer-
Verlag, pp. 189-201.
īš Goren-Bar, D., Kuflik, T., Lev, D., and Shoval, P. (2001) Automating personal
categorizations using artificial neural network. In: M. Bauer, P. J. Gmytrasiewicz
and J. Vassileva (eds.) User Modeling 2001. Lecture Notes on Artificial
Intelligence, Vol. 2109, (Proceedings of 8th International Conference on User
Modeling, UM 2001, Sonthofen, Germany, July 13-17, 2001) Berlin: Springer-
Verlag, pp. 188-198.
Kohonen Maps
īš Kayama, M. and Okamoto, T. (1999) Hy-SOM: The semantic map framework
applied on an example case of navigation. In: G. Gumming, T. Okamoto and L.
Gomez (eds.) Advanced Research in Computers and Communications in
Education. Frontiers ub Artificial Intelligence and Applications, Vol. 2,
(Proceedings of ICCE'99, 7th International Conference on Computers in
Education, Chiba, Japan, 4-7 November, 1999) Amsterdam: IOS Press, pp. 252-
259.
īš Taskaya, T., Contreras, P., Feng, T., and Murtagh, F. (2001) Interactive visual
user interfaces to databases. In: M. Smith, G. Salvendy, D. Harris and R. J.
Koubek (eds.) Usability evaluation and interface design. Vol. 1, (Proceedings of
9th International Conference on Human-Computer Interaction, HCI
International'2001, New Orleans, LA, August 8-10, 2001) Mahwah, NJ: Lawrence
Erlbaum Associates, pp. 913-917.

More Related Content

Similar to Machine Learning and Artificial Neural Networks.ppt

Methodological study of opinion mining and sentiment analysis techniques
Methodological study of opinion mining and sentiment analysis techniquesMethodological study of opinion mining and sentiment analysis techniques
Methodological study of opinion mining and sentiment analysis techniquesijsc
 
Classifiers
ClassifiersClassifiers
ClassifiersAyurdata
 
20070702 Text Categorization
20070702 Text Categorization20070702 Text Categorization
20070702 Text Categorizationmidi
 
OBJECTRECOGNITION1.pptxjjjkkkkjjjjkkkkkkk
OBJECTRECOGNITION1.pptxjjjkkkkjjjjkkkkkkkOBJECTRECOGNITION1.pptxjjjkkkkjjjjkkkkkkk
OBJECTRECOGNITION1.pptxjjjkkkkjjjjkkkkkkkshesnasuneer
 
OBJECTRECOGNITION1.pptxjjjkkkkjjjjkkkkkkk
OBJECTRECOGNITION1.pptxjjjkkkkjjjjkkkkkkkOBJECTRECOGNITION1.pptxjjjkkkkjjjjkkkkkkk
OBJECTRECOGNITION1.pptxjjjkkkkjjjjkkkkkkkshesnasuneer
 
SVM - Functional Verification
SVM - Functional VerificationSVM - Functional Verification
SVM - Functional VerificationSai Kiran Kadam
 
Comparision of methods for combination of multiple classifiers that predict b...
Comparision of methods for combination of multiple classifiers that predict b...Comparision of methods for combination of multiple classifiers that predict b...
Comparision of methods for combination of multiple classifiers that predict b...IJERA Editor
 
Text categorization
Text categorizationText categorization
Text categorizationPhuong Nguyen
 
ppt
pptppt
pptbutest
 
IMAGE CLASSIFICATION USING DIFFERENT CLASSICAL APPROACHES
IMAGE CLASSIFICATION USING DIFFERENT CLASSICAL APPROACHESIMAGE CLASSIFICATION USING DIFFERENT CLASSICAL APPROACHES
IMAGE CLASSIFICATION USING DIFFERENT CLASSICAL APPROACHESVikash Kumar
 
CONSTRUCTING A FUZZY NETWORK INTRUSION CLASSIFIER BASED ON DIFFERENTIAL EVOLU...
CONSTRUCTING A FUZZY NETWORK INTRUSION CLASSIFIER BASED ON DIFFERENTIAL EVOLU...CONSTRUCTING A FUZZY NETWORK INTRUSION CLASSIFIER BASED ON DIFFERENTIAL EVOLU...
CONSTRUCTING A FUZZY NETWORK INTRUSION CLASSIFIER BASED ON DIFFERENTIAL EVOLU...IJCNCJournal
 
Lecture 2
Lecture 2Lecture 2
Lecture 2butest
 
Introduction to conventional machine learning techniques
Introduction to conventional machine learning techniquesIntroduction to conventional machine learning techniques
Introduction to conventional machine learning techniquesXavier Rafael Palou
 
Analysis of Influences of memory on Cognitive load Using Neural Network Back ...
Analysis of Influences of memory on Cognitive load Using Neural Network Back ...Analysis of Influences of memory on Cognitive load Using Neural Network Back ...
Analysis of Influences of memory on Cognitive load Using Neural Network Back ...ijdmtaiir
 
Introduction to Machine Learning.
Introduction to Machine Learning.Introduction to Machine Learning.
Introduction to Machine Learning.butest
 
Topic_6
Topic_6Topic_6
Topic_6butest
 
2.7 other classifiers
2.7 other classifiers2.7 other classifiers
2.7 other classifiersKrish_ver2
 
Search Engines
Search EnginesSearch Engines
Search Enginesbutest
 

Similar to Machine Learning and Artificial Neural Networks.ppt (20)

Methodological study of opinion mining and sentiment analysis techniques
Methodological study of opinion mining and sentiment analysis techniquesMethodological study of opinion mining and sentiment analysis techniques
Methodological study of opinion mining and sentiment analysis techniques
 
Classifiers
ClassifiersClassifiers
Classifiers
 
20070702 Text Categorization
20070702 Text Categorization20070702 Text Categorization
20070702 Text Categorization
 
OBJECTRECOGNITION1.pptxjjjkkkkjjjjkkkkkkk
OBJECTRECOGNITION1.pptxjjjkkkkjjjjkkkkkkkOBJECTRECOGNITION1.pptxjjjkkkkjjjjkkkkkkk
OBJECTRECOGNITION1.pptxjjjkkkkjjjjkkkkkkk
 
OBJECTRECOGNITION1.pptxjjjkkkkjjjjkkkkkkk
OBJECTRECOGNITION1.pptxjjjkkkkjjjjkkkkkkkOBJECTRECOGNITION1.pptxjjjkkkkjjjjkkkkkkk
OBJECTRECOGNITION1.pptxjjjkkkkjjjjkkkkkkk
 
SVM - Functional Verification
SVM - Functional VerificationSVM - Functional Verification
SVM - Functional Verification
 
AI: Learning in AI
AI: Learning in AI AI: Learning in AI
AI: Learning in AI
 
AI: Learning in AI
AI: Learning in AI AI: Learning in AI
AI: Learning in AI
 
Comparision of methods for combination of multiple classifiers that predict b...
Comparision of methods for combination of multiple classifiers that predict b...Comparision of methods for combination of multiple classifiers that predict b...
Comparision of methods for combination of multiple classifiers that predict b...
 
Text categorization
Text categorizationText categorization
Text categorization
 
ppt
pptppt
ppt
 
IMAGE CLASSIFICATION USING DIFFERENT CLASSICAL APPROACHES
IMAGE CLASSIFICATION USING DIFFERENT CLASSICAL APPROACHESIMAGE CLASSIFICATION USING DIFFERENT CLASSICAL APPROACHES
IMAGE CLASSIFICATION USING DIFFERENT CLASSICAL APPROACHES
 
CONSTRUCTING A FUZZY NETWORK INTRUSION CLASSIFIER BASED ON DIFFERENTIAL EVOLU...
CONSTRUCTING A FUZZY NETWORK INTRUSION CLASSIFIER BASED ON DIFFERENTIAL EVOLU...CONSTRUCTING A FUZZY NETWORK INTRUSION CLASSIFIER BASED ON DIFFERENTIAL EVOLU...
CONSTRUCTING A FUZZY NETWORK INTRUSION CLASSIFIER BASED ON DIFFERENTIAL EVOLU...
 
Lecture 2
Lecture 2Lecture 2
Lecture 2
 
Introduction to conventional machine learning techniques
Introduction to conventional machine learning techniquesIntroduction to conventional machine learning techniques
Introduction to conventional machine learning techniques
 
Analysis of Influences of memory on Cognitive load Using Neural Network Back ...
Analysis of Influences of memory on Cognitive load Using Neural Network Back ...Analysis of Influences of memory on Cognitive load Using Neural Network Back ...
Analysis of Influences of memory on Cognitive load Using Neural Network Back ...
 
Introduction to Machine Learning.
Introduction to Machine Learning.Introduction to Machine Learning.
Introduction to Machine Learning.
 
Topic_6
Topic_6Topic_6
Topic_6
 
2.7 other classifiers
2.7 other classifiers2.7 other classifiers
2.7 other classifiers
 
Search Engines
Search EnginesSearch Engines
Search Engines
 

More from Anshika865276

Advanced Data Analytics techniques .pptx
Advanced Data Analytics techniques .pptxAdvanced Data Analytics techniques .pptx
Advanced Data Analytics techniques .pptxAnshika865276
 
Advanced Data Analytics with R Programming.ppt
Advanced Data Analytics with R Programming.pptAdvanced Data Analytics with R Programming.ppt
Advanced Data Analytics with R Programming.pptAnshika865276
 
Introduction and Concept of Concurrent Engineering.pptx
Introduction and Concept of Concurrent Engineering.pptxIntroduction and Concept of Concurrent Engineering.pptx
Introduction and Concept of Concurrent Engineering.pptxAnshika865276
 
Innovation Classification and Types and Phases.pptx
Innovation Classification and Types and Phases.pptxInnovation Classification and Types and Phases.pptx
Innovation Classification and Types and Phases.pptxAnshika865276
 
INNOVATIONS MANAGEMENT and Process of Innovation.pptx
INNOVATIONS MANAGEMENT and Process of Innovation.pptxINNOVATIONS MANAGEMENT and Process of Innovation.pptx
INNOVATIONS MANAGEMENT and Process of Innovation.pptxAnshika865276
 
Different Sources of financing Businesses.ppt
Different Sources of financing Businesses.pptDifferent Sources of financing Businesses.ppt
Different Sources of financing Businesses.pptAnshika865276
 
Capital Structure - Concept and Theories.ppt
Capital Structure - Concept and Theories.pptCapital Structure - Concept and Theories.ppt
Capital Structure - Concept and Theories.pptAnshika865276
 
Introduction to Machine Learning and different types of Learning
Introduction to Machine Learning and different types of LearningIntroduction to Machine Learning and different types of Learning
Introduction to Machine Learning and different types of LearningAnshika865276
 
Overview of Business Models.pptx
Overview of Business Models.pptxOverview of Business Models.pptx
Overview of Business Models.pptxAnshika865276
 
Security Issues in E-Commerce.pptx
Security Issues in E-Commerce.pptxSecurity Issues in E-Commerce.pptx
Security Issues in E-Commerce.pptxAnshika865276
 
Impact of E-Commerce.pptx
Impact of E-Commerce.pptxImpact of E-Commerce.pptx
Impact of E-Commerce.pptxAnshika865276
 
Electronic Commerce Technologies.pptx
Electronic Commerce Technologies.pptxElectronic Commerce Technologies.pptx
Electronic Commerce Technologies.pptxAnshika865276
 
2. CONCEPT OF INFORMATION.pptx
2. CONCEPT OF INFORMATION.pptx2. CONCEPT OF INFORMATION.pptx
2. CONCEPT OF INFORMATION.pptxAnshika865276
 
Presentation.pptx
Presentation.pptxPresentation.pptx
Presentation.pptxAnshika865276
 
Types of Products.pdf
Types of Products.pdfTypes of Products.pdf
Types of Products.pdfAnshika865276
 
bussinesscommunicationfinal-181130112044.pdf
bussinesscommunicationfinal-181130112044.pdfbussinesscommunicationfinal-181130112044.pdf
bussinesscommunicationfinal-181130112044.pdfAnshika865276
 
Group Discussion and Interviews.pptx
Group Discussion and Interviews.pptxGroup Discussion and Interviews.pptx
Group Discussion and Interviews.pptxAnshika865276
 
The_Financial_System.pptx
The_Financial_System.pptxThe_Financial_System.pptx
The_Financial_System.pptxAnshika865276
 
Personal Selling.pptx
Personal Selling.pptxPersonal Selling.pptx
Personal Selling.pptxAnshika865276
 
Brand and Branding Strategy.pptx
Brand and Branding Strategy.pptxBrand and Branding Strategy.pptx
Brand and Branding Strategy.pptxAnshika865276
 

More from Anshika865276 (20)

Advanced Data Analytics techniques .pptx
Advanced Data Analytics techniques .pptxAdvanced Data Analytics techniques .pptx
Advanced Data Analytics techniques .pptx
 
Advanced Data Analytics with R Programming.ppt
Advanced Data Analytics with R Programming.pptAdvanced Data Analytics with R Programming.ppt
Advanced Data Analytics with R Programming.ppt
 
Introduction and Concept of Concurrent Engineering.pptx
Introduction and Concept of Concurrent Engineering.pptxIntroduction and Concept of Concurrent Engineering.pptx
Introduction and Concept of Concurrent Engineering.pptx
 
Innovation Classification and Types and Phases.pptx
Innovation Classification and Types and Phases.pptxInnovation Classification and Types and Phases.pptx
Innovation Classification and Types and Phases.pptx
 
INNOVATIONS MANAGEMENT and Process of Innovation.pptx
INNOVATIONS MANAGEMENT and Process of Innovation.pptxINNOVATIONS MANAGEMENT and Process of Innovation.pptx
INNOVATIONS MANAGEMENT and Process of Innovation.pptx
 
Different Sources of financing Businesses.ppt
Different Sources of financing Businesses.pptDifferent Sources of financing Businesses.ppt
Different Sources of financing Businesses.ppt
 
Capital Structure - Concept and Theories.ppt
Capital Structure - Concept and Theories.pptCapital Structure - Concept and Theories.ppt
Capital Structure - Concept and Theories.ppt
 
Introduction to Machine Learning and different types of Learning
Introduction to Machine Learning and different types of LearningIntroduction to Machine Learning and different types of Learning
Introduction to Machine Learning and different types of Learning
 
Overview of Business Models.pptx
Overview of Business Models.pptxOverview of Business Models.pptx
Overview of Business Models.pptx
 
Security Issues in E-Commerce.pptx
Security Issues in E-Commerce.pptxSecurity Issues in E-Commerce.pptx
Security Issues in E-Commerce.pptx
 
Impact of E-Commerce.pptx
Impact of E-Commerce.pptxImpact of E-Commerce.pptx
Impact of E-Commerce.pptx
 
Electronic Commerce Technologies.pptx
Electronic Commerce Technologies.pptxElectronic Commerce Technologies.pptx
Electronic Commerce Technologies.pptx
 
2. CONCEPT OF INFORMATION.pptx
2. CONCEPT OF INFORMATION.pptx2. CONCEPT OF INFORMATION.pptx
2. CONCEPT OF INFORMATION.pptx
 
Presentation.pptx
Presentation.pptxPresentation.pptx
Presentation.pptx
 
Types of Products.pdf
Types of Products.pdfTypes of Products.pdf
Types of Products.pdf
 
bussinesscommunicationfinal-181130112044.pdf
bussinesscommunicationfinal-181130112044.pdfbussinesscommunicationfinal-181130112044.pdf
bussinesscommunicationfinal-181130112044.pdf
 
Group Discussion and Interviews.pptx
Group Discussion and Interviews.pptxGroup Discussion and Interviews.pptx
Group Discussion and Interviews.pptx
 
The_Financial_System.pptx
The_Financial_System.pptxThe_Financial_System.pptx
The_Financial_System.pptx
 
Personal Selling.pptx
Personal Selling.pptxPersonal Selling.pptx
Personal Selling.pptx
 
Brand and Branding Strategy.pptx
Brand and Branding Strategy.pptxBrand and Branding Strategy.pptx
Brand and Branding Strategy.pptx
 

Recently uploaded

NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...Boston Institute of Analytics
 
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptxAmazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptxAbdelrhman abooda
 
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptSonatrach
 
9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home ServiceSapana Sha
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFAAndrei Kaleshka
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...soniya singh
 
Call Girls in Defence Colony Delhi đŸ’¯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi đŸ’¯Call Us 🔝8264348440🔝Call Girls in Defence Colony Delhi đŸ’¯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi đŸ’¯Call Us 🔝8264348440🔝soniya singh
 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptxthyngster
 
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...Florian Roscheck
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfBoston Institute of Analytics
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxStephen266013
 
Call Us âžĨ97111√47426đŸ¤ŗCall Girls in Aerocity (Delhi NCR)
Call Us âžĨ97111√47426đŸ¤ŗCall Girls in Aerocity (Delhi NCR)Call Us âžĨ97111√47426đŸ¤ŗCall Girls in Aerocity (Delhi NCR)
Call Us âžĨ97111√47426đŸ¤ŗCall Girls in Aerocity (Delhi NCR)jennyeacort
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhijennyeacort
 
RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.natarajan8993
 
原į‰ˆ1:1厚åˆļ南十字星大å­Ļæ¯•ä¸šč¯īŧˆSCUæ¯•ä¸šč¯īŧ‰#文凭成įģŠå•#įœŸåŽžį•™äŋĄå­ĻåŽ†čŽ¤č¯æ°¸äš…å­˜æĄŖ
原į‰ˆ1:1厚åˆļ南十字星大å­Ļæ¯•ä¸šč¯īŧˆSCUæ¯•ä¸šč¯īŧ‰#文凭成įģŠå•#įœŸåŽžį•™äŋĄå­ĻåŽ†čŽ¤č¯æ°¸äš…å­˜æĄŖ原į‰ˆ1:1厚åˆļ南十字星大å­Ļæ¯•ä¸šč¯īŧˆSCUæ¯•ä¸šč¯īŧ‰#文凭成įģŠå•#įœŸåŽžį•™äŋĄå­ĻåŽ†čŽ¤č¯æ°¸äš…å­˜æĄŖ
原į‰ˆ1:1厚åˆļ南十字星大å­Ļæ¯•ä¸šč¯īŧˆSCUæ¯•ä¸šč¯īŧ‰#文凭成įģŠå•#įœŸåŽžį•™äŋĄå­ĻåŽ†čŽ¤č¯æ°¸äš…å­˜æĄŖ208367051
 
į§‘įŊ—拉多大å­Ļæŗĸå°”åž—åˆ†æ Ąæ¯•ä¸šč¯å­ĻäŊč¯æˆįģŠå•-可办į†
į§‘įŊ—拉多大å­Ļæŗĸå°”åž—åˆ†æ Ąæ¯•ä¸šč¯å­ĻäŊč¯æˆįģŠå•-可办į†į§‘įŊ—拉多大å­Ļæŗĸå°”åž—åˆ†æ Ąæ¯•ä¸šč¯å­ĻäŊč¯æˆįģŠå•-可办į†
į§‘įŊ—拉多大å­Ļæŗĸå°”åž—åˆ†æ Ąæ¯•ä¸šč¯å­ĻäŊč¯æˆįģŠå•-可办į†e4aez8ss
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdfHuman37
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfSocial Samosa
 

Recently uploaded (20)

NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
 
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptxAmazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
 
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
 
9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFA
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
 
Call Girls in Defence Colony Delhi đŸ’¯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi đŸ’¯Call Us 🔝8264348440🔝Call Girls in Defence Colony Delhi đŸ’¯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi đŸ’¯Call Us 🔝8264348440🔝
 
E-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptxE-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptx
 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
 
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docx
 
Call Us âžĨ97111√47426đŸ¤ŗCall Girls in Aerocity (Delhi NCR)
Call Us âžĨ97111√47426đŸ¤ŗCall Girls in Aerocity (Delhi NCR)Call Us âžĨ97111√47426đŸ¤ŗCall Girls in Aerocity (Delhi NCR)
Call Us âžĨ97111√47426đŸ¤ŗCall Girls in Aerocity (Delhi NCR)
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
 
RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.
 
原į‰ˆ1:1厚åˆļ南十字星大å­Ļæ¯•ä¸šč¯īŧˆSCUæ¯•ä¸šč¯īŧ‰#文凭成įģŠå•#įœŸåŽžį•™äŋĄå­ĻåŽ†čŽ¤č¯æ°¸äš…å­˜æĄŖ
原į‰ˆ1:1厚åˆļ南十字星大å­Ļæ¯•ä¸šč¯īŧˆSCUæ¯•ä¸šč¯īŧ‰#文凭成įģŠå•#įœŸåŽžį•™äŋĄå­ĻåŽ†čŽ¤č¯æ°¸äš…å­˜æĄŖ原į‰ˆ1:1厚åˆļ南十字星大å­Ļæ¯•ä¸šč¯īŧˆSCUæ¯•ä¸šč¯īŧ‰#文凭成įģŠå•#įœŸåŽžį•™äŋĄå­ĻåŽ†čŽ¤č¯æ°¸äš…å­˜æĄŖ
原į‰ˆ1:1厚åˆļ南十字星大å­Ļæ¯•ä¸šč¯īŧˆSCUæ¯•ä¸šč¯īŧ‰#文凭成įģŠå•#įœŸåŽžį•™äŋĄå­ĻåŽ†čŽ¤č¯æ°¸äš…å­˜æĄŖ
 
į§‘įŊ—拉多大å­Ļæŗĸå°”åž—åˆ†æ Ąæ¯•ä¸šč¯å­ĻäŊč¯æˆįģŠå•-可办į†
į§‘įŊ—拉多大å­Ļæŗĸå°”åž—åˆ†æ Ąæ¯•ä¸šč¯å­ĻäŊč¯æˆįģŠå•-可办į†į§‘įŊ—拉多大å­Ļæŗĸå°”åž—åˆ†æ Ąæ¯•ä¸šč¯å­ĻäŊč¯æˆįģŠå•-可办į†
į§‘įŊ—拉多大å­Ļæŗĸå°”åž—åˆ†æ Ąæ¯•ä¸šč¯å­ĻäŊč¯æˆįģŠå•-可办į†
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
 
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
 

Machine Learning and Artificial Neural Networks.ppt

  • 2. Definitions īēMachine learning investigates the mechanisms by which knowledge is acquired through experience īēMachine Learning is the field that concentrates on induction algorithms and on other algorithms that can be said to ``learn.''
  • 3. Model īēA model of learning is fundamental in any machine learning application: ī‚ˇ who is learning (a computer program) ī‚ˇ what is learned (a domain) ī‚ˇ from what the learner is learning (the information source)
  • 4. A domain īšConcept learning is one of the most studied domain: the learner will try to come up with a rule useful to separate positive examples from negative examples.
  • 5. The information source ī‚ˇ examples: the learner is given positive and negative examples ī‚ˇ queries: the learner gets information about the domain by asking questions ī‚ˇ experimentation: the learner may get information by actively experiment with the domain
  • 6. Other component of the model are īē the prior knowledge īšof the learner about the domain. For example the learner may know that the unknown concept can be represented in a certain way īē the performance criteria īšthat defines how we know that the learner has learned something and how it can demonstrate it. Performance criteria can include: ī‚ˇ off line or on line measures ī‚ˇ descriptive or predictive output ī‚ˇ accuracy ī‚ˇ efficiency
  • 7. What techniques we will see īēkNN algorithm īēWinnow algorithm īēNaïve Bayes classifier īēDecision trees īēReinforcement learning (Rocchio algorithm) īēGenetic algorithm
  • 8. k-NN algorithm īēThe definition of k-nearest neighbors is trivial: īšSuppose that each esperience can be represented as a point in an space For a particular point in question, find the k points in the population that are nearest to the point in question. The class of the majority of the of these neighbors is the class to the selected point.
  • 9. k-NN algorithm c2 c c1 c4 c3 c4 c1 c2 c2 c3 c4 1 New input Inputs already classified Class 1
  • 10. k-NN algorithm īēFinding the k-nearest neighbors reliably and efficiently can be difficult. Other metrics that the Euclidean can be used. īēThe implicit assumption in using any k- nearest neighbors technique is that items with similar attributes tend to cluster together.
  • 11. k-NN algorithm īēThe k-nearest neighbors method is most frequently used to tentatively classify points when firm class bounds are not established. īēThe learning is done using only positive examples not negative.
  • 12. k-NN algorithm īēUsed in īš Schwab, I., Pohl, W., and Koychev, I. (2000) Learning to recommend from positive evidence. In: H. Lieberman (ed.) Proceedings of 2000 International Conference on Intelligent User Interfaces, New Orleans, LA, January 9-12, 2000, ACM Press, pp. 241-247
  • 13. Winnow Algorithm īēIs useful to distinguish binary patterns into two classes using a threshold S and a set of weights īēthe pattern x holds to the class y=1 if j w s x w j j j ī€ž īƒĨ (1)
  • 14. Winnow Algorithm īēThe algorithm: īštake an example (x, y) īšgenerate the answer of the classifier īšif the answer is correct do nothing īšelse apply some correction īƒĨ ī€Ŋ j j j x w y'
  • 15. Winnow Algorithm īēIf y’>y the the weights are too high and are diminished īēIf y’<y the the weights are too low and are corrected in both cases are corrected only the ones corresponding to 1 ī€Ŋ j x
  • 16. Winnow Algorithm application īēUsed in īš M.J. Pazzani “ A framework for Collaborative, Content Based and Demographic Filtering” Artificial Intelligence Review, Dec 1999 īš R.Armstrong, D. Freitag, T. Joachims, and T. Mitchell " WebWatcher: A Learning Apprentice for the World Wide Web " 1995.
  • 17. Naïve Bayes Classifier īēBayes theorem : given an Hypotesis H, an Evidence E and a context c ) | ( ) | ( ) , | ( ) , | ( c E P c H P c H E P c E H P īƒ— ī€Ŋ
  • 18. Naïve Bayes Classifier īēSuppose to have a set of objects that can hold to two categories, y1 and y2, described using n features x1, x2, â€Ļ, xn. īēIf īēthen the object holds to the category y1 1 ) | ( ) | ( 2 1 ī€ž x x y P y P We drop the context
  • 20. Naïve Bayes Classifier īēUsed in: īš Mladenic, D. (2001) Using text learning to help Web browsing. In: M. Smith, G. Salvendy, D. Harris and R. J. Koubek (eds.) Usability evaluation and interface design. Vol. 1, (Proceedings of 9th International Conference on Human- Computer Interaction, HCI International'2001, New Orleans, LA, August 8-10, 2001) Mahwah, NJ: Lawrence Erlbaum Associates, pp. 893-897. īš Schwab, I., Pohl, W., and Koychev, I. (2000) Learning to recommend from positive evidence. In: H. Lieberman (ed.) Proceedings of 2000 International Conference on Intelligent User Interfaces, New Orleans, LA, January 9-12, 2000, ACM Press, pp. 241-247, also available at .Self, J. (1986) The application of machine learning to student modelling. Instr. Science, Instructional Science 14, 327-338.
  • 21. Naïve Bayes Classifier īš Bueno D., David A. A. (2001) METIORE: A Personalized Information Retrieval System. In M. Bauer, P. J. Gmytrasiewicz and J. Vassileva (eds.) User Modeling 2001. Lecture Notes on Artificial Intelligence, Vol. 2109, (Proceedings of 8th International Conference on User Modeling, UM 2001, Sonthofen, Germany, July 13-17, 2001) Berlin: Springer-Verlag, pp. 188-198. īš Frasconi P., Soda G., Vullo A., Text Categorization for Multi-page Documents: A HybridNaive Bayes HMM Approach, ACM JCDL’01, June 24-28, 2001
  • 22. Decision trees īēA decision tree is a tree whose internal nodes are tests (on input patterns) and whose leaf nodes are categories (of patterns). īēEach test has mutually exclusive and exhaustive outcomes.
  • 23. Decision trees T1 T3 T2 T4 1 2 1 3 2 1 3 classes 4 tests (maybe 4 variables)
  • 24. Decision trees īēThe test: īšmight be multivariate (tests on several features of the input) or univariate (test only one feature); īšmight have two or more outcomes. īēThe features can be categorical or numerical.
  • 25. Decision trees īēSuppose to have n binary features īēThe main problem in learning decision trees is to decide the order of tests on variables īēIn order to decide, the average entropy of each test attribute is calculated and the lower one is chosen.
  • 26. Decision trees īēIf we have binary patterns and a set of pattern ī‘ it is possible to write the entropy as were p(i|ī‘) is the probability that a random pattern from ī‘ belongs to the class i ) | ( log ) | ( ) ( 2 ī‘ ī‘ ī€­ ī€Ŋ ī‘ īƒĨ i p i p H i
  • 27. Decision trees īēWe will approximate the probability p(i|ī‘) using the number of patterns in ī‘ belonging to the class i divided by the total number of pattern in ī‘
  • 28. Decision trees If a test T have k outcomes, k subsets ī‘1, ī‘2, ...ī‘k, are considered with n1, n2, â€Ļ, nk patterns. It is possible to calculate: T 1 ... ... J K ) | ( log ) | ( ) ( 2 j j i j i p i p H ī‘ ī‘ ī€­ ī€Ŋ ī‘ īƒĨ
  • 29. Decision trees īēThe average entropy over all the ī‘j again we evaluate p(ī‘j ) has the number of patterns in ī‘ that outcomes j divided by the total number of patterns in ī‘ īģ īŊ ) ( ) ( ) ( j j j j T H p H E ī‘ ī‘ ī€­ ī€Ŋ ī‘ īƒĨ
  • 30. Decision trees īēWe calculate the average entropy for all the test T and chose the lower one. īēWe write the part of the tree and go head in order to chose again the test that gives the lower entropy
  • 31. Decision trees īēThe knowledge in the tree is strongly dependent from the examples
  • 32. Reinforcement Learning īē An agent tries to optimize its interaction with a dynamic environment using trial and error. īēThe agent can make an action u that applied to the environment changes its state from x to x’. The agent receives a reinforcement r.
  • 33. Reinforcement Learning īēThere are three parts of a Reinforcement Learning Problem: īšThe environment īšThe reinforcement function īšThe value function
  • 34. Reinforcement Learning īēThe environment at least partially observable by means of sensors or symbolic description. The theory is based on an environment that shows its “true” state.
  • 35. Reinforcement Learning īēThe reinforcement function a mapping from the couple (state, action) to the reinforcement value. There are three classes of reinforcement functions: ī¸Pure delayed reward: the reinforcements are all zero except for the terminal state (games, inverted pendulum) ī¸Minimum time to goal: cause an agent to perform actions that generate the shortest path to a goal state
  • 36. Reinforcement Learning – Minimization: the reinforcement is a function of limited resources, and the agent has to achieve the goal while minimizing the energy used.
  • 37. Reinforcement Learning • The value function: defines how to choose a "good" action. First we have to define: – a policy: a mapping from states to actions; – the value of a state (following a defined policy): $\sum_i^{T} r_i$, where T is the final state. The optimal policy maximizes the value of a state.
  • 38. Reinforcement Learning • The value function is a mapping state → state value. If the optimal value function is found, the optimal policy can be extracted.
  • 39. Reinforcement Learning • Given a state $x_t$, $V^*(x_t)$ is the optimal state value and $V(x_t)$ is the approximation we have: $V^*(x_t) = V(x_t) + e(x_t)$, where $e(x_t)$ is the approximation error.
  • 40. Reinforcement Learning • Moreover $V^*(x_t) = r(x_t) + \gamma V^*(x_{t+1})$ and $V(x_t) = r(x_t) + \gamma V(x_{t+1})$, where γ is a discount factor that causes immediate reinforcement to have more importance than future reinforcements.
  • 41. Reinforcement Learning • Substituting, we find $e(x_t) + V(x_t) = r(x_t) + \gamma\big[e(x_{t+1}) + V(x_{t+1})\big]$, which gives (**) $e(x_t) = \gamma\, e(x_{t+1})$.
  • 42. Reinforcement Learning • The goal of the learning process is to find an approximation $V(x_t)$ that makes equation (**) true for all the states. The final state T of a process has a value that is defined a priori, so e(T) = 0; then e(T−1) = 0 if (**) holds, and so on backwards to the initial state.
  • 43. Reinforcement Learning • Assuming that the function approximator for V* is a look-up table (a table with an approximate state value w for each state), it is possible to sweep through the state space and update the values in the table according to $\Delta w = \max_u\big(r(x_t, u) + \gamma V(x_{t+1})\big) - V(x_t)$
  • 44. Reinforcement Learning • where u is the action performed that causes the transition to the state $x_{t+1}$. This must be done using some kind of simulation in order to evaluate $\max_u\big(V(x_{t+1})\big)$.
  • 45. Reinforcement Learning • The last equation can be rewritten as $\gamma\, e(x_{t+1}) = \max_u\big(r(x_t, u) + \gamma V(x_{t+1})\big) - V(x_t)$. Each update reduces the value of $e(x_{t+1})$; the learning stops when $e(x_{t+1}) = 0$.
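As a rough sketch of this table-sweeping scheme (the `states`, `actions`, `r` and `next_state` interfaces are hypothetical stand-ins for the simulated environment, assumed deterministic here):

```python
def value_sweep(V, states, actions, r, next_state, gamma=0.9):
    """One sweep of the look-up-table update
        Delta_w = max_u [ r(x, u) + gamma * V(x') ] - V(x).
    V is a dict mapping each state to its approximate value w."""
    biggest = 0.0
    for x in states:
        best = max(r(x, u) + gamma * V[next_state(x, u)] for u in actions(x))
        delta = best - V[x]            # this is the approximation error e(x)
        V[x] += delta
        biggest = max(biggest, abs(delta))
    return biggest                     # repeat sweeps until this is (close to) zero
```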
  • 46. Rocchio Algorithm • Used in Relevance Feedback in IR. • We represent the user profile and the objects (documents) in the same space: m represents the user, w represents the objects (documents).
  • 47. Rocchio Algorithm • The object (document) is matched to the user using an available matching criterion (the cosine measure). • The user model is updated as $m_{new} = m + s \cdot w$, where $s = s(u, w, m)$ is a function of the feedback.
  • 48. Rocchio Algorithm • It is possible to use a collection of vectors m to represent the user's interests.
  • 49. Rocchio and Reinforcement Learning • The goal is to have the "best" user profile. • The state is defined by the weight vector of the user profile.
  • 50. Rocchio Algorithm (IR) $Q' = Q + \frac{\alpha}{n_1}\sum_{i=1}^{n_1} R_i - \frac{\beta}{n_2}\sum_{i=1}^{n_2} S_i$ where Q is the vector of the initial query, $R_i$ is the vector of a relevant document, $S_i$ is the vector of an irrelevant document, and α, β are Rocchio's weights.
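A small NumPy sketch of this query update (the default values of `alpha` and `beta` are placeholders, not prescribed by the slides):

```python
import numpy as np

def rocchio(Q, R, S, alpha=0.75, beta=0.25):
    """Q' = Q + (alpha / n1) * sum_i R_i - (beta / n2) * sum_i S_i.
    Q is the initial query vector; the rows of R and S are the vectors of
    the relevant and irrelevant documents respectively."""
    Q = np.asarray(Q, dtype=float)
    if len(R):
        Q = Q + alpha * np.mean(R, axis=0)   # (alpha / n1) * sum R_i
    if len(S):
        Q = Q - beta * np.mean(S, axis=0)    # (beta / n2) * sum S_i
    return Q
```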
  • 51. Rocchio algorithm • Used in – Seo, Y.-W. and Zhang, B.-T. (2000) A reinforcement learning agent for personalized information filtering. In: H. Lieberman (ed.) Proceedings of 2000 International Conference on Intelligent User Interfaces, New Orleans, LA, January 9-12, 2000, ACM Press, pp. 248-251 – Balabanovic M. "An Adaptive Web Page Recommendation Service". In: Proceedings of the 1st International Conference on Autonomous Agents, 1997
  • 52. Genetic Algorithms • Genetic algorithms are inspired by natural evolution. In the natural world, organisms that are poorly suited for an environment die off, while those well-suited for it prosper. • Each individual is a bit-string that encodes its characteristics. Each element of the string is called a gene.
  • 53. Genetic Algorithms • Genetic algorithms search the space of individuals for good candidates. • The "goodness" of an individual is measured by some fitness function. Search takes place in parallel, with many individuals in each generation.
  • 54. Genetic Algorithms • The algorithm consists of looping through generations. In each generation, a subset of the population is selected to reproduce; usually this is a random selection in which the probability of choice is proportional to fitness.
  • 55. Genetic Algorithms • Reproduction occurs by randomly pairing all of the individuals in the selection pool, and then generating two new individuals by performing crossover, in which the initial n bits (where n is random) of the parents are exchanged. There is a small chance that one of the genes in the resulting individuals will mutate to a new value.
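A minimal Python sketch of the loop described on the last three slides (function and parameter names are illustrative; an even population size and a non-negative fitness function are assumed):

```python
import random

def evolve(population, fitness, generations=100, p_mutation=0.01):
    """Generational GA over bit-string individuals (lists of 0/1)."""
    for _ in range(generations):
        scores = [fitness(ind) for ind in population]
        # roulette-wheel selection: probability of choice proportional to fitness
        pool = random.choices(population, weights=scores, k=len(population))
        next_gen = []
        for a, b in zip(pool[::2], pool[1::2]):        # random pairing
            n = random.randrange(1, len(a))            # random crossover point
            for child in (a[:n] + b[n:], b[:n] + a[n:]):
                if random.random() < p_mutation:       # small chance of mutation
                    i = random.randrange(len(child))
                    child[i] ^= 1                      # flip one gene
                next_gen.append(child)
        population = next_gen
    return max(population, key=fitness)
```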
  • 56. Neural Networks • An artificial neural network consists of a pool of simple processing units which communicate by sending signals to each other over a large number of weighted connections.
  • 57. Artificial Neuron [Figure: inputs $x_1 \dots x_n$, with weights $w_{1j} \dots w_{nj}$, feeding unit j with bias $b_j$] The unit computes $s_j = \sum_{i=0}^{n} w_{ij}\, x_i + b_j$ and outputs $y_j = f(s_j) = \frac{1}{1 + e^{-s_j}}$.
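In code, the whole unit is just a weighted sum followed by the logistic function (a direct transcription of the formulas above, with illustrative names):

```python
import math

def neuron(x, w, b):
    """s_j = sum_i w_ij * x_i + b_j ; y_j = f(s_j) = 1 / (1 + e^-s_j)."""
    s = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-s))
```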
  • 58. Neural Networks • Each unit performs a relatively simple job: receive input from neighbors or external sources and use this to compute an output signal which is propagated to other units (test stage). • Apart from this processing, there is the task of the adjustment of the weights (learning stage). • The system is inherently parallel in the sense that many units can carry out their computations at the same time.
  • 59. Neural Networks 1. Learning stage 2. Test stage (working stage) Your knowledge is useless !!
  • 60. Classification (connections) As for this pattern of connections, the main distinction we can make is between: • Feed-forward networks, where the data flow from input to output units is strictly feed-forward. The data processing can extend over multiple layers of units, but no feedback connections or connections between units of the same layer are present.
  • 61. Classification (connections) • Recurrent networks that do contain feedback connections. Contrary to feed-forward networks, the dynamical properties of the network are important. In some cases, the activation values of the units undergo a relaxation process such that the network will evolve to a stable state in which these activations do not change anymore.
  • 62. Recurrent Networks • In other applications, the change of the activation values of the output neurons is significant, such that the dynamical behavior constitutes the output of the network.
  • 63. Classification (Learning) We can categorise the learning situations in two distinct sorts. These are: • Supervised learning, in which the network is trained by providing it with input and matching output patterns. These input-output pairs are usually provided by an external teacher.
  • 64. Classification (Learning) • Unsupervised learning, in which an (output) unit is trained to respond to clusters of patterns within the input. In this paradigm the system is supposed to discover statistically salient features of the input population. Unlike the supervised learning paradigm, there is no a priori set of categories into which the patterns are to be classified; rather, the system must develop its own representation of the input stimuli.
  • 65. Perceptron • A single-layer feed-forward network consists of one or more output neurons, each of which is connected with a weighting factor $w_{ij}$ to all of the inputs $x_i$. [Figure: single-layer network with inputs $x_i$ and bias b]
  • 66. Perceptron • In the simplest case the network has only two inputs and a single output. The output of the neuron is $y = f\big(\sum_{i=1}^{2} w_i x_i + b\big)$. • Suppose that the activation function is a threshold: $f(s) = 1$ if $s > 0$, $f(s) = -1$ if $s \le 0$.
  • 67. Perceptron • In this example the simple network (the neuron) can be used to separate the inputs into two classes. • The separation between the two classes is given by $w_1 x_1 + w_2 x_2 + b = 0$.
  • 69. Learning in Perceptrons • The weights of the neural network are modified during the learning phase: $w_{ij}(t+1) = w_{ij}(t) + \Delta w_{ij}$ and $b_j(t+1) = b_j(t) + \Delta b_j$.
  • 70. Learning in Perceptrons • Start with random weights. • Select an input couple (x, d(x)). • If $y \ne d(x)$, modify the weights according to $\Delta w_{ij} = d(x)\, x_i$. Note that the weights are not modified if the network gives the correct answer.
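A minimal Python sketch of this training loop, with targets d(x) in {−1, +1} as in the threshold unit above (names and the epoch limit are illustrative):

```python
def train_perceptron(samples, n_inputs, epochs=100):
    """Perceptron learning rule: `samples` is a list of pairs (x, d)
    with the desired output d in {-1, +1}."""
    w, b = [0.0] * n_inputs, 0.0
    for _ in range(epochs):
        for x, d in samples:
            s = sum(wi * xi for wi, xi in zip(w, x)) + b
            y = 1 if s > 0 else -1
            if y != d:                                     # wrong answer only
                w = [wi + d * xi for wi, xi in zip(w, x)]  # Delta_w_i = d * x_i
                b += d
    return w, b
```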
  • 71. Convergence theorem īēIf there exists a set of connection weights w* which is able to perform the transformation y = d(x), the perceptron learning rule will converge to some solution (which may or may not be the same as w* ) in a finite number of steps for any initial choice of the weights.
  • 72. Linear Units [Figure: linear unit j with inputs $x_1 \dots x_n$ and weights $w_{1j} \dots w_{nj}$] The unit computes $s_j = \sum_{i=0}^{n} w_{ij}\, x_i + b_j$ with linear activation $y_j = s_j$.
  • 73. The Delta Rule 1 • The idea is to make the change of the weight proportional to the negative derivative of the error: $\Delta w_{ij} = -\gamma \frac{\partial E}{\partial w_{ij}} = -\gamma \frac{\partial E}{\partial y_i} \frac{\partial y_i}{\partial w_{ij}}$
  • 74. The Delta Rule 2 $\frac{\partial y_i}{\partial w_{ij}} = x_j$ and $\frac{\partial E}{\partial y_i} = -(d_i - y_i) = -\delta_i$, hence $\Delta w_{ij} = \gamma\, \delta_i\, x_j$ (1)
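For a single linear unit, equation (1) amounts to the following one-step update (a sketch; `gamma` stands for the learning rate γ):

```python
def delta_rule_step(w, x, d, gamma=0.1):
    """One update for a linear unit (y = s): delta = d - y and, from
    equation (1), Delta_w_j = gamma * delta * x_j."""
    y = sum(wj * xj for wj, xj in zip(w, x))
    delta = d - y
    return [wj + gamma * delta * xj for wj, xj in zip(w, x)]
```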
  • 75. Backpropagation • Multi-layer networks with a linear activation can classify only linearly separable inputs or, in the case of function approximation, can represent only linear functions.
  • 76. Backpropagation [Figure: two-layer network with inputs $x_1 \dots x_n$, hidden units $h_j$ (weights $v_{jk}$), and outputs $y_i$ (weights $w_{ij}$)]
  • 77. Backpropagation • When a learning pattern is clamped, the activation values are propagated to the output units, and the actual network output is compared with the desired output values; we usually end up with an error in each of the output units. Let's call this error $e_o$ for a particular output unit o. We have to bring $e_o$ to zero.
  • 78. Backpropagation • The simplest method to do this is the greedy method: we strive to change the connections in the neural network in such a way that, next time around, the error $e_o$ will be zero for this particular pattern. We know from the delta rule that, in order to reduce an error, we have to adapt its incoming weights according to equation (1).
  • 79. Backpropagation • In order to adapt the weights from input to hidden units, we again want to apply the delta rule. In this case, however, we do not have a value of δ for the hidden units.
  • 80. Backpropagation • Calculate the activation of the hidden units: $h_j = f\big(\sum_{k=0}^{n} v_{jk}\, x_k\big)$
  • 81. Backpropagation • And the activation of the output units: $y_i = f\big(\sum_j w_{ij}\, h_j\big)$
  • 82. Backpropagation • If we have μ patterns to learn, the error is $E = \frac{1}{2}\sum_\mu \sum_i \big(t_i^\mu - y_i^\mu\big)^2 = \frac{1}{2}\sum_\mu \sum_i \Big[t_i^\mu - f\Big(\sum_j w_{ij}\, f\big(\sum_{k=0}^{n} v_{jk}\, x_k^\mu\big)\Big)\Big]^2$
  • 83. Backpropagation $\Delta w_{ij} = -\eta \frac{\partial E}{\partial w_{ij}} = \eta \sum_\mu \big(t_i^\mu - y_i^\mu\big)\, f'(A_i^\mu)\, h_j^\mu = \eta \sum_\mu \delta_i^\mu\, h_j^\mu$, where $\delta_i^\mu = \big(t_i^\mu - y_i^\mu\big)\, f'(A_i^\mu)$
  • 84. Backpropagation $\Delta v_{jk} = -\eta \frac{\partial E}{\partial v_{jk}} = -\eta \sum_\mu \frac{\partial E}{\partial h_j^\mu} \frac{\partial h_j^\mu}{\partial v_{jk}} = \eta \sum_\mu \Big[\sum_i \big(t_i^\mu - y_i^\mu\big)\, f'(A_i^\mu)\, w_{ij}\Big] f'(A_j^\mu)\, x_k^\mu = \eta \sum_\mu \delta_j^\mu\, x_k^\mu$
  • 85. Backpropagation • The weight correction is given by $\Delta w_{mn} = \eta \sum_\mu \delta_m^\mu\, x_n^\mu$, where $\delta_m^\mu = \big(t_m^\mu - y_m^\mu\big)\, f'(A_m^\mu)$ if m is the output layer, or $\delta_m^\mu = f'(A_m^\mu) \sum_s \delta_s^\mu\, w_{sm}$ if m is a hidden layer.
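Putting the forward pass and the two weight corrections together, a compact NumPy sketch of one online training epoch (a hypothetical implementation; biases are omitted for brevity and can be folded in as a constant input $x_0 = 1$; for the logistic activation, $f'(s) = f(s)(1 - f(s))$):

```python
import numpy as np

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

def backprop_epoch(X, T, V, W, eta=0.5):
    """One epoch of online backpropagation for a single-hidden-layer net.
    V: hidden weights (n_hidden x n_in); W: output weights (n_out x n_hidden)."""
    for x, t in zip(X, T):
        h = sigmoid(V @ x)                            # h_j = f(sum_k v_jk x_k)
        y = sigmoid(W @ h)                            # y_i = f(sum_j w_ij h_j)
        delta_out = (t - y) * y * (1 - y)             # delta_i = (t_i - y_i) f'(A_i)
        delta_hid = h * (1 - h) * (W.T @ delta_out)   # delta_j = f'(A_j) sum_i delta_i w_ij
        W += eta * np.outer(delta_out, h)             # Delta_w_ij = eta delta_i h_j
        V += eta * np.outer(delta_hid, x)             # Delta_v_jk = eta delta_j x_k
    return V, W
```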
  • 86.–87. Backpropagation [Figures: the same two-layer network with inputs $x_k$, hidden units $h_j$ (weights $v_{jk}$), and outputs $y_i$ (weights $w_{ij}$), repeated to illustrate the weight updates]
  • 88. Recurrent Networks • What happens when we introduce a cycle? For instance, we can connect a hidden unit with itself over a weighted connection, connect hidden units to input units, or even connect all units with each other.
  • 89. Hopfield Network • The Hopfield network consists of a set of N interconnected neurons which update their activation values asynchronously and independently of the other neurons. • All neurons are both input and output neurons. The activation values are binary (+1, −1).
  • 91. Hopfield Network • The state of the system is given by the activation values $y = (y_k)$. • The net input $s_k(t+1)$ of a neuron k at cycle t+1 is a weighted sum: $s_k(t+1) = \sum_{j \ne k} y_j(t)\, w_{jk} + b_k$
  • 92. Hopfield Network • A threshold function is applied to obtain the output: $y_k(t+1) = \operatorname{sgn}\big(s_k(t+1)\big)$
  • 93. Hopfield Network • A neuron k in the net is stable at time t if $y_k(t) = \operatorname{sgn}\big(s_k(t-1)\big)$. • A state is stable if all the neurons are stable.
  • 94. Hopfield Networks • If $w_{jk} = w_{kj}$ the behavior of the system can be described with an energy function: $\varepsilon = -\frac{1}{2}\sum_{j \ne k} y_j\, y_k\, w_{jk} - \sum_k b_k\, y_k$ • This kind of network has stable limit points.
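A compact NumPy sketch of the associative-memory use described next (Hebbian storage is a standard choice not spelled out on these slides; biases are omitted and sgn(0) is taken as +1):

```python
import numpy as np

def store(patterns):
    """Hebbian storage of +/-1 patterns (one per row): w_jk proportional
    to the correlation of units j and k, with no self-connections."""
    n = patterns.shape[1]
    W = patterns.T @ patterns / n
    np.fill_diagonal(W, 0.0)
    return W

def recall(W, y, max_cycles=100):
    """Asynchronous updates y_k = sgn(sum_{j != k} w_jk y_j) in random
    order, repeated until every neuron is stable."""
    y = y.copy()
    for _ in range(max_cycles):
        changed = False
        for k in np.random.permutation(len(y)):
            new = 1 if W[k] @ y >= 0 else -1
            if new != y[k]:
                y[k], changed = new, True
        if not changed:        # a stable state: a 'dip' in the energy
            break
    return y
```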
  • 95. Hopfield net. applications • A primary application of the Hopfield network is an associative memory. • The states of the system corresponding with the patterns which are to be stored in the network are stable. • These states can be seen as `dips' in energy space.
  • 96. Hopfield Networks • It appears, however, that the network gets saturated very quickly, and that about 0.15N memories can be stored before recall errors become severe.
  • 98. Hopfield Networks • Used in – Chung, Y.-M., Pottenger, W. M., and Schatz, B. R. (1998) Automatic subject indexing using an associative neural network. In: I. Witten, R. Akscyn and F. M. Shipman III (eds.) Proceedings of The Third ACM Conference on Digital Libraries (Digital Libraries '98), Pittsburgh, USA, June 23-26, 1998, ACM Press, pp. 59-6
  • 99. Self Organization • The unsupervised weight-adapting algorithms are usually based on some form of global competition between the neurons. • Applications of self-organizing networks are:
  • 100. S.O. Applications • clustering: the input data may be grouped in `clusters' and the data processing system has to find these inherent clusters in the input data.
  • 101. S.O. Applications • vector quantisation: this problem occurs when a continuous space has to be discretised. The input of the system is the n-dimensional vector x; the output is a discrete representation of the input space. The system has to find an optimal discretisation of the input space.
  • 102. S.O. Applications • dimensionality reduction: the input data are grouped in a subspace which has lower dimensionality than the dimensionality of the data. The system has to learn an "optimal" mapping.
  • 103. S.O. Applications • feature extraction: the system has to extract features from the input signal. This often means a dimensionality reduction as described above.
  • 104. Self-Organizing Networks • Learning Vector Quantization • Kohonen maps • Principal Components Networks • Adaptive Resonance Theory
  • 105. Kohonen Maps • In the Kohonen network, the output units are ordered in some fashion, often in a two-dimensional grid or array, although this is application-dependent.
  • 107. Kohonen Maps The input x is given to all the units at the same time
  • 108. Kohonen Maps The weights of the winner unit are updated together with the weights of its neighbors.
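A rough NumPy sketch of one such update (the 2-D grid layout and the exponentially decaying learning rate and neighbourhood width are common choices, not specifics from these slides):

```python
import numpy as np

def som_step(weights, grid, x, t, lr0=0.5, sigma0=2.0, tau=1000.0):
    """One Kohonen update: x is presented to all units at once, the winner
    (closest weight vector) is found, and the winner and its grid
    neighbours are pulled towards x. `grid` holds each unit's 2-D
    position; lr0, sigma0 and tau are hypothetical decay parameters."""
    winner = np.argmin(np.linalg.norm(weights - x, axis=1))  # best matching unit
    lr = lr0 * np.exp(-t / tau)                              # decaying learning rate
    sigma = sigma0 * np.exp(-t / tau)                        # shrinking neighbourhood
    d2 = np.sum((grid - grid[winner]) ** 2, axis=1)          # grid distance to winner
    h = np.exp(-d2 / (2 * sigma ** 2))                       # neighbourhood function
    weights += lr * h[:, None] * (x - weights)
    return weights
```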
  • 109. Kohonen Maps • Used in: – Fulantelli, G., Rizzo, R., Arrigo, M., and Corrao, R. (2000) An adaptive open hypermedia system on the Web. In: P. Brusilovsky, O. Stock and C. Strapparava (eds.) Adaptive Hypermedia and Adaptive Web-Based Systems. Lecture Notes in Computer Science, (Proceedings of Adaptive Hypermedia and Adaptive Web-based Systems, AH2000, Trento, Italy, August 28-30, 2000) Berlin: Springer-Verlag, pp. 189-201. – Goren-Bar, D., Kuflik, T., Lev, D., and Shoval, P. (2001) Automating personal categorizations using artificial neural network. In: M. Bauer, P. J. Gmytrasiewicz and J. Vassileva (eds.) User Modeling 2001. Lecture Notes in Artificial Intelligence, Vol. 2109, (Proceedings of 8th International Conference on User Modeling, UM 2001, Sonthofen, Germany, July 13-17, 2001) Berlin: Springer-Verlag, pp. 188-198.
  • 110. Kohonen Maps – Kayama, M. and Okamoto, T. (1999) Hy-SOM: The semantic map framework applied on an example case of navigation. In: G. Gumming, T. Okamoto and L. Gomez (eds.) Advanced Research in Computers and Communications in Education. Frontiers in Artificial Intelligence and Applications, Vol. 2, (Proceedings of ICCE'99, 7th International Conference on Computers in Education, Chiba, Japan, 4-7 November, 1999) Amsterdam: IOS Press, pp. 252-259. – Taskaya, T., Contreras, P., Feng, T., and Murtagh, F. (2001) Interactive visual user interfaces to databases. In: M. Smith, G. Salvendy, D. Harris and R. J. Koubek (eds.) Usability evaluation and interface design. Vol. 1, (Proceedings of 9th International Conference on Human-Computer Interaction, HCI International'2001, New Orleans, LA, August 8-10, 2001) Mahwah, NJ: Lawrence Erlbaum Associates, pp. 913-917.