Automatic control based on Wasp Behavioral Model and Stochastic 
Learning Automata 
FLORIN STOICA, DANA SIMIAN 
Computer Science Department 
“Lucian Blaga” University Sibiu 
Str. Dr. Ion Ratiu 5-7, 550012, Sibiu 
ROMANIA 
Abstract: - A stochastic automaton can perform a finite number of actions in a random environment. When a specific 
action is performed, the environment responds by producing an output that is stochastically related to the 
action. The aim is to design an automaton, using a reinforcement scheme based on the computational model of wasp 
behaviour, that can determine the best action guided by past actions and environment responses. Using Stochastic 
Learning Automata techniques, we introduce a decision/control method for intelligent vehicles receiving data from on-board 
sensors or from the localization system of the highway infrastructure. 
Key-Words: - Stochastic Learning Automata, Wasp Behavioral Model, Intelligent Vehicle Control 
1 Introduction 
An automaton is a machine or control mechanism 
designed to automatically follow a predetermined 
sequence of operations or respond to encoded 
instructions. The term stochastic emphasizes the 
adaptive nature of the automaton described here: it 
does not follow predetermined rules, but adapts to 
changes in its environment. This 
adaptation is the result of the learning process. Learning 
is defined as any permanent change in behavior as a 
result of past experience, and a learning system should 
therefore have the ability to improve its behavior with 
time, toward a final goal. 
The stochastic automaton attempts a solution of the 
problem without any information on the optimal action 
(initially, equal probabilities are attached to all the 
actions). One action is selected at random, the response 
from the environment is observed, action probabilities 
are updated based on that response, and the procedure is 
repeated. A stochastic automaton acting as described to 
improve its performance is called a learning automaton. 
The response values can be represented in three 
different models. In the P-model, the response values are 
either 0 or 1, in the S-model the response values are 
continuous in the range (0, 1) and in the Q-model the 
values belong to a finite set of discrete values in the 
range (0, 1). 
The environment can further be split into two types, 
stationary and nonstationary. In a stationary environment 
the penalty probabilities never change; in a 
nonstationary environment the penalties change 
over time. 
The algorithm that guarantees the desired learning 
process is called a reinforcement scheme [5]. The 
reinforcement scheme is the basis of the learning process 
for learning automata. The general solution for 
absolutely expedient schemes was found by 
Lakshmivarahan and Thathachar [8]. However, these 
reinforcement schemes are theoretically valid only in 
stationary environments. 
For a nonstationary automata environment resulting 
from a changing physical environment, we propose a 
new reinforcement scheme, based on the computational 
model of wasp behaviour. 
The aim of this paper is to define an agent-based 
model for an Intelligent Vehicle Control System using 
Stochastic Learning Automata with a learning process 
driven by a reinforcement scheme with wasp-like 
behaviour. 
The remainder of this paper is organized as follows. 
Section 2 presents the Stochastic Learning Automata 
concept. Sections 3 and 4 contain aspects related to the 
Intelligent Vehicle Control system. Section 5 introduces 
the Wasp Behavioral Model. Our new reinforcement 
scheme is presented in section 6. Section 7 describes an 
implementation of a simulator for the Intelligent Vehicle 
Control System, and conclusions are presented in section 8. 
2 Stochastic learning automata 
Mathematically, the environment of a stochastic 
automaton is defined by a triple {α, c, β}, where 
α = {α_1, α_2, ..., α_r} represents a finite set of actions 
(the input to the environment), β = {β_1, β_2} 
represents a binary response set, and c = {c_1, c_2, ..., c_r} is 
a set of penalty probabilities, where c_i is the probability 
that action α_i will result in an unfavourable response. 
Given that β(n) = 0 is a favourable outcome and 
β(n) = 1 an unfavourable outcome at time instant 
n (n = 0, 1, 2, ...), the element c_i of c is defined 
mathematically by: 

$$c_i = P(\beta(n) = 1 \mid \alpha(n) = \alpha_i), \quad i = 1, 2, \ldots, r$$
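As an illustration of this definition, the following minimal Java sketch models a stationary P-model environment; the class and method names are ours, introduced only for the example, and are not part of the systems discussed later.

import java.util.Random;

// A stationary P-model environment: the chosen action alpha_i (index i)
// receives the penalty response beta(n) = 1 with probability c_i,
// and the reward response beta(n) = 0 otherwise.
public class StationaryEnvironment {
    private final double[] c;               // penalty probabilities c_1, ..., c_r
    private final Random random = new Random();

    public StationaryEnvironment(double[] penaltyProbabilities) {
        this.c = penaltyProbabilities;
    }

    // beta(n) for alpha(n) = alpha_i (0-based action index)
    public int response(int i) {
        return (random.nextDouble() < c[i]) ? 1 : 0;
    }
}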
A learning automaton generates a sequence of 
actions on the basis of its interaction with the 
environment. If the automaton is “learning” in the 
process, its performance must be superior to “intuitive” 
methods [6]. Consider a stationary random environment 
with the penalty probabilities {c_1, c_2, ..., c_r} defined above. 
In order to describe the reinforcement schemes, we 
define p(n), the vector of action probabilities: 

$$p_i(n) = P(\alpha(n) = \alpha_i), \quad i = 1, \ldots, r$$

We define the quantity M(n) as the average penalty 
for a given action probability vector: 

$$M(n) = P(\beta(n) = 1 \mid p(n)) = \sum_{i=1}^{r} P(\beta(n) = 1 \mid \alpha(n) = \alpha_i)\, P(\alpha(n) = \alpha_i) = \sum_{i=1}^{r} c_i\, p_i(n)$$
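For example, the average penalty can be computed directly from the penalty probabilities and the current action probability vector; the small helper below only illustrates the formula and is not part of the original system.

// M(n) = sum over i of c_i * p_i(n)
public static double averagePenalty(double[] c, double[] p) {
    double m = 0.0;
    for (int i = 0; i < c.length; i++)
        m += c[i] * p[i];
    return m;
}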
An automaton is absolutely expedient if the expected 
value of the average penalty at one iteration step is less 
than it was at the previous step for all steps: 
M(n +1) < M(n) for all n [7]. 
Updating the action probabilities can be represented as 
follows: 

$$p(n+1) = T[p(n), \alpha(n), \beta(n)]$$

where T is a mapping. This formula says that the next action 
probability vector p(n+1) is computed from the current 
vector p(n), the action α(n) performed and the 
response β(n) received from the environment. If p(n+1) is a 
linear function of p(n), the reinforcement scheme is said 
to be linear; otherwise it is termed nonlinear. 
Absolutely expedient learning schemes are presently 
the only class of schemes for which necessary and 
sufficient conditions of design are available. 
However, the need for learning and adaptation in 
systems is mainly due to the fact that the environment 
changes with time. The performance of a learning 
automaton should be judged in such a context. If a 
learning automaton with a fixed strategy is used in a 
nonstationary environment, it may become less 
expedient, and even non-expedient. The learning scheme 
must have sufficient flexibility to track the better actions. 
If the action probabilities of the learning automata are 
functions of the status of the physical environment (as 
in the case of an Intelligent Vehicle Control System), the 
realization of a deterministic mathematical model of this 
physical environment may be extremely difficult, if not 
impossible. 
For a nonstationary automata environment resulting 
from a changing physical environment, we propose a 
new reinforcement scheme, based on the computational 
model of wasp behaviour. 
3 Using stochastic learning automata for 
Intelligent Vehicle Control 
In this section, we present a method for intelligent 
vehicle control having Stochastic Learning Automata as 
theoretical background. The aim here is to design 
an automata system that can learn the best possible 
action based on the data received from on-board sensors 
or from roadside-to-vehicle communications. For our 
model, we assume that an intelligent vehicle is capable 
of two sets of actions: lateral and longitudinal. Lateral 
actions are LEFT (shift to left lane), RIGHT (shift to 
right lane) and LINE_OK (stay in current lane). 
Longitudinal actions are ACC (accelerate), DEC 
(decelerate) and SPEED_OK (keep current speed). An 
autonomous vehicle must be able to “sense” the 
environment around itself. Therefore, we assume that 
there are four different sensor modules on board the 
vehicle (the headway module, two side modules and a 
speed module), in order to detect the presence of a 
vehicle traveling in front or in the immediately adjacent 
lanes and to know the current speed of the vehicle. 
These sensor modules evaluate the information received 
from the on-board sensors or from the highway 
infrastructure in the light of the current automata actions, 
and send a response to the automata. 
The response from the physical environment is a 
combination of the outputs of the sensor modules. 
Because an input parameter of the decision blocks is the 
action chosen by the stochastic automaton, it is necessary 
to use two distinct functions to map the outputs of the 
decision blocks into inputs for the two learning automata, 
namely the longitudinal automaton and the lateral 
automaton. 
After updating the action probability vectors of both 
learning automata using the new reinforcement 
scheme presented in section 6, the outputs of the 
stochastic automata are transmitted to the regulation 
layer. The regulation layer handles the actions received 
from the two automata in a distinct manner, using a 
regulation buffer for each of them. If a received action 
was rewarded, it is introduced into the regulation 
buffer of the corresponding automaton; otherwise a 
special value denoting an action penalized by the 
physical environment is placed in the buffer. The 
regulation layer does not carry out the chosen action 
immediately; instead, it carries out an action only if it is 
recommended k times consecutively by the automaton, 
where k is the length of the regulation buffer. After an 
action is executed, the action probability vector is 
reinitialized to 1/r, where r is the number of actions, and 
the regulation buffer is cleared as well. 
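A compact sketch of this regulation-layer rule is given below; the method and field names are illustrative only, and the full learning() method in Fig. 4 shows the same check integrated with the probability update.

// The regulation layer carries out an action only if it fills the whole
// regulation buffer, i.e. it was recommended k consecutive times
// (a penalty marker stored in the buffer also blocks execution).
private boolean shouldExecute(int action, int[] regulationBuffer) {
    for (int slot : regulationBuffer)
        if (slot != action) return false;
    return true;   // afterwards the probability vector is reset to 1/r and the buffer cleared
}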
Our basic model for planning and coordination of lane 
changing and speed control is shown in figure 1. 
Fig. 1 The model of the Intelligent Vehicle Control 
System (physical environment and localization system feeding the four sensor decision blocks, mapping functions F1 and F2, the longitudinal and lateral automata with their regulation buffers, and the regulation layer acting on the vehicle) 
4 Sensor modules 
The four teacher modules mentioned above are decision 
blocks that calculate the response (reward/penalty) 
based on the last action chosen by the automaton. Table 1 
describes the output of the decision blocks for the side sensors. 
As seen in table 1, a penalty response is received 
from the left sensor module when the action is LEFT and 
there is a vehicle in the left lane, or the vehicle is already 
traveling on the leftmost lane. The situation is similar 
for the right sensor module. 
Left/Right Sensor Module 
Actions   | Vehicle in sensor range, or no adjacent lane | No vehicle in sensor range, and adjacent lane exists 
LINE_OK   | 0/0                                          | 0/0 
LEFT      | 1/0                                          | 0/0 
RIGHT     | 0/1                                          | 0/0 
Table 1 Outputs from the Left/Right Sensor Module 
The Headway (Frontal) Module is defined as shown 
in table 2. If there is a vehicle at a close distance (< 
admissible distance), a penalty response is sent to the 
automaton for actions LINE_OK, SPEED_OK and 
ACC. All other actions (LEFT, RIGHT, DEC) are 
encouraged, because they may serve to avoid a collision. 
Headway Sensor Module 
Actions   | Vehicle in range (at a close frontal distance) | No vehicle in range 
LINE_OK   | 1                                              | 0 
LEFT      | 0                                              | 0 
RIGHT     | 0                                              | 0 
SPEED_OK  | 1                                              | 0 
ACC       | 1                                              | 0 
DEC       | 0*                                             | 0 
Table 2 Outputs from the Headway Module 
The Speed Module compares the actual speed with 
the desired speed and, based on the action chosen, sends a 
feedback to the longitudinal automaton. 
The reward response indicated by 0* (from the 
Headway Sensor Module) is different from the normal 
reward response, indicated by 0: this reward response 
has a higher priority and must override a possible 
penalty from other modules. 
Speed Sensor Module 
Actions   | Speed: too slow | Acceptable speed | Speed: too fast 
SPEED_OK  | 1               | 0                | 1 
ACC       | 0               | 0                | 1 
DEC       | 1               | 0                | 0 
Table 3 Outputs from the Speed Module 
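Read together, Tables 1-3 translate directly into decision blocks. The sketch below follows the module names used in the reward() functions of figure 2 (leftRightModule, frontModule, speedModule); the predicates such as vehicleOnLeft() or tooFast() are illustrative assumptions, not part of the simulator's actual interface.

// Left/right sensor module (Table 1): a lateral action is penalized (1)
// when a vehicle is detected in the target lane or no such lane exists.
int leftRightModule(int action) {
    if (action == LEFT  && (vehicleOnLeft()  || noLeftLane()))  return 1;
    if (action == RIGHT && (vehicleOnRight() || noRightLane())) return 1;
    return 0;
}

// Headway module (Table 2): with a vehicle closer than the admissible
// distance, LINE_OK, SPEED_OK and ACC are penalized, while DEC receives
// the high-priority reward 0* (coded as 2, see figure 2).
int frontModule(int action) {
    if (!vehicleAhead()) return 0;
    if (action == LINE_OK || action == SPEED_OK || action == ACC) return 1;
    if (action == DEC) return 2;
    return 0;
}

// Speed module (Table 3): compares the current speed with the desired speed.
int speedModule(int action) {
    if (tooSlow()) return (action == SPEED_OK || action == DEC) ? 1 : 0;
    if (tooFast()) return (action == SPEED_OK || action == ACC) ? 1 : 0;
    return 0;   // acceptable speed: every action is rewarded
}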
5 Wasp Behavioral Model 
Theraulaz et al. present a model for self-organization 
within a colony of wasps [15]. In a colony of wasps, 
each individual interacts with its local environment in 
the form of a stimulus-response mechanism, which 
governs distributed task allocation. An individual wasp 
has a response threshold for each zone of the nest. Based 
on a wasp’s threshold for a given zone and the amount of 
stimulus from brood located in this zone, a wasp may or 
may not become engaged in the task of foraging for this 
zone. A lower response threshold for a given zone 
amounts to a higher likelihood of engaging in activity, 
given a stimulus. A model in which these thresholds 
remain fixed over time is discussed in [16]. Later, in [17] 
it is considered that a threshold for a given task decreases 
during time periods when that task is performed and 
increases otherwise. In [18], Cicirello and Smith present 
a system which incorporates aspects of the wasp model 
that have been ignored by other authors. They 
consider three ways in which the response thresholds are 
updated. The first two are analogous to those of the real 
wasp model. The third is included to encourage a wasp 
associated with an idle machine to take whatever jobs are 
available rather than remain idle. The model of wasp behaviour 
also describes the nature of wasp-to-wasp interaction 
that takes place within the nest. When two individuals of 
the colony encounter each other, they may, with some 
probability, interact in a dominance contest. The wasp 
with the higher social rank will have a higher probability 
of dominating the interaction. Wasps within the 
colony self-organize themselves into a dominance 
hierarchy. This aspect of the behaviour model is 
incorporated in [18]: when two or more of the wasp-like 
agents bid for a given job, the winner is chosen 
through a tournament of dominance contests. 
In the following section we present a new 
reinforcement scheme for stochastic learning automata. 
6 A new reinforcement scheme 
In this section we present our approach to 
implementing a new reinforcement scheme based on a 
computational model of wasp behavior. 
Each stochastic automaton (namely the longitudinal 
automaton and the lateral automaton) has an 
associated wasp. Each wasp has a set of response 
thresholds: 

$$\Theta_w = \{\theta_{w,1}, \ldots, \theta_{w,r}\}$$

where r denotes the number of actions of the automaton 
(as defined in section 2) and θ_{w,i} is the response 
threshold of wasp w for action i. An action i broadcasts 
to the automaton a stimulus S_i equal to the 
number of occurrences of action i in the regulation buffer 
of the automaton. The stochastic automaton will pick 
the action i emitting stimulus S_i with probability: 

$$P(\theta_{w,i}, S_i) = \frac{S_i^2}{S_i^2 + \theta_{w,i}^2}$$

The threshold values θ_{w,i} may vary in the range 
[θ_min, θ_max]. 
If the automaton executes action i, the threshold 
θ_{w,i} is updated as follows: 

$$\theta_{w,i} = \theta_{w,i} - \delta_1 \quad (\delta_1 > 0)$$

For each action j other than i, the threshold θ_{w,j} is 
updated according to: 

$$\theta_{w,j} = \theta_{w,j} + \delta_2 \quad (\delta_2 > 0)$$
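As a purely illustrative numerical example (the values are not taken from the experiments): if action i appears 3 times in a regulation buffer of length k = 5, then S_i = 3; with θ_{w,i} = 4 the selection probability is

$$P(\theta_{w,i}, S_i) = \frac{3^2}{3^2 + 4^2} = \frac{9}{25} = 0.36,$$

while a lower threshold θ_{w,i} = 2 raises it to 9/13 ≈ 0.69. Repeated rewards therefore favour an action both through a growing stimulus and through a shrinking threshold.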
7 Implementation of a simulator 
This section describes an implementation of a 
simulator for the Intelligent Vehicle Control System. 
The entire system was implemented in Java and is based 
on the JADE platform. 
JADE is a middleware that facilitates the 
development of multi-agent systems and applications 
conforming to the FIPA standards for intelligent agents. 
Figure 3 shows the class diagram of the simulator. 
Each vehicle has an associated agent, responsible for the 
intelligent control. 
The response of the physical environment is a 
combination of the outputs of all four sensor modules. 
The implementation of this combination for each 
automaton (longitudinal, respectively lateral) is shown 
in figure 2 (the value 0* was substituted by 2). 
A snapshot of the running simulator is shown in figure 5. 
// environment response for the Longitudinal Automaton 
public double reward(int action) { 
    int combine; 
    combine = Math.max(speedModule(action), frontModule(action)); 
    if (combine == 2) combine = 0; 
    return combine; 
} 

// environment response for the Lateral Automaton 
public double reward(int action) { 
    int combine; 
    combine = Math.max(leftRightModule(action), frontModule(action)); 
    return combine; 
} 
Fig. 2 The physical environment response 
Fig. 3 The class diagram of the simulator 
The implementation of the learning process for the lateral 
automaton is presented below: 
// The learning process 
public void learning() { 
    int i, j; 
    double f; 
    boolean doIt; 
    // choose an action 
    i = getAction(); 
    // compute environment response 
    f = reward(i); 
    for (int k = 1; k < HISTORY; k++) 
        regulation_layer[k - 1] = regulation_layer[k]; 
    if (f == 0) 
        regulation_layer[HISTORY - 1] = i; 
    else 
        // ignore the action! 
        regulation_layer[HISTORY - 1] = -1; 
    doIt = true; 
    for (int k = 0; k < HISTORY; k++) 
        if (regulation_layer[k] != i) { 
            doIt = false; 
            break; 
        } 
    if (doIt) { 
        // all action probabilities become equal 
        init(); 
        switch (i) { 
            case LEFT: 
                auto.setLane(auto.getLane() - 1); 
                break; 
            case RIGHT: 
                auto.setLane(auto.getLane() + 1); 
                break; 
        } 
        return; 
    } 
    // update the action probabilities using a 
    // reinforcement scheme based on the 
    // wasp-like computational model 
    if (f == 0) { 
        if (t[i] - d1 >= t_min) t[i] = t[i] - d1; 
        S[i] = getStimulus(i); 
        p[i] = S[i] * S[i] / (S[i] * S[i] + t[i] * t[i]); 
        for (j = 0; j < ACTIONS; j++) 
            if (j != i) { 
                if (t[j] + d2 <= t_max) t[j] = t[j] + d2; 
                S[j] = getStimulus(j); 
                p[j] = S[j] * S[j] / (S[j] * S[j] + t[j] * t[j]); 
            } 
    } else { 
        if (t[i] + d2 <= t_max) t[i] = t[i] + d2; 
        S[i] = getStimulus(i); 
        p[i] = S[i] * S[i] / (S[i] * S[i] + t[i] * t[i]); 
        for (j = 0; j < ACTIONS; j++) 
            if (j != i) { 
                if (t[j] - d1 >= t_min) t[j] = t[j] - d1; 
                S[j] = getStimulus(j); 
                p[j] = S[j] * S[j] / (S[j] * S[j] + t[j] * t[j]); 
            } 
    } 
} 
Fig. 4 The learning process for the lateral automaton 
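The learning() method above relies on two helpers that are not shown in the listing: getStimulus(), which (as defined in section 6) counts the occurrences of an action in the regulation buffer, and getAction(), which selects the next action. A minimal sketch is given below; the roulette-wheel selection over the normalized p vector is our assumption, since the wasp-model probabilities need not sum to 1.

// Stimulus of action i: number of occurrences of i in the regulation buffer
private int getStimulus(int i) {
    int count = 0;
    for (int k = 0; k < HISTORY; k++)
        if (regulation_layer[k] == i) count++;
    return count;
}

// Hypothetical action selection: sample an action with chance proportional to p[i]
private int getAction() {
    double sum = 0.0;
    for (int k = 0; k < ACTIONS; k++) sum += p[k];
    double x = Math.random() * sum, acc = 0.0;
    for (int k = 0; k < ACTIONS; k++) {
        acc += p[k];
        if (x <= acc) return k;
    }
    return ACTIONS - 1;
}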
Fig. 5 A scenario executed in the simulator 
8 Conclusion 
Reinforcement learning has attracted rapidly increasing 
interest in the machine learning and artificial intelligence 
communities. Its promise is beguiling: a way of 
programming agents by reward and punishment without 
needing to specify how the task (i.e., the behavior) is to be 
achieved. Reinforcement learning allows one, at least in 
principle, to bypass the problem of building an explicit 
model of the behavior to be synthesized and of its 
counterpart, a meaningful learning base (supervised 
learning). 
The new reinforcement scheme presented in this 
paper is suitable for a nonstationary stochastic automata 
environment resulting from a changing physical 
environment. Used within a simulator of an Intelligent 
Vehicle Control System, this new reinforcement scheme, 
based on a computational model of wasp behavior, has 
proved its efficiency. 
Acknowledgment: This work was completed with the 
support of the research grant of the Romanian Ministry 
of Education and Research, CNCSIS 33/2007. 
References: 
[1] A. Barto, S. Mahadevan, Recent advances in 
hierarchical reinforcement learning, Discrete-Event 
Systems journal, Special issue on Reinforcement 
Learning, 2003. 
[2] R. Sutton, A. Barto, Reinforcement learning: An 
introduction, MIT-press, Cambridge, MA, 1998. 
[3] O. Buffet, A. Dutech, F. Charpillet, Incremental 
reinforcement learning for designing multi-agent 
systems, In J. P. Müller, E. Andre, S. Sen, and C. 
Frasson, editors, Proceedings of the Fifth 
International Conference on Autonomous Agents, pp. 
31–32, Montreal, Canada, 2001, ACM Press. 
[4] J. Moody, Y. Liu, M. Saffell, K. Youn, Stochastic 
direct reinforcement: Application to simple games 
with recurrence, In Proceedings of Artificial 
Multiagent Learning, Papers from the 2004 AAAI 
Fall Symposium, Technical Report FS-04-02, 2004. 
[5] C. Ünsal, P. Kachroo, J. S. Bay, Simulation Study of 
Multiple Intelligent Vehicle Control using Stochastic 
Learning Automata, TRANSACTIONS, the Quarterly 
Archival Journal of the Society for Computer 
Simulation International, volume 14, number 4, 
December 1997. 
[6] C. Ünsal, P. Kachroo, J. S. Bay, Multiple Stochastic 
Learning Automata for Vehicle Path Control in an 
Automated Highway System, IEEE Transactions on 
Systems, Man, and Cybernetics - Part A: Systems and 
Humans, vol. 29, no. 1, January 1999. 
[7] K. S. Narendra, M. A. L. Thathachar, Learning 
Automata: an introduction, Prentice-Hall, 1989. 
[8] S. Lakshmivarahan, M.A.L. Thathachar, Absolutely 
Expedient Learning Algorithms for Stochastic 
Automata, IEEE Transactions on Systems, Man and 
Cybernetics, vol. SMC-6, pp. 281-286, 1973 
[9] F. Stoica, E. M. Popa, An Absolutely Expedient 
Learning Algorithm for Stochastic Automata, 
WSEAS Transactions on Computers, Issue 2, Volume 
6, February 2007, ISSN 1109-2750, pp. 229-235 
[10] I. De Falco, A. Iazzetta, A. Della Cioppa, E. 
Tarantino, Mijn Mutation Operator for Aerofoil 
Design Optimisation, Research Institute on Parallel 
Information Systems, National Research Council of 
Italy, 2001 
[11] I. De Falco, R. Del Balio, A. Della Cioppa, E. 
Tarantino, A Comparative Analysis of Evolutionary 
Algorithms for Function Optimisation, Research 
Institute on Parallel Information Systems, National 
Research Council of Italy, 1998 
[12] I. De Falco, A. Iazzetta, A. Della Cioppa, E. 
Tarantino, The Effectiveness of Co-mutation in 
Evolutionary Algorithms: the Mijn operator, Research 
Institute on Parallel Information Systems, National 
Research Council of Italy, 2000 
[13] H. Kita, A comparison study of self-adaptation in 
evolution strategies and real-code genetic algorithms, 
Evolutionary Computation, 9(2):223-241, 2001 
[14] F. Stoica, D. Simian, C. Simian, A new co-mutation 
genetic operator, In Proceedings of Applications of 
Evolutionary Computing in Modeling and 
Development of Intelligent Systems, within the 9th 
WSEAS International Conference on 
EVOLUTIONARY COMPUTING (EC '08), May 
2008, Sofia, Bulgaria. 
[15] G. Theraulaz, E. Bonabeau, J. Gervet, J. L. 
Deneubourg, Task differentiation in Polistes 
wasp colonies: a model for self-organizing groups of 
robots, From Animals to Animats: Proceedings of the 
First International Conference on Simulation of 
Adaptive Behavior, 1991, pp. 346–355. 
[16] E. Bonabeau, G. Theraulaz, J. L. Deneubourg, Fixed 
response thresholds and the regulation of division of 
labor in insect societies, Bull. Math. Biol., vol. 60, 
1998, pp. 753–807. 
[17] G. Theraulaz, E. Bonabeau, J. L. Deneubourg, 
Response threshold reinforcement and division of 
labour in insect societies, Proc. R. Soc. London 
B, vol. 265, no. 1393, 1998, pp. 327–335. 
[18] V. A. Cicirello, S. F. Smith, Wasp-like Agents for 
Distributed Factory Coordination, Autonomous 
Agents and Multi-Agent Systems, Vol. 8, No. 3, 2004, 
pp. 237–267. 
[19] D. Simian, F. Stoica, C. Simian, Models for a 
Multi-Agent System Based on Wasp-Like Behaviour 
for Distributed Patients Repartition, 9th WSEAS 
International Conference on EVOLUTIONARY 
COMPUTING (EC’08), Sofia, Bulgaria, May 2-4, 
2008. 

Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRStunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
 
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxSOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
 
Luciferase in rDNA technology (biotechnology).pptx
Luciferase in rDNA technology (biotechnology).pptxLuciferase in rDNA technology (biotechnology).pptx
Luciferase in rDNA technology (biotechnology).pptx
 
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝
 
Analytical Profile of Coleus Forskohlii | Forskolin .pptx
Analytical Profile of Coleus Forskohlii | Forskolin .pptxAnalytical Profile of Coleus Forskohlii | Forskolin .pptx
Analytical Profile of Coleus Forskohlii | Forskolin .pptx
 
Scheme-of-Work-Science-Stage-4 cambridge science.docx
Scheme-of-Work-Science-Stage-4 cambridge science.docxScheme-of-Work-Science-Stage-4 cambridge science.docx
Scheme-of-Work-Science-Stage-4 cambridge science.docx
 
Biological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfBiological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdf
 
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
 
9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service
9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service
9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service
 
Engler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomyEngler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomy
 
The Philosophy of Science
The Philosophy of ScienceThe Philosophy of Science
The Philosophy of Science
 
Biopesticide (2).pptx .This slides helps to know the different types of biop...
Biopesticide (2).pptx  .This slides helps to know the different types of biop...Biopesticide (2).pptx  .This slides helps to know the different types of biop...
Biopesticide (2).pptx .This slides helps to know the different types of biop...
 
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOST
 
Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )
 

In section 2 we present the Stochastic learning automata concept.
Sections 3 and 4 contain aspects related to the Intelligent Vehicle Control system. In section 5 we introduce the Wasp Behavioral Model. Our new reinforcement scheme is presented in section 6. Section 7 describes an implementation of a simulator for the Intelligent Vehicle Control System, and conclusions are presented in section 8.

2 Stochastic learning automata
Mathematically, the environment of a stochastic automaton is defined by a triple $\{\alpha, c, \beta\}$, where $\alpha = \{\alpha_1, \alpha_2, \ldots, \alpha_r\}$ represents a finite set of actions, which is the input to the environment, $\beta = \{\beta_1, \beta_2\}$ represents a binary response set, and $c = \{c_1, c_2, \ldots, c_r\}$ is a set of penalty probabilities, where $c_i$ is the probability that action $\alpha_i$ will result in an unfavourable response.
Given that $\beta(n) = 0$ is a favourable outcome and $\beta(n) = 1$ is an unfavourable outcome at time instant $n$ ($n = 0, 1, 2, \ldots$), the element $c_i$ of $c$ is defined mathematically by:

$$c_i = P(\beta(n) = 1 \mid \alpha(n) = \alpha_i), \quad i = 1, 2, \ldots, r$$

A learning automaton generates a sequence of actions on the basis of its interaction with the environment. If the automaton is "learning" in the process, its performance must be superior to "intuitive" methods [6].

Consider a stationary random environment with penalty probabilities $\{c_1, c_2, \ldots, c_r\}$ defined above. In order to describe the reinforcement schemes, we define $p(n)$, a vector of action probabilities:

$$p_i(n) = P(\alpha(n) = \alpha_i), \quad i = 1, \ldots, r$$

We define the quantity $M(n)$ as the average penalty for a given action probability vector:

$$M(n) = P(\beta(n) = 1 \mid p(n)) = \sum_{i=1}^{r} P(\beta(n) = 1 \mid \alpha(n) = \alpha_i)\, P(\alpha(n) = \alpha_i) = \sum_{i=1}^{r} c_i\, p_i(n)$$

An automaton is absolutely expedient if the expected value of the average penalty at one iteration step is less than it was at the previous step, for all steps: $M(n+1) < M(n)$ for all $n$ [7].

Updating of the action probabilities can be represented as follows:

$$p(n+1) = T[\,p(n), \alpha(n), \beta(n)\,]$$

where $T$ is a mapping. This formula says that the next action probability vector $p(n+1)$ is computed from the current probability vector $p(n)$, the action chosen and the response received from the environment. If $p(n+1)$ is a linear function of $p(n)$, the reinforcement scheme is said to be linear; otherwise it is termed nonlinear.

Absolutely expedient learning schemes are presently the only class of schemes for which necessary and sufficient design conditions are available. However, the need for learning and adaptation in systems is mainly due to the fact that the environment changes with time, and the performance of a learning automaton should be judged in such a context. If a learning automaton with a fixed strategy is used in a nonstationary environment, it may become less expedient, or even non-expedient; the learning scheme must have sufficient flexibility to track the better actions. If the action probabilities of the learning automata are functions of the status of the physical environment (as in the case of an Intelligent Vehicle Control System), the realization of a deterministic mathematical model of this physical environment may be extremely difficult, if not impossible. For a nonstationary automata environment resulting from a changing physical environment, we propose a new reinforcement scheme, based on the computational model of wasp behaviour.
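To make the interaction cycle above concrete, the following sketch shows a generic P-model learning automaton loop in Java (the language used later for the simulator). It is an illustrative reconstruction, not code from the paper: the Environment and ReinforcementScheme interfaces and all identifiers are assumptions introduced here.

import java.util.Random;

// Illustrative P-model learning automaton loop (a sketch, not the paper's code).
public class LearningAutomaton {
    private final double[] p;            // action probability vector p(n)
    private final Random rnd = new Random();

    public LearningAutomaton(int r) {
        p = new double[r];
        for (int i = 0; i < r; i++) p[i] = 1.0 / r;   // equal initial probabilities
    }

    // Sample an action according to the current probability vector.
    public int chooseAction() {
        double u = rnd.nextDouble(), acc = 0.0;
        for (int i = 0; i < p.length; i++) {
            acc += p[i];
            if (u <= acc) return i;
        }
        return p.length - 1;
    }

    // One interaction step: choose an action, observe beta, update p(n).
    public void step(Environment env, ReinforcementScheme scheme) {
        int action = chooseAction();
        int beta = env.response(action);   // 0 = reward, 1 = penalty (P-model)
        scheme.update(p, action, beta);    // p(n+1) = T[p(n), alpha(n), beta(n)]
    }

    public interface Environment { int response(int action); }
    public interface ReinforcementScheme { void update(double[] p, int action, int beta); }
}

The reinforcement scheme is deliberately left behind an interface here, since it is exactly the component replaced by the wasp-based scheme proposed in section 6.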
3 Using stochastic learning automata for Intelligent Vehicle Control
In this section we present a method for intelligent vehicle control, having Stochastic Learning Automata as theoretical background. The aim is to design an automata system that can learn the best possible action based on the data received from on-board sensors or from roadside-to-vehicle communications. For our model, we assume that an intelligent vehicle is capable of two sets of actions: lateral and longitudinal. Lateral actions are LEFT (shift to left lane), RIGHT (shift to right lane) and LINE_OK (stay in current lane). Longitudinal actions are ACC (accelerate), DEC (decelerate) and SPEED_OK (keep current speed).

An autonomous vehicle must be able to "sense" the environment around itself. Therefore, we assume that there are four different sensor modules on board the vehicle (the headway module, two side modules and a speed module), in order to detect the presence of a vehicle traveling in front of the vehicle or in the immediately adjacent lanes, and to know the current speed of the vehicle. These sensor modules evaluate the information received from the on-board sensors or from the highway infrastructure in the light of the current automata actions, and send a response to the automata. The response from the physical environment is a combination of the outputs of the sensor modules. Because an input parameter for the decision blocks is the action chosen by the stochastic automaton, it is necessary to use two distinct functions for mapping the outputs of the decision blocks into inputs for the two learning automata, namely the longitudinal automaton and the lateral automaton, respectively.

After updating the action probability vectors in both learning automata, using the new reinforcement scheme presented in section 6, the outputs of the stochastic automata are transmitted to the regulation layer. The regulation layer handles the actions received from the two automata separately, using a regulation buffer for each of them. If a received action was rewarded, it is introduced into the regulation buffer of the corresponding automaton; otherwise, the buffer receives a special value which denotes an action penalized by the physical environment. The regulation layer does not carry out the chosen action immediately; instead, it carries out an action only if that action has been recommended k times consecutively by the automaton, where k is the length of the regulation buffer. After an action is executed, the action probability vector is reset to 1/r, where r is the number of actions, and the regulation buffer is also cleared. A small sketch of this buffering rule is given below.
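As an illustration of the k-consecutive-recommendations rule, the following small Java class sketches one possible regulation buffer. It is not taken from the paper, whose simulator keeps this logic inline in the learning method shown later in Fig. 4; the class and method names are hypothetical.

// Illustrative sketch of a regulation buffer; the paper's simulator
// implements this logic inline with an int array (see Fig. 4).
public class RegulationBuffer {
    public static final int PENALIZED = -1;   // marker for a penalized action
    private final int[] buffer;               // last k recommendations

    public RegulationBuffer(int k) {
        buffer = new int[k];
        clear();
    }

    // Record the latest recommendation (or PENALIZED) and report whether
    // the same action now fills the whole buffer, i.e. may be executed.
    public boolean pushAndCheck(int action) {
        System.arraycopy(buffer, 1, buffer, 0, buffer.length - 1);  // shift left
        buffer[buffer.length - 1] = action;
        for (int a : buffer)
            if (a != action || action == PENALIZED) return false;
        return true;
    }

    // Clear the buffer, as done after an action has actually been executed.
    public void clear() {
        java.util.Arrays.fill(buffer, PENALIZED);
    }
}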
Our basic model for planning and coordination of lane changing and speed control is shown in figure 1.

[Fig. 1 The model of the Intelligent Vehicle Control System: the planning layer (sensor decision blocks, mapping functions F1 and F2, the longitudinal and lateral automata with their regulation buffers) placed between the physical environment/localization system and the regulation layer of the vehicle.]

4 Sensor modules
The four sensor modules mentioned above are decision blocks that calculate the response (reward/penalty) based on the last action chosen by the automaton. Table 1 describes the output of the decision blocks for the side sensors. As seen in table 1, a penalty response is received from the left sensor module when the action is LEFT and there is a vehicle on the left or the vehicle is already traveling in the leftmost lane. The situation is similar for the right sensor module.

Left/Right Sensor Module
Action   | Vehicle in sensor range or no adjacent lane | No vehicle in sensor range and adjacent lane exists
LINE_OK  | 0/0                                         | 0/0
LEFT     | 1/0                                         | 0/0
RIGHT    | 0/1                                         | 0/0
Table 1 Outputs from the Left/Right Sensor Module

The Headway (Frontal) Module is defined as shown in table 2. If there is a vehicle at a close distance (less than the admissible distance), a penalty response is sent to the automaton for the actions LINE_OK, SPEED_OK and ACC. All other actions (LEFT, RIGHT, DEC) are encouraged, because they may serve to avoid a collision.

Headway Sensor Module
Action   | Vehicle in range (at a close frontal distance) | No vehicle in range
LINE_OK  | 1  | 0
LEFT     | 0  | 0
RIGHT    | 0  | 0
SPEED_OK | 1  | 0
ACC      | 1  | 0
DEC      | 0* | 0
Table 2 Outputs from the Headway Module

The Speed Module compares the actual speed with the desired speed and, based on the action chosen, sends a feedback to the longitudinal automaton. The reward response indicated by 0* (from the Headway Sensor Module) is different from the normal reward response, indicated by 0: it has a higher priority and must override a possible penalty from other modules.

Speed Sensor Module
Action   | Speed: too slow | Acceptable speed | Speed: too fast
SPEED_OK | 1 | 0 | 1
ACC      | 0 | 0 | 1
DEC      | 1 | 0 | 0
Table 3 Outputs from the Speed Module
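The simulator code shown later (Fig. 2) calls decision-block methods such as speedModule(action) without showing their bodies. The sketch below gives one possible implementation of the speed decision block, directly transcribing Table 3; the class name, field names, action encoding and the speed thresholds are assumptions introduced here, not taken from the paper.

// Hypothetical speed decision block, transcribing Table 3.
// Returns 1 (penalty) or 0 (reward) for the longitudinal automaton.
public class SpeedModule {
    static final int SPEED_OK = 0, ACC = 1, DEC = 2;  // assumed action encoding

    private double currentSpeed;   // from the on-board speed sensor
    private double desiredSpeed;   // target speed for the vehicle
    private double tolerance;      // half-width of the "acceptable speed" band

    public int response(int action) {
        boolean tooSlow = currentSpeed < desiredSpeed - tolerance;
        boolean tooFast = currentSpeed > desiredSpeed + tolerance;
        switch (action) {
            case SPEED_OK: return (tooSlow || tooFast) ? 1 : 0;  // row SPEED_OK of Table 3
            case ACC:      return tooFast ? 1 : 0;               // row ACC of Table 3
            case DEC:      return tooSlow ? 1 : 0;               // row DEC of Table 3
            default:       return 0;   // lateral actions are not judged by this module
        }
    }
}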
5 Wasp Behavioral Model
Theraulaz et al. present a model for self-organization within a colony of wasps [15]. In a colony of wasps, each individual wasp interacts with its local environment in the form of a stimulus-response mechanism, which governs distributed task allocation. An individual wasp has a response threshold for each zone of the nest. Based on a wasp's threshold for a given zone and the amount of stimulus from brood located in that zone, a wasp may or may not become engaged in the task of foraging for this zone. A lower response threshold for a given zone amounts to a higher likelihood of engaging in activity given a stimulus. In [16] a model is discussed in which these thresholds remain fixed over time. Later, in [17], it is considered that a threshold for a given task decreases during time periods when that task is performed and increases otherwise.

In [18], Cicirello and Smith present a system which incorporates aspects of the wasp model that have been ignored by other authors. They consider three ways in which the response thresholds are updated. The first two are analogous to those of the real wasp model. The third is included to encourage a wasp associated with an idle machine to take whatever jobs are available rather than remain idle.

The model of wasp behaviour also describes the nature of wasp-to-wasp interaction that takes place within the nest. When two individuals of the colony encounter each other, they may with some probability interact in a dominance contest. The wasp with the higher social rank will have a higher probability of dominating in the interaction. Wasps within the colony thus self-organize themselves into a dominance hierarchy. In [18] this aspect of the behavioural model is also incorporated: when two or more of the wasp-like agents bid for a given job, the winner is chosen through a tournament of dominance contests.

In the following section a new reinforcement scheme for stochastic learning automata is presented.

6 A new reinforcement scheme
In this section we present our approach to implementing a new reinforcement scheme based on a computational model of wasp behaviour. Each stochastic automaton (namely the longitudinal automaton and the lateral automaton, respectively) has an associated wasp. Each wasp $w$ has a set of response thresholds:

$$\Theta_w = \{\theta_{w,0}, \ldots, \theta_{w,r}\}$$

where $r$ denotes the number of actions of the automaton (as defined in section 2) and $\theta_{w,i}$ is the response threshold of wasp $w$ for action $i$. An action $i$ broadcasts to the automaton a stimulus $S_i$ which is equal to the number of occurrences of action $i$ in the regulation buffer of the automaton. The stochastic automaton will pick up the action $i$ emitting a stimulus $S_i$ with probability:

$$P(S_i, \theta_{w,i}) = \frac{S_i^2}{S_i^2 + \theta_{w,i}^2}$$

The threshold values $\theta_{w,i}$ may vary in the range $[\theta_{\min}, \theta_{\max}]$. If the automaton executes action $i$, the threshold $\theta_{w,i}$ is updated as follows:

$$\theta_{w,i} = \theta_{w,i} - \delta_1 \quad (\delta_1 > 0)$$

For each action $j$ other than $i$, the threshold $\theta_{w,j}$ is updated according to:

$$\theta_{w,j} = \theta_{w,j} + \delta_2 \quad (\delta_2 > 0)$$
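As a purely illustrative numeric example (the values are ours, not from the paper): suppose action $i$ occurs $S_i = 3$ times in the regulation buffer and its threshold is $\theta_{w,i} = 2$. Then the selection probability is $P(3, 2) = 3^2 / (3^2 + 2^2) = 9/13 \approx 0.69$. If action $i$ is then executed and $\delta_1 = 0.5$, its threshold drops to $1.5$, and the same stimulus would afterwards select it with probability $9/(9 + 2.25) = 0.80$. The thresholds therefore act as a memory that amplifies recently favoured actions while the stimuli keep the scheme tied to the current content of the regulation buffer.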
7 Implementation of a simulator
This section describes an implementation of a simulator for the Intelligent Vehicle Control System. The entire system was implemented in Java and is based on the JADE platform. JADE is a middleware that facilitates the development of multi-agent systems and applications conforming to the FIPA standards for intelligent agents.

The class diagram of the simulator is shown in figure 3. Each vehicle has an associated agent, responsible for its intelligent control. The response of the physical environment is a combination of the outputs of all four sensor modules. The implementation of this combination for each automaton (longitudinal and lateral, respectively) is shown in figure 2; the value 0* was substituted by 2, so that this high-priority reward survives the max() combination with any penalty and is then mapped back to a normal reward. A snapshot of the running simulator is shown in figure 5.

// environment response for the Longitudinal Automaton
public double reward(int action) {
    int combine;
    combine = Math.max(speedModule(action), frontModule(action));
    if (combine == 2)    // high-priority reward 0* overrides any penalty
        combine = 0;
    return combine;
}

// environment response for the Lateral Automaton
public double reward(int action) {
    int combine;
    combine = Math.max(leftRightModule(action), frontModule(action));
    return combine;
}

Fig. 2 The physical environment response
[Fig. 3 The class diagram of the simulator]

The implementation of the learning process for the lateral automaton is described in the following:

// The learning process
public void learning() {
    int i, j;
    double f;
    boolean doIt;

    // choose an action
    i = getAction();
    // compute the environment response
    f = reward(i);

    // shift the regulation buffer and record the new recommendation
    for (int k = 1; k < HISTORY; k++)
        regulation_layer[k-1] = regulation_layer[k];
    if (f == 0)
        regulation_layer[HISTORY-1] = i;
    else
        // ignore the action!
        regulation_layer[HISTORY-1] = -1;

    // execute the action only if it fills the whole regulation buffer
    doIt = true;
    for (int k = 0; k < HISTORY; k++)
        if (regulation_layer[k] != i) {
            doIt = false;
            break;
        }
    if (doIt) {
        // all action probabilities become equal
        init();
        switch (i) {
            case LEFT:
                auto.setLane(auto.getLane() - 1);
                break;
            case RIGHT:
                auto.setLane(auto.getLane() + 1);
                break;
        }
        return;
    }

    // update the action probabilities using a reinforcement scheme
    // based on the wasp-like computational model
    if (f == 0) {
        // reward: lower the threshold of the chosen action, raise the others
        if (t[i] - d1 >= t_min) t[i] = t[i] - d1;
        S[i] = getStimulus(i);
        p[i] = S[i]*S[i] / (S[i]*S[i] + t[i]*t[i]);
        for (j = 0; j < ACTIONS; j++)
            if (j != i) {
                if (t[j] + d2 <= t_max) t[j] = t[j] + d2;
                S[j] = getStimulus(j);
                p[j] = S[j]*S[j] / (S[j]*S[j] + t[j]*t[j]);
            }
    } else {
        // penalty: raise the threshold of the chosen action, lower the others
        if (t[i] + d2 <= t_max) t[i] = t[i] + d2;
        S[i] = getStimulus(i);
        p[i] = S[i]*S[i] / (S[i]*S[i] + t[i]*t[i]);
        for (j = 0; j < ACTIONS; j++)
            if (j != i) {
                if (t[j] - d1 >= t_min) t[j] = t[j] - d1;
                S[j] = getStimulus(j);
                p[j] = S[j]*S[j] / (S[j]*S[j] + t[j]*t[j]);
            }
    }
}

Fig. 4 The learning process for the lateral automaton
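For readability, the listing above relies on fields and constants that the paper does not show. The declarations below are a plausible reconstruction; in particular the values of HISTORY, t_min, t_max, d1 and d2 are assumptions introduced here, not taken from the paper.

// Assumed declarations for the lateral automaton (illustrative values only).
static final int ACTIONS = 3;                   // LINE_OK, LEFT, RIGHT
static final int LINE_OK = 0, LEFT = 1, RIGHT = 2;
static final int HISTORY = 4;                   // length k of the regulation buffer (assumed)

double[] p = new double[ACTIONS];               // action probabilities
double[] t = new double[ACTIONS];               // wasp response thresholds
double[] S = new double[ACTIONS];               // stimuli (occurrences in the buffer)
int[] regulation_layer = new int[HISTORY];      // regulation buffer (-1 = penalized action)

double t_min = 1.0, t_max = 10.0;               // threshold range (assumed values)
double d1 = 0.5, d2 = 0.5;                      // threshold update steps (assumed values)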
[Fig. 5 A scenario executed in the simulator]

8 Conclusion
Reinforcement learning has attracted rapidly increasing interest in the machine learning and artificial intelligence communities. Its promise is beguiling: a way of programming agents by reward and punishment without needing to specify how the task (i.e., the behaviour) is to be achieved. Reinforcement learning allows us, at least in principle, to bypass the problem of building an explicit model of the behaviour to be synthesized and its counterpart, a meaningful learning base (supervised learning).

The new reinforcement scheme presented in this paper is suitable for a nonstationary stochastic automata environment resulting from a changing physical environment. Used within a simulator of an Intelligent Vehicle Control System, this new reinforcement scheme, based on a computational model of wasp behaviour, has proved its efficiency.

Acknowledgment: This work was completed with the support of the research grant of the Romanian Ministry of Education and Research, CNCSIS 33/2007.

References:
[1] A. Barto, S. Mahadevan, Recent advances in hierarchical reinforcement learning, Discrete-Event Systems Journal, Special issue on Reinforcement Learning, 2003.
[2] R. Sutton, A. Barto, Reinforcement Learning: An Introduction, MIT Press, Cambridge, MA, 1998.
[3] O. Buffet, A. Dutech, F. Charpillet, Incremental reinforcement learning for designing multi-agent systems, in J. P. Müller, E. Andre, S. Sen, and C. Frasson (eds.), Proceedings of the Fifth International Conference on Autonomous Agents, pp. 31-32, Montreal, Canada, 2001, ACM Press.
[4] J. Moody, Y. Liu, M. Saffell, K. Youn, Stochastic direct reinforcement: Application to simple games with recurrence, in Proceedings of Artificial Multiagent Learning, Papers from the 2004 AAAI Fall Symposium, Technical Report FS-04-02, 2004.
[5] C. Ünsal, P. Kachroo, J. S. Bay, Simulation Study of Multiple Intelligent Vehicle Control using Stochastic Learning Automata, TRANSACTIONS, the Quarterly Archival Journal of the Society for Computer Simulation International, Volume 14, Number 4, December 1997.
[6] C. Ünsal, P. Kachroo, J. S. Bay, Multiple Stochastic Learning Automata for Vehicle Path Control in an Automated Highway System, IEEE Transactions on Systems, Man, and Cybernetics - Part A: Systems and Humans, Vol. 29, No. 1, January 1999.
[7] K. S. Narendra, M. A. L. Thathachar, Learning Automata: An Introduction, Prentice-Hall, 1989.
[8] S. Lakshmivarahan, M. A. L. Thathachar, Absolutely Expedient Learning Algorithms for Stochastic Automata, IEEE Transactions on Systems, Man and Cybernetics, Vol. SMC-6, pp. 281-286, 1973.
[9] F. Stoica, E. M. Popa, An Absolutely Expedient Learning Algorithm for Stochastic Automata, WSEAS Transactions on Computers, Issue 2, Volume 6, February 2007, ISSN 1109-2750, pp. 229-235.
[10] I. De Falco, A. Iazzetta, A. Della Cioppa, E. Tarantino, Mijn Mutation Operator for Aerofoil Design Optimisation, Research Institute on Parallel Information Systems, National Research Council of Italy, 2001.
[11] I. De Falco, R. Del Balio, A. Della Cioppa, E. Tarantino, A Comparative Analysis of Evolutionary Algorithms for Function Optimisation, Research Institute on Parallel Information Systems, National Research Council of Italy, 1998.
[12] I. De Falco, A. Iazzetta, A. Della Cioppa, E. Tarantino, The Effectiveness of Co-mutation in Evolutionary Algorithms: the Mijn Operator, Research Institute on Parallel Information Systems, National Research Council of Italy, 2000.
[13] H. Kita, A comparison study of self-adaptation in evolution strategies and real-coded genetic algorithms, Evolutionary Computation, 9(2):223-241, 2001.
[14] F. Stoica, D. Simian, C. Simian, A new co-mutation genetic operator, in Proceedings of Applications of Evolutionary Computing in Modeling and Development of Intelligent Systems, within the 9th WSEAS International Conference on EVOLUTIONARY COMPUTING (EC '08), May 2008, Sofia, Bulgaria.
[15] G. Theraulaz, E. Bonabeau, J. Gervet, J. L. Deneubourg, Task differentiation in Polistes wasp colonies: a model for self-organizing groups of robots, From Animals to Animats: Proceedings of the First International Conference on Simulation of Adaptive Behavior, 1991, pp. 346-355.
[16] E. Bonabeau, G. Theraulaz, J. L. Deneubourg, Fixed response thresholds and the regulation of division of labor in insect societies, Bull. Math. Biol., Vol. 60, 1998, pp. 753-807.
[17] G. Theraulaz, E. Bonabeau, J. L. Deneubourg, Response threshold reinforcement and division of labour in insect societies, Proc. R. Soc. London B, Vol. 265, No. 1393, 1998, pp. 327-335.
[18] V. A. Cicirello, S. F. Smith, Wasp-like Agents for Distributed Factory Coordination, Autonomous Agents and Multi-Agent Systems, Vol. 8, No. 3, 2004, pp. 237-267.
[19] D. Simian, F. Stoica, C. Simian, Models for a Multi-Agent System Based on Wasp-Like Behaviour for Distributed Patients Repartition, 9th WSEAS International Conference on EVOLUTIONARY COMPUTING (EC '08), Sofia, Bulgaria, May 2-4, 2008.