THE DEVELOPMENT OF NEURAL NETWORK BASED
VISION FOR AN AUTONOMOUS VEHICLE
BY
AKINOLA Otitoaleke Gideon
(EEG/2006/034)
A DISSERTATION SUBMITTED TO THE
DEPARTMENT OF ELECTRONIC AND ELECTRICAL ENGINEERING,
FACULTY OF TECHNOLOGY,
OBAFEMI AWOLOWO UNIVERSITY, ILE-IFE, NIGERIA
IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE AWARD OF THE
BACHELOR OF SCIENCE (HONOURS) DEGREE IN ELECTRONIC AND
ELECTRICAL ENGINEERING.
JANUARY, 2012.
Department of Electronic and Electrical Engineering,
Faculty of Technology,
Obafemi Awolowo University,
Ile-Ife, Osun State.
24th January, 2012.
The Coordinator,
Final Year Project,
Department of Electronic and Electrical Engineering,
Obafemi Awolowo University,
Ile-Ife, Osun State.
Dear Sir,
LETTER OF TRANSMITTAL
Kindly accept the accompanying copy of the dissertation on my final year project
titled “The Development of Neural Network Based Vision for an Autonomous Vehicle”.
This was undertaken in partial fulfilment of the requirements for the award of a Bachelor
of Science (B.Sc.) degree from the Department of Electronic and Electrical Engineering,
Obafemi Awolowo University.
Thank You.
Yours Sincerely,
AKINOLA Otitoaleke Gideon
EEG/2006/034
CERTIFICATION
This is to certify that this report was prepared by AKINOLA Otitoaleke Gideon
(EEG/2006/034) in accordance with the requirements stipulated for the execution of a
final year project, in partial fulfilment of the requirements for the award of a Bachelor of
Science (B.Sc.) degree in Electronic and Electrical Engineering.
___________________
Dr. K.P. Ayodele
(Project Supervisor)
ACKNOWLEDGEMENT
I give my heartfelt thanks to my supervisor, Dr. Kayode AYODELE, for providing
the direction and materials needed for this project. I also thank my family
members for their support and understanding.
Above all, I am indeed grateful to God Almighty for the progress made so far. It
could only have been possible by His grace and help.
TABLE OF CONTENTS
LETTER OF TRANSMITTAL ....................................................................................................... ii
CERTIFICATION .......................................................................................................................... iii
ACKNOWLEDGEMENT .............................................................................................................. iv
ABSTRACT .................................................................................................................................. viii
LIST OF FIGURES ..........................................................................................................................x
LIST OF TABLES .......................................................................................................................... xi
CHAPTER ONE ...............................................................................................................................1
INTRODUCTION ............................................................................................................................1
1.1. Background .......................................................................................................................1
1.2. Project Description............................................................................................................4
1.3. Objective of the Project.....................................................................................................5
1.4. Scope .................................................................................................................................5
1.5. Justification .......................................................................................................................5
CHAPTER TWO ..............................................................................................................................7
LITERATURE REVIEW .................................................................................................................7
2.1. Mobile Robots ...................................................................................................................7
2.1.1. Types of Mobile Robots ............................................................................................7
2.1.2. Systems and Methods for Mobile Robot Navigation ................................................9
2.1.3. Mobile Robots and Artificial Intelligence...............................................................11
2.2. Autonomous Vehicles .....................................................................................................14
2.2.1. Road Recognition and Following............................................................................14
2.3. Vision and Image Processing ..........................................................................................18
2.3.1. Pixels: ......................................................................................................................19
2.3.2. Indexed Images .......................................................................................................19
2.3.3. Methods in Image Processing .................................................................................20
2.4. Artificial Neural Networks..............................................................................................22
2.4.1. Introduction .............................................................................................................22
2.4.2. Neural Network Architecture ..................................................................................25
2.4.3. Transfer Functions ..................................................................................................27
2.4.4. Learning rules .........................................................................................................28
2.4.5. Some important neural network paradigms ............................................................29
2.4.6. Validation ................................................................................................................34
2.4.7. Real life applications ...............................................................................................35
2.4.8. Case Study of Two Neural Network Paradigms .....................................................36
CHAPTER THREE ........................................................................................................................42
METHODOLOGY .........................................................................................................................42
3.1. Chassis Fabrication .........................................................................................................42
3.2. Vision, Image Acquisition and Processing .....................................................................44
3.3. Developing the Neural Network with Multi-Layer Perceptron (Method 1) ...................47
3.4. Developing the Neural Network with Self Organizing Maps (Preferred Method) .........49
3.5. Scene Identification.........................................................................................................50
3.6. Locomotion .....................................................................................................................50
3.7. Steering ...........................................................................................................................51
CHAPTER FOUR ...........................................................................................................................52
RESULTS AND DISCUSSIONS ...................................................................................................52
5.1. Simulations and Results ..................................................................................................52
5.1.1. Multi-Layer Perceptron Neural Network ................................................................52
5.1.2. Self-Organizing Maps (SOM) Neural Network ......................................................53
5.2. Observations ...................................................................................................................57
5.2.1. Front wheel actuation. .................................................................................................57
5.2.2. Processing requirement. ..............................................................................................57
5.2.3. Self-Organizing Map: A better Neural Network paradigm. ........................................57
CHAPTER FIVE ............................................................................................................................58
CONCLUSIONS AND RECOMMENDATIONS .........................................................................58
5.1. Conclusion ......................................................................................................................58
5.2. Recommendations ...........................................................................................................58
5.3. Further Works To Be Done .............................................................................................59
5.3.1. Advanced Image Processing ...................................................................................59
5.3.2. Path Planning ..........................................................................................................60
REFERENCES ...............................................................................................................................61
Appendix I: Matlab Codes for Image Acquisition and Processing .................................................66
Appendix II: Matlab Codes for Multilayer Perceptron Neural Network ........................................67
Appendix III: Matlab Codes for Self Organizing Map (SOM) Neural Network ............................70
Appendix IV: Matlab Codes Defining Functions for Various Movement Patterns ........................72
Appendix IV: Pictures Acquired For Training the Self Organizing Map (SOM) Neural Network 81
Appendix V: Self Organizing Map (SOM) Training Tool Showing 100 Iterations .......................83
Appendix VI: The Self Organizing Map (SOM) Weight Distances ...............................................84
Appendix VII: Plot Showing the SOM Sample Hits or Classes ....................................................85
Appendix VIII: Plot Showing Weight Positions of the Three Neurons ..........................................86
Appendix IX: Test Scene Images for the Neural Network .............................................................87
Appendix X: GUI showing the state of the Neural Network ..........................................................88
Appendix XI: Confusion Matrices for the Neural Network............................................................89
Appendix XII: Error and Performance Plots for the Neural Network ............................................90
Appendix XIII: Gradient Plot for each epoch of the Neural Network ............................................91
Appendix XIV: Simulation Images for the Neural Network ..........................................................92
ABSTRACT
This is a report on the development of neural network based vision for an
autonomous vehicle fabricated for my final year project. An Autonomous Vehicle is a
kind of robot that is self-acting and self-regulating and thereby reacts to its environment
without external control. The objective of the project is to implement and train an
Artificial Neural Network (ANN) to control a mobile robot based on its visual perception.
Discussions were carried out on: the types of mobile robots; systems and methods
for mobile robot navigation; and artificial intelligence methods used in mobile robot
navigation. Image processing and robot vision were also carefully considered. Special
attention was given to ANN which is an adaptive interconnected group of artificial
neurons that uses a mathematical or computational model for information processing
based on a connectionist approach to computation. The goal of the ANN is to discover
some associations between input and output patterns, or to analyse or find the structure of
the input patterns through iterative adjustment of weights of individual connection
between units.
To carry out the set objective, a model chassis for the autonomous vehicle was
fabricated. The differential-drive method was used for steering control; for driving the
wheels, servomotors (Dynamixel) were attached to the front wheels only. Also, since
autonomy is based on visual perception, a webcam is the primary sensor. This was then
integrated into Matlab for acquiring images from scenes and getting the images processed
for effective use in the Artificial Neural Network. Initially, Multi-Layer Perceptron
(MLP) was used as the Neural Network, but due to observed classification errors, the use
of Self-Organizing Maps (SOM) was then adopted. Ultimately, the outputs of the Neural
Network were used to control the car's movement and steering patterns. Analyses of the
scene-identification results obtained from the Self-Organizing Map (SOM) and the
Multi-Layer Perceptron (MLP) neural networks were carried out. These results, which are
outputs from the Neural Network, were sent as control commands to execute a particular
movement pattern.
Numerical outputs from the ANN were obtained, and their analysis was presented
in graphical form through performance and classification plots. Comparison of the results
showed that the SOM provides a better-suited Neural Network paradigm for scene
identification than the MLP. As a key result in realizing the objective of this project, the
robot was able to differentiate between three different scenes and act on them.
Although the robot can differentiate between three or more different scenes and
act on them, to achieve a satisfactory degree of autonomy, image and video processing
need to be improved upon for road identification and path-planning purposes. The robot
also needs to be given better cognitive ability by fine-tuning the neural network. Various
recommendations were given in anticipation of further work. When fully developed, the
project should serve as a model in a Control Engineering course.
LIST OF FIGURES
FIGURE 2. 1: THE BASIC NEURAL UNIT PROCESSES THE INPUT INFORMATION INTO THE
OUTPUT INFORMATION. ................................................................................................ 26
FIGURE 2. 2: DIFFERENT TYPES OF ACTIVATION FUNCTIONS: (A) THRESHOLD (B) PIECEWISE
LINEAR (C) SIGMOID AND (D) GAUSSIAN. ..................................................................... 31
FIGURE 2. 3: A TAXONOMY OF FEED-FORWARD AND RECURRENT /FEEDBACK NETWORK
ARCHITECTURES (JAIN, 1996). ..................................................................................... 32
FIGURE 2. 4: A MULTI-LAYER PERCEPTRON. ....................................................................... 33
FIGURE 3. 1: BASIC ALGORITHM FOR AUTONOMY .............................................................. 42
FIGURE 3. 2: A MORE COMPREHENSIVE ALGORITHM FOR THE ROBOT ............................... 43
FIGURE 3. 3: IMAGE SHOWING THE ROBOT CHASSIS ........................................................... 45
LIST OF TABLES
TABLE 4. 1: CORRESPONDING TARGETS AND OUTPUTS FOR TRAINING THE MULTI-
LAYER PERCEPTRON. ................................................................................................... 54
TABLE 4. 2: ERROR AND PERFORMANCE FOR TEST IMAGES CLASSIFIED. ............................. 55
TABLE 4. 3: CLASSIFICATION OF IMAGES INTO GROUPS BY THE SELF-ORGANIZING MAP. .. 56
CHAPTER ONE
INTRODUCTION
1.1. Background
An autonomous vehicle is a kind of robot that is capable of automatic navigation.
It is self-acting and self-regulating, therefore it is able to operate in and react to its
environment without outside control.
After proving to be an efficient tool for improving the quality, productivity, and
competitiveness of manufacturing organizations, robots are now expanding into service
organizations, offices, and even homes. Global competition and the drive to reduce
production costs and increase efficiency create new applications for robots that stationary
robots cannot perform. These new applications require the robots to move and perform
certain activities at the same time. Moreover, the availability and low cost of faster
processors, better programming, and the use of new hardware allow robot designers to
build more accurate, faster, and even safer robots. Currently, mobile robots are expanding
outside the confines of buildings and into rugged terrain, as well as familiar environments
like schools and city streets (Wong, 2005).
Robots must be able to understand the structure of their environment. To reach
their targets without collisions, robots must be endowed with perception, data
processing, recognition, learning, reasoning, interpreting, decision-making and action
capacities (which together constitute artificial intelligence).
Therefore, to reach a reasonable degree of autonomy, three basic requirements are
sensing, reasoning and actuation. Sensing is to be provided by a web camera that gathers
information about the robot with respect to the surrounding scene. Reasoning can be
accomplished by devising algorithms that exploit this information in order to generate
appropriate commands for the robot. Actuation is by intelligent servo-motors.
For an autonomous outdoor mobile robot, the ability to detect surrounding roads
is a vital capability. Unstructured roads are among the toughest challenges for a mobile
robot, both in terms of detection and navigation. Even though mobile robots use various
sensors to interact with their environment, the potential of cameras, a comparatively
low-cost and rich source of information, should be fully utilized (Dilan, 2010).
Road recognition using visual information is an important capability for
autonomous navigation in urban environments. Shinzato et al. (2008) presented a visual
road detection system that uses multiple Artificial Neural Networks (ANNs), similar to
MANIAC (Multiple ALVINN Networks in Autonomous Control), in order to improve
robustness. However, each ANN learns colours and textures from sub-images instead of
the whole road appearance. In their system, each ANN receives different image features
(such as averages, entropy, energy and variance from different colour channels (RGB,
HSV, and YUV)) as input from sub-images. In the final step of the algorithm, the set of
ANN outputs is combined to generate a single classification for each sub-image. This
classification provides a confidence factor for each sub-image, which can be used by the
control algorithm. The system does not need to be retrained continually; therefore, the
location of the road is not assumed.
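To make the kind of sub-image features described above concrete, the following is a minimal sketch (illustrative Python, not the Matlab pipeline of this thesis or of Shinzato et al.; the 8-bit grayscale patch given as a flat list of pixels is an assumption) computing the average, variance, energy and entropy of one sub-image:

```python
import math
from collections import Counter

def patch_features(pixels):
    """Average, variance, energy and entropy of one grayscale sub-image,
    given as a flat list of 8-bit intensity values."""
    n = len(pixels)
    mean = sum(pixels) / n
    var = sum((p - mean) ** 2 for p in pixels) / n
    # Normalized histogram over the observed intensities.
    hist = Counter(pixels)
    probs = [c / n for c in hist.values()]
    energy = sum(p * p for p in probs)               # sum of squared probabilities
    entropy = -sum(p * math.log2(p) for p in probs)  # Shannon entropy in bits
    return {"mean": mean, "variance": var, "energy": energy, "entropy": entropy}

# A perfectly uniform patch carries no texture information:
flat = patch_features([128] * 64)     # zero variance, zero entropy, energy 1
```

A feature vector of this kind, computed per sub-image and per colour channel, is what each ANN in such a system would receive as input.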
Zhu et al. (2008) proposed an integrated system called the road understanding
neural network (RUNN) for an autonomous mobile robot moving in an outdoor road
environment. The RUNN consists of two major neural network modules: a single three-
layer road classification network (RCN) to identify the road category (straight road,
intersection or T-junction), and a two-layer road orientation network (RON) for each road
category. Several design issues, including the network model, the selection of input data,
the number of hidden units and the learning problems, were adequately dealt with.
One of the most widely used artificial neural network (ANN) models is the well-known
Multi-Layer Perceptron (MLP) (Haykin, 1998). The training process of MLPs for pattern
classification problems consists of two tasks: the first is the selection of an
appropriate architecture for the problem, and the second is the adjustment of the
connection weights of the network.
Extensive research work has been conducted to attack these issues. Global search
techniques, with the ability to broaden the search space in an attempt to avoid local
minima, have been used for connection weight adjustment or architecture optimization of
MLPs; examples include evolutionary algorithms (EA) (Eiben & Smith, 2003), simulated
annealing (SA) (Kirkpatrick et al., 1983), tabu search (TS) (Glover, 1986), ant colony
optimization (ACO) (Dorigo et al., 1996) and particle swarm optimization (PSO)
(Kennedy & Eberhart, 1995).
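As a minimal illustration of global-search weight adjustment, the sketch below (illustrative Python, not an implementation from this thesis; the single linear neuron, the data, and the cooling schedule are all invented for the example) uses simulated annealing to fit the two weights of a neuron y = w·x + b:

```python
import math
import random

def sa_fit(samples, steps=5000, temp=1.0, cool=0.999):
    """Simulated annealing over the weights (w, b) of a single linear
    neuron y = w*x + b, minimizing the sum of squared errors."""
    random.seed(0)  # deterministic for illustration

    def loss(w, b):
        return sum((w * x + b - y) ** 2 for x, y in samples)

    w, b = 0.0, 0.0
    cur = loss(w, b)
    best_w, best_b, best = w, b, cur
    for _ in range(steps):
        # Propose a small random perturbation of the weights.
        nw = w + random.uniform(-0.1, 0.1)
        nb = b + random.uniform(-0.1, 0.1)
        cand = loss(nw, nb)
        # Always accept improvements; accept worse moves with a
        # temperature-dependent probability, to escape local minima.
        if cand < cur or random.random() < math.exp(-(cand - cur) / max(temp, 1e-12)):
            w, b, cur = nw, nb, cand
            if cur < best:
                best_w, best_b, best = w, b, cur
        temp *= cool
    return best_w, best_b

# Noise-free samples generated from y = 2x + 1.
data = [(x, 2 * x + 1) for x in range(-5, 6)]
w, b = sa_fit(data)
```

The same accept-or-reject loop generalizes to the full weight vector of an MLP; the point of the global techniques cited above is precisely that occasionally accepting worse solutions keeps the search from stalling in a local minimum.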
Recently, artificial neural network based methods have been applied to robotic systems.
In (Racz & Dubrawski, 1994), an ANN was trained to estimate a robot’s position relative
to a particular local object. Robot localization was achieved by using entropy nets to
implement a regression tree as an ANN in (Sethi & Yu, 1990). An ANN can also be
trained to correct pose estimates from odometry using ultrasonic sensors.
Conforth and Meng (2008) proposed an artificial neural network learning method
for mobile robot localization, which combines two popular swarm-inspired methods from
the computational intelligence area, Ant Colony Optimization (ACO) and Particle Swarm
Optimization (PSO), to train the Artificial Neural Network (ANN) models. These
algorithms have been applied already to solving problems of clustering, data mining,
dynamic task allocation, and optimization in autonomous mobile robots.
1.2. Project Description
A vehicle is to be designed which can intuitively navigate its way to destination
points, avoiding obstacles and obeying traffic rules without being pre-programmed.
Although a vehicle can be made to avoid obstacles using various common technologies
based on structured programming and standard algorithms, the intelligence of such a
vehicle will depend on whether sufficient conditions were addressed and whether the
algorithm is well designed. Afterwards, the drudgery of typing and debugging many lines
of code will have to be dealt with.
An improved technology is Artificial Intelligence, or Machine Learning. Although
some implement this technology on conventional object-oriented programming platforms,
better emerging technologies for the implementation of artificial intelligence include
Artificial Neural Networks and Support Vector Machines. The problem is to
train the car to identify objects on its own and reason based on the training it has
received.
A system is to be developed to acquire training data (images), utilize these data to
train an ANN in Matlab, and implement the converged network on a laptop controlling a
model mobile robot. The training data are collected using a laptop running Matlab that
controls the robot. The robot is steered through differential-drive control logic so
that it follows a path marked out on the floor. The primary sensor is a webcam
attached to the robot, which records images of the path to the laptop’s hard drive. The
images from the webcam are inputs to the Neural Network. These data are then used to
train a backpropagation ANN. After satisfactory ANN performance is achieved, the
converged ANN weight values are written to a file. These weight values are then read by a program
that implements a feed-forward ANN that reads webcam images, processes them,
and inputs the processed images to the ANN, which then produces corresponding
steering control signals for the differential-drive system. Simulations and controls are to be
implemented principally using the Matlab Image Processing Toolbox, the Matlab Artificial
Neural Networks Toolbox and Dynamixel control functions in Matlab.
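The control cycle described above can be sketched as follows. This is an illustrative Python fragment, not the thesis's Matlab implementation; the scene labels, command names, and toy weight vectors are assumptions made for the example:

```python
def classify(features, weights):
    """Feed-forward pass of a minimal linear classifier: one weight
    vector per scene class; the highest-scoring class wins."""
    scores = {label: sum(w * f for w, f in zip(ws, features))
              for label, ws in weights.items()}
    return max(scores, key=scores.get)

# Hypothetical mapping from identified scene class to a drive command.
COMMANDS = {"straight": "forward",
            "left_turn": "steer_left",
            "right_turn": "steer_right"}

def control_step(features, weights):
    """One control cycle: classify the processed image features,
    then emit the corresponding movement command."""
    return COMMANDS[classify(features, weights)]

# Toy 'converged' weights: each class simply responds to one feature.
weights = {"straight": [1, 0, 0],
           "left_turn": [0, 1, 0],
           "right_turn": [0, 0, 1]}
cmd = control_step([0.1, 0.9, 0.2], weights)
```

The real system replaces the toy classifier with the trained MLP or SOM, whose converged weights are loaded from the file produced during training.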
1.3. Objective of the Project
The primary objective of this project is to develop a procedure to implement and train an
Artificial Neural Network to control a mobile robot to follow a visibly distinguishable
path autonomously. The main idea is visual control of a vehicle for autonomous
navigation through a Neural Network platform (without stereotyped pre-programming).
1.4. Scope
To keep the task manageable, the problem has to be simplified and made
unambiguous. The task now is to make the vehicle take a certain action when it sees
images of specific classes. The project involves basic image acquisition and processing
for Neural Network training purposes and sending the output control commands to the
wheels of a model car. The circuitry utilized is that of the laptop computer. Separate
circuits have not been developed because ANNs require high memory capacity and
processing speed, which can only be provided by a high-performance computer.
Nevertheless, the hardware realization of ANNs is an ongoing research area.
Although allusions to and efforts towards generalization, effectiveness and
sophistication are made, the project does not treat path planning and road tracking
exhaustively.
1.5. Justification
The process of automating vehicle navigation can be broken down into four steps:
1) perceiving and modelling the environment, 2) localizing the vehicle within the
environment, 3) planning and deciding the vehicle’s desired motion and 4) executing the
vehicle’s desired motion (Wit, 2000). There has been much interest and research done in
each of these areas in the past decade. This work focuses on perceiving the environment
and deciding the vehicle’s desired motion. Further work on localization and on executing
the desired motion is ongoing. Nevertheless, more finesse has to be put into modelling
the environment and into planning and deciding the vehicle’s desired motion. The
above process is to be implemented with the Matlab Artificial Neural Network Toolbox
and Image Processing Toolbox.
Apart from the fact that the project arouses interest in solving the research
problems, the robot developed can be used for the scholastic purpose of concept
illustration, particularly in Control and Instrumentation Engineering courses.
CHAPTER TWO
LITERATURE REVIEW
2.1. Mobile Robots
2.1.1. Types of Mobile Robots
Many different types of mobile robots have been developed depending on the
kind of application, the velocity, and the type of environment, whether it is water,
space, or terrain with fixed or moving obstacles. Four major categories have been
identified (Dudek & Jenkin, 2000):
Terrestrial or ground-contact robots: The most common ones are the
wheeled robots; others are tracked vehicles and limbed vehicles, explained below:
1. Wheeled robots: Wheeled robots exploit friction or ground contact to
enable the robot to move. Different kinds of wheeled robots exist: the
differential drive robot, synchronous drive robot, steered wheels robots and
Ackerman steering (car drive) robots, the tricycle, bogey, and bicycle drive
robots, and robots with complex or compound or omnidirectional wheels.
2. Tracked vehicles: Tracked vehicles are robust to almost any terrain;
their construction is similar to that of the differential-drive robot, but the two
differential wheels are extended into treads which provide a large contact
area and enable the robot to navigate through a wide range of terrain.
3. Limbed vehicles: Limbed vehicles are suitable in rough terrains such as
those found in forests, near natural or man-made disasters, or in planetary
exploration, where ground contact support is not available for the entire
path of motion. Limbed vehicles are characterized by the design and the
number of legs. The minimum number of legs needed for a robot to move is
one; to be supported, a robot needs at least three legs, and four legs are
needed for a statically stable robot. Robots with six, eight, and twelve legs exist.
Aquatic robots: These operate on the water surface or underwater. Most use
water jets or propellers. Aquatic vehicles support propulsion by utilizing the
surrounding water. There are two common structures (Dudek & Jenkin, 2000):
torpedo-like structures (Ferguson & Pope, 1995, Kloske et al., 1993), in which a
single propeller provides forward and reverse thrust while the navigation
direction is controlled by the control surfaces and the buoyancy of the vessel
controls the depth. The disadvantage of this type is poor manoeuvrability.
Airborne robots: Flying robots such as robotic helicopters, fixed-wing aircraft,
robotically controlled parachutes, and dirigibles. The following
subclassifications are given according to Dudek and Jenkin (2000):
1. Fixed-wing autonomous vehicles: These utilize control systems very
similar to those found in commercial autopilots. A ground station can
provide remote commands if needed, and with the help of the Global
Positioning System (GPS) the location of the vehicle can be determined.
2. Automated helicopters (Baker et al., 1992, Lewis et al., 1993): These use
onboard computation and sensing as well as ground control; their control is
very difficult compared to that of fixed-wing autonomous vehicles.
3. Buoyant vehicles (aerobots, aerovehicles, or blimps): These vehicles can
float and are characterized by a high energy-efficiency ratio, long-range
travel and duty cycle, and vertical mobility, and they usually have no
disastrous results in case of failure.
4. Unpowered autonomous flying vehicles: These vehicles reach their desired
destination by utilizing gravity, GPS, and other sensors.
Space robots: These are designed to operate in the microgravity of outer
space and are typically envisioned for space station maintenance. Space robots
either move by climbing or are independently propelled. They are needed for
applications related to space stations, such as construction, repair, and
maintenance. Free-flying systems have been proposed in which the spacecraft is
equipped with thrusters and one or more manipulators; the thrusters are
utilized to modify the robot’s trajectory.
2.1.2. Systems and Methods for Mobile Robot Navigation
Navigation is the major challenge for autonomous mobile robots; a
navigation system is the method for guiding a vehicle. Several capabilities are
needed for autonomous navigation (Alhaj Ali, 2003):
• The ability to execute elementary goal achieving actions such as going to a given
location or following a leader;
• The ability to react to unexpected events in real time such as avoiding a suddenly
appearing obstacle;
• The ability to formulate a map of the environment;
• The ability to learn, which might include noting the location of an obstacle and
the three-dimensional nature of the terrain, and adapting the drive torque to the
inclination of hills (Golnazarian & Hall, 2000).
The following basic systems and methods have been identified for mobile
robot navigation:
1. Odometry and other dead-reckoning methods: These methods use encoders
to measure wheel rotation and/or steering orientation.
2. Vision-based navigation: Computer vision and image sequence techniques
have been proposed for obstacle detection and avoidance for autonomous land
vehicles that navigate in outdoor road environments. The object shape
boundary is first extracted from the image; after the translation from the
vehicle location in the current cycle to that in the next cycle, the position of the
object shape in the next-cycle image is predicted and then matched
with the extracted shape of the object in that image, to decide
whether the object is an obstacle (Alhaj Ali, 2003, Chen & Tsai, 2000).
3. Sensor-based navigation: Sensor-based navigation systems that rely on sonar
or laser scanners providing one-dimensional distance profiles have been
used for collision and obstacle avoidance. A general adaptable control
structure is also required. The mobile robot must make decisions on its
navigation tactics: which information to use to modify its position,
which path to follow around obstacles, when stopping is the safest alternative,
and in which direction to proceed when no path is given. In addition, sensor
information can be used for constructing maps of the environment for short-
term reactive planning and long-term environmental learning.
4. Inertial navigation: This method uses gyroscopes and sometimes
accelerometers to measure the rate of rotation and acceleration.
5. Active beacon navigation systems: This method computes the absolute
position of the robot by measuring the direction of incidence of three or
more actively transmitted beacons. The transmitters, usually using light or
radio frequencies must be located at known sites in the environment (Janet,
1997, Premvuti & Wang, 1996, Alhaj Ali, 2003).
6. Landmark navigation: In this method distinctive artificial landmarks are
placed at known locations in the environment to be detected even under
adverse environmental conditions.
7. Map-based positioning: In this method information acquired from the robot's
onboard sensors is compared to a map or world model of the environment. The
vehicle's absolute location can be estimated if features from the sensor-based
map and the world model map match.
8. Biological navigation: Biologically-inspired approaches have been utilized
in the development of intelligent adaptive systems; biomimetic systems
provide a real-world test of biological navigation behaviours, besides making
new navigation mechanisms available for indoor robots.
9. Global positioning system (GPS): This system provides specially coded
satellite signals that can be processed in a GPS receiver, enabling it to compute
position, velocity, and time.
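As an illustration of method 1 above (odometry and dead reckoning), the pose update for a differential-drive vehicle can be sketched as follows. This is a minimal sketch, assuming a two-wheeled platform; the function and parameter names are illustrative and not taken from any cited system.

```python
import math

def dead_reckon(pose, d_left, d_right, wheel_base):
    """Update an (x, y, heading) pose from incremental wheel-encoder
    distances using the standard differential-drive odometry equations.
    All names here are illustrative assumptions."""
    x, y, theta = pose
    d_center = (d_left + d_right) / 2.0        # distance travelled by midpoint
    d_theta = (d_right - d_left) / wheel_base  # change in heading
    x += d_center * math.cos(theta + d_theta / 2.0)
    y += d_center * math.sin(theta + d_theta / 2.0)
    return (x, y, theta + d_theta)

# Straight-line motion: both wheels advance 1.0 m, so heading is unchanged.
pose = dead_reckon((0.0, 0.0, 0.0), 1.0, 1.0, 0.5)
```

Because the update integrates encoder increments, small measurement errors accumulate over time, which is why dead reckoning is usually combined with the absolute positioning methods listed below it.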
2.1.3. Mobile Robots and Artificial Intelligence
While robotics research has mainly been concerned with vision (eyes) and
tactile sensing (touch), some problems of adapting, reasoning, and responding
to a changed environment have been solved with the help of artificial
intelligence, using heuristic methods such as ANNs.
Neural computers have been suggested to provide a higher level of
intelligence that allows the robot to plan its action in a normal environment as
well as to perform non-programmed tasks (Golnazarian & Hall, 2000, Alhaj Ali,
2003).
A well-established field in the discipline of control systems is
intelligent control, which represents a generalization of the concept of control
to include autonomous anthropomorphic interactions of a machine with the
environment (Alhaj Ali, 2003; Meystel & Albus, 2002). Meystel and Albus (2002)
defined intelligence as “the ability of a system to act appropriately in an uncertain
environment, where an appropriate action is that which increases the probability
of success, and success is the achievement of the behavioural sub goals that
support the system’s ultimate goal”. The intelligent systems act so as to maximize
this probability. Both goals and success criteria are generated in the environment
external to the intelligent system. At a minimum, the intelligent system must be
able to sense the environment (by the use of sensors), then perceive and
interpret the situation in order to make decisions (by the use of analyzers),
and finally implement the proper control actions (by using actuators or
drives).
Higher levels of intelligence require the abilities to recognize objects and
events, store and use knowledge about the world, learn, and reason about and
plan for the future. Advanced forms of intelligence have the ability to perceive
and analyze, to plan and scheme, to choose wisely, and to plan successfully in a
complex, competitive, and hostile world (Alhaj Ali, 2003).
Intelligent behaviour is crucial to mobile robots; it could be supported by
connecting perception to action (Kortenkamp et al., 1998). In the following, a
brief review of the literature on the use of artificial intelligence in mobile
robots is presented (Alhaj Ali, 2003).
2.1.3.1. Use of Artificial Neural Networks (ANN)
ANNs have been applied to mobile robot navigation, and have been considered
for applications that focus on recognition and classification of path features
during navigation. Kurd and Oguchi (1997) proposed the use of a neural network
controller, trained using supervised learning, as an indirect controller to
obtain the best control parameters for the main controller in use with respect
to the position of an Autonomous Ground Vehicle (AGV). A method that uses incremental
learning and classification based on a self-organizing ANN is described by
Vercelli and Morasso (1998). Xue and Cheung (1996) proposed a neural network
control scheme for controlling active suspension. The presented controller used a
multi-layer neural network and a prediction-correction method for adjusting
learning parameters. Dracopoulos (1998) presented the application of multi-layer
perceptrons to the robot path planning problem and in particular to the task of
maze navigation.
Zhu et al. (1998) presented results of integrating omni-directional view
image analysis and a set of adaptive networks to understand the outdoor road
scene by a mobile robot (Alhaj Ali, 2003). To navigate and recognize where it is,
a mobile robot must be able to identify its current location. The more the robot
knows about its environment, the more efficiently it can operate (Cicirelli, 1998).
Grudic and Lawrence (1998) used a nonparametric learning algorithm to build a
robust mapping between an image obtained from a mobile robot’s on-board
camera, and the robot’s current position. It used the learning data obtained from
these raw pixel values to automatically choose a structure for the mapping without
human intervention, or any prior assumptions about what type of image features
should be used (Alhaj Ali, 2003).
2.1.3.2. Use of Fuzzy Logic
Fuzzy logic and fuzzy languages have also been used in navigation
algorithms for mobile robots as described in (Wijesoma et al., 1999, Mora and
Sanchez, 1998). Lin and Wang (1997) proposed a fuzzy logic approach to guide an
AGV from a starting point toward the target without colliding with any static
or moving obstacles; they also studied other issues such as sensor
modelling and trap recovery. Kim and Hyung (1998) used fuzzy multiattribute
decision-making in deciding which via-point the robot should proceed to at each
step. The via-point is a local target point for the robot’s movement at each
decision step. A set of candidate via-points is constructed at various headings and
velocities. Watanabe et al. (1998) described a method using a fuzzy logic model
for the control of a time varying rotational angle in which multiple linear models
are obtained by utilizing the original nonlinear model at some representative
angles (Alhaj Ali, 2003).
2.1.3.3. Use of Neural Integrated Fuzzy Controller
A neural integrated fuzzy controller (NiF-T) that integrates the fuzzy logic
representation of human knowledge with the learning capability of neural
networks has been developed for nonlinear dynamic control problems (Alhaj Ali,
2003). Daxwanger and Schmidt (1998) presented their neuro-fuzzy approach for
visual guidance of a mobile robot vehicle.
2.2. Autonomous Vehicles
2.2.1. Road Recognition and Following
The road recognition, detection, and following problem for autonomous
vehicles (also known as unmanned vehicles or wheeled robots) has been an active
research area for the past several decades. Road detection for mobile robots is
required in environments that are dangerous for human beings. Moreover,
it can be used to assist humans while driving or operating a vehicle.
Road detection is an important requirement for autonomous navigation
even in the presence of assisting technologies such as GPS. Road recognition can
be performed using sensors such as laser sensors and omnivision cameras, and
several algorithms and applications developed in the literature offer
satisfactory solutions. However, most of these solutions cannot be
applied to all types of roads a mobile robot has to deal with during autonomous
navigation. With regard to their setting, roads can be classified into two
groups: structured and unstructured roads.
Research on road detection for structured roads (i.e. asphalt roads) has
produced well-working solutions. Satisfactory unstructured road detection
algorithms using several sensors other than vision sensors are also available in
the literature. However, unstructured road detection through the sole use of
vision sensors is still an open research area.
The problem of road detection can, to a large extent, be solved by
tackling the issue of pattern recognition. Pattern recognition is a process of
description, grouping, and classification of patterns. In terms of information
availability, there are two general paradigms for pattern recognition:
supervised and unsupervised schemes. A supervised scheme identifies an unknown
pattern as a member of a predefined class, while an unsupervised scheme groups
input patterns into a number of clusters defined as classes.
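The distinction between the two paradigms can be sketched in a few lines of toy code. This is purely illustrative; the class names, feature values, and grouping rule are assumptions, not taken from the cited literature.

```python
def nearest_mean_classify(x, class_means):
    """Supervised: assign feature x to the predefined class whose mean
    is closest (a nearest-class-mean classifier)."""
    return min(class_means, key=lambda c: abs(x - class_means[c]))

def two_cluster(points):
    """Unsupervised: group 1-D points into two clusters by proximity to
    the two extreme values (a crude one-pass grouping)."""
    lo, hi = min(points), max(points)
    return [0 if abs(p - lo) <= abs(p - hi) else 1 for p in points]

# Supervised: the class labels ("road", "not_road") are known in advance.
label = nearest_mean_classify(2.1, {"road": 2.0, "not_road": 8.0})
# Unsupervised: no labels; the data are grouped into clusters.
clusters = two_cluster([1.0, 1.2, 8.9, 9.1])
```

The supervised call needs predefined classes with known statistics, whereas the unsupervised call discovers the grouping from the data alone, mirroring the two schemes described above.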
Automatic pattern recognition has two primary tasks: feature extraction
and classification. Classical pattern recognition techniques include feature
extraction and dimensionality reduction.
Examples of classifiers include:
1. Bayesian optimal classifier
2. Exemplar classifier
3. Space partition methods
4. Neural Networks.
Generic feature extraction methods include:
1. Wavelet based analysis
2. Invariant moments
3. Entropy
4. Cepstrum analysis
5. Fractal dimension
Methods of algorithm selection are based on image preprocessing,
pattern recognition using geometric algorithms, line detection, extraction of
curved lines, semantic retrieval by spatial relationships, and structural object
recognition using shape-from-shading (Funtanilla, 2008).
Generally speaking, a complete pattern recognition system employs a
sensor, a feature extraction mechanism, and a classification scheme. Pattern
recognition falls under the artificial intelligence and data processing fields,
where the usual approaches are statistical (or decision-theoretic), syntactic
(or structural), or neural. The statistical approach is based on patterns
generated by a probabilistic system or from statistical analysis. The syntactic
or structural approach is based on structural relationships of features. The
neural approach uses a neural computing environment with a neural network
structure.
The camera serves as the most common sensor system. The digital images
taken remotely from this sensor act as the objects from which features are
established and significant patterns extracted. A pattern is defined as an
arrangement of descriptors (length, diameter, shape numbers, regions); a feature
denotes a descriptor, and a class is defined by the family of patterns that
share a set of common properties. The two principal arrangements used in
computerized pattern recognition with MATLAB programming are defined in
Gonzalez et al. (2004) as vectors, for quantitative descriptions (decision-theoretic),
and strings, for structural descriptions or recognition (represented by symbolic
information properties and relationships).
Quantitative descriptors such as length, area and texture fall in the area of
decision theoretic computerized pattern recognition system. Image pre-processing
techniques, such as image conversion, edge detection, image restoration and
image segmentation, are important prerequisites to computerized image
processing. MATLAB implements point, line and peak detection in the image
segmentation process. The segmentation process continues until the level of
detail needed to identify the element (point, line, or peak) has been isolated,
which is limited by the choice of imaging sensor in remote processing
applications (Gonzalez et al., 2004).
To complete the process for an efficient pattern recognition system, Baker
(1996) developed a pattern rejector algorithm based on object recognition and
local feature detection.
Another method of pattern recognition was developed by Zhang et al.
(2008), wherein image splicing detection is treated as a two-class pattern
recognition problem that builds the model using moment features and some
image quality metrics (IQMs) extracted from the given test image.
2.3. Vision and Image Processing
Digital image processing is the process of transforming digital information
(images), for the following reasons:
1. To improve pictorial information for human interpretation, through:
noise removal;
corrections for motion, camera position and distortion;
enhancements by changing contrast and colour.
2. To process pictorial information by machine, through:
segmentation - dividing an image up into constituent parts;
representation - representing an image by some more abstract models;
classification.
3. To reduce the size of image information for efficient handling, through:
compression with loss of digital information that minimizes the loss of
"perceptual" information (JPEG, GIF, MPEG);
multiresolution representations versus quality of service.
2.3.1. Pixels:
Photographs, for example, are described by breaking an image up into a mosaic of
colour squares (pixels). Depending on their final destination, the number of
pixels used per inch (PPI or DPI) varies.
MATLAB stores most images as two-dimensional arrays (i.e., matrices), in
which each element of the matrix corresponds to a single pixel in the displayed
image. For example, an image composed of 200 rows and 300 columns of
different coloured dots would be stored in MATLAB as a 200-by-300 matrix.
Some images, such as RGB, require a three-dimensional array, where the first
plane in the 3rd dimension represents the red pixel intensities, the second plane
represents the green pixel intensities, and the third plane represents the blue pixel
intensities.
To reduce memory requirements, MATLAB supports storing image data in
arrays of class uint8 and uint16. The data in these arrays is stored as 8-bit
or 16-bit unsigned integers. These arrays require one-eighth or one-fourth as much
memory as data in double arrays. An image whose data matrix has class uint8 is
called an 8-bit image; an image whose data matrix has class uint16 is called a 16-
bit image.
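The memory saving described above can be checked directly. The following sketch uses Python's standard array module in place of MATLAB arrays; this substitution is an assumption made purely for illustration.

```python
from array import array

n_pixels = 200 * 300                      # a 200-by-300 grayscale image
img_u8 = array('B', bytes(n_pixels))      # 8-bit unsigned integers (uint8)
img_f64 = array('d', [0.0]) * n_pixels    # double-precision values

bytes_u8 = len(img_u8) * img_u8.itemsize    # 1 byte per pixel
bytes_f64 = len(img_f64) * img_f64.itemsize # 8 bytes per pixel
ratio = bytes_f64 // bytes_u8               # doubles need 8x the memory
```

The 8-bit array occupies one-eighth of the memory of the double-precision array, which is exactly the saving the text attributes to uint8 storage.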
2.3.2. Indexed Images
An indexed image consists of a data matrix, X, and a colormap matrix,
map. map is an m-by-3 array of class double containing floating-point values in
the range [0, 1]. Each row of map specifies the red, green, and blue components
of a single colour. An indexed image uses "direct mapping" of pixel values to
colormap values. The colour of each image pixel is determined by using the
corresponding value of X as an index into map. The value 1 points to the first row
in map, the value 2 points to the second row, and so on. You can display an
indexed image with the statements:
image(X); colormap(map)
A colormap is often stored with an indexed image and is automatically
loaded with the image when you use the imread function. However, you are not
limited to the default colormap; you can use any colormap that you choose.
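The direct-mapping scheme can be sketched outside MATLAB as well. This is a Python sketch; the colormap rows and pixel values are illustrative assumptions.

```python
# map: each row gives (R, G, B) components in [0, 1].
cmap = [(0.0, 0.0, 0.0),   # row 1: black
        (1.0, 0.0, 0.0),   # row 2: red
        (1.0, 1.0, 1.0)]   # row 3: white

# X: the data matrix of 1-based row indices into the colormap.
X = [[1, 2],
     [3, 1]]

# Direct mapping: pixel value k selects row k of the colormap
# (k - 1 here, because Python lists are 0-based while map rows are 1-based).
rgb = [[cmap[v - 1] for v in row] for row in X]
```

The pixel holding the value 2 comes out red, exactly as "the value 2 points to the second row" describes.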
2.3.3. Methods in Image Processing
These methods are used to acquire images through the webcam and
convert them to forms which can be used by the neural network. The image
processing activity can be broken down into the following sub-steps.
1. Image acquisition: An image is acquired into the system as an input
through the digital web camera. This image should have a specific format,
for example, BMP format, and a determined size, such as 30 x 20 pixels.
2. Image pre- and post-processing: The preprocessing stage involves:
a. Binarization: the conversion of the raw images acquired to
gray-scale and then to a binary image by choosing a threshold value
from the gray-scale elements.
b. Morphological operators: these remove isolated specks and holes in
the binary images; the majority operator can be used.
c. Noise removal: reducing noise in an image. In on-line mode there is
no noise to eliminate, so noise removal is unnecessary. In off-line
mode, noise may come from surface roughness and tiny particles of
dirt or debris moving with the breeze.
Other processing operations that can be carried out include:
Contrast enhancement
De-blurring
Region-based processing
Linear and non-linear filtering
3. Image analysis: This includes, amongst others:
a. Edge detection
b. Segmentation
2.4. Artificial Neural Networks
2.4.1. Introduction
Artificial neural networks are made up of interconnecting artificial neurons
(programming constructs that mimic the properties of biological neurons).
Artificial neural networks may either be used to gain an understanding of
biological neural networks, or for solving artificial intelligence problems without
necessarily creating a model of a real biological system. The real, biological
nervous system is highly complex and includes some features that may seem
superfluous based on an understanding of artificial networks.
In general, a biological neural network is composed of a group or groups
of chemically connected or functionally associated neurons. A single neuron may
be connected to many other neurons and the total number of neurons and
connections in a network may be extensive. Connections, called synapses, are
usually formed from axons to dendrites, though dendrodendritic microcircuits and
other connections are possible. Apart from the electrical signaling, there are other
forms of signaling that arise from neurotransmitter diffusion, which have an effect
on electrical signaling. As such, neural networks are extremely complex.
Artificial intelligence and cognitive modeling try to simulate some
properties of neural networks. Though similar in their techniques, the former has the
aim of solving particular tasks, while the latter aims to build mathematical models
of biological neural systems. In the artificial intelligence field, artificial neural
networks have been applied successfully to speech recognition, image analysis
and adaptive control, in order to construct software agents (in computer and video
games) or autonomous robots. Most of the currently employed artificial neural
networks for artificial intelligence are based on statistical estimation, optimization
and control theory. Artificial intelligence, cognitive modeling, and neural
networks are information processing paradigms inspired by the way biological
neural systems process data.
A neural network (NN), in the case of artificial neurons called artificial
neural network (ANN) or simulated neural network (SNN), is an interconnected
group of natural or artificial neurons that uses a mathematical or computational
model for information processing based on a connectionist approach to
computation. In most cases an ANN is an adaptive system that changes its
structure based on external or internal information that flows through the network.
In more practical terms neural networks are non-linear statistical data modeling or
decision making tools. They can be used to model complex relationships between
inputs and outputs or to find patterns in data.
According to Abdi (1999), neural networks are adaptive statistical models
based on an analogy with the structure of the brain. They are adaptive because
they can learn to estimate the parameters of some population using a small number
of exemplars (one or a few) at a time. They do not differ essentially from standard
statistical models. For example, one can find neural network architectures akin to
discriminant analysis, principal component analysis, logistic regression, and other
techniques. (Jordan & Bishop, 1996).
Many neural network methods can be viewed as generalizations of
classical pattern-oriented techniques in statistics and the engineering areas of
signal processing, system identification, optimization, and control theory. There
are also ties to parallel processing, VLSI design, and numerical analysis.
A neural network is first and foremost a graph, with patterns represented in
terms of numerical values attached to the nodes of the graph and transformations
between patterns achieved via simple message-passing algorithms. Certain of the
nodes in the graph are generally distinguished as being input nodes or output
nodes, and the graph as a whole can be viewed as a representation of a
multivariate function linking inputs to outputs. Numerical values (weights) are
attached to the links of the graph, parameterizing the input/output function and
allowing it to be adjusted via a learning algorithm.
A broader view of neural network architecture involves treating the
network as a statistical processor, characterized by making particular probabilistic
assumptions about data. Patterns appearing on the input nodes or the output nodes
of a network are viewed as samples from probability densities, and a network is
viewed as a probabilistic model that assigns probabilities to patterns. However, the
paradigm of neural networks (i.e., implicit, not explicit, learning is stressed)
seems to correspond more to some kind of natural intelligence than to
traditional Artificial Intelligence, which would stress, instead, rule-based learning.
Neural networks usually organize their units (called neurons) into several
layers. The first layer is called the input layer, the last one the output layer. The
intermediate layers (if any) are called the hidden layers. The information to be
analyzed is fed to the neurons of the first layer and then propagated to the neurons
of the second layer for further processing. The result of this processing is then
propagated to the next layer and so on until the last layer. Each unit receives some
information from other units (or from the external world through some devices)
and processes this information, which will be converted into the output of the unit.
The goal of the network is to learn or to discover some association
between input and output patterns, or to analyze, or to find the structure of the
input patterns. The learning process is achieved through the modification of the
connection weights between units. In statistical terms, this is equivalent to
interpreting the value of the connections between units as parameters (e.g.,
like the values of a and b in the regression equation y = a + b*x) to be
estimated. The learning process specifies the "algorithm" used to estimate the parameters.
2.4.2. Neural Network Architecture
Neural networks are made of basic units (Figure 2.1) arranged in layers. A
unit collects information provided by other units (or by the external world) to
which it is connected with weighted connections called synapses. These weights,
called synaptic weights, multiply (i.e., amplify or attenuate) the input information:
a positive weight is considered excitatory, a negative weight inhibitory.
Figure 2.1: The basic neural unit processes the input information into the output
information.
Each of these units is a simplified model of a neuron and transforms its
input information into an output response. This transformation involves two steps:
first, the activation of the neuron is computed as the weighted sum of its inputs,
and second, this activation is transformed into a response by using a transfer
function.
The three basic Neural Network architectures are:
1. Feed-forward networks: All signals flow in one direction only, i.e. from lower
layers (input) to upper layers (output).
2. Recurrent (feed-back) networks: Signals from neurons in upper layers are fed
back to neurons in either their own or lower layers.
3. Cellular: Neurons are connected in a cellular manner.
2.4.3. Transfer Functions
If each input is denoted $x_i$ and each weight $w_i$, then the activation is equal
to $a = \sum_i w_i x_i$, and the output, denoted $o$, is obtained as $o = f(a)$.
Different transfer (or activation) functions, $f(\cdot)$, exist for transforming
the weighted sum of the inputs to outputs. The most commonly used ones are
enumerated below (refer to the graphs in figure 2.2):
1. Threshold (sgn) function: $f(a) = +1$ if $a \ge 0$, and $f(a) = -1$ otherwise.
2. Piecewise linear function: $f(a) = a$ for $-1 \le a \le 1$, saturating at $-1$ and $+1$ outside this range.
3. Linear function: $f(a) = a$.
4. Sigmoid function: $f(a) = \dfrac{1}{1 + e^{-a}}$.
5. Gaussian function: $f(a) = \dfrac{1}{\sigma\sqrt{2\pi}} \exp\left(-a^2 / 2\sigma^2\right)$.
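A hedged sketch of a single unit with a selection of these transfer functions follows. Python is used here purely for illustration; the function names and example weights are assumptions, not from the text.

```python
import math

def threshold(a):            # sgn-type unit: fires +1 or -1
    return 1.0 if a >= 0 else -1.0

def piecewise_linear(a):     # linear in [-1, 1], clipped outside
    return max(-1.0, min(1.0, a))

def sigmoid(a):              # logistic function, output in (0, 1)
    return 1.0 / (1.0 + math.exp(-a))

def unit_output(inputs, weights, f):
    """Compute the unit's activation (weighted sum of inputs) and pass
    it through the chosen transfer function f."""
    a = sum(x * w for x, w in zip(inputs, weights))
    return f(a)

o = unit_output([1.0, 0.5], [0.4, -0.8], threshold)
```

The two steps described above (weighted sum, then transfer function) are visible directly in `unit_output`, and swapping `f` changes the unit's response shape without touching the activation computation.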
2.4.4. Learning rules
Neural networks are adaptive statistical devices. This means that they can
change iteratively the values of their parameters (i.e., the synaptic weights) as a
function of their performance. These changes are made according to learning rules
which can be characterized as supervised (when a desired output is known and
used to compute an error signal) or unsupervised (when no such error signal is
used). The types of learning are:
2.4.4.1. Supervised Learning:
The Widrow-Hoff rule (a.k.a. gradient descent or the delta rule) is the most widely
known supervised learning rule. It uses the difference between the actual output of
the cell and the desired output as an error signal for units in the output layer. Units
in the hidden layers cannot directly compute their error signal but estimate it as a
function (e.g., a weighted average) of the error of the units in the following layer.
This adaptation of the Widrow-Hoff learning rule is known as error
backpropagation. With Widrow-Hoff learning, the correction to the synaptic
weights is proportional to the error signal multiplied by the value of the activation
given by the derivative of the transfer function. Using the derivative has the effect
of making finely tuned corrections when the activation is near its extreme values
(minimum or maximum) and larger corrections when the activation is in its middle
range. Each correction has the immediate effect of making the error signal smaller
if a similar input is applied to the unit.
In general, supervised learning rules implement optimization algorithms
akin to descent techniques because they search for a set of values for the free
parameters (i.e., the synaptic weights) of the system such that some error function
computed for the whole network is minimized.
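A minimal sketch of the Widrow-Hoff update for a single linear output unit follows; the learning rate and training example are illustrative assumptions.

```python
def widrow_hoff_step(weights, x, target, rate=0.1):
    """One Widrow-Hoff (delta rule) update for a single linear unit:
    w_i <- w_i + rate * (target - output) * x_i.
    The correction is proportional to the error signal."""
    output = sum(w * xi for w, xi in zip(weights, x))
    error = target - output
    return [w + rate * error * xi for w, xi in zip(weights, x)]

# Repeatedly presenting the same example drives the error toward zero.
w = [0.0, 0.0]
for _ in range(200):
    w = widrow_hoff_step(w, [1.0, 2.0], 1.0)
output = w[0] * 1.0 + w[1] * 2.0   # approaches the target of 1.0
```

Each correction makes the error smaller when a similar input is applied again, which is exactly the property the paragraph above ascribes to the rule.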
2.4.4.2. Unsupervised Learning
The Hebbian rule is the most widely known unsupervised learning rule; it
is based on work by the Canadian neuropsychologist Donald Hebb, who theorized
that neuronal learning (i.e., synaptic change) is a local phenomenon expressible in
terms of the temporal correlation between the activation values of neurons.
The synaptic change depends on both presynaptic and postsynaptic
activities: the change in a synaptic weight is a function of the temporal
correlation between the presynaptic and postsynaptic activities. Specifically,
the value of the synaptic weight between two neurons increases whenever they are
in the same state, and decreases when they are in different states.
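The Hebbian rule can be sketched in a few lines; the +1/-1 state encoding and the learning rate are illustrative assumptions.

```python
def hebbian_step(w, pre, post, rate=0.5):
    """Hebbian update: the weight changes with the correlation between
    presynaptic and postsynaptic activity (states encoded as -1/+1).
    Same state -> product +1 -> weight increases;
    different states -> product -1 -> weight decreases."""
    return w + rate * pre * post

w1 = hebbian_step(0.0, +1, +1)   # neurons in the same state: weight grows
w2 = hebbian_step(w1, +1, -1)    # different states: weight shrinks back
```

No error signal or target output appears anywhere in the update, which is what makes the rule unsupervised.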
2.4.5. Some important neural network paradigms
Neural network paradigms are formulated by a combination of network
architecture (or model) and a learning rule with some modifications. Refer to
figure 2.3 for the different learning paradigms.
One of the most popular paradigms in neural networks is the multi-layer perceptron
(Figure 2.4). Most of the networks with this architecture use the Widrow-Hoff rule
as their learning algorithm and the logistic function as the transfer function of the
units of the hidden layer (the transfer function is in general non-linear for these
neurons). These networks are very popular because they can approximate any
multivariate function relating the input to the output. In a statistical framework,
these networks are akin to multivariate non-linear regression. When the input
patterns are the same as the output patterns, these networks are called auto-
associators. They are closely related to linear (if the hidden units are linear) or
non-linear principal component analysis and other statistical techniques linked to
the general linear model (see Abdi et al., 1996), such as discriminant analysis or
correspondence analysis.
Figure 2.2: Different types of activation functions: (a) threshold, (b) piecewise
linear, (c) sigmoid and (d) Gaussian.
Figure 2.3: A taxonomy of feed-forward and recurrent/feedback network
architectures (Jain, 1996).
A recent development generalizes the radial basis function (rbf) networks
(Abdi et al, 1999) and integrates them with statistical learning theory (Vapnik,
1999) under the name of support vector machine or SVM (see Schölkopf et al,
2003). In these networks, the hidden units (called the support vectors) represent
possible (or even real) input patterns, and their response is a function of their
similarity to the input pattern under consideration. The similarity is evaluated by a
kernel function (e.g., dot product; in the radial basis function the kernel is the
Gaussian transformation of the Euclidean distance between the support vector and
the input). In the specific case of rbf networks, the outputs of the units of the
hidden layers are connected to an output layer composed of linear units. In fact,
these networks work by breaking the difficult problem of nonlinear
approximation into two simpler ones. The first step is a simple nonlinear
mapping (the Gaussian transformation of the distance from the kernel to the input
pattern), the second step corresponds to a linear transformation from the hidden
layer to the output layer. Learning occurs at the level of the output layer. The main
difficulty with these architectures resides in the choice of the support vectors and
the specific kernels to use. These networks are used for pattern recognition,
classification, and for clustering data.
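The two-step structure described above (a Gaussian hidden layer over the support vectors, followed by a linear output layer) can be sketched as follows. This is a one-dimensional toy example; the centres, output weights, and kernel width are illustrative assumptions.

```python
import math

def rbf_output(x, centers, out_weights, width=1.0):
    """Step 1 (nonlinear): each hidden unit responds with the Gaussian
    transformation of its distance to the input.
    Step 2 (linear): the hidden responses are combined by a linear
    output layer, where learning would take place."""
    hidden = [math.exp(-((x - c) ** 2) / (2 * width ** 2)) for c in centers]
    return sum(h * w for h, w in zip(hidden, out_weights))

# A hidden unit centred exactly on the input responds maximally (1.0).
y = rbf_output(2.0, centers=[2.0, 5.0], out_weights=[1.0, 0.0])
```

The hard design decisions the text mentions, choosing the support vectors (`centers`) and the kernel, sit outside this function; only the output-layer weights would be learned.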
2.4.6. Validation
From a statistical point of view, neural networks represent a class of
nonparametric adaptive models. In this framework, an important issue is to
evaluate the performance of the model. This is done by separating the data into
two sets: the training set and the testing set. The parameters (i.e., the value of the
synaptic weights) of the network are computed using the training set. Then
learning is stopped and the network is evaluated with the data from the testing set.
This cross-validation approach is akin to the bootstrap or the jackknife.
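The training/testing separation can be sketched as follows; the 70/30 split fraction is an illustrative assumption.

```python
def split(data, train_fraction=0.7):
    """Separate the data into two sets: the synaptic weights are fitted
    on the training set, and the network is then evaluated on the
    held-out testing set it has never seen."""
    cut = int(len(data) * train_fraction)
    return data[:cut], data[cut:]

samples = list(range(10))   # stand-in for labelled patterns
train, test = split(samples)
```

Because the testing set plays no part in fitting the weights, the error measured on it estimates how the network will behave on genuinely new patterns, which is the point of the validation step described above.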
The utility of artificial neural network models lies in the fact that they can
be used to infer a function from observations and also to use it. This is particularly
useful in applications where the complexity of the data or task makes the design of
such a function by hand impractical.
2.4.7. Real life applications
The tasks to which artificial neural networks are applied tend to fall within
the following broad categories:
Function approximation, or regression analysis, including time series
prediction and modeling.
Classification, including pattern and sequence recognition, novelty
detection and sequential decision making.
Data processing, including filtering, clustering, blind signal separation and
compression.
Application areas of ANNs include system identification and control
(vehicle control, process control), game-playing and decision making
(backgammon, chess, racing), pattern recognition (radar systems, face
identification, object recognition, etc.), sequence recognition (gesture, speech,
handwritten text recognition), medical diagnosis, financial applications, data
mining (or knowledge discovery in databases, "KDD"), visualization and e-mail
spam filtering.
Computational devices have been created in CMOS for both biophysical
simulation and neuromorphic computing. More recent efforts show promise for
creating nanodevices for very large scale principal components analyses and
convolution (Yang et al, 2008). If successful, these efforts could usher in a new
era of neural computing that is a step beyond digital computing, because it
depends on learning rather than programming and because it is fundamentally
analog rather than digital even though the first instantiations may in fact be with
CMOS digital devices (Strukov et al, 2008).
2.4.8. Case Study of Two Neural Network Paradigms
2.4.8.1. Supervised Learning: Multi-Layer Perceptron
A feed-forward network has a layered structure. Each layer consists of
units which receive their input from units in the layer directly below and send
their output to units in the layer directly above; there are no connections
within a layer (Kröse and Smagt, 1996). The inputs are fed into the first layer of
hidden units. The input units are merely 'fan-out' units; no processing takes place
in these units. The activation of a hidden unit is a function of the weighted inputs
plus a bias. The output of the hidden units is distributed over the next layer of
hidden units, until the last layer of hidden units, whose outputs are fed into a
layer of output units as shown in figure 2.4.
The Multi-Layer Perceptron can establish decision regions which are
much more complex than the two half-planes generated by the single-layer perceptron.
The back-propagation learning rule is one solution to the problem of how
to learn the weights and biases in the network. It is an iterative procedure. For
each weight and threshold, the new value is computed by adding a correction to
the old value:
w_{jk}(t+1) = w_{jk}(t) + \Delta w_{jk}(t) \quad (1)
\theta_{j}(t+1) = \theta_{j}(t) + \Delta\theta_{j}(t) \quad (2)
To compute \Delta w_{jk}(t) and \Delta\theta_{j}(t), the gradient descent iterative method is
used. Then the back-propagation (or generalized delta rule) adjusts the weights of
the network using the cost- or error-function. In high dimensional input spaces the
network represents a (hyper)plane, and it is possible to obtain multiple outputs.
Suppose the network is to be trained so that a hyperplane is fitted as well as
possible to a set of training samples consisting of input values x^p and the
desired (or target) output values d^p. For every given input sample, the actual
output y^p of the network differs from the target value by (d^p − y^p). The error
function (also known as the least mean square criterion) is the summed squared
error. The total error is given as:
E = \frac{1}{2} \sum_{p} \sum_{o} (d_{o}^{p} - y_{o}^{p})^{2} \quad (3)
where the index p ranges over the set of input patterns and E^p represents the error
on pattern p. The LMS procedure finds the values of all the weights that minimise
the error function by a method called gradient descent: each weight is changed
in proportion to the negative of the derivative of the error, as measured on the
current pattern, with respect to that weight.
The fundamental idea of the back-propagation rule is that errors for the
units of the hidden layer are determined by back-propagating the errors of the
units of the output layer.
The output of a network is formed by the activation of the output neuron.
The activation function for a multilayer feed-forward network must be a
differentiable function of the total input:
y_{k} = \mathcal{F}(s_{k}) \quad (4)
where
s_{k} = \sum_{j} w_{jk} y_{j} + \theta_{k} \quad (5)
According to the gradient descent method, the generalised delta rule is given as:
\Delta_{p} w_{jk} = -\gamma \frac{\partial E^{p}}{\partial w_{jk}} \quad (6)
The error measure defined as the total quadratic error for pattern p at the output
units is:
E^{p} = \frac{1}{2} \sum_{o} (d_{o}^{p} - y_{o}^{p})^{2} \quad (7)
where d_o^p is the desired output for unit o when pattern p is clamped.
Training a network by back-propagation in a neural network consists of
two steps. The first step is the propagation of input signals forward through the
network. The second step is the adjustment of weights based on error signals
propagated backward through the network. As shown in Equation (7), the
performance of the system is measured by a mean square difference between the
desired target outputs and the actual outputs.
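The two-step cycle above can be sketched in Python (rather than the project's MATLAB toolbox). The network size, weights, input and learning rate below are illustrative values, not taken from the project:

```python
import math

# Minimal sketch of back-propagation for one hidden layer and one output
# unit, using the sigmoid activation and the generalised delta rule.

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

def forward(x, w_hid, b_hid, w_out, b_out):
    # Step 1: propagate the input signal forward through the network.
    h = [sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
         for w, b in zip(w_hid, b_hid)]
    y = sigmoid(sum(wo * hi for wo, hi in zip(w_out, h)) + b_out)
    return h, y

def backward(x, d, w_hid, b_hid, w_out, b_out, gamma=0.5):
    # Step 2: propagate the error backward and adjust weights and biases.
    h, y = forward(x, w_hid, b_hid, w_out, b_out)
    delta_out = (d - y) * y * (1.0 - y)            # output-unit error
    for j, hj in enumerate(h):
        delta_hid = hj * (1.0 - hj) * delta_out * w_out[j]  # hidden-unit error
        w_out[j] += gamma * delta_out * hj
        for i, xi in enumerate(x):
            w_hid[j][i] += gamma * delta_hid * xi
        b_hid[j] += gamma * delta_hid
    return w_hid, b_hid, w_out, b_out + gamma * delta_out

x, d = [1.0, 0.0], 1.0
w_hid, b_hid = [[0.1, -0.2], [0.3, 0.4]], [0.0, 0.0]
w_out, b_out = [0.5, -0.5], 0.0
_, y0 = forward(x, w_hid, b_hid, w_out, b_out)
w_hid, b_hid, w_out, b_out = backward(x, d, w_hid, b_hid, w_out, b_out)
_, y1 = forward(x, w_hid, b_hid, w_out, b_out)
# After one update, the squared error (d - y)^2 on this pattern shrinks.
```

Repeating the two steps over many patterns is what Equation (7) measures with the mean square difference between targets and actual outputs.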
2.4.8.2. Unsupervised Learning: Kohonen Network/ Self -Organizing
Maps (SOM)
These networks can learn to detect regularities and correlations in their
input and adapt their future responses to that input accordingly. The neurons of
competitive networks learn to recognize groups of similar input vectors. Self-
organizing maps learn to recognize groups of similar input vectors in such a way
that neurons physically near each other in the neuron layer respond to similar
input vectors.
The Kohonen layer (Kohonen 1984, 1988) is a “Winner-take-all” (WTA)
layer. Thus, for a given input vector, only one Kohonen layer output is 1 whereas
all others are 0. No training vector is required to achieve this performance. Hence,
the name: Self-Organizing Map Layer (SOM-Layer).
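The winner-take-all behaviour can be illustrated with a short sketch (in Python rather than MATLAB; the weight vectors and input are made-up values):

```python
# Winner-take-all: exactly one Kohonen-layer output is 1 (the unit whose
# weight vector is closest to the input); all other outputs are 0.

def wta(x, weights):
    # Squared Euclidean distance from the input to each unit's weights.
    dists = [sum((xi - wi) ** 2 for xi, wi in zip(x, w)) for w in weights]
    k = dists.index(min(dists))                  # index of the winning unit
    return [1 if j == k else 0 for j in range(len(weights))]

weights = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]   # three Kohonen units
out = wta([0.9, 0.8], weights)
# Only the unit nearest the input fires.
```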
The Kohonen network (Kohonen, 1982, 1984) can be seen as an extension
to the competitive learning network, although this is chronologically incorrect.
Also, the Kohonen network has a different set of applications.
In the Kohonen network, the output units in S are ordered in some fashion,
often in a two-dimensional grid or array, although this is application-dependent.
The ordering, which is chosen by the user, determines which output neurons are
neighbours.
Now, when learning patterns are presented to the network, the weights to
the output units are adapted such that the order present in the input space ℝ^N
is preserved in the output, i.e., in the neurons in S. This means that learning patterns
which are near to each other in the input space (where ‘near’ is determined by the
distance measure used in finding the winning unit) must be mapped onto output
units which are also near to each other, i.e., the same or neighbouring units. Thus,
if inputs are uniformly distributed in ℝ^N and the order must be preserved, the
dimensionality of S must be at least N. The mapping, which represents a
discretisation of the input space, is said to be topology preserving. However, if
the inputs are restricted to a subspace of ℝ^N, a Kohonen network of lower
dimensionality can be used. For example, data on a two-dimensional manifold in a
high-dimensional input space can be mapped onto a two-dimensional Kohonen
network, which can, for example, be used for visualisation of the data (Kröse and
Smagt, 1996).
Usually, the learning patterns are random samples from ℝ^N. At time t, a
sample x(t) is generated and presented to the network. Using the formulas for
competitive learning, the winning unit k is determined. Next, the weights to this
winning unit as well as its neighbours are adapted using the learning rule
w_{o}(t+1) = w_{o}(t) + \gamma \, g(o, k) \, (x(t) - w_{o}(t))
Here, g(o, k) is a decreasing function of the grid-distance between units o and k,
such that g(k, k) = 1. For example, a Gaussian function can be used for g(o, k),
such that (in one dimension!) g(o, k) = \exp(-(o - k)^{2}). Due to this collective
learning scheme, input signals which are near to each other will be mapped onto
neighbouring neurons. Thus the topology inherently present in the input signals
will be preserved in the mapping.
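The learning rule above can be sketched as follows (a Python illustration with made-up weights and sample; the project itself uses the MATLAB toolbox):

```python
import math

# One step of the Kohonen update: the winner k and its grid neighbours
# move toward the sample x(t), weighted by the one-dimensional Gaussian
# neighbourhood g(o, k) = exp(-(o - k)^2).

def som_step(x, weights, gamma=0.5):
    # Winner = unit whose weight vector is closest to x.
    dists = [sum((xi - wi) ** 2 for xi, wi in zip(x, w)) for w in weights]
    k = dists.index(min(dists))
    for o, w in enumerate(weights):
        g = math.exp(-float(o - k) ** 2)   # grid-distance weighting, g(k,k)=1
        for i in range(len(w)):
            # w_o(t+1) = w_o(t) + gamma * g(o,k) * (x(t) - w_o(t))
            w[i] += gamma * g * (x[i] - w[i])
    return k

weights = [[0.0, 0.0], [0.5, 0.5], [1.0, 1.0]]   # three units on a 1-D grid
k = som_step([0.9, 0.9], weights)
# The winner moves most; its neighbours move less, preserving topology.
```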
Self-organizing maps (SOM) learn to classify input vectors according to
how they are grouped in the input space. They differ from competitive layers in
that neighboring neurons in the self-organizing map learn to recognize
neighboring sections of the input space. Thus, self-organizing maps learn both the
distribution (as do competitive layers) and topology of the input vectors they are
trained on (Demuth and Beale, 1992).
The neurons in the layer of an SOM are arranged originally in physical
positions according to a topology function. The functions gridtop, hextop, and
randtop arrange the neurons in a grid, hexagonal, or random topology
respectively. Distances between neurons are calculated from their positions with
a distance function. There are four distance functions: dist, boxdist, linkdist,
and mandist. Link distance is the most common. These topology and distance
functions are described in Topologies (gridtop, hextop, randtop) and Distance
Functions (dist, linkdist, mandist, boxdist) (Demuth and Beale, 1992).
CHAPTER THREE
METHODOLOGY
The basic flowchart of the software procedure of this project is shown in Figure
3.1, while a more comprehensive one is shown in Figure 3.2. It can be seen that the
software implementation is divided into perception, reasoning and action.
3.1. Chassis Fabrication
Before anything else can be done, a vehicle chassis must be available. To avoid
mechanical complexity in the axle and gearing system, it was decided that gears and
axles would be avoided except where absolutely necessary. Based on this rationale,
the vehicle is turned right or left using the principle of differential drive.
The chassis was constructed with the following steps:
1. The base was cut out from fibre glass material having dimensions of 18cm by
24cm.
2. The diagonals of the base were marked out.
3. The tyres were fabricated from wood of thickness 2 cm and are 10 cm in
diameter. Wood was chosen for the tyres to ensure rigidity of the vehicle.
To ensure proper friction, the curved surfaces of the tyres are to be lined
with thin rubber (Dunlop) material to aid traction.
4. The tyres were attached at the middle to a plywood piece onto which the
motors (Dynamixel AX – 25f) were mounted. The servo-motors (with the
tyres attached) were fixed at points equidistant from the vertices and along
the diagonals of the fibre-glass base. This is to ensure that the centre of
mass of the whole assembly is located at the centroid.
[Figure: flowchart for autonomous driving. Start → image acquisition →
pre-processing (conversion to grayscale, thresholding, conversion to a binary
image, conversion of the image matrix to a column vector, compounding into an
input matrix and, for supervised learning, corresponding target values) → either
supervised learning with a Multi-Layer Perceptron or unsupervised learning with a
Kohonen network, each initialized, trained, validated/tested and simulated →
take the necessary actions.]
Figure 3.2: A More Comprehensive Algorithm for the Robot
This is also to ensure that the vehicle is stable and does not tip over while
moving. The servo-motors were attached to the base with a strong adhesive.
5. Servo-motors were attached to the front tyres only; the back tyres were left
free. Although this reduces the effective torque compared with driving all
four tyres, it results in lower cost and a less complex control algorithm. A
picture of the chassis is shown in Figure 3.3.
3.2. Vision, Image Acquisition and Processing
The following steps were taken in Image Acquisition and Processing:
1. Initialization of the Image Acquisition Device.
i. Plug the Logitech webcam into the laptop, which has MATLAB 2009b
installed on it.
ii. Install and configure the webcam.
iii. Retrieve information that uniquely identifies the Logitech webcam to
the MATLAB Image Acquisition Toolbox software.
iv. Delete any image acquisition object that exists in memory and unload
all adaptors loaded by the toolbox, by resetting the image acquisition
hardware with the command:
imaqreset
imaqreset can be used to force the toolbox to search for new
hardware that might have been installed while MATLAB was running.
2. A reference image first has to be defined. This is the standard image against
which the vehicle compares all other acquired images. The steps involved include:
i. Create a video input object. The DeviceInfo structure returned
by the imaqhwinfo function contains the default videoinput
function syntax for a device in the ObjectConstructor field.
This is done with the following command:
ed = videoinput('winvideo',1);
ii. Configure image acquisition properties
triggerconfig(ed,'manual'); % makes the triggering
manually activated by a command
set(ed,'TriggerRepeat',1); % allows triggering to
be done only once
ed.framesPerTrigger = 1; % makes just one frame
captured per trigger
iii. Start the object running, trigger the object and then acquire the
image data. Starting the object does not imply that data is being logged;
issuing the start command gives the program exclusive use
of the data acquisition device.
The trigger function initiates data logging for the video input
object. To issue the trigger function the TriggerType property
has to be set to ‘manual’ and the object must have been started.
The data logged is hereafter acquired into the MATLAB workspace
for manipulation and processing. The following lines of code
illustrate the above explanations:
start(ed); %start the object
trigger(ed) % initiate data logging
init = getdata(ed,1); % acquire logged data
iv. Convert the acquired image from the RGB colour space to a binary
image. This allows for ease of post-processing.
inited = im2bw(init, 0.1); %converts the image
acquired to a binary image.
v. Normally, images acquired by an image acquisition device are
represented by a four-dimensional array (H-by-W-by-B-by-F), where
H represents the image height,
W represents the image width,
B represents the number of colour bands, and
F represents the number of frames returned.
A way of converting this 4-D array into a 1-D array has to be found,
since the inputs to our neural network have to be in a 1-D format.
The following code makes the columns of the four dimensional
matrix successively strung out into a single long column:
comp = inited(:); %converts the 4-D array
into a 1-D array
vi. The object is stopped finally.
stop(ed);
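Step (v) above can be illustrated outside MATLAB as well. The following Python sketch emulates MATLAB's column-major inited(:) on a small, made-up 2-by-2-by-1-by-1 array:

```python
# MATLAB's (:) operator strings the array out in column-major order:
# the first (row) index varies fastest, then columns, bands and frames.
# The 2x2x1x1 array below is an illustrative stand-in for a binary image.

H, W, B, F = 2, 2, 1, 1
img = [[[[1]], [[3]]],          # img[row][col][band][frame]
       [[[2]], [[4]]]]

def colon(arr4d, H, W, B, F):
    # Emulate arr(:): earlier dimensions vary fastest.
    return [arr4d[h][w][b][f]
            for f in range(F) for b in range(B)
            for w in range(W) for h in range(H)]

comp = colon(img, H, W, B, F)
# Column-major order walks down each column first: [1, 2, 3, 4].
```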
3.3. Developing the Neural Network with Multi-Layer Perceptron
(Method 1)
This begins with initialization of the Multi-Layer Perceptron with the back-propagation
algorithm and suitable architecture, training and transfer functions. A dataset, which
is an array of image inputs and corresponding targets (parameters passed to the
Dynamixel actuators), is first acquired. This dataset is then used to train the neural
network with the back-propagation algorithm. After the training, the neural network is
validated and then tested. The main tool used here is the MATLAB Artificial Neural
Network Toolbox. The following steps were taken:
1. The first task is to compile the dataset used to train the neural network. The learning
mode adopted for the neural network is supervised learning. This involves feeding
the neural network various types of input and giving it the corresponding
target actions to take. To train the network sufficiently, enough input–target
pairs have to be specified. Conventionally this is done by a teacher who enters
the target value for every input condition encountered. To save the drudgery of
doing this for over a thousand or so input conditions, it was decided to
automate the process.
To get the dataset, the various tasks carried out in step 2 above have to be
repeated, but the image acquisition properties have to be changed for the dataset
object.
% Create video input object.
imaqreset % resets image acquisition device
vid = videoinput('winvideo',1); %define a new object
apart from the one above
% Set video input object properties for this
application.
% Note that example uses both SET method and dot
notation method.
set(vid,'TriggerRepeat',100); % allows triggering to
be repeated a hundred times
triggerconfig(vid,'manual'); % allows manual
triggering
vid.FrameGrabInterval = 5; % acquire every fifth
frame from the video stream
vid.FramesPerTrigger = 1; % acquires one frame per
trigger, to allow each image to be processed
To specify the target every time a frame is captured, a correlation function is
applied between the newly acquired image and the reference image. If the
correlation coefficient lies within 0.5 of zero, the vehicle is seeing a similar
image, and the target parameter set moves the vehicle forward; if the
coefficient is far from zero, the image is different and the vehicle has to
move back.
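This automatic labelling rule can be sketched in Python (the bit vectors stand in for the binary images; the 0.5 threshold follows the description above):

```python
import math

# Label each captured frame by correlating it with the reference image.
# Per the rule above: |r| < 0.5 -> similar scene -> 'forward',
# otherwise -> different scene -> 'back'.

def corr_coef(a, b):
    # Pearson correlation coefficient between two equal-length vectors.
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def label(new, ref):
    r = corr_coef(new, ref)
    return 'forward' if abs(r) < 0.5 else 'back'

ref = [1, 0, 1, 0, 1, 0]     # reference bit image (illustrative)
new = [1, 0, 0, 1, 1, 0]     # newly captured bit image (illustrative)
target = label(new, ref)
```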
2. Step 1 normally has to be repeated about a thousand times, and the results
compiled into an array, to make up the dataset used to train the neural network.
However, the loop was made to iterate only ten times to save time and space.
3. The penultimate stage entails configuring, training, validating and testing the
artificial neural network. The algorithm used is a pattern recognition algorithm.
Out of the dataset:
70% is used to train the network,
15% is used to validate the network, and
15% is used to test the network.
The performance graph of the neural network is plotted to check whether the
network has been well trained. The confusion matrix is also plotted to check
whether the network has fared well in recognizing patterns.
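The 70/15/15 partitioning can be sketched as follows (a Python illustration on twenty dummy samples standing in for the image/target pairs):

```python
# Partition a dataset into 70% training, 15% validation and 15% test
# subsets, as described above.

def split(dataset, train=0.70, val=0.15):
    n = len(dataset)
    n_train = round(n * train)
    n_val = round(n * val)
    return (dataset[:n_train],                     # 70%: train the network
            dataset[n_train:n_train + n_val],      # 15%: validate it
            dataset[n_train + n_val:])             # 15%: test it

samples = list(range(20))                          # dummy stand-in samples
train_set, val_set, test_set = split(samples)
# 20 samples -> 14 / 3 / 3, with nothing lost or duplicated.
```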
3.4. Developing the Neural Network with Self Organizing Maps
(Preferred Method)
It was realized that the Multi-Layer Perceptron was not well suited to image
recognition and classification, so the use of Self-Organizing Maps (SOM) was
employed. As discussed earlier, SOM is a type of Competitive Neural Network
(CNN) which employs an Unsupervised Learning algorithm. The images to be
classified are fed into the Neural Network as input and the number of output
neurons is specified. The Network activates one of the several output neurons
depending on the minimum distance between the input and the weight.
To use this method in an autonomous application, images had to be acquired
using the code in Appendix I. The images acquired (shown in Appendix IX) were
used to generate an input matrix for the SOM. Since this type of neural network
does not need target specification, a new SOM was defined with a hexagonal
topology having 3-by-1 neurons. The code used is shown in Appendix III. The
training algorithm used is batch unsupervised weight/bias training (trainbuwb),
and the network was trained over 100 iterations.
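As a rough Python stand-in for the MATLAB SOM above (this sketch uses simple sequential updates rather than the batch trainbuwb algorithm; the two-dimensional inputs are dummy stand-ins for the image column vectors, while the 3-by-1 map and the 100 iterations follow the text):

```python
import math, random

# Train a 3-by-1 SOM for 100 iterations; afterwards each input activates
# exactly one of the three output neurons.

random.seed(0)
data = ([[random.gauss(0.1, 0.02), random.gauss(0.1, 0.02)] for _ in range(10)] +
        [[random.gauss(0.5, 0.02), random.gauss(0.5, 0.02)] for _ in range(10)] +
        [[random.gauss(0.9, 0.02), random.gauss(0.9, 0.02)] for _ in range(10)])

weights = [[0.2, 0.2], [0.5, 0.5], [0.8, 0.8]]   # 3-by-1 map, made-up init

def winner(x, weights):
    # The unit at minimum distance from the input wins.
    d = [sum((xi - wi) ** 2 for xi, wi in zip(x, w)) for w in weights]
    return d.index(min(d))

for t in range(100):                              # 100 training iterations
    for x in data:
        k = winner(x, weights)
        for o, w in enumerate(weights):
            g = math.exp(-float(o - k) ** 2)      # neighbourhood function
            for i in range(len(w)):
                w[i] += 0.1 * g * (x[i] - w[i])

labels = [winner(x, weights) for x in data]       # cluster id per sample
```

After training, the three dummy clusters each activate a distinct output neuron, mirroring how the real SOM classifies the acquired scenes.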
3.5. Scene Identification
Lastly, the network is tested by inputting images through the webcam, randomly
and at intervals, to see whether it can specify its target by itself, independent of the
earlier code and the teacher.
3.6. Locomotion
After the platform has been set up, a working control board for the servo-motors has
to be fabricated. The Dynamixel servo-motor uses TTL serial technology to interface
with the outside world. But because the preferred communication technology with the
motors is USB, a Dynamixel-to-USB converter had to be acquired and its driver
installed on the laptop, which is the main controller. The mechanism for turning
the robot is the differential drive: to turn right, the left wheel is made to
accelerate while the velocity of the right wheel is kept constant.
Conversely, to turn left, the right wheel is made to accelerate while the left wheel is
made to rotate at a constant velocity.
The necessary controls to the differential drive system are the outputs obtained
from the neural network.
3.7. Steering
MATLAB has been chosen to send commands to the Dynamixel. Since the output of the
neural network is stored in MATLAB, it is easily used in the movement of
the robot. Control commands have been written to steer the robot using the differential
drive method, and library functions have been developed for the various steering
and navigation tasks (forward movement, backward movement, stop, right turn, left
turn). Forward movement is achieved by moving both front wheels at the same speed
in the forward direction; backward movement, by making the wheels move at the
same speed in the reverse direction. A left turn is accomplished by moving the right
wheel forward and the left wheel in the reverse direction at different speeds; a right
turn is the direct opposite. The various functions for movement of the car are shown
in Appendix IV.
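The steering library can be sketched as follows (a Python illustration, not the Appendix IV code; the function names and wheel-speed values are assumptions, following the differential-drive rules described above):

```python
# Each function returns (left_speed, right_speed) wheel-velocity commands
# of the kind passed to the actuators; positive means forward.

BASE = 100  # nominal wheel speed, an assumed illustrative unit

def forward():                      # both wheels forward, same speed
    return (BASE, BASE)

def backward():                     # both wheels reversed, same speed
    return (-BASE, -BASE)

def stop():
    return (0, 0)

def left_turn():                    # left wheel reverse, right wheel forward,
    return (-BASE // 2, BASE)       # at different speeds

def right_turn():                   # the direct opposite of a left turn
    l, r = left_turn()
    return (r, l)
```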
CHAPTER FOUR
RESULTS AND DISCUSSIONS
4.1. Simulations and Results
4.1.1. Multi-Layer Perceptron Neural Network
The visual part of the robot was made to observe a particular scene as well as
scenes different from the one it is accustomed to. The images acquired were processed
and used to test the neural network. Thereafter the network was subjected to a
call-back process wherein it observes scenes by itself and outputs a signal. In this
simulation, the output signal is a WAV file: when the network observes the scene it is
accustomed to, it plays a particular file and makes the robot move in a defined way;
when it sees something different, it plays another file and the robot undertakes a
different movement scheme. MATLAB is the tool used for this process, and the code
used is shown in Appendix II.
The images acquired for training are shown in Appendix IX. It can be seen that
Images 2, 3, 4, 7, 8, 9, 10 and 12 are identical. These represent the accustomed scene.
Images 1, 5, 6, and 11 are variations in scene. The outputs given by the network
during the testing process are shown in Table 3.1 together with the desired outputs.
It can be seen that there are errors in classifying images 7 and 9.
The network has two layers and one output unit. It uses the mean square error as
its error function and the scaled conjugate gradient method for training. It used
images 2, 4, 5, 6, 7, 8, and 9 for training; images 3 and 10 for validation; and
images 1 and 11 for testing. The network went through 14 iterations before it
reached the minimum gradient. The GUI showing the state of the neural network is
shown in Appendix III. The confusion matrix is