The Application of Artificial Neural Networks to Business
                           Problems




                           Bryan Mills




            University of Plymouth Business School
                    (Franchised to Cornwall College)




Honours project submitted as partial fulfilment for the degree of
           BA Honours in Business Administration




                  Supervisor: Dr Jon Tucker




                          14th May 1997
Acknowledgements:
        Firstly I would like to thank my supervisor Dr Tucker for his patience and generosity
and, in addition, to acknowledge the contribution he has made to this dissertation. I would
also like to thank Dave Ager, Jill Ferret and Mike Trennary for their tolerance and
encouragement during both the dissertation and the degree programme. I would like to
acknowledge the encouragement I have received throughout the degree from Buzz Banks,
Helen Cobbin and Ken Waller. Also, I would like to take the opportunity to thank Paul
Ingram for his uncompromising and contagious obsession with academia and Tony Butt for
first introducing me to Chaos Theory and non-linearity.




Bryan Mills 1997




                                     Abstract:
     Artificial Neural Networks (ANNs) provide a powerful information technology
based tool for decision making purposes. However, present literature on the subject
is often found to be either inaccessible or of limited relevance to (general) business
application. In this report ANNs are described in a more intuitive manner than found
within much of the existing literature. Emphasis is placed upon the use of ANNs
within the business environment, although the study still provides an introduction for
wider application. Misconceptions surrounding ANNs, and Artificial Intelligence in
general, are explored and recommendations are made with a view to their resolution.
The advantages and disadvantages of ANNs are discussed and present applications
are listed with a view to demonstrating the various application possibilities of ANNs.
To enable wider application of ANNs within business, and to reduce misguided
application, a schema has been developed. This schema, which has been developed
as both a flowchart and a computer program, allows the potential ANN user to
critically appraise the use of ANNs for a given decision making problem.




Contents:

Acknowledgements:
Abstract:
List of Diagrams:
List of Tables:
Glossary of Terms:

Chapter 1 - Introduction
   General Introduction
   Popular Misconceptions Concerning Neural Networks:

Chapter 2 - Discussion of Aims, Methodology and Research Philosophy
   Aim:
   Objectives:
   Benefit of the project to industry and commerce:
   The growth of research in the neural area:
   Methodology and Approach:
   Schema development:

Chapter 3 - Explanation of the Fundamental Concepts of ANNs
   Introduction:
   An outline explanation of the fundamental concepts of Artificial Neural Networks:
   Knowledge Based Systems:
   The difference between Artificial Neural Networks and Conventional Knowledge Based Systems:
   Explanation of the operation of ANNs:
   First Principles:
      Knowledge Based Systems:
   Rule Based System:
   Artificial Neural Networks:
      Overview:
      Components:
      Nodes:
      Weights and bias terms:
      Generalisation:
      Choice of mapping or activator function:
      Data pre-processing:
      Training:
      Topology:
   The Multilayer Perceptron - an example of supervised learning/training:
   The Kohonen self organising net - an example of unsupervised learning/training:
   Summary:

Chapter 4 - Investigation into advantages, disadvantages and current application of ANNs
   Introduction:
   Advantages and disadvantages:
   Current application of ANNs:
   Summary:

Chapter 5 - Schema for the assessment of the suitability of ANNs for given problem
   Introduction:
   Schema:
   Explanation of Schema:
   Summary:

Chapter 6 - Conclusions and Recommendations
   Conclusion:
   Limitations:
   Further Research

Appendix 1 - Example of Training Process
Appendix 2 - Bayesian Updating:
Appendix 3 - Instructions for Running The Computer Program:
Appendix 4 - Computer Code-list
Appendix 5 - Sample Output:
Appendix 6 - Visual Basic as a Programming Language:

Bibliography:


List of Diagrams:
Diagram 1 - Patent Activity
Diagram 2 - Methodology
Diagram 3 - Knowledge Based System
Diagram 4 - Single Neuron Calculation
Diagram 5 - Representation of a Neuron
Diagram 6 - Screen dump of a text file for use in WinNN
Diagram 7 - Class Membership
Diagram 8 - Universe of objects
Diagram 9 - Sigmoid Function
Diagram 10 - The Multilayer Perceptron
Diagram 11 - Kohonen Self Organising Feature Map
Diagram 12 - The operation of ANNs - flow diagram
Diagram 13 - Schema




List of Tables:
Table 1 - Sample Problem
Table 2 - Simplified weight method
Table 3 - Input file explanation
Table 4 - Sigmoid values
Table 5 - Data pre-processing








Glossary of terms:
Activator function - an equation (mapping function) which describes a neuron’s

                   internal state as the weighted total of its inputs;

                   $net = \sum_i x_i w_i - \theta$, where $x_i$ is an input, $w_i$ is its

                   weight and $\theta$ is the bias term.

Algorithm - a procedure or series of steps used to solve a problem

Autoassociative - mapping the original pattern from noisy or incomplete data

Backpropagation - an algorithm which compares results with expected answers and

                   then passes the difference back through the network to facilitate

                   weight adjustment.

Bias term - A systematic error (θ) introduced to each node independently to allow

                   control over the otherwise independent node output.

Cell - A neuron

Database - In this instance, a set of facts (data) stored within a computer system

Dependent variable - A variable which will be altered or created by the change in

                   value of an independent variable(s). Normally shown on the left

                   hand side (LHS) of an equation.

EPOS - Electronic point of sale - the computer connection between cash-tills and the

                   central computer within a retail store

EPS - Earnings per share (accountancy measure)

Front-end subsystem - A computer program designed to simplify (humanise) the

                   input and output of data

Fuzzy - A set whose members belong to it to some degree. In contrast, a standard set

                   either contains a member wholly or not at all (Kosko, 1993).

Function - A rule which maps one set element onto a different element in another set,

                   e.g. sales level could be said to be a function of demand

Generalise - The ability to identify a wide range of objects, patterns etc. from a

                minimal set of key descriptive data

Heteroassociative - mapping input pattern set to different output pattern set

Hyperplane - The generalisation of a line or plane to a space of more than 3

                dimensions, and therefore difficult to represent graphically

Independent variable - A variable which will alter or create the change in value of a

                dependent variable(s). Normally shown on the right hand side (RHS)

                of an equation

Inference engine - The part of a knowledge based system’s programming which

                deduces results from given facts/data

Knowledge based system - The separation of data and control (algorithms) allowing

                the computer to respond to a series of differing inputs by calling on a

                library of information (knowledge base) as opposed to altering

                variables contained explicitly within the program’s structure.

Mapping function - A rule linking the elements of one set to those of another; usually

                shown as F:x→y; the function which maps the x onto y.

Multivariable - Containing a large number of independent variables

Network - A collection of interconnected nodes forming a topology

Neuron - A single activator function, a processing element, a mapping function

                through which variables must pass, a calculation point

Nodes - Neurons

Non-linearity - Equations containing powers, roots, trigonometric or logarithmic

                functions.






Normalisation - A form of data pre-processing which seeks to give all inputs/outputs

                   a commonality by constraining their values to within a pre-

                   determined range

Pre-processing - Alterations to data before use (normalisation, removal of outliers,

                   ratio splitting). Usually conducted with the intention of increasing

                   the network’s efficiency or converting non-numeric data to

                   numeric.

Propositional logic/calculus - A step by step inference system for determining

                   whether a given proposition is true or false. There are various forms

                   of propositional logic (modus ponens, modus tollens, denial of

                   antecedent etc.), but all are based on variations of: If x is true then

                   y must be true/false, If and only if x is true then y is true/false etc.

                   (Eysenck and Keane, 1995).

Ratio splitting - Using the component parts of a ratio separately as opposed to using

                   the result (GPMargin = GP/Sales; use GP and Sales as input not GP

                   Margin)

Real-time - The collection and processing of data as events occur as opposed to the

                   use of historic data. EPOS works in real-time

ROCE - Return on capital employed (accountancy measure)

Set - A collection of elements defined by a rule which makes them separable from

                   other sets - e.g. men and women are two separate sets (separated by

                   sex) but are also within the common set of humans (separated from

                   other animal forms by species)

Sigmoid - A common ANN activator function. An equation which has the effect of

               reducing any summed input to an answer between 0 and 1 (tending

               towards, but never reaching, either value) and is generally given thus:

               $$f_{nonlinear}(x) = \frac{1}{1 + e^{-x}}$$

               where x is the summed input and e is the mathematical constant that is the

               base of natural logarithms (2.71828...)

Topology - In this instance an attempt to graphically represent the interconnection of

               nodes within the network. Topology is often one of the key

               distinguishing features separating different ANNs (others being

               training method and activator function)

Training method - As ANNs self learn by exposure to data it is necessary to have an

               algorithm which allows the ANN to distinguish between correct and

               incorrect responses. This may either be supervised (told when

               incorrect and what should have been the output), unsupervised (self

               learning pattern recognition) or reinforced (told simply if correct or

               incorrect)

Training set - A collection of data used to train the ANN, usually separated into a

               training set and a hold out or test set

Vector - A quantity which has both magnitude and direction. An ANN’s input consists

               of a one-dimensional array of differing x values, combined with their

               weights in the form $x_1 w_1 + x_2 w_2 + x_3 w_3 + \dots + x_n w_n$,

               where x indicates input and w indicates weight

Weights - A value which is altered by the ANN to enable the emphasis of the variable

               upon which it acts to be either strengthened or weakened. A variable

               coefficient which determines strength of an input’s effect on output
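
The Activator function, Bias term, Sigmoid and Weights entries above can be drawn together in a short illustrative sketch of a single neuron's calculation. The Python below is not taken from the computer program described in this report; the function names and the sample inputs, weights and bias value are assumptions chosen purely for illustration.

```python
import math


def net_input(inputs, weights, theta):
    """Weighted total of the inputs minus the bias term: sum(x_i * w_i) - theta."""
    if len(inputs) != len(weights):
        raise ValueError("each input needs exactly one weight")
    return sum(x * w for x, w in zip(inputs, weights)) - theta


def sigmoid(x):
    """Sigmoid activator function, 1 / (1 + e^-x).

    The two branches are algebraically identical; splitting them avoids a
    floating point overflow in math.exp for large negative x.
    """
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def neuron_output(inputs, weights, theta):
    """Output of a single neuron: the sigmoid of its net input."""
    return sigmoid(net_input(inputs, weights, theta))


# Hypothetical values, for illustration only.
example_inputs = [0.5, 0.2, 0.9]
example_weights = [0.4, -0.6, 0.3]
example_theta = 0.1

print(neuron_output(example_inputs, example_weights, example_theta))
```

Raising a weight strengthens the effect of its input on the output, while raising the bias term lowers the net input for every pattern alike; the sigmoid then squashes whatever net value results into the range between 0 and 1.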







                              Chapter 1 - Introduction
General Introduction
          Business involves a complex mix of people, policy and technology, and exists

within the constraints of economics and society (Clifton and Sutcliffe, 1994). It is

often the precise way in which these items are mixed that can create either success or

failure for an organisation. This presents the manager with two key tasks: the

efficient collection of all relevant information, and its analysis. From this analysis the

manager will be able to formulate strategies, define objectives and implement plans

for their fruition. The provision and analysis of information, within business, is often

referred to as the decision support process and the methodology adopted referred to as

decision support systems (DSS).

          Business decisions can often be viewed as the solution of various

mathematical problems. Whether it be determining the price level of a product, the

benefit of expansion into a new market, staff levels or the probability of a project

failing, mathematics usually plays a role. In fact, due to the overriding objective of

“maximising shareholders’ wealth” (McLaney, 1994) found within all profit making

organisations, it can be said that, as wealth/profit is measured numerically, it would

be difficult, if not impossible, to view the organisation meaningfully in any other

way¹.

          One of the key problems in any decision is the availability and cost of

“perfect information”. Given perfect information (all the facts concerning a decision

with complete confidence in these predictions being correct) there would be little for

the business manager to decide; it would simply be a choice of the project which

maximised overall contribution². Most decisions, however, are not based on perfect

information. This is generally due to a combination of the prohibitive cost of

gathering such information, the availability of information and the intrinsic

unpredictability and complexity of the markets in which business operates.

1 Non-profit making organisations seek cost efficiency - another mathematical measure

           Ongoing developments in the field of Information Technology have enabled

the gathering, storage and retrieval of much larger quantities of information than was

previously possible. Stock Markets can be observed in real-time, supermarkets know

the exact quantities of goods on their shelves (via Electronic Point of Sale (EPOS))

and their customers’ weekly shopping lists (via Loyalty Cards), and companies can

measure the exact output of machines on the shop floor (via Computer Aided

Manufacturing). This information is, however, worth only as much as the gain

derived from its ownership. To be able to quote a share price or stock level is fine,

but the information has already become historic. What is required in decision making

is a means by which to identify patterns and trends in the large volumes of data

currently available, and to increase the confidence in the predictability of this data to

an acceptable level.

           The capability of Artificial Neural Network (ANN) models to recognise

patterns and trends in large volumes of data has meant that they are being

increasingly used for a variety of industrial/commercial applications.

           ANNs are a form of computer software which took their original inspiration

(McCulloch and Pitts, 1943) from man’s limited understanding of the workings of the

human brain. Research has been carried out in this area for two broad reasons. The

first and original was an attempt to model the human brain electronically to develop

a greater understanding of its operation. The second, and most relevant in this

instance, is the development of these models as a mathematical tool for studying

patterns and relationships in data.

2 Overall Contribution - the manager would consider the organisation’s other ventures, market share, market growth and long term survival in his/her decision

        The mathematics which forms these models is particularly useful when

dealing with non-linear problems, problems which cannot be graphed by use of a

straight line, of which there are numerous examples in business (demand/price,

production level/cost, share price/ROCE/EPS - a change in the independent

variable (price) does not guarantee a proportionate change in the dependent

(demand)). ANNs are also capable of dealing with dependent variables which may

have several variables acting on them (e.g. interest rates, inflation and estimation of

risk - in cost of capital calculation), the relationship between each being both

theoretically appreciated and explainable but not easily converted into an equation or

algorithm (Klimasauskas, 1991 and Scocken, 1994).

        It is the ability to deal with non-linearity, multivariables and large volumes of

data which gives ANNs what are perhaps their most impressive features - pattern

recognition and self-learning. ANNs receive their information (their knowledge) via a

process of training. Sets of data and desired results are passed through the network

until the computer is able to create, to a reasonable degree of accuracy, the desired

result. This is made possible by the network’s ability to generalise the training data

presented to it and form an output, given new inputs, based on this generalisation.

Once this training stage is complete a problem (independent variables) can be input

and a result (dependent variable) is generated.
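
The training cycle described above (pass data and desired results through the network repeatedly until the output is acceptably close) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method used by any particular ANN package: it trains a single sigmoid neuron with the delta rule rather than a full multilayer network with backpropagation, and the training data (a logical OR problem), learning rate, tolerance and epoch limit are all hypothetical values chosen for the example.

```python
import math
import random


def sigmoid(x):
    """Sigmoid activator function (adequate for the small net values used here)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict(inputs, weights, theta):
    """Output of one neuron: sigmoid of the weighted inputs minus the bias term."""
    return sigmoid(sum(x * w for x, w in zip(inputs, weights)) - theta)


def train_neuron(samples, learning_rate=0.5, tolerance=0.2,
                 max_epochs=20000, seed=0):
    """Pass the training set through the neuron until every output is within
    `tolerance` of its desired result, or until `max_epochs` is reached."""
    rng = random.Random(seed)
    weights = [rng.uniform(-0.5, 0.5) for _ in samples[0][0]]
    theta = rng.uniform(-0.5, 0.5)

    for epoch in range(1, max_epochs + 1):
        worst_error = 0.0
        for inputs, desired in samples:
            output = predict(inputs, weights, theta)
            error = desired - output
            worst_error = max(worst_error, abs(error))
            # Delta rule: scale the error by the slope of the sigmoid.
            delta = error * output * (1.0 - output)
            weights = [w + learning_rate * delta * x
                       for w, x in zip(weights, inputs)]
            # The bias is subtracted in the net input, so it moves the other way.
            theta -= learning_rate * delta
        if worst_error < tolerance:
            break
    return weights, theta, epoch


# Hypothetical training set: (independent variables, desired result).
training_set = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
weights, theta, epochs_used = train_neuron(training_set)

for inputs, desired in training_set:
    print(inputs, desired, round(predict(inputs, weights, theta), 3))
```

In practice part of the data would be held back as a test set, so that the trained weights can be checked against inputs the network has not seen; the sketch omits this step for brevity.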

        Current applications of ANNs include, amongst others: stock and money

market forecasting (Trippi and Turban, 1996), face and handwriting recognition

(Rogers, Kabrisky, Ruck and Oxely, 1994), recognising whether station platforms are

busy or not, missile direction systems, voice recognition, voice control of computers,



data mining (Wiggins, 1994), industrial signal processing (Wiggins, 1994), modelling

of traffic flow (Recker, 1995), human resource management (redundancy selection)

(Coit, 1996), new product feasibility studies (Madu, 1995), risk evaluation, chemical

analysis, weather forecasting and resource management (Davalo and Naïm, 1991), a

complement to business decision support systems (Scocken, 1994), operations quality

control (Horridge, 1997) and the processing of marketing data.

Popular Misconceptions Concerning Neural Networks:
           The subject of Artificial Neural Networks (ANNs) is an example of a name

not being self-explanatory. The description ‘Artificial Neural Network’ is a

misnomer: it suggests an artificial representation of the human mind (it being

composed of a network of neurons). Exciting though the creation of an ‘artificial

mind’ would be, the ANNs currently in operation are little more than computer

programs capable of doing clever ‘sums’. The cleverness of these ‘sums’, however,

is not to be taken lightly. Systems have been developed which are able to identify

patterns in very large samples of data and to calculate relationships

between data where conventional mathematics would have been inadequate for

practical application. They also point to the development of

systems better suited to understanding our own fuzzy³ world.

           As a subject, ANNs are fairly inaccessible and fraught with misconceptions.

The subject is clouded by two separate, but interrelated, explanations, and this

difficulty is further compounded by the absence of accessible knowledge. On the one

hand there are the works of various academics and academic institutions. On the

other is the general public’s⁴ understanding of what ANNs represent - if they are

aware of them at all.

3 Fuzzy - e.g. Language - hot, warm, cold mean different temperatures to different people and the boundary between hot and warm (for example) is not clear (is 18 degrees warm, 19 degrees hot and just as hot as 28 degrees?)
4 Used here simply to describe those outside of the fields of Mathematics, Computing and Psychology - not intended to be in any way derogatory.

           The public’s understanding stems mainly from the world of science fiction

and ‘popular’ science programmes. It is a world of Arthur C. Clarke’s HAL (2001

etc.) and Philip K. Dick’s Bladerunner⁵ - thinking machines which inevitably turn on

their creators, with devastating results⁶. This understanding is not assisted by the

anthropomorphic nature of the language surrounding ANNs and the willingness of

some academics to emphasise this definition (for example - Professor Aleksander,

Imperial College London - “Magnus [a computer program] has a mind of its own” -

(Millar⁷, 1996)). The use of words such as ‘thinking’, ‘neuron’ and ‘understanding’

points towards machines which, eventually, may replicate the human thought process

to the point of being conscious. The reality of the situation is quite different: at

present computers can represent little more than a few thousand neurons, compared to

10,000 in a cockroach’s brain and 100 billion in a human’s (The Economist, 1995).

           The academic world often uses anthropomorphic terms to overcome some of

the limitations of language and the mathematical nature of more correct descriptions.

For example in the development of a computer system to control the heating, lighting

and ventilation of an office building one may be tempted to use expressions such as

- “to develop a system which is aware of its environment”. However, use of the word ‘aware’ may suggest consciousness, and use of ‘its environment’, as opposed to ‘the environment in which it operates’, could suggest ownership and, therefore, existence beyond being an object. The difficulty stems from the absence of a more correct, and equally convenient, shorthand. The alternative - “to develop a system which constantly monitors the surrounding environment and compares this information with a pre-programmed set of ‘ideal’ conditions” - explains the process with a reduced likelihood of confusion, but is not necessarily more accurate. The reader’s frame of reference provides the key to which language would be more appropriate.

5 More correctly, the original book was called Do Androids Dream of Electric Sheep?
6 The defence analyst and writer Warwick Collins has gone so far as to call on the government to restrict the human attributes scientists can give programmes/machines (Millar, 1996, The Guardian Newspaper (17/12/96), page 4, eighth paragraph).
7 The Guardian Newspaper, 17/12/96, page 4, second paragraph.

       The use of such terminology creates few problems within the field because the

level of understanding is such that the words used often have two separate meanings -

the computer related meaning and the human related meaning - for example:

Neuron -

       • Human related meaning - a cell which responds to various inputs by

           producing responses - a processing unit.

       • Computer related meaning - a part of an ANN computer program which

           performs a calculation - a processing unit.

The definitions are similar and would appear to suggest that, if a significant number

of ‘computer neurons’ were assembled, a human brain could be replicated. Whilst

this formed the inspiration behind some of the early research in the field (for example

Rosenblatt 1958, 1961), modern theory points to a level of complication within the

human brain which makes the early optimism seem naive at best.

       A more comprehensive discussion on matters of human and machine

consciousness is found in Penrose, 1988, The Emperor’s New Mind, and 1994, Shadows of the Mind.

       This thesis is intended to explain Artificial Neural Networks in such a way as

to reduce some of the confusion which often surrounds the topic. In addition it is

intended to simplify the application of ANNs (to a given problem) by the

development of a schema (both paper based and as a computer program). This schema will greatly simplify the choice faced by the manager when considering which mathematical tools to use in decision, classification and control problems. To enable the full value of this schema to be realised the thesis begins with a comprehensive review and simplification of existing literature. As previously discussed the confusion stems from three broad areas - media hype, anthropomorphic descriptions and texts aimed at a specialist (scientific) reader - and it is intended that this thesis will contribute towards redressing this balance.




Chapter 2 - Discussion of Aims, Methodology
                        and Research Philosophy
Aim:
       This study aims to develop a level of understanding from which the business

manager (who is unlikely to be an IT specialist) can establish the relative

merits/demerits of the ANN technique for business decision support analysis.

       The project aims to make inroads into some of the more accessible academic

texts with a view to creating a more intuitive guide to ANN use aimed at the business

manager and student. To aid this explanation a schema or system will be developed

whereby the reader can assess the suitability of ANNs for a problem they wish to

solve. To assist in the discussion on the suitability of ANNs for given problems there

will also be an assessment of current uses and the advantages and disadvantages that

application presents.

Objectives:
1) To conduct a literature review of the fundamental concepts underlying ANNs.

2) To examine the existing use of ANNs.

3) To develop a system to enable problems to be assessed for the suitability of ANN

   application.

Benefit of the project to industry and commerce:
       Progress in the development of ANNs is closely tied to the development of

computer equipment. It is only within the past 5 years that computing power has

become cheap enough to make ANN use a viable possibility. However ANNs have

remained in the exclusive domain of the scientist and mathematician for the past 45






years and there are few accessible texts for the non-specialist.

           The field of ANN contains possible solutions to business problems not fully

addressed by present mathematical techniques (Tucker, 1997). As Gleick (1993) and

Waldrop (1992) have commented, non-linear patterns are rife in the enormous volumes of information produced by industry and commerce (e.g. the financial pages, actuarial data, market research responses). ANNs can enable the user to analyse this data more accurately than traditional problem solving techniques, giving them a potential commercial advantage in many industrial sectors.

The growth of research in the neural area:
           The field of ANNs is expanding at an amazing rate. The expansion of the

subject is closely linked to technological developments in the IT field. As this area

continues to develop8 there will be an increasing expansion of opportunities in the

field of ANNs (Medsker, Turban and Trippi, 1996). Funding of research within the

field of ANNs is continuing, with the Japanese government having budgeted $250 million over the next 10 years, and the US government having pledged research funding of $400 million over the next 6 years (The Economist, 1995).

[Chart: Patents Registered USA. Vertical axis: Number (0 to 300); horizontal axis: Year (1986 to 1992); series: Combined, Comp. Int, ANN]

Diagram 1 - Patent Activity

8 Moore’s Law suggests a doubling of the number of transistors on a chip every 18 to 24 months (J. Scholfeild, 1996, The Guardian Newspaper (31/10/96) page 3 Online Section).




Diagram 1 shows patent activity in the USA for the years 1986-92. It can be

seen from the graph that the growth of work within this field is almost exponential. It

is also important to note that the full extent of the application of ANNs within business (particularly finance) has yet to be realised (Farrar, Tucker and Bugmann, 1997).

Methodology and Approach:
       The project is based mainly on a comprehensive literature survey and review

of texts within the field of ANNs. The literature search was conducted in the first

instance to develop a clear understanding of the subject. From this, a succinct

explanation of the concepts underpinning artificial neural networks, aimed at business

managers, has been produced. The greater understanding engendered by the literature

research provides the basis for an analysis of the advantages and disadvantages of the

use of ANNs and forms the foundation of the schema development.




Diagram 2 - Methodology








        The above chart (Diagram 2) represents the flow of tasks from development of

original synopsis to the conclusions and recommendations.

Schema development:
        The schema, which forms the most pragmatic part of the thesis, was

developed from the literature research. The schema seeks to answer the question

“Do ANNs offer a realistic solution to a given problem?” As ANNs are capable of

dealing with a variety of problems, and as the business community usually has a

variety of different problems under review, it is intended that the schema will be

general in its approach, whilst maintaining effectiveness and accuracy.

         The schema is developed both as a flow-chart and as a computer program.

By establishing the specific data and training requirements of ANNs it is possible to

construct a series of questions of a non-technical nature, which the manager can

consider concerning the problem under review. The schema follows the flow of the

responses and culminates in a suggestion for further action. The reasons for the

suggested actions are explained, allowing the manager to consider various courses of

action depending on the resources available to him or her. Where appropriate the

schema will suggest alternative decision making techniques which could prove more

cost efficient or accurate than the use of ANNs.




Chapter 3 - Explanation of the Fundamental
                           Concepts of ANNs
Introduction:
This chapter will seek to place ANNs in the broader context of computer software. A highly simplified description of the workings of ANNs will follow. Once this basic understanding has been established a more detailed explanation will follow, which is intended to equip the reader with a reasonable level of knowledge on the topic, to enable either further study or practical application.

An outline explanation of the fundamental concepts of Artificial Neural Networks:
       An ANN is simply a computer program which, through the adjustment of mathematical weights, is able to create a model capable of producing results (usually in the form 1 or 0, or scaled using decimals from 0 to 1), for a given set of numeric input data, to a reasonable degree of accuracy. The network will often include a front-end subsystem (Attrasoft User’s Guide and Reference Manual, 1996) to enable both data encoding and data decoding:

       Data encoding: to convert user-application data to neural input data.

       Data decoding: to convert neural output data back to user-application data.
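
The encoding and decoding steps can be pictured with a minimal sketch. It assumes simple linear scaling into the 0 to 1 range; the function names and the 0 to 200 range are hypothetical and are not taken from the Attrasoft manual.

```python
def encode(value, lo, hi):
    """Scale a user-application value into the 0-to-1 range used as neural input."""
    return (value - lo) / (hi - lo)


def decode(scaled, lo, hi):
    """Convert a 0-to-1 neural output back into user-application units."""
    return lo + scaled * (hi - lo)


# Hypothetical range: advertising spend between 0 and 200 pounds.
neural_input = encode(150, 0, 200)
print(neural_input)                  # 0.75
print(decode(neural_input, 0, 200))  # 150.0
```

Other encodings (for example yes/no answers as 1/0) follow the same pattern of a conversion in and a matching conversion out.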

       ANNs can be considered as part of the larger group of computer based

techniques referred to as Knowledge Based Systems.


Knowledge Based Systems:
       There are numerous forms of computer systems which fall under the general

heading of Knowledge Based Systems (KBS). This use of computing power can be

defined as:





    “a system within which data is analysed by comparison with sets of pre-obtained

    data by following specific rules and/or weighted relationships” (author)

To facilitate this comparison the system will require a set (library, files, historic

records) of knowledge. This knowledge is the basis upon which the system operates

and can take numerous forms:

          • Financial Data - credit limit, accounting ratios, past sales figures

          • Human Resource Data - qualifications, age, experience (years)

          • Operational Data - machine failures (frequency), tolerances, re-order levels

          As can be seen from the above examples, the knowledge base is often a form

of database of the sort now commonly found within most organisations.                     The

difference between KBS and conventional databases is the level of interrogation and

control which is placed within the system’s remit. As opposed to merely storing data

the system will be called upon to ‘trawl’ through the data to identify trends and

patterns of behaviour or it may use its knowledge to instigate some form of action.

For example if a bill became overdue the system could issue a reminder without the

need for an operator to intervene. This is possible because the system knows the date,

the date the last payment was made, the difference between this date and today’s and

the company’s policy on ‘debtor days’. This example also indicates the level of understanding possible - the system’s ‘knowledge’, in this instance, can in no way be said to be knowledge in the same sense that a human would know what it was to have an overdue bill.
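
As an illustration only, the overdue-bill check could be sketched as follows. The 30-day policy, the function name and the dates are assumptions made for the example.

```python
from datetime import date

DEBTOR_DAYS = 30  # hypothetical company policy held in the knowledge base


def reminder_due(last_payment, today, debtor_days=DEBTOR_DAYS):
    """Return True when the days since the last payment exceed the policy."""
    return (today - last_payment).days > debtor_days


print(reminder_due(date(1997, 3, 1), date(1997, 5, 14)))  # True
print(reminder_due(date(1997, 5, 1), date(1997, 5, 14)))  # False
```

The system ‘knows’ nothing more than these dates and the policy figure.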

          It becomes apparent that many modern databases are capable of achieving

similar results to knowledge based systems. The difference between the conventional9 knowledge based system and databases is becoming increasingly subtle and is more a matter of emphasis on use as opposed to structure. Most databases (Microsoft’s Access for example) are capable of interrogating data and also of issuing notification should this be required.

9 Conventional as opposed to ANNs


The difference between Artificial Neural Networks and Conventional Knowledge
Based Systems:
            As previously discussed ANNs fall under the broad heading of KBS; however, it is important to recognise that there are fundamental differences between ANNs and other KBSs. Whilst a KBS has the rules and relationships concerning its knowledge programmed into the system (albeit kept separate from the knowledge), ANNs develop their ‘own’ rules and relationships through a process of self-learning.

The self-learning abilities of ANNs are most simply explained by example:

Suppose the relationship between the following set of data was desired:

Advertising Spend (£)   100          150           50            10            200
Sales (£)               300          450           150           30            600

Table 1 - Sample Problem

From the above table, by dividing sales by advertising spend (or by drawing a graph), it is quite possible to estimate that sales are three times advertising spend. It is possible to estimate this figure because a) we appreciate, and could prove, a relationship between the variables, and b) there are relatively few variables, which c) enables a simplistic approach to the formulation of an equation (relationship). It can be appreciated that a more complex relationship may exist, which is beyond the simplistic approach used so far. To solve a multivariable and non-linear10 relationship would require the use of statistical techniques which are often complicated and/or rely on a degree of approximation.

10 Non-linear - a relationship which would create the graph of a curve as opposed to a straight line, the equation of which would contain powers (x^2 etc.).

        ANNs take a different route to establishing the relationship between variables - by adjusting the values of numerical weights within an equation (function). The weights will act upon the data to alter its value with the intention of producing the desired result. To enable this process to take place the system must be exposed to the data one set at a time (e.g. Advertising Spend of £100 and Sales of £300 is the first set of data in Table 1). The computer will, in the first instance, apply a guess as to the value of the weights to be used (although this starting value may well be pre-programmed or random (Hopgood, 1993)). This ‘guess’ will, in all likelihood, prove to be wrong and the system will alter the weights and retry.

The first set of data will be treated as below:

Advertising Spend £’s     Weight       Function         Result      Desired Result £’s
100                       1            Spend * Weight   100         300
                          2            Spend * Weight   200         300
                          3            Spend * Weight   300         300

Table 2 - Simplified weight method

It can be seen from the above that after a series of iterative steps the system was able

to produce the desired result, and in our previous example this weight would be

acceptable for all of the data sets.
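
The guess-and-retry steps of Table 2 can be sketched in a few lines. This is a deliberately simplified illustration: the fixed step of 1 and the function name are assumptions, and a real ANN would adjust its weights in proportion to the error rather than in whole steps.

```python
def train_single_weight(spend, target, weight=1.0, step=1.0, max_tries=100):
    """Move the weight in fixed steps until spend * weight equals the target."""
    for _ in range(max_tries):
        result = spend * weight
        if result == target:
            return weight
        # Too low: raise the weight. Too high: lower it.
        weight += step if result < target else -step
    raise ValueError("no weight found within max_tries")


weight = train_single_weight(100, 300)
print(weight)  # 3.0

# The same weight reproduces every data set in Table 1.
table_1 = [(100, 300), (150, 450), (50, 150), (10, 30), (200, 600)]
print(all(spend * weight == sales for spend, sales in table_1))  # True
```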

        The function used in the above example is linear, as opposed to the non-linear functions used within ANNs; also, the number and relationship of the variables is more simplistic than would normally be encountered (for an example of the more complicated OR problem see Appendix 1).

        It is possible to imagine that if the relationship was more complicated and our weight of 3 proved unsuitable for the next data set it could be adjusted again and then re-used on both sets until a satisfactory relationship was obtained. Most ANNs have an adjustable degree of tolerance (between the ANN output and the training set’s expected result). For example, WinNN has an adjustable target error to determine the acceptable Root Mean Square (RMS) error11; once the RMS error falls to the target, training of that net is said to be complete. Note that the lower the acceptable error, the more refined, and less generalised, the net becomes.
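
A minimal sketch of such a stopping test is given below. It is not WinNN’s own code; the target error of 0.1 and the output values are invented for illustration.

```python
import math


def rms_error(outputs, targets):
    """Square root of the mean of the squared output-minus-target differences."""
    squared = [(o - t) ** 2 for o, t in zip(outputs, targets)]
    return math.sqrt(sum(squared) / len(squared))


TARGET_ERROR = 0.1  # hypothetical tolerance chosen by the user

outputs = [0.05, 0.92, 0.88, 0.10]  # made-up network outputs
targets = [0, 1, 1, 0]              # expected results from the training set
error = rms_error(outputs, targets)
print(error <= TARGET_ERROR)  # True, so training would stop here
```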

          The procedure described in this simplified model could be said to represent a

single neuron (processing unit, cell). To enable more complicated relationships to be

developed ANNs have more than one neuron and it is not uncommon for the results

of one neuron to be the input of another. If these connections were viewed pictorially

they would form a network of interconnected neurons, hence: Artificial (non-human) Neural (processing units) Network (interconnection of neurons).


Explanation of the operation of ANNs:

First Principles:

Knowledge Based Systems:
          As discussed in the introduction, ANNs are a form of software that has the ability to self-learn. Unlike more conventional (rule based) forms of knowledge-based systems, the algorithms used to enable the inference engine (rule interpreter) to work are not hard-programmed, explicit rules based on the IF...THEN...ELSE pattern. Instead the program uses a series of mathematical weights to establish data relationships. To enable an understanding of the difference it is first necessary to explain the basic components within knowledge based systems.

          Knowledge based systems contain 3 core components: an interface with the user (outside world) to enable both data input (keyboard, sensors, etc.) and output (monitor, servos, printout, etc.), a knowledge base (data base) and an inference engine (rule interpreter, instructions, ‘main program’). There are two other components often found within knowledge based systems: an explanation module12 to enable the reasoning behind the decision made to be shown, and a knowledge acquisition module to enable the knowledge base to be built by use of one or more of the acquisition techniques possible (Hopgood, 1993). Diagram 3 illustrates the relationship between these components:

11 RMS - the square root of the mean of a set of squared numbers




Diagram 3 - Knowledge Based System
As shown in diagram 3 the relationship between the components within a KBS is

relatively straightforward. Information is gathered from the outside world, stored

within a data base and, upon a query being made, accessed to provide an answer.

Rule Based System:
             A rule based system is based, fundamentally, on the IF...THEN...ELSE

structure (propositional logic/calculus). The following illustrates this point:

IF credit level is greater than pre-agreed limit
THEN stop credit and issue reminder
ELSE do nothing

Here the credit level is computed from inputs and the pre-agreed limit is contained within the knowledge base. It is both common and desirable that the information required to process the rule is contained explicitly within the knowledge base, as opposed to implicitly within the program, to enable a simpler and more robust method of updating to be used (e.g. entries in a data base are changed, as opposed to altering the program’s source code) (Hopgood, 1993).

12 ANNs have great difficulty in satisfying this requirement and your attention is drawn to the discussion in Chapter 4

       Whilst it can be appreciated that this is a simplistic view of the workings of a rule based system, further developments serve only to improve and build upon this basic methodology (see for example Appendix 2 - Bayesian Updating).
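
For comparison with the ANN approach, the credit rule given earlier can be written as a short sketch. The limit of 5000, the dictionary used as a knowledge base and the function name are hypothetical.

```python
def credit_rule(credit_level, agreed_limit):
    """IF credit level > limit THEN stop credit and issue reminder ELSE do nothing."""
    if credit_level > agreed_limit:
        return ["stop credit", "issue reminder"]
    return []


# The limit lives in the knowledge base, not in the rule itself,
# so it can be updated without touching the program.
knowledge_base = {"agreed_limit": 5000}  # hypothetical entry

print(credit_rule(6200, knowledge_base["agreed_limit"]))  # ['stop credit', 'issue reminder']
print(credit_rule(1500, knowledge_base["agreed_limit"]))  # []
```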

Artificial Neural Networks:

Overview:
       The key difference between ANNs and KBS lies with the inference engine.

As opposed to having a logic imposed on it, the network is allowed to develop its

own logic by means of training, either supervised or unsupervised. Weights are used

to determine the strength of relationships and there is no IF...THEN...ELSE. Instead the network decides the relevance of inputs and their interconnections based on its own experience (i.e. its training).

       The network consists of a selection of nodes or cells arranged structurally in a

predetermined topology. The nodes are grouped in layers. This takes the form of an

input layer, one or more hidden layers and an output layer. Each node accepts various inputs, adjusts them via weights, adds the weighted inputs together, passes the sum through a non-linear function and outputs the result to another cell. At the output layer the result is compared with the expected answer and the difference is passed back through the network to allow weight adjustment to correct errors (backpropagation). A simplified single neuron calculation appears thus:








Diagram 4 - Single Neuron Calculation



Pictorially this can be represented thus:




Diagram 5 - Representation of a Neuron




This process is explained, in detail, below and would normally be performed by numerous neurons/cells/nodes within one or more layers at the same time, i.e. in parallel.
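
The calculation performed by one neuron, and the passing of its result to another, might be sketched as below. The weights are made up and the sigmoid is assumed as the non-linear function; this is an illustration rather than a reproduction of Diagram 4.

```python
import math


def neuron(inputs, weights, bias=0.0):
    """Weight each input, sum them, subtract the bias, then apply a sigmoid."""
    net = sum(x * w for x, w in zip(inputs, weights)) - bias
    return 1.0 / (1.0 + math.exp(-net))


# With all weights at zero the weighted sum is 0, which the sigmoid maps to 0.5.
print(neuron([1, 0], [0.0, 0.0]))  # 0.5

# The output of one neuron becomes the input of the next.
hidden = neuron([1, 0], [2.0, -1.0])
output = neuron([hidden], [1.5])
print(0.0 < output < 1.0)  # True
```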


Components:
        It is important to appreciate that ANNs gain their ability not from a

predetermined layout or selection of weights but from the network’s ability to adjust

weights and alter (strengthen/weaken) connections between nodes.              Before

attempting to explain the mathematics behind these interconnections an explanation

of the key components of the network is required.




Nodes:
         Medsker, Turban and Trippi (1996) comment that most commercial ANNs

have between 10 and 1,000 nodes arranged in three layers, and that although 4, 5 or more layers is not unheard of, it is not deemed necessary for business applications.

         Hopgood (1993) describes a node’s role as “to sum each of its inputs, subtract a bias term, θ, and pass the result through a non-linear function, f_non-linear, known as the activation function”. Hopgood’s emphasis on the bias term is discussed below.

ANNs have sets of these calculating functions; Patterson (1996) states that “Every ANN is composed of a set n of simple neural computing elements (neurons, units, processing elements or PEs, cells)”, where this set of cells can be given as:

C = {c_i}, i = 1, 2, ..., n.

Patterson goes on to comment that cells can be grouped into three distinct categories;

input, hidden (or interior) and output.



         The interior layer of cells are the nodes which perform the majority of the

calculation process and are discussed under various headings below (Weights and

Bias Terms, Generalisation, Choice of Function). Input cells are the cells which take

the initial input of stimuli (discrete keyed values or continuous sensor data) whilst

output cells enable the display of results or the control of effectors. The inputs and outputs are usually represented by the vector x of n dimensions and the vector y of m dimensions (simply put: x1, x2, ..., xn and y1, y2, ..., ym).

The input data often takes the form of a text file in PC based neural nets:








Diagram 6 - Screen dump of a text file for use in WinNN


        The above input file demonstrates the relatively simplified form of data which

may be used in ANN training and operation. The above example does not feature

scaling of the variables as this is not required in this instance, however it does

provide a representation of the form input files often take. The file represents 4

training sets, each with 2 inputs and 1 output (4,2,1). In the first training set (case) x1

would be 0, x2 would be 0, and the expected result (y1) would be 0. This would be

followed by the second set which would be 0,1,1 respectively and the third etc. This data represents the commonly used XOR example/problem, which gives the result 1 when exactly one of the two inputs is 1 and 0 when the inputs are the same (Patterson, 1996). The trained network

could be used to solve simple yes/no problems for example:

Account        Purchased over   Arrange visit     Reason
customer?      200 units?       by sales staff?
n              n                n                 Probably not trade customer
n              y                y                 Offer trade account
y              n                y                 Try to increase sales to trade customer
y              y                n                 Credit limit probably reached




Table 5 - Input file explanation

        The above example is highly simplified. It does, however, represent the style

of business control system which uses yes/no responses. It is important to note that

the reason for the decision would not be given by the ANN.
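
The mapping the network is asked to learn can be written out directly, as in the sketch below. This is the target behaviour, not a trained network; the third and fourth training cases are assumed to follow the standard XOR truth table, and the function names are hypothetical.

```python
def xor(a, b):
    """Return 1 when exactly one of the two 0/1 inputs is 1, otherwise 0."""
    return a ^ b


# Training cases in (x1, x2, y1) order, matching the input file layout.
training_sets = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
print(all(xor(x1, x2) == y1 for x1, x2, y1 in training_sets))  # True


def arrange_visit(account_customer, over_200_units):
    """Apply the yes/no decision of Table 5; no reason is returned."""
    return "y" if xor(int(account_customer), int(over_200_units)) else "n"


print(arrange_visit(False, True))  # y
print(arrange_visit(True, True))   # n
```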



Weights and bias terms:
       Once the data is entered into the network its connection from input layer to

calculation node is used to facilitate the addition of weights. Patterson (1996) uses

the following notation:


$$\mathit{net} = x_1 w_1 + x_2 w_2 + x_3 w_3 = \sum_i x_i w_i \qquad \text{(Equation 1)}$$

where x is the input variable and w is the weight.

Hopgood (1993) makes the point of subtracting a bias term to give:

$$\mathit{net} = \sum_i x_i w_i - \theta \qquad \text{(Equation 2)}$$

whereas Patterson (1996) prefers the use of a fixed bias input of 1 with weight w0 (i.e. 1*w0) on one of the input links, where w0 = -θ. The use of either method is considered acceptable.
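
That the two treatments of the bias give the same value of net can be checked with a small sketch. The inputs, weights and θ below are made-up values and the function names are hypothetical.

```python
def net_hopgood(x, w, theta):
    """Equation 2: weighted sum of the inputs minus the bias term theta."""
    return sum(xi * wi for xi, wi in zip(x, w)) - theta


def net_patterson(x, w, w0):
    """Bias handled as an extra input fixed at 1 whose weight is w0."""
    return sum(xi * wi for xi, wi in zip([1] + list(x), [w0] + list(w)))


x = [1, 0, 1]          # made-up inputs
w = [0.5, -0.25, 2.0]  # made-up weights
theta = 0.75

print(net_hopgood(x, w, theta))     # 1.75
print(net_patterson(x, w, -theta))  # 1.75, since w0 = -theta
```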

       The weights remain independent of the variables (x) so as to facilitate their

adjustment during backpropagation. It is helpful (Patterson, 1996) to view the

relationship between the weights, class membership and the bias value in terms of a

two-dimensional plot. In a more complicated example the weight vector (w_i) would define a hyperplane in n-space, where n is the number of input variables. In this example n=2 and so the plot is two-dimensional.




Diagram 7 - Class Membership






        The significant points in Diagram 7 are the offset - giving the value of the bias weight (w0) - and the slope of the line, which is given by -w1/w2. Thus the formation of the line is derived entirely from the weights and future x values will be shown as either belonging to the class or not. Patterson (1996) places particular importance on this boundary line as he identifies it as the key to the net’s autonomy through its ability to alter weights and so define what is within the set and what is outside it. The example shown is linearly separable in that its boundary is defined by a straight line/plane. This is largely due to the simplicity of the example (2-dimensional) and partly due to the fact that it would be intended for use in a single layer network. As an ability to cope with non-linearity is one of the key features of ANNs they are, of course, capable of dealing with more complicated examples.
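
A two-input boundary of this kind can be sketched as follows. The sketch assumes the boundary is the line w1*x1 + w2*x2 + w0 = 0, with points on the positive side counted as class members; the weights are invented and the function names are hypothetical.

```python
def in_class(x1, x2, w1, w2, w0):
    """A point belongs to the class when w1*x1 + w2*x2 + w0 is positive."""
    return w1 * x1 + w2 * x2 + w0 > 0


def boundary(w1, w2, w0):
    """Slope and x2-intercept of the line w1*x1 + w2*x2 + w0 = 0."""
    return -w1 / w2, -w0 / w2


w1, w2, w0 = 1.0, 2.0, -4.0  # made-up weights

print(boundary(w1, w2, w0))        # (-0.5, 2.0)
print(in_class(3, 3, w1, w2, w0))  # True
print(in_class(0, 0, w1, w2, w0))  # False
```

Changing any of the three weights moves or tilts the line, which is how training redefines what lies inside the class.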


Generalisation:
        To deal with n-dimensions and non-linearity ANNs generalise. Patterson

(1996) discusses generalisation in terms of “describing the whole from some of the

parts” and points out that the alternative to an ability to generalise is knowing

everything.    It is possible to identify an object by knowing some general rules

involving that class of object without knowing every member of that class. For

example a metal frame with two wheels, a set of handlebars, a saddle and fitting

various size requirements is probably a bicycle. It is not necessary to memorise

every manufacturer’s catalogue.

        ANNs generalise by creating a class which exists in weight space with its

boundary given by the mapping function F (Patterson, 1996). Mapping functions are

either autoassociative or heteroassociative, meaning they recover the original pattern from noisy/incomplete data or map input patterns to different output patterns respectively. Mapping functions are shown mathematically as F:x→y. The non-linear boundary can be shown by the simplified diagram:




Diagram 8 - Universe of objects
        From Diagram 8 it is possible to see that the boundary established by the

network includes both the training set data and other instances of the data not given

in the training set but which would be encountered if more sets of data were made

available - therefore giving the network an element of flexibility. The boundary must

therefore include all examples of the training set, all examples of data corresponding to the net’s function but not known at the time of training, and exclude all other data sets. Once this has been achieved the generalisation can be said to have been a complete success. It is apparent that the method of training and the selection of data will have a particular bearing on the accuracy of this process.


Choice of mapping or activator function:
        As mentioned in ANN Overview (above) the summed weights are passed

through a non-linear function before proceeding to the next cell or output layer. This

is the mapping function referred to above so that F:x→y, and is known as the

activator function (or activation level/summation function - Medsker, Turban and





Trippi, 1996). It is suggested by Patterson (1996) that the choice of activator should

be a “monotonic nondecreasing function of net”. This simply means that, as the

net input to the node increases, the output of the function never decreases: the slope

of the function rises (or at worst stays level) from left to right, so a higher input is

never given a lower output than a lower input (for example x=2 gives y=0.88 and

x=3 gives y=0.95 for the sigmoid function). Hopgood (1996) makes the point of stating that

“The weights and biases can be learnt, and the learning behaviour of a net depends

on the chosen algorithm”. It is further stated that the sigmoid function is most

commonly used and Patterson (1996) concurs with this statement. The sigmoid

function is given as:

$$f_{\text{nonlinear}}(x) = \frac{1}{1 + e^{-x}} \qquad \text{(Equation 3)}$$
and would appear graphically thus:


[Graph: the sigmoid function plotted for x from -5 to 5; y rises from 0 to 1]

Diagram 9 - Sigmoid Function
The above diagram (Diagram 9) shows some of the key features of the sigmoid
function, and thus the reasons for its use. These features include:
          • The ability to make all values positive.
          • The relatively fine level of discrimination (the slope)
          • The fact that all results are given as between 0 and 1.
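
As a minimal sketch of Equation 3 (Python is used purely for illustration; the function name `sigmoid` and the sample inputs are assumptions, not taken from the cited sources), the function can be evaluated over the range shown in Diagram 9 and its nondecreasing behaviour checked:

```python
import math

def sigmoid(x):
    """Equation 3: squash any real input into the range 0 to 1."""
    return 1.0 / (1.0 + math.exp(-x))

# Evaluate the function over the range plotted in Diagram 9.
inputs = [-5, -3, -1, 0, 1, 3, 5]
outputs = [sigmoid(x) for x in inputs]

for x, y in zip(inputs, outputs):
    print(f"x = {x:>2}  y = {y:.4f}")

# Monotonic nondecreasing: a larger input never gives a smaller output.
is_nondecreasing = all(a <= b for a, b in zip(outputs, outputs[1:]))
print("nondecreasing:", is_nondecreasing)
```

Every output lies between 0 and 1 and is positive, which corresponds to the features listed above.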

Data pre-processing:
          It can be appreciated that the data under investigation may take various forms.

The ANN will require inputs which are of a numeric nature. This does not prevent



non-numeric data being analysed provided it can be converted, with consistency, into

numbers.      For example, risk is a common business concept which is regularly

translated from the vague - safe, moderately safe, risky, very risky - to a range of

probabilities (say 100%, 75%, 50%, 25% probability of a favourable event

occurring).
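
The conversion of verbal risk categories can be sketched as a simple lookup. The names `RISK_TO_PROBABILITY` and `encode_risk` are hypothetical and chosen only for this illustration; the probabilities are those suggested in the text.

```python
# Hypothetical mapping from verbal risk labels to the probabilities
# suggested in the text (probability of a favourable event occurring).
RISK_TO_PROBABILITY = {
    "safe": 1.00,
    "moderately safe": 0.75,
    "risky": 0.50,
    "very risky": 0.25,
}

def encode_risk(label):
    """Convert a verbal risk label to its numeric value, consistently."""
    key = label.strip().lower()
    if key not in RISK_TO_PROBABILITY:
        raise ValueError(f"unknown risk label: {label!r}")
    return RISK_TO_PROBABILITY[key]

projects = ["Safe", "very risky", "Moderately safe"]
encoded = [encode_risk(p) for p in projects]
print(encoded)
```

Rejecting unknown labels, rather than guessing a value, is one way of keeping the conversion consistent across the whole data set.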

        Once the data has been gathered, and given a numeric value if required, the

efficiency and accuracy of the ANN can be enhanced by information pre-processing.

Wasserman (1989) and Patterson (1996) both concentrate on normalisation, which is

a common form of pre-processing. Normalisation is a method by which all the data

being processed can be given a common minimum and maximum value.                    For

example, readers familiar with statistics may draw a parallel between the

normalisation of the data and the techniques used in statistics to determine probabilities

using the normal distribution curve (NDC). Here any distribution of data can be

mapped (converted) to the NDC which has its probabilities pre-calculated.

        The most common form of normalisation will see all the data converted to

values between 0 and 1. This has the advantage of both reducing the difficulty of

manipulating large numbers (simply put it is easier to manipulate, say, 0.2 than

2,000,000) and enhancing the network's ability to adjust weights by reducing

unnecessary emphasis (for example in loan calculations interest rates may be given as

% or decimals, loan size in millions). Certain activator functions will restrict output

to between 1 and 0, regardless of input (Medsker, Turban and Trippi, 1996). For

example the sigmoid function mentioned above:

Input                 1/(1+e^-x)
0.5                   0.6225
5                     0.9933
17                    0.9999999586
35                    0.99999999
35.2                  1
2256                  1
-5                    0.0067

Table 4 - Sigmoid values

        It can be seen from the above table that data which is beyond a certain range

approaches a value of one (exact point at which it appears as one is dependent on the

number of decimal places used and rounding). As the data is multiplied by weights

and summed before entering the activator function it can be appreciated that no

accuracy is gained by maintaining a mixture of large and small numbers (as it will

convert everything to between 0 and 1 regardless). It can also be recognised that the

output from the net, having passed through various activator functions, will be of

decimal or Boolean form (1 or 0). It is important to recognise that as the results for

the sigmoid function return 0 and 1 for a large range of negative and positive

numbers, respectively, it is advisable to restrict the inputs and outputs to values

between 0.1 and 0.9 (Tucker, 1996). This also avoids the use of either 1, which has

the effect of distorting part of the sigmoid function ($e^{-1} = 1/e$), or 0, which distorts

the sigmoid ($e^{0} = 1$) and has the effect of negating weights ($x_1 w_1 = 0$ for $x_1 = 0$).

        There are numerous methods of data pre-processing and Tucker (1996)

contains an accessible description of six common methods. These are given as

Distribution truncation and squashing functions, Natural log regression, Ratio

splitting, Positive/negative split, Variable pre-selection, and Data squashing.




These can be explained briefly as:

Technique                      Description
Distribution truncation        The removal of outliers: unusually large, or small,
and squashing functions        numbers are removed from the data set.

Natural log regression         Using logarithms to convert data into small units.

Ratio splitting                Not inputting the result of a ratio but the numerator (top)
                               and denominator (bottom) values separately.

Positive/negative split        Separating a variable which has both positive and
                               negative examples into two separate variables (e.g. Profit
                               becomes Profit or Loss, as opposed to Profit £200 or
                               Profit -£150 (loss)).

Variable pre-selection         Simply manually deciding which variables to include and
                               which to leave out.

Data squashing                 Converting all data so that it is within a pre-defined range
                               (0.1 to 0.9 being given as most appropriate).
                                                          Adapted from - Tucker (1996)

Table 5 - Data pre-processing
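
Two of the techniques in Table 5 can be sketched in a few lines. This is an illustration under stated assumptions rather than the procedure given by Tucker (1996): the function names `squash` and `split_sign` are hypothetical, the squashing is assumed to be a simple linear rescaling, and the loan figures are invented for the example.

```python
def squash(values, low=0.1, high=0.9):
    """Data squashing: rescale values linearly into the range low to high."""
    smallest, largest = min(values), max(values)
    if largest == smallest:
        # All values identical: place them at the middle of the range.
        return [(low + high) / 2.0 for _ in values]
    scale = (high - low) / (largest - smallest)
    return [low + (v - smallest) * scale for v in values]

def split_sign(value):
    """Positive/negative split: one signed variable becomes two."""
    profit = value if value > 0 else 0.0
    loss = -value if value < 0 else 0.0
    return profit, loss

loan_sizes = [250_000, 1_000_000, 2_000_000]   # illustrative figures only
print(squash(loan_sizes))      # smallest maps to 0.1, largest to 0.9
print(split_sign(200.0))       # a profit of 200: (profit, loss)
print(split_sign(-150.0))      # a loss of 150: (profit, loss)
```

After squashing, loan sizes in the millions and interest rates given as decimals would occupy the same 0.1 to 0.9 range, so neither dominates the weight adjustment simply because of its units.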

          It is also important to emphasise that the data (variables) used within an ANN

should have some form of theoretical basis for a relationship before they are

included13. There is little point comparing interest rates, inflation, project life, level

of risk and outside temperature if the problem under consideration was to determine

the ‘cost of capital’ to use in net present value (NPV) calculations.


Training:
          By their very nature, ANNs require training. This training can be viewed in a

similar manner to human training in that the activity is repeated until the system

produces a satisfactory result (or it is decided to abandon the training run). Hopgood

(1993) suggests that the training process is, more correctly, an “error reduction”

process. Alternatively Patterson (1996) suggests that it is “adaptive learning in a

dynamic environment” emphasising the flexibility of the method.

          There are, however, three separate methods of training - supervised,

reinforced and unsupervised. Wasserman (1989) argues that unsupervised training is


13 The appendix of Farrar, Tucker and Bugmann (1997) provides an example of how this could be approached.






the only “biologically plausible” method of training ANNs. The desire to be

“biologically plausible” suggests Wasserman is more concerned with the theoretical

study and attempted replication of a brain, than practical application of a

mathematical technique. Plausibility should be weighed against ‘usefulness as a tool’

when ANNs are used in applications not related to the study of

psychology/neurology. The use of unsupervised learning will be discussed under the

heading of Topology below (Kohonen being one of its originators). Supervised and

reinforced learning methods are broadly similar in philosophy and so an explanation

of supervised training will be made first.

           In supervised training the results obtained from the cells are compared with

the desired results (contained within the original input). The difference is referred to

as an error and the weights contained within the network are adjusted

(backpropagation). This adjustment continues until the sum of the squares of the

differences (errors) is minimised (in a way similar to linear regression - line of best

fit calculations).

This process can be simplified thus:

           • Subtract output from target contained within input vector.

           •    Square the difference to remove negative signs14.

           • Add all the squared differences together.

           • Compare this answer with the desired level of error.

           • If error unacceptable adjust weights and begin again.

The minimisation is said to be complete when the error has reached an acceptable

level.
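
The steps above can be sketched as a short training loop. To keep the example small it trains a single sigmoid node with a simple gradient rule rather than full backpropagation through hidden layers; the training data, `learning_rate`, `acceptable_error` and `max_runs` are all assumed values chosen for illustration.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sum_squared_error(outputs, targets):
    """Subtract output from target, square, and add the results together."""
    return sum((t - o) ** 2 for o, t in zip(outputs, targets))

# Illustrative training set: input vectors paired with target outputs
# (logical OR, with targets held at 0.1 and 0.9 rather than 0 and 1).
training_set = [
    ((0.0, 0.0), 0.1),
    ((0.0, 1.0), 0.9),
    ((1.0, 0.0), 0.9),
    ((1.0, 1.0), 0.9),
]

weights = [0.0, 0.0]
bias = 0.0
learning_rate = 0.5       # assumed step size
acceptable_error = 0.05   # assumed desired level of error
max_runs = 10_000         # abandon the training run after this many passes

targets = [t for _, t in training_set]
for run in range(max_runs):
    outputs = [sigmoid(sum(w * x for w, x in zip(weights, xs)) + bias)
               for xs, _ in training_set]
    error = sum_squared_error(outputs, targets)
    if error <= acceptable_error:
        break
    # Error unacceptable: adjust the weights and begin again.
    for (xs, target), out in zip(training_set, outputs):
        delta = (target - out) * out * (1.0 - out)
        weights = [w + learning_rate * delta * x for w, x in zip(weights, xs)]
        bias += learning_rate * delta

print(f"runs: {run}, error: {error:.4f}")
```

Squaring matters for the reason given in footnote 14: errors of -100 and +100 would otherwise cancel, whereas `sum_squared_error` reports them as a large total.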

           Reinforced training follows a broadly similar route to supervised but, as

14 Removal of negatives - else error of -100 plus error of 100 would indicate no error.




opposed to calculating the level of error, the ANN is merely informed that it is either

wrong or right and continues to adjust weights until a correct result is identified.

Patterson (1996) suggests that the method is seldom used in practice and attention

should instead be paid to supervised and unsupervised learning (other authors make

little or no mention of reinforced training).

           It should be noted that it is possible to over-train a network. To continue the

human analogy, this would represent an employee trained, in a vocational way, to

such a level that they were only able to perform their present role, and none other.

This may occur, for example, in a factory where an operative has been doing the

same repetitive job for such a long period of time their ability to transfer any of the

skills they have learnt becomes hampered.


Topology:
           The Multilayer Perceptron and Kohonen’s Self Organising Net will be used to

give a more detailed explanation of ANN construction.

The Multilayer Perceptron - an example of supervised learning/training:
           The multilayer perceptron is most commonly used for non-linear estimation

or classification. It is often referred to as a feedforward network15. This net

comprises the conventional input and output layers, with a programmer (user)

defined number of nodes and layers between (hidden). The number of nodes used in

the input/output layers is data specific. A diagram representing the multilayer

perceptron is provided below:




15 Generic name used to refer to this network in particular - any network in which the data flows towards the output is
technically feedforward.








Diagram 10 - The Multilayer Perceptron


        It is usual for the number of input nodes to equal the number of variables

under consideration, likewise the number of output nodes will equal the number of

desired outputs. The number of nodes is often used to give the network its name.

For example the network above may be termed a 3-4-4-2 network as it has 3 input

nodes, two sets of 4 hidden or computational nodes and 2 output nodes and would be

described as a four layer network (although Hopgood (1993) indicates that there is

argument concerning this issue as some claim the input layer should not be counted).

         It can be seen from Diagram 10 that each of the calculation nodes is

connected to every node in the next layer, but that no nodes are connected

vertically (with reference to the diagram's orientation only). The weights are

applied along the connections between the nodes, with each node being a

non-linear function (activator function). The process could be described as - input,

weighting, conversion via activator function, become input for next layer,

weighting, conversion via activator function and output to final layer where

error checking will occur and instigate backpropagation to adjust weights (note - this

method utilises supervised training).
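
The forward flow through a 3-4-4-2 network can be sketched as follows. This is an illustration only: the helper names `make_layer` and `feed_forward`, the random starting weights, the bias term on each node and the sample input vector are assumptions, and the error checking and backpropagation stage is omitted.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def make_layer(n_inputs, n_nodes, rng):
    """One weight per connection plus a bias per node, set at random."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_inputs + 1)]
            for _ in range(n_nodes)]

def feed_forward(layers, inputs):
    """Pass an input vector through every layer in turn."""
    signal = list(inputs)
    for layer in layers:
        # Each node sums its weighted inputs (plus bias, stored last)
        # and passes the total through the activator function.
        signal = [
            sigmoid(sum(w * s for w, s in zip(node[:-1], signal)) + node[-1])
            for node in layer
        ]
    return signal

rng = random.Random(0)          # fixed seed so the sketch is repeatable
sizes = [3, 4, 4, 2]            # the 3-4-4-2 network described above
layers = [make_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]

outputs = feed_forward(layers, [0.2, 0.5, 0.9])
print(outputs)                  # two values, each between 0 and 1
```

Note that the input layer performs no calculation here, which reflects the argument that it should not be counted as a layer.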

        The number of both layers and calculation nodes is dependent on the




programmer's decision given a certain problem. It is common to start at a low number

of layers/nodes and then increase this number until the desired level of

accuracy/speed is achieved.

The Kohonen self organising net - an example of unsupervised learning/training:
        As mentioned previously, the Kohonen self organising net is a form of net

which learns using an unsupervised method. This type of network is most commonly

used for pattern recognition and is often referred to as The Self Organising Feature

Map (Patterson, 1996). As well as differing from the Multilayer Perceptron (MLP)

in training method and application it also differs in that it is a single layer

feedforward network as opposed to multilayer. In addition the network works on a

principle of “winner takes all” (Wasserman, 1989).            This means that as the

information is processed within the layer, one and only one, node will transmit an

output. This is why the method is often referred to as a “competitive” one (Patterson,

1996 and Wasserman, 1989).




Diagram 11 - Kohonen Self Organising Feature Map
        Unlike the Multilayer Perceptron the Kohonen net learns in an unsupervised

manner. As the net is attempting to replicate a pattern in various sets of data it trains

by continually processing different data-sets until it is satisfied that each new run






will create a replica of previous runs (within pre-set tolerances). For example a child

is not always told the same word repeatedly until he or she learns to say it (by being

told when he or she has said it incorrectly) but rather learns by being exposed to

numerous examples of speech and establishes the ability to replicate these patterns

independently. This learning style is sometimes referred to as a competitive filter

associative memory (Medsker, Turban and Trippi, 1996).
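
The “winner takes all” principle can be sketched as below. This is a deliberately simplified competitive layer, not Kohonen's full algorithm (which also adjusts the neighbours of the winning node); the function names, starting weights, learning rate and data are assumptions made for illustration.

```python
def winner(weights, pattern):
    """Index of the node whose weight vector lies closest to the pattern."""
    distances = [sum((w - p) ** 2 for w, p in zip(node, pattern))
                 for node in weights]
    return distances.index(min(distances))

def train(weights, patterns, learning_rate=0.3, runs=50):
    """Move only the winning node towards each pattern presented."""
    for _ in range(runs):
        for pattern in patterns:
            best = winner(weights, pattern)
            weights[best] = [w + learning_rate * (p - w)
                             for w, p in zip(weights[best], pattern)]
    return weights

# Two nodes with assumed starting weights.
nodes = [[0.4, 0.6], [0.6, 0.4]]

# Two loose groups of illustrative data; no target outputs are supplied.
patterns = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.15),
            (0.8, 0.9), (0.9, 0.8), (0.85, 0.85)]
train(nodes, patterns)

print([winner(nodes, p) for p in patterns])
```

No desired result is ever compared with the output: each node simply drifts towards the group of patterns it wins, which is what distinguishes this from the supervised method.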

           It can be appreciated that the differing topologies are related to the different

applications to which the networks are applied. Classification (MLP) and pattern

recognition (Kohonen) are two quite different problems. An example of

classification may be given as “does this data correspond16 to a customer with good

credit ratings”, whereas pattern recognition is more commonly associated with

speech or handwriting recognition or recognition of trends or patterns within

financial information (which may return us to the credit problem by a different route).

           In summary it can be said that the Multilayer Perceptron works by co-

operation and the Kohonen network by competition between nodes.

Summary:
           ANNs offer an alternative to conventional forms of knowledge based systems.

Whilst ANNs drew their original inspiration from studies of the human mind, further

developments in this area are limited by technology. The use of ANNs as a

mathematical tool is, however, both possible and practical at today’s levels of

technology.

           The use of self learning and pattern recognition provides a solution to

problems which, using conventional techniques, may have been overcomplicated or

simply not possible. The basic concept of learning by example through the

16 Is it a member of the set of customers with ....




adjustment of mathematical weights is a reasonable approximation of the process and

allows the internal computations and structure of the network to be treated as a

‘black-box’17.

           Network topologies and learning/training methods are problem specific and

it should be appreciated that the correct choice of network, training style, training

data, pre-processing method and activator function all contribute towards the

successful application of the network.

           Diagram 12 (overleaf) represents, in the form of a flow diagram, the series of

discrete steps which make up ANN operations.




17 Black-box - The exact details of the internal workings need not be known to facilitate use.




Diagram 12 - The operation of ANNs - flow diagram
Chapter 4 - Investigation into advantages,

disadvantages and current application of ANNs.

Introduction:

       In order to allow a more complete consideration of the suitability of ANNs for

a given decision making problem, it is necessary to appreciate the advantages and

disadvantages that the use of ANNs provides. The self-learning pattern recognition

ability which gives the ANN its distinct characteristics, also creates disadvantages,

some of which are problem-specific, whilst others are universal.

       ANNs are currently used for a wide variety of decision support problems.

Perhaps the most commonly cited problem is that of distress modelling (prediction of

bankruptcy), but examples are also found in such diverse areas as production control,

new product development and traffic light sequence (road junction) modelling.

Advantages and disadvantages:

       Numerous authors have commented on the advantages and disadvantages of

ANNs and the following, cited from Hammerstrom (1993), provides what could be

regarded as fundamental points of interest.

Advantages:

       • They can infer subtle, unknown relationships from the data.

       • They are non-linear so that complex problems can be solved more

           accurately than by linear techniques.

       •   They are highly parallel, which makes them run faster on computers with

           parallel processors than alternative methods.

Additional benefits are given by Medsker, Turban and Trippi (1996) as:
Bryan Mills 1997                                   Chapter 4 - Advantages and Disadvantages


        • The ability to cope with highly correlated input data (also termed

           multicollinearity; Tucker, 1996).

        • A more highly automated input interface is made possible by ANNs’ ability

           to process all inputs at once.

        • Fault tolerance - due to the high number of nodes, inaccuracy caused by

           bad data can often be localised and not affect the accuracy of the ANN as a

           whole.

        • Generalisation - noisy, incomplete or previously unseen data will still result

           in a reasonable response being made, providing the ANN is suitably trained

           (Hawley, Johnson and Raina, 1990).

        • Adaptability - Training can occur during the ANN’s in-service lifetime,

           allowing the ANN to remain up to date.

        Hawley, Johnson and Raina (1990) comment on the fact that ANNs, by the

general purpose nature of their structure, are faster to install and maintain than custom

built KBS. Training of ANNs, though time consuming, need not be as technically

difficult (and therefore as expensive) as writing the program structure of a KBS.

        ANNs, due to the way in which the relationship of weights is formed, are not

prone to ‘crashing’ as a result of incomplete or inaccurate data but are often said to

degrade gracefully over time as the weight values alter. This is often cited as an

advantage, but it should be borne in mind that something which has happened

progressively over time may not be noticed until damage has occurred, and without an

adequate control process, may not be appreciated at that point either.

Disadvantages:

Hammerstrom (1993):

        • They may fail to produce a satisfactory solution because of insufficient data



[for training] or because no learnable (sic) function exists.

       • They may produce results from a complex machine learning procedure that

          has no straightforward cause and effect origin that can be easily explained.

       • They can be slow and expensive to train.

       • ANNs’ computational speed, in the finished application, depends linearly on

          the number of connections and, roughly [approximately], the square of the

          number of nodes.

To this list of disadvantages should be added ANNs’ most criticised fault -

       • ANNs are not capable of demonstrating the logical reasoning behind the

          result obtained - the black box approach (Farrar, Tucker and Bugmann,

          1997, Hopgood, 1993, Medsker, Turban and Trippi, 1996 and Tucker and

          Farrar, 1996).
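
Hammerstrom's point about the number of connections can be illustrated with a short count for a fully connected feedforward network such as the 3-4-4-2 example in Chapter 3. The function name `count_connections` is hypothetical and bias terms are ignored in this sketch.

```python
def count_connections(sizes):
    """Connections between successive, fully connected layers."""
    return sum(a * b for a, b in zip(sizes, sizes[1:]))

small = [3, 4, 4, 2]
large = [3, 8, 8, 2]    # the same network with twice the hidden nodes

print(sum(small), count_connections(small))   # 13 nodes, 36 connections
print(sum(large), count_connections(large))   # 21 nodes, 104 connections
```

Doubling the hidden nodes almost triples the number of connections in this case, which is why the work required grows much faster than the node count alone would suggest.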

A problem recognised by Tucker (1996) and Tucker and Farrar (1996) is the relative

‘youth’ of ANNs as a decision making technique. Conventional decision making

techniques have the advantage of many years of testing, both theoretically and in

practice, which have produced models that are both recognised and accepted.

        ANNs, by the very fact that they are relatively new and underdeveloped, do

not have the weight of past experience to promote their results.          However, as

development within the field is progressing, refining of the technique and empirical

evidence should produce an improved methodology and increased statistical evidence

of accuracy.

Current application of ANNs:

       Hawley, Johnson and Raina (1990) provide a comprehensive discussion of

ANN applications in finance from which the following is adapted:

Corporate Finance Applications:


Financial Simulation: Whilst the financial management tasks of a company can be

divided into various smaller and more manageable segments, the complexity of the

company’s internal and external environment is often misrepresented in these

simplified models. ANNs provide a means of linking all segments together during

analysis; they can be tailored to an individual company and are capable of being

dynamic and responsive to change (Donaldson, 1996).

        ANNs can be used for credit customer behaviour modelling, planning bad

debt expenses, planning the cyclical expansion and contraction of accounts,

evaluating credit terms and limits, cash management, evaluation of capital

investments, asset and personal risk management (insurance), exchange risk

management, and the prediction of credit cost and availability based on a company’s

financial data. Hawley et al (1990) claim that this area of ANN application offers,

perhaps, the greatest potential for ANN business application.

Prediction: In determining a new policy, direction or product, organisations need to

determine the reaction this choice will create in both present and future investors and

the subsequent effect this will have on their investment decisions.                  Whilst

conventional decision making techniques are often more cost efficient at solving

problems which have well-identified theoretical underpinning, the problem of investor

behaviour and sentiment is of a complexity more suited to ANNs.

        Investors often base their decisions on a wide range of issues and information

concerning both the company and the broader economic environment. It is possible to

train ANNs to mimic the behaviour of investors (using actual investors as training

models) and then determine the effect alterations to company policy and financial

position have on their investment decisions. ANNs offer the opportunity to incorporate

a wide range of input and output information enabling the decision maker to gauge

reaction to change in ways other than alterations in stock price alone (Hawley et al,

1990).

Evaluation: Accurately valuing a target company’s net worth before attempting an

acquisition increases the probability of success, both in terms of acquisition outcome

and with regard to eventual profit. ANNs are trained, in this instance, by exposure to

training sets of target company data, as input, and human expert value estimate as

response. This use of ANNs seeks to copy the behaviour of individuals, including the

incorporation of human “hunches” and intuition, which would make the use of

conventional decision support programming difficult or impossible.

         ANNs have been used successfully in a wide range of evaluation problems and

confer advantages including: screening large numbers of companies for potential

undervalue or other form of acquisition attractiveness to minimise decision makers’

time (they then need only look at “ideal” companies); the ability to copy the

interpretations of a wide range of decision makers; and the ability to automatically

adjust to the decision maker’s changing analytical procedures and selection criteria

over time (Hawley et al, 1990).

Credit Approval:     Using a similar training approach to the company evaluation

method, detailed above, ANNs are capable of reducing time and labour by mimicking

the decisions of financial staff in both credit approval and credit limit decisions. In

addition ANNs are able to interpret a wider range of financial statements (providing

they are trained on a wide range) more quickly than their human counterparts,

negating the need for the information to be restated in a standardised form (Jenson,

1992 and Marose, 1990).

Financial Institutions Applications:

Assessing Lending/Bankruptcy Risk: ANNs can provide expert opinion on loans and

lending arrangements to financial institutions in a similar manner to the example of

credit approval discussed above.


Security/Asset Portfolio Management: Due to the unstructured nature of the portfolio

manager’s decision making process and the diversity of information involved ANNs

offer advantages over conventional decision making techniques (Hawley et al, 1990).

Pricing Initial Public Offerings (of ordinary shares): Determining the issue price of

ordinary shares is a complicated process but one which it is essential to optimise

(Brett, 1991). The information is often diverse and of a non-standard format and so

the application of ANNs confers advantages, not found in conventional decision

making techniques, through their ability to generalise and their lack of reliance on an

explicit rule base.

Professional Investors Applications:

Identification of Arbitrage Opportunities: By replicating an expert decision maker’s

reasoning process, a process he or she may not be able to articulate, the ANN is able

to assist in the identification of companies which are about to become victims of a

hostile take-over (and thus allowing purchasing of the company’s shares to be

initiated). ANNs offer advantages in their ability to screen large numbers of potential

targets, thus giving the arbitrageur a smaller workload.

The Technical Analyst18: ANNs’ pattern recognising abilities enable the patterns

(hitherto un-calculable) within stock markets to be emulated. Through this evaluating

ability, more accurate predictions of share price movements can be derived

(Davidson, 1996).

The Fundamental Analyst: Industry norm patterns, market conditions and financial

statements can be used to train ANNs to assist in share purchasing in a way similar to

the technical analyst model.

Summary:
           The advantage which ANNs confer over and above more traditional decision

18 Influences on share price which are not related to company trading position.


making/support techniques is their ability to discern patterns in large volumes of

data through a process of self-learning as opposed to explicit instruction.    This

process enables ANNs to discover patterns or relationships which may have been

overlooked or given too great an emphasis in existing decision support mathematics.

ANNs’ ability to identify patterns also enables network recognition of variations in

handwriting, voice, and image recognition, and provides opportunities for a wide

range of security applications.

       Currently the use of ANNs in business is predominantly within bankruptcy

prediction and financial risk assessment.        However ANNs have been used

successfully in a wide range of operations management, marketing (data mining) and

personnel applications.

       The use of ANNs has been shown to offer increased accuracy (Farrar, Tucker

and Bugmann, 1997) and, in many instances, ANNs are one of the few methods currently

available (e.g. handwriting and voice recognition). The ability to deal more easily

than conventional methods with non-linearity (Waldrop, 1992) gives the user an

advantage in the highly non-linear markets in which business operates (Cuthbertson

and Gripaios, 1996).

       ANNs “operate by a logic known only to themselves” (The Economist, 1995).

The most difficult obstacle to overcome in the promotion of ANNs as a decision

making tool is their lack of interpretability.
Bryan Mills 1997                                                                                       Chapter 5 - Schema




     Chapter 5 - Schema for the assessment of the
              suitability of ANNs for given problem
Introduction:

           Whilst ANNs are an extremely powerful tool, their application is not suited to

every problem. For reasons such as cost, unavailability of data and form of data there

are certain problems which are better suited to other forms of decision support. In

order to maximise the benefit gained from ANN application a process of

problem/method pre-selection is required.

           To facilitate the matching of problem [19] and solution, a schema has been

developed. By answering a series of relatively simple questions, the user is able to

determine whether ANNs are suitable for the problem they are attempting to solve. The

reasons for and against the use of ANNs for a given problem are also discussed and,

should ANNs not be appropriate, other methods which may prove more suitable are

suggested.

Schema:

           Diagram 13 (page 46) represents the schema in the form of a flow-chart. By

following the series of questions, labelled 1 to 10, the user is able to determine

whether ANNs are suitable for a given problem and any problems which may be

encountered in their application. The dotted lines and boxes represent complementary

advice; the solid lines and boxes represent questions, flow and conclusions. The

flow-chart is also described in full in text form.

           The schema is also available as a computer program. Instructions for its use

are contained in Appendix 3 and the code listings in Appendix 4.

[19] The schema concentrates on the use of ANNs for decision-making purposes. Full systems for factory control etc. can cost in the
region of £25,000 and would require more detailed analysis than is possible within a general-purpose flowchart (Horridge, 1997).


This program enables an approach which is more dynamic, multidimensional

and, above all, simpler for the user than is possible on paper alone. The program

(called Net Solver) enables the user to determine whether ANNs are suitable for a

given problem. To make this possible the program records the responses made by the

user to a series of questions. These responses enable the program to calculate the

suitability of the problem/decision for ANN use. The result is displayed both as a

percentage and as a ‘progress-bar’ of the sort used within the Windows [20] environment

to show elapsed time. In addition to this result the responses are reiterated, to allow

the user to check that she/he has not made any errors. If the user has entered a project

name this will also be displayed. To enable the user to determine the next step in the

application of ANNs, advice is given where it is thought appropriate. The user then

has the option of printing the results, advice and details (see Appendix 5 - Sample

Output).
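
          The calculation described above might be sketched as follows. This is an
illustration only and is not the Net Solver code (which was written in Visual Basic and
is listed in Appendix 4): it is written in Python, the scoring rule (a weighted share of
'yes' answers) is an assumption, and the questions, weights and function names are
hypothetical, loosely based on the cost, availability of data and form of data issues
raised in the introduction to this chapter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Question:
    text: str      # wording shown to the user (hypothetical)
    weight: float  # assumed importance of a 'yes' answer (hypothetical)


# Hypothetical questions; the real schema uses ten questions not reproduced here.
QUESTIONS = [
    Question("Is a sufficient volume of past data available?", 2.0),
    Question("Is the data in (or convertible to) numerical form?", 1.0),
    Question("Is the budget adequate for software and training?", 1.0),
]


def suitability(questions, answers):
    """Return the weighted share of 'yes' answers as a percentage (0 to 100)."""
    if len(questions) != len(answers):
        raise ValueError("one answer is required per question")
    total = sum(q.weight for q in questions)
    if total <= 0:
        raise ValueError("total weight must be positive")
    score = sum(q.weight for q, yes in zip(questions, answers) if yes)
    return 100.0 * score / total


def progress_bar(percent, width=20):
    """Render the percentage as a plain-text bar of the given width."""
    filled = round(width * percent / 100)
    return "[" + "#" * filled + "-" * (width - filled) + "]"


if __name__ == "__main__":
    answers = [True, False, True]
    result = suitability(QUESTIONS, answers)
    # Reiterate the responses so the user can check them, then show the result.
    for question, yes in zip(QUESTIONS, answers):
        print(f"{question.text} {'Yes' if yes else 'No'}")
    print(f"Suitability: {result:.0f}% {progress_bar(result)}")
```

          Weighting the questions, rather than simply counting 'yes' answers, allows a
critical factor such as the availability of data to count for more than a minor one. The
weights shown are placeholders and would need to be set from the reasoning given in the
schema itself.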

          Throughout the program a fictitious company and telephone number are

mentioned (ABC ANNs on (0110)111222) at points where the user may need

additional advice. It is intended that this program could, with further development, be

used in one of the following ways:

    •   Distributed free of charge (e.g. via the Internet or computer magazine disks) by
        a software/consultancy company, replacing ABC ANNs with its own name and
        contact number as part of an advertising campaign.
    •   Incorporated into ANN software as an introduction.
    •   Sold as consultancy software.
    •   Used within education as a teaching aid.
    •   Distributed free of charge, via the Internet, as a philanthropic act.

          The program was written using Microsoft Visual Basic Version 4 Professional

(VB4), under the Microsoft Windows 95 operating environment, on a PC equipped

with a 486/66 processor, 8 MB of RAM and 200 MB of spare hard disk space (see
[20] Microsoft, Windows and Visual Basic are all registered trademarks of the Microsoft Corporation.


Business Dissertation Thesis

  • 1. The Application of Artificial Neural Networks to Business Problems Bryan Mills University of Plymouth Business School (Franchised to Cornwall College) Honours project submitted as partial fulfilment for the degree of BA Honours in Business Administration Supervisor: Dr Jon Tucker 14th May 1997
  • 2. Acknowledgements: Firstly I would like to thank my supervisor Dr Tucker for his patience and generosity and, in addition, to acknowledge the contribution he has made to this dissertation. I would also like to thank Dave Ager, Jill Ferret and Mike Trennary for their tolerance and encouragement during both the dissertation and the degree programme. I would like to acknowledge the encouragement I have received throughout the degree from Buzz Banks, Helen Cobbin and Ken Waller. Also, I would like to take the opportunity to thank Paul Ingram for his uncompromising and contagious obsession with academia and Tony Butt for first introducing me to Chaos Theory and non-linearity.
  • 3. Bryan Mills 1997 Abstract: Artificial Neural Networks (ANNs) provide a powerful information technology based tool for decision making purposes. However, present literature on the subject is often found to be either inaccessible or of limited relevance to (general) business application. In this report ANNs are described in a more intuitive manner than found within much of the existing literature. Emphasis is placed upon the use of ANNs within the business environment, although the study still provides an introduction for wider application. Misconceptions surrounding ANNs, and Artificial Intelligence in general, are explored and recommendations are made with a view to their resolution. The advantages and disadvantages of ANNs are discussed and present applications are listed with a view to demonstrating the various application possibilities of ANNs. To enable wider application of ANNs within business, and to reduce misguided application, a schema has been developed. This schema, which has been developed as both a flowchart and a computer program, allows the potential ANN user to critically appraise the use of ANNs for a given decision making problem.
  • 4. Contents:
Acknowledgements
Abstract
List of Diagrams
List of Tables
Glossary of Terms
Chapter 1 - Introduction
    General Introduction
    Popular Misconceptions Concerning Neural Networks
Chapter 2 - Discussion of Aims, Methodology and Research Philosophy
    Aim
    Objectives
    Benefit of the project to industry and commerce
    The growth of research in the neural area
    Methodology and Approach
    Schema development
Chapter 3 - Explanation of the Fundamental Concepts of ANNs
    Introduction
    An outline explanation of the fundamental concepts of Artificial Neural Networks
    Knowledge Based Systems
    The difference between Artificial Neural Networks and Conventional Knowledge Based Systems
    Explanation of the operation of ANNs
    First Principles
        Knowledge Based Systems
  • 5. Contents (continued):
    Rule Based System
    Artificial Neural Networks
        Overview
        Components
        Nodes
        Weights and bias terms
        Generalisation
        Choice of mapping or activator function
        Data pre-processing
        Training
        Topology
    The Multilayer Perceptron - an example of supervised learning/training
    The Kohonen self organising net - an example of unsupervised learning/training
    Summary
Chapter 4 - Investigation into advantages, disadvantages and current application of ANNs
    Introduction
    Advantages and disadvantages
    Current application of ANNs
    Summary
Chapter 5 - Schema for the assessment of the suitability of ANNs for given problem
    Introduction
    Schema
    Explanation of Schema
    Summary
Chapter 6 - Conclusions and Recommendations
    Conclusion
    Limitations
    Further Research
Appendix 1 - Example of Training Process
  • 6. Contents (continued):
Appendix 2 - Bayesian Updating
Appendix 3 - Instructions for Running The Computer Program
Appendix 4 - Computer Code-list
Appendix 5 - Sample Output
Appendix 6 - Visual Basic as a Programming Language
Bibliography
List of Diagrams:
    Diagram 1 - Patent Activity
    Diagram 2 - Methodology
    Diagram 3 - Knowledge Based System
    Diagram 4 - Single Neuron Calculation
    Diagram 5 - Representation of a Neuron
    Diagram 6 - Screen dump of a text file for use in WinNN
    Diagram 7 - Class Membership
    Diagram 8 - Universe of objects
    Diagram 9 - Sigmoid Function
    Diagram 10 - The Multilayer Perceptron
    Diagram 11 - Kohonen Self Organising Feature Map
    Diagram 12 - The operation of ANNs - flow diagram
    Diagram 13 - Schema
List of Tables:
    Table 1 - Sample Problem
    Table 2 - Simplified weight method
    Table 5 - Input file explanation
    Table 4 - Sigmoid values
    Table 5 - Data pre-processing
  • 7. Glossary of terms:
Activator function - An equation (mapping function) which describes a neuron’s internal state as the total of its weighted inputs: $net = \sum_i x_i w_i - \theta$, where $x_i$ is an input, $w_i$ is the corresponding weight and $\theta$ is the bias term.
Algorithm - A procedure or series of steps used to solve a problem.
Autoassociative - Mapping the original pattern from noisy or incomplete data.
Backpropagation - An algorithm which compares results with expected answers and then passes the difference back through the network to facilitate weight adjustment.
Bias term - A systematic error ($\theta$) introduced to each node independently to allow control over the otherwise independent node output.
Cell - A neuron.
Database - In this instance, a set of facts (data) stored within a computer system.
Dependent variable - A variable which will be altered or created by the change in value of one or more independent variables. Normally shown on the left hand side (LHS) of an equation.
EPOS - Electronic point of sale - the computer connection between cash-tills and the central computer within a retail store.
EPS - Earnings per share (accountancy measure).
Front-end subsystem - A computer program designed to simplify (humanise) the input and output of data.
Fuzzy - A set whose members belong to it to some degree. In contrast, a standard set contains its members either all or none (Kosko, 1993).
Function - A rule which maps an element of one set onto an element of another set; sales level could be said to be a function of demand.
  • 8. Generalise - The ability to identify a wide range of objects, patterns etc. from a minimal set of key descriptive data.
Heteroassociative - Mapping an input pattern set to a different output pattern set.
Hyperplane - A plot involving more than 3 dimensions and therefore difficult to represent graphically.
Independent variable - A variable which will alter or create the change in value of one or more dependent variables. Normally shown on the right hand side (RHS) of an equation.
Inference engine - The part of a knowledge based system’s programming which deduces results from given facts/data.
Knowledge based system - The separation of data and control (algorithms), allowing the computer to respond to a series of differing inputs by calling on a library of information (knowledge base) as opposed to altering variables contained explicitly within the program’s structure.
Mapping function - A rule linking the elements of one set to those of another; usually shown as F:x→y, the function which maps x onto y.
Multivariable - Containing a large number of independent variables.
Network - A collection of interconnected nodes forming a topology.
Neuron - A single activator function, a processing element, a mapping function through which variables must pass, a calculation point.
Nodes - Neurons.
Non-linearity - Equations containing powers, roots, trigonometric or logarithmic functions.
  • 9. Normalisation - A form of data pre-processing which seeks to give all inputs/outputs a commonality by constraining their values to within a pre-determined range.
Pre-processing - Alterations to data before use (normalisation, removal of outliers, ratio splitting). Usually conducted with the intention of increasing the network’s efficiency or converting non-numeric data to numeric.
Propositional logic/calculus - A step by step inference system for determining whether a given proposition is true or false. There are various forms of propositional logic (modus ponens, modus tollens, denial of antecedent etc.), but all are based on derivations of: if x is true then y must be true/false; if and only if x is true then y is true/false; etc. (Eysenck and Keane, 1995).
Ratio splitting - Using the component parts of a ratio separately as opposed to using the result (GP Margin = GP/Sales; use GP and Sales as inputs, not GP Margin).
Real-time - The collection and processing of data as events occur, as opposed to the use of historic data. EPOS works in real-time.
ROCE - Return on capital employed (accountancy measure).
Set - A collection of elements defined by a rule which makes them separable from other sets - e.g. men and women are two separate sets (separated by sex) but are also within the common set of humans (separated from other animal forms by species).
Sigmoid - A common ANN activator function. An equation which reduces any summed input to an answer between 0 and 1 (tending towards, but never reaching, either 0 or 1), generally given as $f_{nonlinear}(x) = \frac{1}{1 + e^{-x}}$, where x is the summed input and e is the mathematical constant that is the base of natural logarithms (2.71828...).
  • 10. Topology - In this instance, an attempt to graphically represent the interconnection of nodes within the network. Topology is often one of the key distinguishing features separating different ANNs (others being training method and activator function).
Training method - As ANNs self learn by exposure to data, it is necessary to have an algorithm which allows the ANN to distinguish between correct and incorrect responses. This may be supervised (told when incorrect and what the output should have been), unsupervised (self learning pattern recognition) or reinforced (told simply if correct or incorrect).
Training set - A collection of data used to train the ANN, usually separated into a training set and a hold out or test set.
Vector - A quantity which has both magnitude and direction. An ANN’s input consists of a one dimensional array of differing x values, combined in the form $x_1 w_1 + x_2 w_2 + x_3 w_3 + \dots + x_n w_n$, where x indicates an input and w indicates a weight.
Weights - Values which are altered by the ANN to enable the emphasis of the variable upon which each acts to be either strengthened or weakened. A variable coefficient which determines the strength of an input’s effect on the output.
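The glossary entries for the activator function, bias term, weights and sigmoid can be tied together in a short worked example. The following is only a minimal sketch, not the dissertation's own program (which the contents list describes as written in Visual Basic): the function names `net_input`, `sigmoid` and `neuron_output`, and all of the numeric values, are assumptions made for illustration.

```python
import math


def net_input(inputs, weights, theta):
    """Internal state of a neuron as defined in the glossary:
    net = sum(x_i * w_i) - theta, where theta is the bias term."""
    return sum(x * w for x, w in zip(inputs, weights)) - theta


def sigmoid(x):
    """Sigmoid activator function: 1 / (1 + e^-x).
    The result always lies strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


def neuron_output(inputs, weights, theta):
    """Output of a single neuron: the sigmoid of its net input."""
    return sigmoid(net_input(inputs, weights, theta))


# Hypothetical values, chosen purely for illustration.
inputs = [0.5, 0.2, 0.9]    # x1, x2, x3
weights = [0.4, -0.6, 0.3]  # w1, w2, w3
theta = 0.1                 # bias term

print("net    =", net_input(inputs, weights, theta))
print("output =", neuron_output(inputs, weights, theta))
```

With these assumed values the weighted sum is 0.2 - 0.12 + 0.27 = 0.35, so the net input after subtracting the bias term is 0.25, and the sigmoid maps this to a value a little above 0.5. A larger positive weight strengthens the effect of its input on the output, and a negative weight weakens it.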
  • 11. Chapter 1 - Introduction
General Introduction
Business involves a complex mix of people, policy and technology, and exists within the constraints of economics and society (Clifton and Sutcliffe, 1994). It is often the precise way in which these items are mixed that creates either success or failure for an organisation. This presents the manager with two key tasks: the efficient collection, and the analysis, of all relevant information. From this analysis the manager will be able to formulate strategies, define objectives and implement plans for their fruition. The provision and analysis of information within business is often referred to as the decision support process, and the methodology adopted as decision support systems (DSS).
Business decisions can often be viewed as the solution of various mathematical problems. Whether it be determining the price level of a product, the benefit of expansion into a new market, staff levels or the probability of a project failing, mathematics usually plays a role. In fact, given the overriding objective of “maximising shareholders’ wealth” (McLaney, 1994) found within all profit making organisations, and given that wealth/profit is measured numerically, it would be difficult, if not impossible, to view the organisation meaningfully in any other way [1].
One of the key problems in any decision is the availability and cost of “perfect information”. Given perfect information (all the facts concerning a decision, with complete confidence in these predictions being correct) there would be little for the business manager to decide; it would simply be a choice of the project which maximised overall contribution [2].
[1] Non-profit making organisations seek cost efficiency - another mathematical measure.
  • 12. Most decisions, however, are not based on perfect information. This is generally due to a combination of the prohibitive cost of gathering such information, the availability of information and the intrinsic unpredictability and complexity of the markets in which business operates.
Ongoing developments in the field of Information Technology have enabled the gathering, storage and retrieval of much larger quantities of information than was previously possible. Stock markets can be observed in real-time, supermarkets know the exact quantities of goods on their shelves (via Electronic Point of Sale (EPOS)) and their customers’ weekly shopping lists (via loyalty cards), and companies can measure the exact output of machines on the shop floor (via Computer Aided Manufacturing). This information is, however, worth only as much as the gain derived from its ownership. To be able to quote a share price or stock level is fine, but the information has already become historic. What is required in decision making is a means by which to identify patterns and trends in the large volumes of data currently available, and to increase the confidence in the predictability of this data to an acceptable level.
The capability of Artificial Neural Network (ANN) models to recognise patterns and trends in large volumes of data has meant that they are being increasingly used for a variety of industrial/commercial applications. ANNs are a form of computer software which took their original inspiration (McCulloch and Pitts, 1943) from man’s limited understanding of the workings of the human brain. Research has been carried out in this area for two broad reasons. The first, and original, was an attempt to model the human brain electronically to develop a greater understanding of its operation.
The second, and most relevant in this instance, is the development of these models as a mathematical tool for studying patterns and relationships in data.
[2] Overall Contribution - the manager would consider the organisation’s other ventures, market share, market growth and long term survival in his/her decision.
  • 13. The mathematics which form these models are particularly useful when dealing with non-linear problems (problems which cannot be graphed by use of a straight line), of which there are numerous examples in business: demand/price, production level/cost, share price/ROCE/EPS. In each case a change in the independent variable (e.g. price) does not guarantee a proportionate change in the dependent variable (e.g. demand). ANNs are also capable of dealing with dependent variables which may have several variables acting on them (e.g. interest rates, inflation and estimation of risk in a cost of capital calculation), where the relationship between each is both theoretically appreciated and explainable but not easily converted into an equation or algorithm (Klimasauskas, 1991 and Scocken, 1994). It is the ability to deal with non-linearity, multivariables and large volumes of data which gives ANNs what are perhaps their most impressive features - pattern recognition and self learning.
ANNs receive their information (their knowledge) via a process of training. Sets of data and desired results are passed through the network until the computer is able to create, to a reasonable degree of accuracy, the desired result. This is made possible by the network’s ability to generalise the training data presented to it and form an output, given new inputs, based on this generalisation. Once this training stage is complete a problem (independent variables) can be input and a result (dependent variable) is generated.
Current applications of ANNs include, amongst others: stock and money market forecasting (Trippi and Turban, 1996), face and handwriting recognition (Rogers, Kabrisky, Ruck and Oxely, 1994), recognising whether station platforms are busy or not, missile direction systems, voice recognition, voice control of computers, data mining (Wiggins, 1994), industrial signal processing (Wiggins, 1994), modelling of traffic flow (Recker, 1995), human resource management (redundancy selection) (Coit, 1996), new product feasibility studies (Madu, 1995), risk evaluation, chemical analysis, weather forecasting and resource management (Davalo and Naïm, 1991), a complement to business decision support systems (Scocken, 1994), operations quality control (Horridge, 1997) and the processing of marketing data.
  • 14. Popular Misconceptions Concerning Neural Networks:
The subject of Artificial Neural Networks (ANNs) is an example of a name not being self explanatory. The description ‘Artificial Neural Network’ is a misnomer; it suggests an artificial representation of the human mind (it being composed of a network of neurons). Exciting though the creation of an ‘artificial mind’ would be, the ANNs currently in operation are little more than computer programs capable of doing clever ‘sums’. The cleverness of these ‘sums’, however, is not to be taken lightly. Systems have been developed which are able to identify patterns in very large samples of data and to calculate relationships between data where conventional mathematics would have been inadequate for practical application, and which represent a very strong possibility of developing systems better suited to understanding our own fuzzy [3] world.
As a subject, ANNs are fairly inaccessible and fraught with misconceptions. The subject is clouded by two separate, but interrelated, explanations, and this difficulty is further compounded by the absence of accessible knowledge. On the one hand there are the works of various academics and academic institutions. On the other is the general public’s [4] understanding of what ANNs represent - if they are aware of them at all.
[3] Fuzzy - e.g. language: hot, warm and cold mean different temperatures to different people, and the boundary between hot and warm (for example) is not clear (is 18 degrees warm, 19 degrees hot, and just as hot as 28 degrees?).
[4] Used here simply to describe those outside of the fields of Mathematics, Computing and Psychology - not intended to be in any way derogatory.
  • 15. The public’s understanding stems mainly from the world of science fiction and ‘popular’ science programmes. It is a world of Arthur C. Clarke’s HAL (2001 etc.) and Philip K. Dick’s Bladerunner [5] - thinking machines which inevitably turn on their creators, with devastating results [6]. This understanding is not assisted by the anthropomorphic nature of the language surrounding ANNs and the willingness of some academics to emphasise this definition (for example Professor Aleksander, Imperial College London: “Magnus [a computer program] has a mind of its own” (Millar [7], 1996)). The use of words such as ‘thinking’, ‘neuron’ and ‘understanding’ all point towards machines which, eventually, may replicate the human thought process to the point of being conscious. The reality of the situation is quite different: at present computers can represent little more than a few thousand neurons, compared to 10,000 in a cockroach’s brain and 100 billion in a human’s (The Economist, 1995).
The academic world often uses anthropomorphic terms to overcome some of the limitations of language and the mathematical nature of more correct descriptions. For example, in the development of a computer system to control the heating, lighting and ventilation of an office building one may be tempted to use expressions such as “to develop a system which is aware of its environment”. However, use of the word ‘aware’ may suggest consciousness, and use of ‘its environment’, as opposed to ‘the environment in which it operates’, could suggest ownership and, therefore, existence beyond being an object. The difficulty stems from the absence of a more correct, and equally convenient, shorthand.
The alternative - “to develop a system which constantly monitors the surrounding environment and compares this information with a pre-programmed set of ‘ideal’ conditions” - explains the process with a reduced likelihood of confusion, but is not necessarily more accurate.
[5] More correctly, the original book was called Do Androids Dream of Electric Sheep?
[6] The defence analyst and writer Warwick Collins has gone so far as to call on the government to restrict the human attributes scientists can give programmes/machines (Millar, 1996, The Guardian Newspaper, 17/12/96, page 4, eighth paragraph).
[7] The Guardian Newspaper, 17/12/96, page 4, second paragraph.
  • 16. The reader’s frame of reference provides the key to which language would be more appropriate. The use of such terminology creates few problems within the field, because the level of understanding is such that the words used often have two separate meanings - the computer related meaning and the human related meaning. For example:
Neuron -
• Human related meaning - a cell which responds to various inputs by producing responses - a processing unit.
• Computer related meaning - a part of an ANN computer program which performs a calculation - a processing unit.
The definitions are similar and would appear to suggest that, if a significant number of ‘computer neurons’ were assembled, a human brain could be replicated. Whilst this formed the inspiration behind some of the early research in the field (for example Rosenblatt 1958, 1961), modern theory points to a level of complication within the human brain which makes the early optimism seem naive at best. A more comprehensive discussion of matters of human and machine consciousness is found in Penrose (1988), The Emperor’s New Mind, and Penrose (1994), Shadows of the Mind.
This thesis is intended to explain Artificial Neural Networks in such a way as to reduce some of the confusion which often surrounds the topic. In addition it is intended to simplify the application of ANNs (to a given problem) by the development of a schema (both paper based and as a computer program). This schema will greatly simplify the choice faced by the manager when considering which mathematical tools to use in decision, classification and control problems.
  • 17. To enable the full value of this schema to be realised, the thesis begins with a comprehensive review and simplification of existing literature. As previously discussed, the confusion stems from three broad areas - media hype, anthropomorphic descriptions and texts aimed at a specialist (scientific) reader - and it is intended that this thesis will contribute towards redressing this balance.
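The training process outlined in the General Introduction (sets of data and desired results are passed through the network repeatedly until its output is acceptably close to the desired result) can be illustrated with a single neuron. The sketch below is an assumption-laden simplification rather than the procedure used in the dissertation: it applies a basic gradient (delta rule) update to one sigmoid neuron, which is not full backpropagation through a multilayer network, and the names `train_neuron` and `predict`, the learning rate, the number of passes and the toy logical-OR training set are all hypothetical.

```python
import math


def sigmoid(x):
    """Sigmoid activator function: 1 / (1 + e^-x)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict(inputs, weights, theta):
    """Output of one neuron: sigmoid(sum(x_i * w_i) - theta)."""
    return sigmoid(sum(x * w for x, w in zip(inputs, weights)) - theta)


def train_neuron(training_set, epochs=5000, rate=0.5):
    """Repeatedly pass (inputs, desired result) pairs through the neuron,
    nudging the weights and bias term to reduce the squared error."""
    weights = [0.0] * len(training_set[0][0])
    theta = 0.0
    for _ in range(epochs):
        for inputs, desired in training_set:
            out = predict(inputs, weights, theta)
            # Error scaled by the slope of the sigmoid at this output.
            delta = (desired - out) * out * (1.0 - out)
            weights = [w + rate * delta * x for w, x in zip(weights, inputs)]
            # The bias term is subtracted in the net input, so it moves
            # in the opposite direction to the weights.
            theta -= rate * delta
    return weights, theta


# Hypothetical training set: the logical OR of two inputs.
training_set = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
weights, theta = train_neuron(training_set)

for inputs, desired in training_set:
    print(inputs, "desired:", desired,
          "output:", round(predict(inputs, weights, theta), 3))
```

Once training is complete the weights are fixed, and a new set of inputs (independent variables) can be presented to `predict` to generate a result (dependent variable). A single neuron of this kind can only separate cases with a straight line; the non-linear problems discussed above call for networks of several neurons.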
  • 18. Chapter 2 - Discussion of Aims, Methodology and Research Philosophy Aim: This study aims to develop a level of understanding from which the business manager (who is unlikely to be an IT specialist) can establish the relative merits/demerits of the ANN technique for business decision support analysis. The project aims to make inroads into some of the more accessible academic texts with a view to creating a more intuitive guide to ANN use aimed at the business manager and student. To aid this explanation a schema or system will be developed whereby the reader can assess the suitability of ANNs for a problem they wish to solve. To assist in the discussion on the suitability of ANNs for given problems there will also be an assessment of current uses and the advantages and disadvantages that application presents. Objectives: 1) To conduct a literature review of the fundamental concepts underling ANNs. 2) To examine the existing use of ANNs. 3) To develop a system to enable problems to be assessed for the suitability of ANN application. Benefit of the project to industry and commerce: Progress in the development of ANNs is closely tied to the development of computer equipment. It is only within the past 5 years that computing power has become cheap enough to make ANN use a viable possibility. However ANNs have remained in the exclusive domain of the scientist and mathematician for the past 45 18
  • 19. years and there are few accessible texts for the non-specialist. The field of ANNs contains possible solutions to business problems not fully addressed by present mathematical techniques (Tucker, 1997). As Gleick (1993) and Waldrop (1992) have commented, non-linear patterns are rife in the enormous volumes of information produced by industry and commerce (e.g. the financial pages, actuarial data, market research responses). ANNs enable the user to analyse this data more accurately than traditional problem solving techniques, which makes them a source of commercial advantage in many industrial sectors.
    The growth of research in the neural area: The field of ANNs is expanding at an amazing rate. The expansion of the subject is closely linked to technological developments in the IT field. As this area continues to develop there will be an increasing expansion of opportunities in the field of ANNs (Medsker, Turban and Trippi, 1996). [Footnote 8: Moore's Law suggests a doubling of the number of transistors on a chip every 18 to 24 months (J. Schofield, 1996, The Guardian Newspaper (31/10/96), page 3, Online Section).] Funding of research within the field of ANNs is continuing, with the Japanese government having budgeted $250 million over the next 10 years and the US government having pledged research funding of $400 million over the next 6 years (The Economist, 1995).
    [Diagram 1 - Patent Activity: chart of patents registered in the USA by year, 1986 to 1992; vertical axis: Number (0 to 300); series: Combined, Comp. Int, ANN]
  • 20. Diagram 1 shows patent activity in the USA for the years 1986-92. It can be seen from the graph that the growth of work within this field is almost exponential. It is also important to note that the full extent of the application of ANNs within business (particularly finance) has yet to be realised (Farrar, Tucker and Bugmann, 1997).
    Methodology and Approach: The project is based mainly on a comprehensive literature survey and review of texts within the field of ANNs. The literature search was conducted in the first instance to develop a clear understanding of the subject. From this, a succinct explanation of the concepts underpinning artificial neural networks, aimed at business managers, has been produced. The greater understanding engendered by the literature research provides the basis for an analysis of the advantages and disadvantages of the use of ANNs and forms the foundation of the schema development.
    [Diagram 2 - Methodology]
  • 21. The above chart (Diagram 2) represents the flow of tasks from the development of the original synopsis to the conclusions and recommendations.
    Schema development: The schema, which forms the most pragmatic part of the thesis, was developed from the literature research. The schema seeks to answer the question “Do ANNs offer a realistic solution to a given problem?”. As ANNs are capable of dealing with a variety of problems, and as the business community usually has a variety of different problems under review, it is intended that the schema will be general in its approach, whilst maintaining effectiveness and accuracy. The schema is developed both as a flow-chart and as a computer program. By establishing the specific data and training requirements of ANNs it is possible to construct a series of non-technical questions which the manager can consider concerning the problem under review. The schema follows the flow of the responses and culminates in a suggestion for further action. The reasons for the suggested actions are explained, allowing the manager to consider various courses of action depending on the resources available to him or her. Where appropriate the schema will suggest alternative decision making techniques which could prove more cost efficient or accurate than the use of ANNs.
  • 22. Chapter 3 - Explanation of the Fundamental Concepts of ANNs
    Introduction: This chapter seeks to place ANNs in the broader context of computer software. A highly simplified description of the workings of ANNs follows. Once this basic understanding has been established, a more detailed explanation is given, which is intended to equip the reader with a reasonable level of knowledge of the topic, sufficient for either further study or practical application.
    An outline explanation of the fundamental concepts of Artificial Neural Networks: An ANN is simply a computer program which, through the adjustment of mathematical weights, is able to create a model capable of producing results (usually in the form 1 or 0, or scaled using decimals from 0 to 1) for a given set of numeric input data, to a reasonable degree of accuracy. The network will often include a front-end subsystem (Attrasoft User’s Guide and Reference Manual, 1996) to enable both data encoding and data decoding:
    Data encoding: to convert user-application data to neural input data.
    Data decoding: to convert neural output data back to user-application data.
    ANNs can be considered as part of the larger group of computer based techniques referred to as Knowledge Based Systems.
    Knowledge Based Systems: There are numerous forms of computer system which fall under the general heading of Knowledge Based Systems (KBS). This use of computing power can be defined as:
  • 23. “a system within which data is analysed by comparison with sets of pre-obtained data by following specific rules and/or weighted relationships” (author)
    To facilitate this comparison the system will require a set (library, files, historic records) of knowledge. This knowledge is the basis upon which the system operates and can take numerous forms:
      • Financial Data - credit limit, accounting ratios, past sales figures
      • Human Resource Data - qualifications, age, experience (years)
      • Operational Data - machine failures (frequency), tolerances, re-order levels
    As can be seen from the above examples, the knowledge base is often a form of database of the sort now commonly found within most organisations. The difference between KBS and conventional databases is the level of interrogation and control which is placed within the system’s remit. As opposed to merely storing data, the system will be called upon to ‘trawl’ through the data to identify trends and patterns of behaviour, or it may use its knowledge to instigate some form of action. For example, if a bill became overdue the system could issue a reminder without the need for an operator to intervene. This is possible because the system knows today’s date, the date the last payment was made, the difference between the two and the company’s policy on ‘debtor days’. This example also indicates the limited level of understanding involved: the system’s ‘knowledge’, in this instance, can in no way be said to be knowledge in the sense that a human would know what it is to have an overdue bill.
    It becomes apparent that many modern databases are capable of achieving similar results to knowledge based systems. The difference between the conventional [footnote 9: conventional as opposed to ANNs] knowledge based system and databases is becoming increasingly subtle and is more
  • 24. an emphasis on use as opposed to structure. Most databases (Microsoft’s Access for example) are capable of interrogating data and also of issuing notification should this be required.
    The difference between Artificial Neural Networks and Conventional Knowledge Based Systems: As previously discussed, ANNs fall under the broad heading of KBS; however, it is important to recognise that there are fundamental differences between ANNs and other KBSs. Whilst a conventional KBS has the rules and relationships concerning its knowledge programmed into the system (albeit kept separate from the knowledge), ANNs develop their ‘own’ rules and relationships through a process of self learning. The self-learning abilities of ANNs are most simply explained by example. Suppose the relationship between the following sets of data was desired:
    Advertising Spend (£) | 100 | 150 | 50 | 10 | 200
    Sales (£) | 300 | 450 | 150 | 30 | 600
    Table 1 - Sample Problem
    From the above table, by dividing sales by advertising spend (or by drawing a graph), it is quite possible to estimate that sales are three times advertising spend. It is possible to estimate this figure because a) we appreciate, and could prove, a relationship between the variables, and b) there are relatively few variables, which c) enables a simple approach to the formulation of an equation (relationship). It can be appreciated that a more complex relationship may exist, which is beyond the simple approach used so far. To solve a multivariable and non-linear [footnote 10: non-linear - a relationship whose graph is a curve as opposed to a straight line, and whose equation would contain powers such as $x^2$] relationship would require the use of statistical techniques which are often
  • 25. complicated and/or rely on a degree of approximation. ANNs take a different route to establishing the relationship between variables: adjusting the values of numerical weights within an equation (function). The weights act upon the data to alter its value with the intention of producing the desired result. To enable this process to take place the system must be exposed to the data one set at a time (e.g. advertising spend of £100 and sales of £300 is the first set of data in Table 1). The computer will, in the first instance, apply a guess as to the value of the weights to be used (this starting value may well be pre-programmed or random (Hopgood, 1993)). This ‘guess’ will, almost inevitably, prove to be wrong and the system will alter the weights and retry. The first set of data will be treated as below:
    Advertising Spend (£) | Weight | Function | Result | Desired Result (£)
    100 | 1 | Spend * Weight | 100 | 300
    100 | 2 | Spend * Weight | 200 | 300
    100 | 3 | Spend * Weight | 300 | 300
    Table 2 - Simplified weight method
    It can be seen from the above that after a series of iterative steps the system was able to produce the desired result, and in our previous example this weight would be acceptable for all of the data sets. The function used in the above example is linear, as opposed to the non-linear functions used within ANNs; also, the number and relationship of the variables is simpler than would normally be encountered (for an example of the more complicated OR problem see Appendix 1). It is possible to imagine that if the relationship was more complicated and our weight of 3 proved unsuitable for the next data set it could be adjusted again and then
  • 26. re-used on both sets until a satisfactory relationship was obtained. Most ANNs have an adjustable degree of tolerance (between the ANN output and the training set’s expected result). For example, WinNN has an adjustable target error which sets the acceptable Root Mean Square (RMS) error; once the RMS error has fallen to the target, training of that net is said to be complete. [Footnote 11: RMS - the square root of the mean of a set of squared numbers.] Note that the lower the acceptable error, the more refined, and the less generalised, the net becomes. The procedure described in this simplified model could be said to represent a single neuron (processing unit, cell). To enable more complicated relationships to be developed ANNs have more than one neuron, and it is not uncommon for the results of one neuron to be the input of another. If these connections were viewed pictorially they would form a network of interconnected neurons; hence Artificial (non human) Neural (processing units) Network (interconnection of neurons).
    Explanation of the operation of ANNs: First Principles: Knowledge Based Systems: As discussed in the introduction, ANNs are a form of software that has the ability to self learn. Unlike more conventional (rule based) forms of knowledge based system, the algorithms used by the inference engine (rule interpreter) are not hard-programmed, explicit rules following the IF...THEN...ELSE pattern. Instead the program uses a series of mathematical weights to establish data relationships. To enable an understanding of the difference it is first necessary to explain the basic components within knowledge based systems. Knowledge based systems contain 3 core components: an interface with the
  • 27. user (outside world) to enable both data input (keyboard, sensors, etc.) and output (monitor, servos, printout, etc.), a knowledge base (database) and an inference engine (rule interpreter, instructions, ‘main program’). There are two other components often found within knowledge based systems: an explanation module, to enable the reasoning behind the decision made to be shown, and a knowledge acquisition module, to enable the knowledge base to be built by use of one or more of the acquisition techniques possible (Hopgood, 1993). [Footnote 12: ANNs have great difficulty in satisfying the explanation requirement; your attention is drawn to the discussion in Chapter 4.] Diagram 3 illustrates the relationship between these components:
    [Diagram 3 - Knowledge Based System]
    As shown in Diagram 3, the relationship between the components within a KBS is relatively straightforward. Information is gathered from the outside world, stored within a database and, upon a query being made, accessed to provide an answer.
    Rule Based System: A rule based system is based, fundamentally, on the IF...THEN...ELSE structure (propositional logic/calculus). The following illustrates this point:
        IF credit level is greater than pre-agreed limit
        THEN stop credit and issue reminder
        ELSE do nothing
    where the credit level is computed from inputs and the pre-agreed limit is contained
  • 28. within the knowledge base. It is both common and desirable that the information required to process the rule is contained explicitly within the knowledge base, as opposed to implicitly within the program, to enable a simpler and more robust method of updating to be used (e.g. entries in a database are changed, as opposed to altering the program’s source code) (Hopgood, 1993). Whilst it can be appreciated that this is a simplistic view of the workings of a rule based system, further developments serve only to improve and build upon this basic methodology (see, for example, Appendix 2 - Bayesian Updating).
    Artificial Neural Networks: Overview: The key difference between ANNs and conventional KBSs lies with the inference engine. As opposed to having a logic imposed on it, the network is allowed to develop its own logic by means of training, either supervised or unsupervised. Weights are used to determine the strength of relationships and there is no IF...THEN...ELSE. Instead the network decides the relevance of inputs and their interconnections based on its own experience (i.e. its training). The network consists of a selection of nodes or cells arranged in a predetermined topology. The nodes are grouped in layers: an input layer, one or more hidden layers and an output layer. Each node accepts various inputs, adjusts them via weights, adds the weighted inputs together, passes the sum through a non-linear function and outputs the result to another cell. At the output layer the result is compared with the expected answer and the difference is passed back through the network to allow weight adjustment to correct errors (backpropagation). A simplified single neuron calculation appears thus:
  • 29. [Diagram 4 - Single Neuron Calculation]
    Pictorially this can be represented thus:
    [Diagram 5 - Representation of a Neuron]
    This process is explained in detail below and would normally be performed by numerous neurons/cells/nodes within one or more layers at the same time, i.e. in parallel.
    Components: It is important to appreciate that ANNs gain their ability not from a predetermined layout or selection of weights but from the network’s ability to adjust weights and alter (strengthen/weaken) connections between nodes. Before attempting to explain the mathematics behind these interconnections an explanation of the key components of the network is required.
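    The single neuron calculation outlined above (weight each input, sum, apply a non-linear function) can be sketched in a few lines of Python. This is only an illustrative sketch and not a reproduction of Diagram 4: the function names `sigmoid` and `neuron_output`, the example inputs and weights, and the optional bias term are all assumptions made for the example, and the sigmoid is assumed as the non-linear function because it is the one most commonly cited in the literature.

```python
import math


def sigmoid(net: float) -> float:
    """Squash any real number into the range 0 to 1."""
    return 1.0 / (1.0 + math.exp(-net))


def neuron_output(inputs, weights, bias=0.0):
    """Weight each input, sum the results, subtract a bias, apply the sigmoid."""
    if len(inputs) != len(weights):
        raise ValueError("each input needs exactly one weight")
    net = sum(x * w for x, w in zip(inputs, weights)) - bias
    return sigmoid(net)


# Hypothetical example: two scaled inputs and two illustrative weights.
example_inputs = [0.2, 0.9]
example_weights = [0.5, -0.3]
print(neuron_output(example_inputs, example_weights, bias=0.1))
```

    In a full network the value returned by `neuron_output` would become one of the inputs to the cells in the next layer, and training would consist of adjusting `weights` and `bias` rather than the inputs.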
  • 30. Nodes: Medsker, Turban and Trippi (1996) comment that most commercial ANNs have between 10 and 1,000 nodes arranged in three layers, and that although 4, 5 or more layers are not unheard of, they are not deemed necessary for business applications. Hopgood (1993) describes a node’s role as “to sum each of its inputs, subtract a bias term, θ, and pass the result through a non-linear function, $f_{\text{non-linear}}$, known as the activation function”. Hopgood’s emphasis on the bias term is discussed below. ANNs have sets of these calculating functions and a description is given by Patterson (1996) as “Every ANN is composed of a set n of simple neural computing elements (neurons, units, processing elements or PEs, cells)”, where this set of cells can be given as $C = \{c_i\}$, $i = 1, 2, \ldots, n$. Patterson goes on to comment that cells can be grouped into three distinct categories: input, hidden (or interior) and output. The interior layer of cells are the nodes which perform the majority of the calculation process and are discussed under various headings below (Weights and Bias Terms, Generalisation, Choice of Function). Input cells are the cells which take the initial input of stimuli (discrete keyed values or continuous sensor data) whilst output cells enable the display of results or the control of effectors. The inputs are usually represented by the vector x of n dimensions and the outputs by the vector y of m dimensions (simply put, $x_1, x_2, \ldots, x_n$ and $y_1, y_2, \ldots, y_m$). The input data often takes the form of a text file in PC based neural nets:
  • 31. [Diagram 6 - Screen dump of a text file for use in WinNN]
    The above input file demonstrates the relatively simple form of data which may be used in ANN training and operation. The example does not feature scaling of the variables, as this is not required in this instance; however, it does provide a representation of the form input files often take. The file represents 4 training sets, each with 2 inputs and 1 output (4,2,1). In the first training set (case) $x_1$ would be 0, $x_2$ would be 0, and the expected result ($y_1$) would be 0. This would be followed by the second set, which would be 0,1,1 respectively, then the third, and so on. This data represents the commonly used XOR example/problem, which gives the result 1 when exactly one of the two inputs is 1 (an odd number of 1s) and 0 when both inputs are the same (Patterson, 1996). The trained network could be used to solve simple yes/no problems, for example:
    Account Customer? | Purchased Over 200 Units? | Arrange visit by sales staff | Reason
    n | n | n | Probably not trade customer
    n | y | y | Offer trade account
    y | n | y | Try to increase sales to trade customer
    y | y | n | Credit limit probably reached
    Table 3 - Input file explanation
    The above example is highly simplified. It does, however, represent the style of business control system which uses yes/no responses. It is important to note that the reason for the decision would not be given by the ANN.
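    The four training cases and the yes/no sales-visit decision described above can be checked with a small Python sketch. It is a hand-built illustration and not the output of WinNN or of any training run: the weights and thresholds are picked by hand, simple on/off threshold units are assumed in place of the sigmoid so that the arithmetic can be followed by eye, and the names `TRAINING_SET`, `xor_net` and `arrange_visit` are hypothetical.

```python
# The four XOR training cases: (x1, x2) -> expected y1.
TRAINING_SET = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]


def step(net, threshold):
    """A hard on/off unit: fire (1) only if the weighted sum exceeds the threshold."""
    return 1 if net > threshold else 0


def xor_net(x1, x2):
    """A tiny 2-2-1 network with hand-picked weights (all input weights are 1)."""
    hidden_or = step(x1 + x2, 0.5)    # fires if at least one input is 1
    hidden_and = step(x1 + x2, 1.5)   # fires only if both inputs are 1
    # Output fires when "or" is on and "and" is off (weights +1 and -1).
    return step(hidden_or - hidden_and, 0.5)


def arrange_visit(account_customer: bool, over_200_units: bool) -> bool:
    """Apply the network to the sales-visit question in the table above."""
    return bool(xor_net(int(account_customer), int(over_200_units)))


for (x1, x2), expected in TRAINING_SET:
    print(x1, x2, "->", xor_net(x1, x2), "(expected", expected, ")")
```

    As in the table, the sketch returns only the yes/no decision; nothing in the weights supplies the reason column.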
  • 32. Weights and bias terms: Once the data is entered into the network, its connection from input layer to calculation node is used to apply the weights. Patterson (1996) uses the following notation:
    $net = x_1 w_1 + x_2 w_2 + x_3 w_3 = \sum_i x_i w_i$, where $x_i$ is an input variable and $w_i$ is its weight. (Equation 1)
    Hopgood (1993) makes the point of subtracting a bias weight to give:
    $net = \sum_i x_i w_i - \theta$ (Equation 2)
    whereas Patterson (1996) prefers the use of a fixed bias input of 1, weighted by $w_0$, on one of the input links (i.e. a term $1 \cdot w_0$), where $w_0 = -\theta$. The use of either method is considered acceptable. The weights remain independent of the variables (x) so as to facilitate their adjustment during backpropagation. It is helpful (Patterson, 1996) to view the relationship between the weights, class membership and the bias value in terms of a two-dimensional plot. In a more complicated example the weight vector ($w_i$) would define a hyperplane in n-space, where n is the number of variables. In this example n=2 and so the plot is two-dimensional.
    [Diagram 7 - Class Membership]
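    The equivalence of the two bias conventions, and the straight-line class boundary they produce in the two-input case, can be verified with a short Python sketch. The function names and the example weights (2, 4) and bias (θ = 1) are assumptions chosen purely for illustration; the boundary is taken to be the set of points where net = 0, i.e. $w_1 x_1 + w_2 x_2 + w_0 = 0$.

```python
def net_with_threshold(x, w, theta):
    """Equation 2 style: weighted sum of the inputs minus a bias term theta."""
    return sum(xi * wi for xi, wi in zip(x, w)) - theta


def net_with_bias_input(x, w, w0):
    """Alternative style: an extra input fixed at 1 and weighted by w0."""
    return 1 * w0 + sum(xi * wi for xi, wi in zip(x, w))


def boundary_x2(x1, w1, w2, w0):
    """Value of x2 on the class boundary (net = 0) for a given x1.

    Rearranging w1*x1 + w2*x2 + w0 = 0 gives a line of slope -w1/w2
    and offset -w0/w2.
    """
    return -(w1 / w2) * x1 - w0 / w2


# Hypothetical weights and bias for a two-input cell.
weights = (2.0, 4.0)
theta = 1.0
w0 = -theta

point = (0.5, 0.25)
print(net_with_threshold(point, weights, theta))
print(net_with_bias_input(point, weights, w0))

# A point on the boundary gives net = 0; points either side fall in different classes.
x2_on_line = boundary_x2(1.0, weights[0], weights[1], w0)
print(net_with_threshold((1.0, x2_on_line), weights, theta))
```

    Changing either weight tilts the line and changing the bias shifts it, which is how weight adjustment moves the class boundary.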
  • 33. The significant points in Diagram 7 are the offset, which is set by the bias weight ($w_0$), and the slope of the line, which is given by $-w_1/w_2$. Thus the formation of the line is derived entirely from the weights, and future x values will be shown as either belonging to the class or not. Patterson (1996) places particular importance on this boundary line, as he identifies it as the key to the net’s autonomy through its ability to alter weights and so define what is within the set and what is outside it. The example shown is linearly separable in that its boundary is defined by a straight line/plane. This is largely due to the simplicity of the example (2-dimensional) and partly due to the fact that it would be intended for use in a single layer network. As an ability to cope with non-linearity is one of the key features of ANNs they are, of course, capable of dealing with more complicated examples.
    Generalisation: To deal with n-dimensions and non-linearity ANNs generalise. Patterson (1996) discusses generalisation in terms of “describing the whole from some of the parts” and points out that the alternative to an ability to generalise is knowing everything. It is possible to identify an object by knowing some general rules involving that class of object without knowing every member of that class. For example, a metal frame with two wheels, a set of handlebars and a saddle, fitting various size requirements, is probably a bicycle. It is not necessary to memorise every manufacturer’s catalogue. ANNs generalise by creating a class which exists in weight space with its boundary given by the mapping function F (Patterson, 1996). Mapping functions are either autoassociative or heteroassociative, meaning they recover the original pattern from noisy/incomplete data or map input patterns to different output patterns
  • 34. respectively. Mapping functions are shown mathematically as $F: x \to y$. The non-linear boundary can be shown by the simplified diagram:
    [Diagram 8 - Universe of objects]
    From Diagram 8 it is possible to see that the boundary established by the network includes both the training set data and other instances of the data not given in the training set but which would be encountered if more sets of data were made available, therefore giving the network an element of flexibility. The boundary must therefore include all examples of the training set and all examples of data corresponding to the net’s function but not known at the time of training, and exclude all other data sets. Once this has been achieved the generalisation can be said to have been a complete success. It is apparent that the method of training and the selection of data will have particular importance for the accuracy of this process.
    Choice of mapping or activator function: As mentioned in the ANN Overview (above), the summed weighted inputs are passed through a non-linear function before proceeding to the next cell or output layer. This is the mapping function referred to above, so that $F: x \to y$, and is known as the activator function (or activation level/summation function - Medsker, Turban and
  • 35. Trippi, 1996). It is suggested by Patterson (1996) that the choice of activator should be a “monotonic nondecreasing function of net”. This simply means that as net increases the output of the function never decreases: the curve rises (or stays level) from left to right, so a higher input never produces a lower output (for the sigmoid, for example, x=2 gives y=0.88 and x=3 gives y=0.95). Hopgood (1993) makes the point of stating that “The weights and biases can be learnt, and the learning behaviour of a net depends on the chosen algorithm”. It is further stated that the sigmoid function is most commonly used and Patterson (1996) concurs with this statement. The sigmoid function is given as:
    $f_{\text{non-linear}}(x) = \frac{1}{1 + e^{-x}}$ (Equation 3)
    and would appear graphically thus:
    [Diagram 9 - Sigmoid Function: horizontal axis from -5 to 5, vertical axis from 0 to 1]
    The above diagram (Diagram 9) shows some of the key features of the sigmoid function, and thus the reasons for its use. These features include:
      • The ability to make all values positive.
      • The relatively fine level of discrimination (the slope).
      • The fact that all results are given as between 0 and 1.
    Data pre-processing: It can be appreciated that the data under investigation may take various forms. The ANN will require inputs which are of a numeric nature. This does not prevent
  • 36. non-numeric data being analysed provided it can be converted, with consistency, into numbers. For example, risk is a common business concept which is regularly translated from the vague (safe, moderately safe, risky, very risky) to a range of probabilities (say 100%, 75%, 50%, 25% probability of a favourable event occurring). Once the data has been gathered, and given a numeric value if required, the efficiency and accuracy of the ANN can be enhanced by information pre-processing. Wasserman (1989) and Patterson (1996) both concentrate on normalisation, which is a common form of pre-processing. Normalisation is a method by which all the data being processed can be given a common minimum and maximum value. Readers familiar with statistics may draw a parallel between the normalisation of the data and the techniques used in statistics to determine probabilities using the normal distribution curve (NDC). Here a distribution of data can be mapped (converted) to the NDC, which has its probabilities pre-calculated. The most common form of normalisation will see all the data converted to values between 0 and 1. This has the advantage of both reducing the difficulty of manipulating large numbers (simply put, it is easier to manipulate, say, 0.2 than 2,000,000) and enhancing the network’s ability to adjust weights by reducing unnecessary emphasis (for example, in loan calculations interest rates may be given as percentages or decimals, and loan size in millions). Certain activator functions will restrict output to between 0 and 1, regardless of input (Medsker, Turban and Trippi, 1996). For example, the sigmoid function mentioned above:
    Input | $1/(1+e^{-x})$
    0.5 | 0.6225
    5 | 0.9933
    17 | 0.9999999586
    35 | 0.99999999
    35.2 | 1
    2256 | 1
    -5 | 0.0067
    Table 4 - Sigmoid values
  • 37. It can be seen from the above table that inputs beyond a certain range produce outputs which approach a value of one (the exact point at which the output appears as one is dependent on the number of decimal places used and rounding). As the data is multiplied by weights and summed before entering the activator function, it can be appreciated that no accuracy is gained by maintaining a mixture of large and small numbers (as the function will convert everything to between 0 and 1 regardless). It can also be recognised that the output from the net, having passed through various activator functions, will be of decimal or Boolean form (1 or 0). It is important to recognise that, as the sigmoid function returns 0 and 1 for a large range of negative and positive numbers respectively, it is advisable to restrict the inputs and outputs to values between 0.1 and 0.9 (Tucker, 1996). This also avoids the use of either 1, which has the effect of distorting part of the sigmoid function ($e^{-1} = 1/e$), or 0, which distorts the sigmoid ($e^{0} = 1$) and has the effect of negating weights ($x_1 w_1 = 0$ for $x_1 = 0$). There are numerous methods of data pre-processing and Tucker (1996) contains an accessible description of six common methods. These are given as Distribution truncation and squashing functions, Natural log regression, Ratio splitting, Positive/negative split, Variable pre-selection, and Data squashing. These can be explained briefly as:
    Technique | Description
    Distribution truncation and squashing functions | The removal of outliers, i.e. unusually large or small numbers, from the data set.
    Natural log regression | Using logarithms to convert data into small units.
    Ratio splitting | Not inputting the result of a ratio but the numerator (top) and denominator (bottom) values separately.
    Positive/negative split | Separating a variable which has both positive and negative examples into two separate variables (e.g. Profit becomes Profit or Loss, as opposed to Profit £200 or Profit -£150 (loss)).
    Variable pre-selection | Manually deciding which variables to include and which to leave out.
    Data squashing | Converting all data so that it is within a pre-defined range (0.1 to 0.9 being given as most appropriate).
    Adapted from Tucker (1996)
    Table 5 - Data pre-processing
  • 38. It is also important to emphasise that the data (variables) used within an ANN should have some form of theoretical basis for a relationship before they are included. [Footnote 13: The appendix of Farrar, Tucker and Bugmann (1997) provides an example of how this could be approached.] There is little point comparing interest rates, inflation, project life, level of risk and outside temperature if the problem under consideration is to determine the ‘cost of capital’ to use in net present value (NPV) calculations.
    Training: By their very nature, ANNs require training. This training can be viewed in a similar manner to human training in that the activity is repeated until the system produces a satisfactory result (or it is decided to abandon the training run). Hopgood (1993) suggests that the training process is, more correctly, an “error reduction” process. Alternatively, Patterson (1996) suggests that it is “adaptive learning in a dynamic environment”, emphasising the flexibility of the method. There are, however, three separate methods of training: supervised, reinforced and unsupervised. Wasserman (1989) argues that unsupervised training is
  • 39. the only “biologically plausible” method of training ANNs. The desire to be “biologically plausible” suggests Wasserman is more concerned with the theoretical study and attempted replication of a brain than with the practical application of a mathematical technique. Plausibility should be weighed against ‘usefulness as a tool’ when ANNs are used in applications not related to the study of psychology/neurology. The use of unsupervised learning will be discussed under the heading of Topology below (Kohonen being one of its originators). Supervised and reinforced learning methods are broadly similar in philosophy and so an explanation of supervised training will be made first. In supervised training the results obtained from the cells are compared with the desired results (contained within the original input). The difference is referred to as an error and the weights contained within the network are adjusted (backpropagation). This adjustment continues until the sum of the squares of the differences (errors) is minimised (in a way similar to linear regression - line of best fit calculations). This process can be simplified thus:
      • Subtract output from target contained within input vector.
      • Square the difference to remove negative signs. [Footnote 14: Removal of negatives - otherwise an error of -100 plus an error of 100 would indicate no error.]
      • Add all the squared differences together.
      • Compare this answer with the desired level of error.
      • If the error is unacceptable, adjust weights and begin again.
    The minimisation is said to be complete when the error has reached an acceptable level. Reinforced training follows a broadly similar route to supervised but, as
  • 40. opposed to calculating the level of error, the ANN is merely informed that it is either wrong or right and continues to adjust weights until a correct result is identified. Patterson (1996) suggests that the method is seldom used in practice and that attention should instead be paid to supervised and unsupervised learning (other authors make little or no mention of reinforced training). It should be noted that it is possible to over-train a network. To continue the human analogy, this would represent an employee trained, in a vocational way, to such a level that they were only able to perform their present role, and none other. This may occur, for example, in a factory where an operative has been doing the same repetitive job for such a long period of time that their ability to transfer any of the skills they have learnt becomes hampered.
    Topology: The Multilayer Perceptron and Kohonen’s Self Organising Net will be used to give a more detailed explanation of ANN construction.
    The Multilayer Perceptron - an example of supervised learning/training: The multilayer perceptron is most commonly used for non-linear estimation or classification. It is often referred to as a feedforward network. [Footnote 15: Generic name used to refer to this network in particular - any network in which the data flows towards the output is technically feedforward.] This net comprises the conventional input and output layers, with a programmer (user) defined number of nodes and layers between (hidden). The number of nodes used in the input/output layers is data specific. A diagram representing the multilayer perceptron is provided below:
  • 41. Diagram 10 - The Multilayer Perceptron
    It is usual for the number of input nodes to equal the number of variables under consideration; likewise the number of output nodes will equal the number of desired outputs. The number of nodes in each layer is often used to give the network its name. For example the network above may be termed a 3-4-4-2 network as it has 3 input nodes, two sets of 4 hidden or computational nodes and 2 output nodes, and would be described as a four layer network (although Hopgood (1993) indicates that there is argument concerning this issue as some claim the input layer should not be counted). It can be seen from diagram 10 that each of the calculation nodes is connected to all of the nodes in the next set, but that no nodes are connected vertically (with reference to the diagram’s rotation only). The weights are applied, and adjusted, on the connections between the nodes, with each node applying a non-linear function (activator function). The process could be described as: input, weighting, conversion via activator function, become input for next layer, weighting, conversion via activator function and output to the final layer, where error checking will occur and instigate backpropagation to adjust the weights (note: this method utilises supervised training). The number of both layers and calculation nodes is dependent on the
  • 42. programmer’s decision given a certain problem. It is common to start with a low number of layers/nodes and then increase this number until the desired level of accuracy/speed is achieved. The Kohonen self organising net - an example of unsupervised learning/training: As mentioned previously, the Kohonen self organising net is a form of net which learns using an unsupervised method. This type of network is most commonly used for pattern recognition and is often referred to as the Self Organising Feature Map (Patterson, 1996). As well as differing from the Multilayer Perceptron (MLP) in training method and application, it also differs in that it is a single layer feedforward network as opposed to multilayer. In addition the network works on a principle of “winner takes all” (Wasserman, 1989). This means that as the information is processed within the layer, one, and only one, node will transmit an output. This is why the method is often referred to as a “competitive” one (Patterson, 1996 and Wasserman, 1989).
    Diagram 11 - Kohonen Self Organising Feature Map
    Unlike the Multilayer Perceptron the Kohonen net learns in an unsupervised manner. As the net is attempting to replicate a pattern in various sets of data it trains by continually processing different data-sets until it is satisfied that each new run
  • 43. will create a replica of previous runs (within tolerances pre-set). For example a child is not always told the same word repeatedly until he or she learns to say it (by being told when he or she has said it incorrectly) but rather learns by being exposed to numerous examples of speech and establishes the ability to replicate these patterns independently. This learning style is sometimes referred to as a competitive filter associative memory (Medsker, Turban and Trippi, 1996). It can be appreciated that the differing topologies are related to the different applications to which the networks are applied. Classification (MLP) and pattern recognition (Kohonen) are two quite different problems. An example of classification may be given as “does this data correspond to a customer with good credit ratings” (in other words, “is it a member of the set of customers with ...”), whereas pattern recognition is more commonly associated with speech or handwriting recognition or recognition of trends or patterns within financial information (which may return us to the credit problem by a different route). In summary it can be said that the Multilayer Perceptron works by co-operation and the Kohonen network by competition between nodes. Summary: ANNs offer an alternative to conventional forms of knowledge based systems. Whilst ANNs drew their original inspiration from studies of the human mind, further developments in this area are limited by technology. The use of ANNs as a mathematical tool is, however, both possible and practical at today’s levels of technology. The use of self learning and pattern recognition provides a solution to problems which, using conventional techniques, may have been overcomplicated or simply not possible. The basic concept of learning by example through the
  • 44. adjustment of mathematical weights is a reasonable approximation of the process and allows the internal computations and structure of the network to be treated as a ‘black-box’ (the exact details of the internal workings need not be known to facilitate use). Network topologies and learning/training methods are problem specific and it should be appreciated that the correct choice of network, training style, training data, pre-processing method and activator function all contribute towards the successful application of the network. Diagram 12 (overleaf) represents, in the form of a flow diagram, the series of discrete steps which make up ANN operations.
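The supervised training procedure outlined in this chapter (subtract output from target, square, sum, compare with the acceptable error, adjust weights and repeat) can be sketched in a few lines of Python. This is a minimal illustration and not a definitive implementation: it assumes a single node with a sigmoid activator function, a simple gradient-style weight adjustment, and a hypothetical four-row data set (the logical OR of two inputs). None of these choices, nor the names used, come from the text.

```python
import math

# Hypothetical training set: (inputs, target). Assumed for illustration only.
SAMPLES = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]


def sigmoid(x):
    """Assumed non-linear activator function."""
    return 1.0 / (1.0 + math.exp(-x))


def predict(weights, bias, inputs):
    """Weight the inputs, add the bias and pass the sum through the activator."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)


def sum_squared_error(weights, bias, samples):
    """Subtract output from target, square each difference, add them together."""
    return sum((target - predict(weights, bias, inputs)) ** 2
               for inputs, target in samples)


def train(samples, acceptable_error=0.05, learning_rate=0.5, max_epochs=50000):
    """Adjust the weights until the summed squared error is acceptable."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    epochs = 0
    while epochs < max_epochs:
        # Compare the current error with the desired level of error.
        if sum_squared_error(weights, bias, samples) <= acceptable_error:
            break
        # Error unacceptable: adjust the weights and begin again.
        for inputs, target in samples:
            output = predict(weights, bias, inputs)
            delta = (target - output) * output * (1.0 - output)
            weights = [w + learning_rate * delta * x
                       for w, x in zip(weights, inputs)]
            bias += learning_rate * delta
        epochs += 1
    return weights, bias, sum_squared_error(weights, bias, samples), epochs


weights, bias, error, epochs = train(SAMPLES)
print(f"stopped after {epochs} epochs with summed squared error {error:.4f}")
```

Squaring before summing is what prevents positive and negative errors from cancelling; the acceptable error level and the learning rate are tuning choices that would need to be set for each problem.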
  • 45. Diagram 12 - The operation of ANNs - flow diagram
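The “winner takes all” principle described for the Kohonen net earlier in this chapter can also be sketched briefly. The example below is a simplified illustration under stated assumptions: it uses two competing nodes, hypothetical two-value input patterns and starting weights, Euclidean distance to choose the winner, and it moves only the winning node towards each input (a full self organising feature map would also adjust the winner's neighbours). Training stops when a complete pass changes no weight by more than a pre-set tolerance.

```python
def winner(nodes, inputs):
    """Return the index of the one node whose weights are closest to the input."""
    distances = [sum((w - x) ** 2 for w, x in zip(weights, inputs))
                 for weights in nodes]
    return distances.index(min(distances))


def train(nodes, data, learning_rate=0.5, decay=0.9, tolerance=0.001,
          max_passes=200):
    """Repeat passes over the data until a pass barely changes the weights."""
    nodes = [list(weights) for weights in nodes]  # leave the caller's list intact
    for _ in range(max_passes):
        largest_change = 0.0
        for inputs in data:
            weights = nodes[winner(nodes, inputs)]  # only the winner learns
            for i, x in enumerate(inputs):
                change = learning_rate * (x - weights[i])
                weights[i] += change
                largest_change = max(largest_change, abs(change))
        if largest_change < tolerance:
            break  # each new run now replicates the previous one
        learning_rate *= decay  # assumed schedule: learn less on each pass
    return nodes


# Hypothetical data: two loose groups of patterns, with no targets supplied.
DATA = [(0.1, 0.9), (0.9, 0.1), (0.2, 0.8), (0.8, 0.2)]
START = [[0.4, 0.6], [0.6, 0.4]]  # assumed starting weights

trained = train(START, DATA)
for pattern in [(0.15, 0.85), (0.85, 0.15)]:
    print(pattern, "-> node", winner(trained, pattern))
```

No desired output is ever supplied; the nodes simply settle on the two groups present in the data, which is the sense in which the training is unsupervised.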
  • 46. Chapter 4 - Investigation into advantages, disadvantages and current application of ANNs. Introduction: In order to allow a more complete consideration of the suitability of ANNs for a given decision making problem, it is necessary to appreciate the advantages and disadvantages that the use of ANNs provides. The self-learning pattern recognition ability which gives the ANN its distinct characteristics, also creates disadvantages, some of which are problem-specific, whilst others are universal. ANNs are currently used for a wide variety of decision support problems. Perhaps the most commonly cited problem is that of distress modelling (prediction of bankruptcy), but examples are also found in such diverse areas as production control, new product development and traffic light sequence (road junction) modelling. Advantages and disadvantages: Numerous authors have commented on the advantages and disadvantages of ANNs and the following, cited from Hammerstrom (1993), provides what could be regarded as fundamental points of interest. Advantages: • They can infer subtle, unknown relationships from the data. • They are non-linear so that complex problems can be solved more accurately than by linear techniques. • They are highly parallel, which makes them run faster on computers with parallel processors than alternative methods. Additional benefits are given by Medsker, Turban and Trippi (1996) as:
  • 47.
      • The ability to cope with highly correlated input data (multicollinearity; Tucker, 1996).
      • A more highly automated input interface is made possible by ANNs’ ability to process all inputs at once.
      • Fault tolerance - due to the high number of nodes, inaccuracy caused by bad data can often be localised and not affect the accuracy of the ANN as a whole.
      • Generalisation - noisy, incomplete or previously unseen data will still result in a reasonable response being made, providing the ANN is suitably trained (Hawley, Johnson and Raina, 1990).
      • Adaptability - training can occur during the ANN’s in-service lifetime, allowing the ANN to remain up to date.
    Hawley, Johnson and Raina (1990) comment on the fact that ANNs, by the general purpose nature of their structure, are faster to install and maintain than custom built KBS. Training of ANNs, though time consuming, need not be as technically difficult (and therefore as expensive) as writing the program structure of a KBS. ANNs, due to the way in which the relationship of weights is formed, are not prone to ‘crashing’ as a result of incomplete or inaccurate data but are often said to degrade gracefully over time as the weight values alter. This is often cited as an advantage, but it should be borne in mind that something which has happened progressively over time may not be noticed until damage has occurred and, without an adequate control process, may not be appreciated at that point either. Disadvantages: Hammerstrom (1993):
      • They may fail to produce a satisfactory solution because of insufficient data
  • 48. [for training] or because no learnable (sic) function exists.
      • They may produce results from a complex machine learning procedure that has no straightforward cause and effect origin that can be easily explained.
      • They can be slow and expensive to train.
      • ANNs’ computational speed, in the finished application, depends linearly on the number of connections and, roughly [approximately], the square of the number of nodes.
    To this list of disadvantages should be added ANNs’ most criticised fault:
      • ANNs are not capable of demonstrating the logical reasoning behind the result obtained - the black box approach (Farrar, Tucker and Bugmann, 1997, Hopgood, 1993, Medsker, Turban and Trippi, 1996 and Tucker and Farrar, 1996).
    A problem recognised by Tucker (1996) and Tucker and Farrar (1996) is the relative ‘youth’ of ANNs as a decision making technique. Conventional decision making techniques have the advantage of many years of testing, both theoretically and in practice, which have produced models that are both recognised and accepted. ANNs, by the very fact that they are relatively new and underdeveloped, do not have the weight of past experience to promote their results. However, as development within the field progresses, refinement of the technique and empirical evidence should produce an improved methodology and increased statistical evidence of accuracy. Current application of ANNs: Hawley, Johnson and Raina (1990) provide a comprehensive discussion of ANN applications in finance from which the following is adapted: Corporate Finance Applications:
  • 49. Financial Simulation: Whilst the financial management tasks of a company can be divided into various smaller and more manageable segments, the complexity of the company’s internal and external environment is often misrepresented in these simplified models. ANNs provide a means of linking all segments together during analysis; they can be tailored to an individual company and are capable of being dynamic and responsive to change (Donaldson, 1996). ANNs can be used for credit customer behaviour modelling, planning bad debt expenses, planning the cyclical expansion and contraction of accounts, evaluating credit terms and limits, cash management, evaluation of capital investments, asset and personal risk management (insurance), exchange risk management, and the prediction of credit cost and availability based on a company’s financial data. Hawley et al (1990) claim that this area offers, perhaps, the greatest potential for ANN business application. Prediction: In determining a new policy, direction or product, organisations need to determine the reaction this choice will create in both present and future investors and the subsequent effect this will have on their investment decisions. Whilst conventional decision making techniques are often more cost efficient at solving problems which have well-identified theoretical underpinning, the problem of investor behaviour and sentiment is of a complexity more suited to ANNs. Investors often base their decisions on a wide range of issues and information concerning both the company and the broader economic environment. It is possible to train ANNs to mimic the behaviour of investors (using actual investors as training models) and then determine the effect that alterations to company policy and financial position have on their investment decisions. 
ANNs offer the opportunity to incorporate a wide range of input and output information enabling the decision maker to gauge reaction to change in ways other than alterations in stock price alone (Hawley et al,
  • 50. 1990). Evaluation: Accurately valuing a target company’s net worth before attempting an acquisition increases the probability of success, both in terms of acquisition outcome and with regard to eventual profit. ANNs are trained, in this instance, by exposure to training sets of target company data, as input, and human experts’ value estimates as the response. This use of ANNs seeks to copy the behaviour of individuals, including the incorporation of human “hunches” and intuition, which would make the use of conventional decision support programming difficult or impossible. ANNs have been used successfully in a wide range of evaluation problems and confer advantages including: screening large numbers of companies for potential undervaluation or other forms of acquisition attractiveness, to minimise the decision makers’ time (they then need only look at “ideal” companies); the ability to copy the interpretations of a wide range of decision makers; and the ability to adjust automatically to the decision maker’s changing analytical procedures and selection criteria over time (Hawley et al, 1990). Credit Approval: Using a similar training approach to the company evaluation method detailed above, ANNs are capable of reducing time and labour by mimicking the decisions of financial staff in both credit approval and credit limit decisions. In addition ANNs are able to interpret a wider range of financial statements (providing they are trained on a wide range) more quickly than their human counterparts, negating the need for the information to be restated in a standardised form (Jenson, 1992 and Marose, 1990). Financial Institutions Applications: Assessing Lending/Bankruptcy Risk: ANNs can provide expert opinion on loans and lending arrangements to financial institutions in a similar manner to the example of credit approval discussed above.
  • 51. Security/Asset Portfolio Management: Due to the unstructured nature of the portfolio manager’s decision making process and the diversity of information involved, ANNs offer advantages over conventional decision making techniques (Hawley et al, 1990). Pricing Initial Public Offerings (of ordinary shares): Determining the issue price of ordinary shares is a complicated process but one which it is essential to optimise (Brett, 1991). The information is often diverse and of a non-standard format and so the application of ANNs confers advantages, not found in conventional decision making techniques, through their ability to generalise and their lack of reliance on an explicit rule base. Professional Investors Applications: Identification of Arbitrage Opportunities: By replicating an expert decision maker’s reasoning process, a process he or she may not be able to articulate, the ANN is able to assist in the identification of companies which are about to become victims of a hostile take-over (thus allowing purchasing of the company’s shares to be initiated). ANNs offer advantages in their ability to screen large numbers of potential targets, thus giving the arbitrageur a smaller workload. The Technical Analyst (concerned with influences on share price which are not related to company trading position): ANNs’ pattern recognising abilities enable the patterns (hitherto un-calculable) within stock markets to be emulated. Through this evaluating ability, more accurate predictions of share price movements can be derived (Davidson, 1996). The Fundamental Analyst: Industry norm patterns, market conditions and financial statements can be used to train ANNs to assist in share purchasing in a way similar to the technical analyst model. Summary: The advantage which ANNs confer over and above more traditional decision
  • 52. making/support techniques is their ability to discern patterns in large volumes of data through a process of self-learning as opposed to explicit instruction. This process enables ANNs to discover patterns or relationships which may have been overlooked or given too great an emphasis in existing decision support mathematics. ANNs’ ability to identify patterns also enables the recognition of variations in handwriting, voice and images, and provides opportunities for a wide range of security applications. Currently the use of ANNs in business is predominantly within bankruptcy prediction and financial risk assessment. However, ANNs have been used successfully in a wide range of operations management, marketing (data mining) and personnel applications. ANNs have been shown to offer increased accuracy (Farrar, Tucker and Bugmann, 1997) and, in many instances, are one of the only methods currently available (e.g. handwriting and voice recognition). The ability to deal more easily than conventional methods with non-linearity (Waldrop, 1992) gives the user an advantage in the highly non-linear markets in which business operates (Cuthbertson and Gripaios, 1996). ANNs “operate by a logic known only to themselves” (The Economist, 1995). The most difficult obstacle to overcome in the promotion of ANNs as a decision making tool is their lack of interpretability.
  • 53. Chapter 5 - Schema for the assessment of the suitability of ANNs for a given problem Introduction: Whilst ANNs are an extremely powerful tool, their application is not suited to every problem. For reasons such as cost, unavailability of data and form of data there are certain problems which are better suited to other forms of decision support. In order to maximise the benefit gained from ANN application a process of problem/method pre-selection is required. To facilitate the matching of problem and solution, a schema has been developed. (The schema concentrates on the use of ANNs for decision making purposes; full systems for factory control etc. can cost in the region of £25,000 and would require more detailed analysis than is possible within a general purpose flowchart (Horridge, 1997).) By answering a series of relatively simple questions, the user is able to determine whether ANNs are suitable for the problem they are attempting to solve. In addition, the reasons both for and against the use of ANNs, for a given problem, are discussed, and other methods which may prove more suitable, should ANNs not be appropriate, are suggested. Schema: Diagram 13 (page 46) represents the schema in the form of a flow-chart. By following the series of questions, labelled 1 to 10, the user is able to determine whether ANNs are suitable for a given problem and any problems which may be encountered in their application. The dotted lines and boxes represent complementary advice; the solid lines and boxes represent questions, flow and conclusions. In conjunction with the flow-chart, the process is also described, fully, in text form. The schema is also available as a computer program; the instructions for use are contained in Appendix 3, and the code listings are contained in Appendix 4.
  • 54. This program enables an approach which is more dynamic, multidimensional and, above all, simple for the user, than is possible on paper alone. The program (called Net Solver) enables the user to determine whether ANNs are suitable for a given problem. To make this possible the program records the responses made by the user to a series of questions. These responses enable the program to calculate the suitability of the problem/decision to ANN use. The result is displayed as both a percentage and as a ‘progress-bar’ of the sort used within the Windows environment to show elapsed time. In addition to this result the responses are reiterated, to allow the user to check that she/he has not made any errors. If the user has entered a project name this will also be displayed. To enable the user to determine the next step in the application of ANNs, advice is given where it is thought appropriate. The user then has the option of printing the results, advice and details (see Appendix 5 - Sample Output). Throughout the program a fictitious company and telephone number are mentioned (ABC ANNs on (0110)111222), at points where the user may need additional advice. It is intended that this program could, with further development, be used in one of the following ways:
      • Distributed free of charge (e.g. via the Internet or computer magazine disks) by a software/consultancy company, replacing ABC ANNs with its own name and contact number as part of an advertising campaign.
      • Incorporated into ANN software as an introduction.
      • Sold as consultancy software.
      • Used within education as a teaching aid.
      • Distributed free of charge, via the Internet, as a philanthropic act.
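The way a program of this kind might turn recorded responses into a percentage and a progress bar can be sketched as follows. This is not the Net Solver code (which was written in Visual Basic and is listed in Appendix 4); it is a hypothetical Python illustration that assumes each of the ten schema questions is answered either favourably or unfavourably for ANN use and that every question carries equal weight. The function names, the question labels and the sample answers are all invented for the example.

```python
def suitability_percentage(responses):
    """Share of recorded responses that favour ANN use, as a percentage.

    Assumes equal weighting of questions, which the text does not specify.
    """
    if not responses:
        raise ValueError("at least one response is required")
    favourable = sum(1 for answer in responses.values() if answer)
    return 100.0 * favourable / len(responses)


def progress_bar(percentage, width=20):
    """Render the percentage as a simple text progress bar."""
    filled = round(width * percentage / 100)
    return "[" + "#" * filled + "-" * (width - filled) + "]"


# Hypothetical answers to schema questions 1 to 10 (True = favours ANN use).
answers = [True, True, False, True, True, False, True, True, False, True]
responses = {f"Question {n}": answer for n, answer in enumerate(answers, start=1)}

score = suitability_percentage(responses)
print(f"Suitability: {score:.0f}% {progress_bar(score)}")

# Reiterate the responses so that the user can check for entry errors.
for question, answer in responses.items():
    print(f"  {question}: {'Yes' if answer else 'No'}")
```

A real scoring scheme would probably weight some questions more heavily than others (the availability of training data, for instance); equal weighting is used here only to keep the sketch short.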
The program was written using Microsoft Visual Basic Version 4 Professional (VB4), under the Microsoft Windows 95 operating environment (Microsoft, Windows and Visual Basic are all registered trademarks of the Microsoft Corporation), on a PC equipped with a 486/66 processor, 8 Mb of RAM and 200 Mb of spare hard disk space (see