Introduction to Artificial Neural Networks

JayaVel.
Joseph Amal Raj.
Kaja Mohinden

Published in: Education

  1. 1. Madras university<br />Department Of Computer Science<br />
  2. 2. Seminar On<br />Introduction Of ANN, Rules<br />And<br />Adaptive Resonance Theory<br />
  3. 3. Group members: P. JayaVel, J. Joseph Amal Raj, M. Kaja Mohinden<br />
  4. 4. ARTIFICIAL NEURAL NETWORK (ANN)<br />An artificial neural network (ANN), usually called "neural network" (NN), is a mathematical model or computational model that tries to simulate the structure and/or functional aspects of biological neural networks.<br />It consists of an interconnected group of artificial neurons and processes information using a connectionist approach to computation.<br />In most cases an ANN is an adaptive system that changes its structure based on external or internal information that flows through the network during the learning phase. <br />
  5. 5. ARTIFICIAL NEURAL NETWORK (ANN)<br />
  6. 6. ARTIFICIAL NEURAL NETWORK (ANN)<br />Why use neural networks?<br />Neural networks, with their remarkable ability to derive meaning from complicated or imprecise data, can be used to extract patterns and detect trends that are too complex to be noticed by either humans or other computer techniques. A trained neural network can be thought of as an "expert" in the category of information it has been given to analyse. This expert can then be used to provide projections given new situations of interest and answer "what if" questions. Other advantages include: <br /><ul><li>Adaptive learning: An ability to learn how to do tasks based on the data given for training or initial experience.
  7. 7. Self-Organisation: An ANN can create its own organisation or representation of the information it receives during learning time.
  8. 8. Real Time Operation: ANN computations may be carried out in parallel, and special hardware devices are being designed and manufactured which take advantage of this capability.
  9. 9. Fault Tolerance via Redundant Information Coding: Partial destruction of a network leads to the corresponding degradation of performance. However, some network capabilities may be retained even with major network damage. </li></li></ul><li>ARTIFICIAL NEURAL NETWORK (ANN)<br />Learning paradigms :<br />There are three major learning paradigms<br /><ul><li>Supervised learning
  10. 10. Unsupervised learning
  11. 11. Reinforcement learning</li></li></ul><li>supervised learning<br />In supervised learning, the learning rule is provided with a set of examples (the training set) of proper network behavior: {p_1, t_1}, {p_2, t_2}, ..., {p_Q, t_Q}, where p_q is an input to the network and t_q is the corresponding correct (target) output. As the inputs are applied to the network, the network outputs are compared to the targets. The learning rule is then used to adjust the weights and biases of the network in order to move the network outputs closer to the targets. The perceptron learning rule falls in this supervised learning category. <br />In supervised learning, we are given a set of example pairs (x, y), x ∈ X, y ∈ Y,<br /> and the aim is to find a function f: X → Y in the allowed class of functions that matches the examples. In other words, we wish to infer the mapping implied by the data; the cost function is related to the mismatch between our mapping and the data and it implicitly contains prior knowledge about the problem domain.<br />
  12. 12. supervised learning<br />
  13. 13. supervised learning<br /><ul><li>Feedforward Learning Rules</li></ul>Quickprop[fahlman88empirical]<br /><ul><li>Equation 1. Error derivative at previous epoch
  14. 14. Equation 2. Error derivative at this epoch</li></ul> The Quickprop algorithm is loosely based on Newton's method. It is quicker than standard backpropagation because it uses an approximation to the error curve and second-order derivative information, which allow a quicker evaluation. Training is similar to backprop except that a copy of the error derivative at the previous epoch (eq. 1) is kept. This, and the current error derivative (eq. 2), are used to minimise an approximation to the error curve. <br />
  15. 15. supervised learning<br />The update rule is given in equation 3:<br /><ul><li>Equation 3. Quickprop update rule</li></ul> This equation uses no learning rate. If the slope of the error curve is less than that of the previous one, then the weight will change in the same direction (positive or negative). However, there need to be some controls to prevent the weights from growing too large.<br />
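The equations on these slides did not survive in the transcript, so the following is only a minimal sketch. It assumes the update rule as commonly stated for Quickprop, Δw(t) = S(t) / (S(t−1) − S(t)) · Δw(t−1), where S is the error derivative with respect to the weight. The toy error curve, the function names and the optional `max_growth` clamp (one possible control against oversized steps) are illustrative assumptions, not taken from the slides.

```python
def quickprop_step(slope_now, slope_prev, step_prev, max_growth=None):
    """Next weight change from the current and previous error slopes."""
    denom = slope_prev - slope_now
    if denom == 0.0:
        return 0.0  # identical slopes: the secant step is undefined, so do not move
    step = slope_now / denom * step_prev
    if max_growth is not None:
        # Optional control: never grow the step by more than max_growth times.
        limit = max_growth * abs(step_prev)
        step = max(-limit, min(limit, step))
    return step


def slope(w):
    # Derivative of the toy error curve E(w) = (w - 3)^2 (hypothetical example).
    return 2.0 * (w - 3.0)


w_prev = 0.0
step_prev = -0.1 * slope(w_prev)      # one plain gradient-descent step to get started
w = w_prev + step_prev
w_next = w + quickprop_step(slope(w), slope(w_prev), step_prev)
```

On a purely quadratic error curve the secant approximation is exact, so in this sketch `w_next` lands on the minimum at w = 3 after a single Quickprop step.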
  16. 16. Unsupervised learning<br />In unsupervised learning we are given some data x and a cost function to be minimized, which can be any function of the data x and the network's output, f.<br />The cost function is dependent on the task (what we are trying to model) and our a priori assumptions (the implicit properties of our model, its parameters and the observed variables).<br />
  17. 17. Unsupervised learning <br />As a trivial example, consider the model f(x) = a, where a is a constant and the cost C = E[(x − f(x))^2]. Minimizing this cost will give us a value of a that is equal to the mean of the data. <br />The cost function can be much more complicated. Its form depends on the application: for example, in compression it could be related to the mutual information between x and y, whereas in statistical modelling, it could be related to the posterior probability of the model given the data. (Note that in both of those examples those quantities would be maximized rather than minimized).<br />Tasks that fall within the paradigm of unsupervised learning are in general estimation problems; the applications include clustering, the estimation of statistical distributions, compression and filtering.<br />
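The trivial example above can be checked numerically. This is a small sketch under stated assumptions: the sample data and the grid of candidate constants are made up for illustration, and the expectation is replaced by an average over the sample.

```python
def cost(data, a):
    """Sample version of C = E[(x - f(x))^2] for the constant model f(x) = a."""
    return sum((x - a) ** 2 for x in data) / len(data)


data = [2.0, 4.0, 4.0, 5.0, 10.0]     # made-up sample
mean = sum(data) / len(data)

# Scan candidate constants between 0 and 12 and keep the cheapest one.
candidates = [i / 100 for i in range(0, 1201)]
best = min(candidates, key=lambda a: cost(data, a))
```

The candidate with the lowest cost coincides with the sample mean, which is what the slide claims for the constant model.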
  18. 18. Unsupervised learning<br />Unsupervised learning, in contrast to supervised learning, does not provide the network with target output values. This isn't strictly true, as often (and for the cases discussed in this section) the output is identical to the input. Unsupervised learning usually performs a mapping from input to output space, data compression or clustering.<br />
  19. 19. Reinforcement learning<br />In reinforcement learning, data x are usually not given, but generated by an agent's interactions with the environment. At each point in time t, the agent performs an action y_t and the environment generates an observation x_t and an instantaneous cost c_t, according to some (usually unknown) dynamics.<br />Tasks that fall within the paradigm of reinforcement learning are control problems, games and other sequential decision making tasks.<br />
  20. 20. Reinforcement learning<br />The aim is to discover a policy for selecting actions that minimizes some measure of a long-term cost; i.e., the expected cumulative cost. The environment's dynamics and the long-term cost for each policy are usually unknown, but can be estimated.<br />ANNs are frequently used in reinforcement learning as part of the overall algorithm.<br /> <br />
  21. 21. Neural Network “Learning Rules”:<br />Successful learning in any neural network is dependent on how the connections between the neurons are allowed to change in response to activity. The manner of change is what the majority of researchers call "a learning rule". <br />However, we will call it a "synaptic modification rule" because although the network learned the sequence, it is not clear that the *connections* between the neurons in the network "learned" anything in particular.<br />
  22. 22. Mathematical synaptic Modification rule<br />There are many categories of mathematical synaptic modification rules which are used to describe how synaptic strengths should be changed in a neural network.  Some of these categories include: backpropagation of error, correlative Hebbian, and temporally-asymmetric Hebbian.<br />
  23. 23. Mathematical synaptic modification rule<br />Backpropagation of error states that connection strengths should change throughout the entire network in order to minimize the difference between the actual activity and the "desired" activity at the "output" layer of the network.<br />
  24. 24. Mathematical synaptic Modification rule<br />Correlative Hebbian states that any two interconnected neurons that are active at the same time should strengthen their connections, so that if one of the neurons is activated again in the future the other is more likely to become activated too.<br />
  25. 25. Mathematical synaptic Modification rule<br />Temporally-asymmetric Hebbian is described in more detail in the example below, but essentially emphasizes the importance of causality: if a neuron reliably fires before another, its connection to the other neuron should be strengthened. Otherwise, it should be weakened. <br />
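The two Hebbian rules can be sketched in a few lines. This is an illustrative simplification, not the formulation used by any particular paper: the product form `rate * pre * post`, the fixed step size and the use of firing times are all assumptions made for the example.

```python
def hebbian_update(weight, pre, post, rate=0.1):
    """Correlative Hebbian rule: the weight grows when both units are active together."""
    return weight + rate * pre * post


def temporal_hebbian_update(weight, t_pre, t_post, rate=0.1):
    """Temporally-asymmetric rule: strengthen the connection if the presynaptic
    unit fired first (t_pre < t_post), otherwise weaken it."""
    if t_pre < t_post:
        return weight + rate
    return weight - rate


w_together = hebbian_update(0.5, pre=1.0, post=1.0)            # both active
w_alone = hebbian_update(0.5, pre=1.0, post=0.0)               # only one active
w_causal = temporal_hebbian_update(0.5, t_pre=1.0, t_post=2.0)
w_acausal = temporal_hebbian_update(0.5, t_pre=2.0, t_post=1.0)
```

In the correlative case the weight only changes when both activations are non-zero; in the temporal case the sign of the change depends on which unit fired first.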
  26. 26. Neural Network “Learning Rules”:<br />The Delta Rule<br />The Pattern Associator<br />The Hebb Rule<br />
  27. 27. The Delta Rule<br />A generalized form of the delta rule, developed by D.E. Rumelhart, G.E. Hinton, and R.J. Williams, is needed for networks with hidden layers. They showed that this method works for the class of semilinear activation functions (non-decreasing and differentiable).<br />Generalizing the ideas of the delta rule, consider a hierarchical network with an input layer, an output layer and a number of hidden layers.<br />
  28. 28. The Delta Rule<br />We will consider only the case where there is one hidden layer. The network is presented with input signals which produce output signals that act as input to the middle layer. Output signals from the middle layer in turn act as input to the output layer to produce the final output vector. This vector is compared to the desired output vector. Since both the output and the desired output vectors are known, the delta rule can be used to adjust the weights in the output layer. <br />
  29. 29. The Delta Rule<br />Can the delta rule be applied to the middle layer? Both the input signal to each unit of the middle layer and the output signal are known. What is not known is the error generated from the output of the middle layer since we do not know the desired output. To get this error, backpropagate through the middle layer to the units that are responsible for generating that output. The error generated from the middle layer can then be used with the delta rule to adjust the weights.<br />
  30. 30. The Pattern Associator<br />A pattern associator learns associations between input patterns and output patterns. One of the most appealing characteristics of such a network is the fact that it can generalize what it learns about one pattern to other similar input patterns. Pattern associators have been widely used in distributed memory modeling.<br />
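The procedure described above, for a network with one hidden layer, can be sketched as follows. This is a minimal illustration under several assumptions that are not in the slides: sigmoid units (a semilinear activation), no bias terms, a squared-error measure, and hand-picked starting weights and learning rate.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_hidden, w_out):
    """Input -> middle layer -> output layer."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    output = [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w_out]
    return hidden, output


def train_step(x, target, w_hidden, w_out, rate=0.5):
    """One generalized-delta-rule update; returns the squared error before the update."""
    hidden, output = forward(x, w_hidden, w_out)
    # Output layer: the desired output is known, so the delta rule applies directly.
    delta_out = [(t - o) * o * (1.0 - o) for t, o in zip(target, output)]
    # Middle layer: its error is obtained by propagating the output deltas backwards.
    delta_hidden = [
        h * (1.0 - h) * sum(delta_out[k] * w_out[k][j] for k in range(len(w_out)))
        for j, h in enumerate(hidden)
    ]
    for k, row in enumerate(w_out):
        for j in range(len(row)):
            row[j] += rate * delta_out[k] * hidden[j]
    for j, row in enumerate(w_hidden):
        for i in range(len(row)):
            row[i] += rate * delta_hidden[j] * x[i]
    return sum((t - o) ** 2 for t, o in zip(target, output))


w_hidden = [[0.1, -0.2], [0.3, 0.4]]   # 2 inputs -> 2 middle-layer units
w_out = [[0.2, -0.1]]                  # 2 middle-layer units -> 1 output unit
errors = [train_step([1.0, 0.0], [1.0], w_hidden, w_out) for _ in range(200)]
```

Note that the hidden deltas are computed from the output weights before those weights are changed, so both layers are updated from the same forward pass.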
  31. 31. The Pattern Associator<br />The pattern associator is one of the more basic two-layer networks. Its architecture consists of two sets of units, the input units and the output units.<br />Each input unit connects to each output unit via weighted connections.<br />Connections are only allowed from input units to output units. <br />
  32. 32. The Pattern Associator<br />The effect of a unit u_i in the input layer on a unit u_j in the output layer is determined by the product of the activation a_i of u_i and the weight w_ij of the connection from u_i to u_j. The activation of a unit u_j in the output layer is given by: SUM over i of (w_ij * a_i).<br />
  33. 33. Adaptive Resonance Theory (ART) <br />Discrete Bidirectional Associative Memory <br />Kohonen Self-Organizing Map<br />Counter Propagation Network (CPN) <br />Perceptron<br />Vector Representation<br />ADALINE (Adaptive Linear Neuron or later Adaptive Linear Element) <br />Madaline (Multiple Adaline) <br />Backpropagation, or propagation of error<br />
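The output activation of a pattern associator can be sketched directly from that weighted sum. The weight values and the helper name `associate` are illustrative assumptions; `weights[i][j]` holds the weight from input unit i to output unit j.

```python
def associate(weights, activations):
    """Activation of each output unit j: the sum over i of w_ij * a_i."""
    n_out = len(weights[0])
    return [
        sum(weights[i][j] * a for i, a in enumerate(activations))
        for j in range(n_out)
    ]


# Three input units fully connected to two output units (made-up weights).
weights = [[1.0, 0.0],
           [0.5, -1.0],
           [0.0, 2.0]]
outputs = associate(weights, [1.0, 2.0, 0.0])
```

With these numbers the first output unit receives 1.0·1.0 + 0.5·2.0 = 2.0 and the second receives −1.0·2.0 = −2.0; the third input unit is inactive and contributes nothing.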
  34. 34. Adaptive Resonance Theory (ART) <br />Adaptive Resonance Theory (ART) is a theory developed by Stephen Grossberg and Gail Carpenter on aspects of how the brain processes information. It describes a number of neural network models which use supervised and unsupervised learning methods, and address problems such as pattern recognition and prediction.<br />
  35. 35. Discrete Bidirectional Associative Memory <br />
  36. 36. Kohonen Self-Organizing Map<br />The self-organizing map (SOM) invented by Teuvo Kohonen performs a form of unsupervised learning.<br /> A set of artificial neurons learn to map points in an input space to coordinates in an output space. The input space can have different dimensions and topology from the output space, and the SOM will attempt to preserve these.<br /> <br />
  37. 37. Kohonen Self-Organizing Map<br />If an input space is to be processed by a neural network, the first issue of importance is the structure of this space. A neural network with real inputs computes a function f defined from an input space A to an output space B. The region where f is defined can be covered by a Kohonen network in such a way that when, for example, an input vector is selected from the region a1, only one unit in the network fires. Such a tiling in which input space is classified in subregions is also called a chart or map of input space. Kohonen networks learn to create maps of the input space in a self-organizing way.<br />
  38. 38. Kohonen Self-Organizing Map: Advantages<br />Probably the best thing about SOMs is that they are very easy to understand. It’s very simple: if two items are close together and there is grey connecting them, then they are similar. If there is a black ravine between them, then they are different. Unlike Multidimensional Scaling or N-land, people can quickly pick up on how to use them in an effective manner.<br />Another great thing is that they work very well. As I have shown, they classify data well and are easily evaluated for their own quality, so you can actually calculate how good a map is and how strong the similarities between objects are. <br />
  39. 39. Kohonen Self-Organizing Map<br />
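The slides do not give the training rule, so the following is only a sketch of one common formulation, assumed here for illustration: the unit whose weight vector is nearest to the input "fires" (wins), and the winner and its neighbours on a one-dimensional map are moved toward the input. The function name, learning rate and neighbourhood radius are hypothetical choices.

```python
def som_step(weights, x, rate=0.5, radius=1):
    """One update of a 1-D map: find the winning unit, then pull the winner
    and its map neighbours toward the input x. Returns the winner's index."""
    def dist2(w):
        return sum((wi - xi) ** 2 for wi, xi in zip(w, x))

    winner = min(range(len(weights)), key=lambda k: dist2(weights[k]))
    for k in range(len(weights)):
        if abs(k - winner) <= radius:
            weights[k] = [wi + rate * (xi - wi) for wi, xi in zip(weights[k], x)]
    return winner


# Three units laid out along a line, each with a 1-D weight vector.
weights = [[0.0], [0.5], [1.0]]
winner = som_step(weights, [0.9])
```

Here the third unit wins for the input 0.9; it and its neighbour move toward the input while the first unit, outside the neighbourhood, is left unchanged.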
  40. 40. Perceptron<br />The perceptron is a type of artificial neural network invented in 1957 at the Cornell Aeronautical Laboratory by Frank Rosenblatt. It can be seen as the simplest kind of feedforward neural network: a linear classifier.<br />The perceptron is a binary classifier that maps its input x (a real-valued vector) to an output value f(x) (a single binary value):<br />f(x) = 1 if w · x + b > 0, and f(x) = 0 otherwise,<br />where w is a vector of real-valued weights and w · x is the dot product (which computes a weighted sum). b is the 'bias', a constant term that does not depend on any input value.<br />
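A minimal perceptron sketch, assuming the usual threshold form f(x) = 1 if w · x + b > 0 and 0 otherwise, trained with the perceptron learning rule (add the error times the input to the weights). The AND data set, the zero initial weights, the learning rate and the number of epochs are illustrative choices.

```python
def predict(w, b, x):
    """f(x) = 1 if w . x + b > 0, else 0."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0


def train(samples, epochs=20, rate=1.0):
    """Perceptron learning rule: nudge w and b by the error on each sample."""
    w = [0.0] * len(samples[0][0])
    b = 0.0
    for _ in range(epochs):
        for x, target in samples:
            error = target - predict(w, b, x)
            w = [wi + rate * error * xi for wi, xi in zip(w, x)]
            b += rate * error
    return w, b


# Logical AND is linearly separable, so the rule converges on it.
AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train(AND)
predictions = [predict(w, b, x) for x, _ in AND]
```

Because the perceptron is a linear classifier, the same code would not converge on a problem that is not linearly separable, such as XOR.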
  41. 41. ADALINE<br />Definition<br />Adaline is a single layer neural network with multiple nodes where each node accepts multiple inputs and generates one output. Given the following variables:<br />x is the input vector<br />w is the weight vector<br />n is the number of inputs<br />θ some constant<br />y is the output<br />then we find that the output is y = SUM over j = 1, ..., n of (x_j * w_j) + θ. If we further assume that<br />x_(n+1) = 1<br />w_(n+1) = θ<br />then the output reduces to the dot product of x and w: y = x · w<br /> <br />
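The reduction to a dot product can be checked with a short sketch. It assumes the output is the weighted sum of the n inputs plus the constant θ, as the definitions above suggest; the function names and the numbers are made up for illustration.

```python
def adaline_output(x, w, theta):
    """Weighted sum of the n inputs plus the constant theta."""
    return sum(xj * wj for xj, wj in zip(x, w)) + theta


def adaline_output_augmented(x, w, theta):
    """Same output as a plain dot product after appending x_(n+1) = 1 and w_(n+1) = theta."""
    x_aug = list(x) + [1.0]
    w_aug = list(w) + [theta]
    return sum(xj * wj for xj, wj in zip(x_aug, w_aug))


y_plain = adaline_output([1.0, 2.0], [0.5, -1.0], 0.25)
y_augmented = adaline_output_augmented([1.0, 2.0], [0.5, -1.0], 0.25)
```

Folding θ into the weight vector this way means a single learning rule can adjust the weights and the constant term together.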
  42. 42. Madaline <br />Madaline (Multiple Adaline) is a two layer neural network with a set of ADALINEs in parallel as its input layer and a single PE (processing element) in its output layer. For problems with multiple input variables and one output, each input is applied to one Adaline. For similar problems with multiple outputs, madalines in parallel can be used. The madaline network is useful for problems which involve prediction based on multiple inputs, such as weather forecasting (Input variables: barometric pressure, difference in pressure. Output variables: rain, cloudy, sunny).<br />
  43. 43. Backpropagation<br />Backpropagation, or propagation of error, is a common method of teaching artificial neural networks how to perform a given task. It was first described by Arthur E. Bryson and Yu-Chi Ho in 1969,[1][2] but it wasn't until 1986, through the work of David E. Rumelhart, Geoffrey E. Hinton and Ronald J. Williams, that it gained recognition, and it led to a “renaissance” in the field of artificial neural network research.<br />It is a supervised learning method, and is a generalization of the delta rule. It requires a teacher that knows, or can calculate, the desired output for any given input. It is most useful for feed-forward networks (networks that have no feedback, or simply, that have no connections that loop). The term is an abbreviation for "backwards propagation of errors". Backpropagation requires that the activation function used by the artificial neurons (or "nodes") is differentiable.<br />
  44. 44. Backpropagation<br />Calculation of error <br />d_k = f(D_k) − f(O_k)<br />
  45. 45. Network Structure: Back-propagation Network<br />Output units O_i<br />Weights W_j,i (hidden to output)<br />Hidden units a_j<br />Weights W_k,j (input to hidden)<br />Input units I_k<br />
  46. 46. Counter propagation network (CPN) (§ 5.3)<br />Basic idea of CPN<br />Purpose: fast and coarse approximation of vector mapping<br />not to map any given x to its desired output with given precision,<br />input vectors x are divided into clusters/classes.<br />each cluster of x has one output y, which is (hopefully) the average of the desired outputs for all x in that class.<br />Architecture: Simple case: FORWARD ONLY CPN<br />[Figure: forward-only CPN with input units x_1 ... x_n, hidden (class) units z_1 ... z_p and output units y_1 ... y_m; weights w_k,i run from input to hidden (class) and weights v_j,k run from hidden (class) to output]<br />
  47. 47. <ul><li>Learning in two phases: </li></ul>training sample (x, d), where d is the desired precise mapping of x<br />Phase 1: weights coming into hidden nodes are trained by competitive learning to become the representative vector of a cluster of input vectors x (use only x, the input part of (x, d))<br /> 1. For a chosen x, feed forward to determine the winning hidden node<br /> 2. Move the winner's incoming weights toward x<br /> 3. Reduce the learning rate, then repeat steps 1 and 2 until the stop condition is met<br />Phase 2: weights going out of hidden nodes are trained by the delta rule to be an average output of d, where x is an input vector that causes that hidden node to win (use both x and d). <br /> 1. For a chosen x, feed forward to determine the winning hidden node<br /> 2. (optional) <br /> 3. Adjust the winner's outgoing weights toward d<br /> 4. Repeat steps 1 to 3 until the stop condition is met <br />
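The update equations for the two phases are missing from the transcript, so the following is a sketch under assumed, commonly used forms: in phase 1 the winning hidden node's incoming weights move toward x, and in phase 2 its outgoing weights move toward d. All names, learning rates, the decay schedule and the toy data are hypothetical.

```python
def nearest(w, x):
    """Index of the hidden (class) node whose incoming weights are closest to x."""
    return min(
        range(len(w)),
        key=lambda k: sum((wi - xi) ** 2 for wi, xi in zip(w[k], x)),
    )


def train_cpn(samples, w, v, eta=0.5, beta=0.5, epochs=50, decay=0.9):
    """Forward-only CPN sketch. w[k]: weights into hidden node k; v[k]: weights out of it."""
    rate = eta
    for _ in range(epochs):                       # Phase 1: competitive learning on x only
        for x, _d in samples:
            k = nearest(w, x)
            w[k] = [wi + rate * (xi - wi) for wi, xi in zip(w[k], x)]
        rate *= decay                             # reduce the learning rate
    rate = beta
    for _ in range(epochs):                       # Phase 2: delta rule on the winner's outputs
        for x, d in samples:
            k = nearest(w, x)
            v[k] = [vi + rate * (di - vi) for vi, di in zip(v[k], d)]
        rate *= decay
    return w, v


def cpn_predict(x, w, v):
    return v[nearest(w, x)]


# Two well-separated input clusters with different desired outputs (made-up data).
samples = [([0.0, 0.1], [1.0]), ([0.1, 0.0], [3.0]),
           ([1.0, 0.9], [10.0]), ([0.9, 1.0], [12.0])]
w = [[0.0, 0.0], [1.0, 1.0]]
v = [[0.0], [0.0]]
train_cpn(samples, w, v)
y_a = cpn_predict([0.05, 0.05], w, v)[0]
y_b = cpn_predict([0.95, 0.95], w, v)[0]
```

Each cluster ends up with a single output that lies between the desired outputs of its members, which is the coarse approximation the slide describes rather than a precise mapping of every x.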
  48. 48. Adaptive Resonance Theory<br />
  49. 49. Adaptive Resonance Theory<br />Adaptive Resonance Theory (ART) was developed by Grossberg (1976)<br />Input vectors which are close to each other according to a specific similarity measure should be mapped to the same cluster<br />ART adapts itself by storing input patterns, and tries to find the best match for the input pattern <br />
  50. 50. Adaptive Resonance Theory 1 (ART 1)<br />ART 1 is a binary classification model. <br />Various other versions of the model have evolved from ART 1<br />Pointers to these can be found in the bibliographic remarks<br />The main network comprises the layers F1, F2 and the attentional gain control as the attentional subsystem<br />The attentional vigilance node forms the orienting subsystem<br />
  51. 51. ART 1: Architecture<br />[Figure: attentional subsystem (layers F1 and F2 with gain control G) and orienting subsystem (vigilance node A), driven by the external input I through excitatory (+) and inhibitory (-) connections]<br />
  52. 52. ART 1: 2/3 Rule<br />[Figure: an F1 neuron with its three inputs: external input I_i, top-down feedback from F2 node J through weight v_ji, and gain control signal s_G from G]<br />Three kinds of inputs to each F1 neuron decide when the neuron fires<br /><ul><li>External input I_i
  53. 53. Top-down feedback through outstar weights v_ji
  54. 54. Gain control signal s_G</li></li></ul><li>ART 1: 2/3 Rule<br />The gain control signal s_G = 1 if I is presented and all neurons in F2 are inactive<br />s_G is nonspecific<br />When the input is initially presented to the system, s_G = 1<br />As soon as a node J in F2 fires as a result of competition, s_G = 0<br />
  55. 55. Adaptive Resonance Theory (ART) <br /><ul><li>ART1: for binary patterns; ART2: for continuous patterns
  56. 56. Motivations: Previous methods have the following problems:</li></ul>Number of class nodes is pre-determined and fixed. <br /><ul><li>Under- and over- classification may result from training
  57. 57. Some nodes may have empty classes.
  58. 58. no control of the degree of similarity of inputs grouped in one class. </li></ul>Training is non-incremental: <br /><ul><li>with a fixed set of samples,
  59. 59. adding new samples often requires retraining the network with the enlarged training set until a new stable state is reached.</li></li></ul><li>Ideas of ART model:<br />suppose the input samples have been appropriately classified into k clusters (say by some fashion of competitive learning).<br />each weight vector is a representative (average) of all samples in that cluster.<br />when a new input vector x arrives<br />Find the winner j* among all k cluster nodes<br />Compare the winner's weight vector with x<br /> if they are sufficiently similar (x resonates with class j*),<br /> then update the winner's weight vector based on x<br /> else, find/create a free class node and make x its<br /> first member.<br />
  60. 60. To achieve these, we need:<br />a mechanism for testing and determining (dis)similarity between x and the winner's weight vector.<br />a control for finding/creating new class nodes.<br />need to have all operations implemented by units of local computation.<br />Only the basic ideas are presented<br />Simplified from the original ART model<br />Some of the control mechanisms realized by various specialized neurons are done by logic statements of the algorithm<br />
  61. 61. ART1 Architecture<br />
  62. 62. Working of ART1<br />3 phases after each input vector x is applied<br />Recognition phase: determine the winner cluster for x<br />Using bottom-up weights b<br />Winner j* with max y_j* = b_j* · x<br />x is tentatively classified to cluster j*<br />the winner may be far away from x (e.g., |t_j* − x| is unacceptably large)<br />
  63. 63. Working of ART1 (3 phases)<br />Comparison phase: <br />Compute similarity using top-down weights t: <br /> vector s with components s_i = t_j*,i · x_i<br />If (# of 1’s in s) / (# of 1’s in x) > ρ, accept the classification, update b_j* and t_j*<br />else: remove j* from further consideration, look for another potential winner or create a new node with x as its first pattern. <br />
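The comparison phase can be sketched as a small function. It assumes binary patterns stored as lists of 0s and 1s, takes the similarity vector s as the component-wise AND of the winner's top-down weights t and the input x, and assumes x contains at least one 1. The function name is hypothetical.

```python
def comparison_phase(x, t, rho):
    """Vigilance test: accept the winner if (# of 1's in s) / (# of 1's in x) > rho."""
    s = [ti & xi for ti, xi in zip(t, x)]   # 1 only where both t and x have a 1
    accepted = sum(s) / sum(x) > rho
    return s, accepted


x = [0, 0, 0, 1, 1]
t = [0, 0, 0, 0, 1]                          # top-down weights of the tentative winner
s_low, ok_low = comparison_phase(x, t, rho=0.3)
s_high, ok_high = comparison_phase(x, t, rho=0.7)
```

Half of the 1s in x are matched by t in this example, so the same winner passes the test at vigilance 0.3 and fails it at 0.7.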
  64. 64. Weight update/adaptive phase<br />Initial weight: (no bias)<br /> bottom up: top down:<br />When a resonance occurs with<br />If k sample patterns are clustered to node j, then<br /> t_j = pattern whose 1’s are common to all these k samples <br />
  65. 65.
  66. 66. Example <br />for input x(1)<br />Node 1 wins<br />
  67. 67.
  68. 68. Notes<br />Classification as a search process<br />No two classes have the same b and t<br />Outliers that do not belong to any cluster will be assigned separate nodes<br />Different ordering of sample input presentations may result in different classification.<br />Increase of ρ increases # of classes learned, and decreases the average class size.<br />Classification may shift during search, will reach stability eventually.<br />There are different versions of ART1 with minor variations<br />ART2 is the same in spirit but different in details.<br />
  69. 69. ART1 Architecture<br />[Figure: ART1 architecture with control units R, G1 and G2 and excitatory (+) and inhibitory (-) connections]<br />
  70. 70. cluster units: competitive, receive input vector x through weights b to determine winner j.<br /> input units: placeholder or external inputs<br /> interface units: <br />pass s to x as input vector for classification by the cluster units<br />compare x and the winner's top-down weights<br />controlled by gain control unit G1<br />Needs to sequence the three phases (by control units G1, G2, and R)<br />
  71. 71. R = 0: resonance occurs, update b_j* and t_j*<br />R = 1: fails similarity test, inhibits J from further computation<br />
  72. 72. ART clustering algorithms<br /><ul><li>ART1
  73. 73. ART2
  74. 74. ART3
  75. 75. ARTMAP
  76. 76. Fuzzy ART</li></li></ul><li>Fuzzy ART Modeling<br />
  77. 77. Fuzzy ART<br />Layer 1 consists of neurons that are connected to the neurons in Layer 2 through weight vectors.<br />The number of neurons in Layer 1 depends on the characteristics of the input data.<br />The neurons in Layer 2 represent clusters.<br />
  78. 78.
  79. 79. Fuzzy ART Architecture <br />
  80. 80. Fuzzy ART FMEA<br />FMEA values are evaluated separately with severity, detection and occurrence values<br />The aim is to apply the Fuzzy ART algorithm to the FMEA method and, by performing FMEA on test problems, to investigate the most favorable parameter combinations (α, β and ρ).<br />
  81. 81. Hand-worked Example<br />Cluster the vectors 11100, 11000, 00001, 00011<br />Low vigilance: 0.3<br />High vigilance: 0.7<br />
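The hand-worked figures are not in the transcript, so the following sketch only shows what a simplified, single-pass ART1 procedure produces for these four vectors; it may differ in detail from the slides' worked solution. It assumes the vigilance test described earlier, a bottom-up score of L·|t ∧ x| / (L − 1 + |t|) with L = 2 for ranking candidate winners, and fast learning where the winner's prototype becomes t ∧ x. The function name is hypothetical.

```python
def art1_cluster(patterns, rho, L=2.0):
    """Simplified ART1: returns (cluster label per pattern, cluster prototypes)."""
    prototypes = []   # top-down weight vector t_j of each cluster node
    labels = []
    for x in patterns:
        def score(j):
            # Assumed bottom-up activation used to rank candidate winners.
            t = prototypes[j]
            overlap = sum(ti & xi for ti, xi in zip(t, x))
            return L * overlap / (L - 1 + sum(t))

        for j in sorted(range(len(prototypes)), key=score, reverse=True):
            s = [ti & xi for ti, xi in zip(prototypes[j], x)]
            if sum(s) / sum(x) > rho:          # vigilance test passed: resonance
                prototypes[j] = s              # keep only the 1's common to all members
                labels.append(j)
                break
        else:
            prototypes.append(list(x))         # no node resonates: create a new one
            labels.append(len(prototypes) - 1)
    return labels, prototypes


patterns = [[1, 1, 1, 0, 0], [1, 1, 0, 0, 0], [0, 0, 0, 0, 1], [0, 0, 0, 1, 1]]
labels_low, protos_low = art1_cluster(patterns, rho=0.3)
labels_high, protos_high = art1_cluster(patterns, rho=0.7)
```

Under these assumptions the low vigilance groups the vectors into two clusters, while the high vigilance rejects the match between 00011 and the prototype 00001 and opens a third cluster, in line with the note that a larger ρ yields more, smaller classes.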
  82. 82. Hand-worked Example: ρ = 0.3<br />
  83. 83. ART 1: Clustering Application<br />ρ = 0.3<br />
  84. 84. Hand-worked Example: ρ = 0.7<br />
  85. 85. ART 1: Clustering Application<br />ρ = 0.7<br />
  86. 86. Neurophysiological Evidence for ART Mechanisms<br />The attentional subsystem of an ART network has been used to model aspects of the inferotemporal cortex<br />Orienting subsystem has been used to model a part of the hippocampal system, which is known to contribute to memory functions<br />The feedback prevalent in an ART network can help focus attention in models of visual object recognition<br />
  87. 87. Other Applications<br />Aircraft Part Design Classification System.<br />See text for details.<br />
  88. 88. Ehrenstein Pattern Explained by ART<br />Generates a circular illusory contour: a circular disc of enhanced brightness<br />The bright disc disappears when the alignment of the dark lines is disturbed!<br />
  89. 89. Other Neurophysiological Evidence<br />Adam Sillito [University College, London]<br />Cortical feedback in a cat tunes cells in its LGN to respond best to lines of a specific length. <br />Chris Redie [MPI Entwicklungsbiologie, Germany]<br />Found that some visual cells in a cat’s LGN and cortex respond best at line ends, more strongly to line ends than line sides. <br />Sillito et al. [University College, London]<br /> Provide neurophysiological data suggesting that the cortico-geniculate feedback closely resembles the matching and resonance of an ART network. <br />Cortical feedback has been found to change the output of specific LGN cells, increasing the gain of the input for feature linked events that are detected by the cortex. <br />
  90. 90. Computational Experiment<br />A non-binary dataset of FMEA is used to evaluate the performance of the Fuzzy ART neural network on different test problems<br />
  91. 91. Computational Experiment<br />For a comprehensive analysis of the effects of parameters on the performance of Fuzzy ART in the FMEA case, a number of levels of parameters are considered.<br />
  92. 92. Computational Experiment<br />The Fuzzy ART neural network method is applied to determine the most favorable parameter (α, β and ρ) combinations during application of FMEA on test problems<br />
  93. 93. Results<br />For any test problem 900 solutions are obtained. <br />The β-ρ interactions for parameter combinations are considered where solutions are obtained. For each test problem, all the combinations are evaluated and frequency distributions of clusters are constructed<br />
  94. 94.
  95. 95. Results<br />For example, for test problem 1, four groups that make up 70% of the combinations are selected, and the cluster numbers that contain at least 80% of all combinations are determined according to the results of Pareto analysis. These are groups 2-3 and 4<br />
  96. 96. Results<br />Parameter combinations, β-ρ interactions and the number of α parameters in any combination of β and ρ are shown at the side. Favorable solutions are marked in bold and italic<br />
  97. 97. Results<br />The number of clusters increases with the increase in ρ.<br />The number of clusters increases with the increase in β.<br />Clustering of the data in most problems depends on the interaction between the β and ρ parameters. The α parameter has no effect on the solution in small-scale problems, but in large-scale problems the effect of α becomes irregular. <br />Also with the increase in problem scale, the change in number of clusters is defined. <br />
  98. 98. Results<br />In FMEA test problems, which determine most favorable parameter combinations, β-ρ interactions providing appropriate cluster numbers are noted on the summary table that evaluates each test problem separately. The values involving favorable β-ρ combinations are marked with the blue area. This is a suitable solution area for the FMEA problem. <br />
  99. 99. ART 1: Clustering Application<br />Clustering pixel based alphabet images<br />
  100. 100. Conclusion and Discussion<br />Fuzzy ART neural network is applied to FMEA <br />Appropriate parameter intervals are investigated for giving successful results of Fuzzy ART in FMEA problems. <br />The investigations show that if the number of inputs is smaller than or equal to 30, the FMEA problem is defined as small scale; otherwise it is large scale. <br />We suggest that the number of clusters should be between 2 and 6 for small-scale problems in practical studies.<br />The number of clusters for large-scale problems should be at most 12 in practical studies. <br />
  101. 101. Conclusion and Discussion<br />Determinations about α:<br />In small scale problems, α increases the number of clusters only if β is greater than or equal to 0.8. In other conditions, it is observed that α values have no effect on the solution. <br />In large scale problems, an appropriate interval cannot be determined because the effect of α becomes irregular<br />Determinations about β:<br />For both small and large scale problems, the number of clusters increases with the increase in β. <br />Determinations about ρ:<br />For both small and large scale problems, the number of clusters increases with the increase in ρ. <br />
  102. 102. Conclusion and Discussion<br />For small and large scale problems in FMEA, the Fuzzy ART algorithm is fast, effective and easy to implement. Parameter combinations are acquired where the best solution is obtained for non-binary problems. <br />
  103. 103. References :<br />http://www.learnartificialneuralnetworks.com/backpropagation.html#deltarule<br />http://uhavax.hartford.edu/compsci/neural-networks-learning.html<br />http://www.neurevolution.net/2007/03/15/neural-network-learning-rules/<br />http://dmcer.net/papers/JilkCerOReilly03_cns.pdf<br />http://www.learnartificialneuralnetworks.com/<br />
  104. 104. References :<br />Carpenter, G.A. and Grossberg, S. (1987a), "A massively parallel architecture for a self-organizing neural pattern recognition machine", Computer Vision, Graphics, and Image Processing 37, 54-115.<br />Carpenter, G.A. and Grossberg, S. (1987b), "ART 2: Stable self-organization of pattern recognition codes for analog input patterns", Applied Optics 26, 4919-4930.<br />Carpenter, G.A. and Grossberg, S. (1990), "ART 3: Hierarchical search using chemical transmitters in self-organizing pattern recognition architectures", Neural Networks 3, 129-152.<br />Carpenter, G.A., Grossberg, S. and Reynolds, J.H. (1991a), "ARTMAP: Supervised real-time learning and classification of non-stationary data by a self-organizing neural network", Neural Networks 4, 565-588.<br />
