ELM: Extreme Learning Machine: Learning without iterative tuning


  1. Extreme Learning Machine: Learning Without Iterative Tuning
     Guang-Bin Huang, Associate Professor, School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore.
     Talks given at ELM2010 (Adelaide, Australia, 7 Dec 2010), HP Labs (Palo Alto, USA), Microsoft Research (Redmond, USA), UBC (Canada), Institute for Infocomm Research (Singapore), and EEE/NTU (Singapore).
     ELM Web Portal: www.extreme-learning-machines.org
  2. Outline
     1 Feedforward Neural Networks: Single-Hidden Layer Feedforward Networks (SLFNs); Function Approximation of SLFNs; Classification Capability of SLFNs; Conventional Learning Algorithms of SLFNs; Support Vector Machines
     2 Extreme Learning Machine: Generalized SLFNs; New Learning Theory: Learning Without Iterative Tuning; ELM Algorithm; Differences between ELM and LS-SVM
     3 ELM and Conventional SVM
     4 Online Sequential ELM
  7. Feedforward Neural Networks with Additive Nodes
     Output of hidden nodes: G(a_i, b_i, x) = g(a_i · x + b_i)  (1)
     a_i: the weight vector connecting the ith hidden node and the input nodes.
     b_i: the threshold of the ith hidden node.
     Output of SLFNs: f_L(x) = Σ_{i=1}^L β_i G(a_i, b_i, x)  (2)
     β_i: the weight vector connecting the ith hidden node and the output nodes.
     Figure 1: SLFN with additive hidden nodes.
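Equations (1)-(2) can be sketched directly in NumPy. This is a minimal illustration, assuming a sigmoid g and a scalar output; all sizes and names (n_in, L, etc.) are illustrative choices, not from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, L = 3, 10                      # input dimension, number of hidden nodes

a = rng.standard_normal((L, n_in))   # a_i: input weights of the ith node
b = rng.standard_normal(L)           # b_i: threshold of the ith node
beta = rng.standard_normal(L)        # beta_i: output weights (scalar output)

def g(z):
    # Sigmoid activation, one common choice of g.
    return 1.0 / (1.0 + np.exp(-z))

def f_L(x):
    # Eq. (2): f_L(x) = sum_i beta_i * g(a_i . x + b_i)
    return beta @ g(a @ x + b)

print(f_L(np.ones(n_in)))
```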
  10. Feedforward Neural Networks with RBF Nodes
     Output of hidden nodes: G(a_i, b_i, x) = g(b_i ‖x − a_i‖)  (3)
     a_i: the center of the ith hidden node.
     b_i: the impact factor of the ith hidden node.
     Output of SLFNs: f_L(x) = Σ_{i=1}^L β_i G(a_i, b_i, x)  (4)
     β_i: the weight vector connecting the ith hidden node and the output nodes.
     Figure 2: Feedforward network architecture with RBF hidden nodes.
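The RBF node of Eq. (3) differs from the additive node only in how the raw input is combined with (a_i, b_i). A minimal sketch, assuming a Gaussian g(r) = exp(−r²); sizes and names are again illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, L = 3, 10

a = rng.standard_normal((L, n_in))   # a_i: center of the ith node
b = rng.random(L) + 0.1              # b_i: positive impact factor
beta = rng.standard_normal(L)

def G(x):
    # Eq. (3): G(a_i, b_i, x) = g(b_i * ||x - a_i||), with Gaussian g
    r = b * np.linalg.norm(x - a, axis=1)
    return np.exp(-r ** 2)

def f_L(x):
    # Eq. (4): same linear combination as the additive case
    return beta @ G(x)
```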
  14. Function Approximation of Neural Networks
     Mathematical Model: any continuous target function f(x) can be approximated by SLFNs with adjustable hidden nodes. In other words, given any small positive value ε, for SLFNs with a sufficient number of hidden nodes L we have ‖f_L(x) − f(x)‖ < ε  (5)
     Learning Issue: in real applications the target function f is usually unknown. One wishes that the unknown f could be approximated appropriately by an SLFN f_L.
     Figure 3: SLFN.
  17. Function Approximation of Neural Networks
     Learning Model: for N arbitrary distinct samples (x_i, t_i) ∈ R^n × R^m, SLFNs with L hidden nodes and activation function g(x) are mathematically modeled as f_L(x_j) = o_j, j = 1, ..., N  (6)
     Cost function: E = Σ_{j=1}^N ‖o_j − t_j‖₂.
     The goal is to minimize the cost function E by adjusting the network parameters β_i, a_i, b_i.
     Figure 4: SLFN.
  22. Classification Capability of SLFNs
     As long as SLFNs can approximate any continuous target function f(x), such SLFNs can differentiate any disjoint regions.
     Figure 5: SLFN.
     G.-B. Huang, et al., "Classification Ability of Single Hidden Layer Feedforward Neural Networks," IEEE Transactions on Neural Networks, vol. 11, no. 3, pp. 799-801, 2000.
  24. Learning Algorithms of Neural Networks
     Learning Methods: many learning methods, mainly based on gradient-descent/iterative approaches, have been developed over the past two decades:
     Back-Propagation (BP) and its variants, which are the most popular.
     Least-squares (LS) solutions for RBF networks, with a single impact factor for all hidden nodes.
     Figure 6: Feedforward network architecture.
  28. Advantages and Disadvantages
     Popularity: widely used in various applications such as regression and classification.
     Limitations:
     Usually different learning algorithms are needed for different SLFN architectures.
     Some parameters have to be tuned manually.
     Overfitting.
     Local minima.
     Time-consuming training.
  31. Support Vector Machine
     SVM optimization formulation:
     Minimize: L_P = (1/2)‖w‖² + C Σ_{i=1}^N ξ_i
     Subject to: t_i(w · φ(x_i) + b) ≥ 1 − ξ_i, i = 1, ..., N; ξ_i ≥ 0, i = 1, ..., N  (7)
     The decision function of SVM is: f(x) = sign( Σ_{s=1}^{N_s} α_s t_s K(x, x_s) + b )  (8)
     Figure 7: SVM architecture.
     B. Frénay and M. Verleysen, "Using SVMs with Randomised Feature Spaces: an Extreme Learning Approach," ESANN, Bruges, Belgium, pp. 315-320, 28-30 April 2010.
     G.-B. Huang, et al., "Optimization Method Based Extreme Learning Machine for Classification," Neurocomputing, vol. 74, pp. 155-163, 2010.
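The decision function of Eq. (8) can be sketched as follows, with a Gaussian kernel. The support vectors, dual coefficients α_s and bias b below are made-up placeholders purely for illustration; a real SVM would obtain them by solving the dual of the optimization problem in Eq. (7).

```python
import numpy as np

def K(x, y, gamma=1.0):
    # Gaussian (RBF) kernel, one common choice of K
    return np.exp(-gamma * np.linalg.norm(x - y) ** 2)

xs = np.array([[0.0, 0.0], [1.0, 1.0]])   # support vectors (placeholders)
ts = np.array([1.0, -1.0])                # their labels
alphas = np.array([0.5, 0.5])             # dual coefficients (placeholders)
b = 0.0

def f(x):
    # Eq. (8): sign of the kernel expansion over the support vectors
    s = sum(a_s * t_s * K(x, x_s) for a_s, t_s, x_s in zip(alphas, ts, xs))
    return np.sign(s + b)

print(f(np.array([0.1, 0.1])))   # prints 1.0, nearest the positive support vector
```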
  33. Generalized SLFNs
     General Hidden Layer Mapping
     Output function: f(x) = Σ_{i=1}^L β_i G(a_i, b_i, x)
     The hidden layer output function (hidden layer mapping): h(x) = [G(a_1, b_1, x), ..., G(a_L, b_L, x)]
     The output function need not be:
     Sigmoid: G(a_i, b_i, x) = g(a_i · x + b_i)
     RBF: G(a_i, b_i, x) = g(b_i ‖x − a_i‖)
     Figure 8: SLFN with any type of piecewise continuous G(a_i, b_i, x).
     G.-B. Huang, et al., "Universal Approximation Using Incremental Constructive Feedforward Networks with Random Hidden Nodes," IEEE Transactions on Neural Networks, vol. 17, no. 4, pp. 879-892, 2006.
     G.-B. Huang, et al., "Convex Incremental Extreme Learning Machine," Neurocomputing, vol. 70, pp. 3056-3062, 2007.
  36. New Learning Theory: Learning Without Iterative Tuning
     New Learning View
     Learning Without Iterative Tuning: given any nonconstant piecewise continuous function g, if a continuous target function f(x) can be approximated by SLFNs with adjustable hidden nodes g, then the hidden node parameters of such SLFNs need not be tuned. All these hidden node parameters can be randomly generated without knowledge of the training data.
     That is, for any continuous target function f and any randomly generated sequence {(a_i, b_i)}_{i=1}^L,
     lim_{L→∞} ‖f(x) − f_L(x)‖ = lim_{L→∞} ‖f(x) − Σ_{i=1}^L β_i G(a_i, b_i, x)‖ = 0
     holds with probability one if β_i is chosen to minimize ‖f(x) − f_L(x)‖, i = 1, ..., L.
     G.-B. Huang, et al., "Universal Approximation Using Incremental Constructive Feedforward Networks with Random Hidden Nodes," IEEE Transactions on Neural Networks, vol. 17, no. 4, pp. 879-892, 2006.
     G.-B. Huang, et al., "Convex Incremental Extreme Learning Machine," Neurocomputing, vol. 70, pp. 3056-3062, 2007.
     G.-B. Huang, et al., "Enhanced Random Search Based Incremental Extreme Learning Machine," Neurocomputing, vol. 71, pp. 3460-3468, 2008.
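This theorem can be illustrated numerically (this is an illustration, not a proof): with randomly generated sigmoid hidden nodes and β chosen by least squares, the approximation error of a continuous target shrinks as L grows. The target sin(3x) and the sample size are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 200).reshape(-1, 1)
y = np.sin(3 * x).ravel()                  # a continuous target f

def train_error(L):
    # Random hidden node parameters, generated independently of the data
    a = rng.standard_normal((L, 1))
    b = rng.standard_normal(L)
    H = 1.0 / (1.0 + np.exp(-(x @ a.T + b)))   # random sigmoid features
    # beta chosen to minimize ||f - f_L|| over the samples
    beta, *_ = np.linalg.lstsq(H, y, rcond=None)
    return np.linalg.norm(H @ beta - y)

print(train_error(5), train_error(50))     # the error drops as L grows
```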
  38. Unified Learning Platform
     Mathematical Model: for N arbitrary distinct samples (x_i, t_i) ∈ R^n × R^m, SLFNs with L hidden nodes, each with output function G(a_i, b_i, x), are mathematically modeled as
     Σ_{i=1}^L β_i G(a_i, b_i, x_j) = t_j, j = 1, ..., N  (9)
     (a_i, b_i): hidden node parameters.
     β_i: the weight vector connecting the ith hidden node and the output node.
     Figure 9: Generalized SLFN with any type of piecewise continuous G(a_i, b_i, x).
  40. Extreme Learning Machine (ELM)
     Mathematical Model: Σ_{i=1}^L β_i G(a_i, b_i, x_j) = t_j, j = 1, ..., N, is equivalent to Hβ = T, where

     H = [ h(x_1) ]   [ G(a_1, b_1, x_1)  ...  G(a_L, b_L, x_1) ]
         [  ...   ] = [        ...        ...         ...       ]   (N × L)  (10)
         [ h(x_N) ]   [ G(a_1, b_1, x_N)  ...  G(a_L, b_L, x_N) ]

     β = [ β_1^T; ...; β_L^T ]  (L × m)   and   T = [ t_1^T; ...; t_N^T ]  (N × m)  (11)

     H is called the hidden layer output matrix of the neural network; the ith column of H is the output of the ith hidden node with respect to inputs x_1, x_2, ..., x_N.
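The shapes in Eqs. (10)-(11) can be checked with a short NumPy sketch; the sizes and the sigmoid choice of G are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
N, n, m, L = 6, 3, 2, 4          # samples, inputs, outputs, hidden nodes

X = rng.standard_normal((N, n))  # row j is x_j
A = rng.standard_normal((L, n))  # row i is a_i
b = rng.standard_normal(L)       # b_i

# H[j, i] = G(a_i, b_i, x_j); row j of H is h(x_j)
H = 1.0 / (1.0 + np.exp(-(X @ A.T + b)))
print(H.shape)                   # prints (6, 4), i.e. N x L
```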
  42. Extreme Learning Machine (ELM)
     Three-Step Learning Model: given a training set ℵ = {(x_i, t_i) | x_i ∈ R^n, t_i ∈ R^m, i = 1, ..., N}, a hidden node output function G(a, b, x), and the number of hidden nodes L:
     1 Randomly assign the hidden node parameters (a_i, b_i), i = 1, ..., L.
     2 Calculate the hidden layer output matrix H.
     3 Calculate the output weights β: β = H†T, where H† is the Moore-Penrose generalized inverse of the hidden layer output matrix H.
     Source code for ELM: http://www.extreme-learning-machines.org
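The three-step learning model above can be sketched in a few lines of NumPy. The data, sizes and sigmoid activation are illustrative assumptions; only the three steps themselves follow the slide.

```python
import numpy as np

rng = np.random.default_rng(0)
N, n, m, L = 100, 2, 1, 20

X = rng.standard_normal((N, n))
T = np.sin(X[:, :1]) + 0.01 * rng.standard_normal((N, m))  # toy targets

# Step 1: randomly assign hidden node parameters (a_i, b_i).
A = rng.standard_normal((L, n))
b = rng.standard_normal(L)

# Step 2: calculate the hidden layer output matrix H.
H = 1.0 / (1.0 + np.exp(-(X @ A.T + b)))

# Step 3: beta = H† T via the Moore-Penrose pseudoinverse.
beta = np.linalg.pinv(H) @ T

pred = H @ beta
print(np.mean((pred - T) ** 2))  # small training MSE
```

Note that no iteration over the training data occurs anywhere: the only fitted quantity is β, obtained in one linear-algebra step.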
  47. 47. ELM Learning Algorithm

Salient Features
"Simple Math is Enough." ELM is a simple, tuning-free three-step algorithm, and its learning speed is extremely fast.
- The hidden node parameters a_i and b_i are independent not only of the training data but also of each other. Unlike conventional learning methods, which MUST see the training data before generating the hidden node parameters, ELM can generate the hidden node parameters before seeing the training data.
- Unlike traditional gradient-based learning algorithms, which work only for differentiable activation functions, ELM works for all bounded nonconstant piecewise continuous activation functions.
- Unlike traditional gradient-based learning algorithms, which face issues such as local minima, improper learning rates and overfitting, ELM tends to reach the solution directly without such trivial issues.
- The ELM learning algorithm is much simpler than other popular learning algorithms for neural networks and support vector machines.

G.-B. Huang, et al., "Can Threshold Networks Be Trained Directly?", IEEE Transactions on Circuits and Systems II, vol. 53, no. 3, pp. 187-191, 2006.
M.-B. Li, et al., "Fully Complex Extreme Learning Machine", Neurocomputing, vol. 68, pp. 306-314, 2005.
  48. 48. Output Functions of Generalized SLFNs

Ridge regression theory based ELM

f(x) = h(x)β = h(x) H^T (I/C + H H^T)^{-1} T

and

f(x) = h(x)β = h(x) (I/C + H^T H)^{-1} H^T T

Ridge Regression Theory
A positive value I/C can be added to the diagonal of H^T H or H H^T when computing the Moore-Penrose generalized inverse of H; the resultant solution is more stable and tends to give better generalization performance.

A. E. Hoerl and R. W. Kennard, "Ridge regression: Biased estimation for nonorthogonal problems", Technometrics, vol. 12, no. 1, pp. 55-67, 1970.
G.-B. Huang, H. Zhou, X. Ding and R. Zhang, "Extreme Learning Machine for Regression and Multi-Class Classification", submitted to IEEE Transactions on Pattern Analysis and Machine Intelligence, October 2010.
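The two ridge-regression forms can be computed directly; a sketch in NumPy, with function names of my choosing and np.linalg.solve used in place of an explicit matrix inverse. The push-through identity (I/C + H^T H)^{-1} H^T = H^T (I/C + H H^T)^{-1} makes the forms interchangeable, so in practice one solves whichever system is smaller.

```python
import numpy as np

def ridge_elm_beta_small_L(H, T, C):
    # beta = (I/C + H^T H)^{-1} H^T T : an L x L system, cheap when L <= N.
    L = H.shape[1]
    return np.linalg.solve(np.eye(L) / C + H.T @ H, H.T @ T)

def ridge_elm_beta_small_N(H, T, C):
    # beta = H^T (I/C + H H^T)^{-1} T : an N x N system, cheap when N <= L.
    N = H.shape[0]
    return H.T @ np.linalg.solve(np.eye(N) / C + H @ H.T, T)
```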
  50. 50. Output Functions of Generalized SLFNs

Valid for both kernel and non-kernel learning

1 Non-kernel based:
f(x) = h(x) H^T (I/C + H H^T)^{-1} T
and
f(x) = h(x) (I/C + H^T H)^{-1} H^T T

2 Kernel based (if h(x) is unknown):
f(x) = [K(x, x_1), …, K(x, x_N)] (I/C + Ω_ELM)^{-1} T
where (Ω_ELM)_{i,j} = h(x_i) · h(x_j) = K(x_i, x_j)

Citation: G.-B. Huang, H. Zhou, X. Ding and R. Zhang, "Extreme Learning Machine for Regression and Multi-Class Classification", submitted to IEEE Transactions on Pattern Analysis and Machine Intelligence, October 2010.
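A minimal sketch of the kernel-based form, assuming an RBF kernel K(x, y) = exp(-γ‖x − y‖²) as an illustrative choice (the slide leaves K unspecified); the function names are hypothetical.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    # K[i, j] = exp(-gamma * ||a_i - b_j||^2)
    sq = np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=-1)
    return np.exp(-gamma * sq)

def kernel_elm_train(X, T, C, gamma):
    # alpha = (I/C + Omega_ELM)^{-1} T, with Omega_ELM[i, j] = K(x_i, x_j).
    Omega = rbf_kernel(X, X, gamma)
    return np.linalg.solve(np.eye(X.shape[0]) / C + Omega, T)

def kernel_elm_predict(Xq, X, alpha, gamma):
    # f(x) = [K(x, x_1), ..., K(x, x_N)] alpha
    return rbf_kernel(Xq, X, gamma) @ alpha
```

No explicit feature map h(x) is ever formed; only the N x N kernel matrix Ω_ELM is needed.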
  54. 54. Differences between ELM and LS-SVM

ELM (unified for regression, binary/multi-class cases)
1 Non-kernel based:
f(x) = h(x) H^T (I/C + H H^T)^{-1} T
and
f(x) = h(x) (I/C + H^T H)^{-1} H^T T
2 Kernel based (if h(x) is unknown):
f(x) = [K(x, x_1), …, K(x, x_N)] (I/C + Ω_ELM)^{-1} T
where (Ω_ELM)_{i,j} = h(x_i) · h(x_j) = K(x_i, x_j)

LS-SVM (for binary class case)
Solve the linear system
[ 0        t^T          ] [ b ]   [ 0 ]
[ t   I/C + Ω_LS-SVM    ] [ α ] = [ 1 ]
where Ω_LS-SVM = Z Z^T and Z = [t_1 φ(x_1)^T; …; t_N φ(x_N)^T].

For details, refer to: G.-B. Huang, H. Zhou, X. Ding and R. Zhang, "Extreme Learning Machine for Regression and Multi-Class Classification", submitted to IEEE Transactions on Pattern Analysis and Machine Intelligence, October 2010.
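For contrast with the unconstrained ELM solve, the LS-SVM linear system above can also be solved directly. Again a sketch with an assumed RBF kernel and hypothetical function names, for binary labels t_i ∈ {−1, +1}.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    sq = np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=-1)
    return np.exp(-gamma * sq)

def lssvm_train(X, t, C, gamma):
    # Solve [[0, t^T], [t, I/C + Omega]] [b; alpha] = [0; 1],
    # where Omega = Z Z^T, i.e. Omega[i, j] = t_i t_j K(x_i, x_j).
    N = X.shape[0]
    Omega = np.outer(t, t) * rbf_kernel(X, X, gamma)
    A = np.zeros((N + 1, N + 1))
    A[0, 1:] = t
    A[1:, 0] = t
    A[1:, 1:] = np.eye(N) / C + Omega
    rhs = np.concatenate(([0.0], np.ones(N)))
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]          # bias b, dual variables alpha

def lssvm_predict(Xq, X, t, b, alpha, gamma):
    # Decision function: sign(sum_i alpha_i t_i K(x, x_i) + b)
    return np.sign(rbf_kernel(Xq, X, gamma) @ (alpha * t) + b)
```

Unlike the ELM kernel solve, the LS-SVM system carries the bias b and the equality constraint encoded in its first row and column, and the labels t enter the kernel matrix itself.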
