Transcript of "Models from Data: a Unifying Picture" (Johan Suykens)

Models from Data: a Unifying Picture
Johan Suykens
KU Leuven, ESAT-SCD/SISTA
Kasteelpark Arenberg 10, B-3001 Leuven (Heverlee), Belgium
Email: johan.suykens@esat.kuleuven.be
http://www.esat.kuleuven.be/scd/
Grand Challenges of Computational Intelligence, Nicosia, Cyprus, Sept. 14, 2012

Overview
• Models from data
• Examples
• Challenges for computational intelligence

High-quality predictive models are crucial
[figure: application examples — biomedical, bio-informatics, process industry, energy, brain-computer interfaces, traffic networks]

Classical Neural Networks
[figure: multilayer perceptron with inputs x_1, ..., x_n, weights w, hidden units h(·), bias b, output y]
Multilayer Perceptron (MLP) properties:
- Universal approximation
- Learning from input-output patterns: off-line & on-line
- Parallel network architecture, multiple inputs and outputs
+ Flexible and widely applicable: feedforward & recurrent networks, supervised & unsupervised learning
- Many local minima, trial and error for number of neurons

Support Vector Machines
[figure: cost function over the weights — many local minima for the MLP, a convex cost for the SVM]
• Nonlinear classification and function estimation by convex optimization
• Learning and generalization in high dimensional input spaces
• Use of kernels:
  - linear, polynomial, RBF, MLP, splines, kernels from graphical models, ...
  - application-specific kernels: e.g. bioinformatics, textmining
[Vapnik, 1995; Schölkopf & Smola, 2002; Shawe-Taylor & Cristianini, 2004]

Kernel-based models: different views
[figure: SVM, LS-SVM, Kriging, RKHS and Gaussian Processes as connected views]
Some early history on RKHS: 1910-1920: Moore; 1940: Aronszajn; 1951: Krige; 1970: Parzen; 1971: Kimeldorf & Wahba
Complementary insights from different perspectives: kernels are used in different methodologies
- Support vector machines (SVM): optimization approach (primal/dual)
- Reproducing kernel Hilbert spaces (RKHS): variational problem, functional analysis
- Gaussian processes (GP): probabilistic/Bayesian approach

SVMs: living in two worlds ...
Primal space (parametric): ŷ = sign[w^T ϕ(x) + b], working with the feature map ϕ(x) = [ϕ_1(x); ...; ϕ_{n_h}(x)] in the feature space
"Kernel trick": K(x_i, x_j) = ϕ(x_i)^T ϕ(x_j)
Dual space (nonparametric): ŷ = sign[Σ_{i=1}^{#sv} α_i y_i K(x, x_i) + b], working with the kernel evaluations K(x, x_1), ..., K(x, x_#sv) in the input space

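For concreteness, a minimal illustration (not from the slides) of the kernel trick for a degree-2 polynomial kernel in R^2, where the feature map can be written out explicitly:

```python
# Minimal sketch (illustration only): the "kernel trick" for the degree-2
# polynomial kernel K(x, z) = (x^T z)^2 in R^2, whose explicit feature map is
# phi(x) = [x1^2, sqrt(2) x1 x2, x2^2].
import numpy as np

def phi(x):
    return np.array([x[0]**2, np.sqrt(2) * x[0] * x[1], x[1]**2])

x, z = np.array([1.0, 2.0]), np.array([3.0, -1.0])
print(np.isclose(phi(x) @ phi(z), (x @ z) ** 2))   # True: same inner product
```
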
Fitting models to data: alternative views
- Consider model ŷ = f(x; w), given input/output data {(x_i, y_i)}_{i=1}^N:
    min_w  w^T w + γ Σ_{i=1}^N (y_i − f(x_i; w))^2
- Rewrite the problem as
    min_{w,e}  w^T w + γ Σ_{i=1}^N e_i^2
    subject to  e_i = y_i − f(x_i; w),  i = 1, ..., N
- Construct the Lagrangian and take the conditions for optimality
- Express the solution and the model in terms of the Lagrange multipliers

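The Lagrangian step is only named here; a hedged worked sketch for the linear case f(x; w) = w^T x, using the common 1/2 scaling of the objective (which only rescales the multipliers, not the resulting model):

```latex
% Worked sketch (assumption: linear model f(x;w) = w^T x, objective scaled by 1/2)
\mathcal{L}(w, e; \alpha) = \tfrac{1}{2} w^{T} w
  + \tfrac{\gamma}{2} \sum_{i=1}^{N} e_i^{2}
  + \sum_{i=1}^{N} \alpha_i \left( y_i - w^{T} x_i - e_i \right)
% Conditions for optimality:
\frac{\partial \mathcal{L}}{\partial w} = 0 \Rightarrow w = \sum_{i} \alpha_i x_i, \quad
\frac{\partial \mathcal{L}}{\partial e_i} = 0 \Rightarrow \alpha_i = \gamma e_i, \quad
\frac{\partial \mathcal{L}}{\partial \alpha_i} = 0 \Rightarrow e_i = y_i - w^{T} x_i
% Eliminating w and e gives a linear system in the multipliers, with \Omega_{ij} = x_i^T x_j:
\left( \Omega + I/\gamma \right) \alpha = y, \qquad
\hat{y}(x) = \sum_{i} \alpha_i \, x_i^{T} x
```
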
Linear model: solving in primal or dual?
inputs x ∈ R^d, output y ∈ R, training set {(x_i, y_i)}_{i=1}^N
Model:
  (P): ŷ = w^T x + b,               w ∈ R^d
  (D): ŷ = Σ_i α_i x_i^T x + b,     α ∈ R^N

Linear model: solving in primal or dual?
few inputs, many data points (e.g. 20 × 1.000.000):
  primal: w ∈ R^20
  dual:   α ∈ R^1.000.000 (kernel matrix: 1.000.000 × 1.000.000)

Linear model: solving in primal or dual?
many inputs, few data points (e.g. 10.000 × 50):
  primal: w ∈ R^10.000
  dual:   α ∈ R^50 (kernel matrix: 50 × 50)

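A minimal numerical sketch (assumed code, bias term omitted for brevity) of the same choice for ridge regression: the primal solve is a d × d system, the dual solve an N × N system, and both give identical predictions:

```python
# Minimal sketch (assumption, not from the slides): ridge regression solved in the
# primal (d x d system) or in the dual (N x N system); both give the same model.
import numpy as np

def ridge_primal(X, y, gamma):
    # X: (N, d). Solve (X^T X + I/gamma) w = X^T y  ->  w in R^d
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + np.eye(d) / gamma, X.T @ y)

def ridge_dual(X, y, gamma):
    # Solve (X X^T + I/gamma) alpha = y  ->  alpha in R^N, with w = X^T alpha
    N = X.shape[0]
    return np.linalg.solve(X @ X.T + np.eye(N) / gamma, y)

rng = np.random.default_rng(0)
X, y, gamma = rng.standard_normal((50, 5)), rng.standard_normal(50), 10.0
w = ridge_primal(X, y, gamma)
alpha = ridge_dual(X, y, gamma)
print(np.allclose(X @ w, X @ (X.T @ alpha)))   # True: identical fitted values
```
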
Least Squares Support Vector Machines: "core models"
• Regression:
    min_{w,b,e}  w^T w + γ Σ_i e_i^2   s.t.  y_i = w^T ϕ(x_i) + b + e_i, ∀i
• Classification:
    min_{w,b,e}  w^T w + γ Σ_i e_i^2   s.t.  y_i (w^T ϕ(x_i) + b) = 1 − e_i, ∀i
• Kernel PCA (V = I), kernel spectral clustering (V = D^{-1}):
    min_{w,b,e}  −w^T w + γ Σ_i v_i e_i^2   s.t.  e_i = w^T ϕ(x_i) + b, ∀i
• Kernel canonical correlation analysis / partial least squares:
    min_{w,v,b,d,e,r}  w^T w + v^T v + ν Σ_i (e_i − r_i)^2   s.t.  e_i = w^T ϕ_1(x_i) + b,  r_i = v^T ϕ_2(y_i) + d
[Suykens & Vandewalle, 1999; Suykens et al., 2002; Alzate & Suykens, 2010]

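A minimal sketch (assumed implementation, not the authors' code) of how the regression core model is typically solved in the dual with an RBF kernel: the optimality conditions lead to one linear system in (b, α), and the model is ŷ(x) = Σ_i α_i K(x, x_i) + b. The standard 1/2-scaled objective is used, which only rescales the multipliers:

```python
# Minimal sketch (assumption): LS-SVM regression trained in the dual.
# With Omega_ij = K(x_i, x_j), the optimality conditions give the linear system
#   [ 0      1^T           ] [ b     ]   [ 0 ]
#   [ 1   Omega + I/gamma  ] [ alpha ] = [ y ]
# and the model  y_hat(x) = sum_i alpha_i K(x, x_i) + b.
import numpy as np

def rbf_kernel(A, B, sigma):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma**2))

def lssvm_train(X, y, gamma, sigma):
    N = len(y)
    Omega = rbf_kernel(X, X, sigma)
    A = np.zeros((N + 1, N + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = Omega + np.eye(N) / gamma
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    return sol[1:], sol[0]          # alpha, b

def lssvm_predict(X_new, X, alpha, b, sigma):
    return rbf_kernel(X_new, X, sigma) @ alpha + b
```
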
Core models
[diagram: a core model, together with regularization terms and additional constraints, leads to the optimal model representation and the model estimate; instances of core models include the parametric model, support vector machine, least-squares support vector machine, and Parzen kernel model]

Overview
• Models from data
• Examples
• Challenges for computational intelligence

Kernel spectral clustering
• Underlying model: ê_* = w^T ϕ(x_*), with q̂_* = sign[ê_*] the estimated cluster indicator at any x_* ∈ R^d
• Primal problem: training on given data {x_i}_{i=1}^N
    min_{w,e}  −(1/2) w^T w + (γ/2) Σ_{i=1}^N v_i e_i^2   subject to  e_i = w^T ϕ(x_i), i = 1, ..., N
  with weights v_i (related to inverse degree matrix: V = D^{-1})
  Dual problem: Ω α = λ D α, with Ω_ij = K(x_i, x_j) = ϕ(x_i)^T ϕ(x_j)
• Kernel spectral clustering [Alzate & Suykens, IEEE-PAMI, 2010], related to spectral clustering [Fiedler, 1973; Shi & Malik, 2000; Ng et al., 2002; Chung, 1997]

Primal and dual model representations
bias term b_l for centering; k clusters, k − 1 sets of constraints (index l = 1, ..., k − 1)
  (P): sign[ê_*^(l)] = sign[w^(l)T ϕ(x_*) + b_l]
  (D): sign[ê_*^(l)] = sign[Σ_j α_j^(l) K(x_*, x_j) + b_l]
Advantages:
- out-of-sample extensions
- model selection
- solving large scale problems

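A simplified sketch (an assumption, not the authors' code) of the two-cluster case: solve the dual eigenvalue problem Ω α = λ D α on training data and assign clusters by the sign of the score variable; the same dual expression provides the out-of-sample extension. The bias/centering term and the multi-cluster encoding of the full method are omitted here:

```python
# Minimal sketch (assumption): two-cluster kernel spectral clustering without the
# bias/centering term, i.e. the dual problem Omega alpha = lambda D alpha, with
# out-of-sample assignment q(x) = sign( sum_j alpha_j K(x, x_j) ).
import numpy as np

def rbf_kernel(A, B, sigma):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma**2))

def ksc_train(X, sigma):
    Omega = rbf_kernel(X, X, sigma)
    d = Omega.sum(axis=1)                           # degrees (diagonal of D)
    vals, vecs = np.linalg.eig(Omega / d[:, None])  # eigenvectors of D^{-1} Omega
    order = np.argsort(-vals.real)
    return vecs[:, order[1]].real                   # first non-trivial eigenvector alpha

def ksc_assign(X_new, X_train, alpha, sigma):
    e = rbf_kernel(X_new, X_train, sigma) @ alpha   # score variables e_*
    return np.sign(e)                               # estimated cluster indicators
```
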
Out-of-sample extension and coding
[figure: two-dimensional toy data in the (x(1), x(2)) plane — clusters found on training data and the out-of-sample cluster assignment over the plane]

Example: image segmentation
[figure: validation points in the space of score variables (e_{i,val}^(1), e_{i,val}^(2), e_{i,val}^(3)), showing the cluster structure]

Example: image segmentation
[figure: Fisher criterion versus number of clusters k, and the corresponding (α_{i,val}^(1), α_{i,val}^(2)) scatter plot]

Kernel spectral clustering: sparse kernel models
[figure: original image, binary clustering result, and the sparse kernel model result]
Incomplete Cholesky decomposition: ‖Ω − G G^T‖_2 ≤ η with G ∈ R^{N×R} and R ≪ N
Image (Berkeley image dataset): 321 × 481 (154,401 pixels), 175 SV
  ê_*^(l) = Σ_{i∈S_SV} α_i^(l) K(x_i, x_*) + b_l

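A minimal sketch (assumed code) of the pivoted incomplete Cholesky factorization used to obtain Ω ≈ G G^T with G ∈ R^{N×R}, built from only R columns of the kernel matrix; the trace of the residual diagonal is used as a stopping criterion, which upper-bounds the 2-norm error for a PSD residual:

```python
# Minimal sketch (assumption): pivoted incomplete Cholesky of an RBF kernel matrix,
# Omega ~= G G^T with rank R << N, computed column by column from the data.
import numpy as np

def rbf_column(xi, X, sigma):
    return np.exp(-((X - xi) ** 2).sum(axis=1) / (2.0 * sigma**2))

def incomplete_cholesky(X, sigma, eta, max_rank):
    N = X.shape[0]
    G = np.zeros((N, max_rank))
    diag = np.ones(N)                        # residual diagonal (RBF diagonal is 1)
    for r in range(max_rank):
        if diag.sum() <= eta:                # trace of residual bounds the 2-norm error
            return G[:, :r]
        p = int(np.argmax(diag))             # pivot: largest remaining diagonal entry
        col = rbf_column(X[p], X, sigma) - G[:, :r] @ G[p, :r]
        G[:, r] = col / np.sqrt(diag[p])
        diag -= G[:, r] ** 2
    return G
```
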
Highly sparse kernel models: image segmentation
[figure: support vectors in the (e_i^(1), e_i^(2), e_i^(3)) score space, before and after sparsification]
only 3k = 12 support vectors [Alzate & Suykens, Neurocomputing, 2011]

Kernel spectral clustering: adding prior knowledge
• Pair of points x†, x‡: c = 1 must-link, c = −1 cannot-link
• Primal problem [Alzate & Suykens, IJCNN 2009]:
    min_{w^(l), e^(l), b_l}  −(1/2) Σ_{l=1}^{k−1} w^(l)T w^(l) + (1/2) Σ_{l=1}^{k−1} γ_l e^(l)T D^{−1} e^(l)
    subject to  e^(l) = Φ_{N×n_h} w^(l) + b_l 1_N,        l = 1, ..., k−1
                w^(l)T ϕ(x†) = c w^(l)T ϕ(x‡),            l = 1, ..., k−1
• Dual problem: yields a rank-one downdate of the kernel matrix

Kernel spectral clustering: example
[figure: original image and segmentation without constraints]

Kernel spectral clustering: example
[figure: original image and segmentation with constraints]

Hierarchical kernel spectral clustering
- looking at different scales
- use of model selection and validation data
[Alzate & Suykens, Neural Networks, 2012]

Power grid: kernel spectral clustering of time-series
[figure: normalized load versus hour of the day for three of the detected clusters]
Electricity load: 245 substations in Belgian grid (1/2 train, 1/2 validation)
x_i ∈ R^43.824: spectral clustering on high dimensional data (5 years)
3 of 7 detected clusters:
- 1: Residential profile: morning and evening peaks
- 2: Business profile: peaked around noon
- 3: Industrial profile: increasing morning, oscillating afternoon and evening
[Alzate, Espinoza, De Moor, Suykens, 2009]

Dimensionality reduction and data visualization
• Traditionally: commonly used techniques are e.g. principal component analysis (PCA), multi-dimensional scaling (MDS), self-organizing maps (SOM)
• More recently: isomap, locally linear embedding (LLE), Hessian locally linear embedding, diffusion maps, Laplacian eigenmaps ("kernel eigenmap methods and manifold learning") [Roweis & Saul, 2000; Coifman et al., 2005; Belkin et al., 2006]
• Kernel maps with reference point [Suykens, IEEE-TNN 2008]: data visualization and dimensionality reduction by solving a linear system

Kernel maps with reference point: formulation
• Kernel maps with reference point [Suykens, IEEE-TNN 2008]:
  - LS-SVM core part: realize dimensionality reduction x → z
  - Regularization term: (z − P_D z)^T (z − P_D z) = Σ_{i=1}^N ‖z_i − Σ_{j=1}^N s_ij D z_j‖_2^2
    with D a diagonal matrix and s_ij = exp(−‖x_i − x_j‖_2^2 / σ^2)
  - reference point q (e.g. first point; sacrificed in the visualization)
• Example: d = 2
    min_{z, w_1, w_2, b_1, b_2, e_{i,1}, e_{i,2}}  (1/2) (z − P_D z)^T (z − P_D z) + (ν/2) (w_1^T w_1 + w_2^T w_2) + (η/2) Σ_{i=1}^N (e_{i,1}^2 + e_{i,2}^2)
    such that  c_{1,1}^T z = q_1 + e_{1,1}
               c_{1,2}^T z = q_2 + e_{1,2}
               c_{i,1}^T z = w_1^T ϕ_1(x_i) + b_1 + e_{i,1},  ∀i = 2, ..., N
               c_{i,2}^T z = w_2^T ϕ_2(x_i) + b_2 + e_{i,2},  ∀i = 2, ..., N
  Coordinates in low dimensional space: z = [z_1; z_2; ...; z_N] ∈ R^{dN}

Kernel maps: spiral example
[figure: 3D spiral data and the 2D embeddings ẑ for reference points q = [+1; −1] and q = [−1; −1]; training data (blue *), validation data (magenta o), test data (red +)]
Model selection: min Σ_{i,j} ( ẑ_i^T ẑ_j / (‖ẑ_i‖_2 ‖ẑ_j‖_2) − x_i^T x_j / (‖x_i‖_2 ‖x_j‖_2) )^2

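A small sketch (assumed code) of this model selection criterion: compare all pairwise cosine similarities in the embedding ẑ with those in the input space:

```python
# Minimal sketch (assumption): model selection score comparing pairwise cosine
# similarities of the embedded points Z (rows z_i) with those of the inputs X.
import numpy as np

def cosine_similarity_mismatch(Z, X):
    Zn = Z / np.linalg.norm(Z, axis=1, keepdims=True)
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    return np.sum((Zn @ Zn.T - Xn @ Xn.T) ** 2)   # smaller is better
```
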
Kernel maps: visualizing gene distribution
[figure: 3D projection (z_1, z_2, z_3) of the gene distribution]
Alon colon cancer microarray data set: 3D projections
Dimension input space: 62; number of genes: 1500 (training: 500, validation: 500, test: 500)

Kernels & Tensors
Examples of tensorial data:
- neuroscience: EEG data (time samples × frequency × electrodes)
- computer vision: image (/video) compression/completion/... (pixel × illumination × expression × ···)
- web mining: analyzing user behavior (users × queries × webpages)
From vector x to matrix X to tensor X
- Naive kernel: K(X, Y) = exp(−(1/(2σ²)) ‖vec(X) − vec(Y)‖_2^2)
- Tensorial kernel exploiting structure: learning from few examples [Signoretto et al., Neural Networks, 2011 & IEEE-TSP, 2012]

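A minimal sketch (assumed code) of the naive kernel, which discards the tensor structure by comparing vectorized tensors with an RBF kernel:

```python
# Minimal sketch (assumption): the "naive" kernel that ignores tensor structure
# by applying an RBF kernel to the vectorized tensors.
import numpy as np

def naive_tensor_kernel(X, Y, sigma):
    diff = X.ravel() - Y.ravel()                 # vec(X) - vec(Y)
    return np.exp(-(diff @ diff) / (2.0 * sigma**2))

# e.g. two small third-order tensors (time samples x frequency x electrodes)
K = naive_tensor_kernel(np.random.rand(4, 3, 2), np.random.rand(4, 3, 2), sigma=1.0)
```
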
Tensor completion
Mass spectral imaging: sagittal section of a mouse brain [data: E. Waelkens, R. Van de Plas]
Tensor completion using nuclear norm regularization [Signoretto et al., IEEE-SPL, 2011]

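Nuclear norm regularization for completion can be illustrated, in a hedged and much simplified matrix form (not the authors' algorithm), by iterative soft-thresholding of singular values while keeping the observed entries fixed; the tensor case on the slide generalizes this idea to the unfoldings of a tensor:

```python
# Minimal sketch (assumption): soft-impute style matrix completion, where the
# nuclear norm penalty acts through soft-thresholding of the singular values.
import numpy as np

def svt_complete(M, mask, tau=1.0, n_iter=200):
    X = np.where(mask, M, 0.0)                               # start from observed entries
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        X_low = U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt   # soft-threshold singular values
        X = np.where(mask, M, X_low)                         # keep the observed entries fixed
    return X
```
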
Overview
• Models from data
• Examples
• Challenges for computational intelligence

Challenges for Computational Intelligence
- Bridging gaps between advanced methods and end-users
- New mathematical and methodological frameworks
- Scalable algorithms towards large and high-dimensional data

Acknowledgements
• Colleagues at ESAT-SCD (especially research units: systems, models, control - biomedical data processing - bioinformatics): C. Alzate, A. Argyriou, J. De Brabanter, K. De Brabanter, B. De Moor, M. Espinoza, T. Falck, D. Geebelen, X. Huang, V. Jumutc, P. Karsmakers, R. Langone, J. Lopez, J. Luts, R. Mall, S. Mehrkanoon, Y. Moreau, K. Pelckmans, J. Puertas, L. Shi, M. Signoretto, V. Van Belle, R. Van de Plas, S. Van Huffel, J. Vandewalle, C. Varon, S. Yu, and others
• Many people for joint work, discussions, invitations, organizations
• Support from ERC AdG A-DATADRIVE-B, KU Leuven, GOA-MaNet, COE Optimization in Engineering OPTEC, IUAP DYSCO, FWO projects, IWT, IBBT eHealth, COST