9.520
  CBCL                                                   MIT




               Statistical Learning Theory and
                         Applications



                               Sayan Mukherjee and Ryan Rifkin
                               + Tomaso Poggio

                               + Alex Rakhlin

9.520, spring 2003
Learning: Brains and Machines
  CBCL                                                  MIT


                              Learning is the gateway to
                              understanding the brain and to
                              making intelligent machines.


                              Problem of learning:
                              a focus for
                                o modern math
                                o computer algorithms
                                o neuroscience




9.520, spring 2003
Multidisciplinary Approach to Learning
  CBCL                                                                                MIT
                        Learning theory          $\min_{f \in H} \frac{1}{l}\sum_{i=1}^{l} V(y_i, f(x_i)) + \mu \|f\|_K^2$
                         + algorithms




                                           • Information extraction (text,Web…)
                                           • Computer vision and graphics
                        ENGINEERING        • Man-machine interfaces
                        APPLICATIONS       • Bioinformatics (DNA arrays)
                                           • Artificial Markets (society of
                                           learning agents)




                        Computational       Learning to recognize objects in
                        Neuroscience:       visual cortex
                      models+experiments

9.520, spring 2003
Class
  CBCL                                               MIT




Rules of the game: problem sets (2 + 1 optional)
                   final project
                   grading
                   participation!

Web site:
www.ai.mit.edu/projects/cbcl/courses/course9.520/index.html




9.520, spring 2003
Overview of overview
  CBCL                                                   MIT




        o Supervised learning: the problem and how to frame
           it within classical math
        o Examples of in-house applications
        o Learning and the brain




9.520, spring 2003
Learning from Examples
  CBCL                                                MIT



      INPUT
                                   f             OUTPUT




                     EXAMPLES


                     INPUT1            OUTPUT1

                     INPUT2            OUTPUT2
                     ...........
                     INPUTn            OUTPUTn

9.520, spring 2003
Learning from Examples: formal setting
  CBCL                                                             MIT


         Given a set of l examples (past data)

                     $\{(x_1, y_1), (x_2, y_2), \ldots, (x_l, y_l)\}$
         Question: find function f such that
                                     $f(x) = \hat{y}$
         is a good predictor of y for a future input x


9.520, spring 2003
Classical equivalent view: supervised learning as
             problem of multivariate function approximation
  CBCL                                                                           MIT




                     [Figure: data points sampled from f, the true function f, and its
                      approximation, plotted as y against x]

           Generalization:                   estimating the value of the function where
                                             there are no data

             Regression:                     the function is real valued

             Classification:                 the function is binary
                                             The problem is ill-posed!
9.520, spring 2003
Well-posed problems
  CBCL                                                             MIT




      A problem is well-posed if its solution
      • exists                                     J. S. Hadamard, 1865-1963


      • is unique
      • is stable, i.e. depends continuously on the data



        Inverse problems (tomography, radar, scattering…)
                         are typically ill-posed
9.520, spring 2003
Two key requirements to solve the problem of
              learning from examples:
  CBCL                                         MIT
          well-posedness and consistency.
A standard way to learn from examples is empirical risk minimization (ERM),
i.e. minimizing the empirical error $\frac{1}{l}\sum_{i=1}^{l} V(y_i, f(x_i))$ over a hypothesis space H.

The problem is in general ill-posed and does not have a
predictive (or consistent) solution. By choosing an appropriate
hypothesis space H it can be made well-posed. Independently,
it is also known that an appropriate hypothesis space can
guarantee consistency.
We have recently proved a surprising theorem: consistency
and well-posedness are equivalent, i.e. one implies the other.


   Thus a stable solution is also predictive and vice versa.
9.520, spring 2003
                                           Mukherjee, Niyogi, Poggio, 2002
A simple, “magic” algorithm ensures consistency
                 and well-posedness
  CBCL                                                                    MIT


              $\min_{f \in H} \frac{1}{l}\sum_{i=1}^{l} V(f(x_i) - y_i) + \lambda \|f\|_K^2$

              implies

                       $f(x) = \sum_{i=1}^{l} \alpha_i K(x, x_i) + p(x)$


The equation includes Regularization Networks (e.g. Radial Basis
Functions) and Support Vector Machines.



9.520, spring 2003
                       For a review, see Poggio and Smale, Notices of the AMS, 2003
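
To make the "magic" algorithm concrete, the sketch below implements its square-loss instance (regularized least squares) with a Gaussian kernel: with V(f(x_i) - y_i) = (f(x_i) - y_i)^2, the coefficients α of the kernel expansion solve a linear system. This is a minimal illustration on toy data with hypothetical function names; the term p(x) is omitted for simplicity, and it is not the course's reference implementation.

```python
import numpy as np

def gaussian_kernel(X1, X2, sigma=1.0):
    """K(x, x') = exp(-||x - x'||^2 / (2 sigma^2))."""
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def train_rls(X, y, lam=0.1, sigma=1.0):
    """Square loss: minimizing (1/l) sum (f(x_i) - y_i)^2 + lam ||f||_K^2
    over the RKHS gives (K + lam * l * I) alpha = y for the coefficients
    of f(x) = sum_i alpha_i K(x, x_i)."""
    l = len(y)
    K = gaussian_kernel(X, X, sigma)
    return np.linalg.solve(K + lam * l * np.eye(l), y)

def predict(X_train, alpha, X_new, sigma=1.0):
    """Evaluate the kernel expansion at new points."""
    return gaussian_kernel(X_new, X_train, sigma) @ alpha

# Toy usage: noisy samples of a sine function.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(30, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(30)
alpha = train_rls(X, y, lam=0.01, sigma=0.5)
print(predict(X, alpha, np.array([[0.0]]), sigma=0.5))
```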
Classical framework but with more general
                 loss function
  CBCL                                                                 MIT


 The algorithm uses a quite general space of functions, or “hypotheses”: RKHSs.
An extension of the classical framework can provide a better measure
                       of “loss” (for instance for classification)…


                         $\min_{f \in H} \frac{1}{l}\sum_{i=1}^{l} V(f(x_i) - y_i) + \lambda \|f\|_K^2$




9.520, spring 2003                                Girosi, Caprile, Poggio, 1990
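
For concreteness, three standard choices of the loss V used with this functional (standard in the regularization and SVM literature, not specific to this slide) are shown below.

```latex
% Standard loss functions used with the regularization functional above:
V\bigl(f(x), y\bigr) = \bigl(f(x) - y\bigr)^2
    % square loss: Regularization Networks / regularized least squares
V\bigl(f(x), y\bigr) = \bigl(\,|f(x) - y| - \varepsilon\,\bigr)_+
    % epsilon-insensitive loss: SVM regression
V\bigl(f(x), y\bigr) = \bigl(1 - y\,f(x)\bigr)_+
    % hinge loss: SVM classification (y = ±1)
```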
Formally…
  CBCL               MIT




9.520, spring 2003
Equivalence to networks
  CBCL                                                     MIT


Many different V lead to the same solution…

               $f(x) = \sum_{i=1}^{l} \alpha_i K(x, x_i) + b$

…and can be “written” as the same type of network:

               [Network diagram: the input x is compared with the stored examples x_1 … x_l
                by kernel units K(x, x_i), whose outputs are weighted by the coefficients α_i
                and summed (+) to give f(x)]


9.520, spring 2003
Unified framework: RN, SVMR and SVMC
  CBCL                                                            MIT



                     $\min_{f \in H} \frac{1}{l}\sum_{i=1}^{l} V(f(x_i) - y_i) + \lambda \|f\|_K^2$

The equation includes Regularization Networks (e.g. Radial Basis
Functions), Support Vector Machines (classification and
regression), and some multilayer perceptrons.




                              Review by Evgeniou, Pontil and Poggio
9.520, spring 2003            Advances in Computational Mathematics, 2000
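
As a hedged illustration of the unified framework (using off-the-shelf scikit-learn estimators as stand-ins, not the in-house code behind these slides), the same RBF kernel and the same data can be run through the square-loss instance (a Regularization Network, i.e. kernel regularized least squares) and the hinge-loss instance (an SVM classifier); only the loss V changes.

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge   # square loss -> Regularization Network / RLS
from sklearn.svm import SVC                    # hinge loss  -> Support Vector Machine

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 2))
y = np.sign(X[:, 0] + X[:, 1])                 # toy binary labels in {-1, +1}

# Same RBF kernel, different loss functions V:
rn = KernelRidge(kernel="rbf", gamma=1.0, alpha=0.1).fit(X, y)   # V = (f(x) - y)^2
svm = SVC(kernel="rbf", gamma=1.0, C=10.0).fit(X, y)             # V = (1 - y f(x))_+

x_new = np.array([[0.5, -0.2]])
print(np.sign(rn.predict(x_new)), svm.predict(x_new))
```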
The theoretical foundations of statistical learning
              are becoming part of mainstream mathematics
  CBCL                                                       MIT




9.520, spring 2003
Theory summary
    CBCL                                               MIT

In the course we will introduce

•   Stability (well-posedness)
•   Consistency (predictivity)
•   RKHS hypothesis spaces
•   Regularization techniques leading to RN and SVMs
•   Generalization bounds based on stability
•   Alternative bounds (VC and V_γ dimensions)

• Related topics, extensions beyond SVMs

• Applications
• A new key result

9.520, spring 2003
Overview of overview
  CBCL                                                    MIT




        o Supervised learning: real math
        o Examples of recent and ongoing in-house research on
           applications
        o Learning and the brain




9.520, spring 2003
Learning from Examples: engineering
                         applications
  CBCL                                                 MIT



              INPUT                           OUTPUT



                      Bioinformatics
                      Artificial Markets
                      Object categorization
                      Object identification
                      Image analysis
                      Graphics
                      Text Classification
                      …..
9.520, spring 2003
Bioinformatics application: predicting type of
                 cancer from DNA chip signals
  CBCL                                                             MIT
                           Learning from examples paradigm

                                         Prediction
          Statistical Learning                               Prediction
          Algorithm



    Examples
                                          New sample




9.520, spring 2003
Bioinformatics application: predicting type of
                    cancer from DNA chips
  CBCL                                                             MIT

New feature selection SVM:

Only 38 training examples, 7100 features

AML vs ALL: 40 genes 34/34 correct, 0 rejects.
            5 genes 31/31 correct, 3 rejects of which 1 is an error.




9.520, spring 2003
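
The exact feature-selection SVM behind these numbers is not reproduced here. As a rough, hypothetical sketch of the general recipe (rank genes by the weights of a linear SVM and recursively eliminate the least informative ones), with placeholder random data in the slide's dimensions:

```python
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.feature_selection import RFE

# Hypothetical expression matrix: 38 training samples x 7100 genes,
# labels 0 = ALL, 1 = AML (random placeholders stand in for the real chip data).
rng = np.random.default_rng(0)
X = rng.standard_normal((38, 7100))
y = rng.integers(0, 2, size=38)

# Recursive feature elimination with a linear SVM: rank genes by |weight|
# and keep a small subset (40, as on the slide).
selector = RFE(LinearSVC(C=1.0, dual=False, max_iter=5000),
               n_features_to_select=40, step=0.1)
selector.fit(X, y)
print(np.flatnonzero(selector.support_).shape[0], "genes kept")
```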
Learning from Examples: engineering
                         applications
  CBCL                                                 MIT



              INPUT                           OUTPUT



                      Bioinformatics
                      Artificial Markets
                      Object categorization
                      Object identification
                      Image analysis
                      Graphics
                      Text Classification
                      …..
9.520, spring 2003
Face identification: example
  CBCL                                                          MIT

                       An old view-based system: 15 views




                     Performance: 98% on 68 person database
                                                 Beymer, 1995

9.520, spring 2003
Face identification
  CBCL                                                                       MIT




           [Pipeline: training data and a new face image pass through feature extraction
            and an SVM classifier, which outputs the identification result (e.g. person "A")]



9.520, spring 2003                                   Bernd Heisele, Jennifer Huang, 2002
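
A minimal sketch of the identification pipeline on this slide: feature extraction followed by a multiclass SVM over person identities. The feature extractor, gallery data, and parameters below are placeholders, not the actual components of the Heisele-Huang system.

```python
import numpy as np
from sklearn.svm import SVC

def extract_features(image):
    """Placeholder for the feature-extraction stage (e.g. normalized gray
    values of a cropped face or component responses); here just a flatten."""
    return np.asarray(image, dtype=float).ravel()

# Hypothetical gallery: a few images per person, labels are person IDs.
rng = np.random.default_rng(0)
gallery = rng.random((30, 40, 40))            # 30 face crops, 40x40 pixels
person_ids = np.repeat(np.arange(10), 3)      # 10 people, 3 images each

X = np.stack([extract_features(im) for im in gallery])
clf = SVC(kernel="rbf", gamma="scale", C=10.0).fit(X, person_ids)

new_face = rng.random((40, 40))
print("identified as person", clf.predict(extract_features(new_face)[None, :])[0])
```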
Real time detection and identification
  CBCL                                                                        MIT

    • Ported to Oxygen’s H21 (handheld device)
    • Identification of rotated faces up to about 45º
    • Robust against changes in illumination and background
    • Frame rate of 15 Hz
                                                       Ho, Heisele, Poggio, 2000




                                                        Weinstein, Ho, Heisele, Poggio,
9.520, spring 2003                                      Steele, Agarwal, 2002
Learning from Examples: engineering
                         applications
  CBCL                                                 MIT



              INPUT                           OUTPUT



                      Bioinformatics
                      Artificial Markets
                      Object categorization
                      Object identification
                      Image analysis
                      Graphics
                      Text Classification
                      …..
9.520, spring 2003
Learning Object Detection:
                      Finding Frontal Faces ...
  CBCL                                                                MIT




                                 Training Database
                                 1000+ Real, 3000+ VIRTUAL
                                 50,000+ Non-Face Patterns
                                                  Sung, Poggio 1995
9.520, spring 2003
Recent work on face detection
  CBCL                                                        MIT
         • Detection of faces
         in images

         • Robustness against
         slight rotations in
         depth and image
         plane

         • Full face vs.
         component-based
         classifier



9.520, spring 2003                         Heisele, Pontil, Poggio, 2000
The best existing system for face detection?
  CBCL                                                             MIT
                                             Test: CBCL set / 1,154 real faces / 14,579 non-faces / 132,365 shifted windows
                                                      Training: 2,457 synthetic faces (58x58) / 13,654 non-faces

                          [ROC curves: correct detection rate vs. false positives per number of
                           windows, comparing the component classifier with the whole face classifier]



9.520, spring 2003                                                                                      Heisele, Serre, Poggio et al., 2000
Trainable System for Object Detection:
                 Pedestrian detection - Results
  CBCL                                                    MIT




9.520, spring 2003                      Papageorgiou and Poggio, 1998
Trainable System for Object Detection:
               Pedestrian detection - Training
  CBCL                                                  MIT




                                   Papageorgiou and Poggio, 1998
9.520, spring 2003
The system was tested in a test car
                         (Mercedes)
  CBCL                                              MIT




9.520, spring 2003
CBCL               MIT




9.520, spring 2003
System installed in experimental Mercedes
  CBCL                                                           MIT




                     A fast version, integrated
                     with a real-time obstacle
                             detection system

                                        MPEG




9.520, spring 2003                                Constantine Papageorgiou
People classification/detection: training
                                    the system
  CBCL                                                                        MIT

                                       ...                                     ...


                     1848 patterns                     7189 patterns

             Representation: overcomplete dictionary of Haar wavelets; high
             dimensional feature space (>1300 features)




                                                Core learning algorithm:
                                                Support Vector Machine
                                                classifier




                         pedestrian detection system
9.520, spring 2003
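
A rough sketch of this training pipeline, with a crude stand-in for the overcomplete Haar wavelet dictionary (block-wise intensity differences at several scales) and a linear SVM. The data and the feature set are hypothetical and much simpler than the actual system.

```python
import numpy as np
from sklearn.svm import LinearSVC

def haar_like_features(patch, sizes=(4, 8, 16)):
    """Crude stand-in for the overcomplete Haar wavelet dictionary:
    differences of mean intensity between left/right and top/bottom halves
    of blocks at several scales and (overlapping) positions."""
    feats = []
    h, w = patch.shape
    for s in sizes:
        for i in range(0, h - s + 1, s // 2):
            for j in range(0, w - s + 1, s // 2):
                block = patch[i:i + s, j:j + s]
                feats.append(block[:, :s // 2].mean() - block[:, s // 2:].mean())  # vertical edge
                feats.append(block[:s // 2, :].mean() - block[s // 2:, :].mean())  # horizontal edge
    return np.array(feats)

# Hypothetical training set: pedestrian and background windows (random stand-ins).
rng = np.random.default_rng(0)
pos = rng.random((50, 64, 32))     # pedestrian windows
neg = rng.random((200, 64, 32))    # background windows
X = np.stack([haar_like_features(p) for p in np.concatenate([pos, neg])])
y = np.concatenate([np.ones(len(pos)), np.zeros(len(neg))])

clf = LinearSVC(C=1.0, dual=False).fit(X, y)
print("feature dimension:", X.shape[1])
```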
An improved approach: combining component
                                     detectors
  CBCL                                                                  MIT




9.520, spring 2003                             Mohan, Papageorgiou and Poggio, 1999
Results
  CBCL                          MIT




The system is capable of
detecting partially
occluded people




9.520, spring 2003
System Performance
  CBCL                                                         MIT




      Combination systems (ACC)
      perform best.
      All component-based
      systems perform better than
      full-body person detectors.




9.520, spring 2003                  A. Mohan, C. Papageorgiou, T. Poggio
Learning from Examples: Applications
  CBCL                                      MIT




         INPUT                          OUTPUT


         Object identification
         Object categorization
         Image analysis
         Graphics
         Finance
         Bioinformatics
         …
9.520, spring 2003
Image Analysis
  CBCL                                                     MIT
                IMAGE ANALYSIS: OBJECT RECOGNITION AND POSE
                ESTIMATION




                                      ⇒ Bear (0° view)




                                       ⇒ Bear (45° view)




9.520, spring 2003
Computer vision: analysis of facial expressions
  CBCL                                                                                                            MIT
                     [Figure: face image sequence and a plot of the estimated facial parameter values over the frames]
   The main goal is to estimate basic facial parameters, e.g.
 degree of mouth openness, through learning. One of the main
    applications is video-speech fusion to improve speech
                      recognition systems.
NTT, Atsugi, January 2002
 9.520, spring 2002                                                                                       Kumar, Poggio, 2001
Combining Top-Down Constraints and Bottom-Up Data
  CBCL                                                   MIT

               Morphable Model

          [face] = a1·[prototype 1] + a2·[prototype 2] + a3·[prototype 3] + …




                                       Learning Engine
                                       SVM, RBF, etc




9.520, spring 2002                                       Kumar, Poggio, 2001
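
The morphable-model equation on this slide (a face written as a linear combination of prototypes) can be sketched as a least-squares fit of the coefficients a_i. The prototypes and the novel face below are toy placeholders, and the actual morphable models also handle shape (correspondence) and texture, which this sketch ignores.

```python
import numpy as np

# Hypothetical prototype faces (flattened images) and a novel face:
# morphable-model idea: novel ≈ a1*proto1 + a2*proto2 + a3*proto3 + ...
rng = np.random.default_rng(0)
prototypes = rng.random((5, 40 * 40))              # 5 prototype faces
novel = 0.6 * prototypes[0] + 0.4 * prototypes[2]  # a face inside the prototype span

# Bottom-up data fit under the top-down constraint that the face lies
# in the space spanned by the prototypes: solve for the coefficients a.
a, *_ = np.linalg.lstsq(prototypes.T, novel, rcond=None)
reconstruction = prototypes.T @ a
print(np.round(a, 2))
```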
The Three Stages
  CBCL                                                                                                                MIT




                Face Detection  →  Localization of Facial Features  →  Analysis of Facial Parts

                                 [Figure: example result, with a plot of estimated facial parameter values]

                                     For more details, see Appendix 2
9.520, spring 2003
Learning from Examples: engineering
                                 applications
  CBCL                                                      MIT



              INPUT                                OUTPUT



                           Bioinformatics
                           Artificial Markets
                           Object categorization
                           Object identification
                           Image analysis
                           Graphics
                           Text Classification
                           …..
9.520, spring 2003
Image Synthesis
  CBCL                                         MIT

                     UNCONVENTIONAL GRAPHICS



                      Θ = 0° view ⇒




                     Θ = 45° view ⇒



9.520, spring 2003
Supermodels
  CBCL                                    MIT




                      Cc-cs.mpg




                           MPEG (Steve Lines)
9.520, spring 2003
A trainable system for TTVS
  CBCL                                             MIT




       Input: Text
       Output:
       photorealistic talking
       face uttering text




                                               Tony Ezzat
9.520, spring 2003
From 2D to a better 2D and from 2D to 3D
  CBCL                                                        MIT

Two extensions of our text-to-visual-speech
 (TTVS) system:
            • morphing of 3D models of faces to output a 3D model
              of a speaking face (Blanz)
            • Learn facial appearance, dynamics and coarticulation
              (Ezzat)




9.520, spring 2002
Towards 3D
  CBCL
                                                                                             MIT


                                    Neutral face shape was reconstructed from a single image,
                                    and animation transferred to the new face.

  3D face scans were collected.




                                                                             Click for animation

                                         Click for animation


                                                                             Blanz, Ezzat, Poggio
9.520, spring 2002
Using the same basic learning techniques: Trainable Videorealistic Face Animation
  CBCL                 (voice is real, video is synthetic)                  MIT




                                                  Ezzat, Geiger, Poggio, SigGraph 2002
9.520, spring 2002
Trainable Videorealistic Face Animation

  CBCL                                                                                           MIT

 1. Learning: the system learns from 4 mins of video the face appearance
    (Morphable Model) and the speech dynamics of the person.

 2. Run Time: for any speech input the system provides as output a synthetic video stream.
    Phone stream: /SIL/ /B/ /B/ /AE/ /AE/ /AE/ /JH/ /JH/ /JH/ /SIL/
    Trajectory synthesis (phonetic models $\{\mu_p, \Sigma_p\}$)
    MMM (image prototypes $\{I_i, C_i\}$)

 Tony Ezzat, Geiger, Poggio, SIGGRAPH 2002
9.520, spring 2002
Novel !!!: Trainable Videorealistic Face Animation
  CBCL                                                  MIT




                     Let us look at video!!!

                                               Ezzat, Poggio, 2002
9.520, spring 2002
Reconstructed 3D Face Models from 1 image
  CBCL                                           MIT




                             Blanz and Vetter,
                             MPI
                             SigGraph ‘99
9.520, spring 2003
Learning from Examples: Applications
  CBCL                                      MIT




         INPUT                          OUTPUT


         Object identification
         Object categorization
         Image analysis
         Graphics
         Finance
         Bioinformatics
         …
9.520, spring 2003
Artificial Agents – learning algorithms – buy and sell stocks
  CBCL                                                        MIT


        Example of a subproject: The Electronic Market Maker


     Public Limit/Market
     Orders
                                                     Bid/Ask Prices
     All available market                            Bid/Ask Sizes
     Information                                     Buy/Sell

     User                                            Inventory control
     control/calibration




                                          Learning Feedback Loop
9.520, spring 2003                                     Nicholas Chang
Learning from Examples: engineering
                         applications
  CBCL                                                 MIT



              INPUT                           OUTPUT



                      Bioinformatics
                      Artificial Markets
                      Object categorization
                      Object identification
                      Image analysis
                      Graphics
                      Text Classification
                      …..
9.520, spring 2003
Overview of overview
  CBCL                                                   MIT




        o Supervised learning: the problem and how to frame
           it within classical math
        o Examples of in-house applications
        o Learning and the brain




9.520, spring 2003
The Ventral Visual Stream: From V1 to IT
  CBCL                                                                              MIT




                                                             modified from Ungerleider and Haxby, 1994




                     Hubel & Wiesel, 1959
                                            Desimone, 1991
9.520, spring 2003
Summary of “basic facts”
  CBCL                                                           MIT
   Accumulated evidence points to three (mostly accepted)
   properties of the ventral visual stream architecture:

           • Hierarchical build-up of invariances (first to translation
             and scale, then to viewpoint etc.), of the size of the
             receptive fields, and of the complexity of the preferred stimuli

           • Basic feed-forward processing of information (for
             “immediate” recognition tasks)

           • Learning of an individual object generalizes to scale
             and position

9.520, spring 2003
The standard model: following Hubel and Wiesel…
  CBCL                                                                    MIT
                                                                          Learning
                                                                        (supervised)




                                                                         Learning
                                                                      (unsupervised)




9.520, spring 2003                     Riesenhuber & Poggio, Nature Neuroscience, 2000
The standard model
  CBCL                                                                    MIT
Interprets or predicts many existing data in microcircuits and system physiology,
and also in cognitive science:

• What some complex cells and V4 cells do and how: MAX… (a minimal pooling sketch follows this slide)

• View-tuning of IT cells (Logothetis)
• Response to pseudomirror views
• Effect of scrambling
• Multiple objects
• Robustness to clutter
• Consistent with K. Tanaka’s simplification procedure
• Categorization tasks (cats vs dogs)
• Invariance to translation, scale etc…

• Gender classification
• Face inversion effect: experience, viewpoint, other-race, configural
  vs. featural representation
• Transfer of generalization
• No binding problem, no need for oscillations
9.520, spring 2003
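
A minimal sketch of the MAX operation mentioned in the first bullet above: a complex-cell-like unit takes the maximum over simple-cell-like afferents tuned to the same feature at nearby positions, gaining position tolerance while keeping selectivity. This illustrates only the pooling step, not the full model.

```python
import numpy as np

def max_pool(simple_responses, pool=4):
    """Each complex-cell-like unit outputs the MAX over a local neighborhood
    of simple-cell-like responses (same preferred feature, nearby positions)."""
    h, w = simple_responses.shape
    out = np.empty((h // pool, w // pool))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = simple_responses[i * pool:(i + 1) * pool,
                                         j * pool:(j + 1) * pool].max()
    return out

# Toy map of simple-cell (S1-like) responses over image positions.
rng = np.random.default_rng(0)
s1 = rng.random((16, 16))
c1 = max_pool(s1, pool=4)    # 4x4 map of complex-cell (C1-like) responses
print(c1.shape)
```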
Model’s early predictions: neurons become view-tuned during recognition
  CBCL                                                                  MIT



                                                   Logothetis, Pauls, and Poggio,
                                                   1995;
                                                   Logothetis, Pauls, 1995




9.520, spring 2003                      Poggio, Edelman, Riesenhuber (1990, 2000)
Recording Sites in Anterior IT
  CBCL                                                                           MIT




             [Anatomical figure: recording sites in anterior IT, marked relative to the
              LUN, LAT, STS, IOS and AMTS sulci (Ho = 0)]

                                                        …neurons tuned to
                                                       faces are close by….




                                          Logothetis, Pauls, and Poggio, 1995;
                                          Logothetis, Pauls, 1995
9.520, spring 2003
The Cortex: Neurons Tuned to Object Views as
              predicted by model
  CBCL                                              MIT




9.520, spring 2003            Logothetis, Pauls, Poggio, 1995
View-tuned cells: scale invariance
                         (one training view only)!
  CBCL                                                                                                                                                                     MIT
                                         Scale Invariant Responses of an IT Neuron

              [Figure: spike-rate histograms (spikes/sec vs. time in msec) for stimulus sizes
               1.0, 1.75, 2.5, 3.25, 4.0, 4.75, 5.5 and 6.25 deg (scale factors ×0.4 to ×2.5)]
9.520, spring 2003
                                                                                   Logothetis, Pauls, and Poggio, 1995; Logothetis, Pauls, 1995
Joint Project (Max Riesenhuber) with Earl Miller & David
          Freedman (MIT): Neural Correlate of Categorization (NCC)
  CBCL                                                                       MIT

    Define categories in morph space

    [Morph space diagram: prototypes (100% Cat, 100% Dog) at the ends, with 80% and 60%
     Cat/Dog morphs in between, separated by the category boundary]

9.520, spring 2003
Categorization task
  CBCL                                                                            MIT

    Train monkey on categorization task


     [Trial structure: Fixation (500 ms) → Sample (600 ms) → Delay (1000 ms) → Test;
      if the test is a nonmatch, a further Delay is followed by a second Test (match)]

    After training, record from neurons in IT & PFC
9.520, spring 2003
Single cell example: a PFC neuron that responds
                                    more strongly to DOGS than CATS
  CBCL                                                                                       MIT
     [Figure: firing rate (Hz) vs. time from sample stimulus onset (ms), across the fixation,
      sample, delay and choice epochs, for Dog 100%/80%/60% and Cat 100%/80%/60% stimuli]

                                                                          D. Freedman, E. Miller, M. Riesenhuber, T. Poggio (Science, 2001)
9.520, spring 2003
The model suggests same computations for
             different recognition tasks (and objects!)
  CBCL                                                  MIT
                                  [HMAX model diagram: a general representation (IT) feeding
                                   task-specific units (e.g., PFC)]

                                              Is this what’s going on in cortex?
9.520, spring 2003

Artificial intelligence   cs607 handouts lecture 11 - 45Artificial intelligence   cs607 handouts lecture 11 - 45
Artificial intelligence cs607 handouts lecture 11 - 45
 
Dsys guide37
Dsys guide37Dsys guide37
Dsys guide37
 
A dhoc networks
A dhoc networksA dhoc networks
A dhoc networks
 
03 basic computer_network
03 basic computer_network03 basic computer_network
03 basic computer_network
 

Class01

  • 1. 9.520 CBCL MIT Statistical Learning Theory and Applications Sayan Mukherjee and Ryan Rifkin + tomaso poggio + Alex Rakhlin 9.520, spring 2003
  • 2. Learning: Brains and Machines CBCL MIT Learning is the gateway to understanding the brain and to making intelligent machines. Problem of learning: a focus for o modern math o computer algorithms o neuroscience 9.520, spring 2003
  • 3. Multidisciplinary Approach to Learning CBCL MIT 1 l  min  ∑ V ( yi , f ( xi )) + µ f 2  Learning theory K  i =1 f ∈H l  + algorithms • Information extraction (text,Web…) • Computer vision and graphics ENGINEERING • Man-machine interfaces APPLICATIONS • Bioinformatics (DNA arrays) • Artificial Markets (society of learning agents) Computational Learning to recognize objects in Neuroscience: visual cortex models+experiments 9.520, spring 2003
  • 4. Class CBCL MIT Rules of the game: problem sets (2 + 1 optional) final project grading participation! Web site: www.ai.mit.edu/projects/cbcl/courses/course9.520/index.html 9.520, spring 2003
  • 5. Overview of overview CBCL MIT o Supervised learning: the problem and how to frame it within classical math o Examples of in-house applications o Learning and the brain 9.520, spring 2003
  • 6. Learning from Examples CBCL MIT INPUT f OUTPUT EXAMPLES INPUT1 OUTPUT1 INPUT2 OUTPUT2 ........... INPUTn OUTPUTn 9.520, spring 2003
  • 7. Learning from Examples: formal setting CBCL MIT Given a set of l examples (past data) { ( x1 , y1 ), ( x2 , y2 ),...,( xl , yl )} Question: find function f such that f ( x) = y ˆ is a good predictor of y for a future input x 9.520, spring 2003
  • 8. Classical equivalent view: supervised learning as a problem of multivariate function approximation CBCL MIT [Figure: data sampled from f, the true function f, and an approximation of f, plotted as y against x.] Generalization: estimating the value of the function where there are no data. Regression: the function is real-valued. Classification: the function is binary. The problem is ill-posed! 9.520, spring 2003
  • 9. Well-posed problems CBCL MIT A problem is well-posed (J. S. Hadamard, 1865-1963) if its solution exists, is unique, and is stable, i.e. depends continuously on the data. Inverse problems (tomography, radar, scattering…) are typically ill-posed. 9.520, spring 2003
  • 10. Two key requirements to solve the problem of learning from examples: well-posedness and consistency. CBCL MIT A standard way to learn from examples is to minimize the empirical error over a hypothesis space H (empirical risk minimization). The problem is in general ill-posed and does not have a predictive (or consistent) solution. By choosing an appropriate hypothesis space H it can be made well-posed. Independently, it is also known that an appropriate hypothesis space can guarantee consistency. We have proved recently a surprising theorem: consistency and well-posedness are equivalent, i.e. one implies the other. Thus a stable solution is also predictive and vice versa. 9.520, spring 2003 Mukherjee, Niyogi, Poggio, 2002
  • 11. A simple, “magic” algorithm ensures consistency and well-posedness CBCL MIT $\min_{f \in H} \left[ \frac{1}{l} \sum_{i=1}^{l} V(f(x_i) - y_i) + \lambda \|f\|_K^2 \right]$, which implies a solution of the form $f(x) = \sum_{i=1}^{l} \alpha_i K(x, x_i) + p(x)$. The equation includes Regularization Networks (examples are Radial Basis Functions and Support Vector Machines). 9.520, spring 2003 For a review, see Poggio and Smale, Notices of the AMS, 2003
  • 12. Classical framework but with more general loss function CBCL MIT The algorithm uses a quite general space of functions or “hypotheses”: RKHSs. A more general loss function V than in the classical framework can provide a better measure of “loss” (for instance for classification)… $\min_{f \in H} \left[ \frac{1}{l} \sum_{i=1}^{l} V(f(x_i) - y_i) + \lambda \|f\|_K^2 \right]$ 9.520, spring 2003 Girosi, Caprile, Poggio, 1990
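A small numerical illustration of the stability point above (not from the course material): fitting ten noisy points by unregularized least squares in a rich hypothesis space, here a degree-9 polynomial chosen only for illustration, typically changes a lot when a single training value is perturbed slightly, while a ridge-regularized fit barely moves. All data and parameters below are arbitrary choices for the sketch.

import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 10)
y = np.sin(3 * x) + 0.1 * rng.standard_normal(10)
y_perturbed = y.copy()
y_perturbed[4] += 0.1                      # perturb a single training value slightly

def poly_fit(x, y, degree=9, lam=0.0):
    """Least squares in a polynomial hypothesis space, optionally with a ridge penalty lam."""
    A = np.vander(x, degree + 1)           # columns x^degree, ..., x, 1
    return np.linalg.solve(A.T @ A + lam * np.eye(degree + 1), A.T @ y)

xs = np.linspace(-1, 1, 200)
for lam in (0.0, 1e-2):                    # unregularized vs. regularized
    shift = np.max(np.abs(np.polyval(poly_fit(x, y, lam=lam), xs)
                          - np.polyval(poly_fit(x, y_perturbed, lam=lam), xs)))
    print(f"lam={lam}: max change of the fitted function = {shift:.3f}")

The unregularized fit is unstable in Hadamard's sense; restricting the hypothesis space or adding a regularization penalty is what repairs this, which is the point of the next slide.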
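To make the scheme on the slide above concrete for the square loss V(f(x_i) - y_i) = (f(x_i) - y_i)^2, the minimizer is a Regularization Network whose coefficients solve the linear system (K + lam*l*I) alpha = y. The sketch below is a minimal NumPy implementation under that assumption; the Gaussian kernel, the width sigma, the value of lam and the toy data are illustrative choices, not the course's code, and the optional bias/polynomial term p(x) is omitted.

import numpy as np

def gaussian_kernel(A, B, sigma=1.0):
    """Gram matrix K[i, j] = exp(-||a_i - b_j||^2 / (2 sigma^2))."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2.0 * sigma ** 2))

def fit_rn(X, y, lam=0.1, sigma=1.0):
    """Regularized least squares: alpha solves (K + lam * l * I) alpha = y."""
    l = len(y)
    K = gaussian_kernel(X, X, sigma)
    return np.linalg.solve(K + lam * l * np.eye(l), y)

def predict_rn(X_train, alpha, X_new, sigma=1.0):
    """f(x) = sum_i alpha_i K(x, x_i); the bias/polynomial term is omitted for simplicity."""
    return gaussian_kernel(X_new, X_train, sigma) @ alpha

# Toy 1-D regression example (illustrative data only).
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(30, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(30)
alpha = fit_rn(X, y, lam=0.01, sigma=0.5)
print(predict_rn(X, alpha, np.array([[0.0], [1.5]]), sigma=0.5))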
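For concreteness, here are three standard choices of the loss V written as small Python functions. The specific forms are textbook definitions (square loss; the epsilon-insensitive loss used by SVM regression; the hinge loss used by SVM classification with labels in {-1, +1}), not formulas taken verbatim from the slides.

import numpy as np

def square_loss(fx, y):
    """Classical least-squares loss."""
    return (fx - y) ** 2

def eps_insensitive(fx, y, eps=0.1):
    """Loss used by SVM regression: errors smaller than eps are ignored."""
    return np.maximum(np.abs(fx - y) - eps, 0)

def hinge_loss(fx, y):
    """Loss used by SVM classification, with y in {-1, +1}."""
    return np.maximum(1 - y * fx, 0)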
  • 13. Formally… CBCL MIT 9.520, spring 2003
  • 14. Equivalence to networks CBCL MIT Many different V lead to the same form of solution, $f(x) = \sum_{i=1}^{l} \alpha_i K(x, x_i) + b$, and can be “written” as the same type of network. [Diagram: a one-hidden-layer network with one kernel unit per example, whose weighted sum (weights $\alpha_i$) gives $f$.] 9.520, spring 2003
  • 15. Unified framework: RN, SVMR and SVMC CBCL MIT $\min_{f \in H} \left[ \frac{1}{l} \sum_{i=1}^{l} V(f(x_i) - y_i) + \lambda \|f\|_K^2 \right]$ The equation includes Regularization Networks, e.g. Radial Basis Functions, and Support Vector Machines (classification and regression) and some multilayer Perceptrons. Review by Evgeniou, Pontil and Poggio, Advances in Computational Mathematics, 2000 9.520, spring 2003
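A hedged, off-the-shelf illustration of this unification: the sketch below fits the same toy data with kernel ridge regression (square loss, i.e. a Regularization Network), SVM regression (epsilon-insensitive loss), and SVM classification (hinge loss), all with the same Gaussian/RBF kernel. It assumes scikit-learn is available; the data and hyperparameters are arbitrary.

import numpy as np
from sklearn.kernel_ridge import KernelRidge   # square loss -> Regularization Network
from sklearn.svm import SVR, SVC                # eps-insensitive and hinge losses

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(60, 1))
y_reg = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(60)   # regression targets
y_cls = np.where(np.sin(X[:, 0]) >= 0, 1, -1)              # binary labels in {-1, +1}

rn   = KernelRidge(alpha=0.01, kernel="rbf", gamma=2.0).fit(X, y_reg)
svmr = SVR(kernel="rbf", C=10.0, epsilon=0.05, gamma=2.0).fit(X, y_reg)
svmc = SVC(kernel="rbf", C=10.0, gamma=2.0).fit(X, y_cls)

X_test = np.linspace(-3, 3, 5).reshape(-1, 1)
print(rn.predict(X_test), svmr.predict(X_test), svmc.predict(X_test))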
  • 16. The theoretical foundations of statistical learning are becoming part of mainstream mathematics CBCL MIT 9.520, spring 2003
  • 17. Theory summary CBCL MIT In the course we will introduce • Stability (well-posedness) • Consistency (predictivity) • RKHSs as hypothesis spaces • Regularization techniques leading to RN and SVMs • Generalization bounds based on stability • Alternative bounds (VC and Vγ dimensions) • Related topics, extensions beyond SVMs • Applications • A new key result 9.520, spring 2003
  • 18. Overview of overview CBCL MIT o Supervised learning: real math o Examples of recent and ongoing in-house research on applications o Learning and the brain 9.520, spring 2003
  • 19. Learning from Examples: engineering applications CBCL MIT INPUT OUTPUT Bioinformatics Artificial Markets Object categorization Object identification Image analysis Graphics Text Classification ….. 9.520, spring 2003
  • 20. Bioinformatics application: predicting type of cancer from DNA chip signals CBCL MIT Learning-from-examples paradigm. [Diagram: examples → statistical learning algorithm → prediction; a new sample is fed to the trained predictor.] 9.520, spring 2003
  • 21. Bioinformatics application: predicting type of cancer from DNA chips CBCL MIT New feature selection SVM: only 38 training examples, 7,100 features. AML vs. ALL: with 40 genes, 34/34 correct, 0 rejects; with 5 genes, 31/31 correct, 3 rejects of which 1 is an error. 9.520, spring 2003
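The slide does not spell out the feature-selection procedure, so the sketch below is only one plausible stand-in: recursive feature elimination with a linear SVM, run here on a random matrix of the same shape (38 samples × 7,100 features) because the real expression data are not part of this document. Only the shapes and the idea of ranking genes by SVM weights are meant to carry over.

import numpy as np
from sklearn.svm import LinearSVC
from sklearn.feature_selection import RFE

# Synthetic stand-in for an expression matrix: 38 samples x 7100 genes, placeholder labels.
rng = np.random.default_rng(0)
X = rng.standard_normal((38, 7100))
y = rng.integers(0, 2, size=38)            # not real AML/ALL labels

# Rank genes by linear-SVM weights and keep the top 40 (one could repeat with 5).
selector = RFE(LinearSVC(C=1.0, dual=False, max_iter=5000),
               n_features_to_select=40, step=0.1)
selector.fit(X, y)
selected_genes = np.flatnonzero(selector.support_)
print(selected_genes[:10])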
  • 22. Learning from Examples: engineering applications CBCL MIT INPUT OUTPUT Bioinformatics Artificial Markets Object categorization Object identification Image analysis Graphics Text Classification ….. 9.520, spring 2003
  • 23. Face identification: example CBCL MIT An old view-based system: 15 views. Performance: 98% on a 68-person database. Beymer, 1995 9.520, spring 2003
  • 24. Face identification CBCL MIT [Diagram: training data → feature extraction → an SVM classifier; a new face image yields the identification result.] 9.520, spring 2003 Bernd Heisele, Jennifer Huang, 2002
  • 25. Real-time detection and identification CBCL MIT • Ported to Oxygen’s H21 (handheld device) • Identification of rotated faces up to about 45° • Robust against changes in illumination and background • Frame rate of 15 Hz Ho, Heisele, Poggio, 2000; Weinstein, Ho, Heisele, Poggio, Steele, Agarwal, 2002 9.520, spring 2003
  • 26. Learning from Examples: engineering applications CBCL MIT INPUT OUTPUT Bioinformatics Artificial Markets Object categorization Object identification Image analysis Graphics Text Classification ….. 9.520, spring 2003
  • 27. Learning Object Detection: Finding Frontal Faces ... CBCL MIT Training database: 1,000+ real and 3,000+ virtual face patterns, 50,000+ non-face patterns. Sung, Poggio 1995 9.520, spring 2003
  • 28. Recent work on face detection CBCL MIT • Detection of faces in images • Robustness against slight rotations in depth and image plane • Full face vs. component-based classifier 9.520, spring 2003 Heisele, Pontil, Poggio, 2000
  • 29. The best existing system for face detection? CBCL MIT Test: CBCL set / 1,154 real faces / 14,579 non-faces / 132,365 shifted windows. Training: 2,457 synthetic faces (58x58) / 13,654 non-faces. [ROC plot: fraction correct vs. false positives per window, comparing the component classifier with the whole-face classifier.] 9.520, spring 2003 Heisele, Serre, Poggio et al., 2000
  • 30. Trainable System for Object Detection: Pedestrian detection - Results CBCL MIT 9.520, spring 2003 Papageorgiou and Poggio, 1998
  • 31. Trainable System for Object Detection: Pedestrian detection - Training CBCL MIT Papageorgiou and Poggio, 1998 9.520, spring 2003
  • 32. The system was tested in a test car (Mercedes) CBCL MIT 9.520, spring 2003
  • 33. CBCL MIT 9.520, spring 2003
  • 34. System installed in experimental Mercedes CBCL MIT A fast version, integrated with a real-time obstacle detection system MPEG 9.520, spring 2003 Constantine Papageorgiou
  • 35. System installed in experimental Mercedes CBCL MIT A fast version, integrated with a real-time obstacle detection system MPEG 9.520, spring 2003 Constantine Papageorgiou
  • 36. People classification/detection: training the system CBCL MIT [Training set: 1,848 pedestrian patterns and 7,189 non-pedestrian patterns.] Representation: overcomplete dictionary of Haar wavelets; high-dimensional feature space (>1,300 features). Core learning algorithm: Support Vector Machine classifier → pedestrian detection system. 9.520, spring 2003
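A minimal sliding-window sketch of this kind of detector, with loud caveats: the window size, the crude cell-difference features standing in for the overcomplete Haar-wavelet dictionary, and the random training patches are all placeholders. Only the overall structure described above carries over: fixed-size window → wavelet-like features → SVM with a quadratic polynomial kernel → scan the image.

import numpy as np
from sklearn.svm import SVC

WINDOW = (128, 64)  # (height, width) of the detection window; an assumption, not the paper's exact size

def haar_features(patch, cell=8):
    """Very coarse stand-in for a Haar-wavelet dictionary: horizontal and vertical
    differences of mean intensities over a grid of cells."""
    h, w = patch.shape
    cells = patch[: h - h % cell, : w - w % cell] \
        .reshape(h // cell, cell, w // cell, cell).mean(axis=(1, 3))
    horiz = (cells[:, 1:] - cells[:, :-1]).ravel()
    vert = (cells[1:, :] - cells[:-1, :]).ravel()
    return np.concatenate([horiz, vert])

def detect(image, clf, stride=16):
    """Slide the fixed-size window over the image and report windows classified as pedestrian."""
    H, W = WINDOW
    hits = []
    for r in range(0, image.shape[0] - H + 1, stride):
        for c in range(0, image.shape[1] - W + 1, stride):
            x = haar_features(image[r:r + H, c:c + W])
            if clf.predict(x[None, :])[0] == 1:
                hits.append((r, c))
    return hits

# Illustrative training on random stand-in patches (the real system used 1,848 pedestrian
# and 7,189 non-pedestrian examples).
rng = np.random.default_rng(0)
pos = rng.random((20, *WINDOW))
neg = rng.random((20, *WINDOW))
X = np.array([haar_features(p) for p in np.concatenate([pos, neg])])
y = np.array([1] * 20 + [0] * 20)
clf = SVC(kernel="poly", degree=2, C=1.0).fit(X, y)
print(detect(rng.random((256, 256)), clf)[:5])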
  • 37. An improved approach: combining component detectors CBCL MIT 9.520, spring 2003 Mohan, Papageorgiou and Poggio, 1999
  • 38. Results CBCL MIT The system is capable of detecting partially occluded people 9.520, spring 2003
  • 39. System Performance CBCL MIT Combination systems (ACC) perform best. All component-based systems perform better than full-body person detectors. 9.520, spring 2003 A. Mohan, C. Papageorgiou, T. Poggio
  • 40. Learning from Examples: Applications CBCL MIT INPUT OUTPUT Object identification Object categorization Image analysis Graphics Finance Bioinformatics … 9.520, spring 2003
  • 41. Image Analysis CBCL MIT IMAGE ANALYSIS: OBJECT RECOGNITION AND POSE ESTIMATION ⇒ Bear (0° view) ⇒ Bear (45° view) 9.520, spring 2003
  • 42. Computer vision: analysis of facial expressions CBCL/AI MIT [Plot: an estimated facial parameter tracked over video frames.] The main goal is to estimate basic facial parameters, e.g. degree of mouth openness, through learning. One of the main applications is video-speech fusion to improve speech recognition systems. NTT, Atsugi, January 2002 9.520, spring 2002 Kumar, Poggio, 2001
  • 43. Combining Top-Down Constraints and Bottom-Up Data CBCL MIT Morphable Model: a face is represented as a linear combination of prototypes, a1·prototype1 + a2·prototype2 + a3·prototype3 + … Learning engine: SVM, RBF, etc. 9.520, spring 2002 Kumar, Poggio, 2001
  • 44. The Three Stages CBCL MIT Face detection → localization of facial features → analysis of facial parts. [Plot: estimated facial-part parameter over frames.] For more details → Appendix 2. 9.520, spring 2003
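The morphable-model idea on this slide is a linear combination of prototype faces. The sketch below shows that idea in its simplest form, with random vectors standing in for registered prototype faces and least squares recovering the combination coefficients; real systems separate shape and texture and use many more prototypes.

import numpy as np

# Minimal linear morphable-model sketch: a face (flattened image or shape vector) is
# approximated as a weighted sum of prototype faces. Random placeholders stand in for
# registered example faces.
rng = np.random.default_rng(0)
prototypes = rng.random((3, 64 * 64))            # prototypes multiplied by a1, a2, a3
novel_face = 0.2 * prototypes[0] + 0.5 * prototypes[1] + 0.3 * prototypes[2]

# Recover the coefficients a = (a1, a2, a3) by least squares, then resynthesize.
a, *_ = np.linalg.lstsq(prototypes.T, novel_face, rcond=None)
reconstruction = prototypes.T @ a
print(np.round(a, 3), np.allclose(reconstruction, novel_face))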
  • 45. Learning from Examples: engineering applications CBCL MIT INPUT OUTPUT Bioinformatics Artificial Markets Object categorization Object identification Image analysis Graphics Text Classification ….. 9.520, spring 2003
  • 46. Image Synthesis CBCL MIT UNCONVENTIONAL GRAPHICS Θ = 0° view ⇒ Θ = 45° view ⇒ 9.520, spring 2003
  • 47. Supermodels CBCL MIT Cc-cs.mpg MPEG (Steve Lines) 9.520, spring 2003
  • 48. A trainable system for TTVS CBCL MIT Input: Text Output: photorealistic talking face uttering text Tony Ezzat 9.520, spring 2003
  • 49. From 2D to a better 2D, and from 2D to 3D CBCL MIT Two extensions of our text-to-visual-speech (TTVS) system: • morphing of 3D models of faces to output a 3D model of a speaking face (Blanz) • learning facial appearance, dynamics and coarticulation (Ezzat) 9.520, spring 2002
  • 50. Towards 3D CBCL MIT The neutral face shape was reconstructed from a single image, and the animation transferred to the new face. 3D face scans were collected. [Click for animation] Blanz, Ezzat, Poggio 9.520, spring 2002
  • 51. Using the same basic learning techniques: Trainable Videorealistic Face Animation (voice is real, video is synthetic) CBCL MIT Ezzat, Geiger, Poggio, SigGraph 2002 9.520, spring 2002
  • 52. Trainable Videorealistic Face Animation CBCL MIT 1. Learning: the system learns from 4 mins of video the face appearance (Morphable Model) and the speech dynamics of the person. 2. Run time: for any speech input the system provides as output a synthetic video stream. [Diagram: phone stream (/SIL/ /B/ /AE/ /JH/ …) → phonetic models {µp, Σp} → trajectory synthesis → MMM → image prototypes {Ii, Ci}.] Tony Ezzat, Geiger, Poggio, SigGraph 2002 9.520, spring 2002
  • 53. Novel !!!: Trainable Videorealistic Face Animation CBCL MIT Let us look at video!!! Ezzat, Poggio, 2002 9.520, spring 2002
  • 54. Reconstructed 3D Face Models from 1 image CBCL MIT Blanz and Vetter, MPI SigGraph ‘99 9.520, spring 2003
  • 55. Reconstructed 3D Face Models from 1 image CBCL MIT Blanz and Vetter, MPI SigGraph ‘99 9.520, spring 2003
  • 56. Learning from Examples: Applications CBCL MIT INPUT OUTPUT Object identification Object categorization Image analysis Graphics Finance Bioinformatics … 9.520, spring 2003
  • 57. Artificial Agents (learning algorithms) that buy and sell stocks CBCL MIT Example of a subproject: the Electronic Market Maker. [Diagram: inputs are public limit/market orders and all available market information; outputs are bid/ask prices, bid/ask sizes, and buy/sell decisions; user control/calibration and inventory control close a learning feedback loop.] 9.520, spring 2003 Nicholas Chang
  • 58. Learning from Examples: engineering applications CBCL MIT INPUT OUTPUT Bioinformatics Artificial Markets Object categorization Object identification Image analysis Graphics Text Classification ….. 9.520, spring 2003
  • 59. Overview of overview CBCL MIT o Supervised learning: the problem and how to frame it within classical math o Examples of in-house applications o Learning and the brain 9.520, spring 2003
  • 60. The Ventral Visual Stream: From V1 to IT CBCL MIT modified from Ungerleider and Haxby, 1994; Hubel & Wiesel, 1959; Desimone, 1991 9.520, spring 2003
  • 61. Summary of “basic facts” CBCL MIT Accumulated evidence points to three (mostly accepted) properties of the ventral visual stream architecture: • hierarchical build-up of invariances (first to translation and scale, then to viewpoint etc.), of the size of the receptive fields, and of the complexity of preferred stimuli • basic feed-forward processing of information (for “immediate” recognition tasks) • learning of an individual object generalizes to scale and position 9.520, spring 2003
  • 62. The standard model: following Hubel and Wiesel… CBCL MIT [Diagram of the model hierarchy; one stage is labeled “Learning (supervised)” and another “Learning (unsupervised)”.] 9.520, spring 2003 Riesenhuber & Poggio, Nature Neuroscience, 2000
  • 63. The standard model CBCL MIT Interprets or predicts many existing data in microcircuits and system physiology, and also in cognitive science: • What some complex cells and V4 cells do and how: MAX… • View-tuning of IT cells (Logothetis) • Response to pseudomirror views • Effect of scrambling • Multiple objects • Robustness to clutter • Consistent with K. Tanaka’s simplification procedure • Categorization tasks (cats vs dogs) • Invariance to translation, scale etc… • Gender classification • Face inversion effect : experience, viewpoint, other-race, configural vs. featural representation • Transfer of generalization • No binding problem, no need for oscillations 9.520, spring 2003
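The two key operations behind these properties are template matching (“simple”, S-type units) and the MAX pooling (“complex”, C-type units) mentioned above. The toy sketch below illustrates only the MAX idea, with Gaussian-tuned S units and local pooling over position; the templates, sizes and tuning width are arbitrary assumptions, and this is not the published HMAX implementation.

import numpy as np

def s_layer(image, templates, sigma=1.0):
    """Simple-unit responses: Gaussian tuning of each local patch to each template."""
    th, tw = templates.shape[1:]
    H, W = image.shape
    out = np.zeros((len(templates), H - th + 1, W - tw + 1))
    for k, t in enumerate(templates):
        for r in range(H - th + 1):
            for c in range(W - tw + 1):
                d2 = ((image[r:r + th, c:c + tw] - t) ** 2).sum()
                out[k, r, c] = np.exp(-d2 / (2 * sigma ** 2))
    return out

def c_layer(s_maps, pool=4):
    """Complex-unit responses: MAX over local positions, giving translation tolerance."""
    K, H, W = s_maps.shape
    Hp, Wp = H // pool, W // pool
    cropped = s_maps[:, : Hp * pool, : Wp * pool]
    return cropped.reshape(K, Hp, pool, Wp, pool).max(axis=(2, 4))

rng = np.random.default_rng(0)
image = rng.random((32, 32))
templates = rng.random((4, 5, 5))      # placeholder templates (e.g. learned unsupervised)
features = c_layer(s_layer(image, templates))
print(features.shape)                   # (4, 7, 7): per-template, position-pooled responses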
  • 64. Model’s early predictions: neurons become view-tuned during recognition CBCL MIT Logothetis, Pauls, and Poggio, 1995; Logothetis, Pauls, 1995 9.520, spring 2003 Poggio, Edelman, Riesenhuber (1990, 2000)
  • 65. Recording Sites in Anterior IT CBCL MIT [Anatomical figure of recording sites; sulci labeled LUN, LAT, STS, IOS, AMTS; Ho=0.] …neurons tuned to faces are close by…. Logothetis, Pauls, and Poggio, 1995; Logothetis, Pauls, 1995 9.520, spring 2003
  • 66. The Cortex: Neurons Tuned to Object Views as predicted by model CBCL MIT 9.520, spring 2003 Logothetis, Pauls, Poggio, 1995
  • 67. View-tuned cells: scale invariance (one training view only)! CBCL MIT [Figure: scale-invariant responses of an IT neuron; eight panels of spikes/sec vs. time (msec) for stimulus sizes from 1.0° to 6.25° (scale factors ×0.4 to ×2.5).] 9.520, spring 2003 Logothetis, Pauls, and Poggio, 1995; Logothetis, Pauls, 1995
  • 68. Joint Project (Max Riesenhuber) with Earl Miller & David Freedman (MIT): Neural Correlate of Categorization (NCC) CBCL MIT Define categories in morph space. [Figure: a cat-dog morph line with the 100% prototypes at the ends, 80% and 60% morphs in between, and the category boundary in the middle.] 9.520, spring 2003
  • 69. Categorization task CBCL MIT Train monkey on categorization task. [Trial structure: fixation 500 ms → sample 600 ms → delay 1000 ms → test (match or nonmatch); a nonmatch test is followed by another delay and a match test.] After training, record from neurons in IT & PFC. 9.520, spring 2003
  • 70. Single cell example: a PFC neuron that responds more strongly to DOGS than CATS CBCL MIT [Plot: firing rate (Hz) vs. time from sample stimulus onset (ms), spanning fixation, sample, delay, and choice periods, for 100%, 80%, and 60% dog and cat morphs.] D. Freedman, E. Miller, M. Riesenhuber, T. Poggio (Science, 2001) 9.520, spring 2003
  • 71. The model suggests the same computations for different recognition tasks (and objects!) CBCL MIT [Diagram: task-specific units (e.g., PFC) on top of a general representation (IT), i.e. HMAX.] → Is this what’s going on in cortex? 9.520, spring 2003

Editor's Notes

  1. Desimone Fig: cells are tuned to views of complex objects. Here: face cell
  2. pretty amazing
  3. FEEDFORWARD! Inspired by the ventral stream. HIERARCHY: from simple to complex, small to big <show>. Two objectives: invariance & specificity. Separate mechanisms: blue = feature specificity, green = invariance. Feature specificity is easy: template matching (“and”). But how to do pooling for invariance?
  4. WHY MAX? → next!
  5. This is also borne out in physiology data! → exp
  6. Use same model to also investigate novel questions. Not a different model for each problem!!!
  7. Christian Shelton
  8. Can we find the split into a task-independent representation & task-dependent units also in cortex?