Image Classification And Support Vector Machine

    Image Classification And Support Vector Machine - Presentation Transcript

    1. Image Classification and Support Vector Machine
      Shao-Chuan Wang
      CITI, Academia Sinica
    2. Outline (1/2)
      Quick Review of SVM
      Intuition
      Functional margin and geometric margin
      Optimal margin classifier
      Generalized Lagrangian multiplier methods
      Lagrangian duality
      Kernel and feature mapping
      Soft Margin (L1 regularization)
    3. Outline (2/2)
      Some basics of learning theory
      Bias/variance tradeoff (underfitting vs overfitting)
      Chernoff bound and VC dimension
      Model selection
      Cross validation
      Dimension Reduction
      Multiclass SVM
      One against one
      One against all
      Image Classification by SVM
      Process
      Results
    4. Intuition: Margins
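      Functional margin: $\hat{\gamma}^{(i)} = y^{(i)}(w^T x^{(i)} + b)$
      Geometric margin: $\gamma^{(i)} = y^{(i)}\left(\left(\frac{w}{\|w\|}\right)^T x^{(i)} + \frac{b}{\|w\|}\right)$
      We feel more confident when the functional margin is larger.
      Note that rescaling (w, b) does not change the separating plane.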
      Andrew Ng. Part V Support Vector Machines. CS229 Lecture Notes (2008).
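      A minimal sketch of the two margins, assuming labels y in {-1, +1}: the functional margin is y(w·x + b) and the geometric margin divides it by ||w||. The hyperplane and the example point are made-up values for illustration. It shows that rescaling (w, b) changes the functional margin but leaves the geometric margin alone.

```python
import math

def functional_margin(w, b, x, y):
    """y * (w . x + b) for one labeled example, with y in {-1, +1}."""
    return y * (sum(wi * xi for wi, xi in zip(w, x)) + b)

def geometric_margin(w, b, x, y):
    """Functional margin divided by ||w||: the signed distance to the plane."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return functional_margin(w, b, x, y) / norm

# Hypothetical hyperplane 3*x1 + 4*x2 - 5 = 0 and one positive example.
w, b = [3.0, 4.0], -5.0
x, y = [3.0, 4.0], 1

f1 = functional_margin(w, b, x, y)
g1 = geometric_margin(w, b, x, y)

# Scale (w, b) by 10: same plane, larger functional margin, same geometric margin.
w2, b2 = [10 * wi for wi in w], 10 * b
f2 = functional_margin(w2, b2, x, y)
g2 = geometric_margin(w2, b2, x, y)
```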
    5. Maximize margins
      Optimization problem: maximize minimal geometric margin under constraints.
      Introduce a scaling constraint such that the functional margin equals 1, i.e. $\hat{\gamma} = 1$.
      Andrew Ng. Part V Support Vector Machines. CS229 Lecture Notes (2008).
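      The equations on this slide did not survive extraction. As a sketch following the cited CS229 notes (the notation is an assumption, not recovered from the slide), the problem and its rescaled form are usually written as:

```latex
% Maximize the minimal geometric margin \gamma
\begin{align*}
\max_{\gamma, w, b} \quad & \gamma \\
\text{s.t.} \quad & y^{(i)}(w^T x^{(i)} + b) \ge \gamma, \quad i = 1, \dots, m \\
& \|w\| = 1
\end{align*}

% Fixing the functional margin to 1 gives an equivalent convex problem
\begin{align*}
\min_{w, b} \quad & \tfrac{1}{2}\|w\|^2 \\
\text{s.t.} \quad & y^{(i)}(w^T x^{(i)} + b) \ge 1, \quad i = 1, \dots, m
\end{align*}
```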
    6. Lagrange duality
      Primal optimization problem:
      Generalized Lagrangian
      Primal optimization problem (equivalent form)
      Dual optimization problem:
      Andrew Ng. Part V Support Vector Machines. CS229 Lecture Notes (2008).
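      The four formulas named on this slide are missing from the transcript. A sketch in the notation of the cited CS229 notes (assumed here, with k inequality and l equality constraints):

```latex
% Primal optimization problem
\begin{align*}
\min_{w} \quad & f(w) \\
\text{s.t.} \quad & g_i(w) \le 0, \quad i = 1, \dots, k \\
& h_i(w) = 0, \quad i = 1, \dots, l
\end{align*}

% Generalized Lagrangian
\[
\mathcal{L}(w, \alpha, \beta) = f(w) + \sum_{i=1}^{k} \alpha_i g_i(w) + \sum_{i=1}^{l} \beta_i h_i(w)
\]

% Primal problem (equivalent form) and dual problem
\[
p^* = \min_{w} \max_{\alpha, \beta :\, \alpha_i \ge 0} \mathcal{L}(w, \alpha, \beta),
\qquad
d^* = \max_{\alpha, \beta :\, \alpha_i \ge 0} \min_{w} \mathcal{L}(w, \alpha, \beta),
\qquad
d^* \le p^*
\]
```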
    7. Dual Problem
      Sufficient conditions for equality ($d^* = p^*$) to hold:
      f and the g_i are convex, the h_i are affine, and the constraints g_i are strictly feasible.
      KKT conditions.
      Andrew Ng. Part V Support Vector Machines. CS229 Lecture Notes (2008).
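      The KKT conditions themselves are not in the transcript. As stated in the cited CS229 notes (notation assumed), for optimal $w^*, \alpha^*, \beta^*$ they read:

```latex
\begin{align*}
\frac{\partial}{\partial w_i} \mathcal{L}(w^*, \alpha^*, \beta^*) &= 0, & i &= 1, \dots, n \\
\frac{\partial}{\partial \beta_i} \mathcal{L}(w^*, \alpha^*, \beta^*) &= 0, & i &= 1, \dots, l \\
\alpha_i^* \, g_i(w^*) &= 0, & i &= 1, \dots, k \\
g_i(w^*) &\le 0, & i &= 1, \dots, k \\
\alpha_i^* &\ge 0, & i &= 1, \dots, k
\end{align*}
```

      The third line (complementarity) implies that $\alpha_i^* > 0$ only when the constraint $g_i(w^*) = 0$ is active, which is why only the support vectors carry non-zero multipliers.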
    8. Optimal margin classifiers
      Its Lagrangian
      Its dual problem
      Andrew Ng. Part V Support Vector Machines. CS229 Lecture Notes (2008).
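      The Lagrangian and dual of the optimal margin classifier are missing from the transcript. A sketch following the cited CS229 notes (notation assumed):

```latex
% Lagrangian of  min 1/2 ||w||^2  s.t.  y^{(i)}(w^T x^{(i)} + b) >= 1
\[
\mathcal{L}(w, b, \alpha) = \tfrac{1}{2}\|w\|^2 - \sum_{i=1}^{m} \alpha_i \left[ y^{(i)}(w^T x^{(i)} + b) - 1 \right]
\]

% Dual problem
\begin{align*}
\max_{\alpha} \quad & W(\alpha) = \sum_{i=1}^{m} \alpha_i - \tfrac{1}{2} \sum_{i=1}^{m} \sum_{j=1}^{m} y^{(i)} y^{(j)} \alpha_i \alpha_j \langle x^{(i)}, x^{(j)} \rangle \\
\text{s.t.} \quad & \alpha_i \ge 0, \quad i = 1, \dots, m \\
& \sum_{i=1}^{m} \alpha_i y^{(i)} = 0
\end{align*}
```

      The data enter the dual only through inner products $\langle x^{(i)}, x^{(j)} \rangle$, which is what makes the kernel substitution possible.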
    9. Kernel and feature mapping
      Kernel:
      Positive semi-definite
      Symmetric
      For example:
      Loose Intuition
      “similarity” between features
      Andrew Ng. Part V Support Vector Machines. CS229 Lecture Notes (2008).
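      The slide's own kernel example is lost. As an illustration only, the sketch below uses the Gaussian (RBF) kernel, one common choice, and checks the two listed properties on a small made-up point set: the Gram matrix is symmetric, and the quadratic form v^T K v is non-negative for random vectors v (a spot check, not a proof of positive semi-definiteness).

```python
import math
import random

def gaussian_kernel(x, z, sigma=1.0):
    """K(x, z) = exp(-||x - z||^2 / (2 sigma^2)); larger when x and z are similar."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def gram_matrix(points, kernel):
    """Matrix of kernel values K[i][j] = kernel(points[i], points[j])."""
    return [[kernel(p, q) for q in points] for p in points]

def quadratic_form(K, v):
    """v^T K v, which must be >= 0 for every v if K is positive semi-definite."""
    n = len(v)
    return sum(v[i] * K[i][j] * v[j] for i in range(n) for j in range(n))

# Hypothetical 2-D points.
points = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 1.0]]
K = gram_matrix(points, gaussian_kernel)

rng = random.Random(0)
forms = [quadratic_form(K, [rng.uniform(-1, 1) for _ in points])
         for _ in range(200)]
```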
    10. Soft Margin (L1 regularization)
      C = ∞ leads to the hard-margin SVM.
      Rychetsky (2001)
      Andrew Ng. Part V Support Vector Machines. CS229 Lecture Notes (2008).
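      The soft-margin formulation is not in the transcript. A sketch following the cited CS229 notes (notation assumed), with slack variables $\xi_i$ penalized linearly:

```latex
\begin{align*}
\min_{w, b, \xi} \quad & \tfrac{1}{2}\|w\|^2 + C \sum_{i=1}^{m} \xi_i \\
\text{s.t.} \quad & y^{(i)}(w^T x^{(i)} + b) \ge 1 - \xi_i, \quad i = 1, \dots, m \\
& \xi_i \ge 0, \quad i = 1, \dots, m
\end{align*}
```

      In the dual the only change is the box constraint $0 \le \alpha_i \le C$. As $C \to \infty$ any non-zero slack becomes infinitely costly, which recovers the hard-margin problem.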
    11. Why doesn’t my model fit well on test data?
    12. Some basics of learning theory
      Bias/variance tradeoff
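      underfitting (high bias) vs. overfitting (high variance)
      Training error: $\hat{\varepsilon}(h) = \frac{1}{m}\sum_{i=1}^{m} 1\{h(x^{(i)}) \ne y^{(i)}\}$
      Generalization error: $\varepsilon(h) = P_{(x,y)\sim\mathcal{D}}(h(x) \ne y)$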
      Andrew Ng. Part VI Learning Theory. CS229 Lecture Notes (2008).
    13. Bias/variance tradeoff
      T. Hastie, R. Tibshirani, and J. Friedman. The Elements of Statistical Learning. Springer series in statistics. Springer, New York, 2001.
    14. Is training error a good estimator of generalization error?
    15. Chernoff bound (|H|=finite)
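      Lemma: Assume $Z_1, Z_2, \dots, Z_m$ are drawn iid from Bernoulli($\phi$),
      let $\hat{\phi} = \frac{1}{m}\sum_{i=1}^{m} Z_i$, and let $\gamma > 0$ be fixed. Then $P(|\phi - \hat{\phi}| > \gamma) \le 2\exp(-2\gamma^2 m)$.
      Based on this lemma, one can show that, with probability at least $1-\delta$, for all $h \in H$: $|\varepsilon(h) - \hat{\varepsilon}(h)| \le \sqrt{\frac{1}{2m}\log\frac{2k}{\delta}}$
      (k = number of hypotheses)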
      Andrew Ng. Part VI Learning Theory. CS229 Lecture Notes (2008).
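      To get a feel for the numbers, here is a small sketch of the finite-|H| bound as given in the cited CS229 notes: the tail bound 2·exp(−2γ²m), the uniform gap sqrt(log(2k/δ) / (2m)), and the sample size needed for a target gap. The function names and the example values (m, k, δ) are illustrative assumptions.

```python
import math

def hoeffding_tail(gamma, m):
    """Upper bound on P(|phi - phi_hat| > gamma) for m iid Bernoulli draws."""
    return 2.0 * math.exp(-2.0 * gamma ** 2 * m)

def uniform_gap(m, k, delta):
    """Gap gamma such that, with probability >= 1 - delta, all k hypotheses
    have |generalization error - training error| <= gamma."""
    return math.sqrt(math.log(2.0 * k / delta) / (2.0 * m))

def samples_needed(gamma, k, delta):
    """Smallest m for which uniform_gap(m, k, delta) <= gamma."""
    return math.ceil(math.log(2.0 * k / delta) / (2.0 * gamma ** 2))

# Hypothetical setting: 1000 training samples, 100 hypotheses, 95% confidence.
gap = uniform_gap(m=1000, k=100, delta=0.05)
m_req = samples_needed(gamma=0.05, k=100, delta=0.05)
```

      Note that the required sample size grows only logarithmically in the number of hypotheses k.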
    16. Chernoff bound (|H|=infinite)
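      VC dimension d: the size of the largest set that H can shatter.
      e.g. H = linear classifiers in 2-D: VC(H) = 3
      With probability at least $1-\delta$, for all $h \in H$: $|\varepsilon(h) - \hat{\varepsilon}(h)| \le O\left(\sqrt{\frac{d}{m}\log\frac{m}{d} + \frac{1}{m}\log\frac{1}{\delta}}\right)$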
      Andrew Ng. Part VI Learning Theory. CS229 Lecture Notes (2008).
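      A rough way to see what "shatter" means for linear classifiers in 2-D: the sketch below runs a plain perceptron on every labeling of three non-collinear points and finds a separating line each time. The point sets and the epoch budget are assumptions for illustration. The XOR labeling of four points is included as a contrast; the perceptron failing within its budget is only suggestive, since non-separability of XOR is a standard fact rather than something this code proves.

```python
from itertools import product

def perceptron_separates(points, labels, max_epochs=1000):
    """Return True if a perceptron finds (w, b) classifying every point correctly."""
    w = [0.0] * len(points[0])
    b = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in zip(points, labels):
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * score <= 0:  # misclassified, or exactly on the boundary
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
                mistakes += 1
        if mistakes == 0:
            return True
    return False  # no separating line found within the epoch budget

def is_shattered(points):
    """True if every +/-1 labeling of the points is realized by some line."""
    return all(perceptron_separates(points, labels)
               for labels in product([-1, 1], repeat=len(points)))

three_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
xor_points = [(0.0, 0.0), (1.0, 1.0), (1.0, 0.0), (0.0, 1.0)]
xor_labels = (1, 1, -1, -1)

shattered_three = is_shattered(three_points)
xor_separated = perceptron_separates(xor_points, xor_labels)
```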
    17. Model Selection
      Cross Validation: Estimator of generalization error
      k-fold: split the data into k pieces, train on k−1 pieces and test on the remaining one; each such round gives one test-error estimate.
      Average the k test-error estimates; if the average is, say, 2%, then 2% is the estimate of the generalization error for this learner.
      Leave-one-out cross validation (m-fold, m = training sample size)
      [Figure: the training data split into folds; one fold is held out to validate and the others are used to train.]
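      The k-fold procedure can be sketched as follows. This is a minimal illustration, not the author's code: `fit` and `predict` are hypothetical callables standing in for any learner, the folds are taken in order without shuffling, and the toy majority-class learner exists only to make the example runnable.

```python
def k_fold_splits(n_samples, k):
    """Yield (train_indices, validation_indices) for each of k contiguous folds."""
    indices = list(range(n_samples))
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, val
        start += size

def cross_validation_error(xs, ys, fit, predict, k):
    """Average validation error rate over k folds (requires k <= len(xs))."""
    fold_errors = []
    for train, val in k_fold_splits(len(xs), k):
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        wrong = sum(predict(model, xs[i]) != ys[i] for i in val)
        fold_errors.append(wrong / len(val))
    return sum(fold_errors) / k

# Toy learner (hypothetical): always predict the majority training label.
def fit_majority(train_x, train_y):
    return max(set(train_y), key=train_y.count)

def predict_majority(model, x):
    return model

xs = list(range(10))
ys = [1, 1, 1, 1, 1, 1, 1, 1, -1, -1]
err = cross_validation_error(xs, ys, fit_majority, predict_majority, k=5)
```

      Leave-one-out cross validation is the same call with `k=len(xs)`. In practice the data would usually be shuffled before splitting.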
    18. Model Selection
      Loop possible parameters:
      Pick one set of parameters, e.g. C = 2.0
      Do cross validation and get an error estimate
      Pick C_best (the value with the minimal error estimate) as the parameter
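      The loop can be sketched as below. `cv_error` is a hypothetical callable that would train the SVM with a given C and return its cross-validation error; here a made-up formula stands in for it so the example runs, and the candidate grid is likewise an assumption.

```python
import math

def select_best_parameter(candidates, cv_error):
    """Return (best_value, best_error): the candidate with the minimal
    cross-validation error estimate."""
    best_value, best_error = None, float("inf")
    for value in candidates:
        error = cv_error(value)  # e.g. k-fold cross validation for this value
        if error < best_error:
            best_value, best_error = value, error
    return best_value, best_error

# Made-up stand-in for "train with this C, cross-validate, return the error".
def toy_cv_error(C):
    return 0.02 + 0.01 * abs(math.log2(C) - 1.0)

# Candidate values on a log scale: 2^-3, 2^-2, ..., 2^5.
candidates_C = [2.0 ** p for p in range(-3, 6)]
C_best, err_best = select_best_parameter(candidates_C, toy_cv_error)
```

      Searching C on a logarithmic grid is a common convention rather than something the slides prescribe.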
    19. Dimensionality Reduction
      Which features are more “important”?
      Wrapper model feature selection
      Forward/backward search: add/remove a feature at a time, then evaluate the model with the new feature set.
      Filter feature selection
      Compute a score S(i) that measures how informative x_i is about the class label y
      S(i) can be correlation Corr(x_i, y), or mutual information MI(x_i, y), etc.
      Principal Component Analysis (PCA)
      Vector Quantization (VQ)
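      As an illustration of filter feature selection, the sketch below scores each feature by S(i) = |Corr(x_i, y)| and ranks the features. The data matrix is made up, and treating a constant feature as score 0 is a choice made here for the example.

```python
import math

def pearson_corr(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant feature or label: treat as uninformative
    return cov / math.sqrt(var_x * var_y)

def rank_features(X, y):
    """Return (order, scores): feature indices sorted by |Corr(x_i, y)|,
    most informative first, and the score of each feature."""
    n_features = len(X[0])
    scores = [abs(pearson_corr([row[i] for row in X], y))
              for i in range(n_features)]
    order = sorted(range(n_features), key=lambda i: scores[i], reverse=True)
    return order, scores

# Hypothetical data: feature 0 tracks the label, feature 1 is constant,
# feature 2 is only weakly related.
X = [[1.0, 5.0, 0.3],
     [2.0, 5.0, 0.9],
     [3.0, 5.0, 0.1],
     [4.0, 5.0, 0.7]]
y = [-1, -1, 1, 1]
order, scores = rank_features(X, y)
```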
    20. Multiclass SVM
      One against one
      There are k(k−1)/2 binary SVMs. (1 v 2, 1 v 3, …)
      To predict, each SVM can vote between 2 classes.
      One against all
      There are k binary SVMs. (1 v rest, 2 v rest, …)
      To predict, evaluate each SVM's decision function and pick the class with the largest value.
      Example poll (one against one, 6 classes):
        Class K:  1  2  3  4  5  6
        Votes:    1  3  5  3  2  1
        Predicted class: K = 3
      Multiclass SVM by solving ONE optimization problem
      Crammer, K., & Singer, Y. (2001). On the algorithmic implementation of multiclass kernel-based vector machines. JMLR, 2, 265-292.
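      The two prediction rules (voting for one against one, largest score for one against all) can be sketched as follows. The binary classifiers are replaced by hypothetical stand-ins on a 1-D toy problem, so this shows only the combination logic, not SVM training.

```python
from collections import Counter
from itertools import combinations

def one_against_one_predict(classes, pairwise_winner, x):
    """Let each of the k(k-1)/2 pairwise classifiers vote; return
    (predicted_class, votes). Ties go to the earliest class in `classes`."""
    votes = Counter({c: 0 for c in classes})
    for a, b in combinations(classes, 2):
        votes[pairwise_winner(a, b, x)] += 1
    predicted = max(classes, key=lambda c: votes[c])
    return predicted, dict(votes)

def one_against_all_predict(scorers, x):
    """`scorers` maps class -> decision function; pick the largest score."""
    return max(scorers, key=lambda c: scorers[c](x))

# Hypothetical 1-D toy: each class has a center, and the stand-in
# "classifiers" simply prefer the nearer center.
centers = {1: 0.0, 2: 5.0, 3: 10.0}

def nearest_of_pair(a, b, x):
    return a if abs(x - centers[a]) <= abs(x - centers[b]) else b

scorers = {c: (lambda x, mu=mu: -abs(x - mu)) for c, mu in centers.items()}

ovo_class, ovo_votes = one_against_one_predict([1, 2, 3], nearest_of_pair, 6.0)
ova_class = one_against_all_predict(scorers, 6.0)
```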
    21. Image Classification by SVM
      Process
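      [Figure: process diagram. K = 6 classes; the data are split into 3/4 and 1/4 portions; samples are written as sparse "label index:value" lines such as "1 0:49 1:25 …" and "2 0:49 1:25 …"; accuracy is measured on the test data.]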
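      A rough sketch of the data handling this slide suggests. Several things here are assumptions: that lines such as "1 0:49 1:25 …" are a class label followed by sparse index:value features, that the 3/4 portion is used for training and the 1/4 portion for testing, and all function names. Only the visible part of the sample line is parsed.

```python
import random

def parse_sparse_line(line):
    """Parse 'label index:value index:value ...' into (label, {index: value})."""
    tokens = line.split()
    label = int(tokens[0])
    features = {}
    for token in tokens[1:]:
        index, value = token.split(":")
        features[int(index)] = float(value)
    return label, features

def split_three_quarters(samples, seed=0):
    """Shuffle the samples and split them into a 3/4 part and a 1/4 part."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = (3 * len(shuffled)) // 4
    return shuffled[:cut], shuffled[cut:]

def accuracy(predicted, actual):
    """Fraction of predictions that match the true labels."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)

label, features = parse_sparse_line("1 0:49 1:25")
larger, smaller = split_three_quarters(range(8))  # assumed: train on 3/4, test on 1/4
```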
    22. Image Classification by SVM
      Results
      Run the multiclass SVM 100 times for both kernels (linear and Gaussian).
      Accuracy Histogram
    23. Image Classification by SVM
      What happens if we feed in object data that the machine has never seen before?
    24. ~ Thank You ~
      Shao-Chuan Wang
      CITI, Academia Sinica