Transcript of "Linear Classification"

  1. Linear Classification (Machine Learning; Wed Apr 23, 2008)
  2. Motivation. Problem: our goal is to “classify” input vectors x into one of k classes. This is similar to regression, but the output variable is discrete. In linear classification the input space is split by (hyper-)planes into regions, each with an assigned class.
  3.–6. Activation function. A non-linear function f assigns a class. Due to f, the model is not linear in the weights, but the decision surfaces are linear in w and x.
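The equations on this slide are images and do not survive in the transcript. As a sketch, the generalized linear model presumably meant here (notation assumed) is

```latex
y(\mathbf{x}) = f(\mathbf{w}^\top \mathbf{x} + w_0)
```

where f is a non-linear activation, e.g. a step function for two classes. The decision surface y(x) = const corresponds to w^T x + w_0 = const, which is linear even though y is non-linear in w.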
  7.–8. Discriminant functions. A simple linear discriminant function:
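The function itself is an image in the original. The simple linear discriminant presumably intended (symbols assumed) is

```latex
y(\mathbf{x}) = \mathbf{w}^\top \mathbf{x} + w_0
```

with x assigned to class C1 if y(x) >= 0 and to C2 otherwise; for k classes, one discriminant y_k(x) = w_k^T x + w_{k0} per class, assigning x to the class with the largest y_k(x).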
  9. Least-squares training. Target vectors are encoded as bit vectors; classification uses the h functions.
  10. Least-squares training. Least squares is appropriate for Gaussian distributions, but has major problems with discrete targets...
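The worked example from this slide is lost. As an illustration, a minimal least-squares classifier with 1-of-K target encoding might look like this; all names and the toy data are assumptions, not from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy data: two well-separated Gaussian blobs (assumed; 2 classes, 2 features).
X = np.vstack([rng.normal([0, 0], 1, (50, 2)),
               rng.normal([4, 4], 1, (50, 2))])
labels = np.repeat([0, 1], 50)

T = np.eye(2)[labels]                        # 1-of-K (bit-vector) targets
X_tilde = np.hstack([np.ones((100, 1)), X])  # prepend a bias feature

# Closed-form least-squares solution W = (X~^T X~)^{-1} X~^T T
W_tilde, *_ = np.linalg.lstsq(X_tilde, T, rcond=None)

# Classify by the largest of the K discriminant outputs
pred = np.argmax(X_tilde @ W_tilde, axis=1)
accuracy = np.mean(pred == labels)
```

On clean, well-separated data this works, but as the slide notes, outliers far from the boundary pull the least-squares solution badly, since their squared error dominates.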
  11.–12. Fisher's linear discriminant. Consider the classification as a projection onto a line; one approach is to maximize the distance between the projected class means. But notice the large overlap in the histograms: the variance in the projection is larger than it need be.
  13. Fisher's linear discriminant. Maximize the difference in means and minimize the within-class variance:
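The criterion itself does not survive in the transcript; presumably it is the standard two-class Fisher criterion (symbols assumed):

```latex
J(\mathbf{w}) = \frac{(m_2 - m_1)^2}{s_1^2 + s_2^2},
\qquad
\mathbf{w} \propto \mathbf{S}_W^{-1}(\mathbf{m}_2 - \mathbf{m}_1)
```

where m_k is the projected mean of class k, s_k^2 the within-class variance of the projected data, and S_W the total within-class covariance matrix. Maximizing J trades off mean separation against projected spread, fixing the overlap seen on the previous slide.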
  14.–18. Probabilistic models. Wake up, clever ideas here! Three ingredients: modeling, fitting, and making choices.
  19. Probabilistic models. Approach: define the conditional distribution and make decisions based on the sigmoid activation.
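The defining equation is an image in the original. In the usual two-class treatment (notation assumed), the posterior is written as a sigmoid of the log odds:

```latex
p(C_1 \mid \mathbf{x}) = \sigma(a) = \frac{1}{1 + e^{-a}},
\qquad
a = \ln \frac{p(\mathbf{x} \mid C_1)\, p(C_1)}{p(\mathbf{x} \mid C_2)\, p(C_2)}
```

Deciding for C1 whenever sigma(a) > 1/2 is then equivalent to deciding by the sign of a.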
  20.–21. Probabilistic models. A particularly simple expression arises in the Gaussian case: with Gaussian class-conditional densities sharing a covariance matrix, the sigmoid argument a is linear in x.
  22.–23. Maximum likelihood estimation. Assume the observations are i.i.d.
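The likelihood and the resulting estimators are missing from the transcript. For the two-class Gaussian generative model, the standard maximum-likelihood setup (notation assumed, following the usual textbook treatment) is

```latex
p(\mathbf{t} \mid \pi, \boldsymbol\mu_1, \boldsymbol\mu_2, \boldsymbol\Sigma)
= \prod_{n=1}^{N}
  \big[\pi\, \mathcal{N}(\mathbf{x}_n \mid \boldsymbol\mu_1, \boldsymbol\Sigma)\big]^{t_n}
  \big[(1-\pi)\, \mathcal{N}(\mathbf{x}_n \mid \boldsymbol\mu_2, \boldsymbol\Sigma)\big]^{1-t_n}
```

Maximizing the log of this likelihood gives the intuitive estimates: the prior pi = N_1/N (the fraction of class-1 points), each mean mu_k equal to the sample mean of its class, and Sigma a weighted average of the two class covariances.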
  24.–25. Logistic regression. We can also directly express the class probability as a sigmoid (without implicitly assuming an underlying Gaussian). The likelihood:
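The likelihood shown on the slide does not survive; in the standard formulation (notation assumed) it is

```latex
p(\mathbf{t} \mid \mathbf{w}) = \prod_{n=1}^{N} y_n^{t_n}\,(1 - y_n)^{1 - t_n},
\qquad
y_n = \sigma(\mathbf{w}^\top \boldsymbol\phi_n)
```

whose negative log is the cross-entropy error E(w) = -sum_n [t_n ln y_n + (1 - t_n) ln(1 - y_n)], with the conveniently simple gradient sum_n (y_n - t_n) phi_n.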
  26. Logistic regression. We can maximize the log likelihood...
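There is no closed-form maximizer, so the log likelihood is optimized iteratively. A minimal sketch using batch gradient ascent on one-dimensional toy data (all names and data are assumptions; the lecture may well have used Newton/IRLS instead):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

rng = np.random.default_rng(1)
# Toy data: two 1-D Gaussian classes at -2 and +2 (assumed).
X = np.vstack([rng.normal(-2, 1, (50, 1)),
               rng.normal(2, 1, (50, 1))])
t = np.repeat([0.0, 1.0], 50)

Phi = np.hstack([np.ones((100, 1)), X])  # basis functions: bias + identity
w = np.zeros(2)
eta = 0.1
for _ in range(500):
    y = sigmoid(Phi @ w)
    grad = Phi.T @ (t - y)       # gradient of the log likelihood
    w += eta * grad / len(t)     # ascend, with a mean-gradient step

y = sigmoid(Phi @ w)
accuracy = np.mean((y > 0.5) == t)
```

Note that on perfectly separable data the unregularized maximum-likelihood weights diverge (the sigmoid is pushed toward a step function), so in practice a regularization term is added.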
  27.–28. Logistic regression. Here we only estimate M weights, not M for each mean plus O(M^2) for the covariance as in the Gaussian approach. We are not assuming anything about the underlying densities (we only assume we can separate the classes linearly).
  29. Summary
     - Classification models
       - Linear decision surfaces
       - Geometric approach
         - Maximizing distance of means and minimizing variance
       - Probabilistic approach
         - Sigmoid functions
         - Logistic regression