# Linear Classification

### Transcript of "Linear Classification"

1. Linear Classification. Machine Learning; Wed Apr 23, 2008
2. Motivation. Our goal is to classify input vectors x into one of K classes. This is similar to regression, but the output variable is discrete. In linear classification the input space is split into decision regions by (hyper-)planes, each region with an assigned class.
3. Activation function. A non-linear function f assigns a class: the model is y(x) = f(w^T x + w0). Because of f, the model is not linear in the weights; the decision surfaces, however, are linear in w and x.
7. Discriminant functions. A simple linear discriminant function: y(x) = w^T x + w0; assign x to class C1 if y(x) >= 0 and to class C2 otherwise.
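As a concrete sketch of such a discriminant (the weights below are arbitrary illustrative values, not from the slides):

```python
import numpy as np

def linear_discriminant(x, w, w0):
    """Two-class linear discriminant: class 1 if w.x + w0 >= 0, else class 2."""
    return 1 if np.dot(w, x) + w0 >= 0 else 2

# Arbitrary weights for illustration: the decision surface is the line
# x1 + x2 - 1 = 0, i.e. all points with x1 + x2 = 1.
w, w0 = np.array([1.0, 1.0]), -1.0
print(linear_discriminant(np.array([2.0, 1.0]), w, w0))  # above the line -> 1
print(linear_discriminant(np.array([0.0, 0.0]), w, w0))  # below the line -> 2
```

The decision surface itself is the set y(x) = 0, which is linear in both w and x, as slide 6 notes.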
9. Least squares training. Encode the target vectors as 1-of-K bit vectors, fit one discriminant function per class by least squares, and classify each input to the class whose function gives the largest output.
10. Least squares is appropriate for Gaussian-distributed targets, but has major problems with discrete targets...
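A minimal sketch of least squares training with 1-of-K targets, on toy data of my own construction (the pseudo-inverse gives the least squares weight matrix):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-class data; targets encoded as 1-of-K bit vectors.
X1 = rng.normal([0.0, 0.0], 0.5, size=(20, 2))
X2 = rng.normal([2.0, 2.0], 0.5, size=(20, 2))
X = np.vstack([X1, X2])
T = np.repeat(np.eye(2), 20, axis=0)           # rows: (1,0) for class 1, (0,1) for class 2

Xb = np.hstack([np.ones((X.shape[0], 1)), X])  # prepend a bias column
W = np.linalg.pinv(Xb) @ T                     # least squares solution W = X^+ T

def classify(x):
    """Pick the class whose discriminant y_k(x) is largest."""
    y = np.array([1.0, *x]) @ W
    return int(np.argmax(y))

print(classify(np.array([0.1, -0.2])))  # near the class-1 mean -> 0
print(classify(np.array([2.1, 1.9])))   # near the class-2 mean -> 1
```

The problems slide 10 alludes to show up when points lie far on the correct side of the boundary: the squared error still penalizes them, dragging the boundary around.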
11. Fisher's linear discriminant. Consider the classification as a projection, y = w^T x. Approach: maximize the distance between the projected class means.
12. But notice the large overlap in the projected histograms when only the mean distance is maximized: the within-class variance in the projection is larger than it needs to be.
13. Fisher's linear discriminant. Maximize the difference in projected means while minimizing the within-class variance: J(w) = (m2 - m1)^2 / (s1^2 + s2^2), where m_k and s_k^2 are the mean and variance of class k after projection.
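A sketch of the resulting solution on toy data of my own choosing: the direction maximizing J(w) is w proportional to S_W^{-1}(m2 - m1), where S_W is the within-class scatter matrix:

```python
import numpy as np

rng = np.random.default_rng(1)
X1 = rng.normal([0.0, 0.0], [1.0, 0.3], size=(100, 2))  # class 1, elongated
X2 = rng.normal([2.0, 0.5], [1.0, 0.3], size=(100, 2))  # class 2, same shape

m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
# Within-class scatter S_W: sum over classes of the centered outer products.
Sw = (X1 - m1).T @ (X1 - m1) + (X2 - m2).T @ (X2 - m2)
# Fisher direction: maximizes the between-means distance relative to
# within-class spread in the projection.
w = np.linalg.solve(Sw, m2 - m1)
w /= np.linalg.norm(w)
print(w)
```

Unlike the naive choice w proportional to m2 - m1, this tilts the projection away from directions of large within-class spread, shrinking the histogram overlap of slide 12.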
14. Probabilistic models. Wake up! Clever ideas here! Three ingredients: modeling the class distributions, fitting them to data, and making classification choices.
19. Probabilistic models. Approach: define the conditional distribution p(C1 | x) and make decisions based on the sigmoid activation, p(C1 | x) = sigma(a) with sigma(a) = 1 / (1 + exp(-a)).
20. Probabilistic models. The argument of the sigmoid takes a particularly simple form for Gaussian class-conditional densities: with a shared covariance it is linear in x, a = w^T x + w0.
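A sketch of this posterior, assuming two Gaussian class-conditionals with a shared covariance S (toy parameter values of my own choosing): the standard result is w = S^{-1}(mu1 - mu2) and w0 = -mu1^T S^{-1} mu1 / 2 + mu2^T S^{-1} mu2 / 2 + ln(p1/p2):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

# Assumed toy parameters: two Gaussian class-conditionals sharing covariance S.
mu1, mu2 = np.array([0.0, 0.0]), np.array([2.0, 2.0])
S = np.array([[1.0, 0.2], [0.2, 1.0]])
p1, p2 = 0.5, 0.5  # class priors

Si = np.linalg.inv(S)
w = Si @ (mu1 - mu2)
w0 = -0.5 * mu1 @ Si @ mu1 + 0.5 * mu2 @ Si @ mu2 + np.log(p1 / p2)

def posterior_c1(x):
    """p(C1 | x) = sigmoid(w.x + w0): linear in x thanks to the shared covariance."""
    return sigmoid(w @ x + w0)

print(posterior_c1(np.array([1.0, 1.0])))  # midpoint between the means -> 0.5
```

With equal priors the posterior is exactly 0.5 halfway between the two means, and the quadratic terms in x cancel only because the covariance is shared.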
22. Maximum likelihood estimation. Assume the data are observed i.i.d.; the parameters of the Gaussian class model (class means, shared covariance, and class priors) can then be fitted by maximizing the likelihood.
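A sketch of the maximum likelihood fit, on toy data of my own construction: for the shared-covariance Gaussian model the ML estimates are the class frequencies, the class means, and the pooled covariance:

```python
import numpy as np

rng = np.random.default_rng(2)
X1 = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.3], [0.3, 1.0]], size=200)
X2 = rng.multivariate_normal([2.0, 1.0], [[1.0, 0.3], [0.3, 1.0]], size=300)

N1, N2 = len(X1), len(X2)
N = N1 + N2

# Maximum likelihood estimates under the shared-covariance Gaussian model:
pi1 = N1 / N                                   # class prior = class frequency
mu1, mu2 = X1.mean(axis=0), X2.mean(axis=0)    # class means = sample means
S1 = (X1 - mu1).T @ (X1 - mu1) / N1
S2 = (X2 - mu2).T @ (X2 - mu2) / N2
S = (N1 / N) * S1 + (N2 / N) * S2              # pooled (shared) covariance

print(pi1)         # 0.4
print(mu1, mu2)    # close to the true means
```

These estimates plug directly into the sigmoid posterior of slide 20.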
24. Logistic regression. We can also directly express the class probability as a sigmoid, p(C1 | x) = sigma(w^T x), without implicitly assuming an underlying Gaussian.
25. The likelihood, with targets t_n in {0, 1} and y_n = sigma(w^T x_n): p(t | w) = prod_n y_n^{t_n} (1 - y_n)^{1 - t_n}.
26. Logistic regression. We can maximize the log likelihood; its gradient has the simple form sum_n (t_n - y_n) x_n.
27. Logistic regression. Here we only estimate M weights, not M for each mean plus O(M^2) for the covariance as in the Gaussian approach.
28. We are not assuming anything about the underlying densities; we only assume the classes can be separated linearly.
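The slides leave the optimizer unspecified (iterative reweighted least squares is a common choice); as a minimal sketch, plain gradient ascent on the log likelihood, with toy data and a learning rate of my own choosing:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

# Toy two-class data; seed, spread, and learning rate are illustrative choices.
rng = np.random.default_rng(3)
X1 = rng.normal([0.0, 0.0], 0.7, size=(50, 2))
X2 = rng.normal([2.0, 2.0], 0.7, size=(50, 2))
X = np.hstack([np.ones((100, 1)), np.vstack([X1, X2])])  # prepend bias feature
t = np.concatenate([np.zeros(50), np.ones(50)])          # targets t_n in {0, 1}

w = np.zeros(3)                        # only M weights to estimate
for _ in range(2000):
    y = sigmoid(X @ w)                 # y_n = sigma(w.x_n)
    w += 0.5 * X.T @ (t - y) / len(t)  # ascend the gradient sum_n (t_n - y_n) x_n

acc = float(np.mean((sigmoid(X @ w) > 0.5) == t))
print(acc)  # training accuracy on this toy set
```

Note that only the three weights in w are fitted; no means or covariances appear anywhere, which is exactly the economy slide 27 points out.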
29. Summary
    - Classification models
        - Linear decision surfaces
        - Geometric approach
            - Maximizing distance of means and minimizing variance
        - Probabilistic approach
            - Sigmoid functions
            - Logistic regression