
- 1. Classification
  APAM E4990: Computational Social Science
  Jake Hofman, Columbia University
  April 26, 2013
- 2. Prediction a la Bayes¹
  • You're testing for a rare condition: 1% of the student population is in this class.
  • You have a highly sensitive and specific test:
    • 99% of students in the class visit compsocialscience.org
    • 99% of students who aren't in the class don't visit this site
  • Given that a student visits the course site, what is the probability the student is in our class?
  ¹ Follows Wiggins, SciAm 2006
- 3. Prediction a la Bayes
  Students: 10,000 ppl
    - 1% in class: 100 ppl
      - 99% visit: 99 ppl
      - 1% don't visit: 1 ppl
    - 99% not in class: 9,900 ppl
      - 1% visit: 99 ppl
      - 99% don't visit: 9,801 ppl
- 4. Prediction a la Bayes
  (same tree as above)
  So given that a student visits the site (198 ppl), there is a 50% chance the student is in our class (99 ppl)!
- 5. Prediction a la Bayes
  (same tree as above)
  The small error rate on the large population outside of our class produces many false positives.
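The tree on these slides can be checked by direct enumeration. A minimal sketch in Python, with all numbers taken from the slides:

```python
# Enumerate the population from the slides' tree and compute the posterior.
total = 10_000
in_class = round(0.01 * total)          # 100 students in the class
not_in_class = total - in_class         # 9,900 students outside it

true_pos = round(0.99 * in_class)       # 99 in class who visit the site
false_pos = round(0.01 * not_in_class)  # 99 not in class who also visit

visitors = true_pos + false_pos         # 198 visitors in total
p_class_given_visit = true_pos / visitors
print(p_class_given_visit)  # 0.5
```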
- 6. Inverting conditional probabilities: Bayes' Theorem
  Equate the far right- and left-hand sides of the product rule
    p(y|x) p(x) = p(x, y) = p(x|y) p(y)
  and divide to get the probability of y given x from the probability of x given y:
    p(y|x) = p(x|y) p(y) / p(x)
  where p(x) = Σ_{y∈Ω_Y} p(x|y) p(y) is the normalization constant.
- 7. Prediction a la Bayes
  Given that a patient tests positive, what is the probability the patient is sick?
    p(class|visit) = p(visit|class) p(class) / p(visit)
                   = (99/100)(1/100) / (99/100² + 99/100²)
                   = (99/100²) / (198/100²)
                   = 99/198 = 1/2
  where p(visit) = p(visit|class) p(class) + p(visit|¬class) p(¬class).
- 8. (Super) Naive Bayes
  We can use Bayes' rule to build a one-site student classifier:
    p(class|site) = p(site|class) p(class) / p(site)
  where we estimate these probabilities with ratios of counts:
    p̂(site|class)  = (# students in class who visit site) / (# students in class)
    p̂(site|¬class) = (# students not in class who visit site) / (# students not in class)
    p̂(class)  = (# students in class) / (# students)
    p̂(¬class) = (# students not in class) / (# students)
- 9. Naive Bayes
  Represent each student by a binary vector x where x_j = 1 if the student has visited the j-th site (x_j = 0 otherwise).
  Modeling each site as an independent Bernoulli random variable, the probability of visiting a set of sites x given class membership c = 0, 1 is:
    p(x|c) = Π_j θ_jc^{x_j} (1 − θ_jc)^{1−x_j}
  where θ_jc denotes the probability that the j-th site is visited by a student with class membership c.
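The Bernoulli likelihood above is a straightforward product over sites. A short sketch, with made-up visit probabilities for one class:

```python
import math

def bernoulli_likelihood(x, theta):
    """p(x|c) = prod_j theta_j^x_j * (1 - theta_j)^(1 - x_j)
    for a binary visit vector x and per-site probabilities theta."""
    return math.prod(t if xj else (1 - t) for xj, t in zip(x, theta))

# hypothetical: 3 sites, visit probabilities for students in class c
theta_c = [0.9, 0.2, 0.5]
x = [1, 0, 1]  # visited sites 1 and 3, not site 2
print(round(bernoulli_likelihood(x, theta_c), 2))  # 0.9 * 0.8 * 0.5 = 0.36
```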
- 10. Naive Bayes
  Using this likelihood in Bayes' rule and taking a logarithm, we have:
    log p(c|x) = log [ p(x|c) p(c) / p(x) ]
               = Σ_j x_j log(θ_jc / (1 − θ_jc)) + Σ_j log(1 − θ_jc) + log(θ_c / p(x))
- 11. Naive Bayes
  We can eliminate p(x) by calculating the log-odds:
    log [ p(1|x) / p(0|x) ] = Σ_j x_j log [ θ_j1 (1 − θ_j0) / (θ_j0 (1 − θ_j1)) ]   (the weights w_j)
                            + Σ_j log [ (1 − θ_j1) / (1 − θ_j0) ] + log(θ_1 / θ_0)   (the bias w_0)
  which gives a linear classifier of the form w · x + w_0.
- 12. Naive Bayes
  We train by counting students and sites to estimate θ_jc and θ_c:
    θ̂_jc = n_jc / n_c        θ̂_c = n_c / n
  and use these to calculate the weights ŵ_j and bias ŵ_0:
    ŵ_j = log [ θ̂_j1 (1 − θ̂_j0) / (θ̂_j0 (1 − θ̂_j1)) ]
    ŵ_0 = Σ_j log [ (1 − θ̂_j1) / (1 − θ̂_j0) ] + log(θ̂_1 / θ̂_0).
  We predict by simply adding the weights of the sites that a student has visited to the bias term.
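The train-by-counting and predict-by-summing recipe above can be sketched end to end. The tiny dataset here is made up for illustration (two sites, six students), and the count data are chosen so no estimate is exactly 0 or 1:

```python
import math

# X: binary site-visit vectors, one row per student; y: class membership.
X = [[1, 1], [1, 0], [0, 1],   # three students in the class (c = 1)
     [1, 0], [0, 0], [0, 1]]   # three students not in the class (c = 0)
y = [1, 1, 1, 0, 0, 0]
J = len(X[0])  # number of sites

# Count students (n_c) and per-site visits within each class (n_jc).
n = len(y)
n_c = {c: y.count(c) for c in (0, 1)}
n_jc = {c: [sum(x[j] for x, yi in zip(X, y) if yi == c) for j in range(J)]
        for c in (0, 1)}

# Estimate theta_c = n_c / n and theta_jc = n_jc / n_c.
theta_c = {c: n_c[c] / n for c in (0, 1)}
theta_jc = {c: [n_jc[c][j] / n_c[c] for j in range(J)] for c in (0, 1)}

# Weights and bias from the slide's formulas.
w = [math.log(theta_jc[1][j] * (1 - theta_jc[0][j])
              / (theta_jc[0][j] * (1 - theta_jc[1][j]))) for j in range(J)]
w0 = (sum(math.log((1 - theta_jc[1][j]) / (1 - theta_jc[0][j])) for j in range(J))
      + math.log(theta_c[1] / theta_c[0]))

def log_odds(x):
    # Add the weights of the visited sites to the bias term.
    return sum(wj for wj, xj in zip(w, x) if xj) + w0

print(log_odds([1, 1]) > 0)  # visits both sites -> predicted in class
print(log_odds([0, 0]) > 0)  # visits neither -> predicted not in class
```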
- 13. Naive Bayes
  In practice, this works better than one might expect given its simplicity.²
  ² http://www.jstor.org/pss/1403452
- 14. Naive Bayes
  Training is computationally cheap and scalable, and the model is easy to update given new observations.²
  ² http://www.springerlink.com/content/wu3g458834583125/
- 15. Naive Bayes
  Performance varies with document representations and corresponding likelihood models.²
  ² http://ceas.cc/2006/15.pdf
- 16. Naive Bayes
  It's often important to smooth parameter estimates (e.g., by adding pseudocounts) to avoid overfitting.
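One common version of the pseudocount fix, sketched below: add a pseudocount a to each cell so no estimate is exactly 0 or 1 (a site never visited in one class would otherwise get θ̂ = 0 and an infinite weight):

```python
def smoothed_theta(n_jc, n_c, a=1.0):
    """Estimate theta_jc as (n_jc + a) / (n_c + 2a) instead of n_jc / n_c,
    keeping every estimate strictly between 0 and 1."""
    return (n_jc + a) / (n_c + 2 * a)

# A site never visited by the 100 students in a class gets a small
# finite probability instead of exactly zero.
print(smoothed_theta(0, 100))    # 1/102, roughly 0.0098
# A site visited by all of them no longer gets exactly one.
print(smoothed_theta(100, 100))  # 101/102
```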
