Practical Machine Learning and Rails, Part 1

Part 2: http://www.slideshare.net/ryanstout/practical-machine-learning-and-rails-part2


  1. Practical Machine Learning and Rails
  2. Andrew Cantino, VP Engineering, Mavenlink (@tectonic), and Ryan Stout, Founder, Agile Productions (@ryanstout)
  3. This talk will: introduce machine learning, make you ML-aware, and have examples
  4. This talk will not: give you a PhD, implement algorithms, or cover collaborative filtering, optimization, clustering, advanced statistics, genetic algorithms, classical AI, NLP, ...
  5. What is Machine Learning? Many different algorithms that predict data from other data using applied statistics.
  6. "Enhance and rotate 20 degrees"
  7. What data? The web is data: user decisions, APIs, A/B tests, databases, logs, streams, browser versions, reviews, clicktrails
  8. Okay. We have data. What do we do with it? We classify it.
  9. Classification
  10. Classification: [image] OR [image]
  11. Classification :) OR :(
  12. Classification
      • Documents: sort email (Gmail's importance filter); route questions to the appropriate expert (Aardvark); categorize reviews (Amazon)
      • Users: expertise; interests; pro vs. free; likelihood of paying; expected future karma
      • Events: abnormal vs. normal
  13. Algorithms: Decision Tree Learning
  14. Algorithms: Decision Tree Learning (the features are the tests at each node; the labels are the spam probabilities at the leaves):
      Email contains word "viagra"?
        no  → Email contains word "Ruby"?
                no  → P(Spam) = 10%
                yes → P(Spam) = 5%
        yes → Email contains attachment?
                no  → P(Spam) = 70%
                yes → P(Spam) = 95%
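      A minimal Ruby sketch of walking this tree, using the slide's illustrative tests and probabilities; `Email` and the hard-coded values are stand-ins, since a real tree would be learned from labeled examples:

        # Sketch of the decision tree above in plain Ruby. The split tests and
        # leaf probabilities are the slide's illustrative values, not a model
        # trained on real data.
        Email = Struct.new(:body, :has_attachment)

        def spam_probability(email)
          if email.body.include?("viagra")
            email.has_attachment ? 0.95 : 0.70   # right branch of the tree
          else
            email.body.include?("Ruby") ? 0.05 : 0.10   # left branch
          end
        end

        spam_probability(Email.new("cheap viagra here", true))  # => 0.95
        spam_probability(Email.new("new Ruby release", false))  # => 0.05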
  15. Algorithms: Support Vector Machines (SVMs) Graphics from Wikipedia
  16. Algorithms: Support Vector Machines (SVMs) Graphics from Wikipedia
  17. Algorithms: Naive Bayes
      • Break documents into words and treat each word as an independent feature
      • Surprisingly effective on simple text and document classification
      • Works well when you have lots of data
      Graphics from Wikipedia
  18. Algorithms: Naive Bayes
      You received 100 emails, 70 of which were spam.

      Word     Spam with this word   Ham with this word
      viagra   42 (60%)              1 (3.3%)
      ruby     7 (10%)               15 (50%)
      hello    35 (50%)              24 (80%)

      A new email contains "hello" and "viagra". The probability that it is spam is:
      P(S|hello,viagra) = P(S) * P(hello,viagra|S) / P(hello,viagra)
                        = 0.7 * (0.5 * 0.6) / (0.59 * 0.43)    (treating the words as independent)
                        ≈ 82%
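      The same arithmetic as a short Ruby sketch, hard-coding the toy counts from the table above; this reproduces the slide's calculation and is not a general-purpose classifier:

        SPAM_TOTAL = 70.0
        HAM_TOTAL  = 30.0
        SPAM_COUNTS = { "viagra" => 42, "ruby" => 7,  "hello" => 35 }
        HAM_COUNTS  = { "viagra" => 1,  "ruby" => 15, "hello" => 24 }

        def p_spam_given(words)
          p_spam = SPAM_TOTAL / (SPAM_TOTAL + HAM_TOTAL)
          # Naive assumption: words occur independently within each class.
          likelihood = words.reduce(1.0) { |p, w| p * SPAM_COUNTS[w] / SPAM_TOTAL }
          evidence = words.reduce(1.0) do |p, w|
            p * (SPAM_COUNTS[w] + HAM_COUNTS[w]) / (SPAM_TOTAL + HAM_TOTAL)
          end
          p_spam * likelihood / evidence
        end

        p_spam_given(%w[hello viagra]) # => 0.827..., the ~82% from the slide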
  19. Algorithms: Neural Nets: input layer (features) → hidden layer → output layer (classification). Graphics from Wikipedia
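      To make the layer picture concrete, here is a tiny forward pass in Ruby with made-up weights; real networks learn their weights from data (e.g., via backpropagation), which is beyond this talk:

        def sigmoid(x)
          1.0 / (1.0 + Math.exp(-x))
        end

        # One forward pass: a weighted sum squashed by a sigmoid at each layer.
        def forward(features, hidden_weights, output_weights)
          hidden = hidden_weights.map do |weights|
            sigmoid(weights.zip(features).sum { |w, f| w * f })
          end
          sigmoid(output_weights.zip(hidden).sum { |w, h| w * h })
        end

        # Two features -> two hidden units -> one classification score in (0, 1).
        forward([1.0, 0.5], [[0.4, -0.6], [0.3, 0.8]], [1.2, -0.7]) # => ~0.54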
  20. Curse of Dimensionality: the more features and labels you have, the more data you need. http://www.iro.umontreal.ca/~bengioy/yoshua_en/research_files/CurseDimensionality.jpg
  21. Overfitting
      • With enough parameters, anything is possible.
      • We want our algorithms to generalize and infer, not memorize specific training examples.
      • Therefore, we test our algorithms on different data than we train them on, as sketched below.
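      A sketch of that train/test discipline in Ruby; `labeled_examples`, `train`, and `predict` are hypothetical placeholders for your dataset and whichever classifier you use:

        # Hold out part of the data, fit only on the training split, and
        # measure accuracy on the held-out test split.
        examples  = labeled_examples.shuffle(random: Random.new(42))
        cutoff    = (examples.size * 0.8).floor
        train_set = examples[0...cutoff]
        test_set  = examples[cutoff..-1]

        model   = train(train_set)                  # fit on training data only
        correct = test_set.count do |features, label|
          predict(model, features) == label         # score on unseen data only
        end
        puts "held-out accuracy: #{correct.to_f / test_set.size}"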
