Machine Learning in JavaScript
Jamison Dance, i.tv, @jergason, http://jamisondance.com
Smart?
Smart
Curious
Two years
• Secret sauce at SpotterRF
• StackOverflow.com
• Random side projects
What is it?
Math
“Making computers modify or adapt their actions so these actions get more accurate.” - Stephen Marsland
Teaching computers to recognize patterns
Why do you care?
An avalanche of data
“The purpose of computing is insight, not numbers.” - Richard Hamming
Why JavaScript?
Atwood’s Law
Naive Bayes and puppies
Spam Filtering?
Illustrious sir/madame,
I have recently acquired a bounteous cache of 7 million semicolons. They can all be yours if you send a money order for 30 semicolons.
Most graciously,
A spammer
An idea
• Count up all words in all spam
• Count up all words in not-spam
• Compare counts to words in new documents
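The three steps of the idea can be sketched in a few lines of JavaScript. This is a minimal illustration, not code from the talk: the `tokenize`, `countWords`, and `compareCounts` helpers and the tiny training set are all made up for the example.

```javascript
// Split text into lowercase word tokens (a deliberately simple tokenizer).
function tokenize(text) {
  return text.toLowerCase().match(/[a-z0-9']+/g) || [];
}

// Count how often each word appears across a list of documents.
function countWords(documents) {
  const counts = new Map();
  for (const doc of documents) {
    for (const word of tokenize(doc)) {
      counts.set(word, (counts.get(word) || 0) + 1);
    }
  }
  return counts;
}

// Hypothetical training data, for illustration only.
const spam = ["Send a money order for semicolons", "Cheap semicolons, send money"];
const notSpam = ["Lunch order is ready", "Can you send the meeting notes"];

const spamCounts = countWords(spam);
const notSpamCounts = countWords(notSpam);

// Look up the spam and not-spam counts for each word of a new document.
function compareCounts(text) {
  return tokenize(text).map((word) => ({
    word,
    spam: spamCounts.get(word) || 0,
    notSpam: notSpamCounts.get(word) || 0,
  }));
}

console.log(compareCounts("send semicolons"));
```

A word that shows up often in spam and rarely elsewhere is evidence for spam. Turning that intuition into a decision is what the rest of the deck covers.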
Class: Spam / Not Spam
            Word 1   Word 2   Word 3   Word 4
Spam          50       15       33        4
Not Spam      45       27       14       55
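One way to read the counts from these slides is as estimates of P(word | class). The sketch below is not from the talk. It assumes, purely for illustration, that the four words are the whole vocabulary, so each class total is the sum of its four counts. The function name `wordProbabilities` is hypothetical.

```javascript
// Word counts copied from the slides.
const counts = {
  spam:    { word1: 50, word2: 15, word3: 33, word4: 4 },
  notSpam: { word1: 45, word2: 27, word3: 14, word4: 55 },
};

// Estimate P(word | class) as count / total words seen in that class.
// Assumption: these four words are the entire vocabulary.
function wordProbabilities(classCounts) {
  const total = Object.values(classCounts).reduce((sum, n) => sum + n, 0);
  const probs = {};
  for (const [word, n] of Object.entries(classCounts)) {
    probs[word] = n / total;
  }
  return probs;
}

const pWordGivenSpam = wordProbabilities(counts.spam);       // total is 102
const pWordGivenNotSpam = wordProbabilities(counts.notSpam); // total is 141

console.log(pWordGivenSpam.word1);    // 50 / 102
console.log(pWordGivenNotSpam.word4); // 55 / 141
```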
Emergency! Puppies!
Bayes Theorem
P(A|B) = (P(B|A) * P(A)) / P(B)
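As a quick numeric check of the theorem, here is a small sketch. The probabilities are invented for illustration (they are not from the talk), and P(B) is expanded with the law of total probability so the function only needs three inputs. The name `posterior` is hypothetical.

```javascript
// Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B),
// with P(B) = P(B|A) * P(A) + P(B|not A) * P(not A).
function posterior(pBGivenA, pA, pBGivenNotA) {
  const pB = pBGivenA * pA + pBGivenNotA * (1 - pA);
  return (pBGivenA * pA) / pB;
}

// Hypothetical numbers: a word appears in 80% of spam and 10% of
// non-spam, and 30% of all mail is spam.
const pSpamGivenWord = posterior(0.8, 0.3, 0.1);
console.log(pSpamGivenWord.toFixed(3)); // 0.24 / 0.31
```

Seeing the word moves the estimate from the 30% prior to roughly 77%, which is the kind of update the classifier makes for every word.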
In English
P(class|email) = (P(email|class) * P(class)) / P(word1 and word2 and word3)
P(class|email) = (P(email|class) * P(class)) / P(email)
“Spam” or “Not Spam”
P(class|email) = (P(email|class) * P(class)) / P(email)
Words in the email
P(class|email) = (P(email|class) * P(class)) / P(email)
What we think the probability of spam or not spam is
P(class|email) = (P(email|class) * P(class)) / P(email)
P(spam|email) = (P(email|spam) * P(spam)) / P(email)
P(not spam|email) = (P(email|not spam) * P(not spam)) / P(email)
Pick the largest one
The P(email) denominators are the same, so drop them
P(spam|email) ∝ P(email|spam) * P(spam)
P(not spam|email) ∝ P(email|not spam) * P(not spam)
Assume P(spam) and P(not spam) are the same, so drop them too
P(spam|email) ∝ P(email|spam)
P(not spam|email) ∝ P(email|not spam)
P(words|spam) = P(word1|spam) * P(word2|spam) * . . . * P(word_n|spam)
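The product over words can be sketched as a tiny classifier. This is an illustration under stated assumptions, not the speaker's code. It adds two things the slides do not mention: log-probabilities are summed instead of multiplying raw probabilities (to avoid underflow on long emails), and add-one smoothing keeps an unseen word from zeroing out a class. Priors are treated as equal, as on the slides. All function names and training examples are hypothetical.

```javascript
function tokenize(text) {
  return text.toLowerCase().match(/[a-z0-9']+/g) || [];
}

// Train: count words per class and track the shared vocabulary.
function train(examples) {
  const model = { counts: {}, totals: {}, vocab: new Set() };
  for (const { text, label } of examples) {
    model.counts[label] = model.counts[label] || new Map();
    model.totals[label] = model.totals[label] || 0;
    for (const word of tokenize(text)) {
      model.counts[label].set(word, (model.counts[label].get(word) || 0) + 1);
      model.totals[label] += 1;
      model.vocab.add(word);
    }
  }
  return model;
}

// log P(words | label) = sum of log P(word_i | label), with add-one smoothing.
function logLikelihood(model, text, label) {
  const counts = model.counts[label];
  const denom = model.totals[label] + model.vocab.size;
  let sum = 0;
  for (const word of tokenize(text)) {
    sum += Math.log(((counts.get(word) || 0) + 1) / denom);
  }
  return sum;
}

// Pick the class with the largest likelihood (equal priors assumed).
function classify(model, text) {
  let best = null;
  let bestScore = -Infinity;
  for (const label of Object.keys(model.counts)) {
    const score = logLikelihood(model, text, label);
    if (score > bestScore) {
      best = label;
      bestScore = score;
    }
  }
  return best;
}

// Hypothetical training data, for illustration only.
const model = train([
  { text: "send a money order for semicolons", label: "spam" },
  { text: "bounteous cache of semicolons", label: "spam" },
  { text: "meeting notes for the project", label: "not spam" },
  { text: "lunch order for the team", label: "not spam" },
]);

console.log(classify(model, "money for semicolons")); // spam
console.log(classify(model, "project meeting"));      // not spam
```

If the classes are not equally common, adding the log of each class prior to the score restores the P(class) term that the slides drop.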
Emergency! Kittens!
In The Wild
Everyone Hates Hacker News
Mostly Crap
DRAMA: Yehuda Katz hurt my feelings • rails sucks • node rules • lol
Some Good Stuff
Automatically Find The Good Stuff
Step 1: Gather Data
scraping with jsdom
storage with mongoose
An aside
90% data prep, 10% learning
naive bayes with credulous
Like / Dislike
Word 1 . . . Word n   username   hostname
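The feature layout on this slide (title words plus username and hostname) could be produced by something like the sketch below. The post shape `{ title, username, url }`, the `extractFeatures` name, and the `word:` / `user:` / `host:` prefixes are all assumptions made for illustration. The talk does not show how the features were actually encoded.

```javascript
// Turn one post into a flat list of feature tokens:
// title words, plus the submitter's username and the link's hostname.
// The post shape ({ title, username, url }) is a hypothetical example.
function extractFeatures(post) {
  const words = post.title.toLowerCase().match(/[a-z0-9']+/g) || [];
  const features = words.map((w) => "word:" + w);
  features.push("user:" + post.username);
  try {
    features.push("host:" + new URL(post.url).hostname);
  } catch (err) {
    // Posts without a valid link (e.g. text posts) get no hostname feature.
  }
  return features;
}

const post = {
  title: "Machine Learning in JavaScript",
  username: "example_user",
  url: "http://example.com/post",
};
console.log(extractFeatures(post));
```

Prefixing each token keeps a username from colliding with an identical title word, so a classifier can weigh them separately.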
programmer ui with node
recommend posts from HN api
In The Browser
Published in: Technology, News & Politics