MTurk > Machine Learning

Machine learning is hard; asking another person's opinion is easy. In this presentation we talk about how Polyvore (www.polyvore.com) uses Amazon's Mechanical Turk to answer questions too hard for the fastest machines and the best classifiers. We also reveal the secret sauce that helped boost our Mturk answer accuracy from 60% to over 90%.

Published in: Technology, Education
Notes
  • Add a polyvore stats slide.
  • The precursor to IBM Deep Blue? Hoax revealed in 1820!
  • Artificial AI (AAI) = APIs can completely hide the human from your computing systems. Crowd-sourced marketplace. Pennies for answers! Questions are asked in the form of HITs (Human Intelligence Tasks). 24/7; 100,000s of flexible workers who could be doing your HITs.
  • Surveys and opinions. Startup idea validation (swayable.com). Validating / training classifiers (twitter sentiment analysis). Gathering data (location, phone numbers, twitter handles from the web). Finding Jim Gray. Validating recommendations generated by your recommendation engines. Keeping porn away from your site. Art (10000 sheep project).
  • Mention crowdflower blog.
  • Not doing so well……. Raison d'etre for CrowdFlower. Uses Crowd Forge. Quora: What are the most crazy mturk uses?
  • Mturk is a fantastic resource for startups! (except if….) Sweet spot of volume and accuracy …… Classify 10000 sites as store: 2-3 weeks of researcher time ($7000+) vs 1 day of Mturk ($500). ….. Possible? Porn / Not porn classification: Hard….. Find the official website of Chanel: Yikes! Perfect logo of Chanel: Impossible
  • Mturk is a fantastic resource for startups! (except if….) Classify 10000 sites as store: 2-3 weeks of researcher time ($7000+) vs 1 day of Mturk ($500). ….. Possible? Porn / Not porn classification: Hard….. Find the official website of Chanel: Yikes! Perfect logo of Chanel: Impossible
  • Mturk is a fantastic resource for startups! (except if….) ML = MACHINE LEARNING. Classify 10000 sites as store: 2-3 weeks of researcher time ($7000+) vs 1 day of Mturk ($500). ….. Possible? Porn / Not porn classification: Hard….. Find the official website of Chanel: Yikes! Perfect logo of Chanel: Impossible
  • The basics.... Designing a complex crowdsourcing solution is hard….there are startups….. Stick to simple tasks, or break tasks into simple tasks. Be a startup (iterate): deploy, gather results, modify…rinse and repeat.
  • Do not block. Rarely reject, esp. for highly qualified workers. Make sure you pay in proportion to the time the tasks take.
  • Make this image bigger
  • 98% acceptance rate.
  • Reading through the docs….. custom qualification tests. Why the golden rule? We always have tests and interviews to select workers, so it should be the same with mturk. Automate: the programmer answers a few questions and the system creates a test for the mturk tasks.
  • Create an “Ask”. Answer a few tricky ones to create the Qualification Test automatically. Upload the HITs with the Qualification Test (you automated it, right?). Go home and watch a rerun of “Game of Thrones”. Come back the next day to get a warm and fuzzy 87+% accuracy.
  • If you do not automate, why bother hiding the human? Bad SOAP API, bad docs. $ASK->approve(ANSWER), $ASK->reject(ANSWER), $ASK->final_answer(), $ASK->grant_bonus(WORKER). All test generation, fetching answers, approving and rejecting answers, formatting etc. are hidden using internal processes and APIs.
  • Create an “Ask”. Answer a few tricky ones. Upload the HITs with the Qualification Test (you automated it, right?). Go home and watch a rerun of “Game of Thrones”. Come back the next day to get a warm and fuzzy 87+% accuracy.
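The notes above describe generating a qualification test automatically from a few questions the programmer has already answered. The talk does not show how Polyvore's internal system does this, so the following is only a minimal Python sketch of the idea. Every name in it (`GoldQuestion`, `build_qualification_test`, `score`, `is_qualified`) and the 0.8 pass mark are assumptions made for illustration, and it does not call the real Mechanical Turk API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GoldQuestion:
    """A question the requester has already answered by hand (hypothetical type)."""
    text: str
    options: tuple
    answer: str


def build_qualification_test(gold):
    """Split hand-answered questions into the test shown to workers
    (no answers included) and a private answer key."""
    questions = [
        {"id": i, "text": g.text, "options": list(g.options)}
        for i, g in enumerate(gold)
    ]
    answer_key = {i: g.answer for i, g in enumerate(gold)}
    return questions, answer_key


def score(answer_key, submission):
    """Fraction of test questions the worker answered correctly."""
    if not answer_key:
        raise ValueError("qualification test has no questions")
    correct = sum(
        1 for qid, expected in answer_key.items()
        if submission.get(qid) == expected
    )
    return correct / len(answer_key)


def is_qualified(answer_key, submission, pass_mark=0.8):
    """Admit a worker only if the score reaches the pass mark.
    The 0.8 default is an assumed value, not a figure from the talk."""
    return score(answer_key, submission) >= pass_mark
```

With four hand-answered "Is this website an e-commerce store?" questions, a worker who gets three right scores 0.75 and would be turned away at the assumed 0.8 pass mark. In a real deployment the questions and answer key would be uploaded as a qualification attached to the HITs.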
Transcript of "MTurk > Machine Learning"

    1. Mturk > Machine Learning
       Bhaskar Rao, Polyvore
    2. What is Polyvore?
       An online fashion community
    3. Discover your style
    4. How Big is Polyvore? (6/24/11)
       1M sets: sets created monthly
       7 minutes: average time on site
       10M visitors: unique visitors to Polyvore monthly
       1.5M clips: images clipped monthly
       12.4%: of Polyvore’s users visit 100+ times monthly
       140M views: pageviews on Polyvore monthly
    5. Polyvores in the Wild: General Behavior
       Collect & Create: Clip from the Internet, organize, tag. Create sets, make collections.
       Consume: Explore, search, browse. Like stuff, leave comments, build social networks.
       Share: Embed in an offsite instance. Get alerts for offsite activity.
    6-10. (no text)
    11. What is the Mechanical Turk?
    12. The Turk (circa 1770)
        Invented in 1770 by Wolfgang von Kempelen.
        The first “machine” that could play chess.
        Beat challengers like Napoleon and Benjamin Franklin.
        Hoax revealed 50 years later.
        Source: wikipedia.com
    13. Amazon Mechanical Turk (circa 2007)
        Artificial AI
        Crowd-sourced marketplace
        HIT = Question
        24/7; 100,000s of on-demand workers
    14. Amazon Mechanical Turk (circa 2007)
        mturk.com
    15. Why Turk?
    16. The power of The Turk.
        Surveys
        Startup idea validation
        Training classifiers
        Gathering data
        Attempt to find Jim Gray.
        Validating recommendations
        Removing porn
        Art
    17. Power of the Turk: Wisdom of Crowds
        Source: crowdflower.com
    18. Power of the Turk: Replacing Journalism?
        mybossisarobot.com
    19. Mturk > ML if (problem == hard OR time == startup)
        Mturk is a fantastic resource for startups!
        Classify 10000 sites as store: 2-3 weeks of researcher time ($7000+) vs 1 day of Mturk ($500). …..
        Porn removal
        Find the official website of Chanel
        Is Chanel a fashion brand?
    20. Mturk > ML if (problem == hard OR time == startup OR….)
        Mturk is a fantastic resource for startups!
        Classify 10000 sites as store: 2-3 weeks of researcher time ($7000+) vs 1 day of Mturk ($500). ….. POSSIBLE
        Porn / Not porn classification
        Find the official website of Chanel
        Perfect logo of Chanel
    21. Mturk > ML if (problem == hard OR time == startup OR….)
        Mturk is a fantastic resource for startups!
        Classify 10000 sites as store: 2-3 weeks of researcher time ($7000+) vs 1 day of Mturk ($500). …..
        Porn / Not porn classification: HARD
        Find the official website of Chanel
        Is Chanel a fashion brand?
    22. Mturk > ML if (problem == hard OR time == startup OR….)
        Mturk is a fantastic resource for startups!
        Classify 10000 sites as store: 2-3 weeks of researcher time ($7000+) vs 1 day of Mturk ($500). …..
        Porn / Not porn classification
        Find the official website of Chanel: YIKES!
        Is Chanel a fashion brand?
    23. Mturk > ML if (problem == hard OR time == startup OR….)
        Mturk is a fantastic resource for startups!
        Classify 10000 sites as store: 2-3 weeks of researcher time ($7000+) vs 1 day of Mturk ($500). …..
        Porn / Not porn classification
        Find the official website of Chanel
        Is Chanel a fashion brand? IMPOSSIBLE?
    24. How to Turk?
        The basics....
        Designing complex crowdsourcing tasks is hard
        Stick to simple tasks
        Iterate
    25. The golden rule
        We are all human…you and I and mturk.
        Say hello at www.turkernation.com
        Get feedback
        Be fair
        Do not get ripped off
    26. Ready, Set, Fire…
        Is this website an e-commerce store?
        Fire 50 questions
        60% accuracy
        FAIL!
        Twitter.com
    27. How to design a HIT?
    28. Supervision needed…..
    29. Retry 50 questions.
        Allow only reputed workers
        New HIT design after feedback
        That should do it, right?
    30. 80%
    31. Better? NO!
        Call a crowdsourcing company?
        Hire an army?
        Write a classifier?
    32. EUREKA: The golden rule REDUX
        Qualification Tests … duh!
        So very overlooked or so very obvious?
        Automate it all.
        Training data for Mturk?
    33. 97%
    34. The process (successful mturk recipe)
        Design a “HIT”
        Iterate on design
        Answer a few tricky ones.
        Upload the HITs
        Go home and drink beer and watch reruns
        Next day -> 87+% accuracy (usually).
    35. Best Practices
        Automate it all …
        $ASK->ask($Question, $Options)
        $ASK->final_answer()
    36. Maximum Awesome
        What happens if you meld a Classifier, Mturk and yourself into an Unholy Q&A System?
        Answer a few questions, and the system self-calibrates.
        NEXT TECH TALK…
    37. Thank You
        Questions?
        bhaskar@polyvore.com
        www.polyvore.com/cgi/about
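Slide 35 names an `ask` / `final_answer` interface but the talk does not say how a final answer is derived from the individual workers' responses. One common approach, in the spirit of the "wisdom of crowds" slide, is a majority vote across several workers. The Python sketch below assumes that approach. The `Ask` class, its `record` method, and the `min_agreement` threshold are hypothetical stand-ins, not Polyvore's `$ASK` implementation.

```python
from collections import Counter


class Ask:
    """Hypothetical stand-in for the talk's $ASK object: one question,
    a fixed set of options, and the answers collected from workers."""

    def __init__(self, question, options):
        self.question = question
        self.options = list(options)
        self.answers = []  # list of (worker_id, answer) pairs

    def record(self, worker_id, answer):
        """Store one worker's answer, rejecting anything outside the options."""
        if answer not in self.options:
            raise ValueError(f"unknown option: {answer!r}")
        self.answers.append((worker_id, answer))

    def final_answer(self, min_agreement=0.5):
        """Return the most common answer if its share of the votes is
        strictly greater than min_agreement, otherwise None.
        Majority voting is an assumption; the talk does not specify it."""
        if not self.answers:
            return None
        counts = Counter(answer for _, answer in self.answers)
        top, votes = counts.most_common(1)[0]
        if votes / len(self.answers) > min_agreement:
            return top
        return None
```

Asking three workers the same question and accepting the answer two of them agree on is the simplest use. Returning `None` on a tie leaves room to send the question to more workers or to a human reviewer instead of guessing.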