MTurk > Machine Learning

Machine learning is hard; asking another person's opinion is easy. In this presentation we talk about how Polyvore (www.polyvore.com) uses Amazon's Mechanical Turk to answer questions too hard for the fastest machines and the best classifiers. We also reveal the secret sauce that helped boost our MTurk answer accuracy from 60% to over 90%.

  • Add a Polyvore stats slide.
  • The precursor to IBM Deep Blue? Hoax revealed in 1820!
  • Artificial AI (AAI) = APIs can completely hide the human from your computing systems.
    Crowd-sourced marketplace. Pennies for answers!
    Questions are asked in the form of HITs (Human Intelligence Tasks).
    24/7; 100,000s of flexible workers who could be doing your HITs.
  • Surveys and opinions.
    Startup idea validation (swayable.com).
    Validating / training classifiers (Twitter sentiment analysis).
    Gathering data (location, phone numbers, Twitter handles from the web).
    Finding Jim Gray.
    Validating recommendations generated by your recommendation engines.
    Keeping porn away from your site.
    Art (10000 sheep project).
  • Mention the CrowdFlower blog.
  • Not doing so well…
    Raison d’être for CrowdFlower.
    Uses CrowdForge.
    Quora: What are the craziest MTurk uses?
  • MTurk is a fantastic resource for startups! (except if…) ML = machine learning.
    Sweet spot of volume and accuracy…
    Classify 10000 sites as store: 2-3 weeks of researcher time ($7000+) vs. 1 day of MTurk ($500): possible?
    Porn / not porn classification: hard…
    Find the official website of Chanel: yikes!
    Perfect logo of Chanel: impossible.
  • The basics…
    Designing a complex crowdsourcing solution is hard… there are startups…
    Stick to simple tasks, or break tasks into simple tasks.
    Be a startup (iterate): deploy, gather results, modify… rinse and repeat.
  • Do not block.
    Rarely reject, especially for highly qualified workers.
    Make sure you pay in proportion to the time the tasks take.
  • Make this image bigger
  • 98% acceptance rate.
  • Reading through the docs… custom qualification tests.
    Why the golden rule? We always use tests and interviews to select workers, so it should be the same with MTurk.
    Automate: the programmer answers a few questions and the system creates a test for the MTurk tasks.
  • Create an “Ask”.
    Answer a few tricky ones, to create the qualification test automatically.
    Upload the HITs with the qualification test (you automated it, right?).
    Go home and watch a rerun of “Game of Thrones”.
    Come back the next day to a warm and fuzzy 87+% accuracy.
  • If you do not automate, why bother hiding the human?
    Bad SOAP API, bad docs.
    $ASK->approve(ANSWER)
    $ASK->reject(ANSWER)
    $ASK->final_answer()
    $ASK->grant_bonus(WORKER)
    All test generation, answer fetching, approving and rejecting answers, formatting, etc. are hidden behind internal processes and APIs.

MTurk > Machine Learning Presentation Transcript

  • MTurk > Machine Learning
    Bhaskar Rao, Polyvore
  • What is Polyvore?
    An online fashion community
  • Discover your style
  • How Big is Polyvore? (6/24/11)
    1M sets: sets created monthly
    7 minutes: average time on site
    10M visitors: unique visitors to Polyvore monthly
    1.5M clips: images clipped monthly
    12.4%: of Polyvore’s users visit 100+ times monthly
    140M views: pageviews on Polyvore monthly
  • Polyvores in the Wild
    General Behavior
    Collect & Create: clip from the Internet, organize, tag; create sets, make collections.
    Consume: explore, search, browse; like stuff, leave comments, build social networks.
    Share: embed in an offsite instance; get alerts for offsite activity.
  • What is the Mechanical Turk?
  • The Turk (circa 1770)
    Invented in 1770 by Wolfgang von Kempelen.
    The first “machine” that could play chess.
    Beat challengers like Napoleon and Benjamin Franklin.
    Hoax revealed 50 years later.
    wikipedia.com
  • Amazon Mechanical Turk (circa 2007)
    Artificial AI
    Crowd-sourced marketplace
    HIT = Question
    24/7; 100,000s of on-demand workers
  • Amazon Mechanical Turk (circa 2007)
    mturk.com
  • Why Turk?
  • The power of The Turk.
    Surveys
    Startup idea validation
    Training Classifiers
    Gathering data
    Attempt to find Jim Gray.
    Validating recommendations
    Removing Porn
    Art
  • Power of the Turk: Wisdom of Crowds
    Source – crowdflower.com
  • Power of the Turk: Replacing Journalism?
    mybossisarobot.com
  • MTurk > ML if (problem == hard OR time == startup)
    MTurk is a fantastic resource for startups!
    Classify 10000 sites as store: 2-3 weeks of researcher time ($7000+) vs. 1 day of MTurk ($500)…
    Porn removal
    Find the official website of Chanel
    Is Chanel a fashion brand?
  • MTurk > ML if (problem == hard OR time == startup OR …)
    MTurk is a fantastic resource for startups!
    Classify 10000 sites as store: 2-3 weeks of researcher time ($7000+) vs. 1 day of MTurk ($500)… POSSIBLE
    Porn / not porn classification
    Find the official website of Chanel
    Perfect logo of Chanel
  • MTurk > ML if (problem == hard OR time == startup OR …)
    MTurk is a fantastic resource for startups!
    Classify 10000 sites as store: 2-3 weeks of researcher time ($7000+) vs. 1 day of MTurk ($500)…
    Porn / not porn classification: HARD
    Find the official website of Chanel
    Is Chanel a fashion brand?
  • MTurk > ML if (problem == hard OR time == startup OR …)
    MTurk is a fantastic resource for startups!
    Classify 10000 sites as store: 2-3 weeks of researcher time ($7000+) vs. 1 day of MTurk ($500)…
    Porn / not porn classification
    Find the official website of Chanel: YIKES!
    Is Chanel a fashion brand?
  • MTurk > ML if (problem == hard OR time == startup OR …)
    MTurk is a fantastic resource for startups!
    Classify 10000 sites as store: 2-3 weeks of researcher time ($7000+) vs. 1 day of MTurk ($500)…
    Porn / not porn classification
    Find the official website of Chanel
    Is Chanel a fashion brand? IMPOSSIBLE?
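The cost comparison on these slides can be worked through per site. The short Python sketch below takes the slides' figures at face value (10000 sites, $7000 as the lower bound for researcher time, $500 for MTurk); the variable names are ours, and the real researcher cost is "7000$+", so the ratio is a floor rather than an exact number.

```python
# Figures quoted on the slides; researcher cost is a lower bound ("$7000+").
sites = 10_000
researcher_cost = 7_000   # dollars, 2-3 weeks of researcher time
mturk_cost = 500          # dollars, 1 day of MTurk

per_site_researcher = researcher_cost / sites
per_site_mturk = mturk_cost / sites
ratio = researcher_cost / mturk_cost

print(f"researcher: ${per_site_researcher:.2f} per site")
print(f"mturk:      ${per_site_mturk:.2f} per site")
print(f"MTurk is at least {ratio:.0f}x cheaper on these numbers")
```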
  • How to Turk?
    The basics…
    Designing complex crowdsourcing tasks is hard
    Stick to simple tasks
    Iterate
  • The golden rule
    We are all human… you and I and MTurk.
    Say hello at www.turkernation.com
    Get feedback
    Be fair
    Do not get ripped off
  • Ready, Set, Fire…
    Is this website an e-commerce store?
    Fire 50 questions
    60% accuracy
    FAIL!
    Twitter.com
  • How to design a HIT?
  • Supervision needed…
  • Retry 50 questions.
    Allow only reputed workers
    New HIT design after feedback
    That should do it, right?
  • 80%
  • Better? NO!
    Call a crowdsourcing company?
    Hire an army?
    Write a classifier?
  • EUREKA – The golden rule REDUX
    Qualification tests… duh!
    So very overlooked or so very obvious?
    Automate it all.
    Training data for MTurk?
  • 97%
  • The process (successful MTurk recipe)
    Design a “HIT”
    Iterate on the design
    Answer a few tricky ones.
    Upload the HITs
    Go home, drink beer and watch reruns
    Next day: 87+% accuracy (usually).
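One way to picture the "answer a few tricky ones" step: the requester's own answers become a small qualification test, and only workers who score above a cutoff get to see the real HITs. The Python below is a minimal sketch under that assumption. The function names, the 0.8 cutoff and the sample data are all hypothetical, nothing here calls the MTurk API, and the deck does not say how Polyvore actually graded its tests.

```python
def score(gold, worker_answers):
    """Fraction of the requester's gold questions the worker got right."""
    correct = sum(1 for q, a in gold.items() if worker_answers.get(q) == a)
    return correct / len(gold)


def qualified_workers(gold, submissions, cutoff=0.8):
    """Ids of workers whose test score meets the (hypothetical) cutoff."""
    return sorted(
        worker for worker, answers in submissions.items()
        if score(gold, answers) >= cutoff
    )


# Gold answers supplied by the requester for "Is this website an e-commerce store?"
gold = {"q1": "yes", "q2": "no", "q3": "no", "q4": "yes", "q5": "no"}

# Made-up test submissions from three workers.
submissions = {
    "w1": {"q1": "yes", "q2": "no", "q3": "no", "q4": "yes", "q5": "no"},
    "w2": {"q1": "yes", "q2": "yes", "q3": "no", "q4": "no", "q5": "no"},
    "w3": {"q1": "yes", "q2": "no", "q3": "no", "q4": "yes", "q5": "yes"},
}

print(qualified_workers(gold, submissions))
```

With this sample data, w1 and w3 pass the cutoff and w2 does not.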
  • Best Practices
    Automate it all…
    $ASK->ask($Question, $Options)
    $ASK->final_answer()
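The deck does not show what sits behind `$ASK->ask(...)` and `$ASK->final_answer()`. As a rough illustration of the idea only, the Python sketch below assumes a hypothetical in-memory `Ask` class that collects several workers' answers to one question and returns the majority vote. It is not Polyvore's implementation and makes no MTurk calls; the majority-vote rule, the `record` method and the tie handling are our assumptions.

```python
from collections import Counter


class Ask:
    """Hypothetical stand-in for the deck's $ASK object (no real MTurk calls)."""

    def __init__(self, question, options):
        self.question = question
        self.options = list(options)
        self.answers = {}      # worker_id -> chosen option
        self.rejected = set()  # worker_ids whose answers were rejected

    def record(self, worker_id, option):
        # A real system would fetch this from the marketplace instead.
        if option not in self.options:
            raise ValueError(f"unknown option: {option!r}")
        self.answers[worker_id] = option

    def reject(self, worker_id):
        self.rejected.add(worker_id)

    def final_answer(self):
        # Majority vote over non-rejected answers; None on no votes or a tie.
        votes = Counter(
            option for worker, option in self.answers.items()
            if worker not in self.rejected
        )
        ranked = votes.most_common(2)
        if not ranked:
            return None
        if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
            return None
        return ranked[0][0]


ask = Ask("Is this website an e-commerce store?", ["yes", "no"])
ask.record("w1", "yes")
ask.record("w2", "yes")
ask.record("w3", "no")
print(ask.final_answer())
```

Returning `None` on a tie leaves room for the caller to ask more workers rather than guess.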
  • Maximum Awesome
    What happens if you meld a classifier, MTurk and yourself into an Unholy Q&A System?
    Answer a few questions, and the system self-calibrates.
    NEXT TECH TALK…
  • Thank You
    Questions?
    bhaskar@polyvore.com
    www.polyvore.com/cgi/about