Metadata in a Crowd: Shared Knowledge Production

Presentation given at Society of California Archivists Conference, Riverside, May 09, 2009.

Published in: Education, Technology
  • http://photosynth.net/
  • Metadata in a Crowd: Shared Knowledge Production

    1. Metadata in a Crowd
    2. Metadata in a Crowd IN pyxing
    3. Metadata in a Crowd IN pyxing
    4. crowd Metadata in a Crowd
    5. crowd Metadata in a Crowd shared knowledge
    6. Hi, I’m Kevin
    7. Hi, I’m Kevin Rundblad
    8. What is Metadata in a Crowd? INTRO: Real 2.0 vs. Simulated 2.0; MAIN: Simon Says “Tag This Photo”, Human Computation Models
    9. Thinking in 2.0 – Engaging the User
    10. 2.0 is a Culture, Not Tech
    11. 2.0 is a Culture, Not Tech
    12. If the 2.0 idea is not about the user …it is artificial 2.0
    13. Real 2.0 comes directly from users
    14. Traditional development will struggle with 2.0 development, since it is focused on the process, not the user.
    15. Technology only expresses 2.0 ideas
    16. 2.0 is about Creators / Participants
    17. Individuals post valuable info
    18. Many consume it
    19. Expert: Seeks Objective Articulation; Expert: Structured and Precise; Expert: Narrowly Defined Expertise
    20. Social: Very Wide Knowledge; Social: Fragments of Expertise; Social: Tends to be Locally Derived; Social: Subjective Perspective; Social: Emotional Metadata
    21. Social knowledge: Many Perspectives; Social: Very Wide and Fragmented
    22. Is the Crowd a Viable Source of Metadata?
    23. Aren’t Users Lazy?
    24. Um, Yes and Yes
    25. But they do like to play games…
    26. …spending many hours earning points that don’t mean anything
    27. …motivation is key for participation
    28. …and participation often means completing an intelligent task (ok, mindless too)
    29. Human Computation
    30. Human Computation: Finding tasks that humans can do better than computers
    31. Human Computation: Finding tasks that humans can do better than computers, and creating motivation to perform them
    32. Human Computation and P2P Model
    33. Human Computation and P2P Model
    34. Human Computation and P2P Model
    35. Human Computation and P2P Model: BitTorrent P2P Architecture
    36. Human Computation and P2P Model: One Video File requested (BitTorrent P2P Architecture)
    37. Human Computation and P2P Model: One Video File requested; many systems each delivering parts of the file, until the entire file is complete (BitTorrent P2P Architecture)
    38. Socially formed knowledge is like the P2P distribution architecture
    39. Social Knowledge - Human Computation: Content formed from many individuals; many individuals contributing parts of the content
    40. But “…unlike computer processors, humans require some incentive to become part of a collective computation.” Luis von Ahn. Source: invited talk at CMU on Human Computation
    41. Models of Human Computation, AKA Human Intelligence Tasks (HITs) (Coined by Amazon)
    42. Human Intelligence Tasks – 3 Types: Socially motivated; Economically motivated; Tacit (user may be unaware of the task they perform)
    43. Human Intelligence Tasks (social) (or social knowledge systems): Tagging (Flickr) and Comments (Amazon); Ratings (Yelp); Problem Solving (StackOverflow)
    44. Tagging @ flickr
    45. Tagging @ flickr: Wider Discovery; Public Tagging
    46. Ratings @ Yelp: Social is local. Human Intelligence Tasks (social): Tagging (Flickr); Ratings (Yelp); Comments (Amazon); Problem Solving (StackOverflow)
    47. Ratings @ Yelp: Social is local; Efficient Discovery
    48. Ratings @ Yelp: Social is local; Efficient Discovery; “Hearsay” becomes Reliable
    49. Ratings @ Yelp: Social is local; Efficient Discovery; “Hearsay” becomes Reliable; Marketing becoming less credible
    50. Ratings @ Yelp: Social is local; Efficient Discovery; “Hearsay” becomes Reliable; Marketing becoming less credible; Many perspectives create trustworthiness
    51. Ratings @ Yelp: Social is local
    52. Problem Solving @ stackoverflow
    53. Problem Solving @ stackoverflow
    54. Problem Solving @ stackoverflow: Motivation to Participate
    55. Human Intelligence Tasks (economic): Mechanical Turk (Amazon) - small tasks $0 - $10; TextEagle (Nathan Eagle) – tasks for mobile phone
    56. Small Tasks @ Mechanical Turk
    57. Small Tasks @ Mechanical Turk: Small Tasks; Time vs. Economics (does not add up)
    58. Small Tasks @ TextEagle: Mobile Human Computation in Africa. TextEagle is a… “system enabling the 3 billion mobile phone subscribers living in the developing world to earn small amounts of money by completing short, SMS-based tasks.” Nathan Eagle, Research Scientist, MIT. Source: http://web.media.mit.edu/~nathan/
    59. Small Tasks @ TextEagle: Translation/Transcription Services. Question: Translate the phrase “Address Book” into Giriama. Question: Transcribe the following audio clip from a New York hospital. Also Citizen Journalism
    60. Human Intelligence Tasks (tacit): User may not be aware of how tasks are utilized. reCAPTCHA (based on CAPTCHA); ESP Game (now Google Image Labeler)
    61. CAPTCHA: “Are you human?” (CAPTCHA: Developed by Luis von Ahn)
    62. CAPTCHA (Completely Automated Public Turing Test To Tell Computers and Humans Apart): Security based on human perception; Turing Test administered by AI (CAPTCHA: Developed by Luis von Ahn)
    63. reCAPTCHA: Security + Failed OCR = Opportunity (CAPTCHA: Developed by Luis von Ahn)
    64. reCAPTCHA: Currently being used to help scan books for Internet Archive and old editions of the New York Times. (graphic - http://recaptcha.net/learnmore.html)
    65. reCAPTCHA: Bots getting better at deciphering the CAPTCHAs (graphic - http://recaptcha.net/learnmore.html)
    66. reCAPTCHA: Bots getting better at deciphering the CAPTCHAs. Creates feedback loop - Good thing, since it means OCR gets more precise at the same time (graphic - http://recaptcha.net/learnmore.html)
    67. ESP Game (now Google Image Labeler): 2 individuals match = high probability of reliable result
    68. With Human Computation… …computing becomes the coordinating force between many individuals and intelligent or perceptual tasks.
    69. Human Computation Models. Social: Tagging (Flickr) and Comments (Amazon); Ratings (Yelp); Problem Solving (StackOverflow). Economic: Mechanical Turk (Amazon) - small tasks $0 - $10; TextEagle (Nathan Eagle) – tasks for mobile phone. Tacit: reCAPTCHA (based on CAPTCHA); ESP Game (now Google Image Labeler)
    70. How can you create a platform for capturing social knowledge?
    71. Shared Knowledge Production: Public platforms – ex. Flickr; Develop platform – ex. Simul8 Model
    72. UCLA Library Simul8 Group Model: Listen to users – experience paradigm; See user mode, lifestyle, aesthetic; Engage in their mode – playful and experimental dev; Fast prototyping with high level programming; Student developers and designers; Creative Commons License
    73. One Question: Are You Human? Kevin Rundblad kevinrundblad@library.ucla.edu
    74. One Question: Are You Human? Kevin Rundblad kevinrundblad@library.ucla.edu
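
The agreement rule from the ESP Game slide ("2 individuals match = high probability of reliable result") can be illustrated with a small sketch. This is not the ESP Game's or Google Image Labeler's actual code. The function names `agreed_labels` and `promote_labels`, the lowercase normalization, and the `min_pairs` threshold are all assumptions made for illustration.

```python
from collections import Counter


def agreed_labels(guesses_a, guesses_b):
    """Return the labels that two independent players both typed.

    Guesses are normalized (trimmed, lowercased) before comparison; that
    normalization is an assumption of this sketch. Results keep the order
    in which player A typed them, without repeats.
    """
    seen_b = {g.strip().lower() for g in guesses_b}
    result = []
    for guess in guesses_a:
        label = guess.strip().lower()
        if label in seen_b and label not in result:
            result.append(label)
    return result


def promote_labels(rounds, min_pairs=2):
    """Keep labels agreed on by at least `min_pairs` independent pairs.

    `rounds` is a list of agreed-label lists, one per pair of players.
    The threshold is hypothetical; a real system would tune it.
    """
    counts = Counter(label for agreed in rounds for label in set(agreed))
    return sorted(label for label, n in counts.items() if n >= min_pairs)


if __name__ == "__main__":
    round1 = agreed_labels(["Dog", "park", "grass"], ["grass ", "dog", "ball"])
    round2 = agreed_labels(["puppy", "dog"], ["dog", "animal"])
    print(round1, round2, promote_labels([round1, round2]))
```

A single match between two strangers is already unlikely to be noise. Requiring the same label from several pairs, as `promote_labels` does, is one simple way to raise confidence further before storing the label as metadata.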
