許永真/Crowd Computing for Big and Deep AI

Jane Hsu
Computer Science & Information Engineering
Intel-NTU Connected Context Computing
National Taiwan University
Crowd Computing for Big and Deep AI
DRAFT

Deep learning is amazing; however,…
window
kitty
potted plant

Deep Learning Requires Big and Labelled Data

The truth is…
most data are messy and disorganized!

Key Challenges of Labelling Big Data
Scalability
Label Quality
- High cost of obtaining a large number of high-quality labels
- Lack of qualified annotators
- Human bias
- Concept ambiguity

http://image-net.org/index
14,197,122 images, 21,841 synsets indexed
[Deng et al., 2009]

Wisdom of Crowds
Vox Populi by Sir Francis Galton [1907]
Nature, No. 1949, Vol. 75, 450-451
Weight Judging Competition
West of England Fat Stock & Poultry Exhibition
787 votes
Middlemost vote: 1207 lb
Actual weight: 1198 lb
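
The "middlemost vote" is the median of the crowd's estimates. A minimal Python sketch of that aggregation follows. The list of guesses is made up for illustration (the 787 original ballots are not reproduced here); only the two weights in the last step come from the slide, and the function names are hypothetical.

```python
from statistics import median

def crowd_estimate(guesses):
    """Aggregate independent guesses by taking the middlemost (median) vote."""
    return median(guesses)

def relative_error(estimate, actual):
    """Signed relative error of an estimate with respect to the true value."""
    return (estimate - actual) / actual

# Hypothetical ballots in pounds (illustrative only, not Galton's data).
sample_guesses = [1100, 1150, 1180, 1205, 1210, 1250, 1400]
sample_median = crowd_estimate(sample_guesses)

# Figures quoted on the slide: middlemost vote 1207 lb, actual weight 1198 lb.
galton_error = relative_error(1207, 1198)

print(f"sample median: {sample_median} lb")
print(f"relative error of the 1907 median: {galton_error:.2%}")
```

The median is used rather than the mean because a single extreme ballot (such as the 1400 lb guess in the sample) barely moves it.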

http://www.flickr.com/photos/jlmiller/35599670

Here is the original image
Here are all clicks received
Here is the consensus
NASA’s Clickworkers (2000)
- NASA showed that public volunteers can perform science tasks that
would normally require months of work by scientists or graduate
students
- Between Nov. 2000 and Jan. 2002, 101,000 clickworkers contributed
14,000 work hours, 612,832 sessions, and 2,378,820 crater entries
http://nasaclickworkers.com/classic/age-maps.html

ESP Game [von Ahn and Dabbish, 2006]

Foldit: Solve Puzzles for Science
[Cooper et al., 2010]
http://fold.it
Predicting protein structures with a multiplayer online game

Amazon Mechanical Turk [Nov. 2005 to now]

“Crowdsourcing represents the act of a company or institution
taking a function once performed by employees and outsourcing
it to an undefined (and generally large) network of people in the
form of an open call.”
- Jeff Howe (2006)

What is crowdsourcing?
Crowd + Outsourcing

Timeline (2000 to 2016)
Series: DBLP search results for "crowd" and "crowdsourcing" [Data from DBLP]
http://dblp.uni-trier.de/search?q=crowd
http://dblp.uni-trier.de/search?q=crowdsourcing
Milestones marked on the timeline: NASA Clickworkers (2000), Wikipedia (2001), ESP game, Google Image Labeler, Foldit, Crowdsourcing, AMTurk, CrowdFlower, Community sourcing, Learnersourcing, HCOMP workshop, HCOMP conference, Umati, ToolScape, Zensors, Soylent

Research Challenges
Crowd algorithms
Incentives and Quality (money, fun, feedback)
Crowd-powered System

New Programming Languages and Concepts
TurKit: Human computation algorithms on Mechanical Turk
[Little et al., UIST 2010]

Crowd Algorithms
Find-Fix-Verify
[Bernstein et al., 2010]
Price-Divide-Solve
[Kulkarni et al., 2012]
Iterative and Parallel Process
[Little et al., 2010]
Context Trees
[Verroios and Bernstein, 2014]
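
A minimal sketch of how a Find-Fix-Verify style workflow [Bernstein et al., 2010] could be wired together, assuming crowd workers are represented by plain Python callables. The function name `find_fix_verify`, the toy workers, the `agreement` threshold, and the "vote for the preferred candidate" rule in the Verify step are illustrative assumptions, not the paper's implementation.

```python
from collections import Counter

def find_fix_verify(sentences, find_workers, fix_workers, verify_workers,
                    agreement=0.2):
    """Run three independent worker groups over a list of sentences."""
    # Find: each worker returns the set of sentence indices that need work.
    flagged = Counter()
    for worker in find_workers:
        flagged.update(worker(sentences))
    # Keep only patches that enough Find workers agree on.
    patches = [i for i, votes in flagged.items()
               if votes / len(find_workers) >= agreement]

    result = list(sentences)
    for i in sorted(patches):
        # Fix: a separate group proposes candidate rewrites for the patch.
        candidates = [worker(result[i]) for worker in fix_workers]
        # Verify: a third group votes for the candidate index it prefers
        # (a simplification of rejecting poor rewrites).
        votes = Counter(worker(result[i], candidates)
                        for worker in verify_workers)
        best_index, _ = votes.most_common(1)[0]
        result[i] = candidates[best_index]
    return result

# Toy stand-ins for crowd workers (deterministic, hypothetical).
def finder_long(sentences):
    return {i for i, s in enumerate(sentences) if len(s.split()) > 8}

def finder_none(sentences):
    return set()

def fixer_drop_filler(sentence):
    return sentence.replace(" really very", "")

def fixer_identity(sentence):
    return sentence

def verifier_prefers_shortest(original, candidates):
    return min(range(len(candidates)), key=lambda k: len(candidates[k]))

text = ["Crowds can edit text.",
        "This sentence is really very much longer than it needs to be."]
edited = find_fix_verify(
    text,
    find_workers=[finder_long, finder_none],
    fix_workers=[fixer_drop_filler, fixer_identity],
    verify_workers=[verifier_prefers_shortest] * 3,
)
print(edited)
```

Splitting the work into three stages with different workers keeps any single worker from both proposing and approving a change, which is the quality-control idea the pattern name refers to.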

Demographics of Mechanical Turk: Countries
- 70% US
- 20% India
- 10% Others
http://demographics.mturk-tracker.com

Demographics of Mechanical Turk: Gender
- US: #female > #male (55% vs. 45%)
- India: #male > #female (75% vs. 25%)

Motivations [Antin and Shaw, 2012]
Ask workers: “I am motivated to do HITs on Mechanical Turk...”
- to kill time
- to make extra money
- for fun
- because it gives me a sense of purpose

Incentives
- Does paying more money produce better work?
- More work, but not higher-quality work
[Mason and Watts, 2009]
- Does feedback produce better work?
- Self-assessment and expert assessment both
improve the quality of work
[Dow et al., 2011]

Crowd-Powered System
Embed crowd intelligence inside of user interfaces
and applications

VizWiz [Bigham et al., 2010]
To help blind people, VizWiz offers a new way to answer visual
questions in nearly real time: asking multiple people on the web.

Crowd-Powered System
Soylent: a word processor with a crowd inside [Bernstein et al., 2010]
- Shortn: a text shortening service
- Crowdproof: a human-powered spelling and grammar checker
- The Human Macro: an interface for offloading arbitrary word processing tasks
Scribe: real-time captioning by non-experts [Lasecki et al., 2012]
ToolScape: extracting step-by-step information from how-to videos [Kim et al., 2014]

VoiceTranscriber
Crowd-powered Oral Narrative Summarization System

The Future of Crowd Work [Kittur et al., 2013]
http://www.slideshare.net/mbernst/the-future-of-crowd-work

Crowd-in-the-loop
collaboration

Struggling with creative tasks
Writing
Explanation
Mobile UI Design

[Figure: photo gallery mockups (Camera Roll, Albums, Places) annotated with the values 0.8, 0.77, 0.5, and 0.45]

Mobile design mining from mockup
Photo gallery
- show multiple images
Mockup regions: navigation bar, grid view (images), tab bar
Design structure
- root: navigation bar, grid view, tab bar
- grid view: grid items, each holding an image
- other nodes: buttons, text, image
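
The design structure is a tree of UI elements. The sketch below shows one way such a tree might be represented, assuming a simple `Node` class (a hypothetical name, not from the slides). Only the root, navigation bar, grid view, tab bar, and the grid items with their images are modeled; the button and text nodes are left out because their parents are not recoverable from the extracted slide text.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    """One UI element in a design structure tree."""
    kind: str
    children: List["Node"] = field(default_factory=list)

def leaves(node):
    """Yield the kinds of all leaf elements, left to right."""
    if not node.children:
        yield node.kind
    for child in node.children:
        yield from leaves(child)

def depth(node):
    """Number of levels in the tree, counting the root as level 1."""
    return 1 + max((depth(child) for child in node.children), default=0)

# Partial photo gallery structure (assumed shape, see the note above).
gallery = Node("root", [
    Node("navigation bar"),
    Node("grid view", [
        Node("grid item", [Node("image")]),
        Node("grid item", [Node("image")]),
    ]),
    Node("tab bar"),
])

print(list(leaves(gallery)))
print(depth(gallery))
```

A tree like this lets mockups be compared by structure (which containers hold which elements) rather than by pixels.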

Crowdsourcing Workflow
Design Example (Mockup) → Find → Draw → Verify → Design Structure
Crowdsourcing Platform (Amazon Mechanical Turk)

Crowd-Powered Rewriting Aid
Prewriting Writing
Rewriting
Feedback

Crowd Collaboration
Individual: work individually
Sequential: work individually but see others’ feedback
Simultaneous: paired workers work simultaneously and
communicate immediately

Computers may outsource human-friendly tasks to people!

References
[1] M. S. Bernstein, G. Little, R. C. Miller, B. Hartmann, M. S. Ackerman, D. R. Karger, D. Crowell, and K. Panovich. Soylent:
a word processor with a crowd inside. In Proceedings of the 23rd Annual ACM Symposium on User Interface Software and
Technology, UIST ’10, pages 313–322, New York, NY, USA, 2010. ACM.
[2] J. P. Bigham, C. Jayant, H. Ji, G. Little, A. Miller, R. C. Miller, R. Miller, A. Tatarowicz, B. White, S. White, and T. Yeh.
VizWiz: nearly real-time answers to visual questions. In Proceedings of the 23rd Annual ACM Symposium on User Interface
Software and Technology, UIST ’10, pages 333–342, New York, NY, USA, 2010. ACM.
[3] S. Cooper, F. Khatib, A. Treuille, J. Barbero, J. Lee, M. Beenen, A. Leaver-Fay, D. Baker, Z. Popović, and Foldit players.
Predicting protein structures with a multiplayer online game. Nature, 466(7307):756–760, Aug 2010.
[4] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei. ImageNet: A large-scale hierarchical image database. In IEEE
Computer Vision and Pattern Recognition, CVPR ’09, pages 248–255, 2009.
[5] K. Heimerl, B. Gawalt, K. Chen, T. Parikh, and B. Hartmann. Communitysourcing: Engaging local crowds to perform
expert work via physical kiosks. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI
’12, pages 1539–1548, New York, NY, USA, 2012. ACM.
[6] J. Kim, P. T. Nguyen, S. Weir, P. J. Guo, R. C. Miller, and K. Z. Gajos. Crowdsourcing step-by-step information extraction
to enhance existing how-to videos. In Proceedings of the 32nd Annual ACM Conference on Human Factors in Computing
Systems, CHI ’14, pages 4017–4026, New York, NY, USA, 2014. ACM.
[7] G. Laput, W. S. Lasecki, J. Wiese, R. Xiao, J. P. Bigham, and C. Harrison. Zensors: Adaptive, rapidly deployable,
human-intelligent sensor feeds. In Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing
Systems, CHI ’15, pages 1935–1944, New York, NY, USA, 2015. ACM.
[8] W. Lasecki, C. Miller, A. Sadilek, A. Abumoussa, D. Borrello, R. Kushalnagar, and J. Bigham. Real-time captioning by
groups of non-experts. In Proceedings of the 25th Annual ACM Symposium on User Interface Software and Technology,
UIST ’12, pages 23–34, New York, NY, USA, 2012. ACM.
[9] V. Verroios and M. S. Bernstein. Context trees: Crowdsourcing global understanding from local views. In Proceedings of
the Second AAAI Conference on Human Computation and Crowdsourcing (HCOMP-2014), 2014.

Resources
• TED Talk: How we're teaching computers to understand pictures
https://www.ted.com/talks/fei_fei_li_how_we_re_teaching_computers_to_understand_pictures
• UMich - Human Computation and Crowdsourcing Systems
https://goo.gl/FNK76E
• UPenn - Crowdsourcing & Human Computation
http://crowdsourcing-class.org
• CMU - Crowd Programming
http://www.programthecrowd.com
