Citizen Science 101: What Every Researcher Should Know About Crowdsourcing Science
1. Citizen Science 101: What Every Researcher Should Know About Crowdsourcing Science
Andrea Wiggins
Postdoctoral Fellow, DataONE & Cornell Lab of Ornithology
17 September, 2012
Tuesday, September 18, 12
2. What is citizen science?
Members of the public engaging in real-world scientific research
• Crowdsourcing
• Collaboration
• Community
3. What is citizen science?
[Diagram: overlapping areas labeled public participation in science, crowdsourcing, volunteer monitoring, online communities, cyberinfrastructure, and scientific collaboration; the overlap (*) = citizen science]
5. Variations on a theme
Label | Research Domain | Key Features
Civic science | Science communication | Public participation in decisions about science
People's science | Political science | Social movements for people-centered science
Citizen science | Ecology | Public participation in scientific research
Volunteer/community-based monitoring | Natural resource management | Long-term monitoring and intervention
Participatory action research | Behavioral science | Researcher & community participation & action
Action science | Behavioral science | Participatory, emphasizes tacit theories-in-use
Community science | Psychology | Participatory community-centered social science
Living Labs | Management | Public-private partnership for innovation
10. Why do research this way?
Big data
• Ultimate mobile intelligent sensor network
• Spatiotemporal range
Human computation
• Image processing & puzzle solving
Addressing local concerns
• Water quality, noise pollution data
Simple economics
• There are more non-scientists than scientists
13. Who participates?
The public is diverse demographically and intellectually
• Make no assumptions!
• But...
Many non-professional communities have specialized skills
• Rock climbers: lichen
• Gamers: protein folding
• Weather buffs: precipitation
Education ≠ expertise, expertise ≠ education
• Ornithologists vs. birders: no contest
14. Just a few examples
15. The Great Sunflower Project
Collecting data on pollinator service (bees!)
• Participation involves:
  • Planting sunflowers
  • Creating garden description on Drupal website
  • Recording 15-minute observation samples on data sheet
  • Online data entry
• Started in 2008 by a single academic researcher
• Collects data across North America
• Very successful in attracting volunteer interest
16. eBird
Collecting bird abundance and distribution data
• Participation involves:
  • Choosing observation methods
  • Recording bird observations (analog or digital)
  • Entering observations and metadata online
• Launched in 2002 by Cornell Lab of Ornithology (with National Audubon Society)
• World's largest biodiversity data set: 100M records
• Currently receives about 3M observations/month
• Data used in research and decision-making for land management, policy (and recreation)
17. Galaxy Zoo
Classifying images of galaxies
• Participation involves:
  • Looking at pictures of galaxies online
  • Answering a few questions about them
• Started in 2007 by a team of academic astronomers
• Instant success and exciting new discoveries
  • Galaxy Zoo 1, Year 1: 50M classifications, 150K volunteers
  • Galaxy Zoo 2, Year 2: 60M classifications in 14 months
  • Hanny's Voorwerp
  • Green Pea galaxies
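As a rough illustration of how several volunteers' answers about the same image could be combined, the sketch below tallies votes per image and reports the leading answer with its share of the votes. The image IDs, labels, and the function name `aggregate_votes` are hypothetical; this is a generic majority-vote sketch, not a description of Galaxy Zoo's actual pipeline.

```python
from collections import Counter, defaultdict


def aggregate_votes(classifications):
    """Combine (image_id, label) votes into one consensus answer per image.

    Returns {image_id: (top_label, vote_fraction)}. Ties go to the label
    that was seen first, which is an arbitrary choice for this sketch.
    """
    votes = defaultdict(Counter)
    for image_id, label in classifications:
        votes[image_id][label] += 1

    consensus = {}
    for image_id, counts in votes.items():
        top_label, top_count = counts.most_common(1)[0]
        consensus[image_id] = (top_label, top_count / sum(counts.values()))
    return consensus


# Hypothetical volunteer answers for two images.
sample = [
    ("img-1", "spiral"), ("img-1", "spiral"), ("img-1", "elliptical"),
    ("img-2", "elliptical"), ("img-2", "elliptical"),
]
result = aggregate_votes(sample)
```

A low vote fraction could be used to route an image to more volunteers or to an expert, which is one way replication and expert review can work together.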
20. Are the data any good?
#1 concern of the uninitiated
• If the data aren't good, it's because the design is wrong
• Numerous QA/QC mechanisms; 75% use more than one
  • Expert review: 77%
  • Photos: 40%
  • Online + paper: 33%
  • Replication: 23%
  • QA/QC training: 22%
  • Automatic filtering: 18%
  • Uniform equipment: 15%
Expert review +...
  • Photos: 23%
  • Automatic filtering: 18%
  • Paper data sheets: 17%
  • Replication: 17%
  • Photos + paper: 10%
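To make the pairing of automatic filtering with expert review concrete, here is a minimal sketch in which submitted records are checked against a plausibility limit and anything unusual is set aside for a human reviewer. The record fields, species names, limits, and the function name `triage` are all assumptions made for illustration; real projects define their own filters.

```python
def triage(records, max_plausible_count):
    """Split records into auto-accepted and flagged-for-expert-review lists.

    `max_plausible_count` maps a species name to the largest count that is
    accepted without review (hypothetical limits for this sketch).
    """
    accepted, flagged = [], []
    for record in records:
        limit = max_plausible_count.get(record["species"])
        # Unknown species, or counts above the limit, go to a reviewer.
        if limit is None or record["count"] > limit:
            flagged.append(record)
        else:
            accepted.append(record)
    return accepted, flagged


# Hypothetical limits and submissions.
limits = {"American Robin": 200, "Snowy Owl": 3}
submissions = [
    {"species": "American Robin", "count": 12},
    {"species": "Snowy Owl", "count": 40},
    {"species": "Ivory-billed Woodpecker", "count": 1},
]
accepted, flagged = triage(submissions, limits)
```

The filter does not reject anything on its own; it only decides which records need expert attention, so reviewer time goes to the unusual reports.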
21. What does it accomplish?
• Engage critical thinking (Trumbull et al. 2000)
• Science learning, bonding (Kountoupes and Oberhauser 2008)
• Environmental action; social networks (Overdevest et al. 2004)
• Social capital (Ballard 2008)
• Improved policy (Wing et al. 2008)
22. What does it accomplish?
• Documenting range shifts (Bonter et al. unpublished data)
• Identifying potential mismatches (Batalden et al. 2007)
• Identifying vulnerable species (Crimmins et al. 2008, 2009)
• Health planning (Levetin and Van de Water 2008)
• Anticipating effects on water sources (e.g., CoCoRaHS)
• Processing large image data sets (e.g., Zooniverse projects)
• Applying human computation skills (e.g., Foldit)
23. What does it accomplish?
BIG DATA!
30. Common myths
Non-professionals' data is unreliable
It's free labor
• Managing volunteers is never free
It's just outreach
• Sometimes, but not that often
Citizen science threatens conventional science
• Not a replacement, but a complement
• Achieves things professional science can't/wouldn't
33. Citizen science in the 21st century
Expansion into new areas
• Protein folding (Foldit)
• Synthetic RNA design (EteRNA)
Increasingly ICT-mediated
• Mobile technologies in the field
• Image processing and problem solving
Bigger and better data
• Quality is an issue, but not a showstopper
• Global workforce of cognitive surplus
• Public has more expertise than you expect
34. DataONE PPSR Working Group
Purpose:
• Improve quality, quantity, and accessibility of PPSR data
• Advance integration of PPSR data in conventional science
Products:
• Data Management Guide for PPSR - coming soon!
• Articles in August FREE special issue
• Data quality & validation paper
• Involved in several initiatives for developing a community of practice