7. Metadata generation via GWAP
• Promises
  • Identify new content
  • Make content searchable
  • Create alt-text for visually impaired users
  • Eventually train machines to do this
• Crowdsourcing enables this on a large scale
• Challenges
  • Metadata quality: accuracy, completeness, consistency
  • What happens when the crowd is asked to label images depicting people?
9. Images of "doctors"
Label sets assigned to the pictured images:
• Hat, Surgeon, Doctor, Operate, Green, Face
• Hat, Ugly, Talk
• Chair, Eyes, Smile, Cap, Face, Astronaut, White
• Nurse, Doctor
• Doctor, Earrings, Photo, Lips, Paper, Talk, Black, Speaker, Desk, Face, Student
• Photo, Guy, Gray, Door, Chinese, Ears, Grey, Nerd, Black, Asian, China, Doctor
10. Images of "teachers"
Label sets assigned to the pictured images:
• Teacher, Blackboard, Board, Books, Hair, Blue, Book
• Teacher, Blackboard, Lecture, Red, Write, Stairs, Black, School, Hair, Math, White, Professor
• Teacher, Paper, Office, Clock, Classroom, School, Work, Grade, White, Kids, Class
• Teacher, Teeth, Smile, Classroom, Face, School, Hair, Curly, Lady
11. Stereotypes?
• Even when we are being "polite," social stereotypes influence us, and our language reveals this
• Linguistic bias
  • "A systematic asymmetry in the way one uses language, as a function of the social group of the person(s) being described." [Beukeboom, 2013]
  • Has a cognitive origin…but social consequences
  • Conveys stereotypes in a very subtle way
• Linguistic Expectancy Bias (LEB) [Maass et al., 1989]
  • The tendency to describe people and situations that are expectancy-consistent (i.e., stereotype-congruent) with more abstract, interpretive language…
  • …and those that are inconsistent with expectations (i.e., stereotype-incongruent) with more tangible, concrete language
12. Example: Linguistic Expectancy Bias
• More expected → more abstract / interpretive language: Doctor, Surgeon, Intelligent, Serious
• Less expected → more concrete language: Doctor, Nurse, Experiment, Smiley, Talking; Doctor, Nurse, Student, Studying, Listening
14. Why would it matter?
• Compare: Doctor, Surgeon, Intelligent, Serious vs. Doctor, Nurse, Student, Studying, Listening
• Abstract language is powerful!
  • Implies stability over time and across situations
  • Message recipients interpret messages differently [Wigboldus et al., 2000]
    • Abstract: enduring qualities
    • Concrete: transient characteristics
  • These biases contribute to the maintenance and transmission of stereotypes, because abstract information is resistant to disconfirmation
15. Linguistic biases in image metadata
• LEB is pervasive in human communication, but has rarely been studied outside of a laboratory [Hunt, 2011]
• How to study this? Two characteristics of the language used to label images (a sketch of computing both measures follows below):
  • Abstract: how often are adjectives used?
    • Adjectives describe how someone is, versus who she is or what she is doing (e.g., intelligent vs. studying)
  • Subjective: how often is evaluative language used?
    • Evaluative words inject our interpretation of the person (e.g., intelligent, beautiful, gentle, rockstar, loser, ugly, stupid)
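As a rough illustration of these two measures, the sketch below tags each image's labels and computes the share of adjectives and of subjective adjectives. The study used the CLAWS C5 tagger and Wilson et al.'s (2005) Subjectivity Lexicon; NLTK's default tagger and the tiny stand-in word set here are assumptions made only to keep the example self-contained.

```python
# Minimal sketch: adjective rate and subjective-adjective rate per label set.
# NLTK's tagger stands in for CLAWS C5; SUBJECTIVE_WORDS stands in for the
# full Subjectivity Lexicon. Both substitutions are illustrative assumptions.
import nltk

nltk.download("averaged_perceptron_tagger", quiet=True)

SUBJECTIVE_WORDS = {"intelligent", "beautiful", "gentle", "rockstar",
                    "loser", "ugly", "stupid"}

def label_measures(labels):
    """Return (adjective rate, subjective-adjective rate) for one image."""
    # Tagging isolated labels lacks sentence context; acceptable for a sketch.
    tagged = nltk.pos_tag(labels)  # Penn Treebank tags; JJ* marks adjectives
    adjectives = [w for w, tag in tagged if tag.startswith("JJ")]
    subjective = [w for w in adjectives if w.lower() in SUBJECTIVE_WORDS]
    return len(adjectives) / len(labels), len(subjective) / len(labels)

print(label_measures(["doctor", "surgeon", "intelligent", "serious"]))
print(label_measures(["doctor", "nurse", "student", "studying", "listening"]))
```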
16. Images of "doctors": adjectives
• Same label sets as in slide 9; the original slide highlights the labels that are adjectives
17. Images of "doctors": subjective words
• Same label sets as in slide 9; the original slide highlights the labels that are subjective words
18. Small ESP Dataset
• 100k images collected from the live game [von Ahn & Dabbish, 2004]
• 46,392 images depicting one or more people
19. Research questions and methods
• Q1: Do we observe differences across gender with respect to the use of abstract and subjective language?
  • Analysis 1: all images, regardless of context
  • Method: automated linguistic analysis
    • 18,916 images of men
    • 14,628 images of women
• Q2: Do we observe differences across gender with respect to the attributes of the subject(s) that are described?
  • Analysis 2: images in particular professional contexts
  • Method: manual content analysis to identify labels describing
    • Physical appearance
    • Disposition or character
    • Occupation
22. Analysis 1: method
• STEP 1: Find images of men and women (a sketch of this step follows below)
  • ESP Game Dataset: 100k images
  • LIWC categories "humans, friends, family"
  • Use hyponyms of "man" / "male" and "woman" / "female" to label gender
• STEP 2: Find labels that are adjectives
  • Part-of-speech tagging with CLAWS C5; manual error analysis: adjectives (1.45%)
  • Finding: women are more often described with adjectives
• STEP 3: Find subjective adjectives
  • Subjectivity Lexicon [Wilson et al., 2005]
  • Finding: women are more often described with subjective adjectives
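The deck does not spell out how the hyponym lookup was implemented; a minimal sketch using WordNet via NLTK (an assumption about tooling, with the specific synsets also chosen for illustration) could look like this:

```python
# Sketch of Step 1: gender-label images via WordNet hyponyms of "man"/"male"
# and "woman"/"female". The synsets named here are illustrative assumptions.
from nltk.corpus import wordnet as wn  # requires nltk.download("wordnet")

def hyponym_lemmas(*synset_names):
    """Collect lemma names of the given synsets and all their hyponyms."""
    terms = set()
    for name in synset_names:
        root = wn.synset(name)
        for syn in [root] + list(root.closure(lambda s: s.hyponyms())):
            terms.update(l.name().lower().replace("_", " ") for l in syn.lemmas())
    return terms

MALE_TERMS = hyponym_lemmas("man.n.01", "male.n.02")
FEMALE_TERMS = hyponym_lemmas("woman.n.01", "female.n.02")

def image_gender(labels):
    """Classify one image's label set as depicting a man, a woman, or both."""
    labels = {l.lower() for l in labels}
    has_m, has_f = bool(labels & MALE_TERMS), bool(labels & FEMALE_TERMS)
    return {(True, False): "man", (False, True): "woman",
            (True, True): "both", (False, False): "unknown"}[(has_m, has_f)]

print(image_gender(["photo", "guy", "gray", "door"]))  # -> "man"
```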
23. Analysis 1: method
• Examples from Wilson and colleagues' (2005) Subjectivity Lexicon (shown on the original slide)
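The Subjectivity Lexicon is distributed as a text file of key=value entries; a minimal loader might look like the following (the file name matches the standard distribution, but the parsing details are assumptions about that format):

```python
# Sketch: load the MPQA Subjectivity Lexicon (Wilson et al., 2005).
# Entries look roughly like:
#   type=strongsubj len=1 word1=ugly pos1=adj stemmed1=n priorpolarity=negative
# Adjust the path to wherever your local copy of the lexicon lives.
def load_subjectivity_lexicon(path="subjclueslen1-HLTEMNLP05.tff"):
    lexicon = {}
    with open(path) as f:
        for line in f:
            fields = dict(kv.split("=", 1) for kv in line.split() if "=" in kv)
            lexicon[fields["word1"]] = {
                "strength": fields["type"],         # strongsubj / weaksubj
                "pos": fields["pos1"],              # adj, noun, verb, anypos, ...
                "polarity": fields["priorpolarity"],
            }
    return lexicon

lex = load_subjectivity_lexicon()
print(lex.get("intelligent"))  # e.g. {'strength': 'strongsubj', ...}
```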
26. Analysis 1: summary
• There is a tendency to label images of women with
  • More abstract language (adjectives)
  • More evaluative language (subjective adjectives)
• Limitations of Analysis 1
  • Automated analysis of the language used
  • Compared all images, regardless of their contexts
  • Need to see if a robust, manual analysis also suggests a gender-based linguistic bias
27. Analysis 2: method
• STEP 1: Find images in similar contexts
  • Identify images with labels concerning 6 occupations: athlete, doctor, singer, soldier/Army, teacher, waiter/waitress
• STEP 2: Hand-code each label/image pair
  1. Could this word be used to describe the physical appearance of [an athlete]?
  2. Could this word describe the disposition or character of [an athlete]?
  3. Does this word describe something about the occupation of [athlete]?
• STEP 3: Compare proportions of labels in each category across gender (a sketch of such a comparison follows below)
• Finding: women are associated with more labels concerning appearance and fewer concerning occupation
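Step 3 amounts to comparing two proportions. The deck does not name a specific test; one standard choice is a two-proportion z-test, sketched below with invented placeholder counts rather than figures from the study.

```python
# Sketch of Step 3: compare the share of appearance-coded labels for images
# of men vs. women. Counts are invented placeholders, not study data.
from statsmodels.stats.proportion import proportions_ztest

appearance_counts = [120, 210]  # appearance-coded labels: [men, women]
total_counts = [1000, 1000]     # all hand-coded labels per gender

z, p = proportions_ztest(appearance_counts, total_counts)
print(f"z = {z:.2f}, p = {p:.4f}")  # small p suggests the proportions differ
```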
31. Gender-based linguistic bias
• Both analyses suggest that ESP Game players label images of men and women a bit differently
• The labels chosen reflect predominant social stereotypes
  • Women are noted as being (and should be) physically attractive
  • Women in stereotype-congruent occupations (singer, teacher, waitress) were less likely to be described with professional vocabulary
32. So what?
• Technologies are cultural artifacts. It's worth reflecting on the values we want them to have:
  • Technical (e.g., efficient, low-cost generation of image metadata)
  • Human (e.g., gender and racial equality)
• Technology's influence, particularly on young people
  • E.g., Google search
34. Limitations of this work
• I have identified a problem, but I haven't told you whether we should solve it and, if so, how!
• Food for thought
  • Does linguistic bias in metadata bother us?
  • If so, can we build a better game, resulting in better metadata?
35. Next steps
• Moving from observation to experimentation
  • The current study was a secondary data analysis, with no control over
    • Players (gender, age, social status)
    • Stimulus (image content)
  • We have currently identified some parameters we need to study, and how (e.g., characteristics of language)
  • A system will allow us to examine more systematically how each factor impacts the linguistic biases we observe:
    • Players
    • Stimulus
    • Social cues players receive
38. Thank you!
• Contact me: jahna.otterbacher@ouc.ac.cy
• This presentation was based on the following recent papers:
  • Otterbacher, J. 2015. Crowdsourcing Stereotypes: Linguistic Bias in Metadata Generated via GWAP. In Proceedings of the Conference on Human Factors in Computing Systems (ACM CHI '15). ACM Press: New York.
  • Otterbacher, J. 2015. Linguistic Bias in Collaboratively Produced Biographies: Crowdsourcing Social Stereotypes? In International AAAI Conference on Weblogs and Social Media (ICWSM). AAAI Press: Palo Alto, CA.