Detecting emergent intersectional biases: Contextualized word embeddings contain a distribution of human-like biases

Wei Guo, Aylin Caliskan
August 1, 2021
George Washington University

Bias in NLP is everywhere!
• Bias in NLP perpetuates bias in society
• Existing bias measurements are not comprehensive
• Existing methods cannot identify bias automatically

Intersectional bias is concerning!
• Incomplete measurement of social biases
• Unique experiences of discrimination in ML systems

Methods: implicit cognition → natural language → computer vision
Implicit Association Test (IAT)
• Tests for differential association of two concepts
• Easier to categorize stereotype-congruent pairs
• Harder to categorize stereotype-incongruent pairs
• Effect d = difference in reaction time
[Figure: Weapon IAT (implicit.harvard.edu)]

Methods: implicit cognition → static embeddings → contextualized embeddings
Word Embedding Association Test (WEAT): the word-embedding analogue of the Implicit Association Test (IAT)

[Figure: each word in the sets {man, father, ...}, {woman, mother, ...}, {science, math, ...}, and {liberal arts, music, ...} is represented by a vector [feature_1, feature_2, ..., feature_d].]

Word Embedding Association Test (WEAT)
$$ s(w, A, B) = \operatorname{mean}_{a \in A} \cos(w, a) - \operatorname{mean}_{b \in B} \cos(w, b) $$
$$ s(X, Y, A, B) = \sum_{x \in X} s(x, A, B) - \sum_{y \in Y} s(y, A, B) $$
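
The two WEAT statistics can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names (`cos`, `word_association`, `weat_statistic`) and the 2-dimensional toy vectors are assumptions invented for the example, and real embeddings would have far more dimensions.

```python
import math


def cos(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(ui * vi for ui, vi in zip(u, v))
    norm_u = math.sqrt(sum(ui * ui for ui in u))
    norm_v = math.sqrt(sum(vi * vi for vi in v))
    return dot / (norm_u * norm_v)


def mean(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


def word_association(w, A, B):
    """s(w, A, B): mean cosine of w to set A minus mean cosine of w to set B."""
    return mean(cos(w, a) for a in A) - mean(cos(w, b) for b in B)


def weat_statistic(X, Y, A, B):
    """s(X, Y, A, B): summed association of X minus summed association of Y."""
    return (sum(word_association(x, A, B) for x in X)
            - sum(word_association(y, A, B) for y in Y))


# Toy vectors, made up for illustration only (not real embeddings).
X = [[1.0, 0.0], [0.9, 0.1]]  # target set X
Y = [[0.0, 1.0], [0.1, 0.9]]  # target set Y
A = [[1.0, 0.1]]              # attribute set A
B = [[0.1, 1.0]]              # attribute set B

stat = weat_statistic(X, Y, A, B)
print(stat)  # positive here: X lies closer to A, Y lies closer to B
```

A positive value means the targets in X are, on balance, more associated with A than the targets in Y are; swapping X and Y flips the sign.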

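Word Embedding Factual Association Test (WEFAT)
$$ s(w, A, B) = \frac{\operatorname{mean}_{a \in A} s(\vec{w}, \vec{a}) - \operatorname{mean}_{b \in B} s(\vec{w}, \vec{b})}{\operatorname{std}_{x \in A \cup B}\, s(\vec{w}, \vec{x})} $$
Here s(\vec{w}, \vec{a}) applied to a pair of vectors is their cosine similarity.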
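
As a rough sketch of the normalized association score above, the following Python computes the difference of mean similarities divided by the standard deviation over both attribute sets. The name `wefat_effect_size` and the toy vectors are hypothetical, and the use of the sample standard deviation (`statistics.stdev`) is an assumption, since the slide does not say which estimator is used.

```python
import math
import statistics


def cos(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(ui * vi for ui, vi in zip(u, v))
    norm_u = math.sqrt(sum(ui * ui for ui in u))
    norm_v = math.sqrt(sum(vi * vi for vi in v))
    return dot / (norm_u * norm_v)


def wefat_effect_size(w, A, B):
    """Normalized association of word vector w with attribute sets A and B."""
    sims_a = [cos(w, a) for a in A]
    sims_b = [cos(w, b) for b in B]
    numerator = statistics.mean(sims_a) - statistics.mean(sims_b)
    # Assumption: sample standard deviation over the union of A and B.
    denominator = statistics.stdev(sims_a + sims_b)
    return numerator / denominator


# Toy vectors, made up for illustration only.
w = [1.0, 0.0]
A = [[1.0, 0.0], [0.9, 0.1]]
B = [[0.0, 1.0], [0.1, 0.9]]

es = wefat_effect_size(w, A, B)
print(es)  # positive: w is closer to A than to B
```

Dividing by the standard deviation puts scores for different words on a comparable scale, which is what allows a single threshold to be applied across many candidate words.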

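Intersectional Bias Detection (IBD)
$$ s(w, A, B) = \frac{\operatorname{mean}_{a \in A} s(\vec{w}, \vec{a}) - \operatorname{mean}_{b \in B} s(\vec{w}, \vec{b})}{\operatorname{std}_{x \in A \cup B}\, s(\vec{w}, \vec{x})} $$
Detecting intersectional biases associated with members of multiple minority groups.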

Emergent Intersectional Bias Detection (EIBD)
Intersectional biases − attributes highly associated with a single social category = emergent intersectional biases (the remaining set)
Detecting unique emergent intersectional biases that do not overlap with the biases of their constituent minority identities.
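
The subtraction described on this slide is a set difference, which can be sketched as follows. The function name `emergent_biases` and the placeholder attribute labels are hypothetical; in practice the input sets would come from a detection step such as IBD, which is not reproduced here.

```python
def emergent_biases(intersectional, single_category_sets):
    """Return attributes tied to the intersectional group but to none of its
    constituent single-category groups."""
    # Union of every attribute already associated with a single category.
    overlap = set().union(*single_category_sets)
    return set(intersectional) - overlap


# Placeholder labels, made up for illustration only.
intersectional = {"attr_1", "attr_2", "attr_3", "attr_4"}
single_category_sets = [
    {"attr_1"},            # attributes of one constituent identity
    {"attr_3", "attr_9"},  # attributes of the other constituent identity
]

remaining = emergent_biases(intersectional, single_category_sets)
print(sorted(remaining))
```

Whatever is left after the subtraction is, by construction, associated with the intersectional group only.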

Evaluation of IBD
Detection accuracy > 80%, where random chance < 15%
Validation set for intersectional biases from Ghavami & Peplau, 2013

Contextualized Embedding Association Test (CEAT)
• Extract the sentences containing the words in X, Y, A, B
• Generate the contextualized embeddings
• Calculate the effect size of bias based on WEAT
• Generate the distribution of effect magnitudes of biases
• Calculate the Combined Effect Size (CES):
$$ CES(X, Y, A, B) = \frac{\sum_{i=1}^{N} v_i \, ES_i}{\sum_{i=1}^{N} v_i} $$
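
The combined effect size is a weighted average of the N per-sample effect sizes, which the short sketch below illustrates. The function name `combined_effect_size` and the numbers are invented for the example; the slide does not define the weights v_i, so they are treated here as given inputs.

```python
def combined_effect_size(effect_sizes, weights):
    """CES: weighted average of per-sample effect sizes ES_i with weights v_i."""
    if len(effect_sizes) != len(weights):
        raise ValueError("effect_sizes and weights must have the same length")
    weighted_sum = sum(v * es for v, es in zip(weights, effect_sizes))
    return weighted_sum / sum(weights)


# Toy values, made up for illustration only.
effect_sizes = [0.2, 0.8]  # ES_i from two sampled sets of contexts
weights = [1.0, 3.0]       # v_i, assumed to be supplied by the caller

ces = combined_effect_size(effect_sizes, weights)
print(ces)
```

With equal weights the result reduces to the plain mean of the effect sizes; unequal weights pull the combined value toward the samples that carry more weight.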

Evaluation of CEAT
Contextualized embeddings from Corpus of
• Widely shared biases
  • Flowers/insects
  • Musical instruments/weapons
• Social group biases
  • Gender
  • Race
  • Intersectionality
  • ...

Evaluation of CEAT
• Intersectional biases have high magnitude.
• Most to least biased: ELMo > BERT > GPT > GPT-2
• The overall magnitude of bias negatively correlates with the level of contextualization in the language model.

Questions?
weiguo@gwu.edu
github.com/weiguowilliam/CEAT
Acknowledgements
my co-author Aylin Caliskan & many reviewers