Predictive Biases in Natural Language Processing Models:
A Conceptual Framework and Overview
Deven Shah, H. Andrew Schwartz, & Dirk Hovy
(dsshah, has)@cs.stonybrook.edu, dirk.hovy@unibocconi.edu
Human Language Analysis Beings
Predictive Models in NLP
[Diagram: features X_source and outcomes Y_source from a source population (model side) are used to fit a model, which then predicts outcomes from features X_target of a target population (application side).]
Predictive Models in NLP
[Diagram: the same pipeline, now with embedding features 𝜃_embedding learned from an embedding corpus (pre-trained side) feeding into the fit.]
Predictive Models in NLP are Biased
[Diagram: the full pipeline, where prediction on the application side yields biased outcomes Ŷ_target.]
Zhao et al. 2018; Webster et al. 2018; Suresh and Guttag 2019; Rudinger et al. 2018; Romanov et al. 2019; Li et al. 2018; Kiritchenko and Mohammad 2018; Kern et al. 2016; Hovy 2018; Hitti et al. 2019; Gebru et al. 2018; Garimella et al. 2019; Elazar and Goldberg 2018; Corbett-Davies and Goel 2018; Coavoux et al. 2018; Zmigrod et al. 2019; Zhao et al. 2017; Sweeney and Najafian 2019; Sun et al. 2019; Stanovsky et al. 2019; Sap et al. 2019; Nissim et al. 2019; Mitchell et al. 2019; Lynn et al. 2017; Kurita et al. 2019; Kleinberg et al. 2018; Jørgensen et al. 2015; Hovy and Spruit 2016; Hovy et al. 2020; Gonen and Goldberg 2019; Glymour and Herington 2019; Giorgi et al. 2019; Garg et al. 2018; Friedler et al. 2016; Culotta 2014; Caliskan et al. 2017; Bolukbasi et al. 2016; Bender and Friedman 2018; Almodaresi et al. 2017; Almeida et al. 2015
Goal: To provide a conceptual framework and mathematical definitions for organizing work on biased predictive models in NLP.
Conceptual Framework: Disparities

outcome disparity
The distribution of outcomes, given attribute A, is dissimilar to the ideal distribution:
Q(Ŷt|A) ≁ P(Yt|A)

error disparity
The distribution of error (ϵ) over at least two different values of an attribute (A) is unequal:
Q(ϵt|Ai) ≁ Q(ϵt|Aj)
Outcome Disparity
[Bar chart: average predicted outcome per human attribute value, Predicted Q(Ŷt|A) vs. Ideal P(Yt|A).]
Example: predicted cancer rates across the attribute values "woman" and "man", where the predicted distribution Q(Ŷt|A) departs from the ideal P(Yt|A).
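Outcome disparity can be checked directly from a model's predictions by comparing per-group predicted rates Q(Ŷt|A) against reference rates P(Yt|A). A minimal sketch (the function name and toy data are ours, not from the paper):

```python
import numpy as np

def outcome_disparity(y_pred, y_ideal, attr):
    """Gap between the predicted outcome rate Q(Y_hat | A=a) and the
    ideal/reference rate P(Y | A=a) for each value of attribute A."""
    gaps = {}
    for a in np.unique(attr):
        mask = attr == a
        gaps[a] = y_pred[mask].mean() - y_ideal[mask].mean()
    return gaps

# Toy example: the model over-predicts the outcome for group 0 only.
attr = np.array([0, 0, 0, 0, 1, 1, 1, 1])
y_ideal = np.array([1, 0, 0, 0, 1, 0, 0, 0])  # ideal rate 0.25 for both groups
y_pred = np.array([1, 1, 1, 0, 1, 0, 0, 0])   # predicted: 0.75 vs. 0.25
print(outcome_disparity(y_pred, y_ideal, attr))
```

A nonzero gap for one group but not the other is exactly the Q(Ŷt|A) ≁ P(Yt|A) condition above.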
Error Disparity
[Bar chart: error of predictions per human attribute value, Biased Q(ϵt|A) vs. Unbiased P(ϵt|A).]
Example: the "WSJ Effect": error increases with distance from "Standard" English, which correlates with demographics (Jørgensen et al., WNUT 2015; Hovy & Søgaard, ACL 2015).
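Error disparity is likewise measurable: compute the error rate separately for each attribute value and compare. A sketch with invented names and toy data:

```python
import numpy as np

def error_disparity(y_true, y_pred, attr):
    """Per-group error rates Q(eps | A=a); error disparity exists when
    these differ markedly across values of the attribute."""
    return {str(a): float((y_true[attr == a] != y_pred[attr == a]).mean())
            for a in np.unique(attr)}

# Toy example: higher tagging error for speakers of "non-standard" language.
attr = np.array(["std"] * 4 + ["nonstd"] * 4)
y_true = np.array([0, 1, 0, 1, 0, 1, 0, 1])
y_pred = np.array([0, 1, 0, 1, 1, 0, 0, 1])
print(error_disparity(y_true, y_pred, attr))  # {'nonstd': 0.5, 'std': 0.0}
```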
Disparities
[Full pipeline diagram: both outcome disparity and error disparity surface in the biased outcomes Ŷ_target on the application side.]
Origins of Bias
[Pipeline diagram: biases originate on the model, application, and pre-trained sides, and surface as outcome or error disparity.]
Selection Bias

selection bias
The sample of observations itself is not representative of the application population.
[Bar chart: proportion of sample per attribute value in the Source Q(AS) vs. the Target P(AT).]
Example: the "WSJ Effect" revisited as selection bias: error correlates with distance from "Standard" English, and thus with demographics (Jørgensen et al., WNUT 2015; Hovy & Søgaard, ACL 2015).
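A standard countermeasure for selection bias is post-stratification: re-weight each source observation by the ratio of target to source attribute proportions. A minimal sketch under our own naming (not code from the paper):

```python
import numpy as np

def poststratification_weights(attr_sample, target_props):
    """Weight each source observation by P_target(a) / Q_source(a) so the
    re-weighted sample matches the application population's attribute mix."""
    vals, counts = np.unique(attr_sample, return_counts=True)
    source_props = dict(zip(vals, counts / counts.sum()))
    return np.array([target_props[a] / source_props[a] for a in attr_sample])

# Source sample is 80% group "A", but the target population is 50/50.
sample = np.array(["A"] * 8 + ["B"] * 2)
w = poststratification_weights(sample, {"A": 0.5, "B": 0.5})
# Each "A" is down-weighted, each "B" up-weighted; the weighted mix is 50/50.
```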
Label Bias

label bias
Biased annotations, interaction, or latent bias from past classifications.
Example: two annotators tag "out" in "it comes out apr 30" differently:
PRON VERB ADP NOUN NUM
PRON VERB PART NOUN NUM
("It's a particle!" / "No! It's an adposition!")
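Disagreements like this can be surfaced with chance-corrected inter-annotator agreement; low agreement on a slice of the data is one signal of label bias worth adjudicating. A self-contained Cohen's kappa sketch (our own illustration, not the paper's method):

```python
import numpy as np

def cohens_kappa(ann1, ann2):
    """Chance-corrected agreement between two annotators over the same items."""
    ann1, ann2 = np.asarray(ann1), np.asarray(ann2)
    po = (ann1 == ann2).mean()                      # observed agreement
    labels = np.unique(np.concatenate([ann1, ann2]))
    # Expected agreement if both annotators labeled independently at their
    # own marginal rates.
    pe = sum((ann1 == l).mean() * (ann2 == l).mean() for l in labels)
    return (po - pe) / (1 - pe)

# The slide's example: the annotators differ only on ADP vs. PART for "out".
a1 = ["PRON", "VERB", "ADP", "NOUN", "NUM"]
a2 = ["PRON", "VERB", "PART", "NOUN", "NUM"]
print(round(cohens_kappa(a1, a2), 3))  # 0.762
```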
Overamplification

over-amplification
The model discriminates on a given human attribute beyond its source base-rate.
Overamplification: the Model Amplifies Bias
[Bar chart: the target predictions Q(ŶT|AT) exaggerate the attribute skew already present in the source Q(YS|AS), relative to the ideal P(YS|AS) (Zhao et al. 2017).]
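Over-amplification can be quantified as how far the model pushes a group's outcome rate beyond the base-rate it saw in training. A sketch with invented numbers (function name and data are ours):

```python
import numpy as np

def amplification(y_src, a_src, y_pred, a_tgt, group):
    """Predicted outcome rate for one attribute group, minus the
    source base-rate for that group; positive values mean the model
    discriminates beyond what the training data showed."""
    base = y_src[a_src == group].mean()
    pred = y_pred[a_tgt == group].mean()
    return pred - base

# Toy numbers: a label co-occurs with group "w" 60% of the time in the
# source, but the model predicts it for that group 80% of the time.
a_src = np.array(["w"] * 10)
y_src = np.array([1] * 6 + [0] * 4)    # source base-rate 0.6
a_tgt = np.array(["w"] * 10)
y_pred = np.array([1] * 8 + [0] * 2)   # predicted rate 0.8
print(round(amplification(y_src, a_src, y_pred, a_tgt, "w"), 2))  # 0.2
```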
Semantic Bias

semantic bias
Non-ideal associations between attributed lexemes (e.g. gendered pronouns) and non-attributed lexemes (e.g. occupations).
Semantic Bias: Biased Vectors
[Example: biased word vectors associate "Woman" with "Pets" and "Man" with "Car accessories".]
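Such associations can be probed with cosine-similarity gaps, in the spirit of association tests like WEAT (Caliskan et al. 2017). A toy 2-d sketch; the vectors are fabricated for illustration, and real probes use pre-trained embeddings:

```python
import numpy as np

def association(word_vec, group_a, group_b):
    """Mean cosine similarity of a word to two sets of attribute vectors;
    a large gap indicates a non-ideal (biased) association."""
    def cos(u, v):
        return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))
    return (np.mean([cos(word_vec, g) for g in group_a])
            - np.mean([cos(word_vec, g) for g in group_b]))

# Fabricated 2-d "embeddings" for the gendered-pronoun example.
she, he = np.array([1.0, 0.0]), np.array([0.0, 1.0])
nurse = np.array([0.9, 0.1])
print(association(nurse, [she], [he]) > 0)  # True: "nurse" leans toward "she"
```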
Predictive Bias Framework for NLP
[Full framework diagram, mapping each origin to its consequence: selection bias (data selection) and label bias (annotation) enter through the source data, over-amplification through the model fit, and semantic bias through the pre-trained embeddings; all surface as outcome disparity or error disparity in the biased outcomes Ŷ_target on the application side.]
Summary of Countermeasures

Source origin    Bias               Countermeasures
---------------  -----------------  ------------------------------------------------
annotation       Label Bias         Post-stratification; re-train annotators
data selection   Selection Bias     Stratified sampling; post-stratification or
                                    re-weighting techniques
models           Overamplification  Synthetically match distributions; add outcome
                                    disparity to the cost function
embeddings       Semantic Bias      Use the above techniques and re-train embeddings
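The "add outcome disparity to the cost function" countermeasure can be sketched as a cross-entropy loss plus a penalty on the gap in mean predicted outcome between groups. This is our own minimal illustration (it assumes a binary attribute), not the paper's formulation:

```python
import numpy as np

def penalized_loss(y_true, y_prob, attr, lam=1.0):
    """Cross-entropy plus lam times the outcome-disparity gap:
    |mean prediction for group 0 - mean prediction for group 1|."""
    eps = 1e-9
    ce = -np.mean(y_true * np.log(y_prob + eps)
                  + (1 - y_true) * np.log(1 - y_prob + eps))
    g0, g1 = np.unique(attr)               # assumes exactly two groups
    gap = abs(y_prob[attr == g0].mean() - y_prob[attr == g1].mean())
    return ce + lam * gap

# Toy check: with lam=0 this is plain cross-entropy; lam>0 adds the gap.
y = np.array([1, 0, 1, 0])
attr = np.array([0, 0, 1, 1])
p = np.array([0.9, 0.7, 0.3, 0.1])  # group 0 rate 0.8, group 1 rate 0.2
print(penalized_loss(y, p, attr, lam=1.0))
```

Minimizing this trades predictive fit against equalizing the groups' predicted outcome rates, with lam controlling the trade-off.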
Takeaways
Bias, as outcome and error disparities, can result from many origins:
● the embedding model
● the feature sample
● the fitting process
● the outcome sample
See the paper!
● Survey of works and countermeasures for each origin
● Details on the predictive bias framework, including mathematical definitions
This is version 0.1: we hope it inspires further work towards a unified understanding and, ultimately, the mitigation of bias!
References
1. Berk, R. A. (1983). An introduction to sample selection bias in sociological data. American Sociological Review, 386–398.
2. Feldman, M., Friedler, S. A., Moeller, J., Scheidegger, C., & Venkatasubramanian, S. (2015). Certifying and removing disparate impact. In Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 259–268). ACM.
3. Friedler, S. A., Scheidegger, C., & Venkatasubramanian, S. (2016). On the (im)possibility of fairness. arXiv preprint arXiv:1609.07236.
4. Baker, R., Brick, J. M., Bates, N. A., Battaglia, M., Couper, M. P., Dever, J. A., ... & Tourangeau, R. (2013). Summary report of the AAPOR task force on non-probability sampling. Journal of Survey Statistics and Methodology, 1(2), 90–143.
5. Hays, R. D., Liu, H., & Kapteyn, A. (2015). Use of internet panels to conduct surveys. Behavior Research Methods, 47(3), 685–690.
6. Hovy, D. (2015). Demographic factors improve classification performance. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers).
7. Sun, T., et al. (2019). Mitigating gender bias in natural language processing: Literature review. arXiv preprint arXiv:1906.08976.
Thank You
(dsshah, has)@cs.stonybrook.edu, dirk.hovy@unibocconi.edu
Deven Shah, H. Andrew Schwartz, and Dirk Hovy. 2020. Predictive Biases in Natural Language Processing Models: A Conceptual Framework and Overview. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics.

Pests of jatropha_Bionomics_identification_Dr.UPR.pdfPests of jatropha_Bionomics_identification_Dr.UPR.pdf
Pests of jatropha_Bionomics_identification_Dr.UPR.pdfPirithiRaju
 
Neurodevelopmental disorders according to the dsm 5 tr
Neurodevelopmental disorders according to the dsm 5 trNeurodevelopmental disorders according to the dsm 5 tr
Neurodevelopmental disorders according to the dsm 5 trssuser06f238
 
Harmful and Useful Microorganisms Presentation
Harmful and Useful Microorganisms PresentationHarmful and Useful Microorganisms Presentation
Harmful and Useful Microorganisms Presentationtahreemzahra82
 
Environmental Biotechnology Topic:- Microbial Biosensor
Environmental Biotechnology Topic:- Microbial BiosensorEnvironmental Biotechnology Topic:- Microbial Biosensor
Environmental Biotechnology Topic:- Microbial Biosensorsonawaneprad
 
zoogeography of pakistan.pptx fauna of Pakistan
zoogeography of pakistan.pptx fauna of Pakistanzoogeography of pakistan.pptx fauna of Pakistan
zoogeography of pakistan.pptx fauna of Pakistanzohaibmir069
 
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...lizamodels9
 

Recently uploaded (20)

THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptxTHE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
 
Pests of soyabean_Binomics_IdentificationDr.UPR.pdf
Pests of soyabean_Binomics_IdentificationDr.UPR.pdfPests of soyabean_Binomics_IdentificationDr.UPR.pdf
Pests of soyabean_Binomics_IdentificationDr.UPR.pdf
 
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptx
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptxRESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptx
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptx
 
Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?
 
Vision and reflection on Mining Software Repositories research in 2024
Vision and reflection on Mining Software Repositories research in 2024Vision and reflection on Mining Software Repositories research in 2024
Vision and reflection on Mining Software Repositories research in 2024
 
Scheme-of-Work-Science-Stage-4 cambridge science.docx
Scheme-of-Work-Science-Stage-4 cambridge science.docxScheme-of-Work-Science-Stage-4 cambridge science.docx
Scheme-of-Work-Science-Stage-4 cambridge science.docx
 
Artificial Intelligence In Microbiology by Dr. Prince C P
Artificial Intelligence In Microbiology by Dr. Prince C PArtificial Intelligence In Microbiology by Dr. Prince C P
Artificial Intelligence In Microbiology by Dr. Prince C P
 
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
 
Analytical Profile of Coleus Forskohlii | Forskolin .pptx
Analytical Profile of Coleus Forskohlii | Forskolin .pptxAnalytical Profile of Coleus Forskohlii | Forskolin .pptx
Analytical Profile of Coleus Forskohlii | Forskolin .pptx
 
Call Us ≽ 9953322196 ≼ Call Girls In Lajpat Nagar (Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Lajpat Nagar (Delhi) |Call Us ≽ 9953322196 ≼ Call Girls In Lajpat Nagar (Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Lajpat Nagar (Delhi) |
 
BREEDING FOR RESISTANCE TO BIOTIC STRESS.pptx
BREEDING FOR RESISTANCE TO BIOTIC STRESS.pptxBREEDING FOR RESISTANCE TO BIOTIC STRESS.pptx
BREEDING FOR RESISTANCE TO BIOTIC STRESS.pptx
 
Microphone- characteristics,carbon microphone, dynamic microphone.pptx
Microphone- characteristics,carbon microphone, dynamic microphone.pptxMicrophone- characteristics,carbon microphone, dynamic microphone.pptx
Microphone- characteristics,carbon microphone, dynamic microphone.pptx
 
TOPIC 8 Temperature and Heat.pdf physics
TOPIC 8 Temperature and Heat.pdf physicsTOPIC 8 Temperature and Heat.pdf physics
TOPIC 8 Temperature and Heat.pdf physics
 
STOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptx
STOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptxSTOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptx
STOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptx
 
Pests of jatropha_Bionomics_identification_Dr.UPR.pdf
Pests of jatropha_Bionomics_identification_Dr.UPR.pdfPests of jatropha_Bionomics_identification_Dr.UPR.pdf
Pests of jatropha_Bionomics_identification_Dr.UPR.pdf
 
Neurodevelopmental disorders according to the dsm 5 tr
Neurodevelopmental disorders according to the dsm 5 trNeurodevelopmental disorders according to the dsm 5 tr
Neurodevelopmental disorders according to the dsm 5 tr
 
Harmful and Useful Microorganisms Presentation
Harmful and Useful Microorganisms PresentationHarmful and Useful Microorganisms Presentation
Harmful and Useful Microorganisms Presentation
 
Environmental Biotechnology Topic:- Microbial Biosensor
Environmental Biotechnology Topic:- Microbial BiosensorEnvironmental Biotechnology Topic:- Microbial Biosensor
Environmental Biotechnology Topic:- Microbial Biosensor
 
zoogeography of pakistan.pptx fauna of Pakistan
zoogeography of pakistan.pptx fauna of Pakistanzoogeography of pakistan.pptx fauna of Pakistan
zoogeography of pakistan.pptx fauna of Pakistan
 
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
 

Predictive Biases in Natural Language Processing Models: A Conceptual Framework and Overview

  • 1. Predictive Biases in Natural Language Processing Models: A Conceptual Framework and Overview Deven Shah, H. Andrew Schwartz, & Dirk Hovy (dsshah, has)@cs.stonybrook.edu, dirk.hovy@unibocconi.edu Human Language Analysis Beings
  • 2. features Xsource features Xtarget predict Source Population (Model Side) Target Population (Application Side) fit outcomes Ysource Predictive Models in NLP
  • 3. features Xsource Source Population (Model Side) fit outcomes Ysource features 𝜃embedding Embedding Corpus (Pre-trained Side) Predictive Models in NLP features Xsource features Xtarget predict Source Population (Model Side) Target Population (Application Side) fit outcomes Ysource
  • 4. features Xsource features Xtarget predict Source Population (Model Side) Target Population (Application Side) biased outcomes Ŷtarget fit outcomes Ysource features 𝜃embedding Embedding Corpus (Pre-trained Side) Predictive Models in NLP are Biased Zhao et al. 2018 Webster et al. 2018 Suresh and Guttag 2019 Rudinger et al. 2018 Romanov et al. 2019 Li et al. 2018 Kiritchenko and Mohammad 2018 Kern et al. 2016 Hovy 2018 Hitti et al. 2019 Gebru et al. 2018 Garimella et al. 2019 Elazar and Goldberg 2018 Corbett-Davies and Goel 2018 Coavoux et al. 2018 Zmigrod et al. 2019 Zhao et al. 2017 Sweeney and Najafian 2019 Sun et al. 2019 Stanovsky et al. 2019 Sap et al. 2019 Nissim et al. 2019 Mitchell et al. 2019 Lynn et al. 2017 Kurita et al. 2019 Kleinberg et al. 2018 Kern et al. 2016 Jørgensen et al. 2015 Hovy and Spruit 2016 Hovy et al. 2020 Gonen and Goldberg 2019 Glymour and Herington 2019 Giorgi et al. 2019 Garg et al. 2018 Friedler et al. 2016 Culotta 2014 Caliskan et al. 2017 Bolukbasi et al. 2016 Bender and Friedman 2018 Almodaresi et al. 2017 Almeida et al. 2015
  • 5. features Xsource features Xtarget predict Source Population (Model Side) Target Population (Application Side) biased outcomes Ŷtarget fit outcomes Ysource features 𝜃embedding Embedding Corpus (Pre-trained Side) Predictive Models in NLP are Biased Zhao et al. 2018 Webster et al. 2018 Suresh and Guttag 2019 Rudinger et al. 2018 Romanov et al. 2019 Li et al. 2018 Kiritchenko and Mohammad 2018 Kern et al. 2016 Hovy 2018 Hitti et al. 2019 Gebru et al. 2018 Garimella et al. 2019 Elazar and Goldberg 2018 Corbett-Davies and Goel 2018 Coavoux et al. 2018 Zmigrod et al. 2019 Zhao et al. 2017 Sweeney and Najafian 2019 Sun et al. 2019 Stanovsky et al. 2019 Sap et al. 2019 Nissim et al. 2019 Mitchell et al. 2019 Lynn et al. 2017 Kurita et al. 2019 Kleinberg et al. 2018 Kern et al. 2016 Jørgensen et al. 2015 Hovy and Spruit 2016 Hovy et al. 2020 Gonen and Goldberg 2019 Glymour and Herington 2019 Giorgi et al. 2019 Garg et al. 2018 Friedler et al. 2016 Culotta 2014 Caliskan et al. 2017 Bolukbasi et al. 2016 Bender and Friedman 2018 Almodaresi et al. 2017 Almeida et al. 2015 Goal: To provide a conceptual framework and mathematical definitions for organizing work on biased predictive models in NLP.
  • 6. features Xsource features Xtarget predict Source Population (Model Side) Target Population (Application Side) biased outcomes Ŷtarget fit outcomes Ysource features 𝜃embedding Embedding Corpus (Pre-trained Side) Conceptual Framework:
  • 7. features Xsource features Xtarget predict Source Population (Model Side) Target Population (Application Side) biased outcomes Ŷtarget fit outcomes Ysource features 𝜃embedding Embedding Corpus (Pre-trained Side) Conceptual Framework:
  • 8. features Xsource features Xtarget predict Source Population (Model Side) Target Population (Application Side) biased outcomes Ŷtarget fit outcomes Ysource features 𝜃embedding Embedding Corpus (Pre-trained Side) Conceptual Framework: outcome disparity The distribution of outcomes, given attribute A, is dissimilar to the ideal distribution: Q(Ŷt|A) ≁ P(Yt|A) error disparity The distributions of error (ϵ) over at least two different values of an attribute (A) are unequal: Q(ϵt|Ai) ≁ Q(ϵt|Aj)
  • 9. features Xtarget predict Target Population (Application Side) biased outcomes Ŷtarget Outcome Disparity outcome disparity The distribution of outcomes, given attribute A, is dissimilar to the ideal distribution: Q(Ŷt|A) ≁ P(Yt|A) error disparity The distributions of error (ϵ) over at least two different values of an attribute (A) are unequal: Q(ϵt|Ai) ≁ Q(ϵt|Aj) Predicted Q(Ŷt|A) Ideal P(Yt|A) human attribute value1 value2 average predicted outcome
  • 10. features Xtarget predict Target Population (Application Side) biased outcomes Ŷtarget Outcome Disparity outcome disparity The distribution of outcomes, given attribute A, is dissimilar to the ideal distribution: Q(Ŷt|A) ≁ P(Yt|A) error disparity The distributions of error (ϵ) over at least two different values of an attribute (A) are unequal: Q(ϵt|Ai) ≁ Q(ϵt|Aj) Predicted Q(Ŷt|A) Ideal P(Yt|A) human attribute value1 value2 Predicted Q(Ŷt|A) Ideal P(Yt|A) human attribute woman man cancer
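The outcome disparity definition can be made concrete in code. The following is a minimal sketch, not from the paper; the function names and the total variation distance as the dissimilarity measure are my own choices for illustration. It estimates the per-attribute predicted distribution Q(Ŷt|A) from paired lists and scores disparity against an ideal P(Yt|A):

```python
# Illustrative sketch: estimating Q(Yhat | A) and scoring outcome
# disparity as a worst-case total variation distance from P(Y | A).
from collections import Counter

def outcome_distribution(outcomes, attributes):
    """Estimate P(outcome | attribute) from paired lists."""
    by_attr = {}
    for y, a in zip(outcomes, attributes):
        by_attr.setdefault(a, Counter())[y] += 1
    return {a: {y: n / sum(c.values()) for y, n in c.items()}
            for a, c in by_attr.items()}

def outcome_disparity(pred_dist, ideal_dist):
    """Worst-case total variation distance between Q(Yhat|A=a) and P(Y|A=a)."""
    worst = 0.0
    for a in ideal_dist:
        ys = set(ideal_dist[a]) | set(pred_dist.get(a, {}))
        tv = 0.5 * sum(abs(pred_dist.get(a, {}).get(y, 0.0)
                           - ideal_dist[a].get(y, 0.0)) for y in ys)
        worst = max(worst, tv)
    return worst
```

On the cancer example from the slide, a model predicting cancer for women at a rate of 0.8 against an ideal rate of 0.5 would score a disparity of 0.3 under this measure.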
  • 11. features Xtarget predict Target Population (Application Side) biased outcomes Ŷtarget Error Disparity outcome disparity The distribution of outcomes, given attribute A, is dissimilar to the ideal distribution: Q(Ŷt|A) ≁ P(Yt|A) error disparity The distributions of error (ϵ) over at least two different values of an attribute (A) are unequal: Q(ϵt|Ai) ≁ Q(ϵt|Aj) Biased Q(ϵt|A) Unbiased P(ϵt|A) human attribute value1 value2 error of predictions
  • 12. features Xtarget predict Target Population (Application Side) biased outcomes Ŷtarget Error Disparity outcome disparity The distribution of outcomes, given attribute A, is dissimilar to the ideal distribution: Q(Ŷt|A) ≁ P(Yt|A) error disparity The distributions of error (ϵ) over at least two different values of an attribute (A) are unequal: Q(ϵt|Ai) ≁ Q(ϵt|Aj) Predicted Q(Ŷt|A) Ideal P(Yt|A) human attribute value1 value2 Jørgensen et al. (WNUT 2015) Hovy & Søgaard (ACL 2015) Correlates with demographics Distance from “Standard” error WSJ Effect
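The "WSJ effect" on this slide is an error disparity: error is not uniform across groups whose language differs from the training standard. A minimal sketch of how one might check for it, with illustrative function names of my own, not the authors' code:

```python
# Illustrative sketch: error disparity as the gap between per-group
# error rates Q(eps | A=a_i) vs. Q(eps | A=a_j).
def group_error_rates(y_true, y_pred, attributes):
    """Error rate of predictions within each attribute value."""
    errs, counts = {}, {}
    for yt, yp, a in zip(y_true, y_pred, attributes):
        counts[a] = counts.get(a, 0) + 1
        errs[a] = errs.get(a, 0) + (yt != yp)
    return {a: errs[a] / counts[a] for a in counts}

def error_disparity(y_true, y_pred, attributes):
    """Largest pairwise difference in group error rates."""
    rates = group_error_rates(y_true, y_pred, attributes)
    return max(rates.values()) - min(rates.values())
```

A value near zero means the model errs roughly equally across attribute values; a large value signals the kind of disparity Jørgensen et al. and Hovy & Søgaard document.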
  • 13. features Xtarget predict Target Population (Application Side) biased outcomes Ŷtarget Disparities outcome disparity The distribution of outcomes, given attribute A, is dissimilar to the ideal distribution: Q(Ŷt|A) ≁ P(Yt|A) error disparity The distributions of error (ϵ) over at least two different values of an attribute (A) are unequal: Q(ϵt|Ai) ≁ Q(ϵt|Aj)
  • 14. features Xsource features Xtarget predict Source Population (Model Side) Target Population (Application Side) biased outcomes Ŷtarget fit outcomes Ysource features 𝜃embedding Embedding Corpus (Pre-trained Side) Disparities outcome disparity The distribution of outcomes, given attribute A, is dissimilar to the ideal distribution: Q(Ŷt|A) ≁ P(Yt|A) error disparity The distributions of error (ϵ) over at least two different values of an attribute (A) are unequal: Q(ϵt|Ai) ≁ Q(ϵt|Aj)
  • 15. features Xsource features Xtarget Target Population (Application Side) predict Source Population (Model Side) biased outcomes Ŷtarget fit outcomes Ysource features 𝜃embedding Embedding Corpus (Pre-trained Side) Origins of Bias outcome disparity The distribution of outcomes, given attribute A, is dissimilar to the ideal distribution: Q(Ŷt|A) ≁ P(Yt|A) error disparity The distributions of error (ϵ) over at least two different values of an attribute (A) are unequal: Q(ϵt|Ai) ≁ Q(ϵt|Aj)
  • 16. features Xsource features Xtarget Target Population (Application Side) predict Source Population (Model Side) biased outcomes Ŷtarget fit outcomes Ysource features 𝜃embedding Embedding Corpus (Pre-trained Side) Selection Bias outcome disparity The distribution of outcomes, given attribute A, is dissimilar to the ideal distribution: Q(Ŷt|A) ≁ P(Yt|A) error disparity The distributions of error (ϵ) over at least two different values of an attribute (A) are unequal: Q(ϵt|Ai) ≁ Q(ϵt|Aj) selection bias The sample of observations itself is not representative of the application population.
  • 17. features Xsource features Xtarget Target Population (Application Side) predict Source Population (Model Side) biased outcomes Ŷtarget fit outcomes Ysource features 𝜃embedding Embedding Corpus (Pre-trained Side) Selection Bias outcome disparity The distribution of outcomes, given attribute A, is dissimilar to the ideal distribution: Q(Ŷt|A) ≁ P(Yt|A) error disparity The distributions of error (ϵ) over at least two different values of an attribute (A) are unequal: Q(ϵt|Ai) ≁ Q(ϵt|Aj) selection bias The sample of observations itself is not representative of the application population. Source Q(AS) Target P(AT) human attribute value1 value2 proportion of sample
  • 18. features Xsource features Xtarget Target Population (Application Side) predict Source Population (Model Side) biased outcomes Ŷtarget fit outcomes Ysource features 𝜃embedding Embedding Corpus (Pre-trained Side) Selection Bias outcome disparity The distribution of outcomes, given attribute A, is dissimilar to the ideal distribution: Q(Ŷt|A) ≁ P(Yt|A) error disparity The distributions of error (ϵ) over at least two different values of an attribute (A) are unequal: Q(ϵt|Ai) ≁ Q(ϵt|Aj) selection bias The sample of observations itself is not representative of the application population. Jørgensen et al. (WNUT 2015) Hovy & Søgaard (ACL 2015) Correlates with demographics Distance from “Standard” error WSJ Effect
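Selection bias, as depicted on these slides, is a mismatch between the sample attribute mix Q(As) and the target population mix P(At). A hedged sketch of how to quantify that mismatch (illustrative names; the total variation distance is one reasonable choice, not the paper's prescription):

```python
# Illustrative sketch: comparing the attribute proportions in the
# training sample, Q(A_s), against known target proportions, P(A_t).
from collections import Counter

def sample_proportions(attributes):
    """Empirical proportion of each attribute value in the sample."""
    c = Counter(attributes)
    n = sum(c.values())
    return {a: k / n for a, k in c.items()}

def selection_gap(sample_attrs, target_props):
    """Total variation distance between sample and target attribute mix."""
    q = sample_proportions(sample_attrs)
    keys = set(q) | set(target_props)
    return 0.5 * sum(abs(q.get(a, 0.0) - target_props.get(a, 0.0))
                     for a in keys)
```

A nonzero gap does not by itself prove biased predictions, but it flags that the model was fit on a population that differs from the one it will be applied to.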
  • 19. features Xsource features Xtarget predict Source Population (Model Side) Target Population (Application Side) biased outcomes Ŷtarget fit outcomes Ysource features 𝜃embedding Embedding Corpus (Pre-trained Side) Selection Bias outcome disparity The distribution of outcomes, given attribute A, is dissimilar to the ideal distribution: Q(Ŷt|A) ≁ P(Yt|A) error disparity The distributions of error (ϵ) over at least two different values of an attribute (A) are unequal: Q(ϵt|Ai) ≁ Q(ϵt|Aj) selection bias The sample of observations itself is not representative of the application population.
  • 20. features Xsource features Xtarget predict Source Population (Model Side) Target Population (Application Side) biased outcomes Ŷtarget fit outcomes Ysource features 𝜃embedding Embedding Corpus (Pre-trained Side) Label Bias outcome disparity The distribution of outcomes, given attribute A, is dissimilar to the ideal distribution: Q(Ŷt|A) ≁ P(Yt|A) error disparity The distributions of error (ϵ) over at least two different values of an attribute (A) are unequal: Q(ϵt|Ai) ≁ Q(ϵt|Aj) selection bias The sample of observations itself is not representative of the application population. label bias Biased annotations, interaction, or latent bias from past classifications.
  • 21. features Xsource features Xtarget predict Source Population (Model Side) Target Population (Application Side) biased outcomes Ŷtarget fit outcomes Ysource features 𝜃embedding Embedding Corpus (Pre-trained Side) Label Bias outcome disparity The distribution of outcomes, given attribute A, is dissimilar to the ideal distribution: Q(Ŷt|A) ≁ P(Yt|A) error disparity The distributions of error (ϵ) over at least two different values of an attribute (A) are unequal: Q(ϵt|Ai) ≁ Q(ϵt|Aj) selection bias The sample of observations itself is not representative of the application population. label bias Biased annotations, interaction, or latent bias from past classifications. Source Q(YS|AS) Ideal P(YS|AS) human attribute value1 value2 proportion of sample
  • 22. Label Bias it comes out apr 30 PRON VERB ADP NOUN NUM PRON VERB PART NOUN NUM It’s a particle No! It’s an adposition
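The POS-tagging slide above shows two annotators tagging the same sentence differently ("out" as ADP vs. PART). One simple way to surface such label bias is to tabulate systematic annotator disagreements; the sketch below is illustrative (my own function names, not the authors' tooling):

```python
# Illustrative sketch: surfacing label bias by measuring where two
# annotators systematically disagree on the same items, as in the
# slide's ADP-vs-PART example for "out".
from collections import Counter

def disagreements(ann_a, ann_b):
    """Counter of (label_a, label_b) pairs where annotators differ."""
    return Counter((x, y) for x, y in zip(ann_a, ann_b) if x != y)

def disagreement_rate(ann_a, ann_b):
    """Fraction of items the two annotators label differently."""
    return sum(x != y for x, y in zip(ann_a, ann_b)) / len(ann_a)
```

If disagreements cluster on particular label pairs (or on text from particular demographic groups), that is a hint of label bias rather than random noise.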
  • 23. features Xsource features Xtarget predict Source Population (Model Side) Target Population (Application Side) biased outcomes Ŷtarget fit outcomes Ysource features 𝜃embedding Embedding Corpus (Pre-trained Side) Label Bias outcome disparity The distribution of outcomes, given attribute A, is dissimilar to the ideal distribution: Q(Ŷt|A) ≁ P(Yt|A) error disparity The distributions of error (ϵ) over at least two different values of an attribute (A) are unequal: Q(ϵt|Ai) ≁ Q(ϵt|Aj) selection bias The sample of observations itself is not representative of the application population. label bias Biased annotations, interaction, or latent bias from past classifications.
  • 24. features Xsource features Xtarget predict Source Population (Model Side) Target Population (Application Side) biased outcomes Ŷtarget fit outcomes Ysource features 𝜃embedding Embedding Corpus (Pre-trained Side) Overamplification outcome disparity The distribution of outcomes, given attribute A, is dissimilar to the ideal distribution: Q(Ŷt|A) ≁ P(Yt|A) error disparity The distributions of error (ϵ) over at least two different values of an attribute (A) are unequal: Q(ϵt|Ai) ≁ Q(ϵt|Aj) selection bias The sample of observations itself is not representative of the application population. label bias Biased annotations, interaction, or latent bias from past classifications. over-amplification The model discriminates on a given human attribute beyond its source base-rate.
  • 25. features Xsource features Xtarget predict Source Population (Model Side) Target Population (Application Side) biased outcomes Ŷtarget fit outcomes Ysource features 𝜃embedding Embedding Corpus (Pre-trained Side) Overamplification outcome disparity The distribution of outcomes, given attribute A, is dissimilar to the ideal distribution: Q(Ŷt|A) ≁ P(Yt|A) error disparity The distributions of error (ϵ) over at least two different values of an attribute (A) are unequal: Q(ϵt|Ai) ≁ Q(ϵt|Aj) selection bias The sample of observations itself is not representative of the application population. label bias Biased annotations, interaction, or latent bias from past classifications. over-amplification The model discriminates on a given human attribute beyond its source base-rate. Target Q(ŶT|AT) Source Q(YS|AS) human attribute value1 value2 proportion of sample Ideal P(YS|AS)
  • 26. Overamplification - Model Amplifies Bias Zhao et al. (EMNLP 2017)
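Overamplification can be quantified in the spirit of Zhao et al.'s bias-amplification measure: compare how often an attribute value co-occurs with an outcome in the training labels versus in the model's predictions. The sketch below is a simplified, illustrative version (my own names, not the paper's code):

```python
# Illustrative sketch of a bias-amplification check: the model
# overamplifies when its predicted co-occurrence rate exceeds the
# training base rate for an attribute-outcome pair.
def cooccurrence_rate(outcomes, attributes, outcome, attr_value):
    """Estimate P(A = attr_value | Y = outcome) from paired lists."""
    hits = [a for y, a in zip(outcomes, attributes) if y == outcome]
    if not hits:
        return 0.0
    return sum(a == attr_value for a in hits) / len(hits)

def amplification(train_y, train_a, pred_y, pred_a, outcome, attr_value):
    """Positive when predictions exaggerate the training base rate."""
    return (cooccurrence_rate(pred_y, pred_a, outcome, attr_value)
            - cooccurrence_rate(train_y, train_a, outcome, attr_value))
```

This matches the kitchen example in the speaker notes: a 66% training base rate pushed to 84% in predictions yields a positive amplification score.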
  • 27. features Xsource features Xtarget predict Source Population (Model Side) Target Population (Application Side) biased outcomes Ŷtarget fit outcomes Ysource features 𝜃embedding Embedding Corpus (Pre-trained Side) Overamplification outcome disparity The distribution of outcomes, given attribute A, is dissimilar to the ideal distribution: Q(Ŷt|A) ≁ P(Yt|A) error disparity The distributions of error (ϵ) over at least two different values of an attribute (A) are unequal: Q(ϵt|Ai) ≁ Q(ϵt|Aj) selection bias The sample of observations itself is not representative of the application population. label bias Biased annotations, interaction, or latent bias from past classifications. over-amplification The model discriminates on a given human attribute beyond its source base-rate.
  • 28. features Xsource features Xtarget predict Source Population (Model Side) Target Population (Application Side) biased outcomes Ŷtarget fit outcomes Ysource features 𝜃embedding Embedding Corpus (Pre-trained Side) Semantic Bias outcome disparity The distribution of outcomes, given attribute A, is dissimilar to the ideal distribution: Q(Ŷt|A) ≁ P(Yt|A) error disparity The distributions of error (ϵ) over at least two different values of an attribute (A) are unequal: Q(ϵt|Ai) ≁ Q(ϵt|Aj) selection bias The sample of observations itself is not representative of the application population. label bias Biased annotations, interaction, or latent bias from past classifications. over-amplification The model discriminates on a given human attribute beyond its source base-rate. semantic bias Non-ideal associations between attributed lexemes (e.g. gendered pronouns) and non-attributed lexemes (e.g. occupations).
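Semantic bias in pre-trained embeddings is often probed with association scores such as WEAT (Caliskan et al. 2017, cited in the slide cloud). Below is a simplified, illustrative sketch, assuming you already have word vectors; the function names and the two-set difference score are my own reduction of the idea, not the full WEAT statistic:

```python
# Illustrative sketch: probing semantic bias by comparing the cosine
# similarity of a word (e.g. an occupation) to two attribute sets
# (e.g. gendered pronoun vectors). Scores far from 0 suggest a
# non-ideal association.
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def association(word_vec, attr_vecs_a, attr_vecs_b):
    """Mean similarity to attribute set A minus mean similarity to set B."""
    sim_a = sum(cosine(word_vec, v) for v in attr_vecs_a) / len(attr_vecs_a)
    sim_b = sum(cosine(word_vec, v) for v in attr_vecs_b) / len(attr_vecs_b)
    return sim_a - sim_b
```

In an unbiased embedding, occupation words would score near zero against gendered pronoun sets; Bolukbasi et al. 2016 and Caliskan et al. 2017 show that real embeddings do not.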
  • 30. Predictive Bias Framework for NLP semantic bias Non-ideal associations between attributed lexemes (e.g. gendered pronouns) and non-attributed lexemes (e.g. occupations). features Xsource features Xtarget predict Source Population (Model Side) Target Population (Application Side) biased outcomes Ŷtarget fit over-amplification The model discriminates on a given human attribute beyond its source base-rate. features 𝜃embedding outcome disparity The distribution of outcomes, given attribute A, is dissimilar to the ideal distribution: Q(Ŷt|A) ≁ P(Yt|A) error disparity The distributions of error (ϵ) over at least two different values of an attribute (A) are unequal: Q(ϵt|Ai) ≁ Q(ϵt|Aj) Embedding Corpus (Pre-trained Side) outcomes Ysource label bias Biased annotations, interaction, or latent bias from past classifications. selection bias The sample of observations itself is not representative of the application population. origin consequence
  • 31. Summary of Countermeasures: Label Bias (origin: annotation): post-stratification, re-train annotators. Selection Bias (origin: data selection): stratified sampling, post-stratification, or re-weighing techniques. Overamplification (origin: models): synthetically match distributions, add outcome disparity to the cost function. Semantic Bias (origin: embeddings): use the above techniques and re-train embeddings.
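The re-weighing / post-stratification countermeasure for selection bias can be sketched in a few lines. This is an illustrative implementation of the standard importance-weighting idea, not code from the paper; the function name is hypothetical:

```python
# Illustrative sketch of post-stratification re-weighing: each
# training instance gets weight w(a) = P_target(a) / Q_sample(a), so
# the weighted sample matches the target population's attribute mix.
from collections import Counter

def poststrat_weights(sample_attrs, target_props):
    """Per-instance weights that re-balance the sample toward the target."""
    c = Counter(sample_attrs)
    n = sum(c.values())
    return [target_props[a] / (c[a] / n) for a in sample_attrs]
```

These weights would typically be passed to a learner's `sample_weight` argument (as in many common ML libraries); over-represented groups are down-weighted and under-represented groups up-weighted.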
  • 32. Takeaways Bias, as outcome and error disparities, can result from many origins: ● the embedding model ● the feature sample ● the fitting process ● the outcome sample See the paper! ● Survey of works and countermeasures for each origin ● Details on the predictive bias framework, including mathematical definitions This is version 0.1: We hope this inspires further work towards a unified understanding and, ultimately, mitigating bias!
  • 33. References 1. Berk, R. A. (1983). An introduction to sample selection bias in sociological data. American Sociological Review, 386-398. 2. Feldman, M., Friedler, S. A., Moeller, J., Scheidegger, C., & Venkatasubramanian, S. (2015). Certifying and removing disparate impact. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 259-268). ACM. 3. Friedler, S. A., Scheidegger, C., & Venkatasubramanian, S. (2016). On the (im)possibility of fairness. arXiv preprint arXiv:1609.07236. 4. Baker, R., Brick, J. M., Bates, N. A., Battaglia, M., Couper, M. P., Dever, J. A., ... & Tourangeau, R. (2013). Summary report of the AAPOR task force on non-probability sampling. Journal of Survey Statistics and Methodology, 1(2), 90-143. 5. Hays, R. D., Liu, H., & Kapteyn, A. (2015). Use of internet panels to conduct surveys. Behavior Research Methods, 47(3), 685-690. 6. Hovy, D. (2015). Demographic factors improve classification performance. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). 7. Sun, T., et al. (2019). Mitigating gender bias in natural language processing: Literature review. arXiv preprint arXiv:1906.08976.
  • 34. Thank You (dsshah, has)@cs.stonybrook.edu, dirk.hovy@unibocconi.edu Deven Shah, H. Andrew Schwartz, and Dirk Hovy. 2020. Predictive Biases in Natural Language Processing Models: A Conceptual Framework and Overview. In Proceedings of the The 58th Annual Meeting of the Association for Computational Linguistics. Human Language Analysis Beings

Editor's Notes

  1. "We look at the predictions we're making and consider whether there is a difference or a disparity with what is desirable."
  2. "Building on work from bias in health, we refer to such a difference as a disparity. We then contend there are two types of disparities which cover most concerns for bias in NLP: ...."
  3. Definitions also work with continuous distributions: show a continuous distribution
  4. "but knowing that the model has a disparity usually doesn't provide a solution. Focusing only on the disparity is a bit like focusing only on the symptoms of a disease rather than the cause. "
  5. "so let's work from the symptoms or disparities, backwards to consider possible origins in the bias. We have to look backwards in the model development process "
  6. For instance, given a training set with a woman in the kitchen 66% of the time and a man in the kitchen 33% of the time, there is a difference in outcome given the gender of the person. After training an NLP model on this dataset and testing it on unseen data, the model predicts a woman in the kitchen 84% of the time and a man in the kitchen 16% of the time, thereby amplifying the bias that was already present in the data.
  7. We also propose countermeasures for each of the predictive biases based on its origin. For label bias, where the source is the annotation, we could re-train the annotators or use post-stratification techniques. For selection bias, where the problem is data selection, we could use stratified sampling, re-weighing, or post-stratification techniques. For overamplification, where the source is the model, we could synthetically match the distributions of human attributes with respect to the outcome variables before training, or add outcome disparity to the cost function. For semantic bias, where the source is biased embeddings, we could re-train the embeddings taking the above techniques into consideration.