Published at ACL 2020
Link to the paper: https://www.aclweb.org/anthology/2020.acl-main.468.pdf
Joint work by:
Deven Shah (https://www.linkedin.com/in/deven-shah/)
Prof. Andrew Schwartz (https://www3.cs.stonybrook.edu/~has/)
Prof. Dirk Hovy (http://www.dirkhovy.com/cv/index.php)
An increasing number of natural language processing papers address the effect of bias on predictions, introducing mitigation techniques at different parts of the standard NLP pipeline (data and models). However, these works have been conducted in isolation, without a unifying framework to organize efforts within the field. This leads to repetitive approaches and an overly narrow focus on bias symptoms/effects rather than on their origins, which could limit the development of effective countermeasures. In this paper, we propose a unifying predictive bias framework for NLP. We summarize the NLP literature and suggest general mathematical definitions of predictive bias. We differentiate two consequences of bias, outcome disparities and error disparities, as well as four potential origins of bias: label bias, selection bias, model overamplification, and semantic bias. Our framework serves as an overview of predictive bias in NLP, integrating existing work into a single structure and providing a conceptual baseline for improved frameworks.
Predictive Biases in Natural Language Processing Models: A Conceptual Framework and Overview
1. Predictive Biases in Natural Language Processing Models:
A Conceptual Framework and Overview
Deven Shah, H. Andrew Schwartz, & Dirk Hovy
(dsshah, has)@cs.stonybrook.edu, dirk.hovy@unibocconi.edu
Human Language Analysis Beings
4. Predictive Models in NLP are Biased
[Framework diagram: an Embedding Corpus (Pre-trained Side) supplies embedding features 𝜃embedding; features Xsource are fit to outcomes Ysource in the Source Population (Model Side); the fitted model predicts biased outcomes Ŷtarget from features Xtarget in the Target Population (Application Side).]
Citation cloud of work on bias in NLP models: Almeida et al. 2015; Almodaresi et al. 2017; Bender and Friedman 2018; Bolukbasi et al. 2016; Caliskan et al. 2017; Coavoux et al. 2018; Corbett-Davies and Goel 2018; Culotta 2014; Elazar and Goldberg 2018; Friedler et al. 2016; Garg et al. 2018; Garimella et al. 2019; Gebru et al. 2018; Giorgi et al. 2019; Glymour and Herington 2019; Gonen and Goldberg 2019; Hitti et al. 2019; Hovy 2018; Hovy and Spruit 2016; Hovy et al. 2020; Jørgensen et al. 2015; Kern et al. 2016; Kiritchenko and Mohammad 2018; Kleinberg et al. 2018; Kurita et al. 2019; Li et al. 2018; Lynn et al. 2017; Mitchell et al. 2019; Nissim et al. 2019; Romanov et al. 2019; Rudinger et al. 2018; Sap et al. 2019; Stanovsky et al. 2019; Sun et al. 2019; Suresh and Guttag 2019; Sweeney and Najafian 2019; Webster et al. 2018; Zhao et al. 2017; Zhao et al. 2018; Zmigrod et al. 2019.
5. Predictive Models in NLP are Biased
[Same framework diagram and citation cloud as slide 4.]
Goal: To provide a conceptual framework and mathematical definitions for organizing work on biased predictive models in NLP.
8. Conceptual Framework
[Framework diagram as in slide 4.]
outcome disparity
The distribution of outcomes, given attribute A, is dissimilar to the ideal distribution:
Q(Ŷt|A) ≁ P(Yt|A)
error disparity
The distribution of error (ϵ) over at least two different values of an attribute (A) is unequal:
Q(ϵt|Ai) ≁ Q(ϵt|Aj)
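In practice, both disparities can be estimated by comparing per-group summary statistics of the predictions. The sketch below is a minimal illustration, not code from the paper; the function names and the use of held-out gold outcomes as a stand-in for the ideal distribution P(Yt|A) are assumptions.

```python
# Minimal sketch (assumed helper names, not the authors' code): estimating
# outcome disparity and error disparity from predictions on a target sample.
import numpy as np

def outcome_disparity(y_pred, y_ideal, a, group_i, group_j):
    """Gap between the predicted and the ideal between-group outcome difference.

    y_ideal approximates the ideal distribution P(Yt|A), e.g. held-out gold
    outcomes; choosing the ideal distribution is left to the practitioner.
    """
    pred_gap = y_pred[a == group_i].mean() - y_pred[a == group_j].mean()
    ideal_gap = y_ideal[a == group_i].mean() - y_ideal[a == group_j].mean()
    return pred_gap - ideal_gap  # 0 = predicted gap matches the ideal gap

def error_disparity(y_true, y_pred, a, group_i, group_j):
    """Difference in mean error between two values of the attribute A."""
    eps = np.abs(y_true - y_pred)  # per-instance error
    return eps[a == group_i].mean() - eps[a == group_j].mean()
```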
9. Outcome Disparity
[Bar chart: average predicted outcome per human attribute value (value1, value2), comparing the predicted distribution Q(Ŷt|A) with the ideal distribution P(Yt|A).]
10. Outcome Disparity (example)
[Bar chart: predicted cancer rate by gender (woman, man), comparing the predicted distribution Q(Ŷt|A) with the ideal distribution P(Yt|A).]
11. Error Disparity
[Bar chart: error of predictions per human attribute value (value1, value2), comparing a biased error distribution Q(ϵt|A) with an unbiased one P(ϵt|A).]
12. Error Disparity (example)
[Plot: prediction error versus distance from "Standard" language, which correlates with demographics (the "WSJ Effect"): Jørgensen et al. (WNUT 2015); Hovy & Søgaard (ACL 2015).]
14. Disparities
[Framework diagram as in slide 4, annotated with the outcome disparity and error disparity definitions from slide 8.]
15. Origins of Bias
[Framework diagram as in slide 4; the four origins are introduced on the following slides.]
16. Selection Bias
[Framework diagram as in slide 4.]
selection bias
The sample of observations itself is not representative of the application population.
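A standard countermeasure (see slide 31) is to re-weigh the source sample toward the target population. The sketch below is an illustration under assumptions, not the authors' implementation; the helper name and the availability of target proportions (e.g. from census data) are assumed.

```python
# Minimal post-stratification sketch (assumed helper, not the authors' code):
# weight source instances so that the attribute distribution Q(As) matches
# the target population's distribution P(At).
import numpy as np

def post_stratification_weights(a_source, target_props):
    """a_source: attribute value per training instance.
    target_props: dict of attribute value -> proportion in the target population."""
    values, counts = np.unique(a_source, return_counts=True)
    source_props = dict(zip(values, counts / len(a_source)))
    return np.array([target_props[v] / source_props[v] for v in a_source])

# Example: a sample that is 80% group "A" for a target population that is 50/50.
a = np.array(["A"] * 80 + ["B"] * 20)
w = post_stratification_weights(a, {"A": 0.5, "B": 0.5})
# Under-represented "B" instances get weight 2.5, "A" instances 0.625; most
# training routines accept such weights (e.g. a sample_weight argument).
```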
17. Selection Bias (illustration)
[Bar chart: proportion of sample per human attribute value (value1, value2), comparing the source distribution Q(AS) with the target distribution P(AT).]
18. Selection Bias (example)
[Same WSJ Effect example as slide 12: error correlates with distance from "Standard" language and with demographics (Jørgensen et al., WNUT 2015; Hovy & Søgaard, ACL 2015), here traced to training data that is not representative of the application population.]
20. Label Bias
[Framework diagram as in slide 4.]
label bias
Biased annotations, interaction, or latent bias from past classifications.
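One way to surface label bias is to test whether the annotated labels depend on a human attribute they should be independent of. The sketch below is an illustrative assumption, not the paper's procedure; it simply cross-tabulates labels against an attribute and runs a chi-square test.

```python
# Minimal label-bias probe (an assumption, not from the paper): test whether
# annotated labels and a human attribute are statistically dependent.
import numpy as np
from scipy.stats import chi2_contingency

def label_bias_check(labels, attribute):
    """labels, attribute: equal-length sequences (one entry per instance)."""
    label_vals = sorted(set(labels))
    attr_vals = sorted(set(attribute))
    table = np.array([[sum(1 for l, a in zip(labels, attribute)
                           if l == lv and a == av)
                       for av in attr_vals]
                      for lv in label_vals])
    chi2, p, _, _ = chi2_contingency(table)
    return chi2, p  # a small p-value flags a suspicious label/attribute link
```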
21. Label Bias (illustration)
[Bar chart: proportion of labels per human attribute value (value1, value2), comparing the annotated source distribution Q(YS|AS) with the ideal distribution P(YS|AS).]
22. Label Bias (example)
Sentence: "it comes out apr 30"
Two annotators disagree on the tag for "out":
Annotator 1: PRON VERB PART NOUN NUM ("It's a particle!")
Annotator 2: PRON VERB ADP NOUN NUM ("No! It's an adposition.")
24. Overamplification
[Framework diagram as in slide 4.]
over-amplification
The model discriminates on a given human attribute beyond its source base-rate.
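Over-amplification can be quantified as the gap between the model's predicted rate for a group and that group's base rate in the training data (cf. the kitchen example in the editor's notes, where a 66% training co-occurrence becomes an 84% prediction rate). The sketch below is a minimal illustration under assumed array names, not the authors' code.

```python
# Minimal over-amplification sketch (assumed names, not the authors' code):
# compare a group's predicted positive rate with its training base rate.
import numpy as np

def amplification(y_train, a_train, y_pred, a_pred, group, positive=1):
    """Positive return values mean the model exceeds the source base rate."""
    base_rate = (y_train[a_train == group] == positive).mean()
    pred_rate = (y_pred[a_pred == group] == positive).mean()
    return pred_rate - base_rate

# E.g. if 66% of training "kitchen" instances are labeled "woman" but the
# model predicts "woman" for 84% of kitchen scenes, amplification is +0.18.
```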
25. Overamplification (illustration)
[Bar chart: proportion of sample per human attribute value (value1, value2), comparing the predicted target distribution Q(ŶT|AT) with the source distribution Q(YS|AS) and the ideal distribution P(YS|AS).]
28. Semantic Bias
[Framework diagram as in slide 4.]
semantic bias
Non-ideal associations between attributed lexemes (e.g. gendered pronouns) and non-attributed lexemes (e.g. occupations).
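Such associations can be probed directly in a pre-trained embedding space, e.g. with WEAT-style similarity tests (Caliskan et al. 2017). The sketch below is a simplified illustration, not the paper's method; the embedding lookup `emb`, the word lists, and the helper names are assumptions.

```python
# Simplified semantic-bias probe (assumed helpers, cf. Caliskan et al. 2017):
# how strongly a non-attributed lexeme (an occupation) associates with
# attributed lexemes (gendered pronouns) in an embedding space.
import numpy as np

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

def gender_association(word, emb, female=("she", "her"), male=("he", "him")):
    """emb: dict-like mapping word -> vector. Positive values lean female,
    negative lean male; values near zero indicate no gendered association."""
    f = np.mean([cosine(emb[word], emb[w]) for w in female])
    m = np.mean([cosine(emb[word], emb[w]) for w in male])
    return f - m

# An unbiased embedding would score occupations such as "nurse" or
# "engineer" near zero; systematic deviations indicate semantic bias.
```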
30. Predictive Bias Framework for NLP
[Full framework figure, tying each origin of bias to the pipeline component where it arises and each consequence to the biased outcomes Ŷtarget on the application side.]
Origins:
● selection bias (features Xsource): the sample of observations itself is not representative of the application population.
● label bias (outcomes Ysource): biased annotations, interaction, or latent bias from past classifications.
● over-amplification (fitting Xsource to Ysource): the model discriminates on a given human attribute beyond its source base-rate.
● semantic bias (𝜃embedding, from the Embedding Corpus on the pre-trained side): non-ideal associations between attributed lexemes (e.g. gendered pronouns) and non-attributed lexemes (e.g. occupations).
Consequences:
● outcome disparity: Q(Ŷt|A) ≁ P(Yt|A)
● error disparity: Q(ϵt|Ai) ≁ Q(ϵt|Aj)
31. Summary of Countermeasures

Source          Origin             Countermeasures
annotation      Label Bias         Post-stratification; re-train annotators
data selection  Selection Bias     Stratified sampling, post-stratification, or re-weighing techniques
models          Overamplification  Synthetically match distributions; add outcome disparity to the cost function
embeddings      Semantic Bias      Use the above techniques and re-train embeddings
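For the overamplification row, "add outcome disparity to the cost function" can be realized as a penalty on the per-group gap in predictions during training. The PyTorch sketch below is a minimal illustration under assumptions (binary task, two attribute groups present in each batch), not the authors' implementation.

```python
# Minimal sketch (an assumption, not the authors' code): add an outcome-
# disparity penalty to a binary classification loss.
import torch
import torch.nn.functional as F

def loss_with_disparity_penalty(logits, y, group_mask, lam=1.0):
    """logits, y: model outputs and gold labels for a batch;
    group_mask: bool tensor marking one attribute value (each batch should
    contain both groups, otherwise the penalty is undefined);
    lam: weight of the disparity penalty."""
    task_loss = F.binary_cross_entropy_with_logits(logits, y.float())
    probs = torch.sigmoid(logits)
    disparity = (probs[group_mask].mean() - probs[~group_mask].mean()).abs()
    return task_loss + lam * disparity
```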
32. Takeaways
Bias, as outcome and error disparities, can result from many origins:
● the embedding model
● the feature sample
● the fitting process
● the outcome sample
See the paper!
● Survey of works and countermeasures for each origin
● Details on the predictive bias framework including mathematical definitions
This is version 0.1: we hope it inspires further work towards a unified understanding of bias and, ultimately, its mitigation!
33. References
1. Berk, R. A. (1983). An introduction to sample selection bias in sociological data. American Sociological Review, 386-398.
2. Feldman, M., Friedler, S. A., Moeller, J., Scheidegger, C., & Venkatasubramanian, S. (2015). Certifying and removing disparate impact. In Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 259-268). ACM.
3. Friedler, S. A., Scheidegger, C., & Venkatasubramanian, S. (2016). On the (im)possibility of fairness. arXiv preprint arXiv:1609.07236.
4. Baker, R., Brick, J. M., Bates, N. A., Battaglia, M., Couper, M. P., Dever, J. A., ... & Tourangeau, R. (2013). Summary report of the AAPOR task force on non-probability sampling. Journal of Survey Statistics and Methodology, 1(2), 90-143.
5. Hays, R. D., Liu, H., & Kapteyn, A. (2015). Use of internet panels to conduct surveys. Behavior Research Methods, 47(3), 685-690.
6. Hovy, D. (2015). Demographic factors improve classification performance. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers).
7. Sun, T., et al. (2019). Mitigating gender bias in natural language processing: Literature review. arXiv preprint arXiv:1906.08976.
34. Thank You
(dsshah, has)@cs.stonybrook.edu, dirk.hovy@unibocconi.edu
Deven Shah, H. Andrew Schwartz, and Dirk Hovy. 2020. Predictive Biases in Natural Language Processing Models: A Conceptual Framework and Overview. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics.
Human Language Analysis Beings
Editor's Notes
"We look at the predictions we're making and consider whether there is a difference or a disparity with what is desirable."
"Building on work from bias in health, we refer such a difference as a disparity. We then contend there are two types of disparities which cover most concerns for bias in NLP: ...."
Definitions also work with continuous distributions: show a continuous distribution
"but knowing that the model has a disparity usually doesn't provide a solution. Focusing only on the disparity is a bit like focusing only on the symptoms of a disease rather than the cause. "
"so let's work from the symptoms or disparities, backwards to consider possible origins in the bias. We have to look backwards in the model development process "
For instance, given a training set with woman in the kitchen 66% of the times and man in the kitchen 33% of the time, hence there is a difference in the outcome given the gender of the person. After training an NLP model on this dataset, and testing it on unseen data, the model predicts woman in the kitchen 84% of the times while man in the kitchen 16% of the times, hence amplifying a little bias which was already present it in the data
We also propose countermeasures to avoid each of the predictive biases based on it’s origins. For Label bias problem where the source is the annotation, we could probably re-retrain the annotators or use post-stratification techniques
For selection bias where the problem is data selection: we could use stratified sampling techniques or some reweighing or post-stratification techniques to avoid it.
For overamplification problem, where the source is the model, we could synthetically match the distributions based on human attributes with respect to the outcome variables and then pass it to the model for training or add outcome disparity to the cost function.
For semantic bias, where the source is biased embeddings, we could retrain the embeddings by taking the above techniques into consideration.