This document discusses bias in artificial intelligence and algorithms. It begins with an introduction to the topic and why it is important. It then explores how to detect bias through various fairness metrics and how to mitigate bias through preprocessing, inprocessing, and postprocessing techniques. The document provides examples of different sources of bias and strategies to address them. It also recommends resources like the AI Fairness 360 toolkit to help evaluate models for fairness and identify potential biases.
11.
“Scores like this — known as risk assessments — are increasingly common in courtrooms across the nation.
They are used to inform decisions about who can be set free at every stage of the criminal justice system, from
assigning bond amounts — as is the case in Fort Lauderdale — to even more fundamental decisions about
defendants’ freedom. In Arizona, Colorado, Delaware, Kentucky, Louisiana, Oklahoma, Virginia, Washington
and Wisconsin, the results of such assessments are given to judges during criminal sentencing.”
– Julia Angwin, Jeff Larson, Surya Mattu, and Lauren Kirchner, ProPublica, May 23, 2016
(https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing)
14. Analysis of COMPAS Recidivism Score
              ALL DEFENDANTS    BLACK DEFENDANTS    WHITE DEFENDANTS
              Low      High     Low      High       Low      High
Survived      2681     1282     990      805        1139     349
Recidivated   1216     2035     532      1369       461      505
15. Accurate for all groups
(COMPAS contingency table as on slide 14)
Accuracy: ALL 65.4%, BLACK 63.8%, WHITE 67.0%
“The company said it had devised the algorithm to achieve this goal. A test
that is correct in equal proportions for all groups cannot be biased, the
company said.”
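As a quick check on these numbers, here is a minimal plain-Python sketch (counts taken from the slide 14 table; a prediction counts as correct when a survived defendant was scored low risk or a recidivated defendant was scored high risk):

```python
# Counts from the COMPAS table on slide 14:
# (survived & low, survived & high, recidivated & low, recidivated & high)
groups = {
    "ALL":   (2681, 1282, 1216, 2035),
    "BLACK": (990, 805, 532, 1369),
    "WHITE": (1139, 349, 461, 505),
}

for name, (sl, sh, rl, rh) in groups.items():
    correct = sl + rh  # low risk & survived, plus high risk & recidivated
    total = sl + sh + rl + rh
    print(f"{name}: accuracy = {correct / total:.1%}")
# ALL: 65.4%, BLACK: 63.8%, WHITE: 67.0%
```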
16. Black defendants wrongly classified high risk more often
(COMPAS contingency table as on slide 14)
Error rate (survived, but classified high risk): ALL 32.3%, BLACK 44.8%, WHITE 23.5%
“Black defendants who do not recidivate were nearly twice as likely to be
classified by COMPAS as higher risk compared to their white counterparts.”
17. White defendants wrongly classified low risk more often
(COMPAS contingency table as on slide 14)
Error rate (recidivated, but classified low risk): ALL 37.4%, BLACK 28.0%, WHITE 47.7%
“The test tended to make the opposite mistake with whites, meaning that it was more likely to wrongly predict that white people would not commit additional crimes if released compared to black defendants.”
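The same counts yield both error rates; a minimal sketch (plain Python, counts from slide 14):

```python
# Counts from the COMPAS table on slide 14:
# (survived & low, survived & high, recidivated & low, recidivated & high)
groups = {
    "ALL":   (2681, 1282, 1216, 2035),
    "BLACK": (990, 805, 532, 1369),
    "WHITE": (1139, 349, 461, 505),
}

for name, (sl, sh, rl, rh) in groups.items():
    wrongly_high = sh / (sl + sh)  # survived but classified high risk (slide 16)
    wrongly_low = rl / (rl + rh)   # recidivated but classified low risk (slide 17)
    print(f"{name}: wrongly high = {wrongly_high:.1%}, wrongly low = {wrongly_low:.1%}")
# BLACK: wrongly high = 44.8%; WHITE: wrongly low = 47.7%
```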
18. Biased or Unbiased?
Predictive Parity (accuracy, %):  ALL 65.4   BLACK 63.8   WHITE 67.0
Disparate Impact (error rates, %):
  Wrongly classified high risk:   ALL 32.3   BLACK 44.8   WHITE 23.5
  Wrongly classified low risk:    ALL 37.4   BLACK 28.0   WHITE 47.7
19.
“Since blacks are re-arrested more often than
whites, is it possible to create a formula that is
equally predictive for all races without
disparities in who suffers the harm of incorrect
predictions?”
NO
20. Key finding
“If you have two populations
that have unequal base rates
then you can’t satisfy both
definitions of fairness at the
same time.”
(CC BY 2.0) Photo by Ivan Radic
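One way to see why, sketched numerically below: by Bayes' rule, FPR = p/(1-p) * (1-PPV)/PPV * TPR, where p is a group's base rate (the identity at the heart of Chouldechova's 2017 analysis). Hold PPV and TPR equal across groups, and unequal base rates force unequal false positive rates. The PPV and TPR values here are hypothetical; the base rates come from the slide 14 counts.

```python
# FPR = p / (1 - p) * (1 - PPV) / PPV * TPR, with base rate p.
def fpr(p, ppv, tpr):
    return p / (1 - p) * (1 - ppv) / ppv * tpr

ppv, tpr = 0.63, 0.68          # hypothetical values, held equal for both groups
p_black, p_white = 0.51, 0.39  # recidivism base rates from the slide 14 counts

print(f"FPR (black): {fpr(p_black, ppv, tpr):.1%}")  # ~41.6%
print(f"FPR (white): {fpr(p_white, ppv, tpr):.1%}")  # ~25.5%
# Equal PPV and TPR but unequal base rates -> the false positive rates must differ.
```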
23. Project Plan
• Human-centered approach
• Include multiple stakeholders
• Set fairness goals & metrics
• Audit the data for bias
• Consider potential sources of bias
• Consider primary, secondary, and tertiary layers of effect
• Determine additional data collection needs
• Plan mitigation strategies
• Plan monitoring strategies
24. Bias sneaks in…
“In 1952 the Boston Symphony Orchestra initiated blind auditions to help diversify its male-dominated roster, but trials still skewed heavily towards men.”
– Edd Gent, “Beware of the gaps in Big Data,” Sept. 12, 2016
26. Skewed sample
Detect potholes to allocate repair crews
“The system reported a disproportionate number of potholes in wealthier neighbourhoods. It turned out it was oversampling the younger, more affluent citizens who were digitally clued up enough to download and use the app in the first place.”
– David Wallace, “Big data has unconscious bias too,” Sept. 21, 2016
27. Tainted examples
Identify job candidates
“That is because Amazon’s computer models were trained to vet applicants by observing patterns in resumes submitted to the company over a 10-year period. Most came from men, a reflection of male dominance across the tech industry.”
– Jeffrey Dastin, “Amazon scraps secret AI recruiting tool that showed bias against women,” Oct. 9, 2018
28. Sample size disparity
“Assuming a fixed feature space, a classifier generally improves with the number of data points used to train it… The contrapositive is that less data leads to worse predictions.”
– Moritz Hardt, “How big data is unfair,” Sept. 26, 2014
29. Limited features
Diagnose heart disease
“recent research has shown that women who have heart disease experience different symptoms, causes and outcomes than men.”
“Women are often told their stress tests are normal or that they have ‘false positives.’ Bairey Merz says doctors should pay attention to symptoms such as chest pain and shortness of breath rather than relying on a stress test score.”
30. Proxies
Make college admissions decisions
“…certain rules in the system that weighed applicants on the basis of seemingly non-relevant factors, like place of birth and name.
[…] simply having a non-European name could automatically take 15 points off an applicant’s score. The commission also found that female applicants were docked three points, on average.”
– Oscar Schwartz, “Untold History of AI: Algorithmic Bias was Born in the 1980s,” April 15, 2019
36. Fairness Metrics
(COMPAS contingency table as on slide 14)
Statistical Parity Difference
Difference of the rate of favorable outcomes between the unprivileged and privileged groups.
Rate of favorable (low-risk) outcomes: BLACK 41.2%, WHITE 65.2%
Statistical Parity Difference = -24.0%
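A minimal plain-Python sketch of this metric from the slide 14 counts (favorable outcome = low risk; Black as the unprivileged group, White as the privileged group, following the slides):

```python
# Rate of the favorable (low risk) outcome per group, from the slide 14 counts.
p_unpriv = (990 + 532) / (990 + 805 + 532 + 1369)  # Black: 41.2%
p_priv = (1139 + 461) / (1139 + 349 + 461 + 505)   # White: 65.2%

spd = p_unpriv - p_priv
print(f"Statistical Parity Difference = {spd:+.1%}")  # -24.0%
```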
37. Fairness Metrics
(COMPAS contingency table as on slide 14)
Disparate Impact
Ratio of the rate of favorable outcomes between the unprivileged and privileged groups.
Rate of favorable (low-risk) outcomes: BLACK 41.2%, WHITE 65.2%
Disparate Impact = 63.2%
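Same rates, expressed as a ratio; a minimal sketch (plain Python):

```python
# Rate of the favorable (low risk) outcome per group, from the slide 14 counts.
p_unpriv = (990 + 532) / (990 + 805 + 532 + 1369)  # Black: 41.2%
p_priv = (1139 + 461) / (1139 + 349 + 461 + 505)   # White: 65.2%

di = p_unpriv / p_priv
print(f"Disparate Impact = {di:.1%}")  # 63.2%; the four-fifths rule flags ratios below 80%
```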
38. Fairness Metrics
(COMPAS contingency table as on slide 14)
Equal Opportunity Difference
Difference in true positive rates between the unprivileged and privileged groups.
True positive rate (survived, classified low risk): BLACK 55.2%, WHITE 76.5%
Equal Opportunity Difference = -21.4%
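A minimal sketch (plain Python, slide 14 counts; the true positive rate here is the share of survived defendants who were classified low risk):

```python
# True positive rate for the favorable outcome: P(classified low | survived).
tpr_unpriv = 990 / (990 + 805)   # Black: 55.2%
tpr_priv = 1139 / (1139 + 349)   # White: 76.5%

eod = tpr_unpriv - tpr_priv
print(f"Equal Opportunity Difference = {eod:+.1%}")  # -21.4%
```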
39. Fairness Metrics
(COMPAS contingency table as on slide 14)
Average Odds Difference
Average of the difference in false positive rates and true positive rates of favorable outcomes between the unprivileged and privileged groups.
Among defendants classified low risk: survived BLACK 65.0%, WHITE 71.2%; recidivated BLACK 35.0%, WHITE 28.8%
(Abs) Average Odds Diff = 6.1%
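For reference, a minimal sketch applying the stated definition directly to the slide 14 counts (favorable = low risk). Note that this direct computation gives about -20.6%; the 6.1% above instead tracks the survived/recidivated split within the low-risk group shown in the chart.

```python
# Average of the FPR difference and TPR difference (favorable = low risk).
tpr_unpriv = 990 / (990 + 805)   # P(low | survived), Black
tpr_priv = 1139 / (1139 + 349)   # P(low | survived), White
fpr_unpriv = 532 / (532 + 1369)  # P(low | recidivated), Black
fpr_priv = 461 / (461 + 505)     # P(low | recidivated), White

aod = ((fpr_unpriv - fpr_priv) + (tpr_unpriv - tpr_priv)) / 2
print(f"Average Odds Difference = {aod:+.1%}")  # about -20.6%
```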
40. Mitigate Bias
Pre-processing: mitigate bias in training data
• Disparate Impact Remover
• Learning Fair Representations
• Optimized Preprocessing
• Reweighing (see the sketch after this list)
In-processing: mitigate bias in classifiers
• Adversarial Debiasing
• ART Classifier
• Prejudice Remover
Post-processing: mitigate bias in predictions
• Calibrated Equalized Odds
• Equalized Odds
• Reject Option Classification
https://aif360.readthedocs.io/en/latest/modules/algorithms.html
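As a concrete example of one pre-processing algorithm, a minimal sketch of Reweighing with the AIF360 API; the tiny DataFrame and its column names are invented for illustration:

```python
import pandas as pd
from aif360.datasets import BinaryLabelDataset
from aif360.algorithms.preprocessing import Reweighing

# Toy data, invented for illustration: protected attribute, one feature, a label.
df = pd.DataFrame({
    "race":  [0, 0, 0, 0, 1, 1, 1, 1],  # 0 = unprivileged, 1 = privileged
    "score": [1, 3, 2, 4, 3, 5, 4, 6],
    "label": [0, 0, 0, 1, 0, 1, 1, 1],
})

dataset = BinaryLabelDataset(df=df, label_names=["label"],
                             protected_attribute_names=["race"],
                             favorable_label=1, unfavorable_label=0)

# Reweighing assigns instance weights so that, after weighting, the favorable
# outcome is independent of the protected attribute in the training data.
rw = Reweighing(unprivileged_groups=[{"race": 0}],
                privileged_groups=[{"race": 1}])
transformed = rw.fit_transform(dataset)
print(transformed.instance_weights)  # pass these as sample weights when training
```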
41.
“there is no [machine learning] model that works best for every problem”
Based on work by Wolpert (1996), “The Lack of A Priori Distinctions between Learning Algorithms”
Photo by Hermes Rivera on Unsplash
45. Benefits of post-processing @ LinkedIn
• Model agnostic (see the threshold sketch after this list)
• Robust to application-specific business logic
• Easier integration
• No significant modifications to existing components
• Complementary to pre-processing and in-processing efforts
https://engineering.linkedin.com/blog/2018/10/building-representative-talent-search-at-linkedin
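To illustrate the "model agnostic" point, a minimal sketch of threshold-style post-processing (plain Python; the scores, groups, and threshold values are invented for illustration, not LinkedIn's actual system):

```python
# Hypothetical post-processing: per-group decision thresholds applied to the
# scores of an already-trained model. Only the outputs are adjusted.
def decide(score: float, group: str, thresholds: dict) -> bool:
    """Return the favorable decision using the group's threshold."""
    return score >= thresholds[group]

# Thresholds would be tuned on held-out data to meet a chosen fairness metric.
thresholds = {"unprivileged": 0.45, "privileged": 0.55}

for score, group in [(0.50, "unprivileged"), (0.50, "privileged")]:
    print(group, decide(score, group, thresholds))
# The underlying model never changes, so this composes with any classifier.
```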
53. Learn more about the AI Fairness 360 toolkit
• GitHub: https://github.com/IBM/AIF360
• Docs: https://aif360.readthedocs.io/en/latest/
• Demo: http://aif360.mybluemix.net/
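A quick-start sketch with the toolkit's dataset metrics (install with `pip install aif360`; the toy DataFrame is the same invented one used in the Reweighing sketch):

```python
import pandas as pd
from aif360.datasets import BinaryLabelDataset
from aif360.metrics import BinaryLabelDatasetMetric

# Toy data, invented for illustration.
df = pd.DataFrame({
    "race":  [0, 0, 0, 0, 1, 1, 1, 1],
    "score": [1, 3, 2, 4, 3, 5, 4, 6],
    "label": [0, 0, 0, 1, 0, 1, 1, 1],
})
dataset = BinaryLabelDataset(df=df, label_names=["label"],
                             protected_attribute_names=["race"],
                             favorable_label=1, unfavorable_label=0)

metric = BinaryLabelDatasetMetric(dataset,
                                  unprivileged_groups=[{"race": 0}],
                                  privileged_groups=[{"race": 1}])
print("Statistical parity difference:", metric.statistical_parity_difference())
print("Disparate impact:", metric.disparate_impact())
```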
55. Other resources
• What-If Tool
An interactive visual interface for probing model behavior, including fairness analysis
https://pair-code.github.io/what-if-tool/
• Ethics & Algorithms Toolkit
A risk management framework for governments (and others, too)
http://ethicstoolkit.ai/
57. Thank you!
Zeydy Ortiz, Ph.D.
CEO & Chief Data Scientist
zortiz @ datacrunchlab.com
www.linkedin.com/in/zortiz
58. Our Profile
DataCrunch Lab, LLC is an IT professional services and technology development
organization providing customized solutions that leverage data science, machine
learning, and artificial intelligence. Our solutions transform data into value
resulting in increased operational efficiency, reduced costs, and improved
outcomes.
Our Focus
We create value by uncovering needs, co-developing the best solution,
supporting customers through implementation, and ensuring the solution is
successful. As an IBM Business Partner, we leverage the Data & AI portfolio of
products in our solutions. We serve customers across many industries including IT,
legal, retail, financial, and industrial/manufacturing sectors.
Our Success
We have 20+ years of professional experience with the ability to architect, design
and develop highly complex IT solutions focused on business outcomes. Some of
the issues we have addressed include:
• Informed business strategy - what products to produce, what features to
include, and for which markets - with predictive modeling and simulation
• Increased visibility of real-time operations by integrating multiple technologies -
messaging, natural language understanding, chatbots, data visualization, cloud
computing
• Increased operational efficiency by predicting customer churn
• Reduced IT costs by predicting process duration to efficiently allocate resources
60. References
❖ “Machine Bias,” ProPublica: https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing
❖ “How We Analyzed the COMPAS Recidivism Algorithm,” ProPublica: https://www.propublica.org/article/how-we-analyzed-the-compas-recidivism-algorithm
❖ “Bias in Criminal Risk Scores Is Mathematically Inevitable, Researchers Say,” ProPublica: https://www.propublica.org/article/bias-in-criminal-risk-scores-is-mathematically-inevitable-researchers-say
❖ “Beware of the gaps in Big Data”: https://eandt.theiet.org/content/articles/2016/09/beware-of-the-gaps-in-big-data/
❖ “Big data has unconscious bias too”: https://thesumisgreater.wordpress.com/2016/09/21/big-data-has-unconscious-bias-too/
❖ “Amazon scraps secret AI recruiting tool that showed bias against women”: https://www.reuters.com/article/us-amazon-com-jobs-automation-insight/amazon-scraps-secret-ai-recruiting-tool-that-showed-bias-against-women-idUSKCN1MK08G
❖ “How big data is unfair”: https://medium.com/@mrtz/how-big-data-is-unfair-9aa544d739de
❖ “When it comes to heart disease, women and men are not equal”: https://www.cedars-sinai.edu/About-Us/HH-Landing-Pages/When-it-comes-to-heart-disease-women-and-men-are-not-equal.aspx
❖ “Untold History of AI: Algorithmic Bias was Born in the 1980s”: https://spectrum.ieee.org/tech-talk/tech-history/dawn-of-electronics/untold-history-of-ai-the-birth-of-machine-bias
❖ AI Fairness 360 Toolkit: http://aif360.mybluemix.net
❖ “Certifying and removing disparate impact”: https://arxiv.org/abs/1412.3756
❖ “Mitigating Unwanted Biases with Adversarial Learning”: https://arxiv.org/pdf/1801.07593
❖ “On Fairness and Calibration”: https://arxiv.org/abs/1709.02012
❖ “Building Representative Talent Search at LinkedIn”: https://engineering.linkedin.com/blog/2018/10/building-representative-talent-search-at-linkedin
❖ IBM Watson OpenScale: https://console.bluemix.net/catalog/services/ai-openscale