Engineering Ethics: Practicing Fairness
Clare Corthell
@clarecorthell
clare@luminantdata.com
Data Science and Machine Learning Consulting
gatekeepers of critical life decisions
• getting help with homework
• going to college
• buying a car
• getting a mortgage
• getting sentenced in prison
• getting hired
• keeping a job
one of our biggest problems?
unfairness of prediction.
*Yes, I will somewhat controversially use “prediction” to refer to both predicting values (regression) and
predicting class labels (classification); many methods and scenarios here do not apply equivalently to both.
define fairness
Dwork, et al:
similar people should be treated similarly
dissimilar people should be treated dissimilarly
for our technical purposes, we define the subjective societal value of fairness as:
ex: if two people drive similarly, they should receive similar insurance terms
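The Dwork et al. definition above can be sketched as a Lipschitz-style check: the gap between two people's outcomes should be bounded by how dissimilar they are under a task-specific metric. This is a minimal toy illustration with invented names and data, not the paper's formal construction:

```python
def is_individually_fair(f, d, individuals, tolerance=0.1):
    """Check |f(x) - f(y)| <= d(x, y) + tolerance for every pair:
    similar people must receive similar outcomes."""
    for i, x in enumerate(individuals):
        for y in individuals[i + 1:]:
            if abs(f(x) - f(y)) > d(x, y) + tolerance:
                return False
    return True

# Toy example: insurance premium as a function of a driving-risk score.
premium = lambda driver: 0.5 * driver["risk"]
similarity = lambda a, b: abs(a["risk"] - b["risk"])  # task-specific metric

drivers = [{"risk": 0.2}, {"risk": 0.25}, {"risk": 0.8}]
print(is_individually_fair(premium, similarity, drivers))  # True: premiums track risk
```

A predictor that ignores driving and keys on something else (say, zip code) would violate the check: two equally risky drivers would get very different premiums.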
— Abe Gong, Data Scientist
“Powerful algorithms can be harmful and unfair,
even when they’re unbiased in a strictly technical sense.”
3 examples of unfair outcomes
Character Testing & Disability Discrimination①
“Good intent or absence of discriminatory intent does not redeem employment procedures
or testing mechanisms that operate as 'built-in headwinds' for minority groups”
— Warren Burger, Chief Justice, Griggs v. Duke Power Company, 1971
It is illegal to hire employees based on:
• intrinsic traits like ethnicity or gender (Equal Employment Opportunity Commission, 1965)
• disability (Americans with Disabilities Act, 1990)
• intelligence quotient or “IQ” (Griggs v. Duke Power Company, 1971)
①
In the US, 60-70% of job candidates currently undergo character testing, which
is unregulated outside of the aforementioned laws. These tests screen candidates
for things like “commuting time” and “agreeableness,” raising issues of both
redlining and disability discrimination. Problematically, there is little proof that these
tests do not constitute fresh “built-in headwinds” for minority groups, posing a
problem for both employers and employees.
Google’s people operations recently revealed that characteristics like GPA did
not predict whether an employee would perform well. This indicates that even
customary industry practices may not be strongly correlated with the ground truth
they intend to predict: employability, performance, and retention.
Character Testing & Disability Discrimination
"Data analytics have the potential to eclipse longstanding civil rights protections in
how personal information is used in housing, credit, employment, health, education,
and the marketplace”
— White House Report “Big Data: Seizing Opportunities, Preserving Values”
② Insurance Premiums
In the US, banks did not lend within blocks where African-Americans lived, a practice
called “redlining,” until it became illegal under the Fair Housing Act of 1968.
Standard practices like behavioral segmentation are used to “steer” consumers
to less favorable terms based on behavior unrelated to their creditworthiness.
These practices are unfair and threaten the principles of the Fair Housing Act.
Future Startup Founders
A decision tree classifier was trained on a set of (seemingly meritocratic) features,
then used to predict who might start a company:
• College Education
• Computer Science major
• Years of experience
• Last position title
• Approximate age
• Work experience in venture backed company
③
the “meritocratic” approach does not work
because protected characteristics are
redundantly encoded
Characteristics like gender, race, or ability are often correlated
with a combination of multiple other features.
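A tiny synthetic illustration of redundant encoding: even with the protected column removed from the features, a trivial rule over the remaining “neutral” features recovers it. The data and the rule are invented for illustration only:

```python
# Each record holds "neutral" features; the protected group label is kept
# only for scoring, never given to the rule.
records = [
    # (zip_code, shops_online, commute_min) -> protected group (held out)
    (("94110", 1, 15), "A"),
    (("94110", 1, 20), "A"),
    (("94601", 0, 55), "B"),
    (("94601", 0, 60), "B"),
]

def proxy_rule(features):
    """Guess the protected group using only the 'neutral' features."""
    zip_code, shops_online, commute = features
    return "A" if commute < 40 else "B"

accuracy = sum(proxy_rule(f) == g for f, g in records) / len(records)
print(accuracy)  # 1.0 on this toy data: blindness did not remove the signal
```

A model trained on these features can therefore learn the protected attribute implicitly, which is why simply dropping the column is not a fix.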
blindness is not the answer
race-blind, need-blind, able-blind, etc
0. data
1. black box
2. scale
3. impact
Problems
0. biased data
• data collected at scale from people’s past decisions naturally encodes social bias,
and models will learn that unfairness
• data is dirty and often simply wrong
• data at scale often encodes protected characteristics like race, ability, and health
markers
• restricted input options (“menu-driven identity” mistakes) create worthless or dirty data
• no ground truth to test our assumptions against
• big data is usually not big data for protected classes. Less data for the protected
class means bigger error bars and worse predictions
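The last bullet can be made concrete with the standard error of a sample proportion, which shrinks like 1/sqrt(n): a protected class with far fewer records gets a much wider uncertainty interval around any estimated rate. The sample sizes below are illustrative:

```python
import math

def std_error(p, n):
    """Standard error of a sample proportion p estimated from n records."""
    return math.sqrt(p * (1 - p) / n)

majority_se = std_error(0.5, 100_000)  # plentiful data
minority_se = std_error(0.5, 400)      # scarce data for the protected class
print(round(minority_se / majority_se))  # 16: roughly 16x wider error bars
```

Same model, same pipeline, but the group with less data gets systematically less reliable predictions.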
1. black box
• many machine learning systems are not inspectable, because of
high dimensionality, hidden layer relationships, etc
• there are limits to what data scientists understand about how their
models are learning, because they (probably) didn’t build them
• data scientists make choices — hypotheses, premise, training data
selection, processing, outlier exclusion, etc.
- Cathy O’Neil, Weapons of Math Destruction
“Our own values and desires influence our choices,
from the data we choose to collect to the questions we ask.
Models are opinions embedded in mathematics.”
2. scale
• modeled decisions scale exponentially, where human decisions scale linearly
• faster
• centralized
3. impact
unfair outcomes often result when specific biases in the data are left
unexamined, which is especially problematic because:
• no user feedback — people do not have personal interactions with
decision-makers or recourse
biased data + black box + scale =
invisible feedback loops
critical decisions are now in the
hands of a model and its designer
instead of trained people
often a “data scientist”
solutions
define fairness
Dwork, et al:
similar people should be treated similarly
dissimilar people should be treated dissimilarly
for our technical purposes, we define the subjective societal value of fairness as:
solutions: constructing fairness
• data scientists must construct fairness explicitly (Dwork et al)
• fairness is task-specific, requiring:
• development of context-specific non-blind fairness metrics that utilize
protected class attributes (eg gender, race, ability, etc)
• development of context-specific individual similarity metric that is as
close as possible to the ground truth or best approximation (ex:
measure of how well someone drives to test fairness of insurance terms)
• historical context has bearing on impact (ex: until 1968, African-Americans
were often denied insurance and loans, which has downstream effects)
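One way to sketch a non-blind, context-specific fairness check is a per-group outcome comparison that explicitly uses the protected attribute. The approval-rate gap below is one common choice among many possible metrics, and the data is invented:

```python
def approval_rate_gap(decisions):
    """decisions: list of (protected_group, approved) pairs.
    Returns the largest gap in approval rate between any two groups."""
    totals, approvals = {}, {}
    for group, approved in decisions:
        totals[group] = totals.get(group, 0) + 1
        approvals[group] = approvals.get(group, 0) + int(approved)
    rates = [approvals[g] / totals[g] for g in totals]
    return max(rates) - min(rates)

# Toy loan decisions: group A is approved twice as often as group B.
loans = [("A", True), ("A", True), ("A", False),
         ("B", True), ("B", False), ("B", False)]
print(approval_rate_gap(loans))  # ~0.33
```

Note this only becomes a *fairness* metric once paired with the context-specific similarity question: are the two groups actually similar on the ground truth (e.g., creditworthiness) the decision is supposed to track?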
solutions: tools and design
• inspectability tools to better inspect the whole stack — from
training data to preprocessing algorithms to learned models
• data scientists making critical decisions should validate and check
assumptions with others
• better user research: investigate error cases, not just error rates
• better experience design: user outcome feedback systems allow
users to help you help them surface and correct bad predictions
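“Investigate error cases, not just error rates” can be sketched as reporting who the errors fall on, not only how many there are. This toy report (invented model and data) breaks the error rate down per protected group, so errors concentrated on one group become visible:

```python
def error_report(predict, examples):
    """examples: list of (features, group, label).
    Returns (overall error rate, per-group error rates)."""
    errors, group_counts, group_errors = 0, {}, {}
    for features, group, label in examples:
        wrong = predict(features) != label
        errors += wrong
        group_counts[group] = group_counts.get(group, 0) + 1
        group_errors[group] = group_errors.get(group, 0) + int(wrong)
    overall = errors / len(examples)
    per_group = {g: group_errors[g] / group_counts[g] for g in group_counts}
    return overall, per_group

# Toy model that is always right on group A and half wrong on group B:
predict = lambda x: 1
examples = [((0,), "A", 1), ((1,), "A", 1), ((2,), "B", 0), ((3,), "B", 1)]
print(error_report(predict, examples))  # (0.25, {'A': 0.0, 'B': 0.5})
```

An aggregate error rate of 25% looks acceptable; the breakdown shows all of it lands on group B.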
Why be fair?
sticks & carrots
why be fair?
sticks
• treating people differently based on their innate or protected characteristics
is wrong and illegal
• adversarial actors exploit proxy measures; people will learn how to
game the system
• unfair predictions leave money on the table; not lending to someone who is
falsely predicted to be a higher risk is a missed opportunity
• being unfair begets bad press and accelerates regulation
• consumers dislike unfair companies, much more than they dislike
companies that fail to preserve their privacy
why be fair?
carrots
• doing good business - there are missed opportunities in not lending to
hard-working people, in not funding atypical founders, in not hiring people
who think differently and bring new value
• if industry is able to build proof of fair practices prior to regulation, industry
might preempt and limit regulation with its own preferred fairness proofs
• we can stop limiting who people can become by intervening in the
self-defeating feedback loop
• when we centralize control, it presents a unique opportunity to correct
biases
a paradigm change is an opportune moment
we’re at a special moment when
decisions are being centralized,
from distributed groups of people
to central computational decision-making,
which gives us the opportunity and responsibility
to correct socially endemic biases
for the benefit of both society and business
bottom line —
it is the professional responsibility of every
data scientist to ensure fairness in the
interest of both their business and society
#EthicalAlgorithms
Data Science Practitioner group in San Francisco, hosted by The Design
Guild, with the goal of discussing and actively creating fairness:
• Ethics Peer Reviews
• Forum on Fairness and Privacy in Data Science
(talks with Data Scientists, Ethics Consultants, Academics, etc)
• Constructing a Professional Responsibility Manifesto for Data Scientists
Thank You
@clarecorthell
clare@luminantdata.com
Data Science and Machine Learning Consulting
references
Academic
• “Fairness Through Awareness” Dwork, et al. 2011.
• “Algorithmic Transparency via Quantitative Input Influence: Theory and Experiments with Learning Systems” Datta, et al. 2016.
Reports
• “Big Data: Seizing Opportunities, Preserving Values” The White House, 2014
• “Will you care when you pay more? The negative side of targeted promotions” Tsai, 2015
Books
• Weapons of Math Destruction, Cathy O’Neil, 2016
• Cybertypes: Race, Ethnicity, and Identity on the Internet, Lisa Nakamura, 2002. (defines “menu-driven identities”)
Blog Posts
• Ethics for powerful algorithms, Abe Gong, 2016

Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
BookNet Canada
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
Laura Byrne
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
Kari Kakkonen
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Paige Cruz
 
Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
Dorra BARTAGUIZ
 
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
James Anderson
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance
 
Free Complete Python - A step towards Data Science
Free Complete Python - A step towards Data ScienceFree Complete Python - A step towards Data Science
Free Complete Python - A step towards Data Science
RinaMondal9
 

Recently uploaded (20)

PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
 
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
 
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdfUni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdf
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
 
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
 
By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
 
Microsoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdfMicrosoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdf
 
UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5
 
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionGenerative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to Production
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
 
Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
 
Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
 
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
 
Free Complete Python - A step towards Data Science
Free Complete Python - A step towards Data ScienceFree Complete Python - A step towards Data Science
Free Complete Python - A step towards Data Science
 

Engineering Ethics: Practicing Fairness

  • 1. Engineering Ethics: Practicing Fairness Clare Corthell @clarecorthell clare@luminantdata.com Data Science and Machine Learning Consulting
  • 2. gatekeepers of critical life decisions
    • getting help with homework
    • going to college
    • buying a car
    • getting a mortgage
    • getting sentenced in prison
    • getting hired
    • keeping a job
  • 4. one of our biggest problems? unfairness of prediction. *Yes, I will somewhat controversially use “prediction” to refer to both predicting values and predicting class labels (classification); many methods and scenarios here do not apply equivalently to both.
  • 5. define fairness
    for our technical purposes, we define the subjective societal value of fairness as (Dwork, et al):
    similar people should be treated similarly
    dissimilar people should be treated dissimilarly
    ex: if two people drive similarly, they should receive similar insurance terms
  • 6. “Powerful algorithms can be harmful and unfair, even when they’re unbiased in a strictly technical sense.” — Abe Gong, Data Scientist
  • 7. 3 examples of unfair outcomes
  • 8. Character Testing & Disability Discrimination①
    “Good intent or absence of discriminatory intent does not redeem employment procedures or testing mechanisms that operate as 'built-in headwinds' for minority groups” — Warren Burger, Chief Justice, Griggs v. Duke Power Company, 1971
    It is illegal to hire employees based on:
    • intrinsic traits like ethnicity or gender (Equal Employment Opportunity Commission, 1965)
    • disability (Americans with Disabilities Act, 1990)
    • intelligence quotient or “IQ” (Griggs v. Duke Power Company, 1971)
  • 9. Character Testing & Disability Discrimination ① In the US, 60-70% of job candidates currently undergo character testing, which is unregulated outside of the aforementioned laws. These tests screen candidates for things like “commuting time” and “agreeableness,” presenting issues of redlining and disability discrimination. Problematically, there is little proof that these tests do not constitute fresh “built-in headwinds” for minority groups, and in turn a problem for both employers and employees. Google’s people operations recently found that characteristics like GPA did not predict whether an employee would perform well. This indicates that even customary industry practices may not be strongly correlated with the ground truth they intend to predict, particularly employability, performance, and retention.
  • 10. ② Insurance Premiums
    “Data analytics have the potential to eclipse longstanding civil rights protections in how personal information is used in housing, credit, employment, health, education, and the marketplace” — White House Report, “Big Data: Seizing Opportunities, Preserving Values”
    In the US, banks did not lend within blocks where African-Americans lived, called “redlining,” until it became illegal through the Fair Housing Act of 1968. Standard practices like behavioral segmentation are used to “steer” consumers to less favorable terms based on behavior unrelated to their creditworthiness. These practices are unfair and threaten the principles of the Fair Housing Act.
  • 11. Future Startup Founders A decision tree classifier was trained on a set of (seemingly meritocratic) features, then used to predict who might start a company: • College Education • Computer Science major • Years of experience • Last position title • Approximate age • Work experience in venture backed company ③
  • 12. the “meritocratic” approach does not work because protected characteristics are redundantly encoded Characteristics like gender, race, or ability are often correlated with a combination of multiple other features.
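The redundant-encoding point can be made concrete with a tiny sketch. The records and the zip-code rule below are entirely fabricated for illustration: a "blind" model that never sees the protected attribute can still recover it from a correlated feature.

```python
# Hypothetical data: zip_code correlates with protected attribute `group`,
# so dropping `group` from the feature set does not remove the signal.
records = [
    # (zip_code, years_experience, group)
    ("94110", 5, "A"), ("94110", 3, "A"), ("94110", 7, "A"),
    ("94112", 4, "A"), ("94601", 6, "B"), ("94601", 2, "B"),
    ("94603", 5, "B"), ("94603", 8, "B"),
]

def predict_group_from_zip(zip_code):
    """A trivial 'model' that never sees `group`, only zip_code."""
    return "A" if zip_code.startswith("941") else "B"

correct = sum(predict_group_from_zip(z) == g for z, _, g in records)
accuracy = correct / len(records)
print(f"group recovered from zip alone: {accuracy:.0%} accuracy")  # → 100%
```

Any model trained on zip code here is implicitly trained on group membership, which is why simply deleting the protected column ("blindness") fails.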
  • 13. blindness is not the answer race-blind, need-blind, able-blind, etc
  • 14. 0. data 1. black box 2. scale 3. impact Problems
  • 15. 0. biased data
    • data collected at scale from people’s past decisions is naturally socially biased, and models will learn that unfairness
    • data is dirty and often simply wrong
    • data at scale often encodes protected characteristics like race, ability, and health markers
    • restricted options, or “menu-driven identity” mistakes, create worthless or dirty data
    • no ground truth to test our assumptions against
    • big data is usually not big data for protected classes. Less data for the protected class means bigger error bars and worse predictions
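The last bullet on slide 15 can be quantified with a minimal sketch (the group sizes and the 20% rate are hypothetical): the normal-approximation margin of error for an estimated proportion shrinks with the square root of the sample size, so a small protected class gets much wider error bars for the same underlying rate.

```python
import math

def margin_of_error(p, n, z=1.96):
    """95% normal-approximation margin for an estimated proportion p on n samples."""
    return z * math.sqrt(p * (1 - p) / n)

p = 0.2  # same estimated rate in both groups (illustrative)
majority, minority = 100_000, 500
print(f"majority group: ±{margin_of_error(p, majority):.4f}")
print(f"minority group: ±{margin_of_error(p, minority):.4f}")
```

With 200x less data, the minority estimate's margin is roughly 14x wider, so any threshold applied uniformly to both groups is far noisier for the protected class.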
  • 16. 1. black box
    • many machine learning systems are not inspectable, because of high dimensionality, hidden layer relationships, etc
    • there are limits to what data scientists understand about how their models are learning, because they (probably) didn’t build them
    • data scientists make choices — hypotheses, premises, training data selection, processing, outlier exclusion, etc.
  • 17. “Our own values and desires influence our choices, from the data we choose to collect to the questions we ask. Models are opinions embedded in mathematics.” — Cathy O’Neil, Weapons of Math Destruction
  • 18. 2. scale
    • modeled decisions are exponentially scalable compared to linear human decisions
    • faster
    • centralized
  • 19. 3. impact
    unfair outcomes often result when specific biases of the data are left unexamined, especially problematic because:
    • no user feedback — people do not have personal interactions with decision-makers, or recourse
  • 20. biased data + black box + scale = invisible feedback loops
  • 21. critical decisions are now in the hands of a model and its designer instead of trained people often a “data scientist”
  • 23. define fairness
    for our technical purposes, we define the subjective societal value of fairness as (Dwork, et al):
    similar people should be treated similarly
    dissimilar people should be treated dissimilarly
  • 24. solutions: constructing fairness
    • data scientists must construct fairness explicitly (Dwork et al)
    • fairness is task-specific, requiring:
      • development of context-specific, non-blind fairness metrics that utilize protected class attributes (e.g. gender, race, ability)
      • development of a context-specific individual similarity metric that is as close as possible to the ground truth, or the best approximation of it (ex: a measure of how well someone drives, to test the fairness of insurance terms)
    • historical context has bearing on impact (ex: until 1968, African-Americans were often denied insurance and loans, which has downstream effects)
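Dwork et al.'s individual-fairness condition can be sketched directly: the model's outputs should differ no more than the individuals do under the task-specific similarity metric, i.e. |f(x) − f(y)| ≤ d(x, y). The driving metric, premiums, and scaling below are illustrative placeholders, not a real actuarial model.

```python
def driving_distance(a, b):
    """Task-specific similarity: how differently two people drive (0 = identical).
    The /10 scaling is an arbitrary illustrative choice."""
    return abs(a["hard_brakes"] - b["hard_brakes"]) / 10.0

def is_individually_fair(score, distance, pairs):
    """Check the Lipschitz condition |score(x) - score(y)| <= distance(x, y)
    over a sample of pairs of individuals."""
    return all(abs(score(x) - score(y)) <= distance(x, y) for x, y in pairs)

drivers = [{"hard_brakes": 2, "premium": 0.30},
           {"hard_brakes": 3, "premium": 0.35},
           {"hard_brakes": 9, "premium": 0.90}]
score = lambda d: d["premium"]
pairs = [(drivers[0], drivers[1]), (drivers[0], drivers[2]), (drivers[1], drivers[2])]
print(is_individually_fair(score, driving_distance, pairs))
```

The hard part in practice is exactly what the slide says: choosing `distance` so it reflects the ground truth of the task rather than a proxy that smuggles protected attributes back in.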
  • 25. solutions: tools and design
    • inspectability: tools to better inspect the whole stack — from training data to preprocessing algorithms to learned models
    • data scientists making critical decisions should validate and check assumptions with others
    • better user research: investigate error cases, not just error rates
    • better experience design: user outcome feedback systems allow users to help you surface and correct bad predictions
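One way to act on "investigate error cases, not just error rates" is to disaggregate the error rate by group. The labels and predictions below are fabricated to show how a modest-looking overall error can hide a severe per-group disparity.

```python
from collections import defaultdict

# Made-up outcomes: a single aggregate error rate hides where errors fall.
y_true = ["good", "good", "bad", "good", "bad", "good", "good", "bad"]
y_pred = ["good", "good", "bad", "bad",  "bad", "good", "bad",  "good"]
group  = ["A",    "A",    "A",   "B",    "B",   "A",    "B",    "B"]

errors = defaultdict(lambda: [0, 0])  # group -> [error count, total count]
for t, p, g in zip(y_true, y_pred, group):
    errors[g][0] += int(t != p)
    errors[g][1] += 1

overall = sum(e for e, _ in errors.values()) / len(y_true)
per_group = {g: e / n for g, (e, n) in errors.items()}
print(f"overall error: {overall:.2f}, per group: {per_group}")
```

Here the overall error is 0.375, but every error falls on group B (0.75 vs 0.00), which an aggregate metric would never surface.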
  • 26. Why be fair? sticks & carrots
  • 27. why be fair? sticks
    • treating people differently based on their innate or protected characteristics is wrong and illegal
    • adversarial learning exploits proxy measures: people will learn how to game the system
    • unfair predictions leave money on the table; not lending to someone who is falsely predicted to be a higher risk is a missed opportunity
    • being unfair begets bad press and accelerates regulation
    • consumers dislike unfair companies, much more than they dislike companies that fail to preserve their privacy
  • 28. why be fair? carrots
    • doing good business: there are missed opportunities in not lending to hard-working people, in not funding atypical founders, in not hiring people who think differently and bring new value
    • if industry can build proof of fair practices prior to regulation, it may preempt and limit regulation with its own preferred fairness proofs
    • we can stop limiting who people can become by intervening in the self-defeating feedback loop
    • centralizing control presents a unique opportunity to correct biases
  • 29. a paradigm change is an opportune moment
  • 30. we’re at a special moment when decisions are being centralized, from distributed groups of people to central computational decision-making, which gives us the opportunity and responsibility to correct socially endemic biases for the benefit of both society and business
  • 31. bottom line — it is the professional responsibility of every data scientist to ensure fairness in the interest of both their business and society
  • 32. #EthicalAlgorithms
    Data Science Practitioner group in San Francisco, hosted by The Design Guild, with the goal of discussing and actively creating fairness:
    • Ethics Peer Reviews
    • Forum on Fairness and Privacy in Data Science (talks with Data Scientists, Ethics Consultants, Academics, etc)
    • Constructing a Professional Responsibility Manifesto for Data Scientists
  • 34. references
    Academic
    • “Fairness Through Awareness” Dwork, et al. 2011.
    • “Algorithmic Transparency via Quantitative Input Influence: Theory and Experiments with Learning Systems” Datta, et al.
    Reports
    • “Big Data: Seizing Opportunities, Preserving Values” The White House, 2014
    • “Will you care when you pay more? The negative side of targeted promotions” Tsai, 2015
    Books
    • Weapons of Math Destruction, Cathy O’Neil, 2016
    • Cybertypes: Race, Ethnicity, and Identity on the Internet, Lisa Nakamura, 2002 (defines “menu-driven identities”)
    Blog Posts
    • Ethics for powerful algorithms, Abe Gong, 2016