Engineering Ethics: Practicing Fairness
Clare Corthell
@clarecorthell
clare@luminantdata.com
Data Science and Machine Learning Consulting
gatekeepers of critical life decisions
• getting help with homework
• going to college
• buying a car
• getting a mortgage
• getting sentenced in prison
• getting hired
• keeping a job
one of our biggest problems?
unfairness of prediction.
*Yes, I will somewhat controversially use “prediction” to refer to both predicting values (regression) and
predicting class labels (classification); many methods and scenarios here do not apply equivalently to both.
define fairness
Dwork, et al:
similar people should be treated similarly
dissimilar people should be treated dissimilarly
for our technical purposes, we define the subjective societal value of fairness as:
ex: if two people drive similarly, they should receive similar insurance terms
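The Dwork et al. definition above can be sketched as a Lipschitz-style check: the gap between two people's outcomes should be bounded by how dissimilar they are under a task-specific metric. This is a minimal toy illustration with invented names and data, not the paper's formal construction:

```python
def is_individually_fair(f, d, individuals, tolerance=0.1):
    """Check |f(x) - f(y)| <= d(x, y) + tolerance for every pair:
    similar people must receive similar outcomes."""
    for i, x in enumerate(individuals):
        for y in individuals[i + 1:]:
            if abs(f(x) - f(y)) > d(x, y) + tolerance:
                return False
    return True

# Toy example: insurance premium as a function of a driving-risk score.
premium = lambda driver: 0.5 * driver["risk"]
similarity = lambda a, b: abs(a["risk"] - b["risk"])  # task-specific metric

drivers = [{"risk": 0.2}, {"risk": 0.25}, {"risk": 0.8}]
print(is_individually_fair(premium, similarity, drivers))  # True: premiums track risk
```

A predictor that ignores driving and keys on something else (say, zip code) would violate the check: two equally risky drivers would get very different premiums.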
— Abe Gong, Data Scientist
“Powerful algorithms can be harmful and unfair,
even when they’re unbiased in a strictly technical sense.”
3 examples of unfair outcomes
Character Testing & Disability Discrimination①
“Good intent or absence of discriminatory intent does not redeem employment procedures
or testing mechanisms that operate as 'built-in headwinds' for minority groups”
— Warren Burger, Chief Justice, Griggs v. Duke Power Company, 1971
It is illegal to hire employees based on:
• intrinsic traits like ethnicity or gender (Equal Employment Opportunity Commission, 1965)
• disability (Americans with Disabilities Act, 1990)
• intelligence quotient or “IQ” (Griggs v. Duke Power Company, 1971)
①
In the US, 60-70% of job candidates currently undergo character testing, which
is unregulated outside of the aforementioned laws. These tests screen candidates
for things like “commuting time” and “agreeableness,” raising issues of both
redlining and disability discrimination. Problematically, there is little proof that these
tests do not constitute fresh “built-in headwinds” for minority groups, posing a
problem for both employers and employees.
Google’s people operations recently revealed that characteristics like GPA did
not predict whether an employee would perform well. This indicates that even
customary industry practices may not be strongly correlated with the ground truth
they intend to predict: employability, performance, and retention.
Character Testing & Disability Discrimination
"Data analytics have the potential to eclipse longstanding civil rights protections in
how personal information is used in housing, credit, employment, health, education,
and the marketplace”
— White House Report “Big Data: Seizing Opportunities, Preserving Values”
② Insurance Premiums
In the US, banks did not lend within blocks where African-Americans lived, a practice
called “redlining,” until it became illegal under the Fair Housing Act of 1968.
Standard practices like behavioral segmentation are used to “steer” consumers
to less favorable terms based on behavior unrelated to their creditworthiness.
These practices are unfair and threaten the principles of the Fair Housing Act.
Future Startup Founders
A decision tree classifier was trained on a set of (seemingly meritocratic) features,
then used to predict who might start a company:
• College Education
• Computer Science major
• Years of experience
• Last position title
• Approximate age
• Work experience in venture backed company
③
the “meritocratic” approach does not work
because protected characteristics are
redundantly encoded
Characteristics like gender, race, or ability are often correlated
with a combination of multiple other features.
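A tiny synthetic illustration of redundant encoding: even with the protected column removed from the features, a trivial rule over the remaining “neutral” features recovers it. The data and the rule are invented for illustration only:

```python
# Each record holds "neutral" features; the protected group label is kept
# only for scoring, never given to the rule.
records = [
    # (zip_code, shops_online, commute_min) -> protected group (held out)
    (("94110", 1, 15), "A"),
    (("94110", 1, 20), "A"),
    (("94601", 0, 55), "B"),
    (("94601", 0, 60), "B"),
]

def proxy_rule(features):
    """Guess the protected group using only the 'neutral' features."""
    zip_code, shops_online, commute = features
    return "A" if commute < 40 else "B"

accuracy = sum(proxy_rule(f) == g for f, g in records) / len(records)
print(accuracy)  # 1.0 on this toy data: blindness did not remove the signal
```

A model trained on these features can therefore learn the protected attribute implicitly, which is why simply dropping the column is not a fix.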
blindness is not the answer
race-blind, need-blind, able-blind, etc
0. data
1. black box
2. scale
3. impact
Problems
0. biased data
• data collected at scale from people’s past decisions naturally encodes social bias,
and models will learn that unfairness
• data is dirty and often simply wrong
• data at scale often encodes protected characteristics like race, ability, and health
markers
• restricted input options (“menu-driven identity” mistakes) create worthless or dirty data
• no ground truth to test our assumptions against
• big data is usually not big data for protected classes. Less data for the protected
class means bigger error bars and worse predictions
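The last bullet can be made concrete with the standard error of a sample proportion, which shrinks like 1/sqrt(n): a protected class with far fewer records gets a much wider uncertainty interval around any estimated rate. The sample sizes below are illustrative:

```python
import math

def std_error(p, n):
    """Standard error of a sample proportion p estimated from n records."""
    return math.sqrt(p * (1 - p) / n)

majority_se = std_error(0.5, 100_000)  # plentiful data
minority_se = std_error(0.5, 400)      # scarce data for the protected class
print(round(minority_se / majority_se))  # 16: roughly 16x wider error bars
```

Same model, same pipeline, but the group with less data gets systematically less reliable predictions.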
1. black box
• many machine learning systems are not inspectable, because of
high dimensionality, hidden layer relationships, etc
• there are limits to what data scientists understand about how their
models are learning, because they (probably) didn’t build them
• data scientists make choices — hypotheses, premise, training data
selection, processing, outlier exclusion, etc.
- Cathy O’Neil, Weapons of Math Destruction
“Our own values and desires influence our choices,
from the data we choose to collect to the questions we ask.
Models are opinions embedded in mathematics.”
2. scale
• modeled decisions scale exponentially, where human decisions scale linearly
• faster
• centralized
3. impact
unfair outcomes often result when specific biases in the data are left
unexamined, which is especially problematic because:
• no user feedback — people do not have personal interactions with
decision-makers or recourse
biased data + black box + scale =
invisible feedback loops
critical decisions are now in the
hands of a model and its designer
instead of trained people
often a “data scientist”
solutions
define fairness
Dwork, et al:
similar people should be treated similarly
dissimilar people should be treated dissimilarly
for our technical purposes, we define the subjective societal value of fairness as:
solutions: constructing fairness
• data scientists must construct fairness explicitly (Dwork et al)
• fairness is task-specific, requiring:
• development of context-specific non-blind fairness metrics that utilize
protected class attributes (eg gender, race, ability, etc)
• development of context-specific individual similarity metric that is as
close as possible to the ground truth or best approximation (ex:
measure of how well someone drives to test fairness of insurance terms)
• historical context has bearing on impact (ex: until 1968, African-Americans
were often denied insurance and loans, which has downstream effects)
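One way to sketch a non-blind, context-specific fairness check is a per-group outcome comparison that explicitly uses the protected attribute. The approval-rate gap below is one common choice among many possible metrics, and the data is invented:

```python
def approval_rate_gap(decisions):
    """decisions: list of (protected_group, approved) pairs.
    Returns the largest gap in approval rate between any two groups."""
    totals, approvals = {}, {}
    for group, approved in decisions:
        totals[group] = totals.get(group, 0) + 1
        approvals[group] = approvals.get(group, 0) + int(approved)
    rates = [approvals[g] / totals[g] for g in totals]
    return max(rates) - min(rates)

# Toy loan decisions: group A is approved twice as often as group B.
loans = [("A", True), ("A", True), ("A", False),
         ("B", True), ("B", False), ("B", False)]
print(approval_rate_gap(loans))  # ~0.33
```

Note this only becomes a *fairness* metric once paired with the context-specific similarity question: are the two groups actually similar on the ground truth (e.g., creditworthiness) the decision is supposed to track?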
solutions: tools and design
• inspectability tools to better inspect the whole stack — from
training data to preprocessing algorithms to learned models
• data scientists making critical decisions should validate and check
assumptions with others
• better user research: investigate error cases, not just error rates
• better experience design: user outcome feedback systems allow
users to help you help them surface and correct bad predictions
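“Investigate error cases, not just error rates” can be sketched as reporting who the errors fall on, not only how many there are. This toy report (invented model and data) breaks the error rate down per protected group, so errors concentrated on one group become visible:

```python
def error_report(predict, examples):
    """examples: list of (features, group, label).
    Returns (overall error rate, per-group error rates)."""
    errors, group_counts, group_errors = 0, {}, {}
    for features, group, label in examples:
        wrong = predict(features) != label
        errors += wrong
        group_counts[group] = group_counts.get(group, 0) + 1
        group_errors[group] = group_errors.get(group, 0) + int(wrong)
    overall = errors / len(examples)
    per_group = {g: group_errors[g] / group_counts[g] for g in group_counts}
    return overall, per_group

# Toy model that is always right on group A and half wrong on group B:
predict = lambda x: 1
examples = [((0,), "A", 1), ((1,), "A", 1), ((2,), "B", 0), ((3,), "B", 1)]
print(error_report(predict, examples))  # (0.25, {'A': 0.0, 'B': 0.5})
```

An aggregate error rate of 25% looks acceptable; the breakdown shows all of it lands on group B.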
Why be fair?
sticks & carrots
why be fair?
sticks
• treating people differently based on their innate or protected characteristics
is wrong and illegal
• adversarial actors exploit proxy measures; people will learn how to
game the system
• unfair predictions leave money on the table; not lending to someone who is
falsely predicted to be a higher risk is a missed opportunity
• being unfair begets bad press and accelerates regulation
• consumers dislike unfair companies, much more than they dislike
companies that fail to preserve their privacy
why be fair?
carrots
• doing good business - there are missed opportunities in not lending to
hard-working people, in not funding atypical founders, in not hiring people
who think differently and bring new value
• if industry is able to build proof of fair practices prior to regulation, industry
might preempt and limit regulation with its own preferred fairness proofs
• we can stop limiting who people can become by intervening in the
self-defeating feedback loop
• when we centralize control, it presents a unique opportunity to correct
biases
a paradigm change is an opportune moment
we’re at a special moment when
decisions are being centralized,
from distributed groups of people
to central computational decision-making,
which gives us the opportunity and responsibility
to correct socially endemic biases
for the benefit of both society and business
bottom line —
it is the professional responsibility of every
data scientist to ensure fairness in the
interest of both their business and society
#EthicalAlgorithms
Data Science Practitioner group in San Francisco, hosted by The Design
Guild, with the goal of discussing and actively creating fairness:
• Ethics Peer Reviews
• Forum on Fairness and Privacy in Data Science
(talks with Data Scientists, Ethics Consultants, Academics, etc)
• Constructing a Professional Responsibility Manifesto for Data Scientists
Thank You
@clarecorthell
clare@luminantdata.com
Data Science and Machine Learning Consulting
references
Academic
• “Fairness Through Awareness” Dwork, et al. 2011.
• “Algorithmic Transparency via Quantitative Input Influence: Theory and Experiments with Learning Systems” Datta, et al. 2016.
Reports
• “Big Data: Seizing Opportunities, Preserving Values” The White House, 2014
• “Will you care when you pay more? The negative side of targeted promotions” Tsai, 2015
Books
• Weapons of Math Destruction, Cathy O’Neil, 2016
• Cybertypes: Race, Ethnicity, and Identity on the Internet, Lisa Nakamura, 2002. (defines “menu-driven identities”)
Blog Posts
• Ethics for powerful algorithms, Abe Gong, 2016

Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
BookNet Canada
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
Laura Byrne
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
Kari Kakkonen
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Paige Cruz
 
Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
Dorra BARTAGUIZ
 
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
James Anderson
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance
 
Free Complete Python - A step towards Data Science
Free Complete Python - A step towards Data ScienceFree Complete Python - A step towards Data Science
Free Complete Python - A step towards Data Science
RinaMondal9
 

Recently uploaded (20)

PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
 
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
 
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdfUni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdf
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
 
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
 
By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
 
Microsoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdfMicrosoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdf
 
UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5
 
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionGenerative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to Production
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
 
Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
 
Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
 
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
 
Free Complete Python - A step towards Data Science
Free Complete Python - A step towards Data ScienceFree Complete Python - A step towards Data Science
Free Complete Python - A step towards Data Science
 

Engineering Ethics: Practicing Fairness

  • 1. Engineering Ethics: Practicing Fairness Clare Corthell @clarecorthell clare@luminantdata.com Data Science and Machine Learning Consulting
  • 2. gatekeepers of critical life decisions
    • getting help with homework
    • going to college
    • buying a car
    • getting a mortgage
    • getting sentenced in prison
    • getting hired
    • keeping a job
  • 4. one of our biggest problems? unfairness of prediction. *Yes, I will somewhat controversially use “prediction” to refer to both predicting values and predicting class labels (classification); many methods and scenarios here do not apply equivalently to both.
  • 5. define fairness
    for our technical purposes, we define the subjective societal value of fairness as (Dwork, et al):
    similar people should be treated similarly
    dissimilar people should be treated dissimilarly
    ex: if two people drive similarly, they should receive similar insurance terms
  • 6. “Powerful algorithms can be harmful and unfair, even when they’re unbiased in a strictly technical sense.” — Abe Gong, Data Scientist
  • 7. 3 examples of unfair outcomes
  • 8. Character Testing & Disability Discrimination①
    “Good intent or absence of discriminatory intent does not redeem employment procedures or testing mechanisms that operate as 'built-in headwinds' for minority groups” — Warren Burger, Chief Justice, Griggs v. Duke Power Company, 1971
    It is illegal to hire employees based on:
    • intrinsic traits like ethnicity or gender (Equal Employment Opportunity Commission, 1965)
    • disability (Americans with Disabilities Act, 1990)
    • intelligence quotient or “IQ” (Griggs v. Duke Power Company, 1971)
  • 9. Character Testing & Disability Discrimination ① In the US, 60-70% of job candidates currently undergo character testing, which is unregulated outside of the aforementioned laws. These tests screen candidates for things like “commuting time” and “agreeableness,” presenting issues of redlining and disability discrimination. Problematically, there is little proof that these tests do not constitute fresh “built-in headwinds” for minority groups, and in turn a problem for both employers and employees. Google’s people operations recently found that characteristics like GPA did not predict whether an employee would perform well. This indicates that even customary industry practices may not be strongly correlated with the ground truth they intend to predict, particularly employability, performance, and retention.
  • 10. ② Insurance Premiums
    “Data analytics have the potential to eclipse longstanding civil rights protections in how personal information is used in housing, credit, employment, health, education, and the marketplace” — White House Report, “Big Data: Seizing Opportunities, Preserving Values”
    In the US, banks did not lend within blocks where African-Americans lived, called “redlining,” until it became illegal through the Fair Housing Act of 1968. Standard practices like behavioral segmentation are used to “steer” consumers to less favorable terms based on behavior unrelated to their creditworthiness. These practices are unfair and threaten the principles of the Fair Housing Act.
  • 11. Future Startup Founders A decision tree classifier was trained on a set of (seemingly meritocratic) features, then used to predict who might start a company: • College Education • Computer Science major • Years of experience • Last position title • Approximate age • Work experience in venture backed company ③
  • 12. the “meritocratic” approach does not work because protected characteristics are redundantly encoded Characteristics like gender, race, or ability are often correlated with a combination of multiple other features.
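The redundant-encoding point can be made concrete with a tiny sketch. The records and the zip-code rule below are entirely fabricated for illustration: a "blind" model that never sees the protected attribute can still recover it from a correlated feature.

```python
# Hypothetical data: zip_code correlates with protected attribute `group`,
# so dropping `group` from the feature set does not remove the signal.
records = [
    # (zip_code, years_experience, group)
    ("94110", 5, "A"), ("94110", 3, "A"), ("94110", 7, "A"),
    ("94112", 4, "A"), ("94601", 6, "B"), ("94601", 2, "B"),
    ("94603", 5, "B"), ("94603", 8, "B"),
]

def predict_group_from_zip(zip_code):
    """A trivial 'model' that never sees `group`, only zip_code."""
    return "A" if zip_code.startswith("941") else "B"

correct = sum(predict_group_from_zip(z) == g for z, _, g in records)
accuracy = correct / len(records)
print(f"group recovered from zip alone: {accuracy:.0%} accuracy")  # → 100%
```

Any model trained on zip code here is implicitly trained on group membership, which is why simply deleting the protected column ("blindness") fails.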
  • 13. blindness is not the answer race-blind, need-blind, able-blind, etc
  • 14. 0. data 1. black box 2. scale 3. impact Problems
  • 15. 0. biased data
    • data collected at scale from people’s past decisions is naturally socially biased, and models will learn that unfairness
    • data is dirty and often simply wrong
    • data at scale often encodes protected characteristics like race, ability, and health markers
    • restricted options, or “menu-driven identity” mistakes, create worthless or dirty data
    • no ground truth to test our assumptions against
    • big data is usually not big data for protected classes. Less data for the protected class means bigger error bars and worse predictions
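The last bullet on slide 15 can be quantified with a minimal sketch (the group sizes and the 20% rate are hypothetical): the normal-approximation margin of error for an estimated proportion shrinks with the square root of the sample size, so a small protected class gets much wider error bars for the same underlying rate.

```python
import math

def margin_of_error(p, n, z=1.96):
    """95% normal-approximation margin for an estimated proportion p on n samples."""
    return z * math.sqrt(p * (1 - p) / n)

p = 0.2  # same estimated rate in both groups (illustrative)
majority, minority = 100_000, 500
print(f"majority group: ±{margin_of_error(p, majority):.4f}")
print(f"minority group: ±{margin_of_error(p, minority):.4f}")
```

With 200x less data, the minority estimate's margin is roughly 14x wider, so any threshold applied uniformly to both groups is far noisier for the protected class.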
  • 16. 1. black box
    • many machine learning systems are not inspectable, because of high dimensionality, hidden layer relationships, etc
    • there are limits to what data scientists understand about how their models are learning, because they (probably) didn’t build them
    • data scientists make choices — hypotheses, premises, training data selection, processing, outlier exclusion, etc.
  • 17. “Our own values and desires influence our choices, from the data we choose to collect to the questions we ask. Models are opinions embedded in mathematics.” — Cathy O’Neil, Weapons of Math Destruction
  • 18. 2. scale
    • modeled decisions are exponentially scalable compared to linear human decisions
    • faster
    • centralized
  • 19. 3. impact
    unfair outcomes often result when specific biases of the data are left unexamined, especially problematic because:
    • no user feedback — people do not have personal interactions with decision-makers, or recourse
  • 20. biased data + black box + scale = invisible feedback loops
  • 21. critical decisions are now in the hands of a model and its designer instead of trained people often a “data scientist”
  • 23. define fairness
    for our technical purposes, we define the subjective societal value of fairness as (Dwork, et al):
    similar people should be treated similarly
    dissimilar people should be treated dissimilarly
  • 24. solutions: constructing fairness
    • data scientists must construct fairness explicitly (Dwork et al)
    • fairness is task-specific, requiring:
      • development of context-specific, non-blind fairness metrics that utilize protected class attributes (e.g. gender, race, ability)
      • development of a context-specific individual similarity metric that is as close as possible to the ground truth, or the best approximation of it (ex: a measure of how well someone drives, to test the fairness of insurance terms)
    • historical context has bearing on impact (ex: until 1968, African-Americans were often denied insurance and loans, which has downstream effects)
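Dwork et al.'s individual-fairness condition can be sketched directly: the model's outputs should differ no more than the individuals do under the task-specific similarity metric, i.e. |f(x) − f(y)| ≤ d(x, y). The driving metric, premiums, and scaling below are illustrative placeholders, not a real actuarial model.

```python
def driving_distance(a, b):
    """Task-specific similarity: how differently two people drive (0 = identical).
    The /10 scaling is an arbitrary illustrative choice."""
    return abs(a["hard_brakes"] - b["hard_brakes"]) / 10.0

def is_individually_fair(score, distance, pairs):
    """Check the Lipschitz condition |score(x) - score(y)| <= distance(x, y)
    over a sample of pairs of individuals."""
    return all(abs(score(x) - score(y)) <= distance(x, y) for x, y in pairs)

drivers = [{"hard_brakes": 2, "premium": 0.30},
           {"hard_brakes": 3, "premium": 0.35},
           {"hard_brakes": 9, "premium": 0.90}]
score = lambda d: d["premium"]
pairs = [(drivers[0], drivers[1]), (drivers[0], drivers[2]), (drivers[1], drivers[2])]
print(is_individually_fair(score, driving_distance, pairs))
```

The hard part in practice is exactly what the slide says: choosing `distance` so it reflects the ground truth of the task rather than a proxy that smuggles protected attributes back in.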
  • 25. solutions: tools and design
    • inspectability: tools to better inspect the whole stack — from training data to preprocessing algorithms to learned models
    • data scientists making critical decisions should validate and check assumptions with others
    • better user research: investigate error cases, not just error rates
    • better experience design: user outcome feedback systems allow users to help you surface and correct bad predictions
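One way to act on "investigate error cases, not just error rates" is to disaggregate the error rate by group. The labels and predictions below are fabricated to show how a modest-looking overall error can hide a severe per-group disparity.

```python
from collections import defaultdict

# Made-up outcomes: a single aggregate error rate hides where errors fall.
y_true = ["good", "good", "bad", "good", "bad", "good", "good", "bad"]
y_pred = ["good", "good", "bad", "bad",  "bad", "good", "bad",  "good"]
group  = ["A",    "A",    "A",   "B",    "B",   "A",    "B",    "B"]

errors = defaultdict(lambda: [0, 0])  # group -> [error count, total count]
for t, p, g in zip(y_true, y_pred, group):
    errors[g][0] += int(t != p)
    errors[g][1] += 1

overall = sum(e for e, _ in errors.values()) / len(y_true)
per_group = {g: e / n for g, (e, n) in errors.items()}
print(f"overall error: {overall:.2f}, per group: {per_group}")
```

Here the overall error is 0.375, but every error falls on group B (0.75 vs 0.00), which an aggregate metric would never surface.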
  • 26. Why be fair? sticks & carrots
  • 27. why be fair? sticks
    • treating people differently based on their innate or protected characteristics is wrong and illegal
    • adversarial learning exploits proxy measures: people will learn how to game the system
    • unfair predictions leave money on the table; not lending to someone who is falsely predicted to be a higher risk is a missed opportunity
    • being unfair begets bad press and accelerates regulation
    • consumers dislike unfair companies, much more than they dislike companies that fail to preserve their privacy
  • 28. why be fair? carrots
    • doing good business: there are missed opportunities in not lending to hard-working people, in not funding atypical founders, in not hiring people who think differently and bring new value
    • if industry can build proof of fair practices prior to regulation, it may preempt and limit regulation with its own preferred fairness proofs
    • we can stop limiting who people can become by intervening in the self-defeating feedback loop
    • centralizing control presents a unique opportunity to correct biases
  • 29. a paradigm change is an opportune moment
  • 30. we’re at a special moment when decisions are being centralized, from distributed groups of people to central computational decision-making, which gives us the opportunity and responsibility to correct socially endemic biases for the benefit of both society and business
  • 31. bottom line — it is the professional responsibility of every data scientist to ensure fairness in the interest of both their business and society
  • 32. #EthicalAlgorithms
    Data Science Practitioner group in San Francisco, hosted by The Design Guild, with the goal of discussing and actively creating fairness:
    • Ethics Peer Reviews
    • Forum on Fairness and Privacy in Data Science (talks with Data Scientists, Ethics Consultants, Academics, etc)
    • Constructing a Professional Responsibility Manifesto for Data Scientists
  • 34. references
    Academic
    • “Fairness Through Awareness” Dwork, et al. 2011.
    • “Algorithmic Transparency via Quantitative Input Influence: Theory and Experiments with Learning Systems” Datta, et al.
    Reports
    • “Big Data: Seizing Opportunities, Preserving Values” The White House, 2014
    • “Will you care when you pay more? The negative side of targeted promotions” Tsai, 2015
    Books
    • Weapons of Math Destruction, Cathy O’Neil, 2016
    • Cybertypes: Race, Ethnicity, and Identity on the Internet, Lisa Nakamura, 2002 (defines “menu-driven identities”)
    Blog Posts
    • Ethics for powerful algorithms, Abe Gong, 2016