Fairness in Machine Learning: are you
sure there is no bias in your
predictions?
Azzurra Ragone - Innovation and Diversity Advisor
Slides will be shared, follow @azzurraragone
Me…
Innovation and Diversity Advisor
Previous @Google DevRel team
Before Research fellow:
➢ Univ. Milano Bicocca,
➢ University of Michigan
➢ Politecnico of Bari
➢ University of Trento
People worry that computers will get too
smart and take over the world, but the
real problem is that they’re too stupid and
they’ve already taken over the world
The Master Algorithm
Pedro Domingos, 2015
How to make my ML system fair?
...and why care?
Our success, happiness and
wellbeing can be affected by
decisions made by others
Life-changing decisions:
➔ Admission to schools
➔ Job offers
➔ Patient screenings
➔ Mortgage approvals
➔ ...
Arbitrary, inconsistent, or faulty decision-making thus
raises serious concerns because it risks limiting our
ability to achieve the goals that we have set for ourselves
and access the opportunities for which we are qualified.
Fairness and Machine Learning
S. Barocas, M. Hardt, A. Narayanan
How do we ensure that these decisions are
made the right way and for the right reasons?
Fairness and Machine Learning
S. Barocas, M. Hardt, A. Narayanan
The ML promise:
make decisions more consistent,
accurate and rigorous.
B. C. Russell, A. Torralba, C. Liu, R. Fergus, W. T. Freeman.
Object Recognition by Scene Alignment.
Advances in Neural Information Processing Systems, 2007.
...but there are serious risks in learning
from examples.
Generalizing from examples
Source: https://design.google/library/fair-not-default/
Quick, Draw!
Generalizing from examples
Provide good examples:
- a sufficiently large and diverse set
- well annotated
Quick, Draw!
Source: https://design.google/library/fair-not-default/
Historical examples may reflect:
- Prejudices against a social group
- Cultural stereotypes
- Demographic inequalities
and finding patterns in these data means replicating these
same dynamics
Source: https://gluon-cv.mxnet.io/build/examples_datasets/imagenet.html
45% of ImageNet data comes from the USA (4% of the world population)
3% of ImageNet data comes from China and India (36% of the world population)
Ref: Nature 559 and Shankar, S. et al. (2017)
Geo bias
Photo Credit: Left: iStock/Getty; Right: Prakash Singh/AFP/Getty (from Nature 559, 324-326 (2018))
Labels shown on the slide: Bride, Dress, Woman, Wedding (left photo); Performance art, Costume (right photo)
Word Embeddings
Debiasing Word Embeddings
Bolukbasi, T., Chang, K.-W., Zou, J., Saligrama, V. & Kalai, A. Adv. Neural Inf. Proc. Syst. 2016, 4349–4357 (2016).
Credit: Pictures by Pixabay
The Machine Learning Loop
State of the world → (Measurement) → Data → (Learning) → Model → (Action) → Individuals → (Feedback) → State of the world
Source: Fairness and Machine Learning
S. Barocas, M. Hardt, A. Narayanan
The Machine Learning Loop: State of the world → (Measurement) → Data
Provenance of data
is crucial.
Data cleaning is
mandatory.
The world is “messy”
Photo by pasja1000 on Pixabay
Measurement defines:
- your variables of interest,
- the process for turning your
observations into numbers,
- how you actually collect the
data
[Fairness and Machine Learning, 2018]
Photo by Iker Urteaga on Unsplash
The target variable is the
hardest to measure.
It is made up for the purpose
of the problem.
It is not a property that
people possess or lack
Ex. “creditworthiness”, “good
employee”, “attractiveness”
[Fairness and Machine Learning, 2018]
Photo by David Paschke on Unsplash
ML will extract
stereotypes the same
way that it extracts
knowledge
ML works better with more data, so it tends to work less well for
members of minority groups, who are represented by fewer examples
Sample size disparity
Training set
Training data
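To make the sample size disparity concrete, here is a minimal sketch that is not part of the original slides. It uses the textbook standard error of a proportion and invented group sizes (9,000 vs. 100 examples) to show how much noisier an estimate becomes for the smaller group.

```python
import math


def standard_error(p: float, n: int) -> float:
    """Standard error of a proportion p estimated from n independent examples."""
    return math.sqrt(p * (1 - p) / n)


# Hypothetical training set sizes, invented for illustration.
group_sizes = {"majority": 9000, "minority": 100}
p = 0.5  # assumed true rate; 0.5 is the worst case for the standard error

errors = {group: standard_error(p, n) for group, n in group_sizes.items()}
for group, se in errors.items():
    print(f"{group}: n={group_sizes[group]}, standard error ~ {se:.4f}")
```

With these assumed numbers the estimate for the minority group is roughly ten times noisier, purely because of the smaller sample.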
Predictions - actions - outcome
Photo by Pixabay
If you predict future prices (and publicize them), you create a self-fulfilling
feedback loop: houses with a lower predicted sale price deter buyers,
demand goes down, and the final price is even lower
House price prediction
Photo by Deva Darshan on Unsplash
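The house price loop can be sketched as a toy simulation. This is an illustrative assumption, not a model from the talk: the function name, the `sensitivity` parameter, and the update rule are all hypothetical. Each round, the sale price moves part of the way toward the published prediction, and the next prediction is derived from that lower sale price.

```python
def simulate_price_feedback(true_value, predicted_price, rounds=5, sensitivity=0.5):
    """Toy feedback loop: published predictions pull sale prices down,
    and the lower sale prices feed the next prediction."""
    bias = predicted_price / true_value  # the model's initial under-estimation
    price = true_value
    prediction = predicted_price
    history = []
    for _ in range(rounds):
        # Buyers anchor on the published prediction, so the price
        # moves part of the way toward it.
        price = price + sensitivity * (prediction - price)
        history.append(round(price, 2))
        # The model is updated on the observed price and keeps its bias.
        prediction = price * bias
    return history


history = simulate_price_feedback(true_value=100.0, predicted_price=90.0)
print(history)
```

Under these assumptions the price falls every round even though nothing about the house itself has changed.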
Some communities may be disproportionately targeted, with people being
arrested for crimes that might be ignored in other communities.
Ref.: Saunders, J., Hunt, P. & Hollywood, J. S. J. Exp. Criminol. 12, 347–371 (2016).
Self-fulfilling predictions
Photo by Jacques Tiberi on Pixabay
“Feedback loops occur when data discovered on the
basis of predictions are used to update the model.”
Danielle Ensign et al.,
“Runaway Feedback Loops in Predictive Policing,” 2017
Training data encode the demographic disparities in our society and
some stereotypes can be reinforced by ML (due to feedback loop)
The state of society
Photo by Cory Schadt on Unsplash
Solutions?
Bias may lurk in your data...
Analyze your data
Source: Google Machine Learning Crash Course
★ Are there missing feature values for a large number of observations?
★ Are there features that are missing that might affect other features?
★ Are there any unexpected feature values?
★ What signs of data skew do you see?
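The first and third questions above can be checked with a few lines of plain Python. This is a minimal sketch: the rows are invented, the feature names are only illustrative, and the helper functions `missing_fraction` and `unexpected_values` are hypothetical names introduced here.

```python
# Hypothetical rows, invented for illustration; None marks a missing value.
rows = [
    {"median_income": 8.3, "total_rooms": 880, "households": 126},
    {"median_income": None, "total_rooms": 7099, "households": 1138},
    {"median_income": 7.2, "total_rooms": -1, "households": 177},
    {"median_income": None, "total_rooms": 1274, "households": None},
]


def missing_fraction(rows, feature):
    """Fraction of rows where the feature is absent or None."""
    missing = sum(1 for row in rows if row.get(feature) is None)
    return missing / len(rows)


def unexpected_values(rows, feature, low, high):
    """Present values that fall outside the expected [low, high] range."""
    return [
        row[feature]
        for row in rows
        if row.get(feature) is not None and not (low <= row[feature] <= high)
    ]


for feature in ("median_income", "total_rooms", "households"):
    print(feature, "missing:", missing_fraction(rows, feature))
print("suspicious total_rooms:", unexpected_values(rows, "total_rooms", 0, 100_000))
```

The expected range passed to `unexpected_values` is itself an assumption that has to come from knowledge of how the data were collected.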
Missing feature values
Source: California Housing dataset,
Google Machine Learning Crash Course
Skewed data (geographical bias)
Source: California Housing dataset,
Google Machine Learning Crash Course
Facets Overview
Source: Facets tool
(https://pair-code.github.io/facets/)
Facets Overview, an interactive visualization tool to explore datasets.
Quickly analyze the distribution of values across the datasets.
Facets Overview
Source: Facets tool
(https://pair-code.github.io/facets/)
⅔ of examples represent males, while we would expect the breakdown between genders to be closer to 50/50
Facets Dive
Source: Facets tool
(https://pair-code.github.io/facets/)
Data are faceted by the marital-status feature. Males outnumber females by more than 5:1.
Married women are underrepresented in our data.
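The same kind of breakdown can be approximated without a visual tool. The sketch below is an assumption-laden illustration, not the Facets API: the records and their counts are invented, and `share_by_value` and `counts_within` are hypothetical helper names.

```python
from collections import Counter

# Hypothetical (gender, marital_status) pairs; the counts are invented.
records = (
    [("Male", "Married")] * 6
    + [("Female", "Married")] * 1
    + [("Male", "Never-married")] * 2
    + [("Female", "Never-married")] * 3
)


def share_by_value(records, index):
    """Share of records for each value of the field at position `index`."""
    counts = Counter(record[index] for record in records)
    total = sum(counts.values())
    return {value: count / total for value, count in counts.items()}


def counts_within(records, marital_status):
    """Gender counts restricted to one marital-status facet."""
    return Counter(gender for gender, status in records if status == marital_status)


gender_share = share_by_value(records, 0)
married = counts_within(records, "Married")
print("overall gender share:", gender_share)
print("within 'Married':", dict(married))
```

Looking at the overall share and then at each facet separately is what exposes a subgroup that the aggregate numbers hide.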
“What-if” tool
Analyze an ML model without writing code.
Given pointers to a TF model and a dataset, the What-If Tool offers an interactive visual interface for exploring model results.
Counterfactuals
It is possible to compare a datapoint to the most similar point where your model predicts a different result.
Counterfactuals
A minor difference in age and an occupation change flipped the model’s prediction (earning >50K)
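The idea of a nearest counterfactual can be sketched in a few lines. This is not the What-If Tool's implementation: the L1 distance, the toy model `toy_predict`, and the (age, hours per week) data points are all assumptions made for illustration.

```python
def l1_distance(a, b):
    """Sum of absolute feature differences (assumed distance measure)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def nearest_counterfactual(point, dataset, predict):
    """Return the closest point in `dataset` whose prediction differs
    from the prediction for `point`, or None if there is none."""
    target = predict(point)
    candidates = [other for other in dataset if predict(other) != target]
    if not candidates:
        return None
    return min(candidates, key=lambda other: l1_distance(point, other))


def toy_predict(features):
    """Hypothetical stand-in model: True means 'earns >50K'."""
    age, hours_per_week = features
    return age >= 40 and hours_per_week >= 40


# Hypothetical (age, hours_per_week) data points.
dataset = [(38, 45), (41, 45), (60, 20), (25, 40), (55, 60)]
point = (38, 45)

counterfactual = nearest_counterfactual(point, dataset, toy_predict)
print(point, "->", counterfactual)
```

Here a three-year difference in age is enough to flip the toy model's answer, which is the kind of small change worth inspecting in a real model.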
Edit a datapoint
Edit a datapoint and see how your model performs.
Edit, add or remove features or feature values for any selected datapoint and then run inference to test model performance.
★ Measurement is crucial
★ Know your data (and how data were collected and annotated)
★ Try to discover hidden biases (missing values, data skew, subgroups, etc.)
★ Ask questions. Don’t train the model and then walk away
★ Avoid feedback loops
★ Use tools that support this kind of investigation
Key Takeaways
Thanks!
@azzurraragone
❏ AI can be sexist and racist — it’s time to make it fair James Zou &
Londa Schiebinger - Nature 559, 324-326 (2018)
❏ The Master Algorithm Pedro Domingos, 2015
❏ Fairness and Machine Learning S. Barocas, M. Hardt, A. Narayanan
❏ No Classification without Representation: Assessing Geodiversity
Issues in Open Data Sets for the Developing World Shreya Shankar,
Yoni Halpern, Eric Breck, James Atwood, Jimbo Wilson, D. Sculley
❏ Man is to computer programmer as woman is to homemaker?
Debiasing word embeddings T. Bolukbasi, K.-W. Chang, J. Y. Zou, V.
Saligrama, A. T. Kalai. Adv. Neural Inf. Process. Syst. 2016,
4349–4357 (2016)
References
❏ There is a blind spot in AI research, Kate Crawford & Ryan Calo, Nature
538, 311–313 (20 October 2016)
❏ Semantics Derived Automatically from Language Corpora Contain
Human-Like Biases, Aylin Caliskan, Joanna J. Bryson, and Arvind
Narayanan, Science 356, no. 6334 (2017): 183–86
❏ Predictions Put Into Practice: a Quasi-experimental Evaluation of
Chicago's Predictive Policing Pilot Saunders, J., Hunt, P. & Hollywood,
J. S. J. Exp. Criminol. 12, 347–371 (2016).
❏ Runaway Feedback Loops in Predictive Policing Danielle Ensign et al.
arXiv:1706.09847
References
❏ Object Recognition by Scene Alignment. B. C. Russell, A. Torralba, C.
Liu, R. Fergus, W. T. Freeman. Advances in Neural Information
Processing Systems, 2007.
❏ Fair Is Not the Default (https://design.google/library/fair-not-default/)
❏ “Playing with fairness” - David Weinberger.
❏ Google Machine Learning Crash Course
❏ What-if tool: https://pair-code.github.io/what-if-tool/
❏ Facets tool: https://pair-code.github.io/facets/
References
APPENDIX
★ Group unaware: disregard the gender mix of the applicants, and exclude gender and
gender-proxy information from the data set
★ Group thresholds: adjust the confidence thresholds for different groups
independently
★ Demographic parity: the composition of the selected set should reflect the percentage of
applicants from each group
★ Equal opportunity: individuals who qualify for a desirable outcome should have an
equal chance of being correctly classified for this outcome (= equal true positive rate)
★ Equal accuracy: the system ought to be tuned so that the percentage of times it's
wrong in the total of approvals and denials is the same for both groups (= false
positives + false negatives)
Types of Fairness
“Playing with fairness”
by David Weinberger.
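Three of the definitions above can be compared on a small example. This is a minimal sketch with invented labels and predictions; `group_rates` is a hypothetical helper. It assumes that demographic parity is read as equal selection rates, equal opportunity as equal true positive rates, and equal accuracy as equal overall accuracy across groups.

```python
def group_rates(y_true, y_pred, groups):
    """Per-group selection rate, true positive rate, and accuracy
    for binary labels and predictions (1 = desirable outcome)."""
    stats = {}
    for truth, pred, group in zip(y_true, y_pred, groups):
        s = stats.setdefault(
            group, {"n": 0, "selected": 0, "positives": 0, "tp": 0, "correct": 0}
        )
        s["n"] += 1
        s["selected"] += pred
        s["positives"] += truth
        s["tp"] += truth * pred
        s["correct"] += int(truth == pred)
    return {
        group: {
            "selection_rate": s["selected"] / s["n"],
            "true_positive_rate": s["tp"] / s["positives"] if s["positives"] else None,
            "accuracy": s["correct"] / s["n"],
        }
        for group, s in stats.items()
    }


# Hypothetical data: 1 = qualified / approved, 0 = not.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

rates = group_rates(y_true, y_pred, groups)
for group, values in rates.items():
    print(group, values)
```

In this invented example both groups have the same accuracy, yet their selection rates and true positive rates differ, so satisfying one definition does not imply satisfying the others.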
Computer scientist Arvind Narayanan gave a talk:
“21 fairness definitions and their politics”
Watch it on YouTube!
★ Reporting bias (ex. book reviews)
★ Automation bias
★ Selection bias (ex. phone survey):
○ Coverage bias
○ Non-response bias
○ Sampling bias
★ Group attribution bias (ex. university)
★ Implicit bias (ex. Confirmation bias)
Types of Bias

Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfadriantubila
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxolyaivanovalion
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Researchmichael115558
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionfulawalesam
 
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...amitlee9823
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% SecurePooja Nehwal
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxolyaivanovalion
 
ALSO dropshipping via API with DroFx.pptx
ALSO dropshipping via API with DroFx.pptxALSO dropshipping via API with DroFx.pptx
ALSO dropshipping via API with DroFx.pptxolyaivanovalion
 
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service BangaloreCall Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangaloreamitlee9823
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfMarinCaroMartnezBerg
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxolyaivanovalion
 
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...amitlee9823
 

Recently uploaded (20)

Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
 
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptx
 
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Research
 
Sampling (random) method and Non random.ppt
Sampling (random) method and Non random.pptSampling (random) method and Non random.ppt
Sampling (random) method and Non random.ppt
 
Anomaly detection and data imputation within time series
Anomaly detection and data imputation within time seriesAnomaly detection and data imputation within time series
Anomaly detection and data imputation within time series
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interaction
 
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptx
 
ALSO dropshipping via API with DroFx.pptx
ALSO dropshipping via API with DroFx.pptxALSO dropshipping via API with DroFx.pptx
ALSO dropshipping via API with DroFx.pptx
 
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
 
Predicting Loan Approval: A Data Science Project
Predicting Loan Approval: A Data Science ProjectPredicting Loan Approval: A Data Science Project
Predicting Loan Approval: A Data Science Project
 
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service BangaloreCall Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptx
 
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
 

Don't blindly trust your ML System, it may change your life (Azzurra Ragone, Independent consultant)

  • 1. Fairness in Machine Learning: are you sure there is no bias in your predictions? Azzurra Ragone - Innovation and Diversity Advisor Slides will be shared, follow @azzurraragone
  • 2. Me… Innovation and Diversity Advisor Previous @Google DevRel team Before Research fellow: ➢ Univ. Milano Bicocca, ➢ University of Michigan ➢ Politecnico of Bari ➢ University of Trento
  • 3. People worry that computers will get too smart and take over the world, but the real problem is that they’re too stupid and they’ve already taken over the world The Master Algorithm Pedro Domingos, 2015
  • 4. How to make my ML system fair? ...and why care?
  • 5. Our success, happiness and wellbeing can be affected by decisions that others make
  • 6. Life-changing decisions: ➔ Admission to schools ➔ Job offers ➔ Patient screenings ➔ Mortgage grants ➔ ...
  • 7. Arbitrary, inconsistent, or faulty decision-making thus raises serious concerns because it risks limiting our ability to achieve the goals that we have set for ourselves and access the opportunities for which we are qualified. Fairness and Machine Learning S. Barocas, M. Hardt, A. Narayanan
  • 8. How do we ensure that these decisions are made the right way and for the right reasons? Fairness and Machine Learning S. Barocas, M. Hardt, A. Narayanan
  • 9. The ML promise: make decisions more consistent, accurate and rigorous.
  • 10. B. C. Russell, A. Torralba, C. Liu, R. Fergus, W. T. Freeman. Object Recognition by Scene Alignment. Advances in Neural Information Processing Systems, 2007.
  • 11. ...but there are serious risks in learning from examples.
  • 12. Generalizing from examples Source: https://design.google/library/fair-not-default/ Quick, Draw!
  • 13. Generalizing from examples Provide good examples: - a sufficiently large and diverse set - well annotated Quick, Draw! Source: https://design.google/library/fair-not-default/
  • 14. Historical examples may reflect: - Prejudices against a social group - Cultural stereotypes - Demographic inequalities and finding patterns in these data means replicating these same dynamics
  • 16. 45% of ImageNet data comes from USA (4% of the world population) 3% of ImageNet data comes from China and India (36% of the world population) Ref: Nature 559 and Shankar, S. et al. (2017) Geo bias
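The mismatch quoted on this slide can be read as a representation ratio: share of the data divided by share of the world population. A small Python sketch using only the percentages given above; the function name `representation_ratio` is illustrative and not from the talk:

```python
def representation_ratio(data_share_pct, population_share_pct):
    """Ratio above 1 means over-represented; below 1 means under-represented."""
    return data_share_pct / population_share_pct

# Percentages as quoted on the slide
usa = representation_ratio(45, 4)          # 11.25
china_india = representation_ratio(3, 36)  # roughly 0.083
```

Under these figures, images from the USA appear about eleven times more often than population share alone would suggest, while images from China and India appear at less than a tenth of their population share.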
  • 17. Photo Credit: Left: iStock/Getty; Right: Prakash Singh/AFP/Getty (from Nature 559, 324-326 (2018)) Bride Dress Woman Wedding Performance art Costume
  • 19. Debiasing Word Embeddings Bolukbasi, T., Chang, K.-W., Zou, J., Saligrama, V. & Kalai, A. Adv. Neural Inf. Proc. Syst. 2016, 4349–4357 (2016). Credit: Pictures by Pixabay
  • 20. State of the world Data Individuals Model Measurement Learning Action Feedback The Machine Learning Loop Source: Fairness and Machine Learning S. Barocas, M. Hardt, A. Narayanan
  • 21. State of the world Data Measurement The Machine Learning Loop
  • 22. Provenance of data is crucial. Data cleaning is mandatory. The world is “messy” Photo by pasja1000 on Pixabay
  • 23. Measurement defines: - your variables of interest, - the process for turning your observations into numbers, - how you actually collect the data [Fairness and Machine Learning, 2018] Photo by Iker Urteaga on Unsplash
  • 24. The target variable is the hardest to measure. It is made up for the purpose of the problem. It is not a property that people possess or lack Ex. “creditworthiness”, “good employee”, “attractiveness” [Fairness and Machine Learning, 2018] Photo by David Paschke on Unsplash
  • 25. State of the world Data Individuals Model Measurement Learning Action Feedback The Machine Learning Loop
  • 26. ML will extract stereotypes the same way that it extracts knowledge
  • 27. ML works better with more data, so it will work less well for members of minority groups Sample size disparity Training set Training data
  • 28. State of the world Data Individuals Model Measurement Learning Action Feedback The Machine Learning Loop
  • 29. Predictions - actions - outcome Photo by Pixabay
  • 30. State of the world Data Individuals Model Measurement Learning Action Feedback The Machine Learning Loop
  • 31. If you predict future prices (and publicize them) you create a self-fulfilling feedback loop: houses with lower predicted sale prices deter buyers, demand goes down and the final price is even lower House price prediction Photo by Deva Darshan on Unsplash
  • 32. Some communities may be disproportionately targeted, with people being arrested for crimes that might be ignored in other communities. Ref.: Saunders, J., Hunt, P. & Hollywood, J. S. J. Exp. Criminol. 12, 347–371 (2016). Self-fulfilling predictions Photo by Jacques Tiberi on Pixabay
  • 33. “Feedback loops occur when data discovered on the basis of predictions are used to update the model.” Danielle Ensign et al., “Runaway Feedback Loops in Predictive Policing,” 2017
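To make the quoted definition concrete, here is a deliberately simplified, deterministic toy in Python. It is not the model from the cited paper; the function name `simulate_patrols`, the two districts, and all numbers are assumptions made up for illustration. Both districts have the same true incident rate, but incidents are only recorded where the patrol is sent, and the patrol is sent where the record is largest.

```python
def simulate_patrols(recorded, true_rate, days):
    """Each day, send the single patrol to the district with the most
    recorded incidents; only that district's record can grow."""
    recorded = list(recorded)
    for _ in range(days):
        # The "model": predict the hotspot from recorded data only
        target = max(range(len(recorded)), key=lambda i: recorded[i])
        # Data discovered because of the prediction updates the record
        recorded[target] += true_rate[target]
    return recorded

# District 0 starts with one more recorded incident; true rates are equal
history = simulate_patrols(recorded=[11, 10], true_rate=[1.0, 1.0], days=100)
```

In this toy the initial one-incident gap grows to roughly a hundred, although the underlying rates are identical: the record reflects where the patrol looked, not where incidents happened.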
  • 34. State of the world Data Individuals Model Measurement Learning Action Feedback The Machine Learning Loop
  • 35. Training data encode the demographic disparities in our society and some stereotypes can be reinforced by ML (due to feedback loops) The state of society Photo by Cory Schadt on Unsplash
  • 37. Bias may lurk in your data...
  • 38. Analyze your data Source: Google Machine Learning Crash Course ★ Are there missing feature values for a large number of observations? ★ Are there features that are missing that might affect other features? ★ Are there any unexpected feature values? ★ What signs of data skew do you see?
  • 39. Missing feature values Source: California Housing dataset, Google Machine Learning Crash Course
  • 40. Skewed data (geographical bias) Source: California Housing dataset, Google Machine Learning Crash Course
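The questions about missing values, unexpected values, and skew can be checked with a few summary statistics per feature. A minimal Python sketch using only the standard library; the helper `summarize`, the feature names, and the toy rows are hypothetical and are not taken from the California Housing dataset:

```python
import statistics

def summarize(rows, feature):
    """Report the missing fraction and basic statistics for one feature."""
    values = [row.get(feature) for row in rows]
    present = [v for v in values if v is not None]
    return {
        "missing_fraction": 1 - len(present) / len(values),
        "min": min(present),
        "max": max(present),
        "mean": statistics.mean(present),
        "median": statistics.median(present),
    }

# Toy rows, made up for illustration
rows = [
    {"median_income": 3.2, "rooms": 5},
    {"median_income": 2.9, "rooms": 6},
    {"median_income": None, "rooms": 4},
    {"median_income": 3.1, "rooms": 120},  # suspicious value
]
income = summarize(rows, "median_income")
rooms = summarize(rows, "rooms")
```

Here a quarter of the income values are missing, and the mean of `rooms` sits far above its median, which is one common sign of skew or of an unexpected value worth inspecting.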
  • 41. Facets Overview Source: Facets tool (https://pair-code.github.io/facets/) Facets Overview, an interactive visualization tool to explore datasets. Quickly analyze the distribution of values across the datasets.
  • 42. Facets Overview Source: Facets tool (https://pair-code.github.io/facets/) ⅔ of examples represent males, while we would expect the breakdown between genders to be closer to 50/50
  • 43. Facets Dive Source: Facets tool (https://pair-code.github.io/facets/) Data are faceted by the marital-status feature. Males outnumber females by more than 5:1. Married women are underrepresented in our data.
  • 44. “What-if” tool Analyze an ML model without writing code. Given pointers to a TF model and a dataset, the What-If Tool offers an interactive visual interface for exploring model results.
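The same kind of breakdown can be computed without a visualization tool. A minimal Python sketch, assuming the group attribute is available as a plain list of labels; the function name `group_shares` and the toy list are made up for illustration and are not the dataset shown on the slides:

```python
from collections import Counter

def group_shares(values):
    """Return each group's share of the examples."""
    counts = Counter(values)
    total = sum(counts.values())
    return {group: count / total for group, count in counts.items()}

# Toy labels chosen to mirror a 2:1 split
genders = ["Male"] * 4 + ["Female"] * 2
shares = group_shares(genders)
```

Comparing these shares with the breakdown you expect in the population (for example, close to 50/50) is a quick first check for under-represented groups.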
  • 45. Counterfactuals It is possible to compare a datapoint to the most similar point where your model predicts a different result.
  • 46. Counterfactuals a minor difference in age and an occupation change flipped the model’s prediction (earning >50K)
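The counterfactual idea can be sketched in a few lines of Python: among the datapoints that received a different prediction, pick the one closest to the selected point. This is an illustration under stated assumptions (numeric features only, L1 distance, precomputed predictions), not a description of how the What-If Tool is implemented; `nearest_counterfactual`, the two features (age, hours per week), and the toy data are hypothetical.

```python
def nearest_counterfactual(point, prediction, dataset):
    """dataset: list of (features, predicted_label) pairs, where features are
    equal-length tuples of numbers. Returns the closest pair whose label
    differs from `prediction`, or None if there is no such pair."""
    def l1(a, b):
        return sum(abs(x - y) for x, y in zip(a, b))

    candidates = [(f, p) for f, p in dataset if p != prediction]
    if not candidates:
        return None
    return min(candidates, key=lambda pair: l1(point, pair[0]))

# Toy data: (age, hours_per_week) with a model's predicted income class
dataset = [
    ((42, 40), ">50K"),
    ((25, 20), "<=50K"),
    ((39, 45), ">50K"),
    ((38, 38), "<=50K"),
]
match = nearest_counterfactual((37, 40), "<=50K", dataset)
```

In practice features would need scaling before distances are compared, and categorical features such as occupation need their own distance rule.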
  • 47. Edit a datapoint Edit a datapoint and see how your model performs. Edit, add or remove features or feature values for any selected datapoint and then run inference to test model performance.
  • 48. ★ Measurement is crucial ★ Know your data (and how data were collected and annotated) ★ Try to discover hidden biases (missing values, data skew, subgroups, etc.) ★ Ask questions. Don’t train the model and then walk away ★ Avoid feedback loops ★ Use tools that allow you to do such investigation Key Takeaways
  • 50. ❏ AI can be sexist and racist — it’s time to make it fair James Zou & Londa Schiebinger - Nature 559, 324-326 (2018) ❏ The Master Algorithm Pedro Domingos, 2015 ❏ Fairness and Machine Learning S. Barocas, M. Hardt, A. Narayanan ❏ No Classification without Representation: Assessing Geodiversity Issues in Open Data Sets for the Developing World Shreya Shankar, Yoni Halpern, Eric Breck, James Atwood, Jimbo Wilson, D. Sculley ❏ Man is to computer programmer as woman is to homemaker? Debiasing word embeddings T. Bolukbasi, K.-W. Chang, J. Y. Zou, V. Saligrama, A. T. Kalai. Adv. Neural Inf. Process. Syst. 2016, 4349–4357 (2016) References
  • 51. ❏ There is a blind spot in AI research, Kate Crawford & Ryan Calo, Nature 538, 311–313 (20 October 2016) ❏ Semantics Derived Automatically from Language Corpora Contain Human-Like Biases, Aylin Caliskan, Joanna J. Bryson, and Arvind Narayanan, Science 356, no. 6334 (2017): 183–86 ❏ Predictions Put Into Practice: a Quasi-experimental Evaluation of Chicago's Predictive Policing Pilot Saunders, J., Hunt, P. & Hollywood, J. S. J. Exp. Criminol. 12, 347–371 (2016). ❏ Runaway Feedback Loops in Predictive Policing Danielle Ensign et al. arXiv:1706.09847 References
  • 52. ❏ Object Recognition by Scene Alignment. B. C. Russell, A. Torralba, C. Liu, R. Fergus, W. T. Freeman. Advances in Neural Information Processing Systems, 2007. ❏ Fair Is Not the Default (https://design.google/library/fair-not-default/) ❏ “Playing with fairness” - David Weinberger. ❏ Google Machine Learning Crash Course ❏ What-if tool: https://pair-code.github.io/what-if-tool/ ❏ Facets tool: https://pair-code.github.io/facets/ References
  • 54. ★ Group unaware: disregard the gender mix of the applicants, exclude gender and gender-proxy information from the data set ★ Group thresholds: adjust the confidence thresholds for different groups independently ★ Demographic parity: The composition of the set should reflect the percentage of applicants ★ Equal opportunity: Individuals who qualify for a desirable outcome should have an equal chance of being correctly classified for this outcome (=true positive) ★ Equal accuracy: the system ought to be tuned so that the percentage of times it's wrong in the total of approvals and denials is the same for both groups (=false positive+false negative) Types of Fairness “Playing with fairness” by David Weinberger.
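Three of these definitions (demographic parity, equal opportunity, equal accuracy) can be compared as per-group rates. A minimal Python sketch, assuming binary labels and binary predictions; the function name `group_rates` and the toy records are hypothetical and not from the talk:

```python
from collections import defaultdict

def group_rates(records):
    """records: iterable of (group, y_true, y_pred) with 0/1 values.
    Returns, per group, the selection rate (demographic parity),
    the true positive rate (equal opportunity), and the accuracy."""
    counts = defaultdict(
        lambda: {"n": 0, "selected": 0, "pos": 0, "tp": 0, "correct": 0}
    )
    for group, y_true, y_pred in records:
        c = counts[group]
        c["n"] += 1
        c["selected"] += y_pred
        c["pos"] += y_true
        c["tp"] += 1 if (y_true == 1 and y_pred == 1) else 0
        c["correct"] += 1 if y_true == y_pred else 0
    return {
        group: {
            "selection_rate": c["selected"] / c["n"],
            "true_positive_rate": c["tp"] / c["pos"] if c["pos"] else None,
            "accuracy": c["correct"] / c["n"],
        }
        for group, c in counts.items()
    }

# Toy data, made up for illustration: (group, qualified, approved)
records = [
    ("A", 1, 1), ("A", 1, 1), ("A", 0, 1), ("A", 0, 0),
    ("B", 1, 1), ("B", 1, 0), ("B", 0, 0), ("B", 0, 0),
]
rates = group_rates(records)
```

On this toy data both groups have the same accuracy (0.75), yet group A is selected three times as often and its qualified members are approved twice as often as group B's. A system can satisfy one definition while violating the others, which is why the choice of definition matters.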
  • 55. Computer scientist Arvind Narayanan gave a talk: “21 fairness definitions and their politics” Watch it on Youtube!
  • 56. ★ Reporting bias (ex. book reviews) ★ Automation bias ★ Selection bias (ex. phone survey): ○ Coverage bias ○ Non-response bias ○ Sampling bias ★ Group attribution bias (ex. university) ★ Implicit bias (ex. Confirmation bias) Types of Bias