SlideShare a Scribd company logo
Future directions of
Fairness-aware Data Mining
Recommendation, Causality, and Theoretical Aspects
Toshihiro Kamishima*1 and Kazuto Fukuchi*2
joint work with Shotaro Akaho*1, Hideki Asoh*1, and Jun Sakuma*2,3
*1National Institute of Advanced Industrial Science and Technology (AIST), Japan
*2University of Tsukuba, and *3JST CREST
Workshop on Fairness, Accountability, and Transparency in Machine Learning
In conjunction with the ICML 2015 @ Lille, France, Jul. 11, 2015
1
Outline
New Applications of Fairness-Aware Data Mining
Applications of FADM techniques, other than anti-discrimination,
especially in a recommendation context
New Directions of Fairness
Relations of existing formal fairness with causal inference and
information theory
Introducing an idea of a fair division problem and avoiding unfair
treatments
Generalization Bound in terms of Fairness
Theoretical aspects of fairness not on training data, but on test data
✤ We use the term “fairness-aware” instead of “discrimination-aware,” because the word “discrimination”
means classification in a ML context, and this technique applicable to tasks other than avoiding
discriminative decisions
2
Foods for discussion about new directions of fairness in DM / ML
PART Ⅰ
Applications of
Fairness-Aware Data Mining
3
Fairness-Aware Data Mining
4
Fairness-aware Data mining (FADM)
data analysis taking into account potential issues of fairness
Unfairness Prevention
S : sensitive feature: representing information that is wanted not to
influence outcomes
Other factors: Y: target variable, X: non-sensitive feature
Learning a statistical model from potentially unfair data sets so
that the sensitive feature does not influence the model’s outcomes
Two major tasks of FADM
Unfairness Detection: Finding unfair treatments in database
Unfairness Prevention: Building a model to provide fair outcomes
[Romei+ 2014]
Anti-Discrimination
5
[Sweeney 13]
obtaining socially and legally anti-discriminative outcomes
African descent names European descent names
Arrested?
Located:
Advertisements indicating arrest records were more frequently
displayed for names that are more popular among individuals of
African descent than those of European descent
sensitive feature = users’ socially sensitive demographic information
anti-discriminative outcomes
Unbiased Information
6
[Pariser 2011, TED Talk by Eli Pariser, http://www.filterbubble.com, Kamishima+ 13]
To fit for Pariser’s preference, conservative people are eliminated from
his friend recommendation list in a social networking service
avoiding biased information that doesn’t meet a user’s intention
Filter Bubble: a concern that personalization technologies narrow
and bias the topics of information provided to people
sensitive feature = political conviction of a friend candidate
unbiased information in terms of candidates’ political conviction
In the recommendation on the retail store, the items sold by the site
owner are constantly ranked higher than those sold by tenants
Tenants will complain about this unfair treatment
Fair Trading
7
equal treatment of content providers
Online retail store
The site owner directly sells items
The site is rented to tenants, and the tenants also sells items
[Kamishima+ 12, Kamishima+ 13]
sensitive feature = a content provider of a candidate item
site owner and its tenants are equally treated in recommendation
Ignoring Uninteresting Information
8
[Gondek+ 04]
ignore information unwanted by a user
A simple clustering method finds two
clusters: one contains only faces, and the
other contains faces with shoulders
A data analyst considers this clustering is
useless and uninteresting
By ignoring this uninteresting information,
more meaningful female- and male-like
clusters could be obtained
non-redundant clustering : find clusters that are as independent
from a given uninteresting partition as possible
clustering facial images
sensitive feature = uninteresting information
ignore the influence of uninteresting information
Part Ⅰ: Summary
9
A belief introduction of FADM and a unfairness prevention task
Learning a statistical model from potentially unfair data sets so that
the sensitive feature does not influence the model’s outcomes
FADM techniques are widely applicable
There are many FADM applications other than anti-discrimination,
such as providing unbiased information, fair trading, and ignoring
uninteresting information
PART Ⅱ
New Directions of Fairness
10
Discussion about formal definitions and treatments of fairness
in data mining and machine learning contexts
Related Topics of a Current Formal Fairness
connection between formal fairness and causal inference
interpretation in view of information theory
New Definitions of Formal Fairness
Why statistical independence can be used as fairness
Introducing an idea of a fair division problem
New Treatments of Formal Fairness
methods for avoiding unfair treatments instead of enhancing
fairness
PART Ⅱ: Outline
11
Causality
12
[Žliobaitė+ 11, Calders+ 13]
sensitive feature: S
gender
male / female
target variable: Y
acceptance
accept / not accept
Fair determination: the gender does not influence the acceptance
	 statistical independence: Y ?? S
An example of university admission in [Žliobaitė+ 11]
Unfairness Prevention task
optimization of accuracy under causality constraints
Information Theoretic Interpretation
13
statistical independence between S and Y implies
zero mutual information: I(S; Y) = 0
the degree of influence S to Y can be measured by I(S; Y)
Sensitive: S
Target: Y
H(Y )
H(S)
I(S; Y )
H(S | Y )
H(Y | S)
Information theoretical view of a fairness condition
Causality with Explainable Features
14
[Žliobaitė+ 11, Calders+ 13]
sensitive feature: S
gender
male / female
target variable: Y
acceptance
accept / not accept
Removing the pure influence of S to Y, excluding the effect of E
	 conditional statistical independence:
explainable feature: E
(confounding feature)
program
medicine / computer
medicine → acceptance=low
computer → acceptance=high
female → medicine=high
male → computer=high
An example of fair determination
even if S and Y are not independent
Y ?? S | E
Information Theoretic Interpretation
15
Sensitive: S
Target: Y
Explainable: E H(E)
the degree of conditional independence between Y and S given E
conditional mutual information: I(S; Y | E)
We can exploit additional information I(S; Y; E) to obtain outcomes
I(S; Y ; E)
H(Y )
H(S)
H(S | Y, E)
H(Y | S, E)
H(E | S, Y )
I(S; E | Y )
I(Y ; E | S)
I(S; Y | E)
Why outcomes are assumed
as being fair?
16
Why outcomes are assumed as being fair,
if a sensitive feature does not influence the outcomes?
All parties agree with the use of this criterion,
may be because this is objective and reasonable
Is there any way for making an agreement?
To further examine new directions, we introduce a fair division problem
In this view, [Brendt+ 12]’s approach is
regarded as a way of making
agreements in a wisdom-of-crowds way.
The size and color of circles indicate the
size of samples and the risk of
discrimination, respectively
Fair Division
17
Alice and Bob want to divide this swiss-roll FAIRLY
Total length of this swiss-roll is 20cm
20cm
Alice and Bob get half each based on agreed common measure
This approach is adopted in current FADM techniques
Fair Division
17
Alice and Bob want to divide this swiss-roll FAIRLY
Alice and Bob get half each based on agreed common measure
This approach is adopted in current FADM techniques
10cm
10cm
divide the swiss-roll into 10cm each
Fair Division
18
Unfortunately, Alice and Bob don’t have a scale
Alice cut the swiss-roll exactly in halves based on her own feeling
envy-free division: Alice and Bob get a equal or larger piece
based on their own measure
Fair Division
18
Unfortunately, Alice and Bob don’t have a scale
envy-free division: Alice and Bob get a equal or larger piece
based on their own measure
Bob pick a larger piece based on his own feeling
Bob
Fair Division
19
Fairness in a fair division context
Envy-Free Division: Every party gets a equal or larger piece than
other parties’ pieces based on one’s own measure
Proportional Division: Every party gets an equal or larger piece than 1/n
based on one’s own measure; Envy-free division is proportional division
Exact Division: Every party gets a equal-sized piece
There are n parties
Every party i has one’s own measure mi(Pj) for each piece Pj
mi(Pj) = 1/n, 8i, j
mi(Pi) 1/n, 8i
mi(Pi) mi(Pj), 8i, j
Envy-Free in a FADM Context
20
Current FADM techniques adopt common agreed measure
Can we develop FADM techniques using an envy-free approach?
This technique can be applicable without agreements on fairness criterion
FADM under envy-free fairness
Maximize the utility of analysis, such as prediction accuracy,
under the envy-free fairness constraints
A Naïve method for Classification
Among n candidates k ones can be classified as positive
Among all nCk classifications, enumerate those satisfying envy-free
conditions based on parties’ own utility measures
ex. Fair classifiers with different sets of explainable features
Pick the classification whose accuracy is maximum
Open Problem: Can we develop a more efficient algorithm?
Fairness Guardian
21
Current fairness prevention methods are designed so as
to be fair
Example: Logistic Regression + Prejudice Remover 	 [Kamishima+ 12]
The objective function is composed of
classification loss and fairness constraint terms
P
D ln Pr[Y | X, S; ⇥] + 2 k⇥k2
2 + ⌘ I(Y ; S)
Fairness Guardian Approach
Unfairness is prevented by enhancing fairness of outcomes
Fair Is Not Unfair?
22
A reverse treatment of fairness:
not to be unfair
One possible formulation of a unfair classifier
Outcomes are determined ONLY by a sensitive feature
Pr[Y | S; ⇤
]
Ex. Your paper is rejected, just because you are not handsome
Penalty term to maximize the KL divergence between
a pre-trained unfair classifier and a target classifier
DKL[Pr[Y | S; ⇤
]k Pr[Y | X, S; ⇥]]
Unfairness Hater
23
Unfairness Hater Approach
Unfairness is prevented by avoiding unfair outcomes
This approach was almost useless for obtaining fair outcomes, but…
Better Optimization
The fairness-enhanced objective function tends to be non-convex;
thus, adding a unfairness hater may help for avoiding local minima
Avoiding Unfair Situation
There would be unfair situations that should be avoided;
Ex. Humans’ photos were mistakenly labeled as gorilla in auto-
tagging [Barr 2015]
There would be many choices between to be fair and not to be unfair
that should be examined
Part Ⅱ: Summary
Relation of fairness with causal inference and information theory
We review a current formal definition of fairness by relating it with
Rubin’s causal inference; and, its interpretation based on information
theory
New Directions of formal fairness without agreements
We showed the possibility of formal fairness that does not presume a
common criterion agreed between concerned parties
New Directions of treatment of fairness by avoiding unfairness
We discussed that FADM techniques for avoiding unfairness, instead
of enhancing fairness.
24
PART Ⅲ
Generalization Bound
in terms of Fairness
25
Part Ⅲ: Introduction
There are many technical problems to solve in a FADM literature,
because tools for excluding specific information has not been
developed actively.
Types of Sensitive Features
Non-binary sensitive feature
Analysis Techniques
Analysis methods other than classification or regression
Optimization
Constraint terms make objective functions non-convex
Fairness measure
Interpretable to humans and having convenient properties
Learning Theory
Generalization ability in terms of fairness
26
Kazuto Fukuchi’s Talk
27
Conclusion
Applications of Fairness-Aware Data Mining
Applications other than anti-discrimination: providing unbiased
information, fair trading, and excluding unwanted information
New Directions of Fairness
Relation of fairness with causal inference and information theory
Formal fairness introducing an idea of a fair division problem
Avoiding unfair treatment, instead of enhancing fairness
Generalization bound in terms of fairness
Generalization bound in terms of fairness based on f-divergence
Additional Information and codes
http://www.kamishima.net/fadm
Acknowledgments: This work is supported by MEXT/JSPS KAKENHI Grant Number 24500194, 24680015,
25540094, 25540094, and 15K00327
28
Bibliography I
A. Barr.
Google mistakenly tags black people as ‘gorillas,’ showing limits of algorithms.
The Wall Street Journal, 2015.
⟨http://on.wsj.com/1CaCNlb⟩.
B. Berendt and S. Preibusch.
Exploring discrimination: A user-centric evaluation of discrimination-aware data mining.
In Proc. of the IEEE Int’l Workshop on Discrimination and Privacy-Aware Data Mining,
pages 344–351, 2012.
T. Calders, A. Karim, F. Kamiran, W. Ali, and X. Zhang.
Controlling attribute effect in linear regression.
In Proc. of the 13th IEEE Int’l Conf. on Data Mining, pages 71–80, 2013.
T. Calders and S. Verwer.
Three naive Bayes approaches for discrimination-free classification.
Data Mining and Knowledge Discovery, 21:277–292, 2010.
C. Dwork, M. Hardt, T. Pitassi, O. Reingold, and R. Zemel.
Fairness through awareness.
In Proc. of the 3rd Innovations in Theoretical Computer Science Conf., pages 214–226,
2012.
1 / 4
Bibliography II
K. Fukuchi and J. Sakuma.
Fairness-aware learning with restriction of universal dependency using f-divergences.
arXiv:1104.3913 [cs.CC], 2015.
D. Gondek and T. Hofmann.
Non-redundant data clustering.
In Proc. of the 4th IEEE Int’l Conf. on Data Mining, pages 75–82, 2004.
T. Kamishima, S. Akaho, H. Asoh, and J. Sakuma.
Considerations on fairness-aware data mining.
In Proc. of the IEEE Int’l Workshop on Discrimination and Privacy-Aware Data Mining,
pages 378–385, 2012.
T. Kamishima, S. Akaho, H. Asoh, and J. Sakuma.
Enhancement of the neutrality in recommendation.
In Proc. of the 2nd Workshop on Human Decision Making in Recommender Systems,
pages 8–14, 2012.
T. Kamishima, S. Akaho, H. Asoh, and J. Sakuma.
Fairness-aware classifier with prejudice remover regularizer.
In Proc. of the ECML PKDD 2012, Part II, pages 35–50, 2012.
[LNCS 7524].
2 / 4
Bibliography III
T. Kamishima, S. Akaho, H. Asoh, and J. Sakuma.
Efficiency improvement of neutrality-enhanced recommendation.
In Proc. of the 3rd Workshop on Human Decision Making in Recommender Systems,
pages 1–8, 2013.
E. Pariser.
The filter bubble.
⟨http://www.thefilterbubble.com/⟩.
E. Pariser.
The Filter Bubble: What The Internet Is Hiding From You.
Viking, 2011.
A. Romei and S. Ruggieri.
A multidisciplinary survey on discrimination analysis.
The Knowledge Engineering Review, 29(5):582–638, 2014.
L. Sweeney.
Discrimination in online ad delivery.
Communications of the ACM, 56(5):44–54, 2013.
I. ˇZliobait˙e, F. Kamiran, and T. Calders.
Handling conditional discrimination.
In Proc. of the 11th IEEE Int’l Conf. on Data Mining, 2011.
3 / 4
Bibliography IV
R. Zemel, Y. Wu, K. Swersky, T. Pitassi, and C. Dwork.
Learning fair representations.
In Proc. of the 30th Int’l Conf. on Machine Learning, 2013.
4 / 4

More Related Content

What's hot

Recommendation Independence
Recommendation IndependenceRecommendation Independence
Recommendation Independence
Toshihiro Kamishima
 
Malhotra21
Malhotra21Malhotra21
Alleviating Privacy Attacks Using Causal Models
Alleviating Privacy Attacks Using Causal ModelsAlleviating Privacy Attacks Using Causal Models
Alleviating Privacy Attacks Using Causal Models
Amit Sharma
 
Lecture 4: NBERMetrics
Lecture 4: NBERMetricsLecture 4: NBERMetrics
Lecture 4: NBERMetricsNBER
 
Malhotra03
Malhotra03Malhotra03
Malhotra14
Malhotra14Malhotra14
Malhotra04
Malhotra04Malhotra04
Machine Learning and Causal Inference
Machine Learning and Causal InferenceMachine Learning and Causal Inference
Machine Learning and Causal Inference
NBER
 
Malhotra07
Malhotra07Malhotra07
Stated preference methods and analysis
Stated preference methods and analysisStated preference methods and analysis
Stated preference methods and analysis
Habet Madoyan
 
Module 6: Ensemble Algorithms
Module 6:  Ensemble AlgorithmsModule 6:  Ensemble Algorithms
Module 6: Ensemble Algorithms
Sara Hooker
 
Scott Lundberg, Microsoft Research - Explainable Machine Learning with Shaple...
Scott Lundberg, Microsoft Research - Explainable Machine Learning with Shaple...Scott Lundberg, Microsoft Research - Explainable Machine Learning with Shaple...
Scott Lundberg, Microsoft Research - Explainable Machine Learning with Shaple...
Sri Ambati
 
Module 2: Machine Learning Deep Dive
Module 2:  Machine Learning Deep DiveModule 2:  Machine Learning Deep Dive
Module 2: Machine Learning Deep Dive
Sara Hooker
 
Instance Selection and Optimization of Neural Networks
Instance Selection and Optimization of Neural NetworksInstance Selection and Optimization of Neural Networks
Instance Selection and Optimization of Neural Networks
ITIIIndustries
 
Malhotra05
Malhotra05Malhotra05
What recommender systems can learn from decision psychology about preference ...
What recommender systems can learn from decision psychology about preference ...What recommender systems can learn from decision psychology about preference ...
What recommender systems can learn from decision psychology about preference ...
Eindhoven University of Technology / JADS
 
Summer 07-mfin7011-tang1922
Summer 07-mfin7011-tang1922Summer 07-mfin7011-tang1922
Summer 07-mfin7011-tang1922
stone55
 
Dowhy: An end-to-end library for causal inference
Dowhy: An end-to-end library for causal inferenceDowhy: An end-to-end library for causal inference
Dowhy: An end-to-end library for causal inference
Amit Sharma
 
Module 3: Linear Regression
Module 3:  Linear RegressionModule 3:  Linear Regression
Module 3: Linear Regression
Sara Hooker
 
Nbe rtopicsandrecomvlecture1
Nbe rtopicsandrecomvlecture1Nbe rtopicsandrecomvlecture1
Nbe rtopicsandrecomvlecture1
NBER
 

What's hot (20)

Recommendation Independence
Recommendation IndependenceRecommendation Independence
Recommendation Independence
 
Malhotra21
Malhotra21Malhotra21
Malhotra21
 
Alleviating Privacy Attacks Using Causal Models
Alleviating Privacy Attacks Using Causal ModelsAlleviating Privacy Attacks Using Causal Models
Alleviating Privacy Attacks Using Causal Models
 
Lecture 4: NBERMetrics
Lecture 4: NBERMetricsLecture 4: NBERMetrics
Lecture 4: NBERMetrics
 
Malhotra03
Malhotra03Malhotra03
Malhotra03
 
Malhotra14
Malhotra14Malhotra14
Malhotra14
 
Malhotra04
Malhotra04Malhotra04
Malhotra04
 
Machine Learning and Causal Inference
Machine Learning and Causal InferenceMachine Learning and Causal Inference
Machine Learning and Causal Inference
 
Malhotra07
Malhotra07Malhotra07
Malhotra07
 
Stated preference methods and analysis
Stated preference methods and analysisStated preference methods and analysis
Stated preference methods and analysis
 
Module 6: Ensemble Algorithms
Module 6:  Ensemble AlgorithmsModule 6:  Ensemble Algorithms
Module 6: Ensemble Algorithms
 
Scott Lundberg, Microsoft Research - Explainable Machine Learning with Shaple...
Scott Lundberg, Microsoft Research - Explainable Machine Learning with Shaple...Scott Lundberg, Microsoft Research - Explainable Machine Learning with Shaple...
Scott Lundberg, Microsoft Research - Explainable Machine Learning with Shaple...
 
Module 2: Machine Learning Deep Dive
Module 2:  Machine Learning Deep DiveModule 2:  Machine Learning Deep Dive
Module 2: Machine Learning Deep Dive
 
Instance Selection and Optimization of Neural Networks
Instance Selection and Optimization of Neural NetworksInstance Selection and Optimization of Neural Networks
Instance Selection and Optimization of Neural Networks
 
Malhotra05
Malhotra05Malhotra05
Malhotra05
 
What recommender systems can learn from decision psychology about preference ...
What recommender systems can learn from decision psychology about preference ...What recommender systems can learn from decision psychology about preference ...
What recommender systems can learn from decision psychology about preference ...
 
Summer 07-mfin7011-tang1922
Summer 07-mfin7011-tang1922Summer 07-mfin7011-tang1922
Summer 07-mfin7011-tang1922
 
Dowhy: An end-to-end library for causal inference
Dowhy: An end-to-end library for causal inferenceDowhy: An end-to-end library for causal inference
Dowhy: An end-to-end library for causal inference
 
Module 3: Linear Regression
Module 3:  Linear RegressionModule 3:  Linear Regression
Module 3: Linear Regression
 
Nbe rtopicsandrecomvlecture1
Nbe rtopicsandrecomvlecture1Nbe rtopicsandrecomvlecture1
Nbe rtopicsandrecomvlecture1
 

Similar to Future Directions of Fairness-Aware Data Mining: Recommendation, Causality, and Theoretical Aspects

Ethical, Social, and Political Issues in E-commerce
Ethical, Social, and Political Issues in E-commerceEthical, Social, and Political Issues in E-commerce
Ethical, Social, and Political Issues in E-commerce
Nor Ayuzi Deraman
 
Dr Masood Ahmed and Alan Davies - ECO 17: Transforming care through digital h...
Dr Masood Ahmed and Alan Davies - ECO 17: Transforming care through digital h...Dr Masood Ahmed and Alan Davies - ECO 17: Transforming care through digital h...
Dr Masood Ahmed and Alan Davies - ECO 17: Transforming care through digital h...
Innovation Agency
 
EPR Annual Conference 2020 Workshop 1 - Simon Uytterhoeven
EPR Annual Conference 2020 Workshop 1 - Simon Uytterhoeven EPR Annual Conference 2020 Workshop 1 - Simon Uytterhoeven
EPR Annual Conference 2020 Workshop 1 - Simon Uytterhoeven
EPR1
 
AI and us communicating for algorithmic bias awareness
AI and us communicating for algorithmic bias awarenessAI and us communicating for algorithmic bias awareness
AI and us communicating for algorithmic bias awareness
Ansgar Koene
 
Measures and mismeasures of algorithmic fairness
Measures and mismeasures of algorithmic fairnessMeasures and mismeasures of algorithmic fairness
Measures and mismeasures of algorithmic fairness
Manojit Nandi
 
1PPA 670 Public Policy AnalysisBasic Policy Terms an.docx
1PPA 670 Public Policy AnalysisBasic Policy Terms an.docx1PPA 670 Public Policy AnalysisBasic Policy Terms an.docx
1PPA 670 Public Policy AnalysisBasic Policy Terms an.docx
felicidaddinwoodie
 
STEP IV CASE STUDY & FINAL PAPERA. Based on the analysis in Ste.docx
STEP IV CASE STUDY & FINAL PAPERA. Based on the analysis in Ste.docxSTEP IV CASE STUDY & FINAL PAPERA. Based on the analysis in Ste.docx
STEP IV CASE STUDY & FINAL PAPERA. Based on the analysis in Ste.docx
lillie234567
 
Sogeti big data research privacy technology and the law
Sogeti big data research privacy technology and the lawSogeti big data research privacy technology and the law
Sogeti big data research privacy technology and the law
Yann SESE
 
Vint big data research privacy technology and the law
Vint big data research privacy technology and the lawVint big data research privacy technology and the law
Vint big data research privacy technology and the lawKarlos Svoboda
 
Big data 3 4- vint-big-data-research-privacy-technology-and-the-law - big dat...
Big data 3 4- vint-big-data-research-privacy-technology-and-the-law - big dat...Big data 3 4- vint-big-data-research-privacy-technology-and-the-law - big dat...
Big data 3 4- vint-big-data-research-privacy-technology-and-the-law - big dat...
Rick Bouter
 
International Journal of Computational Engineering Research (IJCER)
International Journal of Computational Engineering Research (IJCER) International Journal of Computational Engineering Research (IJCER)
International Journal of Computational Engineering Research (IJCER)
ijceronline
 
M2 l10 fairness, accountability, and transparency
M2 l10 fairness, accountability, and transparencyM2 l10 fairness, accountability, and transparency
M2 l10 fairness, accountability, and transparency
BoPeng76
 
Evidence Based Healthcare Design
Evidence Based Healthcare DesignEvidence Based Healthcare Design
Evidence Based Healthcare Design
Carmen Martin
 
Promoting an ethical and GDPR-compliant approach to learning analytics
Promoting an ethical and GDPR-compliant approach to learning analyticsPromoting an ethical and GDPR-compliant approach to learning analytics
Promoting an ethical and GDPR-compliant approach to learning analytics
Jisc
 
Unit 3 Qualitative Data
Unit 3 Qualitative DataUnit 3 Qualitative Data
Unit 3 Qualitative Data
Sherry Bailey
 
U5 a1 stages in the decision making process
U5 a1 stages in the decision making processU5 a1 stages in the decision making process
U5 a1 stages in the decision making process
Peter R Breach
 
Fairness-aware Machine Learning: Practical Challenges and Lessons Learned (KD...
Fairness-aware Machine Learning: Practical Challenges and Lessons Learned (KD...Fairness-aware Machine Learning: Practical Challenges and Lessons Learned (KD...
Fairness-aware Machine Learning: Practical Challenges and Lessons Learned (KD...
Krishnaram Kenthapadi
 
Towards data responsibility - how to put ideals into action
Towards data responsibility - how to put ideals into actionTowards data responsibility - how to put ideals into action
Towards data responsibility - how to put ideals into action
Mindtrek
 

Similar to Future Directions of Fairness-Aware Data Mining: Recommendation, Causality, and Theoretical Aspects (19)

Fairness definitions explained
Fairness definitions explainedFairness definitions explained
Fairness definitions explained
 
Ethical, Social, and Political Issues in E-commerce
Ethical, Social, and Political Issues in E-commerceEthical, Social, and Political Issues in E-commerce
Ethical, Social, and Political Issues in E-commerce
 
Dr Masood Ahmed and Alan Davies - ECO 17: Transforming care through digital h...
Dr Masood Ahmed and Alan Davies - ECO 17: Transforming care through digital h...Dr Masood Ahmed and Alan Davies - ECO 17: Transforming care through digital h...
Dr Masood Ahmed and Alan Davies - ECO 17: Transforming care through digital h...
 
EPR Annual Conference 2020 Workshop 1 - Simon Uytterhoeven
EPR Annual Conference 2020 Workshop 1 - Simon Uytterhoeven EPR Annual Conference 2020 Workshop 1 - Simon Uytterhoeven
EPR Annual Conference 2020 Workshop 1 - Simon Uytterhoeven
 
AI and us communicating for algorithmic bias awareness
AI and us communicating for algorithmic bias awarenessAI and us communicating for algorithmic bias awareness
AI and us communicating for algorithmic bias awareness
 
Measures and mismeasures of algorithmic fairness
Measures and mismeasures of algorithmic fairnessMeasures and mismeasures of algorithmic fairness
Measures and mismeasures of algorithmic fairness
 
1PPA 670 Public Policy AnalysisBasic Policy Terms an.docx
1PPA 670 Public Policy AnalysisBasic Policy Terms an.docx1PPA 670 Public Policy AnalysisBasic Policy Terms an.docx
1PPA 670 Public Policy AnalysisBasic Policy Terms an.docx
 
STEP IV CASE STUDY & FINAL PAPERA. Based on the analysis in Ste.docx
STEP IV CASE STUDY & FINAL PAPERA. Based on the analysis in Ste.docxSTEP IV CASE STUDY & FINAL PAPERA. Based on the analysis in Ste.docx
STEP IV CASE STUDY & FINAL PAPERA. Based on the analysis in Ste.docx
 
Sogeti big data research privacy technology and the law
Sogeti big data research privacy technology and the lawSogeti big data research privacy technology and the law
Sogeti big data research privacy technology and the law
 
Vint big data research privacy technology and the law
Vint big data research privacy technology and the lawVint big data research privacy technology and the law
Vint big data research privacy technology and the law
 
Big data 3 4- vint-big-data-research-privacy-technology-and-the-law - big dat...
Big data 3 4- vint-big-data-research-privacy-technology-and-the-law - big dat...Big data 3 4- vint-big-data-research-privacy-technology-and-the-law - big dat...
Big data 3 4- vint-big-data-research-privacy-technology-and-the-law - big dat...
 
International Journal of Computational Engineering Research (IJCER)
International Journal of Computational Engineering Research (IJCER) International Journal of Computational Engineering Research (IJCER)
International Journal of Computational Engineering Research (IJCER)
 
M2 l10 fairness, accountability, and transparency
M2 l10 fairness, accountability, and transparencyM2 l10 fairness, accountability, and transparency
M2 l10 fairness, accountability, and transparency
 
Evidence Based Healthcare Design
Evidence Based Healthcare DesignEvidence Based Healthcare Design
Evidence Based Healthcare Design
 
Promoting an ethical and GDPR-compliant approach to learning analytics
Promoting an ethical and GDPR-compliant approach to learning analyticsPromoting an ethical and GDPR-compliant approach to learning analytics
Promoting an ethical and GDPR-compliant approach to learning analytics
 
Unit 3 Qualitative Data
Unit 3 Qualitative DataUnit 3 Qualitative Data
Unit 3 Qualitative Data
 
U5 a1 stages in the decision making process
U5 a1 stages in the decision making processU5 a1 stages in the decision making process
U5 a1 stages in the decision making process
 
Fairness-aware Machine Learning: Practical Challenges and Lessons Learned (KD...
Fairness-aware Machine Learning: Practical Challenges and Lessons Learned (KD...Fairness-aware Machine Learning: Practical Challenges and Lessons Learned (KD...
Fairness-aware Machine Learning: Practical Challenges and Lessons Learned (KD...
 
Towards data responsibility - how to put ideals into action
Towards data responsibility - how to put ideals into actionTowards data responsibility - how to put ideals into action
Towards data responsibility - how to put ideals into action
 

More from Toshihiro Kamishima

RecSys2018論文読み会 資料
RecSys2018論文読み会 資料RecSys2018論文読み会 資料
RecSys2018論文読み会 資料
Toshihiro Kamishima
 
WSDM2018読み会 資料
WSDM2018読み会 資料WSDM2018読み会 資料
WSDM2018読み会 資料
Toshihiro Kamishima
 
機械学習研究でのPythonの利用
機械学習研究でのPythonの利用機械学習研究でのPythonの利用
機械学習研究でのPythonの利用
Toshihiro Kamishima
 
KDD2016勉強会 資料
KDD2016勉強会 資料KDD2016勉強会 資料
KDD2016勉強会 資料
Toshihiro Kamishima
 
科学技術計算関連Pythonパッケージの概要
科学技術計算関連Pythonパッケージの概要科学技術計算関連Pythonパッケージの概要
科学技術計算関連Pythonパッケージの概要
Toshihiro Kamishima
 
WSDM2016勉強会 資料
WSDM2016勉強会 資料WSDM2016勉強会 資料
WSDM2016勉強会 資料
Toshihiro Kamishima
 
ICML2015読み会 資料
ICML2015読み会 資料ICML2015読み会 資料
ICML2015読み会 資料
Toshihiro Kamishima
 
PyMCがあれば,ベイズ推定でもう泣いたりなんかしない
PyMCがあれば,ベイズ推定でもう泣いたりなんかしないPyMCがあれば,ベイズ推定でもう泣いたりなんかしない
PyMCがあれば,ベイズ推定でもう泣いたりなんかしない
Toshihiro Kamishima
 
Absolute and Relative Clustering
Absolute and Relative ClusteringAbsolute and Relative Clustering
Absolute and Relative Clustering
Toshihiro Kamishima
 
Enhancement of the Neutrality in Recommendation
Enhancement of the Neutrality in RecommendationEnhancement of the Neutrality in Recommendation
Enhancement of the Neutrality in Recommendation
Toshihiro Kamishima
 
OpenOpt の線形計画で圧縮センシング
OpenOpt の線形計画で圧縮センシングOpenOpt の線形計画で圧縮センシング
OpenOpt の線形計画で圧縮センシング
Toshihiro Kamishima
 
Pythonによる機械学習実験の管理
Pythonによる機械学習実験の管理Pythonによる機械学習実験の管理
Pythonによる機械学習実験の管理
Toshihiro Kamishima
 

More from Toshihiro Kamishima (12)

RecSys2018論文読み会 資料
RecSys2018論文読み会 資料RecSys2018論文読み会 資料
RecSys2018論文読み会 資料
 
WSDM2018読み会 資料
WSDM2018読み会 資料WSDM2018読み会 資料
WSDM2018読み会 資料
 
機械学習研究でのPythonの利用
機械学習研究でのPythonの利用機械学習研究でのPythonの利用
機械学習研究でのPythonの利用
 
KDD2016勉強会 資料
KDD2016勉強会 資料KDD2016勉強会 資料
KDD2016勉強会 資料
 
科学技術計算関連Pythonパッケージの概要
科学技術計算関連Pythonパッケージの概要科学技術計算関連Pythonパッケージの概要
科学技術計算関連Pythonパッケージの概要
 
WSDM2016勉強会 資料
WSDM2016勉強会 資料WSDM2016勉強会 資料
WSDM2016勉強会 資料
 
ICML2015読み会 資料
ICML2015読み会 資料ICML2015読み会 資料
ICML2015読み会 資料
 
PyMCがあれば,ベイズ推定でもう泣いたりなんかしない
PyMCがあれば,ベイズ推定でもう泣いたりなんかしないPyMCがあれば,ベイズ推定でもう泣いたりなんかしない
PyMCがあれば,ベイズ推定でもう泣いたりなんかしない
 
Absolute and Relative Clustering
Absolute and Relative ClusteringAbsolute and Relative Clustering
Absolute and Relative Clustering
 
Enhancement of the Neutrality in Recommendation
Enhancement of the Neutrality in RecommendationEnhancement of the Neutrality in Recommendation
Enhancement of the Neutrality in Recommendation
 
OpenOpt の線形計画で圧縮センシング
OpenOpt の線形計画で圧縮センシングOpenOpt の線形計画で圧縮センシング
OpenOpt の線形計画で圧縮センシング
 
Pythonによる機械学習実験の管理
Pythonによる機械学習実験の管理Pythonによる機械学習実験の管理
Pythonによる機械学習実験の管理
 

Recently uploaded

一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
nscud
 
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
slg6lamcq
 
一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单
enxupq
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
axoqas
 
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
ewymefz
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
Timothy Spann
 
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
pchutichetpong
 
Influence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business PlanInfluence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business Plan
jerlynmaetalle
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
Timothy Spann
 
Opendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptxOpendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptx
Opendatabay
 
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
dwreak4tg
 
一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单
enxupq
 
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
NABLAS株式会社
 
Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project PresentationPredicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Boston Institute of Analytics
 
Empowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptxEmpowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptx
benishzehra469
 
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
ewymefz
 
FP Growth Algorithm and its Applications
FP Growth Algorithm and its ApplicationsFP Growth Algorithm and its Applications
FP Growth Algorithm and its Applications
MaleehaSheikh2
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP
 
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
vcaxypu
 
Machine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptxMachine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptx
balafet
 

Recently uploaded (20)

一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
 
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
 
一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
 
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
 
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
 
Influence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business PlanInfluence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business Plan
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
 
Opendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptxOpendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptx
 
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
 
一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单
 
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
 
Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project PresentationPredicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
 
Empowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptxEmpowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptx
 
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
 
FP Growth Algorithm and its Applications
FP Growth Algorithm and its ApplicationsFP Growth Algorithm and its Applications
FP Growth Algorithm and its Applications
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
 
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
 
Machine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptxMachine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptx
 

Future Directions of Fairness-Aware Data Mining: Recommendation, Causality, and Theoretical Aspects

  • 1. Future directions of Fairness-aware Data Mining Recommendation, Causality, and Theoretical Aspects Toshihiro Kamishima*1 and Kazuto Fukuchi*2 joint work with Shotaro Akaho*1, Hideki Asoh*1, and Jun Sakuma*2,3 *1National Institute of Advanced Industrial Science and Technology (AIST), Japan *2University of Tsukuba, and *3JST CREST Workshop on Fairness, Accountability, and Transparency in Machine Learning In conjunction with the ICML 2015 @ Lille, France, Jul. 11, 2015 1
  • 2. Outline New Applications of Fairness-Aware Data Mining Applications of FADM techniques, other than anti-discrimination, especially in a recommendation context New Directions of Fairness Relations of existing formal fairness with causal inference and information theory Introducing an idea of a fair division problem and avoiding unfair treatments Generalization Bound in terms of Fairness Theoretical aspects of fairness not on training data, but on test data ✤ We use the term “fairness-aware” instead of “discrimination-aware,” because the word “discrimination” means classification in a ML context, and this technique applicable to tasks other than avoiding discriminative decisions 2 Foods for discussion about new directions of fairness in DM / ML
  • 4. Fairness-Aware Data Mining 4 Fairness-aware Data mining (FADM) data analysis taking into account potential issues of fairness Unfairness Prevention S : sensitive feature: representing information that is wanted not to influence outcomes Other factors: Y: target variable, X: non-sensitive feature Learning a statistical model from potentially unfair data sets so that the sensitive feature does not influence the model’s outcomes Two major tasks of FADM Unfairness Detection: Finding unfair treatments in database Unfairness Prevention: Building a model to provide fair outcomes [Romei+ 2014]
  • 5. Anti-Discrimination 5 [Sweeney 13] obtaining socially and legally anti-discriminative outcomes African descent names European descent names Arrested? Located: Advertisements indicating arrest records were more frequently displayed for names that are more popular among individuals of African descent than those of European descent sensitive feature = users’ socially sensitive demographic information anti-discriminative outcomes
  • 6. Unbiased Information 6 [Pariser 2011, TED Talk by Eli Pariser, http://www.filterbubble.com, Kamishima+ 13] To fit for Pariser’s preference, conservative people are eliminated from his friend recommendation list in a social networking service avoiding biased information that doesn’t meet a user’s intention Filter Bubble: a concern that personalization technologies narrow and bias the topics of information provided to people sensitive feature = political conviction of a friend candidate unbiased information in terms of candidates’ political conviction
  • 7. In the recommendation on the retail store, the items sold by the site owner are constantly ranked higher than those sold by tenants Tenants will complain about this unfair treatment Fair Trading 7 equal treatment of content providers Online retail store The site owner directly sells items The site is rented to tenants, and the tenants also sells items [Kamishima+ 12, Kamishima+ 13] sensitive feature = a content provider of a candidate item site owner and its tenants are equally treated in recommendation
  • 8. Ignoring Uninteresting Information 8 [Gondek+ 04] ignore information unwanted by a user A simple clustering method finds two clusters: one contains only faces, and the other contains faces with shoulders A data analyst considers this clustering is useless and uninteresting By ignoring this uninteresting information, more meaningful female- and male-like clusters could be obtained non-redundant clustering : find clusters that are as independent from a given uninteresting partition as possible clustering facial images sensitive feature = uninteresting information ignore the influence of uninteresting information
  • 9. Part Ⅰ: Summary 9 A belief introduction of FADM and a unfairness prevention task Learning a statistical model from potentially unfair data sets so that the sensitive feature does not influence the model’s outcomes FADM techniques are widely applicable There are many FADM applications other than anti-discrimination, such as providing unbiased information, fair trading, and ignoring uninteresting information
  • 10. PART Ⅱ New Directions of Fairness 10
  • 11. Discussion about formal definitions and treatments of fairness in data mining and machine learning contexts Related Topics of a Current Formal Fairness connection between formal fairness and causal inference interpretation in view of information theory New Definitions of Formal Fairness Why statistical independence can be used as fairness Introducing an idea of a fair division problem New Treatments of Formal Fairness methods for avoiding unfair treatments instead of enhancing fairness PART Ⅱ: Outline 11
  • 12. Causality 12 [Žliobaitė+ 11, Calders+ 13] sensitive feature: S gender male / female target variable: Y acceptance accept / not accept Fair determination: the gender does not influence the acceptance statistical independence: Y ?? S An example of university admission in [Žliobaitė+ 11] Unfairness Prevention task optimization of accuracy under causality constraints
  • 13. Information Theoretic Interpretation 13 statistical independence between S and Y implies zero mutual information: I(S; Y) = 0 the degree of influence S to Y can be measured by I(S; Y) Sensitive: S Target: Y H(Y ) H(S) I(S; Y ) H(S | Y ) H(Y | S) Information theoretical view of a fairness condition
  • 14. Causality with Explainable Features 14 [Žliobaitė+ 11, Calders+ 13] sensitive feature: S gender male / female target variable: Y acceptance accept / not accept Removing the pure influence of S to Y, excluding the effect of E conditional statistical independence: explainable feature: E (confounding feature) program medicine / computer medicine → acceptance=low computer → acceptance=high female → medicine=high male → computer=high An example of fair determination even if S and Y are not independent Y ?? S | E
  • 15. Information Theoretic Interpretation 15 Sensitive: S Target: Y Explainable: E H(E) the degree of conditional independence between Y and S given E conditional mutual information: I(S; Y | E) We can exploit additional information I(S; Y; E) to obtain outcomes I(S; Y ; E) H(Y ) H(S) H(S | Y, E) H(Y | S, E) H(E | S, Y ) I(S; E | Y ) I(Y ; E | S) I(S; Y | E)
  • 16. Why outcomes are assumed as being fair? 16 Why outcomes are assumed as being fair, if a sensitive feature does not influence the outcomes? All parties agree with the use of this criterion, may be because this is objective and reasonable Is there any way for making an agreement? To further examine new directions, we introduce a fair division problem In this view, [Brendt+ 12]’s approach is regarded as a way of making agreements in a wisdom-of-crowds way. The size and color of circles indicate the size of samples and the risk of discrimination, respectively
  • 17. Fair Division 17 Alice and Bob want to divide this swiss-roll FAIRLY Total length of this swiss-roll is 20cm 20cm Alice and Bob get half each based on agreed common measure This approach is adopted in current FADM techniques
  • 18. Fair Division 17 Alice and Bob want to divide this swiss-roll FAIRLY Alice and Bob get half each based on agreed common measure This approach is adopted in current FADM techniques 10cm 10cm divide the swiss-roll into 10cm each
  • 19. Fair Division 18 Unfortunately, Alice and Bob don’t have a scale Alice cut the swiss-roll exactly in halves based on her own feeling envy-free division: Alice and Bob get a equal or larger piece based on their own measure
  • 20. Fair Division 18 Unfortunately, Alice and Bob don’t have a scale envy-free division: Alice and Bob get a equal or larger piece based on their own measure Bob pick a larger piece based on his own feeling Bob
  • 21. Fair Division 19 Fairness in a fair division context Envy-Free Division: Every party gets a equal or larger piece than other parties’ pieces based on one’s own measure Proportional Division: Every party gets an equal or larger piece than 1/n based on one’s own measure; Envy-free division is proportional division Exact Division: Every party gets a equal-sized piece There are n parties Every party i has one’s own measure mi(Pj) for each piece Pj mi(Pj) = 1/n, 8i, j mi(Pi) 1/n, 8i mi(Pi) mi(Pj), 8i, j
  • 22. Envy-Free in a FADM Context 20 Current FADM techniques adopt common agreed measure Can we develop FADM techniques using an envy-free approach? This technique can be applicable without agreements on fairness criterion FADM under envy-free fairness Maximize the utility of analysis, such as prediction accuracy, under the envy-free fairness constraints A Naïve method for Classification Among n candidates k ones can be classified as positive Among all nCk classifications, enumerate those satisfying envy-free conditions based on parties’ own utility measures ex. Fair classifiers with different sets of explainable features Pick the classification whose accuracy is maximum Open Problem: Can we develop a more efficient algorithm?
  • 23. Fairness Guardian 21 Current fairness prevention methods are designed so as to be fair Example: Logistic Regression + Prejudice Remover [Kamishima+ 12] The objective function is composed of classification loss and fairness constraint terms P D ln Pr[Y | X, S; ⇥] + 2 k⇥k2 2 + ⌘ I(Y ; S) Fairness Guardian Approach Unfairness is prevented by enhancing fairness of outcomes
  • 24. Fair Is Not Unfair? 22 A reverse treatment of fairness: not to be unfair One possible formulation of a unfair classifier Outcomes are determined ONLY by a sensitive feature Pr[Y | S; ⇤ ] Ex. Your paper is rejected, just because you are not handsome Penalty term to maximize the KL divergence between a pre-trained unfair classifier and a target classifier DKL[Pr[Y | S; ⇤ ]k Pr[Y | X, S; ⇥]]
  • 25. Unfairness Hater 23 Unfairness Hater Approach Unfairness is prevented by avoiding unfair outcomes This approach was almost useless for obtaining fair outcomes, but… Better Optimization The fairness-enhanced objective function tends to be non-convex; thus, adding a unfairness hater may help for avoiding local minima Avoiding Unfair Situation There would be unfair situations that should be avoided; Ex. Humans’ photos were mistakenly labeled as gorilla in auto- tagging [Barr 2015] There would be many choices between to be fair and not to be unfair that should be examined
  • 26. Part Ⅱ: Summary Relation of fairness with causal inference and information theory We review a current formal definition of fairness by relating it with Rubin’s causal inference; and, its interpretation based on information theory New Directions of formal fairness without agreements We showed the possibility of formal fairness that does not presume a common criterion agreed between concerned parties New Directions of treatment of fairness by avoiding unfairness We discussed that FADM techniques for avoiding unfairness, instead of enhancing fairness. 24
  • 27. PART Ⅲ Generalization Bound in terms of Fairness 25
  • 28. Part Ⅲ: Introduction There are many technical problems to solve in a FADM literature, because tools for excluding specific information has not been developed actively. Types of Sensitive Features Non-binary sensitive feature Analysis Techniques Analysis methods other than classification or regression Optimization Constraint terms make objective functions non-convex Fairness measure Interpretable to humans and having convenient properties Learning Theory Generalization ability in terms of fairness 26
  • 30. Conclusion Applications of Fairness-Aware Data Mining Applications other than anti-discrimination: providing unbiased information, fair trading, and excluding unwanted information New Directions of Fairness Relation of fairness with causal inference and information theory Formal fairness introducing an idea of a fair division problem Avoiding unfair treatment, instead of enhancing fairness Generalization bound in terms of fairness Generalization bound in terms of fairness based on f-divergence Additional Information and codes http://www.kamishima.net/fadm Acknowledgments: This work is supported by MEXT/JSPS KAKENHI Grant Number 24500194, 24680015, 25540094, 25540094, and 15K00327 28
  • 31. Bibliography I A. Barr. Google mistakenly tags black people as ‘gorillas,’ showing limits of algorithms. The Wall Street Journal, 2015. ⟨http://on.wsj.com/1CaCNlb⟩. B. Berendt and S. Preibusch. Exploring discrimination: A user-centric evaluation of discrimination-aware data mining. In Proc. of the IEEE Int’l Workshop on Discrimination and Privacy-Aware Data Mining, pages 344–351, 2012. T. Calders, A. Karim, F. Kamiran, W. Ali, and X. Zhang. Controlling attribute effect in linear regression. In Proc. of the 13th IEEE Int’l Conf. on Data Mining, pages 71–80, 2013. T. Calders and S. Verwer. Three naive Bayes approaches for discrimination-free classification. Data Mining and Knowledge Discovery, 21:277–292, 2010. C. Dwork, M. Hardt, T. Pitassi, O. Reingold, and R. Zemel. Fairness through awareness. In Proc. of the 3rd Innovations in Theoretical Computer Science Conf., pages 214–226, 2012. 1 / 4
  • 32. Bibliography II K. Fukuchi and J. Sakuma. Fairness-aware learning with restriction of universal dependency using f-divergences. arXiv:1104.3913 [cs.CC], 2015. D. Gondek and T. Hofmann. Non-redundant data clustering. In Proc. of the 4th IEEE Int’l Conf. on Data Mining, pages 75–82, 2004. T. Kamishima, S. Akaho, H. Asoh, and J. Sakuma. Considerations on fairness-aware data mining. In Proc. of the IEEE Int’l Workshop on Discrimination and Privacy-Aware Data Mining, pages 378–385, 2012. T. Kamishima, S. Akaho, H. Asoh, and J. Sakuma. Enhancement of the neutrality in recommendation. In Proc. of the 2nd Workshop on Human Decision Making in Recommender Systems, pages 8–14, 2012. T. Kamishima, S. Akaho, H. Asoh, and J. Sakuma. Fairness-aware classifier with prejudice remover regularizer. In Proc. of the ECML PKDD 2012, Part II, pages 35–50, 2012. [LNCS 7524]. 2 / 4
  • 33. Bibliography III T. Kamishima, S. Akaho, H. Asoh, and J. Sakuma. Efficiency improvement of neutrality-enhanced recommendation. In Proc. of the 3rd Workshop on Human Decision Making in Recommender Systems, pages 1–8, 2013. E. Pariser. The filter bubble. ⟨http://www.thefilterbubble.com/⟩. E. Pariser. The Filter Bubble: What The Internet Is Hiding From You. Viking, 2011. A. Romei and S. Ruggieri. A multidisciplinary survey on discrimination analysis. The Knowledge Engineering Review, 29(5):582–638, 2014. L. Sweeney. Discrimination in online ad delivery. Communications of the ACM, 56(5):44–54, 2013. I. ˇZliobait˙e, F. Kamiran, and T. Calders. Handling conditional discrimination. In Proc. of the 11th IEEE Int’l Conf. on Data Mining, 2011. 3 / 4
  • 34. Bibliography IV R. Zemel, Y. Wu, K. Swersky, T. Pitassi, and C. Dwork. Learning fair representations. In Proc. of the 30th Int’l Conf. on Machine Learning, 2013. 4 / 4