Efficiency Improvement of Neutrality-Enhanced Recommendation
Workshop on Human Decision Making in Recommender Systems, in conjunction with RecSys 2013
Article @ Official Site: http://ceur-ws.org/Vol-1050/
Article @ Personal Site: http://www.kamishima.net/archive/2013-ws-recsys-print.pdf
Handnote : http://www.kamishima.net/archive/2013-ws-recsys-HN.pdf
Program codes : http://www.kamishima.net/inrs/
Workshop Homepage: http://recex.ist.tugraz.at/RecSysWorkshop/
Abstract:
This paper proposes an algorithm for making recommendations so that neutrality from a viewpoint specified by the user is enhanced. This algorithm is useful for avoiding decisions based on biased information. One such problem is the filter bubble: the biasing of social decisions by personalization technologies. To provide a neutrality-enhanced recommendation, we must first assume that a user can specify a particular viewpoint from which the neutrality is applied, because a recommendation that is neutral from all viewpoints is no longer a recommendation. Given such a target viewpoint, we implement an information-neutral recommendation algorithm by introducing a penalty term to enforce statistical independence between the target viewpoint and a rating. We empirically show that our algorithm enhances the independence from the specified viewpoint.
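The penalty-term approach this abstract describes can be sketched in a few lines. The following is an illustrative toy, not the paper's implementation: a bias-only rating model on synthetic data, with a squared mean-gap penalty standing in for the independence term, showing that raising the penalty weight eta shrinks the dependence between predicted scores and the target viewpoint.

```python
import random

random.seed(0)

N_USERS, N_ITEMS = 20, 10
# Toy data: an item's viewpoint v is i % 2, and v=1 items are rated
# about one point higher, so unconstrained scores correlate with v.
data = [(u, i, i % 2, 3.0 + (i % 2) + random.gauss(0, 0.3))
        for u in range(N_USERS) for i in range(N_ITEMS)]
n = len(data)

def train(eta, epochs=500, lr=0.05):
    # Bias-only rating model r_hat = b + bu[u] + bi[i]; objective:
    #   mean squared error + eta * (E[r_hat | v=0] - E[r_hat | v=1])^2
    b, bu, bi = 0.0, [0.0] * N_USERS, [0.0] * N_ITEMS
    n0 = sum(1 for _, _, v, _ in data if v == 0)
    n1 = n - n0
    gap = 0.0
    for _ in range(epochs):
        preds = [b + bu[u] + bi[i] for u, i, _, _ in data]
        m0 = sum(p for p, (_, _, v, _) in zip(preds, data) if v == 0) / n0
        m1 = sum(p for p, (_, _, v, _) in zip(preds, data) if v == 1) / n1
        gap = m0 - m1
        gb, gbu, gbi = 0.0, [0.0] * N_USERS, [0.0] * N_ITEMS
        for p, (u, i, v, r) in zip(preds, data):
            g = 2 * (p - r) / n                                   # accuracy term
            g += 2 * eta * gap * (1 / n0 if v == 0 else -1 / n1)  # penalty term
            gb += g
            gbu[u] += g
            gbi[i] += g
        b -= lr * gb
        bu = [x - lr * gx for x, gx in zip(bu, gbu)]
        bi = [x - lr * gx for x, gx in zip(bi, gbi)]
    return abs(gap)

print(train(eta=0.0))   # large gap: scores track the viewpoint
print(train(eta=10.0))  # gap near zero: neutrality is enhanced
```

The same penalty-versus-accuracy trade-off carries over to richer models (e.g., full matrix factorization); only the gradient bookkeeping grows.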
Model-based Approaches for Independence-Enhanced Recommendation
Toshihiro Kamishima
IEEE International Workshop on Privacy Aspects of Data Mining (PADM), in conjunction with ICDM2016
Article @ Official Site: http://doi.ieeecomputersociety.org/10.1109/ICDMW.2016.0127
Workshop Homepage: http://pddm16.eurecat.org/
Abstract:
This paper studies a new approach to enhance recommendation independence. Such approaches are useful in ensuring adherence to laws and regulations, fair treatment of content providers, and exclusion of unwanted information. For example, recommendations that match an employer with a job applicant should not be based on socially sensitive information, such as gender or race, from the perspective of social fairness. An algorithm that could exclude the influence of such sensitive information would be useful in this case. We previously gave a formal definition of recommendation independence and proposed a method adopting a regularizer that imposes such an independence constraint. As no other options than this regularization approach have been put forward, we here propose a new model-based approach, which is based on a generative model that satisfies the constraint of recommendation independence. We apply this approach to a latent class model and empirically show that the model-based approach can enhance recommendation independence. Recommendation algorithms based on generative models, such as topic models, are important, because they have a flexible functionality that enables them to incorporate a wide variety of information types. Our new model-based approach will broaden the applications of independence-enhanced recommendation by integrating the functionality of generative models.
The Independence of Fairness-aware Classifiers
IEEE International Workshop on Privacy Aspects of Data Mining (PADM), in conjunction with ICDM2013
Article @ Official Site:
Article @ Personal Site: http://www.kamishima.net/archive/2013-ws-icdm-print.pdf
Handnote : http://www.kamishima.net/archive/2013-ws-icdm-HN.pdf
Program codes : http://www.kamishima.net/fadm/
Workshop Homepage: http://www.cs.cf.ac.uk/padm2013/
Abstract:
Due to the spread of data mining technologies, such technologies are being used for determinations that seriously affect individuals' lives. For example, credit scoring is frequently determined based on the records of past credit data together with statistical prediction techniques. Needless to say, such determinations must be nondiscriminatory and fair in sensitive features, such as race, gender, religion, and so on. The goal of fairness-aware classifiers is to classify data while taking into account the potential issues of fairness, discrimination, neutrality, and/or independence. In this paper, after reviewing fairness-aware classification methods, we focus on one such method, Calders and Verwer's two-naive-Bayes method. This method has been shown superior to the other classifiers in terms of fairness, which is formalized as the statistical independence between a class and a sensitive feature. However, the cause of the superiority is unclear, because it utilizes a somewhat heuristic post-processing technique rather than an explicitly formalized model. We clarify the cause by comparing this method with an alternative naive Bayes classifier, which is modified by a modeling technique called "hypothetical fair-factorization." This investigation reveals the theoretical background of the two-naive-Bayes method and its connections with other methods. Based on these findings, we develop another naive Bayes method with an "actual fair-factorization" technique and empirically show that this new method can achieve an equal level of fairness as that of the two-naive-Bayes classifier.
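The "hypothetical fair-factorization" idea — replacing the joint prior P(y, s) with the independent product P(y)P(s) inside a naive Bayes model — can be illustrated on synthetic data. This is a simplified sketch, not the paper's implementation; the discrimination measure below is a soft variant of the Calders-Verwer score.

```python
import random

random.seed(1)

# Synthetic data: sensitive feature s, class y biased by s, feature x reflecting y.
def sample():
    s = int(random.random() < 0.5)
    y = int(random.random() < (0.7 if s else 0.3))
    x = int(random.random() < (0.8 if y else 0.2))
    return s, y, x

data = [sample() for _ in range(20000)]
n = len(data)

def freq(cond):  # empirical probability of a condition over the data
    return sum(1 for d in data if cond(*d)) / n

p_ys = {(y, s): freq(lambda S, Y, X: Y == y and S == s)
        for y in (0, 1) for s in (0, 1)}
p_y = {y: p_ys[y, 0] + p_ys[y, 1] for y in (0, 1)}
p_s = {s: p_ys[0, s] + p_ys[1, s] for s in (0, 1)}
# naive Bayes likelihood P(x=1 | y), shared across s
p_x1 = {y: freq(lambda S, Y, X: Y == y and X == 1) / p_y[y] for y in (0, 1)}

def posterior(x, s, fair):
    # standard NB prior: P(y, s); fair-factorized prior: P(y)P(s),
    # which forces the class and the sensitive feature to be independent
    num = {}
    for y in (0, 1):
        prior = p_y[y] * p_s[s] if fair else p_ys[y, s]
        num[y] = prior * (p_x1[y] if x == 1 else 1 - p_x1[y])
    return num[1] / (num[0] + num[1])

def cv_score(fair):
    # soft Calders-Verwer score: E[P(y=1|.) | s=1] - E[P(y=1|.) | s=0]
    m = {0: [], 1: []}
    for s, y, x in data:
        m[s].append(posterior(x, s, fair))
    return sum(m[1]) / len(m[1]) - sum(m[0]) / len(m[0])

print(round(cv_score(fair=False), 3))  # the joint prior leaks s into decisions
print(round(cv_score(fair=True), 3))   # the fair factorization removes most of it
```

Residual discrimination remains because x itself correlates with s through y, which is exactly the gap the paper's "actual fair-factorization" technique addresses.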
Enhancement of the Neutrality in Recommendation
Workshop on Human Decision Making in Recommender Systems, in conjunction with RecSys 2012
Article @ Official Site: http://ceur-ws.org/Vol-893/
Article @ Personal Site: http://www.kamishima.net/archive/2012-ws-recsys-print.pdf
Handnote : http://www.kamishima.net/archive/2012-ws-recsys-HN.pdf
Program codes : http://www.kamishima.net/inrs
Workshop Homepage: http://recex.ist.tugraz.at/RecSysWorkshop2012
Abstract:
This paper proposes an algorithm for making recommendations so that the neutrality toward a viewpoint specified by the user is enhanced. This algorithm is useful for avoiding decisions based on biased information. One such problem is the filter bubble: the biasing of social decisions by personalization technologies. To provide such a recommendation, we assume that a user specifies a viewpoint toward which the user wants to enforce neutrality, because a recommendation that is neutral with respect to all information is no longer a recommendation. Given such a target viewpoint, we implemented an information-neutral recommendation algorithm by introducing a penalty term that enforces statistical independence between the target viewpoint and a preference score. We empirically show that our algorithm enhances independence from the specified viewpoint, and then demonstrate how the sets of recommended items are changed.
Future Directions of Fairness-Aware Data Mining: Recommendation, Causality, and Theoretical Aspects
Toshihiro Kamishima
Invited Talk @ Workshop on Fairness, Accountability, and Transparency in Machine Learning
In conjunction with the ICML 2015 @ Lille, France, Jul. 11, 2015
Web Site: http://www.kamishima.net/fadm/
Handnote: http://www.kamishima.net/archive/2015-ws-icml-HN.pdf
The goal of fairness-aware data mining (FADM) is to analyze data while taking into account potential issues of fairness. In this talk, we will cover three topics in FADM:
1. Fairness in a Recommendation Context: In classification tasks, the term "fairness" is generally understood as anti-discrimination. We will present other types of problems related to fairness in a recommendation context.
2. What is Fairness: Most formal definitions of fairness have a connection with the notion of statistical independence. We will explore other types of formal fairness based on causality, agreement, and unfairness.
3. Theoretical Problems of FADM: After reviewing technical and theoretical open problems in the FADM literature, we will introduce the theory of the generalization bound in terms of accuracy as well as fairness.
Joint work with Jun Sakuma, Shotaro Akaho, and Hideki Asoh
Correcting Popularity Bias by Enhancing Recommendation Neutrality
Toshihiro Kamishima
The 8th ACM Conference on Recommender Systems, Poster
Article @ Official Site: http://ceur-ws.org/Vol-1247/
Article @ Personal Site: http://www.kamishima.net/archive/2014-po-recsys-print.pdf
Abstract:
In this paper, we attempt to correct a popularity bias, which is the tendency for popular items to be recommended more frequently, by enhancing recommendation neutrality. Recommendation neutrality involves excluding specified information from the prediction process of recommendation. This neutrality was formalized as the statistical independence between a recommendation result and the specified information, and we developed a recommendation algorithm that satisfies this independence constraint. We correct the popularity bias by enhancing neutrality with respect to information regarding whether candidate items are popular or not. We empirically show that a popularity bias in the predicted preference scores can be corrected.
Fairness-aware Learning through Regularization Approach
The 3rd IEEE International Workshop on Privacy Aspects of Data Mining (PADM 2011)
Dec. 11, 2011 @ Vancouver, Canada, in conjunction with ICDM2011
Article @ Official Site: http://doi.ieeecomputersociety.org/10.1109/ICDMW.2011.83
Article @ Personal Site: http://www.kamishima.net/archive/2011-ws-icdm_padm.pdf
Handnote: http://www.kamishima.net/archive/2011-ws-icdm_padm-HN.pdf
Workshop Homepage: http://www.zurich.ibm.com/padm2011/
Abstract:
With the spread of data mining technologies and the accumulation of social data, such technologies and data are being used for determinations that seriously affect people's lives. For example, credit scoring is frequently determined based on the records of past credit data together with statistical prediction techniques. Needless to say, such determinations must be socially and legally fair from a viewpoint of social responsibility; namely, they must be unbiased and nondiscriminatory with respect to sensitive features, such as race, gender, religion, and so on. Several researchers have recently begun to attempt the development of analysis techniques that are aware of social fairness or discrimination. They have shown that simply avoiding the use of sensitive features is insufficient for eliminating biases in determinations, due to the indirect influence of sensitive information. From a privacy-preserving viewpoint, this can be interpreted as hiding sensitive information when classification results are observed. In this paper, we first discuss three causes of unfairness in machine learning. We then propose a regularization approach that is applicable to any prediction algorithm with probabilistic discriminative models. We further apply this approach to logistic regression and empirically show its effectiveness and efficiency.
Considerations on Recommendation Independence for a Find-Good-Items Task
Toshihiro Kamishima
Workshop on Responsible Recommendation (FATREC), in conjunction with RecSys2017
Article @ Official Site: http://doi.org/10.18122/B2871W
Workshop Homepage: https://piret.gitlab.io/fatrec/
This paper examines the notion of recommendation independence, a constraint that a recommendation result be independent of specified information. This constraint is useful in ensuring adherence to laws and regulations, fair treatment of content providers, and exclusion of unwanted information. For example, to make a job-matching recommendation socially fair, the matching should be independent of socially sensitive information, such as gender or race. We previously developed several recommenders satisfying recommendation independence, but these were all designed for a predicting-ratings task, whose goal is to predict the rating score that a user would give an item. Here we focus on another task, find-good-items, which aims to find items that a user would prefer. In this task, scores representing the degree of preference for items are first predicted, and the items with the largest scores are displayed in the form of a ranked list. We developed a preliminary algorithm for this task through a naive approach: enhancing independence between a preference score and sensitive information. We empirically show that although this algorithm can enhance the independence of a preference score, it is not fit for the purpose of enhancing independence in terms of a ranked list. This result indicates the need to invent a notion of independence that is suitable for ranked lists and applicable to a find-good-items task.
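The gap this abstract identifies — independence of preference scores failing to carry over to the ranked list — is easy to reproduce with synthetic scores. In this toy example (hypothetical groups and score distributions), the two groups' score means coincide, yet the top-k list is dominated by one group because its scores have higher variance.

```python
import random

random.seed(2)

# Candidate items from two sensitive groups: identical score means,
# but group 1's predicted scores are far more spread out.
items = ([(0, random.gauss(0.0, 0.1)) for _ in range(500)]
         + [(1, random.gauss(0.0, 0.5)) for _ in range(500)])

means = {g: sum(s for gg, s in items if gg == g) / 500 for g in (0, 1)}
print(abs(means[0] - means[1]))  # tiny: score-level independence in the mean holds

# A find-good-items task shows the items with the largest scores as a ranked list.
k = 50
top_k = sorted(items, key=lambda t: -t[1])[:k]
share_group1 = sum(g for g, _ in top_k) / k
print(share_group1)  # close to 1.0: the ranked list is dominated by group 1 anyway
```

Because top-k selection only looks at the upper tail, any mismatch in higher moments of the score distributions skews the list even when the means (or the whole marginal score distribution per user) look balanced.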
Consideration on Fairness-aware Data Mining
IEEE International Workshop on Discrimination and Privacy-Aware Data Mining (DPADM 2012)
Dec. 10, 2012 @ Brussels, Belgium, in conjunction with ICDM2012
Article @ Official Site: http://doi.ieeecomputersociety.org/10.1109/ICDMW.2012.101
Article @ Personal Site: http://www.kamishima.net/archive/2012-ws-icdm-print.pdf
Handnote: http://www.kamishima.net/archive/2012-ws-icdm-HN.pdf
Workshop Homepage: https://sites.google.com/site/dpadm2012/
Abstract:
With the spread of data mining technologies and the accumulation of social data, such technologies and data are being used for determinations that seriously affect individuals' lives. For example, credit scoring is frequently determined based on the records of past credit data together with statistical prediction techniques. Needless to say, such determinations must be nondiscriminatory and fair regarding sensitive features such as race, gender, religion, and so on. Several researchers have recently begun to develop fairness-aware or discrimination-aware data mining techniques that take into account issues of social fairness, discrimination, and neutrality. In this paper, after demonstrating the applications of these techniques, we explore the formal concepts of fairness and techniques for handling fairness in data mining. We then provide an integrated view of these concepts based on statistical independence. Finally, we discuss the relations between fairness-aware data mining and other research topics, such as privacy-preserving data mining or causal inference.
Fairness-aware Classifier with Prejudice Remover Regularizer
Toshihiro Kamishima
Proceedings of the European Conference on Machine Learning and Principles of Knowledge Discovery in Databases (ECMLPKDD), Part II, pp.35-50 (2012)
Article @ Official Site: http://dx.doi.org/10.1007/978-3-642-33486-3_3
Article @ Personal Site: http://www.kamishima.net/archive/2012-p-ecmlpkdd-print.pdf
Handnote: http://www.kamishima.net/archive/2012-p-ecmlpkdd-HN.pdf
Program codes : http://www.kamishima.net/fadm/
Conference Homepage: http://www.ecmlpkdd2012.net/
Abstract:
With the spread of data mining technologies and the accumulation of social data, such technologies and data are being used for determinations that seriously affect individuals' lives. For example, credit scoring is frequently determined based on the records of past credit data together with statistical prediction techniques. Needless to say, such determinations must be nondiscriminatory and fair in sensitive features, such as race, gender, religion, and so on. Several researchers have recently begun to attempt the development of analysis techniques that are aware of social fairness or discrimination. They have shown that simply avoiding the use of sensitive features is insufficient for eliminating biases in determinations, due to the indirect influence of sensitive information. In this paper, we first discuss three causes of unfairness in machine learning. We then propose a regularization approach that is applicable to any prediction algorithm with probabilistic discriminative models. We further apply this approach to logistic regression and empirically show its effectiveness and efficiency.
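A minimal sketch of the regularization approach described above, assuming a Bernoulli-mean approximation of the mutual-information ("prejudice") term rather than the paper's exact regularizer; the data, features, and constants are all illustrative, and the finite-difference gradient is for brevity only.

```python
import math
import random

random.seed(3)

# Synthetic data: class y depends on feature x1 and sensitive feature s;
# x2 is a noisy proxy for s. The model sees x = (1, x1, x2, s).
def sample():
    s = int(random.random() < 0.5)
    x1 = random.gauss(0, 1)
    x2 = random.gauss(1.5 * s, 1)
    y = int(random.random() < 1 / (1 + math.exp(-(x1 + 2 * s - 1))))
    return [1.0, x1, x2, float(s)], s, y

data = [sample() for _ in range(400)]

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

def loss(w, eta):
    nll, group = 0.0, {0: [], 1: []}
    for x, s, y in data:
        p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
        nll -= math.log(p if y else 1 - p)
        group[s].append(p)
    # Bernoulli-mean approximation of the mutual information I(y_hat; s)
    ps = {s: sum(g) / len(g) for s, g in group.items()}
    pall = sum(ps[s] * len(group[s]) for s in (0, 1)) / len(data)
    mi = 0.0
    for s in (0, 1):
        w_s = len(group[s]) / len(data)
        for q, qa in ((ps[s], pall), (1 - ps[s], 1 - pall)):
            mi += w_s * q * math.log(q / qa)
    return nll / len(data) + eta * mi, mi

def train(eta, steps=300, lr=0.5, h=1e-5):
    w = [0.0] * 4
    for _ in range(steps):
        base, _ = loss(w, eta)
        grad = []
        for j in range(len(w)):     # finite-difference gradient (sketch only)
            w[j] += h
            grad.append((loss(w, eta)[0] - base) / h)
            w[j] -= h
        for j in range(len(w)):
            w[j] -= lr * grad[j]
    return loss(w, 0.0)[1]          # report the prejudice (MI) term alone

print(train(eta=0.0))   # clear dependence between predictions and s
print(train(eta=2.0))   # the regularizer drives the dependence toward zero
```

The weight eta trades prediction accuracy against prejudice removal, which mirrors the accuracy-fairness trade-off the paper evaluates on logistic regression.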
Instance Selection and Optimization of Neural Networks
ITIIIndustries
Credit scoring is an important tool in financial institutions that can be used in credit-granting decisions. Credit applications are scored by credit scoring models, and those with high scores are treated as “good”, while those with low scores are regarded as “bad”. As data mining techniques develop, automatic credit scoring systems are widely welcomed for their high efficiency and objective judgments. Many machine learning algorithms have been applied to training credit scoring models, and the ANN is one of them with good performance. This paper presents a higher-accuracy credit scoring model based on MLP neural networks trained with the back-propagation algorithm. Our work focuses on enhancing credit scoring models in three aspects: optimizing the data distribution in datasets using a new method called Average Random Choosing; comparing the effects of the numbers of training, validation, and test instances; and finding the most suitable number of hidden units. Another contribution of this paper is summarizing the tendency of model scoring accuracy as the number of hidden units increases. The experimental results show that our methods can achieve high credit scoring accuracy on imbalanced datasets. Thus, credit-granting decisions can be made by data mining methods using MLP neural networks.
Aspect Extraction Performance With Common Pattern of Dependency Relation in Multi Aspect Sentiment Analysis
Nurfadhlina Mohd Sharef
A. S., Shafie, Sharef, N. M., Murad, M. A. A., Azman, A., (2018), "Aspect Extraction Performance With Common Pattern of Dependency Relation in Multi Aspect Sentiment Analysis", 2018 Fourth International Conference on Information Retrieval and Knowledge Management (CAMP18), Kota Kinabalu, in press.
Enhancing Multi-Aspect Collaborative Filtering for Personalized Recommendation
Nurfadhlina Mohd Sharef
Khairudin, N., Sharef, N. M., Mustapha, N., Noah, S A. M., (2018), "Enhancing Multi-Aspect Collaborative Filtering for Personalized Recommendation", 2018 Fourth International Conference on Information Retrieval and Knowledge Management (CAMP18), Kota Kinabalu.
Recommendation Independence
The 1st Conference on Fairness, Accountability, and Transparency
Article @ Official Site: http://proceedings.mlr.press/v81/kamishima18a.html
Conference site: https://fatconference.org/2018/
Abstract:
This paper studies a recommendation algorithm whose outcomes are not influenced by specified information. It is useful in contexts where potentially unfair decisions should be avoided, such as job-applicant recommendations that are not influenced by socially sensitive information. An algorithm that could exclude the influence of sensitive information would thus be useful for fair job-matching. We call this condition between a recommendation outcome and a sensitive feature Recommendation Independence, which is formally defined as statistical independence between the outcome and the feature. Our previous independence-enhanced algorithms simply matched the means of predictions between sub-datasets sharing the same sensitive value. However, this approach could not remove sensitive information represented by the second or higher moments of the distributions. In this paper, we develop new methods that can deal with the second moment, i.e., the variance, of recommendation outcomes without increasing the computational complexity. These methods remove the sensitive information more strictly, and experimental results demonstrate that our new algorithms can more effectively eliminate the factors that undermine fairness. Additionally, we explore potential applications for independence-enhanced recommendation, and discuss its relation to other concepts, such as recommendation diversity.
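The limitation described here — mean matching being blind to higher moments — can be shown directly. In this sketch (synthetic score distributions; the post-hoc standardization is an illustration, not the paper's algorithm), a variance mismatch passes the mean-matching penalty untouched but is caught once the second moment is included.

```python
import random
import statistics as st

random.seed(4)

# Predicted scores for two sensitive groups: equal means, unequal variances.
g0 = [random.gauss(3.5, 0.3) for _ in range(1000)]
g1 = [random.gauss(3.5, 0.9) for _ in range(1000)]

def mean_match_penalty(a, b):
    # the earlier approach: match only the first moments
    return (st.mean(a) - st.mean(b)) ** 2

def moment_match_penalty(a, b):
    # also penalize a mismatch in the second moment (standard deviation)
    return mean_match_penalty(a, b) + (st.pstdev(a) - st.pstdev(b)) ** 2

print(mean_match_penalty(g0, g1))    # near 0: blind to the variance difference
print(moment_match_penalty(g0, g1))  # clearly positive: the mismatch is caught

# A post-hoc standardization (illustration only) that satisfies both moments:
m0, s0 = st.mean(g0), st.pstdev(g0)
m1, s1 = st.mean(g1), st.pstdev(g1)
g1_adj = [(x - m1) / s1 * s0 + m0 for x in g1]
print(moment_match_penalty(g0, g1_adj))  # near 0 once both moments are matched
```

In the paper's setting the second-moment term is folded into the training objective rather than applied after the fact, but the measurement it optimizes is the same kind of quantity.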
Recommender Systems Fairness Evaluation via Generalized Cross EntropyVito Walter Anelli
Fairness in recommender systems has been considered with respect to sensitive attributes of users (e.g., gender, race) or items (e.g., revenue in a multistakeholder setting). Regardless, the concept has been commonly interpreted as some form of equality – i.e., the degree to which the system is meeting the information needs of all its users in an equal sense. In this paper, we argue that fairness in recommender systems does not necessarily imply equality, but instead it should consider a distribution of resources based on merits and needs. We present a probabilistic framework based on generalized cross entropy to evaluate fairness of recommender systems under this perspective, where we show that the proposed framework is flexible and explanatory by allowing to incorporate domain knowledge (through an ideal fair distribution) that can help to understand which item or user aspects a recommendation algorithm is over- or under-representing. Results on two real-world datasets show the merits of the proposed evaluation framework both in terms of user and item fairness.
Fairness-aware Classifier with Prejudice Remover RegularizerToshihiro Kamishima
Fairness-aware Classifier with Prejudice Remover Regularizer
Proceedings of the European Conference on Machine Learning and Principles of Knowledge Discovery in Databases (ECMLPKDD), Part II, pp.35-50 (2012)
Article @ Official Site: http://dx.doi.org/10.1007/978-3-642-33486-3_3
Article @ Personal Site: http://www.kamishima.net/archive/2012-p-ecmlpkdd-print.pdf
Handnote: http://www.kamishima.net/archive/2012-p-ecmlpkdd-HN.pdf
Program codes : http://www.kamishima.net/fadm/
Conference Homepage: http://www.ecmlpkdd2012.net/
Abstract:
With the spread of data mining technologies and the accumulation of social data, such technologies and data are being used for determinations that seriously affect individuals' lives. For example, credit scoring is frequently determined based on the records of past credit data together with statistical prediction techniques. Needless to say, such determinations must be nondiscriminatory and fair in sensitive features, such as race, gender, religion, and so on. Several researchers have recently begun to attempt the development of analysis techniques that are aware of social fairness or discrimination. They have shown that simply avoiding the use of sensitive features is insufficient for eliminating biases in determinations, due to the indirect influence of sensitive information. In this paper, we first discuss three causes of unfairness in machine learning. We then propose a regularization approach that is applicable to any prediction algorithm with probabilistic discriminative models. We further apply this approach to logistic regression and empirically show its effectiveness and efficiency.
Instance Selection and Optimization of Neural NetworksITIIIndustries
Credit scoring is an important tool in financial institutions, which can be used in credit granting decision. Credit applications are marked by credit scoring models and those with high marks will be treated as “good”, while those with low marks will be regarded as “bad”. As data mining technique develops, automatic credit scoring systems are warmly welcomed for their high efficiency and objective judgments. Many machine learning algorithms have been applied in training credit scoring models, and ANN is one of them with good performance. This paper presents a higher accuracy credit scoring model based on MLP neural networks trained with back propagation algorithm. Our work focuses on enhancing credit scoring models in three aspects: optimize data distribution in datasets using a new method called Average Random Choosing; compare effects of training-validation-test instances numbers; and find the most suitable number of hidden units. Another contribution of this paper is summarizing the tendency of scoring accuracy of models when the number of hidden units increases. The experiment results show that our methods can achieve high credit scoring accuracy with imbalanced datasets. Thus, credit granting decision can be made by data mining methods using MLP neural networks.
Aspect Extraction Performance With Common Pattern of Dependency Relation in ...Nurfadhlina Mohd Sharef
A. S., Shafie, Sharef, N. M., Murad, M. A. A., Azman, A., (2018), "Aspect Extraction Performance With Common Pattern of Dependency Relation in Multi Aspect Sentiment Analysis", 2018 Fourth International Conference on Information Retrieval and Knowledge Management (CAMP18), Kota Kinabalu, in press.
Enhancing Multi-Aspect Collaborative Filtering for Personalized RecommendationNurfadhlina Mohd Sharef
Khairudin, N., Sharef, N. M., Mustapha, N., Noah, S A. M., (2018), "Enhancing Multi-Aspect Collaborative Filtering for Personalized Recommendation", 2018 Fourth International Conference on Information Retrieval and Knowledge Management (CAMP18), Kota Kinabalu
Recommendation Independence
The 1st Conference on Fairness, Accountability, and Transparency
Article @ Official Site: http://proceedings.mlr.press/v81/kamishima18a.html
Conference site: https://fatconference.org/2018/
Abstract:
This paper studies a recommendation algorithm whose outcomes are not influenced by specified information. It is useful in contexts potentially unfair decision should be avoided, such as job-applicant recommendations that are not influenced by socially sensitive information. An algorithm that could exclude the influence of sensitive information would thus be useful for job-matching with fairness. We call the condition between a recommendation outcome and a sensitive feature Recommendation Independence, which is formally defined as statistical independence between the outcome and the feature. Our previous independence-enhanced algorithms simply matched the means of predictions between sub-datasets consisting of the same sensitive value. However, this approach could not remove the sensitive information represented by the second or higher moments of distributions. In this paper, we develop new methods that can deal with the second moment, i.e., variance, of recommendation outcomes without increasing the computational complexity. These methods can more strictly remove the sensitive information, and experimental results demonstrate that our new algorithms can more effectively eliminate the factors that undermine fairness. Additionally, we explore potential applications for independence-enhanced recommendation, and discuss its relation to other concepts, such as recommendation diversity.
Recommender Systems Fairness Evaluation via Generalized Cross EntropyVito Walter Anelli
Fairness in recommender systems has been considered with respect to sensitive attributes of users (e.g., gender, race) or items (e.g., revenue in a multistakeholder setting). Regardless, the concept has been commonly interpreted as some form of equality – i.e., the degree to which the system is meeting the information needs of all its users in an equal sense. In this paper, we argue that fairness in recommender systems does not necessarily imply equality, but instead it should consider a distribution of resources based on merits and needs. We present a probabilistic framework based on generalized cross entropy to evaluate fairness of recommender systems under this perspective, where we show that the proposed framework is flexible and explanatory by allowing to incorporate domain knowledge (through an ideal fair distribution) that can help to understand which item or user aspects a recommendation algorithm is over- or under-representing. Results on two real-world datasets show the merits of the proposed evaluation framework both in terms of user and item fairness.
Revisiting Offline Evaluation for Implicit-Feedback Recommender Systems (Doct...Olivier Jeunen
Slides for my Doctoral Symposium presentation at RecSys '19 in Copenhagen, titled "Revisiting Offline Evaluation for Implicit-Feedback Recommender Systems".
Empirical Evaluation of Active Learning in Recommender SystemsUniversity of Bergen
The accuracy of collaborative-filtering recommender systems largely depends on three factors: the quality of the rating prediction algorithm, and the quantity and quality of available ratings. While research in the field of recommender systems often concentrates on improving prediction algorithms, even the best algorithms will fail if they are fed poor quality data during training. Active learning aims to remedy this problem by focusing on obtaining better quality data that more aptly reflects a user’s preferences. In attempt to do that, an active learning strategy selects the best items to be presented to the user in order to acquire her ratings and hence improve the output of the RS.
In this seminar, I present a set of active learning strategies with different characteristics and the evaluation results with respect to several evaluation measures (i.e., MAE, NDCG, Precision, Coverage, Recommendation Quality, and, Quantity of the acquired ratings and contextual conditions).
The traditional evaluation of active learning strategies has two major flaws: (1) Performance has been evaluated for each user independently (ignoring system-wide improvements) (2) Active learning strategies have been evaluated in isolation from unsolicited user ratings (natural acquisition). Addressing these flaws, I present that an elicited rating has effects across the system, so a typical user-centric evaluation which ignores any changes of rating prediction of other users also ignores these cumulative effects, which may be more influential on the performance of the system as a whole (system-centric). Hence, I present a novel offline evaluation methodology and use it to evaluate some novel and state of the art rating elicitation strategies.
While the first set of experiments was done offline, the true value of active learning must be evaluated in an online setting. Hence, in the second part of the seminar, I present a novel active learning approach that exploits some additional information of the user (i.e. the user’s personality) to deal with the cold start problem in an up-and-running mobile context-aware RS called STS, that provides users with recommendations for places of interest (POIs). The results of live user studies, have shown that the proposed AL approach significantly increases the quantity of the ratings and contextual conditions acquired from the user as well as the recommendation accuracy.
Social Recommendation A Review ACM Transactions on Intelligent Systems and Technology 2013 Jiliang Tang · Xia Hu · Huan Liu Associate professor at Michigan.
Exploring Author Gender in Book Rating and Recommendation
M. D. Ekstrand, M. Tian, M. R. I. Kazi, H. Mehrpouyan, and D. Kluver
https://doi.org/10.1145/3240323.3240373
RecSys2018 論文読み会 (2018-11-17) https://atnd.org/events/101334
WSDM2018読み会
2018-04-14 @ クックパッド
https://atnd.org/events/95510
Offline A/B Testing for Recommender Systems
A. Gilotte, C. Calauzénes, T. Nedelec, A. Abraham, and S. Dollé
https://doi.org/10.1145/3159652.3159687
KDD2016勉強会 https://atnd.org/events/80771
論文:“Why Should I Trust You?”Explaining the Predictions of Any Classifier
著者:M. T. Ribeiro and S. Singh and C. Guestrin
論文リンク: http://www.kdd.org/kdd2016/subtopic/view/why-should-i-trust-you-explaining-the-predictions-of-any-classifier
WSDM2016勉強会 https://atnd.org/events/74341
論文:Portrait of an Online Shopper: Understanding and Predicting Consumer Behavior
著者:F. Kooti and K. Lerman and L. M. Aiello and M. Grbovic and N. Djuric
論文リンク: http://dx.doi.org/10.1145/2835776.2835831
Absolute and Relative Clustering
4th MultiClust Workshop on Multiple Clusterings, Multi-view Data, and Multi-source Knowledge-driven Clustering (Multiclust 2013)
Aug. 11, 2013 @ Chicago, U.S.A, in conjunction with KDD2013
Article @ Official Site: http://dx.doi.org/10.1145/2501006.2501013
Article @ Personal Site: http://www.kamishima.net/archive/2013-ws-kdd-print.pdf
Handnote: http://www.kamishima.net/archive/2013-ws-kdd-HN.pdf
Workshop Homepage: http://cs.au.dk/research/research-areas/data-intensive-systems/projects/multiclust2013/
Abstract:
Research into (semi-)supervised clustering has been increasing. Supervised clustering aims to group similar data that are partially guided by the user's supervision. In this supervised clustering, there are many choices for formalization. For example, as a type of supervision, one can adopt labels of data points, must/cannot links, and so on. Given a real clustering task, such as grouping documents or image segmentation, users must confront the question ``How should we mathematically formalize our task?''To help answer this question, we propose the classification of real clusterings into absolute and relative clusterings, which are defined based on the relationship between the resultant partition and the data set to be clustered. This categorization can be exploited to choose a type of task formalization.
Adjusting primitives for graph : SHORT REPORT / NOTESSubhajit Sahu
Graph algorithms, like PageRank Compressed Sparse Row (CSR) is an adjacency-list based graph representation that is
Multiply with different modes (map)
1. Performance of sequential execution based vs OpenMP based vector multiply.
2. Comparing various launch configs for CUDA based vector multiply.
Sum with different storage types (reduce)
1. Performance of vector element sum using float vs bfloat16 as the storage type.
Sum with different modes (reduce)
1. Performance of sequential execution based vs OpenMP based vector element sum.
2. Performance of memcpy vs in-place based CUDA based vector element sum.
3. Comparing various launch configs for CUDA based vector element sum (memcpy).
4. Comparing various launch configs for CUDA based vector element sum (in-place).
Sum with in-place strategies of CUDA mode (reduce)
1. Comparing various launch configs for CUDA based vector element sum (in-place).
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...pchutichetpong
M Capital Group (“MCG”) expects to see demand and the changing evolution of supply, facilitated through institutional investment rotation out of offices and into work from home (“WFH”), while the ever-expanding need for data storage as global internet usage expands, with experts predicting 5.3 billion users by 2023. These market factors will be underpinned by technological changes, such as progressing cloud services and edge sites, allowing the industry to see strong expected annual growth of 13% over the next 4 years.
Whilst competitive headwinds remain, represented through the recent second bankruptcy filing of Sungard, which blames “COVID-19 and other macroeconomic trends including delayed customer spending decisions, insourcing and reductions in IT spending, energy inflation and reduction in demand for certain services”, the industry has seen key adjustments, where MCG believes that engineering cost management and technological innovation will be paramount to success.
MCG reports that the more favorable market conditions expected over the next few years, helped by the winding down of pandemic restrictions and a hybrid working environment will be driving market momentum forward. The continuous injection of capital by alternative investment firms, as well as the growing infrastructural investment from cloud service providers and social media companies, whose revenues are expected to grow over 3.6x larger by value in 2026, will likely help propel center provision and innovation. These factors paint a promising picture for the industry players that offset rising input costs and adapt to new technologies.
According to M Capital Group: “Specifically, the long-term cost-saving opportunities available from the rise of remote managing will likely aid value growth for the industry. Through margin optimization and further availability of capital for reinvestment, strong players will maintain their competitive foothold, while weaker players exit the market to balance supply and demand.”
Efficiency Improvement of Neutrality-Enhanced Recommendation
1. Efficiency Improvement of
Neutrality-Enhanced Recommendation
Toshihiro Kamishima*, Shotaro Akaho*, Hideki Asoh*, and Jun Sakuma†
*National Institute of Advanced Industrial Science and Technology (AIST), Japan
†University of Tsukuba, Japan; and Japan Science and Technology Agency
Workshop on Human Decision Making in Recommender Systems
In conjunction with the RecSys 2013 @ Hong Kong, China, Oct. 12, 2013
2. Overview
Providing neutral information is important in recommendation
Information-neutral Recommender System
This system makes recommendations so as to enhance the neutrality from a viewpoint feature specified by a user
Absolutely neutral recommendation is intrinsically infeasible, because a recommendation is always biased in the sense that it is arranged for a specific user
Applications:
avoidance of biased recommendation
fair treatment of content suppliers or item providers
adherence to laws and regulations in recommendation
3. Outline
Introduction
Recommendation Neutrality: viewpoint feature, intuitive definition
Applications of Recommendation Neutrality: avoidance of biased recommendation, fair treatment of content providers, adherence to laws and regulations
Information-neutral Recommendation System: PMF model, information-neutral recommender system, neutrality terms
Experiments: small data, genre-wise mean differences, larger data
Discussion about the Recommendation Neutrality: subjectivity & objectivity, recommendation diversity, privacy-preserving data mining, and more
Conclusion
5. Viewpoint Feature
As in standard recommendation, we use random variables X: a user, Y: an item, and R: a rating value
We adopt an additional variable for the recommendation neutrality
V: viewpoint feature
It is specified by a user depending on his/her purpose
Its value is determined depending on a user, an item, and their features
Recommendation results are made neutral from this viewpoint
Ex. viewpoint = user's gender / movie's release year
In this talk, a viewpoint feature is restricted to a binary type
6. Recommendation Neutrality
Recommendation Neutrality
Recommendation results are neutral if information about a given viewpoint feature does not influence the results
The status of the specified viewpoint feature is explicitly excluded from the inference of the results
Ex. viewpoint = movie's release year
Whether a movie is new or old does not influence the inference of whether the movie is recommended or not
If movies A and B are the same except for their release year, movie A is always recommended whenever movie B is recommended, and vice versa
8. Avoidance of Biased Recommendation
Biased Recommendation
excludes a good candidate from a set of options
rates relatively inferior options higher
The Filter Bubble Problem
Pariser posed a concern that personalization technologies narrow and bias the topics of information provided to people
http://www.thefilterbubble.com/
Biased recommendations can lead to inappropriate decisions
9. Filter Bubble
[TED Talk by Eli Pariser]
Friend recommendation list in Facebook
To fit Pariser's preferences, conservative people were eliminated from his recommendation list, while he was not made aware of this fact
viewpoint = a political conviction of a friend candidate
Whether a candidate is conservative or progressive does not influence whether he/she is included in a friend list or not
10. Fair Treatment of Content Providers
System managers should treat their content providers fairly
Ranking in a list retrieved by search engines
The US FTC has been investigating Google to determine whether the search engine ranks its own services higher than those of competitors [Bloomberg]
Content providers are managers' customers
For marketplace sites, their tenants are customers, and these tenants must be treated fairly when recommending the tenants' products
viewpoint = a content provider of a candidate item
Information about who provides a candidate item is ignored, and providers are treated fairly
11. Adherence to Laws and Regulations
Recommendation services must be managed while adhering to laws and regulations
Suspicious placement of keyword-matching advertisements [Sweeney 13]
Advertisements indicating arrest records were more frequently displayed for names that are more popular among individuals of African descent than those of European descent
viewpoint = users' socially sensitive demographic information
Legally or socially sensitive information can be excluded from the inference process of recommendation
Socially discriminative treatments must be avoided
13. Information-neutral Recommender System
Information-neutral Recommender System
= an information-neutral version of a probabilistic matrix factorization model, combining:
neutrality from a specified viewpoint feature: a penalty term enhances the recommendation neutrality
high prediction accuracy: accurate prediction is achieved by minimizing an empirical error
14. Probabilistic Matrix Factorization
Probabilistic Matrix Factorization Model [Salakhutdinov 08, Koren 08]
predicts a preference rating of an item y rated by a user x; performs well and is widely used

r̂(x, y) = μ + b_x + c_y + p_x q_y^⊤

μ: global bias, b_x: user-dependent bias, c_y: item-dependent bias, p_x q_y^⊤: cross effect of users and items
For a given training data set (x_i: user, y_i: item, r_i: rating), model parameters are learned by minimizing the squared loss function with an L2 regularizer
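The PMF prediction rule can be sketched in a few lines of NumPy. This is an illustrative sketch, not the authors' code; the dimensions, initialization, and the `predict` name are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, k = 100, 50, 7        # illustrative sizes, not from the slides

# PMF parameters: global bias, per-user/per-item biases, latent factors
mu = 3.5                                 # global bias
b = rng.normal(0.0, 0.1, n_users)        # user-dependent bias b_x
c = rng.normal(0.0, 0.1, n_items)        # item-dependent bias c_y
P = rng.normal(0.0, 0.1, (n_users, k))   # user latent factors p_x
Q = rng.normal(0.0, 0.1, (n_items, k))   # item latent factors q_y

def predict(x, y):
    """Predicted rating: r_hat(x, y) = mu + b_x + c_y + p_x . q_y"""
    return mu + b[x] + c[y] + P[x] @ Q[y]
```

In a real system these parameters would be fitted by minimizing the regularized squared loss (e.g., by gradient descent), not drawn at random as here.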
15. Information-neutral PMF Model
Information-neutral version of a PMF model
adjust ratings according to the state of a viewpoint: incorporate dependency on a viewpoint feature variable
Multiple models are built separately, and each of these models corresponds to one value of the viewpoint feature
When predicting ratings, a model is selected according to the value of the viewpoint feature

r̂(x, y, v) = μ^(v) + b_x^(v) + c_y^(v) + p_x^(v) q_y^(v)⊤

enhance the neutrality of a score from a viewpoint: add a neutrality function as a constraint term
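A minimal sketch of the per-viewpoint parameterization described on this slide, assuming one full PMF parameter set per value of the binary viewpoint feature (sizes, initialization, and names are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, k = 100, 50, 7  # illustrative sizes

# One complete PMF parameter set per viewpoint value v in {0, 1}
params = [
    {
        "mu": 3.5,
        "b": rng.normal(0.0, 0.1, n_users),
        "c": rng.normal(0.0, 0.1, n_items),
        "P": rng.normal(0.0, 0.1, (n_users, k)),
        "Q": rng.normal(0.0, 0.1, (n_items, k)),
    }
    for _ in (0, 1)
]

def predict(x, y, v):
    """r_hat(x, y, v): the model indexed by the viewpoint value v is selected."""
    m = params[v]
    return m["mu"] + m["b"][x] + m["c"][y] + m["P"][x] @ m["Q"][y]
```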
16. Neutrality Term and Objective Function
Objective Function of an Information-neutral PMF Model
Parameters are learned by minimizing this objective function:

Σ_{(x_i, y_i, v_i, r_i) ∈ D} (r_i − r̂(x_i, y_i, v_i))² − η neutral(R, V) + λ ‖Θ‖²₂

squared loss function, neutrality term, L2 regularizer
η: neutrality parameter to control the balance between the neutrality and accuracy
λ: regularization parameter
neutrality term, neutral(R, V): quantifies the degree of neutrality
It depends on both ratings and viewpoint features
A larger value of the neutrality term indicates a higher level of neutrality
enhance the neutrality of a rating from a viewpoint feature
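As a sketch, the objective on this slide (squared loss, minus the η-weighted neutrality term since larger neutral(R, V) means more neutral, plus the L2 regularizer) can be written as follows; the function name and the generic `theta` parameter vector are my assumptions for illustration:

```python
import numpy as np

def objective(r, r_hat, neutral_value, eta, lam, theta):
    """Objective of an information-neutral PMF model.

    The neutrality term is subtracted because a larger neutral(R, V)
    indicates higher neutrality, while the objective is minimized.
    """
    sq_loss = np.sum((r - r_hat) ** 2)  # empirical squared error over D
    l2 = lam * np.sum(theta ** 2)       # L2 regularizer on all parameters
    return sq_loss - eta * neutral_value + l2
```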
17. Formal Definition of the Recommendation Neutrality
Recommendation Neutrality
Recommendation results are neutral if information about a given viewpoint feature does not influence the results
= the statistical independence between a rating variable, R, and a viewpoint feature, V:

Pr[R|V] = Pr[R]  ⟺  R ⊥⊥ V

Two types of neutrality terms
Mutual Information: I(R; V) = Σ_{R,V} Pr[R, V] log ( Pr[R|V] / Pr[R] )
Calders&Verwer's Score: ‖Pr[R|V=0] − Pr[R|V=1]‖
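To make the independence definition concrete, here is a small sketch that estimates I(R; V) empirically from samples by discretizing the ratings into histogram bins; the bin count and function name are assumptions, not from the slides:

```python
import numpy as np

def mutual_information(r, v, bins=5):
    """Empirical mutual information between ratings r and a binary viewpoint v.

    Approaches 0 when r and v are statistically independent.
    """
    joint, _, _ = np.histogram2d(r, v, bins=[bins, 2])
    p = joint / joint.sum()               # joint distribution Pr[R, V]
    pr = p.sum(axis=1, keepdims=True)     # marginal Pr[R]
    pv = p.sum(axis=0, keepdims=True)     # marginal Pr[V]
    with np.errstate(divide="ignore", invalid="ignore"):
        terms = p * np.log(p / (pr @ pv))  # Pr[R,V] log(Pr[R|V]/Pr[R])
    return float(np.nansum(terms))         # empty cells contribute 0
```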
18. Mutual Information
Neutrality Term in Our Previous Work [Kamishima 12]
mi-hist (mutual information - histogram)

I(R; V) ≈ (1/|D|) Σ_{(r_i, v_i) ∈ D} log ( Pr[r̂_i | v_i] / Σ_{v ∈ {0,1}} Pr[r̂_i | v] Pr[v] )

where the distribution Pr[r|v] = (1/|D|) Σ_{(x,y) ∈ D} Pr[r|x, y, v] is too complex and is modeled by a histogram
This term cannot be differentiated analytically
Optimization is too slow to process even moderately sized data
19. Calders&Verwer's Score (CV Score)
Our New Neutrality Term
analytically differentiable and efficient in optimization
make the two distributions of R given V = 0 and V = 1 similar: ‖Pr[R|V=0] − Pr[R|V=1]‖
m-match: matching means of predicted ratings when V = 0 and V = 1
(Mean_{D(0)}[r̂] − Mean_{D(1)}[r̂])²
r-match: matching the two predicted ratings when V = 0 and V = 1, regardless of the actual value of V
Σ_{(x,y) ∈ D} (r̂(x, y, 0) − r̂(x, y, 1))²
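The two CV-score-based terms can be sketched directly. In this sketch they are returned as penalties (smaller = more neutral); the slides' neutrality term would be their negative, and the function and argument names are mine:

```python
import numpy as np

def m_match(r_hat, v):
    """m-match penalty: squared difference of mean predicted ratings
    for the sub-datasets with v = 0 and v = 1."""
    return (r_hat[v == 0].mean() - r_hat[v == 1].mean()) ** 2

def r_match(r_hat_v0, r_hat_v1):
    """r-match penalty: sum of squared differences between predictions
    made with the viewpoint forced to 0 and to 1, regardless of its
    actual value."""
    return np.sum((r_hat_v0 - r_hat_v1) ** 2)
```

Both expressions are polynomials in the predictions, so unlike the histogram-based mutual information they are analytically differentiable.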
21. Small Data Set
to compare the mi-hist and m-match/r-match terms
General Conditions
9,409 user-item pairs are sampled from the Movielens 100k data set (the mi-hist term cannot process data sets larger than this)
the number of latent factors K = 1 (due to the small size of data)
regularization parameter λ = 1 (more finely tuned than in the article)
Evaluation measures are calculated by using five-fold cross validation
Evaluation Measures
MAE (mean absolute error): prediction accuracy
Random recommendation: MAE = 0.903
Original PMF recommendation: MAE = 0.759
NMI (normalized mutual information): the neutrality of predicted ratings from a specified viewpoint
22. Viewpoint Features
"Year" viewpoint: whether a movie's release year is newer than 1990 or not
Older movies have a tendency to be rated higher, perhaps because only masterpieces have survived [Koren 2009]
"Gender" viewpoint: whether a user is male or female
The movie rating would depend on the user's gender
23. Small Data & Year Viewpoint
[Plots: accuracy (MAE) and neutrality (NMI) against the neutrality parameter η (0.01–100) for mi-hist, m-match, and r-match]
neutrality parameter η: a larger value enhances the neutrality more
As the neutrality parameter η increases:
prediction accuracies were worsened slightly in all cases
the neutralities were enhanced drastically in the mi-hist and m-match cases, but not in the r-match case
Both mi-hist and m-match successfully enhanced the neutrality
24. Small Data & Gender Viewpoint
[Plots: accuracy (MAE) and neutrality (NMI) against the neutrality parameter η (0.01–100) for mi-hist, m-match, and r-match]
neutrality parameter η: a larger value enhances the neutrality more
As the neutrality parameter η increases, neither the accuracy nor the neutrality changed largely
The neutrality was originally high and failed to improve further
25. Genre-wise Differences of Mean Ratings
To examine how the recommendation patterns are changed (this analysis is not provided in the proceedings article)
Computation Procedure
Data were first divided according to the 18 kinds of movie genres
Each genre-wise data set was further divided into two sets according to its viewpoint value
Mean ratings were computed for each set, and we show the differences between the mean ratings for the two values of V
Three Types of Ratings
the original true ratings in the training data
ratings predicted by the PMF model with the mi-hist and m-match terms, respectively (η = 100)
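The computation procedure above can be sketched as a short NumPy routine; the function and variable names are mine, for illustration:

```python
import numpy as np

def genre_mean_diffs(ratings, v, genres):
    """Per-genre difference: mean rating where v == 0 minus mean where v == 1."""
    diffs = {}
    for g in np.unique(genres):
        in_genre = genres == g
        diffs[g] = (ratings[in_genre & (v == 0)].mean()
                    - ratings[in_genre & (v == 1)].mean())
    return diffs
```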
26. Genre-wise Differences of Mean Ratings
Gender: males' mean ratings − females' mean ratings (positive values indicate genres more highly rated by males)

genre: original / mi-hist / m-match
Children's: −0.229 / −0.132 / −0.139
Animation: −0.224 / −0.064 / −0.068
Romance: −0.122 / −0.073 / −0.073
Documentary: 0.333 / 0.103 / 0.084
Horror: 0.479 / 0.437 / 0.409
Fantasy: 0.783 / 0.433 / 0.387

Differences are basically narrowed by the neutrality enhancement
The INRS didn't simply shift the ratings; the changes differed genre-wise
NMIs were not changed in the Gender case, but recommendation patterns were surely changed
27. Large Data Set
to show that the m-match and r-match terms are applicable to larger data sets
General Conditions
The Movielens 1M data set (larger than that in the article; the mi-hist model cannot process a data set of this size)
the number of latent factors K = 7
regularization parameter λ = 1
Evaluation measures are calculated by using five-fold cross validation
Baseline Accuracies
Random recommendation: MAE = 0.934
Original PMF recommendation: MAE = 0.685
28. Large Data & Year Viewpoint
[Plots: accuracy (MAE) and neutrality (NMI) against the neutrality parameter η (0.01–100) for m-match and r-match]
neutrality parameter η: a larger value enhances the neutrality more
As the neutrality parameter η increases:
accuracies were more steeply worsened in the r-match case
the neutralities were improved in the m-match case, but not in the r-match case
m-match succeeded, but r-match failed
29. Large Data & Gender Viewpoint
[Plots: accuracy (MAE) and neutrality (NMI) against the neutrality parameter η (0.01–100) for m-match and r-match]
neutrality parameter η: a larger value enhances the neutrality more
As the neutrality parameter η increases:
accuracies were similarly worsened in both cases
neutralities were successfully improved in the m-match case, but not in the r-match case
The m-match term performed slightly better than in the small data case
31. Objectivity and Subjectivity
Subjective Neutrality: reviewers pointed out the importance of testing how users perceive the neutrality
Objective Neutrality: the neutrality is currently evaluated by objective criteria
The neutrality should be guaranteed based on objective criteria
If users perceive neutrality in a recommendation that is truly biased, such a recommender system would be a very dangerous tool for big brothers to control users
Possible tools for displaying the neutrality in recommendation:
showing neutrality indexes, such as mutual information
comparing original recommendations and information-neutral ones
listing items in parallel under the conditions that a viewpoint feature takes its original value and another value
32. Recommendation Diversity
Recommendation Diversity [Ziegler+ 05, Zhang+ 08, Latha+ 09, Adomavicius+ 12]
Similar items are not recommended in a single list, to a single user, to all users, or in temporally successive lists
Diversity: items that are similar in a specified metric are excluded from recommendation results (the mutual relations among the results)
Neutrality: information about a viewpoint feature is excluded from recommendation results (the relation between the results and viewpoints)
33. Privacy-preserving Data Mining
recommendation results, R, and viewpoint features, V, are statistically independent
= the mutual information between recommendation results, R, and viewpoint features, V, is zero: I(R; V) = 0
In a context of privacy-preservation
Even if the information about R is disclosed, the information about V will not be exposed
In particular, the notion of t-closeness has a strong connection
34. More Recommendation Neutrality
Trade-offs between the accuracy and neutrality
Information usable for inferring a recommendation decreases in an INRS:

I(R; V, F) − I(R; F) = I(R; V | F) ≥ 0

where F is all information other than V
Available information is non-increasing by enhancing the neutrality
Red-lining effect
Even if a feature V is eliminated from the prediction model, the information of V cannot be excluded, because the information of V is contained in the other correlated features
35. Conclusion
Our Contributions
We formulated the recommendation neutrality from a specified viewpoint feature
We developed a recommender system that can enhance the recommendation neutrality
The efficiency of optimization was drastically improved
Our experimental results show that the neutrality is successfully enhanced without seriously sacrificing the prediction accuracy
Future Work
An INRS for non-binary viewpoint features
Information-neutral versions of generative recommendation models, such as pLSA / LDA models
36. Program codes and data sets: http://www.kamishima.net/inrs/
Acknowledgements
We would like to thank the Grouplens research lab for providing the data set
This work is supported by MEXT/JSPS KAKENHI Grant Numbers 16700157, 21500154, 23240043, 24500194, and 25540094