Exploring Author Gender
in Book Rating and Recommendation
M. D. Ekstrand et al.
RecSys ’18, October 2–7, 2018, Vancouver, BC, Canada
[Figure: model diagram; recoverable symbols include µ, b_a and s_a, indexed by u ∈ U and a ∈ A.]
$n_u \sim \mathrm{NegBinomial}(\nu, \gamma)$
$y_u \sim \mathrm{Binomial}(n_u, \theta_u)$
$\operatorname{logit}(\theta_u) \sim \mathrm{Normal}(\mu, \sigma)$
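To make the generative reading of this user-profile model concrete, the following is a minimal simulation sketch, not the authors' code. It assumes a rate-style parameterization of the negative binomial (mean ν/γ), which is an assumption about the notation, and the parameter values in the usage note are purely illustrative.

```python
import numpy as np


def simulate_profiles(num_users, mu, sigma, nu, gamma, seed=0):
    """Draw synthetic user profiles from the hierarchical model.

    Assumption: NegBinomial(nu, gamma) has mean nu / gamma, which maps to
    NumPy's (n, p) form with n = nu and p = gamma / (1 + gamma).
    """
    rng = np.random.default_rng(seed)
    # Profile size n_u: number of known-gender books the user rated.
    p = gamma / (1.0 + gamma)
    n = rng.negative_binomial(nu, p, size=num_users)
    # Tendency theta_u: inverse-logit of a normal draw on the log-odds scale.
    log_odds = rng.normal(mu, sigma, size=num_users)
    theta = 1.0 / (1.0 + np.exp(-log_odds))
    # y_u: number of rated books by female authors.
    y = rng.binomial(n, theta)
    return n, theta, y
```

For example, `simulate_profiles(1000, -0.4, 1.0, 2.0, 0.1)` yields 1,000 synthetic users; a negative `mu` places the typical user below an even split.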
$\operatorname{logit}(\bar{\theta}_{ua}) \sim \mathrm{Normal}(b_a + s_a \operatorname{logit}(\theta_u), \sigma_a^2)$
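Read as a linear regression in log-odds space, with intercept b_a and slope s_a for each algorithm, this model maps a user's profile tendency to a central value for the recommender's output proportion. A small sketch under that reading follows; the function names and the example values are hypothetical.

```python
import math


def logit(p):
    """Log-odds of a proportion strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def expit(x):
    """Inverse of logit."""
    return 1.0 / (1.0 + math.exp(-x))


def central_rec_proportion(theta_u, b_a, s_a):
    """Median recommender proportion for a user with tendency theta_u.

    Applies the regression on the log-odds scale and maps the result back
    to a proportion. Under a logit-normal this is the median, not the mean.
    """
    return expit(b_a + s_a * logit(theta_u))
```

With `b_a = 0` and `s_a = 1` the recommender mirrors the profile exactly. With `s_a = 0` the output ignores the profile and sits at `expit(b_a)` for every user.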
btain author information from
(VIAF)3, a directory of author
ity records from the Library of
und the world. Author gender
s for many records.
mployed by the VIAF is flexible
ender identities, supporting an
es for the validity of an identity.
se flexibility; all its assertions
This is a significant limitation
on 5.1.
book data with rating data by
ve data linking coverage, and
works instead of individual edi-
m a bipartite graph of ISBNs and
“edition” records, and OpenLi-
e) and consider each connected
ess than 1% of ratings) this caus-
or a book; we resolve multiple
ir ratings.
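The linking step described here groups ISBNs into "books" by taking connected components of a bipartite graph that joins ISBNs to the records mentioning them. A minimal sketch of that idea follows, assuming the input is a list of (ISBN, record ID) pairs. The function name `cluster_isbns`, the record IDs, and the union-find approach are illustrative choices, not taken from the source.

```python
from collections import defaultdict


def cluster_isbns(edges):
    """Group ISBNs that share any record, directly or transitively.

    `edges` is an iterable of (isbn, record_id) pairs (hypothetical format).
    Returns a sorted list of sorted ISBN lists, one per connected component.
    """
    parent = {}

    def find(node):
        # Union-find lookup with path halving.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    for isbn, record in edges:
        root_a = find(("isbn", isbn))
        root_b = find(("rec", record))
        if root_a != root_b:
            parent[root_a] = root_b

    clusters = defaultdict(set)
    for node in list(parent):
        kind, value = node
        if kind == "isbn":
            clusters[find(node)].add(value)
    return sorted(sorted(group) for group in clusters.values())
```

For instance, two ISBNs attached to the same record land in one cluster, while an ISBN that shares no record with them forms its own.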
VIAF do not share linking iden-
hority records by author name.
ontain multiple name entries,
izations of the author’s name.
arry multiple known forms of
ng names to improve matching
ng both “Last, First” and “First
e all VIAF records containing a
d names for the first author of
n a book’s cluster. If all records
hor’s gender agree, we take that
ontradicting gender statements,
as “ambiguous”.
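The resolution rule can be sketched as a small function: agreeing assertions yield that gender, contradicting assertions yield "ambiguous". The input format (one entry per matched record, `None` when a record carries no gender assertion) and the handling of empty input as "unlinked" are assumptions made for illustration, chosen to match the category labels in Figure 1.

```python
def resolve_gender(assertions):
    """Combine gender assertions from all records matched to a book's author.

    `assertions` holds one entry per matched record: "female", "male",
    or None when the record makes no gender assertion (assumed format).
    """
    if not assertions:
        return "unlinked"  # no record matched at all
    known = {g for g in assertions if g in ("female", "male")}
    if not known:
        return "unknown"  # records matched, but none asserts a gender
    if len(known) == 1:
        return next(iter(known))  # all assertions agree
    return "ambiguous"  # contradicting assertions
```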
ure good coverage while main-
Table 2: Summary of rating data

                      BookCrossing      Amazon
Ratings                  1,149,780  22,507,155
Users                      105,283   8,026,324
Rated ISBNs/ASINs          340,554   2,330,066
Rated ‘Books’              295,935   2,286,656
Matched Books              240,255   1,083,066
Known-Gender Books         166,928     616,317
Female-Author Books         66,524     181,850
Male-Author Books          100,404     434,467
% Female Books               39.9%       29.5%
% Female Ratings             45.3%       36.2%
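The "% Female Books" row can be checked from the counts in the same table: it is the share of female-author books among known-gender books. A short check, with a hypothetical helper name:

```python
def percent_female(female_books, male_books):
    """Share of female-author books among known-gender books, in percent."""
    return 100.0 * female_books / (female_books + male_books)


# Counts copied from Table 2.
bookcrossing = percent_female(66_524, 100_404)
amazon = percent_female(181_850, 434_467)
print(f"BookCrossing: {bookcrossing:.1f}%  Amazon: {amazon:.1f}%")
```

The female and male counts sum to the "Known-Gender Books" row in both data sets, so the denominator matches the table.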
[Figure panels: BXA, BXE, LOC, AZ. x-axis: Linking Result (female, male, ambiguous, unknown, unlinked). y-axis: Coverage Percent. Scope: Books, Ratings.]
Figure 1: Results of data linking and gender resolution. LOC
is the set of books with Library of Congress records; other
panes are the results of linking rating data.
dependent
TAN 2.17.3
each per-
We report
arameters
h existing
acterizing
nalyze the
Tables 1–
sample of
nders are
in our cat-
has a more
ookCross-
wn-gender
oportions
(est. sd log odds) 1.03 1.11 1.77
Posterior Mean 0.42 0.40 0.37
Std. Dev. 0.23 0.23 0.28
[Figure panels: AZ, BXA, BXE. x-axis: Proportion of Female Authors. y-axis: Density. Method: Estimated θ, Observed y/n, Predicted y/n.]
Figure 4: Distribution of user author-gender tendencies. Histogram
shows observed proportions; lines show kernel densities of
estimated tendencies (θ) along with observed and predicted
proportions.
and Figure 4 shows the distribution of observed author gender
            Users  Dist. Items  % Dist.    Users  Dist. Items  % Dist.    Users  Dist. Items  % Dist.    Users  Dist. Items  % Dist.
Profile     1,000       35,187     66.5    1,000       24,913     73.6    1,000       27,525     88.2    1,000       27,525     88.2
UserUser    1,000        6,007     12.0      988        6,235     12.7    1,000       15,343     30.7      939       25,853     55.1
ItemItem    1,000       21,282     42.6      997       10,174     20.4      999       33,363     67.7      999       22,360     45.6
MF          1,000          140      0.3    1,000          264      0.5    1,000          164      0.3    1,000          651      1.3
PF          1,000        1,506      3.0    1,000        4,105      8.2    1,000        2,746      5.4    1,000        3,538      7.0
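The three statistics in each column group can be computed from per-user recommendation lists. The sketch below reads "% Dist." as the percentage of all recommendations that were distinct items. The input format (a dict from user ID to a list of item IDs) and the function name are assumptions for illustration.

```python
def coverage_stats(rec_lists):
    """Return (users, distinct items, % distinct) for a set of rec lists.

    `rec_lists` maps each user ID to that user's list of recommended item
    IDs (hypothetical format). Users with empty lists are not counted.
    """
    served = {user: items for user, items in rec_lists.items() if items}
    total = sum(len(items) for items in served.values())
    distinct = {item for items in served.values() for item in items}
    pct = 100.0 * len(distinct) / total if total else 0.0
    return len(served), len(distinct), pct
```

A low percentage means the algorithm hands most users the same items. A value of 100 would mean no item was recommended twice.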
[Figure panels: columns AZ (Explicit), AZ (Implicit), BXA, BXE; rows UserUser, ItemItem, MF, PF. x-axis: Proportion of Books by Female Authors. y-axis: Density. Mean: Algorithm, Popular, Profile. Method: Observed, Predicted.]
Figure 5: Posterior densities of recommender biases from integrated regression model.
proportions. The ripples in predicted and observed proportions are
due to the prevalence of 5-item user profiles, for which there are
only 6 possible proportions; estimated tendency (θ) smooths them
out. This smoothing, along with avoiding extreme estimated biases
based on limited data, is why we find it useful to estimate tendency
instead of directly computing statistics on observed proportions.
To support direct comparison of the densities of observations and
predictions, we resampled observed proportions with replacement
to yield 10,000 observations.
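Both points in this paragraph are easy to illustrate: a profile of n books admits only n + 1 observed proportions, and resampling with replacement gives a fixed-size sample for density comparison. The sketch below uses NumPy; the function names, the seed, and the toy inputs are hypothetical.

```python
from fractions import Fraction

import numpy as np


def possible_proportions(profile_size):
    """All proportions k / n that a profile of n known-gender books can show."""
    return [Fraction(k, profile_size) for k in range(profile_size + 1)]


def resample_proportions(observed, size=10_000, seed=0):
    """Bootstrap-style resample of observed proportions, with replacement."""
    rng = np.random.default_rng(seed)
    values = np.asarray(observed, dtype=float)
    return rng.choice(values, size=size, replace=True)
```

`possible_proportions(5)` gives 0, 1/5, 2/5, 3/5, 4/5 and 1, which is why many 5-item profiles produce visible ripples in a density plot of raw proportions.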
We observe a population tendency to rate male authors more
frequently than female authors in all data sets (µ < 0), but to rate
female authors more frequently than they would be rated were
users drawing books uniformly at random from the available set.
The average user author-gender tendency is slightly closer to an
even balance than the set of rated books. We also found large
diversity among users in their estimated tendencies (s.d. of
Table 6: Mean / SD of rec. list female author proportions.

           BXA            BXE            AZ (Implicit)  AZ (Explicit)
Popular    0.458          0.500          0.364          0.364
Rating     -              0.383          -              0.222
UserUser   0.399 / 0.180  0.435 / 0.190  0.315 / 0.186  0.367 / 0.278
ItemItem   0.465 / 0.200  0.348 / 0.124  0.351 / 0.245  0.389 / 0.336
MF         0.134 / 0.027  0.334 / 0.039  0.468 / 0.079  0.418 / 0.124
PF         0.372 / 0.208  0.429 / 0.177  0.374 / 0.144  0.394 / 0.177
basic coverage statistics of these algorithms along with corresponding
user profile statistics. Users for whom an algorithm could not
produce recommendations are rare. We also computed the extent
to which algorithms recommend different items to different users;
“% Dist.” is the percentage of all recommendations that were distinct
items. Algorithms that repeatedly recommend the same items will
BXE
-0.139 0.162 0.906 -0.573 0.129 0.531 -0.652 0.002 0.161 -0.166 0.298 0.772
(-0.20,-0.08) (0.10,0.22) (0.87,0.95) (-0.61,-0.54) (0.09,0.16) (0.51,0.56) (-0.66,-0.64) (-0.01,0.01) (0.15,0.17) (-0.22,-0.11) (0.25,0.35) (0.74,0.81)
AZ (Implicit)
-0.127 0.688 0.715 0.094 0.863 0.895 -0.244 0.011 0.364 -0.224 0.287 0.537
(-0.19,-0.06) (0.65,0.73) (0.68,0.76) (0.02,0.17) (0.81,0.92) (0.84,0.95) (-0.27,-0.22) (-0.00,0.02) (0.35,0.38) (-0.26,-0.18) (0.26,0.31) (0.51,0.56)
AZ (Explicit)
-0.580 0.322 0.681 -0.380 0.438 0.852 -0.117 0.006 0.273 -0.403 0.141 0.525
(-0.63,-0.53) (0.29,0.35) (0.65,0.71) (-0.44,-0.32) (0.40,0.48) (0.81,0.89) (-0.14,-0.10) (-0.00,0.02) (0.26,0.29) (-0.44,-0.37) (0.12,0.16) (0.50,0.55)
[Figure panels: columns AZ (Explicit), AZ (Implicit), BXA, BXE; rows UserUser, ItemItem, MF, PF. x-axis: Profile Proportion of Female Authors. y-axis: Recommender Proportion of Female Authors.]
Figure 6: Scatter plots and regression curves for recommender response to individual users.
more concentrated. In the BookCrossing data, it tends to favor male
authors more than the underlying data would support; in implicit
feedback mode, it is highly biased towards male authors with
respect even to the baseline distributions.
4.4 From Profiles to Recommendations
Our extended Bayesian model (Section 3.4.2) allows us to address
RQ4: the extent to which our algorithms propagate individual users’
tendencies into their recommendations.
Figure 5 shows the posterior predictive and observed densities
of recommender author-gender tendencies, and Figure 6 shows
scatter plots of observed recommendation proportions against user
profile proportions with regression curves (regression lines in log-
place. Visual inspection of the scatter plot suggests that there is a
strong component with consistent tendencies, but the regression
may accurately model the remaining users. Future work will use a
model that can better account for some global consistency.
4.5 Summary
RQ1: Baseline Gender Distribution. Known books are significantly
more likely to be written by men than by women;
representation among rated books is more balanced.
RQ2: User Input Gender Distributions. Users are diffuse in
their rating tendencies, with an overall trend favoring male
authors, but less strongly than the baseline distribution.
RQ3: Recommender Output Distributions. Different CF

RecSys 2018 Paper Reading Group: Materials

  • 1. Exploring Author Gender in Book Rating and Recommendation, M. D. Ekstrand et al.
  • 4. RecSys ’18, October 2–7, 2018, Vancouver, BC, Canada. [Plate diagram of the model; recoverable symbols: $n_u$, $\mu$, $\bar{n}_{ua}$, $b_a$, $s_a$; plates $u \in U$ and $a \in A$.]
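  • 5. User profile model, for each user $u \in U$: $n_u \sim \mathrm{NegBinomial}(\nu, \gamma)$; $y_u \sim \mathrm{Binomial}(n_u, \theta_u)$; $\mathrm{logit}(\theta_u) \sim \mathrm{Normal}(\mu, \sigma)$. [Plate diagram; recoverable symbols: $n_u$, $\mu$, $b_a$, $s_a$.]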
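    The profile model on this slide can be read generatively: each user has a latent tendency θ_u whose log-odds are normally distributed, and the number of female-author books y_u in a profile of size n_u is a binomial draw. The following is a minimal sketch under that reading, not the authors' code. The profile sizes are taken as given rather than sampled from the negative binomial (its parameterization is not stated on the slide), and the values of `mu` and `sigma` are illustrative assumptions.

```python
import math
import random


def inv_logit(x: float) -> float:
    """Map a log-odds value to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def simulate_profiles(profile_sizes, mu, sigma, seed=0):
    """Draw (n_u, y_u, theta_u) for each user.

    profile_sizes: the n_u values, taken as given in this sketch.
    mu, sigma: mean and s.d. of the users' log-odds tendencies
               (illustrative values, not estimates from the paper).
    """
    rng = random.Random(seed)
    out = []
    for n_u in profile_sizes:
        # logit(theta_u) ~ Normal(mu, sigma)
        theta_u = inv_logit(rng.gauss(mu, sigma))
        # y_u ~ Binomial(n_u, theta_u), drawn as a sum of Bernoulli trials
        y_u = sum(rng.random() < theta_u for _ in range(n_u))
        out.append((n_u, y_u, theta_u))
    return out


profiles = simulate_profiles([5, 5, 20, 100], mu=-0.4, sigma=1.0)
for n_u, y_u, theta_u in profiles:
    print(f"n={n_u:3d}  y={y_u:3d}  observed y/n={y_u / n_u:.2f}  theta={theta_u:.2f}")
```

    Small profiles give coarse observed proportions y/n, which is one reason to estimate θ_u rather than work with y/n directly.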
  • 6. Recommendation-list model, for each algorithm $a \in A$: $\mathrm{logit}(\bar{\theta}_{ua}) \sim \mathrm{Normal}(b_a + s_a\,\mathrm{logit}(\theta_u), \sigma_a^2)$. [Plate diagram; recoverable symbols: $\mu$, $\bar{n}_{ua}$, $b_a$, $s_a$. Table of variables truncated on the slide.]
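    The regression on this slide appears to relate the log-odds of a recommendation list's female-author proportion to the log-odds of the user's profile tendency through an intercept and a slope (b_a and s_a in the diagram). The sketch below, under that assumption, computes only the centre of that normal distribution mapped back to a proportion; it ignores the noise term, and the function name and parameter values are hypothetical.

```python
import math


def logit(p: float) -> float:
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def inv_logit(x: float) -> float:
    """Inverse of logit."""
    return 1.0 / (1.0 + math.exp(-x))


def typical_rec_proportion(profile_theta: float, b_a: float, s_a: float) -> float:
    """Centre of the regression in log-odds space, mapped back to a proportion.

    b_a shifts every user's list toward one gender (intercept);
    s_a controls how strongly the user's own tendency carries over (slope).
    """
    return inv_logit(b_a + s_a * logit(profile_theta))


# Illustrative parameter settings (assumptions, not values from the paper):
for b_a, s_a in [(0.0, 1.0), (-0.5, 0.3), (-0.5, 0.0)]:
    row = [typical_rec_proportion(t, b_a, s_a) for t in (0.1, 0.5, 0.9)]
    print(b_a, s_a, [round(v, 3) for v in row])
```

    With a slope of 1 and an intercept of 0 the list mirrors the profile; with a slope of 0 every user receives the same proportion regardless of profile.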
  • 7. btain author information from (VIAF)3, a directory of author ity records from the Library of und the world. Author gender s for many records. mployed by the VIAF is flexible ender identities, supporting an es for the validity of an identity. se flexibility - all its assertions This is a significant limitation on 5.1. book data with rating data by ve data linking coverage, and works instead of individual edi- m a bipartite graph of ISBNs and “edition” records, and OpenLi- e) and consider each connected ess than 1% of ratings) this caus- or a book; we resolve multiple ir ratings. VIAF do not share linking iden- hority records by author name. ontain multiple name entries, izations of the author’s name. arry multiple known forms of ng names to improve matching ng both “Last, First” and “First e all VIAF records containing a d names for the first author of n a book’s cluster. If all records hor’s gender agree, we take that ontradicting gender statements, as “ambiguous”. ure good coverage while main-

    Table 2: Summary of rating data

                             BookCrossing      Amazon
        Ratings                 1,149,780  22,507,155
        Users                     105,283   8,026,324
        Rated ISBNs/ASINs         340,554   2,330,066
        Rated ‘Books’             295,935   2,286,656
        Matched Books             240,255   1,083,066
        Known-Gender Books        166,928     616,317
        Female-Author Books        66,524     181,850
        Male-Author Books         100,404     434,467
        % Female Books              39.9%       29.5%
        % Female Ratings            45.3%       36.2%

    [Figure 1: Results of data linking and gender resolution. Panels: BXA, BXE, LOC, AZ; x-axis: Linking Result (female, male, ambiguous, unknown, unlinked); y-axis: Coverage Percent; Scope: Books, Ratings. LOC is the set of books with Library of Congress records; other panes are the results of linking rating data.]
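    The slide text is cropped, but two steps remain legible: ISBNs are grouped into books as connected components of a bipartite ISBN/“edition” graph, and an author's gender is taken from the matched records when they agree and marked “ambiguous” when they contradict each other. The sketch below illustrates both steps under stated assumptions. The input shapes, the function names, and the reading of “unknown” (records matched, no gender statement) and “unlinked” (no record matched) are hypothetical, inferred from the Figure 1 categories rather than from the authors' code.

```python
from collections import defaultdict


def cluster_isbns(edition_to_isbns):
    """Group ISBNs into 'books': connected components of the bipartite
    ISBN/edition-record graph, found with union-find."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb

    # Each edition record links all of the ISBNs it lists.
    for edition, isbns in edition_to_isbns.items():
        for isbn in isbns:
            union(("edition", edition), ("isbn", isbn))

    clusters = defaultdict(set)
    for node in list(parent):
        if node[0] == "isbn":
            clusters[find(node)].add(node[1])
    return sorted(sorted(c) for c in clusters.values())


def resolve_gender(statements):
    """statements: gender assertions from all matched authority records,
    or None when no record matched the author name (assumed convention)."""
    if statements is None:
        return "unlinked"
    known = set(statements)
    if not known:
        return "unknown"
    if len(known) == 1:
        return known.pop()
    return "ambiguous"


editions = {"e1": ["isbn-a", "isbn-b"], "e2": ["isbn-b", "isbn-c"], "e3": ["isbn-d"]}
print(cluster_isbns(editions))
print(resolve_gender(["female", "female"]), resolve_gender(["female", "male"]))
```

    In the toy input, editions e1 and e2 share isbn-b, so their three ISBNs collapse into one book, while isbn-d stays on its own.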
  • 8. dependent TAN 2.17.3 each per- We report arameters h existing acterizing nalyze the Tables 1– sample of nders are in our cat- has a more ookCross- wn-gender oportions

        (est. sd log odds)   1.03   1.11   1.77
        Posterior Mean       0.42   0.40   0.37
        Std. Dev.            0.23   0.23   0.28

    [Figure 4: Distribution of user author-gender tendencies. Panels: AZ, BXA, BXE; x-axis: Proportion of Female Authors; y-axis: Density; Method: Estimated θ, Observed y/n, Predicted y/n. Histogram shows observed proportions; lines show kernel densities of estimated tendencies (θ) along with observed and predicted proportions.]

    and Figure 4 shows the distribution of observed author gender
  • 9. [Coverage table; the labels of the four column groups are not legible on the slide.]

                    Users  Dist. Items  % Dist. |  Users  Dist. Items  % Dist. |  Users  Dist. Items  % Dist. |  Users  Dist. Items  % Dist.
        Profile     1,000       35,187     66.5 |  1,000       24,913     73.6 |  1,000       27,525     88.2 |  1,000       27,525     88.2
        UserUser    1,000        6,007     12.0 |    988        6,235     12.7 |  1,000       15,343     30.7 |    939       25,853     55.1
        ItemItem    1,000       21,282     42.6 |    997       10,174     20.4 |    999       33,363     67.7 |    999       22,360     45.6
        MF          1,000          140      0.3 |  1,000          264      0.5 |  1,000          164      0.3 |  1,000          651      1.3
        PF          1,000        1,506      3.0 |  1,000        4,105      8.2 |  1,000        2,746      5.4 |  1,000        3,538      7.0

    [Figure 5: Posterior densities of recommender biases from integrated regression model. Panel columns: AZ (Explicit), AZ (Implicit), BXA, BXE; panel rows: UserUser, ItemItem, MF, PF; x-axis: Proportion of Books by Female Authors; y-axis: Density; Mean: Algorithm, Popular, Profile; Method: Observed, Predicted.]

    proportions. The ripples in predicted and observed proportions are due to the commonality of 5-item user profiles, for which there are only 6 possible proportions; estimated tendency (θ) smooths them out. This smoothing, along with avoiding estimated extreme biases based on limited data, is why we find it useful to estimate tendency instead of directly computing statistics on observed proportions. To support direct comparison of the densities of observations and predictions, we resampled observed proportions with replacement to yield 10,000 observations.

    We observe a population tendency to rate male authors more frequently than female authors in all data sets (µ < 0), but to rate female authors more frequently than they would be rated were users drawing books uniformly at random from the available set. The average user author-gender tendency is slightly closer to an even balance than the set of rated books. We also found a large diversity amongst users about their estimated tendencies (s.d. of

    Table 6: Mean / SD of rec. list female author proportions.

                    BXA              BXE              AZ (Implicit)    AZ (Explicit)
        Popular     0.458            0.500            0.364            0.364
        Rating      -                0.383            -                0.222
        UserUser    0.399 / 0.180    0.435 / 0.190    0.315 / 0.186    0.367 / 0.278
        ItemItem    0.465 / 0.200    0.348 / 0.124    0.351 / 0.245    0.389 / 0.336
        MF          0.134 / 0.027    0.334 / 0.039    0.468 / 0.079    0.418 / 0.124
        PF          0.372 / 0.208    0.429 / 0.177    0.374 / 0.144    0.394 / 0.177

    basic coverage statistics of these algorithms along with corresponding user profile statistics. Users for which an algorithm could not produce recommendations are rare. We also computed the extent to which algorithms recommend different items to different users; “% Dist.” is the percentage of all recommendations that were distinct items. Algorithms that repeatedly recommend the same items will
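    Two quantities described on this slide are easy to make concrete: the set of proportions a small profile can take, and the “% Dist.” statistic (distinct recommended items as a percentage of all recommendations). The following is a minimal sketch with hypothetical function names and toy data, not the authors' evaluation code.

```python
from fractions import Fraction


def possible_proportions(n: int):
    """All female-author proportions an n-item profile can take: k/n for k = 0..n."""
    return [Fraction(k, n) for k in range(n + 1)]


def percent_distinct(rec_lists) -> float:
    """Distinct recommended items as a percentage of all recommendations made."""
    total = sum(len(recs) for recs in rec_lists)
    if total == 0:
        raise ValueError("no recommendations to summarise")
    distinct = len({item for recs in rec_lists for item in recs})
    return 100.0 * distinct / total


# A 5-item profile can only show 0, 1/5, ..., 1: six values in total.
print([str(p) for p in possible_proportions(5)])

# Toy lists for three users (item ids are made up for illustration).
rec_lists = [["a", "b"], ["a", "c"], ["a", "b"]]
print(percent_distinct(rec_lists))
```

    Read this way, a value near 100 means nearly every slot holds a different item, while a value near 0 means the same few items go to everyone. The table's 6,007 distinct items at 12.0% would be consistent with roughly 50,000 recommendations in total, though the list length is not stated on the slide.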
  • 10. [Regression parameter table; column headers are not legible on the slide. Each row lists twelve estimates with intervals in parentheses, shown here three per line.]

        BXE
            -0.139 (-0.20,-0.08)    0.162 (0.10,0.22)     0.906 (0.87,0.95)
            -0.573 (-0.61,-0.54)    0.129 (0.09,0.16)     0.531 (0.51,0.56)
            -0.652 (-0.66,-0.64)    0.002 (-0.01,0.01)    0.161 (0.15,0.17)
            -0.166 (-0.22,-0.11)    0.298 (0.25,0.35)     0.772 (0.74,0.81)
        AZ (Implicit)
            -0.127 (-0.19,-0.06)    0.688 (0.65,0.73)     0.715 (0.68,0.76)
             0.094 (0.02,0.17)      0.863 (0.81,0.92)     0.895 (0.84,0.95)
            -0.244 (-0.27,-0.22)    0.011 (-0.00,0.02)    0.364 (0.35,0.38)
            -0.224 (-0.26,-0.18)    0.287 (0.26,0.31)     0.537 (0.51,0.56)
        AZ (Explicit)
            -0.580 (-0.63,-0.53)    0.322 (0.29,0.35)     0.681 (0.65,0.71)
            -0.380 (-0.44,-0.32)    0.438 (0.40,0.48)     0.852 (0.81,0.89)
            -0.117 (-0.14,-0.10)    0.006 (-0.00,0.02)    0.273 (0.26,0.29)
            -0.403 (-0.44,-0.37)    0.141 (0.12,0.16)     0.525 (0.50,0.55)

    [Figure 6: Scatter plots and regression curves for recommender response to individual users. Panel columns: AZ (Explicit), AZ (Implicit), BXA, BXE; panel rows: UserUser, ItemItem, MF, PF; x-axis: Profile Proportion of Female Authors; y-axis: Recommender Proportion of Female Authors.]

    more concentrated. In the BookCrossing data, it tends to favor male authors more than the underlying data would support; in implicit feedback mode, it is highly biased towards male authors with respect even to the baseline distributions.

    4.4 From Profiles to Recommendations

    Our extended Bayesian model (Section 3.4.2) allows us to address RQ4: the extent to which our algorithms propagate individual users’ tendencies into their recommendations. Figure 5 shows the posterior predictive and observed densities of recommender author-gender tendencies, and Figure 6 shows scatter plots of observed recommendation proportions against user profile proportions with regression curves (regression lines in log- place.
    Visual inspection of the scatter plot suggests that there is a strong component with consistent tendencies, but the regression may accurately model the remaining users. Future work will use a model that can better account for some global consistency.

    4.5 Summary

    RQ1 (Baseline Gender Distribution): Known books are significantly more likely to be written by men than by women; representation among rated books is more balanced.

    RQ2 (User Input Gender Distributions): Users are diffuse in their rating tendencies, with an overall trend favoring male authors but less strongly than the baseline distribution.

    RQ3 (Recommender Output Distributions): Different CF