Model-based Approaches for Independence-Enhanced Recommendation
Toshihiro Kamishima
IEEE International Workshop on Privacy Aspects of Data Mining (PADM), in conjunction with ICDM2016
Article @ Official Site: http://doi.ieeecomputersociety.org/10.1109/ICDMW.2016.0127
Workshop Homepage: http://pddm16.eurecat.org/
Abstract:
This paper studies a new approach to enhance recommendation independence. Such approaches are useful in ensuring adherence to laws and regulations, fair treatment of content providers, and exclusion of unwanted information. For example, recommendations that match an employer with a job applicant should not be based on socially sensitive information, such as gender or race, from the perspective of social fairness. An algorithm that could exclude the influence of such sensitive information would be useful in this case. We previously gave a formal definition of recommendation independence and proposed a method adopting a regularizer that imposes such an independence constraint. As no other options than this regularization approach have been put forward, we here propose a new model-based approach, which is based on a generative model that satisfies the constraint of recommendation independence. We apply this approach to a latent class model and empirically show that the model-based approach can enhance recommendation independence. Recommendation algorithms based on generative models, such as topic models, are important, because they have a flexible functionality that enables them to incorporate a wide variety of information types. Our new model-based approach will broaden the applications of independence-enhanced recommendation by integrating the functionality of generative models.
Exploring Practices in Machine Learning and Machine Discovery for Heterogeneo...
Ichigaku Takigawa
Video https://youtu.be/P4QogT8bdqY
ACS Spring 2023 Symposium on AI-Accelerated Scientific Workflow
https://acs.digitellinc.com/acs/sessions/526630/view
ACS Spring 2023: Crossroads of Chemistry
Indianapolis, IN & Hybrid, March 26-30
https://www.acs.org/meetings/acs-meetings/spring-2023.html
Slide PDF
https://itakigawa.page.link/acs2023spring
Our Paper
Accelerated discovery of multi-elemental reverse water-gas shift catalysts using extrapolative machine learning approach (2022, ChemRxiv)
https://doi.org/10.26434/chemrxiv-2022-695rj
Ichi Takigawa
https://itakigawa.github.io/
Machine Learning for Molecules: Lessons and Challenges of Data-Centric Chemistry
Ichigaku Takigawa
Perspectives on Artificial Intelligence and Machine Learning in Materials Science
February 4-6, 2022.
https://joint.imi.kyushu-u.ac.jp/post-2698/
Machine Learning for Molecular Graph Representations and Geometries
Ichigaku Takigawa
Dec 1, 2021, Pacifico Yokohama, Japan.
Symposium 1AS-17 "Data science and machine learning: Tackling the Noise and Heterogeneity of the Real World"
The 44th Annual Meeting of the Molecular Biology Society of Japan
https://www2.aeplan.co.jp/mbsj2021/english/designation/index.html
1. Paper being introduced:
Reducing the Sampling Complexity of Topic Models
(KDD2014 Best Paper)
Aaron Q Li (CMU)
Amr Ahmed (Google)
Sujith Ravi (Google)
Alexander J Smola (CMU & Google)
Presenter: Ichigaku Takigawa
2. Background: KDD seems to be quite interested in this kind of work…
[22] KDD09: Efficient Methods for Topic Model Inference on Streaming Document Collections
by Limin Yao, David Mimno, Andrew McCallum (UMass)
KDD08: Fast Collapsed Gibbs Sampling for Latent Dirichlet Allocation
by Ian Porteous, David Newman, Alex Ihler, Arthur Asuncion, Padhraic Smyth, Max Welling (UC Irvine)
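18. Preliminaries 2: Walker's alias method (1974)
Problem: how would you implement this?
Discrete distribution (k = 5, outcomes 1 to 5) → random number generation → samples: 1,5,5,3,5,4,5,5,3,1,…
[Figure: the interval [0, 1] divided at cumulative thresholds a < b < c < d into five segments, one per outcome 1 to 5; a uniform random number u lands in one of them]
u ← uniform random number in [0,1]
if u < a
    return 1
else if u < b
    return 2
else if u < c
    return 3
else if u < d
    return 4
else
    return 5
Done this way, sampling costs O(k)
Note: with binary search it costs O(log k)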
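The two costs quoted on this slide can be compared with a minimal Python sketch. The probability values and the function names `sample_linear` and `sample_bisect` are illustrative assumptions, not taken from the slides; the thresholds a, b, c, d correspond to the first four entries of `cdf`.

```python
import bisect
import itertools
import random

def sample_linear(probs, u):
    """O(k): compare u against the cumulative thresholds a, b, c, d in order."""
    cumulative = 0.0
    for outcome, p in enumerate(probs, start=1):
        cumulative += p
        if u < cumulative:
            return outcome
    return len(probs)  # guard against floating-point round-off in the last sum

def sample_bisect(cdf, u):
    """O(log k): binary search for the first cumulative threshold above u."""
    return min(bisect.bisect_right(cdf, u) + 1, len(cdf))

# Illustrative distribution over outcomes 1..5 (assumed values).
probs = [0.2, 0.1, 0.3, 0.1, 0.3]
cdf = list(itertools.accumulate(probs))  # thresholds a, b, c, d and the total

draws = [sample_bisect(cdf, random.random()) for _ in range(10)]
print(draws)
```

Both functions return the same outcome for the same `u`; only the search over the thresholds differs.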
19. Preliminaries 2: Walker's alias method (1974)
With a little preprocessing, this can be done in O(1) (Walker's alias method)
GNU R adopted it for sampling with replacement in version 2.2.0
20. Preliminaries 2: Walker's alias method (1974)
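Point 1: Recall that if the partition above were into equal-width segments, sampling would be O(1)
Example: "rand()%6" gives a random integer from 0 to 5
Point 2: Preprocess the table so that sampling needs only "an equal-width split + one binary choice" (the alias table)
[Figure: the interval [0, 1] divided into five equal-width bins A to E; inside each bin a single threshold separates two outcomes]
A: u < a → 1, else 3
B: u < b → 2, else 3
…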
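The "equal-width split + one binary choice" idea can be sketched in Python as follows. The slide does not say how the table is built, so the construction here follows the commonly used Vose variant as an assumption; the function names and probability values are illustrative, and outcomes are 0-based indices rather than 1 to 5.

```python
import random

def build_alias_table(probs):
    """O(k) preprocessing (Vose-style construction, an assumption here)."""
    k = len(probs)
    scaled = [p * k for p in probs]   # mass per bin, relative to width 1/k
    threshold = [1.0] * k             # chance of keeping the bin's own outcome
    alias = list(range(k))            # outcome used when the bin's own is rejected
    small = [i for i, s in enumerate(scaled) if s < 1.0]
    large = [i for i, s in enumerate(scaled) if s >= 1.0]
    while small and large:
        s, l = small.pop(), large.pop()
        threshold[s] = scaled[s]
        alias[s] = l                  # bin s is topped up with mass from l
        scaled[l] -= 1.0 - scaled[s]
        (small if scaled[l] < 1.0 else large).append(l)
    return threshold, alias

def alias_sample(threshold, alias, rng=random):
    """O(1): pick a bin uniformly, then make one binary choice inside it."""
    i = rng.randrange(len(threshold))
    return i if rng.random() < threshold[i] else alias[i]

def implied_distribution(threshold, alias):
    """Recover the probabilities that the table actually encodes."""
    k = len(threshold)
    out = [0.0] * k
    for i in range(k):
        out[i] += threshold[i] / k
        out[alias[i]] += (1.0 - threshold[i]) / k
    return out

probs = [0.2, 0.1, 0.3, 0.1, 0.3]     # illustrative values
threshold, alias = build_alias_table(probs)
print(implied_distribution(threshold, alias))
```

`implied_distribution` is only a check that the table encodes the original probabilities; sampling itself needs just `alias_sample`.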
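57. Key idea of this paper: Metropolis-Hastings-Walker Sampling
The following strategy is named "Metropolis-Hastings-Walker Sampling"!
• Make sampling from a discrete distribution p O(1) with Walker's alias method.
• Now suppose p has changed (slightly) into p′, and we want to sample from the discrete distribution p′.
• Use the stale, pre-change distribution p, which can be sampled in O(1), as the proposal distribution, and run Metropolis-Hastings sampling to obtain samples from p′.
(If the change from p to p′ is small, proposals are accepted almost immediately, so this is very efficient!)
The point of this paper is to speed up topic model inference by using this MHW sampling (together with a sparse/dense decomposition) in place of Gibbs sampling.
This is explained next.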
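The strategy in the bullets above can be sketched as an independence Metropolis-Hastings chain whose proposals come from an alias table built on the stale distribution. This is a toy illustration under stated assumptions, not the paper's topic-model sampler: the names `stale`, `fresh`, and `mhw_samples` are hypothetical, both distributions are assumed strictly positive, and the alias table uses a Vose-style construction.

```python
import random

def build_alias_table(probs):
    """O(k) alias-table construction (Vose-style, assumed)."""
    k = len(probs)
    scaled = [p * k for p in probs]
    threshold, alias = [1.0] * k, list(range(k))
    small = [i for i, s in enumerate(scaled) if s < 1.0]
    large = [i for i, s in enumerate(scaled) if s >= 1.0]
    while small and large:
        s, l = small.pop(), large.pop()
        threshold[s], alias[s] = scaled[s], l
        scaled[l] -= 1.0 - scaled[s]
        (small if scaled[l] < 1.0 else large).append(l)
    return threshold, alias

def mhw_samples(stale, fresh, n_steps, rng):
    """Draw from `fresh` (p') using O(1) proposals from `stale` (p).

    Assumes every entry of both distributions is strictly positive.
    """
    threshold, alias = build_alias_table(stale)
    k = len(stale)

    def propose():
        i = rng.randrange(k)
        return i if rng.random() < threshold[i] else alias[i]

    current = propose()
    samples = []
    for _ in range(n_steps):
        candidate = propose()
        # Independence-sampler acceptance ratio: p'(j) p(i) / (p'(i) p(j)).
        ratio = (fresh[candidate] * stale[current]) / (fresh[current] * stale[candidate])
        if rng.random() < min(1.0, ratio):
            current = candidate
        samples.append(current)
    return samples

stale = [0.2, 0.1, 0.3, 0.1, 0.3]     # distribution the table was built for
fresh = [0.25, 0.1, 0.3, 0.05, 0.3]   # slightly changed target
draws = mhw_samples(stale, fresh, 10_000, random.Random(0))
print([round(draws.count(i) / len(draws), 3) for i in range(5)])
```

When `fresh` is close to `stale`, the ratio stays near 1 and almost every proposal is accepted, which matches the efficiency remark on the slide.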