▪ Finite-time Analysis of the Multiarmed Bandit Problem
▪ Understanding the UCB Policy for the Multi-armed Bandit Problem
▪ Bandit Algorithms in Information Retrieval
▪ A Contextual-Bandit Approach to Personalized News Article Recommendation
▪ Thompson Sampling for Contextual Bandits with Linear Payoffs
▪ An Empirical Evaluation of Thompson Sampling
▪ Netflix Uses It Too! A Thorough Guide to Contextual Bandit Algorithms (Part 1)
▪ PG-TS: Improved Thompson Sampling for Logistic Contextual Bandits
▪ [Paper Introduction] PG-TS: Improved Thompson Sampling for Logistic Contextual Bandits
▪ Cascading Bandits: Learning to Rank in the Cascade Model
▪ The KL-UCB Algorithm for Bounded Stochastic Bandits and Beyond
▪ Theory and Implementation of Bandit Algorithms for the Cascade Model
▪ Cascading Bandits for Large-Scale Recommendation Problems
References