* Satoshi Hara and Kohei Hayashi. Making Tree Ensembles Interpretable: A Bayesian Model Selection Approach. AISTATS'18 (to appear).
arXiv ver.: https://arxiv.org/abs/1606.09066#
* GitHub
https://github.com/sato9hara/defragTrees
Data-Intensive Text Processing with MapReduce ch6.1 (Sho Shimauchi)
This document covers Chapter 6.1 of "Data-Intensive Text Processing with MapReduce".
Chapter 6 describes how to design Expectation Maximization algorithms with MapReduce.
Section 6.1 focuses on the Expectation Maximization algorithm itself, so it contains no description of MapReduce.
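As a rough illustration of the Expectation Maximization algorithm that Section 6.1 focuses on, here is a minimal sketch for a two-component one-dimensional Gaussian mixture. The mixture model, the function names (`em_step`, `fit`, `make_sample`) and the initialisation are assumptions chosen for this example and are not taken from the book or the slides.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def log_likelihood(data, weights, means, variances):
    """Log-likelihood of the data under the current mixture parameters."""
    return sum(
        math.log(sum(w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)))
        for x in data
    )


def em_step(data, weights, means, variances):
    """One EM iteration: E-step (responsibilities), then M-step (re-estimation)."""
    k = len(weights)
    # E-step: posterior probability that each point came from each component.
    resp = []
    for x in data:
        joint = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
        total = sum(joint)
        resp.append([j / total for j in joint])
    # M-step: weighted maximum-likelihood estimates using the responsibilities.
    new_w, new_m, new_v = [], [], []
    for j in range(k):
        nj = sum(r[j] for r in resp)
        mean = sum(r[j] * x for r, x in zip(resp, data)) / nj
        var = sum(r[j] * (x - mean) ** 2 for r, x in zip(resp, data)) / nj
        new_w.append(nj / len(data))
        new_m.append(mean)
        new_v.append(max(var, 1e-6))  # small floor to avoid a degenerate component
    return new_w, new_m, new_v


def fit(data, iterations=50):
    """Run EM from a simple (assumed) initialisation; return parameters and history."""
    weights, means, variances = [0.5, 0.5], [min(data), max(data)], [1.0, 1.0]
    history = []
    for _ in range(iterations):
        weights, means, variances = em_step(data, weights, means, variances)
        history.append(log_likelihood(data, weights, means, variances))
    return weights, means, variances, history


def make_sample(seed=0, n=200):
    """Hypothetical test data: two Gaussian clusters centred at -2 and 3."""
    rng = random.Random(seed)
    return [rng.gauss(-2.0, 1.0) for _ in range(n)] + [rng.gauss(3.0, 1.0) for _ in range(n)]
```

Calling `fit(make_sample())` returns the estimated weights, means and variances together with the log-likelihood after each iteration. That history should never decrease, which is the defining property of EM.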
This document proposes a unified framework for approximating the optimal key estimation of stream ciphers using probabilistic inference. It formulates the key estimation problem as determining the secret key that maximizes the joint probability based on the observed keystream. An approximation algorithm called the sum-product algorithm is used to efficiently compute approximate marginal probabilities on a factor graph representing the cipher structure. Preprocessing techniques can reduce the complexity of the sum-product algorithm when applied to ciphers using linear feedback shift registers.
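To make the sum-product algorithm mentioned above concrete, here is a small sketch on a chain-shaped factor graph. It is not the cipher model from the paper: the chain structure, the function names and the factor tables are assumptions for illustration. On a chain (a tree) the algorithm gives exact marginals; on graphs with cycles, such as those arising from cipher structures, the same message updates yield only approximations.

```python
from itertools import product


def chain_marginals(unary, pairwise):
    """Sum-product on a chain x_0 - x_1 - ... - x_{n-1}.

    unary[i][s]       : factor value for x_i = s
    pairwise[i][a][b] : factor value for (x_i = a, x_{i+1} = b)
    """
    n, k = len(unary), len(unary[0])
    # forward[i][s]: message reaching x_i from the left part of the chain.
    forward = [[1.0] * k for _ in range(n)]
    for i in range(1, n):
        for s in range(k):
            forward[i][s] = sum(
                forward[i - 1][t] * unary[i - 1][t] * pairwise[i - 1][t][s]
                for t in range(k)
            )
    # backward[i][s]: message reaching x_i from the right part of the chain.
    backward = [[1.0] * k for _ in range(n)]
    for i in range(n - 2, -1, -1):
        for s in range(k):
            backward[i][s] = sum(
                backward[i + 1][t] * unary[i + 1][t] * pairwise[i][s][t]
                for t in range(k)
            )
    # Belief at each variable: product of incoming messages and its own factor.
    marginals = []
    for i in range(n):
        belief = [forward[i][s] * unary[i][s] * backward[i][s] for s in range(k)]
        z = sum(belief)
        marginals.append([b / z for b in belief])
    return marginals


def brute_force_marginals(unary, pairwise):
    """Reference computation: enumerate every joint assignment (exponential cost)."""
    n, k = len(unary), len(unary[0])
    totals = [[0.0] * k for _ in range(n)]
    for assignment in product(range(k), repeat=n):
        weight = 1.0
        for i, s in enumerate(assignment):
            weight *= unary[i][s]
        for i in range(n - 1):
            weight *= pairwise[i][assignment[i]][assignment[i + 1]]
        for i, s in enumerate(assignment):
            totals[i][s] += weight
    return [[t / sum(row) for t in row] for row in totals]
```

The message-passing version costs time linear in the chain length, while the brute-force version enumerates all joint assignments. That gap is why sum-product is attractive for large factor graphs.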
This document analyzes the asymptotic properties of expected cumulative logarithmic loss in Bayesian estimation when models are nested and when there is misspecification. The main theorem states that if the true distribution does not belong to the model class, the asymptotic loss per symbol goes to the Kullback-Leibler divergence between the true and model distributions, rather than 0. If the true distribution does belong to the model class, the results reduce to previous studies. The proof is separated into two parts and relies on a lemma showing posterior concentration at the true model.
This document proposes a linear programming (LP) based approach for solving maximum a posteriori (MAP) estimation problems on factor graphs that contain multiple-degree non-indicator functions. It presents an existing LP method for problems with single-degree functions, then introduces a transformation to handle multiple-degree functions by introducing auxiliary variables. This allows applying the existing LP method. As an example, it applies this to maximum likelihood decoding for the Gaussian multiple access channel. Simulation results demonstrate the LP approach decodes correctly with polynomial complexity.
The document proposes a new method for document classification with small training data. It discusses previous methods that estimate parameters for prior distributions either using fixed values or estimating data. The new proposed method estimates parameters for prior distributions as a weighted combination of estimating data and training data. Experiments show the new method achieves higher accuracy than previous methods, especially with small training data sizes.
The document proposes a method to calculate the theoretical throughput limit of type-I hybrid selective-repeat ARQ with a finite receiver buffer using Markov decision processes. The authors model the problem as an MDP and develop an algorithm to compute the maximum expected utility and throughput limit by applying dynamic programming. Simulation results show the throughput of previous methods approaches the proposed theoretical limit with increasing buffer size.
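The dynamic-programming idea behind computing a maximum expected utility in an MDP can be sketched with value iteration on a toy problem. This is a generic illustration only: the discounted criterion, the two-state example and the data layout are assumptions, and the ARQ model and optimality criterion in the paper may differ.

```python
def value_iteration(transitions, rewards, discount=0.9, tol=1e-10):
    """Compute optimal state values of a small discounted MDP.

    transitions[s][a] : list of (probability, next_state) pairs
    rewards[s][a]     : immediate reward for taking action a in state s
    """
    values = [0.0] * len(transitions)
    while True:
        new_values = []
        for s in range(len(transitions)):
            # Bellman optimality update: best action by expected return.
            best = max(
                rewards[s][a] + discount * sum(p * values[t] for p, t in transitions[s][a])
                for a in range(len(transitions[s]))
            )
            new_values.append(best)
        if max(abs(a - b) for a, b in zip(new_values, values)) < tol:
            return new_values
        values = new_values


# Hypothetical two-state example.
# State 0: action 0 stays (reward 0), action 1 moves to state 1 (reward 1).
# State 1: a single action that stays and earns reward 2.
TOY_TRANSITIONS = [[[(1.0, 0)], [(1.0, 1)]], [[(1.0, 1)]]]
TOY_REWARDS = [[0.0, 1.0], [2.0]]
```

With a discount of 0.5, state 1 is worth 2 / (1 - 0.5) = 4 and state 0 is worth 1 + 0.5 * 4 = 3, which `value_iteration(TOY_TRANSITIONS, TOY_REWARDS, discount=0.5)` reproduces up to the tolerance.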
The document proposes reducing the computational complexity of message passing algorithms like belief propagation (BP) and concave-convex procedure (CCCP) for multiuser detection in CDMA systems. It does this by changing the factor graph structure used to represent the detection problem from a fully connected graph (Factor Graph I) to a sparsely connected graph (Factor Graph II). Simulation results show the proposed CCCP detector for the new factor graph achieves near optimal performance with lower complexity than existing approaches.
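5. Derivation of the Bayes-optimal prediction method
* Bayes-optimal prediction for the ordinary linear regression model ⇒ assuming a natural conjugate prior, it can be computed analytically as the expectation of a t distribution [Bernardo’94]
* Given z^n: the set of all data points with normal (non-outlier) values
* Weight every z^n by its posterior probability ⇒ there is no need to detect outliers
* The weighted sum runs over all 2^n values of z^n ⇒ O(2^n) computational complexity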
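The weighting described on this slide can be written as a single formula. The notation is an assumption made for illustration and is not taken from the slide: $x^n, y^n$ are the observed inputs and outputs, $z^n \in \{0,1\}^n$ marks which observations are normal values, and $\hat{y}_{n+1}$ is the prediction at a new input $x_{n+1}$ under squared-error loss.

```latex
\hat{y}_{n+1}
  = \sum_{z^n \in \{0,1\}^n}
      P\!\left(z^n \mid x^n, y^n\right)\,
      \mathbb{E}\!\left[\, y_{n+1} \mid x_{n+1},\, x^n,\, y^n,\, z^n \right]
```

Each conditional expectation is the mean of the predictive distribution obtained from the data marked normal by $z^n$, which the slide states is a t distribution under a natural conjugate prior. The sum has $2^n$ terms, which is where the $O(2^n)$ cost comes from.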