This document presents research on using 3-layer Bayesian models for text classification. It shows that models using a "sum of products" factorization can learn non-linear decision boundaries, unlike models using a "product of sums" factorization which are limited to linear boundaries. Experimental results on several text datasets demonstrate that a 3-layer Bayesian classifier can outperform naive Bayes, maximum entropy, and SVM models. The number of hidden nodes per class affects accuracy, with more nodes generally performing better.