This document discusses web spam detection using machine learning techniques. Specifically, it proposes an improved Naive Bayes classifier that incorporates user feedback and domain-specific features to better detect spam pages. The key points are:
1) Web spam has become a serious problem as internet usage has increased, threatening search engines and users. Spam pages aim to deceive search engines' ranking algorithms.
2) Existing spam detection techniques like content analysis are still lacking and Naive Bayes classifiers are commonly used but have limitations like treating all terms equally.
3) The paper proposes an improved Naive Bayes classifier that assigns different weights to terms based on user feedback and considers domain-specific features to reduce false positives and negatives and improve accuracy