The document discusses the challenges of data cleaning for knowledge extraction from social media, emphasizing that keyword- or hashtag-based filtering alone is often insufficient to separate on-topic posts from noise. It presents several supervised learning models, trained on annotated tweet datasets, that classify whether a tweet is relevant to a given topic. The study compares different feature extraction strategies and concludes that machine learning can improve relevancy detection in social media data.
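To make the contrast between keyword filtering and a learned relevancy classifier concrete, here is a minimal sketch. It rests on several assumptions: the summary does not name the models or features used, so a bag-of-words multinomial Naive Bayes classifier stands in for them; the tweets, labels, keyword set, and the names `keyword_filter` and `NaiveBayesRelevance` are hypothetical and exist only for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple feature extractor)."""
    return text.lower().split()


def keyword_filter(text, keywords):
    """Baseline: a tweet counts as relevant if it contains any topic keyword."""
    return any(tok in keywords for tok in tokenize(text))


class NaiveBayesRelevance:
    """Multinomial Naive Bayes over bag-of-words counts with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, texts, labels):
        self.word_counts = {0: Counter(), 1: Counter()}
        self.doc_counts = Counter(labels)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = set(self.word_counts[0]) | set(self.word_counts[1])
        return self

    def _log_score(self, tokens, label):
        # Log prior plus the sum of smoothed log likelihoods of known tokens.
        total_docs = sum(self.doc_counts.values())
        score = math.log(self.doc_counts[label] / total_docs)
        denom = sum(self.word_counts[label].values()) + self.alpha * len(self.vocab)
        for tok in tokens:
            if tok in self.vocab:  # tokens never seen in training are skipped
                score += math.log((self.word_counts[label][tok] + self.alpha) / denom)
        return score

    def predict(self, text):
        tokens = tokenize(text)
        return 1 if self._log_score(tokens, 1) > self._log_score(tokens, 0) else 0


# Hypothetical annotated tweets for a "flu" topic: 1 = relevant, 0 = not relevant.
TRAIN = [
    ("got the flu and a high fever staying home today", 1),
    ("flu season is here get your vaccine", 1),
    ("fever and cough all week think it is the flu", 1),
    ("doctor says flu cases are rising in the city", 1),
    ("bieber fever is real at the concert tonight", 0),
    ("this song gives me saturday night fever", 0),
    ("new phone case arrived love the colour", 0),
    ("concert tickets for tonight are sold out", 0),
]
KEYWORDS = {"flu", "fever"}  # hypothetical keyword list for the baseline

clf = NaiveBayesRelevance().fit([t for t, _ in TRAIN], [y for _, y in TRAIN])

off_topic = "saturday night fever at the concert"
on_topic = "home with the flu and a cough"
print(keyword_filter(off_topic, KEYWORDS), clf.predict(off_topic))
print(keyword_filter(on_topic, KEYWORDS), clf.predict(on_topic))
```

On this toy data the keyword baseline accepts the off-topic tweet because it contains "fever", while the classifier rejects it based on the surrounding words. Eight hand-written tweets prove nothing about real performance; the sketch only illustrates why context-aware features can help where a keyword match cannot.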