A Twitter dataset gives us tweets, each retrieved under a pre-defined “Trend”.
Can we recover those trends from the raw statuses alone? If so, we could use the same technique to assign topics to any list of tweets that has no trend labels!
On a tweet dataset curated from 10 top Twitter trends, I researched different Natural Language Processing (NLP) and clustering techniques to apply to the raw tweet text.
I found that with the right NLP and clustering, I could match ~80% of the tweets back to their labeled trends!
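
To make "matching tweets back to their trends" concrete, here is a minimal sketch of one way such a score could be computed once a clustering has been produced. It is an assumption on my part that the score is cluster purity (each cluster is mapped to its most common trend, and a tweet counts as matched when its own trend equals that majority). The function name `verified_fraction` and the toy data are hypothetical and only there for illustration.

```python
from collections import Counter, defaultdict


def verified_fraction(cluster_ids, trend_labels):
    """Fraction of tweets whose trend equals the majority trend of their cluster.

    This is the cluster purity score; it is an assumed stand-in for the
    "verified" percentage, not necessarily the exact metric used here.
    """
    if len(cluster_ids) != len(trend_labels):
        raise ValueError("cluster_ids and trend_labels must have the same length")
    if not cluster_ids:
        return 0.0

    # Count how often each trend appears inside each cluster.
    by_cluster = defaultdict(Counter)
    for cluster, trend in zip(cluster_ids, trend_labels):
        by_cluster[cluster][trend] += 1

    # Each cluster contributes the size of its majority trend.
    matched = sum(counts.most_common(1)[0][1] for counts in by_cluster.values())
    return matched / len(cluster_ids)


# Hypothetical toy data: 5 tweets, 2 clusters, 2 trends.
clusters = [0, 0, 0, 1, 1]
trends = ["#nba", "#nba", "#oscars", "#oscars", "#oscars"]
score = verified_fraction(clusters, trends)
print(f"{score:.0%} of tweets matched their labeled trend")
```

Purity is easy to read but rewards over-splitting (one cluster per tweet scores 100%), so it is only meaningful when the number of clusters is fixed, for example to the number of trends.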