Ranking Twitter Conversations
• Motivating Example
– Extract real-time information
– People's views on upcoming elections and products
– Extract user interests from conversation topics
• Problem Definition
– Rank Twitter conversations
– Generate a snippet for each ranked conversation
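As a rough illustration of the snippet-generation step, here is a minimal Python sketch. The overlap heuristic, the `generate_snippet` name, and the truncation length are assumptions for illustration, not the project's actual method.

```python
# Hypothetical snippet generator for a ranked conversation: pick the
# tweet with the most query-word overlap and truncate it if too long.
# This is an illustrative assumption, not the project's actual method.

def generate_snippet(conversation, query_words, max_len=60):
    """Return a short excerpt from the most query-relevant tweet."""
    def overlap(tweet):
        return len(set(tweet.lower().split()) & set(query_words))
    best = max(conversation, key=overlap)
    return best if len(best) <= max_len else best[:max_len].rstrip() + "..."

convo = ["who r u voting for", "I am voting in the election tomorrow"]
print(generate_snippet(convo, ["election", "voting"]))
# -> "I am voting in the election tomorrow"
```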
Related Work
• Wang, Hao, Zhengdong Lu, Hang Li, and Enhong Chen. "A
Dataset for Research on Short-Text Conversations." In
EMNLP, pp. 935-945. 2013.
• Key Idea
– A retrieval-based response model for short-text conversation
• Their solution
– Considered a few selected topics from Sina Weibo
– Semantic matching between post and response
– Post–response similarity
• Their results
– Mean average precision: 0.621
– Retrieval is fairly effective at capturing semantic relevance, but relatively weak at modeling logical consistency
Our Methodology
• Key Idea of our work
– Assign an importance score to each tweet based on its position, and to each user based on their appearances, in addition to using an inverted index
• Solution Description
– Filter tweets
– Create an inverted word index
– Handle SMS language
– Score tweets according to TF combined with tweet and user scores
– Weight the TF score by word type
• Hashtag, user mention, other words
– Generate snippets
• Our approach ranks Twitter conversations rather than just finding responses to tweets
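The scoring idea above can be sketched in Python. The word-type weights, the additive combination, and all function names are illustrative assumptions rather than the project's exact formulation.

```python
# Hypothetical sketch of the tweet-scoring step described above.
# The word-type weights and the combination formula are assumptions,
# not the exact values used in the project.

from collections import Counter

# Assumed weights per word type: hashtags and user mentions count more.
TYPE_WEIGHT = {"hashtag": 3.0, "mention": 2.0, "other": 1.0}

def word_type(word):
    """Classify a token as a hashtag, user mention, or ordinary word."""
    if word.startswith("#"):
        return "hashtag"
    if word.startswith("@"):
        return "mention"
    return "other"

def tf_score(tweet_words, query_words):
    """Type-weighted term-frequency score of a tweet for a query."""
    counts = Counter(tweet_words)
    return sum(counts[w] * TYPE_WEIGHT[word_type(w)]
               for w in query_words if w in counts)

def tweet_score(tweet_words, query_words, position_score, user_score):
    """Combine TF with the position-based tweet score and the user score
    (a simple weighted sum; the real combination may differ)."""
    return tf_score(tweet_words, query_words) + position_score + user_score

# Example: a hashtag match contributes 3.0, an ordinary word 1.0.
print(tf_score(["#election", "vote", "now"], ["#election", "vote"]))  # -> 4.0
```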
System pipeline:
1. Parse Twitter data
2. Filter valid tweets
3. Extract conversations
4. Remove stop words
5. Remove duplicate words in a tweet
6. Create inverted word index
7. Calculate user and tweet scores
8. Get query
9. Parse words in query
10. Expand SMS words
11. Calculate conversation score based on TF and tweet and user scores
12. Generate snippet and display the results
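The indexing steps in the flow above (stop-word removal, per-tweet de-duplication, inverted index construction) might look roughly like this. The tiny stop-word list and the tweet representation are assumptions for illustration.

```python
# Minimal sketch of the indexing steps in the pipeline above: remove
# stop words, drop duplicate words within a tweet, and build an inverted
# word index mapping each word to the tweet ids containing it.

from collections import defaultdict

# Assumed illustrative subset; the project used a list of 119 stop words.
STOP_WORDS = {"the", "a", "is", "in", "to", "of"}

def clean(tweet_text):
    """Lower-case, remove stop words, and de-duplicate words in a tweet."""
    seen = []
    for w in tweet_text.lower().split():
        if w not in STOP_WORDS and w not in seen:
            seen.append(w)
    return seen

def build_inverted_index(tweets):
    """Map each word to the set of tweet ids it appears in."""
    index = defaultdict(set)
    for tid, text in tweets.items():
        for w in clean(text):
            index[w].add(tid)
    return index

tweets = {1: "the election is coming", 2: "vote in the election election"}
index = build_inverted_index(tweets)
print(sorted(index["election"]))  # -> [1, 2]
```

Looking up a query word in `index` then returns the candidate tweets to score, which is the usual role of an inverted index.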
Dataset and Experimental Settings
• Dataset details (size and statistics)
– 12,077 tweets
– 4,521 conversations (length >= 2)
– 119 stop words
• Experimental settings
– Experimented with adding or removing the following constraints:
• Duplicate words
• Stop words
• Tweet/user score
– Expanded SMS words in the query
• Evaluation metric
– Evaluation was subjective and carried out iteratively
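The SMS-word expansion applied to queries could be sketched as follows. The abbreviation dictionary here is a made-up sample, not the project's actual mapping.

```python
# Illustrative sketch of expanding SMS shorthand in a query before
# matching it against the index. The abbreviation dictionary is an
# assumed sample, not the mapping used in the project.

SMS_EXPANSIONS = {
    "u": "you",
    "gr8": "great",
    "2day": "today",
    "pls": "please",
}

def expand_sms(query):
    """Replace known SMS abbreviations with their full forms."""
    return " ".join(SMS_EXPANSIONS.get(w, w) for w in query.lower().split())

print(expand_sms("u coming 2day"))  # -> "you coming today"
```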
Results and Summary
• Results and analysis
– The results are subjective in nature; accuracy could not be measured without knowing the context of each conversation
• What did you learn from this project?
– A basic understanding of how documents can be
ranked given a query
• Future work:
– Infer context of the conversation
– Calculate precision/recall by programmatically tagging
tweets