Predict Interestingness of An Article Using Twitter

The project aims at measuring the interestingness of articles by analyzing the tweets related to the entities in the article.


Application:

We can order the articles for a search query according to their interestingness.

Suggesting news articles to users on websites

Predict Interestingness of An Article Using Twitter Presentation Transcript

  • 1. Predict the Interestingness of an Article Using Twitter. Chitra Khatwani, Yashasvi Girdhar, Khyati Chandu, R.K. Srinivas
  • 2. The project aims at measuring the interestingness of articles by analyzing the tweets related to the entities in the article.
    ● Application:
        – We can order the articles for a search query according to their interestingness.
        – Suggesting news articles to users on websites
  • 3. Approach Followed
    ● Extract all the named entities from the article. Two methods are possible:
        – Using the NLTK library
        – Using a list of Wikipedia titles
    We used the second approach because the NLTK library misses many important entities in some cases.
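The title-list method on this slide can be sketched as a longest-match lookup of word n-grams against a set of titles. This is a minimal sketch, not the authors' implementation: the function name `extract_entities`, the `max_words` limit, and the tiny title set in the usage note are all assumptions made for illustration.

```python
import re

def extract_entities(text, titles, max_words=4):
    """Return titles found in text, preferring the longest match at each position."""
    # Case-insensitive lookup from a lowercased title to its canonical form.
    lookup = {t.lower(): t for t in titles}
    # Simplification: punctuation is dropped, so a match may span a sentence break.
    words = re.findall(r"[A-Za-z0-9']+", text)
    found = []
    i = 0
    while i < len(words):
        matched = 0
        # Try the longest phrase first so "Barack Obama" wins over "Obama".
        for n in range(min(max_words, len(words) - i), 0, -1):
            phrase = " ".join(words[i:i + n]).lower()
            if phrase in lookup:
                found.append(lookup[phrase])
                matched = n
                break
        i += matched if matched else 1
    return found
```

For example, with the hypothetical title set `{"Barack Obama", "White House", "Obama"}`, the text "Barack Obama spoke at the White House. Obama left early." yields `["Barack Obama", "White House", "Obama"]`. A real Wikipedia title list would likely need filtering first, since it also contains common words.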
  • 4. Approach Followed
    ● Shortlist the dominant entities from the extracted entities.
        – Dominant entities are those most frequently talked about in the article.
        – Methods:
            ● Ranking entities by their frequency in the article
            ● Selecting entities that occur in the title of the article
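The two selection methods on this slide can be combined in a few lines. The sketch below assumes one possible combination: entities from the title are always kept, and the remaining slots are filled by frequency. The function name `dominant_entities` and the `top_k` cutoff are hypothetical, since the slides do not say how the two methods were weighted.

```python
from collections import Counter

def dominant_entities(body_entities, title_entities, top_k=3):
    """Pick up to top_k entities: title entities first, then the most frequent ones."""
    counts = Counter(body_entities)
    # Title entities are kept unconditionally (de-duplicated, original order).
    dominant = list(dict.fromkeys(title_entities))
    # Fill the remaining slots with the most frequent body entities.
    for entity, _ in counts.most_common():
        if len(dominant) >= top_k:
            break
        if entity not in dominant:
            dominant.append(entity)
    return dominant
```

With `body_entities = ["Obama", "Obama", "NASA", "Obama", "NASA", "Mars"]`, `title_entities = ["Mars"]` and `top_k=2`, the result is `["Mars", "Obama"]`.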
  • 5. Approach Followed
    ● Mine the tweets related to each dominant entity.
        – Done using the Twitter Search API.
        – Tweets for each entity need to be collected from around the date the article was published.
        – Tweets need to be parsed before they are stored, to make them ready for the next steps.
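The parsing and date-filtering steps on this slide might look like the sketch below. It deliberately leaves out the Twitter Search API call itself. The cleaning rules (dropping retweet markers, URLs and mentions, and keeping hashtag words) and the three-day window are assumptions, as the slides do not specify them.

```python
import re
from datetime import date

RT_RE = re.compile(r"^\s*RT\s+@\w+:?\s*")   # leading "RT @user:" marker
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")

def clean_tweet(text):
    """Normalize a raw tweet into lowercase, whitespace-separated tokens."""
    text = RT_RE.sub("", text)
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = text.replace("#", "")            # keep the hashtag word, drop the symbol
    return " ".join(text.lower().split())

def within_window(tweet_date, article_date, days=3):
    """True if the tweet was posted within `days` of the article's publication date."""
    return abs((tweet_date - article_date).days) <= days
```

For example, `clean_tweet("RT @user: Great news #NASA http://t.co/x")` gives `"great news nasa"`.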
  • 6. Approach Followed
    ● Categorize each tweet as positive (+ve), negative (-ve) or neutral.
        – Treat all unigram tokens equally.
        – Score each token using the naive Bayes formula.
        – Sum the scores of all the tokens to calculate the score for an entity.
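One common way to realize this token scoring is a log-likelihood ratio per unigram, summed over a tweet. The sketch below scores a single tweet and makes several assumptions that are not in the slides: equal class priors, add-one (Laplace) smoothing, and a hypothetical `margin` threshold below which a tweet is called neutral.

```python
import math
from collections import Counter

def train(labeled_tweets):
    """Count unigrams per class from (text, label) pairs, label being 'pos' or 'neg'."""
    counts = {"pos": Counter(), "neg": Counter()}
    for text, label in labeled_tweets:
        counts[label].update(text.lower().split())
    return counts

def token_score(token, counts):
    """log P(token | pos) - log P(token | neg), with add-one smoothing."""
    vocab_size = len(set(counts["pos"]) | set(counts["neg"]))
    p_pos = (counts["pos"][token] + 1) / (sum(counts["pos"].values()) + vocab_size)
    p_neg = (counts["neg"][token] + 1) / (sum(counts["neg"].values()) + vocab_size)
    return math.log(p_pos / p_neg)

def classify(text, counts, margin=0.5):
    """Sum token scores; scores within +/- margin are treated as neutral."""
    score = sum(token_score(t, counts) for t in text.lower().split())
    if score > margin:
        return "pos"
    if score < -margin:
        return "neg"
    return "neutral"
```

Every token carries the same weight, matching the slide's "treat all unigram tokens equally". A token never seen in training scores exactly zero under this smoothing, so it does not shift the result in either direction.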
  • 7. Approach Followed
    ● Predict the interestingness of the article using the number of positive and negative tweets. We followed this approach:
        – The smaller the difference between the number of positive tweets and the number of negative tweets, the more interesting the article.
        – Conversely, if the number of positive entities outweighs the number of negative entities, or vice versa, the article is considered less interesting.
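The slides give only the direction of this rule, not a formula. One way to turn it into a number is to normalize the difference by the total tweet count, so that a perfectly split opinion scores 1 and a one-sided one scores 0. The formula and the name `interestingness` below are assumptions for illustration, not the authors' stated method.

```python
def interestingness(num_positive, num_negative):
    """Score in [0, 1]: 1 when opinion is evenly split, 0 when it is one-sided."""
    total = num_positive + num_negative
    if total == 0:
        # No opinionated tweets at all: treated here as not interesting.
        return 0.0
    return 1.0 - abs(num_positive - num_negative) / total
```

Articles for a search query could then be ordered with `sorted(articles, key=lambda a: interestingness(a.pos, a.neg), reverse=True)`, where `a.pos` and `a.neg` are hypothetical per-article tweet counts.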
  • 8. Datasets used
    ● For articles
        – A set of random news articles taken from the BBC News dataset
    ● For sentiment analysis
        – Mejaj dataset: built by categorizing tweets against a predefined list of positive and negative words
        – Stanford dataset
  • 9. Challenges
    ● Collecting the right set of articles for testing our model
    ● Finding the right Twitter dataset and then deciding on the parameters used to categorize a tweet
    ● Choosing an appropriate algorithm for deciding the interestingness of an article from its +ve and -ve tweets
  • 10. Conclusion
    ● Social media, Twitter in this case, is now a very common medium for people to express their opinions. It can be leveraged as a powerful signal for predicting the nature of content published on the web, especially the millions of articles published each day. It can also be used to suggest articles to users.
    References
    ● Mining Sentiments from Tweets, SIEL, IIIT-Hyderabad