Weird News Ranking : IRE project

Identification of the weirdness score and ranking of weird/bizarre news stories
TA : Vijayasaradhi
Team 22
Akshay Kolge (20162021)
Dhruv Khattar (201402087)
Rupali Aher (20162063)
Satyam Mittal (201501020)

Roadmap
● Introduction
● Motivation
● Dataset
● Approach
● Evaluation and Results
● Conclusion
● References

The problem
Problem statement
● Classifying a given news item as weird or normal (a 2-class classification problem)
● For weird news, obtaining a rank using ML techniques
● Predicting the weirdness level of these news items
Challenges
● Finding out the most important features
● Same things, same actions, different objects
● Object-action mapping is not always useful
● Manual annotation of weird news into ranks

Final Deliverable
The implementation provides a user interface that shows each news item with a rank from 1-4, plus a title weirdness filter that gives the weirdness score (0-3) of a title and tells whether it is weird or not.

Definition and Motivation
● Weird/Bizarre news
○ A news article that is so strange that readers might question its credibility.
○ Strange or bizarre
● Usually very rare, strange and unbelievable.
● Keeps the boredom away.
● Publishers use bizarre news to gain readers' attention and increase viewership.

Related Work
● We found no prior work directly related to weird and bizarre news.
● However, Clickbait Detection and Fake News Detection are closely related to our topic.
○ Chen, Yimin, Niall J. Conroy, and Victoria L. Rubin. "Misleading online
content: Recognizing clickbait as false news."
○ Chakraborty, Abhijnan, et al. "Stop clickbait: Detecting and preventing
clickbaits in online news media."
○ Bajaj, Samir. "“The Pope Has a New Baby!” Fake News Detection Using
Deep Learning."

Dataset
Weird News Articles: 67361
Normal News Articles: 46893
Total News Articles: 114254
Each record contains the URL and the title of the news article.
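Because the two classes are not the same size, a majority-class baseline is a useful reference point when reading the classification accuracies. The sketch below is an addition to the slides and uses only the counts listed above; the variable names are illustrative.

```python
# Article counts as reported on the Dataset slide.
weird_count = 67361
normal_count = 46893

total = weird_count + normal_count

# Accuracy of a trivial classifier that always predicts the larger class.
majority_baseline = max(weird_count, normal_count) / total

print(f"total articles: {total}")
print(f"weird share: {weird_count / total:.3f}")
print(f"majority-class baseline accuracy: {majority_baseline:.3f}")
```

Any classifier worth keeping should score clearly above this baseline on held-out data.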

Solution
Proposed Plans:
1. Classification Algorithms:
a. Support Vector Machine (SVM)
b. Random Forest
c. Logistic Regression
d. Deep Neural Network
e. Recurrent Neural Networks (RNN, LSTM, GRU)
f. Attention Network along with RNNs
2. Tools Used
a. Python
b. spaCy
c. Flask
d. Scikit-learn
e. Keras
f. Theano

1st Phase
● Defining the scope of the project
● Understanding and Building Project Prototype
● Discussion on
○ Applications
○ Challenges
○ Tools
○ References
● Google links : Scope Document

Overall Architecture
Phase 1: Classifying news
Phase 2: Ranking the news

2nd Phase
● Classifying a news article as weird/bizarre or normal.
● We worked with the following features:
○ Handcrafted features: title length, number of nouns, number of stop words, number of verbs, frequency of co-occurring words
○ Linguistic features: POS tags, n-grams
○ Word embeddings: pretrained GloVe embeddings, Doc2Vec embeddings trained on our dataset
● Google links : Scope Document
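As a rough illustration of the handcrafted and n-gram features from this phase, here is a minimal pure-Python sketch. It is not the project's code: the stop-word list, the tokenizer, the function names and the example title are all assumptions made for illustration. Noun and verb counts are left out because they need a POS tagger such as spaCy.

```python
from collections import Counter

# Tiny illustrative stop-word list (an assumption; the project's list is not given).
STOP_WORDS = {"a", "an", "the", "of", "in", "on", "to", "for", "and", "is", "with"}


def tokenize(title):
    """Lowercase the title, split on whitespace and strip edge punctuation."""
    tokens = (tok.strip(".,!?:;\"'()") for tok in title.lower().split())
    return [tok for tok in tokens if tok]


def word_ngrams(tokens, n):
    """Return all contiguous word n-grams as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def title_features(title):
    """Compute a few title-level features (hypothetical feature names)."""
    tokens = tokenize(title)
    return {
        "title_length": len(tokens),
        "num_stop_words": sum(tok in STOP_WORDS for tok in tokens),
        "bigram_counts": Counter(word_ngrams(tokens, 2)),
    }


# Made-up example title, not taken from the dataset.
features = title_features("Man marries the pillow of his dreams")
print(features["title_length"], features["num_stop_words"])
```

Features like these can be combined into one vector per title and passed to any of the classifiers listed under Solution.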

Results
Using RNN and GloVe:
Model               Accuracy
LSTM                0.851
BiLSTM              0.863
BiLSTM + attention  0.874

Results
Using TF-IDF Vector:
Model            Training Score   Testing Score
Random Forest    0.9124           0.7929
Neural Network   0.8030           0.7981
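For readers unfamiliar with TF-IDF, the sketch below shows how a title can be turned into a weighted term vector. It uses the textbook formula tf * log(N / df), which differs from the smoothed variant in scikit-learn, so the weights will not match the project's. The function name and the three example titles are made up for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(titles):
    """Return one {term: weight} dict per title using tf * log(N / df)."""
    docs = [title.lower().split() for title in titles]
    n_docs = len(docs)
    # Document frequency: number of titles that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return vectors


# Made-up titles, not taken from the dataset.
titles = [
    "cat elected mayor of town",
    "mayor opens new town library",
    "council approves new budget",
]
vectors = tfidf_vectors(titles)

# Terms that appear in fewer titles receive a higher weight.
print(sorted(vectors[0].items(), key=lambda kv: -kv[1])[:3])
```

Words that appear in only one title (such as "cat") get the highest weight, and words shared across titles get less.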

Results
Accuracy obtained for the above classifiers (without using the URL feature):
Model                 Training Score   Testing Score
Random Forest         0.9450           0.8070
Neural Network        0.8162           0.8108
Decision Tree         0.9534           0.7746
Logistic Regression   0.8128           0.8084
SVM                   0.8112           0.8076
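One way to read this table is to compare each model's training and testing score: a large gap suggests the model fits the training data more closely than it generalises. The short sketch below is an addition to the slides and uses the scores reported above.

```python
# (training score, testing score) pairs copied from the results table.
scores = {
    "Random Forest": (0.945011086475, 0.80695980955),
    "Neural Network": (0.8162, 0.81084581989648374),
    "Decision Tree": (0.953401797176, 0.774646408066),
    "Logistic Regression": (0.812848640448, 0.80843019185),
    "SVM": (0.811226514179, 0.807554964291),
}

# Gap between training and testing score, largest first.
gaps = sorted(
    ((name, train - test) for name, (train, test) in scores.items()),
    key=lambda item: item[1],
    reverse=True,
)

for name, gap in gaps:
    print(f"{name:<20} gap = {gap:.3f}")
```

The two tree-based models show the largest gaps, while the other three differ by less than 0.01 between training and testing.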

3rd Phase
● In the third phase of the project, we worked on ranking the given news articles by their weirdness.
● Each member annotated 500 news articles with a rating of 0-3, where 3 represents highly weird and 0 represents close to conventional news.
● After annotation, the ranking data from all the team members was merged into a single file.
● Different schemes, such as average rank and majority rank, were used to combine the ratings.
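The two merging schemes can be sketched as follows. This is a minimal illustration, assuming every article received one 0-3 rating from each of the four annotators. The rounding rule for the average and the tie-breaking rule for the majority are assumptions, since the slides do not specify them, and the article keys and ratings are made up.

```python
from collections import Counter
from statistics import mean


def average_rank(ratings):
    """Mean of the annotators' ratings, rounded half up to a class in 0-3."""
    return int(mean(ratings) + 0.5)


def majority_rank(ratings):
    """Most frequent rating; ties go to the lower (less weird) rating."""
    counts = Counter(ratings)
    best = max(counts.values())
    return min(rating for rating, count in counts.items() if count == best)


# Hypothetical ratings from four annotators for three articles.
annotations = {
    "article_1": [3, 3, 2, 3],
    "article_2": [0, 1, 1, 2],
    "article_3": [0, 3, 0, 3],
}

merged = {
    key: {"average": average_rank(r), "majority": majority_rank(r)}
    for key, r in annotations.items()
}
print(merged)
```

The third article shows how the two schemes can disagree when annotators are split, which is one reason the resulting label sets can lead to different accuracies.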

Approach
● We pose the problem of ranking weird/bizarre news as a multi-class classification problem.
● Each news article is given a label depending on the weirdness of the article.
● There are 4 classes (0-3), where 3 refers to highly weird news and 0 is close to conventional news.
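Once each article has a predicted class, a ranking follows by sorting on that class. The sketch below assumes a classifier that outputs one probability per class; the probabilities, the article names and the tie-breaking by expected weirdness are all illustrative assumptions, not the project's method.

```python
def predicted_label(probs):
    """Index (0-3) of the most probable weirdness class."""
    return max(range(len(probs)), key=lambda k: probs[k])


def expected_weirdness(probs):
    """Probability-weighted mean class, used here only to break ties."""
    return sum(k * p for k, p in enumerate(probs))


# Hypothetical classifier outputs: P(class 0), ..., P(class 3) per article.
predictions = {
    "article_a": [0.70, 0.20, 0.05, 0.05],
    "article_b": [0.05, 0.15, 0.30, 0.50],
    "article_c": [0.10, 0.10, 0.20, 0.60],
    "article_d": [0.10, 0.50, 0.30, 0.10],
}

# Sort by predicted class first, then by expected weirdness, weirdest first.
ranking = sorted(
    predictions,
    key=lambda a: (predicted_label(predictions[a]), expected_weirdness(predictions[a])),
    reverse=True,
)
print(ranking)
```

Articles b and c share the top class here, so the expected weirdness decides their order.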

Evaluations and Results
Using RNN and GloVe:
Model               Accuracy (Average Rank)   Accuracy (Majority Rank)
LSTM                0.737                     0.414
BiLSTM              0.748                     0.421
BiLSTM + attention  0.76                      0.43

Using TF-IDF Vector (the score reported is accuracy):
Model            Testing Score
Random Forest    0.69
Neural Network   0.79

The score reported is accuracy:
Model                 Testing Score
Neural Network        0.79
Logistic Regression   0.79
SVM                   0.79
Random Forest         0.72
Decision Tree         0.59

References
○ Chen, Yimin, Niall J. Conroy, and Victoria L. Rubin. "Misleading online content: Recognizing clickbait as false news." Proceedings of the 2015 ACM on Workshop on Multimodal Deception Detection. ACM, 2015.
○ Chakraborty, Abhijnan, et al. "Stop clickbait: Detecting and preventing clickbaits in online news media." Advances in Social Networks Analysis and Mining (ASONAM), 2016 IEEE/ACM International Conference on. IEEE, 2016.
○ Bajaj, Samir. "“The Pope Has a New Baby!” Fake News Detection Using Deep Learning."
Links
● Github: https://github.com/satyammittal/WEIRD-NEWS
● Website: https://satyammittal.github.io/

“Weird is not Normal”
- From an expert
