Mining Interesting Trivia for Entities
from Wikipedia
Supervised By:
Dr. Dhaval Patel, Assistant Professor, IIT Roorkee
Dr. Manoj K. Chinnakotla, Applied Researcher, Microsoft India
Presented By:
Abhay Prakash, En. No. - 10211002, IIT Roorkee
Publication Accepted
[1] Abhay Prakash, Manoj K. Chinnakotla, Dhaval Patel, Puneet Garg: “Did you
know?- Mining Interesting Trivia for Entities from Wikipedia”. In 24th
International Joint Conference on Artificial Intelligence (IJCAI), 2015.
Conference Rating: A*
Introduction: Problem Statement
Definition: Trivia are any facts about an entity which are interesting due to any of the following characteristics: unusualness, uniqueness, unexpectedness or weirdness.
• Generally appear in “Did you know?” articles
• E.g. “To prepare for Joker’s role, Heath Ledger secluded himself in a hotel room for a month” [Batman Begins]
  • Unusual for an actor/human to seclude himself for a month
Problem Statement: For a given entity, mine the top-k interesting trivia from its Wikipedia page, where a trivia is considered interesting if, when it is shown to 𝑁 persons, more than 𝑁/2 persons find it interesting.
• For evaluation of the unseen set, we chose 𝑁 = 5 (statistical significance discussed in the mid evaluation); the labeling rule is sketched below
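A minimal sketch of this majority-vote labeling rule. The `is_interesting` helper and the vote lists are hypothetical illustrations, not the dissertation's actual data format:

```python
# Minimal sketch of the majority-vote labeling rule used for ground truth.
# Assumption: each trivia candidate comes with a list of binary judge votes
# (True = judged interesting); the dissertation's real data format may differ.

def is_interesting(votes, n_judges=5):
    """A trivia is 'interesting' if more than N/2 of its N judges say so."""
    assert len(votes) == n_judges
    return sum(votes) > n_judges / 2

print(is_interesting([True, True, True, False, False]))   # 3/5 votes -> True
print(is_interesting([True, True, False, False, False]))  # 2/5 votes -> False
```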
Wikipedia Trivia Miner (WTM)
• Based on an ML approach to mine trivia from unstructured text
• Trains a ranker using sample trivia of the target domain
  • Experiments with Movie entities and Celebrity entities
• Harnesses the trained ranker to mine trivia from an entity’s Wikipedia page
  • Retrieves the top-k standalone interesting sentences from the entity’s page
• Why Wikipedia?
  • Reliable for factual correctness
  • Ample number of interesting trivia (56/100 in our experiment)
System Architecture
• Filtering & Grading
  • Filters out noisy samples
  • Assigns a grade to each sample, as required by the ranker
• Interestingness Ranker
  • Extracts features from the samples/candidates
  • Trains the ranker (SVMrank) / ranks the candidates
• Candidate Selection
  • Identifies candidates from Wikipedia
[Architecture diagram: Human Voted Trivia Source → Filtering & Grading → Train Dataset → Interestingness Ranker (Feature Extraction + SVMrank); Candidates’ Source / Knowledge Base → Candidate Selection → Feature Extraction → ranker → Top-K Interesting Trivia from Candidates; all within the Wikipedia Trivia Miner (WTM)]
Execution Phases
[Figure: Train Phase pipeline (Human Voted Trivia Source → Filtering & Grading → Train Dataset → Feature Extraction → SVMrank → Model) and Retrieval Phase pipeline (Candidates’ Source / Knowledge Base → Candidate Selection → Feature Extraction → SVMrank Model → Top-K Interesting Trivia from Candidates)]
• Train Phase
  • Crawls and prepares the train data
  • Featurizes the train data
  • Trains SVMrank to build a model
• Retrieval Phase
  • Crawls the entity’s Wikipedia text
  • Identifies candidates for trivia
  • Featurizes the candidates
  • Ranks the candidates using the already built model (see the sketch below)
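A minimal sketch of the two phases around SVMrank, assuming Joachims’ svm_rank_learn / svm_rank_classify command-line tools are installed. The in-line featurizer, data and the -c parameter are toy placeholders, not the dissertation's actual features, corpus or settings:

```python
# Minimal sketch of WTM's two execution phases around SVMrank.
# The featurizer and the tiny in-line data below are placeholders.
import subprocess

def extract_features(sentence):
    # Placeholder featurizer: two toy features (relative length, superlative flag).
    return {1: len(sentence.split()) / 50.0,
            2: 1.0 if any(w in sentence.lower() for w in ("first", "best", "longest")) else 0.0}

def write_svmrank_file(path, rows):
    """rows: (grade, qid, {feature_index: value}); SVMrank expects
    'grade qid:<q> idx:val ...' lines with increasing feature indices."""
    with open(path, "w") as f:
        for grade, qid, feats in rows:
            feat_str = " ".join(f"{i}:{v}" for i, v in sorted(feats.items()))
            f.write(f"{grade} qid:{qid} {feat_str}\n")

# Train Phase: filtered & graded trivia -> ranking model
graded_trivia = [("The longest animated Disney film since Fantasia (1940).", 3),
                 ("The film was released in 2005.", 1)]
write_svmrank_file("train.dat", [(g, 1, extract_features(s)) for s, g in graded_trivia])
subprocess.run(["svm_rank_learn", "-c", "3", "train.dat", "model.dat"], check=True)

# Retrieval Phase: candidate sentences from the entity's Wikipedia page -> top-k
candidates = ["Tom Cruise did all of his own stunt driving.",
              "The film grossed $274 Mn in North America."]
write_svmrank_file("cand.dat", [(0, 2, extract_features(s)) for s in candidates])
subprocess.run(["svm_rank_classify", "cand.dat", "model.dat", "pred.txt"], check=True)
scores = [float(x) for x in open("pred.txt")]
top_k = [s for _, s in sorted(zip(scores, candidates), reverse=True)][:10]
print(top_k)
```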
Feature Engineering
Bucket | Feature | Significance | Sample features | Example Trivia
Unigram (U) Features | Each word’s TF-IDF | Identifies important words which make the trivia interesting | “stunt”, “award”, “improvise” | “Tom Cruise did all of his own stunt driving.”
Linguistic (L) Features | Superlative Words | Show the extremeness (uniqueness) | “best”, “longest”, “first” | “The longest animated Disney film since Fantasia (1940).”
Linguistic (L) Features | Contradictory Words | Opposing ideas could spark intrigue and interest | “but”, “although”, “unlike” | “The studios wanted Matthew McConaughey for the lead role, but James Cameron insisted on Leonardo DiCaprio.”
Linguistic (L) Features | Root Word (Main Verb) | Captures the core activity being discussed in the sentence | root_gross | “Gravity grossed $274 Mn in North America”
Linguistic (L) Features | Subject Word (First Noun) | Captures the core thing being discussed in the sentence | subj_actor | “The actors snorted crushed B vitamins for scenes involving cocaine”
Linguistic (L) Features | Readability | Complex and lengthy trivia are hardly interesting | FOG Index binned in 3 bins | ---
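The Linguistic (L) bucket above can be approximated with off-the-shelf NLP tooling. The sketch below is illustrative only: it uses NLTK POS tags, a small hand-picked contradictory-word list, crude POS-based stand-ins for the dependency-parse root and subject the slides describe, a heuristic syllable counter for the FOG index, and assumed bin edges:

```python
# Illustrative extraction of the Linguistic (L) bucket: superlative count,
# contradictory-word flag, crude main verb (root), crude first noun (subject)
# and a 3-binned FOG readability score.
import re
import nltk  # requires the NLTK tokenizer and POS-tagger data packages

CONTRADICTORY = {"but", "although", "unlike", "however", "despite"}

def count_syllables(word):
    # Heuristic: count vowel groups, at least one syllable per word.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def linguistic_features(sentence):
    tokens = nltk.word_tokenize(sentence)
    tags = nltk.pos_tag(tokens)
    words = [w for w in tokens if w.isalpha()]

    sup_count = sum(1 for _, t in tags if t in ("JJS", "RBS"))            # superlatives
    contra = int(any(w.lower() in CONTRADICTORY for w in words))           # contradictory words
    root = next((w.lower() for w, t in tags if t.startswith("VB")), None)  # crude main verb
    subj = next((w.lower() for w, t in tags if t.startswith("NN")), None)  # crude first noun

    complex_words = sum(1 for w in words if count_syllables(w) >= 3)
    fog = 0.4 * (len(words) + 100.0 * complex_words / max(1, len(words)))  # single sentence
    fog_bin = 0 if fog < 8 else (1 if fog < 14 else 2)                     # assumed bin edges

    return {"supPOS": sup_count, "contradictory": contra,
            "root_" + (root or "none"): 1, "subj_" + (subj or "none"): 1,
            "FOG_bin": fog_bin}

print(linguistic_features("The longest animated Disney film since Fantasia."))
```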
Feature Engineering (Contd…)
Bucket | Feature | Significance | Sample features | Example Trivia
Entity (E) Features | Generic NEs | Capture general about-ness | MONEY, ORGANIZATION, PERSON, DATE, TIME and LOCATION | “The guns in the film were supplied by Aldo Uberti Inc., a company in Italy.” → ORGANIZATION and LOCATION
Entity (E) Features | Related Entities (resolved using DBpedia) | Capture specific about-ness | entity_producer, entity_director | “According to Victoria Alonso, Rocket Raccoon and Groot were created through a mix of motion-capture and rotomation VFX.” → entity_producer, entity_character
Entity (E) Features | Entity Linking before (L) Parsing | Captures the generalized story of the sentence | subj_entity_producer | [same trivia as above] → “According to entity_producer, …”; subj_Victoria → subj_entity_producer
Entity (E) Features | Focus Entities | Capture the core entities being talked about | underroot_entity_producer | [same trivia as above] → underroot_entity_producer, underroot_entity_character
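As an illustration of the generic named-entity bucket, the sketch below counts MONEY / ORGANIZATION / PERSON / DATE / TIME / LOCATION mentions. The slides do not name the NER tool actually used; spaCy is used here purely for illustration, with its labels mapped onto those buckets:

```python
# Illustrative extraction of the generic NE features; NER tool is an assumption.
import spacy  # requires an installed English model, e.g. en_core_web_sm

nlp = spacy.load("en_core_web_sm")
LABEL_MAP = {"MONEY": "MONEY", "ORG": "ORGANIZATION", "PERSON": "PERSON",
             "DATE": "DATE", "TIME": "TIME", "GPE": "LOCATION", "LOC": "LOCATION"}

def generic_ne_features(sentence):
    doc = nlp(sentence)
    feats = {}
    for ent in doc.ents:
        label = LABEL_MAP.get(ent.label_)
        if label:
            feats[label] = feats.get(label, 0) + 1
    return feats

print(generic_ne_features(
    "The guns in the film were supplied by Aldo Uberti Inc., a company in Italy."))
# Expected buckets: ORGANIZATION (Aldo Uberti Inc.) and LOCATION (Italy)
```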
Feature Engineering: Example
Ex. “According to Victoria Alonso, Rocket Raccoon and Groot were created through a mix of motion-capture and rotomation VFX.”
• Features extracted: 18025 (U) + 5 (L) + 4686 (E) columns in total for all train data
• The rest of the features have value 0.
  • entity_actor = 0, award = 0, subj_actor = 0, root_win = 0, ….

create | mix | motion | capture | rotomation | VFX | root_create | supPOS | subj_entity_producer | FOG
0.25 | 0.75 | 0.96 | 0.4 | 0.85 | 0.75 | 1 | 0 | 1 | 3

contradictory | entity_producer | entity_character | underroot_entity_producer | underroot_entity_character
0 | 1 | 1 | 1 | 1
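The example row above is essentially a sparse vector of TF-IDF unigram columns plus a handful of linguistic/entity columns. Below is a minimal sketch of assembling such a row with scikit-learn's TfidfVectorizer; the training sentences, column counts and values are toy placeholders, not the 18025 + 5 + 4686 columns reported above:

```python
# Minimal sketch: TF-IDF unigram columns (fit on training trivia) horizontally
# stacked with the linguistic/entity columns for one candidate sentence.
import numpy as np
from scipy.sparse import csr_matrix, hstack
from sklearn.feature_extraction.text import TfidfVectorizer

train_trivia = ["Tom Cruise did all of his own stunt driving.",
                "The longest animated Disney film since Fantasia."]
vectorizer = TfidfVectorizer(lowercase=True)
vectorizer.fit(train_trivia)

def featurize(sentence, linguistic_entity_cols):
    unigram = vectorizer.transform([sentence])               # (1, |vocab|) TF-IDF
    extra = csr_matrix(np.array([linguistic_entity_cols]))   # (1, #L + #E) columns
    return hstack([unigram, extra]).tocsr()

row = featurize("According to Victoria Alonso, Rocket Raccoon and Groot were created "
                "through a mix of motion-capture and rotomation VFX.",
                [1, 0, 1, 0, 1, 1, 1, 1])  # e.g. supPOS, contradictory, entity_* flags
print(row.shape)
```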
Comparative Approaches
I. Random [Baseline I]:
- 10 sentences picked randomly from Wikipedia
II. CS + Random:
- Candidates Selected (standalone, context-independent sentences)
- i.e., remove sentences like “it really reminds me of my childhood”
- 10 sentences picked randomly from the candidates
III. CS + supPOS (Best) [Baseline II]:
- Candidates Selected
- Ranked by # of superlative words
- For sentences with the same # of superlative words, interesting ones are deliberately ranked first (best case; see the table and sketch below)

supPOS (Best Case)
Rank | # of sup. words | Class
1 | 2 | Interesting
2 | 2 | Boring
3 | 1 | Interesting
4 | 1 | Interesting
5 | 1 | Interesting
6 | 1 | Boring
7 | 1 | Boring
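The “best case” ordering in the table above can be reproduced by sorting on the superlative count and breaking ties in favour of the sentences judged interesting. A tiny illustrative sketch (the sentence ids and labels are placeholders mirroring the table):

```python
# supPOS 'Best Case' baseline: rank by #superlative words, ties resolved in
# favour of interesting sentences (giving the baseline its best possible P@10).
candidates = [  # (sentence_id, #superlative_words, judged_interesting)
    ("s1", 2, True), ("s2", 2, False), ("s3", 1, True), ("s4", 1, True),
    ("s5", 1, True), ("s6", 1, False), ("s7", 1, False),
]
best_case = sorted(candidates, key=lambda c: (-c[1], not c[2]))
for rank, (sid, sup, interesting) in enumerate(best_case, 1):
    print(rank, sid, sup, "Interesting" if interesting else "Boring")
```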
Variants of WTM
I. WTM (U)
- Candidates Selected
- ML Ranking of candidates using only Unigram Features
II. WTM (U+L+E)
- Candidates Selected
- ML Ranking of candidates using all features: Unigram (U) + Linguistic (L) + Entity (E)
Results: P@10
• Metric is Precision at 10 (P@10): out of the top 10 ranked candidates, how many are actually interesting
• CS+Random > Random
  • Shows the significance of Candidate Selection
• WTM (U+L+E) >> WTM (U)
  • Shows the significance of the engineered Linguistic (L) and Entity (E) Features

P@10 by approach (Movie domain):
Random: 0.25 | CS+Random: 0.30 | supPOS (Best Case): 0.34 | WTM (U): 0.34 | WTM (U+L+E): 0.45
Results: Recall@K
• supPOS is limited to one kind of trivia
  • WTM captures varied types
  • 62% recall by rank 25
• Performance Comparison
  • supPOS is better till rank 3
  • Soon after rank 3, WTM beats supPOS (metrics sketched below)
[Figure: %Recall (0 to 70) vs. Rank (0 to 25) for supPOS (Best Case), WTM and Random]
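The two evaluation metrics used here and in the previous section are simple to write out. The snippet below is an illustrative implementation; the ranked label list is toy data, not the actual evaluation set:

```python
# P@k: fraction of the top-k ranked candidates that are actually interesting.
# Recall@k: fraction of all interesting candidates recovered within the top k.
def precision_at_k(ranked_labels, k=10):
    return sum(ranked_labels[:k]) / float(k)

def recall_at_k(ranked_labels, k):
    relevant = sum(ranked_labels)
    return sum(ranked_labels[:k]) / float(relevant) if relevant else 0.0

# ranked_labels: 1 = interesting, 0 = boring, in model-ranked order (toy data)
ranked = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0, 1, 0]
print(precision_at_k(ranked, 10))   # 0.5 for this toy ranking
print(recall_at_k(ranked, 10))      # share of all interesting items within top 10
```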
Sensitivity to Training Size
• Current results are reported with 6163 train trivia
• WTM precision increases with training size
  • Desirable property, as precision can be improved by taking more train data
WTM’s Domain Independence
• Experiment on the Celebrity domain to justify the claim of domain independence.
• Dataset:
  • Crawled trivia for the top 1000 movie celebrities from IMDB and did a 5-fold test
  • Train dataset: 4459 trivia (106 entities)
  • Test dataset: 500 trivia (10 entities)
• Feature bucket suspected of being domain dependent: Entity Features

Unigram (U) Features | Linguistic (L) Features | Entity (E) Features
All words | subj_actor, root_reveal, subj_scene, but, best, FOG_index = 7.2 | entity_producer, entity_director, …
WTM’s Domain Independence (Contd…)
• Entity Features are domain independent too
  • Entity Features are automatically generated using attribute:value pairs in DBpedia
  • When an attribute’s ‘value’ matches in a sentence, the match is replaced by entity_‘attribute’ (see the sketch after the tables below)
• Unigram (U) and Linguistic (L) features are clearly domain independent
[Figure: Sample Trivia (Batman Begins) alongside DBpedia (attribute: value) pairs for Batman Begins]
WTM’s Domain Independence (Contd…)
• Entity Feature Generation from DBpedia

Movie Domain (ex. Batman Begins (2005)):
DBpedia attribute:value | Feature generated
Director: Christopher Nolan | entity_director
Producer: Larry J. Franco | entity_producer

Celebrity Domain (ex. Angelina Jolie):
DBpedia attribute:value | Feature generated
Partner: Brad Pitt | entity_partner
birthplace: California | entity_birthPlace

• Example of Entity Features in the Celebrity Domain

Feature | Entity | Trivia
entity_partner | Johnny Depp | Engaged to Amber Heard [January 17, 2014].**
entity_citizenship | Nicole Kidman | First Australian actress to win the Best Actress Academy Award.
** After Entity Linking, the sentence is parsed as “Engaged to entity_partner”
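A minimal sketch of this entity-linking step, using the slide’s example attribute:value pairs; a real system would fetch the pairs from DBpedia and handle partial or fuzzy matches:

```python
# Sketch of Entity Feature generation via entity linking: occurrences of a
# DBpedia attribute's value in the sentence are replaced with an
# entity_<attribute> token, which then serves as a feature (and feeds the
# subj_/underroot_ variants after parsing).
import re

def link_entities(sentence, attr_value_pairs):
    linked = sentence
    features = set()
    for attr, value in attr_value_pairs.items():
        token = f"entity_{attr}"
        pattern = re.compile(re.escape(value), flags=re.IGNORECASE)
        if pattern.search(linked):
            linked = pattern.sub(token, linked)
            features.add(token)
    return linked, features

pairs = {"partner": "Amber Heard", "birthPlace": "California"}
sentence = "Engaged to Amber Heard [January 17, 2014]."
print(link_entities(sentence, pairs))
# -> ('Engaged to entity_partner [January 17, 2014].', {'entity_partner'})
```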
Feature Contribution (Movie v/s Celeb.)
Movie Domain:
Rank | Feature | Group
1 | subj_scene | Linguistic
2 | subj_entity_cast | Linguistic + Entity
3 | entity_produced_by | Entity
4 | underroot_unlinked_organization | Linguistic + Entity
6 | root_improvise | Linguistic
7 | entity_character | Entity
8 | MONEY | Entity (NER)
14 | stunt | Unigram
16 | supPOS | Linguistic
17 | subj_actor | Linguistic

Celebrity Domain:
Rank | Feature | Group
1 | win | Unigram
3 | magazine | Unigram
4 | supPOS | Linguistic
5 | MONEY | Entity (NER)
6 | entity_alternativenames | Entity
7 | root_engage | Linguistic
14 | subj_earnings | Linguistic
15 | subj_entity_children | Linguistic + Entity
18 | entity_birthplace | Entity
19 | subj_unlinked_location | Linguistic + Entity

• Top features: our advanced features are useful and intuitive for humans too
• Entity Linking leads to better generalization (instead of entity_wolverine, the model gets entity_cast)
Results: P@10 (Celebrity Domain)
P@10 by approach (Celebrity domain):
Random: 0.39 | supPOS (Best Case): 0.54 | WTM (U): 0.58 | WTM (U+L+E): 0.71
• Again, WTM (U+L+E) >> WTM (U)
  • Shows the significance of the advanced (L) and (E) features
• Hence, the features and approach are domain independent
  • For entities of any domain, just replace the train data (sample trivia)
Dissertation Contribution
• Identified, defined and formulated a novel research problem
  • not just providing solutions to an existing problem
• Proposed a domain-independent system, “Wikipedia Trivia Miner (WTM)”
  • Mines the top-k interesting trivia for any given entity based on their interestingness
• Engineered features that capture the ‘about-ness’ of a sentence
  • and generalize which sentences are interesting
• Proposed a mechanism to prepare ground truth for the test set
  • Cost-effective yet statistically significant
Future Work
• New features to increase ranking quality
  • Unusualness: probability of occurrence of the sentence in the considered domain
  • Fact popularity: lesser-known trivia could be more interesting to the majority of people
• Trying deep learning
  • Could be helpful, as in sarcasm detection
• Generating questions from mined trivia
  • To present trivia in question form
• Obtaining personalized interesting trivia
  • In this dissertation, interestingness was judged by majority voting; ranking could instead be based on user demographics