SentElect™: Forecasting Elections based on Sentiments in Social Media

V.S. Subrahmanian
Using the SentElect product to forecast election results in India

  1. SentElect™: Forecasting Elections based on Sentiments in Social Media
     V.S. Subrahmanian, SentiMetrix, Inc. & University of Maryland
     @vssubrah, vs@sentimetrix.com
     March 6 2014
     This work was performed for Sentimetrix, Inc.
     © Sentimetrix, All rights reserved, Sentiment Analysis Symposium March 6 2014
  2. SentElect™ Election Application
     • On May 8 2013, Sentimetrix predicted the outcome of the upcoming Pakistan election in front of 100+ people in V.S. Subrahmanian’s keynote at the Sentiment Analysis Symposium in New York City.
     • On May 9, the BBC said the election was too close to call: “Pakistan Elections: Five Reasons why the vote is unpredictable”.
     • Sentimetrix was correct!
  3. SentElect™
     • Currently tracks Twitter feeds on virtually any topic
       – Politicians
       – Political parties
       – Issues (in progress, expected completion April 2014)
     • Identifies intensity of sentiment on each topic in each tweet.
     • Forecasts trends in terms of expected number of supporters/opponents on Twitter.
     • Identifies individuals who are most influential in shaping an opinion/trend.
     • Provides a single dashboard to cover all of this.
  4. SentElect™: Functionalities and Business Use
     • Functionality: Identify sentiment and changes in sentiment on any given topic.
       – Business use: Track sentiment on both your political campaign and your competitor’s.
     • Functionality: Learn a model on “big data” showing how support/opposition to a topic spreads.
       – Business use: Understand how your campaign (and your opponent’s) is doing with voters and why.
     • Functionality: Forecast the expected number of people who will support/oppose a topic.
       – Business use: Forecast how many people support/oppose your campaign and/or your opponent’s.
     • Functionality: Identify the most important individuals responsible for shaping/spreading opinion on a topic.
       – Business use: Identify those shaping positive/negative opinion about you and see if you can get them to work on your behalf. Engage with influential Twitter users.
  5. SentElect™ Case Study
     • Upcoming Indian election.
     • Identified 31 entities to track.
     • Learned diffusion models from July 15 2013 to Jan 25 2014.
     • Tested models on Jan 25 to Feb 20 2014 data (~26 days).
     • Forecast trends on all 31 entities from Feb 20 2014 to May 15 2014.
     • On the Jan 25 to Feb 20 2014 test data, diffusion forecasts had Pearson correlation coefficients consistently over 0.8, usually over 0.9.
     SUMMARY STATISTICS
     • Study reported here uses data from July 2013 to Feb 20 2014.
     • Forecasts made till May 15 2014.
     • 19.5M tweets studied in all.
     • 16M distinct Twitter accounts.
     • 40M edge network.
     Twitter collection done using Twitter ontology and semantic database developed by Rensselaer Polytechnic Institute. [@jahendler]
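The validation step on this slide compares forecast series against held-out observations using the Pearson correlation coefficient. The following is a minimal sketch of that comparison, not SentiMetrix's actual code; the function name `pearson` and the two short series are hypothetical and chosen only for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical daily supporter counts (NOT data from the study):
# what a model forecast for the test window vs. what was observed.
forecast = [100, 110, 125, 130, 150]
observed = [98, 113, 120, 135, 149]

r = pearson(forecast, observed)
print(f"PCC between forecast and observed: {r:.3f}")
```

A value near 1 means the forecast tracks the shape of the observed trend; it does not by itself show that the absolute counts match.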
  6. BJP Forecast
     [Chart; date labels: July 15 2013, Jan 24 2014, Feb 20 2014, May 15 2014]
     OUTLOOK
     • BJP supporters exceed opponents.
     • Positives increasing faster than negatives.
     • Large number of supporters.
     • Outlook is very good.
  7. Narendra Modi Forecast
     [Chart; date labels: July 15 2013, Jan 24 2014, Feb 20 2014, May 15 2014]
     OUTLOOK
     • Modi supporters exceed opponents.
     • Positives increasing faster than negatives.
     • Outlook is very good.
  8. UPA Forecast
     [Chart; date labels: July 15 2013, Jan 24 2014, Feb 20 2014, May 15 2014]
     OUTLOOK
     • UPA opponents outnumber supporters.
     • But catching up.
     • Raw numbers much smaller than for BJP.
     • Outlook not good.
  9. Congress Party Forecast
     [Chart; date labels: July 15 2013, Jan 24 2014, Feb 20 2014, May 15 2014]
     OUTLOOK
     • Interestingly, sentiment on Congress is more positive.
     • But very muted in terms of numbers.
     • Outlook is not good.
  10. Rahul Gandhi Forecast
      [Chart; date labels: July 15 2013, Jan 24 2014, Feb 20 2014, May 15 2014]
      OUTLOOK
      • Overall, sentiment on Rahul is positive.
      • Positives outweigh negatives and are growing.
      • But negatives are much higher than Modi’s.
  11. Arvind Kejriwal Forecast
      [Chart; date labels: July 15 2013, Jan 24 2014, Feb 20 2014, May 15 2014]
      OUTLOOK
      • Positives and negatives about even as of Feb 20.
      • But trend shows increasing doubts about Mr. Kejriwal as election time draws near.
  12. SentElect Summary Statistics

                                BJP              Narendra Modi    UPA              Congress Party   Rahul Gandhi     Arvind Kejriwal
      #Supporters, Feb 20 2014  193031           68320            42482            7082             66399            31626
      #Opponents, Feb 20 2014   135077           26868            47893            4177             39641            19964
      #Supporters, May 15 2014  273119           95006            52736            9592             74773            96931
      #Opponents, May 15 2014   191171           40466            54189            5060             40389            213784
      Accuracy (PCC*), Pos.     0.985            0.83             0.986            0.900            0.936            0.983
      Accuracy (PCC*), Neg.     0.984            0.957            0.984            0.931            0.911            0.966

      * Pearson Correlation Coefficient
  13. Head to Head: BJP vs. UPA/Congress
      • Feb 20 2014:
        – BJP shows almost 4 times as many supporters as Congress/UPA supporters.
        – BJP opponents are less than 3 times as many as Congress/UPA opponents.
        – So BJP is doing well.
      • Forecast for May 15 2014:
        – BJP will maintain about 1.5x supporters as compared to opponents.
        – Congress/UPA has slightly more opponents than supporters.
      • BJP’s outlook in terms of positives and negatives shows a combined growth.
      • But UPA/Congress combined negatives exceed positives.
      • And support for UPA/Congress is tepid, raising the question of Congress/UPA supporters showing up to vote.
      • In general, till May 15 2014, BJP seems to garner more support than Congress/UPA.
      [Bar chart: Support and Opposition for UPA/Congress and BJP on 2/20 and 5/15; axis from 0 to 400000]
  14. Head to Head: Narendra Modi vs. Rahul Gandhi
      • Feb 20 2014:
        – Mr. Gandhi and Mr. Modi are about equal in “likes” as of Feb 20 2014, with Mr. Modi having a small [insignificant] lead.
        – But Mr. Gandhi has 1.5x as many opponents in comparison to Mr. Modi.
      • May 15 2014:
        – In terms of supporters, Mr. Modi is pulling ahead of Mr. Gandhi with 1.3x supporters compared with Mr. Gandhi.
        – On opponents, we expect them to be even.
      • Mr. Modi is likely to pull away from Mr. Gandhi by May 15 2014.
      [Bar chart: Support and Opposition for Gandhi and Modi on 2/20 and 5/15; axis ticks 0, 50000, 100000, 150000]
  15. Head to Head: Rahul Gandhi vs. Arvind Kejriwal
      • Feb 20 2014:
        – Mr. Gandhi has 2x supporters w.r.t. Mr. Kejriwal
        – But he also has 2x opponents w.r.t. Mr. Kejriwal
      • May 15 2014:
        – Mr. Kejriwal will have 1.3x supporters w.r.t. Mr. Gandhi [an about-turn!]
        – Mr. Kejriwal will have 5x opponents w.r.t. Mr. Gandhi.
      • In short, though supporters for Mr. Kejriwal will grow, opponents will increase in number faster.
      • Congress/UPA should outperform AAP/Mr. Kejriwal.
      [Bar chart: Support and Opposition for Gandhi and Kejriwal on 2/20 and 5/15; axis ticks 0, 200000, 400000]
  16. Head to Head: Narendra Modi vs. Arvind Kejriwal
      • Feb 20 2014:
        – Mr. Modi has 2x as many supporters as Mr. Kejriwal
        – But also has about 1.4x as many opponents as Mr. Kejriwal
      • May 15 2014:
        – Mr. Modi and Mr. Kejriwal will have about the same number of supporters
        – Mr. Kejriwal will have about 5x the number of opponents as Mr. Modi
      • Though support for Mr. Kejriwal is growing, opposition is growing at a much faster rate.
      • We expect BJP to handily outperform AAP/Mr. Kejriwal.
      [Bar chart: Support and Opposition for Modi and Kejriwal on 2/20 and 5/15; axis ticks 0, 200000, 400000]
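The multiples quoted in the head-to-head slides can be recomputed from the supporter and opponent counts in the summary statistics on slide 12. The sketch below copies those counts for the four individuals and parties compared one-to-one; the dictionary layout and the helper names `multiple` and `support_to_opposition` are assumptions made for this illustration, not part of SentElect.

```python
# Counts copied from the "SentElect Summary Statistics" slide.
# Keys: sup/opp = supporters/opponents; feb = Feb 20 2014 (observed),
# may = May 15 2014 (forecast).
STATS = {
    "BJP": {"sup_feb": 193031, "opp_feb": 135077,
            "sup_may": 273119, "opp_may": 191171},
    "Narendra Modi": {"sup_feb": 68320, "opp_feb": 26868,
                      "sup_may": 95006, "opp_may": 40466},
    "Rahul Gandhi": {"sup_feb": 66399, "opp_feb": 39641,
                     "sup_may": 74773, "opp_may": 40389},
    "Arvind Kejriwal": {"sup_feb": 31626, "opp_feb": 19964,
                        "sup_may": 96931, "opp_may": 213784},
}


def multiple(entity_a, entity_b, key):
    """How many times larger entity_a's count is than entity_b's."""
    return STATS[entity_a][key] / STATS[entity_b][key]


def support_to_opposition(entity, when):
    """Ratio of supporters to opponents for one entity at 'feb' or 'may'."""
    row = STATS[entity]
    return row[f"sup_{when}"] / row[f"opp_{when}"]


for a, b in [("Rahul Gandhi", "Arvind Kejriwal"),
             ("Narendra Modi", "Arvind Kejriwal"),
             ("Narendra Modi", "Rahul Gandhi")]:
    print(f"{a} vs. {b}")
    for key in ("sup_feb", "opp_feb", "sup_may", "opp_may"):
        print(f"  {key}: {multiple(a, b, key):.2f}x")

print(f"BJP supporters per opponent, May 15: "
      f"{support_to_opposition('BJP', 'may'):.2f}")
```

Ratios below 1 mean the second entity leads; for example, a Gandhi-to-Kejriwal opponent ratio of roughly 0.19 on May 15 corresponds to the slide's "5x opponents" for Mr. Kejriwal.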
  17. SentElect™: Identifying Key Influencers
      [Screenshot callout: Selected topic(s)]
  18. SentElect™: Identifying Key Influencers
      [Screenshot callout: Constraints on identifying influential users]
  19. SentElect™: Identifying Key Influencers
      [Screenshot callout: List of most influential users on the selected topic; note that number of followers is not adequate]
  20. SentElect™: User Profile
      [Screenshot callout: Distribution of topics discussed]
  21. SentElect™: User Profile
      [Screenshot callouts: Tabs allow user to see other tweets; List of tweets on selected topics]
  22. SentElect™: Sentiment Profile
      [Screenshot callout: Average sentiment score on selected topics ranges from -1 (max negative) to +1 (max positive)]
  23. SentElect™: Sentiment Profile
      [Screenshot callout: Volume of tweets on selected topic]
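The two Sentiment Profile views report, per topic, an average sentiment score in the range -1 to +1 and a tweet volume. A minimal sketch of how such per-topic aggregates could be computed is shown here; it assumes each tweet has already been given a (topic, score) pair by some sentiment scorer, and the function name, topic labels, and scores are all hypothetical.

```python
from collections import defaultdict


def sentiment_profile(scored_tweets):
    """Return {topic: (average_score, tweet_volume)}.

    scored_tweets is an iterable of (topic, score) pairs, where score
    lies in [-1.0, +1.0]: -1 is maximally negative, +1 maximally positive.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for topic, score in scored_tweets:
        if not -1.0 <= score <= 1.0:
            raise ValueError(f"score out of range for {topic!r}: {score}")
        totals[topic] += score
        counts[topic] += 1
    return {topic: (totals[topic] / counts[topic], counts[topic])
            for topic in totals}


# Hypothetical scored tweets, for illustration only.
SAMPLE = [
    ("topic_a", 0.5), ("topic_a", 1.0), ("topic_a", -0.5), ("topic_a", 0.0),
    ("topic_b", -1.0), ("topic_b", 0.5),
]

for topic, (average, volume) in sorted(sentiment_profile(SAMPLE).items()):
    print(f"{topic}: average sentiment {average:+.2f} over {volume} tweets")
```

Reporting volume next to the average matters because, as the Congress Party forecast notes, sentiment can be positive while the number of tweets behind it is small.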
  24. Forecast Summary
      • Forecast #1: Narendra Modi will be India’s next Prime Minister.
      • Forecast #2: BJP (by itself) will fall short of a majority in Parliament, securing fewer than 272 seats.
      • Forecast #3: Next Indian government will be a BJP-led coalition.
  25. Forecast Risks
      • Our forecast can go wrong.
        – Risk #1: Forecasting based on unsupervised learning is difficult at best. No training data connecting votes on the ground in India to number of supporters/opponents on Twitter. Selection bias.
        – Risk #2: Forecast is based on publicly available Twitter data, not on entire Twitter fire-hose.
        – Risk #3: Twitter-based and technology-based risks: geo-location issues, bots/sybils/fake accounts.
        – Risk #4: Changing situation on the ground with new allegations (e.g. corruption) emerging frequently.
        – Risk #5: External events we can’t control for (e.g. terrorist attacks) can dramatically change the electoral landscape.
      • Sentimetrix will update its forecasts approximately once every 2-3 weeks on www.sentimetrix.com. Next scheduled update: March 27 2014.
  26. One Sybil’s strategy: @IsabellaObregom
      1. Take tweet from a reputable account:
         – @AapKaJawab, an Aam Aadmi Party enthusiast, retweets: “Arvind Kejriwal breaks into Manna Dey song on brotherhood at swearing-in – http://t.co/bVCHPte60k”
      2. Follow link, rewrap in new shortened URL:
         – @AapKaJawab’s link leads to an Indian news article
         – @IsabellaObregom shrinks URL with Adf.ly, tweets: “Arvind Kejriwal breaks into Manna Dey song on brotherhood at swearing-in http://t.co/81cq9eyrNh”
      3. @IsabellaObregom now paid per click through Adf.ly!
      (In early 2014, Adf.ly and Twitter suspended the account; the original owner tweeted only in Spanish.)
  27. A larger Sybil network in our dataset
      • We found many Sybil/bot accounts.
      • @Marie____Taylor and @Amy____Jones tweet identically, except for different shortened links.
        – Overlapping network of followers
        – 100K+ tweets
        – Many “smaller” inactive followers, each following 30-40 random people, with 30-40 bot followers.
        – Related: @Lea___Smith, @Megan__Martinez, etc.
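The pattern described on the last two slides (accounts posting the same text while swapping in a different shortened link) suggests a simple screening heuristic. The sketch below is one possible way to surface such candidates and is not the method SentiMetrix used: it groups tweets by their text with URLs and punctuation removed and flags groups where several accounts used different links. The function names are hypothetical, the first two sample tweets are the ones quoted on slide 26, and the third account and tweet are invented.

```python
import re
from collections import defaultdict

URL_RE = re.compile(r"https?://\S+")


def normalize(text):
    """Strip URLs and punctuation, lowercase, and collapse whitespace."""
    without_urls = URL_RE.sub("", text)
    without_punct = re.sub(r"[^\w\s]", " ", without_urls)
    return " ".join(without_punct.lower().split())


def find_copycat_groups(tweets):
    """Map normalized text -> sorted accounts that posted it with differing links.

    tweets is an iterable of (account, text) pairs. A group is flagged only
    when more than one account posted the text AND the link sets differ.
    """
    groups = defaultdict(list)
    for account, text in tweets:
        links = tuple(URL_RE.findall(text))
        groups[normalize(text)].append((account, links))

    flagged = {}
    for key, members in groups.items():
        accounts = {account for account, _ in members}
        link_sets = {links for _, links in members}
        if len(accounts) > 1 and len(link_sets) > 1:
            flagged[key] = sorted(accounts)
    return flagged


SAMPLE = [
    ("@AapKaJawab",
     "Arvind Kejriwal breaks into Manna Dey song on brotherhood "
     "at swearing-in – http://t.co/bVCHPte60k"),
    ("@IsabellaObregom",
     "Arvind Kejriwal breaks into Manna Dey song on brotherhood "
     "at swearing-in http://t.co/81cq9eyrNh"),
    ("@hypothetical_user", "Looking forward to election day"),  # invented
]

for text, accounts in find_copycat_groups(SAMPLE).items():
    print(accounts, "->", text)
```

A flagged group contains the original poster as well as the copier, so this only produces candidates; deciding which account is the Sybil still needs other signals such as tweet timestamps, link destinations, or follower overlap.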
  29. SentiMetrix Contact Information
      • Address: 6017 Southport Drive, Bethesda MD 20814, USA
      • E-mail: info@sentimetrix.com
      • www.sentimetrix.com
      • Telephone: +1 240 479 9286
      • V.S. Subrahmanian
        – Twitter: @vssubrah
        – Email: vs@sentimetrix.com
        – www.cs.umd.edu/~vs/
        – Telephone: +1 301 405 6724
