Updated Indian elections forecast

SentiMetrix updated its forecast for the Indian election, first made on March 6, 2014 at the Sentiment Analysis Symposium in New York. This update is based on data collected after SAS14.


  1. SentElect™: Forecasting Elections Based on Sentiments in Social Media
     V.S. Subrahmanian
     SentiMetrix, Inc. & University of Maryland
     @vssubrah, vs@sentimetrix.com
     Apr 19 2014
     © Sentimetrix, Inc. All rights reserved 2014.
     This work was performed for Sentimetrix, Inc.
  2. SentElect™ Election Application
     On May 8 2013, Sentimetrix predicted the outcome of the upcoming Pakistan election in front of 100+ people in V.S. Subrahmanian’s keynote at the Sentiment Analysis Symposium in New York City.
     On May 9, the BBC said the election was too close to call: “Pakistan Elections: Five Reasons why the vote is unpredictable”.
     Sentimetrix was correct!
  3. SentElect™
     • Currently tracks Twitter feeds on virtually any topic
       – Politicians
       – Political parties
       – Issues (in progress, expected completion April 2014)
     • Identifies intensity of sentiment on each topic in each tweet.
     • Forecasts trends in terms of expected number of supporters/opponents on Twitter.
     • Identifies individuals who are most influential in shaping an opinion/trend.
     • Provides a single dashboard to cover all of this.
  4. SentElect™: Functionalities and Business Use
     • Functionality: Identify sentiment and changes in sentiment on any given topic.
       Business use: Track sentiment on both your political campaign and your competitor’s.
     • Functionality: Learn a model on “big data” showing how support/opposition to a topic spreads.
       Business use: Understand how your campaign (and your opponent’s) is doing with voters and why.
     • Functionality: Forecast the expected number of people who will support/oppose a topic.
       Business use: Forecast how many people support/oppose your campaign and/or your opponent’s.
     • Functionality: Identify the most important individuals responsible for shaping/spreading opinion on a topic.
       Business use: Identify those shaping positive/negative opinion about you and see if you can get them to work on your behalf. Engage with influential Twitter users.
  5. SentElect™ Case Study
     • Upcoming Indian election
     • Identified 31 entities to track.
     • Learned diffusion models from July 15 2013 – Jan 25 2014.
     • Tested models on Jan 25 – Feb 20 2014 data (~26 days).
     • Forecast trends on all 31 entities from Feb 20 2014 to May 15 2014.
     • Tested diffusion forecasts on Jan 25 – Feb 20 2014 data, with Pearson correlation coefficients consistently over 0.8, usually over 0.9.
     SUMMARY STATISTICS
     • Study reported here uses data from July 2013 to Feb 20 2014.
     • Forecasts made till May 15 2014.
     • 19.5M tweets studied in all
     • 16M distinct Twitter accounts
     • 40M edge network
     Twitter collection done using Twitter ontology and semantic database developed by Rensselaer Polytechnic Institute. [@jahendler]
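The case study scores its forecasts on held-out data with the Pearson correlation coefficient. The slides do not show how this was computed, so the following is only a minimal sketch of the standard formula; the function name `pearson` and the sample counts are illustrative assumptions, not SentiMetrix data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and of squared deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical supporter counts over a held-out window (made-up numbers).
observed = [1000, 1040, 1090, 1150, 1230]
forecast = [1005, 1035, 1100, 1140, 1225]
pcc = pearson(observed, forecast)
print(f"PCC = {pcc:.3f}")
```

A value close to 1 means the forecast tracks the shape of the observed series. It does not by itself show that the absolute counts match, since a forecast that is off by a constant factor still correlates perfectly.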
  6. BJP Forecast
     [Chart; dates marked: July 15 2013, Feb 24 2014, Mar 24 2014, May 15 2014]
     OUTLOOK
     • Positive support for BJP is growing at a faster rate than negatives.
     • Outlook is good but more or less the same as the March 6 forecast.
  7. Narendra Modi Forecast
     [Chart; dates marked: July 15 2013, Feb 24 2014, Mar 24 2014, May 15 2014]
     OUTLOOK
     • Positive support for Modi is growing at a much faster rate than negatives.
     • Outlook is very good and has improved since our March 6 forecast.
  8. UPA Forecast
     [Chart; dates marked: July 15 2013, Feb 24 2014, Mar 24 2014, May 15 2014]
     OUTLOOK
     • Opposition to UPA exceeds support. It is also growing at a slightly faster rate.
     • Outlook for the UPA is not good and has worsened slightly since the March 6 forecast.
     • The number of people tweeting about UPA is far smaller.
  9. Congress Party Forecast
     [Chart; dates marked: July 15 2013, Feb 24 2014, Mar 24 2014, May 15 2014]
     OUTLOOK
     • Congress has more supporters than opponents.
     • Growth in support is larger than growth in opposition.
     • But the number of supporters is small compared to BJP.
  10. Rahul Gandhi Forecast
      [Chart; dates marked: July 15 2013, Feb 24 2014, Mar 24 2014, May 15 2014]
      OUTLOOK
      • Sentiment on Rahul Gandhi is strong, and growth in supporters outweighs growth in opponents.
      • But in raw numbers, he has 1/3 the supporters that Modi has.
      • Outlook is good but not great.
  11. Arvind Kejriwal Forecast
      [Chart; dates marked: July 15 2013, Feb 24 2014, Mar 24 2014, May 15 2014]
      OUTLOOK
      • Kejriwal will have more opponents than supporters by early May.
      • Steep increase in both supporters and opponents around mid-December 2013.
  12. SentElect Summary Statistics

                                 BJP     Narendra Modi  UPA    Congress Party  Rahul Gandhi  Arvind Kejriwal
      #Supporters, Mar 24 2014   294848  96376          59880  9324            102541        54777
      #Opponents, Mar 24 2014    211002  43217          71514  5839            59958         42367
      #Supporters, May 15 2014   385819  102669         68926  11289           147989        64371
      #Opponents, May 15 2014    257902  48002          81436  7948            65820         71717
      Accuracy (PCC*), Pos.      0.999   0.998          0.998  0.977           0.995         0.979
      Accuracy (PCC*), Neg.      0.988   0.998          0.998  0.970           0.996         0.971

      * PCC: Pearson Correlation Coefficient
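The outlook slides compare how fast support and opposition grow. One way to read the summary statistics is as percentage growth between the Mar 24 and May 15 figures. The sketch below uses the counts from the table; the helper names `growth` and `summarize` are illustrative assumptions. Percentage growth between two dates is only one possible reading of "growing faster", and the slides may instead refer to absolute counts or to the full time series.

```python
# (supporters, opponents) per entity, copied from the summary statistics slide.
MAR_24 = {
    "BJP": (294848, 211002),
    "Narendra Modi": (96376, 43217),
    "UPA": (59880, 71514),
    "Congress Party": (9324, 5839),
    "Rahul Gandhi": (102541, 59958),
    "Arvind Kejriwal": (54777, 42367),
}
MAY_15 = {
    "BJP": (385819, 257902),
    "Narendra Modi": (102669, 48002),
    "UPA": (68926, 81436),
    "Congress Party": (11289, 7948),
    "Rahul Gandhi": (147989, 65820),
    "Arvind Kejriwal": (64371, 71717),
}


def growth(before, after):
    """Relative change from `before` to `after` (0.25 means +25%)."""
    return (after - before) / before


def summarize(entity):
    """Growth in support and opposition, plus forecast net support on May 15."""
    s0, o0 = MAR_24[entity]
    s1, o1 = MAY_15[entity]
    return {
        "support_growth": growth(s0, s1),
        "opposition_growth": growth(o0, o1),
        "net_support_may_15": s1 - o1,
    }


for name in MAR_24:
    row = summarize(name)
    print(f"{name:16s} support {row['support_growth']:+.1%}  "
          f"opposition {row['opposition_growth']:+.1%}  "
          f"net {row['net_support_may_15']:+d}")
```

A negative net figure marks an entity forecast to have more opponents than supporters on May 15.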
  13. Head to Head: BJP vs. UPA/Congress
      • Mar 24 2014:
        – BJP shows almost 5 times as many supporters as Congress/UPA, up in ratio from a month back.
        – BJP opponents are less than 3 times as many as Congress/UPA opponents.
        – So BJP is doing well.
      • Forecast for May 15 2014:
        – BJP will have almost 3x supporters as compared to opponents.
        – Congress/UPA has about 10% more opponents than supporters.
      • BJP’s outlook in terms of positives and negatives shows a combined growth.
      • But UPA/Congress combined negatives exceed positives.
      • And support for UPA/Congress is tepid, raising the question of whether Congress/UPA supporters will show up to vote.
      • In general, till May 15 2014, BJP seems to garner more support than Congress/UPA.
      [Bar chart: Support and Opposition for BJP and UPA/Congress on 3/24 and 5/15]
  14. Head to Head: Narendra Modi vs. Rahul Gandhi
      • Mar 24 2014:
        – Mr. Gandhi has about 5% more supporters than Mr. Modi.
        – But Mr. Gandhi has 1.4x as many opponents as Mr. Modi.
      • May 15 2014:
        – In terms of supporters, Mr. Gandhi is pulling ahead of Mr. Modi, with 1.5x the supporters of Mr. Modi.
        – But on opponents, Mr. Gandhi has 1.3x the opponents Mr. Modi has.
      • This reverses a trend seen in our Mar 6 2014 forecast.
      • Head-to-head, Mr. Gandhi has improved his showing between Feb 20 and Mar 24.
      [Bar chart: Support and Opposition for Modi and Gandhi on 3/24 and 5/15]
  15. Head to Head: Rahul Gandhi vs. Arvind Kejriwal
      • Mar 24 2014:
        – Mr. Gandhi has 2x the supporters of Mr. Kejriwal.
        – But he has 1.4x the opponents of Mr. Kejriwal (down from 2x in our Feb 6 forecast).
      • May 15 2014:
        – Mr. Gandhi will have over 2x the supporters that Mr. Kejriwal has [an about-turn from our Mar 6 forecast!].
        – Mr. Kejriwal will have 1.2x the opponents of Mr. Gandhi, a significant reduction of the ratio from last month.
      • In short, Mr. Gandhi has made an about-turn in the race in terms of positives.
      • Congress/UPA should outperform AAP/Mr. Kejriwal.
      [Bar chart: Support and Opposition for Kejriwal and Gandhi on 3/24 and 5/15]
  16. Head to Head: Narendra Modi vs. Arvind Kejriwal
      • Mar 24 2014:
        – Mr. Modi has 1.9x as many supporters as Mr. Kejriwal.
        – But on opponents, he is more or less even with Mr. Kejriwal (a sharp reduction from our Mar 6 talk).
      • May 15 2014:
        – Mr. Modi will have about 1.6x the supporters of Mr. Kejriwal.
        – Mr. Kejriwal will have about 1.5x as many opponents as Mr. Modi.
      • Overall, the situation in the Modi vs. Kejriwal race has not changed much.
      • Though support for Mr. Kejriwal is growing, opposition is growing at a much faster rate.
      • We expect BJP to handily outperform AAP/Mr. Kejriwal.
      [Bar chart: Support and Opposition for Kejriwal and Modi on 3/24 and 5/15]
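The head-to-head multiples quoted on these slides can be rechecked against the summary statistics. The following is a minimal sketch: the counts come from the summary table, while the `COUNTS` layout and the `head_to_head` helper are illustrative assumptions. The slides quote rounded figures, so small differences from the computed ratios are to be expected.

```python
# (supporters, opponents) for the three leaders, from the summary statistics slide.
COUNTS = {
    "Mar 24 2014": {
        "Modi": (96376, 43217),
        "Gandhi": (102541, 59958),
        "Kejriwal": (54777, 42367),
    },
    "May 15 2014": {
        "Modi": (102669, 48002),
        "Gandhi": (147989, 65820),
        "Kejriwal": (64371, 71717),
    },
}


def head_to_head(a, b, date):
    """Return (supporter ratio, opponent ratio) of entity `a` relative to `b`."""
    supporters_a, opponents_a = COUNTS[date][a]
    supporters_b, opponents_b = COUNTS[date][b]
    return supporters_a / supporters_b, opponents_a / opponents_b


for date in COUNTS:
    for a, b in [("Gandhi", "Modi"), ("Gandhi", "Kejriwal"), ("Modi", "Kejriwal")]:
        sup, opp = head_to_head(a, b, date)
        print(f"{date}: {a} vs. {b}: {sup:.2f}x supporters, {opp:.2f}x opponents")
```

A ratio above 1 means the first-named leader has more of that group than the second. An opponent ratio below 1 favors the first-named leader.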
  17. Forecast Summary
      • Forecast #1: Narendra Modi will be India’s next Prime Minister.
      • Forecast #2: BJP (by itself) will fall short of a majority in Parliament, securing fewer than 272 seats.
      • Forecast #3: The next Indian government will be a BJP-led coalition.
  18. Forecast Risks
      • Our forecast can go wrong.
        – Risk #1: Forecasting based on unsupervised learning is difficult at best. No training data connecting votes on the ground in India to number of supporters/opponents on Twitter. Selection bias.
        – Risk #2: Forecast is based on publicly available Twitter data, not on the entire Twitter fire-hose.
        – Risk #3: Twitter-based and technology-based risks: geo-location issues, bots/sybils/fake accounts.
        – Risk #4: Changing situation on the ground, with new allegations (e.g. corruption) emerging frequently.
        – Risk #5: External events we can’t control for (e.g. terrorist attacks) can dramatically change the electoral landscape.
  19. One Sybil’s strategy: @IsabellaObregom
      1. Take a tweet from a reputable account:
         – @AapKaJawab, an Aam Aadmi Party enthusiast, retweets: “Arvind Kejriwal breaks into Manna Dey song on brotherhood at swearing-in – http://t.co/bVCHPte60k”
      2. Follow the link and rewrap it in a new shortened URL:
         – @AapKaJawab’s link leads to an Indian news article.
         – @IsabellaObregom shrinks the URL with Adf.ly and tweets: “Arvind Kejriwal breaks into Manna Dey song on brotherhood at swearing-in http://t.co/81cq9eyrNh”
      3. @IsabellaObregom is now paid per click through Adf.ly!
      (In early 2014, Adf.ly and Twitter suspended the account. The original owner tweeted only in Spanish.)
  20. A larger Sybil network in our dataset
      • We found many Sybil/bot accounts.
      • @Marie____Taylor and @Amy____Jones tweet identically, except for different shortened links.
        – Overlapping network of followers
        – 100K+ tweets
        – Many “smaller” inactive followers, each following 30-40 random people, with 30-40 bot followers.
        – Related: @Lea___Smith, @Megan__Martinez, etc.
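The pattern described here, accounts posting identical wording with only the shortened link changed, suggests a simple screening heuristic: strip URLs, normalize the remaining text, and group accounts by what is left. The slides do not say how the Sybil accounts were actually found, so the sketch below is an assumption about one possible approach. The names `normalize` and `find_duplicate_clusters` are hypothetical, and the third sample tweet is made up.

```python
import re
from collections import defaultdict

URL_RE = re.compile(r"https?://\S+")
NON_ALNUM_RE = re.compile(r"[^a-z0-9]+")


def normalize(text):
    """Drop URLs, lowercase, and reduce punctuation/whitespace to single spaces."""
    without_urls = URL_RE.sub(" ", text)
    return NON_ALNUM_RE.sub(" ", without_urls.lower()).strip()


def find_duplicate_clusters(tweets, min_accounts=2):
    """Group accounts that posted the same wording, ignoring embedded links.

    `tweets` is an iterable of (account, text) pairs. Returns a dict mapping
    normalized text to the sorted list of accounts that posted it.
    """
    accounts_by_text = defaultdict(set)
    for account, text in tweets:
        key = normalize(text)
        if key:  # skip tweets that consist only of links or punctuation
            accounts_by_text[key].add(account)
    return {
        key: sorted(accounts)
        for key, accounts in accounts_by_text.items()
        if len(accounts) >= min_accounts
    }


sample = [
    ("@AapKaJawab", "Arvind Kejriwal breaks into Manna Dey song on brotherhood "
                    "at swearing-in \u2013 http://t.co/bVCHPte60k"),
    ("@IsabellaObregom", "Arvind Kejriwal breaks into Manna Dey song on "
                         "brotherhood at swearing-in http://t.co/81cq9eyrNh"),
    ("@some_other_user", "Unrelated tweet about the weather"),  # made-up example
]
clusters = find_duplicate_clusters(sample)
for text, accounts in clusters.items():
    print(accounts, "->", text)
```

A cluster is only a candidate for review, not proof of a Sybil: the reputable source account lands in the same group as its copier, and ordinary retweets would match too. Follower overlap and link targets, as noted on the slide, would be needed to separate the two.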
  22. SentiMetrix Contact Information
      SentiMetrix
      • Address: 6017 Southport Drive, Bethesda, MD 20814, USA
      • E-mail: info@sentimetrix.com
      • Web: www.sentimetrix.com
      • Telephone: +1 240 479 9286
      V.S. Subrahmanian
      • Twitter: @vssubrah
      • E-mail: vs@sentimetrix.com
      • Web: www.cs.umd.edu/~vs/
      • Telephone: +1 301 405 6724
