Political Opinion Mining, Sentiment Analysis and Technology



Slide Deck Focus

Sentiment analysis through Facebook and Twitter leveraging


This slide deck was a product of developing a sentiment and text analytics engine. We leveraged Facebook Connect, the Twitter Firehose and web scraping to gather text, which we stored in both MongoDB and Hadoop. Once it was stored, we used Mahout and Solr for text search and analytics to determine trends within the data. Although our dataset was not large enough to need it, we used Greenplum as a test MPP database to tie all three of those technologies into one dashboard using Pentaho.
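As a rough illustration of what "determining trends" over the gathered text can mean, the sketch below counts hashtag mentions per day. It is a plain-Python stand-in, not the Mahout/Solr pipeline described above; the `POSTS` records, their dates, and the function names `daily_hashtag_counts` and `trend` are assumptions made for the example (two of the sample texts are borrowed from the deck's slide 10).

```python
import re
from collections import Counter, defaultdict

# Hypothetical sample records. In the described pipeline the text would come
# from the stored Facebook/Twitter/web data, not from an in-memory list.
POSTS = [
    {"day": "2011-01-03", "text": "#Romney speech was horrible"},
    {"day": "2011-01-03", "text": "Watching the #romney and #palin coverage"},
    {"day": "2011-01-04", "text": "#healthcare debate again, #Romney responds"},
    {"day": "2011-01-04", "text": "Wow I love cookies in the morning"},
]

HASHTAG = re.compile(r"#(\w+)")


def daily_hashtag_counts(posts):
    """Return {day: Counter({hashtag: mentions})} with lowercased hashtags."""
    counts = defaultdict(Counter)
    for post in posts:
        for tag in HASHTAG.findall(post["text"]):
            counts[post["day"]][tag.lower()] += 1
    return dict(counts)


def trend(counts, tag):
    """Return the (day, mentions) series for one hashtag, ordered by day."""
    return [(day, counts[day][tag]) for day in sorted(counts)]


counts = daily_hashtag_counts(POSTS)
romney_trend = trend(counts, "romney")
print(romney_trend)
```

A series like this per candidate is the kind of input a dashboard could plot over time; at real Firehose volume the same counting would be pushed into the storage and search layer rather than done in memory.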


Political Opinion Mining, Sentiment Analysis and Technology

  1. Advanced Political Analysis through “Big Data”
  2. Z DATA’S AGILE ANALYSIS – THE “BIG DATA STACK”
     • How do we leverage the “Big Data” stack?
       – Technology
         • Don’t back your problem into available technologies; leave your toolset open.
         • Organically grow new skillsets and hire the right individuals.
       – Development
         • Be agile in your approach.
         • Comparative analysis using both new mathematical methods and open source technologies.
       – Embrace the shift into a data-driven world
         • Empower your Engineering and Science team to be creative.
         • Let the data lead your direction.
         • Use new data types previously unavailable to drive insights.
     “Associating structured and unstructured data at relevant points is where the most value is gained and where the highest level of challenge is presented.” – Ryan Abo, PhD – Z Data Inc.
  3. ANALYZING THE POLITICAL LANDSCAPE
     Phase 1
       • Location based Google Search and Twitter mentions
       • Word pair mentions
     Phase 2
       • Facebook and Twitter Sentiment and Geospatial Analysis
  4. UNSTRUCTURED AND STRUCTURED DATA COMPLEMENTING YOUR TECHNOLOGIES
     Structured Data
       • Standard Data Warehouse – finance, sales
       • GeoSpatial – locations, places
       • Technologies – Greenplum, Netezza, Teradata
     Unstructured Data
       • Textual Objects – social media, blogs, forums
       • Bitmap Objects – images, video, audio
       • Technologies – Hadoop, Cassandra, Solr, NoSQL
  5. Identifying Unstructured Data Sources
     Objective: Identify and leverage social media outlets to better predict the overall sentiment across political candidates.
     Facebook
       - User Likes and Favorites
       - Article/Video/Link Shares
       - Views
       - Comments
       - Location / Geospatial
     Google / YouTube
       - Blogs
       - Comments
       - Search Statistics
       - Likes vs Dislikes
       - Shares / Views / Comments
     Twitter (Tweet Characteristics)
       - Length
       - Language Model
       - Semantics
       - Emoticons
       - Location / Geospatial
  6. SEARCH, MENTION AND WORD PAIR ANALYSIS
     Search Engine Data
       • Number of searches for a candidate or political party
       • Word pair / combination analysis
     Why should we care?
       • Determine the most successful candidate online
       • Effectiveness of campaigns and conversion to online competitive content
  7. ADVANCED SENTIMENT ANALYSIS
     What is this sentiment they speak of?
       • Unstructured text data
       • Using computational linguistics to accurately determine the attitude of a writer with respect to a topic
     Why should we care?
       • Use “Opinion Mining” to predict political bias
  8. Relational and Unstructured Analytics / BI
     [Diagram; surviving labels: ELT, Customer Data]
  9. Agile Analysis - Mathematical Methods
     Prediction and Machine Learning
       - Unigram and Bigram Features
       - Bayesian Probability
       - Maximum Entropy
       - Distant Supervision
       - Support Vector Machines
  10. Unstructured Analysis - Naïve Bayes classifier
      [Chart, shown twice: accuracy, precision and recall of the Naïve Bayes classifier on a 0% to 100% axis; values not recoverable]
      Hashtags shown: #obama, #Kardashian, #iran, #bieber, #biglove, #romney, #palin, #healthcare, #stimulus, #nexttopmodel, #bigdata, #teaparty
      Second hashtag group: #obama, #iran, #romney, #palin, #stimulus, #teaparty
      Example posts:
        • Erica – Wow I love cookies in the morning, check out my new batch
        • Daria – #Romney speech was horrible that guy knows nothing
      Stage labels: Political Classification; Political Relevance (Unstructured + Structured)
  11. [Dashboard: Mitt Romney, Republican Primary]
      Filter by: Orange County (January 2011 – May 2011)
      Sources: Facebook, Twitter, Google
      Sentiment (Positive / Neutral / Negative) by topic: Education, Economy, Foreign Policy, Health Care
      Chart: Sentiment vs. Actuals on a 0 to 10 scale for Romney, Santorum, Huntsman, Gingrich, Paul
      Legend: Democratic Vote, Democratic Sentiment, Republican Vote, Republican Sentiment
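Slides 9 and 10 name unigram and bigram features, a Naïve Bayes classifier, and accuracy, precision and recall, but the transcript does not show how they were combined. The following is a minimal sketch of one way those pieces fit together, written in plain Python rather than Mahout. The training and test texts, the "political" / "other" labels, and the names `features`, `NaiveBayes` and `evaluate` are all assumptions made for illustration; only the hashtags and the two example posts echo the deck.

```python
import math
from collections import Counter, defaultdict


def features(text):
    """Unigram and bigram features from a lowercased whitespace tokenization."""
    tokens = text.lower().split()
    bigrams = [f"{a} {b}" for a, b in zip(tokens, tokens[1:])]
    return tokens + bigrams


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.feature_counts = defaultdict(Counter)
        self.vocab = set()
        for text, label in zip(texts, labels):
            feats = features(text)
            self.feature_counts[label].update(feats)
            self.vocab.update(feats)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.feature_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for feat in features(text):
                if feat in self.vocab:  # skip features never seen in training
                    score += math.log((counts[feat] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def evaluate(y_true, y_pred, positive):
    """Return (accuracy, precision, recall) treating `positive` as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall


# Hypothetical labeled data, loosely modeled on the posts shown on slide 10.
TRAIN = [
    ("#romney speech was horrible that guy knows nothing", "political"),
    ("#obama #healthcare #stimulus vote today", "political"),
    ("#iran policy debate with #palin", "political"),
    ("wow i love cookies in the morning", "other"),
    ("#bieber concert was amazing", "other"),
    ("#nexttopmodel finale tonight", "other"),
]
TEST = [
    ("#romney #teaparty speech", "political"),
    ("#obama vote today", "political"),
    ("i love #bieber", "other"),
    ("cookies in the morning", "other"),
    ("concert for the #teaparty", "political"),
]

model = NaiveBayes().fit([t for t, _ in TRAIN], [l for _, l in TRAIN])
y_true = [label for _, label in TEST]
y_pred = [model.predict(text) for text, _ in TEST]
accuracy, precision, recall = evaluate(y_true, y_pred, positive="political")
print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
```

On this toy data the last test post is misclassified because "#teaparty" never appears in training, so recall for the political class drops below precision. That is the kind of gap the three bars on slide 10 are meant to expose, and it is one reason hashtags alone are a weak signal without broader training text.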