Real-World Challenges of Real-Time Social Analytics



In this 40-minute presentation, Attensity’s CTO, Ian Hersey, speaks about the challenges and critical benefits of real-time social media analytics. Real-world examples further illustrate the types of insights that natural language processing is capable of discovering.

Recorded at the 2012 Social Media Analytics Summit, some of the topics covered are:

- How people have become “human sensors” about all kinds of news
- The limitations of insights pulled from the Twitter stream
- The success of predictions based on social media data
- The application of natural language processing in social analytics

Published in: Technology, Business


  1. Real-Time Social Analytics
     Ian Hersey, CTO and EVP, Products
  2. The Business Challenge: The BIG DATA wave
     Driven by conversations on the Internet, Social Media, Mobile Apps
     • 300 Million: Tweets per day
     • 250 Billion: Emails per day
     • 800 Million: Facebook users
     • 126 Million: Blogs
     • 1.97 Billion: Internet users worldwide
     • 5 Trillion: SMS messages annually
     • Millions of CRM Records
     • 100s of Millions of Survey Verbatims
  3. Social Media: “Human Sensors”
     • News, firsthand, secondhand, thirdhand…
       • Natural disasters
       • Military movements
       • Networks, velocity, acceleration
     • Opinions
       • Products
       • Services
       • Popular culture (TV, film, music)
       • Politics
     • Conversations, Comments, Recommendations
     • Can sometimes predict (or explain) outcomes
  4. Predictive Power
     • Don’t take it to Vegas…
       • 90+% success rate if data volumes are sufficient
     • Successful business uses involve not just prediction, but engagement
       • Product feedback
       • Direct customer service
       • Analysis of marketing campaign effectiveness (TV, film, music)
       • Political outreach/mobilization
     • Science/Art is still in its infancy
     • Equally or more important are the “whys” behind the predictions/outcomes
  5. GOP Florida – Newt Gingrich
     • There was a sustained campaign to drum up support for Newt Gingrich
     • Selected Newt Gingrich topics were discussed more at length throughout the day. For example, particularly around mid-afternoon. A Ron Paul supporter was somewhat roughed up at a Gingrich rally. Later in the day Ron Paul’s team demanded an apology…
     • Another topic that had mileage throughout the day: being sued by the band Foreign for using the song “Eye Of The Tiger” since 2009 captured the imagination of Twittersphere …
  6. GOP Florida – Mitt Romney
     • There was a sustained campaign to drum up support for Newt Gingrich
     • The most consistent theme throughout the day was Romney being a Populist.
     • One of the key criticisms were jabs aimed at Romney’s wealth in the sense that money and privilege can win you leadership…
  7. Some Major Technical Challenges
     • Data scale and rates
     • NLP – no “one size fits all” technology
     • Multi-channel content acquisition, coverage and quality
     • Domain and customer specificity in the metadata
     • Combining structured and full-text queries
     • Operation by non-linguists
  8. Data Scale and Rates
     • Experience with Hadoop, HBase and Solr
     • Biggest issues
       • “Enterprise friendliness”
       • Cannot support low-latency processing
       • No current commercial offerings with both SQL and full-text front ends
     • “Real-time” analysis scenario
       • Match a tweet according to an initial filter
       • Do further analysis to determine whether it is “actionable” vs “just a mention”
       • Figure out who to route it to with what kind of priority
       • All within a handful of seconds from the time it was tweeted
       • 2500 times or more per second
     • Required development of real-time ingestion and orchestration framework
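The “real-time” analysis scenario on slide 8 (filter, decide “actionable” vs “just a mention”, route with a priority, all inside a few seconds) can be sketched roughly as below. This is a minimal illustration only, not Attensity's framework: the keyword sets, queue names, priority numbers, and the 5-second budget are all hypothetical stand-ins for whatever filters and NLP analysis a real deployment would use.

```python
import time
from dataclasses import dataclass

# All values below are hypothetical placeholders, not taken from the talk.
FILTER_TERMS = {"ipad", "iphone"}                   # initial topic filter
ACTION_CUES = {"broken", "refund", "help", "cancel"}  # hints that a reply is needed
URGENT_CUES = {"refund", "cancel"}                  # hints that it should jump the queue
LATENCY_BUDGET_S = 5.0                              # "a handful of seconds"


@dataclass
class Routed:
    queue: str           # who the tweet is routed to
    priority: int        # 1 = most urgent
    latency_s: float     # seconds between tweeting and routing
    within_budget: bool  # True if latency_s fits the budget


def process(text, tweeted_at, now=None):
    """Return a routing decision, or None if the tweet fails the initial filter."""
    now = time.time() if now is None else now
    words = {w.strip(".,!?\"'").lower() for w in text.split()}

    # Step 1: match the tweet against the initial filter.
    if not words & FILTER_TERMS:
        return None

    # Step 2: crude keyword stand-in for "actionable" vs "just a mention".
    actionable = bool(words & ACTION_CUES)

    # Step 3: pick a destination and a priority.
    if not actionable:
        queue, priority = "monitoring", 3
    elif words & URGENT_CUES:
        queue, priority = "customer_service", 1
    else:
        queue, priority = "customer_service", 2

    # Step 4: check the decision against the latency budget.
    latency = now - tweeted_at
    return Routed(queue, priority, latency, latency <= LATENCY_BUDGET_S)
```

A call such as `process("My iPad is broken, I want a refund!", tweeted_at=100.0, now=102.0)` yields a priority-1 customer-service routing with a 2-second latency. Sustaining that at the 2500-per-second rate quoted on the slide is the part that, per the talk, required a dedicated ingestion and orchestration framework; this sketch does not address throughput.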
  9. Real-Time Processing Flow
     [Diagram; labels: Harvest & Harmonize, Sensemaking & Annotation, Analyze, Respond, Firehose, Pipeline, 150+ Million Sources, Command Center, Custom Apps]
  10. Real-Time Content Aggregation
      • Direct API Access
      • Scrapers, Crawlers, RSS Collectors
      • Aggregators and Syndicators
      • Structured and Unstructured
  11. Social Analytics Application Stack
  12. Natural Language Processing “Reads” Every Communication
      I bought an iPad2 for my mom last week. She loves the weight, but doesn’t like the color. She wishes it came in blue. She says if it came in blue, then she’d buy one for all her friends.
      Entities (brands, people, locations, times, products…)
      Events and relationships (purchasing event, my mom…)
      Sentiment
      Suggestions
      Intent (to purchase, to leave)
      I:have:mom
      I:buy[past]:Product.apple_product.iPad2
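The two relation strings at the end of slide 12 (`I:have:mom` and `I:buy[past]:Product.apple_product.iPad2`) look like subject:predicate:object triples, with an optional tense in square brackets and a dotted type path on the object. That reading is an assumption drawn only from the slide text; the talk does not define the notation. Under that assumption, a small parser might look like this (the `Triple` type and `parse_triple` function are hypothetical names, not part of any Attensity API):

```python
import re
from typing import NamedTuple, Optional, Tuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    tense: Optional[str]   # e.g. "past", or None when no [..] marker is present
    obj: Tuple[str, ...]   # dotted object path split into its segments


# A predicate is a verb optionally followed by a bracketed tense, e.g. buy[past].
_PREDICATE = re.compile(r"^(?P<verb>[^\[\]]+)(?:\[(?P<tense>[^\[\]]+)\])?$")


def parse_triple(text):
    """Parse 'subject:predicate[tense]:object.path' under the assumed notation."""
    parts = text.split(":")
    if len(parts) != 3:
        raise ValueError(f"expected three ':'-separated fields, got {len(parts)}")
    subject, predicate, obj = parts
    match = _PREDICATE.match(predicate)
    if match is None:
        raise ValueError(f"unrecognized predicate: {predicate!r}")
    return Triple(subject, match["verb"], match["tense"], tuple(obj.split(".")))
```

Parsing the second string gives subject `I`, predicate `buy`, tense `past`, and the object path `Product`, `apple_product`, `iPad2`. Structured output of this kind is what would let a downstream application query for, say, all purchase events involving a given product category.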
  13. Limitations of NLP
      • Irony, sarcasm
      • “slanguage”
      • Who’s talking/tweeting?
        • Agendas
        • Impact (“opinions are like…”)
      • Cross-/multi-language
      • Single posts vs. “body of work”
  14. Annotated Data Streams Feed Downstream Applications
  15. Real-Time Processing Pipeline
      [Diagram; labels: Advanced Topic Creator, Geotagging, Language ID, Reach, Klout, Entity, Event, Sentiment Tagging, Topic Matcher, Message Tracking, Worker Libraries]
      • All standard content (Twitter, Google+, Facebook, Forums, Blogs, Online News)
      • SDK Kit and API documentation
  16. Advanced Topic Creator
  17. Command Center Concepts and Overview
      • The Command Center is a highly branded shared experience providing a lens to real-time social media conversations
      • Command Center screens use a responsive design for the following resolutions
        • 1920x1080 (Most televisions)
        • 1024x728 (Compatible with desktop computers and tablets)
      • A Command Center implementation is made up of multiple Dashboards
        • Implementations are hosted by Attensity
      • Dashboards contain multiple Widgets
        • Widgets are configurable with lots of options
        • Endless combinations
  18. Thank you