Building Effective Frameworks for Social Media Analysis


Learn about gaining agile intelligence with open analytics from social media.

Published in: Technology, Business

  • Given my background, I come at the social media problem from an intelligence analysis perspective. That perspective brings its own vocabulary and paradigms, but I believe they are useful for understanding how to build an effective analytic framework.

    1. Building Effective Frameworks for Social Media Analysis
    2. Agenda
        • Social Media: An INT perspective
        • Common Analytic Pitfalls
        • An Analytic Framework
        • Case Study: Brand Management
          – Problem Definition
          – Source Selection
          – Data Capture
          – Data Reporting
          – Data Analysis
        • Ways Forward, Future Analysis
        • Questions?
    3. Intelligence
        • Intelligence is information that has been transformed to meet an operational need
        [Diagram labels: Operational Lens, Data, Intelligence]
    4. Intelligence Cycle
        • No matter what method you use, analysis is an iterative process
    5. Social Media: The INT Perspective
        Social media gets the best and worst of three disciplines:
        – HUMINT
          • Pros: Reveals intentions
          • Cons: Can be unreliable
        – OSINT
          • Pros: Fast, Accessible
          • Cons: Noise
        – SIGINT
          • Pros: Network, High Volume
          • Cons: Noise
    6. Social Media Analysis Goals
        • Need to have an end-goal with value to the organization (operational lens)
        • Need to ensure cyclical feedback occurs from collection, processing, analysis, and consumption
        • Need to make sure that a particular network is the right source for the task
    7. Common Misconceptions
        • Social media is not a panacea
          – Not everyone uses social media
          – Users of social media use it unevenly
          – User behavior changes based on situations
        • Just because people can talk about anything does not mean they talk about everything all the time.
    8. Common Pitfalls
        • The important thing is often not what people are saying, but why they are saying it.
        • Reporting tools rarely help dig into the why.
        • Many common tools, reports, and metrics are actually misleading:
          – Word clouds atomize message context
          – Sentiment metrics are often highly inaccurate
          – Information in aggregate hides more than it reveals
    9. Dangers of Disintegration
        Source: Matthew Auer, Policy Studies Journal, Volume 39, Issue 4, pages 709–736, Nov 2011
    10. Analytic Framework
        • Data Capture (DC)
        • Data Reporting (DR)
        • Data Analysis (DA)
          – 1. What to measure
          – 2. What the data is saying
          – 3. What should be done based on the data
        Source: Avinash Kaushik, Occam’s Razor Blog framework-smarter-decisions/
    11. Analytic Framework
        [Diagram labels: Capture, Analysis, Reporting]
    12. Choosing a Platform
        • Social media is still new and evolving, and so is how we use it.
          – Static approaches to social media are flawed from the outset
          – No one metric or set of metrics will always let you know what is happening
        • Need an adaptive platform to facilitate data capture, reporting, and analysis
    13. Case Study: Brand Management
        • Industry: Gaming
          – Experiencing 10% growth annually
          – Overall revenue expected to exceed $80 billion by 2014
        • In May, Zenimax Online Studios announced Elder Scrolls Online
          – Elder Scrolls V: Skyrim was the 2nd largest game of 2011
    14. Problem Definition
        • As a brand manager, how can I use social media to track and understand public attitudes toward my product?
        • Challenge is getting relevant information
          – Query too large = false positives
          – Query too small = miss potential information
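The query-size trade-off on this slide can be made measurable with precision and recall. The sketch below is only an illustration: it assumes a small hand-labeled sample of tweets exists, and the function name, tweet IDs, and example queries are all hypothetical rather than taken from the case study.

```python
def precision_recall(matched_ids, relevant_ids):
    """Return (precision, recall) for one query against hand-labeled relevant IDs."""
    matched = set(matched_ids)
    relevant = set(relevant_ids)
    true_positives = len(matched & relevant)
    # Precision: share of returned tweets that are relevant (low = false positives).
    precision = true_positives / len(matched) if matched else 0.0
    # Recall: share of relevant tweets that were returned (low = missed information).
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical labeled sample: tweets 1-4 are truly about the product.
relevant = {1, 2, 3, 4}
broad_query = {1, 2, 3, 4, 5, 6, 7, 8}  # query too large: everything, plus noise
narrow_query = {1, 2}                   # query too small: clean, but incomplete

broad = precision_recall(broad_query, relevant)
narrow = precision_recall(narrow_query, relevant)
print("broad :", broad)
print("narrow:", narrow)
```

A query that is too large scores high recall but low precision; a query that is too small does the reverse. Tracking both numbers across query revisions shows whether a change actually improved relevance.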
    15. Source: Twitter
        • Twitter has some of the best analytic potential
          – High volume traffic
          – High volume user-base
          – Open API
        • Not without limitations:
          – 140 characters
          – Limited historical / lookback
    16. Platform: Infinit.e
        Infinit.e is a scalable framework for collecting, storing, enriching, retrieving, analyzing, and visualizing unstructured documents and structured records
    17. Platform: Infinit.e
        • Infinit.e supports the extraction of entities and creation of associations using a combination of built-in enrichment libraries and 3rd party NLP APIs.
    18. Data Capture – Initial Query
        • Twitter search for “Elder Scrolls Online”
          – Simplest possible way to access information
          – RSS feed for 10 days (Jun 27 – July 6 2012)
    19. Data Capture - Tagging
        {
          "_id": "4fea6ddce4b0fa6316c7e07a",
          "communityIds": ["4fce07a1e4b06dc8a9107f3b"],
          "created": "Jun 26, 2012 10:20:12 PM",
          "description": "Twitter search for \"Elder Scrolls Online\" - started 6/26/2012",
          "extractType": "Feed",
          "tags": ["games", "social", "entertainment"],
          "title": "Elder Scrolls Online - Twitter",
          "url": "",
          "useExtractor": "AlchemyAPI-metadata",
          "useTextExtractor": "none"
          ...
        }
    20. Data Capture – Entity Map
        • Who: TwitterHandle
        • What: Hashtags, Keywords, URLs
        • When: Time, Date
        • Where: Geo (if available)
        [Diagram labels: Hashtag, TwitterHandle, URL, Time / Date Stamp, Unstructured Keywords]
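As a rough illustration of the entity map, the sketch below pulls the structured entities (handles, hashtags, URLs) out of a tweet's text with regular expressions. This is a simplified stand-in and not how Infinit.e or AlchemyAPI perform extraction; the patterns, the function name, and the sample tweet are assumptions made for the example.

```python
import re

# Simplified patterns (assumptions): real tweets need more careful handling.
HASHTAG_RE = re.compile(r"#(\w+)")
HANDLE_RE = re.compile(r"@(\w+)")
URL_RE = re.compile(r"https?://\S+")


def extract_entities(text):
    """Map a tweet's text onto the structured part of the entity map."""
    urls = URL_RE.findall(text)
    # Strip URLs first so a "#fragment" inside a link is not counted as a hashtag.
    stripped = URL_RE.sub(" ", text)
    return {
        "handles": HANDLE_RE.findall(stripped),                          # Who
        "hashtags": [h.lower() for h in HASHTAG_RE.findall(stripped)],   # What
        "urls": urls,                                                    # What
    }


# Hypothetical tweet, for illustration only.
tweet = "@gamer42 can't wait for #ElderScrollsOnline! #MMO http://example.com/news"
entities = extract_entities(tweet)
print(entities)
```

Time, date, and geo would normally come from the tweet's metadata rather than its text, which is why they are left out of this sketch.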
    21. Data Reporting
        • Used Infinit.e’s Flash U/I Widget Framework
          – Document Browser (Individual Tweets)
          – Entity Significance (Top Entities)
          – Sentiment (Top Entities w/ Sentiment)
          – Query Metrics (Breakdowns of Query Results)
        • Framework allows for additional visualizations to be constructed as needed
        • Export options also available for manual review (e.g. GraphML, Excel, PDF)
    22. Data Reporting
    23. Data Reporting
    24. Data Reporting
    25. Data Analysis
        • Analysis needs to be rooted in the operational need: “How can I use social media to track and understand public attitudes toward my product?”
        • Emphasis on hypothesis generation, testing, and experimentation
    26. Data Analysis -> Capture
        • Hashtags from an initial subset of tweets were fed back into the initial query
        [Diagram labels: Initial Query, Expanded Query, Results, Twitter]
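The feedback loop on this slide, where hashtags from initial results are folded back into the query, might look something like the sketch below. The function name, the thresholds, the sample tweets, and the use of `OR` to join terms are assumptions for illustration; the actual search syntax depends on the source being queried.

```python
import re
from collections import Counter


def expand_query(base_query, tweets, top_n=3, min_count=2):
    """Build an expanded query from the most frequent hashtags in initial results."""
    counts = Counter(
        tag.lower()
        for tweet in tweets
        for tag in re.findall(r"#(\w+)", tweet)
    )
    # Keep only hashtags that recur, so one-off tags do not widen the query.
    frequent = [tag for tag, n in counts.most_common(top_n) if n >= min_count]
    terms = [f'"{base_query}"'] + [f"#{tag}" for tag in frequent]
    return " OR ".join(terms)  # OR syntax is an assumption about the search backend


# Hypothetical initial results, for illustration only.
sample = [
    "Elder Scrolls Online announced! #ESO #MMO",
    "Hoping Elder Scrolls Online plays like Skyrim #ESO",
    "Elder Scrolls Online trailer is out #ESO #MMO #gaming",
]
expanded = expand_query("Elder Scrolls Online", sample)
print(expanded)
```

Each expansion should be reviewed before it is used: a generic hashtag widens the query and reintroduces the false-positive problem from the problem definition.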
    27. Data Analysis - Hashtags
        • Top hashtags were almost all generic / more abstract
          – Undermines tracking and understanding
          – Top hashtags tied to franchise, not to the game
    28. Data Analysis - Sentiment
        • Converted URLs into derivative sources
        • 35% additional sources
        • Larger text sources offer potential value with sentiment analysis that tweets alone cannot offer
    29. Data Analysis - Sentiment
        • Top negative and positive scores provided glimpses into aggregate attitudes
        • Provide starting points for additional analysis
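Pulling out the top positive and negative entities as starting points could be sketched as below. The entity names and scores are invented for illustration and are not results from the case study, Infinit.e, or any sentiment API; the function name is likewise hypothetical.

```python
def top_sentiment(entity_scores, n=2):
    """Return (most_positive, most_negative) entities as lists of (name, score)."""
    ranked = sorted(entity_scores.items(), key=lambda item: item[1])
    most_negative = ranked[:n]        # lowest scores first
    most_positive = ranked[::-1][:n]  # highest scores first
    return most_positive, most_negative


# Hypothetical per-entity sentiment scores in the range [-1, 1].
scores = {
    "graphics": 0.6,
    "subscription": -0.7,
    "story": 0.4,
    "combat": -0.2,
    "beta": 0.1,
}
positive, negative = top_sentiment(scores)
print("most positive:", positive)
print("most negative:", negative)
```

In keeping with the slide, these lists are only pointers to where to read the underlying tweets and articles, not conclusions in themselves.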
    30. Data Analysis - Recommendations
        • Actionable recommendations allow decision makers to make changes
    31. Future Data Analysis
        • Initial conclusions should be starting points for new analysis
        • Broad entity capture allows for:
          – Key influencer identification
          – Clustering of tweets for segmentation
          – Map / Reduce for aggregate functions
    32. Infinit.e’s Hadoop Integration
    33. Expandable Model
        • Identify key influencers on specific topics
        • Look at relationships between websites / blogs and Twitter use (cross-network analysis)
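One simple proxy for key influencers is how often a handle is mentioned or retweeted by others. The sketch below counts mentions per handle; it is an assumed heuristic for illustration, not the method used in the presentation, and the authors, handles, and tweets are hypothetical.

```python
import re
from collections import Counter


def top_mentioned(tweets, n=2):
    """Rank handles by how often other users mention or retweet them."""
    mentions = Counter()
    for author, text in tweets:
        for handle in re.findall(r"@(\w+)", text):
            handle = handle.lower()
            if handle != author.lower():  # ignore self-mentions
                mentions[handle] += 1
    return mentions.most_common(n)


# Hypothetical (author, text) pairs, for illustration only.
tweets = [
    ("alice", "RT @devblog: ESO beta signups open"),
    ("bob", "@devblog any news on classes?"),
    ("carol", "agree with @alice on the art style"),
    ("dave", "RT @devblog: ESO beta signups open"),
]
influencers = top_mentioned(tweets)
print(influencers)
```

Mention counts favor high-volume accounts, so a fuller analysis would likely weight by topic or by the reach of the mentioning users.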
    34. Counting and Summing
        • “Traditional” business intelligence analytics problems solved using aggregate functions:
          – Sum
          – Count
          – Average
          – Min
          – Max
          – Etc.
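The aggregate functions listed here follow a map / reduce shape: emit key-value pairs, group by key, then reduce each group. The sketch below does this in memory with plain Python rather than Hadoop; the function name and the records (daily retweet counts) are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def aggregate_by_key(records):
    """Group (key, value) pairs by key and compute basic aggregates per group."""
    groups = defaultdict(list)
    for key, value in records:  # "map" step: collect values under their key
        groups[key].append(value)
    return {                    # "reduce" step: collapse each group to aggregates
        key: {
            "count": len(values),
            "sum": sum(values),
            "avg": mean(values),
            "min": min(values),
            "max": max(values),
        }
        for key, values in groups.items()
    }


# Hypothetical (day, retweet_count) records, for illustration only.
records = [("2012-06-27", 10), ("2012-06-27", 30), ("2012-06-28", 5)]
summary = aggregate_by_key(records)
print(summary)
```

Grouping by a key such as day, hashtag, or author before aggregating is one way to segment the data instead of reporting a single overall number.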
    35. Clustering - Topic
        • Topic Extraction
          – Key words -> Categories
          – Categories -> Related Categories

        Keyword      Topic
        graphics     graphics
        screenshots  graphics
        resolution   graphics
        quests       story
        zenimax      company
        …            …

        Key       Value
        graphics  gameplay.pdf
        story     gameplay.pdf
        company   corporate.txt
        …         …
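The keyword-to-category step can be sketched as a lookup followed by grouping. The keyword mapping below reuses the entries shown on the slide; the function names, the "uncategorized" bucket, and the sample tweets are assumptions added for illustration.

```python
import re
from collections import defaultdict

# Keyword -> topic entries as listed on the slide.
KEYWORD_TO_TOPIC = {
    "graphics": "graphics",
    "screenshots": "graphics",
    "resolution": "graphics",
    "quests": "story",
    "zenimax": "company",
}


def topics_for(text):
    """Return the sorted list of topics whose keywords appear in the text."""
    words = re.findall(r"\w+", text.lower())
    return sorted({KEYWORD_TO_TOPIC[w] for w in words if w in KEYWORD_TO_TOPIC})


def cluster_by_topic(tweets):
    """Group tweets by topic; a tweet may land in more than one cluster."""
    clusters = defaultdict(list)
    for tweet in tweets:
        # Tweets with no known keyword go to a hypothetical "uncategorized" bucket.
        for topic in topics_for(tweet) or ["uncategorized"]:
            clusters[topic].append(tweet)
    return dict(clusters)


# Hypothetical tweets, for illustration only.
tweets = [
    "The screenshots look great, hope the resolution holds up",
    "Zenimax says quests will be fully voiced",
    "Can't wait!",
]
clusters = cluster_by_topic(tweets)
print(clusters)
```

A fixed keyword list is brittle, so the mapping would need the same iterative review as the query itself.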
    36. Clustering - Geo
    37. Take-Aways
        • All data providers can and do change their formats; users flock to and abandon platforms – what works today may not work tomorrow.
        • Whatever platform you choose to do analysis, make sure it’s open and adaptable or your investment may degrade over time.
    38. Take-Aways (Things to Avoid)
        • Data puking (less is more)
        • Metrics that cannot be tied to actions
        • Visualizations / reports that remove context
        • Taking dashboards at face value
    39. Take-Aways (Things to Do)
        • Segment data rather than work in aggregate
        • Look for the why behind the message
        • Always return to the source material
        • Explore alternative explanations
        • Always consider the ultimate goal
    40. Thank You!
        Andrew Strite