Understanding Social Media Analytics: Big Picture



Published in: Technology, Business


  1. Understanding Social Media Analytics
     SANDEEP SEERAPU
     © This document contains confidential and proprietary information of Adroitent. It is furnished for evaluation purposes only. Except with the express prior written permission of Adroitent, this document and the information contained herein may not be published, disclosed, or used for any other purpose. | www.adroitent.com
  2. Social Media: Big Change
     • The Web is no longer a static library that people passively browse
     • The Web is a place where people:
       o Consume and create content
       o Interact with other people: Internet forums, blogs, social networks, Twitter, wikis, podcasts, slide sharing, bookmark sharing, product reviews, comments, …
     • Data point: Facebook traffic tops Google (for the USA)
       o March 2010: Facebook > 7% of US traffic (http://money.cnn.com/2010/03/16/technology/facebook_most_visited)
  3. Social Media: Rich and Big Data
     • Rich and big data:
       o Billions of users, billions of content items
       o Textual and multimedia content (images, videos, etc.)
       o Billions of connections
       o Behaviours, preferences, trends...
     • Data is open and easy to access
     • It's easy to get data from social media:
       o Datasets
       o Developer APIs
       o Spidering the Web
  4. Social Media: Opportunities
     Any user can share and contribute content, express opinions, and link to others.
     This means we can data-mine the opinions and behaviours of millions of users to gain insights into:
     • Human behaviour
     • Marketing analytics
     • Product sentiment
  5. What can we do with this data?
  6. Applications: Reputation Management
     • Consumer brand analytics
       o What are people saying about our brand?
     • Marketing communications
       o Companies spend significantly on marketing and advertising to position their products
       o Brand analytics helps to determine whether such campaigns are effective
     • Product reviews
       o Automatically mine product reviews for information on product features, new requests, …
       o Example features: easy to use, comfortable chair, light weight, sturdy, good price
  7. Applications: Citizen Response
     • Citizen response
       o Solicit citizen feedback on bills debated in Congress
       o What new issues are being raised? Which aspects of a bill are popular or unpopular?
     • Political campaigns
       o Why do people support a candidate?
     • Law enforcement
       o Gang members boast about their activities on Facebook
       o Protests are planned through Twitter
       o NYT: Sending the Police Before There's a Crime (http://www.nytimes.com/2011/08/16/us/16police.html?_r=1)
  8. Applications: Social Media Marketing
     • Viral marketing:
       o Personalized recommendations
     • Online forum users are brand advocates:
       o 79.2% of forum contributors help a friend to make a decision about a product purchase (47.6% of non-contributors)
       o 65% of forum contributors share advice (offline and in person) based on information that they've read online (35% of non-contributors)
     Source: http://www.socialmediaexaminer.com/new-studies-show-value-of-social-media
  9. Information Flow
     How do we capture and model the flow of information?
  10. Text mining and semantic methods
      Given that social media generate a wealth of consumer data, how can brands turn raw social media comment data from Twitter, Facebook, blogs, and forums into actionable business insights? The answer lies in applying text-mining and semantic technology to these new sources of unstructured data.
      How does it work?
      • Text mining is similar to data mining in that it aims to identify interesting patterns in data.
      • The first step in any text-mining effort is to identify the text-based sources to be analysed and to gather this material, either through information retrieval or by selecting the corpus, that is, the set of textual files and content of interest.
      • Extensive NLP is then applied: part-of-speech tagging and text sequencing to parse for syntax (that is, tokenizing text), followed by Named Entity Recognition (that is, identifying mentions of brands, people's names, places, common abbreviations, and so on).
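The tokenizing and entity-identification steps described on this slide can be illustrated with a minimal sketch. It is not the method of any particular product: it assumes a crude regular-expression tokenizer in place of full NLP, and a hypothetical hand-made brand list (`KNOWN_BRANDS`) in place of a trained Named Entity Recognition model. The brand names and sample posts are invented for illustration.

```python
import re
from collections import Counter

# Hypothetical brand list; a real system would use a trained NER model instead.
KNOWN_BRANDS = {"acme", "globex"}


def tokenize(text):
    """Split text into lowercase word tokens (a crude stand-in for NLP tokenization)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def brand_mentions(posts, brands=KNOWN_BRANDS):
    """Count how often each known brand is mentioned across a list of posts."""
    counts = Counter()
    for post in posts:
        for token in tokenize(post):
            if token in brands:
                counts[token] += 1
    return counts


# Invented sample posts standing in for the gathered corpus.
posts = [
    "Loving my new Acme phone!",
    "Globex support was slow, Acme was faster.",
]
mentions = brand_mentions(posts)
for brand, count in mentions.most_common():
    print(brand, count)
```

A dictionary lookup like this misses misspellings, multi-word names, and ambiguous words, which is why production systems rely on statistical NER rather than fixed lists.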
  11. Social media datamarts and big data
      Applying text mining to social media data poses unique challenges. The data that social networking sites, blogs, and forums generate falls into the category commonly referred to as big data. It is unstructured or semi-structured, petabytes are generated around larger brands on a daily basis, and traditional relational databases cannot scale efficiently to support real-time analytics on it. Big data and NoSQL database solutions are therefore required.
  12. Text mining tools
      There are several commercial and open source options for text-mining software and applications. Of the open source text mining tools, RapidMiner and R appear to be two of the most popular.
      • R has a wider user base. It is a programming language, so analyses must be written as source code, and it offers a large selection of algorithms. However, scalability is an issue with R, so it is not ideal for large datasets without workarounds.
      • RapidMiner has a smaller user base, but it doesn't require source code and has a powerful user interface (UI).
      Embedded is a list of other text mining tools:
  13. Who does this text mining?
  14. Dataset Providers
      Spinn3r is a web service that provides raw access to posts, articles, tweets, status updates, etc. as they are published, in real or near real time, allowing you to focus on building your application, mashup, or search engine. It finds the sources, indexes their content, and takes care of all the heavy lifting around delivering large amounts of relevant data. Spinn3r publishes an API so that companies can build analytics products on top of this data.
      • Spinn3r dataset: http://spinn3r.com
        o 30 million articles/day (50 GB of data)
        o 20,000 news sources plus millions of blogs and forums
        o Lots of tweets and public Facebook posts
      Gnip and DataSift are among the many others who provide these kinds of datasets.
  15. Now that you have the datasets, what next?
  16. Product Companies
      Many product companies use these datasets to build analytical products for organizations.
      • InsideView: with InsideView CRM+, your marketing, sales, and service teams can:
        o Research market, company, contact, and competitor information
        o Use real-time news and social network connections to target new leads and engage with customers
        o Enrich leads to help sales move from lead to win
        o Use one-click CRM integration to update leads and contacts in your CRM
      • Tealeaf: Tealeaf's Customer Behavior Analysis Suite
        o Improving online customer experience is a top priority for many organizations, and Tealeaf's Customer Behavior Analysis Suite was created with this goal in mind. By using cxImpact, cxResults and cxView in concert, companies have both the quantitative data and the qualitative experience information necessary to understand customers' true experiences.
  17. Similarly, further product companies that provide analytical tools built on datasets:
      • www.sprinklr.com
      • www.leadformix.com
      • www.xactlycorp.com
      • www.moxiesoft.com
      • www.synaptris.com
      • www.quinstreet.com
      • www.enirogroup.com/en
      • www.saama.com
      • www.mu-sigma.com
      And many more.
  18. Conceptually, what do these tools provide?
  19. Sentiment Analysis: Predictive Analytic Technique
      Sentiment analysis depends on an appropriate subjectivity lexicon that captures the relative positive, neutral or negative sense of a word or expression. It is both language and context specific. Consider this example:
        I find PRODUCTX to be very good and useful, but it is a bit too expensive.
      The expression (and therefore PRODUCTX) is rated as positive, since there are two positive words, “good” and “useful”, and one negative word, “expensive”. In addition, one of the positive words is enhanced by the word “very”, while the negative word is softened by the qualifier “a bit”. The more advanced the lexica, the more detailed the analysis and the findings can be. Sentiment analysis is a well-established, stand-alone predictive analytic technique.
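The PRODUCTX example can be worked through with a small lexicon-based scorer. This is only a sketch under stated assumptions: the lexicon entries, the weights (1.5 for "very", 0.5 for "bit"), and the rule that a modifier reaches at most two tokens ahead are all illustrative choices, not values taken from the slide or from any published lexicon.

```python
import re

# Hypothetical toy lexicon; real subjectivity lexica are far larger.
LEXICON = {"good": 1.0, "useful": 1.0, "expensive": -1.0}
# Hypothetical modifier weights: "very" strengthens, "bit" (as in "a bit") softens.
MODIFIERS = {"very": 1.5, "bit": 0.5}


def score(text, reach=2):
    """Sum lexicon polarities, scaling a word by a modifier seen within `reach` tokens before it."""
    tokens = re.findall(r"[a-z']+", text.lower())
    total = 0.0
    modifier, remaining = 1.0, 0
    for token in tokens:
        if token in MODIFIERS:
            modifier, remaining = MODIFIERS[token], reach
        elif token in LEXICON:
            weight = modifier if remaining > 0 else 1.0
            total += LEXICON[token] * weight
            modifier, remaining = 1.0, 0  # a modifier is used up by one word
        elif remaining > 0:
            remaining -= 1  # the modifier fades as unrelated tokens pass
    return total


def label(value):
    """Map a numeric score to a coarse sentiment label."""
    if value > 0:
        return "positive"
    if value < 0:
        return "negative"
    return "neutral"


sentence = "I find PRODUCTX to be very good and useful, but it is a bit too expensive."
result = score(sentence)
print(result, label(result))
```

With these assumed weights, "very good" contributes 1.5, "useful" contributes 1.0, and "a bit too expensive" contributes -0.5, so the sentence scores 2.0 and is labelled positive, matching the rating described on the slide. The sketch ignores negation and sarcasm, which richer lexica and models must handle.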
  20. Social Media Scorecards
      These tools are generally cloud-based applications that pull together many different social media data sources (datasets), including communities and blogs. They can do this because they generally incorporate a massive back-end infrastructure that constantly crawls and captures new data from the APIs as it appears. They all provide an interface for filtering the data and entering selection criteria across a broad range of channels. The results usually take the form of a visual scorecard that combines graphical and tabular techniques to display the summarized information. Many allow an interactive “drill down” to see further details, and most let you drill right through to the original source of the data.
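The filter, summarize, and drill-down pattern behind such scorecards can be sketched in a few lines. The records below are hypothetical: the field names (`channel`, `brand`, `sentiment`, `url`) and the example.com URLs are invented for illustration, and real tools would read from a large back-end store rather than an in-memory list.

```python
from collections import Counter, defaultdict

# Hypothetical records: posts already tagged with a channel and a sentiment label.
POSTS = [
    {"channel": "twitter", "brand": "acme", "sentiment": "positive", "url": "https://example.com/t/1"},
    {"channel": "twitter", "brand": "acme", "sentiment": "negative", "url": "https://example.com/t/2"},
    {"channel": "blogs", "brand": "acme", "sentiment": "positive", "url": "https://example.com/b/1"},
    {"channel": "forums", "brand": "globex", "sentiment": "neutral", "url": "https://example.com/f/1"},
]


def scorecard(posts, brand):
    """Summarize sentiment counts per channel for one brand (the filter criterion)."""
    card = defaultdict(Counter)
    for post in posts:
        if post["brand"] == brand:
            card[post["channel"]][post["sentiment"]] += 1
    return card


def drill_down(posts, brand, channel, sentiment):
    """Return the source URLs behind one cell of the scorecard."""
    return [
        post["url"]
        for post in posts
        if post["brand"] == brand
        and post["channel"] == channel
        and post["sentiment"] == sentiment
    ]


card = scorecard(POSTS, "acme")
# Tabular view: one row per channel with positive / neutral / negative counts.
for channel in sorted(card):
    counts = card[channel]
    print(f"{channel:<8} +{counts['positive']} ={counts['neutral']} -{counts['negative']}")

negative_sources = drill_down(POSTS, "acme", "twitter", "negative")
```

Keeping the source URL on every record is what makes the drill-through to the original post possible after the data has been aggregated.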
  21. Technologies Used by these Product Companies
      • Big data technologies:
        o Hadoop frameworks (HDFS, Pig, Hive, Oozie, HBase, Mahout)
        o Cloudera (CDH3 and CDH4) distributions
        o Postgres + PostGIS
        o Cassandra
      • Languages:
        o Java
        o Perl
      • Cloud computing technologies:
        o Amazon Web Services (AWS) / Amazon EC2
        o Amazon S3
        o Amazon EMR
        o Amazon CloudWatch