Using Big Data Technologies for Social Media Analytics - Impetus White Paper


For the Impetus White Papers archive, visit http://www.impetus.com/whitepaper

In this white paper, Impetus talks about the need for building a social analytics platform based on Big Data technologies for better business insight.

Published in: Technology, Business


Using Big Data technologies to enable social media analytics

White Paper

Abstract

In this white paper, Impetus talks about the need for building a social analytics platform based on Big Data technologies for better business insight. The paper also focuses on why social media analytics is important in today’s world and how 3-D data sources (that is, internal, external and social data) can be utilized to build a data warehouse based on Big Data technologies. Impetus also shares its recommended solution, and explains how Big Data technologies can be used to optimize costs and handle exponential increases in data over time.

Impetus Technologies Inc.
www.impetus.com

Table of Contents

Introduction
The benefits of Social Analytics
Data sources that facilitate Social Media Analytics
Technical tenets of Social Media Analytics
Using Big Data technologies to enable Social Media Analytics
Building a Big Data warehouse
A step-by-step approach to creating the Big Data EDW
The Impetus solution
  The iLaDaP high level architecture
Summary

Introduction

Social Media Analytics is a discipline that helps organizations measure, assess and explain the performance of their social media initiatives.

There are four stages of analyzing social media data:

Step 1: collecting the data. This facilitates the compiling of reports and statistics that are to be shared with the management or the internal and external stakeholders.

Step 2: measuring the data. This helps in Sentiment Analysis and in gauging which products are well received in the marketplace.

Step 3: analysis. Here, data is presented in a visual and interactive manner to the management, as well as the sales and marketing teams, to provide better insights.

Step 4: innovation. Based on the insights and analysis, there is a move towards innovation, where organizations determine the new products and ideas they are going to pursue as a response to customer requirements. Innovation also helps unearth cross-sell or up-sell opportunities that were not visible before.

Social Analytics opens up a host of new opportunities and perspectives. Category-wise analysis of customer data, for instance, enables demographic profiling of customers and helps determine their usage patterns. Similarly, with Feature Analysis, it is possible to figure out which forums, platforms or sources of data are more active than others.

Product Growth Analysis, which focuses on the data generated for a specific product, helps in understanding the response of users to that product. There is also a Recommendation Engine, which helps zero in on what is missing or lacking in a product range.

Finally, Social Analytics enables Third Party Analysis, which is purely focused on what public social media platforms, such as Twitter, Facebook and MySpace, have to say about the product.

The benefits of Social Analytics

Social Analytics is an outcome-based approach that creates a visible Return on Investment (RoI).

• It helps organizations retain customers by addressing their concerns upfront, rather than being slaves to processes. The results of the analytics help organizations retain brand preference in a fickle consumer world.
• It improves customer service and brings down the cost of operations.
• It enables organizations to add new customers by understanding and addressing their requirements.
• It helps companies keep an eye on their competition. With easy access to social media data, it is simple to track and counter the moves of competitors.
• It helps companies remain proactive. The turnaround time for gathering customer feedback is reduced drastically. Moreover, the reactions of customers and their subsequent actions can be predicted more accurately, enabling organizations to take appropriate measures.

Social Media Analytics effectively converges on-site, social media and third party data to extract useful information. Considering these factors, and the fact that it enables enterprises to leverage the colossal data that is continuously generated through social media interactions, Social Media Analytics should be made an integral part of the marketing and research strategies of enterprises.

Data sources that facilitate Social Media Analytics

Data sources include internal data, such as the purchase history of customers, their transactions, and their profiles in the enterprise database. Internal data also encompasses website traffic analysis, covering internal CSR logs, customer queries, automated agent discussions, complaints and resolutions, and employee insights.

Data sources can also be the social activities and profile updates of customers on public social media platforms such as Twitter, Facebook, MySpace and LinkedIn.

External data sources can additionally be used, and customers analyzed by factoring in industry sources of information and market research reports.

Technical tenets of Social Media Analytics

Here’s a look at what Social Media Analytics entails and enables:

Clustering: Clustering is about capturing and analyzing the various comments, demands, and questions that customers share with like-minded friends and groups over social media platforms. It helps identify the appropriate response as well as behavioral anomalies.

Classification: Having captured data on the activities of customers and their comments, it is possible to perform natural language processing on it to derive patterns. These patterns can then be categorized and understood for appropriate responses. Organizations can use Classification to address the concerns of customers and approach them with products and offerings that really meet their needs.

Sequential classification: This enables organizations to identify the subsequent steps and actions that customers might take, based on their recent experiences.

Entity Extraction: Organizations can identify the concerns and issues that dissatisfied customers are struggling with through Entity Extraction. They can then take appropriate measures to ease the situation and retain customers on the verge of switching to other suppliers or vendors. Event Extraction enables companies to unearth the sequence of events leading up to customer defections, or why people moved on to other providers.

Communication Graphs: Once organizations have all the data nicely sliced and diced, they can draw Communication Graphs. These graphs can help analyze and identify the top influencers and active members in various groups. They can also help companies gain a better understanding of where the messages originate, and how they travel through the network. Knowing this, organizations can target the top influencers and most active members in the network, projecting a positive image of the brand or product in the community.

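As a rough illustration of the Communication Graph idea described above, the following Python sketch ranks users by how many distinct people mention them (a simple in-degree measure) and by how many messages they send. The sample data, the function names and the choice of in-degree as the influence measure are all assumptions made for this example; the white paper does not specify how influence is computed.

```python
from collections import Counter, defaultdict

# Hypothetical sample data: (author, mentioned_user) pairs extracted from posts.
MENTIONS = [
    ("ann", "brandx"),
    ("bob", "ann"),
    ("cat", "ann"),
    ("dan", "ann"),
    ("dan", "bob"),
    ("eve", "bob"),
    ("ann", "bob"),
]


def build_graph(edges):
    """Return adjacency sets: author -> set of users that author mentioned."""
    graph = defaultdict(set)
    for src, dst in edges:
        if src != dst:  # ignore self-mentions
            graph[src].add(dst)
    return graph


def top_influencers(edges, n=2):
    """Rank users by in-degree, i.e. how many distinct authors mention them."""
    in_degree = Counter()
    for targets in build_graph(edges).values():
        for dst in targets:
            in_degree[dst] += 1
    # Highest in-degree first; ties are broken alphabetically.
    return sorted(in_degree.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


def most_active(edges, n=2):
    """Rank users by the number of messages they sent."""
    sent = Counter(src for src, _ in edges)
    return sorted(sent.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


if __name__ == "__main__":
    print("Top influencers:", top_influencers(MENTIONS))
    print("Most active:", most_active(MENTIONS))
```

On real data, a weighted measure or a graph library would likely be a better fit; in-degree is used here only because it is the simplest way to show the idea.
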
Using Big Data technologies to enable Social Media Analytics

One of the biggest challenges that organizations face with their social media data is its sheer size.

Existing Enterprise Data Warehouse (EDW) environments, designed decades ago, simply lack the ability to capture and process social media data within a reasonable time. Moreover, these traditional EDWs have limited capabilities when it comes to analyzing the behavioral data of users. Traditional solutions can neither help companies manage the complex, unstructured data generated by social media interactions nor handle multimedia data.

Using Big Data technologies is their best bet in this scenario. Big Data technologies can help organizations handle large volumes of complex, unstructured data from social sources, of the order of terabytes and petabytes, gain insights into customers and trends, store images and videos, and save hundreds of thousands of dollars per terabyte per year.

Take the instance of a Big Data Social Analytics Platform which has to deal with information from various data sources, such as social media sites and Web 2.0 enabled websites. The Platform can also pull historical bulk data lying around in existing systems using appropriate connectors.

The connectors convert data from all kinds of sources and load it into a Hadoop-based data warehouse. After this data is collected, Apache Mahout, a scalable machine learning and data mining solution, can be used to categorize the data and store it in accordance with the categories for later use. It is also possible to run Map-Reduce jobs that use the Natural Language Toolkit (NLTK) to perform natural language processing of the comments and feedback from the social data sources.

The aptly massaged and categorized data can then be used to draw graphs and analyze market sentiment about a product. With Sqoop, the data can also be exported for MIS and for compiling the regulatory reports that need to be produced on a regular basis.

Since the Big Data Social Analytics Platform is powered by Hadoop, it can scale linearly to thousands of nodes using commodity hardware. This spells a significant cost advantage for organizations in the long run.

Since it is important for businesses to track down and take advantage of opportunities quickly, this platform can enable them to react to events as they happen.

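The Map-Reduce processing of comments mentioned above can be sketched in plain Python. This is only an in-memory illustration of the map, shuffle and reduce phases, not a Hadoop job: the function names are hypothetical, and a simple regular-expression tokenizer stands in for NLTK so that the example runs without extra dependencies.

```python
import re
from collections import defaultdict

# Assumption: a plain regex tokenizer stands in for an NLTK tokenizer.
TOKEN = re.compile(r"[a-z']+")


def map_comment(comment):
    """Map step: emit a (token, 1) pair for every word in one comment."""
    for token in TOKEN.findall(comment.lower()):
        yield token, 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(key, values):
    """Reduce step: sum the counts collected for one token."""
    return key, sum(values)


def run_job(comments):
    """Run map, shuffle and reduce in memory and return token counts."""
    pairs = (pair for comment in comments for pair in map_comment(comment))
    return dict(reduce_counts(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    sample = ["Love the gift card", "The free offer is great, love it"]
    print(run_job(sample))
```

In a real cluster the framework performs the shuffle and distributes the map and reduce calls across nodes; the per-record logic would look much the same.
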
Building a Big Data warehouse

In order to build a Big Data warehouse that extracts data from the sources discussed earlier and draws pertinent insights from it, organizations must begin by grabbing social media data from various public social media platforms. The historical master data and transactional data about customers can be taken from existing systems. Sqoop can come in handy for pulling this data out of the RDBMS systems that are already in place.

Text         User        Location   Source
Gift card    TweetUser   USA, NY    Twitter
Free offer   FaceUser    USA, GA    Facebook

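One way to picture the table above is as a common record layout into which feeds from different platforms are converted before loading. The sketch below is a minimal illustration under stated assumptions: the raw payload field names (`screen_name`, `place`, `message`, `author`) are invented for this example and are not the actual formats returned by Twitter or Facebook.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SocialRecord:
    """One row of the unified layout: Text, User, Location, Source."""
    text: str
    user: str
    location: str
    source: str


def from_tweet(raw):
    """Convert a hypothetical tweet payload into the common layout."""
    return SocialRecord(raw["text"].strip(), raw["screen_name"], raw["place"], "Twitter")


def from_facebook_post(raw):
    """Convert a hypothetical Facebook post payload into the common layout."""
    return SocialRecord(raw["message"].strip(), raw["author"], raw["location"], "Facebook")


if __name__ == "__main__":
    rows = [
        from_tweet({"text": " Gift card ", "screen_name": "TweetUser", "place": "USA, NY"}),
        from_facebook_post({"message": "Free offer", "author": "FaceUser", "location": "USA, GA"}),
    ]
    for row in rows:
        print(row)
```

Keeping one converter per source means that adding a new platform only requires one new function, while downstream jobs continue to see the same four fields.
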
For natural language processing, NLTK is a good Open Source option. Data preparation and mashups can be accomplished by running Map-Reduce jobs over the collected data and massaging it.

Apache Mahout’s k-means algorithm can be used for clustering, while its Naive Bayes algorithm can be used for classification and sentiment analysis, using the comments and tweets from social media data sources and identifying patterns. The item-based similarity algorithm of Mahout can be used for collaborative filtering and recommendations. When the data is ready for analytical reporting and deep mining, Hive or Pig can be used.

A step-by-step approach to creating the Big Data EDW

Step 1: The first step is to create training data and run it through Mahout so that it learns how to classify social data feeds. Next, the feeds have to be collected from public social media platforms. This can be accomplished by performing keyword-based searches and streaming in the result sets on a continuous basis. It is now possible to search on the basis of a brand name, product make and model, category, industry terminology, product segment, special offers and marketing buzzwords, using the various APIs offered by social media platforms. This classified data can then be dumped into an HBase-based data warehouse constantly and continuously.

The data from existing systems can also be imported into the HBase-based Big Data warehouse. Online content can be crawled and dumped into the HBase database. Connectors are available for classification of online pages. Lucene and Solr are well suited to this purpose.

Step 2: At this stage, quantitative analytics can be performed on the collected data. It is possible to draw comparisons between ‘Total tweets’ and ‘Our product specific tweets.’ This is accomplished by using Mahout algorithms over a Hadoop cluster. Organizations can also publish a daily trend watch.
This may contain the ‘total number of comments about the products of their competitors’ versus the ‘total number of comments about their own products.’ With customers increasingly using devices for connecting to social media platforms, it is now possible to perform location-based trend analysis.

Classification and clustering are performed on data processed by Mahout and NLTK. Organizations can run the training data through Mahout and NLTK to build trained models. After that, it is possible to run the tweets and feeds from other social media platforms through the trained models, and have the tweets and comments classified. This provides a clear picture of the sentiments prevailing in the marketplace for the products of organizations as well as their competitors.

Companies can come up with recommendations by running the data through Mahout. These recommendations can then be factored into future product design and rollouts.

Step 3: This step is about using customer data to recommend new and related products. Once companies have data from their existing systems as well as social sources, they can prepare the mock customer data for Social ID mapping and run item-based or user-based recommendations on this data using Mahout.

At this stage, it is possible to produce Analytical Reports on data generated by Mahout. This can be accomplished by generating reports using a traditional reporting product or framework. The nicely sliced and diced reporting data can be dumped into a MySQL database or some other SQL database with the help of Sqoop. This SQL database can be used to meet the regular downstream reporting requirements of organizations. This will enable them to use their existing investments in reporting tools, as well as provide drill-down reports for use by the management and the Sales and Marketing departments.

Alongside social media, this Big Data Media Analytics platform can be used to address other large data analytics requirements. The platform can give companies a head start in putting together the pieces of their Big Data strategy and provide them with an asymmetric advantage over competition.

The Impetus solution

Impetus has used this approach and these technologies to build a platform for Social Media Analytics.
Impetus, an established thought leader in the Big Data space, has conceptualized, architected and built this platform based on the experience and expertise that it has gained through its client engagements.

The iLaDaP high level architecture

The Large Data Analytics Platform developed by Impetus is built using the Service Oriented Architecture (SOA), and incorporates all the key characteristics of an ideal Big Data Analytics Platform. The iLaDaP is designed to derive intelligence and operate on huge datasets collected from numerous data sources in multiple data formats.

It is powered by Hadoop and can therefore scale linearly to thousands of nodes using commodity hardware. This spells a significant cost advantage in the long run. iLaDaP also comes with a set of pre-canned and customized reports.

Businesses that need to track down and take advantage of opportunities as they happen can use the Impetus platform to react to events. The iLaDaP is also capable of collecting data from a range of disparate sources. This unstructured data can be transformed and utilized for strategic business decisions.

Furthermore, organizations can deploy the solution on-premise, as well as in a Cloud-supported setup. iLaDaP can be seamlessly integrated with the current platforms of companies, without making any major changes.

Summary

Traditional Enterprise Data Warehouses do not have the ability to keep up with rapidly increasing social media data. The need of the hour is to effectively strategize and build a Big Data Analytics Platform to manage, store and derive insights from this digital data.

Any single vendor technology may not be sufficient to undertake this task, and it is recommended that organizations go for Open Source options to build a Social Media Analytics Platform using Big Data technologies. The fact is that the success of a Big Data platform depends entirely on the tools that are used. Organizations therefore need to use discretion and select the most appropriate tools from the available options. Companies can also re-use existing EDW investments for their Big Data Analytics Platform.

About Impetus

Impetus Technologies is a leading provider of Big Data solutions for the Fortune 500®. We help customers effectively manage the “3-Vs” of Big Data and create new business insights across their enterprises.

Website: www.bigdata.impetus.com | Email: bigdata@impetus.com

© 2013 Impetus Technologies, Inc. All rights reserved. Product and company names mentioned herein may be trademarks of their respective companies.

May 2013
