REAL TIME TEXT ANALYTICS
Vendor Briefing
September 7, 2011
Proprietary and confidential. Not to be used or reproduced without the consent of Collective Intellect, Inc.
Making Sense of Social & Private Data Deluge
How does an organization unify social & private text analytics?
Making Sense of Social & Private Data Deluge
Big data presents both unique challenges & potential business insights
Shifting categorization or theme patterns
Spam, duplicates, misspellings
Volume and velocity
Processing large volumes of data quickly
Real-time, unsolicited, voice of customer or employees
Weak signals can be a sign of emerging trends or issues
Building a Platform for Handling Social & Private Data - A Text Analytics Command Center (TACC)
Social Media Characteristics
Changing the way consumers interact with brands, products & services
Volume. The number of consumers adopting and using social media platforms continues to grow
Immediate. Real-time, unsolicited, true voice-of-customer
Brand and product mentions are present, but they may be embedded within other themes, topics and interests
Niche but expanding. May represent only a portion of a business’ total consumer audience
Narrow S-CRM focus. Need is expanding beyond Marcom/PR to loyalty, customer service, product development.
Analyzing Social Media
A number of technologies use keyword and Boolean expressions
Pros
Inexpensive and easy to set up
Quick results for “exact” terms and phrase matching
Cons
Inefficient at white-space or open-ended analysis
Become more brittle as each expression is added to include or exclude data
Problems with ambiguity and granular filtering
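
The brittleness and ambiguity listed under Cons can be illustrated with a minimal sketch of keyword and Boolean filtering. This is not any vendor's implementation: the `boolean_match` helper, the query terms and the sample posts are all hypothetical, invented for illustration.

```python
import re

def tokens(text):
    """Lowercase a post and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))

def boolean_match(text, all_of=(), any_of=(), none_of=()):
    """Return True if the post satisfies a simple AND / OR / NOT keyword query."""
    words = tokens(text)
    if any(term not in words for term in all_of):
        return False
    if any_of and not any(term in words for term in any_of):
        return False
    return not any(term in words for term in none_of)

# Hypothetical posts, invented for illustration.
posts = [
    "My kids love Goldfish crackers in their lunch box",
    "Our goldfish needs a bigger tank",
    "Fed the goldfish this morning, he looks happy",
]

# Query 1: a bare keyword matches the snack and the pet alike.
naive = [p for p in posts if boolean_match(p, all_of=["goldfish"])]

# Query 2: exclude pet vocabulary. The third post is still about the pet,
# so every miss like this forces yet another exclusion term.
patched = [
    p for p in posts
    if boolean_match(p, all_of=["goldfish"], none_of=["tank", "aquarium"])
]
```

Here `naive` returns all three posts, and `patched` still lets the third pet post through, which is the sense in which each added include or exclude expression makes the query more brittle without resolving the ambiguity.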

Text Analytics Command Center - Vendor Briefing

Editor's Notes

  • #3 A white paper released today from IDC revised the research firm's earlier estimates to show that by 2011, the amount of electronic data created and stored will grow to 10 times the 180 exabytes that existed in 2006, reflecting a compound annual growth rate of almost 60%. There is a lot of data out there, much of it unstructured, which may contain enormous business insights. Much of this unstructured content comes from social media conversations; internal data might be surveys, chat or video transcripts, or email threads. How do you address the analytical requirements of both social and private data? How do you begin to unify and integrate the analysis and research?
  • #4 "Over the last 20 to 25 years, companies have been focused on leveraging maybe up to 5% of the information available to them," said Brian Hopkins, a principal analyst at Forrester Research Inc. in Cambridge, Mass. "Everything we didn't know what to do with hit the floor and fell through the cracks. In order to compete well, companies are looking to dip into the rest of the 95% of the data swimming around them that can make them better than anyone else." http://searchbusinessanalytics.techtarget.com/news/2240039382/Big-data-poses-big-challenges-for-traditional-analytics-approaches
  • #6 2011 values - Twitter 75m user accounts, LinkedIn over 50m members & Facebook 350m active users
  • #7 If you want to conduct open-ended or white-space analysis, keyword and Boolean matching simply cannot derive meaning and context from large data sets.
  • #8 Existing systems: an organization may have processes or systems in place that do not scale or are unable to deliver precise insights.
  • #10 The command center will serve customers with various social media and private data analytics needs, allowing them to simply use our data and technology or integrate it into their business environment to serve various departments. The following diagram describes the overall functionality offered by the command center. CI will own the first three layers (CI Inputs, CI Engine and CI Outputs) and will integrate through partners' or clients' own applications.
  • #11 Delighting your customer with the right message is not a new way of doing business. But never has the customer had such a powerful and amplifying platform to inform you of their opinion and perspective. A successful business engagement requires a two-way conversation with the customer. Without collecting and understanding your consumers’ input and responses, you are missing out on valuable and increasingly critical information.
  • #12 Industry research estimates that 127 million people, or 57.5% of internet users, visited a social networking site at least once a month in 2010. Not only is the number of users growing quickly, but the audience demographics continue to widen. In 2010, it’s estimated that 59.2% of adult internet users will visit social networks monthly, up from 52.4% in 2009. Research estimates predict a steady rise in social media users by 2014, with 2/3 of all internet users, 164.9 million people, visiting social network sites on a regular basis. Ideally, your listening tool is able to manage not only unstructured social data but also private, internal data. Otherwise you are analyzing data in a vacuum.
  • #13 LSA in particular is the "secret sauce". It is an evolving system rather than an analysis of word groupings at a single point in time, which makes it far more flexible/nimble than competitors in the NLP space. Comparing 600,000 documents for the meaning of each word results in more accurate analysis and better "listening". The semantic services layer essentially broadens the end market to anyone who needs more accurate search: this includes web search (as long as the user is willing to "wait"), e-discovery, email archiving, and potentially more accurate video search based on descriptions/reviews/tagging, etc. CI’s semantic search and analytics technology is unique in its proprietary approach to how data is handled, categorized and measured for relevancy. The proprietary technologies isolate important attributes from groups of authors and reveal unique considerations and preferences, in addition to providing the ability to identify unknown associations occurring through natural online conversation. CI’s technology is used in a compounding fashion, starting with topic categorization, then theme extraction, then trait extraction. Based on highly precise categorization functionality, once the semantic processing engine has been trained for accurate categorization, ongoing analysis becomes repeatable, scalable and reliable.
  • #14 Applying semantic technology to large volumes of data: LSA in particular is the "secret sauce". It is an evolving system rather than an analysis of word groupings at a single point in time, which makes it far more flexible/nimble than competitors in the NLP space. Comparing 600,000 documents for the meaning of each word results in more accurate analysis and better "listening". LSA is a method for exposing latent contextual meaning within a large body of text: more relevant terms carry more weight, which constructs more accurate vectors of how consumers are talking about a category, brand or product. It is able to apply contextual meaning to topics and select conversations based on meaning. Social Search - Categorizing Conversations: The semantic services layer essentially broadens the end market to anyone who needs more accurate search: this includes web search (as long as the user is willing to "wait"), e-discovery, email archiving, and potentially more accurate video search based on descriptions/reviews/tagging, etc. Semantic technology is able to isolate and categorize content. Get all the conversation, not filtered like a Google or Yahoo search. Semantically Surfaced Author Details: Assign important attributes to authors or groups of authors and reveal unique considerations and preferences. Examine actual language used to describe the company, brand or product. Apply traits to posts, then average these traits together to produce an author profile.
  • #15 Semantic analysis is able to differentiate between "goldfish" the fish and "Goldfish" the cracker.
  • #17 Latent semantic analysis allows users to perform an advanced form of filtering, called dimensions, to extract language around pricing, quality and loyalty. This simply cannot be done using keywords.
  • #20 Dimensions: extract specific language around customer service, pricing or issues. Demographics.
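
Note #14 describes applying traits to posts and then averaging them to produce an author profile. A minimal sketch of that averaging step alone might look like the following. The trait names, scores and the `author_profiles` function are hypothetical, and the choice to skip traits missing from a post (rather than count them as zero) is an assumption; the notes do not say how CI scores or weights traits.

```python
from collections import defaultdict

def author_profiles(scored_posts):
    """Average per-post trait scores into one profile per author.

    scored_posts: iterable of (author, {trait: score}) pairs. A trait missing
    from a post is skipped rather than counted as zero (an assumption).
    """
    totals = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(lambda: defaultdict(int))
    for author, traits in scored_posts:
        for trait, score in traits.items():
            totals[author][trait] += score
            counts[author][trait] += 1
    return {
        author: {t: totals[author][t] / counts[author][t] for t in totals[author]}
        for author in totals
    }

# Hypothetical trait scores in [0, 1], invented for illustration.
scored = [
    ("alice", {"price_sensitive": 1.0, "loyal": 0.5}),
    ("alice", {"price_sensitive": 0.5}),
    ("bob", {"loyal": 0.25}),
]
profiles = author_profiles(scored)
```

With this input, the two posts by "alice" average to a single `price_sensitive` score, while her `loyal` score comes from the one post that carried it.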