One of the big trends we see happening in customer feedback at the moment is the explosion in the amount of free-form text feedback – both solicited and spontaneous comments – coming in through various channels. Companies nowadays have access to a multitude of options for analyzing these large numbers of text comments.
In this whitepaper, we aim to give an overview of all analysis options available. Furthermore, we identify the optimal method of analyzing free-form text feedback in order to yield actionable insight that can be easily shared with key stakeholders throughout the organization.
Etuma whitepaper On Extracting Meaning And Feeling - Enterprise Feedback Analysis Today
ON EXTRACTING MEANING AND FEELING: ENTERPRISE FEEDBACK ANALYSIS TODAY
What are the technologies and solutions currently available for analyzing free-form customer feedback?
Contents
Introduction
Requirements of an effective free-form customer feedback analysis system
Manual analysis is slow, expensive and mostly ineffective
Extracting keywords through tag clouds can be visually impressive, but...
You can detect the whole comment sentiment, but...
Statistical analysis: takes time to sift through historic data
Analyzing feedback the natural way: rule-based semantic analysis

Introduction
Since merchants began producing goods and services to sell, their customers have had opinions about those goods and services. The advent of the telephone and, subsequently, personal computers and the Internet made it easy for large numbers of customers to share their opinions. Companies soon realized the value in collecting this feedback and, today, are actively pursuing feedback from their customers and potential customers. Some companies have even gone a step further and begun to systematically analyze the feedback from the numerous channels at their disposal.
Up to now, companies have focused on gathering structured feedback, typically in the form of rating various aspects of a product or service on a numerical scale. This has enabled them to affordably and systematically analyze the data, recognizing trends and patterns over periods of time. But this method of gathering feedback has some serious shortcomings. The chief one is that real wisdom and insight are often found in unstructured, free-form feedback, which numerical ratings do not capture. The use of unstructured feedback minimizes the leading of customers’ thoughts, allowing them to openly express what it is they actually want from or think of a product.
The volume of this type of free-form feedback is growing because of:
• Social media
• Transaction-based queries: How was your flight? How was your user experience in this purchase? How was your car maintenance?
• Net Promoter Score-type surveys: How willing are you to recommend our product on a scale of 0-10? Why? What would you improve?
The analysis of free-form customer feedback is an emerging market with high-growth potential, and a vast amount of money will be spent on these services in the next three years. In 2011, the market was valued at over one billion US dollars with an estimated twenty-three to twenty-seven percent Compound Annual Growth Rate (Gartner 2012).
The companies that provide the most accurate, efficient, and low-cost solutions for analyzing free-form customer feedback will dominate the market and earn a substantial portion of this revenue.
This paper will look at the technologies and solutions currently available for analyzing free-form customer feedback.

Requirements of an effective free-form customer feedback analysis system
First, let’s look at the components needed to make a customer feedback analysis system useful in an organization:
• Prioritization of feedback to make it actionable: requires alerts and trend analysis
• Speed and real-time analysis
• Automation
• Flexibility and ability to learn: categories need to be dynamically updated as feedback topics change
• Consistency: minimizing the impact of human interpretation and subjectivity
• Sentiment detection
• Capacity to analyze multiple languages simultaneously
• Grouping of keywords/terms/phrases which have similar meaning: “not expensive” detected correctly as “cheap”
• Detection of various discussion topics and sentiments inside the same feedback item
• Affordability: avoiding the need for long initial projects
• Name detection: names vs. brand names
• Readiness to start the analysis on Day 1: bypassing the need to first execute a time-consuming analysis of historical data
The analysis of free-form text is not a simple task, and the requirements for an effective free-form customer feedback analysis system are diverse.

Manual analysis is slow, expensive and mostly ineffective
Traditionally, enterprise feedback analysis has relied on inefficient and expensive manual labor, which is prone to errors and inconsistencies. This, in fact, is still the predominant method of feedback analysis today. Large numbers of people are employed to manually read feedback messages gathered from various sources and to classify them according to some preset categories, which are typically related to the company’s organizational or product structure. They then set a sentiment score for the entire response, even in the case of long text responses spanning many paragraphs.
There are a number of issues with this method of feedback analysis.
Aside from the major cost of employing enough people to keep up with the continuously growing flow of feedback in large companies, there are various problems with the accuracy and efficiency of such operations. Humans tend not to be very good at discrete and consistent categorization. In companies employing large staffs due to high feedback volumes, this problem is compounded by the fact that individuals with different backgrounds create very different analysis results.
Because the categories are rigidly preset by the company, when a new topic emerges spontaneously, it can take valuable time to create a new category. Comments coming in the interim may be lost completely. Feedback in multiple languages further complicates the issue.
And doing this work manually is slow. If a crisis arises with a massive number of claims or social media discussions, the crucial time for action may have passed before the company is even aware of the problem.
Finally, because of the inefficiencies of such a manual process, analysts are often forced to concentrate on negative comments where corrective action may be needed. Typically, if there are both positive and negative comments in a single response, the analyst will focus on the negative ones, categorizing and determining the sentiment accordingly. Positive comments and other seemingly minor subjects in the feedback may not be taken into account at all.
In summary, manual processing is inconsistent, slow and expensive. Let’s look at some of the alternatives available.

Extracting keywords through tag clouds can be visually impressive, but...
Technological developments have made possible great advances in analyzing feedback, and today there are dozens of companies offering services like statistical reporting and primitive language analysis technologies.
There are quite a few free or Open Source solutions for keyword extraction. But while keywords can be useful in marketing, an unstructured list of keywords is impossible to turn into reliable, consistent and actionable trend analysis. Simply knowing how many times a specific word is used is not enough. There is no grouping of keywords or keyword-based sentiment. Without this deeper insight, data cannot be turned into statistically manageable information.
Another problem with keyword extraction is the method used for returning words to their base form: ‘run’ and ‘running’ should both be returned as ‘run’. The typical method of returning keywords to their base form is stemming.
In English, this means stripping suffixes such as “ed”, “s”, or “ing” from a word to reach its base form, as well as mapping irregular verbs. But, in fact, English is one of the few languages where stemming yields relatively accurate results. In most other languages, keyword extraction requires morphological analysis. Without this, each inflected form of a word is presented as a completely different word.

You can detect the whole comment sentiment, but...
Some companies offer sentiment analysis of social media feedback. This type of analysis detects whether a comment is positive or negative about a specific term by searching comments for specific pre-determined terms (brands, models, etc.) and checking for positive and negative words near that term.
Detecting sentiment in this manner results in the following type of information:
• Last month “OUR BRAND” got mentioned 300,000 times. Out of those, 180,000 were positive and the rest negative.
• In the previous month “OUR BRAND” got mentioned 320,000 times. Out of those, 195,000 were positive and the rest negative.
What can be done with this type of information? Of course, long-term trends can be analyzed and historical crises can be pinpointed, but there is no real insight into what people are saying. This type of analysis provides some information, but it is neither detailed nor accurate enough for the people responsible for the relevant area of the business to act on.
The problem with most sentiment detection technologies is that they are based on statistical approaches and, consequently, are highly inaccurate. To reach acceptable levels of accuracy, they require expensive months-long projects conducted by experts, and so are out of reach for all but the largest enterprises.

Statistical analysis: takes time to sift through historic data
Semantic statistical analysis involves creating an immense corpus of text strings, which are mapped one-by-one to predetermined topics and sentiments. It is often regarded as an adequate technology, but the cost of deploying such a system again prohibits all but the largest companies from even considering it. This type of analysis requires a massive amount of historical data to build a corpus large enough for the results to be consistent and comparable over time. Building up such a dataset takes a lot of time and comes at a high cost.
A handful of companies have gone a step further, trying to automatically derive meaning and topic-specific sentiment. In order to set up a database that enables pattern matching, they run a language mapping project before the launch of the service.
Such projects are usually costly and time-consuming, since they involve lots of manual work and low-level linguistic technologies. Moreover, the resulting analysis can only be done in the language originally configured, with new languages requiring separate projects. These systems have very limited capacity to learn, which means that each new set of terms or concepts (new products, new sales channels, new memes) requires a separate project before analysis can commence. This is especially difficult when it comes to social media, where the language is continuously changing and evolving.
Semantic statistical analysis is always customer-specific, so it lacks a fundamentally important component: universal system learning. In the statistical approach, a system’s analysis must be limited to customer-specific or industry-specific data, because this is all the information available in that system.
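To make the corpus-mapping idea concrete, here is a minimal sketch in Python. It is an illustration only, not any vendor's implementation: the corpus entries, the topic names and the helper names normalize and classify are all assumptions invented for this example. It shows text strings mapped one-by-one to a topic and a sentiment, and why wording that is not in the corpus goes unrecognized.

```python
# Toy illustration of corpus-based mapping. Every entry below is a
# hypothetical example; a production corpus would be built from large
# volumes of historical feedback.
CORPUS = {
    "the flight was late": ("punctuality", "negative"),
    "friendly cabin crew": ("staff", "positive"),
    "tickets are expensive": ("price", "negative"),
}


def normalize(text):
    """Lowercase and collapse whitespace so trivial variants still match."""
    return " ".join(text.lower().split())


def classify(comment):
    """Return (topic, sentiment) for a known string, or None if unseen."""
    return CORPUS.get(normalize(comment))
```

Under these assumptions, classify("The flight was LATE") finds the stored entry, while classify("Our plane departed two hours behind schedule") returns None even though it expresses the same complaint. Covering such variation means adding ever more strings to the corpus, which is where the historical data and the up-front project work come in.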
Some statistical semantic analysis systems use a low level of corpus sharing and advanced language tools, which reduce the amount of customer-specific work up front. And these systems are being continuously improved by taking advantage of lessons from ongoing customer projects.

Analyzing feedback the natural way: rule-based semantic analysis
A rule-based semantic analysis system, on the other hand, understands language structures and the relationships between words. It has a massive set of pre-configured rules that enable automatic and accurate real-time analysis. Developing a rule-based language analysis system takes a considerable amount of work and time, sometimes stretching even up to twenty years. Very few companies have the skills or resources to develop such a system.
Once the system has been developed, however, it is ready to be applied to any free-format text, guaranteeing high levels of accuracy. Long configuration and implementation projects are not necessary in rule-based semantic analysis, because the system combines decades of projects in which all the necessary rules have been gathered and a world view has been created: intelligence has been built into the system.
A rule-based system also makes accurate sentiment detection much more likely. Systems such as these are especially suited for running customer feedback analyses. Positive or negative opinions can be expressed in so many different ways that it would be impossible to list all possible ways in a matching system. And a typical rule-based system can also handle idiosyncrasies like bad grammar, misspellings, and jargon.
So, the likelihood of detecting the right sentiment “out of the box” is much higher for a rule-based semantic analysis system than for a statistical system.
In a rule-based system, all customers’ data go through the same analysis engine. Knowledge is recycled, so everybody learns from everybody else. Only in this way can all incoming new messages teach the system and all customers get the benefit of this learning. This is especially important when analyzing input from environments such as social media, in which the language and discussion topics change frequently over time.
Buying and running a rule-based semantic analysis system is significantly cheaper than buying and running a statistical semantic analysis system. The price level and ease of entry of a rule-based system open the market to small- and medium-size firms for whom the price of a statistical semantic analysis system is absolutely prohibitive. Rule-based systems enable companies of all sizes to get started on their analysis immediately, allowing them to focus on what is important: quick, real-time actionable insights on customer feedback topics, sentiments and trends.
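As a rough illustration of the rule-based idea, the Python sketch below applies three toy rules: split a comment into clauses, look up topic and opinion words, and flip the polarity of an opinion word that follows a negation. It is a deliberately tiny example and not Etuma's engine; the lexicons TOPIC_TERMS, POLARITY and NEGATIONS and the function analyze are hypothetical names made up here, and a real system would hold a vastly larger rule set.

```python
import re

# Hypothetical mini-lexicons, invented for this sketch.
TOPIC_TERMS = {"price": "price", "expensive": "price", "cheap": "price",
               "staff": "staff", "crew": "staff", "service": "service"}
POLARITY = {"expensive": -1, "cheap": 1, "friendly": 1, "rude": -1,
            "slow": -1, "great": 1}
NEGATIONS = {"not", "never", "no"}


def analyze(comment):
    """Return one (topic, sentiment) pair per clause that has both."""
    results = []
    # Rule 1: punctuation and "but"/"and" separate clauses, so one
    # feedback item can yield several topics with different sentiments.
    for clause in re.split(r"[.,;!?]|\bbut\b|\band\b", comment.lower()):
        topic, score, negate = None, 0, False
        for token in re.findall(r"[a-z']+", clause):
            if token in NEGATIONS:
                # Rule 2: a negation flips the next opinion word.
                negate = True
                continue
            if token in TOPIC_TERMS and topic is None:
                topic = TOPIC_TERMS[token]
            if token in POLARITY:
                # Rule 3: add the (possibly flipped) polarity to the score.
                score += -POLARITY[token] if negate else POLARITY[token]
                negate = False
        if topic and score:
            results.append((topic, "positive" if score > 0 else "negative"))
    return results
```

With these toy rules, analyze("The tickets were not expensive, but the crew was rude.") returns [("price", "positive"), ("staff", "negative")]: “not expensive” lands in the same group as “cheap”, and two topics with opposite sentiments are detected inside one feedback item. No per-customer corpus of complete strings is needed, only the shared rules.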

WOULD YOU LIKE TO LEARN MORE ABOUT ETUMA’S CUSTOMER FEEDBACK ANALYSIS SERVICE?
To find out how our rule-based semantic analysis service can be of benefit to your company, visit www.etuma.com or call Matti +358.40.822.2010.