Semantic vs Sentiment Analysis by Networked Insights

For years "sentiment" has been a popular metric for showing whether customers like or dislike a company’s products and services. But in the social media era, sentiment really doesn't tell you much -- especially when, on average, only one out of four posts indicates sentiment. This is a real problem when you’re wading through tens or hundreds of thousands of posts, trying to figure out how people feel about your company. You miss a lot of clues about what people are really saying and feeling.

An alternative to sentiment analysis is semantic analysis. Semantic analysis uncovers and distills the natural structure in mountains of data: blog posts, social network chatter, tweets and more. In fact, a valuable type of semantic analysis is topic discovery: the summarization of large amounts of text by automatically discovering the topics and themes within. Networked Insights’ new Topic Discovery Engine (TDE) is a semantic analysis system finely tuned to discover topics in social media posts. The valuable information that comes from semantic analysis, quickly and inexpensively, can drive product development, new revenue streams and strategies for marketing, advertising and media planning.


Transcript

  • 1. Why Semantic Analysis is Better than Sentiment Analysis. A White Paper by T.R. Fitz-Gibbon, Chief Scientist, Networked Insights
  • 2. Why semantic analysis is better than sentiment analysis

    “I like it,” “I don’t like it” or “I have no opinion”: sentiment is widely used to measure how customers view a company’s products and services. After all, who doesn’t want to be liked? But does sentiment tell you what you really need to know? Sometimes it does, for example, when you want to understand what people are saying that could affect your brand image. Or you may be interested in how your product fares in a straight-up comparison with a competitor’s.

    Other times, though, sentiment may not provide the insights you’re after. This can be especially true when you’re trying to wade through the huge numbers of mentions and comments appearing in the social media world. A promising alternative to sentiment analysis is “semantic analysis.”

    Don’t be turned off by the name. Simply put, semantic analysis is a way to distill and create structure around mountains of unstructured data (blog posts, social network chatter, tweets and more) without preconceived ideas of whether or how they are related. Semantic analysis refers to a group of methods that allow machines to discover the fundamental patterns of words or phrases that act as building blocks in a large set of text. Topics, themes, sentiment and similar elements of meaning appear as intricate weavings of those fundamental patterns.

    In fact, a valuable type of semantic analysis is topic discovery: the summarization of large amounts of text by automatically discovering the topics and themes within. Networked Insights’ new Topic Discovery Engine (TDE) is a semantic analysis system finely tuned to discover topics in social media posts.

    networkedinsights.com 608.237.1867 info@networkedinsights.com © 2011, Networked Insights, Inc.
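The idea of grouping text by shared word patterns, without preset categories, can be sketched in a few lines. The following is a minimal illustration only, not TDE's algorithm (which the paper does not describe): it assumes a plain bag-of-words representation, cosine similarity, and a greedy grouping rule. The function names, the `threshold` value and the sample posts are all hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(post):
    """Lower-case a post and count its word tokens."""
    return Counter(re.findall(r"[a-z0-9']+", post.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def group_posts(posts, threshold=0.3):
    """Greedy single-pass grouping (an assumed rule, chosen for brevity):
    a post joins the first group whose seed post is similar enough,
    otherwise it starts a new group."""
    groups = []  # list of (seed_vector, member_posts)
    for post in posts:
        vec = bag_of_words(post)
        for seed, members in groups:
            if cosine_similarity(seed, vec) >= threshold:
                members.append(post)
                break
        else:
            groups.append((vec, [post]))
    return [members for _, members in groups]


# Hypothetical posts: none of them is clearly positive or negative.
posts = [
    "how do I change the ringtone on this phone",
    "can I change the ringtone on my phone",
    "battery life on the tablet is short",
    "the tablet battery drains fast",
]
groups = group_posts(posts)
for members in groups:
    print(members)
```

The two ringtone questions end up in one group and the two battery posts in another, even though a positive/negative classifier would likely treat most of them as neutral. A production system would at least down-weight common words such as "the" and "on", which this sketch does not do.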
  • 3. By grouping social media posts based on semantic similarity, rather than preset sentiment categories such as positive, negative and neutral, TDE can help you uncover important information: for example, what exactly people are saying about your product or service; where and how they use it; the features they use most; and the enhancements or new offerings they’re interested in. All of this information can ultimately drive product development, new revenue streams, and strategies for marketing, advertising and media planning.

    Why sentiment falls short

    One problem with sentiment analysis is what it cannot tell you because it only considers a small amount of the available data. Our experience shows that, on average, only about 10 percent of posts actually contain sentiment, either positive or negative, and that’s a generous estimate (Figure 1). This means nine out of 10 posts are neutral, revealing no sentiment, and are effectively being ignored by the analysis. Thus, with sentiment analysis you’re making decisions based on what only 10 percent of the posts are saying.

    The 90 percent of posts that do not reveal sentiment are not all irrelevant; they just don’t fall cleanly into the restrictive positive-negative view of semantics or meaning that sentiment analysis adheres to. For example, many posts about a particular smartphone may come from dedicated, loyal fans who simply have questions about using the device. These are potentially valuable posts as they indicate what users want from the device, problems they may be having with it, and features that could be improved. However, customer questions such as these are rarely classified as positive or negative, so they would be missed by sentiment analysis.

    [Figure 1: Percentage of posts that contain sentiment. Bar chart, vertical axis 0 to 100, categories Positive, Negative, None and Unknown. Data is based on a 500-post sentiment study we conducted. The posts were classified by 20 people each. Posts were assigned to a sentiment category based on a majority vote. Only about 10% of posts were found to contain sentiment.]

    A second problem with sentiment analysis deals with statistical confidence in data. All methods of sentiment analysis rely on example data to design, test or validate the analysis. The accuracy and value of sentiment analysis depend directly on the quality or confidence of the example data.
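The majority-vote labelling described for Figure 1 can be sketched as follows. This is an illustration under stated assumptions, not the study's actual procedure: it assumes "Unknown" means that no label won more than half of the votes (the paper does not define it), and the votes below are made up for the example rather than taken from the 500-post study.

```python
from collections import Counter


def majority_label(votes, fallback="Unknown"):
    """Return the label chosen by more than half of the readers.
    Assumption: posts without such a majority are reported as `fallback`."""
    label, count = Counter(votes).most_common(1)[0]
    return label if count > len(votes) / 2 else fallback


def sentiment_share(votes_per_post):
    """Fraction of posts whose majority label is Positive or Negative."""
    labels = [majority_label(votes) for votes in votes_per_post]
    with_sentiment = sum(1 for label in labels if label in ("Positive", "Negative"))
    return with_sentiment / len(labels)


# Made-up votes: five posts, five readers each (the study used 20 readers per post).
votes_per_post = [
    ["Positive", "Positive", "Positive", "None", "None"],
    ["None", "None", "None", "None", "None"],
    ["None", "None", "None", "Negative", "Positive"],
    ["Negative", "Negative", "None", "None", "Positive"],  # no majority
    ["None", "None", "None", "None", "Positive"],
]
share = sentiment_share(votes_per_post)
print(f"Posts with majority sentiment: {share:.0%}")
```

Only posts where a majority of readers agree on Positive or Negative count toward the share; everything else is the data that a sentiment-only analysis leaves unused.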
  • 4. Because sentiment is subjective, this example data is based on majority opinion rather than truth. For practical reasons, we cannot determine the majority opinion of all readers for each post. Instead, the example data is obtained from a small sample of human readers labeling posts with the type of sentiment they contain (for example: positive, negative or neutral).

    Many companies report that, on average, approximately 65 to 75 percent of readers agree on the sentiment of a post. Assuming one of these companies asks four people about the sentiment of each post, which is very likely, statistics tells us that the company is no more than 35 percent confident it actually has a positive post when its readers identify one. The figure demonstrates this. Data with such low confidence is a poor foundation for sentiment analysis and largely leaves it up to chance: ask a different set of four readers or use a different set of posts, and results could be drastically different.

    [Figure: Confidence intervals for a sample size of four readers. Vertical axis: percent agreement, 0 to 100. Horizontal axis: sentiment of a post. Intervals shown at the 95% and 35% levels. When three out of four readers agree on the sentiment of a post, 35% is the highest confidence level that ensures a majority of readers would consider a post positive. Normally, statistical significance at the 95% level is desired (for research and opinion polls). Most sentiment data only achieves statistical significance at the 35% level. Thus, most sentiment data is not statistically significant (at the 95% level).]

    Sentiment analysis is not inherently bad; for particular types of questions, it may be the right tool. But if you use it, make sure the data underlying the analysis is sound and valuable data is not being ignored.

    Semantic analysis gives you much more

    If you really want to discover and understand the conversations around your company, products, services and brand, you need to be open to what all of the data tells you. Semantic analysis is a better way to do that than sentiment analysis for several reasons.

    In contrast to sentiment analysis, semantic analysis can take every post from a data set into account and can even identify clear trends within groups of posts.
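To see why four readers give so little statistical confidence, one can compute a standard confidence interval for the true share of readers who would agree. The sketch below uses the Wilson score interval, which is a common textbook choice and an assumption here: the paper does not say how its 35 percent figure was derived, and this calculation does not attempt to reproduce it.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion.
    z=1.96 corresponds to roughly a 95% confidence level."""
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = successes / n
    denom = 1 + z * z / n
    centre = (p_hat + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


# Three of four readers call a post positive (75% agreement).
low_4, high_4 = wilson_interval(3, 4)
# Same 75% agreement, but with 20 readers, as in the Figure 1 study.
low_20, high_20 = wilson_interval(15, 20)

print(f"4 readers:  [{low_4:.2f}, {high_4:.2f}]")
print(f"20 readers: [{low_20:.2f}, {high_20:.2f}]")
```

With four readers, the 95% interval extends well below 50 percent, so the sample cannot rule out that most readers would not call the post positive. With 20 readers at the same agreement rate, the lower bound rises above 50 percent.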
  • 5. It’s not limited to a positive-negative framework and doesn’t exclude neutral posts, unlike sentiment analysis in the smartphone example previously discussed. In this way, semantic analysis gives you clear insights into what’s happening in the aggregate across a large number of posts without your having to read all of them, an inefficient or impossible task. In short, semantic analysis can find any trend in the data as long as it exists in significant enough numbers.

    Another important advantage of semantic analysis is that it isn’t restricted by a narrow view of meaning or semantics. Sentiment, after all, is semantics: “What is the author trying to communicate in this post?” But people rarely post to a social network with the intent of simply expressing that they either like or dislike a product, company or idea; most forms of meaning are more complex and varied. Semantic analysis reveals the meaning or topics that sentiment analysis ignores.

    A final advantage of semantic analysis is unique to Networked Insights. Our TDE uses an advanced form of semantic analysis to produce “topic trees”: it organizes the topics it discovers into a tree-like structure, allowing you to drill into a topic to see the subtopics within it. A tree structure is highly effective for organizing large amounts of data. It makes the process of finding valuable insights, quite literally, exponentially faster than having to search a flat set of topics.

    [Figure: Networked Insights’ “topic tree” using semantic analysis. Node labels include iPad 2, Android, Motorola Xoom, Android Honeycomb, Android Tablet, HTC Flyer, PlayBook, next gen iPads, price drop and iPhone 4 retina display. The size of the node represents volume of conversation.]

    In the end, it’s about you and what you’re looking for

    Ultimately, you are the best judge of information about your company. You understand your domain best, which topics are important and which are not. At the same time, it’s important to inject subjectivity into the process as late as possible to avoid biasing the analytic results.
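The drill-down idea behind a topic tree, and the "exponentially faster" claim, can be made concrete with a small sketch. Everything here is hypothetical: the `TopicNode` class, the arrangement of topics (labels are borrowed from the figure, but the hierarchy and the volume counts are invented), and the simplifying assumption of a perfectly balanced tree.

```python
from dataclasses import dataclass, field


@dataclass
class TopicNode:
    label: str
    volume: int = 0  # number of posts under this topic (made-up values below)
    children: list = field(default_factory=list)

    def drill(self, *path):
        """Follow a sequence of child labels and return the node reached."""
        node = self
        for label in path:
            match = next((c for c in node.children if c.label == label), None)
            if match is None:
                raise KeyError(f"no subtopic {label!r} under {node.label!r}")
            node = match
        return node


def topics_scanned_tree(branching, depth):
    """Topics read when drilling down: one list of `branching` options per level."""
    return branching * depth


def topics_scanned_flat(branching, depth):
    """Topics in the equivalent flat list: every leaf of the tree."""
    return branching ** depth


# Assumed hierarchy for illustration only.
root = TopicNode("tablets", 1000, [
    TopicNode("iPad 2", 600, [TopicNode("price drop", 150), TopicNode("next gen iPads", 200)]),
    TopicNode("Android", 400, [TopicNode("Motorola Xoom", 180), TopicNode("Android Honeycomb", 120)]),
])

xoom = root.drill("Android", "Motorola Xoom")
print(xoom.label, xoom.volume)
print(topics_scanned_tree(10, 3), "vs", topics_scanned_flat(10, 3))
```

With 10 subtopics per level and 3 levels, a reader scans at most 30 topic labels to reach any leaf, compared with up to 1,000 in a flat list. The gap grows exponentially with depth, which is one reasonable reading of the claim above.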
  • 6. Semantic analysis with TDE considers these factors. Rather than having a machine or human readers judge the subjective sentiment of every post and then aggregate some output, TDE groups similar posts and summarizes the topics. Then, at the last stage, you or another qualified professional can examine the output and decide which topics are relevant, which are not and what they mean in the given context.

    A tool for these times

    Social media information is expanding at a challenging pace, and valuable nuggets can come from the most unexpected places. Semantic analysis with TDE can help you harness and make sense of it all. Most exciting, automatic topic discovery with TDE gives you tremendous latitude around how you approach the analysis. You don’t have to be certain about what you’re looking for.

    Instead, it’s a journey to discovery, not a set path that may lead to inadequate insights or misleading conclusions. With TDE’s semantic analysis, you can cost-effectively learn volumes about how your company and your products and services are being judged in the marketplace, so much that you’ll have little time to be sentimental. We love the challenge of finding insights in all this data; our challenge is your success!

    Networked Insights was founded in 2006 by industry leaders and seasoned entrepreneurs in the fields of social media and customer intelligence. Headquarters are in Madison, WI, with offices in New York and Chicago.

    T.R. Fitz-Gibbon is the chief scientist at Networked Insights. His team designs the Natural Language Processing and Artificial Intelligence algorithms that power the company’s software. His background is in electrical engineering, computer engineering, and computer science with a focus on machine learning. T.R.’s passion lies in using machine learning and big-data techniques to find great solutions to problems that are too large and complex to have perfect solutions.