Sentiment analysis and opinion mining is almost same thing however there is minor difference between them that is opinion mining extracts and analyze people's opinion about an entity while Sentiment analysis search for the sentiment words/expression in a text and then analyze it.
It uses machine learning techniques like SVM (Support Vector Machines) to analyze the text and classify them as positive, negative or neutral.
Sentiment Analysis/Opinion Mining of Twitter Data on Unigram/Bigram/Unigram+Bigram Model using:
1. Machine Learning
2. Lexical Scores
3. Emoticon Scores
YouTube Video: https://youtu.be/VuR16P87yPE
Link to the WebPage: http://akirato.github.io/Twitter-Sentiment-Analysis-Tool
Github Page: https://github.com/Akirato/Twitter-Sentiment-Analysis-Tool
Sentiment analysis is essential operation to understand the polarity of particular text, blog etc. This presentation has introduction to SA and the approaches in which they can be designed.
It gives an overview of Sentiment Analysis, Natural Language Processing, Phases of Sentiment Analysis using NLP, brief idea of Machine Learning, Textblob API and related topics.
Sentiment analysis and opinion mining is almost same thing however there is minor difference between them that is opinion mining extracts and analyze people's opinion about an entity while Sentiment analysis search for the sentiment words/expression in a text and then analyze it.
It uses machine learning techniques like SVM (Support Vector Machines) to analyze the text and classify them as positive, negative or neutral.
Sentiment Analysis/Opinion Mining of Twitter Data on Unigram/Bigram/Unigram+Bigram Model using:
1. Machine Learning
2. Lexical Scores
3. Emoticon Scores
YouTube Video: https://youtu.be/VuR16P87yPE
Link to the WebPage: http://akirato.github.io/Twitter-Sentiment-Analysis-Tool
Github Page: https://github.com/Akirato/Twitter-Sentiment-Analysis-Tool
Sentiment analysis is essential operation to understand the polarity of particular text, blog etc. This presentation has introduction to SA and the approaches in which they can be designed.
It gives an overview of Sentiment Analysis, Natural Language Processing, Phases of Sentiment Analysis using NLP, brief idea of Machine Learning, Textblob API and related topics.
This presentation consist of detail description regarding how social media sentiments analysis is performed , what is its scope and benefits in real life scenario.
Sentiment analysis - Our approach and use casesKarol Chlasta
I. Introduction to Sentiment Analysis and its applications.
II. How to approach Sentiment Analysis?
III. 2015 Elections in Poland on Twitter.com & Onet.pl.
Sentiment analysis of Twitter data using pythonHetu Bhavsar
Twitter is a popular social networking website where users posts and interact with messages known as “tweets”. To automate the analysis of such data, the area of Sentiment Analysis has emerged. It aims at identifying opinionative data in the Web and classifying them according to their polarity, i.e., whether they carry a positive or negative connotation. We will attempt to conduct sentiment analysis on “tweets” using various different machine learning algorithms.
This Project Aimed at doing a comprehensive study of Different Machine Learning Approaches on Sentiment Analysis of Movie Reviews. Support Vector Machines were the one that Performed Most Accurately with Radial Basis Function. Lots of Other kernel functions and Kernel Parameters were tried to find the optimal one. We achieved accuracy up to 83%.
Review of Natural Language Processing tasks and examples of why it is so hard. Then he describes in detail text categorization and particularly sentiment analysis. A few common approaches for predicting sentiment are discussed, going even further, explaining statistical machine learning algorithms.
Make a query regarding a topic of interest and come to know the sentiment for the day in pie-chart or for the week in form of line-chart for the tweets gathered from twitter.com
Supervised Learning Based Approach to Aspect Based Sentiment AnalysisTharindu Kumara
Aspect Based Sentiment Analysis (ABSA) systems receive as input a set of texts (e.g., product reviews) discussing a particular entity (e.g., a new model of a laptop). The systems attempt to
identify the main (e.g., the most frequently discussed) aspects (features) of the entity (e.g., battery, screen) and to estimate the average sentiment of the texts per aspect (e.g., how positive or negative the opinions are on average for each aspect).
This presentation consist of detail description regarding how social media sentiments analysis is performed , what is its scope and benefits in real life scenario.
Sentiment analysis - Our approach and use casesKarol Chlasta
I. Introduction to Sentiment Analysis and its applications.
II. How to approach Sentiment Analysis?
III. 2015 Elections in Poland on Twitter.com & Onet.pl.
Sentiment analysis of Twitter data using pythonHetu Bhavsar
Twitter is a popular social networking website where users posts and interact with messages known as “tweets”. To automate the analysis of such data, the area of Sentiment Analysis has emerged. It aims at identifying opinionative data in the Web and classifying them according to their polarity, i.e., whether they carry a positive or negative connotation. We will attempt to conduct sentiment analysis on “tweets” using various different machine learning algorithms.
This Project Aimed at doing a comprehensive study of Different Machine Learning Approaches on Sentiment Analysis of Movie Reviews. Support Vector Machines were the one that Performed Most Accurately with Radial Basis Function. Lots of Other kernel functions and Kernel Parameters were tried to find the optimal one. We achieved accuracy up to 83%.
Review of Natural Language Processing tasks and examples of why it is so hard. Then he describes in detail text categorization and particularly sentiment analysis. A few common approaches for predicting sentiment are discussed, going even further, explaining statistical machine learning algorithms.
Make a query regarding a topic of interest and come to know the sentiment for the day in pie-chart or for the week in form of line-chart for the tweets gathered from twitter.com
Supervised Learning Based Approach to Aspect Based Sentiment AnalysisTharindu Kumara
Aspect Based Sentiment Analysis (ABSA) systems receive as input a set of texts (e.g., product reviews) discussing a particular entity (e.g., a new model of a laptop). The systems attempt to
identify the main (e.g., the most frequently discussed) aspects (features) of the entity (e.g., battery, screen) and to estimate the average sentiment of the texts per aspect (e.g., how positive or negative the opinions are on average for each aspect).
Sentiment analysis, also known as opinion mining, is a field of computer science that focuses on automatically identifying the opinions and feelings expressed in text, audio and video. It aims to determine whether a document expresses a subjective view (positive, negative, or neutral) or presents objective facts.
Sentiment analysis involves determining the sentiment expressed by a writer in a document. The objective of the opinion-mining field is to conduct subjectivity analysis, indicating whether a document is subjective or objective. Subjectivity implies the presence of sentiment, while objectivity signifies content devoid of sentiment. Currently, an abundance of information about a specific product is available, with a single product often garnering hundreds of reviews across various webpages. Numerous websites, such as imdb.com, amazon.com, idlebrain.com, among others, aggregate user information and expert opinions to publish reviews. Experts meticulously analyze reviews, extract opinions, and generate ratings related to the dataset provided by the requesting agencies. However, handling the vast amount of data is a labor-intensive task for experts. The continuously growing volume of web data poses challenges in extracting precise opinions from content. Hence, there is a need to design a system that can efficiently perform these tasks with human-like accuracy.
In this research work, the propose approach enough capable of handling and analyzing large amounts of reviews. The reviews considered of analyzing are pre-analyzed with existing algorithms and further processed through the approach proposed in the present research work. The working capacity of the proposed approach extracts sentiment from the available content (dataset) and determines polarity degree using sentiment polarity and degree management. It also measures sentiment degrees based on user-provided target document features. The outcome is a summary comprising highly sentiment-related sentences, providing valuable insights to the users. The goal is to streamline sentiment analysis processes and enhance accuracy in a manner that aligns with human-like comprehension.
Sentiment analysis, also known as opinion mining, is a field of computer science that focuses on automatically identifying the opinions and feelings expressed in text, audio and video. It aims to determine whether a document expresses a subjective view (positive, negative, or neutral) or presents objective facts.
Sentiment analysis involves determining the sentiment expressed by a writer in a document. The objective of the opinion-mining field is to conduct subjectivity analysis, indicating whether a document is subjective or objective. Subjectivity implies the presence of sentiment, while objectivity signifies content devoid of sentiment. Currently, an abundance of information about a specific product is available, with a single product often garnering hundreds of reviews across various webpages. Numerous websites, such as imdb.com, amazon.com, idlebrain.com, among others, aggregate user information and expert opinions to publish reviews. Experts meticulously analyze reviews, extract opinions, and generate ratings related to the dataset provided by the requesting agencies. However, handling the vast amount of data is a labor-intensive task for experts. The continuously growing volume of web data poses challenges in extracting precise opinions from content. Hence, there is a need to design a system that can efficiently perform these tasks with human-like accuracy.
In this research work, the propose approach enough capable of handling and analyzing large amounts of reviews. The reviews considered of analyzing are pre-analyzed with existing algorithms and further processed through the approach proposed in the present research work. The working capacity of the proposed approach extracts sentiment from the available content (dataset) and determines polarity degree using sentiment polarity and degree management. It also measures sentiment degrees based on user-provided target document features. The outcome is a summary comprising highly sentiment-related sentences, providing valuable insights to the users. The goal is to streamline sentiment analysis processes and enhance accuracy in a manner that aligns with human-like comprehension.
Aspect Level Sentiment Analysis for Arabic LanguageMido Razaz
This is the presentation I used in my proposal seminar for master degree in ISSR.
the thesis about Aspect Level Sentiment Classification for Arabic Language.
Any further info. please contact me at (razaz_2006@hotmail.com)
Aspect-level sentiment analysis of customer reviews using Double PropagationHardik Dalal
Aspect-Based Sentiment Analysis (ABSA) of customer reviews is one of the on going research in Data Mining domain. The algorithm used to detect aspect from reviews using Double Propagation. It uses PageRank to rank the aspect which is based on occurrence.
A Survey on Opinion Mining and its Challengesijsrd.com
Today opinion mining has become one of the latest emerging fields of technology, where people are becoming keen observers of the opinions. Opinion mining is a process of extracting the opinions of the users given when they buy some product or they have the knowledge related to that domain. Thus in this paper we have shown the various aspect elements of the opinion mining as a survey. There are various way in which the opinion can be analysed and retrieved, here we have surveyed on the technique called sentiment analysis which has various classifications in it. As the number of Internet users increases the reviews also keep on increasing, thus we have some major challenges which prevail in the opinion mining hence we have tried to classify the various problems related to it.
One fundamental problem in sentiment analysis is categorization of sentiment polarity. Given a piece of written text, the problem is to categorize the text into one specific sentiment polarity, positive or negative (or neutral). Based on the scope of the text, there are three distinctions of sentiment polarity categorization, namely the document level, the sentence level, and the entity and aspect level. Consider a review “I like multimedia features but the battery life sucks.†This sentence has a mixed emotion. The emotion regarding multimedia is positive whereas that regarding battery life is negative. Hence, it is required to extract only those opinions relevant to a particular feature (like battery life or multimedia) and classify them, instead of taking the complete sentence and the overall sentiment. In this paper, we present a novel approach to identify pattern specific expressions of opinion in text.
Most of us use e-shopping (Any product) these days and refer its rating or reviews before we download or buy that product. Amazon/Play store provide a great number of products but unfortunately few of those product reviews are fraud. Hence such products must be marked, so that they will be recognizable for rest of the users. Here we are comparing reviews from two sites so that we can get more clear idea. We can get higher probability of getting real reviews if we take data from multiple sites. We are proposing a system to develop an android application that will take reviews from two different websites for single product, and analyze them with NLP for positive or negative rating. In this, user will give two different URLs of two different sites for same product to the system as input. For every URL reviews and comments will be fetched separately and analyzed with NLP for positive negative rating. Then their rating will be combined together with average to give final rating for the product. As we are handling the big data here, we are using Hadoops map reduce. So it will be easier to decide which product reviews are fraud or not.
Design of Automated Sentiment or Opinion Discovery System to Enhance Its Perf...idescitation
In today’s social networking era, if one has to make
decision about any product, service or individual performance,
the availability of various comments, suggestions, ratings,
and feedbacks are abundant. The required decision support
data can be collected through different sources of Medias like
newspapers, blogs, and discussion forums and from internet
too. So surely, it leads to the selection of best product, service
or individual if it is analyzed efficiently. In leading and
competitive world, this is huge and practical need of industries,
organizations to empower their qualities. In the recent years,
the significant study is done in the field of sentiment analysis.
However, the earlier work focused the implementation and
evaluation of individual sub technique of sentiment analysis.
Though these implementations produces significant results
of sentiment or opinion analysis, the trust of decision makers
is still in dangling to accept the results of such analysis. In
this paper, initially, we have been described the brief review
about the sentiment or opinion analysis system. Then the
details are provided about the design and about how to build
an automated opinion discovery system to enhance
performance of sentiment or opinion analysis based on feature
extraction sentiment analysis sub technique, natural language
processing and data mining techniques in an integrated way
A proposed Novel Approach for Sentiment Analysis and Opinion Miningijujournal
as the people are being dependent on internet the requirement of user view analysis is increasing
exponentially. Customer posts their experience and opinion about the product policy and services. But,
because of the massive volume of reviews, customers can’t read all reviews. In order to solve this problem,
a lot of research is being carried out in Opinion Mining. In order to solve this problem, a lot of research is
being carried out in Opinion Mining. Through the Opinion Mining, we can know about contents of whole
product reviews, Blogs are websites that allow one or more individuals to write about things they want to
share with other The valuable data contained in posts from a large number of users across geographic,
demographic and cultural boundaries provide a rich data source not only for commercial exploitation but
also for psychological & sociopolitical research. This paper tries to demonstrate the plausibility of the idea
through our clustering and classifying opinion mining experiment on analysis of blog posts on recent
product policy and services reviews. We are proposing a Nobel approach for analyzing the Review for the
customer opinion
A proposed novel approach for sentiment analysis and opinion miningijujournal
as the people are being dependent on internet the requirement of user view analysis is increasing
exponentially. Customer posts their experience and opinion about the product policy and services. But,
because of the massive volume of reviews, customers can’t read all reviews. In order to solve this problem,
a lot of research is being carried out in Opinion Mining. In order to solve this problem, a lot of research is
being carried out in Opinion Mining. Through the Opinion Mining, we can know about contents of whole
product reviews, Blogs are websites that allow one or more individuals to write about things they want to
share with other The valuable data contained in posts from a large number of users across geographic,
demographic and cultural boundaries provide a rich data source not only for commercial exploitation but
also for psychological & sociopolitical research. This paper tries to demonstrate the plausibility of the idea
through our clustering and classifying opinion mining experiment on analysis of blog posts on recent
product policy and services reviews. We are proposing a Nobel approach for analyzing the Review for the
customer opinion.
Determine the sentiment of sentence that is positive or negative based on the presence of part of
speech tag, the emoticons present in the sentences. For this research we use the most popular microblogging sit
twitter for sentiment orientation. In this paper we want to extract tweets form the twitter related to the product
like mobile phones, home appliances, vehicle etc. After retrieving tweets we perform some preprocessing on it
like remove retweets, remove tweets containing few words with minimum threshold of length five, remove tweets
containing only urls. After this the remaining tweets are pre-processed like that transform all letters of the
tweets to the lower case then remove punctuation from the tweets because it reduces the accuracy of result.
After this remove extra white spaces from the tweets, then we apply a pos tagger to tag each word. The tuple
after the applying above steps contain (word, pos tag, English-word, stop-word). We are interested in only
tweets that contain opinion and eliminate the remaining non-opinion tweets from the data set. For this we use
the Naïve Bays classification algorithm. After this we use short text classification on tweets i.e., the word having
different meaning in different domain. In order to solve this problem we use two different feature selection
algorithms the mutual information (MI) and the X2 feature selection. At final stage predicting the orientation of
an opinion sentence that is positive or negative as we mentioned above. For this we use two model like unigram
model and opinion miner.
Comparison between cbow, skip gram and skip-gram with subword information (1)Kavita Ganesan
A practical comparison between the different Word2Vec architectures. These are companion slides for this article: https://kavita-ganesan.com/comparison-between-cbow-skipgram-subword/
Comparison between cbow, skip gram and skip-gram with subword informationKavita Ganesan
A practical comparison between the different Word2Vec architectures. These are companion slides for this article: https://kavita-ganesan.com/comparison-between-cbow-skipgram-subword/
This was presented at the 2014 IEEE Conference on Big Data
For details:
http://kavita-ganesan.com/content/general-supervised-approach-segmentation-clinical-texts
Citation:
Ganesan, Kavita, and Michael Subotin. "A General Supervised Approach to Segmentation of Clinical Texts."
This talk was given at the Sentiment Analysis Innovation Summit on how to leverage large amounts of opinions to help users make decisions. Topics include methods to abstract out opinions, opinion-driven search engine and how FindiLike Hotel Search uses some of the state-of-the-art opinion-driven decision making tools.
Micropinion Generation: An Unsupervised Approach to Generating Ultra-Concise Summaries of Opinions, WWW '12
This work presents a new unsupervised approach to generating ultra-concise summaries of opinions focusing on three aspects: compactness, representativeness and readability.
By: Kavita Ganesan
A brief description of the Opinion-Based Entity Ranking paper published in the Information Retrieval Journal, Volume 15, Number 2, 2012.
Slides By Kavita Ganesan.
Transcript: Selling digital books in 2024: Insights from industry leaders - T...BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
GraphRAG is All You need? LLM & Knowledge GraphGuy Korland
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
Elevating Tactical DDD Patterns Through Object CalisthenicsDorra BARTAGUIZ
After immersing yourself in the blue book and its red counterpart, attending DDD-focused conferences, and applying tactical patterns, you're left with a crucial question: How do I ensure my design is effective? Tactical patterns within Domain-Driven Design (DDD) serve as guiding principles for creating clear and manageable domain models. However, achieving success with these patterns requires additional guidance. Interestingly, we've observed that a set of constraints initially designed for training purposes remarkably aligns with effective pattern implementation, offering a more ‘mechanical’ approach. Let's explore together how Object Calisthenics can elevate the design of your tactical DDD patterns, offering concrete help for those venturing into DDD for the first time!
Generating a custom Ruby SDK for your web service or Rails API using Smithyg2nightmarescribd
Have you ever wanted a Ruby client API to communicate with your web service? Smithy is a protocol-agnostic language for defining services and SDKs. Smithy Ruby is an implementation of Smithy that generates a Ruby SDK using a Smithy model. In this talk, we will explore Smithy and Smithy Ruby to learn how to generate custom feature-rich SDKs that can communicate with any web service, such as a Rails JSON API.
Accelerate your Kubernetes clusters with Varnish CachingThijs Feryn
A presentation about the usage and availability of Varnish on Kubernetes. This talk explores the capabilities of Varnish caching and shows how to use the Varnish Helm chart to deploy it to Kubernetes.
This presentation was delivered at K8SUG Singapore. See https://feryn.eu/presentations/accelerate-your-kubernetes-clusters-with-varnish-caching-k8sug-singapore-28-2024 for more details.
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf91mobiles
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
Neuro-symbolic is not enough, we need neuro-*semantic*Frank van Harmelen
Neuro-symbolic (NeSy) AI is on the rise. However, simply machine learning on just any symbolic structure is not sufficient to really harvest the gains of NeSy. These will only be gained when the symbolic structures have an actual semantics. I give an operational definition of semantics as “predictable inference”.
All of this illustrated with link prediction over knowledge graphs, but the argument is general.
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Albert Hoitingh
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview. Including the concepts of Customer Key and Double Key Encryption.
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Opinion Mining Tutorial (Sentiment Analysis)
1. Opinion Mining
A Short Tutorial
Kavita Ganesan
Hyun Duk Kim
(http://www.kavita-ganesan.com/)
(https://sites.google.com/site/gildong2/)
Text Information Management Group
University of Illinois @ Urbana Champaign
2. Agenda
Introduction
Application Areas
Sub-fields of Opinion Mining
Some Basics
Opinion Mining Work
Sentiment Classification
Opinion Retrieval
3. “What people think?”
What others think has always been an important piece of information
“Which car should I buy?”
“Which schools should I
apply to?”
“Which Professor to work for?”
“Whom should I vote for?”
4. “So whom shall I ask?”
Pre Web
Friends and relatives
Acquaintances
Consumer Reports
Post Web
“…I don’t know who..but apparently it’s a good phone. It has good battery life and…”
Blogs (google blogs, livejournal)
E-commerce sites (amazon, ebay)
Review sites (CNET, PC Magazine)
Discussion forums (forums.craigslist.org,
forums.macrumors.com)
Friends and Relatives (occasionally)
5. “Whoala! I have the reviews I need”
Now that I have “too much” information on one
topic…I could easily form my opinion and make
decisions…
Is this true?
6. …Not Quite
Searching for reviews may be difficult
Can you search for opinions as conveniently
as general Web search?
eg: is it easy to search for “iPhone vs Google Phone”?
Overwhelming amounts of information on one
topic
Difficult to analyze each and every review
Reviews are expressed in different ways
“the google phone is a disappointment….”
“don’t waste your money on the g-phone….”
“google phone is great but I expected more in terms of…”
“…bought google phone thinking that it would be useful but…”
7. “Let me look at reviews on one site only…”
Problems?
Biased views
all reviewers on one site may have the same opinion
Fake reviews/Spam (sites like YellowPages, CitySearch are prone to
this)
people post good reviews about their own product OR
services
some posts are plain spams
8. Coincidence or Fake?
Reviews for a moving
company from YellowPages
# of merchants
reviewed by the each of
these reviewers 1
Review dates close
to one another
All rated 5 star
Reviewers seem to know
exact names of people
working in the company and
TOO many positive mentions
10. Heard of these terms?
Opinion Mining
Review Mining
Sentiment Analysis
Appraisal Extraction
Subjectivity Analysis
Synonymous
&
Interchangeably Used!
11. So, what is Subjectivity?
The linguistic expression of somebody’s opinions,
sentiments, emotions…..(private states)
private state: state that is not open to objective verification
(Quirk, Greenbaum, Leech, Svartvik (1985). A Comprehensive Grammar of the English Language.)
Subjectivity analysis - is the computational study of
affect, opinions, and sentiments expressed in text
blogs
editorials
reviews (of products, movies, books, etc.)
newspaper articles
12. Example: iPhone review
CNET review
Review on InfoWorld -
tech news site
Review posted on a tech blog
InfoWorld
-summary is structured
-everything else is plain text
-mixture of objective and
subjective information
-no separation between
positives and negatives
CNET
-nice structure
-positives and negatives
separated
Tech BLOG
-everything is plain text
-no separation between
positives and negatives
14. Subjectivity Analysis on iPhone
Reviews
Individual’s Perspective
Highlight of what is good and bad about iPhone
Ex. Tech blog may contain mixture of information
Combination of good and bad from the different
sites (tech blog, InfoWorld and CNET)
Complementing information
Contrasting opinions
Ex.
CNET: The iPhone lacks some basic features
Tech Blog: The iPhone has a complete set of features
15. Subjectivity Analysis on iPhone
Reviews
Business’ Perspective
Apple: What do consumers think about iPhone?
Do they like it?
What do they dislike?
What are the major complaints?
What features should we add?
Apple’s competitor:
What are iPhone’s weaknesses?
How can we compete with them?
Do people like everything about it?
Known as Business
Intelligence
16. Opinion Trend (temporal) ?
Sentiments for a given product/brand/services
Business Intelligence
Software
Business Intelligence
Software
17. Other examples
Blog Search
http://www.blogsearchengine.com/blog-search/?q=ob
Forum Search
http://www.boardtracker.com/
18. Application Areas Summarized
Businesses and organizations: interested in opinions
product and service benchmarking
market intelligence
survey on a topic
Individuals: interested in other’s opinions when
Purchasing a product
Using a service
Tracking political topics
Other decision making tasks
Ads placements: Placing ads in user-generated content
Place an ad when one praises an product
Place an ad from a competitor if one criticizes a product
Opinion search: providing general search for opinions
19. Opinion Mining – The Big Picture
Opinion Retrieval Opinion Question
Answering
Sentiment Classification
Opinion
Spam/Trustworthiness
Comparative mining
Sentence Level
Document Level
Feature Level
use one or combination
Opinion Mining
Direct Opinions
Opinion Integration
IR IR
20. Some basics…
Basic components of an opinion
1. Opinion holder: The person or organization that holds a
specific opinion on a particular object
2. Object: item on which an opinion is expressed
3. Opinion: a view, attitude, or appraisal on an object from an
opinion holder.
This is a great
book
Opinion
Opinion Holder
Object
21. Some basics…
Two types of evaluations
Direct opinions
“This car has poor mileage”
Comparisons
“The Toyota Corolla is not as good as Honda Civic”
They use different language constructs
Direct opinions are easier to work with but comparisons
may be more insightful
22. An Overview of the Sub Fields
Sentiment Classification
Comparative mining
Classify sentence/document/feature
based on sentiments expressed by
authors
positive, negative, neutral
Classify sentence/document/feature
based on sentiments expressed by
authors
positive, negative, neutral
Identify comparative sentences & extract
comparative relations
Comparative sentence:
“Canon’s picture quality is better than that of
Sony and Nikon”
Comparative relation:
(better, [picture quality], [Canon], [Sony, Nikon])
Identify comparative sentences & extract
comparative relations
Comparative sentence:
“Canon’s picture quality is better than that of
Sony and Nikon”
Comparative relation:
(better, [picture quality], [Canon], [Sony, Nikon])
relation feature entity1 entity2
23. An Overview of the Sub Fields
Opinion Integration
Opinion
Spam/Trustworthiness
Automatically integrate opinions
from different sources such as
expert review sites, blogs and
forums
Automatically integrate opinions
from different sources such as
expert review sites, blogs and
forums
Try to determine likelihood of spam in
opinion and also determine authority
of opinion
Ex. of Untrustworthy opinions:
Repetition of reviews
Misleading “positive” opinion
High concentration of certain words
Try to determine likelihood of spam in
opinion and also determine authority
of opinion
Ex. of Untrustworthy opinions:
Repetition of reviews
Misleading “positive” opinion
High concentration of certain words
Opinion Retrieval
Analogous to document retrieval
process
Requires documents to be retrieved
and ranked according to opinions
about a topic
A relevant document must satisfy the
following criteria:
relevant to the query topic
contains opinions about the
query
Analogous to document retrieval
process
Requires documents to be retrieved
and ranked according to opinions
about a topic
A relevant document must satisfy the
following criteria:
relevant to the query topic
contains opinions about the
query
24. An Overview of the Sub Fields
Opinion Question
Answering
Similar to opinion retrieval task, only that
instead of returning a set of opinions,
answers have to be a summary of those
opinions and format has to be in “natural
language” form
Ex.
Q: What is the international reaction to
the reelection of Robert Mugabe as
president of Zimbabwe?
A: African observers generally approved of
his victory while Western Governments
strongly denounced it.
Similar to opinion retrieval task, only that
instead of returning a set of opinions,
answers have to be a summary of those
opinions and format has to be in “natural
language” form
Ex.
Q: What is the international reaction to
the reelection of Robert Mugabe as
president of Zimbabwe?
A: African observers generally approved of
his victory while Western Governments
strongly denounced it.
25. Agenda
Introduction
Application Areas
Sub-fields of Opinion Mining
Some Basics
Opinion Mining Work
Sentiment Classification
Opinion Retrieval
26. Where to find details of previous work?
“Web Data Mining” Book, Bing Liu, 2007
“Opinion Mining and Sentiment Analysis” Book,
Bo Pang and Lillian Lee, 2008
27. What is Sentiment Classification?
Classify sentences/documents (e.g. reviews)/features
based on the overall sentiments expressed by authors
positive, negative and (possibly) neutral
Similar to topic-based text classification
Topic-based classification: topic words are important
Sentiment classification: sentiment words are more important
(e.g: great, excellent, horrible, bad, worst)
28. A. Sentence Level Classification
Assumption: a sentence contains only one opinion (not
true in many cases)
Task 1: identify if sentence is opinionated
classes: objective and subjective (opinionated)
Task 2: determine polarity of sentence
classes: positive, negative and neutral
Sentiment Classification
Quiz:
“This is a beautiful bracelet..”
Is this sentence subjective/objective?
Is it positive, negative or neutral?
29. Sentiment Classification
B. Document(post/review) Level Classification
Assumption:
each document focuses on a single object (not true in
many cases)
contains opinion from a single opinion holder (not true
in many cases)
Task: determine overall sentiment orientation in
document
classes: positive, negative and neutral
30. Sentiment Classification
C. Feature Level Classification
Goal: produce a feature-based opinion summary of multiple
reviews
Task 1: Identify and extract object features that have been
commented on by an opinion holder (e.g. “picture”,“battery
life”).
Task 2: Determine polarity of opinions on features
classes: positive, negative and neutral
Task 3: Group feature synonyms
32. Sentiment Classification
In summary, approaches used in sentiment classification
Unsupervised – eg: NLP pattern @ NLP patterns with lexicon
Supervised – eg: SVM, Naive Bayes..etc (with varying
features like POS tags, word phrases)
Semi Supervised – eg: lexicon+classifier
Based on previous work, four features generally used
a. Syntactic features
b. Semantic features
c. Link-based features
d. Stylistic features
Semantic & Syntactic features are most commonly used
33. Sentiment Classification
Features Usually Considered
A. Syntatic Features
What is syntactic feature? - Usage of principles and rules for
constructing sentences in natural languages [wikipedia]
Different usage of syntactic features:
POS+punctuation
[Pang et al. 2002; Gamon
2004]
POS Pattern
[Nasukawa and Yi 2003; Yi et
al. 2003; Fei et al. 2004;
Wiebe et al 2004]
Modifiers
Whitelaw et al. [2005]
>> POS (part-of-speech) tags
& punctuation
eg: a sentence containing
an adjective and “!” could
indicate existence of an
opinion
“the book is great !”
>>POS n-gram patterns
patterns like “n+aj” (noun
followed by +ve adjective) -
represents positive sentiment
patterns like “n+dj” (noun
followed by -ve adjective) -
express negative sentiment
>> Used a set of modifier
features (e.g., very, mostly,
not)
the presence of these
features indicate the presence
of appraisal
“the book is not great”
“..this camera is very handy”
34. Sentiment Classification
Features Usually Considered
B. Semantic Features
Leverage meaning of words
Can be done manually/semi/fully automatically
Score Based - Turney [2002] Lexicon Based - Whitelaw et al. [2005]
Use PMI calculation to compute the SO
score for each word/phrase
Idea: If a phrase has better association
with the word “Excellent” than with
“Poor” it is positively oriented
and vice versa
Score computation:
PMI (phrase ^ “excellent”) –
PMI (phrase ^ “poor”)
PMI- is a measure of association used in
information theory.
The PMI of a pair of x and y belonging to discrete random
variables quantifies the discrepancy between the probability of
their coincidence given their joint distribution versus the
probability of their coincidence given only their individual
distributions and assuming independence.
Use appraisal groups for assigning
semantics to words/phrases
Each phrase/word is manually
classified into various appraisal classes
(eg: deny, endorse, affirm)
35. Sentiment Classification
Features Usually Considered
C. Link-Based Features
Use link/citation analysis to determine sentiments of documents
Efron [2004] found that opinion Web pages heavily linking to each
other often share similar sentiments
Not a popular approach
D. Stylistic Features
Incorporate stylometric/authorship studies into sentiment classification
Style markers have been shown highly prevalent in Web discourse
[Abbasi and Chen 2005; Zheng et al. 2006; Schler et al. 2006]
Ex. Study the blog authorship style of “Students vs Professors”
Authorship Style of
Students
Authorship Style of
Professors
Colloquial@spoken style of
writing
Formal way of writing
Improper punctuation Proper punctuation
More usage of curse words &
abbreviations
Limited curse words &
abbreviation
36. Sentiment Classification – Recent Work
Abbasi, Chen & Salem (TOIS-08)
Propose:
sentiment analysis of web forum opinions in multiple languages
(English and Arabic)
Motivation:
Limited work on sentiment analysis on Web forums
Most studies have focused on sentiment classification of a
single language
Almost no usage of stylistic feature categories
Little emphasis has been placed on feature reduction/selection
techniques
New:
Usage of stylistic and syntactic features of English and Arabic
Introduced new feature selection algorithm: entropy weighted
genetic algorithm (EWGA)
EWGA outperforms no feature selection baseline, GA and
Information Gain
Results, using SVM indicate a high level of classification accuracy
40. Sentiment Classification – Recent Work
Blitzer et al. (ACL 2007)
Propose:
Method for domain adaptation for sentiment classifiers (ML
based) focusing on online reviews for different types of
products
Motivation:
Sentiments are expressed differently in different domains -
annotating corpora for every domain is impractical
If you train on the “kitchen” domain and classify on “book”
domain error increases
New:
Extend structural correspondence learning algorithm (SCL) to
sentiment classification area
Key Idea:
If two topics/sentences from different domains have high correlation
on unlabeled data, then we can tentatively align them
Achieved 46% improvement over a supervised baseline for
sentiment classification
41. Sentiment Classification – Recent Work
Blitzer et al. (ACL 2007)
• Do not buy the Shark portable
steamer …. Trigger mechanism is
defective.
• the very nice lady assured me that I
must have a defective set …. What
a disappointment!
• Maybe mine was defective …. The
directions were unclear
Unlabeled kitchen contexts
• The book is so repetitive that I
found myself yelling …. I will
definitely not buy another.
• A disappointment …. Ender
was talked about for <#> pages
altogether.
• it’s unclear …. It’s repetitive and
boring
Unlabeled books contexts
Aligning sentiments from
different domains
42. Sentiment Classification - Summary
Relatively focused & widely researched area
NLP groups
IR groups
Machine Learning groups (e.g., domain adaptation)
Important because:
With the wealth of information out on the web, we need an easy
way to know what people think about a certain “topic” – e.g.,
products, political issues, etc
Heuristics generally used:
Occurrences of words
Phrase patterns
Punctuation
POS tags/POS n-gram patterns
Popularity of opinion pages
Authorship style
43. Quiz
Can you think of other heuristics to
determine the polarity of subjective
(opinionated) sentences?
44. Opinion Retrieval
Is the task of retrieving documents according to topic and
ranking them according to opinions about the topic
Important when you need people’s opinion on certain topic or
need to make a decision, based on opinions from others
General Search Opinion Search
search for facts search for opinions/opinionated
topics
rank pages according to some
authority and relevance scores
Rank is based on relevance to
topic and content of opinion
45. Opinion Retrieval
“Opinion retrieval” started with the work of Hurst
and Nigam (2004)
Key Idea: Fuse together topicality and polarity
judgment “opinion retrieval”
Motivation: To enable IR systems to select content
based on a certain opinion about a certain topic
Method:
Topicality judgment: statistical machine learning
classifier (Winnow)
Polarity judgment: shallow NLP techniques (lexicon
based)
No notion of ranking strategy
46. Opinion Retrieval
Summary of TREC Blog Track (2006 – 2008)
TREC 2006 Blog
Track
TREC 2007 Blog Track TREC 2008 Blog Track
Opinion RetrievalOpinion Retrieval Same as 2006 with 2 new
tasks:
-blog distillation (feed search)
-polarity determination
Same as 2006 and 2007
with 1 new task:
-Baseline blog post
retrieval task (i.e. “Find
me blog posts about
X.”)
47.
48. TREC-2006 Blog Track – Focus is on Opinion Retrieval
14 participants
Baseline System
A standard IR system without any opinion finding layer
Most participants use a 2 stage approach
Opinion Retrieval
Summary of TREC-2006 Blog Track [6]
Ranked
documents
standard retrieval &
ranking scheme
Query
opinion related
re-ranking/filterRanked
opinionated
documents
STAGE 1
STAGE 2
-tf*idf
-language model
-probabilistic
-dictionary based
-text classification
-linguistics
49. Opinion Retrieval
Summary of TREC-2006 Blog Track [6]
TREC-2006 Blog Track
The two stage approach
First stage
documents are ranked based on topical relevance
mostly off-the-shelf retrieval systems and weighting
models
TF*IDF ranking scheme
language modeling approaches
probabilistic approaches.
Second stage
results re-ranked or filtered by applying one or more
heuristics for detecting opinions
Most approaches use linear combination of relevance score
and opinion score to rank documents. eg:
α and β are combination parameters
50. Opinion Retrieval
Summary of TREC-2006 Blog Track [6]
Opinion detection approaches used:
Lexicon-based approach [2,3,4,5]:
(a) Some used frequency of certain terms to rank documents
(b) Some combined terms from (a) with information about
the
distance between sentiment words &
occurrence of query words in the document
Ex:
Query: “Barack Obama”
Sentiment Terms: great, good, perfect, terrific…
Document: “Obama is a great leader….”
success of lexicon based approach varied
..of greatest
quality…………
……nice…….
wonderful……
…good battery
life…
..of greatest
quality…………
……nice…….
wonderful……
…good battery
life…
51. Opinion Retrieval
Summary of TREC-2006 Blog Track [6]
Opinion detection approaches used:
Text classification approach:
training data:
sources known to contain opinionated content (eg: product
reviews)
sources assumed to contain little opinionated content
(eg:news, encyclopedia)
classifier preference: Support Vector Machines
Features (discussed in sentiment classification section):
n-grams of words– eg: beautiful/<ww>, the/worst, love/it
part-of-speech tags
Success of this approach was limited
Due to differences between training data and actual
opinionated content in blog posts
Shallow linguistic approach:
frequency of pronouns (eg: I, you, she) or adjectives (eg: great,
tall, nice) as indicators of opinionated content
success of this approach was also limited
52. Opinion Retrieval
Highlight: Gilad Mishne (TREC 2006)
Gilad Mishne (TREC 2006)
Propose: multiple ranking strategies for opinion retrieval in
blogs
Introduced 3 aspects to opinion retrieval
topical relevance
degree to which the post deals with the given topic
opinion expression
given a “topically-relevant” blog post, to what degree it
contains subjective (opinionated) information about it
post quality
estimate of the quality of a “blog post”
assumption - higher-quality posts are likely to contain
meaningful opinions
53. Opinion Retrieval
Highlight: Gilad Mishne (TREC 2006)
Step 1: Is blog post Relevant to Topic?
language
modeling
based retrieval
language
modeling
based retrieval
Ranked
documents
Ranked
documents
Query
blind relevance
feedback [9]
blind relevance
feedback [9]
term proximity
[10,11]
term proximity
[10,11]
• add terms to original query by
comparing language model of top-
retrieved docs entire collection
• limited to 3 terms
• every word n-gram from the
query treated as a phrase
temporal
properties
temporal
properties
• determine if query is looking for
recent posts
•boost scores of posts published
close to time of the query date
Topic relevance
improvements
Topic relevance
improvements
54. Opinion Retrieval
Highlight: Gilad Mishne (TREC 2006)
Step 2: Is blog post Opinionated?
Lexicon-based method- using GeneralInquirer
GeneralInquirer
large-scale, manually-constructed lexicon
assigns a wide range of categories to more than 10,000
English words
Example of word categories :
emotional category: pleasure, pain, feel, arousal, regret
pronoun category: self, our, and you;
“The meaning of a word is its use in the language” - Ludwig Wittgenstein (1958)
55. Opinion Retrieval
Highlight: Gilad Mishne (TREC 2006)
Step 2: Is blog post Opinionated?
For each post calculate two sentiment related values known as
“opinion level”
Calculate opinion
level
Calculate opinion
level
Blog postBlog post
Extract “topical sentences” from post
to count the opinion-bearing words in
it
Topical sentences:
>sentences relevant to topic
>sentences immediately
surrounding them
“post opinion” level“post opinion” level
“feed opinion” level“feed opinion” level
(# of occurrences of
words from any of “opinion
indicating” categories )
total # of words
Idea:
Feeds containing a fair amount of
opinions are more likely to express
an opinion in any of its posts
Method:
> use entire feed to which the post
belongs
> topic-independent score per feed
estimates the degree to which it
contains opinions (about any topic)
56. Opinion Retrieval
Highlight: Gilad Mishne (TREC 2006)
Step 3: Is the blog post of good quality?
A. Authority of blog post: Link-based Authority
Estimates authority of documents using analysis of the link
structure
Key Idea:
placing a link to a page other than your own is like
“recommending” that page
similar to document citation in the academic world
Follow Upstill et al (ADCS 2003) - inbound link
degree (indegree) as an approximation
captures how many links there are to a page
Post’s authority estimation is based on:
Indegree of a post p & indegree of post p’s feed
Site ASite A
Site BSite B
Site CSite C
Blog
Post P
Blog
Post P
Authority=Log(indegree=3)
57. Opinion Retrieval
Highlight: Gilad Mishne (TREC 2006)]
Step 3: Is the blog post of good quality?
B. Spam Likelihood
Method 1: machine-learning approach - SVM
Method 2: text-level compressibility - Ntoulas et al (WWW
2006)
Determine: How likely is post P from feed F a SPAM
entry?
Intuition: Many spam blogs use “keyword stuffing”
High concentration of certain words
Words are repeated hundreds of times in the same post
and across feed
When you detect spam post and compress them high
compression ratios for these feeds
Higher the compression ratio for feed F, more likely that
post P is splog (Spam Blog)
comp. ratio=(size of uncompressed pg.) / (size of compressed pg.)
Final spam likelihood estimate: (SVM prediction) * (compressibility prediction)
58. temporal
properties
temporal
properties
Opinion Retrieval
Highlight: Gilad Mishne (TREC 2006)
Step 4: Linear Model
Combination
top
1000
posts
top
1000
posts
1.Topic relevance
language model
1.Topic relevance
language model
blind relevance
feedback [9]
blind relevance
feedback [9]
term proximity
query [10,11]
term proximity
query [10,11]
2.Opinion level2.Opinion level3.Post Quality3.Post Quality
“feed opinion”
level
“feed opinion”
level
“post opinion”
level
“post opinion”
level
spam likelihood
[13]
spam likelihood
[13]
link based
authority [12]
link based
authority [12]
4. Weighted linear
combination
of scores
4. Weighted linear
combination
of scores
partial “post quality”
scores
partial “opinion level”
scores
ranked
opinionated
posts
ranked
opinionated
posts
final scores
59. Opinion Retrieval
Summary of TREC-2008 Blog Track
Opinion Retrieval & Sentiment Classification
Basic techniques are similar – classification vs. lexicon
More use of external information source
Traditional source: WordNet, SentiWordNet
New source: Wikipedia, Google search result,
Amazon, Opinion web sites (Epinions.com,
Rateitall.com)
60. Opinion Retrieval
Summary of TREC-2008 Blog Track
Retrieve ‘good’ blog posts
1. Expert search techniques
Limit search space by joining data by criteria
Used characteristics: number of comments, post
length, the posting time
-> estimate strength of association between a post
and a blog.
1. Use of folksonomies
Folksonomy: collaborative tagging, social indexing.
User generating taxonomy
Creating and managing tagging.
Showed limited performance improvement
61. Opinion Retrieval
Summary of TREC-2008 Blog Track
Retrieve ‘good’ blog posts
3. Temporal evidence
Some investigated use of temporal span and temporal
dispersion.
Recurring interest, new post.
3. Blog relevancy approach
Assumption:
A blog that has many relevant posts is more relevant.
The top N posts best represent the topic of the blog
Compute two scores to score a given blog.
The first score is the average score of all posts in the
blog
The second score is the average score of the top N
posts that have the highest relevance scores.
Topic relevance score of each post is calculated using a
language modeling approach.
62. Opinion Retrieval – Recent Work
He et al (CIKM-08)
Motivation:
Current methods require manual effort or external resources
for opinion detection
Propose:
Dictionary based statistical approach - automatically derive
evidence of subjectivity
Automatic dictionary generation – remove too frequent or few
terms with skewed query model
assign weights – how opinionated. divergence from randomness
(DFR)
For term w, divergence D(opREl) from D(Rel)
( Retrieved Doc = Rel + nonRel. Rel = opRel + nonOpRel )
assign opinion score to each document using top weighted terms
Linear combine opinion score with initial relevance score
Results:
Significant improvement over best TREC baseline
Computationally inexpensive compared to NLP techniques
63. Zhang and Ye (SIGIR 08)
Motivation:
Current ranking uses only linear combination of scores
Lack theoretical foundation and careful analysis
Too specific (like restricted to domain of blogs)
Propose:
Generation model that unifies topic-relevance and
opinion generation by a quadratic combination
Relevance ranking serves as weighting factor to
lexicon based sentiment ranking function
Different than the popular linear combination
Opinion Retrieval – Recent Work
64. Zhang and Ye (SIGIR 08)
Key Idea:
Traditional document generation model:
Given a query q, how well the document d “fits” the query q
estimate posterior probability p(d|q)
In this opinion retrieval model, new sentiment parameter, S
(latent variable) is introduced
Iop(d,q,s): given query q, what is the probability that document d
generates a sentiment expression s
Tested on TREC Blog datasets – observed significant improvement
Opinion Retrieval – Recent Work
Opinion Generation Model Document Generation Model
65. Challenges in opinion mining
Summary of TREC Blog Track focus (2006 – 2008)
TREC 2006 Blog
Track
TREC 2007 Blog Track TREC 2008 Blog Track
Opinion RetrievalOpinion Retrieval Same as 2006 with 2 new
tasks:
-blog distillation (feed search)
-polarity determination
Same as 2006 and 2007
with 1 new task:
-Baseline blog post
retrieval task (i.e. “Find
me blog posts about
X.”)
Main Lessons Learnt from TREC 2006, 2007 & 2008:
Good performance in opinion-finding is strongly dependent
on finding as many relevant documents as possible
regardless of their opinionated nature
66. Many Opinion classification and retrieval system could not make
improvements.
Used same relevant document retrieval model.
Evaluate the performance of opinion module.
ΔMAP = Opinion finding MAP score with only retrieval system –
Opinion finding MAP score with opinion module.
A lot of margins to research!!!
Group ΔMAP of Mix ΔMAP of Positive ΔMAP of Negative
KLE 4.86% 6.08% 3.51%
UoGTr -3.77% -4.62% -2.76%
UWaterlooEng -6.70% -1.69% -12.33%
UIC_IR_Group -22.10% 2.12% -49.60%
UTD_SLP_Lab -22.96% -17.51% -29.23%
Fub -55.26% -59.81% -50.18%
Tno -76.42% -75.93% -77.02%
UniNE -43.68 -39.41% -48.49%
Challenges in opinion mining
Summary of TREC Blog Track focus (2006 – 2008)
67. Challenges in opinion mining
Highlight of TREC-2008 Blog Track
Lee et al. Jia et al. (TREC 2008)
Propose:
Method for query dependent opinion retrieval and sentiment
classification
Motivation:
Sentiments are expressed differently in different query. Similar
to the Blitzer’s idea.
New:
Use external web source to obtain positive and negative
opinionated lexicons.
Key Idea:
Objective words: Wikipedia, product specification part of Amazon.com
Subjective words: Reviews from Amazon.com, Rateitall.com and
Epinions.com
Reviews rated 4 or 5 out of 5: positive words
Reviews rated 1 or 2 out of 5: negative words
Top ranked in Text Retrieval Conference.
68. Challenges in opinion mining
Polarity terms are context sensitive.
Ex. Small can be good for ipod size, but can be bad for LCD monitor size.
Even in the same domain, use different words depending on target
feature.
Ex. Long ‘ipod’ battery life vs. long ‘ipod’ loading time
Partially solved (query dependent sentiment classification)
Implicit and complex opinion expressions
Rhetoric expression, metaphor, double negation.
Ex. The food was like a stone.
Need both good IR and NLP techniques for opinion mining.
Cannot divide into pos/neg clearly
Not all opinions can be classified into two categories
Interpretation can be changed based on conditions.
Ex. 1) The battery life is ‘long’ if you do not use LCD a lot. (pos)
2) The battery life is ‘short’ if you use LCD a lot. (neg)
Current system classify the first one as positive and second one as negative.
However, actually both are saying the same fact.
69. Opinion Retrieval – Summary
Opinion Retrieval is a fairly broad area (IR, sentiment
classification, spam detection, opinion authority…etc)
Important:
Opinion search is different than general web search
Opinion retrieval techniques
Traditional IR model + Opinion filter/ranking layer (TREC
2006,2007,2008)
Some approaches in opinion ranking layer:
Sentiment: lexicon based, ML based, linguistics
Opinion authority - trustworthiness
Opinion spam likelihood
Expert search technique
Folksonomies
Opinion generation model
Unify topic-relevance and opinion generation into one
model
Future approach:
Use opinion as a feature or a dimension of more refined
and complex search tasks
71. References
[1] M. Hurst and K. Nigam, “Retrieving topical sentiments from online document collections,” in
Document Recognition and Retrieval XI, pp. 27–34, 2004.
[2] Liao, X., Cao, D., Tan, S., Liu, Y., Ding, G., and Cheng X.Combining Language Model with
Sentiment Analysis for Opinion Retrieval of Blog-Post. Online Proceedings of Text Retrieval
Conference (TREC) 2006. http://trec.nist.gov/
[3] Mishne, G. Multiple Ranking Strategies for Opinion Retrieval in Blogs. Online Proceedings of
TREC, 2006.
[4] Oard, D., Elsayed, T., Wang, J., and Wu, Y. TREC-2006 at Maryland: Blog, Enterprise, Legal and
QA Tracks. OnlineProceedings of TREC, 2006. http://trec.nist.gov/
[5] Yang, K., Yu, N., Valerio, A., Zhang, H. WIDIT in TREC-2006 Blog track. Online Proceedings of
TREC, 2006. http://trec.nist.gov/
[6] Ounis, I., de Rijke, M., Macdonald, C., Mishne, G., and Soboroff, I. Overview of the TREC 2006
Blog Track. In Proceedings of TREC 2006, 15–27. http://trec.nist.gov/
[7] Min Zhang, Xingyao Ye: A generation model to unify topic relevance and lexicon-based
sentiment for opinion retrieval. SIGIR 2008: 411-418
[9] J. M. Ponte. Language models for relevance feedback. In Advances in Information Retrieval:
Recent Research from the Center for Intelligent Information Retrieval, 2000.
[10] D. Metzler and W. B. Croft. A markov random field model for term dependencies. In SIGIR ’05,
2005.
[11] G. Mishne and M. de Rijke. BoostingWeb Retrieval through Query Operations. In ECIR 2005,
2005.
[12] T. Upstill, N. Craswell, and D. Hawking. Predicting fame and fortune: Pagerank or indegree? In
ADCS2003, 2003.
[13] A. Ntoulas, M. Najork, M. Manasse, and D. Fetterly. Detecting spam web pages through
content analysis. In WWW 2006, 2006.
[14] An Effective Statistical Approach to Blog Post Opinion Retrieval.
Ben He, Craig Macdonald Jiyin He, and Iadh Ounis. In Proceedings of CIKM 2008.
72. References
[15] I. Ounis, C. Macdonald and I. Soboroff, Overview of the TREC 2008 Blog Track (notebook
paper), TREC, 2008.
[16] Y. Lee, S.-H. Na, J. Kim, S.-H. Nam, H.-Y. Jung and J.-H. Lee , KLE at TREC 2008 Blog Track:
Blog Post and Feed Retrieval (notebook paper), TREC, 2008.
[17] L. Jia, C. Yu and W. Zhang, UIC at TREC 208 Blog Track (notebook paper), TREC, 2008.
Editor's Notes
Lets give this a slight twist… lets say the kid finished dinner and has gone to watch some TV..
My editing skills at work..