This document summarizes an approach to generating abstractive summaries of product reviews. It discusses extracting aspects from reviews, annotating reviews with aspect sentiment polarity and strength, applying a discourse parser to obtain discourse trees, aggregating trees to generate an Aspect Rhetorical Relation Graph (ARRG), selecting important aspects and relations using PageRank, and generating a natural language summary template based on the selected content. Evaluation shows the approach identifies aspects and relations accurately and is able to generate a multi-sentence summary reflecting the most prominent aspects of multiple reviews.
3. Introduction
Product Summary
• Plays a vital role for both customers and manufacturers.
Effective Review Summary
• Shows how good a product is across different parameters and aspects.
• Is more abstract, captures more parameters, and prioritizes
them by how often they are mentioned.
• Manufacturers use it to improve the product.
4. Problem Statement
• Build an abstractive summarization system for product reviews by
performing aspect-based sentiment analysis and exploiting the
discourse structure of the reviews, assuming no prior domain knowledge.
• Generate an aspect-based abstract from multiple reviews of a product.
• Use a product-independent, template-based NLG framework to generate the
abstract from the selected content.
5. Approach
• First, we use the Stanford NLP API to generate the universal
dependencies and the parse tree for each review.
• Then we extract aspects from the dependency treebank of a review.
• By analysing the reviews, we produce an annotated review in which each aspect
carries its sentiment polarity and strength.
• Next, we apply a discourse parser to each review to obtain a discourse tree
representation, and modify each discourse tree so that it
contains only the aspects.
6. Approach
• After that, we aggregate the aspect discourse trees into a graph (the ARRG),
select a subgraph representing the most important aspects and the
rhetorical relations between them using a PageRank algorithm, and then
transform it into an aspect tree.
• Finally, we generate a natural language summary by applying a
template-based NLG framework.
7. Aspect Extraction
• From the dependency treebank of a review, we extract all the nouns
and noun phrases.
• The noun phrases are then classified into categories such as
adjective-based, noun-based and determiner-based.
• We then apply a specific set of rules over the dependency relations to find
the aspects that contribute to the sentiment of each sentence.
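The rule-based step above can be sketched as follows. This is a minimal illustration that assumes the dependency parse is already available as (head, relation, dependent) triples with POS tags. The two rules shown (a noun that is the nsubj of an adjective, and a noun modified through amod) are assumptions chosen for illustration, not the system's actual rule set, and the function name `extract_aspect_candidates` is hypothetical.

```python
from typing import Dict, List, Set, Tuple

# A dependency edge: (head word, relation, dependent word).
Edge = Tuple[str, str, str]

NOUN_TAGS = {"NN", "NNS", "NNP", "NNPS"}
ADJ_TAGS = {"JJ", "JJR", "JJS"}


def extract_aspect_candidates(pos: Dict[str, str], edges: List[Edge]) -> Set[str]:
    """Return nouns that are linked to an opinion-bearing adjective.

    Illustrative rules (assumptions, not the system's actual rule set):
      1. nsubj(adjective, noun): "The camera is nice" -> camera
      2. amod(noun, adjective):  "a small camera"     -> camera
    """
    aspects: Set[str] = set()
    for head, rel, dep in edges:
        if rel == "nsubj" and pos.get(head) in ADJ_TAGS and pos.get(dep) in NOUN_TAGS:
            aspects.add(dep)
        elif rel == "amod" and pos.get(head) in NOUN_TAGS and pos.get(dep) in ADJ_TAGS:
            aspects.add(head)
    return aspects


# Hand-written parse of "The camera of this phone is nice."
pos_tags = {"The": "DT", "camera": "NN", "of": "IN", "this": "DT",
            "phone": "NN", "is": "VBZ", "nice": "JJ"}
dependencies = [
    ("nice", "nsubj", "camera"),
    ("camera", "det", "The"),
    ("camera", "nmod", "phone"),
    ("phone", "case", "of"),
    ("phone", "det", "this"),
    ("nice", "cop", "is"),
]
candidates = extract_aspect_candidates(pos_tags, dependencies)
```

Here only "camera" is selected, because "phone" is a noun but is not attached to any adjective. Keying words by their surface form is a simplification; a real parse would use token indices so that repeated words do not collide.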
8. Annotated review
• From each review sentence, we build a graph over the words using relations such as
nsubj, amod, advmod and dobj. This graph is essentially the
dependency tree, presented in a form that is easier to traverse.
• Given an aspect, we look for the patterns described in [5], which are
built on dependency relations. Using these patterns, we trace the sentiment flow
from the polar words (derived from SenticNet [6]) to the aspect words. In the
end, the patterns determine the polarity of each aspect.
• The strength of the sentiment is obtained by matching the modifiers against a
dictionary, which gives the final strength for each aspect.
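As a rough sketch of the idea above, the snippet below lets sentiment flow from a polar word to an aspect along nsubj or amod arcs and scales it by an adverbial modifier. The two small lexicons are hypothetical stand-ins for SenticNet and the modifier dictionary, the function name `score_aspect` is made up for this example, and the clamping to [-3, +3] follows the score range used in the summarization framework.

```python
from typing import List, Tuple

Edge = Tuple[str, str, str]  # (head, relation, dependent)

# Toy lexicons: hypothetical stand-ins for SenticNet and the modifier dictionary.
POLARITY = {"nice": 1, "good": 1, "great": 1, "bad": -1, "poor": -1}
MODIFIER_STRENGTH = {"very": 2, "really": 2, "extremely": 3, "absolutely": 3}


def score_aspect(aspect: str, edges: List[Edge]) -> int:
    """Return a signed polarity/strength score in [-3, +3] for one aspect."""
    score = 0
    for head, rel, dep in edges:
        # Find a polar word attached to the aspect through nsubj or amod.
        if rel == "nsubj" and dep == aspect and head in POLARITY:
            polar = head
        elif rel == "amod" and head == aspect and dep in POLARITY:
            polar = dep
        else:
            continue
        # Strength defaults to 1; an adverbial modifier of the polar word raises it.
        strength = 1
        for h, r, d in edges:
            if r == "advmod" and h == polar and d in MODIFIER_STRENGTH:
                strength = max(strength, MODIFIER_STRENGTH[d])
        score += POLARITY[polar] * strength
    # Clamp to the integer range used for scores in this document.
    return max(-3, min(3, score))


# Hand-written parse of "The camera is really nice."
edges = [
    ("nice", "nsubj", "camera"),
    ("nice", "cop", "is"),
    ("nice", "advmod", "really"),
    ("camera", "det", "The"),
]
camera_score = score_aspect("camera", edges)
```

In this example "nice" contributes +1 and "really" raises the strength to 2, so the aspect "camera" receives a score of +2. Negation and the richer patterns from the cited work are deliberately left out.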
10. Summarization Framework
•Generates a summary from multiple input reviews based on an Aspect
Hierarchy Tree (AHT) that reflects the importance of aspects as well as
the relationships between them.
• In our framework, an AHT is generated automatically from the set of
input reviews, where each sentence of every review is marked with the
aspects present in that sentence and the polarity of the opinions about
them.
• Polarity/strength (P/S) scores are integer values in the range [-3, +3], where +3 is the most
positive and -3 is the most negative polarity value.
12. Abstract Generation
• The automatic generation of a natural language summary in our
system involves the following tasks:
i) Microplanning, which covers lexical selection.
ii) Sentence realization, which produces English text from the
output of the microplanner.
13. Microplanning and Sentence Realization
• Once the content is selected and structured, it is passed to the
microplanning module, which performs lexical choice.
• Lexical choice is an important component of microplanning.
• In our system, lexical choice is formulated to favor a “formal” style
and “fluent” connectivity with the other lexical units.
• In sentence realization, we generate abstract sentences for aspects
with no children and supporting sentences for aspects with
children.
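A minimal sketch of microplanning and realization for an aspect with no children is given below. It is not the system's actual template set: the quantifier thresholds, the opinion phrases, and the function names `quantifier`, `opinion_phrase` and `realize` are all assumptions made for illustration.

```python
def quantifier(count: int, total: int) -> str:
    """Pick a quantifier phrase from the share of reviewers (thresholds are assumptions)."""
    share = count / total
    if share == 1.0:
        return "All customers"
    if share > 0.6:
        return "Most shoppers"
    if share >= 0.4:
        return "Almost half of the users"
    return "Some reviewers"


def opinion_phrase(score: int) -> str:
    """Map a P/S score in [-3, +3] to a verb phrase (wording is an assumption)."""
    phrases = {
        3: "absolutely liked", 2: "really liked", 1: "liked",
        0: "had mixed feelings about",
        -1: "disliked", -2: "really disliked", -3: "absolutely disliked",
    }
    return phrases[max(-3, min(3, score))]


def realize(aspect: str, count: int, total: int, score: int) -> str:
    """Fill a single-sentence template for an aspect with no children."""
    return (f"{quantifier(count, total)} ({count} people) mentioned the "
            f"{aspect} and they {opinion_phrase(score)} this feature.")


sentence = realize("size", 36, 51, 2)
```

With 36 of 51 reviewers and a score of +2, the template yields "Most shoppers (36 people) mentioned the size and they really liked this feature." Supporting sentences for aspects with children would need a second template that also names the child aspect and the rhetorical relation.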
14. Results Obtained
We have obtained the discourse parse trees of the reviews and identified the
aspects with their polarity and strength. The rhetorical relations among the elementary discourse units (EDUs) are also
identified. A small snapshot of the result is provided below:
( Nucleus (span 1 3) (rel2par Joint)
  ( Satellite (leaf 1) (rel2par Attribution) (text _!_!I want to start off!__!) )
  ( Nucleus (span 2 3) (rel2par span)
    ( Satellite (leaf 2) (rel2par Attribution) (text _!_!saying!__!) )
    ( Nucleus (leaf 3) (rel2par span) (text _!_!that this camera is small for a reason . <s>!__!) )
  )
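To work with a tree in this bracketed form, the leaf EDUs can be pulled out with a regular expression. The sketch below assumes the text delimiters look exactly as in the snapshot above (runs of `_` and `!` around each EDU) and only reads the leaves, not the nested spans. The names `LEAF_RE` and `extract_edus` are hypothetical.

```python
import re
from typing import List, Tuple

SNAPSHOT = """
( Nucleus (span 1 3) (rel2par Joint)
( Satellite (leaf 1) (rel2par Attribution) (text _!_!I want to start off!__!) )
( Nucleus (span 2 3) (rel2par span)
( Satellite (leaf 2) (rel2par Attribution) (text _!_!saying!__!) )
( Nucleus (leaf 3) (rel2par span) (text _!_!that this camera is small for a
reason . <s>!__!) )
)
"""

# Matches one leaf node: role, leaf id, relation to parent, and delimited text.
LEAF_RE = re.compile(
    r"\(\s*(Nucleus|Satellite)\s+\(leaf\s+(\d+)\)\s+\(rel2par\s+([\w-]+)\)\s+"
    r"\(text\s+[_!]+(.*?)[_!]+\)\s*\)",
    re.DOTALL,
)


def extract_edus(tree: str) -> List[Tuple[int, str, str, str]]:
    """Return (leaf id, role, relation, text) for every leaf in the tree."""
    edus = []
    for role, leaf_id, relation, text in LEAF_RE.findall(tree):
        # Collapse line breaks and repeated spaces inside the EDU text.
        edus.append((int(leaf_id), role, relation, " ".join(text.split())))
    return edus


edus = extract_edus(SNAPSHOT)
```

For the snapshot this gives three EDUs: two Attribution satellites ("I want to start off", "saying") and one nucleus containing the clause about the camera. A full parser would also rebuild the span nodes so that relations between EDUs can be read off the tree.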
15. Results Obtained
From the discourse tree, we have generated the aspect-based discourse
tree, which identifies the underlying aspect of each EDU and the rhetorical
relations among them.
room,Evaluation,small,0.261905
room,Evaluation,camera,0.33333
room,Evaluation,size,0.357143
memory,Elaboration,size,0.547619
size,Manner-Means,camera,0.761905
camera,Evaluation,size,0.166667
camera,Evaluation,memory,0.261905
camera,Contrast,small,0.5
16. Results Obtained
We have generated the ARRG of the product by aggregating the aspect-based discourse trees (ADTs) produced by the
previous component. An output snippet is provided below:
camera,Elaboration,auto mode,0.23
photo quality,Background,auto mode,0.75
camera,Elaboration,photo quality,0.5
camera,Elaboration,auto mode,0.33
camera,Elaboration,photo quality,0.2
####,####,####,####
camera,Elaboration,control,0.5
control,Contrast,auto mode,0.66
camera,Elaboration,auto mode,0.075
camera,Elaboration,control,0.2
camera,Elaboration,auto mode,0.375
####,####,####,####
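One way to turn output like this into a single weighted graph is sketched below. It treats each `####` line as a block separator and sums the weights of identical (source, relation, target) tuples. Both choices are assumptions: the combination rule actually used to build the ARRG is not stated here, and the names `parse_blocks` and `aggregate` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (source aspect, relation, target aspect)

RAW_OUTPUT = """\
camera,Elaboration,auto mode,0.23
photo quality,Background,auto mode,0.75
camera,Elaboration,photo quality,0.5
camera,Elaboration,auto mode,0.33
camera,Elaboration,photo quality,0.2
####,####,####,####
camera,Elaboration,control,0.5
control,Contrast,auto mode,0.66
camera,Elaboration,auto mode,0.075
camera,Elaboration,control,0.2
camera,Elaboration,auto mode,0.375
####,####,####,####
"""


def parse_blocks(raw: str) -> List[List[Tuple[Triple, float]]]:
    """Split the raw output into blocks of ((source, relation, target), weight)."""
    blocks: List[List[Tuple[Triple, float]]] = []
    current: List[Tuple[Triple, float]] = []
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("####"):  # assumed block separator
            if current:
                blocks.append(current)
            current = []
            continue
        source, relation, target, weight = line.rsplit(",", 3)
        current.append(((source, relation, target), float(weight)))
    if current:
        blocks.append(current)
    return blocks


def aggregate(blocks: List[List[Tuple[Triple, float]]]) -> Dict[Triple, float]:
    """Sum the weights of identical tuples across all blocks (assumed rule)."""
    graph: Dict[Triple, float] = defaultdict(float)
    for block in blocks:
        for triple, weight in block:
            graph[triple] += weight
    return dict(graph)


blocks = parse_blocks(RAW_OUTPUT)
arrg = aggregate(blocks)
```

The result maps each distinct tuple to one edge weight, which is the shape a ranking step needs. Swapping the sum for a maximum or an average only requires changing the body of `aggregate`.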
17. Results Obtained
We have selected the tuples with the highest strength by applying PageRank
to the ARRG:
photo quality,Background,auto mode,0.75
camera,Elaboration,photo quality,0.5
camera,Elaboration,auto mode,0.705
camera,Elaboration,control,0.5
control,Contrast,auto mode,0.66
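For reference, a weighted PageRank over these tuples can be sketched in plain Python as below. The edge direction (first aspect to second aspect), the damping factor of 0.85, the fixed iteration count and the uniform handling of nodes without outgoing edges are assumptions for this illustration, not settings taken from the system.

```python
from typing import Dict, List, Tuple

WeightedEdge = Tuple[str, str, float]  # (source aspect, target aspect, weight)

# The selected tuples, with the relation label dropped for ranking.
EDGES: List[WeightedEdge] = [
    ("photo quality", "auto mode", 0.75),
    ("camera", "photo quality", 0.5),
    ("camera", "auto mode", 0.705),
    ("camera", "control", 0.5),
    ("control", "auto mode", 0.66),
]


def pagerank(edges: List[WeightedEdge], damping: float = 0.85,
             iterations: int = 100) -> Dict[str, float]:
    """Weighted PageRank by power iteration; scores sum to 1."""
    nodes = sorted({n for s, t, _ in edges for n in (s, t)})
    out_weight = {n: 0.0 for n in nodes}
    for source, _, weight in edges:
        out_weight[source] += weight
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no outgoing edges is spread uniformly.
        dangling = sum(rank[n] for n in nodes if out_weight[n] == 0.0)
        base = (1.0 - damping) / len(nodes) + damping * dangling / len(nodes)
        new_rank = {n: base for n in nodes}
        for source, target, weight in edges:
            # Each node passes rank to its targets in proportion to edge weight.
            new_rank[target] += damping * rank[source] * weight / out_weight[source]
        rank = new_rank
    return rank


ranks = pagerank(EDGES)
top_aspect = max(ranks, key=ranks.get)
```

Under these assumptions "auto mode" ranks highest, since every other aspect points to it, and "camera" ranks lowest because nothing points to it. If the product itself should be the root of the aspect tree, the edge direction would need to be reversed before ranking.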
18. Results Obtained
We used the tuple set above to generate the review
summary and obtained the following result:
All customers ( 51 people ) who reviewed the camera felt that it was great .Most shoppers
( 36 people ) mentioned the size and they really liked this feature .Accordingly almost half
(29 people) of the users commented about the pictures and they really liked this feature
mainly because of clarity. About 45.0% of reviewer commented about the software and
they absolutely liked it .About 45.0% of reviewer commented about the small size and
they really liked this feature .In relation to the aspect, About 30.0% of the shoppers
mentioned the use and they absolutely liked it. 8 reviewers commented about the flash
and in overall they felt that it was fine mainly because of pictures.
19. Conclusion
•We have presented a framework for abstractive summarization of product reviews
based on discourse structure.
• For content selection, we propose a graph model based on the importance and
association relations between aspects, that assumes no prior domain knowledge,
by taking advantage of the discourse structure of reviews.
• For abstract generation, we propose a product-independent, template-based
natural language generation (NLG) framework that takes aspects and their
structured relations as input and generates an abstractive summary.
• In addition, we plan to develop and evaluate an end-to-end system in which the
aspect extraction and polarity estimation of aspects are automated.
20. References
[1] Abstractive Summarization of Product Reviews Using Discourse Structure
http://www.aclweb.org/anthology/D/D14/D14-1168.pdf
[2] Opinosis: A Graph-Based Approach to Abstractive Summarization of Highly Redundant Opinions
http://lexitron.nectec.or.th/public/COLING-2010_Beijing_China/PAPERS/pdf/PAPERS039.pdf
[3] S. Poria, E. Cambria, G. Winterstein, and G.-B. Huang. Sentic patterns: Dependency-based rules for
concept-level sentiment analysis. Knowledge-Based Systems 69, pp. 45-63 (2014)
[4] S. Poria, E. Cambria, A. Gelbukh, F. Bisio, and A. Hussain. Sentiment data flow analysis by means of
dynamic linguistic patterns. IEEE Computational Intelligence Magazine 10(4), pp. 26-36 (2015)
In our summarization framework, anything evaluated is an aspect, even the product itself.
We assume that sentiment flows from polar words to other words using dependency arcs and based on the patterns.
Some useful patterns –
I like the lens of this camera.
The camera of this phone is nice.
Battery runs well.
Future Work – Form local features surrounding the aspects and employ supervised learning for sentiment classification.