This document presents a hybrid approach to condition mining that combines computational linguistics and deep learning. It proposes training a deep regressor on condition candidates generated from sentences and using it to score new candidates. The main methods generate candidates from new sentences, score them with the regressor, and select the best candidates. Experimental results on English and Spanish reviews show that the approach outperforms the baselines, with the CNN and CNN-BiGRU variants performing best on average; a statistical analysis finds that these variants significantly outperform the other methods. The approach overcomes issues in prior work and achieves good accuracy.
1. A Hybrid Approach to Mining Conditions
Fernando O. Gallego and Rafael Corchuelo
2. Opinion mining
Example review: “I think that the lens is beyond excellent for amateurs. The resolution of this camera is 13Mp. Flash is tacky when using outdoors.”
Extracted opinions: “lens” is Positive; “resolution” is Neutral; “Flash” is Negative.
4. Opinion mining (with conditions)
Example review: “I think that the lens is beyond excellent for amateurs. The resolution of this camera is 13Mp. Flash is tacky when using outdoors.”
Extracted opinions: “lens” is Positive (for amateurs); “resolution” is Neutral; “Flash” is Negative (when using outdoors).
7. Condition mining
Example review: “I think that the lens is beyond excellent for amateurs. The resolution of this camera is 13Mp. Flash is tacky when using outdoors.”
Conditions found: “for amateurs”, “when using outdoors”.
11. Variability of conditions
• Zero/first/second/third conditionals: “If you do sth”, “Even if sby fell down”, “If sth had passed”, “Should you help me”, “If it occurs”
• Other forms: “When sth happens”, “May it be accepted”, “For sby”, “To sby”, “During my event”, “While doing sth”, “After/before sth”
12. Machine learning
• Nakayama et al. (2015):
– SVM/CRF model
– 3k Japanese sentences
– Several lexicons used
16. Inputs
Sentence → Conditions:
• “I think that the lens is beyond excellent for amateurs.” → [“for amateurs”]
• “The resolution of this camera is 13Mp.” → []
• “Flash is tacky when using outdoors.” → [“when using outdoors”]
• …
19. Train (1/4)
• Create a subset of training examples for each sentence
(Figure: each input sentence S1 is mapped to its subset of training examples ts.)
20. Train (2/4)
• Generate condition candidates for a given sentence
(Figure: dependency tree of “Flash is tacky when using outdoors”, with edges nsubj, cop, advcl, advmod.)
Candidates:
c1: “Flash is tacky when using outdoors”
c2: “when using outdoors”
21. Train (3/4)
• Score each candidate
c1: “Flash is tacky when using outdoors” → 0.8560
c2: “when using outdoors” → 1.0000
25. Apply (1/5)
• Generate condition candidates
(Figure: dependency tree of “Flash is tacky when using outdoors”, with edges nsubj, cop, advcl, advmod.)
Candidates:
c1: “Flash is tacky when using outdoors”
c2: “when using outdoors”
26. Apply (2/5)
• For each condition candidate, the method checks whether it must be considered or not
27. Apply (3/5)
• The regressor scores each candidate
c1: “Flash is tacky when using outdoors” → 0.8560
c2: “when using outdoors” → 1.0000
28. Apply (4/5)
• If the score is equal to or greater than a given threshold, the candidate is considered
39. Detailed example (1/3)
(Figure: dependency tree of “If you are someone who likes cakes then try John’s”, with edges mark, nsubj, cop, acl:relcl, dobj, advmod, advcl, xcomp, case.)
44. Statistical analysis

(a) q = 0.2500
Proposal    Ranking  Comparison        z       p-value
CNN         1.0000   CNN x CNN         -       -
CNN-BiGRU   2.0000   CNN x CNN-BiGRU   1.4142  0.1573
BiGRU       3.5000   CNN x BiGRU       3.5355  0.0008
MLP         4.1000   CNN x MLP         4.3841  0.0000
GRU         4.4000   CNN x GRU         4.8083  0.0000

(b) q = 0.5000
Proposal    Ranking  Comparison              z       p-value
CNN-BiGRU   1.4000   CNN-BiGRU x CNN-BiGRU   -       -
CNN         1.6000   CNN-BiGRU x CNN         0.2828  0.7773
MLP         3.1000   CNN-BiGRU x MLP         2.4042  0.0324
BiGRU       4.2000   CNN-BiGRU x BiGRU       3.9598  0.0002
GRU         4.7000   CNN-BiGRU x GRU         4.6669  0.0000

(c) q = 0.7500
Proposal    Ranking  Comparison        z       p-value
CNN         1.3000   CNN x CNN         -       -
CNN-BiGRU   1.7000   CNN x CNN-BiGRU   0.5657  0.5716
MLP         3.0000   CNN x MLP         2.4042  0.0324
GRU         4.5000   CNN x GRU         4.5255  0.0000
BiGRU       4.5000   CNN x BiGRU       4.5255  0.0000

(d)
Proposal       Ranking  Comparison               z       p-value
CNN0.25        1.4000   CNN0.25 x CNN0.25        -       -
CNN-BiGRU0.50  1.8000   CNN0.25 x CNN-BiGRU0.50  0.5657  0.5716
MB             3.4000   CNN0.25 x MB             2.8284  0.0094
CNN0.75        3.7000   CNN0.25 x CNN0.75        3.2527  0.0034
CB             4.7000   CNN0.25 x CB             4.6669  0.0000
Editor's Notes
Thanks for attending my presentation. My name is Fernando O. Gallego and I co-authored this paper with Rafael Corchuelo, both from the University of Seville.
--
Copyright (C) 2018 The Distributed Group
The use of these slides is hereby constrained to the conditions
of the TDG Licence, a copy of which you may download from
http://www.tdg-seville.info/License.html
First of all, let’s introduce an example to understand the problem. Opinion mining is a set of natural-language-processing tasks whose main goal is to determine whether the opinion about a document or an aspect is positive, negative, or neutral.
But wait! There is a problem that you likely didn’t notice.
There are some clauses in the sentence, known as conditions, that change the sense of the opinion. For instance, the positive opinion about “lens” is only true if you consider amateur photographers. Likewise, the negative opinion regarding “Flash” is only true if the user uses the camera outdoors.
This is the roadmap of my presentation: I’ll start with a broad introduction to the problem, then I’ll report on our proposal, then on some experimental results, and, finally, I’ll present some conclusions.
Let’s start with the introduction
Simply put, condition mining is a task whose goal is to identify conditions from a piece of text.
Currently, there are two approaches in the literature, namely: handcrafted patterns and machine learning.
Handcrafted or user-defined patterns explicitly describe how to identify a condition in a text by means of connectives, POS tags, dependency tags, or other clue words.
There are two proposals of this kind: Mausam, who studied the problem in the field of entity-relation extraction and used adverbial clauses from the dependency tree; and Chikersal, who studied the problem in the field of opinion mining and used basic connectives and the tokens “then” and comma.
Unfortunately, the previous proposals are not appealing because of the human effort required to handcraft such patterns.
Furthermore, the results typically fall short regarding recall because of the variability of the conditions.
The only existing machine-learning proposal was introduced by Nakayama et al., who worked in the field of opinion mining in Japanese. They devised a model based on several features from opinion expressions, which requires providing some specific-purpose dictionaries, taxonomies, and heuristics. They used Conditional Random Fields and Support Vector Machines to learn classifiers of the syntactic units of the sentences.
Unfortunately, their proposal was only evaluated on a small dataset with 3,155 sentences regarding hotels, and the best F1 score attained was 0.58. In conclusion, this proposal is not generally applicable and its effectiveness is poor.
Then, we’ll describe our proposal
Our solution is a hybrid approach that combines computational linguistics and deep learning. It does not have any of the problems found in the related work.
Our input is a set of sentences with their corresponding sets of labels; those sets identify the conditions in each sentence.
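The input format just described can be sketched as follows, using the sentences and condition lists from the running example (the variable name is an assumption, not the paper’s notation):

```python
# Each input example pairs a sentence with the list of its labelled
# conditions, which may be empty, as in the "Inputs" slide.
training_examples = [
    ("I think that the lens is beyond excellent for amateurs.",
     ["for amateurs"]),
    ("The resolution of this camera is 13Mp.",
     []),
    ("Flash is tacky when using outdoors.",
     ["when using outdoors"]),
]

# Sanity check: every example is a (sentence, conditions) pair.
for sentence, conditions in training_examples:
    assert isinstance(sentence, str) and isinstance(conditions, list)
```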
These are our proposal’s main methods.
Method “train” returns a regressor that computes a score that assesses how likely a candidate condition is an actual condition.
The procedure is repeated for every input sentence to compute a subset of training examples.
The procedure starts by generating a set of condition candidates from the sentence’s dependency tree. The heuristic used is quite simple: we consider every non-leaf node in the dependency tree and compute all of the sequences of tokens that originate from that node.
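A minimal sketch of that heuristic, run on the running example “Flash is tacky when using outdoors”. The head-index encoding of the dependency tree is an assumption made for self-containment; the paper works on the parser’s actual output:

```python
# Toy dependency tree: HEADS[i] is the index of token i's head,
# and -1 marks the root ("tacky").  Edges correspond to the slide's
# nsubj, cop, advcl, and advmod relations.
TOKENS = ["Flash", "is", "tacky", "when", "using", "outdoors"]
HEADS = [2, 2, -1, 4, 2, 4]

def subtree_indices(root):
    """Indices of all tokens in the subtree rooted at `root`."""
    indices = {root}
    changed = True
    while changed:
        changed = False
        for i, head in enumerate(HEADS):
            if head in indices and i not in indices:
                indices.add(i)
                changed = True
    return sorted(indices)

def candidates():
    """One token sequence per non-leaf node of the tree."""
    non_leaves = {h for h in HEADS if h >= 0}
    return {" ".join(TOKENS[i] for i in subtree_indices(n))
            for n in non_leaves}
```

On this tree the non-leaf nodes are “tacky” and “using”, so the heuristic yields exactly the two candidates shown on the slides: c1 “Flash is tacky when using outdoors” and c2 “when using outdoors”.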
For each candidate, we compute a score that represents how likely it is to be a condition.
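These notes do not spell out the scoring function. One plausible stand-in (an assumption, not the paper’s method, which evidently differs since the slides show 0.8560 for c1) is a token-overlap F1 between a candidate and the gold condition:

```python
def overlap_score(candidate, gold):
    """Token-overlap F1 between a candidate and a gold condition:
    1.0 for an exact match, lower for partial overlaps."""
    c, g = candidate.split(), gold.split()
    common = len(set(c) & set(g))
    if common == 0:
        return 0.0
    precision = common / len(c)
    recall = common / len(g)
    return 2 * precision * recall / (precision + recall)
```

Under this stand-in, c2 “when using outdoors” scores 1.0 against the gold condition, while the longer c1 scores lower because of its spurious tokens.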
And finally, we train a deep regressor using well-known deep-learning networks.
We have experimented with dozens of neural-network alternatives, but the best ones are those that we present in our paper, namely: Multilayer Perceptron, Gated Recurrent Unit network, Bidirectional Gated Recurrent Unit network, Convolutional Neural Network, and a hybrid network composed of both convolutional layers and bidirectional GRU layers.
Method “apply” returns the conditions found in a sentence by means of that regressor.
We first need to compute the set of condition candidates of the sentence. This method is the same as the one used in method “train”.
The procedure is repeated for every candidate to check whether it must be considered or not.
Given a candidate, we need to score it. In this case, we use the regressor that we trained before.
If the score is equal to or greater than a given threshold, it is added to the result set.
Finally, we remove the conditions that overlap others with a higher score.
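The last two “apply” steps, thresholding and removal of lower-scored overlapping conditions, can be sketched as below. Representing each candidate as a (text, token-span, score) triple is an assumption made for illustration:

```python
def select_conditions(scored_candidates, threshold):
    """Keep candidates whose score reaches the threshold, then drop
    any condition that overlaps another one with a higher score."""
    kept = [c for c in scored_candidates if c[2] >= threshold]
    kept.sort(key=lambda c: c[2], reverse=True)  # best first
    result = []
    for text, span, score in kept:
        # Two half-open token spans overlap iff each starts
        # before the other ends.
        overlaps = any(span[0] < s[1] and s[0] < span[1]
                       for _, s, _ in result)
        if not overlaps:
            result.append((text, span, score))
    return [text for text, _, _ in result]
```

On the running example, c2 (score 1.0000) survives and c1 (score 0.8560) is discarded: either it falls below the threshold or, if it passes, it overlaps the higher-scored c2.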
Now, let me show you our experimental results.
This is our hardware and software configuration. As you can see, it’s a pretty regular configuration with recent versions of software components.
We used a dataset with almost 4 million sentences in English and Spanish. In addition, we increased the number of sentences by adding new languages, namely French and Italian, and we uploaded the dataset to the Kaggle platform.
We used the handcrafted-pattern proposals as baselines. The machine-learning proposal was not considered because it is not clear whether it can be customised to deal with languages other than Japanese, its best F1 was 0.58, and we could not find an implementation or the dataset.
In this slide we present our results in terms of F1 score. Our best alternatives are CNN and CNN-BiGRU, which beat the related-work proposals. We performed a statistical analysis to determine which alternative is the winner.
It’s time for conclusions
Our conclusions are that we have presented a proposal that overcomes the problems found in the literature. Our experimental analysis covers a variety of alternatives and achieves promising results.