1.
A Semi-supervised Approach to Multi-dimensional Classification with Application to Sentiment Analysis
Jonathan Ortigosa-Hernández (1), Juan Diego Rodríguez (1), Leandro Alzate (2), Iñaki Inza (1), and José A. Lozano (1)
(1) Intelligent Systems Group, Computer Science and Artificial Intelligence Department, University of the Basque Country
(2) Socialware Company, Bilbao, Spain
CEDI 2010, Valencia, September 9th, 2010
2.
Index
  1. Introduction
  2. Multi-dimensional J/K Dependence Bayesian Classifier
  3. Multi-dimensional Semi-supervised Learning
  4. Experimentation in Sentiment Analysis
     - Sentiment Analysis Problem
     - ASOMO Dataset
     - Results
  5. Conclusions and Future Work
  6. References
3.
Introduction: Supervised Classification
Supervised classification consists of building a classifier Ψ from a given labelled training dataset D, by using an induction algorithm A (A(D) = Ψ),

\[
\begin{array}{cccc|c}
X_1 & X_2 & \cdots & X_n & C \\
\hline
x_1^{(1)} & x_2^{(1)} & \cdots & x_n^{(1)} & c^{(1)} \\
x_1^{(2)} & x_2^{(2)} & \cdots & x_n^{(2)} & c^{(2)} \\
\vdots & \vdots & & \vdots & \vdots \\
x_1^{(N)} & x_2^{(N)} & \cdots & x_n^{(N)} & c^{(N)}
\end{array}
\]

in order to predict the value of a class variable C for any new unlabelled instance x (Ψ(x) = c).
4.
Introduction: Uni-dimensional and Multi-dimensional Classification
- Uni-dimensional classification tries to predict a single class variable based on a dataset composed of a set of labelled examples. Typical models: (uni-dimensional class) Bayesian network classifiers (Larrañaga et al., 2005).
- Multi-dimensional classification is the generalisation of the single-class classification task to the simultaneous prediction of a set of class variables. Typical models: multi-dimensional class Bayesian network classifiers (van der Gaag and de Waal, 2006).
5.
Introduction: Multi-dimensional Supervised Learning
A typical supervised training dataset:

\[
\begin{array}{cccc|cccc}
X_1 & X_2 & \cdots & X_n & C_1 & C_2 & \cdots & C_m \\
\hline
x_1^{(1)} & x_2^{(1)} & \cdots & x_n^{(1)} & c_1^{(1)} & c_2^{(1)} & \cdots & c_m^{(1)} \\
x_1^{(2)} & x_2^{(2)} & \cdots & x_n^{(2)} & c_1^{(2)} & c_2^{(2)} & \cdots & c_m^{(2)} \\
\vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots \\
x_1^{(N)} & x_2^{(N)} & \cdots & x_n^{(N)} & c_1^{(N)} & c_2^{(N)} & \cdots & c_m^{(N)}
\end{array}
\]

Each instance of the dataset contains both the values of the n attributes and the m labels which characterise them.
6.
Introduction: Bayesian Network Classifiers
[Figure: a Bayesian network classifier with one class variable C and features X1, ..., X6.]
7.
Introduction: Multi-dimensional Class Bayesian Network Classifiers (MDBNC)
[Figure: a multi-dimensional class Bayesian network classifier with class variables C1, C2, C3 and features X1, ..., X6.]
9.
Introduction: Sub-families of MDBNC
[Figure with three panels: (a) multi-dimensional naive Bayes; (b) multi-dimensional tree-augmented network; (c) multi-dimensional J/K dependence Bayesian classifier (2/3).]
10.
Introduction: Sub-families of MDBNC
  1. Multi-dimensional naive Bayes classifier (MDnB). It has a fixed structure, so no structural learning is needed (van der Gaag and de Waal, 2006).
  2. Multi-dimensional tree-augmented network classifier (MDTAN). It uses the structural learning proposed in (van der Gaag and de Waal, 2006).
  3. Multi-dimensional J/K dependence Bayesian classifier (MD J/K). Its structural learning is proposed in this paper.
11.
Multi-dimensional J/K Dependence Bayesian Classifier (MD J/K) I
A supervised method to learn a J/K structure
(1) Learn the structure between the class variables (A_C):
  1. Calculate the p-value (significance of the mutual information MI(C_i, C_j)) using the independence test for each pair of class variables, and rank the pairs by p-value.
  2. Discard the pairs whose p-value is greater than the threshold α = 0.1.
  3. Follow the ranking to add arcs between the class variables, subject to two conditions: no cycles between the class variables and no more than J parents per class variable.
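The three steps above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the authors' implementation: the slides do not name the independence test, so the sketch assumes a G-test (the statistic 2·N·MI compared against a chi-square distribution); the arc orientation rule (try C_i → C_j first, then the reverse) is also an assumption, and all function names and the toy data are hypothetical. In practice a library routine such as `scipy.stats.chi2.sf` would replace the hand-written `chi2_sf`.

```python
import math
from collections import Counter
from itertools import combinations

def mutual_information(a, b):
    """Empirical mutual information (in nats) between two label sequences."""
    n = len(a)
    count_a, count_b, count_ab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum((c / n) * math.log(c * n / (count_a[x] * count_b[y]))
               for (x, y), c in count_ab.items())

def chi2_sf(x, df):
    """Chi-square survival function, via the power series of the lower
    incomplete gamma function. Intended for small tables only."""
    if x <= 0 or df <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    if z > 700.0:  # avoids overflow; the p-value is negligible for small df
        return 0.0
    term = total = 1.0 / a
    k = a
    for _ in range(10000):
        k += 1.0
        term *= z / k
        total += term
        if term < total * 1e-15:
            break
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return min(1.0, max(0.0, 1.0 - lower))

def independence_p_value(a, b):
    """Assumed test: G = 2 * N * MI with (|A|-1)(|B|-1) degrees of freedom."""
    g = 2.0 * len(a) * mutual_information(a, b)
    df = (len(set(a)) - 1) * (len(set(b)) - 1)
    return chi2_sf(g, df)

def creates_cycle(parents, src, dst):
    """True if adding src -> dst closes a cycle, i.e. dst is an ancestor of src."""
    stack, seen = [src], set()
    while stack:
        node = stack.pop()
        if node == dst:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(parents[node])
    return False

def learn_class_arcs(columns, max_parents, alpha=0.1):
    """Steps 1-3: rank pairs by p-value, drop p > alpha, add arcs greedily."""
    scored = []
    for u, v in combinations(columns, 2):
        p = independence_p_value(columns[u], columns[v])
        if p <= alpha:
            scored.append((p, u, v))
    scored.sort()  # most significant pairs first
    parents = {name: [] for name in columns}
    for _, u, v in scored:
        for src, dst in ((u, v), (v, u)):  # assumed orientation rule
            if len(parents[dst]) < max_parents and not creates_cycle(parents, src, dst):
                parents[dst].append(src)
                break
    return parents

# Hypothetical toy data: c2 is a relabelling of c1, c3 is independent of both.
c1 = ["a"] * 20 + ["b"] * 20
c2 = ["x"] * 20 + ["y"] * 20
c3 = ["u", "v"] * 20
columns = {"c1": c1, "c2": c2, "c3": c3}
parents = learn_class_arcs(columns, max_parents=2)
print(parents)
```

On this toy data only the pair (c1, c2) survives the threshold, so a single arc c1 → c2 is added.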
12.
Multi-dimensional J/K Dependence Bayesian Classifier (MD J/K) II
A supervised method to learn a J/K structure
(2) Learn the structure between the class variables and the features (A_CF):
  1. Calculate the p-value (significance of the mutual information MI(C_i, X_j)) using the independence test for each pair C_i and X_j, and rank the pairs by p-value.
  2. Discard the pairs whose p-value is greater than the threshold α = 0.1.
  3. Follow the ranking to add arcs from the class variables to the features.
13.
Multi-dimensional J/K Dependence Bayesian Classifier (MD J/K) III
A supervised method to learn a J/K structure
(3) Learn the structure between the features (A_F):
  1. Calculate the p-value (significance of the conditional mutual information MI(X_i, X_j | Pa_c(X_j))) using the conditional independence test for each pair X_i and X_j given Pa_c(X_j), the class-variable parents of X_j, and rank the pairs by p-value.
  2. Discard the pairs whose p-value is greater than the threshold α = 0.1.
  3. Follow the ranking to add arcs between the features, subject to two conditions: no cycles between the features and no more than K parents per feature.
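The conditional mutual information used in step (3) can be computed as a weighted average of the ordinary mutual information inside each configuration of the conditioning variables. The sketch below is a minimal illustration; the function names and toy data are hypothetical, and `z` stands for the (tuple of) values of the class parents Pa_c(X_j).

```python
import math
from collections import Counter, defaultdict

def mutual_information(a, b):
    """Empirical mutual information (in nats) between two sequences."""
    n = len(a)
    count_a, count_b, count_ab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum((c / n) * math.log(c * n / (count_a[x] * count_b[y]))
               for (x, y), c in count_ab.items())

def conditional_mutual_information(x, y, z):
    """MI(X, Y | Z) = sum over z of P(z) * MI(X, Y | Z = z)."""
    n = len(x)
    groups = defaultdict(list)
    for xi, yi, zi in zip(x, y, z):
        groups[zi].append((xi, yi))
    total = 0.0
    for pairs in groups.values():
        xs, ys = zip(*pairs)
        total += (len(pairs) / n) * mutual_information(xs, ys)
    return total

# Toy data: X and Y look independent marginally, but are fully
# dependent once Z is known (X == Y when Z == 0, X != Y when Z == 1).
x = [0, 1, 0, 1]
y = [0, 1, 1, 0]
z = [0, 0, 1, 1]
marginal = mutual_information(x, y)
conditional = conditional_mutual_information(x, y, z)
print(marginal, conditional)
```

The toy data shows why conditioning matters: the marginal mutual information is zero, while the conditional value equals log 2.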
14.
Multi-dimensional Semi-supervised Learning: Major Problem of Supervised Learning
- The labelling of the training data is usually done by an external mechanism (usually human beings).
- However, in many real-world problems, obtaining data is relatively easy, while labelling it is difficult, expensive or labour-intensive.
- This problem is accentuated when there are multiple target variables.
What we want: learning algorithms able to combine a large amount of unlabelled data with a small amount of labelled data when learning classifiers.
16.
Multi-dimensional Semi-supervised Learning: The Expectation-Maximisation Algorithm
The EM algorithm (Dempster et al., 1977):
  1. Learn an initial model.
  2. Repeat until convergence:
     (a) Expectation step: using the current model, estimate the missing values of the data.
     (b) Maximisation step: using the whole data and the previous estimations, learn a new current model.
Any MDBNC learning algorithm can be used as the model in this algorithm.
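The loop above can be illustrated with a deliberately simplified sketch. It makes several assumptions not taken from the slides: a single class variable and a plain naive Bayes model instead of an MDBNC, discrete features coded as integers, Laplace smoothing, and a fixed number of iterations instead of a convergence test. All names and the toy data are hypothetical.

```python
import math

def fit_nb(rows, weights, classes, n_values, smoothing=1.0):
    """M-step: naive Bayes estimated from (possibly fractional) class weights."""
    prior = {c: smoothing for c in classes}
    cond = {c: [[smoothing] * n_values for _ in rows[0]] for c in classes}
    for x, w in zip(rows, weights):
        for c in classes:
            prior[c] += w[c]
            for j, v in enumerate(x):
                cond[c][j][v] += w[c]
    total = sum(prior.values())
    return {c: (prior[c] / total,
                [[cnt / sum(col) for cnt in col] for col in cond[c]])
            for c in classes}

def posterior(model, x):
    """E-step for one instance: P(c | x) under the current model."""
    logs = {c: math.log(p) + sum(math.log(t[j][v]) for j, v in enumerate(x))
            for c, (p, t) in model.items()}
    top = max(logs.values())
    unnorm = {c: math.exp(l - top) for c, l in logs.items()}
    norm = sum(unnorm.values())
    return {c: u / norm for c, u in unnorm.items()}

def em_naive_bayes(labelled, unlabelled, classes, n_values, iterations=20):
    """Semi-supervised EM: labelled rows keep hard labels, unlabelled rows
    receive soft labels that are re-estimated at every iteration."""
    labelled_rows = [x for x, _ in labelled]
    hard = [{c: 1.0 if c == y else 0.0 for c in classes} for _, y in labelled]
    model = fit_nb(labelled_rows, hard, classes, n_values)  # initial model
    rows = labelled_rows + list(unlabelled)
    for _ in range(iterations):  # fixed budget instead of a convergence check
        soft = [posterior(model, x) for x in unlabelled]        # E-step
        model = fit_nb(rows, hard + soft, classes, n_values)    # M-step
    return model

# Hypothetical toy data: two binary features, two labelled instances.
labelled = [((0, 0), "neg"), ((1, 1), "pos")]
unlabelled = [(0, 0), (0, 0), (0, 1), (1, 1), (1, 1), (1, 0)]
model = em_naive_bayes(labelled, unlabelled, classes=("neg", "pos"), n_values=2)
post_00 = posterior(model, (0, 0))
post_11 = posterior(model, (1, 1))
print(post_00, post_11)
```

Swapping `fit_nb` and `posterior` for the learning and inference routines of another classifier keeps the same loop, which is the sense in which any MDBNC learner can be plugged in.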
18.
Experimentation in Sentiment Analysis: Sentiment Analysis Problem
Definitions (Liu, 2010)
Sentiment Analysis (also known as Opinion Mining) is the computational study of opinions, sentiments and emotions expressed in text.
When treating Sentiment Analysis as a classification problem, two different problems appear:
  1. Subjectivity Classification. Its aim is to classify a text as subjective or objective.
  2. Sentiment Classification. It classifies an opinionated text as expressing a positive, neutral, or negative opinion.
19.
Experimentation in Sentiment Analysis: Sentiment Analysis Problem
Motivation for Using Semi-supervised Learning of Multi-dimensional Classifiers
  1. Up to now, these two subproblems have been studied in isolation despite being closely related, so multi-dimensional classifiers are likely to help.
  2. Obtaining enough labelled examples for a classifier may be costly and time-consuming. This motivates us to deal with unlabelled examples in a semi-supervised framework.
21.
Experimentation in Sentiment Analysis: ASOMO Dataset
Properties of the Dataset
- Collected by Socialware Company, from the ASOMO service of mobilised opinion analysis.
- It consists of 2,542 Spanish reviews extracted from a blog:
  - 150 documents have been labelled in isolation by an expert.
  - 2,392 posts are left unlabelled.
22.
Experimentation in Sentiment Analysis: ASOMO Dataset
Properties of Each Document
Each document is represented as:
- 14 features
  - Obtained by using an open-source morphological analyser (Carreras et al., 2006).
  - Each feature provides different information related to part-of-speech (POS), e.g. First Persons, Agreement Expressions, Imperatives, Prediction Verbs (future), Questions, Positive Adjectives, etc.
  - Each feature is represented as a real number between 0 and 1.
- 3 class variables
  - Will to Influence: {declarative sentence, soft WI, medium WI, strong WI}
  - Sentiment: {very negative, negative, neutral, positive, very positive}
  - Subjectivity: {Yes (subjective), No (objective)}
24.
Experimentation in Sentiment Analysis: Results
Experiment
The ASOMO dataset has been used to learn:
- 3 (uni-dimensional) Bayesian network classifiers: nB, TAN and 2DB.
- 5 MDBNC: MDnB, MDTAN, MD 2/2, MD 2/3 and MD 2/4.
Both families are learnt in the supervised and the semi-supervised (EM algorithm) frameworks.
Features from the ASOMO dataset are discretised into 3 values using equal-frequency discretisation.
Results are averaged over 5 × 5-fold cross-validation.
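Equal-frequency discretisation into three values, as mentioned above, could look like the following sketch. The slides do not say how cut points or ties were handled, so the rule used here (cut points taken from the sorted sample, equal values always sharing a bin) is an assumption; the function names and sample values are hypothetical.

```python
import bisect

def equal_frequency_cut_points(values, n_bins=3):
    """Return n_bins - 1 cut points so that each bin holds roughly the
    same number of training values."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(k * n) // n_bins] for k in range(1, n_bins)]

def discretise(value, cut_points):
    """Map a real value to a bin index in 0 .. len(cut_points)."""
    return bisect.bisect_right(cut_points, value)

# Hypothetical feature values in [0, 1], as in the ASOMO representation.
feature = [0.50, 0.05, 0.95, 0.30, 0.10, 0.80, 0.20, 0.70, 0.45]
cuts = equal_frequency_cut_points(feature, n_bins=3)
bins = [discretise(v, cuts) for v in feature]
print(cuts, bins)
```

In a cross-validation setting the cut points would normally be computed on the training folds only and then applied to the held-out fold.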
25.
Experimentation in Sentiment Analysis: Results
Joint Accuracy
This measure evaluates the predictions of all class variables simultaneously: it counts a success only if all the class variables are correctly predicted, and an error otherwise.
[Figure: joint accuracy for nB, TAN, 2DB, MDnB, MDTAN, MD 2/2, MD 2/3 and MD 2/4; vertical axis ticks from 5 to 25.]
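The joint accuracy defined above, next to the ordinary per-variable accuracy, can be written as the short sketch below. The function names and the toy predictions are hypothetical and are not results from the ASOMO experiments.

```python
def joint_accuracy(y_true, y_pred):
    """Fraction of instances whose class variables are ALL predicted correctly."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(1 for t, p in zip(y_true, y_pred) if tuple(t) == tuple(p))
    return hits / len(y_true)

def per_class_accuracy(y_true, y_pred, index):
    """Accuracy of a single class variable, selected by its position."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if t[index] == p[index])
    return hits / len(y_true)

# Made-up labels ordered as (Will to Influence, Sentiment, Subjectivity).
y_true = [("soft", "positive", "yes"), ("strong", "negative", "yes"),
          ("declarative", "neutral", "no"), ("medium", "positive", "yes")]
y_pred = [("soft", "positive", "yes"), ("strong", "neutral", "yes"),
          ("declarative", "neutral", "no"), ("soft", "negative", "yes")]

joint = joint_accuracy(y_true, y_pred)
single = [per_class_accuracy(y_true, y_pred, i) for i in range(3)]
print(joint, single)
```

Joint accuracy can never exceed the lowest per-variable accuracy, which is consistent with the joint plot using a much lower axis range than the per-variable plots.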
26.
Experimentation in Sentiment Analysis: Results
Will to Influence
[Figure: accuracy on the Will to Influence class variable for nB, TAN, 2DB, MDnB, MDTAN, MD 2/2, MD 2/3 and MD 2/4; vertical axis ticks from 30 to 70.]
27.
Experimentation in Sentiment Analysis: Results
Sentiment Polarity
[Figure: accuracy on the Sentiment class variable for nB, TAN, 2DB, MDnB, MDTAN, MD 2/2, MD 2/3 and MD 2/4; vertical axis ticks from 20 to 40.]
28.
Experimentation in Sentiment Analysis: Results
Subjectivity
[Figure: accuracy on the Subjectivity class variable for nB, TAN, 2DB, MDnB, MDTAN, MD 2/2, MD 2/3 and MD 2/4; vertical axis ticks from 50 to 90.]
29.
Conclusions and Future Work: Conclusions
- Multi-dimensional classification and semi-supervised learning are two different branches of machine learning.
- In this research, we have established a bridge between them, showing that these techniques are competitive with the state-of-the-art algorithms.
“As discussed in the literature, currently there is no coherent strategy for handling unlabelled data, so some creativity must be exercised.” (Cohen)
30.
Conclusions and Future Work: Future Work
- What kinds of problem could benefit from this? Web-related problems, e.g. Affect Analysis.
- Scalability of MDBNC (high dimensionality of multi-label problems).
- The literature has shown that structure learning algorithms that maximise the likelihood are not likely to find structures yielding good classifiers in a semi-supervised manner.
- The preliminary ASOMO results show that using EM with a probabilistic model cannot ensure learning a correct model. Thus, we need to perform a structure search that attempts to maximise classification accuracy directly.
31.
Conclusions and Future Work: Questions
Thank you.
jonathan.ortigosa@ehu.es
32.
References
- Liu, B. (2010). Sentiment Analysis and Subjectivity. In: Indurkhya, N. and Damerau, F.J. Handbook of Natural Language Processing, Chapman & Hall, 2nd Ed.
- Rodriguez, J.D. and Lozano, J.A. (2008). Multi-objective learning of multi-dimensional Bayesian classifiers. In Proceedings of the Eighth International Conference on Hybrid Intelligent Systems, HIS 2008, Barcelona, Spain, pp. 501-506.
- van der Gaag, L. and de Waal, P. (2006). Multi-dimensional Bayesian Classifiers. In Proceedings of the Third European Workshop in Probabilistic Graphical Models, pp. 107-114.
- Larrañaga, P., Lozano, J.A., Peña, J.M. and Inza, I. (2005). Special Issue on Probabilistic Graphical Models for Classification. Machine Learning, 59(3).
- Cozman, F. and Cohen, I. (2006). Risk of Semi-Supervised Learning. In: Chapelle, O., Schölkopf, B. and Zien, A. Semi-Supervised Learning. The MIT Press, pp. 57-72.
- Friedman, N. (1998). The Bayesian Structural EM algorithm. In Proc. 14th Conf. on Uncertainty in Artificial Intelligence. Morgan Kaufmann, San Francisco, CA, pp. 129-138.
- Dempster, A., Laird, N. and Rubin, D. (1977). Maximum Likelihood from Incomplete Data via the EM Algorithm. Journal of the Royal Statistical Society, Series B, 39(1): 1-38.
- Carreras, X., Chao, I., Padro, L. and Padro, M. (2006). An Open-Source Suite of Language Analyzers. In Proceedings of the 4th Int. Conference on Language Resources and Evaluation, Vol. 10, pp. 239-342.