CEDI2010 - Slides

Transcript

  • 1. A Semi-supervised Approach to Multi-dimensional Classification with Application to Sentiment Analysis. Jonathan Ortigosa-Hernández (1), Juan Diego Rodríguez (1), Leandro Alzate (2), Iñaki Inza (1), and José A. Lozano (1). (1) Intelligent Systems Group, Computer Science and Artificial Intelligence Department, University of the Basque Country. (2) Socialware Company, Bilbao, Spain. CEDI 2010, Valencia, September 9th, 2010.
  • 2. Index: 1 Introduction. 2 Multi-dimensional J/K Dependence Bayesian Classifier. 3 Multi-dimensional Semi-supervised Learning. 4 Experimentation in Sentiment Analysis (Sentiment Analysis Problem, ASOMO Dataset, Results). 5 Conclusions and Future Work. 6 References.
  • 3. Introduction: Supervised Classification. It consists of building a classifier Ψ from a given labelled training dataset D by using an induction algorithm A (A(D) = Ψ), in order to predict the value of a class variable C for any new unlabelled instance x (Ψ(x) = c). The dataset D holds N instances over the features X1, ..., Xn and the class C:
        X1       X2       ...  Xn       C
        x1^(1)   x2^(1)   ...  xn^(1)   c^(1)
        x1^(2)   x2^(2)   ...  xn^(2)   c^(2)
        ...      ...      ...  ...      ...
        x1^(N)   x2^(N)   ...  xn^(N)   c^(N)
  • 4. Introduction: Uni-dimensional and Multi-dimensional Classification. Uni-dimensional classification tries to predict a single class variable based on a dataset composed of a set of labelled examples; see (uni-dimensional class) Bayesian network classifiers (Larrañaga et al., 2005). Multi-dimensional classification is the generalisation of the single-class classification task to the simultaneous prediction of a set of class variables; see multi-dimensional class Bayesian network classifiers (van der Gaag and de Waal, 2006).
  • 5. Introduction: Multi-dimensional Supervised Learning. A typical supervised training dataset holds N instances over the features X1, ..., Xn and the class variables C1, ..., Cm. Each instance contains both the values of its attributes and the m class labels which characterise it:
        X1       X2       ...  Xn       C1       C2       ...  Cm
        x1^(1)   x2^(1)   ...  xn^(1)   c1^(1)   c2^(1)   ...  cm^(1)
        x1^(2)   x2^(2)   ...  xn^(2)   c1^(2)   c2^(2)   ...  cm^(2)
        ...      ...      ...  ...      ...      ...      ...  ...
        x1^(N)   x2^(N)   ...  xn^(N)   c1^(N)   c2^(N)   ...  cm^(N)
  • 6. Introduction: Bayesian Network Classifiers. [Figure: network over a single class variable C and the features X1 to X6.]
  • 7. Introduction: Multi-dimensional Class Bayesian Network Classifiers (MDBNC). [Figure: network over the class variables C1 to C3 and the features X1 to X6.]
  • 8. Introduction: MDBNC Structure. [Figure, four panels over the class variables C1 to C3 and the features X1 to X5: (a) complete graph, (b) feature selection subgraph, (c) class subgraph, (d) feature subgraph.]
  • 9. Introduction: Sub-families of MDBNC. [Figure, three panels: (a) multi-dimensional naive Bayes, (b) multi-dimensional tree-augmented network, (c) multi-dimensional J/K dependence Bayesian (2/3).]
  • 10. Introduction: Sub-families of MDBNC. 1 Multi-dimensional naive Bayes classifier (MDnB): it has a fixed structure, so it needs no structural learning (van der Gaag and de Waal, 2006). 2 Multi-dimensional tree-augmented network classifier (MDTAN): it uses the structural learning proposed in (van der Gaag and de Waal, 2006). 3 Multi-dimensional J/K dependence Bayesian classifier (MD J/K): its structural learning is proposed in this paper.
  • 11. Multi-dimensional J/K Dependence Bayesian Classifier (MD J/K) I. A supervised method to learn a J/K structure. (1) Learn the structure between the class variables (A_C): 1 Calculate the p-value (significance of the mutual information MI(Ci, Cj)) using the independence test for each pair of class variables, and rank the pairs. 2 Remove the pairs whose p-value is greater than the threshold α = 0.1. 3 Use the ranking to add arcs between the class variables, subject to two conditions: no cycles between the class variables and no more than J parents per class variable.
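The quantity ranked in step 1 can be illustrated with a short sketch. The slides do not say which independence test is used; the code below assumes the common G-test, whose statistic is 2·N·MI (MI in nats) and which is compared against a chi-square distribution with (|Ci| - 1)(|Cj| - 1) degrees of freedom to obtain a p-value. The function names and the toy labels are hypothetical.

```python
import math
from collections import Counter

def mutual_information(a, b):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(a)
    count_a = Counter(a)
    count_b = Counter(b)
    count_ab = Counter(zip(a, b))
    mi = 0.0
    for (va, vb), n_ab in count_ab.items():
        # p(a,b) * log(p(a,b) / (p(a) * p(b))), written with raw counts
        mi += (n_ab / n) * math.log(n_ab * n / (count_a[va] * count_b[vb]))
    return mi

def g_statistic(a, b):
    """Assumed test: G = 2 * N * MI, returned with its degrees of freedom."""
    n = len(a)
    dof = (len(set(a)) - 1) * (len(set(b)) - 1)
    return 2.0 * n * mutual_information(a, b), dof

# Toy labels for two class variables (hypothetical data, not from ASOMO).
subjectivity = ["yes", "yes", "no", "no", "yes", "no", "yes", "no"]
sentiment = ["pos", "neg", "neu", "neu", "pos", "neu", "neg", "neu"]
g, dof = g_statistic(subjectivity, sentiment)
```

The p-value would then be the upper tail of the chi-square distribution with `dof` degrees of freedom evaluated at `g`; that lookup is left out here to keep the sketch free of third-party dependencies.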
  • 12. Multi-dimensional J/K Dependence Bayesian Classifier (MD J/K) II. A supervised method to learn a J/K structure. (2) Learn the structure between the class variables and the features (A_CF): 1 Calculate the p-value (significance of the mutual information MI(Ci, Xj)) using the independence test for each pair Ci and Xj, and rank the pairs. 2 Remove the pairs whose p-value is greater than the threshold α = 0.1. 3 Use the ranking to add arcs from the class variables to the features.
  • 13. Multi-dimensional J/K Dependence Bayesian Classifier (MD J/K) III. A supervised method to learn a J/K structure. (3) Learn the structure between the features (A_F): 1 Calculate the p-value (significance of the conditional mutual information MI(Xi, Xj | Pa_C(Xj))) using the conditional independence test for each pair Xi and Xj given Pa_C(Xj), and rank the pairs. 2 Remove the pairs whose p-value is greater than the threshold α = 0.1. 3 Use the ranking to add arcs between the features, subject to two conditions: no cycles between the features and no more than K parents per feature.
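Steps 2 and 3 of each stage (threshold the p-values, then add arcs in rank order under the acyclicity and parent-count constraints) can be sketched as a greedy loop. This is an illustration under assumptions, not the authors' code: the slides do not state how an arc is oriented, so each candidate is assumed to arrive as a (p_value, parent, child) triple, and the names `add_ranked_arcs` and `creates_cycle` are hypothetical.

```python
def creates_cycle(parents, child, new_parent):
    """True if adding the arc new_parent -> child would close a directed cycle."""
    # A cycle appears exactly when child is already an ancestor of new_parent.
    stack, seen = [new_parent], set()
    while stack:
        node = stack.pop()
        if node == child:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(parents.get(node, []))
    return False

def add_ranked_arcs(scored_pairs, max_parents, alpha=0.1):
    """Greedy arc selection; scored_pairs holds (p_value, parent, child) triples."""
    parents = {}
    # Step 2: drop pairs above the threshold. Step 3: visit the rest by rank.
    kept = sorted(t for t in scored_pairs if t[0] <= alpha)
    for _, parent, child in kept:
        current = parents.setdefault(child, [])
        if len(current) >= max_parents:
            continue  # child already has J (or K) parents
        if creates_cycle(parents, child, parent):
            continue  # arc would break acyclicity
        current.append(parent)
    return parents

# Hypothetical p-values for feature pairs, with K = 2 parents per feature.
pairs = [
    (0.01, "X1", "X2"), (0.02, "X2", "X3"), (0.03, "X3", "X1"),
    (0.04, "X4", "X2"), (0.05, "X5", "X2"), (0.30, "X4", "X5"),
]
structure = add_ranked_arcs(pairs, max_parents=2)
```

In this toy run the arc X3 -> X1 is rejected because it would close a cycle, X5 -> X2 is rejected because X2 already has two parents, and the pair with p-value 0.30 never enters the ranking.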
  • 14. Multi-dimensional Semi-supervised Learning: Major Problem of Supervised Learning. The labelling of the training data is usually done by an external mechanism (usually human beings). However, in many real-world problems, obtaining data is relatively easy, while labelling is difficult, expensive or labour intensive. This problem is accentuated when using multiple target variables. DESIRE: learning algorithms able to incorporate a large amount of unlabelled data together with a small amount of labelled data when learning classifiers.
  • 15. Multi-dimensional Semi-supervised Learning. A typical semi-supervised training dataset: the first L instances are labelled and the remaining N - L instances have unknown (?) class values.
        X1         X2         ...  Xn         C1       C2       ...  Cm
        x1^(1)     x2^(1)     ...  xn^(1)     c1^(1)   c2^(1)   ...  cm^(1)
        x1^(2)     x2^(2)     ...  xn^(2)     c1^(2)   c2^(2)   ...  cm^(2)
        ...        ...        ...  ...        ...      ...      ...  ...
        x1^(L)     x2^(L)     ...  xn^(L)     c1^(L)   c2^(L)   ...  cm^(L)
        x1^(L+1)   x2^(L+1)   ...  xn^(L+1)   ?        ?        ...  ?
        x1^(L+2)   x2^(L+2)   ...  xn^(L+2)   ?        ?        ...  ?
        ...        ...        ...  ...        ...      ...      ...  ...
        x1^(N)     x2^(N)     ...  xn^(N)     ?        ?        ...  ?
    Semi-supervised learning fulfils this desire.
  • 16. Multi-dimensional Semi-supervised Learning: The Expectation-Maximization Algorithm. The EM algorithm (Dempster et al., 1977): learn an initial model, then repeat until convergence: (a) Expectation step: using the current model, estimate the missing values of the data. (b) Maximisation step: using the whole data and the previous estimations, learn a new current model. Any MDBNC learning algorithm can be used to learn the model in this algorithm.
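The loop above can be sketched with a deliberately simple stand-in model. The code below is not the MDBNC learner from the slides: it assumes a naive Bayes model over discrete features in which each joint configuration of the class variables is treated as one compound class (a tuple), and it runs a fixed number of iterations instead of testing for convergence. All names and the toy data are hypothetical.

```python
def m_step(rows, resp, classes, n_values, smooth=1.0):
    """Fit naive Bayes from soft class assignments resp[i][c] (Laplace smoothing)."""
    n_feat = len(rows[0])
    prior, cond = {}, {}
    for c in classes:
        total = sum(r[c] for r in resp)
        prior[c] = (total + smooth) / (len(rows) + smooth * len(classes))
        cond[c] = []
        for j in range(n_feat):
            counts = [smooth] * n_values
            for x, r in zip(rows, resp):
                counts[x[j]] += r[c]
            cond[c].append([v / (total + smooth * n_values) for v in counts])
    return prior, cond

def e_step(x, model, classes):
    """Posterior over the compound classes for one instance under the model."""
    prior, cond = model
    scores = {}
    for c in classes:
        s = prior[c]
        for j, v in enumerate(x):
            s *= cond[c][j][v]
        scores[c] = s
    z = sum(scores.values())
    return {c: s / z for c, s in scores.items()}

def semi_supervised_em(labelled, unlabelled, classes, n_values, n_iter=20):
    rows = [x for x, _ in labelled] + list(unlabelled)
    hard = [{c: 1.0 if c == y else 0.0 for c in classes} for _, y in labelled]
    # Initial model: learned from the labelled instances only.
    model = m_step([x for x, _ in labelled], hard, classes, n_values)
    for _ in range(n_iter):
        # E-step: estimate the missing class values with the current model.
        soft = [e_step(x, model, classes) for x in unlabelled]
        # M-step: relearn the model from labelled plus softly labelled data.
        model = m_step(rows, hard + soft, classes, n_values)
    return model

# Toy data: two features discretised into 3 values, two compound classes.
A, B = ("subjective", "positive"), ("objective", "neutral")
labelled = [((2, 0), A), ((0, 2), B)]
unlabelled = [(2, 0), (2, 1), (0, 2), (1, 2), (2, 0), (0, 2)]
model = semi_supervised_em(labelled, unlabelled, [A, B], n_values=3)
posterior = e_step((2, 1), model, [A, B])
```

Treating the class tuple as a single variable keeps the sketch short, but the number of compound classes grows multiplicatively with the number of class variables, which is one reason to model the class subgraph explicitly as an MDBNC does.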
  • 17. Index. Current section: 4 Experimentation in Sentiment Analysis, Sentiment Analysis Problem.
  • 18. Sentiment Analysis Problem: Definitions (Liu, 2010). Sentiment Analysis (also known as Opinion Mining) is the computational study of opinions, sentiments and emotions expressed in text. When treating Sentiment Analysis as a classification problem, two different problems appear: 1 Subjectivity Classification. Its aim is to classify a text as subjective or objective. 2 Sentiment Classification. It classifies an opinionated text as expressing a positive, neutral, or negative opinion.
  • 19. Sentiment Analysis Problem: Motivation for Using Semi-supervised Learning of Multi-dimensional Classifiers. 1 Up to now, these two subproblems have been studied in isolation despite being closely related, so multi-dimensional classifiers are likely to help. 2 Obtaining enough labelled examples for a classifier may be costly and time consuming. This motivates us to deal with unlabelled examples in a semi-supervised framework.
  • 20. Index. Current section: 4 Experimentation in Sentiment Analysis, ASOMO Dataset.
  • 21. ASOMO Dataset: Properties of the Dataset. Collected by Socialware Company, from the ASOMO service of mobilised opinion analysis. It consists of 2,542 Spanish reviews extracted from a blog: 150 documents have been labelled in isolation by an expert, and 2,392 posts are left unlabelled.
  • 22. ASOMO Dataset: Properties of Each Document. Each document is represented as 14 features and 3 class variables. The 14 features are obtained by using an open source morphological analyser (Carreras et al., 2006). Each feature provides different information related to part-of-speech (POS), e.g. First Persons, Agreement Expressions, Imperatives, Prediction Verbs (future), Questions, Positive Adjectives, etc., and is represented as a real number between 0 and 1. The 3 class variables are: Will to Influence: {declarative sentence, soft WI, medium WI, strong WI}. Sentiment: {very negative, negative, neutral, positive, very positive}. Subjectivity: {Yes (subjective), No (objective)}.
  • 23. Index. Current section: 4 Experimentation in Sentiment Analysis, Results.
  • 24. Results: Experiment. The ASOMO dataset has been used to learn 3 (uni-dimensional) Bayesian network classifiers (nB, TAN and 2DB) and 5 MDBNC (MDnB, MDTAN, MD 2/2, MD 2/3 and MD 2/4), in both supervised and semi-supervised (EM algorithm) learning frameworks. Features from the ASOMO dataset are discretised into 3 values using equal frequency. Results are averaged over 5 × 5 fold cross validation.
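Equal-frequency discretisation into 3 values can be sketched as follows. This is a minimal illustration, not the preprocessing code used in the experiments: it assumes the cut points are taken at the 1/3 and 2/3 positions of the sorted values, and the function names and feature values are hypothetical.

```python
def equal_frequency_cuts(values, n_bins=3):
    """Cut points that split the sorted values into n_bins groups of near-equal size."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(k * n) // n_bins] for k in range(1, n_bins)]

def discretise(value, cuts):
    """Index of the bin the value falls into, from 0 to len(cuts)."""
    return sum(value >= c for c in cuts)

# Hypothetical values of one feature (real numbers between 0 and 1).
feature = [0.05, 0.10, 0.20, 0.30, 0.45, 0.50, 0.70, 0.80, 0.95]
cuts = equal_frequency_cuts(feature)
bins = [discretise(v, cuts) for v in feature]
```

With many tied values the bins can end up unbalanced, since equal values always share a bin.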
  • 25. Results: JOINT Accuracy. This measure evaluates the predictions of all class variables simultaneously: it counts a success only if all the classes are correctly predicted, and an error otherwise. [Figure: JOINT accuracy for nB, TAN, 2DB, MDnB, MDTAN, MD 2/2, MD 2/3 and MD 2/4; y-axis ticks from 5 to 25.]
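The JOINT accuracy definition translates directly into code. The sketch below also includes the per-class accuracy for comparison; the function names and the toy labels (ordered as Will to Influence, Sentiment, Subjectivity) are hypothetical.

```python
def joint_accuracy(true_labels, predicted_labels):
    """Fraction of instances whose class variables are ALL predicted correctly."""
    hits = sum(tuple(t) == tuple(p) for t, p in zip(true_labels, predicted_labels))
    return hits / len(true_labels)

def single_class_accuracy(true_labels, predicted_labels, index):
    """Accuracy on one class variable, ignoring the others."""
    hits = sum(t[index] == p[index] for t, p in zip(true_labels, predicted_labels))
    return hits / len(true_labels)

# Hypothetical labels: (Will to Influence, Sentiment, Subjectivity).
truth = [
    ("soft WI", "positive", "yes"),
    ("strong WI", "negative", "yes"),
    ("declarative sentence", "neutral", "no"),
    ("medium WI", "positive", "yes"),
]
predicted = [
    ("soft WI", "positive", "yes"),
    ("strong WI", "positive", "yes"),
    ("declarative sentence", "neutral", "no"),
    ("soft WI", "positive", "yes"),
]
joint = joint_accuracy(truth, predicted)
per_class = [single_class_accuracy(truth, predicted, i) for i in range(3)]
```

Here two of the four instances are fully correct, so the joint accuracy is lower than any of the three per-class accuracies; a single wrong class variable is enough to count the whole instance as an error.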
  • 26. Results: Will to Influence. [Figure: accuracy on Will to Influence for nB, TAN, 2DB, MDnB, MDTAN, MD 2/2, MD 2/3 and MD 2/4; y-axis ticks from 30 to 70.]
  • 27. Results: Sentiment Polarity. [Figure: accuracy on Sentiment Polarity for nB, TAN, 2DB, MDnB, MDTAN, MD 2/2, MD 2/3 and MD 2/4; y-axis ticks from 20 to 40.]
  • 28. Results: Subjectivity. [Figure: accuracy on Subjectivity for nB, TAN, 2DB, MDnB, MDTAN, MD 2/2, MD 2/3 and MD 2/4; y-axis ticks from 50 to 90.]
  • 29. Conclusions and Future Work: Conclusions. Multi-dimensional classification and semi-supervised learning are two different branches of machine learning. In this research, we have established a bridge between them, showing that these techniques are competitive with the state-of-the-art algorithms. “As discussed in the literature, currently there is no coherent strategy for handling unlabelled data, so some creativity must be exercised.” (Cohen)
  • 30. Conclusions and Future Work: Future work. What kinds of problem could benefit from this? Web-related problems, e.g. Affect Analysis. Scalability of MDBNC (high dimensionality of multi-label problems). The literature has shown that structure learning algorithms that maximise the likelihood are not likely to find structures yielding good classifiers in a semi-supervised manner. The preliminary ASOMO results show that using EM with a probabilistic model cannot ensure learning a correct model. Thus, we need to perform structure search that attempts to maximise classification accuracy directly.
  • 31. Conclusions and Future Work: Questions. Thank you. jonathan.ortigosa@ehu.es
  • 32. References.
    Liu, B. (2010). Sentiment Analysis and Subjectivity. In: Indurkhya, N. and Damerau, F.J. Handbook of Natural Language Processing, Chapman & Hall, 2nd Ed.
    Rodriguez, J.D. and Lozano, J.A. (2008). Multi-objective learning of multi-dimensional Bayesian classifiers. In Proceedings of the Eighth International Conference on Hybrid Intelligent Systems, HIS 2008, Barcelona, Spain, pp. 501-506.
    van der Gaag, L. and de Waal, P. (2006). Multi-dimensional Bayesian Classifiers. In Proceedings of the Third European Workshop in Probabilistic Graphical Models, pp. 107-114.
    Larrañaga, P., Lozano, J.A., Peña, J.M. and Inza, I. (2005). Special Issue on Probabilistic Graphical Models for Classification. Machine Learning, 59(3).
    Cozman, F. and Cohen, I. (2006). Risk of Semi-Supervised Learning. In: Chapelle, O., Schölkopf, B. and Zien, A. Semi-Supervised Learning. The MIT Press, pp. 57-72.
    Friedman, N. (1998). The Bayesian Structural EM algorithm. In Proc. 14th Conf. on Uncertainty in Artificial Intelligence. Morgan Kaufmann, San Francisco, CA, pp. 129-138.
    Dempster, A., Laird, N. and Rubin, D. (1977). Maximum Likelihood from Incomplete Data via the EM Algorithm. Journal of the Royal Statistical Society, Series B, 39(1): 1-38.
    Carreras, X., Chao, I., Padró, L. and Padró, M. (2006). An Open-Source Suite of Language Analyzers. In Proceedings of the 4th Int. Conference on Language Resources and Evaluation, Vol. 10, pp. 239-342.