Joint Author Sentiment Topic Model

257
-1

Published on

Joint Author Sentiment Topic Model, Subhabrata Mukherjee, Gaurab Basu and Sachindra Joshi, In Proc. of the SIAM International Conference in Data Mining (SDM 2014), Pennsylvania, USA, Apr 24-26, 2014 [http://people.mpi-inf.mpg.de/~smukherjee/jast.pdf]

Published in: Data & Analytics, Education
0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total Views
257
On Slideshare
0
From Embeds
0
Number of Embeds
1
Actions
Shares
0
Downloads
6
Comments
0
Likes
0
Embeds 0
No embeds

No notes for slide

Joint Author Sentiment Topic Model

  1. 1. Joint Author Sentiment Topic Model Subhabrata Mukherjee Max Planck Institute for Informatics Gaurab Basu and Sachindra Joshi IBM India Research Lab April 25, 2014
  2. 2. April 25, 2014 “ [ This film is based on a true-life incident. It sounds like a great plot and the director makes a decent attempt in narrating a powerful story. ] [ However, the film does not quite make the mark due to sloppy acting. ] ”        Aspect Rating and Review Rating
  3. 3. April 25, 2014 “ [ This film is based on a true-life incident. It sounds like a great plot and the director makes a decent attempt in narrating a powerful story. ] [ However, the film does not quite make the mark due to sloppy acting. ] ”  Identify topics - direction, story and acting  Story has facets - plot and narration      Aspect Rating and Review Rating
  4. 4. April 25, 2014 “ [ This film is based on a true-life incident. It sounds like a great plot and the director makes a decent attempt in narrating a powerful story. ] [ However, the film does not quite make the mark due to sloppy acting. ] ”  Identify topics - direction, story and acting  Story has facets - plot and narration  Identify facet sentiments – great (plot), powerful (story), sloppy (acting) etc.     Aspect Rating and Review Rating
  5. 5. April 25, 2014 “ [ This film is based on a true-life incident. It sounds like a great plot and the director makes a decent attempt in narrating a powerful story. ] [ However, the film does not quite make the mark due to sloppy acting. ] ”  Identify topics - direction, story and acting  Story has facets - plot and narration  Identify facet sentiments – great (plot), powerful (story), sloppy (acting) etc.  Overall review rating - aggregation of facet-specific sentiments    Aspect Rating and Review Rating
  6. 6. April 25, 2014 “ [ This film is based on a true-life incident. It sounds like a great plot and the director makes a decent attempt in narrating a powerful story. ] [ However, the film does not quite make the mark due to sloppy acting. ] ”  Identify topics - direction, story and acting  Story has facets - plot and narration  Identify facet sentiments – great (plot), powerful (story), sloppy (acting) etc.  Overall review rating - aggregation of facet-specific sentiments  Why joint modeling ?  Sentiment words help locating topic words and vice-versa  Neighboring words establish semantics / sentiment of terms Aspect Rating and Review Rating
  7. 7. Why Author-Specificity ? “ [ This film is based on a true-life incident. It sounds like a great plot and the director makes a decent attempt in narrating a powerful story. ] [ However, the film does not quite make the mark due to sloppy acting. ] ”          April 25, 2014
  8. 8. Why Author-Specificity ? “ [ This film is based on a true-life incident. It sounds like a great plot and the director makes a decent attempt in narrating a powerful story. ] [ However, the film does not quite make the mark due to sloppy acting. ] ”  Overall rating varies for authors with different topic preferences  Positive for those with greater preference for acting and narration  Negative for acting       April 25, 2014
  9. 9. Why Author-Specificity ? “ [ This film is based on a true-life incident. It sounds like a great plot and the director makes a decent attempt in narrating a powerful story. ] [ However, the film does not quite make the mark due to sloppy acting. ] ”  Overall rating varies for authors with different topic preferences  Positive for those with greater preference for acting and narration  Negative for acting  Affective sentiment value varies for authors  How much negative is “does not quite make the mark” for me ?     April 25, 2014
  10. 10. Why Author-Specificity ? “ [ This film is based on a true-life incident. It sounds like a great plot and the director makes a decent attempt in narrating a powerful story. ] [ However, the film does not quite make the mark due to sloppy acting. ] ”  Overall rating varies for authors with different topic preferences  Positive for those with greater preference for acting and narration  Negative for acting  Affective sentiment value varies for authors  How much negative is “does not quite make the mark” for me ?  Author-writing style helps in locating / associating facets and sentiments  E.g. topic switch, verbosity, use of content and function words etc.  The author makes a topic switch in above review using the function word however  April 25, 2014
  11. 11. Why Author-Specificity ? “ [ This film is based on a true-life incident. It sounds like a great plot and the director makes a decent attempt in narrating a powerful story. ] [ However, the film does not quite make the mark due to sloppy acting. ] ”  Overall rating varies for authors with different topic preferences  Positive for those with greater preference for acting and narration  Negative for acting  Affective sentiment value varies for authors  How much negative is “does not quite make the mark” for me ?  Author-writing style helps in locating / associating facets and sentiments  E.g. topic switch, verbosity, use of content and function words etc.  The author makes a topic switch in above review using the function word however  Traditional works learn a global model independent of the review author April 25, 2014
  12. 12. Why care about writing style or coherence?  Better association of facets to topics by detecting semantic-syntactic class transitions and topic switch  semantic dependencies - association between facets to topics  syntactic dependencies - connection between facets and background words required to make the review coherent and grammatically correct April 25, 2014
  13. 13. Contributions  Show that author identity helps in rating prediction      April 25, 2014
  14. 14. Contributions  Show that author identity helps in rating prediction  Author-specific generative model of a review that incorporates author-specific topic and facet preferences    April 25, 2014
  15. 15. Contributions  Show that author identity helps in rating prediction  Author-specific generative model of a review that incorporates author-specific topic and facet preferences grading style   April 25, 2014
  16. 16. Contributions  Show that author identity helps in rating prediction  Author-specific generative model of a review that incorporates author-specific topic and facet preferences grading style writing style  April 25, 2014
  17. 17. Contributions  Show that author identity helps in rating prediction  Author-specific generative model of a review that incorporates author-specific topic and facet preferences grading style writing style maintain coherence in reviews April 25, 2014
  18. 18. Topic Models April 25, 2014
  19. 19. Topic Models April 25, 2014 1. LDA Model
  20. 20. Topic Models April 25, 2014 1. LDA Model 2. Author-Topic Model
  21. 21. Topic Models April 25, 2014 1. LDA Model 2. Author-Topic Model 3. Joint Sentiment Topic Model
  22. 22. Topic Models April 25, 2014 1. LDA Model 2. Author-Topic Model 3. Joint Sentiment Topic Model 4. Topic Syntax Model
  23. 23. Generative Process for a Review April 25, 2014 Visit Restaurant
  24. 24. Generative Process for a Review April 25, 2014 Visit Restaurant Overall Impression …I think I will give overall rating +4
  25. 25. Generative Process for a Review April 25, 2014 Visit Restaurant Overall Impression …I think I will give overall rating +4 Topics to write on I will write about food, ambience and …
  26. 26. Generative Process for a Review April 25, 2014 Visit Restaurant Overall Impression …I think I will give overall rating +4 Topic Ratings I will give food +5 . It makes awesome Pasta … my favorite !!! But the ambience is loud… I will give it +2. But I do not care about it much Topics to write on I will write about food, ambience and …
  27. 27. Generative Process for a Review April 25, 2014 Visit Restaurant Overall Impression …I think I will give overall rating +4 Topic Ratings I will give food +5 . It makes awesome Pasta … my favorite !!! But the ambience is loud… I will give it +2. But I do not care about it much Topics to write on I will write about food, ambience and … Topic Opinion It makes awesome Pasta. But the ambience is loud.
  28. 28. Generative Process for a Review April 25, 2014 Visit Restaurant Overall Impression …I think I will give overall rating +4 Topic Ratings I will give food +5 . It makes awesome Pasta … my favorite !!! But the ambience is loud… I will give it +2. But I do not care about it much Topics to write on I will write about food, ambience and … Topic Opinion It makes awesome Pasta. But the ambience is loud. How to write it ?
  29. 29. Generative Process for a Review April 25, 2014 How to write it ?
  30. 30. Generative Process for a Review April 25, 2014 How to write it ? Topic Word ? Background Word ?
  31. 31. Generative Process for a Review April 25, 2014 How to write it ? Topic Word ? Background Word ? New Topic ? Current Topic ?
  32. 32. Generative Process for a Review April 25, 2014 How to write it ? Topic Word ? Background Word ? New Topic ? Current Topic ? Topic Label ?
  33. 33. Generative Process for a Review April 25, 2014 How to write it ? Topic Word ? Background Word ? New Topic ? Current Topic ? Topic Label ? Word
  34. 34. JAST Model
  35. 35. JAST Model 1. For each document d, author a chooses overall rating r ~ Multinomial(Ω) from author-specific overall document rating distribution
  36. 36. JAST Model 2. For each topic z and each sentiment label l, draw ξz, l ~ Dirichlet(γ) 3. For each class c and each sentiment label l = 0, draw ξc, l ~ Dirichlet(δ)
  37. 37. JAST Model 4. Choose author-specific class transition distribution π Author Writing Style
  38. 38. JAST Model 5. Author a chooses author-rating specific topic-label distribution ϕa, r ~ Dirichlet(α) Author-Topic Preference Author Emotional Attachment to Topics Author Grading Style
  39. 39. JAST Model5. For each word w in the document
  40. 40. JAST Model5. For each word w in the document b. If c = 1, Draw z, l ~ Multinomial(ϕa,r) . Draw w ~ Multinomial(ξz,l).
  41. 41. JAST Model5. For each word w in the document b. If c = 1, Draw z, l ~ Multinomial(ϕa,r) . Draw w ~ Multinomial(ξz,l).
  42. 42. JAST Model5. For each word w in the document b. If c = 1, Draw z, l ~ Multinomial(ϕa,r) . Draw w ~ Multinomial(ξz,l). Semantic Dependencies and Review Coherence
  43. 43. JAST Model5. For each word w in the document b. If c = 1, Draw z, l ~ Multinomial(ϕa,r) . Draw w ~ Multinomial(ξz,l). Review Coherence and Syntactic Dependencies
  44. 44. JAST Model5. For each word w in the document b. If c = 1, Draw z, l ~ Multinomial(ϕa,r) . Draw w ~ Multinomial(ξz,l). d. If c≠ 1, 2, Draw w ~ Multinomial(ξc,l). Review Coherence and Syntactic Dependencies
  45. 45. Inferencing April 25, 2014
  46. 46. Inferencing April 25, 2014
  47. 47. Inferencing April 25, 2014
  48. 48. Inferencing April 25, 2014
  49. 49. Inferencing April 25, 2014
  50. 50. Inferencing April 25, 2014
  51. 51. Inferencing April 25, 2014
  52. 52. Inferencing  We use collapsed Gibb's sampling for estimating the parameters  Conditional distribution for joint updation of the latent variables is given by : April 25, 2014
  53. 53. Inferencing  We use collapsed Gibb's sampling for estimating the parameters  Conditional distribution for joint updation of the latent variables is given by : April 25, 2014
  54. 54. Inferencing  We use collapsed Gibb's sampling for estimating the parameters  Conditional distribution for joint updation of the latent variables is given by : April 25, 2014
  55. 55. Inferencing  We use collapsed Gibb's sampling for estimating the parameters  Conditional distribution for joint updation of the latent variables is given by : April 25, 2014
  56. 56. Inferencing  We use collapsed Gibb's sampling for estimating the parameters  Conditional distribution for joint updation of the latent variables is given by : April 25, 2014
  57. 57. Inferencing  We use collapsed Gibb's sampling for estimating the parameters  Conditional distribution for joint updation of the latent variables is given by : April 25, 2014
  58. 58. Inferencing  We use collapsed Gibb's sampling for estimating the parameters  Conditional distribution for joint updation of the latent variables is given by : April 25, 2014
  59. 59. Inferencing  We use collapsed Gibb's sampling for estimating the parameters  Conditional distribution for joint updation of the latent variables is given by : April 25, 2014
  60. 60. Inferencing April 25, 2014
  61. 61. Inferencing April 25, 2014
  62. 62. Inferencing April 25, 2014
  63. 63. Dataset for Evaluation  IMDB movie review dataset  TripAdvisor restaurant review dataset April 25, 2014
  64. 64. Baselines  Lexical classification using majority voting  Joint Sentiment Topic Model1  Author-Topic LR Model2  Model Prior A sentiment lexicon is used to initialize the prior polarity of words in ξT x L[w] April 25, 2014 1. Chenghua Lin and Yulan He, Joint sentiment/topic model for sentiment analysis, CIKM '09, pp. 375-384. 2. Subhabrata Mukherjee, Gaurab Basu, and Sachindra Joshi, Incorporating author preference in sentiment rating prediction of reviews, WWW 2013.
  65. 65. Model Initialization Parameters April 25, 2014
  66. 66. Model Initialization Parameters April 25, 2014
  67. 67. Model Initialization Parameters April 25, 2014 Minimize Model Perplexity
  68. 68. Model Comparison with Baselines April 25, 2014
  69. 69. Model Comparison with Baselines April 25, 2014 IMDB Movie Review Dataset
  70. 70. Model Comparison with Baselines April 25, 2014 IMDB Movie Review Dataset TripAdvisor Restaurant Review Dataset
  71. 71. April 25, 2014 ComparisonwithTopPerformingModelsinIMDBDataset
  72. 72. April 25, 2014 ComparisonwithTopPerformingModelsinIMDBDataset
  73. 73. April 25, 2014 ComparisonwithTopPerformingModelsinIMDBDataset
  74. 74. Snapshot of Topic-Label-Word Extraction by JAST April 25, 2014
  75. 75. Snapshot of Topic-Label-Word Extraction by JAST April 25, 2014
  76. 76. Snapshot of Topic-Label-Word Extraction by JAST April 25, 2014
  77. 77. Snapshot of Author-Rating-Topic-Label Distribution Extracted by JAST - TripAdvisor April 25, 2014
  78. 78. Snapshot of Author-Rating-Topic-Label Distribution Extracted by JAST - TripAdvisor April 25, 2014
  79. 79. Snapshot of Author-Rating-Topic-Label Distribution Extracted by JAST - TripAdvisor April 25, 2014
  80. 80. Snapshot of Author-Rating-Topic-Label Distribution Extracted by JAST - TripAdvisor April 25, 2014
  81. 81. Snapshot of Author-Rating-Topic-Label Distribution Extracted by JAST - TripAdvisor April 25, 2014
  82. 82. Snapshot of Author-Rating-Topic-Label Distribution Extracted by JAST - TripAdvisor April 25, 2014
  83. 83. Snapshot of Author-Rating-Topic-Label Distribution Extracted by JAST - TripAdvisor April 25, 2014
  84. 84. Snapshot of Author-Rating-Topic-Label Distribution Extracted by JAST - TripAdvisor April 25, 2014
  85. 85. Snapshot of Author-Rating-Topic-Label Distribution Extracted by JAST - IMDB April 25, 2014
  86. 86. Snapshot of Author-Rating-Topic-Label Distribution Extracted by JAST - IMDB April 25, 2014
  87. 87. Snapshot of Author-Rating-Topic-Label Distribution Extracted by JAST - IMDB April 25, 2014
  88. 88. Snapshot of Author-Rating-Topic-Label Distribution Extracted by JAST - IMDB April 25, 2014
  89. 89. Snapshot of Author-Rating-Topic-Label Distribution Extracted by JAST - IMDB April 25, 2014
  90. 90. Conclusions  Sentiment classification and aspect rating prediction models can be improved if author is known      April 25, 2014
  91. 91. Conclusions  Sentiment classification and aspect rating prediction models can be improved if author is known  Authorship information helps in identifying author topic preferences, and author writing style to maintain review coherence  Semantic-syntactic class transition and topic switch    April 25, 2014
  92. 92. Conclusions  Sentiment classification and aspect rating prediction models can be improved if author is known  Authorship information helps in identifying author topic preferences, and author writing style to maintain review coherence  Semantic-syntactic class transition and topic switch  JAST is unsupervised, with overhead of knowing author identity  Performs better than all unsupervised/semi-supervised models and some supervised models  April 25, 2014
  93. 93. Conclusions  Sentiment classification and aspect rating prediction models can be improved if author is known  Authorship information helps in identifying author topic preferences, and author writing style to maintain review coherence  Semantic-syntactic class transition and topic switch  JAST is unsupervised, with overhead of knowing author identity  Performs better than all unsupervised/semi-supervised models and some supervised models  It will be interesting to use JAST for authorship attribution task April 25, 2014
  94. 94. QUESTIONS ??? April 25, 2014

×