
Content Moderation Across Multiple Platforms with Capsule Networks and Co-Training

Social media systems provide a platform for users to freely express their thoughts and opinions. While this openness creates unique communication opportunities, it also brings important challenges: content constituting hate speech, abuse, or harmful intent proliferates on online platforms. Since problematic content reduces the health of a platform and degrades the user experience, communities maintain terms of usage or community norms; when a user violates them, the platform takes moderation action against that user. Unfortunately, the scale at which these platforms operate makes manual content moderation nearly impossible, creating the need for automated or semi-automated content moderation systems. Multiple methods exist for understanding the prevalence and impact of such content, including supervised machine learning and deep learning models. Despite wide interest in the topic and the popularity of some methods, it is unclear which model is most suitable for a given platform, since there have been few benchmarking efforts for moderated content. To that end, we compare existing approaches for automatic moderation of multimodal content on five online platforms: Twitter, Reddit, Wikipedia, Quora, and Whisper. Beyond investigating existing approaches, we propose a novel Capsule-Network-based method that performs better due to its ability to capture hierarchical patterns. In practical scenarios, labeling large-scale data to train new models for a different domain or platform is a cumbersome task. We therefore enrich our existing pre-trained model with a minimal number of labeled examples from a different domain to create a co-trained model for the new domain, and perform a cross-platform analysis using different models to identify which performs best.
Finally, we analyze all methods, both qualitatively and quantitatively, to gain a deeper understanding of model performance, concluding that our method improves average precision by 10%. We also find that the co-trained models perform well despite having less training data and may be considered a cost-effective solution.


  1. Content Moderation Across Multiple Platforms with Capsule Networks and Co-Training. Vani Agarwal (linkedin/in/vani-agarwal-04a02bbb/, @VaniAgarwal9, fb.com/vani.agarwal30). Dr. Arun Balaji (Chair), Dr. Ponnurangam Kumaraguru (Co-chair)
  2. Thesis Committee ◆ Dr. Rajiv Ratn Shah, IIIT Delhi ◆ Dr. Niharika Sachdeva, InfoEdge ◆ Dr. Arun Balaji Buduru, IIIT Delhi ◆ Dr. Ponnurangam Kumaraguru, IIIT Delhi
  3. Demo
  4. What is content moderation? (Ref: 11) Different platforms have content policies; if content does not meet the guidelines, then moderation action takes place.
  5. Why content moderation is necessary
  6. Posts on different platforms
  7. Challenges for content moderation ◆ Different needs of different platforms ◆ Huge amount of content ◆ The way in which content is displayed differs for each platform
  8. Platforms struggling to moderate content (Ref: 1) ◆ Human moderators suffer PTSD ◆ High turnaround time ◆ High cost
  9. Research Aim. Input: posts P = {p1, p2, ..., pk} from domains D = {D1, D2, ..., Dn}. Output (via the model): the subset of posts which should be flagged for moderation, P' where P' ⊆ P.
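The research aim above can be sketched as a filter over scored posts. In this minimal sketch, `toy_score` is a hypothetical stand-in for the thesis models, not the actual method:

```python
# Sketch of the research aim: given posts P and a scoring model, return the
# subset P' of P to be flagged. toy_score is an illustrative stand-in scorer,
# NOT the thesis model.

def toy_score(post: str) -> float:
    """Stand-in scorer: fraction of words from a tiny abusive-word list."""
    abusive = {"idiot", "stupid"}
    words = post.lower().split()
    return sum(w in abusive for w in words) / max(len(words), 1)

def flag_for_moderation(posts, score_fn, threshold=0.2):
    """Return P', the subset of posts whose score exceeds the threshold."""
    return [p for p in posts if score_fn(p) > threshold]

posts = ["you stupid nerd", "nice weather today"]
flagged = flag_for_moderation(posts, toy_score)
# flagged == ["you stupid nerd"]
```

Any of the classifiers compared later in the deck could play the role of `score_fn` here.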
  10. Contributions ◆ Comparison of different methods across multiple platforms ◆ Capsule Networks for content moderation ◆ Co-training to understand domain adaptability
  11. Outline ◆ Data Collection ◆ Comparison of Methods ◆ Capsule Networks ◆ Co-training for Domain Adaptation ◆ Conclusion
  12. Data Collection ◆ Twitter, Quora, Wikipedia: public datasets ◆ Whisper: combination of a public dataset and website scraping ◆ Reddit: collected data from subreddits
  13. Data Collection for Reddit ◆ Subreddit r/creepy ▶ violent content ▶ weak labels ◆ Subreddit r/pics ▶ normal content ▶ manually checked 200 posts to verify whether they are problematic
  14. Twitter
  15. Reddit
  16. Wikipedia: "hey punk dont be deleting my stuff, you know nothing bout the harly drags so stay out of my shit you stupid nerd, punk fag female thats all u, bitch"
  17. Quora
  18. Whisper
  19. Dataset Summary: collection strategies ◆ Twitter: (1) tweets related to protests and riots [7]; (2) tweets related to racism and sexism [1] ◆ Reddit: collected data from the subreddits r/creepy and r/pics ◆ Wikipedia: (1) comments containing personal attacks [3]; (2) toxic comments on talk pages [12] ◆ Whisper: hate-speech-related posts [4] and web scraping ◆ Quora: insincere questions asked on Quora [2]
  20. Dataset Summary
      Dataset     Positive(1)  Negative(0)  Total  Positive class %  Text / Image
      Quora       817          12244        13061  6%                Text
      Whisper     760          1720         2480   30%               Text
      Wikipedia1  647          5146         5793   11%               Text
      Wikipedia2  783          7195         7978   10%               Text
      Twitter2    1200         2000         3200   37%               Text
      Twitter1    3619         1052         4671   77%               Text + Image
      Reddit      2073         2598         4671   44%               Text + Image
  21. Data Pre-processing. Steps: lower-casing; removing hashtags, emoticons, and punctuation; anonymizing names with a Named Entity Recognizer. Tweet: "nice to see that the top trending post by suriya #TamilNaduBandh #Saithan are located around TamilNadu". Anonymized tweet: "nice to see that the top trending post by <NAME> are located around tamilnadu"
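The pre-processing steps on this slide can be sketched in a few lines. A real pipeline would use an NER model for the anonymization step; here a hard-coded name set stands in for the recognizer's output:

```python
import re

# Sketch of the slide's pre-processing pipeline: lower-case, strip hashtags
# and punctuation, and replace person names with <NAME>. KNOWN_NAMES is an
# illustrative stand-in for NER output, not part of the thesis pipeline.
KNOWN_NAMES = {"suriya"}

def preprocess(tweet: str) -> str:
    text = tweet.lower()
    text = re.sub(r"#\w+", "", text)        # remove hashtags
    text = re.sub(r"[^\w\s<>]", "", text)   # remove punctuation / emoticons
    words = ["<NAME>" if w in KNOWN_NAMES else w for w in text.split()]
    return " ".join(words)

tweet = ("nice to see that the top trending post by suriya "
         "#TamilNaduBandh #Saithan are located around TamilNadu")
print(preprocess(tweet))
# nice to see that the top trending post by <NAME> are located around tamilnadu
```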
  22. Outline ◆ Data Collection ◆ Comparison of Methods ◆ Capsule Networks ◆ Co-training for Domain Adaptation ◆ Conclusion
  23. Methods
      Text models: Logistic Regression [5] (LR_machina), Logistic Regression [6] (LR_Badjatiya), Multi-Layer Perceptron [5] (MLP), Gated Recurrent Unit [7] (GRU), Long Short-Term Memory [7] (LSTM), Convolutional Neural Network [6] (CNN), Capsule Network (CapsNet)
      Fusion models: LSTM + (object + scene recognition) (LstmFusion), CapsNet + (object + scene recognition) (CapsFusion)
  24. Outline ◆ Data Collection ◆ Comparison of Methods ◆ Capsule Networks ◆ Co-training for Domain Adaptation ◆ Conclusion
  25. Capsule Network Intuition (Ref: 10) ◆ It is not a human face. ◆ Capsule networks understand spatial orientation. ◆ Max pooling loses information.
  26. Capsule Working (Ref: 10)
  27. Why CapsNet? ◆ Capsules output vectors. ◆ Each capsule decides which features to pass to higher-level capsules. ◆ Prominent features are passed from one capsule to another using a routing protocol. ◆ This helps the model learn the semantic meaning of text data well.
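The routing idea on this slide can be illustrated with plain Python: a "squash" nonlinearity that keeps a capsule's output vector norm in (0, 1), and one routing step that weights lower-capsule predictions by coupling coefficients. This is a toy illustration of the mechanism, not the thesis architecture:

```python
import math

# Toy illustration of the capsule "squash" function and one dynamic-routing
# step, using plain Python lists (not the thesis architecture).

def squash(v):
    """Shrink a vector so its norm lies in (0, 1), preserving direction."""
    norm_sq = sum(x * x for x in v)
    norm = math.sqrt(norm_sq)
    if norm == 0:
        return [0.0 for _ in v]
    scale = norm_sq / (1 + norm_sq) / norm
    return [scale * x for x in v]

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def routing_step(predictions, logits):
    """One routing iteration: weight lower-capsule predictions by the
    softmaxed agreement logits, sum them, and squash the result."""
    c = softmax(logits)  # coupling coefficients
    dim = len(predictions[0])
    s = [sum(c[i] * predictions[i][d] for i in range(len(predictions)))
         for d in range(dim)]
    return squash(s)

# Two lower capsules predicting a 2-d higher capsule's output:
u_hat = [[1.0, 0.0], [0.8, 0.2]]
v = routing_step(u_hat, logits=[0.0, 0.0])
assert 0 < math.hypot(*v) < 1  # squashed norm is strictly between 0 and 1
```

In a full network, the logits would then be updated from the agreement between each prediction and `v`, and the step repeated for a few iterations.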
  28. Capsule Network Architecture
  29. Experimental Design ◆ Evaluation metrics: Average Precision (area under the PR curve) and Macro F1 ◆ 5-fold cross-validation ◆ Grid search over various hyper-parameters ◆ Train set: 80%; test set: 20%
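Average Precision, one of the two metrics used here, can be computed directly from a ranked list of predictions. A plain-Python sketch for illustration (the thesis presumably used a standard library implementation):

```python
# Average Precision (area under the precision-recall curve), computed as the
# sum over positive ranks of (change in recall) x (precision at that rank).

def average_precision(labels, scores):
    """labels: 0/1 ground truth; scores: model confidence for class 1."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    total_pos = sum(labels)
    tp, ap, prev_recall = 0, 0.0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            tp += 1
            precision = tp / rank
            recall = tp / total_pos
            ap += (recall - prev_recall) * precision
            prev_recall = recall
    return ap

# A perfect ranking (all positives scored above all negatives) gives AP = 1.0
assert average_precision([1, 1, 0, 0], [0.9, 0.8, 0.2, 0.1]) == 1.0
```

Unlike accuracy, this metric rewards ranking problematic posts above benign ones, which suits the class-imbalanced datasets summarized earlier.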
  30. All Text Model Results (Twitter2 dataset)
      Method        Macro F1  Average Precision
      GRU           0.6977    0.6983
      LR_Badjatiya  0.7560    0.7260
      LR_machina    0.6772    0.3805
      CNN           0.6400    0.6960
      LSTM          0.7076    0.7057
      MLP           0.7300    0.6916
      CapsNet       0.8254    0.7695
  31. Text Model Results on All Datasets
      Dataset   Method   Macro F1  Average Precision
      Quora     CapsNet  0.6959    0.9269
                LSTM     0.6731    0.6560
      Reddit    CapsNet  0.7967    0.7373
                LSTM     0.7306    0.7321
      Twitter1  CapsNet  0.7953    0.7695
                LSTM     0.8635    0.6748
      Twitter2  CapsNet  0.8254    0.7695
                LSTM     0.7076    0.7057
  32. Text Model Results on All Datasets (continued)
      Dataset     Method   Macro F1  Average Precision
      Whisper     CapsNet  0.9856    0.9783
                  LSTM     0.9816    0.9816
      Wikipedia1  CapsNet  0.8361    0.9195
                  LSTM     0.7775    0.7413
      Wikipedia2  CapsNet  0.8361    0.9195
                  LSTM     0.8098    0.7698
      Average     CapsNet  0.8244    0.8600
                  LSTM     0.7919    0.7516
  33. Fusion Model Architecture
  34. Fusion Model Results
      Dataset   Method      Macro F1  Average Precision
      Twitter1  LstmFusion  0.6711    0.8381
                CapsFusion  0.6968    0.8613
      Reddit    LstmFusion  0.7529    0.7566
                CapsFusion  0.8141    0.8149
  35. Takeaways from the Capsule Network Model ◆ Capsule networks perform better than LSTM by 10.54% in average precision. ◆ The CapsFusion model performs better than LstmFusion by 5.2% in average precision.
  36. Error Analysis ◆ Manual analysis of 50 instances classified incorrectly by LSTM but correctly by CapsNet ◆ Findings: ▶ False positive by LSTM / correctly classified by CapsNet: ":I think ``YOU RACIST CUNT`` qualifies as defamation. diff. I didn't edit your post, that was someone else." ▶ False negative by LSTM / correctly classified by CapsNet: "` I had no interest in getting ``under your skin``. It's you and your fellow admins who got under my skin. So well done. It doesn't matter anymore. — `"
  37. Qualitative Analysis. Text: "Where are the activists and foot soldiers when k'tak bleeds in silence." Label: Positive. DeepSHAP [8] results.
  38. Qualitative Analysis. Text: "The proud hero of kashmir! The hero of freedom struggle." Label: Negative. DeepSHAP results.
  39. Outline ◆ Data Collection ◆ Comparison of Methods ◆ Capsule Networks ◆ Co-training for Domain Adaptation ◆ Conclusion
  40. (Ref: [9])
  41. Example of Domain Adaptation. Example from the CODA paper (Chen et al.) on reviews from different domains.
  42. Our Co-training for Domain Adaptation Algorithm
  43. Co-training for Domain Adaptation setup: each model M1, M2, ..., M6 is trained on Twitter1 plus 20% of a second domain (Reddit, Twitter2, Whisper, ...) and tested on the remaining 80% of that domain.
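The data split behind the co-training setup on this slide (train on all of Domain1 plus 20% of Domain2, test on the remaining 80% of Domain2) can be sketched as follows; the dataset contents are placeholders:

```python
import random

# Sketch of the co-training data split: train = Domain1 + 20% of Domain2,
# test = remaining 80% of Domain2. Posts here are placeholder strings.

def cotrain_split(domain1, domain2, frac=0.2, seed=0):
    rng = random.Random(seed)
    d2 = domain2[:]            # don't mutate the caller's list
    rng.shuffle(d2)
    cut = int(len(d2) * frac)
    train = domain1 + d2[:cut]  # all of Domain1 + 20% of Domain2
    test = d2[cut:]             # remaining 80% of Domain2
    return train, test

twitter1 = [f"t{i}" for i in range(100)]  # stand-in for Twitter1 posts
reddit = [f"r{i}" for i in range(50)]     # stand-in for Reddit posts
train, test = cotrain_split(twitter1, reddit)
assert len(train) == 110 and len(test) == 40
```

A model trained on `train` and evaluated on `test` gives one row of the results table that follows.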
  44. Co-training for Domain Adaptation Results (CapsNet)
      Trained on (Domain1)  Co-trained on (Domain2)  Macro F1  Average Precision
      Twitter1              Quora                    0.6739    0.6723
                            Reddit                   0.6690    0.5581
                            Twitter2                 0.6321    0.6232
                            Whisper                  0.9167    0.9201
                            Wikipedia1               0.7496    0.7480
                            Wikipedia2               0.7341    0.7457
  45. Tradeoff Analysis (Domain1: Twitter1, Domain2: Reddit)
  46. Co-training Results ◆ Augmenting with just 20% of Domain2 samples reduces performance by only 17% compared to a model trained on 100% of the samples. ◆ As the percentage of Domain2 samples added to Domain1 increases, the model's performance improves. ◆ Therefore, if only a small amount of labeled data is available, co-training for domain adaptation is a viable option.
  47. Outline ◆ Data Collection ◆ Comparison of Methods ◆ Capsule Networks ◆ Co-training for Domain Adaptation ◆ Conclusion
  48. Conclusion ◆ We perform a multi-platform comparison of content moderation methods. ◆ Capsule Networks outperformed existing methods for content moderation. ◆ Co-training for domain adaptation offers a cost-effective alternative to annotating large amounts of data.
  49. Challenges, Limitations, Future Work ◆ Some datasets use weak labels ▶ Future work: examine whether stronger labels perform better than weak labels ◆ Different platforms have different styles of expressing content and different moderation policies ▶ Co-training may not work if the platforms' policies do not align ◆ Collecting the Reddit dataset was challenging ▶ Quarantined subreddits are no longer available ◆ We plan to extend the work to video content on various platforms.
  50. Acknowledgements ◆ Committee members ◆ Indira Sen, GESIS ◆ Snehal Gupta, Asmit Kumar Singh, Shubham Singh ◆ Members of Precog ◆ Family and friends
  51. References
      [1] Twitter data: https://github.com/zeerakw/hatespeech
      [2] Quora data: https://www.kaggle.com/c/quora-insincere-questions-classification/data
      [3] Wikipedia data: https://figshare.com/articles/Wikipedia_Detox_Data/4054689
      [4] Whisper data: https://github.com/Mainack/hatespeech-data-HT-2017
      [5] ExMachina: https://arxiv.org/abs/1610.08914
      [6] Badjatiya: https://arxiv.org/abs/1706.00188
  52. References (continued)
      [7] LSTM: http://precog.iiitd.edu.in/pubs/empowering-first-responders.pdf
      [8] DeepSHAP: https://github.com/slundberg/shap
      [9] Co-training for domain adaptation: https://papers.nips.cc/paper/4433-co-training-for-domain-adaptation.pdf
  53. Thanks! vani17068@iiitd.ac.in @VaniAgarwal9
