Relevance Mining and Detection System

1. Classification of Social Media Posts according to their Relevance
Author: Alexandre Pinto
Advisors: Prof. Dr. Hugo Gonçalo Oliveira, Prof. Dr. Ana Oliveira Alves
Faculty of Sciences and Technology, Department of Informatics Engineering, University of Coimbra
September 9, 2016

2. Summary
Contents: 1 Introduction, 2 Objectives, 3 Benchmarking NLP Toolkits, 4 Relevance Detection, 5 Conclusions
Classification of Social Media Posts, according to their Relevance, Alexandre Pinto, 2/60

3. Introduction

5. Introduction: REMINDS
REMINDS = RElevance MINing and Detection System
Main Goal:
• Development of a system capable of detecting relevant information, according to journalistic criteria, published in social networks, while ignoring irrelevant information such as private comments and personal information, or public text that is not important.

6. Introduction: REMINDS
REMINDS = RElevance MINing and Detection System
Four Main Approaches / Four Different Teams:
• Text Mining
• Sentiment Analysis
• Interaction Patterns and Network Topologies
• Natural Language Processing (NLP)

7. Introduction: What is Relevance?
Definition:
• “The degree to which something is related or useful to what is happening or being talked about”
Human notion:
• Hard to measure and define
Ambiguous nature:
• Cannot simply search for it (Information Retrieval)
• Must instead filter out irrelevant content (Classification)

10. Objectives

11. Objectives: Goal
• Automatic classification of public social data according to its potential relevance to a general audience, filtering out irrelevant information.
• Rely primarily on linguistic features, extracted with the help of existing NLP tools.
• Confirm whether relevance can be predicted from a set of journalistic criteria.

14. Objectives: The Big Picture
Figure: System Overview

15. Objectives: The Big Picture
Figure: System Overview (Focus of this work)

16. Benchmarking NLP Toolkits

17. Why Benchmarking NLP Toolkits [Pinto et al., 2016]
For widely spoken languages, such as English:
• Wide range of NLP toolkits
• Complex applications do not have to be developed from scratch
• Difficult choice among the available tools

18. Why Benchmarking NLP Toolkits: User Choices
Aspects to consider:
• Community of users
• Frequency of new versions and updates
• Cost of integration
• Programming language
• Covered tasks
• Performance (with formal and social media text)

25. Benchmarking NLP Toolkits: Workplan
Methodology:
• Choose a range of NLP toolkits
• Use default configurations (pre-trained models)
• Perform a set of standard tasks
• Use popular datasets that cover newspaper and social network text
• Analyse results

30. Addressed Tasks

32. Addressed Tasks: Lower-level NLP Tasks
Figure: Tokenization (source: www.nltk.org/book/ch07.html)
Figure: Part-of-Speech (POS) Tagging (source: www.nltk.org/book/ch07.html)

34. Addressed Tasks: Lower-level NLP Tasks
Figure: Chunking (source: www.nltk.org/book/ch07.html)
Figure: Named Entity Recognition/Classification (source: stanfordnlp.github.io/CoreNLP)

35. Used Datasets

36. Used Datasets
• Public datasets, used to assess the performance of the NLP tools and thus support the choice among them
• Well known and widely used in text classification research, for example for training and evaluating new tools
• Different gold standard datasets that cover different kinds of text: newspaper and social media

39. Used Datasets: Newspaper and Social Media
CoNLL-2003 shared task data
• Collection of newswire articles from the Reuters Corpus (PoS, Chunk, NER)
Alan Ritter Twitter dataset
• Collection of randomly sampled tweets (PoS, Chunk, NER)
MSM 2013 workshop
• Collection of randomly sampled tweets (NER)
Format:
Token      POS   Syntactic Chunk   Named Entity
Only       RB    B-NP              O
France     NNP   I-NP              LOC
and        CC    I-NP              O
Britain    NNP   I-NP              LOC
backed     VBD   B-VP              O
Fischler   NNP   B-NP              PER
’s         POS   B-NP              O
proposal   NN    I-NP              O
.          .     O                 O
Table: Example of the Annotated Data Format

43. Used Datasets: Newspaper and Social Media
PoS:
• Penn Treebank (PTB) style (CoNLL-2003)
• PTB + Twitter-specific tags (@usernames, #hashtags, and URLs) (Ritter)
Chunking Format:
• IOB-TYPE format
Named Entities:
• PER, LOC, ORG or MISC
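To make the four-column annotated format concrete, here is a minimal Python sketch that reads one sentence in that layout and groups tokens into chunks from their IOB tags. The names `Token`, `parse_sentence`, `chunk_spans` and `SAMPLE` are illustrative assumptions, not part of the presented system, and the sample uses an ASCII apostrophe.

```python
from typing import List, NamedTuple, Tuple


class Token(NamedTuple):
    text: str
    pos: str
    chunk: str
    entity: str


# One sentence in the Token / POS / Chunk / Named Entity layout.
SAMPLE = """\
Only RB B-NP O
France NNP I-NP LOC
and CC I-NP O
Britain NNP I-NP LOC
backed VBD B-VP O
Fischler NNP B-NP PER
's POS B-NP O
proposal NN I-NP O
. . O O
"""


def parse_sentence(block: str) -> List[Token]:
    """Parse whitespace-separated four-column lines into Token records."""
    tokens = []
    for line in block.splitlines():
        if line.strip():
            text, pos, chunk, entity = line.split()
            tokens.append(Token(text, pos, chunk, entity))
    return tokens


def chunk_spans(tokens: List[Token]) -> List[Tuple[str, str]]:
    """Group tokens into (chunk type, text) spans: B-X opens, I-X continues, O closes."""
    spans, current_type, words = [], None, []
    for tok in tokens:
        if tok.chunk == "O":
            prefix, ctype = "O", None
        else:
            prefix, ctype = tok.chunk.split("-", 1)
        continues = prefix == "I" and ctype == current_type and words
        if continues:
            words.append(tok.text)
            continue
        if words:  # close the span that was open
            spans.append((current_type, " ".join(words)))
        current_type, words = (ctype, [tok.text]) if prefix != "O" else (None, [])
    if words:
        spans.append((current_type, " ".join(words)))
    return spans


tokens = parse_sentence(SAMPLE)
entities = [(t.text, t.entity) for t in tokens if t.entity != "O"]
chunks = chunk_spans(tokens)
```

Keeping one record per token makes it straightforward to compare a toolkit's predicted tags against the gold column token by token.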

46. Used Datasets: Statistics
Dataset                  Documents   Tokens    Average Tokens per Document
CoNLL (Reuters Corpus)   946         203621    215
Twitter (Alan Ritter)    2394        46469     19
#MSM2013                 2815        52124     19
Table: Dataset properties
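The average column follows directly from the other two. A short Python check, assuming the averages were rounded to the nearest integer (the rounding rule is an assumption; the counts are taken from the table):

```python
# (documents, tokens) per dataset, as listed in the dataset properties table.
DATASETS = {
    "CoNLL (Reuters Corpus)": (946, 203621),
    "Twitter (Alan Ritter)": (2394, 46469),
    "#MSM2013": (2815, 52124),
}


def average_tokens(documents: int, tokens: int) -> int:
    """Average tokens per document, rounded to the nearest integer."""
    return round(tokens / documents)


averages = {name: average_tokens(docs, toks) for name, (docs, toks) in DATASETS.items()}
```

The news articles are roughly an order of magnitude longer than the tweets, which is worth keeping in mind when comparing per-document scores across datasets.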

47. Compared Tools

48. Compared Tools: Standard vs Social NLP toolkits
Standard NLP toolkits:
• NLTK
• Apache OpenNLP
• Stanford CoreNLP
• Pattern
Social Network-Oriented Toolkits:
• TwitterNLP
• TweetNLP
• TwitIE

50. Compared Tools: Tools Summary
System       Programming Language   Target Text
NLTK         Python                 Generic
OpenNLP      Java                   Generic
CoreNLP      Java                   Generic
Pattern      Python                 Generic
TweetNLP     Java                   Social Media
TwitterNLP   Python                 Social Media
TwitIE       Java                   Social Media
Task columns in the original table: Tokenization, PoS tagging, Chunking, NER.
Table: Toolkit properties

51. Comparison Results

52. Comparison Results
All values are F1 ± σ.
             CoNLL                                       Alan Ritter - Twitter
Tool         PoS           Chunking      NEC           PoS           Chunking      NEC
OpenNLP      0.88 ± 0.10   0.83 ± 0.12   0.87 ± 0.09   0.71 ± 0.17   0.45 ± 0.39   0.87 ± 0.13
TweetNLP     0.84 ± 0.09   n/a           n/a           0.95 ± 0.07   n/a           n/a
TwitterNLP   0.83 ± 0.15   0.83 ± 0.13   0.85 ± 0.12   0.92 ± 0.11   0.90 ± 0.11   0.95 ± 0.08
Table: Best Performance Results

53. Comparison Results: Discussion
• Common NLP tools usually perform well on well-formed content, such as news
• Noisy and informal text, such as tweets, brings new challenges and decreases performance
• Specially tailored tools such as CMU TweetNLP and Twitter NLP perform well on social media text and were used in the feature extraction process
• General-purpose tools offer better support and are more customizable (they accept newly trained models)

57. Relevance Detection

58. Relevance Detection: Methods used in this work
Definition of Relevance
• Criteria (journalistic criteria established by CRACS@INESC-TEC): Controversialness, Informativeness, Meaningfulness, Novelty, Reliability, Scope
NLP Tasks
• Extraction: Part-of-Speech, Chunking, Named Entities, Polarity of words, LDA topics, N-gram, Stemming, Lemmatization
Machine Learning Methods
• Preprocessing: Standardization, Normalization, Scaling
• Selection: Info. Gain, Gain Ratio, Fisher, Pearson, Chi-square
• Reduction: PCA
• Models: MDC, kNN, NB, SVM, DT, RF
• Evaluation: Accuracy, Precision, Recall, F1, ROC, AP, k-Fold-CV
Table: Methods used in this work

59. Related Work
Author(s)         Target Class          Source Data   Classifier   Performance
Mohammad et al.   Sentiment             Twitter       SVM          F1=0.69
Sriram            News's type           Twitter       SVM          Acc=0.96
Fernandes et al.  Popularity            Mashable      RF           F1=0.69
Guerini et al.    Buzz                  Digg          SVM          F1=0.81
Zeng et al.       Helpful Opinion       Amazon        SVM          Acc=0.72
Irani et al.      Trending Content      Twitter       C4.5         F1=0.79
Lee et al.        Trending Categories   Twitter       NB           Acc=0.65
Frain et al.      Satiric Content       Created       SVM          F1=0.89
Liparas et al.    Topic                 News Sites    RF           F1=0.85
Feature groups compared across these works: word n-grams, char n-grams, all-caps, POS, #hashtags, punctuation, emoticons, elongated words, clusters, authorship info., digital media, #words, #links, length, LDA topics, polarity, lemmas, TF-IDF, #profanity.
Table: Related Work

60. Used Datasets

61. Relevance Detection: Used Datasets
• Textual messages gathered (by CRACS@INESC-TEC) from Twitter and Facebook
• Text quality preferred over text quantity

63. Relevance Detection: Used Datasets
Twitter search queries:
• “refugees” and “Syria”
• “elections” and “US”
• “Olympic Games”
• “terrorism”
• “Daesh”
Official Facebook pages:
• Euronews, CNN, Washington Post, Financial Times, New York Post, The New York Times, BBC News, The Telegraph, The Guardian, The Huffington Post, Der Spiegel International, Deutsche Welle News, Pravda and Fox News.

65. Relevance Detection: Used Datasets
• The same method was used with other journalistic criteria, such as interestingness, controversy, meaningfulness, novelty, reliability and scope.

66. Relevance Detection: Used Datasets
                             #Facebook Posts        #Facebook Comments     #Tweets
Search Word                  Relevant  Irrelevant   Relevant  Irrelevant   Relevant  Irrelevant
“Refugees” + “Syria”         20        4            30        13           55        23
“Elections” + “US”           21        8            21        14           29        39
“Olympic Games”              2         0            4         1            22        114
“Terrorism”                  53        16           138       88           59        53
“Daesh”                      2         0            14        12           26        30
“Referendum” + “UK” + “EU”   4         0            7         1            14        4
Table: Documents grouped by source, relevance label and query

67. Relevance Detection: Used Datasets
Source       A1  A2  A3  Class        Content
FB post      5   4   5   Relevant     Putin: Turkey supports terrorism and stabs Russia in the back
Tweet        4   5   5   Relevant     Canada to accept additional 10,000 Syrian refugees
Tweet        1   1   1   Irrelevant   Lololol winning the internet and stomping out daesh #merica
FB comment   2   4   3   Irrelevant   Comparing numbers of people killed by terrorism with numbers killed by slipping in bath tub is stupid as eff. It totally ignores the mal-intent behind terrorism, its impact on way of life and ideology.
(A1, A2, A3: annotator answers)
Table: Examples of messages in the dataset.

68. Feature Extraction

69. Feature Extraction: Feature Set
Feature Set                                         #Distinct Features
PoS-tags                                            54
Chunk tags                                          23
NE tags                                             11
Total number of PoS/Chunk tags                      2
Total number of Named Entities                      1
Total number of positive/neutral/negative words     3
Total number of characters/tokens                   2
Total number/proportion of all capitalized words    2
LDA topic distribution                              20
Token 1-3grams                                      2711 (f ≥ 3)
Lemma 1-5grams                                      top-750 (f ≥ 1)
Stem 1-5grams                                       top-750 (f ≥ 1)
PoS 1-5grams                                        top-125 (f ≥ 1)
Chunk 1-5grams                                      top-125 (f ≥ 1)
Total                                               4,579
Table: Feature sets used
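As an illustration of the simplest rows in this table (characters, tokens, all-capitalized words and their proportion), the following Python sketch computes them for one message. It assumes whitespace tokenization and counts a token as all-caps when it is purely alphabetic and upper-case; the work itself relied on NLP toolkits for tokenization, so both choices, and the function name `surface_features`, are assumptions for illustration only.

```python
from typing import Dict


def surface_features(text: str) -> Dict[str, float]:
    """Character count, token count, all-caps count and all-caps proportion."""
    tokens = text.split()  # assumption: whitespace tokenization
    all_caps = [t for t in tokens if t.isalpha() and t.isupper()]
    return {
        "chars": len(text),
        "tokens": len(tokens),
        "allcaps": len(all_caps),
        "allcaps_ratio": len(all_caps) / len(tokens) if tokens else 0.0,
    }


# Hypothetical message, not taken from the dataset.
features = surface_features("BREAKING news from the US")
```

Guarding the ratio against empty input avoids a division by zero for messages that contain no tokens.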

70. Baseline Experiments

71. Baseline Experiments
Feature sets:
• Full feature set
• Part-of-speech
• Chunks
• Named entities
• Chars+Tokens+Allcaps+Allcaps-ratio
• Positive+Neutral+Negative
• LDA topic distribution
• Token n-grams
• Lemma n-grams
• Stem n-grams
• PoS n-grams
• Chunk n-grams

72. Baseline Experiments
Classifiers:
• Minimum Distance Classifier
• k-Nearest Neighbors
• Naive Bayes
• Support Vector Machine
• Decision Tree
• Random Forest Classifier
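Of the classifiers listed, the Minimum Distance Classifier is the simplest to write down: each class is represented by the mean of its training vectors, and a new sample takes the label of the closest mean. The sketch below assumes Euclidean distance and a hypothetical class name; the slides do not state which distance or implementation was used.

```python
import math
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence


class MinimumDistanceClassifier:
    """Nearest-centroid classifier using Euclidean distance (an assumption)."""

    def fit(self, X: Sequence[Sequence[float]], y: Sequence[Hashable]):
        sums: Dict[Hashable, List[float]] = {}
        counts: Dict[Hashable, int] = defaultdict(int)
        for row, label in zip(X, y):
            if label not in sums:
                sums[label] = [0.0] * len(row)
            sums[label] = [s + v for s, v in zip(sums[label], row)]
            counts[label] += 1
        # One centroid (per-feature mean) per class.
        self.centroids_ = {
            label: [s / counts[label] for s in total] for label, total in sums.items()
        }
        return self

    def predict(self, X: Sequence[Sequence[float]]) -> List[Hashable]:
        return [
            min(self.centroids_, key=lambda label: math.dist(row, self.centroids_[label]))
            for row in X
        ]


# Toy two-dimensional data, purely illustrative.
X_train = [[0, 0], [0, 1], [5, 5], [6, 5]]
y_train = ["irrelevant", "irrelevant", "relevant", "relevant"]
model = MinimumDistanceClassifier().fit(X_train, y_train)
predictions = model.predict([[1, 1], [5, 4]])
```

Because the model stores only one centroid per class, it is cheap to train and serves as a natural low-complexity baseline.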

73. Baseline Experiments
Performance Metrics:
• Accuracy
• Precision
• Recall
• F1
• Area Under the Curve (AUC)
• Average Precision (AP)
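The first four metrics follow from the confusion counts of a binary classifier. A small Python sketch, with an illustrative function name and toy labels (AUC and AP are left out because they need ranked scores rather than hard labels):

```python
from typing import Dict, Hashable, Sequence


def binary_metrics(y_true: Sequence[Hashable], y_pred: Sequence[Hashable],
                   positive: Hashable = 1) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 with respect to the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs) if pairs else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy labels: 1 = relevant, 0 = irrelevant.
scores = binary_metrics([1, 1, 1, 0, 0], [1, 1, 0, 1, 0])
```

Note that an F1 averaged over cross-validation folds generally differs from the F1 computed from the averaged precision and recall, so the columns of a mean ± σ table need not satisfy the formula exactly.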

74. Baseline Experiments: Results
Classifier: Minimum Distance Classifier
Feature Set                          Accuracy      Precision     Recall        F1            AP            AUC
Full feature set                     0.57 ± 0.05   0.72 ± 0.14   0.41 ± 0.14   0.50 ± 0.11   0.73 ± 0.07   0.59 ± 0.05
Part-of-speech                       0.57 ± 0.06   0.71 ± 0.15   0.42 ± 0.14   0.50 ± 0.11   0.72 ± 0.07   0.58 ± 0.06
Chunks                               0.57 ± 0.05   0.71 ± 0.14   0.42 ± 0.13   0.51 ± 0.10   0.72 ± 0.07   0.59 ± 0.05
Named entities                       0.57 ± 0.05   0.71 ± 0.14   0.42 ± 0.13   0.51 ± 0.10   0.72 ± 0.07   0.58 ± 0.05
Chars+Tokens+Allcaps+Allcaps-ratio   0.57 ± 0.06   0.71 ± 0.14   0.41 ± 0.14   0.50 ± 0.11   0.73 ± 0.07   0.59 ± 0.05
Positive+Neutral+Negative            0.56 ± 0.05   0.70 ± 0.15   0.41 ± 0.13   0.50 ± 0.11   0.72 ± 0.07   0.58 ± 0.05
LDA topic distribution               0.63 ± 0.15   0.65 ± 0.17   0.89 ± 0.17   0.73 ± 0.12   0.80 ± 0.09   0.60 ± 0.17
Token n-grams                        0.57 ± 0.06   0.70 ± 0.16   0.46 ± 0.15   0.53 ± 0.11   0.73 ± 0.08   0.58 ± 0.06
Lemma n-grams                        0.58 ± 0.05   0.71 ± 0.14   0.48 ± 0.15   0.55 ± 0.10   0.74 ± 0.07   0.60 ± 0.05
Stem n-grams                         0.59 ± 0.05   0.72 ± 0.13   0.48 ± 0.15   0.55 ± 0.10   0.74 ± 0.06   0.60 ± 0.05
PoS n-grams                          0.58 ± 0.05   0.71 ± 0.11   0.46 ± 0.16   0.53 ± 0.12   0.73 ± 0.05   0.59 ± 0.05
Chunk n-grams                        0.57 ± 0.06   0.70 ± 0.15   0.43 ± 0.14   0.51 ± 0.11   0.72 ± 0.07   0.59 ± 0.05
Table: Baseline Results for the Minimum Distance Classifier

75. Baseline Experiments: Best ROC Curves
Figure: ROC Curves of an SVM Classifier using PoS tags as features

76. Baseline Experiments: Best PR Curves
Figure: Precision-Recall Curves of a Minimum Distance Classifier using LDA topic distributions as features

77. Feature Engineering

78. Feature Engineering
• Number of used features: 201
Preprocessing methods:
• Standardization / Normalization / Scaling
Feature Selection/Reduction methods:
• Information Gain / Gain Ratio
• Chi-square (χ²) / Fisher score / Pearson Correlation
• PCA (4 dimensions)
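Two of the steps named here, standardization and the Pearson correlation filter, can be sketched in plain Python. The sketch assumes z-score standardization with the population standard deviation and ranks features by the absolute value of their correlation with the class label; the slides do not specify these details, and the function names are illustrative.

```python
import math
import statistics
from typing import List, Sequence


def standardize(column: Sequence[float]) -> List[float]:
    """Z-score a feature column: zero mean, unit (population) standard deviation."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)
    if std == 0:
        return [0.0] * len(column)
    return [(v - mean) / std for v in column]


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient; 0.0 when either input is constant."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def select_by_pearson(X: Sequence[Sequence[float]], y: Sequence[float], k: int) -> List[int]:
    """Indices of the k features most correlated (in absolute value) with the label."""
    columns = list(zip(*X))
    scores = [abs(pearson(col, y)) for col in columns]
    ranked = sorted(range(len(columns)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


# Toy data: only the first feature tracks the label.
X = [[1, 5, 0], [2, 3, 1], [3, 6, 0], [4, 2, 1]]
y = [0, 0, 1, 1]
z = standardize([row[0] for row in X])
kept = select_by_pearson(X, y, k=1)
```

A filter like this scores each feature independently of the classifier, which keeps selection cheap even when the full feature set runs to thousands of columns.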

81. Feature Engineering: Results
Classifier: Support Vector Machine (SVM)
Pipeline applied                       Accuracy      Precision     Recall        F1            AP            AUC
Standardization + full feature set     0.58 ± 0.12   0.65 ± 0.14   0.57 ± 0.21   0.58 ± 0.17   0.73 ± 0.10   0.58 ± 0.12
Normalization + full feature set       0.58 ± 0.09   0.66 ± 0.13   0.58 ± 0.16   0.60 ± 0.10   0.74 ± 0.07   0.59 ± 0.10
Scaling[0,1] + full feature set        0.59 ± 0.11   0.65 ± 0.15   0.61 ± 0.19   0.61 ± 0.14   0.74 ± 0.09   0.59 ± 0.11
Standardization + information gain     0.54 ± 0.09   0.62 ± 0.14   0.53 ± 0.37   0.47 ± 0.28   0.71 ± 0.10   0.54 ± 0.07
Standardization + gain ratio           0.54 ± 0.06   0.57 ± 0.06   0.66 ± 0.09   0.61 ± 0.05   0.71 ± 0.04   0.52 ± 0.06
Standardization + chi-square (χ²)      0.56 ± 0.11   0.61 ± 0.20   0.58 ± 0.34   0.54 ± 0.24   0.71 ± 0.14   0.56 ± 0.11
Standardization + Fisher score         0.54 ± 0.09   0.57 ± 0.09   0.54 ± 0.29   0.53 ± 0.18   0.68 ± 0.10   0.54 ± 0.08
Standardization + Pearson              0.65 ± 0.16   0.66 ± 0.17   0.94 ± 0.11   0.76 ± 0.10   0.81 ± 0.08   0.61 ± 0.18
Standardization + Pearson + PCA (4D)   0.64 ± 0.16   0.65 ± 0.17   0.94 ± 0.11   0.75 ± 0.10   0.81 ± 0.09   0.61 ± 0.18
Standardization + gain ratio + PCA (4D) 0.59 ± 0.05  0.58 ± 0.04   0.96 ± 0.10   0.72 ± 0.04   0.78 ± 0.03   0.54 ± 0.06
Table: Results of applying different Preprocessing and Feature Selection methods with a Support Vector Machine

82. Feature Engineering: Best ROC Curves
Figure: ROC Curves of a kNN Classifier using Standardization and the Pearson Correlation Filter

83. Feature Engineering: Best PR Curves
Figure: Precision-Recall Curves of a Naive Bayes Classifier using Standardization and the Pearson Correlation Filter

84. Predicting Relevance through Journalistic Criteria

85. Predicting Relevance through Journalistic Criteria: Overview
Figure: Prediction of Relevance using Journalistic Criteria

86. Predicting Relevance through Journalistic Criteria: Results
Relevance based on Journalistic Criteria
Intermediate Classifiers       Accuracy      Precision     Recall        F1            AP            AUC
Minimum Distance Classifiers   0.62 ± 0.11   0.66 ± 0.17   0.89 ± 0.19   0.72 ± 0.08   0.80 ± 0.06   0.59 ± 0.13
K-Nearest Neighbors            0.54 ± 0.08   0.63 ± 0.14   0.57 ± 0.17   0.57 ± 0.08   0.72 ± 0.05   0.54 ± 0.09
Naive Bayes                    0.56 ± 0.01   0.56 ± 0.01   0.97 ± 0.03   0.71 ± 0.01   0.77 ± 0.01   0.52 ± 0.01
Linear SVMs                    0.54 ± 0.03   0.56 ± 0.02   0.89 ± 0.08   0.68 ± 0.03   0.75 ± 0.02   0.50 ± 0.04
Decision Trees                 0.55 ± 0.05   0.57 ± 0.03   0.78 ± 0.11   0.65 ± 0.05   0.73 ± 0.03   0.52 ± 0.05
Random Forests                 0.79 ± 0.07   0.80 ± 0.08   0.84 ± 0.07   0.82 ± 0.06   0.86 ± 0.04   0.78 ± 0.08
Table: Results on Predicting Relevance by an Ensemble of Journalistic Classifiers
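The two-stage idea behind these results can be sketched as follows: one intermediate classifier per journalistic criterion produces a binary prediction, and the vector of those predictions is the input of a final relevance classifier. In this Python sketch the intermediate models are hypothetical threshold functions standing in for trained classifiers, and the final stage is a small k-nearest-neighbours vote; all names, thresholds and toy data are assumptions, not the presented implementation.

```python
import math
from typing import Callable, Dict, List, Sequence

CRITERIA = ["controversialness", "informativeness", "meaningfulness",
            "novelty", "reliability", "scope"]

CriterionModel = Callable[[Sequence[float]], int]  # returns 0 or 1


def criteria_vector(models: Dict[str, CriterionModel], features: Sequence[float]) -> List[int]:
    """Stage 1: one binary prediction per journalistic criterion."""
    return [models[name](features) for name in CRITERIA]


def knn_predict(train_vectors: Sequence[Sequence[int]], train_labels: Sequence[int],
                query: Sequence[int], k: int = 3) -> int:
    """Stage 2: majority vote among the k closest criteria vectors (use odd k)."""
    ranked = sorted(zip(train_vectors, train_labels),
                    key=lambda pair: math.dist(pair[0], query))
    votes = [label for _, label in ranked[:k]]
    return max(set(votes), key=votes.count)


# Hypothetical stand-ins for trained models: each thresholds one feature.
models = {name: (lambda f, i=i: int(f[i] > 0.5)) for i, name in enumerate(CRITERIA)}

# Toy training posts (feature vectors) with relevance labels (1 = relevant).
train_features = [
    [0.9, 0.8, 0.7, 0.9, 0.6, 0.8],
    [0.7, 0.9, 0.9, 0.2, 0.8, 0.9],
    [0.1, 0.2, 0.3, 0.1, 0.4, 0.2],
    [0.2, 0.1, 0.6, 0.3, 0.2, 0.1],
]
train_labels = [1, 1, 0, 0]
train_vectors = [criteria_vector(models, f) for f in train_features]

query = criteria_vector(models, [0.8, 0.7, 0.9, 0.6, 0.3, 0.9])
predicted = knn_predict(train_vectors, train_labels, query)
```

Passing the criterion predictions forward as features keeps the final classifier small (six inputs here) and makes each decision traceable to the criteria that drove it.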

87. Predicting Relevance through Journalistic Criteria: Best ROC Curves
Figure: ROC Curves of a Journalistic Based kNN Classifier, using Random Forests for the intermediate classifiers

88. Predicting Relevance through Journalistic Criteria: Best PR Curves
Figure: Precision-Recall Curves of a Journalistic Based kNN Classifier, using Random Forests for the intermediate classifiers

89. Conclusions

90. Conclusion: Final Remarks
• Within the scope of the REMINDS project, a classifier was created using exclusively linguistic features.
• Feature engineering leads to slightly better results, but the baseline experiments are still competitive.
• Future work: integration with other filters that consider other features.

93. Conclusion: Final Remarks
• The best approach uses an ensemble of classifiers, one targeted at each of the journalistic criteria, with linguistic features extracted from text, achieving an F1 score of 0.82 and an AUC of 0.78.
• Results are in line with state-of-the-art results from similar approaches that classify documents according to other criteria.

96. References
A. Pinto, H. Gonçalo Oliveira, and A. Oliveira Alves. Comparing the Performance of Different NLP Toolkits in Formal and Social Media Text. In Marjan Mernik, José Paulo Leal, and Hugo Gonçalo Oliveira, editors, 5th Symposium on Languages, Applications and Technologies (SLATE'16), volume 51 of OpenAccess Series in Informatics (OASIcs), pages 1–16, Dagstuhl, Germany, 2016. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik. ISBN 978-3-95977-006-4. doi: http://dx.doi.org/10.4230/OASIcs.SLATE.2016.3. URL http://drops.dagstuhl.de/opus/volltexte/2016/6008.
