The document presents a methodology for classifying student tweets to understand their learning experiences. It proposes using a naive Bayes multilabel classifier with 7 categories instead of the existing 5 categories. The data was collected from social media and preprocessed. The classifier assigned probabilities to keywords and tweets for each category. It achieved 81.7% accuracy, improving classification of "Others" and "Good Things" categories. The approach provides a better way to analyze social media data for educational purposes compared to manual or computational methods.
Msd seminar
1. Mining Social Media Data for Understanding Students’
Learning Experiences
Presented by :
Saba Farheen Munshi (2018MIT015)
Under Guidance of :
Prof. R.S. Potpelwar
Department of Information Technology
Shri Guru Gobind Singhji Institute of Engineering and Technology, Nanded
29th March, 2019
Saba Munshi (SGGS , Nanded) Mining Social Media Data 29th March,2019 1 / 16
2. Table of Contents
1 Introduction
2 Problem Definition
3 Related Work
Existing Work
Proposed Methodology
4 Data Collection
5 Naive Bayes Multilabel Classifier
Text Pre-processing
Multilabel Naive Bayes Classification Algorithm
Evaluation Measure for Multilabel Classifier
Classification Result
6 Conclusion
7 References
3. Introduction
Introduction
Social media plays a powerful role today. It provides a stage on which people
share happiness, struggle, sentiment, and stress, and seek social support.
Students share their joys and sorrows related to their studies on social media in
the form of judgmental comments, tweets, posts, etc. Analyzing data from
such an environment requires classification techniques.
We explore engineering students' informal conversations on Twitter in order to
understand the issues and problems students encounter in their learning
experiences.
5. Problem Definition
Problem Definition
Researchers have traditionally used surveys, interviews, focus groups, and
classroom activities to collect data related to students’ learning experiences. The
existing work has not measured student academic performance to identify
students’ problems and classify them accurately for enhancing e-learning
experiences.
These methods also require students to reflect on what they were thinking and
doing at some point in the past or at the moment, and such recollections become
difficult to interpret over time.
6. Related Work Existing Work
Existing Work
Goffman’s Theory: Gaffney analyzes tweets using histograms, user
networks, and frequencies of top keywords to quantify online activism. The
existing system uses classification algorithms that find only the negative
emotions in students' learning experiences.
Alec Go Theory: Alec Go introduced a novel approach for
automatically classifying the sentiment of Twitter messages. These messages are
classified as either positive or negative with respect to a query term.
7. Related Work Proposed Methodology
Proposed Architecture
We propose the following three steps for sentiment tracking:
1 The first step is to collect data about students' academic experiences for processing.
2 In the next step, the collected data are explored and the categories into which tweets
can be differentiated are defined.
3 Finally, depending on the sentiment labels obtained for each tweet, we identify
the sentiment using a model trained with a multilabel classifier.
8. Data Collection
Data Collection
We collected data using an educational account on Radian6, a commercial social
media monitoring tool (25,284 tweets), and the Twitter search API (39,095
tweets).
The existing system has five prominent themes, whereas the proposed system consists of seven
prominent categories: Heavy Study Load, Lack of Social Engagement, Negative
Emotion, Sleep Problems, Diversity Issues, Others, and Good Things.
Figure: Categories
9. Naive Bayes Multilabel Classifier
Naive Bayes Multilabel Classifier
Text Pre-processing
Multilabel Naive Bayes Classification Algorithm
Evaluation Measure for Multilabel Classifier
Classification Result
10. Naive Bayes Multilabel Classifier Multilabel Naive Bayes Classification Algorithm
Multilabel Naive Bayes Classification Algorithm
This classifier considers each sub-word in a review and classifies the
review into the matching categories accordingly. Let S be the sentence.
Step 1: Define the categories c = c1, c2, c3, ..., cn.
Step 2: Read the data from a database.
Step 3: Split S into sub-words w1, w2, w3, ..., wn.
Step 4: Check the sub-words w1, w2, w3, ..., wn against every category.
Step 5: If a word matches one of the categories c1, c2, c3, ..., cn, increment the counter for that
category.
Step 6: Find the probability of each category.
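The steps above can be sketched in Python as follows. This is only an illustrative sketch, not the presenter's implementation: the keyword lists in `CATEGORY_KEYWORDS` and the function name `category_probabilities` are hypothetical, the database read of Step 2 is replaced by a plain string, and only three of the seven categories are shown for brevity.

```python
from collections import Counter

# Step 1: define the categories. The keyword sets are hypothetical;
# the slides do not list the actual vocabulary per category.
CATEGORY_KEYWORDS = {
    "Heavy Study Load": {"homework", "exam", "lab", "hours"},
    "Sleep Problems": {"sleep", "tired", "awake"},
    "Good Things": {"happy", "fun", "proud"},
}


def category_probabilities(sentence, keywords=CATEGORY_KEYWORDS):
    """Steps 3-6: split the sentence, match words, count, and normalise."""
    words = sentence.lower().split()              # Step 3: split S into sub-words
    counts = Counter()
    for word in words:                            # Step 4: check every word
        for category, vocab in keywords.items():
            if word in vocab:                     # Step 5: increment on a match
                counts[category] += 1
    total = sum(counts.values())
    if total == 0:                                # no keyword matched at all
        return {category: 0.0 for category in keywords}
    # Step 6: share of matched words that fall in each category
    return {category: counts[category] / total for category in keywords}


probs = category_probabilities("so tired no sleep before the exam")
for category, p in probs.items():
    print(f"{category}: {p:.2f}")
```

Because a sentence can match keywords from several categories at once, more than one category can receive a non-zero probability, which is what makes the output usable for multilabel assignment.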
11. Continued..
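If a word $w_n$ appears in a category $c$ a total of $m_{w_n c}$ times, then the probability of this word
in the specific category $c$ is
\[ p(w_n \mid c) = \frac{m_{w_n c}}{\sum_{n=1}^{N} m_{w_n c}} \]
Similarly, the probability of this word in categories other than $c$, written $c'$, is
\[ p(w_n \mid c') = \frac{m_{w_n c'}}{\sum_{n=1}^{N} m_{w_n c'}} \]
The probability that document $d_i$ belongs to category $c$ is
\[ p(c \mid d_i) = \frac{p(d_i \mid c) \cdot p(c)}{p(d_i)} \propto \prod_{k=1}^{K} p(w_{ik} \mid c) \cdot p(c) \]
The probability that document $d_i$ belongs to a category other than $c$ is
\[ p(c' \mid d_i) = \frac{p(d_i \mid c') \cdot p(c')}{p(d_i)} \propto \prod_{k=1}^{K} p(w_{ik} \mid c') \cdot p(c') \]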
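A minimal Python sketch of these probabilities, for one category at a time, might look like the following. Several parts are assumptions that the slides do not state: the add-one smoothing in `word_prob` (used here only so that unseen words do not zero out the product), the decision rule of comparing the score for c against the score for the categories other than c, and the toy training tweets. All function names are hypothetical.

```python
from collections import Counter


def train(docs, labels, category):
    """Count words in documents labelled `category` (c) and in all others."""
    in_c, not_c = Counter(), Counter()
    n_in_c = 0
    for words, doc_labels in zip(docs, labels):
        if category in doc_labels:
            in_c.update(words)
            n_in_c += 1
        else:
            not_c.update(words)
    prior_c = n_in_c / len(docs)                  # p(c)
    return in_c, not_c, prior_c


def word_prob(word, counts, vocab_size):
    # p(w|c) as a relative frequency; the +1 smoothing is an assumption
    return (counts[word] + 1) / (sum(counts.values()) + vocab_size)


def belongs(words, in_c, not_c, prior_c, vocab_size):
    """Compare the unnormalised posteriors for c and for 'not c'."""
    score_c, score_not = prior_c, 1.0 - prior_c
    for w in words:
        score_c *= word_prob(w, in_c, vocab_size)
        score_not *= word_prob(w, not_c, vocab_size)
    return score_c > score_not


# Toy data, invented for illustration only
docs = [["no", "sleep", "tonight"], ["so", "much", "homework"],
        ["cannot", "sleep", "again"], ["homework", "all", "night"]]
labels = [{"Sleep Problems"}, {"Heavy Study Load"},
          {"Sleep Problems"}, {"Heavy Study Load"}]
vocab_size = len({w for d in docs for w in d})

model = train(docs, labels, "Sleep Problems")
print(belongs(["sleep", "again"], *model, vocab_size))
print(belongs(["homework"], *model, vocab_size))
```

Running the same procedure once per category yields a multilabel prediction, since a tweet may pass the test for several categories independently.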
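12. Naive Bayes Multilabel Classifier Evaluation Measure for Multilabel Classifier
Evaluation Measure
Figure: Contingency Matrix
Four evaluation metrics are used to assess how well the tweet data are classified into a specific
category. Here tp, tn, fp, and fn are the counts of true positives, true negatives, false positives,
and false negatives, p is precision, and r is recall:
\[ \text{Accuracy} = \frac{tp + tn}{tp + tn + fp + fn} \]
\[ \text{Precision} = \frac{tp}{tp + fp} \]
\[ \text{Recall} = \frac{tp}{tp + fn} \]
\[ F_1 = \frac{2 \cdot p \cdot r}{p + r} = \frac{2\,tp}{2\,tp + fp + fn} \]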
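As a small worked sketch, the four metrics can be computed from the contingency counts as below. The function name `evaluate`, the helper `safe_div`, and the example counts are hypothetical and are not results from the presentation; returning 0.0 on an empty denominator is a convention chosen here, not something the slides specify.

```python
def safe_div(numerator, denominator):
    """Return 0.0 instead of raising when a denominator is zero."""
    return numerator / denominator if denominator else 0.0


def evaluate(tp, tn, fp, fn):
    """Compute accuracy, precision, recall, and F1 for one category."""
    accuracy = safe_div(tp + tn, tp + tn + fp + fn)
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    f1 = safe_div(2 * precision * recall, precision + recall)
    return accuracy, precision, recall, f1


# Invented counts for a single category, for illustration only
accuracy, precision, recall, f1 = evaluate(tp=40, tn=45, fp=10, fn=5)
print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f}")
```

For a multilabel classifier the counts are usually gathered per category, so this function would be called once for each of the seven categories.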
13. Naive Bayes Multilabel Classifier Classification Result
Comparative Classification Result
The following table shows the performance of the classifiers.
Classifier        Accuracy (%)    Classification Time (Seconds)
Naïve Bayes       81.7            0.805
K-NN              67.55           2.198
SVM               69.6            2.248
Simple Logistic   98.75           45.368
Table: Performance analysis of different classifiers
14. Conclusion
Conclusion
This work provides a workflow for analyzing social media data for educational purposes
that overcomes the major limitations of both manual qualitative analysis and
large-scale computational analysis of user-generated textual content.
Using naive Bayes probability rules, we classified the tweets by finding the
probability of the keywords and the probability of the tweets under each label.
The proposed system improved the probability and category
probability of two labels: “Others” and “Good Things”. We conclude
that students post not only their bad experiences but also their good
experiences on social media.
15. References
References
[1] Xin Chen, Mihaela and Krishna P.C, “Mining Social Media Data for
Understanding Students’ Learning Experiences”, IEEE Transactions on Learning
Technologies, 2014.
[2] David Ediger, Karl Jiang, Jason Riedy, David A. Bader, Courtney Corley, Rob
Farber, William N. Reynolds, “Massive Social Network Analysis: Mining Twitter for
Social Good”, IEEE 39th International Conference on Parallel Processing, San Diego,
CA, Sep 13-16, 2013, pp. 583-593.
[3] Loo Hanley, Timothy Ong Chee Aik, Raymond Wee Keat Kheng & Lim See Yew,
“Mining Twitter Data to understand student behavior”, IEEE 63rd Annual
Conference International Council for Educational Media, Myanmar, c 5 - 8. 2011,
125 - 223.
[4] E. Tonkin, “Searching the Long Tail: Hidden Structure in Social Tagging,” Proc.
17th ASIS&T SIG/CR Classification Research Workshop, 2006.