SlideShare a Scribd company logo
Under the guidance of
Prof. Vasudeva Varma
IIIT Hyderabad Submitted by:
Abhishek Jain(201201137)
Pradeep Anumala(201350843)
Pragya Musal(201405545)
Shashank S(201405599)
Mentored by:
Satarupa Guha
Input: Textual content of a
tweet.
Output: Label signifying
whether the tweet is positive,
negative or neutral
Problem?
download Parser Tokenizer
PreProcessor
Feature
Vector
Builder
Add Additional
Features
Construct feed
File
SVM Classifier Training.model
SVM Classifier
Test data Polarity of tweet
(positive/negative/neutral)
A total of 9684 train tweets and 8987 test tweets
were downloaded from twitter and are fed to Parser.
The parser
1. Removes the unavailable tweets
2.Segregates the tweet and polarity
3.After removing the unavailable tweets,the total no. of
train tweets were 7875 and test tweets were 8011
We used the ‘ARK tokenizer’ to tokenize the tweets
The tokenizer divides each tweets into a sequence of
space separated tokens and puts them into a file,
which is used at a later stage for processing.
The tokenized tweets are fed to the pre processor
which :
1)Replaces the urls with | | U | |
2)Replaces @references with | | T | |
3)Replaces +ve emoticons with the word ‘epositive’ and
–ve emoticons with the word ‘enegative’
4)Replaces the words that signify negative context with
the word “not”
The preprocessed file is fed to the feature vector
builder which creates the final feature vector.
The basic(baseline) feature that was considered was
of unigrams.
A list of all unique unigrams across the training set
was constructed and it formed the basic vector for
each tweet.
• The Feature Vector was enhanced by introducing
more features like:
• POS-Tagging
• Count of emoticons, hashtags and exclamations.
• Scores from standard Lexicons
• Negated contexts
• Elongated words (sooooo,happppppppppy)
The formed feature vector was written into a file in a
format expected by the libsvm classifier.
A linear SVM Classifier was used and trained with the
training file as an input and creates training.model
file
This model file was used on the testing file to predict
the results.
The model is tested on a set of 8011 test tweets.
The following results were obtained:
Accuracy : 64% (5127/8011)
F-measure : 0.6163
Thank You

More Related Content

What's hot

Design pattern
Design patternDesign pattern
Design pattern
Omar Isaid
 
A Generic Neural Network Architecture to Infer Heterogeneous Model Transforma...
A Generic Neural Network Architecture to Infer Heterogeneous Model Transforma...A Generic Neural Network Architecture to Infer Heterogeneous Model Transforma...
A Generic Neural Network Architecture to Infer Heterogeneous Model Transforma...
Lola Burgueño
 
Sentiment analysis of twitter data
Sentiment analysis of twitter dataSentiment analysis of twitter data
Sentiment analysis of twitter data
Bhagyashree Deokar
 
Lesson11 more behavioural patterns
Lesson11 more behavioural patternsLesson11 more behavioural patterns
Lesson11 more behavioural patterns
OktJona
 
overloading
overloadingoverloading
overloading
cpsivaku
 
Sentiment Analysis on Twitter
Sentiment Analysis on TwitterSentiment Analysis on Twitter
Sentiment Analysis on Twitter
SmritiAgarwal26
 
OOP static variables - friend function - operator overloading - copy constructor
OOP static variables - friend function - operator overloading - copy constructorOOP static variables - friend function - operator overloading - copy constructor
OOP static variables - friend function - operator overloading - copy constructor
Tamir Azrab
 
Sentiment analysis using naive bayes classifier
Sentiment analysis using naive bayes classifier Sentiment analysis using naive bayes classifier
Sentiment analysis using naive bayes classifier
Dev Sahu
 
Unit testing
Unit testingUnit testing
Unit testing
Vinod Wilson
 
Project prSentiment Analysis of Twitter Data Using Machine Learning Approach...
Project prSentiment Analysis  of Twitter Data Using Machine Learning Approach...Project prSentiment Analysis  of Twitter Data Using Machine Learning Approach...
Project prSentiment Analysis of Twitter Data Using Machine Learning Approach...
Geetika Gautam
 

What's hot (10)

Design pattern
Design patternDesign pattern
Design pattern
 
A Generic Neural Network Architecture to Infer Heterogeneous Model Transforma...
A Generic Neural Network Architecture to Infer Heterogeneous Model Transforma...A Generic Neural Network Architecture to Infer Heterogeneous Model Transforma...
A Generic Neural Network Architecture to Infer Heterogeneous Model Transforma...
 
Sentiment analysis of twitter data
Sentiment analysis of twitter dataSentiment analysis of twitter data
Sentiment analysis of twitter data
 
Lesson11 more behavioural patterns
Lesson11 more behavioural patternsLesson11 more behavioural patterns
Lesson11 more behavioural patterns
 
overloading
overloadingoverloading
overloading
 
Sentiment Analysis on Twitter
Sentiment Analysis on TwitterSentiment Analysis on Twitter
Sentiment Analysis on Twitter
 
OOP static variables - friend function - operator overloading - copy constructor
OOP static variables - friend function - operator overloading - copy constructorOOP static variables - friend function - operator overloading - copy constructor
OOP static variables - friend function - operator overloading - copy constructor
 
Sentiment analysis using naive bayes classifier
Sentiment analysis using naive bayes classifier Sentiment analysis using naive bayes classifier
Sentiment analysis using naive bayes classifier
 
Unit testing
Unit testingUnit testing
Unit testing
 
Project prSentiment Analysis of Twitter Data Using Machine Learning Approach...
Project prSentiment Analysis  of Twitter Data Using Machine Learning Approach...Project prSentiment Analysis  of Twitter Data Using Machine Learning Approach...
Project prSentiment Analysis of Twitter Data Using Machine Learning Approach...
 

Similar to 19-14-Sentiment Analysis On Twitter

SubTopic Detection of Tweets Related to an Entity
SubTopic Detection of Tweets Related to an EntitySubTopic Detection of Tweets Related to an Entity
SubTopic Detection of Tweets Related to an Entity
Ankita Kumari
 
Svm and maximum entropy model for sentiment analysis of tweets
Svm and maximum entropy model for sentiment analysis of tweetsSvm and maximum entropy model for sentiment analysis of tweets
Svm and maximum entropy model for sentiment analysis of tweets
S M Raju
 
Cucumber jvm best practices v3
Cucumber jvm best practices v3Cucumber jvm best practices v3
Cucumber jvm best practices v3
Ahmed Misbah
 
Twitter Sentiment Analysis
Twitter Sentiment AnalysisTwitter Sentiment Analysis
Twitter Sentiment Analysis
Ayush Khandelwal
 
Data Driven Testing
Data Driven TestingData Driven Testing
Data Driven Testing
Maveryx
 
Learning to Rank Relevant Files for Bug Reports using Domain Knowledge
Learning to Rank Relevant Files for Bug Reports using Domain KnowledgeLearning to Rank Relevant Files for Bug Reports using Domain Knowledge
Learning to Rank Relevant Files for Bug Reports using Domain Knowledge
Xin Ye
 
Sentiment Analysis using Twitter Data
Sentiment Analysis using Twitter DataSentiment Analysis using Twitter Data
Sentiment Analysis using Twitter Data
Hari Prasad
 
sentimentanaly 2.pdf
sentimentanaly 2.pdfsentimentanaly 2.pdf
sentimentanaly 2.pdf
visheshs4
 
IRJET - Automated Essay Grading System using Deep Learning
IRJET -  	  Automated Essay Grading System using Deep LearningIRJET -  	  Automated Essay Grading System using Deep Learning
IRJET - Automated Essay Grading System using Deep Learning
IRJET Journal
 
Word Tagging using Max Entropy Model and Feature selection
Word Tagging using Max Entropy Model and Feature selection Word Tagging using Max Entropy Model and Feature selection
Word Tagging using Max Entropy Model and Feature selection
Yomna Mahmoud Ibrahim Hassan
 
Creation of a Test Bed Environment for Core Java Applications using White Box...
Creation of a Test Bed Environment for Core Java Applications using White Box...Creation of a Test Bed Environment for Core Java Applications using White Box...
Creation of a Test Bed Environment for Core Java Applications using White Box...
cscpconf
 
Aaai 1
Aaai 1Aaai 1
Ire major project
Ire major projectIre major project
Ire major project
Abhishek Mungoli
 
ShwetaKBijay-resume
ShwetaKBijay-resumeShwetaKBijay-resume
ShwetaKBijay-resume
Kumari Shweta Bijay
 
Spam detection using machine learning based binary classifier_043660
Spam detection using machine learning based binary classifier_043660Spam detection using machine learning based binary classifier_043660
Spam detection using machine learning based binary classifier_043660
syaidatulamirah
 
03 test specification and execution
03   test specification and execution03   test specification and execution
03 test specification and execution
Clemens Reijnen
 
Embedded systems
Embedded systemsEmbedded systems
Embedded systems
boopathy Prabhaharan
 
Data Driven Framework in Selenium
Data Driven Framework in SeleniumData Driven Framework in Selenium
Data Driven Framework in Selenium
Knoldus Inc.
 
Document Summarizer
Document SummarizerDocument Summarizer
Document Summarizer
Aditya Lunawat
 
Understanding Eclipse Plug-in Test Suites @ The Eclipse Testing Day 2011
Understanding Eclipse Plug-in Test Suites @ The Eclipse Testing Day 2011Understanding Eclipse Plug-in Test Suites @ The Eclipse Testing Day 2011
Understanding Eclipse Plug-in Test Suites @ The Eclipse Testing Day 2011
Michaela Greiler
 

Similar to 19-14-Sentiment Analysis On Twitter (20)

SubTopic Detection of Tweets Related to an Entity
SubTopic Detection of Tweets Related to an EntitySubTopic Detection of Tweets Related to an Entity
SubTopic Detection of Tweets Related to an Entity
 
Svm and maximum entropy model for sentiment analysis of tweets
Svm and maximum entropy model for sentiment analysis of tweetsSvm and maximum entropy model for sentiment analysis of tweets
Svm and maximum entropy model for sentiment analysis of tweets
 
Cucumber jvm best practices v3
Cucumber jvm best practices v3Cucumber jvm best practices v3
Cucumber jvm best practices v3
 
Twitter Sentiment Analysis
Twitter Sentiment AnalysisTwitter Sentiment Analysis
Twitter Sentiment Analysis
 
Data Driven Testing
Data Driven TestingData Driven Testing
Data Driven Testing
 
Learning to Rank Relevant Files for Bug Reports using Domain Knowledge
Learning to Rank Relevant Files for Bug Reports using Domain KnowledgeLearning to Rank Relevant Files for Bug Reports using Domain Knowledge
Learning to Rank Relevant Files for Bug Reports using Domain Knowledge
 
Sentiment Analysis using Twitter Data
Sentiment Analysis using Twitter DataSentiment Analysis using Twitter Data
Sentiment Analysis using Twitter Data
 
sentimentanaly 2.pdf
sentimentanaly 2.pdfsentimentanaly 2.pdf
sentimentanaly 2.pdf
 
IRJET - Automated Essay Grading System using Deep Learning
IRJET -  	  Automated Essay Grading System using Deep LearningIRJET -  	  Automated Essay Grading System using Deep Learning
IRJET - Automated Essay Grading System using Deep Learning
 
Word Tagging using Max Entropy Model and Feature selection
Word Tagging using Max Entropy Model and Feature selection Word Tagging using Max Entropy Model and Feature selection
Word Tagging using Max Entropy Model and Feature selection
 
Creation of a Test Bed Environment for Core Java Applications using White Box...
Creation of a Test Bed Environment for Core Java Applications using White Box...Creation of a Test Bed Environment for Core Java Applications using White Box...
Creation of a Test Bed Environment for Core Java Applications using White Box...
 
Aaai 1
Aaai 1Aaai 1
Aaai 1
 
Ire major project
Ire major projectIre major project
Ire major project
 
ShwetaKBijay-resume
ShwetaKBijay-resumeShwetaKBijay-resume
ShwetaKBijay-resume
 
Spam detection using machine learning based binary classifier_043660
Spam detection using machine learning based binary classifier_043660Spam detection using machine learning based binary classifier_043660
Spam detection using machine learning based binary classifier_043660
 
03 test specification and execution
03   test specification and execution03   test specification and execution
03 test specification and execution
 
Embedded systems
Embedded systemsEmbedded systems
Embedded systems
 
Data Driven Framework in Selenium
Data Driven Framework in SeleniumData Driven Framework in Selenium
Data Driven Framework in Selenium
 
Document Summarizer
Document SummarizerDocument Summarizer
Document Summarizer
 
Understanding Eclipse Plug-in Test Suites @ The Eclipse Testing Day 2011
Understanding Eclipse Plug-in Test Suites @ The Eclipse Testing Day 2011Understanding Eclipse Plug-in Test Suites @ The Eclipse Testing Day 2011
Understanding Eclipse Plug-in Test Suites @ The Eclipse Testing Day 2011
 

Recently uploaded

writing about opinions about Australia the movie
writing about opinions about Australia the moviewriting about opinions about Australia the movie
writing about opinions about Australia the movie
Nicholas Montgomery
 
Mule event processing models | MuleSoft Mysore Meetup #47
Mule event processing models | MuleSoft Mysore Meetup #47Mule event processing models | MuleSoft Mysore Meetup #47
Mule event processing models | MuleSoft Mysore Meetup #47
MysoreMuleSoftMeetup
 
Beyond Degrees - Empowering the Workforce in the Context of Skills-First.pptx
Beyond Degrees - Empowering the Workforce in the Context of Skills-First.pptxBeyond Degrees - Empowering the Workforce in the Context of Skills-First.pptx
Beyond Degrees - Empowering the Workforce in the Context of Skills-First.pptx
EduSkills OECD
 
مصحف القراءات العشر أعد أحرف الخلاف سمير بسيوني.pdf
مصحف القراءات العشر   أعد أحرف الخلاف سمير بسيوني.pdfمصحف القراءات العشر   أعد أحرف الخلاف سمير بسيوني.pdf
مصحف القراءات العشر أعد أحرف الخلاف سمير بسيوني.pdf
سمير بسيوني
 
BBR 2024 Summer Sessions Interview Training
BBR  2024 Summer Sessions Interview TrainingBBR  2024 Summer Sessions Interview Training
BBR 2024 Summer Sessions Interview Training
Katrina Pritchard
 
spot a liar (Haiqa 146).pptx Technical writhing and presentation skills
spot a liar (Haiqa 146).pptx Technical writhing and presentation skillsspot a liar (Haiqa 146).pptx Technical writhing and presentation skills
spot a liar (Haiqa 146).pptx Technical writhing and presentation skills
haiqairshad
 
BÀI TẬP BỔ TRỢ TIẾNG ANH 8 CẢ NĂM - GLOBAL SUCCESS - NĂM HỌC 2023-2024 (CÓ FI...
BÀI TẬP BỔ TRỢ TIẾNG ANH 8 CẢ NĂM - GLOBAL SUCCESS - NĂM HỌC 2023-2024 (CÓ FI...BÀI TẬP BỔ TRỢ TIẾNG ANH 8 CẢ NĂM - GLOBAL SUCCESS - NĂM HỌC 2023-2024 (CÓ FI...
BÀI TẬP BỔ TRỢ TIẾNG ANH 8 CẢ NĂM - GLOBAL SUCCESS - NĂM HỌC 2023-2024 (CÓ FI...
Nguyen Thanh Tu Collection
 
Wound healing PPT
Wound healing PPTWound healing PPT
Wound healing PPT
Jyoti Chand
 
Gender and Mental Health - Counselling and Family Therapy Applications and In...
Gender and Mental Health - Counselling and Family Therapy Applications and In...Gender and Mental Health - Counselling and Family Therapy Applications and In...
Gender and Mental Health - Counselling and Family Therapy Applications and In...
PsychoTech Services
 
Temple of Asclepius in Thrace. Excavation results
Temple of Asclepius in Thrace. Excavation resultsTemple of Asclepius in Thrace. Excavation results
Temple of Asclepius in Thrace. Excavation results
Krassimira Luka
 
RESULTS OF THE EVALUATION QUESTIONNAIRE.pptx
RESULTS OF THE EVALUATION QUESTIONNAIRE.pptxRESULTS OF THE EVALUATION QUESTIONNAIRE.pptx
RESULTS OF THE EVALUATION QUESTIONNAIRE.pptx
zuzanka
 
Bonku-Babus-Friend by Sathyajith Ray (9)
Bonku-Babus-Friend by Sathyajith Ray  (9)Bonku-Babus-Friend by Sathyajith Ray  (9)
Bonku-Babus-Friend by Sathyajith Ray (9)
nitinpv4ai
 
HYPERTENSION - SLIDE SHARE PRESENTATION.
HYPERTENSION - SLIDE SHARE PRESENTATION.HYPERTENSION - SLIDE SHARE PRESENTATION.
HYPERTENSION - SLIDE SHARE PRESENTATION.
deepaannamalai16
 
Film vocab for eal 3 students: Australia the movie
Film vocab for eal 3 students: Australia the movieFilm vocab for eal 3 students: Australia the movie
Film vocab for eal 3 students: Australia the movie
Nicholas Montgomery
 
RHEOLOGY Physical pharmaceutics-II notes for B.pharm 4th sem students
RHEOLOGY Physical pharmaceutics-II notes for B.pharm 4th sem studentsRHEOLOGY Physical pharmaceutics-II notes for B.pharm 4th sem students
RHEOLOGY Physical pharmaceutics-II notes for B.pharm 4th sem students
Himanshu Rai
 
A Independência da América Espanhola LAPBOOK.pdf
A Independência da América Espanhola LAPBOOK.pdfA Independência da América Espanhola LAPBOOK.pdf
A Independência da América Espanhola LAPBOOK.pdf
Jean Carlos Nunes Paixão
 
UGC NET Exam Paper 1- Unit 1:Teaching Aptitude
UGC NET Exam Paper 1- Unit 1:Teaching AptitudeUGC NET Exam Paper 1- Unit 1:Teaching Aptitude
UGC NET Exam Paper 1- Unit 1:Teaching Aptitude
S. Raj Kumar
 
BÀI TẬP DẠY THÊM TIẾNG ANH LỚP 7 CẢ NĂM FRIENDS PLUS SÁCH CHÂN TRỜI SÁNG TẠO ...
BÀI TẬP DẠY THÊM TIẾNG ANH LỚP 7 CẢ NĂM FRIENDS PLUS SÁCH CHÂN TRỜI SÁNG TẠO ...BÀI TẬP DẠY THÊM TIẾNG ANH LỚP 7 CẢ NĂM FRIENDS PLUS SÁCH CHÂN TRỜI SÁNG TẠO ...
BÀI TẬP DẠY THÊM TIẾNG ANH LỚP 7 CẢ NĂM FRIENDS PLUS SÁCH CHÂN TRỜI SÁNG TẠO ...
Nguyen Thanh Tu Collection
 
Lifelines of National Economy chapter for Class 10 STUDY MATERIAL PDF
Lifelines of National Economy chapter for Class 10 STUDY MATERIAL PDFLifelines of National Economy chapter for Class 10 STUDY MATERIAL PDF
Lifelines of National Economy chapter for Class 10 STUDY MATERIAL PDF
Vivekanand Anglo Vedic Academy
 
How to deliver Powerpoint Presentations.pptx
How to deliver Powerpoint  Presentations.pptxHow to deliver Powerpoint  Presentations.pptx
How to deliver Powerpoint Presentations.pptx
HajraNaeem15
 

Recently uploaded (20)

writing about opinions about Australia the movie
writing about opinions about Australia the moviewriting about opinions about Australia the movie
writing about opinions about Australia the movie
 
Mule event processing models | MuleSoft Mysore Meetup #47
Mule event processing models | MuleSoft Mysore Meetup #47Mule event processing models | MuleSoft Mysore Meetup #47
Mule event processing models | MuleSoft Mysore Meetup #47
 
Beyond Degrees - Empowering the Workforce in the Context of Skills-First.pptx
Beyond Degrees - Empowering the Workforce in the Context of Skills-First.pptxBeyond Degrees - Empowering the Workforce in the Context of Skills-First.pptx
Beyond Degrees - Empowering the Workforce in the Context of Skills-First.pptx
 
مصحف القراءات العشر أعد أحرف الخلاف سمير بسيوني.pdf
مصحف القراءات العشر   أعد أحرف الخلاف سمير بسيوني.pdfمصحف القراءات العشر   أعد أحرف الخلاف سمير بسيوني.pdf
مصحف القراءات العشر أعد أحرف الخلاف سمير بسيوني.pdf
 
BBR 2024 Summer Sessions Interview Training
BBR  2024 Summer Sessions Interview TrainingBBR  2024 Summer Sessions Interview Training
BBR 2024 Summer Sessions Interview Training
 
spot a liar (Haiqa 146).pptx Technical writhing and presentation skills
spot a liar (Haiqa 146).pptx Technical writhing and presentation skillsspot a liar (Haiqa 146).pptx Technical writhing and presentation skills
spot a liar (Haiqa 146).pptx Technical writhing and presentation skills
 
BÀI TẬP BỔ TRỢ TIẾNG ANH 8 CẢ NĂM - GLOBAL SUCCESS - NĂM HỌC 2023-2024 (CÓ FI...
BÀI TẬP BỔ TRỢ TIẾNG ANH 8 CẢ NĂM - GLOBAL SUCCESS - NĂM HỌC 2023-2024 (CÓ FI...BÀI TẬP BỔ TRỢ TIẾNG ANH 8 CẢ NĂM - GLOBAL SUCCESS - NĂM HỌC 2023-2024 (CÓ FI...
BÀI TẬP BỔ TRỢ TIẾNG ANH 8 CẢ NĂM - GLOBAL SUCCESS - NĂM HỌC 2023-2024 (CÓ FI...
 
Wound healing PPT
Wound healing PPTWound healing PPT
Wound healing PPT
 
Gender and Mental Health - Counselling and Family Therapy Applications and In...
Gender and Mental Health - Counselling and Family Therapy Applications and In...Gender and Mental Health - Counselling and Family Therapy Applications and In...
Gender and Mental Health - Counselling and Family Therapy Applications and In...
 
Temple of Asclepius in Thrace. Excavation results
Temple of Asclepius in Thrace. Excavation resultsTemple of Asclepius in Thrace. Excavation results
Temple of Asclepius in Thrace. Excavation results
 
RESULTS OF THE EVALUATION QUESTIONNAIRE.pptx
RESULTS OF THE EVALUATION QUESTIONNAIRE.pptxRESULTS OF THE EVALUATION QUESTIONNAIRE.pptx
RESULTS OF THE EVALUATION QUESTIONNAIRE.pptx
 
Bonku-Babus-Friend by Sathyajith Ray (9)
Bonku-Babus-Friend by Sathyajith Ray  (9)Bonku-Babus-Friend by Sathyajith Ray  (9)
Bonku-Babus-Friend by Sathyajith Ray (9)
 
HYPERTENSION - SLIDE SHARE PRESENTATION.
HYPERTENSION - SLIDE SHARE PRESENTATION.HYPERTENSION - SLIDE SHARE PRESENTATION.
HYPERTENSION - SLIDE SHARE PRESENTATION.
 
Film vocab for eal 3 students: Australia the movie
Film vocab for eal 3 students: Australia the movieFilm vocab for eal 3 students: Australia the movie
Film vocab for eal 3 students: Australia the movie
 
RHEOLOGY Physical pharmaceutics-II notes for B.pharm 4th sem students
RHEOLOGY Physical pharmaceutics-II notes for B.pharm 4th sem studentsRHEOLOGY Physical pharmaceutics-II notes for B.pharm 4th sem students
RHEOLOGY Physical pharmaceutics-II notes for B.pharm 4th sem students
 
A Independência da América Espanhola LAPBOOK.pdf
A Independência da América Espanhola LAPBOOK.pdfA Independência da América Espanhola LAPBOOK.pdf
A Independência da América Espanhola LAPBOOK.pdf
 
UGC NET Exam Paper 1- Unit 1:Teaching Aptitude
UGC NET Exam Paper 1- Unit 1:Teaching AptitudeUGC NET Exam Paper 1- Unit 1:Teaching Aptitude
UGC NET Exam Paper 1- Unit 1:Teaching Aptitude
 
BÀI TẬP DẠY THÊM TIẾNG ANH LỚP 7 CẢ NĂM FRIENDS PLUS SÁCH CHÂN TRỜI SÁNG TẠO ...
BÀI TẬP DẠY THÊM TIẾNG ANH LỚP 7 CẢ NĂM FRIENDS PLUS SÁCH CHÂN TRỜI SÁNG TẠO ...BÀI TẬP DẠY THÊM TIẾNG ANH LỚP 7 CẢ NĂM FRIENDS PLUS SÁCH CHÂN TRỜI SÁNG TẠO ...
BÀI TẬP DẠY THÊM TIẾNG ANH LỚP 7 CẢ NĂM FRIENDS PLUS SÁCH CHÂN TRỜI SÁNG TẠO ...
 
Lifelines of National Economy chapter for Class 10 STUDY MATERIAL PDF
Lifelines of National Economy chapter for Class 10 STUDY MATERIAL PDFLifelines of National Economy chapter for Class 10 STUDY MATERIAL PDF
Lifelines of National Economy chapter for Class 10 STUDY MATERIAL PDF
 
How to deliver Powerpoint Presentations.pptx
How to deliver Powerpoint  Presentations.pptxHow to deliver Powerpoint  Presentations.pptx
How to deliver Powerpoint Presentations.pptx
 

19-14-Sentiment Analysis On Twitter

  • 1. Under the guidance of Prof. Vasudeva Varma IIIT Hyderabad Submitted by: Abhishek Jain(201201137) Pradeep Anumala(201350843) Pragya Musal(201405545) Shashank S(201405599) Mentored by: Satarupa Guha
  • 2. Input: Textual content of a tweet. Output: Label signifying whether the tweet is positive, negative or neutral Problem?
  • 3. download Parser Tokenizer PreProcessor Feature Vector Builder Add Additional Features Construct feed File SVM Classifier Training.model SVM Classifier Test data Polarity of tweet (positive/negative/neutral)
  • 4. A total of 9684 train tweets and 8987 test tweets were downloaded from twitter and are fed to Parser. The parser 1. Removes the unavailable tweets 2.Segregates the tweet and polarity 3.After removing the unavailable tweets,the total no. of train tweets were 7875 and test tweets were 8011
  • 5. We used the ‘ARK tokenizer’ to tokenize the tweets The tokenizer divides each tweets into a sequence of space separated tokens and puts them into a file, which is used at a later stage for processing.
  • 6. The tokenized tweets are fed to the pre processor which : 1)Replaces the urls with | | U | | 2)Replaces @references with | | T | | 3)Replaces +ve emoticons with the word ‘epositive’ and –ve emoticons with the word ‘enegative’ 4)Replaces the words that signify negative context with the word “not”
  • 7. The preprocessed file is fed to the feature vector builder which creates the final feature vector. The basic(baseline) feature that was considered was of unigrams. A list of all unique unigrams across the training set was constructed and it formed the basic vector for each tweet.
  • 8. • The Feature Vector was enhanced by introducing more features like: • POS-Tagging • Count of emoticons, hashtags and exclamations. • Scores from standard Lexicons • Negated contexts • Elongated words (sooooo,happppppppppy)
  • 9. The formed feature vector was written into a file in a format expected by the libsvm classifier. A linear SVM Classifier was used and trained with the training file as an input and creates training.model file This model file was used on the testing file to predict the results.
  • 10. The model is tested on a set of 8011 test tweets. The following results were obtained: Accuracy : 64% (5127/8011) F-measure : 0.6163