1. Sentiment analysis of tweets using Neural Networks
Adrián Palacios
Universidad Politécnica de Valencia
June 6th, 2013
2. Introduction
The objectives of this work are:
• To use Neural Networks (via the April toolkit) for the polarity
classification of tweets.
• To check how NNs behave when applying different techniques for
preprocessing the data.
• We are not aiming for state-of-the-art results; we are just
experimenting with these techniques.
3. Preprocessing of tweets
Prior to training the NNs, we need to obtain a feature vector
representation for the samples (tweets).
4. Preprocessing techniques
To achieve this, we create a bag of words after applying one of the
following preprocessing techniques:
1. Unigrams.
2. Bigrams.
3. Stemming.
4. Lemmatization.
5. Part-of-Speech tagging.
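As an illustration of the bag-of-words step, a minimal unigram/bigram counter could look like the sketch below (an assumption about the general technique, not the exact pipeline used in the experiments):

```python
from collections import Counter

def bag_of_words(tokens, n=1):
    """Count n-gram occurrences in a token list (n=1: unigrams, n=2: bigrams)."""
    ngrams = zip(*(tokens[i:] for i in range(n)))
    return Counter(" ".join(gram) for gram in ngrams)

tweet = "i love this phone".split()
print(bag_of_words(tweet))        # unigram counts
print(bag_of_words(tweet, n=2))   # bigram counts: "i love", "love this", ...
```

Stacking these counts over a fixed vocabulary yields the feature vector fed to the NN.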
5. Stemming
Stemming: A process that chops off the suffixes of a given word
following some predefined rules.
Examples:
• Stem(run): run.
• Stem(ran): ran.
• Stem(running): run.
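The suffix-chopping idea can be sketched with a toy rule-based stemmer; a real system would use e.g. the Porter stemmer shipped with the NLTK, and the rules below are illustrative only:

```python
def simple_stem(word):
    """Toy stemmer: strip a few common English suffixes by rule."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            stem = word[: -len(suffix)]
            if len(stem) >= 2 and stem[-1] == stem[-2]:
                stem = stem[:-1]  # undo a doubled final consonant
            return stem
    return word

print(simple_stem("running"))  # run
print(simple_stem("ran"))      # ran (no rule fires; stemming misses it)
```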
6. Lemmatization
Lemmatization: A process that determines the lemma (canonical
form of the lexeme) of a given word.
Examples:
• Lemma(run): run.
• Lemma(ran): run.
• Lemma(running): run.
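Unlike stemming, lemmatization requires dictionary knowledge, so irregular forms like "ran" can be mapped back to "run". A toy sketch with a hand-made exception table (a real system would rely on Freeling or the NLTK's WordNet lemmatizer):

```python
# Tiny exception table standing in for a full morphological dictionary.
IRREGULAR = {"ran": "run", "running": "run"}

def lemma(word):
    """Return the canonical form of a word, defaulting to the word itself."""
    return IRREGULAR.get(word, word)

print(lemma("ran"))      # run
print(lemma("running"))  # run
```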
7. PoS tagging
PoS tagging: The assignment of Part-of-Speech tags to the words of a
given sentence.
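A minimal sketch of the idea, using a tiny hand-made lexicon (an illustration only; real tagging would be done with a tool such as Freeling or the NLTK). The tagged tokens can then feed the bag-of-words step:

```python
# Toy lexicon mapping words to PoS tags; unknown words default to NOUN.
LEXICON = {"i": "PRON", "love": "VERB", "this": "DET", "phone": "NOUN"}

def pos_tag(tokens):
    """Attach a PoS tag to each token, producing features like 'love/VERB'."""
    return [f"{word}/{LEXICON.get(word, 'NOUN')}" for word in tokens]

print(pos_tag("i love this phone".split()))
```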
8. Learning techniques
The polarity classification will be performed:
• using a Multilayer Perceptron with a single layer,
• trained with 5-fold cross-validation,
• and an ensemble of the resulting MLPs.
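The 5-fold setup can be sketched as a plain partition of the sample indices (an illustrative helper, not the April toolkit's actual API):

```python
def k_fold_indices(n_samples, k=5):
    """Partition sample indices into k disjoint (train, validation) splits."""
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    return [(sorted(set(range(n_samples)) - set(fold)), fold) for fold in folds]

# Each of the 5 splits trains one MLP on 4/5 of the data.
for train_idx, val_idx in k_fold_indices(10, k=5):
    print(len(train_idx), "train /", len(val_idx), "validation")
```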
9. Hyper-parameter search
We will perform a random search for hyper-parameter optimization
instead of a grid search.
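A random search simply draws each trial's hyper-parameters from chosen distributions instead of enumerating a grid. The parameter names and ranges below are hypothetical, since the actual search space is not given:

```python
import random

def sample_hyperparams(rng):
    """Draw one random hyper-parameter configuration (hypothetical ranges)."""
    return {
        "learning_rate": 10 ** rng.uniform(-4, -1),  # log-uniform in [1e-4, 1e-1]
        "hidden_units": rng.randrange(8, 257),
        "momentum": rng.uniform(0.0, 0.9),
    }

rng = random.Random(42)
trials = [sample_hyperparams(rng) for _ in range(20)]
print(trials[0])
```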
10. Ensemble methods
Since we use 5-fold cross-validation, after training is done we obtain
5 MLPs for each set of parameters.
We merge these 5 classifiers into a single one using the bootstrap
aggregating method (all votes have equal weight) for the ensemble.
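The equal-weight voting step can be sketched as a plain majority vote over the five MLPs' predictions (the polarity labels below are illustrative):

```python
from collections import Counter

def ensemble_vote(predictions):
    """Equal-weight majority vote over the per-classifier predictions."""
    return Counter(predictions).most_common(1)[0][0]

# e.g. the 5 MLPs voting on one tweet's polarity
print(ensemble_vote(["P", "P", "N", "P", "NEU"]))  # P
```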
11. Corpus
We will work with the corpus provided at the 2012 edition of the
Workshop on Sentiment Analysis at SEPLN.
          Training    Test
Samples       7219   60798
12. Training results
Accuracy of the validation set classification:
               3 levels   5 levels
Unigrams          54.44      45.62
Bigrams           54.09      39.99
Stemming          62.34      47.49
Lemmatization     61.60      46.75
PoS-tagging       52.58      38.40
13. Test results
Accuracy of the test set classification (average and ensemble):
Average:
            3 levels   5 levels
Unigrams       32.13      26.12
Bigrams        32.39      28.21
Stem.          32.34      26.81
Lemma.         31.84      26.18
PoS-tag.       35.22      35.22

Ensemble:
            3 levels   5 levels
Unigrams       32.16      26.52
Bigrams        32.32      29.32
Stem.          32.23      27.16
Lemma.         31.80      26.49
PoS-tag.       35.22      35.22
14. Conclusions
Results are poor, but they could be improved by:
• Using more sophisticated preprocessing techniques.
• Using more complex learning models.
• Exploring more values in the random hyper-parameter search.
• Learning from PoS-tagged tweets in a different way.
15. Questions?
The tools used for the experiments can be found at:
• The NLTK: nltk.org
• Freeling: nlp.lsi.upc.edu/freeling
• The April toolkit: github.com/pakozm/april-ann