TWITTER ANALYSIS
(IN RSTUDIO USING R PROGRAMMING LANGUAGE)

Prepared By:

KAIFY RAIS
in.linkedin.com/pub/kaify-rais/31/346/886/

Acknowledgement

This project was completed as the final project of the training course “Business Analytics with R”. I
am thankful to our course instructor, Mr. Ajay Ohri, Founder, DecisionStats, for giving me the
opportunity to work on the project “Twitter Analysis using R” and for providing the
support and guidance that helped me complete it on time. I am extremely grateful to him for
providing the links and material I needed to start the project and understand the concept of Twitter
Analysis using R.
In this project, “Twitter Analysis using R”, I have applied Sentiment Analysis and Text Mining
techniques to “#Kejriwal”. The project was done in RStudio, which uses the libraries of the R programming
language. I am grateful for the resourceful articles and websites of the R project, which helped me
understand the tool as well as the topic.
I would also like to extend my sincere regards to the support team of Edureka for their constant and
timely support.

Table of Contents
Introduction
Limitations
Tools and Packages used
Twitter Analysis
  Creating a Twitter Application
  Working on RStudio- Building the corpus
  Saving Tweets
  Sentiment Function
  Scoring tweets and adding column
  Import the csv file
  Visualizing the tweets
Analysis & Conclusion
Text Analysis
Final code for Twitter Analysis
Final code for Text Mining
References

Introduction
Twitter is an amazing microblogging tool and an extraordinary communication medium. In addition,
Twitter can be an open mine for text and social web analyses. Among the different
software tools that can be used to analyze Twitter, R offers a wide variety of options for doing interesting
and fun things. In this project I have used RStudio, because working with scripts is much easier there
than in the plain R console.
According to Wikipedia, Sentiment analysis (also known as opinion mining) refers to the use of natural
language processing, text analysis and computational linguistics to identify and extract subjective
information in source materials.
Sentiment analysis, also referred to as Opinion Mining, means extracting opinions, emotions and
sentiments from text. As you can imagine, one of the most common applications of sentiment analysis is to
track attitudes and feelings on the web, especially for tracking products, services, brands or even people.
The main idea is to determine whether they are viewed positively or negatively by a given audience.
The purpose of Text Mining is to process unstructured (textual) information, extract meaningful numeric
indices from the text, and, thus, make the information contained in the text accessible to the various
data mining algorithms. Information can be extracted to derive summaries for the words contained in
the documents or to compute summaries for the documents based on the words contained in them.
Hence, you can analyze words, clusters of words used in documents, or you could analyze documents
and determine similarities between them or how they are related to other variables of interest in the
data mining project. In the most general terms, text mining turns “text into numbers” which can then be
incorporated in other analyses.
Applications of Text Mining include analyzing open-ended survey responses; automatic processing of
messages, emails, etc.; analyzing warranty or insurance claims, diagnostic interviews, etc.; and investigating
competitors by crawling their web sites.

Limitations
There are certain limitations when doing Twitter Analysis using R. Firstly, when getting the statuses of a user
timeline, the method can only return a fixed maximum number of tweets, which is limited by the Twitter
API.
Secondly, when requesting tweets for a particular keyword, it sometimes happens that the number of
retrieved tweets is smaller than the number of requested tweets.
Thirdly, when requesting tweets for a particular keyword, older tweets cannot be retrieved.

Tools and Packages used
In this project “Twitter Analysis using R” I have used the RStudio GUI and the following packages:
twitteR : Provides an interface to the Twitter web API.
ROAuth : Provides an interface to the OAuth 1.0 specification, allowing users
to authenticate via OAuth to the server of their choice.
plyr : A set of tools that solves a common set of problems: you need to break
a big problem down into manageable pieces, operate on each piece and then put all the
pieces back together.
stringr : A set of simple wrappers that make R's string functions more consistent,
simpler and easier to use. It does this by ensuring that function and argument names (and
positions) are consistent, all functions deal with NAs and zero-length character vectors
appropriately, and the output data structures from each function match the input data
structures of other functions.
ggplot2 : An implementation of the grammar of graphics in R. It combines the advantages of
both base and lattice graphics: conditioning and shared axes are handled automatically, and
you can still build up a plot step by step from multiple data sources.
RColorBrewer : Provides palettes for drawing nice maps shaded according to
a variable.
tm : A framework for text mining applications within R.
wordcloud : Helps in creating attractive word clouds in Text Mining.

Twitter Analysis:
Creating a Twitter Application
The first step in performing Twitter Analysis is to create a Twitter application. This application allows
you to perform the analysis by connecting your R console to Twitter through the Twitter API. The steps
for creating your Twitter application are:
Go to https://dev.twitter.com and log in using your Twitter account.
Then go to My Applications → Create a new application.

Give your application a name, describe it in a few words, and provide your
website’s URL or your blog address (in case you don’t have a website). Leave the Callback
URL blank for now. Complete the other formalities and create your Twitter application. Once all
the steps are done, the created application is displayed. Note the Consumer
key and Consumer secret, as they will be used in RStudio later.

This step is now done. Next, I will work in RStudio.

Working on RStudio- Building the corpus
In this section, I will first load some packages in R. These are twitteR, ROAuth, plyr, stringr and
ggplot2. You can install these packages with the following commands:
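
A minimal sketch of the installation step, assuming the packages are available from your configured repository and are installed with base R's install.packages(). The variable names pkgs and missing_pkgs are illustrative, not taken from the original script.

```r
# Packages named in this section (twitteR is spelled with a capital R).
pkgs <- c("twitteR", "ROAuth", "plyr", "stringr", "ggplot2")

# Find the packages that are not yet installed on this machine.
missing_pkgs <- setdiff(pkgs, rownames(installed.packages()))

# Install only what is missing. The interactive() guard is an assumption of
# this sketch: it keeps the script from downloading anything when it is
# run non-interactively.
if (length(missing_pkgs) > 0 && interactive()) {
  install.packages(missing_pkgs)
}

# After installation, each package is loaded with library(), e.g. library(plyr).
```

Checking for missing packages first avoids reinstalling packages that are already present.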

Now run the following R script code snippet

After running this script section, the console will look like this

Now, Windows users need to download a small file with the following command:

After running this code, RStudio will look for the file at the given URL and download it
for you. Your console will look like this:

Once this file is downloaded, we move on to accessing the Twitter API. This step
includes the script code that performs the handshake using the Consumer Key and Consumer Secret
of your own application. You have to replace these entries with the keys from your
application. Following is the code you have to run to perform the handshake.
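
A minimal sketch of such a handshake, not the original script. It assumes the standard Twitter OAuth 1.0 endpoints and ROAuth's OAuthFactory class. The helper name twitter_handshake and the placeholder keys are hypothetical. registerTwitterOAuth() belongs to the twitteR releases of that period; later releases replaced it with setup_twitter_oauth(), and the Twitter API itself has changed since, so treat this as an illustration of the flow only.

```r
# Twitter OAuth 1.0 endpoints, held in the variables the text calls
# requestURL, accessURL and authURL.
requestURL <- "https://api.twitter.com/oauth/request_token"
accessURL  <- "https://api.twitter.com/oauth/access_token"
authURL    <- "https://api.twitter.com/oauth/authorize"

# Placeholders: replace with the keys of your own application.
consumerKey    <- "YOUR_CONSUMER_KEY"
consumerSecret <- "YOUR_CONSUMER_SECRET"

# Hypothetical helper wrapping the handshake. It is only defined here,
# so nothing contacts Twitter until you call it yourself.
twitter_handshake <- function(cainfo = "cacert.pem") {
  cred <- ROAuth::OAuthFactory$new(
    consumerKey    = consumerKey,
    consumerSecret = consumerSecret,
    requestURL     = requestURL,
    accessURL      = accessURL,
    authURL        = authURL
  )
  # Prints an authorization URL and waits for the code shown in the browser.
  cred$handshake(cainfo = cainfo)
  # Registers the credentials with twitteR (older twitteR releases only).
  twitteR::registerTwitterOAuth(cred)
  invisible(cred)
}
```

Calling twitter_handshake() in an interactive session would start the browser-based authorization described in the following steps.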

Here, lines 12, 13 and 14 assign the request URL, access URL and authorization URL of the Twitter
application to the variables requestURL, accessURL and authURL respectively. consumerKey and
consumerSecret are unique to a Twitter application. Running this gives the following message on the R
console:

The last three lines of the console are a message to the user: To enable the connection, please direct
your web browser to:
http://api.twitter.com/oauth/authorize?oauth_token=dHwEGXdxbjJ093sG0tVjYVT0NQrkjU3DuCxcC1YQyc

After opening the above link in your browser, authorize the application by providing your username and
password. Once the app is authorized, you will receive a code like this:

Enter this code in the console. The console will give a message like this:

Now register the handshake with the following command:

The console will print TRUE, which means that the handshake is complete. Now we
can get the tweets from the Twitter timeline.

Saving Tweets
Once the handshake is done and authorized by Twitter, we can fetch the most recent tweets related to
any keyword. I have used #Kejriwal, as Mr. Arvind Kejriwal is the most talked-about person in Delhi
these days.
The code for getting tweets related to #Kejriwal is:

This command requests 1000 tweets related to Kejriwal. The function “searchTwitter” searches
Twitter for tweets matching the given search string. We then need to convert the returned list of tweets into a data
frame so that we can work on it, and finally we write the data frame to a .csv file.
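
The fetch-and-save step can be sketched as follows. searchTwitter() and twListToDF() are twitteR functions. The helper names fetch_tweets and save_tweets and the toy data frame are hypothetical. The fetch helper needs a completed handshake, so it is only defined here and not called.

```r
# Hypothetical helper: fetch tweets for a search string and return a data frame.
# Requires a completed OAuth handshake; defined but not called in this sketch.
fetch_tweets <- function(query = "#Kejriwal", n = 1000) {
  tweets <- twitteR::searchTwitter(query, n = n)  # list of status objects
  twitteR::twListToDF(tweets)                     # one row per tweet
}

# Hypothetical helper: write the tweet data frame to a .csv file.
save_tweets <- function(tweets_df, path = "Kejriwal.csv") {
  write.csv(tweets_df, file = path, row.names = FALSE)
  invisible(path)
}

# Toy stand-in for the real download, so the saving step can be tried offline.
toy_tweets <- data.frame(
  text = c("first example tweet", "second example tweet"),
  screenName = c("user_a", "user_b"),
  stringsAsFactors = FALSE
)
csv_path <- save_tweets(toy_tweets, file.path(tempdir(), "Kejriwal.csv"))
```

With a live connection, save_tweets(fetch_tweets()) would write Kejriwal.csv to the working directory.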

The Kejriwal.csv file contains all the information about the tweets. A snapshot of the csv file is given
below:

Sentiment Function
Once we have the tweets, we need to apply some functions to convert them into
useful information. The working principle of sentiment analysis is to find the words in the
tweets that represent positive sentiments and the words that represent negative
sentiments. For this we need lists of positive and negative sentiment words. I
found the lists through Google; they are easily available.
After downloading the lists, save them in your working directory. The sentiment function uses two
packages, plyr and stringr, to manipulate strings. The function is

The sentiment function calculates a score for each individual tweet. It first calculates the positive score
by comparing the words of the tweet with the positive words list, and then calculates the negative score by comparing
them with the negative words list. The final score is calculated as

score = positive score – negative score
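
A minimal sketch of such a scoring function. It is written in base R rather than with plyr and stringr, so it is a stand-in for the original function, not a copy of it. The name score_sentiment and the tiny word lists are illustrative assumptions.

```r
# Score each sentence as (number of positive words) - (number of negative words).
score_sentiment <- function(sentences, pos_words, neg_words) {
  vapply(sentences, function(sentence) {
    # Drop non-ASCII characters (e.g. emoji) that can break tolower().
    sentence <- iconv(sentence, to = "ASCII", sub = "")
    # Strip punctuation, control characters and digits, then lower-case.
    sentence <- gsub("[[:punct:]]", "", sentence)
    sentence <- gsub("[[:cntrl:]]", "", sentence)
    sentence <- gsub("[[:digit:]]+", "", sentence)
    sentence <- tolower(sentence)

    # Split into words on runs of whitespace.
    words <- unlist(strsplit(sentence, "[[:space:]]+"))

    # Count the words that appear in each list.
    pos_matches <- sum(words %in% pos_words)
    neg_matches <- sum(words %in% neg_words)
    as.integer(pos_matches - neg_matches)
  }, integer(1), USE.NAMES = FALSE)
}

# Tiny illustrative word lists; the real lists are the downloaded files.
pos_words <- c("good", "great", "honest")
neg_words <- c("bad", "chaos", "corrupt")

scores <- score_sentiment(
  c("Great speech, honest man!", "Total chaos today", "Power cut again"),
  pos_words, neg_words
)
print(scores)  # 2 -1 0
```

The first sentence matches two positive words, the second one negative word, and the third matches neither list, which gives a neutral score of zero.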

Scoring tweets and adding column
In this step we score the tweets using the sentiment function above.

The console gives the following output

Import the csv file

When we import this csv file, a dataset is created in the working directory. The next step is to score
the tweets and create a separate csv file that contains the score of each tweet.
This can be done as follows:

The snapshot of the score file shows the score of each tweet as an integer in front of every tweet.

And this is the status shown on console

Visualizing the tweets
Now that the scoring is done, we can create histograms and other plots to visualize the
sentiments of the users. This can be done using the hist function. I have used the RColorBrewer package
to play with colors. The code for creating the histogram is
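
A minimal sketch of such a histogram, using toy scores in place of the score column from the csv file. The palette name "RdYlGn", the titles and the base R fallback colours are assumptions of this sketch, not the original settings.

```r
# Toy integer scores standing in for the score column of the csv file.
scores <- c(-3, -2, -2, -1, -1, -1, 0, 0, 0, 0, 1, 1, 2)

# One bar per integer score: breaks sit halfway between integers.
breaks <- seq(min(scores) - 0.5, max(scores) + 0.5, by = 1)
n_bars <- length(breaks) - 1

# Use an RColorBrewer palette when the package is installed (it supports
# 3 to 11 colours for "RdYlGn"); otherwise fall back to a base R colour ramp.
if (requireNamespace("RColorBrewer", quietly = TRUE) &&
    n_bars >= 3 && n_bars <= 11) {
  bar_cols <- RColorBrewer::brewer.pal(n_bars, "RdYlGn")
} else {
  bar_cols <- colorRampPalette(c("red", "yellow", "green"))(n_bars)
}

# hist() draws the plot and also returns the bin counts.
h <- hist(scores, breaks = breaks, col = bar_cols,
          main = "Sentiment scores of tweets",
          xlab = "Score", ylab = "Number of tweets")
```

Placing the breaks at half-integers gives each integer score its own bar, so the height of the bar at 0 is the number of neutral tweets.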

Analysis & Conclusion
The above histogram shows the frequency of tweets with respect to the score allotted to each tweet.
The x-axis shows the score of each tweet as a negative or positive integer, or zero. A positive score
represents positive or good sentiments associated with that particular tweet, whereas a negative
score represents negative or bad sentiments associated with that tweet. A score of zero indicates a
neutral sentiment. The more positive the score, the more positive the sentiments of the person
tweeting, and vice versa.
The above histogram is skewed towards negative scores, which suggests that the sentiments of people
regarding Mr. Kejriwal are negative. This is plausible because some of his recent schemes have
backfired; for example, the Janta Darbar was scrapped because of the chaos it created.

This can also be shown using a quick plot with the following commands:

Out of the 1000 tweets fetched from Twitter, the largest group (about 400) is neutral,
whereas around 225 have negative sentiments. Fewer than 100 tweets have positive
sentiments, and the overall score is negative, as can be seen from the plot.
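
Counts of this kind can be tallied directly from the scores. The sketch below uses toy scores, not the project data, and takes the "overall score" to be the sum of all tweet scores, which is an assumption.

```r
# Toy scores standing in for the real score column.
scores <- c(-2, -1, -1, 0, 0, 0, 0, 1, 2, -3)

# Map each score to a sentiment class via its sign (-1, 0 or 1).
sentiment <- factor(sign(scores), levels = c(-1, 0, 1),
                    labels = c("negative", "neutral", "positive"))

# Number of tweets per class, and the summed score over all tweets.
counts <- table(sentiment)
overall <- sum(scores)

print(counts)
print(overall)  # -4
```

Here four tweets are negative, four are neutral and two are positive, and the summed score is negative.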
Text Analysis
Now that we have created a csv file containing the #Kejriwal tweets, we can do text mining on
them. The procedure uses the wordcloud and tm packages. The code for text
mining is:

The console output will be a collection of all the tweets saved in that file.
Now, we apply some text mining functions to refine the text and filter it according to our
needs.
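
The kind of refining and filtering involved can be sketched in base R as below. This stands in for the tm transformations and is not the original code. The example tweets, the short stop-word list and the variable names are illustrative assumptions. The word cloud is drawn only when the wordcloud package is installed.

```r
# Toy tweets standing in for the text column of the csv file.
tweets <- c(
  "Kejriwal promises no power cut in Delhi http://example.com/a",
  "Power cut again!! #Kejriwal",
  "Modi vs Kejriwal: who wins the voting?"
)

# Small illustrative stop-word list; tm ships a much fuller one.
stop_words <- c("the", "in", "no", "vs", "who", "again")

clean <- tolower(tweets)
clean <- gsub("http[^[:space:]]*", " ", clean)  # drop URLs
clean <- gsub("[[:punct:]]", " ", clean)        # drop punctuation
clean <- gsub("[[:digit:]]+", " ", clean)       # drop numbers

# Split into words, then remove empty strings and stop words.
words <- unlist(strsplit(clean, "[[:space:]]+"))
words <- words[nzchar(words) & !(words %in% stop_words)]

# Term frequencies, most frequent first.
freq <- sort(table(words), decreasing = TRUE)
print(head(freq, 3))

# Draw the cloud only if the wordcloud package is available.
if (requireNamespace("wordcloud", quietly = TRUE)) {
  wordcloud::wordcloud(words = names(freq), freq = as.integer(freq),
                       min.freq = 1)
}
```

In this toy input "kejriwal" occurs three times and "power" and "cut" twice each, so those terms would be drawn largest in the cloud.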

The final wordcloud obtained is as follows:

According to this wordcloud, Kejriwal is the most used term in the tweets, followed
by power, cut and modi, which shows that people tweeting about Kejriwal also connect the
term Kejriwal with words like voting, power cuts and Modi.

Final code for Twitter Analysis

Final code for Text Mining

References
http://www.google.com
http://www.wikipedia.com
http://txcdk.unt.edu/iralab/sentiment_analysis
https://sites.google.com/site/miningtwitter/questions/sentiment/analysis

 
Call Girls in Govindpuri Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Govindpuri Delhi 💯Call Us 🔝8264348440🔝Call Girls in Govindpuri Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Govindpuri Delhi 💯Call Us 🔝8264348440🔝
 
办理西悉尼大学毕业证成绩单、制作假文凭
办理西悉尼大学毕业证成绩单、制作假文凭办理西悉尼大学毕业证成绩单、制作假文凭
办理西悉尼大学毕业证成绩单、制作假文凭
 
E J Waggoner against Kellogg's Pantheism 8.pptx
E J Waggoner against Kellogg's Pantheism 8.pptxE J Waggoner against Kellogg's Pantheism 8.pptx
E J Waggoner against Kellogg's Pantheism 8.pptx
 
Model Call Girl in Lado Sarai Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Lado Sarai Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Lado Sarai Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Lado Sarai Delhi reach out to us at 🔝9953056974🔝
 
Call Girls In Karkardooma 83770 87607 Just-Dial Escorts Service 24X7 Avilable
Call Girls In Karkardooma 83770 87607 Just-Dial Escorts Service 24X7 AvilableCall Girls In Karkardooma 83770 87607 Just-Dial Escorts Service 24X7 Avilable
Call Girls In Karkardooma 83770 87607 Just-Dial Escorts Service 24X7 Avilable
 
南新罕布什尔大学毕业证学位证成绩单-学历认证
南新罕布什尔大学毕业证学位证成绩单-学历认证南新罕布什尔大学毕业证学位证成绩单-学历认证
南新罕布什尔大学毕业证学位证成绩单-学历认证
 

Twitter Analysis Using R

  • 1. TWITTER ANALYSIS (IN RSTUDIO USING R PROGRAMMING LANGUAGE) Prepared By: KAIFY RAIS in.linkedin.com/pub/kaify-rais/31/346/886/
  • 2. Acknowledgement This project was done as the final project of the training course titled “Business Analytics with R”. I am really thankful to our course instructor Mr. Ajay Ohri, Founder, DecisionStats, for giving me the opportunity to work on the project “Twitter Analysis using R” and for providing the support and guidance that allowed me to complete the project on time. I am extremely grateful to him for providing the links and material I needed to start the project and understand the concept of Twitter Analysis using R. In this project I have applied Sentiment Analysis and Text Mining techniques to “#Kejriwal”. The project is done in RStudio, which uses the libraries of the R programming language. I am really grateful to the resourceful articles and websites of R-project, which helped me understand the tool as well as the topic. I would also like to extend my sincere regards to the support team of Edureka for their constant and timely support.
  • 3. Table of Contents: Introduction (4); Limitations (4); Tools and Packages used (5); Twitter Analysis (6): Creating a Twitter Application (6), Working on RStudio - Building the corpus (8), Saving Tweets (11), Sentiment Function (12), Scoring tweets and adding column (13), Import the csv file (14), Visualizing the tweets (15); Analysis & Conclusion (16); Text Analysis (17); Final code for Twitter Analysis (19); Final code for Text Mining (20); References (21)
  • 4. Introduction Twitter is an amazing micro-blogging tool and an extraordinary communication medium. In addition, Twitter can also be an amazing open mine for text and social web analyses. Among the different software packages that can be used to analyze Twitter, R offers a wide variety of options for doing lots of interesting and fun things. In this project I have used RStudio, as working with scripts is much easier in RStudio than in the plain R console. According to Wikipedia, sentiment analysis (also known as opinion mining) refers to the use of natural language processing, text analysis and computational linguistics to identify and extract subjective information in source materials. In other words, it extracts opinions, emotions and sentiments from text. As you can imagine, one of the most common applications of sentiment analysis is to track attitudes and feelings on the web, especially for tracking products, services, brands or even people. The main idea is to determine whether they are viewed positively or negatively by a given audience. The purpose of Text Mining is to process unstructured (textual) information, extract meaningful numeric indices from the text, and thus make the information contained in the text accessible to the various data mining algorithms. Information can be extracted to derive summaries for the words contained in the documents or to compute summaries for the documents based on the words contained in them. Hence, you can analyze words and clusters of words used in documents, or you can analyze documents and determine similarities between them or how they are related to other variables of interest in the data mining project. In the most general terms, text mining turns “text into numbers”, which can then be incorporated in other analyses.
Applications of text mining include analyzing open-ended survey responses; automatic processing of messages, emails, etc.; analyzing warranty or insurance claims, diagnostic interviews, etc.; and investigating competitors by crawling their web sites. Limitations There are certain limitations when doing Twitter Analysis using R. Firstly, when getting the statuses of a user timeline, the method can only return a fixed maximum number of tweets, which is limited by the Twitter API. Secondly, when requesting tweets for a particular keyword, it sometimes happens that the number of retrieved tweets is less than the number of requested tweets. Thirdly, when requesting tweets for a particular keyword, older tweets cannot be retrieved.
  • 5. Tools and Packages used In this project “Twitter Analysis using R” I have used the RStudio GUI and the following packages. twitteR: provides an interface to the Twitter web API. ROAuth: provides an interface to the OAuth 1.0 specification, allowing users to authenticate via OAuth to the server of their choice. plyr: a set of tools that solves a common set of problems: you need to break a big problem down into manageable pieces, operate on each piece and then put all the pieces back together. stringr: a set of simple wrappers that make R's string functions more consistent, simpler and easier to use. It does this by ensuring that function and argument names (and positions) are consistent, that all functions deal with NAs and zero-length characters appropriately, and that the output data structures from each function match the input data structures of other functions. ggplot2: an implementation of the grammar of graphics in R. It combines the advantages of both base and lattice graphics: conditioning and shared axes are handled automatically, and you can still build up a plot step by step from multiple data sources. RColorBrewer: provides palettes for drawing nice maps shaded according to a variable. tm: a framework for text mining applications within R. wordcloud: helps in creating pretty-looking word clouds in Text Mining.
  • 6. Twitter Analysis: Creating a Twitter Application The first step in performing Twitter Analysis is to create a Twitter application. This application will allow you to perform the analysis by connecting your R console to Twitter using the Twitter API. The steps for creating your Twitter application are: Go to https://dev.twitter.com and log in using your Twitter account. Then go to My Applications > Create a new application. Give your application a name, describe your application in a few words, and provide your website’s URL or your blog address (in case you don’t have a website). Leave the Callback URL blank for now. Complete the other formalities and create your Twitter application. Once all the steps are done, the created application will show as below. Please note the Consumer Key and Consumer Secret, as they will be used in RStudio later.
  • 7. This step is done. Next, I will work in RStudio.
  • 8. Working on RStudio - Building the corpus In this section, I will first load some packages in R. These are twitteR, ROAuth, plyr, stringr and ggplot2. You can install these packages with the following commands: Now run the following R script code snippet: After running this script section, the console will look like this: Now, Windows users need to download a small file with the following command: After running this code, RStudio will look for the file at the given URL and download it for you. Your console will look like this:
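The installation step can be sketched as follows. This is a minimal illustration rather than the original script from the slides: the helper names `missing_packages` and `install_missing` are hypothetical, and the sketch assumes a configured CRAN mirror and a network connection when the install is actually run.

```r
# Packages named in the text above.
required_packages <- c("twitteR", "ROAuth", "plyr", "stringr", "ggplot2")

# Return the subset of `packages` that is not installed locally.
# (Hypothetical helper, introduced only for this sketch.)
missing_packages <- function(packages) {
  setdiff(packages, rownames(installed.packages()))
}

# Install only what is missing; needs a network connection and a CRAN mirror.
# (Hypothetical helper, introduced only for this sketch.)
install_missing <- function(packages = required_packages) {
  to_install <- missing_packages(packages)
  if (length(to_install) > 0) {
    install.packages(to_install)
  }
  invisible(to_install)
}

# Report what would be installed, without installing anything yet.
print(missing_packages(required_packages))
```

Calling `install_missing()` performs the installation; afterwards each package is attached with `library()`, for example `library(twitteR)`.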
  • 9. Once this file is downloaded, we move on to accessing the Twitter API. This step includes the script code to perform the handshake using the Consumer Key and Consumer Secret of your own application. You have to replace these entries with the keys from your application. Following is the code you have to run to perform the handshake. Here, lines 12, 13 and 14 assign the request URL, access URL and authorization URL of the Twitter application to the variables requestURL, accessURL and authURL respectively. consumerKey and consumerSecret are unique to a Twitter application. Running this gives the following message on the R console: The last three lines of the console are a message to the user. To enable the connection, please direct your web browser to: http://api.twitter.com/oauth/authorize?oauth_token=dHwEGXdxbjJ093sG0tVjYVT0NQrkjU3DuCxcC1YQyc
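A hedged sketch of the handshake setup described above. The variable names requestURL, accessURL, authURL, consumerKey and consumerSecret come from the text; the request and access endpoint paths, the placeholder key strings and the wrapper function `make_credentials` are assumptions made for illustration, not a copy of the original script. The handshake itself is interactive and needs the ROAuth package, so it is wrapped in a function that is not called here.

```r
# OAuth 1.0 endpoints. The authorize URL matches the one printed on the
# console in the text; the other two paths are assumed.
requestURL <- "https://api.twitter.com/oauth/request_token"
accessURL  <- "https://api.twitter.com/oauth/access_token"
authURL    <- "https://api.twitter.com/oauth/authorize"

# Placeholders: replace with the keys of your own Twitter application.
consumerKey    <- "YOUR_CONSUMER_KEY"
consumerSecret <- "YOUR_CONSUMER_SECRET"

# Hypothetical wrapper: builds the credential object and runs the handshake.
# The handshake prints a URL to open in the browser and waits for the PIN.
make_credentials <- function() {
  if (!requireNamespace("ROAuth", quietly = TRUE)) {
    stop("The ROAuth package is required for the handshake")
  }
  cred <- ROAuth::OAuthFactory$new(
    consumerKey    = consumerKey,
    consumerSecret = consumerSecret,
    requestURL     = requestURL,
    accessURL      = accessURL,
    authURL        = authURL
  )
  cred$handshake()
  cred
}
```

Keeping the keys in variables at the top makes it easy to swap in your own application's values without touching the rest of the script.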
  • 10. After opening the above link in your browser, authorize the application by providing your username and password, and the app will be authorized. You will receive a code like this: Enter this code in the console. The console will give a message like this: Now register the handshake with the following command: The console will give a message with TRUE, which means that the handshake is complete. Now we can get the tweets from the Twitter timeline.
  • 11. Saving Tweets Once the handshake is done and authorized by Twitter, we can fetch the most recent tweets related to any keyword. I have used #Kejriwal, as Mr. Arvind Kejriwal is the most talked-about person in Delhi these days. The code for getting tweets related to #Kejriwal is: This command will get 1000 tweets related to Kejriwal. The function “searchTwitter” is used to download tweets from the timeline. Now we need to convert this list of 1000 tweets into a data frame so that we can work on it. Then, finally, we convert the data frame into a .csv file. The Kejriwal.csv file contains all the information about the tweets. A snapshot of the csv file is given below:
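The fetch, convert and save steps can be sketched as below. `searchTwitter` is named in the text and `twListToDF` is the twitteR function for turning a list of tweets into a data frame; the wrapper names `fetch_tweets` and `save_tweets` and the toy data frame are assumptions for illustration. Fetching requires an authenticated session, so only the CSV round trip is exercised here, on made-up rows.

```r
# Hypothetical wrapper: search for a keyword and return a data frame.
# Needs the twitteR package and a completed OAuth handshake.
fetch_tweets <- function(keyword = "#Kejriwal", n = 1000) {
  if (!requireNamespace("twitteR", quietly = TRUE)) {
    stop("The twitteR package is required to fetch tweets")
  }
  tweets <- twitteR::searchTwitter(keyword, n = n)
  twitteR::twListToDF(tweets)
}

# Hypothetical wrapper: write the tweets data frame to a CSV file.
save_tweets <- function(tweets_df, path) {
  write.csv(tweets_df, file = path, row.names = FALSE)
  invisible(path)
}

# Toy stand-in for real tweets (made-up rows, illustrative columns only).
toy_tweets <- data.frame(
  text = c("first example tweet", "second example tweet"),
  retweetCount = c(3L, 0L),
  stringsAsFactors = FALSE
)

csv_path <- save_tweets(toy_tweets, file.path(tempdir(), "Kejriwal.csv"))
reloaded <- read.csv(csv_path, stringsAsFactors = FALSE)
print(reloaded)
```

With a live session, `save_tweets(fetch_tweets(), "Kejriwal.csv")` would write the file into the working directory.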
  • 12. Sentiment Function Once we have the tweets, we need to apply some functions to convert them into useful information. The working principle of sentiment analysis here is to find the words in each tweet that represent positive sentiment and the words that represent negative sentiment. For this we need a list of positive sentiment words and a list of negative sentiment words. I downloaded such a list after a Google search; it is easily available. After downloading the list, save it in your working directory. The sentiment function uses two packages, plyr and stringr, to manipulate strings. The function is: The sentiment function calculates a score for each individual tweet. It first calculates the positive score by comparing the words of the tweet with the positive words list, and then calculates the negative score by comparing the words with the negative words list. The final score is calculated as score = positive score – negative score.
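The scoring rule above (score = positive score – negative score) can be illustrated with a small base-R sketch. This is not the original function, which relies on plyr and stringr; it is a simplified stand-in, and the function name `score_sentiment`, the tiny word lists and the example tweets are all made up for illustration.

```r
# Score each sentence as (number of positive words) - (number of negative words).
# Base-R stand-in for the plyr/stringr version described in the text.
score_sentiment <- function(sentences, pos_words, neg_words) {
  vapply(sentences, function(sentence) {
    # Strip punctuation, control characters and digits, then lower-case.
    cleaned <- gsub("[[:punct:]]", "", sentence)
    cleaned <- gsub("[[:cntrl:]]", "", cleaned)
    cleaned <- gsub("[[:digit:]]+", "", cleaned)
    cleaned <- tolower(cleaned)

    # Split into words and count matches against each list.
    words <- unlist(strsplit(cleaned, "\\s+"))
    pos_matches <- sum(words %in% pos_words)
    neg_matches <- sum(words %in% neg_words)

    pos_matches - neg_matches
  }, integer(1), USE.NAMES = FALSE)
}

# Tiny illustrative word lists; a real analysis would load full lexicons.
pos_words <- c("good", "great", "win")
neg_words <- c("bad", "chaos", "fail")

example_tweets <- c(
  "Great win for Delhi!",
  "Power cut chaos, bad bad day",
  "Meeting at 5pm"
)

scores <- score_sentiment(example_tweets, pos_words, neg_words)
print(scores)  # 2 -3 0
```

Each occurrence of a listed word counts, so the repeated "bad" in the second tweet contributes twice to its negative score.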
  • 13. Scoring tweets and adding column In this step we score the tweets using the above sentiment function. The console gives the following output:
  • 14. Import the csv file When we import this csv file, a dataset is created in the working directory. The next step is to score the tweets; this can be done by creating a separate csv file which contains the score of each tweet, as follows: The snapshot of the score file shows the score of each tweet as an integer in front of every tweet. And this is the status shown on the console:
  • 15. Visualizing the tweets Now that all the work is done, we can create histograms and other plots to visualize the sentiments of the users. This can be done using the hist function. I have used the RColorBrewer package to play with colors. The code for creating the histogram is:
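A minimal sketch of such a histogram, assuming a made-up vector of scores in place of the real tweet scores. It uses `hist` as the text describes; the `RdYlGn` palette choice is an assumption, and the sketch falls back to a base-R palette when RColorBrewer is not installed.

```r
# Made-up sentiment scores standing in for the scored tweets.
scores <- c(-3, -2, -2, -1, -1, -1, 0, 0, 0, 0, 1, 1, 2)

# One bin per integer score, centred on the integer.
breaks <- seq(min(scores) - 0.5, max(scores) + 0.5, by = 1)
h <- hist(scores, breaks = breaks, plot = FALSE)
n_bins <- length(h$counts)

# Use an RColorBrewer palette when available (palette name is an assumption),
# otherwise fall back to a palette that ships with base R.
bar_colors <- if (requireNamespace("RColorBrewer", quietly = TRUE)) {
  RColorBrewer::brewer.pal(n_bins, "RdYlGn")
} else {
  heat.colors(n_bins)
}

plot(h, col = bar_colors,
     main = "Sentiment scores of tweets",
     xlab = "Score", ylab = "Number of tweets")
```

Centring the bins on integers keeps each bar aligned with one score value, which makes the neutral bar at zero easy to read.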
  • 16. Analysis & Conclusion The above histogram shows the frequency of tweets with respect to the score allotted to each tweet. The x-axis shows the score of each tweet as a negative integer, a positive integer or zero. A positive score represents positive or good sentiment associated with that particular tweet, whereas a negative score represents negative or bad sentiment associated with that tweet. A score of zero indicates a neutral sentiment. The more positive the score, the more positive the sentiment of the person tweeting, and vice versa. The above histogram is skewed towards negative scores, which shows that the sentiments of people regarding Mr. Kejriwal are negative. This may be explained by the fact that, in recent days, some of his schemes have backfired; for example, the Janta Darbar was scrapped because of the chaos it created. This can also be shown using a quick plot with the following commands: Out of the 1000 tweets that were fetched from Twitter, the largest group (400) is neutral, whereas around 225 had negative sentiments. Fewer than 100 tweets had positive sentiments, and the overall score is negative, as can be seen from the plot.
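The grouping into negative, neutral and positive tweets can be sketched as follows. The scores are made up for illustration and do not reproduce the counts reported above; a base-R `barplot` stands in for the quick plot mentioned in the text.

```r
# Made-up integer scores standing in for the scored tweets.
scores <- c(-2, -1, 0, 0, 0, 1, -1, 0, 2, -3)

# Scores at or below -1 are negative, 0 is neutral, 1 or above is positive.
sentiment <- cut(scores,
                 breaks = c(-Inf, -1, 0, Inf),
                 labels = c("negative", "neutral", "positive"))

# Number of tweets per class and the overall (summed) score.
class_counts <- table(sentiment)
overall_score <- sum(scores)

print(class_counts)
print(overall_score)

# Base-R bar chart as a stand-in for the quick plot.
barplot(class_counts, main = "Tweets by sentiment class",
        ylab = "Number of tweets")
```

Comparing the class counts with the summed score shows how a few strongly negative tweets can pull the overall score below zero even when neutral tweets are the largest group.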
  • 17. Text Analysis Now that we have created a csv file containing the tweets for #Kejriwal, we can do text mining on the tweets. The procedure uses the wordcloud and tm packages. The code for text mining is: The output on the console will be a collection of all the tweets saved in that file. Now we perform some text mining functions to refine the text and filter it according to our needs.
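The refine-and-filter step can be illustrated with a base-R sketch that computes the term frequencies a word cloud would be drawn from. This is a simplified stand-in for the tm and wordcloud pipeline, not the original code; the example tweets and the short stop-word list are made up.

```r
# Made-up tweets standing in for the text column of the saved csv file.
tweets <- c(
  "Kejriwal promises to end power cut",
  "Power cut again in Delhi, says Kejriwal",
  "Modi vs Kejriwal: voting begins"
)

# Short illustrative stop-word list; tm ships a much fuller one.
stop_words <- c("to", "in", "vs", "again", "says", "the", "a")

# Refine the text: lower-case, drop punctuation and digits, split into words.
cleaned <- tolower(tweets)
cleaned <- gsub("[[:punct:]]", "", cleaned)
cleaned <- gsub("[[:digit:]]+", "", cleaned)
words <- unlist(strsplit(cleaned, "\\s+"))

# Filter out empty strings and stop words.
words <- words[nchar(words) > 0 & !(words %in% stop_words)]

# Term frequencies, most frequent first; these drive word size in a word cloud.
term_freq <- sort(table(words), decreasing = TRUE)
print(head(term_freq, 5))
```

With the wordcloud package installed, the names and counts in `term_freq` can be passed to its `wordcloud` function as the words and their frequencies.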
  • 18. The final wordcloud obtained is as follows: According to this wordcloud, Kejriwal is the most used term in the tweets, followed by power, cut and modi. This shows that people tweeting about Kejriwal also connect the term Kejriwal with words like voting, power cuts and Modi.
  • 19. Final code for Twitter Analysis
  • 20. Final code for Text Mining