"Harvesting Data from Twitter Workshop" presented in collaboration with IWAN Research Group.
Trainers: Dr. Nora AlTwairesh, Ms. Tarfa AlBuhairi, Ms. Mawaheb AlTuwaijri, and Ms. Afnan AlMoammar
-------------------------------------
ASA Research Group
Twitter: @ASA__IU
Email: asa@imamu.edu.sa
Website: http://asa.imamu.edu.sa/
-------------------------------------
IWAN Research Group
Twitter: @IWAN_RG
Email: iwan@ksu.edu.sa
Website: http://iwan.ksu.edu.sa
4. Why Twitter
• Twitter has become a mass information hub that can be used to study the evolution of almost any issue: a revolutionary research instrument.
• Research disciplines that study Twitter data span the domains of computer science, information science, communications, business, economics, education, medicine, political science, and sociology.
5. Why Twitter
• Recent studies show that 60% of daily Arabic tweets are from Saudi Arabia.
Hamdy Mubarak and Kareem Darwish. 2014. Using Twitter to collect a multi-dialectal corpus of Arabic. ANLP 2014:1.
6. Twitter API
• Free access to the tweets posted in the last 7 days, within a certain rate limit.
• Tweets posted earlier than 7 days ago are considered historical tweets and must be purchased through third-party providers.
• The Twitter API provides three interfaces for tweet collection: the Streaming API, the REST API, and the Search API.
7. Streaming API
• The Streaming API provides real-time tweets in a live-poll fashion.
• Requested tweets flow in constantly as they are posted on Twitter. The stream is delivered in three bandwidths: "spritzer" (1%), "gardenhose" (10%), and "firehose" (100% of all tweets posted on Twitter).
• A regular user wanting to collect tweets will be granted spritzer access.
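As a sketch, collecting from the spritzer stream with the `twitter` Python package (installed later in this workshop) might look like the following. The function name, credential plumbing, and `limit` parameter are illustrative; `TwitterStream` and `statuses.sample()` are the package's documented interface.

```python
def sample_stream(consumer_key, consumer_secret, token, token_secret, limit=10):
    """Collect up to `limit` tweet texts from the 1% "spritzer" sample stream."""
    # The `twitter` package (installed later via `pip install twitter`)
    # exposes the spritzer stream as statuses.sample().
    from twitter import OAuth, TwitterStream

    stream = TwitterStream(
        auth=OAuth(token, token_secret, consumer_key, consumer_secret)
    )
    texts = []
    for tweet in stream.statuses.sample():
        # Keep-alive and control messages have no "text" field; skip them.
        if isinstance(tweet, dict) and "text" in tweet:
            texts.append(tweet["text"])
        if len(texts) >= limit:
            break
    return texts
```

Running this requires the OAuth credentials created in the "Create a Twitter App" step below.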
8. REST API
• The REST API was specifically designed for programmatic access to read and write Twitter data.
• Third-party applications that interact with Twitter are given a large set of REST API methods with which to develop these applications.
• Access to the REST API is also rate-limited; the limit is 150 requests per hour.
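Staying under such a cap is easy to enforce on the client side. A minimal sketch follows; the 150 requests/hour figure comes from the slide above, while the class itself is hypothetical:

```python
import time

class RateLimiter:
    """Block just long enough between calls to respect a per-hour request cap."""

    def __init__(self, max_per_hour):
        self.min_interval = 3600.0 / max_per_hour  # seconds between requests
        self.last_call = None

    def wait(self):
        """Call before each API request; sleeps if the last call was too recent."""
        now = time.monotonic()
        if self.last_call is not None:
            remaining = self.min_interval - (now - self.last_call)
            if remaining > 0:
                time.sleep(remaining)
        self.last_call = time.monotonic()

limiter = RateLimiter(150)  # the REST API limit above: one call every 24 seconds
```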
9. Search API
• Similar to the REST API, the Search API is pull-based. It replicates the search functionality provided on the Twitter website; however, the tweets retrieved are restricted to the past 7 days.
• The Search API is not appropriate for high-throughput, real-time data acquisition. As such, Twitter Inc. discourages its use and plans to discontinue it in the future.
10. Create a Twitter App
• To access the Twitter API you need to create a Twitter app; follow this simple tutorial to do so:
https://iag.me/socialmedia/how-to-create-a-twitter-app-in-8-easy-steps/
• You will use the OAuth settings in both R and Python:
• Consumer Key
• Consumer Secret
• OAuth Access Token
• OAuth Access Token Secret
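The R and Python Twitter libraries consume these four values directly, but for illustration, here is roughly what they do with them under the hood: build an OAuth 1.0a `Authorization` header signed with HMAC-SHA1. This is a stdlib-only sketch of the signing step; real libraries also handle request sending, errors, and edge cases.

```python
import base64
import hashlib
import hmac
import secrets
import time
from urllib.parse import quote

def oauth1_header(method, url, params, consumer_key, consumer_secret,
                  access_token, token_secret):
    """Build an OAuth 1.0a Authorization header (HMAC-SHA1, per RFC 5849)."""
    oauth = {
        "oauth_consumer_key": consumer_key,
        "oauth_nonce": secrets.token_hex(16),
        "oauth_signature_method": "HMAC-SHA1",
        "oauth_timestamp": str(int(time.time())),
        "oauth_token": access_token,
        "oauth_version": "1.0",
    }
    # All parameters are percent-encoded, sorted, and joined into a base string.
    all_params = {**params, **oauth}
    param_str = "&".join(
        f"{quote(k, safe='')}={quote(v, safe='')}"
        for k, v in sorted(all_params.items())
    )
    base = "&".join(quote(s, safe="") for s in (method.upper(), url, param_str))
    # The signing key combines the Consumer Secret and the Access Token Secret.
    key = f"{quote(consumer_secret, safe='')}&{quote(token_secret, safe='')}"
    sig = base64.b64encode(
        hmac.new(key.encode(), base.encode(), hashlib.sha1).digest()
    ).decode()
    oauth["oauth_signature"] = sig
    return "OAuth " + ", ".join(
        f'{quote(k, safe="")}="{quote(v, safe="")}"' for k, v in sorted(oauth.items())
    )
```

The resulting string goes into the `Authorization` header of each API request; every request gets a fresh nonce and timestamp, so signatures are never reused.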
11. Tools to Collect Tweets
• Nodexl: https://nodexl.codeplex.com/
• Tweet Archivist : https://www.tweetarchivist.com/
• Twitter Archiving Google Spreadsheet (TAGS):
https://tags.hawksey.info/
24. Python
• Two versions: 2.7 and 3.x
• Twitter packages: twitter and tweepy
• IDE: Anaconda with the IPython (Jupyter) Notebook
25. Installing Python
• Install Anaconda from https://www.continuum.io/downloads and choose the Python 2.7 version (only for this tutorial).
• Install the twitter package: from the command line (terminal), type: pip install twitter
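Once the package is installed, a first Search API call might look like this. The function name, credential placeholders, and query are illustrative; `Twitter(...).search.tweets(...)` is the package's documented interface.

```python
def search_tweets(consumer_key, consumer_secret, token, token_secret,
                  query, count=100):
    """Return the text of up to `count` recent tweets matching `query`."""
    from twitter import Twitter, OAuth  # pip install twitter

    t = Twitter(auth=OAuth(token, token_secret, consumer_key, consumer_secret))
    # The Search API only reaches back 7 days (see slide 9).
    result = t.search.tweets(q=query, count=count)
    return [status["text"] for status in result["statuses"]]
```

For example, `search_tweets(ck, cs, at, ats, "#Riyadh")` would fetch recent tweets mentioning the hashtag, given valid credentials.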
26. Comparison between R and Python
• https://www.datacamp.com/community/tutorials/r-or-python-for-data-analysis#gs.GuXGfAc
• http://blog.udacity.com/2015/01/python-vs-r-learn-first.html
• http://www.dataschool.io/python-or-r-for-data-science/
27. Contact Us
ASA Research Group
Twitter: @ASA__IU
Email: asa@imamu.edu.sa
Website: http://asa.imamu.edu.sa/
IWAN Research Group
Twitter: @IWAN_RG
Email: iwan@ksu.edu.sa
Website: http://iwan.ksu.edu.sa
Gardenhose access is granted on special request from Twitter Inc., and firehose access is granted to third-party business partners of Twitter Inc., which are considered third-party data providers.
An example of these applications is the inclusion of a Tweet share button on some websites that allows the reader of this website to share the link of the website on Twitter by posting it as a tweet; this is an example of writing Twitter data. An example of reading Twitter data is when websites display tweets of a certain hashtag or user account in a widget on their website’s pages.
With the Search API you can only send 180 requests every 15-minute timeframe. With a maximum of 100 tweets per request, this means you can mine 4 x 180 x 100 = 72,000 tweets per hour.
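The throughput figure above follows directly from the window size and per-request cap (both values taken from the note above):

```python
# Search API throughput: 180 requests per 15-minute window,
# up to 100 tweets per request.
windows_per_hour = 60 // 15        # 4 fifteen-minute windows per hour
requests_per_window = 180
tweets_per_request = 100

tweets_per_hour = windows_per_hour * requests_per_window * tweets_per_request
print(tweets_per_hour)  # 72000
```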
The R project was conceived in 1992, with an initial version released in 1994 and a stable beta version in 2000.
R is the leading tool for statistics, data analysis, and machine learning.
R allows you to integrate with other languages (C/C++, Java, Python)