Created by The Curiosity Bits Blog (curiositybits.com)
Download the Python code used in the tutorial
Codes provided by Dr. Gregory D. Saxton
Mining Twitter User Profile on
Python
1
Prerequisite
Setting up API keys: pg.4-6
Installing necessary Python libraries: pg.7-8
Creating a list of Twitter screen-names: pg.9
Setting up a SQLite Database to store Twitter data: pg.10-14
But if you are a Python newbie, let’s start with the
very basics.
2
We assume you are a Python newbie, so let’s start with the
very basics.
• Choosing the right Python platform: Python is a programming
language, and you can use different software packages to write, edit
and run Python code. We use Anaconda, which is free to
download; the Python version used here is 2.7.
• Once you install Anaconda, you can experiment with Python code in
Spyder, the editor that comes with Anaconda.
3
Setting up API keys
• We need keys to get Twitter data through the Twitter API
(https://dev.twitter.com/). You need: API Key, API Secret, Access token,
Access token secret.
• First, go to https://dev.twitter.com/ and sign in with your Twitter account. Go
to the My applications page to create an application.
4
Setting up API keys
Enter any name that makes sense to you.
Enter any text that makes sense to you.
You can enter any legitimate URL; here, I put in the URL of my institution.
Same as above: you can enter any legitimate URL; here, I put in the URL of my institution.
5
Setting up API keys
• After creating the app, go to the API Keys page, scroll down to the
bottom and click Create my access token. Wait a few minutes
and refresh the page, and you will see all your keys.
You need the API Key, API Secret, Access token and Access token secret.
6
Installing necessary Python libraries
Think of Python libraries as the apps running on your operating
system. To use our code, you need the following libraries:
• Simplejson (https://pypi.python.org/pypi/simplejson)
• Sqlite3 (http://sqlite.org/)
• Sqlalchemy (http://www.sqlalchemy.org/)
• Twython (https://twython.readthedocs.org/en/latest/index.html)
7
Installing necessary Python libraries
To install the libraries, open the Start menu, type CMD, and run the Command Prompt as
administrator. At the prompt, type pip install followed by the
name of the Python library. For example, to install Twython, type pip install
twython and press Enter. Use this procedure to install all the necessary libraries.
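After installing, a quick check from inside Python can confirm that each library is importable. The sketch below is written for Python 3 and assumes the import names simplejson, sqlite3, sqlalchemy and twython; the helper name check_libraries is illustrative and is not part of the tutorial's code.

```python
import importlib.util

# Import names of the libraries listed in the tutorial (assumed to match the package names).
REQUIRED = ["simplejson", "sqlite3", "sqlalchemy", "twython"]


def check_libraries(names):
    """Return a dict mapping each module name to True if Python can find it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_libraries(REQUIRED).items():
        print(name, "OK" if found else "MISSING")
```

Any library reported as MISSING can usually be installed with the pip command described above.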
8
Creating a list of Twitter screen-names
• Our Python code gathers profile information for multiple
Twitter users. So, first let’s create a list of users. The list should be in
.csv format and contain three columns (in accordance with the
configuration in our Python code). Specifically, it looks like this:
The first column lists sequential numbers.
The second column lists the Twitter screen-names you are interested in.
For the third column, I entered 1 throughout, but you can leave it blank.
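If you prefer to build the list in Python rather than in a spreadsheet, the three-column layout described above could be written as follows. This is a minimal Python 3 sketch: the screen names are made up, the numbering is assumed to start at 1, and the function name write_accounts_csv is illustrative only.

```python
import csv

# Hypothetical screen names, used only for illustration.
SCREEN_NAMES = ["example_user_one", "example_user_two", "example_user_three"]


def write_accounts_csv(path, names):
    """Write one row per user: sequential number, screen name, and 1 as user type."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        for number, name in enumerate(names, start=1):
            writer.writerow([number, name, 1])


if __name__ == "__main__":
    write_accounts_csv("accounts.csv", SCREEN_NAMES)
```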
9
Setting up a SQLite Database to store Twitter data
You need storage for the incoming data from the Twitter API. That
is what databases are for. We use SQLite, a lightweight relational
database management system (RDBMS) that is queried with SQL.
Python talks to it through the sqlite3 library, which ships with
Python’s standard library, so it normally needs no separate
installation. On top of that, you can
download a database browser to view and edit the database
just like an Excel file.
Go to http://sqlitebrowser.sourceforge.net/ and download
SQLite Database Browser. It allows you to view and edit
SQLite databases.
10
Setting up a SQLite Database to store Twitter data
Once you have the files downloaded, run the following file.
11
Setting up a SQLite Database to store Twitter data
Now, we need to import the Twitter users list into a SQLite database. To do that,
create a new database. Remember the database file name because we need to
write it into the Python code.
This tutorial uses .sqlite as the file extension. To prevent future complications,
add the extension .sqlite when you save a file in SQLite Database Browser.
12
Setting up a SQLite Database to store Twitter data
Stay on the database file you just created.
Go to File-Import-Table From CSV File and import the
.csv file you saved. Name the imported table
accounts. This table name corresponds to the
one we will use in the Python code. After you click
Create, the csv list will be loaded into the
database, and you can browse it in Browse
Data. Lastly, remember to save the database.
13
Setting up a SQLite Database to store Twitter data
Now, we need to modify the imported table.
Go to Edit-Modify Tables, then use Edit field
to change the column names. To correspond to our
Python code, name the first column rowid
with Field Type Integer; the second column
screen_name with Field Type String; and the
third user_type, also String. In the end, the
database table is defined as shown in the screenshot.
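The same import and table definition can also be done without the database browser. The sketch below uses Python's built-in sqlite3 module (Python 3) to create an accounts table and fill it from the .csv list. It assumes the first column is named rowid, stores the two String columns as TEXT, and uses an illustrative function name (load_accounts); it is not the tutorial's own code.

```python
import csv
import sqlite3


def load_accounts(db_path, csv_path):
    """Create the accounts table if needed and fill it from the CSV list.

    Returns the number of rows read from the CSV file.
    """
    with open(csv_path, newline="") as handle:
        # Each CSV row is: sequential number, screen name, user type.
        rows = [(int(r[0]), r[1], r[2]) for r in csv.reader(handle) if r]

    conn = sqlite3.connect(db_path)
    with conn:  # commits automatically if no error occurs
        conn.execute(
            "CREATE TABLE IF NOT EXISTS accounts ("
            "rowid INTEGER PRIMARY KEY, screen_name TEXT, user_type TEXT)"
        )
        conn.executemany(
            "INSERT OR REPLACE INTO accounts (rowid, screen_name, user_type) "
            "VALUES (?, ?, ?)",
            rows,
        )
    conn.close()
    return len(rows)
```

A call such as load_accounts("mydata.sqlite", "accounts.csv") would produce a file that SQLite Database Browser can open; both file names here are placeholders.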
14
Now, moving on to the actual Python code…
Download the Python code, and open it in Anaconda
15
There are only a few places you need to change, but let’s
walk through the code first…
The first block of code imports the necessary Python libraries.
Make sure you have
installed all these
necessary libraries.
16
The second block is where you need to enter the keys we obtained at the
beginning. Just copy and paste each key inside the quotation marks.
API Key
API secret
Access token
Access token secret
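As a rough illustration of what such a block can look like, the sketch below keeps the four keys in plainly named variables and checks that none of them was left unfilled. The variable names, the placeholder strings and the helper unfilled_keys are assumptions made for this example, not the names used in the downloadable code.

```python
# Placeholder values: replace each string with the matching key from your app page.
APP_KEY = "paste-API-key-here"
APP_SECRET = "paste-API-secret-here"
OAUTH_TOKEN = "paste-access-token-here"
OAUTH_TOKEN_SECRET = "paste-access-token-secret-here"


def unfilled_keys(keys):
    """Return the names of keys that are empty or still hold a placeholder."""
    return sorted(
        name for name, value in keys.items()
        if not value or value.startswith("paste-")
    )


if __name__ == "__main__":
    missing = unfilled_keys({
        "APP_KEY": APP_KEY,
        "APP_SECRET": APP_SECRET,
        "OAUTH_TOKEN": OAUTH_TOKEN,
        "OAUTH_TOKEN_SECRET": OAUTH_TOKEN_SECRET,
    })
    print("Keys still to fill in:", missing)
    # With Twython installed, the four values are typically passed as
    # Twython(APP_KEY, APP_SECRET, OAUTH_TOKEN, OAUTH_TOKEN_SECRET).
```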
17
The third block is where we define the columns in the SQLite database. For now, we do not
need to edit anything here.
18
The fourth block is where we ask the Python code to get Twitter user profile
information based on the list of users already saved in the SQLite database. Here, you will
see that the table name and the column names correspond to the ones we previously
saved in SQLite.
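To make the link between the code and the saved table concrete, here is a minimal sketch that reads the screen names back out of an accounts table. It uses the built-in sqlite3 module instead of SQLAlchemy to stay self-contained, assumes the column names rowid and screen_name, and runs against a small in-memory table with made-up users.

```python
import sqlite3


def read_screen_names(conn):
    """Return the screen names stored in the accounts table, in row order."""
    cursor = conn.execute("SELECT screen_name FROM accounts ORDER BY rowid")
    return [row[0] for row in cursor]


# Small in-memory demonstration with made-up screen names.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (rowid INTEGER PRIMARY KEY, screen_name TEXT, user_type TEXT)"
)
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?, ?)",
    [(1, "example_user_one", "1"), (2, "example_user_two", "1")],
)
print(read_screen_names(conn))
```

If the table or a column were named differently in the database than in the code, the SELECT statement would fail, which is why the names must match.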
19
The fifth block is where we make a specific request through the Twitter API to
get data:
Here, we ask Python to
get one recent status
from each listed user. This
procedure also returns the
user’s profile
information. We will
discuss what profile
information is available
later on.
20
The raw output from the Twitter API is in JSON format. JSON is a standardized way of
storing information. Now we need to map the information in JSON format to the
columns of the database table. Notice that each column in the database represents a Twitter
output variable.
e.g. A Twitter user’s profile description is
stored as description under user in
JSON. This line of code maps the
profile description in JSON to the
database column named
from_user_description.
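The mapping idea can be sketched in a few lines. The JSON string below is a heavily shortened, made-up stand-in for what the API returns, and map_profile is an illustrative helper rather than the tutorial's actual code; only the nesting (description under user) and the from_user_ column naming follow the text above.

```python
import json

# A shortened, made-up example of the JSON returned for one status.
RAW = (
    '{"text": "hello", "user": {"screen_name": "example_user", '
    '"description": "An example bio", "followers_count": 10}}'
)


def map_profile(status):
    """Copy fields nested under "user" into flat, database-style column names."""
    user = status.get("user", {})
    return {
        "from_user_screen_name": user.get("screen_name"),
        "from_user_description": user.get("description"),
        "from_user_followers_count": user.get("followers_count"),
    }


row = map_profile(json.loads(RAW))
print(row["from_user_description"])  # prints: An example bio
```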
21
You need to change the file path and file name here
(RECOMMENDED).
If the Python file and your SQLite database are in the
same folder, just paste your database name here.
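A mistyped database name is easy to miss, because SQLite will quietly create a new, empty database file when asked to open one that does not exist. A small check like the following can catch that before the main code runs. The function name find_database and the example file name are hypothetical.

```python
import os


def find_database(name, folder=None):
    """Return the full path to the database file, or raise if it is missing."""
    folder = folder or os.getcwd()
    path = os.path.join(folder, name)
    if not os.path.isfile(path):
        raise FileNotFoundError("No database file at " + path)
    return path


if __name__ == "__main__":
    try:
        print(find_database("mydata.sqlite"))  # placeholder file name
    except FileNotFoundError as error:
        print(error)
```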
22
Now, you are ready to run the code. Go to Run, and choose Execute in a new dedicated
Python interpreter. The first option, Execute in current Python or IPython interpreter,
does not work on my end, but it may work on your computer.
23
Now, look at the right-side panel in Anaconda.
Oops, looks like I am getting error messages!
ERRORS!!
Don’t panic! It’s likely you will hit roadblocks
when you run Python code. So, it is important
to learn to debug.
This error is likely because I saved the
Python file in a folder that is not a default
Python folder.
But what is the default Python folder?
24
The simple way to find out your default
Python folder is:
• On a Windows machine, in the Start menu, right-click Computer
and choose Properties
25
Folders listed
here are your
default Python
folders.
26
In my case, C:\Anaconda\Lib\site-packages is my default Python folder. So I moved the
Python code there, edited the file path in the code, and ran it. There you go: the code is
running and getting what we want! If you check the database file, you will see that a
new table named typhoon has been created (you can change the table name in the Python
code), and it includes the listed users’ recent tweets and profile information.
27
Oops! Error again!
The Twitter API has a rate limit.
Based on the version of the Twitter API used in our
Python code, you can get roughly 300 users per
15 minutes. Once you hit the limit, you
will see the error message shown in the
screenshot.
There are two ways to deal with the
restriction:
1. wait 15 minutes before another run;
2. create multiple Twitter apps and get
multiple sets of keys. Once you use up the quota
in one run, paste in a new set of keys to start a
new run!
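The first option (waiting) can also be automated. The sketch below pauses after every batch of users so that a single run can work through a long list. The batch size of 300 and the 15-minute wait are taken from the rough figures above and may not match the real limit; handle_user stands for whatever function fetches one profile, and all names are illustrative.

```python
import time


def process_in_batches(users, handle_user, batch_size=300,
                       wait_seconds=15 * 60, sleep=time.sleep):
    """Call handle_user for every user, pausing after each full batch."""
    for index, user in enumerate(users):
        if index > 0 and index % batch_size == 0:
            # A full batch has been processed: wait out the rate-limit window.
            sleep(wait_seconds)
        handle_user(user)


if __name__ == "__main__":
    # Demonstration with made-up users and a tiny wait instead of 15 minutes.
    process_in_batches(["user_a", "user_b", "user_c"], print,
                       batch_size=2, wait_seconds=1)
```

The sleep parameter exists only so the pause can be replaced during testing; in normal use the default time.sleep is what you want.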
28
If you put 0 here, the code starts with the user listed in the first row.
Because we will hit the rate limit, you will need to run the code multiple times
to finish crawling all users on the list. Make sure to change the starting
row number each time!
For example, if in the first run you get user (0) to user (150) and hit the rate
limit, you should put 151 in the second run to start with the next user on
the list.
29
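Because the starting number counts from 0, it helps to see how it selects the remaining users. The following sketch assumes the list of users behaves like a Python list and that the starting row is a zero-based index; the user names are made up.

```python
def users_for_run(users, start_row):
    """Return the users still to be crawled, beginning at zero-based start_row."""
    return users[start_row:]


users = ["user%d" % i for i in range(200)]  # made-up names: user0 ... user199
remaining = users_for_run(users, 151)
print(remaining[0])    # prints: user151
print(len(remaining))  # prints: 49
```

If the first run finished user (0) through user (150), starting the next run at 151 picks up exactly where it stopped, with no user skipped or repeated.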
A list of Twitter output variables
Go to SQLite Database Browser and select the table typhoon (again, this is the name we
gave in the Python code). You will see the output variables across the columns.
30
A list of Twitter output variables
Some key variables related to user profile:
• from_user_screen_name: the user’s Twitter screen-name
• from_user_followers_count: how many people are following the user
• from_user_friends_count: how many people this user is following
• from_user_listed_count: how many times the user is listed in other users’ public
lists
• from_user_favourites_count: how many tweets the user has favorited (liked)
• from_user_statuses_count: how many tweets the user has sent
• from_user_description: the user’s profile bio
• from_user_location: location
• from_user_created_at: when the account was created
31
A list of Twitter output variables
Go to File – Export – Table as CSV to export the data to .csv format. Make sure to
add the .csv file extension.
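The export can also be scripted. This is a minimal Python 3 sketch using the built-in sqlite3 and csv modules; the function name export_table and the file names in the usage note are placeholders, and the table name typhoon is the one mentioned above.

```python
import csv
import sqlite3


def export_table(conn, table, csv_path):
    """Write every row of the table to csv_path, preceded by a header row.

    Returns the list of column names that were written as the header.
    """
    quoted = '"{}"'.format(table.replace('"', '""'))  # quote the table name safely
    cursor = conn.execute("SELECT * FROM " + quoted)
    headers = [column[0] for column in cursor.description]
    with open(csv_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(headers)
        writer.writerows(cursor)
    return headers
```

For example, export_table(sqlite3.connect("mydata.sqlite"), "typhoon", "typhoon.csv") would write the whole table to typhoon.csv.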
32
Please send your questions and comments to
weiaixu [at] buffalo dot edu
33
