This document outlines an introductory Python programming course, including discussions of:
1. Python data types like integers, floats, booleans, and strings.
2. Python structures like lists, dictionaries, functions, and methods.
3. Using indentation in Python to structure programs with blocks like for loops and if/else statements.
4. An upcoming class will cover data harvesting, storage formats, APIs, scrapers, databases, and an assigned reading.
1. Basics Exercise Next meetings
Big Data and Automated Content Analysis
Week 2 – Wednesday
»Getting started with Python«
Damian Trilling
d.c.trilling@uva.nl
@damian0604
www.damiantrilling.net
Afdeling Communicatiewetenschap
Universiteit van Amsterdam
8 April 2014
Big Data and Automated Content Analysis Damian Trilling
2. Basics Exercise Next meetings
Today
1 The very, very, basics of programming with Python
Datatypes
Indention: The Python way of structuring your program
2 Exercise
3 Next meetings
Big Data and Automated Content Analysis Damian Trilling
3. The very, very, basics of programming
You’ve read all this in chapter 3.
4. Basics Exercise Next meetings
Datatypes
Python lingo
Basic datatypes (variables)
int 32
float 1.75
bool True, False
string "Damian"
Big Data and Automated Content Analysis Damian Trilling
5. Basics Exercise Next meetings
Datatypes
Python lingo
Basic datatypes (variables)
int 32
float 1.75
bool True, False
string "Damian"
"5" and 5 is not the same.
But you can transform it: int("5") will return 5.
You cannot calculate 3 * "5".
But you can calculate 3 * int("5")
Big Data and Automated Content Analysis Damian Trilling
6. Basics Exercise Next meetings
Datatypes
Python lingo
More advanced datatypes
Note that the elements of a list, the keys of a dict, and the values
of a dict can have any datatype! (It should be consistent, though!)
Big Data and Automated Content Analysis Damian Trilling
7. Basics Exercise Next meetings
Datatypes
Python lingo
More advanced datatypes
list firstnames = [’Damian’,’Lori’,’Bjoern’]
lastnames =
[’Trilling’,’Meester’,’Burscher’]
Note that the elements of a list, the keys of a dict, and the values
of a dict can have any datatype! (It should be consistent, though!)
Big Data and Automated Content Analysis Damian Trilling
8. Basics Exercise Next meetings
Datatypes
Python lingo
More advanced datatypes
list firstnames = [’Damian’,’Lori’,’Bjoern’]
lastnames =
[’Trilling’,’Meester’,’Burscher’]
list ages = [18,22,45,23]
Note that the elements of a list, the keys of a dict, and the values
of a dict can have any datatype! (It should be consistent, though!)
Big Data and Automated Content Analysis Damian Trilling
9. Basics Exercise Next meetings
Datatypes
Python lingo
More advanced datatypes
list firstnames = [’Damian’,’Lori’,’Bjoern’]
lastnames =
[’Trilling’,’Meester’,’Burscher’]
list ages = [18,22,45,23]
dict familynames= {’Bjoern’: ’Burscher’,
’Damian’: ’Trilling’, ’Lori’: ’Meester’}
dict {’Bjoern’: 26, ’Damian’: 31, ’Lori’:
25}
Note that the elements of a list, the keys of a dict, and the values
of a dict can have any datatype! (It should be consistent, though!)
Big Data and Automated Content Analysis Damian Trilling
10. Basics Exercise Next meetings
Datatypes
Python lingo
Functions
Big Data and Automated Content Analysis Damian Trilling
11. Basics Exercise Next meetings
Datatypes
Python lingo
Functions
functions Take an input and return something else
int(32.43) returns the integer 32. len("Hello")
returns the integer 5.
Big Data and Automated Content Analysis Damian Trilling
12. Basics Exercise Next meetings
Datatypes
Python lingo
Functions
functions Take an input and return something else
int(32.43) returns the integer 32. len("Hello")
returns the integer 5.
methods are similar to functions, but directly associated with
an object. "SCREAM".lower() returns the string
"scream"
Big Data and Automated Content Analysis Damian Trilling
13. Basics Exercise Next meetings
Datatypes
Python lingo
Functions
functions Take an input and return something else
int(32.43) returns the integer 32. len("Hello")
returns the integer 5.
methods are similar to functions, but directly associated with
an object. "SCREAM".lower() returns the string
"scream"
Both functions and methods end with (). Between the (),
arguments can (sometimes have to) be supplied.
Big Data and Automated Content Analysis Damian Trilling
15. Basics Exercise Next meetings
Indention
Indention
Structure
The program is structured by TABs or SPACEs
1 firstnames=[’Damian’,’Lori’,’Bjoern’]
2 age={’Bjoern’: 27, ’Damian’: 32, ’Lori’: 26}
3 print ("The names and ages of all BigData people:")
4 for naam in firstnames:
5 print (naam,age[naam])
Big Data and Automated Content Analysis Damian Trilling
16. Basics Exercise Next meetings
Indention
Indention
Structure
The program is structured by TABs or SPACEs
1 firstnames=[’Damian’,’Lori’,’Bjoern’]
2 age={’Bjoern’: 27, ’Damian’: 32, ’Lori’: 26}
3 print ("The names and ages of all BigData people:")
4 for naam in firstnames:
5 print (naam,age[naam])
Don’t mix up TABs and spaces! Both are valid, but you have
to be consequent!!!
Big Data and Automated Content Analysis Damian Trilling
17. Basics Exercise Next meetings
Indention
Indention
Structure
The program is structured by TABs or SPACEs
1 print ("The names and ages of all BigData people:")
2 for naam in firstnames:
3 print (naam,age[naam])
4 if naam=="Damian":
5 print ("He teaches this course")
6 elif naam=="Lori":
7 print ("She was an assistant last year")
8 elif naam=="Bjoern":
9 print ("He helps on Wednesdays")
10 else:
11 print ("No idea who this is")
Big Data and Automated Content Analysis Damian Trilling
18. Basics Exercise Next meetings
Indention
Indention
The line before an indented block starts with a statement
indicating what should be done with the block and ends with a :
Big Data and Automated Content Analysis Damian Trilling
19. Basics Exercise Next meetings
Indention
Indention
The line before an indented block starts with a statement
indicating what should be done with the block and ends with a :
Indention of the block indicates that
Big Data and Automated Content Analysis Damian Trilling
20. Basics Exercise Next meetings
Indention
Indention
The line before an indented block starts with a statement
indicating what should be done with the block and ends with a :
Indention of the block indicates that
• it is to be executed repeatedly (for statement) – e.g., for
each element from a list
Big Data and Automated Content Analysis Damian Trilling
21. Basics Exercise Next meetings
Indention
Indention
The line before an indented block starts with a statement
indicating what should be done with the block and ends with a :
Indention of the block indicates that
• it is to be executed repeatedly (for statement) – e.g., for
each element from a list
• it is only to be executed under specific conditions (if, elif,
and else statements)
Big Data and Automated Content Analysis Damian Trilling
22. Basics Exercise Next meetings
Indention
Indention
The line before an indented block starts with a statement
indicating what should be done with the block and ends with a :
Indention of the block indicates that
• it is to be executed repeatedly (for statement) – e.g., for
each element from a list
• it is only to be executed under specific conditions (if, elif,
and else statements)
• an alternative block should be executed if an error occurs
(try and except statements)
Big Data and Automated Content Analysis Damian Trilling
23. Basics Exercise Next meetings
Indention
Indention
The line before an indented block starts with a statement
indicating what should be done with the block and ends with a :
Indention of the block indicates that
• it is to be executed repeatedly (for statement) – e.g., for
each element from a list
• it is only to be executed under specific conditions (if, elif,
and else statements)
• an alternative block should be executed if an error occurs
(try and except statements)
• a file is opened, but should be closed again after the block has
been executed (with statement)
Big Data and Automated Content Analysis Damian Trilling
24. Basics Exercise Next meetings
We’ll now together do the exercise “Describing an existing
structured dataset”.
Big Data and Automated Content Analysis Damian Trilling
25. Basics Exercise Next meetings
Next meetings
Big Data and Automated Content Analysis Damian Trilling
26. Basics Exercise Next meetings
Week 3: Data harvesting and storage
Monday, 13–4
A conceptual overview of APIs, scrapers, crawlers, RSS-feeds,
databases, and different file formats
Wednesday, 15–4
Writing some first data collection scripts
Preparation
• Conceptual level: Read the article by Morstatter, Pfeffer, Liu,
and Carley (2013) about the limitations of the Twitter API.
• Technical level: Make sure you are comfortable with the
techniques we’ve covered so far. Play around. Give
yourself some tasks and solve them. Google.
Big Data and Automated Content Analysis Damian Trilling