LEARNING PYTHON 
FROM DATA 
Mosky 
1
THIS SLIDE 
• The online version is at 
https://speakerdeck.com/mosky/learning-python-from-data. 
• The examples are at 
https://github.com/moskytw/learning-python-from-data-examples. 
2
MOSKY 
• I am working at Pinkoi. 
• I've taught Python for 100+ hours. 
• A speaker at 
COSCUP 2014, PyCon SG 2014, PyCon APAC 2014, 
OSDC 2014, PyCon APAC 2013, COSCUP 2014, ... 
• The author of the Python packages: 
MoSQL, Clime, ZIPCodeTW, ... 
• http://mosky.tw/ 
3
SCHEDULE 
• Warm-up 
• Packages - Install the packages we need. 
• CSV - Download a CSV from the Internet and handle it. 
• HTML - Parse HTML source code and write a Web crawler. 
• SQL - Save data into a SQLite database. 
• The End 
4
FIRST OF ALL, 
5
PYTHON IS AWESOME! 
6
2 OR 3? 
• Use Python 3! 
• But it actually depends on the libs you need. 
• https://python3wos.appspot.com/ 
• We will go ahead with Python 2.7, but I will also introduce the changes in Python 3. 
7
THE ONLINE RESOURCES 
• The Python Official Doc 
• http://docs.python.org 
• The Python Tutorial 
• The Python Standard Library 
• My Past Slides 
• Programming with Python - Basic 
• Programming with Python - Adv. 
8
THE BOOKS 
• Learning Python by Mark Lutz 
• Programming in Python 3 by Mark Summerfield 
• Python Essential Reference by David Beazley 
9
PREPARATION 
• Have you said "hello" to Python yet? 
• If not, visit 
• http://www.slideshare.net/moskytw/programming-with-python-basic. 
• If so, open your Python shell. 
10
WARM-UP 
The things you must know. 
11
MATH & VARS 
2 + 3
2 - 3
2 * 3
2 / 3, -2 / 3

(1+10)*10 / 2

2.0 / 3

2 % 3

2 ** 3

x = 2

y = 3

z = x + y

print z

'#' * 10
12
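The `2 / 3, -2 / 3` line is where Python 2.7 and Python 3 disagree. A minimal sketch, assuming you run it under Python 3: there `/` is always true division, and `//` is the floor division that Python 2 performs for `/` on integers.

```python
# Python 3: `/` is true division, `//` is floor division.
true_quotient = 2 / 3       # a float, not 0 as in Python 2
floor_quotient = 2 // 3     # what Python 2 gives for 2 / 3
negative_floor = -2 // 3    # floors toward negative infinity

x, y = 2, 3
z = x + y
print(z)                    # print is a function in Python 3
print(true_quotient, floor_quotient, negative_floor)
```

Writing `//` whenever you want an integer result keeps the code behaving the same on both versions.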
FOR 
for i in [0, 1, 2, 3, 4]:
    print i

items = [0, 1, 2, 3, 4]
for i in items:
    print i

for i in range(5):
    print i

chars = 'SAHFI'
for i, c in enumerate(chars):
    print i, c

words = ('Samsung', 'Apple',
         'HP', 'Foxconn', 'IBM')
for c, w in zip(chars, words):
    print c, w
13
IF 
for i in range(1, 10):
    if i % 2 == 0:
        print '{} is divisible by 2'.format(i)
    elif i % 3 == 0:
        print '{} is divisible by 3'.format(i)
    else:
        print '{} is not divisible by 2 or 3'.format(i)
14
WHILE 
while 1:
    n = int(raw_input('How big a pyramid do you want? '))
    if n <= 0:
        print 'It must be greater than 0: {}'.format(n)
        continue
    break
15
TRY 
while 1:

    try:
        n = int(raw_input('How big a pyramid do you want? '))
    except ValueError as e:
        print 'It must be a number: {}'.format(e)
        continue

    if n <= 0:
        print 'It must be greater than 0: {}'.format(n)
        continue

    break
16
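In Python 3, `raw_input` is renamed to `input` and `print` is a function. A minimal Python 3 sketch of the same validation loop, split into two functions so it can be tried without typing anything; the names `parse_size` and `ask_size` and the `read` parameter are my own assumptions, not part of the slides.

```python
def parse_size(text):
    """Return `text` as a positive int, or raise ValueError with a reason."""
    n = int(text)  # raises ValueError itself if `text` is not a number
    if n <= 0:
        raise ValueError('It must be greater than 0: {}'.format(n))
    return n


def ask_size(read=input):
    """Keep asking until the answer is valid; `read` can be swapped for testing."""
    while True:
        try:
            return parse_size(read('How big a pyramid do you want? '))
        except ValueError as e:
            print('Try again: {}'.format(e))


# Offline demo: feed canned answers instead of typing them.
answers = iter(['abc', '-1', '7'])
size = ask_size(lambda prompt: next(answers))
print(size)
```

Returning from inside the `try` replaces the `continue` / `break` pair: the loop only ends once a valid number has been parsed.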
LOOP ... ELSE 
for n in range(2, 100):
    for i in range(2, n):
        if n % i == 0:
            break
    else:
        print '{} is a prime!'.format(n)
17
A PYRAMID 
         *
        ***
       *****
      *******
     *********
    ***********
   *************
  ***************
 *****************
*******************
18
A FATTER PYRAMID 
         *
       *****
     *********
   *************
*******************
19
YOUR TURN! 
20
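If you get stuck, here is one possible solution, written as a Python 3 sketch. The function name `pyramid` and its `step` parameter are my own assumptions; it expects `rows` to be at least 1, and the fatter variant it prints grows by four stars per row, which may not match the slide exactly.

```python
def pyramid(rows, step=2):
    """Return the lines of a centered pyramid; each row is `step` stars wider."""
    widths = [1 + step * i for i in range(rows)]
    full_width = widths[-1]
    # center() pads both sides; rstrip() drops the useless trailing spaces.
    return [('*' * w).center(full_width).rstrip() for w in widths]


print('\n'.join(pyramid(10)))          # the 10-row pyramid
print('\n'.join(pyramid(5, step=4)))   # a fatter variant
```

Building the list of lines first and printing it afterwards keeps the drawing logic easy to check on its own.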
LIST COMPREHENSION 
[
    n
    for n in range(2, 100)
    if not any(n % i == 0 for i in range(2, n))
]
21
PACKAGES 
import is important. 
22
GET PIP - UN*X 
• Debian family 
• # apt-get install python-pip 
• Red Hat family 
• # yum install python-pip 
• Mac OS X 
• # easy_install pip 
24
GET PIP - WIN * 
• Follow the steps in http://stackoverflow.com/questions/4750806/how-to-install-pip-on-windows. 
• Or just use easy_install to install it. 
The easy_install should be found at C:\Python27\Scripts. 
• Or find the Windows installer on the Python Package Index. 
25
3RD-PARTY PACKAGES 
• requests - Python HTTP for Humans 
• lxml - Pythonic XML processing library 
• uniout - Print the object representation in readable chars. 
• clime - Convert module into a CLI program w/o any config. 
26
YOUR TURN! 
27
CSV 
Let's start by making an HTTP request! 
28
HTTP GET 
import requests

#url = 'http://stats.moe.gov.tw/files/school/101/u1_new.csv'
url = 'https://raw.github.com/moskytw/learning-python-from-data-examples/master/sql/schools.csv'

print requests.get(url).content

#print requests.get(url).text
29
FILE 
save_path = 'school_list.csv'

with open(save_path, 'w') as f:
    f.write(requests.get(url).content)

with open(save_path) as f:
    print f.read()

with open(save_path) as f:
    for line in f:
        print line,
30
DEF 
from os.path import basename

def save(url, path=None):

    if not path:
        path = basename(url)

    with open(path, 'w') as f:
        f.write(requests.get(url).content)
31
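One Python 3 change matters here: `requests.get(url).content` is `bytes`, so the file has to be opened in binary mode (`'wb'`). A minimal Python 3 sketch, assuming a hypothetical helper `save_content` that takes bytes you have already downloaded, so it can be tried offline; the example.com URL and the sample bytes are placeholders.

```python
import os
import tempfile
from os.path import basename, join


def save_content(content, url, path=None):
    """Write downloaded bytes to `path`; default to the last segment of `url`."""
    if not path:
        path = basename(url)
    with open(path, 'wb') as f:   # 'wb': `content` is bytes in Python 3
        f.write(content)
    return path


# Offline demo with stand-in bytes instead of requests.get(url).content.
demo_url = 'http://example.com/data/schools.csv'
demo_dir = tempfile.mkdtemp()
demo_path = save_content(b'a,b\n1,2\n', demo_url,
                         join(demo_dir, basename(demo_url)))
print(basename(demo_url))
print(os.path.getsize(demo_path))
```

With a real download you would pass `requests.get(url).content` as the first argument.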
CSV 
import csv
from os.path import exists

if not exists(save_path):
    save(url, save_path)

with open(save_path) as f:
    for row in csv.reader(f):
        print row
32
+ UNIOUT 
import csv
from os.path import exists
import uniout # You want this!

if not exists(save_path):
    save(url, save_path)

with open(save_path) as f:
    for row in csv.reader(f):
        print row
33
NEXT 
with open(save_path) as f:
    next(f) # skip the unwanted lines
    next(f)
    for row in csv.reader(f):
        print row
34
DICT READER 
with open(save_path) as f:
    next(f)
    next(f)
    for row in csv.DictReader(f):
        print row

# We now have a great output. :)
35
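To try the `next` + `DictReader` combination without downloading anything, here is a minimal Python 3 sketch. The sample text is hypothetical: two unwanted lines, then a header row using the `School Name` and `County` columns, then two made-up data rows.

```python
import csv
import io

# Hypothetical sample: two junk lines, a header row, then data.
sample = io.StringIO(
    'Generated report\n'
    'Year 101\n'
    'School Name,County\n'
    'First School,Taipei\n'
    'Second School,Tainan\n'
)

next(sample)  # skip the unwanted lines
next(sample)

# DictReader takes the first remaining line as the column names.
rows = list(csv.DictReader(sample))
for row in rows:
    print(row['School Name'], row['County'])
```

Because the two `next` calls consume the junk lines first, `DictReader` sees the real header row and keys every row by column name.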
DEF AGAIN 
def parse_to_school_list(path):
    school_list = []
    with open(path) as f:
        next(f)
        next(f)
        for school in csv.DictReader(f):
            school_list.append(school)

    return school_list[:-2]
36
+ COMPREHENSION 
def parse_to_school_list(path='schools.csv'):
    with open(path) as f:
        next(f)
        next(f)
        school_list = [school for school in
                       csv.DictReader(f)][:-2]

    return school_list
37
+ PRETTY PRINT 
from pprint import pprint

pprint(parse_to_school_list(save_path))

# AWESOME!
38
PYTHONIC 
school_list = parse_to_school_list(save_path)

# hmmm ...

for school in school_list:
    print school['School Name']

# It is more Pythonic! :)

print [school['School Name'] for school in school_list]
39
GROUP BY 
from itertools import groupby

# You MUST sort it.
keyfunc = lambda school: school['County']
school_list.sort(key=keyfunc)

for county, schools in groupby(school_list, keyfunc):
    for school in schools:
        print '%s %r' % (county, school)
    print '---'
40
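Why must you sort first? `groupby` only merges neighbouring items that share a key. A minimal Python 3 sketch with a hypothetical three-row `school_list` shows the difference.

```python
from itertools import groupby

# Hypothetical rows standing in for school_list.
school_list = [
    {'School Name': 'A', 'County': 'Taipei'},
    {'School Name': 'B', 'County': 'Tainan'},
    {'School Name': 'C', 'County': 'Taipei'},
]

keyfunc = lambda school: school['County']

# Without sorting, groupby only merges neighbours, so Taipei shows up twice.
unsorted_counties = [county for county, _ in groupby(school_list, keyfunc)]

# Sorted by the same key, each county forms exactly one group.
grouped = {
    county: [school['School Name'] for school in schools]
    for county, schools in groupby(sorted(school_list, key=keyfunc), keyfunc)
}
print(unsorted_counties)
print(grouped)
```

The sort key and the `groupby` key should be the same function, as on the slide.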
DOCSTRING 
'''It contains some useful functions for parsing data
from government.'''

def save(url, path=None):
    '''It saves data from `url` to `path`.'''
    ...

--- Shell ---

$ pydoc csv_docstring
41
CLIME 
if __name__ == '__main__':
    import clime.now

--- shell ---

$ python csv_clime.py
usage: basename <p>
   or: parse-to-school-list <path>
   or: save [--path] <url>

It contains some useful functions for parsing data from
government.
42
DOC TIPS 
help(requests)

print dir(requests)

print '\n'.join(dir(requests))
43
YOUR TURN! 
44
HTML 
Have fun with the final crawler. ;) 
45
LXML 
import requests
from lxml import etree

content = requests.get('http://clbc.tw').content
root = etree.HTML(content)

print root
46
CACHE 
from os.path import exists

cache_path = 'cache.html'

if exists(cache_path):
    with open(cache_path) as f:
        content = f.read()
else:
    content = requests.get('http://clbc.tw').content
    with open(cache_path, 'w') as f:
        f.write(content)
47
SEARCHING 
head = root.find('head')
print head

head_children = head.getchildren()
print head_children

metas = head.findall('meta')
print metas

title_text = head.findtext('title')
print title_text
48
XPATH 
titles = root.xpath('/html/head/title')
print titles[0].text

title_texts = root.xpath('/html/head/title/text()')
print title_texts[0]

as_ = root.xpath('//a')
print as_
print [a.get('href') for a in as_]
49
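The `find`, `findall` and `findtext` methods follow the ElementTree API, so you can practise them with the standard library alone. A minimal Python 3 sketch, swapping in `xml.etree.ElementTree` for lxml and using a hypothetical, well-formed page; unlike `etree.HTML`, the standard parser rejects sloppy real-world HTML and supports only a subset of XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed page; real-world HTML usually needs lxml's etree.HTML.
page = (
    '<html><head><title>Hello</title><meta name="a"/><meta name="b"/></head>'
    '<body><a href="/one">1</a><p><a href="/two">2</a><a>no href</a></p></body></html>'
)
root = ET.fromstring(page)

head = root.find('head')
metas = head.findall('meta')
title_text = head.findtext('title')

# './/a' is the ElementTree spelling of the '//a' XPath on the slide.
hrefs = [a.get('href') for a in root.findall('.//a') if a.get('href')]
print(title_text, len(metas), hrefs)
```

For pages fetched from the Web, stay with lxml as on the slides; this is only a way to experiment offline.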
MD5 
from hashlib import md5

message = 'There should be one-- and preferably only one --obvious way to do it.'

print md5(message).hexdigest()

# Actually, it has nothing to do with HTML.
50
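In Python 3, `md5` accepts only bytes, so a `str` has to be encoded first. A minimal Python 3 sketch, assuming UTF-8 as the encoding:

```python
from hashlib import md5

message = 'There should be one-- and preferably only one --obvious way to do it.'

# Python 3: md5() takes bytes, so encode the str first.
digest = md5(message.encode('utf-8')).hexdigest()
print(digest)
print(len(digest))  # an MD5 hex digest is always 32 characters
```

The fixed-length, filename-safe digest is what makes it handy as a cache file name.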
DEF GET 
from os import makedirs
from os.path import exists, join

def get(url, cache_dir_path='cache/'):

    if not exists(cache_dir_path):
        makedirs(cache_dir_path)

    cache_path = join(cache_dir_path,
                      md5(url).hexdigest())

    ...
51
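The slide leaves the rest of `get` elided. One possible shape, as a Python 3 sketch combining the CACHE and MD5 slides: the name `get_cached` and the `fetch` parameter are my own assumptions (with requests you might pass `lambda u: requests.get(u).content`), and the demo uses a fake fetch so it runs offline.

```python
import os
import tempfile
from hashlib import md5


def get_cached(url, fetch, cache_dir_path='cache/'):
    """Return the bytes for `url`, calling `fetch(url)` only on a cache miss."""
    os.makedirs(cache_dir_path, exist_ok=True)
    cache_path = os.path.join(cache_dir_path,
                              md5(url.encode('utf-8')).hexdigest())
    if os.path.exists(cache_path):
        with open(cache_path, 'rb') as f:
            return f.read()
    content = fetch(url)
    with open(cache_path, 'wb') as f:
        f.write(content)
    return content


# Offline demo: a fake fetch that records how often it is called.
calls = []

def fake_fetch(url):
    calls.append(url)
    return b'<html></html>'

demo_cache = tempfile.mkdtemp()
first = get_cached('http://example.com/', fake_fetch, demo_cache)
second = get_cached('http://example.com/', fake_fetch, demo_cache)
print(first == second, len(calls))
```

Hashing the URL gives every page its own cache file without worrying about slashes or other characters that are awkward in file names.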
DEF FIND_URLS 
def find_urls(content):
    root = etree.HTML(content)
    return [
        a.attrib['href'] for a in root.xpath('//a')
        if 'href' in a.attrib
    ]
52
BFS 1/2 
NEW = 0
QUEUED = 1
VISITED = 2

def search_urls(url):

    url_queue = [url]
    url_state_map = {url: QUEUED}

    while url_queue:

        url = url_queue.pop(0)
        print url
53
BFS 2/2 
        # continued from the previous slide
        try:
            found_urls = find_urls(get(url))
        except Exception as e:
            url_state_map[url] = e
            print 'Exception: %s' % e
        except KeyboardInterrupt as e:
            return url_state_map
        else:
            for found_url in found_urls:
                if not url_state_map.get(found_url, NEW):
                    url_queue.append(found_url)
                    url_state_map[found_url] = QUEUED
            url_state_map[url] = VISITED
54
DEQUE 
from collections import deque
...

def search_urls(url):
    url_queue = deque([url])
    ...
    while url_queue:

        url = url_queue.popleft()
        print url
        ...
55
YIELD 
...

def search_urls(url):
    ...
    while url_queue:

        url = url_queue.pop(0)
        yield url
        ...
        except KeyboardInterrupt, e:
            print url_state_map
            return
        ...
56
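To see the `deque` and `yield` versions working together without touching the network, here is a minimal Python 3 sketch. The `find_links` parameter stands in for `find_urls(get(url))`, the `links` dict is a hypothetical four-page site, and a plain set replaces the state map from the slides.

```python
from collections import deque


def search_urls(start, find_links):
    """Yield URLs in breadth-first order, visiting each one only once."""
    url_queue = deque([start])
    seen = {start}                    # everything queued or visited so far
    while url_queue:
        url = url_queue.popleft()     # O(1), unlike list.pop(0)
        yield url
        for found_url in find_links(url):
            if found_url not in seen:
                seen.add(found_url)
                url_queue.append(found_url)


# Hypothetical site: each page maps to the links found on it.
links = {'a': ['b', 'c'], 'b': ['c', 'd'], 'c': ['a'], 'd': []}
order = list(search_urls('a', lambda url: links.get(url, [])))
print(order)
```

Because it is a generator, the caller decides when to stop, for example by breaking out of a `for` loop after the first hundred URLs.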
YOUR TURN! 
57
SQL 
How about saving the CSV file into a db? 
58
TABLE 
CREATE TABLE schools (
    id TEXT PRIMARY KEY,
    name TEXT,
    county TEXT,
    address TEXT,
    phone TEXT,
    url TEXT,
    type TEXT
);

DROP TABLE schools;
59
CRUD 
INSERT INTO schools (id, name) VALUES ('1', 'The First');
INSERT INTO schools VALUES (...);

SELECT * FROM schools WHERE id='1';
SELECT name FROM schools WHERE id='1';

UPDATE schools SET id='10' WHERE id='1';

DELETE FROM schools WHERE id='10';
60
COMMON PATTERN 
import sqlite3

db_path = 'schools.db'
conn = sqlite3.connect(db_path)
cur = conn.cursor()

cur.execute('''CREATE TABLE schools (
    ...
)''')
conn.commit()

cur.close()
conn.close()
61
ROLLBACK 
...

try:
    cur.execute('...')
except:
    conn.rollback()
    raise
else:
    conn.commit()

...
62
PARAMETERIZE QUERY 
...

rows = ...

for row in rows:
    cur.execute('INSERT INTO schools VALUES (?, ?, ?, ?, ?, ?, ?)', row)

conn.commit()

...
63
EXECUTEMANY 
...

rows = ...

cur.executemany('INSERT INTO schools VALUES (?, ?, ?, ?, ?, ?, ?)', rows)

conn.commit()

...
64
FETCH 
...
cur.execute('select * from schools')

print cur.fetchone()

# or
print cur.fetchall()

# or
for row in cur:
    print row
...
65
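The last few slides fit together into one short script. A minimal Python 3 sketch, assuming an in-memory database and a cut-down three-column table with two hypothetical rows, so nothing is left on disk:

```python
import sqlite3

conn = sqlite3.connect(':memory:')   # in-memory db, nothing is written to disk
cur = conn.cursor()
cur.execute('CREATE TABLE schools (id TEXT PRIMARY KEY, name TEXT, county TEXT)')

# Hypothetical rows; the real table on the slides has seven columns.
rows = [('1', 'The First', 'Taipei'), ('2', 'The Second', 'Tainan')]

try:
    cur.executemany('INSERT INTO schools VALUES (?, ?, ?)', rows)
except Exception:
    conn.rollback()
    raise
else:
    conn.commit()

# Parameters are always passed separately, never formatted into the SQL.
cur.execute('SELECT name FROM schools WHERE county = ?', ('Tainan',))
one = cur.fetchone()

cur.execute('SELECT id, name FROM schools ORDER BY id')
everything = cur.fetchall()
print(one, everything)

cur.close()
conn.close()
```

Letting the driver fill in the `?` placeholders is what protects the query from SQL injection and from quoting mistakes.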
TEXT FACTORY 
# SQLite only: Lets you pass 8-bit strings as parameters.

...

conn = sqlite3.connect(db_path)
conn.text_factory = str

...
66
ROW FACTORY 
# SQLite only: Lets you convert each row tuple into a dict.
# It is `DictCursor` in some other connectors.

def dict_factory(cursor, row):
    d = {}
    for idx, col in enumerate(cursor.description):
        d[col[0]] = row[idx]
    return d

...
conn.row_factory = dict_factory
...
67
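To watch the row factory in action, here is a minimal Python 3 sketch, assuming an in-memory database and a hypothetical two-column table with one row:

```python
import sqlite3


def dict_factory(cursor, row):
    """Map column names (first item of each description entry) to values."""
    return {col[0]: row[idx] for idx, col in enumerate(cursor.description)}


conn = sqlite3.connect(':memory:')
conn.row_factory = dict_factory      # applies to cursors created afterwards
conn.execute('CREATE TABLE schools (id TEXT PRIMARY KEY, name TEXT)')
conn.execute('INSERT INTO schools VALUES (?, ?)', ('1', 'The First'))

school = conn.execute('SELECT * FROM schools').fetchone()
print(school['name'])
conn.close()
```

The standard library also ships `sqlite3.Row`, which can be assigned to `row_factory` when access by column name is all you need.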
MORE 
• Python DB API 2.0 
• MySQLdb - MySQL connector for Python 
• Psycopg2 - PostgreSQL adapter for Python 
• SQLAlchemy - the Python SQL toolkit and ORM 
• MoSQL - Build SQL from common Python data structures. 
68
THE END 
• You learned how to ... 
• make an HTTP request 
• load a CSV file 
• parse an HTML file 
• write a Web crawler 
• use SQL with SQLite 
• and a lot of other techniques today. ;) 
69

More Related Content

What's hot

Sphinx autodoc - automated api documentation - PyCon.KR 2015
Sphinx autodoc - automated api documentation - PyCon.KR 2015Sphinx autodoc - automated api documentation - PyCon.KR 2015
Sphinx autodoc - automated api documentation - PyCon.KR 2015
Takayuki Shimizukawa
 
Happy Go Programming
Happy Go ProgrammingHappy Go Programming
Happy Go Programming
Lin Yo-An
 
pyconjp2015_talk_Translation of Python Program__
pyconjp2015_talk_Translation of Python Program__pyconjp2015_talk_Translation of Python Program__
pyconjp2015_talk_Translation of Python Program__
Renyuan Lyu
 
OSCON2014 : Quick Introduction to System Tools Programming with Go
OSCON2014 : Quick Introduction to System Tools Programming with GoOSCON2014 : Quick Introduction to System Tools Programming with Go
OSCON2014 : Quick Introduction to System Tools Programming with Go
Chris McEniry
 
Infrastructure as code might be literally impossible
Infrastructure as code might be literally impossibleInfrastructure as code might be literally impossible
Infrastructure as code might be literally impossible
ice799
 
Introduction to Programming in Go
Introduction to Programming in GoIntroduction to Programming in Go
Introduction to Programming in Go
Amr Hassan
 
Happy Go Programming Part 1
Happy Go Programming Part 1Happy Go Programming Part 1
Happy Go Programming Part 1Lin Yo-An
 
Reversing the dropbox client on windows
Reversing the dropbox client on windowsReversing the dropbox client on windows
Reversing the dropbox client on windows
extremecoders
 
Sphinx autodoc - automated api documentation - PyCon.MY 2015
Sphinx autodoc - automated api documentation - PyCon.MY 2015Sphinx autodoc - automated api documentation - PyCon.MY 2015
Sphinx autodoc - automated api documentation - PyCon.MY 2015
Takayuki Shimizukawa
 
Lock? We don't need no stinkin' locks!
Lock? We don't need no stinkin' locks!Lock? We don't need no stinkin' locks!
Lock? We don't need no stinkin' locks!Michael Barker
 
Dive into Pinkoi 2013
Dive into Pinkoi 2013Dive into Pinkoi 2013
Dive into Pinkoi 2013
Mosky Liu
 
Easy contributable internationalization process with Sphinx @ pyconmy2015
Easy contributable internationalization process with Sphinx @ pyconmy2015Easy contributable internationalization process with Sphinx @ pyconmy2015
Easy contributable internationalization process with Sphinx @ pyconmy2015
Takayuki Shimizukawa
 
PyCon 2013 : Scripting to PyPi to GitHub and More
PyCon 2013 : Scripting to PyPi to GitHub and MorePyCon 2013 : Scripting to PyPi to GitHub and More
PyCon 2013 : Scripting to PyPi to GitHub and MoreMatt Harrison
 
Php extensions
Php extensionsPhp extensions
Php extensions
Elizabeth Smith
 
Why Python (for Statisticians)
Why Python (for Statisticians)Why Python (for Statisticians)
Why Python (for Statisticians)
Matt Harrison
 
Python教程 / Python tutorial
Python教程 / Python tutorialPython教程 / Python tutorial
Python教程 / Python tutorial
ee0703
 
Painless Data Storage with MongoDB & Go
Painless Data Storage with MongoDB & Go Painless Data Storage with MongoDB & Go
Painless Data Storage with MongoDB & Go
Steven Francia
 
Writing Fast Code (JP) - PyCon JP 2015
Writing Fast Code (JP) - PyCon JP 2015Writing Fast Code (JP) - PyCon JP 2015
Writing Fast Code (JP) - PyCon JP 2015
Younggun Kim
 
Sphinx autodoc - automated API documentation (PyCon APAC 2015 in Taiwan)
Sphinx autodoc - automated API documentation (PyCon APAC 2015 in Taiwan)Sphinx autodoc - automated API documentation (PyCon APAC 2015 in Taiwan)
Sphinx autodoc - automated API documentation (PyCon APAC 2015 in Taiwan)
Takayuki Shimizukawa
 
Clean Manifests with Puppet::Tidy
Clean Manifests with Puppet::TidyClean Manifests with Puppet::Tidy
Clean Manifests with Puppet::Tidy
Puppet
 

What's hot (20)

Sphinx autodoc - automated api documentation - PyCon.KR 2015
Sphinx autodoc - automated api documentation - PyCon.KR 2015Sphinx autodoc - automated api documentation - PyCon.KR 2015
Sphinx autodoc - automated api documentation - PyCon.KR 2015
 
Happy Go Programming
Happy Go ProgrammingHappy Go Programming
Happy Go Programming
 
pyconjp2015_talk_Translation of Python Program__
pyconjp2015_talk_Translation of Python Program__pyconjp2015_talk_Translation of Python Program__
pyconjp2015_talk_Translation of Python Program__
 
OSCON2014 : Quick Introduction to System Tools Programming with Go
OSCON2014 : Quick Introduction to System Tools Programming with GoOSCON2014 : Quick Introduction to System Tools Programming with Go
OSCON2014 : Quick Introduction to System Tools Programming with Go
 
Infrastructure as code might be literally impossible
Infrastructure as code might be literally impossibleInfrastructure as code might be literally impossible
Infrastructure as code might be literally impossible
 
Introduction to Programming in Go
Introduction to Programming in GoIntroduction to Programming in Go
Introduction to Programming in Go
 
Happy Go Programming Part 1
Happy Go Programming Part 1Happy Go Programming Part 1
Happy Go Programming Part 1
 
Reversing the dropbox client on windows
Reversing the dropbox client on windowsReversing the dropbox client on windows
Reversing the dropbox client on windows
 
Sphinx autodoc - automated api documentation - PyCon.MY 2015
Sphinx autodoc - automated api documentation - PyCon.MY 2015Sphinx autodoc - automated api documentation - PyCon.MY 2015
Sphinx autodoc - automated api documentation - PyCon.MY 2015
 
Lock? We don't need no stinkin' locks!
Lock? We don't need no stinkin' locks!Lock? We don't need no stinkin' locks!
Lock? We don't need no stinkin' locks!
 
Dive into Pinkoi 2013
Dive into Pinkoi 2013Dive into Pinkoi 2013
Dive into Pinkoi 2013
 
Easy contributable internationalization process with Sphinx @ pyconmy2015
Easy contributable internationalization process with Sphinx @ pyconmy2015Easy contributable internationalization process with Sphinx @ pyconmy2015
Easy contributable internationalization process with Sphinx @ pyconmy2015
 
PyCon 2013 : Scripting to PyPi to GitHub and More
PyCon 2013 : Scripting to PyPi to GitHub and MorePyCon 2013 : Scripting to PyPi to GitHub and More
PyCon 2013 : Scripting to PyPi to GitHub and More
 
Php extensions
Php extensionsPhp extensions
Php extensions
 
Why Python (for Statisticians)
Why Python (for Statisticians)Why Python (for Statisticians)
Why Python (for Statisticians)
 
Python教程 / Python tutorial
Python教程 / Python tutorialPython教程 / Python tutorial
Python教程 / Python tutorial
 
Painless Data Storage with MongoDB & Go
Painless Data Storage with MongoDB & Go Painless Data Storage with MongoDB & Go
Painless Data Storage with MongoDB & Go
 
Writing Fast Code (JP) - PyCon JP 2015
Writing Fast Code (JP) - PyCon JP 2015Writing Fast Code (JP) - PyCon JP 2015
Writing Fast Code (JP) - PyCon JP 2015
 
Sphinx autodoc - automated API documentation (PyCon APAC 2015 in Taiwan)
Sphinx autodoc - automated API documentation (PyCon APAC 2015 in Taiwan)Sphinx autodoc - automated API documentation (PyCon APAC 2015 in Taiwan)
Sphinx autodoc - automated API documentation (PyCon APAC 2015 in Taiwan)
 
Clean Manifests with Puppet::Tidy
Clean Manifests with Puppet::TidyClean Manifests with Puppet::Tidy
Clean Manifests with Puppet::Tidy
 

Viewers also liked

Programming with Python - Adv.
Programming with Python - Adv.Programming with Python - Adv.
Programming with Python - Adv.
Mosky Liu
 
Boost Maintainability
Boost MaintainabilityBoost Maintainability
Boost Maintainability
Mosky Liu
 
Beyond the Style Guides
Beyond the Style GuidesBeyond the Style Guides
Beyond the Style Guides
Mosky Liu
 
Network theory - PyCon 2015
Network theory - PyCon 2015Network theory - PyCon 2015
Network theory - PyCon 2015
Sarah Guido
 
ZIPCodeTW: Find Taiwan ZIP Code by Address Fuzzily
ZIPCodeTW: Find Taiwan ZIP Code by Address FuzzilyZIPCodeTW: Find Taiwan ZIP Code by Address Fuzzily
ZIPCodeTW: Find Taiwan ZIP Code by Address Fuzzily
Mosky Liu
 
Intro to Python Data Analysis in Wakari
Intro to Python Data Analysis in WakariIntro to Python Data Analysis in Wakari
Intro to Python Data Analysis in Wakari
Karissa Rae McKelvey
 
PyData: The Next Generation
PyData: The Next GenerationPyData: The Next Generation
PyData: The Next Generation
Wes McKinney
 
ログ分析のある生活(概要編)
ログ分析のある生活(概要編)ログ分析のある生活(概要編)
ログ分析のある生活(概要編)
Masakazu Kishima
 
Parse The Web Using Python+Beautiful Soup
Parse The Web Using Python+Beautiful SoupParse The Web Using Python+Beautiful Soup
Parse The Web Using Python+Beautiful SoupJim Chang
 
Analyzing Data With Python
Analyzing Data With PythonAnalyzing Data With Python
Analyzing Data With Python
Sarah Guido
 
pandas - Python Data Analysis
pandas - Python Data Analysispandas - Python Data Analysis
pandas - Python Data Analysis
Andrew Henshaw
 
ログ解析を支えるNoSQLの技術
ログ解析を支えるNoSQLの技術ログ解析を支えるNoSQLの技術
ログ解析を支えるNoSQLの技術Drecom Co., Ltd.
 
Pyladies Tokyo meet up #6
Pyladies Tokyo meet up #6Pyladies Tokyo meet up #6
Pyladies Tokyo meet up #6
Katayanagi Nobuko
 
MongoDBを用いたソーシャルアプリのログ解析 〜解析基盤構築からフロントUIまで、MongoDBを最大限に活用する〜
MongoDBを用いたソーシャルアプリのログ解析 〜解析基盤構築からフロントUIまで、MongoDBを最大限に活用する〜MongoDBを用いたソーシャルアプリのログ解析 〜解析基盤構築からフロントUIまで、MongoDBを最大限に活用する〜
MongoDBを用いたソーシャルアプリのログ解析 〜解析基盤構築からフロントUIまで、MongoDBを最大限に活用する〜Takahiro Inoue
 
2 5 1.一般化線形モデル色々_CPUE標準化
2 5 1.一般化線形モデル色々_CPUE標準化2 5 1.一般化線形モデル色々_CPUE標準化
2 5 1.一般化線形モデル色々_CPUE標準化
logics-of-blue
 
2 1.予測と確率分布
2 1.予測と確率分布2 1.予測と確率分布
2 1.予測と確率分布
logics-of-blue
 
サービス改善はログデータ分析から
サービス改善はログデータ分析からサービス改善はログデータ分析から
サービス改善はログデータ分析から
Kenta Suzuki
 
Graph Analyses with Python and NetworkX
Graph Analyses with Python and NetworkXGraph Analyses with Python and NetworkX
Graph Analyses with Python and NetworkX
Benjamin Bengfort
 
2 5 3.一般化線形モデル色々_Gamma回帰と対数線形モデル
2 5 3.一般化線形モデル色々_Gamma回帰と対数線形モデル2 5 3.一般化線形モデル色々_Gamma回帰と対数線形モデル
2 5 3.一般化線形モデル色々_Gamma回帰と対数線形モデル
logics-of-blue
 
2 5 2.一般化線形モデル色々_ロジスティック回帰
2 5 2.一般化線形モデル色々_ロジスティック回帰2 5 2.一般化線形モデル色々_ロジスティック回帰
2 5 2.一般化線形モデル色々_ロジスティック回帰
logics-of-blue
 

Viewers also liked (20)

Programming with Python - Adv.
Programming with Python - Adv.Programming with Python - Adv.
Programming with Python - Adv.
 
Boost Maintainability
Boost MaintainabilityBoost Maintainability
Boost Maintainability
 
Beyond the Style Guides
Beyond the Style GuidesBeyond the Style Guides
Beyond the Style Guides
 
Network theory - PyCon 2015
Network theory - PyCon 2015Network theory - PyCon 2015
Network theory - PyCon 2015
 
ZIPCodeTW: Find Taiwan ZIP Code by Address Fuzzily
ZIPCodeTW: Find Taiwan ZIP Code by Address FuzzilyZIPCodeTW: Find Taiwan ZIP Code by Address Fuzzily
ZIPCodeTW: Find Taiwan ZIP Code by Address Fuzzily
 
Intro to Python Data Analysis in Wakari
Intro to Python Data Analysis in WakariIntro to Python Data Analysis in Wakari
Intro to Python Data Analysis in Wakari
 
PyData: The Next Generation
PyData: The Next GenerationPyData: The Next Generation
PyData: The Next Generation
 
ログ分析のある生活(概要編)
ログ分析のある生活(概要編)ログ分析のある生活(概要編)
ログ分析のある生活(概要編)
 
Parse The Web Using Python+Beautiful Soup
Parse The Web Using Python+Beautiful SoupParse The Web Using Python+Beautiful Soup
Parse The Web Using Python+Beautiful Soup
 
Analyzing Data With Python
Analyzing Data With PythonAnalyzing Data With Python
Analyzing Data With Python
 
pandas - Python Data Analysis
pandas - Python Data Analysispandas - Python Data Analysis
pandas - Python Data Analysis
 
ログ解析を支えるNoSQLの技術
ログ解析を支えるNoSQLの技術ログ解析を支えるNoSQLの技術
ログ解析を支えるNoSQLの技術
 
Pyladies Tokyo meet up #6
Pyladies Tokyo meet up #6Pyladies Tokyo meet up #6
Pyladies Tokyo meet up #6
 
MongoDBを用いたソーシャルアプリのログ解析 〜解析基盤構築からフロントUIまで、MongoDBを最大限に活用する〜
MongoDBを用いたソーシャルアプリのログ解析 〜解析基盤構築からフロントUIまで、MongoDBを最大限に活用する〜MongoDBを用いたソーシャルアプリのログ解析 〜解析基盤構築からフロントUIまで、MongoDBを最大限に活用する〜
MongoDBを用いたソーシャルアプリのログ解析 〜解析基盤構築からフロントUIまで、MongoDBを最大限に活用する〜
 
2 5 1.一般化線形モデル色々_CPUE標準化
2 5 1.一般化線形モデル色々_CPUE標準化2 5 1.一般化線形モデル色々_CPUE標準化
2 5 1.一般化線形モデル色々_CPUE標準化
 
2 1.予測と確率分布
2 1.予測と確率分布2 1.予測と確率分布
2 1.予測と確率分布
 
サービス改善はログデータ分析から
サービス改善はログデータ分析からサービス改善はログデータ分析から
サービス改善はログデータ分析から
 
Graph Analyses with Python and NetworkX
Graph Analyses with Python and NetworkXGraph Analyses with Python and NetworkX
Graph Analyses with Python and NetworkX
 
2 5 3.一般化線形モデル色々_Gamma回帰と対数線形モデル
2 5 3.一般化線形モデル色々_Gamma回帰と対数線形モデル2 5 3.一般化線形モデル色々_Gamma回帰と対数線形モデル
2 5 3.一般化線形モデル色々_Gamma回帰と対数線形モデル
 
2 5 2.一般化線形モデル色々_ロジスティック回帰
2 5 2.一般化線形モデル色々_ロジスティック回帰2 5 2.一般化線形モデル色々_ロジスティック回帰
2 5 2.一般化線形モデル色々_ロジスティック回帰
 

Similar to Learning Python from Data

05 python.pdf
05 python.pdf05 python.pdf
05 python.pdf
SugumarSarDurai
 
Natural Language Processing sample code by Aiden
Natural Language Processing sample code by AidenNatural Language Processing sample code by Aiden
Natural Language Processing sample code by Aiden
Aiden Wu, FRM
 
TYPO3 8 is here - how we keep EXT:solr uptodate with the TYPO3 core
TYPO3 8 is here - how we keep EXT:solr uptodate with the TYPO3 coreTYPO3 8 is here - how we keep EXT:solr uptodate with the TYPO3 core
TYPO3 8 is here - how we keep EXT:solr uptodate with the TYPO3 core
timohund
 
Learn Python 3 for absolute beginners
Learn Python 3 for absolute beginnersLearn Python 3 for absolute beginners
Learn Python 3 for absolute beginners
KingsleyAmankwa
 
Python in 30 minutes!
Python in 30 minutes!Python in 30 minutes!
Python in 30 minutes!
Fariz Darari
 
PyCon Taiwan 2013 Tutorial
PyCon Taiwan 2013 TutorialPyCon Taiwan 2013 Tutorial
PyCon Taiwan 2013 Tutorial
Justin Lin
 
Mastering Python lesson3b_for_loops
Mastering Python lesson3b_for_loopsMastering Python lesson3b_for_loops
Mastering Python lesson3b_for_loops
Ruth Marvin
 
Using Flow-based programming to write tools and workflows for Scientific Comp...
Using Flow-based programming to write tools and workflows for Scientific Comp...Using Flow-based programming to write tools and workflows for Scientific Comp...
Using Flow-based programming to write tools and workflows for Scientific Comp...
Samuel Lampa
 
MobileConf 2021 Slides: Let's build macOS CLI Utilities using Swift
MobileConf 2021 Slides:  Let's build macOS CLI Utilities using SwiftMobileConf 2021 Slides:  Let's build macOS CLI Utilities using Swift
MobileConf 2021 Slides: Let's build macOS CLI Utilities using Swift
Diego Freniche Brito
 
How to not blow up spaceships
How to not blow up spaceshipsHow to not blow up spaceships
How to not blow up spaceships
Sabin Marcu
 
python into.pptx
python into.pptxpython into.pptx
python into.pptx
Punithavel Ramani
 
TypeScript와 Flow: 
자바스크립트 개발에 정적 타이핑 도입하기
TypeScript와 Flow: 
자바스크립트 개발에 정적 타이핑 도입하기TypeScript와 Flow: 
자바스크립트 개발에 정적 타이핑 도입하기
TypeScript와 Flow: 
자바스크립트 개발에 정적 타이핑 도입하기
Heejong Ahn
 
Using Buildout to Develop and Deploy Python Projects
Using Buildout to Develop and Deploy Python ProjectsUsing Buildout to Develop and Deploy Python Projects
Using Buildout to Develop and Deploy Python Projects
Clayton Parker
 
Advanced Python Tutorial | Learn Advanced Python Concepts | Python Programmin...
Advanced Python Tutorial | Learn Advanced Python Concepts | Python Programmin...Advanced Python Tutorial | Learn Advanced Python Concepts | Python Programmin...
Advanced Python Tutorial | Learn Advanced Python Concepts | Python Programmin...
Edureka!
 
python-160403194316.pdf
Learning Python from Data

  • 1. LEARNING PYTHON FROM DATA Mosky
  • 2. THIS SLIDE
      • The online version is at https://speakerdeck.com/mosky/learning-python-from-data.
      • The examples are at https://github.com/moskytw/learning-python-from-data-examples.
  • 8. MOSKY
      • I am working at Pinkoi.
      • I've taught Python for 100+ hours.
      • A speaker at COSCUP 2014, PyCon SG 2014, PyCon APAC 2014, OSDC 2014, PyCon APAC 2013, COSCUP 2014, ...
      • The author of the Python packages: MoSQL, Clime, ZIPCodeTW, ...
      • http://mosky.tw/
  • 15. SCHEDULE
      • Warm-up
      • Packages - Install the packages we need.
      • CSV - Download a CSV from the Internet and handle it.
      • HTML - Parse HTML source code and write a Web crawler.
      • SQL - Save data into a SQLite database.
      • The End
  • 23. 2 OR 3?
      • Use Python 3!
      • But it actually depends on the libs you need.
      • https://python3wos.appspot.com/
      • We will go ahead with Python 2.7, but I will also introduce the changes in Python 3.
  • 26. THE ONLINE RESOURCES
      • The Python Official Doc
      • http://docs.python.org
      • The Python Tutorial
      • The Python Standard Library
      • My Past Slides
      • Programming with Python - Basic
      • Programming with Python - Adv.
  • 30. THE BOOKS
      • Learning Python by Mark Lutz
      • Programming in Python 3 by Mark Summerfield
      • Python Essential Reference by David Beazley
  • 34. PREPARATION
      • Did you say "hello" to Python?
      • If not, visit
      • http://www.slideshare.net/moskytw/programming-with-python-basic.
      • If yes, open your Python shell.
  • 35. WARM-UP The things you must know.
  • 36. MATH & VARS
      2 + 3
      2 - 3
      2 * 3
      2 / 3, -2 / 3
      (1+10)*10 / 2
      2.0 / 3
      2 % 3
      2 ** 3
      x = 2
      y = 3
      z = x + y
      print z
      '#' * 10
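The slide above runs on Python 2.7, where `/` between two integers floors the result. As a rough sketch of how the same expressions behave on Python 3 (the variable names here are only illustrative), `/` always gives a float and `//` is the flooring operator:

```python
# Python 3 sketch: `/` is true division, `//` is floor division.
true_div = 2 / 3             # a float in Python 3 (an int, 0, in Python 2)
floor_div = 2 // 3           # floor division, same result in both versions
neg_floor = -2 // 3          # floors toward negative infinity
total = (1 + 10) * 10 // 2   # sum of 1..10, kept as an int
banner = '#' * 10            # string repetition is unchanged

# print is a function in Python 3.
print(true_div, floor_div, neg_floor, total, banner)
```

Writing `//` wherever integer division is intended keeps a script behaving the same on both versions.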
  • 37. FOR
      for i in [0, 1, 2, 3, 4]:
          print i

      items = [0, 1, 2, 3, 4]
      for i in items:
          print i

      for i in range(5):
          print i

      chars = 'SAHFI'
      for i, c in enumerate(chars):
          print i, c

      words = ('Samsung', 'Apple', 'HP', 'Foxconn', 'IBM')
      for c, w in zip(chars, words):
          print c, w
  • 38. IF
      for i in range(1, 10):
          if i % 2 == 0:
              print '{} is divisible by 2'.format(i)
          elif i % 3 == 0:
              print '{} is divisible by 3'.format(i)
          else:
              print '{} is not divisible by 2 nor 3'.format(i)
  • 39. WHILE
      while 1:
          n = int(raw_input('How big a pyramid do you want? '))
          if n <= 0:
              print 'It must be greater than 0: {}'.format(n)
              continue
          break
  • 40. TRY
      while 1:

          try:
              n = int(raw_input('How big a pyramid do you want? '))
          except ValueError as e:
              print 'It must be a number: {}'.format(e)
              continue

          if n <= 0:
              print 'It must be greater than 0: {}'.format(n)
              continue

          break
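A possible Python 3 take on the validation loop above, pulled into a small function so it can be tried without typing at a prompt. The name `read_size` is an assumption made for this sketch and is not part of the slides; in Python 3, `raw_input` is spelled `input`.

```python
def read_size(answer):
    """Return a positive int parsed from `answer`, or None if it is invalid."""
    try:
        n = int(answer)
    except ValueError as e:
        print('It must be a number: {}'.format(e))
        return None
    if n <= 0:
        print('It must be greater than 0: {}'.format(n))
        return None
    return n

# In an interactive script, the loop could look like:
#     while True:
#         n = read_size(input('How big a pyramid do you want? '))
#         if n is not None:
#             break
print(read_size('5'), read_size('abc'), read_size('-1'))
```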
  • 41. LOOP ... ELSE
      for n in range(2, 100):
          for i in range(2, n):
              if n % i == 0:
                  break
          else:
              print '{} is a prime!'.format(n)
  • 42. A PYRAMID
               *
              ***
             *****
            *******
           *********
          ***********
         *************
        ***************
       *****************
      *******************
  • 43. A FATTER PYRAMID
               *
             *****
           *********
         *************
      *******************
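The slides show only the pyramid output, so the following is one possible way to produce such shapes in Python 3, not necessarily the author's solution. The function name `pyramid` and its `step` parameter are assumptions for illustration.

```python
def pyramid(rows, step=2):
    """Return the lines of a centered pyramid; each row is `step` stars wider."""
    widths = [1 + step * i for i in range(rows)]
    full = widths[-1]  # the bottom row sets the overall width
    # Center each row on the bottom row, then drop the trailing padding.
    return [('*' * w).center(full).rstrip() for w in widths]

print('\n'.join(pyramid(10)))          # ten rows, two more stars per row
print('\n'.join(pyramid(5, step=4)))   # a fatter variant, four more stars per row
```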
  • 45. LIST COMPREHENSION
      [
          n for n in range(2, 100)
          if not any(n % i == 0 for i in range(2, n))
      ]
  • 46. PACKAGES import is important.
  • 51. GET PIP - UN*X
      • Debian family
      • # apt-get install python-pip
      • Red Hat family
      • # yum install python-pip
      • Mac OS X
      • # easy_install pip
  • 55. GET PIP - WIN *
      • Follow the steps in http://stackoverflow.com/questions/4750806/how-to-install-pip-on-windows.
      • Or just use easy_install to install it. The easy_install command should be found at C:\Python27\Scripts.
      • Or find the Windows installer on the Python Package Index.
  • 60. 3RD-PARTY PACKAGES
      • requests - Python HTTP for Humans
      • lxml - Pythonic XML processing library
      • uniout - Print the object representation in readable chars.
      • clime - Convert module into a CLI program w/o any config.
  • 62. CSV Let's start by making an HTTP request!
  • 63. HTTP GET
      import requests

      #url = 'http://stats.moe.gov.tw/files/school/101/u1_new.csv'
      url = 'https://raw.github.com/moskytw/learning-python-from-data-examples/master/sql/schools.csv'

      print requests.get(url).content
      #print requests.get(url).text
  • 64. FILE
      save_path = 'school_list.csv'

      with open(save_path, 'w') as f:
          f.write(requests.get(url).content)

      with open(save_path) as f:
          print f.read()

      with open(save_path) as f:
          for line in f:
              print line,
  • 65. DEF
      from os.path import basename

      def save(url, path=None):
          if not path:
              path = basename(url)
          with open(path, 'w') as f:
              f.write(requests.get(url).content)
  • 66. CSV
      import csv
      from os.path import exists

      if not exists(save_path):
          save(url, save_path)

      with open(save_path) as f:
          for row in csv.reader(f):
              print row
  • 67. + UNIOUT
      import csv
      from os.path import exists
      import uniout # You want this!

      if not exists(save_path):
          save(url, save_path)

      with open(save_path) as f:
          for row in csv.reader(f):
              print row
  • 68. NEXT
      with open(save_path) as f:
          next(f) # skip the unwanted lines
          next(f)
          for row in csv.reader(f):
              print row
  • 69. DICT READER
      with open(save_path) as f:
          next(f)
          next(f)
          for row in csv.DictReader(f):
              print row

      # We now have a great output. :)
  • 70. DEF AGAIN
      def parse_to_school_list(path):
          school_list = []
          with open(path) as f:
              next(f)
              next(f)
              for school in csv.DictReader(f):
                  school_list.append(school)

          return school_list[:-2]
  • 71. + COMPREHENSION
      def parse_to_school_list(path='schools.csv'):
          with open(path) as f:
              next(f)
              next(f)
              school_list = [school for school in csv.DictReader(f)][:-2]

          return school_list
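To try the parsing steps above without downloading anything, here is a minimal Python 3 sketch that feeds `csv.DictReader` from an in-memory file. The `SAMPLE` text is made up for illustration; it only mimics the shape the slides imply (two unwanted lines, a header row, data rows, and two trailing lines to drop). Unlike the slide version, this variant takes an open file object rather than a path.

```python
import csv
import io

# Hypothetical sample shaped like the slides' file: two preamble lines,
# a header row, the data rows, then two trailing note lines.
SAMPLE = (
    'Title line\n'
    'Updated line\n'
    'School Name,County\n'
    'Alpha University,Taipei\n'
    'Beta College,Tainan\n'
    'Note 1\n'
    'Note 2\n'
)

def parse_to_school_list(f):
    next(f)  # skip the two unwanted preamble lines
    next(f)
    # DictReader uses the next line as the header; [:-2] drops the notes.
    return [dict(school) for school in csv.DictReader(f)][:-2]

school_list = parse_to_school_list(io.StringIO(SAMPLE))
print([school['School Name'] for school in school_list])
```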
  • 72. + PRETTY PRINT
      from pprint import pprint

      pprint(parse_to_school_list(save_path))

      # AWESOME!
  • 73. PYTHONIC
      school_list = parse_to_school_list(save_path)

      # hmmm ...
      for school in school_list:
          print school['School Name']

      # It is more Pythonic! :)
      print [school['School Name'] for school in school_list]
  • 74. GROUP BY
      from itertools import groupby

      # You MUST sort it.
      keyfunc = lambda school: school['County']
      school_list.sort(key=keyfunc)

      for county, schools in groupby(school_list, keyfunc):
          for school in schools:
              print '%s %r' % (county, school)
          print '---'
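A small Python 3 sketch of the same sort-then-group idea, using made-up school records so it runs on its own. It collects the names per county into a dict instead of printing them.

```python
from itertools import groupby

# Hypothetical records; only the two column names come from the slides.
school_list = [
    {'School Name': 'Alpha University', 'County': 'Taipei'},
    {'School Name': 'Beta College', 'County': 'Tainan'},
    {'School Name': 'Gamma Institute', 'County': 'Taipei'},
]

def keyfunc(school):
    return school['County']

# groupby() only merges adjacent items, so sort by the same key first.
school_list.sort(key=keyfunc)
names_by_county = {
    county: [school['School Name'] for school in schools]
    for county, schools in groupby(school_list, keyfunc)
}
print(names_by_county)
```

Without the sort, the two Taipei records would land in separate groups and the later group would overwrite the earlier one in the dict.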
  • 75. DOCSTRING
      '''It contains some useful functions for parsing data from the government.'''

      def save(url, path=None):
          '''It saves data from `url` to `path`.'''
          ...

      --- shell ---
      $ pydoc csv_docstring
  • 76. CLIME
      if __name__ == '__main__':
          import clime.now

      --- shell ---
      $ python csv_clime.py
      usage: basename <p>
         or: parse-to-school-list <path>
         or: save [--path] <url>

      It contains some useful functions for parsing data from the government.
  • 77. DOC TIPS
      help(requests)

      print dir(requests)

      print '\n'.join(dir(requests))
  • 79. HTML Have fun with the final crawler. ;)
  • 80. LXML
      import requests
      from lxml import etree

      content = requests.get('http://clbc.tw').content
      root = etree.HTML(content)

      print root
  • 81. CACHE
      from os.path import exists

      cache_path = 'cache.html'

      if exists(cache_path):
          with open(cache_path) as f:
              content = f.read()
      else:
          content = requests.get('http://clbc.tw').content
          with open(cache_path, 'w') as f:
              f.write(content)
  • 82. SEARCHING
      head = root.find('head')
      print head

      head_children = head.getchildren()
      print head_children

      metas = head.findall('meta')
      print metas

      title_text = head.findtext('title')
      print title_text
  • 83. XPATH
      titles = root.xpath('/html/head/title')
      print titles[0].text

      title_texts = root.xpath('/html/head/title/text()')
      print title_texts[0]

      as_ = root.xpath('//a')
      print as_
      print [a.get('href') for a in as_]
  • 84. MD5
      from hashlib import md5

      message = 'There should be one-- and preferably only one --obvious way to do it.'

      print md5(message).hexdigest()

      # Actually, it has nothing to do with HTML.
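On Python 3 the hash functions accept bytes rather than text, so the slide's call needs an explicit encode step. A minimal sketch, using the URL from the earlier slides as the value to hash; the name `cache_name` is only illustrative:

```python
from hashlib import md5

url = 'http://clbc.tw'
# Python 3: hash functions accept bytes, so encode the text first.
cache_name = md5(url.encode('utf-8')).hexdigest()
print(cache_name)
```

The digest is a fixed-length hex string, which is why it works as a file name for a URL of any length.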
  • 85. DEF GET
      from os import makedirs
      from os.path import exists, join

      def get(url, cache_dir_path='cache/'):

          if not exists(cache_dir_path):
              makedirs(cache_dir_path)

          cache_path = join(cache_dir_path, md5(url).hexdigest())

          ...
  • 86. DEF FIND_URLS
      def find_urls(content):
          root = etree.HTML(content)
          return [
              a.attrib['href'] for a in root.xpath('//a')
              if 'href' in a.attrib
          ]
  • 87. BFS 1/2
      NEW = 0
      QUEUED = 1
      VISITED = 2

      def search_urls(url):

          url_queue = [url]
          url_state_map = {url: QUEUED}

          while url_queue:

              url = url_queue.pop(0)
              print url
  • 88. BFS 2/2
              # continue the previous page
              try:
                  found_urls = find_urls(get(url))
              except Exception, e:
                  url_state_map[url] = e
                  print 'Exception: %s' % e
              except KeyboardInterrupt, e:
                  return url_state_map
              else:
                  for found_url in found_urls:
                      if not url_state_map.get(found_url, NEW):
                          url_queue.append(found_url)
                          url_state_map[found_url] = QUEUED
                  url_state_map[url] = VISITED
  • 89. DEQUE
      from collections import deque
      ...

      def search_urls(url):
          url_queue = deque([url])
          ...
          while url_queue:

              url = url_queue.popleft()
              print url
              ...
  • 90. YIELD
      ...

      def search_urls(url):
          ...
          while url_queue:

              url = url_queue.pop(0)
              yield url
              ...
              except KeyboardInterrupt, e:
                  print url_state_map
                  return
              ...
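The BFS, deque, and yield slides can be combined into one short Python 3 sketch. To keep it runnable offline, the network and HTML parsing are replaced by a made-up `LINKS` table, and `find_urls` here takes a URL instead of page content; both are assumptions for illustration, not the author's code.

```python
from collections import deque

# Hypothetical link graph standing in for real pages.
LINKS = {
    'a': ['b', 'c'],
    'b': ['a', 'd'],
    'c': ['d'],
    'd': [],
}

def find_urls(url):
    """Return the links found on `url` (looked up in the fake graph)."""
    return LINKS.get(url, [])

def search_urls(start):
    """Yield URLs reachable from `start` in breadth-first order."""
    queue = deque([start])
    seen = {start}  # everything queued or visited so far
    while queue:
        url = queue.popleft()  # O(1), unlike list.pop(0)
        yield url
        for found_url in find_urls(url):
            if found_url not in seen:
                seen.add(found_url)
                queue.append(found_url)

print(list(search_urls('a')))
```

Because `search_urls` is a generator, a caller can stop after the first few URLs without crawling the rest.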
  • 92. SQL How about saving the CSV file into a db?
  • 93. TABLE
      CREATE TABLE schools (
          id TEXT PRIMARY KEY,
          name TEXT,
          county TEXT,
          address TEXT,
          phone TEXT,
          url TEXT,
          type TEXT
      );

      DROP TABLE schools;
  • 94. CRUD
      INSERT INTO schools (id, name) VALUES ('1', 'The First');
      INSERT INTO schools VALUES (...);

      SELECT * FROM schools WHERE id='1';
      SELECT name FROM schools WHERE id='1';

      UPDATE schools SET id='10' WHERE id='1';

      DELETE FROM schools WHERE id='10';
  • 95. COMMON PATTERN
      import sqlite3

      db_path = 'schools.db'
      conn = sqlite3.connect(db_path)
      cur = conn.cursor()

      cur.execute('''CREATE TABLE schools (
          ...
      )''')
      conn.commit()

      cur.close()
      conn.close()
  • 96. ROLLBACK
      ...

      try:
          cur.execute('...')
      except:
          conn.rollback()
          raise
      else:
          conn.commit()

      ...
  • 97. PARAMETERIZE QUERY
      ...

      rows = ...

      for row in rows:
          cur.execute('INSERT INTO schools VALUES (?, ?, ?, ?, ?, ?, ?)', row)

      conn.commit()

      ...
  • 98. EXECUTEMANY
      ...

      rows = ...

      cur.executemany('INSERT INTO schools VALUES (?, ?, ?, ?, ?, ?, ?)', rows)

      conn.commit()

      ...
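The common pattern, rollback, and executemany slides fit together as in the Python 3 sketch below. It uses an in-memory database and a cut-down three-column table with made-up rows, so it leaves no file behind; the narrower schema is an assumption for brevity.

```python
import sqlite3

# Hypothetical rows for a cut-down three-column table.
rows = [
    ('1', 'The First', 'Taipei'),
    ('2', 'The Second', 'Tainan'),
]

conn = sqlite3.connect(':memory:')  # throwaway in-memory database
cur = conn.cursor()
cur.execute('CREATE TABLE schools (id TEXT PRIMARY KEY, name TEXT, county TEXT)')

try:
    # The ? placeholders let the driver quote the values safely.
    cur.executemany('INSERT INTO schools VALUES (?, ?, ?)', rows)
except sqlite3.Error:
    conn.rollback()  # undo the partial insert, then re-raise
    raise
else:
    conn.commit()

cur.execute('SELECT name FROM schools WHERE county = ?', ('Taipei',))
names = [name for (name,) in cur]
print(names)
conn.close()
```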
  • 99. FETCH
      ...
      cur.execute('select * from schools')

      print cur.fetchone()

      # or
      print cur.fetchall()

      # or
      for row in cur:
          print row
      ...
  • 100. TEXT FACTORY
      # SQLite only: lets you pass 8-bit strings as parameters.

      ...

      conn = sqlite3.connect(db_path)
      conn.text_factory = str

      ...
  • 101. ROW FACTORY
      # SQLite only: lets you convert a tuple into a dict. It is `DictCursor` in some other connectors.

      def dict_factory(cursor, row):
          d = {}
          for idx, col in enumerate(cursor.description):
              d[col[0]] = row[idx]
          return d

      ...
      conn.row_factory = dict_factory
      ...
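A runnable Python 3 sketch of the row factory idea, again on an in-memory database with a made-up two-column table. The factory is written as a dict comprehension, which should behave the same as the loop on the slide.

```python
import sqlite3

def dict_factory(cursor, row):
    # cursor.description holds one tuple per column; item 0 is the name.
    return {col[0]: value for col, value in zip(cursor.description, row)}

conn = sqlite3.connect(':memory:')
conn.row_factory = dict_factory  # set before creating cursors
conn.execute('CREATE TABLE schools (id TEXT PRIMARY KEY, name TEXT)')
conn.execute('INSERT INTO schools VALUES (?, ?)', ('1', 'The First'))
row = conn.execute('SELECT * FROM schools').fetchone()
print(row)
conn.close()
```

The standard library also ships `sqlite3.Row`, which can be assigned to `row_factory` to get rows addressable by column name without writing a factory.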
  • 107. MORE
      • Python DB API 2.0
      • MySQLdb - MySQL connector for Python
      • Psycopg2 - PostgreSQL adapter for Python
      • SQLAlchemy - the Python SQL toolkit and ORM
      • MoSQL - Build SQL from common Python data structure.
  • 115. THE END
      • You learned how to ...
      • make an HTTP request
      • load a CSV file
      • parse an HTML file
      • write a Web crawler
      • use SQL with SQLite
      • and a lot of techniques today. ;)