PYTHON IN DATA SCIENCE WORK
RICK BAHAGUE, DATA SCIENTIST

RBAHAGUEJR@GMAIL.COM
Our Agenda
What is Data Science?
Introduction to Python
Python Tools for Data Science
A bit of Python for Big Data Processing
Questions
Data Science
Source: Python Data Analytics
A data scientist asks relevant, real-world questions
Source: http://drewconway.com/zia/2013/3/26/the-data-science-venn-diagram
And, hopefully, discovers actionable recommendations from data
TOOLS
WHAT IS PYTHON?
“THE NAME PYTHON COMES FROM THE SURREAL BRITISH COMEDY GROUP MONTY PYTHON, NOT FROM THE SNAKE. PYTHON PROGRAMMERS ARE AFFECTIONATELY CALLED PYTHONISTAS, AND BOTH MONTY PYTHON AND SERPENTINE REFERENCES USUALLY PEPPER PYTHON TUTORIALS AND DOCUMENTATION.”

Automate the Boring Stuff with Python
import antigravity
Installing Python
https://www.continuum.io/downloads
Launching Anaconda Python Distribution
When is data ready and prepared for analysis?
Image source: http://blog.kaggle.com/2016/07/21/approaching-almost-any-machine-learning-problem-abhishek-thakur/
GitHub: https://github.com/RickBahague/dspop
Sample Data Set:
GitHub: https://github.com/veekun/pokedex
Pandas: Python Data Analysis Library
Import pandas library
Reading/Writing Data
Series
DataFrame
Selecting Internal Elements
Assigning Values to Elements
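The topics above can be sketched in a few lines. This is a minimal illustration only: the rows and the column names (name, type, hp, attack) are made-up sample data, assumed for the example, and are not the actual schema of the pokedex data set.

```python
import io

import pandas as pd

# Hypothetical sample rows; column names are assumptions for illustration.
CSV_TEXT = """name,type,hp,attack
bulbasaur,grass,45,49
charmander,fire,39,52
squirtle,water,44,48
"""

# Reading data: read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(CSV_TEXT))

# Series: a one-dimensional labeled array.
hp = pd.Series([45, 39, 44], index=["bulbasaur", "charmander", "squirtle"], name="hp")

# DataFrame: a two-dimensional table; each column is itself a Series.
attack = df["attack"]

# Selecting internal elements by label (loc) and by position (iloc).
first_name = df.loc[0, "name"]
squirtle_hp = df.iloc[2, 2]

# Assigning values to a single element and to a whole new column.
df.loc[1, "hp"] = 40
df["total"] = df["hp"] + df["attack"]

# Writing data: to_csv returns a string when no path is given.
csv_out = df.to_csv(index=False)
print(csv_out)
```

In practice read_csv and to_csv would be given file paths; the in-memory text is used here only so the sketch runs without external files.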
Pandas: Python Data Analysis Library
Evaluating Values (unique, isin, value_counts, NaN)
Filtering Values
Transpose
Operations between DataFrame and Series
Statistics Functions, Correlation/Covariance
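A short sketch of the operations listed above, again on made-up sample values (the index labels and columns are assumptions for illustration, not the real data set):

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with one missing value.
df = pd.DataFrame(
    {
        "type": ["grass", "fire", "water", "fire"],
        "hp": [45.0, 39.0, 44.0, np.nan],
        "attack": [49.0, 52.0, 48.0, 64.0],
    },
    index=["bulbasaur", "charmander", "squirtle", "charmeleon"],
)

# Evaluating values.
types = df["type"].unique()                    # distinct values, in order of appearance
type_counts = df["type"].value_counts()        # frequency of each value
in_selection = df["type"].isin(["grass", "water"])  # membership test per row
missing_hp = df["hp"].isna()                   # True where the value is NaN

# Filtering values with a boolean mask.
strong = df[df["attack"] > 50]

# Transpose: rows become columns and vice versa.
transposed = df.T

# Operation between a DataFrame and a Series: subtract the column means.
# The Series returned by mean() is aligned on the column labels.
stats = df[["hp", "attack"]]
centered = stats - stats.mean()

# Statistics functions, correlation and covariance (NaN values are skipped).
summary = stats.describe()
corr = stats.corr()
cov = stats.cov()
print(summary)
```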
Scikit-learn & ML Basics
... learning from experience either with or without supervision of humans
Mastering Machine Learning with scikit-learn
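To make "with or without supervision" concrete, here is a minimal sketch using scikit-learn's bundled iris data. The choice of data set and of these two estimators is an assumption made for illustration; the slides do not name them.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Supervised: human-provided labels (y) guide the training.
classifier = KNeighborsClassifier(n_neighbors=5)
classifier.fit(X, y)
predicted_species = classifier.predict(X[:3])

# Unsupervised: only the features (X) are given; the model finds groups itself.
clusterer = KMeans(n_clusters=3, n_init=10, random_state=0)
cluster_ids = clusterer.fit_predict(X)

print(predicted_species, cluster_ids[:3])
```

The cluster ids are arbitrary group numbers and need not match the species labels.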
ML Flow
Image source: http://blog.kaggle.com/2016/07/21/approaching-almost-any-machine-learning-problem-abhishek-thakur/
Machine Learning with Scikit-learn
Source: http://scikit-learn.org/stable/
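A typical scikit-learn workflow (split the data, preprocess, train, evaluate) might look like the sketch below. The iris data set, the scaler, and the logistic regression model are assumptions chosen for illustration, not necessarily what the talk demonstrated.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

# Hold out a test set so the model is evaluated on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Chain preprocessing and the estimator so both are fit on training data only.
model = Pipeline(
    [
        ("scale", StandardScaler()),
        ("classify", LogisticRegression(max_iter=1000)),
    ]
)
model.fit(X_train, y_train)

predictions = model.predict(X_test)
accuracy = model.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.2f}")
```

Wrapping the steps in a Pipeline keeps the scaler from seeing the test data during fitting, which is a common source of overly optimistic scores.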
A bit of Big Data Processing
Source: Python Data Analytics
Creative Commons License
Python in Data Science Work by Rick Bahague is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Based on a work at https://medium.com/@rbahaguejr.
Permissions beyond the scope of this license may be available at https://medium.com/@rbahaguejr.
