Numerical tour in the Python eco-system 
Python, NumPy, scikit-learn 
Arnaud Joly 
October 2, 2014
2 / 37
How to install Python? 
Download and use the Anaconda Python distribution:
https://store.continuum.io/cshop/anaconda/. It comes with
the whole scientific Python stack.
Alternatives: Linux packages, pythonxy, Canopy, ...
3 / 37
Using the python interpreter 
Interactive mode 
1. Start a python shell 
$ ipython 
2. Write python code 
>>> print("Hello World!") 
Hello World! 
Script mode 
1. hello.py 
print("Hello World!") 
2. Launch the script 
$ ipython hello.py 
Hello World!
4 / 37
Basic types 
Integer >>> 5 
5 
>>> a = 5 
>>> a 
5 
Float >>> pi = 3.14
Complex >>> c = 1 - 1j
Boolean >>> b = 5 > 3  # whereas 5 <= 3 gives False
>>> b
True
String >>> s = 'hello!'  # Also works with "hello!"
>>> s
'hello!'
5 / 37
Python is a dynamically typed programming language
Variable types are inferred implicitly at assignment.
Variables are not declared.
>>> # In Python
>>> a = 1
By contrast, in a statically typed language, you must declare the
type.
// In Java, C, C++
int a = 1;
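A small sketch of what dynamic typing means in practice: a name can be rebound to objects of different types, and `type` reports the type of whatever object is currently bound. The variable name `a` follows the slide; `type_before` and `type_after` are illustrative names only.

```python
# A name is only a label: rebinding it to another object changes the reported type.
a = 1
type_before = type(a)   # int

a = "one"
type_after = type(a)    # str

print(type_before.__name__, type_after.__name__)  # int str
```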
6 / 37
Numbers and their arithmetic operations (+,-,/,//,*,**,%)
>>> 1 + 2
3
>>> 50 - 5 * 6
20
>>> 2 / 3  # integer division in py2; gives 0.66... in py3
0
>>> 2. / 3  # float division in py2 and py3
0.6666666666666666
>>> 4 // 3  # floor division in py2 and py3
1
>>> 5 ** 3.5 # exponent 
279.5084971874737 
>>> 4 % 2 # modulo operation 
0 
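As a short illustration of how `//` and `%` fit together, the sketch below checks the identity `a == (a // b) * b + a % b` and shows that floor division rounds toward minus infinity. The operand values are arbitrary choices for the example.

```python
a, b = 17, 5
quotient = a // b     # floor division
remainder = a % b     # modulo

print(quotient, remainder, divmod(a, b))   # 3 2 (3, 2)
print(quotient * b + remainder == a)       # True

# With a negative operand the quotient is rounded down, not toward zero.
neg_quotient, neg_remainder = -7 // 2, -7 % 2
print(neg_quotient, neg_remainder)         # -4 1
```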
7 / 37
Playing with strings
>>> s = 'Great day!'
>>> s
'Great day!'
>>> s[0]  # strings are sequences
'G'
>>> """A very
... very long string
... """
'A very\nvery long string\n'
>>> 'i={0} f={2} s={1}'.format(1, 'test', 3.14)
'i=1 f=3.14 s=test'
8 / 37
list, an ordered collection of objects 
Instantiation >>> l = []  # an empty list
>>> l = ['spam', 'egg', ['another list'], 42]
Indexing >>> l[1]
'egg'
>>> l[-1]  # same as l[n_elements - 1]
42
>>> l[1:3]  # a slice; the stop index is excluded
['egg', ['another list']]
Methods >>> len(l)
4
>>> l.pop(0)
'spam'
>>> l.append(3)
>>> l
['egg', ['another list'], 42, 3]
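Slices follow the `start:stop:step` convention, with `stop` excluded. The following sketch reuses the list from the slide to show a few common forms; the variable names on the left are illustrative.

```python
l = ['spam', 'egg', ['another list'], 42]

one_item = l[1:2]      # ['egg']: index 2 is excluded
two_items = l[1:3]     # ['egg', ['another list']]
head = l[:2]           # ['spam', 'egg']
tail = l[-2:]          # [['another list'], 42]
every_other = l[::2]   # ['spam', ['another list']]

print(one_item, two_items, head, tail, every_other)
```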
9 / 37
dict, an unordered and associative data structure of 
key-value pairs 
Instantiation >>> d = {1: "a", "b": 2, 0: [4, 5, 6]}
>>> d
{0: [4, 5, 6], 1: 'a', 'b': 2}
Indexing >>> d['b']
2
>>> 'b' in d
True
Insertion >>> d['new'] = 56
>>> d
{0: [4, 5, 6], 1: 'a', 'b': 2, 'new': 56}
Deletion >>> del d['new']
>>> d
{0: [4, 5, 6], 1: 'a', 'b': 2}
10 / 37
dict, an unordered and associative data structure of 
key-value pairs 
Methods >>> len(d)
3
>>> d.keys()  # a list in py2, a view object in py3
[0, 1, 'b']
>>> d.values()  # a list in py2, a view object in py3
[[4, 5, 6], 'a', 2]
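Two further dictionary idioms that often come up alongside the methods above, shown as a minimal sketch: iterating over key-value pairs with `items()`, and looking up a key with a fallback via `get()`. The fallback string is an arbitrary value chosen for the example.

```python
d = {1: "a", "b": 2, 0: [4, 5, 6]}

# Iterate over key-value pairs; items() is available in both py2 and py3.
pairs = [(key, value) for key, value in d.items()]

# get() returns a fallback instead of raising KeyError for a missing key.
present = d.get("b", "fallback")
missing = d.get("missing", "fallback")

print(len(pairs), present, missing)  # 3 2 fallback
```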
11 / 37
Control flow: if / elif / else
>>> x = 3
>>> if x == 0:
...     print("zero")
... elif x == 1:
...     print("one")
... else:
...     print("A big number")
...
A big number
Each indentation level corresponds to a block of code.
12 / 37
Control flow: for loop
>>> l = [0, 1, 2, 3]
>>> for a in l:  # Iterate over a sequence
...     print(a ** 2)
...
0
1
4
9
Iterating over a sequence of numbers is easy with the range built-in.
>>> range(3)  # a list in py2; use list(range(3)) in py3
[0, 1, 2]
>>> range(3, 10, 3)
[3, 6, 9]
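A sketch of `range` as it behaves on Python 3, where it is lazy and must be wrapped in `list` to display its values, together with `enumerate` for looping with an index. The list contents are illustrative.

```python
numbers = list(range(3))               # start defaults to 0, stop is excluded
stepped = list(range(3, 10, 3))        # start, stop, step
squares = [i ** 2 for i in range(4)]   # range drives a list comprehension

# enumerate yields (index, item) pairs, avoiding range(len(...)).
indexed = list(enumerate(["spam", "egg"]))

print(numbers, stepped, squares, indexed)
# [0, 1, 2] [3, 6, 9] [0, 1, 4, 9] [(0, 'spam'), (1, 'egg')]
```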
13 / 37
Control flow: while
>>> a, b = 0, 1
>>> while b < 50:  # while True do ...
...     a, b = b, a + b
...     print(a)
...
1
1
2
3
5
8
13
21
34
14 / 37
Control flow: functions
>>> def f(x, e=2):
...     return x ** e
...
>>> f(3)
9
>>> f(5, 3)
125
>>> f(5, e=3)
125
Function arguments are passed by object reference in Python: the
function receives the same object as the caller, not a copy. Be aware of
side effects: mutable default parameters, in-place modifications of
the arguments.
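The two side effects named above can be reproduced in a few lines. This is a minimal sketch; the function names (`append_bad`, `append_good`, `double_in_place`) are hypothetical and exist only for the illustration.

```python
def append_bad(item, bucket=[]):
    # The default list is created once, at definition time, and then shared.
    bucket.append(item)
    return bucket

def append_good(item, bucket=None):
    # Using None as a sentinel gives each call a fresh list.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket

first = append_bad(1)
second = append_bad(2)
print(second, first is second)           # [1, 2] True
print(append_good(1), append_good(2))    # [1] [2]

def double_in_place(values):
    # Mutates the caller's list because the same object is received.
    for i in range(len(values)):
        values[i] *= 2

data = [1, 2, 3]
double_in_place(data)
print(data)                              # [2, 4, 6]
```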
15 / 37
Classes and objects
>>> class Counter:
...     def __init__(self, initial_value=0):
...         self.value = initial_value
...     def inc(self):
...         self.value += 1
...
>>> c = Counter()  # Instantiate a counter object
>>> c.value  # Access an attribute
0
>>> c.inc()  # Call a method
>>> c.value
1
16 / 37
Import a package 
>>> import math 
>>> math.log(3) 
1.0986122886681098 
>>> from math import log 
>>> log(4) 
1.3862943611198906 
You can try "import this" and "import antigravity". 
17 / 37
Python reference and tutorial 
- Python Tutorial: http://docs.python.org/tutorial/
- Python Reference: https://docs.python.org/library/
How to use the "?" in ipython? 
In [0]: d = {"a": 1} 
In [1]: d? 
Type: dict
String Form: {'a': 1}
Length: 1
Docstring:
dict() -> new empty dictionary
dict(mapping) -> new dictionary initialized from a mapping object's
    (key, value) pairs
dict(iterable) -> new dictionary initialized as if via:
    d = {}
    for k, v in iterable:
        d[k] = v
dict(**kwargs) -> new dictionary initialized with the name=value pairs
    in the keyword argument list. For example: dict(one=1, two=2)
18 / 37
19 / 37
NumPy 
NumPy is the fundamental package for scientific computing with 
Python. It contains among other things: 
- a powerful N-dimensional array object,
- sophisticated (broadcasting) functions,
- tools for integrating C/C++ and Fortran code,
- useful linear algebra, Fourier transform, and random number
  capabilities.
With SciPy, it’s a replacement for MATLAB(c). 
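Broadcasting, mentioned in the list above, lets arrays of different but compatible shapes be combined without explicit loops or copies. The sketch below adds a row vector and a column vector to a small matrix; the array values are arbitrary.

```python
import numpy as np

matrix = np.array([[0, 1, 2], [3, 4, 5]])   # shape (2, 3)
row = np.array([10, 20, 30])                # shape (3,), stretched over the rows
column = np.array([[100], [200]])           # shape (2, 1), stretched over the columns

plus_row = matrix + row
plus_column = matrix + column

print(plus_row)
# [[10 21 32]
#  [13 24 35]]
print(plus_column)
# [[100 101 102]
#  [203 204 205]]
```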
20 / 37
1-D numpy arrays 
Let’s import the package. 
>>> import numpy as np 
Let’s create a 1-dimensional array. 
>>> a = np.array([0, 1, 2, 3]) 
>>> a 
array([0, 1, 2, 3]) 
>>> a.ndim 
1 
>>> a.shape 
(4,) 
21 / 37
2-D numpy arrays 
Let’s import the package. 
>>> import numpy as np 
Let’s create a 2-dimensional array. 
>>> b = np.array([[0, 1, 2], [3, 4, 5]]) 
>>> b 
array([[0, 1, 2],
       [3, 4, 5]])
>>> b.ndim 
2 
>>> b.shape 
(2, 3) 
Routines to create arrays: np.ones, np.zeros, ...
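A few array-creation routines in action, as a hedged sketch. `np.ones` and `np.zeros` come from the slide; `np.arange` and `np.linspace` are additional NumPy routines included here for illustration.

```python
import numpy as np

zeros = np.zeros((2, 3))          # 2-D array of floats filled with 0.
ones = np.ones(3)                 # 1-D array of floats filled with 1.
steps = np.arange(0, 10, 2)       # like range, but returns an array
grid = np.linspace(0., 1., 5)     # 5 evenly spaced points, both ends included

print(zeros.shape, ones.shape)    # (2, 3) (3,)
print(steps)                      # [0 2 4 6 8]
print(grid)                       # [0.   0.25 0.5  0.75 1.  ]
```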
22 / 37
Array operations 
>>> a = np.ones(3) / 5. 
>>> b = np.array([1, 2, 3]) 
>>> a + b 
array([ 1.2, 2.2, 3.2]) 
>>> np.dot(a, b) 
1.200000 
>>> ... 
Many functions operate efficiently on arrays: np.max, np.min,
np.mean, np.unique, ...
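The reduction functions listed above accept an optional `axis` argument that selects the dimension to collapse. A minimal sketch with an arbitrary 2 x 3 array:

```python
import numpy as np

x = np.array([[1, 2, 2],
              [4, 5, 5]])

largest = np.max(x)              # over the whole array
col_min = np.min(x, axis=0)      # one value per column
row_mean = np.mean(x, axis=1)    # one value per row
distinct = np.unique(x)          # sorted distinct values

print(largest, col_min, row_mean, distinct)
```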
23 / 37
Indexing numpy array 
>>> a = np.array([[1, 2, 3], [4, 5, 6]]) 
>>> a[1, 2] 
6 
>>> a[1] 
array([4, 5, 6]) 
>>> a[:, 2] 
array([3, 6]) 
>>> a[:, 1:3] 
array([[2, 3],
       [5, 6]])
>>> b = a > 2 
>>> b 
array([[False, False,  True],
       [ True,  True,  True]], dtype=bool)
>>> a[b] 
array([3, 4, 5, 6]) 
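One practical consequence of the two indexing styles above, sketched under the usual NumPy rules: basic slicing returns a view onto the same data, whereas boolean-mask indexing returns a copy. The values written into the arrays are arbitrary.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])

column = a[:, 0]        # basic slicing: a view sharing memory with a
column[0] = 100         # also changes a[0, 0]

selected = a[a > 4]     # boolean indexing: an independent copy
selected[0] = -1        # leaves a untouched

print(a)
# [[100   2   3]
#  [  4   5   6]]
```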
24 / 37
Reference and documentation 
- NumPy User Guide:
  http://docs.scipy.org/doc/numpy/user/
- NumPy Reference:
  http://docs.scipy.org/doc/numpy/reference/
- MATLAB to NumPy:
  http://wiki.scipy.org/NumPy_for_Matlab_Users
25 / 37
26 / 37
scikit-learn Machine Learning in Python 
- Simple and efficient tools for data mining and data analysis
- Accessible to everybody, and reusable in various contexts
- Built on NumPy, SciPy, and matplotlib
- Open source, commercially usable - BSD license
27 / 37
Found a bug or need help?
- Mailing list:
  scikit-learn-general@lists.sourceforge.net
- Tag scikit-learn on Stack Overflow.
How to install?
- It's shipped with Anaconda.
- http://scikit-learn.org/stable/install.html
28 / 37
Digits classification task 
# Load some data 
from sklearn.datasets import load_digits 
digits = load_digits() 
X, y = digits.data, digits.target 
How can we build a system to classify images? 
What is the first step? 
29 / 37
Data exploration and visualization 
# Data visualization 
import matplotlib.pyplot as plt 
plt.gray() 
plt.matshow(digits.images[0]) 
plt.show() 
What else can be done? 
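One possible answer, as a sketch rather than the author's own: before plotting, it can help to check the array shapes, the set of labels, the class balance and the range of the pixel values. Only `load_digits` comes from the slides; the NumPy calls are standard but their use here is an assumption about what "exploration" might include.

```python
import numpy as np
from sklearn.datasets import load_digits

digits = load_digits()
X, y = digits.data, digits.target

print(X.shape, y.shape)      # number of samples and features
print(np.unique(y))          # the distinct class labels
print(np.bincount(y))        # number of samples per class
print(X.min(), X.max())      # range of the pixel intensities
```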
30 / 37
Fit a supervised learning model 
from sklearn.svm import SVC
# gamma="auto" (1 / n_features) was the default when these slides were written
clf = SVC(gamma="auto")  # Instantiate a classifier
# API The base object, implements a fit method to learn from data
clf.fit(X, y)  # Fit a classifier with the learning samples
# API Exploit the fitted model to make prediction
clf.predict(X)
# API Get a goodness of fit given data (X, y)
clf.score(X, y)  # accuracy=1.
What do you think about this score of 1.? 
31 / 37
Cross validation 
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import KFold  # sklearn.cross_validation in old releases
scores = []
for train, test in KFold(n_splits=5, shuffle=True).split(X):
    X_train, y_train = X[train], y[train]
    X_test, y_test = X[test], y[test]
    clf = SVC(gamma="auto")  # 1 / n_features, the default when these slides were written
    clf.fit(X_train, y_train)
    scores.append(clf.score(X_test, y_test))
print(np.mean(scores))  # 0.44... !
What do you think about this score of 0.44? 
Tip: This could be simplified using the cross_val_score function. 
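A sketch of the simplification suggested by the tip, written against the `sklearn.model_selection` module of recent scikit-learn releases. The `random_state=0` seed and the explicit `gamma="auto"` (which matches the old default of 1 / n_features) are assumptions added for reproducibility; no score is quoted because it depends on the library version and the split.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import KFold, cross_val_score
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)

# cross_val_score runs the fit / score loop over the folds for us.
cv = KFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(SVC(gamma="auto"), X, y, cv=cv)

print(scores)          # one accuracy per fold
print(scores.mean())   # average accuracy over the 5 folds
```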
32 / 37
Hyper-parameter optimization 
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score  # sklearn.cross_validation in old releases
parameters = np.linspace(0.0001, 0.01, num=10)
scores = []
for value in parameters:
    clf = SVC(gamma=value)
    s = cross_val_score(clf, X, y=y, cv=5)
    scores.append(np.mean(s, axis=0))
print(np.max(scores))  # 0.97... !
Tip: This could be simplified using the GridSearchCV 
meta-estimator. 
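The same search expressed with `GridSearchCV`, as a minimal sketch assuming a recent scikit-learn release where it lives in `sklearn.model_selection`. The grid of `gamma` values is the one from the slide; the variable name `search` is illustrative.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)
parameters = np.linspace(0.0001, 0.01, num=10)

# GridSearchCV cross-validates every candidate, then refits the best one on all data.
search = GridSearchCV(SVC(), param_grid={"gamma": parameters}, cv=5)
search.fit(X, y)

print(search.best_params_)   # the gamma with the highest mean CV accuracy
print(search.best_score_)    # that mean CV accuracy
```

After fitting, `search` behaves like the best estimator itself, so `search.predict(X)` can be called directly.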
33 / 37
Visualizing hyper-parameter search 
import matplotlib.pyplot as plt 
plt.figure() 
plt.plot(parameters, scores) 
plt.xlabel("Gamma") 
plt.ylabel("Accuracy") 
plt.savefig("images/grid.png") 
34 / 37
Estimator cooking: transformer union and pipeline 
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC
# API Transformer has a transform method
clf = make_pipeline(StandardScaler(),
                    # More transformers here
                    SVC())
from sklearn.pipeline import make_union
from sklearn.preprocessing import PolynomialFeatures
union_transformers = make_union(StandardScaler(),
                                # More transformers here
                                PolynomialFeatures())
clf = make_pipeline(union_transformers, SVC())
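A pipeline is itself an estimator, so it can be passed to `cross_val_score` like any classifier; the scaler is then refitted on the training part of each fold only. The sketch below assumes a recent scikit-learn release and uses the simpler scaler-plus-SVC pipeline; no score is quoted since it depends on the version.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)

clf = make_pipeline(StandardScaler(), SVC())

# make_pipeline names each step after its lowercased class name.
step_names = list(clf.named_steps)
print(step_names)   # ['standardscaler', 'svc']

# The whole pipeline (scaling + classification) is cross-validated as one unit.
scores = cross_val_score(clf, X, y, cv=5)
print(scores.mean())
```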
35 / 37
Model persistence 
import joblib  # formerly "from sklearn.externals import joblib"
# Save the model for later
joblib.dump(clf, "model.joblib")
# Load the model
clf = joblib.load("model.joblib")
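A self-contained round trip, as a sketch: fit a model, dump it to a temporary directory, reload it, and check that both objects predict the same labels. It assumes the standalone `joblib` package is installed; the temporary path stands in for the `model.joblib` file used above.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.datasets import load_digits
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)
clf = SVC().fit(X, y)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model.joblib")
    joblib.dump(clf, path)          # serialize the fitted estimator
    restored = joblib.load(path)    # rebuild it from disk

same = np.array_equal(clf.predict(X), restored.predict(X))
print(same)  # True
```

A persisted model should be reloaded with the same scikit-learn version that produced it, since the on-disk format is not guaranteed to be stable across releases.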
36 / 37
Reference and documentation 
- User Guide:
  http://scikit-learn.org/stable/user_guide.html
- Reference:
  http://scikit-learn.org/stable/modules/classes.html
- Examples:
  http://scikit-learn.org/stable/auto_examples/index.html
37 / 37

More Related Content

What's hot

Python programming –part 3
Python programming –part 3Python programming –part 3
Python programming –part 3
Megha V
 
Alex Smola at AI Frontiers: Scalable Deep Learning Using MXNet
Alex Smola at AI Frontiers: Scalable Deep Learning Using MXNetAlex Smola at AI Frontiers: Scalable Deep Learning Using MXNet
Alex Smola at AI Frontiers: Scalable Deep Learning Using MXNet
AI Frontiers
 
Java Foundations: Data Types and Type Conversion
Java Foundations: Data Types and Type ConversionJava Foundations: Data Types and Type Conversion
Java Foundations: Data Types and Type Conversion
Svetlin Nakov
 
Python programming –part 7
Python programming –part 7Python programming –part 7
Python programming –part 7
Megha V
 
Deep Learning in theano
Deep Learning in theanoDeep Learning in theano
Deep Learning in theano
Massimo Quadrana
 
18. Java associative arrays
18. Java associative arrays18. Java associative arrays
18. Java associative arrays
Intro C# Book
 
TensorFlow Tutorial | Deep Learning With TensorFlow | TensorFlow Tutorial For...
TensorFlow Tutorial | Deep Learning With TensorFlow | TensorFlow Tutorial For...TensorFlow Tutorial | Deep Learning With TensorFlow | TensorFlow Tutorial For...
TensorFlow Tutorial | Deep Learning With TensorFlow | TensorFlow Tutorial For...
Simplilearn
 
07. Arrays
07. Arrays07. Arrays
07. Arrays
Intro C# Book
 
Algorithm analysis and design
Algorithm analysis and designAlgorithm analysis and design
Algorithm analysis and design
Megha V
 
Python programming- Part IV(Functions)
Python programming- Part IV(Functions)Python programming- Part IV(Functions)
Python programming- Part IV(Functions)
Megha V
 
Chapter 3 Arrays in Java
Chapter 3 Arrays in JavaChapter 3 Arrays in Java
Chapter 3 Arrays in Java
Khirulnizam Abd Rahman
 
TensorFlow Tutorial | Deep Learning Using TensorFlow | TensorFlow Tutorial Py...
TensorFlow Tutorial | Deep Learning Using TensorFlow | TensorFlow Tutorial Py...TensorFlow Tutorial | Deep Learning Using TensorFlow | TensorFlow Tutorial Py...
TensorFlow Tutorial | Deep Learning Using TensorFlow | TensorFlow Tutorial Py...
Edureka!
 
Tensorflow - Intro (2017)
Tensorflow - Intro (2017)Tensorflow - Intro (2017)
Tensorflow - Intro (2017)
Alessio Tonioni
 
Introduction To TensorFlow | Deep Learning Using TensorFlow | CloudxLab
Introduction To TensorFlow | Deep Learning Using TensorFlow | CloudxLabIntroduction To TensorFlow | Deep Learning Using TensorFlow | CloudxLab
Introduction To TensorFlow | Deep Learning Using TensorFlow | CloudxLab
CloudxLab
 
Java Foundations: Arrays
Java Foundations: ArraysJava Foundations: Arrays
Java Foundations: Arrays
Svetlin Nakov
 
18. Dictionaries, Hash-Tables and Set
18. Dictionaries, Hash-Tables and Set18. Dictionaries, Hash-Tables and Set
18. Dictionaries, Hash-Tables and Set
Intro C# Book
 
16. Arrays Lists Stacks Queues
16. Arrays Lists Stacks Queues16. Arrays Lists Stacks Queues
16. Arrays Lists Stacks Queues
Intro C# Book
 
Introduction to NumPy for Machine Learning Programmers
Introduction to NumPy for Machine Learning ProgrammersIntroduction to NumPy for Machine Learning Programmers
Introduction to NumPy for Machine Learning Programmers
Kimikazu Kato
 
07. Java Array, Set and Maps
07.  Java Array, Set and Maps07.  Java Array, Set and Maps
07. Java Array, Set and Maps
Intro C# Book
 
Chapter 4 - Classes in Java
Chapter 4 - Classes in JavaChapter 4 - Classes in Java
Chapter 4 - Classes in Java
Khirulnizam Abd Rahman
 

What's hot (20)

Python programming –part 3
Python programming –part 3Python programming –part 3
Python programming –part 3
 
Alex Smola at AI Frontiers: Scalable Deep Learning Using MXNet
Alex Smola at AI Frontiers: Scalable Deep Learning Using MXNetAlex Smola at AI Frontiers: Scalable Deep Learning Using MXNet
Alex Smola at AI Frontiers: Scalable Deep Learning Using MXNet
 
Java Foundations: Data Types and Type Conversion
Java Foundations: Data Types and Type ConversionJava Foundations: Data Types and Type Conversion
Java Foundations: Data Types and Type Conversion
 
Python programming –part 7
Python programming –part 7Python programming –part 7
Python programming –part 7
 
Deep Learning in theano
Deep Learning in theanoDeep Learning in theano
Deep Learning in theano
 
18. Java associative arrays
18. Java associative arrays18. Java associative arrays
18. Java associative arrays
 
TensorFlow Tutorial | Deep Learning With TensorFlow | TensorFlow Tutorial For...
TensorFlow Tutorial | Deep Learning With TensorFlow | TensorFlow Tutorial For...TensorFlow Tutorial | Deep Learning With TensorFlow | TensorFlow Tutorial For...
TensorFlow Tutorial | Deep Learning With TensorFlow | TensorFlow Tutorial For...
 
07. Arrays
07. Arrays07. Arrays
07. Arrays
 
Algorithm analysis and design
Algorithm analysis and designAlgorithm analysis and design
Algorithm analysis and design
 
Python programming- Part IV(Functions)
Python programming- Part IV(Functions)Python programming- Part IV(Functions)
Python programming- Part IV(Functions)
 
Chapter 3 Arrays in Java
Chapter 3 Arrays in JavaChapter 3 Arrays in Java
Chapter 3 Arrays in Java
 
TensorFlow Tutorial | Deep Learning Using TensorFlow | TensorFlow Tutorial Py...
TensorFlow Tutorial | Deep Learning Using TensorFlow | TensorFlow Tutorial Py...TensorFlow Tutorial | Deep Learning Using TensorFlow | TensorFlow Tutorial Py...
TensorFlow Tutorial | Deep Learning Using TensorFlow | TensorFlow Tutorial Py...
 
Tensorflow - Intro (2017)
Tensorflow - Intro (2017)Tensorflow - Intro (2017)
Tensorflow - Intro (2017)
 
Introduction To TensorFlow | Deep Learning Using TensorFlow | CloudxLab
Introduction To TensorFlow | Deep Learning Using TensorFlow | CloudxLabIntroduction To TensorFlow | Deep Learning Using TensorFlow | CloudxLab
Introduction To TensorFlow | Deep Learning Using TensorFlow | CloudxLab
 
Java Foundations: Arrays
Java Foundations: ArraysJava Foundations: Arrays
Java Foundations: Arrays
 
18. Dictionaries, Hash-Tables and Set
18. Dictionaries, Hash-Tables and Set18. Dictionaries, Hash-Tables and Set
18. Dictionaries, Hash-Tables and Set
 
16. Arrays Lists Stacks Queues
16. Arrays Lists Stacks Queues16. Arrays Lists Stacks Queues
16. Arrays Lists Stacks Queues
 
Introduction to NumPy for Machine Learning Programmers
Introduction to NumPy for Machine Learning ProgrammersIntroduction to NumPy for Machine Learning Programmers
Introduction to NumPy for Machine Learning Programmers
 
07. Java Array, Set and Maps
07.  Java Array, Set and Maps07.  Java Array, Set and Maps
07. Java Array, Set and Maps
 
Chapter 4 - Classes in Java
Chapter 4 - Classes in JavaChapter 4 - Classes in Java
Chapter 4 - Classes in Java
 

Viewers also liked

Think machine-learning-with-scikit-learn-chetan
Think machine-learning-with-scikit-learn-chetanThink machine-learning-with-scikit-learn-chetan
Think machine-learning-with-scikit-learn-chetan
Chetan Khatri
 
Intro to machine learning with scikit learn
Intro to machine learning with scikit learnIntro to machine learning with scikit learn
Intro to machine learning with scikit learn
Yoss Cohen
 
Scikit-learn: the state of the union 2016
Scikit-learn: the state of the union 2016Scikit-learn: the state of the union 2016
Scikit-learn: the state of the union 2016
Gael Varoquaux
 
Intro to scikit learn may 2017
Intro to scikit learn may 2017Intro to scikit learn may 2017
Intro to scikit learn may 2017
Francesco Mosconi
 
Data Science and Machine Learning Using Python and Scikit-learn
Data Science and Machine Learning Using Python and Scikit-learnData Science and Machine Learning Using Python and Scikit-learn
Data Science and Machine Learning Using Python and Scikit-learn
Asim Jalis
 
Tree models with Scikit-Learn: Great models with little assumptions
Tree models with Scikit-Learn: Great models with little assumptionsTree models with Scikit-Learn: Great models with little assumptions
Tree models with Scikit-Learn: Great models with little assumptions
Gilles Louppe
 
Exploring Machine Learning in Python with Scikit-Learn
Exploring Machine Learning in Python with Scikit-LearnExploring Machine Learning in Python with Scikit-Learn
Exploring Machine Learning in Python with Scikit-Learn
Kan Ouivirach, Ph.D.
 
Realtime predictive analytics using RabbitMQ & scikit-learn
Realtime predictive analytics using RabbitMQ & scikit-learnRealtime predictive analytics using RabbitMQ & scikit-learn
Realtime predictive analytics using RabbitMQ & scikit-learn
AWeber
 
Authorship Attribution and Forensic Linguistics with Python/Scikit-Learn/Pand...
Authorship Attribution and Forensic Linguistics with Python/Scikit-Learn/Pand...Authorship Attribution and Forensic Linguistics with Python/Scikit-Learn/Pand...
Authorship Attribution and Forensic Linguistics with Python/Scikit-Learn/Pand...
PyData
 
Clustering: A Scikit Learn Tutorial
Clustering: A Scikit Learn TutorialClustering: A Scikit Learn Tutorial
Clustering: A Scikit Learn Tutorial
Damian R. Mingle, MBA
 
Pyparis2017 / Scikit-learn - an incomplete yearly review, by Gael Varoquaux
Pyparis2017 / Scikit-learn - an incomplete yearly review, by Gael VaroquauxPyparis2017 / Scikit-learn - an incomplete yearly review, by Gael Varoquaux
Pyparis2017 / Scikit-learn - an incomplete yearly review, by Gael Varoquaux
Pôle Systematic Paris-Region
 
Introduction to Machine Learning with Python and scikit-learn
Introduction to Machine Learning with Python and scikit-learnIntroduction to Machine Learning with Python and scikit-learn
Introduction to Machine Learning with Python and scikit-learn
Matt Hagy
 
Machine Learning with scikit-learn
Machine Learning with scikit-learnMachine Learning with scikit-learn
Machine Learning with scikit-learn
odsc
 
Intro to scikit-learn
Intro to scikit-learnIntro to scikit-learn
Intro to scikit-learn
AWeber
 
Scikit-learn for easy machine learning: the vision, the tool, and the project
Scikit-learn for easy machine learning: the vision, the tool, and the projectScikit-learn for easy machine learning: the vision, the tool, and the project
Scikit-learn for easy machine learning: the vision, the tool, and the project
Gael Varoquaux
 
Converting Scikit-Learn to PMML
Converting Scikit-Learn to PMMLConverting Scikit-Learn to PMML
Converting Scikit-Learn to PMML
Villu Ruusmann
 
Accelerating Random Forests in Scikit-Learn
Accelerating Random Forests in Scikit-LearnAccelerating Random Forests in Scikit-Learn
Accelerating Random Forests in Scikit-Learn
Gilles Louppe
 
Text Classification/Categorization
Text Classification/CategorizationText Classification/Categorization
Text Classification/Categorization
Oswal Abhishek
 
Text Classification in Python – using Pandas, scikit-learn, IPython Notebook ...
Text Classification in Python – using Pandas, scikit-learn, IPython Notebook ...Text Classification in Python – using Pandas, scikit-learn, IPython Notebook ...
Text Classification in Python – using Pandas, scikit-learn, IPython Notebook ...
Jimmy Lai
 
Gradient Boosted Regression Trees in scikit-learn
Gradient Boosted Regression Trees in scikit-learnGradient Boosted Regression Trees in scikit-learn
Gradient Boosted Regression Trees in scikit-learn
DataRobot
 

Viewers also liked (20)

Think machine-learning-with-scikit-learn-chetan
Think machine-learning-with-scikit-learn-chetanThink machine-learning-with-scikit-learn-chetan
Think machine-learning-with-scikit-learn-chetan
 
Intro to machine learning with scikit learn
Intro to machine learning with scikit learnIntro to machine learning with scikit learn
Intro to machine learning with scikit learn
 
Scikit-learn: the state of the union 2016
Scikit-learn: the state of the union 2016Scikit-learn: the state of the union 2016
Scikit-learn: the state of the union 2016
 
Intro to scikit learn may 2017
Intro to scikit learn may 2017Intro to scikit learn may 2017
Intro to scikit learn may 2017
 
Data Science and Machine Learning Using Python and Scikit-learn
Data Science and Machine Learning Using Python and Scikit-learnData Science and Machine Learning Using Python and Scikit-learn
Data Science and Machine Learning Using Python and Scikit-learn
 
Tree models with Scikit-Learn: Great models with little assumptions
Tree models with Scikit-Learn: Great models with little assumptionsTree models with Scikit-Learn: Great models with little assumptions
Tree models with Scikit-Learn: Great models with little assumptions
 
Exploring Machine Learning in Python with Scikit-Learn
Exploring Machine Learning in Python with Scikit-LearnExploring Machine Learning in Python with Scikit-Learn
Exploring Machine Learning in Python with Scikit-Learn
 
Realtime predictive analytics using RabbitMQ & scikit-learn
Realtime predictive analytics using RabbitMQ & scikit-learnRealtime predictive analytics using RabbitMQ & scikit-learn
Realtime predictive analytics using RabbitMQ & scikit-learn
 
Authorship Attribution and Forensic Linguistics with Python/Scikit-Learn/Pand...
Authorship Attribution and Forensic Linguistics with Python/Scikit-Learn/Pand...Authorship Attribution and Forensic Linguistics with Python/Scikit-Learn/Pand...
Authorship Attribution and Forensic Linguistics with Python/Scikit-Learn/Pand...
 
Clustering: A Scikit Learn Tutorial
Clustering: A Scikit Learn TutorialClustering: A Scikit Learn Tutorial
Clustering: A Scikit Learn Tutorial
 
Pyparis2017 / Scikit-learn - an incomplete yearly review, by Gael Varoquaux
Pyparis2017 / Scikit-learn - an incomplete yearly review, by Gael VaroquauxPyparis2017 / Scikit-learn - an incomplete yearly review, by Gael Varoquaux
Pyparis2017 / Scikit-learn - an incomplete yearly review, by Gael Varoquaux
 
Introduction to Machine Learning with Python and scikit-learn
Introduction to Machine Learning with Python and scikit-learnIntroduction to Machine Learning with Python and scikit-learn
Introduction to Machine Learning with Python and scikit-learn
 
Machine Learning with scikit-learn
Machine Learning with scikit-learnMachine Learning with scikit-learn
Machine Learning with scikit-learn
 
Intro to scikit-learn
Intro to scikit-learnIntro to scikit-learn
Intro to scikit-learn
 
Scikit-learn for easy machine learning: the vision, the tool, and the project
Scikit-learn for easy machine learning: the vision, the tool, and the projectScikit-learn for easy machine learning: the vision, the tool, and the project
Scikit-learn for easy machine learning: the vision, the tool, and the project
 
Converting Scikit-Learn to PMML
Converting Scikit-Learn to PMMLConverting Scikit-Learn to PMML
Converting Scikit-Learn to PMML
 
Accelerating Random Forests in Scikit-Learn
Accelerating Random Forests in Scikit-LearnAccelerating Random Forests in Scikit-Learn
Accelerating Random Forests in Scikit-Learn
 
Text Classification/Categorization
Text Classification/CategorizationText Classification/Categorization
Text Classification/Categorization
 
Text Classification in Python – using Pandas, scikit-learn, IPython Notebook ...
Text Classification in Python – using Pandas, scikit-learn, IPython Notebook ...Text Classification in Python – using Pandas, scikit-learn, IPython Notebook ...
Text Classification in Python – using Pandas, scikit-learn, IPython Notebook ...
 
Gradient Boosted Regression Trees in scikit-learn
Gradient Boosted Regression Trees in scikit-learnGradient Boosted Regression Trees in scikit-learn
Gradient Boosted Regression Trees in scikit-learn
 

Similar to Numerical tour in the Python eco-system: Python, NumPy, scikit-learn

Effective Numerical Computation in NumPy and SciPy
Effective Numerical Computation in NumPy and SciPyEffective Numerical Computation in NumPy and SciPy
Effective Numerical Computation in NumPy and SciPy
Kimikazu Kato
 
Programming python quick intro for schools
Programming python quick intro for schoolsProgramming python quick intro for schools
Programming python quick intro for schools
Dan Bowen
 
python lab programs.pdf
python lab programs.pdfpython lab programs.pdf
python lab programs.pdf
CBJWorld
 
Cc code cards
Cc code cardsCc code cards
Cc code cards
ysolanki78
 
Python testing using mock and pytest
Python testing using mock and pytestPython testing using mock and pytest
Python testing using mock and pytest
Suraj Deshmukh
 
Python Training v2
Python Training v2Python Training v2
Python Training v2
ibaydan
 
Writing Faster Python 3
Writing Faster Python 3Writing Faster Python 3
Writing Faster Python 3
Sebastian Witowski
 
Pythran: Static compiler for high performance by Mehdi Amini PyData SV 2014
Pythran: Static compiler for high performance by Mehdi Amini PyData SV 2014Pythran: Static compiler for high performance by Mehdi Amini PyData SV 2014
Pythran: Static compiler for high performance by Mehdi Amini PyData SV 2014
PyData
 
Introducción a Elixir
Introducción a ElixirIntroducción a Elixir
Introducción a Elixir
Svet Ivantchev
 
Porting to Python 3
Porting to Python 3Porting to Python 3
Porting to Python 3
Lennart Regebro
 
Introduction to Python 01-08-2023.pon by everyone else. . Hence, they must be...
Introduction to Python 01-08-2023.pon by everyone else. . Hence, they must be...Introduction to Python 01-08-2023.pon by everyone else. . Hence, they must be...
Introduction to Python 01-08-2023.pon by everyone else. . Hence, they must be...
DRVaibhavmeshram1
 
Using-Python-Libraries.9485146.powerpoint.pptx
Using-Python-Libraries.9485146.powerpoint.pptxUsing-Python-Libraries.9485146.powerpoint.pptx
Using-Python-Libraries.9485146.powerpoint.pptx
UadAccount
 
Machine learning with py torch
Machine learning with py torchMachine learning with py torch
Machine learning with py torch
Riza Fahmi
 
Ds lab manual by s.k.rath
Ds lab manual by s.k.rathDs lab manual by s.k.rath
Ds lab manual by s.k.rathSANTOSH RATH
 
Python na Infraestrutura 
MySQL do Facebook

Python na Infraestrutura 
MySQL do Facebook
Python na Infraestrutura 
MySQL do Facebook

Python na Infraestrutura 
MySQL do Facebook

Artur Rodrigues
 
Porting to Python 3
Porting to Python 3Porting to Python 3
Porting to Python 3
Lennart Regebro
 
Python for R developers and data scientists
Python for R developers and data scientistsPython for R developers and data scientists
Python for R developers and data scientists
Lambda Tree
 
Parallel R in snow (english after 2nd slide)
Parallel R in snow (english after 2nd slide)Parallel R in snow (english after 2nd slide)
Parallel R in snow (english after 2nd slide)Cdiscount
 
Chapter 22. Lambda Expressions and LINQ
Chapter 22. Lambda Expressions and LINQChapter 22. Lambda Expressions and LINQ
Chapter 22. Lambda Expressions and LINQ
Intro C# Book
 
Lab manual data structure (cs305 rgpv) (usefulsearch.org) (useful search)
Lab manual data structure (cs305 rgpv) (usefulsearch.org)  (useful search)Lab manual data structure (cs305 rgpv) (usefulsearch.org)  (useful search)
Lab manual data structure (cs305 rgpv) (usefulsearch.org) (useful search)
Make Mannan
 

Similar to Numerical tour in the Python eco-system: Python, NumPy, scikit-learn (20)

Effective Numerical Computation in NumPy and SciPy
Effective Numerical Computation in NumPy and SciPyEffective Numerical Computation in NumPy and SciPy
Effective Numerical Computation in NumPy and SciPy
 
Programming python quick intro for schools
Programming python quick intro for schoolsProgramming python quick intro for schools
Programming python quick intro for schools
 
python lab programs.pdf
python lab programs.pdfpython lab programs.pdf
python lab programs.pdf
 
Cc code cards
Cc code cardsCc code cards
Cc code cards
 
Python testing using mock and pytest
Python testing using mock and pytestPython testing using mock and pytest
Python testing using mock and pytest
 
Python Training v2
Python Training v2Python Training v2
Python Training v2
 
Writing Faster Python 3
Writing Faster Python 3Writing Faster Python 3
Writing Faster Python 3
 
Pythran: Static compiler for high performance by Mehdi Amini PyData SV 2014
Pythran: Static compiler for high performance by Mehdi Amini PyData SV 2014Pythran: Static compiler for high performance by Mehdi Amini PyData SV 2014
Pythran: Static compiler for high performance by Mehdi Amini PyData SV 2014
 
Introducción a Elixir
Introducción a ElixirIntroducción a Elixir
Introducción a Elixir
 
Porting to Python 3
Porting to Python 3Porting to Python 3
Porting to Python 3
 
Introduction to Python 01-08-2023.pon by everyone else. . Hence, they must be...
Introduction to Python 01-08-2023.pon by everyone else. . Hence, they must be...Introduction to Python 01-08-2023.pon by everyone else. . Hence, they must be...
Introduction to Python 01-08-2023.pon by everyone else. . Hence, they must be...
 
Using-Python-Libraries.9485146.powerpoint.pptx
Using-Python-Libraries.9485146.powerpoint.pptxUsing-Python-Libraries.9485146.powerpoint.pptx
Using-Python-Libraries.9485146.powerpoint.pptx
 
Machine learning with py torch
Machine learning with py torchMachine learning with py torch
Machine learning with py torch
 
Ds lab manual by s.k.rath
Ds lab manual by s.k.rathDs lab manual by s.k.rath
Ds lab manual by s.k.rath
 
Python na Infraestrutura 
MySQL do Facebook

Python na Infraestrutura 
MySQL do Facebook
Python na Infraestrutura 
MySQL do Facebook

Python na Infraestrutura 
MySQL do Facebook

 
Porting to Python 3
Porting to Python 3Porting to Python 3
Porting to Python 3
 
Python for R developers and data scientists
Python for R developers and data scientistsPython for R developers and data scientists
Python for R developers and data scientists
 
Parallel R in snow (english after 2nd slide)
Parallel R in snow (english after 2nd slide)Parallel R in snow (english after 2nd slide)
Parallel R in snow (english after 2nd slide)
 
Chapter 22. Lambda Expressions and LINQ
Chapter 22. Lambda Expressions and LINQChapter 22. Lambda Expressions and LINQ
Chapter 22. Lambda Expressions and LINQ
 
Lab manual data structure (cs305 rgpv) (usefulsearch.org) (useful search)
Lab manual data structure (cs305 rgpv) (usefulsearch.org)  (useful search)Lab manual data structure (cs305 rgpv) (usefulsearch.org)  (useful search)
Lab manual data structure (cs305 rgpv) (usefulsearch.org) (useful search)
 

Numerical tour in the Python eco-system: Python, NumPy, scikit-learn

  • 1. Numerical tour in the Python eco-system: Python, NumPy, scikit-learn. Arnaud Joly, October 2, 2014
  • 3. How to install Python?
    Download and use the Anaconda Python distribution: https://store.continuum.io/cshop/anaconda/. It comes with the whole scientific Python stack.
    Alternatives: Linux packages, pythonxy, Canopy, ...
  • 4. Using the Python interpreter
    Interactive mode
      1. Start a Python shell
         $ ipython
      2. Write Python code
         >>> print("Hello World!")
         Hello World!
    Script mode
      1. Write the script hello.py
         print("Hello World!")
      2. Launch the script
         $ ipython hello.py
         Hello World!
  • 5. Basic types
    Integer
      >>> 5
      5
      >>> a = 5
      >>> a
      5
    Float
      >>> pi = 3.14
    Complex
      >>> c = 1 - 1j
    Boolean
      >>> b = 5 > 3  # 5 <= 3
      >>> b
      True  # False
    String
      >>> s = 'hello!'  # Also works with "hello!"
      >>> s
      'hello!'
  • 6. Python is a dynamically typed programming language
    Variable types are inferred implicitly at assignment. Variables are not declared.
      >>> # In Python
      >>> a = 1
    By contrast, in a statically typed language you must declare the type.
      // In Java, C, C++
      int a = 1;
  • 7. Numbers and their arithmetic operations (+, -, /, //, *, **, %)
      >>> 1 + 2
      3
      >>> 50 - 5 * 6
      20
      >>> 2 / 3  # integer division in py2; py3 gives 0.66...
      0
      >>> 2. / 3  # float division in py2 and py3
      0.6666666666666666
      >>> 4 // 3  # integer division in py2 and py3
      1
      >>> 5 ** 3.5  # exponent
      279.5084971874737
      >>> 4 % 2  # modulo operation
      0
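The division operators are where Python 2 and Python 3 differ most visibly. As a minimal sketch, assuming Python 3 (where `/` is always true division), the following shows how `/`, `//`, `%` and `divmod` relate, including for negative operands; the variable names are illustrative only.

```python
# Assumes Python 3: / is true division, // is floor division.
true_div = 2 / 3                 # float result, no truncation
floor_div = 4 // 3               # floor division on integers
neg_floor = -7 // 2              # floors toward negative infinity
remainder = -7 % 2               # result takes the sign of the divisor
quotient, rest = divmod(17, 5)   # (17 // 5, 17 % 5) in one call

print(true_div, floor_div, neg_floor, remainder, quotient, rest)
```

For integers, `a == (a // b) * b + (a % b)` holds, which is why `-7 // 2` and `-7 % 2` come out as they do.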
  • 8. Playing with strings
      >>> s = 'Great day!'
      >>> s
      'Great day!'
      >>> s[0]  # strings are sequences
      'G'
      >>> """A very
      ... very long string
      ... """
      'A very\nvery long string\n'
      >>> 'i={0} f={2} s={1}'.format(1, 'test', 3.14)
      'i=1 f=3.14 s=test'
  • 9. list, an ordered collection of objects
    Instantiation
      >>> l = []  # an empty list
      >>> l = ['spam', 'egg', ['another list'], 42]
    Indexing
      >>> l[1]
      'egg'
      >>> l[-1]  # same as l[n_elements - 1]
      42
      >>> l[1:3]  # a slice
      ['egg', ['another list']]
    Methods
      >>> len(l)
      4
      >>> l.pop(0)
      'spam'
      >>> l.append(3)
      >>> l
      ['egg', ['another list'], 42, 3]
  • 10. dict, an unordered and associative data structure of key-value pairs
    Instantiation
      >>> d = {1: "a", "b": 2, 0: [4, 5, 6]}
      >>> d
      {0: [4, 5, 6], 1: 'a', 'b': 2}
    Indexing
      >>> d['b']
      2
      >>> 'b' in d
      True
    Insertion
      >>> d['new'] = 56
      >>> d
      {0: [4, 5, 6], 1: 'a', 'b': 2, 'new': 56}
    Deletion
      >>> del d['new']
      >>> d
      {0: [4, 5, 6], 1: 'a', 'b': 2}
  • 11. dict, an unordered and associative data structure of key-value pairs
    Methods
      >>> len(d)
      3
      >>> d.keys()
      [0, 1, 'b']
      >>> d.values()
      [[4, 5, 6], 'a', 2]
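The outputs above are those of a Python 2 session. As a hedged sketch, assuming Python 3.7 or later, the same dictionary keeps its insertion order, and `keys()`, `values()` and `items()` return views that can be wrapped in `list(...)` when an actual list is needed.

```python
# Assumes Python 3.7+, where dicts preserve insertion order.
d = {1: "a", "b": 2, 0: [4, 5, 6]}

keys = list(d.keys())        # views are lazy; list() materializes them
values = list(d.values())
pairs = [(k, v) for k, v in d.items()]   # iterate over key-value pairs

d["new"] = 56                # insertion appends to the end of the order
after_insert = list(d)       # iterating a dict yields its keys
del d["new"]

print(keys, values, after_insert)
```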
  • 12. Control flow: if / elif / else
      >>> x = 3
      >>> if x == 0:
      ...     print("zero")
      ... elif x == 1:
      ...     print("one")
      ... else:
      ...     print("A big number")
      ...
      A big number
    Each indentation level corresponds to a block of code.
  • 13. Control flow: for loop
      >>> l = [0, 1, 2, 3]
      >>> for a in l:  # Iterate over a sequence
      ...     print(a ** 2)
      ...
      0
      1
      4
      9
    Iterating over a sequence of numbers is easy with the range built-in.
      >>> range(3)
      [0, 1, 2]
      >>> range(3, 10, 3)
      [3, 6, 9]
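The `range` outputs shown are lists, as printed by Python 2. A small sketch, assuming Python 3, where `range` is a lazy sequence that has to be converted explicitly to see its values:

```python
# Assumes Python 3: range() returns a lazy range object, not a list.
r = range(3, 10, 3)
as_list = list(r)                        # materialize the values
squares = [a ** 2 for a in range(4)]     # same loop as above, as a comprehension

print(r)         # the range object itself
print(as_list)
print(squares)
```

The range object still supports `len`, indexing and membership tests without building the list.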
  • 14. Control flow: while
      >>> a, b = 0, 1
      >>> while b < 50:  # while True do ...
      ...     a, b = b, a + b
      ...     print(a)
      ...
      1
      1
      2
      3
      5
      8
      13
      21
      34
  • 15. Control flow: functions
      >>> def f(x, e=2):
      ...     return x ** e
      ...
      >>> f(3)
      9
      >>> f(5, 3)
      125
      >>> f(5, e=3)
      125
    Function arguments are passed by object reference in Python: the function receives the caller's object, not a copy. Be aware of side effects: mutable default parameters and in-place modifications of the arguments.
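The two side effects named on this slide can be illustrated with a short sketch. The function names (`append_bad`, `append_good`, `double_in_place`) are hypothetical and chosen only for this example.

```python
def append_bad(item, bucket=[]):
    # The default list is created once, when the function is defined,
    # so every call that omits `bucket` shares the same list object.
    bucket.append(item)
    return bucket


def append_good(item, bucket=None):
    # Use None as a sentinel and build a fresh list on each call.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


def double_in_place(values):
    # Mutates the caller's list: no copy is made when it is passed in.
    for i in range(len(values)):
        values[i] *= 2


first_bad = list(append_bad(1))    # snapshot of the shared default
second_bad = list(append_bad(2))   # the earlier item is still there
first_good = append_good(1)
second_good = append_good(2)

data = [1, 2, 3]
double_in_place(data)

print(first_bad, second_bad, first_good, second_good, data)
```

Passing `list(data)` or `data[:]` instead of `data` is one way to protect the caller's list from in-place changes.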
  • 16. Classes and objects
      >>> class Counter:
      ...     def __init__(self, initial_value=0):
      ...         self.value = initial_value
      ...     def inc(self):
      ...         self.value += 1
      ...
      >>> c = Counter()  # Instantiate a counter object
      >>> c.value  # Access an attribute
      0
      >>> c.inc()  # Call a method
      >>> c.value
      1
  • 17. Import a package
      >>> import math
      >>> math.log(3)
      1.0986122886681098
      >>> from math import log
      >>> log(4)
      1.3862943611198906
    You can try "import this" and "import antigravity".
  • 18. Python reference and tutorial
    - Python Tutorial: http://docs.python.org/tutorial/
    - Python Reference: https://docs.python.org/library/
    How to use the "?" in ipython?
      In [0]: d = {"a": 1}
      In [1]: d?
      Type: dict
      String Form: {'a': 1}
      Length: 1
      Docstring:
      dict() -> new empty dictionary
      dict(mapping) -> new dictionary initialized from a mapping object's (key, value) pairs
      dict(iterable) -> new dictionary initialized as if via:
          d = {}
          for k, v in iterable:
              d[k] = v
      dict(**kwargs) -> new dictionary initialized with the name=value pairs in the keyword argument list. For example: dict(one=1, two=2)
  • 20. NumPy
    NumPy is the fundamental package for scientific computing with Python. It contains among other things:
    - a powerful N-dimensional array object,
    - sophisticated (broadcasting) functions,
    - tools for integrating C/C++ and Fortran code,
    - useful linear algebra, Fourier transform, and random number capabilities.
    With SciPy, it's a replacement for MATLAB(c).
  • 21. 1-D numpy arrays
    Let's import the package.
      >>> import numpy as np
    Let's create a 1-dimensional array.
      >>> a = np.array([0, 1, 2, 3])
      >>> a
      array([0, 1, 2, 3])
      >>> a.ndim
      1
      >>> a.shape
      (4,)
  • 22. 2-D numpy arrays
    Let's import the package.
      >>> import numpy as np
    Let's create a 2-dimensional array.
      >>> b = np.array([[0, 1, 2], [3, 4, 5]])
      >>> b
      array([[0, 1, 2],
             [3, 4, 5]])
      >>> b.ndim
      2
      >>> b.shape
      (2, 3)
    Routines to create arrays: np.ones, np.zeros, ...
  • 23. Array operations
      >>> a = np.ones(3) / 5.
      >>> b = np.array([1, 2, 3])
      >>> a + b
      array([ 1.2, 2.2, 3.2])
      >>> np.dot(a, b)
      1.200000
      >>> ...
    Many functions operate efficiently on arrays: np.max, np.min, np.mean, np.unique, ...
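As a small sketch of the elementwise and broadcasting behaviour behind these operations (assuming NumPy is installed; the variable names other than `a` and `b` are illustrative):

```python
import numpy as np

a = np.ones(3) / 5.0                 # array of three 0.2 values
b = np.array([1, 2, 3])

elementwise = a * b                  # product taken element by element
scaled = b + 10                      # the scalar is broadcast to every element
dot = np.dot(a, b)                   # 0.2*1 + 0.2*2 + 0.2*3
col_plus_row = b.reshape(3, 1) + b   # shapes (3, 1) and (3,) broadcast to (3, 3)

print(elementwise, scaled, dot)
print(col_plus_row)
```

Entry `[i, j]` of `col_plus_row` is `b[i] + b[j]`, computed without an explicit Python loop.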
  • 24. Indexing numpy arrays
      >>> a = np.array([[1, 2, 3], [4, 5, 6]])
      >>> a[1, 2]
      6
      >>> a[1]
      array([4, 5, 6])
      >>> a[:, 2]
      array([3, 6])
      >>> a[:, 1:3]
      array([[2, 3],
             [5, 6]])
      >>> b = a > 2
      >>> b
      array([[False, False,  True],
             [ True,  True,  True]], dtype=bool)
      >>> a[b]
      array([3, 4, 5, 6])
  • 25. Reference and documentation
    - NumPy User Guide: http://docs.scipy.org/doc/numpy/user/
    - NumPy Reference: http://docs.scipy.org/doc/numpy/reference/
    - MATLAB to NumPy: http://wiki.scipy.org/NumPy_for_Matlab_Users
  • 27. scikit-learn: Machine Learning in Python
    - Simple and efficient tools for data mining and data analysis
    - Accessible to everybody, and reusable in various contexts
    - Built on NumPy, SciPy, and matplotlib
    - Open source, commercially usable, BSD license
  • 28. A bug or need help?
    - Mailing list: scikit-learn-general@lists.sourceforge.net
    - Tag scikit-learn on Stack Overflow.
    How to install?
    - It's shipped with Anaconda.
    - http://scikit-learn.org/stable/install.html
  • 29. Digits classification task
      # Load some data
      from sklearn.datasets import load_digits
      digits = load_digits()
      X, y = digits.data, digits.target
    How can we build a system to classify images? What is the first step?
  • 30. Data exploration and visualization
      # Data visualization
      import matplotlib.pyplot as plt
      plt.gray()
      plt.matshow(digits.images[0])
      plt.show()
    What else can be done?
  • 31. Fit a supervised learning model
      from sklearn.svm import SVC
      clf = SVC()  # Instantiate a classifier
      # API The base object, implements a fit method to learn from
      clf.fit(X, y)  # Fit a classifier with the learning samples
      # API Exploit the fitted model to make prediction
      clf.predict(X)
      # API Get a goodness of fit given data (X, y)
      clf.score(X, y)  # accuracy=1.
    What do you think about this score of 1.?
  • 32. Cross validation
      from sklearn.svm import SVC
      from sklearn.cross_validation import KFold
      scores = []
      for train, test in KFold(len(X), n_folds=5, shuffle=True):
          X_train, y_train = X[train], y[train]
          X_test, y_test = X[test], y[test]
          clf = SVC()
          clf.fit(X_train, y_train)
          scores.append(clf.score(X_test, y_test))
      print(np.mean(scores))  # 0.44... !
    What do you think about this score of 0.44?
    Tip: This could be simplified using the cross_val_score function.
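To make explicit what the K-fold loop iterates over, here is a pure-Python sketch of the index splitting. `k_fold_indices` is a hypothetical helper written for this illustration, not scikit-learn's implementation. Note also that in recent scikit-learn releases `KFold` and `cross_val_score` are imported from `sklearn.model_selection` rather than `sklearn.cross_validation`, and the `KFold` signature differs from the one used on the slide.

```python
import random


def k_fold_indices(n_samples, n_folds=5, seed=0):
    """Yield (train, test) index lists (hypothetical helper, not sklearn's KFold)."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)   # shuffle once, reproducibly
    # Spread the remainder over the first folds so sizes differ by at most 1.
    base, extra = divmod(n_samples, n_folds)
    start = 0
    for fold in range(n_folds):
        size = base + (1 if fold < extra else 0)
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


folds = list(k_fold_indices(12, n_folds=5))
for train, test in folds:
    print(len(train), sorted(test))
```

Each sample lands in exactly one test fold, so every score in the loop is computed on data the model did not see during fitting.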
  • 33. Hyper-parameter optimization
      from sklearn.svm import SVC
      from sklearn.cross_validation import cross_val_score
      parameters = np.linspace(0.0001, 0.01, num=10)
      scores = []
      for value in parameters:
          clf = SVC(gamma=value)
          s = cross_val_score(clf, X, y=y, cv=5)
          scores.append(np.mean(s, axis=0))
      print(np.max(scores))  # 0.97... !
    Tip: This could be simplified using the GridSearchCV meta-estimator.
  • 34. Visualizing hyper-parameter search
      import matplotlib.pyplot as plt
      plt.figure()
      plt.plot(parameters, scores)
      plt.xlabel("Gamma")
      plt.ylabel("Accuracy")
      plt.savefig("images/grid.png")
  • 35. Estimator cooking: transformer union and pipeline
      from sklearn.preprocessing import StandardScaler
      from sklearn.pipeline import make_pipeline
      # API Transformer has a transform method
      clf = make_pipeline(StandardScaler(),
                          # More transformers here
                          SVC())

      from sklearn.pipeline import make_union
      from sklearn.preprocessing import PolynomialFeatures
      union_transformers = make_union(StandardScaler(),
                                      # More transformers here
                                      PolynomialFeatures())
      clf = make_pipeline(union_transformers, SVC())
  • 36. Model persistence
      from sklearn.externals import joblib
      # Save the model for later
      joblib.dump(clf, "model.joblib")
      # Load the model
      clf = joblib.load("model.joblib")
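The same save-and-reload round trip can be sketched with the standard-library `pickle` module. This is an illustration of the idea only: a plain dictionary (`model_state`, a hypothetical stand-in) takes the place of a fitted estimator, and the file name is arbitrary. In later scikit-learn releases `sklearn.externals.joblib` was removed, and joblib is imported as a standalone package (`import joblib`).

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for a fitted model: any picklable Python object works.
model_state = {"kernel": "rbf", "gamma": 0.001, "classes": list(range(10))}

path = os.path.join(tempfile.mkdtemp(), "model.pkl")

# Save the object for later
with open(path, "wb") as f:
    pickle.dump(model_state, f)

# Load it back
with open(path, "rb") as f:
    restored = pickle.load(f)

print(restored == model_state, restored is model_state)
```

The restored object is equal to the original but is a new object. As with joblib, only load pickle files from sources you trust.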
  • 37. Reference and documentation
    - User Guide: http://scikit-learn.org/stable/user_guide.html
    - Reference: http://scikit-learn.org/stable/modules/classes.html
    - Examples: http://scikit-learn.org/stable/auto_examples/index.html