Numerical tour in the Python eco-system 
Python, NumPy, scikit-learn 
Arnaud Joly 
October 2, 2014
2 / 37
How to install Python? 
Download and use the Anaconda Python distribution:
https://store.continuum.io/cshop/anaconda/. It comes with
the whole scientific Python stack.
Alternatives: Linux packages, Python(x,y), Canopy, ...
3 / 37
Using the python interpreter 
Interactive mode 
1. Start a python shell 
$ ipython 
2. Write python code 
>>> print("Hello World!") 
Hello World! 
Script mode 
1. Write the script hello.py
print("Hello World!")
2. Launch the script
$ ipython hello.py
Hello World!
4 / 37
Basic types 
Integer >>> 5 
5 
>>> a = 5 
>>> a 
5 
Float >>> pi = 3.14
Complex >>> c = 1 - 1j
Boolean >>> b = 5 > 3 # 5 <= 3
>>> b
True # False
String >>> s = 'hello!' # Also works with "hello!"
>>> s
'hello!'
5 / 37
Python is a dynamically typed programming language
Variable types are inferred implicitly at assignment.
Variables are not declared.
>>> # In Python
>>> a = 1
By contrast, in a statically typed language you must declare the
type.
// In Java, C, C++
int a = 1;
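As a small illustration of the point above, the sketch below rebinds one name to objects of two different types. The variable names are only illustrative.

```python
# A name can be rebound to objects of different types at run time.
a = 1
type_before = type(a)   # int
a = "one"               # same name, now bound to a str
type_after = type(a)    # str

# isinstance() checks the type of the object the name currently refers to.
print(type_before.__name__, type_after.__name__, isinstance(a, str))
```

The type belongs to the object, not to the name, which is why no declaration is needed.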
6 / 37
Numbers and their arithmetic operations (+,-,/,//,*,**,%) 
>>> 1 + 2
3
>>> 50 - 5 * 6
20
>>> 2 / 3 # integer division in py2; py3 gives 0.66...
0
>>> 2. / 3 # float division in py2 and py3 
0.6666666666666666 
>>> 4 // 3 # Integer division with py2 and py3 
1 
>>> 5 ** 3.5 # exponent 
279.5084971874737 
>>> 4 % 2 # modulo operation 
0 
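The operators listed on this slide interact in a few ways that are easy to get wrong. The sketch below, written for Python 3, shows how `//` and `%` relate and how they behave with a negative operand. The variable names are illustrative.

```python
# Floor division and modulo satisfy a == (a // b) * b + (a % b).
q, r = divmod(7, 2)              # same as (7 // 2, 7 % 2)

# With a negative operand, // rounds toward minus infinity
# and % takes the sign of the divisor.
neg_q, neg_r = -7 // 2, -7 % 2

# Mixing int and float promotes the result to float.
mixed = 1 + 2.0

print(q, r, neg_q, neg_r, mixed)
```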
7 / 37
Playing with strings 
>>> s = 'Great day!'
>>> s
'Great day!'
>>> s[0] # strings are sequences
'G'
>>> """A very
very long string
"""
'A very\nvery long string\n'
>>> 'i={0} f={2} s={1}'.format(1, 'test', 3.14)
'i=1 f=3.14 s=test'
8 / 37
list, an ordered collection of objects 
Instantiation >>> l = [] # an empty list
>>> l = ['spam', 'egg', ['another list'], 42]
Indexing >>> l[1]
'egg'
>>> l[-1] # n_elements - 1
42
>>> l[1:3] # a slice (the stop index is excluded)
['egg', ['another list']]
Methods >>> len(l)
4
>>> l.pop(0)
'spam'
>>> l.append(3)
>>> l
['egg', ['another list'], 42, 3]
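Slicing is the part of this slide that most often surprises newcomers. The sketch below reuses the slide's list to show that the stop index is excluded and that a slice is a shallow copy. The names `first_two`, `last_two`, `every_other` and `shallow` are illustrative.

```python
l = ['spam', 'egg', ['another list'], 42]

first_two = l[:2]       # elements 0 and 1; the stop index is excluded
last_two = l[-2:]       # the last two elements
every_other = l[::2]    # a step of 2

# A slice is a new list, but only a shallow copy: nested objects are shared.
shallow = l[:]
shallow[2].append('shared')
print(l[2])             # the inner list changed in the original as well
```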
9 / 37
dict, an unordered and associative data structure of 
key-value pairs 
Instantiation >>> d = {1: "a", "b": 2, 0: [4, 5, 6]}
>>> d
{0: [4, 5, 6], 1: 'a', 'b': 2}
Indexing >>> d['b']
2
>>> 'b' in d
True
Insertion >>> d['new'] = 56
>>> d
{0: [4, 5, 6], 1: 'a', 'b': 2, 'new': 56}
Deletion >>> del d['new']
>>> d
{0: [4, 5, 6], 1: 'a', 'b': 2}
10 / 37
dict, an unordered and associative data structure of 
key-value pairs 
Methods >>> len(d) 
3 
>>> d.keys() 
[0, 1, 'b']
>>> d.values()
[[4, 5, 6], 'a', 2]
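A few more dict idioms complement the methods above. The sketch below assumes Python 3, where `keys()` and `values()` return views rather than lists. The names `pairs`, `missing` and `keys` are illustrative.

```python
d = {1: "a", "b": 2, 0: [4, 5, 6]}

# items() yields (key, value) pairs; sorting on str(key) gives a stable order
# even though the keys mix ints and strings.
pairs = sorted(d.items(), key=lambda kv: str(kv[0]))

# get() returns a default instead of raising KeyError for a missing key.
missing = d.get("absent", None)

# In Python 3, keys() returns a view; wrap it in list() to index it.
keys = list(d.keys())

print(pairs, missing, len(keys))
```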
11 / 37
Control flow: if / elif / else 
>>> x = 3
>>> if x == 0:
...     print("zero")
... elif x == 1:
...     print("one")
... else:
...     print("A big number")
...
A big number
Each indentation level corresponds to a block of code.
12 / 37
Control flow: for loop 
>>> l = [0, 1, 2, 3]
>>> for a in l: # Iterate over a sequence
...     print(a ** 2)
...
0
1
4
9
Iterating over a sequence of numbers is easy with the range built-in.
>>> range(3) # py2 returns a list; in py3 use list(range(3))
[0, 1, 2]
>>> range(3, 10, 3)
[3, 6, 9]
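For readers on Python 3, where `range` is lazy, the same loops can be sketched as follows. `enumerate` is a related built-in that the slide does not mention. The variable names are illustrative.

```python
# In Python 3, range() is lazy; list() or a comprehension materializes it.
squares = [i ** 2 for i in range(4)]
steps = list(range(3, 10, 3))

# enumerate() pairs each element with its index.
indexed = [(i, word) for i, word in enumerate(['spam', 'egg'])]

print(squares, steps, indexed)
```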
13 / 37
Control flow: while 
>>> a, b = 0, 1
>>> while b < 50: # loop as long as the condition is true
...     a, b = b, a + b
...     print(a)
...
1 
1 
2 
3 
5 
8 
13 
21 
34 
14 / 37
Control flow: functions 
>>> def f(x, e=2):
...     return x ** e
...
>>> f(3) 
9 
>>> f(5, 3) 
125 
>>> f(5, e=3) 
125 
Function arguments are passed by object reference in Python: the
function receives the caller's object, not a copy. Be aware of side
effects: mutable default parameters and in-place modifications of
the arguments.
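The two side effects named above can be reproduced in a few lines. This is a minimal sketch; the helper functions `append_bad`, `append_good` and `double_in_place` are hypothetical and exist only for illustration.

```python
def append_bad(item, bucket=[]):
    # The default list is created once, when the function is defined,
    # so every call that omits `bucket` shares the same list.
    bucket.append(item)
    return bucket

def append_good(item, bucket=None):
    # Use None as a sentinel and create a fresh list on each call.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket

def double_in_place(values):
    # Mutates the caller's list: the function receives the same object.
    for i in range(len(values)):
        values[i] *= 2

first = append_bad(1)
second = append_bad(2)            # same list object as `first`
fresh = (append_good(1), append_good(2))

data = [1, 2, 3]
double_in_place(data)
print(first, second, fresh, data)
```

Using `None` as the default and copying an argument before modifying it are the usual ways to avoid both pitfalls.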
15 / 37
Classes and objects
>>> class Counter:
...     def __init__(self, initial_value=0):
...         self.value = initial_value
...     def inc(self):
...         self.value += 1
...
>>> c = Counter() # Instantiate a counter object 
>>> c.value # Access to an attribute 
0 
>>> c.inc() # Call a method 
>>> c.value 
1 
16 / 37
Import a package 
>>> import math 
>>> math.log(3) 
1.0986122886681098 
>>> from math import log 
>>> log(4) 
1.3862943611198906 
You can try "import this" and "import antigravity". 
17 / 37
Python reference and tutorial 
- Python Tutorial: http://docs.python.org/tutorial/
- Python Reference: https://docs.python.org/library/
How to use the "?" in ipython? 
In [0]: d = {"a": 1} 
In [1]: d? 
Type: dict 
String Form:{'a': 1}
Length: 1
Docstring:
dict() -> new empty dictionary
dict(mapping) -> new dictionary initialized from a mapping object's
    (key, value) pairs
dict(iterable) -> new dictionary initialized as if via:
    d = {}
    for k, v in iterable:
        d[k] = v
dict(**kwargs) -> new dictionary initialized with the name=value pairs
    in the keyword argument list. For example: dict(one=1, two=2)
18 / 37
NumPy 
NumPy is the fundamental package for scientific computing with 
Python. It contains among other things: 
- a powerful N-dimensional array object,
- sophisticated (broadcasting) functions,
- tools for integrating C/C++ and Fortran code,
- useful linear algebra, Fourier transform, and random number
  capabilities.
Together with SciPy, it can serve as a replacement for MATLAB.
20 / 37
1-D numpy arrays 
Let’s import the package. 
>>> import numpy as np 
Let’s create a 1-dimensional array. 
>>> a = np.array([0, 1, 2, 3]) 
>>> a 
array([0, 1, 2, 3]) 
>>> a.ndim 
1 
>>> a.shape 
(4,) 
21 / 37
2-D numpy arrays 
Let’s import the package. 
>>> import numpy as np 
Let’s create a 2-dimensional array. 
>>> b = np.array([[0, 1, 2], [3, 4, 5]]) 
>>> b 
array([[0, 1, 2],
       [3, 4, 5]])
>>> b.ndim 
2 
>>> b.shape 
(2, 3) 
Routines to create arrays: np.ones, np.zeros, ...
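The creation routines mentioned above can be sketched as follows. `np.arange`, `np.linspace` and `reshape` are not named on the slide and are added here as common companions. The variable names are illustrative.

```python
import numpy as np

zeros = np.zeros((2, 3))          # 2x3 array filled with 0.0
ones = np.ones(4, dtype=int)      # 1-D array of four integer ones
ramp = np.arange(0, 10, 2)        # like range(): 0, 2, 4, 6, 8
grid = np.linspace(0.0, 1.0, 5)   # 5 evenly spaced points, endpoints included
square = ramp[:4].reshape(2, 2)   # the first four values arranged as 2x2

print(zeros.shape, ones, ramp, grid, square, sep="\n")
```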
22 / 37
Array operations 
>>> a = np.ones(3) / 5. 
>>> b = np.array([1, 2, 3]) 
>>> a + b 
array([ 1.2, 2.2, 3.2]) 
>>> np.dot(a, b) 
1.2 # up to floating-point rounding
>>> ... 
Many functions operate efficiently on arrays: np.max, np.min,
np.mean, np.unique, ...
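As a sketch of how these functions are typically used on 2-D data, the example below combines broadcasting with reductions along an axis. The arrays `m` and `row` are made up for illustration.

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])
row = np.array([10, 20, 30])

shifted = m + row                     # `row` is broadcast across each row of m
col_means = m.mean(axis=0)            # mean of each column, shape (3,)
row_max = m.max(axis=1)               # maximum of each row, shape (2,)
labels = np.unique([3, 1, 3, 2, 1])   # sorted distinct values

print(shifted, col_means, row_max, labels, sep="\n")
```

The `axis` argument names the dimension that is collapsed: `axis=0` reduces over rows and leaves one value per column.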
23 / 37
Indexing numpy arrays
>>> a = np.array([[1, 2, 3], [4, 5, 6]]) 
>>> a[1, 2] 
6 
>>> a[1] 
array([4, 5, 6]) 
>>> a[:, 2] 
array([3, 6]) 
>>> a[:, 1:3] 
array([[2, 3],
       [5, 6]])
>>> b = a > 2
>>> b
array([[False, False,  True],
       [ True,  True,  True]], dtype=bool)
>>> a[b] 
array([3, 4, 5, 6]) 
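Two consequences of the indexing rules above are worth a sketch: a boolean mask can be used for assignment, and a basic slice is a view on the original data rather than a copy. The variable names are illustrative.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])

clipped = a.copy()
clipped[clipped > 4] = 4          # assign through a boolean mask

rows, cols = np.where(a > 4)      # indices of the elements matching the condition

view = a[:, 1:3]                  # basic slicing returns a view, not a copy
view[0, 0] = 99                   # this also changes a[0, 1]

print(clipped, rows, cols, a, sep="\n")
```

Call `.copy()` on a slice when the original array must stay untouched.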
24 / 37
Reference and documentation 
- NumPy User Guide:
  http://docs.scipy.org/doc/numpy/user/
- NumPy Reference:
  http://docs.scipy.org/doc/numpy/reference/
- MATLAB to NumPy:
  http://wiki.scipy.org/NumPy_for_Matlab_Users
25 / 37
scikit-learn: Machine Learning in Python
- Simple and efficient tools for data mining and data analysis
- Accessible to everybody, and reusable in various contexts
- Built on NumPy, SciPy, and matplotlib
- Open source, commercially usable - BSD license
27 / 37
Found a bug or need help?
- Mailing list:
  scikit-learn-general@lists.sourceforge.net
- Tag scikit-learn on Stack Overflow.
How to install?
- It’s shipped with Anaconda.
- http://scikit-learn.org/stable/install.html
28 / 37
Digits classification task 
# Load some data 
from sklearn.datasets import load_digits 
digits = load_digits() 
X, y = digits.data, digits.target 
How can we build a system to classify images? 
What is the first step? 
29 / 37
Data exploration and visualization 
# Data visualization 
import matplotlib.pyplot as plt 
plt.gray() 
plt.matshow(digits.images[0]) 
plt.show() 
What else can be done? 
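One possible answer to the question above is to look at basic statistics before plotting anything. The sketch below is only a suggestion, not the author's method; it inspects shapes, labels, class balance and the pixel value range.

```python
import numpy as np
from sklearn.datasets import load_digits

digits = load_digits()
X, y = digits.data, digits.target

print(X.shape)               # (n_samples, n_features)
print(digits.images.shape)   # the same samples as 8x8 images
print(np.unique(y))          # the class labels
print(np.bincount(y))        # number of samples per class
print(X.min(), X.max())      # range of the pixel intensities
```

Checking that the classes are roughly balanced and that the features share a common scale helps decide which model and preprocessing to try first.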
30 / 37
Fit a supervised learning model 
from sklearn.svm import SVC
clf = SVC(gamma="auto") # Instantiate a classifier (gamma="auto" mimics the 2014 default)
# API: the base object (estimator) implements a fit method to learn from data
clf.fit(X, y) # Fit a classifier with the learning samples
# API: exploit the fitted model to make predictions
clf.predict(X)
# API: get a goodness of fit given data (X, y)
clf.score(X, y) # accuracy=1.
What do you think about this score of 1.? 
31 / 37
Cross validation 
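import numpy as np
from sklearn.svm import SVC
# scikit-learn >= 0.18: KFold lives in sklearn.model_selection
# (the original slides imported it from sklearn.cross_validation)
from sklearn.model_selection import KFold
scores = []
for train, test in KFold(n_splits=5, shuffle=True).split(X):
    X_train, y_train = X[train], y[train]
    X_test, y_test = X[test], y[test]
    # gamma="auto" is 1 / n_features, the default behavior when the slides were written
    clf = SVC(gamma="auto")
    clf.fit(X_train, y_train)
    scores.append(clf.score(X_test, y_test))
print(np.mean(scores)) # 0.44... !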
What do you think about this score of 0.44? 
Tip: This could be simplified using the cross_val_score function. 
32 / 37
Hyper-parameter optimization 
import numpy as np
from sklearn.svm import SVC
# scikit-learn >= 0.18 (the original slides used sklearn.cross_validation)
from sklearn.model_selection import cross_val_score
parameters = np.linspace(0.0001, 0.01, num=10)
scores = []
for value in parameters:
    clf = SVC(gamma=value)
    s = cross_val_score(clf, X, y=y, cv=5)
    scores.append(np.mean(s, axis=0))
print(np.max(scores)) # 0.97... !
Tip: This could be simplified using the GridSearchCV 
meta-estimator. 
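The tip above can be sketched as follows, assuming a recent scikit-learn where `GridSearchCV` lives in `sklearn.model_selection`. The grid reuses the slide's `gamma` range; the names `param_grid` and `search` are illustrative.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)

# Same candidate values for gamma as in the manual loop.
param_grid = {"gamma": np.linspace(0.0001, 0.01, num=10)}

# GridSearchCV cross-validates every candidate, then refits the best
# one on the whole data set (refit=True is the default).
search = GridSearchCV(SVC(), param_grid, cv=5)
search.fit(X, y)

print(search.best_params_)
print(search.best_score_)
```

After fitting, `search` behaves like an estimator itself: `search.predict` delegates to the refitted best model.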
33 / 37
Visualizing hyper-parameter search 
import matplotlib.pyplot as plt 
plt.figure() 
plt.plot(parameters, scores) 
plt.xlabel("Gamma") 
plt.ylabel("Accuracy") 
plt.savefig("images/grid.png") 
34 / 37
Estimator cooking: transformer union and pipeline 
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
# API: a transformer has a transform method
clf = make_pipeline(StandardScaler(),
                    # More transformers here
                    SVC())
from sklearn.pipeline import make_union
from sklearn.preprocessing import PolynomialFeatures
union_transformers = make_union(StandardScaler(),
                                # More transformers here
                                PolynomialFeatures())
clf = make_pipeline(union_transformers, SVC())
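A pipeline is itself an estimator, so it can be passed to the cross-validation helpers used on the earlier slides. The sketch below assumes a recent scikit-learn and evaluates the simpler of the two pipelines; the name `pipe` is illustrative.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)

pipe = make_pipeline(StandardScaler(), SVC())

# The scaler is re-fitted on the training part of each fold, so no
# information from the held-out part leaks into the preprocessing.
scores = cross_val_score(pipe, X, y, cv=5)
print(scores.mean())

# make_pipeline names each step after its lowercased class name.
step_names = [name for name, _ in pipe.steps]
print(step_names)
```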
35 / 37
Model persistence 
# scikit-learn >= 0.23: joblib is a separate package
# (the original slides used "from sklearn.externals import joblib")
import joblib
# Save the model for later
joblib.dump(clf, "model.joblib")
# Load the model
clf = joblib.load("model.joblib")
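To check that a saved model survives the round trip, one can compare predictions before and after loading. This is a minimal sketch that assumes the standalone `joblib` package is installed and writes to a temporary directory. The names `restored` and `same` are illustrative.

```python
import os
import tempfile

import joblib
from sklearn.datasets import load_digits
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)
clf = SVC().fit(X, y)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model.joblib")
    joblib.dump(clf, path)          # serialize the fitted estimator
    restored = joblib.load(path)    # read it back as a new object

# The restored model should make the same predictions as the original.
same = bool((restored.predict(X) == clf.predict(X)).all())
print(same)
```

joblib files are pickle-based, so only load files from a trusted source, and prefer loading them with the same library versions that were used to save them.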
36 / 37
Reference and documentation 
- User Guide:
  http://scikit-learn.org/stable/user_guide.html
- Reference:
  http://scikit-learn.org/stable/modules/classes.html
- Examples:
  http://scikit-learn.org/stable/auto_examples/index.html
37 / 37

More Related Content

What's hot

TensorFlow Tutorial | Deep Learning With TensorFlow | TensorFlow Tutorial For...
TensorFlow Tutorial | Deep Learning With TensorFlow | TensorFlow Tutorial For...TensorFlow Tutorial | Deep Learning With TensorFlow | TensorFlow Tutorial For...
TensorFlow Tutorial | Deep Learning With TensorFlow | TensorFlow Tutorial For...
Simplilearn
 

What's hot (20)

Python programming –part 3
Python programming –part 3Python programming –part 3
Python programming –part 3
 
Alex Smola at AI Frontiers: Scalable Deep Learning Using MXNet
Alex Smola at AI Frontiers: Scalable Deep Learning Using MXNetAlex Smola at AI Frontiers: Scalable Deep Learning Using MXNet
Alex Smola at AI Frontiers: Scalable Deep Learning Using MXNet
 
Java Foundations: Data Types and Type Conversion
Java Foundations: Data Types and Type ConversionJava Foundations: Data Types and Type Conversion
Java Foundations: Data Types and Type Conversion
 
Python programming –part 7
Python programming –part 7Python programming –part 7
Python programming –part 7
 
Deep Learning in theano
Deep Learning in theanoDeep Learning in theano
Deep Learning in theano
 
18. Java associative arrays
18. Java associative arrays18. Java associative arrays
18. Java associative arrays
 
TensorFlow Tutorial | Deep Learning With TensorFlow | TensorFlow Tutorial For...
TensorFlow Tutorial | Deep Learning With TensorFlow | TensorFlow Tutorial For...TensorFlow Tutorial | Deep Learning With TensorFlow | TensorFlow Tutorial For...
TensorFlow Tutorial | Deep Learning With TensorFlow | TensorFlow Tutorial For...
 
07. Arrays
07. Arrays07. Arrays
07. Arrays
 
Algorithm analysis and design
Algorithm analysis and designAlgorithm analysis and design
Algorithm analysis and design
 
Python programming- Part IV(Functions)
Python programming- Part IV(Functions)Python programming- Part IV(Functions)
Python programming- Part IV(Functions)
 
Chapter 3 Arrays in Java
Chapter 3 Arrays in JavaChapter 3 Arrays in Java
Chapter 3 Arrays in Java
 
TensorFlow Tutorial | Deep Learning Using TensorFlow | TensorFlow Tutorial Py...
TensorFlow Tutorial | Deep Learning Using TensorFlow | TensorFlow Tutorial Py...TensorFlow Tutorial | Deep Learning Using TensorFlow | TensorFlow Tutorial Py...
TensorFlow Tutorial | Deep Learning Using TensorFlow | TensorFlow Tutorial Py...
 
Tensorflow - Intro (2017)
Tensorflow - Intro (2017)Tensorflow - Intro (2017)
Tensorflow - Intro (2017)
 
Introduction To TensorFlow | Deep Learning Using TensorFlow | CloudxLab
Introduction To TensorFlow | Deep Learning Using TensorFlow | CloudxLabIntroduction To TensorFlow | Deep Learning Using TensorFlow | CloudxLab
Introduction To TensorFlow | Deep Learning Using TensorFlow | CloudxLab
 
Java Foundations: Arrays
Java Foundations: ArraysJava Foundations: Arrays
Java Foundations: Arrays
 
18. Dictionaries, Hash-Tables and Set
18. Dictionaries, Hash-Tables and Set18. Dictionaries, Hash-Tables and Set
18. Dictionaries, Hash-Tables and Set
 
16. Arrays Lists Stacks Queues
16. Arrays Lists Stacks Queues16. Arrays Lists Stacks Queues
16. Arrays Lists Stacks Queues
 
Introduction to NumPy for Machine Learning Programmers
Introduction to NumPy for Machine Learning ProgrammersIntroduction to NumPy for Machine Learning Programmers
Introduction to NumPy for Machine Learning Programmers
 
07. Java Array, Set and Maps
07.  Java Array, Set and Maps07.  Java Array, Set and Maps
07. Java Array, Set and Maps
 
Chapter 4 - Classes in Java
Chapter 4 - Classes in JavaChapter 4 - Classes in Java
Chapter 4 - Classes in Java
 

Viewers also liked

Viewers also liked (20)

Think machine-learning-with-scikit-learn-chetan
Think machine-learning-with-scikit-learn-chetanThink machine-learning-with-scikit-learn-chetan
Think machine-learning-with-scikit-learn-chetan
 
Intro to machine learning with scikit learn
Intro to machine learning with scikit learnIntro to machine learning with scikit learn
Intro to machine learning with scikit learn
 
Scikit-learn: the state of the union 2016
Scikit-learn: the state of the union 2016Scikit-learn: the state of the union 2016
Scikit-learn: the state of the union 2016
 
Intro to scikit learn may 2017
Intro to scikit learn may 2017Intro to scikit learn may 2017
Intro to scikit learn may 2017
 
Data Science and Machine Learning Using Python and Scikit-learn
Data Science and Machine Learning Using Python and Scikit-learnData Science and Machine Learning Using Python and Scikit-learn
Data Science and Machine Learning Using Python and Scikit-learn
 
Tree models with Scikit-Learn: Great models with little assumptions
Tree models with Scikit-Learn: Great models with little assumptionsTree models with Scikit-Learn: Great models with little assumptions
Tree models with Scikit-Learn: Great models with little assumptions
 
Exploring Machine Learning in Python with Scikit-Learn
Exploring Machine Learning in Python with Scikit-LearnExploring Machine Learning in Python with Scikit-Learn
Exploring Machine Learning in Python with Scikit-Learn
 
Realtime predictive analytics using RabbitMQ & scikit-learn
Realtime predictive analytics using RabbitMQ & scikit-learnRealtime predictive analytics using RabbitMQ & scikit-learn
Realtime predictive analytics using RabbitMQ & scikit-learn
 
Authorship Attribution and Forensic Linguistics with Python/Scikit-Learn/Pand...
Authorship Attribution and Forensic Linguistics with Python/Scikit-Learn/Pand...Authorship Attribution and Forensic Linguistics with Python/Scikit-Learn/Pand...
Authorship Attribution and Forensic Linguistics with Python/Scikit-Learn/Pand...
 
Clustering: A Scikit Learn Tutorial
Clustering: A Scikit Learn TutorialClustering: A Scikit Learn Tutorial
Clustering: A Scikit Learn Tutorial
 
Pyparis2017 / Scikit-learn - an incomplete yearly review, by Gael Varoquaux
Pyparis2017 / Scikit-learn - an incomplete yearly review, by Gael VaroquauxPyparis2017 / Scikit-learn - an incomplete yearly review, by Gael Varoquaux
Pyparis2017 / Scikit-learn - an incomplete yearly review, by Gael Varoquaux
 
Introduction to Machine Learning with Python and scikit-learn
Introduction to Machine Learning with Python and scikit-learnIntroduction to Machine Learning with Python and scikit-learn
Introduction to Machine Learning with Python and scikit-learn
 
Machine Learning with scikit-learn
Machine Learning with scikit-learnMachine Learning with scikit-learn
Machine Learning with scikit-learn
 
Intro to scikit-learn
Intro to scikit-learnIntro to scikit-learn
Intro to scikit-learn
 
Scikit-learn for easy machine learning: the vision, the tool, and the project
Scikit-learn for easy machine learning: the vision, the tool, and the projectScikit-learn for easy machine learning: the vision, the tool, and the project
Scikit-learn for easy machine learning: the vision, the tool, and the project
 
Converting Scikit-Learn to PMML
Converting Scikit-Learn to PMMLConverting Scikit-Learn to PMML
Converting Scikit-Learn to PMML
 
Accelerating Random Forests in Scikit-Learn
Accelerating Random Forests in Scikit-LearnAccelerating Random Forests in Scikit-Learn
Accelerating Random Forests in Scikit-Learn
 
Text Classification/Categorization
Text Classification/CategorizationText Classification/Categorization
Text Classification/Categorization
 
Text Classification in Python – using Pandas, scikit-learn, IPython Notebook ...
Text Classification in Python – using Pandas, scikit-learn, IPython Notebook ...Text Classification in Python – using Pandas, scikit-learn, IPython Notebook ...
Text Classification in Python – using Pandas, scikit-learn, IPython Notebook ...
 
Gradient Boosted Regression Trees in scikit-learn
Gradient Boosted Regression Trees in scikit-learnGradient Boosted Regression Trees in scikit-learn
Gradient Boosted Regression Trees in scikit-learn
 

Similar to Numerical tour in the Python eco-system: Python, NumPy, scikit-learn

Introduction to Python 01-08-2023.pon by everyone else. . Hence, they must be...
Introduction to Python 01-08-2023.pon by everyone else. . Hence, they must be...Introduction to Python 01-08-2023.pon by everyone else. . Hence, they must be...
Introduction to Python 01-08-2023.pon by everyone else. . Hence, they must be...
DRVaibhavmeshram1
 
Ds lab manual by s.k.rath
Ds lab manual by s.k.rathDs lab manual by s.k.rath
Ds lab manual by s.k.rath
SANTOSH RATH
 
Parallel R in snow (english after 2nd slide)
Parallel R in snow (english after 2nd slide)Parallel R in snow (english after 2nd slide)
Parallel R in snow (english after 2nd slide)
Cdiscount
 

Similar to Numerical tour in the Python eco-system: Python, NumPy, scikit-learn (20)

Effective Numerical Computation in NumPy and SciPy
Effective Numerical Computation in NumPy and SciPyEffective Numerical Computation in NumPy and SciPy
Effective Numerical Computation in NumPy and SciPy
 
Programming python quick intro for schools
Programming python quick intro for schoolsProgramming python quick intro for schools
Programming python quick intro for schools
 
python lab programs.pdf
python lab programs.pdfpython lab programs.pdf
python lab programs.pdf
 
Cc code cards
Cc code cardsCc code cards
Cc code cards
 
Python testing using mock and pytest
Python testing using mock and pytestPython testing using mock and pytest
Python testing using mock and pytest
 
Python Training v2
Python Training v2Python Training v2
Python Training v2
 
Writing Faster Python 3
Writing Faster Python 3Writing Faster Python 3
Writing Faster Python 3
 
Pythran: Static compiler for high performance by Mehdi Amini PyData SV 2014
Pythran: Static compiler for high performance by Mehdi Amini PyData SV 2014Pythran: Static compiler for high performance by Mehdi Amini PyData SV 2014
Pythran: Static compiler for high performance by Mehdi Amini PyData SV 2014
 
Introducción a Elixir
Introducción a ElixirIntroducción a Elixir
Introducción a Elixir
 
Porting to Python 3
Porting to Python 3Porting to Python 3
Porting to Python 3
 
Introduction to Python 01-08-2023.pon by everyone else. . Hence, they must be...
Introduction to Python 01-08-2023.pon by everyone else. . Hence, they must be...Introduction to Python 01-08-2023.pon by everyone else. . Hence, they must be...
Introduction to Python 01-08-2023.pon by everyone else. . Hence, they must be...
 
Using-Python-Libraries.9485146.powerpoint.pptx
Using-Python-Libraries.9485146.powerpoint.pptxUsing-Python-Libraries.9485146.powerpoint.pptx
Using-Python-Libraries.9485146.powerpoint.pptx
 
Machine learning with py torch
Machine learning with py torchMachine learning with py torch
Machine learning with py torch
 
Ds lab manual by s.k.rath
Ds lab manual by s.k.rathDs lab manual by s.k.rath
Ds lab manual by s.k.rath
 
Python na Infraestrutura 
MySQL do Facebook

Python na Infraestrutura 
MySQL do Facebook
Python na Infraestrutura 
MySQL do Facebook

Python na Infraestrutura 
MySQL do Facebook

 
Porting to Python 3
Porting to Python 3Porting to Python 3
Porting to Python 3
 
Python for R developers and data scientists
Python for R developers and data scientistsPython for R developers and data scientists
Python for R developers and data scientists
 
Parallel R in snow (english after 2nd slide)
Parallel R in snow (english after 2nd slide)Parallel R in snow (english after 2nd slide)
Parallel R in snow (english after 2nd slide)
 
Chapter 22. Lambda Expressions and LINQ
Chapter 22. Lambda Expressions and LINQChapter 22. Lambda Expressions and LINQ
Chapter 22. Lambda Expressions and LINQ
 
Lab manual data structure (cs305 rgpv) (usefulsearch.org) (useful search)
Lab manual data structure (cs305 rgpv) (usefulsearch.org)  (useful search)Lab manual data structure (cs305 rgpv) (usefulsearch.org)  (useful search)
Lab manual data structure (cs305 rgpv) (usefulsearch.org) (useful search)
 

Recently uploaded

Recently uploaded (20)

VTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learnVTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learn
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docx
 
10 Trends Likely to Shape Enterprise Technology in 2024
10 Trends Likely to Shape Enterprise Technology in 202410 Trends Likely to Shape Enterprise Technology in 2024
10 Trends Likely to Shape Enterprise Technology in 2024
 
ManageIQ - Sprint 236 Review - Slide Deck
ManageIQ - Sprint 236 Review - Slide DeckManageIQ - Sprint 236 Review - Slide Deck
ManageIQ - Sprint 236 Review - Slide Deck
 
Sector 18, Noida Call girls :8448380779 Model Escorts | 100% verified
Sector 18, Noida Call girls :8448380779 Model Escorts | 100% verifiedSector 18, Noida Call girls :8448380779 Model Escorts | 100% verified
Sector 18, Noida Call girls :8448380779 Model Escorts | 100% verified
 
Introducing Microsoft’s new Enterprise Work Management (EWM) Solution
Introducing Microsoft’s new Enterprise Work Management (EWM) SolutionIntroducing Microsoft’s new Enterprise Work Management (EWM) Solution
Introducing Microsoft’s new Enterprise Work Management (EWM) Solution
 
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
 
Pharm-D Biostatistics and Research methodology
Pharm-D Biostatistics and Research methodologyPharm-D Biostatistics and Research methodology
Pharm-D Biostatistics and Research methodology
 
8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students
 
Chinsurah Escorts ☎️8617697112 Starting From 5K to 15K High Profile Escorts ...
Chinsurah Escorts ☎️8617697112  Starting From 5K to 15K High Profile Escorts ...Chinsurah Escorts ☎️8617697112  Starting From 5K to 15K High Profile Escorts ...
Chinsurah Escorts ☎️8617697112 Starting From 5K to 15K High Profile Escorts ...
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
 
Define the academic and professional writing..pdf
Define the academic and professional writing..pdfDefine the academic and professional writing..pdf
Define the academic and professional writing..pdf
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...
The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...
The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...
 
%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
 
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdfPayment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
 
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
 
BUS PASS MANGEMENT SYSTEM USING PHP.pptx
BUS PASS MANGEMENT SYSTEM USING PHP.pptxBUS PASS MANGEMENT SYSTEM USING PHP.pptx
BUS PASS MANGEMENT SYSTEM USING PHP.pptx
 

Numerical tour in the Python eco-system: Python, NumPy, scikit-learn

  • 1. Numerical tour in the Python eco-system Python, NumPy, scikit-learn Arnaud Joly October 2, 2014
  • 3. How to install Python? Download and use the Anaconda python distribution https://store.continuum.io/cshop/anaconda/. It comes with all the scientific python stack. Alternatives: linux packages, pythonxy, canopy, . . . 3 / 37
  • 4. Using the python interpreter Interactive mode 1. Start a python shell $ ipython 2. Write python code >>> print("Hello World!") Hello World! Script mode 1. hello.py print("Hello World!") 2. Launch the script $ ipython hello.py Hello world! 4 / 37
  • 5. Basic types Integer >>> 5 5 >>> a = 5 >>> a 5 Float >>> pi = 3.14 complex >>> c = 1 - 1j boolean >>> b = 5 > 3 # 5 <= 3 >>> b True # False string >>> s = ’hello!’ # Also works with "hello!" >>> s ’hello !’ 5 / 37
  • 6. Python is a dynamic program language Variable types are implicitly inferred during the assignment. Variables are not declared. >>> # In python >>> a = 1 By contrast in statically typed language, you must declared the type. // In java, c, c++ int a = 1 6 / 37
  • 7. Numbers and their arithmetic operations (+,-,/,//,*,**,%) >>> 1 + 2 4 >>> 50 - 5 * 6 20 >>> 2 / 3 # with py3 0.66... 0 >>> 2. / 3 # float division in py2 and py3 0.6666666666666666 >>> 4 // 3 # Integer division with py2 and py3 1 >>> 5 ** 3.5 # exponent 279.5084971874737 >>> 4 % 2 # modulo operation 0 7 / 37
  • 8. Playing with strings >>> s = ’Great day!’ >>> s ’Great day!’ >>> s[0] # strings are sequences ’G’ >>> """A very very long string """ ’A verynvery long stringn’ >>> ’i={0} f={2} s={1}’.format(1, ’test’, 3.14) ’i=1 f=3.14 s=test’ 8 / 37
  • 9. list, an ordered collection of objects Instantiation >>> l = [] # an empty list >>> l = [’spam’, ’egg’, [’another list’], 42] Indexing >>> l[1] ’egg’ >>> l[-1] # n_elements - 1 42 >>> l[1:2] # a slice ["egg", [’another list’]] Methods >>> len(l) 4 >>> l.pop(0) ’spam’ >>> l.append(3) >>> l [’egg’, [’another list’], 42, 3] 9 / 37
  • 10. dict, an unordered and associative data structure of key-value pairs Instantiation >>> d = {1: "a", "b": 2, 0: [4, 5, 6]} >>> d {0: [4, 5, 6], 1: ’a’, ’b’: 2} Indexing >>> d[’b’] 2 >>> ’b’ in d True Insertion >>> d[’new’] = 56 >>> d {0: [4, 5, 6], 1: ’a’, ’b’: 2, ’new’: 56} Deletion >>> del d[’new’] >>> d {0: [4, 5, 6], 1: ’a’, ’b’: 2} 10 / 37
  • 11. dict, an unordered and associative data structure of key-value pairs Methods >>> len(d) 3 >>> d.keys() [0, 1, ’b’] >>> d.values() [[4, 5, 6], ’a’, 2] 11 / 37
  • 12. Control flow: if / elif / else >>> x = 3 >>> if x == 0: ... print("zero") ... elif x == 1: ... print("one") ... else: ... print("A big number") ... ’A big number’ Each indentation level corresponds to a block of code 12 / 37
  • 13. Control flow: for loop >>> l = [0, 1, 2, 3] >>> for a in l: # Iterate over a sequence ... print(a ** 2) 0 1 4 Iterating over sequence of numbers is easy with the range built-in. >>> range(3) [0, 1, 2] >>> range(3, 10, 3) [3, 6, 9] 13 / 37
  • 14. Control flow: while >>> a, b = 0, 1 >>> while b < 50: # while True do ... ... a, b = b, a + b ... print(a) ... 1 1 2 3 5 8 13 21 34 14 / 37
  • 15. Control flow: functions >>> def f(x, e=2): ... return x ** e ... >>> f(3) 9 >>> f(5, 3) 125 >>> f(5, e=3) 125 Function arguments are passed by reference in python. Be aware of side effects: mutable default parameters, inplace modifications of the arguments. 15 / 37
  • 16. Classes and object >>> class Counter: ... def __init__(self, initial_value=0): ... self.value = initial_value ... def inc(self): ... self.value += 1 ... >>> c = Counter() # Instantiate a counter object >>> c.value # Access to an attribute 0 >>> c.inc() # Call a method >>> c.value 1 16 / 37
  • 17. Import a package >>> import math >>> math.log(3) 1.0986122886681098 >>> from math import log >>> log(4) 1.3862943611198906 You can try "import this" and "import antigravity". 17 / 37
  • 18. Python reference and tutorial I Python Tutorial : http://docs.python.org/tutorial/ I Python Reference : https://docs.python.org/library/ How to use the "?" in ipython? In [0]: d = {"a": 1} In [1]: d? Type: dict String Form:{’a’: 1} Length: 1 Docstring: dict() -> new empty dictionary dict(mapping) -> new dictionary initialized from a mapping object’s (key, value) pairs dict(iterable) -> new dictionary initialized as if via: d = {} for k, v in iterable: d[k] = v dict(**kwargs) -> new dictionary initialized with the name=value pairs in the keyword argument list. For example: dict(one=1, two=2) 18 / 37
  • 20. NumPy NumPy is the fundamental package for scientific computing with Python. It contains among other things: I a powerful N-dimensional array object, I sophisticated (broadcasting) functions, I tools for integrating C/C++ and Fortran code, I useful linear algebra, Fourier transform, and random number capabilities With SciPy, it’s a replacement for MATLAB(c). 20 / 37
  • 21. 1-D numpy arrays Let’s import the package. >>> import numpy as np Let’s create a 1-dimensional array. >>> a = np.array([0, 1, 2, 3]) >>> a array([0, 1, 2, 3]) >>> a.ndim 1 >>> a.shape (4,) 21 / 37
  • 22. 2-D numpy arrays Let’s import the package. >>> import numpy as np Let’s create a 2-dimensional array. >>> b = np.array([[0, 1, 2], [3, 4, 5]]) >>> b array([[ 0, 1, 2], [ 3, 4, 5]]) >>> b.ndim 2 >>> b.shape (2, 3) Routine to create array: np.ones, np.zeros,. . . 22 / 37
  • 23. Array operations
    >>> a = np.ones(3) / 5.
    >>> b = np.array([1, 2, 3])
    >>> a + b
    array([ 1.2, 2.2, 3.2])
    >>> np.dot(a, b)
    1.200000
    >>> ...
    Many functions operate efficiently on arrays: np.max, np.min, np.mean, np.unique, ...
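As a small illustration of the reduction functions listed on this slide, the sketch below applies a few of them along specific axes and combines the result with broadcasting. The array values and variable names are made up for the example.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])

col_means = a.mean(axis=0)   # mean of each column, shape (3,)
row_max = a.max(axis=1)      # maximum of each row, shape (2,)
centered = a - col_means     # broadcasting: (2, 3) minus (3,)
values = np.unique(np.array([3, 1, 3, 2, 1]))  # sorted distinct values

print(col_means)  # [2.5 3.5 4.5]
print(row_max)    # [3 6]
print(centered)
print(values)     # [1 2 3]
```

The `axis` argument selects the dimension that is collapsed: `axis=0` reduces over rows and leaves one value per column.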
  • 24. Indexing numpy arrays
    >>> a = np.array([[1, 2, 3], [4, 5, 6]])
    >>> a[1, 2]
    6
    >>> a[1]
    array([4, 5, 6])
    >>> a[:, 2]
    array([3, 6])
    >>> a[:, 1:3]
    array([[2, 3],
           [5, 6]])
    >>> b = a > 2
    >>> b
    array([[False, False,  True],
           [ True,  True,  True]], dtype=bool)
    >>> a[b]
    array([3, 4, 5, 6])
  • 25. Reference and documentation
    - NumPy User Guide: http://docs.scipy.org/doc/numpy/user/
    - NumPy Reference: http://docs.scipy.org/doc/numpy/reference/
    - MATLAB to NumPy: http://wiki.scipy.org/NumPy_for_Matlab_Users
  • 27. scikit-learn: Machine Learning in Python
    - Simple and efficient tools for data mining and data analysis
    - Accessible to everybody, and reusable in various contexts
    - Built on NumPy, SciPy, and matplotlib
    - Open source, commercially usable (BSD license)
  • 28. Found a bug or need help?
    - Mailing list: scikit-learn-general@lists.sourceforge.net
    - Tag scikit-learn on Stack Overflow.
    How to install?
    - It's shipped with Anaconda.
    - http://scikit-learn.org/stable/install.html
  • 29. Digits classification task
    # Load some data
    from sklearn.datasets import load_digits
    digits = load_digits()
    X, y = digits.data, digits.target
    How can we build a system to classify images? What is the first step?
  • 30. Data exploration and visualization
    # Data visualization
    import matplotlib.pyplot as plt
    plt.gray()
    plt.matshow(digits.images[0])
    plt.show()
    What else can be done?
  • 31. Fit a supervised learning model
    from sklearn.svm import SVC
    clf = SVC()  # Instantiate a classifier
    # API The base object, implements a fit method to learn from data
    clf.fit(X, y)  # Fit a classifier with the learning samples
    # API Exploit the fitted model to make predictions
    clf.predict(X)
    # API Get a goodness of fit given data (X, y)
    clf.score(X, y)  # accuracy=1.
    What do you think about this score of 1.?
  • 32. Cross validation
    from sklearn.svm import SVC
    from sklearn.cross_validation import KFold
    scores = []
    for train, test in KFold(len(X), n_folds=5, shuffle=True):
        X_train, y_train = X[train], y[train]
        X_test, y_test = X[test], y[test]
        clf = SVC()
        clf.fit(X_train, y_train)
        scores.append(clf.score(X_test, y_test))
    print(np.mean(scores))  # 0.44... !
    What do you think about this score of 0.44?
    Tip: This could be simplified using the cross_val_score function.
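The tip on this slide can be sketched as below. This version assumes a more recent scikit-learn than the one used in the slides: the `sklearn.cross_validation` module was later removed, and `KFold` and `cross_val_score` now live in `sklearn.model_selection`, where `KFold` takes `n_splits` instead of the number of samples. The `random_state=0` value is an arbitrary choice made here for reproducibility.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import KFold, cross_val_score
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)

# Shuffled 5-fold split, as in the explicit loop on the slide.
cv = KFold(n_splits=5, shuffle=True, random_state=0)

# One call fits and scores a fresh clone of the estimator on each fold.
scores = cross_val_score(SVC(), X, y, cv=cv)
print(scores.mean())
```

The default value of `gamma` in `SVC` also changed in later releases, so the mean accuracy printed here is not expected to match the 0.44 quoted on the slide.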
  • 33. Hyper-parameter optimization
    from sklearn.svm import SVC
    from sklearn.cross_validation import cross_val_score
    parameters = np.linspace(0.0001, 0.01, num=10)
    scores = []
    for value in parameters:
        clf = SVC(gamma=value)
        s = cross_val_score(clf, X, y=y, cv=5)
        scores.append(np.mean(s, axis=0))
    print(np.max(scores))  # 0.97... !
    Tip: This could be simplified using the GridSearchCV meta-estimator.
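A minimal sketch of the GridSearchCV tip follows, again assuming a recent scikit-learn where `GridSearchCV` is imported from `sklearn.model_selection` (the slides predate that module). The grid reuses the same ten `gamma` values as the loop above.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)

# Same candidate values as the manual loop on the slide.
param_grid = {"gamma": np.linspace(0.0001, 0.01, num=10)}

# 5-fold cross-validation for every candidate, then a refit on all
# the data with the best one.
search = GridSearchCV(SVC(), param_grid, cv=5)
search.fit(X, y)

print(search.best_params_)
print(search.best_score_)
```

After fitting, `search` behaves like a classifier trained with the selected `gamma`, and `search.cv_results_["mean_test_score"]` holds the per-candidate scores that the next slide plots.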
  • 34. Visualizing hyper-parameter search
    import matplotlib.pyplot as plt
    plt.figure()
    plt.plot(parameters, scores)
    plt.xlabel("Gamma")
    plt.ylabel("Accuracy")
    plt.savefig("images/grid.png")
  • 35. Estimator cooking: transformer union and pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.pipeline import make_pipeline
    # API Transformer has a transform method
    clf = make_pipeline(StandardScaler(),
                        # More transformers here
                        SVC())
    from sklearn.pipeline import make_union
    from sklearn.preprocessing import PolynomialFeatures
    union_transformers = make_union(StandardScaler(),
                                    # More transformers here
                                    PolynomialFeatures())
    clf = make_pipeline(union_transformers, SVC())
  • 36. Model persistence
    from sklearn.externals import joblib
    # Save the model for later
    joblib.dump(clf, "model.joblib")
    # Load the model
    clf = joblib.load("model.joblib")
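The save and load round trip can be checked with a sketch like the one below. It assumes a recent environment where joblib is imported as a standalone package (`import joblib`), since `sklearn.externals.joblib` was removed in later scikit-learn releases. The temporary directory and the `gamma=0.001` setting are illustrative choices, not taken from the slides.

```python
import os
import tempfile

import joblib
from sklearn.datasets import load_digits
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)
clf = SVC(gamma=0.001).fit(X, y)

# Write the fitted model to a temporary file and read it back.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "model.joblib")
    joblib.dump(clf, path)
    restored = joblib.load(path)

# The restored model should predict exactly like the original one.
same_predictions = bool((restored.predict(X) == clf.predict(X)).all())
print(same_predictions)
```

A persisted model is generally only safe to reload with the same library versions that produced it, and files from untrusted sources should not be loaded.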
  • 37. Reference and documentation
    - User Guide: http://scikit-learn.org/stable/user_guide.html
    - Reference: http://scikit-learn.org/stable/modules/classes.html
    - Examples: http://scikit-learn.org/stable/auto_examples/index.html