Introduction to
Python for Scientific Computing
João Machado • Ricardo Cruz
Introduction

What is the result of this operation?

\[
\begin{pmatrix} a & d & g \\ b & e & h \\ c & f & i \end{pmatrix}
\times
\begin{pmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{pmatrix}
=
\begin{pmatrix} ? & ? & ? \\ ? & ? & ? \\ ? & ? & ? \end{pmatrix}
\]
Multiplying by this permutation matrix reverses the order of the columns:

\[
\begin{pmatrix} a & d & g \\ b & e & h \\ c & f & i \end{pmatrix}
\times
\begin{pmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{pmatrix}
=
\begin{pmatrix} g & d & a \\ h & e & b \\ i & f & c \end{pmatrix}
\]
    from numpy import *
    cout = print

    A = random.random((3, 3));
    B = fliplr(eye(3));
    C = dot(A, B);
    cout(C);

What programming language is this?
It’s Python!
    #include <armadillo>
    using namespace arma;
    using namespace std;

    mat A(3,3), B(3,3);
    A.randu();
    B = fliplr(B.eye());
    mat C = A * B;
    cout << C << endl;

What about this programming language?
Why use Python?

- More important than the programming language is the ecosystem – and Python has a great scientific community
- Python has good interoperability with other systems
- The entire stack can be developed in Python: machine learning, Flask, etc.
- Computations do not actually run in Python; the slow stuff is implemented in Fortran and C
[Figure: ecosystem comparison]
- Python: numpy, matplotlib, sklearn, pandas
- R: ggplot2, rpart, foreign, dplyr, survival, ggmaps, zoo
- MATLAB: Statistics Toolbox, Biostatistics Toolbox, Neural Network Toolbox
Why Python?

- Good data mining ecosystem.
- Not as centralized/monopolistic as MATLAB's
- Not as decentralized and messy as R :P
[Figure: KDnuggets poll of the most popular languages for machine learning and data science]
Source: http://www.kdnuggets.com/2017/01/most-popular-language-machine-learning-data-science.html
Some notes on Numpy
Numpy Notes

Let A and B be matrices:

    Python/Numpy    MATLAB    R
    A.dot(B)        A * B     A %*% B
    A * B           A .* B    A * B

Operations are elementwise by default (like in R).
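A quick sketch of that difference, on a concrete pair of matrices:

    import numpy as np
    A = np.array([[1, 2], [3, 4]])
    B = np.array([[0, 1], [1, 0]])
    print(A * B)     # elementwise product
    print(A.dot(B))  # matrix product (equivalently A @ B in Python 3.5+)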
    Python/Numpy                 MATLAB           R
    A.shape                      size(A)          length, nrow, ncol
    A[0:4,:] or A[0:4] or A[:4]  A(1:4,:)         A[1:4,]
    A[0:10:2]                    A(1:2:10,:)      A[seq(1, 9, 2),]
    A[-4:]                       A(end-3:end,:)   A[(nrow(A)-3):nrow(A),]
    A.T                          A.'              t(A)

Numpy in general allows for more succinct writing.

Furthermore:
- Indexing starts at zero.
- Intervals are half-open: [i, j[
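A quick sketch of those indexing rules on a toy matrix:

    import numpy as np
    A = np.arange(12).reshape(6, 2)  # a 6x2 matrix
    print(A[:4])      # first four rows
    print(A[0:10:2])  # every other row (a stop past the end is fine)
    print(A[-4:])     # last four rows
    print(A.T.shape)  # (2, 6)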
This is further aided by the fact that Numpy supports arithmetic broadcasting (unlike MATLAB or R). That is, you can do an elementwise multiplication between shapes (6,3) and (6,1): Numpy automatically assumes you want to multiply by column. In MATLAB, you would have to use bsxfun(@times,r,A) or first use repmat().
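A minimal sketch of that broadcast:

    import numpy as np
    A = np.random.random((6, 3))  # matrix
    r = np.random.random((6, 1))  # column vector
    B = A * r                     # r is broadcast across the 3 columns
    print(B.shape)                # (6, 3)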
Something like the following is valid in Numpy...

    import skimage.data
    img1 = skimage.data.astronaut()
    img2 = skimage.data.moon()
    print(img1.shape)  # (512, 512, 3)
    print(img2.shape)  # (512, 512)

    import matplotlib.pyplot as plt
    plt.subplot(1, 2, 1)
    plt.imshow(img1)
    plt.subplot(1, 2, 2)
    plt.imshow(img2, cmap='gray')
    plt.show()
Arithmetic mean

    import numpy as np
    img2 = img2[:, :, np.newaxis]  # (512, 512, 1), broadcasts against (512, 512, 3)
    img1 = img1.astype(np.uint32)  # cast up so the sum does not overflow uint8
    img2 = img2.astype(np.uint32)
    img3 = (img1 + img2)//2
    img3 = img3.astype(np.uint8)
    plt.imshow(img3)
    plt.show()
Geometric mean

    img2 = img2[:, :, np.newaxis]
    img1 = img1.astype(np.uint32)
    img2 = img2.astype(np.uint32)
    img3 = np.sqrt(img1 * img2)
    img3 = img3.astype(np.uint8)
    plt.imshow(img3)
    plt.show()
Pandas and Data Visualization –
Python for Scientific Computing
João Machado • Ricardo Cruz
Pandas

What is Pandas?

- A package for data manipulation and analysis, based on the concept of the data frame in the R language
- Optimized for performance, with critical code paths written in C
- Originally developed by Wes McKinney, while working for AQR Capital (a quantitative finance firm)
Given the previous point, it makes sense to demonstrate some of the functionalities of Pandas with a dataset comprised of financial stocks :)
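A minimal sketch of the kind of manipulation this involves, assuming a hypothetical stocks.csv with Date, Ticker and Close columns:

    import pandas as pd

    df = pd.read_csv('stocks.csv', parse_dates=['Date'])
    prices = df.pivot(index='Date', columns='Ticker', values='Close')
    returns = prices.pct_change()  # daily returns per stock
    print(prices.describe())       # summary statistics
    print(returns.corr())          # correlation between stocks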
Data Mining –
Python for Scientific Computing
João Machado • Ricardo Cruz
Models
Let us produce fake data...

\[ y(x) = 2x + 10 + \varepsilon_1 + \varepsilon_2 \]
\[ \varepsilon_1 \sim N(0, 2) \]
\[ \varepsilon_2 \sim \begin{cases} |N(0, 25)| & \text{with } p = 0.1, \\ 0 & \text{otherwise.} \end{cases} \]
Or, in a form closer to the code below:

\[ y(x) = 2x + 10 + \varepsilon_1 + b\,\varepsilon_2 \]
\[ \varepsilon_1 \sim N(0, 2), \quad b \sim B(2, 0.1), \quad \varepsilon_2 \sim |N(0, 25)| \]
Translation to numpy:

    import numpy as np
    N = 50
    x = np.linspace(0, 25, N)
    y = 2*x + 10
    y += np.random.randn(N)*2
    y += np.random.binomial(2, 0.10, N) * np.abs(np.random.randn(N)*25)
Plotting the data:

    import matplotlib.pyplot as plt
    plt.plot(x, y)
    plt.title('Data')
    plt.show()
What model could we create to explain this data?
Linear Regression

Model: \( \hat{y} = \beta_0 + \beta_1 x \)
Minimize: \( \sum_i (y_i - \hat{y}_i)^2 \)
    from sklearn.linear_model import LinearRegression
    m = LinearRegression()
    m.fit(x[:, np.newaxis], y)
    yp = m.predict(x[:, np.newaxis])

    plt.plot(x, y)
    plt.plot(x, yp)
    plt.title('Linear regression')
    plt.text(0, 70, 'm=%.1f b=%.1f' % (m.coef_[0], m.intercept_))
    plt.show()
True model: \( y(x) = 2x + 10 + \varepsilon_1 + b\,\varepsilon_2 \)
Fitted model: \( \hat{y}(x) = 2x + 18 \)

What if I want to explain only the trend? How can I avoid the impact of these spikes?
What would a statistician do?

    res = yp - y
    plt.boxplot(res)
    plt.show()
Keep only the points whose residuals fall between the quartiles, and refit:

    q1 = np.percentile(res, 25)
    q3 = np.percentile(res, 75)
    t = np.logical_and(res > q1, res < q3)
    x2 = x[t]
    y2 = y[t]

    m = LinearRegression()
    m.fit(x2[:, np.newaxis], y2)
    yp = m.predict(x[:, np.newaxis])
Approach #2: What would a statistician with some computer science knowledge do?
Model: \( \hat{y} = \beta_0 + \beta_1 x \)
Minimize: \( \sum_i |y_i - \hat{y}_i| \)
This is quantile regression (here, the median):

    from statsmodels.regression.quantile_regression import QuantReg

    m = QuantReg(y, np.c_[np.ones(N), x])
    m = m.fit(0.5)
    yp = m.predict()
Approach #3: What would a crazy computer scientist do?
Fit the model on many small random subsamples and see where the fits agree:

    plt.plot(x, y)
    m = LinearRegression()  # refit the sklearn model on each subsample
    for it in range(10):
        t = np.random.choice(N, N//10, replace=False)
        x2 = x[t]
        y2 = y[t]
        m.fit(x2[:, np.newaxis], y2)
        yp = m.predict(x[:, np.newaxis])
        plt.plot(x, yp, color='black', alpha=0.4)
    plt.show()
Sklearn already comes with this crazy model too:

    from sklearn.linear_model import RANSACRegressor
    m = RANSACRegressor()
    m.fit(x[:, np.newaxis], y)

    plt.plot(x, y)
    plt.plot(x, m.predict(x[:, np.newaxis]))
    plt.title('RANSAC')
    plt.show()
What kind of things can we use data mining / machine learning for?
Data Mining Problems

Regression: predict a continuous variable
e.g. House Price = 100 + 20 × Land Size
In scikit-learn: LinearRegression, GradientBoostingRegressor, etc. (:: RegressorMixin)
    .fit(X, y)
    .predict(X) -> yp
Classification: predict a discrete variable
e.g. House Price = Expensive if in the city center, Cheap if outside the city
In scikit-learn: LogisticRegression, GradientBoostingClassifier, etc. (:: ClassifierMixin)
    .fit(X, y)
    .predict(X) -> yp
Clustering: do not predict, aggregate
In scikit-learn: KMeans, LatentDirichletAllocation, etc. (:: ClusterMixin)
    .fit(X)
    .transform(X) -> X'
    .fit_transform(X) -> X'
Reinforcement learning: predict the best move
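A minimal sketch of the shared scikit-learn API across these mixins, on toy data (the shapes and threshold here are just for illustration):

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.cluster import KMeans

    X = np.random.random((100, 2))
    y = (X[:, 0] > 0.5).astype(int)

    clf = LogisticRegression()
    clf.fit(X, y)            # ClassifierMixin: .fit(X, y)
    yp = clf.predict(X)      # .predict(X) -> yp

    km = KMeans(3)
    km.fit(X)                # ClusterMixin: .fit(X)
    X2 = km.transform(X)     # .transform(X) -> X' (distances to the 3 centroids)
    print(yp[:5], X2.shape)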
Use Cases
João Machado • Ricardo Cruz
Signal processing

Packages: numpy, pandas, scipy, matplotlib
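A minimal sketch of the kind of pipeline these packages support, on a made-up noisy signal:

    import numpy as np
    from scipy import signal
    import matplotlib.pyplot as plt

    t = np.linspace(0, 1, 500)
    x = np.sin(2*np.pi*5*t) + 0.5*np.random.randn(len(t))  # noisy 5 Hz sine
    b, a = signal.butter(4, 0.05)  # low-pass Butterworth filter
    xf = signal.filtfilt(b, a, x)  # zero-phase filtering
    plt.plot(t, x, alpha=0.4)
    plt.plot(t, xf)
    plt.show()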
Text Mining w/ Twitter

Packages: tweepy, numpy, matplotlib, scikit-learn
Text Mining

    import tweepy
    auth = tweepy.OAuthHandler(api_key, api_secret)
    auth.set_access_token(access_token, access_secret)
    api = tweepy.API(auth)

    timeline = api.user_timeline('realDonaldTrump', count=100)
    texts = [tweet.text for tweet in timeline]
    from sklearn.feature_extraction.text import CountVectorizer
    m = CountVectorizer(stop_words='english', min_df=5, max_df=16)
    X = m.fit_transform(texts)
    words = sorted(m.vocabulary_, key=m.vocabulary_.get)

    import pandas as pd
    print(pd.DataFrame(X.todense(), columns=words).iloc[:5, :5].to_latex())

       america  big  comey  day  dems
    0        0    0      0    0     0
    1        0    1      0    0     0
    2        1    0      0    0     0
    import matplotlib.pyplot as plt
    counts = np.asarray(X.sum(0))[0]
    plt.barh(range(len(counts)), counts)
    plt.xticks(range(0, 14, 2))
    plt.yticks(range(len(counts)), words)
    plt.show()
    from sklearn.decomposition import LatentDirichletAllocation
    lda = LatentDirichletAllocation(2, learning_method='online')
    lda.fit(X)
    topics = lda.components_

Each topic is a linear combination of the words:

\[ \text{newword}_1 = \beta_{11}\,\text{word}_1 + \beta_{12}\,\text{word}_2 + \dots \]
\[ \text{newword}_2 = \beta_{21}\,\text{word}_1 + \beta_{22}\,\text{word}_2 + \dots \]
    topics = topics / topics.max(1)[:, np.newaxis]
    topics += np.random.randn(*topics.shape)*0.02  # jitter so labels don't overlap
    for i, word in enumerate(words):
        plt.text(topics[0, i], topics[1, i], word, ha='center')
    plt.show()
The same pipeline can be rerun on another account:

    timeline = api.user_timeline('marcelorebelo_', count=100)
Traditional Learning vs Deep Learning

Traditionally, hand-crafted features would be extracted from the dataset and learning would happen on top of those features. Deep learning learns from the raw data.

Packages: scikit-image, numpy, keras
Traditional Learning

Cats vs Dogs – Kaggle Competition – https://www.kaggle.com/c/dogs-vs-cats
25,000 images of cats and dogs
Feature #1: Extract histogram of colors

    import os
    import numpy as np
    from skimage.io import imread
    from skimage.color import rgb2gray  # rgb2gray lives in skimage.color

    for filename in os.listdir('train'):
        im = imread(os.path.join('train', filename))
        im = rgb2gray(im)
        f1 = np.histogram(im.flatten(), 10)[0]
        f1 = (f1/f1.sum()).cumsum()
Feature #2: Histogram of Oriented Gradients

    from skimage.transform import resize
    from skimage.feature import hog

    im2 = resize(im, (32, 32), mode='reflect')
    im2 = np.sqrt(im2)  # gamma compression before HOG
    f2 = hog(im2, block_norm='L2-Hys')
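The slides jump from the per-image features straight to the fitted models, so some glue is implied; a minimal sketch of assembling X and y (reusing the imports above, and assuming the Kaggle filenames 'cat.0.jpg', 'dog.0.jpg', ...):

    features, labels = [], []
    for filename in os.listdir('train'):
        im = rgb2gray(imread(os.path.join('train', filename)))
        f1 = np.histogram(im.flatten(), 10)[0]
        f1 = (f1 / f1.sum()).cumsum()
        im2 = np.sqrt(resize(im, (32, 32), mode='reflect'))
        f2 = hog(im2, block_norm='L2-Hys')
        features.append(np.concatenate([f1, f2]))
        labels.append(1 if filename.startswith('dog') else 0)
    X = np.array(features)
    y = np.array(labels)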
Fit a small decision tree on those features:

    from sklearn.tree import DecisionTreeClassifier, export_graphviz
    m = DecisionTreeClassifier(max_depth=3)
    m.fit(X, y)
    from sklearn.model_selection import cross_val_score
    from sklearn.ensemble import RandomForestClassifier
    print(cross_val_score(RandomForestClassifier(100), X, y))

    [ 0.69642429  0.70086393  0.69851176]
Deep Learning
Linear regression:
\[ \hat{y} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots \]

Multilayer perceptron / neural network:
\[ \hat{y} = \beta_{00}\,\sigma(\beta_{10} + \beta_{11} x_1 + \beta_{12} x_2 + \dots) + \beta_{01}\,\sigma(\beta_{20} + \beta_{21} x_1 + \beta_{22} x_2 + \dots) + \dots \]
In Keras:

    from keras.models import Sequential
    from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
    from keras.optimizers import SGD

    model = Sequential()
    model.add(Conv2D(8, 3, 1, activation='relu', input_shape=(32, 32, 1)))
    model.add(MaxPooling2D())
    model.add(Conv2D(16, 3, 1, activation='relu'))
    model.add(MaxPooling2D())
    model.add(Flatten())
    model.add(Dense(16, activation='relu'))
    model.add(Dense(8, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))

    sgd = SGD()
    model.compile(sgd, 'binary_crossentropy')

    model.fit(X[tr], y[tr], validation_data=(X[ts], y[ts]),
              epochs=10, batch_size=100)
Cross-validating the network:

    from sklearn.model_selection import StratifiedKFold
    from sklearn.metrics import accuracy_score

    for tr, ts in StratifiedKFold().split(X, y):
        model = ...  # build, compile and fit the model as above
        yp = (model.predict(X[ts])[:, -1] > 0.5).astype(int)
        print(accuracy_score(y[ts], yp))

    [0.57, 0.57, 0.63]
Overview of the Python deep learning landscape:

- Backends: Theano, TensorFlow, PyTorch
- High-level wrappers: Keras, Lasagne
Deep learning architectures:

- Fully connected perceptrons
- Convolutional neural networks
- Recurrent neural networks
- Neural Turing Machines
- Autoencoders
Conclusions –
Python for Scientific Computing
João Machado • Ricardo Cruz
Conclusions

Packages to know:

- Numpy: basic linear algebra
- Scipy: extensions to numpy (sparse matrices, pdfs, hypothesis tests)
- Statsmodels: several statistics models, incl. time series
- Pandas: extension to numpy for dataframe support
- Matplotlib, seaborn: drawing graphics
- scikit-learn: complete machine learning toolkit
- xgboost: famous gradient boosting model
- Keras: deep learning (and TensorFlow, Theano, Lasagne)
- OpenCV, scikit-image: image processing
- NLTK: natural language toolkit
- Gensim: natural language models
Final remarks

- Python is a "jack of all trades" type of language;
- Its speed and ease of development make it really apt for scientific computing;
- It is increasingly adopted by scientists and engineers, thanks to the third-party scientific libraries contributed by a large community;
- It has become a de facto language for advances in some fields, such as Deep Learning.
About us
João Machado
machadojpf@gmail.com
Fraunhofer Portugal research engineer
Masters in Electrical and Computer Engineering
http://www.linkedin.com/in/machadojpf
Ricardo Cruz
rpcruz@inesctec.pt
INESC TEC researcher
Computer Science & Applied Mathematics graduate
https://rpmcruz.github.io/
Subscribe to the workshops:
http://tinyurl.com/cruz-workshops

Python for Scientific Computing -- Ricardo Cruz

  • 1.
    Introduction into Python forScientific Computing Jo˜ao Machado • Ricardo Cruz
  • 2.
    Introduction a d g be h c f i × 0 0 1 0 1 0 1 0 0 = ? ? ? ? ? ? ? ? ? What is the result of this operation?
  • 3.
    Introduction a d g be h c f i × 0 0 1 0 1 0 1 0 0 = ? ? ? ? ? ? ? ? ? What is the result of this operation? a d g b e h c f i × 0 0 1 0 1 0 1 0 0 = g d a h e b i f c
  • 4.
    Introduction a d g be h c f i × 0 0 1 0 1 0 1 0 0 = ? ? ? ? ? ? ? ? ? What is the result of this operation? a d g b e h c f i × 0 0 1 0 1 0 1 0 0 = g d a h e b i f c 1 from numpy import * 2 cout = p r i n t 3 4 A = random.random ((3, 3)); 5 B = fliplr(eye (3)); 6 C = dot(A, B); 7 cout(C); What programming language is this?
  • 5.
    Introduction a d g be h c f i × 0 0 1 0 1 0 1 0 0 = ? ? ? ? ? ? ? ? ? What is the result of this operation? a d g b e h c f i × 0 0 1 0 1 0 1 0 0 = g d a h e b i f c 1 from numpy import * 2 cout = p r i n t 3 4 A = random.random ((3, 3)); 5 B = fliplr(eye (3)); 6 C = dot(A, B); 7 cout(C); It’s Python! 1 from numpy import * 2 cout = p r i n t 3 4 A = random.random ((3, 3)); 5 B = fliplr(eye (3)); 6 C = dot(A, B); 7 cout(C); What programming language is this?
  • 6.
    Introduction 1 #i nc l u d e <armadillo > 2 using namespace arma; 3 using namespace std; 4 5 mat A(3 ,3), B(3 ,3); 6 A.randu (); 7 B = fliplr(B.eye ()); 8 M3 = M1 * M2; 9 cout << M3 << endl; What about this programming language? a d g b e h c f i × 0 0 1 0 1 0 1 0 0 = g d a h e b i f c 1 from numpy import * 2 cout = p r i n t 3 4 A = random.random ((3, 3)); 5 B = fliplr(eye (3)); 6 C = dot(A, B); 7 cout(C); It’s Python! 1 from numpy import * 2 cout = p r i n t 3 4 A = random.random ((3, 3)); 5 B = fliplr(eye (3)); 6 C = dot(A, B); 7 cout(C); What programming language is this?
  • 7.
    Introduction 1 #i nc l u d e <armadillo > 2 using namespace arma; 3 using namespace std; 4 5 mat A(3 ,3), B(3 ,3); 6 A.randu (); 7 B = fliplr(B.eye ()); 8 M3 = M1 * M2; 9 cout << M3 << endl; What about this programming language? Why use Python? More important than the programming language is the ecosystem – and Python has a great scientific community Python has good interoperability with other systems The entire stack can be developed in Python: machine learning, flask, etc Computations do not run in Python; the slow stuff is implemented in Fortran and C 1 from numpy import * 2 cout = p r i n t 3 4 A = random.random ((3, 3)); 5 B = fliplr(eye (3)); 6 C = dot(A, B); 7 cout(C); It’s Python! 1 from numpy import * 2 cout = p r i n t 3 4 A = random.random ((3, 3)); 5 B = fliplr(eye (3)); 6 C = dot(A, B); 7 cout(C); What programming language is this?
  • 8.
  • 9.
    Why Python? Good datamining ecosystem. Not as centralized/monopolistic as Matlab’s Not as decentralized and messy as R :P
  • 10.
  • 11.
  • 12.
    Numpy Notes Let Aand B be matrices, Python/Numpy MATLAB R A.dot(B) A * B A %*% B A * B A .* B A * B Operations are elementwise by default (like R)
  • 13.
    Numpy Notes Let Aand B be matrices, Python/Numpy MATLAB R A.dot(B) A * B A %*% B A * B A .* B A * B Operations are elementwise by default (like R) Python/Numpy MATLAB R A.shape size(A) length, nrow, ncol A[0:4,:] or A[0:4] or A[:4] A(1:4,:) A[1:4,] A[0:10:2] A[seq(0, 9, 2)] A[-4:] A(end-4:end,:) A[nrow(A)-4:nrow(A),] A.T A.’ t(A) Numpy in general allows for more succinct writing. Furthermore: Indexing starts at zero. Intervals are of the form [i, j[
  • 14.
    Numpy Notes Let Aand B be matrices, Python/Numpy MATLAB R A.dot(B) A * B A %*% B A * B A .* B A * B Operations are elementwise by default (like R) Python/Numpy MATLAB R A.shape size(A) length, nrow, ncol A[0:4,:] or A[0:4] or A[:4] A(1:4,:) A[1:4,] A[0:10:2] A[seq(0, 9, 2)] A[-4:] A(end-4:end,:) A[nrow(A)-4:nrow(A),] A.T A.’ t(A) Numpy in general allows for more succinct writing. Furthermore: Indexing starts at zero. Intervals are of the form [i, j[ This is further aided by the fact that Numpy supports arithmetic broadcasting. (unlike MATLAB or R.) That is, you can do the following element- wise multiplication: (6,3) * (6,1). It auto- matically assumes you want to multiply by column. In MATLAB, you would have to use bsxfun(@times,r,A) or first use repmat().
  • 15.
    Numpy Notes Let Aand B be matrices, Python/Numpy MATLAB R A.dot(B) A * B A %*% B A * B A .* B A * B Operations are elementwise by default (like R) Python/Numpy MATLAB R A.shape size(A) length, nrow, ncol A[0:4,:] or A[0:4] or A[:4] A(1:4,:) A[1:4,] A[0:10:2] A[seq(0, 9, 2)] A[-4:] A(end-4:end,:) A[nrow(A)-4:nrow(A),] A.T A.’ t(A) Numpy in general allows for more succinct writing. Furthermore: Indexing starts at zero. Intervals are of the form [i, j[ Something like the following is valid in Numpy... 1 import skimage.data 2 img1 = skimage.data.astronaut () 3 img2 = skimage.data.moon () 4 p r i n t (img1.shape) # (512 , 512 , 3) 5 p r i n t (img2.shape) # (512 , 512) 6 7 import matplotlib.pyplot as plt 8 plt.subplot (1, 2, 1) 9 plt.imshow(img1) 10 plt.subplot (1, 2, 2) 11 plt.imshow(img2 , cmap=’gray ’) 12 plt.show () This is further aided by the fact that Numpy supports arithmetic broadcasting. (unlike MATLAB or R.) That is, you can do the following element- wise multiplication: (6,3) * (6,1). It auto- matically assumes you want to multiply by column. In MATLAB, you would have to use bsxfun(@times,r,A) or first use repmat().
  • 16.
    Numpy Notes Python/Numpy MATLABR A.shape size(A) length, nrow, ncol A[0:4,:] or A[0:4] or A[:4] A(1:4,:) A[1:4,] A[0:10:2] A[seq(0, 9, 2)] A[-4:] A(end-4:end,:) A[nrow(A)-4:nrow(A),] A.T A.’ t(A) Numpy in general allows for more succinct writing. Furthermore: Indexing starts at zero. Intervals are of the form [i, j[ Something like the following is valid in Numpy... 1 import skimage.data 2 img1 = skimage.data.astronaut () 3 img2 = skimage.data.moon () 4 p r i n t (img1.shape) # (512 , 512 , 3) 5 p r i n t (img2.shape) # (512 , 512) 6 7 import matplotlib.pyplot as plt 8 plt.subplot (1, 2, 1) 9 plt.imshow(img1) 10 plt.subplot (1, 2, 2) 11 plt.imshow(img2 , cmap=’gray ’) 12 plt.show () This is further aided by the fact that Numpy supports arithmetic broadcasting. (unlike MATLAB or R.) That is, you can do the following element- wise multiplication: (6,3) * (6,1). It auto- matically assumes you want to multiply by column. In MATLAB, you would have to use bsxfun(@times,r,A) or first use repmat().
  • 17.
    Numpy Notes Arithmetic mean 1img2 = img2[:, :, np.newaxis] #(512 ,512 ,1) 2 img1 = img1.astype(np.uint32) 3 img2 = img2.astype(np.uint32) 4 img3 = (img1 + img2)//2 5 img3 = img3.astype(np.uint8) 6 plt.imshow(img3) 7 plt.show () Something like the following is valid in Numpy... 1 import skimage.data 2 img1 = skimage.data.astronaut () 3 img2 = skimage.data.moon () 4 p r i n t (img1.shape) # (512 , 512 , 3) 5 p r i n t (img2.shape) # (512 , 512) 6 7 import matplotlib.pyplot as plt 8 plt.subplot (1, 2, 1) 9 plt.imshow(img1) 10 plt.subplot (1, 2, 2) 11 plt.imshow(img2 , cmap=’gray ’) 12 plt.show () This is further aided by the fact that Numpy supports arithmetic broadcasting. (unlike MATLAB or R.) That is, you can do the following element- wise multiplication: (6,3) * (6,1). It auto- matically assumes you want to multiply by column. In MATLAB, you would have to use bsxfun(@times,r,A) or first use repmat().
  • 18.
    Numpy Notes Arithmetic mean 1img2 = img2[:, :, np.newaxis] #(512 ,512 ,1) 2 img1 = img1.astype(np.uint32) 3 img2 = img2.astype(np.uint32) 4 img3 = (img1 + img2)//2 5 img3 = img3.astype(np.uint8) 6 plt.imshow(img3) 7 plt.show () Something like the following is valid in Numpy... 1 import skimage.data 2 img1 = skimage.data.astronaut () 3 img2 = skimage.data.moon () 4 p r i n t (img1.shape) # (512 , 512 , 3) 5 p r i n t (img2.shape) # (512 , 512) 6 7 import matplotlib.pyplot as plt 8 plt.subplot (1, 2, 1) 9 plt.imshow(img1) 10 plt.subplot (1, 2, 2) 11 plt.imshow(img2 , cmap=’gray ’) 12 plt.show ()
  • 19.
    Numpy Notes Arithmetic mean 1img2 = img2[:, :, np.newaxis] #(512 ,512 ,1) 2 img1 = img1.astype(np.uint32) 3 img2 = img2.astype(np.uint32) 4 img3 = (img1 + img2)//2 5 img3 = img3.astype(np.uint8) 6 plt.imshow(img3) 7 plt.show () Geometric mean 1 img2 = img2[:, :, np.newaxis] 2 img1 = img1.astype(np.uint32) 3 img2 = img2.astype(np.uint32) 4 img3 = np.sqrt(img1 * img2) 5 img3 = img3.astype(np.uint8) 6 plt.imshow(img3) 7 plt.show ()
  • 20.
    Numpy Notes Arithmetic mean 1img2 = img2[:, :, np.newaxis] #(512 ,512 ,1) 2 img1 = img1.astype(np.uint32) 3 img2 = img2.astype(np.uint32) 4 img3 = (img1 + img2)//2 5 img3 = img3.astype(np.uint8) 6 plt.imshow(img3) 7 plt.show () Geometric mean 1 img2 = img2[:, :, np.newaxis] 2 img1 = img1.astype(np.uint32) 3 img2 = img2.astype(np.uint32) 4 img3 = np.sqrt(img1 * img2) 5 img3 = img3.astype(np.uint8) 6 plt.imshow(img3) 7 plt.show ()
  • 21.
    Pandas and DataVisualization – Python for Scientific Computing Jo˜ao Machado • Ricardo Cruz
  • 22.
    Pandas What is Pandas? Apackage for data manipulation and analysis, based on the concept of data frame in the R language Optimized for performance, with critical code paths written in C Originally developed by Wes McKinney, while working for AQR Capital (a quantitative finance firm)
  • 23.
    Pandas What is Pandas? Apackage for data manipulation and analysis, based on the concept of data frame in the R language Optimized for performance, with critical code paths written in C Originally developed by Wes McKinney, while working for AQR Capital (a quantitative finance firm) Given the previous point, it makes sense to demonstrate some of the functionalities of Pandas with a dataset comprised of financial stocks :)
  • 24.
    Data Mining – Pythonfor Scientific Computing Jo˜ao Machado • Ricardo Cruz
  • 25.
  • 26.
    Models Let us producefake data... y(x) = 2x + 10 + ε1 + ε2 ε1 ∼ N(0, 2) ε2 ∼ |N(0, 25)| with p = 0.1, 0 otherwise.
  • 27.
    Models Let us producefake data... y(x) = 2x + 10 + ε1 + ε2 ε1 ∼ N(0, 2) ε2 ∼ |N(0, 25)| with p = 0.1, 0 otherwise. Let us produce fake data... y(x) = 2x + 10 + ε1 + bε2 ε1 ∼ N(0, 2) b ∼ B(2, 0.1) ε2 ∼ |N(0, 25)|
  • 28.
    Models Let us producefake data... y(x) = 2x + 10 + ε1 + ε2 ε1 ∼ N(0, 2) ε2 ∼ |N(0, 25)| with p = 0.1, 0 otherwise. Translation to numpy: 1 import numpy as np 2 N = 50 3 x = np.linspace (0, 25, N) 4 y = 2*x + 10 5 y += np.random.randn(N)*2 6 y += np.random.binomial (2, 0.10 , N)*np. abs (np.random.randn(N)*25) Let us produce fake data... y(x) = 2x + 10 + ε1 + bε2 ε1 ∼ N(0, 2) b ∼ B(2, 0.1) ε2 ∼ |N(0, 25)|
  • 29.
    Models 1 import matplotlib.pyplotas plt 2 plt.plot(x, y) 3 plt.title(’Data ’) 4 plt.show () Let us produce fake data... y(x) = 2x + 10 + ε1 + ε2 ε1 ∼ N(0, 2) ε2 ∼ |N(0, 25)| with p = 0.1, 0 otherwise. Translation to numpy: 1 import numpy as np 2 N = 50 3 x = np.linspace (0, 25, N) 4 y = 2*x + 10 5 y += np.random.randn(N)*2 6 y += np.random.binomial (2, 0.10 , N)*np. abs (np.random.randn(N)*25) Let us produce fake data... y(x) = 2x + 10 + ε1 + bε2 ε1 ∼ N(0, 2) b ∼ B(2, 0.1) ε2 ∼ |N(0, 25)|
  • 30.
    Models 1 import matplotlib.pyplotas plt 2 plt.plot(x, y) 3 plt.title(’Data ’) 4 plt.show () What model could we create to explain this data? Translation to numpy: 1 import numpy as np 2 N = 50 3 x = np.linspace (0, 25, N) 4 y = 2*x + 10 5 y += np.random.randn(N)*2 6 y += np.random.binomial (2, 0.10 , N)*np. abs (np.random.randn(N)*25) Let us produce fake data... y(x) = 2x + 10 + ε1 + bε2 ε1 ∼ N(0, 2) b ∼ B(2, 0.1) ε2 ∼ |N(0, 25)|
  • 31.
    Models 1 import matplotlib.pyplotas plt 2 plt.plot(x, y) 3 plt.title(’Data ’) 4 plt.show () What model could we create to explain this data? Translation to numpy: 1 import numpy as np 2 N = 50 3 x = np.linspace (0, 25, N) 4 y = 2*x + 10 5 y += np.random.randn(N)*2 6 y += np.random.binomial (2, 0.10 , N)*np. abs (np.random.randn(N)*25) Linear Regression Model: ˆy = β0 + β1x Minimize: i (yi − ˆyi )2
  • 32.
    Models 1 import matplotlib.pyplotas plt 2 plt.plot(x, y) 3 plt.title(’Data ’) 4 plt.show () What model could we create to explain this data? 1 from sklearn. linear_model import LinearRegression 2 m = LinearRegression () 3 m.fit(x[:, np.newaxis], y) 4 yp = m.predict(x[:, np.newaxis ]) 5 6 plt.plot(x, y) 7 plt.plot(x, yp) 8 plt.title(’Linear regression ’) 9 plt.text(0, 70, ’m=%.1f b=%.1f’ % (m.coef_ [0], m.intercept_)) 10 plt.show () Linear Regression Model: ˆy = β0 + β1x Minimize: i (yi − ˆyi )2
  • 33.
    Models What model couldwe create to explain this data? 1 from sklearn. linear_model import LinearRegression 2 m = LinearRegression () 3 m.fit(x[:, np.newaxis], y) 4 yp = m.predict(x[:, np.newaxis ]) 5 6 plt.plot(x, y) 7 plt.plot(x, yp) 8 plt.title(’Linear regression ’) 9 plt.text(0, 70, ’m=%.1f b=%.1f’ % (m.coef_ [0], m.intercept_)) 10 plt.show () Linear Regression Model: ˆy = β0 + β1x Minimize: i (yi − ˆyi )2
  • 34.
    Models y(x) = 2x+ 10 + ε1 + bε2 ˆy(x) = 2x + 18 What if I want to explain only the trend? How can I avoid the impact of these spikes? 1 from sklearn. linear_model import LinearRegression 2 m = LinearRegression () 3 m.fit(x[:, np.newaxis], y) 4 yp = m.predict(x[:, np.newaxis ]) 5 6 plt.plot(x, y) 7 plt.plot(x, yp) 8 plt.title(’Linear regression ’) 9 plt.text(0, 70, ’m=%.1f b=%.1f’ % (m.coef_ [0], m.intercept_)) 10 plt.show () Linear Regression Model: ˆy = β0 + β1x Minimize: i (yi − ˆyi )2
  • 35.
    Models y(x) = 2x+ 10 + ε1 + bε2 ˆy(x) = 2x + 18 What if I want to explain only the trend? How can I avoid the impact of these spikes? 1 from sklearn. linear_model import LinearRegression 2 m = LinearRegression () 3 m.fit(x[:, np.newaxis], y) 4 yp = m.predict(x[:, np.newaxis ]) 5 6 plt.plot(x, y) 7 plt.plot(x, yp) 8 plt.title(’Linear regression ’) 9 plt.text(0, 70, ’m=%.1f b=%.1f’ % (m.coef_ [0], m.intercept_)) 10 plt.show () What would a statistician do? 1 res = yp -y 2 plt.boxplot(res) 3 plt.show ()
  • 36.
    Models y(x) = 2x+ 10 + ε1 + bε2 ˆy(x) = 2x + 18 What if I want to explain only the trend? How can I avoid the impact of these spikes? 1 q1 = np.percentile(res , 25) 2 q3 = np.percentile(res , 75) 3 t = np.logical_and(res > q1 , res < q3) 4 x2 = x[t] 5 y2 = y[t] 6 7 m = LinearRegression () 8 m.fit(x2[:, np.newaxis], y2) 9 yp = m.predict(x[:, np.newaxis ]) What would a statistician do? 1 res = yp -y 2 plt.boxplot(res) 3 plt.show ()
  • 37.
    Models y(x) = 2x+ 10 + ε1 + bε2 ˆy(x) = 2x + 18 What if I want to explain only the trend? How can I avoid the impact of these spikes? 1 q1 = np.percentile(res , 25) 2 q3 = np.percentile(res , 75) 3 t = np.logical_and(res > q1 , res < q3) 4 x2 = x[t] 5 y2 = y[t] 6 7 m = LinearRegression () 8 m.fit(x2[:, np.newaxis], y2) 9 yp = m.predict(x[:, np.newaxis ]) What would a statistician do? 1 res = yp -y 2 plt.boxplot(res) 3 plt.show ()
  • 38.
    Models Approach #2: Whatwould a statistician with some computer science knowledge do? 1 q1 = np.percentile(res , 25) 2 q3 = np.percentile(res , 75) 3 t = np.logical_and(res > q1 , res < q3) 4 x2 = x[t] 5 y2 = y[t] 6 7 m = LinearRegression () 8 m.fit(x2[:, np.newaxis], y2) 9 yp = m.predict(x[:, np.newaxis ]) What would a statistician do? 1 res = yp -y 2 plt.boxplot(res) 3 plt.show ()
  • 39.
    Models Approach #2: Whatwould a statistician with some computer science knowledge do? 1 q1 = np.percentile(res , 25) 2 q3 = np.percentile(res , 75) 3 t = np.logical_and(res > q1 , res < q3) 4 x2 = x[t] 5 y2 = y[t] 6 7 m = LinearRegression () 8 m.fit(x2[:, np.newaxis], y2) 9 yp = m.predict(x[:, np.newaxis ]) Model: ˆy = β0 + β1x Minimize: i |yi − ˆyi |
  • 40.
    Models Approach #2: Whatwould a statistician with some computer science knowledge do? 1 from statsmodels.regression. quantile_regression import QuantReg 2 3 m = QuantReg(y, np.c_[np.ones(N), x]) 4 m = m.fit (0.5) 5 yp = m.predict () Model: ˆy = β0 + β1x Minimize: i |yi − ˆyi |
  • 41.
    Models Approach #2: Whatwould a statistician with some computer science knowledge do? 1 from statsmodels.regression. quantile_regression import QuantReg 2 3 m = QuantReg(y, np.c_[np.ones(N), x]) 4 m = m.fit (0.5) 5 yp = m.predict () Model: ˆy = β0 + β1x Minimize: i |yi − ˆyi |
  • 42.
    Models Approach #3: Whatwould a crazy com- puter scientist do? 1 from statsmodels.regression. quantile_regression import QuantReg 2 3 m = QuantReg(y, np.c_[np.ones(N), x]) 4 m = m.fit (0.5) 5 yp = m.predict () Model: ˆy = β0 + β1x Minimize: i |yi − ˆyi |
  • 43.
    Models Approach #3: Whatwould a crazy com- puter scientist do? 1 from statsmodels.regression. quantile_regression import QuantReg 2 3 m = QuantReg(y, np.c_[np.ones(N), x]) 4 m = m.fit (0.5) 5 yp = m.predict () 1 plt.plot(x, y) 2 f o r it i n range (10): 3 t = np.random.choice(N, N//10 , replace =False) 4 x2 = x[t] 5 y2 = y[t] 6 m.fit(x2[:, np.newaxis], y2) 7 yp = m.predict(x[:, np.newaxis ]) 8 plt.plot(x, yp , color=’black ’, alpha =0.4) 9 plt.show ()
  • 44.
    Models Approach #3: Whatwould a crazy com- puter scientist do? 1 plt.plot(x, y) 2 f o r it i n range (10): 3 t = np.random.choice(N, N//10 , replace =False) 4 x2 = x[t] 5 y2 = y[t] 6 m.fit(x2[:, np.newaxis], y2) 7 yp = m.predict(x[:, np.newaxis ]) 8 plt.plot(x, yp , color=’black ’, alpha =0.4) 9 plt.show ()
  • 45.
    Models Sklearn already comeswith this crazy model too: 1 from sklearn. linear_model import RANSACRegressor 2 m = RANSACRegressor () 3 m.fit(x[:, np.newaxis], y) 4 5 plt.plot(x, y) 6 plt.plot(x, m.predict(x[:, np.newaxis ])) 7 plt.title(’RANSAC ’) 8 plt.show () Approach #3: What would a crazy com- puter scientist do? 1 plt.plot(x, y) 2 f o r it i n range (10): 3 t = np.random.choice(N, N//10 , replace =False) 4 x2 = x[t] 5 y2 = y[t] 6 m.fit(x2[:, np.newaxis], y2) 7 yp = m.predict(x[:, np.newaxis ]) 8 plt.plot(x, yp , color=’black ’, alpha =0.4) 9 plt.show ()
  • 46.
    Models Sklearn already comeswith this crazy model too: 1 from sklearn. linear_model import RANSACRegressor 2 m = RANSACRegressor () 3 m.fit(x[:, np.newaxis], y) 4 5 plt.plot(x, y) 6 plt.plot(x, m.predict(x[:, np.newaxis ])) 7 plt.title(’RANSAC ’) 8 plt.show () 1 plt.plot(x, y) 2 f o r it i n range (10): 3 t = np.random.choice(N, N//10 , replace =False) 4 x2 = x[t] 5 y2 = y[t] 6 m.fit(x2[:, np.newaxis], y2) 7 yp = m.predict(x[:, np.newaxis ]) 8 plt.plot(x, yp , color=’black ’, alpha =0.4) 9 plt.show ()
  • 47.
    What kind ofthings can we use data mining / machine learning for?
  • 48.
    Data Mining Problems Regression:predict a continuous variable e.g. House Price = 100 + 20 × Land Size In scikit-learn, LinearRegression, Gradient- BoostingRegressor, etc (:: RegressorMixin) .fit(X, y) .predict(X) -> yp
  • 49.
    Data Mining Problems Regression:predict a continuous variable e.g. House Price = 100 + 20 × Land Size In scikit-learn, LinearRegression, Gradient- BoostingRegressor, etc (:: RegressorMixin) .fit(X, y) .predict(X) -> yp Classification: predict a discrete variable e.g. House Price = Expensive if in the city center Cheap if outside the city In scikit-learn, LogisticRegression, Gradient- BoostingClassifier, etc (:: ClassifierMixin) .fit(X, y) .predict(X) -> yp
  • 50.
    Data Mining Problems Regression:predict a continuous variable e.g. House Price = 100 + 20 × Land Size In scikit-learn, LinearRegression, Gradient- BoostingRegressor, etc (:: RegressorMixin) .fit(X, y) .predict(X) -> yp Classification: predict a discrete variable e.g. House Price = Expensive if in the city center Cheap if outside the city In scikit-learn, LogisticRegression, Gradient- BoostingClassifier, etc (:: ClassifierMixin) .fit(X, y) .predict(X) -> yp Clustering: not predict, aggregate In scikit-learn, KMeans, LatentDirichletAllo- cation, etc (:: ClusterMixin) .fit(X) .transform(X) -> X’ .fit transform(X) -> X’
  • 51.
    Data Mining Problems Regression:predict a continuous variable e.g. House Price = 100 + 20 × Land Size In scikit-learn, LinearRegression, Gradient- BoostingRegressor, etc (:: RegressorMixin) .fit(X, y) .predict(X) -> yp Classification: predict a discrete variable e.g. House Price = Expensive if in the city center Cheap if outside the city In scikit-learn, LogisticRegression, Gradient- BoostingClassifier, etc (:: ClassifierMixin) .fit(X, y) .predict(X) -> yp Re-inforcement learning: (predict best move) Clustering: not predict, aggregate In scikit-learn, KMeans, LatentDirichletAllo- cation, etc (:: ClusterMixin) .fit(X) .transform(X) -> X’ .fit transform(X) -> X’
  • 52.
    Use Cases Jo˜ao Machado• Ricardo Cruz
  • 53.
  • 54.
    Text Mining w/Twitter Packages: tweepy numpy matplotlib scikit-learn
  • 55.
Text Mining

    import tweepy

    auth = tweepy.OAuthHandler(api_key, api_secret)
    auth.set_access_token(access_token, access_secret)
    api = tweepy.API(auth)

    timeline = api.user_timeline('realDonaldTrump', count=100)
    texts = [tweet.text for tweet in timeline]

Turn the tweets into a document-term matrix:

    from sklearn.feature_extraction.text import CountVectorizer
    m = CountVectorizer(stop_words='english', min_df=5, max_df=16)
    X = m.fit_transform(texts)
    words = sorted(m.vocabulary_, key=m.vocabulary_.get)

    import pandas as pd
    # .ix is deprecated; .iloc does positional indexing
    print(pd.DataFrame(X.todense(), columns=words).iloc[:5, :5].to_latex())

Output (first rows of the document-term matrix):

       america  big  comey  day  dems
    0        0    0      0    0     0
    1        0    1      0    0     0
    2        1    0      0    0     0

Plot the word counts:

    import numpy as np
    import matplotlib.pyplot as plt
    counts = np.asarray(X.sum(0))[0]
    plt.barh(range(len(counts)), counts)
    plt.xticks(range(0, 14, 2))
    plt.yticks(range(len(counts)), words)
    plt.show()
Text Mining

    from sklearn.decomposition import LatentDirichletAllocation
    lda = LatentDirichletAllocation(2, learning_method='online')
    lda.fit(X)
    topics = lda.components_

Each row of the components matrix expresses a topic as a weighted combination of the words:

    topic1 = β11 word1 + β12 word2 + …
    topic2 = β21 word1 + β22 word2 + …

Plot each word at its coordinates in the two topics:

    topics = topics / topics.max(1)[:, np.newaxis]    # normalize each topic row
    topics += np.random.randn(*topics.shape) * 0.02   # jitter so labels don't overlap
    for i, word in enumerate(words):
        plt.text(topics[0, i], topics[1, i], word, ha='center')
    plt.show()

The same analysis can be repeated for another account:

    timeline = api.user_timeline('marcelorebelo_', count=100)
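Not on the slides, but a small sketch of how one might list each topic's strongest words from lda.components_ (argsort over the rows; the variable names reuse lda and words from above):

    import numpy as np

    for k, topic in enumerate(lda.components_):
        top = np.argsort(topic)[::-1][:5]              # indices of the 5 heaviest words
        print('topic %d:' % k, [words[i] for i in top])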
Traditional Learning vs Deep Learning

Traditionally, hand-crafted features would be extracted from the dataset, and learning would happen on top of those features. Deep learning learns from the raw data.

Packages: scikit-image, numpy, keras
Traditional Learning

Cats vs Dogs – Kaggle Competition – https://www.kaggle.com/c/dogs-vs-cats
25,000 images of cats and dogs

Feature #1: extract a histogram of colors

    import os
    import numpy as np
    from skimage.io import imread
    from skimage.color import rgb2gray   # rgb2gray lives in skimage.color, not skimage.transform

    for filename in os.listdir('train'):
        im = imread(os.path.join('train', filename))
        im = rgb2gray(im)
        f1 = np.histogram(im.flatten(), 10)[0]
        f1 = (f1 / f1.sum()).cumsum()    # normalized cumulative histogram

Feature #2: Histogram of Oriented Gradients

    from skimage.transform import resize
    from skimage.feature import hog

    im2 = resize(im, (32, 32), mode='reflect')
    im2 = np.sqrt(im2)                   # gamma correction
    f2 = hog(im2, block_norm='L2-Hys')

Fit a decision tree on the features:

    from sklearn.tree import DecisionTreeClassifier, export_graphviz
    m = DecisionTreeClassifier(max_depth=3)
    m.fit(X, y)

Cross-validate a random forest:

    from sklearn.model_selection import cross_val_score
    from sklearn.ensemble import RandomForestClassifier
    print(cross_val_score(RandomForestClassifier(100), X, y))

Output: [ 0.69642429  0.70086393  0.69851176]
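The slides jump from per-image features to m.fit(X, y); a minimal sketch of the missing assembly step, assuming the imports above and the Kaggle naming scheme (files named cat.0.jpg, dog.0.jpg, ...):

    X, y = [], []
    for filename in os.listdir('train'):
        im = rgb2gray(imread(os.path.join('train', filename)))
        f1 = (np.histogram(im.flatten(), 10)[0] / im.size).cumsum()
        f2 = hog(np.sqrt(resize(im, (32, 32), mode='reflect')), block_norm='L2-Hys')
        X.append(np.concatenate([f1, f2]))     # one feature row per image
        y.append(filename.startswith('cat'))   # label from the filename prefix
    X = np.asarray(X)
    y = np.asarray(y, dtype=int)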
Deep Learning

Linear regression:

    ŷ = β0 + β1 x1 + β2 x2 + …

Multilayer perceptron / neural network:

    ŷ = β00 σ(β10 + β11 x1 + β12 x2 + …) + β01 σ(β20 + β21 x1 + β22 x2 + …) + …

    from keras.models import Sequential
    from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
    from keras.optimizers import SGD

    model = Sequential()
    model.add(Conv2D(8, 3, 1, activation='relu', input_shape=(32, 32, 1)))
    model.add(MaxPooling2D())
    model.add(Conv2D(16, 3, 1, activation='relu'))
    model.add(MaxPooling2D())
    model.add(Flatten())
    model.add(Dense(16, activation='relu'))
    model.add(Dense(8, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))

    sgd = SGD()
    model.compile(sgd, 'binary_crossentropy')

    model.fit(X[tr], y[tr], validation_data=(X[ts], y[ts]),
              epochs=10, batch_size=100)
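To make the perceptron formula concrete, here is a minimal numpy sketch of a single forward pass; the weights are random stand-ins for the fitted βs (illustrative only, not the slides' model):

    import numpy as np

    def sigma(z):
        return 1 / (1 + np.exp(-z))    # logistic activation

    x = np.array([0.2, 0.7])           # two input features x1, x2
    W = np.random.randn(2, 2)          # hidden weights β11, β12 / β21, β22
    b = np.random.randn(2)             # hidden biases β10, β20
    v = np.random.randn(2)             # output weights β00, β01

    y_hat = v @ sigma(b + W @ x)       # the formula above, vectorized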
Deep Learning

Evaluate the network with stratified cross-validation:

    from sklearn.model_selection import StratifiedKFold
    from sklearn.metrics import accuracy_score

    for tr, ts in StratifiedKFold().split(X, y):
        model = ...  # build and fit the Sequential model from the previous slide
        ...
        yp = (model.predict(X[ts])[:, -1] > 0.5).astype(int)
        print(accuracy_score(y[ts], yp))

Output: [0.57, 0.57, 0.63]

Overview of the Python deep learning landscape: Theano, TensorFlow, PyTorch, Keras, Lasagne

Deep learning architectures:
Fully connected perceptrons
Convolutional neural networks
Recurrent neural networks
Neural Turing Machines
Autoencoders
Conclusions – Python for Scientific Computing
João Machado • Ricardo Cruz
Conclusions

Packages to know:
Numpy: basic linear algebra
Scipy: extensions to numpy – sparse matrices, pdfs, hypothesis tests (see the sketch after this list)
Statsmodels: several statistical models, incl. timeseries
Pandas: extension to numpy for dataframes support
Matplotlib, seaborn: drawing graphics
scikit-learn: complete machine learning toolkit
xgboost: famous gradient boosting model
Keras: deep learning (and TensorFlow, Theano, Lasagne)
OpenCV, scikit-image: image processing
NLTK: natural language toolkit
Gensim: natural language models
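A minimal sketch of the three Scipy items mentioned above (toy data; standard scipy.sparse / scipy.stats calls):

    import numpy as np
    from scipy import sparse, stats

    S = sparse.eye(1000, format='csr')   # sparse 1000×1000 identity matrix

    print(stats.norm.pdf(0))             # N(0,1) density at 0: ~0.3989

    a = np.random.randn(100)             # two toy samples
    b = np.random.randn(100) + 0.5
    t, p = stats.ttest_ind(a, b)         # two-sample t-test
    print(t, p)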
Final remarks

Python is a "jack of all trades" type of language;
Its speed and ease of development make it really apt for scientific computing;
It is increasingly adopted by scientists and engineers, thanks to the third-party scientific libraries contributed by a large community;
It has become a de facto language of advances in some fields, such as Deep Learning.
About us

João Machado
machadojpf@gmail.com
Fraunhofer Portugal research engineer
Masters in Electrical and Computer Engineering
http://www.linkedin.com/in/machadojpf

Ricardo Cruz
rpcruz@inesctec.pt
INESC TEC researcher
Computer Science & Applied Mathematics graduate
https://rpmcruz.github.io/

Subscribe to workshops: http://tinyurl.com/cruz-workshops