Machine Learning
in Python
Dhiana Deva - PyLadies Stockholm
Agenda
Goal: Encourage you to use Machine Learning… today!
‣ About me
‣ Machine Learning
Misconceptions
Concepts
Problems and Algorithms
‣ ML + Python
Python Toolbox
Code Snippets
Spotify <3 Python
About me
Electronics Engineering, Software
Development, Machine Learning… Why not?
Machine
Learning
It is all about learning
Misconceptions
Too difficult
Big upfront investments
Needs supercomputers
Only for PhDs from MIT
Data Silo
Takes too long to pay off
Reality
♥
Feature Extraction
Item {
Feature 1
Feature 2
Feature 3
…
Feature N
[Figure: items plotted in feature space, with Feature 1 on the horizontal axis and Feature 2 on the vertical axis]
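As a minimal sketch of what feature extraction can look like in code, the snippet below turns one raw item into a fixed-length list of numbers. The item (a song), its field names, and the genre vocabulary are all hypothetical and chosen only for illustration; real features depend on the problem.

```python
# Hypothetical raw item: every field name and value here is made up.
song = {
    "title": "Example Track",
    "duration_s": 215,
    "tempo_bpm": 120,
    "explicit": False,
    "genre": "pop",
}

# Assumed closed vocabulary, used to one-hot encode the categorical field.
GENRES = ["pop", "rock", "jazz"]


def extract_features(item):
    """Map a raw item to a fixed-length list of numeric features."""
    one_hot_genre = [1.0 if item["genre"] == g else 0.0 for g in GENRES]
    return [
        item["duration_s"] / 60.0,         # duration in minutes
        float(item["tempo_bpm"]),          # numeric field used as-is
        1.0 if item["explicit"] else 0.0,  # boolean mapped to 0/1
    ] + one_hot_genre


features = extract_features(song)
print(features)
```

Every item passed through the same function ends up with the same number of features (Feature 1 … Feature N), which is what the learning algorithms expect as input.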
Supervised Learning
Features | Label
23, 45, 67, 78 | 3
12, 48, 68, 22 | 2
… | …
34, 58, 77, 19 | 5
Items → Features, Labels → Algorithm
New Item → Features → Model → Prediction
20, 39, 59, 68 → 3
Regression
[Figure: items labeled with numeric values; the value of a new item (?) is to be predicted]
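To make the regression idea concrete, here is a minimal sketch that fits a straight line to items with one feature and a numeric label, then predicts the value for a new item. Ordinary least squares is just one possible technique and is an assumption here, as are the function name `fit_line` and the made-up data.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (single feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up training data: one feature per item and a numeric label.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)

# Predict the numeric value for a new item whose feature is 5.0.
prediction = slope * 5.0 + intercept
print(slope, intercept, prediction)
```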
Classification
[Figure: items labeled A or B; the class of a new item (?) is to be predicted]
Supervised Learning Algorithms
K-Nearest Neighbors
Neural Networks
Decision Tree
Random Forest
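Of the algorithms listed, K-Nearest Neighbors is the simplest to sketch from scratch. The snippet below is an illustrative version, not a production implementation: it reuses the three feature rows shown on the Supervised Learning slide (the elided "…" rows are left out), while the A/B labels, the function name `knn_predict`, and the choice of Euclidean distance are assumptions.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest training items."""
    # Pair each training item's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Feature rows from the slide; the class labels are assumed for illustration.
train_X = [
    (23, 45, 67, 78),
    (12, 48, 68, 22),
    (34, 58, 77, 19),
]
train_y = ["A", "B", "B"]

new_item = (20, 39, 59, 68)
print(knn_predict(train_X, train_y, new_item, k=1))
```

With so few items the choice of `k` changes the answer: `k=1` follows the single closest item, while `k=3` is a vote over the whole training set.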
Detecting particles
Online electron detection
based on more than 1500
detector cells using
Neural Networks
[Figure: three panels showing rejection probability (%) versus E_T (GeV), η, and φ, comparing Ringer and T2Calo on cosmic-ray data]
Unsupervised Learning
Features
23, 45, 67, 78
12, 48, 68, 22
…
34, 58, 77, 19
Items → Features → Algorithm
New Item → Features → Model → Better Representation
20, 39, 59, 68
Clustering
Dimensionality Reduction
Unsupervised Learning Algorithms
K-Means
Self-Organising Maps
t-SNE
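K-Means is compact enough to sketch without any library. The version below is a simplified illustration under stated assumptions: a fixed number of iterations instead of a convergence check, initial centroids sampled from the data, and made-up 2-D points. The function name `k_means` and its parameters are hypothetical.

```python
import math
import random


def k_means(points, k, iterations=10, seed=0):
    """Group points into k clusters; returns (centroids, cluster index per point)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[i] = tuple(sum(c) / len(members) for c in zip(*members))
    assignments = [
        min(range(k), key=lambda i: math.dist(p, centroids[i])) for p in points
    ]
    return centroids, assignments


# Made-up data: two well-separated groups, and no labels.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, assignments = k_means(points, k=2)
print(assignments)
```

No labels are passed in: the grouping comes only from the distances between items, which is what distinguishes this from the supervised case.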
Visualizing employees
Visualization of 2000+
employees described by
200+ skills after reducing
dimensionality using the
t-SNE algorithm
ML + Python
A match made in heaven!
Python Toolbox
Classification
>>> from sklearn.ensemble import RandomForestClassifier
>>> classifier = RandomForestClassifier().fit(X_train, Y_train)
>>> Y_test = classifier.predict(X_test)  # predicted labels for X_test
Features | Label
23, 45, 67, 78 | A
12, 48, 68, 22 | B
… | …
34, 58, 77, 19 | B
Items → Features, Labels → Algorithm
New Item → Features → Model → Prediction
20, 39, 59, 68 → A
23, 45, 67, 78
12, 48, 68, 22
…
34, 58, 77, 19
Items → Features → Algorithm
Model → Better Representation
Dimensionality Reduction
>>> from sklearn.manifold import TSNE
>>> embedding = TSNE().fit_transform(X_train)
Data Visualization
>>> Scatter(df, x='x1', y='x2', color='label')
Spotify <3 Python
Thank you.
@dhianadeva
