SlideShare a Scribd company logo
Introduction to Machine
Learning with Python
and scikit-learn
Python Atlanta
Nov. 14th 2013
Matt Hagy
matt@liveramp.com
Machine Learning (ML):
• Finding patterns in data

• Modeling patterns
• Use models to make
predictions

Slide #2

Intro to Machine Learning with Python

matt@liveramp.com
ML can be easy*
• You already have ML applications!

• You can start applying ML methods
now with Python &scikit-learn
• Theoretical knowledge of ML not
needed (initially)*
*Gaining more background, theory, and
experience will help
Slide #3

Intro to Machine Learning with Python

matt@liveramp.com
Simple Example

Slide #4

Intro to Machine Learning with Python

matt@liveramp.com
Simple Model

Slide #5

Intro to Machine Learning with Python

matt@liveramp.com
import numpyas np
from sklearn.linear_modelimport LinearRegression
x,y = np.load('data.npz')
x_test = np.linspace(0, 200)
model = LinearRegression()
model.fit(x[::, np.newaxis], y)
y_test = model.predict(x_test[::, np.newaxis])

Slide #6

Intro to Machine Learning with Python

matt@liveramp.com
Slide #7

Intro to Machine Learning with Python

matt@liveramp.com
Variance/Bias Trade Off
• Need models that can adapt to
relationships in our data
• Highly adaptable models can over-fit
and will not generalize
• Regularization – Common strategy to
address variance/bias trade off
Slide #8

Intro to Machine Learning with Python

matt@liveramp.com
Slide #9

Intro to Machine Learning with Python

matt@liveramp.com
import numpy as np
from sklearn.svmimport SVR
from sklearn.pipelineimport Pipeline
from sklearn.preprocessingimport StandardScaler
x,y = np.load('data.npz')
x_test = np.linspace(0, 200)

regularization
term

model = Pipeline([
('standardize', StandardScaler()),
('svr', SVR(kernel='rbf', verbose=0, C=5e6,
epsilon=20)) ])
model.fit(x[::, np.newaxis], y)
y_test = model.predict(x_test[::, np.newaxis])
Slide #10

Intro to Machine Learning with Python

matt@liveramp.com
Supervised Learning
Output, Y

0
3
1
3
4
2
9
3
4

1
6
3
7
9
3
17
6
7

Sample

Input, X

Slide #11

Modeling relationship
between inputs and outputs

Intro to Machine Learning with Python

matt@liveramp.com
Multiple Inputs
Input, X

Sample

X1

X2

X3

Xn

Output, Y

0
3
1
3
4
2
9
3
4

2
3
1
6
8
9
1
2
3

1
0
3
1
2
7
5
4
2

4
7
0
2
9
1
3
2
1

1
6
3
7
9
3
17
6
7

Slide #12

…

Intro to Machine Learning with Python

matt@liveramp.com
Example: Image Classification
• Classify
handwritten digits
with ML models
• Each input is an
entire image
• Output is digit in
the image
Slide #13

Intro to Machine Learning with Python

matt@liveramp.com
Input, X

Output, Y

9
2
Slide #14

Intro to Machine Learning with Python

matt@liveramp.com
import numpyas np
from sklearn.ensembleimport RandomForestClassifier
with np.load(’train.npz') as data:
pixels_train = data['pixels']
labels_train = data['labels’]
with np.load(’test.npz') as data:
pixels_test = data['pixels']
# flatten
X_train = pixels_train.reshape(pixels_train.shape[0], -1)
X_test = pixels_test.reshape(pixels_test.shape[0], -1)
model = RandomForestClassifier(n_estimators=50)
model.fit(X_train, labels_train)
labels_test = model.predict(X_test)
Slide #15

Intro to Machine Learning with Python

matt@liveramp.com
Predicting the tags of Stack Overflow
questions with machine learning
Kaggle Data Science Competition
• Given 6 million
training questions
labeled with tags
• Predict the tags for
2 million unlabeled
test questions
www.users.globalnet.co.uk/~slocks/instructions.html
stackoverflow.com/questions/895371/bubble-sort-homework

Slide #16

Intro to Machine Learning with Python

matt@liveramp.com
Text Classification Overview
Feature Extraction &
Selection
Raw Posts

Slide #17

Model Selection
& Training

Vector Space

Intro to Machine Learning with Python

Machine
Learning Model

matt@liveramp.com
Term Frequency Feature Extraction
Characterize text by the frequency of specific
words in each text entry

Slide #18

processing

sorted

array

faster

“Why is processing a
sorted array faster
than processing an
array this is not
sorted?”

Term Frequencies
why

Example Title:

1

2

2

2

1

Ignore common words
(i.e. stop words)

Intro to Machine Learning with Python

matt@liveramp.com
sorted

array

faster

need

help

java

homework

Title 1 1

2

2

2

1

0

0

0

0

Title 2 0

0

0

0

0

1

1

1

1

Title 3 0

0

1

1

0

0

1

0

1

why

processing

Frequency of key terms is anticipated to be
correlated with the tags of the question

Slide #19

Intro to Machine Learning with Python

matt@liveramp.com
Example Model Coefficients

Slide #22

Intro to Machine Learning with Python

matt@liveramp.com
ML can be easy*
• You already have ML problems!
• You can start applying ML methods now
with Python &scikit-learn
• Theoretical knowledge of ML not needed
(initially)*
scikit-learn.org

github.com/scikit-learn
Slide #24

Intro to Machine Learning with Python

matt@liveramp.com
Helping companies use their marketing data to delight customers

Tools

Opportunities
• Backend Engineers
• Data Scientists
• Full-Stack Engineers

• Java
• Hadoop (Map/Reduce)
• Ruby

Build and work with large distributed systems that
process massive data sets.
Check out: liveramp.com/careers
Slide #25

Intro to Machine Learning with Python

matt@liveramp.com

More Related Content

What's hot

Machine learning
Machine learningMachine learning
Machine learning
Rajib Kumar De
 
Machine Learning
Machine LearningMachine Learning
Machine Learning
Girish Khanzode
 
Machine learning
Machine learningMachine learning
Machine learning
Dr Geetha Mohan
 
Data preprocessing using Machine Learning
Data  preprocessing using Machine Learning Data  preprocessing using Machine Learning
Data preprocessing using Machine Learning
Gopal Sakarkar
 
Linear regression in machine learning
Linear regression in machine learningLinear regression in machine learning
Linear regression in machine learning
Shajun Nisha
 
Pandas
PandasPandas
Pandas
maikroeder
 
activelearning.ppt
activelearning.pptactivelearning.ppt
activelearning.pptbutest
 
KNN
KNN KNN
Dimension reduction techniques[Feature Selection]
Dimension reduction techniques[Feature Selection]Dimension reduction techniques[Feature Selection]
Dimension reduction techniques[Feature Selection]
AAKANKSHA JAIN
 
Machine learning ppt.
Machine learning ppt.Machine learning ppt.
Machine learning ppt.
ASHOK KUMAR
 
Machine Learning in 10 Minutes | What is Machine Learning? | Edureka
Machine Learning in 10 Minutes | What is Machine Learning? | EdurekaMachine Learning in 10 Minutes | What is Machine Learning? | Edureka
Machine Learning in 10 Minutes | What is Machine Learning? | Edureka
Edureka!
 
Scikit-Learn Tutorial | Machine Learning With Scikit-Learn | Sklearn | Python...
Scikit-Learn Tutorial | Machine Learning With Scikit-Learn | Sklearn | Python...Scikit-Learn Tutorial | Machine Learning With Scikit-Learn | Sklearn | Python...
Scikit-Learn Tutorial | Machine Learning With Scikit-Learn | Sklearn | Python...
Simplilearn
 
MachineLearning.ppt
MachineLearning.pptMachineLearning.ppt
MachineLearning.pptbutest
 
Python Seaborn Data Visualization
Python Seaborn Data Visualization Python Seaborn Data Visualization
Python Seaborn Data Visualization
Sourabh Sahu
 
Scikit Learn intro
Scikit Learn introScikit Learn intro
Scikit Learn intro
9xdot
 
Classification Based Machine Learning Algorithms
Classification Based Machine Learning AlgorithmsClassification Based Machine Learning Algorithms
Classification Based Machine Learning Algorithms
Md. Main Uddin Rony
 
Machine Learning
Machine LearningMachine Learning
Machine Learning
Shrey Malik
 
Classification and Regression
Classification and RegressionClassification and Regression
Classification and Regression
Megha Sharma
 
Python for Data Science
Python for Data SciencePython for Data Science
Python for Data Science
Harri Hämäläinen
 
Machine learning and types
Machine learning and typesMachine learning and types
Machine learning and types
Padma Metta
 

What's hot (20)

Machine learning
Machine learningMachine learning
Machine learning
 
Machine Learning
Machine LearningMachine Learning
Machine Learning
 
Machine learning
Machine learningMachine learning
Machine learning
 
Data preprocessing using Machine Learning
Data  preprocessing using Machine Learning Data  preprocessing using Machine Learning
Data preprocessing using Machine Learning
 
Linear regression in machine learning
Linear regression in machine learningLinear regression in machine learning
Linear regression in machine learning
 
Pandas
PandasPandas
Pandas
 
activelearning.ppt
activelearning.pptactivelearning.ppt
activelearning.ppt
 
KNN
KNN KNN
KNN
 
Dimension reduction techniques[Feature Selection]
Dimension reduction techniques[Feature Selection]Dimension reduction techniques[Feature Selection]
Dimension reduction techniques[Feature Selection]
 
Machine learning ppt.
Machine learning ppt.Machine learning ppt.
Machine learning ppt.
 
Machine Learning in 10 Minutes | What is Machine Learning? | Edureka
Machine Learning in 10 Minutes | What is Machine Learning? | EdurekaMachine Learning in 10 Minutes | What is Machine Learning? | Edureka
Machine Learning in 10 Minutes | What is Machine Learning? | Edureka
 
Scikit-Learn Tutorial | Machine Learning With Scikit-Learn | Sklearn | Python...
Scikit-Learn Tutorial | Machine Learning With Scikit-Learn | Sklearn | Python...Scikit-Learn Tutorial | Machine Learning With Scikit-Learn | Sklearn | Python...
Scikit-Learn Tutorial | Machine Learning With Scikit-Learn | Sklearn | Python...
 
MachineLearning.ppt
MachineLearning.pptMachineLearning.ppt
MachineLearning.ppt
 
Python Seaborn Data Visualization
Python Seaborn Data Visualization Python Seaborn Data Visualization
Python Seaborn Data Visualization
 
Scikit Learn intro
Scikit Learn introScikit Learn intro
Scikit Learn intro
 
Classification Based Machine Learning Algorithms
Classification Based Machine Learning AlgorithmsClassification Based Machine Learning Algorithms
Classification Based Machine Learning Algorithms
 
Machine Learning
Machine LearningMachine Learning
Machine Learning
 
Classification and Regression
Classification and RegressionClassification and Regression
Classification and Regression
 
Python for Data Science
Python for Data SciencePython for Data Science
Python for Data Science
 
Machine learning and types
Machine learning and typesMachine learning and types
Machine learning and types
 

Viewers also liked

Machine learning with scikit-learn
Machine learning with scikit-learnMachine learning with scikit-learn
Machine learning with scikit-learn
Qingkai Kong
 
Intro to scikit learn may 2017
Intro to scikit learn may 2017Intro to scikit learn may 2017
Intro to scikit learn may 2017
Francesco Mosconi
 
Data Science and Machine Learning Using Python and Scikit-learn
Data Science and Machine Learning Using Python and Scikit-learnData Science and Machine Learning Using Python and Scikit-learn
Data Science and Machine Learning Using Python and Scikit-learn
Asim Jalis
 
Tree models with Scikit-Learn: Great models with little assumptions
Tree models with Scikit-Learn: Great models with little assumptionsTree models with Scikit-Learn: Great models with little assumptions
Tree models with Scikit-Learn: Great models with little assumptions
Gilles Louppe
 
Intro to scikit-learn
Intro to scikit-learnIntro to scikit-learn
Intro to scikit-learn
AWeber
 
Realtime predictive analytics using RabbitMQ & scikit-learn
Realtime predictive analytics using RabbitMQ & scikit-learnRealtime predictive analytics using RabbitMQ & scikit-learn
Realtime predictive analytics using RabbitMQ & scikit-learn
AWeber
 
Machine Learning with scikit-learn
Machine Learning with scikit-learnMachine Learning with scikit-learn
Machine Learning with scikit-learn
odsc
 
Think machine-learning-with-scikit-learn-chetan
Think machine-learning-with-scikit-learn-chetanThink machine-learning-with-scikit-learn-chetan
Think machine-learning-with-scikit-learn-chetan
Chetan Khatri
 
Pyparis2017 / Scikit-learn - an incomplete yearly review, by Gael Varoquaux
Pyparis2017 / Scikit-learn - an incomplete yearly review, by Gael VaroquauxPyparis2017 / Scikit-learn - an incomplete yearly review, by Gael Varoquaux
Pyparis2017 / Scikit-learn - an incomplete yearly review, by Gael Varoquaux
Pôle Systematic Paris-Region
 
Clustering: A Scikit Learn Tutorial
Clustering: A Scikit Learn TutorialClustering: A Scikit Learn Tutorial
Clustering: A Scikit Learn Tutorial
Damian R. Mingle, MBA
 
Intro to machine learning with scikit learn
Intro to machine learning with scikit learnIntro to machine learning with scikit learn
Intro to machine learning with scikit learn
Yoss Cohen
 
Authorship Attribution and Forensic Linguistics with Python/Scikit-Learn/Pand...
Authorship Attribution and Forensic Linguistics with Python/Scikit-Learn/Pand...Authorship Attribution and Forensic Linguistics with Python/Scikit-Learn/Pand...
Authorship Attribution and Forensic Linguistics with Python/Scikit-Learn/Pand...
PyData
 
Exploring Machine Learning in Python with Scikit-Learn
Exploring Machine Learning in Python with Scikit-LearnExploring Machine Learning in Python with Scikit-Learn
Exploring Machine Learning in Python with Scikit-Learn
Kan Ouivirach, Ph.D.
 
Machine learning in production with scikit-learn
Machine learning in production with scikit-learnMachine learning in production with scikit-learn
Machine learning in production with scikit-learn
Jeff Klukas
 
Scikit-learn: the state of the union 2016
Scikit-learn: the state of the union 2016Scikit-learn: the state of the union 2016
Scikit-learn: the state of the union 2016
Gael Varoquaux
 
Numerical tour in the Python eco-system: Python, NumPy, scikit-learn
Numerical tour in the Python eco-system: Python, NumPy, scikit-learnNumerical tour in the Python eco-system: Python, NumPy, scikit-learn
Numerical tour in the Python eco-system: Python, NumPy, scikit-learn
Arnaud Joly
 
Accelerating Random Forests in Scikit-Learn
Accelerating Random Forests in Scikit-LearnAccelerating Random Forests in Scikit-Learn
Accelerating Random Forests in Scikit-Learn
Gilles Louppe
 
Scikit-learn for easy machine learning: the vision, the tool, and the project
Scikit-learn for easy machine learning: the vision, the tool, and the projectScikit-learn for easy machine learning: the vision, the tool, and the project
Scikit-learn for easy machine learning: the vision, the tool, and the project
Gael Varoquaux
 
Converting Scikit-Learn to PMML
Converting Scikit-Learn to PMMLConverting Scikit-Learn to PMML
Converting Scikit-Learn to PMML
Villu Ruusmann
 
Text Classification/Categorization
Text Classification/CategorizationText Classification/Categorization
Text Classification/Categorization
Oswal Abhishek
 

Viewers also liked (20)

Machine learning with scikit-learn
Machine learning with scikit-learnMachine learning with scikit-learn
Machine learning with scikit-learn
 
Intro to scikit learn may 2017
Intro to scikit learn may 2017Intro to scikit learn may 2017
Intro to scikit learn may 2017
 
Data Science and Machine Learning Using Python and Scikit-learn
Data Science and Machine Learning Using Python and Scikit-learnData Science and Machine Learning Using Python and Scikit-learn
Data Science and Machine Learning Using Python and Scikit-learn
 
Tree models with Scikit-Learn: Great models with little assumptions
Tree models with Scikit-Learn: Great models with little assumptionsTree models with Scikit-Learn: Great models with little assumptions
Tree models with Scikit-Learn: Great models with little assumptions
 
Intro to scikit-learn
Intro to scikit-learnIntro to scikit-learn
Intro to scikit-learn
 
Realtime predictive analytics using RabbitMQ & scikit-learn
Realtime predictive analytics using RabbitMQ & scikit-learnRealtime predictive analytics using RabbitMQ & scikit-learn
Realtime predictive analytics using RabbitMQ & scikit-learn
 
Machine Learning with scikit-learn
Machine Learning with scikit-learnMachine Learning with scikit-learn
Machine Learning with scikit-learn
 
Think machine-learning-with-scikit-learn-chetan
Think machine-learning-with-scikit-learn-chetanThink machine-learning-with-scikit-learn-chetan
Think machine-learning-with-scikit-learn-chetan
 
Pyparis2017 / Scikit-learn - an incomplete yearly review, by Gael Varoquaux
Pyparis2017 / Scikit-learn - an incomplete yearly review, by Gael VaroquauxPyparis2017 / Scikit-learn - an incomplete yearly review, by Gael Varoquaux
Pyparis2017 / Scikit-learn - an incomplete yearly review, by Gael Varoquaux
 
Clustering: A Scikit Learn Tutorial
Clustering: A Scikit Learn TutorialClustering: A Scikit Learn Tutorial
Clustering: A Scikit Learn Tutorial
 
Intro to machine learning with scikit learn
Intro to machine learning with scikit learnIntro to machine learning with scikit learn
Intro to machine learning with scikit learn
 
Authorship Attribution and Forensic Linguistics with Python/Scikit-Learn/Pand...
Authorship Attribution and Forensic Linguistics with Python/Scikit-Learn/Pand...Authorship Attribution and Forensic Linguistics with Python/Scikit-Learn/Pand...
Authorship Attribution and Forensic Linguistics with Python/Scikit-Learn/Pand...
 
Exploring Machine Learning in Python with Scikit-Learn
Exploring Machine Learning in Python with Scikit-LearnExploring Machine Learning in Python with Scikit-Learn
Exploring Machine Learning in Python with Scikit-Learn
 
Machine learning in production with scikit-learn
Machine learning in production with scikit-learnMachine learning in production with scikit-learn
Machine learning in production with scikit-learn
 
Scikit-learn: the state of the union 2016
Scikit-learn: the state of the union 2016Scikit-learn: the state of the union 2016
Scikit-learn: the state of the union 2016
 
Numerical tour in the Python eco-system: Python, NumPy, scikit-learn
Numerical tour in the Python eco-system: Python, NumPy, scikit-learnNumerical tour in the Python eco-system: Python, NumPy, scikit-learn
Numerical tour in the Python eco-system: Python, NumPy, scikit-learn
 
Accelerating Random Forests in Scikit-Learn
Accelerating Random Forests in Scikit-LearnAccelerating Random Forests in Scikit-Learn
Accelerating Random Forests in Scikit-Learn
 
Scikit-learn for easy machine learning: the vision, the tool, and the project
Scikit-learn for easy machine learning: the vision, the tool, and the projectScikit-learn for easy machine learning: the vision, the tool, and the project
Scikit-learn for easy machine learning: the vision, the tool, and the project
 
Converting Scikit-Learn to PMML
Converting Scikit-Learn to PMMLConverting Scikit-Learn to PMML
Converting Scikit-Learn to PMML
 
Text Classification/Categorization
Text Classification/CategorizationText Classification/Categorization
Text Classification/Categorization
 

Similar to Introduction to Machine Learning with Python and scikit-learn

IRJET- Unabridged Review of Supervised Machine Learning Regression and Classi...
IRJET- Unabridged Review of Supervised Machine Learning Regression and Classi...IRJET- Unabridged Review of Supervised Machine Learning Regression and Classi...
IRJET- Unabridged Review of Supervised Machine Learning Regression and Classi...
IRJET Journal
 
The ABC of Implementing Supervised Machine Learning with Python.pptx
The ABC of Implementing Supervised Machine Learning with Python.pptxThe ABC of Implementing Supervised Machine Learning with Python.pptx
The ABC of Implementing Supervised Machine Learning with Python.pptx
Ruby Shrestha
 
Statistics in Data Science with Python
Statistics in Data Science with PythonStatistics in Data Science with Python
Statistics in Data Science with Python
Mahe Karim
 
Introduction to deep learning using python
Introduction to deep learning using pythonIntroduction to deep learning using python
Introduction to deep learning using python
Lino Coria
 
Machine Learning part 2 - Introduction to Data Science
Machine Learning part 2 -  Introduction to Data Science Machine Learning part 2 -  Introduction to Data Science
Machine Learning part 2 - Introduction to Data Science
Frank Kienle
 
Building a custom machine learning model on android
Building a custom machine learning model on androidBuilding a custom machine learning model on android
Building a custom machine learning model on android
Isabel Palomar
 
Start machine learning in 5 simple steps
Start machine learning in 5 simple stepsStart machine learning in 5 simple steps
Start machine learning in 5 simple steps
Renjith M P
 
Workshop: Your first machine learning project
Workshop: Your first machine learning projectWorkshop: Your first machine learning project
Workshop: Your first machine learning project
Alex Austin
 
Ml programming with python
Ml programming with pythonMl programming with python
Ml programming with python
Kumud Arora
 
Asgh
AsghAsgh
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine Learning
Big_Data_Ukraine
 
Jupyter Notebooks for machine learning on Kubernetes & OpenShift | DevNation ...
Jupyter Notebooks for machine learning on Kubernetes & OpenShift | DevNation ...Jupyter Notebooks for machine learning on Kubernetes & OpenShift | DevNation ...
Jupyter Notebooks for machine learning on Kubernetes & OpenShift | DevNation ...
Red Hat Developers
 
CSL0777-L07.pptx
CSL0777-L07.pptxCSL0777-L07.pptx
CSL0777-L07.pptx
KonkoboUlrichArthur
 
Python Manuel-R2021.pdf
Python Manuel-R2021.pdfPython Manuel-R2021.pdf
Python Manuel-R2021.pdf
RamprakashSingaravel1
 
Python + Tensorflow: how to earn money in the Stock Exchange with Deep Learni...
Python + Tensorflow: how to earn money in the Stock Exchange with Deep Learni...Python + Tensorflow: how to earn money in the Stock Exchange with Deep Learni...
Python + Tensorflow: how to earn money in the Stock Exchange with Deep Learni...
ETS Asset Management Factory
 
GE3171-PROBLEM SOLVING AND PYTHON PROGRAMMING LABORATORY
GE3171-PROBLEM SOLVING AND PYTHON PROGRAMMING LABORATORYGE3171-PROBLEM SOLVING AND PYTHON PROGRAMMING LABORATORY
GE3171-PROBLEM SOLVING AND PYTHON PROGRAMMING LABORATORY
ANJALAI AMMAL MAHALINGAM ENGINEERING COLLEGE
 
AIML4 CNN lab256 1hr (111-1).pdf
AIML4 CNN lab256 1hr (111-1).pdfAIML4 CNN lab256 1hr (111-1).pdf
AIML4 CNN lab256 1hr (111-1).pdf
ssuserb4d806
 
Introduction to Machine Learning by MARK
Introduction to Machine Learning by MARKIntroduction to Machine Learning by MARK
Introduction to Machine Learning by MARK
MRKUsafzai0607
 
[DevDay2019] Python Machine Learning with Jupyter Notebook - By Nguyen Huu Th...
[DevDay2019] Python Machine Learning with Jupyter Notebook - By Nguyen Huu Th...[DevDay2019] Python Machine Learning with Jupyter Notebook - By Nguyen Huu Th...
[DevDay2019] Python Machine Learning with Jupyter Notebook - By Nguyen Huu Th...
DevDay.org
 
Learning Predictive Modeling with TSA and Kaggle
Learning Predictive Modeling with TSA and KaggleLearning Predictive Modeling with TSA and Kaggle
Learning Predictive Modeling with TSA and Kaggle
Yvonne K. Matos
 

Similar to Introduction to Machine Learning with Python and scikit-learn (20)

IRJET- Unabridged Review of Supervised Machine Learning Regression and Classi...
IRJET- Unabridged Review of Supervised Machine Learning Regression and Classi...IRJET- Unabridged Review of Supervised Machine Learning Regression and Classi...
IRJET- Unabridged Review of Supervised Machine Learning Regression and Classi...
 
The ABC of Implementing Supervised Machine Learning with Python.pptx
The ABC of Implementing Supervised Machine Learning with Python.pptxThe ABC of Implementing Supervised Machine Learning with Python.pptx
The ABC of Implementing Supervised Machine Learning with Python.pptx
 
Statistics in Data Science with Python
Statistics in Data Science with PythonStatistics in Data Science with Python
Statistics in Data Science with Python
 
Introduction to deep learning using python
Introduction to deep learning using pythonIntroduction to deep learning using python
Introduction to deep learning using python
 
Machine Learning part 2 - Introduction to Data Science
Machine Learning part 2 -  Introduction to Data Science Machine Learning part 2 -  Introduction to Data Science
Machine Learning part 2 - Introduction to Data Science
 
Building a custom machine learning model on android
Building a custom machine learning model on androidBuilding a custom machine learning model on android
Building a custom machine learning model on android
 
Start machine learning in 5 simple steps
Start machine learning in 5 simple stepsStart machine learning in 5 simple steps
Start machine learning in 5 simple steps
 
Workshop: Your first machine learning project
Workshop: Your first machine learning projectWorkshop: Your first machine learning project
Workshop: Your first machine learning project
 
Ml programming with python
Ml programming with pythonMl programming with python
Ml programming with python
 
Asgh
AsghAsgh
Asgh
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine Learning
 
Jupyter Notebooks for machine learning on Kubernetes & OpenShift | DevNation ...
Jupyter Notebooks for machine learning on Kubernetes & OpenShift | DevNation ...Jupyter Notebooks for machine learning on Kubernetes & OpenShift | DevNation ...
Jupyter Notebooks for machine learning on Kubernetes & OpenShift | DevNation ...
 
CSL0777-L07.pptx
CSL0777-L07.pptxCSL0777-L07.pptx
CSL0777-L07.pptx
 
Python Manuel-R2021.pdf
Python Manuel-R2021.pdfPython Manuel-R2021.pdf
Python Manuel-R2021.pdf
 
Python + Tensorflow: how to earn money in the Stock Exchange with Deep Learni...
Python + Tensorflow: how to earn money in the Stock Exchange with Deep Learni...Python + Tensorflow: how to earn money in the Stock Exchange with Deep Learni...
Python + Tensorflow: how to earn money in the Stock Exchange with Deep Learni...
 
GE3171-PROBLEM SOLVING AND PYTHON PROGRAMMING LABORATORY
GE3171-PROBLEM SOLVING AND PYTHON PROGRAMMING LABORATORYGE3171-PROBLEM SOLVING AND PYTHON PROGRAMMING LABORATORY
GE3171-PROBLEM SOLVING AND PYTHON PROGRAMMING LABORATORY
 
AIML4 CNN lab256 1hr (111-1).pdf
AIML4 CNN lab256 1hr (111-1).pdfAIML4 CNN lab256 1hr (111-1).pdf
AIML4 CNN lab256 1hr (111-1).pdf
 
Introduction to Machine Learning by MARK
Introduction to Machine Learning by MARKIntroduction to Machine Learning by MARK
Introduction to Machine Learning by MARK
 
[DevDay2019] Python Machine Learning with Jupyter Notebook - By Nguyen Huu Th...
[DevDay2019] Python Machine Learning with Jupyter Notebook - By Nguyen Huu Th...[DevDay2019] Python Machine Learning with Jupyter Notebook - By Nguyen Huu Th...
[DevDay2019] Python Machine Learning with Jupyter Notebook - By Nguyen Huu Th...
 
Learning Predictive Modeling with TSA and Kaggle
Learning Predictive Modeling with TSA and KaggleLearning Predictive Modeling with TSA and Kaggle
Learning Predictive Modeling with TSA and Kaggle
 

Recently uploaded

Welcome to TechSoup New Member Orientation and Q&A (May 2024).pdf
Welcome to TechSoup   New Member Orientation and Q&A (May 2024).pdfWelcome to TechSoup   New Member Orientation and Q&A (May 2024).pdf
Welcome to TechSoup New Member Orientation and Q&A (May 2024).pdf
TechSoup
 
CLASS 11 CBSE B.St Project AIDS TO TRADE - INSURANCE
CLASS 11 CBSE B.St Project AIDS TO TRADE - INSURANCECLASS 11 CBSE B.St Project AIDS TO TRADE - INSURANCE
CLASS 11 CBSE B.St Project AIDS TO TRADE - INSURANCE
BhavyaRajput3
 
Sha'Carri Richardson Presentation 202345
Sha'Carri Richardson Presentation 202345Sha'Carri Richardson Presentation 202345
Sha'Carri Richardson Presentation 202345
beazzy04
 
Embracing GenAI - A Strategic Imperative
Embracing GenAI - A Strategic ImperativeEmbracing GenAI - A Strategic Imperative
Embracing GenAI - A Strategic Imperative
Peter Windle
 
Thesis Statement for students diagnonsed withADHD.ppt
Thesis Statement for students diagnonsed withADHD.pptThesis Statement for students diagnonsed withADHD.ppt
Thesis Statement for students diagnonsed withADHD.ppt
EverAndrsGuerraGuerr
 
The Challenger.pdf DNHS Official Publication
The Challenger.pdf DNHS Official PublicationThe Challenger.pdf DNHS Official Publication
The Challenger.pdf DNHS Official Publication
Delapenabediema
 
The geography of Taylor Swift - some ideas
The geography of Taylor Swift - some ideasThe geography of Taylor Swift - some ideas
The geography of Taylor Swift - some ideas
GeoBlogs
 
Home assignment II on Spectroscopy 2024 Answers.pdf
Home assignment II on Spectroscopy 2024 Answers.pdfHome assignment II on Spectroscopy 2024 Answers.pdf
Home assignment II on Spectroscopy 2024 Answers.pdf
Tamralipta Mahavidyalaya
 
Guidance_and_Counselling.pdf B.Ed. 4th Semester
Guidance_and_Counselling.pdf B.Ed. 4th SemesterGuidance_and_Counselling.pdf B.Ed. 4th Semester
Guidance_and_Counselling.pdf B.Ed. 4th Semester
Atul Kumar Singh
 
How libraries can support authors with open access requirements for UKRI fund...
How libraries can support authors with open access requirements for UKRI fund...How libraries can support authors with open access requirements for UKRI fund...
How libraries can support authors with open access requirements for UKRI fund...
Jisc
 
BÀI TẬP BỔ TRỢ TIẾNG ANH GLOBAL SUCCESS LỚP 3 - CẢ NĂM (CÓ FILE NGHE VÀ ĐÁP Á...
BÀI TẬP BỔ TRỢ TIẾNG ANH GLOBAL SUCCESS LỚP 3 - CẢ NĂM (CÓ FILE NGHE VÀ ĐÁP Á...BÀI TẬP BỔ TRỢ TIẾNG ANH GLOBAL SUCCESS LỚP 3 - CẢ NĂM (CÓ FILE NGHE VÀ ĐÁP Á...
BÀI TẬP BỔ TRỢ TIẾNG ANH GLOBAL SUCCESS LỚP 3 - CẢ NĂM (CÓ FILE NGHE VÀ ĐÁP Á...
Nguyen Thanh Tu Collection
 
Synthetic Fiber Construction in lab .pptx
Synthetic Fiber Construction in lab .pptxSynthetic Fiber Construction in lab .pptx
Synthetic Fiber Construction in lab .pptx
Pavel ( NSTU)
 
How to Make a Field invisible in Odoo 17
How to Make a Field invisible in Odoo 17How to Make a Field invisible in Odoo 17
How to Make a Field invisible in Odoo 17
Celine George
 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
siemaillard
 
The approach at University of Liverpool.pptx
The approach at University of Liverpool.pptxThe approach at University of Liverpool.pptx
The approach at University of Liverpool.pptx
Jisc
 
Acetabularia Information For Class 9 .docx
Acetabularia Information For Class 9  .docxAcetabularia Information For Class 9  .docx
Acetabularia Information For Class 9 .docx
vaibhavrinwa19
 
The French Revolution Class 9 Study Material pdf free download
The French Revolution Class 9 Study Material pdf free downloadThe French Revolution Class 9 Study Material pdf free download
The French Revolution Class 9 Study Material pdf free download
Vivekanand Anglo Vedic Academy
 
TESDA TM1 REVIEWER FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
TESDA TM1 REVIEWER  FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...TESDA TM1 REVIEWER  FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
TESDA TM1 REVIEWER FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
EugeneSaldivar
 
Biological Screening of Herbal Drugs in detailed.
Biological Screening of Herbal Drugs in detailed.Biological Screening of Herbal Drugs in detailed.
Biological Screening of Herbal Drugs in detailed.
Ashokrao Mane college of Pharmacy Peth-Vadgaon
 
Unit 2- Research Aptitude (UGC NET Paper I).pdf
Unit 2- Research Aptitude (UGC NET Paper I).pdfUnit 2- Research Aptitude (UGC NET Paper I).pdf
Unit 2- Research Aptitude (UGC NET Paper I).pdf
Thiyagu K
 

Recently uploaded (20)

Welcome to TechSoup New Member Orientation and Q&A (May 2024).pdf
Welcome to TechSoup   New Member Orientation and Q&A (May 2024).pdfWelcome to TechSoup   New Member Orientation and Q&A (May 2024).pdf
Welcome to TechSoup New Member Orientation and Q&A (May 2024).pdf
 
CLASS 11 CBSE B.St Project AIDS TO TRADE - INSURANCE
CLASS 11 CBSE B.St Project AIDS TO TRADE - INSURANCECLASS 11 CBSE B.St Project AIDS TO TRADE - INSURANCE
CLASS 11 CBSE B.St Project AIDS TO TRADE - INSURANCE
 
Sha'Carri Richardson Presentation 202345
Sha'Carri Richardson Presentation 202345Sha'Carri Richardson Presentation 202345
Sha'Carri Richardson Presentation 202345
 
Embracing GenAI - A Strategic Imperative
Embracing GenAI - A Strategic ImperativeEmbracing GenAI - A Strategic Imperative
Embracing GenAI - A Strategic Imperative
 
Thesis Statement for students diagnonsed withADHD.ppt
Thesis Statement for students diagnonsed withADHD.pptThesis Statement for students diagnonsed withADHD.ppt
Thesis Statement for students diagnonsed withADHD.ppt
 
The Challenger.pdf DNHS Official Publication
The Challenger.pdf DNHS Official PublicationThe Challenger.pdf DNHS Official Publication
The Challenger.pdf DNHS Official Publication
 
The geography of Taylor Swift - some ideas
The geography of Taylor Swift - some ideasThe geography of Taylor Swift - some ideas
The geography of Taylor Swift - some ideas
 
Home assignment II on Spectroscopy 2024 Answers.pdf
Home assignment II on Spectroscopy 2024 Answers.pdfHome assignment II on Spectroscopy 2024 Answers.pdf
Home assignment II on Spectroscopy 2024 Answers.pdf
 
Guidance_and_Counselling.pdf B.Ed. 4th Semester
Guidance_and_Counselling.pdf B.Ed. 4th SemesterGuidance_and_Counselling.pdf B.Ed. 4th Semester
Guidance_and_Counselling.pdf B.Ed. 4th Semester
 
How libraries can support authors with open access requirements for UKRI fund...
How libraries can support authors with open access requirements for UKRI fund...How libraries can support authors with open access requirements for UKRI fund...
How libraries can support authors with open access requirements for UKRI fund...
 
BÀI TẬP BỔ TRỢ TIẾNG ANH GLOBAL SUCCESS LỚP 3 - CẢ NĂM (CÓ FILE NGHE VÀ ĐÁP Á...
BÀI TẬP BỔ TRỢ TIẾNG ANH GLOBAL SUCCESS LỚP 3 - CẢ NĂM (CÓ FILE NGHE VÀ ĐÁP Á...BÀI TẬP BỔ TRỢ TIẾNG ANH GLOBAL SUCCESS LỚP 3 - CẢ NĂM (CÓ FILE NGHE VÀ ĐÁP Á...
BÀI TẬP BỔ TRỢ TIẾNG ANH GLOBAL SUCCESS LỚP 3 - CẢ NĂM (CÓ FILE NGHE VÀ ĐÁP Á...
 
Synthetic Fiber Construction in lab .pptx
Synthetic Fiber Construction in lab .pptxSynthetic Fiber Construction in lab .pptx
Synthetic Fiber Construction in lab .pptx
 
How to Make a Field invisible in Odoo 17
How to Make a Field invisible in Odoo 17How to Make a Field invisible in Odoo 17
How to Make a Field invisible in Odoo 17
 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
 
The approach at University of Liverpool.pptx
The approach at University of Liverpool.pptxThe approach at University of Liverpool.pptx
The approach at University of Liverpool.pptx
 
Acetabularia Information For Class 9 .docx
Acetabularia Information For Class 9  .docxAcetabularia Information For Class 9  .docx
Acetabularia Information For Class 9 .docx
 
The French Revolution Class 9 Study Material pdf free download
The French Revolution Class 9 Study Material pdf free downloadThe French Revolution Class 9 Study Material pdf free download
The French Revolution Class 9 Study Material pdf free download
 
TESDA TM1 REVIEWER FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
TESDA TM1 REVIEWER  FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...TESDA TM1 REVIEWER  FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
TESDA TM1 REVIEWER FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
 
Biological Screening of Herbal Drugs in detailed.
Biological Screening of Herbal Drugs in detailed.Biological Screening of Herbal Drugs in detailed.
Biological Screening of Herbal Drugs in detailed.
 
Unit 2- Research Aptitude (UGC NET Paper I).pdf
Unit 2- Research Aptitude (UGC NET Paper I).pdfUnit 2- Research Aptitude (UGC NET Paper I).pdf
Unit 2- Research Aptitude (UGC NET Paper I).pdf
 

Introduction to Machine Learning with Python and scikit-learn

  • 1. Introduction to Machine Learning with Python and scikit-learn Python Atlanta Nov. 14th 2013 Matt Hagy matt@liveramp.com
  • 2. Machine Learning (ML): • Finding patterns in data • Modeling patterns • Use models to make predictions Slide #2 Intro to Machine Learning with Python matt@liveramp.com
  • 3. ML can be easy* • You already have ML applications! • You can start applying ML methods now with Python &scikit-learn • Theoretical knowledge of ML not needed (initially)* *Gaining more background, theory, and experience will help Slide #3 Intro to Machine Learning with Python matt@liveramp.com
  • 4. Simple Example Slide #4 Intro to Machine Learning with Python matt@liveramp.com
  • 5. Simple Model Slide #5 Intro to Machine Learning with Python matt@liveramp.com
  • 6. import numpyas np from sklearn.linear_modelimport LinearRegression x,y = np.load('data.npz') x_test = np.linspace(0, 200) model = LinearRegression() model.fit(x[::, np.newaxis], y) y_test = model.predict(x_test[::, np.newaxis]) Slide #6 Intro to Machine Learning with Python matt@liveramp.com
  • 7. Slide #7 Intro to Machine Learning with Python matt@liveramp.com
  • 8. Variance/Bias Trade Off • Need models that can adapt to relationships in our data • Highly adaptable models can over-fit and will not generalize • Regularization – Common strategy to address variance/bias trade off Slide #8 Intro to Machine Learning with Python matt@liveramp.com
  • 9. Slide #9 Intro to Machine Learning with Python matt@liveramp.com
  • 10. import numpy as np from sklearn.svmimport SVR from sklearn.pipelineimport Pipeline from sklearn.preprocessingimport StandardScaler x,y = np.load('data.npz') x_test = np.linspace(0, 200) regularization term model = Pipeline([ ('standardize', StandardScaler()), ('svr', SVR(kernel='rbf', verbose=0, C=5e6, epsilon=20)) ]) model.fit(x[::, np.newaxis], y) y_test = model.predict(x_test[::, np.newaxis]) Slide #10 Intro to Machine Learning with Python matt@liveramp.com
  • 11. Supervised Learning Output, Y 0 3 1 3 4 2 9 3 4 1 6 3 7 9 3 17 6 7 Sample Input, X Slide #11 Modeling relationship between inputs and outputs Intro to Machine Learning with Python matt@liveramp.com
  • 12. Multiple Inputs Input, X Sample X1 X2 X3 Xn Output, Y 0 3 1 3 4 2 9 3 4 2 3 1 6 8 9 1 2 3 1 0 3 1 2 7 5 4 2 4 7 0 2 9 1 3 2 1 1 6 3 7 9 3 17 6 7 Slide #12 … Intro to Machine Learning with Python matt@liveramp.com
  • 13. Example: Image Classification • Classify handwritten digits with ML models • Each input is an entire image • Output is digit in the image Slide #13 Intro to Machine Learning with Python matt@liveramp.com
  • 14. Input, X Output, Y 9 2 Slide #14 Intro to Machine Learning with Python matt@liveramp.com
  • 15. import numpyas np from sklearn.ensembleimport RandomForestClassifier with np.load(’train.npz') as data: pixels_train = data['pixels'] labels_train = data['labels’] with np.load(’test.npz') as data: pixels_test = data['pixels'] # flatten X_train = pixels_train.reshape(pixels_train.shape[0], -1) X_test = pixels_test.reshape(pixels_test.shape[0], -1) model = RandomForestClassifier(n_estimators=50) model.fit(X_train, labels_train) labels_test = model.predict(X_test) Slide #15 Intro to Machine Learning with Python matt@liveramp.com
  • 16. Predicting the tags of Stack Overflow questions with machine learning Kaggle Data Science Competition • Given 6 million training questions labeled with tags • Predict the tags for 2 million unlabeled test questions www.users.globalnet.co.uk/~slocks/instructions.html stackoverflow.com/questions/895371/bubble-sort-homework Slide #16 Intro to Machine Learning with Python matt@liveramp.com
  • 17. Text Classification Overview Feature Extraction & Selection Raw Posts Slide #17 Model Selection & Training Vector Space Intro to Machine Learning with Python Machine Learning Model matt@liveramp.com
  • 18. Term Frequency Feature Extraction Characterize text by the frequency of specific words in each text entry Slide #18 processing sorted array faster “Why is processing a sorted array faster than processing an array this is not sorted?” Term Frequencies why Example Title: 1 2 2 2 1 Ignore common words (i.e. stop words) Intro to Machine Learning with Python matt@liveramp.com
  • 19. sorted array faster need help java homework Title 1 1 2 2 2 1 0 0 0 0 Title 2 0 0 0 0 0 1 1 1 1 Title 3 0 0 1 1 0 0 1 0 1 why processing Frequency of key terms is anticipated to be correlated with the tags of the question Slide #19 Intro to Machine Learning with Python matt@liveramp.com
  • 20. Example Model Coefficients Slide #22 Intro to Machine Learning with Python matt@liveramp.com
  • 21.
  • 22. ML can be easy* • You already have ML problems! • You can start applying ML methods now with Python &scikit-learn • Theoretical knowledge of ML not needed (initially)* scikit-learn.org github.com/scikit-learn Slide #24 Intro to Machine Learning with Python matt@liveramp.com
  • 23. Helping companies use their marketing data to delight customers Tools Opportunities • Backend Engineers • Data Scientists • Full-Stack Engineers • Java • Hadoop (Map/Reduce) • Ruby Build and work with large distributed systems that process massive data sets. Check out: liveramp.com/careers Slide #25 Intro to Machine Learning with Python matt@liveramp.com