This is an introductory write-up on Machine Learning using Python. It covers the basic theory of Machine Learning along with practical implementation.
The lab notebook can be found here: https://github.com/opencubelabs
http://ocl.space
Supervised Learning
y = f(X)
X is the features/inputs
y is the target/output
f is the function to be learned, mapping inputs X to outputs y
Types:
● Regression
● Classification
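The y = f(X) framing can be sketched in a few lines of code. A minimal regression example, assuming scikit-learn is available; the toy data is invented for illustration:

```python
# Learn y = f(X) from labelled examples (supervised regression).
# The data below is a made-up toy set where y = 2x + 1.
from sklearn.linear_model import LinearRegression

X = [[1], [2], [3], [4]]   # features/inputs
y = [3, 5, 7, 9]           # target/output

f = LinearRegression().fit(X, y)   # learn the function f from (X, y) pairs
print(f.predict([[5]]))            # apply f to an unseen input
```

For classification the same fit/predict pattern applies, but y holds discrete labels instead of continuous values.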
Unsupervised Learning
● We have input data (X) but no corresponding output variable (y).
● The goal is to model the distribution of the data in order to learn more
about the data.
● Types of unsupervised learning:
--> Clustering
--> Association
Regression
● A predictive modelling technique which investigates the relationship between a
dependent variable (target) and independent variable(s) (predictors).
● It is used for forecasting, time series modelling and finding the causal effect
relationship between the variables.
● It indicates the significant relationships between dependent variable and
independent variable.
● It indicates the strength of impact of multiple independent variables on a
dependent variable.
● Types of regression: Linear, Logistic, Polynomial, Stepwise, Ridge, Lasso
and ElasticNet
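Two of the regularised variants listed above, Ridge and Lasso, can be compared in a short sketch. This assumes scikit-learn; the synthetic data and alpha values are chosen only for illustration:

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
# True relationship uses only the 1st and 3rd features.
y = X @ np.array([2.0, 0.0, -1.0]) + 0.01 * rng.normal(size=100)

ridge = Ridge(alpha=1.0).fit(X, y)   # shrinks all coefficients towards zero
lasso = Lasso(alpha=0.1).fit(X, y)   # can drive some coefficients exactly to zero
print(ridge.coef_.round(2))
print(lasso.coef_.round(2))
```

Lasso tends to zero out the unused middle feature, while Ridge only shrinks it.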
Clustering and Association
● The aim is to segregate groups with similar traits and assign them into clusters.
● Types of Clustering:
--> Hard Clustering: In hard clustering, each data point either belongs to a cluster
completely or not.
--> Soft Clustering: In soft clustering, each data point is assigned a probability or
likelihood of belonging to each cluster, rather than a single hard assignment.
● When we want to discover rules that describe large portions of the input data
(e.g. people who buy X also tend to buy Y), it is known as an association problem.
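The hard/soft distinction can be seen directly in code. A sketch assuming scikit-learn, where KMeans gives hard assignments and a Gaussian mixture gives soft (probabilistic) ones; the four 1-D points are invented:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture

X = np.array([[0.0], [0.1], [5.0], [5.1]])

# Hard clustering: each point gets exactly one cluster label.
hard = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
# Soft clustering: each point gets a probability for every cluster.
soft = GaussianMixture(n_components=2, random_state=0).fit(X).predict_proba(X)

print(hard)           # one label per point
print(soft.round(2))  # per-point cluster probabilities (rows sum to 1)
```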
Linear Regression
● It is used to estimate real values (cost of houses, number of calls, total sales etc.)
based on continuous variable(s).
● Here, we establish relationship between independent and dependent variables
by fitting a best line.
● This best-fit line is known as the regression line and is represented by the linear equation
Y = a * X + b
Y – Dependent Variable
a – Slope
X – Independent variable
b – Intercept
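The slope a and intercept b can be recovered directly from a fitted model. A sketch assuming scikit-learn; the data is synthetic, generated from a known line (a = 3, b = 1):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[0], [1], [2], [3]])   # independent variable
Y = 3 * X.ravel() + 1                # dependent variable: Y = 3*X + 1

reg = LinearRegression().fit(X, Y)
print(reg.coef_[0])    # slope a
print(reg.intercept_)  # intercept b
```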
Logistic Regression
● It is used to estimate discrete values (binary values like 0/1, yes/no, true/false)
based on a given set of independent variable(s).
● It predicts the probability of occurrence of an event by fitting data to a logit function.
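A minimal sketch, assuming scikit-learn; the pass/fail toy data (e.g. hours studied) is invented for illustration:

```python
from sklearn.linear_model import LogisticRegression

X = [[1], [2], [3], [7], [8], [9]]   # e.g. hours studied
y = [0, 0, 0, 1, 1, 1]               # fail/pass

clf = LogisticRegression().fit(X, y)
print(clf.predict([[2], [8]]))    # discrete 0/1 predictions
print(clf.predict_proba([[5]]))   # probability of each class, via the logit fit
```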
Overfitting & Underfitting
● Overfitting happens when a model performs too well on training data but does
not perform well on unseen data.
● Underfitting happens when a model performs poorly on training data as well as
unseen data.
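Both effects can be seen with polynomial fits of different degrees. A NumPy-only sketch; the sine data and the chosen degrees are arbitrary illustrations:

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 10)
y = np.sin(2 * np.pi * x) + 0.1 * rng.normal(size=10)   # noisy training data

under = np.polyfit(x, y, 0)   # degree 0: a constant -> underfits
over = np.polyfit(x, y, 9)    # degree 9: passes through every point -> overfits

train_err_under = np.abs(np.polyval(under, x) - y).max()
train_err_over = np.abs(np.polyval(over, x) - y).max()
print(train_err_under)  # large: fails even on the training data
print(train_err_over)   # near zero on training data, yet poor on unseen points
```

The degree-9 fit looks perfect on the training set, which is exactly why a separate test on unseen data is needed.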
Cross Validation
● A method to test how well a model performs on unseen data.
● Types of Cross Validation methods:
--> Hold out method
--> K-fold method
--> Leave-one-out cross validation
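The K-fold method can be sketched in one call, assuming scikit-learn and its bundled iris dataset; each of the 5 folds serves once as the held-out test set:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)
# cv=5 splits the data into 5 folds and trains/tests 5 times.
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)
print(scores)         # one accuracy score per fold
print(scores.mean())  # overall estimate of performance on unseen data
```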
Naive Bayes
● Naive Bayes is a supervised learning algorithm which is based on Bayes' theorem.
● The word naive comes from the assumption of independence among features.
● We can write Bayes' theorem as follows:
P(y | x) = ( P(x | y) * P(y) ) / P(x)
Where,
P(x) is the prior probability of a feature.
P(x | y) is the probability of a feature given the target, also known as the likelihood.
P(y) is the prior probability of a target (or class, in the case of classification).
P(y | x) is the posterior probability of the target given the feature.
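A minimal sketch, assuming scikit-learn's GaussianNB and its bundled iris dataset; the probabilities it returns are the posterior P(y | x) above:

```python
from sklearn.datasets import load_iris
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)
nb = GaussianNB().fit(X, y)

posterior = nb.predict_proba(X[:1])   # P(y | x) for the first sample
print(posterior.round(3))
print(nb.predict(X[:1]))              # the class with the highest posterior
```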
Support Vector Machines (SVMs)
● SVMs are among the best supervised learning algorithms.
● They are effective in high-dimensional spaces and memory efficient as well.
● We plot each data item as a point in n-dimensional space and perform classification
by finding the hyperplane that separates the two classes well.
● Many such hyperplanes can be drawn.
● The optimal hyperplane is the one that maximizes the margin.
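A minimal sketch, assuming scikit-learn's SVC; the 2-D toy points are invented. The support vectors it exposes are the points that define the maximum margin:

```python
from sklearn.svm import SVC

X = [[0, 0], [1, 1], [4, 4], [5, 5]]   # points in 2-D space
y = [0, 0, 1, 1]                       # two classes

svm = SVC(kernel="linear").fit(X, y)   # find the max-margin hyperplane
print(svm.predict([[0.5, 0.5], [4.5, 4.5]]))
print(svm.support_vectors_)            # the points that pin down the margin
```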
Decision Tree
● Decision Tree is a supervised learning algorithm which can be used for
classification as well as regression problems.
● Here we split the population into homogeneous sets by asking a series of questions.
● Example : To decide what to do on a particular day.
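The "what to do today" example can be sketched with scikit-learn's DecisionTreeClassifier; the features (outlook encoded as 0 = sunny, 1 = rainy, plus temperature) and activities are invented:

```python
from sklearn.tree import DecisionTreeClassifier

X = [[0, 30], [0, 20], [1, 20], [1, 10]]   # [outlook, temperature °C]
y = ["beach", "walk", "read", "read"]      # activity for that day

# The tree learns a series of yes/no questions that split the data
# into homogeneous groups.
tree = DecisionTreeClassifier(random_state=0).fit(X, y)
print(tree.predict([[0, 28]]))   # a warm, sunny day
```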
Random Forest
● Random Forest is the most common type of Ensemble Learning.
● It is a collection of decision trees.
● To classify a new object based on attributes, each tree gives a classification
and we say the tree “votes” for that class. The forest chooses the classification
having the most votes (over all the trees in the forest).
● Random forests have a plethora of advantages: for example, they are fast to train
and require little input preparation.
● One disadvantage of random forests is that the model may become very large.
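The voting described above can be sketched with scikit-learn's RandomForestClassifier on its bundled iris dataset; the individual trees are available for inspection:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

print(len(forest.estimators_))   # the 100 individual decision trees that vote
print(forest.predict(X[:1]))     # the majority vote over all the trees
```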
K-nearest Neighbors (KNN)
● KNN can be used for both classification and regression problems.
● It stores all available cases and classifies new cases by a majority vote of its k
neighbors.
● KNN is computationally expensive.
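A minimal sketch, assuming scikit-learn's KNeighborsClassifier; the 1-D toy data is invented. Each new case is labelled by a majority vote of its k = 3 nearest stored cases:

```python
from sklearn.neighbors import KNeighborsClassifier

X = [[0], [1], [2], [10], [11], [12]]   # stored cases
y = [0, 0, 0, 1, 1, 1]                  # their labels

knn = KNeighborsClassifier(n_neighbors=3).fit(X, y)   # "fit" just stores the data
print(knn.predict([[1.5], [10.5]]))     # majority vote of the 3 nearest neighbors
```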
K-means clustering
● K-means is one of the simplest unsupervised learning algorithms, used for
clustering problems.
● Our goal is to group objects based on their feature similarity.
● The basic idea behind K-means is that we define k centroids, one for each cluster,
and assign each point to its nearest centroid.
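A minimal sketch, assuming scikit-learn's KMeans; the 1-D toy data is invented. With k = 2, the algorithm finds one centroid per cluster and assigns each point to the nearest one:

```python
import numpy as np
from sklearn.cluster import KMeans

X = np.array([[0.0], [0.2], [9.8], [10.0]])   # two obvious groups

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(km.labels_)           # cluster assignment for each point
print(km.cluster_centers_)  # the k = 2 centroids (the cluster means)
```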
Neural Networks
● Neural Network is an information processing system, that is, we pass some
input to the Neural Network, some processing happens and we get some output.
● Neural Networks are inspired by the biological connections between neurons and how
information processing happens in the brain.
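The input → processing → output idea can be sketched with scikit-learn's MLPClassifier, here learning XOR, a function a single linear model cannot represent; the layer size and solver are arbitrary choices for this toy problem:

```python
from sklearn.neural_network import MLPClassifier

X = [[0, 0], [0, 1], [1, 0], [1, 1]]   # inputs
y = [0, 1, 1, 0]                       # XOR outputs

# One hidden layer of 8 neurons does the "processing" between input and output.
net = MLPClassifier(hidden_layer_sizes=(8,), solver="lbfgs",
                    max_iter=2000, random_state=0)
net.fit(X, y)
print(net.predict(X))   # the network's output for each input
```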