Please turn off your webcam
If you are joining from a mobile phone, be sure to click on "Join via Device Audio"
We are waiting for other participants to join
We will begin at 4:30 PM IST
Mihir Thakkar
Founder and Instructor
hello@codeheroku.com
Introduction to Unsupervised Machine Learning
SESSION OBJECTIVES
● Quick Recap
● Unsupervised Learning
● Implement a Clustering Algorithm in Python
Types of Machine Learning Algorithms
Supervised Machine Learning
House Size (sq ft)   Location    Age (years)   Price (Lakh Rs)
Training Data
500                  Mumbai      2             70
1500                 Pune        3             100
2000                 Bangalore   4             60
Test Data
1000                 Mumbai      2             ?
3000                 Pune        10            ?
www.codeheroku.com Introduction to Unsupervised Machine Learning
Brain scans by functional magnetic resonance imaging. An illustration of finding interesting underlying
phenomena in high-dimensional data (Beckmann et al, Phil. Trans. Royal Soc. B, 2005).
Types of Machine Learning Algorithms
Unsupervised Machine Learning
Unsupervised Machine Learning Algorithm
Supervised ML
● Labelled dataset
● Prediction/Classification
● More formal problem
Unsupervised ML
● Labels / targets are unknown
● Finding hidden patterns in unlabelled data
● Problem itself is ambiguous
Applications of Unsupervised ML
● Dimensionality Reduction
● Image Compression
● Anomaly Detection
● Topic Mapping
QUIZ
In which of these scenarios would you most likely use an unsupervised machine learning algorithm?
1. Given a set of images, you are interested in grouping the ones that are similar
2. Given training data about a user’s preferences, you are interested in knowing whether they would like or dislike a movie
3. Given a set of 1000 features, you are interested in finding the features that capture maximum variance in the data
Clustering
[Figure: movies plotted by Movie Time and IMDB Rating]
1. Proximity Measures
2. Evaluation Criteria: what does a good cluster look like?
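A proximity measure is a rule for how close two data points are. The slides do not name a specific measure, so the sketch below assumes Euclidean distance, which is a common choice for clustering. The movie values and the reading of "Movie Time" as running time in minutes are hypothetical, chosen only for illustration.

```python
import math


def euclidean_distance(p, q):
    """Straight-line distance between two points with the same number of features."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# Hypothetical movies described as (running time in minutes, IMDB rating).
movie_a = (120, 7.5)
movie_b = (123, 7.5)
movie_c = (90, 7.5)

d_ab = euclidean_distance(movie_a, movie_b)  # small distance: similar movies
d_ac = euclidean_distance(movie_a, movie_c)  # larger distance: less similar
```

Under this measure, movie_a is closer to movie_b than to movie_c, so a clustering algorithm would be more likely to group the first two together.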
K-means
Initialize K centroids randomly, then repeat the two steps below until the assignments stop changing
Step 1: Assignment
Assign each data point to the cluster of its closest centroid
Step 2: Move Centroid
Move each centroid to the mean of all points in its cluster
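The K-means steps above can be sketched in plain Python as follows. This is a minimal illustration, not the code from the session's notebook. It assumes Euclidean distance, picks K of the data points as the random starting centroids, and stops when the assignments no longer change. The function names, the `seed` and `max_iters` parameters, and the sample points are all hypothetical.

```python
import math
import random


def closest_centroid(point, centroids):
    """Index of the centroid nearest to `point` (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda j: math.dist(point, centroids[j]))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def kmeans(points, k, max_iters=100, seed=0):
    rng = random.Random(seed)
    # Initialization: use k distinct data points as the starting centroids
    # (an assumption; the slide only says "randomly").
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iters):
        # Step 1 (assignment): label each point with its closest centroid.
        new_labels = [closest_centroid(p, centroids) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing, so the centroids will not move
        labels = new_labels
        # Step 2 (move centroid): mean of the points in each cluster.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = mean_point(members)
    return centroids, labels


# Hypothetical data: two well-separated groups of 2-D points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, labels = kmeans(points, k=2)
```

Here `labels[i]` is the cluster index assigned to `points[i]`. Which group gets index 0 depends on the random initialization, so only the grouping itself is meaningful.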
Quiz
The distortion of a clustering is the sum of squared distances between each data point and the centroid of its cluster. Calculate the distortion for the following values:
Data X   Data Y   Centroid X   Centroid Y
5        2        3            3
3        3        3            3
2        2        3            3
Ans: (5-3)^2 + (2-3)^2 + 0 + (2-3)^2 + (2-3)^2 = 7
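The quiz arithmetic can be checked with a short script. It assumes, as the answer implies, that distortion is the plain sum of squared Euclidean distances from each point to its centroid, and that all three points share the centroid (3, 3). The function name `distortion` is illustrative.

```python
def distortion(points, centroids):
    """Sum of squared distances from each 2-D point to its assigned centroid."""
    return sum(
        (px - cx) ** 2 + (py - cy) ** 2
        for (px, py), (cx, cy) in zip(points, centroids)
    )


# Values from the quiz table: three points, all assigned to centroid (3, 3).
quiz_points = [(5, 2), (3, 3), (2, 2)]
quiz_centroids = [(3, 3)] * len(quiz_points)

result = distortion(quiz_points, quiz_centroids)
print(result)  # 5 + 0 + 2 = 7
```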
Optimal Number of Clusters (K)
Will we always get the best solution?
Seen this before?
Outliers!!
Let’s Build It
https://drive.google.com/file/d/1wLwOro1YhfpPr0YqYywx6zLcLn8Ceh30/view?usp=sharing
https://github.com/codeheroku/Introduction-to-Machine-Learning/tree/master/
Thank you!
https://qr.ae/TUry32

Introduction to Unsupervised Learning - Code Heroku
