3 Unsupervised learning

The third lecture from the Machine Learning course series. It starts with an introduction to the idea of unsupervised analysis and mostly focuses on clustering. Two very popular clustering methods are discussed and compared: k-means and hierarchical clustering. As both methods need the hyper-parameter k to be chosen beforehand, two very simple ways of identifying k are discussed at the end of the lecture. Practicals that I have designed for this course, in both R and Python, are available on my GitHub (https://github.com/skyfallen/MachineLearningPracticals). I can share the Keynote files; contact me via e-mail: dmytro.fishman@ut.ee.

  1. 1. Introduction to Machine Learning (Unsupervised learning) Dmytro Fishman (dmytro@ut.ee)
  2. 2. Unsupervised Learning Tumour size Age
  3. 3. [Figure: a grid of labelled data points, with one new point marked "?"] We were given annotated data
  4. 4. [Figure: the same labelled grid, with the new point still marked "?"] We were given annotated data, based on which we could predict the class of a novel data point
  5. 5. [Figure: the same labelled grid, with the new point still marked "?"] We were given annotated data, based on which we could predict the class of a novel data point
  6. 6. [Figure: the same grid, with the new point now labelled "3"] We were given annotated data, based on which we could predict the class of a novel data point
  7. 7. [Figure: the same grid, but with every label replaced by "?"] We were given annotated data, based on which we could predict the class of a novel data point. Now, imagine that they have taken away all these good things from us.
  8. 8. Clustering
  9. 9. Customer 1
  10. 10. Customer 1 Customer 2
  11. 11. Customer 1 Customer 2 Customer 3
  12. 12. Customer 1 Customer 2 Customer 3 Customer 4
  13. 13. Customer 1 Customer 2 Customer 3 Customer 4 Which segments of customers should the store target?
  14. 14. Grouping data points so that semantically similar points would be clustered together
  15. 15. Jaak Vilo et al., Data Mining: https://courses.cs.ut.ee/MTAT.03.183/2017_spring/uploads/Main/DM_05_Clustering.pdf Grouping data points so that semantically similar points would be clustered together
  16. 16. Hierarchical clustering
  17. 17. Tumour size Age
  18. 18. Tumour size Age 1.Let’s first assume that all instances are individual clusters
  19. 19. Tumour size Age 1.Let’s first assume that all instances are individual clusters 2.Find two most similar instances and merge them into one cluster
  20. 20. Remember NN? Tumour size Age 1.Let’s first assume that all instances are individual clusters 2.Find two most similar instances and merge them into one cluster
  21. 21. Tumour size Age Usually similarity is defined by the Euclidean or any other distance measure. 1.Let's first assume that all instances are individual clusters 2.Find two most similar instances and merge them into one cluster
  22. 22. Tumour size Age 1.Let’s first assume that all instances are individual clusters 2.Find two most similar instances and merge them into one cluster
  23. 23. Tumour size Age 1.Let's first assume that all instances are individual clusters 2.Find two most similar instances and merge them into one cluster 3.Repeat 2 until all clusters merge into one
  24. 24. Tumour size Age 1.Let's first assume that all instances are individual clusters 2.Find two most similar instances and merge them into one cluster 3.Repeat 2 until all clusters merge into one
  25. 25. Tumour size Age 1.Let's first assume that all instances are individual clusters 2.Find two most similar instances and merge them into one cluster 3.Repeat 2 until all clusters merge into one
  26. 26. Tumour size Age Distance between clusters can be estimated with three strategies: 1.Let's first assume that all instances are individual clusters 2.Find two most similar instances and merge them into one cluster 3.Repeat 2 until all clusters merge into one
  27. 27. Tumour size Age Distance between clusters can be estimated with three strategies: 1. Single linkage (min) 1.Let's first assume that all instances are individual clusters 2.Find two most similar instances and merge them into one cluster 3.Repeat 2 until all clusters merge into one
  28. 28. Tumour size Age Distance between clusters can be estimated with three strategies: 1. Single linkage (min) 2. Complete linkage (max) 1.Let's first assume that all instances are individual clusters 2.Find two most similar instances and merge them into one cluster 3.Repeat 2 until all clusters merge into one
  29. 29. Tumour size Age Distance between clusters can be estimated with three strategies: 1. Single linkage (min) 2. Complete linkage (max) 3. Average linkage (avg) 1.Let's first assume that all instances are individual clusters 2.Find two most similar instances and merge them into one cluster 3.Repeat 2 until all clusters merge into one
  30. 30. Tumour size Age 1.Let's first assume that all instances are individual clusters 2.Find two most similar instances and merge them into one cluster 3.Repeat 2 until all clusters merge into one
  31. 31. Tumour size Age 1.Let's first assume that all instances are individual clusters 2.Find two most similar instances and merge them into one cluster 3.Repeat 2 until all clusters merge into one
  32. 32. Tumour size Age K = 2 1.Let's first assume that all instances are individual clusters 2.Find two most similar instances and merge them into one cluster 3.Repeat 2 until all clusters merge into one
  33. 33. Tumour size Age K = 1 1.Let's first assume that all instances are individual clusters 2.Find two most similar instances and merge them into one cluster 3.Repeat 2 until all clusters merge into one
  34. 34. Tumour size Age Dendrogram 1.Let's first assume that all instances are individual clusters 2.Find two most similar instances and merge them into one cluster 3.Repeat 2 until all clusters merge into one 4.Hierarchical clustering is usually visualised using a dendrogram
  35. 35. Tumour size Age Dendrogram 1.Let's first assume that all instances are individual clusters 2.Find two most similar instances and merge them into one cluster 3.Repeat 2 until all clusters merge into one 4.Hierarchical clustering is usually visualised using a dendrogram K = 2
  36. 36. Tumour size Age Dendrogram 1.Let's first assume that all instances are individual clusters 2.Find two most similar instances and merge them into one cluster 3.Repeat 2 until all clusters merge into one 4.Hierarchical clustering is usually visualised using a dendrogram K = 3 (a minimal code sketch of this procedure is given after the transcript)
  37. 37. K = 2
  38. 38. K = 2
  39. 39. K-means clustering
  40. 40. K-means clustering *although K-means does have something in common with K-nearest neighbours, the two are not the same.
  41. 41. Tumour size Age 1.Choose K, the number of potential clusters
  42. 42. Tumour size Age 1.Choose K, the number of potential clusters Let K be 2
  43. 43. Tumour size Age 1.Choose K, the number of potential clusters Let K be 2 2.Initialise cluster centres randomly within the data
  44. 44. Tumour size Age 1.Choose K, the number of potential clusters Let K be 2 2.Initialise cluster centres randomly within the data 3.Instances are assigned to the nearest cluster centre
  45. 45. Tumour size Age 1.Choose K, the number of potential clusters Let K be 2 2.Initialise cluster centres randomly within the data 3.Instances are assigned to the nearest cluster centre
  46. 46. Tumour size Age 1.Choose K, the number of potential clusters Let K be 2 2.Initialise cluster centres randomly within the data 3.Instances are assigned to the nearest cluster centre 4.Centroids of each of the K clusters become the new cluster centres
  47. 47. Tumour size Age 1.Choose K, the number of potential clusters Let K be 2 2.Initialise cluster centres randomly within the data 3.Instances are assigned to the nearest cluster centre 4.Centroids of each of the K clusters become the new cluster centres
  48. 48. Tumour size Age 1.Choose K, the number of potential clusters Let K be 2 2.Initialise cluster centres randomly within the data 3.Instances are assigned to the nearest cluster centre 4.Centroids of each of the K clusters become the new cluster centres
  49. 49. Tumour size Age 1.Choose K, the number of potential clusters Let K be 2 2.Initialise cluster centres randomly within the data 3.Instances are assigned to the nearest cluster centre 4.Centroids of each of the K clusters become the new cluster centres 5.Steps 3 and 4 are repeated until convergence
  50. 50. Tumour size Age 1.Choose K, the number of potential clusters Let K be 2 2.Initialise cluster centres randomly within the data 3.Instances are assigned to the nearest cluster centre 4.Centroids of each of the K clusters become the new cluster centres 5.Steps 3 and 4 are repeated until convergence (a minimal code sketch of these steps is given after the transcript)
  51. 51. Hierarchical K-means Tumour size Age Tumour size Age
  52. 52. Hierarchical K-means Tumour size Age Tumour size Age
  53. 53. 😈👼 Tumour size Age Tumour size Age Hierarchical K-means Slow for modern-sized datasets Hard to predict K Good for visualisation purposes
  54. 54. Two body-parts methods to predict K
  55. 55. Two body-parts methods to predict K: The rule of thumb is to choose K ≈ √(n/2)
  56. 56. Two body-parts methods to predict K: The rule of thumb is to choose K ≈ √(n/2); Elbow method: increase K until it does not help to describe the data better (a short code sketch of both heuristics is given after the transcript)
  57. 57. References • Machine Learning by Andrew Ng (https://www.coursera.org/learn/machine-learning) • Introduction to Machine Learning by Pascal Vincent given at Deep Learning Summer School, Montreal 2015 (http://videolectures.net/deeplearning2015_vincent_machine_learning/) • Welcome to Machine Learning by Konstantin Tretyakov delivered at AACIMP Summer School 2015 (http://kt.era.ee/lectures/aacimp2015/1-intro.pdf) • Stanford CS class: Convolutional Neural Networks for Visual Recognition by Andrej Karpathy (http://cs231n.github.io/) • Data Mining Course by Jaak Vilo at University of Tartu (https://courses.cs.ut.ee/MTAT.03.183/2017_spring/uploads/Main/DM_05_Clustering.pdf) • Machine Learning Essential Concepts by Ilya Kuzovkin (https://www.slideshare.net/iljakuzovkin) • From the brain to deep learning and back by Raul Vicente Zafra and Ilya Kuzovkin (http://www.uttv.ee/naita?id=23585&keel=eng)
  58. 58. www.biit.cs.ut.ee www.ut.ee www.quretec.ee
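
The hierarchical clustering procedure from slides 17-36 (start with every instance as its own cluster, repeatedly merge the two closest clusters under a single, complete, or average linkage, and read clusters off a dendrogram) maps directly onto SciPy's agglomerative clustering tools. This is a minimal sketch, not the course's own practical code; the toy Age / Tumour size data and variable names are made up for illustration.

```python
# Minimal agglomerative (hierarchical) clustering sketch on toy data.
import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import linkage, fcluster, dendrogram

# Toy "Age vs Tumour size" points: two loose groups of patients.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=[30.0, 2.0], scale=1.0, size=(10, 2)),  # younger, smaller tumours
    rng.normal(loc=[60.0, 7.0], scale=1.0, size=(10, 2)),  # older, larger tumours
])

# Steps 1-3: every instance starts as its own cluster and the two closest
# clusters are merged repeatedly; the linkage strategy defines "closest".
for method in ("single", "complete", "average"):
    Z = linkage(X, method=method, metric="euclidean")
    labels = fcluster(Z, t=2, criterion="maxclust")  # cut the tree at K = 2
    print(method, labels)

# Step 4: the merge history is usually visualised as a dendrogram.
dendrogram(linkage(X, method="average"))
plt.xlabel("instance index")
plt.ylabel("merge distance")
plt.show()
```

Cutting the tree with `criterion="maxclust"` corresponds to choosing K on the dendrogram, as on slides 35-36.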
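The k-means loop from slides 41-50 is equally short to write out. Below is a bare-bones NumPy sketch of the five steps (choose K, initialise centres randomly within the data, assign each instance to its nearest centre, move each centre to the centroid of its cluster, repeat until convergence); empty clusters and multiple random restarts are not handled, and in practice `sklearn.cluster.KMeans` would be used instead. The data and names are again illustrative.

```python
# Bare-bones k-means sketch following the five steps from the slides.
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    # 2. Initialise cluster centres "randomly within the data":
    #    here, k distinct data points are picked as the starting centres.
    centres = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # 3. Assign every instance to the nearest cluster centre.
        dists = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # 4. Centroids of the K clusters become the new cluster centres.
        new_centres = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        # 5. Repeat steps 3 and 4 until the centres stop moving.
        if np.allclose(new_centres, centres):
            break
        centres = new_centres
    return labels, centres

# Toy usage on the same kind of Age / Tumour size data as above.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal([30.0, 2.0], 1.0, (10, 2)),
               rng.normal([60.0, 7.0], 1.0, (10, 2))])
labels, centres = kmeans(X, k=2)
print(labels)
print(centres)
```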
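Slides 54-56 give two quick heuristics for choosing K: the rule of thumb K ≈ √(n/2) and the elbow method. The sketch below illustrates both on synthetic data, using scikit-learn's `KMeans` and its `inertia_` attribute (the within-cluster sum of squares) as the "how well K describes the data" score for the elbow curve; the specific data are an assumption for illustration.

```python
# Two quick heuristics for choosing K: the sqrt(n/2) rule and the elbow method.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(2)
X = np.vstack([rng.normal([30.0, 2.0], 1.0, (30, 2)),
               rng.normal([60.0, 7.0], 1.0, (30, 2)),
               rng.normal([45.0, 5.0], 1.0, (30, 2))])

# Rule of thumb: K is roughly the square root of n/2.
n = len(X)
print("rule-of-thumb K:", round(np.sqrt(n / 2)))

# Elbow method: increase K and watch the within-cluster sum of squares
# (inertia); stop where an extra cluster no longer describes the data much better.
for k in range(1, 9):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    print(f"K = {k}: inertia = {km.inertia_:.1f}")
# Plotting K against inertia and picking the K at the bend gives the "elbow".
```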
