- 1. Introduction to Machine Learning (Unsupervised learning) Dmytro Fishman (dmytro@ut.ee)
- 2. Unsupervised Learning [Figure: scatter plot with axes "Tumour size" and "Age"]
- 3. [Figure: data points labelled with digits, plus one novel point marked "?"] We were given annotated data.
- 4. Based on this, we could predict the class of a novel data point.
- 6. [Figure: the novel point is now labelled 3]
- 7. [Figure: every label replaced by "?"] Now imagine that all these good things have been taken away from us.
- 8. Clustering
- 12. Customer 1, Customer 2, Customer 3, Customer 4
- 13. Which segments of customers should the store target?
- 14. Grouping data points so that semantically similar points are clustered together
- 15. Source: Jaak Vilo et al., Data Mining: https://courses.cs.ut.ee/MTAT.03.183/2017_spring/uploads/Main/DM_05_Clustering.pdf
- 16. Hierarchical clustering
- 17. [Figure: scatter plot with axes "Tumour size" and "Age"]
- 18. 1. First, assume that every instance is its own cluster.
- 19. 2. Find the two most similar instances (later, clusters) and merge them into one cluster.
- 20. Remember NN?
- 21. Similarity is usually defined by Euclidean distance or any other distance measure.
- 23. 3. Repeat step 2 until all clusters merge into one.
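The three steps above can be sketched in plain Python. This is a minimal, unoptimised illustration and not code from the lecture: the function name `agglomerate`, the `target_k` parameter and the sample points are assumptions, and the cluster distance used here is the closest-pair (single linkage) distance.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def agglomerate(points, target_k=1):
    """Merge the two closest clusters until target_k clusters remain."""
    # Step 1: every instance starts as its own cluster (a list of point indices).
    clusters = [[i] for i in range(len(points))]
    merges = []  # history of (cluster_a, cluster_b, distance)

    def cluster_distance(a, b):
        # Assumption: closest-pair (single linkage) distance between clusters.
        return min(dist(points[i], points[j]) for i in a for j in b)

    # Step 3: repeat the merge step until the requested number of clusters is left.
    while len(clusters) > target_k:
        # Step 2: find the two most similar (closest) clusters.
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: cluster_distance(*pair))
        merges.append((a, b, cluster_distance(a, b)))
        # Replace the two merged clusters with their union.
        clusters = [c for c in clusters if c is not a and c is not b]
        clusters.append(sorted(a + b))
    return clusters, merges


# Hypothetical (tumour size, age)-style toy data, made up for illustration.
points = [(1, 1), (1.5, 1), (5, 5), (5, 5.5), (6, 5)]
clusters, merges = agglomerate(points, target_k=2)
print(clusters)
```

Stopping at `target_k=1` reproduces the full procedure; the recorded `merges` list is the information a dendrogram would display.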
- 26. Distance between clusters can be estimated with three strategies:
- 27. 1. Single linkage (min): the distance between the closest pair of points, one from each cluster.
- 28. 2. Complete linkage (max): the distance between the farthest pair of points, one from each cluster.
- 29. 3. Average linkage (avg): the average distance over all such pairs.
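As a small sketch of how the three strategies differ, the functions below compute each linkage for two clusters given as lists of coordinate tuples. The function names and the example clusters are illustrative assumptions, and Euclidean distance is assumed as the underlying point distance.

```python
from math import dist  # Euclidean distance, Python 3.8+


def pairwise_distances(cluster_a, cluster_b):
    """All distances between a point of cluster_a and a point of cluster_b."""
    return [dist(p, q) for p in cluster_a for q in cluster_b]


def single_linkage(cluster_a, cluster_b):
    # min: distance between the closest pair of points
    return min(pairwise_distances(cluster_a, cluster_b))


def complete_linkage(cluster_a, cluster_b):
    # max: distance between the farthest pair of points
    return max(pairwise_distances(cluster_a, cluster_b))


def average_linkage(cluster_a, cluster_b):
    # avg: mean distance over all cross-cluster pairs
    distances = pairwise_distances(cluster_a, cluster_b)
    return sum(distances) / len(distances)


# Two made-up clusters on a line, so the distances are easy to check by hand.
a = [(0, 0), (1, 0)]
b = [(3, 0), (5, 0)]
print(single_linkage(a, b), complete_linkage(a, b), average_linkage(a, b))
# prints: 2.0 5.0 3.5
```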
- 32. K = 2: two clusters remain.
- 33. K = 1: all instances have merged into one cluster.
- 34. 4. Hierarchical clustering is usually visualised using a dendrogram.
- 35. [Figure: dendrogram with K = 2]
- 36. [Figure: dendrogram with K = 3]
- 37. K = 2
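One way to read K = 2 or K = 3 off a dendrogram is to replay its merges from the lowest height upwards and stop once K clusters remain. The sketch below assumes a hypothetical merge list in which each entry `(a, b, height)` says that the clusters containing points `a` and `b` were joined at that height; the function name `cut_dendrogram` and the data are made up for illustration.

```python
def cut_dendrogram(n_points, merges, k):
    """Replay merges in order of height until only k clusters remain."""
    parent = list(range(n_points))  # union-find forest, one tree per cluster

    def find(i):
        # Follow parent links to the cluster representative (with path halving).
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Each merge reduces the number of clusters by one, so n_points - k merges
    # leave exactly k clusters.
    for a, b, _height in sorted(merges, key=lambda m: m[2])[: n_points - k]:
        parent[find(a)] = find(b)

    groups = {}
    for i in range(n_points):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical dendrogram over five points (indices 0..4).
merges = [(0, 1, 0.5), (2, 3, 0.5), (3, 4, 1.0), (1, 2, 5.3)]
print(cut_dendrogram(5, merges, k=2))  # [[0, 1], [2, 3, 4]]
print(cut_dendrogram(5, merges, k=3))  # [[0, 1], [2, 3], [4]]
```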
- 39. K-means clustering*
- 40. *Although K-means has something in common with K-nearest neighbours, they are not the same.
- 41. 1. Choose K, the number of potential clusters.
- 42. Let K be 2.
- 43. 2. Initialise the cluster centres randomly within the data.
- 44. 3. Assign each instance to the nearest cluster centre.
- 46. 4. The centroid of each of the K clusters becomes the new cluster centre.
- 49. 5. Repeat steps 3 and 4 until convergence.
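The five steps can be sketched as follows. This is a minimal illustration under stated assumptions rather than the lecture's own code: the name `k_means`, the `max_iter` and `seed` parameters, the choice of initialising centres by sampling K data points, and the toy data are all assumptions. Convergence is taken to mean that the centres stop moving.

```python
import random
from math import dist  # Euclidean distance, Python 3.8+


def k_means(points, k, max_iter=100, seed=0):
    """Cluster a list of coordinate tuples into k groups."""
    rng = random.Random(seed)
    # Step 2: initialise centres "within the data" by sampling k distinct points.
    centres = rng.sample(points, k)
    labels = []
    for _ in range(max_iter):
        # Step 3: assign each instance to its nearest cluster centre.
        labels = [min(range(k), key=lambda c: dist(p, centres[c]))
                  for p in points]
        # Step 4: the centroid of each cluster becomes its new centre.
        new_centres = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroid = tuple(sum(xs) / len(members) for xs in zip(*members))
                new_centres.append(centroid)
            else:
                new_centres.append(centres[c])  # keep the centre of an empty cluster
        # Step 5: stop once the centres no longer move.
        if new_centres == centres:
            break
        centres = new_centres
    return centres, labels


# Step 1: choose K. Toy data with two well-separated groups, so let K be 2.
points = [(1, 1), (1.5, 1), (1, 1.5), (8, 8), (8.5, 8), (8, 8.5)]
centres, labels = k_means(points, k=2)
print(labels)
```

Which group receives label 0 depends on the random initialisation, so only the grouping itself is meaningful, not the label values.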
- 51. Hierarchical K-means [Figure: two scatter plots, each with axes "Tumour size" and "Age"]
- 53. Hierarchical K-means 😈👼 Slow for modern-size datasets. Hard to predict K. Good for visualisation purposes.
- 54. Two "body parts" methods to predict K
- 55. Rule of thumb: choose K ≈ √(n/2), where n is the number of data points.
- 56. Elbow method: increase K until it no longer helps to describe the data better.
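Both heuristics are easy to sketch. The code below is an illustration under assumptions, not a definitive recipe: `rule_of_thumb_k` rounds √(n/2) to the nearest integer, and `elbow_k` takes a list of clustering costs (for example, the total within-cluster squared distance for K = 1, 2, 3, ...) and stops at the first K whose successor improves the cost by less than a chosen fraction. The function names, the `min_gain` threshold of 10% and the cost values are all hypothetical.

```python
from math import sqrt


def rule_of_thumb_k(n):
    """Rule of thumb: K is roughly sqrt(n / 2) for n data points."""
    return max(1, round(sqrt(n / 2)))


def elbow_k(costs, min_gain=0.1):
    """Pick K by the elbow method; costs[i] is the clustering cost for K = i + 1.

    Returns the last K before the relative improvement drops below min_gain.
    """
    for i in range(1, len(costs)):
        previous, current = costs[i - 1], costs[i]
        if previous == 0:
            return i  # cost is already zero, more clusters cannot help
        gain = (previous - current) / previous
        if gain < min_gain:
            return i  # going from K = i to K = i + 1 barely helped
    return len(costs)


print(rule_of_thumb_k(200))  # 10

# Made-up costs for K = 1..5 with a visible elbow at K = 3.
costs = [100.0, 40.0, 15.0, 14.0, 13.5]
print(elbow_k(costs))  # 3
```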
- 57. References
  • Machine Learning by Andrew Ng (https://www.coursera.org/learn/machine-learning)
  • Introduction to Machine Learning by Pascal Vincent, given at Deep Learning Summer School, Montreal 2015 (http://videolectures.net/deeplearning2015_vincent_machine_learning/)
  • Welcome to Machine Learning by Konstantin Tretyakov, delivered at AACIMP Summer School 2015 (http://kt.era.ee/lectures/aacimp2015/1-intro.pdf)
  • Stanford CS class: Convolutional Neural Networks for Visual Recognition by Andrej Karpathy (http://cs231n.github.io/)
  • Data Mining Course by Jaak Vilo at University of Tartu (https://courses.cs.ut.ee/MTAT.03.183/2017_spring/uploads/Main/DM_05_Clustering.pdf)
  • Machine Learning Essential Concepts by Ilya Kuzovkin (https://www.slideshare.net/iljakuzovkin)
  • From the brain to deep learning and back by Raul Vicente Zafra and Ilya Kuzovkin (http://www.uttv.ee/naita?id=23585&keel=eng)
- 58. www.biit.cs.ut.ee www.ut.ee www.quretec.ee
