Unsupervised Learning
Orozco Hsu
2022-04-11
About me
• Education
• NCU (MIS), NCCU (CS)
• Work Experience
• Telecom big data Innovation
• AI projects
• Retail marketing technology
• User Group
• TW Spark User Group
• TW Hadoop User Group
• Taiwan Data Engineer Association Director
• Research
• Big Data/ ML/ AIOT/ AI Columnist
Tutorial Content
• Introduction to Orange 3
• What is unsupervised learning / K-Means
• Getting started with unsupervised learning in Orange3
• Homework
Orange Data Mining
• Orange 3 introduction
• Orange Data Mining (agnieszka.si)
Orange Data Mining
• How to use Orange 3
• https://www.youtube.com/@OrangeDataMining
Download Orange for Windows
• A Python 3 data mining library with a GUI.
• https://orangedatamining.com/screenshots/
• Widget catalogs
• Orange Data Mining - Widget catalog
Orange3 Canvas
Add-on
Note: After you upgrade Orange3 to a new version, all previously installed add-ons will be gone.
Output the Orange3 workflow file
Basic Orange3
• Add-on
• Associate (1.2.0)
• Explain (0.6.2)
• Educational (0.6.0)
• Image Analytics (0.10.0)
Basic Orange3
• Use the File or Datasets widget to load a dataset, then display it with the Data Table and Scatter Plot widgets
Supervised learning vs. Unsupervised learning
• Supervised learning: discover patterns in the data that relate data
attributes with a target (class) attribute.
• These patterns are then utilized to predict the values of the target attribute in
future data instances.
• Unsupervised learning: The data have no target attribute.
• We want to explore the data to find some intrinsic structures in them.
• Classic unsupervised learning algorithms
• Clustering algorithms
• Association rules
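Association rules, the second family listed above, are usually scored with two measures: support (how often an itemset appears) and confidence (how often the consequent appears when the antecedent does). The sketch below computes both in plain Python. The `baskets` data and the function names are made up for illustration; they are not part of Orange3 or of the slides.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate of P(consequent | antecedent) from the transactions."""
    both = support(transactions, set(antecedent) | set(consequent))
    return both / support(transactions, antecedent)


# Hypothetical toy baskets, for illustration only.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

print("support(bread, milk):", support(baskets, {"bread", "milk"}))
print("confidence(bread -> milk):", confidence(baskets, {"bread"}, {"milk"}))
```

Note that no target attribute is involved: the rules come only from which items occur together.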
Clustering (K-means algorithm)
Clustering (K-means algorithm)
• Steps
• Choose the number of clusters K.
• Initialize the K centroids at random, assign each data point to its nearest centroid, and compute the sum of distances from the points to their centroids.
• Move each centroid to the mean of its assigned points and recompute the sum of distances; repeat until the sum of distances converges (stops changing).
• https://www.youtube.com/watch?v=5I3Ei69I40s
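The steps above can be sketched in plain Python as follows. This is a minimal illustration, not the Orange3 implementation: it assumes squared Euclidean distance, picks the starting centroids from the data points themselves, and stops when no point changes cluster. The function names and the toy `points` are hypothetical.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(members):
    """Component-wise mean of a non-empty list of points."""
    n = len(members)
    return tuple(sum(coords) / n for coords in zip(*members))


def k_means(points, k, max_iter=100, seed=0):
    """Return (centroids, labels, total); total is the sum of squared distances.

    Assumes max_iter >= 1 and k <= len(points).
    """
    rng = random.Random(seed)
    # Step 2: pick k distinct data points as random starting centroids.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Assign every point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Convergence check: stop when no assignment changed.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 3: move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its old centroid
                centroids[c] = mean_point(members)
    total = sum(squared_distance(p, centroids[lab]) for p, lab in zip(points, labels))
    return centroids, labels, total


# Hypothetical toy data: two well-separated groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, labels, total = k_means(points, k=2)
print("centroids:", centroids)
print("labels:", labels)
print("sum of squared distances:", total)
```

Because the starting centroids are random, different seeds can give different results on harder data; a common workaround is to run the algorithm several times and keep the run with the smallest sum of distances.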
Clustering of data
Scatter plot of data
Orange3 Interactive K-Means
Orange3 K-Means
Hierarchical clustering
Image Clustering
Homework
• Use your own dataset to find the best number of clusters K, and explain the summary statistics of each cluster.
• Use Groceries data.csv to mine association rules and demo the result.
• https://orangedatamining.com/blog/2016/04/25/association-rules-in-orange/
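For the first homework item, one commonly used way to compare candidate values of K is the silhouette score: values near 1 mean points sit well inside their own cluster, and values near 0 or below mean clusters overlap. The sketch below is a plain Python version that assumes Euclidean distance. The toy `points` and the two hand-written labelings are hypothetical; in practice the labels would come from a clustering run for each K.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette over all points; needs at least two clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            # By convention a single-point cluster scores 0.
            scores.append(0.0)
            continue
        # a: mean distance to the other points in the same cluster
        # (the distance from p to itself is 0, so divide by len(own) - 1).
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the points of any other cluster.
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data: two well-separated groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
labels_k2 = [0, 0, 0, 1, 1, 1]  # the two natural groups
labels_k3 = [0, 0, 0, 1, 1, 2]  # second group split artificially

print("K=2:", silhouette_score(points, labels_k2))
print("K=3:", silhouette_score(points, labels_k3))
```

On this toy data the K=2 labeling scores higher than the K=3 labeling, which matches the two visible groups.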
