15. Clustering
Clustering is the task of dividing a
population of data points into a number of
groups such that points in the same group
are more similar to one another than to
points in other groups.
In simple words, the aim is to segregate
data points with similar traits and assign
them to clusters.
16. Types of Clustering
Hard Clustering: In hard clustering, each data point either belongs to a
cluster completely or does not belong to it at all. For example, in the above
example each customer is put into exactly one of the 10 groups.
Soft Clustering: In soft clustering, instead of putting each data point into a
single cluster, each data point is assigned a probability or likelihood of
belonging to each cluster. For example, in the above scenario each customer is
assigned a probability of being in each of the 10 clusters of the retail store.
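The difference between the two assignment styles can be sketched in a few lines of Python. This is only an illustration: the centroids and the customer point are made-up values, and the soft weights use a softmax over negative distances as a stand-in assumption (a real soft-clustering model, such as a Gaussian mixture, would derive its probabilities differently).

```python
import math

def hard_assign(point, centroids):
    """Hard clustering: return the index of the single closest centroid."""
    distances = [math.dist(point, c) for c in centroids]
    return distances.index(min(distances))

def soft_assign(point, centroids):
    """Soft clustering: return one membership weight per centroid (sums to 1).

    Assumption: weights are a softmax over negative Euclidean distances,
    chosen only to illustrate the idea of graded membership.
    """
    distances = [math.dist(point, c) for c in centroids]
    scores = [math.exp(-d) for d in distances]
    total = sum(scores)
    return [s / total for s in scores]

centroids = [(0.0, 0.0), (10.0, 10.0)]  # hypothetical cluster centres
customer = (1.0, 2.0)                   # hypothetical customer features

print(hard_assign(customer, centroids))  # index of the one chosen cluster
print(soft_assign(customer, centroids))  # one weight per cluster
```

The hard version returns exactly one cluster index, while the soft version returns a weight for every cluster, with the closer cluster receiving the larger weight.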
17. Types of Clustering Algorithms
Connectivity models: Based on the notion that data points closer together in
the data space are more similar to each other than data points lying farther
apart.
Centroid models: Iterative clustering algorithms in which similarity is derived
from the closeness of a data point to the centroid of each cluster.
Distribution models: Based on how probable it is that the data points in a
cluster belong to the same probability distribution (for example, a Gaussian).
Density models: Based on regions of varying density of data points in the data
space; the points in each dense region are grouped into a cluster.
18. KNN (K-Nearest Neighbors)
KNN can be used for both classification and
regression problems.
However, it is more widely used for classification
problems in industry. K nearest neighbors is
a simple algorithm that stores all available cases
and classifies a new case by a majority vote of its
k neighbors.
The new case is assigned to the class most
common among its K nearest neighbors, as
measured by a distance function.
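The majority-vote rule described above can be sketched as follows. This is a minimal illustration, assuming Euclidean distance as the distance function; the function name `knn_classify` and the sample cases are hypothetical.

```python
import math
from collections import Counter

def knn_classify(query, cases, k=3):
    """Classify `query` by majority vote among its k nearest stored cases.

    `cases` is a list of (features, label) pairs. Euclidean distance is
    assumed here; other distance functions could be swapped in.
    """
    # Sort the stored cases by distance to the query and keep the k closest.
    nearest = sorted(cases, key=lambda case: math.dist(query, case[0]))[:k]
    # Count the labels of those neighbours and return the most common one.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical stored cases with two classes
cases = [
    ((1.0, 1.0), "A"), ((1.5, 2.0), "A"), ((2.0, 1.0), "A"),
    ((8.0, 8.0), "B"), ((9.0, 7.5), "B"), ((8.5, 9.0), "B"),
]

print(knn_classify((2.0, 2.0), cases, k=3))
```

Note that every prediction computes the distance to every stored case, which is one reason the method becomes expensive on large data sets.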
19. KNN (K-Nearest Neighbors)
Things to consider before selecting KNN:
● KNN is computationally expensive
● Variables should be normalized, else
higher-range variables can bias it
● Spend more effort on the pre-processing
stage, such as outlier and noise removal,
before applying KNN
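The normalization point can be illustrated with a small sketch. It assumes min-max scaling to the range [0, 1] (one common choice among several) and uses made-up customer records; `min_max_scale` is a hypothetical helper, not part of any library.

```python
import math

def min_max_scale(rows):
    """Rescale every column of `rows` to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        tuple((v - lo) / (hi - lo) if hi > lo else 0.0
              for v, lo, hi in zip(row, lows, highs))
        for row in rows
    ]

# Hypothetical customers: (age in years, annual income in dollars)
rows = [(25, 30_000), (60, 32_000), (27, 40_000), (45, 90_000)]
scaled = min_max_scale(rows)

# Distances from customer 0 to customers 1 and 2, before and after scaling.
raw_to_1 = math.dist(rows[0], rows[1])
raw_to_2 = math.dist(rows[0], rows[2])
scaled_to_1 = math.dist(scaled[0], scaled[1])
scaled_to_2 = math.dist(scaled[0], scaled[2])

print(raw_to_1 < raw_to_2)        # unscaled: income dominates the distance
print(scaled_to_2 < scaled_to_1)  # scaled: the 35-year age gap now counts
```

On the raw values, customer 1 looks closest to customer 0 because income has a much larger range than age. After scaling, customer 2, who is similar in both age and income, becomes the nearer neighbor.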
22. K-Means
It is a type of unsupervised algorithm which
solves the clustering problem. Its procedure
follows a simple and easy way to partition a given
data set into a certain number of clusters
(assume k clusters).
Data points inside a cluster are homogeneous,
and heterogeneous with respect to other clusters.
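A minimal sketch of the usual k-means procedure (alternating an assignment step and a centroid-update step) is shown below. It makes several assumptions for illustration: Euclidean distance, a fixed number of iterations instead of a convergence check, and initial centroids supplied by the caller rather than chosen at random. The function name and sample points are hypothetical.

```python
import math

def k_means(points, centroids, iterations=10):
    """Alternate assignment and update steps for a fixed number of rounds."""
    centroids = list(centroids)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        # An empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(axis) / len(members) for axis in zip(*members))
            if members else old
            for members, old in zip(clusters, centroids)
        ]
    return centroids, clusters

# Hypothetical data with two visibly separate groups, k = 2
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0),
          (8.0, 8.0), (9.0, 7.5), (8.5, 9.0)]
centroids, clusters = k_means(points, centroids=[(0.0, 0.0), (10.0, 10.0)])

print(centroids)
print(clusters)
```

With these starting centroids the three points near (1.5, 1.3) end up in one cluster and the three points near (8.5, 8.2) in the other. Different starting centroids can lead to different results on less clearly separated data.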
26. Juxi Leitner
Jürgen “Juxi” Leitner is a researcher at the
intersection of robotics, robotic vision and
artificial intelligence (AI) at the ARC Centre of
Excellence in Robotic Vision in Brisbane.
He is working on creating autonomous robots
that ‘can SEE and DO stuff’ in real-world
environments and has authored more than 50
publications.
27. Marita Cheng
Marita Cheng is the founder of Robogals, a non-
profit organisation which has delivered robotics
workshops to 60,000 girls in 11 countries.
She was named the 2012 Young Australian of
the Year and is the founder and current CEO of
2Mar Robotics, a start-up robotics company.
28. Peter Corke
Peter Corke is a professor of robotics at QUT
and director of the Australian Centre for Robotic
Vision.
He wrote the textbook Robotics, Vision &
Control, authored the MATLAB toolboxes for
Robotics and Machine Vision, and created the
online educational resource, QUT Robot
Academy.