Machine Learning (Part 3)

Sunil OS
Corporate Training and Placements
www.SunilOS.com 1
Unsupervised Machine Learning
www.sunilos.com
www.raystec.com
Unsupervised Learning
❑ Unsupervised machine learning is a type of machine
learning where we have only data points and no labels.
❑ We group the data points based on the similarity among
them.
o For example, think of arranging a bookshelf that holds different kinds of
books: we group the books by subject. In the same way, unsupervised
learning iterates through the data and groups items that share similar
characteristics.
❑ Clustering is the most common form of unsupervised learning.
Types of clustering:
❑Partition Based
o Partition-based clustering methods include K-Means, K-Medoids,
CLARANS, etc.
❑Hierarchical Based
o Hierarchical clustering methods include BIRCH and
Chameleon.
❑Density Based
o DBSCAN and OPTICS are the most popular density-based
clustering methods.
K-means clustering algorithm
❑ K-means is a partition-based clustering method.
❑ It is one of the simplest unsupervised learning
algorithms for solving the well-known clustering
problem.
❑ The procedure follows a simple way to partition a given
data set into a fixed number of clusters (assume k clusters)
chosen in advance.
How K-Means Clustering Works:
❑The K-means algorithm is iterative: it repeatedly
recalculates the cluster centroids, refining them until
they no longer change much.
❑The algorithm takes a dataset of ‘n’ points as
input, together with an integer parameter ‘k’ (supplied by
the programmer) specifying how many clusters to create.
❑The output is a set of ‘k’ cluster centroids and a labeling
of the dataset that maps each data point to a
unique cluster.
Steps of K-means clustering:
1. Choose the number of clusters k.
2. Select k random points from the data as the initial centroids.
3. Assign every point to the closest cluster centroid.
4. Recompute the centroids of the newly formed clusters.
5. Repeat steps 3 and 4 until a stopping condition is met.
When to stop iterating?
❑The centroids of the newly formed clusters do not
change.
❑The points remain in the same clusters.
❑The maximum number of iterations is reached.
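The steps and stopping conditions of K-means can be sketched in plain Python as follows. This is a minimal illustration, not a production implementation: the function names `k_means` and `squared_distance`, the squared Euclidean distance, the fixed random seed and the six sample points are all assumptions made for the example.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def k_means(points, k, max_iter=100, seed=0):
    """Return (centroids, labels) for a list of points given as tuples."""
    rng = random.Random(seed)
    # Pick k distinct data points as the initial centroids.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):  # stopping condition: iteration cap
        # Assign every point to its closest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Stopping condition: no point changed cluster, so the
        # centroids would not change either.
        if new_labels == labels:
            break
        labels = new_labels
        # Recompute each centroid as the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, labels


# Six two-dimensional sample points (illustrative data).
data = [(1, 2), (1.5, 1.8), (5, 8), (8, 8), (1, 0.6), (9, 11)]
centroids, labels = k_means(data, k=2)
print(centroids, labels)
```

Because the initial centroids are chosen at random, the numbering of the clusters (which group is 0 and which is 1) depends on the seed.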
Working of K-means
❑Sample Dataset:
Objects X Y Z
OB-1 1 4 1
OB-2 1 2 2
OB-3 1 4 2
OB-4 2 1 2
OB-5 1 1 1
OB-6 2 4 2
OB-7 1 1 2
OB-8 2 1 1
❑ We have 8 data points in total and will divide them into 2
clusters, so K=2.
❑ Because K=2, we start with two initial centroids, which can be
any two of the data points.
❑ After choosing the centroids (say C1 and C2), each data point
is assigned to the cluster of its nearest centroid.
❑ Assume that the algorithm chose OB-2 (1,2,2) and OB-6 (2,4,2)
as the centroids C1 and C2 of cluster 1 and cluster 2.
❑ To measure distances, we use the following distance
function (the Manhattan distance, also used as a similarity
measure):
❑ d=|x2–x1|+|y2–y1|+|z2–z1|
Calculation of distances
Objects  X  Y  Z  Distance from C1(1,2,2)  Distance from C2(2,4,2)
OB-1     1  4  1  3                        2
OB-2     1  2  2  0                        3
OB-3     1  4  2  2                        1
OB-4     2  1  2  2                        3
OB-5     1  1  1  2                        5
OB-6     2  4  2  3                        0
OB-7     1  1  2  1                        4
OB-8     2  1  1  3                        4
Cluster formation
❑ After the initial pass of clustering, the objects are
grouped as follows:
Cluster 1: OB-2, OB-4, OB-5, OB-7, OB-8
Cluster 2: OB-1, OB-3, OB-6
❑ The new centroid of each cluster is the mean of its members:
C1 = (1.4, 1.2, 1.6) and C2 = (1.33, 4, 1.67).
Distance from new Centroids
Objects  X  Y  Z  Distance from C1(1.4,1.2,1.6)  Distance from C2(1.33,4,1.67)
OB-1     1  4  1  3.8                            1
OB-2     1  2  2  1.6                            2.67
OB-3     1  4  2  3.6                            0.67
OB-4     2  1  2  1.2                            4
OB-5     1  1  1  1.2                            4
OB-6     2  4  2  3.8                            1
OB-7     1  1  2  1                              3.67
OB-8     2  1  1  1.4                            4.33
Updated Clusters
❑The new assignments of the objects with respect
to the updated centroids are:
Cluster 1: OB-2, OB-4, OB-5, OB-7, OB-8
Cluster 2: OB-1, OB-3, OB-6
❑The algorithm ends here because the groups did not
change.
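The worked example can be checked with a short script. This is a sketch that uses the eight sample objects and the Manhattan distance from the slides; the helper names `manhattan`, `assign` and `mean_point` are chosen here for illustration only.

```python
def manhattan(p, q):
    """d = |x2 - x1| + |y2 - y1| + |z2 - z1|"""
    return sum(abs(a - b) for a, b in zip(p, q))


objects = {
    "OB-1": (1, 4, 1), "OB-2": (1, 2, 2), "OB-3": (1, 4, 2), "OB-4": (2, 1, 2),
    "OB-5": (1, 1, 1), "OB-6": (2, 4, 2), "OB-7": (1, 1, 2), "OB-8": (2, 1, 1),
}


def assign(centroids):
    """Put every object into the cluster of its nearest centroid."""
    clusters = [[] for _ in centroids]
    for name, point in objects.items():
        distances = [manhattan(point, c) for c in centroids]
        clusters[distances.index(min(distances))].append(name)
    return clusters


def mean_point(names):
    """Centroid of a cluster: the coordinate-wise mean of its members."""
    pts = [objects[n] for n in names]
    return tuple(sum(col) / len(pts) for col in zip(*pts))


# First pass: OB-2 and OB-6 are the initial centroids C1 and C2.
first = assign([objects["OB-2"], objects["OB-6"]])
new_centroids = [mean_point(cluster) for cluster in first]
# Second pass with the recomputed centroids.
second = assign(new_centroids)

print(first)
print([tuple(round(v, 2) for v in c) for c in new_centroids])
print(second == first)  # True means the clusters no longer change
```

The second pass reproduces the first assignment, which is the stopping condition described above.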
Code Implementation of K-means
import matplotlib.pyplot as plt
from matplotlib import style
import numpy as np

style.use('ggplot')

# Six two-dimensional sample points
X = np.array([[1, 2],
              [1.5, 1.8],
              [5, 8],
              [8, 8],
              [1, 0.6],
              [9, 11]])

# Plot the raw points before clustering
plt.scatter(X[:, 0], X[:, 1], s=150)
plt.show()
Code Implementation of K-means (cont.)
from sklearn.cluster import KMeans

# Cluster the records into 2 groups
kmeans = KMeans(n_clusters=2)

# Train the model
kmeans.fit(X)

# Predict the cluster of a new point
labels = kmeans.predict([[20, 8]])
print(labels)

# Coordinates of the learned cluster centers
centroids = kmeans.cluster_centers_
print(centroids)
Reinforcement Learning
Reinforcement Learning
❑We first learn by interacting with the environment.
❑Whether we are learning to drive a car or learning to walk,
the learning is based on interaction with the
environment.
❑Learning from interaction is the foundational
concept underlying nearly all theories of learning and intelligence.
❑Reinforcement Learning is goal-oriented learning based on
interaction with the environment. It is often described as
a promising path toward true artificial intelligence.
Problem Statement
❑How does a child learn to walk?
Formalizing the Problem
❑ The child is an agent trying to manipulate the environment
(the surface on which the child walks) by taking actions
(walking), moving from one state (each step
taken) to another.
❑ The child gets a reward (let’s say chocolate) on
accomplishing a sub-task (taking a couple of
steps) and receives no chocolate (a negative reward)
when not able to walk.
❑ This is a simplified description of a reinforcement learning
problem.
Basis of Reinforcement Learning
Difference Between the Kinds of Machine Learning:
                  | Supervised | Unsupervised | Reinforcement
Definition        | Learns from labeled data | Learns from unlabeled data | Learns by interacting with the environment through actions and discovering errors and rewards
Types of Problems | Regression and classification | Association and clustering | Reward based
Data              | Labeled | Unlabeled | No predefined data
Training          | External supervision | No supervision | No supervision
Approach          | Map labeled input to known output | Search patterns and discover output | Follow the trial and error method
Algorithms        | SVM, KNN, Linear Regression, etc. | K-means, C-means | Q-Learning, SARSA
Terminology of Reinforcement Learning:
❑ Agent: An entity (computer program) that learns from the environment
based on feedback.
❑ Action: A step taken by the agent in response to the current
situation.
❑ Environment: The surroundings in which the agent is present and acts.
The environment is often random (stochastic) in nature.
❑ State: The situation returned by the environment after each action of the agent.
❑ Reward: Feedback, positive or negative, that the environment gives for
the action of the agent.
❑ Policy: The strategy the agent applies to choose the next action based
on the current state.
❑ Value: The expected long-term return, as opposed to the short-term reward.
❑ Q-value: The same as value, but it takes the current
action as an additional parameter.
Key Points of Reinforcement Learning:
❑It is based on the trial and error method.
❑The agent is not told about the
environment or which step to take next.
❑The agent takes the next action based on the feedback from
previous actions.
❑Rewards and penalties may be delayed.
❑The environment the agent interacts with is often
random, and the agent has to reach the goal
while collecting the maximum reward.
How to implement RL in Machine Learning:
❑There are three approaches to implementing RL:
❑Model based learning
o In this approach a virtual model of the environment is created and the agent
explores this model. Each environment needs its own model.
❑Policy based Learning
o This approach finds the optimal strategy for maximizing
future rewards without relying on any value function. There can be two
types of policy: deterministic and stochastic.
❑Value based learning
o In this approach the agent tries to maximize a value function, i.e. the
long-term return expected from each state under a policy.
When Not to Use RL?
❑When enough data is available to solve the problem with a
supervised learning method.
❑When time and computing resources are limited, because RL is a
time consuming process.
Why use Reinforcement Learning?
❑When the system has to learn from rewards.
❑When the agent has to learn from its own actions.
❑It helps you discover which action yields the
highest reward over the long run.
❑When we want to find the best method for obtaining
large rewards.
Learning Models of Reinforcement
❑Markov Decision Process
❑Q learning
❑SARSA (State Action Reward State Action)
❑Deep Q-Network (DQN)
Q-Learning
❑In Q-learning, Q stands for quality.
❑It is a value-based learning method.
❑In this approach a value is given to the agent to indicate
which action is best to take.
❑For performing an action a in state s, the agent gets a reward R(s,
a) and ends up in a new state, so the Q-value
equation will be:
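A commonly used textbook form of this equation, given here as an assumption about what the slide shows, is Q(s, a) = R(s, a) + γ · max over a' of Q(s', a'), where s' is the state reached and γ is a discount factor between 0 and 1. Tabular Q-learning moves the stored value toward that target with a learning rate α. The sketch below applies the update to a hypothetical five-state chain in which the agent starts at state 0 and is rewarded for reaching state 4; the environment, the constants and all names are invented for illustration.

```python
import random

N_STATES = 5          # states 0..4; state 4 is the goal (hypothetical toy chain)
ACTIONS = (0, 1)      # 0 = move left, 1 = move right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2


def step(state, action):
    """Toy environment: returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


rng = random.Random(0)
q = [[0.0, 0.0] for _ in range(N_STATES)]   # Q-table: q[state][action]

for _ in range(500):                        # episodes
    state, done = 0, False
    while not done:
        # Epsilon-greedy: explore sometimes (or on ties), otherwise exploit.
        if rng.random() < EPSILON or q[state][0] == q[state][1]:
            action = rng.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: q[state][a])
        next_state, reward, done = step(state, action)
        # Target = immediate reward + discounted best future value
        # (a terminal state has no future value).
        target = reward + (0.0 if done else GAMMA * max(q[next_state]))
        q[state][action] += ALPHA * (target - q[state][action])
        state = next_state

# Greedy policy read from the Q-table for the non-terminal states.
policy = [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
print(policy)
```

In this toy chain the learned policy is to move right in every state, since that is the shortest way to the reward.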
Q-Learning process
Application of RL
Gym environment for Reinforcement Learning:
❑Gym is a Python library for developing
reinforcement learning algorithms.
❑We can install gym using the following command:
❑pip install gym
The env object provides the following main functions:
❑ The step() function takes an action as an argument and
returns four objects:
o observation: An environment-specific object
representing the observation of the environment.
o reward: A signed float value indicating the gain (or loss) from
the previous action.
o done: A Boolean value indicating whether the episode is finished.
o info: A dictionary of diagnostic information, useful for debugging.
❑ The render() function creates a visual representation of the
environment.
❑ The reset() function resets the environment to its initial state.
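To make the reset(), step() and render() contract concrete without installing anything, here is a sketch of a hypothetical toy environment that only mimics those method names and the four-value return of step(). The class `CoinGuessEnv` and its rules are invented for illustration and are not part of Gym.

```python
import random


class CoinGuessEnv:
    """Hypothetical toy environment with Gym-style method names."""

    def __init__(self, max_steps=5, seed=0):
        self.max_steps = max_steps
        self.rng = random.Random(seed)
        self.steps = 0
        self.coin = 0

    def reset(self):
        """Start a new episode and return the first observation."""
        self.steps = 0
        self.coin = self.rng.randint(0, 1)
        return self.coin

    def step(self, action):
        """Apply an action and return (observation, reward, done, info)."""
        reward = 1.0 if action == self.coin else -1.0
        self.steps += 1
        self.coin = self.rng.randint(0, 1)      # next observation
        done = self.steps >= self.max_steps
        return self.coin, reward, done, {"steps": self.steps}

    def render(self):
        """Print a text view of the current state."""
        print(f"step={self.steps} coin={self.coin}")


env = CoinGuessEnv()
observation = env.reset()
total, done = 0.0, False
while not done:
    action = observation            # the coin is visible, so match it
    observation, reward, done, info = env.step(action)
    total += reward
print(total, info)
```

The loop has the same shape as one written against a real environment: call reset() once, then call step() until done is true.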
Implementation
❑ A popular introductory environment is CartPole.
❑ In this game a pole is attached to a cart and we have to keep it
balanced.
❑ The episode ends if the pole tilts more than about 12 degrees from
vertical or the cart moves more than 2.4 units from the center.
❑ It is one of the simplest environments for learning the basics.
❑ The game has only four observations and two actions.
o The actions are to push the cart with a force of +1 or
-1 (right or left).
o The observations are the position of the cart, the velocity of
the cart, the angle of the pole, and the rotation rate of the pole.
Getting environment Of Cartpole
import gym

# Classic Gym API (gym < 0.26). Newer releases return (observation, info)
# from reset() and five values from step().
env = gym.make('CartPole-v0')
for i_episode in range(20):
    observation = env.reset()
    for t in range(1000):
        env.render()
        print(observation)
        # Pick a random action from the action space
        action = env.action_space.sample()
        observation, reward, done, info = env.step(action)
        if done:
            print("Episode finished after {} timesteps".format(t + 1))
            break
env.close()
Data Preprocessing
Why Data Preprocessing?
❑Real-world data is not perfect for learning.
❑It is noisy, dirty and incomplete.
❑No quality data, no quality results.
Types of Data:
Types of Data (cont.)
❑ Nominal Data: Categorical values without any order. For example, the color of
cars: black, white, red, blue.
❑ Ordinal Data: Categorical data with a natural order. For example, the size of
clothes: small, medium, large, extra large. Differences between values are
not meaningful: an expression such as large - medium = small makes no sense.
❑ Interval Data: Numeric values with a defined unit of measurement, where
differences are meaningful but there is no true zero. For example,
temperature in Celsius, dates.
❑ Ratio Data: Numeric values with a defined unit of measurement where both
differences and ratios are meaningful. For example, count, age, mass, length.
❑ Time Series Data: A time series is a series of data points indexed in time
order. Most commonly, a time series is a sequence taken at successive
equally spaced points in time. For example, weather forecasting.
❑ Text Data: This is unstructured data. Text data usually consists of
documents which can represent words, sentences or even paragraphs of
free flowing text.
Data Processing Steps:
❑A dataset is viewed as a collection of data objects.
❑Data objects contain many features.
❑Features are the characteristics of a data object, for
example the color, speed and mileage of a car.
❑These are the basic steps in data processing:
o Data Quality Assessment
o Feature Aggregation
o Feature Sampling
o Dimensionality Reduction
o Feature Encoding
Data Quality assessment:
❑Collected data may be incomplete and noisy.
❑We cannot completely rely on data acquisition tools.
❑There may be flaws in the data collection process.
❑Raw data contains missing values, duplicate values, and
inconsistent values.
❑We have to tackle all these limitations before going for
machine learning.
Feature aggregation
❑Data is collected from different sources.
❑The collected data is then aggregated into a single unit.
❑Aggregation reduces memory consumption.
❑For example, if we collect daily sales records of a
store from multiple places, we can aggregate them
into monthly or yearly sales.
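The daily-to-monthly aggregation described above can be sketched with the standard library alone. The sales records, store names and amounts below are made up for illustration.

```python
from collections import defaultdict

# Hypothetical daily sales records: (ISO date, store, amount).
daily_sales = [
    ("2023-01-05", "Store A", 120.0),
    ("2023-01-05", "Store B", 80.0),
    ("2023-01-20", "Store A", 100.0),
    ("2023-02-03", "Store B", 50.0),
    ("2023-02-14", "Store A", 70.0),
]

# Sum the amounts of all stores per month: many daily rows become
# one value per month.
monthly_sales = defaultdict(float)
for date, _store, amount in daily_sales:
    month = date[:7]                 # "YYYY-MM" prefix of the ISO date
    monthly_sales[month] += amount

print(dict(monthly_sales))
```

Five daily rows collapse into two monthly totals, which is where the memory saving comes from.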
Feature Sampling:
❑ A large dataset may come from different sources.
❑ Take a subset of it for the machine learning model.
❑ Choose a sampling algorithm that properly divides the dataset
into a working subset of data.
❑ Take care of imbalanced classes in the dataset.
❑ Some sampling algorithms:
o Simple random sampling.
o Systematic sampling.
o Stratified sampling.
o Clustered sampling.
o Convenience sampling.
o Quota sampling.
o Judgement (or purposive) sampling.
o Snowball sampling.
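As one example from this list, stratified sampling draws the same fraction from every class, so an imbalanced dataset keeps its class proportions in the subset. The sketch below is a minimal version; the function `stratified_sample`, the 90/10 "ok"/"fraud" dataset and the 20% fraction are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_sample(records, label_of, fraction, seed=0):
    """Sample the same fraction of records from every class."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for record in records:
        groups[label_of(record)].append(record)
    sample = []
    for members in groups.values():
        # Keep at least one record per class so rare classes are not lost.
        size = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, size))
    return sample


# Hypothetical imbalanced dataset: 90 "ok" rows and 10 "fraud" rows.
records = [(i, "ok") for i in range(90)] + [(i, "fraud") for i in range(90, 100)]
subset = stratified_sample(records, label_of=lambda r: r[1], fraction=0.2)
counts = {label: sum(1 for r in subset if r[1] == label) for label in ("ok", "fraud")}
print(counts)
```

The subset keeps the 9:1 ratio of the full dataset, which simple random sampling does not guarantee for small samples.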
Dimensionality Reduction:
❑ Datasets often have many features, i.e. many dimensions (more than a
3D graph can show).
❑ We cannot easily visualize data in higher dimensions.
❑ So we reduce the dimensions of the dataset.
❑ Map the higher-dimensional space (n dimensions) to a lower-dimensional
space (for example 2D plots).
❑ A lower-dimensional space is easier to process and visualize.
Feature Encoding:
❑ Machines cannot understand data the way humans do.
❑ We have to convert the dataset into a machine-readable form.
❑ Feature encoding techniques are different for different kinds of
data.
Data Pre Processing Libraries
# used for handling numbers
import numpy as np
# used for handling the dataset
import pandas as pd
# used for handling missing data
from sklearn.impute import SimpleImputer
# used for encoding categorical data
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
# used for splitting training and testing data
from sklearn.model_selection import train_test_split
# used for feature scaling
from sklearn.preprocessing import StandardScaler
Label Encoder for the Categorical data:
# Categorical feature
weather = ['Sunny', 'Sunny', 'Overcast', 'Rainy', 'Rainy', 'Rainy', 'Overcast',
           'Sunny', 'Sunny', 'Rainy', 'Sunny', 'Overcast', 'Overcast', 'Rainy']

# Import LabelEncoder
from sklearn import preprocessing

# Create the LabelEncoder
le = preprocessing.LabelEncoder()

# Convert string labels into numbers
weather_encoded = le.fit_transform(weather)
print(weather_encoded)
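To see what a label encoder computes, the same mapping can be built by hand: the distinct labels are sorted and each one is given its position as an integer code. The sketch below uses plain Python rather than scikit-learn, and also shows a one-hot row for comparison; the variable names are chosen for illustration.

```python
weather = ['Sunny', 'Sunny', 'Overcast', 'Rainy', 'Rainy', 'Rainy', 'Overcast',
           'Sunny', 'Sunny', 'Rainy', 'Sunny', 'Overcast', 'Overcast', 'Rainy']

# Sorted distinct labels; the index of a label is its integer code.
classes = sorted(set(weather))
to_code = {label: code for code, label in enumerate(classes)}
encoded = [to_code[label] for label in weather]
print(classes)   # ['Overcast', 'Rainy', 'Sunny']
print(encoded)   # [2, 2, 0, 1, 1, 1, 0, 2, 2, 1, 2, 0, 0, 1]

# One-hot encoding: one 0/1 column per class instead of a single integer,
# which avoids implying an order between the categories.
one_hot = [[1 if code == c else 0 for c in range(len(classes))] for code in encoded]
print(one_hot[0])  # [0, 0, 1] for 'Sunny'
```

Integer codes suit ordinal data; for nominal data such as weather, a one-hot representation is often the safer choice because the codes 0, 1, 2 carry no real ordering.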
Dealing with Missing value
import pandas as pd
import numpy as np
from sklearn.impute import SimpleImputer

# Sample data with one missing value in each column
df = pd.DataFrame({"Age": [23, 70, 56, 24, np.nan],
                   "Salary": [30000, 30000, 50000, np.nan, 40000]})
print(df)

# Replace each missing value with the most frequent value of its column
imp = SimpleImputer(missing_values=np.nan, strategy="most_frequent")
X = imp.fit_transform(df)
df1 = pd.DataFrame(X, columns=["Age", "Salary"])
print(df1)
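The effect of the imputation strategy is easier to see on plain lists. The sketch below re-implements two strategies by hand, with `None` standing in for a missing value; the function `impute` is hypothetical, and breaking ties by taking the smallest value follows the behaviour scikit-learn documents for "most_frequent".

```python
from collections import Counter
from statistics import mean

age = [23, 70, 56, 24, None]
salary = [30000, 30000, 50000, None, 40000]


def impute(column, strategy):
    """Fill None entries with the column mean or its most frequent value."""
    known = [v for v in column if v is not None]
    if strategy == "mean":
        fill = mean(known)
    else:  # "most_frequent"; ties are broken by taking the smallest value
        counts = Counter(known)
        top = max(counts.values())
        fill = min(v for v, n in counts.items() if n == top)
    return [fill if v is None else v for v in column]


print(impute(age, "most_frequent"))     # [23, 70, 56, 24, 23]
print(impute(salary, "most_frequent"))  # [30000, 30000, 50000, 30000, 40000]
print(impute(age, "mean"))              # [23, 70, 56, 24, 43.25]
```

Every age occurs only once, so "most_frequent" falls back to the smallest age (23); for a numeric column like this, the mean or median is usually a more sensible fill value.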
Scaling Data
from sklearn.preprocessing import StandardScaler

# Standardize each column to zero mean and unit variance
sc = StandardScaler(with_mean=True)
X = sc.fit_transform(df1)
X_scaled = pd.DataFrame(X, columns=["Age", "Salary"])
print(X_scaled)
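Standard scaling replaces each value x by z = (x - mean) / std, where std is the population standard deviation of the column. A hand-computed sketch on an example salary column (the data and the function name `standardize` are illustrative):

```python
from statistics import mean, pstdev


def standardize(values):
    """Return z-scores using the population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]


salary = [30000, 30000, 50000, 30000, 40000]
scaled = standardize(salary)
print([round(z, 3) for z in scaled])  # [-0.75, -0.75, 1.75, -0.75, 0.5]
```

Here the mean is 36000 and the standard deviation is 8000, so a salary of 50000 becomes (50000 - 36000) / 8000 = 1.75.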
Disclaimer
❑This is an educational presentation to enhance the
skill of computer science students.
❑This presentation is available for free to computer
science students.
❑Some internet images from different URLs are used
in this presentation to simplify technical examples
and correlate examples with the real world.
❑We are grateful to owners of these URLs and
pictures.
Thank You!
www.SunilOS.com 51
www.SunilOS.com
1 of 51

Recommended

Machine learning ( Part 2 ) by
Machine learning ( Part 2 )Machine learning ( Part 2 )
Machine learning ( Part 2 )Sunil OS
571.7K views170 slides
Machine learning ( Part 1 ) by
Machine learning ( Part 1 )Machine learning ( Part 1 )
Machine learning ( Part 1 )Sunil OS
616.1K views58 slides
PDBC by
PDBCPDBC
PDBCSunil OS
277.4K views32 slides
Collection v3 by
Collection v3Collection v3
Collection v3Sunil OS
105.5K views63 slides
Python part2 v1 by
Python part2 v1Python part2 v1
Python part2 v1Sunil OS
608K views103 slides
JavaScript by
JavaScriptJavaScript
JavaScriptSunil OS
519K views27 slides

More Related Content

What's hot

DJango by
DJangoDJango
DJangoSunil OS
124.5K views63 slides
Hibernate by
Hibernate Hibernate
Hibernate Sunil OS
511.1K views102 slides
Java IO Streams V4 by
Java IO Streams V4Java IO Streams V4
Java IO Streams V4Sunil OS
294 views42 slides
JDBC by
JDBCJDBC
JDBCSunil OS
458.2K views54 slides
Resource Bundle by
Resource BundleResource Bundle
Resource BundleSunil OS
506.8K views15 slides
Threads V4 by
Threads  V4Threads  V4
Threads V4Sunil OS
319 views37 slides

What's hot(20)

DJango by Sunil OS
DJangoDJango
DJango
Sunil OS124.5K views
Hibernate by Sunil OS
Hibernate Hibernate
Hibernate
Sunil OS511.1K views
Java IO Streams V4 by Sunil OS
Java IO Streams V4Java IO Streams V4
Java IO Streams V4
Sunil OS294 views
JDBC by Sunil OS
JDBCJDBC
JDBC
Sunil OS458.2K views
Resource Bundle by Sunil OS
Resource BundleResource Bundle
Resource Bundle
Sunil OS506.8K views
Threads V4 by Sunil OS
Threads  V4Threads  V4
Threads V4
Sunil OS319 views
OOP V3.1 by Sunil OS
OOP V3.1OOP V3.1
OOP V3.1
Sunil OS487 views
Java 8 - CJ by Sunil OS
Java 8 - CJJava 8 - CJ
Java 8 - CJ
Sunil OS954.1K views
Jsp/Servlet by Sunil OS
Jsp/ServletJsp/Servlet
Jsp/Servlet
Sunil OS528K views
Exception Handling by Sunil OS
Exception HandlingException Handling
Exception Handling
Sunil OS1.5M views
Java Basics V3 by Sunil OS
Java Basics V3Java Basics V3
Java Basics V3
Sunil OS716 views
Collections Framework by Sunil OS
Collections FrameworkCollections Framework
Collections Framework
Sunil OS1.3M views
Java Input Output and File Handling by Sunil OS
Java Input Output and File HandlingJava Input Output and File Handling
Java Input Output and File Handling
Sunil OS1.1M views
JAVA Variables and Operators by Sunil OS
JAVA Variables and OperatorsJAVA Variables and Operators
JAVA Variables and Operators
Sunil OS1.5M views
JAVA OOP by Sunil OS
JAVA OOPJAVA OOP
JAVA OOP
Sunil OS1.5M views
Java Basics by Sunil OS
Java BasicsJava Basics
Java Basics
Sunil OS1.5M views
C++ by Sunil OS
C++C++
C++
Sunil OS590.8K views
Scalable JavaScript Design Patterns by Addy Osmani
Scalable JavaScript Design PatternsScalable JavaScript Design Patterns
Scalable JavaScript Design Patterns
Addy Osmani47K views
JUnit 4 by Sunil OS
JUnit 4JUnit 4
JUnit 4
Sunil OS513.5K views

Similar to Machine learning ( Part 3 )

Reinforcement Learning by
Reinforcement LearningReinforcement Learning
Reinforcement LearningSalem-Kabbani
951 views33 slides
Machine Learning Approach.pptx by
Machine Learning Approach.pptxMachine Learning Approach.pptx
Machine Learning Approach.pptxCYPatrickKwee
9 views28 slides
Cluster Analysis for Dummies by
Cluster Analysis for DummiesCluster Analysis for Dummies
Cluster Analysis for DummiesVenkata Reddy Konasani
75.7K views40 slides
Intro to Deep Reinforcement Learning by
Intro to Deep Reinforcement LearningIntro to Deep Reinforcement Learning
Intro to Deep Reinforcement LearningKhaled Saleh
805 views31 slides
Clustering.pptx by
Clustering.pptxClustering.pptx
Clustering.pptxMukul Kumar Singh Chauhan
142 views152 slides

Similar to Machine learning ( Part 3 )(20)

Machine Learning Approach.pptx by CYPatrickKwee
Machine Learning Approach.pptxMachine Learning Approach.pptx
Machine Learning Approach.pptx
CYPatrickKwee9 views
Intro to Deep Reinforcement Learning by Khaled Saleh
Intro to Deep Reinforcement LearningIntro to Deep Reinforcement Learning
Intro to Deep Reinforcement Learning
Khaled Saleh805 views
Aaa ped-24- Reinforcement Learning by AminaRepo
Aaa ped-24- Reinforcement LearningAaa ped-24- Reinforcement Learning
Aaa ped-24- Reinforcement Learning
AminaRepo59 views
Detection of Online Learning Activity Scopes by Syeda Sana
Detection of Online Learning Activity ScopesDetection of Online Learning Activity Scopes
Detection of Online Learning Activity Scopes
Syeda Sana35 views
Big Data Analytics - Unit 3.pptx by PlacementsBCA
Big Data Analytics - Unit 3.pptxBig Data Analytics - Unit 3.pptx
Big Data Analytics - Unit 3.pptx
PlacementsBCA42 views
EssentialsOfMachineLearning.pdf by Ankita Tiwari
EssentialsOfMachineLearning.pdfEssentialsOfMachineLearning.pdf
EssentialsOfMachineLearning.pdf
Ankita Tiwari15 views
Barga Data Science lecture 5 by Roger Barga
Barga Data Science lecture 5Barga Data Science lecture 5
Barga Data Science lecture 5
Roger Barga241 views
reinforcement-learning-141009013546-conversion-gate02.pdf by VaishnavGhadge1
reinforcement-learning-141009013546-conversion-gate02.pdfreinforcement-learning-141009013546-conversion-gate02.pdf
reinforcement-learning-141009013546-conversion-gate02.pdf
VaishnavGhadge1117 views
Reinforcement Learning Guide For Beginners by gokulprasath06
Reinforcement Learning Guide For BeginnersReinforcement Learning Guide For Beginners
Reinforcement Learning Guide For Beginners
gokulprasath06296 views
K-Means Clustering Algorithm - Cluster Analysis | Machine Learning Algorithm ... by Edureka!
K-Means Clustering Algorithm - Cluster Analysis | Machine Learning Algorithm ...K-Means Clustering Algorithm - Cluster Analysis | Machine Learning Algorithm ...
K-Means Clustering Algorithm - Cluster Analysis | Machine Learning Algorithm ...
Edureka!3.9K views
student part time job as tutors using k-means algorithm by NUR ZARITH AMBOAKA
student part time job as tutors using k-means algorithmstudent part time job as tutors using k-means algorithm
student part time job as tutors using k-means algorithm
Reinforcement Learning on Mine Sweeper by DataScienceLab
Reinforcement Learning on Mine SweeperReinforcement Learning on Mine Sweeper
Reinforcement Learning on Mine Sweeper
DataScienceLab38 views

More from Sunil OS

OOP v3 by
OOP v3OOP v3
OOP v3Sunil OS
107.9K views56 slides
Threads v3 by
Threads v3Threads v3
Threads v3Sunil OS
104.8K views41 slides
Exception Handling v3 by
Exception Handling v3Exception Handling v3
Exception Handling v3Sunil OS
75K views29 slides
Python Pandas by
Python PandasPython Pandas
Python PandasSunil OS
613.7K views119 slides
Angular 8 by
Angular 8 Angular 8
Angular 8 Sunil OS
531K views93 slides
C# Variables and Operators by
C# Variables and OperatorsC# Variables and Operators
C# Variables and OperatorsSunil OS
494.1K views69 slides

More from Sunil OS(12)

OOP v3 by Sunil OS
OOP v3OOP v3
OOP v3
Sunil OS107.9K views
Threads v3 by Sunil OS
Threads v3Threads v3
Threads v3
Sunil OS104.8K views
Exception Handling v3 by Sunil OS
Exception Handling v3Exception Handling v3
Exception Handling v3
Sunil OS75K views
Python Pandas by Sunil OS
Python PandasPython Pandas
Python Pandas
Sunil OS613.7K views
Angular 8 by Sunil OS
Angular 8 Angular 8
Angular 8
Sunil OS531K views
C# Variables and Operators by Sunil OS
C# Variables and OperatorsC# Variables and Operators
C# Variables and Operators
Sunil OS494.1K views
C# Basics by Sunil OS
C# BasicsC# Basics
C# Basics
Sunil OS481.5K views
Rays Technologies by Sunil OS
Rays TechnologiesRays Technologies
Rays Technologies
Sunil OS915 views
C++ oop by Sunil OS
C++ oopC++ oop
C++ oop
Sunil OS598.3K views
C Basics by Sunil OS
C BasicsC Basics
C Basics
Sunil OS596.4K views
Java Threads and Concurrency by Sunil OS
Java Threads and ConcurrencyJava Threads and Concurrency
Java Threads and Concurrency
Sunil OS1.1M views
Java Swing JFC by Sunil OS
Java Swing JFCJava Swing JFC
Java Swing JFC
Sunil OS1M views

Recently uploaded

ANGULARJS.pdf by
ANGULARJS.pdfANGULARJS.pdf
ANGULARJS.pdfArthyR3
54 views10 slides
What is Digital Transformation? by
What is Digital Transformation?What is Digital Transformation?
What is Digital Transformation?Mark Brown
46 views11 slides
Education of marginalized and socially disadvantages segments.pptx by
Education of marginalized and socially disadvantages segments.pptxEducation of marginalized and socially disadvantages segments.pptx
Education of marginalized and socially disadvantages segments.pptxGarimaBhati5
52 views36 slides
Ask The Expert! Nonprofit Website Tools, Tips, and Technology.pdf by
 Ask The Expert! Nonprofit Website Tools, Tips, and Technology.pdf Ask The Expert! Nonprofit Website Tools, Tips, and Technology.pdf
Ask The Expert! Nonprofit Website Tools, Tips, and Technology.pdfTechSoup
67 views28 slides
UNIT NO 13 ORGANISMS AND POPULATION.pptx by
UNIT NO 13 ORGANISMS AND POPULATION.pptxUNIT NO 13 ORGANISMS AND POPULATION.pptx
UNIT NO 13 ORGANISMS AND POPULATION.pptxMadhuri Bhande
48 views33 slides
ICS3211_lecture 09_2023.pdf by
ICS3211_lecture 09_2023.pdfICS3211_lecture 09_2023.pdf
ICS3211_lecture 09_2023.pdfVanessa Camilleri
150 views10 slides

Recently uploaded(20)

ANGULARJS.pdf by ArthyR3
ANGULARJS.pdfANGULARJS.pdf
ANGULARJS.pdf
ArthyR354 views
What is Digital Transformation? by Mark Brown
What is Digital Transformation?What is Digital Transformation?
What is Digital Transformation?
Mark Brown46 views
Education of marginalized and socially disadvantages segments.pptx by GarimaBhati5
Education of marginalized and socially disadvantages segments.pptxEducation of marginalized and socially disadvantages segments.pptx
Education of marginalized and socially disadvantages segments.pptx
GarimaBhati552 views
Ask The Expert! Nonprofit Website Tools, Tips, and Technology.pdf by TechSoup
 Ask The Expert! Nonprofit Website Tools, Tips, and Technology.pdf Ask The Expert! Nonprofit Website Tools, Tips, and Technology.pdf
Ask The Expert! Nonprofit Website Tools, Tips, and Technology.pdf
TechSoup 67 views
UNIT NO 13 ORGANISMS AND POPULATION.pptx by Madhuri Bhande
UNIT NO 13 ORGANISMS AND POPULATION.pptxUNIT NO 13 ORGANISMS AND POPULATION.pptx
UNIT NO 13 ORGANISMS AND POPULATION.pptx
Madhuri Bhande48 views
Research Methodology (M. Pharm, IIIrd Sem.)_UNIT_IV_CPCSEA Guidelines for Lab... by RAHUL PAL
Research Methodology (M. Pharm, IIIrd Sem.)_UNIT_IV_CPCSEA Guidelines for Lab...Research Methodology (M. Pharm, IIIrd Sem.)_UNIT_IV_CPCSEA Guidelines for Lab...
Research Methodology (M. Pharm, IIIrd Sem.)_UNIT_IV_CPCSEA Guidelines for Lab...
RAHUL PAL45 views
Interaction of microorganisms with vascular plants.pptx by MicrobiologyMicro
Interaction of microorganisms with vascular plants.pptxInteraction of microorganisms with vascular plants.pptx
Interaction of microorganisms with vascular plants.pptx
INT-244 Topic 6b Confucianism by S Meyer
INT-244 Topic 6b ConfucianismINT-244 Topic 6b Confucianism
INT-244 Topic 6b Confucianism
S Meyer51 views
Peripheral artery diseases by Dr. Garvit.pptx by garvitnanecha
Peripheral artery diseases by Dr. Garvit.pptxPeripheral artery diseases by Dr. Garvit.pptx
Peripheral artery diseases by Dr. Garvit.pptx
garvitnanecha135 views
Artificial Intelligence and The Sustainable Development Goals (SDGs) Adoption... by BC Chew
Artificial Intelligence and The Sustainable Development Goals (SDGs) Adoption...Artificial Intelligence and The Sustainable Development Goals (SDGs) Adoption...
Artificial Intelligence and The Sustainable Development Goals (SDGs) Adoption...
BC Chew40 views
Pharmaceutical Analysis PPT (BP 102T) by yakshpharmacy009
Pharmaceutical Analysis PPT (BP 102T) Pharmaceutical Analysis PPT (BP 102T)
Pharmaceutical Analysis PPT (BP 102T)
yakshpharmacy009118 views
JRN 362 - Lecture Twenty-Two by Rich Hanley
JRN 362 - Lecture Twenty-TwoJRN 362 - Lecture Twenty-Two
JRN 362 - Lecture Twenty-Two
Rich Hanley39 views
GSoC 2024 .pdf by ShabNaz2
GSoC 2024 .pdfGSoC 2024 .pdf
GSoC 2024 .pdf
ShabNaz245 views

Machine learning ( Part 3 )

  • 1. www.SunilOS.com 1 Unsupervised Machine Learning www.sunilos.com www.raystec.com
  • 2. Unsupervised Learning ❑ unsupervised machine learning is a type of machine learning where we have only data points but no labels. ❑ We will make a group based on the similarity among data points. o For example, in real life we are arranging our bookshelves. In bookshelves we have different kinds of books. We will make groups of books based on their subjects. So, in unsupervised learning we iterate through data and group them together based on similar characteristics. ❑ Unsupervised learning also known as clustering. www.SunilOS.com 2
  • 3. Types of clustering : ❑Partition Based o partition-based clustering methods include K-Means, K- Medoids, CLARANS, etc. ❑Hierarchical Based o hierarchical clustering methods include BIRCH and Chameleon. ❑Density Based Learning o DBSCAN, OPTICS are the most popular density-based clustering methods. www.SunilOS.com 3
  • 4. K-means clustering algorithm ❑ K-means is based on a partition based clustering method. ❑ K-means, it is one of the simplest unsupervised learning algorithms that will solve the most well-known clustering problem. ❑ The procedure can be grouped as the one which follows a simple and very easy way to classify a given data set with the help of a certain number of clusters (assume k clusters). ❑ www.SunilOS.com 4
  • 5. How K-Means Clustering Works: ❑The K Means algorithm is iterative based, it repeatedly calculates the cluster centroids, refining the values until they do not change much. ❑The k-means algorithm takes a dataset of ‘n’ points as input, together with an integer parameter ‘k’ specifying how many clusters to create(supplied by the programmer). ❑The output is a set of ‘k’ cluster centroids and a labeling of the dataset that maps each of the data points to a unique cluster. ❑ www.SunilOS.com 5
  • 6. Steps of K-means clustering: ❑Choose the number of clusters k ❑Select k random points from the data as centroids ❑Assign all the points to the closest cluster centroid ❑Recompute the centroids of newly formed clusters ❑Repeat step 3 and 4 www.SunilOS.com 6
  • 7. When to stop Iterating? ❑Centroids of newly formed clusters do not change ❑Points remain in the same cluster ❑Maximum number of iterations are reached www.SunilOS.com 7
  • 8. Working of K-means www.SunilOS.com 8 ❑Sample Dataset: Objects X Y Z OB-1 1 4 1 OB-2 1 2 2 OB-3 1 4 2 OB-4 2 1 2 OB-5 1 1 1 OB-6 2 4 2 OB-7 1 1 2 OB-8 2 1 1
  • 9. ❑ We have total 8 data points. We will divide these points into 2 clusters. K=2 in k-means. ❑ Taking any two centroids or data points (as you took 2 as K hence the number of centroids also 2) in its account initially. ❑ After choosing the centroids, (say C1 and C2) the data points (coordinates here) are assigned to any of the Clusters ❑ Assume that the algorithm chose OB-2 (1,2,2) and OB-6 (2,4,2) as centroids and cluster 1 and cluster 2 as well. ❑ For measuring the distances, you take the following distance measurement function (also termed as similarity measurement function): ❑ d=|x2–x1|+|y2–y1|+|z2–z1| www.SunilOS.com 9
  • 10. calculation of distances Objects X Y Z Distance from C1(1,2,2) Distance from C2(2,4,2) OB-1 1 4 1 3 2 OB-2 1 2 2 0 3 OB-3 1 4 2 2 1 OB-4 2 1 2 2 3 OB-5 1 1 1 2 5 OB-6 2 4 2 3 0 OB-7 1 1 2 1 4 OB-8 2 1 1 3 4 www.SunilOS.com 10
  • 11. Cluster formation ❑ After the initial pass of clustering, the clustered objects will look something like the following: ❑ www.SunilOS.com 11 Cluster 1 OB-2 OB-4 OB-5 OB-7 OB-8 Cluster 2 OB-1 OB-3 OB-6
  • 13. Distance from new Centroids Objects X Y Z Distance from C1(1.4,1.2,1.6) Distance from C2(1.33, 4, 1.66) OB-1 1 4 1 3.8 1 OB-2 1 2 2 1.6 2.66 OB-3 1 4 2 3.6 0.66 OB-4 2 1 2 1.2 4 OB-5 1 1 1 1.2 4 OB-6 2 4 2 3.8 1 OB-7 1 1 2 1 3.66 OB-8 2 1 1 1.4 4.33 www.SunilOS.com 13
  • 14. Updated Clusters
      ❑ The new assignment of the objects with respect to the updated centroids is:

      | Cluster   | Objects                      |
      |-----------|------------------------------|
      | Cluster 1 | OB-2, OB-4, OB-5, OB-7, OB-8 |
      | Cluster 2 | OB-1, OB-3, OB-6             |

      ❑ The algorithm ends here because no object changed cluster.
  • 15. Code Implementation of K-means

      import matplotlib.pyplot as plt
      from matplotlib import style
      import numpy as np

      style.use('ggplot')

      # Six two-dimensional sample points
      X = np.array([[1, 2], [1.5, 1.8], [5, 8], [8, 8], [1, 0.6], [9, 11]])

      # Plot the raw points before clustering
      plt.scatter(X[:, 0], X[:, 1], s=150)
      plt.show()
  • 16. Code Implementation of K-means (cont.)

      from sklearn.cluster import KMeans

      # X is the array of sample points defined on the previous slide.
      # Cluster the records into 2 groups
      kmeans = KMeans(n_clusters=2)

      # Train the model
      kmeans.fit(X)

      # Test the model: predict the cluster of a new point
      labels = kmeans.predict([[20, 8]])
      print(labels)

      # Coordinates of the learned cluster centers
      centroids = kmeans.cluster_centers_
      print(centroids)
  • 18. Reinforcement Learning
      ❑ We first learn by interacting with our environment.
      ❑ Whether we are learning to drive a car or learning to walk, the learning is based on interaction with the environment.
      ❑ Learning from interaction is a foundational concept underlying most theories of learning and intelligence.
      ❑ Reinforcement Learning is goal-oriented learning based on interaction with the environment. It is often described as the hope of true artificial intelligence.
  • 19. Problem Statement
      ❑ How does a child learn to walk?
  • 20. Formalizing the Problem
      ❑ The child is an agent trying to manipulate the environment (the surface on which it walks) by taking actions (walking), moving from one state (each step taken) to another.
      ❑ The child gets a reward (say, a chocolate) on accomplishing a subtask (taking a couple of steps) and receives no chocolate (a negative reward) when unable to walk.
      ❑ This is a simplified description of a reinforcement learning problem.
  • 21. Basis of Reinforcement Learning
  • 22. Differences Between the Kinds of Machine Learning:

      |                   | Supervised                         | Unsupervised                               | Reinforcement                                                                                  |
      |-------------------|------------------------------------|--------------------------------------------|------------------------------------------------------------------------------------------------|
      | Definition        | Learns from labeled data           | Learns from unlabeled data                 | Learns by interacting with the environment through actions and discovering errors and rewards |
      | Types of problems | Regression and classification      | Association and clustering                 | Reward based                                                                                   |
      | Data              | Labeled                            | Unlabeled                                  | No predefined data                                                                             |
      | Training          | External supervision               | No supervision                             | No supervision                                                                                 |
      | Approach          | Maps labeled input to known output | Searches for patterns and discovers output | Follows a trial-and-error method                                                               |
      | Algorithms        | SVM, KNN, Linear Regression        | K-means, C-means                           | Q-Learning, SARSA, etc.                                                                        |
  • 23. Terminology of Reinforcement Learning:
      ❑ Agent: an entity (a computer program) that learns from the environment based on feedback.
      ❑ Action: a step taken by the agent in response to the current situation.
      ❑ Environment: the surroundings in which the agent is present and acts. The environment is often stochastic (random) in nature.
      ❑ State: the situation returned by the environment after each action of the agent.
      ❑ Reward: feedback from the environment, positive or negative, based on the action of the agent.
      ❑ Policy: the strategy the agent applies to choose its next action from the current state.
      ❑ Value: the expected long-term return, as opposed to the short-term reward.
      ❑ Q-value: similar to value, but it takes the current action as an additional parameter.
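To make these terms concrete, here is a minimal sketch of the agent-environment loop. Everything in it is hypothetical and chosen for illustration: the `Corridor` environment (five cells, goal in the last one, deterministic for simplicity), the reward values, and the fixed `always_right` policy.

```python
class Corridor:
    """Hypothetical environment: a 5-cell corridor with the goal in the last cell."""

    def __init__(self, length=5):
        self.length = length
        self.state = 0

    def reset(self):
        # Put the agent back at the first cell and return the initial state.
        self.state = 0
        return self.state

    def step(self, action):
        # Action 0 moves left, action 1 moves right; walls clamp the position.
        move = 1 if action == 1 else -1
        self.state = min(max(self.state + move, 0), self.length - 1)
        done = self.state == self.length - 1
        reward = 10 if done else -1  # small penalty per step, bonus at the goal
        return self.state, reward, done


def always_right(state):
    """A fixed policy: map every state to the action 'move right'."""
    return 1


env = Corridor()
state = env.reset()
total_reward = 0
done = False
while not done:
    action = always_right(state)            # policy chooses the action
    state, reward, done = env.step(action)  # environment returns state and reward
    total_reward += reward

print(state, total_reward)
```

A learning agent would replace `always_right` with a policy that it improves from the rewards it observes.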
  • 24. Key Points of Reinforcement Learning:
      ❑ It is based on the trial-and-error method.
      ❑ The agent is not told anything about the environment or which step to take next.
      ❑ The agent chooses its next action based on the feedback from previous actions.
      ❑ Rewards and penalties may be delayed.
      ❑ The environment the agent interacts with is typically stochastic, and the agent has to reach its goal while collecting the maximum reward.
  • 25. How to implement RL in Machine Learning:
      ❑ There are three approaches to implementing RL.
      ❑ Model-based learning
          o A model of the environment is created and the agent explores this model. The model differs from one environment to another.
      ❑ Policy-based learning
          o The agent looks for the optimal strategy that yields the maximum future reward without relying on a value function. A policy can be of two types: deterministic (the same action is always chosen in a given state) or stochastic (the action is chosen according to a probability distribution).
      ❑ Value-based learning
          o The agent tries to find the optimal value function, that is, the maximum value achievable at a state under any policy.
  • 26. When Not to Use RL?
      ❑ When you already have enough data to solve the problem by training a supervised model.
      ❑ When time is limited: RL is a time-consuming process.
  • 27. Why use Reinforcement Learning?
      ❑ When the system has to learn from rewards.
      ❑ When the agent has to learn from the outcome of its own actions.
      ❑ It helps you discover which action yields the highest reward over the long run.
      ❑ When we want to find the best method for obtaining large rewards.
  • 28. Learning Models of Reinforcement
      ❑ Markov Decision Process
      ❑ Q-learning
      ❑ SARSA (State Action Reward State Action)
      ❑ Deep Q-Network (DQN)
  • 29. Q-Learning
      ❑ In Q-learning, Q stands for quality.
      ❑ It is a value-based learning method.
      ❑ A value is assigned to each state-action pair to tell the agent which action is best to take.
      ❑ For performing an action a in state s, the agent receives a reward R(s, a) and ends up in a new state, so the Q-value combines that reward with the value of the resulting state (the equation itself appears as an image on the original slide).
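A commonly stated form of the Q-learning update is Q(s, a) ← Q(s, a) + α [R(s, a) + γ max over a' of Q(s', a') − Q(s, a)], where s' is the state reached after taking a. The sketch below applies that rule to a hypothetical 5-cell corridor. The environment, the reward values, and the settings for the learning rate α and discount factor γ are assumptions for illustration, not taken from the slides.

```python
import numpy as np

n_states, n_actions = 5, 2   # corridor cells; actions: 0 = left, 1 = right
alpha, gamma = 0.5, 0.9      # learning rate and discount factor (assumed values)
Q = np.zeros((n_states, n_actions))
rng = np.random.default_rng(0)


def step(state, action):
    """Hypothetical deterministic corridor: reward 10 at the last cell, -1 otherwise."""
    next_state = min(max(state + (1 if action == 1 else -1), 0), n_states - 1)
    done = next_state == n_states - 1
    return next_state, (10 if done else -1), done


for episode in range(500):
    state, done = 0, False
    while not done:
        # Explore with uniformly random actions (Q-learning is off-policy).
        action = int(rng.integers(n_actions))
        next_state, reward, done = step(state, action)
        # Move Q(s, a) toward reward + gamma * best Q-value of the next state.
        target = reward + gamma * Q[next_state].max()
        Q[state, action] += alpha * (target - Q[state, action])
        state = next_state

# The greedy policy picks the action with the highest Q-value in each state.
greedy_policy = Q.argmax(axis=1)
print(Q.round(2))
print(greedy_policy[:-1])
```

The last row of `Q` belongs to the terminal cell and is never updated, so only the first four entries of the greedy policy are meaningful; in each of them the learned best action is "right".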
  • 32. Gym environment for Reinforcement Learning:
      ❑ Gym is a Python library for developing reinforcement learning algorithms.
      ❑ We can install Gym using the following command:
      ❑ pip install gym
  • 33. The env object provides the following main functions:
      ❑ The step() function takes an action as an argument and, in the classic Gym API, returns four objects:
          o observation: an environment-specific object representing the observation of the environment.
          o reward: a signed float value indicating the gain (or loss) from the previous action.
          o done: a Boolean value indicating whether the episode has finished.
          o info: a dictionary with diagnostic information, useful for debugging.
      ❑ The render() function creates a visual representation of the environment.
      ❑ The reset() function resets the environment to its original state.
  • 34. Implementation
      ❑ The most popular game is CartPole.
      ❑ In this game a pole is attached to a cart and we have to keep it balanced.
      ❑ The episode ends if the pole tilts too far from vertical (the slide cites 15 degrees; the CartPole implementation in Gym uses a threshold of 12 degrees) or the cart moves more than 2.4 units from the center.
      ❑ This is one of the simplest environments for learning the basics.
      ❑ The game has only four observations and two actions.
          o The actions move the cart by applying a force of +1 or -1.
          o The observations are the position of the cart, the velocity of the cart, the angle of the pole, and the rotation rate of the pole.
  • 35. Getting the CartPole environment

      # Written for the classic Gym API (gym < 0.26), where reset() returns an
      # observation and step() returns four values.
      import gym

      env = gym.make('CartPole-v0')
      for i_episode in range(20):
          # Start a new episode
          observation = env.reset()
          for t in range(1000):
              env.render()
              print(observation)
              # Pick a random action from the action space
              action = env.action_space.sample()
              observation, reward, done, info = env.step(action)
              if done:
                  print("Episode finished after {} timesteps".format(t+1))
                  break
      env.close()
  • 37. Why Data Preprocessing?
      ❑ Real-world data is not perfect for learning.
      ❑ It is noisy, dirty and incomplete.
      ❑ Without quality data there are no quality results.
  • 39. Types of Data (cont.)
      ❑ Nominal data: categorical values without any order. For example, the color of cars: black, white, red, blue.
      ❑ Ordinal data: categorical values with a natural order. For example, the size of clothes: small, medium, large, extra large. Differences between values are not meaningful: an expression such as large - medium = small makes no sense.
      ❑ Interval data: numeric values with a defined unit of measurement, where differences are meaningful. For example, temperature and dates.
      ❑ Ratio data: numeric values with a defined unit of measurement, where both differences and ratios are meaningful. For example, count, age, mass, length.
      ❑ Time series data: a series of data points indexed in time order. Most commonly, a time series is a sequence taken at successive, equally spaced points in time. For example, the weather readings used in forecasting.
      ❑ Text data: unstructured data. Text data usually consists of documents, which can contain words, sentences or even paragraphs of free-flowing text.
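The difference between nominal and ordinal data shows up when the values are encoded. A small pandas sketch, with made-up clothing sizes, of declaring the order explicitly so that comparisons and integer codes respect it:

```python
import pandas as pd

sizes = pd.Series(["medium", "small", "extra large", "large", "small"])

# Declare the natural order of the categories explicitly.
order = ["small", "medium", "large", "extra large"]
ordinal = pd.Categorical(sizes, categories=order, ordered=True)

# Integer codes follow the declared order (small = 0, medium = 1, ...).
print(ordinal.codes)
# Order-based operations such as min and max are allowed for ordered categories.
print(ordinal.min(), ordinal.max())
```

The codes carry order only. The gap between code 0 and code 1 is not a measured difference, which is exactly the limitation of ordinal data noted above.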
  • 40. Data Processing Steps:
      ❑ A dataset is viewed as a collection of data objects.
      ❑ Each data object has many features.
      ❑ Features are the characteristics of a data object. For example, the color, speed and mileage of a car.
      ❑ These are the basic steps in data processing:
          o Data Quality Assessment
          o Feature Aggregation
          o Feature Sampling
          o Dimensionality Reduction
          o Feature Encoding
  • 41. Data Quality Assessment:
      ❑ Collected data may be incomplete and noisy.
      ❑ We cannot rely completely on data acquisition tools.
      ❑ There may be flaws in the data collection process.
      ❑ Raw data contains missing values, duplicate values and inconsistent values.
      ❑ We have to deal with all of these problems before applying machine learning.
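A quick quality check along these lines can be sketched with pandas. The records below are made up for illustration and contain one duplicate row, two missing values and inconsistently written city names.

```python
import numpy as np
import pandas as pd

# Hypothetical raw records with typical quality problems.
raw = pd.DataFrame({
    "name": ["Asha", "Ravi", "Ravi", "Meena", "John"],
    "age": [23, 35, 35, np.nan, 41],
    "city": ["Indore", "indore ", "indore ", "Bhopal", None],
})

# Missing values per column.
missing_per_column = raw.isna().sum()
# Number of rows that exactly repeat an earlier row.
duplicate_rows = raw.duplicated().sum()

# Drop exact duplicates, then normalize inconsistent spelling of city names.
cleaned = raw.drop_duplicates().copy()
cleaned["city"] = cleaned["city"].str.strip().str.title()

print(missing_per_column)
print(duplicate_rows)
print(cleaned)
```

The remaining missing values can then be imputed or the affected rows dropped, depending on the problem.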
  • 42. Feature Aggregation
      ❑ After collecting data from different sources, aggregate it into a single unit.
      ❑ Aggregation reduces memory consumption.
      ❑ For example, suppose we collect the daily sales records of a store from multiple places. We can aggregate these records into monthly or yearly sales.
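The sales example can be sketched with a pandas group-by. The dates, stores and amounts below are invented for illustration.

```python
import pandas as pd

# Hypothetical daily sales records collected from two stores.
daily = pd.DataFrame({
    "date": pd.to_datetime([
        "2023-01-05", "2023-01-20", "2023-02-03", "2023-02-14", "2023-02-28",
    ]),
    "store": ["A", "B", "A", "A", "B"],
    "sales": [120, 80, 200, 50, 150],
})

# Aggregate daily records into one total per calendar month.
monthly = daily.groupby(daily["date"].dt.to_period("M"))["sales"].sum()
print(monthly)
```

Five daily rows collapse into two monthly totals, which is the memory saving the slide refers to.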
  • 43. Feature Sampling:
      ❑ A dataset gathered from different sources can be very large.
      ❑ Take a subset of it for the machine learning model.
      ❑ Choose a sampling algorithm that properly divides the dataset into a working subset of data.
      ❑ Take care of imbalanced classes in the dataset.
      ❑ Some sampling algorithms:
          o Simple random sampling
          o Systematic sampling
          o Stratified sampling
          o Clustered sampling
          o Convenience sampling
          o Quota sampling
          o Judgement (or purposive) sampling
          o Snowball sampling
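Stratified sampling is one way to respect imbalanced classes: it samples the same fraction from every class, so the class proportions of the subset match those of the full dataset. A minimal pandas sketch on made-up data:

```python
import pandas as pd

# Hypothetical imbalanced dataset: 80% "no", 20% "yes".
data = pd.DataFrame({
    "feature": range(20),
    "label": ["no"] * 16 + ["yes"] * 4,
})

# Sample half of the rows from each class separately.
sample = data.groupby("label").sample(frac=0.5, random_state=0)
counts = sample["label"].value_counts()
print(counts)
```

A simple random sample of 10 rows could, by chance, contain very few or no "yes" rows; the stratified sample always keeps the 80/20 split.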
  • 44. Dimensionality Reduction:
      ❑ Datasets are often represented in higher dimensions (for example, 3D graphs or more).
      ❑ We cannot easily visualize data in higher dimensions.
      ❑ The goal is to reduce the number of dimensions of the dataset.
      ❑ Map the higher-dimensional space (n dimensions) to a lower-dimensional space (for example, 2D plots).
      ❑ A lower-dimensional space is easier to process and visualize.
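Principal component analysis (PCA) is one widely used technique for such a mapping; the slides do not name a specific method, so it is used here as an assumption. The sketch below implements PCA directly with NumPy's SVD on synthetic 3D data and projects it to 2D.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 3-D data that mostly varies along two directions, plus small noise.
latent = rng.normal(size=(100, 2))
mixing = np.array([[1.0, 0.5, 0.2],
                   [0.0, 1.0, 0.3]])
data_3d = latent @ mixing + 0.01 * rng.normal(size=(100, 3))

# PCA: center the data, then take the top right-singular vectors.
centered = data_3d - data_3d.mean(axis=0)
_, singular_values, vt = np.linalg.svd(centered, full_matrices=False)

# Rows of vt are principal directions, ordered by decreasing variance.
data_2d = centered @ vt[:2].T

# Fraction of the total variance captured by each component.
explained = singular_values ** 2 / (singular_values ** 2).sum()
print(data_2d.shape)
print(explained.round(3))
```

Because the synthetic data lies almost in a plane, the first two components retain nearly all of the variance, so little information is lost by plotting in 2D.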
  • 45. Feature Encoding:
      ❑ Machines cannot understand data the way humans do.
      ❑ We have to convert the dataset into a machine-readable (numeric) form.
      ❑ Feature encoding techniques differ for different kinds of data.
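For nominal data, one common encoding is one-hot encoding, which creates a separate 0/1 column per category so that no artificial order is introduced. A minimal sketch using pandas on made-up car colors:

```python
import pandas as pd

cars = pd.DataFrame({"color": ["black", "white", "red", "blue", "red"]})

# One column per distinct color; each row has exactly one column set.
one_hot = pd.get_dummies(cars["color"], prefix="color")
print(one_hot)
```

Ordinal data, by contrast, is usually mapped to integers that preserve the category order.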
  • 46. Data Preprocessing Libraries

      # used for handling numbers
      import numpy as np
      # used for handling the dataset
      import pandas as pd
      # used for handling missing data
      from sklearn.impute import SimpleImputer
      # used for encoding categorical data
      from sklearn.preprocessing import LabelEncoder, OneHotEncoder
      # used for splitting training and testing data
      from sklearn.model_selection import train_test_split
      # used for feature scaling
      from sklearn.preprocessing import StandardScaler
  • 47. Label Encoder for Categorical Data:

      # Categorical feature
      weather = ['Sunny', 'Sunny', 'Overcast', 'Rainy', 'Rainy', 'Rainy', 'Overcast',
                 'Sunny', 'Sunny', 'Rainy', 'Sunny', 'Overcast', 'Overcast', 'Rainy']

      # Import LabelEncoder
      from sklearn import preprocessing

      # Create the label encoder
      le = preprocessing.LabelEncoder()

      # Convert string labels into numbers
      weather_encoded = le.fit_transform(weather)
      print(weather_encoded)
  • 48. Dealing with Missing Values

      import pandas as pd
      import numpy as np
      from sklearn.impute import SimpleImputer

      # Sample data with one missing value in each column
      df = pd.DataFrame({"Age": [23, 70, 56, 24, np.nan],
                         "Salary": [30000, 30000, 50000, np.nan, 40000]})
      print(df)

      # Replace each missing value with the most frequent value of its column
      imp = SimpleImputer(missing_values=np.nan, strategy="most_frequent")
      X = imp.fit_transform(df)
      df1 = pd.DataFrame(X, columns=["Age", "Salary"])
      print(df1)
  • 49. Scaling Data

      from sklearn.preprocessing import StandardScaler

      # df1 is the imputed DataFrame from the previous slide.
      # Standardize each column to zero mean and unit variance
      sc = StandardScaler(with_mean=True)
      X = sc.fit_transform(df1)
      X_scaled = pd.DataFrame(X, columns=["Age", "Salary"])
      print(X_scaled)
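Standardization replaces each value x with z = (x - mean) / std, computed per column. StandardScaler uses the population standard deviation, which is also NumPy's default. A small sketch of the same computation by hand, on made-up ages:

```python
import numpy as np

ages = np.array([23.0, 70.0, 56.0, 24.0, 23.0])

# z-score: subtract the column mean, divide by the population standard deviation.
z = (ages - ages.mean()) / ages.std()

print(z.round(3))
print(z.mean().round(6), z.std().round(6))
```

After scaling, the column has mean 0 and standard deviation 1, so features measured in different units (such as age and salary) become comparable.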
  • 50. Disclaimer
      ❑ This is an educational presentation intended to enhance the skills of computer science students.
      ❑ This presentation is available free of charge to computer science students.
      ❑ Some internet images from different URLs are used in this presentation to simplify technical examples and relate them to the real world.
      ❑ We are grateful to the owners of these URLs and pictures.