www.SunilOS.com 1
Unsupervised Machine Learning
www.sunilos.com
www.raystec.com
Unsupervised Learning
❑ Unsupervised machine learning is a type of machine
learning where we have only data points but no labels.
❑ We group the data points based on the similarity among
them.
o For example, think of arranging a bookshelf that holds different kinds of
books. We group the books by subject. In the same way, unsupervised
learning iterates through the data and groups items together based on
similar characteristics.
❑ Clustering is the most common form of unsupervised learning.
Types of clustering:
❑Partition Based
o Partition-based clustering methods include K-Means,
K-Medoids, CLARANS, etc.
❑Hierarchical Based
o Hierarchical clustering methods include BIRCH and
Chameleon.
❑Density Based
o DBSCAN and OPTICS are the most popular density-based
clustering methods.
K-means clustering algorithm
❑ K-means is a partition-based clustering method.
❑ It is one of the simplest unsupervised learning algorithms
for solving the well-known clustering problem.
❑ The procedure classifies a given data set into a fixed
number of clusters (assume k clusters) chosen in advance.
How K-Means Clustering Works:
❑The K-means algorithm is iterative: it repeatedly
recalculates the cluster centroids, refining their values
until they no longer change much.
❑The algorithm takes a dataset of ‘n’ points as input,
together with an integer parameter ‘k’ (supplied by the
programmer) specifying how many clusters to create.
❑The output is a set of ‘k’ cluster centroids and a labeling
of the dataset that maps each data point to a unique
cluster.
Steps of K-means clustering:
❑Choose the number of clusters k
❑Select k random points from the data as centroids
❑Assign all the points to the closest cluster centroid
❑Recompute the centroids of newly formed clusters
❑Repeat steps 3 and 4 (assignment and recomputation) until a stopping condition is met
When to stop Iterating?
❑The centroids of newly formed clusters do not
change
❑The points remain in the same clusters
❑The maximum number of iterations is reached
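The steps and stopping conditions can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the function name `kmeans`, the parameters `max_iter`, `tol` and `seed`, and the choice of Euclidean distance are assumptions made for the example.

```python
import numpy as np

def kmeans(points, k, max_iter=100, tol=1e-6, seed=0):
    """Plain k-means with Euclidean distance (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    # Pick k distinct data points as the initial centroids.
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    labels = np.full(len(points), -1)
    for _ in range(max_iter):  # stop: maximum number of iterations
        # Assign every point to its nearest centroid.
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)
        # Recompute each centroid as the mean of its members
        # (an empty cluster keeps its previous centroid).
        new_centroids = np.array([
            points[new_labels == j].mean(axis=0) if np.any(new_labels == j) else centroids[j]
            for j in range(k)
        ])
        same_labels = np.array_equal(new_labels, labels)               # stop: points stay put
        small_shift = np.allclose(new_centroids, centroids, atol=tol)  # stop: centroids fixed
        centroids, labels = new_centroids, new_labels
        if same_labels or small_shift:
            break
    return centroids, labels

data = np.array([[1, 2], [1.5, 1.8], [5, 8], [8, 8], [1, 0.6], [9, 11]], dtype=float)
centroids, labels = kmeans(data, k=2)
print(centroids)
print(labels)
```

Because the initial centroids are random, different seeds can label the clusters in a different order.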
Working of K-means
❑Sample Dataset:
Objects X Y Z
OB-1 1 4 1
OB-2 1 2 2
OB-3 1 4 2
OB-4 2 1 2
OB-5 1 1 1
OB-6 2 4 2
OB-7 1 1 2
OB-8 2 1 1
❑ We have 8 data points in total and will divide them into 2
clusters, so K=2.
❑ Because K is 2, we start by picking two data points as the
initial centroids.
❑ After choosing the centroids (say C1 and C2), each data point
is assigned to the cluster of its nearest centroid.
❑ Assume that the algorithm chose OB-2 (1,2,2) and OB-6 (2,4,2)
as the centroids of cluster 1 and cluster 2 respectively.
❑ To measure distances, we use the following distance function
(also called a similarity measurement function), known as the
Manhattan distance:
❑ d=|x2–x1|+|y2–y1|+|z2–z1|
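The distance formula translates directly into code. The helper name `manhattan` is an assumption for this sketch; the points are OB-1 and the two initial centroids from the example. Library implementations of K-means normally use Euclidean distance; the sum of absolute differences used here keeps the arithmetic easy to follow by hand.

```python
def manhattan(p, q):
    """Sum of absolute coordinate differences between two points."""
    return sum(abs(a - b) for a, b in zip(p, q))

ob1 = (1, 4, 1)
c1, c2 = (1, 2, 2), (2, 4, 2)
d1 = manhattan(ob1, c1)  # |1-1| + |4-2| + |1-2| = 3
d2 = manhattan(ob1, c2)  # |1-2| + |4-4| + |1-2| = 2
print(d1, d2)
```

OB-1 is closer to C2, so it joins cluster 2 in the first pass.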
Calculation of distances
Object | X | Y | Z | Distance from C1 (1,2,2) | Distance from C2 (2,4,2)
OB-1 | 1 | 4 | 1 | 3 | 2
OB-2 | 1 | 2 | 2 | 0 | 3
OB-3 | 1 | 4 | 2 | 2 | 1
OB-4 | 2 | 1 | 2 | 2 | 3
OB-5 | 1 | 1 | 1 | 2 | 5
OB-6 | 2 | 4 | 2 | 3 | 0
OB-7 | 1 | 1 | 2 | 1 | 4
OB-8 | 2 | 1 | 1 | 3 | 4
Cluster formation
❑ After the initial pass of clustering, the objects are grouped
as follows:
Cluster 1: OB-2, OB-4, OB-5, OB-7, OB-8
Cluster 2: OB-1, OB-3, OB-6
Distance from new Centroids
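❑ Each new centroid is the mean of the objects in its cluster:
C1 = (1.4, 1.2, 1.6) and C2 = (1.33, 4, 1.67).
Object | X | Y | Z | Distance from C1 (1.4, 1.2, 1.6) | Distance from C2 (1.33, 4, 1.67)
OB-1 | 1 | 4 | 1 | 3.8 | 1
OB-2 | 1 | 2 | 2 | 1.6 | 2.67
OB-3 | 1 | 4 | 2 | 3.6 | 0.67
OB-4 | 2 | 1 | 2 | 1.2 | 4
OB-5 | 1 | 1 | 1 | 1.2 | 4
OB-6 | 2 | 4 | 2 | 3.8 | 1
OB-7 | 1 | 1 | 2 | 1 | 3.67
OB-8 | 2 | 1 | 1 | 1.4 | 4.33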
Updated Clusters
❑The new assignments of the objects with respect
to the updated centroids are:
Cluster 1: OB-2, OB-4, OB-5, OB-7, OB-8
Cluster 2: OB-1, OB-3, OB-6
❑The algorithm ends here because the groups did not
change.
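The whole worked example can be replayed with NumPy as a check. This is a sketch that assumes, as the example does, Manhattan distance for assignment and the mean for the centroid update; the variable names are chosen for illustration.

```python
import numpy as np

objects = np.array([
    [1, 4, 1], [1, 2, 2], [1, 4, 2], [2, 1, 2],
    [1, 1, 1], [2, 4, 2], [1, 1, 2], [2, 1, 1],
], dtype=float)                      # OB-1 ... OB-8
centroids = objects[[1, 5]].copy()   # start from OB-2 and OB-6

history = []
for _ in range(10):                  # safety cap on iterations
    # Manhattan distance from every object to every centroid.
    dists = np.abs(objects[:, None, :] - centroids[None, :, :]).sum(axis=2)
    labels = dists.argmin(axis=1)    # 0 -> cluster 1, 1 -> cluster 2
    if history and np.array_equal(labels, history[-1]):
        break                        # assignments unchanged: stop
    history.append(labels)
    centroids = np.array([objects[labels == j].mean(axis=0) for j in range(2)])

cluster1 = [f"OB-{i + 1}" for i in np.flatnonzero(labels == 0)]
cluster2 = [f"OB-{i + 1}" for i in np.flatnonzero(labels == 1)]
print(cluster1, cluster2)
print(centroids.round(2))
```

The loop stops on the second pass, because the assignments match those of the first pass.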
Code Implementation of K-means
import matplotlib.pyplot as plt
from matplotlib import style
import numpy as np

style.use('ggplot')

# Six sample points with two features each
X = np.array([[1, 2],
              [1.5, 1.8],
              [5, 8],
              [8, 8],
              [1, 0.6],
              [9, 11]])

# Scatter plot of the raw, unlabeled points
plt.scatter(X[:, 0], X[:, 1], s=150)
plt.show()
Code Implementation of K-means (cont.)
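from sklearn.cluster import KMeans

# Group the records into 2 clusters (X is the array defined earlier)
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)

# Train the model
kmeans.fit(X)

# Predict the cluster of a new point
labels = kmeans.predict([[20, 8]])
print(labels)

# Coordinates of the learned cluster centers
centroids = kmeans.cluster_centers_
print(centroids)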
Reinforcement Learning
Reinforcement Learning
❑We learn first by interacting with our environment.
❑Whether we are learning to drive a car or learning to walk,
the learning is based on interaction with the
environment.
❑Learning from interaction is a foundational concept
underlying nearly all theories of learning and intelligence.
❑Reinforcement Learning is goal-oriented learning based on
interaction with the environment. It is often described as
a promising route toward general artificial intelligence.
Problem Statement
❑How does a child learn to walk?
Formalizing the Problem
❑ The child is an agent trying to manipulate the environment
(the surface on which he/she walks) by taking actions
(walking), moving from one state (each step he/she takes)
to another.
❑ The child gets a reward (let’s say chocolate) when he/she
accomplishes a subtask (taking a couple of steps) and
receives no chocolate (a negative reward) when he/she is
not able to walk.
❑ This is a simplified description of a reinforcement learning
problem.
Basis of Reinforcement Learning
Differences Between the Kinds of Machine Learning:
Aspect | Supervised | Unsupervised | Reinforcement
Definition | Learns from labeled data | Learns from unlabeled data | Learns by interacting with the environment through actions, discovering errors and rewards
Types of Problems | Regression and classification | Association and clustering | Reward based
Data | Labeled | Unlabeled | No predefined data
Training | External supervision | No supervision | No supervision
Approach | Maps labeled input to known output | Searches for patterns and discovers output | Follows the trial-and-error method
Algorithms | SVM, KNN, Linear Regression, etc. | K-means, C-means | Q-Learning, SARSA
Terminology of Reinforcement Learning:
❑ Agent: An entity (a computer program) that learns from the environment
based on feedback.
❑ Action: A step taken by the agent in response to the current
situation of the environment.
❑ Environment: The surroundings in which the agent acts.
The environment is often stochastic (random) in nature.
❑ State: The situation returned by the environment after each action of the agent.
❑ Reward: Feedback, positive or negative, that the environment gives
for the agent's action.
❑ Policy: The strategy the agent applies to choose its next action based
on the current state.
❑ Value: The expected long-term return, as opposed to the short-term reward.
❑ Q-value: Like value, but it takes the current action as an additional
parameter.
Key Points of Reinforcement Learning:
❑It is based on the trial-and-error method.
❑The agent is not told about the environment or
which step to take next.
❑The agent takes its next action based on the feedback
from previous actions.
❑Rewards and penalties may be delayed.
❑The environment the agent interacts with is stochastic,
and the agent has to reach the destination
while collecting the maximum reward.
How to implement RL in Machine Learning:
❑There are three approaches to implementing RL:
❑Model based learning
o In this approach a virtual model of the environment is created and the agent
explores it. The model differs from one environment to another.
❑Policy based learning
o This approach looks for the optimal strategy (policy) that maximizes
future reward without relying on a value function. A policy can be
of two types: deterministic or stochastic.
❑Value based learning
o In this approach the agent tries to maximize a value function, the
expected long-term return from a state under a policy.
When Not to Use RL?
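❑When you already have enough data to solve the problem with a
supervised learning method
❑When time or computing resources are limited, because RL is a
time-consuming process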
Why use Reinforcement Learning?
❑To train a system that learns from rewards.
❑When the agent has to learn from its own actions.
❑It helps you discover which action yields the
highest reward over the longer period.
❑When we want to find the best method for obtaining
large rewards.
Learning Models of Reinforcement
❑Markov Decision Process
❑Q-learning
❑SARSA (State Action Reward State Action)
❑Deep Q-Network (DQN)
Q-Learning
❑In Q-learning, Q stands for quality.
❑It is a value-based learning method.
❑In this approach the agent maintains a value for each state-action
pair that tells it which action is best to take.
❑For performing an action a in state s, the agent gets a reward
R(s, a) and ends up in a new state; the Q-value combines that
reward with the value of the state it reaches.
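A common way to write the tabular Q-learning update is given here as a sketch. The symbols α (learning rate), γ (discount factor), and s' and a' (next state and next action) are assumptions; they are not defined in the slides.

```latex
Q(s, a) \leftarrow Q(s, a) + \alpha \left[ R(s, a) + \gamma \max_{a'} Q(s', a') - Q(s, a) \right]
```

The bracketed term is the gap between the new estimate (the reward plus the discounted best value of the next state) and the current estimate; α controls how far the Q-value moves toward the new estimate.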
Q-Learning process
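The process can be illustrated on a toy problem. This sketch assumes a hypothetical five-state corridor where only reaching the right-most state pays a reward; the environment, the constants `ALPHA`, `GAMMA` and `EPSILON`, and the epsilon-greedy exploration rule are all choices made for the example.

```python
import random

N_STATES, ACTIONS = 5, (0, 1)          # actions: 0 = left, 1 = right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1  # learning rate, discount, exploration

def step(state, action):
    """Toy corridor: reaching the right-most state pays 1 and ends the episode."""
    nxt = max(0, state - 1) if action == 0 else min(N_STATES - 1, state + 1)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done

random.seed(0)
Q = [[0.0, 0.0] for _ in range(N_STATES)]
for _ in range(200):                    # episodes
    state, done = 0, False
    while not done:
        # Epsilon-greedy: mostly exploit the best known action, sometimes explore.
        if random.random() < EPSILON or Q[state][0] == Q[state][1]:
            action = random.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: Q[state][a])
        nxt, reward, done = step(state, action)
        # Move Q(s, a) toward the reward plus the discounted best next value.
        target = reward + (0.0 if done else GAMMA * max(Q[nxt]))
        Q[state][action] += ALPHA * (target - Q[state][action])
        state = nxt

policy = [max(ACTIONS, key=lambda a: Q[s][a]) for s in range(N_STATES - 1)]
print(policy)
```

After training, the greedy policy read from the Q-table moves right in every non-terminal state.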
Application of RL
Gym environment for Reinforcement Learning:
❑Gym is a Python library for developing
reinforcement learning algorithms.
❑We can install Gym using the following command:
❑pip install gym
The env object provides the following main functions:
❑ The step() function takes an action as an argument and
returns four objects:
o observation: An environment-specific object
representing the observation of the environment.
o reward: A signed float value indicating the gain (or loss) from
the previous action.
o done: A Boolean value indicating whether the episode is finished.
o info: A dictionary of diagnostic information.
❑ The render() function creates a visual representation of the
environment.
❑ The reset() function resets the environment to its initial state.
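To make the reset() and step() contract concrete, here is a toy environment in plain Python. `CoinGuessEnv` is hypothetical and not part of Gym; it only mimics the classic method names and the four-value return of step().

```python
import random

class CoinGuessEnv:
    """Hypothetical toy environment that mimics the classic Gym method names."""

    def __init__(self, max_steps=5, seed=0):
        self.max_steps = max_steps
        self.rng = random.Random(seed)

    def reset(self):
        """Return the environment to its starting state and give the first observation."""
        self.steps = 0
        self.coin = self.rng.randint(0, 1)
        return self.steps                   # observation: number of guesses made so far

    def step(self, action):
        """Apply one action and return (observation, reward, done, info)."""
        reward = 1.0 if action == self.coin else -1.0
        self.steps += 1
        self.coin = self.rng.randint(0, 1)  # flip a new coin for the next step
        done = self.steps >= self.max_steps
        return self.steps, reward, done, {"steps": self.steps}

env = CoinGuessEnv()
observation = env.reset()
total, done = 0.0, False
while not done:
    observation, reward, done, info = env.step(random.randint(0, 1))
    total += reward
print(observation, total, info)
```

An agent only ever sees the values returned by reset() and step(), which is why the same training loop can drive very different environments.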
Implementation
❑ CartPole is one of the most popular introductory environments.
❑ In this game a pole is attached to a cart and we have to keep it
balanced.
❑ The episode ends if the pole tilts more than about 12 degrees from
vertical (older Gym documentation says 15) or the cart moves more than
2.4 units from the center.
❑ It is one of the simplest environments for learning the basics.
❑ The game has only four observations and two actions.
o The actions are to move the cart by applying a force of +1 or -1.
o The observations are the position of the cart, the velocity of
the cart, the angle of the pole, and the rotation rate of the pole.
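The four observations and two actions can be pictured with a hand-written baseline policy. The function `lean_policy` is a hypothetical helper, not part of Gym; it assumes the observation order listed above and Gym's encoding of the actions as 0 (push left) and 1 (push right).

```python
def lean_policy(observation):
    """Push the cart toward the side the pole is leaning (hand-written baseline)."""
    cart_position, cart_velocity, pole_angle, pole_angular_velocity = observation
    # A positive angle means the pole leans right, so push right (1); else push left (0).
    return 1 if pole_angle > 0 else 0

action_right = lean_policy((0.0, 0.0, 0.05, 0.0))   # pole leans right
action_left = lean_policy((0.0, 0.0, -0.05, 0.0))   # pole leans left
print(action_right, action_left)
```

A rule like this ignores three of the four observations, which makes it a useful baseline for comparison with a learned policy.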
Getting environment Of Cartpole
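import gym

# Classic Gym API (gym < 0.26): reset() returns an observation and
# step() returns four values. Newer releases return extra values.
env = gym.make('CartPole-v0')
for i_episode in range(20):
    observation = env.reset()
    for t in range(1000):
        env.render()
        print(observation)
        # Pick a random action from the action space
        action = env.action_space.sample()
        observation, reward, done, info = env.step(action)
        if done:
            print("Episode finished after {} timesteps".format(t + 1))
            break
env.close()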
Data Preprocessing
Why Data Preprocessing
❑Data in the real world is not perfect for learning.
❑It is noisy, dirty and incomplete.
❑Without quality data there are no quality results.
Types of Data:
Types of Data (cont.)
❑ Nominal Data: Categorical values without any order. For example, the color of
cars: black, white, red, blue.
❑ Ordinal Data: Categorical data with a natural order. For example, the size of
clothes: small, medium, large, extra large. Differences between values are
not meaningful; for example, large - medium = small makes no sense.
❑ Interval Data: Numeric values with a defined unit of measurement, where
differences are meaningful but ratios are not. For example, temperature in
degrees Celsius, dates.
❑ Ratio Data: Numeric values with a defined unit of measurement and a true
zero, so both differences and ratios are meaningful. For example, count,
age, mass, length.
❑ Time Series Data: A series of data points indexed in time
order. Most commonly, a time series is a sequence taken at successive,
equally spaced points in time. For example, weather records used for forecasting.
❑ Text Data: This is unstructured data. Text data usually consists of
documents made up of words, sentences or even paragraphs of
free-flowing text.
Data Processing Steps:
❑A dataset is viewed as a collection of data objects.
❑Data objects contain many features.
❑Features are the characteristics of a data object, for
example the color, speed and mileage of a car.
❑These are the basic steps in data processing:
o Data Quality Assessment
o Feature Aggregation
o Feature Sampling
o Dimensionality Reduction
o Feature Encoding
Data Quality assessment:
❑Collected Data may be incomplete and noisy.
❑We cannot completely rely on data acquiring tools.
❑There may be flaws in the data collection process.
❑Raw data contains missing values, duplicate values, and
inconsistent values.
❑We have to tackle all these limitations before going for
machine learning.
Feature aggregation
❑Data is collected from different sources.
❑The data is then aggregated into a single unit.
❑Aggregation also reduces memory consumption.
❑For example, suppose we collect the daily sales records of a
store from multiple places. We can aggregate these records
into monthly or yearly sales.
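The sales example can be sketched with the standard library alone. The records below are made-up values for illustration, and the grouping key (year, month) is one possible choice.

```python
from collections import defaultdict
from datetime import date

# Hypothetical daily sales records: (day, store, amount).
daily_sales = [
    (date(2023, 1, 5), "Store A", 120.0),
    (date(2023, 1, 5), "Store B", 80.0),
    (date(2023, 1, 20), "Store A", 100.0),
    (date(2023, 2, 3), "Store B", 50.0),
]

monthly_sales = defaultdict(float)
for day, _store, amount in daily_sales:
    # Collapse individual days and stores into one total per month.
    monthly_sales[(day.year, day.month)] += amount

print(dict(monthly_sales))
```

Four daily rows become two monthly rows, which is the memory saving the slide refers to.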
Feature Sampling:
❑ A large dataset may come from different sources.
❑ Take a subset of it for the machine learning model.
❑ Choose a sampling algorithm that properly divides the dataset
into a working subset of data.
❑ Take care of imbalanced classes in the dataset.
❑ Some sampling algorithms:
o Simple random sampling.
o Systematic sampling.
o Stratified sampling.
o Clustered sampling.
o Convenience sampling.
o Quota sampling.
o Judgement (or purposive) sampling.
o Snowball sampling.
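Two of the listed methods, simple random and stratified sampling, can be compared on an imbalanced dataset. This is a sketch on made-up data; the helper `stratified_sample` and its parameters are assumptions for the example.

```python
import random
from collections import defaultdict

def stratified_sample(rows, label_of, fraction, seed=0):
    """Draw the same fraction from every class so rare classes are not lost."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row in rows:
        by_class[label_of(row)].append(row)
    sample = []
    for members in by_class.values():
        n = max(1, round(len(members) * fraction))  # keep at least one row per class
        sample.extend(rng.sample(members, n))
    return sample

# Hypothetical imbalanced dataset: 90 "ok" rows and 10 "fraud" rows.
rows = [(i, "ok") for i in range(90)] + [(i, "fraud") for i in range(90, 100)]
simple = random.Random(0).sample(rows, 20)                          # simple random sampling
stratified = stratified_sample(rows, lambda r: r[1], fraction=0.2)  # stratified sampling
fraud_in_stratified = sum(1 for r in stratified if r[1] == "fraud")
print(len(simple), len(stratified), fraud_in_stratified)
```

Stratified sampling always keeps the 9:1 class ratio, while a simple random sample of the same size may over- or under-represent the rare class.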
Dimensionality Reduction:
❑ Datasets often have many dimensions (features), more than a 3D graph can show.
❑ We cannot easily visualize data in higher dimensions.
❑ Dimensionality reduction reduces the number of dimensions of a dataset.
❑ It maps the higher-dimensional space (n dimensions) to a
lower-dimensional space (for example, 2D plots).
❑ A lower-dimensional space is easier to process and visualize.
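One common technique for this mapping is principal component analysis (PCA). The slides do not name a specific method, so PCA is an assumption here, and the data is randomly generated for illustration.

```python
import numpy as np

def pca_project(X, n_components=2):
    """Project rows of X onto the top principal components (via SVD)."""
    X_centered = X - X.mean(axis=0)           # PCA works on mean-centered data
    _, _, vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ vt[:n_components].T   # coordinates in the reduced space

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))                  # hypothetical data: 50 samples, 5 features
X_2d = pca_project(X, n_components=2)
print(X_2d.shape)
```

The two output columns can be passed straight to a 2D scatter plot; the first column carries at least as much variance as the second.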
Feature Encoding:
❑ Machines cannot understand data the way humans do.
❑ We have to convert the dataset into a machine-readable form.
❑ Feature encoding techniques differ for different kinds of
data.
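As an illustration of how the technique depends on the kind of data, ordered categories can be mapped to ranks while unordered categories get one column each. Both helper functions below are hypothetical, written in plain Python for this sketch.

```python
def ordinal_encode(values, order):
    """Map ordered categories to their rank, e.g. small < medium < large."""
    rank = {category: i for i, category in enumerate(order)}
    return [rank[v] for v in values]

def one_hot_encode(values):
    """Give each unordered category its own 0/1 column (columns sorted by name)."""
    categories = sorted(set(values))
    return categories, [[int(v == c) for c in categories] for v in values]

sizes = ordinal_encode(["small", "large", "medium"], order=["small", "medium", "large"])
columns, colors = one_hot_encode(["black", "white", "red", "black"])
print(sizes)
print(columns, colors)
```

Rank encoding a nominal feature such as color would suggest an order that does not exist, which is why one-hot encoding is the usual choice there.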
Data Pre Processing Libraries
# used for handling numbers
import numpy as np
# used for handling the dataset
import pandas as pd
# used for handling missing data
from sklearn.impute import SimpleImputer
# used for encoding categorical data
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
# used for splitting training and testing data
from sklearn.model_selection import train_test_split
# used for feature scaling
from sklearn.preprocessing import StandardScaler
Label Encoder for the Categorical data:
from sklearn import preprocessing

# Categorical feature
weather = ['Sunny', 'Sunny', 'Overcast', 'Rainy', 'Rainy', 'Rainy', 'Overcast',
           'Sunny', 'Sunny', 'Rainy', 'Sunny', 'Overcast', 'Overcast', 'Rainy']

# Create the label encoder
le = preprocessing.LabelEncoder()

# Convert string labels into numbers (classes are numbered in sorted order)
weather_encoded = le.fit_transform(weather)
print(weather_encoded)
Dealing with Missing Values
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

df = pd.DataFrame({"Age": [23, 70, 56, 24, np.nan],
                   "Salary": [30000, 30000, 50000, np.nan, 40000]})
print(df)

# Replace each missing value with the most frequent value of its column
imp = SimpleImputer(missing_values=np.nan, strategy="most_frequent")
X = imp.fit_transform(df)
df1 = pd.DataFrame(X, columns=["Age", "Salary"])
print(df1)
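For numeric columns, filling with the mean is a common alternative to the most frequent value; SimpleImputer offers it as strategy="mean". The sketch below does the same thing by hand with NumPy so the arithmetic is visible; the helper `impute_mean` is hypothetical.

```python
import numpy as np

ages = np.array([23, 70, 56, 24, np.nan])
salaries = np.array([30000, 30000, 50000, np.nan, 40000])

def impute_mean(column):
    """Replace NaN entries with the mean of the observed values."""
    filled = column.copy()
    filled[np.isnan(filled)] = np.nanmean(column)
    return filled

ages_filled = impute_mean(ages)          # mean of 23, 70, 56, 24 is 43.25
salaries_filled = impute_mean(salaries)  # mean of the four known salaries is 37500
print(ages_filled)
print(salaries_filled)
```

The mean is often the more natural fill for a column such as Age, where every observed value occurs only once and the most frequent value is not well defined.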
Scaling Data
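from sklearn.preprocessing import StandardScaler

# Standardize each column to zero mean and unit variance
# (df1 is the imputed DataFrame from the previous step)
sc = StandardScaler(with_mean=True)
X = sc.fit_transform(df1)
X_scaled = pd.DataFrame(X, columns=["Age", "Salary"])
print(X_scaled)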
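What standard scaling computes can be shown by hand: each value is replaced by its z-score, (x - mean) / standard deviation. This sketch assumes the population standard deviation (ddof=0), which is what StandardScaler uses; the helper `standardize` and the sample values are for illustration.

```python
import numpy as np

def standardize(column):
    """Return z-scores: (x - mean) / standard deviation (population std, ddof=0)."""
    return (column - column.mean()) / column.std()

salaries = np.array([30000.0, 30000.0, 50000.0, 30000.0, 40000.0])
z = standardize(salaries)
print(z)
```

After scaling, the column has mean 0 and standard deviation 1, so features measured in very different units (age in years, salary in currency) become comparable.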
Disclaimer
❑This is an educational presentation to enhance the
skill of computer science students.
❑This presentation is available for free to computer
science students.
❑Some internet images from different URLs are used
in this presentation to simplify technical examples
and correlate examples with the real world.
❑We are grateful to owners of these URLs and
pictures.
Thank You!
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
 
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdfTataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
 
Hybridoma Technology ( Production , Purification , and Application )
Hybridoma Technology  ( Production , Purification , and Application  ) Hybridoma Technology  ( Production , Purification , and Application  )
Hybridoma Technology ( Production , Purification , and Application )
 
MENTAL STATUS EXAMINATION format.docx
MENTAL     STATUS EXAMINATION format.docxMENTAL     STATUS EXAMINATION format.docx
MENTAL STATUS EXAMINATION format.docx
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
 
Separation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesSeparation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and Actinides
 
Alper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentAlper Gobel In Media Res Media Component
Alper Gobel In Media Res Media Component
 

Machine Learning (Part 3)

  • 1. www.SunilOS.com 1 Unsupervised Machine Learning www.sunilos.com www.raystec.com
  • 2. Unsupervised Learning
      ❑ Unsupervised machine learning is a type of machine learning in which we have only data points and no labels.
      ❑ We group the data points based on the similarity among them.
          o For example, when arranging a bookshelf that holds different kinds of books, we group the books by subject. In the same way, unsupervised learning iterates through the data and groups together the points that share similar characteristics.
      ❑ The most common unsupervised learning task is clustering.
  • 3. Types of clustering:
      ❑ Partition based
          o Partition-based clustering methods include K-Means, K-Medoids, CLARANS, etc.
      ❑ Hierarchical based
          o Hierarchical clustering methods include BIRCH and Chameleon.
      ❑ Density based
          o DBSCAN and OPTICS are the most popular density-based clustering methods.
  • 4. K-means clustering algorithm
      ❑ K-means is a partition-based clustering method.
      ❑ It is one of the simplest unsupervised learning algorithms for solving the well-known clustering problem.
      ❑ The procedure is a simple way to partition a given data set into a fixed number of clusters (assume k clusters).
  • 5. How K-Means Clustering Works:
      ❑ The K-means algorithm is iterative: it repeatedly recalculates the cluster centroids, refining their values until they no longer change much.
      ❑ The algorithm takes a dataset of n points as input, together with an integer parameter k (supplied by the programmer) that specifies how many clusters to create.
      ❑ The output is a set of k cluster centroids and a labeling of the dataset that maps each data point to exactly one cluster.
  • 6. Steps of K-means clustering:
      ❑ Step 1: Choose the number of clusters k.
      ❑ Step 2: Select k random points from the data as the initial centroids.
      ❑ Step 3: Assign every point to the closest cluster centroid.
      ❑ Step 4: Recompute the centroids of the newly formed clusters.
      ❑ Step 5: Repeat steps 3 and 4 until a stopping criterion is met.
  • 7. When to stop iterating?
      Stop as soon as any one of the following holds:
      ❑ The centroids of the newly formed clusters do not change.
      ❑ The points remain in the same clusters.
      ❑ The maximum number of iterations is reached.
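The steps and stopping criteria above can be sketched in a few lines of NumPy. This is a minimal illustration rather than a reference implementation: the function name `kmeans`, the `max_iter` and `seed` parameters, and the choice of Euclidean distance are assumptions made for the example.

```python
import numpy as np

def kmeans(points, k, max_iter=100, seed=0):
    """Plain k-means on an (n, d) array; returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    # Step 2: pick k distinct data points as the initial centroids.
    start = rng.choice(len(points), size=k, replace=False)
    centroids = points[start].astype(float)
    labels = None
    for _ in range(max_iter):  # stop: maximum number of iterations
        # Step 3: assign every point to its closest centroid (Euclidean here).
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            break  # stop: points remain in the same clusters
        labels = new_labels
        # Step 4: recompute each centroid as the mean of its members.
        new_centroids = centroids.copy()
        for j in range(k):
            members = points[labels == j]
            if len(members):  # keep the old centroid if a cluster is empty
                new_centroids[j] = members.mean(axis=0)
        if np.allclose(new_centroids, centroids):
            break  # stop: centroids do not change
        centroids = new_centroids
    return centroids, labels

# Two well-separated groups of three points each.
data = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)
centroids, labels = kmeans(data, k=2)
print(labels)
```

Because the initial centroids are random, the cluster numbering (which group is 0 and which is 1) can differ between seeds even when the grouping itself is the same.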
  • 8. Working of K-means
      ❑ Sample dataset:
          Object  X  Y  Z
          OB-1    1  4  1
          OB-2    1  2  2
          OB-3    1  4  2
          OB-4    2  1  2
          OB-5    1  1  1
          OB-6    2  4  2
          OB-7    1  1  2
          OB-8    2  1  1
  • 9.
      ❑ We have 8 data points in total and will divide them into 2 clusters, so K = 2.
      ❑ Because K = 2, two initial centroids are needed; any two data points can be chosen.
      ❑ Once the centroids (say C1 and C2) are chosen, every data point is assigned to the cluster of its nearest centroid.
      ❑ Assume that the algorithm chose OB-2 (1, 2, 2) as C1, the centroid of cluster 1, and OB-6 (2, 4, 2) as C2, the centroid of cluster 2.
      ❑ To measure distances, use the following distance function (also called a similarity measure), which is the Manhattan distance:
      ❑ d = |x2 - x1| + |y2 - y1| + |z2 - z1|
  • 10. Calculation of distances
          Object  X  Y  Z  Distance from C1 (1, 2, 2)  Distance from C2 (2, 4, 2)
          OB-1    1  4  1  3                           2
          OB-2    1  2  2  0                           3
          OB-3    1  4  2  2                           1
          OB-4    2  1  2  2                           3
          OB-5    1  1  1  2                           5
          OB-6    2  4  2  3                           0
          OB-7    1  1  2  1                           4
          OB-8    2  1  1  3                           4
  • 11. Cluster formation
      ❑ After the initial pass of clustering, each object belongs to the cluster whose centroid is nearest:
          Cluster 1: OB-2, OB-4, OB-5, OB-7, OB-8
          Cluster 2: OB-1, OB-3, OB-6
  • 13. Distance from new centroids
          Object  X  Y  Z  Distance from C1 (1.4, 1.2, 1.6)  Distance from C2 (1.33, 4, 1.67)
          OB-1    1  4  1  3.8                               1
          OB-2    1  2  2  1.6                               2.67
          OB-3    1  4  2  3.6                               0.67
          OB-4    2  1  2  1.2                               4
          OB-5    1  1  1  1.2                               4
          OB-6    2  4  2  3.8                               1
          OB-7    1  1  2  1                                 3.67
          OB-8    2  1  1  1.4                               4.33
  • 14. Updated clusters
      ❑ The new assignments of the objects with respect to the updated centroids are:
          Cluster 1: OB-2, OB-4, OB-5, OB-7, OB-8
          Cluster 2: OB-1, OB-3, OB-6
      ❑ The algorithm ends here because the cluster memberships did not change.
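The worked example can be checked with a short NumPy script. It uses the eight objects, the two initial centroids (OB-2 and OB-6) and the Manhattan distance from the slides; the helper name `manhattan` and the variable names are only illustrative. The centroid update in the middle is the usual "mean of the cluster members" step, which yields the new centroids used in the second distance table.

```python
import numpy as np

names = ["OB-1", "OB-2", "OB-3", "OB-4", "OB-5", "OB-6", "OB-7", "OB-8"]
points = np.array([
    [1, 4, 1], [1, 2, 2], [1, 4, 2], [2, 1, 2],
    [1, 1, 1], [2, 4, 2], [1, 1, 2], [2, 1, 1],
], dtype=float)

def manhattan(points, centroids):
    """Distance d = |x2-x1| + |y2-y1| + |z2-z1| from every point to every centroid."""
    return np.abs(points[:, None, :] - centroids[None, :, :]).sum(axis=2)

# First pass: OB-2 and OB-6 are the initial centroids C1 and C2.
centroids = points[[1, 5]]
d1 = manhattan(points, centroids)
labels1 = d1.argmin(axis=1)          # 0 means cluster 1, 1 means cluster 2

# Update: each new centroid is the mean of the objects assigned to it.
new_centroids = np.array([points[labels1 == j].mean(axis=0) for j in range(2)])

# Second pass with the updated centroids.
d2 = manhattan(points, new_centroids)
labels2 = d2.argmin(axis=1)

for name, label in zip(names, labels2):
    print(name, "-> cluster", label + 1)
print("new centroids:", new_centroids.round(2))
print("converged:", np.array_equal(labels1, labels2))
```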
  • 15. Code Implementation of K-means
      import matplotlib.pyplot as plt
      from matplotlib import style
      import numpy as np

      style.use('ggplot')

      # Six two-dimensional sample points
      X = np.array([[1, 2], [1.5, 1.8], [5, 8], [8, 8], [1, 0.6], [9, 11]])

      # Scatter plot of the raw points before clustering
      plt.scatter(X[:, 0], X[:, 1], s=150)
      plt.show()
  • 16. Code Implementation of K-means (cont.)
      from sklearn.cluster import KMeans

      # Group the records into 2 clusters
      kmeans = KMeans(n_clusters=2)

      # Fit the model to the array X defined on the previous slide
      kmeans.fit(X)

      # Assign a new point to its nearest cluster
      labels = kmeans.predict([[20, 8]])
      print(labels)

      # Coordinates of the learned cluster centroids
      centroids = kmeans.cluster_centers_
      print(centroids)
  • 18. Reinforcement Learning
      ❑ We first learn by interacting with the environment.
      ❑ Whether we are learning to drive a car or learning to walk, the learning is based on interaction with the environment.
      ❑ Learning from interaction is the foundational concept underlying all theories of learning and intelligence.
      ❑ Reinforcement learning is goal-oriented learning based on interaction with the environment. It is often described as the hope of true artificial intelligence.
  • 19. Problem Statement
      ❑ How does a child learn to walk?
  • 20. Formalizing the Problem
      ❑ The child is an agent trying to manipulate the environment (the surface on which he/she walks) by taking actions (walking), and tries to go from one state (each step he/she takes) to another.
      ❑ The child gets a reward (say, a chocolate) on accomplishing a sub-task (taking a couple of steps) and does not receive any chocolate (a negative reward) when he/she is not able to walk.
      ❑ This is a simplified description of a reinforcement learning problem.
  • 21. Basis of Reinforcement Learning
  • 22. Differences Between the Kinds of Machine Learning:
      Aspect | Supervised | Unsupervised | Reinforcement
      Definition | Learns from labeled data | Learns from unlabeled data | Learns by interacting with the environment through actions and discovering errors and rewards
      Types of problems | Regression and classification | Association and clustering | Reward based
      Data | Labeled | Unlabeled | No predefined data
      Training | External supervision | No supervision | No supervision
      Approach | Map labeled input to known output | Search for patterns and discover the output | Follow the trial-and-error method
      Algorithms | SVM, KNN, Linear Regression | K-means, C-means | Q-Learning, SARSA, etc.
  • 23. Terminology of Reinforcement Learning:
      ❑ Agent: an entity (a computer program) that learns from the environment based on feedback.
      ❑ Action: a step taken by the agent according to the situation (the state of the environment).
      ❑ Environment: the surroundings in which the agent is present and acts. The environment is usually treated as random (stochastic) in nature.
      ❑ State: the situation returned by the environment after each action of the agent.
      ❑ Reward: feedback, positive or negative, given in response to the action of the agent.
      ❑ Policy: the strategy the agent applies to choose its next step based on the current state.
      ❑ Value: the expected long-term return, as opposed to the short-term reward.
      ❑ Q-value: the same as value, but with the current action as an additional parameter.
  • 24. Key Points of Reinforcement Learning:
      ❑ It is based on the trial-and-error method.
      ❑ The agent is not told anything about the environment or about which step to take next.
      ❑ The agent chooses its next action based on the feedback from previous actions.
      ❑ The agent may also receive a delayed penalty.
      ❑ The environment the agent interacts with is random, and the agent has to reach the destination while collecting the maximum reward.
  • 25. How to implement RL in Machine Learning:
      ❑ There are three approaches to implementing RL.
      ❑ Model-based learning
          o In this approach a model of the environment is created and the agent explores this model. A different model is created for each situation.
      ❑ Policy-based learning
          o This approach finds the optimal strategy for maximum future reward without relying on any value function. There can be two types of policy: deterministic and stochastic.
      ❑ Value-based learning
          o In this approach the agent tries to reach the maximum value at any state under any policy.
  • 26. When Not to Use RL?
      ❑ When there is enough data to train the model with a supervised method.
      ❑ When time is limited, because RL is a time-consuming process.
  • 27. Why use Reinforcement Learning?
      ❑ When the system should learn from rewards.
      ❑ When the agent should learn from its own actions.
      ❑ It helps to discover which action yields the highest reward over a longer period.
      ❑ When we want to find the best method for obtaining large rewards.
  • 28. Learning Models of Reinforcement
      ❑ Markov Decision Process
      ❑ Q-learning
      ❑ SARSA (State Action Reward State Action)
      ❑ Deep Q-Network (DQN)
  • 29. Q-Learning
      ❑ In Q-learning, Q stands for quality.
      ❑ It is a value-based learning method.
      ❑ In this approach a value is given to the agent to tell it which action is best to take.
      ❑ For every action it performs, the agent receives a reward R(s, a) and ends up in a new state, so the Q-value depends on both the immediate reward and the value of the resulting state.
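One common way to write this relationship, for a deterministic environment, is Q(s, a) = R(s, a) + γ · max over a' of Q(s', a'), where s' is the state reached after taking action a in state s. Tabular Q-learning approaches it incrementally with the update Q(s, a) ← Q(s, a) + α · (R + γ · max Q(s', ·) − Q(s, a)). The sketch below applies that update to a made-up five-cell corridor. The corridor task, the discount factor γ = 0.9, the learning rate α = 0.5 and all names are assumptions chosen for illustration; none of them come from the slides.

```python
import numpy as np

N_STATES = 5          # corridor cells 0..4; cell 4 is the goal (hypothetical toy task)
ACTIONS = (-1, +1)    # action 0 = step left, action 1 = step right
GAMMA = 0.9           # discount factor (assumed value)
ALPHA = 0.5           # learning rate (assumed value)

def step(state, action):
    """Move along the corridor; the reward is 1.0 only when the goal is reached."""
    next_state = min(max(state + ACTIONS[action], 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done

rng = np.random.default_rng(0)
Q = np.zeros((N_STATES, len(ACTIONS)))   # one Q-value per (state, action) pair

for episode in range(500):
    state = 0
    for _ in range(100):
        # Behave uniformly at random; Q-learning still learns the greedy values.
        action = int(rng.integers(len(ACTIONS)))
        next_state, reward, done = step(state, action)
        # Move Q(s, a) toward R + gamma * max_a' Q(s', a').
        # The goal row of Q is never updated, so its maximum stays 0.
        target = reward + GAMMA * Q[next_state].max()
        Q[state, action] += ALPHA * (target - Q[state, action])
        state = next_state
        if done:
            break

greedy_policy = Q[:-1].argmax(axis=1)   # best action in each non-goal cell
print(Q.round(3))
print(greedy_policy)
```

Reading the learned table row by row shows the values of "step right" shrinking by a factor of γ for every extra cell between the agent and the goal, which is the long-term view that separates a value from an immediate reward.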
  • 32. Gym environment for Reinforcement Learning:
      ❑ Gym is a Python library for developing reinforcement learning algorithms.
      ❑ We can install Gym using the following command:
          pip install gym
  • 33. The env object provides the following main functions:
      ❑ The step() function takes an action object as an argument and returns four objects:
          o observation: an object implemented by the environment, representing the observation of the environment.
          o reward: a signed float value indicating the gain (or loss) from the previous action.
          o done: a Boolean value indicating whether the scenario is finished.
          o info: a dictionary with additional diagnostic information.
      ❑ The render() function creates a visual representation of the environment.
      ❑ The reset() function resets the environment to its original state.
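To make the shape of this interface concrete, here is a minimal sketch of a hand-written environment that exposes reset(), step() and render() in the same four-value style. The class `LineWalkEnv` is hypothetical and is not part of Gym; it only mimics the calling convention described above, and its goal position and step limit are arbitrary.

```python
import random

class LineWalkEnv:
    """Hypothetical toy environment (not part of Gym) with a Gym-like interface."""

    def __init__(self, goal=3, max_steps=20):
        self.goal = goal
        self.max_steps = max_steps
        self.position = 0
        self.steps = 0

    def reset(self):
        """Put the walker back at the origin and return the first observation."""
        self.position = 0
        self.steps = 0
        return self.position

    def step(self, action):
        """Action 1 moves right, anything else moves left."""
        self.position += 1 if action == 1 else -1
        self.steps += 1
        reached_goal = self.position == self.goal
        reward = 1.0 if reached_goal else 0.0
        done = reached_goal or self.steps >= self.max_steps
        info = {"steps": self.steps}
        return self.position, reward, done, info

    def render(self):
        print(f"position={self.position}")

# A random agent interacting with the environment for one episode.
env = LineWalkEnv()
observation = env.reset()
done = False
total_reward = 0.0
while not done:
    action = random.choice([0, 1])
    observation, reward, done, info = env.step(action)
    total_reward += reward
env.render()
```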
  • 34. Implementation
      ❑ The most popular game is CartPole.
      ❑ In this game a pole is attached to a cart and we have to keep it balanced.
      ❑ If the pole tilts more than 15 degrees or the cart moves more than 2.4 meters from the center, the pole is considered to have fallen.
      ❑ This is one of the simplest environments for learning the basics.
      ❑ The game has only four observations and two actions.
          o The actions move the cart by applying a force of +1 or -1.
          o The observations are the position of the cart, the velocity of the cart, the angle of the pole, and the rotation rate of the pole.
  • 35. Getting the CartPole environment
      # Written for the classic Gym API, in which step() returns four values
      import gym

      env = gym.make('CartPole-v0')
      for i_episode in range(20):
          # Start a new episode and get the first observation
          observation = env.reset()
          for t in range(1000):
              env.render()
              print(observation)
              # Pick a random action from the set of valid actions
              action = env.action_space.sample()
              observation, reward, done, info = env.step(action)
              if done:
                  print("Episode finished after {} timesteps".format(t+1))
                  break
      env.close()
  • 37. Why Data Preprocessing
      ❑ Data in the real world is not perfect for learning.
      ❑ It is noisy, dirty and incomplete.
      ❑ Without quality data there are no quality results.
  • 39. Types of Data (cont.)
      ❑ Nominal data: categorical values without any order. For example, the color of cars: black, white, red, blue.
      ❑ Ordinal data: categorical data with a natural order. For example, the size of clothes: small, medium, large, extra large. Differences between the values are not meaningful, so an expression such as large - medium = small makes no sense.
      ❑ Interval data: numeric values with a defined unit of measurement, where differences are meaningful. For example, temperature and dates.
      ❑ Ratio data: numeric values with a defined unit of measurement, where both differences and ratios are meaningful. For example, count, age, mass and length.
      ❑ Time series data: a series of data points indexed in time order. Most commonly, a time series is a sequence taken at successive, equally spaced points in time. For example, the data used in weather forecasting.
      ❑ Text data: unstructured data. Text data usually consists of documents, which can be words, sentences or even paragraphs of free-flowing text.
  • 40. Data Processing Steps:
      ❑ A dataset is viewed as a collection of data objects.
      ❑ Data objects have many features.
      ❑ Features are the characteristics of a data object, for example the color, speed and mileage of a car.
      ❑ These are the basic steps in data processing:
          o Data Quality Assessment
          o Feature Aggregation
          o Feature Sampling
          o Dimensionality Reduction
          o Feature Encoding
  • 41. Data Quality Assessment:
      ❑ Collected data may be incomplete and noisy.
      ❑ We cannot rely completely on data acquisition tools.
      ❑ There may be flaws in the data collection process.
      ❑ Raw data contains missing values, duplicate values and inconsistent values.
      ❑ We have to tackle all these limitations before applying machine learning.
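A quick way to look for the three problems named above (missing, duplicate and inconsistent values) is sketched below with pandas. The small table and its column names are invented for the example.

```python
import numpy as np
import pandas as pd

# Made-up records with one missing age, one repeated row,
# and one city spelled with inconsistent capitalization.
df = pd.DataFrame({
    "age": [23, 70, np.nan, 24, 24],
    "city": ["Indore", "indore", "Bhopal", "Delhi", "Delhi"],
})

missing_per_column = df.isna().sum()              # missing values in each column
duplicate_rows = int(df.duplicated().sum())       # rows identical to an earlier row
raw_city_count = df["city"].nunique()             # distinct spellings as recorded
clean_city_count = df["city"].str.lower().nunique()  # distinct cities after normalizing case

print(missing_per_column)
print("duplicate rows:", duplicate_rows)
print("cities before/after normalizing:", raw_city_count, clean_city_count)
```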
  • 42. Feature Aggregation
      ❑ After collecting data from different sources, aggregate it into a single unit.
      ❑ Aggregation reduces memory consumption.
      ❑ For example, if we collect the daily sales records of a store from multiple places, we can aggregate them into monthly or yearly sales.
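The daily-to-monthly example can be sketched with a pandas group-by. The sales figures, store labels and column names below are made up for illustration.

```python
import pandas as pd

# Daily sales records collected from two stores (invented numbers).
daily = pd.DataFrame({
    "date": pd.to_datetime(
        ["2024-01-05", "2024-01-20", "2024-02-03", "2024-02-28", "2024-02-29"]
    ),
    "store": ["A", "B", "A", "A", "B"],
    "sales": [100, 150, 200, 50, 25],
})

# Collapse the daily rows into one total per calendar month.
monthly = daily.groupby(daily["date"].dt.to_period("M"))["sales"].sum()
print(monthly)
```

Five daily rows become two monthly rows, which is the reduction in size that the slide refers to.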
  • 43. Feature Sampling:
      ❑ The dataset collected from different sources may be large.
      ❑ Take a subset of it for the machine learning model.
      ❑ Choose a sampling algorithm that yields a representative working subset of the data.
      ❑ Take care of imbalanced classes in the dataset.
      ❑ Some sampling algorithms:
          o Simple random sampling
          o Systematic sampling
          o Stratified sampling
          o Clustered sampling
          o Convenience sampling
          o Quota sampling
          o Judgement (or purposive) sampling
          o Snowball sampling
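As a small illustration of why the choice of sampling method matters for imbalanced classes, the sketch below draws a 20% sample from an invented dataset with a 90/10 class split, once by simple random sampling and once by stratified sampling. It uses `DataFrame.sample` from pandas and the `stratify` argument of scikit-learn's `train_test_split`; the data and column names are assumptions.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Invented imbalanced dataset: 90 rows of class 0 and 10 rows of class 1.
df = pd.DataFrame({"feature": range(100), "label": [0] * 90 + [1] * 10})

# Simple random sampling: the class mix of the sample is left to chance.
random_sample = df.sample(n=20, random_state=0)

# Stratified sampling: the sample keeps the 90/10 class proportions.
stratified_sample, _ = train_test_split(
    df, train_size=0.2, stratify=df["label"], random_state=0
)

print(random_sample["label"].value_counts())
print(stratified_sample["label"].value_counts())
```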
  • 44. Dimensionality Reduction:
      ❑ Datasets are often represented in higher dimensions (three or more).
      ❑ We cannot easily visualize data in higher dimensions.
      ❑ So we reduce the dimensions of the dataset.
      ❑ This maps the higher-dimensional space (n dimensions) to a lower-dimensional space (for example, 2D plots).
      ❑ A lower-dimensional space is easier to process and visualize.
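The slides do not name a specific technique; principal component analysis (PCA) is one widely used option and is assumed here purely as an example. The sketch maps randomly generated 5-dimensional points to 2 dimensions with scikit-learn's `PCA`.

```python
import numpy as np
from sklearn.decomposition import PCA

# 50 synthetic points in a 5-dimensional space.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))

# Project the points onto the 2 directions that retain the most variance.
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)

print(X_2d.shape)
print(pca.explained_variance_ratio_)
```

The `explained_variance_ratio_` attribute reports how much of the original variance each retained direction keeps, which helps judge how much information the 2D view loses.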
  • 45. Feature Encoding:
      ❑ Machines cannot understand data the way humans do.
      ❑ We have to convert the dataset into a machine-readable (numeric) form.
      ❑ Feature encoding techniques differ for different kinds of data.
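As a sketch of how the encoding can differ by kind of data, the example below encodes an ordinal feature with an explicit order-preserving mapping and a nominal feature with one-hot columns. The table, the column names and the chosen size order are assumptions made for the example.

```python
import pandas as pd

df = pd.DataFrame({
    "color": ["black", "white", "red", "white"],    # nominal: no order
    "size": ["small", "large", "medium", "small"],  # ordinal: has an order
})

# Ordinal feature: map each category to an integer that preserves the order.
size_order = {"small": 0, "medium": 1, "large": 2}
df["size_code"] = df["size"].map(size_order)

# Nominal feature: one indicator column per category, so no order is implied.
one_hot = pd.get_dummies(df["color"], prefix="color")

print(df)
print(one_hot)
```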
  • 46. Data Preprocessing Libraries
      # used for handling numbers
      import numpy as np
      # used for handling the dataset
      import pandas as pd
      # used for handling missing data
      from sklearn.impute import SimpleImputer
      # used for encoding categorical data
      from sklearn.preprocessing import LabelEncoder, OneHotEncoder
      # used for splitting training and testing data
      from sklearn.model_selection import train_test_split
      # used for feature scaling
      from sklearn.preprocessing import StandardScaler
  • 47. Label Encoder for Categorical Data:
      # Import LabelEncoder
      from sklearn import preprocessing

      # Categorical feature
      weather = ['Sunny', 'Sunny', 'Overcast', 'Rainy', 'Rainy', 'Rainy', 'Overcast',
                 'Sunny', 'Sunny', 'Rainy', 'Sunny', 'Overcast', 'Overcast', 'Rainy']

      # Create the label encoder
      le = preprocessing.LabelEncoder()

      # Convert the string labels into numbers
      weather_encoded = le.fit_transform(weather)
      print(weather_encoded)
  • 48. Dealing with Missing Values
      import pandas as pd
      import numpy as np
      from sklearn.impute import SimpleImputer

      df = pd.DataFrame({"Age": [23, 70, 56, 24, np.nan],
                         "Salary": [30000, 30000, 50000, np.nan, 40000]})
      print(df)

      # Replace each missing value with the most frequent value of its column
      imp = SimpleImputer(missing_values=np.nan, strategy="most_frequent")
      X = imp.fit_transform(df)
      df1 = pd.DataFrame(X, columns=["Age", "Salary"])
      print(df1)
  • 49. Scaling Data
      from sklearn.preprocessing import StandardScaler

      # Standardize each column of df1 (from the previous slide) to zero mean and unit variance
      sc = StandardScaler(with_mean=True)
      X = sc.fit_transform(df1)
      X_scaled = pd.DataFrame(X, columns=["Age", "Salary"])
      print(X_scaled)
  • 50. Disclaimer
      ❑ This is an educational presentation to enhance the skills of computer science students.
      ❑ This presentation is available for free to computer science students.
      ❑ Some internet images from different URLs are used in this presentation to simplify technical examples and relate them to the real world.
      ❑ We are grateful to the owners of these URLs and pictures.