PYTHON WITH
MACHINE LEARNING
ACRANTON TECHNOLOGIES PVT LTD
CONTENT OF THE INTERNSHIP
• Introduction
• Linear Regression with One Variable & Python
functions Programming
• Linear Regression with Multiple Variables
• Logistic Regression
• Support Vector Machines / Unsupervised Learning
• Applying Machine Learning & Python Manipulations
& Intelligence Programming with Mini Project
INTRODUCTION
• Machine learning is about extracting knowledge from data.
• Machine Learning theory is a field that sits at the intersection of statistics,
probability, computer science, and algorithmics.
• Despite the immense possibilities of Machine and Deep
Learning, a thorough mathematical understanding of many of
these techniques is necessary for a good grasp of the inner
workings of the algorithms and getting good results.
Origin of Learning: What is intelligence?
• Ability to comprehend, to understand and profit from
experience.
Three buzz words of ML
• Capability to acquire and apply knowledge
• Mystic connection to the world
• Ability to learn or adapt to a changing world.
We are in an era where…
• "People worry that computers will get too smart and take over the world, but the real problem is
that they're too stupid and they've already taken over the world." (Pedro Domingos)
• Data is the key to unlocking machine learning, as much as machine learning is the key to unlocking
the insight hidden in data.
A Brief History of AI
• 1943: McCulloch and Pitts propose a model of artificial neurons
• 1951: Minsky and Edmonds build the first neural network computer, the SNARC
• The Dartmouth Conference (1956)
• John McCarthy organizes a two-month workshop for researchers interested in neural networks and the study of intelligence
• Agreement to adopt a new name for this field of study: Artificial Intelligence
• 1952-1969 Enthusiasm:
• Arthur Samuel’s checkers player
• Shakey the robot
• Lots of work on neural networks
• 1966-1974 Reality:
• AI problems appear to be too big and complex
• Computers are very slow, very expensive, and have very little memory (compared to today)
• 1969-1979 Knowledge-based systems:
• Birth of expert systems
• Idea is to give AI systems lots of information to start with
• 1980-1988 AI in industry:
• R1 becomes first successful commercial expert system
• Some interesting phone company systems for diagnosing failures of telephone service
• 1990s to the present:
• Increases in computational power (computers are cheaper, faster, and have tons more memory than they used to)
• An example of the coolness of speed: Computer Chess
• 2/96: Kasparov vs Deep Blue: Kasparov victorious: 3 wins, 2 draws, 1 loss
• 5/97: Kasparov vs Deeper Blue: first match won against a world champion: 512 processors: 200 million chess positions per second
Why renewed interest in ML
• Loads of data from sensors and other sources across the
globe!
• Cheap storage (thanks to cloud computing)!
• Lowest-ever computing cost!
Machine Learning is almost everywhere
• Virtual Personal Assistants
• Predictions while Commuting
• Video surveillance
• Self-driving cars
• Online recommendation offers and customer support
• Email Spam and Malware Filtering
• Epidemic Outbreak Prediction
• Online Fraud Detection
• Delayed airplane flights
• Determining which voters to canvass during an election
• Developing pharmaceutical drugs (combinatorial chemistry)
• Identifying human genes that make people more likely to develop cancer
• Predicting housing prices for real estate companies
Traditional programming vs.
machine learning
What is it?
• "Using data" is what is typically referred to as "training", while
• "answering questions" is referred to as "making predictions",
or "inference".
• What connects these two parts together is the model. We
train the model to make increasingly better and more
useful predictions, using our datasets.
• This predictive model can then be deployed to serve up
predictions on previously unseen data.
What is Machine Learning?
• Science of getting computers to learn without being explicitly
programmed.
• The world is filled with data.
• Machine learning brings the promise of deriving meaning from all of that
data.
• Field of computer science that uses statistical techniques to give
computer systems the ability to "learn" with data, without being
explicitly programmed.
What is Machine Learning
One of the ways to define..
Field of computer science that uses statistical techniques to
give computer systems the ability to "learn" with data,
without being explicitly programmed.
Another definition: (Tom Mitchell)
Example: playing checkers.
E = the experience of playing many games of
checkers
T = the task of playing checkers.
P = the probability that the program will win the
next game.
A computer program is said to learn from
experience E with respect to some task T and some
performance measure P if its performance on
T, as measured by P, improves with experience E.
Training set and testing set
• Machine learning is about learning some properties of a data
set and applying them to new data.
• Data is split into two sets:
a training set, on which we learn data properties, and
a testing set, on which we test these properties.
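As a quick sketch (not from the original slides), this is how such a split is commonly done with scikit-learn's train_test_split; the arrays and the 70/30 split below are toy choices for illustration only.

# Minimal train/test split sketch (toy data; the 70/30 split is an arbitrary choice).
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(20).reshape(10, 2)   # 10 samples, 2 features
y = np.arange(10)                  # 10 labels

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

print(X_train.shape, X_test.shape)   # (7, 2) (3, 2)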
Types of Learning
• Supervised (inductive) learning – Given: training data + desired outputs (labels). Learning with a labeled
training set. Example: email classification with already labeled emails
• Unsupervised learning – Given: training data (without desired outputs). Discover patterns in unlabeled data
Example: cluster similar documents based on text
• Reinforcement learning – Rewards from sequence of actions, learn to act based on feedback/reward Example:
learn to play Go, reward: win or lose
Simple ML Program
Installation (Anaconda PowerShell prompt):
pip install numpy
pip install pandas
pip install matplotlib
pip install scikit-learn
pip install scipy
pip install opencv-python
pip install librosa
Sample program: detection of good and bad wine.
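The wine data used in the internship is not included here; as a hedged stand-in, the sketch below uses scikit-learn's bundled wine dataset (three cultivar classes rather than "good"/"bad") with a simple classifier, just to show the overall shape of such a program.

# Assumed stand-in for the sample program: classify wines with scikit-learn's
# bundled wine dataset (not the internship's own good/bad wine data).
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Scale the features, then fit a logistic-regression classifier.
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
clf.fit(X_train, y_train)
print("Test accuracy:", accuracy_score(y_test, clf.predict(X_test)))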
Regression
• Regression searches for relationships among variables.
• Predict a value of a given continuous valued variable based on the values of other
variables, assuming a linear or nonlinear model of dependency.
• Greatly studied in statistics and neural network fields.
Examples: Predicting sales amounts of a new product based on advertising expenditure.
Predicting wind velocities as a function of temperature, humidity, air pressure, etc.
Time series prediction of stock market indices.
Why Regression (contd.)
• The objective of a linear regression model is to find a relationship between one or more
features (independent variables) and a continuous target variable (dependent variable). When there is
only one feature it is called Uni-variate Linear Regression, and if there are multiple features, it is
called Multiple Linear Regression.
• The dependent features are called the dependent variables, outputs, or responses.
• The independent features are called the independent variables, inputs, or predictors.
• Typically, regression is needed to answer whether and how some phenomenon influences another
or how several variables are related.
• Regression is also useful when you want to forecast a response using a new set of predictors.
• Examples: predicting housing prices; regression is used in economics, computer science, the social sciences, and so on. Its
importance rises every day with the availability of large amounts of data and increased awareness of the
practical value of data.
Problem Formulation
• When implementing linear regression of some dependent variable 𝑦 on the set of independent variables
• 𝐱 = (𝑥₁, …, 𝑥ᵣ), where 𝑟 is the number of predictors,
• you assume a linear relationship between 𝑦 and 𝐱: 𝑦 = 𝛽₀ + 𝛽₁𝑥₁ + ⋯ + 𝛽ᵣ𝑥ᵣ + 𝜀. This equation is the regression equation. 𝛽₀, 𝛽₁, …, 𝛽ᵣ are
the regression coefficients, and 𝜀 is the random error.
• Linear regression calculates the estimators of the regression coefficients or simply the predicted weights, denoted with 𝑏₀, 𝑏₁, …, 𝑏ᵣ.
They define the estimated regression function 𝑓(𝐱) = 𝑏₀ + 𝑏₁𝑥₁ + ⋯ + 𝑏ᵣ𝑥ᵣ. This function should capture the dependencies between the
inputs and output sufficiently well.
• The estimated or predicted response, 𝑓(𝐱ᵢ), for each observation 𝑖 = 1, …, 𝑛, should be as close as possible to the corresponding actual
response 𝑦ᵢ. The differences 𝑦ᵢ - 𝑓(𝐱ᵢ) for all observations 𝑖 = 1, …, 𝑛, are called the residuals. Regression is about determining the best
predicted weights, that is the weights corresponding to the smallest residuals.
• To get the best weights, you usually minimize the sum of squared residuals (SSR) for all observations 𝑖 = 1, …, 𝑛: SSR = Σᵢ(𝑦ᵢ - 𝑓(𝐱ᵢ))². This
approach is called the method of ordinary least squares.
• The output (𝑦) can be calculated from a linear combination of the input variables (𝐱). When there is a single input variable, the method is
referred to as simple linear regression.
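As a small illustration of the least-squares idea (not part of the original slides), the numpy sketch below fits 𝑏₀ and 𝑏₁ to a few made-up points by minimizing the SSR.

# Ordinary least squares on toy data: find b minimizing SSR = Σ (y_i - f(x_i))².
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])          # one predictor, n = 5
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

X1 = np.column_stack([np.ones_like(x), x])       # prepend 1s so b[0] is the intercept
b, ssr, *_ = np.linalg.lstsq(X1, y, rcond=None)  # weights minimizing the SSR

print("weights b0, b1:", b)                      # roughly [0.14, 1.96]
print("sum of squared residuals:", ssr)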
Linear Regression with One
Variable & Python functions
Programming
Simple Linear Regression
Simple or single-variate linear regression is the simplest case of linear regression with a
single independent variable, 𝐱 = 𝑥.
The following figure illustrates simple linear regression:
• When implementing simple linear regression,
you typically start with a given set of input-
output (𝑥-𝑦) pairs (green circles).
• The estimated regression function (black line) has
the equation 𝑓(𝑥) = 𝑏₀ + 𝑏₁𝑥.
• The predicted responses (red squares) are the
points on the regression line that correspond to
the input values.
• The residuals (vertical dashed gray lines) can be
calculated as 𝑦ᵢ - 𝑓(𝐱ᵢ) = 𝑦ᵢ - 𝑏₀ - 𝑏₁𝑥ᵢ for 𝑖 = 1, …, 𝑛.
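A minimal scikit-learn sketch of this single-variable case follows; the x and y values are invented, and only the API usage is the point.

# Simple (one-variable) linear regression with scikit-learn on made-up data.
import numpy as np
from sklearn.linear_model import LinearRegression

x = np.array([5, 15, 25, 35, 45, 55]).reshape(-1, 1)   # single feature column
y = np.array([5, 20, 14, 32, 22, 38])

model = LinearRegression().fit(x, y)
print("intercept b0:", model.intercept_)
print("slope b1:", model.coef_[0])
print("predicted responses f(x):", model.predict(x))   # points on the regression line
print("residuals y - f(x):", y - model.predict(x))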
Linear Regression with Multiple Variables
• Multiple or multivariate linear regression is a case of linear regression with two or
more independent variables.
• If there are just two independent variables, the estimated regression function is
𝑓(𝑥₁, 𝑥₂) = 𝑏₀ + 𝑏₁𝑥₁ + 𝑏₂𝑥₂. It represents a regression plane in a three-dimensional
space. The goal of regression is to determine the values of the weights 𝑏₀, 𝑏₁, and
𝑏₂ such that this plane is as close as possible to the actual responses and yields the
minimal SSR.
• The case of more than two independent variables is similar, but more general. The
estimated regression function is 𝑓(𝑥₁, …, 𝑥ᵣ) = 𝑏₀ + 𝑏₁𝑥₁ + ⋯ + 𝑏ᵣ𝑥ᵣ, and there
are 𝑟 + 1 weights to be determined when the number of inputs is 𝑟.
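The two-variable case looks almost identical in code; here is a hedged sketch with invented data.

# Multiple linear regression (two independent variables) with scikit-learn.
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[0, 1], [5, 1], [15, 2], [25, 5],
              [35, 11], [45, 15], [55, 34], [60, 35]])
y = np.array([4, 5, 20, 14, 32, 22, 38, 43])

model = LinearRegression().fit(X, y)
print("b0:", model.intercept_)
print("b1, b2:", model.coef_)
print("f(30, 8) =", model.predict([[30, 8]])[0])   # predict a response for new inputs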
Logistic Regression
• Logistic Regression is used when the
dependent variable (target) is categorical.
• Consider a scenario where we need to
classify whether an email is spam or not.
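A minimal sketch (assumed, not the course's own example): a single invented "spam score" feature with binary labels, fitted with scikit-learn's LogisticRegression.

# Logistic regression on a toy binary problem: 1 = spam, 0 = not spam.
# The single "spam score" feature and the labels are invented for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.array([0.5, 0.9, 1.2, 2.5, 3.1, 3.8, 4.4, 5.0]).reshape(-1, 1)
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

clf = LogisticRegression().fit(X, y)
print("P(spam | score = 2.8):", clf.predict_proba([[2.8]])[0, 1])
print("predicted class:", clf.predict([[2.8]])[0])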
Gradient Descent Algorithm
Gradient Descent Algorithm - Part 1
Gradient Descent Algorithm - Part 2
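The gradient-descent slides themselves are figures not reproduced in this text; as a stand-in, here is a small numpy sketch of batch gradient descent for one-variable linear regression (the learning rate and iteration count are arbitrary choices that would need tuning in practice).

# Batch gradient descent for f(x) = b0 + b1*x, minimizing SSR = Σ (b0 + b1*x_i - y_i)².
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

b0, b1 = 0.0, 0.0        # initial weights
lr = 0.01                # learning rate (assumed)

for _ in range(5000):
    error = b0 + b1 * x - y           # f(x_i) - y_i for every point
    b0 -= lr * 2 * error.sum()        # step along -dSSR/db0
    b1 -= lr * 2 * (error * x).sum()  # step along -dSSR/db1

print(b0, b1)   # approaches the least-squares solution, roughly 0.14 and 1.96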
Unsupervised Learning
• No labels are given to the learning algorithm, leaving it on its own to find
structure in its input. Unsupervised learning can be a goal in itself
(discovering hidden patterns in data) or a means towards an end (feature
learning).
• In some pattern recognition problems, the training data consists of a set of
input vectors x without any corresponding target values. The goal in such
unsupervised learning problems may be to discover groups of similar
examples within the data, which is called clustering, or to determine how
the data is distributed in the space, known as density estimation.
Why Unsupervised Learning
• Annotating large datasets is very costly and hence we can
label only a few examples manually. Example: Speech
Recognition
• There may be cases where we don’t know how many or which
classes the data is divided into. Example: Data Mining
• We may want to use clustering to gain some insight into the
structure of the data before designing a classifier.
What is Clustering
Clustering can be considered the most important unsupervised
learning problem; so, as with every other problem of this kind, it deals with finding
a structure in a collection of unlabeled data. A loose definition of clustering could
be “the process of organizing objects into groups whose members are similar in
some way”. A cluster is therefore a collection of objects that are “similar” to one
another and “dissimilar” to the objects belonging to other clusters.
Goal of Clustering
The goal of clustering is to determine the internal grouping in a set of
unlabeled data. But how to decide what constitutes a good clustering? It can
be shown that there is no absolute “best” criterion which would be independent
of the final aim of the clustering.
Proximity Measures
• For clustering, we need to define a proximity measure for two data
points. Proximity here means how similar/dissimilar the samples are
with respect to each other.
• Similarity measure S(xi,xk): large if xi,xk are similar
• Dissimilarity(or distance) measure D(xi,xk): small if xi,xk are similar
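For concreteness, here is a tiny sketch of one common choice for each measure (Euclidean distance for D and a Gaussian kernel for S); both choices are assumptions, since the slides do not fix a particular pair.

# Example proximity measures: Euclidean dissimilarity and Gaussian similarity.
import numpy as np

def dissimilarity(xi, xk):
    """D(xi, xk): Euclidean distance -- small when xi and xk are similar."""
    return np.linalg.norm(np.asarray(xi, float) - np.asarray(xk, float))

def similarity(xi, xk, sigma=1.0):
    """S(xi, xk): Gaussian (RBF) similarity -- large when xi and xk are similar."""
    return np.exp(-dissimilarity(xi, xk) ** 2 / (2 * sigma ** 2))

print(dissimilarity([1, 2], [1.5, 2.5]))   # about 0.71
print(similarity([1, 2], [1.5, 2.5]))      # about 0.78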
K-Means Clustering
• The procedure follows a simple and easy way to classify a given data set through a certain number of clusters
(assume k clusters) fixed a priori. The main idea is to define k centres, one for each cluster.
• These centroids should be placed in a smart way, because different locations cause different results.
• The next step is to take each point belonging to a given data set and associate it to the nearest centroid.
• At this point we need to re-calculate k new centroids as barycenters of the clusters resulting from the previous
step. After we have these k new centroids, a new binding has to be done between the same data set points and
the nearest new centroid.
• A loop has been generated. As a result of this loop, we may notice that the k centroids change their location step
by step until no more changes occur.
• Finally, this algorithm aims at minimizing an objective function, in this case a squared error function. The objective
function is J(V) = Σi Σj (||xij − vi||)², where the inner sum runs over the ci data points assigned to the ith cluster and
||xij − vi|| is the Euclidean distance between data point xij and cluster centre vi.
Algorithm Steps
The algorithm is composed of the following steps:
• Let X = {x1, x2, x3, …, xn} be the set of data points and V = {v1, v2, …, vc} be the set of
centers.
1) Randomly select ‘c’ cluster centers.
2) Calculate the distance between each data point and the cluster centers.
3) Assign each data point to the cluster center whose distance from it is the minimum over
all the cluster centers.
4) Recalculate each new cluster center using vi = (1/ci) Σj xij, where ‘ci’ represents the number
of data points in the ith cluster and the sum runs over those points.
5) Recalculate the distance between each data point and the newly obtained cluster centers.
6) If no data point was reassigned then stop; otherwise repeat from step 3.
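The steps above map directly onto a short numpy implementation; the sketch below is illustrative only (the toy 2-D points and k = 2 are arbitrary), not the project's own code.

# Plain k-means following the listed steps (toy data, k chosen arbitrarily).
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()  # step 1
    labels = None
    for _ in range(n_iter):
        # steps 2-3 and 5: distance of every point to every center, assign the nearest
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            break                        # step 6: no point was reassigned, stop
        labels = new_labels
        for i in range(k):               # step 4: centers = barycenters of clusters
            if np.any(labels == i):
                centers[i] = X[labels == i].mean(axis=0)
    return centers, labels

X = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
              [5.0, 5.0], [5.2, 4.8], [4.9, 5.3]])
centers, labels = kmeans(X, k=2)
print(centers)   # two cluster centres, near (1, 1) and (5, 5)
print(labels)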
Working on projects
1. AN IMPROVED OF SPAM E-MAIL CLASSIFICATION
MECHANISM USING K-MEANS CLUSTERING
(AnimprovedofspamE-mailclassificationmechanismusingK-meansclustering.pdf)
Terms used
• Training example: a sample from x including its output from the target function
• Target function: the mapping function f from x to f(x)
• Hypothesis: approximation of f, a candidate function.
Example: in e-mail spam classification, it would be the rule we came up with that
allows us to separate spam from non-spam emails.
• Concept: A Boolean target function, positive examples and negative examples
• Classifier: Learning program outputs a classifier that can be used to classify.
• Learner: Process that creates the classifier.
• Hypothesis space: set of possible approximations of f that the algorithm can create.