 What is machine learning?
 Learning system model
 Training and testing
 Performance
 Learning techniques
 Machine learning structure
 Machine learning algorithms
 Machine learning applications
 Conclusion
 Machine learning is a type of artificial intelligence
that allows software applications to become more
accurate in predicting outcomes without being
explicitly programmed.
 A branch of artificial intelligence, concerned with
the design and development of algorithms that
allow computers to evolve behaviors based on
empirical data.
 As intelligence requires knowledge, it is necessary
for computers to acquire knowledge.
 Email spam Filtering
 Online Fraud Detection
 Face Recognition
 Search Engine and Result Refining
 Traffic Predictions
 Product Recommendations
 Image Recognition
 Speech Recognition
 Face detection
 Character detection
 Medical diagnosis
 Web Advertising
 There are several factors affecting the performance:
◦ Types of training provided
◦ The form and extent of any initial background knowledge
◦ The type of feedback provided
◦ The learning algorithms used
 Two important factors:
◦ Modeling
◦ Optimization
 Training is the process of making the system able to
learn.
 No free lunch rule:
◦ Training set and testing set come from the same
distribution
◦ Need to make some assumptions or introduce an inductive bias
 The success of a machine learning system also
depends on the algorithms.
 The algorithms control the search to find and
build the knowledge structures.
 The learning algorithms should extract useful
information from training examples.
 Supervised learning categories and
techniques
◦ Linear classifier (numerical functions)
◦ Parametric (Probabilistic functions)
 Naïve Bayes, Gaussian discriminant
analysis (GDA), Hidden Markov models
(HMM), Probabilistic graphical models
◦ Non-parametric (Instance-based functions)
 K-nearest neighbors, Kernel regression,
Kernel density estimation, Local
regression
◦ Non-metric (Symbolic functions)
 Classification and regression tree (CART)
 Techniques:
◦ Perceptron
◦ Logistic regression
◦ Support vector machine (SVM)
◦ Ada-line
◦ Multi-layer perceptron (MLP)
 Using the perceptron learning algorithm (PLA)
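Below is a minimal NumPy sketch of the perceptron learning algorithm (PLA); the toy data, epoch limit, and update loop are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def pla(X, y, max_epochs=100):
    """Learn weights w so that sign([1, x] @ w) matches labels y in {-1, +1}."""
    X = np.hstack([np.ones((len(X), 1)), X])   # prepend a bias column
    w = np.zeros(X.shape[1])
    for _ in range(max_epochs):
        updated = False
        for xi, yi in zip(X, y):
            if np.sign(xi @ w) != yi:          # misclassified point
                w += yi * xi                   # move the boundary toward it
                updated = True
        if not updated:                        # converged: all points correct
            break
    return w

# Toy linearly separable data (assumed for illustration)
X = np.array([[2.0, 3.0], [1.0, 1.5], [-1.0, -2.0], [-2.0, -1.0]])
y = np.array([1, 1, -1, -1])
print("weights:", pla(X, y))
```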
Machine learning splits into two main branches:
◦ Supervised learning: develop a predictive model based on both input and output data (classification, regression)
◦ Unsupervised learning: group and interpret data based only on input data (clustering)
 Classification is a process related to categorization:
the process in which ideas and objects are
recognized, differentiated, and understood.
 Regression is a technique for determining the
statistical relationship between two or more
variables, where a change in the dependent variable
is associated with, and depends on, a change in one
or more independent variables.
 Clustering is the task of grouping a set of objects in
such a way that objects in the same group
(called a cluster) are more similar (in some
sense) to each other than to those in other
groups (clusters).
 It is a main task of exploratory data mining
and a common technique for statistical data
analysis, used in many fields, including machine
learning, pattern recognition, image analysis,
information retrieval, bioinformatics, data
compression, and computer graphics.
 Supervised learning
◦ Prediction
◦ Classification (discrete labels), Regression (real values)
 Unsupervised learning
◦ Clustering
◦ Probability distribution estimation
◦ Finding association (in features)
◦ Dimension reduction
 Semi-supervised learning
 Reinforcement learning
◦ Decision making (robot, chess machine)
Machine learning techniques:
◦ Supervised learning: concerned with classified (labeled) data
◦ Unsupervised learning: concerned with unclassified (unlabeled) data, e.g. clustering
◦ Semi-supervised learning: concerned with a mixture of classified and unclassified data
◦ Reinforcement learning: no predefined data; the agent learns from interaction
 Supervised learning: learning from known,
labeled data to create a model, then
predicting the target class for given input
data.
1. Linear regression & multiple linear
regression
2. Logistic Regression
3. Polynomial Regression
4. Decision trees
5. Support Vector Machine(SVM)
6. K-nearest Neighbors (KNN)
7. Naive Bayes
8. Random Forest
 Linear regression is a basic and commonly used type of
predictive analysis. It models the relationship between a
dependent variable (Y) and one or more independent
variables.
 Simple linear regression: there is only one
input variable (x).
 Multiple linear regression: there are two or more
input variables (e.g. x1, x2), in which case it
is called multiple regression.
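A minimal scikit-learn sketch of both variants; the toy data below is assumed for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Simple linear regression: one input variable x
x = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([2.1, 4.0, 6.2, 7.9])
simple = LinearRegression().fit(x, y)
print("slope:", simple.coef_, "intercept:", simple.intercept_)

# Multiple linear regression: several input variables x1, x2
X = np.array([[1.0, 0.5], [2.0, 1.0], [3.0, 2.5], [4.0, 3.0]])
multi = LinearRegression().fit(X, y)
print("coefficients:", multi.coef_)
```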
 Logistic regression is based on the sigmoid function,
which was developed by statisticians to describe
the properties of population growth in ecology:
rising quickly and maxing out at the carrying
capacity of the environment.
 It is used to find the probability of event
success and event failure.
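A minimal sketch of the sigmoid function and a scikit-learn logistic regression that outputs success/failure probabilities; the toy data is assumed for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def sigmoid(z):
    # Rises quickly, then saturates at 1 (the "carrying capacity" shape)
    return 1.0 / (1.0 + np.exp(-z))

X = np.array([[0.5], [1.0], [1.5], [3.0], [3.5], [4.0]])
y = np.array([0, 0, 0, 1, 1, 1])            # 0 = failure, 1 = success
clf = LogisticRegression().fit(X, y)
print(clf.predict_proba([[2.0]]))            # [P(failure), P(success)]
print(sigmoid(np.array([-2.0, 0.0, 2.0])))
```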
 The cost is minimized in the same way as in linear
regression.
 For example, a cubic fit with one feature x:
h(x) = θ0 + θ1·x + θ2·x² + θ3·x³ (see the sketch below).
 New features are generated by squaring and cubing the
original feature.
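A minimal sketch of that cubic fit: the features x² and x³ are generated from the original feature and an ordinary linear regression is fitted on them. The data and the use of scikit-learn's PolynomialFeatures are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

x = np.linspace(-2, 2, 20).reshape(-1, 1)
y = 1 + 2 * x[:, 0] - x[:, 0] ** 2 + 0.5 * x[:, 0] ** 3   # illustrative target

cubic = PolynomialFeatures(degree=3, include_bias=False)   # builds x, x², x³
X_poly = cubic.fit_transform(x)
model = LinearRegression().fit(X_poly, y)
print("theta:", model.intercept_, model.coef_)             # recovers 1, 2, -1, 0.5
```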
It is mostly used for classification.
Types of decision tree:
1. Categorical variable decision tree: the target
variable is categorical.
2. Continuous variable decision tree: the target
variable is continuous (a small sketch of both types
follows the advantages list below).
 Easy to understand
 Useful in data exploration
 Less data cleaning required
 Data type is not a constraint
 Non-parametric method
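A minimal scikit-learn sketch of the two tree types (categorical vs. continuous target); the feature values and targets are made-up examples.

```python
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

X = [[25, 40000], [35, 60000], [45, 80000], [20, 20000]]   # e.g. age, income

# Categorical target variable -> classification tree
y_class = ["no", "yes", "yes", "no"]
clf = DecisionTreeClassifier(max_depth=2).fit(X, y_class)
print(clf.predict([[30, 50000]]))

# Continuous target variable -> regression tree
y_cont = [1.2, 2.5, 3.9, 0.8]
reg = DecisionTreeRegressor(max_depth=2).fit(X, y_cont)
print(reg.predict([[30, 50000]]))
```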
 It is a supervised learning algorithm.
 It is mostly used for classification problems.
There are two types of SVM classifiers:
1. Linear SVM
2. Non-linear SVM
 In a linear SVM the data points are separated by
an apparent gap.
 It predicts a straight hyperplane dividing the two
classes.
 This hyperplane is called the maximum-margin
hyperplane.
 In a non-linear SVM the data points are mapped into a
higher-dimensional space.
 There, the kernel trick is used to find the maximum-margin
hyperplane.
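A minimal scikit-learn sketch contrasting a linear SVM with a kernel (non-linear) SVM; the generated blob and circle datasets are illustrative assumptions.

```python
from sklearn.svm import SVC
from sklearn.datasets import make_blobs, make_circles

# Linear SVM: data separated by an apparent gap, straight maximum-margin hyperplane
X_lin, y_lin = make_blobs(n_samples=100, centers=2, random_state=0)
linear_svm = SVC(kernel="linear").fit(X_lin, y_lin)

# Non-linear SVM: the RBF kernel trick handles data that is not linearly separable
X_circ, y_circ = make_circles(n_samples=100, factor=0.3, noise=0.05, random_state=0)
rbf_svm = SVC(kernel="rbf", gamma="scale").fit(X_circ, y_circ)

print(linear_svm.predict(X_lin[:3]), rbf_svm.predict(X_circ[:3]))
```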
 Allows the use of relatively small-parameter
algorithms to redirect a chaotic system to the
target.
 Reduces waiting time for chaotic systems.
 Maintains the performance of systems.
 Face detection
 Text and hypertext categorization
 Classification of images
 Bioinformatics
 Protein fold and remote homology detection
 Handwriting recognition
 Geo and Environmental Sciences
 Generalized predictive control(GPC)
 It is used for both classification and
regression predictive problems.
 It is widely used in industry for classification.
 It predicts the target label by finding the
nearest neighbor class; the closest class is
identified using distance measures such as
Euclidean distance.
 By using a cross-validation technique we can
test the KNN algorithm with different values of K.
 A small value of K means that noise will have a
higher influence on the result, i.e. the
probability of overfitting is very high.
 A large value of K makes the algorithm computationally
expensive and defeats the basic idea behind KNN.
 The KNN classifier is a very simple classifier that
works well on basic recognition problems.
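A minimal sketch of testing KNN with different values of K via cross-validation; the iris dataset and the chosen K values are illustrative assumptions.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
for k in (1, 3, 5, 11, 21):
    knn = KNeighborsClassifier(n_neighbors=k, metric="euclidean")
    scores = cross_val_score(knn, X, y, cv=5)      # 5-fold cross-validation
    print(f"K={k:2d}  mean accuracy={scores.mean():.3f}")
```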
 It is a straightforward and powerful
algorithm for the classification task.
 It works on Bayes' theorem of probability to
predict the class of an unknown data set.
 It is applicable to discrete data.
 Gaussian Naive Bayes is used for continuous values.
 In this classifier the continuous values associated
with each feature are assumed to be
distributed according to a Gaussian
distribution, also called the normal distribution.
 This gives a bell-shaped curve that is symmetric
about the mean of the feature values.
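A minimal Gaussian Naive Bayes sketch; the iris dataset is used only as an illustrative stand-in for continuous-valued features.

```python
from sklearn.datasets import load_iris
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)
gnb = GaussianNB().fit(X, y)       # fits a Gaussian per feature and class
print(gnb.predict(X[:3]))          # predicted classes
print(gnb.theta_[:, 0])            # per-class mean of the first feature
```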
 It is used for both classification and
regression problems.
 The algorithm creates a forest from a
number of decision trees.
 The more trees in the forest, the more robust it
becomes; in general, a higher number of trees
gives more accurate results.
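A minimal sketch showing how accuracy tends to stabilise as more trees are added to the forest; the dataset and tree counts are illustrative assumptions.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)
for n_trees in (5, 50, 200):
    forest = RandomForestClassifier(n_estimators=n_trees, random_state=0)
    print(n_trees, "trees:", cross_val_score(forest, X, y, cv=5).mean())
```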
 Banking
 Medicine
 Stock market
 E-commerce
 Example: decision trees are tools that create
rules
 Prediction of future cases: Use the rule to
predict the output for future inputs
 Knowledge extraction: The rule is easy to
understand
 Compression: The rule is simpler than the
data it explains
 Outlier detection: Exceptions that are not
covered by the rule, e.g., fraud
 Unsupervised learning: learning from unlabeled
data to differentiate and group the given input
data.
 Learning “what normally happens”
 No output
 Clustering: Grouping similar instances
 Other applications: Summarization,
Association Analysis
 Example applications
◦ Customer segmentation in CRM
◦ Image compression: Color quantization
◦ Bioinformatics: Learning motifs
 Step 1 - exploring data
 Step 2 - training the model
 Step 3 - plotting the model
 Vector quantization - image clustering
 Getting ready
 Step 1 - collecting and describing data
 Step 2 - exploring data
 Step 3 - data cleaning
 Step 4 - visualizing cleaned data
 Step 5 - building the model and visualizing it
 Using the perceptron learning algorithm (PLA)
1. K-means clustering
2. Hierarchical clustering
 K-means is an unsupervised learning algorithm, used
when you have unlabeled data.
 The goal of the algorithm is to find groups
in the data, with the number of groups
represented by the variable K.
 The centroids of the K clusters can be used to
label new data.
 Assume we have inputs x1, x2, x3, …, xn
and a value of K.
 Step 1: Pick K random points as cluster centers,
called centroids.
 Step 2: Assign each xi to the nearest cluster by
calculating its distance to each centroid.
 Step 3: Find the new cluster centers by taking the
average of the assigned points.
 Step 4: Repeat steps 2 and 3 until none
of the cluster assignments change (a NumPy sketch
of these steps follows).
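A minimal NumPy sketch of the four steps above; the random data, K = 3, and the iteration cap are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 2)) + rng.integers(0, 5, size=(300, 1))
K = 3

# Step 1: pick K random data points as the initial centroids
centroids = X[rng.choice(len(X), K, replace=False)]

for _ in range(100):
    # Step 2: assign each xi to its nearest centroid
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    # Step 3: new centroid = average of the points assigned to it
    new_centroids = np.array([X[labels == k].mean(axis=0) if np.any(labels == k)
                              else centroids[k] for k in range(K)])
    # Step 4: stop when the assignments (and hence centroids) no longer change
    if np.allclose(new_centroids, centroids):
        break
    centroids = new_centroids

print(centroids)
```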
 Image segmentation
 Clustering gene segmentation data
 News article clustering
 Species clustering
 Anomaly detection
 Hierarchical clustering is a widely used data
analysis tool.
 The idea is to build a binary tree of the data
that successively merges similar groups of
points.
 Visualizing this tree provides a useful
summary of the data.
 Hierarchical clustering only requires a
measure of similarity between groups of
data points.
1. Let X = {x1, x2, x3, ..., xn} be the set of data points.
2. Begin with the disjoint clustering having level L(0) = 0 and
sequence number m = 0.
3. Find the least distance pair of clusters in the current clustering,
say pair (r), (s), according to d[(r),(s)] = min d[(i),(j)] where the
minimum is over all pairs of clusters in the current clustering.
4. Increment the sequence number: m = m +1.Merge clusters (r)
and (s) into a single cluster to form the next clustering m. Set
the level of this clustering to L(m) = d[(r),(s)].
5. Update the distance matrix, D, by deleting the rows and
columns corresponding to clusters (r) and (s) and adding a row
and column corresponding to the newly formed cluster. The
distance between the new cluster, denoted (r,s) and old
cluster(k) is defined in this way: d[(k), (r,s)] = min (d[(k),(r)],
d[(k),(s)]).
6. If all the data points are in one cluster then stop; else repeat
from step 3 (a SciPy sketch of this procedure follows).
Divisive hierarchical clustering is just the reverse of the
agglomerative approach.
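A minimal SciPy sketch of the agglomerative procedure above; single linkage is chosen because it matches the min-distance update in step 5, and the toy data is assumed for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

rng = np.random.default_rng(1)
# Two well-separated groups of points (illustrative)
X = np.vstack([rng.normal(0, 0.3, (10, 2)), rng.normal(3, 0.3, (10, 2))])

# linkage() repeatedly merges the least-distant pair of clusters;
# each row of Z records (r, s, d[(r),(s)], size of the merged cluster)
Z = linkage(X, method="single")

# Cut the binary tree into 2 clusters
labels = fcluster(Z, t=2, criterion="maxclust")
print(labels)
# scipy.cluster.hierarchy.dendrogram(Z) can be used to visualise the tree
```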
 1) No a priori information about the number
of clusters is required.
 2) Easy to implement, and gives the best result in
some cases.
1. The algorithm can never undo what was done previously.
2. A time complexity of at least O(n² log n) is required,
where n is the number of data points.
3. Based on the type of distance matrix chosen for
merging, different algorithms can suffer from one or
more of the following:
i) Sensitivity to noise and outliers
ii) Breaking large clusters
iii) Difficulty handling clusters of different sizes and
convex shapes
4. No objective function is directly minimized.
5. Sometimes it is difficult to identify the correct
number of clusters from the dendrogram.
 Labeled data is used to help identify that specific
groups of webpage types are present in the
data.
 The algorithm is then trained on unlabeled data to
define the boundaries of those webpage types, and
may even identify new types of webpages that
were unspecified in the existing human-provided
labels.
 Semi-supervised learning falls
between unsupervised learning (without any
labeled training data) and supervised learning (with
completely labeled training data).
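A minimal scikit-learn sketch of semi-supervised learning from a few labeled and many unlabeled points; LabelSpreading, the iris data, and the fraction of hidden labels are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.semi_supervised import LabelSpreading

X, y = load_iris(return_X_y=True)
rng = np.random.default_rng(0)
y_partial = y.copy()
unlabeled = rng.random(len(y)) < 0.8          # hide 80% of the labels
y_partial[unlabeled] = -1                     # -1 marks "unlabeled"

# The model uses both the labeled and the unlabeled points to infer labels
model = LabelSpreading().fit(X, y_partial)
acc = (model.transduction_[unlabeled] == y[unlabeled]).mean()
print("accuracy on the hidden labels:", acc)
```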
 Word sense disambiguation
 Document categorization
 Named entity classification
 Sentiment analysis
 Machine translation
 Computer vision
 Object recognition
 Image segmentation
 Bioinformatics
 Protein function prediction
 Cognitive psychology
 In reinforcement learning, the learner is a decision-
making agent that takes actions in an environment
and receives a reward (or penalty) for its actions
while trying to solve a problem.
 After a set of trial-and-error runs, it should learn the
best policy, which is the sequence of actions that
maximizes the total reward.
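A minimal tabular Q-learning sketch of that trial-and-error loop; the corridor environment, learning rates, and episode count are all illustrative assumptions.

```python
import numpy as np

n_states, goal = 6, 5                 # states 0..5, reward only at state 5
actions = [-1, +1]                    # move left / move right
Q = np.zeros((n_states, len(actions)))
alpha, gamma, epsilon = 0.5, 0.9, 0.1
rng = np.random.default_rng(0)

for episode in range(500):            # repeated trial-and-error runs
    s = 0
    while s != goal:
        # epsilon-greedy action choice: mostly exploit, sometimes explore
        a = rng.integers(2) if rng.random() < epsilon else int(Q[s].argmax())
        s_next = min(max(s + actions[a], 0), n_states - 1)
        r = 1.0 if s_next == goal else 0.0
        # Q-learning update: move Q(s, a) toward reward + discounted future value
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
        s = s_next

print("learned policy:", ["left" if q.argmax() == 0 else "right" for q in Q[:goal]])
```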
 Topics:
◦ Policies: what actions should an agent take in a particular
situation
◦ Utility estimation: how good is a state (used by policy)
 No supervised output but delayed reward
 Credit assignment problem (what was responsible for
the outcome)
 Applications:
◦ Game playing
◦ Robot in a maze
◦ Multiple agents, partial observability, ...
 Step 1 - collecting and describing the data
 Step 2 - exploring the data
 Step 3 - preparing the regression model
 Step 4 - preparing the Markov-switching
model
 Step 5 - plotting the regime probabilities
 Step 6 - testing the Markov switching model
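A minimal sketch of preparing a Markov-switching model and extracting the regime probabilities (steps 4 and 5 above); the synthetic two-regime series and the use of statsmodels are assumptions, since the steps do not name a specific tool.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
# Synthetic series: regime 0 has mean 0, regime 1 has mean 3 (illustrative)
y = np.concatenate([rng.normal(0, 1, 100),
                    rng.normal(3, 1, 100),
                    rng.normal(0, 1, 100)])

model = sm.tsa.MarkovRegression(y, k_regimes=2)   # mean switches between 2 regimes
result = model.fit()
print(result.summary())
# Smoothed regime probabilities, ready to plot against time
print(result.smoothed_marginal_probabilities)
```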
 Finance
 Media and advertising
 Text, speech, and dialog systems
 Health and medicine
 Education and training
 Robotics and industrial automation
 HVAC
 Face detection
 Object detection and recognition
 Image segmentation
 Multimedia event detection
 Economical and commercial usage
 We have given a simple overview of some
techniques and algorithms in machine
learning. Furthermore, more and more
applications are adopting machine learning as a
solution. In the future, machine learning will
play an important role in our daily life.
 [1] W. L. Chao, J. J. Ding, “Integrated Machine
Learning Algorithms for Human Age
Estimation”, NTU, 2011.
 UCI Repository:
http://www.ics.uci.edu/~mlearn/MLRepository.html
 UCI KDD Archive:
http://kdd.ics.uci.edu/summary.data.application.html
 Statlib: http://lib.stat.cmu.edu/
 Delve: http://www.cs.utoronto.ca/~delve/
 Journal of Machine Learning Research
www.jmlr.org
 Machine Learning
 IEEE Transactions on Neural Networks
 IEEE Transactions on Pattern Analysis and
Machine Intelligence
 Annals of Statistics
 Journal of the American Statistical Association