This document discusses pattern recognition. It defines a pattern as a set of measurements describing a physical object and a pattern class as a set of patterns sharing common attributes. Pattern recognition involves relating perceived patterns to previously perceived patterns to classify them. The goals are to put patterns into categories and learn to distinguish patterns of interest. Examples of pattern recognition applications include optical character recognition, biometrics, medical diagnosis, and military target recognition. Common approaches to pattern recognition are statistical, neural networks, and structural. The process involves data acquisition, pre-processing, feature extraction, classification, and post-processing. An example of classifying fish into salmon and sea bass is provided.
The document describes two feature extraction methods: attention based and statistics based. The attention based method models how human vision finds salient regions using an architecture that decomposes images into channels and creates image pyramids, then combines the information to generate saliency maps. This method was applied to face recognition but had problems with pose and expression changes. The statistics based method aims to select a subset of important features using criteria based on how well the features represent the original data.
The document discusses the Sutherland-Hodgman polygon clipping algorithm. It clips a polygon by considering it against each boundary edge of the window in turn, passing the polygon's vertices to clipping procedures for the left, right, bottom and top edges. At each stage, it generates a new set of vertices for the partially clipped polygon, which is passed to the next stage. Four cases are considered, depending on whether the previous and current vertices lie inside or outside the window boundary; the corresponding vertices and intersection points are stored in the output vertex list. Once all vertices have been clipped against one boundary, the result is passed to the next boundary, until the fully clipped polygon is produced.
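The four-stage pipeline described above can be sketched in Python (a minimal illustration assuming an axis-aligned rectangular window; the original slides may structure the procedures differently):

```python
def clip_polygon(vertices, window):
    """Sutherland-Hodgman: clip `vertices` against each edge of `window`.

    vertices: list of (x, y) tuples; window: (xmin, ymin, xmax, ymax).
    """
    xmin, ymin, xmax, ymax = window

    # One clipping stage: each window edge is an "inside" test plus an
    # intersection rule; the stage's output feeds the next stage.
    def clip_edge(pts, inside, intersect):
        out = []
        for i, cur in enumerate(pts):
            prev = pts[i - 1]
            if inside(cur):
                if not inside(prev):            # entering: store intersection
                    out.append(intersect(prev, cur))
                out.append(cur)                 # inside vertex is kept
            elif inside(prev):                  # leaving: store intersection only
                out.append(intersect(prev, cur))
        return out

    def x_cross(p, q, x):                       # crossing a vertical boundary
        t = (x - p[0]) / (q[0] - p[0])
        return (x, p[1] + t * (q[1] - p[1]))

    def y_cross(p, q, y):                       # crossing a horizontal boundary
        t = (y - p[1]) / (q[1] - p[1])
        return (p[0] + t * (q[0] - p[0]), y)

    # Left, right, bottom, top stages in sequence.
    for inside, intersect in [
        (lambda v: v[0] >= xmin, lambda p, q: x_cross(p, q, xmin)),
        (lambda v: v[0] <= xmax, lambda p, q: x_cross(p, q, xmax)),
        (lambda v: v[1] >= ymin, lambda p, q: y_cross(p, q, ymin)),
        (lambda v: v[1] <= ymax, lambda p, q: y_cross(p, q, ymax)),
    ]:
        if not vertices:
            break
        vertices = clip_edge(vertices, inside, intersect)
    return vertices

# A unit square straddling the window corner is clipped to the overlap.
print(clip_polygon([(0, 0), (2, 0), (2, 2), (0, 2)], (1, 1, 3, 3)))
```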
Machine Learning and Real-World Applications (MachinePulse)
This presentation was created by Ajay, Machine Learning Scientist at MachinePulse, to present at a Meetup on Jan. 30, 2015. These slides provide an overview of widely used machine learning algorithms. The slides conclude with examples of real world applications.
Ajay Ramaseshan is a Machine Learning Scientist at MachinePulse. He holds a Bachelor's degree in Computer Science from NITK Surathkal and a Master's in Machine Learning and Data Mining from Aalto University School of Science, Finland. He has extensive experience in the machine learning domain and has dealt with various real-world problems.
Pattern Recognition and Machine Learning (Rohit Kumar)
Machine learning involves using examples to generate a program or model that can classify new examples. It is useful for tasks like recognizing patterns, generating patterns, and predicting outcomes. Some common applications of machine learning include optical character recognition, biometrics, medical diagnosis, and information retrieval. The goal of machine learning is to build models that can recognize patterns in data and make predictions.
Supervised learning: discover patterns in the data that relate data attributes with a target (class) attribute.
These patterns are then utilized to predict the values of the target attribute in future data instances.
Unsupervised learning: The data have no target attribute.
We want to explore the data to find some intrinsic structures in them.
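The two settings above can be illustrated concretely with scikit-learn (our library choice; the slides do not name one):

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier
from sklearn.cluster import KMeans

X, y = load_iris(return_X_y=True)

# Supervised: the data attributes X are related to a target attribute y,
# and the learned patterns predict y for future instances.
clf = DecisionTreeClassifier(random_state=0).fit(X, y)
print(clf.predict(X[:3]))       # predicted target values

# Unsupervised: no target attribute; we explore X for intrinsic structure.
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print(km.labels_[:3])           # cluster assignments found from X alone
```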
Hands-On Machine Learning, Chapters 6 & 7: Decision Trees, Ensembles and Random Forests (Jaey Jeong)
This document discusses decision trees and ensemble methods like random forests. It covers decision tree training and visualization using iris datasets. Ensemble methods like bagging, boosting and stacking are introduced. Random forests are ensembles of decision trees that split on a random subset of features at each node. Boosting methods like AdaBoost and gradient boosting aim to boost weak learners into a strong learner by focusing on misclassified samples.
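The ensembles mentioned above can be sketched on the same iris data with scikit-learn (a stand-in for the chapter's own code):

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=42)

# Random forest: many trees, each split chosen from a random feature subset.
rf = RandomForestClassifier(n_estimators=100, max_features="sqrt",
                            random_state=42).fit(X_tr, y_tr)

# AdaBoost: sequential weak learners, reweighting misclassified samples.
ada = AdaBoostClassifier(n_estimators=50, random_state=42).fit(X_tr, y_tr)

print(rf.score(X_te, y_te), ada.score(X_te, y_te))
```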
Supervised Machine Learning With Types And Techniques (SlideTeam)
Supervised Machine Learning with Types and Techniques is aimed at mid-level managers and explains what supervised machine learning is, its types, how it works, and its advantages. It also covers the difference between supervised and unsupervised machine learning, giving a better understanding of supervised machine learning for business growth. https://bit.ly/3ewivHm
- The document discusses a lecture on machine learning given by Ravi Gupta and G. Bharadwaja Kumar.
- Machine learning allows computers to automatically improve at tasks through experience. It is used for problems where the output is unknown and computation is expensive.
- Machine learning involves training a decision function or hypothesis on examples to perform tasks like classification, regression, and clustering. The training experience and representation impact whether learning succeeds.
- Choosing how to represent the target function, select training examples, and update weights to improve performance are issues in machine learning systems.
Introduction to Linear Discriminant Analysis (Jaclyn Kokx)
This document provides an introduction and overview of linear discriminant analysis (LDA). It discusses that LDA is a dimensionality reduction technique used to separate classes of data. The document outlines the 5 main steps to performing LDA: 1) calculating class means, 2) computing scatter matrices, 3) finding linear discriminants using eigenvalues/eigenvectors, 4) determining the transformation subspace, and 5) projecting the data onto the subspace. Examples using the Iris dataset are provided to illustrate how LDA works step-by-step to find projection directions that separate the classes.
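The five steps can be sketched directly with NumPy on the iris data (variable names are illustrative, not taken from the slides):

```python
import numpy as np
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)

# 1) class means (and the overall mean, used for between-class scatter)
means = np.array([X[y == c].mean(axis=0) for c in np.unique(y)])
overall = X.mean(axis=0)

# 2) within-class and between-class scatter matrices
S_w = sum((X[y == c] - means[c]).T @ (X[y == c] - means[c])
          for c in np.unique(y))
S_b = sum(len(X[y == c]) * np.outer(means[c] - overall, means[c] - overall)
          for c in np.unique(y))

# 3) eigen-decomposition of S_w^{-1} S_b yields the linear discriminants
eigvals, eigvecs = np.linalg.eig(np.linalg.inv(S_w) @ S_b)

# 4) keep the top eigenvectors (at most n_classes - 1 are meaningful)
order = np.argsort(eigvals.real)[::-1]
W = eigvecs[:, order[:2]].real

# 5) project the data onto the new subspace
X_lda = X @ W
print(X_lda.shape)      # (150, 2)
```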
This document discusses main applications of machine learning including clustering, classification, and recommendation. It provides examples of each type of application and how they are used. It also discusses failures of early machine learning systems that demonstrated racial or gender bias. Additionally, it outlines the typical machine learning process including feature engineering, learning/training, evaluation, and deployment phases. Key evaluation metrics for classification problems like accuracy, precision and recall are also covered.
This presentation is here to help you understand machine learning, supervised learning, the process flow chart of supervised learning, and the two steps of supervised learning.
Image segmentation techniques
More information on this research can be found in:
Hussein, Rania, and Frederic D. McKenzie. “Identifying Ambiguous Prostate Gland Contours from Histology Using Capsule Shape Information and Least Squares Curve Fitting.” The International Journal of Computer Assisted Radiology and Surgery (IJCARS), Volume 2, Numbers 3-4, pp. 143-150, December 2007.
The Cohen-Sutherland algorithm divides the plane into 9 regions and uses 4-bit codes to encode whether each endpoint of a line segment is left of, right of, above, or below the clipping window. It then uses the endpoint codes to either trivially accept or reject the line segment, or perform clipping by calculating the intersection point of the line with the window boundary and replacing the outside endpoint. The main steps are to assign codes to the endpoints, AND the codes to check for trivial acceptance or rejection, clip by replacing outside endpoints if needed, and repeat for the remaining line segments.
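A minimal Python sketch of the outcode scheme (the bit layout is a common convention, not something the summary specifies):

```python
LEFT, RIGHT, BOTTOM, TOP = 1, 2, 4, 8

def outcode(x, y, xmin, ymin, xmax, ymax):
    """4-bit region code for a point relative to the clipping window."""
    code = 0
    if x < xmin: code |= LEFT
    elif x > xmax: code |= RIGHT
    if y < ymin: code |= BOTTOM
    elif y > ymax: code |= TOP
    return code

def clip_line(x0, y0, x1, y1, xmin, ymin, xmax, ymax):
    """Return the clipped segment, or None if it lies fully outside."""
    c0 = outcode(x0, y0, xmin, ymin, xmax, ymax)
    c1 = outcode(x1, y1, xmin, ymin, xmax, ymax)
    while True:
        if not (c0 | c1):          # both codes 0000: trivially accept
            return (x0, y0, x1, y1)
        if c0 & c1:                # AND nonzero: trivially reject
            return None
        # Replace the outside endpoint with the boundary intersection.
        c = c0 or c1
        if c & TOP:
            x = x0 + (x1 - x0) * (ymax - y0) / (y1 - y0); y = ymax
        elif c & BOTTOM:
            x = x0 + (x1 - x0) * (ymin - y0) / (y1 - y0); y = ymin
        elif c & RIGHT:
            y = y0 + (y1 - y0) * (xmax - x0) / (x1 - x0); x = xmax
        else:                      # LEFT
            y = y0 + (y1 - y0) * (xmin - x0) / (x1 - x0); x = xmin
        if c == c0:
            x0, y0 = x, y
            c0 = outcode(x0, y0, xmin, ymin, xmax, ymax)
        else:
            x1, y1 = x, y
            c1 = outcode(x1, y1, xmin, ymin, xmax, ymax)

# A horizontal line crossing a 10x10 window is clipped at both sides.
print(clip_line(-1, 5, 11, 5, 0, 0, 10, 10))
```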
This document discusses various applications of computer graphics including computer-aided design (CAD), visualization, animation, and computer games. It then describes the frame buffer, which stores pixel information for the screen in memory. Finally, it explains two basic line drawing algorithms - the digital differential analyzer (DDA) line drawing algorithm and Bresenham's line drawing algorithm. The DDA algorithm calculates pixel coordinates by incrementing x or y values based on the slope of the line, while Bresenham's algorithm optimizes for integer coordinates.
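The DDA algorithm described above can be sketched as follows (Bresenham's is omitted for brevity; the rounding helper avoids Python's banker's rounding):

```python
import math

def dda_line(x0, y0, x1, y1):
    """Pixels along a line segment using the DDA algorithm."""
    dx, dy = x1 - x0, y1 - y0
    steps = max(abs(dx), abs(dy))      # step along the faster-changing axis
    if steps == 0:
        return [(x0, y0)]
    x_inc, y_inc = dx / steps, dy / steps   # one increment is +/-1
    x, y = float(x0), float(y0)
    pixels = []
    for _ in range(steps + 1):
        # Round to the nearest pixel (floor(v + 0.5) = conventional rounding).
        pixels.append((math.floor(x + 0.5), math.floor(y + 0.5)))
        x += x_inc
        y += y_inc
    return pixels

print(dda_line(0, 0, 4, 2))
```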
The document discusses the perceptron algorithm, which is a simple neural network used for binary classification. It was invented in 1957 and works by computing weighted inputs and applying a threshold activation function. The perceptron learns by adjusting its weights during the training process. It is computationally efficient but can only learn linearly separable problems and not more complex nonlinear relationships.
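A minimal perceptron in the spirit of this description (weighted inputs, threshold activation, weight adjustment on errors), trained here on an AND gate, a classic linearly separable toy problem:

```python
def train_perceptron(samples, epochs=20, lr=0.1):
    """Train a 2-input perceptron with the classic error-driven rule."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            # Threshold activation applied to the weighted input.
            out = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            err = target - out
            # Weights change only when the prediction is wrong.
            w[0] += lr * err * x1
            w[1] += lr * err * x2
            b += lr * err
    return w, b

AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train_perceptron(AND)
predict = lambda x1, x2: 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
print([predict(x1, x2) for (x1, x2), _ in AND])   # [0, 0, 0, 1]
```

A perceptron trained the same way on XOR would never converge, which is the linear-separability limitation the summary notes.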
Machine Learning using Support Vector Machine (Mohsin Ul Haq)
This document provides an overview of machine learning using support vector machines (SVM). It first defines machine learning as a field that allows computers to learn without explicit programming. It then describes the main types of machine learning: supervised learning using labelled training data, unsupervised learning to find hidden patterns in unlabelled data, and reinforcement learning to maximize rewards. SVM is introduced as a classification algorithm that finds the optimal separating hyperplane between classes with the largest margin. Kernels are discussed as functions that enable SVMs to operate in high-dimensional implicit feature spaces without explicitly computing coordinates.
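These ideas can be illustrated with scikit-learn's SVC (our library choice; the document does not prescribe one):

```python
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two well-separated synthetic classes.
X, y = make_blobs(n_samples=100, centers=2, random_state=0, cluster_std=0.8)

# A linear kernel finds the maximum-margin separating hyperplane;
# only the support vectors determine where it lies.
clf = SVC(kernel="linear", C=1.0).fit(X, y)
print(len(clf.support_vectors_), clf.score(X, y))

# An RBF kernel works in an implicit high-dimensional feature space
# without ever computing coordinates there explicitly.
rbf = SVC(kernel="rbf", gamma="scale").fit(X, y)
print(rbf.score(X, y))
```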
Machine Learning: Applications, Process and Techniques (Rui Pedro Paiva)
Machine learning can be applied across many domains such as business, entertainment, medicine, and software engineering. The document outlines the machine learning process which includes data collection, feature extraction, model learning, and evaluation. It also provides examples of machine learning applications in various domains, such as using decision trees to make credit decisions in business, classifying emotions in music for playlist generation in entertainment, and detecting heart murmurs from audio data in medicine.
Support vector machines are a type of supervised machine learning algorithm used for classification and regression analysis. They work by mapping data to high-dimensional feature spaces to find optimal linear separations between classes. Key advantages are effectiveness in high dimensions, memory efficiency using support vectors, and versatility through kernel functions. Hyperparameters like kernel type, gamma, and C must be tuned for best performance. Common kernels include linear, polynomial, and radial basis function kernels.
The document describes the C4.5 algorithm for building decision trees. It begins with an overview of decision trees and the goals of minimizing tree levels and nodes. It then outlines the steps of the C4.5 algorithm: 1) Choose the attribute that best differentiates training instances, 2) Create a tree node for that attribute and child nodes for each value, 3) Recursively create subordinate nodes until reaching criteria or no remaining attributes. An example applies these steps to build a decision tree to predict customers' responses to a life insurance promotion using attributes like age, income and insurance status.
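scikit-learn implements CART rather than C4.5, but with the entropy criterion it illustrates the same core step of choosing, at each node, the attribute that best differentiates the training instances:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)

# Entropy-based splits, with a depth cap to keep levels and nodes small.
tree = DecisionTreeClassifier(criterion="entropy", max_depth=3,
                              random_state=0).fit(X, y)
print(export_text(tree))   # the induced tree, one line per node
```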
This document discusses feature selection concepts and methods. It defines features as attributes that determine which class an instance belongs to. Feature selection aims to select a relevant subset of features by removing irrelevant, redundant and unnecessary data. This improves learning accuracy, model performance and interpretability. The document categorizes feature selection algorithms as filter, wrapper or embedded methods based on how they evaluate feature subsets. It also discusses concepts like feature relevance, search strategies, successor generation and evaluation measures used in feature selection algorithms.
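A quick sketch of one filter method and one wrapper method using scikit-learn (the estimator choices are illustrative, not from the document):

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, f_classif, RFE
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

# Filter: score each feature independently of any learning model.
filt = SelectKBest(f_classif, k=2).fit(X, y)
print(filt.get_support())        # boolean mask of the 2 selected features

# Wrapper: evaluate feature subsets using the learner itself
# (recursive feature elimination).
rfe = RFE(LogisticRegression(max_iter=1000), n_features_to_select=2).fit(X, y)
print(rfe.support_)
```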
Using Classification and Clustering with Azure Machine Learning Models shows how to use classification and clustering algorithms with Azure Machine Learning.
The document summarizes statistical pattern recognition techniques. It is divided into 9 sections that cover topics like dimensionality reduction, classifiers, classifier combination, and unsupervised classification. The goal of pattern recognition is supervised or unsupervised classification of patterns based on features. Dimensionality reduction aims to reduce the number of features to address the curse of dimensionality when samples are limited. Multiple classifiers can be combined through techniques like stacking, bagging, and boosting. Unsupervised classification uses clustering algorithms to construct decision boundaries without labeled training data.
Three main types of machine learning are supervised learning, unsupervised learning, and reinforcement learning. Supervised learning involves training a model using labeled input/output data where the desired outputs are provided, allowing the model to map inputs to outputs. Unsupervised learning involves discovering hidden patterns in unlabeled data and grouping similar data points together. Reinforcement learning involves an agent learning through trial-and-error interactions with a dynamic environment by receiving rewards or punishments for actions.
Machine learning and its applications was submitted by Bhuvan Chopra to Er. Seema Rani. The document provides an introduction to machine learning, the basic prerequisites for machine learning including algebra, linear algebra, statistics and Python programming. It describes the main types of machine learning including supervised learning, unsupervised learning and reinforcement learning. Finally, it discusses some common applications of machine learning such as virtual personal assistants, video surveillance, social media services, email spam filtering, online customer support, product recommendations, and online fraud detection.
This document provides an introduction to machine learning, including definitions, key concepts, and algorithms. It defines machine learning as giving computers the ability to learn without being explicitly programmed. It distinguishes machine learning from artificial intelligence and describes supervised and unsupervised learning. Popular machine learning algorithms like naive Bayes, support vector machines, and decision trees are introduced. Python libraries for machine learning like scikit-learn are also mentioned.
The Liang-Barsky algorithm is a line clipping algorithm. It is more efficient than the Cohen-Sutherland line clipping algorithm and can be extended to 3-dimensional clipping. It is considered one of the faster parametric line-clipping algorithms. The following concepts are used in this clipping:
The parametric equation of the line.
The inequalities describing the range of the clipping window, which are used to determine the intersections between the line and the clip window.
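Putting the two ingredients together, a Liang-Barsky sketch in Python (parameter names are ours): the line is x(t) = x0 + t*dx, y(t) = y0 + t*dy for t in [0, 1], and each window boundary contributes an inequality p*t <= q.

```python
def liang_barsky(x0, y0, x1, y1, xmin, ymin, xmax, ymax):
    """Return the visible segment, or None if none of it is inside."""
    dx, dy = x1 - x0, y1 - y0
    t0, t1 = 0.0, 1.0
    # p < 0: the line enters through this boundary; p > 0: it leaves.
    for p, q in ((-dx, x0 - xmin), (dx, xmax - x0),
                 (-dy, y0 - ymin), (dy, ymax - y0)):
        if p == 0:
            if q < 0:
                return None        # parallel to and outside this boundary
        else:
            t = q / p
            if p < 0:
                t0 = max(t0, t)    # latest entry point
            else:
                t1 = min(t1, t)    # earliest exit point
    if t0 > t1:
        return None                # no visible portion
    return (x0 + t0 * dx, y0 + t0 * dy, x0 + t1 * dx, y0 + t1 * dy)

print(liang_barsky(-5, 5, 15, 5, 0, 0, 10, 10))
```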
The document discusses issues in knowledge representation for artificial intelligence including how knowledge is represented, the quantitative and qualitative problems with analytical approaches, and issues such as spreading activation, subsumption, and classification. It also provides examples of knowledge representation methodologies and their problems. The document is a submission for a Master's degree that examines AI and neural networks.
This document contains 4 sets of question papers from Jawaharlal Nehru Technological University (JNTU) for the subject Artificial Neural Networks. Each set contains 8 questions and was given to undergraduate students for their supplementary exams in February of either 2007 or 2008. The questions cover topics like perceptrons, learning laws, radial basis functions, associative memory, applications of neural networks, and more.
Self-organizing networks can perform unsupervised clustering by mapping high-dimensional input patterns into a smaller number of clusters in output space through competitive learning. Fixed-weight competitive networks include Maxnet, the Mexican Hat net, and the Hamming net. Maxnet uses winner-take-all competition to select the neuron whose weights best match the input. The Mexican Hat net has both excitatory and inhibitory connections between neurons to enhance contrast. The Hamming net determines which exemplar vector most closely matches the input using the Hamming distance measure.
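A small Maxnet sketch showing winner-take-all via fixed mutual inhibition (the epsilon value and stopping rule are illustrative assumptions; epsilon must stay below 1/m for m units):

```python
import numpy as np

def maxnet(activations, epsilon=0.15, max_iter=100):
    """Iterate fixed-weight mutual inhibition until one unit survives."""
    a = np.array(activations, dtype=float)
    for _ in range(max_iter):
        # Each unit keeps its own activation but is inhibited by the others.
        a = np.maximum(0.0, a - epsilon * (a.sum() - a))
        if (a > 0).sum() <= 1:
            break
    return a

a = maxnet([0.2, 0.4, 0.6, 0.8])
print(a)   # only the unit that started largest remains positive
```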
1. Neural networks are inspired by the human brain and are able to perform complex tasks like pattern recognition much faster than conventional computers. They learn by adjusting the strengths of connections between neurons.
2. The document discusses different types of neural network architectures including single-layer feedforward networks, multilayer feedforward networks, and recurrent networks. Multilayer feedforward networks are commonly used and can be trained with backpropagation.
3. Neural networks operate by receiving inputs, performing computations through interconnected nodes that emulate neurons, and producing outputs. Learning involves modifying the weights between nodes to optimize performance on tasks.
The document discusses two neural network models for reading comprehension tasks: the Attentive Reader model proposed by Herman et al. in 2015 and the Stanford Reader model proposed by Chen et al. in 2016. The author implemented a two-layer attention model inspired by these previous models that achieves a 1.5% higher accuracy on reading comprehension tasks compared to the Stanford Reader.
The document discusses machine learning and neural networks. It begins by explaining that machine learning algorithms take examples of input-output pairs and produce a program or model that can predict the correct output for new inputs. This is unlike traditional programming where humans write specific rules. The document then discusses different types of neural networks including feedforward, recurrent, convolutional and more. It explains concepts like supervised vs unsupervised learning, learning rules, gradient descent, long short term memory networks, and competitive learning.
Rainwater harvesting involves collecting and storing rainwater from rooftops or surfaces before it reaches the aquifer. It has been used for centuries to provide drinking water, livestock water, and irrigation. The document discusses the need for rainwater harvesting due to increasing water demand and declining groundwater levels. It describes various rainwater harvesting techniques like rooftop and surface runoff collection and explains the benefits like free water, groundwater recharge, and self-sufficiency. Rainwater harvesting was practiced in ancient civilizations and remains important for rural water supply. Maintaining quality is important for safe water consumption.
Digital image processing - Image Enhancement (MATERIAL)Mathankumar S
This document discusses various image enhancement techniques including contrast stretching, compression of dynamic range, histogram equalization, and histogram specification. It provides definitions and explanations of these concepts with examples. Histogram equalization aims to produce a linear histogram to enhance an image, while histogram specification allows specifying a desired output histogram. Local enhancement can be achieved by applying these histogram processing methods over small non-overlapping regions instead of globally to reduce edge effects.
This document discusses Recurrent Neural Networks (RNNs) and provides information about different types of RNNs including vanilla RNNs, LSTM RNNs, and GRU RNNs. It covers topics such as backpropagation through time, exploding and vanishing gradients, and the equations that define LSTM and GRU units. The document is a workshop on RNNs presented by Intelligent City Ltd. and their CEO Shindong Kang.
The Hamming Code allows for the detection and correction of single bit errors by adding parity bits to the data word. The parity bits are placed in bit positions that are powers of two. Each parity bit checks some subset of the data/parity bits based on its position, and is set to 1 if the number of ones in the checked bits is odd, or 0 if it is even. To locate the position of an error, the parity bits are checked and the positions of any incorrect ones are summed to find the data/parity bit position with the error.
FISH SEED PRODUCTION & CULTIVABLE FISH SPECIES WITH FISH CUM DUCK FORMINGMathankumar S
This document summarizes information about fish farming in India, including different species of fish that are farmed. It discusses indigenous fish species like various carps, as well as exotic species imported from other countries. For each type of fish, it provides details on physical characteristics, habitat, breeding, and use in aquaculture. The document categorizes fish into groups like indigenous carps, exotic carps, air-breathing fishes, and ornamental fishes. It provides information on commercially important species like various carps, catfish, climbing perch, and tilapia.
Backpropagation is a common supervised learning technique for training artificial neural networks by calculating the gradient of the error in the network with respect to its weights, allowing the weights to be adjusted to minimize error through methods like stochastic gradient descent. It involves performing forward and backward passes through the network, using the error signal to calculate weight updates that reduce error for each connection based on its contribution to the output error. While powerful, backpropagation has limitations such as slow convergence and susceptibility to getting stuck in local minima.
Biological control systems - Time Response Analysis - S.Mathankumar-VMKVECMathankumar S
Biological control systems - Time Response Analysis - Step and Impulse responses of first order and second order systems, Determination of time domain specifications of first and second order systems from its output responses.
The document describes a back propagation network, which is a multilayer artificial neural network that uses a supervised learning method called backward propagation of errors. The network has at least three layers - an input layer, one or more hidden layers, and an output layer. It initializes weights randomly, then performs forward propagation to calculate outputs. It calculates errors between outputs and targets, then propagates the errors back through the network to adjust the weights, in order to minimize errors through iterative training. Sigmoid activation functions are commonly used. Autoassociation is also described, where patterns are compressed in the hidden layer and reconstructed at the output layer.
This document summarizes information about fish farming in India, including different species of fish that are farmed. It discusses indigenous fish species like various carps, as well as exotic species imported from other countries. For each type of fish, it provides details on physical characteristics, habitat, breeding, and use in aquaculture. The document categorizes fish into groups like indigenous carps, exotic carps, air-breathing fishes, and ornamental fishes. It provides information on commercially important species like various carps, catfish, climbing perch, and tilapia.
The document discusses character recognition using convolutional neural networks. It begins with an introduction to classifiers and gradient-based learning methods. It then describes how multiple perceptrons can be combined into a multilayer perceptron and trained using backpropagation. Next, it introduces convolutional neural networks, which offer improvements over multilayer perceptrons in performance, accuracy, and distortion invariance. It provides details on the topology and training of convolutional neural networks. Finally, it discusses the LeNet-5 convolutional neural network and its successful application to handwritten digit recognition.
Islamic University Pattern Recognition & Neural Network 2019 Rakibul Hasan Pranto
The document discusses various topics related to pattern recognition including:
1. Pattern recognition is the automated recognition of patterns and regularities in data through techniques like machine learning. It has applications in areas like optical character recognition, diagnosis systems, and security.
2. There are two main approaches to pattern recognition - sub-symbolic and symbolic. Sub-symbolic uses connectionist models like neural networks while symbolic uses formal structures like strings and automata to represent patterns.
3. A pattern recognition system consists of steps like data acquisition, pre-processing, feature extraction, model learning, classification, and post-processing to classify patterns. Bayesian decision making and Bayes' theorem are statistical techniques used in classification.
UNIT 3: Data Warehousing and Data MiningNandakumar P
UNIT-III Classification and Prediction: Issues Regarding Classification and Prediction – Classification by Decision Tree Introduction – Bayesian Classification – Rule Based Classification – Classification by Back propagation – Support Vector Machines – Associative Classification – Lazy Learners – Other Classification Methods – Prediction – Accuracy and Error Measures – Evaluating the Accuracy of a Classifier or Predictor – Ensemble Methods – Model Section.
This document provides an overview of clustering and k-means clustering algorithms. It begins by defining clustering as the process of grouping similar objects together and dissimilar objects separately. K-means clustering is introduced as an algorithm that partitions data points into k clusters by minimizing total intra-cluster variance, iteratively updating cluster means. The k-means algorithm and an example are described in detail. Weaknesses and applications are discussed. Finally, vector quantization and principal component analysis are briefly introduced.
Unit-1 Introduction and Mathematical Preliminaries.pptxavinashBajpayee1
This document provides an introduction to pattern recognition and classification. It discusses key concepts such as patterns, features, classes, supervised vs. unsupervised learning, and classification vs. clustering. Examples of pattern recognition applications are given such as handwriting recognition, license plate recognition, and medical imaging. The main phases of developing a pattern recognition system are outlined as data collection, feature choice, model choice, training, evaluation, and considering computational complexity. Finally, some relevant basics of linear algebra are reviewed.
introduction to machine learning 3c-feature-extraction.pptxPratik Gohel
This document discusses feature extraction and dimensionality reduction techniques. It begins by defining feature extraction as mapping a set of features to a reduced feature set that maximizes classification ability. It then explains principal component analysis (PCA) and how it works by finding orthogonal directions that maximize data variance. However, PCA does not consider class information. Linear discriminant analysis (LDA) is then introduced as a technique that finds projections by maximizing between-class distance and minimizing within-class distance to better separate classes. LDA thus provides a "good projection" for classification tasks.
Comparision of methods for combination of multiple classifiers that predict b...IJERA Editor
Predictive analysis include techniques fromdata mining that analyze current and historical data and make
predictions about the future. Predictive analytics is used in actuarial science, financial services, retail, travel,
healthcare, insurance, pharmaceuticals, marketing, telecommunications and other fields.Predicting patterns can
be considered as a classification problem and combining the different classifiers gives better results. We will
study and compare three methods used to combine multiple classifiers. Bayesian networks perform
classification based on conditional probability. It is ineffective and easy to interpret as it assumes that the
predictors are independent. Tree augmented naïve Bayes (TAN) constructs a maximum weighted spanning tree
that maximizes the likelihood of the training data, to perform classification.This tree structure eliminates the
independent attribute assumption of naïve Bayesian networks. Behavior-knowledge space method works in two
phases and can provide very good performances if large and representative data sets are available.
This document summarizes the analysis of a movie review sentiment dataset using various classification algorithms. It describes extracting features from the dataset, loading it into a dataframe, and applying logistic regression, decision trees, random forests, SVM, k-NN, and Naive Bayes classifiers. Random forest achieved the highest accuracy of 0.6611. Logistic regression had the second highest at 0.6705. The document also discusses counting words by sentiment and visualizing the results.
Cluster analysis, or clustering, is the process of grouping data objects into subsets called clusters so that objects within a cluster are similar to each other but dissimilar to objects in other clusters. There are several approaches to clustering, including partitioning, hierarchical, density-based, and grid-based methods. The k-means and k-medoids algorithms are popular partitioning methods that aim to partition observations into k clusters by minimizing distances between observations and cluster centroids or medoids. K-medoids is more robust to outliers as it uses actual observations as cluster representatives rather than centroids. Both methods require specifying the number of clusters k in advance.
SVM Based Identification of Psychological Personality Using Handwritten Text IJERA Editor
This document describes a study that uses handwriting analysis to identify psychological personality traits using support vector machines (SVM). Handwriting samples were collected and preprocessed by removing noise and segmenting lines. Features like slope, shape, and edge histograms were extracted. SVM with radial basis function kernel was used for classification. Analysis of single lines achieved 95% accuracy while multiple lines achieved 91% accuracy in identifying traits like cheerfulness and weariness. The methodology was also applied to analyze handwriting of celebrities and compare the results to analyses by graphologists. The study aims to automate handwriting analysis using machine learning techniques.
Automated attendance system based on facial recognitionDhanush Kasargod
A MATLAB based system to take attendance in a classroom automatically using a camera. This project was carried out as a final year project in our Electronics and Communications Engineering course. The entire MATLAB code I've uploaded it in mathworks.com. Also the entire report will be available at academia.edu page. Will be delighted to hear from you.
This document provides an overview of knowledge representation techniques and object recognition. It discusses syntax and semantics in representation, as well as descriptions, features, grammars, languages, predicate logic, production rules, fuzzy logic, semantic nets, and frames. It then covers statistical and cluster-based pattern recognition methods, feedforward and backpropagation neural networks, unsupervised learning including Kohonen feature maps, and Hopfield neural networks. The goal is to represent knowledge in a way that enables object classification and decision-making.
This document provides an overview of knowledge representation techniques and object recognition. It discusses syntax and semantics in representation, as well as descriptions, features, grammars, languages, predicate logic, production rules, fuzzy logic, semantic nets, and frames. It then covers statistical and cluster-based pattern recognition methods, feedforward and backpropagation neural networks, unsupervised learning including Kohonen feature maps, and Hopfield neural networks. The goal is to represent knowledge in a way that enables object classification and decision-making.
This document discusses various clustering techniques used in data mining. It begins by defining clustering as an unsupervised learning technique that groups similar objects together. It then discusses advantages of clustering such as quality improvement and reuse opportunities. Several clustering methods are described such as K-means clustering, which aims to partition observations into k clusters where each observation belongs to the cluster with the nearest mean. The document concludes by discussing advantages of K-means clustering such as its linear time complexity and its use for spherical cluster shapes.
Supervised learning uses labeled training data to predict outcomes for new data. Unsupervised learning uses unlabeled data to discover patterns. Some key machine learning algorithms are described, including decision trees, naive Bayes classification, k-nearest neighbors, and support vector machines. Performance metrics for classification problems like accuracy, precision, recall, F1 score, and specificity are discussed.
The method of identifying similar groups of data in a data set is called clustering. Entities in each group are comparatively more similar to entities of that group than those of the other groups.
Chapter 10. Cluster Analysis Basic Concepts and Methods.pptSubrata Kumer Paul
Jiawei Han, Micheline Kamber and Jian Pei
Data Mining: Concepts and Techniques, 3rd ed.
The Morgan Kaufmann Series in Data Management Systems
Morgan Kaufmann Publishers, July 2011. ISBN 978-0123814791
Digital image classification involves sorting pixels into discrete classes based on their spectral values. It can be performed using supervised or unsupervised approaches. Supervised classification involves using training data to define classes, while unsupervised classification uses algorithms to automatically group similar pixels. Accuracy assessment involves comparing the classification to reference data to determine accuracy through an error matrix.
This document summarizes chapter 10 of the book "Data Mining: Concepts and Techniques" which discusses cluster analysis. The chapter covers basic concepts of cluster analysis including partitioning, hierarchical, density-based and grid-based methods. It describes popular partitioning algorithms like k-means and k-medoids, and notes that k-means can be sensitive to outliers while k-medoids uses medioids which are less sensitive to outliers. The chapter also discusses evaluating clustering quality and major considerations for cluster analysis.
This chapter discusses different clustering methods including partitioning, hierarchical, density-based, and grid-based approaches. Partitioning methods like k-means and k-medoids aim to partition observations into k clusters by optimizing some objective function. Hierarchical clustering builds a hierarchy of clusters based on distance between observations. Density-based methods identify clusters based on density rather than distance. Grid-based methods quantize the space into finite number of cells that form clusters.
This document provides an overview of machine learning techniques using R. It discusses regression, classification, linear models, decision trees, neural networks, genetic algorithms, support vector machines, and ensembling methods. Evaluation metrics and algorithms like lm(), rpart(), nnet(), ksvm(), and ga() are presented for different machine learning tasks. The document also compares inductive learning, analytical learning, and explanation-based learning approaches.
Distributed system is an application that executes a collection of protocols to coordinate the actions of multiple processes on a communication network, such that all components cooperate together to perform a single or small set of related tasks.
This document contains questions and answers related to finite automata theory. It begins with definitions of finite automata and their uses in text processing, compiler design, and hardware design. It then provides the formal definition of a deterministic finite automaton (DFA) and explains how a DFA processes strings using the transition function. Examples of DFAs are given for specific languages. Extended transition functions are described, along with examples of using them to represent languages. Specific languages are also given with their corresponding DFA diagrams and formal definitions.
An operating system (OS) is software that manages computer hardware and software. Common desktop operating systems include Windows, Mac OS X, and Linux.
Artificial Intelligence is a way of making a computer, a computer-controlled robot, or a software think intelligently, in the similar manner the intelligent humans think.
1. The document discusses converting between different number systems including binary, decimal, octal and hexadecimal.
2. Methods are provided for converting integers and decimals from one base to another by breaking down the numbers into place values and recombining in the target base.
3. Examples are given of converting specific numbers between bases such as decimal to binary and vice versa.
This document discusses different types of programming languages used in computer science. It describes machine language as the lowest-level language that uses binary digits to write instructions. Assembly language, introduced in 1950, uses symbolic codes to write programs more easily than machine language. Higher-level languages like C and C++ allow writing programs in a more intuitive way using words and symbols. The document provides examples of advantages and disadvantages of different language types.
The document discusses various topics related to computer networks including business applications, home applications, mobile users, and social issues. It then covers network hardware classifications including personal area networks, local area networks, metropolitan area networks, and wide area networks. The document also discusses network software topics such as protocol hierarchies, connection-oriented vs connectionless services, service primitives, and the relationship between services and protocols. It concludes with sections on reference models including the OSI and TCP/IP models.
This document provides an alphabetical list of terms related to cyber crimes, beginning with "Anonymizer" and ending with "Zombie". Each term is defined in 1-2 paragraphs. Some key terms summarized include:
- Anonymizer - A tool that hides a user's identity and location when browsing the internet. It can enable criminal behavior by avoiding consequences.
- ARP cache poisoning - A technique where an attacker sends fake ARP messages to intercept and alter network data like passwords or credit card numbers.
- Cyber stalking - The use of electronic devices to stalk or harass someone repeatedly in a threatening manner. Most states have laws against cyber stalking.
- DOS/DDOS attacks -
The document contains questions and answers related to compiler design topics such as parsing, grammars, syntax analysis, error handling, derivation, sentential forms, parse trees, ambiguity, left and right recursion elimination etc. Key points discussed are:
1. The role of parser is to verify the string of tokens generated by lexical analyzer according to the grammar rules and detect syntax errors. It outputs a parse tree.
2. Common parsing methods are top-down, bottom-up and universal. Top-down methods include LL, LR. Bottom-up methods include LR, LALR.
3. Errors can be lexical, syntactic, semantic and logical detected by different compiler phases. Error recovery strategies include panic mode
This document discusses concepts related to industrial management including administration, management, organization, and authority. It defines administration as decision making, policy making, and adjustments. Management is concerned with carrying out operations to accomplish aims. Organization determines duties and maintains authority relationships. Authority is the power to give orders and make decisions, while responsibility is the obligation to complete work.
This document contains questions and answers about computer graphics. It begins by defining computer graphics as pictures and movies created using computers, usually referring to image data created with specialized graphics hardware and software. Applications of computer graphics mentioned include computer-aided design, presentation graphics, computer art, entertainment, education and training, visualization, image processing, and graphical user interfaces. Key terms like pixel, resolution, aspect ratio, and persistence are also defined. The document then discusses video display devices and CRTs, and explains raster scan and random scan display systems. Color CRTs using beam penetration and shadow mask techniques are also covered.
IEEE-488, also known as GPIB or HP-IB, is a digital communications bus standard developed by Hewlett-Packard in the 1960s to connect instruments and controllers. It uses a 24-pin connector and defines 16 signal lines for bi-directional data transfer, bus management, and handshaking between devices. Up to 15 devices can be connected to a single bus with a maximum data rate of 1 MB/sec. Communication is done digitally by sending bytes over the data lines using hardware handshaking signals to control data flow.
The document provides an overview of HTML (Hypertext Markup Language) and some basic HTML tags. It defines HTML as a markup language used to structure and present content for display on the web. It explains common HTML tags like <html>, <head>, <body>, <p>, <b>, <i> and <br> and how they are used to create headings, paragraphs, bold and italic text, and line breaks in a web page. It also gives examples of how these tags can be implemented.
Pattern recognition
@ Ashek Mahmud Khan; Dept. of CSE (JUST); 01725-402592
Pattern recognition is a branch of machine learning that focuses on the recognition of patterns and
regularities in data, although it is in some cases considered to be nearly synonymous with machine learning.
Pattern recognition systems are in many cases trained from labeled "training" data.
Pattern recognition is the scientific discipline that concerns the description and classification of patterns.
• Decision making
• Object and pattern recognition
Pattern Recognition applications
Build a machine that can recognize patterns:
• Speech recognition
• Fingerprint identification
• OCR (Optical Character Recognition)
• DNA sequence identification
• Text classification
Basic Structure
The task of the pattern recognition system is to classify an object into a correct class based on measurements of the object. Note that the possible classes are usually well defined before the pattern recognition system is designed. Many pattern recognition systems can be thought of as consisting of five stages:
1. Sensing (measurement);
2. Pre-processing and segmentation;
3. Feature extraction;
4. Classification;
5. Post-processing
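The five stages above can be sketched as a simple pipeline. Every stage function below is a hypothetical toy placeholder (the notes do not prescribe any particular implementation):

```python
# A toy sketch of the five-stage pattern recognition pipeline.
# Every stage below is a hypothetical placeholder, not a real implementation.

def sense(raw):
    """1. Sensing: obtain a measurement (here, just a raw string)."""
    return raw

def preprocess(data):
    """2. Pre-processing / segmentation: noise suppression, normalization."""
    return data.strip().lower()

def extract_features(segment):
    """3. Feature extraction: reduce the data to a small feature vector."""
    return [len(segment), segment.count("a")]

def classify(features):
    """4. Classification: map the feature vector to a class label."""
    return "class_A" if features[0] > 5 else "class_B"

def postprocess(label):
    """5. Post-processing: decide an action from the classification result."""
    return "action_for_" + label

def recognize(raw):
    return postprocess(classify(extract_features(preprocess(sense(raw)))))
```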
Sensing
Sensing refers to some measurement or observation about the object to be classified. For example, the data
can consist of sounds or images and sensing equipment can be a microphone array or a camera.
Pre-processing
Pre-processing refers to filtering the raw data for noise suppression and to other operations performed on the raw data to improve its quality. In segmentation, the measurement data is partitioned so that each part represents exactly one object to be classified. For example, in address recognition, an image of the whole address needs to be divided into images each representing just one character.
Feature extraction
Feature extraction reduces the measurements to a set of descriptive features, because, especially when dealing with pictorial information, the amount of data per object can be huge: a high-resolution facial photograph (for face recognition) can contain 1024 × 1024 pixels.
Classification
The classifier takes as input the feature vector extracted from the object to be classified. It then assigns the feature vector (i.e., the object) to the most appropriate class. In address recognition, the classifier receives the features extracted from a sub-image containing just one character and places it in one of the following classes: 'A', 'B', 'C', ..., '0', '1', ..., '9'. The classifier can be thought of as a mapping from the feature space to the set of possible classes.
Post-processing
A pattern recognition system rarely exists in a vacuum. The final task of the pattern recognition system is to decide upon an action based on the classification result(s). A simple example is a bottle recycling machine, which places bottles and cans into the correct boxes for further processing.
The Design Cycle
• Data collection
• Feature Choice
• Model Choice
• Training
• Evaluation
• Computational Complexity
Data Collection
How do we know when we have collected an adequately large and representative set of examples for
training and testing the system?
Feature Choice
Depends on the characteristics of the problem domain. Good features are simple to extract, invariant to irrelevant transformations, and insensitive to noise.
Model Choice
If we are unsatisfied with the performance of our fish classifier, we may want to jump to another class of model.
Training
Use data to determine the classifier. There are many different procedures for training classifiers and choosing models.
Evaluation
Measure the error rate
Different feature set
Different training methods
Different training and test data sets
Computational Complexity
What is the trade-off between computational ease and performance?
Statistical Decision Making
Parametric Decision Making
In which we know, or are willing to assume, the general form of the probability distribution function or density function for each class, but not the values of parameters such as the mean or variance.
Non-Parametric Decision Making
In which we do not have a sufficient basis for assuming even the general form of the relevant densities.
Bayes’ Theorem
• Bayesian decision making refers to choosing the most likely class, given the value of the feature or
features.
• The probability of class membership is calculated from Bayes' Theorem.
• Let the feature value be x and the class of interest be C.
• Then P(x) is the probability distribution of x in the entire population.
• P(C) is the prior probability that a random sample is a member of class C.
• P (x|C) is the conditional probability of obtaining x given that the sample is from C class.
• We have to estimate the probability P (C|x) that a sample belongs to class C, given that it has the
feature x.
• Conditional Probability
• The probability of A occurring, given that B has occurred, is denoted by P(A|B) and is read as "P of A given B".
• Since we know in advance that B has occurred, P(A|B) is the fraction of B in which A also occurs.
• The conditional probability that a sample comes from class C and has the feature value x is given below.
• Rearranging it gives Bayes' Theorem, also shown below. The variable x can represent a single feature or a feature vector.
Bayes’ Theorem for k-classes
• Let C1, ..., Ck be mutually exclusive classes, i.e., they do not overlap and every sample belongs to exactly one of the classes.
• If a sample may belong to one of the classes A or B, to both, or to neither, then four mutually exclusive classes C1, C2, C3 and C4 are defined by
C1 = A and B,  C2 = A and (not B),  C3 = (not A) and B,  C4 = (not A) and (not B)
• Thus k non-exclusive classes can define up to 2^k mutually exclusive classes.
• Bayes' Theorem for multiple features is obtained by replacing the value of a single feature x by the value of a feature vector x.
• In the discrete case, if there are k classes we obtain P(Ci|x) = P(Ci) P(x|Ci) / [ P(C1) P(x|C1) + ... + P(Ck) P(x|Ck) ].
P(A|B) = P(A and B) / P(B)        P(B|A) = P(B and A) / P(A)
P(A and B) = P(B) P(A|B)          P(A and B) = P(A) P(B|A)
P(C and x) = P(C) P(x|C) = P(x) P(C|x)
P(C|x) = P(C) P(x|C) / P(x)    (Bayes' Theorem)
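As a numerical check of Bayes' Theorem, the sketch below computes posteriors in the k-class form; the priors and likelihoods are made-up illustrative numbers, not from the notes:

```python
def posterior(prior, likelihood, evidence):
    """Bayes' Theorem: P(C|x) = P(C) * P(x|C) / P(x)."""
    return prior * likelihood / evidence

def posteriors(priors, likelihoods):
    """k-class form: P(Ci|x) = P(Ci)P(x|Ci) / sum_j P(Cj)P(x|Cj)."""
    evidence = sum(p * l for p, l in zip(priors, likelihoods))  # P(x)
    return [p * l / evidence for p, l in zip(priors, likelihoods)]

# Two classes with illustrative priors and likelihoods of observing x:
post = posteriors([0.6, 0.4], [0.2, 0.5])  # P(x) = 0.6*0.2 + 0.4*0.5 = 0.32
```

By construction the k-class posteriors sum to 1, since the evidence P(x) is the sum of the per-class joint probabilities.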
Nonparametric Decision Making
Nearest Neighbor Classification Techniques
The single Nearest Neighbor Technique
• Avoiding the estimation of probability densities altogether, the single nearest neighbor technique simply classifies an unknown sample as belonging to the same class as the most similar, or "nearest", sample point in the training set of data, which is often called a reference set.
• Nearest can mean the smallest Euclidean distance in n-dimensional feature space, i.e., the distance between two points a = (a1, ..., an) and b = (b1, ..., bn), where n is the number of features. The Euclidean distance is defined below.
• Although Euclidean distance is the most commonly used measure of dissimilarity / similarity
between feature vectors, it is not always the best metric.
• Squaring the differences before summation places emphasis on features with large dissimilarity.
• A more moderate approach is simply the sum of the absolute differences in each feature, and saves
computing time.
• The distance metric would then be
• The sum of absolute distances is sometimes called the city block distance, the Manhattan metric, or
the taxi-cab distance.
Euclidean distance: d_e(a, b) = [ sum_{i=1}^{n} (a_i - b_i)^2 ]^(1/2)
City block distance: d_cb(a, b) = sum_{i=1}^{n} |a_i - b_i|
• It is so called because it resembles the distance between two locations in a city laid out in rectangular blocks: the number of blocks traveled north (or south) plus the number of blocks traveled east (or west) equals the total distance traveled.
• An extreme metric, which considers only the most dissimilar pair of features, is the maximum distance metric, d_m.
• A generalization of the three distances is the Minkowski distance, d_r, defined below, where r is an adjustable parameter.
Clustering
• Clustering refers to the process of grouping samples so that the samples are similar within each
group. The groups are called clusters.
• Clustering can be classified into two major types, Hierarchical and Partitioned clustering.
Hierarchical clustering algorithms can be further divided into agglomerative and divisive.
• Hierarchical clustering refers to a process that organizes data into large groups, which contain
smaller groups, and so on.
• Hierarchical clustering is usually drawn pictorially as a tree, or dendrogram, in which the finest grouping is at the bottom, where each sample forms its own cluster.
• Below is an example of a dendrogram
• Hierarchical clustering algorithms are called agglomerative if they build the dendrogram from the
bottom up and they are called divisive if they build the dendrogram from the top down.
Maximum distance: d_m(a, b) = max_{1 <= i <= n} |a_i - b_i|
Minkowski distance: d_r(a, b) = [ sum_{i=1}^{n} |a_i - b_i|^r ]^(1/r)
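The four distance measures, together with the single nearest neighbor rule from the previous section, can be sketched as follows (the reference set with "salmon" / "sea bass" labels is an illustrative assumption echoing the fish example):

```python
import math

def minkowski(a, b, r):
    """Minkowski distance: d_r(a, b) = (sum_i |a_i - b_i|^r)^(1/r)."""
    return sum(abs(x - y) ** r for x, y in zip(a, b)) ** (1.0 / r)

def euclidean(a, b):
    """r = 2: the Euclidean distance."""
    return minkowski(a, b, 2)

def city_block(a, b):
    """r = 1: the city block / Manhattan / taxi-cab distance."""
    return sum(abs(x - y) for x, y in zip(a, b))

def maximum(a, b):
    """The limit r -> infinity: the maximum distance metric."""
    return max(abs(x - y) for x, y in zip(a, b))

def nearest_neighbor(unknown, reference_set, metric=euclidean):
    """Single nearest neighbor rule: class of the closest reference sample."""
    vec, label = min(reference_set, key=lambda s: metric(unknown, s[0]))
    return label

# reference_set holds (feature_vector, class_label) pairs; the labels
# are illustrative, echoing the salmon / sea bass fish example.
refs = [((4, 4), "salmon"), ((24, 4), "sea bass")]
label = nearest_neighbor((8, 4), refs)  # (8,4) is nearer to (4,4)
```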
• The agglomerative clustering algorithm for n samples is as follows:
1. Begin with n clusters, each consisting of one sample.
2. Repeat step 3 a total of n - 1 times.
3. Find the most similar clusters Ci and Cj and merge them into one cluster. If there is a tie, merge the first pair found.
Hierarchical Clustering
• One way to measure the similarity between clusters is to define a function that measures the distance
between clusters.
• In cluster analysis nearest neighbor techniques are used to measure the distance between pairs of
samples.
The Single-Linkage Algorithm
• It is also known as the minimum method or the nearest neighbor method.
• The Single-Linkage Algorithm is obtained by defining the distance between two clusters to be the
smallest distance between two points such that one point is in each cluster.
• Formally, if Ci and Cj are clusters, the distance between them is defined as
D(Ci, Cj) = min { d(a, b) : a in Ci, b in Cj }
• where d(a, b) denotes the distance between the samples a and b.
Hierarchical Clustering: The Single-Linkage Algorithm Example
• Perform hierarchical clustering of five Samples with two features, use Euclidean distance for the
distance between two samples.
Sample    x    y
1         4    4
2         8    4
3        15    8
4        24    4
5        24   12
• The smallest distance is 4.0, between clusters {1} and {2}, so they are merged. The number of clusters then becomes four: {1,2}, {3}, {4}, {5}
{1,2} 3 4 5
{1,2} - 8.1 16.0 17.9
3 8.1 - 9.8 9.8
4 16.0 9.8 - 8.0
5 17.9 9.8 8.0 -
• The distances d(1,3) = 11.7 and d(2,3) = 8.1; thus for the single-linkage algorithm the distance between clusters {1,2} and {3} is the minimum, 8.1, and so on.
• Since the minimum value in the matrix is 8.0, clusters {4} and {5} are merged.
• Thus at this level there are three clusters: {1,2}, {3}, {4,5}
{1,2} 3 {4,5}
{1,2} - 8.1 16.0
3 8.1 - 9.8
{4,5} 16.0 9.8 -
• Since the minimum value in this step is 8.1, clusters {1,2} and {3} are merged. Now there are two clusters: {1,2,3} and {4,5}.
• The next step will merge the two remaining clusters at a distance of 9.8, which completes the dendrogram.
• For reference, the initial matrix of Euclidean distances between the five samples is:
      1     2     3     4     5
1     -    4.0  11.7  20.0  21.5
2    4.0    -    8.1  16.0  17.9
3   11.7   8.1    -    9.8   9.8
4   20.0  16.0   9.8    -    8.0
5   21.5  17.9   9.8   8.0    -
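The merge sequence above can be reproduced with a naive agglomerative sketch; the linkage parameter also covers the complete-linkage variant discussed next (a brute-force implementation, fine for five samples):

```python
import math

def agglomerate(samples, linkage="single"):
    """Naive agglomerative clustering; returns the merge distance per level.

    The cluster-to-cluster distance is the min ('single') or max
    ('complete') of the Euclidean distances over all cross-cluster pairs.
    """
    link = {"single": min, "complete": max}[linkage]
    clusters = [[s] for s in samples]
    merges = []
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest linkage distance.
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = link(math.dist(a, b)
                         for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        clusters[i] += clusters.pop(j)   # merge Cj into Ci
        merges.append(round(d, 1))
    return merges

pts = [(4, 4), (8, 4), (15, 8), (24, 4), (24, 12)]
single = agglomerate(pts, "single")      # merge distances 4.0, 8.0, 8.1, 9.8
complete = agglomerate(pts, "complete")  # merge distances 4.0, 8.0, 9.8, 21.5
```

The average-linkage variant is left out here because the notes' example averages the cluster-level distances from the previous step (giving 14.4 for the final merge), which differs from averaging over all cross-cluster sample pairs.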
Hierarchical Clustering
The Complete-Linkage Algorithm
• It is also known as the maximum method or the farthest neighbor method.
• It is obtained by defining the distance between two clusters to be the largest distance between a sample in one cluster and a sample in the other cluster.
• Formally, if Ci and Cj are clusters, we define
D(Ci, Cj) = max { d(a, b) : a in Ci, b in Cj }
Hierarchical Clustering: The Complete-Linkage Algorithm Example
• Perform hierarchical clustering of five Samples with two features, use Euclidean distance for the
distance between two samples.
Sample    x    y
1         4    4
2         8    4
3        15    8
4        24    4
5        24   12
• The smallest distance is 4.0, between clusters {1} and {2}, so they are merged. The number of clusters then becomes four: {1,2}, {3}, {4}, {5}
{1,2} 3 4 5
{1,2} - 11.7 20.0 21.5
3 11.7 - 9.8 9.8
4 20.0 9.8 - 8.0
5 21.5 9.8 8.0 -
• The distances d(1,3) = 11.7 and d(2,3) = 8.1; thus for the complete-linkage algorithm the distance between clusters {1,2} and {3} is the maximum, 11.7, and so on.
• Since the minimum value in the matrix is 8.0, clusters {4} and {5} are merged.
• Thus at this level there are three clusters: {1,2}, {3}, {4,5}
{1,2} 3 {4,5}
{1,2} - 11.7 21.5
3 11.7 - 9.8
{4,5} 21.5 9.8 -
• Since the minimum value in this step is 9.8, clusters {3} and {4,5} are merged. Now there are two clusters: {1,2} and {3,4,5}.
• The next step will merge the last two clusters at a distance of 21.5.
The Average-Linkage Algorithm
• The Average-Linkage Algorithm is a compromise between the extremes of the single- and complete-
linkage algorithms.
• It is also known as the unweighted pair-group method using arithmetic averages (UPGMA).
• It is obtained by defining the distance between two clusters to be the average distance between a sample in one cluster and a sample in the other cluster.
• The initial matrix of Euclidean distances between the five samples, repeated for reference:
      1     2     3     4     5
1     -    4.0  11.7  20.0  21.5
2    4.0    -    8.1  16.0  17.9
3   11.7   8.1    -    9.8   9.8
4   20.0  16.0   9.8    -    8.0
5   21.5  17.9   9.8   8.0    -
• Formally, if Ci with ni members and Cj with nj members are clusters, we define
D(Ci, Cj) = (1 / (ni nj)) sum_{a in Ci} sum_{b in Cj} d(a, b)
• After the first table of the previous example, the clusters at the second step were {1,2}, {3}, {4}, {5}. At this step, for the average-linkage algorithm, the distance between clusters {1,2} and {3} is the average of the distances d(1,3) = 11.7 and d(2,3) = 8.1, and so on.
{1,2} 3 4 5
{1,2} - 9.9 18.0 19.7
3 9.9 - 9.8 9.8
4 18 9.8 - 8.0
5 19.7 9.8 8.0 -
• Since the minimum value in the matrix is 8.0, clusters {4} and {5} are merged. The clusters are now {1,2}, {3}, {4,5}
{1,2} 3 {4,5}
{1,2} - 9.9 18.9
3 9.9 - 9.8
{4,5} 18.9 9.8 -
• Since the minimum value in this step is 9.8, clusters {3} and {4,5} are merged. Now there are two clusters: {1, 2} and {3, 4, 5}.
• The next step will merge the last two clusters at a distance of 14.4.
Hierarchical Clustering: Ward’s Method
• Ward's Method is also called the minimum-variance method. It begins with one cluster for each sample.
• At each iteration, among all cluster pairs, it merges the pair that produces the smallest squared error
for the resulting set of clusters. The squared error for each cluster is defined as follows:
• Let a cluster contain m samples x1, ..., xm, where xi is the feature vector (xi1, ..., xid).
• The vector composed of the means of each feature, μ = (μ1, ..., μd) with μj = (x1j + x2j + ... + xmj) / m, is called the mean vector or centroid of the cluster.
• The squared error for a cluster is the sum of the squared distances, in each feature, from the cluster members to their mean: E = sum_{i=1}^{m} sum_{j=1}^{d} (xij - μj)^2
• The squared error is thus equal to the total variance of the cluster times the number of samples m in the cluster, where the total variance is defined to be the sum of the variances of each feature. The squared error for a set of clusters is defined to be the sum of the squared errors for the individual clusters.
Sample    x    y
1         4    4
2         8    4
3        15    8
4        24    4
5        24   12
• Example: begin with five clusters, one sample in each. The squared error is 0. There are 10 possible ways to merge a pair of clusters: merge {1} and {2}, merge {1} and {3}, and so on.
• Consider merging {1} and {2}. The feature vector of sample 1 is (4,4) and that of sample 2 is (8,4), so the feature means are 6 and 4. The squared error for cluster {1,2} is (4-6)^2 + (8-6)^2 + (4-4)^2 + (4-4)^2 = 8.
• The squared error for each of the clusters {3}, {4}, {5} is 0. Thus the total squared error for the clusters {1,2},{3},{4},{5} is:
• 8 + 0 + 0 + 0 = 8.
Clusters                  Squared Error, E
{1,2},{3},{4},{5} 8.0
{1,3},{2},{4},{5} 68.5
{1,4},{2},{3},{5} 200.0
{1,5},{2},{3},{4} 232.0
{2,3},{1},{4},{5} 32.5
{2,4},{1},{3},{5} 128.0
{2,5},{1},{3},{4} 160.0
{3,4},{1},{2},{5} 48.5
{3,5},{1},{2},{4} 48.5
{4,5},{1},{2},{3} 32.0
• Since the minimum error is 8, the merge producing {1, 2}, {3}, {4}, {5} is accepted.
Clusters                  Squared Error, E
{1,2,3},{4},{5} 72.7
{1,2,4},{3},{5} 224.0
{1,2,5},{3},{4} 266.7
{1,2},{3,4},{5} 56.5
{1,2},{3,5},{4} 56.5
{1,2},{4,5},{3} 40.0
• There are 6 possible sets of clusters resulting from {1, 2}, {3}, {4}, {5}.
• From the table shown, the minimum squared error is 40, and it is for {1,2},{4,5},{3}.
• There are 3 possible sets of clusters resulting from {1,2},{4,5},{3}.
• From the table below, the minimum squared error is 94, and it is for {1,2},{3,4,5}
• At Last, Two remaining clusters are merged and Hierarchical clustering is complete.
Clusters                  Squared Error, E
{1,2,3},{4,5} 104.7
{1,2,4,5},{3} 380.0
{1,2},{3,4,5} 94.0
• The resulting dendrogram is shown as below:
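The squared-error values in the tables above can be recomputed directly; the sketch below checks a few of them:

```python
def squared_error(cluster):
    """Sum of squared distances, in each feature, from members to centroid."""
    m, d = len(cluster), len(cluster[0])
    centroid = [sum(x[j] for x in cluster) / m for j in range(d)]
    return sum((x[j] - centroid[j]) ** 2 for x in cluster for j in range(d))

def total_squared_error(clusters):
    """Squared error of a set of clusters: the sum over the clusters."""
    return sum(squared_error(c) for c in clusters)

pts = {1: (4, 4), 2: (8, 4), 3: (15, 8), 4: (24, 4), 5: (24, 12)}
e1 = total_squared_error([[pts[1], pts[2]], [pts[3]], [pts[4]], [pts[5]]])
e2 = total_squared_error([[pts[1], pts[2]], [pts[4], pts[5]], [pts[3]]])
e3 = total_squared_error([[pts[1], pts[2]], [pts[3], pts[4], pts[5]]])
# e1 = 8.0, e2 = 40.0, e3 = 94.0, matching the tables above
```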
Partitional Clustering
• In partitional clustering, the goal is usually to create one set of clusters that partitions the data into
similar groups.
• Samples close to one another are assumed to be similar, and the task is to group data that are close together.
• In many cases, the number of clusters to be constructed is specified in advance.
• If a partitional clustering algorithm divides the data set into two groups, and each of these is further divided into two parts, and so on, a hierarchical dendrogram can be produced from the top down.
• The hierarchy produced by this divisive technique is more general than the bottom-up hierarchies, because a group can be divided into more than two subgroups in one step.
• Another advantage of partitional techniques is that often only the top part of the tree, which shows the main groups and possibly their subgroups, is required, so there may be no need for the complete dendrogram.
Partitional Clustering: Forgy’s Algorithm
• Besides the data, input to the algorithm consists of k, the number of clusters to be constructed, and k
samples called seed points. The seed points could be chosen randomly, or some knowledge of the
desired cluster structure could be used to guide their selection.
• Step 1. Initialize the cluster centroids to the seed points.
• Step 2. For each sample, find the cluster centroid nearest to it. Put the sample in the cluster identified with this nearest cluster centroid.
• Step 3. If no samples changed clusters in step 2, stop.
• Step 4. Compute the centroids of the resulting clusters and go to step 2.
Forgy’s Algorithm: Example
Sample    x    y
1         4    4
2         8    4
3        15    8
4        24    4
5        24   12
• Set k = 2, which will produce two clusters, and use the first two samples (4,4) and (8,4) in the list as seed points.
• In this algorithm, the samples will be denoted by their feature vectors rather than their simple
numbers to aid in the computation.
• For step 2, find the nearest cluster centroid for each sample.
Sample      Nearest cluster centroid
(4,4)       (4,4)
(8,4)       (8,4)
(15,8)      (8,4)
(24,4)      (8,4)
(24,12)     (8,4)
• The clusters {(4,4)} and {(8,4), (15,8), (24,4), (24,12)} are produced.
• For step 4, compute the centroids of the clusters. The centroids of the first and second clusters are (4,4) and (17.75, 7), since (8+15+24+24)/4 = 17.75 and (4+8+4+12)/4 = 7.
Sample      Nearest cluster centroid
(4,4)       (4,4)
(8,4)       (4,4)
(15,8)      (17.75, 7)
(24,4)      (17.75, 7)
(24,12)     (17.75, 7)
• Since some samples changed clusters, return to step 2.
• The resulting table shows the assignments: the clusters {(4,4), (8,4)} and {(15,8), (24,4), (24,12)} are produced.
• Again for step 4, compute the centroids (6,4) and (21,8) of the clusters. Since the sample (8,4) changed clusters, return to step 2.
Sample      Nearest cluster centroid
(4,4)       (6, 4)
(8,4)       (6, 4)
(15,8)      (21, 8)
(24,4)      (21, 8)
(24,12)     (21, 8)
• Find the cluster centroid nearest each sample. Table shows the results.
• The clusters {(4, 4), (8, 4)} and {(15, 8), (24, 4), (24, 12)} are obtained.
• For step 4, compute the centroids (6, 4) and (21, 8) of the clusters.
• Since no sample will change clusters, the algorithm terminates.
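Forgy's four steps can be sketched directly; run on the five samples with seed points (4,4) and (8,4), it terminates with the centroids computed above:

```python
import math

def forgy(samples, seeds):
    """Forgy's algorithm: alternate nearest-centroid assignment (step 2)
    and centroid recomputation (step 4) until no sample changes cluster.
    (Assumes no cluster ever becomes empty, which holds for this data.)"""
    centroids = [tuple(s) for s in seeds]
    assignment = None
    while True:
        new = [min(range(len(centroids)),
                   key=lambda k: math.dist(s, centroids[k]))
               for s in samples]
        if new == assignment:            # step 3: nothing changed, stop
            return centroids, new
        assignment = new
        for k in range(len(centroids)):  # step 4: recompute centroids
            members = [s for s, a in zip(samples, assignment) if a == k]
            centroids[k] = tuple(sum(c) / len(members) for c in zip(*members))

pts = [(4, 4), (8, 4), (15, 8), (24, 4), (24, 12)]
centroids, labels = forgy(pts, seeds=[(4, 4), (8, 4)])
# terminates with centroids (6, 4) and (21, 8), as in the example
```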
Partitional Clustering: k-means Algorithm
• An alternative version, the k-means algorithm, iterates the assignment step one sample at a time. Specifically, step 2 is replaced by the following steps 2 through 4:
• 2. For each sample, find the centroid nearest to it. Put the sample in the cluster identified with this nearest centroid.
• 3. If no samples changed clusters, stop.
• 4. Recompute the centroids of the altered clusters and go to step 2.
K-means Algorithm: Example
• Set k = 2 and assume that the data are ordered so that the first two samples are (8,4) and (24,4).
• For step 1, begin with two clusters {(8,4)} and {(24,4)}, which have centroids at (8,4) and (24,4). For each of the remaining three samples, find the centroid nearest to it, put the sample in that cluster, and recompute the centroid of that cluster.
• The next sample (15,8) is nearest the centroid (8,4), so it joins cluster {(8,4)}.
• At this point, the clusters are {(8,4),(15,8)} and {(24,4)}. The centroid of the first cluster is updated to (11.5, 6), since (8+15)/2 = 11.5 and (4+8)/2 = 6.
• The next sample (4,4) is nearest the centroid (11.5, 6), so it joins cluster {(8,4),(15,8)}. At this point, the clusters are {(8,4),(15,8),(4,4)} and {(24,4)}. The centroid of the first cluster is updated to (9, 5.3).
• The next sample (24,12) is nearest the centroid (24,4), so it joins cluster {(24,4)}. At this point, the clusters are {(8,4), (15,8), (4,4)} and {(24,12), (24,4)}. The centroid of the second cluster is updated to (24, 8). At this point, step 1 of the algorithm is complete.
• For step 2, examine the samples one by one and put each one in the cluster identified with the nearest centroid. As the table shows, in this case no sample changes clusters.
• The resulting clusters are {(8,4), (15,8), (4,4)} and {(24,12), (24,4)}.
Sample      Distance to centroid (9, 5.3)     Distance to centroid (24, 8)
(8,4)       1.6                               16.5
(24,4)      15.1                              4.0
(15,8)      6.6                               9.0
(4,4)       5.2                               20.4
(24,12)     16.4                              4.0
• The goal of Forgy's algorithm and the k-means algorithm is to minimize the squared error for a fixed number of clusters. These algorithms assign samples to clusters so as to reduce the squared error and, in the iterative versions, they stop when no further reduction occurs.
• However, to achieve reasonable computation time, they do not consider all possible clusterings. For this reason, they sometimes terminate with a clustering that achieves only a local minimum of the squared error.
• Furthermore, in general, the clusterings that these algorithms generate depend on the choice of the seed points.
• If Forgy's algorithm is applied to the original data using (8,4) and (24,4) as seed points, it terminates with the clusters {(4,4), (8,4), (15,8)} and {(24,4), (24,12)}.
• This is different from the clustering produced earlier with seed points (4,4) and (8,4). The clustering above has a squared error of 104.7, whereas the earlier Forgy clustering has a squared error of 94.
• The clustering above achieves only a local minimum, while the earlier clustering can be shown to achieve the global minimum.
• For a given set of seed points, the resulting clusters may also depend on the order in which the points
are checked.
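The sequential k-means variant described above can be sketched as follows (it assumes, as in the example, that the first k samples serve as the seed points, and that no cluster is ever emptied):

```python
import math

def kmeans_sequential(samples, k):
    """k-means variant from the notes: the first k samples are the seeds,
    and the affected centroid is recomputed after every single assignment.
    (Assumes no cluster is ever emptied, which holds for this example.)"""
    clusters = [[tuple(s)] for s in samples[:k]]
    centroids = [tuple(s) for s in samples[:k]]

    def recompute(c):
        centroids[c] = tuple(sum(v) / len(clusters[c])
                             for v in zip(*clusters[c]))

    # Step 1: place each remaining sample, updating that centroid at once.
    for s in map(tuple, samples[k:]):
        i = min(range(k), key=lambda c: math.dist(s, centroids[c]))
        clusters[i].append(s)
        recompute(i)
    # Steps 2-4: reassign samples until none changes cluster.
    while True:
        changed = False
        for i, cluster in enumerate(clusters):
            for s in list(cluster):
                j = min(range(k), key=lambda c: math.dist(s, centroids[c]))
                if j != i:
                    cluster.remove(s)
                    clusters[j].append(s)
                    recompute(i)
                    recompute(j)
                    changed = True
        if not changed:
            return clusters, centroids

data = [(8, 4), (24, 4), (15, 8), (4, 4), (24, 12)]
clusters, centroids = kmeans_sequential(data, k=2)
# clusters: {(8,4), (15,8), (4,4)} and {(24,4), (24,12)}
```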
Neural Network: Introduction
• More than 2000 years ago, our ancestors started to explore the architecture and behavior of the human brain.
• Ramón y Cajal and Hebb continued the work of Aristotle and tried to build an artificial "thinking machine".
• Based on the information about the functions of the brain, and the quest for a mathematical model of our learning habits, a new technology, artificial neural networks, was born.
• Our brain can process information quickly and accurately. You can recognize your friend's voice in a noisy railway station. How is the brain able to process the voice signal mixed with noise and retrieve the original signal?
• Can we duplicate this amazing process with a machine? Can we make a machine duplicate some learning habits of a human? Can a machine be made to learn from experience?
• We will answer these questions during the study of neural networks.
Neural Network: Definition
• An artificial neural network is an information processing system that has been developed as a
generalization of the mathematical model of human cognition (sense of knowing).
• A neural network is a network of interconnected neurons, inspired from the studies of the biological
nervous system. In other words, neural network functions in a way similar to the human brain.
• The function of a neural network is to produce an output pattern when presented with an input
pattern.
• The field of neural networks studies networks consisting of nodes connected by adaptable weights, which store experiential knowledge from task examples through a process of learning.
• Like the brain, the nodes are adaptable; they acquire knowledge through changes in the connection weights by being exposed to samples.
Neural Network: Biological Neural Net.
• Neural network architectures are motivated by models of the human brain and nerve cells. Our
current knowledge of human brain is limited to its anatomical and physiological information.
• The neuron (from the Greek for nerve cell) is the fundamental unit of the brain. The neuron is a complex biochemical and electrical signal processing unit that receives and combines signals from many other neurons through filamentary input paths, the dendrites (from the Greek dendron, tree).
• A biological neuron has three types of components namely dendrites, soma and axon. Dendrites are
bunched into highly complex "dendritic trees", which have an enormous total surface area. The
dendrites receive signals from other neurons.
• Dendritic trees are connected with the main body of the neuron called the soma (Greek: body).
• The soma has a pyramidal or cylindrical shape. The soma sums the incoming signals. When
sufficient input is received, the cell fires.
• The output area of the neuron is a long fiber called axon. The impulse signal triggered by the cell is
transmitted over the axon to other cells.
• The connecting point between a neuron's axon and another neuron's dendrite is called a synapse (Greek: contact). The impulse signals are then transmitted across the synaptic gap by means of a chemical process.
• A single neuron may have 1000 to 10000 synapses and may be connected with around 1000 neurons.
There are 100 billion neurons in our brain, and each neuron has 1000 dendrites.
Neural Network: Artificial Neuron
• The artificial neuron (also called a processing element or node) mimics the characteristics of the biological neuron. A processing element possesses a local memory and carries out localized information processing operations.
• The artificial neuron has a set of n inputs xi, each representing the output of another neuron.
• The subscript i in xi takes values between 1 and n and indicates the source of the input signal.
• The inputs are collectively referred to as X.
• Each input is weighted before it reaches the main body of the processing element by the connection strength or weight factor (or simply weight), analogous to the synaptic strength.
• The information about the input that is required to solve a problem is stored in the form of weights. Each signal is multiplied by an associated weight w1, w2, w3, ..., wn before it is applied to the summing block.
• In addition, the artificial neuron has a bias term w0, a threshold value θ that has to be reached or exceeded for the neuron to produce a signal, a nonlinear function F that acts on the produced signal net, and an output y after the nonlinearity function.
• The following relation describes the transfer function of the basic neuron model:
• y = F(net)
• where
• net = w0 + x1w1 + x2w2 + x3w3 + ... + xnwn
• or, equivalently, the summation form given below with x0 = 1,
• and the neuron firing condition is given below, for the linear and the nonlinear activation function cases respectively.
Neural Network: Classification
• Artificial neural networks can be classified on the basis of
1. Pattern of connection between neurons, (architecture of the network)
2. Activation function applied to the neurons
3. Method of determining weights on the connection (training method)
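A minimal sketch of the basic neuron model above; the AND-gate weights are an illustrative choice, not from the notes:

```python
def binary_step(net, theta=0.0):
    """Fires (output 1) when net reaches the threshold theta."""
    return 1 if net >= theta else 0

def neuron(inputs, weights, bias, activation=binary_step):
    """Basic neuron model: y = F(net), net = w0 + sum_i x_i * w_i."""
    net = bias + sum(x * w for x, w in zip(inputs, weights))
    return activation(net)

# Illustrative choice: weights 1, 1 and bias w0 = -1.5 realize logical AND.
y = neuron([1, 1], [1, 1], bias=-1.5)  # net = 0.5 >= 0, so the neuron fires
```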
net = w0 + sum_{i=1}^{n} xi wi = sum_{i=0}^{n} xi wi   (with x0 = 1)
Firing condition: sum_{i=0}^{n} xi wi >= 0 [for a linear activation function], or F(net) >= θ [for a nonlinear activation function]
Neural Network: ARCHITECTURE
• The neurons are assumed to be arranged in layers, and the neurons in the same layer behave in the
same manner.
• All the neurons in a layer usually have the same activation function. Within each layer, the neurons
are either fully interconnected or not connected at all.
• The neurons in one layer can be connected to neurons in another layer.
• The arrangement of neurons into layers and the connection pattern within and between layers is
known as network architecture.
Input layer:
• The neurons in this layer receive the external input signals and perform no computation, but simply
transfer the input signals to the neurons in another layer.
Output layer:
• The neurons in this layer receive signals from neurons in either the input layer or the hidden layer.
Hidden layer:
• The layer of neurons that are connected in between the input layer and the output layer is known as
hidden layer.
• Neural nets are often classified as single layer networks or multilayer networks.
• The number of layers in a net can be defined as the number of layers of weighted interconnection
links between various layers.
• While determining the number of layers, the input layer is not counted as a layer, because it does not
perform any computation.
• The architecture of a single layer and a multilayer neural network is shown in the following figures.
Single Layer Network
• A single layer network consists of one layer of connection weights. The net consists of a layer of
units called input layer, which receive signals from the outside world and a layer of units called
output layer from which the response of the net can be obtained.
• This type of network can be used for pattern classification problems
Multilayer Network:
• A multilayer network consists of one or more layers of units (called hidden layers) between the input
and output layers. Multilayer networks may be formed by simply cascading a group of layers; the
output of one layer provides the input to the subsequent layer.
• A multilayer net with nonlinear activation functions can solve a much wider class of problems than a single layer network.
• However, training a multilayer neural network is more difficult.
Neural Network: ACTIVATION FUNCTIONS
• The purpose of a nonlinear activation function is to ensure that the neuron's response is bounded; that is, the actual response of the neuron is conditioned, or damped, as a result of large or small activating stimuli and is thus controllable.
• Further, in order to achieve the advantages of multilayer nets compared with the limited capabilities
of single layer networks, nonlinear functions are required.
• Different nonlinear functions are used, depending upon the paradigm and the algorithm used for
training the network.
• The various activation functions are:
• Identity function (linear function): f(x) = x for all x.
• Binary step function: f(x) = 1 if x >= θ, and f(x) = 0 if x < θ, where θ is the threshold.
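The two activation functions listed above can be written directly (the default threshold of 0 is an illustrative choice):

```python
def identity(x):
    """Identity (linear) activation: f(x) = x for all x."""
    return x

def binary_step(x, theta=0.0):
    """Binary step activation: f(x) = 1 if x >= theta, else 0."""
    return 1 if x >= theta else 0
```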
Training an Artificial Neural Network
• The most important characteristic of an artificial neural network is its ability to learn.
• Generally, learning is a process by which a neural network adapts itself to a stimulus by properly
making parameter adjustments and producing a desired response.
• Learning (training) is a process in which the network adjusts its parameters (the synaptic weights) in response to input stimuli so that the actual output response converges to the desired output response.
• When the actual output response is the same as the desired one, the network has completed the
learning phase and the network has acquired knowledge.
• Learning or training algorithms can be categorized as:
Supervised training
Unsupervised training
Reinforced training
Supervised Training:
• Supervised training requires pairing each input vector with a target vector representing the desired output; together these two vectors are termed a training pair.
• During the training session an input vector is applied to the net, and it results in an output vector.
• This response is compared with the target response. If the actual response differs from the target, the
net will generate an error signal.
• This error signal is then used to calculate the adjustment that should be made in the synaptic weights
so that the actual output matches the target output.
• The error minimization in this kind of training requires a supervisor or a teacher, hence the name
supervised training.
• In artificial neural networks, the calculation that is required to minimize errors depends on the
algorithm used, which is normally based on the optimization techniques.
• Supervised training methods are used to perform nonlinear mapping in pattern classification nets,
pattern association nets, and multilayer neural nets.
Unsupervised Training:
• Unsupervised training is employed in self-organizing nets and it does not require a teacher.
• In this method, the input vectors of similar types are grouped without the use of training data to
specify how a typical member of each group looks or to which group a member belongs.
• During training the neural network receives input patterns and organizes these patterns into
categories. When a new input pattern is applied, the neural network provides an output response
indicating the class to which the input pattern belongs.
• If a class cannot be found for the input pattern, a new class is generated.
• Even though unsupervised training does not require a teacher, it requires certain guidelines to form
groups.
• Grouping can be done based on color, shape, or any other property of the object. If no guidelines are
given, grouping may or may not be successful.
Reinforced Training
• Reinforced training is similar to supervised training. In this method, the teacher does not indicate
how close the actual output is to the desired output, but yields only a pass or fail indicator. Thus,
the error signal generated during reinforced training is binary.
Mcculloch - Pitts Neuron Model
Warren McCulloch and Walter Pitts presented the first mathematical model of a single biological neuron
in 1943. This model is known as McCulloch - Pitts model.
• This model does not require learning or adaptation, and the neurons are binary: if a neuron fires, it
has an activation of 1; otherwise, it has an activation of 0.
• The neurons are connected by excitatory or inhibitory weights. Excitatory connection has positive
weights, and inhibitory connection has negative weights.
• All the excitatory connections into a particular neuron have the same weight. Each neuron has a fixed
threshold such that if the net input to the neuron is greater than the threshold, the neuron fires.
• The threshold is set such that the inhibition is absolute. This means any non-zero inhibitory input
will prevent the neuron from firing.
Implementation of McCULLOCH - PITTS Networks for logic functions
1. AND Function
2. OR Function
3. NOT Function
4. AND NOT Function
5. XOR Function
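The listed logic functions can be sketched with McCulloch - Pitts neurons using the classic weights and thresholds (binary inputs, fixed threshold, inhibitory weights for negation). The diagrams from the original slides are not reproduced here, so this is an assumed reconstruction; function names are ours:

```python
def mp_neuron(inputs, weights, theta):
    # McCulloch-Pitts neuron: fires (1) iff the net input reaches theta.
    net = sum(w * x for w, x in zip(weights, inputs))
    return 1 if net >= theta else 0

def AND(x1, x2):     return mp_neuron([x1, x2], [1, 1], theta=2)
def OR(x1, x2):      return mp_neuron([x1, x2], [1, 1], theta=1)
def NOT(x):          return mp_neuron([x], [-1], theta=0)

def AND_NOT(x1, x2):
    # x1 AND (NOT x2): an inhibitory (negative) weight on x2 blocks firing.
    return mp_neuron([x1, x2], [1, -1], theta=1)

def XOR(x1, x2):
    # XOR is not computable by a single M-P neuron; it needs two layers:
    # (x1 AND NOT x2) OR (x2 AND NOT x1).
    return OR(AND_NOT(x1, x2), AND_NOT(x2, x1))

for a in (0, 1):
    for b in (0, 1):
        print(a, b, AND(a, b), OR(a, b), AND_NOT(a, b), XOR(a, b))
```

Note that XOR requires composing neurons in two layers — a single M-P neuron cannot realize it, which is the classic limitation of this model.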
Applications of Neural Networks
• There have been many impressive demonstrations of artificial neural networks. A few areas where
neural networks are applied are mentioned below.
Classification
• Classification is an important aspect of image analysis. Neural networks have been used
successfully in a large number of classification tasks, which include:
(a) Recognition of printed or handwritten characters.
(b) Classification of SONAR and RADAR signals.
Signal Processing
• In digital communication systems, distorted signals cause inter-symbol interference.
• One of the first commercial applications of ANNs was noise suppression (adaptive noise
cancellation), implemented by Widrow using ADALINE.
• The ADALINE is trained to remove the noise from the telephone line signal.
Speech Recognition
• In recent years, speech recognition has received enormous attention.
• It involves three modules:
• The front end, which samples the speech signal and extracts the data.
• The word processor, which finds the probability of words in the vocabulary.
• The sentence processor, which determines the meaning of the sentence.
Other Application Areas
• Medicine
• Intelligent control
• Function Approximation
• Financial Forecasting
• Condition Monitoring
• Process Monitoring and Control
• Neuro Forecasting
• Pattern Analysis