The document discusses the AdaBoost classifier algorithm. AdaBoost is an algorithm that combines multiple weak classifiers to produce a strong classifier. It works by training weak classifiers on weighted versions of the training data and combining them through a weighted majority vote. The weights are updated at each iteration to focus on misclassified examples. The final strong classifier is a linear combination of the weak classifiers.
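For readers who want to try the idea described above, here is a minimal sketch (not taken from the document) using scikit-learn, whose AdaBoostClassifier uses depth-1 decision trees as its default weak learners:

# Illustrative sketch only (not from the document): AdaBoost in scikit-learn.
# The default base learner is a depth-1 decision tree (a "stump").
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = AdaBoostClassifier(n_estimators=50)  # 50 boosting rounds
clf.fit(X_train, y_train)                  # samples are reweighted each round
print("test accuracy:", clf.score(X_test, y_test))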
AdaBoost is an adaptive boosting algorithm that aggregates multiple weak learners, giving higher weight to misclassified examples. It reduces bias and is best suited to simple base models with low variance and high bias. The algorithm reweights the training data over multiple iterations, adjusting the weights each time to focus on problematic examples. As an example, it is used to classify 10 points into two classes by generating weak classifiers at each step that focus on the points misclassified in the previous step. AdaBoost achieves high accuracy but is sensitive to outliers.
This document discusses machine learning and artificial intelligence. It defines machine learning as a branch of AI that allows systems to learn from data and experience. Machine learning is important because some tasks are difficult to define with rules but can be learned from examples, and relationships in large datasets can be uncovered. The document then discusses areas where machine learning is influential like statistics, brain modeling, and more. It provides an example of designing a machine learning system to play checkers. Finally, it discusses machine learning algorithm types and provides details on the AdaBoost algorithm.
The document provides an overview of deep learning and machine learning techniques. It discusses convolutional neural networks (CNNs) and how they are used for image classification. It also covers transfer learning, where pre-trained models are retrained on new datasets for tasks like computer vision. Examples are given of using Google Cloud Vision API and custom TensorFlow models to build image recognition applications.
Presentation at the Vietnam Japan AI Community on 2019-05-26.
The presentation summarizes what I've learned about Regularization in Deep Learning.
Disclaimer: the presentation was given at a community event, so it wasn't thoroughly reviewed or revised.
This presentation provides an overview of boosting approaches for classification problems. It discusses combining classifiers through bagging and boosting to create stronger classifiers. The AdaBoost algorithm is explained in detail, including its training and classification phases. An example is provided to illustrate how AdaBoost works over multiple rounds, increasing the weights of misclassified examples to improve classification accuracy. In conclusion, AdaBoost is highlighted as an effective approach for classification problems where misclassification has severe consequences by producing highly accurate strong classifiers.
Boosting techniques like AdaBoost combine the predictions of many weak learner models to create a stronger joint model. AdaBoost uses stumps, or decision trees with one node and two leaves, as the weak learners. It adjusts the weights of samples to focus on incorrectly classified samples. Over many iterations, it boosts the weights of harder to classify samples to improve predictive performance compared to a single weak learner.
The document discusses hyperparameters and hyperparameter tuning in deep learning models. It defines hyperparameters as parameters that govern how the model parameters (weights and biases) are determined during training, in contrast to model parameters which are learned from the training data. Important hyperparameters include the learning rate, number of layers and units, and activation functions. The goal of training is for the model to perform optimally on unseen test data. Model selection, such as through cross-validation, is used to select the optimal hyperparameters. Training, validation, and test sets are also discussed, with the validation set used for model selection and the test set providing an unbiased evaluation of the fully trained model.
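A minimal sketch of this workflow (assuming scikit-learn and a synthetic dataset, neither of which is from the document): hyperparameters are chosen by cross-validation on the training data, and the held-out test set gives the unbiased final evaluation:

# Minimal sketch: hyperparameter selection via cross-validation, then an
# unbiased evaluation on a held-out test set.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = make_classification(n_samples=600, n_features=10, random_state=0)
X_trainval, X_test, y_trainval, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Hyperparameters (number of rounds, learning rate) are tuned only on the
# training/validation portion of the data.
search = GridSearchCV(
    AdaBoostClassifier(),
    param_grid={"n_estimators": [25, 50, 100], "learning_rate": [0.5, 1.0]},
    cv=5,
)
search.fit(X_trainval, y_trainval)
print("best hyperparameters:", search.best_params_)
print("test accuracy:", search.score(X_test, y_test))  # unbiased final estimate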
This document summarizes various optimization techniques for deep learning models, including gradient descent, stochastic gradient descent, and variants like momentum, Nesterov's accelerated gradient, AdaGrad, RMSProp, and Adam. It provides an overview of how each technique works and comparisons of their performance on image classification tasks using MNIST and CIFAR-10 datasets. The document concludes by encouraging attendees to try out the different optimization methods in Keras and provides resources for further deep learning topics.
1. A perceptron is a basic artificial neural network that can learn linearly separable patterns. It takes weighted inputs, applies an activation function, and outputs a single binary value.
2. Multilayer perceptrons can learn non-linear patterns by using multiple layers of perceptrons with weighted connections between them. They were developed to overcome limitations of single-layer perceptrons.
3. Perceptrons are trained using an error-correction learning rule called the delta rule or the least mean squares algorithm. Weights are adjusted to minimize the error between the actual and target outputs.
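A minimal sketch (not from the document) of such an error-correction update for a single perceptron on a linearly separable toy problem (the AND function):

# Minimal sketch: a perceptron trained with an error-correction (delta-style)
# rule; weights are adjusted to reduce the target-vs-output error.
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 0, 0, 1], dtype=float)           # AND function targets

w = np.zeros(2)
b = 0.0
lr = 0.1                                          # learning rate

for epoch in range(20):
    for xi, target in zip(X, y):
        out = 1.0 if xi @ w + b > 0 else 0.0      # step activation
        err = target - out                        # error between target and output
        w += lr * err * xi                        # weight adjustment
        b += lr * err

print("weights:", w, "bias:", b)
print("predictions:", [1.0 if xi @ w + b > 0 else 0.0 for xi in X])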
Machine learning basics using tree algorithms (Random Forest, Gradient Boosting), by Parth Khare
This document provides an overview of machine learning classification and decision trees. It discusses key concepts like supervised vs. unsupervised learning, and how decision trees work by recursively partitioning data into nodes. Random forest and gradient boosted trees are introduced as ensemble methods that combine multiple decision trees. Random forest grows trees independently in parallel while gradient boosted trees grow sequentially by minimizing error from previous trees. While both benefit from ensembling, gradient boosted trees are more prone to overfitting and random forests are better at generalizing to new data.
The document discusses optimization and gradient descent algorithms. Optimization aims to select the best solution given some problem, like maximizing GPA by choosing study hours. Gradient descent is a method for finding the optimal parameters that minimize a cost function. It works by iteratively updating the parameters in the opposite direction of the gradient of the cost function, which points in the direction of greatest increase. The process repeats until convergence. Issues include potential local minimums and slow convergence.
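A minimal sketch of this procedure on an illustrative one-dimensional cost function (not from the document): step opposite to the gradient until the updates become negligible:

# Minimal sketch: gradient descent on a 1-D cost with minimum at x = 3.
def cost(x):
    return (x - 3.0) ** 2 + 1.0       # convex cost, minimum at x = 3

def grad(x):
    return 2.0 * (x - 3.0)            # derivative of the cost

x = 0.0                               # initial guess
lr = 0.1                              # learning rate (step size)
for step in range(100):
    g = grad(x)
    x -= lr * g                       # move opposite to the gradient
    if abs(g) < 1e-6:                 # convergence check
        break

print("minimum near x =", round(x, 4), "cost =", round(cost(x), 4))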
The document provides an overview of perceptrons and neural networks. It discusses how neural networks are modeled after the human brain and consist of interconnected artificial neurons. The key aspects covered include the McCulloch-Pitts neuron model, Rosenblatt's perceptron, different types of learning (supervised, unsupervised, reinforcement), the backpropagation algorithm, and applications of neural networks such as pattern recognition and machine translation.
Our fall 12-Week Data Science bootcamp starts on Sept 21st,2015. Apply now to get a spot!
If you are hiring Data Scientists, call us at (1)888-752-7585 or reach info@nycdatascience.com to share your openings and set up interviews with our excellent students.
---------------------------------------------------------------
Come join our meet-up and learn how easily you can use R for advanced machine learning. In this meet-up, we will demonstrate how to understand and use Xgboost for Kaggle competitions. Tong is in Canada and will do a remote session with us through Google Hangouts.
---------------------------------------------------------------
Speaker Bio:
Tong is a data scientist at Supstat Inc and also a master's student in Data Mining. He has been an active R programmer and developer for 5 years. He is the author of the R package XGBoost, one of the most popular and contest-winning tools on kaggle.com nowadays.
Pre-requisite(if any): R /Calculus
Preparation: A laptop with R installed. Windows users might need to have RTools installed as well.
Agenda:
Introduction of Xgboost
Real World Application
Model Specification
Parameter Introduction
Advanced Features
Kaggle Winning Solution
Event arrangement:
6:45pm Doors open. Come early to network, grab a beer and settle in.
7:00-9:00pm XgBoost Demo
Reference:
https://github.com/dmlc/xgboost
The document discusses decision tree learning and the ID3 algorithm. It covers topics like decision tree representation, entropy and information gain for selecting attributes, overfitting, and techniques to avoid overfitting like reduced error pruning. It also discusses handling continuous values, missing data, and attributes with many values or costs in decision tree learning.
This presentation was prepared as part of the curriculum studies for CSCI-659 Topics in Artificial Intelligence Course - Machine Learning in Computational Linguistics.
It was prepared under guidance of Prof. Sandra Kubler.
Talk on Optimization for Deep Learning, which gives an overview of gradient descent optimization algorithms and highlights some current research directions.
The document discusses artificial neural networks and classification using backpropagation, describing neural networks as sets of connected input and output units where each connection has an associated weight. It explains backpropagation as a neural network learning algorithm that trains networks by adjusting weights to correctly predict the class label of input data, and how multi-layer feed-forward neural networks can be used for classification by propagating inputs through hidden layers to generate outputs.
This document discusses logistic regression for classification problems. Logistic regression models the probability of an output belonging to a particular class using a logistic function. The model parameters are estimated by minimizing a cost function using gradient descent or other advanced optimization algorithms. Logistic regression can be extended to multi-class classification problems using a one-vs-all approach that trains a separate binary classifier for each class.
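A minimal sketch of these ideas (assuming scikit-learn and the Iris dataset, which are not part of the document): the logistic (sigmoid) function maps a score to a probability, and a one-vs-rest wrapper handles the multi-class case:

# Minimal sketch: sigmoid probability and one-vs-rest logistic regression.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))   # maps a score to a probability in (0, 1)

X, y = load_iris(return_X_y=True)     # 3 classes -> 3 binary classifiers
clf = OneVsRestClassifier(LogisticRegression(max_iter=1000)).fit(X, y)
print("training accuracy:", clf.score(X, y))
print("sigmoid(0) =", sigmoid(0.0))   # 0.5: the decision boundary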
Ensemble methods combine multiple machine learning models to obtain better predictive performance than from any individual model. There are two main types of ensemble methods: sequential (e.g AdaBoost) where models are generated one after the other, and parallel (e.g Random Forest) where models are generated independently. Popular ensemble methods include bagging, boosting, and stacking. Bagging averages predictions from models trained on random samples of the data, while boosting focuses on correcting previous models' errors. Stacking trains a meta-model on predictions from other models to produce a final prediction.
A confusion matrix is a table that shows the performance of a classification model by listing the true positives, true negatives, false positives, and false negatives. It displays how often the model correctly or incorrectly classified observations into their actual classes. The document provides an example confusion matrix for a model classifying apples, oranges, and pears, showing the number of observations the model correctly and incorrectly classified into each class.
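A minimal sketch with made-up apple/orange/pear predictions (the counts are illustrative, not those in the document):

# Minimal sketch: a 3-class confusion matrix; rows = actual class,
# columns = predicted class.
from sklearn.metrics import confusion_matrix

y_true = ["apple", "apple", "apple", "orange", "orange", "pear", "pear", "pear"]
y_pred = ["apple", "apple", "pear",  "orange", "apple",  "pear", "pear", "orange"]

labels = ["apple", "orange", "pear"]
cm = confusion_matrix(y_true, y_pred, labels=labels)
print(labels)
print(cm)   # diagonal = correct classifications, off-diagonal = mistakes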
Welcome to Supervised Machine Learning and Data Science.
Algorithms for building models: Support Vector Machines.
Classification algorithm explanation and code in Python (SVM).
Machine Learning and Data Mining: 16 Classifiers Ensembles, by Pier Luca Lanzi
This document discusses ensemble machine learning methods. It introduces classifiers ensembles and describes three common ensemble methods: bagging, boosting, and random forests. For each method, it explains the basic idea, how the method works, advantages and disadvantages. Bagging constructs multiple classifiers from bootstrap samples of the training data and aggregates their predictions through voting. Boosting builds classifiers sequentially by focusing on misclassified examples. Random forests create decision trees with random subsets of features and samples. Ensembles can improve performance over single classifiers by reducing variance.
This document summarizes graph coloring using backtracking. It defines graph coloring as minimizing the number of colors used to color a graph. The chromatic number is the fewest colors needed. Graph coloring is NP-complete. The document outlines a backtracking algorithm that tries assigning colors to vertices, checks if the assignment is valid (no adjacent vertices have the same color), and backtracks if not. It provides pseudocode for the algorithm and lists applications like scheduling, Sudoku, and map coloring.
The presentation is made on CNN's which is explained using the image classification problem, the presentation was prepared in perspective of understanding computer vision and its applications. I tried to explain the CNN in the most simple way possible as for my understanding. This presentation helps the beginners of CNN to have a brief idea about the architecture and different layers in the architecture of CNN with the example. Please do refer the references in the last slide for a better idea on working of CNN. In this presentation, I have also discussed the different types of CNN(not all) and the applications of Computer Vision.
1. The Naive Bayes classifier is a simple probabilistic classifier based on Bayes' theorem that assumes independence between features.
2. It has various applications including email spam detection, language detection, and document categorization.
3. The Naive Bayes approach involves computing the class prior probabilities, feature likelihoods, and applying Bayes' theorem to calculate the posterior probabilities to classify new instances. Laplace smoothing is often used to handle cases with insufficient training data.
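A minimal sketch (not from the document) of Naive Bayes spam detection on a tiny made-up corpus, with alpha=1.0 giving Laplace smoothing:

# Minimal sketch: multinomial Naive Bayes on word counts with Laplace smoothing.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

texts = ["win money now", "cheap money offer", "meeting at noon", "lunch tomorrow noon"]
labels = ["spam", "spam", "ham", "ham"]

vec = CountVectorizer()
X = vec.fit_transform(texts)                   # word-count features
clf = MultinomialNB(alpha=1.0).fit(X, labels)  # alpha=1.0 -> Laplace smoothing
print(clf.predict(vec.transform(["cheap offer now"])))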
Dr. Subrat Panda gave an introduction to reinforcement learning. He defined reinforcement learning as dealing with agents that must sense and act upon their environment to receive delayed scalar feedback in the form of rewards. He described key concepts like the Markov decision process framework, value functions, Q-functions, exploration vs exploitation, and extensions like deep reinforcement learning. He listed several real-world applications of reinforcement learning and resources for learning more.
Kato Mivule: An Overview of Adaptive Boosting – AdaBoost
AdaBoost is a machine learning algorithm that uses multiple weak learners to create a strong learner. It works by assigning higher weights to misclassified examples from previous iterations and runs multiple iterations, each time adding a new weak learner that focuses on the examples with higher weights. The document presents an experiment using AdaBoost with decision stumps on a cancer dataset, finding a classification accuracy of 93.12% compared to 92.97% for decision stumps alone. ROC/AUC analysis showed AdaBoost with an AUC of 0.975 outperforming decision stumps with an AUC of 0.911, demonstrating that AdaBoost can create a more effective classifier than a single weak learner.
1. The document describes a general boosting procedure for combining weak learners to create a strong learner.
2. It involves initializing the model, learning weak learners, calculating error rates, adjusting the distribution of the training data, and combining weak learners.
3. It also describes the AdaBoost algorithm which implements this general boosting procedure and learns weak learners in sequence while focusing more on examples that previous learners got wrong.
Cascade classifiers trained on gammatonegrams for reliably detecting audio ev..., by Nicola Strisciuglio
In this paper we propose a novel method for the detection of events of interest through audio analysis. The system that we propose is based on the representation of the audio streams through a Gammatone image, which describes the time-frequency distribution of the energy of the signal; this representation is inspired by the functioning of the human auditory system. A pool of AdaBoost cascade classifiers, one for each class of events of interest, is involved in the event detection stage. The performance of the proposed system has been evaluated on a large data set of audio events for surveillance applications and the achieved results, compared with two state of the art approaches, confirm its effectiveness.
Download the paper at:
http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=6918643
Combining Models - In these slides, we look at a way to combine the answers from various weak classifiers to build a robust classifier. In the slides we look at the following subjects:
1.- Model Combination Vs Bayesian Model
2.- Bootstrap Data Sets
And, the cherry on top, AdaBoost.
AdaBoost is an ensemble learning algorithm that combines multiple weak learners into a single strong learner. It works in rounds, assigning higher weights to examples that previous rounds misclassified. Each weak learner is trained on the reweighted data and must only be slightly better than random guessing. AdaBoost then calculates error rates and weights and combines predictions from all weak learners into a final strong learner using a weighted majority vote. The algorithm stops when error rate stops decreasing or the maximum number of rounds is reached.
Classifications & Misclassifications of EEG Signals using Linear and AdaBoost..., IJARIIT
Epilepsy is one of the most frequent brain disorders, caused by transient and unexpected electrical disruptions of the brain. Electroencephalography (EEG) is one of the most clinically and scientifically exploited signals recorded from humans, and it is a very complex signal. EEG signals are non-stationary as they change over time, so the discrete wavelet transform (DWT) technique is used for feature extraction. Classifications and misclassifications of EEG signals by linearly separable support vector machines are shown using training and testing datasets. Then an AdaBoost support vector machine is used to obtain a strong classifier.
This presentation is about Multiple Classifier Systems (Ensembles of Classifiers). It first presents the general idea of decision making, then addresses the reasons and rationale for using a Multiple Classifier System, and after that concentrates on designing a Multiple Classifier System: 1. Creating an Ensemble 2. Combining Classifiers.
Ensemble Learning: The Wisdom of Crowds (of Machines), by Lior Rokach
This document discusses ensemble learning methods. It begins by introducing the concept of ensemble learning, which involves combining multiple learning algorithms to obtain better predictive performance than could be obtained from any of the constituent learning algorithms alone. It then discusses several popular ensemble methods, including boosting, bagging, random forests, and DECORATE. Boosting works by iteratively training weak learners on reweighted versions of the data to focus on examples that previous learners misclassified. Bagging trains learners on randomly sampled subsets of the data and combines them by averaging or voting. Random forests add additional randomness to bagging. DECORATE improves ensembles by adding artificial training examples to encourage diversity.
2013-1 Machine Learning Lecture 06 - Artur Ferreira - A Survey on Boosting…, Dongseo University
This document summarizes a survey on boosting algorithms for supervised learning. It begins with an introduction to ensembles of classifiers and boosting, describing how boosting builds ensembles by combining simple classifiers with associated contributions. The AdaBoost algorithm and its variants are then explained in detail. Experimental results on synthetic and standard datasets are presented, comparing boosting with generative and RBF weak learners. The results show that boosting algorithms can achieve low error rates, with AdaBoost performing well when weak learners are only slightly better than random.
What is an "ensemble learner"? How can we combine different base learners into an ensemble in order to improve the overall classification performance? In this lecture, we are providing some answers to these questions.
The document provides contact information for Python Homework Help, including their website, email address, and phone number. It then presents sample code defining two Python classes - A and B (which inherits from A) - and evaluates various expressions involving objects of these classes. Next, it defines Student and StudentBody classes to represent university students and evaluates methods on a sample StudentBody object. Finally, it presents additional examples involving state machine definitions and difference equations.
Functional Programming in Scala.
A lot of my examples here come from the book
Functional Programming in Scala by Paul Chiusano and Rúnar Bjarnason. It is a good book, buy it.
Introduction to Neural Networks and Deep Learning from Scratch, by Ahmed BESBES
If you're willing to understand how neural networks work behind the scene and debug the back-propagation algorithm step by step by yourself, this presentation should be a good starting point.
We'll cover elements on:
- the popularity of neural networks and their applications
- the artificial neuron and the analogy with the biological one
- the perceptron
- the architecture of multi-layer perceptrons
- loss functions
- activation functions
- the gradient descent algorithm
At the end, there will be an implementation FROM SCRATCH of a fully functioning neural net.
code: https://github.com/ahmedbesbes/Neural-Network-from-scratch
This document provides an overview of stacks as an abstract data type (ADT). It defines a stack as a last-in first-out data structure for storing arbitrary objects. The key stack operations of push, pop, top, and size are described. Exceptions that can occur for empty stacks are discussed. An array-based implementation of stacks is presented, including algorithms for the stack operations and analysis of its performance and limitations. Applications of stacks like undo buffers and method call stacks are mentioned. Finally, an example of using a stack to check matching parentheses in an expression is provided.
This document provides an overview of supervised learning and linear regression. It introduces supervised learning problems using an example of predicting house prices based on living area. Linear regression is discussed as an initial approach to model this relationship. The cost function is defined as the mean squared error between predictions and targets. Gradient descent and stochastic gradient descent are presented as algorithms to minimize this cost function and learn the parameters of the linear regression model.
This document discusses algorithms for predictive modeling, including logistic regression. It presents a medical dataset containing measurements of heart patients and whether they survived. Logistic regression is applied to predict survival using maximum likelihood estimation. Numerical optimization techniques like BFGS and Fisher's algorithm are discussed for maximum likelihood estimation of logistic regression. Iteratively reweighted least squares is also presented as an alternative approach.
This document describes the solutions and questions for a midterm exam in 6.036: Spring 2018. It provides instructions for taking the exam such as writing your name on each page and coming to the front to ask questions. The exam consists of 6 multiple choice questions worth a total of 100 points. Question 1 involves linear classification and calculating margins. Question 2 asks about sources of error in machine learning models. Question 3 involves choosing appropriate representations and loss functions for different prediction problems. Question 4 introduces radial basis features for nonlinear classification. Question 5 discusses shortcut connections in neural networks.
This document provides an introduction to machine learning, covering key topics such as what machine learning is, common learning algorithms and applications. It discusses linear models, kernel methods, neural networks, decision trees and more. It also addresses challenges in machine learning like balancing fit and robustness, and evaluating model performance using techniques like ROC curves. The goal of machine learning is to build models that can learn from data to make predictions or decisions.
The document discusses arrays and array data structures. It defines an array as a set of index-value pairs where each index maps to a single value. It then describes the common array abstract data type (ADT) with methods like create, retrieve, and store for manipulating arrays. The document also discusses sparse matrix data structures and provides an ADT for sparse matrices with methods like create, transpose, add, and multiply.
I am Josh U., a Python Homework Expert at pythonhomeworkhelp.com. I hold a Master's in Computer Science from the University of Warwick. I have been helping students with their homework for the past 13 years. I solve homework related to Python.
Visit pythonhomeworkhelp.com or email support@pythonhomeworkhelp.com. You can also call on +1 678 648 4277 for any assistance with Python Homework.
1) Base types in Python include integers, floats, booleans, strings, bytes, lists, tuples, dictionaries, sets, and None. These types support various operations like indexing, slicing, mathematical operations, membership testing, etc.
2) Functions are defined using the def keyword and can take parameters and return values. Functions are called by specifying the function name followed by parentheses that may contain arguments.
3) Common operations on containers in Python include getting the length, minimum/maximum values, sum, sorting, checking for membership, enumerating, and zipping containers. Methods like append, extend, insert, remove, pop can modify lists in-place.
This document discusses object detection using Adaboost and various techniques. It begins with an overview of the Adaboost algorithm and provides a toy example to illustrate how it works. Next, it describes how Viola and Jones used Adaboost with Haar-like features and an integral image representation for rapid face detection in images. It achieved high detection rates with very low false positives. The document also discusses how Schneiderman and Kanade used a parts-based representation with localized wavelet coefficients as features for object detection and used statistical independence of parts to obtain likelihoods for classification.
This document summarizes a lecture on fuzzy logic and neural networks. It introduces fuzzy sets and compares them to classical or crisp sets. Key concepts covered include fuzzy set representation using membership functions, common membership function types like triangular and trapezoidal, fuzzy set operations, and properties of fuzzy and crisp sets. Examples are provided to demonstrate calculating membership values and performing operations on fuzzy sets.
The document discusses stacks as an abstract data type (ADT) and their implementation in Java. It defines stacks as LIFO data structures that support push, pop, and peek operations. An array-based implementation of stacks in Java is presented using an array and index to track elements. Growable stacks are also discussed, comparing strategies to dynamically increase the array size. The document concludes by explaining how method calls in Java programs use a stack to enable recursion and error tracing.
This document provides a summary of key Python concepts including:
1. Base data types like integers, floats, booleans, strings, lists, tuples, dictionaries, sets, and None.
2. Variables, assignments, identifiers, conversions between types, and string formatting.
3. Conditional statements like if/elif/else and boolean logic operators.
4. Loops like for and while loops for iterating over sequences.
5. Functions for defining reusable blocks of code and calling functions.
1. Python provides various built-in container types including lists, tuples, dictionaries, sets, and strings for storing and organizing data.
2. These container types support common operations like indexing, slicing, membership testing, and methods for insertion, deletion, and modification.
3. The document provides examples of using operators and built-in functions to perform tasks like formatting strings, file I/O, conditional logic, loops, functions, and exceptions.
1) Base types in Python include integers, floats, booleans, strings, bytes, lists, tuples, dictionaries, sets, and None. These types support various operations like indexing, slicing, membership testing, and type conversions.
2) Common loop statements in Python are for loops and while loops. For loops iterate over sequences, while loops repeat as long as a condition is true. Loop control statements like break, continue, and else can be used to control loop execution.
3) Functions are defined using the def keyword and can take parameters and return values. Functions allow for code reusability and organization. Built-in functions operate on containers to provide functionality like sorting, summing, and converting between types.
Scala is a multi-paradigm programming language that runs on the Java Virtual Machine. It integrates features of object-oriented and functional programming languages. Some key features of Scala include: supporting both object-oriented and functional programming, providing improvements over Java in areas like syntax, generics, and collections, and introducing new features like pattern matching, traits, and implicit conversions.
This document discusses stacks as an abstract data type (ADT) and their implementation in Java. It begins by defining an ADT as having a specific interface of operations and axioms defining the semantics of those operations. Stacks are introduced as a LIFO data structure that supports push, pop, and top operations. The document then discusses implementing a stack interface in Java using exceptions to handle errors. It provides an example array-based stack implementation in Java using an array and index to track elements. Finally, it discusses an application of stacks to efficiently compute the span of stock price changes over time by using a stack to track previous higher prices.
Understanding Inductive Bias in Machine Learning, by SUTEJAS
This presentation explores the concept of inductive bias in machine learning. It explains how algorithms come with built-in assumptions and preferences that guide the learning process. You'll learn about the different types of inductive bias and how they can impact the performance and generalizability of machine learning models.
The presentation also covers the positive and negative aspects of inductive bias, along with strategies for mitigating potential drawbacks. We'll explore examples of how bias manifests in algorithms like neural networks and decision trees.
By understanding inductive bias, you can gain valuable insights into how machine learning models work and make informed decisions when building and deploying them.
Introduction - e-waste - definition - sources of e-waste - hazardous substances in e-waste - effects of e-waste on environment and human health - need for e-waste management - e-waste handling rules - waste minimization techniques for managing e-waste - recycling of e-waste - disposal treatment methods of e-waste - mechanism of extraction of precious metal from leaching solution - global scenario of e-waste - e-waste in India - case studies.
Embedded machine learning-based road conditions and driving behavior monitoring, IJECEIAES
Car accident rates have increased in recent years, resulting in losses in human lives, properties, and other financial costs. An embedded machine learning-based system is developed to address this critical issue. The system can monitor road conditions, detect driving patterns, and identify aggressive driving behaviors. The system is based on neural networks trained on a comprehensive dataset of driving events, driving styles, and road conditions. The system effectively detects potential risks and helps mitigate the frequency and impact of accidents. The primary goal is to ensure the safety of drivers and vehicles. Collecting data involved gathering information on three key road events: normal street and normal drive, speed bumps, circular yellow speed bumps, and three aggressive driving actions: sudden start, sudden stop, and sudden entry. The gathered data is processed and analyzed using a machine learning system designed for limited power and memory devices. The developed system resulted in 91.9% accuracy, 93.6% precision, and 92% recall. The achieved inference time on an Arduino Nano 33 BLE Sense with a 32-bit CPU running at 64 MHz is 34 ms and requires 2.6 kB peak RAM and 139.9 kB program flash memory, making it suitable for resource-constrained embedded systems.
Harnessing WebAssembly for Real-time Stateless Streaming Pipelines, by Christina Lin
Traditionally, dealing with real-time data pipelines has involved significant overhead, even for straightforward tasks like data transformation or masking. However, in this talk, we’ll venture into the dynamic realm of WebAssembly (WASM) and discover how it can revolutionize the creation of stateless streaming pipelines within a Kafka (Redpanda) broker. These pipelines are adept at managing low-latency, high-data-volume scenarios.
Presentation of IEEE Slovenia CIS (Computational Intelligence Society) Chapte..., University of Maribor
Slides from talk presenting:
Aleš Zamuda: Presentation of IEEE Slovenia CIS (Computational Intelligence Society) Chapter and Networking.
Presentation at IcETRAN 2024 session:
"Inter-Society Networking Panel GRSS/MTT-S/CIS
Panel Session: Promoting Connection and Cooperation"
IEEE Slovenia GRSS
IEEE Serbia and Montenegro MTT-S
IEEE Slovenia CIS
11TH INTERNATIONAL CONFERENCE ON ELECTRICAL, ELECTRONIC AND COMPUTING ENGINEERING
3-6 June 2024, Niš, Serbia
International Conference on NLP, Artificial Intelligence, Machine Learning an..., gerogepatton
International Conference on NLP, Artificial Intelligence, Machine Learning and Applications (NLAIM 2024) offers a premier global platform for exchanging insights and findings in the theory, methodology, and applications of NLP, Artificial Intelligence, Machine Learning, and their applications. The conference seeks substantial contributions across all key domains of NLP, Artificial Intelligence, Machine Learning, and their practical applications, aiming to foster both theoretical advancements and real-world implementations. With a focus on facilitating collaboration between researchers and practitioners from academia and industry, the conference serves as a nexus for sharing the latest developments in the field.
Advanced control scheme of doubly fed induction generator for wind turbine us..., IJECEIAES
This paper describes a speed control device for generating electrical energy on an electricity network based on the doubly fed induction generator (DFIG) used for wind power conversion systems. At first, a double-fed induction generator model was constructed. A control law is formulated to govern the flow of energy between the stator of a DFIG and the energy network using three types of controllers: proportional integral (PI), sliding mode controller (SMC) and second order sliding mode controller (SOSMC). Their different results in terms of power reference tracking, reaction to unexpected speed fluctuations, sensitivity to perturbations, and resilience against machine parameter alterations are compared. MATLAB/Simulink was used to conduct the simulations for the preceding study. Multiple simulations have shown very satisfying results, and the investigations demonstrate the efficacy and power-enhancing capabilities of the suggested control system.
Electric vehicle and photovoltaic advanced roles in enhancing the financial p..., IJECEIAES
Climate change's impact on the planet forced the United Nations and governments to promote green energies and electric transportation. The deployments of photovoltaic (PV) and electric vehicle (EV) systems gained stronger momentum due to their numerous advantages over fossil fuel types. The advantages go beyond sustainability to reach financial support and stability. The work in this paper introduces the hybrid system between PV and EV to support industrial and commercial plants. This paper covers the theoretical framework of the proposed hybrid system including the required equation to complete the cost analysis when PV and EV are present. In addition, the proposed design diagram which sets the priorities and requirements of the system is presented. The proposed approach allows setup to advance their power stability, especially during power outages. The presented information supports researchers and plant owners to complete the necessary analysis while promoting the deployment of clean energy. The result of a case study that represents a dairy milk farmer supports the theoretical works and highlights its advanced benefits to existing plants. The short return on investment of the proposed approach supports the paper's novelty approach for the sustainable electrical system. In addition, the proposed system allows for an isolated power setup without the need for a transmission line which enhances the safety of the electrical network
3. Concept - Classifier
Training procedure:
Give +ve and -ve examples to the system; the system will learn to classify an unknown input.
E.g. give pictures of faces (+ve examples) and non-faces (-ve examples) to train the system.
Detection procedure:
Input an unknown sample (e.g. an image); the system will tell you whether it is a face or not (face / non-face).
4. First let us learn what a weak classifier h( ) is
A weak classifier is defined by a line v = mu + c (equivalently v - mu = c) in the (u,v) plane, with gradient m and intercept c (origin at (0,0)):
•m and c are used to define the line.
•Any point in the "gray" area satisfies v - mu < c.
•Any point in the "white" area satisfies v - mu > c.
Case 1: if a point x = [u,v] is in the "gray" area then h(x) = 1, otherwise h(x) = -1. It can be written as:
h(x) = 1 if v - mu < c (where m, c are given constants), and h(x) = -1 otherwise.
Case 2: if a point x = [u,v] is in the "white" area then h(x) = 1, otherwise h(x) = -1. It can be written as:
h(x) = 1 if -(v - mu) < -c (where m, c are given constants), and h(x) = -1 otherwise.
At time t, case 1 and case 2 combine into one equation, and a polarity p is used to control which case you want to use:
h_t(x) = 1 if p_t f(x) < p_t c_t, and h_t(x) = -1 otherwise,
where polarity p_t ∈ {1, -1}, f is the function f(x = [u,v]) = v - m_t u, u and v are variables, and m_t, c_t are constants.
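A minimal sketch of this weak classifier as reconstructed above (my illustration, not the author's code):

# Weak classifier sketch: h(x) = 1 if p*(v - m*u) < p*c, else -1, where the
# polarity p in {+1, -1} selects whether the "gray" or the "white" side of the
# line v = m*u + c is the +1 side.
def weak_classifier(u, v, m, c, p=1):
    f = v - m * u                      # f(x=[u,v]) = v - m*u
    return 1 if p * f < p * c else -1

# Example: the line v = 2u + 1 (m=2, c=1); the point (0,0) lies below it.
print(weak_classifier(0.0, 0.0, m=2.0, c=1.0, p=1))    # 1: "gray" side is +1
print(weak_classifier(0.0, 0.0, m=2.0, c=1.0, p=-1))   # -1: polarity flipped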
5. Adaboost - Adaptive Boosting
Instead of resampling, AdaBoost uses training-set re-weighting: each training sample carries a weight that determines its probability of being selected for a training set.
AdaBoost is an algorithm for constructing a "strong" classifier as a linear combination of simple "weak" classifiers.
The final classification is based on a weighted vote of the weak classifiers.
6. Concept
Weak learners are taken from the family of lines.
h => p(error) = 0.5, i.e. it is at chance.
Each data point has a class label, yt = +1 or -1, and a weight, wt = 1.
7. Concept
This one seems to be the best.
Each data point has a class label, yt = +1 or -1, and a weight, wt = 1.
This is a 'weak classifier': it performs slightly better than chance.
8. Concept
We set a new problem for which the previous weak classifier performs at chance again.
Each data point has a class label, yt = +1 or -1.
We update the weights: wt <- wt exp{-yt Ht}.
9. Concept
We set a new problem for which the previous weak classifier performs at chance again.
Each data point has a class label, yt = +1 or -1.
We update the weights: wt <- wt exp{-yt Ht}.
10. Concept
We set a new problem for which the previous weak classifier performs at chance again.
Each data point has a class label, yt = +1 or -1.
We update the weights: wt <- wt exp{-yt Ht}.
11. Concept
We set a new problem for which the previous weak classifier performs at chance again.
Each data point has a class label, yt = +1 or -1.
We update the weights: wt <- wt exp{-yt Ht}.
12. Concept
The strong (non-linear) classifier is built as the combination of all the weak (linear) classifiers f1, f2, f3, f4.
13. An example to show how Adaboost works
Training:
Present ten samples to the system: [xi = {ui,vi}, yi = '+' or '-']
5 +ve (blue, diamond) samples
5 -ve (red, circle) samples
Train up the system.
Detection:
Give an input xj = (1.5, 3.4); the system will tell you whether it is '+' or '-' (e.g. face or non-face).
Example: u = weight, v = height; classification = suitability to play on the basketball team.
Two of the training points shown on the u-v axes are [xi = {-0.48, 0}, yi = '+'] and [xi = {-0.2, -0.5}, yi = '+'].
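A minimal sketch of how this training set could be represented in code; only the two listed points come from the slide, the remaining coordinates are hypothetical placeholders since the slide shows them only as a figure:

# Toy training set layout: 10 points (u,v) with labels +1 / -1.
import numpy as np

X = np.array([[-0.48, 0.0], [-0.2, -0.5],             # '+' samples from the slide
              [0.1, 0.3],  [0.4, -0.1], [0.3, 0.6],   # hypothetical '+' samples
              [1.0, 1.2],  [1.3, 0.8],  [0.9, -0.7],  # hypothetical '-' samples
              [1.5, 0.2],  [1.8, 1.0]])
y = np.array([+1, +1, +1, +1, +1, -1, -1, -1, -1, -1])  # 5 +ve, 5 -ve labels
x_query = np.array([1.5, 3.4])   # the unknown input to classify after training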
14. Adaboost concept
Training data: 6 squares, 5 circles.
Objective: train a classifier to classify an unknown input, i.e. to see if it is a circle or a square.
Using this training data, how do we make a classifier?
One axis-parallel weak classifier cannot achieve 100% classification. E.g. h1(), h2(), h3() all fail. That means no matter how you place the decision line (horizontally or vertically) you cannot get a 100% classification result. You may try it yourself!
The solution is a complex strong classifier H_complex( ). The above strong classifier should work, but how can we find it?
ANSWER: Combine many weak classifiers to achieve it.
15. How? Each classifier may not be perfect, but each can achieve over a 50% correct rate.
The weak classifiers h1( ), h2( ), ..., h7( ) are combined to form the final strong classifier:
H(x) = sign( Σ_{t=1..T} α_t h_t(x) )
where α_i is the weight for each weak classifier, for i = 1, 2, ..., 7, and the sign of the weighted sum gives the classification result.
16. Adaboost Algorithm
Given: (x_1, y_1), ..., (x_n, y_n), where x_i ∈ X and y_i ∈ Y = {−1, +1}.
Initialization: initialize the distribution (weight) D_1(i) = 1/n for i = 1, ..., n, such that n = M + L, where M = number of positive (+1) examples and L = number of negative (−1) examples.
Main training loop. For t = 1, ..., T:
{
Step 1a: Find the classifier h_t : X → {−1, +1} that minimizes the error with respect to D_t; that means h_t = argmin_{h_j} ε_j.
Step 1b: error ε_j = Σ_{i=1..n} D_t(i)·I(h_j(x_i) ≠ y_i), where I(h_j(x_i) ≠ y_i) = 1 if h_j(x_i) ≠ y_i (classified incorrectly) and 0 otherwise.
Checking step (prerequisite): ε_t < 0.5 (an error smaller than 0.5 is ok); otherwise stop.
Step 2: α_t = (1/2)·ln((1 − ε_t)/ε_t), the weight (or confidence value) of h_t.
Step 3: D_{t+1}(i) = D_t(i)·exp(−α_t·y_i·h_t(x_i)) / Z_t (see the explanation on a later slide), where Z_t is a normalization factor so that D_{t+1} is a probability distribution:
Z_t = correct_weight + incorrect_weight = Σ_{i correctly classified} D_t(i)·e^{−α_t} + Σ_{i incorrectly classified} D_t(i)·e^{α_t}.
Step 4: Current total cascaded-classifier error E_t = (1/n)·Σ_{i=1..n} E_{t,i}, while the current-classifier error E_{t,i} is defined as follows:
• if x_i is correctly classified by the current cascaded classifier, i.e. y_i = sign(Σ_{j=1..t} α_j·h_j(x_i)), then E_{t,i} = 0;
• if x_i is incorrectly classified by the current cascaded classifier, i.e. y_i ≠ sign(Σ_{j=1..t} α_j·h_j(x_i)), then E_{t,i} = 1.
If E_t = 0, then T = t, break;
}
The output (the final strong classifier): H(x) = sign( Σ_{t=1..T} α_t·h_t(x) ).
(See enlarged versions of the initialization, the main training loop, and the final strong classifier in the following slides.)
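As a rough illustration of the loop above, the sketch below implements the steps in NumPy, using axis-parallel decision stumps as the weak-classifier family, as the later slides do. All names (`train_adaboost`, `best_stump`, `stump_predict`) are illustrative, and Step 4's cascaded-error check is included in a simplified form; this is a sketch under those assumptions, not the slides' reference implementation.

```python
import numpy as np

def stump_predict(X, feature, threshold, polarity):
    """Axis-parallel weak classifier: +1 where polarity*x[feature] < polarity*threshold, else -1."""
    return np.where(polarity * X[:, feature] < polarity * threshold, 1, -1)

def best_stump(X, y, D):
    """Steps 1a/1b: pick the stump (feature, threshold, polarity) with the minimum weighted error."""
    best = None
    for feature in range(X.shape[1]):
        for threshold in np.unique(X[:, feature]):
            for polarity in (1, -1):
                pred = stump_predict(X, feature, threshold, polarity)
                err = D[pred != y].sum()
                if best is None or err < best[0]:
                    best = (err, feature, threshold, polarity)
    return best

def predict(classifiers, X):
    """Final strong classifier H(x) = sign(sum_t alpha_t * h_t(x))."""
    return np.sign(sum(a * stump_predict(X, f, th, p) for a, f, th, p in classifiers))

def train_adaboost(X, y, T=10):
    n = len(y)
    D = np.full(n, 1.0 / n)                    # initialization: D_1(i) = 1/n
    classifiers = []                           # list of (alpha, feature, threshold, polarity)
    for t in range(T):
        err, feature, threshold, polarity = best_stump(X, y, D)   # Steps 1a, 1b
        if err >= 0.5:                         # checking step: weak learner must beat chance
            break
        alpha = 0.5 * np.log((1 - err) / (err + 1e-12))           # Step 2
        pred = stump_predict(X, feature, threshold, polarity)
        D = D * np.exp(-alpha * y * pred)                         # Step 3 (numerator)
        D = D / D.sum()                                           # divide by Z_t
        classifiers.append((alpha, feature, threshold, polarity))
        if np.all(predict(classifiers, X) == y):                  # Step 4, simplified: E_t = 0
            break
    return classifiers

# Toy usage with made-up 2-D points x = (u, v) and labels y:
X = np.array([[0.1, 2.0], [0.3, 1.0], [1.2, 0.5], [2.0, 2.2], [2.5, 0.2],
              [3.0, 3.0], [3.2, 1.5], [4.0, 0.8], [4.5, 2.8], [5.0, 1.1]])
y = np.array([1, 1, 1, 1, 1, -1, -1, -1, -1, -1])
model = train_adaboost(X, y, T=5)
print(predict(model, X))   # [ 1.  1.  1.  1.  1. -1. -1. -1. -1. -1.]
```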
18. Main loop (steps 1, 2, 3)
For t = 1, ..., T:
{
Step 1a: Find the classifier h_t : X → {−1, +1} that minimizes the error with respect to D_t; that means h_t = argmin_{h_j} ε_j.
Step 1b: error ε_j = Σ_{i=1..n} D_t(i)·I(h_j(x_i) ≠ y_i), where I(h_j(x_i) ≠ y_i) = 1 if h_j(x_i) ≠ y_i (classified incorrectly) and 0 otherwise.
Checking step (prerequisite): ε_t < 0.5 (an error smaller than 0.5 is ok); otherwise stop.
Step 2: α_t = (1/2)·ln((1 − ε_t)/ε_t), the weight (or confidence value).
Step 3: D_{t+1}(i) = D_t(i)·exp(−α_t·y_i·h_t(x_i)) / Z_t (see the next slide for an explanation).
19. Main loop (step 4)
Step 4: Current total cascaded-classifier error E_t = (1/n)·Σ_{i=1..n} E_{t,i}, while the current-classifier error E_{t,i} is defined as follows:
• if x_i is correctly classified by the current cascaded classifier, i.e. y_i = sign(Σ_{j=1..t} α_j·h_j(x_i)), then E_{t,i} = 0;
• if x_i is incorrectly classified by the current cascaded classifier, i.e. y_i ≠ sign(Σ_{j=1..t} α_j·h_j(x_i)), then E_{t,i} = 1.
If E_t = 0, then T = t, break;
}
The final strong classifier: H(x) = sign( Σ_{t=1..T} α_t·h_t(x) ).
20. AdaBoost chooses this weight update function deliberately
Because,
•when a training sample is correctly classified, weight decreases
•when a training sample is incorrectly classified, weight increases
Note: Normalization factor Zt in step3
Recall Step 3: D_{t+1}(i) = D_t(i)·exp(−α_t·y_i·h_t(x_i)) / Z_t,
where Z_t is a normalization factor so that D_{t+1} becomes a probability distribution:
Z_t = correct_weight + incorrect_weight
    = Σ_{i correctly classified} D_t(i)·e^{−α_t} + Σ_{i incorrectly classified} D_t(i)·e^{α_t}.
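A small sketch of this update rule (the sample values are made up for illustration): the factor e^{−α_t} shrinks the weights of correctly classified samples, e^{+α_t} grows the weights of misclassified ones, and dividing by Z_t keeps the weights summing to 1.

```python
import numpy as np

def update_weights(D, y, h_pred, alpha):
    """AdaBoost step 3: D_{t+1}(i) = D_t(i) * exp(-alpha * y_i * h_t(x_i)) / Z_t."""
    unnormalized = D * np.exp(-alpha * y * h_pred)   # y_i*h_t(x_i) = +1 if correct, -1 if wrong
    Z = unnormalized.sum()                           # normalization factor Z_t
    return unnormalized / Z                          # correct weights shrink, wrong ones grow

# Tiny check with made-up values: 3 samples, the last one misclassified
D = np.array([1/3, 1/3, 1/3])
y = np.array([1, -1, 1])
h_pred = np.array([1, -1, -1])
print(update_weights(D, y, h_pred, alpha=0.5))  # the third weight is now the largest
```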
21. Note: Stopping criterion of the main loop
The main loop stops when all training data are correctly
classified by the cascaded classifier up to stage t.
That is, if E_t = 0 (every x_i satisfies y_i = sign(Σ_{j=1..t} α_j·h_j(x_i))), then T = t and the loop breaks.
22. D_t(i) = weight
D_t(i) = probability distribution of the i-th training sample at time t, for i = 1, 2, ..., n.
It shows how much you trust this sample.
At t = 1, all samples are treated the same, with equal weight: D_{t=1}(i) is the same for all i.
At t > 1, D_t(i) will be modified, as we will see later.
23. An example to show how Adaboost
works
Training,
Present ten samples to the
system :[xi={ui,vi},yi={’+’ or ‘-’}]
5 +ve (blue, diamond) samples
5 –ve (red, circle) samples
Train up the classification system.
Detection example:
Give an input xj=(1.5,3.4)
The system will tell you it is ‘+’ or ‘-’.
E.g. Face or non-face.
Example:
You may treat u=weight, v=height
Classification task: suitability to play
in the basketball team.
Two sample labels shown on the u-axis / v-axis of the figure:
[xi={-0.48, 0}, yi='+']
[xi={-0.2, -0.5}, yi='+']
24. Initialization
M=5 +ve (blue, diamond) samples
L=5 –ve (red, circle) samples
n=M+L=10
Initialize weight D(t=1)(i)= 1/10 for all
i=1,2,..,10,
So, D(1)(1)=0.1, D(1) (2)=0.1,……, D(1)(10)=0.1
Given: (x_1, y_1), ..., (x_n, y_n), where x_i ∈ X and y_i ∈ Y = {−1, +1}.
Initialize D_{t=1}(i) = 1/n, such that n = M + L, where M = number of positive examples and L = number of negative examples.
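In code, the initialization for this 10-sample example is just a few lines (a sketch; the variable names are my own):

```python
import numpy as np

M, L = 5, 5              # 5 positive (+1) and 5 negative (-1) examples
n = M + L                # n = 10 training samples
D = np.full(n, 1.0 / n)  # D_1(i) = 1/10 for every sample
print(D)                 # [0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1]
```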
25. Select h( ): For simplicity in implementation
we use the Axis-parallel weak classifier
Recall:
h_t(x) = 1 if p_t·f_t(x) < p_t·θ_t, and 0 otherwise,
where p = polarity {1 or −1}, θ = threshold, and f is the function f(x) = f(u, v) (for a general line classifier, f(u, v) = v − mu); m, c are constants and u, v are the variables.
Axis-parallel weak classifiers (h_a(x) and h_b(x) in the figure):
• a line of gradient m = 0 (a horizontal line), whose position can be controlled by v_0;
• a vertical line, whose position can be controlled by u_0.
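A minimal sketch of this axis-parallel weak classifier with polarity p and threshold θ; the function name, the axis argument, and the example numbers are assumptions for illustration, while the 1/0 return values follow the slide's definition.

```python
def axis_parallel_h(x, p, theta, axis):
    """h_t(x) = 1 if p*f(x) < p*theta, else 0.

    axis='u' compares the u coordinate against theta (a vertical decision line at u0 = theta);
    axis='v' compares the v coordinate against theta (a horizontal decision line at v0 = theta).
    p is the polarity, +1 or -1.
    """
    u, v = x
    f = u if axis == 'u' else v
    return 1 if p * f < p * theta else 0

# Examples with made-up numbers:
print(axis_parallel_h((1.5, 3.4), p=1, theta=2.0, axis='u'))   # 1.5 < 2.0   -> 1
print(axis_parallel_h((1.5, 3.4), p=1, theta=2.0, axis='v'))   # 3.4 >= 2.0  -> 0
print(axis_parallel_h((1.5, 3.4), p=-1, theta=2.0, axis='v'))  # -3.4 < -2.0 -> 1
```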
26. Step1a,
1b
Assume h() can only be
horizontal or vertical
separators. (axis-parallel
weak classifier)
There are still many ways to set h( );
here, if this hq() is selected, there will be
3 incorrectly classified training samples.
See the 3 circled training
samples
We can go through all h( )s
and select the best with the
least misclassification (see
the following 2 slides)
{ Step 1a: Find the classifier h_t : X → {−1, +1} that minimizes the error ε with respect to D_t; that means h_t = argmin_{h_q} ε_q.
Step 1b: checking step (prerequisite): ε_t < 0.5 (an error smaller than 0.5 is ok); otherwise stop. }
Incorrectly classified by hq()
hq()
27. Example (training example slides from [Smyth 2007]): classify the ten red (circle) / blue (diamond) dots.
Step 1a:
h_i(x) = +1 if p_i·u ≥ p_i·u_i, and −1 otherwise, with polarity p_i ∈ {1, −1}.
(The v component of x = (u, v) is not used, because h_i(x) is parallel to the vertical axis.)
Initialize: D^{(t=1)}(i) = 1/10 for every sample.
You may choose one of the following axis-parallel (vertical-line) classifiers; the vertical dotted lines in the figure are the possible choices.
[Figure: candidate vertical-line classifiers h_{i=1}(x), ..., h_{i=4}(x), ..., h_{i=9}(x), placed at positions u1, u2, ..., u9 along the u-axis.]
There are 9x2 choices here,
hi=1,2,3,..9, (polarity +1)
h’i=1,2,3,..9, (polarity -1)
28. Example (training example slides from [Smyth 2007]): classify the ten red (circle) / blue (diamond) dots.
Step 1a:
h_j(x) = +1 if p_j·v ≥ p_j·v_j, and −1 otherwise, with polarity p_j ∈ {1, −1}.
(The u component of x = (u, v) is not used, because h_j(x) is parallel to the horizontal axis.)
Initialize: D^{(t=1)}(i) = 1/10 for every sample.
You may choose one of the following axis-parallel (horizontal-line) classifiers; the horizontal dotted lines in the figure are the possible choices.
[Figure: candidate horizontal-line classifiers h_{j=1}(x), h_{j=2}(x), ..., h_{j=9}(x), placed at positions v1, v2, ..., v9 along the v-axis.]
There are 9x2 choices here,
hj=1,2,3,..9, (polarity +1)
h’j=1,2,3,..9, (polarity -1)
All together, including the previous slide, there are 36 choices.
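A sketch of how these 36 candidates could be enumerated (9 vertical positions × 2 polarities plus 9 horizontal positions × 2 polarities); the position lists are placeholders, since the slides show the dotted lines only graphically.

```python
from itertools import product

# Placeholder positions for the dotted candidate lines (the slides show them only graphically)
u_positions = [f"u{k}" for k in range(1, 10)]   # u1 ... u9: vertical lines
v_positions = [f"v{k}" for k in range(1, 10)]   # v1 ... v9: horizontal lines

candidates = (
    [("vertical", u, p) for u, p in product(u_positions, (+1, -1))]
    + [("horizontal", v, p) for v, p in product(v_positions, (+1, -1))]
)
print(len(candidates))   # 36 candidate weak classifiers in total
```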
29. Step 1b:
Find and check the error of the weak classifier
h( )
To evaluate how successful your selected weak classifier h( ) is,
we can evaluate its error rate:
ɛt = misclassification probability of h( ).
Checking: if εt >= 0.5 (something is wrong), stop the training,
because by definition a weak classifier should be slightly
better than a random choice (probability = 0.5).
So if εt >= 0.5, your h( ) is a bad choice; redesign another
h''( ) and do the training based on the new h''( ).
Step 1b: error ε_t = Σ_{i=1..n} D_t(i)·I(h_t(x_i) ≠ y_i), where I(h_t(x_i) ≠ y_i) = 1 if h_t(x_i) ≠ y_i (classified incorrectly) and 0 otherwise.
Checking step (prerequisite): ε_t < 0.5; otherwise stop.
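Step 1b in code, as a small sketch: the weighted error is the sum of D_t(i) over the misclassified samples, and any value >= 0.5 fails the prerequisite. The labels and predictions below are made up so that exactly 3 of the 10 equally weighted samples are wrong, as in the t = 1 example.

```python
import numpy as np

def weighted_error(D, y, h_pred):
    """epsilon_t = sum_i D_t(i) * I(h_t(x_i) != y_i)."""
    return D[h_pred != y].sum()

D = np.full(10, 0.1)                                    # equal weights at t = 1
y      = np.array([1, 1, 1, 1, 1, -1, -1, -1, -1, -1])  # made-up labels
h_pred = np.array([1, 1, 1, 1, 1, -1, -1,  1,  1,  1])  # 3 samples misclassified
eps = weighted_error(D, y, h_pred)
print(round(eps, 3))                                    # 0.3
assert eps < 0.5, "not better than chance: stop and redesign h()"
```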
30. Assume h() can only be
horizontal or vertical
separators.
How many different
classifiers are available?
If hj() is selected as shown,
circle the misclassified
training samples. Find ɛ( ) to
see misclassification
probability if the probability
distribution (D) for each
sample is the same.
Find h() with minimum error.
{ Step 1a: Find the classifier h_t : X → {−1, +1} that minimizes the error with respect to D_t.
Step 1b: checking step (prerequisite): ε_t < 0.5; otherwise stop. }
hj()
31. Result of step2 at t=1
Incorrectly classified by ht=1(x)
ht=1(x)
32. Step2 at t=1 (refer to the previous
slide)
Using εt=1=0.3, because
3 samples are
incorrectly classified
ε_{t=1} = 0.1 + 0.1 + 0.1 = 0.3
Step 2: α_t = (1/2)·ln((1 − ε_t)/ε_t), where ε_t is the weighted error rate of classifier h_t,
so α_{t=1} = (1/2)·ln((1 − 0.3)/0.3) = 0.424.
(Recall Step 1b: ε_t = Σ_{i=1..n} D_t(i)·I(h_t(x_i) ≠ y_i), where I = 1 if h_t(x_i) ≠ y_i and 0 otherwise.)
The proof can be found at http://vision.ucsd.edu/~bbabenko/data/boosting_note.pdf
Also see appendix.
33. Step3 at t=1, update Dt to Dt+1
Update the weight Dt(i) for each training sample i
Step 3: D_{t+1}(i) = D_t(i)·exp(−α_t·y_i·h_t(x_i)) / Z_t,
where Z_t is a normalization factor, so D_{t+1} is a distribution (a probability function).
The proof can be found at http://vision.ucsd.edu/~bbabenko/data/boosting_note.pdf
Also see appendix.
34. Step 3: Find Z_t (the normalization factor) first. Note that D_{t=1}(i) = 0.1 and α_{t=1} = 0.424.
Z_t = correct_weight + incorrect_weight
    = Σ_{i correctly classified} D_t(i)·e^{−α_t} + Σ_{i incorrectly classified} D_t(i)·e^{α_t}
(correctly classified: y_i = h_t(x_i), so y_i·h_t(x_i) = +1 and the factor is e^{−α_t}; incorrectly classified: y_i ≠ h_t(x_i), so y_i·h_t(x_i) = −1 and the factor is e^{α_t})
At t = 1 there are 7 correct and 3 incorrect samples, so
Z_{t=1} = 0.1·7·e^{−0.424} + 0.1·3·e^{0.424} = 0.1·7·0.65 + 0.1·3·1.52 = 0.455 + 0.456 = 0.911.
Note: currently t=1,
Dt=1(i)=0.1 for all i
7 correctly classified
3 incorrectly classified
35. Step 3: Example: update Dt to Dt+1
If correctly classified, weight Dt+1 will decrease, and vice versa.
D_{t+1}(i) = D_t(i)·e^{−α_t·y_i·h_t(x_i)} / Z_t, with D_{t=1}(i) = 0.1, α_{t=1} = 0.42, Z_{t=1} = 0.911:
• correctly classified: D_{t+1}(i) = 0.1·e^{−0.42} / 0.911 = 0.1·0.65 / 0.911 = 0.0714 (the weight decreases);
• incorrectly classified: D_{t+1}(i) = 0.1·e^{0.42} / 0.911 = 0.1·1.52 / 0.911 = 0.167 (the weight increases).
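The whole t = 1 round of this worked example can be verified in a few lines. Note that keeping e^{±α} at full precision gives Z ≈ 0.917 rather than the slides' 0.911 (the slides round e^{±α} to 0.65 and 1.52), but the normalized weights come out the same.

```python
import numpy as np

eps = 0.3                                   # 3 of 10 samples misclassified, weight 0.1 each
alpha = 0.5 * np.log((1 - eps) / eps)       # Step 2: ~0.4236 (the slides round to 0.424)

D = 0.1
Z = 7 * D * np.exp(-alpha) + 3 * D * np.exp(alpha)    # Step 3 normalization factor
print(round(alpha, 3), round(Z, 3))                   # 0.424 0.917

w_correct = D * np.exp(-alpha) / Z          # weight of each correctly classified sample
w_incorrect = D * np.exp(alpha) / Z         # weight of each incorrectly classified sample
print(round(w_correct, 4), round(w_incorrect, 4))     # 0.0714 0.1667
print(round(7 * w_correct + 3 * w_incorrect, 4))      # 1.0 -> D_{t+1} is a distribution
```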
36. Now run the main training loop second time t=2
At the start of round t = 2 the sample weights are:
• D_{t=2}(i) = 0.0714 for each correctly classified sample;
• D_{t=2}(i) = 0.167 for each incorrectly classified sample.
37. Now run the main training loop second
time t=2, and then t=3
Final classifier by
combining three weak
classifiers
38. Combined classifier for t=1,2,3
Exercise: work out the remaining weights α_2 and α_3.
H(x) = sign( 0.424·h_{t=1}(x) + α_2·h_{t=2}(x) + α_3·h_{t=3}(x) ),
i.e. H(x) = sign( Σ_{t=1..T} α_t·h_t(x) ).
Combine the weak classifiers h_{t=1}( ), h_{t=2}( ), h_{t=3}( ) shown in the figure to form the classifier; one more step may be needed for the final classifier.
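A sketch of what the combined classifier for t = 1, 2, 3 looks like in code. The three stump functions and the values of α_2 and α_3 are placeholders (the slide leaves them as an exercise); only α_1 = 0.424 comes from the worked example.

```python
import numpy as np

def H(x, weak_classifiers, alphas):
    """Strong classifier: H(x) = sign(sum_t alpha_t * h_t(x))."""
    return int(np.sign(sum(a * h(x) for h, a in zip(weak_classifiers, alphas))))

# Placeholder stumps standing in for h_{t=1}, h_{t=2}, h_{t=3} of the figure
h1 = lambda x: 1 if x[0] < 0.0 else -1   # a vertical decision line
h2 = lambda x: 1 if x[1] < 1.0 else -1   # a horizontal decision line
h3 = lambda x: 1 if x[0] < 2.0 else -1   # another vertical decision line

alphas = [0.424, 0.65, 0.92]             # alpha_1 from the slides; alpha_2, alpha_3 made up
print(H((-0.48, 0.0), [h1, h2, h3], alphas))   # -> +1 with these placeholder stumps
print(H((1.5, 3.4), [h1, h2, h3], alphas))     # -> -1 with these placeholder stumps
```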
47. Example
f_i = Sum(r_{i,white}) − Sum(r_{i,black})
• The feature's value is calculated as the difference between the sums of the pixels within the white and black rectangle regions.
h_i(x) = 1 if f_i ≥ threshold_i, and −1 if f_i < threshold_i.
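A sketch of how such a rectangle feature could be computed and thresholded; the toy image, the rectangle positions, and the threshold are my own illustrative choices, and real detectors compute the sums with integral images.

```python
import numpy as np

def rect_sum(img, top, left, height, width):
    """Sum of the pixel values inside a rectangle of the image."""
    return img[top:top + height, left:left + width].sum()

def haar_feature_classifier(img, threshold):
    """f_i = Sum(white rectangle) - Sum(black rectangle); h_i = 1 if f_i >= threshold else -1."""
    white = rect_sum(img, top=0, left=0, height=4, width=2)   # illustrative left half
    black = rect_sum(img, top=0, left=2, height=4, width=2)   # illustrative right half
    f = white - black
    return 1 if f >= threshold else -1

img = np.array([[9, 9, 1, 1]] * 4)                     # toy 4x4 'image': bright left, dark right
print(haar_feature_classifier(img, threshold=10.0))    # f = 72 - 8 = 64 >= 10 -> 1
```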