The proposed method uses an online weighted ensemble of one-class SVMs for feature selection in background/foreground separation, automatically selecting the best features for different image regions. Multiple base classifiers are generated using weighted random subspaces; the best base classifiers are then selected and combined based on their error rates. Feature importance is computed adaptively from the classifier responses, and the background model is updated incrementally using a heuristic approach. Experimental results on the MSVS dataset show that the proposed method achieves higher precision, recall and F-score than the compared methods.
5. Introduction
Why is it interesting today?

Figure: Given a complex scene with different regions X1, X2, X3, X4 and X5, each region can be characterized by different features, such as texture, color, texture-color, motion and edge.

Common methods
The same feature is used globally for the whole scene.

Key challenges
It requires deep knowledge of the scene.
It is possible to automatically select the most relevant features.
6. Introduction
Feature Selection
Traditionally, feature selection methods can be categorized into three main groups:

Wrapper-based methods
• employ a classification algorithm as a "black box" for selecting a set of relevant features.

Embedded methods
• the feature selection is incorporated as part of the classification algorithm, such as in decision trees or neural networks.

Filter-based methods
• evaluate the relevance of the features based on a statistical measure estimated directly from the data.

Ensemble-based approaches for feature selection
8. Proposed Work
Figure: Illustration of a complex scene (left) and its feature importance over time. The bar-graph (right) shows the feature-importance variations (from 0 to 1) for a certain region of the scene over 200 frames, for the motion, gradient, color+texture, texture and color features. The color, texture, color-texture, gradient and motion features are represented by pink, blue, pale orange, green and dark orange, respectively.
We propose an online weighted ensemble of one-class SVMs (Support Vector Machines) for foreground-background separation. It automatically selects the best features for different regions of the image, and the more relevant features are used for foreground segmentation.
10. Proposed Work
An Online Weighted One-Class Ensemble for Feature Selection
A. Generate multiple base models
This approach increases the diversity of the base classifiers, since different weights for each random subspace lead to distinct decision boundaries computed by the classifiers.
Algorithm 1 Generate multiple base background models
1: Require: IWOC-SVM training procedure, training sequence X, subspace dimension p*, number of base classifiers M, weight distribution δ(x)
2: k ← 1
3: repeat
4:   Sk ← SelectRandomSubspace(X, p*)
5:   Train the k-th IWOC-SVM on Sk with respect to weights w ∼ δ(x)
6:   k ← k + 1
7: until k > M
8: Output: trained IWOC-SVM base classifiers Ψ = {Ψ1, Ψ2,...,ΨM}
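As a concrete illustration of Algorithm 1, here is a minimal Python sketch, assuming scikit-learn's OneClassSVM as a stand-in for the IWOC-SVM base learner (for which no off-the-shelf implementation exists) and a precomputed weight vector in place of w ∼ δ(x):

```python
import numpy as np
from sklearn.svm import OneClassSVM

def generate_base_models(X, p_star, M, weights, seed=None):
    """Algorithm 1 sketch: train M one-class models on weighted random subspaces.

    OneClassSVM is a stand-in for the paper's IWOC-SVM; `weights` plays
    the role of w ~ delta(x).
    """
    rng = np.random.default_rng(seed)
    models = []
    for _ in range(M):
        # Sk <- SelectRandomSubspace(X, p*)
        subspace = rng.choice(X.shape[1], size=p_star, replace=False)
        # Train the k-th base classifier with per-sample weights
        clf = OneClassSVM(kernel="rbf").fit(X[:, subspace], sample_weight=weights)
        models.append((clf, subspace))
    return models
```

Each model carries its own subspace indices, so the ensemble can later project an incoming pixel's feature vector onto the corresponding subspace.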
11. Proposed Work
How to Choose the Classifier?
Bicego and Figueiredo (2009) proposed a weighted version that allows the use of weights W = {w1,...,wN} in [0,1] for the data.
Figure: the WOC-SVM hypersphere, with center a and radius R; weighted slacks wiξi allow a target point xi to lie outside the sphere (or an outlier inside).
Weighted One-Class SVM (WOC-SVM)
Minimizing the hypersphere volume implies the minimization of R². To prevent the classifier from over-fitting with noisy data, slack variables ξi are introduced to allow some target points (respectively outliers) outside (respectively inside) the hypersphere. The problem is therefore to minimize:

Θ(a, R) = R² + C ∑_{i=1}^{N} wi ξi    (1)

where C is a user-defined parameter that controls the trade-off between the volume and the number of target points rejected. The larger C, the fewer outliers in the hypersphere.
The background subtraction (BS) task requires adjusting the learned model to scene variations over time.
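To make the role of the weights in Eq. (1) concrete, here is a minimal sketch assuming scikit-learn's OneClassSVM (a ν-parameterized relative of the C-formulation above), whose sample_weight argument scales the per-point slack penalties wiξi:

```python
from sklearn.svm import OneClassSVM

# Down-weighting a suspected noisy point makes it cheaper to reject,
# mirroring the role of w_i in Eq. (1). Data and weights are illustrative.
X = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [3.0, 3.0]]  # last point is noisy
w = [1.0, 1.0, 1.0, 0.1]
clf = OneClassSVM(kernel="rbf", gamma=0.5, nu=0.1).fit(X, sample_weight=w)
print(clf.predict([[0.05, 0.05], [3.0, 3.0]]))  # labels: +1 = target, -1 = outlier
```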
13. Proposed Work
An Incremental Weighted One-Class SVM (IWOC-SVM)
We propose an IWOC-SVM which is closely related to the procedure proposed by Tax and Laskov (2003).
Given new samples Z1 = {x1, x2,...,xs} and their respective weights, not yet learned by the IWOC-SVM, the Karush-Kuhn-Tucker (KKT) conditions are:

αi = 0 ⇒ ||xi − a||² < R²    (2)
0 < αi < C ⇒ ||xi − a||² = R²    (3)
αi = C ⇒ ||xi − a||² > R²    (4)
The mathematical model can be defined as:

R − θ ≤ ||x − a|| ≤ R    (5)

where θ ∈ [0, R] is relative to the distribution of the previous training set.
Algorithm 2 Incremental Weighted One-Class SVM
1: Require: previous training set Z0, newly added training set Z1 and its respective weights
2: Train the IWOC-SVM classifier on Z0, then split Z0 = SV0 ∪ NSV0
3: Input new samples Z1. Put the samples that violate the KKT conditions in Z1^V. If Z1^V = ∅, then go to 2.
4: Put the samples from NSV0 that satisfy Eq. (5) into NSV0^S
5: Set Z0 = SV0 ∪ NSV0^S ∪ Z1^V and train the IWOC-SVM classifier on Z0
6: Output: IWOC-SVM classifier Ω and the new training set Z0
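A minimal Python sketch of this update, under stated assumptions: scikit-learn's OneClassSVM stands in for the IWOC-SVM, a new sample counts as a KKT violator when it falls outside the learned boundary, and the Eq. (5) shell is approximated by a small band of width theta around the decision surface:

```python
import numpy as np
from sklearn.svm import OneClassSVM

def incremental_update(clf, Z0, w0, Z1, w1, theta=0.1):
    """Algorithm 2 sketch: approximate incremental retraining."""
    violates = clf.decision_function(Z1) < 0          # Z1^V: KKT-violating proxy
    if not violates.any():
        return clf, Z0, w0                            # nothing new to learn

    is_sv = np.zeros(len(Z0), dtype=bool)
    is_sv[clf.support_] = True                        # SV0
    near = np.abs(clf.decision_function(Z0)) < theta  # NSV0^S (Eq. 5 analogue)
    keep = is_sv | near

    Z_new = np.vstack([Z0[keep], Z1[violates]])       # SV0 ∪ NSV0^S ∪ Z1^V
    w_new = np.concatenate([w0[keep], w1[violates]])
    clf = OneClassSVM(kernel="rbf").fit(Z_new, sample_weight=w_new)
    return clf, Z_new, w_new
```

Retaining only support vectors plus a thin shell of non-support vectors is what keeps the retraining set, and hence the cost of each update, small.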
14. Proposed Work
B. Selection of the best base classifiers
Let λk^correct (respectively λk^wrong) be the number of times a pixel was correctly (respectively incorrectly) classified by the k-th (k = 1,...,M) base classifier on given ground-truth data. The corresponding error is then given by:

errork = λk^wrong / (λk^correct + λk^wrong)    (6)
Note that only the base classifiers that have the smallest errors are combined and used to differentiate the moving objects from the background model in the scene.
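For illustration, a small numpy sketch of Eq. (6) and the top-L selection; the array names are hypothetical:

```python
import numpy as np

def select_best_classifiers(correct, wrong, L):
    """Eq. (6) sketch: per-classifier error rate and top-L selection.

    `correct` and `wrong` are length-M arrays counting the pixels each
    base classifier labeled correctly/incorrectly against the ground truth.
    """
    error = wrong / (correct + wrong)   # Eq. (6)
    return np.argsort(error)[:L]        # indices of the L smallest-error models
```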
15. Algorithm 3 was adapted from the algorithm proposed by Wozniak (2013).
Algorithm 3 Adaptive Importance (AI) computation
1: Require: strong classifier H, validation set (t1, y1),...,(tN, yN) where ti ∈ T, yi ∈ Y = {0, 1} for background and foreground examples respectively, set of L best base classifiers Ψ = {Ψ1, Ψ2,...,ΨL}, learning-rate parameter γ
2: Initialize all L best classifiers with importance βl = 1
3: repeat
4:   Classify ti using the strong classifier H according to Eq. (9)
5:   for l = 1 : L do
6:     Check the response of Ψl and calculate its errorl according to Eq. (6)
7:     For each best classifier Ψl, update the importance βl(i) = βl(i − 1) + (Pa(Ψl) − Pa(Ψl(i − 1))) / (N + γ), where Pa(Ψl) = 1 − errorl according to Eq. (6)
8:   end for
9: until i > N
10: Normalize the importance β of each of the L best classifiers
11: Output: new importance assigned to the best classifiers β = {β1, β2,...,βL}
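A minimal numpy sketch of one pass of the importance update (line 7) and the final normalization, assuming the per-classifier errors of Eq. (6) are available as arrays:

```python
import numpy as np

def update_importance(beta, error_prev, error_new, N, gamma):
    """Algorithm 3 sketch: one adaptive-importance step for the L best models.

    Pa = 1 - error (Eq. 6); each importance moves by the change in accuracy,
    damped by (N + gamma), and the vector is renormalized at the end.
    """
    pa_prev = 1.0 - error_prev
    pa_new = 1.0 - error_new
    beta = beta + (pa_new - pa_prev) / (N + gamma)  # line 7 of Algorithm 3
    return beta / beta.sum()                        # normalization (line 10)
```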
16. Proposed Work
C. Background Detection
Given an incoming pixel x to be classified, one can define a support function associated with the target class ω for each of the L best base classifiers: ∀l = 1,...,L,

Fl(x, ω) = (1/s1) exp(−d(x, a)/s2)    (7)

where d(x, a) is a distance metric from x to the center a of the target class ω, s1 is a normalization factor and s2 is a scale parameter. Each Fl(x, ω) is then compared to a threshold t1 to obtain a positive or negative class label: ∀l = 1,...,L,

cl(x, ω) = 1 if Fl(x, ω) ≥ t1, and −1 otherwise    (8)

Comparing the weighted sum of these L class labels, as in (Tax and Duin, 2001), to another threshold t2 defines the strong classifier for x:

H(x) = 1 if (1/L) ∑_{l=1}^{L} βl cl(x, ω) ≥ t2, and 0 otherwise    (9)

A pixel x is classified as a background pixel if H(x) = 0.
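A compact sketch of Eqs. (7)-(9), assuming a Euclidean metric for d(x, a) and the centers a of the L best base classifiers collected in an array:

```python
import numpy as np

def strong_classifier(x, centers, beta, s1, s2, t1, t2):
    """Eqs. (7)-(9) sketch: support values, per-model votes, weighted vote.

    `centers` holds the target-class centers a of the L best base
    classifiers; Euclidean distance stands in for the generic d(x, a).
    """
    d = np.linalg.norm(np.asarray(centers) - x, axis=1)
    F = np.exp(-d / s2) / s1                  # Eq. (7)
    c = np.where(F >= t1, 1, -1)              # Eq. (8)
    H = int((beta * c).mean() >= t2)          # Eq. (9): (1/L) sum beta_l c_l
    return H                                  # H = 0 -> background pixel
```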
17. Proposed Work
D. Heuristic approach for Background Model Maintenance
The Small Votes Instance Selection (SVIS) heuristic, introduced by Guo and Boukir (2015), uses an unsupervised ensemble margin that combines the most voted class label c(1) and the second most voted class label c(2) under the learned model. Let vc(1) and vc(2) denote the corresponding relative numbers of votes. The margin, taking values in [0, 1], is then:

m(x) = (vc(1) − vc(2)) / L    (10)

where L represents the number of best base classifiers in the ensemble. The smallest-margin instances are selected as support vector candidates, and the strong model is updated with them. This procedure is presented in Algorithm 4.

This heuristic significantly reduces the complexity of the IWOC-SVM training task while maintaining the accuracy of the IWOC-SVM classification.
18. Proposed Work
Algorithm 4 Heuristic approach for model maintenance
1: Require: strong classifier H, test set Z = {z1, z2,...,zt}, weight distribution δ(z), user-defined parameter time, user-defined parameter η
2: i ← 1
3: repeat
4:   if H(zi) = 1 (background) then
5:     Compute the margin m(zi) by Eq. (10)
6:   end if
7:   if time is reached then
8:     Order all the test instances according to their margin values, in ascending order
9:     The η smallest-margin instances are selected as support vectors
10:    H(x) is updated using Z1 and its weights w ∼ δ(x)
11:  end if
12:  i ← i + 1
13: until i > t
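A small numpy sketch of Eq. (10) and the selection step of Algorithm 4, assuming per-model votes in {−1, +1} so that vc(1) and vc(2) reduce to the larger and smaller of the two vote counts:

```python
import numpy as np

def svis_candidates(votes, eta):
    """Eq. (10) / Algorithm 4 sketch: Small Votes Instance Selection.

    `votes` is an (n, L) array of per-model class votes in {-1, +1};
    the eta smallest-margin instances become support vector candidates.
    """
    n, L = votes.shape
    pos = (votes == 1).sum(axis=1)
    margin = np.abs(pos - (L - pos)) / L   # Eq. (10): |v_c(1) - v_c(2)| / L
    return np.argsort(margin)[:eta]        # eta smallest-margin instances
```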
20. Experimental Results
Dataset
MSVS dataset (Benezeth et al., 2014), which consists of a set of 5 video sequences, each containing 7 multispectral bands and a color (RGB) video sequence.

Parameter Settings
IWOC-SVM with an RBF (Radial Basis Function) kernel as base classifier, with C = 1.
The pool of classifiers was homogeneous and consisted of 10 base classifiers of the same type.
The classification threshold t1 was set to 0.9, and t2 to 0.5 for combining the best one-class classifiers.
Video sequences were resized to 160 × 120.
The random subspace dimension was p* = 5, drawn from the original p = 26-dimensional feature space.
Features: color (R, G, B, H, S, V and gray-scale), texture (XCS-LBP), color-texture (OC-LBP), edge (gradient orientation and magnitude), motion (optical flow) and multispectral bands (7 narrow spectral bands).
21. Experimental Results
MSVS dataset
Figure: Background subtraction results on the MSVS dataset – (a) original frame, (b) ground truth and (c) proposed method. True positive (TP) pixels are shown in white, true negative (TN) pixels in black, false positive (FP) pixels in red and false negative (FN) pixels in green.
22. Experimental Results
Table: Performance of the different methods using the MSVS dataset.

Videos     Method                                  Precision   Recall    F-score
Scene 01   MD (RGB) (Benezeth et al., 2014)        0.6536      0.6376    0.6536
           MD (MSB) (Benezeth et al., 2014)        0.7850      0.8377    0.8105
           Pooling (MSB) (Benezeth et al., 2014)   0.7475      0.8568    0.7984
           Proposed                                0.8500      0.9580    0.9008
Scene 02   MD (RGB) (Benezeth et al., 2014)        0.8346      0.9100    0.8707
           MD (MSB) (Benezeth et al., 2014)        0.8549      0.9281    0.8900
           Pooling (MSB) (Benezeth et al., 2014)   0.8639      0.8997    0.8815
           Proposed                                0.8277      0.8245    0.8727
Scene 03   MD (RGB) (Benezeth et al., 2014)        0.7494      0.5967    0.6644
           MD (MSB) (Benezeth et al., 2014)        0.7533      0.6332    0.6889
           Pooling (MSB) (Benezeth et al., 2014)   0.8809      0.5134    0.6487
           Proposed                                0.9056      0.9953    0.9483
Scene 04   MD (RGB) (Benezeth et al., 2014)        0.8402      0.7929    0.8158
           MD (MSB) (Benezeth et al., 2014)        0.8430      0.8226    0.8327
           Pooling (MSB) (Benezeth et al., 2014)   0.8146      0.8654    0.8392
           Proposed                                0.9534      0.8374    0.8997
Scene 05   MD (RGB) (Benezeth et al., 2014)        0.7359      0.7626    0.7490
           MD (MSB) (Benezeth et al., 2014)        0.7341      0.8149    0.7724
           Pooling (MSB) (Benezeth et al., 2014)   0.7373      0.8066    0.8066
           Proposed                                0.7316      0.8392    0.8400

*MD = Mahalanobis distance
27. Conclusion and Future Works
Conclusion
An incremental version of the WOC-SVM algorithm, called the Incremental Weighted One-Class Support Vector Machine (IWOC-SVM).
An online weighted version of random subspace (OW-RS) to increase the diversity of the classifier pool.
A mechanism called Adaptive Importance Calculation (AIC) to suitably update the relative importance of each feature over time.
A heuristic approach for IWOC-SVM model updating to improve speed.

Future Works
A superpixel segmentation strategy to improve the segmentation performance while increasing the computational efficiency of our ensemble.
An unsupervised mechanism to suitably update the importance of each feature, discarding insignificant features over time without ground-truth data.
29. References
Benezeth, Y., Sidibe, D., and Thomas, J. B. (2014). Background subtraction with multispectral video sequences. In IEEE International Conference on Robotics and Automation (ICRA).
Guo, L. and Boukir, S. (2015). Fast data selection for SVM training using ensemble margin. Pattern Recognition Letters (PRL), 51.
Tax, D. and Duin, R. (2001). Combining one-class classifiers. In Proceedings of the Second International Workshop on Multiple Classifier Systems (MCS), pages 299–308.
Tax, D. and Laskov, P. (2003). Online SVM learning: from classification to data description and back. In IEEE Workshop on Neural Networks for Signal Processing (NNSP), pages 499–508.
Wozniak, M. (2013). Hybrid Classifiers: Methods of Data, Knowledge, and Classifier Combination. Springer.