The document provides a course calendar for a class on Bayesian estimation methods. It lists the dates and topics to be covered over 15 class periods from September to January. The topics progress from basic concepts like Bayes estimation and the Kalman filter, to more modern methods like particle filters, hidden Markov models, Bayesian decision theory, and applications of principal component analysis and independent component analysis. Two dates (Nov. 8 and Jan. 22) are marked as having no class.
2012 mdsp pr13 support vector machine
1. Course Calendar (revised 2012 Dec. 27)
Class  Date           Contents
1      Sep. 26        Course information & Course overview
2      Oct. 4         Bayes Estimation
3      Oct. 11        Classical Bayes Estimation - Kalman Filter -
4      Oct. 18        Simulation-based Bayesian Methods
5      Oct. 25        Modern Bayesian Estimation: Particle Filter
6      Nov. 1         HMM (Hidden Markov Model)
-      Nov. 8         No Class
7      Nov. 15        Bayesian Decision
8      Nov. 29        Non-parametric Approaches
9      Dec. 6         PCA (Principal Component Analysis)
10     Dec. 13        ICA (Independent Component Analysis)
11     Dec. 20        Applications of PCA and ICA
12     Dec. 27        Clustering: k-means, Mixture Gaussian and EM
13     Jan. 17        Support Vector Machine
14     Jan. 22 (Tue)  No Class
2. Lecture Plan
Support Vector Machine
1. Linear Discriminative Machine
Perceptron Learning rule
2. Support Vector Machine
Problem setting, Optimization
3. Generalization of SVM
3.
1. Introduction
1.1 Classical Linear Discriminative Function
-Perceptron Machine-
Consider the two-category linear discriminative problem using a perceptron-type machine.
- Assumption
Two-category ($C_1$, $C_2$) training data in D-dimensional feature space are separable by a linear discriminative function of the form
$$f(x) = w^T x$$
which satisfies
$$f(x) \ge 0 \ \text{ for } x \in C_1, \qquad f(x) < 0 \ \text{ for } x \in C_2$$
where $w = (w_0, w_1, \dots, w_D)^T$ is the (D+1)-dim. weight vector and $x = (1, x_1, \dots, x_D)^T$.
Here, $f(x) = w^T x = 0$ gives the hyperplane surface which separates the two categories, and its normal vector is $w$.
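4.
Fig. 1 Perceptron: the inputs $x_0 = 1, x_1, \dots, x_D$ are multiplied by the weights $w_0, w_1, \dots, w_D$ and summed to give $f(x) = \sum_{i=0}^{D} w_i x_i$.
Fig. 2 Linear Discrimination: Class C1 and Class C2 in x-space, separated by the hyperplane f(x)=0.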
5.
1.2 Learning Rule of Perceptron (η=1 case)
- Reverse the training vectors of class $C_2$:
$$x^{(i)}_{new} = -x^{(i)} \quad \text{for } x^{(i)} \in C_2$$
- Initial weight vector: $w^{(0)}$
- For a new training dataset $x^{(1)}, x^{(2)}, \dots$:
$$w^{(i+1)} = w^{(i)} \quad \text{if } f(x^{(i)}) \ge 0$$
$$w^{(i+1)} = w^{(i)} + \eta\, x^{(i)} \quad \text{if } f(x^{(i)}) < 0$$
where $\eta$ determines the convergence speed of learning.
Fig. 3 Reversed data of class C2: Class C1 data together with the reversed (reflected) C2 data.
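The learning rule on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the lecturer's code: the function names `perceptron_train` and `classify`, the toy data, and the stopping rule (stop after a full pass with no update) are all assumptions. The update is triggered on f <= 0 rather than a strict f < 0, also an assumption, so that training can start from an all-zero weight vector.

```python
def perceptron_train(samples, labels, eta=1.0, max_epochs=100):
    """Train a perceptron with the reversed-data rule.

    samples: list of D-dimensional feature lists
    labels:  1 for class C1, 2 for class C2
    Returns the (D+1)-dimensional weight vector [w0, w1, ..., wD].
    """
    # Augment each vector with x0 = 1 and reverse (negate) the class C2 vectors.
    data = []
    for x, c in zip(samples, labels):
        aug = [1.0] + [float(v) for v in x]
        data.append(aug if c == 1 else [-v for v in aug])

    w = [0.0] * len(data[0])  # initial weight vector w(0)
    for _ in range(max_epochs):
        updated = False
        for x in data:
            f = sum(wi * xi for wi, xi in zip(w, x))
            # Assumption: f == 0 also counts as an error (see lead-in).
            if f <= 0:
                w = [wi + eta * xi for wi, xi in zip(w, x)]
                updated = True
        if not updated:  # every reversed vector now satisfies f(x) > 0
            break
    return w


def classify(w, x):
    """Return 1 (class C1) if f(x) = w^T x >= 0, else 2 (class C2)."""
    f = w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))
    return 1 if f >= 0 else 2


# Hypothetical, linearly separable toy data.
samples = [[2.0, 2.0], [3.0, 1.0], [-1.0, -2.0], [-2.0, -1.0]]
labels = [1, 1, 2, 2]
w = perceptron_train(samples, labels)
predictions = [classify(w, x) for x in samples]
print(w, predictions)
```

Because the C2 vectors are negated up front, a single test (is f positive?) covers both classes, which is the point of the reversal step.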
8.
2. Support Vector Machine (SVM)
2.1 Problem Setting
Given a linearly separable two-category ($C_1$, $C_2$) training dataset with class labels
$$\{(x_i, t_i)\}, \quad i = 1, \dots, N$$
where $x_i$: D-dimensional feature vector
$t_i \in \{-1, 1\}$: "1" for C1, and "-1" for C2
Find a separating hyperplane H
$$f(x) = w^T x + b = 0$$
- Among the set of possible hyperplanes, we seek the one which is farthest from all training sample vectors.
- The obtained discriminant hyperplane will give better generalization capability. (*)
(*) It is expected to perform well on test data that are not part of the training data.
9.
Motivation of SVM
The optimal discriminative hyperplane should have the largest margin, which is defined as the minimum distance of the training vectors to the separation surface.
Fig. 5 Margin: Class C1 and Class C2 on either side of the hyperplane, with the margin between them.
10.
2.2 Optimization problem
The distance between a hyperplane
$$w^T x + b = 0 \qquad (1)$$
and a sample point $x_i$ is given by
$$\frac{|w^T x_i + b|}{\|w\|} \qquad (2)$$
(see Appendix).
Since the scalar ($k$) multiple $(kw, kb)$ and the pair $(w, b)$ give the same hyperplane, we choose the optimal hyperplane which is given by the discriminative function
$$|w^T x_i + b| = 1 \qquad (3)$$
where $x_i$ in (3) is the closest vector to the separation surface. (Canonical hyperplane)
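The distance formula (2) and the canonical scaling (3) can be checked numerically. The sketch below is illustrative only: the helper names `distance_to_hyperplane` and `canonical_form`, the sample points, and the starting hyperplane are assumptions, and it presumes that no point lies exactly on the hyperplane (otherwise the scale factor would be zero).

```python
import math


def distance_to_hyperplane(w, b, x):
    """Distance |w^T x + b| / ||w|| from point x to the hyperplane w^T x + b = 0."""
    f = sum(wi * xi for wi, xi in zip(w, x)) + b
    return abs(f) / math.sqrt(sum(wi * wi for wi in w))


def canonical_form(w, b, points):
    """Rescale (w, b) so that min_i |w^T x_i + b| = 1 over the given points."""
    k = min(abs(sum(wi * xi for wi, xi in zip(w, x)) + b) for x in points)
    return [wi / k for wi in w], b / k


# Hypothetical sample points and hyperplane 2*x1 + 2*x2 - 2 = 0.
points = [[3.0, 0.0], [0.0, 4.0], [-1.0, 0.0]]
w, b = [2.0, 2.0], -2.0

wc, bc = canonical_form(w, b, points)
d_before = [distance_to_hyperplane(w, b, x) for x in points]
d_after = [distance_to_hyperplane(wc, bc, x) for x in points]
print(wc, bc)
print(d_before)
print(d_after)
```

Rescaling changes $(w, b)$ but not the hyperplane, so the distances before and after are identical; after rescaling, the closest point sits at distance $1/\|w\|$.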
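11.
Appendix
Fig. 6: the sample point $x_i$, its projection $x_q$ onto the direction of $w$, the point $x_p$ where that direction meets the hyperplane $w^T x + b = 0$, and the distance $|b| / \|w\|$ of the hyperplane from the origin.
$$x_p = -\frac{b}{\|w\|^2}\, w, \qquad w^T x_p + b = -\frac{b}{\|w\|^2}\, w^T w + b = 0$$
$$x_q = \frac{w^T x_i}{\|w\|^2}\, w \quad \left( = \frac{w^T x_i}{\|w\|} \frac{w}{\|w\|} \right)$$
$$\|x_q - x_p\| = \left\| \frac{w^T x_i + b}{\|w\|^2}\, w \right\| = \frac{|w^T x_i + b|}{\|w\|} \quad \text{: distance between } x_i \text{ and the hyperplane}$$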
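- The distance (2) from the closest training vector to the decision
surface is
$\dfrac{|w^T x_i + b|}{\|w\|} = \dfrac{1}{\|w\|}$    (4)
- The margin is $\dfrac{2}{\|w\|}$
- If $t_i = 1$ ($C_1$) then $w^T x_i + b \ge 1$
  If $t_i = -1$ ($C_2$) then $w^T x_i + b \le -1$
  therefore
  $t_i (w^T x_i + b) \ge 1$    (5)
Fig. 7 Margin and distance: the distance $|w^T x_i + b| / \|w\|$ from a point to the hyperplane, and the margin $2 / \|w\|$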
Optimization problem - Maximization of the margin -
Minimize $J(w) = \dfrac{1}{2} \|w\|^2 = \dfrac{1}{2} w^T w$    (6)
Subject to $t_i (w^T x_i + b) \ge 1 \quad (i = 1 \sim N)$    (7)
Since $J(w)$ is a quadratic function with respect to $w$, there exists
a (unique) global minimum.
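To make the problem (6), (7) concrete, the following sketch compares two candidate hyperplanes on a toy data set: both satisfy the constraints (7), and the one with the smaller $J(w)$ has the larger margin. The data, the two candidates "A" and "B" and the helper names are assumptions chosen for illustration; no optimizer is involved.

```python
import math


def objective(w):
    """J(w) = (1/2) w^T w of Eq. (6)."""
    return 0.5 * sum(wi * wi for wi in w)


def is_feasible(w, b, X, t):
    """Check the constraints t_i (w^T x_i + b) >= 1 of Eq. (7)."""
    return all(
        ti * (sum(wi * xi for wi, xi in zip(w, x)) + b) >= 1.0 - 1e-12
        for x, ti in zip(X, t)
    )


def margin(w):
    """Margin 2 / ||w|| of a canonical hyperplane."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))


# Assumed toy data: t = +1 for C1, t = -1 for C2.
X = [[2.0, 2.0], [3.0, 3.0], [0.0, 0.0], [-1.0, 0.0]]
t = [1, 1, -1, -1]

# Two hypothetical candidates (w, b), both in canonical form.
candidates = {
    "A": ([0.5, 0.5], -1.0),   # hyperplane x_1 + x_2 = 2
    "B": ([1.0, 0.0], -1.0),   # hyperplane x_1 = 1
}
feasible = {k: wb for k, wb in candidates.items() if is_feasible(wb[0], wb[1], X, t)}
best = min(feasible, key=lambda k: objective(feasible[k][0]))
```

Here both candidates are feasible, and minimizing $J(w)$ selects "A", whose margin $2/\|w\|$ is larger than that of "B".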
2.3 Lagrangian multiplier approach - general theory -
Minimize $J(z)$, $z \in \Omega$ (convex space)
Subject to $g_i(z) \le 0 \quad (i = 1 \sim k)$
Kuhn - Tucker Theorem
The necessary and sufficient conditions for a point $z^*$ to be
an optimum are the existence of $\alpha^*$ such that the Lagrangian function
$L(z, \alpha) := J(z) + \sum_{i=1}^{k} \alpha_i g_i(z)$    (8)
satisfies
(i) $\dfrac{\partial L(z^*, \alpha^*)}{\partial z} = 0$    (9)
(ii) $\alpha_i^* g_i(z^*) = 0 \quad (i = 1, \dots, k)$    (10)
(iii) $g_i(z^*) \le 0$    (11)
(iv) $\alpha_i^* \ge 0$    (12)
(optimization conditions)
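The four conditions (i) to (iv) can be checked on a one-dimensional problem. The example below is an assumption made for illustration (it is not from the lecture): minimize $J(z) = z^2$ subject to $g(z) = 1 - z \le 0$, whose optimum is $z^* = 1$ with multiplier $\alpha^* = 2$.

```python
def J(z):
    return z * z


def g(z):
    return 1.0 - z             # constraint g(z) <= 0, i.e. z >= 1


def dL_dz(z, alpha):
    # L(z, alpha) = J(z) + alpha * g(z)  =>  dL/dz = 2 z - alpha
    return 2.0 * z - alpha


def kkt_satisfied(z, alpha, tol=1e-12):
    return (
        abs(dL_dz(z, alpha)) <= tol        # (i)   stationarity
        and abs(alpha * g(z)) <= tol       # (ii)  complementary condition
        and g(z) <= tol                    # (iii) feasibility
        and alpha >= -tol                  # (iv)  non-negative multiplier
    )


optimum_ok = kkt_satisfied(1.0, 2.0)        # constrained minimum z* = 1
unconstrained_ok = kkt_satisfied(0.0, 0.0)  # z = 0 violates g(z) <= 0
interior_ok = kkt_satisfied(2.0, 4.0)       # feasible, but (ii) fails
```

Only the pair $(z, \alpha) = (1, 2)$ passes all four checks; the constraint is active there ($g(z^*) = 0$) and its multiplier is positive.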
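- The second condition (10), called the Karush-Kuhn-Tucker (KKT)
condition or complementary condition, implies the following facts:
$g_i(z^*) = 0$ for active constraints if $\alpha_i^* > 0$,
and $\alpha_i^* = 0$ for inactive constraints ($g_i(z^*) < 0$).
2.4 Dual Problem
Apply K-T theorem to Eq. (6) (7)
- Lagrangian
$L_p(w, b, \alpha) := \dfrac{1}{2} w^T w - \sum_{i=1}^{N} \alpha_i \{ t_i (w^T x_i + b) - 1 \}$    (13)
- Condition (i), by substituting $z = (w, b)$, gives
$\dfrac{\partial L_p(w, b, \alpha)}{\partial w} = 0 \ \Rightarrow \ w = \sum_{i=1}^{N} \alpha_i t_i x_i$    (14)
$\dfrac{\partial L_p(w, b, \alpha)}{\partial b} = 0 \ \Rightarrow \ \sum_{i=1}^{N} \alpha_i t_i = 0$    (15)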
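$L_p(w, b, \alpha) = \underbrace{\dfrac{1}{2} w^T w}_{I} - \underbrace{\sum_{i=1}^{N} \alpha_i t_i w^T x_i}_{K} - b \underbrace{\sum_{i=1}^{N} \alpha_i t_i}_{= 0} + \sum_{i=1}^{N} \alpha_i$
$I = \dfrac{1}{2} w^T w = \dfrac{1}{2} w^T \sum_{i=1}^{N} \alpha_i t_i x_i = \dfrac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j t_i t_j x_i^T x_j$
$K = \sum_{i=1}^{N} \alpha_i t_i w^T x_i = \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j t_i t_j x_i^T x_j$
$L(\alpha) \ (:= L_p(w, b, \alpha)) = \sum_{i=1}^{N} \alpha_i - \dfrac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j t_i t_j x_i^T x_j$    (16)
is maximized subject to
$\sum_{i=1}^{N} \alpha_i t_i = 0$ and $\alpha_i \ge 0$    (17)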
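The dual problem (16), (17) can be solved numerically on a small data set. The sketch below uses a simplified pairwise update in the spirit of SMO (sequential minimal optimization), which is a swapped-in technique chosen for brevity and not the method of the lecture; the function names, the toy data and the stopping parameters `sweeps` and `tol` are assumptions.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def solve_dual(X, t, sweeps=100, tol=1e-10):
    """Maximise L(alpha) of Eq. (16) subject to Eq. (17) by pairwise updates."""
    n = len(X)
    alpha = [0.0] * n          # feasible start: sum_i alpha_i t_i = 0
    K = [[dot(X[i], X[j]) for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        largest_change = 0.0
        for i in range(n):
            for j in range(i + 1, n):
                # u_k = sum_m alpha_m t_m x_m^T x_k - t_k
                # (the bias b cancels in the difference u_i - u_j)
                u_i = sum(alpha[m] * t[m] * K[m][i] for m in range(n)) - t[i]
                u_j = sum(alpha[m] * t[m] * K[m][j] for m in range(n)) - t[j]
                eta = K[i][i] + K[j][j] - 2.0 * K[i][j]   # = ||x_i - x_j||^2
                if eta <= 1e-12:
                    continue
                new_j = alpha[j] + t[j] * (u_i - u_j) / eta
                # Keep alpha_i, alpha_j >= 0 while sum_i alpha_i t_i stays 0.
                if t[i] == t[j]:
                    new_j = min(max(new_j, 0.0), alpha[i] + alpha[j])
                else:
                    new_j = max(new_j, 0.0, alpha[j] - alpha[i])
                new_i = alpha[i] + t[i] * t[j] * (alpha[j] - new_j)
                largest_change = max(largest_change, abs(new_j - alpha[j]))
                alpha[i], alpha[j] = new_i, new_j
        if largest_change < tol:
            break
    return alpha


def dual_objective(alpha, X, t):
    """L(alpha) of Eq. (16)."""
    n = len(X)
    quad = sum(alpha[i] * alpha[j] * t[i] * t[j] * dot(X[i], X[j])
               for i in range(n) for j in range(n))
    return sum(alpha) - 0.5 * quad


# Assumed toy data: three points per class, linearly separable.
X = [[2.0, 2.0], [3.0, 3.0], [2.0, 3.0], [0.0, 0.0], [-1.0, 0.0], [0.0, -1.0]]
t = [1, 1, 1, -1, -1, -1]
alpha = solve_dual(X, t)
```

For this data only the two closest points of opposite class, (2, 2) and (0, 0), end up with a nonzero multiplier; every other $\alpha_i$ stays at zero. Note that the code touches the data only through the inner products $x_i^T x_j$.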
- Dual problem is easier to solve because $L(\alpha)$ depends only
on $\alpha_i$, not on $w$, $b$
- $L(\alpha)$ contains training data $x_i$ as the inner product form $x_i^T x_j$
- Geometric interpretation of KKT condition (ii) or Eq.(10)
$\alpha_i \{ t_i (w^T x_i + b) - 1 \} = 0 \quad (i = 1 \sim N)$    (18)
means that either $\alpha_i = 0$ or $t_i (w^T x_i + b) = 1$ must hold.
$x_j$ for some $\alpha_j > 0$ must lie on one of the hyperplanes $w^T x + b = \pm 1$;
namely, $x_j$ with an active constraint provides the largest margin.
(Such $x_j$ is called a support vector, see Fig. 8)
At all other points $\alpha_i = 0$ (inactive constraint points)
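- Only the support vectors contribute to determine hyperplane
because of
$w = \sum_{i=1}^{N} \alpha_i t_i x_i = \sum_{\alpha_i > 0} \alpha_i t_i x_i$    (19)
- The KKT condition is used to determine the bias $b$.
Fig. 8 KKT conditions: support vectors ($\alpha_i > 0$) lie on the margin; all other points are inactive constraint points ($\alpha_i = 0$)
- Hyperplane: $\sum_{\alpha_i > 0} \alpha_i t_i x_i^T x + b = 0$    (20)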
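Given the multipliers, Eq. (19) and the KKT condition (18) recover $w$ and $b$. In the sketch below the data and the values in `alpha` are assumptions for illustration (they are the dual solution for this particular toy set); averaging $b$ over the support vectors is a common numerical choice rather than something the slides prescribe.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


# Assumed toy data and multipliers (only two points have alpha_i > 0).
X = [[2.0, 2.0], [3.0, 3.0], [2.0, 3.0], [0.0, 0.0], [-1.0, 0.0], [0.0, -1.0]]
t = [1, 1, 1, -1, -1, -1]
alpha = [0.25, 0.0, 0.0, 0.25, 0.0, 0.0]

support = [i for i, a in enumerate(alpha) if a > 1e-12]

# Eq. (19): w is a weighted sum of the support vectors only.
w = [sum(alpha[i] * t[i] * X[i][d] for i in support) for d in range(len(X[0]))]

# KKT condition (18): t_s (w^T x_s + b) = 1 for a support vector x_s,
# so b = t_s - w^T x_s (using t_s^2 = 1); average over the support vectors.
b = sum(t[s] - dot(w, X[s]) for s in support) / len(support)


def f(x):
    """Left-hand side of Eq. (20), written in terms of the support vectors."""
    return sum(alpha[i] * t[i] * dot(X[i], x) for i in support) + b


predictions = [1 if f(x) > 0 else -1 for x in X]
```

The support vectors satisfy $f(x) = \pm 1$ exactly, and the remaining points lie strictly outside the margin.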
3. Generalization of SVM
3.1 Non-separable case
- Introduce slack variables $\xi_i$ in order to relax the constraint (7) as
follows:
$t_i (w^T x_i + b) \ge 1 - \xi_i$
For $\xi_i = 0$, the data point is correctly classified and lies on or outside the margin.
For $0 < \xi_i \le 1$, the data point is still on the correct side (on the separating surface when $\xi_i = 1$) but falls within the region of
the margin.
For $\xi_i > 1$, the data point falls on the wrong side of the separating surface.
Define the slack variable
$\xi_i := \mathrm{ramp}\{1 - t_i (w^T x_i + b)\}$
where $\mathrm{ramp}\{u\} = u$ for $u > 0$ and $= 0$ for $u \le 0$.
New Optimization Problem:
Minimize $L_p(w, \xi) := \dfrac{1}{2} w^T w + C \sum_{i=1}^{N} \xi_i$    (21)
subject to $t_i (w^T x_i + b) - 1 + \xi_i \ge 0$, $\xi_i \ge 0 \quad (i = 1 \sim N)$    (22)
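The slack variables and the objective (21) are easy to evaluate for a given hyperplane. The sketch below assumes a hypothetical hyperplane $x_1 = 0$, four illustrative points and $C = 1$; it only evaluates the quantities and does not solve the optimization problem.

```python
def ramp(u):
    return u if u > 0 else 0.0


def slack_variables(w, b, X, t):
    """xi_i = ramp{1 - t_i (w^T x_i + b)}."""
    return [ramp(1.0 - ti * (sum(wi * xi for wi, xi in zip(w, x)) + b))
            for x, ti in zip(X, t)]


def soft_margin_objective(w, b, X, t, C):
    """(1/2) w^T w + C * sum_i xi_i of Eq. (21)."""
    xi = slack_variables(w, b, X, t)
    return 0.5 * sum(wi * wi for wi in w) + C * sum(xi)


w, b = [1.0, 0.0], 0.0                  # hypothetical hyperplane x_1 = 0
X = [[2.0, 0.0], [0.5, 0.0], [-0.5, 0.0], [-1.0, 0.0]]
t = [1, 1, 1, -1]
xi = slack_variables(w, b, X, t)        # [0.0, 0.5, 1.5, 0.0]
cost = soft_margin_objective(w, b, X, t, C=1.0)   # 0.5 + 1.0 * 2.0 = 2.5
```

The second point lies inside the margin ($0 < \xi_i \le 1$) and the third is on the wrong side ($\xi_i > 1$); a larger $C$ penalizes such points more heavily relative to the margin term.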
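Fig. 9 Non-separable case and slack variable: points of the classes $t_i = 1$ and $t_i = -1$ around the optimum hyperplane $w^T x + b = 0$ and the margin boundaries $w^T x + b = \pm 1$, annotated with their slack values (such as $\xi_i = 0$, $0.5$ and $1.5$); the support vectors are marked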
3.2 Nonlinear SVM
- For separation by a nonlinear discriminative surface,
a nonlinear mapping approach is useful.
- Cover’s theorem: A complex pattern classification problem cast in a
high-dimensional space non-linearly is more likely to be linearly
separable than in a low-dimensional space.
x → nonlinear mapping $\phi(x)$ → $z = \phi(x)$ (higher dimension) → SVM
Fig. 10 Nonlinear mapping $z = \phi(x)$ from x-space to z-space
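A small XOR-like example illustrates the idea of the nonlinear mapping. The mapping `phi` and the weight vector below are hypothetical choices made for this sketch: the four points cannot be separated by a straight line in x-space, but after adding the product feature $x_1 x_2$ a hyperplane in z-space separates them.

```python
def phi(x):
    """Hypothetical mapping from 2-D x-space to 4-D z-space."""
    x1, x2 = x
    return [1.0, x1, x2, x1 * x2]


# XOR-like data: no straight line in x-space separates the two classes.
X = [[1.0, 1.0], [-1.0, -1.0], [1.0, -1.0], [-1.0, 1.0]]
t = [1, 1, -1, -1]

# Hyperplane in z-space that uses only the product feature
# (the first component of phi plays the role of the bias input).
w = [0.0, 0.0, 0.0, 1.0]


def f(x):
    return sum(wi * zi for wi, zi in zip(w, phi(x)))


# t_i * f(x_i) >= 1 for every point: linearly separable in z-space.
margins = [ti * f(x) for x, ti in zip(X, t)]
```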
$\phi(x) = (\phi_0(x) = 1, \phi_1(x), \dots, \phi_M(x))^T \quad (M > D)$
- Hyperplane in $\phi$-space: $w^T \phi(x) = 0$    (23)
- SVM in $\phi$-space gives an optimum hyperplane with the form
$w = \sum_{\alpha_i > 0} \alpha_i t_i \phi(x_i)$ (sum of support vectors in $\phi$-space)    (24)
- Discriminative function:
$w^T \phi(x) = \sum_{\alpha_i > 0} \alpha_i t_i \phi(x_i)^T \phi(x)$ (inner product in M-d space)    (25)
- If we can choose $\phi$ which satisfies
$\phi(x_i)^T \phi(x_j) = K(x_i, x_j)$ (inner product in $\phi$-domain = kernel function in $x$-domain),
the computational cost will be drastically reduced.
Ex) Polynomial kernel
$K(u, v) = (u^T v + 1)^2 = \phi(u)^T \phi(v)$
where $\phi(u) = (1, u_1^2, \sqrt{2} u_1 u_2, u_2^2, \sqrt{2} u_1, \sqrt{2} u_2)^T$
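The polynomial kernel identity can be verified numerically for two-dimensional inputs. The vectors `u` and `v` below are arbitrary values assumed for the check; the point is that the kernel needs one 2-D inner product, while the explicit mapping needs a 6-D one.

```python
import math


def poly_kernel(u, v):
    """K(u, v) = (u^T v + 1)^2."""
    return (sum(a * b for a, b in zip(u, v)) + 1.0) ** 2


def phi(u):
    """Explicit 6-D feature map of the polynomial kernel for 2-D input."""
    u1, u2 = u
    r2 = math.sqrt(2.0)
    return [1.0, u1 * u1, r2 * u1 * u2, u2 * u2, r2 * u1, r2 * u2]


u, v = [1.0, 2.0], [3.0, 4.0]
k_direct = poly_kernel(u, v)                             # one 2-D inner product
k_mapped = sum(a * b for a, b in zip(phi(u), phi(v)))    # 6-D inner product
```

Both values agree up to floating-point rounding, since $(u^T v + 1)^2 = (11 + 1)^2 = 144$ for these inputs.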
Ex) Nonlinear SVM result obtained with a Gaussian kernel
Fig. 11 Nonlinear SVM result with the support vectors marked (from Bishop [1])
References:
[1] C. M. Bishop, “Pattern Recognition and Machine Learning”,
Springer, 2006
[2] R.O. Duda, P.E. Hart, and D. G. Stork, “Pattern Classification”,
John Wiley & Sons, 2nd edition, 2004
[3] Y. Hirai, “Hajimete no Pattern Ninshiki” (A First Course in Pattern Recognition), Morikita Publishing, 2012 (in Japanese)