This document summarizes two state-of-the-art clustering techniques: Support Vector Clustering (SVC) and Bregman Co-clustering. SVC is a two-phase process: 1) determining cluster boundaries via a minimum enclosing ball in feature space, and 2) assigning cluster labels by finding connected components in a graph. Bregman Co-clustering aims to robustly handle missing or sparse data, high dimensionality, noise, and outliers. The document discusses applications and desirable properties of these methods, such as handling nonlinearly separable data and automatically detecting the number of clusters.
State-of-the-art Clustering Techniques: Support Vector Methods and Minimum Bregman Information Principle
1. UNIVERSITÀ degli STUDI di NAPOLI FEDERICO II
State-of-the-art Clustering Techniques: Support Vector Methods and Minimum Bregman Information Principle
by VINCENZO RUSSO
Supervisor: prof. Anna CORAZZA
Co-supervisor: prof. Ezio CATANZARITI
VINCENZO RUSSO STATE-OF-THE-ART CLUSTERING TECHNIQUES: SUPPORT VECTOR METHODS AND MINIMUM BREGMAN INFORMATION PRINCIPLE
2. UNIVERSITÀ degli STUDI di NAPOLI FEDERICO II Introduction
What is clustering?
Clustering is unsupervised learning: it groups a set of non-structured objects into subsets called clusters.
The objects are represented as points in a subspace of R^d, where d is the number of point components, also called attributes or features.
[Figure: a 3-cluster structure]
Several application domains: information retrieval, bioinformatics, cheminformatics, image retrieval, astrophysics, market segmentation, etc.
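The grouping described on this slide — points in a subspace of R^d partitioned into clusters — can be illustrated with a minimal k-means pass. This is purely illustrative (k-means is not one of the two methods studied in this work); the sketch uses only the Python standard library and a deterministic farthest-first seeding so that the three well-separated groups of the toy data, mimicking the slide's 3-cluster figure, are recovered.

```python
import math

def farthest_first(points, k):
    """Deterministic seeding: repeatedly pick the point farthest from the chosen centers."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    return centers

def kmeans(points, k, iters=20):
    """Lloyd's algorithm: alternate nearest-center assignment and center updates."""
    centers = farthest_first(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # assignment step: each point joins its nearest center
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points]
        # update step: move each center to the mean of its members
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels

# three well-separated groups in R^2, mimicking the slide's 3-cluster figure
data = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2),
        (5.0, 5.1), (5.2, 5.0), (5.1, 5.2),
        (0.0, 5.0), (0.2, 5.1), (0.1, 5.2)]
labels = kmeans(data, k=3)
```

Unlike the methods in the following slides, k-means needs the number of clusters k up front and only finds linearly separable, roughly spherical groups — exactly the limitations the SVC approach addresses.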
3. UNIVERSITÀ degli STUDI di NAPOLI FEDERICO II Goals
Two state-of-the-art approaches:
- Support Vector Clustering (SVC)
- Bregman Co-clustering

Goals | Application domain
Robustness w.r.t. missing-valued data | Astrophysics
Robustness w.r.t. sparse data | Textual documents
Robustness w.r.t. high "dimensionality" | Textual documents
Robustness w.r.t. noise/outliers | Synthetic data

Other desirable properties:
- Handling of nonlinearly separable problems
- Automatic detection of the number of clusters
- Application-domain independence
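The second approach listed above is built on Bregman divergences, the distortion family behind the Minimum Bregman Information principle. As a reminder of what that family contains (standard definitions, not code from the thesis): a single formula d_φ(x, y) = φ(x) − φ(y) − ⟨∇φ(y), x − y⟩ recovers the squared Euclidean distance for φ(x) = ‖x‖² and the KL divergence for φ(x) = Σᵢ xᵢ log xᵢ.

```python
import math

def bregman(phi, grad_phi, x, y):
    """Bregman divergence d_phi(x, y) = phi(x) - phi(y) - <grad phi(y), x - y>."""
    return phi(x) - phi(y) - sum(g * (u - v) for g, u, v in zip(grad_phi(y), x, y))

# phi(x) = ||x||^2  ->  squared Euclidean distance
sq = lambda x: sum(u * u for u in x)
grad_sq = lambda x: [2 * u for u in x]

# phi(x) = sum_i x_i log x_i  ->  KL divergence (for probability vectors)
ent = lambda x: sum(u * math.log(u) for u in x)
grad_ent = lambda x: [math.log(u) + 1 for u in x]

x, y = (1.0, 2.0), (3.0, 5.0)
d_euc = bregman(sq, grad_sq, x, y)                  # equals ||x - y||^2 = 13.0
d_kl = bregman(ent, grad_ent, (0.2, 0.8), (0.5, 0.5))  # equals KL((0.2,0.8) || (0.5,0.5))
```

Each choice of the convex function φ yields a different divergence, which is what lets one co-clustering framework cover both Euclidean and information-theoretic settings.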
4. UNIVERSITÀ degli STUDI di NAPOLI FEDERICO II
Support Vector Clustering

Support Vector Clustering: the idea

Let X = {x1, x2, ..., xn} be a dataset of n points, with X ⊆ R^d the data space. We use a nonlinear transformation φ : X → F from the input space X to some high-dimensional feature space F, wherein we look for the smallest enclosing sphere of radius R, i.e. the Minimum Enclosing Ball (MEB). This is formalized as follows.

The MEB was originally proposed in the context of the Vapnik-Chervonenkis (VC) dimension (Vapnik, 1995). Later it was used for estimating the support of a high-dimensional distribution (Schölkopf et al., 2000b). Finally, the MEB was used for the Support Vector Domain Description (SVDD), an SVM formulation for the one-class classification (Tax, 2001; Tax and Duin, 1999a,b, 2004). The SVDD is the basic step of the SVC and allows describing the boundaries of clusters.

Detecting the cluster boundaries, though, is not enough. The whole process, called Support Vector Clustering (SVC), was firstly presented by Ben-Hur et al. (2001). The first step of the SVC is the cluster description: the enclosing sphere is computed in the feature space and mapped back to the data space, where it splits into a set of contours which are the cluster boundaries. A second stage, the Cluster Labeling, is then needed: it determines the membership of points, i.e. it actually does a cluster assignment.

In the following subsections we provide an overview of the Support Vector Clustering.

Notes. The name "cluster labeling" probably descends from the originally proposed algorithm, which is based on finding the connected components of a graph: the algorithms for finding the connected components usually assign the "component labels" to the vertices. An alternative SVM formulation for the same task, called One-Class SVM, can be found in Schölkopf et al. (2000b) (see Appendix A).
5. Support Vector Clustering
Phase I: Cluster description

Finding the Minimum Enclosing Ball (MEB). The nonlinear Support Vector Domain Description (SVDD) yields the problem

    min_{R,a,ξ} R² + C Σ_{k=1}^n ξk                                        (6.2)
    subject to ||φ(xk) − a||² ≤ R² + ξk,  ξk ≥ 0,  k = 1, 2, ..., n

where a is the center of the sphere, q is the kernel width and C is the soft margin. Soft constraints are incorporated by adding the slack variables ξk; the real constant C provides a way to control outliers. To solve this problem we introduce the Lagrangian

    L(R, a, ξ; β, µ) = R² − Σ_k (R² + ξk − ||φ(xk) − a||²) βk − Σ_k ξk µk + C Σ_k ξk    (6.3)

with Lagrangian multipliers βk ≥ 0 and µk ≥ 0 for k = 1, 2, ..., n.

6.1.1 Valid Mercer kernels

The kernel function K(·,·) defines an explicit mapping if φ is known, otherwise the mapping is said to be implicit. In either case we can implicitly perform an inner product in the feature space F by means of a kernel K. Using nonlinear kernel transformations, we have a chance to transform a nonlinearly separable problem in the data space X into a separable one in the feature space F (Figure 2.3: a nonlinearly separable problem in the data space X becomes linearly separable in the feature space F).

There are several functions which are known to satisfy Mercer's condition. Some of them are:
- Linear kernel: K(x, y) = ⟨x, y⟩
- Polynomial kernel: K(x, y) = (⟨x, y⟩ + r)^k, r ≥ 0
- Gaussian kernel: K(x, y) = e^{−q ||x−y||²}, q > 0
- Exponential kernel: K(x, y) = e^{−q ||x−y||}, q > 0

In polynomial kernels the parameter k is the degree. In the exponential kernels (and others) the parameter q is called the kernel width; it has a different mathematical meaning depending on the kernel: in the Gaussian kernel it is a function of the variance, q = 1/(2σ²). Kernel width is a general term to indicate the scale at which the data is analyzed.

Definition 6.1 (Squared Feature Space Distance). Let x be a data point. We define the distance of its image in feature space F, φ(x), from the center of the sphere, a, as

    d²_R(x) = ||φ(x) − a||²                                                (6.13)

In view of Equation 6.6 and the definition of the kernel function, we have the kernelized version of the above distance:

    d²_R(x) = K(x, x) − 2 Σ_{k=1}^n βk K(xk, x) + Σ_{k=1}^n Σ_{l=1}^n βk βl K(xk, xl)    (6.14)

Since the solution vector β is sparse, i.e. only the Lagrangian multipliers associated with the support vectors (SV) are non-zero, we can rewrite the above equation using only the n_sv support vectors:

    d²_R(x) = K(x, x) − 2 Σ_{k=1}^{n_sv} βk K(xk, x) + Σ_{k=1}^{n_sv} Σ_{l=1}^{n_sv} βk βl K(xk, xl)    (6.15)

Complexity. The cluster description is a QP problem (see Equation 6.3), with O(n³) worst-case running time. The QP problem can be solved by Sequential Minimal Optimization (SMO) and related decomposition methods; with these methods the time complexity of each sub-problem is reduced to (approximately) O(1) (Ben-Hur et al.).
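The kernelized distance of Equation 6.15 is straightforward to compute once the QP solution is known. Below is a minimal numpy sketch, assuming a Gaussian kernel and that the support vectors and their multipliers βk come from an already-solved SVDD problem; the function names are illustrative and not taken from the thesis software.

```python
import numpy as np

def gaussian_kernel(x, y, q=1.0):
    """Gaussian kernel K(x, y) = exp(-q * ||x - y||^2)."""
    d = np.asarray(x, float) - np.asarray(y, float)
    return float(np.exp(-q * np.dot(d, d)))

def squared_feature_distance(x, svs, betas, q=1.0):
    """Kernelized d_R^2(x) (Eq. 6.15): squared distance of phi(x) from the
    sphere center a = sum_k beta_k phi(x_k), using only the support vectors."""
    k_xx = gaussian_kernel(x, x, q)          # = 1 for the Gaussian kernel
    cross = sum(b * gaussian_kernel(s, x, q) for s, b in zip(svs, betas))
    const = sum(bk * bl * gaussian_kernel(sk, sl, q)
                for sk, bk in zip(svs, betas)
                for sl, bl in zip(svs, betas))
    return k_xx - 2.0 * cross + const
```

With a single support vector and β1 = 1, the center coincides with the image of that point, so its own distance is zero, as expected from Eq. 6.13.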
6. Support Vector Clustering
Phase II: Cluster labeling

Phase I only describes the clusters' boundaries. Given a pair of data points that belong to different clusters, any path that connects them must exit from the sphere in the feature space. Therefore, such a path contains a segment of points y such that dR(y) > R. This leads to the definition of an adjacency matrix A between all pairs of points whose images lie in or on the sphere in feature space.

Let Sij be the line segment connecting xi and xj, such that Sij = {x_{i+1}, x_{i+2}, ..., x_{j−2}, x_{j−1}}, for all i, j = 1, 2, ..., n. Then

    Aij = 1 if ∀y ∈ Sij, dR(y) ≤ R,
          0 otherwise.

Clusters are now defined as the connected components of the graph induced by the matrix A; each component is a cluster. Checking the line segment is implemented by sampling a number m of points between the starting point and the ending point; the exactness of the check depends on the number m. The original Phase II is a bottleneck (worst case): the n × n adjacency matrix requires evaluating d²_R at the m points y sampled along each segment.

Clearly, the BSVs (Bounded Support Vectors) are unclassified by this procedure since their images lie outside the enclosing sphere. One may decide either to leave them unclassified or to assign them to the cluster that they are closest to. Generally, the latter is the most appropriate choice.

Alternatives to the original cluster labeling:
- Cone Cluster Labeling
- Gradient Descent
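The labeling procedure above can be sketched in a few lines: sample m points on each segment, connect a pair when no sample leaves the sphere, then label the connected components. This is a self-contained illustration, not the original implementation; `d2` stands for any callable returning d²_R(y), and the pairwise loop mirrors the bottleneck mentioned above.

```python
import numpy as np
from itertools import combinations

def cluster_labels(points, d2, R2, m=10):
    """Phase II sketch: connect x_i and x_j when every one of m sampled
    interior points of the segment [x_i, x_j] stays inside the sphere
    (d2(y) <= R2), then label the connected components of the graph."""
    points = np.asarray(points, float)
    n = len(points)
    adj = [[] for _ in range(n)]
    for i, j in combinations(range(n), 2):
        ts = np.linspace(0.0, 1.0, m + 2)[1:-1]       # interior sample points
        if all(d2(points[i] + t * (points[j] - points[i])) <= R2 for t in ts):
            adj[i].append(j)
            adj[j].append(i)
    labels = [-1] * n
    cur = 0
    for s in range(n):                                # DFS over the components
        if labels[s] != -1:
            continue
        stack, labels[s] = [s], cur
        while stack:
            u = stack.pop()
            for v in adj[u]:
                if labels[v] == -1:
                    labels[v] = cur
                    stack.append(v)
        cur += 1
    return labels
```

Increasing m makes the segment check more exact at a proportional cost, exactly as discussed on the slide.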
7. Support Vector Clustering
Pseudo-hierarchical execution

Parameters exploration
- The greater the kernel width q, the greater the number of support vectors (and so the number of clusters)
- C rules the number of outliers and allows dealing with strongly overlapping clusters
- A brute-force approach is unfeasible

Approaches proposed in the literature
- Secant-like algorithm for q exploration
- No theoretically rooted method for C exploration

Data analysis is performed at different levels of detail. The execution is pseudo-hierarchical: a strict hierarchy is not guaranteed when C < 1, due to the Bounded Support Vectors.
8. Support Vector Clustering
Proposed improvements

Soft margin C parameter selection
- Heuristics: successfully applied in 90% of cases
- Only 10 tests out of 100 needed further tuning
- Those 10 datasets had a high percentage of missing values

New robust stop criterion
- Based upon relative evaluation criteria (C-index, Dunn index, ad hoc indices)

Kernel width (q) selection
- SVC integration: complexity reduced from O(Qn³) to O(Qn²_sv)
- Softening strategy heuristics
- Applicable to all normalized kernels

More kernels
- Exponential (K(x, y) = e^{−q ||x−y||}) and Laplace (K(x, y) = e^{−q |x−y|}) kernels
9. Support Vector Clustering
Improvements - Stop criterion

Dataset   Detected clusters   Actual clusters   Validity index
Iris      1                   3                 1.00E-06
Iris      3                   3                 0.13
Iris      4                   3                 0.05
Breast    1                   2                 1.00E-05
Breast    2                   2                 0.80
Breast    4                   2                 0.27

The bigger the validity index, the better the clustering found. The stop criterion halts the process when the index value starts to decrease. The idea: the SVC outputs quality-increasing clusterings before reaching the optimal clustering; after that, it provides quality-decreasing partitionings.
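The criterion above amounts to scanning the clusterings produced by successive SVC runs and halting at the first decrease of the validity index. A minimal sketch, assuming each run has been summarized as a (clustering, validity index) pair; the function name is illustrative.

```python
def stop_at_quality_peak(runs):
    """Stop-criterion sketch: given clustering results in execution order as
    (clustering, validity_index) pairs, return the last result seen before
    the validity index starts to decrease (the presumed optimal one)."""
    best = None
    for clustering, index in runs:
        if best is not None and index < best[1]:
            break                      # quality started decreasing: halt
        best = (clustering, index)
    return best
```

On the Iris row of the table above (indices 1.00E-06, 0.13, 0.05) this returns the 3-cluster solution, which matches the actual number of classes.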
10. Support Vector Clustering
Improvements - Kernel width selection

Dataset     Algorithm     Accuracy   Macroaveraging   # iter   # potential "q"
Iris        SVC           88.00%     87.69%           2        9
            + softening   94.00%     93.99%           1        13
            K-means       85.33%     85.11%           not applicable
Wine        SVC           87.07%     87.55%           3        7
            + softening   93.26%     93.91%           2        6
            K-means       50.00%     51.78%           not applicable
Syn02       SVC           88.80%     100.00%          8        18
            + softening   88.00%     100.00%          4        15
            K-means       68.40%     63.84%           not applicable
Syn03       SVC           87.30%     100.00%          17       36
            + softening   87.30%     100.00%          6        31
            K-means       39.47%     39.90%           not applicable
B. Cancer   SVC           91.85%     11.00%           3        11
            + softening   96.71%     2.82%            3        13
            K-means       60.23%     32.00%           not applicable

For the B. Cancer dataset the two result columns report the accuracy on the benign class and the contamination, respectively.
11. Support Vector Clustering
Improvements - non-Gaussian kernels

Exponential kernel: improves the cluster separation in several cases.

Dataset   Algorithm         Accuracy   Macroaveraging   # iter   # potential "q"
Iris      SVC + softening   94.00%     93.99%           1        13
          + Exp kernel      97.33%     97.33%           1        15
          K-means           85.33%     85.11%           not applicable
CLA3      SVC + softening   Failed - only one class out of 3 separated
          + Exp kernel      94.00%     93.99%           1        11
          K-means           85.33%     85.11%           not applicable

Laplace kernel: improves/allows the cluster separation with normalized data.

Dataset   Algorithm          Accuracy   # iter   # potential "q"
Quad      SVC + softening    Failed - no class separated
          + Laplace kernel   99.94%     1        17
          K-means            83.00%     not applicable
SG03      SVC + softening    73.15%     3        19
          + Laplace kernel   91.04%     1        16
          K-means            50.24%     not applicable
12. UNIVERSITÀ degli STUDI di NAPOLI FEDERICO II
Minimum Bregman Information Principle

Bregman Co-clustering (BCC)

Co-clustering: simultaneous clustering of both rows and columns of a data matrix.

The Bregman framework:
- Bregman divergences
- Generalizes the K-means strategy
- Minimum Bregman Information (MBI) principle
- Meta-algorithm

The Bregman divergences (Bregman, 1967) form a large class of distortion/loss functions with a number of desirable properties.

Definition 5.1 (Bregman divergence). Let φ be a real-valued strictly convex function of Legendre type defined on the convex set S ≡ dom(φ) ⊆ R^d. The Bregman divergence dφ : S × ri(S) → [0, ∞) is defined as

    dφ(x1, x2) = φ(x1) − φ(x2) − ⟨x1 − x2, ∇φ(x2)⟩

where ∇φ is the gradient of φ, ⟨·, ·⟩ is the dot product, and ri(S) is the relative interior of S. We define the relative interior of a set C, denoted ri(C), as

    ri(C) = {x ∈ C : B(x, r) ∩ aff(C) ⊆ C for some r > 0},               (5.2)

where B(x, r) is the ball of radius r and center x (Boyd and Vandenberghe, 2004, app. A). Intuitively, the interior of a set C consists of all points of C that are not on the "edge" of C. A proper, closed, convex function φ is said to be of Legendre type if: (i) int(dom(φ)) is non-empty, (ii) φ is strictly convex and differentiable on int(dom(φ)), and (iii) ∀ zb ∈ bd(dom(φ)), lim_{z ∈ dom(φ) → zb} ||∇φ(z)|| → ∞, where int(dom(φ)) is the interior and bd(dom(φ)) is the boundary of the domain of φ (Banerjee et al., 2005c).

Example 5.1 (Squared Euclidean distance). Squared Euclidean distance is perhaps the simplest and most widely used Bregman divergence. The underlying function φ(x) = ⟨x, x⟩ is strictly convex, differentiable in R^d, and

    dφ(x1, x2) = ⟨x1, x1⟩ − ⟨x2, x2⟩ − ⟨x1 − x2, ∇φ(x2)⟩
               = ⟨x1, x1⟩ − ⟨x2, x2⟩ − ⟨x1 − x2, 2x2⟩                     (5.4)
               = ⟨x1, x1⟩ − 2⟨x1, x2⟩ + ⟨x2, x2⟩
               = ||x1 − x2||²

Proposition 5.1. Let X be a random variable that takes values in X = {xi}_{i=1}^n ⊂ S ⊆ R^d following a positive probability distribution measure ν. Given a Bregman divergence dφ : S × ri(S) → [0, ∞), the problem

    min_{s ∈ ri(S)} E_ν[dφ(X, s)]

has a unique solution given by s* = µ = E_ν[X]. Using the proposition above, we can now give a more direct definition of the Bregman Information (BI).

Definition 5.2 (Bregman Information). Let X be a random variable that takes values in X = {xi}_{i=1}^n ⊂ S ⊆ R^d following a positive probability distribution ν, let µ = E_ν[X] ∈ ri(S), and let dφ : S × ri(S) → [0, ∞) be a Bregman divergence. Then the Bregman Information of X in terms of dφ is defined as

    Iφ(X) = E_ν[dφ(X, µ)] = Σ_{i=1}^n νi dφ(xi, µ)                       (5.3)

Example 5.3 (Variance). Let X = {xi}_{i=1}^n be a set in R^d, and consider the uniform measure over X, i.e. νi = 1/n. The Bregman Information of X with squared Euclidean distance as Bregman divergence is actually the variance:

    Iφ(X) = Σ_{i=1}^n νi dφ(xi, µ) = (1/n) Σ_{i=1}^n ||xi − µ||²

Divergence                   Bregman Information   MBI algorithm
Squared Euclidean distance   Variance              K-means (Least Squares)
Relative Entropy             Mutual Information    Maximum Entropy
Itakura-Saito                unnamed               Linde-Buzo-Gray
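Definition 5.1, Proposition 5.1 and Definition 5.2 are easy to check numerically. The sketch below, with illustrative names, implements the generic Bregman divergence and the Bregman Information under the uniform measure, instantiated with φ(x) = ⟨x, x⟩ so that dφ reduces to the squared Euclidean distance and Iφ to the variance.

```python
import numpy as np

def bregman(phi, grad_phi, x1, x2):
    """d_phi(x1, x2) = phi(x1) - phi(x2) - <x1 - x2, grad phi(x2)> (Def. 5.1)."""
    x1, x2 = np.asarray(x1, float), np.asarray(x2, float)
    return phi(x1) - phi(x2) - float(np.dot(x1 - x2, grad_phi(x2)))

# phi(x) = <x, x> yields the squared Euclidean distance (Example 5.1)
phi = lambda x: float(np.dot(x, x))
grad_phi = lambda x: 2.0 * x

def bregman_information(xs, phi, grad_phi):
    """I_phi(X) = (1/n) sum_i d_phi(x_i, mu) under the uniform measure
    (Def. 5.2); by Proposition 5.1 the mean mu is the MBI solution."""
    xs = np.asarray(xs, float)
    mu = xs.mean(axis=0)
    return sum(bregman(phi, grad_phi, x, mu) for x in xs) / len(xs)
```

With this φ, `bregman` returns ||x1 − x2||² and `bregman_information` returns the variance, matching Example 5.3; swapping in another Legendre function φ gives the other rows of the table above.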
13. UNIVERSITÀ degli STUDI di NAPOLI FEDERICO II
Other experiments

Sparse data and missing-valued data

Star/Galaxy data with missing values:

Dataset          SVC      BCC       K-means   # attr. affected   % obj. affected
MV5000 (25D)     99.02%   94.00%    71.08%    10                 27.0%
MV10000 (25D)    96.10%   95.60%    75.12%    10                 29.0%
AMV5000 (15D)    91.76%   79.46%    74.90%    6                  30.0%
AMV10000 (15D)   90.31%   83.51%    68.20%    6                  30.0%

Textual document data: sparsity and high "dimensionality":

Dataset            SVC      BCC       K-means
CLASSIC3 (3303D)   99.80%   100.00%   49.80%
SCI3 (9456D)       failed   89.39%    39.15%
PORE (13821D)      failed   82.68%    45.91%
14. Other experiments

Outliers

Dataset      SVC       Best BCC   K-means   # objects   # outliers
SynDECA 02   100.00%   94.18%     68.04%    1000        112
SynDECA 03   100.00%   49.00%     39.47%    10000       1270

(Figures: clustering results on SynDECA 02 and SynDECA 03.)
15. Conclusions and future works

Conclusions

Support Vector Clustering achieves the goals:

Goals                                     Application domain
Robustness w.r.t. missing-valued data     Astrophysics
Robustness w.r.t. sparse data             Textual documents
Robustness w.r.t. high "dimensionality"   Textual documents
Robustness w.r.t. noise/outliers          Synthetic data

Other properties (verified across the whole experimental stage, independently of the application domain):
- Automatic discovery of the number of clusters
- Handling of nonlinearly separable problems
- Handling of arbitrarily shaped clusters

Bregman Co-clustering achieves the same goals, but the following issues still hold:
- the problem of estimating the number of clusters
- the outliers handling problem
16. Conclusions and future works

Contribution

SVC was made applicable in practice:
- Complexity reduction for the kernel width selection
- Soft margin C parameter estimation
- New effective stop criterion
- Non-Gaussian kernels

The kernel width selection was shown to be applicable to all normalized kernels; the Exponential and Laplace kernels were successfully used.

Improved accuracy:
- Softening strategy for the kernel width selection
17. Conclusion and future works

Future work

Minimum Enclosing Bregman Ball (MEBB)
- Generalization of the Minimum Enclosing Ball (MEB) problem and the Bâdoiu-Clarkson (BC) algorithm to Bregman divergences

Figure 10.1: Examples of Bregman balls, for d = 2 (blue dots are the centers of the balls). The two on the left are obtained by means of the Itakura-Saito distance, the middle one is a classic Euclidean ball, and the other two are obtained by employing the Kullback-Leibler distance (Nock and Nielsen, 2005, fig. 2).

Here, ∇F is the gradient operator of F. A Bregman divergence has the following properties: it is convex in x′, always non-negative, and zero iff x = x′. Whenever F(x) = Σ_{i=1}^d x_i² = ||x||²₂, the corresponding divergence is the squared Euclidean distance (L²₂): DF(x′, x) = ||x − x′||²₂, with which is associated the common definition of a ball in a Euclidean metric space:

    Bc,r = {x ∈ X : ||x − c||²₂ ≤ r},                                      (2)

with c ∈ S the center of the ball and r ≥ 0 its (squared) radius. Eq. (2) suggests a natural generalization to the definition of balls for arbitrary Bregman divergences. However, since a Bregman divergence is usually not symmetric, any c ∈ S and any r ≥ 0 actually define two dual Bregman balls:

    Bc,r  = {x ∈ X : DF(c, x) ≤ r},                                        (3)
    B′c,r = {x ∈ X : DF(x, c) ≤ r}.                                        (4)

Remark that DF(c, x) is always convex in c while DF(x, c) is not always; thus ∂B′c,r is always convex, while ∂Bc,r is not always convex (it depends on x, given c). Let S ⊆ X be a set of m points sampled from X. A smallest enclosing Bregman ball (SEBB) for S is a Bregman ball B_{c*,r*} with r* the minimal real such that S ⊆ B_{c*,r*}. With a slight abuse of language, we will refer to r* as the radius of the ball. Our objective is to approximate as best as possible the SEBB of S, which amounts to minimizing the radius of the enclosing ball we build; as a simple matter of fact, the SEBB is unique.

Lemma 1. The smallest enclosing Bregman ball B_{c*,r*} of S is unique.

Core Vector Machines (CVM)
- SVM reformulated as a MEB problem
- They make use of the BC algorithm

MEBB + CVM = Bregman Vector Machines
- New implications for vector machines
- New implications for SVC

The CVMs reformulate the SVMs as a MEB problem, and we already expressed our will of testing such machines for the cluster description stage of the SVC (see Section 6.12). We recall that the classical BC algorithm is the optimization algorithm exploited by the already mentioned CVMs. Since the BC algorithm has been generalized to Bregman divergences, the research about vector machines (and therefore about the SVC) that exploits the MEBB could have very interesting implications. We definitely intend to explore this way.

Improve and extend the SVC software
- For the sake of accuracy, and in order to perform more robust comparisons with other clustering algorithms, an improved and extended software for Support Vector Clustering (SVC) is needed; more stability and reliability is necessary.
- Adapting cluster labeling algorithms
- Implement all the key contributions to this promising technique proposed all around the world; in fact, all the tests have currently been performed by exploiting only some of the characteristic and/or special contributions available at this time.
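For the Euclidean case, the Bâdoiu-Clarkson iteration mentioned above fits in a few lines: repeatedly move the current center a fraction 1/(t+1) toward the farthest point. This is a sketch under the assumption of the plain squared Euclidean distance; the MEBB generalization would replace that distance with a Bregman divergence DF, and the iteration count shown is illustrative.

```python
import numpy as np

def badoiu_clarkson_meb(points, iters=2000):
    """Badoiu-Clarkson sketch for the (Euclidean) Minimum Enclosing Ball:
    at step t, pull the center 1/(t+1) of the way toward the farthest
    point. Returns (center, radius)."""
    pts = np.asarray(points, float)
    c = pts[0].copy()
    for t in range(1, iters + 1):
        far = pts[np.argmax(((pts - c) ** 2).sum(axis=1))]   # farthest point
        c += (far - c) / (t + 1)
    r = float(np.sqrt(((pts - c) ** 2).sum(axis=1).max()))
    return c, r
```

The iteration yields a (1 + ε)-approximation of the MEB after O(1/ε²) steps, which is what makes the BC algorithm attractive for the CVMs discussed above.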
The End