This document presents a non-convex optimization approach to correlation clustering. Correlation clustering partitions the vertices of a graph with positive and negative edge weights so as to maximize the total weight of edges within clusters. Previous approximation algorithms either did not scale or lacked practical implementations. The proposed non-convex relaxation, solved with the Frank-Wolfe algorithm, provides theoretical guarantees while outperforming other methods in runtime and solution quality on synthetic and real-world datasets. The work emphasizes combining the theoretical guarantees of algorithms research with practical implementations and testing on large datasets.
A Non-convex Optimization Approach to Correlation Clustering
1.
A non-convex optimization approach
to correlation clustering
Erik Thiel, Morteza Haghir Chehreghani, Devdatt Dubhashi
Chalmers University of Technology, Sweden
2. Correlation Clustering
• Input: A graph 𝐺 = (𝑉, 𝐸) with positive or
negative weights 𝑤(𝑒), 𝑒 ∈ 𝐸
• Output: A clustering of the vertices to
maximize the sum of the weights of edges
within each cluster.
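The objective above can be sketched in a few lines of Python. This is an illustrative helper, not the authors' code; the function name and edge representation are assumptions.

```python
# Sketch of the correlation clustering objective (illustrative):
# maximize the total weight of edges that stay inside a cluster.

def cc_objective(edges, assignment):
    """edges: list of (u, v, w) with w positive or negative;
    assignment: dict mapping each vertex to a cluster id."""
    return sum(w for u, v, w in edges if assignment[u] == assignment[v])

edges = [(0, 1, 2.0), (1, 2, -1.5), (0, 2, 0.5)]
# Separating vertex 1 keeps only the +0.5 edge within a cluster.
print(cc_objective(edges, {0: 0, 1: 1, 2: 0}))  # 0.5
# Merging everything keeps all three edges: 2.0 - 1.5 + 0.5.
print(cc_objective(edges, {0: 0, 1: 0, 2: 0}))  # 1.0
```

Note that the second clustering scores higher here, illustrating how the optimal number of clusters emerges from the objective itself.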
3. Difference from usual Clustering
• Weights can be positive or negative!
• It is contentious what counts as a ”good”
quality clustering
• But correlation clustering has an
unambiguous objective
• The number of clusters need not be specified;
it emerges from optimizing the objective.
5. Approximation Algorithms
• Bansal, Blum, Chawla (2004): PTAS on
complete graphs
• Charikar, Guruswami, Wirth (2005): APX-hard
on general graphs
• Charikar et al. (2005), Swamy (2004): 0.76
approximation
• Giotis, Guruswami (2006): PTAS with fixed
number of clusters
8. However …
• No implementation, no code …
• Doesn’t work in practice …
9. A Tale of Two Cultures
Algorithms Theory:
• Deep, elegant theory
• “Polynomial time”
• No implementation
• No experiments on data sets
• Does not work in practice or scale
• Beamer/LaTeX
Machine Learning:
• Sometimes theory
• Linear or sub-linear time
• Well-engineered implementation
• Extensive testing on data sets
• Must work in practice, scale to “Big Data”
• Powerpoint
11. Tightness of Relaxation
• The non-convex relaxation is tight: there is no
gap between the continuous and discrete
problems; a simple proof follows by
randomized rounding.
• In contrast, the SDP relaxation is not tight.
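The tightness claim can be illustrated numerically. In the sketch below (an assumed formulation, not the authors' code) each row of X is a point on the probability simplex over k clusters; rounding each vertex independently to cluster k with probability X[u][k] yields a clustering whose expected objective equals the continuous objective, by multilinearity.

```python
import random

# Illustration of tightness: Monte-Carlo average of the rounded objective
# matches the continuous (relaxed) objective value.

edges = [(0, 1, 2.0), (1, 2, -1.5), (0, 2, 0.5)]
X = [[0.7, 0.3], [0.2, 0.8], [0.5, 0.5]]  # fractional cluster memberships

def relaxed(X):
    # sum over edges of w_uv * Pr[u and v land in the same cluster]
    return sum(w * sum(X[u][k] * X[v][k] for k in range(len(X[0])))
               for u, v, w in edges)

def round_once(X):
    # assign each vertex to a cluster drawn from its row of X
    labels = [random.choices(range(len(row)), weights=row)[0] for row in X]
    return sum(w for u, v, w in edges if labels[u] == labels[v])

random.seed(0)
samples = [round_once(X) for _ in range(20000)]
print(relaxed(X))                   # continuous objective value
print(sum(samples) / len(samples))  # rounded objective, close in expectation
```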
17. Non-convex Convergence Theory
• For a differentiable (but not necessarily
convex) function, the convergence rate of FW
is 𝑂(1/√𝑇).
• If the function is multilinear, the convergence
rate is 𝑂(1/𝑇).
• Note that our correlation clustering objective
is indeed multilinear!
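A minimal Frank-Wolfe sketch for the multilinear relaxation f(X) = Σ_(u,v) w_uv ⟨X_u, X_v⟩, with each row of X on the simplex. The function names, step-size schedule, and example graph are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def frank_wolfe(W, k, iters=200, seed=0):
    """W: symmetric (n, n) signed weight matrix with zero diagonal.
    Hypothetical FW sketch; maximizes sum_{u,v} W[u,v] <X_u, X_v> / 2."""
    rng = np.random.default_rng(seed)
    n = W.shape[0]
    X = rng.dirichlet(np.ones(k), size=n)           # random start in the domain
    for t in range(iters):
        grad = W @ X                                # grad[u, c] = sum_v W[u,v] X[v, c]
        S = np.zeros_like(X)                        # linear maximizer over the
        S[np.arange(n), grad.argmax(axis=1)] = 1.0  # product of simplices
        gamma = 2.0 / (t + 2)                       # standard decaying step size
        X = X + gamma * (S - X)
    return X.argmax(axis=1)                         # read off a clustering

# Two positively tied pairs joined by negative cross edges.
W = np.array([[ 0.,  1., -1., -1.],
              [ 1.,  0., -1., -1.],
              [-1., -1.,  0.,  1.],
              [-1., -1.,  1.,  0.]])
labels = frank_wolfe(W, k=2)
print(labels)
```

The linear minimization oracle over a product of simplices is just a per-row argmax, which is what makes each FW iteration cheap here.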
19. Synthetic Data: Generative Model
• Planted model with k clusters and noise p
• With probability (1 − p), an edge within a
cluster gets a high positive weight and an edge
across clusters a high negative weight; with
probability p, the edge gets an arbitrary weight
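The planted model above might be generated as follows. This is a hypothetical generator: the weight scale `hi` and the uniform noise distribution are illustrative guesses, not parameters stated in the slides.

```python
import random

def planted_graph(n, k, p, hi=10.0, seed=0):
    """Planted model: k hidden clusters, noise level p."""
    rng = random.Random(seed)
    truth = [rng.randrange(k) for _ in range(n)]   # hidden ground-truth clustering
    edges = []
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                w = rng.uniform(-hi, hi)           # noisy, arbitrary weight
            elif truth[u] == truth[v]:
                w = hi                             # strong "same cluster" signal
            else:
                w = -hi                            # strong "different cluster" signal
            edges.append((u, v, w))
    return truth, edges

truth, edges = planted_graph(n=6, k=2, p=0.1)
print(len(edges))  # 15 edges for n = 6 (one per vertex pair)
```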
21. SDP yields very slow and low-quality results:
e.g., 15 hours vs. a couple of seconds for n = 200.
See also [Elsner and Schudy 2009]
29. Correlation Clustering Colours
• Vertices are the Munsell tiles
• The edge between tiles x and y has weight
sim(x, y) − 1/2, where sim is the CIELAB
similarity (between 0 and 1).
• Thus edges between similar tiles have positive
weights and edges between dissimilar tiles
have negative weights.
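The shift by 1/2 is what turns a similarity into a signed weight. A tiny sketch (the tile names and similarity values are made up for illustration):

```python
# Similarities in [0, 1] shifted by 1/2: sim > 0.5 gives a positive edge
# (attract), sim < 0.5 a negative edge (repel).
sim = {("tile_a", "tile_b"): 0.75, ("tile_a", "tile_c"): 0.25}
weights = {pair: s - 0.5 for pair, s in sim.items()}
print(weights)  # {('tile_a', 'tile_b'): 0.25, ('tile_a', 'tile_c'): -0.25}
```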
31. Summary
• A non-convex relaxation solved with Frank-
Wolfe yields an algorithm with guarantees
that handily beats all other methods in both
runtime and quality.
• Combine the theory and rigour of algorithms
research with well-engineered
implementations and extensive testing on
data.