Fundamentals of Algorithms and Data-Structures in Information-Geometric Spaces

Frank NIELSEN
École Polytechnique, France
Sony Computer Science Laboratories, Inc.

MEXT-ISM Workshop on Information Geometry for Machine Learning
Brain Science Institute, RIKEN
4th December 2014

© 2014 Frank Nielsen 1/75
Brief historical review of Computational Geometry (CG)

◮ Three research periods:
1. Geometric algorithms: Voronoi/Delaunay, minimum spanning trees, data-structures for proximity queries
2. Geometric computing: robustness, algebraic degree of predicates, programs that work/scale!
3. Computational topology: simplicial complexes, filtrations, input = distance matrix
→ paradigm of Topological Data Analysis (TDA)
◮ Showcasing libraries for CG software:
◮ CGAL http://www.cgal.org/, Geometry Factory http://geometryfactory.com/
◮ Gudhi https://project.inria.fr/gudhi/, Ayasdi http://www.ayasdi.com/

© 2014 Frank Nielsen 1.CG History 2/75
Outline

◮ Review of the basic algorithmic toolbox in computational geometry: Voronoi diagrams and dual Delaunay, spanning balls
◮ Generalizations of those concepts and toolbox to information spaces:
◮ Riemannian computational information geometry
◮ Dually affine connections computational information geometry
◮ Applications to clustering, learning mixtures, etc.

What is a good/friendly geometric computing space?

© 2014 Frank Nielsen 1.CG History 3/75
Basics of Euclidean Computational Geometry: Voronoi diagrams and dual Delaunay complexes

© 2014 Frank Nielsen 2.Ordinary CG 4/75
Euclidean (ordinary) Voronoi diagrams

P = {P1, ..., Pn}: n distinct point generators in Euclidean space E^d

V(Pi) = {X : DE(Pi, X) ≤ DE(Pj, X), ∀j ≠ i}

Voronoi diagram = cell complex of the V(Pi)'s with their faces

© 2014 Frank Nielsen 2.Ordinary CG 5/75
Voronoi diagrams from bisectors and ∩ halfspaces

Bisectors:
Bi(P, Q) = {X : DE(P, X) = DE(Q, X)}
→ are hyperplanes in Euclidean geometry

Voronoi cells as halfspace intersections:
V(Pi) = {X : DE(Pi, X) ≤ DE(Pj, X), ∀j ≠ i} = ∩_{j≠i} Bi^+(Pi, Pj)

DE(P, Q) = ‖θ(P) − θ(Q)‖_2 = √( Σ_{i=1}^d (θ_i(P) − θ_i(Q))² )

θ(P) = p: Cartesian coordinate system with θ_j(Pi) = p_i^(j).

⇒ Many applications of Voronoi diagrams: crystal growth, codebook/quantization, molecule interfaces/docking, motion planning, etc.

© 2014 Frank Nielsen 2.Ordinary CG 6/75
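To make the ordinary toolbox concrete, here is a minimal sketch (assuming NumPy and SciPy, whose Voronoi class wraps Qhull; this is not code from the talk) that builds a Euclidean Voronoi diagram and answers a nearest-generator query:

```python
import numpy as np
from scipy.spatial import Voronoi

rng = np.random.default_rng(0)
P = rng.random((10, 2))                 # n = 10 distinct generators in E^2
vor = Voronoi(P)

# vor.vertices     : Voronoi vertices (dual to Delaunay simplices)
# vor.regions      : vertex indices bounding each Voronoi cell V(Pi)
# vor.ridge_points : pairs (i, j) whose bisector Bi(Pi, Pj) supports a Voronoi facet
print(len(vor.vertices), "Voronoi vertices")

# Nearest-generator query: X lies in V(Pi) iff Pi minimizes DE(Pi, X)
X = np.array([0.5, 0.5])
i = int(np.argmin(np.linalg.norm(P - X, axis=1)))
print("query point lies in the cell of generator", i)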
Voronoi diagrams and dual Delaunay simplicial complex

◮ Empty sphere property, max-min angle triangulation, etc.
◮ Voronoi ↔ dual Delaunay triangulation
→ non-degenerate point set = no (d + 2) points co-spherical
◮ Duality: Voronoi k-face ⇔ Delaunay (d − k)-simplex
◮ Bisector Bi(P, Q) perpendicular ⊥ to segment [PQ]

© 2014 Frank Nielsen 2.Ordinary CG 7/75
Voronoi & Delaunay: Complexity and algorithms

◮ Combinatorial complexity: Θ(n^⌈d/2⌉) (→ quadratic in 3D), matched for points on the moment curve: t ↦ (t, t², ..., t^d)
◮ Construction: Θ(n log n + n^⌈d/2⌉), optimal
◮ some output-sensitive algorithms but...
◮ O(n log n + f): not yet optimal output-sensitive algorithms.

© 2014 Frank Nielsen 2.Ordinary CG 8/75
Modeling population spaces in information geometry

Population space {P(x)} interpreted as a smooth manifold equipped with the Fisher Information Matrix (FIM):

◮ Riemannian modeling: metric length space with the FIM as metric tensor (orthogonality), and the Levi-Civita metric connection for length-minimizing geodesics
◮ Dual ±1 affine connection modeling: dual geodesics that describe parallel transport, non-metric dual divergences induced by dual potential Legendre convex functions. Dual ±α connections.

→ Algorithmic considerations of these two approaches
Population space, parameter space, object-oriented geometry, etc.

© 2014 Frank Nielsen 3.Information geometry 9/75
Riemannian computational information geometry from the viewpoint of computing

© 2014 Frank Nielsen 4.Riemannian CIG 10/75
Population spaces: Hotelling (1930) [12] & Rao (1945) [33]

Birth of differential-geometric methods in statistics.
◮ The Fisher information matrix (non-degenerate, positive definite) can be used as a (smooth) Riemannian metric tensor g.
◮ Distance between two populations indexed by θ1 and θ2: Riemannian distance (metric length)

First applications in statistics:
◮ Fisher-Hotelling-Rao (FHR) geodesic distance used in classification: find the closest population to a given set of populations
◮ Used in tests of significance (null versus alternative hypothesis), power of a test: P(reject H0 | H0 is false)
→ define surfaces in population spaces

© 2014 Frank Nielsen 4.Riemannian CIG 11/75
Rao's distance (1945, introduced by Hotelling 1930 [12])

◮ Infinitesimal squared length element:
ds² = Σ_{i,j} g_ij(θ) dθ_i dθ_j = dθ⊤ I(θ) dθ

◮ Geodesics and distances are hard to calculate explicitly:
ρ(p(x; θ1), p(x; θ2)) = min_{θ(s): θ(0)=θ1, θ(1)=θ2} ∫_0^1 √( (dθ/ds)⊤ I(θ) (dθ/ds) ) ds
Rao's distance is not known in closed form for multivariate normals

◮ Advantages: metric property of ρ + many tools of differential geometry [1]: Riemannian Log/Exp tangent/manifold mapping

© 2014 Frank Nielsen 4.Riemannian CIG 12/75
Extrinsic Computational Geometry on tangent planes

◮ Tensor g = Q(x) ≻ 0 defines a smooth inner product ⟨p, q⟩_x = p⊤ Q(x) q that induces a normed distance:
d_x(p, q) = ‖p − q‖_x = √( (p − q)⊤ Q(x) (p − q) )

◮ Mahalanobis metric distance on tangent planes:
Δ(X1, X2) = √( (μ1 − μ2)⊤ Σ⁻¹ (μ1 − μ2) ) = √( Δμ⊤ Σ⁻¹ Δμ )

◮ Cholesky decomposition Σ = LL⊤:
Δ(X1, X2) = DE(L⁻¹ μ1, L⁻¹ μ2)

◮ CG on tangent planes = ordinary CG on transformed points x′ ← L⁻¹ x.

Extrinsic vs intrinsic means [10]

© 2014 Frank Nielsen 4.Riemannian CIG-1.Mahalanobis 13/75
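A minimal sketch of the Cholesky trick above (assuming NumPy; the covariance matrix Σ is an arbitrary example), checking that the Mahalanobis distance equals the Euclidean distance on the transformed points x′ ← L⁻¹x:

```python
import numpy as np

Sigma = np.array([[2.0, -1.0],
                  [-1.0, 2.0]])            # example covariance matrix (assumption)
L = np.linalg.cholesky(Sigma)              # lower-triangular factor, Σ = L L⊤

x1 = np.array([0.0, 0.0])
x2 = np.array([1.0, 2.0])

# Direct Mahalanobis distance
d_direct = np.sqrt((x1 - x2) @ np.linalg.inv(Sigma) @ (x1 - x2))

# Same distance as an ordinary Euclidean distance after the change of coordinates
x1p, x2p = np.linalg.solve(L, x1), np.linalg.solve(L, x2)   # x' = L^{-1} x
d_euclid = np.linalg.norm(x1p - x2p)

assert np.isclose(d_direct, d_euclid)      # extrinsic CG = ordinary CG on transformed points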
Mahalanobis Voronoi diagrams on tangent planes (extrinsic)

In statistics, the covariance matrix Σ accounts for both correlation and dimension (feature) scaling.

Dual structure ≡ anisotropic Delaunay triangulation
⇒ empty circumellipse property (Cholesky decomposition)

© 2014 Frank Nielsen 4.Riemannian CIG-1.Mahalanobis 14/75
Riemannian Mahalanobis metric tensor (Σ⁻¹, PSD)

ρ(p1, p2) = √( (p1 − p2)⊤ Σ⁻¹ (p1 − p2) ),   g(p) = Σ⁻¹ = [ 1 −1 ; −1 2 ]

non-conformal geometry: g(p) ≠ f(p) I

© 2014 Frank Nielsen 4.Riemannian CIG-1.Mahalanobis 15/75
Riemannian statistical Voronoi diagrams

... for statistical population spaces:

◮ Location-scale 2D families have constant non-positive curvature (Hotelling, 1930): Riemannian statistical Voronoi diagrams amount to hyperbolic Voronoi diagrams, or to Euclidean diagrams (location families only, like isotropic Gaussians)
◮ The multinomial family has spherical geometry on the positive orthant: spherical Voronoi diagram (compute as stereographic projection ∝ Euclidean Voronoi diagrams)

But for arbitrary families p(x|θ): geodesics are not in closed form → limited computational framework in practice (ray shooting, etc.)

© 2014 Frank Nielsen 4.Riemannian CIG-1.Mahalanobis 16/75
Normal/Gaussian family and 2D location-scale families

◮ Fisher Information Matrix (FIM):
I(θ) = [ I_{i,j}(θ) = E[ (∂/∂θ_i log p(x|θ)) (∂/∂θ_j log p(x|θ)) ] ]

◮ FIM for univariate normal/multivariate spherical distributions:
I(μ, σ) = [ 1/σ² 0 ; 0 2/σ² ] = (1/σ²) [ 1 0 ; 0 2 ],   I(μ, σ) = diag( 1/σ², ..., 1/σ², 2/σ² )

◮ → amounts to the Poincaré metric (dx² + dy²)/y², hyperbolic geometry in the upper half plane/space.

© 2014 Frank Nielsen 4.Riemannian CIG-2.Hyperbolic geometry 17/75
Riemannian Poincaré upper plane metric tensor (conformal)

cosh ρ(p1, p2) = 1 + ‖p1 − p2‖² / (2 y1 y2),   g(p) = [ 1/y² 0 ; 0 1/y² ] = (1/y²) I

conformal: g(p) = (1/y²) I

© 2014 Frank Nielsen 4.Riemannian CIG-2.Hyperbolic geometry 18/75
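A minimal sketch (assuming NumPy) of the Poincaré upper half-plane distance obtained by inverting the cosh identity above; the two query points are arbitrary examples:

```python
import numpy as np

def poincare_upper_half_plane_distance(p1, p2):
    """p1, p2: points (x, y) of the upper half-plane, y > 0."""
    p1, p2 = np.asarray(p1, float), np.asarray(p2, float)
    c = 1.0 + np.dot(p1 - p2, p1 - p2) / (2.0 * p1[1] * p2[1])
    return np.arccosh(c)

# Two points at the same Euclidean distance are hyperbolically farther apart
# when they sit closer to the boundary y = 0.
print(poincare_upper_half_plane_distance((0.0, 1.0), (1.0, 1.0)))   # ≈ 0.96
print(poincare_upper_half_plane_distance((0.0, 0.1), (1.0, 0.1)))   # ≈ 4.62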
Matrix SPD spaces and hyperbolic geometry

Symmetric Positive Definite matrices M: ∀x ≠ 0, x⊤ M x > 0.

◮ The 2D SPD(2) matrix space has dimension d = 3: a positive cone.
SPD(2) ≅ { (a, b, c) ∈ R³ : a > 0, ab − c² > 0 }
◮ Can be peeled into sheets of dimension 2, each sheet corresponding to a constant value of the determinant of its elements [8]:
SPD(2) = SSPD(2) × R⁺, where SSPD(2) = { (a, b, c) : a > 0, ab − c² = 1 }
◮ Mapping M(a, b, c) → H²:
◮ ( x0 = (a+b)/2 ≥ 1, x1 = (a−b)/2, x2 = c ) in the hyperboloid model [28]
◮ z = (a − b + 2ic)/(2 + a + b) in the Poincaré disk [28].

© 2014 Frank Nielsen 4.Riemannian CIG-2.Hyperbolic geometry 19/75
Riemannian manifolds: Choice of equivalent models?

Many equivalent models of hyperbolic geometry:
◮ Conformal (good for visualization since we can measure angles) versus non-conformal (computationally friendly for geodesics) models.
◮ Convert equivalently to other models of hyperbolic geometry: Poincaré disk, upper half space, hyperboloid, Beltrami hemisphere, etc.

Two questions:
◮ Given a metric tensor g and its induced metric distance ρ_g(p, q), what are the equivalent metric tensors g′ ∼ g such that ρ_g(p, q) = ρ_g′(p′, q′)? Is one metric tensor better for computing space?
◮ Metrics yielding straight geodesics are fully characterized in 2D, but in higher dimensions?

© 2014 Frank Nielsen 4.Riemannian CIG-2.Hyperbolic geometry 20/75
Riemannian Poincaré disk metric tensor (conformal)

→ often used in Human Computer Interfaces, network routing (embedding trees), etc.

© 2014 Frank Nielsen 4.Riemannian CIG-2.Hyperbolic geometry 21/75
Riemannian Klein disk metric tensor (non-conformal)

◮ recommended for "computing space" since geodesics are straight line segments
◮ Klein is also conformal at the origin (so we can perform translations from and back to the origin)
◮ Geodesics passing through O in the Poincaré disk are straight (so we can perform translations from and back to the origin)

© 2014 Frank Nielsen 4.Riemannian CIG-2.Hyperbolic geometry 22/75
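Converting between models is routine in this workflow. Below is a minimal sketch (assuming NumPy) using the standard Poincaré ↔ Klein disk conversion formulas, which are not spelled out in the deck and are quoted here from memory:

```python
import numpy as np

def poincare_to_klein(p):
    # standard formula (assumption): k = 2p / (1 + ||p||²)
    p = np.asarray(p, float)
    return 2.0 * p / (1.0 + p @ p)

def klein_to_poincare(k):
    # standard formula (assumption): p = k / (1 + sqrt(1 − ||k||²))
    k = np.asarray(k, float)
    return k / (1.0 + np.sqrt(1.0 - k @ k))

p = np.array([0.3, 0.4])
k = poincare_to_klein(p)
assert np.allclose(klein_to_poincare(k), p)   # round-trip consistency
# Workflow hinted at in the deck: work in the Klein disk where geodesics are straight
# segments, then map back to the Poincaré disk for conformal visualization.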
Hyperbolic Voronoi diagrams [25, 29]

In arbitrary dimension, H^d:
◮ In the Klein disk, the hyperbolic Voronoi diagram amounts to a clipped affine Voronoi diagram, or a clipped power diagram, with an efficient clipping algorithm [5].
◮ Then convert to other models of hyperbolic geometry: Poincaré disk, upper half space, hyperboloid, Beltrami hemisphere, etc.
◮ Conformal (good for visualization) versus non-conformal (good for computing) models.

© 2014 Frank Nielsen 4.Riemannian CIG-2.Hyperbolic geometry 23/75
Hyperbolic Voronoi diagrams [25, 29]

Hyperbolic Voronoi diagram in the Klein disk = clipped power diagram.
Power distance: ‖x − p‖² − w_p
→ additively weighted ordinary Voronoi = ordinary CG

© 2014 Frank Nielsen 4.Riemannian CIG-2.Hyperbolic geometry 24/75
Hyperbolic Voronoi diagrams [25, 29]

5 common models of the abstract hyperbolic geometry
https://www.youtube.com/watch?v=i9IUzNxeH4o (5 min. video)
ACM Symposium on Computational Geometry (SoCG'14)

© 2014 Frank Nielsen 4.Riemannian CIG-2.Hyperbolic geometry 25/75
Dually affine connection computational information geometry

© 2014 Frank Nielsen 5.Dually flat CIG 26/75
Dually flat space construction from convex functions F

◮ A convex and strictly differentiable function F(θ) admits a Legendre-Fenchel convex conjugate F*(η):
F*(η) = sup_θ (θ⊤η − F(θ)),   ∇F(θ) = η = (∇F*)⁻¹(θ)

◮ Young's inequality gives rise to the canonical divergence [15]:
F(θ) + F*(η′) ≥ θ⊤η′  ⇒  A_{F,F*}(θ, η′) = F(θ) + F*(η′) − θ⊤η′

◮ Writing it using a single coordinate system, we get the dual Bregman divergences:
B_F(θ_p : θ_q) = F(θ_p) − F(θ_q) − (θ_p − θ_q)⊤ ∇F(θ_q)
             = B_{F*}(η_q : η_p) = A_{F,F*}(θ_p, η_q) = A_{F*,F}(η_q : θ_p)

◮ Dual affine coordinate systems with straight geodesics: η = ∇F(θ) ⇔ θ = ∇F*(η). Tensor g(θ) = g*(η)

© 2014 Frank Nielsen 5.Dually flat CIG 27/75
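A minimal sketch (assuming NumPy) of the Bregman divergence B_F(θ_p : θ_q) for a user-supplied generator F and gradient ∇F; the two example generators (half squared Euclidean norm and the extended negative entropy) are standard choices, not taken from the deck:

```python
import numpy as np

def bregman_divergence(F, gradF, theta_p, theta_q):
    """B_F(θp : θq) = F(θp) − F(θq) − <θp − θq, ∇F(θq)>."""
    theta_p, theta_q = np.asarray(theta_p, float), np.asarray(theta_q, float)
    return F(theta_p) - F(theta_q) - (theta_p - theta_q) @ gradF(theta_q)

# Example generators (assumptions):
F_euclid = lambda x: 0.5 * x @ x                 # ½||x||²  -> half squared Euclidean distance
grad_euclid = lambda x: x
F_entropy = lambda x: np.sum(x * np.log(x) - x)  # extended negative entropy -> extended KL
grad_entropy = lambda x: np.log(x)

p, q = np.array([0.2, 0.5, 0.3]), np.array([0.4, 0.4, 0.2])
print(bregman_divergence(F_euclid, grad_euclid, p, q))      # = ½||p − q||²
print(bregman_divergence(F_entropy, grad_entropy, p, q))    # = extended KL(p : q)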
Dual divergence/Bregman dual bisectors [6, 24, 26]

Bregman sided (reference) bisectors related by convex duality:
Bi_F(θ1, θ2) = {θ ∈ Θ | B_F(θ : θ1) = B_F(θ : θ2)}
Bi_{F*}(η1, η2) = {η ∈ H | B_{F*}(η : η1) = B_{F*}(η : η2)}

Right-sided bisector: → θ-hyperplane, η-hypersurface
H_F(p, q) = {x ∈ X | B_F(x : p) = B_F(x : q)}
H_F: ⟨∇F(p) − ∇F(q), x⟩ + (F(p) − F(q) + ⟨q, ∇F(q)⟩ − ⟨p, ∇F(p)⟩) = 0

Left-sided bisector: → θ-hypersurface, η-hyperplane
H′_F(p, q) = {x ∈ X | B_F(p : x) = B_F(q : x)}
H′_F: ⟨∇F(x), q − p⟩ + F(p) − F(q) = 0

hyperplane = autoparallel submanifold of dimension d − 1

© 2014 Frank Nielsen 5.Dually flat CIG-1.bisector 28/75
Visualizing Bregman bisectors

Primal coordinates θ (natural parameters) and dual coordinates η (expectation parameters).
Source space (Itakura-Saito): p(0.52977081, 0.72041688), q(0.85824458, 0.29083834), D(p, q) = 0.66969016, D(q, p) = 0.44835617
Gradient space (Itakura-Saito dual): p′(−1.88760873, −1.38808518), q′(−1.16516903, −3.43833618), D*(p′, q′) = 0.44835617, D*(q′, p′) = 0.66969016

Bi(P, Q) and Bi*(P, Q) can be expressed in either the θ or η coordinate system

© 2014 Frank Nielsen 5.Dually flat CIG-1.bisector 29/75
Spaces of spheres: 1-to-1 mapping between d-spheres and (d + 1)-hyperplanes using potential functions

© 2014 Frank Nielsen 5.Dually flat CIG-2.Space of spheres 30/75
Space of Bregman spheres and Bregman balls [6]

Dual sided Bregman balls (bounding Bregman spheres):
Ball^r_F(c, r) = {x ∈ X | B_F(x : c) ≤ r}
Ball^l_F(c, r) = {x ∈ X | B_F(c : x) ≤ r}

Legendre duality:
Ball^l_F(c, r) = (∇F)⁻¹( Ball^r_{F*}(∇F(c), r) )

Illustration for the Itakura-Saito divergence, F(x) = −log x

© 2014 Frank Nielsen 5.Dually flat CIG-2.Space of spheres 31/75
Space of Bregman spheres: Lifting map [6]

F: x ↦ x̂ = (x, F(x)), hypersurface in R^{d+1}, potential function graph
H_p: tangent hyperplane at p̂, z = H_p(x) = ⟨x − p, ∇F(p)⟩ + F(p)

◮ Bregman sphere σ −→ σ̂ with supporting hyperplane H: z = ⟨x − c, ∇F(c)⟩ + F(c) + r (parallel to H_c and shifted vertically by r), σ̂ = F ∩ H.
◮ The intersection of any hyperplane H with F projects onto X as a Bregman sphere:
H: z = ⟨x, a⟩ + b → σ: Ball_F(c = (∇F)⁻¹(a), r = ⟨a, c⟩ − F(c) + b)

© 2014 Frank Nielsen 5.Dually flat CIG-2.Space of spheres 32/75
Lifting/Polarity: Potential function graph F

© 2014 Frank Nielsen 5.Dually flat CIG-2.Space of spheres 33/75
Space of Bregman spheres: Algorithmic applications [6]

◮ Union/intersection of Bregman d-spheres from the representational (d + 1)-polytope [6]
◮ The radical axis of two Bregman balls is a hyperplane: applications to nearest neighbor search trees like Bregman ball trees or Bregman vantage point trees [31].

© 2014 Frank Nielsen 5.Dually flat CIG-2.Space of spheres 34/75
Bregman proximity data structures [31]

Vantage point trees: partition space according to Bregman balls.
Partitioning space with intersections of Kullback-Leibler balls
→ efficient nearest neighbour queries in information spaces

© 2014 Frank Nielsen 5.Dually flat CIG-2.Space of spheres 35/75
Application: Minimum Enclosing Ball [23, 32]

To a hyperplane H = H(a, b): z = ⟨a, x⟩ + b in R^{d+1} corresponds a ball σ = Ball(c, r) in R^d with center c = ∇F*(a) and radius:
r = ⟨a, c⟩ − F(c) + b = ⟨a, ∇F*(a)⟩ − F(∇F*(a)) + b = F*(a) + b
since F(∇F*(a)) = ⟨∇F*(a), a⟩ − F*(a) (Young equality)

SEB: find the halfspace H(a, b)⁻: z ≤ ⟨a, x⟩ + b that contains all lifted points:
min_{a,b} r = F*(a) + b,   ∀i ∈ {1, ..., n}, ⟨a, x_i⟩ + b − F(x_i) ≥ 0
→ Convex Program (CP) with linear inequality constraints

F(θ) = F*(η) = ½ x⊤x: CP → Quadratic Programming (QP) [11] used in SVM. The smallest enclosing ball is used as a primitive in SVMs [34]

© 2014 Frank Nielsen 5.Dually flat CIG-2.Space of spheres 36/75
Smallest Bregman enclosing balls [32, 22]

Algorithm 1: BBCA(P, l).
  c1 ← choose randomly a point in P;
  for i = 2 to l − 1 do
    // farthest point from c_i wrt. B_F
    s_i ← argmax_{j=1}^{n} B_F(c_i : p_j);
    // update the center: walk on the η-segment [c_i, p_{s_i}]
    c_{i+1} ← ∇F⁻¹( ∇F(c_i) #_{1/(i+1)} ∇F(p_{s_i}) );
  end
  // Return the SEBB approximation
  return Ball(c_l, r_l = B_F(c_l : X));

θ-, η-geodesic segments in dually flat geometry.

© 2014 Frank Nielsen 5.Dually flat CIG-2.Space of spheres 37/75
Smallest enclosing balls: Core-sets [32]

Core-set C ⊆ S: SOL(S) ≤ SOL(C) ≤ (1 + ε) SOL(S)
(Figure: extended Kullback-Leibler and Itakura-Saito examples)

© 2014 Frank Nielsen 5.Dually flat CIG-2.Space of spheres 38/75
InSphere predicates wrt Bregman divergences [6]

Implicit representation of Bregman spheres/balls: consider d + 1 support points on the boundary.

◮ Is x inside the Bregman ball defined by d + 1 support points?
InSphere(x; p0, ..., pd) = det | 1 ... 1 1 ; p0 ... pd x ; F(p0) ... F(pd) F(x) |
◮ sign of a (d + 2) × (d + 2) matrix determinant
◮ InSphere(x; p0, ..., pd) is negative, null or positive depending on whether x lies inside, on, or outside σ.

© 2014 Frank Nielsen 5.Dually flat CIG-2.Space of spheres 39/75
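A minimal sketch (assuming NumPy) of the InSphere predicate as the sign of the (d + 2) × (d + 2) determinant above; the example generator F(x) = ½‖x‖² recovers the classical Euclidean in-circle test (the overall sign depends on the orientation of the support points):

```python
import numpy as np

def insphere(x, support, F):
    """support: d + 1 points of R^d on the Bregman sphere; F: convex generator.
    Returns a signed value: < 0, = 0, > 0 for x inside, on, or outside the sphere
    (up to a global sign fixed by the orientation of the support points)."""
    cols = list(support) + [x]
    M = np.vstack([
        np.ones(len(cols)),                                          # row of 1s
        np.column_stack([np.asarray(c, float) for c in cols]),       # the points, one per column
        np.array([F(np.asarray(c, float)) for c in cols]),           # lifted coordinates F(·)
    ])
    return np.linalg.det(M)

# Example with F(x) = ½||x||²: in-circle test against the unit circle through 3 support points.
F = lambda x: 0.5 * x @ x
support = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([-1.0, 0.0])]
print(np.sign(insphere(np.array([0.0, 0.0]), support, F)))   # inside  -> -1
print(np.sign(insphere(np.array([2.0, 0.0]), support, F)))   # outside -> +1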
Smallest enclosing ball in Riemannian manifolds [2]

c = a #^M_t b: the point γ(t) on the geodesic line segment [ab] wrt M such that ρ_M(a, c) = t × ρ_M(a, b) (with ρ_M the metric distance on manifold M)

Algorithm 2: GeoA
  c1 ← choose randomly a point in P;
  for i = 2 to l do
    // farthest point from c_i
    s_i ← argmax_{j=1}^{n} ρ(c_i, p_j);
    // update the center: walk on the geodesic line segment [c_i, p_{s_i}]
    c_{i+1} ← c_i #^M_{1/(i+1)} p_{s_i};
  end
  // Return the SEB approximation
  return Ball(c_l, r_l = ρ(c_l, P));

© 2014 Frank Nielsen 5.Dually flat CIG-2.Space of spheres 40/75
Approximating the smallest enclosing ball in hyperbolic space

(Figure panels: initialization, first, second, third and fourth iterations, and the result after 10^4 iterations)
http://www.sonycsl.co.jp/person/nielsen/infogeo/RiemannMinimax/

© 2014 Frank Nielsen 5.Dually flat CIG-2.Space of spheres 41/75
Bregman dual regular/Delaunay triangulations

Embedded geodesic Delaunay triangulations + empty Bregman balls
(Figure panels: Delaunay, Exponential Delaunay, Hellinger-like Delaunay)
◮ empty Bregman sphere property,
◮ geodesic triangles: embedded Delaunay.

© 2014 Frank Nielsen 5.Dually flat CIG-2.Space of spheres 42/75
Dually orthogonal Bregman Voronoi & Triangulations

The ordinary Voronoi diagram is perpendicular to the Delaunay triangulation: Voronoi k-face ⊥ Delaunay (d − k)-face

Bi(P, Q) ⊥ γ*(P, Q)
γ(P, Q) ⊥ Bi*(P, Q)

© 2014 Frank Nielsen 5.Dually flat CIG-2.Space of spheres 43/75
Synthetic geometry: Exact characterization of the Bayesian error exponent, but no closed form known

© 2014 Frank Nielsen 6.Bayesian error exponent 44/75
Bayesian hypothesis testing, MAP rule and probability of error Pe

◮ Mixture p(x) = Σ_i w_i p_i(x). Task = classify x: which component?
◮ Prior probabilities: w_i = P(X ∼ P_i) > 0 (with Σ_{i=1}^n w_i = 1)
◮ Conditional probabilities: P(X = x | X ∼ P_i).
P(X = x) = Σ_{i=1}^n P(X ∼ P_i) P(X = x | X ∼ P_i) = Σ_{i=1}^n w_i P(X | P_i)
◮ Best rule = Maximum a posteriori probability (MAP) rule:
map(x) = argmax_{i ∈ {1,...,n}} w_i p_i(x)
where p_i(x) = P(X = x | X ∼ P_i) are the conditional probabilities.
◮ For w1 = w2 = ½, the probability of error is
Pe = ½ ∫ min(p1(x), p2(x)) dx ≤ ½ ∫ p1^α(x) p2^{1−α}(x) dx, for α ∈ (0, 1).
Best exponent α*?

© 2014 Frank Nielsen 6.Bayesian error exponent 45/75
Error exponent for exponential families

◮ Exponential families have finite-dimensional sufficient statistics: → reduce n data to D statistics.
∀x ∈ X, P(x|θ) = exp(θ⊤ t(x) − F(θ) + k(x))
F(·): log-normalizer/cumulant/partition function, k(x): auxiliary term for the carrier measure.
◮ Maximum likelihood estimator (MLE): ∇F(θ̂) = (1/n) Σ_i t(X_i) = η̂
◮ Bijection between exponential families and Bregman divergences:
log p(x|θ) = −B_{F*}(t(x) : η) + F*(t(x)) + k(x)

Exponential families are log-concave

© 2014 Frank Nielsen 6.Bayesian error exponent 46/75
Geometry of the best error exponent

On the exponential family manifold, the Chernoff α-coefficient [7]:
c_α(P1 : P2) = ∫ p1^α(x) p2^{1−α}(x) dμ(x) = exp(−J_F^{(α)}(θ1 : θ2))

Skew Jensen divergence [20] on the natural parameters:
J_F^{(α)}(θ1 : θ2) = α F(θ1) + (1 − α) F(θ2) − F(θ12^(α))

Chernoff information = Bregman divergence for exponential families:
C(P1 : P2) = B(θ1 : θ12^(α*)) = B(θ2 : θ12^(α*))

Finding the best error exponent α*?

© 2014 Frank Nielsen 6.Bayesian error exponent 47/75
Geometry of the best error exponent: binary hypothesis [17]

Chernoff distribution P*:
P* = P*12 = Ge(P1, P2) ∩ Bim(P1, P2)

e-geodesic:
Ge(P1, P2) = { E12^(λ) | θ(E12^(λ)) = (1 − λ)θ1 + λθ2, λ ∈ [0, 1] },

m-bisector:
Bim(P1, P2): { P | F(θ1) − F(θ2) + η(P)⊤ Δθ = 0 },

Optimal natural parameter of P*:
θ* = θ12^(α*) = argmin_θ B(θ1 : θ) = argmin_θ B(θ2 : θ).
→ closed form for order-1 families, or efficient bisection search.

© 2014 Frank Nielsen 6.Bayesian error exponent 48/75
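A minimal sketch (assuming NumPy) of the bisection search mentioned above, locating the Chernoff point θ* on the e-geodesic θ_λ = (1 − λ)θ1 + λθ2 where B_F(θ1 : θ_λ) = B_F(θ2 : θ_λ); the Bernoulli log-normalizer used as an example is an assumption, not from the deck:

```python
import numpy as np

def bregman(F, gradF, tp, tq):
    return float(F(tp) - F(tq) - np.dot(tp - tq, gradF(tq)))

def chernoff_point(F, gradF, theta1, theta2, iters=60):
    """Bisection on λ for the root of B(θ1 : θ_λ) − B(θ2 : θ_λ) along the e-geodesic."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        lam = 0.5 * (lo + hi)
        t = (1.0 - lam) * theta1 + lam * theta2
        gap = bregman(F, gradF, theta1, t) - bregman(F, gradF, theta2, t)
        if gap > 0:      # θ_λ already closer (in divergence) to θ2: move back toward θ1
            hi = lam
        else:
            lo = lam
    lam = 0.5 * (lo + hi)
    return (1.0 - lam) * theta1 + lam * theta2

# Example (assumption): Bernoulli exponential family, scalar natural parameter,
# with log-normalizer F(θ) = log(1 + e^θ).
F = lambda th: np.log1p(np.exp(th))
gradF = lambda th: 1.0 / (1.0 + np.exp(-th))
theta_star = chernoff_point(F, gradF, -1.0, 2.0)
# Chernoff information C(P1 : P2) = B(θ1 : θ*) = B(θ2 : θ*)
print(theta_star, bregman(F, gradF, -1.0, theta_star), bregman(F, gradF, 2.0, theta_star))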
Geometry of the best error exponent: binary hypothesis

P* = P*12 = Ge(P1, P2) ∩ Bim(P1, P2)

(Figure, in the η-coordinate system: the m-bisector Bim(P1, P2) and the e-geodesic Ge(P1, P2) intersect at the Chernoff point P*12; C(θ1 : θ2) = B(θ1 : θ12^(α*)))

Binary Hypothesis Testing: Pe bounded using the Bregman divergence between the Chernoff distribution and the class-conditional distributions.

© 2014 Frank Nielsen 6.Bayesian error exponent 49/75
Clustering and Learning finite statistical mixtures

© 2014 Frank Nielsen 6.Bayesian error exponent 50/75
α-divergences

For α ∈ R, α ≠ ±1, the α-divergences [9] on positive arrays [36]:

◮ D_α(p : q) := Σ_{i=1}^d (4/(1 − α²)) ( ((1 − α)/2) p_i + ((1 + α)/2) q_i − p_i^{(1−α)/2} q_i^{(1+α)/2} )

with D_α(p : q) = D_{−α}(q : p), and in the limit cases D_{−1}(p : q) = KL(p : q) and D_1(p : q) = KL(q : p), where KL is the extended Kullback-Leibler divergence KL(p : q) := Σ_{i=1}^d ( p_i log(p_i/q_i) + q_i − p_i )

◮ α-divergences belong to the class of Csiszár f-divergences I_f(p : q) := Σ_{i=1}^d q_i f(p_i/q_i) with the following generator:
f(t) = (4/(1 − α²)) (1 − t^{(1+α)/2}),  if α ≠ ±1
f(t) = t ln t,  if α = 1
f(t) = −ln t,  if α = −1

Information monotonicity

© 2014 Frank Nielsen 6.Bayesian error exponent 51/75
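A minimal sketch (assuming NumPy) of the α-divergence on positive arrays, with the extended Kullback-Leibler divergence recovered in the limit cases α → ∓1:

```python
import numpy as np

def alpha_divergence(p, q, alpha):
    p, q = np.asarray(p, float), np.asarray(q, float)
    if np.isclose(alpha, -1.0):          # D_{-1}(p : q) = extended KL(p : q)
        return np.sum(p * np.log(p / q) + q - p)
    if np.isclose(alpha, 1.0):           # D_{1}(p : q) = extended KL(q : p)
        return np.sum(q * np.log(q / p) + p - q)
    c = 4.0 / (1.0 - alpha ** 2)
    return c * np.sum((1.0 - alpha) / 2.0 * p + (1.0 + alpha) / 2.0 * q
                      - p ** ((1.0 - alpha) / 2.0) * q ** ((1.0 + alpha) / 2.0))

p = np.array([0.1, 0.6, 0.3])
q = np.array([0.3, 0.3, 0.4])
print(alpha_divergence(p, q, 0.0))                                   # symmetric case
print(alpha_divergence(p, q, -0.999), alpha_divergence(p, q, -1.0))  # → extended KL(p : q)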
Mixed divergences [30]

Defined on three parameters p, q and r:
M_λ(p : q : r) := λ D(p : q) + (1 − λ) D(q : r)
for λ ∈ [0, 1].

Mixed divergences include:
◮ the sided divergences for λ ∈ {0, 1},
◮ the symmetrized (arithmetic mean) divergence for λ = ½, or the skew symmetrized divergence for λ ≠ ½.

© 2014 Frank Nielsen 7.Mixed divergences 52/75
Symmetrizing α-divergences

S_α(p, q) = ½ (D_α(p : q) + D_α(q : p)) = S_{−α}(p, q) = M_{1/2}(p : q : p),

For α = ±1, we get half of Jeffreys divergence:
S_{±1}(p, q) = ½ Σ_{i=1}^d (p_i − q_i) log(p_i/q_i)

◮ Centroids for symmetrized α-divergences are usually not in closed form.
◮ How to perform center-based clustering without closed-form centroids?

© 2014 Frank Nielsen 7.Mixed divergences 53/75
Jeffreys positive centroid [16]

◮ The Jeffreys divergence is the symmetrized α = ±1 divergence.
◮ The Jeffreys positive centroid c = (c1, ..., cd) of a set {h1, ..., hn} of n weighted positive histograms with d bins can be calculated component-wise exactly using the Lambert W analytic function:
c_i = a_i / W( (a_i/g_i) e )
where a_i = Σ_{j=1}^n π_j h_j^i denotes the coordinate-wise arithmetic weighted means and g_i = Π_{j=1}^n (h_j^i)^{π_j} the coordinate-wise geometric weighted means.
◮ The Lambert analytic function W [4] (positive branch) is defined by W(x) e^{W(x)} = x for x ≥ 0.
◮ → Jeffreys k-means clustering. But for α ≠ ±1, how to cluster?

© 2014 Frank Nielsen 7.Mixed divergences 54/75
Mixed α-divergences/α-Jeffreys symmetrized divergence

◮ Mixed α-divergence between a histogram x and two histograms p and q:
M_{λ,α}(p : x : q) = λ D_α(p : x) + (1 − λ) D_α(x : q)
                  = λ D_{−α}(x : p) + (1 − λ) D_{−α}(q : x)
                  = M_{1−λ,−α}(q : x : p),
◮ The α-Jeffreys symmetrized divergence is obtained for λ = ½:
S_α(p, q) = M_{1/2,α}(q : p : q) = M_{1/2,α}(p : q : p)
◮ The skew symmetrized α-divergence is defined by:
S_{λ,α}(p : q) = λ D_α(p : q) + (1 − λ) D_α(q : p)

© 2014 Frank Nielsen 7.Mixed divergences 55/75
Mixed divergence-based k-means clustering

Initialize k distinct seeds from the dataset, with l_i = r_i.

Input: weighted histogram set H, divergence D(·, ·), integer k > 0, real λ ∈ [0, 1];
Initialize left-sided/right-sided seeds C = {(l_i, r_i)}_{i=1}^k;
repeat
  // Assignment
  for i = 1, 2, ..., k do
    C_i ← {h ∈ H : i = argmin_j M_λ(l_j : h : r_j)};
  end
  // Dual-sided centroid relocation
  for i = 1, 2, ..., k do
    r_i ← argmin_x D(C_i : x) = Σ_{h ∈ C_i} w_j D(h : x);
    l_i ← argmin_x D(x : C_i) = Σ_{h ∈ C_i} w_j D(x : h);
  end
until convergence;

© 2014 Frank Nielsen 7.Mixed divergences 56/75
Mixed α-hard clustering: MAhC(H, k, λ, α)

Input: weighted histogram set H, integer k > 0, real λ ∈ [0, 1], real α ∈ R;
Let C = {(l_i, r_i)}_{i=1}^k ← MAS(H, k, λ, α);
repeat
  // Assignment
  for i = 1, 2, ..., k do
    A_i ← {h ∈ H : i = argmin_j M_{λ,α}(l_j : h : r_j)};
  end
  // Centroid relocation
  for i = 1, 2, ..., k do
    r_i ← ( Σ_{h ∈ A_i} w_i h^{(1−α)/2} )^{2/(1−α)};
    l_i ← ( Σ_{h ∈ A_i} w_i h^{(1+α)/2} )^{2/(1+α)};
  end
until convergence;

© 2014 Frank Nielsen 7.Mixed divergences 57/75
Coupled k-Means++ α-Seeding

Algorithm 3: Mixed α-seeding; MAS(H, k, λ, α)
Input: weighted histogram set H, integer k ≥ 1, real λ ∈ [0, 1], real α ∈ R;
Let C ← h_j with uniform probability;
for i = 2, 3, ..., k do
  Pick at random a histogram h ∈ H with probability:
  π_H(h) := w_h M_{λ,α}(c_h : h : c_h) / Σ_{y ∈ H} w_y M_{λ,α}(c_y : y : c_y),   (1)
  // where (c_h, c_h) := argmin_{(z,z) ∈ C} M_{λ,α}(z : h : z);
  C ← C ∪ {(h, h)};
end
Output: set of initial cluster centers C;

→ Guaranteed probabilistic bound. Just need to initialize! No centroid computations.

© 2014 Frank Nielsen 7.Mixed divergences 58/75
Learning MMs: A geometric hard clustering viewpoint

Learn the parameters of a mixture m(x) = Σ_{i=1}^k w_i p(x|θ_i)

Maximize the complete data likelihood = clustering objective function:
max_{W,Λ} l_c(W, Λ) = Σ_{i=1}^n Σ_{j=1}^k z_{i,j} log(w_j p(x_i|θ_j))
= max_Λ Σ_{i=1}^n max_{j=1}^k log(w_j p(x_i|θ_j)) ≡ min_{W,Λ} Σ_{i=1}^n min_{j=1}^k D_j(x_i),

where c_j = (w_j, θ_j) (cluster prototype) and D_j(x_i) = −log p(x_i|θ_j) − log w_j are potential distance-like functions.

Further attach to each cluster a different family of probability distributions.

© 2014 Frank Nielsen 7.Mixed divergences 59/75
Generalized k-MLE for learning statistical mixtures

Model-based clustering: assignment of points to clusters:
D_{w_j,θ_j,F_j}(x) = −log p_{F_j}(x; θ_j) − log w_j

k-GMLE:
1. Initialize the weights W ∈ Δ_k and the family types (F1, ..., Fk) for each cluster
2. Solve min_Λ Σ_i min_j D_j(x_i) (center-based clustering for W fixed) with potential functions: D_j(x_i) = −log p_{F_j}(x_i|θ_j) − log w_j
3. Solve the family types maximizing the MLE in each cluster C_j by choosing the parametric family of distributions F_j = F(γ_j) that yields the best likelihood: min_{F1=F(γ1),...,Fk=F(γk) ∈ F(γ)} min_Λ Σ_i min_j D_{w_j,θ_j,F_j}(x_i).
∀l, γ_l = max_j F*_j( η̂_l = (1/n_l) Σ_{x ∈ C_l} t_j(x) ) + (1/n_l) Σ_{x ∈ C_l} k(x).
4. Update the weights W as the cluster point proportions
5. Test for convergence, and go to step 2) otherwise.

Drawback = biased, non-consistent estimator due to Voronoi support truncation.

© 2014 Frank Nielsen 8.k-GMLE 60/75
Computing f-divergences for generic f: Beyond stochastic numerical integration

© 2014 Frank Nielsen 9.Computing f-divergences 61/75
f-divergences

I_f(X1 : X2) = ∫ x1(x) f( x2(x)/x1(x) ) dν(x) ≥ 0

Name of the f-divergence | Formula I_f(P : Q) | Generator f(u) with f(1) = 0
Total variation (metric) | ½ ∫ |p(x) − q(x)| dν(x) | ½ |u − 1|
Squared Hellinger | ∫ (√p(x) − √q(x))² dν(x) | (√u − 1)²
Pearson χ²_P | ∫ (q(x) − p(x))² / p(x) dν(x) | (u − 1)²
Neyman χ²_N | ∫ (p(x) − q(x))² / q(x) dν(x) | (1 − u)²/u
Pearson-Vajda χ^k_P | ∫ (q(x) − p(x))^k / p^{k−1}(x) dν(x) | (u − 1)^k
Pearson-Vajda |χ|^k_P | ∫ |q(x) − p(x)|^k / p^{k−1}(x) dν(x) | |u − 1|^k
Kullback-Leibler | ∫ p(x) log(p(x)/q(x)) dν(x) | −log u
reverse Kullback-Leibler | ∫ q(x) log(q(x)/p(x)) dν(x) | u log u
α-divergence | (4/(1 − α²)) (1 − ∫ p^{(1−α)/2}(x) q^{(1+α)/2}(x) dν(x)) | (4/(1 − α²)) (1 − u^{(1+α)/2})
Jensen-Shannon | ½ ∫ ( p(x) log(2p(x)/(p(x)+q(x))) + q(x) log(2q(x)/(p(x)+q(x))) ) dν(x) | −(u + 1) log((1+u)/2) + u log u

© 2014 Frank Nielsen 9.Computing f-divergences 62/75
f-divergences and higher-order Vajda χ^k divergences

I_f(X1 : X2) = Σ_{k=0}^∞ (f^{(k)}(1)/k!) χ^k_P(X1 : X2)

χ^k_P(X1 : X2) = ∫ (x2(x) − x1(x))^k / x1(x)^{k−1} dν(x),
|χ|^k_P(X1 : X2) = ∫ |x2(x) − x1(x)|^k / x1(x)^{k−1} dν(x),

are f-divergences for the generators (u − 1)^k and |u − 1|^k.

◮ When k = 1, χ^1_P(X1 : X2) = ∫ (x1(x) − x2(x)) dν(x) = 0 (never discriminative), and |χ|^1_P(X1, X2) is twice the total variation distance.
◮ χ^k_P is a signed distance

© 2014 Frank Nielsen 9.Computing f-divergences 63/75
Affine exponential families

Canonical decomposition of the probability measure:
p(x) = exp(⟨t(x), θ⟩ − F(θ) + k(x)),
consider natural parameter spaces Θ that are affine (like multinomials).

Poi(λ): p(x|λ) = λ^x e^{−λ} / x!,  λ > 0, x ∈ {0, 1, ...}
NorI(μ): p(x|μ) = (2π)^{−d/2} e^{−½(x−μ)⊤(x−μ)},  μ ∈ R^d, x ∈ R^d

Family | θ | Θ | F(θ) | k(x) | t(x) | ν
Poisson | log λ | R | e^θ | −log x! | x | ν_c (counting measure)
Iso. Gaussian | μ | R^d | ½ θ⊤θ | −(d/2) log 2π − ½ x⊤x | x | ν_L (Lebesgue measure)

© 2014 Frank Nielsen 9.Computing f-divergences 64/75
Higher-order Vajda χ^k divergences

The (signed) χ^k_P distance between members X1 ∼ EF(θ1) and X2 ∼ EF(θ2) of the same affine exponential family is (k ∈ N) always bounded and equal to:

χ^k_P(X1 : X2) = Σ_{j=0}^k (−1)^{k−j} (k choose j) e^{F((1−j)θ1 + jθ2)} / e^{(1−j)F(θ1) + jF(θ2)}

For Poisson/Normal distributions, we get closed-form formulas:

χ^k_P(λ1 : λ2) = Σ_{j=0}^k (−1)^{k−j} (k choose j) e^{λ1^{1−j} λ2^j − ((1−j)λ1 + jλ2)},
χ^k_P(μ1 : μ2) = Σ_{j=0}^k (−1)^{k−j} (k choose j) e^{½ j(j−1) (μ1−μ2)⊤(μ1−μ2)}.

© 2014 Frank Nielsen 9.Computing f-divergences 65/75
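A minimal sketch (assuming NumPy and SciPy) of the closed-form χ^k_P between two Poisson distributions, cross-checked against a truncated direct summation of Σ_x (p2(x) − p1(x))^k / p1(x)^{k−1}:

```python
import numpy as np
from math import comb
from scipy.stats import poisson

def chi_k_poisson_closed_form(lam1, lam2, k):
    # closed form from the slide: Σ_j (−1)^{k−j} C(k,j) exp(λ1^{1−j} λ2^j − ((1−j)λ1 + jλ2))
    return sum((-1) ** (k - j) * comb(k, j)
               * np.exp(lam1 ** (1 - j) * lam2 ** j - ((1 - j) * lam1 + j * lam2))
               for j in range(k + 1))

def chi_k_poisson_truncated_sum(lam1, lam2, k, x_max=60):
    # direct summation over {0, ..., x_max}, enough support for small rates
    x = np.arange(x_max + 1)
    p1, p2 = poisson.pmf(x, lam1), poisson.pmf(x, lam2)
    return float(np.sum((p2 - p1) ** k / p1 ** (k - 1)))

lam1, lam2, k = 3.0, 5.0, 3
print(chi_k_poisson_closed_form(lam1, lam2, k))    # closed form
print(chi_k_poisson_truncated_sum(lam1, lam2, k))  # should agree up to truncation error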
f-divergences: Analytic formula [14]

◮ λ = 1 ∈ int(dom(f^{(i)})), f-divergence (Theorem 1 of [3]):
 
TOPIC 8 Temperature and Heat.pdf physics
TOPIC 8 Temperature and Heat.pdf physicsTOPIC 8 Temperature and Heat.pdf physics
TOPIC 8 Temperature and Heat.pdf physics
 
Natural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsNatural Polymer Based Nanomaterials
Natural Polymer Based Nanomaterials
 
zoogeography of pakistan.pptx fauna of Pakistan
zoogeography of pakistan.pptx fauna of Pakistanzoogeography of pakistan.pptx fauna of Pakistan
zoogeography of pakistan.pptx fauna of Pakistan
 
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
 
Isotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on IoIsotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on Io
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
 
GFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptxGFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptx
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
 

Fundamentals cig 4thdec

• 9. Modeling population spaces in information geometry. Population space {P(x)} interpreted as a smooth manifold equipped with the Fisher Information Matrix (FIM): ◮ Riemannian modeling: metric length space with the FIM as metric tensor (orthogonality), and the Levi-Civita metric connection for length-minimizing geodesics ◮ Dual ±1 affine connection modeling: dual geodesics that describe parallel transport, non-metric dual divergences induced by dual potential Legendre convex functions. Dual ±α connections. → Algorithmic considerations of these two approaches. Population space, parameter space, object-oriented geometry, etc.
• 10. Riemannian computational information geometry from the viewpoint of computing
• 11. Population spaces: Hotelling (1930) [12], Rao (1945) [33]. Birth of differential-geometric methods in statistics. ◮ The Fisher information matrix (non-degenerate, positive definite) can be used as a (smooth) Riemannian metric tensor g. ◮ Distance between two populations indexed by θ1 and θ2: the Riemannian distance (metric length). First applications in statistics: ◮ Fisher-Hotelling-Rao (FHR) geodesic distance used in classification: find the closest population to a given set of populations ◮ Used in tests of significance (null versus alternative hypothesis) and the power of a test, P(reject H0 | H0 is false) → define surfaces in population spaces
• 12. Rao's distance (1945, introduced by Hotelling 1930 [12]) ◮ Infinitesimal squared length element: $ds^2 = \sum_{i,j} g_{ij}(\theta)\, d\theta^i d\theta^j = d\theta^\top I(\theta)\, d\theta$ ◮ Geodesics and distances are hard to calculate explicitly: $\rho(p(x;\theta_1), p(x;\theta_2)) = \min_{\theta(s):\ \theta(0)=\theta_1,\ \theta(1)=\theta_2} \int_0^1 \sqrt{\left(\tfrac{d\theta}{ds}\right)^\top I(\theta)\left(\tfrac{d\theta}{ds}\right)}\, ds$. Rao's distance is not known in closed form for multivariate normals ◮ Advantages: the metric property of ρ plus many tools of differential geometry [1]: Riemannian Log/Exp tangent/manifold mappings
• 13. Extrinsic computational geometry on tangent planes ◮ A tensor g = Q(x) ≻ 0 defines a smooth inner product ⟨p, q⟩_x = p^⊤ Q(x) q that induces a normed distance $d_x(p,q) = \|p-q\|_x = \sqrt{(p-q)^\top Q(x)(p-q)}$ ◮ Mahalanobis metric distance on tangent planes: $\Delta_\Sigma(X_1,X_2) = \sqrt{(\mu_1-\mu_2)^\top \Sigma^{-1}(\mu_1-\mu_2)} = \sqrt{\Delta\mu^\top \Sigma^{-1}\,\Delta\mu}$ ◮ Cholesky decomposition Σ = LL^⊤: Δ(X_1,X_2) = D_E(L^{-1}μ_1, L^{-1}μ_2) ◮ CG on tangent planes = ordinary CG on the transformed points x′ ← L^{-1}x (a code sketch follows). Extrinsic vs intrinsic means [10]
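The Cholesky reduction above can be checked with a few lines of NumPy; this is a minimal sketch (the covariance matrix Sigma and the two points are made-up illustrations, not data from the talk) showing that the Mahalanobis distance coincides with the ordinary Euclidean distance after the whitening x' ← L^{-1} x, so standard Euclidean CG code can be reused on the transformed points.

```python
import numpy as np

rng = np.random.default_rng(0)
Sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])               # illustrative SPD covariance matrix
L = np.linalg.cholesky(Sigma)                # Sigma = L L^T

def mahalanobis(p, q, Sigma):
    d = p - q
    return np.sqrt(d @ np.linalg.solve(Sigma, d))

p, q = rng.normal(size=2), rng.normal(size=2)

# Whitening x' = L^{-1} x turns the Mahalanobis distance into the Euclidean one.
Linv = np.linalg.inv(L)
print(mahalanobis(p, q, Sigma))
print(np.linalg.norm(Linv @ p - Linv @ q))   # same value up to rounding
```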
• 14. Mahalanobis Voronoi diagrams on tangent planes (extrinsic). In statistics, the covariance matrix Σ accounts for both correlation and dimension (feature) scaling ⇔ dual structure ≡ anisotropic Delaunay triangulation ⇒ empty circumellipse property (Cholesky decomposition)
• 15. Riemannian Mahalanobis metric tensor (Σ^{-1}, PSD): $\rho(p_1,p_2) = \sqrt{(p_1-p_2)^\top \Sigma^{-1}(p_1-p_2)}$, with g(p) = Σ^{-1} a constant positive-definite matrix. Non-conformal geometry: g(p) ≠ f(p) I
• 16. Riemannian statistical Voronoi diagrams ... for statistical population spaces: ◮ Location-scale 2D families have constant non-positive curvature (Hotelling, 1930): Riemannian statistical Voronoi diagrams amount to hyperbolic Voronoi diagrams or to Euclidean diagrams (location families only, like isotropic Gaussians) ◮ The multinomial family has spherical geometry on the positive orthant: spherical Voronoi diagram (compute via stereographic projection ∝ Euclidean Voronoi diagrams). But for arbitrary families p(x|θ): geodesics are not in closed form → limited computational framework in practice (ray shooting, etc.)
• 17. Normal/Gaussian family and 2D location-scale families ◮ Fisher Information Matrix (FIM): $I(\theta) = [I_{i,j}(\theta)] = E\!\left[\frac{\partial}{\partial\theta_i}\log p(x|\theta)\,\frac{\partial}{\partial\theta_j}\log p(x|\theta)\right]$ ◮ FIM for univariate normal/multivariate spherical distributions: $I(\mu,\sigma) = \begin{pmatrix}\frac{1}{\sigma^2} & 0\\ 0 & \frac{2}{\sigma^2}\end{pmatrix} = \frac{1}{\sigma^2}\begin{pmatrix}1 & 0\\ 0 & 2\end{pmatrix}$, and $I(\mu,\sigma) = \mathrm{diag}\!\left(\frac{1}{\sigma^2},\ldots,\frac{1}{\sigma^2},\frac{2}{\sigma^2}\right)$ ◮ → amounts to the Poincaré metric $\frac{dx^2+dy^2}{y^2}$, hyperbolic geometry in the upper half plane/space.
• 18. Riemannian Poincaré upper plane metric tensor (conformal): $\cosh\rho(p_1,p_2) = 1 + \frac{\|p_1-p_2\|^2}{2 y_1 y_2}$, $g(p) = \begin{pmatrix}\frac{1}{y^2} & 0\\ 0 & \frac{1}{y^2}\end{pmatrix} = \frac{1}{y^2} I$. Conformal: g(p) = (1/y²) I
• 19. Matrix SPD spaces and hyperbolic geometry. Symmetric Positive Definite matrices M: ∀x ≠ 0, x^⊤ M x > 0. ◮ The 2D SPD(2) matrix space has dimension d = 3: a positive cone, SPD(2) ≅ {(a, b, c) ∈ R³ : a > 0, ab − c² > 0} ◮ It can be peeled into sheets of dimension 2, each sheet corresponding to a constant value of the determinant of its elements [8]: SPD(2) = SSPD(2) × R⁺, where SSPD(2) = {(a, b, c) : a > 0, ab − c² = 1} ◮ Mapping M(a, b, c) → H²: x₀ = (a+b)/2 ≥ 1, x₁ = (a−b)/2, x₂ = c in the hyperboloid model [28]; z = (a − b + 2ic)/(2 + a + b) in the Poincaré disk [28].
• 20. Riemannian manifolds: choice among equivalent models? Many equivalent models of hyperbolic geometry: ◮ Conformal (good for visualization since we can measure angles) versus non-conformal (computationally friendly for geodesics) models. ◮ Convert equivalently to the other models of hyperbolic geometry: Poincaré disk, upper half space, hyperboloid, Beltrami hemisphere, etc. Two questions: ◮ Given a metric tensor g and its induced metric distance ρ_g(p, q), what are the equivalent metric tensors g′ ∼ g such that ρ_g(p, q) = ρ_{g′}(p′, q′)? Is one metric tensor better as a computing space? ◮ Metrics yielding straight geodesics are fully characterized in 2D, but what about higher dimensions?
• 21. Riemannian Poincaré disk metric tensor (conformal) → often used in Human-Computer Interfaces, network routing (embedding trees), etc.
• 22. Riemannian Klein disk metric tensor (non-conformal) ◮ recommended as a computing space since geodesics are straight line segments ◮ Klein is also conformal at the origin (so we can perform translations from and back to the origin) ◮ Geodesics passing through O in the Poincaré disk are straight (so we can perform translations from and back to the origin)
• 23. Hyperbolic Voronoi diagrams [25, 29]. In arbitrary dimension, H^d: ◮ In the Klein disk, the hyperbolic Voronoi diagram amounts to a clipped affine Voronoi diagram, or a clipped power diagram with an efficient clipping algorithm [5]. ◮ Then convert to the other models of hyperbolic geometry: Poincaré disk, upper half space, hyperboloid, Beltrami hemisphere, etc. ◮ Conformal (good for visualization) versus non-conformal (good for computing) models.
• 24. Hyperbolic Voronoi diagrams [25, 29]. The hyperbolic Voronoi diagram in the Klein disk = a clipped power diagram. Power distance: ‖x − p‖² − w_p → additively weighted ordinary Voronoi = ordinary CG (a numerical illustration follows).
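As a small numerical illustration of why the Klein disk is computation-friendly (a sketch under the standard Beltrami-Klein distance formula cosh ρ(x, p) = (1 − ⟨x, p⟩)/√((1 − ‖x‖²)(1 − ‖p‖²)); the two sites are arbitrary): the equidistance condition is affine in x, exactly like a power-diagram bisector, so a point taken on that affine hyperplane is hyperbolically equidistant from the two sites.

```python
import numpy as np

def klein_dist(x, p):
    """Hyperbolic distance between points of the open unit ball (Beltrami-Klein model)."""
    num = 1.0 - x @ p
    den = np.sqrt((1.0 - x @ x) * (1.0 - p @ p))
    return np.arccosh(num / den)

p = np.array([0.3, 0.4])
q = np.array([-0.5, 0.1])

# Bisector: (1 - <x,p>) a_p = (1 - <x,q>) a_q with a_p = 1/sqrt(1 - |p|^2),
# i.e. the affine hyperplane <x, a_p p - a_q q> = a_p - a_q.
a_p, a_q = 1 / np.sqrt(1 - p @ p), 1 / np.sqrt(1 - q @ q)
n, c = a_p * p - a_q * q, a_p - a_q

x0 = c / (n @ n) * n                       # a point of the bisector (inside the disk)
x = x0 + 0.2 * np.array([-n[1], n[0]])     # move along the bisector line
print(klein_dist(x, p), klein_dist(x, q))  # equal hyperbolic distances
```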
• 25. Hyperbolic Voronoi diagrams [25, 29]. Five common models of the abstract hyperbolic geometry. https://www.youtube.com/watch?v=i9IUzNxeH4o (5 min. video), ACM Symposium on Computational Geometry (SoCG'14)
• 26. Dually affine connection computational information geometry
• 27. Dually flat space construction from a convex function F ◮ A strictly convex and differentiable function F(θ) admits a Legendre-Fenchel convex conjugate F*(η): $F^*(\eta) = \sup_\theta (\theta^\top\eta - F(\theta))$, with η = ∇F(θ) = (∇F*)^{-1}(θ) ◮ Young's inequality gives rise to the canonical divergence [15]: F(θ) + F*(η′) ≥ θ^⊤η′ ⇒ A_{F,F*}(θ, η′) = F(θ) + F*(η′) − θ^⊤η′ ◮ Writing it with a single coordinate system, we get the dual Bregman divergences: $B_F(\theta_p : \theta_q) = F(\theta_p) - F(\theta_q) - (\theta_p - \theta_q)^\top \nabla F(\theta_q) = B_{F^*}(\eta_q : \eta_p) = A_{F,F^*}(\theta_p, \eta_q) = A_{F^*,F}(\eta_q, \theta_p)$ ◮ dual affine coordinate systems with straight geodesics: η = ∇F(θ) ⇔ θ = ∇F*(η); metric tensors g(θ) = ∇²F(θ) and g*(η) = ∇²F*(η) (a code sketch follows).
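A minimal sketch of this Legendre construction for one concrete generator chosen for illustration, F(θ) = Σ_i exp(θ_i) with conjugate F*(η) = Σ_i (η_i log η_i − η_i), so that ∇F = exp and ∇F* = log; it checks numerically that B_F(θ_p : θ_q) = B_{F*}(η_q : η_p) and that the dual coordinate transformations invert each other. Any other smooth strictly convex generator would do.

```python
import numpy as np

F      = lambda t: np.sum(np.exp(t))            # convex generator (illustrative choice)
gradF  = lambda t: np.exp(t)                    # eta = grad F(theta)
Fstar  = lambda e: np.sum(e * np.log(e) - e)    # Legendre conjugate F*
gradFs = lambda e: np.log(e)                    # theta = grad F*(eta)

def bregman(G, gradG, x, y):
    return G(x) - G(y) - (x - y) @ gradG(y)

theta_p = np.array([0.2, -1.0, 0.5])
theta_q = np.array([1.1,  0.3, -0.4])
eta_p, eta_q = gradF(theta_p), gradF(theta_q)

print(bregman(F, gradF, theta_p, theta_q))           # B_F(theta_p : theta_q)
print(bregman(Fstar, gradFs, eta_q, eta_p))          # B_F*(eta_q : eta_p): same value
print(np.allclose(gradFs(gradF(theta_p)), theta_p))  # dual coordinates invert each other
```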
• 28. Dual divergences/Bregman dual bisectors [6, 24, 26]. Bregman sided (reference) bisectors related by convex duality: Bi_F(θ_1, θ_2) = {θ ∈ Θ | B_F(θ : θ_1) = B_F(θ : θ_2)} and Bi_{F*}(η_1, η_2) = {η ∈ H | B_{F*}(η : η_1) = B_{F*}(η : η_2)}. Right-sided bisector → θ-hyperplane, η-hypersurface: H_F(p, q) = {x ∈ X | B_F(x : p) = B_F(x : q)}, with H_F: ⟨∇F(p) − ∇F(q), x⟩ + (F(p) − F(q) + ⟨q, ∇F(q)⟩ − ⟨p, ∇F(p)⟩) = 0. Left-sided bisector → θ-hypersurface, η-hyperplane: H′_F(p, q) = {x ∈ X | B_F(p : x) = B_F(q : x)}, with H′_F: ⟨∇F(x), q − p⟩ + F(p) − F(q) = 0. A hyperplane = an autoparallel submanifold of dimension d − 1.
• 29. Visualizing Bregman bisectors. [Figure: the primal θ-coordinates (natural parameters) and dual η-coordinates (expectation parameters) for the Itakura-Saito divergence; sample points p(0.52977081, 0.72041688), q(0.85824458, 0.29083834) with D(p,q) = 0.66969016 and D(q,p) = 0.44835617, and their gradient-space images p′(−1.88760873, −1.38808518), q′(−1.16516903, −3.43833618) with D*(p′,q′) = D(q,p) and D*(q′,p′) = D(p,q).] Bi(P,Q) and Bi*(P,Q) can be expressed in either the θ- or the η-coordinate system.
• 30. Spaces of spheres: a 1-to-1 mapping between d-spheres and (d + 1)-hyperplanes using potential functions
• 31. Space of Bregman spheres and Bregman balls [6]. Dual sided Bregman balls (bounded by Bregman spheres): Ball^r_F(c, r) = {x ∈ X | B_F(x : c) ≤ r} and Ball^l_F(c, r) = {x ∈ X | B_F(c : x) ≤ r}. Legendre duality: Ball^l_F(c, r) = (∇F)^{-1}(Ball^r_{F*}(∇F(c), r)). Illustration for the Itakura-Saito divergence, F(x) = −log x
• 32. Space of Bregman spheres: the lifting map [6]. F: x ↦ x̂ = (x, F(x)), a hypersurface in R^{d+1} (the potential function graph). H_p: tangent hyperplane at p̂, z = H_p(x) = ⟨x − p, ∇F(p)⟩ + F(p) ◮ Bregman sphere σ → σ̂ with supporting hyperplane H: z = ⟨x − c, ∇F(c)⟩ + F(c) + r (parallel to H_c and shifted vertically by r), σ̂ = F ∩ H. ◮ The intersection of any hyperplane H with F projects onto X as a Bregman sphere: H: z = ⟨x, a⟩ + b → σ: Ball_F(c = (∇F)^{-1}(a), r = ⟨a, c⟩ − F(c) + b)
• 33. Lifting/polarity: potential function graph F
• 34. Space of Bregman spheres: algorithmic applications [6] ◮ Union/intersection of Bregman d-spheres from a representational (d + 1)-polytope [6] ◮ The radical axis of two Bregman balls is a hyperplane: applications to nearest-neighbor search trees like Bregman ball trees or Bregman vantage point trees [31].
• 35. Bregman proximity data structures [31]. Vantage point trees: partition space according to Bregman balls. Partitioning space with intersections of Kullback-Leibler balls → efficient nearest-neighbour queries in information spaces
• 36. Application: Minimum Enclosing Ball [23, 32]. To a hyperplane H = H(a, b): z = ⟨a, x⟩ + b in R^{d+1} corresponds a ball σ = Ball(c, r) in R^d with center c = ∇F*(a) and radius r = ⟨a, c⟩ − F(c) + b = ⟨a, ∇F*(a)⟩ − F(∇F*(a)) + b = F*(a) + b, since F(∇F*(a)) = ⟨∇F*(a), a⟩ − F*(a) (Young equality). SEB: find a halfspace H(a, b)⁻: z ≤ ⟨a, x⟩ + b containing all the lifted points: $\min_{a,b}\ r = F^*(a) + b$ subject to ⟨a, x_i⟩ + b − F(x_i) ≥ 0 for all i ∈ {1, ..., n} → a Convex Program (CP) with linear inequality constraints. For F(θ) = F*(η) = ½ x^⊤x, the CP becomes the Quadratic Program (QP) [11] used in SVMs. The smallest enclosing ball is used as a primitive in SVMs [34]
• 37. Smallest Bregman enclosing balls [32, 22]. Algorithm 1: BBCA(P, l).
c_1 ← a point chosen at random in P;
for i = 2 to l − 1 do
  // farthest point from c_i wrt B_F: s_i ← argmax_{j ∈ {1,...,n}} B_F(c_i : p_j);
  // update the center by walking on the η-segment [c_i, p_{s_i}]: c_{i+1} ← ∇F^{-1}(∇F(c_i) #_{1/(i+1)} ∇F(p_{s_i}));
end
// Return the SEBB approximation: Ball(c_l, r_l = B_F(c_l : X));
θ- and η-geodesic segments in dually flat geometry (a code sketch follows).
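A short Python sketch of the BBCA center walk, instantiated for the extended Kullback-Leibler divergence on positive vectors, i.e. F(x) = Σ_i (x_i log x_i − x_i), so ∇F = log, ∇F^{-1} = exp and the update is a step along the gradient-coordinate segment. The data, the iteration count and the 1/i step schedule are illustrative choices, not the exact setting of [32, 22].

```python
import numpy as np

# Extended KL generator on positive vectors: F(x) = sum(x log x - x), grad F = log.
F = lambda x: np.sum(x * np.log(x) - x)

def bregman(x, y):                                   # B_F(x : y) = extended KL(x : y)
    return F(x) - F(y) - (x - y) @ np.log(y)

def bbca(P, n_iter=2000, seed=1):
    """Approximate the smallest enclosing Bregman ball of the rows of P."""
    rng = np.random.default_rng(seed)
    c = P[rng.integers(len(P))]                      # random initial center
    for i in range(2, n_iter):
        far = max(P, key=lambda p: bregman(c, p))    # farthest point w.r.t. B_F(c : .)
        # walk on the eta-segment [c, far]: interpolate the gradients with weight 1/i
        c = np.exp(np.log(c) + (np.log(far) - np.log(c)) / i)
    return c, max(bregman(c, p) for p in P)

P = np.abs(np.random.default_rng(0).normal(size=(50, 4))) + 0.1   # illustrative positive data
center, radius = bbca(P)
print(center, radius)
```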
• 38. Smallest enclosing balls: core-sets [32]. A core-set C ⊆ S satisfies SOL(S) ≤ SOL(C) ≤ (1 + ε) SOL(S). [Figures: extended Kullback-Leibler and Itakura-Saito examples.]
• 39. InSphere predicates wrt Bregman divergences [6]. Implicit representation of Bregman spheres/balls: consider d + 1 support points on the boundary ◮ Is x inside the Bregman ball defined by the d + 1 support points? $\mathrm{InSphere}(x; p_0, \ldots, p_d) = \begin{vmatrix} 1 & \cdots & 1 & 1 \\ p_0 & \cdots & p_d & x \\ F(p_0) & \cdots & F(p_d) & F(x) \end{vmatrix}$ ◮ the sign of a (d + 2) × (d + 2) matrix determinant ◮ InSphere(x; p_0, ..., p_d) is negative, null or positive depending on whether x lies inside, on, or outside σ (a code sketch follows).
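The predicate can be coded generically from the lifted determinant; the sketch below instantiates it for F(x) = ½‖x‖² (which gives the ordinary Euclidean InSphere test) and checks that the determinant changes sign between a point inside and a point outside the ball through the three support points. The support points are arbitrary, and the absolute sign convention depends on the orientation of (p_0, ..., p_d).

```python
import numpy as np

F = lambda x: 0.5 * x @ x                    # squared Euclidean generator

def insphere_det(x, pts):
    """Determinant of the (d+2)x(d+2) matrix with columns (1, p, F(p)) for p in pts and x."""
    cols = [np.concatenate(([1.0], p, [F(p)])) for p in list(pts) + [x]]
    return np.linalg.det(np.stack(cols, axis=1))

# Three support points in the plane and their circumscribed (here Euclidean) ball.
pts = np.array([[0.0, 0.0], [1.0, 0.0], [0.2, 0.9]])
A = 2 * (pts[1:] - pts[0])
b = np.sum(pts[1:] ** 2, axis=1) - pts[0] @ pts[0]
center = np.linalg.solve(A, b)
radius = np.linalg.norm(pts[0] - center)

inside, outside = center, center + np.array([10.0, 0.0])
print(np.sign(insphere_det(inside, pts)), np.linalg.norm(inside - center) < radius)
print(np.sign(insphere_det(outside, pts)), np.linalg.norm(outside - center) < radius)
# The two determinants have opposite signs, matching the inside/outside status.
```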
• 52. Smallest enclosing ball in Riemannian manifolds [2]. c = a #^M_t b: the point γ(t) on the geodesic line segment [ab] wrt M such that ρ_M(a, c) = t × ρ_M(a, b) (with ρ_M the metric distance on the manifold M). Algorithm 2: GeoA.
c_1 ← a point chosen at random in P;
for i = 2 to l do
  // farthest point from c_i: s_i ← argmax_{j ∈ {1,...,n}} ρ(c_i, p_j);
  // update the center by walking on the geodesic line segment [c_i, p_{s_i}]: c_{i+1} ← c_i #^M_{1/(i+1)} p_{s_i};
end
// Return the SEB approximation: Ball(c_l, r_l = ρ(c_l, P));
• 53. Approximating the smallest enclosing ball in hyperbolic space. [Figure: initialization, first, second, third and fourth iterations, and the result after 10⁴ iterations.] http://www.sonycsl.co.jp/person/nielsen/infogeo/RiemannMinimax/ (a code sketch follows).
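As a concrete illustration of GeoA, here is a sketch on the hyperboloid model of H², assuming the standard model formulas: the Minkowski inner product ⟨x, y⟩_L = −x₀y₀ + x₁y₁ + x₂y₂, the distance ρ(a, b) = arccosh(−⟨a, b⟩_L), and the closed-form geodesic point a #_t b used in geodesic_point below. The random point cloud and the iteration count are illustrative.

```python
import numpy as np

def minkowski(x, y):                        # Lorentzian inner product of the hyperboloid model
    return -x[0] * y[0] + x[1:] @ y[1:]

def dist(a, b):
    return np.arccosh(max(1.0, -minkowski(a, b)))

def geodesic_point(a, b, t):
    """Point a #_t b on the geodesic from a to b, with dist(a, .) = t * dist(a, b)."""
    d = dist(a, b)
    if d == 0.0:
        return a
    u = (b - np.cosh(d) * a) / np.sinh(d)   # unit tangent vector at a pointing towards b
    return np.cosh(t * d) * a + np.sinh(t * d) * u

def geoa(P, n_iter=2000):
    c = P[0]
    for i in range(2, n_iter):
        far = max(P, key=lambda p: dist(c, p))   # farthest point from the current center
        c = geodesic_point(c, far, 1.0 / i)      # walk 1/i of the way towards it
    return c, max(dist(c, p) for p in P)

rng = np.random.default_rng(0)
v = rng.normal(scale=0.8, size=(30, 2))          # lift random planar points onto the hyperboloid
P = np.column_stack([np.sqrt(1.0 + np.sum(v ** 2, axis=1)), v])
center, radius = geoa(P)
print(center, radius, minkowski(center, center))  # the last value stays close to -1
```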
• 54. Bregman dual regular/Delaunay triangulations. Embedded geodesic Delaunay triangulations + empty Bregman balls. [Figures: ordinary Delaunay, exponential Delaunay, Hellinger-like Delaunay.] ◮ empty Bregman sphere property, ◮ geodesic triangles: embedded Delaunay.
• 55. Dually orthogonal Bregman Voronoi triangulations. The ordinary Voronoi diagram is perpendicular to the Delaunay triangulation: Voronoi k-face ⊥ Delaunay (d − k)-face; Bi(P,Q) ⊥ γ*(P,Q) and γ(P,Q) ⊥ Bi*(P,Q)
• 56. Synthetic geometry: exact characterization of the Bayesian error exponent, but no closed form known
• 57. Bayesian hypothesis testing, the MAP rule and the probability of error Pe ◮ Mixture $p(x) = \sum_i w_i p_i(x)$. Task: classify x. Which component? ◮ Prior probabilities: w_i = P(X ∼ P_i) > 0 (with $\sum_{i=1}^n w_i = 1$) ◮ Conditional probabilities: P(X = x | X ∼ P_i), so that $P(X = x) = \sum_{i=1}^n P(X \sim P_i)\, P(X = x \mid X \sim P_i) = \sum_{i=1}^n w_i P(x \mid P_i)$ ◮ Best rule = the Maximum A Posteriori (MAP) rule: map(x) = argmax_{i∈{1,...,n}} w_i p_i(x), where p_i(x) = P(X = x | X ∼ P_i) are the conditional probabilities. ◮ For w_1 = w_2 = ½, the probability of error is $P_e = \frac{1}{2}\int \min(p_1(x), p_2(x))\, dx \le \frac{1}{2}\int p_1^\alpha(x)\, p_2^{1-\alpha}(x)\, dx$ for α ∈ (0, 1). Best exponent α*
• 58. Error exponent for exponential families ◮ Exponential families have finite-dimensional sufficient statistics: → reduce n data to D statistics. ∀x ∈ X, P(x|θ) = exp(θ^⊤ t(x) − F(θ) + k(x)), where F(·) is the log-normalizer/cumulant/partition function and k(x) an auxiliary term for the carrier measure. ◮ Maximum likelihood estimator (MLE): $\nabla F(\hat\theta) = \frac{1}{n}\sum_i t(x_i) = \hat\eta$ ◮ Bijection between exponential families and Bregman divergences: log p(x|θ) = −B_{F*}(t(x) : η) + F*(t(x)) + k(x). Exponential families are log-concave
• 59. Geometry of the best error exponent. On the exponential family manifold, the Chernoff α-coefficient [7]: $c_\alpha(P_1 : P_2) = \int p_1^\alpha(x)\, p_2^{1-\alpha}(x)\, d\mu(x) = \exp(-J_F^{(\alpha)}(\theta_1 : \theta_2))$, with the skew Jensen divergence [20] on the natural parameters $J_F^{(\alpha)}(\theta_1 : \theta_2) = \alpha F(\theta_1) + (1-\alpha) F(\theta_2) - F(\theta_{12}^{(\alpha)})$, $\theta_{12}^{(\alpha)} = \alpha\theta_1 + (1-\alpha)\theta_2$. Chernoff information = a Bregman divergence for exponential families: $C(P_1 : P_2) = B(\theta_1 : \theta_{12}^{(\alpha^*)}) = B(\theta_2 : \theta_{12}^{(\alpha^*)})$. How to find the best error exponent α*?
• 60. Geometry of the best error exponent: binary hypothesis [17]. The Chernoff distribution P*: P* = P*_{12} = G_e(P_1, P_2) ∩ Bi_m(P_1, P_2). e-geodesic: $G_e(P_1, P_2) = \{E_{12}^{(\lambda)} \mid \theta(E_{12}^{(\lambda)}) = (1-\lambda)\theta_1 + \lambda\theta_2,\ \lambda \in [0,1]\}$; m-bisector: $\mathrm{Bi}_m(P_1, P_2) = \{P \mid F(\theta_1) - F(\theta_2) + \eta(P)^\top(\theta_2 - \theta_1) = 0\}$. Optimal natural parameter of P*: $\theta^* = \theta_{12}^{(\alpha^*)} = \mathrm{argmin}_{\theta \in \mathrm{Bi}_m} B(\theta_1 : \theta) = \mathrm{argmin}_{\theta \in \mathrm{Bi}_m} B(\theta_2 : \theta)$ → closed form for order-1 families, or an efficient bisection search (a code sketch follows).
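For an order-1 exponential family the bisection mentioned above is a few lines of code. The sketch below takes two Poisson distributions (log-normalizer F(θ) = exp(θ), θ = log λ; the rates are illustrative), finds the root of B(θ₁ : θ_α) = B(θ₂ : θ_α) along the e-geodesic by bisection, and cross-checks that the resulting value equals the maximal skew Jensen divergence, i.e. the Chernoff information.

```python
import numpy as np

F  = np.exp                        # Poisson log-normalizer: F(theta) = exp(theta), theta = log(rate)
dF = np.exp

def bregman(t1, t2):               # B_F(t1 : t2) on natural parameters
    return F(t1) - F(t2) - (t1 - t2) * dF(t2)

theta1, theta2 = np.log(2.0), np.log(9.0)        # two illustrative Poisson rates

def gap(a):                        # B(theta1 : theta_a) - B(theta2 : theta_a) on the e-geodesic
    ta = a * theta1 + (1 - a) * theta2
    return bregman(theta1, ta) - bregman(theta2, ta)

lo, hi = 0.0, 1.0                  # gap(0) > 0 > gap(1): bisection search for the root
for _ in range(60):
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if gap(mid) > 0 else (lo, mid)
alpha_star = 0.5 * (lo + hi)
theta_star = alpha_star * theta1 + (1 - alpha_star) * theta2

chernoff_info = bregman(theta1, theta_star)      # equals B(theta2 : theta_star)
grid = np.linspace(1e-6, 1 - 1e-6, 100001)       # cross-check: maximum of the skew Jensen divergence
jensen = grid * F(theta1) + (1 - grid) * F(theta2) - F(grid * theta1 + (1 - grid) * theta2)
print(alpha_star, chernoff_info, jensen.max())   # the last two values agree
```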
• 61. Geometry of the best error exponent: binary hypothesis. P* = P*_{12} = G_e(P_1, P_2) ∩ Bi_m(P_1, P_2). [Figure: in the η-coordinate system, the m-bisector Bi_m(P_1, P_2) and the e-geodesic G_e(P_1, P_2) intersect at P*_{12}, with C(θ_1 : θ_2) = B(θ_1 : θ*_{12}).] Binary hypothesis testing: Pe is bounded using the Bregman divergence between the Chernoff distribution and the class-conditional distributions.
• 62. Clustering and learning finite statistical mixtures
• 63. α-divergences. For α ∈ R, α ≠ ±1, the α-divergences [9] on positive arrays [36]: $D_\alpha(p : q) := \sum_{i=1}^d \frac{4}{1-\alpha^2}\left(\frac{1-\alpha}{2} p^i + \frac{1+\alpha}{2} q^i - (p^i)^{\frac{1-\alpha}{2}} (q^i)^{\frac{1+\alpha}{2}}\right)$, with $D_\alpha(p : q) = D_{-\alpha}(q : p)$ and, in the limit cases, $D_{-1}(p : q) = \mathrm{KL}(p : q)$ and $D_1(p : q) = \mathrm{KL}(q : p)$, where KL is the extended Kullback-Leibler divergence $\mathrm{KL}(p : q) := \sum_{i=1}^d \left(p^i \log\frac{p^i}{q^i} + q^i - p^i\right)$ ◮ α-divergences belong to the class of Csiszár f-divergences $I_f(p : q) := \sum_{i=1}^d q^i f\!\left(\frac{p^i}{q^i}\right)$ with generator f(t) = 4/(1−α²)(1 − t^{(1+α)/2}) if α ≠ ±1, t ln t if α = 1, and −ln t if α = −1. Information monotonicity (a code sketch follows).
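A small sketch of D_α on positive arrays (the arrays are illustrative), checking the reference duality D_α(p : q) = D_{−α}(q : p) and that D_α tends to the extended Kullback-Leibler divergences in the limits α → ∓1.

```python
import numpy as np

def alpha_div(p, q, a):
    """D_alpha(p : q) on positive arrays, for a != +/-1."""
    return np.sum(4.0 / (1 - a ** 2) *
                  ((1 - a) / 2 * p + (1 + a) / 2 * q - p ** ((1 - a) / 2) * q ** ((1 + a) / 2)))

def ext_kl(p, q):                    # extended Kullback-Leibler divergence
    return np.sum(p * np.log(p / q) + q - p)

p = np.array([0.2, 0.5, 1.3, 0.7])
q = np.array([0.4, 0.3, 0.9, 1.1])

print(alpha_div(p, q, 0.5), alpha_div(q, p, -0.5))   # reference duality D_a(p:q) = D_{-a}(q:p)
print(alpha_div(p, q, -0.999), ext_kl(p, q))         # alpha -> -1 recovers KL(p:q)
print(alpha_div(p, q, 0.999), ext_kl(q, p))          # alpha -> +1 recovers KL(q:p)
```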
• 64. Mixed divergences [30]. Defined on three parameters p, q and r: $M_\lambda(p : q : r) := \lambda D(p : q) + (1-\lambda) D(q : r)$ for λ ∈ [0, 1]. Mixed divergences include: ◮ the sided divergences for λ ∈ {0, 1}, ◮ the symmetrized (arithmetic mean) divergence for λ = ½, or the skew symmetrized divergences for λ ≠ ½.
• 65. Symmetrizing α-divergences: $S_\alpha(p, q) = \frac{1}{2}(D_\alpha(p : q) + D_\alpha(q : p)) = S_{-\alpha}(p, q) = M_{\frac{1}{2}}(p : q : p)$. For α = ±1 we get half of the Jeffreys divergence: $S_{\pm 1}(p, q) = \frac{1}{2}\sum_{i=1}^d (p^i - q^i)\log\frac{p^i}{q^i}$ ◮ Centroids for the symmetrized α-divergence are usually not available in closed form. ◮ How to perform center-based clustering without closed-form centroids?
• 66. The Jeffreys positive centroid [16] ◮ The Jeffreys divergence is the symmetrized α = ±1 divergence. ◮ The Jeffreys positive centroid c = (c¹, ..., c^d) of a set {h_1, ..., h_n} of n weighted positive histograms with d bins can be calculated component-wise exactly using the Lambert W analytic function: $c^i = \frac{a^i}{W\!\left(\frac{a^i e}{g^i}\right)}$, where $a^i = \sum_{j=1}^n \pi_j h^i_j$ denotes the coordinate-wise weighted arithmetic means and $g^i = \prod_{j=1}^n (h^i_j)^{\pi_j}$ the coordinate-wise weighted geometric means. ◮ The Lambert analytic function W [4] (positive branch) is defined by W(x) e^{W(x)} = x for x ≥ 0. ◮ → Jeffreys k-means clustering (a code sketch follows). But for α ≠ ±1, how to cluster?
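A sketch of the closed-form Jeffreys positive centroid using scipy.special.lambertw (principal branch); the histograms and weights are made up, and the last line brute-force-checks one coordinate by minimizing the weighted Jeffreys cost numerically.

```python
import numpy as np
from scipy.special import lambertw
from scipy.optimize import minimize_scalar

H  = np.array([[1.0, 4.0, 2.5],          # three positive histograms (rows), illustrative
               [2.0, 1.0, 3.0],
               [0.5, 2.0, 4.0]])
pi = np.array([0.5, 0.3, 0.2])           # normalized weights

a = pi @ H                               # coordinate-wise weighted arithmetic means
g = np.exp(pi @ np.log(H))               # coordinate-wise weighted geometric means
c = a / np.real(lambertw(a * np.e / g))  # Jeffreys positive centroid (closed form)

# Sanity check on the first coordinate: minimize the weighted Jeffreys cost directly.
cost = lambda x: np.sum(pi * (H[:, 0] - x) * np.log(H[:, 0] / x))
print(c[0], minimize_scalar(cost, bounds=(1e-6, 10), method='bounded').x)
```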
• 67. Mixed α-divergences / α-Jeffreys symmetrized divergence ◮ Mixed α-divergence between a histogram x and two histograms p and q: $M_{\lambda,\alpha}(p : x : q) = \lambda D_\alpha(p : x) + (1-\lambda) D_\alpha(x : q) = \lambda D_{-\alpha}(x : p) + (1-\lambda) D_{-\alpha}(q : x) = M_{1-\lambda,-\alpha}(q : x : p)$ ◮ The α-Jeffreys symmetrized divergence is obtained for λ = ½: $S_\alpha(p, q) = M_{\frac{1}{2},\alpha}(q : p : q) = M_{\frac{1}{2},\alpha}(p : q : p)$ ◮ The skew symmetrized α-divergence is defined by $S_{\lambda,\alpha}(p : q) = \lambda D_\alpha(p : q) + (1-\lambda) D_\alpha(q : p)$
• 68. Mixed divergence-based k-means clustering. Start with k distinct seeds from the dataset with l_i = r_i. Input: a weighted histogram set H, a divergence D(·, ·), an integer k > 0, a real λ ∈ [0, 1]; initialize the left-sided/right-sided seeds C = {(l_i, r_i)}_{i=1}^k;
repeat
  // Assignment: for i = 1, ..., k: C_i ← {h ∈ H : i = argmin_j M_λ(l_j : h : r_j)};
  // Dual-sided centroid relocation: for i = 1, ..., k: r_i ← argmin_x D(C_i : x) = Σ_{h∈C_i} w_h D(h : x); l_i ← argmin_x D(x : C_i) = Σ_{h∈C_i} w_h D(x : h);
until convergence;
• 69. Mixed α-hard clustering: MAhC(H, k, λ, α). Input: a weighted histogram set H, an integer k > 0, a real λ ∈ [0, 1], a real α ∈ R; let C = {(l_i, r_i)}_{i=1}^k ← MAS(H, k, λ, α);
repeat
  // Assignment: for i = 1, ..., k: A_i ← {h ∈ H : i = argmin_j M_{λ,α}(l_j : h : r_j)};
  // Centroid relocation: for i = 1, ..., k: $r_i \leftarrow \big(\sum_{h \in A_i} w_h\, h^{\frac{1-\alpha}{2}}\big)^{\frac{2}{1-\alpha}}$; $l_i \leftarrow \big(\sum_{h \in A_i} w_h\, h^{\frac{1+\alpha}{2}}\big)^{\frac{2}{1+\alpha}}$;
until convergence (a code sketch follows);
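A sketch of one assignment plus dual-sided relocation round of the mixed α-hard clustering, with the closed-form sided α-centroids written as coordinate-wise power means; the data, k, λ, α and the per-cluster weight renormalization are illustrative choices (and the sketch does not handle empty clusters).

```python
import numpy as np

def alpha_div(p, q, a):
    return np.sum(4 / (1 - a ** 2) * ((1 - a) / 2 * p + (1 + a) / 2 * q
                                      - p ** ((1 - a) / 2) * q ** ((1 + a) / 2)))

def mixed(l, h, r, lam, a):                          # M_{lambda,alpha}(l : h : r)
    return lam * alpha_div(l, h, a) + (1 - lam) * alpha_div(h, r, a)

def one_round(H, w, L, R, lam, a):
    """One assignment step followed by the dual-sided centroid relocation."""
    labels = np.array([np.argmin([mixed(L[j], h, R[j], lam, a) for j in range(len(L))])
                       for h in H])
    for j in range(len(L)):
        wj = w[labels == j] / w[labels == j].sum()   # renormalized in-cluster weights
        Hj = H[labels == j]
        R[j] = (wj @ Hj ** ((1 - a) / 2)) ** (2 / (1 - a))   # right-sided alpha-centroid
        L[j] = (wj @ Hj ** ((1 + a) / 2)) ** (2 / (1 + a))   # left-sided alpha-centroid
    return labels, L, R

rng = np.random.default_rng(0)
H = rng.gamma(2.0, size=(60, 5))                     # illustrative positive histograms
w = np.full(60, 1 / 60)
L, R = H[:3].copy(), H[:3].copy()                    # k = 3 seeds with l_i = r_i
labels, L, R = one_round(H, w, L, R, lam=0.5, a=0.5)
print(np.bincount(labels))
```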
• 70. Coupled k-means++ α-seeding. Algorithm 3: Mixed α-seeding, MAS(H, k, λ, α). Input: a weighted histogram set H, an integer k ≥ 1, a real λ ∈ [0, 1], a real α ∈ R;
let C ← {h_j} picked with uniform probability;
for i = 2, ..., k do
  pick at random a histogram h ∈ H with probability $\pi_H(h) := \frac{w_h\, M_{\lambda,\alpha}(c_h : h : c_h)}{\sum_{y \in H} w_y\, M_{\lambda,\alpha}(c_y : y : c_y)}$ (1), where $(c_h, c_h) := \mathrm{argmin}_{(z,z) \in C} M_{\lambda,\alpha}(z : h : z)$;
  C ← C ∪ {(h, h)};
end
Output: the set of initial cluster centers C. → Guaranteed probabilistic bound. Just need to initialize! No centroid computations.
• 71. Learning MMs: a geometric hard clustering viewpoint. Learn the parameters of a mixture $m(x) = \sum_{i=1}^k w_i\, p(x|\theta_i)$. Maximizing the complete-data likelihood is a clustering objective function: $\max_{W,\Lambda}\ l_c(W, \Lambda) = \sum_{i=1}^n \sum_{j=1}^k z_{i,j} \log(w_j p(x_i|\theta_j)) = \max \sum_{i=1}^n \max_{j=1,\ldots,k} \log(w_j p(x_i|\theta_j)) \equiv \min_{W,\Lambda} \sum_{i=1}^n \min_{j=1,\ldots,k} D_j(x_i)$, where c_j = (w_j, θ_j) is the cluster prototype and D_j(x_i) = −log p(x_i|θ_j) − log w_j are potential distance-like functions. One may further attach to each cluster a different family of probability distributions.
• 72. Generalized k-MLE for learning statistical mixtures. Model-based clustering with assignment of points to clusters: D_{w_j,θ_j,F_j}(x) = −log p_{F_j}(x; θ_j) − log w_j. k-GMLE: 1. Initialize the weights W ∈ Δ_k and the family type (F_1, ..., F_k) of each cluster. 2. Solve min Σ_i min_j D_j(x_i) (center-based clustering for W fixed) with the potential functions D_j(x_i) = −log p_{F_j}(x_i|θ_j) − log w_j. 3. Solve the family types maximizing the MLE in each cluster C_j by choosing the parametric family of distributions F_j = F(γ_j) that yields the best likelihood: $\min_{F_1 = F(\gamma_1), \ldots, F_k = F(\gamma_k)} \sum_i \min_j D_{w_j,\theta_j,F_j}(x_i)$, using the per-cluster MLE with sufficient statistic average $\hat\eta_l = \frac{1}{n_l}\sum_{x \in C_l} t_j(x)$ and average log-likelihood $F^*_j(\hat\eta_l) + \frac{1}{n_l}\sum_{x \in C_l} k_j(x)$. 4. Update the weights W as the cluster point proportions. 5. Test for convergence, and go to step 2 otherwise. Drawback: a biased, non-consistent estimator due to the Voronoi support truncation.
• 73. Computing f-divergences for a generic f: beyond stochastic numerical integration
• 74. f-divergences: $I_f(X_1 : X_2) = \int x_1(x)\, f\!\left(\frac{x_2(x)}{x_1(x)}\right) d\nu(x) \ge 0$. Common instances (name, formula I_f(P : Q), generator f(u) with f(1) = 0):
- Total variation (metric): $\frac{1}{2}\int |p(x) - q(x)|\, d\nu(x)$, $f(u) = \frac{1}{2}|u - 1|$
- Squared Hellinger: $\int (\sqrt{p(x)} - \sqrt{q(x)})^2\, d\nu(x)$, $f(u) = (\sqrt{u} - 1)^2$
- Pearson $\chi^2_P$: $\int \frac{(q(x) - p(x))^2}{p(x)}\, d\nu(x)$, $f(u) = (u - 1)^2$
- Neyman $\chi^2_N$: $\int \frac{(p(x) - q(x))^2}{q(x)}\, d\nu(x)$, $f(u) = \frac{(1 - u)^2}{u}$
- Pearson-Vajda $\chi^k_P$: $\int \frac{(q(x) - p(x))^k}{p^{k-1}(x)}\, d\nu(x)$, $f(u) = (u - 1)^k$
- Pearson-Vajda $|\chi|^k_P$: $\int \frac{|q(x) - p(x)|^k}{p^{k-1}(x)}\, d\nu(x)$, $f(u) = |u - 1|^k$
- Kullback-Leibler: $\int p(x)\log\frac{p(x)}{q(x)}\, d\nu(x)$, $f(u) = -\log u$
- Reverse Kullback-Leibler: $\int q(x)\log\frac{q(x)}{p(x)}\, d\nu(x)$, $f(u) = u\log u$
- α-divergence: $\frac{4}{1-\alpha^2}\left(1 - \int p^{\frac{1-\alpha}{2}}(x)\, q^{\frac{1+\alpha}{2}}(x)\, d\nu(x)\right)$, $f(u) = \frac{4}{1-\alpha^2}\big(1 - u^{\frac{1+\alpha}{2}}\big)$
- Jensen-Shannon: $\frac{1}{2}\int \left(p(x)\log\frac{2p(x)}{p(x)+q(x)} + q(x)\log\frac{2q(x)}{p(x)+q(x)}\right) d\nu(x)$, $f(u) = \frac{1}{2}\left(u\log u - (u+1)\log\frac{1+u}{2}\right)$
• 75. f-divergences and higher-order Vajda χ^k divergences: $I_f(X_1 : X_2) = \sum_{k=0}^\infty \frac{f^{(k)}(1)}{k!}\, \chi^k_P(X_1 : X_2)$, where $\chi^k_P(X_1 : X_2) = \int \frac{(x_2(x) - x_1(x))^k}{x_1(x)^{k-1}}\, d\nu(x)$ and $|\chi|^k_P(X_1 : X_2) = \int \frac{|x_2(x) - x_1(x)|^k}{x_1(x)^{k-1}}\, d\nu(x)$ are f-divergences for the generators (u − 1)^k and |u − 1|^k. ◮ When k = 1, $\chi^1_P(X_1 : X_2) = \int (x_1(x) - x_2(x))\, d\nu(x) = 0$ (never discriminative), and |χ|^1_P(X_1, X_2) is twice the total variation distance. ◮ χ^k_P is a signed distance.
• 76. Affine exponential families. Canonical decomposition of the probability measure: p(x) = exp(⟨t(x), θ⟩ − F(θ) + k(x)); consider families whose natural parameter space is affine (like the multinomials). $\mathrm{Poi}(\lambda): p(x|\lambda) = \frac{\lambda^x e^{-\lambda}}{x!}$, λ > 0, x ∈ {0, 1, ...}; $\mathrm{Nor}_I(\mu): p(x|\mu) = (2\pi)^{-\frac{d}{2}} e^{-\frac{1}{2}(x-\mu)^\top(x-\mu)}$, μ ∈ R^d, x ∈ R^d. Decompositions: Poisson: θ = log λ ∈ R, F(θ) = e^θ, k(x) = −log x!, t(x) = x, counting measure ν_c; isotropic Gaussian: θ = μ ∈ R^d, F(θ) = ½ θ^⊤θ + (d/2) log 2π, k(x) = −½ x^⊤x, t(x) = x, Lebesgue measure ν_L.
• 77. Higher-order Vajda χ^k divergences. The (signed) χ^k_P distance between members X_1 ∼ EF(θ_1) and X_2 ∼ EF(θ_2) of the same affine exponential family is (k ∈ N) always bounded and equal to $\chi^k_P(X_1 : X_2) = \sum_{j=0}^k (-1)^{k-j} \binom{k}{j} \frac{e^{F((1-j)\theta_1 + j\theta_2)}}{e^{(1-j)F(\theta_1) + jF(\theta_2)}}$. For Poisson/Normal distributions we get closed-form formulas: $\chi^k_P(\lambda_1 : \lambda_2) = \sum_{j=0}^k (-1)^{k-j} \binom{k}{j} e^{\lambda_1^{1-j}\lambda_2^{j} - ((1-j)\lambda_1 + j\lambda_2)}$ and $\chi^k_P(\mu_1 : \mu_2) = \sum_{j=0}^k (-1)^{k-j} \binom{k}{j} e^{\frac{1}{2} j(j-1)(\mu_1 - \mu_2)^\top(\mu_1 - \mu_2)}$ (a numerical check follows).
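A numerical check of the Poisson closed form (the rates and the truncation of the support sum are illustrative): the binomial-sum expression is compared with a direct, truncated evaluation of Σ_x (p₂(x) − p₁(x))^k / p₁(x)^{k−1}.

```python
from math import comb, exp, log, lgamma

def chi_k_closed(lam1, lam2, k):
    """Closed-form chi^k_P between Poisson(lam1) and Poisson(lam2)."""
    return sum((-1) ** (k - j) * comb(k, j)
               * exp(lam1 ** (1 - j) * lam2 ** j - ((1 - j) * lam1 + j * lam2))
               for j in range(k + 1))

def chi_k_direct(lam1, lam2, k, xmax=60):
    """Truncated direct sum of (p2 - p1)^k / p1^(k-1) over the support."""
    pois = lambda lam, x: exp(-lam + x * log(lam) - lgamma(x + 1))
    return sum((pois(lam2, x) - pois(lam1, x)) ** k / pois(lam1, x) ** (k - 1)
               for x in range(xmax))

lam1, lam2 = 3.0, 4.5
for k in (2, 3, 4):
    print(k, chi_k_closed(lam1, lam2, k), chi_k_direct(lam1, lam2, k))
```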
• 78. f-divergences: analytic formula [14] ◮ For λ = 1 ∈ int(dom(f^{(i)})), the f-divergence satisfies (Theorem 1 of [3]): $\left| I_f(X_1 : X_2) - \sum_{k=0}^s \frac{f^{(k)}(1)}{k!}\, \chi^k_P(X_1 : X_2) \right| \le \frac{1}{(s+1)!}\, \|f^{(s+1)}\|_\infty\, (M - m)^{s+1}$, where $\|f^{(s+1)}\|_\infty = \sup_{t \in [m,M]} |f^{(s+1)}(t)|$ and $m \le \frac{p}{q} \le M$. ◮ For λ = 0 (whenever 0 ∈ int(dom(f^{(i)}))) and affine exponential families, a simpler expression holds: $I_f(X_1 : X_2) = \sum_{i=0}^\infty \frac{f^{(i)}(0)}{i!}\, I_{1-i,i}(\theta_1 : \theta_2)$, with $I_{1-i,i}(\theta_1 : \theta_2) = \frac{e^{F(i\theta_2 + (1-i)\theta_1)}}{e^{iF(\theta_2) + (1-i)F(\theta_1)}}$.
• 89. Designing conformal divergences: finding graphical gaps!
• 90. Geometrically designed divergences. [Figure: plot of the convex generator F with the points (p, F(p)), (q, F(q)), the midpoint (p+q)/2, and the gaps realizing B(p : q), J(p, q) and tB(p : q) on the graph F: x ↦ (x, F(x)).]
• 91. Divergences: skew Jensen and Bregman divergences. Let F be a smooth convex function, the generator. ◮ Skew Jensen divergences: $J'_\alpha(p : q) = \alpha F(p) + (1-\alpha)F(q) - F(\alpha p + (1-\alpha)q) = (F(p)F(q))_\alpha - F((pq)_\alpha)$, where $(pq)_\alpha = \alpha p + (1-\alpha)q = q + \alpha(p - q)$ and $(F(p)F(q))_\alpha = \alpha F(p) + (1-\alpha)F(q) = F(q) + \alpha(F(p) - F(q))$. ◮ Bregman divergences: $B(p : q) = F(p) - F(q) - \langle p - q, \nabla F(q)\rangle$; for the scaled skew Jensen divergence $J_\alpha = \frac{1}{\alpha(1-\alpha)} J'_\alpha$, $\lim_{\alpha \to 0} J_\alpha(p : q) = B(p : q)$ and $\lim_{\alpha \to 1} J_\alpha(p : q) = B(q : p)$. ◮ Statistical skew Bhattacharyya divergence: $\mathrm{Bhat}_\alpha(p_1 : p_2) = -\log \int p_1^\alpha(x)\, p_2^{1-\alpha}(x)\, d\nu(x) = J'_\alpha(\theta_1 : \theta_2)$ for exponential families [21].
• 92. Total Bregman divergences [13]. A conformal divergence with conformal factor ρ, D′(p : q) = ρ(p, q) D(p : q), plays the rôle of a regularizer [35] and brings invariance by rotation of the axes of the design space: $\mathrm{tB}(p : q) = \frac{B(p : q)}{\sqrt{1 + \langle\nabla F(q), \nabla F(q)\rangle}} = \rho_B(q)\, B(p : q)$, with $\rho_B(q) = \frac{1}{\sqrt{1 + \langle\nabla F(q), \nabla F(q)\rangle}}$. For example, the total squared Euclidean divergence: $\mathrm{tE}(p, q) = \frac{1}{2}\, \frac{\langle p - q,\, p - q\rangle}{\sqrt{1 + \langle q, q\rangle}}$ (a code sketch follows).
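A sketch of the total Bregman divergence for the squared Euclidean generator, checking its geometric reading: tB(p : q) equals the orthogonal distance in R^{d+1} from the lifted point (p, F(p)) to the tangent hyperplane of the graph of F at (q, F(q)), which is the reason it is invariant under rotations of the design-space axes. The points and the generator are illustrative.

```python
import numpy as np

F     = lambda x: 0.5 * x @ x                 # generator (squared Euclidean case)
gradF = lambda x: x

def tB(p, q):
    """Total Bregman divergence: B_F(p : q) scaled by the conformal factor rho_B(q)."""
    B = F(p) - F(q) - (p - q) @ gradF(q)
    return B / np.sqrt(1.0 + gradF(q) @ gradF(q))

rng = np.random.default_rng(0)
p, q = rng.normal(size=3), rng.normal(size=3)

# Orthogonal distance from the lifted point (p, F(p)) to the tangent hyperplane at q.
normal = np.append(-gradF(q), 1.0)            # normal vector of the tangent hyperplane
offset = F(q) - q @ gradF(q)
lifted = np.append(p, F(p))
ortho  = (lifted @ normal - offset) / np.linalg.norm(normal)
print(tB(p, q), ortho)                        # same value
```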
• 93. Total skew Jensen divergences [27]. $\mathrm{tB}(p : q) = \rho_B(q)\, B(p : q)$ with $\rho_B(q) = \sqrt{\frac{1}{1 + \langle\nabla F(q), \nabla F(q)\rangle}}$, and $\mathrm{tJ}_\alpha(p : q) = \rho_J(p, q)\, J_\alpha(p : q)$ with $\rho_J(p, q) = \sqrt{\frac{1}{1 + \frac{(F(p) - F(q))^2}{\langle p - q,\, p - q\rangle}}}$. The square root of the Jensen-Shannon divergence, $\mathrm{JS}(p, q) = \frac{1}{2}\sum_{i=1}^d p^i \log\frac{2p^i}{p^i + q^i} + \frac{1}{2}\sum_{i=1}^d q^i \log\frac{2q^i}{p^i + q^i}$, is a metric, but the square root of the total Jensen-Shannon divergence is not a metric.
• 94. Summary: geometric computing in information spaces ◮ Location-scale families, spherical normals and symmetric positive-definite matrices → hyperbolic geometry. ◮ Hyperbolic geometry: CG affine constructions in the Klein disk ◮ Space of spheres in dually affine connection geometry ◮ Synthetic geometry for characterizing the best error exponent in the Bayes error ◮ Conformal divergences: total Bregman/total Jensen divergences ◮ Clustering with a pair of centroids per cluster, using mixed divergences for symmetrized α-divergences ◮ Learning statistical mixtures by maximizing the complete likelihood as a sequence of geometric clustering problems: k-GMLE ◮ In search of closed-form solutions: the Jeffreys centroid using the Lambert W function, f-divergence approximations for affine exponential families.
• 95. Computational Information Geometry (edited books) [19] [18]: http://www.springer.com/engineering/signals/book/978-3-642-30231-2, http://www.sonycsl.co.jp/person/nielsen/infogeo/MIG/MIGBOOKWEB/, http://www.springer.com/engineering/signals/book/978-3-319-05316-5, http://www.sonycsl.co.jp/person/nielsen/infogeo/GTI/GeometricTheoryOfInformation.html
• 96. Geometric Sciences of Information (GSI) 2015: October 28-30th, 2015. Deadline: 1st March 2015. http://www.gsi2015.org/
• 97. Thank you!
• 98. References
[1] Marc Arnaudon and Frank Nielsen. On approximating the Riemannian 1-center. Comput. Geom. Theory Appl., 46(1):93-104, January 2013.
[2] Marc Arnaudon and Frank Nielsen. On approximating the Riemannian 1-center. Computational Geometry, 46(1):93-104, 2013.
[3] N. S. Barnett, P. Cerone, S. S. Dragomir, and A. Sofo. Approximating Csiszár f-divergence by the use of Taylor's formula with integral remainder. Mathematical Inequalities & Applications, 5(3):417-434, 2002.
[4] D. A. Barry, P. J. Culligan-Hensley, and S. J. Barry. Real values of the W-function. ACM Trans. Math. Softw., 21(2):161-171, June 1995.
[5] Jean-Daniel Boissonnat and Christophe Delage. Convex hull and Voronoi diagram of additively weighted points. In Gerth Stølting Brodal and Stefano Leonardi, editors, ESA, volume 3669 of Lecture Notes in Computer Science, pages 367-378. Springer, 2005.
[6] Jean-Daniel Boissonnat, Frank Nielsen, and Richard Nock. Bregman Voronoi diagrams. Discrete and Computational Geometry, 44(2):281-307, April 2010.
[7] Herman Chernoff. A measure of asymptotic efficiency for tests of a hypothesis based on the sum of observations. Annals of Mathematical Statistics, 23:493-507, 1952.
[8] Pascal Chossat and Olivier P. Faugeras. Hyperbolic planforms in relation to visual edges and textures perception. PLoS Computational Biology, 5(12), 2009.
[9] Andrzej Cichocki, Sergio Cruces, and Shun-ichi Amari. Generalized alpha-beta divergences and their application to robust nonnegative matrix factorization. Entropy, 13(1):134-170, 2011.
[10] P. Thomas Fletcher, Conglin Lu, Stephen M. Pizer, and Sarang C. Joshi. Principal geodesic analysis for the study of nonlinear statistics of shape. IEEE Trans. Med. Imaging, 23(8):995-1005, 2004.
[11] Bernd Gärtner and Sven Schönherr. An efficient, exact, and generic quadratic programming solver for geometric optimization. In Proceedings of the Sixteenth Annual Symposium on Computational Geometry, pages 110-118. ACM, 2000.
[12] Harold Hotelling.
[13] Meizhu Liu, Baba C. Vemuri, Shun-ichi Amari, and Frank Nielsen. Shape retrieval using hierarchical total Bregman soft clustering. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(12):2407-2419, 2012.
[14] F. Nielsen and R. Nock. On the chi square and higher-order chi distances for approximating f-divergences. IEEE Signal Processing Letters, 21(1):10-13, 2014.
[15] Frank Nielsen. Legendre transformation and information geometry. Technical Report CIG-MEMO2, September 2010.
[16] Frank Nielsen. Jeffreys centroids: A closed-form expression for positive histograms and a guaranteed tight approximation for frequency histograms. IEEE Signal Processing Letters, PP(99):1-1, 2013.
[17] Frank Nielsen. Generalized Bhattacharyya and Chernoff upper bounds on Bayes error using quasi-arithmetic means. Pattern Recognition Letters, 42:25-34, 2014.
[18] Frank Nielsen. Geometric Theory of Information. Springer, 2014.
[19] Frank Nielsen and Rajendra Bhatia, editors. Matrix Information Geometry (Revised Invited Papers). Springer, 2012.
[20] Frank Nielsen and Sylvain Boltz. The Burbea-Rao and Bhattacharyya centroids. IEEE Transactions on Information Theory, 57(8):5455-5466, 2011.
[21] Frank Nielsen and Sylvain Boltz. The Burbea-Rao and Bhattacharyya centroids. IEEE Transactions on Information Theory, 57(8):5455-5466, August 2011.
[22] Frank Nielsen and Richard Nock. On approximating the smallest enclosing Bregman balls. In Proceedings of the Twenty-second Annual Symposium on Computational Geometry (SCG '06), pages 485-486, New York, NY, USA, 2006. ACM.
[23] Frank Nielsen and Richard Nock. On the smallest enclosing information disk. Information Processing Letters (IPL), 105(3):93-97, 2008.
[24] Frank Nielsen and Richard Nock. The dual Voronoi diagrams with respect to representational Bregman divergences. In International Symposium on Voronoi Diagrams (ISVD), pages 71-78, 2009.
[25] Frank Nielsen and Richard Nock. Hyperbolic Voronoi diagrams made easy. In International Conference on Computational Science and Its Applications (ICCSA), pages 74-80. IEEE, 2010.
[26] Frank Nielsen and Richard Nock. Hyperbolic Voronoi diagrams made easy. In International Conference on Computational Science and its Applications (ICCSA), volume 1, pages 74-80, Los Alamitos, CA, USA, March 2010. IEEE Computer Society.
[27] Frank Nielsen and Richard Nock. Total Jensen divergences: Definition, properties and k-means++ clustering. CoRR, abs/1309.7109, 2013.
[28] Frank Nielsen and Richard Nock. Visualizing hyperbolic Voronoi diagrams. In Proceedings of the Thirtieth Annual Symposium on Computational Geometry (SoCG '14), pages 90:90-90:91, New York, NY, USA, 2014. ACM.
[29] Frank Nielsen and Richard Nock. Visualizing hyperbolic Voronoi diagrams. In Symposium on Computational Geometry, page 90, 2014.
[30] Frank Nielsen, Richard Nock, and Shun-ichi Amari. On clustering histograms with k-means by using mixed α-divergences. Entropy, 16(6):3273-3301, 2014.
[31] Frank Nielsen, Paolo Piro, and Michel Barlaud. Bregman vantage point trees for efficient nearest neighbor queries. In Proceedings of the 2009 IEEE International Conference on Multimedia and Expo (ICME), pages 878-881, 2009.
[32] Richard Nock and Frank Nielsen. Fitting the smallest enclosing Bregman ball. In Machine Learning, volume 3720 of Lecture Notes in Computer Science, pages 649-656. Springer Berlin Heidelberg, 2005.
[33] Calyampudi Radhakrishna Rao. Information and the accuracy attainable in the estimation of statistical parameters. Bulletin of the Calcutta Mathematical Society, 37:81-89, 1945.
[34] Ivor W. Tsang, Andras Kocsor, and James T. Kwok. Simpler core vector machines with enclosing balls. In Proceedings of the 24th International Conference on Machine Learning (ICML), pages 911-918, New York, NY, USA, 2007. ACM.
[35] Baba Vemuri, Meizhu Liu, Shun-ichi Amari, and Frank Nielsen. Total Bregman divergence and its applications to DTI analysis. IEEE Transactions on Medical Imaging, pages 475-483, 2011.
[36] Huaiyu Zhu and Richard Rohwer. Measurements of generalisation based on information geometry. In Stephen W. Ellacott, John C. Mason, and Iain J. Anderson, editors, Mathematics of Neural Networks, volume 8 of Operations Research/Computer Science Interfaces Series, pages 394-398. Springer US, 1997.