1. DEPARTAMENTO DE INFORMATICA
INVESTIGACION Y POSTGRADO
Introduction to Topological Data Analysis (TDA)
Part 1 - Introduction
Rodrigo Rojas Moraleda
April 22 2012
Rodrigo Rojas Moraleda — Introduction to Topological Data Analysis) (TDA) 1/31
Outline
1 About
2 Introduction
3 Elements of Topology
About
Image Processing Group
Luis Salinas Carrasco: Dr. rer. nat. (Mathematics), Professor at the Department of Computer Science, UTFSM; Electronic Engineer. Areas of interest: computer science and computational methods in engineering, function theory, theoretical computer science and mathematical foundations of computer science, signal and image processing.
Steffen Härtel: Dr. rer. nat. (Biophysics), group leader of the Laboratory of Scientific Image Processing (SCIAN-Lab) at the Medical Faculty of the University of Chile, Santiago, Chile. Areas of interest: developing mathematical tools and computational algorithms to access dynamic, morphologic, and topologic features in experimental systems with a biophysical, biological, or medical background.
Rodrigo Rojas Moraleda: Ph.D. student in Computer Science; Engineer in Computer Science; B.Sc. in Engineering, minor in Computer Science. Areas of interest: image processing, computer vision, data analysis, biomedical applications.
Raquel Pezoa Rivera: Ph.D. student in Computer Science; M.Sc. in Computer Science, Universidad Técnica Federico Santa María (UTFSM); Computer Science Engineer, UTFSM.
Paola Arce: Ph.D. student in Computer Science. Time series data analysis.
Cesar Fernandez: Ph.D. student in Computer Science. Time series data analysis.
Summary of contents
Part 1 - Introduction
Introduction & Motivation
What is topology?
What is computational topology?
Elements of Topology
Manifolds
Topological spaces
Introduction to simplicial complexes
Geometrical Simplicial Complexes
Abstract Simplicial Complexes
Part 2 - Homology
What is homology?
Elements of homology by example
Chains
Cycles
Boundary operator
Homology classes
Homology groups
Betti numbers
Part 3 - Simplicial Homology
Point Clouds
Homotopy & Nerves
Alpha complex
Rips complex
Čech complex
Sandwich theorem
Part 4 - Persistent Homology
Concepts
Filtration
Barcodes
Uses and applications
Principle
Shape of data matters
Statistically, a crude aspect of shape is how many connected components the data breaks into. Clusters (groups) capture part of the phenomenon.
Shape of data matters
What is the shape of data?
Formally, it is defined in terms of a distance function (a metric), e.g.
correlation distances
Hamming distances for binary data
Euclidean distances
distances between letters of the alphabet in gene sequences.
These encode a notion of similarity or dissimilarity.
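As a minimal sketch of the distance functions listed above, the following Python computes a Hamming, a Euclidean, and a correlation distance. The function names are illustrative and do not come from the slides, and the correlation distance is assumed here to be one minus the Pearson correlation, which is one common convention.

```python
import math

def hamming_distance(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))

def euclidean_distance(p, q):
    """Straight-line distance between two points given as coordinate tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))

def correlation_distance(x, y):
    """One minus the Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return 1.0 - cov / (sx * sy)
```

For example, `hamming_distance("GATTACA", "GACTATA")` counts the two positions where the sequences differ, and `euclidean_distance((0, 0), (3, 4))` gives 5.0.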
Shape of data matters
We trust in small distances
In physics, theory justifies the notion of distance or measurement.
In the life or social sciences, distances (metrics) are constructed from a notion of similarity (proximity).
Shape of data matters
We trust in small distances
[Figure: two pairs of points, at distances 1.8 and 1.5]
Both pairs are regarded as similar, but the strength of the similarity as encoded by the distance may not be significant.
Shape of data matters
We trust in small distances?
Even local connections are noisy, depending on the observer's scale!
[Figure: scatter plot of points, both axes ranging from -15 to 15]
Is it a circle, dots, or a circle of circles?
To see the circle, we ignore variations in small distances (tolerance for proximity), because distance measurements are noisy.
Similar objects lie in a neighborhood of each other, which suffices to define a topology.
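The dependence on the observer's scale can be illustrated with a small sketch. The data set below is a hypothetical "circle of circles" (not the data behind the slide's figure), and the function names and the three tolerance values are assumptions chosen for illustration. Two points are joined when they are closer than a tolerance `eps`, and the connected components are counted with a simple union-find.

```python
import math

def circle_of_circles(n_small=8, pts_per_small=6, big_r=10.0, small_r=1.0):
    """Points on small circles whose centers lie on one big circle."""
    points = []
    for i in range(n_small):
        a = 2 * math.pi * i / n_small
        cx, cy = big_r * math.cos(a), big_r * math.sin(a)
        for j in range(pts_per_small):
            b = 2 * math.pi * j / pts_per_small
            points.append((cx + small_r * math.cos(b), cy + small_r * math.sin(b)))
    return points

def count_components(points, eps):
    """Connected components of the graph joining points closer than eps."""
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) < eps:
                parent[find(i)] = find(j)
    return len({find(i) for i in range(len(points))})

points = circle_of_circles()
counts = {eps: count_components(points, eps) for eps in (0.5, 1.5, 10.0)}
```

With these parameters, the smallest tolerance sees 48 isolated dots, the middle one sees 8 small circles, and the largest one sees a single connected shape.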
What is topology?
According to Gunnar Carlsson, a mathematical formalism for doing two things:
◦ Measuring shapes
· What does it mean to measure a shape?
· Counting loops, clusters?
· It is not something that can easily be expressed as a number.
· It is a vague notion.
◦ Representing shapes
· A shape in a metric space is an infinite list of points and distances.
· We would like to represent and understand the shape with a much smaller amount of information, such as a triangulation or a set of nodes that captures the structure of the space in a much simpler representation.
What is topology?
This is the subject of topology.
Origins of topology in mathematics:
Leonhard Euler, 1736, Seven Bridges of Königsberg
Johann Benedict Listing, 1847, Vorstudien zur Topologie
J. B. Listing (obituary), Nature 27:316-317, 1883: "qualitative geometry from the ordinary geometry in which quantitative relations chiefly are treated."
In the last 10 years, topology has moved into the field of point clouds.
Shape of data matters
→ An object representation contains enough information to reconstruct (an
approximation to) the object.
→ A description only contains enough information to identify an object as
a member of some class.
What is Topology?
According to the Oxford English Dictionary, the word topology is derived from topos (τόπος), meaning place, and -logy (λογία), a variant of the verb λέγειν, meaning to speak. As such, topology speaks about places: how local neighborhoods connect to each other to form a space.
Topology: the study of shapes (topological spaces) that can be deformed into other shapes in a continuous manner (without tearing).
If one shape can be deformed into another, we say they are topologically equivalent.
In contrast, Euclidean geometry studies intrinsic properties of $n$-dimensional Euclidean space that are invariant under Euclidean transformations, such as translations and orthogonal transformations in $E^n$.
What is Computational Topology?
Computational topology has theoretical and practical goals.
Theoretically, it looks at the tractability and complexity of each problem, as well
as the design of efficient data structures and algorithms.
Practically, it involves heuristics and fast software for solving problems that arise
in diverse disciplines.
Shape of data matters
End Introduction.
Topology / Topology
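A topology on a set $X$ is a collection $T \subseteq 2^X$ of subsets of $X$ such that:
a) $\emptyset, X \in T$
b) $U_\alpha \in T$ for all $\alpha$ $\Rightarrow \bigcup_{\alpha} U_\alpha \in T$ (arbitrary unions)
c) $U_1, \dots, U_n \in T \Rightarrow \bigcap_{k=1}^{n} U_k \in T$ (finite intersections)
A subset $U \in T$ is called an open set, and its complement $X \setminus U$ is called closed.
The pair $(X, T)$ is called a topological space.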
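For a finite set, the axioms above can be checked directly. The following is a minimal sketch, with a hypothetical function name `is_topology`, that tests whether a collection of subsets of a finite set is a topology. It relies on the fact that, for a finite collection, closure under pairwise unions and intersections is enough.

```python
from itertools import combinations

def is_topology(X, T):
    """Check the topology axioms for a collection T of subsets of a finite set X."""
    X = frozenset(X)
    T = {frozenset(U) for U in T}
    # Axiom a): the empty set and X itself must be open.
    if frozenset() not in T or X not in T:
        return False
    # Every member of T must be a subset of X.
    if any(not U <= X for U in T):
        return False
    # Axioms b) and c): on a finite collection, closure under pairwise
    # unions and intersections implies closure under all finite ones.
    return all((U | V) in T and (U & V) in T for U, V in combinations(T, 2))
```

For instance, on `X = {1, 2, 3}` the collection `[set(), {1}, {1, 2}, {1, 2, 3}]` passes, while `[set(), {1}, {2}, {1, 2, 3}]` fails because the union `{1, 2}` is missing.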
Topology / Topology
A function $f : (X, T_X) \to (Y, T_Y)$ is called a continuous map if, for every open set $U \in T_Y$, the preimage $f^{-1}(U)$ is open in $(X, T_X)$.
[Figure: an open set $U$ in $(Y, T_Y)$ and its preimage $f^{-1}(U)$ in $(X, T_X)$ under $f$]
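On finite spaces, the definition of continuity can be tested by brute force. The sketch below is illustrative only: the spaces `T_X` and `T_Y`, the maps `f` and `g`, and the function names are assumptions made for this example. A map is represented as a dictionary from points of $X$ to points of $Y$.

```python
def preimage(f, U):
    """Preimage of U under f, where f is a dict from points of X to points of Y."""
    return frozenset(x for x, y in f.items() if y in U)

def is_continuous(f, T_X, T_Y):
    """True if the preimage of every open set of T_Y is open in T_X."""
    open_X = {frozenset(U) for U in T_X}
    return all(preimage(f, U) in open_X for U in T_Y)

# Assumed example spaces: X = {1, 2, 3} and Y = {"a", "b"}.
T_X = [set(), {1}, {1, 2}, {1, 2, 3}]
T_Y = [set(), {"a"}, {"a", "b"}]
f = {1: "a", 2: "b", 3: "b"}  # preimage of {"a"} is {1}, which is open
g = {1: "b", 2: "b", 3: "a"}  # preimage of {"a"} is {3}, which is not open
```

Here `is_continuous(f, T_X, T_Y)` holds, while `g` fails the test because `{3}` is not an open set of `T_X`.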
Topology / Topology
In a topological space $(X, T_X)$, the closure $\overline{U}$ of a subset $U \subseteq X$ is the intersection of all closed sets containing $U$.
[Figure: a set $U$ and its closure $\overline{U}$ in $(X, T_X)$]
Spanish synonyms: adherencia, cierre.
Topology / Topology
The interior $\operatorname{int}(U)$ of a subset $U \subseteq X$ of $(X, T_X)$ is the union of all open sets contained in $U$.
[Figure: a set $U$ and its interior $\operatorname{int}(U)$ in $(X, T_X)$]
Topology / Topology
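The boundary $\partial U$ of a subset $U \subseteq X$ is $\partial U = \overline{U} \setminus \operatorname{int}(U)$, the closure of $U$ minus its interior.
[Figure: a set $U$, its closure $\overline{U}$, its interior $\operatorname{int}(U)$, and its boundary $\partial U$ in $(X, T_X)$]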
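Closure, interior, and boundary can be computed literally from their definitions when the space is finite. The following sketch assumes a small example topology on `X = {1, 2, 3}` (not taken from the slides), uses hypothetical function names, and takes the boundary as closure minus interior.

```python
def interior(U, T):
    """Union of all open sets (members of T) contained in U."""
    U = frozenset(U)
    result = frozenset()
    for V in T:
        if frozenset(V) <= U:
            result |= frozenset(V)
    return result

def closure(U, X, T):
    """Intersection of all closed sets (complements of open sets) containing U."""
    U, X = frozenset(U), frozenset(X)
    result = X
    for V in T:
        closed = X - frozenset(V)
        if U <= closed:
            result &= closed
    return result

def boundary(U, X, T):
    """Closure minus interior."""
    return closure(U, X, T) - interior(U, T)

# Assumed example space.
X = {1, 2, 3}
T = [set(), {1}, {1, 2}, {1, 2, 3}]
```

In this space the set `{2}` contains no nonempty open set, so its interior is empty, while its closure is `{2, 3}`; its boundary is therefore `{2, 3}`.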
Topology / Topology
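The set $A'$ of limit points of a set $A \subseteq X$ is
$$A' = \{\, x \in X : \forall\, U \text{ open in } (X, T_X),\ x \in U \Rightarrow (U \setminus \{x\}) \cap A \neq \emptyset \,\}$$
[Figure: a set $A$, a neighborhood $U_x$ of a point $x$, and the set of limit points $A'$ in $(X, T_X)$]
Spanish synonym: acumulación.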
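As a sketch of the limit point definition on a finite space, the function below (a hypothetical name, on an assumed example topology) returns every point $x$ such that each open set containing $x$ also contains a point of $A$ other than $x$.

```python
def limit_points(A, X, T):
    """Points x of X such that every open set containing x meets A outside x."""
    A = frozenset(A)
    return frozenset(
        x for x in X
        if all((frozenset(U) - {x}) & A for U in T if x in U)
    )

# Assumed example space.
X = {1, 2, 3}
T = [set(), {1}, {1, 2}, {1, 2, 3}]
```

In this space the limit points of `{1}` are `2` and `3`, because every open set containing either of them also contains `1`. The point `1` itself is not a limit point, since the open set `{1}` contains no other point of the set.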
Topology / Manifolds
Figure: Manifolds. (a) The only compact connected one-manifold is a circle $S^1$. (b) The sphere $S^2$ is a two-manifold. (c) The surface of a donut, a torus, is also a two-manifold (drawn here with boundary $\partial$). (d) Boy's surface is a geometric immersion of the projective plane $P^2$, a nonorientable two-manifold. (e) The Klein bottle is another nonorientable two-manifold.
A topological space may be viewed as an abstraction of a metric space. Similarly, manifolds generalize the connectivity of $d$-dimensional Euclidean spaces $\mathbb{R}^d$ by being locally similar but globally different.
A $d$-dimensional chart at $p \in X$ is a homeomorphism $\varphi : U \to \mathbb{R}^d$ onto an open subset of $\mathbb{R}^d$, where $U$ is a neighborhood of $p$ and open is defined using a metric.
A $d$-dimensional manifold ($d$-manifold) is a topological space $X$ with a $d$-dimensional chart at every point $x \in X$.
Topology / Simplicial complexes
To compute information about a topological space using a computer, we need
a finite representation of the space.
In this section, we represent a topological space as a union of simple pieces,
deriving a combinatorial description that is useful in practice. Intuitively, cell
complexes are composed of Euclidean pieces glued together along seams,
generalizing polyhedra.
Topology / Simplicial complexes
Geometrical
[Figure: (a) vertex, (b) edge, (c) triangle, (d) tetrahedron]
A set of $k+1$ points $\{u_0, \dots, u_k\}$ in $\mathbb{R}^d$ (with $k \ge 0$) is called affinely independent if the $k$ vectors $u_j - u_0$, with $1 \le j \le k$, are linearly independent.
A $k$-simplex $\sigma$ is defined as the convex hull of $k+1$ affinely independent points $\{u_0, \dots, u_k\}$.
$k$ is called the dimension of $\sigma$, and $u_0, \dots, u_k$ are the vertices of $\sigma$.
Simplices of dimension 0, 1, 2, and 3 are called vertices (a), edges (b), triangles (c), and tetrahedra (d), respectively.
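Affine independence can be tested by computing the rank of the difference vectors $u_j - u_0$. The sketch below is one possible way to do this in plain Python, using exact rational arithmetic. The function names are illustrative, not from the slides.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) by exact Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    n_cols = len(m[0]) if m else 0
    for c in range(n_cols):
        # Find a row at or below r with a nonzero entry in column c.
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def affinely_independent(points):
    """True if the vectors u_j - u_0 (j >= 1) are linearly independent."""
    u0 = points[0]
    diffs = [[a - b for a, b in zip(u, u0)] for u in points[1:]]
    return rank(diffs) == len(diffs)
```

Three non-collinear points in the plane pass the test and span a triangle, whereas three collinear points, or any four points in the plane, do not span a simplex.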
Topology / Simplicial complexes
Geometrical
The simplex is the set of points defined by all convex combinations
$$\sigma = \left\{ \sum_{i=0}^{k} \lambda_i u_i \;\middle|\; \lambda_i \ge 0 \text{ and } \sum_{i=0}^{k} \lambda_i = 1 \right\}$$
An $l$-simplex $\tau$ is called a face of a $k$-simplex $\sigma$ if the vertices of $\tau$ are a subset of the vertices of $\sigma$. Clearly, $l \le k$ holds in this case.
A $k$-simplex has exactly $2^{k+1} - 1$ faces. For instance, a tetrahedron has one face of dimension 3 (itself), four triangular faces, six edges, and four vertices, which adds up to $15 = 2^4 - 1$.
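The face count follows from the fact that each face corresponds to a nonempty subset of the $k+1$ vertices. A short sketch, with illustrative vertex labels and a hypothetical function name, enumerates the faces of a tetrahedron and tallies them by dimension.

```python
from itertools import combinations

def faces(vertices):
    """All faces of the simplex on the given vertices, as sorted tuples."""
    vs = sorted(vertices)
    return [f for size in range(1, len(vs) + 1) for f in combinations(vs, size)]

tetrahedron = ["u0", "u1", "u2", "u3"]
all_faces = faces(tetrahedron)

# Tally faces by dimension: a face with n vertices has dimension n - 1.
by_dimension = {}
for f in all_faces:
    by_dimension[len(f) - 1] = by_dimension.get(len(f) - 1, 0) + 1
```

For the tetrahedron this yields 15 faces in total: 4 vertices, 6 edges, 4 triangles, and the tetrahedron itself.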
Topology / Simplicial complexes
Geometrical
Simplicial complexes.
A geometric simplicial complex $K$ is a finite collection of simplices in $\mathbb{R}^d$ such that
for any simplex $\sigma$ in $K$, every face of $\sigma$ is also in $K$, and
for any two simplices $\sigma, \tau$ in $K$, the intersection $\sigma \cap \tau$ is empty or a face of both $\sigma$ and $\tau$ (and therefore part of $K$ as well).
The dimension of $K$ is the maximal dimension of the simplices that it contains.
Note that $K$ is a set of subsets of $\mathbb{R}^d$ and not a subset of $\mathbb{R}^d$.
The underlying space $|K|$ of $K$ is defined as
$$|K| = \bigcup_{\sigma \in K} \sigma$$
Topology / Simplicial complexes
Abstract
Given abstract vertices $v_1, \dots, v_n$, an abstract simplicial complex $A$ is defined as a collection of subsets of $\{v_1, \dots, v_n\}$ such that, for any set $M \in A$, every nonempty subset of $M$ is also in $A$.
The subsets of cardinality $k + 1$ are called abstract $k$-simplices of $A$.
We can draw an abstract complex in $\mathbb{R}^d$ by mapping each vertex to some point in $\mathbb{R}^d$ and mapping an abstract $k$-simplex to the simplex spanned by the $k + 1$ corresponding points. If this drawing is a geometric simplicial complex, we call it a geometric realization of $A$.
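Because an abstract simplicial complex is purely combinatorial, it maps naturally onto sets of vertex labels. The following minimal sketch uses hypothetical function names and represents simplices as frozensets. It checks the defining closure property, builds the smallest complex containing some given simplices, and extracts the abstract $k$-simplices.

```python
from itertools import combinations

def is_abstract_simplicial_complex(A):
    """True if every nonempty subset of every set in A is also in A."""
    A = {frozenset(M) for M in A}
    return all(
        frozenset(S) in A
        for M in A
        for size in range(1, len(M) + 1)
        for S in combinations(M, size)
    )

def downward_closure(maximal_simplices):
    """Smallest abstract simplicial complex containing the given simplices."""
    return {
        frozenset(S)
        for M in maximal_simplices
        for size in range(1, len(M) + 1)
        for S in combinations(sorted(M), size)
    }

def k_simplices(A, k):
    """Abstract k-simplices: the members of A with cardinality k + 1."""
    return {frozenset(M) for M in A if len(M) == k + 1}

# Assumed example: a filled triangle v1 v2 v3 with an extra edge v3 v4.
A = downward_closure([{"v1", "v2", "v3"}, {"v3", "v4"}])
```

The example complex has 9 simplices: 4 vertices, 4 edges, and 1 triangle. A lone edge such as `[{"v1", "v2"}]` is not a complex on its own, because its two vertices are missing from the collection.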