Shape and Meaning
An Introduction to Topological Data Analysis
Anthony Bak
Goals
For this talk I want to:
Show you how TDA provides a framework for many machine learning/data
analysis techniques
Demonstrate how Ayasdi provides insights into the data.
Caveat: I am only talking about the strain of TDA done by Ayasdi.
The Data Problem
How do we extract meaning from Complex Data?
Data is complex because it’s "Big Data"
Or has very rich features (e.g., genetic data with >500,000 features and complicated
interdependencies)
Or both!
The problem in both cases is that there isn’t a single story happening in your data.
TDA will be the tool that summarizes away the irrelevant stories to get at something
interesting.
Data Has Shape
And Shape Has Meaning
⇒ In this talk I will focus on how we extract meaning.
But first... What is shape?
Shape is the global realization of local constraints.
For a given problem, shape is determined by a choice of:
Features: the columns or properties to measure.
A metric on the columns.
But not necessarily so: there are more relaxed definitions of shape, and we
can use those too.
The goal of TDA is to understand (for us, summarize) the shape with no
preconceived model of what it should be.
Math World
To show you how we extract insight from shape we start in "Math World"
We’ll draw the data as a smooth manifold.
Functions that appear are smooth or continuous.
⇒ We will not need either of these assumptions once we’re in "Data World".
⇒ Even more importantly, data in the real world is never like this.
Math World
[Figure sequence: the Data space with lenses f and g; points p, q and r in the range; the preimage f⁻¹(p); and an arrow (⇒) to the resulting summary.]
Exercise:
What is the summary if we use both lenses, g and f at the same time?
[Figure: the combined lens (g, f), with a point p in its range.]
⇒ We recover the original space.
What did the exercise tell us?
With a rich enough set of functions (lenses) we can recover the original space
Of course, this leaves us no better off than where we started.
⇒ Instead we select a set of functions to tune in to the signal we want.
This is what Ayasdi does:
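[Figure: the Data space with lenses f and g, points p and q, the preimage f⁻¹(p), and an arrow (⇒) to the resulting summary.]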
Modulo some details....
Why is this useful?
⇒ We get "easy" understanding of the localizations of quantities of interest.
Lenses inform us where in the space to look for phenomena.
For easy localizations many different lenses will be informative.
For hard (= geometrically distributed) localizations we have to be more
careful. But even then, we frequently gain incremental knowledge from a
poorly chosen lens.
Modulo Details....
We want to move from this mathematical model to a data-driven setup.
Step 1
Replace points in the range with an open covering of the range.
Connect nodes when their corresponding sets intersect.
[Figure: the lens f maps into the range, which is covered by overlapping open intervals U1, U2, U3.]
⇒ The output is now a graph.
New Parameters
We’ve introduced new parameters into the construction:
The resolution is the number of open sets in the range.
The gain is the amount of overlap of these intervals.
Roughly speaking, the resolution controls the number of nodes in the output and
the ’size’ of feature you can pick out, while the gain controls the number of edges
and the ’tightness’ of the graph.
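The two parameters can be made concrete with a small sketch. The function name `interval_cover` and the convention that `gain` is the fraction of each interval shared with its neighbor are assumptions made for illustration; the talk only says the gain is "the amount of overlap", and Ayasdi's own parameterization may differ.

```python
def interval_cover(lo, hi, resolution, gain):
    """Cover [lo, hi] with `resolution` equal-length open intervals.

    Assumption: `gain` (0 <= gain < 1) is the fraction of each interval
    that overlaps the next one.
    """
    if resolution < 1 or not 0 <= gain < 1:
        raise ValueError("need resolution >= 1 and 0 <= gain < 1")
    # n intervals of length L, each overlapping the next by gain * L,
    # span a total length of L * (n - (n - 1) * gain).
    length = (hi - lo) / (resolution - (resolution - 1) * gain)
    step = length * (1 - gain)
    return [(lo + i * step, lo + i * step + length) for i in range(resolution)]


cover = interval_cover(0.0, 10.0, resolution=4, gain=0.5)
print(cover)
```

Raising `resolution` gives more, shorter intervals (more nodes, finer features); raising `gain` makes neighboring intervals share more of the range (more shared points, hence more edges).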
Resolution: A closer look
[Figure: the lens f with a cover of four intervals U1, U2, U3, U4, shown alongside a cover of two intervals.]
Step 2: Clustering as π0
We need to make a final adjustment to the algorithm to bring it into data world.
We replace "connected components of the inverse image" with "clusters in
the inverse image".
We connect clusters (nodes) with an edge if they share points in common.
Step 2: Clustering as π0
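[Figure: the lens f with cover sets U1 and U2; the points in each inverse image are grouped into clusters.]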
Nodes are clusters of data points
Edges represent shared points between the clusters
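Steps 1 and 2 together can be sketched in a few lines. This is a minimal illustration, not Ayasdi's implementation: the names `cluster` and `mapper_graph`, the single-linkage clustering with a distance threshold `eps`, and the toy circle data are all assumptions chosen to keep the example self-contained.

```python
import math
from itertools import combinations


def cluster(points, idx, eps):
    """Single-linkage clusters: connected components of the graph that
    joins two points of `idx` when they are closer than `eps`."""
    remaining = set(idx)
    components = []
    while remaining:
        seed = remaining.pop()
        component, stack = {seed}, [seed]
        while stack:
            i = stack.pop()
            near = {j for j in remaining
                    if math.dist(points[i], points[j]) < eps}
            remaining -= near
            component |= near
            stack.extend(near)
        components.append(frozenset(component))
    return components


def mapper_graph(points, lens, cover, eps):
    """Nodes are clusters inside each inverse image; edges join clusters
    that share at least one data point."""
    values = [lens(p) for p in points]
    nodes = []
    for lo, hi in cover:
        preimage = [i for i, v in enumerate(values) if lo < v < hi]
        nodes.extend(cluster(points, preimage, eps))
    edges = [(s, t) for s, t in combinations(range(len(nodes)), 2)
             if nodes[s] & nodes[t]]
    return nodes, edges


# Toy data: twelve points on the unit circle, lens = x-coordinate.
points = [(math.cos(2 * math.pi * k / 12), math.sin(2 * math.pi * k / 12))
          for k in range(12)]
cover = [(-1.1, -0.4), (-0.6, 0.6), (0.4, 1.1)]
nodes, edges = mapper_graph(points, lambda p: p[0], cover, eps=0.6)
print(len(nodes), "nodes,", len(edges), "edges")
```

On this toy input the middle interval splits into two clusters (top and bottom of the circle), so the output is four nodes joined in a cycle: the graph summarizes the loop in the data.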
That’s It
Ok not quite...
Lenses: Where do they come from
The technique rests on finding good lenses.
⇒ Luckily lots of people have worked on this problem
Lenses: Where do they come from
Standard data analysis functions
Geometry and Topology
Modern Statistics
Domain Knowledge / Data Modeling
A Non-Exhaustive Table of Lenses
Statistics     Geometry         Machine Learning              Data Driven
Mean/Max/Min   Centrality       PCA/SVD                       Age
Variance       Curvature        Autoencoders                  Dates
n-Moment       Harmonic Cycles  Isomap/MDS/t-SNE              ...
Density        ...              SVM Distance from Hyperplane
...                             Error/Debugging Info
                                ...
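A few of the lenses in the table are easy to write down. The sketch below is illustrative only: the function names are hypothetical, and the specific formulas for centrality (largest distance to any other point) and Gaussian density (average Gaussian kernel value, with an assumed bandwidth `sigma`) are common choices rather than definitions given in the talk.

```python
import math
from statistics import mean, pvariance


def row_mean(row):
    """Statistical lens: mean of a row's feature values."""
    return mean(row)


def row_variance(row):
    """Statistical lens: population variance of a row's feature values."""
    return pvariance(row)


def centrality(points, i):
    """Geometric lens (assumed definition): distance from point i to the
    farthest point. Small values mean central, large values mean outlying."""
    return max(math.dist(points[i], q) for q in points)


def gaussian_density(points, i, sigma=1.0):
    """Density lens (assumed definition): average Gaussian kernel value
    between point i and every point, with bandwidth `sigma`."""
    total = sum(math.exp(-math.dist(points[i], q) ** 2 / (2 * sigma ** 2))
                for q in points)
    return total / len(points)


points = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0)]
lens_values = [(centrality(points, i), gaussian_density(points, i))
               for i in range(len(points))]
```

Each function assigns one number to each data point, which is all a real-valued lens needs to do.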
Interpretability and Meaning
But what about insight? Meaning?
[Figure: a lens f on Complex Data, with an arrow (⇒) to the resulting summary.]
f is Gaussian density ⇒ The data is bimodal.
f is centrality ⇒ The data has two ways of being abnormal.
f is mean ⇒ Two groups of high-mean data.
f is error ⇒ Two types of error.
The units on the lens give interpretability/meaning to the topological summary.
Interpretability and Meaning
Another way to think about lenses is as a kind of ’geometric query’.
Examples
1. Heart disease study
Stratification by age without making arbitrary cutoffs.
2. Heavy machinery
Use mean and variance as a lens to find what operating regimes lead to failure of
mechanical components.
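The heavy machinery query uses two lenses at once, so the cover is made of rectangles rather than intervals. The following is a rough sketch under stated assumptions: the function `two_lens_bins`, the cover intervals, and the sensor readings are all made up for illustration.

```python
from itertools import product
from statistics import mean, pvariance


def two_lens_bins(rows, cover_mean, cover_var):
    """Assign each row to every rectangle (mean interval x variance
    interval) that contains its (mean, variance) lens value."""
    values = [(mean(r), pvariance(r)) for r in rows]
    bins = {}
    for (a, b), (c, d) in product(cover_mean, cover_var):
        idx = [i for i, (m, v) in enumerate(values)
               if a < m < b and c < v < d]
        if idx:
            bins[((a, b), (c, d))] = idx
    return bins


# Made-up windows of sensor readings, one row per window.
rows = [
    (1.0, 1.1, 0.9),   # low mean, low variance
    (1.0, 3.0, -1.0),  # low mean, high variance
    (5.0, 5.1, 4.9),   # high mean, low variance
]
cover_mean = [(-1.0, 3.0), (2.0, 6.0)]   # overlapping intervals for the mean
cover_var = [(-0.1, 1.0), (0.5, 4.0)]    # overlapping intervals for the variance
bins = two_lens_bins(rows, cover_mean, cover_var)
```

Each non-empty rectangle would then be clustered as in Step 2; the lens units (mean and variance of the readings) tell you which operating regime a node corresponds to.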
Some generalizations and extensions
Metrics
We don’t need a metric, just a notion of similarity - or perhaps a clustering
mechanism.
Lenses
Lenses don’t need to be continuous - just "sensible".
We can use multiple lenses at the same time.
Lenses can map to spaces other than R.
In fact, we can work with "open covers" of the space (here taken to mean
overlapping partitions). We don’t need a lens at all.
Data
Input space can be anything with a topology. Typically we work with
row/column numeric/categorical data but, for example, graphs are ok.
Output
The output of the algorithm isn’t just a graph but is an abstract simplicial
complex (swept under the rug in this presentation).
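To make the last point concrete: the graph records only pairwise overlaps, while the simplicial complex (the nerve of the clusters) also records when three or more clusters share a point. A minimal sketch, where the function name `nerve` and the `max_dim` cutoff are illustrative assumptions:

```python
from itertools import combinations


def nerve(nodes, max_dim=2):
    """Return every simplex (tuple of node indices) whose clusters have a
    common data point, up to dimension `max_dim`."""
    simplices = []
    for k in range(1, max_dim + 2):          # k clusters form a (k-1)-simplex
        for combo in combinations(range(len(nodes)), k):
            if set.intersection(*(set(nodes[i]) for i in combo)):
                simplices.append(combo)
    return simplices


# Three clusters share data point 2; the fourth is isolated.
example_nodes = [{1, 2}, {2, 3}, {2, 4}, {5}]
simplices = nerve(example_nodes)
```

Here the result contains four vertices, three edges, and the triangle `(0, 1, 2)`; keeping only the vertices and edges gives the graph used throughout the talk.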
Demo
Online Fraud
[Figures, one per slide: Fraud Score; Charge Back (Ground Truth); Time On Page; No Flash; No Javascript.]
Parkinson’s Detection with Mobile Phone

2019 Fall Series: Postdoc Seminars - Special Guest Lecture, There is a Kernel...
 

More from AnalyticsWeek

Understanding Customer Buying Journey with Big Data
Understanding Customer Buying Journey with Big DataUnderstanding Customer Buying Journey with Big Data
Understanding Customer Buying Journey with Big DataAnalyticsWeek
 
Data-As-A-Service to enable compliance reporting
Data-As-A-Service to enable compliance reportingData-As-A-Service to enable compliance reporting
Data-As-A-Service to enable compliance reportingAnalyticsWeek
 
Making sense of unstructured data by turning strings into things
Making sense of unstructured data by turning strings into thingsMaking sense of unstructured data by turning strings into things
Making sense of unstructured data by turning strings into thingsAnalyticsWeek
 
Reimagining the role of data in government
Reimagining the role of data in governmentReimagining the role of data in government
Reimagining the role of data in governmentAnalyticsWeek
 
The History and Use of R
The History and Use of RThe History and Use of R
The History and Use of RAnalyticsWeek
 
Advanced Analytics in Hadoop
Advanced Analytics in HadoopAdvanced Analytics in Hadoop
Advanced Analytics in HadoopAnalyticsWeek
 
Rethinking classical approaches to analysis and predictive modeling
Rethinking classical approaches to analysis and predictive modelingRethinking classical approaches to analysis and predictive modeling
Rethinking classical approaches to analysis and predictive modelingAnalyticsWeek
 
Big Data Introduction to D3
Big Data Introduction to D3Big Data Introduction to D3
Big Data Introduction to D3AnalyticsWeek
 

More from AnalyticsWeek (8)

Understanding Customer Buying Journey with Big Data
Understanding Customer Buying Journey with Big DataUnderstanding Customer Buying Journey with Big Data
Understanding Customer Buying Journey with Big Data
 
Data-As-A-Service to enable compliance reporting
Data-As-A-Service to enable compliance reportingData-As-A-Service to enable compliance reporting
Data-As-A-Service to enable compliance reporting
 
Making sense of unstructured data by turning strings into things
Making sense of unstructured data by turning strings into thingsMaking sense of unstructured data by turning strings into things
Making sense of unstructured data by turning strings into things
 
Reimagining the role of data in government
Reimagining the role of data in governmentReimagining the role of data in government
Reimagining the role of data in government
 
The History and Use of R
The History and Use of RThe History and Use of R
The History and Use of R
 
Advanced Analytics in Hadoop
Advanced Analytics in HadoopAdvanced Analytics in Hadoop
Advanced Analytics in Hadoop
 
Rethinking classical approaches to analysis and predictive modeling
Rethinking classical approaches to analysis and predictive modelingRethinking classical approaches to analysis and predictive modeling
Rethinking classical approaches to analysis and predictive modeling
 
Big Data Introduction to D3
Big Data Introduction to D3Big Data Introduction to D3
Big Data Introduction to D3
 

Recently uploaded

SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your BudgetHyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your BudgetEnjoy Anytime
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxnull - The Open Security Community
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2Hyundai Motor Group
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 

Recently uploaded (20)

SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
The transition to renewables in India.pdf
The transition to renewables in India.pdfThe transition to renewables in India.pdf
The transition to renewables in India.pdf
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your BudgetHyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 

Using Topological Data Analysis on your BigData

  • 1. Shape and Meaning An Introduction to Topological Data Analysis Anthony Bak
  • 5. Goals For this talk I want to: Show you how TDA provides a framework for many machine learning/data analysis techniques Demonstrate how Ayasdi provides insights into the data. Caveats: I am only talking about the strain of TDA done by Ayasdi
  • 11. The Data Problem How do we extract meaning from Complex Data? Data is complex because it’s "Big Data" Or has very rich features (e.g. genetic data with >500,000 features and complicated interdependencies) Or both! The problem in both cases is that there isn’t a single story happening in your data. TDA will be the tool that summarizes out the irrelevant stories to get at something interesting.
  • 13. Data Has Shape And Shape Has Meaning ⇒ In this talk I will focus on how we extract meaning.
  • 18. But first... What is shape? Shape is the global realization of local constraints. For a given problem, shape is determined by a choice of: (1) features, i.e. the columns or properties to measure; (2) a metric on the columns. But not necessarily so: there are more relaxed definitions of shape and we can use those too. The goal of TDA is to understand (for us, summarize) the shape with no preconceived model of what it should be.
  • 23. Math World To show you how we extract insight from shape we start in "Math World" We’ll draw the data as a smooth manifold. Functions that appear are smooth or continuous. ⇒ We will not need either of these assumptions once we’re in "Data World". ⇒ Even more importantly, data in the real world is never like this
  • 45. Exercise: What is the summary if we use both lenses, g and f, at the same time, i.e. the combined lens (g, f)? [Figure: the space mapped by (g, f), with a point p marked] ⇒ We recover the original space.
  • 49. What did the exercise tell us? With a rich enough set of functions (lenses) we can recover the original space. Of course this leaves us no better off than where we started. ⇒ Instead we select a set of functions to tune in to the signal we want.
  • 51. This is what Ayasdi does: [Figure: Data with lenses f and g; points p and q; the preimage f⁻¹(p)] Modulo some details...
  • 53. Why is this useful? ⇒ We get "easy" understanding of the localizations of quantities of interest.
  • 54-62. Why is this useful? [Figure sequence: lenses f and g]
  • 67. Why is this useful? Lenses inform us where in the space to look for phenomena. For easy localizations many different lenses will be informative. For hard (= geometrically distributed) localizations we have to be more careful. But even then, we frequently get incremental knowledge even from a poorly chosen lens.
  • 68. Modulo Details.... We want to move from this mathematical model to a data driven setup.
  • 77. Step 1 Replace points in the range with an open covering of the range. Connect nodes when their corresponding sets intersect. [Figure: lens f with the range covered by overlapping open sets U1, U2, U3] ⇒ The output is now a graph.
  • 81. New Parameters We’ve introduced new parameters into the construction: The resolution is the number of open sets in the range. The gain is the amount of overlap of these intervals. Roughly speaking, the resolution controls the number of nodes in the output and the ’size’ of feature you can pick out, while the gain controls the number of edges and the ’tightness’ of the graph.
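As a rough illustration of the two parameters, the sketch below builds a cover of a one-dimensional lens range from a resolution and an overlap. It is a minimal sketch, not Ayasdi's construction: the function name `interval_cover` is hypothetical, and expressing the gain as an `overlap` fraction shared by consecutive intervals is an assumption made here for simplicity.

```python
def interval_cover(lo, hi, resolution, overlap):
    """Cover [lo, hi] with `resolution` equal-length intervals.

    `overlap` is the fraction of its length that each interval shares
    with the next one (an assumed stand-in for the gain).
    """
    if resolution < 1 or not 0 <= overlap < 1:
        raise ValueError("need resolution >= 1 and 0 <= overlap < 1")
    # One full interval plus (resolution - 1) non-overlapping steps span the range.
    length = (hi - lo) / (1 + (resolution - 1) * (1 - overlap))
    step = length * (1 - overlap)
    return [(lo + i * step, lo + i * step + length) for i in range(resolution)]


# Four intervals over [0, 10], each sharing half its length with the next.
for interval in interval_cover(0.0, 10.0, resolution=4, overlap=0.5):
    print(interval)
```

Raising `resolution` yields more, shorter intervals (more nodes, finer features); raising `overlap` makes neighbouring intervals share more of the range, so more clusters share points and the graph gains edges.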
  • 85. Resolution: A closer look [Figure: lens f with the range covered by open sets U1, U2, U3, U4, compared with a second cover]
  • 88. Step 2: Clustering as π0 We need to make a final adjustment to the algorithm to bring it into data world. We replace "connected components of the inverse image" with "clusters in the inverse image". We connect clusters (nodes) with an edge if they share points in common.
  • 106. Step 2: Clustering as π0 [Figure: lens f with overlapping open sets U1 and U2 in the range] Nodes are clusters of data points. Edges represent shared points between the clusters.
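Steps 1 and 2 together can be sketched in a few lines. This is a minimal sketch under stated assumptions, not Ayasdi's implementation: the names `single_linkage` and `mapper`, the distance threshold `eps`, the hard-coded cover, and the circle data set are all hypothetical choices made for illustration. Clustering is done by connecting points closer than `eps` and taking connected components.

```python
from itertools import combinations
from math import cos, dist, pi, sin


def single_linkage(indices, points, eps):
    """Cluster `indices`: points within `eps` of each other end up together."""
    unvisited = set(indices)
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        cluster, frontier = {seed}, [seed]
        while frontier:
            i = frontier.pop()
            near = {j for j in unvisited if dist(points[i], points[j]) <= eps}
            unvisited -= near
            cluster |= near
            frontier.extend(near)
        clusters.append(frozenset(cluster))
    return clusters


def mapper(points, lens, cover, eps):
    """Nodes are clusters in each preimage; edges join clusters sharing points."""
    values = [lens(p) for p in points]
    nodes = []
    for lo, hi in cover:
        preimage = [i for i, v in enumerate(values) if lo <= v <= hi]
        nodes.extend(single_linkage(preimage, points, eps))
    edges = {(a, b) for a, b in combinations(range(len(nodes)), 2)
             if nodes[a] & nodes[b]}
    return nodes, edges


# 40 points on a unit circle, viewed through the lens "x-coordinate".
circle = [(cos(2 * pi * (k + 0.5) / 40), sin(2 * pi * (k + 0.5) / 40))
          for k in range(40)]
cover = [(-1.0, 0.0), (-0.5, 0.5), (0.0, 1.0)]  # three overlapping intervals
nodes, edges = mapper(circle, lambda p: p[0], cover, eps=0.3)
print(len(nodes), len(edges))  # 4 nodes joined in a loop by 4 edges
```

The middle interval pulls back to two separate arcs (top and bottom of the circle), so it contributes two nodes; the resulting graph is a loop, which is the shape of the original data.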
  • 108. That’s It Ok not quite...
  • 110. Lenses: Where do they come from The technique rests on finding good lenses. ⇒ Luckily lots of people have worked on this problem
  • 133. Lenses: Where do they come from A Non Exhaustive Table of Lenses. Statistics (standard data analysis functions): Mean/Max/Min, Variance, n-Moment, Density, ... Geometry (geometry and topology): Centrality, Curvature, Harmonic Cycles, ... Machine Learning (modern statistics): PCA/SVD, Autoencoders, Isomap/MDS/TSNE, SVM Distance from Hyperplane, Error/Debugging Info, ... Data Driven (domain knowledge / data modeling): Age, Dates, ...
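A few of the lenses in the table are simple enough to write down directly. The sketch below is illustrative only: the function names are hypothetical, the density lens assumes a Gaussian kernel with a user-chosen `bandwidth`, and the centrality lens assumes the "distance to the farthest point" definition, which is one common choice among several.

```python
from math import dist, exp
from statistics import fmean, pvariance


def row_mean(row):
    """Statistics lens: mean of a data point's feature values."""
    return fmean(row)


def row_variance(row):
    """Statistics lens: population variance of a data point's feature values."""
    return pvariance(row)


def gaussian_density(points, bandwidth):
    """Density lens: average Gaussian kernel weight from all points (assumed kernel)."""
    return [
        sum(exp(-dist(p, q) ** 2 / (2 * bandwidth ** 2)) for q in points) / len(points)
        for p in points
    ]


def farthest_point_centrality(points):
    """Geometry lens: distance to the farthest point (small value = central)."""
    return [max(dist(p, q) for q in points) for p in points]


line = [(0.0,), (1.0,), (2.0,), (3.0,), (4.0,)]
print(gaussian_density(line, bandwidth=1.0))
print(farthest_point_centrality(line))
```

Each function returns one number per data point, which is all a lens needs to be; on the five collinear points the middle point is both the densest and the most central.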
  • 134. Interpretability and Meaning But what about insight? Meaning?
  • 136. Interpretability and Meaning f ⇒ Complex Data. f is Gaussian density
  • 137. Interpretability and Meaning f ⇒ Complex Data. f is Gaussian density ⇒ The data is bimodal.
  • 139. Interpretability and Meaning f ⇒ Complex Data. f is centrality ⇒ The data has two ways of being abnormal.
  • 141. Interpretability and Meaning f ⇒ Complex Data. f is mean ⇒ Two groups of high-mean data.
  • 143. Interpretability and Meaning f ⇒ Complex Data. f is error ⇒ Two types of error.
  • 144. Interpretability and Meaning f ⇒ Complex Data. f is error ⇒ Two types of error. The units on the lens give interpretability/meaning to the topological summary.
  • 145. Interpretability and Meaning Another way to think about lenses is as a kind of ’geometric query’. Examples
  • 146. Interpretability and Meaning Another way to think about lenses is as a kind of ’geometric query’. Examples 1. Heart disease study
  • 147. Interpretability and Meaning Another way to think about lenses is as a kind of ’geometric query’. Examples 1. Heart disease study Stratification by age without making arbitrary cutoffs.
  • 148. Interpretability and Meaning Another way to think about lenses is as a kind of ’geometric query’. Examples 1. Heart disease study Stratification by age without making arbitrary cutoffs. 2. Heavy machinery
  • 149. Interpretability and Meaning Another way to think about lenses is as a kind of ’geometric query’. Examples 1. Heart disease study Stratification by age without making arbitrary cutoffs. 2. Heavy machinery Use mean and variance as a lens to find what operating regimes lead to failure of mechanical components.
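One way to read "stratification by age without making arbitrary cutoffs" is that the age lens is covered by overlapping intervals, so a patient near a boundary lands in both neighboring strata instead of being forced into one. The following is a minimal sketch under that assumption. The function names, the ages, the number of intervals, and the 50% overlap are all made up for illustration.

```python
def overlapping_intervals(lo, hi, n_intervals, overlap):
    """Cover [lo, hi] with n equal-length intervals.

    `overlap` is the fraction of each interval shared with its neighbor
    (0 <= overlap < 1).
    """
    length = (hi - lo) / (1 + (n_intervals - 1) * (1 - overlap))
    step = length * (1 - overlap)
    intervals = [(lo + i * step, lo + i * step + length) for i in range(n_intervals)]
    # Pin the last endpoint to hi so rounding cannot drop the maximum value.
    intervals[-1] = (intervals[-1][0], hi)
    return intervals


def stratify(values, intervals):
    """For each interval, list the indices of the values that fall inside it."""
    return [[i for i, v in enumerate(values) if a <= v <= b] for a, b in intervals]


# Hypothetical patient ages used as the lens values.
ages = [30, 42, 44, 46, 58, 70]
intervals = overlapping_intervals(min(ages), max(ages), n_intervals=4, overlap=0.5)
strata = stratify(ages, intervals)

for (a, b), members in zip(intervals, strata):
    print(f"ages {a:.0f}-{b:.0f}: patients {members}")
```

Because the intervals overlap, the patient aged 44 appears in two strata. Shared members like this are what later connect neighboring groups in the topological summary.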
  • 151. Some generalizations and extensions Metrics
  • 152. Some generalizations and extensions Metrics We don’t need a metric, just a notion of similarity - or perhaps a clustering mechanism.
  • 153. Some generalizations and extensions Metrics We don’t need a metric, just a notion of similarity - or perhaps a clustering mechanism. Lenses
  • 154. Some generalizations and extensions Metrics We don’t need a metric, just a notion of similarity - or perhaps a clustering mechanism. Lenses Lenses don’t need to be continuous - just "sensible".
  • 155. Some generalizations and extensions Metrics We don’t need a metric, just a notion of similarity - or perhaps a clustering mechanism. Lenses Lenses don’t need to be continuous - just "sensible". We can use multiple lenses at the same time.
  • 156. Some generalizations and extensions Metrics We don’t need a metric, just a notion of similarity - or perhaps a clustering mechanism. Lenses Lenses don’t need to be continuous - just "sensible". We can use multiple lenses at the same time. Lenses can map to spaces other than R.
  • 157. Some generalizations and extensions Metrics We don’t need a metric, just a notion of similarity - or perhaps a clustering mechanism. Lenses Lenses don’t need to be continuous - just "sensible". We can use multiple lenses at the same time. Lenses can map to spaces other than R. In fact, we can work with "open covers" of the space (here taken to mean overlapping partitions). We don’t need a lens at all.
  • 158. Some generalizations and extensions Metrics We don’t need a metric, just a notion of similarity - or perhaps a clustering mechanism. Lenses Lenses don’t need to be continuous - just "sensible". We can use multiple lenses at the same time. Lenses can map to spaces other than R. In fact, we can work with "open covers" of the space (here taken to mean overlapping partitions). We don’t need a lens at all. Data
  • 159. Some generalizations and extensions Metrics We don’t need a metric, just a notion of similarity - or perhaps a clustering mechanism. Lenses Lenses don’t need to be continuous - just "sensible". We can use multiple lenses at the same time. Lenses can map to spaces other than R. In fact, we can work with "open covers" of the space (here taken to mean overlapping partitions). We don’t need a lens at all. Data Input space can be anything with a topology. Typically we work with row/column numeric/categorical data but, for example, graphs are ok.
  • 160. Some generalizations and extensions Metrics We don’t need a metric, just a notion of similarity - or perhaps a clustering mechanism. Lenses Lenses don’t need to be continuous - just "sensible". We can use multiple lenses at the same time. Lenses can map to spaces other than R. In fact, we can work with "open covers" of the space (here taken to mean overlapping partitions). We don’t need a lens at all. Data Input space can be anything with a topology. Typically we work with row/column numeric/categorical data but, for example, graphs are ok. Output
  • 161. Some generalizations and extensions Metrics We don’t need a metric, just a notion of similarity - or perhaps a clustering mechanism. Lenses Lenses don’t need to be continuous - just "sensible". We can use multiple lenses at the same time. Lenses can map to spaces other than R. In fact, we can work with "open covers" of the space (here taken to mean overlapping partitions). We don’t need a lens at all. Data Input space can be anything with a topology. Typically we work with row/column numeric/categorical data but, for example, graphs are ok. Output The output of the algorithm isn’t just a graph but is an abstract simplicial complex (swept under the rug in this presentation).
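To make the last point concrete, here is a small sketch of a Mapper-style construction that returns an abstract simplicial complex rather than only a graph. It is an illustration under stated assumptions, not the algorithm as implemented by Ayasdi: the cover is a hand-picked list of overlapping intervals on the lens, clustering is single linkage at a fixed threshold `eps`, and the names `single_linkage` and `mapper` are hypothetical. The `dissimilarity` argument can be any similarity-like function and does not have to be a true metric.

```python
import math
from itertools import combinations


def single_linkage(indices, points, dissimilarity, eps):
    """Connected components of the graph joining points within eps of each other."""
    clusters, unseen = [], set(indices)
    while unseen:
        stack = [unseen.pop()]
        component = set(stack)
        while stack:
            i = stack.pop()
            near = {j for j in unseen if dissimilarity(points[i], points[j]) <= eps}
            unseen -= near
            component |= near
            stack.extend(near)
        clusters.append(frozenset(component))
    return clusters


def mapper(points, lens, intervals, dissimilarity, eps, max_dim=2):
    """Return (nodes, simplices): clusters and the sets of clusters sharing a point."""
    nodes = []
    for a, b in intervals:
        preimage = [i for i, p in enumerate(points) if a <= lens(p) <= b]
        nodes.extend(single_linkage(preimage, points, dissimilarity, eps))
    simplices = []
    for k in range(2, max_dim + 2):  # k = 2 gives edges, k = 3 gives triangles
        for combo in combinations(range(len(nodes)), k):
            if frozenset.intersection(*(nodes[c] for c in combo)):
                simplices.append(combo)
    return nodes, simplices


# Eight points on the unit circle; the lens is the x-coordinate.
points = [(math.cos(k * math.pi / 4), math.sin(k * math.pi / 4)) for k in range(8)]
intervals = [(-1.0, -0.5), (-0.8, 0.8), (0.5, 1.0)]

nodes, simplices = mapper(points, lambda p: p[0], intervals, math.dist, eps=0.8)
edges = [s for s in simplices if len(s) == 2]
triangles = [s for s in simplices if len(s) == 3]
```

For this circle the middle interval splits into a top arc and a bottom arc, so the result is four clusters joined in a loop with no triangles. The summary recovers the shape of the data, and the triangle list is where the higher-dimensional simplices would appear for richer covers.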
  • 162. Demo
  • 164. Online Fraud Charge Back (Ground Truth)