Multiscale Mapper Networks
By Colleen M. Farrelly
Problem
• Data contains many underlying structures and relationships.
• Current methods (such as k-means clustering):
  ◦ Don’t capture all of these structures
  ◦ Struggle with certain data properties (dimensionality)
  ◦ Provide little information about connectedness between clusters/individuals
  ◦ Are unstable
Recent Solutions
• Nonlinear distance metrics
  ◦ Random forest-based
  ◦ Manifold learning-based
• Hierarchical clustering
  ◦ Nested clustering approach
• Multiscale K-Nearest Neighbors
  ◦ Adjust number of neighbors to slice data
• Still don’t provide a comprehensive view of data structure
Topology Overview
• Branch of mainly pure mathematics
• Study of changes in function behavior on different shapes (called manifolds)
• Can examine locally-variant and globally-invariant properties
• Classify similarities/differences between shapes based on these characteristics
Algebra can be used to build more complex structures from basic building blocks
Topology and Data
• Data clouds can be turned into combinations of discrete shapes (simplices)
• Identify key topological features across different slices of the data (circles, holes…)
  ◦ Classified by Betti numbers (the count of features in each dimension)
• Find connected components of similar topological structure
doi.ieeecomputersociety.org
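As a rough illustration of tracking a topological feature across scales, the sketch below counts connected components (Betti-0, the simplest Betti number) of a neighborhood graph built at several radii. The function name `betti0`, the toy point cloud, and the radii are illustrative assumptions, not taken from the slides.

```python
from itertools import combinations
from math import dist

def betti0(points, radius):
    """Count connected components of the graph linking points within `radius`."""
    parent = list(range(len(points)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(points)), 2):
        if dist(points[i], points[j]) <= radius:
            parent[find(i)] = find(j)
    return len({find(i) for i in range(len(points))})

# Toy point cloud (made up for illustration): two well-separated pairs.
cloud = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
counts = {r: betti0(cloud, r) for r in (0.5, 1.5, 9.5)}
print(counts)  # {0.5: 4, 1.5: 2, 9.5: 1}
```

The component count drops from four isolated points to two groups to one as the radius grows, which is the kind of scale-dependent summary the slide describes. Higher Betti numbers (loops, voids) need a full simplicial complex and are not covered by this sketch.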
Mapper Algorithm
• Topological clustering
  ◦ Define distance metric
    ▪ Linear or nonlinear
  ◦ Define filtration (filter) function
    ▪ Linear, density-based…
  ◦ Slice multidimensional dataset with Morse function
    ▪ Type of function associated with gradient flow and critical point identification on smooth manifolds
  ◦ Examine function behavior across slice (level set)
  ◦ Cluster function behavior
  ◦ Graph cluster connections
    ▪ Type of extended Reeb graph
[Figure: Mapper graph with callouts "Response gradations" and "Outliers"]
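The steps above can be sketched in a few lines. This is a minimal, assumption-laden version rather than the author's implementation: the lens is simply the x-coordinate, the cover is a set of equal-width overlapping intervals, and clustering is a naive single-linkage with threshold `eps`. The names `mapper` and `single_linkage` and all parameter values are hypothetical.

```python
import math
from itertools import combinations

def single_linkage(indices, points, eps):
    """Group indices so that points within `eps` of each other share a cluster."""
    clusters = []
    for i in indices:
        linked = [c for c in clusters
                  if any(math.dist(points[i], points[j]) <= eps for j in c)]
        merged = {i}.union(*linked)
        clusters = [c for c in clusters if c not in linked] + [merged]
    return clusters

def mapper(points, lens, n_intervals, overlap, eps):
    """Return (nodes, edges): nodes are sets of point indices, edges join nodes sharing a point."""
    values = [lens(p) for p in points]
    lo, hi = min(values), max(values)
    length = (hi - lo) / n_intervals
    pad = overlap * length  # each interval is widened by this much on both sides
    nodes = []
    for k in range(n_intervals):
        a, b = lo + k * length - pad, lo + (k + 1) * length + pad
        members = [i for i, v in enumerate(values) if a <= v <= b]  # one slice
        nodes.extend(single_linkage(members, points, eps))          # cluster the slice
    edges = [(u, v) for u, v in combinations(range(len(nodes)), 2)
             if nodes[u] & nodes[v]]
    return nodes, edges

# Twelve evenly spaced points on a unit circle (toy data, not from the slides).
points = [(math.cos(math.radians(30 * k)), math.sin(math.radians(30 * k)))
          for k in range(12)]
nodes, edges = mapper(points, lens=lambda p: p[0], n_intervals=3, overlap=0.3, eps=0.6)
print(edges)  # [(0, 1), (0, 2), (1, 3), (2, 3)]
```

The four nodes form a single cycle, so the graph recovers the loop in the circle. In practice the choice of lens, cover, and clustering method all change the output, which motivates looking at several resolutions.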
Multiscale Extension of Mapper
• Instability of single-scale mapper algorithm
  ◦ Clusters may change with scale
  ◦ Connections may change with scale
• Filtrations at multiple resolution settings
• Connections change as lens zooms in or out
  ◦ Contains information about underlying data structure and relationships
  ◦ Hierarchy of Reeb graphs
  ◦ Topological summary
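One way to picture the hierarchy across resolutions is to link each fine-resolution cluster to the coarse-resolution clusters it shares individuals with. The sketch below assumes Mapper has already been run twice; the node names, member IDs, and the helper `cross_scale_links` are all hypothetical.

```python
# Hypothetical Mapper output at two resolutions: node name -> set of individual IDs.
coarse = {"A": {1, 2, 3, 4, 5}, "B": {5, 6, 7, 8}}
fine = {"a1": {1, 2}, "a2": {2, 3, 4}, "b1": {5, 6}, "b2": {7, 8}}

def cross_scale_links(coarse, fine):
    """Link each fine node to every coarse node it shares individuals with."""
    return {
        name: sorted(c for c, members in coarse.items() if members & ids)
        for name, ids in fine.items()
    }

links = cross_scale_links(coarse, fine)
# Fine nodes attached to more than one coarse node straddle a coarse boundary.
straddlers = [name for name, parents in links.items() if len(parents) > 1]
print(links)       # {'a1': ['A'], 'a2': ['A'], 'b1': ['A', 'B'], 'b2': ['B']}
print(straddlers)  # ['b1']
```

Nodes that keep a single parent as the lens zooms in suggest stable structure, while straddling nodes mark where connections change with scale.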
Graph Theory Extensions of Mapper
• Cluster relationships from Mapper give an adjacency matrix and distance metric
  ◦ Clusters as vertices
  ◦ Nested hierarchy as edges
  ◦ Connected/unconnected components
  ◦ Centrality of certain points
  ◦ Bridges linking disparate clusters
  ◦ Path lengths between clusters
• Can apply network analytics to assess cluster relationships and individual connections across clusters
This is a weighted, undirected graph!
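A small sketch of how cluster memberships could be turned into a weighted, undirected adjacency matrix and queried for path lengths. Weighting an edge by the number of shared individuals is an assumption made here for illustration, and the cluster names and members are made up.

```python
from collections import deque
from itertools import combinations

# Hypothetical clusters from one Mapper run: name -> member IDs.
clusters = {"c0": {1, 2, 3}, "c1": {3, 4}, "c2": {4, 5, 6}, "c3": {9, 10}}
names = sorted(clusters)

# Weighted, undirected adjacency matrix: weight = number of shared individuals.
adjacency = [[0] * len(names) for _ in names]
for (i, u), (j, v) in combinations(enumerate(names), 2):
    adjacency[i][j] = adjacency[j][i] = len(clusters[u] & clusters[v])

def hops_from(start):
    """Breadth-first search: number of edges from `start` to each reachable cluster."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        i = queue.popleft()
        for j, weight in enumerate(adjacency[i]):
            if weight and j not in seen:
                seen[j] = seen[i] + 1
                queue.append(j)
    return {names[j]: d for j, d in seen.items()}

paths = hops_from(0)
print(adjacency)  # [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 0]]
print(paths)      # {'c0': 0, 'c1': 1, 'c2': 2}
```

Cluster `c3` never appears in `paths`, so it sits in a separate connected component from the other three.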
Network Extensions of Multiscale Mapper
• Graph theory algorithms applied to Mapper results dig deeper into:
  ◦ Data topology/structure
  ◦ Nature of individuals’ similarities across multivariate distribution
• Examine across different lenses
  ◦ Hierarchy of networks connected through individuals common to multiple networks
  ◦ Analyze across slices to gain deeper insight into network and underlying data structures
Information from Each Network
• Hubs
  ◦ Direct connection to many other clusters
• Betweenness
  ◦ Non-extremity measure
• Diversity
  ◦ Information contained
• Bridges
  ◦ Connection between less-related components
• Graph Laplacian
  ◦ Eigenvectors with connection/bridge weights
• Centrality
  ◦ Weight direct connections and bridges for importance to network
• Vertices
  ◦ Clusters at a particular resolution
• Edges
  ◦ Connections between clusters
  ◦ Individuals common between clusters
• Levels
  ◦ Level sets (height slices) containing one or more vertices
  ◦ Individuals bridging levels
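Three of the quantities listed above (hubs, bridges, and the graph Laplacian) can be computed directly from a weighted edge list. The sketch below uses a made-up four-cluster graph; the edge weights and helper names are assumptions, and betweenness, diversity, and eigenvector computations are left out to keep it short.

```python
# Hypothetical weighted, undirected cluster graph: (u, v) -> weight.
weights = {("a", "b"): 2, ("b", "c"): 1, ("a", "c"): 3, ("c", "d"): 1}
nodes = sorted({n for edge in weights for n in edge})

def neighbours(edges):
    """Build a node -> set-of-neighbours map from an iterable of (u, v) pairs."""
    adj = {n: set() for n in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj

def reachable(adj, start):
    """Depth-first search: all nodes reachable from `start`."""
    seen, stack = {start}, [start]
    while stack:
        for nxt in adj[stack.pop()] - seen:
            seen.add(nxt)
            stack.append(nxt)
    return seen

adj = neighbours(weights)
degree = {n: len(adj[n]) for n in nodes}  # hubs connect directly to many clusters
hub = max(degree, key=degree.get)

# A bridge is an edge whose removal disconnects its two endpoints.
bridges = [
    (u, v) for (u, v) in weights
    if v not in reachable(neighbours(e for e in weights if e != (u, v)), u)
]

# Graph Laplacian L = D - W, where D holds weighted degrees and W the weights.
index = {n: i for i, n in enumerate(nodes)}
laplacian = [[0] * len(nodes) for _ in nodes]
for (u, v), w in weights.items():
    i, j = index[u], index[v]
    laplacian[i][j] -= w
    laplacian[j][i] -= w
    laplacian[i][i] += w
    laplacian[j][j] += w

print(hub)        # c
print(bridges)    # [('c', 'd')]
print(laplacian)  # [[5, -2, -3, 0], [-2, 3, -1, 0], [-3, -1, 5, -1], [0, 0, -1, 1]]
```

Every row of the Laplacian sums to zero, so the constant vector is always an eigenvector with eigenvalue 0. The remaining eigenvectors would need a linear algebra library to compute.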
Combined Insight of Extensions
• Multiple resolutions
  ◦ Cluster hierarchy
    ▪ Evolving cluster structure
    ▪ More complete picture of individual classifications
  ◦ Network hierarchy
    ▪ Evolving network structure
    ▪ More complete picture of cluster relationships and structure
    ▪ More complete picture of individual connections
Example Demonstration
• Demo dataset of 7th grade SAT scores
• Group-level data mining of results
[Figure labels: Transition; Transition; Emergence of subgroups; Split into two distinct groups]
Individual Mining Results
• Map back to individuals
  ◦ Bridging individuals
    ▪ Transition between clusters
    ▪ Determination of multivariate cut-off scores
  ◦ Isolated individuals
    ▪ Outliers and outlier groups
    ▪ Unique response or predictor subsets
  ◦ Consistently clustered individuals
    ▪ Cohesive subgroups in data
    ▪ Underlying similarity of predictors or response
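Mapping back to individuals can be as simple as counting how many clusters each person appears in. In this sketch the cluster names and member IDs are invented, and "isolated" is taken to mean the sole member of a cluster who appears nowhere else, which is one possible reading of the slide.

```python
from collections import Counter

# Hypothetical Mapper nodes: cluster name -> individual IDs.
clusters = {"low": {1, 2, 3}, "mid": {3, 4, 5}, "high": {5, 6}, "odd": {7}}

# Number of clusters each individual belongs to.
membership = Counter(i for ids in clusters.values() for i in ids)

# Bridging individuals sit in more than one cluster (transitions between clusters).
bridging = sorted(i for i, n in membership.items() if n > 1)

# Isolated individuals are the sole member of a cluster and belong to no other.
isolated = sorted(
    i for ids in clusters.values() if len(ids) == 1
    for i in ids if membership[i] == 1
)

print(bridging)  # [3, 5]
print(isolated)  # [7]
```

Repeating the same count over the clusters from several resolutions would show which individuals are consistently clustered together.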
Conclusion
• New method ameliorates some of the issues with clustering methods
  ◦ Robust
  ◦ Works in high dimensions
  ◦ Captures connectedness
  ◦ Stable
  ◦ Provides hierarchy
  ◦ Quantifies relationships

Multiscale Mapper Networks for Topological Data Mining

  • 1. Multiscale Mapper Networks By Colleen M. Farrelly
  • 2. Problem: Data contains many underlying structures and relationships. Current methods (such as k-means clustering) don’t capture all of these structures; struggle with certain data properties (dimensionality); provide little information about connectedness between clusters/individuals; and are unstable.
  • 3. Recent Solutions: Nonlinear distance metrics (random forest-based, manifold learning-based); hierarchical clustering (nested clustering approach); multiscale k-nearest neighbors (adjust number of neighbors to slice data). These still don’t provide a comprehensive view of data structure.
  • 4. Topology Overview: Branch of mainly pure mathematics. Study of changes in function behavior on different shapes (called manifolds). Can examine locally-variant and globally-invariant properties. Classifies similarities/differences between shapes based on these characteristics. Algebra can be used to build more complex structures from basic building blocks.
  • 5. Topology and Data: Data clouds can be turned into combinations of discrete shapes (simplices). Identify key topological features across different slices of the data (circles, holes…), classified by Betti numbers (dimension plus feature type). Find connected components of similar topological structure. (doi.ieeecomputersociety.org)
  • 6. Mapper Algorithm: Topological clustering. Define distance metric (linear or nonlinear). Define filtration function (linear, density-based…). Slice multidimensional dataset with Morse function (a type of function associated with gradient flow and critical point identification on smooth manifolds). Examine function behavior across slice (level set). Cluster function behavior. Graph cluster connections (a type of extended Reeb graph). Figure labels: Response gradations; Outliers.
  • 7. Multiscale Extension of Mapper: Instability of single-scale Mapper algorithm (clusters may change with scale; connections may change with scale). Filtrations at multiple resolution settings. Connections change as lens zooms in or out: contains information about underlying data structure and relationships; hierarchy of Reeb graphs; topological summary.
  • 8. Graph Theory Extensions of Mapper: Cluster relationships from Mapper give an adjacency matrix and distance metric: clusters as vertices; nested hierarchy as edges; connected/unconnected components; centrality of certain points; bridges linking disparate clusters; path lengths between clusters. Can apply network analytics to assess cluster relationships and individual connections across clusters. This is a weighted, undirected graph!
  • 9. Network Extensions of Multiscale Mapper: Graph theory algorithms applied to Mapper results dig deeper into data topology/structure and the nature of individuals’ similarities across the multivariate distribution. Examine across different lenses: hierarchy of networks connected through individuals common to multiple networks; analyze across slices to gain deeper insight into network and underlying data structures.
  • 10. Information from Each Network: Hubs (direct connection to many other clusters); Betweenness (non-extremity measure); Diversity (information contained); Bridges (connection between less-related components); Graph Laplacian (eigenvectors with connection/bridge weights); Centrality (weight direct connections and bridges for importance to network); Vertices (clusters at a particular resolution); Edges (connections between clusters; individuals common between clusters); Levels (level sets, i.e. height slices, containing one or more vertices; individuals bridging levels).
  • 11. Combined Insight of Extensions: Multiple resolutions give a cluster hierarchy (evolving cluster structure; more complete picture of individual classifications) and a network hierarchy (evolving network structure; more complete picture of cluster relationships and structure; more complete picture of individual connections).
  • 12. Example Demonstration: Demo dataset of 7th grade SAT scores. Group-level data mining of results. Figure labels: Transition; Transition; Emergence of subgroups; Split into two distinct groups.
  • 13. Individual Mining Results: Map back to individuals. Bridging individuals: transition between clusters; multivariate cut-off scores determination. Isolated individuals: outliers and outlier groups; unique response or predictors subsets. Consistently clustered individuals: cohesive subgroups in data; underlying similarity of predictors or response.
  • 14. Conclusion: New method ameliorates some of the issues with clustering methods: robust; works in high dimensions; captures connectedness; stable; provides hierarchy; quantifies relationships.
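
The pipeline described on slides 6 to 10 can be illustrated with a small, self-contained sketch. This is not the author's implementation: the function names (`single_linkage`, `mapper_graph`, `count_components`, `bridging_individuals`, `multiscale_summary`), the toy points, the choice of the x-coordinate as the lens in place of a Morse function, and the single-linkage clustering rule are all assumptions made for illustration. It slices the data with overlapping intervals of the lens, clusters each slice, links clusters that share individuals (weighted by how many they share), and repeats this at several resolutions.

```python
import math
from itertools import combinations


def single_linkage(points, members, eps):
    """Group member indices whose points chain together within distance eps."""
    parent = {i: i for i in members}

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(members, 2):
        if math.dist(points[i], points[j]) <= eps:
            parent[find(i)] = find(j)
    clusters = {}
    for i in members:
        clusters.setdefault(find(i), set()).add(i)
    return [frozenset(c) for c in clusters.values()]


def mapper_graph(points, lens, n_intervals, overlap, eps):
    """Return (nodes, edges): clusters per lens slice, linked by shared points."""
    values = [lens(p) for p in points]
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_intervals
    pad = width * overlap  # assumed > 0 so neighbouring slices overlap
    nodes = []
    for k in range(n_intervals):
        a, b = lo + k * width - pad, lo + (k + 1) * width + pad
        members = [i for i, v in enumerate(values) if a <= v <= b]
        nodes.extend(single_linkage(points, members, eps))
    # Weighted, undirected edges: weight = number of individuals in common.
    edges = {}
    for i, j in combinations(range(len(nodes)), 2):
        shared = len(nodes[i] & nodes[j])
        if shared:
            edges[(i, j)] = shared
    return nodes, edges


def count_components(n_nodes, edges):
    """Number of connected components of the cluster graph."""
    parent = list(range(n_nodes))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in edges:
        parent[find(i)] = find(j)
    return len({find(i) for i in range(n_nodes)})


def bridging_individuals(nodes):
    """Point indices that belong to more than one cluster."""
    seen, shared = set(), set()
    for node in nodes:
        shared |= seen & node
        seen |= node
    return shared


def multiscale_summary(points, lens, resolutions, overlap, eps):
    """One row per resolution: (n_intervals, n_nodes, n_edges, n_components)."""
    rows = []
    for n in resolutions:
        nodes, edges = mapper_graph(points, lens, n, overlap, eps)
        rows.append((n, len(nodes), len(edges), count_components(len(nodes), edges)))
    return rows


# Toy data (hypothetical): two clumps of points along the x-axis.
points = [(0.0, 0.0), (0.5, 0.1), (1.0, 0.0), (1.5, 0.1),
          (5.0, 0.0), (5.5, 0.1), (6.0, 0.0)]
summary = multiscale_summary(points, lambda p: p[0], [1, 2, 3], 0.25, 0.6)
for row in summary:
    print(row)
```

On this toy data the two clumps remain separate components at every resolution, while at the finest setting the point at x = 1.5 falls into two overlapping slices and is reported by `bridging_individuals` as common to two clusters. Measures such as betweenness or the graph Laplacian from slide 10 would be computed on the same `nodes` and `edges` structure.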

Editor's Notes

  1. Dey, T. K., Memoli, F., & Wang, Y. (2015). Mutiscale Mapper: A Framework for Topological Summarization of Data and Maps. arXiv preprint arXiv:1504.03763. Singh, G., Mémoli, F., & Carlsson, G. E. (2007, September). Topological Methods for the Analysis of High Dimensional Data Sets and 3D Object Recognition. In SPBG (pp. 91-100).
  2. Ghosh, A. K., Chaudhuri, P., & Murthy, C. A. (2006). Multiscale classification using nearest neighbor density estimates. Systems, Man, and Cybernetics, Part B: Cybernetics, IEEE Transactions on, 36(5), 1139-1148. Shi, T., Seligson, D., Belldegrun, A. S., Palotie, A., & Horvath, S. (2005). Tumor classification by tissue microarray profiling: random forest clustering applied to renal cell carcinoma. Modern Pathology, 18(4), 547-557. Navarro, J. F., Frenk, C. S., & White, S. D. (1997). A Universal density profile from hierarchical clustering. The Astrophysical Journal, 490(2), 493.
  3. Spanier, E. H. (1994). Algebraic topology (Vol. 55, No. 1). Springer Science & Business Media. Aspinwall, P. S., Greene, B. R., & Morrison, D. R. (1994). Calabi-Yau moduli space, mirror manifolds and spacetime topology change in string theory. Nuclear Physics B, 416(2), 414-480. Schwarz, M. (1993). Morse homology. In Progress in Mathematics. Palis, J. (1969). On Morse-Smale dynamical systems. Topology, 8(4), 385-404. Devaney, R. L. (1989). An introduction to chaotic dynamical systems (Vol. 13046). Reading: Addison-Wesley.
  4. Epstein, C., Carlsson, G., & Edelsbrunner, H. (2011). Topological data analysis. Inverse Problems, 27(12), 120201. Zomorodian, A. (2007). Topological data analysis. Advances in Applied and Computational Topology, 70, 1-39.
  5. Singh, G., Mémoli, F., & Carlsson, G. E. (2007, September). Topological Methods for the Analysis of High Dimensional Data Sets and 3D Object Recognition. In SPBG (pp. 91-100). Lum, P. Y., Singh, G., Lehman, A., Ishkanov, T., Vejdemo-Johansson, M., Alagappan, M., ... & Carlsson, G. (2013). Extracting insights from the shape of complex data using topology. Scientific Reports, 3. Carlsson, G., Jardine, R., Feichtner-Kozlov, D., Morozov, D., Chazal, F., de Silva, V., ... & Wang, Y. (2012). Topological Data Analysis and Machine Learning Theory.
  6. Dey, T. K., Memoli, F., & Wang, Y. (2015). Mutiscale Mapper: A Framework for Topological Summarization of Data and Maps. arXiv preprint arXiv:1504.03763.
  7. Opens the door to extremely deep unsupervised learning and a new way to look at data structure and underlying relationships.
  8. Scott, J. (2012). Social network analysis. Sage. Carrington, P. J., Scott, J., & Wasserman, S. (Eds.). (2005). Models and methods in social network analysis (Vol. 28). Cambridge University Press. Wasserman, S., & Faust, K. (1994). Social network analysis: Methods and applications (Vol. 8). Cambridge University Press.