SlideShare a Scribd company logo
1 of 31
Download to read offline
Visualizing and
Communicating
High-dimensional Data
Dr. Stefan Kühn
Lead Data Scientist
Data Natives Berlin - 26.10.2016
Data Visualization Basics
2
The Modes of Perception
3
X
X
X
X
X
X
X
O
O
O O
O
O
O
O
X
X
X
X
O
O
O
O
X
X
X
X
X
X
X
X
O
O
O O
O
O
O
O
X
X X
X
X
O
O
O
Fast Slow
Find the outlier
The Modes of Perception
4
• Pre-attentive
• fast
• parallel processing
• effortless
• Pattern recognition
• semi-fast
• governed by laws of Gestalt
• Attentive
• slow
• sequential
• high effort (attention is a very limited resource)
Main Properties of Graphics
5
Category Example
Position
Shape
Size
Color
Orientation (Line)
Length (Line)
Type and Size (Line)
Brightness
Main Properties of Graphics Humans
6
Category Amount of pre-attentive information
Position very high
Shape ———
Size approx. 4
Color approx. 8
Orientation (Line) approx. 4
Length (Line) ———
Type and Size (Line) ———
Brightness approx. 8
Pre-attentive perception
7
• Position
• fast
• effective
• high number of different positions
• Color
• use with care
• Shape
• Orientation
Pre-attentive perception is effortless.
Exploit this as much as you can.
Pattern detection
8
„It is interesting to note that our brain […]
subconsciously always prefers meaningful
situations and objects.“
• Emergence
• Reiification
• Multi-stability
• Invariance
Pattern detection can be trained.
Exploit this for frequent visualizations.
What is this?
9
What is this?
10
What is this?
11
Laws of Gestalt
12
„It is interesting to note that our brain, in
accordance with the laws of Gestalt,
subconsciously always prefers meaningful
situations and objects.“
Accuracy of Graphics
13
Square Pie vs Stacked Bar vs Pie vs Donut
What do you think?
https://eagereyes.org/blog/2016/a-reanalysis-of-a-study-about-square-pie-charts-from-2009
Why do we visualize data?
14
• Explore
• use different techniques
• avoid „construction“ bias
• be careful with „aesthetics“
• challenge findings -> use attentive mode
• Explain
• focus on message or „story“
• use pre-attentive mode
Don’t trust graphics that you have not falsified yourself.
Beyond the Basics
15
Pair Plots
16
Correlation Plots
17
Fundamental Problems
• No accurate method in higher dimensions
• Approximation methods
• „Simulated“ dimensions (color, size, shape)
• Animations?
• No notion of quality or accuracy for
Visualizations
• Information Theory?
• „Stability“?
All Visualizations are wrong, but some are useful.
18
Approximation methods
• Pair Plots
• Axis-aligned projections
• Interpretable in terms of original variables
• Singular Value Decomposition
• Optimal with respect to 2-norm (Euclidean norm)
and supremum norm
• Comes with an error estimate
• Other methods
• Stochastic Neighbor Embedding ((t-)SNE)
• „Manifold Learning“
19
Manifold Learning Methods
• Locally Linear Embedding
• Neighborhood-preserving embedding
• Isomap
• quasi-isometric
• Multi-dimensional scaling
• quasi-isometric
• Spectral Embedding
• Spectral clustering based on similarity
• Stochastic Neighbor Embedding (SNE, t-SNE)
• preserves conditional probabilities for similarity
• Local Tangent Space Alignement (LTSA)
20
Manifold Learning
21
Manifold Learning
22
Manifold Learning
23
Local Tangent Space Alignement
24
Principal Components and Curves
• Principal Component Analysis
• orthogonal decomposition based on SVD
• linear in all variables
• tries to preserve variance
• Principal Curves
• minimize the Sum of Squared Errors with respect
to all variables (as PCA, preserve variance)
• nonlinear
• smooth
25
Principal Components and Curves
26
Parallel Coordinates
• Parallel Coordinates
• especially useful for high-dimensional data
• depends on ordering and scaling
27
The Grand Tour
• Animated sequence of 2-D projections
• https://en.wikipedia.org/wiki/
Grand_Tour_(data_visualisation)
• Asimov (1985): The grand tour: a tool for viewing
multidimensional data.
• Underlying idea
• Randomly generate 2-D projections (random
walk)
• Over time generate a dense subset of all
possible 2-D projections
• Optional: Follow a given path / guided tour
28
The Grand Tour
29
The Grand Tour
30
31
Thanks a lot!
www.codecentric.de
blog.codecentric.de
stefan.kuehn@codecentric.de
datascience@codecentric.de

More Related Content

Similar to Visualizing and Communicating High-dimensional Data

chi03-tutorial.ppt
chi03-tutorial.pptchi03-tutorial.ppt
chi03-tutorial.pptKumarVijay54
 
ODSC India 2018: Topological space creation & Clustering at BigData scale
ODSC India 2018: Topological space creation & Clustering at BigData scaleODSC India 2018: Topological space creation & Clustering at BigData scale
ODSC India 2018: Topological space creation & Clustering at BigData scaleKuldeep Jiwani
 
Talk at MCubed London about Manifold Learning and Applications
Talk at MCubed London about Manifold Learning and ApplicationsTalk at MCubed London about Manifold Learning and Applications
Talk at MCubed London about Manifold Learning and ApplicationsStefan Kühn
 
How Humans See Data - Google - November 2017
How Humans See Data  - Google - November 2017How Humans See Data  - Google - November 2017
How Humans See Data - Google - November 2017John Rauser
 
04 data viz concepts
04 data viz concepts04 data viz concepts
04 data viz conceptsPaul Kahn
 
How Humans See Data - Amazon Cut
How Humans See Data - Amazon CutHow Humans See Data - Amazon Cut
How Humans See Data - Amazon CutJohn Rauser
 
How Humans See Data
How Humans See DataHow Humans See Data
How Humans See DataJohn Rauser
 
Making sense of data visually: A modern look at datavisualization
Making sense of data visually: A modern look at datavisualizationMaking sense of data visually: A modern look at datavisualization
Making sense of data visually: A modern look at datavisualizationVladimir Milev
 
Intro to data visualization
Intro to data visualizationIntro to data visualization
Intro to data visualizationJan Aerts
 
Building maps with analysis
Building maps with analysisBuilding maps with analysis
Building maps with analysisLindaBeale
 
AMIA 2015 Visual Analytics in Healthcare Tutorial Part 1
AMIA 2015 Visual Analytics in Healthcare Tutorial Part 1AMIA 2015 Visual Analytics in Healthcare Tutorial Part 1
AMIA 2015 Visual Analytics in Healthcare Tutorial Part 1David Gotz
 
Data Visualization dataviz superpower
Data Visualization dataviz superpowerData Visualization dataviz superpower
Data Visualization dataviz superpowerJen Stirrup
 
Creativity and Curiosity - The Trial and Error of Data Science
Creativity and Curiosity - The Trial and Error of Data ScienceCreativity and Curiosity - The Trial and Error of Data Science
Creativity and Curiosity - The Trial and Error of Data ScienceDamianMingle
 
Enhancing Parallel Coordinates with Curves
Enhancing Parallel Coordinates with CurvesEnhancing Parallel Coordinates with Curves
Enhancing Parallel Coordinates with Curvesmartinjgraham
 
Manifold Learning and Data Visualization
Manifold Learning and Data VisualizationManifold Learning and Data Visualization
Manifold Learning and Data VisualizationStefan Kühn
 
Data Visualization in Data Science
Data Visualization in Data ScienceData Visualization in Data Science
Data Visualization in Data ScienceMaloy Manna, PMP®
 

Similar to Visualizing and Communicating High-dimensional Data (20)

chi03-tutorial.ppt
chi03-tutorial.pptchi03-tutorial.ppt
chi03-tutorial.ppt
 
ODSC India 2018: Topological space creation & Clustering at BigData scale
ODSC India 2018: Topological space creation & Clustering at BigData scaleODSC India 2018: Topological space creation & Clustering at BigData scale
ODSC India 2018: Topological space creation & Clustering at BigData scale
 
Talk at MCubed London about Manifold Learning and Applications
Talk at MCubed London about Manifold Learning and ApplicationsTalk at MCubed London about Manifold Learning and Applications
Talk at MCubed London about Manifold Learning and Applications
 
How Humans See Data - Google - November 2017
How Humans See Data  - Google - November 2017How Humans See Data  - Google - November 2017
How Humans See Data - Google - November 2017
 
Reza talk
Reza talkReza talk
Reza talk
 
04 data viz concepts
04 data viz concepts04 data viz concepts
04 data viz concepts
 
How Humans See Data - Amazon Cut
How Humans See Data - Amazon CutHow Humans See Data - Amazon Cut
How Humans See Data - Amazon Cut
 
How Humans See Data
How Humans See DataHow Humans See Data
How Humans See Data
 
Making sense of data visually: A modern look at datavisualization
Making sense of data visually: A modern look at datavisualizationMaking sense of data visually: A modern look at datavisualization
Making sense of data visually: A modern look at datavisualization
 
Intro to data visualization
Intro to data visualizationIntro to data visualization
Intro to data visualization
 
Building maps with analysis
Building maps with analysisBuilding maps with analysis
Building maps with analysis
 
AMIA 2015 Visual Analytics in Healthcare Tutorial Part 1
AMIA 2015 Visual Analytics in Healthcare Tutorial Part 1AMIA 2015 Visual Analytics in Healthcare Tutorial Part 1
AMIA 2015 Visual Analytics in Healthcare Tutorial Part 1
 
Data Visualization dataviz superpower
Data Visualization dataviz superpowerData Visualization dataviz superpower
Data Visualization dataviz superpower
 
Creativity and Curiosity - The Trial and Error of Data Science
Creativity and Curiosity - The Trial and Error of Data ScienceCreativity and Curiosity - The Trial and Error of Data Science
Creativity and Curiosity - The Trial and Error of Data Science
 
Enhancing Parallel Coordinates with Curves
Enhancing Parallel Coordinates with CurvesEnhancing Parallel Coordinates with Curves
Enhancing Parallel Coordinates with Curves
 
FaceRecognition.ppt
FaceRecognition.pptFaceRecognition.ppt
FaceRecognition.ppt
 
FaceRecognition.ppt
FaceRecognition.pptFaceRecognition.ppt
FaceRecognition.ppt
 
FaceRecognition.ppt
FaceRecognition.pptFaceRecognition.ppt
FaceRecognition.ppt
 
Manifold Learning and Data Visualization
Manifold Learning and Data VisualizationManifold Learning and Data Visualization
Manifold Learning and Data Visualization
 
Data Visualization in Data Science
Data Visualization in Data ScienceData Visualization in Data Science
Data Visualization in Data Science
 

More from Stefan Kühn

data2day2023_SKuehn_DataPlatformFallacy.pdf
data2day2023_SKuehn_DataPlatformFallacy.pdfdata2day2023_SKuehn_DataPlatformFallacy.pdf
data2day2023_SKuehn_DataPlatformFallacy.pdfStefan Kühn
 
data2day2022_SKuehn_DataValueChain.pdf
data2day2022_SKuehn_DataValueChain.pdfdata2day2022_SKuehn_DataValueChain.pdf
data2day2022_SKuehn_DataValueChain.pdfStefan Kühn
 
Data Science - Cargo Cult - Organizational Change
Data Science - Cargo Cult - Organizational ChangeData Science - Cargo Cult - Organizational Change
Data Science - Cargo Cult - Organizational ChangeStefan Kühn
 
Interactive Dashboards with R
Interactive Dashboards with RInteractive Dashboards with R
Interactive Dashboards with RStefan Kühn
 
Talk at PyData Berlin about Manifold Learning and Applications
Talk at PyData Berlin about Manifold Learning and ApplicationsTalk at PyData Berlin about Manifold Learning and Applications
Talk at PyData Berlin about Manifold Learning and ApplicationsStefan Kühn
 
The Machinery behind Deep Learning
The Machinery behind Deep LearningThe Machinery behind Deep Learning
The Machinery behind Deep LearningStefan Kühn
 
Becoming Data-driven - Machine Learning @ XING Marketing Solutions
Becoming Data-driven - Machine Learning @ XING Marketing SolutionsBecoming Data-driven - Machine Learning @ XING Marketing Solutions
Becoming Data-driven - Machine Learning @ XING Marketing SolutionsStefan Kühn
 
Learning To Rank data2day 2017
Learning To Rank data2day 2017Learning To Rank data2day 2017
Learning To Rank data2day 2017Stefan Kühn
 
Deep Learning and Optimization Methods
Deep Learning and Optimization MethodsDeep Learning and Optimization Methods
Deep Learning and Optimization MethodsStefan Kühn
 
Data quality - The True Big Data Challenge
Data quality - The True Big Data ChallengeData quality - The True Big Data Challenge
Data quality - The True Big Data ChallengeStefan Kühn
 
SKuehn_MachineLearningAndOptimization_2015
SKuehn_MachineLearningAndOptimization_2015SKuehn_MachineLearningAndOptimization_2015
SKuehn_MachineLearningAndOptimization_2015Stefan Kühn
 
SKuehn_Talk_FootballAnalytics_data2day2015
SKuehn_Talk_FootballAnalytics_data2day2015SKuehn_Talk_FootballAnalytics_data2day2015
SKuehn_Talk_FootballAnalytics_data2day2015Stefan Kühn
 

More from Stefan Kühn (13)

data2day2023_SKuehn_DataPlatformFallacy.pdf
data2day2023_SKuehn_DataPlatformFallacy.pdfdata2day2023_SKuehn_DataPlatformFallacy.pdf
data2day2023_SKuehn_DataPlatformFallacy.pdf
 
data2day2022_SKuehn_DataValueChain.pdf
data2day2022_SKuehn_DataValueChain.pdfdata2day2022_SKuehn_DataValueChain.pdf
data2day2022_SKuehn_DataValueChain.pdf
 
Data Science - Cargo Cult - Organizational Change
Data Science - Cargo Cult - Organizational ChangeData Science - Cargo Cult - Organizational Change
Data Science - Cargo Cult - Organizational Change
 
Interactive Dashboards with R
Interactive Dashboards with RInteractive Dashboards with R
Interactive Dashboards with R
 
Talk at PyData Berlin about Manifold Learning and Applications
Talk at PyData Berlin about Manifold Learning and ApplicationsTalk at PyData Berlin about Manifold Learning and Applications
Talk at PyData Berlin about Manifold Learning and Applications
 
Bridging the gap
Bridging the gapBridging the gap
Bridging the gap
 
The Machinery behind Deep Learning
The Machinery behind Deep LearningThe Machinery behind Deep Learning
The Machinery behind Deep Learning
 
Becoming Data-driven - Machine Learning @ XING Marketing Solutions
Becoming Data-driven - Machine Learning @ XING Marketing SolutionsBecoming Data-driven - Machine Learning @ XING Marketing Solutions
Becoming Data-driven - Machine Learning @ XING Marketing Solutions
 
Learning To Rank data2day 2017
Learning To Rank data2day 2017Learning To Rank data2day 2017
Learning To Rank data2day 2017
 
Deep Learning and Optimization Methods
Deep Learning and Optimization MethodsDeep Learning and Optimization Methods
Deep Learning and Optimization Methods
 
Data quality - The True Big Data Challenge
Data quality - The True Big Data ChallengeData quality - The True Big Data Challenge
Data quality - The True Big Data Challenge
 
SKuehn_MachineLearningAndOptimization_2015
SKuehn_MachineLearningAndOptimization_2015SKuehn_MachineLearningAndOptimization_2015
SKuehn_MachineLearningAndOptimization_2015
 
SKuehn_Talk_FootballAnalytics_data2day2015
SKuehn_Talk_FootballAnalytics_data2day2015SKuehn_Talk_FootballAnalytics_data2day2015
SKuehn_Talk_FootballAnalytics_data2day2015
 

Recently uploaded

CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxolyaivanovalion
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfMarinCaroMartnezBerg
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
Ukraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSUkraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSAishani27
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusTimothy Spann
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptxAnupama Kate
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998YohFuh
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxolyaivanovalion
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...Suhani Kapoor
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfLars Albertsson
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxolyaivanovalion
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxolyaivanovalion
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxolyaivanovalion
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130Suhani Kapoor
 
Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfLars Albertsson
 
Brighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingBrighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingNeil Barnes
 

Recently uploaded (20)

CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptx
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Ukraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSUkraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICS
 
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in  KishangarhDelhi 99530 vip 56974 Genuine Escort Service Call Girls in  Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
 
E-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptxE-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptx
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and Milvus
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptx
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdf
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptx
 
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptx
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptx
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
 
Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdf
 
Brighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingBrighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data Storytelling
 
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
 

Visualizing and Communicating High-dimensional Data

  • 1. Visualizing and Communicating High-dimensional Data Dr. Stefan Kühn Lead Data Scientist Data Natives Berlin - 26.10.2016
  • 3. The Modes of Perception 3 X X X X X X X O O O O O O O O X X X X O O O O X X X X X X X X O O O O O O O O X X X X X O O O Fast Slow Find the outlier
  • 4. The Modes of Perception 4 • Pre-attentive • fast • parallel processing • effortless • Pattern recognition • semi-fast • governed by laws of Gestalt • Attentive • slow • sequential • high effort (attention is a very limited resource)
  • 5. Main Properties of Graphics 5 Category Example Position Shape Size Color Orientation (Line) Length (Line) Type and Size (Line) Brightness
  • 6. Main Properties of Graphics Humans 6 Category Amount of pre-attentive information Position very high Shape ——— Size approx. 4 Color approx. 8 Orientation (Line) approx. 4 Length (Line) ——— Type and Size (Line) ——— Brightness approx. 8
  • 7. Pre-attentive perception 7 • Position • fast • effective • high number of different positions • Color • use with care • Shape • Orientation Pre-attentive perception is effortless. Exploit this as much as you can.
  • 8. Pattern detection 8 „It is interesting to note that our brain […] subconsciously always prefers meaningful situations and objects.“ • Emergence • Reiification • Multi-stability • Invariance Pattern detection can be trained. Exploit this for frequent visualizations.
  • 12. Laws of Gestalt 12 „It is interesting to note that our brain, in accordance with the laws of Gestalt, subconsciously always prefers meaningful situations and objects.“
  • 13. Accuracy of Graphics 13 Square Pie vs Stacked Bar vs Pie vs Donut What do you think? https://eagereyes.org/blog/2016/a-reanalysis-of-a-study-about-square-pie-charts-from-2009
  • 14. Why do we visualize data? 14 • Explore • use different techniques • avoid „construction“ bias • be careful with „aesthetics“ • challenge findings -> use attentive mode • Explain • focus on message or „story“ • use pre-attentive mode Don’t trust graphics that you have not falsified yourself.
  • 18. Fundamental Problems • No accurate method in higher dimensions • Approximation methods • „Simulated“ dimensions (color, size, shape) • Animations? • No notion of quality or accuracy for Visualizations • Information Theory? • „Stability“? All Visualizations are wrong, but some are useful. 18
  • 19. Approximation methods • Pair Plots • Axis-aligned projections • Interpretable in terms of original variables • Singular Value Decomposition • Optimal with respect to 2-norm (Euclidean norm) and supremum norm • Comes with an error estimate • Other methods • Stochastic Neighbor Embedding ((t-)SNE) • „Manifold Learning“ 19
  • 20. Manifold Learning Methods • Locally Linear Embedding • Neighborhood-preserving embedding • Isomap • quasi-isometric • Multi-dimensional scaling • quasi-isometric • Spectral Embedding • Spectral clustering based on similarity • Stochastic Neighbor Embedding (SNE, t-SNE) • preserves conditional probabilities for similarity • Local Tangent Space Alignement (LTSA) 20
  • 24. Local Tangent Space Alignement 24
  • 25. Principal Components and Curves • Principal Component Analysis • orthogonal decomposition based on SVD • linear in all variables • tries to preserve variance • Principal Curves • minimize the Sum of Squared Errors with respect to all variables (as PCA, preserve variance) • nonlinear • smooth 25
  • 27. Parallel Coordinates • Parallel Coordinates • especially useful for high-dimensional data • depends on ordering and scaling 27
  • 28. The Grand Tour • Animated sequence of 2-D projections • https://en.wikipedia.org/wiki/ Grand_Tour_(data_visualisation) • Asimov (1985): The grand tour: a tool for viewing multidimensional data. • Underlying idea • Randomly generate 2-D projections (random walk) • Over time generate a dense subset of all possible 2-D projections • Optional: Follow a given path / guided tour 28