SlideShare a Scribd company logo
• we focus on visualization=information visualization
• Data→Visualization→Analysis
• Related concepts
• Computer graphics: Data is not necessary present, analysis is not normally assumed
• Example: Computer games
• Scientific visualization: similar to information visualization, often engineering data,
statistical/machine learning analysis is normally not assumed
• Example: Industrial robots
https://www.youtube.com/watch?v=mpechGIfPbw
Scientific visualization Information visualization
• Which graphs can be used for analysis of my data?
• How to create these graphs?
• How should these graphs be analysed?
• How to make these graphs looking good enough for publication or presentation?
• Human sight = primary resource for information understanding
• Visualization is often the quickest way for data understanding
• The way of data visualization may affect decision making dramatically
Decision here: population does not
increase so much, no intervention
needed
Decision here: population
increases quickly, intervention is
required
Visual perception problem
Source: Elting Linda S, Martin Charles
G, Cantor Scott B, Rubenstein Edward
B. Influence of data display formats on
physician investigators' decisions to
stop clinical trials: prospective trial with
repeated measures BMJ 1999; 318
:1527
• Visualization for exploration
• Clusters
• Trends
• Anomalies
• …
• Confirmatory visualization
• Example 1: Perform linear regression, analyse residuals→ was linear regression reasonable
• Example 2: Discover clusters by K-means, visualize clusters→ are they clusters actually?
• Visualization for presentation
Key ingredient: mapping data columns to visual
structures (aesthetics)
• Human visual system has limitations
• These limitations may lead to wrong/incomplete analysis of graphs
• Understanding how we see→ better displays
• Misleading graphics needs to be avoided
• Color= hue + saturation + value (lightness)
• 8% of males are color deficient→ what are good colors?
https://henrydangprg.com/2016/06/26/color-detection-in-python-with-opencv/
http://www.ilusa.com/gallery/elephant-illusion.jpg
How quickly can you identify a
red dot? How quickly can you identify a
square of right-handed Rs?
• Certain aesthetics are fast to process
• How can this affect analysis?
• Viewing raw data is often prefered
• Sometimes some preprocessing is needed
• Missing values and Data cleaning
• Discard the bad record →may remove almost all data
• Assign sentinel value
• Column mean imputation
• Nearest neighbor imputation
• Other imputations
• Normalization
• Converting column to range [0,1]. Useful in for ex. color mapping
• Centering and scaling 0/1
• Nonlinear transformations: log, sqrt
• Segmentation
• Split data according to some column
• Sampling, subsetting and expanding
• Random sampling reduces size of data and facilitates overplotting (for ex. scatterplots)
• Interpolation: linear (one dimension), bilinear (two dimensions), nonlinear. Select necessary
amount of intrepolation points.
• Dimension reduction
• PCA
• MDS
• Other techniques (ex. ICA, Autoencoders), welcome to Machine Learning...
• Mapping nominal dimensions to numbers
• Random mapping should never be done unless intrinsic ordering is present
• Use other numeric variables to measure in the data to measure “closeness” of values in the
nominal variable
• Correspondence analysis
• Aggregation and summarization
1. Grouping observations
2. Computing summary statistics per group
• Smoothing and filtering
• Replace original values with a smoothed versions
Commercial:
• SAS and SAS JMP – environment. Special visual tools are available (JMP), require separate license. Well documented.
Good even for large sets. SAS Enterprise Guide has many visual static tools
• Spotfire – Many static and interactive visualization tools
• Tableau– Many static and interactive visualization tools
• InfoScope – visualizing maps, interactive visualizations
• Infographics-
Free:
• R – programming language. Set of packages is constantly updated. A lot of statistical tools (even the newest methods)
Badly documented
• Plotly – a tool for interactive and dynamic graphics, R interface available
• Shiny - – a tool for R-based web applications using graphics
• GraphViz – visualization of graph data, coding needed
• Jigsaw – Text analysis
Tools for the web (used by web designers):
• ActionScript
• JavaScript
• Prefuse
• VTK
… much more references given in the course book
• Install R: http://www.r-project.org/
• Install RStudio: http://rstudio.org/
Progra
m
Execution
console
Plots
Workspace
• R base: basic plotting, not
publication quality Grob package: low-level plotting
tools, new types of plots
Zhou, Lutong, and W. John Braun. "Fun with the r grid package." Journal of Statistics Education
18.3 (2010).
Ggplot2 package: based on
grammar of graphics, close to
publication quality
Plotly package: Ggplot2 + interactivity
Base R graphical procedures:
• plot(x,..) plots time series
• plot(x,y) scatter plot
• plot(x,y) followed by points(x,y) plots several
scatterplots in one coordinate system
• hist(x,..) plots a hitogram
• persp(x,y,z,…) creates surface plots
• cloud(formula,data..) creates 3D scatter plot
• Visualization for exploration
• Default settings
• Visualization for presentation for publication
• Higher quality graphics is required
• Improve the graph quality in the software (often requires quite a bit of
programming)
• Use postprocessing tools, such as Inkscape or Adobe Illustrator
Plot processed in
Inkscape
Example: Compare two plots and state
what is improved in the second plot.
• Install Inkscape
• http://inkscape.org/
• Inkscape is an open source, SVG-based
vector drawing program
• file format that Inkscape uses is
compact and quickly transmittable over
the Internet.
• Vector graphics: image is defined in
terms of lines, not pixels
• Benefit: can be enlarged without loss
of picture quality
1. Save your R plot as PDF and import it to
Inkscape
2. Make changes and export your plot as a
PNG-file or save it as PDF.
Bitmap image and vector
image (enlarged)
• Menu bar (Important: File, Object, Path, Text, View→Grid)
• Command bar (Zoom to fit page, Edit object’s colors)
• Tool controls
• Tool box (Select and transform, zoom, Text, Eraser, Fill)
• Color palette
• Status bar
• Course book, chapters 1.1, 1.3-1.8 and 2
• Manual to InkScape: http://tavmjong.free.fr/INKSCAPE/MANUAL/html/index.php

More Related Content

Similar to UNit4.pdf

Machine Learning Foundations for Professional Managers
Machine Learning Foundations for Professional ManagersMachine Learning Foundations for Professional Managers
Machine Learning Foundations for Professional Managers
Albert Y. C. Chen
 
AMIA 2015 Visual Analytics in Healthcare Tutorial Part 1
AMIA 2015 Visual Analytics in Healthcare Tutorial Part 1AMIA 2015 Visual Analytics in Healthcare Tutorial Part 1
AMIA 2015 Visual Analytics in Healthcare Tutorial Part 1
David Gotz
 
201411203 goto night on graphs for fraud detection
201411203 goto night on graphs for fraud detection201411203 goto night on graphs for fraud detection
201411203 goto night on graphs for fraud detection
Rik Van Bruggen
 
Introduction to Computational Statistics
Introduction to Computational StatisticsIntroduction to Computational Statistics
Introduction to Computational Statistics
Setia Pramana
 
An R primer for SQL folks
An R primer for SQL folksAn R primer for SQL folks
An R primer for SQL folks
Thomas Hütter
 
datavisualization-5thUnit.pdf
datavisualization-5thUnit.pdfdatavisualization-5thUnit.pdf
datavisualization-5thUnit.pdf
BrijeshPatil13
 
Big data week 2018 - Graph Analytics on Big Data
Big data week 2018 - Graph Analytics on Big DataBig data week 2018 - Graph Analytics on Big Data
Big data week 2018 - Graph Analytics on Big Data
Christos Hadjinikolis
 
Converting and Transforming Technical Graphics
Converting and Transforming Technical GraphicsConverting and Transforming Technical Graphics
Converting and Transforming Technical Graphics
dclsocialmedia
 
Hadoop Master Class : A concise overview
Hadoop Master Class : A concise overviewHadoop Master Class : A concise overview
Hadoop Master Class : A concise overview
Abhishek Roy
 
LDBC 8th TUC Meeting: Introduction and status update
LDBC 8th TUC Meeting: Introduction and status updateLDBC 8th TUC Meeting: Introduction and status update
LDBC 8th TUC Meeting: Introduction and status update
LDBC council
 
Module 3 - Basics of Data Manipulation in Time Series
Module 3 - Basics of Data Manipulation in Time SeriesModule 3 - Basics of Data Manipulation in Time Series
Module 3 - Basics of Data Manipulation in Time Series
ssusere5ddd6
 
Big Data, a space adventure - Mario Cartia - Codemotion Rome 2015
Big Data, a space adventure - Mario Cartia - Codemotion Rome 2015Big Data, a space adventure - Mario Cartia - Codemotion Rome 2015
Big Data, a space adventure - Mario Cartia - Codemotion Rome 2015
Codemotion
 
Intro to graphs for HR analytics
Intro to graphs for HR analyticsIntro to graphs for HR analytics
Intro to graphs for HR analytics
Rik Van Bruggen
 
Big data Intro - Presentation to OCHackerz Meetup Group
Big data Intro - Presentation to OCHackerz Meetup GroupBig data Intro - Presentation to OCHackerz Meetup Group
Big data Intro - Presentation to OCHackerz Meetup Group
Sri Kanajan
 
Day - 6.pptx
Day - 6.pptxDay - 6.pptx
Day - 6.pptx
Manimaran A
 
LSESU a Taste of R Language Workshop
LSESU a Taste of R Language WorkshopLSESU a Taste of R Language Workshop
LSESU a Taste of R Language Workshop
Korkrid Akepanidtaworn
 
Data visualization is the representation of data through use of common graphi...
Data visualization is the representation of data through use of common graphi...Data visualization is the representation of data through use of common graphi...
Data visualization is the representation of data through use of common graphi...
samarpeetnandanwar21
 
Big Data Processing
Big Data ProcessingBig Data Processing
Big Data Processing
Michael Ming Lei
 

Similar to UNit4.pdf (20)

Machine Learning Foundations for Professional Managers
Machine Learning Foundations for Professional ManagersMachine Learning Foundations for Professional Managers
Machine Learning Foundations for Professional Managers
 
AMIA 2015 Visual Analytics in Healthcare Tutorial Part 1
AMIA 2015 Visual Analytics in Healthcare Tutorial Part 1AMIA 2015 Visual Analytics in Healthcare Tutorial Part 1
AMIA 2015 Visual Analytics in Healthcare Tutorial Part 1
 
201411203 goto night on graphs for fraud detection
201411203 goto night on graphs for fraud detection201411203 goto night on graphs for fraud detection
201411203 goto night on graphs for fraud detection
 
Introduction to Computational Statistics
Introduction to Computational StatisticsIntroduction to Computational Statistics
Introduction to Computational Statistics
 
An R primer for SQL folks
An R primer for SQL folksAn R primer for SQL folks
An R primer for SQL folks
 
datavisualization-5thUnit.pdf
datavisualization-5thUnit.pdfdatavisualization-5thUnit.pdf
datavisualization-5thUnit.pdf
 
Big data week 2018 - Graph Analytics on Big Data
Big data week 2018 - Graph Analytics on Big DataBig data week 2018 - Graph Analytics on Big Data
Big data week 2018 - Graph Analytics on Big Data
 
Converting and Transforming Technical Graphics
Converting and Transforming Technical GraphicsConverting and Transforming Technical Graphics
Converting and Transforming Technical Graphics
 
Hadoop Master Class : A concise overview
Hadoop Master Class : A concise overviewHadoop Master Class : A concise overview
Hadoop Master Class : A concise overview
 
LDBC 8th TUC Meeting: Introduction and status update
LDBC 8th TUC Meeting: Introduction and status updateLDBC 8th TUC Meeting: Introduction and status update
LDBC 8th TUC Meeting: Introduction and status update
 
Module 3 - Basics of Data Manipulation in Time Series
Module 3 - Basics of Data Manipulation in Time SeriesModule 3 - Basics of Data Manipulation in Time Series
Module 3 - Basics of Data Manipulation in Time Series
 
Big Data, a space adventure - Mario Cartia - Codemotion Rome 2015
Big Data, a space adventure - Mario Cartia - Codemotion Rome 2015Big Data, a space adventure - Mario Cartia - Codemotion Rome 2015
Big Data, a space adventure - Mario Cartia - Codemotion Rome 2015
 
Intro to graphs for HR analytics
Intro to graphs for HR analyticsIntro to graphs for HR analytics
Intro to graphs for HR analytics
 
Big data Intro - Presentation to OCHackerz Meetup Group
Big data Intro - Presentation to OCHackerz Meetup GroupBig data Intro - Presentation to OCHackerz Meetup Group
Big data Intro - Presentation to OCHackerz Meetup Group
 
Day - 6.pptx
Day - 6.pptxDay - 6.pptx
Day - 6.pptx
 
LSESU a Taste of R Language Workshop
LSESU a Taste of R Language WorkshopLSESU a Taste of R Language Workshop
LSESU a Taste of R Language Workshop
 
ENAR short course
ENAR short courseENAR short course
ENAR short course
 
Mapinfo 2014
Mapinfo 2014Mapinfo 2014
Mapinfo 2014
 
Data visualization is the representation of data through use of common graphi...
Data visualization is the representation of data through use of common graphi...Data visualization is the representation of data through use of common graphi...
Data visualization is the representation of data through use of common graphi...
 
Big Data Processing
Big Data ProcessingBig Data Processing
Big Data Processing
 

More from SugumarSarDurai

Parking NYC.pdf
Parking NYC.pdfParking NYC.pdf
Parking NYC.pdf
SugumarSarDurai
 
Apache Spark
Apache SparkApache Spark
Apache Spark
SugumarSarDurai
 
Power BI.pdf
Power BI.pdfPower BI.pdf
Power BI.pdf
SugumarSarDurai
 
Unit 6.pdf
Unit 6.pdfUnit 6.pdf
Unit 6.pdf
SugumarSarDurai
 
Unit 5.pdf
Unit 5.pdfUnit 5.pdf
Unit 5.pdf
SugumarSarDurai
 
07 Data-Exploration.pdf
07 Data-Exploration.pdf07 Data-Exploration.pdf
07 Data-Exploration.pdf
SugumarSarDurai
 
06 Excel.pdf
06 Excel.pdf06 Excel.pdf
06 Excel.pdf
SugumarSarDurai
 
05 python.pdf
05 python.pdf05 python.pdf
05 python.pdf
SugumarSarDurai
 
00-01 DSnDA.pdf
00-01 DSnDA.pdf00-01 DSnDA.pdf
00-01 DSnDA.pdf
SugumarSarDurai
 
03-Data-Analysis-Final.pdf
03-Data-Analysis-Final.pdf03-Data-Analysis-Final.pdf
03-Data-Analysis-Final.pdf
SugumarSarDurai
 
UNit4d.pdf
UNit4d.pdfUNit4d.pdf
UNit4d.pdf
SugumarSarDurai
 
Unit 4 Time Study.pdf
Unit 4 Time Study.pdfUnit 4 Time Study.pdf
Unit 4 Time Study.pdf
SugumarSarDurai
 
Unit 3 Micro and Memo motion study.pdf
Unit 3 Micro and Memo motion study.pdfUnit 3 Micro and Memo motion study.pdf
Unit 3 Micro and Memo motion study.pdf
SugumarSarDurai
 
02 Work study -Part_1.pdf
02 Work study -Part_1.pdf02 Work study -Part_1.pdf
02 Work study -Part_1.pdf
SugumarSarDurai
 
02 Method Study part_2.pdf
02 Method Study part_2.pdf02 Method Study part_2.pdf
02 Method Study part_2.pdf
SugumarSarDurai
 
01 Production_part_2.pdf
01 Production_part_2.pdf01 Production_part_2.pdf
01 Production_part_2.pdf
SugumarSarDurai
 
01 Production_part_1.pdf
01 Production_part_1.pdf01 Production_part_1.pdf
01 Production_part_1.pdf
SugumarSarDurai
 
01 Industrial Management_Part_1a .pdf
01 Industrial Management_Part_1a .pdf01 Industrial Management_Part_1a .pdf
01 Industrial Management_Part_1a .pdf
SugumarSarDurai
 
01 Industrial Management_Part_1 .pdf
01 Industrial Management_Part_1 .pdf01 Industrial Management_Part_1 .pdf
01 Industrial Management_Part_1 .pdf
SugumarSarDurai
 

More from SugumarSarDurai (19)

Parking NYC.pdf
Parking NYC.pdfParking NYC.pdf
Parking NYC.pdf
 
Apache Spark
Apache SparkApache Spark
Apache Spark
 
Power BI.pdf
Power BI.pdfPower BI.pdf
Power BI.pdf
 
Unit 6.pdf
Unit 6.pdfUnit 6.pdf
Unit 6.pdf
 
Unit 5.pdf
Unit 5.pdfUnit 5.pdf
Unit 5.pdf
 
07 Data-Exploration.pdf
07 Data-Exploration.pdf07 Data-Exploration.pdf
07 Data-Exploration.pdf
 
06 Excel.pdf
06 Excel.pdf06 Excel.pdf
06 Excel.pdf
 
05 python.pdf
05 python.pdf05 python.pdf
05 python.pdf
 
00-01 DSnDA.pdf
00-01 DSnDA.pdf00-01 DSnDA.pdf
00-01 DSnDA.pdf
 
03-Data-Analysis-Final.pdf
03-Data-Analysis-Final.pdf03-Data-Analysis-Final.pdf
03-Data-Analysis-Final.pdf
 
UNit4d.pdf
UNit4d.pdfUNit4d.pdf
UNit4d.pdf
 
Unit 4 Time Study.pdf
Unit 4 Time Study.pdfUnit 4 Time Study.pdf
Unit 4 Time Study.pdf
 
Unit 3 Micro and Memo motion study.pdf
Unit 3 Micro and Memo motion study.pdfUnit 3 Micro and Memo motion study.pdf
Unit 3 Micro and Memo motion study.pdf
 
02 Work study -Part_1.pdf
02 Work study -Part_1.pdf02 Work study -Part_1.pdf
02 Work study -Part_1.pdf
 
02 Method Study part_2.pdf
02 Method Study part_2.pdf02 Method Study part_2.pdf
02 Method Study part_2.pdf
 
01 Production_part_2.pdf
01 Production_part_2.pdf01 Production_part_2.pdf
01 Production_part_2.pdf
 
01 Production_part_1.pdf
01 Production_part_1.pdf01 Production_part_1.pdf
01 Production_part_1.pdf
 
01 Industrial Management_Part_1a .pdf
01 Industrial Management_Part_1a .pdf01 Industrial Management_Part_1a .pdf
01 Industrial Management_Part_1a .pdf
 
01 Industrial Management_Part_1 .pdf
01 Industrial Management_Part_1 .pdf01 Industrial Management_Part_1 .pdf
01 Industrial Management_Part_1 .pdf
 

Recently uploaded

Language Across the Curriculm LAC B.Ed.
Language Across the  Curriculm LAC B.Ed.Language Across the  Curriculm LAC B.Ed.
Language Across the Curriculm LAC B.Ed.
Atul Kumar Singh
 
CACJapan - GROUP Presentation 1- Wk 4.pdf
CACJapan - GROUP Presentation 1- Wk 4.pdfCACJapan - GROUP Presentation 1- Wk 4.pdf
CACJapan - GROUP Presentation 1- Wk 4.pdf
camakaiclarkmusic
 
Honest Reviews of Tim Han LMA Course Program.pptx
Honest Reviews of Tim Han LMA Course Program.pptxHonest Reviews of Tim Han LMA Course Program.pptx
Honest Reviews of Tim Han LMA Course Program.pptx
timhan337
 
Adversarial Attention Modeling for Multi-dimensional Emotion Regression.pdf
Adversarial Attention Modeling for Multi-dimensional Emotion Regression.pdfAdversarial Attention Modeling for Multi-dimensional Emotion Regression.pdf
Adversarial Attention Modeling for Multi-dimensional Emotion Regression.pdf
Po-Chuan Chen
 
Chapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptxChapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptx
Mohd Adib Abd Muin, Senior Lecturer at Universiti Utara Malaysia
 
Additional Benefits for Employee Website.pdf
Additional Benefits for Employee Website.pdfAdditional Benefits for Employee Website.pdf
Additional Benefits for Employee Website.pdf
joachimlavalley1
 
Thesis Statement for students diagnonsed withADHD.ppt
Thesis Statement for students diagnonsed withADHD.pptThesis Statement for students diagnonsed withADHD.ppt
Thesis Statement for students diagnonsed withADHD.ppt
EverAndrsGuerraGuerr
 
Biological Screening of Herbal Drugs in detailed.
Biological Screening of Herbal Drugs in detailed.Biological Screening of Herbal Drugs in detailed.
Biological Screening of Herbal Drugs in detailed.
Ashokrao Mane college of Pharmacy Peth-Vadgaon
 
Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46
Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46
Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46
MysoreMuleSoftMeetup
 
The approach at University of Liverpool.pptx
The approach at University of Liverpool.pptxThe approach at University of Liverpool.pptx
The approach at University of Liverpool.pptx
Jisc
 
The geography of Taylor Swift - some ideas
The geography of Taylor Swift - some ideasThe geography of Taylor Swift - some ideas
The geography of Taylor Swift - some ideas
GeoBlogs
 
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
Levi Shapiro
 
Overview on Edible Vaccine: Pros & Cons with Mechanism
Overview on Edible Vaccine: Pros & Cons with MechanismOverview on Edible Vaccine: Pros & Cons with Mechanism
Overview on Edible Vaccine: Pros & Cons with Mechanism
DeeptiGupta154
 
The Roman Empire A Historical Colossus.pdf
The Roman Empire A Historical Colossus.pdfThe Roman Empire A Historical Colossus.pdf
The Roman Empire A Historical Colossus.pdf
kaushalkr1407
 
Unit 2- Research Aptitude (UGC NET Paper I).pdf
Unit 2- Research Aptitude (UGC NET Paper I).pdfUnit 2- Research Aptitude (UGC NET Paper I).pdf
Unit 2- Research Aptitude (UGC NET Paper I).pdf
Thiyagu K
 
A Strategic Approach: GenAI in Education
A Strategic Approach: GenAI in EducationA Strategic Approach: GenAI in Education
A Strategic Approach: GenAI in Education
Peter Windle
 
Digital Tools and AI for Teaching Learning and Research
Digital Tools and AI for Teaching Learning and ResearchDigital Tools and AI for Teaching Learning and Research
Digital Tools and AI for Teaching Learning and Research
Vikramjit Singh
 
Welcome to TechSoup New Member Orientation and Q&A (May 2024).pdf
Welcome to TechSoup   New Member Orientation and Q&A (May 2024).pdfWelcome to TechSoup   New Member Orientation and Q&A (May 2024).pdf
Welcome to TechSoup New Member Orientation and Q&A (May 2024).pdf
TechSoup
 
The Accursed House by Émile Gaboriau.pptx
The Accursed House by Émile Gaboriau.pptxThe Accursed House by Émile Gaboriau.pptx
The Accursed House by Émile Gaboriau.pptx
DhatriParmar
 
Supporting (UKRI) OA monographs at Salford.pptx
Supporting (UKRI) OA monographs at Salford.pptxSupporting (UKRI) OA monographs at Salford.pptx
Supporting (UKRI) OA monographs at Salford.pptx
Jisc
 

Recently uploaded (20)

Language Across the Curriculm LAC B.Ed.
Language Across the  Curriculm LAC B.Ed.Language Across the  Curriculm LAC B.Ed.
Language Across the Curriculm LAC B.Ed.
 
CACJapan - GROUP Presentation 1- Wk 4.pdf
CACJapan - GROUP Presentation 1- Wk 4.pdfCACJapan - GROUP Presentation 1- Wk 4.pdf
CACJapan - GROUP Presentation 1- Wk 4.pdf
 
Honest Reviews of Tim Han LMA Course Program.pptx
Honest Reviews of Tim Han LMA Course Program.pptxHonest Reviews of Tim Han LMA Course Program.pptx
Honest Reviews of Tim Han LMA Course Program.pptx
 
Adversarial Attention Modeling for Multi-dimensional Emotion Regression.pdf
Adversarial Attention Modeling for Multi-dimensional Emotion Regression.pdfAdversarial Attention Modeling for Multi-dimensional Emotion Regression.pdf
Adversarial Attention Modeling for Multi-dimensional Emotion Regression.pdf
 
Chapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptxChapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptx
 
Additional Benefits for Employee Website.pdf
Additional Benefits for Employee Website.pdfAdditional Benefits for Employee Website.pdf
Additional Benefits for Employee Website.pdf
 
Thesis Statement for students diagnonsed withADHD.ppt
Thesis Statement for students diagnonsed withADHD.pptThesis Statement for students diagnonsed withADHD.ppt
Thesis Statement for students diagnonsed withADHD.ppt
 
Biological Screening of Herbal Drugs in detailed.
Biological Screening of Herbal Drugs in detailed.Biological Screening of Herbal Drugs in detailed.
Biological Screening of Herbal Drugs in detailed.
 
Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46
Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46
Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46
 
The approach at University of Liverpool.pptx
The approach at University of Liverpool.pptxThe approach at University of Liverpool.pptx
The approach at University of Liverpool.pptx
 
The geography of Taylor Swift - some ideas
The geography of Taylor Swift - some ideasThe geography of Taylor Swift - some ideas
The geography of Taylor Swift - some ideas
 
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
 
Overview on Edible Vaccine: Pros & Cons with Mechanism
Overview on Edible Vaccine: Pros & Cons with MechanismOverview on Edible Vaccine: Pros & Cons with Mechanism
Overview on Edible Vaccine: Pros & Cons with Mechanism
 
The Roman Empire A Historical Colossus.pdf
The Roman Empire A Historical Colossus.pdfThe Roman Empire A Historical Colossus.pdf
The Roman Empire A Historical Colossus.pdf
 
Unit 2- Research Aptitude (UGC NET Paper I).pdf
Unit 2- Research Aptitude (UGC NET Paper I).pdfUnit 2- Research Aptitude (UGC NET Paper I).pdf
Unit 2- Research Aptitude (UGC NET Paper I).pdf
 
A Strategic Approach: GenAI in Education
A Strategic Approach: GenAI in EducationA Strategic Approach: GenAI in Education
A Strategic Approach: GenAI in Education
 
Digital Tools and AI for Teaching Learning and Research
Digital Tools and AI for Teaching Learning and ResearchDigital Tools and AI for Teaching Learning and Research
Digital Tools and AI for Teaching Learning and Research
 
Welcome to TechSoup New Member Orientation and Q&A (May 2024).pdf
Welcome to TechSoup   New Member Orientation and Q&A (May 2024).pdfWelcome to TechSoup   New Member Orientation and Q&A (May 2024).pdf
Welcome to TechSoup New Member Orientation and Q&A (May 2024).pdf
 
The Accursed House by Émile Gaboriau.pptx
The Accursed House by Émile Gaboriau.pptxThe Accursed House by Émile Gaboriau.pptx
The Accursed House by Émile Gaboriau.pptx
 
Supporting (UKRI) OA monographs at Salford.pptx
Supporting (UKRI) OA monographs at Salford.pptxSupporting (UKRI) OA monographs at Salford.pptx
Supporting (UKRI) OA monographs at Salford.pptx
 

UNit4.pdf

  • 1.
  • 2.
  • 3.
  • 4. • we focus on visualization=information visualization • Data→Visualization→Analysis • Related concepts • Computer graphics: Data is not necessary present, analysis is not normally assumed • Example: Computer games • Scientific visualization: similar to information visualization, often engineering data, statistical/machine learning analysis is normally not assumed • Example: Industrial robots
  • 6.
  • 7.
  • 8.
  • 9.
  • 10.
  • 11.
  • 12.
  • 13.
  • 14.
  • 15.
  • 16.
  • 17.
  • 18.
  • 19.
  • 20.
  • 21.
  • 22.
  • 23.
  • 24.
  • 25.
  • 26.
  • 27. • Which graphs can be used for analysis of my data? • How to create these graphs? • How should these graphs be analysed? • How to make these graphs looking good enough for publication or presentation?
  • 28. • Human sight = primary resource for information understanding • Visualization is often the quickest way for data understanding • The way of data visualization may affect decision making dramatically
  • 29. Decision here: population does not increase so much, no intervention needed Decision here: population increases quickly, intervention is required Visual perception problem
  • 30. Source: Elting Linda S, Martin Charles G, Cantor Scott B, Rubenstein Edward B. Influence of data display formats on physician investigators' decisions to stop clinical trials: prospective trial with repeated measures BMJ 1999; 318 :1527
  • 31. • Visualization for exploration • Clusters • Trends • Anomalies • … • Confirmatory visualization • Example 1: Perform linear regression, analyse residuals→ was linear regression reasonable • Example 2: Discover clusters by K-means, visualize clusters→ are they clusters actually? • Visualization for presentation
  • 32. Key ingredient: mapping data columns to visual structures (aesthetics)
  • 33. • Human visual system has limitations • These limitations may lead to wrong/incomplete analysis of graphs • Understanding how we see→ better displays • Misleading graphics needs to be avoided
  • 34. • Color= hue + saturation + value (lightness) • 8% of males are color deficient→ what are good colors? https://henrydangprg.com/2016/06/26/color-detection-in-python-with-opencv/
  • 36. How quickly can you identify a red dot? How quickly can you identify a square of right-handed Rs? • Certain aesthetics are fast to process
  • 37. • How can this affect analysis?
  • 38. • Viewing raw data is often prefered • Sometimes some preprocessing is needed • Missing values and Data cleaning • Discard the bad record →may remove almost all data • Assign sentinel value • Column mean imputation • Nearest neighbor imputation • Other imputations
  • 39. • Normalization • Converting column to range [0,1]. Useful in for ex. color mapping • Centering and scaling 0/1 • Nonlinear transformations: log, sqrt • Segmentation • Split data according to some column
  • 40. • Sampling, subsetting and expanding • Random sampling reduces size of data and facilitates overplotting (for ex. scatterplots) • Interpolation: linear (one dimension), bilinear (two dimensions), nonlinear. Select necessary amount of intrepolation points. • Dimension reduction • PCA • MDS • Other techniques (ex. ICA, Autoencoders), welcome to Machine Learning...
  • 41. • Mapping nominal dimensions to numbers • Random mapping should never be done unless intrinsic ordering is present • Use other numeric variables to measure in the data to measure “closeness” of values in the nominal variable • Correspondence analysis • Aggregation and summarization 1. Grouping observations 2. Computing summary statistics per group
  • 42. • Smoothing and filtering • Replace original values with a smoothed versions
  • 43.
  • 44.
  • 45.
  • 46.
  • 47.
  • 48.
  • 49.
  • 50.
  • 51.
  • 52.
  • 53.
  • 54.
  • 55.
  • 56.
  • 57.
  • 58.
  • 59.
  • 60.
  • 61. Commercial: • SAS and SAS JMP – environment. Special visual tools are available (JMP), require separate license. Well documented. Good even for large sets. SAS Enterprise Guide has many visual static tools • Spotfire – Many static and interactive visualization tools • Tableau– Many static and interactive visualization tools • InfoScope – visualizing maps, interactive visualizations • Infographics- Free: • R – programming language. Set of packages is constantly updated. A lot of statistical tools (even the newest methods) Badly documented • Plotly – a tool for interactive and dynamic graphics, R interface available • Shiny - – a tool for R-based web applications using graphics • GraphViz – visualization of graph data, coding needed • Jigsaw – Text analysis
  • 62. Tools for the web (used by web designers): • ActionScript • JavaScript • Prefuse • VTK … much more references given in the course book
  • 63.
  • 64. • Install R: http://www.r-project.org/ • Install RStudio: http://rstudio.org/ Progra m Execution console Plots Workspace
  • 65. • R base: basic plotting, not publication quality Grob package: low-level plotting tools, new types of plots Zhou, Lutong, and W. John Braun. "Fun with the r grid package." Journal of Statistics Education 18.3 (2010).
  • 66. Ggplot2 package: based on grammar of graphics, close to publication quality Plotly package: Ggplot2 + interactivity
  • 67. Base R graphical procedures: • plot(x,..) plots time series • plot(x,y) scatter plot • plot(x,y) followed by points(x,y) plots several scatterplots in one coordinate system • hist(x,..) plots a hitogram • persp(x,y,z,…) creates surface plots • cloud(formula,data..) creates 3D scatter plot
  • 68. • Visualization for exploration • Default settings • Visualization for presentation for publication • Higher quality graphics is required • Improve the graph quality in the software (often requires quite a bit of programming) • Use postprocessing tools, such as Inkscape or Adobe Illustrator
  • 69. Plot processed in Inkscape Example: Compare two plots and state what is improved in the second plot.
  • 70. • Install Inkscape • http://inkscape.org/ • Inkscape is an open source, SVG-based vector drawing program • file format that Inkscape uses is compact and quickly transmittable over the Internet. • Vector graphics: image is defined in terms of lines, not pixels • Benefit: can be enlarged without loss of picture quality 1. Save your R plot as PDF and import it to Inkscape 2. Make changes and export your plot as a PNG-file or save it as PDF. Bitmap image and vector image (enlarged)
  • 71. • Menu bar (Important: File, Object, Path, Text, View→Grid) • Command bar (Zoom to fit page, Edit object’s colors) • Tool controls • Tool box (Select and transform, zoom, Text, Eraser, Fill) • Color palette • Status bar
  • 72. • Course book, chapters 1.1, 1.3-1.8 and 2 • Manual to InkScape: http://tavmjong.free.fr/INKSCAPE/MANUAL/html/index.php