Unit III
Data Visualization
• As data and insights grow in number, a new requirement is the ability
of the executives and decision makers to absorb this information in
real time.
• There is a limit to human comprehension and visualization capacity.
• That is a good reason to prioritize and manage with fewer but key
variables that relate directly to the Key Result Areas (KRAs) of a role.
Data Visualization
• Data visualization is the graphical representation of information and
data. By using visual elements like charts, graphs, and maps, data
visualization tools provide an accessible way to see and understand
trends, outliers, and patterns in data.
• Additionally, it provides an excellent way for employees or
business owners to present data to non-technical audiences
without confusion.
• In the world of Big Data, data visualization tools and
technologies are essential to analyze massive amounts of
information and make data-driven decisions.
Considerations
• Here are a few considerations when presenting data:
1. Present the conclusions and not just report the data.
2. Choose wisely from a palette of graphs to suit the data.
3. Organize the results to make the central point stand out.
4. Ensure that the visuals accurately reflect the numbers. Inappropriate visuals
can create misinterpretations and misunderstandings.
5. Make the presentation unique, imaginative and memorable.
Data Visualization History
• A classic example is the depiction of Napoleon’s march to Russia in
1812 by the French cartographer Charles Joseph Minard.
• It covers about six dimensions.
• Time is on the horizontal axis. The geographical coordinates and rivers
are mapped in. The thickness of the bar shows the number of troops at
each mapped point in time. One color is used for the onward march and
another for the retreat. The temperature at each point in time is
shown in the line graph at the bottom.
Advantages of data visualization
• Easily sharing information.
• Interactively explore opportunities.
• Visualize patterns and relationships.
Disadvantages
• Biased or inaccurate information.
• Correlation doesn’t always mean causation.
• Core messages can get lost in translation.
Why is data visualization important?
• It helps people see, interact with, and better understand data.
Whether simple or complex, the right visualization can bring everyone
onto the same page, regardless of their level of expertise.
• Every STEM field benefits from understanding data—and so do
fields in government, finance, marketing, history, consumer
goods, service industries, education, sports, and so on.
• Data visualization is one of the steps of the data science
process, which states that after data has been collected,
processed and modeled, it must be visualized for conclusions to
be made.
Data Science
• Data science and data analytics both involve working with data to
gain insights. Data science often involves using data to build models
that can predict future outcomes, while data analytics tends to focus
more on analyzing past data to inform decisions in the present.
• Data science makes heavy use of machine learning algorithms to get
insights. Data analytics typically relies less on machine learning to
get insights from data.
Difference
Why is data visualization important?
• Data visualization discovers the trends in data
• Data visualization provides a perspective on the data
• Data visualization puts the data into the correct context
• Data visualization saves time
• Data visualization tells a data story
General Types of Visualizations
• Chart: Information presented in a tabular, graphical form with data
displayed along two axes. Can be in the form of a graph, diagram, or map.
• Table: A set of figures displayed in rows and columns.
• Graph: A diagram of points, lines, segments, curves, or areas that
represents certain variables in comparison to each other, usually along two
axes at a right angle.
• Geospatial: A visualization that shows data in map form using different
shapes and colors to show the relationship between pieces of data and
specific locations.
• Infographic: A combination of visuals and words that represent data.
Usually uses charts or diagrams.
• Dashboards: A collection of visualizations and data displayed in one
place to help with analyzing and presenting data.
Categories of Data Visualization
Numerical Data
• Numerical data is also known as quantitative data. It is any data that
represents an amount, such as the height, weight, or age of a person.
Numerical data falls into two categories:
• Continuous Data –
• It can take any value within a range (Example: Height measurements).
• Discrete Data –
• It takes only distinct, countable values (Example: Number of cars or children a household
has).
• The visualization techniques used to represent numerical data are
charts and numerical values. Examples are Pie Charts, Bar Charts,
Averages, Scorecards, etc.
Categorical Data
• Categorical data is also known as Qualitative data. Categorical data is any data
where data generally represents groups. It simply consists of categorical variables
that are used to represent characteristics such as a person’s ranking, a person’s
gender, etc. Categorical data visualization is all about depicting key themes,
establishing connections, and lending context. Categorical data is classified into
three categories :
• Binary Data –
• In this, classification is based on positioning (Example: Agrees or Disagrees).
• Nominal Data –
• In this, classification is based on attributes (Example: Male or Female).
• Ordinal Data –
• In this, classification is based on ordering of information (Example: Timeline or processes).
• The visualization techniques used to represent categorical data are
graphics, diagrams, and flowcharts. Examples are Word Clouds, Sentiment
Mapping, Venn Diagrams, etc.
Top Data Visualization Tools
• The following are the 10 best Data Visualization Tools
• Tableau
• Looker
• Zoho Analytics
• Sisense
• IBM Cognos Analytics
• Qlik Sense
• Domo
• Microsoft Power BI
• Klipfolio
• SAP Analytics Cloud
Spatial Visualization Techniques
• Univariate data -- one-dimensional data
• A single value can be displayed
• as the number itself -- a string of digits
• as a dial (such as an altimeter, speedometer, or gauge)
• as a slider or thermometer
Spatial Visualization Techniques
• Maximization
Maximize the data-ink ratio: use the least amount of "ink" (non-background
pixels) and leverage our pre-attentive vision to fill in the area.
[Figure: a Tukey plot as typically presented, next to a revised,
minimized version of the same plot.]
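A Tukey (box) plot is built on a five-number summary, so a minimized rendering needs little more than those five values. The sketch below computes them with the standard library; the function name and sample data are illustrative assumptions, and the whiskers here simply run to the minimum and maximum rather than to the 1.5 x IQR fences some box plots use.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    data = sorted(values)
    # "inclusive" treats the data as the whole population, endpoints included.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return data[0], q1, median, q3, data[-1]

# Hypothetical sample values.
print(five_number_summary([2, 4, 4, 5, 7, 9, 12]))
```

Only these five numbers need ink; the box outline and axis frame are optional decoration.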
Spatial Visualization Techniques
• Information in the axes
Histogram with the y axis removed; the axis values are aligned with the
pre-attentive "white" lines through the data.
Spatial Visualization Techniques
• Sparklines
• Sparklines are examples of high data-ink ratios. They are typically a time series
and can be used to represent visually the sequence in a very dense and compact
manner. They may be small enough to just be included in the flow of the text
rather than having to refer to a separate figure.
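As a rough illustration of how compact a sparkline can be, the sketch below renders a series as a single string of Unicode block characters that can sit inline in text. The function name and the eight-level scale are assumptions made for this example.

```python
BLOCKS = "▁▂▃▄▅▆▇█"  # eight heights, lowest to highest

def sparkline(values):
    """Map each value to one of eight block characters by relative height."""
    low, high = min(values), max(values)
    span = high - low
    if span == 0:
        # A flat series is drawn at the lowest level.
        return BLOCKS[0] * len(values)
    top = len(BLOCKS) - 1
    return "".join(BLOCKS[round((v - low) / span * top)] for v in values)

# Hypothetical time series.
print(sparkline([1, 3, 5, 8, 4, 2, 6]))
```

Each data point costs exactly one character, which is what gives the sparkline its high data-ink ratio.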
Spatial Visualization Techniques
• One Dimensional Data as Spatial Data
• Time is now displayed as the x axis and the data values are the y axis
Spatial Visualization Techniques
• Two Dimensional Data as Spatial Data
• Mapping spatial attributes of the data to the screen.
• We really are working in three dimensions now.
• Two dimensions specify the location
• A third dimension is then plotted, possibly along with several other dimensions (see
height and color on the map below).
• Scatterplot -- discrete data values are mapped to a location (pixel or dot) and marked by
color, shape or size; result is 2D
• Image -- each point is mapped to a pixel location and intermediate pixels that are
unmapped are interpolated for color or brightness according to neighboring mapped
pixels; result is 2D; often referred to as a "heat map"
• Rubber sheet -- each point is mapped to an image pixel and it has a third value that
controls a height. Missing points are also interpolated to make a smooth surface. Result is
3D
Spatial Visualization Techniques
• 3D Data as spatial
• Visualizing the surface
• Visualizing the volume
Visualizing Geospatial Data on a Map
• Point map
A point map is one of the simplest ways to visualize geospatial data.
Basically, you place a point at any location on the map that corresponds
to the variable you’re trying to measure (such as a building, e.g. a
hospital).
Visualizing Geospatial Data on a Map
• Proportional symbol map
This is a variation of the point map. It uses a circle or other shape to
represent data at a particular location. However, based on the point's
size and/or color, it can be used to represent multiple other variables
at once (such as population and/or average age).
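A common way to size such symbols is to make the circle's area, not its radius, proportional to the value. The sketch below assumes that convention; the function name, the 20-pixel maximum radius, and the city figures are hypothetical.

```python
import math

def symbol_radius(value, max_value, max_radius=20.0):
    """Radius of a circle whose area is proportional to value."""
    if value < 0 or max_value <= 0:
        raise ValueError("value must be non-negative and max_value positive")
    # Area grows with radius squared, so take the square root of the ratio.
    return max_radius * math.sqrt(value / max_value)

# Hypothetical populations, in thousands.
populations = {"Town A": 100, "Town B": 25, "Town C": 400}
largest = max(populations.values())
for name, population in populations.items():
    print(name, round(symbol_radius(population, largest), 1))
```

Scaling the radius directly would make large values look far bigger than they are, since readers tend to compare areas.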
Visualizing Geospatial Data on a Map
• Cluster map
This is a proportional symbol map with a twist. It features a similar
concept of using points of varying sizes and colors to represent
multiple types of data at a location at once. However, these larger
points serve as stand-ins for smaller points, which become visible if
you increase the map’s scale. This gets around the main issue of
overcrowding in point maps, but requires special geospatial data
visualization tools such as GIS software.
Visualizing Geospatial Data on a Map
• Choropleth map
It’s made by separating the area being mapped, such as by geographic or
political boundaries, and then filling each resulting section with a
different color or shade.
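One simple way to pick the shade for each section is to bin the measured variable into equal-width classes. This is a sketch of that idea only; the class scheme, the SHADES names, and the region values are assumptions, and real maps often use quantile or natural-breaks classes instead.

```python
SHADES = ["lightest", "light", "medium", "dark"]  # hypothetical color ramp

def equal_interval_class(value, low, high, n_classes):
    """Return the 0-based class index of value using equal-width bins."""
    if high <= low:
        raise ValueError("high must be greater than low")
    width = (high - low) / n_classes
    index = int((value - low) / width)
    # Clamp so that value == high falls in the last class.
    return min(max(index, 0), n_classes - 1)

def shade_regions(values_by_region):
    """Assign a shade name to every region based on its value."""
    low = min(values_by_region.values())
    high = max(values_by_region.values())
    return {
        region: SHADES[equal_interval_class(value, low, high, len(SHADES))]
        for region, value in values_by_region.items()
    }

print(shade_regions({"North": 0, "Central": 50, "South": 100}))
```

If every region has the same value the function raises an error, since there is no range to divide.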
Visualizing Geospatial Data on a Map
• Cartogram map
This variation of the choropleth map is a hybrid of a map and a chart.
It involves taking a land area map of a geographic region and dividing
it into segments in such a way that sizes and/or distances are
proportional to the values of the variable being measured.
Visualizing Geospatial Data on a Map
• Hexagonal binning map
Visualizing Geospatial Data on a Map
• Heat map
Visualizing Geospatial Data on a Map
• Topographic map
Visualizing Geospatial Data on a Map
• Flow map
Flow maps, also known as ‘path’ maps, are more specialized versions of
line maps. Instead of focusing on physical features of the earth, they
are used to represent the movement of things across the earth over time.
Visualizing Geospatial Data on a Map
• Spider map
The spider map is a variation of the flow map. Instead of focusing on
discrete pairs of origin and destination data points, the spider map
looks at the relationships between origin points and multiple
destination points, some of which may be held in common.
Visualizing Geospatial Data on a Map
• Time-space distribution map
This is an advanced form of geospatial data mapping that combines the
precision of a point map with the dynamism of a flow map. It seeks to
accurately determine the locations of objects at any point in time as
they move.
Visualizing Geospatial Data on a Map
• Data space distribution map
This is another variant of the flow map that aims to not only represent
the movement of things over time, but also how variables dependent on
that movement change over time.
Time Oriented Visualizations
• Time can be simply viewed as linear and chosen as the x-axis in most
visualizations.
Time Oriented Visualizations
1. Scale
• How is time measured? When are the data measurements/samples
taken?
• Ordinal -- before, during, after
• Discrete -- clear intervals (seconds, minutes, hours.....)
• Continuous -- mapping to the real numbers. Discrete values can be
interpolated
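To make the last point concrete, the sketch below treats discrete samples as points on a continuous time scale and fills in a value between them by linear interpolation. Linear interpolation is only one possible choice, and the function name and sample readings are assumptions.

```python
def interpolate(samples, t):
    """Linearly interpolate a value at time t.

    samples is a list of (time, value) pairs with strictly increasing times.
    """
    if not samples[0][0] <= t <= samples[-1][0]:
        raise ValueError("t is outside the sampled range")
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        if t0 <= t <= t1:
            return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    return samples[0][1]  # only reached when there is a single sample

# Hypothetical readings taken every 10 minutes.
readings = [(0, 10.0), (10, 20.0), (20, 0.0)]
print(interpolate(readings, 5))
```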
Time Oriented Visualizations
2. Scope
• The range of time associated with a measurement/sample
• point -- the sample is from a point in time that has no duration
• interval-based -- there is a duration; a start and end
These time primitives can be anchored (absolute) or unanchored (relative)
• We can also recognize determinacy:
• determinate -- all aspects of time are known and fixed
• indeterminate -- there may be some uncertainty. Intervals are sometimes used here
to compensate.
Time Oriented Visualizations
3. Arrangement
Time often has a cyclical nature, compared to the linear nature described
above:
• hourly cycle
• 24 hours in a daily cycle
• 7 days in a weekly cycle (Mon->Tues....Sun->Mon)
• ~30 days in a monthly cycle
• lunar cycle
• quarterly/seasonal cycle (financial, astronomical, meteorological)
• 365 days, 52 weeks, 12 months in a yearly cycle
• Decades
The different units suggest granularity. How a visualization is
presented may vary (interactively) with granularity (zoom in, zoom out).
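A cyclic arrangement can be sketched by converting a timestamp into an angle around a circle, one circle per cycle. The example below does this for the daily and weekly cycles; the function names and the choice of radians starting at the cycle origin are assumptions for illustration.

```python
import math
from datetime import datetime

def cycle_angle(position, cycle_length):
    """Angle in radians of a position within a cycle (0 at the cycle start)."""
    return 2 * math.pi * (position % cycle_length) / cycle_length

def daily_and_weekly_angles(moment):
    """Return (angle in the 24-hour cycle, angle in the 7-day cycle)."""
    hour_of_day = moment.hour + moment.minute / 60
    # datetime.weekday() counts Monday as 0 and Sunday as 6.
    return cycle_angle(hour_of_day, 24), cycle_angle(moment.weekday(), 7)

print(daily_and_weekly_angles(datetime(2024, 1, 1, 6, 0)))
```

Plotting values at these angles places 23:00 next to 00:00 and Sunday next to Monday, which a linear axis cannot do.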
Characteristics of Time-Oriented Data
• This is more of a reminder of the data typing we have discussed
earlier in the course
Multivariate Data
• Univariate statistics summarize only one variable at a time. Bivariate
statistics compare two variables. Multivariate statistics compare more
than two variables.
• Multivariate visualizations can be done by adding more than
one visual variable to a simple renderer. Common combinations
include:
1. Color and size
2. Size and rotation
3. Size, rotation, and color
Scatter Plot
Correlation Matrix
Heatmap
Parallel Coordinate Plot
Bubble Chart
Graphs, Trees, and How to Visualize Them
• Here, graphs, networks, and trees are meant in the mathematical
sense: a model for representing items and the relationships between
those items
• Social / friendship networks
• Computer networks
• Energy or transportation grids
• Organizational structures
• Etc.
Node-link tree diagrams
• Nodes are distributed in space, connected by straight or curved lines
• Typical approach is to use 2D space to break apart breadth and depth
• Often, space is used to communicate hierarchical orientation
Tidy Tree
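The idea of using one axis for depth and the other for breadth can be sketched with a very small layout routine. This is a simplified layout in the spirit of a tidy tree, not the full tidy-tree algorithm: leaves are spaced evenly and each parent is centered over its first and last child. The data structure and names are assumptions.

```python
def layout_tree(tree, root):
    """Return {node: (x, depth)} for a tree given as {parent: [children]}."""
    positions = {}
    next_leaf_x = 0

    def place(node, depth):
        nonlocal next_leaf_x
        children = tree.get(node, [])
        if not children:
            # Leaves take the next free slot along the breadth axis.
            x = float(next_leaf_x)
            next_leaf_x += 1
        else:
            child_xs = [place(child, depth + 1) for child in children]
            # A parent sits midway between its first and last child.
            x = (child_xs[0] + child_xs[-1]) / 2
        positions[node] = (x, depth)
        return x

    place(root, 0)
    return positions

# Hypothetical organizational structure.
org = {"CEO": ["CTO", "CFO"], "CTO": ["Dev1", "Dev2"]}
for node, (x, depth) in layout_tree(org, "CEO").items():
    print(node, x, depth)
```

Drawing a line from each parent position to each child position then yields the node-link diagram.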
Text and Document Visualization
• Here we consider visualizing the text within a document, and collections of
documents which are likely related (corpus).
• Difficulty in analysis includes the loose structure, varied vocabulary, and
optional metadata such as author(s), date, modification dates, comments,
keywords, catalog codes, citations.
• Levels of text to be represented:
• Lexical level -- Simple grouping of characters into "tokens" which are typically words,
but word stems, phrases, word n-grams and character n-grams may be beneficial
• Syntactic level -- Parsing the purpose of each token (grammatical category, tense,
plurality) in the context of the phrase, sentence and paragraph
• Semantic level -- Extract meaning of the syntactic structure with the tokens using
fuller analysis of the context.
Vector Space Model
• Analyze the words in a document to determine their contribution and
significance to the document.
• Removal of noise words ("a", "an", "the", "that") and punctuation,
and stemming (collecting roots of words) are typical of preprocessing.
• The frequency counts of the significant words, ordered by decreasing
frequency, form a simple vector.
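The preprocessing and counting steps above can be sketched in a few lines. The stop-word set is just the four noise words named on the slide, stemming is left out to keep the example short, and the function name and sample sentence are assumptions.

```python
import re
from collections import Counter

STOP_WORDS = {"a", "an", "the", "that"}  # the noise words named above

def term_frequencies(text):
    """Lowercase, strip punctuation, drop stop words, and count the rest."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return Counter(token for token in tokens if token not in STOP_WORDS)

counts = term_frequencies("The cat saw the dog. The dog saw a bird.")
# most_common() lists terms by decreasing frequency, giving the simple vector.
print(counts.most_common())
```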
Vector Space Model
• https://wordcounter.net/
Term Frequency--Inverse Document Frequency
Mapping vector space models to the document
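The slide only names the technique, so the following is a sketch of one common variant rather than the deck's own formula: the weight of a term in a document is its relative frequency in that document multiplied by log(N / df), where N is the number of documents and df is the number of documents containing the term. The function name and sample documents are assumptions.

```python
import math
from collections import Counter

def tf_idf(documents):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(documents)
    # Document frequency: in how many documents does each term occur?
    doc_freq = Counter(term for doc in documents for term in set(doc))
    weights = []
    for doc in documents:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights

# Hypothetical, already tokenized documents.
docs = [["data", "chart", "data"], ["data", "map"]]
for weights in tf_idf(docs):
    print(weights)
```

A term that appears in every document, such as "data" here, gets weight zero, while terms specific to one document stand out.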
Single Document Visualization
• Tag clouds size each word according to its frequency. The example
again uses the opening Intro section.
• tagcrowd.com
Wordle
• Creates a visualization with size based on frequency.
• http://wordle.net
Wordle
Word Tree
• https://www.jasondavies.com/wordtree/
TextArc
Music theme visualization example
Literature fingerprinting
• Here we look at word n-grams to match the patterns of an author.
• An n-gram is a sequence of N words. For example, “Medium blog” is a
2-gram (a bigram), “Write on Medium” is a 3-gram (a trigram), and
“A Medium blog post” is a 4-gram. On their own, n-grams are simple;
they become more interesting once the probabilities associated
with them are considered.
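Extracting word n-grams takes only a sliding window over the token list. The sketch below splits on whitespace, which is an assumption; a real fingerprinting pipeline would likely normalize case and punctuation first.

```python
def word_ngrams(text, n):
    """Return the list of n-word sequences in text, in order."""
    words = text.split()
    # Slide a window of n words across the text; too-short texts give [].
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]

print(word_ngrams("A Medium blog post", 2))
```

Counting how often each n-gram occurs per author gives the frequency profile that the fingerprint is built from.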
Document Collection Visualizations
• Goal is to place similar documents close together.
• graph spring layouts,
• multi-dimensional scaling
• clustering (K-means, hierarchical)
• self-organizing maps
• Self-organizing maps -- use the vectors from each document to calculate distances from
each other. Higher weights draw the documents closer together. Randomly start with
one document.
• Stream Graph
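Placing similar documents close together first requires a measure of similarity between document vectors. Cosine similarity is a common choice and is assumed here; the slide does not say which measure is used. The function name and the sample term weights are hypothetical.

```python
import math

def cosine_similarity(vec_a, vec_b):
    """Cosine of the angle between two sparse {term: weight} vectors."""
    dot = sum(w * vec_b.get(term, 0.0) for term, w in vec_a.items())
    norm_a = math.sqrt(sum(w * w for w in vec_a.values()))
    norm_b = math.sqrt(sum(w * w for w in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is similar to nothing
    return dot / (norm_a * norm_b)

# Hypothetical term weights for three documents.
doc1 = {"map": 2.0, "point": 1.0}
doc2 = {"map": 1.0, "point": 1.0, "flow": 1.0}
doc3 = {"tree": 3.0}
print(cosine_similarity(doc1, doc2), cosine_similarity(doc1, doc3))
```

The resulting pairwise similarities (or the distances 1 - similarity) can then be handed to a layout method such as multi-dimensional scaling or clustering.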
Power Query & M Language
• Power Query is built on what was then a new query language called
M. It is a mashup language (hence the letter M) designed to create
queries that mix together data.
• 12 Methods for Visualizing Geospatial Data on a Map | SafeGraph
• Time Oriented Visualizations (juniata.edu)

More Related Content

Similar to Unit III.pptx

Diagramatic and graphical representation of data Notes on Statistics.ppt
Diagramatic and graphical representation of data Notes on Statistics.pptDiagramatic and graphical representation of data Notes on Statistics.ppt
Diagramatic and graphical representation of data Notes on Statistics.ppt
aigil2
 
Data visualization is the representation of data through use of common graphi...
Data visualization is the representation of data through use of common graphi...Data visualization is the representation of data through use of common graphi...
Data visualization is the representation of data through use of common graphi...
samarpeetnandanwar21
 
Visual Analytics in Big Data
Visual Analytics in Big DataVisual Analytics in Big Data
Visual Analytics in Big Data
Saurabh Shanbhag
 
Advantages and Limitations for Diagrams and Graphs
Advantages and Limitations for Diagrams and GraphsAdvantages and Limitations for Diagrams and Graphs
Advantages and Limitations for Diagrams and Graphs
Hardik Bhaavani
 
Organizational Data Analysis by Mr Mumba.pptx
Organizational Data Analysis by Mr Mumba.pptxOrganizational Data Analysis by Mr Mumba.pptx
Organizational Data Analysis by Mr Mumba.pptx
bentrym2
 
DAV Seperate, Align, Staked.pptx
DAV Seperate, Align, Staked.pptxDAV Seperate, Align, Staked.pptx
DAV Seperate, Align, Staked.pptx
SamirSitaula1
 
Datamining data visualization
Datamining data visualizationDatamining data visualization
Datamining data visualization
Asterite
 
Presentation de la DATA visualisation.pptx
Presentation de la DATA visualisation.pptxPresentation de la DATA visualisation.pptx
Presentation de la DATA visualisation.pptx
salmakoummich
 
RM UNIT 6.pptx
RM UNIT 6.pptxRM UNIT 6.pptx
RM UNIT 6.pptx
PallawiBulakh1
 
Data Visualization in Data Science
Data Visualization in Data ScienceData Visualization in Data Science
Data Visualization in Data Science
Maloy Manna, PMP®
 
Data visualization.pptx
Data visualization.pptxData visualization.pptx
Data visualization.pptx
naveen shyam
 
Data Visualization
Data VisualizationData Visualization
Data Visualization
simonwandrew
 
Data Visualization.pptx
Data Visualization.pptxData Visualization.pptx
Data Visualization.pptx
Shreenidhi bhat
 
Unit 2_ Descriptive Analytics for MBA .pptx
Unit 2_ Descriptive Analytics for MBA .pptxUnit 2_ Descriptive Analytics for MBA .pptx
Unit 2_ Descriptive Analytics for MBA .pptx
JANNU VINAY
 
DATA GRAPHICS 8th Sem.pdf
DATA GRAPHICS 8th Sem.pdfDATA GRAPHICS 8th Sem.pdf
DATA GRAPHICS 8th Sem.pdf
Ravinandan A P
 
Exploring Data (1).pptx
Exploring Data (1).pptxExploring Data (1).pptx
Exploring Data (1).pptx
gina458018
 
DATA INTERPRETATION.pdf
DATA INTERPRETATION.pdfDATA INTERPRETATION.pdf
DATA INTERPRETATION.pdf
DebmalyaGhosh20
 
Data Visualization.pptx
Data Visualization.pptxData Visualization.pptx
Data Visualization.pptx
Ultimate Multimedia Consult
 
Bigdata
BigdataBigdata
Now you see it
Now you see itNow you see it
Now you see it
Gang Tao
 

Similar to Unit III.pptx (20)

Diagramatic and graphical representation of data Notes on Statistics.ppt
Diagramatic and graphical representation of data Notes on Statistics.pptDiagramatic and graphical representation of data Notes on Statistics.ppt
Diagramatic and graphical representation of data Notes on Statistics.ppt
 
Data visualization is the representation of data through use of common graphi...
Data visualization is the representation of data through use of common graphi...Data visualization is the representation of data through use of common graphi...
Data visualization is the representation of data through use of common graphi...
 
Visual Analytics in Big Data
Visual Analytics in Big DataVisual Analytics in Big Data
Visual Analytics in Big Data
 
Advantages and Limitations for Diagrams and Graphs
Advantages and Limitations for Diagrams and GraphsAdvantages and Limitations for Diagrams and Graphs
Advantages and Limitations for Diagrams and Graphs
 
Organizational Data Analysis by Mr Mumba.pptx
Organizational Data Analysis by Mr Mumba.pptxOrganizational Data Analysis by Mr Mumba.pptx
Organizational Data Analysis by Mr Mumba.pptx
 
DAV Seperate, Align, Staked.pptx
DAV Seperate, Align, Staked.pptxDAV Seperate, Align, Staked.pptx
DAV Seperate, Align, Staked.pptx
 
Datamining data visualization
Datamining data visualizationDatamining data visualization
Datamining data visualization
 
Presentation de la DATA visualisation.pptx
Presentation de la DATA visualisation.pptxPresentation de la DATA visualisation.pptx
Presentation de la DATA visualisation.pptx
 
RM UNIT 6.pptx
RM UNIT 6.pptxRM UNIT 6.pptx
RM UNIT 6.pptx
 
Data Visualization in Data Science
Data Visualization in Data ScienceData Visualization in Data Science
Data Visualization in Data Science
 
Data visualization.pptx
Data visualization.pptxData visualization.pptx
Data visualization.pptx
 
Data Visualization
Data VisualizationData Visualization
Data Visualization
 
Data Visualization.pptx
Data Visualization.pptxData Visualization.pptx
Data Visualization.pptx
 
Unit 2_ Descriptive Analytics for MBA .pptx
Unit 2_ Descriptive Analytics for MBA .pptxUnit 2_ Descriptive Analytics for MBA .pptx
Unit 2_ Descriptive Analytics for MBA .pptx
 
DATA GRAPHICS 8th Sem.pdf
DATA GRAPHICS 8th Sem.pdfDATA GRAPHICS 8th Sem.pdf
DATA GRAPHICS 8th Sem.pdf
 
Exploring Data (1).pptx
Exploring Data (1).pptxExploring Data (1).pptx
Exploring Data (1).pptx
 
DATA INTERPRETATION.pdf
DATA INTERPRETATION.pdfDATA INTERPRETATION.pdf
DATA INTERPRETATION.pdf
 
Data Visualization.pptx
Data Visualization.pptxData Visualization.pptx
Data Visualization.pptx
 
Bigdata
BigdataBigdata
Bigdata
 
Now you see it
Now you see itNow you see it
Now you see it
 

Recently uploaded

一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单
ewymefz
 
FP Growth Algorithm and its Applications
FP Growth Algorithm and its ApplicationsFP Growth Algorithm and its Applications
FP Growth Algorithm and its Applications
MaleehaSheikh2
 
Opendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptxOpendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptx
Opendatabay
 
社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .
NABLAS株式会社
 
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
ewymefz
 
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
AbhimanyuSinha9
 
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
nscud
 
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
yhkoc
 
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
axoqas
 
Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)
TravisMalana
 
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
vcaxypu
 
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Subhajit Sahu
 
一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单
enxupq
 
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
pchutichetpong
 
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
ukgaet
 
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
slg6lamcq
 
standardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghhstandardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghh
ArpitMalhotra16
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP
 
Q1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year ReboundQ1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year Rebound
Oppotus
 
一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单
ewymefz
 

Recently uploaded (20)

一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单
 
FP Growth Algorithm and its Applications
FP Growth Algorithm and its ApplicationsFP Growth Algorithm and its Applications
FP Growth Algorithm and its Applications
 
Opendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptxOpendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptx
 
社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .
 
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
 
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
 
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
 
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
 
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
 
Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)
 
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
 
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
 
一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单
 
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
 
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
 
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
 
standardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghhstandardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghh
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
 
Q1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year ReboundQ1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year Rebound
 
一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单
 

Unit III.pptx

  • 2. Data Visualization • As data and insights grow in number, a new requirement is the ability of the executives and decision makers to absorb this information in real time. • There is a limit to human comprehension and visualization capacity. • That is a good reason to prioritize and manage with fewer but key variables that relate directly to the Key Result Areas (KRAs) of a role.
  • 3. Data Visualization • Data visualization is the graphical representation of information and data. By using visual elements like charts, graphs, and maps, data visualization tools provide an accessible way to see and understand trends, outliers, and patterns in data. • Additionally, it provides an excellent way for employees or business owners to present data to non-technical audiences without confusion. • In the world of Big Data, data visualization tools and technologies are essential to analyze massive amounts of information and make data-driven decisions.
  • 4. Considerations • Here are a few considerations when presenting data: 1. Present the conclusions and not just report the data. 2. Choose wisely from a palette of graphs to suit the data. 3. Organize the results to make the central point stand out. 4. Ensure that the visuals accurately reflect the numbers. Inappropriate visuals can create misinterpretations and misunderstandings. 5. Make the presentation unique, imaginative and memorable.
  • 6. History • A classic example is the presentation of the story of Napoleon’s march to Russia in 1812, by French cartographer Charles Joseph Minard. • It covers about six dimensions. • Time is on the horizontal axis. The geographical coordinates and rivers are mapped in. The thickness of the bar shows the number of troops at each mapped point in time. One color is used for the onward march and another for the retreat. The temperature at each time is shown in the line graph at the bottom.
  • 9. Advantages of data visualization • Share information easily. • Explore opportunities interactively. • Visualize patterns and relationships.
  • 10. Disadvantages • Biased or inaccurate information. • Correlation doesn’t always mean causation. • Core messages can get lost in translation.
  • 11. Why is data visualization important? • It helps people see, interact with, and better understand data. Whether simple or complex, the right visualization can bring everyone onto the same page, regardless of their level of expertise. • Every STEM field benefits from understanding data, and so do fields in government, finance, marketing, history, consumer goods, service industries, education, sports, and so on. • Data visualization is one of the steps of the data science process, which states that after data has been collected, processed and modeled, it must be visualized for conclusions to be made.
  • 12. Data Science • While both data science and data analytics involve working with data to gain insights, data science often involves using data to build models that can predict future outcomes, while data analytics tends to focus more on analyzing past data to inform decisions in the present. • Data science makes use of machine learning algorithms to get insights. Data analytics typically relies on descriptive and statistical analysis rather than machine learning.
  • 15. Why is data visualization important? • Data Visualization Discovers the Trends in Data
  • 16. Why is data visualization important? • Data Visualization Provides a Perspective on the Data
  • 17. Why is data visualization important? • Data Visualization Puts the Data into the Correct Context
  • 18. Why is data visualization important? • Data Visualization Saves Time
  • 19. Why is data visualization important? • Data Visualization Tells a Data Story
  • 20. General Types of Visualizations • Chart: Information presented in a tabular, graphical form with data displayed along two axes. Can be in the form of a graph, diagram, or map. • Table: A set of figures displayed in rows and columns. • Graph: A diagram of points, lines, segments, curves, or areas that represents certain variables in comparison to each other, usually along two axes at a right angle. • Geospatial: A visualization that shows data in map form using different shapes and colors to show the relationship between pieces of data and specific locations. • Infographic: A combination of visuals and words that represent data. Usually uses charts or diagrams. • Dashboards: A collection of visualizations and data displayed in one place to help with analyzing and presenting data.
  • 21. Categories of Data Visualization
  • 22. Numerical Data • Numerical data is also known as quantitative data. It represents amounts, such as a person’s height, weight, or age. Numerical data falls into two categories: • Continuous data – can take any value within a range and can be measured to finer and finer precision (Example: height measurements). • Discrete data – takes only separate, countable values (Example: the number of cars or children a household has). • Numerical data is typically visualized with charts and numerical values. Examples are pie charts, bar charts, averages, scorecards, etc.
  • 23. Categorical Data • Categorical data is also known as qualitative data. It represents groups, and consists of categorical variables that describe characteristics such as a person’s ranking or gender. Categorical data visualization is all about depicting key themes, establishing connections, and lending context. Categorical data is classified into three categories: • Binary data – classification into exactly two options (Example: agrees or disagrees). • Nominal data – classification based on attributes with no inherent order (Example: male or female). • Ordinal data – classification based on the ordering of information (Example: timeline or processes). • Categorical data is typically visualized with graphics, diagrams, and flowcharts. Examples are word clouds, sentiment mapping, Venn diagrams, etc.
  • 24. Top Data Visualization Tools • The following are the 10 best Data Visualization Tools • Tableau • Looker • Zoho Analytics • Sisense • IBM Cognos Analytics • Qlik Sense • Domo • Microsoft Power BI • Klipfolio • SAP Analytics Cloud
  • 25. Spatial Visualization Techniques • Univariate data -- one-dimensional data • A single value can be displayed • as the number itself -- a string of digits • as a dial (such as an altimeter, speedometer, or gauge) • as a slider or thermometer
  • 26. Spatial Visualization Techniques • Maximizing the data-ink ratio: use the least amount of "ink" (non-background pixels) and leverage our pre-attentive vision to fill in the area. A Tukey plot as typically presented is shown on the left, and a revised, minimized plot on the right (or below):
  • 27. Spatial Visualization Techniques • Information in the axes: in this histogram the y axis is removed; the axis values are aligned with the pre-attentive "white" lines through the data
  • 28. Spatial Visualization Techniques • Sparklines • Sparklines are examples of high data-ink ratios. They are typically a time series and can be used to represent visually the sequence in a very dense and compact manner. They may be small enough to just be included in the flow of the text rather than having to refer to a separate figure.
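A sparkline small enough to sit in the flow of text can be sketched with Unicode block characters. This is a minimal illustration only; the function name `sparkline` and the choice of eight bar levels are assumptions made for the example, not something taken from the slides.

```python
# Unicode block characters, ordered from the lowest to the highest bar.
BARS = "▁▂▃▄▅▆▇█"

def sparkline(values):
    """Return a one-line text sparkline for a sequence of numbers."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # A flat series: draw every point at the lowest level.
        return BARS[0] * len(values)
    top = len(BARS) - 1
    # Scale each value onto an index 0..top into BARS.
    return "".join(BARS[int((v - lo) * top / span)] for v in values)

# Hypothetical time series, e.g. daily counts over one week.
print(sparkline([1, 5, 22, 13, 5, 8, 30]))
```

Because the result is a plain string, it can be embedded directly in a sentence or a table cell.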
  • 30. Spatial Visualization Techniques • One Dimensional Data as Spatial Data • Time is now displayed as the x axis and the data values are the y axis
  • 31. Spatial Visualization Techniques • Two Dimensional Data as Spatial Data • Mapping spatial attributes of the data to the screen. • We really are working in three dimensions now. • Two dimensions specify the location • A third dimension is then plotted, possibly together with several other dimensions (see height and color on the map below). • Scatterplot -- discrete data values are mapped to a location (pixel or dot) and marked by color, shape or size; result is 2D • Image -- each point is mapped to a pixel location, and intermediate pixels that are unmapped are interpolated for color or brightness according to neighboring mapped pixels; result is 2D; often referred to as a "heat map" • Rubber sheet -- each point is mapped to an image pixel and has a third value that controls a height. Missing points are also interpolated to make a smooth surface. Result is 3D
  • 33. Spatial Visualization Techniques • 3D Data as spatial • Visualizing the surface • Visualizing the volume
  • 35. Visualizing Geospatial Data on a Map • 1. Point map A point map is one of the simplest ways to visualize geospatial data. Basically, you place a point at any location on the map that corresponds to the variable you’re trying to measure (such as a building, e.g. a hospital).
  • 36. Visualizing Geospatial Data on a Map • Proportional symbol map This is a variation of the point map. It uses a circle or other shape to represent data at a particular location. However, based on the point's size and/or color, it can be used to represent multiple other variables at once (such as population and/or average age).
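One way to size the symbols is sketched below. It assumes the common cartographic convention that the circle's area, not its radius, should be proportional to the value, so the radius grows with the square root. The function name `symbol_radius` and the default `max_radius` are hypothetical choices for this example.

```python
import math

def symbol_radius(value, max_value, max_radius=20.0):
    """Radius such that the circle's AREA is proportional to value.

    max_value is the largest value in the data set; it is drawn
    with radius max_radius (in pixels, an assumed unit).
    """
    if max_value <= 0:
        raise ValueError("max_value must be positive")
    # Area ~ r^2, so r scales with the square root of the value.
    return max_radius * math.sqrt(max(value, 0) / max_value)

# Hypothetical city populations, in thousands.
for city, pop in {"A": 100, "B": 25, "C": 4}.items():
    print(city, symbol_radius(pop, max_value=100))
```

Scaling the radius linearly instead would make large values look disproportionately big, since the eye tends to compare areas.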
  • 37. Visualizing Geospatial Data on a Map • Cluster map This is a proportional symbol map with a twist. It features a similar concept of using points of varying sizes and colors to represent multiple types of data at a location at once. However, these larger points serve as stand-ins for smaller points, which become visible if you increase the map’s scale. This gets around the main issue of overcrowding in point maps, but requires special geospatial data visualization tools such as GIS software.
  • 38. Visualizing Geospatial Data on a Map • Choropleth map It’s made by separating the area being mapped, such as by geographic or political boundaries, and then filling each resulting section with a different color or shade.
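Before each section can be filled with a shade, its value has to be assigned to a color class. The sketch below uses equal-interval classes, which is only one of several possible schemes and is an assumption here; the region names and the function `equal_interval_classes` are made up for illustration.

```python
def equal_interval_classes(values, k=4):
    """Map each region to a class index 0..k-1 using equal-width bins.

    values: dict of region name -> numeric value.
    """
    lo, hi = min(values.values()), max(values.values())
    width = (hi - lo) / k
    classes = {}
    for region, v in values.items():
        if width == 0:
            idx = 0  # all regions share one value
        else:
            # The maximum value would land in bin k, so clamp it to k-1.
            idx = min(int((v - lo) / width), k - 1)
        classes[region] = idx
    return classes

# Hypothetical values per region; class 0 is the lightest shade.
print(equal_interval_classes({"North": 0, "Centre": 50, "South": 100}))
```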
  • 39. Visualizing Geospatial Data on a Map • Cartogram map This variation of the choropleth map is a hybrid of a map and a chart. It involves taking a land area map of a geographic region and dividing it into segments in such a way that sizes and/or distances are proportional to the values of the variable being measured.
  • 40. Visualizing Geospatial Data on a Map • Hexagonal binning map
  • 41. Visualizing Geospatial Data on a Map • Heat map
  • 42. Visualizing Geospatial Data on a Map • Topographic map
  • 43. Visualizing Geospatial Data on a Map • Flow map Flow maps, also known as ‘path’ maps, are more specialized versions of line maps. Instead of focusing on physical features of the earth, they are used to represent the movement of things across the earth over time.
  • 44. Visualizing Geospatial Data on a Map • Spider map The spider map is a variation of the flow map. Instead of focusing on discrete pairs of origin and destination data points, the spider map looks at the relationships between origin points and multiple destination points – some of which may be held in common.
  • 45. Visualizing Geospatial Data on a Map • Time-space distribution map This is an advanced form of geospatial data mapping that combines the precision of a point map with the dynamism of a flow map. It seeks to accurately determine the locations of objects at any point in time as they move.
  • 46. Visualizing Geospatial Data on a Map • Data space distribution map This is another variant of the flow map that aims to not only represent the movement of things over time, but also how variables dependent on that movement change over time.
  • 48. Time Oriented Visualizations • Time can be simply viewed as linear and chosen as the x-axis in most visualizations.
  • 49. Time Oriented Visualizations 1. Scale • How is time measured? When are the data measurements/samples taken? • Ordinal -- before, during, after • Discrete -- clear intervals (seconds, minutes, hours.....) • Continuous -- mapping to the real numbers. Discrete values can be interpolated
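The remark that discrete values can be interpolated can be illustrated with simple linear interpolation between neighboring samples. This is a minimal sketch; the function name `interpolate` and the decision to clamp outside the sampled range are assumptions for the example.

```python
def interpolate(samples, t):
    """Linearly interpolate a value at time t.

    samples: list of (time, value) pairs sorted by time.
    Times outside the sampled range return the nearest end value.
    """
    if t <= samples[0][0]:
        return samples[0][1]
    if t >= samples[-1][0]:
        return samples[-1][1]
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        if t0 <= t <= t1:
            # Weight the two neighbors by how far t lies between them.
            return v0 + (v1 - v0) * (t - t0) / (t1 - t0)

# Hypothetical readings taken every 10 minutes.
readings = [(0, 10.0), (10, 20.0), (20, 18.0)]
print(interpolate(readings, 5))
```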
  • 50. Time Oriented Visualizations 2. Scope • The range of time associated with a measurement/sample • point -- the sample is from a point in time that has no duration • interval-based -- there is a duration; a start and an end. These time primitives can be anchored (absolute) or unanchored (relative) • We can also recognize determinacy: • determinate -- all aspects of time are known and fixed • indeterminate -- there may be some uncertainty. Intervals are sometimes used here to compensate.
  • 52. Time Oriented Visualizations 3. Arrangement Time often has a cyclical nature, in contrast to the linear nature described above: • hourly cycle • 24 hours in a daily cycle • 7 days in a weekly cycle (Mon->Tues....Sun->Mon) • ~30 days in a monthly cycle • lunar cycle • quarterly/seasonal cycle (financial, astronomical, meteorological) • 365 days, 52 weeks, 12 months in a yearly cycle • Decades. The different units suggest granularity. How you render a visualization may vary (interactively) with granularity (zoom in, zoom out)
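To plot data against one of these cycles, each timestamp first has to be reduced to its position within the cycle. The sketch below does this with Python's standard `datetime` module; the function name `cycle_positions` and the dictionary keys are illustrative assumptions.

```python
from datetime import datetime

def cycle_positions(ts):
    """Return the position of a timestamp within several common cycles."""
    return {
        "hour_of_day": ts.hour,              # 0..23 in the daily cycle
        "day_of_week": ts.weekday(),         # 0 = Monday .. 6 = Sunday
        "month_of_year": ts.month,           # 1..12 in the yearly cycle
        "quarter": (ts.month - 1) // 3 + 1,  # 1..4
    }

# Hypothetical measurement time.
print(cycle_positions(datetime(2024, 3, 15, 14, 30)))
```

Grouping samples by one of these keys (for example `hour_of_day`) gives the data for a cyclic view, while the raw timestamp keeps the linear view.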
  • 54. Characteristics of Time-Oriented Data • This is more of a reminder of the data typing we have discussed earlier in the course
  • 56. Multivariate Data • Univariate statistics summarize only one variable at a time. Bivariate statistics compare two variables. Multivariate statistics compare more than two variables. • Multivariate visualizations can be done by adding more than one visual variable to a simple renderer. Common combinations include: 1. Color and size 2. Size and rotation 3. Size, rotation, and color
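The "color and size" combination can be sketched as a function that maps two attributes of a record onto two visual variables of a marker. Everything here is a hypothetical illustration: the field names `population` and `median_age`, the size range of 4 to 20, and the red color ramp are assumptions, not part of any particular renderer.

```python
def to_marker(record, max_pop, max_age):
    """Encode two variables of one record as marker size and color."""
    # Variable 1 -> size: between 4 and 20 (assumed pixel units).
    size = 4 + 16 * record["population"] / max_pop
    # Variable 2 -> color: darker to brighter red as the value grows.
    shade = round(255 * record["median_age"] / max_age)
    return {"size": size, "color": f"#{shade:02x}0000"}

# Hypothetical record for one location.
print(to_marker({"population": 50, "median_age": 40}, max_pop=100, max_age=40))
```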
  • 61. Graphs, Trees, and How to Visualize Them
  • 62. Graphs, Trees, and How to Visualize Them • Let’s instead talk about graphs, networks, & trees in the mathematical sense: a model for representing items and the relationships between those items • Social / friendship networks • Computer networks • Energy or transportation grids • Organizational structures • Etc.
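In the mathematical sense, such a model is just a set of items (nodes) and a set of relationships (edges). A minimal sketch of a friendship network stored as an adjacency list follows; the names and the helper `build_adjacency` are made up for the example, and the edges are assumed to be undirected.

```python
from collections import defaultdict

def build_adjacency(edges):
    """Build an undirected adjacency list from (node, node) pairs."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)  # friendship goes both ways
    return adj

# Hypothetical friendship network.
friends = build_adjacency([("Ann", "Bob"), ("Bob", "Cy"), ("Cy", "Dee")])
for person in sorted(friends):
    # The degree (number of neighbors) is a common value to map to node size.
    print(person, sorted(friends[person]), len(friends[person]))
```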
  • 66. Node-link tree diagrams • Nodes are distributed in space, connected by straight or curved lines • Typical approach is to use 2D space to break apart breadth and depth • Often, space is used to communicate hierarchical orientation
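The idea of using 2D space to separate breadth and depth can be sketched with a very simple layout: depth becomes the y coordinate, leaves are spread along x in order, and each parent is centered over its children. This is a simplified illustration under those assumptions, not a production layout algorithm; the tree and the function name `layout` are hypothetical.

```python
def layout(tree, root):
    """Return node -> (x, y) for a tree given as node -> list of children."""
    pos = {}
    next_leaf_x = 0

    def place(node, depth):
        nonlocal next_leaf_x
        children = tree.get(node, [])
        if not children:
            # Leaves take the next free slot along the breadth axis.
            x = next_leaf_x
            next_leaf_x += 1
        else:
            xs = [place(child, depth + 1) for child in children]
            # A parent sits midway between its first and last child.
            x = (xs[0] + xs[-1]) / 2
        pos[node] = (x, depth)  # y encodes depth in the hierarchy
        return x

    place(root, 0)
    return pos

# Hypothetical organizational structure.
org = {"CEO": ["CTO", "CFO"], "CTO": ["Dev1", "Dev2"]}
print(layout(org, "CEO"))
```

Drawing a line from each parent position to each child position then gives the node-link diagram.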
  • 73. Text and Document Visualization
  • 74. Text and Document Visualization • Here we consider visualizing the text within a document, and collections of documents which are likely related (corpus). • Difficulty in analysis includes the loose structure, varied vocabulary, and optional metadata such as author(s), date, modification dates, comments, keywords, catalog codes, citations. • Levels of text to be represented: • Lexical level -- Simple grouping of characters into "tokens" which are typically words, but word stems, phrases, word n-grams and character n-grams may be beneficial • Syntactic level --Parsing purpose of token, grammatical category, tense, plurality, in the context of the phrase, sentence and paragraph • Semantic level -- Extract meaning of the syntactic structure with the tokens using fuller analysis of the context.
  • 75. Vector Space Model • Analyze the words in a document and determine their contribution and significance to the document. • Typical preprocessing includes removal of noise words ("a", "an", "the", "that") and punctuation, and stemming (reducing words to their roots). • A list of frequency counts of the significant words, ordered by decreasing frequency, is a simple vector.
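The preprocessing and counting steps can be sketched in a few lines. This is a minimal illustration: the stop-word set contains only the four noise words named on the slide, stemming is left out, and the function name `term_vector` is an assumption for the example.

```python
import re
from collections import Counter

# Only the noise words mentioned above; a real list would be much longer.
STOP_WORDS = {"a", "an", "the", "that"}

def term_vector(text):
    """Return (word, count) pairs ordered by decreasing frequency."""
    # Lowercase and keep only runs of letters, which drops punctuation.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(t for t in tokens if t not in STOP_WORDS)
    return counts.most_common()

print(term_vector("The cat saw the cat."))
```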
  • 76. Vector Space Model • https://wordcounter.net/
  • 79. Mapping vector space models to the document
  • 80. Single Document Visualization • Tag Clouds visualizes the words by size based on frequency. Again this is the opening Intro section. • tagcrowd.com
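Sizing words by frequency can be sketched as a linear mapping from word counts to font sizes. The point-size range and the function name `tag_sizes` are assumptions for this example; tools such as the one linked above may use a different scaling.

```python
def tag_sizes(counts, min_pt=10, max_pt=40):
    """Map word -> count to word -> font size in points (linear scale)."""
    lo, hi = min(counts.values()), max(counts.values())
    if hi == lo:
        # All words equally frequent: use one size for everything.
        return {word: min_pt for word in counts}
    scale = (max_pt - min_pt) / (hi - lo)
    return {word: min_pt + (c - lo) * scale for word, c in counts.items()}

# Hypothetical word counts from one document.
print(tag_sizes({"data": 5, "map": 1, "time": 3}))
```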
  • 81. Wordle • Creates a visualization with size based on frequency. • http://wordle.net
  • 86. Literature fingerprinting • Here we look at word n-grams to match patterns of the author. • An n-gram is a sequence of N words. For example, “Medium blog” is a 2-gram (a bigram), “Write on Medium” is a 3-gram (a trigram), and “A Medium blog post” is a 4-gram. Beyond the definition itself, the interesting part is the probability model used with n-grams.
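Extracting word n-grams from a text is a short sliding-window operation, sketched below. Splitting on whitespace is a simplifying assumption, and the function name `ngrams` is chosen for the example.

```python
def ngrams(text, n):
    """Return all word n-grams of a text as tuples, in order."""
    words = text.split()
    # Slide a window of n words across the text.
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]

print(ngrams("A Medium blog post", 2))
print(ngrams("A Medium blog post", 4))
```

Counting how often each n-gram occurs per author gives the kind of pattern that fingerprinting compares.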
  • 88. Document Collection Visualizations • Goal is to place similar documents close together. • graph spring layouts, • multi-dimensional scaling • clustering (K-means, hierarchical) • self-organizing maps • Self-organizing maps -- use the vectors from each document to calculate distances from each other. Higher weights draw the documents closer together. Randomly start with one document.
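All of these layout methods need some measure of how similar two documents are. One common choice, assumed here for illustration and not prescribed by the slides, is the cosine similarity between word-count vectors. The sample texts and the function name `cosine_similarity` are hypothetical.

```python
import math
from collections import Counter

def cosine_similarity(a, b):
    """Cosine similarity of two word-count vectors (dicts or Counters)."""
    # Only words present in both documents contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is similar to nothing
    return dot / (norm_a * norm_b)

doc1 = Counter("data map data".split())
doc2 = Counter("data map chart".split())
doc3 = Counter("tree graph node".split())
print(cosine_similarity(doc1, doc2))  # shares words, so greater than 0
print(cosine_similarity(doc1, doc3))  # no shared words
```

A layout method would then place pairs with high similarity close together and pairs with low similarity far apart.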
  • 92. Power Query & M Language
  • 93. Power Query & M Language • Power Query is built on what was then a new query language called M. It is a mashup language (hence the letter M) designed to create queries that mix together data.
  • 94. • 12 Methods for Visualizing Geospatial Data on a Map | SafeGraph • Time Oriented Visualizations (juniata.edu)