2. >By visualizing information, we turn it into a landscape
that you can explore with your eyes.
>The human brain is wired to respond much faster to
images than to words.
>It is often claimed that we process visual information up
to 60,000 times faster than text.
Hence, if you want your audience to comprehend the
data, it usually helps to visualize it. Data visualization
is therefore a key skill in a data scientist's toolbox.
3. • Imagine that one day traffic lights suddenly have no
colors; instead, they show only words, all in the same color. This
would create a lot of chaos. To understand this better, look at
the example below.
4. General types of data visualization
•Charts
•Graphs
•Maps
•Infographics
5. Best Practices in Visualization
Know your audience and objectives
What is the impact?
Use the right flow chart or decision tree to
solve a problem
e.g., https://www.statsflowchart.co.uk/
Emphasize the important points
Use the right colors
6. WHAT IS GOOD DATA VISUALIZATION?
Meets the audience’s needs
Communicates clearly
Provides a gradual learning curve
Adds value to the data
Tells the truth
7. Bad visualization example
Horizontal bar charts suffer from the same issue as pie charts: once there are too many categories,
you run out of space to include text and it becomes hard to digest:
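One common remedy is to keep only the largest categories and fold the rest into a single "Other" bar. The sketch below illustrates that idea using only the Python standard library. The function names (`collapse_to_top_n`, `text_bar_chart`) and the sales figures are made up for illustration and do not come from the original slides.

```python
def collapse_to_top_n(counts, n=5, other_label="Other"):
    """Keep the n largest categories and sum the rest into one bucket."""
    ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    top = ranked[:n]
    rest = sum(value for _, value in ranked[n:])
    if rest > 0:
        top.append((other_label, rest))
    return top


def text_bar_chart(rows, width=40):
    """Render (label, value) rows as a plain-text horizontal bar chart."""
    largest = max(value for _, value in rows)
    label_width = max(len(label) for label, _ in rows)
    lines = []
    for label, value in rows:
        # Scale each bar relative to the largest value; show at least one mark.
        bar = "#" * max(1, round(width * value / largest))
        lines.append(f"{label.ljust(label_width)} | {bar} {value}")
    return "\n".join(lines)


# Hypothetical data: units sold per product line (invented for this example).
sales = {
    "Laptops": 120, "Phones": 95, "Tablets": 60, "Monitors": 22,
    "Keyboards": 15, "Mice": 12, "Webcams": 9, "Cables": 7, "Docks": 4,
}

# Nine bars become four: the top three categories plus "Other".
print(text_bar_chart(collapse_to_top_n(sales, n=3)))
```

The same collapsed rows could be passed to any charting tool; the point is that the reduction happens before plotting, so the labels that remain have room to be read.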
8. Use of Data Visualization
Descriptive analytics - What has happened?
Diagnostic analytics - Why has this happened?
Predictive analytics - What will happen?
Prescriptive analytics - What actions should you take?
9. Follow Data Privacy Regulations & Data Ethics
• 1. Consent
• 2. Clarity
• 3. Consistency
• 4. Control (and Transparency)
• 5. Consequences (and Harm)
10. Popular visualizations and BI tools
Excel, Tableau, Power BI, MicroStrategy,
Grafana, Qlik, Superset, D3.js,
Google Charts, seaborn, Matplotlib, Plotly
11. Some Real-Time Data Stories
The Gramener cricket analysis is a data-driven
analysis that uses machine learning and artificial
intelligence to identify trends and patterns in
cricket data.
The analysis is based on a wide range of data,
including player performance data, team
performance data, and match data.
https://gramener.com/cricket/
12. Singapore Circle Line Rogue train interference issue
https://www.tech.gov.sg/media/technews/rogue-train-a-big-data-story
The rogue train problem occurred on Singapore's Circle Line MRT in 2016. A
train (PV46) was emitting a signal that was jamming the signalling mechanism of the tracks, somehow
affecting both the trains behind it and trains travelling in the opposite direction. This caused delays and
disruptions across the line.
To solve this problem, GovTech data scientists used big data visualization to analyze incident logs and other
data. They created a Marey chart, which is a type of line chart that shows how the location of a train changes
over time. The chart showed that the data points indicating each incident were fairly random and spread out,
but there was a straight line that could be drawn through some of them. This line represented the path of the
rogue train.
By connecting the dots, the data scientists were able to identify the rogue train and its location. This
information was then used to stop the train and fix the problem.
Big data visualization was essential to solving the rogue train problem because it allowed the data scientists
to see patterns in the data that would have been difficult to see otherwise. The Marey chart provided a clear
and concise way to visualize the data, and it made it easy to identify the rogue train.
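The "connect the dots" idea can be sketched in a few lines of Python. This is a simplified illustration, not the GovTech team's actual code: it assumes each train's path is known as a list of (time, position) points, as on a Marey chart, and counts how many incidents fall close to each train's line. The train IDs, schedules, incident points, and the `tolerance` parameter are all invented for the example.

```python
def position_at(schedule, t):
    """Linearly interpolate a train's position at time t.

    schedule is a list of (time, position) points sorted by time.
    Returns None when t is outside the scheduled time range.
    """
    for (t0, p0), (t1, p1) in zip(schedule, schedule[1:]):
        if t0 <= t <= t1:
            if t1 == t0:
                return p0
            return p0 + (p1 - p0) * (t - t0) / (t1 - t0)
    return None


def incidents_near_train(schedule, incidents, tolerance=0.3):
    """Count incidents lying within `tolerance` of the train's path."""
    count = 0
    for t, pos in incidents:
        train_pos = position_at(schedule, t)
        if train_pos is not None and abs(train_pos - pos) <= tolerance:
            count += 1
    return count


def rank_trains(schedules, incidents, tolerance=0.3):
    """Rank trains by how many incidents line up with their path."""
    scores = {
        train: incidents_near_train(schedule, incidents, tolerance)
        for train, schedule in schedules.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical data: time in minutes, position as a station index.
schedules = {
    "T1": [(0, 0), (10, 5), (20, 0)],
    "T2": [(5, 0), (15, 5), (25, 0)],
}
incidents = [(2, 1.0), (6, 3.1), (12, 4.0), (8, 0.5)]

# The train whose line passes through the most incidents is the suspect.
print(rank_trains(schedules, incidents))
```

In this toy data, three of the four incidents sit on the path of the hypothetical train "T1", which is the kind of pattern that stands out visually as a straight line on a Marey chart.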
The rogue train problem is a good example of how big data visualization can be used to solve real-world
problems. By using data visualization to identify trends and patterns, data scientists can gain insights that