2. Information visualization application design
• Task, subtask and problem
• Dataset
• Scalers
• PCA Algorithm
• Representation and presentation
• Overview and interactive objects
• Filtering
• Significant objects
• Navigational guidance
3. Analytic and visual solutions
• Data to analyze
• Filters
• Showing quantitative information
• UI and UX
• Visualizations
• Frameworks
4. Task, subtask and problem
To perform the main task, subtasks are needed: we have to gain insight into a collection and understand its set of characteristics.
The project task is to evaluate statistics on traffic collisions in Italy, looking for correlations between the actors involved in these accidents.
5. Dataset
• Data are provided by ISTAT and contain information about Italian traffic collisions, including injuries and types of incident, organized by year (from 2003 to 2013) and by Italian region.
• The AS (AngeliniSantucci) index is defined
as: AS = #tuples * #dimensions.
In this project we have an AS = 10850.
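As a minimal sketch of the formula above, the AS index can be computed from a table held as a list of row dictionaries. The function name `as_index` and the toy rows are illustrative assumptions, not the project's actual data or code.

```python
def as_index(rows: list) -> int:
    """AS = #tuples * #dimensions for a table given as a list of row dicts."""
    if not rows:
        return 0
    n_tuples = len(rows)          # one tuple per row
    n_dimensions = len(rows[0])   # one dimension per column
    return n_tuples * n_dimensions


# Toy table with made-up values: 3 tuples with 4 dimensions each.
toy_rows = [
    {"region": "Lazio", "year": 2012, "collisions": 100, "injured": 140},
    {"region": "Lazio", "year": 2013, "collisions": 90, "injured": 120},
    {"region": "Umbria", "year": 2013, "collisions": 15, "injured": 20},
]
toy_as = as_index(toy_rows)  # 3 * 4
```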
6. Standardization and PCA algorithm
• Principal Component Analysis (PCA) is a method for extracting important variables (in the form of components) from the large set of variables available in a dataset. It extracts a low-dimensional set of features from a high-dimensional dataset, with the aim of capturing as much information as possible.
• PCA is affected by scale, so we need to scale the features in the dataset before applying PCA. For example, standardization rescales the features so that they have the properties of a standard normal distribution: a mean of zero and a standard deviation of one.
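The standardization step described above can be sketched with NumPy as follows. This is an illustration under assumptions: the function name `standardize` and the toy matrix are hypothetical, and the population standard deviation (ddof=0) is used.

```python
import numpy as np


def standardize(X: np.ndarray) -> np.ndarray:
    """Rescale each column to mean 0 and standard deviation 1 (z-score)."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)      # population standard deviation (ddof=0)
    std[std == 0] = 1.0      # constant columns become all zeros instead of dividing by 0
    return (X - mean) / std


# Made-up feature matrix: rows are regions, columns are features
# measured on very different scales.
X_toy = np.array([
    [1200.0, 0.8],
    [300.0, 0.2],
    [4500.0, 1.5],
    [800.0, 0.5],
])
Z = standardize(X_toy)
```

After this step every column contributes on a comparable scale, so no single feature dominates the principal components just because of its units.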
7. Scalers and PCA algorithm
• The original data has many columns. After applying a scaler, PCA projects the original multidimensional data onto 2 dimensions.
• Finally, we can plot our data.
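A minimal sketch of the "scale, then reduce to 2 dimensions" pipeline, assuming NumPy only. PCA is computed here through the singular value decomposition of the centered data; the helper names `min_max_scale` and `pca_2d` and the random toy data are assumptions made for illustration, not the project's code.

```python
import numpy as np


def min_max_scale(X: np.ndarray) -> np.ndarray:
    """Rescale each column linearly to the range [0, 1]."""
    lo = X.min(axis=0)
    hi = X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid dividing by 0 on constant columns
    return (X - lo) / span


def pca_2d(X: np.ndarray) -> np.ndarray:
    """Project the rows of X onto its first two principal components."""
    centered = X - X.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T


# Made-up data: 20 rows (a stand-in for the 20 Italian regions), 6 columns
# with deliberately different scales.
rng = np.random.default_rng(0)
X_toy = rng.normal(size=(20, 6)) * np.array([1.0, 10.0, 100.0, 1.0, 5.0, 50.0])
points = pca_2d(min_max_scale(X_toy))  # shape (20, 2), ready for a scatter plot
```

Swapping `min_max_scale` for another scaler changes the geometry of the 2D output, which is why different scalers can give visibly different plots.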
8. 2-component PCA output
• In the graphs for the original data, Standard Scaler, MinMax Scaler, Quantile Transformer and Power Transformer, regions with a low accident rate lie close together on the left, while regions with a high accident rate are detached, far to the right.
• With Normalizer, on the other hand, regions with few collisions are on the right, while the others are on the left.
9. Representation and presentation
• First, we have to map data values to visual attributes in order to represent each value.
• Then, it is possible to show relations among values and apply useful interaction techniques; this project applies, for example, mouse hover, clickable regions, filters and sliders.
• The project uses modern solutions, such as easy-to-read numbers, large explanatory details and big digits.
10. Overview and interactive objects
• Data are shown using histograms and graphs.
• The project includes many interactive tools, e.g. a time series slider, hoverable and clickable Italian regions, and selectable filter options.
11. Data filtering
• A system for suppressing irrelevant data is required. The dataset provided by ISTAT was full of information that is useless for my purposes, such as the car brands involved in accidents, so these columns were deleted.
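Dropping irrelevant columns can be sketched in plain Python as below. The column name `car_brand`, the other field names and all values are hypothetical; the real ISTAT column names are not given in these slides.

```python
def drop_columns(rows, irrelevant):
    """Return copies of the rows without the columns listed in `irrelevant`."""
    unwanted = set(irrelevant)
    return [{k: v for k, v in row.items() if k not in unwanted} for row in rows]


# Made-up rows; "car_brand" stands in for a column that is useless for the analysis.
raw_rows = [
    {"region": "Lazio", "year": 2013, "collisions": 120, "car_brand": "BrandA"},
    {"region": "Umbria", "year": 2013, "collisions": 15, "car_brand": "BrandB"},
]
clean_rows = drop_columns(raw_rows, ["car_brand"])
```

The function builds new dictionaries, so the raw rows stay untouched and the filtering can be repeated with a different column list.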
12. Significant objects
• To simplify the exploration of the dataset, it is useful to mark and define some significant and interesting objects.
• Significant data includes the information the user was looking for, and the statistics and/or graphs that users will look at again.
13. Navigational guidance
• To help users navigate the visualized data, the project fits into one fixed-size page. Neither multiple pages nor infinite scrolling is needed, because they can confuse users.
• All the information and data are in one place; they are interactive, clickable, useful and good-looking, and they provide powerful items to satisfy user requirements.
14. Data to analyze
The data to analyze include the number of road crashes in Italy per year in each Italian region, covering both urban and extra-urban roads and counting deaths and injuries.
Other information covers the number of licensed people and the number of collisions, in order to compute the ratio between these two numbers and to compare the Italian regions according to their ratios.
Also, by selecting a region, it is possible to inspect the collisions in that region and their severity.
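The ratio between collisions and licensed people, and the resulting comparison between regions, can be sketched as follows. Region names and counts are made up, and `collision_ratios` is a hypothetical helper name.

```python
def collision_ratios(collisions, licensed):
    """Collisions per licensed person for every region with a positive license count."""
    return {
        region: collisions[region] / licensed[region]
        for region in collisions
        if licensed.get(region, 0) > 0
    }


# Made-up counts for a single year.
collisions_toy = {"Region A": 500, "Region B": 120, "Region C": 900}
licensed_toy = {"Region A": 100_000, "Region B": 40_000, "Region C": 150_000}

ratios = collision_ratios(collisions_toy, licensed_toy)
# Regions ordered from the highest to the lowest ratio.
ranking = sorted(ratios, key=ratios.get, reverse=True)
```

Using a ratio instead of the raw collision count keeps populous regions from looking worse only because more people drive there.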
15. Filters #1
• The solution lets users apply filters to the data to analyze, such as the type of collision: between vehicles only, between vehicles and pedestrians only, or both.
• It is also possible to review the seriousness of the collisions, restricting the data to collisions with at least one injured person or at least one death. Another filter restricts the analysis to collisions on urban roads only, on extra-urban roads only, or on both.
16. Filters #2
• Another available filter is the year: a time series slider lets users choose the year of the data to analyze, and this choice affects all the displayed information.
• It is also possible to select regions, showing, for the chosen year and all the previously chosen filters, the results of the analysis, the number of people with a driving license and how many of them were injured.
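The filters described on the two slides above can be sketched as one function over a list of records. The `Collision` record layout, the category labels and the sample values are assumptions made for this example; the project's real data model is not shown in the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Collision:
    region: str
    year: int
    kind: str      # "vehicles" or "vehicles_pedestrians" (hypothetical labels)
    road: str      # "urban" or "extra_urban" (hypothetical labels)
    injured: int
    dead: int


def apply_filters(records, year=None, kind=None, road=None, region=None,
                  min_injured=0, min_dead=0):
    """Keep records matching every filter that is set; None means 'no restriction'."""
    return [
        r for r in records
        if (year is None or r.year == year)
        and (kind is None or r.kind == kind)
        and (road is None or r.road == road)
        and (region is None or r.region == region)
        and r.injured >= min_injured
        and r.dead >= min_dead
    ]


# Made-up records.
records = [
    Collision("Lazio", 2012, "vehicles", "urban", 2, 0),
    Collision("Lazio", 2013, "vehicles_pedestrians", "urban", 1, 1),
    Collision("Umbria", 2013, "vehicles", "extra_urban", 0, 1),
    Collision("Lazio", 2013, "vehicles", "extra_urban", 3, 0),
]
# Year 2013, urban roads only, at least one death.
selected = apply_filters(records, year=2013, road="urban", min_dead=1)
```

Leaving a filter at `None` corresponds to the "both" option of the dashboard, so the same function covers every combination of choices.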
17. Showing quantitative information
• After choosing filters, it is possible to show the data both as percentages and as absolute numbers, using the right length and the right scale, avoiding pie charts and 3D charts, avoiding comparisons of areas, angles and volumes, and using appropriate color scales only when strictly needed.
• A second slider lets users switch between the scalers used for PCA.
18. UI and UX
This project implements the most user-friendly UI and UX I could design, applying information visualization, which represents abstract data to amplify cognition. Understanding an information visualization involves both cognitive activity and interactive activity. The first is about computer-based visualization; the second allows users to manipulate the visualization to better reach their goals.
Scrolling is boring, consumes a lot of time and hides most of the content from view. Distortion is also disturbing, so I use neither 3D objects nor perspective walls in the project. Suppression and zoom are not needed because the interactive map of Italy, the histograms, the graphs and the sliders fit perfectly on the screen.
19. Interactive map of Italy
• The visual part provides the user with a dashboard showing an interactive map of Italy on the left, with big, clickable regions. Each region is colored with a shade of green (from dark green to light green) for quantitative comparison, according to the legend on the left.
• Hovering over a region with the mouse pops up a small window showing, for the selected year and filters, the number of licensed people, the number of accidents and the ratio of accidents to licensed people.
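Mapping a quantitative value to a shade of green can be sketched as a linear color scale. Both endpoint colors and the direction of the scale (higher values drawn darker) are assumptions for this example; the slides do not state which end of the legend is dark, and the project itself draws the map with D3.js.

```python
LIGHT_GREEN = (229, 245, 224)  # assumed color for the lowest value
DARK_GREEN = (0, 68, 27)       # assumed color for the highest value


def green_shade(value, vmin, vmax):
    """Linearly interpolate between light and dark green; returns a hex color."""
    if vmax == vmin:
        t = 0.0
    else:
        # Clamp to [0, 1] so out-of-range values still get a valid color.
        t = min(1.0, max(0.0, (value - vmin) / (vmax - vmin)))
    rgb = [round(lo + t * (hi - lo)) for lo, hi in zip(LIGHT_GREEN, DARK_GREEN)]
    return "#{:02x}{:02x}{:02x}".format(*rgb)


# Made-up ratios of accidents to licensed people.
ratios_toy = {"Region A": 0.005, "Region B": 0.003, "Region C": 0.006}
low, high = min(ratios_toy.values()), max(ratios_toy.values())
colors = {region: green_shade(r, low, high) for region, r in ratios_toy.items()}
```

A single-hue scale like this encodes only magnitude, which suits a map where the goal is to compare one quantity across regions.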
20. Histograms showing data
• Histograms show data about traffic collisions on urban and extra-urban roads, and they also show the number of dead and injured.
21. PCA Scalers and Time Series Sliders
• The time series slider lets users change the year of the analyzed data. This action updates all the other visualizations: the PCA output changes according to the selected year; each Italian region changes its shade of green according to the legend; the histograms change the length of their rectangles while keeping the same lie factor and the same scale on both the x and y axes, in order to avoid distorting the data representation.
• The PCA scalers slider changes the scaler used to produce the displayed PCA output.
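Keeping the lie factor constant when the year changes can be sketched as follows. The lie factor is the relative change shown in the graphic divided by the relative change in the data, so bars drawn on one scale shared by all years keep it at 1. The function names, the pixel size and the counts are assumptions for illustration.

```python
def bar_lengths(values, global_max, max_pixels=300):
    """Bar length proportional to the value, on a scale shared by all years."""
    return [max_pixels * v / global_max for v in values]


def lie_factor(data_a, data_b, shown_a, shown_b):
    """Relative change shown in the graphic divided by relative change in the data."""
    effect_in_data = (data_b - data_a) / data_a
    effect_in_graphic = (shown_b - shown_a) / shown_a
    return effect_in_graphic / effect_in_data


# Made-up counts per year: [urban, extra-urban].
counts_by_year = {2012: [400, 250], 2013: [300, 200]}

# One maximum for the whole time series, so moving the slider never rescales the axis.
global_max = max(max(values) for values in counts_by_year.values())
lengths = {year: bar_lengths(values, global_max) for year, values in counts_by_year.items()}

urban_lie = lie_factor(counts_by_year[2012][0], counts_by_year[2013][0],
                       lengths[2012][0], lengths[2013][0])
```

If the scale were recomputed per year instead, the tallest bar would always fill the chart and a drop in collisions between two years would not be visible.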
22. Frameworks and languages
• The project's frontend is developed using the D3.js JavaScript library.
• The backend is written in Python, which offers a huge range of visualization functionality, with a diversity of approach and focus that is reflected in the large number of libraries used.
23. Visual Analytics cycle
• The available data are processed to build a functional and useful dataset suitable for the project's purposes.
• Scripts apply parameter refinement and a mapping to build a visualizable and interactive dashboard.
• Through the view, users can interact with the filters, the interactive map of Italy and the other visualizations, such as the histograms and the time series and PCA sliders, in order to read the data well, acquire knowledge about Italian traffic collisions and, hopefully, raise awareness of road hazards among drivers and pedestrians.
24. Conclusion
• Road traffic accidents are the leading cause of death by injury and the tenth-leading cause of all deaths globally.
• If present trends continue, road traffic injuries are predicted to be the third-leading contributor to the global burden of disease and injury by 2020.
• We draw two conclusions: the regions with the most accidents are not only the most populous ones, and the situation is not improving over the years.
• We absolutely need to set stronger road and safety rules, ensure compliance and improve transport policy.
25. ➢ Check out more amazing projects by Roberto Falconi
http://www.robertodaguarcino.com
https://github.com/RobertoFalconi
https://www.linkedin.com/in/roberto-falconi
https://www.slideshare.net/RobertoFalconi4
Thank you!