8. [Sample data sets recreated from Francis J. Anscombe (1973). Graphs in statistical analysis.
Source: Andy Kirk. (2012) Data visualization: A successful design process]
11. Data visualization addresses…
…Human Scalability
• It enhances the recognition of patterns
• It increases our efficiency in exploring large datasets
• It supports decisions
• It expands our working memory to solve problems
25. Get to know the context…
[Diagram labels: User; Network Threat Analyst; Computer Network Data Collection Point; CMD Tools; Websites; Logs; Physical & Task Context; Technical Context]
27. Step 1: Identify relevant visualization tasks
•Find suspicious IP blocks
•Find domain names associated with specific IPs
•Examine the presence of domain names on blacklists
•Examine the relation of domain names with malware
•Identify the geographical location of IPs
•Identify the ownership of domain names
•Find suspicious Autonomous Systems
29. Step 2: Choose a library
The more accessible your visualization, the greater your audience and your impact
[Scott Murray. (2013) Interactive Data Visualization for the Web]
30. Step 2: Choose a library
•Functionality: Does it support the visualizations I need?
•License: open source or commercial?
•Active support and development
•Browser compatibility
•Dependencies (e.g. React.js)
31. Step 2: Choose a library
Building a visualization with charting libraries such as Chart.js, Tableau…
32. Step 2: Choose a library
Building a visualization with D3.js
33. Step 2: Choose a library
•D3 is not really a “visualization library”; it does not draw visualizations
•D3 = “Data-Driven Documents”; it associates data with DOM elements and manages the results
•D3.js provides tools such as layouts, scales, and shapes that you can use to build visualizations
35. Step 3: Data transformations
{"date":"20160408","qname":"*.3rdandmonster.com.","qtype":1,"rdata":{"string":"66.96.161.142"},"ttl":null,"authority_ips":"216.239.36.109","count":1,"hours":1048576,"source":"gt","sensor":"active-dns"}
{"date":"20160408","qname":"*.aavxxnbm.org.","qtype":1,"rdata":{"string":"213.184.126.162"},"ttl":{"int":604800},"authority_ips":"213.184.126.162","count":10,"hours":5543209,"source":"gt","sensor":"active-dns"}
{"date":"20160408","qname":"*.aenhfat.info.","qtype":1,"rdata":{"string":"213.184.126.162"},"ttl":{"int":604800},"authority_ips":"213.184.126.162","count":4,"hours":8397064,"source":"gt","sensor":"active-dns"}
{"date":"20160408","qname":"*.agzksjhrmf.info.","qtype":1,"rdata":{"string":"213.184.126.162"},"ttl":{"int":604800},"authority_ips":"213.184.126.162","count":5,"hours":4329736,"source":"gt","sensor":"active-dns"}
[Fragment of Active DNS resolution queries, deserialized from Avro into JSON, https://www.activednsproject.org]
Pre-processed data: Domain Name, IP address
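The reduction from raw records to domain name and IP address pairs can be sketched as follows. This is a minimal sketch, not the authors' pipeline: the function name `preprocess` is hypothetical, the input is assumed to be one JSON object per line, and stripping the trailing root dot from `qname` is an assumption about the desired output.

```python
import json

# Two records copied from the fragment above (one JSON object per line).
RAW = """\
{"date":"20160408","qname":"*.3rdandmonster.com.","qtype":1,"rdata":{"string":"66.96.161.142"},"ttl":null,"authority_ips":"216.239.36.109","count":1,"hours":1048576,"source":"gt","sensor":"active-dns"}
{"date":"20160408","qname":"*.aavxxnbm.org.","qtype":1,"rdata":{"string":"213.184.126.162"},"ttl":{"int":604800},"authority_ips":"213.184.126.162","count":10,"hours":5543209,"source":"gt","sensor":"active-dns"}
"""


def preprocess(lines):
    """Yield (domain_name, ip_address) pairs from JSON-lines records."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        # DNS query type 1 is an A record; other types are skipped here.
        if record.get("qtype") != 1:
            continue
        # Assumption: the IPv4 address sits under rdata.string, as in the fragment.
        ip = (record.get("rdata") or {}).get("string")
        if ip is None:
            continue
        # Assumption: drop the trailing root dot, keep any wildcard label.
        yield record["qname"].rstrip("."), ip


pairs = list(preprocess(RAW.splitlines()))
print(pairs)
```

Everything except the two fields is discarded at this stage, which keeps the later hierarchy-building steps small.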
36. Step 3: Data transformations
Guided by the Visual Information-Seeking Mantra:
“Overview first,
Zoom and Filter, and then
Details-on-Demand”
[Shneiderman. (1996) The Eyes Have It: A Task by Data Type Taxonomy for Information Visualizations]
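One possible reading of the mantra as a data requirement: if the data is arranged as a tree, an overview is the tree cut at a shallow depth, zooming is descending into one branch, and details are the full subtree at the end of that path. The sketch below assumes nodes are dictionaries with `name` and optional `children` fields; the helper names `overview` and `zoom` and the sample values are hypothetical.

```python
def overview(node, depth):
    """Copy a node, keeping only `depth` levels of children below it."""
    copy = {key: value for key, value in node.items() if key != "children"}
    if depth > 0 and "children" in node:
        copy["children"] = [overview(child, depth - 1) for child in node["children"]]
    return copy


def zoom(node, path):
    """Follow a list of child names down from `node` and return that subtree."""
    for name in path:
        matches = [c for c in node.get("children", []) if c["name"] == name]
        if not matches:
            raise KeyError(name)
        node = matches[0]
    return node


# Made-up tree for illustration only.
tree = {
    "name": "root",
    "children": [
        {
            "name": "10.0.0.0/8",
            "size": 3,
            "children": [{"name": "10.1.0.0/16", "size": 2}],
        }
    ],
}

top_level = overview(tree, 1)                          # overview first
detail = zoom(tree, ["10.0.0.0/8", "10.1.0.0/16"])     # zoom, then details
print(top_level)
print(detail)
```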
37. Step 3: Data transformations
{
  "date": "dateValue",
  "children": [
    {
      "name": "/8Name",
      "size": "numberOfIPs/8",
      "color": "numberOfBlacklistedDomainNamesPer/8",
      "children": [
        {
          "name": "/16Name",
          "size": "numberOfIPs/16",
          "color": "numberOfBlacklistedDomainNamesPer/16",
          "children": [
            …
          ]
        }
        …
      ]
    }
  ]
}
Nested JSON format template (JSON file per day)
Nested IPs in the following format: /8 >> /16 >> /24 >> /32
Visual variables: size, color
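A rough sketch of how (domain name, IP address) pairs could be folded into the /8 >> /16 >> /24 >> /32 template. The function names, the `blacklist` set, and the sample pairs are all hypothetical; `size` is taken to be the number of distinct IPs under a prefix and `color` the number of distinct blacklisted domain names, emitted as integers rather than the placeholder strings of the template.

```python
def prefix_names(ip):
    """Return the /8, /16, /24 and /32 prefix names of a dotted IPv4 string."""
    octets = ip.split(".")
    names = []
    for depth, bits in enumerate((8, 16, 24, 32), start=1):
        padded = octets[:depth] + ["0"] * (4 - depth)
        names.append(".".join(padded) + "/" + str(bits))
    return names


def finalize(children):
    """Turn the working sets into the name/size/color/children template."""
    out = []
    for name in sorted(children):
        node = children[name]
        item = {"name": name, "size": len(node["ips"]), "color": len(node["bad"])}
        if node["children"]:
            item["children"] = finalize(node["children"])
        out.append(item)
    return out


def build_ip_hierarchy(date, pairs, blacklist):
    """Nest (domain, ip) pairs by prefix and count IPs and blacklisted domains."""
    root = {"children": {}}
    for domain, ip in pairs:
        node = root
        for name in prefix_names(ip):
            node = node["children"].setdefault(
                name, {"ips": set(), "bad": set(), "children": {}}
            )
            node["ips"].add(ip)
            if domain in blacklist:
                node["bad"].add(domain)
    return {"date": date, "children": finalize(root["children"])}


# Made-up input for illustration only.
pairs = [("a.example", "10.1.2.3"), ("b.example", "10.1.2.4"), ("c.example", "10.2.0.1")]
tree = build_ip_hierarchy("20160408", pairs, blacklist={"b.example"})
print(tree["children"][0]["name"], tree["children"][0]["size"], tree["children"][0]["color"])
```

Keeping sets while building and converting to counts only at the end avoids double-counting an IP that resolves for several domain names.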
38. Step 3: Data transformations
{
  "date": "dateValue",
  "children": [
    {
      "name": "Continent",
      "size": "numberOfIPsContinent",
      "color": "numberOfBlacklistedDomainNamesPerContinent",
      "children": [
        {
          "name": "Country",
          "size": "numberOfIPsCountry",
          "color": "numberOfBlacklistedDomainNamesPerCountry",
          "children": [
            …
          ]
        }
        …
      ]
    }
  ]
}
Nested JSON format template (JSON file per day)
Alternative nesting options: Continent >> Country >> State >> City
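The geographical nesting can be sketched with a generic grouping helper. The sketch assumes each row has already been annotated with `continent`, `country`, `state` and `city` by some geolocation lookup that is not shown; the row fields, the `nest` helper, and the sample rows are hypothetical.

```python
from collections import defaultdict

LEVELS = ("continent", "country", "state", "city")


def nest(rows, levels):
    """Group rows by the first level, then recurse on the remaining levels."""
    if not levels:
        return []
    groups = defaultdict(list)
    for row in rows:
        groups[row[levels[0]]].append(row)
    nodes = []
    for name in sorted(groups):
        members = groups[name]
        node = {
            "name": name,
            # size: distinct IPs; color: distinct blacklisted domain names.
            "size": len({r["ip"] for r in members}),
            "color": len({r["domain"] for r in members if r["blacklisted"]}),
        }
        children = nest(members, levels[1:])
        if children:
            node["children"] = children
        nodes.append(node)
    return nodes


# Made-up rows for illustration only.
rows = [
    {"ip": "10.0.0.1", "domain": "a.example", "blacklisted": False,
     "continent": "Europe", "country": "Germany", "state": "Bavaria", "city": "Munich"},
    {"ip": "10.0.0.2", "domain": "b.example", "blacklisted": True,
     "continent": "Europe", "country": "France", "state": "Ile-de-France", "city": "Paris"},
    {"ip": "10.0.0.3", "domain": "c.example", "blacklisted": False,
     "continent": "Asia", "country": "Japan", "state": "Tokyo", "city": "Tokyo"},
]
geo = {"date": "20160408", "children": nest(rows, LEVELS)}
print([c["name"] for c in geo["children"]])
```

Because the levels are passed in as a tuple, the same helper could serve other nesting orders by changing `LEVELS`.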
39. Step 3: Data transformations
> JSON files of 70 MB
Nested JSON format template (JSON file per day)
Triple hierarchy!!!
40. Step 3: Data transformations
Split into:
•IPhierarchy.json
•GeographicalHierarchy.json
•AS.json
Nested JSON format template (JSON file per day)
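A minimal sketch of the split, writing each hierarchy to its own file so a view only loads the one it needs. The three file names come from the slide; the function name `write_daily_files`, the one-directory-per-day layout, and the sample contents are assumptions.

```python
import json
import tempfile
from pathlib import Path


def write_daily_files(out_dir, date, hierarchies):
    """Write one JSON file per hierarchy under <out_dir>/<date>/."""
    day_dir = Path(out_dir) / date  # assumption: one directory per day
    day_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for filename, children in hierarchies.items():
        path = day_dir / filename
        with path.open("w", encoding="utf-8") as fh:
            # Compact separators keep the files as small as plain JSON allows.
            json.dump({"date": date, "children": children}, fh, separators=(",", ":"))
        written.append(path)
    return written


# Made-up contents for illustration only.
hierarchies = {
    "IPhierarchy.json": [{"name": "10.0.0.0/8", "size": 3, "color": 1}],
    "GeographicalHierarchy.json": [{"name": "Europe", "size": 2, "color": 1}],
    "AS.json": [{"name": "AS64500", "size": 1, "color": 0}],
}
out_dir = tempfile.mkdtemp()
paths = write_daily_files(out_dir, "20160408", hierarchies)
for path in paths:
    print(path.name, path.stat().st_size, "bytes")
```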