The Art of Data Visualization
Agenda:
6:00 - 6:15: Welcome
6:15 - 6:45: Guidelines for Data Visualization
6:45 - 7:30: Large-scale GPU-Accelerated Data Visualization with MapD
7:30 - 8:00: 1000+ Members Giveaway / Networking + Q&A
2. Agenda
6:00 - 6:15: Welcome
6:15 - 7:00: Guidelines for Data Visualization
7:00 - 7:30: Large-scale GPU-Accelerated Data Visualization with MapD
7:30 - 8:00: 1000+ Members Giveaway / Networking + Q&A
3. Special Event
February 12
"Studies in Gameful Interaction Design and Games
User Research"
Dr Lennart Nacke, Director of the HCI Games Group | Associate
Professor for Human-Computer Interaction, University of Waterloo
8. “Visual representations not only make the
patterns, trends, and exceptions in
numbers visible and understandable, they
also extend the capacity of our memory,
making available in front of our eyes what
we couldn’t otherwise hold all at once in
our minds.”
– Stephen Few
9. 5 Rules
1 - Make sure your visualization answers a question
2 - Consider your audience and the context of use
3 - Use the right method of visualization
4 - Make your visualization readable
5 - Use the right analytical interaction and
navigation
15. What information
does the user need to be
successful?
What level of detail
does the user need?
What actions can be
taken?
Consider accessibility
What is the context of
use?
27. Use the right
analytical
interaction and
navigation
Comparing
Sorting
Adding variables
Filtering
Highlighting
Aggregating
Re-expressing
Re-visualizing
Zooming and panning
Re-scaling
Accessing details on demand
Annotating
Bookmarking
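A few of the interactions listed above (filtering, sorting, aggregating) can be sketched in plain Python. This is only an illustration: the SALES records and the helper names are hypothetical and invented for this example, and a real tool would run these operations in the database rather than in the browser or a script.

```python
from collections import defaultdict

# Hypothetical records, invented for illustration only.
SALES = [
    {"region": "East", "product": "A", "revenue": 120},
    {"region": "West", "product": "A", "revenue": 80},
    {"region": "East", "product": "B", "revenue": 200},
    {"region": "West", "product": "B", "revenue": 50},
]

def filter_rows(rows, **criteria):
    """Filtering: keep only rows matching every key=value criterion."""
    return [r for r in rows if all(r[k] == v for k, v in criteria.items())]

def sort_rows(rows, key, descending=False):
    """Sorting: order rows by one field."""
    return sorted(rows, key=lambda r: r[key], reverse=descending)

def aggregate(rows, group_by, measure):
    """Aggregating: sum a measure per group."""
    totals = defaultdict(int)
    for r in rows:
        totals[r[group_by]] += r[measure]
    return dict(totals)

east = filter_rows(SALES, region="East")
ranked = sort_rows(SALES, "revenue", descending=True)
by_region = aggregate(SALES, "region", "revenue")
print(by_region)
```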
32. Aaron Williams
VP of Global Community
@_arw_
aaron@mapd.com
/in/aaronwilliams/
/williamsaaron
Christophe Viau
Data Visualization Engineer
chrisv@mapd.com
/in/christopheviau/
/biovisualize
33. “Every business will become a
software business, build
applications, use advanced analytics
and provide SaaS services.”
- Smart CEO Guy
34. The Evolution of Data as a Weapon
Collect It
Make It Actionable
Make It Predictive
35. MapD: Extreme Analytics
100x Faster Queries
MapD Core
The world’s fastest
columnar database,
built specifically for GPUs
+
Visualization at the Speed of Thought
MapD Immerse
A visualization front end that
leverages the speed &
rendering superiority of GPUs
38. Core Density Makes a Huge Difference
CPU Processing: 20 cores
GPU Processing: 40,000 cores
*fictitious example
CPU latency: 1 ns per task
CPU throughput: (1 task per ns) x (20 cores) = 20 tasks per ns
GPU latency: 10 ns per task
GPU throughput: (0.1 task per ns) x (40,000 cores) = 4,000 tasks per ns
Latency: Time to do a task. | Throughput: Number of tasks per unit time.
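The arithmetic on this slide can be checked with a short sketch. It uses the slide's fictitious numbers and assumes every core is kept busy and tasks are fully independent; the function name is hypothetical.

```python
def throughput(latency_ns: float, cores: int) -> float:
    """Tasks completed per nanosecond when all cores work in parallel.

    Each core finishes one task every `latency_ns` nanoseconds, so the
    aggregate rate is cores / latency_ns.
    """
    return cores / latency_ns

cpu = throughput(latency_ns=1, cores=20)        # slide's CPU example
gpu = throughput(latency_ns=10, cores=40_000)   # slide's GPU example

print(f"CPU: {cpu:g} tasks/ns")
print(f"GPU: {gpu:g} tasks/ns")
print(f"GPU / CPU: {gpu / cpu:g}x")
```

Even though each GPU task takes ten times longer in this example, the far larger core count gives the GPU 200 times the throughput.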
39. Query Compilation with LLVM
Traditional DBs can be highly inefficient
• each operator in SQL treated as a separate function
• incurs tremendous overhead and prevents vectorization
MapD compiles queries with LLVM to create one custom function
• Queries run at speeds approaching hand-written functions
• LLVM enables generic targeting of different architectures (GPUs, x86, ARM, etc.)
• Code can be generated to run a query on CPU and GPU simultaneously
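The contrast between operator-at-a-time execution and a single compiled function can be sketched in Python. This is not MapD's code and does not use LLVM: Python's built-in compile/exec stands in for the code generator, and the STEPS expression and all names are hypothetical.

```python
import operator

# A tiny filter expression, (x * 2 + 1) > 10, as a chain of operators.
STEPS = [("mul", 2), ("add", 1), ("gt", 10)]
OPS = {"mul": operator.mul, "add": operator.add, "gt": operator.gt}
SYMBOLS = {"mul": "*", "add": "+", "gt": ">"}

def interpret(rows):
    """Operator-at-a-time: one function call per operator per row."""
    out = []
    for x in rows:
        value = x
        for name, const in STEPS:
            value = OPS[name](value, const)
        out.append(value)
    return out

def compile_query():
    """Generate source for one fused function and compile it once."""
    expr = "x"
    for name, const in STEPS:
        expr = f"({expr} {SYMBOLS[name]} {const})"
    source = f"def fused(rows):\n    return [{expr} for x in rows]\n"
    namespace = {}
    exec(compile(source, "<query>", "exec"), namespace)
    return namespace["fused"]

fused = compile_query()
rows = [1, 4, 5, 9]
assert interpret(rows) == fused(rows)  # same answer, fewer calls per row
print(fused(rows))
```

The fused version evaluates the whole expression inline for each row, which is the idea behind compiling a query into one custom function.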
40. Keeping Data Close to Compute
MapD maximizes performance by optimizing memory use
COMPUTE LAYER
GPU RAM (L1): 24GB to 256GB, 1000-6000 GB/sec. Hot Data: Speedup = 1500x to 5000x Over Cold Data
CPU RAM (L2): 32GB to 3TB, 70-120 GB/sec. Warm Data: Speedup = 35x to 120x Over Cold Data
STORAGE LAYER
SSD or NVRAM STORAGE (L3): 250GB to 20TB, 1-2 GB/sec. Cold Data
Data Lake / Data Warehouse / System Of Record
Speed increases toward L1; space increases toward L3
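To make the bandwidth figures concrete, here is a rough sketch of how long a full scan would take in each tier. The bandwidth ranges come from the slide; the 100 GB dataset size and the function names are assumptions for illustration. The sketch counts raw bandwidth only and ignores transfers between tiers, caching, and compression.

```python
# Bandwidth ranges in GB/sec, copied from the slide.
TIERS = {
    "GPU RAM": (1000, 6000),
    "CPU RAM": (70, 120),
    "SSD or NVRAM": (1, 2),
}

def scan_seconds(size_gb, bandwidth_gb_per_sec):
    """Time to read size_gb once at the given bandwidth."""
    return size_gb / bandwidth_gb_per_sec

def scan_time_range(size_gb, tier):
    """Return (best, worst) scan time in seconds for a tier."""
    low_bw, high_bw = TIERS[tier]
    return scan_seconds(size_gb, high_bw), scan_seconds(size_gb, low_bw)

DATASET_GB = 100  # hypothetical dataset size
for tier in TIERS:
    best, worst = scan_time_range(DATASET_GB, tier)
    print(f"{tier}: {best:.3f} s to {worst:.3f} s")
```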
42. The GPU Open Analytics Initiative Model
Standard in-memory format; zero-copy interchange
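The idea of zero-copy interchange can be illustrated with Python's standard buffer protocol. This is only an analogy on CPU memory, not the initiative's actual format or API: a consumer receives a view onto the producer's column instead of a copy, so both see the same bytes.

```python
from array import array

# Producer owns one column of float64 values.
column = array("d", [1.5, 2.5, 3.5, 4.5])

# Consumer receives a view onto the same memory: no bytes are copied.
view = memoryview(column)
window = view[1:3]          # slicing a memoryview is also zero-copy

copied = column.tolist()    # an actual copy, for contrast

# A write by the producer is visible through the views...
column[1] = 20.0
column[2] = 30.0
assert window.tolist() == [20.0, 30.0]

# ...but not in the copy, which is now stale.
assert copied[1:3] == [2.5, 3.5]

print(window.tolist(), copied[1:3])
```

With a shared in-memory format, tools can hand each other views like this rather than serializing and re-parsing the data at every step.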
44. Interactive Machine Learning
Empowering the People in the Pipeline
Personas in Analytics Lifecycle (Illustrative):
Business Analyst, Data Scientist, Data Engineer, IT Systems Admin, Data Scientist / Business Analyst
Lifecycle stages:
Data Preparation
Data Discovery & Feature Engineering
Model & Validate
Predict
Operationalize
Monitoring & Refinement
Evaluate & Decide
Tools: GPUs, MapD, H2O.ai, MapD
45. MapD Immerse
Using a hybrid approach to speed and scale visualization
Basic charts are frontend
rendered using D3 and other
related toolkits
Scatterplots, pointmaps + polygons
are backend rendered using the Iris
Rendering Engine on GPUs
Geo-Viz is composited over a
frontend rendered basemap
46. Built for an open-source ecosystem
Extending multiple APIs
● Dc.js (docs): Mapd-charting (docs)
● Crossfilter: Mapd-crossfilter
● Vega (editor): Mapd Raster
● GPU DB Connector (docs)
Part of an ecosystem
● Related projects like Deck.gl
● Building blocks like Mapbox, which uses Leaflet
● Using smaller building blocks, like D3.js
47. Try MapD
It’s free and it’s easy
Play with the live demos: https://www.mapd.com/demos/
Try the Test Drive: https://mapd.io/testdrive-enterprise
Install the Community Edition:
https://www.mapd.com/platform/download-community/
Join our forums:
https://community.mapd.com/
Review these slides:
https://speakerdeck.com/mapd
49. AWS Credits Available
Free GPU Compute!
We’re looking for interesting use cases.
Email Aaron Williams (aaron@mapd.com) with your ideas!
52. Merci / Thank You
@jdalabsmtl
Data Science | Design | Technology
(Check for next DSDT meetup at https://www.meetup.com/DSDTMTL)