map-D
datarefined
www.map-d.com
Todd Mostak | todd@map-d.com | @datarefined
1830 Sansome St., San Francisco, CA 94104
#mapd
map-D? A super-fast database built into GPU memory
Do? The world’s fastest real-time big data analytics and interactive visualization
Demo? A Twitter analytics platform: 1 billion+ tweets, queried in milliseconds
The importance of interactivity 
People have struggled for a long time to build interactive 
visualizations of big data that can deliver insight 
Interactivity means: 
• Hypothesis testing can occur at “speed of thought” 
How interactive is interactive enough? 
• According to a study by Jeffrey Heer and Zhicheng Liu, “an injected 
delay of half a second per operation adversely affects user 
performance in exploratory data analysis.” 
• Some types of latency are more detrimental than others: 
• For example, linking and brushing are more sensitive to delay than zooming
Strategies for interactivity 
• Sampling 
• Ex. BlinkDB 
• Issues: 
• Needs a statistically robust sampling method 
• Sampling can miss “long-tail” phenomena 
• Pre-computation 
• Ex. imMens (data cubing) 
• Issues: 
• Can only show what the curator thought was relevant 
• Can only store a certain number of binned attributes 
• Must be curated!
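The sampling trade-off above can be sketched in a few lines of Python (a toy illustration, not BlinkDB's actual algorithm): a uniform sample estimates a bulk aggregate well, but a rare "long-tail" value may never appear in the sample at all.

```python
import random

def sample_estimate(values, rate, seed=0):
    """Estimate the mean from a uniform random sample instead of a full scan."""
    rng = random.Random(seed)
    sample = [v for v in values if rng.random() < rate]
    return sum(sample) / len(sample) if sample else None

# A million ordinary values plus one extreme outlier (the "long tail"):
data = [1.0] * 1_000_000 + [50_000.0]
est = sample_estimate(data, rate=0.01)
# A 1% sample is very fast, but it will usually miss the single outlier
# entirely, so the tail phenomenon vanishes from the estimate.
```

The estimator runs over roughly 1% of the data, which is the point of sampling; the cost is that any phenomenon rarer than the sampling rate is likely to be invisible.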
The Arrival of In-Memory Systems 
• Traditional RDBMSs used to be too slow to serve as a back end for interactive visualizations 
• Queries over a billion records could take minutes, if not hours 
• But in-memory systems can execute such queries in a fraction of the time 
• Both full DBMS and “pseudo”-DBMS solutions exist 
• But still often too slow
Enter Map-D
the technology
Core Innovation 
SQL-enabled column store database built into the memory 
architecture on GPUs and CPUs 
Code developed from scratch to take advantage of: 
• Memory and computational bandwidth of multiple GPUs 
• Heterogeneous architectures (CPUs and GPUs) 
• Fast RDMA between GPUs on different nodes 
• GPU Graphics pipeline 
Two-level buffer pool across GPU and CPU memory 
Shared scans – multiple queries of the same data can share 
memory bandwidth 
System can scan data at 2TB/sec per node, with 10TB/sec per 
node logical throughput with shared scans
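The shared-scan idea can be illustrated with a toy column scan in Python (my sketch, not Map-D's GPU kernel): several pending queries are evaluated against each value during a single pass, so one trip through memory serves all of them.

```python
def shared_scan(column, predicates):
    """One pass over a column answers many filter-count queries at once.

    `predicates` is a list of boolean functions; each gets its own counter,
    but the column data is read only once, sharing memory bandwidth.
    """
    counts = [0] * len(predicates)
    for value in column:                  # single scan of the data
        for i, pred in enumerate(predicates):
            if pred(value):
                counts[i] += 1
    return counts

# Three concurrent queries over the same column, answered by one scan:
col = [3, 8, 15, 4, 23, 42, 8]
queries = [lambda v: v > 10, lambda v: v % 2 == 0, lambda v: v == 8]
# shared_scan(col, queries) → [3, 4, 2]
```

Because scans are memory-bandwidth bound, adding a query to an in-flight scan costs far less than a fresh pass, which is how logical throughput can exceed raw scan speed.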
The Hardware 
[Diagram: two identical nodes (Node 0, Node 1) connected through a switch over InfiniBand (IB). Each node has two CPUs (CPU 0, CPU 1) linked by QPI, four GPUs (GPU 0–GPU 3) attached over PCI, and a RAID controller with four drives (S1–S4).]
The Two-Level Buffer Pool 
GPU Memory ↔ CPU Memory ↔ SSD
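A minimal sketch of such a two-level pool in Python (illustrative only; the class name `TwoLevelPool` and its interface are my invention, not Map-D's): hot pages live in a small fast tier standing in for GPU memory, eviction demotes the least-recently-used page to a larger tier standing in for CPU memory, and misses in both tiers fall through to storage.

```python
from collections import OrderedDict

class TwoLevelPool:
    """Toy two-level buffer pool: a small fast tier over a larger slow tier."""

    def __init__(self, gpu_slots, cpu_slots):
        self.gpu = OrderedDict()   # fast tier (stands in for GPU memory)
        self.cpu = OrderedDict()   # slow tier (stands in for CPU memory)
        self.gpu_slots, self.cpu_slots = gpu_slots, cpu_slots

    def get(self, page, load_from_disk):
        if page in self.gpu:                     # hit in the fast tier
            self.gpu.move_to_end(page)           # mark most recently used
            return self.gpu[page]
        # Promote from the slow tier if present, otherwise load from storage:
        data = self.cpu.pop(page) if page in self.cpu else load_from_disk(page)
        self._put_gpu(page, data)
        return data

    def _put_gpu(self, page, data):
        self.gpu[page] = data
        if len(self.gpu) > self.gpu_slots:       # demote LRU page downward
            old, old_data = self.gpu.popitem(last=False)
            self.cpu[old] = old_data
            if len(self.cpu) > self.cpu_slots:   # slow tier full: drop to disk
                self.cpu.popitem(last=False)

pool = TwoLevelPool(gpu_slots=2, cpu_slots=4)
```

Re-reading a page that was demoted to the slow tier costs a PCI-style copy rather than a storage read, which is the payoff of the second level.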
Shared-Nothing Processing 
Multiple GPUs, with data partitioned between them 
[Diagram: three nodes (Node 1–Node 3), each running the same filter over its own partition: text ILIKE ‘rain’]
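A shared-nothing filter like the one in the diagram can be sketched in Python (a toy model; the real system partitions columns across GPU memory and runs the filter as a GPU kernel). Each "node" holds only its slice of the data and filters it locally, and the per-node results are concatenated. I interpret `ILIKE 'rain'` loosely here as a case-insensitive substring match.

```python
def partition(rows, n_nodes):
    """Round-robin rows across n shared-nothing nodes."""
    shards = [[] for _ in range(n_nodes)]
    for i, row in enumerate(rows):
        shards[i % n_nodes].append(row)
    return shards

def node_filter(shard, needle):
    """Each node scans only its own shard: roughly text ILIKE '%needle%'."""
    return [t for t in shard if needle.lower() in t.lower()]

tweets = ["Rain again", "sunny day", "RAINBOW!", "drizzle", "it may rain"]
shards = partition(tweets, 3)                 # data lives on 3 "nodes"
matches = [t for shard in shards for t in node_filter(shard, "rain")]
```

Since no node ever touches another node's partition, the three scans can run fully in parallel; only the small result sets are gathered.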
the product
Product: GPU-powered end-to-end big data analytics and visualization platform 
• GPU in-memory SQL database 
• Visualization: OpenGL, H.264/VP8 streaming, GPU pipeline 
• Complex analytics: image processing, machine learning, graph analytics 
• Scale to a cluster of GPU nodes 
• SQL compiler, shared scans, user-defined functions 
• Hybrid GPU/CPU execution: OpenCL and CUDA 
• Simple licensing: by # of GPUs, mobile/server versions
Map-D hardware architecture 
• Small Data: NVIDIA TEGRA mobile chip, 4GB memory — Map-D code integrated into chip memory; mobile Map-D runs small datasets as a native app 
• Large Data: single GPU (12GB memory) or single CPU (768GB memory) — Map-D code integrated into GPU or CPU memory; web-based service 
• Big Data: 8 cards = 4U box, 4 sockets = 4U box — Map-D code runs on GPU + CPU memory; 36U rack: ~400GB GPU, ~12TB CPU; next-gen flash: 40TB, 100GB/s
map-D 
www.map-d.com 
@datarefined 
info@map-d.com

Map-D: A GPU Database for Interactive Big Data Analytics
