17. Exploit Data Locality
Data is more likely to be read if:
• It was recently read (temporal locality)
• It is adjacent to recently read data, e.g. arrays or the fields of an object (spatial locality)
• It is part of an access pattern, e.g. looping or relations
• Some data is naturally accessed more frequently, e.g. following a Pareto distribution
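One cheap way to exploit temporal locality and a Pareto-style access distribution is a small LRU cache in front of slower storage. A minimal sketch in Java, built on java.util.LinkedHashMap purely for illustration (the cache size is arbitrary):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// A tiny LRU cache: recently read entries stay in memory, so the "hot"
// minority of a Pareto-distributed access pattern is served without
// touching slower storage.
public class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    public LruCache(int maxEntries) {
        super(16, 0.75f, true);   // accessOrder = true gives LRU iteration order
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;   // evict the least recently used entry
    }
}
```

Sized for roughly the hottest 20% of keys, such a cache serves the bulk of an 80/20 workload from memory.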
18. Working with the CPU’s Cache Hierarchy
• Main memory is up to 30x slower than cache
• Alleviated somewhat by NUMA, wide/multi-channel memory and larger caches
• Vector instructions
• Work with Cache Lines
• Work with Memory Pages (TLBs)
• Work with Prefetching
• Exploit NUMA with CPU affinity:
numactl --physcpubind=0 --localalloc java …
• Exploit natural data locality
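A minimal sketch of "working with cache lines" and the prefetcher: traversing a 2D array in the order it is laid out in memory touches each cache line once and gives the hardware prefetcher a predictable stride, while the transposed loop strides across rows and misses far more often. The array and its size are illustrative.

```java
// Java stores int[rows][cols] as rows of contiguous ints, so the row-major
// loop walks memory sequentially (cache-line and prefetch friendly), while
// the column-major loop jumps a full row's worth of bytes on every access.
public class Traversal {
    static long rowMajorSum(int[][] a) {
        long sum = 0;
        for (int r = 0; r < a.length; r++)
            for (int c = 0; c < a[r].length; c++)
                sum += a[r][c];            // sequential: roughly one miss per cache line
        return sum;
    }

    static long columnMajorSum(int[][] a) {
        long sum = 0;
        for (int c = 0; c < a[0].length; c++)
            for (int r = 0; r < a.length; r++)
                sum += a[r][c];            // strided: typically one miss per access
        return sum;
    }
}
```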
19. Data Locality Effects – intra machine
[Chart: read cost (scale 0-160) for Linear, Random - Page and Random - Heap access patterns on Intel U4100, i7-860 and i7-2760QM processors]
20. Tiered Storage
Local storage:
• Heap store: 5,000,000+ TPS, 10 GB
• Off-heap store: 1,000,000 TPS, 2,000+ GB
• Local disk, SSD and rotational (restartable): 100,000 TPS, 1,000+ GB
Network storage:
• Network-accessible memory: 10,000s TPS, 100,000+ GB
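A conceptual sketch of how a tiered store answers a read, checking the fastest tier first and promoting on a hit. This is not any particular product's API; the Tier interface and names are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration of tiered reads: a fast on-heap tier in front of
// a slower backing tier (off-heap, disk or network), with promotion on a hit.
interface Tier<K, V> { V get(K key); void put(K key, V value); }

class TieredStore<K, V> {
    private final Map<K, V> heapTier = new ConcurrentHashMap<>(); // fastest, smallest
    private final Tier<K, V> slowTier;                            // e.g. off-heap, disk or network

    TieredStore(Tier<K, V> slowTier) { this.slowTier = slowTier; }

    V get(K key) {
        V value = heapTier.get(key);
        if (value == null) {
            value = slowTier.get(key);                     // miss: go to the slower tier
            if (value != null) heapTier.put(key, value);   // promote hot data
        }
        return value;
    }
}
```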
21. Data Locality Effects – inter machine
Compared with a hybrid in-process and distributed cache:
Latency = L1 speed * proportion + L2 speed * proportion
L1 = ~0 ms (≈5 µs) for on-heap and 50-100 µs for off-heap
L2 = 1 ms
80% L1 Pareto model: latency = 0 * .8 + 1 * .2 = .2 ms
90% L1 Pareto model: latency = 0 * .9 + 1 * .1 = .1 ms
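The same weighted-latency model as a tiny helper, using the slide's example figures:

```java
public class LatencyModel {
    // Expected latency = L1 latency * L1 proportion + L2 latency * L2 proportion
    static double expectedLatencyMs(double l1HitRatio, double l1Ms, double l2Ms) {
        return l1Ms * l1HitRatio + l2Ms * (1 - l1HitRatio);
    }

    public static void main(String[] args) {
        System.out.println(expectedLatencyMs(0.8, 0.0, 1.0)); // 0.2 ms
        System.out.println(expectedLatencyMs(0.9, 0.0, 1.0)); // 0.1 ms
    }
}
```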
22. Columnar Storage
• Manipulate data locality
• Sorted dictionary compression for finite values
• Allows values to be held in cache for SSE instructions
• Better cache line effectiveness
• Fewer CPU cache misses for aggregate calculations
• Cross-over point is around a few dozen columns
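A minimal sketch of what column orientation plus sorted dictionary compression buys an aggregate: the finite set of values lives once in a small, cache-resident dictionary, the column itself is a dense array of codes, and a scan walks contiguous memory. The column names and sizes are made up for illustration.

```java
// Column-oriented with dictionary encoding: the "country" column is a dense
// array of small codes; the distinct values sit in a sorted dictionary that
// easily fits in cache, so an aggregate scan touches contiguous memory only.
public class ColumnarSketch {
    static final String[] COUNTRY_DICTIONARY = {"AU", "DE", "UK", "US"}; // sorted, finite values
    static final int[] countryCodes = new int[1_000_000];                // one code per row
    static final long[] revenue = new long[1_000_000];                   // another column

    // Aggregate: total revenue for one country, scanning two dense columns.
    static long revenueFor(String country) {
        int code = java.util.Arrays.binarySearch(COUNTRY_DICTIONARY, country);
        long total = 0;
        for (int row = 0; row < countryCodes.length; row++) {
            if (countryCodes[row] == code) {
                total += revenue[row];
            }
        }
        return total;
    }
}
```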
23. Parallelism
• Multi-threading
• Avoid synchronized blocks: use CAS (see the sketch after this list)
• Query using a scatter-gather pattern
• Map/Reduce e.g. Hazelcast Map/Reduce
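A small sketch of the "avoid synchronized, use CAS" point: the second counter retries a compare-and-swap instead of taking a monitor.

```java
import java.util.concurrent.atomic.AtomicLong;

public class Counters {
    private long lockedCount;
    private final AtomicLong casCount = new AtomicLong();

    // Lock-based: every increment acquires and releases a monitor.
    public synchronized void incrementLocked() {
        lockedCount++;
    }

    // CAS-based: read the current value and retry the compare-and-swap
    // until it succeeds; no monitor, no blocking.
    public void incrementCas() {
        long prev;
        do {
            prev = casCount.get();
        } while (!casCount.compareAndSet(prev, prev + 1));
    }
}
```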
24. Java: Will it make the cut?
Garbage collection limits heap usage: the G1 (Oracle) and Balanced (IBM) collectors aim for 100 ms pauses at 10 GB heaps.
[Chart: GC pause time versus heap size; Java apps become memory bound around a 4 GB heap with ~4 s pauses, while 64 GB of available memory goes unused; off-heap storage is one way to use it]
No low-level CPU access.
Java is challenged as an infrastructure language despite its newly popular use in that role.
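One portable way to put data outside the collected heap is a direct ByteBuffer; real off-heap stores build their own memory managers (often on sun.misc.Unsafe), but this minimal sketch shows the idea:

```java
import java.nio.ByteBuffer;

public class OffHeapSketch {
    public static void main(String[] args) {
        // 256 MB outside the Java heap: the GC never scans or copies its
        // contents, so it does not contribute to pause times.
        ByteBuffer offHeap = ByteBuffer.allocateDirect(256 * 1024 * 1024);

        offHeap.putLong(0, 42L);           // write a value at offset 0
        long value = offHeap.getLong(0);   // read it back
        System.out.println(value);
    }
}
```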
25. CEP/Stream Processing
• Don’t let data pool up and then process it with “pull queries”.
• Invert that and process it as it streams in (“push queries”).
• Queries execute against “tables” that break the stream up into a current time window
• Hold the window and intermediate results in memory
Results are available in real time.
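A conceptual sketch of a "push query", not tied to any particular CEP engine: each event is folded into the current time window as it arrives and the intermediate count is held in memory, so results are available immediately. The window length and event shape are assumptions.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

// Counts events per 1-second tumbling window as they stream in,
// instead of storing them and running a pull query later.
public class WindowedCounter {
    private static final long WINDOW_MS = 1_000;
    private final ConcurrentMap<Long, AtomicLong> countsByWindow = new ConcurrentHashMap<>();

    // Called for every event as it arrives ("push query").
    public void onEvent(long eventTimeMs) {
        long windowStart = (eventTimeMs / WINDOW_MS) * WINDOW_MS;
        countsByWindow.computeIfAbsent(windowStart, w -> new AtomicLong())
                      .incrementAndGet();
    }

    // The in-memory intermediate result for the window containing 'timeMs'.
    public long countFor(long timeMs) {
        AtomicLong count = countsByWindow.get((timeMs / WINDOW_MS) * WINDOW_MS);
        return count == null ? 0 : count.get();
    }
}
```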
26. In-Situ Processing
Rather than moving the data to where it will be processed, you process it in situ.
Examples:
- HANA Calculation Engine
- Google Big Query
- Exadata Storage Servers
- Hazelcast EntryProcessor and Distributed Executor Service
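As a concrete example of the last item, a sketch against the Hazelcast 3.x API: an EntryProcessor runs on the member that owns the key, so the value never crosses the network, only the small result does. The map and key names are arbitrary.

```java
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.core.IMap;
import com.hazelcast.map.AbstractEntryProcessor;

import java.util.Map;

public class InSituIncrement {
    public static void main(String[] args) {
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();
        IMap<String, Integer> counters = hz.getMap("counters");
        counters.put("page-views", 0);

        // The processor executes on the partition that owns "page-views";
        // only the (null) result crosses the network.
        counters.executeOnKey("page-views", new AbstractEntryProcessor<String, Integer>() {
            @Override
            public Object process(Map.Entry<String, Integer> entry) {
                entry.setValue(entry.getValue() + 1);
                return null;
            }
        });
    }
}
```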
27. Souped-Up Von Neumann Architecture
[Diagram: the classic architecture souped up with multi-processor, multi-core 64-bit CPUs (vector/AES instructions, compression), DRAM with more cache, NUMA, wide/multi-channel access and data locality, PCI flash, SSD (flash and RAM), and memory over the network]
29. The new data management world
Data Grid: Terracotta, Coherence, Gemfire …
30. SAP HANA
Relational | Analytical
• “Appliance”
• Aggressive Intel x86-64 optimisations
• ACID, SQL and MDX
• In-memory SSD and Disk
• Row and Column based Storage
• Fast aggregation on column store
• Single Instance 1TB limit
• Uses compression (estimated 5x reduction)
• Parallel DB: round-robin, hash, or range partitioning of a table with shared storage
• Updates as delta inserts
• Data is fed from source systems in near real-time, real-time or batch
31. Volt DB
Relational | New SQL | Operational | Analytical
• An all in-memory design
• Full SQL and full ACID
• Partitioned per core so that one thread owns its partition, which avoids locking and latching
• Redundancy provided by multiple instances with writes being replicated
• Claims to be 45x faster
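Not VoltDB's code, but a sketch of its partition-per-core idea: every operation for a key is routed to the single thread that owns that key's partition, so the partition's data structure needs no locks or latches. The store and its types are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// One single-threaded executor per partition: all work on a partition is
// serialized onto its owning thread, so its HashMap needs no synchronization.
public class PartitionedStore {
    private final int partitions;
    private final ExecutorService[] owners;
    private final Map<String, String>[] data;

    @SuppressWarnings("unchecked")
    public PartitionedStore(int partitions) {
        this.partitions = partitions;
        this.owners = new ExecutorService[partitions];
        this.data = new Map[partitions];
        for (int i = 0; i < partitions; i++) {
            owners[i] = Executors.newSingleThreadExecutor();
            data[i] = new HashMap<>();   // only ever touched by its owner thread
        }
    }

    public Future<String> put(String key, String value) {
        int p = Math.floorMod(key.hashCode(), partitions);
        Callable<String> task = () -> data[p].put(key, value);  // runs on the owner thread only
        return owners[p].submit(task);
    }
}
```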
32. Oracle Exadata
Relational | Operational | Analytical | Appliance
• Combines Oracle RAC with “Storage Servers”
• Connected within the box with InfiniBand QDR
• Storage Servers use PCI flash (not SSD) for a 22 TB hardware cache
• In-situ computation on the Storage Servers with “Smart Scan”
• Uses “Hybrid Columnar Compression”, a compromise between row and column storage
33. Terracotta BigMemory
Key-Value | Operational | Data Grid
• In-memory
• Key-value with the Ehcache and soon javax.cache APIs
• In-process (L1) and server storage (L2)
• Persistence via log-forward Fast Restart Store: SSD or Disk
• Tiered storage: local on-heap, local off-heap, server on-heap, server off-heap
• Partitions with consistent hashing
• Search with parallel in-situ execution
• Off-heap allows 2 TB uncompressed in each app server Java process and on each server partition
• Compression
• Speed ranging from 1µs to a few ms.
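A minimal sketch of the Ehcache 2.x API referred to above; the heap, off-heap (BigMemory) and Terracotta server tiers would normally be declared in ehcache.xml, which is assumed and omitted here:

```java
import net.sf.ehcache.Cache;
import net.sf.ehcache.CacheManager;
import net.sf.ehcache.Element;

public class EhcacheSketch {
    public static void main(String[] args) {
        // Picks up ehcache.xml from the classpath; that file would declare the
        // heap, off-heap and Terracotta tiers for each cache.
        CacheManager cacheManager = CacheManager.newInstance();
        Cache users = cacheManager.getCache("users");   // assumes a cache named "users" is configured

        users.put(new Element("42", "Alice"));
        Element element = users.get("42");
        System.out.println(element == null ? null : element.getObjectValue());

        cacheManager.shutdown();
    }
}
```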
34. Hazelcast
Key-Value | Operational | Data Grid
• In-memory
• Key-value Map API and javax.cache API
• Near cache and server data storage
• Tiered storage: local on-heap, local off-heap, server on-heap, server off-heap
• Partitions with consistent hashing
• Search with parallel in-situ execution
• In-situ processing with Entry Processors and Distributed Executors (see the sketch below)
• Speed ranging from 1µs to a few ms.
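A sketch of the Distributed Executor Service against the Hazelcast 3.x API: the task is serialized and sent to the member that owns a given key, keeping the computation next to the data. The task and key are illustrative.

```java
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.core.IExecutorService;

import java.io.Serializable;
import java.util.concurrent.Callable;
import java.util.concurrent.Future;

public class DistributedExecutorSketch {
    // Tasks must be serializable: they are shipped to the owning member.
    static class LocalComputation implements Callable<Integer>, Serializable {
        @Override
        public Integer call() {
            return 42;   // placeholder for work done against locally owned data
        }
    }

    public static void main(String[] args) throws Exception {
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();
        IExecutorService executor = hz.getExecutorService("default");

        // Run the task on whichever member owns the key "customer-1".
        Future<Integer> result = executor.submitToKeyOwner(new LocalComputation(), "customer-1");
        System.out.println(result.get());
    }
}
```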
35. Disk is the new tape
SSD is the new disk
Memory is the new operational store