In-memory Data Management Trends & Techniques

Published by www.hazelcast.com in Software, Technology

Transcript

  • 1. Lightning Talk: In-Memory Data Management Trends & Techniques. Greg Luck, CTO, Hazelcast
  • 2. In-Memory Hardware Trends and How to Use It
  • 3. Von Neumann Architecture
  • 4. Hardware Trends
  • 5. Commodity Multi-core Servers (chart: cores per CPU, scale 0–20)
  • 6. UMA -> NUMA (uniform to non-uniform memory access)
  • 7. Commodity 64-bit Servers: 32-bit addressing tops out at 4 GB; 64-bit addressing allows up to 18 EB.
  • 8. 50 Years of RAM Prices, Historical and Projected (chart)
  • 9. 50 Years of Disk Prices (chart)
  • 10. SSD Prices (chart): average price around $1/GB
  • 11. Cost Comparison, USD/GB (2012): Disk $0.04, SSD $1 (25x disk), DRAM $21 (525x disk). For 100 TB that is roughly $4k on disk, $100k on SSD and $2.1m in DRAM.
  • 12. Max RAM per Commodity Server (chart: TB per server, 2010–2013, scale 0–9 TB)
  • 13. Latency Across the Network (chart: µs, scale 0–70)
  • 14. Access Times & Sizes (RR latency, typical size, technology, managed by)
    Registers: < 1 ns, ~1 KB, custom CMOS, compiler
    L1 cache: 1 ns, 8–128 KB, SRAM, hardware
    L2 cache: 3 ns, 0.5–8 MB, SRAM, hardware
    L3 cache (on chip): 10–15 ns, 4–30 MB, SRAM, hardware
    Main memory: 60 ns, 16 GB–TBs, DRAM, OS/App
    SSD: 50–100 µs, 400 GB–6 TB, flash memory, OS/App
    Main memory over network: 2–100 µs, unbounded, DRAM over Ethernet/InfiniBand, OS/App
    Disk: 4–7 ms, multiple TBs, magnetic rotational disk, OS/App
    Disk over network: 6–10 ms, unbounded, disk over Ethernet/InfiniBand, OS/App
  • 15. Access Times & Sizes (repeats the table above and adds ratios): cache is up to 30x faster than main memory; memory is about 10^6 times faster than disk; memory over the network is about 10^3 times faster than disk; SSD is about 10^2 times faster than disk.
  • 16. Techniques
  • 17. Exploit Data Locality. Data is more likely to be read if:
    •  it was recently read (temporal locality)
    •  it is adjacent to other data, e.g. arrays or fields in an object (spatial locality)
    •  it is part of an access pattern, e.g. looping or relations
    Some data is also naturally accessed far more often than the rest, e.g. following a Pareto distribution.
  • 18. Working with the CPU's Cache Hierarchy
    •  Memory is up to 30x slower than cache, alleviated somewhat by NUMA, wide-channel/multi-channel memory and larger caches
    •  Use vector instructions
    •  Work with cache lines
    •  Work with memory pages (TLBs)
    •  Work with prefetching
    •  Exploit NUMA with CPU affinity, e.g. numactl --physcpubind=0 --localalloc java ...
    •  Exploit natural data locality
  • 19. Data Locality Effects, intra-machine (chart: linear vs random-page vs random-heap access on Intel U4100, i7-860 and i7-2760QM, scale 0–160)
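Below is a minimal Java sketch of the cache-line effect the last two slides describe (my own illustration, not from the deck; class names and sizes are invented). A sequential scan uses every byte of each fetched 64-byte cache line, while a strided scan fetches far more lines for the same amount of useful data.

    import java.util.function.LongSupplier;

    public class LocalityDemo {
        static final int SIZE = 1 << 24;              // 16M ints = 64 MB, far bigger than L3
        static final int[] DATA = new int[SIZE];

        // Sequential walk: the hardware prefetcher streams each 64-byte cache line once.
        static long linear() {
            long sum = 0;
            for (int i = 0; i < SIZE; i++) sum += DATA[i];
            return sum;
        }

        // Strided walk: 16 passes with a 16-int (64-byte) stride; every pass touches every
        // cache line but uses only one int from it, so many more lines are fetched per value.
        static long strided() {
            long sum = 0;
            for (int offset = 0; offset < 16; offset++)
                for (int i = offset; i < SIZE; i += 16) sum += DATA[i];
            return sum;
        }

        static void time(String label, LongSupplier body) {
            long start = System.nanoTime();
            long sum = body.getAsLong();
            System.out.printf("%s sum=%d in %d ms%n", label, sum, (System.nanoTime() - start) / 1_000_000);
        }

        public static void main(String[] args) {
            for (int i = 0; i < SIZE; i++) DATA[i] = i & 0xFF;
            for (int run = 0; run < 3; run++) {       // repeat so the JIT warms up
                time("linear ", LocalityDemo::linear);
                time("strided", LocalityDemo::strided);
            }
        }
    }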
  • 20. Tiered Storage (speed in TPS, size in GB)
    Heap store: 5,000,000+ TPS, ~10 GB
    Off-heap store: 1,000,000 TPS, 1,000+ GB
    Local disk, SSD and rotational (restartable): 100,000 TPS, 2,000+ GB
    Network storage (network-accessible memory): 10,000s TPS, 100,000+ GB
  • 21. Data Locality Effects, inter-machine. Compared with a hybrid in-process (L1) and distributed (L2) cache:
    Latency = L1 speed * proportion + L2 speed * proportion
    L1 = ~0 ms (< 5 µs) on-heap, 50–100 µs off-heap; L2 = 1 ms
    80% L1 Pareto model: latency = 0 * 0.8 + 1 * 0.2 = 0.2 ms
    90% L1 Pareto model: latency = 0 * 0.9 + 1 * 0.1 = 0.1 ms
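A quick check of the slide's arithmetic as a tiny Java sketch (names are mine); it simply evaluates the weighted-latency formula above for the 80% and 90% L1 hit-rate cases.

    public class TieredLatency {
        /** Expected read latency for a two-tier cache, weighted by the hit proportion of each tier. */
        static double expectedLatencyMs(double l1Ms, double l2Ms, double l1Proportion) {
            return l1Ms * l1Proportion + l2Ms * (1 - l1Proportion);
        }

        public static void main(String[] args) {
            // Numbers from the slide: L1 (in-process) treated as ~0 ms, L2 (remote) as 1 ms.
            System.out.printf("80%% L1: %.1f ms%n", expectedLatencyMs(0.0, 1.0, 0.80));  // 0.2 ms
            System.out.printf("90%% L1: %.1f ms%n", expectedLatencyMs(0.0, 1.0, 0.90));  // 0.1 ms
        }
    }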
  • 22. Columnar Storage
    •  Manipulates data locality
    •  Sorted dictionary compression for columns with a finite set of values
    •  Allows values to be held in cache for SSE instructions
    •  Better cache-line effectiveness and fewer CPU cache misses for aggregate calculations
    •  The cross-over point versus row storage is around a few dozen columns
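A toy Java sketch of sorted-dictionary encoding as described above (my own illustration, with invented names and data): the column is stored as a dense array of small integer codes, so an aggregation scans contiguous, cache-friendly memory and only consults the dictionary at the end.

    import java.util.Map;
    import java.util.TreeMap;

    public class DictionaryColumn {
        public static void main(String[] args) {
            String[] rows = {"DE", "US", "DE", "FR", "US", "DE"};   // raw "country" column

            // Build a sorted dictionary: distinct value -> small integer code.
            Map<String, Integer> dict = new TreeMap<>();
            for (String v : rows) dict.putIfAbsent(v, 0);
            int code = 0;
            for (Map.Entry<String, Integer> e : dict.entrySet()) e.setValue(code++);

            // Encode the column as a dense int array (contiguous, cache friendly).
            int[] encoded = new int[rows.length];
            for (int i = 0; i < rows.length; i++) encoded[i] = dict.get(rows[i]);

            // Aggregate (count per country) by scanning only the int codes.
            int[] counts = new int[dict.size()];
            for (int c : encoded) counts[c]++;

            dict.forEach((value, idx) -> System.out.println(value + " -> " + counts[idx]));
        }
    }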
  • 23. Parallelism
    •  Multi-threading
    •  Avoid synchronized; use CAS (compare-and-swap)
    •  Query using a scatter-gather pattern
    •  Map/Reduce, e.g. Hazelcast Map/Reduce
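A short Java sketch of two of these techniques (illustrative only, with invented names): a lock-free counter updated via compare-and-set instead of synchronized, and a query fanned out scatter-gather style across worker threads and then combined.

    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;
    import java.util.concurrent.atomic.AtomicLong;

    public class ParallelismDemo {
        static final AtomicLong hits = new AtomicLong();

        // CAS instead of synchronized: optimistic retry loop, no lock is ever held.
        static void recordHit() {
            long current;
            do {
                current = hits.get();
            } while (!hits.compareAndSet(current, current + 1));
        }

        // Stand-in for a per-partition scan; in a data grid this would run on the owning node.
        static long countMatchesInPartition(int partition) {
            return partition * 10L;
        }

        public static void main(String[] args) throws Exception {
            recordHit();                                     // lock-free update

            // Scatter: one query task per partition. Gather: combine the partial results.
            int partitions = 4;
            ExecutorService pool = Executors.newFixedThreadPool(partitions);
            List<Future<Long>> partials = new ArrayList<>();
            for (int p = 0; p < partitions; p++) {
                final int partition = p;
                partials.add(pool.submit(() -> countMatchesInPartition(partition)));
            }
            long total = 0;
            for (Future<Long> f : partials) total += f.get();
            System.out.println("hits=" + hits.get() + ", total matches=" + total);
            pool.shutdown();
        }
    }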
  • 24. Java: Will it make the cut? Garbage collection limits heap usage; G1 and Balanced aim for < 100 ms pauses at 10 GB heaps. The slide's diagram contrasts a modest Java heap with multi-second GC pause times against a 64 GB server whose memory is largely unused, and points to GC-invisible off-heap storage as the workaround. Java also lacks low-level CPU access, so it is challenged as an infrastructure language despite its newly popular use for this.
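A minimal sketch of the off-heap workaround mentioned above, using plain JDK facilities (a direct ByteBuffer; the class name and data are invented): the buffer's contents live outside the Java heap, so the garbage collector never scans them, at the cost of manual serialization and space management.

    import java.nio.ByteBuffer;
    import java.nio.charset.StandardCharsets;

    public class OffHeapDemo {
        public static void main(String[] args) {
            // 64 MB of native (off-heap) memory: invisible to the GC's object graph.
            ByteBuffer offHeap = ByteBuffer.allocateDirect(64 * 1024 * 1024);

            // "Store" a value: write its length, then its bytes.
            byte[] value = "hello off-heap world".getBytes(StandardCharsets.UTF_8);
            offHeap.putInt(value.length);
            offHeap.put(value);

            // "Load" it back: flip to read mode, read length + bytes, then deserialize.
            offHeap.flip();
            byte[] read = new byte[offHeap.getInt()];
            offHeap.get(read);
            System.out.println(new String(read, StandardCharsets.UTF_8));
        }
    }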
  • 25. CEP/Stream Processing
    •  Don't let data pool up and then process it with "pull queries"
    •  Invert that and process the data as it streams in: "push queries"
    •  Queries execute against "tables" that break the stream up into a current time window
    •  Hold the window and intermediate results in memory, so results are available in real time
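A toy Java sketch of the push-query idea (my own, not tied to any particular CEP product; class and method names are invented): each arriving event updates an in-memory sliding window and its intermediate sum, so the result is always current rather than computed later by a pull query.

    import java.util.ArrayDeque;
    import java.util.Deque;

    /** Keeps a running sum over a sliding time window, updated as events stream in. */
    public class SlidingWindowSum {
        private final long windowMillis;
        private final Deque<long[]> events = new ArrayDeque<>();  // [timestamp, value]
        private long sum;

        public SlidingWindowSum(long windowMillis) { this.windowMillis = windowMillis; }

        /** Push query: the aggregate is maintained on every incoming event. */
        public synchronized void onEvent(long timestamp, long value) {
            events.addLast(new long[] {timestamp, value});
            sum += value;
            evictOlderThan(timestamp - windowMillis);
        }

        /** Reading the "table" just returns the intermediate result held in memory. */
        public synchronized long currentSum() { return sum; }

        private void evictOlderThan(long cutoff) {
            while (!events.isEmpty() && events.peekFirst()[0] < cutoff) {
                sum -= events.pollFirst()[1];
            }
        }

        public static void main(String[] args) {
            SlidingWindowSum window = new SlidingWindowSum(60_000);   // 1-minute window
            window.onEvent(1_000, 5);
            window.onEvent(30_000, 7);
            window.onEvent(90_000, 2);                // evicts the event at t = 1 s
            System.out.println(window.currentSum());  // 9
        }
    }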
  • 26. In-Situ Processing. Rather than moving the data to the processing, process it in-situ. Examples: the HANA calculation engine, Google BigQuery, Exadata Storage Servers, and the Hazelcast EntryProcessor and Distributed Executor Service.
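A hedged sketch of the Hazelcast EntryProcessor pattern named above, written against the Hazelcast 3.x API as I recall it (package and class names differ in later versions; the map name and processor class are invented): the increment executes on the cluster member that owns the key, so the value itself never crosses the network.

    import com.hazelcast.core.Hazelcast;
    import com.hazelcast.core.HazelcastInstance;
    import com.hazelcast.core.IMap;
    import com.hazelcast.map.AbstractEntryProcessor;

    import java.util.Map;

    public class InSituIncrement {
        /** Runs on the partition that owns the entry; only the key travels over the wire. */
        static class IncrementProcessor extends AbstractEntryProcessor<String, Long> {
            @Override
            public Object process(Map.Entry<String, Long> entry) {
                Long current = entry.getValue();
                entry.setValue(current == null ? 1L : current + 1);
                return entry.getValue();
            }
        }

        public static void main(String[] args) {
            HazelcastInstance hz = Hazelcast.newHazelcastInstance();
            IMap<String, Long> counters = hz.getMap("page-views");

            counters.executeOnKey("/home", new IncrementProcessor());
            counters.executeOnKey("/home", new IncrementProcessor());
            System.out.println(counters.get("/home"));   // 2

            hz.shutdown();
        }
    }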
  • 27. Souped-Up Von Neumann Architecture (diagram): multiple multi-core 64-bit processors with more cache, NUMA, wide/multi-channel memory, locality and vector/AES instructions; DRAM with compression; PCI flash; SSD (flash and RAM); and memory over the network.
  • 28. The Data Management Landscape
  • 29. The new data management world (diagram): data grids such as Terracotta, Coherence, GemFire, ...
  • 30. SAP HANA. Relational | Analytical
    •  An "appliance" with aggressive IA64 optimisations
    •  ACID, SQL and MDX
    •  In-memory, SSD and disk tiers
    •  Row- and column-based storage, with fast aggregation on the column store
    •  Single-instance 1 TB limit
    •  Uses compression (est. 5x)
    •  Parallel database: round-robin, hash or range partitioning of a table with shared storage
    •  Updates applied as delta inserts
    •  Data fed from source systems in near real time, real time or batch
  • 31. VoltDB. Relational | New SQL | Operational | Analytical
    •  An all in-memory design
    •  Full SQL and full ACID
    •  Partitioned per core so that one thread owns its partition, avoiding locking and latching
    •  Redundancy provided by multiple instances with writes being replicated
    •  Claims to be 45x faster
  • 32. Oracle Exadata. Relational | Operational | Analytical | Appliance
    •  Combines Oracle RAC with "Storage Servers", connected within the box by InfiniBand QDR
    •  Storage Servers use PCI flash cards (not SSDs) as a 22 TB hardware cache
    •  In-situ computation on the Storage Servers with "Smart Scan"
    •  Uses "Hybrid Columnar Compression", a compromise between row and column storage
  • 33. Terracotta BigMemory. Key-Value | Operational | Data Grid
    •  In-memory key-value store with the Ehcache API and soon the javax.cache API
    •  In-process (L1) and server (L2) storage
    •  Persistence via the log-forward Fast Restart Store on SSD or disk
    •  Tiered storage: local on-heap, local off-heap, server on-heap, server off-heap
    •  Partitions with consistent hashing
    •  Search with parallel in-situ execution
    •  Off-heap allows 2 TB uncompressed in each app server Java process and on each server partition
    •  Compression
    •  Speed ranging from < 1 µs to a few ms
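A hedged sketch of the Ehcache-style API this slide refers to, assuming the open-source Ehcache 2.x classes (the cache name, key and sizes are invented; the BigMemory off-heap tier itself is an enterprise feature enabled through configuration, indicated only in a comment below).

    import net.sf.ehcache.Cache;
    import net.sf.ehcache.CacheManager;
    import net.sf.ehcache.Element;
    import net.sf.ehcache.config.CacheConfiguration;
    import net.sf.ehcache.config.MemoryUnit;

    public class EhcacheSketch {
        public static void main(String[] args) {
            CacheManager manager = CacheManager.newInstance();

            // A small on-heap tier; with BigMemory the bulk would go off-heap, e.g. by
            // adding .maxBytesLocalOffHeap(4, MemoryUnit.GIGABYTES) to this configuration.
            CacheConfiguration config = new CacheConfiguration()
                    .name("quotes")
                    .maxBytesLocalHeap(256, MemoryUnit.MEGABYTES);
            Cache quotes = new Cache(config);
            manager.addCache(quotes);

            quotes.put(new Element("ORCL", 42.17));
            Element hit = quotes.get("ORCL");
            System.out.println(hit == null ? "miss" : hit.getObjectValue());

            manager.shutdown();
        }
    }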
  • 34. Hazelcast. Key-Value | Operational | Data Grid
    •  In-memory key-value store with the Map API and the javax.cache API
    •  Near cache plus server-side data storage
    •  Tiered storage: local on-heap, local off-heap, server on-heap, server off-heap
    •  Partitions with consistent hashing
    •  Search with parallel in-situ execution
    •  In-situ processing with Entry Processors and Distributed Executors
    •  Speed ranging from < 1 µs to a few ms
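And a matching hedged sketch of the Hazelcast map API, again assuming the 3.x-era packages (map name and data are invented); the near cache described above is normally enabled in configuration rather than code, so this only shows basic puts and a predicate query that runs scatter-gather on the members owning the data.

    import com.hazelcast.core.Hazelcast;
    import com.hazelcast.core.HazelcastInstance;
    import com.hazelcast.core.IMap;
    import com.hazelcast.query.Predicates;

    import java.util.Collection;

    public class HazelcastMapSketch {
        public static void main(String[] args) {
            HazelcastInstance hz = Hazelcast.newHazelcastInstance();

            // Distributed map: entries are spread across the cluster by consistent hashing.
            IMap<String, Integer> ages = hz.getMap("ages");
            ages.put("alice", 34);
            ages.put("bob", 51);

            // The query is evaluated in parallel on the members that own the data.
            Collection<Integer> over40 = ages.values(Predicates.greaterThan("this", 40));
            System.out.println(over40);   // [51]

            hz.shutdown();
        }
    }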
  • 35. Disk is the new tape. SSD is the new disk. Memory is the new operational store.