
SILT: A Memory-Efficient, High-Performance Key-Value Store






  1. 1. SILT: A Memory-Efficient, High-Performance Key-Value Store Hyeontaek Lim, Bin Fan, David G. Andersen, Michael Kaminsky Carnegie Mellon University, Intel Labs 1
  2. 2. Outline 1. Introduction 2. Problem statement 3. Contributions 4. Experiments 5. Results and conclusion This template is free to use under a Creative Commons Attribution license. If you use the graphic assets (photos, icons and typographies) provided with this presentation you must keep the Credits slide. 2
  3. 3. Hello! I am Mahdi Atawneh You can find me at: @mshanak mahdi@ppu.edu 3
  4. 4. 1. Introduction 4
  5. 5. What is a key-value store? A data storage paradigm designed for storing, retrieving, and managing associative arrays. 5 Introduction
  6. 6.  E-commerce (Amazon)  Picture stores (Facebook)  Web object caching 6 key-value stores are used in:
  7. 7. Many projects have examined flash memory-based key-value stores; flash is faster than disk and cheaper than DRAM. 7 DRAM vs. Flash DRAM (main memory)  more expensive  requires constant power to retain data  much faster Flash memory  low-cost  retains data when power is removed (non-volatile)  but much slower
  8. 8.  Memory overhead: Index size per entry, Ideally 0 (no memory overhead)  Read amplification: Flash reads per query, Limits query throughput, Ideally 1 (no wasted flash reads).  Write amplification: Flash writes per entry, Limits insert throughput, Also reduces flash life expectancy 8 Three Metrics to Minimize
  9. 9. 2. Motivation / Problem statement 9
  10. 10. Motivation As key-value stores scale in both size and importance, index memory efficiency is increasingly becoming one of the most important factors for the system’s scalability and overall cost effectiveness. 10
  11. 11. Challenge: memory efficiency vs. high performance 11 This talk introduces SILT, which uses drastically less memory than previous systems while retaining high performance.
  12. 12. Related Work Many studies have tried to reduce in-memory index overhead, but: • their solutions either require more memory, • or keep parts of their index on disk, which costs extra flash reads per query (read amplification) and hurts performance. 12
  13. 13. 3. Contributions SILT (Small Index Large Table) 13
  14. 14. Contributions 1. The design and implementation of three basic key-value stores (LogStore, HashStore, and SortedStore). 2. Synthesis of these basic stores to build SILT. 3. An analytic model that enables an explicit and careful balance between memory, storage, and computation. 14
  16. 16. Basic store design: LogStore, HashStore, and SortedStore (overall overview). 16
  18. 18. SILT: 1. LogStore 1. Write-friendly key-value store. 2. Uses a short tag (15 bits) in the in-memory index rather than the entire key hash (160 bits). 3. A customized version of cuckoo hashing is used. 4. An in-memory hash table maps keys to their contents on flash. 5. Only one instance exists at a time. 18
  19. 19. SILT: 1. LogStore How does it work? 19
  20. 20. SILT: 1. LogStore (Slides 20–28: a step-by-step figure.) Each inserted key K is hashed to two candidate buckets h1(K) and h2(K) in the in-memory cuckoo hash table (DRAM); the table stores only the short 15-bit tag and an offset, while the full 160-bit key hash and the value are appended to the on-flash log. In the figure, K2, K1, K4, and K3 are inserted this way; when a new key K5 finds both of its candidate buckets h1(k5) and h2(k5) occupied, the LogStore is full. 28
  29. 29. SILT: 1. LogStore LogStore is full? 1. SILT freezes the LogStore. 2. It initializes a new one without expensive rehashing. 3. The old LogStore is converted to a HashStore (in the background). 29
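The per-slide animation above can be condensed into a runnable sketch. This is not the SILT source code: the class and method names are illustrative, SHA-1 stands in for whatever hash function the real system uses, a Python list stands in for the on-flash log, and cuckoo displacement is reduced to a comment.

```python
import hashlib

class LogStore:
    """Sketch of SILT's LogStore: an in-memory cuckoo hash index
    (15-bit tag + log offset) over an append-only on-flash log."""

    def __init__(self, n_buckets=8):
        self.buckets = [None] * n_buckets  # each slot holds (tag, offset)
        self.log = []                      # stands in for the on-flash log

    def _hashes(self, key):
        h = hashlib.sha1(key.encode()).digest()
        h1 = int.from_bytes(h[0:4], 'big') % len(self.buckets)
        h2 = int.from_bytes(h[4:8], 'big') % len(self.buckets)
        tag = int.from_bytes(h[8:10], 'big') & 0x7FFF  # 15-bit tag
        return h1, h2, tag

    def put(self, key, value):
        offset = len(self.log)
        self.log.append((key, value))      # append full key+value to "flash"
        h1, h2, tag = self._hashes(key)
        for b in (h1, h2):
            if self.buckets[b] is None:
                self.buckets[b] = (tag, offset)
                return True
        # Both candidate buckets taken: a real implementation would try
        # cuckoo displacement first, then declare the LogStore full.
        return False

    def get(self, key):
        h1, h2, tag = self._hashes(key)
        for b in (h1, h2):
            entry = self.buckets[b]
            if entry and entry[0] == tag:  # tag matches: likely hit
                k, v = self.log[entry[1]]  # one flash read
                if k == key:               # verify full key on flash
                    return v
        return None
```

The short tag filters out almost all wrong buckets in DRAM, so a query costs at most one flash read per candidate bucket rather than a full-key comparison in memory.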
  30. 30. Basic store design: LogStore, HashStore, and SortedStore (overall overview). 30
  31. 31. SILT: 2. HashStore 1. Converts a LogStore into a more memory-efficient data structure. 2. Sorts the LogStore's entries into hash order. 3. Saves a lot of memory by eliminating the index and reordering the on-flash pairs from insertion order to hash order. 31
  32. 32. SILT: 2. HashStore 32
  33. 33. SILT: 2. HashStore (Slides 33–37: a step-by-step figure.) Converting a LogStore into a more memory-efficient data structure: 1. Remove the Offset column from the in-memory table, keeping only the short tags. 2. Sort the on-flash (key, value) pairs according to the hash order of the table. 37
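The two conversion steps above can be sketched as a single pass over the frozen index. All names are illustrative: here buckets is the frozen cuckoo index (a list of (tag, offset) slots or None, in bucket order) and log is the insertion-ordered on-flash data; this sketch compacts out empty buckets, and the real on-flash layout may differ.

```python
def to_hashstore(buckets, log):
    """Sketch of the LogStore -> HashStore conversion.
    Dropping the offset column works because after reordering,
    the i-th surviving tag in DRAM corresponds to the i-th
    (key, value) pair on flash."""
    tags, flash = [], []
    for slot in buckets:             # walk the index in hash (bucket) order
        if slot is None:
            continue                 # empty bucket: nothing to keep
        tag, offset = slot
        tags.append(tag)             # keep only the 15-bit tag in DRAM
        flash.append(log[offset])    # rewrite the pair in hash order
    return tags, flash
```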
  38. 38. SILT: 2. HashStore A HashStore holds the contents of many converted LogStores. 38
  39. 39. Basic store design: LogStore, HashStore, and SortedStore (overall overview). 39
  40. 40. SILT: 3. SortedStore  Multiple HashStores can be merged into one SortedStore.  Focuses on minimizing the index representation (by using sorted data on flash).  From the sorted data, the index is rebuilt as a trie data structure that uses 0.4 bytes of index memory per key on average.  Keeps read amplification low (exactly 1) by pointing directly to the correct location on flash. 40
  41. 41. SILT: 3. SortedStore Indexing Sorted Data with a Trie: leaf = a key; internal node = the common prefix of the keys represented by its descendants. How does it work? 41
  42. 42. SILT: 3. SortedStore Indexing Sorted Data with a Trie: Example 42
  43. 43. (Slides 43–53: a step-by-step figure constructing the trie over eight sorted keys, numbered 0–7, from their bit prefixes.) 53
  54. 54.  SortedStore uses a compact recursive representation to eliminate pointers 54
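That recursive representation can be sketched as follows: a trie over sorted, distinct, fixed-length bit-string keys is written out as a preorder list of left-subtree sizes, and a lookup walks those counts to compute a key's rank, i.e. its slot on flash. This is illustrative code, not the paper's entropy-coded implementation, and the function names are made up.

```python
def build_trie(keys, depth=0):
    """Preorder list of left-subtree sizes for sorted, distinct bit strings."""
    if len(keys) <= 1:
        return []                                  # leaf: nothing to emit
    left = [k for k in keys if k[depth] == '0']    # keys continuing with 0
    right = [k for k in keys if k[depth] == '1']   # keys continuing with 1
    return ([len(left)]
            + build_trie(left, depth + 1)
            + build_trie(right, depth + 1))

def skip(rep, pos, n):
    """Advance pos past the representation of a subtree holding n keys."""
    if n <= 1:
        return pos
    left = rep[pos]
    pos = skip(rep, pos + 1, left)
    return skip(rep, pos, n - left)

def rank(rep, key, n, depth=0, pos=0, base=0):
    """Rank of key among n sorted keys = its location on flash (one read)."""
    if n <= 1:
        return base
    left = rep[pos]
    if key[depth] == '0':                          # descend left
        return rank(rep, key, left, depth + 1, pos + 1, base)
    pos = skip(rep, pos + 1, left)                 # jump over left subtree
    return rank(rep, key, n - left, depth + 1, pos, base + left)
```

Because the representation stores only counts, no child pointers are needed, and rank() returns the exact sorted position of the key, so one flash read suffices.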
  55. 55. SILT Lookup Process  Queries look up the stores in sequence (from newest to oldest).  Note: inserts go only to the LogStore. 55
  57. 57. Contributions 1. The design and implementation of three basic key-value stores (LogStore, HashStore, and SortedStore). 2. Synthesis of these basic stores to build SILT. 3. An analytic model that enables an explicit and careful balance between memory, storage, and computation. 57
  59. 59. Analytic model 59 Memory overhead (MA) = total memory consumed / number of items. Read amplification (RA) = data read from flash / data read by application. Write amplification (WA) = data written to flash / data written by application.
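Assuming the data quantities are measured in bytes over a run of the system, the model is just three ratios; the function name and example numbers below are made up for illustration.

```python
def silt_metrics(total_memory, num_items,
                 flash_read, app_read,
                 flash_written, app_written):
    """The three metrics of the analytic model, as plain ratios."""
    ma = total_memory / num_items          # memory overhead: bytes per key
    ra = flash_read / app_read             # read amplification (ideal: 1)
    wa = flash_written / app_written       # write amplification (ideal: 1)
    return ma, ra, wa
```

For example, a store that indexes 100 M keys using 70 MB of DRAM has MA = 0.7 bytes/key.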
  60. 60. 4. Evaluation & Experiments 60
  61. 61. Experiment Setup 61 CPU: 2.80 GHz (4 cores). Flash drive: SATA 256 GB (48 K random 1024-byte reads/sec). Workload size: 20-byte key, 1000-byte value, ≥ 50 M keys. Query pattern: uniformly distributed.
  62. 62. Experiment 1 LogStore Alone: Too Much Memory Workload: 90% GET (50-100 M keys) + 10% PUT (50 M keys) 62
  63. 63. Experiment 2 LogStore + SortedStore: Still Much Memory Workload: 90% GET (50-100 M keys) + 10% PUT (50 M keys) 63
  64. 64. Experiment 3 Full SILT: Very Memory Efficient Workload: 90% GET (50-100 M keys) + 10% PUT (50 M keys) 64
  65. 65. Thanks!
