It Takes Two: Instrumenting the Interaction between In-Memory Databases and Solid-State Drives CIDR 2020 presentation


  1. It Takes Two: Instrumenting the Interaction between In-Memory Databases and Solid-State Drives • Alberto Lerner (1), Jaewook Kwak (2), Sangjin Lee (2), Kibin Park (2), Yong Ho Song (2, 3), Philippe Cudré-Mauroux (1) • (1) XI Lab – University of Fribourg, Switzerland • (2) ENC Lab – Hanyang University, Korea • (3) Samsung Electronics, Korea • CIDR – January 2020 – Amsterdam
  2. Motivation • Where is time going? • CPU/cache utilization -> HW performance counters • Per-instruction cost -> pprof, Linux perf tool • Operating system impact -> systemtap, several others • SSD performance -> ?
  3. Challenges in In-Memory Database Durability • Log needs to be written as fast as possible • Checkpoint competes with client requests for memory and disk access • Can we understand the interference? Was the transaction log I/O pattern efficient to begin with? • (Figure labels: users, transactions, checkpoint (CP) workers, host, storage, transaction log, checkpoint)
  4. Cosmos+ OpenSSD • Idea: let’s instrument an actual device! • SSD rapid prototyping platform • SoC-based • Fully functional • Open source firmware • Next generation is in the final stages of development
  5. Anatomy of an SSD (figure)
  6. Lifetime of a Write (figure)
  7. Lifetime of a Write (figure, continued)
  8. Instrumentation • Timestamping (in red) • Counters (in green) • Pagemap • Mechanisms • Triggers • Data extraction commands
  9. Performance Event Records (PEV) • Currently four types of records • IO_TIMESTAMP: regular timestamp stations • GC_TIMESTAMP: FTL timestamp stations • PERFORMANCE_INDEX: aggregated counter • PERFORMANCE_INDEX_PER_CH: per-channel counters
  10. Experimenting with Timestamps • Simulated in-memory database workloads • (1-1) WAL – IPP • (1-N) WAL – CALC • (M-N) SILOR / CPR • (Figure labels: transaction log, checkpoint)
  11. Delay Examples (figure; time markers t0 and t1)
  12. Interference Analysis (figure; annotations: "No interference" and "2.5x")
  13. Research Agenda I – Instrumentation • Functionality Limitations • Currently limited to 4 channels • Further annotations to trace back valid copies • Contextual triggers • Signal Generation • Process instrumentation records on the fly • Identify scenarios where a scheduling policy change is beneficial
  14. Research Agenda II – SSD as a Platform • Adaptive Scheduling • Respond instantaneously to signals generated by changing priorities • In-Storage Checkpoint "Derivation" • Move the checkpoint process partially or entirely into the device
  15. Conclusion • SSDs don’t have to be black boxes • The instrumented Cosmos+ allows designers of both databases and FTLs to analyze and understand interference in workloads • Opportunities to • Have SSDs interact with applications in richer ways • Exploit new possibilities of near-data computing for databases
  16. Q&A • Thank you!
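
The record types on slide 9 and the delay analysis on slides 11 and 12 can be illustrated with a minimal Python sketch. Only the four record type names come from the slides. The `TimestampRecord` fields (`request_id`, `station`, `time_us`), the station names in the usage note, and both helper functions are assumptions made for illustration; they are not the Cosmos+ firmware format or the authors' analysis code.

```python
from dataclasses import dataclass
from enum import Enum


class PevType(Enum):
    # The four record types named on slide 9.
    IO_TIMESTAMP = "IO_TIMESTAMP"
    GC_TIMESTAMP = "GC_TIMESTAMP"
    PERFORMANCE_INDEX = "PERFORMANCE_INDEX"
    PERFORMANCE_INDEX_PER_CH = "PERFORMANCE_INDEX_PER_CH"


@dataclass(frozen=True)
class TimestampRecord:
    # Hypothetical layout: every field here is an assumption,
    # not the record format used by the instrumented device.
    kind: PevType
    request_id: int
    station: str
    time_us: int


def station_delays(records, request_id):
    """Return (from_station, to_station, delay_us) tuples for one request.

    Only IO_TIMESTAMP records are considered; they are ordered by time and
    the delay is the difference between consecutive stations.
    """
    stamps = sorted(
        (r for r in records
         if r.request_id == request_id and r.kind is PevType.IO_TIMESTAMP),
        key=lambda r: r.time_us,
    )
    return [(a.station, b.station, b.time_us - a.time_us)
            for a, b in zip(stamps, stamps[1:])]


def slowdown(baseline_us, contended_us):
    """Ratio of a contended latency to a baseline latency."""
    return contended_us / baseline_us
```

With records stamped at hypothetical stations such as "host_submit" and "flash_done", `station_delays(records, 1)` lists where request 1 spent its time between stations. The `slowdown` helper only shows how a ratio like the "2.5x" annotation could be computed; the slides do not state how that figure was derived.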