It Takes Two: Instrumenting the Interaction between In-Memory Databases and Solid-State Drives CIDR 2020 presentation

  1. It Takes Two: Instrumenting the Interaction between In-Memory Databases and Solid-State Drives
     Alberto Lerner (1), Jaewook Kwak (2), Sangjin Lee (2), Kibin Park (2), Yong Ho Song (2,3), Philippe Cudré-Mauroux (1)
     (1) XI Lab – University of Fribourg, Switzerland
     (2) ENC Lab – Hanyang University, Korea
     (3) Samsung Electronics, Korea
     CIDR – January 2020 – Amsterdam
  2. Motivation
     • Where is time going?
       • CPU/cache utilization -> HW performance counters
       • Per-instruction cost -> pprof, Linux perf tool
       • Operating system impact -> SystemTap, several others
       • SSD performance -> ?
  3. Challenges in In-Memory Database Durability
     • Log needs to be written as fast as possible
     • Checkpoint competes with client requests for memory and disk access
     • Can we understand the interference? Was the transaction log I/O pattern efficient to begin with?
     [Figure: users' transactions (Txn) and checkpoint (CP) workers on the host; transaction log and checkpoint on storage]
  4. Cosmos+ OpenSSD
     • Idea: let’s instrument an actual device!
     • SSD rapid prototyping platform
     • SoC-based
     • Fully functional
     • Open-source firmware
     • Next generation is in the final stages of development
  5. Anatomy of an SSD
     [Figure]
  6. Lifetime of a Write
     [Figure]
  7. Lifetime of a Write (continued)
     [Figure]
  8. Instrumentation
     • Timestamping (in red)
     • Counters (in green)
     • Pagemap
     • Mechanisms
       • Triggers
       • Data extraction commands
  9. Performance Event Records (PEV)
     • Currently four types of records:
       IO_TIMESTAMP              Regular timestamp stations
       GC_TIMESTAMP              FTL timestamp stations
       PERFORMANCE_INDEX         Aggregated counter
       PERFORMANCE_INDEX_PER_CH  Per-channel counters
  10. Experimenting with Timestamps
      • Simulated in-memory database workloads
        • (1-1) WAL – IPP
        • (1-N) WAL – CALC
        • (M-N) SILOR / CPR
      [Figure: transaction log and checkpoint streams]
  11. Delay Examples
      [Figure: timeline marked t0 and t1]
  12. Interference Analysis
      [Figure: plots labeled "No interference" and "2.5x"]
  13. Research Agenda I – Instrumentation
      • Functionality limitations
        • Currently limited to 4 channels
        • Further annotations to trace back valid copies
        • Contextual triggers
      • Signal generation
        • Process instrumentation records on the fly
        • Identify scenarios where a scheduling policy change is beneficial
  14. Research Agenda II – SSD as a Platform
      • Adaptive scheduling
        • Respond instantaneously to generated signals by changing priorities
      • In-storage checkpoint “derivation”
        • Move the checkpoint process partially or entirely into the device
  15. Conclusion
      • SSDs don’t have to be black boxes
      • The instrumented Cosmos+ allows designers of both databases and FTLs to analyze and understand interference in workloads
      • Opportunities to
        • Have SSDs interact with applications in richer ways
        • Exploit new possibilities of near-data computing for databases
  16. Q&A
      Thank you!
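
As a rough illustration of how the timestamp records on slides 9 to 12 might be analyzed offline, the Python sketch below computes a per-request delay (t1 - t0) from IO_TIMESTAMP records and compares a log-only run with a run that also checkpoints. Only the four record type names come from the slides. The field names (`request_id`, `station`, `time_us`), the idea that records arrive as Python objects, the median-based slowdown ratio, and all sample numbers are assumptions made for illustration; they are not the device's actual record layout or measured results.

```python
from dataclasses import dataclass
from enum import Enum
from statistics import median


class PevType(Enum):
    # The four record types listed on the PEV slide.
    IO_TIMESTAMP = "IO_TIMESTAMP"
    GC_TIMESTAMP = "GC_TIMESTAMP"
    PERFORMANCE_INDEX = "PERFORMANCE_INDEX"
    PERFORMANCE_INDEX_PER_CH = "PERFORMANCE_INDEX_PER_CH"


@dataclass(frozen=True)
class TimestampRecord:
    # Hypothetical layout: the real record format is not given in the slides.
    kind: PevType
    request_id: int   # assumed identifier tying stations to one I/O request
    station: int      # assumed index of the timestamp station
    time_us: float    # assumed timestamp in microseconds


def request_delays(records):
    """Return {request_id: t1 - t0}, using the earliest and latest
    IO_TIMESTAMP seen for each request. Other record types are ignored."""
    first_last = {}
    for r in records:
        if r.kind is not PevType.IO_TIMESTAMP:
            continue
        lo, hi = first_last.get(r.request_id, (r.time_us, r.time_us))
        first_last[r.request_id] = (min(lo, r.time_us), max(hi, r.time_us))
    return {rid: hi - lo for rid, (lo, hi) in first_last.items()}


def slowdown(baseline_delays, contended_delays):
    """Ratio of median delays: contended run over baseline run."""
    return median(contended_delays.values()) / median(baseline_delays.values())


# Made-up sample data: a log-only run and a run with checkpoint traffic.
baseline = [
    TimestampRecord(PevType.IO_TIMESTAMP, 1, 0, 0.0),
    TimestampRecord(PevType.IO_TIMESTAMP, 1, 3, 40.0),
    TimestampRecord(PevType.IO_TIMESTAMP, 2, 0, 10.0),
    TimestampRecord(PevType.IO_TIMESTAMP, 2, 3, 50.0),
]
contended = [
    TimestampRecord(PevType.IO_TIMESTAMP, 1, 0, 0.0),
    TimestampRecord(PevType.IO_TIMESTAMP, 1, 3, 80.0),
    TimestampRecord(PevType.IO_TIMESTAMP, 2, 0, 10.0),
    TimestampRecord(PevType.GC_TIMESTAMP, 2, 1, 500.0),  # skipped by the filter
    TimestampRecord(PevType.IO_TIMESTAMP, 2, 3, 90.0),
]

base = request_delays(baseline)
cont = request_delays(contended)
print(base, cont, slowdown(base, cont))
```

The median is used here only because it is less sensitive to a few outlier requests than the mean; any other summary statistic could be substituted.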
