Disksim with SSD_extension


An analysis of the source code of the disk simulator Disksim and its SSD extension from Microsoft.

Published in: Technology

  1. Disksim with SSD extension: a developer's perspective. Jiannan Ouyang, PhD CS@PITT, 2011/04/07
  2. Outline: Overview; Disksim implementation; SSD extension
  3. Disksim: an open-source disk simulator originally developed at UMich and enhanced at CMU.
  4. Disksim features
     Various device models, including disk, simpledisk, and memsmodel
     Controller models: simple and smart (with cache)
     Trace synthesis and different trace file formats
     DIXtrac: automatic disk characterization
  5. ssdmodel
     Developed by Microsoft
     NOT for any specific SSD device
     Models an idealized SSD that is parameterized by the properties of NAND flash chips
     Cache is NOT natively supported
  6. Source Dir
     src/         disksim source (disksim_*.c/h)
     ssdmodel/    ssd extension source (ssd_*.c/h)
     diskmodel/   diskmodel layout and mech
     memsmodel/   MEMS device model
     libparam/    parameter processing lib
     ...
  7. Outline: Overview; Disksim implementation; SSD extension
  8. Disksim source: src/
     disksim_main*         main entrance      main()
     disksim_iodriver*     driver             iodriver_send_event_down_path()
     disksim_bus*          bus                bus_deliver_event()
     disksim_controller*   controller         controller_event_arrive()
     disksim_diskctlr*     disk controller    disk_event_arrive()
     ...
  9. Disksim Control Path
     Event-based system:
     Various types of events: io, interrupt, timer...
     All events are stored in a global queue in time order.
     addtointq() and removefromintq() are used to access the global queue.
     Equivalent code:
         while ((curr = getnextevent())) {
             switch (curr->type) {
             case IO_REQUEST_ARRIVE:
                 iodriver_request(curr);
                 break;
             }
         }
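
The time-ordered global queue described on this slide can be sketched as a sorted linked list. This is only an illustration of the idea, not Disksim's code: the `event` struct and the names `queue_add()` and `queue_pop()` are hypothetical stand-ins for what addtointq() and removefromintq() do conceptually.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event record; Disksim's real event structs carry more fields. */
typedef struct event {
    double time;          /* simulated time at which the event fires */
    int type;             /* e.g. an I/O request arrival */
    struct event *next;
} event;

/* Insert ev so the list stays sorted by time (earliest first).
 * Events with equal times keep their insertion order. */
static void queue_add(event **head, event *ev)
{
    assert(head != NULL && ev != NULL);
    while (*head != NULL && (*head)->time <= ev->time)
        head = &(*head)->next;
    ev->next = *head;
    *head = ev;
}

/* Remove and return the earliest event, or NULL if the queue is empty. */
static event *queue_pop(event **head)
{
    event *ev = *head;
    if (ev != NULL) {
        *head = ev->next;
        ev->next = NULL;
    }
    return ev;
}
```

A simulation loop would call `queue_pop()` repeatedly and switch on `type`, while handlers schedule follow-up events with `queue_add()`.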
  10. Example
      src/disksim_iosim.c io_internal_event()
          case IO_ACCESS_ARRIVE:
              iodriver_schedule(0, curr);
              break;
      src/disksim_iodriver.c iodriver_schedule()
          iodriver_send_event_down_path(curr);
      src/disksim_iodriver.c iodriver_send_event_down_path()
          bus_deliver_event(busno.byte[0], slotno.byte[0], curr);
  11. Example (cont.)
      src/disksim_bus.c bus_deliver_event()
          case CONTROLLER:
              controller_event_arrive(devno, curr);
              break;
          case DEVICE:
              ASSERT(devno == curr->devno);
              device_event_arrive(curr);
              break;
      This control flow is the simulation of one event.
  12. Disksim & Device Interface
          INLINE void device_event_arrive (ioreq_event *curr)
          {
              ASSERT1 ((curr->devno >= 0) && (curr->devno < numdevices), "curr->devno", curr->devno);
              return disksim->deviceinfo->devices[curr->devno]->event_arrive(curr);
          }
      Function pointer! By dynamic tracing using gdb, we found that:
      For disk, it jumps to disk_event_arrive()
      For ssd, it jumps to ssd_event_arrive()
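
The function-pointer dispatch on this slide can be illustrated with a stripped-down table of devices. This is a sketch under assumptions: the `ioreq` and `device` structs, the counters, and `dispatch()` are hypothetical simplifications, not Disksim's actual types.

```c
#include <assert.h>

typedef struct ioreq { int devno; } ioreq;   /* hypothetical, simplified */

typedef struct device {
    void (*event_arrive)(ioreq *curr);       /* per-device handler */
} device;

static int disk_events, ssd_events;          /* counters for illustration only */
static void disk_arrive(ioreq *curr) { (void)curr; disk_events++; }
static void ssd_arrive(ioreq *curr)  { (void)curr; ssd_events++; }

/* Device 0 behaves like a disk, device 1 like an SSD. */
static device devices[] = { { disk_arrive }, { ssd_arrive } };
enum { NUM_DEVICES = sizeof devices / sizeof devices[0] };

/* The caller never names the device type: the table entry decides. */
static void dispatch(ioreq *curr)
{
    assert(curr->devno >= 0 && curr->devno < NUM_DEVICES);
    devices[curr->devno].event_arrive(curr);
}
```

Adding a new device model then means registering one more handler in the table, which is how an extension such as ssdmodel can plug in without changing the caller.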
  13. event_arrive: disk vs. ssd
      disk_event_arrive()
          case IO_ACCESS_ARRIVE:                   disk_request_arrive(curr);
          case DEVICE_OVERHEAD_COMPLETE:           disk_request_arrive(curr);
          case DEVICE_BUFFER_SEEKDONE:             disk_buffer_seekdone(currdisk, curr);
          case DEVICE_BUFFER_SECTOR_DONE:          disk_buffer_sector_done(currdisk, curr);
          case DEVICE_GOTO_REMAPPED_SECTOR:        disk_goto_remapped_sector(currdisk, curr);
          case DEVICE_GOT_REMAPPED_SECTOR:         disk_got_remapped_sector(currdisk, curr);
          case DEVICE_PREPARE_FOR_DATA_TRANSFER:   disk_prepare_for_data_transfer(curr);
          case DEVICE_DATA_TRANSFER_COMPLETE:      disk_reconnection_or_transfer_complete(curr);
          case IO_INTERRUPT_COMPLETE:              disk_interrupt_complete(curr);
      ssd_event_arrive()
          case DEVICE_OVERHEAD_COMPLETE:           ssd_request_arrive(curr);
          case DEVICE_ACCESS_COMPLETE:             ssd_access_complete(curr);
          case DEVICE_DATA_TRANSFER_COMPLETE:      ssd_bustransfer_complete(curr);
          case IO_INTERRUPT_COMPLETE:              ssd_interrupt_complete(curr);
          case SSD_CLEAN_GANG:                     ssd_clean_gang_complete(curr);
          case SSD_CLEAN_ELEMENT:                  ssd_clean_element_complete(curr);
      "buffer" events are cache related.
      "clean" events are garbage collection and wear-leveling.
      "remapped sector" events seem to be related to data layout.
      "Gang" and "Element" specify (not sure) the allocation and reclaim unit.
  14. Outline: Overview; Disksim implementation; SSD extension
  15. ssdmodel features
      Adds an auxiliary level of parallel elements, each with a closed queue, to represent flash elements or gangs
      Adds logic to serialize request completions from these parallel elements
      For each element, maintains data structures to represent SSD logical block maps, cleaning state, and wear-leveling state
      Introduces a delay when a request is processed
      Parameters include background cleaning, gang size, gang organization, interleaving, and overprovisioning
  16. Flash Package Internal
  17. Flash Chip Performance
      1. Latency
         bus<->data reg: 100us
         media->reg: read 25us
         reg->media: write 200us
         erase: 1.5ms
      2. Two-plane commands can be executed on their plane pairs 0&1 or 2&3
      3. Support background copy on the same plane
      4. Bandwidth and Interleave
         src plane -> dest plane: 4 page copying (100us per page)
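
As a worked example of the latencies above, the cost of reading or writing pages can be added up from the individual steps. This sketch assumes fully serialized operations with no interleaving or two-plane commands, and assumes a page read is a media-to-register read followed by a bus transfer, with a page write being the reverse; the function names are hypothetical.

```c
#include <assert.h>

/* Per-operation latencies from the slide, in microseconds. */
enum {
    BUS_XFER_US = 100,   /* bus <-> data register */
    READ_US     = 25,    /* media -> register */
    PROGRAM_US  = 200,   /* register -> media */
    ERASE_US    = 1500   /* block erase */
};

/* Serialized cost of reading n pages: array read, then bus transfer. */
static int pages_read_us(int n)
{
    assert(n >= 0);
    return n * (READ_US + BUS_XFER_US);
}

/* Serialized cost of writing n pages: bus transfer, then program. */
static int pages_write_us(int n)
{
    assert(n >= 0);
    return n * (BUS_XFER_US + PROGRAM_US);
}
```

Under these assumptions one page read costs 125us and one page write costs 300us, so an erase (`ERASE_US`) is as expensive as five page writes.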
  18. SSD Simulation
      Logical Block Map: allocation pool
      Cleaning: greedy or wear-leveling aware
      Parallelism and Interconnect Density: ganging, interleaving, background cleaning
      Persistence: saving mapping information per block in DRAM
  19. Interconnection - Ganging
      A gang of flash packages can be utilized in synchrony to optimize a multi-page request.
      Allows multiple packages to be used in parallel while sharing one request queue
      A request queue can be associated with each gang or with each element (full interconnection mode)
  20. Logical Block Map
      Use the allocation pool to think about how an SSD allocates flash blocks to service write requests
      An allocation pool can be a flash package or a gang
      Static: a portion of each LBA constitutes a fixed mapping to a specific allocation pool
      Dynamic: the non-static portion of an LBA is the lookup key for a mapping within a pool
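
The static/dynamic split of an LBA can be sketched as follows. Which bits form the static portion is not stated on the slide; this example assumes the low-order part (a modulo) selects the pool, and the type and function names are hypothetical.

```c
#include <assert.h>

/* Result of splitting a logical page number (hypothetical type). */
typedef struct {
    unsigned pool;       /* static part: which allocation pool */
    unsigned long key;   /* dynamic part: lookup key inside that pool */
} lba_split;

/* Assumption: the low-order part of the address selects the pool,
 * which spreads consecutive logical pages across pools. */
static lba_split split_lba(unsigned long lpn, unsigned num_pools)
{
    lba_split s;
    assert(num_pools > 0);
    s.pool = (unsigned)(lpn % num_pools);
    s.key = lpn / num_pools;
    return s;
}
```

With 4 pools, logical page 10 maps to pool 2 with key 2; the pool choice never changes, while the key would be resolved through that pool's own mapping table on every write.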
  21. Garbage Collection (Cleaning)
      active block: a block available to hold incoming writes in a pool
      superseded page: an out-of-date page
      cleaning efficiency: (superseded pages / total pages) in a block
      pure greedy approach: choose blocks to clean based on potential cleaning efficiency
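
The pure greedy policy can be sketched as picking the block with the highest cleaning efficiency. This is a minimal illustration, not the code of ssd_clean_blocks_greedy(); the `block_info` struct and both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    int superseded;   /* out-of-date pages in the block */
    int total;        /* total pages in the block */
} block_info;

/* Cleaning efficiency as defined on the slide: superseded / total. */
static double cleaning_efficiency(const block_info *b)
{
    assert(b->total > 0);
    return (double)b->superseded / b->total;
}

/* Return the index of the block with the highest cleaning efficiency,
 * or -1 if no block has any superseded page. Ties go to the lowest index. */
static int pick_greedy(const block_info *blocks, size_t n)
{
    int best = -1;
    double best_eff = 0.0;
    for (size_t i = 0; i < n; i++) {
        double eff = cleaning_efficiency(&blocks[i]);
        if (eff > best_eff) {
            best_eff = eff;
            best = (int)i;
        }
    }
    return best;
}
```

A block with 40 superseded pages out of 64 wins over one with 10 out of 64, because cleaning it frees more space for the same erase cost.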
  22. Wear-Leveling
      average remaining lifetime (ARL) of a block
      age variance (say 20%) of the ARL
      retirement age (say 85%) of the ARL
      Wear-aware garbage collection:
      1. If ARL < retirement, migrate cold data into this block from a migration-candidate queue, and recycle the head block of the queue. Populate the queue with new blocks with cold data.
      2. Otherwise, if ARL < age variance, then restrict recycling of the block with a probability that increases linearly as the remaining lifetime drops to 0. (80% of average ~ Prob of recycle = 1; 0% of average ~ 0)
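
The rate-limiting rule can be sketched as a recycle probability. This reads the slide's parenthetical literally: a block at or above 80% of the average remaining lifetime is always recycled, and the probability falls linearly to 0 as its remaining lifetime drops to 0. The 0.8 factor (from a 20% age variance) and the function name are assumptions for illustration.

```c
#include <assert.h>

/* Probability that a block chosen for cleaning is actually recycled.
 * remaining: remaining lifetime of this block (e.g. erase cycles left)
 * average:   average remaining lifetime over all blocks */
static double recycle_probability(double remaining, double average)
{
    double threshold;
    assert(average > 0.0 && remaining >= 0.0);
    threshold = 0.8 * average;      /* assumed: 20% age variance */
    if (remaining >= threshold)
        return 1.0;                 /* young enough: never restricted */
    return remaining / threshold;   /* linear ramp down to 0 */
}
```

A cleaner would draw a uniform random number in [0, 1) and skip the block when the draw is not below this probability, so worn blocks are erased less often.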
  23. Source: ssdmodel/
      ssdmodel is very simple; all C files are listed below:
      ssd.c          main                                   ssd_event_arrive()
      ssd_clean.c    garbage collection and wear leveling   ssd_clean_blocks_greedy()
      ssd_gang.c     several flash packages organised       ssd_activate_gang()
                     as a gang
      ssd_timing.c   timing model                           ssd_compute_access_time()
      ssd_utils.c    util
      ssd_init.c     init
  24. Example
      Event sequence for one request:
          ssd_request_arrive
          -> ssd_interrupt_complete (reconnect)
          -> ssd_bustransfer_complete
          -> ssd_access_complete
          -> ssd_interrupt_complete (completion)
      ssd_bustransfer_complete() -> ssd_media_access_request();
      ssdmodel/ssd.c: ssd_media_access_request()
          case SSD_ALLOC_POOL_PLANE:
          case SSD_ALLOC_POOL_CHIP:
              ssd_media_access_request_element(curr);
              break;
          case SSD_ALLOC_POOL_GANG:
          #if SYNC_GANG
              ssd_media_access_request_gang_sync(curr);
          #else
              ssd_media_access_request_gang(curr);
          #endif
              break;
  25. Example (cont.)
      ssd_media_access_request_element()
          -> ssd_activate_element()
          -> ssd_invoke_element_cleaning()
          -> ssd_compute_access_time(currdisk, elem_num, read_reqs, read_total);
          -> add completion into the global event queue
          -> ssd_compute_access_time(currdisk, elem_num, write_reqs, write_total);
          -> add completion into the global event queue
      Parallel processing with sequential completion is achieved by processing a batch of requests in parallel while generating the ACCESS_COMPLETE events sequentially.
  26. References
      Disksim: http://www.pdl.cmu.edu/DiskSim/
      Disksim Manual: http://www.pdl.cmu.edu/PDL-FTP/DriveChar/CMU-PDL-08-101.pdf
      Disksim implementation doc: src/doc/Outline.txt
      SSD Extension: http://research.microsoft.com/en-us/downloads/b41019e2-1d2b-44d8-b512-ba35ab814cd4/
      SSD Extension paper: Design Tradeoffs for SSD Performance, N. Agrawal, 2008
      Cache over SSD project: Group 6 on http://www-users.cselabs.umn.edu/classes/Spring-2009/csci8980-ass/
  27. Thanks. Q&A?
  28. Block striping
      // blocks can be concatenated (chained) from each plane
      //
      // plane 0     plane 1     plane 2     plane 3
      // ------------------------------------------
      // blk 0       blk 2048    blk 4096    blk 6144
      // blk 1       blk 2049    blk 4097    blk 6145
      // ...         ...
      // blk 2047    blk 4095    blk 6143    blk 8191
      //
      // blocks can be striped across all the planes
      //
      // plane 0     plane 1     plane 2     plane 3
      // ------------------------------------------
      // blk 0       blk 1       blk 2       blk 3
      // blk 4       blk 5       blk 6       blk 7
      // ...         ...
      // blk 8188    blk 8189    blk 8190    blk 8191
      //
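
The two layouts in the diagram reduce to a division and a modulo. The sketch below takes the plane count (4) and blocks per plane (2048) from the diagram; the function names are hypothetical and not taken from the ssdmodel source.

```c
#include <assert.h>

enum { PLANES = 4, BLOCKS_PER_PLANE = 2048 };   /* values from the diagram */

/* Concatenated layout: each plane holds one contiguous run of blocks. */
static int plane_concatenated(int blk)
{
    assert(blk >= 0 && blk < PLANES * BLOCKS_PER_PLANE);
    return blk / BLOCKS_PER_PLANE;
}

/* Striped layout: consecutive blocks rotate across the planes. */
static int plane_striped(int blk)
{
    assert(blk >= 0 && blk < PLANES * BLOCKS_PER_PLANE);
    return blk % PLANES;
}
```

Block 4096 sits on plane 2 when concatenated but on plane 0 when striped, so striping spreads a run of consecutive blocks over all four planes.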