20071215 fcache

Transcript

  • 1. An Introduction to fcache
    Huang Pu, Hu Ziming
    Beijing University of Posts and Telecommunications
    December 15, 2007
    Outline: What is fcache; Principle and Flow; What we have done; Problems during porting; Benchmark
  • 2. Who Are We
    Huang Pu: School of Telecommunication Engineering, BUPT; graduating April 2008 (expected); kenhp1982@gmail.com
    Hu Ziming: School of Information Engineering, BUPT; graduating April 2008 (expected); hzmangel@gmail.com
  • 3. What is fcache
    A block remapping cache.
    It sits between the file system and the block device.
    A separate cache partition is needed.
  • 6. Principle of fcache
    Data accessed during boot is not laid out in linear order on the disk.
    The hard disk therefore has to seek more during boot.
    These slow mechanical operations contribute to slow boot times.
    If the data were laid out on the disk in the order it is accessed, fewer seeks would be needed and the system would boot faster.
    fcache makes this possible.
  • 7. Flow of fcache
    fcache has two working modes: priming mode and cache mode.
  • 9. Flow of fcache: priming
    In priming mode, fcache captures all read operations during boot.
    It mirrors the data to the cache partition while responding to the read operations.
    Data is mirrored into the cache partition in the same order in which it was accessed during boot.
    The following slides show the details.
  • 11. Priming: at the beginning
    xchg(&q->make_request_fn, fcache_make_request)
    From this point on, fcache captures all read requests in the request queue.
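The hook described on this slide amounts to swapping a function pointer and keeping the old value so requests can still be passed through. The following is a minimal userspace sketch of that idea, not the fcache code: the struct definitions, `xchg_fn` (a plain, non-atomic stand-in for the kernel's `xchg`), `fcache_hook`, `saved_fn` and `captured_reads` are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct bio { int is_read; };
struct request_queue;
typedef int (*make_request_fn)(struct request_queue *q, struct bio *bio);
struct request_queue { make_request_fn make_request_fn; };

static make_request_fn saved_fn;  /* original handler, kept for pass-through */
static int captured_reads;        /* how many reads the hook has seen */

/* Non-atomic stand-in for xchg(): store new_fn, return the previous value. */
static make_request_fn xchg_fn(make_request_fn *slot, make_request_fn new_fn)
{
    make_request_fn old = *slot;
    *slot = new_fn;
    return old;
}

/* Placeholder for the queue's original handler. */
static int real_make_request(struct request_queue *q, struct bio *bio)
{
    (void)q;
    (void)bio;
    return 0;
}

/* Replacement handler: notes read requests, then forwards everything. */
static int fcache_make_request(struct request_queue *q, struct bio *bio)
{
    assert(saved_fn != NULL);
    if (bio->is_read)
        captured_reads++;
    return saved_fn(q, bio);
}

/* Install the hook and remember the handler it replaced. */
static void fcache_hook(struct request_queue *q)
{
    saved_fn = xchg_fn(&q->make_request_fn, fcache_make_request);
}
```

After `fcache_hook(&q)`, every request submitted through `q.make_request_fn` reaches `fcache_make_request` first, which is the property priming mode relies on.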
  • 15. Priming: read request
    1. Clone the original bio and add the cloned data to cache_device.
    2. Submit the cloned bio to the real device.
    3. Wait until the bio has been completely cloned.
    4. Generate an extent and add it to the prio-tree.
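The bookkeeping behind these steps can be sketched as follows, assuming the cache is filled strictly front to back so that cache order equals access order. This is an illustrative userspace model only: `struct extent`, `struct prime_state`, `prime_record_read` and `MAX_EXTENTS` are hypothetical names, and a flat array stands in for the prio-tree.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned long long sector_t;

/* Hypothetical, simplified mapping record. */
struct extent {
    sector_t fs_sector;     /* where the data lives on the real device */
    sector_t cache_sector;  /* where its copy lives on the cache device */
    unsigned int sectors;   /* length of the run */
};

#define MAX_EXTENTS 16

struct prime_state {
    struct extent extents[MAX_EXTENTS];
    size_t nr_extents;
    sector_t next_cache_sector;  /* next free sector on the cache device */
};

/*
 * Record one completed read: the data is appended to the cache, and an
 * extent remembers where it came from. Returns 0 on success, -1 if full.
 */
static int prime_record_read(struct prime_state *st, sector_t fs_sector,
                             unsigned int sectors)
{
    assert(st != NULL && sectors > 0);
    if (st->nr_extents == MAX_EXTENTS)
        return -1;

    struct extent *e = &st->extents[st->nr_extents++];
    e->fs_sector = fs_sector;
    e->cache_sector = st->next_cache_sector;
    e->sectors = sectors;
    st->next_cache_sector += sectors;
    return 0;
}
```

Two reads at scattered device sectors (say 9000 and then 120) end up adjacent on the cache device, which is what removes the seeks on later boots.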
  • 17. Priming: read finished
    When all read requests have completed, the script remounts the target partition.
    fcache then regenerates the prio-tree from the extents,
    writes the extents to the cache device,
    and rewrites the header of the cache device.
  • 18. Flow of fcache: cache
    In cache mode, fcache tries to fetch blocks from the cache partition and serves the read request from the cache partition if it can.
  • 21. Cache: at the beginning
    Read the header from the cache device to get basic information such as version and serial.
    Read all extents and build the prio-tree; the value stored in the tree is the offset into the cache.
    xchg is used again, to replace the handler for every normal read request.
  • 24. Cache: read request
    Look up the extents in the prio-tree.
    If nothing matches, send the request to the real device.
    If an extent matches, the requested data is held in the cache partition, so submit the read request to the cache device.
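The hit-or-miss decision above can be illustrated with a small lookup routine. This sketch deliberately swaps the prio-tree for a linear scan over an array, and it treats a request that only partly overlaps an extent as a miss; both choices, and the names `struct extent` and `cache_lookup`, are assumptions for illustration rather than fcache's actual behaviour.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned long long sector_t;

/* Hypothetical, simplified mapping record. */
struct extent {
    sector_t fs_sector;     /* start on the real device */
    sector_t cache_sector;  /* start on the cache device */
    unsigned int sectors;   /* length of the run */
};

/*
 * Returns 1 and sets *cache_sector when [fs_sector, fs_sector + sectors)
 * lies fully inside one extent. Returns 0 on a miss, in which case the
 * caller would send the request to the real device.
 */
static int cache_lookup(const struct extent *ext, size_t n,
                        sector_t fs_sector, unsigned int sectors,
                        sector_t *cache_sector)
{
    assert(cache_sector != NULL);
    for (size_t i = 0; i < n; i++) {
        sector_t start = ext[i].fs_sector;
        sector_t end = start + ext[i].sectors;

        if (fs_sector >= start && fs_sector + sectors <= end) {
            *cache_sector = ext[i].cache_sector + (fs_sector - start);
            return 1;
        }
    }
    return 0;
}
```

A linear scan costs O(n) per request; an interval structure such as the prio-tree named on the slide is the natural choice once the number of extents grows.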
  • 25. Data structure of header I
    A cache partition has exactly one header.
    It occupies one block and is stored at the beginning of the partition.
  • 26. Data structure of header II
    Header fields: magic, version, nr extent, max extent, serial.
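As a rough picture of such a header, the five fields could be modelled as below. The slide gives only the field names, so the 32-bit widths, the field order, the two placeholder constants and the `header_valid` check are all assumptions, not the on-disk format used by fcache.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder values for illustration; not the real magic or version. */
#define SKETCH_MAGIC   0x46434143u
#define SKETCH_VERSION 1u

/* Hypothetical layout of the single header block. */
struct fcache_header_sketch {
    uint32_t magic;        /* identifies the partition as a cache */
    uint32_t version;      /* format version */
    uint32_t nr_extents;   /* extents currently stored */
    uint32_t max_extents;  /* capacity of the extent area */
    uint32_t serial;       /* serial number read back in cache mode */
};

/* Basic sanity check a reader might do before trusting the extents. */
static int header_valid(const struct fcache_header_sketch *h)
{
    assert(h != NULL);
    return h->magic == SKETCH_MAGIC &&
           h->version == SKETCH_VERSION &&
           h->nr_extents <= h->max_extents;
}
```

Checking the header before reading any extents keeps a stale or foreign partition from being used as a cache.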
  • 28. Data structure of extent I
    The extent blocks come after the header block.
    The extents are stored in several blocks.
    Each extent gives the offset of the cached data and the mapping between the cached data and the original data.
  • 29. Data structure of extent II
    Extent fields: fs sector (real device offset), fs size (extent length), cache sector (cache device offset), prio node.
  • 30. Data structure of combo I
    The data blocks come after the extent blocks and fill the remaining blocks.
  • 31. Data structure of combo II
    [Figure: layout of the cache partition: the header (1 block), then the extents 1, 2, 3, 4, 5, ... (some blocks), then the data.]
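The layout in this figure (one header block, then extent blocks, then data) can be turned into block numbers with a little arithmetic. The sketch below assumes a 4096-byte block and that extent records never straddle a block boundary; `SKETCH_BLOCK_SIZE`, `struct layout` and `fcache_layout` are hypothetical names, and the record size is passed in because the slides do not state it.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_BLOCK_SIZE 4096u  /* assumed block size */

struct layout {
    uint32_t header_block;        /* always block 0 */
    uint32_t first_extent_block;  /* directly after the header */
    uint32_t first_data_block;    /* directly after the extent area */
};

/* Compute where each region starts for a given extent capacity. */
static struct layout fcache_layout(uint32_t max_extents, uint32_t extent_size)
{
    assert(extent_size > 0 && extent_size <= SKETCH_BLOCK_SIZE);

    uint32_t per_block = SKETCH_BLOCK_SIZE / extent_size;
    /* Round up: a partly filled extent block still occupies a block. */
    uint32_t extent_blocks = (max_extents + per_block - 1) / per_block;

    struct layout l;
    l.header_block = 0;
    l.first_extent_block = 1;
    l.first_data_block = 1 + extent_blocks;
    return l;
}
```

With 24-byte records, 170 extents fit in one block, so a capacity of 1000 extents needs 6 extent blocks and the data region starts at block 7.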
  • 32. Normal, without fcache
    [Figure: the OS and the blocks on the disk, with an arrow labeled "data read".]
  • 33. fcache in Priming Mode
    [Figure: the OS, fcache, the cache blocks and the disk blocks, with arrows labeled "data read", "copy" and "read data".]
  • 34. fcache in Cache Mode
    [Figure: the OS, fcache, the cache blocks and the disk blocks, with arrows labeled "data read", "read data" and "re-fetch".]
  • 36. What we have done
    Made the patch work on the ext4 file system.
    Ported it to the newest kernel, 2.6.24-rc5.
  • 37. Modified interface: INIT_WORK
    The interface has changed after 2.6.20.
    It takes 2 arguments instead of 3.
    The macro container_of is used to get back to the enclosing structure.
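The two-argument pattern works because the handler receives a pointer to the embedded work item and recovers its owner with container_of. The following is a userspace sketch of that pattern: the `container_of`, `INIT_WORK` and `struct work_struct` definitions here are simplified stand-ins for the kernel's, and `struct fcache_dev_sketch` and `fcache_work_fn` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins; the kernel provides its own versions. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct work_struct;
typedef void (*work_func_t)(struct work_struct *work);
struct work_struct { work_func_t func; };

/* Two-argument form: no separate data pointer is passed any more. */
#define INIT_WORK(w, f) ((w)->func = (f))

/* Hypothetical driver state that embeds its work item. */
struct fcache_dev_sketch {
    int completed;
    struct work_struct work;
};

/* The handler only gets the work item and walks back to its owner. */
static void fcache_work_fn(struct work_struct *work)
{
    assert(work != NULL);
    struct fcache_dev_sketch *dev =
        container_of(work, struct fcache_dev_sketch, work);
    dev->completed++;
}
```

Code written for the older three-argument form passed its context as the third argument, so porting it means embedding the work item in the context structure and recovering the context inside the handler as shown.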
  • 38. Modified data type: request_queue_t
    The type is now undefined; change it to struct request_queue.
    Trivial, but it should be handled carefully.
  • 42. Boot with ext4 filesystem
    Download kernel 2.6.24-rc5 and make sure the ext4dev options are enabled.
    We add the -t ext4dev option to the mount command to indicate the partition type, but the mount type cannot be changed with the remount option.
    The root file system has to be mounted at the very beginning, so by the time the scripts in the init directory run it has already been mounted, which means -t ext4dev -o remount has no effect.
    So we need to use the kernel option rootfstype=ext4dev.
  • 45. Fix a Bug I
    max_extent is the number of blocks that can be cached.
    It is calculated as:
    max_extent = (cache_blocks - 1) * PAGE_SIZE / (PAGE_SIZE - sizeof(fcache_extent))
    We think the denominator should use + instead of -; otherwise a division by zero occurs if the size of fcache_extent is one page.
  • 47. Fix a Bug II
    So we think the calculation should be:
    max_extent = (cache_blocks - 1) * PAGE_SIZE / (PAGE_SIZE + sizeof(fcache_extent))
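One way to read the proposed fix: if every cached block costs one page of data plus one extent record, then N blocks need N * (PAGE_SIZE + sizeof(fcache_extent)) bytes out of the (cache_blocks - 1) * PAGE_SIZE bytes left after the header. The sketch below compares both formulas under that reading; the function names and the sample sizes are assumptions, with sizes passed as parameters instead of using the kernel's PAGE_SIZE.

```c
#include <assert.h>
#include <stddef.h>

/* Formula as quoted on the slides: divides by (page_size - extent_size). */
static size_t max_extent_original(size_t cache_blocks, size_t page_size,
                                  size_t extent_size)
{
    /* The divisor is zero when extent_size == page_size. */
    assert(cache_blocks >= 1 && extent_size < page_size);
    return (cache_blocks - 1) * page_size / (page_size - extent_size);
}

/* Proposed fix: divides by (page_size + extent_size). */
static size_t max_extent_fixed(size_t cache_blocks, size_t page_size,
                               size_t extent_size)
{
    assert(cache_blocks >= 1 && page_size > 0);
    return (cache_blocks - 1) * page_size / (page_size + extent_size);
}
```

For a 1025-block partition with 4096-byte pages and a hypothetical 32-byte extent record, the original formula gives 1032, which is more than the 1024 blocks left after the header, while the fixed formula gives 1016.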
  • 51. Method for Benchmark
    1. Set fcache to priming mode.
    2. Wait for the start-up procedure to finish; this takes more time than before.
    3. Set fcache to cache mode, then reboot.
    4. Then you can feel the speed.
  • 53. Information of Benchmark Data
    The time was measured during boot, and each measurement has three parts:
    the time from grub to the login screen,
    the time from login until login has finished,
    and the total time.
  • 54. Benchmark Data for ext3
    Mode     grub to login  login finish  total
    Normal   48s            38s           86s
    Priming  50s            52s           102s
    Cache    46s            29s           75s
  • 55. Benchmark Data for ext4 without extent
    Mode     grub to login  login finish  total
    Normal   39.4s          26.4s         65.8s
    Priming  44.3s          31.0s         75.3s
    Cache    33.0s          19.7s         52.7s
  • 56. Benchmark Data for ext4 with extent (2.6.24-rc5)
    Mode     grub to login  login finish  total
    Normal   37s            34s           71s
    Priming  43s            52s           97s
    Cache    32s            27s           59s
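To compare the three tables on one scale, the total boot time saved in cache mode can be expressed as a percentage of the normal total. The helper below is only a convenience for that arithmetic; `percent_saved` is a hypothetical name and the inputs are the totals from the tables above.

```c
#include <assert.h>

/* Percentage of the normal boot time saved when booting in cache mode. */
static double percent_saved(double normal_s, double cache_s)
{
    assert(normal_s > 0.0);
    return (normal_s - cache_s) / normal_s * 100.0;
}
```

Applied to the totals, this gives roughly 12.8% for ext3 (86s to 75s), 19.9% for ext4 without extent (65.8s to 52.7s) and 16.9% for ext4 with extent (71s to 59s).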
  • 57. Q and A
  • 58. Thank you all