
OpenStack Meetup Oct 2018: Migrating 8.3PiB of Ceph

These are the slides from the OpenStack Toronto October meetup. We presented how we migrated 8.3PiB of data from Filestore to Bluestore.

  1. Ontario Institute for Cancer Research: Migrating 8.3PiB of Ceph from Filestore to Bluestore (October 23rd, 2018)
  2. Why move to Bluestore?
     ● Supportability
     ● Lower latency
     ● Higher throughput
     Read more @ https://ceph.com/community/new-luminous-bluestore/
  3. How?
  4. 100% AI
  5. (image slide, no text content)
  6. 1. Add Luminous repository
     2. apt-get install ceph
     Done!
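
A minimal sketch of what those two steps might look like on an Ubuntu storage node, assuming the upstream download.ceph.com Debian repository is used (the mirror URL and release codename are environment-specific assumptions, not taken from the deck):

     # Add the Luminous repository and its signing key (assumed upstream mirror)
     wget -q -O- https://download.ceph.com/keys/release.asc | sudo apt-key add -
     echo "deb https://download.ceph.com/debian-luminous/ $(lsb_release -sc) main" \
       | sudo tee /etc/apt/sources.list.d/ceph.list

     # Pull in the Luminous packages
     sudo apt-get update && sudo apt-get install ceph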
  7. (image slide, no text content)
  8. Migration process for each storage node
     ● Drain: drain data from all OSDs on the desired storage node. Find the numerical range of OSDs (684 to 719) and change their osd crush weight to 0.
     ● Convert: convert the OSDs on the desired storage node from Filestore to Bluestore (more detail in the next few slides).
     ● Fill: refill the OSDs on the desired storage node. Using the same range of OSDs from the Drain step, change the osd crush weight to the appropriate disk size.
  9. Draining
     for i in $(seq 648 683); do ceph osd crush reweight osd.$i 0; done
     ● for loop to drain a server's worth of OSDs
     ● ~24 hours per server
     ● 1-2 servers draining at a time
     ● Multi-rack draining
     ● Wait for 'ceph health' to report HEALTH_OK
     ● Tunables: osd recovery max active 3 -> 4, osd max backfills 1 -> 16
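
The deck lists the changed tunables but not how they were applied; a minimal sketch, assuming the values are injected into the running OSDs with 'ceph tell' and that each drain is considered finished once 'ceph health' reports HEALTH_OK (the poll interval is illustrative):

     # Raise recovery/backfill limits on all running OSDs for the drain
     # (injected values do not persist across OSD restarts)
     ceph tell osd.* injectargs '--osd-recovery-max-active 4 --osd-max-backfills 16'

     # After reweighting the node's OSDs to 0, wait for the cluster to settle
     until ceph health | grep -q HEALTH_OK; do sleep 300; done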
  10. Draining: 144TB server case study
      ● Majority drained in 3 hours
      ● Long tail of 28 hours to complete
  11. Draining: 360TB server case study
      ● Steady drain for 13 hours
  12. Converting to Bluestore
      Migrate-bluestore script @ https://github.com/CancerCollaboratory/infrastructure
      1. Stop the OSD process (systemctl stop ceph-osd@501.service)
      2. Unmount the OSD (umount /dev/sdr1)
      3. Zap the disk (ceph-disk zap /dev/sdr)
      4. Mark the OSD as destroyed (ceph osd destroy 501 --yes-i-really-mean-it)
      5. Prepare the disk as Bluestore (ceph-disk prepare --bluestore /dev/sdr --osd-id 501)
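
The migrate-bluestore script in the linked repository is the authoritative version; below is only a simplified sketch of the per-OSD loop those five steps imply, with a hypothetical OSD-id-to-device mapping that would have to match the actual node layout:

      # Hypothetical mapping of OSD ids to their backing devices on this node
      declare -A osd_dev=( [501]=/dev/sdr [502]=/dev/sds )

      for id in "${!osd_dev[@]}"; do
        dev=${osd_dev[$id]}
        systemctl stop "ceph-osd@${id}.service"              # 1. stop the OSD process
        umount "${dev}1"                                     # 2. unmount the Filestore partition
        ceph-disk zap "$dev"                                 # 3. wipe the disk
        ceph osd destroy "$id" --yes-i-really-mean-it        # 4. mark the OSD as destroyed
        ceph-disk prepare --bluestore "$dev" --osd-id "$id"  # 5. re-create it as Bluestore
      done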
  13. Filling
      for i in $(seq 648 683); do ceph osd crush reweight osd.$i 3.640; done
      ● for loop to fill a server's worth of OSDs
      ● ~24 hours per server
      ● 1-2 servers filling at a time
      ● Multi-rack filling
      ● Wait for 'ceph health' to report HEALTH_OK
      ● Monitoring caveat (see slide 16)
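
The 3.640 weight corresponds to the roughly 3.64TiB of raw space on a 4TB drive; as a hedged alternative to hard-coding it, the weight could be derived from the device size (the device path here is illustrative):

      # CRUSH weight ~= device size in TiB
      dev=/dev/sdr
      weight=$(echo "scale=3; $(blockdev --getsize64 "$dev") / 1024^4" | bc)
      ceph osd crush reweight osd.501 "$weight"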
  14. Filling: 144TB server case study
  15. Filling: 360TB server case study
  16. Filling: monitoring caveat
      ● Zabbix graphs built from zabbix-agent XFS disk usage
      ● Grafana w/ Graphite and ceph-mgr
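
The caveat is presumably that Bluestore OSDs no longer expose an XFS mount for zabbix-agent to sample; one way to read utilization directly from Ceph instead (standard Luminous CLI commands, not taken from the deck):

      ceph osd df tree   # per-OSD size, utilization and PG count, grouped by CRUSH tree
      ceph df            # cluster- and pool-level usage summary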
  17. Tracking & Monitoring of progress
  18. How long did it take?
      ● Started at the end of July, finished in early September
      ● +480TB of data uploaded during this time by researchers
      ● +1PB of capacity added during migration (new nodes)
      ● 188TB of data served from the object store
  19. Performance impact during migration
  20. Issues
      ● Increased number of drive failures
        ○ 4 failures within a week at the end of the migration
      ● Ceph monmap growing to ~15GB
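
A hedged note on the second issue, assuming the ~15GB figure refers to the monitors' store.db (old OSD maps are not trimmed while PGs are still backfilling, so the store grows during long migrations); once the cluster is back to HEALTH_OK the store can be compacted:

      # Compact a monitor's store on demand (assumes monitor ids match short hostnames)
      ceph tell mon.$(hostname -s) compact

      # Or compact automatically at every monitor restart (ceph.conf, [mon] section)
      # mon compact on start = true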
  21. 21. Funding for the Ontario Institute for Cancer Research is provided by the Government of Ontario
