Ceph BlueStore: a new storage backend in Ceph / Maxim Vorontsov (Redsys)


- What SDS is (features common to (almost) all solutions: scaling, abstraction from hardware resources, policy-driven management, clustered file systems);
- Why we decided to use SDS (we needed an object store);
- Why we chose Ceph rather than other open-source (GlusterFS, Swift...) or proprietary (IBM Elastic Storage, Huawei OceanStor) solutions;
- What else Ceph can do besides object storage (RBD, CephFS);
- How Ceph works (from the server side);
- What BlueStore adds compared to the classic store (on top of a file system);
- Performance comparison (test metrics);
- BlueStore is still a tech preview;
- Conclusion. Links and references.

  1. Ceph new store: BlueStore Maxim Vorontsov
  2. About me ● Chief engineer for computing systems ● 8 years working with Linux ● WAS/DB2/MQ and all that ● Lots of different projects
  3. About RedSys ● Business integrator ● More than 20 years in business ● Offices in MOW, LED, OVB, GOJ, ROV, KHV ● RED = Responsibility + Efficiency + Development ● Industries: energy, defense, government, telecom, etc.
  4. Customers
  5. TOC ● Before Ceph ● Ceph first advent ● Ceph temptations ● BlueStore prophecy ● Ceph FileStore vs BlueStore ● Let's fight ● Results ● Awaiting Ceph second advent
  6. Software Defined Storage ● Unlimited scalability ● Storage virtualization ● Policy-driven administration ● API services ● Support for block, file and object data types
  7. IBM definition «SDS in today's business context refers to IT storage that goes beyond typical array interfaces (for example, command line and graphic user) to operate within a higher architectural construct.»
  8. Examples ● AWS S3 ● EMC ScaleIO ● Ceph ● GlusterFS ● Huawei FusionStorage ● IBM ElasticStorage ● NexentaStor
  9. Issue ● DB2 on z/OS
  10. Issue ● DB2 on z/OS ● XML in DB2
  11. Issue ● DB2 on z/OS ● XML in DB2 ● Signed XML in DB2 (no way)
  12. Issue ● DB2 on z/OS ● XML in DB2 ● Signed XML in DB2 (no way) ● You really shouldn't store blobs in a relational store
  13. To find a way ● More money to IBM?
  14. To find a way ● More money to IBM? ● More money to someone else?
  15. To find a way ● More money to IBM? ● More money to someone else? ● Something else?
  16. Which one? ● AWS S3 ● Ceph ● IBM ElasticStorage ● Huawei OceanStor ● Swift
  17. Why this one?
  18. Standing on the shoulders of giants ● CERN ● Cisco ● Deutsche Telekom ● Yahoo ● Cloudmouse.ru ● ...
  19. Preborn 7 guests in VMware: ● 1 MON ● 3 OSD ● 1 ActiveMQ ● 1 Tomcat ● 1 ElasticStorage
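
A minimal lab like this (one monitor, three OSDs) can be sanity-checked with the standard Ceph CLI; a sketch not taken from the slides, with the exact output shape varying by release:

    $ ceph -s          # overall health plus monitor and OSD maps
    $ ceph osd tree    # confirm all three OSDs are up and in
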
  20. Long story short Long long story about...
  21. Long story short Long long story about… What is English for «импортозамещение»?
  22. Long story short Long long story about… What is English for «импортозамещение»? Catch up and overtake z/OS
  23. Long story short Long long story about… What is English for «импортозамещение»? Catch up and overtake z/OS What is Russian for LTFS?
  24. Long story short Long long story about… What is English for «импортозамещение»? Catch up and overtake z/OS What is Russian for LTFS? What is Russian for WORM?
  25. BlueStore prophecy Ceph Jewel Preview: a new store is coming, BlueStore
  26. Ceph scheme
  27. OSD scheme
  28. FileStore scheme
  29. BlueStore scheme
  30. BlueStore advanced scheme
  31. Mount directory structure $ ls -R /var/lib/ceph/osd/ceph-0 | wc -l
  32. Mount directory structure $ ls -R /var/lib/ceph/osd/ceph-0 | wc -l → FileStore: 18656, BlueStore: 16
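
The 16 entries left by BlueStore are small metadata files plus a symlink to the raw block device, since object data no longer lives in a file system; a hedged illustration of what such a listing typically contains (exact file names vary by release):

    $ ls /var/lib/ceph/osd/ceph-0
    block  ceph_fsid  fsid  keyring  magic  ready  type  whoami  ...
    $ cat /var/lib/ceph/osd/ceph-0/type
    bluestore
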
  33. HW test $ sudo dd bs=1G count=1 oflag=direct if=/dev/zero of=zerofile 1+0 records in 1+0 records out 1073741824 bytes (1,1 GB) copied, 10,275 s, 105 MB/s
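
The slide benchmarks raw sequential writes with direct I/O; the matching read-side check (not in the deck) would read the same file back, again bypassing the page cache:

    $ sudo dd bs=1G count=1 iflag=direct if=zerofile of=/dev/null
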
  34. HW test $ iperf3 -c osd00 - - - - - - - - - - - - - - - - - - - - - - - - - [ ID] Interval Transfer Bandwidth Retr [ 4] 0.00-10.00 sec 7.40 GBytes 6.35 Gbits/sec 3278 sender [ 4] 0.00-10.00 sec 7.39 GBytes 6.35 Gbits/sec receiver $ iperf3 -c osd00-ci - - - - - - - - - - - - - - - - - - - - - - - - - [ ID] Interval Transfer Bandwidth Retr [ 4] 0.00-10.00 sec 15.5 GBytes 13.3 Gbits/sec 64 sender [ 4] 0.00-10.00 sec 15.5 GBytes 13.3 Gbits/sec receiver
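
The two hostnames (osd00 vs osd00-ci) suggest separate public and cluster interconnect networks, which Ceph configures in ceph.conf; a minimal sketch, where the subnets are assumptions:

    [global]
    public network = 10.0.0.0/24      # client and monitor traffic (assumed subnet)
    cluster network = 10.10.0.0/24    # OSD replication and recovery traffic (assumed subnet)

Benchmarking both links separately matters because OSD replication would ride the faster 13 Gbit/s interface while clients share the 6 Gbit/s one.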
  35. Ceph tests $ ceph osd pool create radosbench 64 $ rados bench -p radosbench 300 write --no-cleanup $ rados bench -p radosbench 300 seq $ rados bench -p radosbench 300 rand $ rbd create fio_test --size 10G $ fio rbd.fio
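
The rbd.fio job file itself is not reproduced in the deck; a minimal sketch of what it could look like with fio's built-in rbd ioengine, targeting the fio_test image created above (the Ceph user, pool, and load parameters are assumptions):

    [global]
    ioengine=rbd          # drive the RBD image directly via librbd
    clientname=admin      # Ceph user (assumed)
    pool=rbd              # pool holding the image (assumed)
    rbdname=fio_test      # image created on the slide
    rw=randwrite
    bs=4k

    [rbd_iodepth32]
    iodepth=32
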
  36. Results
  37. Results
  38. Not so fast $ ceph-disk prepare --bluestore /dev/sdd /dev/sdb $ ls /dev/disk/by-partlabel/ -l osd-device-2-block -> ../../sdb2 osd-device-2-data -> ../../sdd1
  39. Not so fast $ ceph-disk prepare --bluestore /dev/sdd /dev/sdb $ ls /dev/disk/by-partlabel/ -l ceph%20data -> ../../sdb1 ceph%20block -> ../../sdb2
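
Before activating, it is worth double-checking where the partitions actually landed; a hedged check using the tooling of that era:

    $ sudo ceph-disk list                              # what ceph-disk considers data/block
    $ lsblk -o NAME,PARTLABEL,SIZE /dev/sdb /dev/sdd   # raw GPT partition labels
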
  40. Not so fast Here be dragons Tech preview CPU regression on too fast disks ;-) Did you do a backup today?
  41. How to reach me Mail + Hangouts: 6012030@gmail.com mail: maxim.vorontsov@redsys.ru http://redsys.ru
  42. ● https://www.redbooks.ibm.com/abstracts/redp5121.html ● http://www.sersc.org/journals/IJMUE/vol10_no11_2015/27.pdf ● http://www.ssrc.ucsc.edu/Papers/weil-sc06.pdf ● https://ceph.com ● https://www.sebastien-han.fr/blog/ ● https://cds.cern.ch/record/2015206/files/CephScaleTestMarch2015.pdf ● http://rocksdb.org
