
Ceph for Storing MeerKAT Radio Telescope Data


  1. Ceph for Storing MeerKAT Radio Telescope Data
     Thomas Bennett, Ceph Day London 2019
  2. Who are we?
     An international project to build the world’s largest telescope:
     • 13 countries are at the core of the SKA
     • 100 organisations across about 20 countries have been participating in the design and development of the SKA
     The South African Radio Astronomy Observatory (SARAO) is responsible for managing all radio astronomy initiatives and facilities in South Africa, including:
     • MeerKAT
     • African VLBI Network
     • HERA
     • Infrastructure and engineering planning for the SKA
     “With 40% of the world’s population involved, the SKA Project is truly a global enterprise.”
     https://www.skatelescope.org/participating-countries/
  3. Who am I?
     “Be yourself; everyone else is already taken.” - Oscar Wilde
  4. WTF?
     “The most beautiful thing we can experience is the mysterious. It is the source of all true art and science.” - Albert Einstein
  5. “The most beautiful thing we can experience is the mysterious. It is the source of all true art and science.” - Albert Einstein
     WTF
  6. SARAO Ceph Clusters
     Cluster        Version   # OSDs   Application          Total Capacity
     MeerKAT-C1     12.2.8    563      RADOS-GW, librados   1.39 PiB
     SeeKAT-C1      12.2.2    785      RADOS-GW, librados   5.52 PiB
     SeeKAT-C3      12.2.12   1860     RADOS-GW, librados   13.0 PiB
     SwartKAT-C1    14.2.2    78       CephFS, RBD          0.18 PiB
     CAM-A/B/C/C1   12.2.5    9        CephFS, RBD          8.0 mPiB (each)
  7. SARAO Data Movement
     [Diagram] Data flows ~900 km between the MKAT C1 and SKAT C3 clusters: a 2 PB disk data buffer and a 480 TB S3 hot buffer (policy: SSD) feed, via FILE-to-S3 data movers and an S3-to-S3 data mover, into a 15 PB S3 cold buffer (policy: HDD). An illustrative S3-to-S3 mover sketch follows below.
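The slide only names the data movers; below is a minimal, hypothetical sketch of an S3-to-S3 mover in Python with boto3. The endpoint URLs, credentials, bucket name and the move_bucket helper are placeholders for illustration, not SARAO's actual configuration or code.

```python
# Hypothetical S3-to-S3 mover: stream each object from a "hot" RADOS GW endpoint
# to a "cold" one. Endpoints, credentials and bucket name are placeholders.
import boto3

hot = boto3.client("s3", endpoint_url="http://hot-buffer.example:7480",
                   aws_access_key_id="HOT_KEY", aws_secret_access_key="HOT_SECRET")
cold = boto3.client("s3", endpoint_url="http://cold-buffer.example:7480",
                    aws_access_key_id="COLD_KEY", aws_secret_access_key="COLD_SECRET")

def move_bucket(bucket):
    """Copy every object in `bucket` from the hot endpoint to the cold endpoint."""
    paginator = hot.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket):
        for obj in page.get("Contents", []):
            body = hot.get_object(Bucket=bucket, Key=obj["Key"])["Body"]
            # upload_fileobj streams the body and switches to multipart uploads
            # for large objects, which suits multi-GB visibility files
            cold.upload_fileobj(body, bucket, obj["Key"])

# move_bucket("observation-data")
```

In practice rclone (mentioned on slide 9) covers the same S3-to-S3 case; the sketch just makes the data path concrete.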
  8. SARAO User Data Access
     [Diagram] Users open an observation with katdal.open(), passing an HTTPS link plus a JWT; behind that link the observation data is stored as 1 – 100 million objects under S3://OBSERVATION DATA. (A usage sketch follows below.)
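The katdal.open() call on the slide is the whole user-facing surface; the sketch below shows how it is typically used, assuming the public katdal Python package. The URL and token are placeholders for the HTTPS link + JWT issued by the SARAO archive.

```python
# Minimal sketch of the katdal access path shown above. The URL and <JWT> token are
# placeholders; the real link is issued by the SARAO archive per observation.
import katdal

url = "https://archive.example.ac.za/1234567890/1234567890_sdp_l0.rdb?token=<JWT>"

d = katdal.open(url)       # fetches metadata, lazily maps the underlying S3 objects
print(d)                   # human-readable summary of the observation
d.select(scans="track")    # narrow the selection without reading visibility data
vis = d.vis[0]             # data for the first dump is pulled from S3 on demand
```

Because the metadata is read first, selection is cheap; only indexing into d.vis (or d.weights / d.flags) triggers reads against the millions of underlying objects.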
  9. SARAO Ceph Activities
     • https://github.com/sratcliffe/cephmetrics
     • Ceph deployment: Proxmox and DIY
     • Benchmarking of the clusters to tune RADOS GW parameters
     • Data migration from SeeKAT-C1 to SeeKAT-C3:
       • rclone sync of S3 data between clusters
       • Python librados implementation to sync non-S3 client pools (see the sketch after this slide)
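The librados pool sync is only named on the slide; this is a minimal sketch of such a sync using the standard rados Python bindings. The conf file paths and pool name are placeholders, xattrs and omap entries are ignored, and large objects would need chunked reads rather than a single read().

```python
# Minimal librados pool-to-pool sync sketch using the "rados" Python bindings.
# Conf paths and pool name are placeholders; xattrs/omap are not copied and large
# objects would need chunked reads in a real migration tool.
import rados

def sync_pool(src_conf, dst_conf, pool):
    src = rados.Rados(conffile=src_conf)
    dst = rados.Rados(conffile=dst_conf)
    src.connect()
    dst.connect()
    try:
        src_io = src.open_ioctx(pool)
        dst_io = dst.open_ioctx(pool)
        try:
            for obj in src_io.list_objects():
                data = src_io.read(obj.key, length=64 * 1024 * 1024)
                dst_io.write_full(obj.key, data)
        finally:
            src_io.close()
            dst_io.close()
    finally:
        src.shutdown()
        dst.shutdown()

# sync_pool("/etc/ceph/seekat-c1.conf", "/etc/ceph/seekat-c3.conf", "some-client-pool")
```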
  10. SARAO RADOS GW ceph.conf
      • rgw_dynamic_resharding = False
      • rgw_get_obj_max_req_size = 20971520
      • rgw_get_obj_window_size = 20971520
      • rgw_max_chunk_size = 20971520
      • rgw_obj_stripe_size = 20971520
      • rgw_override_bucket_index_max_shards = 10
      • rgw_put_obj_min_window_size = 20971520
      (20971520 bytes = 20 MiB)
  11. SARAO Ceph Lessons Learned
      • Hardware failures
      • OSD node hardware
      • Using the ceph balancer
      • OSD sizes on SwartKAT-C1
      • More lessons to be learned on SeeKAT-C1
  12. OSD Node Hardware: Peralex DSS HC storage pod
      Key Features
      Connectivity   25 GbE (SFP+); 48 x SAS 3
      Processor      Intel Xeon (Haswell-EP), 4 cores @ 3.7 GHz
      Memory         128 GB
      Networking     Mellanox Network Interface Card, 25 GbE, single-port
      Drive Config   48 x 8 TB HDDs, 1 x 512 GB NVMe
      Form Factor    4U with 1500 W PSU
      Per node: 2.6 GB RAM per OSD (128 GB / 48 OSDs), one 4-core 3.7 GHz CPU shared by all 48 OSDs
  13. Ceph Community Activities
      • Locally organized Ceph meetups
      • Gitter IM community
      • Ceph presentations at South African conferences
      • Document existing clusters / users and bring them together
  14. Mzanzi Ceph Clusters
      Cluster   Version   # OSDs   Application        Total Capacity
      DIRISA    12.2.8    60       RBD                218.0 TiB
      ACELAB    14.2.2    50       CephFS, RBD        92.0 TiB
      ILIFU     12.2.12   214      CephFS             2.0 PiB
      VUMACAM   12.2.4    245      CephFS with CIFS   1.0 PiB
  15. SARAO Ceph Future Plans
      • Decommission SeeKAT-C1 (70% full) and do some testing along the way
      • Upgrade to Nautilus
      • Design work started for a new storage cluster for MeerKAT+ (20 more dishes):
        • 4 x the current data rate
        • will require another ~20 PB raw (~15 PB in EC pools; see the rough calculation below)
      • Continue work to build the South African Ceph community and support current South African Ceph users
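As a rough check on the capacity figures above: the usable capacity of an erasure-coded pool is raw × k / (k + m). The profile in the sketch is an assumption (k=6, m=2 gives the same 0.75 ratio as ~15 PB out of ~20 PB); the slide does not say which profile is actually planned.

```python
# Back-of-envelope for "~20 PB raw (~15 PB ec-pools)". The EC profile is assumed:
# k=6, m=2 yields a usable fraction of 0.75; the actual SARAO profile is not stated.
def ec_usable(raw_pb, k, m):
    """Usable capacity of an erasure-coded pool: raw * k / (k + m)."""
    return raw_pb * k / (k + m)

print(ec_usable(20, k=6, m=2))   # 15.0, matching the ~15 PB quoted above
```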
