Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Gluster Fs


Published on

Published in: Technology, Business
  • Be the first to comment

Gluster Fs

  1. 1. Z RESEARCH Inc. GlusterFS Cluster File System Z RESEARCH Inc. Non-stop Storage © 2007 Z RESEARCH
  2. 2. Z RESEARCH Inc. GlusterFS Cluster File System GlusterFS is a Cluster File System that aggregates multiple storage bricks over InfiniBand RDMA into one large parallel network file system. + + = N x Performance & Capacity © 2007 Z RESEARCH
  3. 3. Z RESEARCH Inc. Key Design Considerations Capacity Scaling Scalable beyond Peta Bytes I/O Throughput Scaling Pluggable Clustered I/O Schedulers Advantage of RDMA transport Reliability Non Stop Storage No Meta Data Ease of Manageability Self Heal NFS like Disk Layout Elegance in Design Stackable Modules Not tied to I/O Profiles or Hardware or OS © 2007 Z RESEARCH
  4. 4. Z RESEARCH Inc. GlusterFS Design de Compatibility with Si MS Windows Storage Clients nt and other Unices Cluster of Clients (Supercomputer, Data Center) ie Cl GLFS Client GLFS Client GigE ClusteredClient GLFS Vol Manager ClusteredClient GLFS Vol Manager NFS / SAMBA GlusterFS Client Clustered Vol Manager GlusterFS Client Clustered Vol Manager over TCP/IP Clustered I/O Scheduler Clustered Vol Manager Clustered I/O Scheduler Clustered Vol Manager Clustered I/O Scheduler Clustered I/O Scheduler Storage Gateway Storage Gateway Storage Gateway Clustered I/O Scheduler Clustered I/O Scheduler NFS/Samba NFS/Samba NFS/Samba GLFS Client GLFS Client GLFS Client RDMA RDMA InfiniBand RDMA (or) TCP/IP RDMA e Si d GlusterFS Clustered Filesystem on x86-64 platform er rv GLFSD Se Storage Brick 1 Storage Brick 2 Storage Brick 3 Storage Brick 4 GLFSD Volume GlusterFS GlusterFS GlusterFS GlusterFS Volume Volume Volume Volume Volume © 2007 Z RESEARCH
  5. 5. Z RESEARCH Inc. Stackable Design n t l ie VFS C S e rF Read Ahead u st l I/O Cache G Unify GlusterFS GlusterFS Server Client Client Client TCPIP – GigE, 10GigE / InfiniBand - RDMA GlusterFS Server GlusterFS Server GlusterFS Server Server Server Server POSIX POSIX POSIX Ext3 Ext3 Ext3 Brick 1 Brick 2 Brick n © 2007 Z RESEARCH
  6. 6. Z RESEARCH Inc. Volume Layout BRICK1 BRICK2 BRICK3 Unify Volume work.ods benchmark.pdf mylogo.xcf corporate.odp test.ogg driver.c test.m4a driver.c initcore.c ether.c Mirror Volume accounts-2007.db accounts-2007.db accounts-2007.db accounts-2006.db accounts-2006.db accounts-2006.db Stripe Volume north-pole-map north-pole-map north-pole-map dvd-1.iso dvd1.iso dvd1.iso xen-image xen-image xen-image © 2007 Z RESEARCH
  7. 7. Z RESEARCH Inc. I/O Scheduling ➢Adaptive least usage (ALU) ➢NUFA ➢Random ➢Custom ➢Round robin volume bricks type cluster/unify subvolumes ss1c ss2c ss3c ss4c option scheduler alu option alu.limits.min-free-disk 60GB option alu.limits.max-open-files 10000 option alu.order disk-usage:read-usage:write-usage:open-files-usage:disk-speed-usage option alu.disk-usage.entry-threshold 2GB # Units in KB, MB and GB are allowed option alu.disk-usage.exit-threshold 60MB # Units in KB, MB and GB are allowed option 1024 option 32 option alu.stat-refresh.interval 10sec end-volume © 2007 Z RESEARCH
  8. 8. Z RESEARCH Inc. GlusterFS Benchmarks Benchmark Environment Method: Multiple 'dd' of varying blocks are read and written from multiple clients simultaneously. GlusterFS Brick Configuration (16 bricks) Processor - Dual Intel(R) Xeon(R) CPU 5160 @ 3.00GHz RAM - 8GB FB-DIMM Linux Kernel - 2.6.18-5+em64t+ofed111 (Debian) Disk - SATA-II 500GB HCA - Mellanox MHGS18-XT/S InfiniBand HCA Client Configuration (64 clients) RAM - 4GB DDR2 (533 Mhz) Processor - Single Intel(R) Pentium(R) D CPU 3.40GHz Linux Kernel - 2.6.18-5+em64t+ofed111 (Debian) Disk - SATA-II 500GB HCA - Mellanox MHGS18-XT/S InfiniBand HCA Interconnect Switch: Voltaire port InfiniBand Switch (14U) GlusterFS version 1.3.pre0-BENKI © 2007 Z RESEARCH
  9. 9. Z RESEARCH Inc. Aggregated Bandwidth Aggregated I/O Benchmark on 16 bricks(servers) and 64 clients over IB Verbs transport. Bps! 13G Peak aggregated read throughput -13GBps. After a particular threshold, write performance plateaus because of disk I/O bottleneck. System memory greater than the peak load will ensure best possible performance. ib-verbs transport driver is about 30% faster than ib-sdp transport driver. © 2007 Z RESEARCH
  10. 10. Z RESEARCH Inc. Scalability Performance improves when the number of bricks are increased Throughput increases with corresponding increased in servers from 1 to 16 © 2007 Z RESEARCH
  11. 11. Z RESEARCH Inc. Hardware Platform Example Storage Building Block  Intel SE5000PSL (Star Lake) baseboard Intel EPSD  Uses 2 (or 1) dual core LV Intel® Xeon® McKay Creek (Woodcrest) processors  Uses 1 Intel SRCSAS144e (Boiler Bay) Add-in Card Options  Infiniband RAID card  10 GbE  Two 2 ½” boot HDDs or boot from DOM  Fibre Channel  2U form factor, 12 hot-swap HDDs  3 ½” SATA 3.0Gbps (7.2k or 10k RPM)  SAS (10k or 15k RPM)  SES compliant enclosure management FW  External SAS JBOD expansion © 2007 Z RESEARCH
  12. 12. Z RESEARCH Inc. Thank You! Hitesh Chellani Tel: 510-354-6801 © 2007 Z RESEARCH