Default MTU = 1500 (not optimal for RoCE: because of header overhead, the resulting RoCE MTU is only 1K)
Will rerun the test with jumbo frames (2K and 4K)
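The relation between the Ethernet MTU and the RoCE MTU mentioned above can be sketched as follows. The ~100-byte per-packet overhead used here is an assumed round figure covering the Eth/GRH/BTH/ICRC/FCS headers, not an exact spec value:

```shell
# Pick the largest RoCE payload MTU (IB MTU values are powers of two,
# 256..4096) that still fits inside the Ethernet MTU after the assumed
# ~100 bytes of per-packet header overhead.
roce_active_mtu() {
  eth_mtu=$1
  overhead=100   # assumption: rough Eth + GRH + BTH + ICRC + FCS total
  best=256
  for p in 256 512 1024 2048 4096; do
    if [ $((p + overhead)) -le "$eth_mtu" ]; then best=$p; fi
  done
  echo "$best"
}

roce_active_mtu 1500   # default Ethernet MTU -> 1024, the 1K noted above
roce_active_mtu 4200   # jumbo frames large enough for the full 4K RoCE MTU
```

Under this assumption even 2K jumbo frames would not be enough for a 2K RoCE MTU, which is worth double-checking against the active_mtu reported by ibv_devinfo after the rerun.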
Single-node cluster with a single OSD; one fio_rbd client with numjobs=8 running a random read test.
Loading mlx4_core with parameter log_num_mgm_entry_size=-7 for the ConnectX-3 HCA (instead of the default -10)
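A sketch of how that module parameter might be made persistent and verified (paths and reload steps vary per distro, and reloading mlx4_core drops traffic on the HCA ports):

```shell
# /etc/modprobe.d/mlx4_core.conf -- set the multicast group table entry size
echo "options mlx4_core log_num_mgm_entry_size=-7" | sudo tee /etc/modprobe.d/mlx4_core.conf

# Reload the driver so the new value takes effect, then verify it
sudo modprobe -r mlx4_core && sudo modprobe mlx4_core
cat /sys/module/mlx4_core/parameters/log_num_mgm_entry_size
```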
Notes:
Two nodes benchmarks, no storage replication/clustering etc.
Workloads are random read/write
BW is measured at 256K IOs
IOPS are measured at 4K IOs
Data and header CRCs are OFF for both XioMessenger and SimpleMessenger
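The CRC setting corresponds to a ceph.conf fragment along these lines (option names assumed from the Ceph messenger settings; they apply to both XioMessenger and SimpleMessenger):

```ini
[global]
ms_crc_data = false
ms_crc_header = false
```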
1 fio_rbd client 1 stream (1 OSD, 1 pool/image, numjobs=1, iodepth=64,32 for random write/read respectively)
1 fio_rbd client 8 streams (1 OSD, 1 pool/image, numjobs=8, iodepth=64,32 for random write/read respectively)
2 fio_rbd clients 16 streams (1 OSD, 2 pools/images, each client with numjobs=8, iodepth=64,32 for random write/read respectively)
N/A – having problems running fio_rbd randwrite/write with numjobs > 2; will update with multiple clients using numjobs=1. The random write tests will therefore be run differently: instead of a single fio_rbd client with numjobs=8, 8 separate fio_rbd clients will each run with numjobs=1 against 8 different pools/images on the same OSD. Note that 8 clients on 8 separate pools/images should give slightly better results than 1 client with numjobs=8.
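For reference, the single-client 8-stream random read case above might be launched roughly like this (pool, image and client names are placeholders, and fio must be built with the rbd ioengine):

```shell
# Build the fio command line for config 2 (1 client, numjobs=8); it is only
# printed here, since actually running it needs a live Ceph cluster.
FIO_CMD="fio --name=randread-8j --ioengine=rbd --clientname=admin \
--pool=pool1 --rbdname=image1 --rw=randread --bs=4k --iodepth=32 \
--numjobs=8 --direct=1"
echo "$FIO_CMD"
```

For the random write variant, --rw would become randwrite with --iodepth=64, per the iodepth=64,32 convention above.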
2/4/8 OSD instances, 4 clients on 4 separate pools/images (numjobs=8 for random read, numjobs=1 for random write)
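A sketch of driving that 4-client scaling run (pool/image/client names are placeholders; the loop only prints the per-client fio command lines rather than running them):

```shell
# One fio_rbd client per pool/image; for the random write runs numjobs
# would drop to 1 as noted above.
CMDS=$(for i in 1 2 3 4; do
  echo "fio --name=client$i --ioengine=rbd --clientname=admin --pool=pool$i --rbdname=image$i --rw=randread --bs=4k --iodepth=32 --numjobs=8 --direct=1"
done)
echo "$CMDS"
```

In practice the four clients would be started in parallel (e.g. backgrounded) so they load the OSDs concurrently.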