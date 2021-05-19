
Gluster dev session #6: Understanding Gluster's network communication layer

This session explains how the RPC layer, together with XDR, works in GlusterFS, from the point of view of both the client and the server xlators (translators).

  1. Network Layer
     Insights into how GlusterFS’s RPC and network layers work!
  2. Agenda
     ● Why do we need networking?
     ● What is a protocol?
     ● What is RPC?
     ● How to pack data on the network (XDR)?
     ● Gluster’s networking layer
     ● History
     ● Code walkthrough
     ● Challenges
     ● Roadmap
  3. Network in Filesystem
     ● Client-server architecture.
     ● Data lives on servers and is accessed from clients.
     ● All connections are initiated by the client; the server is always in listen() mode.
       ○ Gluster uses TCP connections, i.e., all connections are stateful and always on.
     ● Most operations are initiated from the client.
       ○ cbk (callback) methods are used to initiate a request/message from the server side.
       ○ cbk methods are generally used for ‘notification’.
  4. Protocol
     ● A set of guidelines on how to order data and how to interpret requests, responses, etc.
     ● Examples include HTTP, TCP/IP, FTP, and SSH.
     ● GlusterFS currently uses a combination of RPC and XDR as its network protocol.
  5. RPC
     ● Remote Procedure Call (RFC 5531)
     [Figure: Normal function call (call, exec) compared with Remote Procedure Call (call, n/w, exec)]

     struct rpc_msg {
         unsigned int xid;
         union switch (msg_type mtype) {
         case CALL:
             call_body cbody;
         case REPLY:
             reply_body rbody;
         } body;
     };

     struct call_body {
         unsigned int rpcvers; /* must be equal to two (2) */
         unsigned int prog;
         unsigned int vers;
         unsigned int proc;
         opaque_auth cred;
         opaque_auth verf;
         /* procedure-specific parameters start here */
     };
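
To make the call header above concrete, here is a minimal sketch in C that lays the `rpc_msg` / `call_body` fields out as big-endian 32-bit words, the way XDR encodes unsigned integers. The helper names (`put_u32`, `get_u32`, `encode_call_header`) are hypothetical and are not GlusterFS functions; the sketch also assumes `AUTH_NONE` credentials with empty bodies, which keeps the header at ten words.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write a 32-bit value in big-endian (network) order, as XDR requires. */
static size_t put_u32(uint8_t *buf, uint32_t v)
{
    buf[0] = (uint8_t)(v >> 24);
    buf[1] = (uint8_t)(v >> 16);
    buf[2] = (uint8_t)(v >> 8);
    buf[3] = (uint8_t)v;
    return 4;
}

/* Read a 32-bit big-endian value back. */
static uint32_t get_u32(const uint8_t *buf)
{
    return ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
           ((uint32_t)buf[2] << 8) | (uint32_t)buf[3];
}

enum { MSG_CALL = 0, MSG_REPLY = 1 }; /* msg_type values from RFC 5531 */
enum { RPC_VERSION = 2 };             /* rpcvers must be two */
enum { AUTH_NONE = 0 };               /* auth flavor with an empty body */

/*
 * Encode xid, msg_type and the call_body fields, using AUTH_NONE for
 * both cred and verf. Returns the number of bytes written (40 here).
 * Procedure-specific parameters would follow at buf + 40.
 */
static size_t encode_call_header(uint8_t *buf, size_t cap, uint32_t xid,
                                 uint32_t prog, uint32_t vers, uint32_t proc)
{
    size_t off = 0;

    assert(buf != NULL && cap >= 40);
    off += put_u32(buf + off, xid);
    off += put_u32(buf + off, MSG_CALL);
    off += put_u32(buf + off, RPC_VERSION);
    off += put_u32(buf + off, prog);
    off += put_u32(buf + off, vers);
    off += put_u32(buf + off, proc);
    off += put_u32(buf + off, AUTH_NONE); /* cred flavor */
    off += put_u32(buf + off, 0);         /* cred body length */
    off += put_u32(buf + off, AUTH_NONE); /* verf flavor */
    off += put_u32(buf + off, 0);         /* verf body length */
    return off;
}
```

A caller would pick its own program, version, and procedure numbers; any values used to try this out are placeholders, not GlusterFS's registered numbers.
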
  6. XDR
     ● External Data Representation (RFC 4506)
     ● Used for the procedure-specific payload.
     ● The client sends the payload; the server expects the fields in the same order.
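
The "same order" rule can be illustrated with a small sketch in C. It hand-encodes a hypothetical request (`struct demo_open_req`, not a real GlusterFS structure) using two XDR rules from RFC 4506: unsigned integers are four big-endian bytes, and strings are a four-byte length followed by the bytes, zero-padded to a multiple of four. The function names are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* XDR unsigned int: four bytes, most significant byte first. */
static size_t xdr_put_u32(uint8_t *buf, uint32_t v)
{
    buf[0] = (uint8_t)(v >> 24);
    buf[1] = (uint8_t)(v >> 16);
    buf[2] = (uint8_t)(v >> 8);
    buf[3] = (uint8_t)v;
    return 4;
}

/* XDR string: length word, the bytes, then zero padding to 4 bytes. */
static size_t xdr_put_string(uint8_t *buf, const char *s)
{
    size_t len = strlen(s);
    size_t pad = (4 - (len % 4)) % 4;
    size_t off = xdr_put_u32(buf, (uint32_t)len);

    memcpy(buf + off, s, len);
    memset(buf + off + len, 0, pad);
    return off + len + pad;
}

/* Hypothetical payload: the field order below is the wire contract. */
struct demo_open_req {
    uint32_t    flags;
    const char *path;
};

/* Encode the fields strictly in declaration order; returns bytes used. */
static size_t encode_demo_open_req(uint8_t *buf, size_t cap,
                                   const struct demo_open_req *req)
{
    size_t off = 0;

    assert(buf != NULL && req != NULL && req->path != NULL);
    assert(cap >= 4 + 4 + strlen(req->path) + 3);
    off += xdr_put_u32(buf + off, req->flags);
    off += xdr_put_string(buf + off, req->path);
    return off;
}
```

A decoder has no field tags to go by, so it must read `flags` and then `path` in exactly this order; swapping the two on one side only would make the peer misread the bytes.
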
  7. History of Gluster’s n/w layer
     ● Binary packing of structures (v1.x)
       ○ Just do write(sockfd, structure, sizeof(structure));
       ○ Does not work across a network of different machine types.
       ○ Hard to manage versions and rolling upgrades.
     ● Dictionary stream as protocol (v2.x)
       ○ Works smoothly on any type of machine and across all versions.
       ○ Too much CPU load (for dict encode/decode operations).
     ● RPC / XDR (v3.x onwards)
       ○ Common network layer for both NFS and GlusterFS protocols.
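
One way to see why raw binary packing breaks between machine types: the in-memory layout of a C structure depends on the compiler's padding and the host's byte order, whereas an explicit encoder fixes both. The sketch below uses a hypothetical `struct raw_hdr` (not taken from any Gluster release) and compares it with a hand-written five-byte wire form.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header, as it might be sent raw with write(). */
struct raw_hdr {
    uint8_t  type;
    uint32_t size;
};

/* Explicit wire form: 1 byte type + 4 bytes size, no padding. */
#define WIRE_HDR_LEN 5

/*
 * Encode the header byte by byte. The output is identical on every
 * host, whatever its endianness or structure padding.
 */
static size_t encode_hdr(uint8_t *buf, size_t cap, const struct raw_hdr *h)
{
    assert(buf != NULL && h != NULL && cap >= WIRE_HDR_LEN);
    buf[0] = h->type;
    buf[1] = (uint8_t)(h->size >> 24);
    buf[2] = (uint8_t)(h->size >> 16);
    buf[3] = (uint8_t)(h->size >> 8);
    buf[4] = (uint8_t)h->size;
    return WIRE_HDR_LEN;
}
```

On many ABIs `sizeof(struct raw_hdr)` is 8 rather than 5 because of alignment padding, and the four `size` bytes appear in host order, so two different machines would disagree about what `write(sockfd, &hdr, sizeof(hdr))` put on the wire.
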
  8. Gluster’s RPC layer
     ● Key components to look at:
       ○ xlator/protocol
       ○ rpc/lib
       ○ rpc/xdr
     ● Network layer’s major responsibilities:
       ○ Connection management
       ○ RPC
       ○ Notification
       ○ Modularity (TCP/IP - RDMA and others)
  9. Gluster Networking Layer - A Walkthrough
     ● Check the open() fop:
       ○ client/protocol - understand the program number, version, and procedure number.
       ○ Understand XDR encoding.
       ○ The network layer just does ‘write()/read()’ on the socket.
       ○ On the server, the rpc layer looks at the program number, version, and procedure number, and calls the corresponding method/actor.
       ○ In the actor, the procedure-specific payload gets decoded.
       ○ The response path follows the same steps, but the reply carries only the XID (transaction ID); the client must use it to match the response to the original request.
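
The two lookups in this walkthrough can be sketched in C: the server maps (program, version, procedure) to an actor, and the client maps a reply's XID back to the saved request. All type and function names here (`demo_actor`, `lookup_actor`, `save_call`, `handle_reply`) are hypothetical stand-ins, and the fixed-size table is a simplification; this is not the GlusterFS implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Server side: an actor handles one procedure of a program. */
typedef int (*actor_fn)(const uint8_t *payload, size_t len);

struct demo_actor { uint32_t procnum; const char *name; actor_fn fn; };

struct demo_program {
    uint32_t prognum;
    uint32_t progver;
    const struct demo_actor *actors;
    size_t nactors;
};

/* Placeholder actors; a real one would XDR-decode its payload first. */
static int demo_null(const uint8_t *p, size_t len) { (void)p; (void)len; return 0; }
static int demo_open(const uint8_t *p, size_t len) { (void)p; return (int)len; }

static const struct demo_actor demo_actors[] = {
    { 0, "NULL", demo_null },
    { 1, "OPEN", demo_open },
};

static const struct demo_program demo_prog = {
    0x20000001u, 1u, demo_actors,
    sizeof(demo_actors) / sizeof(demo_actors[0]),
};

/* Return the actor for (prog, vers, proc), or NULL if the call is unknown. */
static const struct demo_actor *lookup_actor(const struct demo_program *p,
                                             uint32_t prog, uint32_t vers,
                                             uint32_t proc)
{
    size_t i;

    assert(p != NULL);
    if (p->prognum != prog || p->progver != vers)
        return NULL;
    for (i = 0; i < p->nactors; i++) {
        if (p->actors[i].procnum == proc)
            return &p->actors[i];
    }
    return NULL;
}

/* Client side: remember the callback for every in-flight XID. */
typedef void (*reply_cbk)(uint32_t xid, void *frame);

struct pending_call { int in_use; uint32_t xid; reply_cbk cbk; void *frame; };

#define MAX_PENDING 16
static struct pending_call pending[MAX_PENDING];

/* Record a sent request; returns 0 on success, -1 if the table is full. */
static int save_call(uint32_t xid, reply_cbk cbk, void *frame)
{
    size_t i;

    for (i = 0; i < MAX_PENDING; i++) {
        if (!pending[i].in_use) {
            pending[i].in_use = 1;
            pending[i].xid = xid;
            pending[i].cbk = cbk;
            pending[i].frame = frame;
            return 0;
        }
    }
    return -1;
}

/* Route a reply to its caller by XID; returns -1 for an unknown XID. */
static int handle_reply(uint32_t xid)
{
    size_t i;

    for (i = 0; i < MAX_PENDING; i++) {
        if (pending[i].in_use && pending[i].xid == xid) {
            pending[i].in_use = 0;
            pending[i].cbk(xid, pending[i].frame);
            return 0;
        }
    }
    return -1;
}

/* Example callback: store the XID into the caller's frame. */
static void note_reply(uint32_t xid, void *frame)
{
    *(uint32_t *)frame = xid;
}
```

The reply needs no program or procedure number because the client already recorded what it asked for; the XID alone is enough to find the saved callback.
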
  10. Challenges
      ● Because we use XDR, it is critical to keep structures identical across versions.
      ● Version compatibility is a challenge while the project is evolving.
      ● Performance: the current XDR and RPC layers have a large performance impact.
        ○ They increase memory allocation (of small segments).
        ○ Multiple system calls are needed to read the RPC headers and understand the payload.
        ○ Connection management is a challenge.
      ● Upgrading to a new version causes issues if anything on the wire changes, as we can’t expect all nodes in the network to be upgraded in one shot.
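
The point about multiple system calls can be illustrated with RPC record marking. This sketch assumes the standard RFC 5531 framing for stream transports, which the slides do not spell out: each fragment starts with a four-byte header whose top bit marks the last fragment and whose low 31 bits give the fragment length. A receiver therefore reads four bytes first and only then knows how much more to read. The names `frag_hdr`, `parse_frag_hdr`, and `make_frag_hdr` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct frag_hdr {
    int      last;   /* 1 if this is the final fragment of the record */
    uint32_t length; /* number of payload bytes that follow */
};

/* Decode the 4-byte big-endian record-marking header. */
static struct frag_hdr parse_frag_hdr(const uint8_t raw[4])
{
    struct frag_hdr h;
    uint32_t word;

    assert(raw != NULL);
    word = ((uint32_t)raw[0] << 24) | ((uint32_t)raw[1] << 16) |
           ((uint32_t)raw[2] << 8) | (uint32_t)raw[3];
    h.last = (word & 0x80000000u) != 0;
    h.length = word & 0x7fffffffu;
    return h;
}

/* Build the header a sender would put in front of a fragment. */
static void make_frag_hdr(uint8_t raw[4], int last, uint32_t length)
{
    uint32_t word = (length & 0x7fffffffu) | (last ? 0x80000000u : 0u);

    assert(raw != NULL);
    raw[0] = (uint8_t)(word >> 24);
    raw[1] = (uint8_t)(word >> 16);
    raw[2] = (uint8_t)(word >> 8);
    raw[3] = (uint8_t)word;
}
```

With this framing, a simple receiver issues one read for the header and at least one more for the body, and the RPC header inside the body still has to be decoded before the payload can be interpreted.
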
  11. Things to consider while developing
      ● Never insert a value in the middle of the procedure-number list or of enums specific to XDR.
      ● Don’t reorder the fields of an XDR structure, and don’t otherwise change an existing XDR structure.
      ● If you need a new field or a new XDR structure, add it as another procedure, or as a new program version with different actors.
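
These rules can be sketched in C with a hypothetical procedure enum (the `DEMO_PROC_*` names and values are illustrative, not GlusterFS's). The new procedure is appended after every released value, compile-time checks pin the released numbers, and a small helper shows one assumed way a newer client could fall back when a peer only knows the older procedures.

```c
#include <assert.h>
#include <stdint.h>

enum demo_procnum {
    DEMO_PROC_NULL = 0,
    DEMO_PROC_STAT = 1,
    DEMO_PROC_OPEN = 2,
    /* Added in a later release: appended, never inserted above. */
    DEMO_PROC_OPEN_V2 = 3, /* carries the extended request structure */
    DEMO_PROC_MAXVALUE
};

/* Pin the released numbers so an accidental insertion fails to compile. */
_Static_assert(DEMO_PROC_NULL == 0, "released procedure number changed");
_Static_assert(DEMO_PROC_STAT == 1, "released procedure number changed");
_Static_assert(DEMO_PROC_OPEN == 2, "released procedure number changed");

/*
 * Hypothetical negotiation: a peer that reports `peer_maxvalue`
 * understands only procedure numbers below that value.
 */
static int peer_supports(uint32_t proc, uint32_t peer_maxvalue)
{
    return proc < peer_maxvalue;
}

/* Use the new procedure when the peer knows it, else the original one. */
static enum demo_procnum pick_open_proc(uint32_t peer_maxvalue)
{
    /* Every peer is assumed to support at least the original OPEN. */
    assert(peer_maxvalue > (uint32_t)DEMO_PROC_OPEN);
    return peer_supports(DEMO_PROC_OPEN_V2, peer_maxvalue)
               ? DEMO_PROC_OPEN_V2
               : DEMO_PROC_OPEN;
}
```

Because the old `DEMO_PROC_OPEN` keeps both its number and its original structure, nodes that have not been upgraded continue to work during a rolling upgrade.
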
  12. Roadmap
      ● XDR -> Protobuf
      ● RPC -> gRPC
      ● Better modularity
      ● RDMA (re-enable)
        ○ IB-Verbs and RoCE
      ● DRC (Duplicate Request Cache)
  13. Thank You
      ● Credits: Pranith Kumar Karampuri (@pranithk)
      ● Reach out: Twitter - @tumballi / @kadaluIO / @gluster
        https://gluster.slack.com / https://gluster.org
        https://kadalu.slack.com / https://kadalu.io

