netkit ftpd/ftp migration
Netkit
  • ftp://ftp.uk.linux.org/pub/linux/Networking/netkit
  • A port of the OpenBSD ftp daemon and client
  • Source code: ftpd (server) 4717 lines; ftp (client) 6249 lines
Flowchart
  • Server starts listening and waits for new connection requests
  • Client opens a new connection, then logs in to the server
  • After fork(), the child uses the established connection to exchange COMMANDs and REPLIES with the client
  • Communication channel: USER, PASS, PORT, PASV, RETR, STOR
  • Data transfer channel
Disk and Connection Operation
  • Use read() and write() to operate on both disk files and TCP connections
  • Default buffer size: BUFSIZ (8 KB) in stdio.h on RHEL 5
  • FILE * and fd
      fdopen: associate a stream with a file descriptor
      fileno: map a stream pointer to a file descriptor
Transfer Param - PORT
  • After login and authentication finish, the ftp client starts listening on a port, then sends the port information to the server: PORT h1,h2,h3,h4,p1,p2
  • Port choice: sin_port = 0 /* let system pick one */
  • Then use getsockname() to get the listening address and port, and send them to the server
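As a rough sketch of the PORT steps above: bind with `sin_port = 0`, ask the kernel which port it picked via getsockname(), and format the six comma-separated bytes. The helper names `listen_any_port` and `format_port_cmd` are hypothetical, and binding to the loopback address is an assumption made to keep the example self-contained; a real client would use the local address of its control connection.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Format "PORT h1,h2,h3,h4,p1,p2" from an IPv4 socket address.
 * Address and port are already in network byte order, so the bytes
 * can be printed in memory order. */
int format_port_cmd(const struct sockaddr_in *sa, char *out, size_t outlen)
{
    const unsigned char *h = (const unsigned char *)&sa->sin_addr;
    const unsigned char *p = (const unsigned char *)&sa->sin_port;

    return snprintf(out, outlen, "PORT %u,%u,%u,%u,%u,%u",
                    h[0], h[1], h[2], h[3], p[0], p[1]);
}

/* Listen on a system-chosen port and report the result in *sa.
 * Returns the listening descriptor, or -1 on error. */
int listen_any_port(struct sockaddr_in *sa)
{
    socklen_t len = sizeof *sa;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    memset(sa, 0, sizeof *sa);
    sa->sin_family = AF_INET;
    sa->sin_addr.s_addr = htonl(INADDR_LOOPBACK); /* assumption: demo only */
    sa->sin_port = 0;                             /* let system pick one */

    if (bind(fd, (struct sockaddr *)sa, sizeof *sa) < 0 ||
        listen(fd, 1) < 0 ||
        getsockname(fd, (struct sockaddr *)sa, &len) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

For example, port 40001 is sent as `p1,p2 = 156,65`, since 156 * 256 + 65 = 40001.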
Transfer Param - PASV
  • The ftp client sends the PASV command to the server; the server listens on a port and waits for the connection
  • Port choice: a port in [40000, 44999], or random: sin_port = 0 /* let system pick one */
  • Then use getsockname() to get the listening address and port, and send them to the client
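One way the ranged port choice could look is sketched below. It is not taken from the netkit source: `bind_pasv_port` and `format_pasv_reply` are hypothetical helpers, and starting the scan at a random offset inside [40000, 44999] is an assumption. The reply text follows the 227 format from RFC 959.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

#define PASV_MIN 40000
#define PASV_MAX 44999

/* Bind fd to a free port in [PASV_MIN, PASV_MAX], scanning from a random
 * starting point. sa must already hold the family and address.
 * Returns the port in host byte order, or -1 if none could be bound. */
int bind_pasv_port(int fd, struct sockaddr_in *sa)
{
    int span = PASV_MAX - PASV_MIN + 1;
    int start = rand() % span;

    for (int i = 0; i < span; i++) {
        int port = PASV_MIN + (start + i) % span;

        sa->sin_port = htons((unsigned short)port);
        if (bind(fd, (struct sockaddr *)sa, sizeof *sa) == 0)
            return port;
        if (errno != EADDRINUSE)
            return -1;              /* a different port will not help */
    }
    return -1;
}

/* Format the reply that carries the address and port back to the client. */
int format_pasv_reply(const struct sockaddr_in *sa, char *out, size_t outlen)
{
    const unsigned char *h = (const unsigned char *)&sa->sin_addr;
    unsigned port = ntohs(sa->sin_port);

    return snprintf(out, outlen,
                    "227 Entering Passive Mode (%u,%u,%u,%u,%u,%u)",
                    h[0], h[1], h[2], h[3], port >> 8, port & 0xff);
}
```

With `sin_port = 0` instead, the loop disappears and getsockname() reports whichever port the system picked.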
put - STOR
  Client
    fin = fopen(local, "r");
    read(fileno(fin), ..);
    dout = fdopen(connfd, ..);
    write(fileno(dout), ..);
  Server
    fout = fopen(local, "w");
    read(fileno(din), ..);
    write(fileno(fout), ..);
  FILE *fin -> *dout -> *din -> *fout;
get - RETR
  Client
    fout = fopen(local, "w");
    read(fileno(din), ..);
    write(fileno(fout), ..);
  Server
    fin = fopen(local, "r");
    if filesize < 16MB
      mmap(.., fileno(fin), ..);
      write(fileno(dout), .., filesize);
    else
      read(fileno(fin), .., blksize * 16);
      write(fileno(dout), ..);
  FILE *fout <- *din <- *dout <- *fin;
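The server-side RETR branch can be sketched roughly as follows: map small files and send them in one pass, otherwise loop over read()/write() with a buffer of `blksize * 16` bytes. This is an illustration rather than the netkit code; `send_file` and `write_all` are hypothetical names, and the threshold is passed as a parameter (`mmap_limit`) so both paths are easy to exercise.

```c
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>

#define MMAP_LIMIT (16L * 1024 * 1024)   /* 16MB threshold from the slide */

/* Write len bytes, retrying on short writes and EINTR. */
static int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t w = write(fd, buf, len);
        if (w < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        buf += w;
        len -= (size_t)w;
    }
    return 0;
}

/* Send the file behind file_fd to out_fd.
 * Returns the number of bytes sent, or -1 on error. */
long send_file(int file_fd, int out_fd, size_t blksize, off_t mmap_limit)
{
    struct stat st;

    if (fstat(file_fd, &st) < 0)
        return -1;

    if (S_ISREG(st.st_mode) && st.st_size > 0 && st.st_size < mmap_limit) {
        size_t len = (size_t)st.st_size;
        void *map = mmap(NULL, len, PROT_READ, MAP_SHARED, file_fd, 0);
        if (map != MAP_FAILED) {
            int rc = write_all(out_fd, map, len);
            munmap(map, len);
            return rc < 0 ? -1 : (long)len;
        }
        /* mmap failed: fall through to the read()/write() loop */
    }

    size_t bufsz = blksize * 16;
    char *buf = malloc(bufsz);
    long total = 0;
    ssize_t n;

    if (buf == NULL)
        return -1;
    while ((n = read(file_fd, buf, bufsz)) != 0) {
        if (n < 0) {
            if (errno == EINTR)
                continue;
            total = -1;
            break;
        }
        if (write_all(out_fd, buf, (size_t)n) < 0) {
            total = -1;
            break;
        }
        total += n;
    }
    free(buf);
    return total;
}
```

A call such as `send_file(fileno(fin), fileno(dout), blksize, MMAP_LIMIT)` corresponds to the `*dout <- *fin` step of the slide.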
ftpcmd.y
  • Procedure: bison -y ftpcmd.y
  • Output: y.tab.c
  • Compile: gcc y.tab.c
  • Same procedure as Oracle Pro*C (precompile to C, then compile the generated C)
Migrate ftpd/ftp to RDMA environment
  • Use librdmacm to establish the data transfer channel instead of a socket
  • Server starts listening and waits for new connection requests
  • Client opens a new connection, then logs in to the server
  • After fork(), the child uses the established connection to exchange COMMANDs and REPLIES with the client
  • Communication channel: USER, PASS, PORT, PASV, RETR, STOR
  • Data transfer channel
Desc
  • Sequential processing: cmd – data – cmd – data – …
  • While a process handles a data transfer, it does not watch the communication channel for errors; there is no poll/select/epoll etc.
  • Each process handles an individual data transfer channel.
  • When the data transfer finishes, the channel is closed.
More efficient transfer – data transfer channel
  • GridFTP's method: parallel and striped transfer.
  • Can we use parallel data transfer channels in rdma-ftp?
  • Or do we just implement the GridFTP commands and protocol?
More efficient transfer – buffer and memory copy
  • Read data directly into the MR (memory region). Avoid memcpy() in the process.
  • Reuse the MR.
  • Separate the EXECUTOR and the RESOURCE
      Executor: sender, receiver, reader, writer, manager…
  • Organize the MR blocks with linked lists: free block lists and busy block lists
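The free/busy list idea can be sketched in plain C as below. All names (`mr_block`, `mr_pool`, and the functions) are hypothetical, the pool uses ordinary malloc() memory where a real implementation would register the region with the RDMA device (for example via `ibv_reg_mr`), and there is no locking, so this only shows the list handling for a single thread.

```c
#include <stddef.h>
#include <stdlib.h>

struct mr_block {
    struct mr_block *next;
    char *buf;      /* slice of the pool's memory; would lie inside the MR */
    size_t len;     /* number of valid bytes currently in buf */
};

struct mr_pool {
    struct mr_block *free_list;   /* blocks a reader/receiver may fill */
    struct mr_block *busy_head;   /* filled blocks, oldest first */
    struct mr_block *busy_tail;
    struct mr_block *blocks;      /* backing array of block headers */
    char *mem;                    /* backing memory, allocated once and reused */
};

/* Allocate nblocks blocks of block_size bytes; all start on the free list. */
int mr_pool_init(struct mr_pool *p, size_t nblocks, size_t block_size)
{
    p->blocks = calloc(nblocks, sizeof *p->blocks);
    p->mem = malloc(nblocks * block_size);
    if (p->blocks == NULL || p->mem == NULL) {
        free(p->blocks);
        free(p->mem);
        return -1;
    }
    p->free_list = p->busy_head = p->busy_tail = NULL;
    for (size_t i = 0; i < nblocks; i++) {
        p->blocks[i].buf = p->mem + i * block_size;
        p->blocks[i].next = p->free_list;
        p->free_list = &p->blocks[i];
    }
    return 0;
}

/* Reader side: take an empty block, or NULL if none is free. */
struct mr_block *mr_get_free(struct mr_pool *p)
{
    struct mr_block *b = p->free_list;
    if (b != NULL)
        p->free_list = b->next;
    return b;
}

/* Reader side: queue a filled block for the sender (FIFO order). */
void mr_put_busy(struct mr_pool *p, struct mr_block *b)
{
    b->next = NULL;
    if (p->busy_tail != NULL)
        p->busy_tail->next = b;
    else
        p->busy_head = b;
    p->busy_tail = b;
}

/* Sender side: take the oldest filled block, or NULL if none is queued. */
struct mr_block *mr_get_busy(struct mr_pool *p)
{
    struct mr_block *b = p->busy_head;
    if (b != NULL) {
        p->busy_head = b->next;
        if (p->busy_head == NULL)
            p->busy_tail = NULL;
    }
    return b;
}

/* Sender side: return a drained block for reuse. */
void mr_put_free(struct mr_pool *p, struct mr_block *b)
{
    b->len = 0;
    b->next = p->free_list;
    p->free_list = b;
}

void mr_pool_destroy(struct mr_pool *p)
{
    free(p->blocks);
    free(p->mem);
}
```

Because a reader fills `buf` in place and a sender transmits from the same `buf`, no memcpy() is needed between the two executors; only the block header moves between lists.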
Example (diagram): reader ×3, sender ×3, writer ×3, receiver ×3, manager ×2, listener ×2
MiddleWare?
  • On each host, set up a group of daemon processes responsible for buffer management, remote data transfer, file operations, inter-process communication, etc.
  • Applications only need to send commands and parameters to those daemons to initiate the data transfer and check the result.
  • (Diagram: two RDMA-DAEMONs, each serving ftp, http and scp.)

Editor's Notes

  • #4 Access Control Commands: USER, PASS, CWD, QUIT. Transfer Parameter Commands: PORT (the client tells the server its address and port), PASV (the server tells the client its address and port). FTP Service Commands: RETR, STOR.
  • #13 GridFTP allows simultaneous TCP streams. Files can be downloaded in pieces simultaneously from multiple sources, or in separate parallel streams from the same source, making better use of bandwidth.
  • #16 Daemon for data transfer