1. Remote Direct Memory Access (RDMA) over IP
   PFLDNet 2003, Geneva
   Stephen Bailey, Sandburst Corp., steph@sandburst.com
   Allyn Romanow, Cisco Systems, allyn@cisco.com
2. RDDP Is Coming Soon
   - "ST [RDMA] Is The Wave Of The Future" – S. Bailey & C. Good, CERN 1999
   - Need:
     - standard protocols
     - host software
     - accelerated NICs (RNICs)
     - faster host buses (for > 1 Gb/s)
   - Vendors are finally serious:
     - Broadcom, Intel, Agilent, Adaptec, Emulex, Microsoft, IBM, HP (Compaq, Tandem, DEC), Sun, EMC, NetApp, Oracle, Cisco & many, many others
3. Overview
   - Motivation
   - Architecture
   - Open Issues
4. CFP: SIGCOMM Workshop
   - NICELI, SIGCOMM 2003 workshop
   - Workshop on Network-I/O Convergence: Experience, Lessons, Implications
   - http://www.acm.org/sigcomm/sigcomm2003/workshop/niceli/index.html
5. High-Speed Data Transfer
   - Bottlenecks:
     - Protocol performance
     - Router performance
     - End-station performance (host processing)
       - CPU utilization
       - The I/O bottleneck:
         - Interrupts
         - TCP checksum
         - Copies
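Of the per-byte costs in that list, the TCP checksum is the easiest one to see in code: without NIC offload, the host CPU reads every byte of every segment just to verify it. Below is a minimal sketch of the standard Internet one's-complement checksum (RFC 1071), included only as an illustration; it is not part of the original slides.

```c
#include <stddef.h>
#include <stdint.h>

/* Internet checksum (RFC 1071): one's-complement sum of 16-bit words.
 * Every byte of the payload is touched by the CPU, which is why checksum
 * offload to the NIC matters at gigabit rates. */
uint16_t internet_checksum(const void *data, size_t len)
{
    const uint8_t *p = data;
    uint32_t sum = 0;

    while (len > 1) {                     /* sum 16-bit words */
        sum += ((uint32_t)p[0] << 8) | p[1];
        p += 2;
        len -= 2;
    }
    if (len == 1)                         /* pad a trailing odd byte */
        sum += (uint32_t)p[0] << 8;

    while (sum >> 16)                     /* fold carries back in */
        sum = (sum & 0xffff) + (sum >> 16);

    return (uint16_t)~sum;                /* one's complement of the sum */
}
```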
6. What Is RDMA?
   - Avoids copying by letting the network adapter, under application control, steer data directly into application buffers
   - Bulk data transfer, or kernel bypass for small messages
   - Grid, cluster, supercomputing, data centers
   - Historically, special-purpose fabrics: Fibre Channel, VIA, InfiniBand, Quadrics, ServerNet
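To make "the adapter steers data directly into application buffers" concrete, here is a minimal sketch expressed with the modern libibverbs API (which postdates these slides): the application registers a buffer with the NIC, then posts a one-sided RDMA Write against a remote buffer identified by an address and key exchanged out of band. Connection setup and completion handling are omitted, and `rdma_write_example` and its parameters are illustrative names, not anything defined in the talk.

```c
#include <stdint.h>
#include <infiniband/verbs.h>

/* Sketch: register a local buffer and post a one-sided RDMA Write that the
 * remote NIC places directly into a remote application buffer, with no
 * receive-side CPU copy.  Assumes `qp` is an already-connected queue pair
 * and that the peer's buffer address and rkey were exchanged out of band. */
int rdma_write_example(struct ibv_pd *pd, struct ibv_qp *qp,
                       void *buf, size_t len,
                       uint64_t remote_addr, uint32_t rkey)
{
    /* Pin and register the local buffer so the NIC can DMA from it. */
    struct ibv_mr *mr = ibv_reg_mr(pd, buf, len, IBV_ACCESS_LOCAL_WRITE);
    if (!mr)
        return -1;

    struct ibv_sge sge = {
        .addr   = (uintptr_t)buf,
        .length = (uint32_t)len,
        .lkey   = mr->lkey,
    };

    struct ibv_send_wr wr = {
        .opcode     = IBV_WR_RDMA_WRITE,    /* one-sided write */
        .sg_list    = &sge,
        .num_sge    = 1,
        .send_flags = IBV_SEND_SIGNALED,
        .wr.rdma    = { .remote_addr = remote_addr, .rkey = rkey },
    };

    struct ibv_send_wr *bad_wr = NULL;
    int rc = ibv_post_send(qp, &wr, &bad_wr);

    /* A real application would reap the completion from the send CQ here. */
    return rc;
}
```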
7. Traditional Data Center
   [Diagram: servers ("A Machine", running an application) connect to "The World" over Ethernet/IP, to storage over a Fibre Channel storage network, and to the database tier over an intermachine network (VIA, IB, proprietary)]
8. Why RDMA over IP? The Business Case
   - TCP/IP is not used for high-bandwidth interconnection; host processing costs are too high
   - High-bandwidth transfer is becoming more prevalent: 10 GbE, data centers
   - Special-purpose interfaces are expensive
   - IP NICs are cheap, high-volume parts
9. The Technical Problem: the I/O Bottleneck
   - With TCP/IP, host processing can't keep up with link bandwidth, especially on receive
   - Per-byte costs dominate – Clark et al. (1989)
   - Well researched by the distributed-systems community in the mid-1990s; confirmed by industry experience
   - Memory bandwidth doesn't scale; the processor–memory performance gap grows – Hennessy (1997); D. Patterson, T. Anderson (1997)
   - STREAM benchmark
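The STREAM benchmark referenced above measures sustainable memory bandwidth with simple vector kernels. The sketch below is not STREAM itself, just a minimal copy-bandwidth microbenchmark in the same spirit; it puts a number on the memory-bandwidth ceiling that a copy-based receive path runs into. Buffer sizes and iteration counts are arbitrary choices.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Minimal copy-bandwidth microbenchmark in the spirit of the STREAM "copy"
 * kernel: measures how fast the host can move bytes between two buffers,
 * which bounds what a copy-based network stack can deliver. */
int main(void)
{
    const size_t n = 64 * 1024 * 1024;    /* 64 MiB per buffer */
    const int iters = 20;

    char *src = malloc(n), *dst = malloc(n);
    if (!src || !dst)
        return 1;
    memset(src, 0xA5, n);                 /* touch pages before timing */
    memset(dst, 0, n);

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < iters; i++)
        memcpy(dst, src, n);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    double gbytes = (double)n * iters / 1e9;
    printf("copy bandwidth: %.2f GB/s\n", gbytes / secs);

    free(src);
    free(dst);
    return 0;
}
```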
10. Copying
    - Using IP transports (TCP & SCTP) requires data copying
    - [Diagram: receive path from the NIC through packet buffers to the user buffer, showing 2 data copies]
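For reference, this is what the copying path looks like from the application's side: a perfectly ordinary blocking receive loop. Every recv() call ends with the kernel copying payload from its packet buffers into the user buffer, one of the copies counted in the diagram. The helper below is a generic sketch, not code from the talk.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Conventional sockets receive path: each recv() finishes with the kernel
 * copying payload from its packet buffers into `buf`, in addition to the
 * NIC-to-kernel transfer; these are the copies the slide is counting.
 * `fd` is assumed to be a connected TCP (or SCTP) stream socket. */
ssize_t read_exactly(int fd, void *buf, size_t len)
{
    size_t done = 0;
    while (done < len) {
        ssize_t n = recv(fd, (char *)buf + done, len - done, 0);
        if (n <= 0)            /* error or peer closed the connection */
            return n;
        done += (size_t)n;     /* kernel copied n bytes into user memory */
    }
    return (ssize_t)done;
}
```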
11. Why Is Copying Important?
    - Heavy resource consumption at high speed (1 Gb/s and up)
      - Uses a large % of the available CPU
      - Uses a large fraction of the available bus bandwidth – at minimum 3 trips across the bus
    - Measurement (64 KB window, 64 KB I/Os, 2-processor 600 MHz PIII, 9000 B MTU):

      Test                     Throughput (Mb/s)   Tx CPUs   Rx CPUs
      1 GbE, TCP               769                 0.5       1.2
      1 Gb/s RDMA SAN – VIA    891                 0.2       0.2
12. What's In RDMA For Us?
    - Network I/O becomes 'free' (latency remains, though)
    - Example: 2,500 machines spending 30% of their CPU on I/O do the same work as roughly 1,750 machines spending 0% CPU on I/O (2,500 × 70% = 1,750)
13. Approaches to Copy Reduction
    - On-host: special-purpose software and/or hardware, e.g., zero-copy TCP, page flipping
      - Unreliable, idiosyncratic, expensive
    - Memory-to-memory copies, using network protocols to carry placement information
      - Satisfactory experience – Fibre Channel, VIA, ServerNet
    - FOR HARDWARE, not software
14. RDMA over IP Standardization
    - IETF RDDP (Remote Direct Data Placement) WG
      - http://ietf.org/html.charters/rddp-charter.html
    - RDMAC (RDMA Consortium)
      - http://www.rdmaconsortium.org/home
15. RDMA over IP Architecture
    - Two layers:
      - DDP – Direct Data Placement
      - RDMA – control
    - [Diagram: layering, bottom to top – IP, transport, DDP, RDMA control, ULP]
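As a rough illustration of the split between the two layers: DDP carries just enough information for the NIC to place each segment directly into a tagged application buffer, while the RDMAP layer above it names the operation being performed. The sketch below uses the tagged-buffer field layout that the IETF later standardized in RFC 5041 (DDP) and RFC 5040 (RDMAP), so treat it as an after-the-fact illustration rather than what was on the table in 2003; field names are illustrative and the real wire format packs the control bits rather than using a C struct.

```c
#include <stdint.h>

/* DDP: placement information only, i.e. which registered buffer (STag) and
 * where in it (tagged offset), so the adapter can steer the payload without
 * intermediate buffering or copies. */
struct ddp_tagged_header {
    uint8_t  control;   /* T (tagged), L (last segment), DDP version */
    uint8_t  rsvd_ulp;  /* reserved for the upper layer (RDMAP) */
    uint32_t stag;      /* steering tag: names a registered buffer */
    uint64_t to;        /* tagged offset into that buffer */
};

/* RDMAP: the control semantics layered on top of DDP placement (subset). */
enum rdmap_opcode {
    RDMAP_RDMA_WRITE,
    RDMAP_RDMA_READ_REQUEST,
    RDMAP_RDMA_READ_RESPONSE,
    RDMAP_SEND,
    RDMAP_SEND_WITH_INVALIDATE,
    RDMAP_TERMINATE,
};
```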
16. Upper and Lower Layers
    - ULPs: SDP (Sockets Direct Protocol), iSCSI, MPI
    - DAFS is standardized NFSv4 on RDMA
    - SDP provides the SOCK_STREAM API
    - Runs over a reliable transport – TCP, SCTP
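The point of SDP providing the SOCK_STREAM API is that an application written against ordinary stream sockets keeps its code unchanged while the byte stream is carried over an RDMA-capable transport underneath; in later implementations this was typically arranged by interposing on the sockets library or substituting an SDP address family, not by editing the application. The client below is a plain SOCK_STREAM sketch (the address and port are placeholders), not code from the talk.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Ordinary SOCK_STREAM client: exactly the kind of code SDP leaves untouched
 * while moving the stream onto an RDMA-capable transport underneath. */
int main(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) {
        perror("socket");
        return 1;
    }

    struct sockaddr_in peer = {
        .sin_family = AF_INET,
        .sin_port   = htons(7),                       /* echo port, for illustration */
    };
    inet_pton(AF_INET, "192.0.2.1", &peer.sin_addr);  /* example (TEST-NET) address */

    if (connect(fd, (struct sockaddr *)&peer, sizeof(peer)) < 0) {
        perror("connect");
        close(fd);
        return 1;
    }

    const char msg[] = "hello over a stream socket\n";
    write(fd, msg, sizeof(msg) - 1);

    char buf[128];
    ssize_t n = read(fd, buf, sizeof(buf));
    if (n > 0)
        fwrite(buf, 1, (size_t)n, stdout);

    close(fd);
    return 0;
}
```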
17. Open Issues
    - Security
    - TCP in-order processing, framing
    - Atomic operations
    - Ordering constraints – performance vs. predictability
    - Other transports: SCTP, TCP, unreliable transports
    - Impact on network & protocol behavior
    - What is the next performance bottleneck?
    - What new applications?
    - Does it eliminate the need for large MTUs (jumbo frames)?
