Remote Direct Memory Access (RDMA) over IP. PFLDNet 2003, Geneva. Stephen Bailey, Sandburst Corp., steph@sandburst.com; Allyn Romanow, Cisco Systems, allyn@cisco.com

1
Remote Direct Memory Access (RDMA) over IP
PFLDNet 2003, Geneva
Stephen Bailey, Sandburst Corp., steph@sandburst.com
Allyn Romanow, Cisco Systems, allyn@cisco.com
2
RDDP Is Coming Soon
  • ST RDMA Is The Wave Of The Future (S. Bailey,
    C. Good, CERN 1999)
  • Need:
      • standard protocols
      • host software
      • accelerated NICs (RNICs)
      • faster host buses (for > 1G)
  • Vendors are finally serious:
      • Broadcom, Intel, Agilent, Adaptec, Emulex,
        Microsoft, IBM, HP (Compaq, Tandem, DEC), Sun,
        EMC, NetApp, Oracle, Cisco & many, many others

3
Overview
  • Motivation
  • Architecture
  • Open Issues

4
CFP SigComm Workshop
  • NICELI SigComm 03 Workshop
  • Workshop on Network-I/O Convergence: Experience,
    Lessons, Implications
  • http://www.acm.org/sigcomm/sigcomm2003/workshop/niceli/index.html

5
High Speed Data Transfer
  • Bottlenecks
  • Protocol performance
  • Router performance
  • End station performance, host processing
  • CPU Utilization
  • The I/O Bottleneck
  • Interrupts
  • TCP checksum
  • Copies

6
What is RDMA?
  • Avoids copying by allowing the network adapter,
    under control of the application, to steer data
    directly into application buffers
  • Bulk data transfer or kernel bypass for small
    messages
  • Grid, cluster, supercomputing, data centers
  • Historically, special-purpose fabrics: Fibre
    Channel, VIA, InfiniBand, Quadrics, ServerNet

7
Traditional Data Center
[Figure: data center diagram with labels "The World" and "Servers"]
8
Why RDMA over IP? Business Case
  • TCP/IP is not used for high-bandwidth
    interconnection: host processing costs are too high
  • High-bandwidth transfer will become more prevalent:
    10 GE, data centers
  • Special-purpose interfaces are expensive
  • IP NICs are cheap (volume)

9
The Technical Problem: I/O Bottleneck
  • With TCP/IP, host processing can't keep up with
    link bandwidth, on receive
  • Per-byte costs dominate, Clark (89)
  • Well researched by the distributed systems
    community, mid 1990s. Industry experience.
  • Memory bandwidth doesn't scale; processor-memory
    performance gap: Hennessy (97), D. Patterson,
    T. Anderson (97)
  • Stream benchmark

10
Copying
  • Using IP transports (TCP, SCTP) requires data
    copying

[Figure: data copies]
1. NIC -> Packet Buffer
2. Packet Buffer -> User Buffer
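The two receive-side copies labeled on this slide (NIC to packet buffer, then packet buffer to user buffer) can be mimicked in a few lines of Python. This is only an illustrative sketch: the helper names `receive_with_copies` and `receive_direct` are hypothetical, and on a real host these data movements happen in the adapter, driver, and kernel rather than in application code.

```python
def receive_with_copies(wire_data: bytes, user_buffer: bytearray) -> int:
    """Conventional path: NIC -> packet buffer -> user buffer.

    Returns the number of data movements performed.
    """
    movements = 0

    # Copy 1: the NIC deposits the arriving bytes in a packet buffer.
    packet_buffer = bytearray(len(wire_data))
    packet_buffer[:] = wire_data
    movements += 1

    # Copy 2: the stack copies the packet buffer into the user's buffer.
    user_buffer[:len(packet_buffer)] = packet_buffer
    movements += 1

    return movements


def receive_direct(wire_data: bytes, user_buffer: bytearray) -> int:
    """Direct placement: the adapter writes straight into the user buffer."""
    view = memoryview(user_buffer)
    view[:len(wire_data)] = wire_data  # the only data movement
    return 1


if __name__ == "__main__":
    buf = bytearray(16)
    print("conventional:", receive_with_copies(b"payload", buf), "movements")
    print("direct:      ", receive_direct(b"payload", buf), "movement")
```

The point of the sketch is the count, not the mechanism: the intermediate packet buffer is what direct placement removes.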
11
Why Is Copying Important?
  • Heavy resource consumption at high speed
    (1 Gbit/s and up)
  • Uses large % of available CPU
  • Uses large fraction of available bus bandwidth:
    min 3 trips across the bus

Test                     Throughput (Mb/sec)  Tx CPUs  Rx CPUs
1 GbE, TCP               769                  0.5      1.2
1 Gb/s RDMA SAN (VIA)    891                  0.2      0.2
Test setup: 64 KB window, 64 KB I/Os, 2P 600 MHz PIII, 9000 B MTU
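One way to read the measurements on this slide is to normalize CPU usage by throughput, and to turn the "min 3 trips across the bus" remark into a lower bound on bus traffic. The sketch below does that arithmetic using only the figures quoted on the slide. The function names are made up for illustration, and treating every trip as carrying the full payload is an assumption.

```python
# Figures quoted on the slide (throughput in Mb/s, CPU usage in CPUs).
RESULTS = {
    "1 GbE, TCP": {"mbps": 769, "tx_cpus": 0.5, "rx_cpus": 1.2},
    "1 Gb/s RDMA SAN (VIA)": {"mbps": 891, "tx_cpus": 0.2, "rx_cpus": 0.2},
}


def cpus_per_gbps(cpus: float, mbps: float) -> float:
    """CPU cost normalized to 1 Gb/s of delivered throughput."""
    return cpus / (mbps / 1000.0)


def min_bus_traffic_mbps(mbps: float, trips: int = 3) -> float:
    """Lower bound on bus traffic if every byte crosses the bus `trips` times."""
    return mbps * trips


if __name__ == "__main__":
    for name, r in RESULTS.items():
        tx = cpus_per_gbps(r["tx_cpus"], r["mbps"])
        rx = cpus_per_gbps(r["rx_cpus"], r["mbps"])
        print(f"{name}: {tx:.2f} Tx CPUs per Gb/s, {rx:.2f} Rx CPUs per Gb/s")

    tcp, via = RESULTS["1 GbE, TCP"], RESULTS["1 Gb/s RDMA SAN (VIA)"]
    ratio = cpus_per_gbps(tcp["rx_cpus"], tcp["mbps"]) / cpus_per_gbps(
        via["rx_cpus"], via["mbps"]
    )
    print(f"Receive-side CPU cost per Gb/s, TCP vs. VIA: {ratio:.1f}x")
    print(f"Bus traffic at 769 Mb/s with 3 trips: "
          f"{min_bus_traffic_mbps(769):.0f} Mb/s")
```

On these numbers the receive side of the TCP run costs roughly seven times as much CPU per delivered Gb/s as the VIA run.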
12
What's In RDMA For Us?
  • Network I/O becomes free (still have latency
    though)

1750 machines using 0% CPU for I/O
2500 machines using 30% CPU for I/O
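The two machine counts on this slide are consistent with a simple capacity argument: if 30% of each machine's CPU goes to I/O, only 70% does application work, so the same work fits on 70% as many machines once I/O costs nothing. A minimal sketch of that arithmetic follows. The function name is hypothetical, and it assumes the workload is CPU-bound and scales linearly.

```python
def machines_needed(baseline_machines: int, io_cpu_percent: int) -> float:
    """Machines needed for the same application work once I/O uses 0% CPU.

    Assumes each baseline machine spends `io_cpu_percent` of its CPU on I/O
    and the remainder on application work (linear scaling).
    """
    return baseline_machines * (100 - io_cpu_percent) / 100


if __name__ == "__main__":
    print(machines_needed(2500, 30))
```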
13
Approaches to Copy Reduction
  • On-host: special-purpose software and/or
    hardware, e.g., zero-copy TCP, page flipping
      • Unreliable, idiosyncratic, expensive
  • Memory-to-memory copies, using network protocols
    to carry placement information
      • Satisfactory experience: Fibre Channel, VIA,
        ServerNet
      • FOR HARDWARE, not software
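To make "using network protocols to carry placement information" concrete, here is a toy sketch in which each segment names the destination buffer and the offset within it, so the receiver can place the payload without staging it or waiting for in-order arrival. The header layout (a 32-bit buffer tag and a 64-bit offset) and the `Receiver` class are assumptions made for illustration. They are not the wire format of DDP or of any of the fabrics named above.

```python
import struct

# Hypothetical placement header: 32-bit buffer tag, 64-bit byte offset.
HEADER = struct.Struct("!IQ")


def make_segment(tag: int, offset: int, payload: bytes) -> bytes:
    """Prefix the payload with the information needed to place it."""
    return HEADER.pack(tag, offset) + payload


class Receiver:
    """Places payloads directly into buffers registered in advance."""

    def __init__(self) -> None:
        self.registered: dict[int, bytearray] = {}

    def register(self, tag: int, size: int) -> bytearray:
        buf = bytearray(size)
        self.registered[tag] = buf
        return buf

    def place(self, segment: bytes) -> None:
        tag, offset = HEADER.unpack_from(segment)
        payload = segment[HEADER.size:]
        buf = self.registered[tag]  # KeyError if the tag was never registered
        if offset + len(payload) > len(buf):
            raise ValueError("placement outside the registered buffer")
        buf[offset:offset + len(payload)] = payload


if __name__ == "__main__":
    rx = Receiver()
    app_buffer = rx.register(tag=7, size=11)
    # Segments may arrive out of order; placement does not depend on order.
    rx.place(make_segment(7, 5, b" world"))
    rx.place(make_segment(7, 0, b"hello"))
    print(bytes(app_buffer))
```

The bounds check hints at why security appears among the open issues: the receiver is letting a remote peer choose where bytes land in its memory.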

14
RDMA over IP Standardization
  • IETF RDDP: Remote Direct Data Placement WG
      • http://ietf.org/html.charters/rddp-charter.html
  • RDMAC: RDMA Consortium
      • http://www.rdmaconsortium.org/home

15
RDMA over IP Architecture
  • Two layers
      • DDP: Direct Data Placement
      • RDMA: control

16
Upper and Lower Layers
  • ULPs: SDP (Sockets Direct Protocol), iSCSI, MPI
  • DAFS is standardized NFSv4 on RDMA
  • SDP provides SOCK_STREAM API
  • Over reliable transport: TCP, SCTP
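For readers unfamiliar with the SOCK_STREAM API mentioned above, the sketch below shows the kind of byte-stream code an application writes against it. It uses ordinary operating-system sockets from Python's standard library, not SDP. The assumption, taken from the slide, is that SDP would present this same interface so that code like this would not need to change.

```python
import socket


def shout(conn: socket.socket) -> None:
    """Toy stream application: read some bytes, send them back upper-cased."""
    data = conn.recv(1024)
    conn.sendall(data.upper())


def demo() -> bytes:
    # Two connected stream endpoints inside one process.
    client, server = socket.socketpair(type=socket.SOCK_STREAM)
    try:
        client.sendall(b"rdma over ip")
        shout(server)
        return client.recv(1024)
    finally:
        client.close()
        server.close()


if __name__ == "__main__":
    print(demo())
```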

17
Open Issues
  • Security
  • TCP order processing, framing
  • Atomic ops
  • Ordering constraints: performance vs.
    predictability
  • Other transports: SCTP, TCP, unreliable
  • Impact on network protocol behaviors
  • Next performance bottleneck?
  • What new applications?
  • Eliminates the need for large MTU (jumbos)?