A Hybrid MPI Design using SCTP and iWARP

Transcript and Presenter's Notes

1
A Hybrid MPI Design using SCTP and iWARP
Mike Tsai, Brad Penoff, and Alan Wagner
Department of Computer Science
University of British Columbia
Vancouver, Canada
Distributed Systems Group
April 14, 2008
2
A Hybrid Message Passing Interface Design using
the Stream Control Transmission Protocol and the
Internet Wide Area Remote Direct Memory Access
Protocol
Mike Tsai, Brad Penoff, and Alan Wagner
Department of Computer Science
University of British Columbia
Vancouver, Canada
Distributed Systems Group
April 14, 2008
3
Research Background
  • SCTP: Stream Control Transmission Protocol
  • IETF standardized transport protocol for IP
  • Can be used anywhere TCP or UDP is used
  • Additional features
  • SCTP and MPI middleware
  • LAM (unreleased)
  • MPICH2 (1.0.5 and later): ch3sctp
  • Open MPI SCTP BTL (in v1.3 trunk)

4
State-of-the-Art Networking
  • Hardware acceleration techniques for IP
  • Protocol offload
  • OS bypass
  • Zero copy
  • RDMA
  • 10 GigE
  • How would these look for SCTP?
  • Are there benefits here for using SCTP?

5
Story/motivation
  • iWARP - Internet Wide Area RDMA protocol
  • IETF standard for RDMA over IP
  • Use RDMA, point-to-point, or a mix?
  • "Why Compromise?" (G. Shainer at HPCWire.com)
  • Depending on the application, use whichever is
    best.
  • For MPI middleware, who decides what's best?

The programmer!
6
Contribution
  • Hybrid MPI with functional decomposition lets the
    programmer decide
  • Let RMA use RDMA
  • Let other communications use point-to-point
  • Explore SCTP's use within iWARP
  • Extended the OSC user-space software iWARP, making
    many internal OSC changes
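
The functional decomposition above can be pictured as a small dispatch rule. This is only an illustrative sketch, not the ch3hybrid code: the names `Path`, `RMA_CALLS`, and `choose_path` are hypothetical, and only the two RMA calls named elsewhere in these slides (MPI_Put, MPI_Get) are listed.

```python
from enum import Enum


class Path(Enum):
    RDMA = "rdma"            # hypothetical label for the iWARP path
    POINT_TO_POINT = "p2p"   # hypothetical label for the SCTP send/receive path


# One-sided (RMA) calls mentioned in the slides; other RMA calls are omitted.
RMA_CALLS = {"MPI_Put", "MPI_Get"}


def choose_path(mpi_call: str) -> Path:
    """Pick a transport path from the kind of MPI call the programmer made."""
    return Path.RDMA if mpi_call in RMA_CALLS else Path.POINT_TO_POINT
```

Under this sketch, `choose_path("MPI_Put")` selects the RDMA path and `choose_path("MPI_Send")` selects point-to-point, so the choice follows from which MPI calls the programmer writes.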

9
SCTP is a better LLP
  • The LLP's needs are built into SCTP
  • Reliable, message-based
  • CRC32c checksum
  • Out-of-order support
  • MSG_UNORDERED
  • Multistreaming
  • Multihoming
  • An unmodified stack supports
  • Path failover
  • Multirail data striping
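
As a concrete reference for the CRC32c item, here is a minimal bitwise sketch of the CRC32c (Castagnoli) function. It is deliberately slow and shows only the checksum function itself; how SCTP applies it to a packet (for example, zeroing the checksum field before computing) is not shown, and the function name `crc32c` is chosen here for illustration.

```python
def crc32c(data: bytes) -> int:
    """Bitwise CRC32c using the reflected Castagnoli polynomial 0x82F63B78."""
    crc = 0xFFFFFFFF                      # standard initial value
    for byte in data:
        crc ^= byte
        for _ in range(8):                # process one bit at a time
            if crc & 1:
                crc = (crc >> 1) ^ 0x82F63B78
            else:
                crc >>= 1
    return crc ^ 0xFFFFFFFF               # standard final XOR
```

The standard check value applies: `crc32c(b"123456789")` yields `0xE3069283`. Production code would normally use a table-driven or hardware-assisted version instead.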

10
In the beginning, there was ch3sctp
11
OSC iWARP was modified and incorporated as a
thread.
12
RMA done by modified OSC iWARP
13
OSC iWARP changes to support MPI
  • Running in a thread
  • Use SCTP
  • Making all OSC ops non-blocking
  • Locks around shared data
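
The combination listed above (a separate thread, non-blocking operations, locks around shared data) follows a common pattern that can be sketched as below. This is a generic illustration, not the modified OSC iWARP code: `ProgressThread` and its methods are hypothetical names, and the "operation" is reduced to recording a string.

```python
import queue
import threading


class ProgressThread:
    """Hypothetical stand-in for a communication layer running in its own thread."""

    def __init__(self):
        self._work = queue.Queue()         # operations posted by the caller
        self._lock = threading.Lock()      # guards _completed (shared data)
        self._completed = []
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def post(self, op_name):
        """Non-blocking: enqueue the operation and return a completion handle."""
        done = threading.Event()
        self._work.put((op_name, done))
        return done

    def _run(self):
        while True:
            item = self._work.get()
            if item is None:               # sentinel posted by shutdown()
                break
            op_name, done = item
            with self._lock:               # shared data is touched only under the lock
                self._completed.append(op_name)
            done.set()

    def completed(self):
        with self._lock:
            return list(self._completed)

    def shutdown(self):
        self._work.put(None)
        self._thread.join()
```

A caller would post operations, continue with other work, and later call `wait()` on the returned handles, which mirrors the idea of making every operation non-blocking.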

14
Connection Management Design
  • Connection establishment
  • Separate one-to-many socket for new QPs
  • SCTP peeloff feature
  • New QP sends request from one-to-many socket
  • Once the request and ACK are received, the QP socket is peeled off
  • For conflicts, MPI rank resolves who sends ACK
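
The conflict case arises when two processes each send a connection request to the other at the same time. A rank-based tie-break can be sketched as follows. The slides say only that the MPI rank decides who sends the ACK; the direction used here (the lower rank ACKs) is an assumption for illustration, and `sends_ack` is a hypothetical name.

```python
def sends_ack(my_rank: int, peer_rank: int) -> bool:
    """Return True if this process should ACK the peer's crossing request.

    Assumption: the lower rank sends the ACK. The source states only that
    the MPI rank resolves the conflict, not which side wins.
    """
    if my_rank == peer_rank:
        raise ValueError("ranks must differ to break the tie")
    return my_rank < peer_rank
```

Because both sides evaluate the same rule with the arguments swapped, exactly one of them sends the ACK, after which the association can be peeled off onto its own socket (the SCTP sockets API provides `sctp_peeloff()` for this).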

16
Performance
  • What we tested
  • Compared our new ch3hybrid to the original
    ch3sctp
  • Two 3.2 GHz Intel boxes (GigE switch)
  • OSU latency tests (MPI_Put, MPI_Get)
  • Homemade synthetic benchmark
  • Combination of RMA and MPI-1 calls

17
OSU One-sided Latency Tests
  • ch3hybrid adds 2-8 overhead

18
Synthetic Application
  • ch3hybrid was faster than ch3sctp
  • 3.8 seconds vs. 4.5 seconds
  • Extra thread helps in some cases
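
To put the two timings above in relative terms, a short calculation helps. The helper names below are hypothetical; the only inputs are the 3.8 s and 4.5 s figures from the slide.

```python
def speedup(baseline_s: float, new_s: float) -> float:
    """How many times faster the new run is than the baseline."""
    return baseline_s / new_s


def time_saved_fraction(baseline_s: float, new_s: float) -> float:
    """Fraction of the baseline run time that the new run saves."""
    return (baseline_s - new_s) / baseline_s


ch3sctp_s, ch3hybrid_s = 4.5, 3.8
print(round(speedup(ch3sctp_s, ch3hybrid_s), 2))              # 1.18
print(round(time_saved_fraction(ch3sctp_s, ch3hybrid_s), 3))  # 0.156
```

So on this synthetic benchmark ch3hybrid is roughly 1.18 times faster, or about 16% less run time than ch3sctp.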

19
Conclusions
  • RDMA versus point-to-point for MPI
  • Why choose?
  • Functional decomposition lets programmer decide
  • SCTP is a good match for iWARP
  • Implementation of iWARP using SCTP shown.
  • SCTP has its place in the state-of-the-art.
  • It'd be more exciting to have SCTP-based devices

20
Thank you!
Google "sctp mpi" for more information about our
work.
21
Connection Management Design