Efficient Communication Libraries on Clusters for Java Codes

1
Towards High Performance Cluster Communication
in Java: The Java Fast Sockets Approach
Guillermo L. Taboada
taboada_at_udc.es
ACET Seminars, Autumn 2006, University of Reading
2
Outline
  • Introduction
  • Designing a High Performance Java Socket Solution
  • Implementing Efficient Java Sockets on Clusters
  • Performance Evaluation
  • Conclusions

3
Introduction
Introduction Design
Implementation Evaluation
Conclusions
  • Growing interest in clusters (more computing power at lower cost)
  • A growing solution
  • Java (and HPC Java) on clusters
  • Challenge: scalable performance for clusters + Java
  • Network performance is scalable
  • Java middleware is less efficient than native
    code
  • => Java performance is not going to scale
  • High performance networks are not supported, or
    are supported with poor performance
  • Ways of support
  • IP emulations
  • High Performance Sockets

Interconnection Network (SCI, GbE, Myrinet, IB)
4
Introduction
  • Interconnection networks
  • Play (with their associated software libraries) a
    key role in high performance clustering
    technology
  • Different technologies
  • Gb and 10Gb Ethernet
  • Myrinet, Myrinet 2k, Myri-10G (10Gb Myrinet and
    10GbE)
  • Scalable Coherent Interface (SCI)
  • InfiniBand
  • Qsnet, Giganet, Quadrics, GSN - HIPPI
  • Low hardware latencies (1.3-30 us)
  • High bandwidths (> 1 Gbps)

5
Introduction
  • SCI (Scalable Coherent Interface)
  • IEEE standard 1596-1992
  • Implemented as a PCI(-X) NIC
  • High performance
  • Latency: 1.42 us (theoretical)
  • Bandwidth: 5333 Mbps (bi-directional)
  • Point-to-point topologies: 1D (ring) / 2D (2D torus) / 3D
  • Usually without a switch (small clusters)

6
Introduction
  • SCI cluster example (2D torus 4x4)
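
To make the 4x4 torus concrete, the following is a minimal sketch of how each node's four point-to-point neighbours wrap around the edges of the grid. The class name `Torus2D` and the row-major node numbering are assumptions made for illustration; the slide does not say how the nodes of the real cluster were numbered or cabled.

```java
import java.util.Arrays;

/** Illustrative model of a 2D torus; node ids are row-major (an assumption). */
class Torus2D {
    final int rows;
    final int cols;

    Torus2D(int rows, int cols) {
        this.rows = rows;
        this.cols = cols;
    }

    /** Maps a (row, column) pair to a node id, wrapping around both dimensions. */
    int id(int row, int col) {
        return Math.floorMod(row, rows) * cols + Math.floorMod(col, cols);
    }

    /** Returns the neighbours of a node in the order up, down, left, right. */
    int[] neighbors(int node) {
        int r = node / cols;
        int c = node % cols;
        return new int[] { id(r - 1, c), id(r + 1, c), id(r, c - 1), id(r, c + 1) };
    }

    public static void main(String[] args) {
        Torus2D torus = new Torus2D(4, 4);
        // Corner node 0 still has four neighbours because every ring wraps around.
        System.out.println(Arrays.toString(torus.neighbors(0))); // prints [12, 4, 3, 1]
    }
}
```

Every node has exactly four links, which is why such small clusters can be built without a switch.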

7
Introduction
  • Communication libraries on SCI
  • SCI IP (IP emulation) (ScaIP)
  • SISCI (Software Infrastructure for SCI)
  • SMI (Shared Memory Interface)
  • SCI-MPICH, ScaMPI: MPI implementations
  • SCI-SOCKET: high performance sockets
    implementation

8
Introduction
  • Myrinet
  • Most popular technology for high-end clusters
  • Delivers high performance
  • Latency: 1.3 us (theoretical)
  • Bandwidth: 512, 1280, 2000 and 10k Mbps
  • Highly scalable (large, efficient switching technology)
  • Lots of developments
  • Communication libraries
  • Low level: GM, MX
  • Message-passing: MPICH-GM, MPICH-MX
  • Sockets: Sockets-GM, Sockets-MX

9
Introduction
http://www.myri.com/myrinet/performance/Sockets-MX/socketsmx-concept.png
10
Introduction
  • Java communication on clusters
  • Java's portability means that, in networking, the
    JDK supports only the widespread TCP/IP
    stack
  • IP emulations can be used, but with performance issues
  • SCIP, ScaIP, IPoGM, IPoMX, IPoIB
  • Emerging high performance sockets implementations
    for cluster interconnects
  • SCI-SOCKET
  • Sockets-MX, Sockets-GM (Myrinet)
  • Sockets Direct Protocol (InfiniBand)
  • Sockets over VIA (SOVIA)

11
Introduction
  • Java communications on high performance cluster
    interconnects
  • Myrinet
  • KaRMI/GM (JavaParty, Univ. Karlsruhe)
  • Manta/LFC/Panda/Ibis (Vrije Universiteit, the Netherlands)
  • RMIX Myrinet
  • mpiJava over MPICH-GM/MPICH-MX
  • SCI
  • Still waiting
  • My research motivation is to fill the efficiency
    gap between Java and high-speed interconnects.
  • Getting the most out of the capabilities of the
    interconnects in Java. This could be done by
    supporting High Performance Sockets libraries in
    Java.

12
Introduction
  • Previous work
  • Non-blocking communication support
  • Java NIO (New I/O)
  • Improves scalability; essential in client/server
    applications
  • Message-passing Java
  • mpiJava, a wrapper to a native MPI implementation
    that supports non-blocking communications
  • MPJ Express, a Java message-passing system with an NIO
    device
  • MPJ/Ibis, a Java message-passing system with
    non-blocking support through multi-threading
  • High performance network support
  • Almost entirely centered on Myrinet
  • Solutions based on protocols designed ad hoc,
    poorly maintained and with numerous layers
  • Numerous libraries => higher communication overhead

13
Introduction
JAVA FAST SOCKETS
14
Introduction
  • Solution: Java Fast Sockets (JFS)
  • First high performance Java Sockets implementation
  • Support for high performance network libraries
  • Through native libraries on SCI, MX native
    sockets
  • Implements a widespread API (Java Sockets)
    with higher performance than RMI
  • Avoids the use of IP emulations (a less efficient
    protocol, designed for error-prone environments, with
    several layers)
  • Numerous libraries => higher communication overhead

15
Implementing Efficient Java Communication
Libraries on Clusters
  • Java Fast Sockets (JFS) implements the Java Sockets
    API in a way that is
  • Efficient and portable, through
  • A general pure Java solution
  • Specific solutions that access native
    communication libraries (SCI Sockets)
  • A fail-over approach to the selection
    of libraries: the system tries to use highly
    efficient native communication libraries. If this
    is not possible, it uses the general pure Java
    solution
  • User transparency
  • JFSFactory is set as the default sockets factory
    in a small launcher application with
    Socket.setSocketImplFactory().
  • This application then invokes the main method of
    the target application using reflection. All
    sockets communication will use JFS from then on.
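
The launcher idea can be sketched as follows. This is an illustration under stated assumptions, not the JFS launcher itself: the class name `SocketFactoryLauncher` and its two helper methods are hypothetical, and the factory is loaded by name so that the sketch compiles without JFS on the classpath. Only `Socket.setSocketImplFactory()` and the reflection calls are standard JDK API (the former is deprecated in recent JDK releases but still present).

```java
import java.io.IOException;
import java.lang.reflect.Method;
import java.net.Socket;
import java.net.SocketImplFactory;
import java.util.Arrays;

/** Hypothetical launcher: installs a socket factory, then runs the real application. */
class SocketFactoryLauncher {

    /**
     * Tries to install the named SocketImplFactory for all Socket objects created later.
     * Returns false, leaving the JDK default sockets in place, if the class cannot be
     * loaded or installed. This mirrors the fail-over idea described above.
     */
    static boolean installFactory(String factoryClassName) {
        try {
            Object factory = Class.forName(factoryClassName)
                                  .getDeclaredConstructor()
                                  .newInstance();
            // The JDK allows this call only once per JVM; a second call throws SocketException.
            Socket.setSocketImplFactory((SocketImplFactory) factory);
            return true;
        } catch (ReflectiveOperationException | ClassCastException | IOException e) {
            return false;
        }
    }

    /** Invokes the public static main(String[]) of the given class through reflection. */
    static void runMain(String mainClassName, String[] args) throws Exception {
        Method main = Class.forName(mainClassName).getMethod("main", String[].class);
        main.invoke(null, (Object) args);
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.err.println("usage: SocketFactoryLauncher <factoryClass> <mainClass> [args...]");
            return;
        }
        boolean installed = installFactory(args[0]);
        System.err.println(installed ? "custom socket factory installed"
                                     : "falling back to default sockets");
        runMain(args[1], Arrays.copyOfRange(args, 2, args.length));
    }
}
```

Because the factory is installed before the application's `main` runs, the application itself needs no source changes, which is the user transparency the slide refers to.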

16
Implementing Efficient Java Communication
Libraries on Clusters
  • Sun's Java Sockets implementation (Sun's JRE)
  • Only supports the TCP/IP stack communication
    library
  • Performs unnecessary copies
  • Does not implement the communication optimization
    methods (setPerformancePreferences() method) related to
  • Latency reduction
  • Maximizing bandwidth
  • The use of Java NIO sockets is more complex
  • Use of socket channels, selectors, buffers, etc.
  • Establishment of connections
  • Requires redesigning the communications of existing
    socket-based applications
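
In the JDK, the hook for stating such preferences is `java.net.Socket.setPerformancePreferences(connectionTime, latency, bandwidth)`. The sketch below only shows how an application would express a preference for low latency; the class and method names are made up for illustration, and whether the preference has any effect depends entirely on the sockets implementation underneath.

```java
import java.net.Socket;

/** Illustrative only: states a low-latency preference on an unconnected socket. */
class PerformancePreferencesSketch {

    static Socket lowLatencySocket() {
        Socket socket = new Socket(); // unconnected; no network access happens here
        // Relative weights for connection time, latency and bandwidth:
        // latency matters most, then bandwidth, connection time least.
        socket.setPerformancePreferences(0, 2, 1);
        return socket;
    }

    public static void main(String[] args) throws Exception {
        try (Socket socket = lowLatencySocket()) {
            System.out.println("connected: " + socket.isConnected()); // prints connected: false
        }
    }
}
```

The values are only compared with each other, so `(0, 2, 1)` and `(0, 20, 10)` express the same preference.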

17
Implementing Efficient Java Communication
Libraries on Clusters
  • Basic Java IO

[Class diagram: ObjectOutputStream <<uses>> OutputStream; classes shown: SocketOutputStream, JFSOutputStream, ByteArrayOutputStream]
  • JFSOutputStream: avoids extra copies; implements new IO
    functionality (writing arrays of primitive types, NIO
    native copies)
  • SocketOutputStream: writes to a native socket; performs
    several copies (JNI Get<Type>ArrayRegion). Solution: avoid
    the extra copies by using JNI GetPrimitiveArrayCritical
  • ByteArrayOutputStream: buffers data for sending; beneficial
    when sending long messages, or when small messages have a
    high cost
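
The cost of the default stream stack can be illustrated with plain JDK classes. The sketch below compares sending an `int[]` through `ObjectOutputStream` over a `ByteArrayOutputStream` with a single bulk copy of the raw values. It is not JFS code: the class and method names are hypothetical, and the bulk copy merely stands in for the idea of writing arrays of primitive types directly.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.nio.ByteBuffer;

/** Illustrative comparison of two ways to turn an int[] into bytes for sending. */
class SerializationOverheadSketch {

    /** Default path: serialization adds a stream header and a class descriptor. */
    static byte[] viaObjectStream(int[] data) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(data);
        }
        return bytes.toByteArray(); // toByteArray() copies the buffered bytes once more
    }

    /** Direct path: one bulk copy of the raw values, 4 bytes per element. */
    static byte[] viaBulkCopy(int[] data) {
        ByteBuffer buffer = ByteBuffer.allocate(data.length * Integer.BYTES);
        buffer.asIntBuffer().put(data);
        return buffer.array();
    }

    public static void main(String[] args) throws IOException {
        int[] data = new int[256];
        System.out.println("object stream: " + viaObjectStream(data).length + " bytes");
        System.out.println("bulk copy:     " + viaBulkCopy(data).length + " bytes"); // 1024 bytes
    }
}
```

The extra bytes are a fixed cost per message, so they matter most for the short messages that dominate latency measurements.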
18
Implementing Efficient Java Communication
Libraries on Clusters
  • Default scenario in Sun's Java Sockets
    communication

[Diagram, sender and receiver sides: each JAVA VIRTUAL MACHINE holds the data to send/receive in its HEAP / GARBAGE COLLECTABLE AREA, together with the byte buffers "data" and "buf"; the diagram also shows the char buffer "JVM_buffer" and the NATIVE SOCKETS IMPLEMENTATION with its char buffer "driver_buffer"; NET connects the two sides. Legend: DESERIALIZATION, COPY.]
Implementing Efficient Java Communication
Libraries on Clusters
  • JFS communication using Java NIO direct
    ByteBuffer

[Diagram, sender and receiver sides: same elements as the default scenario (data to send/receive, byte buffers "data" and "buf" in the HEAP / GARBAGE COLLECTABLE AREA, char buffer "JVM_buffer", NATIVE SOCKETS IMPLEMENTATION with char buffer "driver_buffer", NET), plus a direct ByteBuffer on each side. Legend: DESERIALIZATION, COPY.]
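
A direct ByteBuffer lives outside the garbage-collected heap, so the channel layer can hand it to native code without first copying it into an intermediate buffer. The sketch below shows the pattern with standard NIO classes only. It is an assumption-laden stand-in: `java.nio.channels.Pipe` replaces the network so the example runs in one process, and the class and method names are hypothetical rather than taken from JFS.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.channels.Pipe;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.WritableByteChannel;
import java.util.Arrays;

/** Illustrative send/receive of an int[] through direct (off-heap) buffers. */
class DirectBufferSketch {

    static void send(WritableByteChannel out, int[] data) throws IOException {
        ByteBuffer direct = ByteBuffer.allocateDirect(data.length * Integer.BYTES)
                                      .order(ByteOrder.nativeOrder());
        direct.asIntBuffer().put(data); // one bulk copy from the heap into native memory
        // The view has its own position, so 'direct' still spans the whole payload.
        while (direct.hasRemaining()) {
            out.write(direct);
        }
    }

    static int[] receive(ReadableByteChannel in, int count) throws IOException {
        ByteBuffer direct = ByteBuffer.allocateDirect(count * Integer.BYTES)
                                      .order(ByteOrder.nativeOrder());
        while (direct.hasRemaining()) {
            if (in.read(direct) < 0) {
                throw new EOFException("channel closed before the full message arrived");
            }
        }
        direct.flip();
        int[] data = new int[count];
        direct.asIntBuffer().get(data); // one bulk copy from native memory into the heap
        return data;
    }

    public static void main(String[] args) throws IOException {
        Pipe pipe = Pipe.open(); // in-process stand-in for a socket connection
        send(pipe.sink(), new int[] {1, 2, 3, 4});
        System.out.println(Arrays.toString(receive(pipe.source(), 4))); // prints [1, 2, 3, 4]
        pipe.sink().close();
        pipe.source().close();
    }
}
```

Both sides use the native byte order here, which is only safe when sender and receiver run on the same architecture, as on a homogeneous cluster.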
20
Implementing Efficient Java Communication
Libraries on Clusters
  • JFS communication optimized (zero-copy)

[Diagram, sender and receiver sides: each JAVA VIRTUAL MACHINE holds the data to send/receive in its HEAP / GARBAGE COLLECTABLE AREA and uses a direct ByteBuffer; the NATIVE SOCKETS IMPLEMENTATION keeps its char buffer "driver_buffer"; NET connects the two sides. The intermediate buffers "data", "buf" and "JVM_buffer" no longer appear. Legend: DESERIALIZATION, COPY.]
21
Implementing Efficient Java Communication
Libraries on Clusters
  • SCI issues
  • Only IPv4 is supported, and Java defaults to IPv6
  • Myrinet issues
  • Some calls to Sockets-MX are faulty
  • Ethernet issues
  • Different protocol boundaries. Uses the general
    solution, optimized only by reducing the number of
    copies
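
For the IPv4-only restriction noted under the SCI issues, the standard JDK switch is the `java.net.preferIPv4Stack` system property. The sketch below sets it programmatically; the class name is hypothetical, and the slides do not say whether JFS uses this property or handles the issue some other way.

```java
/** Illustrative only: asks the JVM to use the IPv4 stack for its sockets. */
class PreferIpv4Sketch {

    /**
     * Must run before any networking class is initialised, because the JDK reads
     * the property once. Equivalent to passing -Djava.net.preferIPv4Stack=true.
     */
    static void preferIpv4() {
        System.setProperty("java.net.preferIPv4Stack", "true");
    }

    public static void main(String[] args) {
        preferIpv4();
        System.out.println(System.getProperty("java.net.preferIPv4Stack")); // prints true
    }
}
```

Setting the property on the command line is the more robust option, since it cannot be pre-empted by a class that touches the network during startup.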

22
Performance Evaluation
  • Experimental configuration
  • PIV Xeon at 2.8 GHz, 2 GB memory (hyperthreading
    disabled)
  • SCI (Dolphin), GbE (Marvell 88E8050), Myrinet
    2000
  • Java: Sun JVM 1.5.0_05
  • gcc 3.4.4
  • Libraries
  • mpiJava 1.2.5 over MPICH 1.2.5
  • SCI SOCKET 3.0.3
  • DIS 3.0.3 (IRM/SISCI/SCILib/Mbox)
  • Linux: CentOS 4, kernel 2.6.9

23-34
Performance Evaluation
[No text content was preserved for slides 23 to 34.]
35
Conclusions
  • Java Fast Sockets (JFS), a high performance Java
    Sockets implementation, supports System Area
    Networks through native code, delivering high
    performance to pure Java libraries and
    applications
  • Java Sockets communications on clusters can
    significantly improve their performance by
    using this library.
  • Latency: up to 84% reduction
  • Throughput: up to 120% increase
