ProActive performance evaluation with NAS benchmarks and optimization of OO SPMD

1
ProActive performance evaluation with NAS
benchmarks and optimization of OO SPMD
  • Brian Amedro, Vladimir Bodnartchouk

2
Outline
  • TimIt: a profiling tool for ProActive
  • OO SPMD model in ProActive
  • Performance evaluation with NAS benchmarks
  • Optimizing group communications

3
TimIt: a profiling tool for ProActive
A ProActive feature to time and analyze applications
4
OO SPMD model
  • A parallel programming model
  • Flexibility and high level of abstraction
  • Heavily used in NAS benchmark implementations

Group communication patterns shown: one-to-all, scattering, and reduce operation
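
The three patterns named on this slide can be sketched with a toy simulation. This is a minimal illustration in plain Python, not the ProActive API: the `Worker` and `Group` classes and all their method names are hypothetical, and the "workers" are ordinary local objects rather than remote activities.

```python
from functools import reduce


class Worker:
    """One SPMD member: every worker runs the same code on its own data."""

    def __init__(self, rank):
        self.rank = rank
        self.data = None

    def set_data(self, data):
        self.data = data

    def local_sum(self):
        return sum(self.data)


class Group:
    """Hypothetical stand-in for a group of workers (not the ProActive API)."""

    def __init__(self, size):
        self.workers = [Worker(rank) for rank in range(size)]

    def broadcast(self, data):
        # One to all: every member receives a copy of the same argument.
        for w in self.workers:
            w.set_data(list(data))

    def scatter(self, data):
        # Scattering: member i receives every n-th element starting at i.
        n = len(self.workers)
        for w in self.workers:
            w.set_data(data[w.rank::n])

    def reduce(self, op):
        # Reduce: combine one partial result per member into a single value.
        return reduce(op, (w.local_sum() for w in self.workers))


group = Group(4)

group.broadcast([1, 2, 3])
broadcast_total = group.reduce(lambda a, b: a + b)  # each worker sums 1+2+3

group.scatter(list(range(10)))
scatter_total = group.reduce(lambda a, b: a + b)    # workers sum disjoint slices
```

With the broadcast, every worker holds the full list, so the reduced value counts it four times. With the scatter, the slices are disjoint and the reduced value equals the sum of the original list.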
5
NAS Parallel Benchmarks
  • Designed by NASA to evaluate benefits of high
    performance systems
  • Strongly based on CFD (computational fluid dynamics)
  • 5 benchmarks (kernels) to test different aspects
    of a system
  • Easy to implement thanks to the OO SPMD pattern
  • Tests performed on Sun Java 1.5 with RMI for ProActive
    and PGI 6.0 compiler for MPI

6
CG Kernel (Conjugate Gradient)
  • 12000 calls
  • 570 MB sent
  • 1 min 32
  • 65 comms
  • Floating point operations
  • Eigenvalue computation
  • High number of unstructured communications
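
As a rough illustration of the floating point work in this kernel, here is a minimal conjugate gradient solver in plain Python. It is a sketch under stated assumptions, not the benchmark code: it uses a tiny dense matrix instead of a large sparse one, it only shows the linear solve and not the surrounding eigenvalue estimation, and the names `dot`, `matvec` and `conjugate_gradient` are chosen for this example. In a distributed version, the dot products and the matrix-vector product are the steps that would require communication.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(A, x):
    return [dot(row, x) for row in A]


def conjugate_gradient(A, b, tol=1e-10, max_iter=100):
    """Solve A x = b for a symmetric positive definite matrix A."""
    x = [0.0] * len(b)
    r = list(b)   # residual b - A x, with the initial guess x = 0
    p = list(r)   # first search direction
    rr = dot(r, r)
    for _ in range(max_iter):
        if rr < tol * tol:
            break
        Ap = matvec(A, p)
        alpha = rr / dot(p, Ap)                       # step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rr_new = dot(r, r)
        beta = rr_new / rr                            # keeps directions conjugate
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x


A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x = conjugate_gradient(A, b)   # exact solution is (1/11, 7/11)
```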

7
MG Kernel (Multi Grid)
  • 600 calls
  • 45 MB sent
  • 1 min 32
  • 80 comms
  • Floating point operations
  • Solving Poisson problem
  • Structured communications

8
IS Kernel (Integer Sort)
  • 65 calls
  • 22 MB sent
  • 4 min 32
  • 60 comms
  • Key ranking operations
  • Bucket sort
  • Large arrays in memory
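
Key ranking with a bucket sort can be sketched as follows. This is a sequential illustration only, with a hypothetical function name `rank_keys`. It assumes non-negative integer keys no larger than `max_key`. In a parallel setting, one would typically assign each bucket range to one process and exchange keys between processes, which this sketch does not attempt.

```python
def rank_keys(keys, max_key, num_buckets):
    """Return ranks such that ranks[i] is the position of keys[i] in sorted order."""
    # Bucket width: ceil((max_key + 1) / num_buckets), so every key fits a bucket.
    width = (max_key + num_buckets) // num_buckets
    buckets = [[] for _ in range(num_buckets)]
    for index, key in enumerate(keys):
        buckets[key // width].append(index)

    ranks = [0] * len(keys)
    next_rank = 0
    for bucket in buckets:
        # Buckets cover increasing key ranges, so sorting inside each bucket
        # and walking the buckets in order yields a global ranking.
        for index in sorted(bucket, key=lambda i: keys[i]):
            ranks[index] = next_rank
            next_rank += 1
    return ranks


keys = [5, 3, 9, 0, 3, 7]
ranks = rank_keys(keys, max_key=9, num_buckets=4)
```

Equal keys keep their original relative order, because the indices inside a bucket are appended in input order and Python's `sorted` is stable.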

9
EP Kernel (Embarrassingly Parallel)
  • 6 calls
  • 246 bytes sent
  • 7 min 32
  • 2 comms
  • Random number generation
  • Almost no communications
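
The shape of an embarrassingly parallel computation can be sketched like this: each worker generates its own random numbers independently, and the only data exchanged is a handful of partial results at the end. The sketch below is an assumption-laden illustration, not the benchmark code. The function names, the seeding scheme and the use of the polar method to turn uniform pairs into Gaussian deviates are choices made for this example, and the workers run one after another in a single process.

```python
import math
import random


def worker(rank, pairs_per_worker, seed=12345):
    """Generate Gaussian pairs independently; return only three numbers."""
    rng = random.Random(seed + rank)   # per-worker stream (illustrative seeding)
    accepted = 0
    sum_x = sum_y = 0.0
    for _ in range(pairs_per_worker):
        u = 2.0 * rng.random() - 1.0
        v = 2.0 * rng.random() - 1.0
        t = u * u + v * v
        if 0.0 < t <= 1.0:             # keep points inside the unit circle
            f = math.sqrt(-2.0 * math.log(t) / t)
            sum_x += u * f
            sum_y += v * f
            accepted += 1
    return accepted, sum_x, sum_y


def run(num_workers, pairs_per_worker):
    # The final element-wise sum is the only "communication" in this sketch.
    partials = [worker(rank, pairs_per_worker) for rank in range(num_workers)]
    return tuple(sum(column) for column in zip(*partials))


total_accepted, total_x, total_y = run(num_workers=4, pairs_per_worker=1000)
```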

10
FT Kernel (Fourier Transform)
  • 22 calls
  • 180 MB sent
  • 1 min 32
  • 40 comms
  • Floating point operations
  • Big messages: 8 MB per call

11
Optimizing group communications
  • Implement efficient group communication
  • Minimize the TCP traffic
  • Decrease the network congestion

Use clustering techniques to choose the best algorithm
12
Ring all-to-all algorithm
  • Best for large size communications
  • Takes n-1 steps

[Figure: ring all-to-all exchange among four nodes A, B, C and D, shown over steps 1 to 3]
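
The n-1 step count can be checked with a small simulation. This is a sketch under the assumption that the exchange is one where every node ends up holding every other node's block, and that each node forwards to its right-hand neighbour the block it received in the previous step. The function name `ring_all_to_all` is chosen for this example.

```python
def ring_all_to_all(blocks):
    """Simulate a ring exchange; node i starts with blocks[i]."""
    n = len(blocks)
    received = [{rank: blocks[rank]} for rank in range(n)]
    # What each node will send in the next step: first its own block.
    in_flight = [(rank, blocks[rank]) for rank in range(n)]
    steps = 0
    for _ in range(n - 1):
        # Node i sends to node (i + 1) % n, so node i receives from (i - 1) % n.
        arrived = [in_flight[(rank - 1) % n] for rank in range(n)]
        for rank, (origin, block) in enumerate(arrived):
            received[rank][origin] = block
        in_flight = arrived   # forward the block just received
        steps += 1
    return received, steps


received, steps = ring_all_to_all(["A", "B", "C", "D"])
```

Each node sends exactly one block per step and only talks to its neighbours, which is consistent with the slide's point that the ring suits large messages.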
13
Recursive doubling all-to-all algorithm
  • Best for small size communications
  • Takes log2(n) steps

[Figure: recursive doubling all-to-all exchange among four nodes A, B, C and D, shown over steps 1 and 2]
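
The logarithmic step count can be checked with a small simulation. This sketch assumes a power-of-two number of nodes and an exchange where every node ends up holding every block. The function name `recursive_doubling_all_to_all` is chosen for this example.

```python
def recursive_doubling_all_to_all(blocks):
    """Simulate recursive doubling; node i starts with blocks[i]."""
    n = len(blocks)
    if n & (n - 1):
        raise ValueError("this sketch assumes a power-of-two number of nodes")
    known = [{rank: blocks[rank]} for rank in range(n)]
    steps = 0
    distance = 1
    while distance < n:
        # Take a snapshot so all pairwise exchanges of a step happen "at once".
        snapshot = [dict(k) for k in known]
        for rank in range(n):
            partner = rank ^ distance   # partner differs in one bit of the rank
            known[rank].update(snapshot[partner])
        distance *= 2
        steps += 1
    return known, steps


known, steps = recursive_doubling_all_to_all(["A", "B", "C", "D"])
```

In this scheme the amount of data each node exchanges doubles at every step, so the saving comes from the smaller number of steps rather than from smaller messages.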
14
Conclusion
  • TimIt: an easy and helpful profiling tool
  • NAS benchmarks are easy to implement with ProActive
    and the OO SPMD pattern:
    http://www-sop.inria.fr/oasis/proactive/nas
  • Good performance expected with the upcoming Sun Java 6
    and the use of Ibis RMI

15
Questions?
16
MPI / ProActive
  MPI                       ProActive
  mpirun                    deployment
  MPI_Init, MPI_Finalize    activities creation
  MPI_Comm_size             getMyGroupSize
  MPI_Comm_rank             getMyRank
  MPI_Send, MPI_Recv        method call (setter and getter)
  MPI_Barrier               barrier
  MPI_Bcast                 method call
  MPI_Scatter               method call with a scatter group as parameter
  MPI_Gather                result of a group communication
  MPI_Reduce                programmer's method