Title: ProActive performance evaluation with NAS benchmarks and optimization of OO SPMD
1. ProActive performance evaluation with NAS benchmarks and optimization of OO SPMD
- Brian Amedro, Vladimir Bodnartchouk
2. Outline
- TimIt: a profiling tool for ProActive
- OO SPMD model in ProActive
- Performance evaluation with NAS benchmarks
- Optimizing group communications
3. TimIt: a profiling tool for ProActive
A ProActive feature for timing and analyzing applications
4. OO SPMD model
- A parallel programming model
- Flexibility and a high level of abstraction
- Heavily used in the NAS benchmark implementations
- Group communication patterns shown: one-to-all, scattering, reduce operation
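The scattering and reduce patterns named on this slide can be illustrated with a minimal sketch. It is plain Java with threads standing in for the members of an SPMD group; it is not ProActive code, and `SpmdSketch`, `work` and `scatterAndReduce` are hypothetical names chosen for illustration. `rank` and `groupSize` play the role of what the MPI / ProActive slide lists as getMyRank and getMyGroupSize.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class SpmdSketch {
    /** Same code for every rank (SPMD): sum the slice of data assigned to this rank. */
    static long work(int rank, int groupSize, int[] data) {
        int chunk = (data.length + groupSize - 1) / groupSize; // ceiling division
        int from = Math.min(rank * chunk, data.length);
        int to = Math.min(from + chunk, data.length);
        long partial = 0;
        for (int i = from; i < to; i++) {
            partial += data[i];
        }
        return partial;
    }

    /** Scatter: one slice per rank. Reduce: add the partial sums together. */
    static long scatterAndReduce(int[] data, int groupSize) {
        ExecutorService pool = Executors.newFixedThreadPool(groupSize);
        try {
            List<Future<Long>> partials = new ArrayList<>();
            for (int rank = 0; rank < groupSize; rank++) {
                final int r = rank;
                Callable<Long> task = () -> work(r, groupSize, data);
                partials.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> partial : partials) {
                total += partial.get(); // waits for that rank to finish
            }
            return total;
        } catch (Exception e) {
            throw new IllegalStateException("a rank failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
        System.out.println(scatterAndReduce(data, 4));
    }
}
```

In this sketch the reduction is done by the caller once every rank has returned; how a real group communication layer collects and combines results is outside its scope.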
5. NAS Parallel Benchmarks
- Designed by NASA to evaluate the benefits of high-performance systems
- Strongly based on CFD (computational fluid dynamics)
- 5 benchmarks (kernels) that test different aspects of a system
- Easy to implement thanks to the OO SPMD pattern
- Tests performed with Sun Java 1.5 over RMI for ProActive, and the PGI 6.0 compiler for MPI
6. CG Kernel (Conjugate Gradient)
- 12000 calls
- 570 MB sent
- 1 min 32
- 65 comms
- Floating point operations
- Eigenvalue computation
- High number of unstructured communications
7. MG Kernel (Multigrid)
- 600 calls
- 45 MB sent
- 1 min 32
- 80 comms
- Floating point operations
- Solving a Poisson problem
- Structured communications
8. IS Kernel (Integer Sort)
- 65 calls
- 22 MB sent
- 4 min 32
- 60 comms
- Key ranking operations
- Bucket sort
- Large arrays in memory
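To make "key ranking" concrete, here is a minimal single-process sketch, assuming integer keys in the range [0, maxKey). It uses a counting variant with one bucket per key value; the distributed bucket exchange between processes that the benchmark performs is not shown, and `KeyRanking` is a hypothetical name, not taken from the benchmark sources.

```java
import java.util.Arrays;

final class KeyRanking {
    /**
     * Returns rank[i], the position keys[i] would take in sorted order.
     * Equal keys keep their original relative order (stable ranking).
     */
    static int[] rank(int[] keys, int maxKey) {
        int[] count = new int[maxKey + 1];
        for (int key : keys) {
            count[key + 1]++; // bucket sizes, shifted by one slot
        }
        for (int b = 0; b < maxKey; b++) {
            count[b + 1] += count[b]; // count[k] is now the number of keys smaller than k
        }
        int[] rank = new int[keys.length];
        for (int i = 0; i < keys.length; i++) {
            rank[i] = count[keys[i]]++; // next free position inside this key's bucket
        }
        return rank;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(rank(new int[] {3, 1, 2, 1}, 4)));
    }
}
```

The `count` array grows with the key range, which is one way to see why this kernel keeps large arrays in memory.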
9. EP Kernel (Embarrassingly Parallel)
- 6 calls
- 246 bytes sent
- 7 min 32
- 2 comms
- Random number generation
- Almost no communication
10. FT Kernel (Fourier Transform)
- 22 calls
- 180 MB sent
- 1 min 32
- 40 comms
- Floating point operations
- Large messages: 8 MB per call
11. Optimizing group communications
- Implement efficient group communication
- Minimize TCP traffic
- Decrease network congestion
Use clustering techniques to choose the best algorithm
12. Ring all-to-all algorithm
- Best for large messages
- Takes n-1 steps
[Figure: ring all-to-all exchange among four nodes A, B, C, D; steps 1 to 3]
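The n-1 step count can be checked with a small lock-step simulation. This is a sketch under stated assumptions: each node contributes one block that every other node must receive (an all-gather style exchange), nodes are arranged in a logical ring, and at each step every node forwards to its right neighbour the block it received most recently. `RingAllToAll` is a hypothetical name; no real network or ProActive call is involved.

```java
import java.util.Arrays;

final class RingAllToAll {
    /** Number of communication steps for n nodes. */
    static int steps(int n) {
        return n - 1;
    }

    /** Returns held[node][origin]: the block of node 'origin' as known by 'node' at the end. */
    static String[][] exchange(String[] blocks) {
        int n = blocks.length;
        String[][] held = new String[n][n];
        for (int node = 0; node < n; node++) {
            held[node][node] = blocks[node]; // each node starts with its own block only
        }
        for (int step = 1; step <= steps(n); step++) {
            for (int node = 0; node < n; node++) {
                int right = (node + 1) % n;
                // Block forwarded now: the one that started (step - 1) hops to the left.
                int origin = ((node - (step - 1)) % n + n) % n;
                held[right][origin] = held[node][origin];
            }
        }
        return held;
    }

    public static void main(String[] args) {
        String[][] result = exchange(new String[] {"A", "B", "C", "D"});
        for (String[] row : result) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

Each node sends exactly one block per step to a single neighbour, so no link carries more than one block at a time, which fits the slide's point about large messages.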
13. Recursive doubling all-to-all algorithm
- Best for small messages
- Takes log2(n) steps
[Figure: recursive doubling all-to-all exchange among four nodes A, B, C, D; steps 1 and 2]
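The logarithmic step count can be seen in a small lock-step simulation. This sketch assumes the number of nodes is a power of two and that each node contributes one block (an all-gather style exchange): at each step a node swaps everything it currently holds with the partner whose index differs in one bit, so the amount of data held doubles every step. `RecursiveDoubling` is a hypothetical name and no ProActive call is involved.

```java
import java.util.Arrays;

final class RecursiveDoubling {
    /** Number of communication steps for n nodes (n a power of two). */
    static int steps(int n) {
        int count = 0;
        for (int distance = 1; distance < n; distance <<= 1) {
            count++;
        }
        return count;
    }

    /** Returns held[node][origin]: the block of node 'origin' as known by 'node' at the end. */
    static String[][] exchange(String[] blocks) {
        int n = blocks.length;
        if (Integer.bitCount(n) != 1) {
            throw new IllegalArgumentException("node count must be a power of two");
        }
        String[][] held = new String[n][n];
        for (int node = 0; node < n; node++) {
            held[node][node] = blocks[node];
        }
        for (int distance = 1; distance < n; distance <<= 1) {
            String[][] next = new String[n][];
            for (int node = 0; node < n; node++) {
                int partner = node ^ distance; // index differing in exactly one bit
                next[node] = held[node].clone();
                for (int origin = 0; origin < n; origin++) {
                    if (held[partner][origin] != null) {
                        next[node][origin] = held[partner][origin];
                    }
                }
            }
            held = next; // all pairs exchange simultaneously within a step
        }
        return held;
    }

    public static void main(String[] args) {
        String[][] result = exchange(new String[] {"A", "B", "C", "D"});
        for (String[] row : result) {
            System.out.println(Arrays.toString(row));
        }
        System.out.println("steps: " + steps(4));
    }
}
```

Fewer steps means fewer message start-ups, but the messages grow at each step, which is consistent with the slide recommending this scheme for small messages.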
14. Conclusion
- TimIt: an easy and helpful profiling tool
- NAS benchmarks are easy to implement with ProActive and the OO SPMD pattern: http://www-sop.inria.fr/oasis/proactive/nas
- Good performance expected with the future Sun Java 6 and the use of Ibis RMI
15. Questions?
16. MPI / ProActive
MPI                        ProActive
mpirun                     deployment
MPI_Init, MPI_Finalize     activities creation
MPI_Comm_size              getMyGroupSize
MPI_Comm_rank              getMyRank
MPI_Send, MPI_Recv         method call (setter and getter)
MPI_Barrier                barrier
MPI_Bcast                  method call
MPI_Scatter                method call with a scatter group as parameter
MPI_Gather                 result of a group communication
MPI_Reduce                 programmer's method