Title: Performance analysis tools applied to a finite adaptive mesh free boundary seepage parallel algorithm
1 Performance analysis tools applied to a finite adaptive mesh free boundary seepage parallel algorithm
- S. Boeriu (1) and J.C. Bruch, Jr. (2)
- (1) Center for Computational Science and Engineering
- (2) Department of Mechanical and Environmental Engineering and Department of Mathematics
- University of California, Santa Barbara
- http://www.engineering.ucsb.edu/hpscicom
2 Acknowledgements
This material is based upon work supported by the National Science Foundation under Grant 0086262. This research was supported in part by NSF cooperative agreement ACI-9619020 through computing resources provided by the National Partnership for Advanced Computational Infrastructure at the San Diego Supercomputer Center.
http://www.npaci.edu/Horizon/guide_linked/bh_tools_txt.html
3 Outline of Presentation
- Introduction (Physical problem)
- Problem formulation
- Fixed domain formulation
- Numerical algorithm
- Test case
- Performance tools and considerations
- a. VAMPIR
- b. PARAVER
- Diagnostic example
- Conclusions
4 Physical problem
Figure 1. Seepage through a rectangular dam.
5 Simplifying assumptions
- The soil in the flowfield is homogeneous and isotropic
- Capillary and evaporation effects are neglected
- The flow obeys Darcy's Law
- The flow is two-dimensional
- The flow is steady state
6 Mathematical formulation
- Darcy's Law
- Potential Function
- Velocity Components
- Continuity Equation
- Irrotationality Condition
- Cauchy-Riemann Equations
- Laplace's Equation
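The equations behind these headings are not in the transcript. For orientation only, one common textbook form of the relations named above is sketched here. The symbols are assumptions, not the presenters' notation: K is the constant hydraulic conductivity, p the pressure, \gamma the specific weight of water, y the vertical coordinate, (u, v) the velocity components, \phi the velocity potential and \psi the stream function (with u = -\partial\psi/\partial y, v = \partial\psi/\partial x). Sign and scaling conventions vary between authors.

```latex
\begin{align*}
\text{Darcy's law:} \quad & \mathbf{q} = (u, v) = -K \,\nabla\!\left(\frac{p}{\gamma} + y\right) \\
\text{Potential function:} \quad & \phi = K\left(\frac{p}{\gamma} + y\right) \\
\text{Velocity components:} \quad & u = -\frac{\partial \phi}{\partial x}, \qquad v = -\frac{\partial \phi}{\partial y} \\
\text{Continuity equation:} \quad & \frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} = 0 \\
\text{Irrotationality condition:} \quad & \frac{\partial v}{\partial x} - \frac{\partial u}{\partial y} = 0 \\
\text{Cauchy-Riemann equations:} \quad & \frac{\partial \phi}{\partial x} = \frac{\partial \psi}{\partial y}, \qquad \frac{\partial \phi}{\partial y} = -\frac{\partial \psi}{\partial x} \\
\text{Laplace's equation:} \quad & \nabla^2 \phi = 0, \qquad \nabla^2 \psi = 0
\end{align*}
```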
7 Problem formulation
Figure 2. Mathematical formulation of physical problem.
8 Extension of solution domain
- The solution domain is extended to the known region
- Then extend continuously to be defined on by setting
9 in the sense of distributions where
10 Fixed domain formulation
Figure 3. Fixed domain mathematical formulation.
11 Numerical Algorithm
- A minimization problem can be formulated in terms of the functional
where a is a bilinear form, continuous, symmetric, positive definite on R and
i.e.,
12 The functional J has one and only one minimum on a closed convex set. The minimum is found using the following algorithm
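The algorithm itself is not in the transcript. Since a later slide times an "SOR module", the sketch below shows one standard way to minimize a quadratic functional over a closed convex set: successive over-relaxation with projection. It is an illustration under stated assumptions, not the authors' code. The assumptions are that the discrete functional is J(w) = 0.5 w'Aw - b'w with A symmetric positive definite, and that the convex set is {w >= 0}. The function name, the relaxation factor `omega`, and the 2x2 test system are all hypothetical.

```python
def projected_sor(A, b, omega=1.5, tol=1e-10, max_iter=10_000):
    """Minimize 0.5*w'Aw - b'w subject to w >= 0 (A symmetric positive definite).

    Assumption: the closed convex set is the nonnegative orthant.
    """
    n = len(b)
    w = [0.0] * n
    for _ in range(max_iter):
        max_change = 0.0
        for i in range(n):
            # Gauss-Seidel value for unknown i, using the newest neighbour values.
            sigma = sum(A[i][j] * w[j] for j in range(n) if j != i)
            gs = (b[i] - sigma) / A[i][i]
            # Over-relax, then project back onto the convex set {w_i >= 0}.
            new = max(0.0, (1.0 - omega) * w[i] + omega * gs)
            max_change = max(max_change, abs(new - w[i]))
            w[i] = new
        if max_change < tol:
            break
    return w


# Hypothetical 2x2 system: the unconstrained minimum has a negative second
# component, so the constraint is active there.
A = [[2.0, -1.0], [-1.0, 2.0]]
b = [1.0, -1.0]
w = projected_sor(A, b)
```

For this small system the constrained minimizer is w = (0.5, 0): the second unknown sits on the constraint, which is the discrete analogue of a node lying outside the wetted region.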
13 Finite Element Error Analysis
Adaptive Mesh Finite Element Analysis (FEA)
General Equation for FEA
14 Error Analysis
Error Definition
where is the approximation of the exact solution, is the calculated of an element (constant), is the shape function and
15 Averaging Technique
Error Estimate in an Element
16 Error Norm of the Whole Computation Domain
Percentage Error
17 Local Mesh Refinement
Desired Criteria
Desired Local Error Criteria
Error Ratio
New Element Size
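The formulas for these four quantities are not in the transcript. The headings match the widely used Zienkiewicz-Zhu adaptive procedure, so the sketch below follows that procedure as an assumption; the slide formulas may differ. All names are hypothetical: `elem_errors` holds per-element error norms, `solution_norm` the energy norm of the computed solution, `target_pct` the desired percentage error, and `p` the polynomial order of the elements.

```python
import math


def refinement_sizes(elem_errors, elem_sizes, solution_norm, target_pct, p=1):
    """Return (percentage error, error ratios, new element sizes).

    Assumed (Zienkiewicz-Zhu style) definitions:
      global error norm   ||e||  = sqrt(sum_i ||e||_i^2)
      percentage error    eta    = 100 * ||e|| / sqrt(||u||^2 + ||e||^2)
      allowed local error e_bar  = (target/100) * sqrt((||u||^2 + ||e||^2) / m)
      error ratio         xi_i   = ||e||_i / e_bar
      new element size    h_new  = h_i / xi_i**(1/p)   (only where xi_i > 1)
    """
    m = len(elem_errors)
    err_norm = math.sqrt(sum(e * e for e in elem_errors))
    total = math.sqrt(solution_norm ** 2 + err_norm ** 2)
    pct_error = 100.0 * err_norm / total
    allowed = (target_pct / 100.0) * total / math.sqrt(m)
    ratios = [e / allowed for e in elem_errors]
    new_sizes = [
        h / xi ** (1.0 / p) if xi > 1.0 else h
        for h, xi in zip(elem_sizes, ratios)
    ]
    return pct_error, ratios, new_sizes


# Hypothetical two-element mesh, chosen so that sqrt(||u||^2 + ||e||^2) = 10.
pct, ratios, sizes = refinement_sizes(
    elem_errors=[0.3, 0.4],
    elem_sizes=[1.0, 1.0],
    solution_norm=math.sqrt(99.75),
    target_pct=5.0,
)
```

In this example only the second element exceeds its allowed error (ratio above 1), so only that element is assigned a smaller size.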
18 Mesh Refinement
19 Test case
20 Results
22 Figure 4. Domain decomposition for Pass 4 of Case 1.
23 Figure 5. Speedup for Case 1.
24 Performance tools and considerations
- The parallel program is monitored while it is executed.
- Monitoring produces performance data that is interpreted in order to reveal areas of poor performance.
- The program is then altered and the process is repeated until an acceptable level of performance is reached.
25 VAMPIR (Visualization and Analysis of MPI Resources 2.0)
- VAMPIR 2.0 is a post-mortem trace visualization tool from Pallas GmbH
- http://www.pallas.com
- It uses the profiling extensions to MPI and permits analysis of the message events where data is transmitted between processors during execution of a parallel program.
- It has a convenient user interface and excellent zooming and filtering.
- Global displays show all selected processes.
26
- Global Timeline: detailed application execution over time axis
- Activity Chart: presents per-process profiling information
- Summaric Chart: aggregated profiling information
- Communication Statistics: message statistics for each process pair
- Global Communication Statistics: collective operations statistics
- I/O Statistics: MPI I/O operation statistics
- Calling Tree: global dynamic calling tree
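To make "message statistics for each process pair" concrete, the sketch below builds the kind of sender-by-receiver matrix such a display summarizes. It does not read VAMPIR trace files or use any VAMPIR API. The event tuples `(source rank, destination rank, bytes)` and the function name are hypothetical stand-ins for data a trace would contain.

```python
def communication_matrix(events, num_procs):
    """Aggregate point-to-point messages into per-pair counts and byte volumes."""
    counts = [[0] * num_procs for _ in range(num_procs)]
    volume = [[0] * num_procs for _ in range(num_procs)]
    for src, dst, nbytes in events:
        counts[src][dst] += 1       # number of messages src -> dst
        volume[src][dst] += nbytes  # total bytes src -> dst
    return counts, volume


# Hypothetical message events: (source rank, destination rank, bytes).
events = [(0, 1, 800), (1, 0, 800), (0, 1, 1600), (2, 3, 400)]
counts, volume = communication_matrix(events, 4)
```

A row or column that dominates such a matrix points to a process pair carrying most of the message traffic.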
36 PARAVER (Parallel Program Visualization and Analysis Tool)
- PARAVER is a flexible parallel program visualization and analysis tool based on an easy-to-use Motif GUI (graphical user interface)
- PARAVER was developed to respond to the basic need to have a qualitative perception of the application behavior by visual inspection and then to be able to focus on the detailed quantitative analysis of the problems.
37 Paraver (Parallel Program Visualization and Analysis Tool)
- Powerful, flexible parallel program visualization tool based on an easy-to-use Motif GUI (graphical user interface)
- Developed by
- European Center for Parallelism of Barcelona (CEPBA)
- Universitat Politecnica de Catalunya
- http://www.cepba.upc.es/
38
- Paraver is designed to visualize and analyze
- - Communication and load balance
- - Combining OpenMP and MPI
- - Hardware performance and counters
- Usage
- - Compile programs with special libraries
- - Run programs to produce trace files
- - View and analyze traces
- Designed to help in program understanding and optimization
47 Inefficient programming example
- Load imbalance (inefficient memory use)
- TLB (translation lookaside buffer) misses
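As a rough illustration of why memory access order affects TLB misses, the sketch below runs a toy model: a fully associative TLB with least-recently-used replacement, swept over the same array in two orders. Every parameter (8-byte items, 4096-byte pages, 64 TLB entries, array size) is an assumption chosen for the example and does not describe the machine used in the study.

```python
from collections import OrderedDict


def count_tlb_misses(indices, item_bytes=8, page_bytes=4096, tlb_entries=64):
    """Count misses of a toy fully associative LRU TLB for a sequence of array indices."""
    tlb = OrderedDict()  # page number -> True, ordered from least to most recently used
    misses = 0
    for i in indices:
        page = (i * item_bytes) // page_bytes
        if page in tlb:
            tlb.move_to_end(page)        # hit: mark as most recently used
        else:
            misses += 1
            tlb[page] = True
            if len(tlb) > tlb_entries:
                tlb.popitem(last=False)  # evict the least recently used page
    return misses


n_pages, per_page = 128, 512  # 128 pages of 512 eight-byte items each
sequential = range(n_pages * per_page)
# Same items, but each consecutive access lands on a different page.
strided = [p * per_page + k for k in range(per_page) for p in range(n_pages)]

seq_misses = count_tlb_misses(sequential)
strided_misses = count_tlb_misses(strided)
```

In this model the sequential sweep misses once per page, while the strided sweep cycles through more pages than the TLB holds and misses on every access. Scattered node numbering in a refined mesh can produce a milder version of the same effect.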
48 Figure 6. Stage 1 - Processor 0 Mesh Map
49 Figure 7. Stage 1 - Processor 3 Mesh Map
50 Figure 8. Stage 1 - VAMPIR Activity Chart
51 Figure 9. Stage 1 - PARAVER Global Display
52 Figure 10. Stage 4 - VAMPIR Activity Chart
53 Figure 11. Stage 4 - VAMPIR Display Chart
54 Table 8. TLB misses.
Stage   Proc. 0   Proc. 3
1       9,464     7,870
4       12,210    208,341
55 Figure 12. Stage 4 - Processor 0 Mesh Map
56 Figure 13. Stage 4 - Processor 3 Mesh Map
57 Table 9. Stage 4 timing of the SOR module.
Processor   Time spent in SOR
0           0.3671
1           0.4068
2           0.6940
3           0.8393
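The timings in Table 9 can be condensed into a single load-imbalance figure. The sketch below uses the four values exactly as tabulated; the table gives no units, so none are assumed. The metric names (`imbalance`, `idle_fraction`) are illustrative choices, not quantities reported in the presentation.

```python
# Time spent in the SOR module per processor, copied from Table 9.
sor_times = {0: 0.3671, 1: 0.4068, 2: 0.6940, 3: 0.8393}

mean_time = sum(sor_times.values()) / len(sor_times)
max_time = max(sor_times.values())

# Ratio of the slowest processor to the average: 1.0 means perfect balance.
imbalance = max_time / mean_time
# Average share of the slowest processor's time that the processors spend waiting.
idle_fraction = 1.0 - mean_time / max_time
# Rank of the processor that determines the SOR wall time.
slowest = max(sor_times, key=sor_times.get)
```

With these values processor 3 is the slowest and spends roughly 2.3 times as long in SOR as processor 0. Processor 3 is also the one whose TLB misses rise sharply at Stage 4 in Table 8.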
58 Figure 14. Stage 4 - VAMPIR Activity Chart
59 Figure 15. Stage 4 - VAMPIR Display Chart
60 Figure 16. Stage 4 - PARAVER Global Display
61 Conclusions
- A significant factor that affects the performance of a parallel application is the balance between communication and workload. The challenge of the message passing model is in reducing message traffic over the interconnection network.
- To fully understand the performance behavior of such applications, analysis and visualization tools are needed. Two such tools, VAMPIR and PARAVER, were used to analyze the performance of the seepage application. It was seen that optimization of the parallel code can be carried out in an iterative process involving these tools to investigate performance issues.
62 Web Sites
- Project site
- http://www.engineering.ucsb.edu/hpscicom
- San Diego Supercomputer Center
- http://www.npaci.edu/Horizon/guide_linked/bh_tools_txt.html
- VAMPIR
- http://www.pallas.com
- PARAVER
- http://www.cepba.upc.es/