Performance analysis tools applied to a finite adaptive mesh free boundary seepage parallel algorithm

1
Performance analysis tools applied to a finite
adaptive mesh free boundary seepage parallel
algorithm
  • S. Boeriu (1) and J.C. Bruch, Jr. (2)
  • (1) Center for Computational Science and Engineering
  • (2) Department of Mechanical and Environmental
    Engineering and Department of Mathematics
  • University of California, Santa Barbara
  • http://www.engineering.ucsb.edu/hpscicom

2
Acknowledgements
This material is based upon work supported by the
National Science Foundation under Grant 0086262.
This research was supported in part by NSF
cooperative agreement ACI-9619020 through
computing resources provided by the National
Partnership for Advanced Computational
Infrastructure at the San Diego Supercomputer
Center.
http://www.npaci.edu/Horizon/guide_linked/bh_tools_txt.html
3
Outline of Presentation
  • Introduction (Physical problem)
  • Problem formulation
  • Fixed domain formulation
  • Numerical algorithm
  • Test case
  • Performance tools and considerations
      a. VAMPIR
      b. PARAVER
  • Diagnostic example
  • Conclusions

4
Physical problem
Figure 1. Seepage through a rectangular dam.
5
Simplifying assumptions
  1. The soil in the flowfield is homogeneous and
    isotropic
  2. Capillary and evaporation effects are neglected
  3. The flow obeys Darcy's Law
  4. The flow is two-dimensional
  5. The flow is at steady state



6
Mathematical formulation
  • Darcy's Law
  • Potential Function
  • Velocity Components
  • Continuity Equation
  • Irrotationality Condition
  • Cauchy-Riemann Equations
  • Laplace's Equations
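The equations behind these headings did not survive in the transcript. For orientation, the block below gives the standard textbook forms for two-dimensional steady seepage in a homogeneous, isotropic soil. The symbols are assumptions, not the slide's own notation: k is the hydraulic conductivity, p the pressure, γ the specific weight of water, y the elevation, (u, v) the velocity components, φ the potential and ψ the stream function. Sign conventions for φ vary between texts.

```latex
\begin{align*}
\text{Darcy's Law:} \quad & \mathbf{q} = -k\,\nabla\!\left(\frac{p}{\gamma} + y\right) \\
\text{Potential function:} \quad & \varphi = -k\left(\frac{p}{\gamma} + y\right) \\
\text{Velocity components:} \quad & u = \frac{\partial \varphi}{\partial x}, \qquad
                                    v = \frac{\partial \varphi}{\partial y} \\
\text{Continuity:} \quad & \frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} = 0 \\
\text{Irrotationality:} \quad & \frac{\partial v}{\partial x} - \frac{\partial u}{\partial y} = 0 \\
\text{Cauchy-Riemann:} \quad & \frac{\partial \varphi}{\partial x} = \frac{\partial \psi}{\partial y}, \qquad
                               \frac{\partial \varphi}{\partial y} = -\frac{\partial \psi}{\partial x} \\
\text{Laplace:} \quad & \nabla^{2}\varphi = 0, \qquad \nabla^{2}\psi = 0
\end{align*}
```

With these conventions, substituting the velocity components into the continuity equation gives Laplace's equation for φ, and substituting the stream function relations into the irrotationality condition gives Laplace's equation for ψ.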

7
Problem formulation
Figure 2. Mathematical formulation of physical problem.
8
Extension of solution domain
  • The solution domain is extended to the known
    region
  • Then extend continuously to be defined on
    by setting

9
  • This yields

in the sense of distributions where
10
Fixed domain formulation
Figure 3. Fixed domain mathematical formulation.
11
Numerical Algorithm
  • A minimization problem can be formulated in
    terms of the functional

where a is a bilinear form, continuous,
symmetric, positive definite on R and
i.e.,
12
The functional J has one and only one
minimum on a closed convex set. The minimum is
found using the following algorithm
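The algorithm itself is not present in this transcript. Because a later slide reports time spent in an SOR module, one plausible reading is successive over-relaxation with projection onto the convex set. The sketch below illustrates that technique on a small hypothetical system; it assumes the discrete functional J(v) = 0.5 v·Av - b·v with A symmetric positive definite and the convex set {v >= 0}. The matrix, load vector, function name and parameters are all illustrative assumptions, not the authors' code or data.

```python
def projected_sor(A, b, omega=1.5, tol=1e-10, max_iter=10_000):
    """Minimize J(v) = 0.5*v.A.v - b.v over the convex set {v >= 0}.

    A is a dense symmetric positive definite matrix (list of lists),
    b the load vector, omega the relaxation factor (0 < omega < 2).
    Returns the approximate minimizer and the number of sweeps used.
    """
    n = len(b)
    v = [0.0] * n
    for iteration in range(1, max_iter + 1):
        max_change = 0.0
        for i in range(n):
            # Gauss-Seidel residual using the most recent values of v.
            residual = b[i] - sum(A[i][j] * v[j] for j in range(n))
            # Over-relaxed update followed by projection onto v_i >= 0.
            new_value = max(0.0, v[i] + omega * residual / A[i][i])
            max_change = max(max_change, abs(new_value - v[i]))
            v[i] = new_value
        if max_change < tol:
            return v, iteration
    return v, max_iter


# Hypothetical test system: a 1D Laplacian with a load that is partly
# negative, so the constraint v >= 0 becomes active at some nodes.
n = 8
A = [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0 for j in range(n)]
     for i in range(n)]
b = [1.0, 1.0, 1.0, -2.0, -2.0, -2.0, -2.0, -2.0]

v, iterations = projected_sor(A, b)

# Optimality check: v >= 0, A v - b >= 0, and the two are complementary.
residual = [sum(A[i][j] * v[j] for j in range(n)) - b[i] for i in range(n)]
print("solution:", [round(x, 6) for x in v])
print("sweeps:", iterations)
```

The projection step is what distinguishes this from ordinary SOR: nodes where the unconstrained update would go negative are held at zero, which is how a fixed-domain method can locate a free boundary without moving the mesh.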

13
Finite Element Error Analysis
Adaptive Mesh Finite Element Analysis (FEA)
General Equation for FEA

14
Error Analysis
Error Definition

where is the approximation of the exact
solution is the
calculated of an element (constant)
is the shape function and
15
Averaging Technique
Error Estimate in an Element
16
Error Norm of the Whole Computation Domain
Percentage Error
17
Local Mesh Refinement
Desired Criteria
Desired Local Error Criteria
Error Ratio
New Element Size
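The formulas for these headings are missing from the transcript. The headings match the well-known Zienkiewicz-Zhu adaptive procedure, so the sketch below assumes that procedure: a percentage error from the global error norm, a permissible error per element from the target percentage, an error ratio per element, and a new element size obtained by dividing the old size by the ratio raised to 1/p. The function name, arguments and sample numbers are hypothetical, and whether the authors used exactly these expressions is an assumption.

```python
import math


def refinement_plan(element_errors, solution_norm, sizes,
                    target_percent=10.0, order=1):
    """Sketch of a Zienkiewicz-Zhu style refinement step (assumed form).

    element_errors: estimated error norm of each element
    solution_norm:  norm of the computed solution over the whole domain
    sizes:          current characteristic size h of each element
    target_percent: desired percentage error for the whole domain
    order:          polynomial order p of the shape functions
    """
    m = len(element_errors)
    global_error = math.sqrt(sum(e * e for e in element_errors))
    total = math.sqrt(solution_norm ** 2 + global_error ** 2)
    percent_error = 100.0 * global_error / total

    # Permissible error per element if the error were spread evenly.
    allowed = (target_percent / 100.0) * total / math.sqrt(m)

    # Error ratio: values above 1 mark elements that need refinement.
    ratios = [e / allowed for e in element_errors]

    # Refine only where the ratio exceeds 1; other elements are left as-is.
    new_sizes = [h / xi ** (1.0 / order) if xi > 1.0 else h
                 for h, xi in zip(sizes, ratios)]
    return percent_error, ratios, new_sizes


# Hypothetical four-element example.
percent_error, ratios, new_sizes = refinement_plan(
    element_errors=[3.0, 4.0, 0.0, 0.0],
    solution_norm=12.0,
    sizes=[1.0, 1.0, 1.0, 1.0],
)
print(round(percent_error, 2), [round(r, 3) for r in ratios], new_sizes)
```

In this sketch elements whose ratio is at most 1 keep their size. A production code might also coarsen them, which is a separate design choice.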
18
Mesh Refinement
19
Test case


20
Results
22
Figure 4. Domain decomposition for Pass 4 of Case 1.
23
Figure 5. Speedup for Case 1.
24
Performance tools and considerations
  • The parallel program is monitored while it is
    executed. Monitoring produces performance data
    that is interpreted in order to reveal areas of
    poor performance. The program is then altered
    and the process is repeated until an acceptable
    level of performance is reached.

25
VAMPIR (Visualization and Analysis of MPI
Resources 2.0)
  • VAMPIR 2.0 is a post-mortem trace visualization
    tool from Pallas GmbH (http://www.pallas.com)
  • It uses the profiling extensions to MPI and
    permits analysis of the message events in which
    data is transmitted between processors during
    execution of a parallel program
  • It has a convenient user interface and excellent
    zooming and filtering. Global displays show all
    selected processes.

26
  • Global Timeline: detailed application execution
    over time axis
  • Activity Chart: presents per-process profiling
    information
  • Summaric Chart: aggregated profiling information
  • Communication Statistics: message statistics for
    each process pair
  • Global Communication Statistics: collective
    operations statistics
  • I/O Statistics: MPI I/O operation statistics
  • Calling Tree: global dynamic calling tree
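To make the idea of per-pair communication statistics concrete, the sketch below aggregates a list of message events into a message count and byte volume for each (sender, receiver) pair. The event tuples and the function name are hypothetical and chosen only for illustration; this is not VAMPIR's trace format or API.

```python
from collections import defaultdict

# Hypothetical message events: (sender rank, receiver rank, bytes sent).
# This layout is an assumption for illustration, not a real trace format.
events = [
    (0, 1, 4096),
    (1, 0, 4096),
    (0, 1, 2048),
    (2, 3, 8192),
    (3, 2, 1024),
]


def pair_statistics(message_events):
    """Return {(sender, receiver): (message count, total bytes)}."""
    counts = defaultdict(int)
    volumes = defaultdict(int)
    for sender, receiver, size in message_events:
        counts[(sender, receiver)] += 1
        volumes[(sender, receiver)] += size
    return {pair: (counts[pair], volumes[pair]) for pair in counts}


stats = pair_statistics(events)
for (sender, receiver), (count, total) in sorted(stats.items()):
    print(f"{sender} -> {receiver}: {count} messages, {total} bytes")
```

A pair that carries far more traffic than the others is a natural place to start looking when message volume dominates the run time.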

36
PARAVER (Parallel Program Visualization and
Analysis Tool)
  • PARAVER is a flexible parallel program
    visualization and analysis tool based on an
    easy-to-use Motif GUI (graphical user interface)
  • PARAVER was developed to respond to the basic
    need to have a qualitative perception of the
    application behavior by visual inspection and
    then to be able to focus on the detailed
    quantitative analysis of the problems.


37
Paraver (Parallel Program Visualization and
Analysis Tool)
  • Powerful, flexible parallel program visualization
    tool based on an easy-to-use Motif GUI (graphical
    user interface)
  • Developed by the European Center for Parallelism
    of Barcelona (CEPBA), Universitat Politecnica de
    Catalunya
  • http://www.cepba.upc.es/


38
  • Paraver is designed to visualize and analyze:
      - Communication and load balance
      - Programs combining OpenMP and MPI
      - Hardware performance and counters
  • Usage:
      - Compile programs with special libraries
      - Run programs to produce trace files
      - View and analyze traces
  • Designed to help in program understanding and
    optimization

47
Inefficient programming example
  • Load imbalance (inefficient memory use)
  • TLB (translation lookaside buffer) misses

48
Figure 6. Stage 1 - Processor 0 Mesh Map
49
Figure 7. Stage 1 - Processor 3 Mesh Map
50
Figure 8. Stage 1 - VAMPIR Activity Chart
51
Figure 9. Stage 1 - PARAVER Global Display
52
Figure 10. Stage 4 - VAMPIR Activity Chart
53
Figure 11. Stage 4 - VAMPIR Display Chart
54
Table 8. TLB misses.
Stage   Proc. 0   Proc. 3
1         9,464     7,870
4        12,210   208,341
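The arithmetic behind Table 8 can be made explicit. The sketch below uses only the four counts from the table and computes how much the TLB misses grew on each processor between Stage 1 and Stage 4, and how the two processors compare at Stage 4. The variable names are illustrative.

```python
# TLB miss counts copied from Table 8: tlb_misses[stage][processor].
tlb_misses = {
    1: {0: 9_464, 3: 7_870},
    4: {0: 12_210, 3: 208_341},
}

# Growth factor from Stage 1 to Stage 4 on each processor.
growth = {rank: tlb_misses[4][rank] / tlb_misses[1][rank] for rank in (0, 3)}

# Processor 3 relative to Processor 0 at Stage 4.
stage4_ratio = tlb_misses[4][3] / tlb_misses[4][0]

print(f"growth on processor 0: {growth[0]:.2f}x")
print(f"growth on processor 3: {growth[3]:.2f}x")
print(f"processor 3 vs processor 0 at Stage 4: {stage4_ratio:.2f}x")
```

Processor 0's count grows by less than a third, while Processor 3's grows by more than an order of magnitude, which is the asymmetry the diagnostic example is built around.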
55
Figure 12. Stage 4 - Processor 0 Mesh Map
56
Figure 13. Stage 4 - Processor 3 Mesh Map
57
Table 9. Stage 4 timing of the SOR module.
Processor   Time spent in SOR
0           0.3671
1           0.4068
2           0.6940
3           0.8393
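One way to quantify the imbalance in Table 9 is to compare the slowest processor with the mean. The sketch below uses the four times from the table (the slide does not state their units) and assumes the processors synchronize at the end of the SOR module, so that the faster ones wait for the slowest. The metric names are illustrative choices, not taken from the presentation.

```python
# SOR times per processor, copied from Table 9 (units not stated on the slide).
sor_times = {0: 0.3671, 1: 0.4068, 2: 0.6940, 3: 0.8393}

mean_time = sum(sor_times.values()) / len(sor_times)
slowest_rank = max(sor_times, key=sor_times.get)
slowest_time = sor_times[slowest_rank]

# Imbalance factor: 1.0 would mean every processor takes the same time.
imbalance = slowest_time / mean_time

# Average fraction of the SOR phase spent waiting, assuming all
# processors synchronize once the slowest one finishes.
idle_fraction = 1.0 - mean_time / slowest_time

print(f"slowest processor: {slowest_rank}")
print(f"imbalance factor:  {imbalance:.3f}")
print(f"idle fraction:     {idle_fraction:.3f}")
```

Under that synchronization assumption, roughly a third of the available processor time in this phase is spent waiting, with Processor 3 setting the pace.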
58
Figure 14. Stage 4 - VAMPIR Activity Chart
59
Figure 15. Stage 4 - VAMPIR Display Chart
60
Figure 16. Stage 4 - PARAVER Global Display
61
Conclusions
  • A significant factor that affects the
    performance of a parallel application is the
    balance between communication and workload. The
    challenge of the message passing model is in
    reducing message traffic over the interconnection
    network.
  • To fully understand the performance behavior of
    such applications, analysis and visualization
    tools are needed. Two such tools, VAMPIR and
    PARAVER, were used to analyze the performance of
    the seepage application.
  • It was seen that optimization of the parallel
    code can be carried out in an iterative process
    involving these tools to investigate performance
    issues.

62
Web Sites
  • Project site:
    http://www.engineering.ucsb.edu/hpscicom
  • San Diego Supercomputer Center:
    http://www.npaci.edu/Horizon/guide_linked/bh_tools_txt.html
  • VAMPIR: http://www.pallas.com
  • PARAVER: http://www.cepba.upc.es/