1
PARALLEL FULLY AUTOMATIC HP-ADAPTIVE 2D FINITE
ELEMENT PACKAGE
  • Principal investigator
  • Leszek Demkowicz (ICES, UT Austin)
  • Team
  • Maciej Paszynski (ICES, UT Austin)
  • Jason Kurtz (ICES, UT Austin)
  • Collaborators
  • Waldemar Rachowicz (Cracow University of
    Technology)
  • Timothy Walsh (Sandia National Laboratories)
  • David Pardo (ICES, UT Austin)
  • Dong Xue (ICES, UT Austin)
  • Kent Milfeld (TACC, UT Austin)

2
INTRODUCTION
  • We are working on
  • Parallel fully automatic hp-adaptive finite
    element 2D and 3D codes
  • The code automatically produces a sequence of
    optimal meshes with a global exponential
    convergence rate
  • Currently, we have a running 2D version of the
    code for the Laplace equation
  • All stages of the code are fully parallel
  • The code will soon be extended to solve the 3D
    Helmholtz and time-harmonic Maxwell equations
  • The work is driven by 3 challenging applications
  • Simulation of EM waves in the human head
  • Calculation of Radar Cross-Sections (3D
    scattering problems)
  • Simulation of Logging While Drilling EM measuring
    devices

3
PLAN OF THE PRESENTATION
  • General idea of hp-adaptive strategy
  • Parallel data structure and data migration
  • Parallel direct solver
  • Parallel mesh refinements and mesh reconciliation
  • Communication strategy
  • Results
  • Conclusions

4
GENERAL IDEA OF HP-ADAPTIVE
STRATEGY

5

ORTHOTROPIC HEAT EQUATION
  • 5 materials, some orthotropic, some not
  • large jumps in material data generate
    singularities at 3-material interfaces
  • requires anisotropic refinements

6
COARSE MESH, FINE MESH AND OPTIMAL MESH
Initial mesh = coarse mesh for the 1st step of
the iteration
Optimal mesh = coarse mesh for the 2nd step of
the iteration
(Figure: coarse mesh, fine mesh, and optimal mesh)
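This three-mesh cycle can be written as a loop.
Below is a minimal Python sketch; `solve`,
`global_hp_refine`, `select_optimal_mesh`, and
`error_estimate` are hypothetical callables
standing in for the package's routines:

```python
# Sketch of the automatic hp-adaptation loop (slide 6). All callables are
# hypothetical placeholders, not the package's API.

def hp_adaptive_loop(coarse_mesh, solve, global_hp_refine,
                     select_optimal_mesh, error_estimate, tol):
    while True:
        u_coarse = solve(coarse_mesh)               # solve coarse mesh problem
        fine_mesh = global_hp_refine(coarse_mesh)   # break each element, raise p
        u_fine = solve(fine_mesh)                   # fine mesh reference solution
        if error_estimate(u_coarse, u_fine) < tol:  # stop at requested accuracy
            return coarse_mesh, u_fine
        # The optimal mesh becomes the coarse mesh for the next step.
        coarse_mesh = select_optimal_mesh(coarse_mesh, u_coarse, u_fine)
```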
7

8
PARALLEL DATA STRUCTURES AND DATA MIGRATION

9
PARALLEL DATA STRUCTURES
  • Refinement trees are grown vertically
    from the initial mesh on each process
  • Each process generates initial mesh elements
    in only a portion of the global
    geometry
  • Identical copies of global
    geometry are stored on each process
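As a schematic illustration (not the package's
actual Fortran data structures), this layout can
be sketched in Python as follows:

```python
# Every process stores an identical copy of the initial mesh, but grows
# refinement trees only over the initial mesh elements it owns.
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class RefinementTree:
    children: List["RefinementTree"] = field(default_factory=list)  # 0 or 4 sons

@dataclass
class InitialElement:
    element_id: int
    owner_rank: int                          # process owning this element
    tree: Optional[RefinementTree] = None    # grown vertically on the owner only

@dataclass
class DistributedMesh:
    rank: int                                # this process
    elements: List[InitialElement]           # replicated global geometry

    def local_elements(self):
        return [e for e in self.elements if e.owner_rank == self.rank]
```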

10

DATA MIGRATION
  • Load balancing is performed by the ZOLTAN library
  • ZOLTAN provides 6 different domain decomposition
    algorithms
  • Initial mesh elements, together with their
    refinement trees, migrate between subdomains
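Conceptually, migration ships an initial mesh
element's refinement tree to its new owner after
repartitioning. The sketch below reuses the
classes from the previous sketch; `send` is a
hypothetical point-to-point wrapper (e.g. over
MPI), and this is not the ZOLTAN API itself:

```python
# After ZOLTAN computes a new partition (element_id -> rank), each owner
# ships its refinement trees to the new owners and updates ownership.

def migrate(mesh, new_owner, send):
    for e in mesh.elements:
        dest = new_owner[e.element_id]
        if e.owner_rank == mesh.rank and dest != mesh.rank:
            send(dest, (e.element_id, e.tree))   # tree travels with the element
            e.tree = None                        # no longer stored locally
        e.owner_rank = dest                      # ownership map is replicated
```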

11
PARALLEL DIRECT SOLVER

12

PARALLEL FRONTAL SOLVER WITH FAKE ELEMENTS
  • Both the coarse and fine mesh problems are solved
    using the parallel frontal solver
  • The frontal solver is an extension of Gaussian
    elimination
  • Assembly and elimination are performed together
    on the frontal submatrix of the global matrix
  • Domain decomposition approach

Reference: T. Walsh, L. Demkowicz, A Parallel
Multifrontal Solver for hp-Adaptive Finite
Elements, TICAM Report 99-01
13

PARALLEL FRONTAL SOLVER WITH FAKE ELEMENTS

Global matrix
14

PARALLEL FRONTAL SOLVER WITH FAKE ELEMENTS

Distribution of the global matrix among the processors
15

PARALLEL FRONTAL SOLVER WITH FAKE ELEMENTS
1. Run the forward elimination stage with fake
elements over each subdomain

16

PARALLEL FRONTAL SOLVER WITH FAKE ELEMENTS

After the forward elimination with fake elements,
the frontal matrices contain contributions to the
interface problem
17

PARALLEL FRONTAL SOLVER WITH FAKE ELEMENTS

2. Formulate the interface problem
18

PARALLEL FRONTAL SOLVER WITH FAKE ELEMENTS

3. Solve the interface problem
  • Broadcast the solution together with the upper
    triangular form of the interface problem matrix

19

PARALLEL FRONTAL SOLVER WITH FAKE ELEMENTS

4. Finally, the backward substitution is run in
parallel over each subdomain
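Stages 1-4 can be illustrated with dense linear
algebra. Below is a numpy sketch in which plain
block (Schur complement) elimination stands in
for the frontal elimination with fake elements
(which keeps the interface unknowns uneliminated);
the `parts` argument holds each subdomain's blocks
(A_ii, A_ib, A_bi, A_bb, f_i, f_b), split into
interior (i) and interface (b) unknowns:

```python
import numpy as np

def forward_eliminate(A_ii, A_ib, A_bi, A_bb, f_i, f_b):
    """Stage 1: eliminate the interior unknowns of one subdomain; what is
    left is this subdomain's contribution to the interface problem."""
    X = np.linalg.solve(A_ii, A_ib)
    y = np.linalg.solve(A_ii, f_i)
    return A_bb - A_bi @ X, f_b - A_bi @ y, X, y

def solve_subdomains(parts):
    # Stage 1: forward elimination, independently over each subdomain
    elim = [forward_eliminate(*p) for p in parts]
    # Stage 2: formulate the interface problem by summing contributions
    S = sum(e[0] for e in elim)
    g = sum(e[1] for e in elim)
    # Stage 3: solve the interface problem (solution is then broadcast)
    u_b = np.linalg.solve(S, g)
    # Stage 4: backward substitution, again independently per subdomain
    return [y - X @ u_b for (_, _, X, y) in elim], u_b
```

Summing the per-subdomain contributions in stage 2
corresponds to assembling the interface problem
from the frontal matrices left over after each
subdomain's forward elimination.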
20
PARALLEL MESH REFINEMENTS AND MESH RECONCILIATION

21

MESH REGULARITY
Each quad element has 4 vertex nodes, 4
mid-edge nodes, and 1 middle node

22

MESH REGULARITY
Isotropic h-refinement: the element is broken
into 4 sons (in the horizontal and vertical
directions). The big element has 2 smaller
neighbors. Mid-edge nodes and the common vertex
node of the smaller elements remain constrained,
with no new nodes generated. A mesh with
constrained nodes is called an irregular mesh.

23

MESH REGULARITY

It may happen that smaller elements need to be
broken once again. The newly created nodes are
called multiply constrained nodes. Multiply
constrained nodes are not allowed.
24

MESH REGULARITY

1-irregularity rule: an edge of a given element
can be broken only once, without breaking the
neighboring elements. The neighboring element
must be broken in this example.
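A minimal Python sketch of enforcing this rule
during h-refinement (the `neighbors`,
`break_into_four_sons`, and per-element `level`
bookkeeping are hypothetical, not the package's
API):

```python
def refine(element, mesh):
    """Break `element` into 4 sons while enforcing the 1-irregularity rule."""
    for neighbor in mesh.neighbors(element):
        if neighbor.level < element.level:
            # The shared edge is already broken once; breaking `element`
            # would create a multiply constrained node, so the coarser
            # neighbor must be broken first (recursively).
            refine(neighbor, mesh)
    mesh.break_into_four_sons(element)   # sons get level element.level + 1
```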
25

PARALLEL MESH REFINEMENTS: EXAMPLE 1
26

PARALLEL MESH REFINEMENTS: EXAMPLE 1
27

PARALLEL MESH RECONCILIATION: EXAMPLE 1
28

PARALLEL MESH RECONCILIATION: EXAMPLE 1
29

PARALLEL MESH REFINEMENTS AND MESH
RECONCILIATION: EXAMPLE 1
Conclusion: it is required to exchange refinement
tree data between neighboring subdomains.
30

PARALLEL MESH REFINEMENTS AND MESH
RECONCILIATION: CODING THE REFINEMENT TREES
Refinement trees are exchanged between
neighboring subdomains in a compressed binary
format
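For illustration, an isotropic refinement quadtree
can be coded as a pre-order bit stream and packed
into bytes. This is a sketch of the idea, not the
package's actual format (which must also
distinguish horizontal and vertical anisotropic
breaks):

```python
def encode(tree, bits=None):
    """Pre-order bit coding of a RefinementTree: 1 = broken into 4 sons,
    0 = leaf (unrefined) element."""
    if bits is None:
        bits = []
    if tree.children:
        bits.append(1)
        for son in tree.children:
            encode(son, bits)
    else:
        bits.append(0)
    return bits

def pack(bits):
    """Pack the bits into bytes for the message; the bit count is sent
    alongside so trailing padding in the last byte is unambiguous."""
    out = bytearray()
    for i in range(0, len(bits), 8):
        chunk = bits[i:i + 8] + [0] * (8 - len(bits[i:i + 8]))
        byte = 0
        for b in chunk:
            byte = (byte << 1) | b
        out.append(byte)
    return len(bits), bytes(out)
```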
31

MESH RECONCILIATION: EXAMPLE 2

Constrained nodes
Coarse mesh
Global hp-refinement -> Fine mesh
Optimal mesh
32

PARALLEL MESH REFINEMENTS AND MESH
RECONCILIATION: EXAMPLE 2
Conclusion: it is required to exchange
information about constrained nodes located at
the same place on both neighboring subdomains.
The mesh reconciliation must be performed after
creation of the fine mesh (after the global
hp-refinement) and also after creation of the
optimal mesh.
33

PARALLEL MESH REFINEMENTS AND MESH
RECONCILIATION: EXAMPLE 3
When an interface edge with a constrained node is
broken, the orders of the newly created nodes
must be established. This is done by comparing
the coarse and fine grid solutions over the edge.
But in the current subdomain there is no element
containing this entire edge, so we need to ask
the neighboring subdomains for the orders of the
newly created nodes.
34

PARALLEL MESH REFINEMENTS AND MESH
RECONCILIATION: EXAMPLE 3
Conclusion: after breaking an interface edge with
a constrained node, it is necessary to ask the
neighboring subdomain for the orders of
approximation of the newly created nodes.
35

PARALLEL MESH REFINEMENTS: SUMMARY
The mesh refinement algorithm runs on each
subdomain separately. The 1-irregularity rule is
enforced: an edge of a given element can be
broken only once, without breaking the
neighboring elements. Nodes situated on the
global interface are treated in the same way as
internal nodes. After the parallel mesh
refinements it is necessary to run the mesh
reconciliation algorithm.

36

PARALLEL MESH RECONCILIATION: SUMMARY
Two adjacent elements from neighboring subdomains:

the first one is not refined, the second one is
refined.
Create a constrained node on the interface edge
(in order to have the same number of degrees of
freedom on both sides).
37

PARALLEL MESH RECONCILIATION: SUMMARY
Two adjacent elements from neighboring subdomains:

both refined.
Create constrained nodes on the interface edges.
Exchange constrained node data between subdomains.
Remove interface constrained nodes situated at
the same place on both subdomains.
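A sketch of this last step, under the assumption
that nodes are matched by geometric position
(real code would match within a tolerance); the
node attributes are hypothetical:

```python
def reconcile_interface_nodes(my_constrained, neighbor_constrained):
    """If a constrained node sits at the same place on both subdomains,
    both adjacent elements are refined there and the node becomes regular."""
    neighbor_positions = {n.position for n in neighbor_constrained}
    for node in my_constrained:
        if node.position in neighbor_positions:
            node.constrained = False   # matching refinement on the other side
```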
38

PARALLEL MESH RECONCILIATION: SUMMARY
Two adjacent elements from neighboring subdomains:

the first one is not refined, the second one is
refined.
Exchange refinement trees between subdomains.
The second one is refined once again:
break the element,
remove the old constrained node,
create a new constrained node.
39

PARALLEL MESH REFINEMENTS: SUMMARY
  • We can summarize our algorithm in the following
    stages (see the sketch after this list)
  • 1. Parallel mesh refinements
  • 2. Exchange information about interface edge
    refinement trees, constrained nodes, and orders
    of approximation along the interface
  • 3. Mesh reconciliation
  • The repetition of stages 2 and 3 may be required
    if some of the interface edges were modified
    during the last iteration.
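A sketch of this stage loop, with the three
stages passed in as hypothetical callables around
the package's refinement, exchange, and
reconciliation routines; `reconcile` is assumed
to report whether any interface edge changed:

```python
def refine_and_reconcile(mesh, refine_in_parallel,
                         exchange_interface_data, reconcile):
    refine_in_parallel(mesh)              # stage 1: parallel mesh refinements
    while True:
        exchange_interface_data(mesh)     # stage 2: trees, constrained nodes,
                                          # orders along the interface
        modified = reconcile(mesh)        # stage 3: mesh reconciliation
        if not modified:                  # repeat stages 2-3 until the
            break                         # interface stops changing
```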

40
RESULTS

41

RESULTS: THE LAPLACE EQUATION OVER THE L-SHAPED DOMAIN

Optimal mesh obtained after parallel iterations
over 3 subdomains. Exponential convergence is
obtained down to an accuracy of 1% relative error.
42

RESULTS: THE BATTERY PROBLEM

The solution with an accuracy of 0.1% relative
error. Exponential convergence curve for the
parallel execution (16 processors).
43

RESULTS: THE BATTERY PROBLEM

Optimal mesh obtained after parallel iterations
over 15 subdomains, giving an accuracy of 0.1%
relative error.
44

RESULTS: THE BATTERY PROBLEM

Partition of the hp-refined mesh among 15 processors
45

PERFORMANCE MEASUREMENTS

Computation times for each part of the sequential
code, measured during each hp-adaptive iteration
46

PERFORMANCE MEASUREMENTS
Computation times for each part of the parallel
code, executed over 16 processors, measured
during each hp-adaptive iteration
47

PERFORMANCE MEASUREMENTS
Computation times for each part of the parallel
code, executed over 32 processors, measured
during each hp-adaptive iteration
48

PERFORMANCE MEASUREMENTS
When the load over one element is higher than the
overall load of all other elements, the optimal
load balance uses only 2 processors
49

PERFORMANCE MEASUREMENTS
Load distribution over 32 processors during
particular iterations. Only 16 processors are
working.
50

PERFORMANCE MEASUREMENTS
  • The load balancing is performed on the level of
    initial mesh elements.
  • In the Sandia battery problem, there are 16
    initial mesh elements covering the areas with the
    strongest singularities.
  • Most of the hp-refinements are required in the
    neighborhood of these areas.
  • Optimal load balancing needs only 16 processors.

51

PERFORMANCE MEASUREMENTS
  • Observations
  • When a singularity is covered by an initial mesh
    element, then the computational load over this
    element is high
  • Is load balancing at a level finer than the
    initial mesh elements necessary?
  • The singularity inside the initial mesh element
    can be solved to an accuracy of 0.1% relative
    error within 20 seconds

52

PERFORMANCE MEASUREMENTS
Computation times for each part of the parallel
code, executed over 16 processors, measured
during each hp-adaptive iteration
53

PERFORMANCE MEASUREMENTS
  • Observations
  • When a singularity is covered by an initial mesh
    element, then the computational load over this
    element is high
  • Is load balancing on the level of initial mesh
    elements enough?
  • The singularity within an initial mesh element
    can be solved to an accuracy of 0.1% relative
    error within 20 seconds
  • The problem over any larger subdomain can also be
    solved to an accuracy of 0.1% relative error
    within 20 seconds
  • (assuming the communication scales well)

54

PERFORMANCE MEASUREMENTS
  • Observations
  • The number of hp-adaptive strategy iterations
    required to obtain the solution with a given
    accuracy is LOWER in the parallel implementation
  • (the parallel direct solver, based on the domain
    decomposition approach, has a smaller error than
    the direct solver over the entire domain)

55
CONCLUSIONS
  • We have developed the parallel fully automatic
    hp-adaptive 2D code for the Laplace equation,
    where
  • Load balancing is performed by the ZOLTAN library
  • Both coarse and fine mesh problems are solved by
    the parallel frontal solver
  • The mesh is refined fully in parallel
  • References
  • M. Paszynski, J. Kurtz, L. Demkowicz, Parallel
    Fully Automatic hp-Adaptive 2D Finite Element
    Package, ICES Report 04-07
  • M. Paszynski, K. Milfeld, h-Relation Personalized
    Communication Strategy for HP-Adaptive
    Computations, ICES Report 04-40

56
FUTURE WORK
  • Future work will include
  • Implementation of a parallel version of the 3D
    code
  • Extending the code to solve the 3D Helmholtz and
    time-harmonic Maxwell equations
  • Parallel version of two grid solver
  • Challenging applications
  • Simulation of EM waves in the human head
  • Calculation of the Radar Cross-sections (3D
    scattering problems)
  • Simulation of Logging While Drilling EM
    measuring devices
  • References
  • M. Paszynski, J. Kurtz, L. Demkowicz, Parallel
    Fully Automatic hp-Adaptive 2D Finite Element
    Package, ICES Report 04-07
  • M. Paszynski, K. Milfeld, h-Relation Personalized
    Communication Strategy for HP-Adaptive
    Computations, ICES Report 04-40