A Framework for Parallel Finite Element Method Codes With Charm - PowerPoint PPT Presentation

About This Presentation
Description:

Data-driven parallel language and runtime system. 10/30/09. Parallel Programming Laboratory ... Based on object migration and measurement of load information ... – PowerPoint PPT presentation

Slides: 22
Provided by: charmC
Learn more at: http://charm.cs.uiuc.edu

Transcript and Presenter's Notes


1
A Framework for Parallel Finite Element Method
Codes With Charm
  • M. Bhandarkar, T. Hinrichs, O. Lawlor, K.
    Mahesh, L.V. Kale
  • and J.H. Jeong, J. Dantzig

2
Objectives
  • Assist in development of parallel FEM codes
  • Support parallel implementation
  • Provide capabilities of adaptive load balancing
  • Yet, keep the application code free of
    parallelization issues

3
Dendritic Growth
  • Studies evolution of solidification
    microstructures using a phase-field model
    computed on an adaptive finite element grid
  • Adaptive refinement and coarsening of grid
    involves re-partitioning

4
Crack Propagation
  • Explicit FEM code
  • Zero-volume Cohesive Elements inserted near the
    crack
  • As the crack propagates, more cohesive elements
    added near the crack, which leads to severe load
    imbalance

Decomposition into 16 chunks (left) and 128
chunks, 8 for each PE (right). The middle area
contains cohesive elements. Pictures: S.
Breitenfeld and P. Geubelle
5
FEM Framework Responsibilities
  • FEM Application (Initialize, Registration of
    Nodal Attributes, Loops Over Elements, Finalize)
  • FEM Framework (Update of Nodal properties,
    Reductions over nodes or partitions)
  • Partitioner
  • Combiner
  • Charm (Dynamic Load Balancing, Communication)
  • METIS
  • I/O
6
Structure of an FEM Program
  • Serial init() and finalize() subroutines
      • Do serial I/O, read serial mesh and call
        FEM_Set_Mesh
  • Parallel driver() main routine
      • One driver per partitioned mesh chunk
      • Runs in a thread: time-loop looks like serial
        version
      • Does computation and calls FEM_Update_Field
  • Framework handles partitioning, parallelization,
    and communication

7
Structure of an FEM Application
[Diagram: init(), then parallel driver instances joined by Update steps over Shared Nodes, then finalize()]
8
Framework Calls
  • FEM_Set_Mesh
      • Called from initialization to set the serial mesh
      • Framework partitions mesh into chunks
  • FEM_Create_Field
      • Registers a node data field with the framework,
        supports user data types
  • FEM_Update_Field
      • Updates node data field across all processors
      • Handles all parallel communication
  • Other parallel calls (Reductions, etc.)
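
The effect of a node field update can be illustrated with a small serial simulation. This is only a sketch of the idea, not the framework's API: `partition_elements`, `local_node_sums`, and `update_field` are hypothetical helper names, the contiguous split stands in for a real partitioner, and the sum is assumed as the combining rule for shared nodes.

```python
from collections import defaultdict

def partition_elements(elements, num_chunks):
    """Split the element list into contiguous chunks (stand-in for a real partitioner)."""
    size = -(-len(elements) // num_chunks)  # ceiling division
    return [elements[i:i + size] for i in range(0, len(elements), size)]

def local_node_sums(chunk, element_value):
    """Each chunk accumulates its own elements' contributions onto the nodes it touches."""
    sums = defaultdict(float)
    for elem_id, nodes in chunk:
        for n in nodes:
            sums[n] += element_value[elem_id]
    return dict(sums)

def update_field(per_chunk_sums):
    """Combine values of nodes shared by several chunks and return the totals to each chunk."""
    total = defaultdict(float)
    for sums in per_chunk_sums:
        for n, v in sums.items():
            total[n] += v
    return [{n: total[n] for n in sums} for sums in per_chunk_sums]

# A 1D mesh of four two-node elements: (element id, (node, node))
elements = [(0, (0, 1)), (1, (1, 2)), (2, (2, 3)), (3, (3, 4))]
element_value = {0: 1.0, 1: 2.0, 2: 3.0, 3: 4.0}

chunks = partition_elements(elements, 2)
before = [local_node_sums(c, element_value) for c in chunks]
after = update_field(before)  # node 2 is shared by both chunks
```

Before the update each chunk holds only a partial value for shared node 2; afterwards both chunks see the same combined value, which is what lets the per-chunk loop read like the serial version.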

9
Implementation
  • Requirements
      • Multi-partitioning
          • Latency-tolerant, better use of cache,
            flexibility for load balancing
      • Migration-based adaptive load balancing
          • Handle dynamic load variations in irregular
            applications
      • Threads
          • Parallel code structure similar to sequential
            codes
  • Charm supports all of these
      • Data-driven parallel language and runtime system

10
Charm
  • Adaptive latency tolerance
      • Message-driven execution
      • No blocking receives
  • Interoperability with other parallel languages
      • Run-time system allows modules using Charm,
        Threads, PVM, MPI to be combined
  • Runs on a variety of machines
      • Clusters of workstations (Unix, Linux, Windows
        NT)
      • Massively parallel machines (ASCI Red, Cray T3E)
      • SMP machines (IBM SP, SGI Origin)

11
Load Balancing Framework
  • Based on object migration and measurement of load
    information
  • Partition problem more finely than the number of
    available processors
  • Partitions implemented as objects (or threads)
    and mapped to available processors by LB
    Framework
  • Runtime system measures actual computation times
    of every partition, as well as communication
    patterns
  • Variety of plug-in LB strategies available
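
As a rough illustration of how measured loads can drive remapping, the sketch below applies a generic greedy heuristic: heaviest chunk first, onto the least-loaded processor. It is not claimed to be one of the framework's plug-in strategies. The names `greedy_map` and `proc_loads` and the load numbers are made up for the example, and a real strategy would also weigh communication patterns and migration cost.

```python
import heapq

def greedy_map(chunk_loads, num_procs):
    """Assign the heaviest remaining chunk to the currently least-loaded processor."""
    heap = [(0.0, p) for p in range(num_procs)]  # (accumulated load, processor id)
    heapq.heapify(heap)
    mapping = {}
    for chunk, load in sorted(chunk_loads.items(), key=lambda kv: (-kv[1], kv[0])):
        proc_load, proc = heapq.heappop(heap)
        mapping[chunk] = proc
        heapq.heappush(heap, (proc_load + load, proc))
    return mapping

def proc_loads(chunk_loads, mapping, num_procs):
    """Total measured load per processor under a given chunk-to-processor mapping."""
    loads = [0.0] * num_procs
    for chunk, proc in mapping.items():
        loads[proc] += chunk_loads[chunk]
    return loads

# Eight chunks on two processors; chunks 2 and 3 became expensive
# (for example, because elements were added there).
measured = {0: 1.0, 1: 1.0, 2: 5.0, 3: 5.0, 4: 1.0, 5: 1.0, 6: 1.0, 7: 1.0}
block = {c: c // 4 for c in measured}   # static block mapping: chunks 0-3 on PE 0, 4-7 on PE 1
balanced = greedy_map(measured, 2)

static_loads = proc_loads(measured, block, 2)
balanced_loads = proc_loads(measured, balanced, 2)
```

Having more chunks than processors is what makes this possible: with one chunk per processor there would be nothing to move.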

12
Load Balancing Framework
13
Crack Propagation
Decomposition into 16 chunks (left) and 128
chunks, 8 for each PE (right). The middle area
contains cohesive elements. Both decompositions
obtained using METIS. Pictures: S. Breitenfeld
and P. Geubelle
14
Overhead of Multipartitioning
15
Load balancer in action
Automatic Load Balancing in Crack Propagation
1. Elements Added
2. Load Balancer Invoked
3. Chunks Migrated
16
Charm Threads
  • A single scheduler for both objects and threads
    (based on generalized messages)
  • Non-preemptive
      • Complete control over scheduling
      • No locking overhead
  • Allow migration
      • Isomalloc thread stacks balance creation
        overhead with the ability to migrate
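
The non-preemptive property can be sketched with Python generators standing in for user-level threads. This is an analogy only, not how the runtime is implemented: `run` and `worker` are hypothetical names, and the round-robin order is an assumption.

```python
from collections import deque

def run(tasks):
    """Round-robin, non-preemptive scheduler: a task runs until it yields."""
    ready = deque(tasks)
    trace = []
    while ready:
        name, gen = ready.popleft()
        try:
            step = next(gen)            # run until the task voluntarily yields
            trace.append((name, step))
            ready.append((name, gen))   # re-queue the task for its next turn
        except StopIteration:
            pass                        # task finished; drop it
    return trace

def worker(steps):
    for i in range(steps):
        yield i  # explicit yield point; nothing can interrupt the code between yields

trace = run([("a", worker(2)), ("b", worker(1))])
```

Because a task gives up the processor only at its own yield points, code between yields cannot be interleaved with another task, which is why no locks are needed around shared state.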

17
Migrating Threads
  • Problem: references to local variables on the
    stack are undefined upon migration
  • Solution: force threads to use identical virtual
    addresses on any processor
      • Stack-copying
      • Isomalloc

18
Migrating Threads
  • Preliminary implementation: stack-copy on
    context-switch
      • All threads execute on process stack
      • Stack copied on context-switch
      • Expensive
  • Current implementation: isomalloc
      • Threads use identical virtual addresses on any
        processor
      • Stacks are locally allocated, but globally
        reserved
      • Sync-less algorithm using memory slotting
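
One way to read "locally allocated, but globally reserved" is that a shared virtual address region is divided into per-processor slots, so each processor can hand out stack addresses from its own slot without consulting the others. The sketch below models that reading with plain integers. The region bounds, stack size, and the names `slot_range` and `SlotAllocator` are hypothetical, and the actual slotting scheme may differ.

```python
def slot_range(pe, num_pes, region_start, region_size):
    """Return the [start, end) address range reserved for processor `pe`."""
    slot = region_size // num_pes
    start = region_start + pe * slot
    return start, start + slot

class SlotAllocator:
    """Hands out stack addresses from this processor's own slot only."""

    def __init__(self, pe, num_pes, region_start, region_size, stack_size):
        self.next, self.end = slot_range(pe, num_pes, region_start, region_size)
        self.stack_size = stack_size

    def alloc_stack(self):
        if self.next + self.stack_size > self.end:
            raise MemoryError("slot exhausted")
        addr = self.next
        self.next += self.stack_size
        return addr

# Hypothetical layout: a 1 GiB region split among 4 processors, 1 MiB stacks.
REGION_START, REGION_SIZE, STACK_SIZE = 0x10000000, 0x40000000, 0x100000
allocators = [SlotAllocator(pe, 4, REGION_START, REGION_SIZE, STACK_SIZE) for pe in range(4)]
stacks = [(a.alloc_stack(), a.alloc_stack()) for a in allocators]
all_addresses = [addr for pair in stacks for addr in pair]
```

Since no two processors ever allocate from the same slot, a stack's address range is free on every other processor, so a migrated thread can be placed at the same virtual address and its pointers into the stack stay valid.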

19
Overhead of Context-switch
20
Scalability of FEM Framework
21
Future Work
  • More support for dynamic FEM applications
      • Adaptive refinement/coarsening
      • Insertion/deletion of elements