The Pencil Code: multi-purpose and multi-user maintained

1
The Pencil Code: multi-purpose and multi-user maintained
  • Axel Brandenburg
  • (Nordita, Copenhagen)

2
Overview
  • Pencil formulation (advantages, headaches)
  • Structure of code, CVS maintenance
  • High-order schemes, tests
  • Peculiarities on big Linux clusters
  • Online data processing

3
Pencil Code
  • Started in Sept. 2001 with Wolfgang Dobler
  • High order (6th order in space, 3rd order in
    time)
  • Cache memory efficient
  • MPI; can run with PACX-MPI (across countries!)
  • Maintained/developed by many people (CVS!)
  • Automatic validation (over night or any time)
  • Max resolution so far: 1024^3

4
Range of applications
  • Isotropic turbulence
  • MHD (Nils), passive scalar, cosmic rays
  • Stratified layers
  • Convection, radiative transport (Tobi)
  • Shearing box
  • MRI (Nils), Planetesimals (Anders), Interstellar
    (Tony)
  • Sphere embedded in box
  • Fully convective stars (Dobler), geodynamo
    (McMillan)

5
Pencil formulation
  • In CRAY days worked with full chunks
    f(nx,ny,nz,nvar)
  • Now, on SGI, nearly 100% cache misses
  • Instead work with f(nx,nvar), i.e. one nx-pencil
  • No cache misses, negligible work space, just 2N
  • Communication before sub-timestep
  • Then evaluate all derivatives, e.g. call
    curl(f,iA,B)
  • Vector potential A=f(:,:,:,iAx:iAz), B=B(nx,3)
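
The pencil idea on this slide can be sketched in a few lines. This is an illustrative Python sketch, not Pencil Code source (which is Fortran): the helper names `ddx_pencil` and `rhs_by_pencils`, the nested-list storage, and the 2nd-order periodic derivative are all assumptions chosen for brevity. It only shows the loop structure, in which derivatives are evaluated one nx-pencil at a time so that the scratch storage is a single pencil rather than a full 3-D chunk.

```python
import math

def ddx_pencil(pencil, dx):
    """2nd-order periodic x-derivative of one pencil (hypothetical helper)."""
    nx = len(pencil)
    return [(pencil[(i + 1) % nx] - pencil[i - 1]) / (2.0 * dx) for i in range(nx)]

def rhs_by_pencils(f, dx):
    """Evaluate df/dx over a 3-D field stored as f[iz][iy][ix], pencil by pencil."""
    nz, ny = len(f), len(f[0])
    out = [[None] * ny for _ in range(nz)]
    for iz in range(nz):
        for iy in range(ny):
            # Only one nx-long work array is produced per iteration.
            out[iz][iy] = ddx_pencil(f[iz][iy], dx)
    return out

nx, ny, nz = 32, 4, 4
dx = 2.0 * math.pi / nx
f = [[[math.sin(ix * dx) for ix in range(nx)] for _ in range(ny)] for _ in range(nz)]
df = rhs_by_pencils(f, dx)
max_err = max(abs(df[iz][iy][ix] - math.cos(ix * dx))
              for iz in range(nz) for iy in range(ny) for ix in range(nx))
```

Python does not expose the cache behaviour itself; the point is only that each derivative call touches one short, contiguous pencil.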

6
A few headaches
  • All operations must be combined
  • Curl(curl), max5(smooth(divu)) must be in one go
  • rms and max values for monitoring
  • call max_name(b2,i_bmax,lsqrt=.true.)
  • call sum_name(b2,i_brms,lsqrt=.true.)
  • Similar routines for toroidal average, etc
  • Online analysis (spectra, slices, vectors)
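
Because the full 3-D field of a derived quantity never exists at once, monitoring values have to be accumulated as each pencil passes by. A minimal sketch of that bookkeeping follows, assuming that `lsqrt=.true.` means the square root is applied once to the final result; the `Diagnostics` class and its method names are hypothetical stand-ins, not the Pencil Code routines.

```python
import math

class Diagnostics:
    """Accumulates max and rms of b over pencils of b2 = b**2 (hypothetical)."""

    def __init__(self):
        self.max_sq = 0.0
        self.sum_sq = 0.0
        self.count = 0

    def add_pencil(self, b2):
        # Called once per pencil, right after b2 has been computed.
        self.max_sq = max(self.max_sq, max(b2))
        self.sum_sq += sum(b2)
        self.count += len(b2)

    def bmax(self):
        # Square root taken once at the end (assumed meaning of lsqrt).
        return math.sqrt(self.max_sq)

    def brms(self):
        return math.sqrt(self.sum_sq / self.count)

diag = Diagnostics()
pencils = [[1.0, 4.0], [9.0, 16.0], [0.0, 0.0]]  # made-up b2 values
for b2 in pencils:
    diag.add_pencil(b2)
```

In a parallel run the per-processor maxima and sums would still need a reduction across processors, which this sketch leaves out.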

7
CVS maintained
  • pserver (password protected)
  • Public (check-out only), private (ci/co, 20
    people)
  • Set of 10 test problems
  • Nightly auto-test (different machines, web)
  • Before check-in: run auto-test yourself
  • mpi and nompi: dummy module for single-processor
    machines (or use LAM/MPI on laptops)

8
Switch modules
  • magnetic or nomagnetic (e.g. just hydro)
  • hydro or nohydro (e.g. kinematic dynamo)
  • density or nodensity (burgulence)
  • entropy or noentropy (e.g. isothermal)
  • radiation or noradiation (see Tobi's talk)
  • dustvelocity or nodustvelocity (planetesimals)

9
Features, problems
  • Namelist (can freely introduce new params)
  • Upgrades forgotten on no-modules (auto-test)
  • SGI namelist problem (see pencil FAQs)

10
Pencil Code check-ins
11
High-order schemes
  • Alternative to spectral or compact schemes
  • Efficiently parallelized
  • No transpose necessary
  • 6th order central differences in space
  • Non-conservative scheme
  • Allows use of logarithmic density and entropy
  • Copes well with strong stratification and
    temperature contrasts
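
As an illustration of the 6th-order central differences mentioned above, the sketch below applies the standard 7-point first-derivative stencil on a periodic grid and measures the convergence order on a sine wave. The function names and the test problem are assumptions made for this example; only the stencil weights (45, -9, 1)/60 are the standard ones.

```python
import math

def ddx6(f, dx):
    """6th-order central first derivative on a periodic grid."""
    n = len(f)
    return [(45.0 * (f[(i + 1) % n] - f[i - 1])
             - 9.0 * (f[(i + 2) % n] - f[i - 2])
             + (f[(i + 3) % n] - f[i - 3])) / (60.0 * dx)
            for i in range(n)]

def max_error(n):
    """Maximum error of d/dx sin(x) on n points over one period."""
    dx = 2.0 * math.pi / n
    f = [math.sin(i * dx) for i in range(n)]
    df = ddx6(f, dx)
    return max(abs(df[i] - math.cos(i * dx)) for i in range(n))

err_coarse = max_error(16)
err_fine = max_error(32)
# Halving dx should reduce the error by about 2**6.
observed_order = math.log(err_coarse / err_fine, 2)
```

The stencil only needs three neighbours on each side, which is why no global transpose is required when the domain is split across processors.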

12
High-order spatial schemes
Main advantage: low phase errors
13
Wavenumber characteristics
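
The figure for this slide is not part of the transcript. One common way to quantify phase errors, offered here as an assumption about what such a plot shows, is the effective wavenumber: applying a central-difference stencil to exp(ikx) gives i*k_eff*exp(ikx), and the closer k_eff is to k, the smaller the phase error. The sketch compares the 2nd-order and 6th-order stencils at four grid points per wavelength.

```python
import math

def keff_2nd(kh):
    """Effective wavenumber times h for the 2nd-order central difference."""
    return math.sin(kh)

def keff_6th(kh):
    """Effective wavenumber times h for the 6th-order central difference."""
    return (45.0 * math.sin(kh)
            - 9.0 * math.sin(2.0 * kh)
            + math.sin(3.0 * kh)) / 30.0

kh = math.pi / 2.0  # four grid points per wavelength
rel_err_2nd = 1.0 - keff_2nd(kh) / kh
rel_err_6th = 1.0 - keff_6th(kh) / kh
```

At this resolution the 2nd-order scheme underestimates the wavenumber by roughly a third, while the 6th-order scheme is off by well under a tenth.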
14
Higher order: less viscosity
15
Less viscosity also in shocks
16
High-order temporal schemes
Main advantage: low amplitude errors
2N-RK3 scheme (Williamson 1980)
2nd order
3rd order
1st order
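
A minimal sketch of a 2N-storage third-order Runge-Kutta step follows. The slide does not list coefficients; the set used here (alpha = 0, -5/9, -153/128 and beta = 1/3, 15/16, 8/15) is one commonly quoted Williamson-type choice and should be treated as an assumption. The test problem du/dt = -u and the function names are likewise illustrative. "2N" refers to the two registers, u and w, each of the size of the solution.

```python
import math

# Assumed low-storage coefficients (see lead-in); not taken from the slides.
ALPHA = (0.0, -5.0 / 9.0, -153.0 / 128.0)
BETA = (1.0 / 3.0, 15.0 / 16.0, 8.0 / 15.0)

def rk3_2n_step(u, dt, rhs):
    """Advance u by one step using only two registers, u and w."""
    w = 0.0
    for alpha, beta in zip(ALPHA, BETA):
        w = alpha * w + dt * rhs(u)
        u = u + beta * w
    return u

def integrate(nsteps, t_end=1.0):
    """Integrate du/dt = -u from u(0) = 1 to t_end."""
    u, dt = 1.0, t_end / nsteps
    for _ in range(nsteps):
        u = rk3_2n_step(u, dt, lambda v: -v)
    return u

err_coarse = abs(integrate(20) - math.exp(-1.0))
err_fine = abs(integrate(40) - math.exp(-1.0))
# Halving dt should reduce the error by about 2**3.
observed_order = math.log(err_coarse / err_fine, 2)
```

For this linear problem one step multiplies u by 1 + z + z^2/2 + z^3/6 with z = -dt, which matches exp(z) through third order.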
17
Shock tube test
18
Hydromagnetic turbulence and subgrid scale models?
  • Want to shorten diffusive subrange
      • Waste of resources
  • Want to prolong inertial range
      • Focus of essential physics
  • Reasons to be worried about hyperviscosity
      • Shallower spectra
      • Wrong amplitudes of resulting large scale fields

19
Simulations at 512^3
Biskamp & Müller (2000)
Normal diffusivity
With hyperdiffusivity
20
256-processor run at 1024^3
21
MHD equation
Magn. Vector potential
Induction Equation
Momentum and Continuity eqns
22
Vector potential
  • B = curl A; advantage: div B = 0
  • J = curl B = curl(curl A) = curl^2 A
  • Not a disadvantage: consider Alfvén waves

B-formulation
A-formulation
2nd der once is better than 1st der twice!
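
The closing remark can be checked on the grid-scale zigzag mode (the Nyquist mode). In the sketch below, which assumes unit grid spacing, a periodic grid, and the standard 6th-order stencils, applying the first-derivative stencil twice returns exactly zero for that mode, so a diffusion term built that way cannot damp it. The dedicated second-derivative stencil gives a non-zero (damping) response. Function names are hypothetical.

```python
def d1(f):
    """6th-order central first derivative, periodic, grid spacing 1."""
    n = len(f)
    return [(45.0 * (f[(i + 1) % n] - f[i - 1])
             - 9.0 * (f[(i + 2) % n] - f[i - 2])
             + (f[(i + 3) % n] - f[i - 3])) / 60.0
            for i in range(n)]

def d2(f):
    """6th-order central second derivative, periodic, grid spacing 1."""
    n = len(f)
    return [(-490.0 * f[i]
             + 270.0 * (f[(i + 1) % n] + f[i - 1])
             - 27.0 * (f[(i + 2) % n] + f[i - 2])
             + 2.0 * (f[(i + 3) % n] + f[i - 3])) / 180.0
            for i in range(n)]

n = 16
zigzag = [(-1.0) ** i for i in range(n)]  # Nyquist mode: +1, -1, +1, ...
twice = d1(d1(zigzag))   # first derivative applied twice
once = d2(zigzag)        # second derivative applied once
```

Here `once[0]` is -1088/180, about -6.04, compared with the exact value -pi^2, about -9.87, for this mode; `twice` is identically zero.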
23
Wallclock time versus processor
24
Sensitivity to layout on Linux clusters
Gigabit uplink
100 Mbit link only
  • yproc x zproc
  • 4 x 32 → 1 (speed)
  • 8 x 16 → 3 times slower
  • 16 x 8 → 17 times slower

24 procs per hub
25
Why this sensitivity to layout?
[Diagram: processors laid out 16 per row (0-15, then 16-24, ...)]
All processors need to communicate with
processors outside their group of 24
26
Use exactly 4 columns
Only 2 x 4 = 8 processors need to communicate
outside the group of 24 → optimal use of the speed
ratio between the 100 Mb Ethernet switch and the
1 Gb uplink
0 1 2 3
4 5 6 7
8 9 10 11
12 13 14 15
16 17 18 19
20 21 22 23
0 1 2 3
4 5 6 7
8 9 10 11
12 13 14 15
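
The counting argument can be reproduced with a short script. It rests on several assumptions not stated on the slides: 128 processors in total, ranks numbered row by row, 24 consecutive ranks attached to each switch, and each processor exchanging data only with its four nearest neighbours in a periodic yproc x zproc grid. The function name `external_talkers` is hypothetical. For the first switch group it counts how many processors have at least one neighbour on another switch.

```python
def external_talkers(ncols, nrows, group_size=24, group=0):
    """Count ranks in one switch group with a neighbour in another group."""
    def switch_of(rank):
        return rank // group_size

    count = 0
    for rank in range(ncols * nrows):
        if switch_of(rank) != group:
            continue
        row, col = divmod(rank, ncols)
        neighbours = [
            row * ncols + (col - 1) % ncols,        # left (periodic)
            row * ncols + (col + 1) % ncols,        # right (periodic)
            ((row - 1) % nrows) * ncols + col,      # up (periodic)
            ((row + 1) % nrows) * ncols + col,      # down (periodic)
        ]
        if any(switch_of(nb) != group for nb in neighbours):
            count += 1
    return count

four_columns = external_talkers(4, 32)      # 4 x 32 layout
eight_columns = external_talkers(8, 16)     # 8 x 16 layout
sixteen_columns = external_talkers(16, 8)   # 16 x 8 layout
```

Under these assumptions the 4-column layout gives 8 processors talking across the uplink, the 8-column layout 16, and the 16-column layout all 24, in line with the ordering of the timings quoted earlier.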
27
Fragmentation over many switches
28
Animation of uz
29
Animation of B vectors
30
Animation of B vectors
31
Animation of energy spectra
32
Saturation behavior explained by magnetic
helicity conservation
Steady state, closed box
Small scale and large scale current helicity in
balance
33
With hyperdiffusivity
for ordinary hyperdiffusion
34
Conclusions
  • Subgrid scale modeling can be unsafe (some
    problems)
      • Shallower spectra, longer time scales, different
        saturation amplitudes
  • High-order schemes
      • Low phase and amplitude errors
      • Need less viscosity
  • 100 Mb link close to bandwidth limit
      • Comparable to Origin
      • 2x faster with Gb switch
      • 100 Mb switches with Gb uplink optimal