Title: Welcome To the Inaugural Meeting of the WRF Software Training and Documentation Team Jan. 26-28, 2004 NCAR, MMM Division
1Welcome To the Inaugural Meeting of the WRF
Software Training and Documentation Team, Jan.
26-28, 2004, NCAR, MMM Division
2Monday January 26, 2004
- Introduction
- Software Overview
- WRF Software Tutorial, June 2003
- Data and data structures
- Parallel infrastructure
3Introduction
- History
- Requirements emphasize flexibility over a range
of platforms, applications, users
- WRF develops rapidly. First released Dec 2000.
Last beta release, 1.3, in May 2003. Official 2.0
release coming in May 2004
- Circa 2003: "Arcane" used to describe WRF
- Adj. Known or understood by only a few.
Mysterious.
4Introduction
- Purpose of WRF Tiger Team effort
- Extend knowledge of WRF software to wider base of
software developers
- Create comprehensive developer document
- Team approach to both objectives
- Streamlining and code improvement as byproducts
in the coming months, but not the subject for this
meeting
5Introduction
- This meeting
- Review of WRF software structure and function
- Phenomenological: this is the code as it exists
- Incomplete: time to prepare and present is
limiter, but
- What we're looking for now is a roadmap through
the code for producing the comprehensive
documentation
- Develop an outline for the developer
documentation
- Writing assignments and work plan over next 9
months
6Some terms
- WRF Architecture: scheme of software layers and
interface definitions
- WRF Framework: the software infrastructure, also
"driver layer" in the WRF architecture
- WRF Model Layer: the computational routines that
are specifically WRF
- WRF Model: a realization of the WRF architecture
comprising the WRF model layer with some
framework
- WRF: a set of WRF architecture-compliant
applications, of which the WRF Model is one
7WRF Software Overview
8Weather Research and Forecast Model
Goals: Develop an advanced mesoscale forecast
and assimilation system, and accelerate
research advances into operations
[Figure: 12km WRF simulation of large-scale
baroclinic cyclone, Oct. 24, 2001]
9WRF Software Requirements
:-)
- Fully support user community's needs: nesting,
coupling, contributed physics code and multiple
dynamical cores, but keep it simple
- Support every computer, but make sure scaling and
performance is optimal on the computer we use
- Leverage community infrastructure, computational
frameworks, contributed software, but please no
opaque code
- Implement by committee of geographically remote
developers
- Adhere to union of all software process models
- Fully test, document, support
- Free
10WRF Software Requirements (for real)
- Goals
- Community Model
- Good performance
- Portable across a range of architectures
- Flexible, maintainable, understandable
- Facilitate code reuse
- Multiple dynamics/physics options
- Run-time configurable
- Nested
- Package independent
- Aspects of Design
- Single-source code
- Fortran90 modules, dynamic memory, structures,
recursion
- Hierarchical software architecture
- Multi-level parallelism
- CASE Registry
- Package-neutral APIs
- I/O, data formats
- Communication
- Scalable nesting/coupling infrastructure
11Aspects of WRF Software Design
12Aspects of WRF Software Design
13Model Coupling
14ESMF
15Performance
16Structural Aspects
- Directory Structure and relationship to Software
Hierarchy
- File nomenclature and conventions
- Use Association
17Directory Structure
18WRF Model Directory Structure
19WRF File Taxonomy and Nomenclature
20Module Conventions and USE Association
- Modules are named module_something
- Name of file containing module is
module_something.F
- If a module includes an initialization routine,
that routine should be named
init_module_something()
- Typically
- Driver and model layers are made up of modules,
- Mediation layer is not (rather, bare
subroutines), except for physics drivers in phys
directory
- Gives benefit of modules while avoiding cycles in
the use association graph
[Diagram: USE association across the three layers]
driver: MODULE module_this, MODULE module_that
mediation: USE module_this, USE module_that,
USE module_whatcha, USE module_macallit
model: MODULE module_whatcha, MODULE module_macallit;
USE module_this, USE module_that
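The point about avoiding cycles in the use association graph can be illustrated with a small check. This is only a sketch in Python, not part of WRF: the module names come from the diagram on this slide, while `mediation_sub`, the `USES` table and `has_cycle` are hypothetical names introduced for the example.

```python
# Illustrative USE graph: each key USEs the modules in its list.
# Module names follow the slide's diagram; "mediation_sub" stands in
# for a bare mediation-layer subroutine and is a made-up name.
USES = {
    "module_this": [],
    "module_that": [],
    "module_whatcha": ["module_this", "module_that"],
    "module_macallit": ["module_this", "module_that"],
    "mediation_sub": ["module_this", "module_that",
                      "module_whatcha", "module_macallit"],
}


def has_cycle(uses):
    """Return True if the USE graph contains a cycle (depth-first search)."""
    unvisited, in_progress, done = 0, 1, 2
    state = {name: unvisited for name in uses}

    def visit(name):
        state[name] = in_progress
        for dep in uses.get(name, []):
            dep_state = state.get(dep, unvisited)
            if dep_state == in_progress:
                return True          # back edge: a cycle was found
            if dep_state == unvisited and visit(dep):
                return True
        state[name] = done
        return False

    return any(state[name] == unvisited and visit(name) for name in uses)


print(has_cycle(USES))
```

Because the mediation-layer routine is a bare subroutine, no module can USE it, so it can depend on both the driver and model layers without closing a loop.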
21WRF S/W Tutorial, June 2003
22Tutorial Presentation
- Parallel Infrastructure
- Registry details
- I/O architecture and mechanism
- Example of coding new package into framework
23Data and Data Structures
24Session 3 Data Structures
- Overview
- Representation of domain
- Special representations
- Lateral Boundary Conditions
- 4D Tracer Arrays
25Data Overview
- WRF Data Taxonomy
- State data
- Intermediate data type 1 (I1)
- Intermediate data type 2 (I2)
- Heap storage (COMMON or Module data)
26State Data
- Persist for the duration of a domain
- Represented as fields in domain data structure
- Arrays are represented as dynamically allocated
pointer arrays in the domain data structure
- Declared in Registry using state keyword
- Always memory dimensioned; always thread shared
- Only state arrays can be subject to I/O and
interprocessor communication
27I1 Data
- Data that persists for the duration of 1 time
step on a domain and is then released
- Declared in Registry using i1 keyword
- Typically automatic storage (program stack) in
solve routine
- Typical usage is for tendency arrays in solver
- Always memory dimensioned and thread shared
- Typically not communicated or subject to I/O
28I2 Data
- I2 data are local arrays that exist only in
model-layer subroutines and exist only for the
duration of the call to the subroutine
- I2 data is not declared in Registry, never
communicated and never input or output
- I2 data is tile dimensioned and thread local;
over-dimensioning within the routine for
redundant computation is allowed
- the responsibility of the model layer programmer
- should always be limited to thread-local data
29Heap Storage
- Data stored on the process heap is not
thread-safe and is generally forbidden anywhere
in WRF
- COMMON declarations
- Module data
- Exception: If the data object is
- Completely contained and private within a Model
Layer module, and
- Set once and then read-only ever after, and
- No decomposed dimensions.
30Grid Representation in Arrays
- Increasing indices in WRF arrays run
- West to East (X, or I-dimension)
- South to North (Y, or J-dimension)
- Bottom to Top (Z, or K-dimension)
- Storage order in WRF is IKJ, but this is a WRF
Model convention, not a restriction of the WRF
Software Framework
31Grid Representation in Arrays
- The extent of the logical or domain dimensions is
always the "staggered" grid dimension. That is,
from the point of view of a non-staggered
dimension, there is always an extra cell on the
end of the domain dimension.
32Grid Indices Mapped onto Array Indices (C-grid
example)
[Figure: C-grid with ids = 1, ide = 5, jds = 1,
jde = 5]
Computation over mass points runs only
ids..ide-1 and jds..jde-1. Likewise,
vertical computation over unstaggered fields runs
kds..kde-1
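The loop-bound rule above can be written down as a short sketch. It is in Python rather than Fortran and is only an illustration: `compute_range` is a hypothetical helper, and the index values are the ones shown in the slide's figure.

```python
def compute_range(ds, de, staggered):
    """Indices visited when looping over one domain dimension.

    The domain end index `de` is always the staggered extent, so loops
    over unstaggered (mass-point) fields stop one cell earlier.
    """
    return range(ds, de + 1) if staggered else range(ds, de)


ids, ide = 1, 5  # values from the C-grid figure

mass_i = list(compute_range(ids, ide, staggered=False))  # ids..ide-1
u_i = list(compute_range(ids, ide, staggered=True))      # ids..ide

print(mass_i)
print(u_i)
```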
33LBC Arrays
- State arrays, declared in Registry using the b
modifier in the dimension field of the entry
- Store specified forcing data on domain 1, or
forcing data from parent on a nest
- All four boundaries are stored in the array; last
index is over
- P_XSB (western)
- P_XEB (eastern)
- P_YSB (southern)
- P_YEB (northern)
- These are defined in module_state_description.F
34LBC Arrays
- LBC arrays are declared as follows
- em_u_b(max(ide,jde),kde,spec_bdy_width,4)
- Globally dimensioned in first index as the
maximum of x and y dimensions
- Second index is over vertical dimension
- Third index is the width of the boundary
(namelist)
- Fourth index is which boundary
- Note: LBC arrays are globally dimensioned
- not fully dimensioned so still scalable in memory
- preserves global address space for dealing with
LBCs
- makes input trivial (just read and broadcast)
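To see why a globally dimensioned LBC array is still cheap in memory, the element counts can be compared directly. The sketch below is plain Python arithmetic; `lbc_shape` and `element_count` are hypothetical helpers, and the domain sizes are made-up example values, not taken from any WRF configuration.

```python
def lbc_shape(ide, jde, kde, spec_bdy_width):
    """Shape of an LBC array as declared on the slide:
    (max(ide,jde), kde, spec_bdy_width, 4)."""
    return (max(ide, jde), kde, spec_bdy_width, 4)


def element_count(shape):
    """Number of elements in an array of the given shape."""
    total = 1
    for extent in shape:
        total *= extent
    return total


# Example sizes chosen only for illustration.
ide, jde, kde, spec_bdy_width = 100, 80, 30, 5

lbc_elements = element_count(lbc_shape(ide, jde, kde, spec_bdy_width))
full_3d_elements = element_count((ide, kde, jde))  # one global IKJ field

print(lbc_elements, full_3d_elements)
```

The LBC array grows with the perimeter of the domain rather than its area, which is why holding it globally on every process remains affordable.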
35LBC Arrays
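[Figure: a given domain, spanning ids..ide and
jds..jde, with boundary zones of width
spec_bdy_width labeled P_YSB, P_XEB and P_YEB; the
remaining parts of the LBC array are marked unused]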
36LBC Arrays
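[Figure: a given subdomain that includes a domain
boundary, drawn within ids..ide and jds..jde, with
boundary zones of width spec_bdy_width labeled
P_YSB, P_XEB and P_YEB; the remaining parts of the
LBC array are marked unused]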
37Four Dimensional Tracer Arrays
- State arrays, used to store arrays of 3D fields
such as moisture tracers, chemical species,
ensemble members, etc.
- First 3 indices are over grid dimensions; last
dimension is the tracer index
- Each tracer is declared in the Registry as a
separate state array but with f and optionally
also t modifiers to the dimension field of the
entry
- The field is then added to the 4D array whose
name is given by the use field of the Registry
entry
38Four Dimensional Tracer Arrays
- Fields of a 4D array are input and output
separately and appear as any other 3D field in a
WRF dataset
- The extent of the last dimension of a tracer
array is from PARAM_FIRST_SCALAR to
num_tracername
- Both defined in Registry-generated
frame/module_state_description.F
- PARAM_FIRST_SCALAR is a defined constant (2)
- num_tracername is computed at run-time in
set_scalar_indices_from_config (module_configure)
- Calculation is based on which of the tracer
arrays are associated with which specific
packages in the Registry and on which of those
packages is active at run time (namelist.input)
39Four Dimensional Tracer Arrays
- Each tracer index (e.g. P_QV) into the 4D array
is also defined in module_state_description and
set in set_scalar_indices_from_config
- Code should always test that a tracer index is
greater than or equal to PARAM_FIRST_SCALAR
before referencing the tracer (inactive tracers
have an index of 1)
- Loops over tracer indices should always run from
PARAM_FIRST_SCALAR to num_tracername
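The indexing rules for 4D tracer arrays can be sketched as follows. This is a Python illustration of the convention described on these slides, not the code in module_configure: the function `set_scalar_indices`, the package names and the tracer-to-package table are all hypothetical, and each tracer is tied to a single package for simplicity.

```python
PARAM_FIRST_SCALAR = 2  # defined constant stated on the slide
INACTIVE = 1            # index given to tracers whose package is off


def set_scalar_indices(tracers, active_packages):
    """Assign an index to every tracer in Registry order.

    tracers: list of (index_name, package) pairs (made-up example data).
    Returns (indices, num_tracers), where num_tracers is the upper
    bound of the last dimension of the 4D array.
    """
    indices = {}
    next_index = PARAM_FIRST_SCALAR
    for name, package in tracers:
        if package in active_packages:
            indices[name] = next_index
            next_index += 1
        else:
            indices[name] = INACTIVE
    return indices, next_index - 1


# Hypothetical association of moisture tracers with packages.
moist_tracers = [("P_QV", "warm_rain"), ("P_QC", "warm_rain"),
                 ("P_QR", "warm_rain"), ("P_QI", "ice")]

indices, num_moist = set_scalar_indices(moist_tracers, {"warm_rain"})

# Guard every reference: only active tracers may be touched.
active = [n for n, i in indices.items() if i >= PARAM_FIRST_SCALAR]

# Loops over tracers always run PARAM_FIRST_SCALAR..num_moist.
loop_indices = list(range(PARAM_FIRST_SCALAR, num_moist + 1))

print(indices, num_moist, active, loop_indices)
```

When no package is active the upper bound falls to 1, so a loop from PARAM_FIRST_SCALAR to the upper bound executes zero times, which is the reason for starting the active range at 2.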
40Parallel Infrastructure
41Parallel Infrastructure
- Distributed memory parallelism
- Some basics
- API
- module_dm.F routines
- Registry interface (gen_comms.c)
- Data decomposition
- Communications
- Shared memory parallelism
- Tiling
- Threading directives
- Thread safety
42Some Basics on DM Parallelism
- Principal types of explicit communication
- Halo exchanges
- Periodic boundary updates
- Parallel transposes
- Special purpose scatter gather for nesting
- Also
- Broadcasts
- Reductions (missing, using MPI directly)
- Patch-to-global and global-to-patch
- Built in I/O server mechanism
43Some Basics on DM Parallelism
- All DM comm operations are collective
- Semantics for specifying halo exchanges,
periodic bdy updates, and transposes allow
message agglomeration (bundling)
- Halos and periods allow fields to have varying
width stencils within the same operation
- Efficient implementation is up to the external
package implementing communications
44DM Comms API
- External package provides a number of subroutines
in module_dm.F
- Partial API specification (linked from the
original slides)
- Actual invocation of halos, periods, transposes
provided by a specific external package is
through include files in the inc directory. This
provides greater flexibility and latitude to the
implementer than a subroutine interface
- Package implementer may define comm-invocation
include files manually or they can be generated
automatically by the Registry by providing a
routine external/package/gen_comms.c for
inclusion in the Registry program
45A few notes on RSL implementation
[Figure: example message msg1 containing fields u, v]
- RSL maintains descriptors for domains and the
operations on the domains
- An operation such as a halo exchange is a
collection of logical "messages", one per point
on the halo's stencil
- Each message is a collection of fields that
should be exchanged for that point
- RSL stores up this information in tables then
compiles an efficient communication schedule the
first time the operation is invoked for a domain
46Example HALO_EM_D2_5
Used in dyn_em/solve_em.F:
#ifdef DM_PARALLEL
      IF      ( h_mom_adv_order <= 4 ) THEN
# include "HALO_EM_D2_3.inc"
      ELSE IF ( h_mom_adv_order <= 6 ) THEN
# include "HALO_EM_D2_5.inc"
      ELSE
        WRITE(wrf_err_message,*)'solve_em: invalid h_mom_adv_order '
        CALL wrf_error_fatal(TRIM(wrf_err_message))
      ENDIF
# include "PERIOD_BDY_EM_D.inc"
# include "PERIOD_BDY_EM_MOIST2.inc"
# include "PERIOD_BDY_EM_CHEM2.inc"
#endif
Defined in Registry:
halo HALO_EM_D2_5 dyn_em 48:u_2,v_2,w_2,t_2,ph_2;\
                         24:moist_2,chem_2;\
                         4:mu_2,al
47Example HALO_EM_D2_5
!STARTOFREGISTRYGENERATEDINCLUDE 'inc/HALO_EM_D2_5.inc'
!
! WARNING This file is generated automatically by use_registry
! using the data base in the file named Registry.
! Do not edit.  Your changes to this file will be lost.
!
IF ( grid%comms( HALO_EM_D2_5 ) == invalid_message_value ) THEN
  CALL wrf_debug ( 50 , 'set up halo HALO_EM_D2_5' )
  CALL setup_halo_rsl( grid )
  CALL reset_msgs_48pt
  CALL add_msg_48pt_real ( u_2 , (glen(2)) )
  CALL add_msg_48pt_real ( v_2 , (glen(2)) )
  CALL add_msg_48pt_real ( w_2 , (glen(2)) )
  CALL add_msg_48pt_real ( t_2 , (glen(2)) )
  CALL add_msg_48pt_real ( ph_2 , (glen(2)) )
  if ( P_qv .GT. 1 ) CALL add_msg_24pt_real ( moist_2 ( grid%sm31,grid%sm32,grid%sm33,P_qv), glen(2) )
  if ( P_qc .GT. 1 ) CALL add_msg_24pt_real ( moist_2 ( grid%sm31,grid%sm32,grid%sm33,P_qc), glen(2) )
  if ( P_qr .GT. 1 ) CALL add_msg_24pt_real ( moist_2 ( grid%sm31,grid%sm32,grid%sm33,P_qr), glen(2) )
  if ( P_qi .GT. 1 ) CALL add_msg_24pt_real ( moist_2 ( grid%sm31,grid%sm32,grid%sm33,P_qi), glen(2) )
  if ( P_qs .GT. 1 ) CALL add_msg_24pt_real ( moist_2 ( grid%sm31,grid%sm32,grid%sm33,P_qs), glen(2) )
  if ( P_qg .GT. 1 ) CALL add_msg_24pt_real ( moist_2 ( grid%sm31,grid%sm32,grid%sm33,P_qg), glen(2) )
  CALL add_msg_4pt_real ( mu_2 , 1 )
  CALL add_msg_4pt_real ( al , (glen(2)) )
  CALL stencil_48pt ( grid%domdesc , grid%comms ( HALO_EM_D2_5 ) )
ENDIF
CALL rsl_exch_stencil ( grid%domdesc , grid%comms( HALO_EM_D2_5 ) )
Defined in Registry:
halo HALO_EM_D2_5 dyn_em 48:u_2,v_2,w_2,t_2,ph_2;\
                         24:moist_2,chem_2;\
                         4:mu_2,al
48Notes on Period Communication
[Figure: 4 x 4 grid of mass points m1,1 through
m4,4; caption: updating Mass Point periodic
boundary]
49Notes on Period Communication
[Figure: the same mass-point grid with the periodic
update applied: each row m1,j..m4,j is shown
together with copies of m4,j and m1,j; caption:
updating Mass Point periodic boundary]
50Notes on Period Communication
[Figure: rows of U-staggered points u1,j..u4,j plus
one extra staggered point per row (labeled u1,j;
u1,5 in the top row), drawn with the mass points
m1,j..m4,j; caption: updating U Staggered periodic
boundary]
51Notes on Period Communication
[Figure: the same U-staggered grid with the periodic
update applied: each row is shown together with
copies of u4,j and u2,j; caption: updating U
Staggered periodic boundary]
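The difference between the mass-point and U-staggered periodic updates shown on these slides can be reproduced with a short sketch. It is written in Python for illustration only; `periodic_update_x` is a hypothetical helper, and the assumption (read off the figures) is that the extra staggered U point at the end of a row coincides with the first one.

```python
def periodic_update_x(row, halo, staggered):
    """Return `row` extended by `halo` cells on each side, periodic in x.

    row holds n mass points, or n + 1 U points when staggered; in the
    staggered case the last point is the periodic image of the first.
    """
    unique = row[:-1] if staggered else row
    n = len(unique)
    west = [unique[i % n] for i in range(-halo, 0)]
    # East halo continues from the end of the stored row.
    east = [unique[(len(row) + i) % n] for i in range(halo)]
    return west + list(row) + east


mass_row = ["m1", "m2", "m3", "m4"]
u_row = ["u1", "u2", "u3", "u4", "u1"]  # fifth point repeats u1

print(periodic_update_x(mass_row, 1, staggered=False))
print(periodic_update_x(u_row, 1, staggered=True))
```

For mass points the halo cells receive m4 and m1. For U points the east halo receives u2 rather than u1, because u1 is already stored as the extra staggered point.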
52Welcome To the Inaugural Meeting of the WRF
Software Training and Documentation Team, Jan.
26-28, 2004, NCAR, MMM Division
53Tuesday, January 27, 2004
- Detailed code walk-through
- I/O
- Misc. Topics
- Registry
- Error handling
- Time management
- Build mechanism
54Detailed WRF Code Walkthrough
55Detailed Code Walkthrough
- The walkthrough was conducted using the following
set of Notes
- The WRF Code Browser was used to peruse the code
and dive down at various points
- The walkthrough began with the main/wrf.F routine
(when you bring up the browser, this should be in
the upper right hand frame; if not, click the
link WRF in the lower left hand frame, under
Programs)
56I/O
57I/O
- Concepts
- I/O Software Stack
- I/O and Model Coupling API
58WRF I/O Concepts
- WRF model has multiple input and output streams
that are bound to a particular format at run time
- Different formats (NetCDF, HDF, binary I/O) are
implemented behind a standardized WRF I/O API
- Lower levels of the WRF I/O software stack allow
expression of a dataset open as a two-stage
operation: OPEN BEGIN and then OPEN COMMIT
- Between the OPEN BEGIN and OPEN COMMIT the
program performs the sequence of writes that will
constitute one frame of output to "train" the
interface
- An implementation of the API is free to use this
information for optimization/bundling/etc. or
ignore it
- Higher levels of the WRF I/O software stack
provide a BEGIN/TRAIN/COMMIT form of an OPEN as a
single call
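The two-stage open can be pictured with a toy writer. The sketch below is Python and purely illustrative: the class `TrainedWriter`, its methods and `open_for_write` are hypothetical names that mimic the BEGIN / training writes / COMMIT sequence described above, not the WRF I/O API.

```python
class TrainedWriter:
    """Toy dataset that learns its frame layout before real writes."""

    def __init__(self):
        self.state = "closed"
        self.layout = []   # field names seen during training
        self.frames = []   # (name, data) pairs actually written

    def open_begin(self):
        self.state = "training"
        self.layout = []

    def open_commit(self):
        # A real package could define all variables at once here.
        self.state = "committed"

    def write_field(self, name, data):
        if self.state == "training":
            self.layout.append(name)        # record only, no output
        elif self.state == "committed":
            self.frames.append((name, data))
        else:
            raise RuntimeError("dataset is not open")


def open_for_write(writer, write_frame):
    """Higher-level form: BEGIN, one training frame, COMMIT in one call."""
    writer.open_begin()
    write_frame(writer)
    writer.open_commit()


def write_frame(writer):
    """The sequence of writes that makes up one frame (example fields)."""
    writer.write_field("U", [1.0, 2.0])
    writer.write_field("V", [3.0, 4.0])


w = TrainedWriter()
open_for_write(w, write_frame)   # training pass: nothing is output
write_frame(w)                   # actual write of the first frame
print(w.layout, len(w.frames))
```

Running the same frame-writing routine for both the training pass and the real write is what lets the interface learn the frame layout without a separate description of it.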
59I/O Software Stack
- Domain I/O
- Field I/O
- Package-independent I/O API
- Package-specific I/O API
60Domain I/O
- Routines in share/module_io_domain.F
- High level routines that apply to operations on a
domain and a stream
- open and define a stream for writing in a single
call that contains the OPEN FOR WRITE BEGIN, the
series of "training writes" to a dataset, and the
final OPEN FOR WRITE COMMIT
- read or write all the fields of a domain that
make up a complete frame on a stream (as
specified in the Registry) with a single call
- some wrf-model specific file name manipulation
routines
61Field I/O
- Routines in share/module_io_wrf.F
- Many of the routines here are duplicative of the
routines in share/module_io_domain.F and an
example of unnecessary layering in the WRF I/O
software stack
- However, this file does contain the base
output_wrf and input_wrf routines, which are what
all the stream-specific wrappers (that are
duplicated in the two layers) call
62Field I/O
- Output_wrf and input_wrf
- Contain hard coded WRF-specific meta-data puts
(for output) and gets (for input)
- Whether meta-data is output or input is
controlled by a flag in the grid data structure
- Meta data output is turned off when output_wrf is
being called as part of a "training write" within
a two-stage open
- It is turned on when it's called as part of an
actual write
- Contain registry generated series of calls to the
WRF I/O API to write or read individual fields
63Package-independent I/O API
- frame/module_io.F
- These routines correspond to WRF I/O API
specification
- Start with the wrf_ prefix (package-specific
routines start with ext_package_)
- The package-independent routines here contain
logic for
- selecting between formats (package-specific)
based on what stream is being written and
what format is specified for that stream
- calling the external package as a parallel
package (each process passes subdomain) or
collecting and calling on a single WRF process
- passing the data off to the asynchronous
quilt-servers instead of calling the I/O API from
this task
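The selection logic of the package-independent layer amounts to a dispatch on the format bound to a stream. The following Python sketch is only an illustration of that idea: the stream names, the format table, the `ext_*` function names and the quilt queue are all assumptions made for the example, and the real routines in frame/module_io.F have different signatures.

```python
def ext_ncd_write_field(name):
    """Stand-in for a package-specific NetCDF write (hypothetical)."""
    return "netcdf:" + name


def ext_int_write_field(name):
    """Stand-in for a package-specific binary write (hypothetical)."""
    return "binary:" + name


PACKAGES = {"netcdf": ext_ncd_write_field, "binary": ext_int_write_field}


def wrf_write_field(stream, name, stream_formats, quilt_queue=None):
    """Package-independent entry point.

    Picks the package from the format bound to the stream, or hands the
    data to the asynchronous servers when a quilt queue is supplied.
    """
    if quilt_queue is not None:
        quilt_queue.append((stream, name))
        return "queued"
    return PACKAGES[stream_formats[stream]](name)


# Formats bound to streams at run time (example values).
stream_formats = {"history": "netcdf", "restart": "binary"}

print(wrf_write_field("history", "U", stream_formats))
print(wrf_write_field("restart", "U", stream_formats))
```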
64Package-specific I/O API
- Format specific implementations of I/O
- external/io_netcdf/wrf_io.F90
- external/io_int/io_int.F90
- external/io_phdf5/wrf-phdf5.F90
- external/io_mcel/io_mcel.F90
- The NetCDF version contains a small program,
diffwrf.F90, that uses the API to read and then
generate an ASCII dump of a field that is
readable by HMV (see www.rotang.com), a small
plotting program we use in-house for debugging
and quick output.
- Diffwrf is also useful as a small example of how
to use the I/O API to read a WRF data set
65Misc. Topics
66Misc. Topics
- Registry
- Error handling
- Time management
- Build mechanism
67Registry
- Overview of Registry program
- Survey of what is autogenerated
68Registry Source Files (in tools/)
registry.c             Main program
reg_parse.c            Parse Registry file and build AST
gen_allocs.c           Generate allocate statements
gen_args.c             Generate argument lists
gen_comms.c            Generate comms (STUBS or PACKAGE SPECIFIC)
gen_config.c           Generate namelist handling code
gen_defs.c             Generate variable/dummy arg declarations
gen_interp.c           Generate nest interpolation code
gen_mod_state_descr.c  Generate frame/module_state_description.F
gen_model_data_ord.c   Generate inc/model_data_ord.inc
gen_scalar_derefs.c    Generate grid dereferencing code for non arrays
gen_scalar_indices.c   Generate code for 4D array indexing
gen_wrf_io.c           Generate calls to I/O API for fields
misc.c                 Utilities used in registry program
my_strtok.c            Utilities used in registry program
data.c                 Abstract syntax tree routines
sym.c                  Symbol table (used by parser and AST)
symtab_gen.c           Symbol table (used by parser and AST)
type.c                 Type handling, derived data types, misc.
69What the Registry Generates
- Include files in the inc directory
70WRF Error Handling
- frame/module_wrf_error.F
- Routines for
- Incremental debugging output: WRF_DEBUG
- Producing diagnostic messages: WRF_MESSAGE
- Writing an error message and terminating:
WRF_ERROR_FATAL
71WRF Time management
- Implementation of ESMF Time Manager
- Defined in external/esmf_time_f90
- Objects
- Clocks
- Alarms
- Time Instances
- Time Intervals
72WRF Time management
- Operations on ESMF time objects
- For example +, -, and other arithmetic is
defined for time intervals and instances
- I/O intervals are specified by setting alarms on
clocks that are stored for each domain; see
share/set_timekeeping.F
- The I/O operations are called when these alarms
"go off"; see MED_BEFORE_SOLVE_IO in
share/mediation_integrate.F
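The clock and alarm pattern can be sketched with Python's standard `datetime` module standing in for the ESMF time objects. This is an analogy, not the ESMF Time Manager API: the `Alarm` class, `run_clock` and the chosen times are hypothetical and exist only to show alarms triggering I/O at fixed intervals as a clock advances.

```python
from datetime import datetime, timedelta


class Alarm:
    """Rings whenever the clock reaches the next ring time."""

    def __init__(self, first_ring, interval):
        self.next_ring = first_ring
        self.interval = interval

    def is_ringing(self, now):
        return now >= self.next_ring

    def turn_off(self):
        # Silence the alarm and schedule the next ring.
        self.next_ring = self.next_ring + self.interval


def run_clock(start, stop, time_step, alarm):
    """Advance a clock from start to stop; return times when I/O fired."""
    fired = []
    now = start
    while now <= stop:
        if alarm.is_ringing(now):
            fired.append(now)       # the I/O operation would be called here
            alarm.turn_off()
        now = now + time_step       # instance + interval -> instance
    return fired


start = datetime(2004, 1, 26, 0, 0)
stop = start + timedelta(hours=3)
history_alarm = Alarm(start, timedelta(hours=1))

fired = run_clock(start, stop, timedelta(minutes=10), history_alarm)
print(len(fired), fired[1] - fired[0])  # instance - instance -> interval
```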
73WRF Build Mechanism
- Structure
- Scripts
- configure
- Determines architecture using 'uname' then
searches the arch/configure.defaults file for the
list of possible compile options for that system.
Typically the choices involve compiling for
single-threaded, pure shared memory, pure
distributed memory, or hybrid; there may be other
options too
- Creates the file configure.wrf, included by
Makefiles
- compile scenario
- Checks for existence of configure.wrf
- Checks the environment for the core-specific
settings such as WRF_EM_CORE or WRF_NMM_CORE
- Invokes the make command on the top-level
Makefile, passing it information about specific
targets to be built depending on the scenario
argument to the script
- clean -a
- Cleans the code, or (with -a) really cleans the
code
74WRF Build Mechanism
- Structure (continued)
- arch/configure.defaults: file containing
settings for various architectures
- test directory
- Contains a set of subdirectories, each one for a
different scenario. Includes idealized cases as
well as directories for running real-data cases
- The compile script requires the name of one of
these directories (for example "compile em_real")
and based on that it compiles wrf.exe and the
appropriate preprocessor (for example real.exe)
and creates symbolic links from the test/em_real
subdirectory to these executables
75WRF Build Mechanism
- Structure (continued)
- Top-level Makefile and Makefiles in
subdirectories
- The compile script invokes the top-level Makefile
as "make scenario"
- The top level Makefile, including rules and
targets from the configure.wrf file that was
generated by the configure script, then
recursively invokes Makefiles in subdirectories
in order
- external (external packages based on
configure.wrf)
- tools (builds registry)
- frame (invokes registry and then builds
framework)
- shared (mediation layer and other modules and
subroutines)
- physics (physics package)
- dyn_ (core specific code)
- main (main routine and link to produce
executables)