1
QDP and Chroma
  • Robert Edwards
  • Jefferson Lab
  • Collaborators
  • Balint Joo

2
SciDAC Software Structure
3
Overlapping communications and computations
  • C(x) = A(x) * shift(B, mu)
  • Send face forward non-blocking to neighboring
    node.
  • Receive face into pre-allocated buffer.
  • Meanwhile do A*B on interior sites.
  • Wait on receive to perform A*B on the face.
  • Lazy Evaluation (C style)
  • Shift(tmp, B, mu)
  • Mult(C, A, tmp)
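The ordering of steps on this slide can be sketched in plain C++. This is a single-process toy, not QDP or QMP code: FaceReceive, start_receive, wait_receive and mult_shifted are hypothetical names, the lattice is one-dimensional, and the "communication" is an ordinary copy. It only shows that interior sites are computed between starting the transfer and waiting on it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a message handle: it holds the face value
// that the neighbouring node would deliver.
struct FaceReceive {
    double buffer = 0.0;   // pre-allocated receive buffer (one site wide)
    bool complete = false;
};

// "Start" the transfer. A real code would post a non-blocking send/receive
// here; this sketch simply copies the neighbour's first site of B.
FaceReceive start_receive(const std::vector<double>& neighbour_B) {
    FaceReceive h;
    h.buffer = neighbour_B.front();
    return h;
}

// "Wait" for the transfer and hand back the received face value.
double wait_receive(FaceReceive& h) {
    h.complete = true;
    return h.buffer;
}

// C(x) = A(x) * B(x+1) on the n local sites of one node.
// Assumes A and B have the same size and neighbour_B is non-empty.
std::vector<double> mult_shifted(const std::vector<double>& A,
                                 const std::vector<double>& B,
                                 const std::vector<double>& neighbour_B) {
    const std::size_t n = A.size();
    std::vector<double> C(n);
    if (n == 0) return C;

    FaceReceive h = start_receive(neighbour_B);   // communication "in flight"
    for (std::size_t i = 0; i + 1 < n; ++i)       // interior: local data only
        C[i] = A[i] * B[i + 1];
    C[n - 1] = A[n - 1] * wait_receive(h);        // face: needs received value
    return C;
}
```

With A = {1, 2, 3}, B = {10, 20, 30} and a neighbour whose first site is 40, the last site of C is the only one that depends on the received value.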

4
QMP Simple Example
  • char buf[size];
  • QMP_msgmem_t mm;
  • QMP_msghandle_t mh;
  • mm = QMP_declare_msgmem(buf, size);
  • mh = QMP_declare_send_relative(mm, x);
  • QMP_start(mh);
  • // Do computations
  • QMP_wait(mh);
  • Receiving node coordinates with the same steps
    except
  • mh = QMP_declare_receive_from(mm, -x);

Multiple calls
5
Data Parallel QDP/C, C++ API
  • Hides architecture and layout
  • Operates on lattice fields across sites
  • Linear algebra tailored for QCD
  • Shifts and permutation maps across sites
  • Reductions
  • Subsets
  • Entry/exit: attach to existing codes

6
QDP Type Structure
  • Lattice Fields have various kinds of indices
  • Color: U^{ab}(x); Spin: Γ_{αβ}; Mixed: ψ^a_α(x), Q^{ab}_{αβ}(x)
  • Tensor Product of Indices forms Type
  • QDP forms these types via nested C++
    templating
  • Formation of new types (eg half fermion)
    possible

7
Data-parallel Operations
  • Unary and binary
  • -a, a-b, ...
  • Unary functions
  • adj(a), cos(a), sin(a), ...
  • Random numbers
  • // platform independent
  • random(a), gaussian(a)
  • Comparisons (booleans)
  • a < b, ...
  • Broadcasts
  • a = 0, ...
  • Reductions
  • sum(a), ...
  • Fields have various types (indices) Tensor
    Product
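A minimal sketch of what these data-parallel operations mean, assuming a toy field of real numbers. The type Field and the free functions below are hypothetical and are not the QDP API; every operation acts on all sites at once, and the reduction collapses the field to a single number.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy lattice field: one real number per site (hypothetical, not a QDP type).
struct Field {
    std::vector<double> site;
    explicit Field(std::size_t n, double v = 0.0) : site(n, v) {}

    // Broadcast: a = 0 sets every site to the same value.
    Field& operator=(double v) {
        for (double& s : site) s = v;
        return *this;
    }
};

// Unary minus applied site by site.
Field operator-(const Field& a) {
    Field r(a.site.size());
    for (std::size_t i = 0; i < a.site.size(); ++i) r.site[i] = -a.site[i];
    return r;
}

// Binary minus applied site by site (assumes equal sizes).
Field operator-(const Field& a, const Field& b) {
    Field r(a.site.size());
    for (std::size_t i = 0; i < a.site.size(); ++i)
        r.site[i] = a.site[i] - b.site[i];
    return r;
}

// Comparison yields a lattice of booleans (assumes equal sizes).
std::vector<bool> operator<(const Field& a, const Field& b) {
    std::vector<bool> r(a.site.size());
    for (std::size_t i = 0; i < a.site.size(); ++i)
        r[i] = a.site[i] < b.site[i];
    return r;
}

// Reduction over all sites.
double sum(const Field& a) {
    double total = 0.0;
    for (double s : a.site) total += s;
    return total;
}
```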

8
QDP Expressions
  • Can create expressions
  • QDP/C++ code
  • multi1d<LatticeColorMatrix> u(Nd);
  • LatticeDiracFermion b, c, d;
  • int mu;
  • c[even] = u[mu] * shift(b,mu) + 2 * d;
  • PETE: Portable Expression Template Engine
  • Temporaries eliminated, expressions optimised

9
Linear Algebra Implementation
  • Naïve ops involve lattice temps: inefficient
  • Eliminate lattice temps: PETE
  • Allows further combining of operations (adj(x)*y)
  • Overlap communications/computations
  • Full performance expressions at site level

// Lattice operation
A = adj(B) + 2 * C;
// Lattice temporaries
t1 = 2 * C; t2 = adj(B); t3 = t2 + t1; A = t3;
// Merged Lattice loop
for (i = ... ; ... ; ...)
  A[i] = adj(B[i]) + 2 * C[i];
10
QDP Optimization
  • Optimizations under the hood
  • Select numerically intensive operations through
    template specialization.
  • PETE recognises expression templates like
  • z = a * x + y
  • from type information at compile time.
  • Calls machine specific optimised routine (axpyz)
  • Optimized routine can use assembler, reorganize
    loops etc.
  • Optimized routines can be selected at
    configuration time.
  • Unoptimized fallback routines exist for
    portability
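A minimal sketch of how a pattern such as z = a * x + y might be caught at compile time, assuming a hand-written expression template rather than PETE. All names here (namespace sketch, Vec, Evaluate, axpyz, optimized_calls) are hypothetical. The generic Evaluate is the unoptimized fallback; the explicit specialization matches the exact expression type and forwards to a dedicated routine, which in a real code could be assembler.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace sketch {

struct Vec;  // defined below

template <class L, class R>
struct Add {
    const L& l;
    const R& r;
    double operator[](std::size_t i) const { return l[i] + r[i]; }
};

template <class E>
struct Scale {
    double s;
    const E& e;
    double operator[](std::size_t i) const { return s * e[i]; }
};

// Counts calls of the specialised routine (for demonstration only).
int optimized_calls = 0;

// Stand-in for a machine-specific kernel: z = a*x + y.
void axpyz(double a, const double* x, const double* y, double* z,
           std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) z[i] = a * x[i] + y[i];
}

// Generic fallback: evaluate any expression site by site.
template <class E>
struct Evaluate {
    static void run(std::vector<double>& dst, const E& e) {
        for (std::size_t i = 0; i < dst.size(); ++i) dst[i] = e[i];
    }
};

struct Vec {
    std::vector<double> v;
    explicit Vec(std::size_t n, double x = 0.0) : v(n, x) {}
    double operator[](std::size_t i) const { return v[i]; }

    // Assignment dispatches on the static type of the expression.
    template <class E>
    Vec& operator=(const E& e) {
        Evaluate<E>::run(v, e);
        return *this;
    }
};

// Specialisation selected purely from type information: a*x + y.
template <>
struct Evaluate<Add<Scale<Vec>, Vec>> {
    static void run(std::vector<double>& dst, const Add<Scale<Vec>, Vec>& e) {
        ++optimized_calls;
        axpyz(e.l.s, e.l.e.v.data(), e.r.v.data(), dst.data(), dst.size());
    }
};

inline Scale<Vec> operator*(double s, const Vec& x) { return {s, x}; }

template <class L, class R>
Add<L, R> operator+(const L& l, const R& r) { return {l, r}; }

}  // namespace sketch
```

Here z = 3.0 * x + y; goes through axpyz, while z = x + y; has a different expression type and falls back to the generic loop.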

11
Performance Test Case - Wilson Conjugate Gradient
LatticeFermion psi, p, r;
Real c, cp, a, d;
Subset s;

for(int k = 1; k <= MaxCG; ++k)
{
  // c = |r[k-1]|^2
  c = cp;

  // a[k] = |r[k-1]|^2 / <M p[k], M p[k]>
  // Mp = M(u) * p
  M(mp, p, PLUS);    // Dslash

  // d = |mp|^2
  d = norm2(mp, s);
  a = c / d;

  // psi[k] += a[k] p[k]
  psi[s] += a * p;

  // r[k] -= a[k] M^dag.M.p[k]
  M(mmp, mp, MINUS);
  r[s] -= a * mmp;

  cp = norm2(r, s);
  if ( cp <= rsd_sq ) return;

  // b[k+1] = |r[k]|^2 / |r[k-1]|^2
  b = cp / c;

  // p[k+1] = r[k] + b[k+1] p[k]
  p[s] = r + b*p;
}
  • In C++ significant room for perf. degradation
  • Performance limitations in Lin. Alg. Ops (VAXPY)
    and norms
  • Optimization
  • Funcs return container holding function type and
    operands
  • At "=", replace expression with optimized code by
    template specialization
  • Performance
  • QDP overhead ~1% of peak
  • Wilson: QCDOC 283 Mflops/node @ 350 MHz, 4^4/node
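The same iteration can be run stand-alone on a small dense real matrix. This is a sketch under stated assumptions, not Chroma code: Vector, Matrix, apply and cg_solve are hypothetical names, the dense matrix M stands in for the Wilson operator, the transpose plays the role of M^dag, there are no subsets, and rsd_sq is used as an absolute threshold on |r|^2.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vector = std::vector<double>;
using Matrix = std::vector<Vector>;  // dense, row-major (toy stand-in for M)

// out = M v, or M^T v when transpose is true.
Vector apply(const Matrix& M, const Vector& v, bool transpose) {
    const std::size_t n = v.size();
    Vector out(n, 0.0);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            out[i] += (transpose ? M[j][i] : M[i][j]) * v[j];
    return out;
}

double norm2(const Vector& v) {
    double t = 0.0;
    for (double x : v) t += x * x;
    return t;
}

// Solve (M^T M) psi = chi by conjugate gradient, starting from the psi
// passed in. Returns the iteration count, or -1 if max_cg was exhausted.
int cg_solve(const Matrix& M, const Vector& chi, Vector& psi,
             double rsd_sq, int max_cg) {
    const std::size_t n = chi.size();
    Vector mmp = apply(M, apply(M, psi, false), true);
    Vector r(n);
    for (std::size_t i = 0; i < n; ++i) r[i] = chi[i] - mmp[i];
    Vector p = r;
    double cp = norm2(r);
    if (cp <= rsd_sq) return 0;

    for (int k = 1; k <= max_cg; ++k) {
        const double c = cp;                    // c = |r[k-1]|^2
        const Vector mp = apply(M, p, false);   // mp = M p
        const double d = norm2(mp);             // d = |M p|^2 = <p, M^T M p>
        const double a = c / d;
        for (std::size_t i = 0; i < n; ++i) psi[i] += a * p[i];
        mmp = apply(M, mp, true);               // mmp = M^T M p
        for (std::size_t i = 0; i < n; ++i) r[i] -= a * mmp[i];
        cp = norm2(r);
        if (cp <= rsd_sq) return k;
        const double b = cp / c;                // b = |r[k]|^2 / |r[k-1]|^2
        for (std::size_t i = 0; i < n; ++i) p[i] = r[i] + b * p[i];
    }
    return -1;
}
```

The loop body is dominated by two operator applications, two norms and three vector updates, which is why the slide singles out VAXPY-style operations and norms as the performance-limiting pieces.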

12
Chroma
  • A lattice QCD toolkit/library built on top of
    QDP
  • Library is a module: can be linked with other
    codes.
  • Features
  • Utility libraries (gluonic measure, smearing,
    etc.)
  • Fermion support (DWF, Overlap, Wilson, Asqtad)
  • Applications
  • Spectroscopy, Props & 3-pt funcs, eigenvalues
  • Heatbath, HMC
  • Optimization hooks: level 3 Wilson-Dslash for
    Pentium, QCDOC, BG/L, IBM SP-like nodes (via
    Bagel)

13
Software Map
  • Autoconf/make based.
  • Installed packages leave a bin script for other
    packages

14
Chroma Lib Structure
  • Chroma Lattice Field Theory library
  • Support for gauge and fermion actions
  • Boson action support
  • Fermion action support
  • Fermion actions
  • Fermion boundary conditions
  • Inverters
  • Fermion linear operators
  • Quark propagator solution routines
  • Gauge action support
  • Gauge actions
  • Gauge boundary conditions
  • IO routines
  • Enums
  • Measurement routines
  • Eigenvalue measurements
  • Gauge fixing routines
  • Gluonic observables
  • Hadronic observables
  • Inline measurements
  • Eigenvalue measurements
  • Glue measurements
  • Hadron measurements
  • Smear measurements
  • Psibar-psi measurements
  • Schroedinger functional
  • Smearing routines
  • Trace-log support
  • Gauge field update routines
  • Heatbath
  • Molecular dynamics support
  • Hamiltonian systems
  • HMC trajectories

15
Fermion Actions
  • Actions are factory objects (foundries)
  • Do not hold gauge fields, only params
  • Factory/creation functions with gauge field
    argument
  • Takes a gauge field: creates a State and applies
    fermion BC.
  • Takes a State: creates a Linear Operator
    (dslash)
  • Takes a State: creates quark prop. solvers
  • Linear Ops are function objects
  • E.g., class Foo { int operator() (int x); } fred;
    // int z = fred(1);
  • Argument to CG, MR, etc. simple functions
  • Created with XML
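The factory pattern described on this slide can be sketched as follows. This is a toy under stated assumptions, not the Chroma class hierarchy: Field, State, LinearOperator, ToyWilsonOp and ToyWilsonAction are hypothetical names, the lattice is one-dimensional with real "links", and the operator is a simple nearest-neighbour hop. The action stores only parameters, createState applies the boundary condition to a gauge field, and linOp returns a function object.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

using Field = std::vector<double>;

// Gauge field after the fermion boundary condition has been applied.
struct State {
    Field links;
};

// Linear operators are function objects: callable on a fermion field.
class LinearOperator {
public:
    virtual ~LinearOperator() = default;
    virtual Field operator()(const Field& psi) const = 0;
};

// Toy hopping operator: out(x) = psi(x) - kappa * U(x) * psi(x+1), periodic.
// Assumes psi has as many sites as the state has links.
class ToyWilsonOp : public LinearOperator {
public:
    ToyWilsonOp(const State& s, double kappa) : state_(s), kappa_(kappa) {}
    Field operator()(const Field& psi) const override {
        const std::size_t n = psi.size();
        Field out(n);
        for (std::size_t x = 0; x < n; ++x)
            out[x] = psi[x] - kappa_ * state_.links[x] * psi[(x + 1) % n];
        return out;
    }
private:
    State state_;
    double kappa_;
};

// The action holds parameters only; gauge fields are passed in.
class ToyWilsonAction {
public:
    ToyWilsonAction(double kappa, double boundary)
        : kappa_(kappa), boundary_(boundary) {}

    // Takes a gauge field, applies the BC on the last link, returns a State.
    State createState(const Field& u) const {
        State s{u};
        if (!s.links.empty()) s.links.back() *= boundary_;
        return s;
    }

    // Takes a State, creates the linear operator.
    std::unique_ptr<LinearOperator> linOp(const State& s) const {
        return std::unique_ptr<LinearOperator>(new ToyWilsonOp(s, kappa_));
    }
private:
    double kappa_;
    double boundary_;
};
```

A solver written against LinearOperator only needs to call (*M)(psi), so it stays independent of which action produced the operator.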

16
Fermion Actions - XML
<FermionAction>
  <FermAct>WILSON</FermAct>
  <Kappa>0.11</Kappa>
  <FermionBC>
    <FermBC>SIMPLE_FERMBC</FermBC>
    <boundary>1 1 1 -1</boundary>
  </FermionBC>
  <AnisoParam>
    <anisoP>false</anisoP>
    <t_dir>3</t_dir>
    <xi_0>1.0</xi_0>
    <nu>1.0</nu>
  </AnisoParam>
</FermionAction>
  • Tag FermAct is key in lookup map of constructors
  • During construction, action reads XML
  • FermBC tag invokes another lookup
  • XPath used in chroma/mainprogs/main/propagator.cc
  • /propagator/Params/FermionAction/FermAct
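A lookup map of constructors keyed by the FermAct string might look like the sketch below. It is an illustration of the pattern only: FermionAction, Params, registry, register_action, create_action and ToyWilson are hypothetical names, and a flat string map stands in for the XML subtree that a real reader would hand to the constructor.

```cpp
#include <cassert>
#include <cmath>
#include <functional>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Flat key/value map standing in for the parsed <FermionAction> XML.
using Params = std::map<std::string, std::string>;

struct FermionAction {
    virtual ~FermionAction() = default;
    virtual std::string name() const = 0;
};

using Creator = std::function<std::unique_ptr<FermionAction>(const Params&)>;

// The lookup map: FermAct string -> constructor callback.
std::map<std::string, Creator>& registry() {
    static std::map<std::string, Creator> r;
    return r;
}

// Returns false if the name was already registered.
bool register_action(const std::string& name, Creator creator) {
    return registry().emplace(name, std::move(creator)).second;
}

// Reads the FermAct key and invokes the matching constructor, which then
// reads its own parameters from the same input.
std::unique_ptr<FermionAction> create_action(const Params& xml) {
    auto it = registry().find(xml.at("FermAct"));
    if (it == registry().end())
        throw std::invalid_argument("unknown FermAct");
    return it->second(xml);
}

// Example action: during construction it reads Kappa from the input.
struct ToyWilson : FermionAction {
    double kappa;
    explicit ToyWilson(const Params& p) : kappa(std::stod(p.at("Kappa"))) {}
    std::string name() const override { return "WILSON"; }
};
```

New actions are added by registering one more creator; the code that reads the input file does not change.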

17
HMC and Monomials
<Monomials>
  <elem>
    <Name>TWO_FLAVOR_WILSON_FERM_MONOMIAL</Name>
    <FermionAction>
      <FermAct>WILSON</FermAct>
    </FermionAction>
    <InvertParam>
      <invType>CG_INVERTER</invType>
      <RsdCG>1.0e-7</RsdCG>
      <MaxCG>1000</MaxCG>
    </InvertParam>
    <ChronologicalPredictor>
      <Name>LAST_SOLUTION_4D_PREDICTOR</Name>
    </ChronologicalPredictor>
  </elem>
  <elem> ... </elem>
</Monomials>
  • HMC built on Monomials
  • Monomials define Nf, gauge, etc.
  • Only provide Mom → deriv(U) and S(U).
    Pseudoferms not visible.
  • Have Nf=2 and rational Nf=1
  • Both 4D and 5D versions.
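The idea that the integrator sees only S(U) and its derivative from each monomial can be sketched with a one-variable toy. Nothing here is Chroma code: Monomial, Quadratic, Quartic, MonomialList and the free functions are hypothetical names, the "gauge field" is a single real number u, and the update is a plain leapfrog step.

```cpp
#include <cassert>
#include <cmath>
#include <memory>
#include <vector>

// A monomial exposes only its action and the derivative of the action.
struct Monomial {
    virtual ~Monomial() = default;
    virtual double S(double u) const = 0;
    virtual double dsdu(double u) const = 0;
};

struct Quadratic : Monomial {
    double k;
    explicit Quadratic(double k_) : k(k_) {}
    double S(double u) const override { return 0.5 * k * u * u; }
    double dsdu(double u) const override { return k * u; }
};

struct Quartic : Monomial {
    double g;
    explicit Quartic(double g_) : g(g_) {}
    double S(double u) const override { return g * u * u * u * u; }
    double dsdu(double u) const override { return 4.0 * g * u * u * u; }
};

using MonomialList = std::vector<std::unique_ptr<Monomial>>;

// Total action: sum over monomials.
double total_action(const MonomialList& ms, double u) {
    double s = 0.0;
    for (const auto& m : ms) s += m->S(u);
    return s;
}

// Force on the momentum: minus the summed derivatives.
double total_force(const MonomialList& ms, double u) {
    double f = 0.0;
    for (const auto& m : ms) f -= m->dsdu(u);
    return f;
}

double hamiltonian(const MonomialList& ms, double u, double p) {
    return 0.5 * p * p + total_action(ms, u);
}

// Leapfrog trajectory: half step in p, alternating full steps, half step in p.
void leapfrog(const MonomialList& ms, double& u, double& p,
              double dt, int steps) {
    p += 0.5 * dt * total_force(ms, u);
    for (int s = 0; s < steps; ++s) {
        u += dt * p;
        const double w = (s + 1 < steps) ? dt : 0.5 * dt;
        p += w * total_force(ms, u);
    }
}
```

The integrator never looks inside a monomial, so adding another term to the action means appending one more object to the list.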

18
Gauge Monomials
  • Gauge monomials
  • Plaquette
  • Rectangle
  • Parallelogram
  • Monomial constructor will invoke constructor for
    Name in GaugeAction
<Monomials>
  <elem> ... </elem>
  <elem>
    <Name>WILSON_GAUGEACT_MONOMIAL</Name>
    <GaugeAction>
      <Name>WILSON_GAUGEACT</Name>
      <beta>5.7</beta>
      <GaugeBC>
        <Name>PERIODIC_GAUGEBC</Name>
      </GaugeBC>
    </GaugeAction>
  </elem>
</Monomials>

19
Chroma Inline Measurements
<InlineMeasurements>
  <elem>
    <Name>MAKE_SOURCE</Name>
    <Param>...</Param>
    <Prop>
      <source_file>./source_0</source_file>
      <source_volfmt>MULTIFILE</source_volfmt>
    </Prop>
  </elem>
  <elem>
    <Name>PROPAGATOR</Name>
    <Param>...</Param>
    <Prop>
      <source_file>./source_0</source_file>
      <prop_file>./propagator_0</prop_file>
      <prop_volfmt>MULTIFILE</prop_volfmt>
    </Prop>
  </elem>
  <elem> ... </elem>
</InlineMeasurements>
  • HMC has Inline meas.
  • Chroma.cc is Inline only code.
  • Former mainprogs now inline meas.
  • Meas. are registered with constructor call.
  • Meas. given gauge field; no return value.
  • Only communicate to each other via disk (maybe
    mem. buf.??)

20
For More Information
  • U.S. Lattice QCD Home Page
  • http://www.usqcd.org/
  • The JLab Lattice Portal: http://lqcd.jlab.org/
  • High Performance Computing at JLab
  • http://www.jlab.org/hpc/