Trilinos Tutorial Overview and Basic Concepts Michael A. Heroux Sandia National Laboratories ACTS Toolkit Workshop 2006 - PowerPoint PPT Presentation

Loading...

PPT – Trilinos Tutorial Overview and Basic Concepts Michael A. Heroux Sandia National Laboratories ACTS Toolkit Workshop 2006 PowerPoint presentation | free to download - id: 3cf375-ZTdjO



Loading


The Adobe Flash plugin is needed to view this content

Get the plugin now

View by Category
About This Presentation
Title:

Trilinos Tutorial Overview and Basic Concepts Michael A. Heroux Sandia National Laboratories ACTS Toolkit Workshop 2006

Description:

SAND Number: 2005-3338P ... Overview and Basic Concepts Michael A. Heroux Sandia National Laboratories – PowerPoint PPT presentation

Number of Views:43
Avg rating:3.0/5.0
Slides: 134
Provided by: sandiaGo8
Learn more at: http://www.sandia.gov
Category:

less

Write a Comment
User Comments (0)
Transcript and Presenter's Notes

Title: Trilinos Tutorial Overview and Basic Concepts Michael A. Heroux Sandia National Laboratories ACTS Toolkit Workshop 2006


1
Trilinos TutorialOverview and Basic
ConceptsMichael A. HerouxSandia National
LaboratoriesACTS Toolkit Workshop 2006
Sandia is a multiprogram laboratory operated by
Sandia Corporation, a Lockheed Martin
Company,for the United States Department of
Energy under contract DE-AC04-94AL85000.
2
Trilinos Development Team
Ross Bartlett Lead Developer of Thyra Developer
of Rythmos Paul Boggs Developer of Thyra Todd
Coffey Lead Developer of Rythmos Jason Cross
Developer of Jpetra David Day Developer of
Komplex Clark Dohrmann Developer of
CLAPS Michael Gee Developer of ML, NOX Bob
Heaphy Lead developer of Trilinos SQA Mike
Heroux Trilinos Project Leader Lead Developer
of Epetra, AztecOO, Kokkos, Komplex, IFPACK,
Thyra, Tpetra Developer of Amesos, Belos,
EpetraExt, Jpetra Ulrich Hetmaniuk Developer of
Anasazi
Robert Hoekstra Lead Developer of
EpetraExt Developer of Epetra, Thyra,
Tpetra Russell Hooper Developer of NOX Vicki
Howle Lead Developer of Meros Developer of Belos
and Thyra Jonathan Hu Developer of ML Sarah
Knepper Developer of Komplex Tammy Kolda Lead
Developer of NOX Joe Kotulski Lead Developer of
Pliris Rich Lehoucq Developer of Anasazi and
Belos Kevin Long Lead Developer of Thyra,
Developer of Belos and Teuchos Roger Pawlowski
Lead Developer of NOX Michael Phenow Trilinos
Webmaster Lead Developer of New_Package
Eric Phipps Developer of LOCA and NOX Marzio
Sala Lead Developer of Didasko and
IFPACK Developer of ML, Amesos Andrew Salinger
Lead Developer of LOCA Paul Sexton Developer
of Epetra and Tpetra Bill Spotz Lead Developer
of PyTrilinos Developer of Epetra,
New_Package Ken Stanley Lead Developer of
Amesos and New_Package Heidi Thornquist Lead
Developer of Anasazi, Belos and Teuchos Ray
Tuminaro Lead Developer of ML and Meros Jim
Willenbring Developer of Epetra and New_Package.
Trilinos library manager Alan Williams
Developer of Epetra, EpetraExt, AztecOO, Tpetra
3
Outline of Talk
  • Background/Motivation.
  • Trilinos Package Concepts.
  • ANAs, LALs and APPs.
  • Managing Parallel Data Petra Object Model.
  • Whirlwind Tour of Some Trilinos Packages.
  • Generic Programming.
  • SQA/SQE Issues.
  • Concluding remarks.

4
Target Problems PDES and more
Circuits
PDES
Inhomogeneous Fluids
And More
5
Motivation For Trilinos
  • Sandia does LOTS of solver work.
  • When I started at Sandia in May 1998
  • Aztec was a mature package. Used in many codes.
  • FETI, PETSc, DSCPack, Spooles, ARPACK, DASPK, and
    many other codes were (and are) in use.
  • New projects were underway or planned in
    multi-level preconditioners, eigensolvers,
    non-linear solvers, etc
  • The challenges
  • Little or no coordination was in place to
  • Efficiently reuse existing solver technology.
  • Leverage new development across various projects.
  • Support solver software processes.
  • Provide consistent solver APIs for applications.
  • ASCI was forming software quality
    assurance/engineering (SQA/SQE) requirements
  • Daunting requirements for any single solver
    effort to address alone.

6
Evolving Trilinos Solution
  • Trilinos1 is an evolving framework to address
    these challenges
  • Fundamental atomic unit is a package.
  • Includes core set of vector, graph and matrix
    classes (Epetra/Tpetra packages).
  • Provides a common abstract solver API (Thyra
    package).
  • Provides a ready-made package infrastructure
    (new_package package)
  • Source code management (cvs, bonsai).
  • Build tools (autotools).
  • Automated regression testing (queue directories
    within repository).
  • Communication tools (mailman mail lists).
  • Specifies requirements and suggested practices
    for package SQA.
  • In general allows us to categorize efforts
  • Efforts best done at the Trilinos level (useful
    to most or all packages).
  • Efforts best done at a package level (peculiar or
    important to a package).
  • Allows package developers to focus only on things
    that are unique to their package.

1. Trilinos loose translation A string of
pearls
7
Trilinos Packages
Full Vertical Solver Coverage
8
Trilinos Statistics
Stats Trilinos Download Page 8/18/2006.
9
Trilinos Package Concepts
Package The Atomic Unit
10
Trilinos Packages
  • Trilinos is a collection of Packages.
  • Each package is
  • Focused on important, state-of-the-art algorithms
    in its problem regime.
  • Developed by a small team of domain experts.
  • Self-contained No explicit dependencies on any
    other software packages (with some special
    exceptions).
  • Configurable/buildable/documented on its own.
  • Sample packages NOX, AztecOO, ML, IFPACK, Meros.
  • Special package collections
  • Petra (Epetra, Tpetra, Jpetra) Concrete Data
    Objects
  • Thyra Abstract Conceptual Interfaces
  • Teuchos Common Tools.
  • New_package Jumpstart prototype.

11
Greek Names
12
Trilinos Package Summary
Basic Linear Algebra classes
Block Krylov Methods
Scalable Preconditioners
3rd Party Direct Solver Package.
Abstract Interfaces.
13
Why Packages?
Lets Do Some Matrix Analysis!
14
Trilinos Development Team (Again)
Ross Bartlett Lead Developer of Thyra Developer
of Rythmos Paul Boggs Developer of Thyra Todd
Coffey Lead Developer of Rythmos Jason Cross
Developer of Jpetra David Day Developer of
Komplex Clark Dohrmann Developer of
CLAPS Michael Gee Developer of ML, NOX Bob
Heaphy Lead developer of Trilinos SQA Mike
Heroux Trilinos Project Leader Lead Developer
of Epetra, AztecOO, Kokkos, Komplex, IFPACK,
Thyra, Tpetra Developer of Amesos, Belos,
EpetraExt, Jpetra Ulrich Hetmaniuk Developer of
Anasazi
Robert Hoekstra Lead Developer of
EpetraExt Developer of Epetra, Thyra,
Tpetra Russell Hooper Developer of NOX Vicki
Howle Lead Developer of Meros Developer of Belos
and Thyra Jonathan Hu Developer of ML Sarah
Knepper Developer of Komplex Tammy Kolda Lead
Developer of NOX Joe Kotulski Lead Developer of
Pliris Rich Lehoucq Developer of Anasazi and
Belos Kevin Long Lead Developer of Thyra,
Developer of Belos and Teuchos Roger Pawlowski
Lead Developer of NOX Michael Phenow Trilinos
Webmaster Lead Developer of New_Package
Eric Phipps Developer of LOCA and NOX Marzio
Sala Lead Developer of Didasko and
IFPACK Developer of ML, Amesos Andrew Salinger
Lead Developer of LOCA Paul Sexton Developer
of Epetra and Tpetra Bill Spotz Lead Developer
of PyTrilinos Developer of Epetra,
New_Package Ken Stanley Lead Developer of
Amesos and New_Package Heidi Thornquist Lead
Developer of Anasazi, Belos and Teuchos Ray
Tuminaro Lead Developer of ML and Meros Jim
Willenbring Developer of Epetra and New_Package.
Trilinos library manager Alan Williams
Developer of Epetra, EpetraExt, AztecOO, Tpetra
15
Developer and Package Node IDs
  • Developer IDs
  • RossBartlett 1
  • PaulBoggs 2
  • ToddCoffey 3
  • JasonCross 4
  • DavidDay 5
  • ClarkDohrmann 6
  • MichaelGee 7
  • BobHeaphy 8
  • MikeHeroux 9
  • UlrichHetmaniuk 10
  • RobHoekstra 11
  • RussellHooper 12
  • VickiHowle 13
  • JonathanHu 14
  • SarahKnepper 15
  • TammyKolda 16
  • JoeKotulski 17
  • Package IDs
  • PyTrilinos 1
  • amesos 2
  • anasazi 3
  • aztecoo 4
  • belos 5
  • claps 6
  • didasko 7
  • epetra 8
  • epetraext 9
  • ifpack 10
  • jpetra 11
  • kokkos 12
  • komplex 13
  • meros 14
  • loca 15
  • ml 16

16
Developer-Package EdgesA(i,j) 1 if developer i
contributes to package j
  • A sparse(31,24)
  • A(RossBartlett,rythmos) 1
  • A(RossBartlett,thyra) 1
  • A(PaulBoggs,thyra) 1
  • A(ToddCoffey,rythmos) 1
  • A(JasonCross,jpetra) 1
  • A(DavidDay,komplex) 1
  • A(ClarkDohrmann,claps) 1
  • A(MichaelGee,ml) 1
  • A(MichaelGee,nox) 1
  • A(BobHeaphy,trilinosframework) 1
  • A(MikeHeroux,trilinosframework) 1
  • A(MikeHeroux,epetra) 1
  • A(MikeHeroux,aztecoo) 1
  • A(MikeHeroux,kokkos) 1
  • A(MikeHeroux,komplex) 1
  • A(MikeHeroux,ifpack) 1
  • A(MikeHeroux,thyra) 1

A(MarzioSala,didasko) 1 A(MarzioSala,ifpack)
1 A(MarzioSala,ml) 1 A(MarzioSala,amesos)
1 A(AndrewSalinger,loca) 1 A(PaulSexton,epetra
) 1 A(PaulSexton,tpetra) 1 A(BillSpotz,PyTri
linos) 1 A(BillSpotz,epetra)
1 A(BillSpotz,new_package) 1 A(KenStanley,ames
os) 1 A(KenStanley,new_package)
1 A(HeidiThornquist,anasazi)
1 A(HeidiThornquist,belos) 1 A(HeidiThornquist
,teuchos) 1 A(RayTuminaro,ml)
1 A(RayTuminaro,meros) 1 A(JimWillenbring,epet
ra) 1 A(JimWillenbring,new_package)
1 A(JimWillenbring,trilinosframework)
1 A(AlanWilliams,epetra) 1 A(AlanWilliams,epet
raext) 1 A(AlanWilliams,aztecoo)
1 A(AlanWilliams,tpetra) 1
A(RobHoekstra,epetra) 1 A(RobHoekstra,thyra)
1 A(RobHoekstra,tpetra) 1 A(RobHoekstra,epetra
ext) 1 A(RussellHooper,nox)
1 A(VickiHowle,meros) 1 A(VickiHowle,belos)
1 A(VickiHowle,thyra) 1 A(JonathanHu,ml)
1 A(SarahKnepper,komplex) 1 A(TammyKolda,nox)
1 A(TammyKolda,trilinosframework)
1 A(JoeKotulski,pliris) 1 A(RichLehoucq,anasaz
i) 1 A(RichLehoucq,belos) 1 A(KevinLong,thyr
a) 1 A(KevinLong,belos) 1 A(KevinLong,teucho
s) 1 A(RogerPawlowski,nox)
1 A(MichaelPhenow,trilinosframework)
1 A(MichaelPhenow,trilinosframework)
1 A(EricPhipps,loca) 1 A(EricPhipps,nox) 1
17
Developer-Package Matrix
spy(A)
  • Number of developers per package
  • Maximum max(sum(A)) 6
  • Average sum(sum(A))/24 2.875
  • Number of package affiliations per developer
  • Maximum max(sum(A)) 12 (me).
  • Minus outlier 4.
  • Average sum(sum(A))/31 2.26
  • Observations
  • Small package teams.
  • Multiple packages per developer.

18
Developer-Developer Matrix (AA)
Komplex Subteam
Anasazi Subteam
ML Subteam
Thyra Subteam
  • B AA
  • Apply Symmetric AMD reordering.
  • Two developers connected if co-authors of a
    package.
  • Only two developers disconnected
  • Clark Dohrmann CLAPS.
  • Joe Kotulski Pliris.

Epetra Subteam
NOX Subteam
19
Package-Package Matrix (AA)(Not sure what to
say about this)
  • B AA, Apply symamd.
  • Two packages connected if they share a developer.
  • Only two packages disconnected
  • Clark Dohrmann CLAPS.
  • Joe Kotulski Pliris.

Without Me
Me
Meros, AztecOO, Tpetra, EpetraExt
Epetra, Amesos, Didasko, Ifpack
20
Why Packages?
Partial Answer Small Inter-related teams.
21
Package Interoperability
22
Interoperability vs. Dependence (Can Use)
(Depends On)
  • Although most Trilinos packages have no explicit
    dependence, each package must interact with some
    other packages
  • NOX needs operator, vector and solver objects.
  • AztecOO needs preconditioner, matrix, operator
    and vector objects.
  • Interoperability is enabled at configure time.
    For example, NOX
  • --enable-nox-lapack compile NOX lapack
    interface libraries
  • --enable-nox-epetra compile NOX epetra
    interface libraries
  • --enable-nox-petsc compile NOX petsc
    interface libraries
  • Trilinos configure script is vehicle for
  • Establishing interoperability of Trilinos
    components
  • Without compromising individual package autonomy.
  • Trilinos offers seven basic interoperability
    mechanisms.

../configure enable-python
23
Trilinos Interoperability Mechanisms(Acquired as
Package Matures)
24
Can Use vs. Depends On
  • Can Use
  • Interoperable without dependence.
  • Dense is Good.
  • Encouraged.
  • Depends On
  • OK, if essential.
  • Epetra 9 clients.
  • Thyra, Teuchos, NOX 2 clients.
  • Discouraged.

25
Interoperability Example ML
  • ML Multi-level Preconditioner Package.
  • Primary Developers Ray Tuminaro, Jonathan Hu,
    Marzio Sala.
  • No explicit, essential dependence on other
    Trilinos packages.
  • Uses abstract interfaces to matrix/operator
    objects.
  • Has independent configure/build process (but can
    be invoked at Trilinos level).
  • Interoperable with other Trilinos packages and
    other libraries
  • Accepts user data as Epetra matrices/vectors.
  • Can use Epetra for internal matrices/vectors.
  • Can be used via Thyra abstract interfaces.
  • Can be built via Trilinos configure/build
    process.
  • Available as preconditioner to all other Trilinos
    packages.
  • Can use IFPACK, Amesos, AztecOO objects as
    smoothers, coarse solvers.
  • Can be driven via Teuchos ParameterLists.
  • Available to PETSc users without dependence on
    any other Trilinos packages.

26
Package Maturation Process
Asynchronicity
27
Day 1 of Package Life
  • CVS Each package is self-contained in
    Trilinos/package/ directory.
  • Bugzilla Each package has its own Bugzilla
    product.
  • Bonsai Each package is browsable via Bonsai
    interface.
  • Mailman Each Trilinos package, including
    Trilinos itself, has four mail lists
  • package-checkins_at_software.sandia.gov
  • CVS commit emails. Finger on the pulse list.
  • package-developers_at_software.sandia.gov
  • Mailing list for developers.
  • package-users_at_software.sandia.gov
  • Issues for package users.
  • package-announce_at_software.sandia.gov
  • Releases and other announcements specific to the
    package.
  • New_package (optional) Customizable boilerplate
    for
  • Autoconf/Automake/Doxygen/Python/Thyra/Epetra/Test
    Harness/Website

28
Sample Package Maturation Process
Startup Steps
Maturation Steps
29
Maturation Jumpstart NewPackage
  • NewPackage provides jump start to
    develop/integrate a new package
  • NewPackage is a Hello World program and
    website
  • Simple but it does work with autotools.
  • Compiles and builds.
  • NewPackage directory contains
  • Commonly used directory structure src, test,
    doc, example, config.
  • Working Autoconf/Automake files.
  • Documentation templates (doxygen).
  • Working regression test setup.
  • Working Python and Thyra adaptors.
  • Substantially cuts down on
  • Time to integrate new package.
  • Variation in package integration details.
  • Development of website.

NOTE NewPackage can be used independent from
Trilinos
30
What Trilinos is not
  • Trilinos is not a single monolithic piece of
    software. Each package
  • Can be built independent of Trilinos.
  • Has its own self-contained CVS structure.
  • Has its own Bugzilla product and mail lists.
  • Development team is free to make its own
    decisions about algorithms, coding style, release
    contents, testing process, etc.
  • Trilinos top layer is not a large amount of
    source code
  • Trilinos repository (6.0 branch) contains
    660,378 source lines of code (SLOC).
  • Sum of the packages SLOC counts 648,993.
  • Trilinos top layer SLOC count 11,385
    (1.7).
  • Trilinos is not indivisible
  • You dont need all of Trilinos to get things
    done.
  • Any collection of packages can be combined and
    distributed.
  • Current public release contains only 16 of the
    20 Trilinos packages.

31
Insight from HistoryA Philosophy for Future
Directions
  • In the early 1800s U.S. had many new
    territories.
  • Question How to incorporate into U.S.?
  • Colonies? No.
  • Expand boundaries of existing states? No.
  • Create process for self-governing regions. Yes.
  • Theme Local control drawing on national
    resources.
  • Trilinos package architecture has some
    similarities
  • Asynchronous maturation.
  • Packages decide degree of interoperations, use of
    Trilinos facilities.
  • Strength of each Scalable growth with local
    control.

32
Solver Collaboration The Big Picture
33
Solver Software Components and Interfaces
ThyraNonlin
  • Example Trilinos Packages
  • Belos (linear solvers)
  • Anasazi (eigen solvers)
  • NOX (nonlinear equations)
  • Rhythmos (ODEs,DAEs)
  • MOOCHO (Optimization)

ANA
ANA/APP Interface
APP
Thyra ANA Interfaces to Linear Algebra
ANA Linear Operator Interface
ANA Vector Interface
  • Examples
  • SIERRA
  • NEVADA
  • Xyce
  • Sundance

LAL
  • Example Trilinos Packages
  • Epetra/Tpetra (Mat,Vec)
  • Ifpack, AztecOO, ML (Preconditioners)
  • Meros (Preconditioners)
  • Pliris (Interface to direct solvers)
  • Amesos (Direct solvers)
  • Komplex (Complex/Real forms)

Matrix
Preconditioner
FEI/Thyra APP to LAL Interfaces
Custom/Thyra LAL to LAL Interfaces
Vector
Types of Software Components
1) ANA Abstract Numerical Algorithm (e.g.
linear solvers, eigen solvers, nonlinear solvers,
stability analysis, uncertainty quantification,
transient solvers, optimization etc.)
2) LAL Linear Algebra Library (e.g. vectors,
sparse matrices, sparse factorizations,
preconditioners)
3) APP Application (the model physics,
discretization method etc.)
34
LAL Foundation Petra
  • Petra provides a common language for
    distributed linear algebra objects (operator,
    matrix, vector)
  • Petra provides distributed matrix and vector
    services.
  • Has 3 implementations under development.

35
Petra Object Model
  • Perform redistribution of distributed objects
  • Parallel permutations.
  • Ghosting of values for local computations.
  • Collection of partial results from remote
    processors.
  • Base Class for All Distributed Objects
  • Performs all communication.
  • Requires Check, Pack, Unpack methods from
    derived class.
  • Abstract Interface for Sparse All-to-All
    Communication
  • Supports construction of pre-recorded plan for
    data-driven communications.
  • Examples
  • Supports gathering/scatter of off-processor x/y
    values when computing y Ax.
  • Gathering overlap rows for Overlapping Schwarz.
  • Redistribution of matrices, vectors, etc
  • Abstract Interface to Parallel Machine
  • Shameless mimic of MPI interface.
  • Keeps MPI dependence to a single class (through
    all of Trilinos!).
  • Allow trivial serial implementation.
  • Opens door to novel parallel libraries (shmem,
    UPC, etc)
  • Graph class for structure-only computations
  • Reusable matrix structure.
  • Pattern-based preconditioners.
  • Pattern-based load balancing tools.
  • Basic sparse matrix class
  • Flexible construction process.
  • Arbitrary entry placement on parallel machine.
  • Describes layout of distributed objects
  • Vectors Number of vector entries on each
    processor and global ID
  • Matrices/graphs Rows/Columns managed by a
    processor.
  • Called Maps in Epetra.
  • Dense Distributed Vector and Matrices
  • Simple local data structure.
  • BLAS-able, LAPACK-able.
  • Ghostable, redistributable.
  • RTOp-able.

36
Petra Implementations
  • Three version under development
  • Epetra (Essential Petra)
  • Current production version.
  • Restricted to real, double precision arithmetic.
  • Uses stable core subset of C (circa 2000).
  • Interfaces accessible to C and Fortran users.
  • Tpetra (Templated Petra)
  • Next generation C version.
  • Templated scalar and ordinal fields.
  • Uses namespaces, and STL Improved
    usability/efficiency.
  • Jpetra (Java Petra)
  • Pure Java. Portable to any JVM.
  • Interfaces to Java versions of MPI, LAPACK and
    BLAS via interfaces.

37
Data Model Wrap-Up
  • Our target apps require flexible data model.
  • Petra Object Model (POM) supports
  • Arbitrary placement of vector, graph, matrix
    entries on parallel machine.
  • Arbitrary redistribution, ghosting and collection
    of distributed data.
  • This flexibility is needed by LALs
  • Algebraic and multi-level preconditioners.
  • Concrete distributed matrix kernels.
  • Direct methods.
  • POM is complex Non-LALs (ANAs) do not rely on it.

38
Abstract Numerical Algorithms (ANAs)
39
Trilinos Packages
Full Vertical Solver Coverage
40
Goal of Abstraction Flexibility
  • Linear solvers (Direct, Iterative,
    Preconditioners)
  • Abstraction of basic vector/matrix operations
    (dot, axpy, mv).
  • Can use any concrete linear algebra library
    (Epetra, PETSc, BLAS).
  • Nonlinear solvers (Newton, etc.)
  • Abstraction of linear solve (solve Axb).
  • Can use any concrete linear solver library
    (AztecOO, ML, PETSc, LAPACK).
  • Transient/DAE solvers (implicit)
  • Abstraction of nonlinear solve.
  • and so on.
  • But how do we really make it work, and retain
    high performance?
  • Our answer Thyra.

41
Introducing Abstract Numerical Algorithms
What is an abstract numerical algorithm (ANA)?
An ANA is a numerical algorithm that can be
expressed abstractly solely in terms of vectors,
vector spaces, and linear operators (i.e. not
with direct element access)
Example Linear ANA (LANA) Linear Conjugate
Gradients
Types of operations
Types of objects
Linear Conjugate Gradient Algorithm
42
Examples of Nonlinear Abstract Numerical
Algorithms
GMRES (e.g. Belos)
Linear equations
Class of Numerical Problem
Example ANA
Newtons method (e.g. NOX, MOOCHO)
Nonlinear equations
Requires
Backward Euler method (e.g. Rythmos)
Initial Value DAE/ODE
Requires
43
Basic Thyra Vector and Operator Interfaces
ltltcreategtgt
44
Basic Thyra Vector and Operator Interfaces
  • Basically compatible with HCL (Symes et al.) and
    many other operator/vector interfaces
  • Near optimal for many but not all abstract
    numerical algorithms (ANAs)
  • Whats missing?
  • gt Multi-vectors!

ltltcreategtgt
45
Introducing Multi-Vectors
What is a multi-vector?
Example m 4 columns
  • An m multi-vector V is a tall thin dense matrix
    composed of m column vectors vj
  • What ANAs can exploit multi-vectors?
  • Block linear solvers (e.g. block GMRES)
  • Block eigen solvers (i.e. block Arnoldi)
  • Compact limited memory quasi-Newton
  • Tensor methods for nonlinear equations
  • Operator applications
  • (i.e. mat-vecs)
  • Why are multi-vectors important?
  • Cache performance
  • Reduce global communication

Y A X
  • Block dot products (m2)

Examples of multi-vector operations
Q XT Y
  • Block update

Y Y X Q
46
Basic Thyra Vector and Operator Interfaces
Where do multi-vectors fit in?
ltltcreategtgt
47
Basic Thyra Vector and Operator Interfaces
LinearOps apply to MultiVectors
A MulitVector is a linear operator
ltltcreategtgt
VectorSpaces create MultiVectors
A MulitVector is a tall thin dense matrix
A Vector is a MultiVector
A MulitVector has a collection of column vectors
48
Basic Thyra Vector and Operator Interfaces
ltltcreategtgt
What about standard vector ops? Reductions
(norm, dot etc.)? Transformations (axpy,
scaling etc.)?
What about specialized vector ops? e.g.
Interior point methods for opt
49
Whats the Big Deal about Vector-Vector
Operations?
Many different and unusual vector operations are
needed by interior point methods for optimization!
Currently in MOOCHO gt 40 vector operations!
50
Vector Reduction/Transformation Operators Defined
  • Reduction/Transformation Operators (RTOp) Defined
  • z 1i z qi opt( i , v 1i v pi , z 1i z qi
    ) element-wise transformation
  • b opr( i , v 1i v pi , z 1i z
    qi ) element-wise reduction
  • b2 oprr( b1 , b2 )
    reduction of intermediate
    reduction objects
  • v 1 v p Î R n p non-mutable (constant)
    input vectors
  • z 1 z q Î R n q mutable (non-constant)
    input/output vectors
  • b reduction target
    object (many be non-scalar (e.g. yk ,k), or
    NULL)
  • Key to Optimal Performance
  • opt() and opr() applied to entire sets of
    subvectors (i ab) independently
  • z 1ab z qab , b op( a, b , v 1ab v pab
    , z 1ab z qab , b )
  • Communication between sets of subvectors only
    for b ¹ NULL, oprr( b1 , b2 ) b2
  • Software Implementation (see RTOp paper on
    Trilinos website Publications)
  • Visitor design pattern (a.k.a. function
    forwarding)

RTOpT
Vector
apply_op(in sv1..p, inout sz1q, inout beta )
applyOp( in RTOpT, apply_op(in v1..p ,
inout z1q, inout beta)

ROpDotProd
TOpAssignVectors

SerialVectorStd
MPIVectorStd
apply_op( )
apply_op( )
applyOp( )
applyOp( )
51
Basic Thyra Vector and Operator Interfaces
  • The missing piece
  • Reduction/Transformation Operators
  • Supports all needed vector operations
  • Data/parallel independence
  • Optimal performance

R. A. Bartlett, B. G. van Bloemen Waanders and M.
A. Heroux. Vector Reduction/Transformation
Operators, ACM TOMS, March 2004
52
Managing Parallel Data
  • The Petra Object Model

53
A Simple Epetra/AztecOO Program
// Header files omitted int main(int argc, char
argv) MPI_Init(argc,argv) // Initialize
MPI, MpiComm Epetra_MpiComm Comm(
MPI_COMM_WORLD )
// Create x and b vectors
Epetra_Vector x(Map) Epetra_Vector b(Map)
b.Random() // Fill RHS with random s

// Map puts same number of equations on
each pe int NumMyElements 1000
Epetra_Map Map(-1, NumMyElements, 0, Comm)
int NumGlobalElements Map.NumGlobalElements()
// Create Linear Problem
Epetra_LinearProblem problem(A, x, b)

// Create/define AztecOO instance, solve
AztecOO solver(problem)
solver.SetAztecOption(AZ_precond, AZ_Jacobi)
solver.Iterate(1000, 1.0E-8)
// Create an Epetra_Matrix
tridiag(-1,2,-1) Epetra_CrsMatrix
A(Copy, Map, 3) double negOne -1.0 double
posTwo 2.0 for (int i0 iltNumMyElements
i) int GlobalRow A.GRID(i) int
RowLess1 GlobalRow - 1 int RowPlus1
GlobalRow 1 if (RowLess1!-1)
A.InsertGlobalValues(GlobalRow, 1, negOne,
RowLess1) if (RowPlus1!NumGlobalElements)
A.InsertGlobalValues(GlobalRow, 1,
negOne, RowPlus1) A.InsertGlobalValues(Glob
alRow, 1, posTwo, GlobalRow)
A.FillComplete() // Transform from GIDs to LIDs
// Report results, finish
cout ltlt "Solver
performed " ltlt solver.NumIters() ltlt
" iterations." ltlt endl ltlt "Norm of true
residual " ltlt solver.TrueResidual()
ltlt endl MPI_Finalize() return
0
54
Typical Flow of Epetra Object Construction
  • Any number of Comm objects can exist.
  • Comms can be nested (ee.g., serial within MPI).

Construct Comm
  • Maps describe parallel layout.
  • Maps typically associated with more than one
    comp object.
  • Two maps (source and target) define an
    export/import object.

Construct Map
  • Computational objects.
  • Compatibility assured via common map.

Construct x
Construct b
Construct A
55
Parallel Data Redistribution
  • Epetra vectors, multivectors, graphs and matrices
    are distributed via one of the map objects.
  • A map is basically a partitioning of a list of
    global IDs
  • IDs are simply labels, no need to use contiguous
    values (Directory class handles details for
    general ID lists).
  • No a priori restriction on replicated IDs.
  • If we are given
  • A source map and
  • A set of vectors, multivectors, graphs and
    matrices (or other distributable objects) based
    on source map.
  • Redistribution is performed by
  • Specifying a target map with a new distribution
    of the global IDs.
  • Creating Import or Export object using the source
    and target maps.
  • Creating vectors, multivectors, graphs and
    matrices that are redistributed (to target map
    layout) using the Import/Export object.

56
Example epetra/ex9.cpp
  • int main(int argc, char argv)
  • MPI_Init(argc, argv)
  • Epetra_MpiComm Comm(MPI_COMM_WORLD)
  • int NumGlobalElements 4 // global dimension
    of the problem
  • int NumMyElements // local nodes
  • Epetra_IntSerialDenseVector MyGlobalElements
  • if( Comm.MyPID() 0 )
  • NumMyElements 3
  • MyGlobalElements.Size(NumMyElements)
  • MyGlobalElements0 0
  • MyGlobalElements1 1
  • MyGlobalElements2 2
  • else
  • NumMyElements 3
  • MyGlobalElements.Size(NumMyElements)
  • MyGlobalElements0 1
  • MyGlobalElements1 2
  • MyGlobalElements2 3
  • // create a vector based on map
  • Epetra_Vector xxx(Map)
  • for( int i0 iltNumMyElements i )
  • xxxi 10( Comm.MyPID()1 )
  • if( Comm.MyPID() 0 )
  • double val 12
  • int pos 3
  • xxx.SumIntoGlobalValues(1,0,val,pos)
  • cout ltlt xxx
  • // create a target map, in which all elements
    are on proc 0
  • int NumMyElements_target
  • if( Comm.MyPID() 0 )
  • NumMyElements_target NumGlobalElements
  • else
  • NumMyElements_target 0
  • Epetra_Map TargetMap(-1,NumMyElements_target,0,C
    omm)
  • Epetra_Export Exporter(Map,TargetMap)
  • // work on vectors

57
Output epetra/ex9.cpp
Before Export
After Export
  • gt mpirun -np 2 ./ex9.exe
  • EpetraVector
  • MyPID GID Value
  • 0 0 10
  • 0 1 10
  • 0 2 10
  • EpetraVector
  • 1 1 20
  • 1 2 20
  • 1 3 20
  • EpetraVector
  • MyPID GID Value
  • 0 0 10
  • 0 1 30
  • 0 2 30
  • 0 3 20
  • EpetraVector

PE 0 xxx(0)10 xxx(1)10 xxx(2)10
PE 0 yyy(0)10 yyy(1)30 yyy(2)30 yyy(3)20
Export/Add
PE 1 xxx(1)20 xxx(2)20 xxx(3)20
PE 1
58
Import vs. Export
  • Import (Export) means calling processor knows
    what it wants to receive (send).
  • Distinction between Import/Export is important to
    user, almost identical in implementation.
  • Import (Export) objects can be used to do an
    Export (Import) as a reverse operation.
  • When mapping is bijective (1-to-1 and onto),
    either Import or Export is appropriate.

59
Example 1D Matrix Assembly
-uxx f u(a) ?0 u(b) ?1
PE 0
PE 1
a
b
x1
x2
x3
  • 3 Equations Find u at x1, x2 and x3
  • Equation for u at x2 gets a contribution from PE
    0 and PE 1.
  • Would like to compute partial contributions
    independently.
  • Then combine partial results.

60
Two Maps
  • We need two maps
  • Assembly map
  • PE 0 1, 2 .
  • PE 1 2, 3 .
  • Solver map
  • PE 0 1, 2 (we arbitrate ownership of 2).
  • PE 1 3 .

61
End of Assembly Phase
  • At the end of assembly phase we have
    AssemblyMatrix
  • On PE 0
  • On PE 1
  • Want to assign all of Equation 2 to PE 0 for
    usewith solver.
  • NOTE For a class of Neumann-Neumann
    preconditioners, the above layout is exactly what
    we want.

Equation 1Equation 2
Row 2 is shared
Equation 2Equation 3
62
Export Assembly Matrix to Solver Matrix
Epetra_Export Exporter(AssemblyMap,
SolverMap) Epetra_CrsMatrix SolverMatrix (Copy,
SolverMap, 0) SolverMatrix.Export(AssemblyMatrix
, Exporter, Add) SolverMatrix.FillComplete()
63
Matrix Export
After Export
Before Export
PE 0
PE 0
Equation 1Equation 2
Equation 1Equation 2
Export/Add
PE 1
PE 1
Equation 2Equation 3
Equation 3
64
Example epetraext/ex2.cpp
  • int main(int argc, char argv)
  • MPI_Init(argc,argv)
  • Epetra_MpiComm Comm (MPI_COMM_WORLD)
  • int MyPID Comm.MyPID()
  • int n4
  • // Generate Laplacian2d gallery matrix
  • Trilinos_UtilCrsMatrixGallery G("laplace_2d",
    Comm)
  • G.Set("problem_size", nn)
  • G.Set("map_type", "linear") // Linear map
    initially
  • // Get the LinearProblem.
  • Epetra_LinearProblem Prob G.GetLinearProblem(
    )
  • // Get the exact solution.
  • Epetra_MultiVector sol G.GetExactSolution()
  • // Get the rhs (b) and lhs (x)
  • Epetra_MultiVector b Prob-gtGetRHS()
  • // Repartition graph using Zoltan
  • EpetraExtZoltan_CrsGraph ZoltanTrans new
    EpetraExtZoltan_CrsGraph()
  • EpetraExtLinearProblem_GraphTrans
    ZoltanLPTrans
  • new EpetraExtLinearProblem_GraphTrans(
    (dynamic_castltEpetraExtStructuralSameTypeTra
    nsformltEpetra_CrsGraphgtgt(ZoltanTrans)) )
  • cout ltlt "Creating Load Balanced Linear
    Problem\n"
  • Epetra_LinearProblem BalancedProb
    (ZoltanLPTrans)(Prob)
  • // Get the rhs (b) and lhs (x)
  • Epetra_MultiVector Balancedb Prob-gtGetRHS()
  • Epetra_MultiVector Balancedx Prob-gtGetLHS()
  • cout ltlt "Balanced b " ltlt Balancedb ltlt endl
  • cout ltlt "Balanced x " ltlt Balancedx ltlt endl
  • MPI_Finalize()
  • return 0

65
Need for Import/Export
  • Solvers for complex engineering applications need
    expressive, easy-to-use parallel data
    redistribution
  • Allows better scaling for non-uniform overlapping
    Schwarz.
  • Necessary for robust solution of multiphysics
    problems.
  • We have found import and export facilities to be
    a very natural and powerful technique to address
    these issues.

66
Details about Epetra Maps
  • Getting beyond standard use case

67
1-to-1 Maps
  • 1-to-1 map (defn) A map is 1-to-1 if each GID
    appears only once in the map (and is therefore
    associated with only a single processor).
  • Certain operations in parallel data
    repartitioning require 1-to-1 maps.
    Specifically
  • The source map of an import must be 1-to-1.
  • The target map of an export must be 1-to-1.
  • The domain map of a 2D object must be 1-to-1.
  • The range map of a 2D object must be 1-to-1.

68
2D Objects Four Maps
  • Epetra 2D objects
  • CrsMatrix
  • CrsGraph
  • VbrMatrix
  • Have four maps
  • RowMap On each processor, the GIDs of the rows
    that processor will manage.
  • ColMap On each processor, the GIDs of the
    columns that processor will manage.
  • DomainMap The layout of domain objects (the x
    vector/multivector in yAx).
  • RangeMap The layout of range objects (the y
    vector/multivector in yAx).

Typically a 1-to-1 map
Typically NOT a 1-to-1 map
Must be 1-to-1 maps!!!
69
Sample Problem
x
y
A

70
Case 1 Standard Approach
  • First 2 rows of A, elements of y and elements of
    x, kept on PE 0.
  • Last row of A, element of y and element of x,
    kept on PE 1.

PE 0 Contents
PE 1 Contents
  • RowMap 0, 1
  • ColMap 0, 1, 2
  • DomainMap 0, 1
  • RangeMap 0, 1
  • RowMap 2
  • ColMap 1, 2
  • DomainMap 2
  • RangeMap 2
  • Notes
  • Rows are wholly owned.
  • RowMapDomainMapRangeMap (all 1-to-1).
  • ColMap is NOT 1-to-1.
  • Call to FillComplete A.FillComplete() // Assumes

Original Problem
y
A
x

71
Case 2 Twist 1
  • First 2 rows of A, first element of y and last 2
    elements of x, kept on PE 0.
  • Last row of A, last 2 element of y and first
    element of x, kept on PE 1.

PE 0 Contents
PE 1 Contents
  • RowMap 0, 1
  • ColMap 0, 1, 2
  • DomainMap 1, 2
  • RangeMap 0
  • RowMap 2
  • ColMap 1, 2
  • DomainMap 0
  • RangeMap 1, 2
  • Notes
  • Rows are wholly owned.
  • RowMap is NOT DomainMap is NOT
    RangeMap (all 1-to-1).
  • ColMap is NOT 1-to-1.
  • Call to FillComplete A.FillComplete(DomainMap,
    RangeMap)

Original Problem
y
A
x

72
Case 2 Twist 2
  • First row of A, part of second row of A, first
    element of y and last 2 elements of x, kept on PE
    0.
  • Last row, part of second row of A, last 2 element
    of y and first element of x, kept on PE 1.

PE 0 Contents
PE 1 Contents
  • RowMap 0, 1
  • ColMap 0, 1
  • DomainMap 1, 2
  • RangeMap 0
  • RowMap 1, 2
  • ColMap 1, 2
  • DomainMap 0
  • RangeMap 1, 2
  • Notes
  • Rows are NOT wholly owned.
  • RowMap is NOT DomainMap is NOT
    RangeMap (all 1-to-1).
  • RowMap and ColMap are NOT 1-to-1.
  • Call to FillComplete A.FillComplete(DomainMap,
    RangeMap)

Original Problem
y
A
x

73
Overview of Trilinos Packages
74
Trilinos Common Language Petra
  • Petra provides a common language for
    distributed linear algebra objects (operator,
    matrix, vector)
  • Petra1 provides distributed matrix and vector
    services.
  • Exists in basic form as an object model
  • Describes basic user and support classes in UML,
    independent of language/implementation.
  • Describes objects and relationships to build and
    use matrices, vectors and graphs.
  • Has 3 implementations under development.

1Petra is Greek for foundation.
75
Petra Implementations
  • Three version under development
  • Epetra (Essential Petra)
  • Current production version.
  • Restricted to real, double precision arithmetic.
  • Uses stable core subset of C (circa 2000).
  • Interfaces accessible to C and Fortran users.
  • Tpetra (Templated Petra)
  • Next generation C version.
  • Templated scalar and ordinal fields.
  • Uses namespaces, and STL Improved
    usability/efficiency.
  • Jpetra (Java Petra)
  • Pure Java. Portable to any JVM.
  • Interfaces to Java versions of MPI, LAPACK and
    BLAS via interfaces.
  • Developers
  • Mike Heroux, Rob Hoekstra, Alan Williams, Paul
    Sexton

76
Epetra
  • Basic Stuff What you would expect.
  • Variable block matrix data structures.
  • Multivectors.
  • Arbitrary index labeling.
  • Flexible, versatile parallel data redistribution.
  • Language support for inheritance, polymorphism
    and extensions.
  • View vs. Copy.

77
AztecOO
  • Krylov subspace solvers CG, GMRES, Bi-CGSTAB,
  • Incomplete factorization preconditioners
  • Aztec is the workhorse solver at Sandia
  • Extracted from the MPSalsa reacting flow code.
  • Installed in dozens of Sandia apps.
  • 1900 external licenses.
  • AztecOO improves on Aztec by
  • Using Epetra objects for defining matrix and RHS.
  • Providing more preconditioners/scalings.
  • Using C class design to enable more
    sophisticated use.
  • AztecOO interfaces allows
  • Continued use of Aztec for functionality.
  • Introduction of new solver capabilities outside
    of Aztec.
  • Developers
  • Mike Heroux, Alan Williams, Ray Tuminaro

78
IFPACK Algebraic Preconditioners
  • Overlapping Schwarz preconditioners with
    incomplete factorizations, block relaxations,
    block direct solves.
  • Accept user matrix via abstract matrix interface
    (Epetra versions).
  • Uses Epetra for basic matrix/vector calculations.
  • Supports simple perturbation stabilizations and
    condition estimation.
  • Separates graph construction from factorization,
    improves performance substantially.
  • Compatible with AztecOO, ML, Amesos. Can be used
    by NOX and ML.
  • Developers
  • Marzio Sala, Mike Heroux

79
Amesos
  • Interface to direct solver for distributed sparse
    linear systems
  • Challenges
  • No single solver dominates
  • Different interfaces and data formats, serial and
    parallel
  • Interface often change between revisions
  • Amesos offers
  • A single, clear, consistent interface, to various
    packages
  • Common look-and-feel for all classes
  • Separation from specific solver details
  • Use serial and distributed solvers Amesos takes
    care of data redistribution
  • Developers
  • Ken Stanley, Marzio Sala, Tim Davis

80
Calling Amesos
  • Epetra_LinearProblem Problem( A, x, b)
  • Amesos Afact
  • Amesos_BaseSolver Solver Afact.Create( Klu,
    Problem )
  • Solver-gtSymbolicFactorization( )
  • Solver-gtNumericFactorization( )
  • Solver-gtSolve( )

81
Amesos Supported Libraries
Others DSCPACK, PARDISO, TAUCS
1Matrices can be distributed over any number of
processes
82
ML Multi-level Preconditioners
  • Smoothed aggregation, multigrid and domain
    decomposition preconditioning package
  • Critical technology for scalable performance of
    some key apps.
  • ML compatible with other Trilinos packages
  • Accepts user data as Epetra_RowMatrix object
    (abstract interface). Any implementation of
    Epetra_RowMatrix works.
  • Implements the Epetra_Operator interface. Allows
    ML preconditioners to be used with AztecOO and
    TSF.
  • Can also be used completely independent of other
    Trilinos packages.
  • Developers
  • Ray Tuminaro, Jonathan Hu, Marzio Sala

83
NOX Nonlinear Solvers
  • Suite of nonlinear solution methods
  • NOX uses abstract vector and group interfaces
  • Allows flexible selection and tuning of various
    strategies
  • Directions.
  • Line searches.
  • Epetra/AztecOO/ML, LAPACK, PETSc implementations
    of abstract vector/group interfaces.
  • Designed to be easily integrated into existing
    applications.
  • Developers
  • Tammy Kolda, Roger Pawlowski

84
LOCA
  • Library of continuation algorithms
  • Provides
  • Zero order continuation 
  • First order continuation 
  • Arc length continuation 
  • Multi-parameter continuation (via Henderson's MF
    Library)
  • Turning point continuation 
  • Pitchfork bifurcation continuation 
  • Hopf bifurcation continuation 
  • Phase transition continuation 
  • Eigenvalue approximation (via ARPACK or Anasazi)
  • Developers
  • Andy Salinger, Eric Phipps

85
EpetraExt Extensions to Epetra
  • Library of useful classes not needed by everyone
  • Most classes are types of transforms.
  • Examples
  • Graph/matrix view extraction.
  • Epetra/Zoltan interface.
  • Explicit sparse transpose.
  • Singleton removal filter, static condensation
    filter.
  • Overlapped graph constructor, graph colorings.
  • Permutations.
  • Sparse matrix-matrix multiply.
  • Matlab, MatrixMarket I/O functions.
  • Most classes are small, useful, but non-trivial
    to write.
  • Developer
  • Robert Hoekstra, Alan Williams, Mike Heroux

86
Teuchos
  • Utility package of commonly useful tools
  • ParameterList class key/value pair database,
    recursive capabilities.
  • LAPACK, BLAS wrappers (templated on ordinal and
    scalar type).
  • Dense matrix and vector classes (compatible with
    BLAS/LAPACK).
  • FLOP counters, Timers.
  • Ordinal, Scalar Traits support Definition of
    zero, one, etc.
  • Reference counted pointers, and more
  • Takes advantage of advanced features of C
  • Templates
  • Standard Template Library (STL)
  • ParameterList
  • Allows easy control of solver parameters.
  • Developers
  • Roscoe Barlett, Kevin Long, Heidi Thorquist,
    Mike Heroux, Paul Sexton, Kris Kampshoff

87
Belos and Anasazi
  • Next generation linear solvers (Belos) and
    eigensolvers (Anasazi) libraries, written in
    templated C.
  • Provide a generic interface to a collection of
    algorithms for solving large-scale linear
    problems and eigenproblems.
  • Algorithm implementation is accomplished through
    the use of abstract base classes (mini
    interface). Interfaces are derived from these
    base classes to operator-vector products, status
    tests, and any arbitrary linear algebra library.
  • Includes block linear solvers and eigensolvers.
  • Developers
  • Heidi Thorquist, Mike Heroux, Chris Baker, Rich
    Lehoucq

88
Kokkos
  • Goal
  • Isolate key non-BLAS kernels for the purposes of
    optimization.
  • Kernels
  • Dense vector/multivector updates and collective
    ops (not in BLAS/Teuchos).
  • Sparse MV, MM, SV, SM.
  • Serial-only for now.
  • Reference implementation provided (templated).
  • Mechanism for improving performance
  • Default is aggressive compilation of reference
    source.
  • BeBOP Jim Demmel, Kathy Yelick, Rich Vuduc, UC
    Berkeley.
  • Vector version Cray.
  • Developer
  • Mike Heroux

89
NewPackage Package
  • NewPackage provides jump start to
    develop/integrate a new package
  • NewPackage is a Hello World program and
    website
  • Simple but it does work with autotools.
  • Compiles and builds.
  • NewPackage directory contains
  • Commonly used directory structure src, test,
    doc, example, config.
  • Working autotools files.
  • Documentation templates (doxygen).
  • Working regression test setup.
  • Substantially cuts down on
  • Time to integrate new package.
  • Variation in package integration details.
  • Development of website.

90
CLAPS Package
CLAPSClarks Linear Algebra Suite
  • Clark Dohrmann
  • Expert in computational structural mechanics.
  • Unfamiliar with parallel computing. Limited C
    experience.
  • Epetra
  • Advanced parallel basic linear algebra package.
  • Supports sparse, dense linear algebra and
    parallel data redistribution.
  • ClarkEpetra
  • CLOP Overlapping Schwarz preconditioner.
  • Scales with increasing number of constraints.
  • sleg1,2,4x joint models, 12 modes, vplant
    cluster (previously unsolvable).

3 Months to develop. Depends only on Epetra.
Compatible with rest of Trilinos.
91
Pliris Package
  • Migration of Linpack benchmark code to supported
    environment.
  • Used new_package as starting point.
  • Put under source control for the first time.
  • Portable build process for the first time.
  • Documented distributed data model (Epetra).
  • Has its own Bugzilla product, mail lists,
    regression tests.
  • Source code browsable via Bonsai.
  • And more
  • But Pliris is still an independent piece of
    codeand better off after the process.

92
Interface Packages
  • Thyra and PyTrilinos

93
Thyra sillyCGSolve
  • templateltclass Scalargt
  • bool sillyCgSolve(
  • const ThyraLinearOpBaseltScalargt A ,const
    ThyraVectorBaseltScalargt b
  • ,const int maxNumIters ,const typename
    TeuchosScalarTraitsltScalargtmagnitudeType
    tolerance
  • ,ThyraVectorBaseltScalargtx ,stdostream
    out NULL )
  • using TeuchosRefCountPtr
  • typedef TeuchosScalarTraitsltScalargt ST
    // We need to use ScalarTraits to support
    arbitrary types.
  • typedef typename STmagnitudeType
    ScalarMag // This is the type returned from a
    vector norm.
  • typedef one STone())
  • // Get the vector space (domain and range spaces
    should be the same)
  • RefCountPtrltconst ThyraVectorSpaceBaseltScalargt
    gt space A.domain()
  • // Compute initial residual r b - Ax
  • RefCountPtrltThyraVectorBaseltScalargt gt r
    createMember(space)
  • Thyraassign(r,b)
    // r b
  • Thyraapply(A,ThyraNOTRANS,x, one, one)
    // r -Ax r
  • const ScalarMag r0_nrm Thyranorm(r)
    // Compute r0 sqrt(ltr0,r0gt) for convergence
    test
  • if(r0_nrm STzero()) return true
    // Trivial RHS and initial LHS guess?
  • Generic CG code.
  • To use, provide
  • ThyraLinearOp.
  • ThyraVector.
  • Works on any parallel machine.
  • Allows mixing of vectors matrices
  • PETSc Epetra.
  • Works with any scalar datatype.

94
PyTrilinos
  • PyTrilinos provides Python access to Trilinos
    packages.
  • Uses SWIG to generate bindings.
  • Epetra, AztecOO, IFPACK, ML, NOX, LOCA,
    Amesosand NewPackage are support.
  • Possible to
  • Define RowMatrix implementation in Python.
  • Use from Trilinos C code.
  • Performance for large grain is equivalent to C.
  • Several times hit for very fine grain code.

95
  • include "mpi.h"
  • include "Epetra_MpiComm.h"
  • include "Epetra_Vector.h"
  • include "Epetra_Time.h"
  • include "Epetra_RowMatrix.h"
  • include "Epetra_CrsMatrix.h"
  • include "Epetra_Time.h"
  • include "Epetra_LinearProblem.h"
  • include "Trilinos_Util_CrsMatrixGallery.h"
  • using namespace Trilinos_Util
  • int main(int argc, char argv)
  • MPI_Init(argc, argv)
  • Epetra_MpiComm Comm(MPI_COMM_WORLD)
  • int nx 1000
  • int ny 1000 Comm.NumProc()
  • CrsMatrixGallery Gallery("laplace_2d", Comm)
  • Gallery.Set("ny", ny)
  • Gallery.Set("nx", nx)

C vs. Python Equivalent Code
! /usr/bin/env python from PyTrilinos import
Epetra, Triutils Epetra.Init() Comm
Epetra.PyComm() nx 1000 ny 1000
Comm.NumProc() Gallery Triutils.CrsMatrixGallery
("laplace_2d", Comm) Gallery.Set("nx",
nx) Gallery.Set("ny", ny) Gallery.Set("problem_siz
e", nx ny) Gallery.Set("map_type",
"linear") Matrix Gallery.GetMatrix() LHS
Gallery.GetStartingSolution() RHS
Gallery.GetRHS() Time Epetra.Time(Comm) for i
in xrange(10) Matrix.Multiply(False, LHS,
RHS) print Time.ElapsedTime() Epetra.Finalize()
96
Templates and Generic Programming
97
Investment in Generic Programming
  • 2nd Generation Trilinos packages are templated
    on
  • OrdinalType (think int).
  • ScalarType (think double).
  • ExamplesTeuchosSerialDenseMatrixltint, doublegt
    A TeuchosSerialDenseMatrixltshort, floatgt B
  • Scalar/Ordinal Traits mechanism completes support
    for genericity.
  • The following packages support templatesTeuchos
    (Basic Tools)Thyra (Abstract Interfaces)Tpetra
    (including MPI support)Belos (Krylov and Block
    Krylov Linear), IFPACK (algebraic
    preconditioners, next version), Anasazi
    (Eigensolvers),

98
Potential Benefits of Templated Types
  • Templated scalar (and ordinal) types have great
    potential
  • Generic Algorithms expressed over abstract
    field.
  • High Performance Partial template specialization
    Fortran.
  • Facilitate variety of algorithmic studies.
  • Allow study of asymptotic behavior of
    discretizations.
  • Facilitate debugging Reduces FP error as source
    error.
  • Use your imagination
  • All new Trilinos packages are templated.

99
Kokkos Results Sparse MVPentium M 1.6GHz
Cygwin/GCC 3.2 (WinXP Laptop)
100
ARPREC
  • The ARPREC library uses arrays of 64-bit
    floating-point numbers to represent
    high-precision floating-point numbers.
  • ARPREC values behave just like any other
    floating-point datatype, except the maximum
    working precision (in decimal digits) must be
    specified before any calculations are done
  • mpmp_init(200)
  • Illustrate the use of ARPREC with an example
    using Hilbert matrices.

101
Hilbert Matrices
  • A Hilbert matrix HN is a square N-by-N matrix
    such that
  • For Example

102
Hilbert Matrices
  • Notoriously ill-conditioned
  • ?(H3) ? 524
  • ?(H5) ? 476610
  • ?(H10) ? 1.6025 x 1013
  • ?(H20) ? 7.8413 x 1017
  • ?(H100) ? 1.7232 x 1020
  • Hilbert matrices introduce large amounts of error

103
Hilbert Matrices and Cholesky Factorization
  • With double-precision arithmetic, Cholesky
    factorization will fail for HN for all N gt 13.
  • Can we improve on this using arbitrary-precision
    floating-point numbers?

104
New Trilinos 7.0 Packages
105
Galeri Isorropia
  • Galeri Generates Epetra maps and matrices for
    testing.
  • A set of functionalities that are very close to
    that of the MATLAB's gallery() function.
  • Capabilities to create several well-know finite
    difference matrices.
  • A simple finite element code that can be used to
    discretize scalar second order elliptic problems
    using Galerkin and SUPG techniques, on both 2D
    and 3D unstructured grids.
  • Isorropia Repartitioning/rebalancing.
  • Assists with redistributing objects such as
  • Matrices and matrix-graphs in a parallel
    execution setting.
  • Allows for more efficient computations.
  • Primarily an interface to the Zoltan library.
  • Can be built and used with minimal capability
    without Zoltan.

106
Moertel Mooc
About PowerShow.com