Title: Design of a C++ Matrix Library for FEM Computing Based on Generic Programming
1Design of a C++ Matrix Library for FEM Computing
Based on Generic Programming
Zhou Jianhui
2Existing linear algebra libraries (1)
3Existing linear algebra libraries (2)
- Summary
- 1. C++ code is by no means inherently slower
than Fortran code. After optimization, C++
matrix libraries perform as well as, or even
better than, the corresponding Fortran code.
- 2. Almost all high-performance C++ matrix
libraries use generic programming techniques to
optimize performance. Generic programming has
proved to be of key importance to
high-performance numerical computation.
- 3. C++ matrix libraries have a more convenient
interface. Operator overloading in C++ makes
writing a program as easy as writing algebraic
expressions. For example, to add two matrices
you can write A = B + C, rather than call
addition(A, B, C).
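The interface point in item 3 can be illustrated with a minimal sketch. The `Matrix` type below is hypothetical and written only for this illustration; it is not taken from any of the libraries discussed, and it assumes a C++11 or later compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical dense matrix type, row-major storage (illustration only).
struct Matrix {
    std::size_t rows, cols;
    std::vector<double> a;

    Matrix(std::size_t r, std::size_t c, double v = 0.0)
        : rows(r), cols(c), a(r * c, v) {}

    double& operator()(std::size_t i, std::size_t j) { return a[i * cols + j]; }
    double operator()(std::size_t i, std::size_t j) const { return a[i * cols + j]; }
};

// Overloaded operator+ lets callers write A = B + C.
Matrix operator+(const Matrix& x, const Matrix& y) {
    assert(x.rows == y.rows && x.cols == y.cols);  // shapes must match
    Matrix out(x.rows, x.cols);
    for (std::size_t i = 0; i < out.a.size(); ++i)
        out.a[i] = x.a[i] + y.a[i];
    return out;  // a new matrix object is returned by value
}
```

With this in place, `Matrix A = B + C;` reads like the algebraic expression. Note that this straightforward version builds its result in a separate matrix object, which is the cost that the expression-template slide later addresses.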
4Overview
- As the new C++ matrix library is oriented
towards FEM computing, it provides only
operations on dense, symmetric and banded
matrices, vectors, sub-matrices, etc., rather
than the full range of linear algebra
operations. It does, however, provide
optimizations specific to FEM computing.
- The library was developed with reference to
the existing matrix libraries mentioned above,
so it shares some similarities with them,
resulting from the same optimization techniques
based on generic programming.
- Notably, the library makes some significant
improvements, including a novel architecture, an
upgraded optimization technique and an efficient
memory management policy. The library achieves
better performance, a better user interface and
greater capacity for further expansion.
5Architecture
6User Interface(1)
Some of the user interface
7User Interface(2)
Some of the user interface
8User Interface(3)
Some of the user interface
9Implementing optimization with C++ generic
programming
- Two of the most widely used optimization
techniques are
- Expression templates.
- Template metaprograms.
10Expression templates eliminate temporary matrices
- Matrix addition A = B + C
- Fortran code will act as follows
- while C++ code usually generates temporary
variables implicitly and actually acts as follows
- Such a temporary matrix occupies too much
space and is time-consuming to create. These
drawbacks can be eliminated by expression
templates, making C++ code as efficient as
Fortran code.
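A minimal sketch of the expression-template idea follows, using vectors for brevity. The names `Vec` and `Sum` are hypothetical and are not the library's own implementation; the sketch assumes C++11 or later. The `+` operator returns a lightweight node instead of a computed result, and the element-wise loop runs only once, inside the assignment.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Node describing an unevaluated sum; it stores references, not data.
template <typename L, typename R>
struct Sum {
    const L& lhs;
    const R& rhs;
    double operator[](std::size_t i) const { return lhs[i] + rhs[i]; }
    std::size_t size() const { return lhs.size(); }
};

// Hypothetical vector type (illustration only).
struct Vec {
    std::vector<double> data;

    explicit Vec(std::size_t n, double v = 0.0) : data(n, v) {}

    double operator[](std::size_t i) const { return data[i]; }
    double& operator[](std::size_t i) { return data[i]; }
    std::size_t size() const { return data.size(); }

    // Assigning an expression evaluates it in one loop, with no temporary Vec.
    template <typename L, typename R>
    Vec& operator=(const Sum<L, R>& e) {
        assert(e.size() == size());
        for (std::size_t i = 0; i < size(); ++i) data[i] = e[i];
        return *this;
    }
};

// Vec + Vec builds a node instead of computing a result.
inline Sum<Vec, Vec> operator+(const Vec& a, const Vec& b) { return {a, b}; }

// (expression) + Vec extends the expression tree.
template <typename L, typename R>
Sum<Sum<L, R>, Vec> operator+(const Sum<L, R>& a, const Vec& b) { return {a, b}; }
```

Writing `a = b + c + d;` with these definitions performs a single pass over the elements. The nodes hold references to their operands, so in this sketch an expression must be assigned within the same statement that builds it.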
11Template metaprograms minimize the cost of loops
- Inner product of vectors, dot = V1 · V2,
suppose the size of V1 and V2 is 3
- General code can be depicted as follows
(C++ line first, Fortran line second)
- float dot = 0;
DOT = 0
- for (int i = 0; i < 3; i++)
DO 100 I = 1, 3
- dot += v1[i] * v2[i];
100 DOT = DOT + V1(I) * V2(I)
- But with the size of the two vectors given in
advance, the code can be made more efficient as
follows
- dot = v1[0] * v2[0] + v1[1] * v2[1]
+ v1[2] * v2[2];
- or DOT = V1(1) * V2(1) + V1(2) * V2(2)
+ V1(3) * V2(3)
- The essential difference between them is that
the latter code saves the runtime cost of the
loop, such as incrementing the loop variable.
- With template metaprograms, C++ can generate
code of the latter form automatically at compile
time, provided that the size of the matrix or
vector is known beforehand.
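The compile-time unrolling described above can be sketched with a recursive template. The names `Dot` and `dot` are hypothetical and chosen for this illustration; the library's actual metaprograms are not shown in the source.

```cpp
#include <cassert>
#include <cstddef>

// Recursive case: the product of the last pair plus the dot product
// of the first N-1 pairs. The recursion is resolved by the compiler.
template <std::size_t N>
struct Dot {
    static double eval(const double* a, const double* b) {
        return Dot<N - 1>::eval(a, b) + a[N - 1] * b[N - 1];
    }
};

// Base case: ends the compile-time recursion.
template <>
struct Dot<0> {
    static double eval(const double*, const double*) { return 0.0; }
};

// Convenience wrapper: N is deduced from the array size.
template <std::size_t N>
double dot(const double (&a)[N], const double (&b)[N]) {
    return Dot<N>::eval(a, b);
}

// Usage example: N is deduced as 3 here.
inline void dot_example() {
    const double v1[3] = {1.0, 2.0, 3.0};
    const double v2[3] = {4.0, 5.0, 6.0};
    assert(dot(v1, v2) == 32.0);  // 1*4 + 2*5 + 3*6
}
```

For `N = 3`, `Dot<3>::eval` expands to a sum of three products with no loop counter. An optimizing compiler can typically inline the nested calls, although that is up to the compiler rather than guaranteed by the language.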
12Highlights of the new matrix library
- 1. Improved implementation of expression
templates, which further simplifies the user
interface.
- 2. Efficient memory management. The library
manages memory itself instead of relying on the
operating system.
- 3. New library architecture, as mentioned before.
- 4. Some specific optimizations for FEM computing.
- (1) Special storage strategies and the
corresponding algorithms for special matrices,
such as the stiffness matrix.
- (2) Use template metaprograms as frequently
as possible.
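The slides do not describe the memory management policy of item 2. As one possible illustration only, the sketch below shows a simple buffer pool that keeps released buffers and hands them out again, so that repeated matrix operations need fewer trips to the system allocator. The class `BufferPool` and its member names are assumptions made for this example, not the library's design.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical pool that caches released buffers for reuse.
class BufferPool {
public:
    // Returns a zero-filled buffer of n doubles, reusing a cached
    // buffer when one with enough capacity is available.
    std::vector<double> acquire(std::size_t n) {
        for (std::size_t i = 0; i < free_.size(); ++i) {
            if (free_[i].capacity() >= n) {
                std::vector<double> buf = std::move(free_[i]);
                free_.erase(free_.begin() + static_cast<std::ptrdiff_t>(i));
                buf.assign(n, 0.0);
                return buf;
            }
        }
        return std::vector<double>(n, 0.0);  // nothing suitable cached
    }

    // Hands a buffer back to the pool instead of freeing it.
    void release(std::vector<double> buf) { free_.push_back(std::move(buf)); }

    // Number of buffers currently cached.
    std::size_t cached() const { return free_.size(); }

private:
    std::vector<std::vector<double>> free_;
};
```

A matrix class could call `acquire` in its constructor and `release` in its destructor; whether the library works this way is not stated in the source.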
13Specific optimization for FEM computing
- (1) Specific storage strategies and the
corresponding algorithms for specific matrices
- Example
- In FEM computing, since K is a symmetric
matrix, the multiplication algorithm can be
optimized so that the amount of calculation is
reduced by half
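One reading of this example is sketched below, under the assumption that K is formed as a product of the form Bᵀ·D·B with D symmetric (the slide's own formula is not preserved in the source). Because K is then symmetric, only its upper triangle is computed and each value is mirrored, which roughly halves the work of the final multiplication. The `Dense` type and the function name are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical row-major dense matrix (illustration only).
struct Dense {
    std::size_t rows, cols;
    std::vector<double> a;

    Dense(std::size_t r, std::size_t c) : rows(r), cols(c), a(r * c, 0.0) {}

    double& operator()(std::size_t i, std::size_t j) { return a[i * cols + j]; }
    double operator()(std::size_t i, std::size_t j) const { return a[i * cols + j]; }
};

// Computes K = B^T * D * B, assuming D is symmetric so that K is too.
Dense symmetric_triple_product(const Dense& B, const Dense& D) {
    assert(D.rows == D.cols && D.rows == B.rows);
    const std::size_t m = B.rows, n = B.cols;

    Dense DB(m, n);  // DB = D * B, computed in full
    for (std::size_t i = 0; i < m; ++i)
        for (std::size_t j = 0; j < n; ++j)
            for (std::size_t k = 0; k < m; ++k)
                DB(i, j) += D(i, k) * B(k, j);

    Dense K(n, n);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = i; j < n; ++j) {  // upper triangle only
            double s = 0.0;
            for (std::size_t k = 0; k < m; ++k) s += B(k, i) * DB(k, j);
            K(i, j) = s;
            K(j, i) = s;  // mirror into the lower triangle
        }
    }
    return K;
}
```

A packed or banded storage scheme could avoid storing the mirrored half at all; this sketch keeps full storage for clarity.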
14FEM computing-specific optimization
- (2) Use template metaprograms as widely as
possible.
- Example: for an eight-node isoparametric element
- The sizes of all the above matrices are fixed;
this is exactly where template metaprograms can
be used to optimize performance.
- In fact, there are so many matrix operations
in which the matrices and vectors have fixed
sizes that template metaprograms can be used
frequently to achieve high performance.
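A minimal sketch of how fixed sizes can be carried in the type is given below. The `FixedMatrix` template is hypothetical, and the element dimensions mentioned in the note after the code are an assumption, since the slide's matrices are not preserved in the source. With the dimensions as template parameters, every loop bound is a compile-time constant and mismatched shapes are rejected by the compiler.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical matrix whose dimensions are part of its type.
template <std::size_t R, std::size_t C>
struct FixedMatrix {
    std::array<double, R * C> a{};  // zero-initialized, no heap allocation

    double& operator()(std::size_t i, std::size_t j) { return a[i * C + j]; }
    double operator()(std::size_t i, std::size_t j) const { return a[i * C + j]; }
};

// The shared inner dimension N must match, or the call does not compile.
template <std::size_t R, std::size_t N, std::size_t C>
FixedMatrix<R, C> operator*(const FixedMatrix<R, N>& x, const FixedMatrix<N, C>& y) {
    FixedMatrix<R, C> out;
    for (std::size_t i = 0; i < R; ++i)
        for (std::size_t j = 0; j < C; ++j)
            for (std::size_t k = 0; k < N; ++k)
                out(i, j) += x(i, k) * y(k, j);
    return out;
}
```

Assuming, for illustration, a plane eight-node element with two degrees of freedom per node, the strain matrix could be declared as `FixedMatrix<3, 16>` and the elasticity matrix as `FixedMatrix<3, 3>`, and their product would have the type `FixedMatrix<3, 16>`.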
15Performance Comparison
- The program is compiled with Visual C++ 7.1
and run 500000 times on an Intel P?550E machine.
For instance, the matrix operations for the
isoparametric element in the program can be
depicted as follows
16