A Large-Grained Parallel Algorithm for Nonlinear Eigenvalue Problems Using Complex Contour Integration

- Takeshi Amako, Yusaku Yamamoto and Shao-Liang Zhang
- Dept. of Computational Science and Engineering
- Nagoya University, Japan

Outline of the talk

- Introduction
  - The nonlinear eigenvalue problem
  - Existing algorithms
  - Our objective
- The algorithm
  - Formulation as a nonlinear equation
  - Application of Kravanja et al.'s method
  - Detecting and removing spurious eigenvalues
- Numerical results
  - Accuracy of the computed eigenvalues
  - Parallel performance
- Conclusion

Introduction

- The nonlinear eigenvalue problem
  - Given $A(z) \in \mathbb{C}^{n \times n}$, where $z$ is a complex parameter.
  - Find $z_1 \in \mathbb{C}$ such that $A(z_1) x = 0$ has a nonzero solution $x = x_1$.
  - $z_1$ and $x_1$ are called the eigenvalue and the corresponding eigenvector, respectively.
- Examples
  - $A(z) = A + zB + z^2 C$: quadratic eigenvalue problem
  - $A(z) = A + zB + e^z C$: general nonlinear eigenvalue problem
- Applications
  - Electronic structure calculation
  - Nonlinear elasticity
  - Theoretical fluid dynamics
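
To make the definition concrete, the following minimal sketch builds a small quadratic eigenvalue problem and checks the defining condition numerically. The 2x2 coefficient matrices and the sign convention $A(z) = A + zB + z^2 C$ are illustrative assumptions, not taken from the talk; NumPy is assumed to be available.

```python
import numpy as np

# Hypothetical 2x2 coefficients, chosen so the eigenvalues can be checked by hand.
A = np.diag([2.0, -3.0])
B = np.diag([-3.0, 2.0])
C = np.eye(2)

def A_of_z(z):
    """Quadratic matrix function A(z) = A + z B + z^2 C (sign convention assumed)."""
    return A + z * B + z**2 * C

# Diagonal entries of A(z): 2 - 3z + z^2 = (z - 1)(z - 2) and -3 + 2z + z^2 = (z + 3)(z - 1),
# so z1 = 2 is an eigenvalue with eigenvector x1 = (1, 0).
z1 = 2.0
x1 = np.array([1.0, 0.0])

residual = np.linalg.norm(A_of_z(z1) @ x1)      # ||A(z1) x1||, zero for an eigenpair
det_at_z1 = np.linalg.det(A_of_z(z1))           # det A(z) vanishes at an eigenvalue
det_elsewhere = np.linalg.det(A_of_z(0.5))      # and is nonzero at a generic point

print(residual, det_at_z1, det_elsewhere)
```

The same check (a vanishing determinant at an eigenvalue) is what allows the problem to be recast as finding the zeros of a scalar function.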

Existing algorithms

- Multivariate Newton's method and its variants
  - Locally quadratic convergence
  - Requires good initial estimates for both $z_1$ and $x_1$.
- Nonlinear Arnoldi methods
- Nonlinear Jacobi-Davidson methods
  - Efficient for large sparse matrices
  - Not suitable for finding all eigenvalues within a specified region of the complex plane

Our objective

- Let
  - $\Gamma$ be a closed Jordan curve in the complex plane,
  - $A(z) \in \mathbb{C}^{n \times n}$ be an analytic function of $z$ in $\Gamma$.
- We propose an algorithm that
  - can find all the eigenvalues within $\Gamma$, and
  - has large-grain parallelism.

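Assumption: In the following, we mainly consider the case where $\Gamma$ is a circle centered at the origin with radius $r$.

[Figure: the circle $\Gamma$ of radius $r$ centered at the origin O in the complex plane; axes Re z and Im z.]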

Related work: Sakurai et al. propose an algorithm for linear generalized eigenvalue problems.

Our approach

- The basic idea
  - Let $f(z) = \det(A(z))$.
  - Then $f(z)$ is an analytic function of $z$ in $\Gamma$, and the eigenvalues of $A(z)$ are characterized as the zeros of $f(z)$.
  - Use Kravanja's method (Kravanja et al., 1999) to find the zeros of an analytic function.

Finding zeros of f(z)

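- Let
  - $z_1, z_2, \ldots, z_m$ be the zeros of $f(z)$ in $\Gamma$, and
  - $n_1, n_2, \ldots, n_m$ be their multiplicities.
- Then $f(z)$ can be written as
  $$f(z) = g(z) \prod_{k=1}^{m} (z - z_k)^{n_k},$$
  where $g(z)$ is analytic and nonzero in $\Gamma$.
- Define the complex moments by
  $$\mu_p = \frac{1}{2\pi i} \oint_{\Gamma} z^p \, \frac{f'(z)}{f(z)} \, dz \qquad (p = 0, 1, \ldots, 2m-1).$$
- Then, since $\frac{f'(z)}{f(z)} = \sum_{k=1}^{m} \frac{n_k}{z - z_k} + \frac{g'(z)}{g(z)}$ and $g'(z)/g(z)$ is analytic in $\Gamma$,
  $$\mu_p = \sum_{k=1}^{m} n_k z_k^p.$$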
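
As a small numerical illustration of the moment identity, the sketch below approximates $\mu_p = \frac{1}{2\pi i}\oint_\Gamma z^p f'(z)/f(z)\,dz$ on the unit circle with the trapezoidal rule and compares it with $\sum_k n_k z_k^p$. The scalar function $f(z) = (z - 0.3)^2 (z + 0.5i)(z - 2)$ is a hypothetical example chosen for this sketch; it has a double zero and a simple zero inside the circle and one zero outside.

```python
import cmath

def log_derivative(z):
    """f'(z)/f(z) for the hypothetical f(z) = (z - 0.3)^2 (z + 0.5i) (z - 2)."""
    return 2 / (z - 0.3) + 1 / (z + 0.5j) + 1 / (z - 2)

def complex_moment(p, r=1.0, K=64):
    """Trapezoidal approximation of mu_p on the circle |z| = r with K sample points.

    With z = r exp(i theta), dz = i z d(theta), so the integrand becomes
    z^(p+1) f'(z)/f(z) averaged over the sample points.
    """
    total = 0
    for j in range(K):
        zeta = r * cmath.exp(2j * cmath.pi * j / K)
        total += zeta ** (p + 1) * log_derivative(zeta)
    return total / K

mu = [complex_moment(p) for p in range(3)]

# Expected values from the zeros inside the circle: 0.3 (multiplicity 2) and -0.5i (multiplicity 1).
expected = [2 * 0.3**p + (-0.5j) ** p for p in range(3)]
print(mu)
print(expected)
```

Note that $\mu_0$ counts the zeros inside the contour with multiplicity, and the zero at $z = 2$ outside the contour contributes only an exponentially small quadrature error.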

Finding zeros of f(z) (cont'd)

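- To extract information on $z_k$ from $\mu_p$, define the following matrices:
  $$H_m = [\mu_{i+j}]_{i,j=0}^{m-1}, \qquad H_m^{<} = [\mu_{i+j+1}]_{i,j=0}^{m-1},$$
  $$V_m = [z_k^{\,i}]_{i=0,\ldots,m-1;\ k=1,\ldots,m}, \qquad D_m = \mathrm{diag}(n_1, \ldots, n_m), \qquad \Lambda_m = \mathrm{diag}(z_1, \ldots, z_m).$$
- Then it is easy to see that
  $$H_m = V_m D_m V_m^T, \qquad H_m^{<} = V_m D_m \Lambda_m V_m^T.$$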

Finding zeros of f(z) (cont'd)

- Noting that $V_m$ and $D_m$ are nonsingular, we have the following equivalence relation:
  $\lambda$ is an eigenvalue of $H_m^{<} - \lambda H_m$ $\Leftrightarrow$ $\lambda$ is an eigenvalue of $\Lambda_m - \lambda I$ $\Leftrightarrow$ $\exists k,\ \lambda = z_k$.
- That is, we can find the zeros of $f(z)$ in $\Gamma$ by
  - computing the complex moments $\mu_0, \mu_1, \ldots, \mu_{2m-1}$,
  - constructing $H_m$ and $H_m^{<}$, and
  - computing the eigenvalues of $H_m^{<} - \lambda H_m$.
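
The equivalence can be checked numerically. The sketch below assumes the usual Hankel construction $H_m = [\mu_{i+j}]$ and $H_m^{<} = [\mu_{i+j+1}]$ (indices from 0 to $m-1$) and uses hypothetical zeros and multiplicities; the moments are generated directly from $\mu_p = \sum_k n_k z_k^p$ rather than by integration.

```python
import numpy as np

zeros = np.array([0.3, -0.5j, 0.2 + 0.4j])   # hypothetical distinct zeros z_k
mult = np.array([2, 1, 1])                   # hypothetical multiplicities n_k
m = len(zeros)

# Exact moments mu_p = sum_k n_k z_k^p for p = 0, ..., 2m-1.
mu = np.array([np.sum(mult * zeros**p) for p in range(2 * m)])

# Hankel matrix and shifted Hankel matrix built from the moments.
H = np.array([[mu[i + j] for j in range(m)] for i in range(m)])
H_lt = np.array([[mu[i + j + 1] for j in range(m)] for i in range(m)])

# H is nonsingular here, so the eigenvalues of the pencil H_lt - lambda*H
# are the ordinary eigenvalues of H^{-1} H_lt.
recovered = np.linalg.eigvals(np.linalg.solve(H, H_lt))

print(np.sort_complex(recovered))
print(np.sort_complex(zeros))
```

The multiplicities enter only through $D_m$, so the pencil returns each distinct zero once.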

Application to the nonlinear eigenvalue problem

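- In our case, $f(z) = \det(A(z))$ and
  $$\frac{f'(z)}{f(z)} = \mathrm{Tr}\left( A(z)^{-1} \frac{dA(z)}{dz} \right).$$
- By applying the trapezoidal rule with $K$ points, we have
  $$\mu_p \approx \frac{1}{K} \sum_{j=0}^{K-1} \zeta_j^{\,p+1} \, \mathrm{Tr}\left( A(\zeta_j)^{-1} \frac{dA}{dz}(\zeta_j) \right),$$
- where $\zeta_j = r \exp(2\pi i j / K)$ $(j = 0, 1, \ldots, K-1)$ are the sample points on $\Gamma$.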

The algorithm
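
Putting the preceding steps together, one possible sketch of the procedure is shown below. It is not the authors' C/MPI implementation: the function name `contour_eigenvalues`, the 6x6 example matrix, and the choice $M = 3$ (taken equal to the true number of eigenvalues inside the circle, to keep the example free of spurious solutions) are all assumptions made for illustration. The sketch relies on Jacobi's formula $f'(z)/f(z) = \mathrm{Tr}(A(z)^{-1} A'(z))$ for $f(z) = \det A(z)$ and on the trapezoidal rule over a circle of radius $r$ centered at the origin.

```python
import numpy as np

def contour_eigenvalues(A, dA, r, K, M):
    """Approximate the eigenvalues of A(z) inside |z| < r (illustrative sketch).

    A, dA : callables returning A(z) and dA/dz as n x n arrays
    K     : number of sample points on the circle
    M     : Hankel matrix size (estimate of the number of eigenvalues inside)
    """
    zeta = r * np.exp(2j * np.pi * np.arange(K) / K)          # sample points
    # t_j = Tr(A(zeta_j)^{-1} A'(zeta_j)) = f'(zeta_j) / f(zeta_j) with f = det A
    t = np.array([np.trace(np.linalg.solve(A(z), dA(z))) for z in zeta])
    # Trapezoidal approximation of the moments mu_0, ..., mu_{2M-1}
    mu = np.array([np.mean(zeta ** (p + 1) * t) for p in range(2 * M)])
    H = np.array([[mu[i + j] for j in range(M)] for i in range(M)])
    H_lt = np.array([[mu[i + j + 1] for j in range(M)] for i in range(M)])
    # Eigenvalues of the pencil H_lt - lambda*H
    return np.linalg.eigvals(np.linalg.solve(H, H_lt)), mu

# Hypothetical example: A(z) = A0 - z I + eps * exp(z) * J, with J the exchange matrix.
n = 6
A0 = np.diag([0.2, -0.4, 0.5j, 1.5, -2.0, 3.0])
J = np.fliplr(np.eye(n))
eps = 0.01
A = lambda z: A0 - z * np.eye(n) + eps * np.exp(z) * J
dA = lambda z: -np.eye(n) + eps * np.exp(z) * J

lam, mu = contour_eigenvalues(A, dA, r=1.0, K=128, M=3)

# Check each computed value: A(lambda) should be numerically singular.
sigma_min = [np.linalg.svd(A(z), compute_uv=False)[-1] for z in lam]
print(lam)
print(mu[0].real, sigma_min)
```

Each evaluation of `t` at a sample point is independent of the others, which is where most of the work lies; `mu[0]` also gives an estimate of the number of eigenvalues inside the circle.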

Detecting and removing spurious eigenvalues

- Usually, we do not know $m$, the number of eigenvalues of $A(z)$ in $\Gamma$, in advance and use some estimate $M$ instead.
- When $M > m$, the eigenvalues of $H_M^{<} - \lambda H_M$ include spurious solutions that do not correspond to an eigenvalue of $A(z)$.
- To detect them, we compute the corresponding eigenvector by inverse iteration and evaluate the relative residual.
- Of course, this quantity can also be used to check the accuracy of the computed eigenvalues.
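
A minimal sketch of such a check is given below. The exact residual formula from the slide is not available here, so the definition $\|A(z)x\|_2 / (\|A(z)\|_F \|x\|_2)$, the helper name `relative_residual`, and the 3x3 example are assumptions made for illustration.

```python
import numpy as np

def relative_residual(A_z, iters=3, seed=0):
    """Inverse iteration on A(z) for a candidate eigenvalue z, then a relative residual.

    Assumed definition: ||A(z) x||_2 / (||A(z)||_F ||x||_2).
    Note: np.linalg.solve raises LinAlgError if A(z) is exactly singular;
    computed eigenvalues are normally only approximate, so A(z) is merely
    close to singular.
    """
    rng = np.random.default_rng(seed)
    n = A_z.shape[0]
    x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
    for _ in range(iters):
        x = np.linalg.solve(A_z, x)      # amplifies the near-null-space component
        x /= np.linalg.norm(x)
    return np.linalg.norm(A_z @ x) / (np.linalg.norm(A_z) * np.linalg.norm(x))

# Hypothetical example: A(z) = A0 - z I with eigenvalues 0.2, -0.4, 1.5.
A0 = np.diag([0.2, -0.4, 1.5])
A = lambda z: A0 - z * np.eye(3)

res_genuine = relative_residual(A(0.2 + 1e-13))   # slightly inexact true eigenvalue
res_spurious = relative_residual(A(0.7))          # a value that is not an eigenvalue

print(res_genuine, res_spurious)
```

A candidate whose residual stays many orders of magnitude above the others can then be discarded as spurious.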

Numerical results

- Test problem
  - $A(z) = A - zI + \varepsilon B(z)$, where
    - $A$: real random nonsymmetric matrix
    - $B(z)$: antidiagonal matrix with antidiagonal elements $e^z$
    - $\varepsilon$: parameter to specify the strength of nonlinearity
- Parameters
  - $n$ = 500, 1000, 2000
  - $\varepsilon$ = 0, $10^{-4}$, $10^{-3}$, $10^{-2}$, $10^{-1}$
- Computational environment
  - Fujitsu HPC2500 (SPARC 64IV), 1-16 processors
  - Program written in C with MPI
  - LAPACK routines were used to compute $(A(z))^{-1}$ and to compute the eigenvalues of $H_M^{<} - \lambda H_M$.

Accuracy of the computed eigenvalues

- Parameters
  - $n$ = 500 and $\varepsilon$ = 0.1
  - $r$ = 0.85, $K$ = 128 and $M$ = 11.
  - There are 7 eigenvalues in $\Gamma$.
- Results
  - Our algorithm succeeded in locating all the eigenvalues in $\Gamma$.
  - The relative residuals were all under $10^{-10}$.
  - Similar results for other cases.

[Figure: plot in the complex plane; axes Re z and Im z.]

Effect of K and M on the accuracy

- Effect of the number of sample points $K$
  - Usually $K$ = 128 gives sufficient accuracy.
- Effect of the Hankel matrix size $M$
  - It is better to take $M$ a few more than the number of eigenvalues within $\Gamma$ (7 in this case).
  - This is to mitigate the perturbation from eigenvalues outside $\Gamma$.

[Figures: residuals as a function of K; residuals as a function of M.]

Detecting and removing spurious eigenvalues

- Parameters
  - $n$ = 1000 and $\varepsilon$ = 0.01
  - $r$ = 0.7, $K$ = 128 and $M$ = 10.
  - There are 9 eigenvalues in $\Gamma$.
- Eigenvalues of $H_M^{<} - \lambda H_M$
  - 10 eigenvalues were found within $\Gamma$.
  - For 9 of the eigenvalues, the residual was less than $10^{-11}$.
  - For one eigenvalue, the residual was $10^{-2}$.

[Figure: plot in the complex plane; axes Re z and Im z.]

Parallel performance

- Performance on Fujitsu HPC2500
  - Matrix size $n$ = 500, 1000, 2000
  - Number of processors $P$ = 1, 2, 4, 8, 16

Almost linear speedup was obtained in all cases due to large-grain parallelism.

[Figure: execution time (sec) versus number of processors.]

Parallel performance (cont'd)

- Performance in a Grid environment
  - Matrix size $n$ = 1000
  - Machine: Intel Xeon Cluster
  - Master-worker type parallelization using OmniRPC (GridRPC)

Good scalability was obtained for up to 14 processors.

[Figure: execution time and speedup versus number of processors.]
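
The master-worker structure can be illustrated with a short sketch. It assumes that the unit of parallel work is the evaluation of $\mathrm{Tr}(A(z)^{-1} A'(z))$ at one sample point, which the transcript does not state explicitly; Python's `ThreadPoolExecutor` stands in for MPI ranks or OmniRPC workers, and the 40x40 upper-triangular matrix (5 eigenvalues inside the circle of radius 0.85, the rest well outside) is a hypothetical example.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

n = 40
rng = np.random.default_rng(0)
inside = np.array([0.1, -0.3, 0.4j, 0.2 - 0.2j, -0.5])      # eigenvalues with |z| <= 0.5
outside = 1.5 + 0.1 * np.arange(n - len(inside))             # eigenvalues with |z| >= 1.5
# Upper-triangular A0: its eigenvalues are exactly the diagonal entries.
A0 = np.diag(np.concatenate([inside, outside])) \
     + 0.1 * np.triu(rng.standard_normal((n, n)), k=1)
I = np.eye(n)

def trace_term(z):
    """One independent task: Tr(A(z)^{-1} A'(z)) for A(z) = A0 - z I, so A'(z) = -I."""
    return np.trace(np.linalg.solve(A0 - z * I, -I))

K, r = 128, 0.85
zeta = r * np.exp(2j * np.pi * np.arange(K) / K)

# Serial reference.
serial = np.array([trace_term(z) for z in zeta])

# "Master" hands the K sample points to a pool of workers; no communication
# between tasks is needed until the results are gathered.
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel = np.array(list(pool.map(trace_term, zeta)))

mu0 = np.mean(zeta * parallel)   # zeroth moment: number of eigenvalues inside the circle
print(np.allclose(serial, parallel), mu0.real)
```

Because each task involves a full $n \times n$ solve and returns a single number, the ratio of computation to communication is high, which is consistent with the near-linear speedups reported above.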

Summary of this study

- We proposed a new algorithm for the nonlinear eigenvalue problem based on complex contour integration.
- Our algorithm can find all the eigenvalues within a closed curve on the complex plane. Moreover, it has large-grain parallelism and is expected to show excellent parallel performance.
- These advantages have been confirmed by numerical experiments.

Future work

- Performance evaluation on large-scale grid environments.
- Application to practical problems.
  - Computation of scaling exponent in theoretical fluid dynamics
- Development of an efficient algorithm for computing