# A Large-Grained Parallel Algorithm for Nonlinear Eigenvalue Problems Using Complex Contour Integration


1
A Large-Grained Parallel Algorithm for Nonlinear Eigenvalue Problems Using Complex Contour Integration
• Takeshi Amako, Yusaku Yamamoto and Shao-Liang Zhang
• Dept. of Computational Science and Engineering
• Nagoya University, Japan

2
Outline of the talk
• Introduction
• The nonlinear eigenvalue problem
• Existing algorithms
• Our objective
• The algorithm
• Formulation as a nonlinear equation
• Application of Kravanja et al.'s method
• Detecting and removing spurious eigenvalues
• Numerical results
• Accuracy of the computed eigenvalues
• Parallel performance
• Conclusion

3
Introduction
• The nonlinear eigenvalue problem
• Given $A(z) \in \mathbb{C}^{n \times n}$, where $z$ is a complex parameter
• Find $z_1 \in \mathbb{C}$ such that $A(z_1)x = 0$ has a nonzero solution $x = x_1$.
• $z_1$ and $x_1$ are called the eigenvalue and the corresponding eigenvector, respectively.
• Examples
• $A(z) = A + zB + z^2 C$: quadratic eigenvalue problem
• $A(z) = A + zB + e^z C$: general nonlinear eigenvalue problem
• Applications
• Electronic structure calculation
• Nonlinear elasticity
• Theoretical fluid dynamics
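As a small illustration of the definition above, the following sketch builds a quadratic matrix function and checks that a known pair $(z_1, x_1)$ satisfies $A(z_1)x_1 = 0$. The form $A(z) = A + zB + z^2C$ and the 2x2 data are assumptions chosen so that the eigenvalues are known in closed form; they are not taken from the talk.

```python
import numpy as np

def A_quad(z, A, B, C):
    """Quadratic matrix function A(z) = A + z*B + z**2*C (assumed form)."""
    return A + z * B + z**2 * C

# Hypothetical data: A(z) = diag(z**2 - 1, z**2 - 4),
# so the eigenvalues are z = +1, -1, +2, -2.
A = np.diag([-1.0, -4.0])
B = np.zeros((2, 2))
C = np.eye(2)

z1 = 1.0
x1 = np.array([1.0, 0.0])                      # eigenvector belonging to z1 = 1
residual = np.linalg.norm(A_quad(z1, A, B, C) @ x1)
det_at_2 = np.linalg.det(A_quad(2.0, A, B, C))  # z = 2 is also an eigenvalue
```

Here `residual` and `det_at_2` both vanish, which is the sense in which $z_1$ is an eigenvalue: $A(z_1)$ is singular.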

4
Existing algorithms
• Multivariate Newton's method and its variants
• Require good initial estimates for both $z_1$ and $x_1$.
• Nonlinear Arnoldi methods
• Nonlinear Jacobi-Davidson methods
• Efficient for large sparse matrices
• Not suitable for finding all eigenvalues within a
specified region of the complex plane

5
Our objective
• Let
• $\Gamma$: closed Jordan curve in the complex plane,
• $A(z) \in \mathbb{C}^{n \times n}$: analytic function of $z$ in $\Gamma$.
• We propose an algorithm that
• can find all the eigenvalues within G, and
• has large-grain parallelism.

Assumption: In the following, we mainly consider the case where $\Gamma$ is a circle centered at the origin with radius $r$.
[Figure: the circle $\Gamma$ of radius $r$ centered at the origin $O$ in the complex plane (axes: Re z, Im z).]
Related work: Sakurai et al. propose an algorithm for linear generalized eigenvalue problems.

6
Our approach
• The basic idea
• Let $f(z) = \det(A(z))$.
• Then $f(z)$ is an analytic function of $z$ in $\Gamma$ and the eigenvalues of $A(z)$ are characterized as the zeros of $f(z)$.
• Use Kravanja's method (Kravanja et al., 1999) to find the zeros of an analytic function.

7
Finding zeros of f(z)
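• Let
• $z_1, z_2, \ldots, z_m$: the zeros of $f(z)$ in $\Gamma$, and
• $n_1, n_2, \ldots, n_m$: their multiplicities.
• Then $f(z)$ can be written as
$$f(z) = g(z) \prod_{k=1}^{m} (z - z_k)^{n_k},$$
where $g(z)$ is analytic and nonzero in $\Gamma$.
• Define the complex moments by
$$\mu_p = \frac{1}{2\pi i} \oint_{\Gamma} z^p \, \frac{f'(z)}{f(z)} \, dz, \qquad p = 0, 1, \ldots, 2m-1.$$
• Then
$$\mu_p = \frac{1}{2\pi i} \oint_{\Gamma} z^p \left( \sum_{k=1}^{m} \frac{n_k}{z - z_k} + \frac{g'(z)}{g(z)} \right) dz = \sum_{k=1}^{m} n_k z_k^p,$$
since $g'(z)/g(z)$ is analytic in $\Gamma$.
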
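The moment identity $\mu_p = \sum_k n_k z_k^p$ can be checked numerically on a scalar function. The sketch below is only an illustration: the test polynomial, the circle radius and the number of quadrature points are hypothetical choices, and the contour integral is approximated by the trapezoidal rule on the circle $|z| = r$.

```python
import numpy as np

def moments_scalar(f, df, r, K, num_moments):
    """Approximate mu_p = (1/(2*pi*i)) * integral of z**p * f'(z)/f(z) over |z| = r
    with the K-point trapezoidal rule."""
    theta = 2.0 * np.pi * np.arange(K) / K
    z = r * np.exp(1j * theta)
    ratio = df(z) / f(z)
    # On the circle dz = i*z*dtheta, so the integrand gains one extra factor z
    # and the 1/(2*pi*i) prefactor turns the integral into a plain average.
    return np.array([np.mean(z**(p + 1) * ratio) for p in range(num_moments)])

# Hypothetical test function: a double zero at 0.5 and a simple zero at -0.3
# inside |z| = 1, plus a simple zero at 2 outside the circle.
f = lambda z: (z - 0.5)**2 * (z + 0.3) * (z - 2.0)
df = lambda z: (2 * (z - 0.5) * (z + 0.3) * (z - 2.0)
                + (z - 0.5)**2 * (z - 2.0)
                + (z - 0.5)**2 * (z + 0.3))

mu = moments_scalar(f, df, r=1.0, K=128, num_moments=4)
expected = np.array([2 * 0.5**p + (-0.3)**p for p in range(4)])
```

The zero at 2 lies outside the circle and does not contribute, so `mu` agrees with `expected`; in particular `mu[0]` counts the zeros inside the circle with multiplicity.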
8
Finding zeros of f(z) (cont'd)
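• To extract information on $z_k$ from $\mu_p$, define the following matrices:
$$H_m = \left[ \mu_{i+j-2} \right]_{i,j=1}^{m}, \qquad H_m^{<} = \left[ \mu_{i+j-1} \right]_{i,j=1}^{m},$$
$$V_m = \left[ z_j^{\,i-1} \right]_{i,j=1}^{m}, \qquad D_m = \mathrm{diag}(n_1, \ldots, n_m), \qquad \Lambda_m = \mathrm{diag}(z_1, \ldots, z_m).$$
• Then it is easy to see that
$$H_m = V_m D_m V_m^{T}, \qquad H_m^{<} = V_m D_m \Lambda_m V_m^{T}.$$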
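The relation between the Hankel matrices of moments and the Vandermonde matrix of the zeros, $H_m = V_m D_m V_m^T$ and $H_m^< = V_m D_m \Lambda_m V_m^T$, can be confirmed on a small example. The zeros and multiplicities below are hypothetical, and the moments are formed exactly from $\mu_p = \sum_k n_k z_k^p$ rather than by quadrature.

```python
import numpy as np

zeros = np.array([0.5, -0.3, 0.2 + 0.4j])   # hypothetical distinct zeros z_k
mult = np.array([2, 1, 1])                   # hypothetical multiplicities n_k
m = len(zeros)

# Exact moments mu_p = sum_k n_k * z_k**p for p = 0, ..., 2m-1.
mu = np.array([np.sum(mult * zeros**p) for p in range(2 * m)])

# Hankel matrices (0-based indices): H[i, j] = mu[i + j], H_shift[i, j] = mu[i + j + 1].
idx = np.arange(m)[:, None] + np.arange(m)[None, :]
H = mu[idx]
H_shift = mu[idx + 1]

# Vandermonde matrix V[i, k] = z_k**i and the diagonal matrices D and Lambda.
V = np.vander(zeros, N=m, increasing=True).T
D = np.diag(mult).astype(complex)
Lam = np.diag(zeros)

ok_H = np.allclose(H, V @ D @ V.T)                 # plain transpose, not conjugate
ok_H_shift = np.allclose(H_shift, V @ D @ Lam @ V.T)
```

Note that the factorization uses the plain transpose `V.T`, not the conjugate transpose, even though the zeros may be complex.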

9
Finding zeros of f(z) (cont'd)
• Noting that $V_m$ and $D_m$ are nonsingular, we have the following equivalence relation:
$\lambda$ is an eigenvalue of $H_m^{<} - \lambda H_m$ $\Leftrightarrow$ $\lambda$ is an eigenvalue of $\Lambda_m - \lambda I$ $\Leftrightarrow$ $\exists k,\ \lambda = z_k$
• That is, we can find the zeros of $f(z)$ in $\Gamma$ by
• computing the complex moments $\mu_0, \mu_1, \ldots, \mu_{2m-1}$,
• constructing $H_m$ and $H_m^{<}$, and
• computing the eigenvalues of $H_m^{<} - \lambda H_m$.

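The three steps above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the Hankel matrix is nonsingular (the Hankel size equals the number of distinct zeros), so the pencil eigenvalues can be obtained as the ordinary eigenvalues of $H_m^{-1} H_m^<$; a generalized eigenvalue solver would be the more robust choice in practice. The zeros and multiplicities are hypothetical.

```python
import numpy as np

def zeros_from_moments(mu, m):
    """Eigenvalues of the pencil H_shift - lambda*H built from mu[0], ..., mu[2m-1].
    Assumes H is nonsingular, so the pencil reduces to eigvals(H^{-1} H_shift)."""
    mu = np.asarray(mu, dtype=complex)
    idx = np.arange(m)[:, None] + np.arange(m)[None, :]
    H = mu[idx]
    H_shift = mu[idx + 1]
    return np.linalg.eigvals(np.linalg.solve(H, H_shift))

zeros = np.array([0.5, -0.3, 0.2 + 0.4j])   # hypothetical zeros
mult = np.array([2, 1, 1])                   # hypothetical multiplicities
mu = [np.sum(mult * zeros**p) for p in range(6)]
computed = zeros_from_moments(mu, m=3)
```

The array `computed` contains the three distinct zeros; the multiplicities enter only through $D_m$ and do not change the eigenvalues of the pencil.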
10
Application to the nonlinear eigenvalue problem
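• In our case, $f(z) = \det(A(z))$ and
$$\frac{f'(z)}{f(z)} = \mathrm{Tr}\left( A(z)^{-1} \frac{dA(z)}{dz} \right).$$
• By applying the trapezoidal rule with $K$ points, we have
$$\mu_p \approx \frac{1}{K} \sum_{j=0}^{K-1} \zeta_j^{\,p+1} \, \mathrm{Tr}\left( A(\zeta_j)^{-1} \frac{dA}{dz}(\zeta_j) \right),$$
• where $\zeta_j = r \exp(2\pi i j / K)$, $j = 0, 1, \ldots, K-1$, are the sample points on the circle $\Gamma$.
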
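What makes the moments computable from $A(z)$ alone is the standard identity (Jacobi's formula) $f'(z)/f(z) = \mathrm{Tr}(A(z)^{-1} A'(z))$ for $f(z) = \det(A(z))$. The sketch below checks it numerically at one point; the 2x2 matrix function, the evaluation point and the finite-difference step are hypothetical choices.

```python
import numpy as np

A0 = np.array([[2.0, 1.0], [0.0, 3.0]])   # hypothetical coefficient matrices
B0 = np.array([[1.0, 0.0], [1.0, 1.0]])
I2 = np.eye(2)

def A(z):
    """Hypothetical analytic matrix function A(z) = A0 + z*B0 + exp(z)*I."""
    return A0 + z * B0 + np.exp(z) * I2

def dA(z):
    """Derivative dA/dz of the function above."""
    return B0 + np.exp(z) * I2

f = lambda w: np.linalg.det(A(w))

z = 0.3 + 0.2j
h = 1e-6
lhs = (f(z + h) - f(z - h)) / (2 * h) / f(z)     # f'(z)/f(z) by central difference
rhs = np.trace(np.linalg.solve(A(z), dA(z)))     # Tr(A(z)^{-1} A'(z))
```

The two values agree up to the finite-difference error. Using `np.linalg.solve` avoids forming $A(z)^{-1}$ explicitly, although only the trace of the product is needed.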
11
The algorithm
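The slide content for the algorithm is not preserved in this transcript. A minimal sketch of the overall procedure, pieced together from the steps described on the neighboring slides (sample points on a circle, trace evaluation, trapezoidal moments, Hankel pencil), might look as follows. The function name, its signature, and the use of `eigvals(solve(H, H_shift))` for the pencil are assumptions; the latter presumes a nonsingular Hankel matrix, so the example chooses the Hankel size equal to the number of eigenvalues inside the circle.

```python
import numpy as np

def contour_eigenvalues(A, dA, r, K, M):
    """Sketch: eigenvalue candidates of A(z) inside the circle |z| = r.
    A and dA are callables returning A(z) and dA/dz as n-by-n arrays."""
    theta = 2.0 * np.pi * np.arange(K) / K
    z = r * np.exp(1j * theta)
    # One independent linear-algebra task per sample point z_j.
    t = np.array([np.trace(np.linalg.solve(A(zj), dA(zj))) for zj in z])
    # Trapezoidal-rule moments mu_p for p = 0, ..., 2M-1.
    mu = np.array([np.mean(z**(p + 1) * t) for p in range(2 * M)])
    idx = np.arange(M)[:, None] + np.arange(M)[None, :]
    H, H_shift = mu[idx], mu[idx + 1]
    # Eigenvalues of the pencil H_shift - lambda*H (H assumed nonsingular).
    lam = np.linalg.eigvals(np.linalg.solve(H, H_shift))
    return lam[np.abs(lam) < r]

# Hypothetical linear check problem A(z) = A0 - z*I with known eigenvalues:
# three of them (0.3, -0.5, 0.1+0.6j) lie inside the unit circle.
A0 = np.diag([0.3, -0.5, 0.1 + 0.6j, 2.0, -3.0])
A = lambda z: A0 - z * np.eye(5)
dA = lambda z: -np.eye(5, dtype=complex)
found = contour_eigenvalues(A, dA, r=1.0, K=128, M=3)
```

A linear problem is used here only because its eigenvalues are known exactly; the same function accepts any analytic `A` together with its derivative `dA`.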
12
Detecting and removing spurious eigenvalues
• Usually, we do not know $m$, the number of eigenvalues of $A(z)$ in $\Gamma$, in advance and use some estimate $M$ instead.
• When $M > m$, the eigenvalues of $H_M^{<} - \lambda H_M$ include spurious solutions that do not correspond to an eigenvalue of $A(z)$.
• To detect them, we compute the corresponding eigenvector by inverse iteration and evaluate the relative residual.
• Of course, this quantity can also be used to check the accuracy of the computed eigenvalues.

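The residual test can be illustrated with a small sketch. Two things here are assumptions rather than the talk's method: the residual is taken in the 2-norm as $\|A(z)x\|_2 / (\|A(z)\|_2 \|x\|_2)$, since the talk's exact definition is not preserved in this transcript, and the best vector $x$ is obtained from a singular value decomposition instead of inverse iteration, which gives the smallest achievable residual $\sigma_{\min}/\sigma_{\max}$ directly. The matrix, the candidate list and the threshold are hypothetical.

```python
import numpy as np

def relative_residual(Az):
    """Smallest value of ||A(z) x||_2 / (||A(z)||_2 ||x||_2) over x != 0,
    which equals sigma_min / sigma_max (assumed definition, via SVD)."""
    s = np.linalg.svd(Az, compute_uv=False)   # singular values, descending
    return s[-1] / s[0]

# Hypothetical check problem A(z) = A0 - z*I with eigenvalues 0.3, -0.5, 2.0.
A0 = np.diag([0.3, -0.5, 2.0]).astype(complex)
A = lambda z: A0 - z * np.eye(3)

candidates = [0.3, -0.5, 0.1]   # 0.1 plays the role of a spurious value
res = [relative_residual(A(z)) for z in candidates]
tol = 1e-8                       # hypothetical acceptance threshold
genuine = [z for z, rr in zip(candidates, res) if rr < tol]
```

The two true eigenvalues give a residual at rounding level, while the spurious candidate gives a residual of order 0.1 and is rejected.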
13
Numerical results
• Test problem
• $A(z) = A - zI + \varepsilon B(z)$, where
• $A$: real random nonsymmetric matrix
• $B(z)$: antidiagonal matrix with antidiagonal elements $e^z$
• $\varepsilon$: parameter to specify the strength of nonlinearity
• Parameters
• $n = 500, 1000, 2000$
• $\varepsilon = 0, 10^{-4}, 10^{-3}, 10^{-2}, 10^{-1}$
• Computational environment
• Fujitsu HPC2500 (SPARC 64IV), 1-16 processors
• Program written in C and MPI
• LAPACK routines were used to compute $(A(z))^{-1}$ and to compute the eigenvalues of $H_M^{<} - \lambda H_M$.
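A test matrix of this kind could be set up as in the sketch below. Several details are assumptions: the signs in $A(z) = A - zI + \varepsilon B(z)$, the distribution and scaling of the random matrix (standard normal entries here), the seed, and the function name `make_test_problem`. The talk's own program was written in C; Python is used only for illustration.

```python
import numpy as np

def make_test_problem(n, eps, seed=0):
    """Return callables A(z) and dA(z) for A(z) = A0 - z*I + eps*B(z), where
    B(z) is the antidiagonal matrix with entries exp(z) (assumed form)."""
    rng = np.random.default_rng(seed)
    A0 = rng.standard_normal((n, n))   # assumption: entries drawn from N(0, 1)
    I = np.eye(n)
    J = np.fliplr(np.eye(n))           # ones on the antidiagonal

    def A(z):
        return A0 - z * I + eps * np.exp(z) * J

    def dA(z):
        # d/dz of (-z*I) is -I; d/dz of exp(z)*J is exp(z)*J.
        return -I + eps * np.exp(z) * J

    return A, dA

A, dA = make_test_problem(n=4, eps=0.1)
z = 0.2 + 0.1j
h = 1e-6
fd = (A(z + h) - A(z - h)) / (2 * h)   # finite-difference check of dA
```

With `eps = 0` the problem reduces to the standard linear eigenvalue problem for `A0`, which gives a convenient correctness check.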

14
Accuracy of the computed eigenvalues
• Parameters
• $n = 500$ and $\varepsilon = 0.1$
• $r = 0.85$, $K = 128$ and $M = 11$.
• There are 7 eigenvalues in $\Gamma$.
• Results
• Our algorithm succeeded in locating all the eigenvalues in $\Gamma$.
• The relative residuals were all under $10^{-10}$.
• Similar results were obtained for other cases.

[Figure: plot in the complex plane (axes: Re z, Im z).]

15
Effect of K and M on the accuracy
• Effect of the number of sample points $K$
• Usually $K = 128$ gives sufficient accuracy.
• Effect of the Hankel matrix size $M$
• It is better to take $M$ a few more than the number of eigenvalues within $\Gamma$ (7 in this case).
• This is to mitigate the perturbation from eigenvalues outside $\Gamma$.

[Figures: residuals as a function of $K$; residuals as a function of $M$.]

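One way to see where the perturbation from outside eigenvalues comes from is to look at the trapezoidal rule on a scalar example. For $f(z) = z - a$ with $|a| > r$ the exact moment $\mu_0$ is zero, but the $K$-point rule returns a small nonzero value that shrinks geometrically as $K$ grows. The sketch below uses a hypothetical zero at $a = 1$ just outside a circle of radius 0.85; it illustrates the quadrature error only and does not reproduce the experiments in the talk.

```python
import numpy as np

def mu0_trapezoid(a, r, K):
    """K-point trapezoidal approximation of mu_0 for f(z) = z - a on |z| = r."""
    z = r * np.exp(2j * np.pi * np.arange(K) / K)
    return np.mean(z / (z - a))

r = 0.85
a_out = 1.0   # hypothetical zero just outside the circle
errors = {K: abs(mu0_trapezoid(a_out, r, K)) for K in (16, 32, 64, 128)}

# Closed form of the error for this example: q / (1 - q) with q = (r / a)**K.
predicted = {K: (r / a_out)**K / (1.0 - (r / a_out)**K) for K in errors}
```

The closer an outside eigenvalue lies to the circle, the slower the error decays, so either more sample points or some slack in the Hankel size is needed to keep such contributions from contaminating the eigenvalues inside.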
16
Detecting and removing spurious eigenvalues
• Parameters
• $n = 1000$ and $\varepsilon = 0.01$
• $r = 0.7$, $K = 128$ and $M = 10$.
• There are 9 eigenvalues in $\Gamma$.
• Eigenvalues of $H_M^{<} - \lambda H_M$
• 10 eigenvalues were found within $\Gamma$.
• For 9 of the eigenvalues, the residual was less than $10^{-11}$.
• For one eigenvalue, the residual was $10^{-2}$.

[Figure: plot in the complex plane (axes: Re z, Im z).]

17
Parallel performance
• Performance on Fujitsu HPC2500
• Matrix size $n = 500, 1000, 2000$
• Number of processors $P = 1, 2, 4, 8, 16$

Almost linear speedup was obtained in all cases due to large-grain parallelism.
[Figure: execution time (sec) versus number of processors.]

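The large-grain parallelism comes from the fact that the work at each sample point on the contour (one linear solve and one trace) is independent of all other points. The talk's program used C with MPI; the sketch below is only a thread-based stand-in in Python that shows the same task decomposition. The function name `traces_parallel`, the worker count and the check problem are hypothetical.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def traces_parallel(A, dA, points, workers=4):
    """Evaluate Tr(A(z)^{-1} A'(z)) at every sample point, one task per point."""
    def task(zj):
        return np.trace(np.linalg.solve(A(zj), dA(zj)))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the order of the input points.
        return np.array(list(pool.map(task, points)))

# Hypothetical check problem A(z) = A0 - z*I.
A0 = np.diag([0.3, -0.5, 2.0]).astype(complex)
A = lambda z: A0 - z * np.eye(3)
dA = lambda z: -np.eye(3, dtype=complex)
points = np.exp(2j * np.pi * np.arange(16) / 16)

parallel = traces_parallel(A, dA, points)
serial = np.array([np.trace(np.linalg.solve(A(zj), dA(zj))) for zj in points])
```

Only the scalar traces need to be collected afterwards to form the moments, so communication is small compared with the cost of the solves.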
18
Parallel performance (cont'd)
• Performance in a Grid environment
• Matrix size $n = 1000$
• Machine: Intel Xeon cluster
• Master-worker type parallelization using OmniRPC
(GridRPC)

Good scalability was obtained for up to 14 processors.
[Figure: execution time and speedup versus number of processors (2 to 14).]

19
Summary of this study
• We proposed a new algorithm for the nonlinear
eigenvalue problem based on complex contour
integration.
• Our algorithm can find all the eigenvalues within
a closed curve on the complex plane. Moreover, it
has large-grain parallelism and is expected to
show excellent parallel performance.
• These advantages have been confirmed by numerical
experiments.

20
Future work
• Performance evaluation on large-scale grid
environments.
• Application to practical problems.
• Computation of scaling exponent in theoretical
fluid dynamics
• Development of an efficient algorithm for
computing