CSE 245: Computer Aided Circuit Simulation and Verification

Fall 2004, Oct 19. Lecture 7: Matrix Solver II - Iterative Methods

Outline

- Iterative Method
- Stationary Iterative Methods (SOR, GS, Jacobi)
- Krylov Method (CG, GMRES)
- Multigrid Method

Iterative Methods

- Stationary:
  x^{(k+1)} = G x^{(k)} + c
  where G and c do not depend on the iteration count k
- Non-stationary:
  x^{(k+1)} = x^{(k)} + a_k p^{(k)}
  where the computation involves information that changes at each iteration
  (e.g. the step size a_k and the search direction p^{(k)})

Stationary: Jacobi Method

- In the i-th equation, solve for the value of x_i while assuming the other
  entries of x remain fixed
- In matrix terms the method becomes
  x^{(k+1)} = D^{-1} (L + U) x^{(k)} + D^{-1} b
  where D, -L and -U represent the diagonal, the strictly lower-triangular and
  strictly upper-triangular parts of M, with M = D - L - U
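A minimal NumPy sketch of this sweep (illustrative, not from the slides; dense
storage and the stopping test are my assumptions):

```python
import numpy as np

def jacobi(M, b, x0, tol=1e-8, max_iter=500):
    """Jacobi for M x = b: x_{k+1} = D^{-1}((L + U) x_k + b), M = D - L - U."""
    D = np.diag(M)                      # diagonal entries of M
    N = M - np.diag(D)                  # off-diagonal part of M, i.e. -(L + U)
    x = x0.copy()
    for _ in range(max_iter):
        x_new = (b - N @ x) / D         # solve equation i for x_i, others fixed
        if np.linalg.norm(x_new - x) < tol:
            break
        x = x_new
    return x_new
```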

Stationary: Gauss-Seidel

- Like Jacobi, but now assume that previously computed results are used as soon
  as they are available
- In matrix terms the method becomes
  x^{(k+1)} = (D - L)^{-1} (U x^{(k)} + b)
  where D, -L and -U represent the diagonal, the strictly lower-triangular and
  strictly upper-triangular parts of M, with M = D - L - U
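The same sketch adapted to Gauss-Seidel (again illustrative; the only change
from Jacobi is that updated entries are used in place, within the sweep):

```python
import numpy as np

def gauss_seidel(M, b, x0, tol=1e-8, max_iter=500):
    """Gauss-Seidel for M x = b: x_{k+1} = (D - L)^{-1} (U x_k + b)."""
    x = x0.astype(float).copy()
    for _ in range(max_iter):
        x_old = x.copy()
        for i in range(len(b)):
            # x[:i] already holds this sweep's values, x[i+1:] the previous sweep's
            s = M[i, :i] @ x[:i] + M[i, i+1:] @ x[i+1:]
            x[i] = (b[i] - s) / M[i, i]
        if np.linalg.norm(x - x_old) < tol:
            break
    return x
```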

Stationary: Successive Overrelaxation (SOR)

- Devised by applying extrapolation to Gauss-Seidel, in the form of a weighted
  average between the previous iterate and the new Gauss-Seidel iterate
- In matrix terms the method becomes
  x^{(k+1)} = (D - ωL)^{-1} (ωU + (1 - ω)D) x^{(k)} + ω (D - ωL)^{-1} b
  where D, -L and -U represent the diagonal, the strictly lower-triangular and
  strictly upper-triangular parts of M, with M = D - L - U

SOR

- Choose ω to accelerate convergence
- ω = 1: reduces to Gauss-Seidel
- 1 < ω < 2: over-relaxation
- ω < 1: under-relaxation
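An SOR sketch along the same lines (illustrative); with omega = 1 the update is
exactly the Gauss-Seidel one:

```python
import numpy as np

def sor(M, b, x0, omega=1.5, tol=1e-8, max_iter=500):
    """SOR for M x = b: weighted average of the old value and the GS update."""
    x = x0.astype(float).copy()
    for _ in range(max_iter):
        x_old = x.copy()
        for i in range(len(b)):
            s = M[i, :i] @ x[:i] + M[i, i+1:] @ x[i+1:]
            x_gs = (b[i] - s) / M[i, i]               # Gauss-Seidel value
            x[i] = (1 - omega) * x[i] + omega * x_gs  # extrapolate (ω>1) or damp (ω<1)
        if np.linalg.norm(x - x_old) < tol:
            break
    return x
```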

Convergence of Stationary Methods

- Linear equation: Mx = b
- A sufficient condition for convergence (GS, Jacobi) is that the matrix M is
  diagonally dominant.
- If M is symmetric positive definite, SOR converges for any ω with 0 < ω < 2.
- A necessary and sufficient condition for convergence is that the magnitude of
  the largest eigenvalue (the spectral radius) of the iteration matrix G is
  smaller than 1, where:
- Jacobi: G = D^{-1} (L + U)
- Gauss-Seidel: G = (D - L)^{-1} U
- SOR: G = (D - ωL)^{-1} (ωU + (1 - ω)D)
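The spectral-radius test is easy to check numerically for small matrices; a
sketch for the Jacobi iteration matrix (the example matrix is my assumption,
chosen diagonally dominant):

```python
import numpy as np

def jacobi_spectral_radius(M):
    """rho(G) for the Jacobi iteration matrix G = D^{-1}(L+U) = I - D^{-1} M."""
    G = np.eye(len(M)) - M / np.diag(M)[:, None]   # row i of M scaled by 1/M[i,i]
    return max(abs(np.linalg.eigvals(G)))

# Diagonally dominant example, so Jacobi is guaranteed to converge
M = np.array([[ 4.0, -1.0,  0.0],
              [-1.0,  4.0, -1.0],
              [ 0.0, -1.0,  4.0]])
print(jacobi_spectral_radius(M))   # about 0.354 < 1: convergent
```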

Outline

- Iterative Method
- Stationary Iterative Methods (SOR, GS, Jacobi)
- Krylov Method (CG, GMRES)
- Steepest Descent
- Conjugate Gradient
- Preconditioning
- Multigrid Method

Linear Equation as an Optimization Problem

- Quadratic function of vector x:
  f(x) = (1/2) x^T A x - b^T x + c
- Matrix A is positive definite if x^T A x > 0 for any nonzero vector x
- If A is symmetric positive definite, f(x) is minimized by the solution of
  Ax = b

Linear Equation as an Optimization Problem

- Quadratic function: f(x) = (1/2) x^T A x - b^T x + c
- Derivative: f'(x) = (1/2) A^T x + (1/2) A x - b
- If A is symmetric: f'(x) = A x - b
- If A is positive definite, f(x) is minimized by setting f'(x) to 0

For a symmetric positive definite matrix A: f'(x) = A x - b = 0 gives A x = b
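A quick NumPy check of these facts (the 2x2 example values are mine, not from
the slides): the finite-difference gradient of f matches A x - b, and the
minimizer of f is the solution of A x = b.

```python
import numpy as np

A = np.array([[3.0, 2.0], [2.0, 6.0]])   # symmetric positive definite example
b = np.array([2.0, -8.0])
f = lambda x: 0.5 * x @ A @ x - b @ x    # quadratic form (constant c dropped)

x = np.array([1.0, 1.0])
grad = A @ x - b                          # analytic gradient for symmetric A
eps = 1e-6
num = np.array([(f(x + eps * e) - f(x - eps * e)) / (2 * eps) for e in np.eye(2)])
print(grad, num)                          # the two gradients agree

print(np.linalg.solve(A, b))              # minimizer of f = solution of A x = b
```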

Gradient of Quadratic Form

The gradient f'(x) points in the direction of steepest increase of f(x)

Symmetric Positive-Definite Matrix A

- If A is symmetric positive definite
- p is an arbitrary point
- x is the solution point

Since Ax = b, we have

f(p) = f(x) + (1/2) (p - x)^T A (p - x)

so f(p) > f(x) if p ≠ x

If A is not positive definite

(Figure: quadratic forms for (a) a positive definite matrix, (b) a
negative-definite matrix, (c) a singular matrix, (d) an indefinite matrix.)

Non-stationary Iterative Methods

- Start from an initial guess x_0 and adjust it until close enough to the exact
  solution
- How to choose the direction and the step size?

x_{i+1} = x_i + α_i p_i,   i = 0, 1, 2, 3, ...

with adjustment direction p_i and step size α_i

Steepest Descent Method (1)

- Choose the direction in which f decreases most quickly: the direction
  opposite of f'(x_i)
- This is also the direction of the residual: r_i = b - A x_i = -f'(x_i)

Steepest Descent Method (2)

- How to choose the step size α_i?
- Line search: α_i should minimize f along the direction of r_i, which means
  d/dα f(x_i + α r_i) = 0
- This makes the new residual orthogonal to the old one, r_{i+1}^T r_i = 0,
  and gives α_i = (r_i^T r_i) / (r_i^T A r_i)

Steepest Descent Algorithm

Given x_0, iterate until the residual is smaller than the error tolerance:

r_i = b - A x_i
α_i = (r_i^T r_i) / (r_i^T A r_i)
x_{i+1} = x_i + α_i r_i
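A direct NumPy transcription of this loop (a sketch; the tolerance handling
and iteration cap are my assumptions):

```python
import numpy as np

def steepest_descent(A, b, x0, tol=1e-8, max_iter=10000):
    """Steepest descent for symmetric positive definite A."""
    x = x0.copy()
    r = b - A @ x                          # residual = steepest-descent direction
    for _ in range(max_iter):
        if np.linalg.norm(r) < tol:
            break
        alpha = (r @ r) / (r @ (A @ r))    # exact line search along r
        x = x + alpha * r
        r = b - A @ x                      # new residual, orthogonal to the old one
    return x
```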

Steepest Descent Method: Example

(Figure: (a) starting at (-2, -2), take the direction of steepest descent of f;
(b) find the point on the intersection of these two surfaces that minimizes f;
(c) the intersection of the surfaces; (d) the gradient at the bottommost point
is orthogonal to the gradient of the previous step.)

Iterations of Steepest Descent Method

Convergence of Steepest Descent (1)

Let e_i = x_i - x be the error, and expand it in the eigenvectors of A:

e_i = Σ_j ξ_j v_j,  where A v_j = λ_j v_j (eigenvector v_j, eigenvalue λ_j),
j = 1, 2, ..., n

Measure the error in the energy norm ||e||_A = (e^T A e)^{1/2}.

Convergence of Steepest Descent (2)

Convergence Study (n = 2)

Assume λ1 ≥ λ2 > 0. Let κ = λ1 / λ2 (the spectral condition number) and
μ = ξ2 / ξ1. The per-step reduction factor ω of the energy norm of the error
then depends only on κ and μ.

(Plots: the convergence factor ω as a function of κ and μ, with case studies.)

Bound of Convergence

||e_i||_A ≤ ((κ - 1) / (κ + 1))^i ||e_0||_A

It can be proved that this bound is also valid for n > 2, where κ = λmax / λmin.

Conjugate Gradient Method

- Steepest descent repeats search directions: the iterates zig-zag, searching
  along the same directions many times
- Why not take exactly one step per direction, eliminating the error along that
  direction once and for all?

(Figure: search directions of the steepest descent method.)

Orthogonal Directions

- Pick mutually orthogonal search directions d_i, and at step i eliminate the
  error component along d_i; this requires α_i = -(d_i^T e_i) / (d_i^T d_i)
- But we don't know e_i !!!

Orthogonal → A-orthogonal

- Instead of orthogonal search directions, we make the search directions
  A-orthogonal (conjugate): d_i^T A d_j = 0 for i ≠ j

Search Step Size

With A-orthogonal directions, the line search gives
α_i = (d_i^T r_i) / (d_i^T A d_i)

Iteration finishes in n steps

Expand the initial error in the search directions, e_0 = Σ_j δ_j d_j. Because
the directions are A-orthogonal, the error component along direction d_j is
eliminated exactly at step j. After n steps, all error components are
eliminated.

Conjugate Search Directions

- How to construct A-orthogonal search directions, given a set of n linearly
  independent vectors?
- Since the residual vectors in the steepest descent method are mutually
  orthogonal, the residuals are a good candidate set to start with

Construct Search Direction (1)

- In the steepest descent method, the new residual is just a linear combination
  of the previous residual and A d_i:
  r_{i+1} = r_i - α_i A d_i
- Let R_i = span{r_0, r_1, ..., r_{i-1}}; we have
  R_i = span{r_0, A r_0, A^2 r_0, ..., A^{i-1} r_0}

This is a Krylov subspace: the space built by repeatedly applying a matrix to a
vector.

Construct Search Direction (2)

Let d_0 = r_0. For i > 0, conjugate each new residual against the previous
directions (Gram-Schmidt conjugation):

d_i = r_i + Σ_{k<i} β_{ik} d_k,  with the β_{ik} chosen so that d_i^T A d_k = 0

Construct Search Direction (3)

- All but one of the β_{ik} vanish, so we can get the next direction from the
  previous one, without saving them all.
- Let β_i = (r_i^T r_i) / (r_{i-1}^T r_{i-1}); then d_i = r_i + β_i d_{i-1}

Conjugate Gradient Algorithm

Given x_0, iterate until the residual is smaller than the error tolerance:

d_0 = r_0 = b - A x_0
α_i = (r_i^T r_i) / (d_i^T A d_i)
x_{i+1} = x_i + α_i d_i
r_{i+1} = r_i - α_i A d_i
β_{i+1} = (r_{i+1}^T r_{i+1}) / (r_i^T r_i)
d_{i+1} = r_{i+1} + β_{i+1} d_i
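The same recurrences as a NumPy sketch (illustrative; since exact arithmetic
needs at most n steps, the loop is capped at n):

```python
import numpy as np

def conjugate_gradient(A, b, x0, tol=1e-8):
    """CG for symmetric positive definite A: A-orthogonal search directions."""
    x = x0.copy()
    r = b - A @ x
    d = r.copy()
    rr = r @ r
    for _ in range(len(b)):                # converges in n steps (exact arithmetic)
        Ad = A @ d
        alpha = rr / (d @ Ad)
        x = x + alpha * d
        r = r - alpha * Ad
        rr_new = r @ r
        if np.sqrt(rr_new) < tol:
            break
        d = r + (rr_new / rr) * d          # beta_{i+1} = (r_new . r_new)/(r . r)
        rr = rr_new
    return x
```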

Conjugate Gradient Convergence

- In exact arithmetic, CG converges in n steps (completely unrealistic!!)
- Accuracy after k steps of CG is related to:
- consider polynomials of degree k that are equal to 1 at 0
- how small can such a polynomial be at all the eigenvalues of A?
- Thus, eigenvalues close together are good.
- Condition number: κ(A) = ||A||_2 ||A^{-1}||_2 = λmax(A) / λmin(A)
- Residual is reduced by a constant factor by O(κ^{1/2}(A)) iterations of CG.

Other Krylov Subspace Methods

- Nonsymmetric linear systems:
- GMRES: for i = 1, 2, 3, ..., find x_i ∈ K_i(A, b) such that
  r_i = b - A x_i ⊥ K_i(A, b). But there is no short recurrence => old vectors
  must be saved => lots more space. (Usually restarted every k iterations to
  use less space.)
- BiCGStab, QMR, etc.: two spaces K_i(A, b) and K_i(A^T, b) with mutually
  orthogonal bases. Short recurrences => O(n) space, but less robust.
- Convergence and preconditioning are more delicate than for CG
- Active area of current research
- Eigenvalues: Lanczos (symmetric), Arnoldi (nonsymmetric)

Preconditioners

- Suppose you had a matrix B such that:
  (1) the condition number κ(B^{-1}A) is small
  (2) By = z is easy to solve
- Then you could solve (B^{-1}A)x = B^{-1}b instead of Ax = b
- B = A is great for (1), not for (2)
- B = I is great for (2), not for (1)
- Domain-specific approximations sometimes work
- B = diagonal of A sometimes works
- Better: blend in some direct-methods ideas...

Preconditioned conjugate gradient iteration

x_0 = 0,  r_0 = b,  d_0 = B^{-1} r_0,  y_0 = B^{-1} r_0
for k = 1, 2, 3, ...
    α_k = (y_{k-1}^T r_{k-1}) / (d_{k-1}^T A d_{k-1})    (step length)
    x_k = x_{k-1} + α_k d_{k-1}                          (approximate solution)
    r_k = r_{k-1} - α_k A d_{k-1}                        (residual)
    y_k = B^{-1} r_k                                     (preconditioning solve)
    β_k = (y_k^T r_k) / (y_{k-1}^T r_{k-1})              (improvement)
    d_k = y_k + β_k d_{k-1}                              (search direction)

- One matrix-vector multiplication per iteration
- One solve with preconditioner per iteration
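A NumPy sketch of this loop (illustrative; solve_B stands for applying B^{-1},
here instantiated with the diagonal-of-A preconditioner mentioned on the
previous slide):

```python
import numpy as np

def pcg(A, b, solve_B, tol=1e-8, max_iter=1000):
    """Preconditioned CG: one A-multiply and one B-solve per iteration."""
    x = np.zeros_like(b)
    r = b.copy()
    y = solve_B(r)                          # y_0 = B^{-1} r_0
    d = y.copy()
    ry = y @ r
    for _ in range(max_iter):
        Ad = A @ d
        alpha = ry / (d @ Ad)               # step length
        x = x + alpha * d                   # approximate solution
        r = r - alpha * Ad                  # residual
        if np.linalg.norm(r) < tol:
            break
        y = solve_B(r)                      # preconditioning solve
        ry_new = y @ r
        d = y + (ry_new / ry) * d           # search direction
        ry = ry_new
    return x

# Example: Jacobi preconditioner B = diag(A)
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x = pcg(A, b, solve_B=lambda r: r / np.diag(A))
```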

Outline

- Iterative Method
- Stationary Iterative Methods (SOR, GS, Jacobi)
- Krylov Method (CG, GMRES)
- Multigrid Method

What is Multigrid?

- A multilevel iterative method to solve Ax = b
- Originated in PDEs on geometric grids
- Extending the multigrid idea to unstructured problems gives Algebraic
  Multigrid (AMG)
- We use geometric multigrid here to present the basic ideas of the multigrid
  method.

The model problem

Ax = b

Simple iterative method

- x^{(0)} -> x^{(1)} -> ... -> x^{(k)}
- Jacobi iteration
- Matrix form: x^{(k)} = R_J x^{(k-1)} + C_J
- General form: x^{(k)} = R x^{(k-1)} + C   (1)
- Stationary: x = R x + C   (2), where x is the exact solution

Error and Convergence

- Definitions: error e = x - x^{(k)}   (3)
               residual r = b - A x^{(k)}   (4)
- e, r relation: A e = r   (5)   (from (3), (4))
- e^{(1)} = x - x^{(1)} = (R x + C) - (R x^{(0)} + C) = R e^{(0)}
- Error equation: e^{(k)} = R^k e^{(0)}   (6)   (from (1), (2), (3))
- Convergence: e^{(k)} -> 0 for every e^{(0)} iff the spectral radius ρ(R) < 1

Error of Different Frequencies

- Wavenumber k and frequency θ: θ = kπ/n
- High-frequency error is more oscillatory between points

Iterations Reduce High-Frequency Error Efficiently

- Smoothing iterations reduce high-frequency error efficiently, but not
  low-frequency error

(Plot: error norm versus iteration count for wavenumbers k = 1, 2, 4; the
larger k is, the faster the error decays.)
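This is easy to reproduce numerically. The sketch below is my construction (a
1D Poisson matrix and a weighted-Jacobi smoother with w = 2/3): it starts from
pure sine-mode errors and shows that the high-wavenumber mode is damped quickly
while the low-wavenumber mode barely moves.

```python
import numpy as np

n = 64                                                      # n-1 interior points
A = 2*np.eye(n-1) - np.eye(n-1, k=1) - np.eye(n-1, k=-1)    # 1D Poisson matrix

def weighted_jacobi(x, b, sweeps, w=2/3):
    D = np.diag(A)
    for _ in range(sweeps):
        x = x + w * (b - A @ x) / D
    return x

# With b = 0 the exact solution is 0, so the iterate IS the error
grid = np.arange(1, n) / n
for k in (1, 2, 16):
    e0 = np.sin(k * np.pi * grid)                # Fourier mode with wavenumber k
    e = weighted_jacobi(e0, np.zeros(n-1), sweeps=10)
    print(k, np.linalg.norm(e) / np.linalg.norm(e0))  # k=16 shrinks, k=1 stalls
```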

Multigrid: A First Glance

- Two levels: coarse and fine grid

Fine grid Ω^h:    A^h x^h = b^h   (the original Ax = b)

Coarse grid Ω^2h: A^2h x^2h = b^2h

Idea 1: The V-cycle Iteration

- Also called nested iteration

Start with the coarse problem A^{2h} x^{2h} = b^{2h} on Ω^{2h}; iterate to get
x^{2h}; prolongate (↑) the result to the fine grid Ω^h as an initial guess;
iterate on A^h x^h = b^h; restriction (↓) maps fine-grid data back to Ω^{2h}.

Question 1: Why do we need the coarse grid?

Prolongation

- Prolongation (interpolation) operator I_{2h}^h
- Maps coarse-grid values to the fine grid: x^{2h} -> x^h

Restriction

- Restriction operator I_h^{2h}
- Maps fine-grid values to the coarse grid: x^h -> x^{2h}
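For the 1D model problem with Dirichlet boundaries, both operators are a few
lines of NumPy (a sketch with my indexing convention: a coarse grid of m
interior points maps to a fine grid of 2m+1):

```python
import numpy as np

def prolong(x2h):
    """Linear interpolation I_{2h}^h (1D): copy at coincident points,
    average the two coarse neighbors at in-between points."""
    nc = len(x2h)
    xh = np.zeros(2 * nc + 1)
    xh[1::2] = x2h                             # coarse points coincide here
    xh[2:-1:2] = 0.5 * (x2h[:-1] + x2h[1:])    # midpoints: average of neighbors
    xh[0] = 0.5 * x2h[0]                       # boundary neighbors (u = 0 outside)
    xh[-1] = 0.5 * x2h[-1]
    return xh

def restrict(xh):
    """Full weighting I_h^{2h} (1D): (1/4, 1/2, 1/4) average around each
    coarse point; equals (1/2) * prolong^T."""
    return 0.25 * xh[:-2:2] + 0.5 * xh[1:-1:2] + 0.25 * xh[2::2]
```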

Smoothing

- The basic iterations at each level
- On Ω^{ph}: x^{ph}_old -> x^{ph}_new
- The iteration reduces the error and makes the remaining error geometrically
  smooth
- So the iteration is called smoothing

Why multilevel ?

- Coarse-level iteration is cheap
- More than this:
- Coarse-level smoothing reduces the error more efficiently than the fine
  level, in some sense
- Why? (Question 2)

Error Restriction

- Mapping the error to the coarse grid makes the error more oscillatory

(Figure: the same wavenumber k = 4 has frequency θ = π/2 on the fine grid but
θ = π on the coarse grid.)

Idea 2 Residual correction

- Known: the current solution x
- Solving Ax = b is equivalent to solving the residual equation Ae = r
- MG does NOT map x directly between levels
- It maps the residual equation to the coarse level:
- Calculate r^h = b^h - A^h x^h
- b^{2h} = I_h^{2h} r^h   (restriction)
- Solve A^{2h} x^{2h} = b^{2h}, then e^h = I_{2h}^h x^{2h}   (prolongation)
- x^h = x^h + e^h

Why residual correction ?

- The error is smooth at the fine level, but the actual solution may not be.
- Prolongation of the coarse-grid correction yields a smooth error at the fine
  level, which is supposed to be a good approximation of the fine-level error.
- If the solution itself is not smooth at the fine level, prolongating it
  directly would introduce more high-frequency error.

Revised V-cycle with Idea 2

Ω^h -> Ω^{2h}:

1. Smooth on x^h
2. Calculate r^h
3. b^{2h} = I_h^{2h} r^h   (restriction)
4. Smooth on x^{2h}
5. e^h = I_{2h}^h x^{2h}   (prolongation)
6. Correct: x^h = x^h + e^h
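Putting the pieces together: a two-grid residual-correction cycle for the 1D
model problem (a sketch reusing the prolong/restrict helpers above, a
weighted-Jacobi smoother, and an exact coarse solve; recursing on the coarse
solve instead would give the full V-cycle):

```python
import numpy as np

def two_grid(A, b, x, pre=3, post=3, w=2/3):
    """One two-grid cycle for A x = b (fine grid size must be odd, 2m+1)."""
    D = np.diag(A)
    for _ in range(pre):                        # 1. smooth on x^h
        x = x + w * (b - A @ x) / D
    r = b - A @ x                               # 2. fine-level residual r^h
    b2h = restrict(r)                           # 3. b^{2h} = I_h^{2h} r^h
    P = np.array([prolong(e) for e in np.eye(len(b2h))]).T   # I_{2h}^h as a matrix
    A2h = 0.5 * P.T @ A @ P                     # Galerkin: A^{2h} = I_h^{2h} A^h I_{2h}^h
    e2h = np.linalg.solve(A2h, b2h)             # 4. coarse-level solve
    x = x + prolong(e2h)                        # 5.+6. e^h = I_{2h}^h x^{2h}; correct x^h
    for _ in range(post):                       # post-smoothing
        x = x + w * (b - A @ x) / D
    return x
```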

What is A^{2h}?

- Galerkin condition: A^{2h} = I_h^{2h} A^h I_{2h}^h

Going to Multiple Levels

- V-cycle and W-cycle: recurse through levels h, 2h, 4h, ...
- Full Multigrid (FMG) V-cycle: start on the coarsest level, e.g.
  h, 2h, 4h, 8h, and work up to the finest

Performance of Multigrid

- Complexity comparison (model problem, N unknowns, error tolerance ε):

  Gaussian elimination     O(N^2)
  Jacobi iteration         O(N^2 log ε)
  Gauss-Seidel             O(N^2 log ε)
  SOR                      O(N^(3/2) log ε)
  Conjugate gradient       O(N^(3/2) log ε)
  Multigrid (iterative)    O(N log ε)
  Multigrid (FMG)          O(N)

Summary of MG ideas

- Three important ideas of MG
- Nested iteration
- Residual correction
- Elimination of error:
  high-frequency error on the fine grid,
  low-frequency error on the coarse grid

AMG for unstructured grids

- Ax = b, with no regular grid structure
- The "fine grid" is defined from the graph of A

Three questions for AMG

- How to choose the coarse grid?
- How to define the smoothness of errors?
- How are interpolation and prolongation done?

How to choose coarse grid

- Idea:
- C/F splitting
- As few coarse-grid points as possible
- For each F-node, at least one of its neighbors is a C-node
- Choose nodes with strong coupling to other nodes as C-nodes
  (a greedy sketch follows the figure below)

(Figure: example C/F splitting on a six-node graph, nodes 1-6.)
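A simplified greedy sketch of these rules (my construction, loosely in the
spirit of the Ruge-Stuben first pass; the strength threshold theta is an
assumption):

```python
import numpy as np

def cf_split(A, theta=0.25):
    """Greedy C/F splitting: j is strongly coupled to i if |A[i,j]| is at
    least theta times the largest off-diagonal magnitude in row i."""
    off = np.abs(A - np.diag(np.diag(A)))
    strong = (off >= theta * off.max(axis=1, keepdims=True)) & (off > 0)
    unassigned, C, F = set(range(len(A))), set(), set()
    while unassigned:
        # pick the unassigned node with the most strong couplings as a C-node
        i = max(unassigned, key=lambda v: strong[v].sum())
        C.add(i); unassigned.discard(i)
        for j in np.nonzero(strong[i])[0]:   # its strong neighbors become F-nodes
            if j in unassigned:
                F.add(j); unassigned.discard(j)
    return C, F   # every F-node has at least one strongly coupled C-neighbor
```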

How to define the smoothness of error

- AMG fundamental concept:
- Smooth error = small residuals: ||r|| << ||e||

How are Prolongation and Restriction done

- Prolongation is based on smooth error and strong connections
- Common practice: interpolate each F-node from the C-nodes it is strongly
  connected to

AMG Prolongation (2)

AMG Prolongation (3)

- Restriction: commonly the transpose of the prolongation operator,
  I_h^{2h} = (I_{2h}^h)^T

Summary

- Multigrid is a multilevel iterative method.
- Advantage: scalable
- If no geometric grid is available, try the algebraic multigrid (AMG) method

The landscape of Solvers

(Figure: solvers arranged along two axes, from "more robust" to "less storage
(if sparse)".)