Transcript and Presenter's Notes

Title: Linear Algebra and Complexity


1
Linear Algebra and Complexity
Chris Dickson CAS 746 - Advanced Topics in
Combinatorial Optimization McMaster University,
January 23, 2006
2
Introduction
  • Review of linear algebra
  • Complexity of determinant, inverse, systems of
    equations, etc
  • Gaussian elimination method for solving linear
    systems
  • complexity
  • solvability
  • numerical issues
  • Iterative methods for solving linear systems
  • Jacobi Method
  • Gauss-Seidel Method
  • Successive Over-Relaxation Methods

3
Review
  • A subset L of Rn (respectively Qn) is called a
    linear hyperplane if L = {x : ax = 0} for some
    nonzero row vector a in Rn (Qn).
  • A set U of vectors in Rn (respectively Qn) is
    called a linear subspace of Rn if the following
    properties hold
  • The zero vector is in U
  • If x ∈ U then λx ∈ U for any scalar λ
  • If x, y ∈ U then x + y ∈ U
  • Important subspaces
  • Null space: null(A) = {x : Ax = 0}
  • Image: im(A) = {b : Ax = b for some x}
  • A basis for a subspace L is a set of linearly
    independent vectors that spans L

4
Review
  • Theorem 3.1. Each linear subspace L of Rn is
    generated by finitely many vectors and is also
    the intersection of finitely many linear
    hyperplanes
  • Proof
  • If L is a subspace, then a maximal set of
    linearly independent vectors from L will form a
    basis.
  • Also, L is the intersection of the hyperplanes
    {x : a_i x = 0}, where the a_i are linearly
    independent vectors generating {z : zx = 0 for
    all x in L}.

5
Review
  • Corollary 3.1a. For each matrix A there exist
    vectors x1, ..., xt such that Ax = 0 if and only
    if x = λ1x1 + ... + λtxt for certain rationals
    λ1, ..., λt.
  • If x1, ..., xt are linearly independent they
    form a fundamental system of solutions to Ax = 0.
  • Corollary 3.1b. The system Ax = b has a solution
    iff yb = 0 for each row vector y with yA = 0.
    (Fundamental theorem of linear algebra, Gauss
    1809.)

6
Review
  • Proof
  • (⇒) If Ax = b, then for any y with yA = 0 we have
    yb = yAx = 0.
  • (⇐) Let im(A) = {z : Ax = z for some x}. By
    Theorem 3.1 there exists a matrix C such that
    im(A) = {z : Cz = 0}. Since CAx = 0 for every x,
    the matrix CA is all-zero, so each row y of C
    satisfies yA = 0. Thus (yA = 0 ⇒ yb = 0) gives
    Cb = 0, hence b ∈ im(A), so there is a solution
    to Ax = b. (A small numerical check of this
    criterion is sketched below.)
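This characterization can also be checked numerically. Below is a minimal sketch in NumPy that uses the equivalent rank test rank([A | b]) = rank(A) instead of enumerating vectors y; the matrices and vectors are illustrative assumptions, not data from the slides.

```python
import numpy as np

def is_solvable(A, b):
    """Ax = b has a solution iff appending b does not raise the rank,
    i.e. iff b lies in im(A) (equivalently: yA = 0 implies yb = 0)."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float).reshape(-1, 1)
    return np.linalg.matrix_rank(np.hstack([A, b])) == np.linalg.matrix_rank(A)

A = np.array([[1.0, 2.0],
              [2.0, 4.0]])                 # rank-1 matrix
print(is_solvable(A, [1.0, 2.0]))          # True:  b is in im(A)
print(is_solvable(A, [1.0, 3.0]))          # False: there is a y with yA = 0 but yb != 0
```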

7
Review
  • Corollary 3.1c. If x0 is a solution to Ax = b,
    then there are vectors x1, ..., xt such that
    Ax = b iff x = x0 + λ1x1 + ... + λtxt for certain
    rationals λ1, ..., λt.
  • Proof: Ax = b iff A(x - x0) = 0. Corollary 3.1a
    implies (x - x0) = λ1x1 + ... + λtxt, i.e.
    x = x0 + λ1x1 + ... + λtxt.

8
Review
  • Cramer's Rule: If A is an invertible (n x
    n)-matrix, the solution to Ax = b is given by
    x1 = det(A1) / det(A), ..., xn = det(An) / det(A).
  • Ai is defined by replacing the ith column of A by
    the column vector b.

9
Review
  • Example:
      5x1 + x2 + x3 = 4
      9x1 + x2 + x3 = 1
      x1 - x2 + 5x3 = 2
    Here, x2 = det(A2) / det(A) = 166 / 16 = 10.3750.
    (A small sketch of the rule follows below.)
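A minimal sketch of Cramer's rule with exact rational arithmetic. The coefficients below are the equations as transcribed above; if signs were lost in extraction the printed values need not reproduce the 166/16 figure, but the mechanics x_i = det(A_i)/det(A) are the same.

```python
from fractions import Fraction

def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion (exact with Fractions)."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a*(e*i - f*h) - b*(d*i - f*g) + c*(d*h - e*g)

def cramer3(A, b):
    """Cramer's rule for a nonsingular 3x3 system: x_i = det(A_i)/det(A)."""
    dA = det3(A)
    solution = []
    for i in range(3):
        Ai = [row[:i] + [bi] + row[i+1:] for row, bi in zip(A, b)]  # column i replaced by b
        solution.append(det3(Ai) / dA)
    return solution

# coefficients as transcribed on the slide (signs may have been lost in extraction)
A = [[Fraction(5), Fraction(1), Fraction(1)],
     [Fraction(9), Fraction(1), Fraction(1)],
     [Fraction(1), Fraction(-1), Fraction(5)]]
b = [Fraction(4), Fraction(1), Fraction(2)]
print(cramer3(A, b))   # each component is an exact rational det(A_i)/det(A)
```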

10
Sizes and Characterizations
  • Rational number: r = p / q (p ∈ Z, q ∈ N, p and q
    relatively prime)
  • Rational vector: c = (γ1, γ2, ..., γn) with
    rational entries
  • Rational Matrix
  • We only consider rational entries for vectors and
    matrices.

11
Sizes and Characterizations
  • Theorem 3.2. Let A be a square rational matrix of
    size σ. Then the size of det(A) is less than 2σ.
  • Proof: Let A = (pij/qij) for i, j = 1..n, and let
    det(A) = p/q, where p and q, and each pij and
    qij, are relatively prime. Then
  • From the definition of the determinant

12
Sizes and Characterizations
  • Noting that p = det(A) · q, we combine the two
    formulas to bound p and q.
  • Plugging the bounds for p and q into the
    definition of the size of a rational number, we
    get the stated bound of 2σ. (A numerical check of
    the bound is sketched below.)
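The bound can be sanity-checked on small matrices. This is a minimal sketch; the size function used here, 1 + ⌈log2(|p|+1)⌉ + ⌈log2(q+1)⌉ per rational p/q with the matrix size taken as the sum of its entry sizes, is an assumption of the sketch (it ignores any dimension terms in the book's exact definition).

```python
from fractions import Fraction
from math import ceil, log2
from itertools import permutations

def size(r):
    # assumed size of a rational p/q: 1 + ceil(log2(|p|+1)) + ceil(log2(q+1))
    r = Fraction(r)
    return 1 + ceil(log2(abs(r.numerator) + 1)) + ceil(log2(r.denominator + 1))

def exact_det(A):
    # exact determinant via the Leibniz expansion (fine for small n)
    n = len(A)
    total = Fraction(0)
    for perm in permutations(range(n)):
        inversions = sum(perm[i] > perm[j] for i in range(n) for j in range(i + 1, n))
        term = Fraction(-1) if inversions % 2 else Fraction(1)
        for row, col in enumerate(perm):
            term *= A[row][col]
        total += term
    return total

A = [[Fraction(1, 3), Fraction(2, 5)],
     [Fraction(7, 2), Fraction(4, 9)]]
sigma = sum(size(entry) for row in A for entry in row)   # size of A (entry sizes only)
print(size(exact_det(A)), "<", 2 * sigma)                # determinant size vs. the 2*sigma bound
```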

13
Sizes and Characterizations
  • Several important corollaries follow from the
    theorem.
  • Corollary 3.2a. The inverse A^-1 of a nonsingular
    (rational) matrix A has size polynomially bounded
    by the size of A.
  • Proof: Entries of A^-1 are quotients of
    subdeterminants of A. Then apply the theorem.
  • Corollary 3.2b. If the system Ax = b has a
    solution, it has one of size polynomially bounded
    by the sizes of A and b.
  • Proof: Since A^-1 has polynomially bounded size,
    so does A^-1 b.
  • This tells us that Corollary 3.1b provides a good
    characterization.

14
Sizes and Characterizations
  • What if there isn't a solution to Ax = b? Is
    this problem still polynomially bounded? Yes!
  • Corollary 3.2c says so! If there is no solution,
    we can specify a vector y with yA = 0 and yb = 1.
    By Corollary 3.2b, such a vector exists and is
    polynomially bounded by the sizes of A and b.
  • Gaussian elimination provides a polynomial method
    for finding (or not finding) a solution.
  • One more thing left to show: do we know the
    solution will be polynomially bounded in size?

15
Sizes and Characterizations
  • Corollary 3.2d. Let A be a rational m x n matrix
    and let b be a rational column vector such that
    each row of the matrix [A b] has size at most φ.
    If Ax = b has a solution, then
    {x : Ax = b} = {x0 + λ1x1 + ... + λtxt : λi ∈ R} (*)
    for certain rational vectors x0, ..., xt of size
    at most 4n²φ.
  • Proof: By Cramer's Rule, there exist x0, ..., xt
    satisfying (*) which are quotients of
    subdeterminants of [A b] of order at most n. By
    Theorem 3.2, each such determinant has size less
    than 2nφ, so each component of xi has size less
    than 4nφ. Since each xi has n components, the
    size of xi is at most 4n²φ.

16
Gaussian Elimination
  • Given a system of linear equations
      a11x1 + a12x2 + ... + a1nxn = b1
      a21x1 + a22x2 + ... + a2nxn = b2
      .........
      am1x1 + am2x2 + ... + amnxn = bm
    we subtract appropriate multiples of the first
    equation from the other equations to get
      a11x1 + a12x2 + ... + a1nxn = b1
             a'22x2 + ... + a'2nxn = b'2
      .........
             a'm2x2 + ... + a'mnxn = b'm
    We then recursively solve the system of the last
    m-1 equations.

17
Gaussian Elimination
  • Operations allowed on matrices
  • Adding a multiple of one row to another row
  • Permuting rows or columns
  • Forward Elimination: Given a matrix Ak of the form
        [ B  C ]
        [ 0  D ]
    (B non-singular, upper triangular, of order k),
    we choose a nonzero component of D called the
    pivot element. Without loss of generality, assume
    that this pivot element is D11. Next, add rational
    multiples of the first row of D to the other rows
    of D so that D11 becomes the only non-zero element
    in the first column of D.

18
Gaussian Elimination
  • Back Substitution: After the FE stage, we wish to
    transform the matrix further, into a form whose
    leading block Δ is non-singular and diagonal.
  • We accomplish this by adding multiples of the kth
    row to rows 1, ..., k-1 so that the element Akk is
    the only non-zero entry in the kth row and column,
    for k = r, r-1, ..., 1 (r = rank(A)). A sketch of
    both stages in code follows below.
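A minimal sketch of both stages for a square, nonsingular system, using exact rational arithmetic so no rounding issues arise. Pivoting strategies are discussed on a later slide; here the first nonzero entry in the current column is simply taken as the pivot.

```python
from fractions import Fraction

def gauss_solve(A, b):
    """Solve Ax = b for a square, nonsingular A by forward elimination
    followed by back substitution, using exact rational arithmetic."""
    n = len(A)
    # augmented matrix [A | b] with Fraction entries
    M = [[Fraction(A[i][j]) for j in range(n)] + [Fraction(b[i])] for i in range(n)]

    # forward elimination: make column k zero below the pivot row k
    for k in range(n):
        # naive pivot choice: first row with a nonzero entry in column k
        # (assumes A is nonsingular, so such a row exists)
        p = next(r for r in range(k, n) if M[r][k] != 0)
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            factor = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= factor * M[k][j]

    # back substitution: solve from the last equation upwards
    x = [Fraction(0)] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

print(gauss_solve([[2, 1], [1, 3]], [3, 5]))   # [Fraction(4, 5), Fraction(7, 5)]
```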

19
Gaussian Elimination
  • Theorem 3.3. For rational data, the Gaussian
    elimination method is a polynomial-time method.
  • The proof is found in the book. It is very
    long-winded!
  • Since the operations allowed in GE are polynomially
    bounded, it only remains to show that the numbers
    do not grow exponentially.
  • A corollary of the theorem is that the following
    matrix operations are also polynomially solvable,
    as they reduce to solving systems of linear
    equations:
  • calculating determinant of a rational matrix
  • determining the rank of a rational matrix
  • calculating the inverse of a rational matrix
  • testing a set of vectors for linear independence
  • solving a system of rational linear equations

20
Gaussian Elimination
  • Numerical Considerations
  • A naive pivot strategy always uses D11 as the
    pivot. However, this can lead to numerical
    problems when we do not have exact arithmetic
    (i.e., floating-point numbers).
  • Partial pivoting always selects the largest
    element (in absolute value) in the first column
    of D as the pivot. (swap rows to make this D11)
  • Scaled partial pivoting is like partial pivoting,
    but it scales the pivot column and row so that
    D11 = 1.
  • Full pivoting uses the largest value in the
    entire D submatrix as the pivot. We then swap
    rows and columns to put this value in D11.
  • Studies show that the extra overhead of full
    pivoting usually outweighs its benefit. (The
    pivot-row selection rules are sketched below.)
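A sketch of how the pivot row could be selected under partial and scaled partial pivoting. This is one common formulation; the helper names are illustrative, and M is the remaining submatrix as a list of rows of floats.

```python
def partial_pivot_row(M, k):
    """Row index >= k whose entry in column k is largest in absolute value."""
    return max(range(k, len(M)), key=lambda r: abs(M[r][k]))

def scaled_partial_pivot_row(M, k):
    """Like partial pivoting, but each candidate entry is measured relative
    to the largest entry in its own row, so rows with uniformly big entries
    are not favoured unfairly."""
    def scaled(r):
        row_max = max(abs(v) for v in M[r][k:])
        return abs(M[r][k]) / row_max if row_max else 0.0
    return max(range(k, len(M)), key=scaled)
```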

21
Gaussian Elimination
  • Other methods
  • Gauss-Jordan Method: This method combines the FE
    and BS stages of Gaussian elimination.
  • inferior to ordinary Gaussian elimination
  • LU Decomposition: Uses GE to decompose the
    original matrix A into the product of a lower and
    an upper triangular matrix.
  • This is the method of choice when we must solve
    many systems with the same matrix A but with
    different right-hand sides b (see the sketch
    below).
  • We only need to perform FE once, then do back
    substitution for each right-hand side.
  • Iterative Methods: Want to solve Ax = b by
    determining a sequence of vectors x0, x1, ...,
    which eventually converges to a solution x.
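A short example of the LU workflow using SciPy: factor once, then reuse the factors for several right-hand sides. The matrix and vectors here are arbitrary illustrations, not data from the slides.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

A = np.array([[4.0, 3.0],
              [6.0, 3.0]])
lu, piv = lu_factor(A)        # forward elimination done once (the costly step)

# back substitution is repeated cheaply for each new right-hand side
for b in ([1.0, 2.0], [0.0, 1.0], [5.0, -3.0]):
    x = lu_solve((lu, piv), np.array(b))
    print(x)
```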

22
Iterative Methods
  • Given Ax = b and x0, the goal is to find a
    sequence of iterates x1, x2, ... that converges
    to an x with Ax = b.
  • For most methods, the approach is to split A as
    A = L + D + U, where L is strictly lower
    triangular, D is diagonal, and U is strictly
    upper triangular.
  • Example: a matrix A written out as the sum
    L + D + U (shown on the slide).

23
Iterative Methods
  • Now we can define how the iteration proceeds. In
    general, with a splitting A = M - N, we iterate
    x_{k+1} = M^-1 (N x_k + b).
  • Jacobi Iteration: M_J = D, N_J = -(L + U).
    Alternatively, the book presents the iteration
    x_{k+1} = (I - A) x_k + b. These two are
    equivalent because the book assumes a_ii = 1,
    which means D = D^-1 = I (identity). A sketch of
    the iteration appears below.
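A minimal sketch of the Jacobi iteration in the splitting form M_J = D, N_J = -(L + U). The small diagonally dominant system at the bottom is an illustrative assumption, not the example from the slides.

```python
import numpy as np

def jacobi(A, b, x0, iterations=25):
    """Jacobi iteration for the splitting A = L + D + U:
    x_{k+1} = D^{-1} (b - (L + U) x_k)."""
    D = np.diag(np.diag(A))                # diagonal part of A
    LU = A - D                             # strictly lower + strictly upper parts
    D_inv = np.diag(1.0 / np.diag(A))      # easy to invert: D is diagonal
    x = np.array(x0, dtype=float)
    for _ in range(iterations):
        x = D_inv @ (b - LU @ x)
    return x

# illustrative diagonally dominant system (not the slides' example)
A = np.array([[4.0, 1.0], [2.0, 5.0]])
b = np.array([5.0, 7.0])
print(jacobi(A, b, x0=b))   # converges towards the exact solution [1, 1]
```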

24
Iterative Methods
  • Example: Solve for x in Ax = b using Jacobi's
    Method (A and b as shown on the slide).
  • Using the iteration in the previous slide and
    x0 = b, we obtain
      x1  = [-5, 0.667, 1.5]^T
      x2  = [0.833, -1.0, -1.0]^T
      .........
      x10 = [0.9999, 0.9985, 1.9977]^T,
    approaching x = [1, 1, 2]^T.

25
Iterative Methods
  • Example 2: Solve for x in Ax = b using Jacobi's
    Method (A and b as shown on the slide).
  • Using x0 = b, we obtain
      x1  = [-44.5, -49.5, -13.4]^T
      x2  = [112.7, 235.7, 67.7]^T
      .........
      x10 = [2.82, 4.11, 1.39]^T x 10^6,
    while the true solution is x = [1, 1, 1]^T.

26
Iterative Methods
  • What happened? The sequence diverged!
  • The matrix in the second example was NOT
    diagonally dominant.
  • Strict diagonal row dominance is sufficient for
    convergence.
  • The diagonal element is greater than the sum of
    all other elements in the same row (take absolute
    values!).
  • It is not necessary, though, as example 1 did not
    have strict diagonal dominance.
  • Alternatively, the book says that the Jacobi
    Method will converge if all eigenvalues of (I - A)
    are less than 1 in absolute value.
  • Again, this is a sufficient condition. It can be
    checked that the eigenvalues of the matrix in the
    first example do not satisfy this condition.
    (Both checks are sketched below.)
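Both convergence checks can be written in a few lines. This is a sketch; the spectral-radius test below uses the Jacobi iteration matrix I - D^-1 A, which reduces to I - A under the book's assumption a_ii = 1, and the test matrices are illustrative assumptions.

```python
import numpy as np

def strictly_diagonally_dominant(A):
    """Each |a_ii| strictly exceeds the sum of the other |a_ij| in its row."""
    M = np.abs(np.asarray(A, dtype=float))
    diag = np.diag(M)
    off = M.sum(axis=1) - diag
    return bool(np.all(diag > off))

def jacobi_spectral_radius(A):
    """Spectral radius of the Jacobi iteration matrix I - D^{-1} A
    (equal to I - A when all a_ii = 1); the Jacobi iteration converges
    from every starting point iff this value is below 1."""
    A = np.asarray(A, dtype=float)
    D_inv = np.diag(1.0 / np.diag(A))
    G = np.eye(len(A)) - D_inv @ A            # equals -D^{-1}(L + U)
    return float(max(abs(np.linalg.eigvals(G))))

A_good = np.array([[4.0, 1.0], [2.0, 5.0]])   # dominant: both tests pass
A_bad = np.array([[1.0, 5.0], [7.0, 1.0]])    # not dominant: radius > 1
print(strictly_diagonally_dominant(A_good), jacobi_spectral_radius(A_good))
print(strictly_diagonally_dominant(A_bad), jacobi_spectral_radius(A_bad))
```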

27
Iterative Methods
  • Gauss-Seidel Method: Computes the (i+1)th
    component of x_{k+1} using the components of
    x_{k+1} that are already available, i.e.,
      x_{k+1,i} = ( b_i - Σ_{j&lt;i} a_ij x_{k+1,j}
                        - Σ_{j&gt;i} a_ij x_{k,j} ) / a_ii

28
Iterative Methods
  • Gauss-Seidel Method: Alternatively, if we split
    A = L + D + U, then we obtain the iteration
    x_{k+1} = M_G^-1 (N_G x_k + b),
    where M_G = D + L and N_G = -U.
  • Again, it can be shown that the book's method is
    equivalent to this one. (A sketch appears below.)
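A minimal componentwise sketch of Gauss-Seidel, using the same illustrative system as the Jacobi sketch above (not the slides' example).

```python
import numpy as np

def gauss_seidel(A, b, x0, iterations=12):
    """Gauss-Seidel: with A = L + D + U, iterate
    x_{k+1} = (D + L)^{-1} (b - U x_k), computed component by
    component so each new value is used as soon as it is known."""
    n = len(b)
    x = np.array(x0, dtype=float)
    for _ in range(iterations):
        for i in range(n):
            s = A[i, :i] @ x[:i] + A[i, i+1:] @ x[i+1:]   # x[:i] already updated
            x[i] = (b[i] - s) / A[i, i]
    return x

# same illustrative system as the Jacobi sketch above
A = np.array([[4.0, 1.0], [2.0, 5.0]])
b = np.array([5.0, 7.0])
print(gauss_seidel(A, b, x0=b))   # reaches [1, 1] in fewer sweeps than Jacobi
```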

29
Iterative Methods
  • Example: Gauss-Seidel Method (A and b as shown on
    the slide).
  • Starting with x0 = b, we obtain the sequence
      x1  = [-4.666, -5.116, 3.366]^T
      x2  = [1.477, -2.788, 1.662]^T
      x3  = [1.821, -0.404, 1.116]^T
      ....
      x12 = [1.000, 1.000, 1.000]^T

30
Iterative Methods
  • It should also be noted that in both GS and
    Jacobi Methods, the choice of M is important.
  • Since the iterations of each method involve
    inverting M, we should choose M so that this is
    easy.
  • In Jacobi Method, M is a diagonal matrix
  • Easy to invert!
  • In the Gauss-Seidel Method, M is lower triangular
  • Slightly harder. Although inverse is still
    lower triangular, we must perform a little more
    work, but we only have to invert it once.
  • Alternatively, use the strategy by the book to
    compute iterates component by component.

31
Iterative Methods
  • Successive Over-Relaxation Method (SOR): Write
    matrices L and U (L lower triangular, including
    the diagonal) splitting A as shown on the slide,
    for some appropriately chosen ω &gt; 0 (Δ is the
    diagonal of A).
  • A related class of iterative relaxation methods
    for approximating the solution to Ax b can now
    be defined.

32
Iterative Methods
  • Algorithm: for a certain ω with 0 &lt; ω ≤ 2,
    repeatedly pick a violated equation a_i x = b_i
    and move the current iterate toward (or through)
    the corresponding hyperplane by the factor ω.
  • The method is convergent if 0 &lt; ω &lt; 2 and Ax = b
    has a solution. (A sketch of the classical SOR
    update follows below.)
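A sketch of the classical componentwise SOR update, which over-relaxes each Gauss-Seidel step by a factor ω. The book's relaxation scheme works one violated equation at a time, so this full sweep over all equations is only one common instance; the test system is the same illustrative one used above.

```python
import numpy as np

def sor(A, b, x0, omega=1.1, iterations=25):
    """Successive over-relaxation: move each component a fraction
    omega of the way along its Gauss-Seidel update
    (omega = 1 recovers Gauss-Seidel exactly)."""
    n = len(b)
    x = np.array(x0, dtype=float)
    for _ in range(iterations):
        for i in range(n):
            s = A[i, :i] @ x[:i] + A[i, i+1:] @ x[i+1:]
            gs_value = (b[i] - s) / A[i, i]           # plain Gauss-Seidel value
            x[i] = (1.0 - omega) * x[i] + omega * gs_value
    return x

# same illustrative system as above; 0 < omega < 2 keeps it convergent here
A = np.array([[4.0, 1.0], [2.0, 5.0]])
b = np.array([5.0, 7.0])
print(sor(A, b, x0=b, omega=1.1))   # converges to [1, 1]
```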

33
Iterative Methods
  • There are many different choices that can be made
    for the relaxation methods
  • Choice of ω
  • ω = 2: x_{k+1} is the reflection of x_k in the
    hyperplane a_i x = b_i
  • ω = 1: x_{k+1} is the projection of x_k onto the
    hyperplane a_i x = b_i
  • How to choose violated equation
  • Is it best to just pick any violated equation?
  • Should we have a strategy? (First, Best, Worst,
    etc)
  • Stopping Criteria
  • How close is close enough?

34
Iterative Methods
  • Why use iterative methods?
  • Sometimes they will be more numerically stable.
  • Particularly when the matrix A is positive
    semi-definite and diagonally dominant. (This is
    the case for the linear least squares problem.)
  • If an exact solution is not necessary and we only
    need to be close enough.
  • Which one to use?
  • Compared with Jacobi Iteration, Gauss-Seidel
    Method converges faster, usually by a factor of 2.

35
References
  • Schrijver, A. Theory of Linear and Integer
    Programming. John Wiley &amp; Sons, 1998.
  • Qiao, S. Linear Systems. CAS 708 Lecture Slides,
    2005.

36
  • THE END