Title: A Little Necessary Matrix Algebra for Doctoral Studies in Business Administration
1. A Little Necessary Matrix Algebra for Doctoral Studies in Business Administration
- James J. Cochran
- Department of Computer Information Systems and Analysis, Louisiana Tech University
- Jcochran_at_cab.latech.edu
2. Matrix Algebra
- Matrix algebra is a means of efficiently expressing large numbers of calculations to be made upon ordered sets of numbers
- Often referred to as Linear Algebra
3. Why Use It?
- Matrix algebra is used primarily to facilitate mathematical expression.
- Many equations would be completely intractable if scalar mathematics had to be used. It is also important to note that the scalar algebra is still under there somewhere.
4. Definitions - Scalars
- Scalar: a single value (i.e., a number)
5. Definitions - Vectors
- Vector: a single row or column of numbers
- Each individual entry is called an element
- Vectors are denoted with bold small letters
- A vector may be a row vector or a column vector
6. Definitions - Matrices
- A matrix is a rectangular array of numbers (called elements) arranged in orderly rows and columns
- Subscripts denote the row (i = 1, ..., n) and column (j = 1, ..., m) location of an element
7. Definitions - Matrices
- Matrices are denoted with bold capital letters
- All matrices (and vectors) have an order or dimension - that is, the number of rows × the number of columns. Thus a matrix A with two rows and three columns is referred to as a two by three matrix.
- Often a matrix A of dimension n × m is denoted A(n×m)
- Often a vector a of dimension n (or m) is denoted a(n) (or a(m))
8. Definitions - Matrices
- Null matrix: a matrix for which all elements are zero, i.e., aij = 0 ∀ i, j
- Square matrix: a matrix for which the number of rows equals the number of columns (n = m)
- Symmetric matrix: a matrix for which aij = aji ∀ i, j
9. Definitions - Matrices
- Diagonal elements: elements of a square matrix for which the row and column locations are equal, i.e., aij with i = j
- Upper triangular matrix: a matrix for which all elements below the diagonal are zero, i.e., aij = 0 ∀ i > j
- Lower triangular matrix: a matrix for which all elements above the diagonal are zero, i.e., aij = 0 ∀ i < j
10. Matrix Equality
- Two matrices are equal iff (if and only if) all of their elements are identical
- Note that statistical data sets are matrices (usually with observations in the rows and variables in the columns)
11. Basic Matrix Operations
- Transpositions
- Sums and Differences
- Products
- Inversions
12. The Transpose of a Matrix
- The transpose A′ of a matrix A is the matrix such that the ith row of A′ is the ith column of A, i.e., B is the transpose of A iff bij = aji ∀ i, j
- This is equivalent to reflecting the matrix across its main diagonal (the upper left and lower right corners stay fixed)
13. Transpose of a Matrix - An Example
then
i.e.,
14. More on the Transpose of a Matrix
- (A′)′ = A (think about it!)
- If A = A′, then A is symmetric
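The deck carries no code, but the transpose rules above are easy to check numerically. A minimal sketch (assuming NumPy is available; the matrix is made up for illustration):

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])   # a 2 x 3 matrix

B = A.T                      # transpose: b_ij = a_ji
print(B.shape)               # (3, 2)

print(np.array_equal(A.T.T, A))   # (A')' = A

S = A @ A.T                  # A A' is always symmetric
print(np.array_equal(S, S.T))
```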
15. Sums and Differences of Matrices
- Two matrices may be added (subtracted) iff they are of the same order
- Simply add (subtract) elements from corresponding locations
where
16. Sums and Differences - An Example
then we can calculate C = A + B by
17. Sums and Differences - An Example
then we can calculate C = A - B by
18. Some Properties of Matrix Addition/Subtraction
- Note that
- The transpose of a sum = sum of the transposes, i.e., (A + B + C)′ = A′ + B′ + C′
- A + B = B + A (i.e., matrix addition is commutative)
- Matrix addition can be extended beyond two matrices
- Matrix addition is associative, i.e., A + (B + C) = (A + B) + C
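The addition properties above can be verified directly; a quick sketch with made-up 2 × 2 matrices (NumPy assumed):

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

C = A + B                                    # elementwise sum
print(np.array_equal(A + B, B + A))          # commutative
print(np.array_equal((A + B).T, A.T + B.T))  # transpose of a sum
```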
19. Products of Scalars and Matrices
- To multiply a scalar times a matrix, simply multiply each element of the matrix by the scalar quantity
20. Products of Scalars and Matrices - An Example
then we can calculate bA by
- Note that bA = Ab if b is a scalar
21. Some Properties of Scalar × Matrix Multiplication
- Note that
- If b is a scalar, then bA = Ab (i.e., scalar × matrix multiplication is commutative)
- Scalar × matrix multiplication can be extended beyond two scalars
- Scalar × matrix multiplication is associative, i.e., ab(C) = a(bC)
- Scalar × matrix multiplication leads to removal of a common factor, i.e., if
22. Products of Matrices
- We write the multiplication of two matrices A and
B as AB - This is referred to either as
- pre-multiplying B by A
- or
- post-multiplying A by B
- So for matrix multiplication AB, A is referred to
as the premultiplier and B is referred to as the
postmultiplier
23. Products of Matrices
- In order to multiply matrices, they must be conformable (the number of columns in the premultiplier must equal the number of rows in the postmultiplier)
- Note that
- an (m × n) × (n × p) = (m × p)
- an (m × n) × (p × n) cannot be done
- a (1 × n) × (n × 1) = a scalar (1 × 1)
24. Products of Matrices
- If we have A(3×2) and B(2×3) then
where
25. Products of Matrices
- If we have A(3×2) and B(2×3) then
i.e., matrix multiplication is not commutative (why?)
26. Matrix Multiplication - An Example
then
where
27. Some Properties of Matrix Multiplication
- Note that
- Even if conformable, AB does not necessarily equal BA (i.e., matrix multiplication is not commutative)
- Matrix multiplication can be extended beyond two matrices
- Matrix multiplication is associative, i.e., A(BC) = (AB)C
28. Some Properties of Matrix Multiplication
- Also note that
- The transpose of a product is equal to the product of the transposes in reverse order, i.e., (ABC)′ = C′B′A′
- If AA = A, then A is idempotent (and A′ is then idempotent as well, since A′A′ = (AA)′ = A′)
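The multiplication properties above can be demonstrated numerically; a sketch with made-up 2 × 2 matrices (NumPy assumed):

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[0, 1], [1, 0]])
C = np.array([[2, 0], [0, 3]])

print(np.array_equal(A @ B, B @ A))                    # False: not commutative
print(np.array_equal(A @ (B @ C), (A @ B) @ C))        # True: associative
print(np.array_equal((A @ B @ C).T, C.T @ B.T @ A.T))  # True: reversed transposes
```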
29. Special Uses for Matrix Multiplication
- Sum Row Elements of a Matrix
- Premultiply a matrix A by a conformable row vector of 1s. If
then premultiplication by
will yield the column totals for A, i.e.
30. Special Uses for Matrix Multiplication
- Sum Column Elements of a Matrix
- Postmultiply a matrix A by a conformable column vector of 1s. If
then postmultiplication by
will yield the row totals for A, i.e.
31. Special Uses for Matrix Multiplication
- The Dot (or Inner) Product of Two Vectors
- Premultiplication of a column vector a by a conformable row vector b′ yields a single value called the dot product or inner product
- If
then b′a gives us
which is the sum of products of elements in similar positions for the two vectors
32. Special Uses for Matrix Multiplication
- The Outer Product of Two Vectors
- Postmultiplication of a column vector a by a conformable row vector b′ yields a matrix containing the products of each pair of elements from the two vectors (called the outer product)
- If
then ab′ gives us
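Both products above can be sketched in a few lines (NumPy assumed; the vectors are made up):

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

inner = a @ b            # dot product: 1*4 + 2*5 + 3*6
print(inner)             # 32

outer = np.outer(a, b)   # 3 x 3 matrix of all pairwise products
print(outer.shape)       # (3, 3)
```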
33. Special Uses for Matrix Multiplication
- Sum the Squared Elements of a Vector
- Premultiply a column vector a by its transpose. If
- then premultiplication by the row vector a′
- will yield the sum of the squared values of the elements of a, i.e.
34. Special Uses for Matrix Multiplication
- Postmultiply a row vector a′ by its transpose. If
- then postmultiplication by the column vector a
- will yield the sum of the squared values of the elements of a, i.e.
35. Special Uses for Matrix Multiplication
- Determining if Two Vectors are Orthogonal: two conformable vectors a and b are orthogonal iff a′b = 0
- Example: Suppose we have
- then
36. Special Uses for Matrix Multiplication
- Representing Systems of Simultaneous Equations: suppose we have the following system of simultaneous equations
- px1 + qx2 + rx3 = M
- dx1 + ex2 + fx3 = N
- If we let
then we can represent the system (in matrix notation) as Ax = b (why?)
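The Ax = b representation can be sketched numerically. Here the coefficient values are hypothetical stand-ins for p, q, r, d, e, f (NumPy assumed):

```python
import numpy as np

A = np.array([[2.0, 1.0, 3.0],    # p, q, r  (hypothetical values)
              [1.0, 4.0, 2.0]])   # d, e, f  (hypothetical values)
x = np.array([1.0, 2.0, 3.0])

b = A @ x        # b = (M, N)': each row of A dotted with x
print(b)         # [13. 15.]
```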
37. Special Uses for Matrix Multiplication
- Linear Independence: any subset of columns (or rows) of a matrix A is said to be linearly independent if no column (row) in the subset can be expressed as a linear combination of the other columns (rows) in the subset.
- If such a combination exists, then the columns (rows) are said to be linearly dependent.
38. Special Uses for Matrix Multiplication
- The Rank of a matrix is defined to be the number of linearly independent columns (or rows) of the matrix.
- Nonsingular (Full Rank) Matrix: any matrix that has no linear dependencies among its columns (rows). For a square matrix A this implies that Ax = 0 iff x = 0.
- Singular (Not of Full Rank) Matrix: any matrix that has at least one linear dependency among its columns (rows).
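Rank and singularity can be checked numerically. A sketch with a made-up matrix whose third column is three times its first, mirroring the example that follows (NumPy assumed):

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 5.0, 6.0],
              [3.0, 1.0, 9.0]])   # column 3 = 3 * column 1

# rank < 3 means A is singular (not of full rank)
print(np.linalg.matrix_rank(A))   # 2
```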
39. Special Uses for Matrix Multiplication
- Example - The following matrix A
is singular (not of full rank) because the third column is equal to three times the first column. This result implies there is no unique solution to the system of equations Ax = 0 (why?).
40. Special Uses for Matrix Multiplication
- Example - The following matrix A
is singular (not of full rank) because the third column is equal to the first column plus two times the second column. Note that the number of linearly independent rows in a matrix will always equal the number of linearly independent columns in the matrix.
41. Geometry of Vectors
- Vectors have geometric properties of length and direction; for a vector x = (x1, x2)′ we have length Lx = √(x1² + x2²) (why?)
[Figure: the vector x plotted in the plane, with components x1 and x2 on axes 1 and 2]
42. Geometry of Vectors
- Recall the Pythagorean Theorem: in any right triangle, the lengths of the hypotenuse c and the other two sides a and b are related by the simple formula c² = a² + b²
[Figure: a right triangle with sides a and b and its hypotenuse]
43. Geometry of Vectors
- Vector addition: for the vectors x and y we have the sum x + y
[Figure: vectors x and y of lengths Lx and Ly, with the angle θ between them]
44. Geometry of Vectors
- Scalar multiplication changes only the vector length: for the vector x we have
[Figure: vector x of length Lx and a scalar multiple of x along the same direction]
45. Geometry of Vectors
- Pairs of vectors have angles between them: for the vectors x and y we have
[Figure: vectors x and y with the angle θ between them]
46. A Little Trigonometry Review
[Figure: the unit circle, showing how cos θ varies with the angle θ: cos θ = 1 at θ = 0°, 0 < cos θ < 1 in the first quadrant, cos θ = 0 at 90°, -1 < cos θ < 0 in the second quadrant, and cos θ = -1 at 180°]
47. A Little Trigonometry Review
Suppose we rotate x and y so x lies on axis 1
[Figure: vectors x and y with the angle θxy measured from axis 1]
48. A Little Trigonometry Review
What does this imply about rxy?
[Figure: the same unit-circle diagram, with cos θxy ranging from 1 (θxy = 0°) through 0 (θxy = 90°) to -1 (θxy = 180°)]
49. Geometry of Vectors
What is the correlation between the vectors x and y?
[Figure: x and y plotted in the column space]
50. Geometry of Vectors
Rotating so x lies on axis 1 makes it easier to see
[Figure: θxy = 180°, so cos θxy = -1]
51. Geometry of Vectors
What is the correlation between the vectors x and y?
52. Geometry of Vectors
Of course, we can see this by plotting these values in the x, y (row) space
[Figure: the points (0.6, -0.3) and (1.0, -0.5) plotted with X on the horizontal axis and Y on the vertical axis]
53. Geometry of Vectors
What is the correlation between the vectors x and y?
54. Geometry of Vectors
Plotting in the column space gives us
[Figure: x and y plotted with θxy = 90°]
55. Geometry of Vectors
Rotating so x lies on axis 1 makes it easier to see
[Figure: θxy = 90°, so cos θxy = 0]
56. Geometry of Vectors
- The space of all real m-tuples, with scalar multiplication and vector addition as we have defined them, is called a vector space.
- The vector
- is a linear combination of the vectors x1, x2, ..., xk.
- The set of all linear combinations of the vectors x1, x2, ..., xk is called their linear span.
57. Geometry of Vectors
Here is the column space plot for some vectors x1 and x2
[Figure: x1 and x2 plotted on axes 1, 2, and 3]
58. Geometry of Vectors
Here is the linear span for some vectors x1 and x2
[Figure: the plane through x1 and x2 plotted on axes 1, 2, and 3]
59. Geometry of Vectors
- A set of vectors x1, x2, ..., xk is said to be linearly dependent if there exist k numbers a1, a2, ..., ak, at least one of which is nonzero, such that a1x1 + a2x2 + ... + akxk = 0
- Otherwise the set of vectors x1, x2, ..., xk is said to be linearly independent
60. Geometry of Vectors
- Are the vectors
- linearly independent?
- Take a1 = 0.5 and a2 = 1.0. Then we have
- The vectors x and y are dependent.
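A dependence of this form can be verified numerically. The vectors below are hypothetical, chosen so that 0.5·x + 1.0·y = 0 as in the slide (NumPy assumed):

```python
import numpy as np

# hypothetical vectors satisfying 0.5*x + 1.0*y = 0
x = np.array([2.0, -4.0])
y = np.array([-1.0, 2.0])

print(np.allclose(0.5 * x + 1.0 * y, 0))               # True -> dependent
print(np.linalg.matrix_rank(np.column_stack([x, y])))  # 1
```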
61. Geometry of Vectors
Geometrically x and y look like this
[Figure: x and y plotted on axes 1 and 2]
62. Geometry of Vectors
Rotating so x lies on axis 1 makes it easier to see
[Figure: θxy = 180°]
63. Geometry of Vectors
- Are the vectors
- linearly independent?
- There are no real values a1, a2 such that
- so the vectors x and y are independent.
64. Geometry of Vectors
Geometrically x and y look like this
[Figure: x and y plotted with θxy = 90°]
65. Geometry of Vectors
Rotating so x lies on axis 1 makes it easier to see
[Figure: θxy = 90°]
66. Geometry of Vectors
- Here x and y are called perpendicular (or orthogonal); this is written x ⊥ y.
- Some properties of orthogonal vectors:
- x′y = 0 ⟺ x ⊥ y
- z is perpendicular to every vector iff z = 0
- If z is perpendicular to each vector x1, x2, ..., xk, then z is perpendicular to their linear span.
67. Geometry of Vectors
Here vectors x1 and x2 (plotted in the column space) are orthogonal
[Figure: x1 and x2 plotted on axes 1, 2, and 3]
68. Geometry of Vectors
Recall that the linear span for vectors x1 and x2 is
[Figure: the plane spanned by x1 and x2]
69. Geometry of Vectors
Vector z looks like this
[Figure: z plotted on axes 1, 2, and 3]
70. Geometry of Vectors
The vector z is perpendicular to the linear span for vectors x1 and x2
Check each of the dot products!
71. Geometry of Vectors
Here vectors x1, x2, and z from our previous problem are orthogonal
[Figure: x1 and z are perpendicular]
72. Geometry of Vectors
Here vectors x1, x2, and z from our previous problem are orthogonal
[Figure: x2 and z are perpendicular]
73. Geometry of Vectors
Here vectors x1, x2, and z from our previous problem are orthogonal
[Figure: x1 and x2 are perpendicular]
Note we could rotate x1, x2, and z until they lay on our three axes!
74. Geometry of Vectors
The projection (or shadow) of a vector x on a vector y is given by (x′y / y′y)y = (x′y / Ly²)y
If y has unit length (i.e., Ly = 1), the projection (or shadow) of a vector x on a vector y simplifies to (x′y)y
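The projection formula above is easy to sketch numerically (NumPy assumed; the vectors are made up):

```python
import numpy as np

x = np.array([2.0, 3.0])
y = np.array([4.0, 0.0])

proj = (x @ y) / (y @ y) * y   # (x'y / y'y) y
print(proj)                    # [2. 0.]

# the residual x - proj is perpendicular to y
print(np.isclose((x - proj) @ y, 0.0))   # True
```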
75. Geometry of Vectors
For the vectors, the projection (or shadow) of x on y is
76. Geometry of Vectors
Geometrically the projection of x on y looks like this
[Figure: x decomposed into its projection on y and a component perpendicular to y]
77. Geometry of Vectors
Rotating so y lies on axis 1 makes it easier to see
[Figure: the same projection after rotation]
78. Geometry of Vectors
Note that we write the length of the projection of x on y like this
For our previous example, the length of the projection of x on y is
79. The Gram-Schmidt (Orthogonalization) Process
For linearly independent vectors x1, x2, ..., xk, there exist mutually perpendicular vectors u1, u2, ..., uk with the same linear span. These may be constructed by setting
80. The Gram-Schmidt (Orthogonalization) Process
We can normalize (convert to vectors z of unit length) the vectors u by setting
Finally, note that we can project a vector xk onto the linear span of the vectors x1, x2, ..., xk-1
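The construction described above (subtract from each xj its projections on the earlier u's) can be sketched as a small function; the input vectors here are made up (NumPy assumed):

```python
import numpy as np

def gram_schmidt(X):
    """Turn the columns of X into mutually perpendicular columns with the same span."""
    U = np.zeros_like(X, dtype=float)
    for j in range(X.shape[1]):
        u = X[:, j].astype(float)
        for i in range(j):  # subtract the projection of x_j on each earlier u_i
            u -= (X[:, j] @ U[:, i]) / (U[:, i] @ U[:, i]) * U[:, i]
        U[:, j] = u
    return U

X = np.array([[1.0, 1.0],
              [0.0, 1.0],
              [1.0, 0.0]])       # columns are x1 and x2
U = gram_schmidt(X)
print(np.isclose(U[:, 0] @ U[:, 1], 0.0))   # True: u1 perpendicular to u2
```

Normalizing each column of U to unit length then gives the z vectors of slide 80.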
81. The Gram-Schmidt (Orthogonalization) Process
Here are vectors x1, x2, and z from our previous problem
[Figure: x1, x2, and z plotted on axes 1, 2, and 3]
82. The Gram-Schmidt (Orthogonalization) Process
Let's construct mutually perpendicular vectors u1, u2, u3 with the same linear span - we'll arbitrarily select the first vector as u1
83. The Gram-Schmidt (Orthogonalization) Process
Now we construct a vector u2 perpendicular to vector u1 (and in the linear span of x1, x2, z)
84. The Gram-Schmidt (Orthogonalization) Process
Finally, we construct a vector u3 perpendicular to vectors u1 and u2 (and in the linear span of x1, x2, z)
85. The Gram-Schmidt (Orthogonalization) Process
86. The Gram-Schmidt (Orthogonalization) Process
Here are our orthogonal vectors u1, u2, and u3
[Figure: u1, u2, and u3 plotted on axes 1, 2, and 3]
87. The Gram-Schmidt (Orthogonalization) Process
If we normalize our vectors u1, u2, and u3, we get
88. The Gram-Schmidt (Orthogonalization) Process
and
89. The Gram-Schmidt (Orthogonalization) Process
and
90. The Gram-Schmidt (Orthogonalization) Process
The normalized vectors z1, z2, and z3 look like this
[Figure: z1, z2, and z3 plotted on axes 1, 2, and 3]
91. Special Matrices
- There are a number of special matrices. These include:
- Diagonal Matrices
- Identity Matrices
- Null Matrices
- Commutative Matrices
- Anti-Commutative Matrices
- Periodic Matrices
- Idempotent Matrices
- Nilpotent Matrices
- Orthogonal Matrices
92. Diagonal Matrices
- A diagonal matrix is a square matrix that has values on the diagonal with all off-diagonal entries being zero.
93. Identity Matrices
- An identity matrix is a diagonal matrix whose diagonal elements all equal 1
- When used as a premultiplier or postmultiplier of any conformable matrix A, the identity matrix will return the original matrix A, i.e., IA = AI = A
- Why?
94. Null Matrices
- A square matrix whose elements all equal 0
- Usually arises as the difference between two equal square matrices, i.e., A - B = 0 ⟺ A = B
95. Commutative Matrices
- Any two square matrices A and B such that AB = BA are said to commute.
- Note that it is easy to show that any square matrix A commutes with both itself and with a conformable identity matrix I.
96. Anti-Commutative Matrices
- Any two square matrices A and B such that AB = -BA are said to anti-commute.
97. Periodic Matrices
- Any matrix A such that A^(k+1) = A is said to be of period k.
- Of course any matrix A that satisfies AA = A is of period k for any integer value of k (why?).
98. Idempotent Matrices
- Any matrix A such that A² = A is said to be idempotent.
- Thus an idempotent matrix is of period k for any positive integer value of k.
99. Nilpotent Matrices
- Any matrix A such that A^p = 0, where p is a positive integer, is said to be nilpotent.
- Note that if p is the least positive integer such that A^p = 0, then A is said to be nilpotent of index p.
100. Orthogonal Matrices
- Any square matrix A whose rows (considered as vectors) are mutually perpendicular and have unit lengths, i.e., AA′ = I
- Note that A is orthogonal iff A^-1 = A′.
101. The Determinant of a Matrix
- The determinant of a matrix A is commonly denoted by |A| or det A.
- Determinants exist only for square matrices.
- They are a matrix characteristic (that can be somewhat tedious to compute).
102. The Determinant for a 2x2 Matrix
- If we have a matrix A such that
- then |A| = a11a22 - a12a21
- For example, the determinant of
- is
- Determinants for 2x2 matrices are easy!
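The 2x2 rule |A| = a11a22 - a12a21 can be checked against a library routine; a quick sketch with a made-up matrix (NumPy assumed):

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [4.0, 2.0]])

det = A[0, 0] * A[1, 1] - A[0, 1] * A[1, 0]   # a11*a22 - a12*a21
print(det)                                    # 2.0
print(np.isclose(np.linalg.det(A), det))      # True
```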
103. The Determinant for a 3x3 Matrix
- If we have a matrix A such that
- then the determinant is
- which can be expanded and rewritten as
(Why?)
104. The Determinant for a 3x3 Matrix
- If we rewrite the determinants for each of the 2x2 submatrices in
as
by substitution we have
105. The Determinant for a 3x3 Matrix
- Note that if we have a matrix A such that
- then |A| can also be written as
- or
- or
106. The Determinant for a 3x3 Matrix
- To do so, first create a matrix of the same dimensions as A consisting only of alternating signs (+, -, +, ...)
107. The Determinant for a 3x3 Matrix
- Then expand on any row or column (i.e., multiply each element in the selected row/column by the corresponding sign, then multiply each of these results by the determinant of the submatrix that results from elimination of the row and column to which the element belongs)
- For example, let's expand on the second column
108. The Determinant for a 3x3 Matrix
- The three elements on which our expansion is based will be a12, a22, and a32. The corresponding signs are -, +, -.
109. The Determinant for a 3x3 Matrix
- So for the first term of our expansion we will multiply -a12 by the determinant of the matrix formed when row 1 and column 2 are eliminated from A (called the minor and often denoted |A_rc|, where r and c are the deleted row and column),
which gives us
This product is called a cofactor.
110. The Determinant for a 3x3 Matrix
- For the second term of our expansion we will multiply a22 by the determinant of the matrix formed when row 2 and column 2 are eliminated from A,
which gives us
111. The Determinant for a 3x3 Matrix
- Finally, for the third term of our expansion we will multiply -a32 by the determinant of the matrix formed when row 3 and column 2 are eliminated from A,
which gives us
112. The Determinant for a 3x3 Matrix
- Putting this all together yields
So there are six distinct ways (one for each row and each column) to calculate the determinant of a 3x3 matrix! These can be expressed as
Note that this is referred to as the method of cofactors, and it can be used to find the determinant of any square matrix.
113. The Determinant for a 3x3 Matrix - An Example
- Suppose we have the following matrix A
Using row 1 (i.e., i = 1), the determinant is
Note that this is the same result we would achieve using any other row or column!
114. Some Properties of Determinants
- Determinants have several mathematical properties useful in matrix manipulations:
- |A| = |A′|
- If each element of a row (or column) of A is 0, then |A| = 0
- If every value in a row is multiplied by k, then |A| becomes k|A|
- If two rows (or columns) are interchanged, the sign, but not the value, of |A| changes
- If two rows (or columns) of A are identical, |A| = 0
115. Some Properties of Determinants
- |A| remains unchanged if each element of a row is multiplied by a constant and added to any other row
- If A is nonsingular, then |A| = 1/|A^-1|, i.e., |A||A^-1| = 1
- |AB| = |A||B| (i.e., the determinant of a product = product of the determinants)
- For any scalar c, |cA| = c^k|A|, where k is the order of A
- The determinant of a diagonal matrix is simply the product of the diagonal elements
116. Why are Determinants Important?
- Consider the small system of equations
- a11x1 + a12x2 = b1
- a21x1 + a22x2 = b2
- which can be represented by
- Ax = b
- where
117. Why are Determinants Important?
- If we were to solve this system of equations simultaneously for x2 we would have
- a21(a11x1 + a12x2 = b1)
- -a11(a21x1 + a22x2 = b2)
- which yields (through cancellation and rearranging)
- a21a11x1 + a21a12x2 - a11a21x1 - a11a22x2 = a21b1 - a11b2
118. Why are Determinants Important?
- or (a11a22 - a21a12)x2 = a11b2 - a21b1
- which implies
Notice that the denominator is |A| = a11a22 - a21a12.
Thus iff |A| = 0 there is either i) no unique solution or ii) no existing solution to the system of equations Ax = b!
119. Why are Determinants Important?
- This result holds true
- if we solve the system for x1 as well, or
- for a square matrix A of any order.
- Thus we can use determinants in conjunction with the A matrix (the coefficient matrix in a system of simultaneous equations) to see if the system has a unique solution.
120. Traces of Matrices
- The trace of a square matrix A is the sum of the diagonal elements
- Denoted tr(A)
- We have tr(A) = Σi aii
For example, the trace of
is
121. Some Properties of Traces
- Traces have several mathematical properties useful in matrix manipulations:
- For any scalar c, tr(cA) = c·tr(A)
- tr(A ± B) = tr(A) ± tr(B)
- tr(AB) = tr(BA)
- tr(B^-1AB) = tr(A)
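The trace properties above can be sketched numerically with made-up matrices (NumPy assumed):

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[0.0, 1.0], [2.0, 5.0]])

print(np.trace(A))                                    # 5.0: sum of the diagonal
print(np.isclose(np.trace(A @ B), np.trace(B @ A)))   # True: tr(AB) = tr(BA)
print(np.isclose(np.trace(3 * A), 3 * np.trace(A)))   # True: tr(cA) = c tr(A)
```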
122. The Inverse of a Matrix
- The inverse of a matrix A is commonly denoted by A^-1 or inv A.
- The inverse of an n x n matrix A is the matrix A^-1 such that AA^-1 = I = A^-1A
- The matrix inverse is analogous to a scalar reciprocal
- A matrix which has an inverse is called nonsingular
123. The Inverse of a Matrix
- For some n x n matrix A, an inverse matrix A^-1 may not exist.
- A matrix which does not have an inverse is singular.
- An inverse of an n x n matrix A exists iff |A| ≠ 0
124. Inverse by Simultaneous Equations
- Pre- or postmultiply your square matrix A by a dummy matrix of the same dimensions, i.e.,
- Set the result equal to an identity matrix of the same dimensions as your square matrix A, i.e.,
125. Inverse by Simultaneous Equations
- Recognize that the resulting expression implies a set of n² simultaneous equations that must be satisfied if A^-1 exists:
- a11(a) + a12(d) + a13(g) = 1, a11(b) + a12(e) + a13(h) = 0, a11(c) + a12(f) + a13(i) = 0
- a21(a) + a22(d) + a23(g) = 0, a21(b) + a22(e) + a23(h) = 1, a21(c) + a22(f) + a23(i) = 0
- a31(a) + a32(d) + a33(g) = 0, a31(b) + a32(e) + a33(h) = 0, a31(c) + a32(f) + a33(i) = 1
Solving this set of n² equations simultaneously yields A^-1.
126. Inverse by Simultaneous Equations - An Example
Then the postmultiplied matrix would be
We now set this equal to a 3x3 identity matrix
127. Inverse by Simultaneous Equations - An Example
- Recognize that the resulting expression implies the following n² simultaneous equations:
- 1a + 2d + 3g = 1, 1b + 2e + 3h = 0, 1c + 2f + 3i = 0
- 2a + 5d + 4g = 0, 2b + 5e + 4h = 1, 2c + 5f + 4i = 0
- 1a - 3d - 2g = 0, 1b - 3e - 2h = 0, 1c - 3f - 2i = 1
This system can be satisfied iff A^-1 exists.
128. Inverse by Simultaneous Equations - An Example
Solving the set of n² equations simultaneously yields
a = -2/15, b = 1/3, c = 7/15,
d = -8/15, e = 1/3, f = -2/15,
g = 11/15, h = -1/3, i = -1/15,
so we have that A^-1 is
129. Inverse by Simultaneous Equations - An Example
ALWAYS check your answer. How? Use the fact that AA^-1 = A^-1A = I and do a little matrix multiplication!
So we have found A^-1!
130. Inverse by the Gauss-Jordan Algorithm
- Augment your matrix A with an identity matrix of the same dimensions, i.e., A|I
Now we use valid row operations to convert A to I (and so A|I to I|A^-1)
131. Inverse by the Gauss-Jordan Algorithm
- Valid row operations on A|I:
- You may interchange rows
- You may multiply a row by a scalar
- You may replace a row with the sum of that row and another row multiplied by a scalar (which is often negative)
- Every operation performed on A must be performed on I
- Use valid row operations on A|I to convert A to I (and so A|I to I|A^-1)
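Either approach can be checked against a library inverse. A sketch using the 3x3 matrix implied by the simultaneous-equations example above (NumPy assumed):

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 5.0, 4.0],
              [1.0, -3.0, -2.0]])

A_inv = np.linalg.inv(A)
print(np.allclose(A @ A_inv, np.eye(3)))   # True: A A^-1 = I
print(np.allclose(A_inv @ A, np.eye(3)))   # True: A^-1 A = I
```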
132. Inverse by the Gauss-Jordan Algorithm - An Example
Then the augmented matrix A|I is
We now wish to use valid row operations to convert the A side of this augmented matrix to I
133. Inverse by the Gauss-Jordan Algorithm - An Example
- Step 1: Subtract 2 × Row 1 from Row 2
and substitute the result for Row 2 in A|I
134. Inverse by the Gauss-Jordan Algorithm - An Example
- Step 2: Subtract Row 3 from Row 1
Divide the result by 5 and substitute for Row 3 in the matrix derived in the previous step
135. Inverse by the Gauss-Jordan Algorithm - An Example
- Step 3: Subtract Row 2 from Row 3
Divide the result by 3 and substitute for Row 3 in the matrix derived in the previous step
136. Inverse by the Gauss-Jordan Algorithm - An Example
- Step 4: Subtract 2 × Row 2 from Row 1
Substitute the result for Row 1 in the matrix derived in the previous step
137. Inverse by the Gauss-Jordan Algorithm - An Example
- Step 5: Subtract 7 × Row 3 from Row 1
Substitute the result for Row 1 in the matrix derived in the previous step
138. Inverse by the Gauss-Jordan Algorithm - An Example
- Step 6: Add 2 × Row 3 to Row 2
Substitute the result for Row 2 in the matrix derived in the previous step
139. Inverse by the Gauss-Jordan Algorithm - An Example
- Now that the left side of the augmented matrix is an identity matrix I, the right side of the augmented matrix is the inverse of the matrix A (i.e., A^-1):
140. Inverse by the Gauss-Jordan Algorithm - An Example
- To check our work, let's see if our result yields AA^-1 = I
So our work checks out!
141. Inverse by Determinants
- Replace each element aij in a matrix A with an element calculated as follows:
- Find the determinant of the submatrix that results when the ith row and jth column are eliminated from A (i.e., |A_ij|)
- Attach the sign that you identified in the Method of Cofactors
- Divide by the determinant of A
- After all elements have been replaced, transpose the resulting matrix
142. Inverse by Determinants - An Example
- Again suppose we have some matrix A
We have calculated the determinant of A to be -15, so we replace element 1,1 with
Similarly, we replace element 1,2 with
143. Inverse by Determinants - An Example
- After using this approach to replace each of the nine elements of A, the eventual result will be
which is A^-1!
144. Eigenvalues and Eigenvectors
- For a square matrix A, let I be a conformable identity matrix. Then the scalars λ satisfying the polynomial equation |A - λI| = 0 are called the eigenvalues (or characteristic roots) of A.
- The equation |A - λI| = 0 is called the characteristic equation or the determinantal equation.
145. Eigenvalues and Eigenvectors
- For example, if we have a matrix A
then
which implies there are two roots or eigenvalues -- λ = -6 and λ = 4.
146. Eigenvalues and Eigenvectors
- For a matrix A with eigenvalue λ, a nonzero vector x such that Ax = λx is called an eigenvector (or characteristic vector) of A associated with λ.
147. Eigenvalues and Eigenvectors
- For example, if we have a matrix A
with eigenvalues λ = -6 and λ = 4, the eigenvector of A associated with λ = -6 is
Fixing x1 = 1 yields a solution for x2 of 2.
148. Eigenvalues and Eigenvectors
- Note that eigenvectors are usually normalized so they have unit length, i.e.,
For our previous example we have
Thus our arbitrary choice to fix x1 = 1 has no impact on the eigenvector associated with λ = -6.
149. Eigenvalues and Eigenvectors
- For matrix A and eigenvalue λ = 4, we have
We again arbitrarily fix x1 = 1, which now yields a solution for x2 of 1/2.
150. Eigenvalues and Eigenvectors
Normalization to unit length yields
Again our arbitrary choice to fix x1 = 1 has no impact on the eigenvector associated with λ = 4.
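Eigenvalues and unit-length eigenvectors can be computed directly. The matrix below is a hypothetical stand-in (not the one from the slides, whose entries were lost in extraction); NumPy is assumed:

```python
import numpy as np

# hypothetical symmetric 2 x 2 matrix
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

vals, vecs = np.linalg.eig(A)
# each column of vecs is a unit-length eigenvector satisfying A v = lambda v
for lam, v in zip(vals, vecs.T):
    print(np.allclose(A @ v, lam * v))   # True
    print(np.isclose(v @ v, 1.0))        # True: unit length
```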
151. Quadratic Forms
A Quadratic Form is a function Q(x) = x′Ax in k variables x1, ..., xk, where
and A is a k x k symmetric matrix.
152. Quadratic Forms
Note that a quadratic form has only squared terms and crossproducts, and so can be written
Suppose we have
then
153. Spectral Decomposition and Quadratic Forms
Any k x k symmetric matrix A can be expressed in terms of its k eigenvalue-eigenvector pairs (λi, ei) as
This is referred to as the spectral decomposition of A.
154. Spectral Decomposition and Quadratic Forms
For our previous example on eigenvalues and eigenvectors we showed that
has eigenvalues λ1 = -6 and λ2 = 4, with corresponding (normalized) eigenvectors
155. Spectral Decomposition and Quadratic Forms
Can we reconstruct A?
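The reconstruction A = Σ λi ei ei′ can be sketched numerically. The matrix here is a hypothetical symmetric example (NumPy assumed):

```python
import numpy as np

# hypothetical symmetric matrix
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

vals, vecs = np.linalg.eigh(A)   # eigh: eigendecomposition for symmetric matrices

# A = sum_i lambda_i * e_i e_i'  (each term is an outer product)
A_rebuilt = sum(lam * np.outer(e, e) for lam, e in zip(vals, vecs.T))
print(np.allclose(A, A_rebuilt))   # True
```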
156. Spectral Decomposition and Quadratic Forms
Spectral decomposition can be used to develop/illustrate many statistical results/concepts. We start with a few basic concepts:
- Nonnegative Definite Matrix: when a k x k matrix A is such that 0 ≤ x′Ax ∀ x′ = [x1, x2, ..., xk], the matrix A and the quadratic form are said to be nonnegative definite.
157. Spectral Decomposition and Quadratic Forms
- Positive Definite Matrix: when a k x k matrix A is such that 0 < x′Ax ∀ x′ = [x1, x2, ..., xk] ≠ [0, 0, ..., 0], the matrix A and the quadratic form are said to be positive definite.
158. Spectral Decomposition and Quadratic Forms
Example - Show that the following quadratic form is positive definite:
We first rewrite the quadratic form in matrix notation
159. Spectral Decomposition and Quadratic Forms
Now identify the eigenvalues of the resulting matrix A (they are λ1 = 2 and λ2 = 8).
160. Spectral Decomposition and Quadratic Forms
Next, using spectral decomposition we can write
where again, the vectors ei are the normalized and orthogonal eigenvectors associated with the eigenvalues λ1 = 2 and λ2 = 8.
161. Spectral Decomposition and Quadratic Forms
Sidebar - Note again that we can recreate the original matrix A from the spectral decomposition
162. Spectral Decomposition and Quadratic Forms
Because λ1 and λ2 are scalars, premultiplication and postmultiplication by x′ and x, respectively, yield
where
At this point it is obvious that x′Ax is at least nonnegative definite!
163. Spectral Decomposition and Quadratic Forms
We now show that x′Ax is positive definite, i.e.
From our definitions of y1 and y2 we have
164. Spectral Decomposition and Quadratic Forms
Since E is an orthogonal matrix, E^-1 = E′ exists. Thus,
But x ≠ 0 and x = Ey together imply y ≠ 0.
At this point it is obvious that x′Ax is positive definite!
165. Spectral Decomposition and Quadratic Forms
This suggests rules for determining whether a k x k symmetric matrix A (or equivalently, its quadratic form x′Ax) is nonnegative definite or positive definite:
- A is a nonnegative definite matrix iff λi ≥ 0, i = 1, ..., rank(A)
- A is a positive definite matrix iff λi > 0, i = 1, ..., rank(A)
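The eigenvalue rule above gives a simple numerical definiteness check. The matrix here is a hypothetical symmetric example, not the one from the slides (NumPy assumed):

```python
import numpy as np

# hypothetical symmetric quadratic-form matrix
A = np.array([[3.0, 1.0],
              [1.0, 3.0]])

vals = np.linalg.eigvalsh(A)   # eigenvalues of a symmetric matrix
print(np.all(vals > 0))        # True -> A (and x'Ax) is positive definite
print(np.all(vals >= 0))       # True -> at least nonnegative definite
```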
166. Measuring Distance
Euclidean (straight line) distance: the Euclidean distance between two points x and y (whose coordinates are represented by the elements of the corresponding vectors) in p-space is given by
167. Measuring Distance
For a previous example
[Figure: three points plotted on axes 1, 2, and 3]
the Euclidean (straight line) distances are
168. Measuring Distance
[Figure: the same three points, with pairwise distances 1.430, 1.414, and 1.414]
169. Measuring Distance
Notice that the lengths of the vectors are their distances from the origin
This is yet another place where the Pythagorean Theorem rears its head!
170. Measuring Distance
Notice also that if we connect all points equidistant from some given point z, the result is a hypersphere with its center at z and area πr²
[Figure: in p = 2 dimensions this yields a circle of radius r centered at z]
171. Measuring Distance
In p = 2 dimensions, we actually talk about area. In p ≥ 3 dimensions, we talk about volume - which is (4/3)πr³ for this problem or, more generally,
[Figure: in p = 3 dimensions we have a sphere of radius r centered at z]
172. Measuring Distance
Problem: What if the coordinates of a point x (i.e., the elements of vector x) are random variables with differing variances? Suppose
- we have n pairs of measurements on two variables X1 and X2, each having a mean of zero
- X1 is more variable than X2
- X1 and X2 vary independently
173. Measuring Distance
A scatter diagram of these data might look like this
[Figure: a scatter of points spread more widely along axis 1 than along axis 2]
Which point really lies further from the origin in statistical terms (i.e., which point is less likely to have occurred randomly)?
Euclidean distance does not account for differences in variation of X1 and X2!
174. Measuring Distance
Notice that a circle does not efficiently inscribe the data - an ellipse does so much more efficiently!
[Figure: the scatter of points inscribed by an ellipse with semi-axes r1 and r2; the area of the ellipse is πr1r2]
175Measuring Distance
How do we take the relative dispersions on the
two axes into consideration? We standardize each
value of Xi by dividing by its standard
deviation.
176Measuring Distance
Note that the problem can extend beyond two
dimensions. The volume of the three-dimensional
ellipsoid (with semi-axes r1, r2, r3) is
(4/3)πr1r2r3 or, more generally,
V = [π^(p/2) / Γ(p/2 + 1)] r1r2…rp
177Measuring Distance
If we are looking at distances from the origin
D(O, P), we could divide coordinate i by its
sample standard deviation √sii (where sii is the
sample variance):
d(O, P) = √[x1²/s11 + x2²/s22 + … + xp²/spp]
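Dividing each coordinate by its standard deviation can be sketched as follows (the variances and the point P are hypothetical illustrations, not values from the slides):

```python
import numpy as np

# Hypothetical sample variances s11, s22 (X1 is more variable than X2)
s = np.array([4.0, 1.0])

# A point P = (x1, x2)
p = np.array([2.0, 2.0])

# Statistical distance from the origin: sqrt(x1**2/s11 + x2**2/s22)
d_stat = np.sqrt(np.sum(p ** 2 / s))

# Ordinary Euclidean distance for comparison
d_euclid = np.linalg.norm(p)
```

Because X1 is more variable, the same coordinate contributes less statistical distance along axis 1 than along axis 2, so d_stat is smaller than d_euclid here.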
178Measuring Distance
The resulting measure is called Statistical
Distance or Mahalanobis Distance
179Measuring Distance
Note that if we plot all points at a constant
squared distance c² from the origin,
x1²/s11 + x2²/s22 = c²
we get an ellipse with semi-axes c√s11 and c√s22.
The area of this ellipse is πc²√(s11s22).
180Measuring Distance
What if the scatter diagram of these data looked
like this? X1 and X2 now have an obvious positive
correlation!
181Measuring Distance
We can plot a rotated coordinate system on axes
x̃1 and x̃2, rotated through angle θ relative to
the original axes. This suggests that we
calculate distance based on the rotated axes x̃1
and x̃2.
182Measuring Distance
The relation between the original coordinates
(x1, x2) and the rotated coordinates (x̃1, x̃2) is
provided by
x̃1 = x1 cos(θ) + x2 sin(θ)
x̃2 = -x1 sin(θ) + x2 cos(θ)
183Measuring Distance
Now we can write the distance from P = (x1, x2)
to the origin in terms of the original
coordinates x1 and x2 of P as
d(O, P) = √(a11x1² + 2a12x1x2 + a22x2²)
where
a11 = cos²(θ)/s̃11 + sin²(θ)/s̃22
(here s̃11 and s̃22 denote the sample variances of
the data measured along the rotated axes x̃1 and x̃2)
184Measuring Distance
and
a22 = sin²(θ)/s̃11 + cos²(θ)/s̃22
and
a12 = cos(θ)sin(θ)(1/s̃11 - 1/s̃22)
185Measuring Distance
Note that the distance from P = (x1, x2) to the
origin for uncorrelated coordinates x1 and x2 is
d(O, P) = √(a11x1² + a22x2²)
for weights
a11 = 1/s11, a22 = 1/s22 (and a12 = 0)
186Measuring Distance
What if we wish to measure distance from some
fixed point Q = (y1, y2)?
In this diagram, Q = (y1, y2) = (x̄1, x̄2) is
called the centroid of the data.
187Measuring Distance
The distance from any point P = (x1, x2) to some
fixed point Q = (y1, y2) is
d(P, Q) = √[a11(x1 - y1)² + 2a12(x1 - y1)(x2 - y2) + a22(x2 - y2)²]
188Measuring Distance
Suppose we have the following ten bivariate
observations (coordinate sets of (x1, x2))
189Measuring Distance
The plot of these points would look like this,
with centroid (-2, 5). The data suggest a
positive correlation between x1 and x2.
190Measuring Distance
The inscribing ellipse (and major and minor axes
x̃1 and x̃2, rotated through θ = 45°) looks like
this
191Measuring Distance
The rotational weights are
192Measuring Distance
and
193Measuring Distance
and
194Measuring Distance
So the distances of the observed points from
their centroid Q = (-2.0, 5.0) are
195Measuring Distance
Mahalanobis distance can easily be generalized
to p dimensions:
d(P, Q) = √[Σi Σj aij(xi - yi)(xj - yj)] = √[(x - y)′A(x - y)]
and all points satisfying
(x - y)′A(x - y) = c²
form a hyperellipsoid with centroid Q.
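The generalized distance (x - y)′A(x - y) is a direct matrix computation; this sketch uses a hypothetical positive definite weight matrix A and points P and Q (only the centroid (-2, 5) echoes the slides' example):

```python
import numpy as np

# Hypothetical positive definite weight matrix A and centroid Q
A = np.array([[2.0, 0.5],
              [0.5, 1.0]])
Q = np.array([-2.0, 5.0])
P = np.array([0.0, 6.0])

# Squared generalized (Mahalanobis) distance: (x - y)'A(x - y)
diff = P - Q
d2 = diff @ A @ diff
d = np.sqrt(d2)
```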
196Measuring Distance
Now let's backtrack: the Mahalanobis distance
of a random p-dimensional point P from the origin
is given by
d(O, P) = √(x′Ax)
so we can say
d²(O, P) = x′Ax
provided that d² > 0 ∀ x ≠ 0.
197Measuring Distance
Recognizing that aij = aji, i ≠ j, i = 1, …, p, j
= 1, …, p, we have
0 < d² = a11x1² + a22x2² + … + appxp² + 2a12x1x2 + … + 2ap-1,pxp-1xp = x′Ax
for x ≠ 0.
198Measuring Distance
Thus, the p x p symmetric matrix A is positive
definite, i.e., distance is determined from a
positive definite quadratic form x′Ax! We can
also conclude from this result that a positive
definite quadratic form can be interpreted as a
squared distance! Finally, if the square of the
distance from point x to the origin is given by
x′Ax, then the square of the distance from point
x to some arbitrary fixed point m is given by
(x - m)′A(x - m).
199Measuring Distance
Expressing distance as the square root of a
positive definite quadratic form yields an
interesting geometric interpretation based on the
eigenvalues and eigenvectors of A. For example,
in p = 2 dimensions all points x′ = [x1, x2]
that are a constant distance c from the origin
must satisfy
x′Ax = a11x1² + 2a12x1x2 + a22x2² = c²
200Measuring Distance
By the spectral decomposition we have
A = λ1e1e1′ + λ2e2e2′
so by substitution we now have
x′Ax = λ1(x′e1)² + λ2(x′e2)² = c²
and A is positive definite, so λ1 > 0 and λ2 > 0,
which means
λ1(x′e1)² + λ2(x′e2)² = c²
is an ellipse.
201Measuring Distance
Finally, a little algebra can be used to show
that
x = cλ1^(-1/2)e1
satisfies
x′Ax = λ1(x′e1)² + λ2(x′e2)² = c²
202Measuring Distance
Similarly, a little algebra can be used to show
that
x = cλ2^(-1/2)e2
satisfies
x′Ax = λ1(x′e1)² + λ2(x′e2)² = c²
203Measuring Distance
So the points at a distance c lie on an ellipse
whose axes are given by the eigenvectors e1 and
e2 of A, with lengths proportional to the
reciprocals of the square roots of the
corresponding eigenvalues (with constant of
proportionality c). This generalizes to p
dimensions.
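The half-axis result can be verified numerically: for a hypothetical positive definite A, each point x = c λi^(-1/2) ei should satisfy x′Ax = c² (NumPy's eigh returns eigenvalues in ascending order with eigenvectors as columns):

```python
import numpy as np

# Hypothetical positive definite matrix A and constant c
A = np.array([[3.0, 1.0],
              [1.0, 3.0]])
c = 2.0

# Eigenvalues (ascending) and normalized eigenvectors (as columns)
lam, E = np.linalg.eigh(A)

# Each half-axis endpoint x = c * lam_i**(-1/2) * e_i lies on the
# ellipse x'Ax = c**2
for i in range(2):
    x = c * lam[i] ** -0.5 * E[:, i]
    assert np.isclose(x @ A @ x, c ** 2)
```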
204Square Root Matrices
Because spectral decomposition allows us to
express the inverse of a square matrix in terms
of its eigenvalues and eigenvectors, it enables
us to conveniently create a square root
matrix. Let A be a p x p positive definite
matrix with the spectral decomposition
A = λ1e1e1′ + λ2e2e2′ + … + λpepep′
205Square Root Matrices
Also let P be a matrix whose columns are the
normalized eigenvectors e1, e2, …, ep of A, i.e.,
P = [e1 e2 … ep]
Then
A = PΛP′
where P′P = PP′ = I and Λ is the diagonal matrix
with diagonal elements λ1, λ2, …, λp.
206Square Root Matrices
Now since (PΛ⁻¹P′)PΛP′ = PΛP′(PΛ⁻¹P′) = PP′ = I we
have
A⁻¹ = PΛ⁻¹P′
Next let Λ^(1/2) denote the diagonal matrix with
ith diagonal element √λi.
207Square Root Matrices
The matrix
A^(1/2) = PΛ^(1/2)P′ = Σi √λi eiei′
is called the square root of A.
208Square Root Matrices
- The square root of A has the following properties:
- (A^(1/2))′ = A^(1/2) (symmetry)
- A^(1/2)A^(1/2) = A
- (A^(1/2))⁻¹ = PΛ^(-1/2)P′ = Σi (1/√λi)eiei′, denoted A^(-1/2)
- A^(1/2)A^(-1/2) = A^(-1/2)A^(1/2) = I
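These properties can be checked directly by building A^(1/2) from the spectral decomposition; the matrix A below is a hypothetical positive definite example:

```python
import numpy as np

# Hypothetical positive definite matrix
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# Spectral decomposition A = P Lam P'
lam, P = np.linalg.eigh(A)

# Square root matrix: A**(1/2) = P Lam**(1/2) P'
A_half = P @ np.diag(np.sqrt(lam)) @ P.T
# Inverse square root: A**(-1/2) = P Lam**(-1/2) P'
A_half_inv = P @ np.diag(1.0 / np.sqrt(lam)) @ P.T

# Check the listed properties
assert np.allclose(A_half, A_half.T)       # symmetric
assert np.allclose(A_half @ A_half, A)     # A**(1/2) A**(1/2) = A
assert np.allclose(A_half @ A_half_inv, np.eye(2))
```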
209Square Root Matrices
Next let Λ⁻¹ denote the diagonal matrix with ith
diagonal element 1/λi, and let P again be the
matrix whose columns are the normalized
eigenvectors e1, e2, …, ep of A. Then
A⁻¹ = PΛ⁻¹P′
where P′P = PP′ = I.
210Singular Value Decomposition
We can extend the operations of spectral
decomposition to a rectangular matrix by using
the eigenvalues and eigenvectors of the square
matrix A′A (or AA′). Suppose A is an m x k real
matrix. Then there exists an m x m orthogonal
matrix U and a k x k orthogonal matrix V such
that
A = UΛV′
where Λ has ith diagonal element λi ≥ 0 for
i = 1, 2, …, min(m, k) and 0 for all other
elements. The λi are the singular values of A.
211Singular Value Decomposition
Singular value decomposition can also be
expressed as a matrix expansion that depends on
the rank r of A. There exist
- r positive constants λ1, λ2, …, λr,
- r orthogonal m x 1 unit vectors u1, u2, …, ur,
- r orthogonal k x 1 unit vectors v1, v2, …, vr,
such that
A = λ1u1v1′ + λ2u2v2′ + … + λrurvr′ = UrΛrVr′
212Singular Value Decomposition
where
- Ur = [u1, u2, …, ur]
- Vr = [v1, v2, …, vr]
- Λr is an r x r diagonal matrix with diagonal entries λ1, λ2, …, λr and off-diagonal entries 0
213Singular Value Decomposition
We can show that AA′ has eigenvalue-eigenvector
pairs (λi², ui), so
AA′ui = λi²ui
with λi > 0 for i = 1, 2, …, r;
then
vi = (1/λi)A′ui
214Singular Value Decomposition
Alternatively, we can show that A′A has
eigenvalue-eigenvector pairs (λi², vi), so
A′Avi = λi²vi
with λi > 0 for i = 1, 2, …, r;
then
ui = (1/λi)Avi
215Singular Value Decomposition
Suppose we have a rectangular matrix
then
216Singular Value Decomposition
and AA′ has eigenvalues of γ1 = 5 and γ2 = 10 with
corresponding normalized eigenvectors
217Singular Value Decomposition
Similarly for our rectangular matrix
then
218Singular Value Decomposition
and A′A also has eigenvalues of γ1 = 0, γ2 = 5, and
γ3 = 10 with corresponding normalized eigenvectors
219Singular Value Decomposition
Now taking the singular values λ1 = √10 and
λ2 = √5 (the square roots of the nonzero
eigenvalues), we find that the singular value
decomposition of A is
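The slides' matrix itself is not reproduced in this text; the sketch below uses a hypothetical 2 x 3 matrix whose AA′ happens to have the eigenvalues 5 and 10 cited above, and verifies the decomposition with NumPy:

```python
import numpy as np

# Hypothetical 2 x 3 matrix whose AA' has eigenvalues 5 and 10
A = np.array([[1.0, 1.0, 2.0],
              [2.0, 2.0, -1.0]])

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# The singular values (descending in NumPy) are the square roots of
# the eigenvalues of AA': s**2 = [10, 5]
assert np.allclose(np.sort(s ** 2), [5.0, 10.0])

# Reconstruct A = U Lam V'
assert np.allclose(U @ np.diag(s) @ Vt, A)
```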
220Singular Value Decomposition
The singular value decomposition is closely
connected to the approximation of a rectangular
matrix by a lower-rank matrix [Eckart and
Young, 1936]. First note that if an m x k matrix
A is approximated by B of the same dimension but
lower rank, the quality of the approximation can
be measured by the sum of squared differences
Σi Σj (aij - bij)²
221Singular Value Decomposition
Eckart and Young used this fact to show that,
for an m x k real matrix A with m ≥ k and singular
value decomposition UΛV′,
B = λ1u1v1′ + λ2u2v2′ + … + λsusvs′
is the rank-s least squares approximation to A
(where s < k = rank(A)).
222Singular Value Decomposition
This matrix B minimizes
Σi Σj (aij - bij)²
over all m x k matrices of rank no greater than
s! It can also be shown that the error of this
approximation is
λs+1² + λs+2² + … + λk²
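The Eckart-Young result can be demonstrated by truncating the SVD of a hypothetical matrix: the squared error of the rank-s approximation equals the sum of the squared discarded singular values.

```python
import numpy as np

# Hypothetical 3 x 3 matrix
A = np.array([[1.0, 1.0, 2.0],
              [2.0, 2.0, -1.0],
              [0.0, 1.0, 1.0]])

U, s, Vt = np.linalg.svd(A)

# Rank-1 least squares approximation: keep only the largest singular value
rank = 1
B = U[:, :rank] @ np.diag(s[:rank]) @ Vt[:rank, :]

# The squared error equals the sum of the squared discarded singular values
err = np.sum((A - B) ** 2)
assert np.isclose(err, np.sum(s[rank:] ** 2))
```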
223Random Vectors and Matrices
- Random Vector: a vector whose individual elements are random variables
- Random Matrix: a matrix whose individual elements are random variables
224Random Vectors and Matrices
The expected value of a random vector or matrix
is the vector or matrix containing the expected
values of the individual elements, i.e.,
E(X) = [E(xij)]
225Random Vectors and Matrices
where
E(xij) = ∫ xij fij(xij) dxij if xij is a continuous random variable with probability density function fij(xij), or
E(xij) = Σ xij pij(xij) if xij is a discrete random variable with probability function pij(xij)
226Random Vectors and Matrices
- Note that for random matrices X and Y of the
same dimension, conformable matrices of constants
A and B, and scalar c:
- E(cX) = cE(X)
- E(X + Y) = E(X) + E(Y)
- E(AXB) = AE(X)B
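The rule E(AXB) = AE(X)B can be illustrated by Monte Carlo simulation; the constant matrices A and B and the mean matrix below are hypothetical, and the sample mean of AXB is only an approximation to the exact expectation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical conformable constant matrices A and B, and E(X) = mu
A = np.array([[1.0, 2.0],
              [0.0, 1.0]])
B = np.array([[1.0, 0.0],
              [3.0, 1.0]])
mu = np.array([[1.0, 2.0],
               [3.0, 4.0]])

# Simulate many 2 x 2 random matrices X with elementwise mean mu
X = mu + rng.normal(size=(200_000, 2, 2))

# The sample mean of AXB should approximate A E(X) B
lhs = (A @ X @ B).mean(axis=0)
rhs = A @ mu @ B
assert np.allclose(lhs, rhs, atol=0.1)
```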
227Random Vectors and Matrices
Mean Vector: vector whose elements are the
means of the corresponding random variables,
i.e.,
μi = E(xi)
228Random Vectors and Matrices
In matrix notation we can write the mean vector
as
μ = E(x) = [E(x1) E(x2) … E(xp)]′
229Random Vectors and Matrices
For the bivariate probability distribution
the mean vector is
230Random Vectors and Matrices
Covariance Matrix: symmetric matrix whose
diagonal elements are the variances of the
corresponding random variables, i.e.,
σii = E[(xi - μi)²]
231Random Vectors and Matrices
and whose off-diagonal elements are the
covariances of the corresponding random variable
pairs, i.e.,
σik = E[(xi - μi)(xk - μk)]
Notice that when i = k this expression returns
the variance, i.e.,
σii = E[(xi - μi)(xi - μi)] = E[(xi - μi)²]
232Random Vectors and Matrices
In matrix notation we can write the covariance
matrix as
Σ = E[(x - μ)(x - μ)′]
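For a discrete bivariate distribution, the mean vector and covariance matrix are probability-weighted sums; the support points and probabilities below are a hypothetical example, not the distribution used in the slides:

```python
import numpy as np

# Hypothetical discrete bivariate distribution: support points and probabilities
support = np.array([[0.0, 0.0],
                    [0.0, 1.0],
                    [1.0, 0.0],
                    [1.0, 1.0]])
prob = np.array([0.4, 0.1, 0.1, 0.4])

# Mean vector mu = E(x)
mu = prob @ support

# Covariance matrix Sigma = E[(x - mu)(x - mu)']
dev = support - mu
Sigma = (dev * prob[:, None]).T @ dev

assert np.allclose(Sigma, Sigma.T)  # symmetric, as required
```

The positive off-diagonal entry of Sigma reflects the fact that this distribution puts most of its probability on the points (0, 0) and (1, 1), so x1 and x2 are positively correlated.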
233Random Vectors and Matrices
For the bivariate probability distribution we
used earlier
the covariance matrix is
234Random Vectors and Matrices