MATRIX ALGEBRA TITLE

Matrix Algebra for Econometrics and Statistics
GARTH TARR
2011
Fundamentals Quadratic Forms Systems Sums Applications Code
Matrix fundamentals
\[ A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{pmatrix} \]
A matrix is a rectangular array of numbers.
Size: (rows) × (columns). E.g. the size of A is 2 × 3.
The size of a matrix is also known as the dimension.
The element in the ith row and jth column of A is referred to as $a_{ij}$.
The matrix A can also be written as $A = (a_{ij})$.
Matrix addition and subtraction
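\[ A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{pmatrix}; \quad B = \begin{pmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \end{pmatrix} \]
Definition (Matrix Addition and Subtraction)
Dimensions must match:
(r × c) ± (r × c) = (r × c)
A and B are both 2 × 3 matrices, so
\[ A + B = \begin{pmatrix} a_{11} + b_{11} & a_{12} + b_{12} & a_{13} + b_{13} \\ a_{21} + b_{21} & a_{22} + b_{22} & a_{23} + b_{23} \end{pmatrix} \]
More generally we write:
\[ A \pm B = (a_{ij}) \pm (b_{ij}). \]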
Matrix multiplication
\[ A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{pmatrix}; \quad D = \begin{pmatrix} d_{11} & d_{12} \\ d_{21} & d_{22} \\ d_{31} & d_{32} \end{pmatrix} \]
Definition (Matrix Multiplication)
Inner dimensions need to match:
(r × c) × (c × p) = (r × p)
A is a 2 × 3 and D is a 3 × 2 matrix, so the inner dimensions match and we have:
\[ C = AD = \begin{pmatrix} a_{11}d_{11} + a_{12}d_{21} + a_{13}d_{31} & a_{11}d_{12} + a_{12}d_{22} + a_{13}d_{32} \\ a_{21}d_{11} + a_{22}d_{21} + a_{23}d_{31} & a_{21}d_{12} + a_{22}d_{22} + a_{23}d_{32} \end{pmatrix} \]
Look at the pattern in the terms above.
Matrix multiplication
[Diagram: the first row of A (2 × 3) is paired with the first column of D (3 × 2) to give the top-left element of C = AD (2 × 2): $c_{11} = a_{11}d_{11} + a_{12}d_{21} + a_{13}d_{31}$.]
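The row-by-column pattern can be sketched in a few lines of Python. This is only an illustration: the function name `matmul` and the numeric entries of A and D are assumptions made for the example, and matrices are stored as plain lists of rows.

```python
def matmul(a, d):
    """Multiply an r x c matrix by a c x p matrix (both given as lists of rows)."""
    r, c, p = len(a), len(d), len(d[0])
    if any(len(row) != c for row in a):
        raise ValueError("inner dimensions do not match")
    # Element (i, j) is row i of a times column j of d.
    return [[sum(a[i][k] * d[k][j] for k in range(c)) for j in range(p)]
            for i in range(r)]

A = [[1, 2, 3],
     [4, 5, 6]]      # 2 x 3 (illustrative values)
D = [[1, 0],
     [0, 1],
     [2, 2]]         # 3 x 2 (illustrative values)
C = matmul(A, D)     # 2 x 2
print(C)
```

Swapping the arguments, `matmul(D, A)`, is also conformable here but returns a 3 x 3 matrix, which is a quick reminder that the order of the factors matters.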
Determinant
Definition (General Formula)
Let $C = (c_{ij})$ be an n × n square matrix.
Define the cofactor, $C_{ij}$, to be the determinant of the square matrix of order (n − 1) obtained from C by removing row i and column j, multiplied by $(-1)^{i+j}$.
For fixed i, i.e. focusing on one row: $\det(C) = \sum_{j=1}^{n} c_{ij} C_{ij}$.
For fixed j, i.e. focusing on one column: $\det(C) = \sum_{i=1}^{n} c_{ij} C_{ij}$.
Note that this is a recursive formula.
The trick is to pick a row (or column) with a lot of zeros (or better yet, use a computer)!
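Taking the "use a computer" advice literally, the recursive cofactor formula can be written as a short Python function. This is a sketch for small matrices only (the cost grows factorially with n); the name `det` and the test matrices are assumptions for illustration.

```python
def det(c):
    """Determinant by cofactor expansion along the first row (i = 1)."""
    n = len(c)
    if n == 1:
        return c[0][0]                  # the determinant of a scalar is itself
    total = 0
    for j in range(n):
        # Minor: remove the first row and column j (0-based index).
        minor = [row[:j] + row[j + 1:] for row in c[1:]]
        # With 0-based j the sign (-1)^(i+j) for the first row is (-1)^j.
        total += (-1) ** j * c[0][j] * det(minor)
    return total

print(det([[1, 2], [3, 4]]))
print(det([[2, 0, 1], [1, 3, 2], [1, 1, 4]]))
```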
2 × 2 determinant
Apply the general formula to a 2 × 2 matrix: $C = \begin{pmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{pmatrix}$.
Keep the first row fixed, i.e. set i = 1.
General formula when i = 1 and n = 2: $\det(C) = \sum_{j=1}^{2} c_{1j} C_{1j}$
When j = 1, $C_{11}$ is one cofactor of C, i.e. the determinant after removing the first row and first column of C, multiplied by $(-1)^{i+j} = (-1)^{2}$. So $C_{11} = (-1)^{2} \det(c_{22}) = c_{22}$, as $c_{22}$ is a scalar and the determinant of a scalar is itself.
$C_{12} = (-1)^{3} \det(c_{21}) = -c_{21}$, as $c_{21}$ is a scalar and the determinant of a scalar is itself.
Put it all together and you get the familiar result:
\[ \det(C) = c_{11}C_{11} + c_{12}C_{12} = c_{11}c_{22} - c_{12}c_{21} \]
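3 × 3 determinant
\[ B = \begin{pmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \\ b_{31} & b_{32} & b_{33} \end{pmatrix} \]
Keep the first row fixed, i.e. set i = 1. General formula when i = 1 and n = 3:
\[ \det(B) = \sum_{j=1}^{3} b_{1j} B_{1j} = b_{11}B_{11} + b_{12}B_{12} + b_{13}B_{13} \]
For example, $B_{12}$ is the determinant of the matrix you get after removing the first row and second column of B, multiplied by $(-1)^{i+j} = (-1)^{1+2} = -1$:
\[ B_{12} = -\begin{vmatrix} b_{21} & b_{23} \\ b_{31} & b_{33} \end{vmatrix}. \]
\[ \det(B) = b_{11}\begin{vmatrix} b_{22} & b_{23} \\ b_{32} & b_{33} \end{vmatrix} - b_{12}\begin{vmatrix} b_{21} & b_{23} \\ b_{31} & b_{33} \end{vmatrix} + b_{13}\begin{vmatrix} b_{21} & b_{22} \\ b_{31} & b_{32} \end{vmatrix} \]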
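Sarrus scheme for the determinant of a 3 × 3
French mathematician: Pierre Frédéric Sarrus (1798-1861)
\[ \det(B) = \begin{vmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \\ b_{31} & b_{32} & b_{33} \end{vmatrix} = b_{11}\begin{vmatrix} b_{22} & b_{23} \\ b_{32} & b_{33} \end{vmatrix} - b_{12}\begin{vmatrix} b_{21} & b_{23} \\ b_{31} & b_{33} \end{vmatrix} + b_{13}\begin{vmatrix} b_{21} & b_{22} \\ b_{31} & b_{32} \end{vmatrix} \]
\[ = \left( b_{11}b_{22}b_{33} + b_{12}b_{23}b_{31} + b_{13}b_{21}b_{32} \right) - \left( b_{13}b_{22}b_{31} + b_{11}b_{23}b_{32} + b_{12}b_{21}b_{33} \right) \]
\[ \begin{array}{ccc|cc} b_{11} & b_{12} & b_{13} & b_{11} & b_{12} \\ b_{21} & b_{22} & b_{23} & b_{21} & b_{22} \\ b_{31} & b_{32} & b_{33} & b_{31} & b_{32} \end{array} \]
Write the first two columns of the matrix again to the right of the original matrix. Multiply the diagonals together and then add (descending diagonals) or subtract (ascending diagonals).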
Determinant as an area
\[ A = \begin{pmatrix} x_1 & y_1 \\ x_2 & y_2 \end{pmatrix} = \begin{pmatrix} a \\ b \end{pmatrix} \]
For a 2 × 2 matrix, det(A) is the oriented area [1] of the parallelogram with vertices at $0 = (0, 0)$, $a = (x_1, y_1)$, $a + b = (x_1 + x_2, y_1 + y_2)$, and $b = (x_2, y_2)$.
[Figure: parallelogram in the (x, y) plane with vertices 0, a, a + b and b; $x_1$ and $x_2$ are marked on the x-axis, $y_1$ and $y_2$ on the y-axis.]
In a sense, the determinant summarises the information in the matrix.
[1] The oriented area is the same as the usual area, except that it is negative when the vertices are listed in clockwise order.
Identity matrix
Definition (Identity matrix)
A square matrix, I, with ones on the main diagonal and zeros everywhere else:
\[ I = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix} \]
Sometimes you see $I_r$, which indicates that it is an r × r identity matrix.
If the size of I is not specified, it is assumed to be conformable, i.e. as big as necessary.
Identity matrix
An identity matrix is the matrix analogue of the number 1.
If you multiply any matrix (or vector) with a conformable
identity matrix the result will be the same matrix (or vector).
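Example (2 × 2)
\[ AI = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} a_{11} \times 1 + a_{12} \times 0 & a_{11} \times 0 + a_{12} \times 1 \\ a_{21} \times 1 + a_{22} \times 0 & a_{21} \times 0 + a_{22} \times 1 \end{pmatrix} = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} = A. \]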
Inverse
Definition (Inverse)
Requires a square matrix, i.e. dimensions: r × r.
For a 2 × 2 matrix, $A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$,
\[ A^{-1} = \frac{1}{\det(A)} \begin{pmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{pmatrix} \]
More generally, a square matrix A is invertible or nonsingular if there exists another matrix B such that
AB = BA = I.
If this occurs then B is uniquely determined by A and is denoted $A^{-1}$, i.e. $AA^{-1} = I$.
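The 2 × 2 inverse formula can be checked numerically. The sketch below uses exact fractions so that $AA^{-1} = I$ holds without rounding error; the helper names `inverse_2x2` and `matmul` are assumptions for illustration, and the matrix entries 5, 7, 10, 2 are the ones used in the software table later in these slides.

```python
from fractions import Fraction

def inverse_2x2(a):
    """Inverse of a 2 x 2 matrix: swap the diagonal, negate the off-diagonal,
    divide by the determinant."""
    (a11, a12), (a21, a22) = a
    d = Fraction(a11 * a22 - a12 * a21)
    if d == 0:
        raise ValueError("matrix is singular: det(A) = 0")
    return [[a22 / d, -a12 / d],
            [-a21 / d, a11 / d]]

def matmul(a, b):
    """Plain row-by-column matrix product for lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

A = [[5, 7], [10, 2]]
A_inv = inverse_2x2(A)
print(A_inv)
print(matmul(A, A_inv))   # should be the 2 x 2 identity
```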
Vectors
Vectors are matrices with only one row or column. For example, the column vector:
\[ x = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} \]
Definition (Transpose Operator)
Turns columns into rows (and vice versa):
\[ x' = x^{T} = \begin{pmatrix} x_1 & x_2 & \cdots & x_n \end{pmatrix} \]
Example (Sum of Squares)
\[ x'x = \sum_{i=1}^{n} x_i^2 \]
Transpose
Say we have some m × n matrix:
\[ A = (a_{ij}) = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix} \]
Definition (Transpose Operator)
Flips the rows and columns of a matrix:
\[ A' = (a_{ji}) \]
The subscripts get swapped.
$A'$ is an n × m matrix: the columns in A are the rows in $A'$.
Symmetry
Definition (Square Matrix)
A matrix, P, is square if it has the same number of rows as columns, i.e.
dim(P) = n × n
for some n ≥ 1.
Definition (Symmetric Matrix)
A square matrix, P, is symmetric if it is equal to its transpose:
\[ P = P' \]
Idempotent
Definition (Idempotent)
A square matrix, P, is idempotent if, when multiplied by itself, it yields itself, i.e.
PP = P.
1. When an idempotent matrix is subtracted from the identity matrix, the result is also idempotent, i.e. M = I − P is idempotent.
2. The trace of an idempotent matrix is equal to its rank.
3. $X(X'X)^{-1}X'$ is an idempotent matrix.
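These three properties can be verified on a small case. The sketch below assumes, purely for illustration, that X is an n × 1 column of ones, so that $X'X = n$ and $P = X(X'X)^{-1}X'$ has every entry equal to 1/n (a matrix of rank 1); the names `matmul`, `P`, `I` and `M` are not from the slides.

```python
from fractions import Fraction

def matmul(a, b):
    """Plain row-by-column matrix product for lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

n = 3
# X is a column of ones, so X'X = n and P = X (X'X)^{-1} X' is the n x n matrix of 1/n.
P = [[Fraction(1, n)] * n for _ in range(n)]
I = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
M = [[I[i][j] - P[i][j] for j in range(n)] for i in range(n)]

trace_P = sum(P[i][i] for i in range(n))
print(matmul(P, P) == P)    # P is idempotent
print(matmul(M, M) == M)    # so is M = I - P
print(trace_P)              # equals the rank of P, which is 1 here
```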
Order of operations
Matrix multiplication is non-commutative, i.e. the order of multiplication is important: in general, $AB \neq BA$.
Matrix multiplication is associative, i.e. as long as the order stays the same, (AB)C = A(BC).
Matrix multiplication is distributive over addition:
A(B + C) = AB + AC
(A + B)C = AC + BC
Example
Let A be an invertible k × k matrix and x and c be k × 1 vectors:
Ax = c
$A^{-1}Ax = A^{-1}c$ (PRE-multiply both sides by $A^{-1}$)
$Ix = A^{-1}c$
$x = A^{-1}c$
Note: $A^{-1}c \neq cA^{-1}$
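Matrix Differentiation
If $\beta$ and a are both k × 1 vectors then
\[ \frac{\partial \beta' a}{\partial \beta} = a. \]
Proof.
\[ \frac{\partial \beta' a}{\partial \beta} = \frac{\partial}{\partial \beta}\left( \beta_1 a_1 + \beta_2 a_2 + \ldots + \beta_k a_k \right) = \begin{pmatrix} \frac{\partial}{\partial \beta_1}\left( \beta_1 a_1 + \beta_2 a_2 + \ldots + \beta_k a_k \right) \\ \frac{\partial}{\partial \beta_2}\left( \beta_1 a_1 + \beta_2 a_2 + \ldots + \beta_k a_k \right) \\ \vdots \\ \frac{\partial}{\partial \beta_k}\left( \beta_1 a_1 + \beta_2 a_2 + \ldots + \beta_k a_k \right) \end{pmatrix} = a \]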
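Matrix Calculus
Let $\beta$ be a k × 1 vector and A be a k × k symmetric matrix, then
\[ \frac{\partial \beta' A \beta}{\partial \beta} = 2A\beta. \]
Proof.
By way of illustration, say $\beta = \begin{pmatrix} \beta_1 \\ \beta_2 \end{pmatrix}$ and $A = \begin{pmatrix} a_{11} & a_{12} \\ a_{12} & a_{22} \end{pmatrix}$, then
\[ \frac{\partial \beta' A \beta}{\partial \beta} = \frac{\partial}{\partial \beta}\left( \beta_1^2 a_{11} + 2a_{12}\beta_1\beta_2 + \beta_2^2 a_{22} \right) = \begin{pmatrix} \frac{\partial}{\partial \beta_1}\left( \beta_1^2 a_{11} + 2a_{12}\beta_1\beta_2 + \beta_2^2 a_{22} \right) \\ \frac{\partial}{\partial \beta_2}\left( \beta_1^2 a_{11} + 2a_{12}\beta_1\beta_2 + \beta_2^2 a_{22} \right) \end{pmatrix} = \begin{pmatrix} 2\beta_1 a_{11} + 2a_{12}\beta_2 \\ 2\beta_1 a_{12} + 2a_{22}\beta_2 \end{pmatrix} = 2A\beta \]
Let $\beta$ be a k × 1 vector and A be an n × k matrix, then
\[ \frac{\partial A\beta}{\partial \beta'} = A. \]
Proof.
By way of illustration, say $\beta = \begin{pmatrix} \beta_1 \\ \beta_2 \end{pmatrix}$ and $A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$, then
\[ \frac{\partial (A\beta)}{\partial \beta'} = \frac{\partial}{\partial \beta'} \begin{pmatrix} a_{11}\beta_1 + a_{12}\beta_2 \\ a_{21}\beta_1 + a_{22}\beta_2 \end{pmatrix} = \begin{pmatrix} \frac{\partial}{\partial \beta'}\left( a_{11}\beta_1 + a_{12}\beta_2 \right) \\ \frac{\partial}{\partial \beta'}\left( a_{21}\beta_1 + a_{22}\beta_2 \right) \end{pmatrix} = \begin{pmatrix} \frac{\partial}{\partial \beta_1}\left( a_{11}\beta_1 + a_{12}\beta_2 \right) & \frac{\partial}{\partial \beta_2}\left( a_{11}\beta_1 + a_{12}\beta_2 \right) \\ \frac{\partial}{\partial \beta_1}\left( a_{21}\beta_1 + a_{22}\beta_2 \right) & \frac{\partial}{\partial \beta_2}\left( a_{21}\beta_1 + a_{22}\beta_2 \right) \end{pmatrix} = A. \]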
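A derivative rule like $\partial \beta' A \beta / \partial \beta = 2A\beta$ can be sanity-checked numerically with finite differences. The sketch below is one way to do that under stated assumptions: the symmetric matrix `A`, the point `beta` and the step size `h` are arbitrary illustrative choices, and the function names are not from the slides.

```python
def quad_form(beta, a):
    """Evaluate beta' A beta for a list vector beta and a list-of-rows matrix a."""
    k = len(beta)
    return sum(beta[i] * a[i][j] * beta[j] for i in range(k) for j in range(k))

def numerical_gradient(f, beta, h=1e-6):
    """Central finite differences, perturbing one coordinate at a time."""
    grad = []
    for i in range(len(beta)):
        up = beta[:i] + [beta[i] + h] + beta[i + 1:]
        down = beta[:i] + [beta[i] - h] + beta[i + 1:]
        grad.append((f(up) - f(down)) / (2 * h))
    return grad

A = [[2.0, 1.0],
     [1.0, 3.0]]            # symmetric, illustrative values
beta = [0.5, -1.5]

analytic = [2 * sum(A[i][j] * beta[j] for j in range(2)) for i in range(2)]  # 2 A beta
numeric = numerical_gradient(lambda b: quad_form(b, A), beta)
print(analytic, numeric)
```

The two vectors should agree up to floating-point error, because a central difference is exact for a quadratic function apart from rounding.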
Rank
The rank of a matrix A is the maximal number of linearly
independent rows or columns of A.
A family of vectors is linearly independent if none of them can be written as a linear combination of finitely many other vectors in the collection.
Example (Dummy variable trap)
\[ \begin{pmatrix} v_1 & v_2 & v_3 & v_4 \end{pmatrix} = \begin{pmatrix} 1 & 1 & 0 & 0 \\ 1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 \\ 1 & 0 & 0 & 1 \end{pmatrix} \]
(The first three columns are independent; the fourth is dependent.)
$v_1$, $v_2$ and $v_3$ are independent but $v_4 = v_1 - v_2 - v_3$.
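The rank of the dummy-variable matrix above can be computed by Gaussian elimination. The following is a sketch using exact rational arithmetic; the function name `rank` and the variable `V` are assumptions for the example, and a production implementation would normally rely on a numerical library instead.

```python
from fractions import Fraction

def rank(matrix):
    """Rank by Gaussian elimination (row reduction) with exact fractions."""
    m = [[Fraction(v) for v in row] for row in matrix]
    rows, cols = len(m), len(m[0])
    r = 0                                   # index of the next pivot row
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if m[i][c] != 0), None)
        if pivot is None:
            continue                        # no pivot available in this column
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, rows):
            factor = m[i][c] / m[r][c]
            m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
        if r == rows:
            break
    return r

V = [[1, 1, 0, 0],
     [1, 1, 0, 0],
     [1, 0, 1, 0],
     [1, 0, 0, 1]]
print(rank(V))                              # rank of all four columns
print(rank([row[:3] for row in V]))         # rank of v1, v2, v3 only
```

Both calls return the same number, which is the point of the example: adding $v_4$ contributes no new information.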
Rank
The maximum rank of an m × n matrix is min(m, n).
A full rank matrix is one that has the largest possible rank, i.e. the rank is equal to either the number of rows or columns (whichever is smaller).
In the case of an n × n square matrix A, A is invertible if and only if A has rank n (that is, A has full rank).
For some n × k matrix X, $\mathrm{rank}(X) = \mathrm{rank}(X'X)$.
This is why the dummy variable trap exists: you need to drop one of the dummy categories, otherwise X is not of full rank and therefore you cannot find the inverse of $X'X$.
Trace
Definition
The trace of an n × n matrix A is the sum of the elements on the main diagonal: $\mathrm{tr}(A) = a_{11} + a_{22} + \ldots + a_{nn} = \sum_{i=1}^{n} a_{ii}$.
Properties
tr(A + B) = tr(A) + tr(B)
tr(cA) = c tr(A)
If A is an m × n matrix and B is an n × m matrix then tr(AB) = tr(BA)
More generally, for conformable matrices: tr(ABC) = tr(CAB) = tr(BCA)
BUT: in general $\mathrm{tr}(ABC) \neq \mathrm{tr}(ACB)$. You can only move from the front to the back (or back to the front)!
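The cyclic property is easy to confirm on a small example. In the sketch below the three 2 × 2 matrices are arbitrary illustrative values and the helper names `matmul` and `trace` are assumptions; the cyclic permutations give the same trace while the non-cyclic reordering ACB does not.

```python
def matmul(a, b):
    """Plain row-by-column matrix product for lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def trace(a):
    """Sum of the main-diagonal elements of a square matrix."""
    return sum(a[i][i] for i in range(len(a)))

A = [[1, 2], [3, 0]]
B = [[0, 1], [1, 1]]
C = [[2, 0], [1, 3]]

tr_ABC = trace(matmul(matmul(A, B), C))
tr_CAB = trace(matmul(matmul(C, A), B))
tr_BCA = trace(matmul(matmul(B, C), A))
tr_ACB = trace(matmul(matmul(A, C), B))
print(tr_ABC, tr_CAB, tr_BCA, tr_ACB)
```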
Eigenvalues
An eigenvalue $\lambda$ and an eigenvector $x \neq 0$ of a square matrix A are defined by
\[ Ax = \lambda x. \]
Since the eigenvector x is different from the zero vector (i.e. $x \neq 0$) the following is valid:
\[ (A - \lambda I)x = 0 \implies \det(A - \lambda I) = 0. \]
We know $\det(A - \lambda I) = 0$ because:
if $(A - \lambda I)^{-1}$ existed, we could just pre-multiply both sides by $(A - \lambda I)^{-1}$ and get the solution x = 0;
but we have assumed $x \neq 0$, so we require that $(A - \lambda I)$ is NOT invertible, which implies [2] that $\det(A - \lambda I) = 0$.
To find the eigenvalues, we can solve $\det(A - \lambda I) = 0$.
[2] A matrix is invertible if and only if the determinant is non-zero.
Eigenvalues
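Example (Finding eigenvalues)
Say $A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$. We can find the eigenvalues of A by solving
\begin{align*}
\det(A - \lambda I) &= 0 \\
\det\left( \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix} - \lambda \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \right) &= 0 \\
\begin{vmatrix} 2 - \lambda & 1 \\ 1 & 2 - \lambda \end{vmatrix} &= 0 \\
(2 - \lambda)(2 - \lambda) - 1 \times 1 &= 0 \\
\lambda^2 - 4\lambda + 3 &= 0 \\
(\lambda - 1)(\lambda - 3) &= 0
\end{align*}
The eigenvalues are the roots of this quadratic: $\lambda = 1$ and $\lambda = 3$.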
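For a 2 × 2 matrix the characteristic equation is always the quadratic $\lambda^2 - \mathrm{tr}(A)\lambda + \det(A) = 0$, so the worked example can be reproduced with the quadratic formula. The function name `eigenvalues_2x2` is an assumption for this sketch, and it deliberately handles only the real-root case.

```python
import math

def eigenvalues_2x2(a):
    """Real eigenvalues of a 2 x 2 matrix from lambda^2 - tr(A) lambda + det(A) = 0.
    Raises ValueError if the roots are complex."""
    tr = a[0][0] + a[1][1]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    disc = tr * tr - 4 * det
    if disc < 0:
        raise ValueError("complex eigenvalues")
    root = math.sqrt(disc)
    return (tr - root) / 2, (tr + root) / 2

lam1, lam2 = eigenvalues_2x2([[2, 1], [1, 2]])
print(lam1, lam2)
print(lam1 + lam2, lam1 * lam2)   # compare with tr(A) = 4 and det(A) = 3
```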
Why do we care about eigenvalues?
Definiteness: a (symmetric) n × n matrix A is positive definite if all eigenvalues of A, $\lambda_1, \lambda_2, \ldots, \lambda_n$, are positive.
A matrix is negative-definite, negative-semidefinite, or positive-semidefinite if and only if all of its eigenvalues are negative, non-positive, or non-negative, respectively.
Rank: the eigenvectors corresponding to different eigenvalues are linearly independent. So if an n × n matrix has n nonzero eigenvalues, it is of full rank.
Trace: the trace of a matrix is the sum of the eigenvalues: $\mathrm{tr}(A) = \lambda_1 + \lambda_2 + \ldots + \lambda_n$.
Determinant: the determinant of a matrix is the product of the eigenvalues: $\det(A) = \lambda_1 \lambda_2 \cdots \lambda_n$.
Factor analysis: the eigenvectors and eigenvalues of the covariance matrix of a data set are also used in principal component analysis (similar to factor analysis).
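Useful rules
$(AB)' = B'A'$
$\det(A) = \det(A')$
$\det(AB) = \det(A)\det(B)$
$\det(A^{-1}) = \frac{1}{\det(A)}$
AI = A and xI = x
If $\beta$ and a are both k × 1 vectors, $\frac{\partial \beta' a}{\partial \beta} = a$
If A is an n × k matrix, $\frac{\partial A\beta}{\partial \beta'} = A$
If A is a k × k symmetric matrix, $\frac{\partial \beta' A \beta}{\partial \beta} = 2A\beta$
If A is a k × k (not necessarily symmetric) matrix, $\frac{\partial \beta' A \beta}{\partial \beta} = (A + A')\beta$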
Quadratic forms
A quadratic form on $\mathbb{R}^n$ is a real-valued function of the form
\[ Q(x_1, \ldots, x_n) = \sum_{i \le j} a_{ij} x_i x_j. \]
E.g. in $\mathbb{R}^2$ we have $Q(x_1, x_2) = a_{11}x_1^2 + a_{12}x_1x_2 + a_{22}x_2^2$.
Quadratic forms can be represented by a symmetric matrix A such that:
\[ Q(x) = x'Ax \]
E.g. if $x = (x_1, x_2)'$ then
\[ Q(x) = \begin{pmatrix} x_1 & x_2 \end{pmatrix} \begin{pmatrix} a_{11} & \tfrac{1}{2}a_{12} \\ \tfrac{1}{2}a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = a_{11}x_1^2 + \tfrac{1}{2}(a_{12} + a_{21})x_1x_2 + a_{22}x_2^2 \]
but A is symmetric, i.e. $a_{12} = a_{21}$, so we can write
\[ Q(x) = a_{11}x_1^2 + a_{12}x_1x_2 + a_{22}x_2^2. \]
Quadratic forms
If $x \in \mathbb{R}^3$, i.e. $x = (x_1, x_2, x_3)'$, then the general three dimensional quadratic form is:
\[ Q(x) = x'Ax = \begin{pmatrix} x_1 & x_2 & x_3 \end{pmatrix} \begin{pmatrix} a_{11} & \tfrac{1}{2}a_{12} & \tfrac{1}{2}a_{13} \\ \tfrac{1}{2}a_{12} & a_{22} & \tfrac{1}{2}a_{23} \\ \tfrac{1}{2}a_{13} & \tfrac{1}{2}a_{23} & a_{33} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} \]
\[ = a_{11}x_1^2 + a_{22}x_2^2 + a_{33}x_3^2 + a_{12}x_1x_2 + a_{13}x_1x_3 + a_{23}x_2x_3. \]
Quadratic Forms and Sum of Squares
Recall sums of squares can be written as $x'x$ and quadratic forms are $x'Ax$. Quadratic forms are like generalised and weighted sums of squares. Note that if A = I then we recover the sum of squares exactly.
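To make the link between the coefficient form and the matrix form concrete, the sketch below evaluates $x'Ax$ directly and compares it with the expanded polynomial. The coefficient values ($a_{11} = 1$, $a_{22} = 3$, $a_{33} = 2$, $a_{12} = 4$, $a_{13} = 2$, $a_{23} = -2$) and the point x are arbitrary choices for illustration; note that each cross-product coefficient is halved when placed in the symmetric matrix.

```python
def quad_form(x, a):
    """Evaluate x' A x for a list vector x and a list-of-rows matrix a."""
    n = len(x)
    return sum(x[i] * a[i][j] * x[j] for i in range(n) for j in range(n))

x = [1, 2, 3]

# Symmetric matrix: diagonal holds a_ii, off-diagonal entries hold a_ij / 2.
A = [[1, 2, 1],
     [2, 3, -1],
     [1, -1, 2]]
I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

matrix_value = quad_form(x, A)
expanded = (1 * x[0] ** 2 + 3 * x[1] ** 2 + 2 * x[2] ** 2
            + 4 * x[0] * x[1] + 2 * x[0] * x[2] - 2 * x[1] * x[2])
sum_of_squares = quad_form(x, I3)      # A = I recovers x'x
print(matrix_value, expanded, sum_of_squares)
```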
Definiteness of quadratic forms
A quadratic form always takes on the value zero at the point x = 0. This is not an interesting result!
For example, if $x \in \mathbb{R}$, i.e. $x = x_1$, then the general quadratic form is $ax_1^2$, which equals zero when $x_1 = 0$.
Its distinguishing characteristic is the set of values it takes when $x \neq 0$.
We want to know if x = 0 is a max, min or neither.
Example: when $x \in \mathbb{R}$, i.e. the quadratic form is $ax_1^2$,
a > 0 means $ax^2 \geq 0$ and equals 0 only when x = 0. Such a form is called positive definite; x = 0 is a global minimiser.
a < 0 means $ax^2 \leq 0$ and equals 0 only when x = 0. Such a form is called negative definite; x = 0 is a global maximiser.
Positive definite
If $A = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ then $Q_1(x) = x'Ax = x_1^2 + x_2^2$.
$Q_1$ is greater than zero at $x \neq 0$, i.e. $(x_1, x_2) \neq (0, 0)$.
The point x = 0 is a global minimum.
$Q_1$ is called positive definite.
Figure 1: $Q_1(x_1, x_2) = x_1^2 + x_2^2$ (surface plot of $Q_1$ against $x_1$ and $x_2$)
Negative definite
If $A = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}$ then $Q_2(x) = x'Ax = -x_1^2 - x_2^2$.
$Q_2$ is less than zero at $x \neq 0$, i.e. $(x_1, x_2) \neq (0, 0)$.
The point x = 0 is a global maximum.
$Q_2$ is called negative definite.
Figure 2: $Q_2(x_1, x_2) = -x_1^2 - x_2^2$ (surface plot of $Q_2$ against $x_1$ and $x_2$)
Indefinite
If $A = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$ then $Q_3(x) = x'Ax = x_1^2 - x_2^2$.
$Q_3$ can take both positive and negative values.
E.g. $Q_3(1, 0) = +1$ and $Q_3(0, 1) = -1$.
$Q_3$ is called indefinite.
Figure 3: $Q_3(x_1, x_2) = x_1^2 - x_2^2$ (surface plot of $Q_3$ against $x_1$ and $x_2$)
Positive semidefinite
If $A = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}$ then $Q_4(x) = x'Ax = x_1^2 + 2x_1x_2 + x_2^2$.
$Q_4$ is always $\geq 0$ but does equal zero at some $x \neq 0$.
E.g. $Q_4(10, -10) = 0$.
$Q_4$ is called positive semidefinite.
Figure 4: $Q_4(x_1, x_2) = x_1^2 + 2x_1x_2 + x_2^2$ (surface plot of $Q_4$ against $x_1$ and $x_2$)
Negative semidefinite
If $A = \begin{pmatrix} -1 & -1 \\ -1 & -1 \end{pmatrix}$ then $Q_5(x) = x'Ax = -(x_1 + x_2)^2$.
$Q_5$ is always $\leq 0$ but does equal zero at some $x \neq 0$.
E.g. $Q_5(10, -10) = 0$.
$Q_5$ is called negative semidefinite.
Figure 5: $Q_5(x_1, x_2) = -(x_1 + x_2)^2$ (surface plot of $Q_5$ against $x_1$ and $x_2$)
Definite symmetric matrices
A symmetric matrix, A, is called positive definite, positive semidefinite, negative definite, etc. according to the definiteness of the corresponding quadratic form $Q(x) = x'Ax$.
Definition
Let A be an n × n symmetric matrix, then A is
1. positive definite if $x'Ax > 0$ for all $x \neq 0$ in $\mathbb{R}^n$
2. positive semidefinite if $x'Ax \geq 0$ for all $x \neq 0$ in $\mathbb{R}^n$
3. negative definite if $x'Ax < 0$ for all $x \neq 0$ in $\mathbb{R}^n$
4. negative semidefinite if $x'Ax \leq 0$ for all $x \neq 0$ in $\mathbb{R}^n$
5. indefinite if $x'Ax > 0$ for some $x \neq 0$ in $\mathbb{R}^n$ and $< 0$ for some other x in $\mathbb{R}^n$
We can check the definiteness of a matrix by showing that one of these definitions holds, as in the examples $Q_1$ to $Q_5$.
You can also find the eigenvalues to check definiteness.
How else to check for definiteness?
You can check the sign of the sequence of determinants of the leading principal minors:
Positive Definite
An n × n matrix M is positive definite if all the following matrices have a positive determinant:
the top left 1 × 1 corner of M (1st order principal minor)
the top left 2 × 2 corner of M (2nd order principal minor)
...
M itself.
In other words, all of the leading principal minors are positive.
Negative Definite
A matrix is negative definite if all kth order leading principal minors are negative when k is odd and positive when k is even.
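The leading-principal-minor test translates directly into code. The sketch below is an illustration for small symmetric matrices; the function names are assumptions, and the determinant is computed by simple cofactor expansion rather than by a numerically robust method.

```python
def det(c):
    """Determinant by cofactor expansion along the first row."""
    if len(c) == 1:
        return c[0][0]
    return sum((-1) ** j * c[0][j] * det([row[:j] + row[j + 1:] for row in c[1:]])
               for j in range(len(c)))

def leading_principal_minors(m):
    """Determinants of the top-left k x k corners, for k = 1, ..., n."""
    return [det([row[:k] for row in m[:k]]) for k in range(1, len(m) + 1)]

def is_positive_definite(m):
    return all(d > 0 for d in leading_principal_minors(m))

def is_negative_definite(m):
    # Odd-order minors must be negative, even-order minors positive.
    return all((d < 0) if k % 2 == 1 else (d > 0)
               for k, d in enumerate(leading_principal_minors(m), start=1))

print(is_positive_definite([[1, 0], [0, 1]]))      # matrix of Q_1
print(is_negative_definite([[-1, 0], [0, -1]]))    # matrix of Q_2
print(leading_principal_minors([[1, 1], [1, 1]]))  # matrix of Q_4
```

The matrix of $Q_4$ has a zero second-order minor, so this strict test classifies it as neither positive nor negative definite, consistent with it being only semidefinite.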
Why do we care about definiteness?
Useful for establishing if a (multivariate) function has a maximum, minimum or neither at a critical point.
If we have a function, f(x), we can show that a minimum exists at a critical point, i.e. when $f'(x) = 0$, if $f''(x) > 0$.
Example ($f(x) = 2x^2$)
$f'(x) = 4x$
$f'(x) = 0 \implies x = 0$
$f''(x) = 4 > 0 \implies$ minimum at x = 0.
[Figure: plot of $f(x) = 2x^2$ against x.]
In the special case of a univariate function, $f''(x)$ is a 1 × 1 Hessian matrix and showing that $f''(x) > 0$ is equivalent to showing that the Hessian is positive definite.
If we have a bivariate function f(x, y) we find critical points when the first order partial derivatives are equal to zero:
1. Find the first order derivatives and set them equal to zero
2. Solve simultaneously to find critical points
We can check if max or min or neither using the Hessian matrix, H, the matrix of second order partial derivatives:
\[ H = \begin{pmatrix} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \end{pmatrix} \]
1. (If necessary) evaluate the Hessian at a critical point
2. Check if H is positive or negative definite:
$|H| > 0$ and $f_{xx} > 0 \implies$ positive definite $\implies$ minimum
$|H| > 0$ and $f_{xx} < 0 \implies$ negative definite $\implies$ maximum
3. Repeat for all critical points
If we find the matrix of second order derivatives and show that it is positive definite, then we have shown that we have a minimum.
Positive definite matrices are non-singular, i.e. we can invert them. So if we can show $X'X$ is positive definite, we can find $(X'X)^{-1}$.
Application: showing that the Ordinary Least Squares (OLS) estimator minimises the sum of squared residuals.
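Matrices as systems of equations
A system of equations:
\begin{align*}
y_1 &= x_{11}b_1 + x_{12}b_2 + \ldots + x_{1k}b_k \\
y_2 &= x_{21}b_1 + x_{22}b_2 + \ldots + x_{2k}b_k \\
&\;\;\vdots \\
y_n &= x_{n1}b_1 + x_{n2}b_2 + \ldots + x_{nk}b_k
\end{align*}
The matrix form:
\[ \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix} = \begin{pmatrix} x_{11} & x_{12} & \ldots & x_{1k} \\ x_{21} & x_{22} & \ldots & x_{2k} \\ \vdots & \vdots & & \vdots \\ x_{n1} & x_{n2} & \ldots & x_{nk} \end{pmatrix} \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_k \end{pmatrix}. \]
More succinctly: y = Xb where
\[ y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}; \quad b = \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_k \end{pmatrix}; \quad x_i = \begin{pmatrix} x_{i1} \\ x_{i2} \\ \vdots \\ x_{ik} \end{pmatrix} \]
for i = 1, 2, ..., n and
\[ X = \begin{pmatrix} x_{11} & x_{12} & \ldots & x_{1k} \\ x_{21} & x_{22} & \ldots & x_{2k} \\ \vdots & \vdots & & \vdots \\ x_{n1} & x_{n2} & \ldots & x_{nk} \end{pmatrix} = \begin{pmatrix} x_1' \\ x_2' \\ \vdots \\ x_n' \end{pmatrix}. \]
$x_i$ is the covariate vector for the ith observation.
DM 1.4
We can write y = Xb as
\[ \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix} = \begin{pmatrix} x_1' \\ x_2' \\ \vdots \\ x_n' \end{pmatrix} b. \]
Returning to the original system, we can write each individual equation using vectors:
\begin{align*}
y_1 &= x_1'b \\
y_2 &= x_2'b \\
&\;\;\vdots \\
y_n &= x_n'b
\end{align*}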
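Mixing matrices, vectors and summation notation
Often we want to find $X'u$ or $X'X$. A convenient way to write this is as a sum of vectors. Say we have a 3 × 2 matrix X:
\[ X = \begin{pmatrix} x_{11} & x_{12} \\ x_{21} & x_{22} \\ x_{31} & x_{32} \end{pmatrix} = \begin{pmatrix} x_1' \\ x_2' \\ x_3' \end{pmatrix}; \quad x_i = \begin{pmatrix} x_{i1} \\ x_{i2} \end{pmatrix}; \quad \text{and } u = \begin{pmatrix} u_1 \\ u_2 \\ u_3 \end{pmatrix} \]
We can write,
\[ X'u = \begin{pmatrix} x_{11} & x_{21} & x_{31} \\ x_{12} & x_{22} & x_{32} \end{pmatrix} \begin{pmatrix} u_1 \\ u_2 \\ u_3 \end{pmatrix} = \begin{pmatrix} x_{11}u_1 + x_{21}u_2 + x_{31}u_3 \\ x_{12}u_1 + x_{22}u_2 + x_{32}u_3 \end{pmatrix} = x_1u_1 + x_2u_2 + x_3u_3 = \sum_{i=1}^{3} x_i u_i \]
Mixing matrices, vectors and summation notation
In a similar fashion, you can also show that $X'X = \sum_{i=1}^{3} x_i x_i'$.
\[ X'X = \begin{pmatrix} x_{11} & x_{21} & x_{31} \\ x_{12} & x_{22} & x_{32} \end{pmatrix} \begin{pmatrix} x_{11} & x_{12} \\ x_{21} & x_{22} \\ x_{31} & x_{32} \end{pmatrix} = \begin{pmatrix} x_1 & x_2 & x_3 \end{pmatrix} \begin{pmatrix} x_1' \\ x_2' \\ x_3' \end{pmatrix} = x_1x_1' + x_2x_2' + x_3x_3' = \sum_{i=1}^{3} x_i x_i' \]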
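Both identities can be confirmed numerically by accumulating the sums one observation at a time and comparing them with the direct matrix products. In this sketch the entries of `X` and `u` are arbitrary illustrative values and the helper names are assumptions.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    """Plain row-by-column matrix product for lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def outer(a, b):
    """Outer product a b' of two list vectors."""
    return [[ai * bj for bj in b] for ai in a]

X = [[1, 2],
     [3, 4],
     [5, 6]]          # 3 x 2; row i holds x_i'
u = [1, 0, -1]

# Direct products.
XtX_direct = matmul(transpose(X), X)
Xtu_direct = [row[0] for row in matmul(transpose(X), [[ui] for ui in u])]

# Sums over observations: X'X = sum_i x_i x_i' and X'u = sum_i x_i u_i.
XtX_sum = [[0, 0], [0, 0]]
Xtu_sum = [0, 0]
for x_i, u_i in zip(X, u):
    term = outer(x_i, x_i)
    XtX_sum = [[XtX_sum[r][c] + term[r][c] for c in range(2)] for r in range(2)]
    Xtu_sum = [s + xij * u_i for s, xij in zip(Xtu_sum, x_i)]

print(XtX_direct, XtX_sum)
print(Xtu_direct, Xtu_sum)
```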
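Application: variance-covariance matrix
For the univariate case, $\mathrm{var}(Y) = E\left[ (Y - \mu)^2 \right]$.
In the multivariate case Y is a vector of n random variables. Without loss of generality, assume Y has mean zero, i.e. $E(Y) = \mu = 0$. Then,
\[ \mathrm{cov}(Y, Y) = \mathrm{var}(Y) = E\left[ (Y - \mu)(Y - \mu)' \right] = E\left[ \begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{pmatrix} \begin{pmatrix} Y_1 & Y_2 & \cdots & Y_n \end{pmatrix} \right] = E \begin{pmatrix} Y_1^2 & Y_1Y_2 & \cdots & Y_1Y_n \\ Y_2Y_1 & Y_2^2 & \cdots & Y_2Y_n \\ \vdots & \vdots & \ddots & \vdots \\ Y_nY_1 & Y_nY_2 & \cdots & Y_n^2 \end{pmatrix} \]
Application: variance-covariance matrix
Hence, we have a variance-covariance matrix:
\[ \mathrm{var}(Y) = \begin{pmatrix} \mathrm{var}(Y_1) & \mathrm{cov}(Y_1, Y_2) & \cdots & \mathrm{cov}(Y_1, Y_n) \\ \mathrm{cov}(Y_2, Y_1) & \mathrm{var}(Y_2) & \cdots & \mathrm{cov}(Y_2, Y_n) \\ \vdots & \vdots & \ddots & \vdots \\ \mathrm{cov}(Y_n, Y_1) & \mathrm{cov}(Y_n, Y_2) & \cdots & \mathrm{var}(Y_n) \end{pmatrix}. \]
What if we weight the random variables with a vector of constants, a?
\begin{align*}
\mathrm{var}(a'Y) &= E\left[ (a'Y - a'\mu)(a'Y - a'\mu)' \right] \\
&= E\left[ a'(Y - \mu)\left( a'(Y - \mu) \right)' \right] \\
&= E\left[ a'(Y - \mu)(Y - \mu)'a \right] \\
&= a' E\left[ (Y - \mu)(Y - \mu)' \right] a \\
&= a' \, \mathrm{var}(Y) \, a
\end{align*}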
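Application: variance of sums of random variables
Let $Y = (Y_1, Y_2)'$ be a vector of random variables and $a = (a_1, a_2)'$ be some constants,
\[ a'Y = \begin{pmatrix} a_1 & a_2 \end{pmatrix} \begin{pmatrix} Y_1 \\ Y_2 \end{pmatrix} = a_1Y_1 + a_2Y_2 \]
Now, $\mathrm{var}(a_1Y_1 + a_2Y_2) = \mathrm{var}(a'Y) = a' \, \mathrm{var}(Y) \, a$ where
\[ \mathrm{var}(Y) = \begin{pmatrix} \mathrm{var}(Y_1) & \mathrm{cov}(Y_1, Y_2) \\ \mathrm{cov}(Y_1, Y_2) & \mathrm{var}(Y_2) \end{pmatrix} \]
is the (symmetric) variance-covariance matrix.
\begin{align*}
\mathrm{var}(a'Y) &= a' \, \mathrm{var}(Y) \, a \\
&= \begin{pmatrix} a_1 & a_2 \end{pmatrix} \begin{pmatrix} \mathrm{var}(Y_1) & \mathrm{cov}(Y_1, Y_2) \\ \mathrm{cov}(Y_1, Y_2) & \mathrm{var}(Y_2) \end{pmatrix} \begin{pmatrix} a_1 \\ a_2 \end{pmatrix} \\
&= a_1^2 \, \mathrm{var}(Y_1) + a_2^2 \, \mathrm{var}(Y_2) + 2a_1a_2 \, \mathrm{cov}(Y_1, Y_2)
\end{align*}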
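The agreement between the matrix expression $a' \mathrm{var}(Y) a$ and the familiar scalar formula can be checked with numbers. The variance-covariance entries and the weights below are made-up illustrative values, and `quad_form` is an assumed helper name.

```python
def quad_form(a, v):
    """Evaluate a' V a for a list vector a and a list-of-rows matrix v."""
    n = len(a)
    return sum(a[i] * v[i][j] * a[j] for i in range(n) for j in range(n))

# Illustrative values: var(Y1) = 4, var(Y2) = 9, cov(Y1, Y2) = 1.5.
V = [[4.0, 1.5],
     [1.5, 9.0]]
a = [2.0, -1.0]

matrix_version = quad_form(a, V)
scalar_version = (a[0] ** 2 * V[0][0] + a[1] ** 2 * V[1][1]
                  + 2 * a[0] * a[1] * V[0][1])
print(matrix_version, scalar_version)
```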
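Application: Given a linear model $y = X\beta + u$ derive the OLS estimator $\hat{\beta}$. Show that $\hat{\beta}$ achieves a minimum.
The OLS estimator minimises the sum of squared residuals, $u'u = \sum_{i=1}^{n} u_i^2$, where $u = y - X\beta$ or $u_i = y_i - x_i'\beta$.
\begin{align*}
S(\beta) &= \sum_{i=1}^{n} (y_i - x_i'\beta)^2 = (y - X\beta)'(y - X\beta) \\
&= y'y - 2y'X\beta + \beta'X'X\beta.
\end{align*}
Take the first derivative of $S(\beta)$ and set it equal to zero:
\[ \frac{\partial S(\beta)}{\partial \beta} = -2X'y + 2X'X\beta = 0 \implies X'X\hat{\beta} = X'y. \]
Assuming X (and therefore $X'X$) is of full rank (so $X'X$ is invertible) we get,
\[ \hat{\beta} = (X'X)^{-1}X'y. \]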
Application: Given a linear model $y = X\beta + u$ derive the OLS estimator $\hat{\beta}$. Show that $\hat{\beta}$ achieves a minimum.
For a minimum we need to use the second order conditions:
\[ \frac{\partial^2 S(\beta)}{\partial \beta \, \partial \beta'} = 2X'X. \]
The solution will be a minimum if $X'X$ is a positive definite matrix. Let $q = c'X'Xc$ for some $c \neq 0$. Then
\[ q = v'v = \sum_{i=1}^{n} v_i^2, \quad \text{where } v = Xc. \]
Unless v = 0, q is positive. But if v = 0 then v = Xc would be a linear combination of the columns of X that equals 0, which contradicts the assumption that X has full rank.
Since c is arbitrary, q is positive for every $c \neq 0$, which establishes that $X'X$ is positive definite.
Therefore, if X has full rank, then the least squares solution $\hat{\beta}$ is unique and minimises the sum of squared residuals.
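As a small worked illustration of the derivation, the sketch below solves the normal equations $X'X\hat{\beta} = X'y$ for a regression with an intercept and one covariate. The data values are made up for the example, exact fractions are used so the checks hold without rounding, and the 2 × 2 inverse is written out by hand; none of the variable names come from the slides.

```python
from fractions import Fraction

x = [0, 1, 2, 3]                   # illustrative covariate values
y = [1, 3, 4, 7]                   # illustrative responses
X = [[1, xi] for xi in x]          # n x 2 design matrix with an intercept column

# Normal equations: (X'X) beta = X'y.
XtX = [[sum(r[i] * r[j] for r in X) for j in range(2)] for i in range(2)]
Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(2)]

# 2 x 2 inverse of X'X (requires X to have full column rank, so det != 0).
det = Fraction(XtX[0][0] * XtX[1][1] - XtX[0][1] * XtX[1][0])
XtX_inv = [[XtX[1][1] / det, -XtX[0][1] / det],
           [-XtX[1][0] / det, XtX[0][0] / det]]
beta_hat = [sum(XtX_inv[i][j] * Xty[j] for j in range(2)) for i in range(2)]

# First order condition: the residuals are orthogonal to the columns of X.
residuals = [yi - sum(b * v for b, v in zip(beta_hat, r)) for r, yi in zip(X, y)]
Xtu = [sum(r[i] * ui for r, ui in zip(X, residuals)) for i in range(2)]

print(beta_hat)                    # intercept and slope
print(Xtu)                         # should be the zero vector
print(XtX[0][0] > 0 and det > 0)   # leading principal minors of X'X are positive
```

The last line checks the second order condition for this data set: both leading principal minors of $X'X$ are positive, so $X'X$ is positive definite and the solution is a minimum.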

Matrix Operations
Operation                      R                                          Matlab
A = (5 7; 10 2)                A = matrix(c(5,7,10,2), ncol=2, byrow=T)   A = [5,7;10,2]
det(A)                         det(A)                                     det(A)
A^{-1}                         solve(A)                                   inv(A)
A + B                          A + B                                      A + B
AB                             A %*% B                                    A * B
A'                             t(A)                                       A'
eigenvalues & eigenvectors     eigen(A)                                   [V,E] = eig(A)
covariance matrix of X         var(X) or cov(X)                           cov(X)
estimate of rank(A)            qr(A)$rank                                 rank(A)
r × r identity matrix, I_r     diag(1,r)                                  eye(r)
Matlab Code
Figure 1
[x,y] = meshgrid(-10:0.75:10,-10:0.75:10);
surfc(x,y,x.^2 + y.^2)
ylabel('x_2')
xlabel('x_1')
zlabel('Q_1(x_1,x_2)')
Figure 2
[x,y] = meshgrid(-10:0.75:10,-10:0.75:10);
surfc(x,y,-x.^2 - y.^2)
ylabel('x_2')
xlabel('x_1')
Figure 3
[x,y] = meshgrid(-10:0.75:10,-10:0.75:10);
surfc(x,y,x.^2 - y.^2)
ylabel('x_2')
xlabel('x_1')
Figure 4
[x,y] = meshgrid(-10:0.75:10,-10:0.75:10);
surfc(x,y,x.^2 + 2.*x.*y + y.^2)
ylabel('x_2')
xlabel('x_1')
Figure 5
[x,y] = meshgrid(-10:0.75:10,-10:0.75:10);
surfc(x,y,-(x+y).^2)
ylabel('x_2')
xlabel('x_1')
