Matrix Algebra

Matteo Courthoud

2021-10-29

Basics

Matrix Definition

A real n \times m matrix A is an array

A = \begin{bmatrix} a_ {11} & a_ {12} & \dots & a_ {1m} \newline a_ {21} & a_ {22} & \dots & a_ {2m} \newline \vdots & \vdots & \ddots & \vdots \newline a_ {n1} & a_ {n2} & \dots & a_ {nm} \end{bmatrix}

We write [A]_ {ij} = a_ {ij} to indicate the (i,j)-element of A.

We will usually take the convention that a real vector x \in \mathbb R^n is identified with an n \times 1 matrix.

The n \times n identity matrix I_n is given by
[I_n]_ {ij} = \begin{cases} 1 & \text{if } i = j \newline 0 & \text{if } i \neq j \end{cases}

Fundamental Operations

  1. Two n \times m matrices A, B are added element-wise, so that [A+B]_ {ij} = [A]_ {ij} + [B]_ {ij}.
  2. A matrix A can be multiplied by a scalar c \in \mathbb R, in which case we set [cA]_ {ij} = c [A]_ {ij}.
  3. An n \times m matrix A can be multiplied with an m \times p matrix B.
  4. The product AB is defined according to the rule [AB]_ {ij} = \sum_ {k=1}^m [A]_ {ik} [B]_ {kj}.
  5. An n \times n matrix A is invertible if there exists a matrix B such that AB = I. In this case, we use the notational convention of writing B = A^{-1}.
  6. Matrix transposition is defined by [A']_ {ij} = [A]_ {ji}.
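
To make these operations concrete, here is a minimal numpy sketch (the matrices A and B are arbitrary illustrative examples):

```python
import numpy as np

# Two arbitrary 2x2 example matrices
A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[0.0, 1.0], [1.0, 0.0]])

C = A + B   # element-wise addition: [A+B]_ij = [A]_ij + [B]_ij
D = 3 * A   # scalar multiplication: [cA]_ij = c[A]_ij
E = A @ B   # matrix product: [AB]_ij = sum_k [A]_ik [B]_kj

A_inv = np.linalg.inv(A)                  # the matrix B with AB = I
assert np.allclose(A @ A_inv, np.eye(2))  # recovers the identity (up to rounding)

At = A.T    # transpose: [A']_ij = [A]_ji
```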

Trace and Determinant

The trace of a square matrix A with dimension n \times n is tr(A) = \sum_ {i=1}^n a_ {ii}.

The determinant of a square n \times n matrix A is defined according to one of the following two (equivalent) definitions.

  1. Recursively, as det(A) = \sum_ {i=1}^n a_ {ij} (-1)^{i+j} det([A]_ {-i,-j}) for any fixed column j, where [A]_ {-i,-j} is the matrix obtained by deleting the ith row and the jth column.
  2. As the image of A, A \mapsto det(A), under the unique alternating multilinear map on n \times n matrices such that I \mapsto 1.
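
As an illustration of the recursive definition, here is a sketch that expands along the first column and checks the result against np.linalg.det (the function name det_recursive is mine, not standard):

```python
import numpy as np

def det_recursive(A):
    # Laplace expansion along the first column (definition 1 above)
    n = A.shape[0]
    if n == 1:
        return A[0, 0]
    total = 0.0
    for i in range(n):
        # [A]_{-i,-1}: delete row i and the first column
        minor = np.delete(np.delete(A, i, axis=0), 0, axis=1)
        total += (-1) ** i * A[i, 0] * det_recursive(minor)
    return total

A = np.array([[2.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 4.0]])
assert np.isclose(det_recursive(A), np.linalg.det(A))
```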

Linear Independence

Vectors x_1, ..., x_k are linearly independent if the only solution to the equation b_1 x_1 + ... + b_k x_k = 0, with b_j \in \mathbb R, is b_1 = b_2 = ... = b_k = 0.
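
Numerically, linear independence can be checked by stacking the vectors into a matrix and comparing its rank to the number of vectors; a minimal sketch (the helper linearly_independent is hypothetical):

```python
import numpy as np

def linearly_independent(*vectors):
    # Stack the vectors as columns; they are independent iff the rank
    # equals the number of vectors (only the trivial combination gives 0)
    X = np.column_stack(vectors)
    return np.linalg.matrix_rank(X) == X.shape[1]

x1 = np.array([1.0, 0.0, 1.0])
x2 = np.array([0.0, 1.0, 1.0])
print(linearly_independent(x1, x2))           # True
print(linearly_independent(x1, x2, x1 + x2))  # False: third vector is a combination
```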

Useful Identities

  • (A+B)' = A' + B'
  • (AB)C = A(BC)
  • A(B+C) = AB + AC
  • (AB)' = B'A'
  • (A^{-1})' = (A')^{-1}
  • (AB)^{-1} = B^{-1}A^{-1}
  • tr(cA) = c \, tr(A)
  • tr(A+B) = tr(A) + tr(B)
  • tr(AB) = tr(BA)
  • det(I) = 1
  • det(cA) = c^n det(A) if A is n \times n and c \in \mathbb R
  • det(A') = det(A)
  • det(AB) = det(A) det(B)
  • det(A^{-1}) = (det(A))^{-1}
  • A^{-1} exists iff det(A) \neq 0
  • rank(A) = rank(A') = rank(A'A) = rank(AA')
  • A^{-1} exists iff rank(A) = n for A n \times n
  • rank(AB) \leq \min \lbrace rank(A), rank(B) \rbrace
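
Several of these identities are easy to spot-check numerically on random matrices; a minimal sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(3, 3))
B = rng.normal(size=(3, 3))

assert np.allclose((A @ B).T, B.T @ A.T)                 # (AB)' = B'A'
assert np.allclose(np.linalg.inv(A @ B),
                   np.linalg.inv(B) @ np.linalg.inv(A))  # (AB)^{-1} = B^{-1}A^{-1}
assert np.isclose(np.trace(A @ B), np.trace(B @ A))      # tr(AB) = tr(BA)
assert np.isclose(np.linalg.det(A @ B),
                  np.linalg.det(A) * np.linalg.det(B))   # det(AB) = det(A)det(B)
assert np.linalg.matrix_rank(A.T @ A) == np.linalg.matrix_rank(A)  # rank(A'A) = rank(A)
```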

Matrix Rank

The rank of a matrix A, denoted rank(A), is the maximal number of linearly independent rows of A.

Eigenvalues and Eigenvectors

Let A be an n \times n matrix. The n \times 1 vector x \neq 0 is an eigenvector of A with corresponding eigenvalue \lambda if Ax = \lambda x.
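
A quick numerical check of the definition (the matrix is an arbitrary example):

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 2.0]])
eigenvalues, eigenvectors = np.linalg.eig(A)

# Each column x of `eigenvectors` satisfies Ax = lambda x
for lam, x in zip(eigenvalues, eigenvectors.T):
    assert np.allclose(A @ x, lam * x)
```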

Definitions

  1. A matrix A is diagonal if [A]_ {ij} \neq 0 only if i=j.
  2. An n \times n matrix A is orthogonal if A'A = I.
  3. A matrix A is symmetric if [A]_ {ij} = [A]_ {ji}.
  4. An n \times n matrix A is idempotent if A^2=A.
  5. The matrix of zeros ([A]_ {ij} =0 for each i,j) is simply denoted 0.
  6. An n \times n matrix A is nilpotent if A^k=0 for some integer k>0.

Spectral Decomposition

Spectral Theorem

Theorem: Let A be an n \times n symmetric matrix. Then A can be factored as A = C \Lambda C' where C is orthogonal and \Lambda is diagonal.

If we postmultiply A by C, we get

  • AC = C \Lambda C'C and
  • AC = C \Lambda.

This is a matrix equation which can be split into columns. The ith column of the equation reads A c_i = \lambda_i c_i which corresponds to the definition of eigenvalues and eigenvectors. So if the decomposition exists, then C is the eigenvector matrix and \Lambda contains the eigenvalues.
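
A minimal numerical illustration, using np.linalg.eigh (which is specialized to symmetric matrices and returns an orthogonal eigenvector matrix):

```python
import numpy as np

rng = np.random.default_rng(0)
S = rng.normal(size=(4, 4))
A = S + S.T  # an arbitrary symmetric matrix

lam, C = np.linalg.eigh(A)  # eigenvalues and orthogonal eigenvector matrix
Lambda = np.diag(lam)

assert np.allclose(C.T @ C, np.eye(4))   # C is orthogonal: C'C = I
assert np.allclose(A, C @ Lambda @ C.T)  # A = C Lambda C'
```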

Rank and Trace

Theorem: The rank of a symmetric matrix equals the number of nonzero eigenvalues.

Proof: rank(A) = rank(C\Lambda C') = rank(\Lambda) = | \lbrace i: \lambda_i \neq 0 \rbrace |, where the second equality holds because C and C' are invertible, and multiplication by an invertible matrix preserves rank. \tag*{$\blacksquare$}

Theorem: The nonzero eigenvalues of AA' and A'A are identical.

Theorem: The trace of a symmetric matrix equals the sum of its eigenvalues.

Proof: tr(A) = tr(C \Lambda C') = tr((C \Lambda)C') = tr(C'C \Lambda) = tr(\Lambda) = \sum_ {i=1}^n \lambda_i. \tag*{$\blacksquare$}

Theorem: The determinant of a symmetric matrix equals the product of its eigenvalues.

Proof: det(A) = det(C \Lambda C') = det(C)det(\Lambda)det(C') = det(C)det(C')det(\Lambda) = det(CC') det(\Lambda) = det(I)det(\Lambda) = det(\Lambda) = \prod_ {i=1}^n \lambda_i. \tag*{$\blacksquare$}
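
The three theorems above can be verified together on a random symmetric matrix; a minimal sketch:

```python
import numpy as np

rng = np.random.default_rng(1)
S = rng.normal(size=(4, 4))
A = S + S.T  # an arbitrary symmetric matrix

lam = np.linalg.eigvalsh(A)  # eigenvalues of a symmetric matrix

assert np.linalg.matrix_rank(A) == np.sum(~np.isclose(lam, 0))  # rank = # nonzero eigenvalues
assert np.isclose(np.trace(A), lam.sum())                       # trace = sum of eigenvalues
assert np.isclose(np.linalg.det(A), lam.prod())                 # det = product of eigenvalues
```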

Eigenvalues

Theorem: For any symmetric matrix A, the eigenvalues of A^2 are the squares of the eigenvalues of A, and the eigenvectors are the same.

Proof: A = C \Lambda C' \implies A^2 = C \Lambda C' C \Lambda C' = C \Lambda I \Lambda C' = C \Lambda^2 C' \tag*{$\blacksquare$}

Theorem: For any symmetric matrix A, and any integer k>0, the eigenvalues of A^k are the kth power of the eigenvalues of A, and the eigenvectors are the same.

Theorem: Any square symmetric matrix A with positive eigenvalues can be written as the product of a lower triangular matrix L and its (upper triangular) transpose L' = U. That is, A = LU = LL' (the Cholesky decomposition).

Note that A = LL' = LU = U'U, so that A^{-1} = (L')^{-1}L^{-1} = U^{-1}(U')^{-1}, where L^{-1} is lower triangular and U^{-1} is upper triangular. You can check this for the 2 \times 2 case. Also note that the validity of the theorem can be extended to symmetric matrices with non-negative eigenvalues by a limiting argument. However, the proof is then no longer constructive.
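
numpy exposes this factorization directly as np.linalg.cholesky, which returns the lower triangular factor L; a minimal sketch on a matrix constructed to have positive eigenvalues:

```python
import numpy as np

rng = np.random.default_rng(2)
S = rng.normal(size=(3, 3))
A = S @ S.T + 3 * np.eye(3)  # symmetric with positive eigenvalues

L = np.linalg.cholesky(A)          # lower triangular Cholesky factor
assert np.allclose(A, L @ L.T)     # A = LL'
assert np.allclose(L, np.tril(L))  # L is indeed lower triangular
```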

Quadratic Forms and Definite Matrices

Definition

A quadratic form in the n \times n matrix A and n \times 1 vector x is defined by the scalar x'Ax.

  1. A is negative definite (ND) if for each x \neq 0, x'Ax < 0
  2. A is negative semidefinite (NSD) if for each x \neq 0, x'Ax \leq 0
  3. A is positive definite (PD) if for each x \neq 0, x'Ax > 0
  4. A is positive semidefinite (PSD) if for each x \neq 0, x'Ax \geq 0

Equivalence

Theorem: Let A be a symmetric matrix. Then A is PD (ND) \iff all of its eigenvalues are positive (negative).
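
This gives a simple numerical test for definiteness; a minimal sketch (the helper is_positive_definite is hypothetical, and the tolerance is a practical choice):

```python
import numpy as np

def is_positive_definite(A, tol=1e-12):
    # For symmetric A: PD iff all eigenvalues are strictly positive
    return bool(np.all(np.linalg.eigvalsh(A) > tol))

A = np.array([[2.0, -1.0], [-1.0, 2.0]])  # eigenvalues 1 and 3
B = np.array([[1.0, 2.0], [2.0, 1.0]])    # eigenvalues -1 and 3
print(is_positive_definite(A), is_positive_definite(B))  # True False
```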

Some more results:

  1. If a symmetric matrix A is PD (PSD, ND, NSD), then \text{det}(A) >(\geq,<,\leq) 0.
  2. If a symmetric matrix A is PD (ND), then A^{-1} is symmetric PD (ND).
  3. The identity matrix is PD (since all eigenvalues are equal to 1).
  4. Every symmetric idempotent matrix is PSD (since the eigenvalues are only 0 or 1).

Theorem: If A is n\times k with n>k and rank(A)=k, then A'A is PD and AA' is PSD.

The semidefinite partial order is defined by A \geq B iff A-B is PSD.

Theorem: Let A, B be symmetric, square, PD, and conformable. Then A-B is PD iff B^{-1}-A^{-1} is PD.
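
A small numerical illustration of this inverse reversal (the matrices and the is_pd helper are illustrative choices):

```python
import numpy as np

def is_pd(M, tol=1e-12):
    return bool(np.all(np.linalg.eigvalsh(M) > tol))

A = np.array([[3.0, 0.0], [0.0, 3.0]])
B = np.array([[1.0, 0.5], [0.5, 1.0]])

print(is_pd(A - B))                                # True: A - B is PD
print(is_pd(np.linalg.inv(B) - np.linalg.inv(A)))  # True: the order reverses under inversion
```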

Matrix Calculus

Conformable Matrices

We first define matrices blockwise when they are conformable. In particular, we assume that if A_1, A_2, A_3, A_4 are matrices with appropriate dimensions, then the matrix A = \begin{bmatrix} A_1 & A_2 \newline A_3 & A_4 \end{bmatrix} is defined in the obvious way.
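
In numpy, blockwise assembly corresponds to np.block; a minimal sketch with conformable (but otherwise arbitrary) blocks:

```python
import numpy as np

A1, A2 = np.ones((2, 2)), np.zeros((2, 3))
A3, A4 = np.zeros((1, 2)), np.ones((1, 3))

# Blocks must be conformable: A1 and A2 share rows, A1 and A3 share columns
A = np.block([[A1, A2], [A3, A4]])
print(A.shape)  # (3, 5)
```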

Matrix Functions

Let F: \mathbb R^{m \times n} \rightarrow \mathbb R^{p \times q} be a matrix-valued function. More precisely, given a real m \times n matrix X, F(X) returns the p \times q matrix
\begin{bmatrix} f_ {11}(X) & ... & f_ {1q}(X) \newline \vdots & \ddots & \vdots \newline f_ {p1}(X)& ... & f_ {pq}(X) \end{bmatrix}

Matrix Derivatives

The derivative of F with respect to the matrix X is the mp \times nq matrix \frac{\partial F(X)}{\partial X} = \begin{bmatrix} \frac{\partial F(X)}{\partial x_ {11}} & ... & \frac{\partial F(X)}{\partial x_ {1n}} \newline \vdots & \ddots & \vdots \newline \frac{\partial F(X)}{\partial x_ {m1}} & ... & \frac{\partial F(X)}{\partial x_ {mn}} \end{bmatrix} where each \frac{\partial F(X)}{\partial x_ {ij}} is a p\times q matrix given by
\frac{\partial F(X)}{\partial x_ {ij}} = \begin{bmatrix} \frac{\partial f_ {11}(X)}{\partial x_ {ij}} & ... & \frac{\partial f_ {1q}(X)}{\partial x_ {ij}} \newline \vdots & \ddots & \vdots \newline \frac{\partial f_ {p1}(X)}{\partial x_ {ij}} & ... & \frac{\partial f_ {pq}(X)}{\partial x_ {ij}} \end{bmatrix} The most important case is F: \mathbb R^n \rightarrow \mathbb R, since it is all that is needed to derive the least squares estimator. The trickiest part is making sure that the dimensions are correct.

Useful Results in Matrix Calculus

  1. \frac{\partial b'x}{\partial x}= b for dim(b) = dim(x)
  2. \frac{\partial B'x}{\partial x}= B for arbitrary, conformable B
  3. \frac{\partial B'x}{\partial x'}= B' for arbitrary, conformable B
  4. \frac{\partial x'Ax}{\partial x} = (A + A')x
  5. \frac{\partial x'Ax}{\partial A} = xx'
  6. \frac{\partial det(A)}{\partial A} = det(A) (A^{-1})'
  7. \frac{\partial \ln det(A)}{\partial A} = (A^{-1})'
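
Results 1 and 4, the ones used for the least squares estimator, can be spot-checked with a finite-difference gradient; a minimal sketch (num_grad is a hypothetical helper):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4
A = rng.normal(size=(n, n))
b = rng.normal(size=n)
x = rng.normal(size=n)

def num_grad(f, x, eps=1e-6):
    # Central finite-difference approximation of the gradient of a scalar f
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = eps
        g[i] = (f(x + e) - f(x - e)) / (2 * eps)
    return g

assert np.allclose(num_grad(lambda v: b @ v, x), b)                  # result 1: d(b'x)/dx = b
assert np.allclose(num_grad(lambda v: v @ A @ v, x), (A + A.T) @ x)  # result 4: d(x'Ax)/dx = (A+A')x
```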