Matrix Algebra

Last updated on Oct 29, 2021

Basics

Matrix Definition

A real $n \times m$ matrix $A$ is an array

$$ A= \begin{bmatrix} a_{11} & a_{12} & \dots & a_{1m} \newline a_{21} & a_{22} & \dots & a_{2m} \newline \vdots & \vdots & \ddots & \vdots \newline a_{n1} & a_{n2} & \dots & a_{nm} \end{bmatrix} $$

We write $[A]_ {ij} = a_ {ij}$ to indicate the $(i,j)$-element of $A$.

We will usually take the convention that a real vector $x \in \mathbb R^n$ is identified with an $n \times 1$ matrix.

The $n \times n$ identity matrix $I_n$ is given by
$$ [I_n] _ {ij} = \begin{cases} 1 \ \ \ \text{if} \ i=j \newline 0 \ \ \ \text{if} \ i \neq j \end{cases} $$
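For concreteness, a minimal NumPy sketch of these definitions (NumPy is our choice here; note it indexes from 0, while the notation above indexes from 1):

```python
import numpy as np

# A 2 x 3 real matrix: n = 2 rows, m = 3 columns.
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])

print(A.shape)   # (2, 3)
print(A[0, 1])   # the (1,2)-element a_12 = 2.0 (0-based indexing)

# The 3 x 3 identity matrix I_3.
print(np.eye(3))
```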

Fundamental Operations

  1. Two $n \times m$ matrices, $A,B$, are added element-wise so that $[A+B]_{ij} = [A] _{ij} + [B] _{ij}$.
  2. A matrix $A$ can be multiplied by a scalar $c\in \mathbb{R}$ in which case we set $[cA]_{ij} = c[A] _{ij}$.
  3. An $n \times m$ matrix $A$ can be multiplied with an $m \times p$ matrix $B$.
  4. The product $AB$ is defined according to the rule $[AB] _ {ij} = \sum_{k=1}^m [A] _{ik} [B] _{kj}$.
  5. An $n \times n$ matrix $A$ is invertible if there exists a matrix $B$ such that $AB=I$. In this case, we write $B = A^{-1}$.
  6. Matrix transposition is defined by $[A’] _{ij} = [A] _{ji}$.
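The operations above, in the same NumPy sketch style (the example matrices are arbitrary):

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[0.0, 1.0], [1.0, 0.0]])

print(A + B)                 # element-wise addition
print(2.0 * A)               # scalar multiplication
print(A @ B)                 # matrix product: [AB]_ij = sum_k [A]_ik [B]_kj
print(A.T)                   # transpose A'
print(A @ np.linalg.inv(A))  # approximately I (up to floating-point error)
```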

Trace and Determinant

The trace of a square matrix $A$ with dimension $n \times n$ is $\text{tr}(A) = \sum_{i=1}^n a_{ii}$.

The determinant of a square $n \times n$ matrix $A$ is defined according to one of the following two (equivalent) definitions.

  1. Recursively, by expansion along any fixed column $j$: $det(A) = \sum_{i=1}^n (-1)^{i+j} a_{ij} det([A]_ {-i,-j})$, where $[A]_ {-i,-j}$ is the matrix obtained by deleting the $i$th row and the $j$th column (see the sketch after this list).
  2. $A \mapsto det(A)$ under the unique alternating multilinear map on $n \times n$ matrices such that $I \mapsto 1$.
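The recursive definition can be checked numerically. The sketch below (the helper name det_laplace is ours) expands along the first column; it is an exponential-time teaching device, not a practical algorithm:

```python
import numpy as np

def det_laplace(A):
    """Determinant by Laplace expansion along the first column."""
    n = A.shape[0]
    if n == 1:
        return A[0, 0]
    total = 0.0
    for i in range(n):
        # Delete row i and column 0 to form the minor.
        minor = np.delete(np.delete(A, i, axis=0), 0, axis=1)
        total += (-1) ** i * A[i, 0] * det_laplace(minor)
    return total

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
print(det_laplace(A), np.linalg.det(A))  # both approximately 8.0
```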

Linear Independence

Vectors $x_1, \dots, x_k$ are linearly independent if the only solution to the equation $b_1 x_1 + \dots + b_k x_k = 0, \ b_j \in \mathbb R$, is $b_1 = b_2 = \dots = b_k = 0$.
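Numerically, one can stack the vectors as columns and compare the rank of the stacked matrix to the number of vectors; a sketch:

```python
import numpy as np

x1 = np.array([1.0, 0.0, 1.0])
x2 = np.array([0.0, 1.0, 1.0])
x3 = np.array([1.0, 1.0, 2.0])   # x3 = x1 + x2, so the three are dependent

# Independent iff rank equals the number of stacked vectors.
X = np.column_stack([x1, x2, x3])
print(np.linalg.matrix_rank(X) == X.shape[1])  # False

Y = np.column_stack([x1, x2])
print(np.linalg.matrix_rank(Y) == Y.shape[1])  # True
```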

Useful Identities

  • $(A+B)’ =A’+B'$
  • $(AB)C = A(BC)$
  • $A(B+C) = AB+AC$
  • $(AB)’ = B’A’$
  • $(A^{-1})’ = (A’)^{-1}$
  • $(AB)^{-1} = B^{-1}A^{-1}$
  • $\text{tr}(cA) = c\text{tr}(A)$
  • $\text{tr}(A+B) = \text{tr}(A) + \text{tr}(B)$
  • $\text{tr}(AB) =\text{tr}(BA)$
  • $det(I)=1$
  • $det(cA) = c^ndet(A)$ if $A$ is $n \times n$ and $c \in \mathbb R$
  • $det(A) = det(A’)$
  • $det(AB) = det(A)det(B)$
  • $det(A^{-1}) = (det(A))^{-1}$
  • $A^{-1}$ exists iff $det(A) \neq 0$
  • $rank(A) = rank(A’) = rank(A’A) = rank(AA’)$
  • $A^{-1}$ exists iff $rank(A)=n$ for $A$ $n \times n$
  • $rank(AB) \leq \min \lbrace rank(A), rank(B) \rbrace$
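Several of these identities can be spot-checked on random matrices (which are invertible with probability one); a sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))

print(np.allclose((A @ B).T, B.T @ A.T))                  # (AB)' = B'A'
print(np.allclose(np.trace(A @ B), np.trace(B @ A)))      # tr(AB) = tr(BA)
print(np.isclose(np.linalg.det(A @ B),
                 np.linalg.det(A) * np.linalg.det(B)))    # det(AB) = det(A)det(B)
print(np.allclose(np.linalg.inv(A @ B),
                  np.linalg.inv(B) @ np.linalg.inv(A)))   # (AB)^{-1} = B^{-1}A^{-1}
```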

Matrix Rank

The rank of a matrix $A$, written $rank(A)$, is the maximal number of linearly independent rows (equivalently, columns) of $A$.

Let $A$ be an $n \times n$ matrix. The $n \times 1$ vector $x \neq 0$ is an eigenvector of $A$ with corresponding eigenvalue $\lambda$ if $Ax = \lambda x$.
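A small numerical check of this definition (the example matrix is arbitrary):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
eigvals, eigvecs = np.linalg.eig(A)

# Each column of eigvecs is an eigenvector x with A x = lambda x.
for lam, x in zip(eigvals, eigvecs.T):
    print(np.allclose(A @ x, lam * x))  # True, True
```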

Definitions

  1. A matrix $A$ is diagonal if $[A]_ {ij} \neq 0$ only if $i=j$.
  2. An $n \times n$ matrix $A$ is orthogonal if $A’A = I$
  3. A matrix $A$ is symmetric if $[A]_ {ij} = [A]_ {ji}$.
  4. An $n \times n$ matrix $A$ is idempotent if $A^2=A$.
  5. The matrix of zeros ($[A]_ {ij} =0$ for each $i,j$) is simply denoted 0.
  6. An $n \times n$ matrix $A$ is nilpotent if $A^k=0$ for some integer $k>0$.
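A few small instances of these definitions (the example matrices are ours):

```python
import numpy as np

P = np.full((2, 2), 0.5)                     # a projection matrix
print(np.allclose(P @ P, P))                 # idempotent: P^2 = P

N = np.array([[0.0, 1.0],
              [0.0, 0.0]])
print(np.allclose(N @ N, np.zeros((2, 2))))  # nilpotent: N^2 = 0

Q = np.array([[0.0, -1.0],
              [1.0,  0.0]])                  # rotation by 90 degrees
print(np.allclose(Q.T @ Q, np.eye(2)))       # orthogonal: Q'Q = I
```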

Spectral Decomposition

Spectral Theorem

Theorem: Let $A$ be an $n \times n$ symmetric matrix. Then $A$ can be factored as $A = C \Lambda C’$ where $C$ is orthogonal and $\Lambda$ is diagonal.

If we postmultiply $A$ by $C$, we get

  • $AC = C \Lambda C’C$ and
  • $AC = C \Lambda$, since $C’C = I$.

This is a matrix equation which can be split into columns. The $i$th column of the equation reads $A c_i = \lambda_i c_i$ which corresponds to the definition of eigenvalues and eigenvectors. So if the decomposition exists, then $C$ is the eigenvector matrix and $\Lambda$ contains the eigenvalues.
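Numerically, np.linalg.eigh computes exactly this factorization for symmetric matrices; a quick check:

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((4, 4))
A = (M + M.T) / 2                        # symmetrize to obtain a symmetric A

lam, C = np.linalg.eigh(A)               # eigenvalues and orthonormal eigenvectors
Lam = np.diag(lam)

print(np.allclose(C.T @ C, np.eye(4)))   # C is orthogonal: C'C = I
print(np.allclose(A, C @ Lam @ C.T))     # A = C Lambda C'
print(np.allclose(A @ C, C @ Lam))       # column-wise: A c_i = lambda_i c_i
```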

Rank and Trace

Theorem: The rank of a symmetric matrix equals the number of nonzero eigenvalues.

Proof: $rank(A) = rank(C\Lambda C’) = rank(\Lambda) = | \lbrace i: \lambda_i \neq 0 \rbrace |$. $$\tag*{$\blacksquare$}$$

Theorem: The nonzero eigenvalues of $AA’$ and $A’A$ are identical.

Theorem: The trace of a symmetric matrix equals the sum of its eigenvalues.

Proof: $tr(A) = tr(C \Lambda C’) = tr((C \Lambda)C’) = tr(C’C \Lambda) = tr(\Lambda) = \sum_ {i=1}^n \lambda_i.$ $$\tag*{$\blacksquare$}$$

Theorem: The determinant of a symmetric matrix equals the product of its eigenvalues.

Proof: $det(A) = det(C \Lambda C’) = det(C)det(\Lambda)det(C’) = det(C)det(C’)det(\Lambda) = det(CC’) det(\Lambda) = det(I)det(\Lambda) = det(\Lambda) = \prod_ {i=1}^n \lambda_i.$ $$\tag*{$\blacksquare$}$$
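These three results can be checked on a small example, here a symmetric rank-2 matrix built from two outer products:

```python
import numpy as np

u = np.array([1.0, 2.0, 0.0])
v = np.array([0.0, 1.0, 1.0])
A = np.outer(u, u) + np.outer(v, v)      # symmetric, rank 2

lam = np.linalg.eigvalsh(A)
print(np.linalg.matrix_rank(A),
      np.sum(~np.isclose(lam, 0.0)))     # 2 2: rank = number of nonzero eigenvalues
print(np.isclose(np.trace(A), lam.sum()))        # True: trace = sum
print(np.isclose(np.linalg.det(A), lam.prod()))  # True: determinant = product (here 0)
```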

Eigenvalues

Theorem: For any symmetric matrix $A$, the eigenvalues of $A^2$ are the squares of the eigenvalues of $A$, and the eigenvectors are the same.

Proof: $A = C \Lambda C’ \implies A^2 = C \Lambda C’ C \Lambda C’ = C \Lambda I \Lambda C’ = C \Lambda^2 C’$ $$\tag*{$\blacksquare$}$$

Theorem: For any symmetric matrix $A$, and any integer $k>0$, the eigenvalues of $A^k$ are the $k$th power of the eigenvalues of $A$, and the eigenvectors are the same.

Theorem: Any symmetric matrix $A$ with positive eigenvalues can be written as the product of a lower triangular matrix $L$ and its (upper triangular) transpose $L’ = U$. That is, $A = LU = LL’$. This is the Cholesky decomposition.

Note that $$ A = LL’ = LU = U’U, \qquad A^{-1} = (L’)^{-1}L^{-1} = U^{-1}(U’)^{-1}, $$ where $L^{-1}$ is lower triangular and $U^{-1}$ is upper triangular. You can check this for the $2 \times 2$ case. Also note that the theorem extends to symmetric matrices with non-negative eigenvalues by a limiting argument; however, the proof is then no longer constructive.
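Numerically, np.linalg.cholesky returns the lower triangular factor $L$; a quick check of the identities above:

```python
import numpy as np

A = np.array([[4.0, 2.0],
              [2.0, 3.0]])               # symmetric with positive eigenvalues

L = np.linalg.cholesky(A)                # lower triangular
U = L.T                                  # upper triangular

print(np.allclose(A, L @ L.T))           # A = LL'
print(np.allclose(A, U.T @ U))           # A = U'U
print(np.allclose(np.linalg.inv(A),
                  np.linalg.inv(L.T) @ np.linalg.inv(L)))  # A^{-1} = (L')^{-1} L^{-1}
```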

Quadratic Forms and Definite Matrices

Definition

A quadratic form in the $n \times n$ matrix $A$ and $n \times 1$ vector $x$ is defined by the scalar $x’Ax$.

  1. $A$ is negative definite (ND) if for each $x \neq 0$, $x’Ax < 0$
  2. $A$ is negative semidefinite (NSD) if for each $x \neq 0$, $x’Ax \leq 0$
  3. $A$ is positive definite (PD) if for each $x \neq 0$, $x’Ax > 0$
  4. $A$ is positive semidefinite (PSD) if for each $x \neq 0$, $x’Ax \geq 0$

Equivalence

Theorem: Let $A$ be a symmetric matrix. Then $A$ is PD(ND) $\iff$ all of its eigenvalues are positive (negative).
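A sketch of this equivalence as a numerical test (the helper name is_positive_definite and the tolerance are our choices):

```python
import numpy as np

def is_positive_definite(A, tol=1e-10):
    """PD check for symmetric A: all eigenvalues strictly positive."""
    return bool(np.all(np.linalg.eigvalsh(A) > tol))

A = np.array([[2.0, -1.0],
              [-1.0, 2.0]])              # eigenvalues 1 and 3
print(is_positive_definite(A))           # True
print(is_positive_definite(-A))          # False (-A is ND)

# Cross-check the definition directly on a few random x != 0.
rng = np.random.default_rng(2)
print(all(x @ A @ x > 0 for x in rng.standard_normal((5, 2))))  # True
```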

Some more results:

  1. If a symmetric matrix $A$ is PD (PSD), then $det(A) > (\geq)\ 0$; if $A$ is ND (NSD), then $(-1)^n det(A) > (\geq)\ 0$, since the determinant is the product of the eigenvalues.
  2. If a symmetric matrix $A$ is PD (ND), then $A^{-1}$ is symmetric PD (ND).
  3. The identity matrix is PD (since all eigenvalues are equal to 1).
  4. Every symmetric idempotent matrix is PSD (since the eigenvalues are only 0 or 1).

Theorem: If $A$ is $n\times k$ with $n>k$ and $rank(A)=k$, then $A’A$ is PD and $AA’$ is PSD.

The semidefinite partial order is defined by $A \geq B$ iff $A-B$ is PSD.

Theorem: Let $A$, $B$ be symmetric, PD, and conformable. Then $A-B$ is PD iff $B^{-1}-A^{-1}$ is PD.
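A small numerical illustration of this theorem (the PD matrices below are arbitrary examples; is_pd is our helper):

```python
import numpy as np

def is_pd(M, tol=1e-10):
    return bool(np.all(np.linalg.eigvalsh(M) > tol))

A = np.array([[3.0, 1.0],
              [1.0, 3.0]])
B = np.eye(2)

print(is_pd(A - B))                                # True
print(is_pd(np.linalg.inv(B) - np.linalg.inv(A)))  # True, as the theorem predicts
```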

Matrix Calculus

Conformable Matrices

We first define matrices blockwise when they are conformable. In particular, if $A_1, A_2, A_3, A_4$ are matrices with appropriate dimensions, then the matrix $$ A = \begin{bmatrix} A_1 & A_2 \newline A_3 & A_4 \end{bmatrix} $$ is defined in the obvious way.

Matrix Functions

Let $F: \mathbb R^{m \times n} \rightarrow \mathbb R^{p \times q}$ be a matrix valued function. More precisely, given a real $m \times n$ matrix $X$, $F(X)$ returns the $p \times q$ matrix
$$ \begin{bmatrix} f_ {11}(X) & \dots & f_ {1q}(X) \newline \vdots & \ddots & \vdots \newline f_ {p1}(X) & \dots & f_ {pq}(X) \end{bmatrix} $$

Matrix Derivatives

The derivative of $F$ with respect to the matrix $X$ is the $mp \times nq$ matrix $$ \frac{\partial F(X)}{\partial X} = \begin{bmatrix} \frac{\partial F(X)}{\partial x_ {11}} & \dots & \frac{\partial F(X)}{\partial x_ {1n}} \newline \vdots & \ddots & \vdots \newline \frac{\partial F(X)}{\partial x_ {m1}} & \dots & \frac{\partial F(X)}{\partial x_ {mn}} \end{bmatrix} $$ where each $\frac{\partial F(X)}{\partial x_ {ij}}$ is a $p\times q$ matrix given by
$$ \frac{\partial F(X)}{\partial x_ {ij}} = \begin{bmatrix} \frac{\partial f_ {11}(X)}{\partial x_ {ij}} & \dots & \frac{\partial f_ {1q}(X)}{\partial x_ {ij}} \newline \vdots & \ddots & \vdots \newline \frac{\partial f_ {p1}(X)}{\partial x_ {ij}} & \dots & \frac{\partial f_ {pq}(X)}{\partial x_ {ij}} \end{bmatrix} $$ The most important case is $F: \mathbb R^n \rightarrow \mathbb R$, since it is all we need to derive the least squares estimator. The trickiest part is making sure the dimensions are correct.
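The layout can be made concrete with a finite-difference sketch that assembles the $mp \times nq$ derivative block by block (the helper matrix_derivative and the step size are our choices):

```python
import numpy as np

def matrix_derivative(F, X, eps=1e-6):
    """Numerical dF/dX laid out as the (mp) x (nq) block matrix above."""
    m, n = X.shape
    p, q = F(X).shape
    D = np.zeros((m * p, n * q))
    for i in range(m):
        for j in range(n):
            E = np.zeros_like(X)
            E[i, j] = eps
            # Central difference gives the p x q block dF/dx_ij.
            block = (F(X + E) - F(X - E)) / (2 * eps)
            D[i * p:(i + 1) * p, j * q:(j + 1) * q] = block
    return D

X = np.array([[1.0, 2.0],
              [3.0, 4.0]])
F = lambda X: X @ X.T                    # an example matrix-valued function
print(matrix_derivative(F, X).shape)     # (4, 4): mp x nq with m = n = p = q = 2
```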

Useful Results in Matrix Calculus

  1. $\frac{\partial b’x}{\partial x}= b$ for $dim(b) = dim(x)$
  2. $\frac{\partial B’x}{\partial x}= B$ for arbitrary, conformable $B$
  3. $\frac{\partial B’x}{\partial x’}= B’$ for arbitrary, conformable $B$
  4. $\frac{\partial x’Ax}{\partial x} = (A + A’)x$
  5. $\frac{\partial x’Ax}{\partial A} = xx'$
  6. $\frac{\partial det(A)}{\partial A} = det(A) (A^{-1})’$
  7. $\frac{\partial \ln det(A)}{\partial A} = (A^{-1})'$
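Identities 4 and 7 can be verified by finite differences; a sketch (step size and tolerances are ad hoc):

```python
import numpy as np

rng = np.random.default_rng(3)
n, eps = 3, 1e-6
A = rng.standard_normal((n, n))
x = rng.standard_normal(n)

# Identity 4: the gradient of x'Ax with respect to x is (A + A')x.
grad = np.array([(np.dot(x + eps * e, A @ (x + eps * e))
                  - np.dot(x - eps * e, A @ (x - eps * e))) / (2 * eps)
                 for e in np.eye(n)])
print(np.allclose(grad, (A + A.T) @ x, atol=1e-5))    # True

# Identity 7: d ln det(A) / dA = (A^{-1})', for det(A) > 0.
A = np.eye(n) + 0.1 * rng.standard_normal((n, n))     # non-symmetric, det(A) > 0
G = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        E = np.zeros((n, n))
        E[i, j] = eps
        G[i, j] = (np.log(np.linalg.det(A + E))
                   - np.log(np.linalg.det(A - E))) / (2 * eps)
print(np.allclose(G, np.linalg.inv(A).T, atol=1e-5))  # True
```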