Matrix Algebra
Last updated on Oct 29, 2021
Basics
Matrix Definition
A real $n \times m$ matrix $A$ is an array
$$ A= \begin{bmatrix} a_{11} & a_{12} & \dots & a_{1m} \newline a_{21} & a_{22} & \dots & a_{2m} \newline \vdots & \vdots & \ddots & \vdots \newline a_{n1} & a_{n2} & \dots & a_{nm} \end{bmatrix} $$
We write $[A]_ {ij} = a_ {ij}$ to indicate the $(i,j)$-element of $A$.
We will usually take the convention that a real vector $x \in \mathbb R^n$ is identified with an $n \times 1$ matrix.
The $n \times n$ identity matrix $I_n$ is given by
$$
[I_n] _ {ij} = \begin{cases} 1 \ \ \ \text{if} \ i=j \newline
0 \ \ \ \text{if} \ i \neq j \end{cases}
$$
Fundamental Operations
- Two $n \times m$ matrices, $A,B$, are added element-wise so that $[A+B]_{ij} = [A] _{ij} + [B] _{ij}$.
- A matrix $A$ can be multiplied by a scalar $c\in \mathbb{R}$ in which case we set $[cA]_{ij} = c[A] _{ij}$.
- An $n \times m$ matrix $A$ can be multiplied with an $m \times p$ matrix $B$.
- The product $AB$ is defined according to the rule $[AB] _ {ij} = \sum_{k=1}^m [A] _{ik} [B] _{kj}$.
- An $n \times n$ matrix $A$ is invertible if there exists a matrix $B$ such that $AB = BA = I_n$. In this case, we use the notational convention of writing $B = A^{-1}$.
- Matrix transposition is defined by $[A’] _{ij} = [A] _{ji}$.
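The operations above can be checked numerically. The following is a minimal NumPy sketch with arbitrary illustrative matrices (the particular values of $A$, $B$, $C$, and $c$ are not from the text).

```python
# Minimal NumPy sketch of the fundamental operations; all matrices are
# arbitrary illustrative examples.
import numpy as np

A = np.array([[1., 2., 0.],
              [3., 1., 4.]])        # 2 x 3
B = np.array([[0., 1., 1.],
              [2., 0., 5.]])        # 2 x 3
C = np.array([[1., 0.],
              [2., 1.],
              [0., 3.]])            # 3 x 2
c = 2.5

S = A + B                  # element-wise addition: [A+B]_ij = [A]_ij + [B]_ij
cA = c * A                 # scalar multiplication: [cA]_ij = c [A]_ij
P = A @ C                  # (2 x 3) times (3 x 2) gives a 2 x 2 product
At = A.T                   # transpose: [A']_ij = [A]_ji
I2 = np.eye(2)             # the identity matrix I_2

P_inv = np.linalg.inv(P)   # inverse, which exists here since det(P) != 0
print(np.allclose(P @ P_inv, I2))   # True: P P^{-1} = I
```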
Trace and Determinant
The trace of a square matrix $A$ with dimension $n \times n$ is $\text{tr}(A) = \sum_{i=1}^n a_{ii}$.
The determinant of a square $n \times n$ matrix $A$ is defined according to one of the following two (equivalent) definitions.
- Recursively, by cofactor expansion along any fixed column $j$, as $det(A) = \sum_{i=1}^n a_{ij} (-1)^{i+j} det([A]_ {-i,-j})$, where $[A]_ {-i,-j}$ is the matrix obtained by deleting the $i$th row and the $j$th column.
- $A \mapsto det(A)$ under the unique alternating multilinear map on $n \times n$ matrices such that $I \mapsto 1$.
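A direct (if inefficient) way to see the recursive definition in action is to implement the cofactor expansion and compare it against a library routine. The sketch below expands along the first column; `cofactor_det` is an illustrative name, and `np.linalg.det` is used only as a cross-check.

```python
# Recursive (cofactor-expansion) definition of the determinant,
# expanding along the first column (j = 1 in 1-based indexing).
import numpy as np

def cofactor_det(A: np.ndarray) -> float:
    n = A.shape[0]
    if n == 1:
        return float(A[0, 0])
    total = 0.0
    for i in range(n):
        # delete row i and column 0 to form the minor [A]_{-i,-1}
        minor = np.delete(np.delete(A, i, axis=0), 0, axis=1)
        total += A[i, 0] * (-1) ** i * cofactor_det(minor)
    return total

A = np.array([[2., 1., 0.],
              [1., 3., 4.],
              [0., 4., 5.]])
print(cofactor_det(A), np.linalg.det(A))   # the two values agree (up to floating point)
```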
Linear Independence
Vectors $x_1, \dots, x_k$ are linearly independent if the only solution to the equation $b_1 x_1 + \dots + b_k x_k = 0, \ b_j \in \mathbb R$, is $b_1 = b_2 = \dots = b_k = 0$.
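In practice, linear independence of $x_1, \dots, x_k$ can be checked by stacking the vectors as columns of a matrix and computing its rank (the vectors are independent exactly when the rank equals $k$); a short sketch with made-up vectors:

```python
# Check linear independence via the rank of the matrix whose columns are the vectors.
import numpy as np

x1 = np.array([1., 0., 2.])
x2 = np.array([0., 1., 1.])
x3 = np.array([1., 1., 3.])          # x3 = x1 + x2, so the set is dependent

X = np.column_stack([x1, x2, x3])
print(np.linalg.matrix_rank(X) == X.shape[1])   # False: not linearly independent
```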
Useful Identities
- $(A+B)’ =A’+B'$
- $(AB)C = A(BC)$
- $A(B+C) = AB+AC$
- $(AB)' = B'A'$
- $(A^{-1})’ = (A’)^{-1}$
- $(AB)^{-1} = B^{-1}A^{-1}$
- $\text{tr}(cA) = c\text{tr}(A)$
- $\text{tr}(A+B) = \text{tr}(A) + \text{tr}(B)$
- $\text{tr}(AB) =\text{tr}(BA)$
- $det(I)=1$
- $det(cA) = c^ndet(A)$ if $A$ is $n \times n$ and $c \in \mathbb R$
- $det(A) = det(A’)$
- $det(AB) = det(A)det(B)$
- $det(A^{-1}) = (det(A))^{-1}$
- $A^{-1}$ exists iff $det(A) \neq 0$
- $rank(A) = rank(A’) = rank(A’A) = rank(AA’)$
- $A^{-1}$ exists iff $rank(A)=n$ for $A$ $n \times n$
- $rank(AB) \leq \min \lbrace rank(A), rank(B) \rbrace$
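These identities are easy to spot-check numerically on random matrices (this is illustration, not proof); for example:

```python
# Numerical spot-checks of a few of the identities above on random matrices.
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))
c = 3.0

print(np.allclose((A @ B).T, B.T @ A.T))                           # (AB)' = B'A'
print(np.allclose(np.linalg.inv(A @ B),
                  np.linalg.inv(B) @ np.linalg.inv(A)))            # (AB)^{-1} = B^{-1}A^{-1}
print(np.isclose(np.trace(A @ B), np.trace(B @ A)))                # tr(AB) = tr(BA)
print(np.isclose(np.linalg.det(c * A),
                 c ** n * np.linalg.det(A)))                       # det(cA) = c^n det(A)
print(np.linalg.matrix_rank(A) == np.linalg.matrix_rank(A.T @ A))  # rank(A) = rank(A'A)
```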
Matrix Rank
The rank of a matrix, $rank(A)$, is equal to the maximal number of linearly independent rows (equivalently, columns) of $A$.
Let $A$ be an $n \times n$ matrix. The $n \times 1$ vector $x \neq 0$ is an eigenvector of $A$ with corresponding eigenvalue $\lambda$ if $Ax = \lambda x$.
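A quick numerical check of this definition, using `np.linalg.eig` on an arbitrary example matrix:

```python
# np.linalg.eig returns eigenvalues and (column) eigenvectors;
# verify A x = lambda x for the first eigenpair.
import numpy as np

A = np.array([[2., 1.],
              [1., 3.]])
eigvals, eigvecs = np.linalg.eig(A)
lam, x = eigvals[0], eigvecs[:, 0]
print(np.allclose(A @ x, lam * x))   # True
```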
Definitions
- A matrix $A$ is diagonal if $[A]_ {ij} \neq 0$ only if $i=j$.
- An $n \times n$ matrix $A$ is orthogonal if $A’A = I$
- A matrix $A$ is symmetric if $[A]_ {ij} = [A]_ {ji}$.
- An $n \times n$ matrix $A$ is idempotent if $A^2=A$.
- The matrix of zeros ($[A]_ {ij} =0$ for each $i,j$) is simply denoted 0.
- An $n \times n$ matrix $A$ is nilpotent if $A^k=0$ for some integer $k>0$.
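Small examples of these special matrices, with the defining properties verified numerically (the matrices are arbitrary illustrations):

```python
# Example matrices for the definitions above, with their defining properties checked.
import numpy as np

D = np.diag([1., 2., 3.])                                   # diagonal
Q, _ = np.linalg.qr(
    np.random.default_rng(1).standard_normal((3, 3)))       # orthogonal: Q'Q = I
S = np.array([[2., 1.], [1., 3.]])                          # symmetric
x = np.array([1., 2.])
P = np.outer(x, x) / np.dot(x, x)                           # projection onto span(x): idempotent
N = np.array([[0., 1.], [0., 0.]])                          # nilpotent: N^2 = 0

print(np.allclose(Q.T @ Q, np.eye(3)))       # orthogonality
print(np.allclose(S, S.T))                   # symmetry
print(np.allclose(P @ P, P))                 # idempotence
print(np.allclose(N @ N, np.zeros((2, 2))))  # nilpotence
```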
Spectral Decomposition
Spectral Theorem
Theorem: Let $A$ be an $n \times n$ symmetric matrix. Then $A$ can be factored as $A = C \Lambda C’$ where $C$ is orthogonal and $\Lambda$ is diagonal.
If we postmultiply $A$ by $C$, we get
- $AC = C \Lambda C’C$ and
- $AC = C \Lambda$.
This is a matrix equation which can be split into columns. The $i$th column of the equation reads $A c_i = \lambda_i c_i$ which corresponds to the definition of eigenvalues and eigenvectors. So if the decomposition exists, then $C$ is the eigenvector matrix and $\Lambda$ contains the eigenvalues.
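For symmetric matrices, `np.linalg.eigh` computes exactly this factorization: it returns the eigenvalues (the diagonal of $\Lambda$) and the orthogonal eigenvector matrix $C$. A short sketch on an arbitrary symmetric example:

```python
# Spectral decomposition of a symmetric matrix with np.linalg.eigh.
import numpy as np

A = np.array([[4., 1., 0.],
              [1., 3., 1.],
              [0., 1., 2.]])                    # symmetric example

eigvals, C = np.linalg.eigh(A)                  # A = C Lambda C'
Lam = np.diag(eigvals)

print(np.allclose(A, C @ Lam @ C.T))            # reconstruction A = C Lambda C'
print(np.allclose(C.T @ C, np.eye(3)))          # C is orthogonal
print(np.allclose(A @ C, C @ Lam))              # A c_i = lambda_i c_i, column by column
```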
Rank and Trace
Theorem: The rank of a symmetric matrix equals the number of non zero eigenvalues.
Proof: $rank(A) = rank(C\Lambda C') = rank(\Lambda) = | \lbrace i: \lambda_i \neq 0 \rbrace |$, where the second equality holds because $C$ is invertible. $$\tag*{$\blacksquare$}$$
Theorem: The nonzero eigenvalues of $AA’$ and $A’A$ are identical.
Theorem: The trace of a symmetric matrix equals the sum of its eigenvalues.
Proof: $tr(A) = tr(C \Lambda C’) = tr((C \Lambda)C’) = tr(C’C \Lambda) = tr(\Lambda) = \sum_ {i=1}^n \lambda_i.$ $$\tag*{$\blacksquare$}$$
Theorem: The determinant of a symmetric matrix equals the product of its eigenvalues.
Proof: $det(A) = det(C \Lambda C’) = det(C)det(\Lambda)det(C’) = det(C)det(C’)det(\Lambda) = det(CC’) det(\Lambda) = det(I)det(\Lambda) = det(\Lambda) = \prod_ {i=1}^n \lambda_i.$ $$\tag*{$\blacksquare$}$$
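The three theorems above can be illustrated numerically on a rank-deficient symmetric matrix (an illustration, not a proof):

```python
# Rank, trace, and determinant of a symmetric matrix via its eigenvalues.
import numpy as np

B = np.array([[1., 2.],
              [0., 1.],
              [1., 3.]])
A = B @ B.T                                  # 3 x 3 symmetric with rank 2

lam = np.linalg.eigvalsh(A)
print(np.linalg.matrix_rank(A), np.sum(np.abs(lam) > 1e-10))   # both equal 2
print(np.isclose(np.trace(A), lam.sum()))                      # trace = sum of eigenvalues
print(np.isclose(np.linalg.det(A), lam.prod()))                # det = product of eigenvalues
```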
Eigenvalues
Theorem: For any symmetric matrix $A$, the eigenvalues of $A^2$ are the squares of the eigenvalues of $A$, and the eigenvectors are the same.
Proof: $A = C \Lambda C’ \implies A^2 = C \Lambda C’ C \Lambda C’ = C \Lambda I \Lambda C’ = C \Lambda^2 C’$ $$\tag*{$\blacksquare$}$$
Theorem: For any symmetric matrix $A$, and any integer $k>0$, the eigenvalues of $A^k$ are the $k$th power of the eigenvalues of $A$, and the eigenvectors are the same.
Theorem: Any symmetric matrix $A$ with positive eigenvalues can be written as the product of a lower triangular matrix $L$ and its (upper triangular) transpose $L' = U$. That is, $A = LU = LL'$ (the Cholesky decomposition).
Note that $$ A = LL' = LU = U'U \quad \text{and} \quad A^{-1} = (L')^{-1}L^{-1} = U^{-1}(U')^{-1}, $$ where $L^{-1}$ is lower triangular and $U^{-1}$ is upper triangular. You can check this for the $2 \times 2$ case. Also note that the validity of the theorem can be extended to symmetric matrices with non-negative eigenvalues by a limiting argument; however, the proof is then no longer constructive.
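`np.linalg.cholesky` computes this factorization directly, returning the lower-triangular factor $L$; the sketch below verifies $A = LL'$ and the inverse identity on an arbitrary positive definite example.

```python
# Cholesky factorization of a symmetric matrix with positive eigenvalues.
import numpy as np

A = np.array([[4., 2., 0.],
              [2., 3., 1.],
              [0., 1., 2.]])              # symmetric with positive eigenvalues

L = np.linalg.cholesky(A)                 # lower triangular factor
U = L.T                                   # upper triangular, U = L'

print(np.allclose(A, L @ U))                              # A = L L' = L U
print(np.allclose(np.linalg.inv(A),
                  np.linalg.inv(U) @ np.linalg.inv(L)))   # A^{-1} = (L')^{-1} L^{-1}
```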
Quadratic Forms and Definite Matrices
Definition
A quadratic form in the $n \times n$ matrix $A$ and $n \times 1$ vector $x$ is defined by the scalar $x’Ax$.
- $A$ is negative definite (ND) if for each $x \neq 0$, $x’Ax < 0$
- $A$ is negative semidefinite (NSD) if for each $x \neq 0$, $x’Ax \leq 0$
- $A$ is positive definite (PD) if for each $x \neq 0$, $x’Ax > 0$
- $A$ is positive semidefinite (PSD) if for each $x \neq 0$, $x’Ax \geq 0$
Equivalence
Theorem: Let $A$ be a symmetric matrix. Then $A$ is PD(ND) $\iff$ all of its eigenvalues are positive (negative).
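This equivalence gives a practical test for definiteness of a symmetric matrix: compute the eigenvalues and inspect their signs. A small sketch (the helper `classify` and its tolerance are illustrative choices):

```python
# Classify a symmetric matrix by the signs of its eigenvalues.
import numpy as np

def classify(A: np.ndarray, tol: float = 1e-10) -> str:
    lam = np.linalg.eigvalsh(A)
    if np.all(lam > tol):
        return "PD"
    if np.all(lam >= -tol):
        return "PSD"
    if np.all(lam < -tol):
        return "ND"
    if np.all(lam <= tol):
        return "NSD"
    return "indefinite"

print(classify(np.array([[2., 1.], [1., 2.]])))     # PD
print(classify(np.array([[1., 1.], [1., 1.]])))     # PSD (eigenvalues 2 and 0)
print(classify(np.array([[1., 0.], [0., -1.]])))    # indefinite
```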
Some more results:
- If a symmetric matrix $A$ is PD (PSD, ND, NSD), then $\text{det}(A) >(\geq,<,\leq) 0$.
- If symmetric matrix $A$ is PD (ND) then $A^{-1}$ is symmetric PD (ND).
- The identity matrix is PD (since all eigenvalues are equal to 1).
- Every symmetric idempotent matrix is PSD (since the eigenvalues are only 0 or 1).
Theorem: If $A$ is $n\times k$ with $n>k$ and $rank(A)=k$, then $A’A$ is PD and $AA’$ is PSD.
The semidefinite partial order is defined by $A \geq B$ iff $A-B$ is PSD.
Theorem: Let $A$, $B$ be symmetric, PD, and conformable. Then $A-B$ is PD iff $B^{-1}-A^{-1}$ is PD.
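A numerical illustration of this theorem on a small PD pair (again, an example rather than a proof):

```python
# If A - B is PD for PD matrices A and B, then B^{-1} - A^{-1} is PD as well.
import numpy as np

A = np.array([[3., 1.], [1., 2.]])
B = np.array([[1., 0.], [0., 1.]])        # A - B = [[2, 1], [1, 1]] is PD

def is_pd(M):
    return bool(np.all(np.linalg.eigvalsh(M) > 0))

print(is_pd(A - B))                                        # True
print(is_pd(np.linalg.inv(B) - np.linalg.inv(A)))          # True as well
```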
Matrix Calculus
Conformable Matrices
We first define matrices blockwise when they are conformable. In particular, we assume that if $A_1, A_2, A_3, A_4$ are matrices with appropriate dimensions, then the matrix $$ A = \begin{bmatrix} A_1 & A_2 \newline A_3 & A_4 \end{bmatrix} $$ is defined in the obvious way.
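In NumPy, such blockwise construction is provided by `np.block`; the blocks only need conformable dimensions along each block row and block column:

```python
# Blockwise construction of a matrix from conformable blocks.
import numpy as np

A1 = np.ones((2, 2))
A2 = np.zeros((2, 3))
A3 = np.zeros((1, 2))
A4 = np.ones((1, 3))

A = np.block([[A1, A2],
              [A3, A4]])      # stacks the blocks into a single matrix
print(A.shape)                # (3, 5)
```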
Matrix Functions
Let
$F: \mathbb R^{m \times n} \rightarrow \mathbb R^{p \times q}$
be a matrix-valued function. More precisely, given a real $m \times n$
matrix $X$, $F(X)$ returns the $p \times q$ matrix
$$
\begin{bmatrix}
f_ {11}(X) & \dots & f_ {1q}(X) \newline \vdots & \ddots & \vdots \newline
f_ {p1}(X) & \dots & f_ {pq}(X)
\end{bmatrix}
$$
Matrix Derivatives
The derivative of $F$ with respect to the matrix $X$ is the
$mp \times nq$ matrix $$
\frac{\partial F(X)}{\partial X} = \begin{bmatrix}
\frac{\partial F(X)}{\partial x_ {11}} & \dots & \frac{\partial F(X)}{\partial x_ {1n}} \newline \vdots & \ddots & \vdots \newline
\frac{\partial F(X)}{\partial x_ {m1}} & \dots & \frac{\partial F(X)}{\partial x_ {mn}}
\end{bmatrix}
$$ where each $\frac{\partial F(X)}{\partial x_ {ij}}$ is a $p\times q$
matrix given by
$$
\frac{\partial F(X)}{\partial x_ {ij}} = \begin{bmatrix}
\frac{\partial f_ {11}(X)}{\partial x_ {ij}} & \dots & \frac{\partial f_ {1q}(X)}{\partial x_ {ij}} \newline
\vdots & \ddots & \vdots \newline
\frac{\partial f_ {p1}(X)}{\partial x_ {ij}} & \dots & \frac{\partial f_ {pq}(X)}{\partial x_ {ij}}
\end{bmatrix}
$$ The most important case is $F: \mathbb R^n \rightarrow \mathbb R$, since this is the case that arises in the derivation of the least squares estimator. The trickiest part is making sure that the dimensions are correct.
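The blockwise layout can be made concrete with a finite-difference sketch: the helper below (`numerical_matrix_derivative` is an illustrative name) assembles the $mp \times nq$ derivative of $F(X) = X'X$ block by block.

```python
# Assemble the (m*p) x (n*q) derivative of a matrix-valued function
# numerically, one p x q block dF/dx_ij at a time.
import numpy as np

def numerical_matrix_derivative(F, X, eps=1e-6):
    m, n = X.shape
    p, q = F(X).shape
    D = np.zeros((m * p, n * q))
    for i in range(m):
        for j in range(n):
            E = np.zeros_like(X)
            E[i, j] = eps
            block = (F(X + E) - F(X - E)) / (2 * eps)      # p x q block dF/dx_ij
            D[i * p:(i + 1) * p, j * q:(j + 1) * q] = block
    return D

X = np.arange(6.0).reshape(3, 2)
D = numerical_matrix_derivative(lambda X: X.T @ X, X)
print(D.shape)    # (6, 4): (m*p) x (n*q) with m=3, n=2, p=q=2
```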
Useful Results in Matrix Calculus
- $\frac{\partial b’x}{\partial x}= b$ for $dim(b) = dim(x)$
- $\frac{\partial B’x}{\partial x}= B$ for arbitrary, conformable $B$
- $\frac{\partial B’x}{\partial x’}= B’$ for arbitrary, conformable $B$
- $\frac{\partial x’Ax}{\partial x} = (A + A’)x$
- $\frac{\partial x’Ax}{\partial A} = xx'$
- $\frac{\partial det(A)}{\partial A} = det(A) (A^{-1})'$
- $\frac{\partial \ln det(A)}{\partial A} = (A^{-1})'$
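Two of these results, the gradient of the quadratic form and the derivative of the determinant (Jacobi's formula), can be spot-checked with finite differences on a random example:

```python
# Finite-difference checks of d(x'Ax)/dx = (A + A')x and d det(A)/dA = det(A)(A^{-1})'.
import numpy as np

rng = np.random.default_rng(2)
n = 3
A = rng.standard_normal((n, n))
x = rng.standard_normal(n)
eps = 1e-6

# d(x'Ax)/dx = (A + A')x
g = np.zeros(n)
for k in range(n):
    e = np.zeros(n); e[k] = eps
    g[k] = ((x + e) @ A @ (x + e) - (x - e) @ A @ (x - e)) / (2 * eps)
print(np.allclose(g, (A + A.T) @ x))

# d det(A)/dA = det(A) (A^{-1})'
G = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        E = np.zeros((n, n)); E[i, j] = eps
        G[i, j] = (np.linalg.det(A + E) - np.linalg.det(A - E)) / (2 * eps)
print(np.allclose(G, np.linalg.det(A) * np.linalg.inv(A).T))
```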