Matteo Courthoud
2021-10-29
A real n \times m matrix A is an array
A = \begin{bmatrix} a_ {11} & a_ {12} & ... & a_ {1m} \newline a_ {21} & a_ {22} & ... & a_ {2m} \newline \vdots & \vdots & \ddots & \vdots \newline a_ {n1} & a_ {n2} & ... & a_ {nm} \end{bmatrix}
We write [A]_ {ij} = a_ {ij} to indicate the (i,j)-element of A.
We will usually take the convention that a real vector x \in \mathbb R^n is identified with an n \times 1 matrix.
The n \times n identity matrix I_n is given by
[I_n]_ {ij} = \begin{cases} 1 & \text{if } i = j \newline 0 & \text{if } i \neq j \end{cases}
The trace of a square matrix A with dimension n \times n is tr(A) = \sum_ {i=1}^n a_ {ii}.
The determinant of a square n \times n matrix A is defined according to one of the following three (equivalent) definitions.
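A minimal numpy sketch of these objects (the matrix below is an arbitrary example):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 3.0]])

I2 = np.eye(2)                   # 2x2 identity matrix
print(np.allclose(A @ I2, A))    # True: multiplying by the identity leaves A unchanged
print(np.trace(A))               # 5.0 = a_11 + a_22
print(np.linalg.det(A))          # 6.0 (A is triangular, so the product of its diagonal)
```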
Vectors x_1, ..., x_k are linearly independent if the only solution to the equation b_1 x_1 + ... + b_k x_k = 0, b_j \in \mathbb R, is b_1 = b_2 = ... = b_k = 0.
The rank of a matrix, rank(A), is equal to the maximal number of linearly independent rows (equivalently, columns) of A.
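A quick check with numpy (the rows below are chosen so that the second is twice the first):

```python
import numpy as np

# The second row is twice the first, so only two rows are linearly independent.
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [0.0, 1.0, 1.0]])
print(np.linalg.matrix_rank(A))  # 2
```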
Let A be an n \times n matrix. The n \times 1 vector x \neq 0 is an eigenvector of A with corresponding eigenvalue \lambda if Ax = \lambda x.
Theorem: Let A be an n \times n symmetric matrix. Then A can be factored as A = C \Lambda C' where C is orthogonal and \Lambda is diagonal.
If we postmultiply A = C \Lambda C' by C, we get AC = C \Lambda C'C = C \Lambda, since C'C = I.
This is a matrix equation which can be split into columns. The ith column of the equation reads A c_i = \lambda_i c_i which corresponds to the definition of eigenvalues and eigenvectors. So if the decomposition exists, then C is the eigenvector matrix and \Lambda contains the eigenvalues.
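A minimal numpy sketch of the decomposition; numpy.linalg.eigh returns the eigenvalues and an orthonormal eigenvector matrix of a symmetric matrix (the matrix below is an arbitrary symmetric example):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])                        # symmetric

lam, C = np.linalg.eigh(A)                        # eigenvalues and orthonormal eigenvectors
Lambda = np.diag(lam)

print(np.allclose(A, C @ Lambda @ C.T))           # A = C Λ C'
print(np.allclose(C.T @ C, np.eye(2)))            # C is orthogonal
print(np.allclose(A @ C[:, 0], lam[0] * C[:, 0])) # A c_i = λ_i c_i
```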
Theorem: The rank of a symmetric matrix equals the number of nonzero eigenvalues.
Proof: rank(A) = rank(C\Lambda C') = rank(\Lambda) = | \lbrace i: \lambda_i \neq 0 \rbrace |. \tag*{$\blacksquare$}
Theorem: The nonzero eigenvalues of AA' and A'A are identical.
Theorem: The trace of a symmetric matrix equals the sum of its eigenvalues.
Proof: tr(A) = tr(C \Lambda C') = tr((C \Lambda)C') = tr(C'C \Lambda) = tr(\Lambda) = \sum_ {i=1}^n \lambda_i. \tag*{$\blacksquare$}
Theorem: The determinant of a symmetric matrix equals the product of its eigenvalues.
Proof: det(A) = det(C \Lambda C') = det(C)det(\Lambda)det(C') = det(C)det(C')det(\Lambda) = det(CC') det(\Lambda) = det(I)det(\Lambda) = det(\Lambda) = \prod_ {i=1}^n \lambda_i. \tag*{$\blacksquare$}
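These three results are easy to verify numerically (a small sketch with an arbitrary symmetric matrix of rank 2):

```python
import numpy as np

A = np.array([[4.0, 2.0, 0.0],
              [2.0, 1.0, 0.0],
              [0.0, 0.0, 3.0]])                   # symmetric, rank 2

lam = np.linalg.eigvalsh(A)                       # eigenvalues: 0, 3, 5
print(np.isclose(np.trace(A), lam.sum()))         # trace = sum of eigenvalues
print(np.isclose(np.linalg.det(A), lam.prod()))   # determinant = product of eigenvalues
print(np.linalg.matrix_rank(A) == np.sum(~np.isclose(lam, 0)))  # rank = number of nonzero eigenvalues
```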
Theorem: For any symmetric matrix A, the eigenvalues of A^2 are the squares of the eigenvalues of A, and the eigenvectors are the same.
Proof: A = C \Lambda C' \implies A^2 = C \Lambda C' C \Lambda C' = C \Lambda I \Lambda C' = C \Lambda^2 C' \tag*{$\blacksquare$}
Theorem: For any symmetric matrix A, and any integer k>0, the eigenvalues of A^k are the kth power of the eigenvalues of A, and the eigenvectors are the same.
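A quick numerical check for k = 3, reusing the symmetric matrix from the sketch above:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

lam, C = np.linalg.eigh(A)
A3 = np.linalg.matrix_power(A, 3)

print(np.allclose(np.linalg.eigvalsh(A3), lam**3))   # eigenvalues of A^3 are λ_i^3
print(np.allclose(A3, C @ np.diag(lam**3) @ C.T))    # and the eigenvectors are unchanged
```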
Theorem: Any square symmetric matrix A with positive eigenvalues can be written as the product of a lower triangular matrix L and its (upper triangular) transpose L' = U. That is, A = LU = LL' (the Cholesky decomposition).
Note that A = LL' = LU = U'U, and A^{-1} = (L')^{-1}L^{-1} = U^{-1}(U')^{-1}, where L^{-1} is lower triangular and U^{-1} is upper triangular. You can check this for the 2 \times 2 case. Also note that the validity of the theorem can be extended to symmetric matrices with non-negative eigenvalues by a limiting argument. However, the proof is then no longer constructive.
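A minimal numpy sketch of the factorization (numpy.linalg.cholesky returns the lower-triangular factor; the matrix below is an arbitrary PD example):

```python
import numpy as np

A = np.array([[4.0, 2.0],
              [2.0, 3.0]])                        # symmetric with positive eigenvalues

L = np.linalg.cholesky(A)                         # lower triangular L with A = L L'
U = L.T                                           # upper triangular transpose

print(np.allclose(A, L @ U))                      # A = L L' = L U
print(np.allclose(np.linalg.inv(A),
                  np.linalg.inv(U) @ np.linalg.inv(L)))  # A^{-1} = (L')^{-1} L^{-1}
```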
A quadratic form in the n \times n matrix A and the n \times 1 vector x is defined by the scalar x'Ax. We say that a symmetric matrix A is positive definite (PD) if x'Ax > 0 for all x \neq 0, positive semidefinite (PSD) if x'Ax \geq 0 for all x, and negative definite (ND) if x'Ax < 0 for all x \neq 0.
Theorem: Let A be a symmetric matrix. Then A is PD (ND) \iff all of its eigenvalues are positive (negative).
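A small numerical illustration of both the definition and the eigenvalue criterion (the matrix below is an arbitrary example with eigenvalues 1 and 3):

```python
import numpy as np

A = np.array([[2.0, -1.0],
              [-1.0, 2.0]])                       # symmetric, eigenvalues 1 and 3

print(np.all(np.linalg.eigvalsh(A) > 0))          # True: all eigenvalues positive, so A is PD

# x'Ax > 0 for (randomly drawn) nonzero vectors x
rng = np.random.default_rng(0)
x = rng.normal(size=(2, 1000))
print(np.all(np.sum(x * (A @ x), axis=0) > 0))    # True
```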
Some more results:
Theorem: If A is n\times k with n>k and rank(A)=k, then A'A is PD and AA' is PSD.
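A quick check with a random tall matrix (full column rank with probability one); the last line also illustrates the earlier result that the nonzero eigenvalues of A'A and AA' coincide:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(5, 3))                       # n = 5 > k = 3

print(np.all(np.linalg.eigvalsh(A.T @ A) > 0))        # A'A is PD (3 x 3)
print(np.all(np.linalg.eigvalsh(A @ A.T) > -1e-10))   # AA' is PSD (5 x 5, two zero eigenvalues)

# the nonzero eigenvalues of A'A and AA' are identical
print(np.allclose(np.linalg.eigvalsh(A.T @ A),
                  np.linalg.eigvalsh(A @ A.T)[-3:]))
```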
The semidefinite partial order is defined by A \geq B iff A-B is PSD.
Theorem: Let A, B be symmetric, square, PD, and conformable. Then A-B is PD iff B^{-1}-A^{-1} is PD.
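A small numerical check on a pair of diagonal matrices chosen so that A - B is PD:

```python
import numpy as np

A = np.diag([3.0, 3.0])
B = np.diag([1.0, 2.0])

def is_pd(M):
    """All eigenvalues of the symmetric matrix M are strictly positive."""
    return bool(np.all(np.linalg.eigvalsh(M) > 0))

print(is_pd(A - B))                                      # True
print(is_pd(np.linalg.inv(B) - np.linalg.inv(A)))        # True: B^{-1} - A^{-1} is PD as well
```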
We first define matrices blockwise when they are conformable. In particular, we assume that if A_1, A_2, A_3, A_4 are matrices with appropriate dimensions then the matrix A = \begin{bmatrix} A_1 & A_2 \newline A_3 & A_4 \end{bmatrix} is defined in the obvious way.
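For instance, with numpy (np.block stacks conformable blocks; the block sizes below are arbitrary):

```python
import numpy as np

A1 = np.ones((2, 2)); A2 = np.zeros((2, 3))
A3 = np.zeros((1, 2)); A4 = np.ones((1, 3))

# Blocks in the same row must share the row dimension,
# blocks in the same column must share the column dimension.
A = np.block([[A1, A2],
              [A3, A4]])
print(A.shape)                                   # (3, 5)
```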
Let F: \mathbb R^{m \times n} \rightarrow \mathbb R^{p \times q} be a matrix-valued function. More precisely, given a real m \times n matrix X, F(X) returns the p \times q matrix
\begin{bmatrix}
f_ {11}(X) & ... & f_ {1q}(X) \newline \vdots & \ddots & \vdots \newline
f_ {p1}(X)& ... & f_ {pq}(X)
\end{bmatrix}
The derivative of F with respect to the matrix X is the mp \times nq matrix
\frac{\partial F(X)}{\partial X} = \begin{bmatrix}
\frac{\partial F(X)}{\partial x_ {11}} & ... & \frac{\partial F(X)}{\partial x_ {1n}} \newline \vdots & \ddots & \vdots \newline
\frac{\partial F(X)}{\partial x_ {m1}} & ... & \frac{\partial F(X)}{\partial x_ {mn}}
\end{bmatrix}
where each \frac{\partial F(X)}{\partial x_ {ij}} is a p\times q matrix given by
\frac{\partial F(X)}{\partial x_ {ij}} = \begin{bmatrix}
\frac{\partial f_ {11}(X)}{\partial x_ {ij}} & ... & \frac{\partial f_ {1q}(X)}{\partial x_ {ij}} \newline
\vdots & \ddots & \vdots \newline
\frac{\partial f_ {p1}(X)}{\partial x_ {ij}} & ... & \frac{\partial f_ {pq}(X)}{\partial x_ {ij}}
\end{bmatrix}
The most important case is when F: \mathbb R^n \rightarrow \mathbb R, since this is what is needed to derive the least squares estimator. The trickiest part is making sure that the dimensions are correct.
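As an illustration of the scalar-valued case, here is a sketch (with simulated X, y and an arbitrary b) that checks the gradient of the least squares objective, \frac{\partial}{\partial b} (y - Xb)'(y - Xb) = -2X'(y - Xb), against finite differences, and verifies that it vanishes at the least squares estimator:

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(50, 3))
y = rng.normal(size=50)

def ssr(b):
    e = y - X @ b
    return e @ e                                 # scalar objective (y - Xb)'(y - Xb)

b = rng.normal(size=3)
analytic = -2 * X.T @ (y - X @ b)                # gradient vector -2 X'(y - Xb)

# central finite differences, one coordinate at a time
eps = 1e-6
numeric = np.array([(ssr(b + eps * np.eye(3)[:, j]) - ssr(b - eps * np.eye(3)[:, j])) / (2 * eps)
                    for j in range(3)])
print(np.allclose(analytic, numeric))            # True

# the gradient is zero at the least squares estimator b_hat = (X'X)^{-1} X'y
b_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(np.allclose(X.T @ (y - X @ b_hat), 0))     # True (first-order condition)
```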