Math and physics notes

matrix identities

\((ABC\dots)^{-1} = \dots C^{-1} B^{-1} A^{-1}\), provided each factor is invertible.
\((ABC\dots)^T = \dots C^T B^T A^T\).
\(\vert c A \vert = c^n \vert A\vert\) where \(A\) is \(n \times n\).
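
A quick numerical check of the three identities above, as a minimal sketch. The matrices and the scalar `c` are arbitrary illustrative values (chosen diagonally dominant so they are invertible), not taken from any source.

```python
import numpy as np

# Illustrative, strictly diagonally dominant (hence invertible) matrices
A = np.array([[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]])
B = np.array([[2.0, 0.0, 1.0], [1.0, 3.0, 0.0], [0.0, 1.0, 4.0]])
C = np.array([[3.0, 1.0, 1.0], [0.0, 2.0, 1.0], [1.0, 0.0, 5.0]])
c = 2.5
n = A.shape[0]

inv = np.linalg.inv
# Inverse and transpose of a product reverse the order of the factors
inverse_ok = np.allclose(inv(A @ B @ C), inv(C) @ inv(B) @ inv(A))
transpose_ok = np.allclose((A @ B @ C).T, C.T @ B.T @ A.T)
# Scaling an n x n matrix by c scales its determinant by c**n
det_ok = np.isclose(np.linalg.det(c * A), c**n * np.linalg.det(A))
print(inverse_ok, transpose_ok, det_ok)
```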

linear regression

The Moore-Penrose pseudoinverse of a real matrix \(A\) with linearly independent columns is

\[A^+ = (A^T A)^{-1} A^T\]
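
As a sanity check, the closed form can be compared against NumPy's general-purpose `np.linalg.pinv`. This is a sketch with an arbitrary tall matrix whose columns are linearly independent; the values are illustrative only.

```python
import numpy as np

# A tall matrix with linearly independent columns (illustrative values)
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])

# Closed form (A^T A)^{-1} A^T
A_plus = np.linalg.inv(A.T @ A) @ A.T

# Agrees with the SVD-based pseudoinverse, and acts as a left inverse of A
matches_pinv = np.allclose(A_plus, np.linalg.pinv(A))
is_left_inverse = np.allclose(A_plus @ A, np.eye(A.shape[1]))
print(matches_pinv, is_left_inverse)
```

If the columns of \(A\) are linearly dependent, \(A^T A\) is singular and this closed form does not apply, whereas `np.linalg.pinv` still returns a pseudoinverse.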

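The solution of the normal equations for generalized least squares, with \(\Sigma\) the covariance of the noise in \(y\), is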

\[\widehat\beta = (X^T \Sigma^{-1} X)^{-1} X^T \Sigma^{-1} y\]

The hat matrix is given by

\[X (X^T \Sigma^{-1} X)^{-1} X^T \Sigma^{-1}\]
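
The estimator and hat matrix can be sketched numerically as follows. The data, the design matrix, and the diagonal noise covariance are made-up illustrative values, and using `np.linalg.solve` in place of an explicit inverse is a numerical-stability choice rather than something the formulas require.

```python
import numpy as np

# Hypothetical straight-line fit with per-point noise variances
t = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
X = np.column_stack([np.ones_like(t), t])       # design matrix: intercept and slope
y = np.array([1.1, 2.9, 5.2, 7.1, 8.8])
Sigma = np.diag([0.1, 0.2, 0.1, 0.3, 0.2])      # assumed noise covariance

XtSi = X.T @ np.linalg.inv(Sigma)               # X^T Sigma^{-1}
# Solve (X^T Sigma^{-1} X) beta = X^T Sigma^{-1} y
beta_hat = np.linalg.solve(XtSi @ X, XtSi @ y)
# Hat matrix H = X (X^T Sigma^{-1} X)^{-1} X^T Sigma^{-1}
H = X @ np.linalg.solve(XtSi @ X, XtSi)

y_fit = H @ y                                   # fitted values, equal to X @ beta_hat
print(beta_hat)
```

The hat matrix is a projection, so \(H H = H\), and its trace equals the number of fitted parameters (two here).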

matrix decompositions

Singular value decomposition: \(A = U D V^T\) where

The columns of \(U\) are the eigenvectors of \(A A^T\)
\(D\) is a diagonal matrix containing the singular values, which are the square roots of the eigenvalues of \(A A^T\)
\(V\) is the matrix whose columns are the eigenvectors of \(A^T A\)
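
A small check of these relations using `np.linalg.svd`, as a sketch on an arbitrary \(3 \times 2\) matrix. For a non-square \(A\), the matrix \(A A^T\) has extra zero eigenvalues, so only the largest ones are compared with the squared singular values.

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 3.0],
              [0.0, 2.0]])                      # illustrative values

# Reduced SVD: U is 3x2, s holds the singular values, Vt is V^T
U, s, Vt = np.linalg.svd(A, full_matrices=False)
reconstructed = U @ np.diag(s) @ Vt

# Eigenvalues of the symmetric matrix A A^T, sorted in descending order
eig_AAt = np.sort(np.linalg.eigvalsh(A @ A.T))[::-1]

print(np.allclose(reconstructed, A))
print(np.allclose(s**2, eig_AAt[:len(s)]))      # singular values squared = eigenvalues
```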

LU decomposition: \(A = LU\) where

\(L\) is lower-triangular
\(U\) is upper-triangular
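
A minimal sketch of one way to compute such a factorization, Doolittle elimination without pivoting. The function name `lu_no_pivot` is hypothetical, and the method assumes no zero pivot is encountered; library routines typically add row pivoting and factor \(PA = LU\) instead.

```python
import numpy as np

def lu_no_pivot(A):
    """Doolittle LU factorization without pivoting: A = L @ U, unit diagonal on L."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    L = np.eye(n)
    U = A.copy()
    for k in range(n - 1):
        if U[k, k] == 0.0:
            raise ZeroDivisionError("zero pivot: this matrix needs row pivoting")
        for i in range(k + 1, n):
            L[i, k] = U[i, k] / U[k, k]         # multiplier that eliminates U[i, k]
            U[i, k:] -= L[i, k] * U[k, k:]
    return L, U

A = np.array([[4.0, 3.0, 2.0],
              [8.0, 7.0, 9.0],
              [4.0, 6.0, 5.0]])                 # illustrative values
L, U = lu_no_pivot(A)
print(np.allclose(L @ U, A))
```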

The Normal distribution

\[\mathcal{N}(\mathbf{x} | \mathbf{\mu}, \Sigma) = \exp\left( - \frac{1}{2} \mathbf{r}^T \Sigma^{-1} \mathbf{r} \right) \, |2 \pi \Sigma|^{-\frac{1}{2}}\]

where \(\mathbf{r} = \mathbf{x} - \mathbf{\mu}\). Note that the \(2 \pi\) is inside the determinant.
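
A sketch of evaluating this density in log form, together with a check that \(|2 \pi \Sigma| = (2\pi)^n |\Sigma|\) for an \(n \times n\) covariance. The function name `normal_logpdf` and all numerical values are illustrative assumptions.

```python
import numpy as np

def normal_logpdf(x, mu, Sigma):
    """Log of N(x | mu, Sigma), following the formula above."""
    Sigma = np.asarray(Sigma, dtype=float)
    r = np.asarray(x, dtype=float) - np.asarray(mu, dtype=float)
    # slogdet returns (sign, log|2 pi Sigma|); the sign is +1 for a valid covariance
    _, logdet = np.linalg.slogdet(2.0 * np.pi * Sigma)
    return -0.5 * (r @ np.linalg.solve(Sigma, r)) - 0.5 * logdet

Sigma = np.array([[2.0, 0.3],
                  [0.3, 1.0]])
mu = np.array([1.0, -1.0])
x = np.array([0.5, 0.0])
n = len(mu)

# The 2 pi inside the determinant contributes a factor (2 pi)^n
det_inside = np.linalg.det(2.0 * np.pi * Sigma)
det_outside = (2.0 * np.pi) ** n * np.linalg.det(Sigma)
log_density = normal_logpdf(x, mu, Sigma)
print(np.isclose(det_inside, det_outside))
```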

Linear operations on Gaussian random variables: If \(x \sim \mathcal{N}(\mu_x, \Sigma_x)\) and \(y \sim \mathcal{N}(\mu_y, \Sigma_y)\), then

\(x + y \sim \mathcal{N}(\mu_x + \mu_y, \Sigma_x + \Sigma_y)\), provided \(x\) and \(y\) are independent.
\(Ax \sim \mathcal{N}(A \mu_x, A \Sigma_x A^T)\).
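
These two rules can be checked empirically by sampling, as a rough Monte Carlo sketch. The means, covariances, matrix `A`, seed, and sample size are arbitrary illustrative choices, and the agreement is only up to sampling noise.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 200_000                                     # number of samples

mu_x = np.array([1.0, -1.0])
Sigma_x = np.array([[1.0, 0.5], [0.5, 2.0]])
mu_y = np.array([0.0, 2.0])
Sigma_y = np.array([[0.5, 0.0], [0.0, 0.5]])
A = np.array([[1.0, 1.0], [0.0, 1.0], [1.0, -1.0]])

x = rng.multivariate_normal(mu_x, Sigma_x, size=N)
y = rng.multivariate_normal(mu_y, Sigma_y, size=N)   # drawn independently of x

s = x + y
z = x @ A.T                                     # applies A to every sample (rows are samples)

sum_mean, sum_cov = s.mean(axis=0), np.cov(s, rowvar=False)
lin_mean, lin_cov = z.mean(axis=0), np.cov(z, rowvar=False)

print(np.round(sum_cov, 2))                     # compare with Sigma_x + Sigma_y
print(np.round(lin_cov, 2))                     # compare with A @ Sigma_x @ A.T
```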

Product of Gaussian PDFs: \(\mathcal{N}(\mathbf{x} \vert \alpha, \Sigma) \mathcal{N}(\mathbf{x} \vert \beta, \Omega) = \eta \mathcal{N}(\mathbf{x} | \mathbf{m}, C)\), where

\(\mathbf{m} = (\Sigma^{-1} + \Omega^{-1})^{-1} (\Sigma^{-1} \alpha + \Omega^{-1} \beta)\).
\(C = (\Sigma^{-1} + \Omega^{-1})^{-1}\).
\(\eta = \mathcal{N}(\alpha-\beta \vert 0, \Sigma + \Omega)\).
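
The product identity can be verified pointwise, as in the following sketch. The helper `normal_pdf`, the parameter values, and the evaluation point `x` are illustrative assumptions.

```python
import numpy as np

def normal_pdf(x, mu, Sigma):
    """N(x | mu, Sigma) with the 2 pi inside the determinant."""
    r = x - mu
    return np.exp(-0.5 * r @ np.linalg.solve(Sigma, r)) / np.sqrt(np.linalg.det(2.0 * np.pi * Sigma))

alpha = np.array([0.0, 1.0])
Sigma = np.array([[1.0, 0.3], [0.3, 2.0]])
beta = np.array([1.5, -0.5])
Omega = np.array([[0.5, -0.1], [-0.1, 1.5]])

Si, Oi = np.linalg.inv(Sigma), np.linalg.inv(Omega)
C = np.linalg.inv(Si + Oi)                      # combined covariance
m = C @ (Si @ alpha + Oi @ beta)                # precision-weighted mean
eta = normal_pdf(alpha - beta, np.zeros(2), Sigma + Omega)

x = np.array([0.3, 0.7])                        # any test point works
lhs = normal_pdf(x, alpha, Sigma) * normal_pdf(x, beta, Omega)
rhs = eta * normal_pdf(x, m, C)
print(np.isclose(lhs, rhs))
```

Note that \(\eta\) does not depend on \(\mathbf{x}\), so the product of two Gaussian PDFs in the same variable is an unnormalized Gaussian in that variable.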

Refactoring the product of hierarchical Gaussian PDFs:
\(\mathcal{N}(\mathbf{x} \vert M \theta, C) \mathcal{N}(\theta \vert \mu, \Lambda) = \mathcal{N}(\theta \vert \mathbf{a}, A) \mathcal{N}(\mathbf{x} \vert \mathbf{b}, B)\), where

\(A^{-1} = \Lambda^{-1} + M^T C^{-1} M\).
\(\mathbf{a} = A(\Lambda^{-1} \mu + M^T C^{-1} \mathbf{x})\).
\(B = C + M \Lambda M^T\).
\(\mathbf{b} = M \mu\).
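
A pointwise numerical check of this refactoring, as a sketch with a non-square \(M\) so that the shapes of \(A\) and \(B\) are easy to tell apart. The helper `normal_pdf` and all numerical values are illustrative assumptions.

```python
import numpy as np

def normal_pdf(x, mu, Sigma):
    """N(x | mu, Sigma) with the 2 pi inside the determinant."""
    r = x - mu
    return np.exp(-0.5 * r @ np.linalg.solve(Sigma, r)) / np.sqrt(np.linalg.det(2.0 * np.pi * Sigma))

inv = np.linalg.inv
M = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]])      # maps 2-d theta to 3-d x
C = np.array([[0.5, 0.1, 0.0], [0.1, 0.4, 0.0], [0.0, 0.0, 0.3]])
mu = np.array([0.5, -1.0])
Lam = np.array([[1.0, 0.2], [0.2, 2.0]])

x = np.array([0.2, -0.3, 1.0])                  # arbitrary test values
theta = np.array([0.1, 0.4])

A = inv(inv(Lam) + M.T @ inv(C) @ M)            # covariance of theta given x
a = A @ (inv(Lam) @ mu + M.T @ inv(C) @ x)      # mean of theta given x
B = C + M @ Lam @ M.T                           # marginal covariance of x
b = M @ mu                                      # marginal mean of x

lhs = normal_pdf(x, M @ theta, C) * normal_pdf(theta, mu, Lam)
rhs = normal_pdf(theta, a, A) * normal_pdf(x, b, B)
print(np.isclose(lhs, rhs))
```

The left side reads as likelihood times prior and the right side as posterior times marginal likelihood, so \(A\) has the dimension of \(\theta\) and \(B\) the dimension of \(\mathbf{x}\).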

Sources: Wikipedia, The Matrix Cookbook, Hogg+ 2020 (in prep)