Adam Wheeler


Math and physics notes

matrix identities

linear regression

The Moore-Penrose pseudoinverse of a real matrix \(A\) with linearly independent columns is

\( A^+ = (A^T A)^{-1} A^T \)
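A quick numerical sanity check (variable names are mine): for a tall matrix with independent columns, the normal-equations formula agrees with NumPy's SVD-based `pinv`.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(5, 3))  # tall matrix; columns independent with probability 1

# pseudoinverse via the normal-equations formula A+ = (A^T A)^{-1} A^T
A_plus = np.linalg.inv(A.T @ A) @ A.T

# agrees with NumPy's SVD-based pseudoinverse
assert np.allclose(A_plus, np.linalg.pinv(A))
```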

Solving the normal equations (with data covariance \(\Sigma\)) gives the generalized least-squares estimate

\(\widehat\beta = (X^T \Sigma^{-1} X)^{-1} X^T \Sigma^{-1} y\)

And the covariance matrix of \(\widehat\beta\), whose diagonal gives the squared standard errors, is

\( \Sigma_{\widehat\beta} = (X^T \Sigma^{-1} X)^{-1} \)
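The two formulas above can be exercised on synthetic data (the setup below is mine, not from the notes): fit a line with heteroscedastic errors and read off \(\widehat\beta\) and its covariance.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # design matrix, rows [1, x_i]
beta_true = np.array([2.0, -1.0])
Sigma = np.diag(rng.uniform(0.5, 2.0, size=n))         # data covariance Σ
y = X @ beta_true + rng.multivariate_normal(np.zeros(n), Sigma)

Sinv = np.linalg.inv(Sigma)
cov_beta = np.linalg.inv(X.T @ Sinv @ X)  # Σ_β̂ = (X^T Σ^{-1} X)^{-1}
beta_hat = cov_beta @ X.T @ Sinv @ y      # β̂  = (X^T Σ^{-1} X)^{-1} X^T Σ^{-1} y
```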

For “standard” linear regression (uncorrelated errors, \(\Sigma = \sigma^2 I\)), the design matrix \(X\) has rows of the form \(\left[1~x_i\right]\), and the standard errors of the intercept and slope are

\( \sigma_{\beta_1} = \sigma \sqrt{ \frac{1}{n} + \frac{\bar{x}^2}{\sum_{i=1}^n (x_i - \bar{x})^2} }, \quad \sigma_{\beta_2} = \sigma (\sum_{i=1}^n (x_i - \bar{x})^2)^{-\frac{1}{2}} \)
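These closed-form standard errors should match the square roots of the diagonal of \(\sigma^2 (X^T X)^{-1}\); a check (my notation):

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=30)
sigma = 0.7                              # common per-point error bar
X = np.column_stack([np.ones_like(x), x])

cov = sigma**2 * np.linalg.inv(X.T @ X)  # σ² (X^T X)^{-1}

n, xbar = len(x), x.mean()
Sxx = np.sum((x - xbar) ** 2)
se_intercept = sigma * np.sqrt(1 / n + xbar**2 / Sxx)
se_slope = sigma / np.sqrt(Sxx)

# closed forms match the covariance-matrix diagonal
assert np.allclose(np.sqrt(np.diag(cov)), [se_intercept, se_slope])
```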

The hat matrix is given by

\( X (X^T \Sigma^{-1} X)^{-1} X^T \Sigma^{-1} \)
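The hat matrix maps \(y\) to the fitted values \(\widehat{y} = X\widehat\beta\); since it is a projection it is idempotent and fixes the column space of \(X\). A check (my setup):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 20
X = np.column_stack([np.ones(n), rng.normal(size=n)])
Sigma = np.diag(rng.uniform(0.5, 2.0, size=n))
Sinv = np.linalg.inv(Sigma)

H = X @ np.linalg.inv(X.T @ Sinv @ X) @ X.T @ Sinv  # hat matrix

assert np.allclose(H @ H, H)  # idempotent: projecting twice changes nothing
assert np.allclose(H @ X, X)  # leaves the column space of X fixed
```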

matrix decompositions

Singular value decomposition: \(A = U D V^T\), where \(U\) and \(V\) have orthonormal columns and \(D\) is diagonal with the (non-negative) singular values.

LU decomposition: \(A = LU\), where \(L\) is lower triangular and \(U\) is upper triangular (in practice computed with row pivoting, \(PA = LU\)).
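Both decompositions are one call in NumPy/SciPy (note SciPy's `lu` returns factors with \(A = PLU\)):

```python
import numpy as np
from scipy.linalg import lu

rng = np.random.default_rng(4)
A = rng.normal(size=(4, 4))

# SVD: orthogonal U, V; non-negative singular values d (returned as a vector)
U, d, Vt = np.linalg.svd(A)
assert np.allclose(U @ np.diag(d) @ Vt, A)

# pivoted LU: scipy returns (P, L, U) with A = P @ L @ U
P, L, Uu = lu(A)
assert np.allclose(P @ L @ Uu, A)
```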

The Normal distribution

$$\mathcal{N}(\mathbf{x} \,|\, \mathbf{\mu}, \Sigma) = |2 \pi \Sigma|^{-\frac{1}{2}} \exp\left( - \frac{1}{2} \mathbf{r}^T \Sigma^{-1} \mathbf{r} \right)$$

where \(\mathbf{r} = \mathbf{x} - \mathbf{\mu}\). Note that the \(2 \pi\) is inside the determinant.
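Writing the density by hand and comparing against SciPy is a good way to catch normalization mistakes (point and parameters below are arbitrary):

```python
import numpy as np
from scipy.stats import multivariate_normal

mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.3], [0.3, 1.0]])
x = np.array([0.5, -1.0])

r = x - mu
# |2πΣ|^{-1/2} exp(-½ r^T Σ^{-1} r); note the 2π inside the determinant
pdf = np.exp(-0.5 * r @ np.linalg.solve(Sigma, r)) \
      / np.sqrt(np.linalg.det(2 * np.pi * Sigma))

assert np.isclose(pdf, multivariate_normal(mu, Sigma).pdf(x))
```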

linear operations on Gaussian random variables: If \( x \sim \mathcal{N}(\mu_x, \Sigma_x)\) and \(y \sim \mathcal{N}(\mu_y, \Sigma_y)\) are independent, then \(x + y \sim \mathcal{N}(\mu_x + \mu_y, \Sigma_x + \Sigma_y)\), and \(A x + b \sim \mathcal{N}(A \mu_x + b, A \Sigma_x A^T)\).

Product of Gaussian PDFs: \(\mathcal{N}(\mathbf{x} \vert \alpha, \Sigma) \, \mathcal{N}(\mathbf{x} \vert \beta, \Omega) = \eta \, \mathcal{N}(\mathbf{x} | \mathbf{m}, C)\), where \(C = (\Sigma^{-1} + \Omega^{-1})^{-1}\), \(\mathbf{m} = C (\Sigma^{-1} \alpha + \Omega^{-1} \beta)\), and \(\eta = \mathcal{N}(\alpha \vert \beta, \Sigma + \Omega)\).
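The product identity is easy to verify numerically at an arbitrary point (parameters below are mine):

```python
import numpy as np
from scipy.stats import multivariate_normal as mvn

alpha, Sigma = np.array([0.0, 1.0]), np.eye(2)
beta, Omega = np.array([1.0, -1.0]), 2 * np.eye(2)

# C = (Σ^{-1} + Ω^{-1})^{-1},  m = C (Σ^{-1} α + Ω^{-1} β)
C = np.linalg.inv(np.linalg.inv(Sigma) + np.linalg.inv(Omega))
m = C @ (np.linalg.solve(Sigma, alpha) + np.linalg.solve(Omega, beta))

x = np.array([0.3, 0.4])
lhs = mvn(alpha, Sigma).pdf(x) * mvn(beta, Omega).pdf(x)
eta = mvn(beta, Sigma + Omega).pdf(alpha)  # η = N(α | β, Σ + Ω)
rhs = eta * mvn(m, C).pdf(x)
assert np.isclose(lhs, rhs)
```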

Refactoring the product of hierarchical Gaussian PDFs: \(\mathcal{N}(\mathbf{x} \vert M \theta, C) \, \mathcal{N}(\theta \vert \mu, \Lambda) = \mathcal{N}(\theta \vert \mathbf{a}, A) \, \mathcal{N}(\mathbf{x} \vert \mathbf{b}, B)\), where \(A = (\Lambda^{-1} + M^T C^{-1} M)^{-1}\), \(\mathbf{a} = A (\Lambda^{-1} \mu + M^T C^{-1} \mathbf{x})\), \(B = C + M \Lambda M^T\), and \(\mathbf{b} = M \mu\).
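This is the likelihood-times-prior = posterior-times-evidence identity for a linear-Gaussian model, and it can be checked pointwise (shapes and values below are arbitrary choices of mine):

```python
import numpy as np
from scipy.stats import multivariate_normal as mvn

rng = np.random.default_rng(5)
M = rng.normal(size=(3, 2))                 # linear map θ → data space
C = np.diag([0.5, 1.0, 1.5])                # likelihood covariance
mu, Lam = np.array([0.2, -0.3]), np.eye(2)  # prior mean and covariance on θ

x = rng.normal(size=3)      # arbitrary data point
theta = rng.normal(size=2)  # arbitrary parameter point

# posterior on θ and marginal (evidence) on x
A = np.linalg.inv(np.linalg.inv(Lam) + M.T @ np.linalg.solve(C, M))
a = A @ (np.linalg.solve(Lam, mu) + M.T @ np.linalg.solve(C, x))
B = C + M @ Lam @ M.T
b = M @ mu

lhs = mvn(M @ theta, C).pdf(x) * mvn(mu, Lam).pdf(theta)
rhs = mvn(a, A).pdf(theta) * mvn(b, B).pdf(x)
assert np.isclose(lhs, rhs)
```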


Sources: The Matrix Cookbook, Hogg+ 2020