A calculus of the absurd

20.3 Matrix factorisations

20.3.1 LU decomposition

In general, systems of linear equations which are triangular127127 That is, all the elements in coefficient matrix either above or below the diagonal are all equal to \(0\) are substantially easier to solve than those which are not (because we can simply use backwards or forwards substitution).

Unfortunately, however, more often than not coefficient matrices do not come in triangular form128128 As you’ve probably noticed, reality is messy.. It would, however, be nice if we could reduce them to triangular matrices (if you’re thinking “of course, we can just use Gaussian elimination (Section 20.2.1)”, then you are correct)! However, Gaussian elimination is useful for solving an equation in the form

\begin{equation} Ax = b \end{equation}

It works very well when we only have one \(b\), but sometimes we have more than one \(b\) we might want to solve for. What if we want to solve the equation \(Ax = b\) for every129129 One principle which seems to hold in a lot of mathematics is that if you consider enough absurd scenarios, eventually you will discover something interesting (e.g. a principle which can be applied to non-absurd scenarios) \(b \in \{b_1, b_2, ..., b_n\}\). Here’s a neat idea; if we can write \(A = LU\) (where \(L\) and \(U\) are both triangular matrices) 130130 How to do this is documented in Section ??, then we can write our system equivalently as

\begin{equation} (LU)x = b \end{equation}

And because matrix multiplication is associative, this is the same as

\begin{equation} L(Ux) = b \end{equation}

As \(Ux\) will be a column vector, if we set \(y=Ux\), then we can build a system

\begin{equation} Ly = b \end{equation}

We can solve this system using forwards or backwards substitution (depending on whether \(L\) is upper-triangular or lower-triangular). Then, once we know the value of \(y\), we can solve a second equation,

\begin{equation} Ux = y \end{equation}

Which gives us the value of \(x\) (which is what we wanted). Note that because of how we defined \(y\), \(y=Ux\).