31.1 QR decomposition
31.1.1 Definition and some uses
The QR decomposition is a very powerful matrix factorisation. The idea is that for any matrix $A\in M_{m\times n}(\mathbb{R})$ we can find two matrices $Q,R$ such that
where $Q$ is an orthogonal matrix and $R$ is an upper triangular matrix.
Example 31.1.1
Consider the least squares problem of finding the value of $x$ which minimises $Axb_{2}$. If we have a $QR$ decomposition for $A$ we can then write
$\displaystyle\left\lVert Axb\right\rVert_{2}$  $\displaystyle=\left\lVert QRxb\right\rVert_{2}$  (31.2)  
$\displaystyle=\left\lVert Q(RxQ^{T}b)\right\rVert_{2}$  (31.3)  
$\displaystyle=\left\lVert RxQ^{T}b\right\rVert_{2}$  (31.4) 
31.1.2 Householder transformations
We already know one method for computing a $QR$ decomposition; GramSchmitt. Unfortunately, GramSchmitt is a fantastic theoretical tool, but very numerically unstable. There are alternative methods which are more suitable for computational applications.
One of these uses Householder transformations. The idea here is that we will apply a series of transformations which zero out everything in our matrix. In algebraic terms we will be carrying out the operation
and if all the $T$s are orthogonal this means that we can find
which is a $QR$ decomposition where
because the inverse of an orthogonal matrix is also orthogonal, and the product of orthogonal matrices is also orthogonal.
The idea is that the first $T$ will carry out the following transformation (in matrix terms).
The second $T$ will carry out
and so on, until we get a triangular matrix.
The matrices, $T_{2},T_{3},...,T_{n1}$ can be interpreted as Householder reflections for smaller matrices. The $i$th transformation should only affect the submatrix formed by excluding the first $i1$ rows and columns. This means that $T_{1}$ will look like
where $T_{(n1)\times(n1)}$ is a Householder reflection for the smaller square matrix formed by eliminating the first row and column of our original matrix. We can continue to apply this recursively to find the remaining matrices.
Let us consider how $T_{1}$ will work  in matrix terms we will get something along the lines of
so we want $T$ to satisfy two very important constraints

•
be orthogonal

•
the column vector $TA_{1}$ should have all zero entries (the only exception being the first, which can be anything)
Orthogonal transformations are generally rotations or reflections. In this case we will use reflections. It is also possible to use rotations, as we will see in the next section.
If we consider this diagram (you might want to draw your own as mine is uniquely terrible) we can see that we want
we need to reflect to $\left\lVert A_{1}\right\rVert e_{1}$ (rather than just $e_{1}$ or some other value) so that we preserve the size of the vector (and thus have an orthogonal transformation).
Consider the dotted line, which we can call $H$ (for hyperplane, a reminder that this also works in higher dimensions). The orthogonal projection of $A_{1}$ onto $v$ is given by
and while we’re at it we should define $v$, which is
and if you are wondering why there is a $\pm$ there, more on that later (it’s about numerical stability). The important thing is that either works. The diagram uses $v=\left\lVert A_{1}\right\rVert e_{1}A_{1}$, but we could also use $\left\lVert A_{1}\right\rVert e_{1}A_{1}$, except that we’d be on the negative $e_{1}$axis and by drawing skills aren’t that good.
Therefore we have the complementary projector
which projects onto $H$ (the dotted line). This is actually pretty helpful because it gets us half the way to $\left\lVert A_{1}\right\rVert e_{1}$. To get the rest of the way there we just need to multiply by $2$, i.e. use
Thus to find $T_{i}$ we have to compute the corresponding $v$, and then use Equation 31.15.
31.1.3 Givens rotations
These are like Householder reflections, but more fiddly. Personally I prefer Householder reflections.
The idea with a Givens rotation is that instead of zeroing out all the entries in each vector at the same time, we do this one by one. Visually this goes something like
$\displaystyle\begin{pmatrix}\times&\times&\times&\times\\ \times&\times&\times&\times\\ \times&\times&\times&\times\\ \times&\times&\times&\times\\ \times&\times&\times&\times\\ \times&\times&\times&\times\end{pmatrix}$  $\displaystyle\mapsto_{1}\begin{pmatrix}\times&\times&\times&\times\\ 0&\times&\times&\times\\ \times&\times&\times&\times\\ \times&\times&\times&\times\\ \times&\times&\times&\times\\ \times&\times&\times&\times\end{pmatrix}$  (31.28)  
$\displaystyle\mapsto_{2}\begin{pmatrix}\times&\times&\times&\times\\ 0&\times&\times&\times\\ 0&\times&\times&\times\\ \times&\times&\times&\times\\ \times&\times&\times&\times\\ \times&\times&\times&\times\end{pmatrix}$  (31.35)  
$\displaystyle...$  (31.36)  
$\displaystyle\mapsto_{n1}\begin{pmatrix}\times&\times&\times&\times\\ 0&\times&\times&\times\\ 0&\times&\times&\times\\ 0&\times&\times&\times\\ 0&\times&\times&\times\\ 0&\times&\times&\times\end{pmatrix}$  (31.43) 
Therefore the question is really how we can find a matrix $T$ which satisfies
This problem looks pretty hard, but we can simplify it by first consider the $2$d problem; we seek $T$ such that
where $r=\sqrt{v_{1}^{2}+v_{2}^{2}}$ so that $T$ is an orthogonal transformation. This is just the linear system
where we can assume the special structure for $T$ because we want $T$ to be a rotation. Therefore we want our solution to satisfy
$\displaystyle v_{1}cv_{2}s=r$  (31.47)  
$\displaystyle v_{2}c+v_{1}s=0$  (31.48) 
which means that we need
Here we have to be a bit careful because $r=\sqrt{v_{1}^{2}+v_{2}^{2}}$ and $r^{2}$ is quite big and thus susceptible to overflow error. There are good ways to compute this but let’s not get distracted.
We can generalise this to the $n$dimensional case. Let $T_{n}$ represent the transform for an $n\times n$ matrix. For a given (pun entirely unintended) vector we know
and we are interested in leaving all the vectors unpreserved except the $k$th one, which means we need $T_{i}=e_{i}$ where $i\notin\{k,k+1\}$. For the $v_{k}$ and $v_{k+1}$ terms we want
$\displaystyle v_{k}\begin{pmatrix}0\\ 0\\ ...\\ c\\ s\\ ...\\ 0\end{pmatrix}+v_{k+1}\begin{pmatrix}0\\ 0\\ ...\\ s\\ c\\ ...\\ 0\end{pmatrix}$  (31.65) 
so that the transform gives us
which applies the rotation as required.