A calculus of the absurd

22.7.8 Orthogonal projection

If you’ve done any physics (shivers) then you’ve probably come across the idea of “resolving” forces. If you haven’t, then the basic idea is that if we have some vector \(v\) in \(\mathbb {R}^2\), then for example we can split it into two components: a perpendicular one and a parallel one.

One quite natural way to do this is to resolve a vector parallel and perpendicular to the “axes”. For example, the vector in the diagram below can be resolved into a component parallel to \(\uvec {i}\) and a component parallel to \(\uvec {j}\) (note that \(\uvec {i}\) and \(\uvec {j}\) are orthogonal).

(-tikz- diagram)

But perpendicular and parallel to what? Usually in secondary school mathematics this is not very well-defined, but we can now use some of our previous definitions to define this notion of “splitting up a vector” and generalise it to vector spaces where we don’t have a ready geometric interpretation.

The following definition encodes a lot of the intuitive notions about orthogonality and perpendicularity.

  • Definition 22.7.3 Let \(V\) be a vector space, let \(E\) be a subspace of \(V\), and let \(\textbf {v}\) be a vector in \(V\).

    We say that \(\textbf {w}\) is the orthogonal projection of \(\textbf {v}\) onto \(E\) if

    • 1. The vector \(\textbf {w}\) is in \(E\).

    • 2. The vector obtained by subtracting \(\textbf {w}\) from \(\textbf {v}\) is orthogonal to all the vectors in \(E\) (which we can write as \(\textbf {v} - \textbf {w} \bot E\)).
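To make the two conditions concrete, here is a minimal sketch that checks them for one small example. It assumes \(\mathbb {R}^2\) with the usual dot product as the inner product and takes \(E\) to be the line spanned by a single non-zero vector; the helper names `dot` and `project_onto_line` are made up for this illustration.

```python
def dot(a, b):
    """Standard dot product on R^n (the inner product assumed here)."""
    return sum(x * y for x, y in zip(a, b))


def project_onto_line(v, u):
    """Orthogonal projection of v onto E = span{u}, where u is non-zero."""
    scale = dot(v, u) / dot(u, u)
    return [scale * x for x in u]


v = [3.0, 4.0]
u = [1.0, 1.0]  # E is the line y = x
w = project_onto_line(v, u)
residual = [a - b for a, b in zip(v, w)]  # this is v - w

# Condition 1: w is in E, because it is a scalar multiple of u.
# Condition 2: v - w is orthogonal to u, and hence to every multiple of u.
print(w)                 # [3.5, 3.5]
print(dot(residual, u))  # 0.0
```

Because every vector in this \(E\) is a multiple of `u`, checking orthogonality against `u` alone is enough to check it against all of \(E\).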

A key property is that the orthogonal projection is the vector in \(E\) which is closest to \(\textbf {v}\). This makes the orthogonal projection useful in optimisation problems!

  • Theorem 22.7.6 Let \(V\) be a vector space, and \(W\) be a subspace of \(V\). Let \(v \in V\). Then the orthogonal projection of \(v\) onto \(W\) is the vector in \(W\) closest to \(v\), so it attains the distance between \(v\) and \(W\) (which we define as the distance between \(v\) and the closest vector in \(W\)).
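Before the proof, the claim can be sanity-checked numerically. The sketch below is only an illustration under stated assumptions, not a proof: it works in \(\mathbb {R}^2\) with the dot product, takes \(W\) to be the line spanned by `u`, and compares the projection against a finite sample of other vectors in \(W\). The names `dot`, `distance` and `candidates` are invented for this example.

```python
import math


def dot(a, b):
    """Standard dot product on R^n (the inner product assumed here)."""
    return sum(x * y for x, y in zip(a, b))


def distance(a, b):
    """Distance induced by the dot product: the norm of a - b."""
    diff = [x - y for x, y in zip(a, b)]
    return math.sqrt(dot(diff, diff))


v = [3.0, 4.0]
u = [1.0, 1.0]  # W = span{u}
scale = dot(v, u) / dot(u, u)
p = [scale * x for x in u]  # orthogonal projection of v onto W

# Sample vectors t*u in W for t = -5.0, -4.9, ..., 10.0.
candidates = [[t / 10 * x for x in u] for t in range(-50, 101)]
closest = min(candidates, key=lambda w: distance(v, w))

# No sampled vector in W is closer to v than p is.
no_closer = all(distance(v, p) <= distance(v, w) for w in candidates)
print(closest, no_closer)
```

A sample can only fail to find a counterexample; the argument that follows covers every vector in \(W\).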

To prove this, we will need the Pythagorean theorem. Let \(v \in V\) and let \(w\) be an arbitrary vector in \(W\). We define \(p\) to be the orthogonal projection of \(v\) onto \(W\). We can write the squared distance between \(v\) and \(w\) as \(||v-w||^2\). Our goal is to show that this is greater than or equal to \(||v-p||^2\) (note that in general it is nicer to work with the distance squared, and this is all good and well because distance is never negative, so comparing squares is the same as comparing distances). Then we apply the trusty trick of adding zero (i.e. an object and its inverse), in this case \(-p + p\), which gives us

\begin{align} ||v-w||^2 &= ||v-p+p-w||^2 \end{align}

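Then note that \(v - p \in W^{\bot }\) and that \(p - w \in W\) (as both \(p\) and \(w\) are in \(W\), which is a subspace). To this, we can apply Pythagoras’ theorem (just when you thought you’d escaped school geometry, it comes back to bite!), which tells us that

\begin{align} ||v-w||^2 &= ||v-p||^2 + ||p-w||^2 \geq ||v-p||^2 \end{align}

because \(||p-w||^2\) is never negative. This is exactly what we wanted to show.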

Now that we have defined the object, we can ask some questions that (at least to me) it makes sense to ask. For example, we can ask if the orthogonal projection always exists! This seems to be intuitively true, but how do we know? Let us prove this too.