19.6 Inner product spaces
TODO
19.6.1 Inner products
19.6.2 Norms
Induced norms
Every inner product \say{induces} a norm, which is to say that if we have an inner product $\langle \cdot, \cdot \rangle$ on a vector space $V$, we can define the norm
\[ \|v\| = \sqrt{\langle v, v \rangle}. \]
To show that this actually is a norm it is necessary and sufficient to show that it satisfies the four norm axioms.
• Homogeneity. Consider $\|\lambda v\|$ for a scalar $\lambda$ and a vector $v$, for which
\[ \|\lambda v\| = \sqrt{\langle \lambda v, \lambda v \rangle} \tag{19.146} \]
\[ = \sqrt{\lambda \overline{\lambda} \langle v, v \rangle} \tag{19.147} \]
\[ = \sqrt{|\lambda|^{2} \langle v, v \rangle} \tag{19.148} \]
\[ = |\lambda| \sqrt{\langle v, v \rangle} \tag{19.149} \]
\[ = |\lambda| \, \|v\|. \tag{19.150} \]
• TODO: other proofs
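For the record, here is one standard way to deal with the remaining axioms (a sketch, assuming the Cauchy-Schwarz inequality $|\langle u, v \rangle| \le \|u\|\,\|v\|$, which holds in any inner product space). Non-negativity, and the fact that $\|v\| = 0$ only when $v = 0$, follow directly from the positive-definiteness of the inner product. For the triangle inequality,
\[ \|u + v\|^{2} = \langle u + v, u + v \rangle = \|u\|^{2} + 2\operatorname{Re}\langle u, v \rangle + \|v\|^{2} \le \|u\|^{2} + 2\|u\|\,\|v\| + \|v\|^{2} = (\|u\| + \|v\|)^{2}, \]
and taking square roots gives $\|u + v\| \le \|u\| + \|v\|$.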
Norms which are not induced norms
We will pretend these do not exist (but be aware that they do most definitely exist)!!!
19.6.3 Orthogonality
Hopefully you know what perpendicular means. We would like to generalise this notion to a more abstract setting; we can say that
Definition 19.6.1
Let $V$ be an inner product space over a field $F$. Then two vectors $u, v \in V$ are orthogonal if and only if their inner product is zero; that is, if and only if
\[ \langle u, v \rangle = 0. \]
We can denote this as $u \perp v$ (read \say{$u$ and $v$ are orthogonal}).
This is not the most exciting definition, but it is a very useful one!
Example 19.6.1
Consider the vector space $\mathbb{R}^{2}$ with the dot product as its inner product. Two vectors such as
\[ u = \begin{pmatrix} 1 \\ 2 \end{pmatrix}, \qquad v = \begin{pmatrix} -2 \\ 1 \end{pmatrix} \]
are orthogonal with respect to the dot product.
We can just apply the definition; recall that
\[ u \cdot v = (1)(-2) + (2)(1) \tag{19.157} \]
\[ = 0. \tag{19.158} \]
And therefore, the two vectors are orthogonal.
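As a quick sanity check, this is also something a computer can verify directly; a minimal NumPy sketch (the vectors are the illustrative ones from the example above):

```python
import numpy as np

# Two vectors in R^2 (illustrative choice).
u = np.array([1.0, 2.0])
v = np.array([-2.0, 1.0])

# Orthogonality with respect to the dot product means <u, v> = 0.
print(np.dot(u, v))  # 0.0, so u and v are orthogonal
```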
19.6.4 Some useful properties orthogonal vectors possess.
Theorem 19.6.1
Let $\{v_1, v_2, \dots, v_n\}$ be a set of non-zero pairwise orthogonal vectors (that is, any two distinct vectors in the set are orthogonal, as defined in Definition 19.6.1); then this set is linearly independent.
TODO: proof
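In the meantime, here is one standard argument, sketched (using the notation $v_1, \dots, v_n$ from the statement above): suppose that for some scalars $a_1, \dots, a_n$ we have
\[ a_1 v_1 + a_2 v_2 + \dots + a_n v_n = 0. \]
Taking the inner product of both sides with any $v_j$ and using pairwise orthogonality kills every term except one,
\[ 0 = \langle a_1 v_1 + \dots + a_n v_n, v_j \rangle = a_1 \langle v_1, v_j \rangle + \dots + a_n \langle v_n, v_j \rangle = a_j \langle v_j, v_j \rangle. \]
Since $v_j \neq 0$ we have $\langle v_j, v_j \rangle \neq 0$, so $a_j = 0$; as $j$ was arbitrary, every coefficient is zero, which is exactly linear independence.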
19.6.5 Orthonormal bases
Theorem 19.6.2
Let $V$ be a finite-dimensional inner product space, and let $\{e_1, e_2, \dots, e_n\}$ be an orthonormal basis for this vector space. Then, (spoiler alert) we know that for all $v \in V$ that
\[ v = \langle v, e_1 \rangle e_1 + \langle v, e_2 \rangle e_2 + \dots + \langle v, e_n \rangle e_n. \]
We can prove this as follows. First we know that $v = a_1 e_1 + a_2 e_2 + \dots + a_n e_n$ for some scalars $a_1, \dots, a_n$, as $\{e_1, \dots, e_n\}$ is a basis.
We now want to find the values of $a_j$ (for any $j$). We know that for all $j$
\[ \langle v, e_j \rangle = \langle a_1 e_1 + a_2 e_2 + \dots + a_n e_n, e_j \rangle \tag{19.161} \]
\[ = a_1 \langle e_1, e_j \rangle + a_2 \langle e_2, e_j \rangle + \dots + a_n \langle e_n, e_j \rangle. \tag{19.162} \]
Then as $\{e_1, \dots, e_n\}$ is an orthogonal basis, we know that all the $\langle e_i, e_j \rangle$ terms with $i \neq j$ are zero; and since the basis is in fact orthonormal, $\langle e_j, e_j \rangle = 1$, so $\langle v, e_j \rangle = a_j$, as claimed.
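This gives a very concrete recipe for finding coordinates: take inner products with the basis vectors. A minimal NumPy sketch (the basis and vector below are illustrative choices, not taken from the text):

```python
import numpy as np

# An orthonormal basis of R^2: unit vectors at 45 degrees to the axes.
e1 = np.array([1.0, 1.0]) / np.sqrt(2)
e2 = np.array([1.0, -1.0]) / np.sqrt(2)

v = np.array([3.0, 4.0])

# The coordinates of v are just its inner products with the basis vectors.
a1, a2 = np.dot(v, e1), np.dot(v, e2)

# Reconstructing v from its coordinates recovers the original vector.
print(np.allclose(a1 * e1 + a2 * e2, v))  # True
```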
19.6.6 Gram-Schmidt orthonormalisation
Gram-Schmidt orthonormalisation provides a useful way to turn a set of linearly independent vectors into a set of orthogonal (and, after normalising, orthonormal) vectors with the same span.
The key idea is that we build our set of vectors inductively. Suppose our set of vectors is $\{v_1, v_2, \dots, v_n\}$ (a finite subset of some inner product space $V$). Then we will order our set (doesn’t matter how, any ordering will do) and start to build sets $E_1, E_2, \dots, E_n$, such that each $E_k = \{e_1, e_2, \dots, e_k\}$ is an orthogonal set of vectors which satisfies
\[ \operatorname{span}(E_k) = \operatorname{span}(\{v_1, v_2, \dots, v_k\}). \]
Clearly the main thing which is missing here is the step which takes us from $E_k$ to $E_{k+1}$. There are a lot of ways to find this step; for example, we could
• Consider specific examples of linearly independent vectors in well-known vector spaces (for example $\mathbb{R}^{2}$ or $\mathbb{R}^{3}$) and guess the formula (pun entirely unintended) for performing this orthonormalisation process.
• Try to write a proof for our method and, through this, try to fill in the actual method.
I will try for the latter, because I think it is an approach which is much more fun. We will start by creating $E_1$ by simply selecting the first element of our ordering, so $E_1 = \{v_1\}$; a single non-zero vector is, by itself, an orthogonal set.
Now, suppose that we have constructed $E_k = \{e_1, e_2, \dots, e_k\}$. We would like to find a way to build $E_{k+1}$. Clearly we should add a vector accounting for $v_{k+1}$ to this set; the question of course is how. We need our new vector, say $e_{k+1}$, to be such that
\[ \langle e_{k+1}, e_i \rangle = 0 \quad \text{for all } i \in \{1, 2, \dots, k\}. \tag{19.164} \]
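One way to satisfy this condition (the standard Gram-Schmidt step, stated here as a sketch rather than derived) is to subtract from $v_{k+1}$ its component along each of the vectors we have already constructed:
\[ e_{k+1} = v_{k+1} - \sum_{i=1}^{k} \frac{\langle v_{k+1}, e_i \rangle}{\langle e_i, e_i \rangle} e_i. \]
A minimal NumPy sketch of the whole process, with normalisation thrown in at each step (the input vectors are illustrative):

```python
import numpy as np

def gram_schmidt(vectors):
    """Turn a list of linearly independent vectors into an orthonormal list.

    A plain textbook sketch, not a numerically robust implementation.
    """
    basis = []
    for v in vectors:
        w = v.astype(float)
        # Subtract the component of v along each already-constructed vector.
        for e in basis:
            w = w - np.dot(w, e) * e
        # Normalise (this assumes the inputs really are linearly independent).
        basis.append(w / np.linalg.norm(w))
    return basis

# Illustrative input vectors in R^3.
vs = [np.array([1, 1, 0]), np.array([1, 0, 1]), np.array([0, 1, 1])]
es = gram_schmidt(vs)

# Check: distinct vectors have inner product 0, and each has norm 1.
print(np.allclose([np.dot(es[i], es[j]) for i in range(3) for j in range(i)], 0))
print(np.allclose([np.linalg.norm(e) for e in es], 1))
```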
19.6.7 Orthogonal complement
Definition 19.6.2
Let $V$ be an inner product space, and $W$ be a subspace of $V$. We define the orthogonal complement of $W$ as the set
\[ W^{\perp} = \{ v \in V : \langle v, w \rangle = 0 \text{ for all } w \in W \}. \]
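In $\mathbb{R}^{n}$ with the dot product this is very computable: if $W$ is the span of the rows of a matrix $A$, then $v \perp W$ exactly when $Av = 0$, so $W^{\perp}$ is the null space of $A$. A minimal NumPy sketch (the matrix is an illustrative choice), using the SVD to extract the null space:

```python
import numpy as np

# W is the span of the rows of A (an illustrative subspace of R^3).
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])

# v is orthogonal to every row of A  <=>  A @ v = 0.
_, s, Vt = np.linalg.svd(A)
rank = int(np.sum(s > 1e-10))
W_perp_basis = Vt[rank:]                     # rows spanning W-perp

print(W_perp_basis)                          # proportional to (1, 1, -1)
print(np.allclose(A @ W_perp_basis.T, 0))    # True
```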
Theorem 19.6.3
Let $V$ be an inner product space, and $W$ be a subspace of $V$. In this case wouldn’t it be nice if $W \cap W^{\perp} = \{0\}$?
Yes, it would. We can prove separately that $\{0\} \subseteq W \cap W^{\perp}$ and $W \cap W^{\perp} \subseteq \{0\}$.
• $\{0\} \subseteq W \cap W^{\perp}$. This direction is the easier one, because we know both $W$ and $W^{\perp}$ are subspaces and therefore they both contain at least the zero vector.
• $W \cap W^{\perp} \subseteq \{0\}$. Let us suppose that $v \in W$ and $v \in W^{\perp}$, and to prove this by contradiction suppose that $v \notin \{0\}$, i.e. $v \neq 0$. Because $v \in W^{\perp}$ we know that $\langle v, w \rangle = 0$ for all $w \in W$ (this follows directly from the definition of $W^{\perp}$). As it is also true that $v \in W$, it follows from this that therefore also
\[ \langle v, v \rangle = 0, \tag{19.166} \]
which is true if and only if $v = 0$, which contradicts our earlier assumption that $v \neq 0$, and thus this direction is true.
Therefore, the theorem is true.
Theorem 19.6.4 (The very important resolving theorem)
Let $V$ be a finite-dimensional inner product space, of which $W$ and $W^{\perp}$ are subspaces; then for every vector $v \in V$ there exist unique vectors $w \in W$ and $w' \in W^{\perp}$ such that
\[ v = w + w'. \]
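As a concrete illustration (the vectors here are chosen purely as an example): in $\mathbb{R}^{2}$ with the dot product, if $W = \operatorname{span}\{(1, 0)\}$ then $W^{\perp} = \operatorname{span}\{(0, 1)\}$, and the vector $v = (3, 4)$ resolves uniquely as
\[ v = \underbrace{(3, 0)}_{\in W} + \underbrace{(0, 4)}_{\in W^{\perp}}. \]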
Theorem 19.6.5
Let $V$ be a finite-dimensional inner product space, of which $W$ is a subspace. Then
\[ \dim(V) = \dim(W) + \dim(W^{\perp}). \]
This theorem follows mostly from the definitions. First, let $\dim(W) = k$ and $\dim(W^{\perp}) = m$; then our goal is to show that $\dim(V) = k + m$.
First, fix a basis for $W$, say $\{w_1, \dots, w_k\}$, and a basis for $W^{\perp}$, say $\{u_1, \dots, u_m\}$. Then we will prove that $\{w_1, \dots, w_k, u_1, \dots, u_m\}$ is a basis for $V$.
• Generating. Let $v \in V$; then using Theorem 19.6.4 we can write $v = w + u$ for some $w \in W$ and $u \in W^{\perp}$, and writing $w$ in terms of $w_1, \dots, w_k$ and $u$ in terms of $u_1, \dots, u_m$ expresses $v$ as a linear combination of the combined list.
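To complete the argument we also need the combined list to be linearly independent; a sketch of one way to see this: if a linear combination of the $w_i$ and the $u_j$ equals zero, then the $W$-part and the $W^{\perp}$-part of that combination are negatives of each other, so each of them lies in $W \cap W^{\perp} = \{0\}$ (Theorem 19.6.3); linear independence within each of the two bases then forces all the coefficients to be zero. Counting vectors in the combined basis gives $\dim(V) = k + m$, as required.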
19.6.8 Orthogonal projection
If you’ve done any physics (shivers) then you’ve probably come across the idea of \say{resolving} forces. If you haven’t, then the basic idea is that if we have some vector in $\mathbb{R}^{2}$, then for example we can split it into two components: a perpendicular one and a parallel one.
One way we can do this (which is quite natural) is to resolve any vector parallel and perpendicular to the \say{axes}; for example, we could have the vector in the diagram below, which can be resolved into a component parallel to each of the two axes (note that the two axes are orthogonal).
But perpendicular and parallel to what? Usually in secondary school mathematics this is not very well-defined, but we can now use some of our previous definitions to define this notion of \say{splitting up} a vector and generalise it to vector spaces where we don’t have a ready geometric interpretation.
This definition encodes a lot of the intuitive notions about orthogonality and perpendicularity.
Definition 19.6.3
Let $V$ be an inner product space, and $W$ be a subspace of $V$.
We say that $w$ is the orthogonal projection of $v$ onto $W$ if
1. The vector $w$ is in $W$.
2. The vector obtained by subtracting $w$ from $v$ is orthogonal to all the vectors in $W$ (which we can write as $v - w \in W^{\perp}$).
A very key property is that the orthogonal projection of $v$ onto $W$ is the vector in $W$ which is closest to $v$. This makes the orthogonal projection useful in optimisation problems!
Theorem 19.6.6
Let $V$ be an inner product space, and $W$ be a subspace of $V$. Let $v \in V$. Then the orthogonal projection of $v$ onto $W$ minimises the distance between $v$ and $W$ (which we define as the distance between $v$ and the closest vector in $W$).
To prove this, we will need the Pythagorean theorem. Let $v \in V$ and let $u$ be an arbitrary vector in $W$. Then we will define $w$ to be the orthogonal projection of $v$ onto $W$. We can write the squared distance between $v$ and $u$ as $\|v - u\|^{2}$. Our goal is to show that this is greater than or equal to $\|v - w\|^{2}$ (note that in general it is always nicer to work with the distance squared, and this is all good and well because distance is never negative). Then we apply the trusty trick of adding zero (i.e. an object and its inverse), in this case $w - w$, which gives us
\[ \|v - u\|^{2} = \|(v - w) + (w - u)\|^{2}. \tag{19.169} \]
Then note that $v - w \in W^{\perp}$ and that $w - u \in W$ (as both $w$ and $u$ are in $W$, which is a subspace). To this, we can apply Pythagoras’ theorem (just when you thought you’d escaped school geometry, it comes back to bite!), from which we know that
\[ \|v - u\|^{2} = \|v - w\|^{2} + \|w - u\|^{2} \geq \|v - w\|^{2}, \]
and so no vector $u \in W$ is closer to $v$ than the orthogonal projection $w$ is.
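To see the definition and the minimising property in action, here is a minimal NumPy sketch (the matrix $A$ and vector $v$ are illustrative; the projection is computed using an orthonormal basis of $W$ obtained from a QR factorisation):

```python
import numpy as np

# W is the column space of A (an illustrative subspace of R^3).
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [0.0, 1.0]])
v = np.array([1.0, 2.0, 3.0])

# Build an orthonormal basis of W via QR, then project onto it.
Q, _ = np.linalg.qr(A)
w = Q @ (Q.T @ v)                       # orthogonal projection of v onto W

# Defining property: v - w is orthogonal to W (i.e. to every column of A).
print(np.allclose(A.T @ (v - w), 0))    # True

# Minimising property: w is at least as close to v as other vectors in W.
for c in np.random.default_rng(0).normal(size=(5, 2)):
    assert np.linalg.norm(v - A @ c) >= np.linalg.norm(v - w)
```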
Now that we have defined the object, we can ask some questions that (at least to me) it makes sense to ask. For example, we can ask if the orthogonal projection always exists! This seems to be intuitively true, but how do we know? Let us prove this too.
19.6.9 The method of least squares
There’s a fairly intuitive notion that the orthogonal (well, \say{perpendicular}) line minimises the distance between points.
I had a surprising amount of trouble trying to get a nice little right-angle symbol onto the diagram (and ultimately failed), but the line connecting the point to the other line is in fact perpendicular to that line, and it also realises the shortest distance.
Usually the question of how to minimise or maximise a quantity requires some analysis (usually differentiation); however, here we are lucky to have a case where we can minimise things using only linear algebra!
The key idea in least squares is that when we have a system of linear equations $Ax = b$, sometimes we cannot find a solution to our system of linear equations (which sucks), but it would be nice to have an approximate solution. That is, our problem is how to find an $\hat{x}$ such that
\[ A\hat{x} \]
is the next best thing to a solution to $Ax = b$. How exactly we should define optimal actually does matter (for example, it turns out that in the \say{real world} error in experiments tends to follow a certain kind of statistical distribution: the normal distribution), but let us be driven by what seems simplest, and say that whatever minimises the \say{distance} between $A\hat{x}$ and $b$ is best; that is, we seek to find the $\hat{x}$ which minimises
\[ \|A\hat{x} - b\|. \]
Note that we are essentially trying to find the vector in $\operatorname{range}(A)$ (the range of $A$) which is closest to $b$.
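A minimal NumPy sketch of this idea (the data below are illustrative; np.linalg.lstsq computes exactly the minimiser described above):

```python
import numpy as np

# An overdetermined system Ax = b with no exact solution (illustrative data).
A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0, 2.0])

# Least-squares solution: x_hat minimises ||A x - b||.
x_hat, _, _, _ = np.linalg.lstsq(A, b, rcond=None)

# A @ x_hat is the orthogonal projection of b onto range(A), so the
# residual b - A @ x_hat is orthogonal to the columns of A.
print(x_hat)
print(np.allclose(A.T @ (b - A @ x_hat), 0))  # True
```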