19.2 Vector spaces

19.2.1 The vector space axioms

There are eight axioms in total, but I find it easier to remember them this way:

Definition 19.2.1

A vector space is a set $V$ over a field $\mathbb{K}$ (elements of which are called "scalars"), equipped with an operator $+$ called "vector addition", which takes two elements of $V$ and returns an element of $V$, and an operator $\cdot$ called "scalar multiplication", which takes a scalar and a vector and outputs a vector.

We have the following eight axioms:

  1.

    The first four axioms are equivalent to stating that $(V,+)$ must be an Abelian group.

  2.

    We have two kinds of distributivity. One is that if $\mathbf{x},\mathbf{y}\in V$ and $a\in\mathbb{K}$, then

    a\cdot(\mathbf{x}+\mathbf{y})=a\cdot\mathbf{x}+a\cdot\mathbf{y}  (19.40)
  3.

    The second is that if $a,b\in\mathbb{K}$ and $\mathbf{x}\in V$, then

    (a+b)\cdot\mathbf{x}=a\cdot\mathbf{x}+b\cdot\mathbf{x}  (19.41)
  4.

    The multiplicative neutral element of $\mathbb{K}$ (e.g. in $\mathbb{R}$ this is $1$) has the following property,

    \forall\mathbf{x}\in V\quad 1\cdot\mathbf{x}=\mathbf{x}  (19.42)
  5.

    We also have a kind of "multiplicative associativity" (compatibility of scalar multiplication with multiplication in $\mathbb{K}$):

    a\cdot(b\cdot\mathbf{x})=(ab)\cdot\mathbf{x}  (19.43)

Not exactly the most exciting stuff, but we can’t build castles without foundations! I’m not a structural engineer, but I’m pretty sure this is a true statement.
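
If you like, you can spot-check these axioms numerically in a familiar example. Below is a minimal sketch (assuming $V=\mathbb{R}^{2}$ over $\mathbb{R}$, with NumPy arrays standing in for vectors; the sample values are arbitrary) that checks Equations 19.40 to 19.43 for one choice of vectors and scalars. It is a sanity check rather than a proof.

```python
import numpy as np

# Spot-check axioms (19.40)-(19.43) in R^2 over R.
# Arbitrary sample vectors and scalars; a sanity check, not a proof.
x, y = np.array([1.0, -2.0]), np.array([0.5, 3.0])
a, b = 2.0, -4.0

assert np.allclose(a * (x + y), a * x + a * y)  # (19.40) distributivity over vector addition
assert np.allclose((a + b) * x, a * x + b * x)  # (19.41) distributivity over scalar addition
assert np.allclose(1 * x, x)                    # (19.42) the neutral element of K acts trivially
assert np.allclose(a * (b * x), (a * b) * x)    # (19.43) compatibility of the two multiplications
print("All four sampled axioms hold.")
```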

19.2.2 Linear independence

This is a very important$^{\text{TM}}$ concept in linear algebra.

Definition 19.2.2

Let $v_{1},v_{2},\dots,v_{n}$ be some vectors in a vector space $V$, and let $a_{1},a_{2},\dots,a_{n}$ be some scalars in the field $\mathbb{K}$ (over which this vector space is defined).

We say these vectors are linearly independent if and only if

a_{1}v_{1}+a_{2}v_{2}+\dots+a_{n}v_{n}=0\implies a_{1}=a_{2}=\dots=a_{n}=0.  (19.44)

In words, this means "if the only values of the $a$s which satisfy $a_{1}v_{1}+a_{2}v_{2}+\dots+a_{n}v_{n}=0$ are $a_{1}=a_{2}=\dots=a_{n}=0$, then the vectors are linearly independent".
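
For vectors in $\mathbb{R}^{m}$ there is a practical way to test this: stack the vectors as the columns of a matrix and check whether its rank equals the number of vectors, which is equivalent to the definition above. Below is a minimal sketch using NumPy (the specific vectors are arbitrary; in floating point the rank computation is only reliable up to numerical tolerance).

```python
import numpy as np

# The vectors are linearly independent iff the matrix having them as its
# columns has rank equal to the number of vectors.
vectors = [np.array([1.0, 0.0, 2.0, 1.0]),
           np.array([0.0, 1.0, 1.0, 0.0]),
           np.array([1.0, 1.0, 3.0, 1.0])]  # third = first + second, so dependent

M = np.column_stack(vectors)
independent = np.linalg.matrix_rank(M) == len(vectors)
print(independent)  # False: a_1 = 1, a_2 = 1, a_3 = -1 gives the zero vector
```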

19.2.3 Bases

Change of basis

Let us suppose that we have two sets of basis vectors for the same vector space $V$. It doesn't really matter what we call them, but $\mathcal{B}$ and $\mathcal{C}$ are names as good as any. These vectors can be written in the form

\mathcal{B}=\{\beta_{1},\beta_{2},\dots,\beta_{\dim(V)}\}  (19.45)
\mathcal{C}=\{\gamma_{1},\gamma_{2},\dots,\gamma_{\dim(V)}\}  (19.46)

For any vector $\mathbf{v}\in V$, we can always write it in the co-ordinate system $\mathcal{B}$ by writing the vector as a linear combination of the vectors in $\mathcal{B}$ (this is always possible because $\mathcal{B}$ is a basis for $V$). We can write this as

[\mathbf{v}]_{\mathcal{B}}=\begin{pmatrix}v_{1}\\ v_{2}\\ \vdots\\ v_{\dim(V)}\end{pmatrix}  (19.47)

where $v_{1},v_{2},\dots,v_{\dim(V)}$ are such that

\mathbf{v}=v_{1}\beta_{1}+v_{2}\beta_{2}+\dots+v_{\dim(V)}\beta_{\dim(V)}  (19.48)

That is, they are the coefficients needed to write $\mathbf{v}$ as a linear combination of $\mathcal{B}$. This also helps to explain why, for example, the vector space of $2\times 2$ symmetric matrices, i.e. those of the form $\begin{pmatrix}a&b\\ b&c\end{pmatrix}$ (19.49), is three-dimensional: we can write every such matrix as a vector of dimension $3\times 1$, where each coefficient denotes what to multiply each basis vector by to obtain our specific matrix.
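
For instance, one basis for this space is

\begin{pmatrix}1&0\\ 0&0\end{pmatrix},\quad\begin{pmatrix}0&1\\ 1&0\end{pmatrix},\quad\begin{pmatrix}0&0\\ 0&1\end{pmatrix}

and with respect to it the matrix $\begin{pmatrix}a&b\\ b&c\end{pmatrix}$ has co-ordinate vector $\begin{pmatrix}a\\ b\\ c\end{pmatrix}$.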

But what if we want to find a way to translate $[\mathbf{v}]_{\mathcal{B}}$ into $[\mathbf{v}]_{\mathcal{C}}$? This is actually doable using a single matrix. Here's how. We start by applying the definition of $[\mathbf{v}]_{\mathcal{B}}$; that is, we have that $[\mathbf{v}]_{\mathcal{B}}=[v_{1},v_{2},\dots,v_{\dim(V)}]$ if and only if

\mathbf{v}=v_{1}\beta_{1}+v_{2}\beta_{2}+\dots+v_{\dim(V)}\beta_{\dim(V)}  (19.50)

To find $[\mathbf{v}]_{\mathcal{C}}$, it is sufficient to write $\mathbf{v}$ in terms of the basis vectors in $\mathcal{C}$. How do we do this? A straightforward approach is to write every vector in $\mathcal{B}$ in terms of those in $\mathcal{C}$ and then substitute, which removes all the $\mathcal{B}$-vectors and leaves us with only $\mathcal{C}$-vectors.

Because $\mathcal{B}$ and $\mathcal{C}$ are both bases for $V$, we can write every vector in $\mathcal{B}$ in terms of those in $\mathcal{C}$.

\beta_{1}=\alpha_{1,1}\gamma_{1}+\alpha_{2,1}\gamma_{2}+\dots+\alpha_{\dim(V),1}\gamma_{\dim(V)}  (19.51)
\beta_{2}=\alpha_{1,2}\gamma_{1}+\alpha_{2,2}\gamma_{2}+\dots+\alpha_{\dim(V),2}\gamma_{\dim(V)}  (19.52)
\vdots  (19.53)
\beta_{\dim(V)}=\alpha_{1,\dim(V)}\gamma_{1}+\alpha_{2,\dim(V)}\gamma_{2}+\dots+\alpha_{\dim(V),\dim(V)}\gamma_{\dim(V)}  (19.54)

We can then substitute this into the linear combination of $\mathbf{v}$ in terms of the basis vectors in $\mathcal{B}$, giving

\mathbf{v}=v_{1}\left(\alpha_{1,1}\gamma_{1}+\alpha_{2,1}\gamma_{2}+\dots+\alpha_{\dim(V),1}\gamma_{\dim(V)}\right)  (19.56)
\quad+v_{2}\left(\alpha_{1,2}\gamma_{1}+\alpha_{2,2}\gamma_{2}+\dots+\alpha_{\dim(V),2}\gamma_{\dim(V)}\right)  (19.57)
\quad+\dots  (19.58)
\quad+v_{\dim(V)}\left(\alpha_{1,\dim(V)}\gamma_{1}+\alpha_{2,\dim(V)}\gamma_{2}+\dots+\alpha_{\dim(V),\dim(V)}\gamma_{\dim(V)}\right)  (19.59)

This looks scary, but we just need to stick to the definitions and keep our goal in mind: writing $\mathbf{v}$ in terms of all the $\gamma$s. We can move things around to obtain

\mathbf{v}=\left(v_{1}\alpha_{1,1}+v_{2}\alpha_{1,2}+\dots+v_{\dim(V)}\alpha_{1,\dim(V)}\right)\gamma_{1}  (19.60)
\quad+\left(v_{1}\alpha_{2,1}+v_{2}\alpha_{2,2}+\dots+v_{\dim(V)}\alpha_{2,\dim(V)}\right)\gamma_{2}  (19.61)
\quad+\dots  (19.62)
\quad+\left(v_{1}\alpha_{\dim(V),1}+v_{2}\alpha_{\dim(V),2}+\dots+v_{\dim(V)}\alpha_{\dim(V),\dim(V)}\right)\gamma_{\dim(V)}  (19.63)

Therefore, we have that

[\mathbf{v}]_{\mathcal{C}}=\begin{pmatrix}v_{1}\alpha_{1,1}+v_{2}\alpha_{1,2}+\dots+v_{\dim(V)}\alpha_{1,\dim(V)}\\ v_{1}\alpha_{2,1}+v_{2}\alpha_{2,2}+\dots+v_{\dim(V)}\alpha_{2,\dim(V)}\\ \vdots\\ v_{1}\alpha_{\dim(V),1}+v_{2}\alpha_{\dim(V),2}+\dots+v_{\dim(V)}\alpha_{\dim(V),\dim(V)}\end{pmatrix}  (19.68)
=\begin{pmatrix}\alpha_{1,1}&\alpha_{1,2}&\dots&\alpha_{1,\dim(V)}\\ \alpha_{2,1}&\alpha_{2,2}&\dots&\alpha_{2,\dim(V)}\\ \vdots&\vdots&\ddots&\vdots\\ \alpha_{\dim(V),1}&\alpha_{\dim(V),2}&\dots&\alpha_{\dim(V),\dim(V)}\end{pmatrix}\begin{pmatrix}v_{1}\\ v_{2}\\ \vdots\\ v_{\dim(V)}\end{pmatrix}  (19.77)
=\begin{pmatrix}\alpha_{1,1}&\alpha_{1,2}&\dots&\alpha_{1,\dim(V)}\\ \alpha_{2,1}&\alpha_{2,2}&\dots&\alpha_{2,\dim(V)}\\ \vdots&\vdots&\ddots&\vdots\\ \alpha_{\dim(V),1}&\alpha_{\dim(V),2}&\dots&\alpha_{\dim(V),\dim(V)}\end{pmatrix}[\mathbf{v}]_{\mathcal{B}}  (19.82)

Alternatively, the change of basis matrix has as its $k$th column the scalars needed to write the $k$th element of the old basis $\mathcal{B}$ as a linear combination of the vectors in the new basis $\mathcal{C}$.
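
To make this concrete, here is a minimal numerical sketch (assuming $V=\mathbb{R}^{3}$, two arbitrarily chosen bases, and NumPy for the linear algebra). The matrix A below has the coefficients $\alpha_{1,k},\dots,\alpha_{\dim(V),k}$ as its $k$th column, i.e. the scalars writing $\beta_{k}$ in terms of the $\gamma$s, and we check that it converts $\mathcal{B}$-co-ordinates into $\mathcal{C}$-co-ordinates.

```python
import numpy as np

# Columns of B are the beta_k and columns of C are the gamma_k, written in
# the standard coordinates of R^3 (both sets chosen arbitrarily but independent).
B = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0]])
C = np.array([[1.0, 0.0, 1.0],
              [1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])

# Column k of A holds the alphas writing beta_k in terms of the gammas,
# i.e. A is the solution of C @ A = B.
A = np.linalg.solve(C, B)

v_B = np.array([2.0, -1.0, 3.0])  # some vector, given by its B-coordinates
v = B @ v_B                       # the same vector in standard coordinates
v_C = A @ v_B                     # change of basis: its C-coordinates

assert np.allclose(C @ v_C, v)    # expanding in the gammas recovers v
print(v_C)
```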

19.2.4 Subspaces

Definition 19.2.3

Let $V$ be a vector space. The set $W$ is a subspace of $V$ if $W$ is a subset of $V$ and $W$ is itself a vector space (under the same vector addition and scalar multiplication as $V$).

Technique 19.2.1

Showing that something is a subspace. Suppose we have a vector space $V$, and we want to prove that $W$ is a subspace of $V$. The steps to do so are these:

  1.

    Show that the zero vector is in the subspace in question.

  2.

    Show that $W\subseteq V$ using the standard technique for showing that something is a subset of something else (as in Section TODO: write).

  3.

    Then we must show that $W$ is closed under vector addition and scalar multiplication. The rest of the vector space axioms follow from the fact that $W\subseteq V$ and $V$ is a vector space.
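
For example, to show that $W=\{(x,y,0):x,y\in\mathbb{R}\}$ is a subspace of $\mathbb{R}^{3}$: the zero vector $(0,0,0)$ lies in $W$; every element of $W$ is a vector in $\mathbb{R}^{3}$, so $W\subseteq\mathbb{R}^{3}$; and sums and scalar multiples of vectors whose third component is zero again have third component zero, so $W$ is closed under both operations.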

This theorem is given both as an example of how to prove facts about vector spaces and because it is important in its own right.

Theorem 19.2.1

Let $V$ be a vector space, and let $U$ and $W$ be subspaces of $V$. Then $U\cup W$ is a subspace of $V$ if and only if $U\subseteq W$ or $W\subseteq U$.

To prove this, first we will show the "if" direction, and then the "only if" direction.

  • If. Without loss of generality, assume that $U\subseteq W$, in which case $U\cup W=W$, and this is a subspace of $V$ as $W$ is a subspace of $V$. The other case follows by swapping $U$ and $W$ in the proof.

  • Only if. This direction requires a bit more intuition about which directions to explore. First we assume that $U\cup W$ is a subspace, and then we suppose that the consequent is untrue (i.e. that neither $U\subseteq W$ nor $W\subseteq U$ holds), in which case there exist $u,w\in V$ such that

    u\in U\setminus W\text{ and }w\in W\setminus U.  (19.83)

    We can then ask (this is the core idea in the proof which is not immediately obvious, to me at least) about the status of $u+w$. As $u,w\in U\cup W$ and by assumption $U\cup W$ is a subspace (and therefore by definition closed under vector addition), it must be that $u+w\in U\cup W$. Then either $u+w\in U$ or $u+w\in W$ (by definition of the set union).

    1.

      If $u+w\in U$, then since $-u\in U$ and $U$ is closed under addition, we also have $u+w+(-u)=w\in U$, which is a contradiction as by the definition of $w$ (Equation 19.83) $w\notin U$.

    2.

      If $u+w\in W$, a very similar argument applies: since $-w\in W$, we get $u+w+(-w)=u\in W$, which is a contradiction as $u\notin W$.

    Therefore, by contradiction this direction of the theorem must be true.
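
As a concrete example of the theorem, take $V=\mathbb{R}^{2}$ with $U=\{(x,0):x\in\mathbb{R}\}$ and $W=\{(0,y):y\in\mathbb{R}\}$. Neither is contained in the other, and indeed $U\cup W$ is not a subspace: $(1,0)+(0,1)=(1,1)$ lies in neither $U$ nor $W$, so the union is not closed under addition.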