
18.3.2 Matrix representation of linear transformations.

The best way to understand this is to do a lot of examples, with specific linear transformations and vector spaces. It’s easy to get lost, sinking, “not waving but drowning” in the steaming soup of generality. As they don’t say, a little reification\footnote{Meaning turning something abstract into something concrete.} every day keeps the doctor away.

18.3.2.1 A little preliminary investigation.

TODO

18.3.2.2 Change of basis

Let us suppose that we have two sets of basis vectors for the same vector space \(V\). It doesn’t really matter what we call them, but \(\mathcal {B}\) and \(\mathcal {C}\) are names as good as any. These vectors can be written in the form

\begin{align} & \mathcal {B} = \{\beta _1, \beta _2, ..., \beta _{\dim (V)}\} \\ & \mathcal {C} = \{\gamma _1, \gamma _2, ..., \gamma _{\dim (V)}\} \end{align}

For any vector \(\mathbf {v} \in V\) we can always write it in the co-ordinate system \(\mathcal {B}\) by writing the vector as a linear combination\footnote{This is always possible because \(\mathcal {B}\) is a basis for \(V\).} of the vectors in \(\mathcal {B}\). We can write this as

\begin{equation} [\mathbf {v}]_{\mathcal {B}} = \begin{pmatrix} v_1 \\ v_2 \\ ... \\ v_{\dim (V)} \end {pmatrix} \end{equation}

Where \(v_1, v_2, ..., v_{\dim (V)}\) are such that

\begin{equation} \mathbf {v} = v_1 \beta _1 + v_2 \beta _2 + ... + v_{\dim (V)} \beta _{\dim (V)} \end{equation}

That is, they are the coefficients needed to write \(\mathbf {v}\) as a linear combination of the vectors in \(\mathcal {B}\). This also helps to explain why, for example, the vector space of \(2 \times 2\) symmetric matrices\footnote{i.e. those of the form \(\begin{pmatrix} a & b \\ b & c \end{pmatrix}\).} is three-dimensional; we can write every such matrix as a vector of dimension \(3 \times 1\), where each coefficient denotes what to multiply each basis vector by to obtain our specific matrix.
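To make this concrete, here is one convenient (and by no means unique) choice of basis for the \(2 \times 2\) symmetric matrices, which we will call \(\mathcal {S}\) just for this illustration:

\begin{equation} \mathcal {S} = \left \{ \begin{pmatrix} 1 & 0 \\ 0 & 0 \end {pmatrix}, \begin{pmatrix} 0 & 1 \\ 1 & 0 \end {pmatrix}, \begin{pmatrix} 0 & 0 \\ 0 & 1 \end {pmatrix} \right \} \end{equation}

With respect to \(\mathcal {S}\),

\begin{equation} \begin{pmatrix} a & b \\ b & c \end {pmatrix} = a \begin{pmatrix} 1 & 0 \\ 0 & 0 \end {pmatrix} + b \begin{pmatrix} 0 & 1 \\ 1 & 0 \end {pmatrix} + c \begin{pmatrix} 0 & 0 \\ 0 & 1 \end {pmatrix} \quad \text {and so} \quad \left [ \begin{pmatrix} a & b \\ b & c \end {pmatrix} \right ]_{\mathcal {S}} = \begin{pmatrix} a \\ b \\ c \end {pmatrix} \end{equation}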

But what if we want to find a way to translate \([\mathbf {v}]_{\mathcal {B}}\) into \([\mathbf {v}]_{\mathcal {C}}\)? This is actually doable using a single matrix. Here’s how. We start by applying the definition of \([\mathbf {v}]_{\mathcal {B}}\), that is, we have that \([\mathbf {v}]_{\mathcal {B}} = [v_1, v_2, ..., v_{\dim (V)}]\) if and only if

\begin{equation} \mathbf {v} = v_1 \beta _1 + v_2 \beta _2 + ... + v_{\dim (V)} \beta _{\dim (V)} \end{equation}

To find \([\mathbf {v}]_{\mathcal {C}}\), it is sufficient to write \(\mathbf {v}\) in terms of the basis vectors in \(\mathcal {C}\). How do we do this? A straightforward approach is to write every vector in \(\mathcal {B}\) in terms of those in \(\mathcal {C}\) and then to substitute, which removes all the \(\mathcal {B}\)-vectors and leaves us with \(\mathcal {C}\)-vectors instead.

Because \(\mathcal {B}\) and \(\mathcal {C}\) are both bases for \(V\), we can write every vector in \(\mathcal {B}\) in terms of those in \(\mathcal {C}\):

\begin{align} & \beta _1 = \alpha _{1, 1} \gamma _1 + \alpha _{2, 1} \gamma _2 + ... + \alpha _{\dim (V), 1} \gamma _{\dim (V)} \\ & \beta _2 = \alpha _{1, 2} \gamma _1 + \alpha _{2, 2} \gamma _2 + ... + \alpha _{\dim (V), 2} \gamma _{\dim (V)} \\ & ... \\ & \beta _{\dim (V)} = \alpha _{1, \dim (V)} \gamma _1 + \alpha _{2, \dim (V)} \gamma _2 + ... + \alpha _{\dim (V), \dim (V)} \gamma _{\dim (V)} \end{align}

We can then substitute these into the expression for \(\mathbf {v}\) as a linear combination of the basis vectors in \(\mathcal {B}\), giving

\begin{align} \mathbf {v} &= v_1 \left (\alpha _{1, 1} \gamma _1 + \alpha _{2, 1} \gamma _2 + ... + \alpha _{\dim (V), 1} \gamma _{\dim (V)}\right ) \\ &\quad + v_2 \left (\alpha _{1, 2} \gamma _1 + \alpha _{2, 2} \gamma _2 + ... + \alpha _{\dim (V), 2} \gamma _{\dim (V)}\right ) \\ &\quad + ... \\ &\quad + v_{\dim (V)} \left (\alpha _{1, \dim (V)} \gamma _1 + \alpha _{2, \dim (V)} \gamma _2 + ... + \alpha _{\dim (V), \dim (V)} \gamma _{\dim (V)}\right ) \end{align}

This looks scary, but we just need to stick to the definitions and keep our goal in mind: writing \(\mathbf {v}\) in terms of all the \(\gamma \). Collecting the coefficient of each \(\gamma _i\), we obtain

\begin{align} \mathbf {v} &= \left (v_1 \alpha _{1, 1} + v_2 \alpha _{1, 2} + ... + v_{\dim (V)} \alpha _{1, \dim (V)}\right ) \gamma _1 \\ &\quad + \left (v_1 \alpha _{2, 1} + v_2 \alpha _{2, 2} + ... + v_{\dim (V)} \alpha _{2, \dim (V)}\right ) \gamma _2 \\ &\quad + ... \\ &\quad + \left (v_1 \alpha _{\dim (V), 1} + v_2 \alpha _{\dim (V), 2} + ... + v_{\dim (V)} \alpha _{\dim (V), \dim (V)}\right ) \gamma _{\dim (V)} \end{align}

Therefore, we have that

\begin{align} [\mathbf {v}]_{\mathcal {C}} &= \begin{pmatrix} v_1 \alpha _{1, 1} + v_2 \alpha _{1, 2} + ... + v_{\dim (V)} \alpha _{1, \dim (V)} \\ v_1 \alpha _{2, 1} + v_2 \alpha _{2, 2} + ... + v_{\dim (V)} \alpha _{2, \dim (V)} \\ ... \\ v_1 \alpha _{\dim (V), 1} + v_2 \alpha _{\dim (V), 2} + ... + v_{\dim (V)} \alpha _{\dim (V), \dim (V)} \end {pmatrix} \\ &= \begin{pmatrix} \alpha _{1, 1} & \alpha _{1, 2} & ... & \alpha _{1, \dim (V)} \\ \alpha _{2, 1} & \alpha _{2, 2} & ... & \alpha _{2, \dim (V)} \\ ... & ... & ... & ... \\ \alpha _{\dim (V), 1} & \alpha _{\dim (V), 2} & ... & \alpha _{\dim (V), \dim (V)} \end {pmatrix} \begin{pmatrix} v_1 \\ v_2 \\ ... \\ v_{\dim (V)} \end {pmatrix} \\ &= \begin{pmatrix} \alpha _{1, 1} & \alpha _{1, 2} & ... & \alpha _{1, \dim (V)} \\ \alpha _{2, 1} & \alpha _{2, 2} & ... & \alpha _{2, \dim (V)} \\ ... & ... & ... & ... \\ \alpha _{\dim (V), 1} & \alpha _{\dim (V), 2} & ... & \alpha _{\dim (V), \dim (V)} \end {pmatrix} [\mathbf {v}]_{\mathcal {B}} \end{align}

Put another way: the change of basis matrix has as its \(k\)th column the scalars needed to write the \(k\)th vector of the old basis \(\mathcal {B}\) as a linear combination of the vectors in the new basis \(\mathcal {C}\); that is, its \(k\)th column is \([\beta _k]_{\mathcal {C}}\).
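As a quick concrete illustration (the vectors here are chosen purely for convenience), take \(V = \mathbb {R}^2\), let \(\mathcal {C} = \{\mathbf {e}_1, \mathbf {e}_2\}\) be the standard basis, and let \(\mathcal {B} = \{\beta _1, \beta _2\}\) where \(\beta _1 = \mathbf {e}_1 + \mathbf {e}_2\) and \(\beta _2 = \mathbf {e}_1 - \mathbf {e}_2\). Reading off the coefficients column by column, the change of basis matrix from \(\mathcal {B}\) to \(\mathcal {C}\) is

\begin{equation} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end {pmatrix} \end{equation}

so if, say, \([\mathbf {v}]_{\mathcal {B}} = \begin{pmatrix} 2 \\ 3 \end {pmatrix}\) (that is, \(\mathbf {v} = 2\beta _1 + 3\beta _2\)), then

\begin{equation} [\mathbf {v}]_{\mathcal {C}} = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end {pmatrix} \begin{pmatrix} 2 \\ 3 \end {pmatrix} = \begin{pmatrix} 5 \\ -1 \end {pmatrix} \end{equation}

which agrees with expanding \(2\beta _1 + 3\beta _2 = 2(\mathbf {e}_1 + \mathbf {e}_2) + 3(\mathbf {e}_1 - \mathbf {e}_2) = 5\mathbf {e}_1 - \mathbf {e}_2\) directly.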

18.3.2.3 Representing a linear transformation as a matrix

This is just an extension of the previous section.

Let’s assume that we have a linear transformation \(T: V \to W\), and we would like to find its matrix representation. It’s really easy to get confused here, but don’t lose sight of the goal. We need some information about \(V\) and \(W\), specifically

  • A basis for \(V\), denoted as \(\mathcal {B} = \{\beta _i\}_{1 \leqq i \leqq n}\).

  • A basis for \(W\), denoted as \(\mathcal {C} = \{\gamma _i\}_{1 \leqq i \leqq m}\).

We then pick an arbitrary vector \(\mathbf {v} \in V\) and find its representation as a linear combination of the vectors in \(\mathcal {B}\), that is, we find \(v_1, v_2, ..., v_n\) such that

\begin{equation} \mathbf {v} = \sum _{1 \leqq j \leqq n} v_j \beta _j \end{equation}

But we’re not after \(\mathbf {v}\), we’re after \(T(\mathbf {v})\)! Therefore, we apply \(T\) to both sides, giving

\begin{align} T(\mathbf {v}) &= T\left (\sum _{1 \leqq j \leqq n} v_j \beta _j\right ) \\ \shortintertext {We now liberally use the fact that $T$ is linear.} &= \sum _{1 \leqq j \leqq n} v_j T(\beta _j) \\ \shortintertext {We're not dealing with a concrete linear transformation, so \say {all} we can say is that for each $j$, $T(\beta _j)$ will give us a vector in $W$ and that we can certainly write this as a linear combination of $\mathcal {C}$, as it is a basis for $W$. Every $T(\beta _j)$ is a linear combination of the $m$ vectors in $\mathcal {C}$, i.e. $T(\beta _j) = \sum _{1 \leqq i \leqq m} a_{i,j} \gamma _i$. Substituting this in, we get} &= \sum _{1 \leqq j \leqq n} v_j \left (\sum _{1 \leqq i \leqq m} a_{i,j} \gamma _i\right ) \\ &= \sum _{1 \leqq i \leqq m} \gamma _i \left (\sum _{1 \leqq j \leqq n} a_{i, j} v_j\right ) \\ &= \sum _{1 \leqq i \leqq m} \left (\sum _{1 \leqq j \leqq n} a_{i, j} v_j\right ) \gamma _i \end{align}

Now, from the definition of a co-ordinate vector, as \(\gamma _1, ..., \gamma _m\) are the vectors in the basis \(\mathcal {C}\), the representation of \(T(\mathbf {v})\) as a co-ordinate vector in this basis is just

\begin{align} [T(\mathbf {v})]_{\mathcal {C}} &= \begin{pmatrix} \sum _{1 \leqq j \leqq n} a_{1, j} v_j \\ \sum _{1 \leqq j \leqq n} a_{2, j} v_j \\ ... \\ \sum _{1 \leqq j \leqq n} a_{m, j} v_j \end {pmatrix} \\ &= \underbrace { \begin{pmatrix} a_{1, 1} & a_{1, 2} & ... & a_{1, n} \\ a_{2, 1} & a_{2, 2} & ... & a_{2, n} \\ ... & ... & ... & ... \\ a_{m, 1} & a_{m, 2} & ... & a_{m, n} \end {pmatrix} }_{\text {Let this be $\mathbf {A}$.}} \begin{pmatrix} v_1 \\ v_2 \\ ... \\ v_n \end {pmatrix} \label {where we defined A} \\ &= \mathbf {A} \, [\mathbf {v}]_{\mathcal {B}} \end{align}

This is exactly what we wanted to find: \([T(\mathbf {v})]_{\mathcal {C}} = \mathbf {A} [\mathbf {v}]_{\mathcal {B}}\). Specifically, the \(j\)th column of the matrix \(\mathbf {A}\) (as defined in Equation 18.107) is the co-ordinate vector (in the ordered basis \(\mathcal {C}\)) of \(T(\beta _j)\).
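To see this in action, here is a small worked example (the transformation and bases are chosen purely for illustration). Let \(V\) be the vector space of polynomials of degree at most \(2\), let \(W\) be the vector space of polynomials of degree at most \(1\), and let \(T: V \to W\) be differentiation, i.e. \(T(p) = p'\). Take \(\mathcal {B} = \{1, x, x^2\}\) and \(\mathcal {C} = \{1, x\}\). Then

\begin{align} & T(1) = 0 = 0 \cdot 1 + 0 \cdot x \\ & T(x) = 1 = 1 \cdot 1 + 0 \cdot x \\ & T(x^2) = 2x = 0 \cdot 1 + 2 \cdot x \end{align}

and stacking these co-ordinate vectors as the columns of \(\mathbf {A}\) gives

\begin{equation} \mathbf {A} = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 2 \end {pmatrix} \end{equation}

As a sanity check, for \(p(x) = a + bx + cx^2\) we have \([p]_{\mathcal {B}} = \begin{pmatrix} a & b & c \end {pmatrix}^{T}\), and

\begin{equation} \mathbf {A} \begin{pmatrix} a \\ b \\ c \end {pmatrix} = \begin{pmatrix} b \\ 2c \end {pmatrix} = [b + 2cx]_{\mathcal {C}} = [p'(x)]_{\mathcal {C}} \end{equation}

as expected.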