A calculus of the absurd

20.5.3 The matrix representation of a linear transformation

This is just an extension of the previous section.

Let’s assume that we have a linear transformation \(T: V \to W\), and we would like to find its matrix representation. It’s really easy to get confused here, but don’t lose sight of the goal. We need some information about \(V\) and \(W\), specifically

  • A basis for \(V\), denoted as \(\mathcal {B} = \{\beta _i\}_{1 \leqq i \leqq n}\).

  • A basis for \(W\), denoted as \(\mathcal {C} = \{\gamma _i\}_{1 \leqq i \leqq m}\).

We then pick an arbitrary vector, \(\mathbf {v} \in V\), and find its representation as a linear combination of the vectors in \(\mathcal {B}\), that is, we find

\begin{equation} \mathbf {v} = \sum _{1 \leqq j \leqq n} v_j \beta _j \end{equation}

But we’re not after \(\mathbf {v}\), we’re after \(T(\mathbf {v})\)! Therefore, we apply \(T\) to both sides, giving

\begin{align} T(\mathbf {v}) &= T\left (\sum _{1 \leqq j \leqq n} v_j \beta _j\right ) \end{align}

We now make liberal use of the fact that \(T\) is linear.

\begin{align} T(\mathbf {v}) &= \sum _{1 \leqq j \leqq n} v_j T(\beta _j) \end{align}

We’re not dealing with a concrete linear transformation, so “all” we can say is that for each \(j\), \(T(\beta _j)\) will give us a vector in \(W\), and that we can certainly write this as a linear combination of the vectors in \(\mathcal {C}\), as \(\mathcal {C}\) is a basis for \(W\). That is, every \(T(\beta _j)\) is a linear combination of the \(m\) vectors in \(\mathcal {C}\), i.e. \(T(\beta _j) = \sum _{1 \leqq i \leqq m} a_{i,j} \gamma _i\). Substituting this in, we get

\begin{align} T(\mathbf {v}) &= \sum _{1 \leqq j \leqq n} v_j \left (\sum _{1 \leqq i \leqq m} a_{i,j} \gamma _i\right ) \\ &= \sum _{1 \leqq i \leqq m} \gamma _i \left (\sum _{1 \leqq j \leqq n} a_{i, j} v_j\right ) \\ &= \sum _{1 \leqq i \leqq m} \left (\sum _{1 \leqq j \leqq n} a_{i, j} v_j\right ) \gamma _i \end{align}
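To make the coefficients \(a_{i,j}\) concrete, here is a small made-up example (not part of the derivation above): take \(T : \mathbb {R}^2 \to \mathbb {R}^3\) given by \(T(x, y) = (x + y, x - y, y)\), with the standard bases on both sides. Then

\begin{align*} T(\beta _1) &= (1, 1, 0) = 1 \gamma _1 + 1 \gamma _2 + 0 \gamma _3 \\ T(\beta _2) &= (1, -1, 1) = 1 \gamma _1 - 1 \gamma _2 + 1 \gamma _3, \end{align*}

so \(a_{1,1} = 1\), \(a_{2,1} = 1\), \(a_{3,1} = 0\), \(a_{1,2} = 1\), \(a_{2,2} = -1\) and \(a_{3,2} = 1\).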

Now, from the definition of a co-ordinate vector, as \(\gamma _1, \ldots , \gamma _m\) are the basis vectors of \(\mathcal {C}\), the representation of \(T(\mathbf {v})\) as a co-ordinate vector in this basis is just

\begin{align} [T(\mathbf {v})]_{\mathcal {C}} &= \begin{pmatrix} \sum _{1 \leqq j \leqq n} a_{1, j} v_j \\ \sum _{1 \leqq j \leqq n} a_{2, j} v_j \\ \vdots \\ \sum _{1 \leqq j \leqq n} a_{m, j} v_j \end {pmatrix} \\ &= \underbrace { \begin{pmatrix} a_{1, 1} & a_{1, 2} & \cdots & a_{1, n} \\ a_{2, 1} & a_{2, 2} & \cdots & a_{2, n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m, 1} & a_{m, 2} & \cdots & a_{m, n} \end {pmatrix} }_{\text {Let this be $\mathbf {A}$.}} \begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end {pmatrix} \label {where we defined A} \end{align}

This is exactly what we wanted to find. Specifically, the \(j\)th column of the matrix \(\mathbf {A}\) (as defined in Equation \eqref {where we defined A}) is the co-ordinate vector, in the ordered basis \(\mathcal {C}\), of \(T(\beta _j)\).
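As a worked illustration of this column rule (an example chosen here, not one from the derivation above), take differentiation \(D : P_2 \to P_1\), mapping polynomials of degree at most two to polynomials of degree at most one, with \(\mathcal {B} = \{1, x, x^2\}\) and \(\mathcal {C} = \{1, x\}\). Applying \(D\) to each basis vector of \(\mathcal {B}\) gives

\begin{align*} D(1) &= 0 = 0 \cdot 1 + 0 \cdot x \\ D(x) &= 1 = 1 \cdot 1 + 0 \cdot x \\ D(x^2) &= 2x = 0 \cdot 1 + 2 \cdot x, \end{align*}

so the columns of the matrix are \((0, 0)^T\), \((1, 0)^T\) and \((0, 2)^T\), giving

\begin{equation*} \mathbf {A} = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 2 \end{pmatrix} \end{equation*}

As a sanity check, \(p(x) = 3 + 2x + x^2\) has \([p]_{\mathcal {B}} = (3, 2, 1)^T\), and \(\mathbf {A} (3, 2, 1)^T = (2, 2)^T\), which is indeed the co-ordinate vector of \(p'(x) = 2 + 2x\) in the basis \(\mathcal {C}\).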