A calculus of the absurd

22.4.2 The matrix representation of a linear transformation

The best way to understand this is to do a lot of examples, with specific linear transformations and vector spaces. It’s easy to get lost, sinking, “not waving but drowning” in the steaming soup of generality. As they don’t say, a little reification\footnote{Meaning: turning something abstract into something concrete.} every day keeps the doctor away.

Let’s assume that we have a linear transformation \(T: V \to W\), and we would like to find its matrix representation. It’s really easy to get confused here, but don’t lose sight of the goal. We need some information about \(V\) and \(W\), specifically

  • A basis for \(V\), denoted as \(\mathcal {B} = \{\beta _i\}_{1 \leqq i \leqq n}\).

  • A basis for \(W\), denoted as \(\mathcal {C} = \{\gamma _i\}_{1 \leqq i \leqq m}\).
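As a concrete running illustration (our choice; any specific example would do), take the differentiation map \(T = \frac {\mathrm {d}}{\mathrm {d}x} : \mathcal {P}_2 \to \mathcal {P}_1\), where \(\mathcal {P}_k\) denotes the space of polynomials of degree at most \(k\), with bases \(\mathcal {B} = \{1, x, x^2\}\) (so \(n = 3\)) and \(\mathcal {C} = \{1, x\}\) (so \(m = 2\)).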

We then pick an arbitrary vector \(\mathbf {v} \in V\) and find its representation as a linear combination of the vectors in \(\mathcal {B}\), that is, we find

\begin{equation} \mathbf {v} = \sum _{1 \leqq i \leqq n} v_i \beta _i \end{equation}

But we’re not after \(\mathbf {v}\), we’re after \(T(\mathbf {v})\)! Therefore, we apply \(T\) to both sides, giving

\begin{align} T(\mathbf {v}) &= T\left (\sum _{1 \leqq j \leqq n} v_j \beta _j\right ) \end{align}

We now make liberal use of the fact that \(T\) is linear:

\begin{align} T(\mathbf {v}) &= \sum _{1 \leqq j \leqq n} v_j T(\beta _j) \end{align}

We’re not dealing with a concrete linear transformation, so “all” we can say is that, for each \(j\), \(T(\beta _j)\) will give us a vector in \(W\), and that we can certainly write this as a linear combination of \(\mathcal {C}\), as \(\mathcal {C}\) is a basis for \(W\). Every \(T(\beta _j)\) is a linear combination of the \(m\) vectors in \(\mathcal {C}\), i.e. \(T(\beta _j) = \sum _{1 \leqq i \leqq m} a_{i,j} \gamma _i\). Substituting this in, we get
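For instance (a concrete illustration of ours), take the differentiation map \(T = \frac {\mathrm {d}}{\mathrm {d}x} : \mathcal {P}_2 \to \mathcal {P}_1\) with bases \(\mathcal {B} = \{1, x, x^2\}\) and \(\mathcal {C} = \{1, x\}\). Then

\begin{align} T(\beta _1) &= 0 = 0 \cdot 1 + 0 \cdot x \\ T(\beta _2) &= 1 = 1 \cdot 1 + 0 \cdot x \\ T(\beta _3) &= 2x = 0 \cdot 1 + 2 \cdot x \end{align}

so, for example, \(a_{1,3} = 0\) and \(a_{2,3} = 2\).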

\begin{align} T(\mathbf {v}) &= \sum _{1 \leqq j \leqq n} v_j \left (\sum _{1 \leqq i \leqq m} a_{i,j} \gamma _i\right ) \\ &= \sum _{1 \leqq i \leqq m} \gamma _i \left (\sum _{1 \leqq j \leqq n} a_{i, j} v_j\right ) \\ &= \sum _{1 \leqq i \leqq m} \left (\sum _{1 \leqq j \leqq n} a_{i, j} v_j\right ) \gamma _i \end{align}

Now, from the definition of a co-ordinate vector, as \(\gamma _1, \dots , \gamma _m\) are the vectors of the basis \(\mathcal {C}\), the representation of \(T(\mathbf {v})\) as a co-ordinate vector in this basis is just

\begin{align} [T(\mathbf {v})]_{\mathcal {C}} &= \begin{pmatrix} \sum _{1 \leqq j \leqq n} a_{1, j} v_j \\ \sum _{1 \leqq j \leqq n} a_{2, j} v_j \\ \vdots \\ \sum _{1 \leqq j \leqq n} a_{m, j} v_j \end {pmatrix} \\ &= \underbrace { \begin{pmatrix} a_{1, 1} & a_{1, 2} & \cdots & a_{1, n} \\ a_{2, 1} & a_{2, 2} & \cdots & a_{2, n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m, 1} & a_{m, 2} & \cdots & a_{m, n} \end {pmatrix} }_{\text {Let this be $\mathbf {A}$.}} \begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end {pmatrix} \label {where we defined A} \end{align}

This is exactly what we wanted to find. Specifically, the \(j\)th column of the matrix \(\mathbf {A}\) (as defined in Equation 22.123) is the co-ordinate vector, in the ordered basis \(\mathcal {C}\), of \(T(\beta _j)\).
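As a worked check (again an illustration of ours), consider the differentiation map \(T = \frac {\mathrm {d}}{\mathrm {d}x} : \mathcal {P}_2 \to \mathcal {P}_1\) with \(\mathcal {B} = \{1, x, x^2\}\) and \(\mathcal {C} = \{1, x\}\). Since \(T(1) = 0\), \(T(x) = 1\) and \(T(x^2) = 2x\), the columns of \(\mathbf {A}\) are \([0]_{\mathcal {C}}\), \([1]_{\mathcal {C}}\) and \([2x]_{\mathcal {C}}\), so

\begin{equation} \mathbf {A} = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 2 \end {pmatrix} \end{equation}

Then for \(\mathbf {v} = a + bx + cx^2\) we have \([\mathbf {v}]_{\mathcal {B}} = (a, b, c)^{\mathsf {T}}\), and

\begin{equation} \mathbf {A} \begin{pmatrix} a \\ b \\ c \end {pmatrix} = \begin{pmatrix} b \\ 2c \end {pmatrix} \end{equation}

which is indeed \([b + 2cx]_{\mathcal {C}} = [T(\mathbf {v})]_{\mathcal {C}}\), as the derivative of \(a + bx + cx^2\) is \(b + 2cx\).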