19.3 Linear transformations

19.3.1 Introduction to linear transformations

Definition 19.3.1

Let $\textsf{V},\textsf{W}$ be vector spaces over a field $\mathbb{K}$. We say a function $T:\textsf{V}\to\textsf{W}$ is linear if it satisfies these two properties:

  1. For every $x,y\in\textsf{V}$,

     $T(x+y)=T(x)+T(y)$ (19.84)

  2. For every $x\in\textsf{V}$ and $\alpha\in\mathbb{K}$,

     $T(\alpha x)=\alpha T(x)$ (19.85)
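To make the definition concrete (this is an illustrative sketch of my own, not part of the original notes): any map of the form $T(x)=Mx$ for a fixed matrix $M$ satisfies both properties, while a translation such as $S(x)=x+c$ with $c\neq 0$ does not. A minimal numerical check, assuming numpy is available:

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.normal(size=(3, 2))       # a fixed 3x2 matrix; T maps R^2 -> R^3

def T(x):
    return M @ x                  # candidate linear map

x, y = rng.normal(size=2), rng.normal(size=2)
alpha = 2.5

print(np.allclose(T(x + y), T(x) + T(y)))        # property (19.84): True
print(np.allclose(T(alpha * x), alpha * T(x)))   # property (19.85): True

def S(x):
    return x + 1.0                # translation by a non-zero constant

print(np.allclose(S(x + y), S(x) + S(y)))        # fails additivity: False
```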
Theorem 19.3.1

Let $\textsf{V},\textsf{W}$ be vector spaces over a field $\mathbb{K}$. The map/function/whatever you want to call it $T:\textsf{V}\to\textsf{W}$ is linear if and only if, for every $x,y\in\textsf{V}$ and $\alpha\in\mathbb{K}$,

$T(\alpha x+y)=\alpha T(x)+T(y)$ (19.86)

Proof: there are two directions to show.

  1. Only if. We assume that $T$ is a linear transformation. Therefore, $T$ satisfies Equation 19.84, so we can write

     $T(\alpha x+y)=T(\alpha x)+T(y)$ (19.87)

     As $T$ is linear it also satisfies Equation 19.85, so by this property,

     $T(\alpha x+y)=\alpha T(x)+T(y)$ (19.89)
  2. If. We assume that Equation 19.86 holds, and therefore (this is just a restatement of the equation from the theorem)

     $T(\alpha x+y)=\alpha T(x)+T(y)$ (19.90)

     We then set $\alpha=1$, so it follows that

     $T(x+y)=T(x)+T(y)$ (19.91)

     What remains to show is that for all $\alpha$ and $x$ we have

     $T(\alpha x)=\alpha T(x)$ (19.92)

     We obtain this by fixing $y=0$: Equation 19.86 then gives $T(\alpha x)=\alpha T(x)+T(0)$. Setting $\alpha=1$ and $x=y=0$ in 19.86 shows that $T(0)=T(0)+T(0)$, so $T(0)=0$, from which the result for all $\alpha\in\mathbb{K}$ and $x\in\textsf{V}$ follows.
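As a quick numerical illustration of the single-equation criterion (again a sketch of my own, not from the original text, assuming numpy): for a map of the form $T(x)=Mx$, Equation 19.86 holds for randomly sampled $x$, $y$ and $\alpha$, and $T(0)=0$, as used in the last step of the proof.

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.normal(size=(3, 2))

def T(x):
    return M @ x

# Check the combined criterion T(alpha*x + y) == alpha*T(x) + T(y)  (Equation 19.86)
for _ in range(5):
    x, y = rng.normal(size=2), rng.normal(size=2)
    alpha = float(rng.normal())
    assert np.allclose(T(alpha * x + y), alpha * T(x) + T(y))

# T(0) = 0, the fact used when fixing y = 0 in the proof above
print(np.allclose(T(np.zeros(2)), np.zeros(3)))   # True
```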

19.3.2 The matrix representation of a linear transformation

The best way to understand this is to do a lot of examples, with specific linear transformations and vector spaces. It’s easy to get lost, sinking, “not waving but drowning”, in the steaming soup of generality. As they don’t say, a little reification (meaning turning something abstract into something concrete) every day keeps the doctor away.

Let’s assume that we have a linear transformation $T:V\to W$, and we would like to find its matrix representation. It’s really easy to get confused here, but don’t lose sight of the goal. We need some information about $V$ and $W$, specifically

  • A basis for $V$, denoted as $\mathcal{B}=\{\beta_{i}\}_{1\leqq i\leqq n}$.

  • A basis for $W$, denoted as $\mathcal{C}=\{\gamma_{i}\}_{1\leqq i\leqq m}$.
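Before continuing, it may help to see (as an illustrative sketch of my own, not from the original notes) how a co-ordinate vector with respect to such a basis can be computed numerically: store the basis vectors as the columns of a matrix and solve a linear system.

```python
import numpy as np

# The vectors of a basis C for W = R^3, stored as the columns of a matrix.
C = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0]])

w = np.array([2.0, 3.0, 4.0])

# The co-ordinate vector [w]_C solves  C @ coords = w.
coords = np.linalg.solve(C, w)
print(coords)                        # [ 3. -1.  4.]
print(np.allclose(C @ coords, w))    # True: w is a linear combination of the basis vectors
```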

We then pick an arbitrary vector, $\mathbf{v}\in V$, and find its representation as a linear combination of the basis vectors in $\mathcal{B}$, that is we find

$\mathbf{v}=\sum_{1\leqq j\leqq n}v_{j}\beta_{j}$ (19.93)

But we’re not after $\mathbf{v}$, we’re after $T(\mathbf{v})$! Therefore, we apply $T$ to both sides, giving

$T(\mathbf{v})=T\left(\sum_{1\leqq j\leqq n}v_{j}\beta_{j}\right)$ (19.94)

We now liberally use the fact that $T$ is linear.

$T(\mathbf{v})=\sum_{1\leqq j\leqq n}v_{j}T(\beta_{j})$ (19.96)

We’re not dealing with a concrete linear transformation, so “all we can say” is that for each $j$, $T(\beta_{j})$ will give us a vector in $W$, and that we can certainly write this as a linear combination of $\mathcal{C}$, as it is a basis for $W$. Every $T(\beta_{j})$ is a linear combination of the $m$ vectors in $\mathcal{C}$, i.e. $T(\beta_{j})=\sum_{1\leqq i\leqq m}a_{i,j}\gamma_{i}$. Substituting this in, we get

$T(\mathbf{v})=\sum_{1\leqq j\leqq n}v_{j}\left(\sum_{1\leqq i\leqq m}a_{i,j}\gamma_{i}\right)$ (19.97)
$=\sum_{1\leqq i\leqq m}\gamma_{i}\left(\sum_{1\leqq j\leqq n}a_{i,j}v_{j}\right)$ (19.98)
$=\sum_{1\leqq i\leqq m}\left(\sum_{1\leqq j\leqq n}a_{i,j}v_{j}\right)\gamma_{i}$ (19.99)

Now, from the definition of a co-ordinate vector, as $\gamma_{1},\dots,\gamma_{m}$ are the vectors of the ordered basis $\mathcal{C}$, the representation of $T(\mathbf{v})$ as a co-ordinate vector in this basis is just

$[T(\mathbf{v})]_{\mathcal{C}}=\begin{pmatrix}\sum_{1\leqq j\leqq n}a_{1,j}v_{j}\\ \sum_{1\leqq j\leqq n}a_{2,j}v_{j}\\ \vdots\\ \sum_{1\leqq j\leqq n}a_{m,j}v_{j}\end{pmatrix}$ (19.104)
$=\underbrace{\begin{pmatrix}a_{1,1}&a_{1,2}&\dots&a_{1,n}\\ a_{2,1}&a_{2,2}&\dots&a_{2,n}\\ \vdots&\vdots&\ddots&\vdots\\ a_{m,1}&a_{m,2}&\dots&a_{m,n}\end{pmatrix}}_{\text{Let this be }\mathbf{A}.}\begin{pmatrix}v_{1}\\ v_{2}\\ \vdots\\ v_{n}\end{pmatrix}$ (19.113)

This is exactly what we wanted to find. Specifically, the $j$th column of the matrix $\mathbf{A}$ (as defined in Equation 19.113) is the co-ordinate vector (in the ordered basis $\mathcal{C}$) of the result of $T(\beta_{j})$.
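The whole recipe can be carried out numerically for a concrete choice of $T$, $\mathcal{B}$ and $\mathcal{C}$ (an illustrative sketch under my own assumptions, not from the original notes): apply $T$ to each $\beta_{j}$, express the result in the basis $\mathcal{C}$ to obtain the $j$th column of $\mathbf{A}$, and then check that $\mathbf{A}[\mathbf{v}]_{\mathcal{B}}=[T(\mathbf{v})]_{\mathcal{C}}$ for an arbitrary $\mathbf{v}$.

```python
import numpy as np

# A concrete linear map T : R^2 -> R^3 (chosen arbitrarily for illustration).
M = np.array([[1.0, 2.0],
              [0.0, 1.0],
              [3.0, -1.0]])
def T(x):
    return M @ x

# Basis B for V = R^2 and basis C for W = R^3, stored as columns.
B = np.array([[1.0, 1.0],
              [0.0, 2.0]])
C = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0]])

# Column j of A is the co-ordinate vector of T(beta_j) with respect to C.
A = np.column_stack([np.linalg.solve(C, T(B[:, j])) for j in range(B.shape[1])])

# Check on an arbitrary v: [T(v)]_C should equal A @ [v]_B.
v = np.array([5.0, -2.0])
v_B = np.linalg.solve(B, v)          # [v]_B
Tv_C = np.linalg.solve(C, T(v))      # [T(v)]_C
print(np.allclose(A @ v_B, Tv_C))    # True
```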

19.3.3 Gaussian elimination strikes back (linear independence, span and Gaussian elimination)