A calculus of the absurd

22.2.2 Elementary matrices

An “elementary” matrix is a matrix which can be obtained by performing exactly one of the elementary row operations (introduced earlier), exactly once, on the identity matrix.

For example, let us consider the case of the \(2 \times 2\) identity matrix

\begin{equation} I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end {pmatrix} \end{equation}

If we swap the two rows, we obtain a new matrix

\begin{equation} P = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end {pmatrix} \end{equation}

Why the letter \(P\)? Because this is a \(2 \times 2\) permutation matrix: if we apply it to another matrix it will permute (i.e. reorder) the rows of that matrix, for example

\begin{equation} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end {pmatrix} \begin{pmatrix} a & b \\ c & d \end {pmatrix} = \begin{pmatrix} c & d \\ a & b \end {pmatrix} \end{equation}
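The example above can be checked mechanically. Here is a minimal sketch in plain Python (no libraries; the helper name is my own) showing that left-multiplying by \(P\) swaps the rows of the matrix it is applied to:

```python
# Left-multiplying by the 2x2 permutation matrix P swaps the rows of M,
# mirroring the (a b / c d) example above (numbers stand in for a, b, c, d).

def mat_mul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(A[i][r] * B[r][j] for r in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

P = [[0, 1],
     [1, 0]]
M = [[1, 2],   # plays the role of the row (a  b)
     [3, 4]]   # plays the role of the row (c  d)

# P swaps the two rows of M:
result = mat_mul(P, M)  # [[3, 4], [1, 2]]
```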

We can make the correspondence between elementary matrices and elementary row operations precise with the following theorem.

  • Theorem 22.2.1 If \(E\) is an elementary matrix (which by definition was obtained by performing one of the three elementary row operations a single time on \(I\)), then for any matrix \(A\) the product \(EA\) is the same as the matrix we would obtain by performing that same row operation directly on \(A\).

To prove this, we need to consider all the elementary matrices. The first class (perhaps the easiest) to consider is the elementary matrix \(E\) obtained by swapping rows \(x\) and \(y\) of the identity matrix. To prove that multiplying this with another matrix, \(A\), gives the same result as swapping rows \(x\) and \(y\) of \(A\), we first devise a formula for \(E_{i,j}\) (i.e. a formula for every element in the matrix).

\begin{equation} E_{i,j} = \begin{cases} 1 & \text {if $i = j$ and $i \ne x$ and $i \ne y$} \\ 1 & \text {if ($i = x$ and $j = y$) or ($i = y$ and $j = x$)} \\ 0 & \text {otherwise} \end {cases} \end{equation}
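As a quick sanity check, the case-by-case formula can be transcribed element by element and compared against simply swapping two rows of the identity matrix. A sketch in plain Python (0-based indices; the function name is my own):

```python
# Build the n x n elementary matrix E that swaps rows x and y, by applying
# the case-by-case formula for E[i][j] directly (0-based indices).

def swap_elementary(n, x, y):
    E = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j and i != x and i != y:
                E[i][j] = 1          # diagonal entries away from rows x, y
            elif (i == x and j == y) or (i == y and j == x):
                E[i][j] = 1          # the two off-diagonal 1s doing the swap
    return E

# For n = 3, swapping rows 0 and 2 of the identity gives:
E = swap_elementary(3, 0, 2)  # [[0, 0, 1], [0, 1, 0], [1, 0, 0]]
```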

Then, we can consider the result of multiplying \(A\) on the left by \(E\), i.e. (here \(n\) refers to the number of columns in \(E\), which equals the number of rows in \(A\))

\begin{equation} (EA)_{i,j} = \sum _{r=1}^{n} E_{i,r}A_{r,j} \end{equation}

From here we can consider a set of exhaustive cases.

  • If \(i = x\) then we have that \(E_{x,r} = 1\) if and only if \(r = y\), and \(E_{x,r} = 0\) in all other cases. This means that the sum is equal to \(E_{x,y}A_{y,j} = A_{y,j}\), as this is the only non-zero term of the sum (i.e. the \(j\)th item of the \(y\)th row of \(A\) is now the \(j\)th item of the \(x\)th row of \(EA\); the rows have been swapped, which is what we wanted to show).

  • We now get the case where \(i = y\) for free, as we can apply exactly the same logic as for the case \(i = x\) with \(x\) and \(y\) swapped (i.e. this case holds by symmetry).

  • In all other cases of \(i\) (that is, \(i \ne x\) and \(i \ne y\)) we know that \(E_{i,r} = 1\) if and only if \(r = i\). Therefore, the only non-zero term of the sum is \(E_{i,i}A_{i,j} = A_{i,j}\). That is, all the other rows of \(EA\) are identical to the corresponding rows of \(A\) (they have remained unchanged, which is exactly what would have happened had we applied the row swap to \(A\) directly).

This means that the theorem is true for any \(E\) where we swap two rows.
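The row-swap case of the theorem can also be checked numerically. A self-contained sketch in plain Python (0-based indices; all helper names are my own), comparing \(EA\) against swapping two rows of \(A\) directly:

```python
# Verify Theorem 22.2.1 in the row-swap case: left-multiplying A by the
# elementary matrix E (identity with rows x and y swapped) gives the same
# result as swapping rows x and y of A directly.

def mat_mul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(A[i][r] * B[r][j] for r in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def swap_elementary(n, x, y):
    """The n x n identity matrix with rows x and y swapped."""
    E = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    E[x], E[y] = E[y], E[x]
    return E

A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
E = swap_elementary(3, 0, 1)

# Swapping rows 0 and 1 of A directly:
directly_swapped = [A[1], A[0], A[2]]
product = mat_mul(E, A)  # equals directly_swapped
```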

TODO: prove the remaining cases (they can be shown analogously)