A calculus of the absurd

20.2.2 Elementary matrices

An “elementary” matrix is a matrix which can be obtained by performing exactly one of the elementary row operations (last seen in Section 20.2.2), once on the identity matrix.

For example, let us consider the case of the \(2 \times 2\) identity matrix

\begin{equation} I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end {pmatrix} \end{equation}

If we swap the two rows, we obtain a new matrix

\begin{equation} P = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end {pmatrix} \end{equation}

Why the letter \(P\)? Because this a \(2 \times 2\) permutation matrix - if we apply it to another matrix it will permute (i.e. reorder) the rows of that matrix, for example

\begin{equation} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end {pmatrix} \begin{pmatrix} a & b \\ c & d \end {pmatrix} = \begin{pmatrix} c & d \\ a & b \end {pmatrix} \end{equation}

This is kind of amazing when you think about it - we have a set of properties which allow us to reduce every matrix into the “row-echelon form”.

  • Theorem 20.2.1 If \(E\) is an elementary matrix (which by definition was obtained by performing one of the three fundamental row operations a single time on \(I\)) then the result of the multiplication of \(E\) by \(A\) (\(EA\)) is the same as applying the elementary row operation directly to \(A\).

To prove this, we need to consider all the elementary matrices. The first class (perhaps the easiest) to consider, is the elementary matrix which we obtain by swapping two rows of the identity matrix. To prove that multiplying this with another matrix, \(A\), gives the same result as just swapping two rows of \(A\), we first can first devise a formula for \(E_{i,j}\) (i.e. a formula for every element in the matrix).

\begin{equation} E_{i,j} = \begin{cases} 1 & \text {if $i = j$ and $i \ne x$ and $x \ne y$} \\ 1 & \text {if ($i = x$ and $j = y$) or ($i = y$ and $j =x$)} \\ 0 & \text {otherwise} \end {cases} \end{equation}

Then, we can consider the result of multiplying \(A\) by \(E\), i.e.126126 Note: here \(n\) refers to the number of columns in \(E\) and rows in \(A\)

\begin{equation} (EA)_{ij} = \sum _{1\leqq r \leqq n} \Big [ E_{i,r}A_{r,j} \Big ] \end{equation}

From here we can consider a set of exhaustive cases.

  • • If \(i=x\) then we have that \(E_{x,j} = 1\) if and only if \(r=y\) and \(0\) in all other cases. This means that the sum is equal to \(E_{x,y}A_{y,j} = 1\), as this is the only non-zero member of the sum (i.e. the jth item of the yth row is now the jth item of the xth row - the positions have been swapped which is what we wanted to show).

  • • We now get the case where \(i=y\) for free, as we can apply the exact same logic as for the case \(i=x\) by swapping \(x\) and \(y\) (i.e. this case is true by symmetry).

  • • In all other cases of \(i\) we know that \(E_{i,r} = 1\) if and only if \(i=j\), so when \(r=i=j\). Therefore, the only non-zero member of the sum is \(E_{i, i} A_{i,j} = A_{i, j}\). That is, all the other rows in \(AE\) are equivalent to all the other rows in \(A\) (they have remained unchanged, which is what would have happened had we just applied the elementary row operations to \(A\) directly).

This means that the theorem is true for any \(E\) where we swap two rows.

TODO: prove the other theorems (they can be shown analogously)