Inverses and Elementary Matrices

Matrix inversion gives a method for solving some systems of equations. Suppose

$$\eqalign{a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1 \cr a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2 \cr & \vdots \cr a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n &= b_n \cr}$$

is a system of n linear equations in n variables. Write

$$A = \left[\matrix{a_{11} & a_{12} & \ldots & a_{1n} \cr a_{21} & a_{22} & \ldots & a_{2n} \cr \vdots & \vdots & \ddots & \vdots \cr a_{n1} & a_{n2} & \ldots & a_{nn} \cr}\right], \quad x = \left[\matrix{x_1 \cr x_2 \cr \vdots \cr x_n \cr}\right], \quad b = \left[\matrix{b_1 \cr b_2 \cr \vdots \cr b_n \cr}\right].$$

The system can then be written in matrix form:

$$Ax = b.$$

(One reason for using matrix notation is that it saves writing!) If A has an inverse $A^{-1}$ , I can multiply both sides by $A^{-1}$ :

$$\eqalign{A^{-1}Ax &= A^{-1}b \cr Ix &= A^{-1}b \cr x & = A^{-1}b \cr}$$

I've solved for the vector x of unknowns.
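To make this concrete, here's a quick numerical check in Python with NumPy (the matrix A and vector b are made up for illustration):

    import numpy as np

    # A made-up invertible system Ax = b.
    A = np.array([[2.0, 1.0],
                  [5.0, 3.0]])
    b = np.array([1.0, 2.0])

    x = np.linalg.inv(A) @ b       # x = A^{-1} b
    print(x)                       # [ 1. -1.]
    print(np.allclose(A @ x, b))   # True

(In numerical work you'd usually call np.linalg.solve(A, b) rather than forming $A^{-1}$ explicitly, but computing the inverse makes the algebra above concrete.)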

Since not every matrix has an inverse, it's important to know:

  (a) Which matrices have inverses?
  (b) If a matrix has an inverse, how do you compute it?

I'll discuss these questions in this section.

Definition. An elementary matrix is a matrix which represents an elementary row operation. ("Represents" means that multiplying on the left by the elementary matrix performs the row operation.)

In the pictures below, the elements that are not shown are the same as those in the identity matrix.

$$\bordermatrix{& & i & & j & \cr & & \vdots & & \vdots & \cr i & \cdots & 0 & \cdots & 1 & \cdots \cr & & \vdots & & \vdots & \cr j &\cdots & 1 & \cdots & 0 & \cdots \cr & & \vdots & & \vdots & \cr}$$

interchanges rows i and j.

$$\bordermatrix{& & i & \cr & & \vdots & \cr i & \cdots & a & \cdots \cr & & \vdots & \cr}$$

multiplies row i by a.

$$\bordermatrix{& & i & & j & \cr & & \vdots & & \vdots & \cr i & \cdots & 1 & \cdots & a & \cdots \cr & & \vdots & & \vdots & \cr j & \cdots & 0 & \cdots & 1 & \cdots \cr & & \vdots & & \vdots & \cr}$$

replaces row i with row i plus a times row j.

Their inverses are the elementary matrices

$$\bordermatrix{& & i & & j & \cr & & \vdots & & \vdots & \cr i & \cdots & 0 & \cdots & 1 & \cdots \cr & & \vdots & & \vdots & \cr j &\cdots & 1 & \cdots & 0 & \cdots \cr & & \vdots & & \vdots & \cr}, \qquad \bordermatrix{& & i & \cr & & \vdots & \cr i & \cdots & \frac{1}{a} & \cdots \cr & & \vdots & \cr}, \qquad \bordermatrix{& & i & & j & \cr & & \vdots & & \vdots & \cr i & \cdots & 1 & \cdots & -a & \cdots \cr & & \vdots & & \vdots & \cr j & \cdots & 0 & \cdots & 1 & \cdots \cr & & \vdots & & \vdots & \cr},$$

respectively. Multiplication by the first matrix swaps rows i and j. Multiplication by the second matrix divides row i by a. Multiplication by the third matrix subtracts a times row j from row i. These operations are the inverses of the operations implemented by the original matrices.
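Here's a short Python/NumPy sketch (sizes and values made up) that builds one elementary matrix of each type by applying the row operation to the identity, then checks that left multiplication performs the operation and that the inverses behave as described:

    import numpy as np

    I = np.eye(3)

    E_swap = I[[1, 0, 2]]     # interchange rows 0 and 1
    E_scale = I.copy()
    E_scale[1, 1] = 5.0       # multiply row 1 by 5
    E_add = I.copy()
    E_add[0, 2] = 5.0         # add 5 times row 2 to row 0

    A = np.arange(9.0).reshape(3, 3)
    print(np.allclose(E_swap @ A, A[[1, 0, 2]]))   # True: left multiplication swaps rows

    # The inverses implement the inverse operations.
    print(np.allclose(np.linalg.inv(E_swap), E_swap))          # a swap undoes itself
    print(np.allclose(np.linalg.inv(E_scale)[1, 1], 1 / 5.0))  # divide row 1 by 5
    print(np.allclose(np.linalg.inv(E_add)[0, 2], -5.0))       # subtract 5 times row 2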


Example. Multiplying on the left by

$$\left[\matrix{1 & 0 & 2 \cr 0 & 1 & 0 \cr 0 & 0 & 1 \cr}\right] \quad\hbox{adds 2 times row 3 to row 1}.$$

The inverse

$$\left[\matrix{1 & 0 & 2 \cr 0 & 1 & 0 \cr 0 & 0 & 1 \cr}\right]^{-1} = \left[\matrix{1 & 0 & -2 \cr 0 & 1 & 0 \cr 0 & 0 & 1 \cr}\right] \quad\hbox{subtracts 2 times row 3 from row 1}.$$

Multiplying on the left by

$$\left[\matrix{1 & 0 & 0 \cr 0 & 0 & 1 \cr 0 & 1 & 0 \cr}\right] \quad\hbox{swaps row 2 and row 3}.$$

The inverse

$$\left[\matrix{1 & 0 & 0 \cr 0 & 0 & 1 \cr 0 & 1 & 0 \cr}\right]^{-1} = \left[\matrix{1 & 0 & 0 \cr 0 & 0 & 1 \cr 0 & 1 & 0 \cr}\right] \quad\hbox{swaps row 2 and row 3}.$$

Multiplying on the left by

$$\left[\matrix{1 & 0 & 0 \cr 0 & 17 & 0 \cr 0 & 0 & 1 \cr}\right] \quad\hbox{multiplies row 2 by 17}.$$

The inverse

$$\left[\matrix{1 & 0 & 0 \cr 0 & 17 & 0 \cr 0 & 0 & 1 \cr}\right]^{-1} = \left[\matrix{1 & 0 & 0 \cr \noalign{\vskip2pt} 0 & \dfrac{1}{17} & 0 \cr \noalign{\vskip2pt} 0 & 0 & 1 \cr}\right] \quad\hbox{divides row 2 by 17}.\quad\halmos$$


Definition. Matrices A and B are row equivalent if A can be transformed to B by a finite sequence of elementary row operations.

Remark. Since row operations may be implemented by multiplying by elementary matrices, A and B are row equivalent if and only if there are elementary matrices $E_1$ , ..., $E_n$ such that

$$E_1\cdots E_nA = B.\quad\halmos$$

Lemma. Row equivalence is an equivalence relation.

Proof. I have to show three things:

  (a) (Reflexivity) Every matrix is row equivalent to itself.
  (b) (Symmetry) If A row reduces to B, then B row reduces to A.
  (c) (Transitivity) If A row reduces to B and B row reduces to C, then A row reduces to C.

(a) is obvious, since I can row reduce a matrix to itself by performing a row operation which changes nothing (multiplying a row by 1, for instance).

For (b), suppose A row reduces to B. Write

$$E_1\cdots E_nA = B,$$

where $E_1$ , ..., $E_n$ are elementary matrices. Then

$$A = E_n^{-1}\cdots E_1^{-1}B.$$

Since the inverse of an elementary matrix is an elementary matrix, it follows that B row reduces to A.

It remains to prove (c). If A row reduces to B and B row reduces to C, then there are elementary matrices $E_1$ , ..., $E_m$ , $F_1$ , ..., $F_n$ such that

$$E_1\cdots E_mA = B \quad\hbox{and}\quad F_1\cdots F_nB = C.$$

Then

$$F_1\cdots F_nE_1\cdots E_mA = C,$$

so A row reduces to C.

Therefore, row equivalence is an equivalence relation.

Definition. An $n \times n$ matrix A is invertible if there is an $n \times n$ matrix B such that $AB = BA = I$ , where I is the $n \times n$ identity matrix.

Notation. If A is a square matrix, then

$$A^n = \cases{\overbrace{A\cdot A\cdots A}^{n\rm\;times} & if $n > 0$ \cr I & if $n = 0$ \cr \overbrace{A^{-1}\cdot A^{-1}\cdots A^{-1}}^{-n\rm\;times} & if $n < 0$ \cr}$$

(where $A^n$ for $n < 0$ only makes sense if A is invertible).

The usual rules for powers hold:

  1. $A^mA^n = A^{m+n}$ .
  2. $(A^m)^n = A^{mn}$ .

Example. Let

$$A = \left[\matrix{3 & 1 \cr 2 & 0 \cr}\right].$$

Compute $A^2$ and $A^{-2}$ .

$$A^2 = A\cdot A = \left[\matrix{3 & 1 \cr 2 & 0 \cr}\right] \left[\matrix{3 & 1 \cr 2 & 0 \cr}\right] = \left[\matrix{11 & 3 \cr 6 & 2 \cr}\right].$$

Using the formula for the inverse of a $2 \times 2$ matrix,

$$A^{-1} = \dfrac{1}{3\cdot 0 - 2\cdot 1} \left[\matrix{0 & -1 \cr -2 & 3}\right] = \left[\matrix{0 & 0.5 \cr 1 & -1.5 \cr}\right].$$

Therefore,

$$A^{-2} = A^{-1}\cdot A^{-1} = \left[\matrix{0 & 0.5 \cr 1 & -1.5 \cr}\right] \left[\matrix{0 & 0.5 \cr 1 & -1.5 \cr}\right] = \left[\matrix{0.5 & -0.75 \cr -1.5 & 2.75 \cr}\right].\quad\halmos$$
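You can check computations like this by machine. Here's a sketch using NumPy; matrix_power accepts negative exponents for invertible matrices:

    import numpy as np
    from numpy.linalg import matrix_power

    A = np.array([[3.0, 1.0],
                  [2.0, 0.0]])

    print(matrix_power(A, 2))    # [[11.  3.]  [ 6.  2.]]
    print(matrix_power(A, -2))   # [[ 0.5  -0.75]  [-1.5   2.75]]

    # The power rule A^m A^n = A^{m+n}, here with m = 2 and n = -2:
    print(np.allclose(matrix_power(A, 2) @ matrix_power(A, -2), np.eye(2)))   # True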


Proposition.

  (a) If A and B are invertible $n \times n$ matrices, then

$$(AB)^{-1} = B^{-1}A^{-1}.$$

  (b) If A is invertible, then $(A^T)^{-1} = (A^{-1})^T$ .

Proof. (a) The inverse of $AB$ is the thing which, when multiplied by $AB$ , gives the identity I. Now

$$(B^{-1}A^{-1})(AB) = B^{-1}IB = B^{-1}B = I,$$

$$(AB)(B^{-1}A^{-1}) = AIA^{-1} = AA^{-1} = I.$$

Since $B^{-1}A^{-1}$ gives the identity when multiplied by $AB$ , $B^{-1}A^{-1}$ must be the inverse of $AB$ --- that is, $(AB)^{-1} = B^{-1}A^{-1}$ .

(b) The inverse of $A^T$ is the thing which, when multiplied by $A^T$ , gives the identity I. Now

$$A^T(A^{-1})^T = \left(A^{-1}A\right)^T = I^T = I \quad\hbox{and}\quad (A^{-1})^TA^T = \left(AA^{-1}\right)^T = I^T = I.$$

Since $(A^{-1})^T$ gives the identity when multiplied by $A^T$ , $(A^{-1})^T$ must be the inverse of $A^T$ --- that is, $(A^T)^{-1} = (A^{-1})^T$ .
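Here's a quick numerical spot-check of both identities on made-up invertible matrices (a sketch assuming NumPy; adding a multiple of the identity just makes the random matrices safely invertible):

    import numpy as np
    from numpy.linalg import inv

    rng = np.random.default_rng(0)
    A = rng.random((3, 3)) + 3 * np.eye(3)   # diagonally dominant, hence invertible
    B = rng.random((3, 3)) + 3 * np.eye(3)

    print(np.allclose(inv(A @ B), inv(B) @ inv(A)))   # True
    print(np.allclose(inv(A.T), inv(A).T))            # True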

Remark. Look over the proofs of the two parts of the last proposition and be sure you understand why the computations proved the things that were to be proved. The idea is that the inverse of a matrix is defined by a property, not by appearance. By analogy, it is like the difference between the set of mathematicians (a set defined by a property) and the set of people with purple hair (a set defined by appearance).

A matrix B is the inverse of a matrix A if it has the property that multiplying B by A (in both orders) gives the identity I. So to check whether a matrix B really is the inverse of A, you multiply B by A (in both orders) and see whether you get I.


Example. Solve the following matrix equation for X, assuming that A and B are invertible:

$$A^2XBA = AB.$$

$$\eqalign{A^2XBA &= AB \cr A^{-2}A^2XBA &= A^{-2}AB \cr XBA &= A^{-1}B \cr XBAA^{-1} &= A^{-1}BA^{-1} \cr XB &= A^{-1}BA^{-1} \cr XBB^{-1} &= A^{-1}BA^{-1}B^{-1} \cr X &= A^{-1}BA^{-1}B^{-1} \cr}$$

Notice that I can multiply both sides of a matrix equation by the same thing, but I must multiply on the same side of both sides. The reason I have to be careful is that in general, $MN \ne NM$ --- matrix multiplication is not commutative.
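Here's a NumPy sketch (with made-up invertible test matrices) confirming that $X = A^{-1}BA^{-1}B^{-1}$ satisfies the original equation:

    import numpy as np
    from numpy.linalg import inv

    rng = np.random.default_rng(1)
    A = rng.random((3, 3)) + 3 * np.eye(3)   # diagonally dominant, hence invertible
    B = rng.random((3, 3)) + 3 * np.eye(3)

    X = inv(A) @ B @ inv(A) @ inv(B)

    # Check that A^2 X B A = A B.
    print(np.allclose(A @ A @ X @ B @ A, A @ B))   # True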


Example. Note that $(A + B)^{-1} \ne A^{-1} + B^{-1}$ . In fact, if A and B are invertible, $A + B$ need not be invertible. For example, if

$$A = \left[\matrix{1 & 0 \cr 0 & 1 \cr}\right] \quad\hbox{and}\quad B = \left[\matrix{-1 & 0 \cr 0 & -1 \cr}\right],$$

then A and B are invertible --- each is its own inverse.

But

$$A + B = \left[\matrix{0 & 0 \cr 0 & 0 \cr}\right],$$

which is not invertible.


Theorem. Let A be an $n \times n$ matrix. The following are equivalent:

  (a) A is row equivalent to I.
  (b) A is a product of elementary matrices.
  (c) A is invertible.
  (d) The system

$$Ax = 0$$

has only the trivial solution $x = 0$ .

  (e) For any n-dimensional vector b, the system

$$Ax = b$$

has a unique solution.

Proof. When you are trying to prove several statements are equivalent, you must prove that if you assume any one of the statements, you can prove any of the others. I can do this here by proving that (a) implies (b), (b) implies (c), (c) implies (d), (d) implies (e), and (e) implies (a).

(a) $\Rightarrow$ (b): Let $E_1, \ldots, E_p$ be elementary matrices which row reduce A to I:

$$E_1 \cdots E_p A = I.$$

Then

$$A = E_p^{-1} \cdots E_1^{-1}.$$

Since the inverse of an elementary matrix is an elementary matrix, A is a product of elementary matrices.

(b) $\Rightarrow$ (c): Write A as a product of elementary matrices:

$$A = F_1 \cdots F_q.$$

Now

$$F_1 \cdots F_q \cdot F_q^{-1} \cdots F_1^{-1} = I,$$

$$F_q^{-1} \cdots F_1^{-1} \cdot F_1 \cdots F_q = I.$$

Hence,

$$A^{-1} = F_q^{-1} \cdots F_1^{-1}.$$

(c) $\Rightarrow$ (d): Suppose A is invertible. The system $Ax = 0$ has at least one solution, namely $x = 0$ .

Moreover, if y is any other solution, then

$$Ay = 0, \quad\hbox{so}\quad A^{-1}Ay = A^{-1}0, \quad\hbox{or}\quad y = 0.$$

That is, 0 is the one and only solution to the system.

(d) $\Rightarrow$ (e): Suppose the only solution to $Ax = 0$ is $x = 0$ . If $A = (a_{ij})$ , this means that row reducing the augmented matrix

$$\left[\matrix{a_{11} & a_{12} & \cdots & a_{1n} & 0 \cr a_{21} & a_{22} & \cdots & a_{2n} & 0 \cr \vdots & \vdots & \ddots & \vdots & \vdots \cr a_{n1} & a_{n2} & \cdots & a_{nn} & 0 \cr}\right] \quad\hbox{produces}\quad \left[\matrix{1 & 0 & \cdots & 0 & 0 \cr 0 & 1 & \cdots & 0 & 0 \cr \vdots & \vdots & \ddots & \vdots & \vdots \cr 0 & 0 & \cdots & 1 & 0 \cr}\right].$$

Ignoring the last column (which never changes), this means there is a sequence of elementary matrices $E_1$ , ..., $E_n$ whose corresponding row operations reduce A to the identity I --- that is, A is row equivalent to I. (I've actually proved (d) $\Rightarrow$ (a) at this point.)

Let $b = \langle b_1, \ldots, b_n\rangle$ be an arbitrary n-dimensional vector. Then

$$E_1\cdots E_n \left[\matrix{a_{11} & a_{12} & \cdots & a_{1n} & b_1 \cr a_{21} & a_{22} & \cdots & a_{2n} & b_2 \cr \vdots & \vdots & \ddots & \vdots & \vdots \cr a_{n1} & a_{n2} & \cdots & a_{nn} & b_n \cr}\right] = \left[\matrix{1 & 0 & \cdots & 0 & b_1' \cr 0 & 1 & \cdots & 0 & b_2' \cr \vdots & \vdots & \ddots & \vdots & \vdots \cr 0 & 0 & \cdots & 1 & b_n' \cr}\right].$$

Thus, $z = \langle b_1', \ldots, b_n'\rangle$ is a solution.

Suppose y is another solution to $Ax = b$ . Then

$$A(y - z) = Ay - Az = b - b = 0.$$

Therefore, $y - z$ is a solution to $Ax = 0$ . But the only solution to $Ax = 0$ is 0, so $y - z = 0$ , or $y = z$ . Thus, $z = \langle b_1', \ldots, b_n'\rangle$ is the unique solution to $Ax = b$ .

(e) $\Rightarrow$ (a): Suppose $Ax = b$ has a unique solution for every b. As a special case, $Ax = 0$ has a unique solution (namely $x = 0$ ). But arguing as I did in (d) $\Rightarrow$ (e), I can show that A row reduces to I, and that is (a).


Example. (Writing an invertible matrix as a product of elementary matrices) If A is invertible, the theorem implies that A can be written as a product of elementary matrices. To do this, row reduce A to the identity, keeping track of the row operations you're using. Write each row operation as an elementary matrix, and express the row reduction as a matrix multiplication. Finally, solve the resulting equation for A.

For example, suppose

$$A = \left[\matrix{2 & -4 \cr -2 & 5 \cr}\right].$$

Row reduce A to I:

$$\left[\matrix{2 & -4 \cr -2 & 5 \cr}\right] \matrix{\rightarrow \cr r_1 \to r_1/2 \cr} \left[\matrix{1 & -2 \cr -2 & 5 \cr}\right] \matrix{\rightarrow \cr r_2 \to r_2 + 2r_1 \cr}$$

$$\left[\matrix{1 & -2 \cr 0 & 1 \cr}\right] \matrix{\rightarrow \cr r_1 \to r_1 + 2r_2 \cr} \left[\matrix{1 & 0 \cr 0 & 1 \cr}\right]$$

Represent each row operation as an elementary matrix:

$$r_1 \to \dfrac{1}{2}r_1 \quad\hbox{corresponds to} \quad \left[\matrix{\dfrac{1}{2} & 0 \cr \noalign{\vskip2pt} 0 & 1 \cr}\right],$$

$$r_2 \to r_2 + 2r_1 \quad\hbox{corresponds to} \quad \left[\matrix{1 & 0 \cr 2 & 1 \cr}\right],$$

$$r_1 \to r_1 + 2r_2 \quad\hbox{corresponds to} \quad \left[\matrix{1 & 2 \cr 0 & 1 \cr}\right].$$

Write the row reduction as a matrix multiplication. A must be multiplied on the left by the elementary matrices in the order in which the operations were performed.

$$\left[\matrix{1 & 2 \cr 0 & 1 \cr}\right] \left[\matrix{1 & 0 \cr 2 & 1 \cr}\right] \left[\matrix{\dfrac{1}{2} & 0 \cr \noalign{\vskip2pt} 0 & 1 \cr}\right]\cdot A = I.$$

Now solve for A, being careful to get the inverses in the right order:

$$\left[\matrix{\dfrac{1}{2} & 0 \cr \noalign{\vskip2pt} 0 & 1 \cr}\right]^{-1} \left[\matrix{1 & 0 \cr 2 & 1 \cr}\right]^{-1} \left[\matrix{1 & 2 \cr 0 & 1 \cr}\right]^{-1} \left[\matrix{1 & 2 \cr 0 & 1 \cr}\right] \left[\matrix{1 & 0 \cr 2 & 1 \cr}\right] \left[\matrix{\dfrac{1}{2} & 0 \cr \noalign{\vskip2pt} 0 & 1 \cr}\right]\cdot A = \left[\matrix{\dfrac{1}{2} & 0 \cr \noalign{\vskip2pt} 0 & 1 \cr}\right]^{-1} \left[\matrix{1 & 0 \cr 2 & 1 \cr}\right]^{-1} \left[\matrix{1 & 2 \cr 0 & 1 \cr}\right]^{-1}\cdot I,$$

$$A = \left[\matrix{\dfrac{1}{2} & 0 \cr \noalign{\vskip2pt} 0 & 1 \cr}\right]^{-1} \left[\matrix{1 & 0 \cr 2 & 1 \cr}\right]^{-1} \left[\matrix{1 & 2 \cr 0 & 1 \cr}\right]^{-1}.$$

Finally, write each inverse as an elementary matrix.

$$A = \left[\matrix{2 & 0 \cr 0 & 1 \cr}\right] \left[\matrix{1 & 0 \cr -2 & 1 \cr}\right] \left[\matrix{1 & -2 \cr 0 & 1 \cr}\right].\quad\halmos$$
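As a check, here's a NumPy sketch multiplying out the three elementary matrices to recover A:

    import numpy as np

    E1 = np.array([[2.0, 0.0], [0.0, 1.0]])    # multiply row 1 by 2
    E2 = np.array([[1.0, 0.0], [-2.0, 1.0]])   # subtract 2 times row 1 from row 2
    E3 = np.array([[1.0, -2.0], [0.0, 1.0]])   # subtract 2 times row 2 from row 1

    print(E1 @ E2 @ E3)   # [[ 2. -4.]  [-2.  5.]], which is A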


Corollary. If A and B are $n \times n$ matrices and $AB = I$ , then $A = B^{-1}$ and $BA = I$ .

Proof. Suppose A and B are $n \times n$ matrices and $AB = I$ . The system

$$Bx = 0$$

certainly has $x = 0$ as a solution. I'll show it's the only solution.

Suppose y is another solution, so

$$By = 0.$$

Multiply both sides by A and simplify:

$$\eqalign{ABy &= A\cdot 0 \cr Iy &= 0 \cr y &= 0 \cr}$$

Thus, 0 is a solution, and it's the only solution.

Thus, B satisfies condition (d) of the Theorem. Since the five conditions are equivalent, B also satisfies condition (c), so B is invertible. Let $B^{-1}$ be the inverse of B. Then

$$\eqalign{AB & = I \cr ABB^{-1} &= IB^{-1} \cr AI &= B^{-1} \cr A &= B^{-1} \cr}$$

This proves the first part of the Corollary. Finally,

$$BA = BB^{-1} = I.$$

This finishes the proof.

Remark. This result shows that if you're checking that two square matrices A and B are inverses by multiplying to get the identity, you only need to check $AB = I$ --- $BA = I$ is then automatic.

Algorithm. The proof provides an algorithm for inverting a matrix A.

If $E_1, \ldots, E_p$ are elementary matrices which row reduce A to I, then

$$E_1 \cdots E_p A = I.$$

Then

$$A = E_p^{-1} \cdots E_1^{-1} \quad\hbox{so}\quad A^{-1} = E_1 \cdots E_p\cdot I.$$

That is, the row operations which reduce A to the identity also transform the identity into $A^{-1}$ .
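Here's a minimal Python implementation of this algorithm (a sketch, not production code: it assumes NumPy, uses floating point with partial pivoting, and the name invert is mine). It row reduces the augmented matrix $[A \mid I]$ and returns the right-hand block:

    import numpy as np

    def invert(A):
        """Invert A by row reducing [A | I] to [I | A^(-1)]."""
        n = A.shape[0]
        M = np.hstack([A.astype(float), np.eye(n)])   # the augmented matrix
        for i in range(n):
            # Partial pivoting: swap in the row with the largest entry in column i.
            p = i + np.argmax(np.abs(M[i:, i]))
            if np.isclose(M[p, i], 0.0):
                raise ValueError("matrix is not invertible")
            M[[i, p]] = M[[p, i]]
            M[i] /= M[i, i]                  # scale the pivot row
            for j in range(n):
                if j != i:
                    M[j] -= M[j, i] * M[i]   # clear column i in the other rows
        return M[:, n:]

    A = np.array([[1, 2, -1], [-1, -1, 3], [0, 1, 1]])
    print(invert(A))   # matches the inverse computed in the next example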


Example. To invert

$$\left[\matrix{1 & 2 & -1 \cr -1 & -1 & 3 \cr 0 & 1 & 1 \cr}\right],$$

form the augmented matrix

$$\left[\matrix{1 & 2 & -1 \cr -1 & -1 & 3 \cr 0 & 1 & 1 \cr}\right| \left.\matrix{1 & 0 & 0 \cr 0 & 1 & 0 \cr 0 & 0 & 1 \cr}\right].$$

Next, row reduce the augmented matrix. The row operations are entirely determined by the block on the left, which is the original matrix. The row operations turn the left block into the identity, while simultaneously turning the identity on the right into the inverse.

$$\left[\matrix{1 & 2 & -1 \cr -1 & -1 & 3 \cr 0 & 1 & 1 \cr}\right| \left.\matrix{1 & 0 & 0 \cr 0 & 1 & 0 \cr 0 & 0 & 1 \cr}\right] \quad\matrix{\rightarrow \cr r_2\to r_2 + r_1 \cr}\quad \left[\matrix{1 & 2 & -1 \cr 0 & 1 & 2 \cr 0 & 1 & 1 \cr}\right| \left.\matrix{1 & 0 & 0 \cr 1 & 1 & 0 \cr 0 & 0 & 1 \cr}\right] \quad\matrix{\rightarrow \cr r_3\to r_3 - r_2 \cr}$$

$$\left[\matrix{1 & 2 & -1 \cr 0 & 1 & 2 \cr 0 & 0 & -1 \cr}\right| \left.\matrix{1 & 0 & 0 \cr 1 & 1 & 0 \cr -1 & -1 & 1 \cr}\right] \quad\matrix{\rightarrow \cr r_1\to r_1 - 2r_2 \cr}\quad \left[\matrix{1 & 0 & -5 \cr 0 & 1 & 2 \cr 0 & 0 & -1 \cr}\right| \left.\matrix{-1 & -2 & 0 \cr 1 & 1 & 0 \cr -1 & -1 & 1 \cr}\right] \quad\matrix{\rightarrow \cr r_3\to -r_3 \cr}$$

$$\left[\matrix{1 & 0 & -5 \cr 0 & 1 & 2 \cr 0 & 0 & 1 \cr}\right| \left.\matrix{-1 & -2 & 0 \cr 1 & 1 & 0 \cr 1 & 1 & -1 \cr}\right] \quad\matrix{\rightarrow \cr r_1\to r_1 + 5r_3 \cr}\quad \left[\matrix{1 & 0 & 0 \cr 0 & 1 & 2 \cr 0 & 0 & 1 \cr}\right| \left.\matrix{4 & 3 & -5 \cr 1 & 1 & 0 \cr 1 & 1 & -1 \cr}\right] \quad\matrix{\rightarrow \cr r_2\to r_2 - 2r_3 \cr}$$

$$\left[\matrix{1 & 0 & 0 \cr 0 & 1 & 0 \cr 0 & 0 & 1 \cr}\right| \left.\matrix{4 & 3 & -5 \cr -1 & -1 & 2 \cr 1 & 1 & -1 \cr}\right].$$

Thus,

$$A^{-1} = \left[\matrix{4 & 3 & -5 \cr -1 & -1 & 2 \cr 1 & 1 & -1 \cr}\right].\quad\halmos$$


Example. (Inverting a matrix over $\integer_p$ ) Find the inverse of the following matrix over $\integer_3$ :

$$\left[\matrix{1 & 0 & 2 \cr 1 & 1 & 1 \cr 2 & 1 & 1 \cr}\right].$$

$$\left[\matrix{1 & 0 & 2 & 1 & 0 & 0 \cr 1 & 1 & 1 & 0 & 1 & 0 \cr 2 & 1 & 1 & 0 & 0 & 1 \cr}\right] \matrix{\rightarrow \cr r_2\to r_2-r_1 \cr} \left[\matrix{1 & 0 & 2 & 1 & 0 & 0 \cr 0 & 1 & 2 & 2 & 1 & 0 \cr 2 & 1 & 1 & 0 & 0 & 1 \cr}\right] \matrix{\rightarrow \cr r_3\to r_3-2r_1 \cr}$$

$$\left[\matrix{1 & 0 & 2 & 1 & 0 & 0 \cr 0 & 1 & 2 & 2 & 1 & 0 \cr 0 & 1 & 0 & 1 & 0 & 1 \cr}\right] \matrix{\rightarrow \cr r_3\to r_3-r_2 \cr} \left[\matrix{1 & 0 & 2 & 1 & 0 & 0 \cr 0 & 1 & 2 & 2 & 1 & 0 \cr 0 & 0 & 1 & 2 & 2 & 1 \cr}\right] \matrix{\rightarrow \cr r_1\to r_1-2r_3 \cr}$$

$$\left[\matrix{1 & 0 & 0 & 0 & 2 & 1 \cr 0 & 1 & 2 & 2 & 1 & 0 \cr 0 & 0 & 1 & 2 & 2 & 1 \cr}\right] \matrix{\rightarrow \cr r_2\to r_2-2r_3 \cr} \left[\matrix{1 & 0 & 0 & 0 & 2 & 1 \cr 0 & 1 & 0 & 1 & 0 & 1 \cr 0 & 0 & 1 & 2 & 2 & 1 \cr}\right]$$

Therefore,

$$\left[\matrix{1 & 0 & 2 \cr 1 & 1 & 1 \cr 2 & 1 & 1 \cr}\right]^{-1} = \left[\matrix{0 & 2 & 1 \cr 1 & 0 & 1 \cr 2 & 2 & 1 \cr}\right]. \quad\halmos$$
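The same elimination works over $\integer_p$ , provided each pivot row is multiplied by the inverse of the pivot mod p instead of being divided. Here's a Python sketch (the helper name invert_mod_p is mine; pow(x, -1, p) computes the inverse of x mod p):

    import numpy as np

    def invert_mod_p(A, p):
        """Invert A over Z_p by row reducing [A | I] mod p."""
        n = A.shape[0]
        M = np.hstack([A % p, np.eye(n, dtype=int)])
        for i in range(n):
            # Find a row with a nonzero (hence invertible) pivot in column i.
            pivots = [r for r in range(i, n) if M[r, i] % p != 0]
            if not pivots:
                raise ValueError("matrix is not invertible mod p")
            M[[i, pivots[0]]] = M[[pivots[0], i]]
            M[i] = (M[i] * pow(int(M[i, i]), -1, p)) % p   # scale the pivot row
            for j in range(n):
                if j != i:
                    M[j] = (M[j] - M[j, i] * M[i]) % p     # clear column i
        return M[:, n:]

    A = np.array([[1, 0, 2], [1, 1, 1], [2, 1, 1]])
    print(invert_mod_p(A, 3))   # [[0 2 1] [1 0 1] [2 2 1]], as computed above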


Proposition. Let F be a field, and let $Ax = b$ be a system of linear equations over F. Then:

  1. If F is infinite, then the system has either no solutions, exactly one solution, or infinitely many solutions.
  2. If F is a finite field with $p^n$ elements, where p is prime and $n \ge 1$ , then the system has either no solutions, exactly one solution, or at least $p^n$ solutions.

Proof. Suppose the system has more than one solution; I must show that there are infinitely many solutions if F is infinite, and at least $p^n$ solutions if F is a finite field with $p^n$ elements.

Let $x_1$ and $x_2$ be distinct solutions to $Ax = b$ , so

$$Ax_1 = b \quad\hbox{and}\quad Ax_2 = b.$$

Note that

$$A(x_1 - x_2) = Ax_1 - Ax_2 = b - b = 0.$$

Since $x_1 - x_2 \ne 0$ , $x_1 - x_2$ is a nontrivial solution to the system $Ax = 0$ . Now if $t \in F$ ,

$$A\left(x_1 + t(x_1 - x_2)\right) = Ax_1 + t\cdot A(x_1 - x_2) = b + 0 = b.$$

Thus, $x_1 + t(x_1 - x_2)$ is a solution to $Ax = b$ . Moreover, the only way two solutions of the form $x_1 + t(x_1 - x_2)$ can be the same is if they have the same t. For

$$x_1 + t(x_1 - x_2) = x_1 + t'(x_1 - x_2) \quad\hbox{gives}\quad (t - t')x_1 = (t - t')x_2.$$

Now $t - t' \ne 0$ implies $x_1 = x_2$ , a contradiction. Therefore, $t - t' = 0$ , so $t = t'$ .

Thus, different t's give different $x_1 + t(x_1 - x_2)$ 's, each of which is a solution to the system.

If F has infinitely many elements, there are infinitely many possibilities for t, so there are infinitely many solutions.

If F has $p^n$ elements, there are $p^n$ possibilities for t, so there are at least $p^n$ solutions. (Note that there may be solutions which are not of the form $x_1 + t(x_1 - x_2)$ , so there may be more than $p^n$ solutions. In fact, I'll be able to show later that the number of solutions will be some power of $p^n$ .)


Example. Since $\real$ is an infinite field, a system of linear equations over $\real$ has no solutions, exactly one solution, or infinitely many solutions.

Since $\integer_3$ is a field with 3 elements, a system of linear equations over $\integer_3$ has no solutions, exactly one solution, or at least 3 solutions. (And I'll see later that if there's more than one solution, then there might be 3 solutions, 9 solutions, 27 solutions, ....)
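Here's a brute-force Python check on a made-up singular system over $\integer_3$ , illustrating the count:

    import numpy as np
    from itertools import product

    # Over Z_3 this A is singular (row 2 is 2 times row 1 mod 3),
    # so Ax = 0 has nontrivial solutions.
    A = np.array([[1, 2], [2, 1]])
    p = 3

    solutions = [x for x in product(range(p), repeat=2)
                 if np.all((A @ np.array(x)) % p == 0)]
    print(solutions)        # [(0, 0), (1, 1), (2, 2)]
    print(len(solutions))   # 3 solutions, consistent with the proposition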

