A Proof of Lagrange’s Four Square Theorem
         Using Quaternion Algebras
                            Drew Stokesbary
                               Spring 2007


                                 Abstract
         Many prime numbers can be expressed as a sum of the squares of
     two other numbers. This paper explores which numbers can be written
     as a sum of the squares of four numbers. This question is deeply
     related to a number system known as quaternion algebra, which will
     be developed in this paper to describe what numbers can be written
     as the sum of four squares.

    In 1770, Joseph Louis Lagrange proved that every positive integer can be
expressed as a sum of the squares of four integers. This result has since been
called Lagrange’s Theorem, and we will eventually prove it here. In
order to reach this conclusion, we will introduce a number system called a
Quaternion Algebra, which may be roughly thought of as an extension of the
complex numbers or the Gaussian integers.
    After introducing the fundamentals of quaternions and exploring some
peculiarities of arithmetic in this number system, we will begin traveling
along the road towards a proof of Lagrange’s Theorem. We will use the
arithmetic of quaternions to examine the notion of a norm and of a unit
in this number system. These concepts will allow us to develop a division
algorithm for quaternions and prove the existence of a greatest (right-hand)
common divisor. Finally, we will see what it means to be a prime quaternion.
    All of these properties of quaternions will eventually allow us to prove
that every positive integer can be written as a sum of four squares.




1      Quaternion Arithmetic
Formally, we define the quaternions H to be the set of all ordered quadruples
(a1 , a2 , a3 , a4 ), where a1 , a2 , a3 , a4 ∈ R. Addition is defined as

(1)      (a1 , a2 , a3 , a4 ) + (b1 , b2 , b3 , b4 ) = (a1 + b1 , a2 + b2 , a3 + b3 , a4 + b4 ),

and multiplication is defined as

(2)                  (a1 , a2 , a3 , a4 ) · (b1 , b2 , b3 , b4 ) = (c1 , c2 , c3 , c4 ),

where
                               c1 = a1 b1 − a2 b2 − a3 b3 − a4 b4 ,
                               c2 = a1 b2 + a2 b1 + a3 b4 − a4 b3 ,
(3)
                               c3 = a1 b3 − a2 b4 + a3 b1 + a4 b2 ,
                               c4 = a1 b4 + a2 b3 − a3 b2 + a4 b1 .

   We use Greek letters to represent quaternions, and if α = (a1 , a2 , a3 , a4 ),
then we say that a1 , a2 , a3 , and a4 are the coordinates of α.

Definition. If α = (a1 , a2 , a3 , a4 ) and β = (b1 , b2 , b3 , b4 ), then α and β are
equal if (a1 , a2 , a3 , a4 ) = (b1 , b2 , b3 , b4 ).

    The equation x^2 = −1 has the quadruples (0, 1, 0, 0), (0, 0, 1, 0), and
(0, 0, 0, 1) as solutions. We denote each solution quadruple by the symbols i,
j, and k, respectively. In other words,

(4)                                         i = (0, 1, 0, 0),
(5)                                         j = (0, 0, 1, 0),
(6)                                     and k = (0, 0, 0, 1).

      Although

(7)                                     i^2 = j^2 = k^2 = −1,

we can see from the definition of equality that i, j, and k are pairwise
distinct:

(8)                                  i ≠ j,   j ≠ k,   i ≠ k.


We then define i, j, and k so that

                                       jk = i = −kj,
(9)                                    ki = j = −ik,
                                       ij = k = −ji.

From this fact, we can see that multiplication of quaternions is not commuta-
tive. This peculiarity about quaternions is what sets them apart from other
number systems.
    If α = (a1 , a2 , a3 , a4 ), then we can also write α = a1 + a2 i + a3 j + a4 k. In
fact, the aforementioned definition of multiplication comes from expanding
the product (a1 + a2 i + a3 j + a4 k)(b1 + b2 i + b3 j + b4 k) according to the
distributive property and the multiplicative definitions of i, j, and k.
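
As a quick check of equations (3) and (9), the multiplication rule can be transcribed into a few lines of Python. This is only an illustrative sketch: the tuple representation and the names qmul and qneg are our own choices and are not part of the development above.

```python
def qmul(a, b):
    """Multiply two quaternions given as 4-tuples, following equation (3)."""
    a1, a2, a3, a4 = a
    b1, b2, b3, b4 = b
    return (
        a1 * b1 - a2 * b2 - a3 * b3 - a4 * b4,
        a1 * b2 + a2 * b1 + a3 * b4 - a4 * b3,
        a1 * b3 - a2 * b4 + a3 * b1 + a4 * b2,
        a1 * b4 + a2 * b3 - a3 * b2 + a4 * b1,
    )


def qneg(a):
    """Negate every coordinate of a quaternion."""
    return tuple(-x for x in a)


# The quadruples of equations (4) through (6).
I = (0, 1, 0, 0)
J = (0, 0, 1, 0)
K = (0, 0, 0, 1)

print(qmul(I, I))  # (-1, 0, 0, 0), i.e. i^2 = -1
print(qmul(I, J))  # (0, 0, 0, 1),  i.e. ij = k
print(qmul(J, I))  # (0, 0, 0, -1), i.e. ji = -k
```

The last two lines show directly that ij and ji differ, which is the failure of commutativity described above.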
    In addition to the set H of quaternions, which we defined as

(10)                   H = {(a1 , a2 , a3 , a4 ) | a1 , a2 , a3 , a4 ∈ R},

we can define other sets of quaternions, such as

(11)                   HQ = {(a1 , a2 , a3 , a4 ) | a1 , a2 , a3 , a4 ∈ Q},
(12)               and HZ = {(a1 , a2 , a3 , a4 ) | a1 , a2 , a3 , a4 ∈ Z}.

      We will now consider a subset of the quaternions, which from here on is
the set denoted by H , and which we define as

(13)      H = {(a1 , a2 , a3 , a4 ) | a1 , a2 , a3 , a4 ∈ Z or a1 , a2 , a3 , a4 ∈ (1/2)ZOdd },

where (1/2)ZOdd is the set of all odd integers divided by 2. The reasoning for
defining this odd (no pun intended!) set of quaternions will soon become
clear, but before we explore some of the deeper properties of quaternions, we
will first show that the set H is closed under addition and multiplication,
and that the set is a non-commutative ring.

Theorem 1. H is closed under addition.

Proof. Suppose α, β ∈ H and

(14)                             α = a 1 + a2 i + a3 j + a4 k
(15)                         and β = b1 + b2 i + b3 j + b4 k.

Then, α+β = (a1 , a2 , a3 , a4 )+(b1 , b2 , b3 , b4 ) = (a1 +b1 , a2 +b2 , a3 +b3 , a4 +b4 ).

If the coordinates of α and β are integers, then the coordinates of α + β
are also integers, and so α + β ∈ H .
    If the coordinates of α and β are not integers, then they are odd integers
divided by 2. The sum of any two odd integers is an even integer, and an
even integer divided by two is always an integer. Thus if the coordinates of
α and β are not integers, then the coordinates of α + β are integers, and
α+β ∈H.
    If the coordinates of either α or β are integers, and the coordinates of
the other are not integers, then the coordinates of α + β are the sums of an
integer and half of an odd integer. Each of these sums will be half of an odd
integer, and thus α + β ∈ H .
Theorem 2. H is closed under multiplication.
Proof. First, we will define a special quaternion ξ so that

(16)                            ξ = (1/2)(1 + i + j + k).

Thus, any integer quaternion can now be written in the manner

(17)                         ρ = r1 ξ + r2 i + r3 j + r4 k,

where r1 , r2 , r3 , r4 ∈ Z. If the coordinates of ρ are integers, then r1 will be
even, and if the coordinates of ρ are non-integers, then r1 will be odd. Any
quaternion written in this form will be an integer quaternion.
   Using the definition of multiplication, we then compute

(18)   ξ^2 = (1/2)(1 + i + j + k) · (1/2)(1 + i + j + k) = (1/2)(−1 + i + j + k) = ξ − 1,
(19)   ξi = (1/2)(1 + i + j + k) i = (1/2)(−1 + i + j − k) = −ξ + i + j,
(20)   iξ = i (1/2)(1 + i + j + k) = (1/2)(−1 + i − j + k) = −ξ + i + k,
(21)   ξj = (1/2)(1 + i + j + k) j = (1/2)(−1 − i + j + k) = −ξ + j + k,
(22)   jξ = j (1/2)(1 + i + j + k) = (1/2)(−1 + i + j − k) = −ξ + i + j,
(23)   ξk = (1/2)(1 + i + j + k) k = (1/2)(−1 + i − j + k) = −ξ + i + k,
(24)   kξ = k (1/2)(1 + i + j + k) = (1/2)(−1 − i + j + k) = −ξ + j + k.

Each of these products is in the form of equation (17), so each is itself an
integer quaternion.
    If we take α = (a1 , a2 , a3 , a4 ) and β = (b1 , b2 , b3 , b4 ), with α, β ∈ H , and
rewrite each so that it is in the form of equation (17), then by the definition of
multiplication and the results of equations (18) through (24), we can see that
the product αβ can also be written in the form of equation (17). Therefore,
αβ is an integer quaternion, and H is indeed closed under multiplication.
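
Theorems 1 and 2 can also be spot-checked numerically. The sketch below is not a proof: it only tests a small finite sample of elements of H, using exact rational arithmetic. The helper names (qadd, qmul, is_hurwitz) and the choice of sample are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product


def qadd(a, b):
    """Coordinate-wise sum, as in equation (1)."""
    return tuple(x + y for x, y in zip(a, b))


def qmul(a, b):
    """Quaternion product, as in equation (3)."""
    a1, a2, a3, a4 = a
    b1, b2, b3, b4 = b
    return (
        a1 * b1 - a2 * b2 - a3 * b3 - a4 * b4,
        a1 * b2 + a2 * b1 + a3 * b4 - a4 * b3,
        a1 * b3 - a2 * b4 + a3 * b1 + a4 * b2,
        a1 * b4 + a2 * b3 - a3 * b2 + a4 * b1,
    )


def is_hurwitz(q):
    """True if all coordinates are integers, or all are halves of odd integers."""
    denominators = {Fraction(x).denominator for x in q}
    return denominators == {1} or denominators == {2}


HALF = Fraction(1, 2)
XI = (HALF, HALF, HALF, HALF)  # the quaternion of equation (16)

# Sample: integer coordinates in {-1, 0, 1}, half-odd coordinates in {-1/2, 1/2}.
ints = [tuple(Fraction(x) for x in t) for t in product((-1, 0, 1), repeat=4)]
halves = [tuple(Fraction(x, 2) for x in t) for t in product((-1, 1), repeat=4)]
sample = ints + halves

closed = all(
    is_hurwitz(qadd(a, b)) and is_hurwitz(qmul(a, b))
    for a in sample
    for b in sample
)
print(closed)  # True
```

Using Fraction rather than floating point keeps the test for "half of an odd integer" exact: a reduced fraction has denominator 2 precisely when its numerator is odd.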
   Since we have shown that H is closed under addition and multiplication,
we will now show that H is a non-commutative ring.
Theorem 3. H is a non-commutative ring.
Proof. If we say that α, β, and γ are of the form

(25)                                α = (a1 , a2 , a3 , a4 ),
(26)                                β = (b1 , b2 , b3 , b4 ),
(27)                            and γ = (c1 , c2 , c3 , c4 ),

we can then show that each of the seven conditions for a non-commutative
ring holds.
    For all α, β ∈ H , we see from the definition of addition that

                     α + β = (a1 , a2 , a3 , a4 ) + (b1 , b2 , b3 , b4 )
                           = (a1 + b1 , a2 + b2 , a3 + b3 , a4 + b4 )
(28)                       = (b1 + a1 , b2 + a2 , b3 + a3 , b4 + a4 )
                           = (b1 , b2 , b3 , b4 ) + (a1 , a2 , a3 , a4 )
                           = β + α.

Thus, the commutative law of addition holds.
  For all α, β, γ ∈ H , we see from the definition of addition that

        (α + β) + γ = [(a1 , a2 , a3 , a4 ) + (b1 , b2 , b3 , b4 )] + (c1 , c2 , c3 , c4 )
                    = (a1 + b1 , a2 + b2 , a3 + b3 , a4 + b4 ) + (c1 , c2 , c3 , c4 )
                    = (a1 + b1 + c1 , a2 + b2 + c2 , a3 + b3 + c3 , a4 + b4 + c4 )
(29)
                    = (a1 , a2 , a3 , a4 ) + (b1 + c1 , b2 + c2 , b3 + c3 , b4 + c4 )
                    = (a1 , a2 , a3 , a4 ) + [(b1 , b2 , b3 , b4 ) + (c1 , c2 , c3 , c4 )]
                    = α + (β + γ).

Thus, the associative law of addition holds.
  It is clear that (0, 0, 0, 0) ∈ H . For all α ∈ H ,

(30) (0, 0, 0, 0) + α = (0 + a1 , 0 + a2 , 0 + a3 , 0 + a4 ) = (a1 , a2 , a3 , a4 ) = α.

Thus there exists an element which is an additive identity.
     Since (a1 , a2 , a3 , a4 ) ∈ H , we also have (−a1 , −a2 , −a3 , −a4 ) ∈ H . We can see that
(a1 , a2 , a3 , a4 ) + (−a1 , −a2 , −a3 , −a4 ) = (a1 − a1 , a2 − a2 , a3 − a3 , a4 − a4 ) =
(0, 0, 0, 0), so every α ∈ H has an additive inverse in H .
     For all α, β, γ ∈ H , we can see from the definition of multiplication that
(αβ)γ = α(βγ), so multiplication in H is associative.
     The element (1, 0, 0, 0) is in the set H , and from the definition of multi-
plication, we see that for all α ∈ H ,
(31)       (1, 0, 0, 0)α = (1, 0, 0, 0)(a1 , a2 , a3 , a4 ) = (a1 , a2 , a3 , a4 ) = α
and
(32)      α(1, 0, 0, 0) = (a1 , a2 , a3 , a4 )(1, 0, 0, 0) = (a1 , a2 , a3 , a4 ) = α.
Thus H includes an element which is a multiplicative identity.
    If α, β, γ ∈ H , then α(β + γ) = α(b1 + c1 , b2 + c2 , b3 + c3 , b4 + c4 ). By
the definition of multiplication, each coordinate of this product is a sum of
terms of the form ±ai (bj + cj ) = ±ai bj ± ai cj , so α(β + γ) = αβ + αγ. In the
same way, (β + γ)α = βα + γα, and thus multiplication is distributive over
addition.
    Therefore, each of the seven laws necessary for the set H to be a
non-commutative ring holds.
    Now that various arithmetic properties of quaternions have been estab-
lished (namely, the definitions of addition and multiplication for quaternions
and the fact that the integer quaternions are closed under addition and mul-
tiplication and form a non-commutative ring), we can now begin to develop
certain intermediate theorems about quaternions which will help us in our
quest to prove Lagrange’s Theorem.


2      Quaternion Norms
The norm is an important concept in other number systems like Z[i]. Here
we draw on many of the ideas from such number systems in order to develop
the idea of a norm for quaternions.
   The first idea we will borrow from other number systems is that of the
conjugate. Recall that in C, the number (a + bi) has a conjugate, namely,
(a − bi). We define something similar for quaternions.
Definition. Let α = a1 + a2 i + a3 j + a4 k. Then \bar{α} = a1 − a2 i − a3 j − a4 k is
the conjugate of α.

Note that if α ∈ H , then \bar{α} ∈ H as well. Armed with the conjugate of a
quaternion, we can define the norm of a quaternion in the same way as in
number systems like Z[i] or C.
Definition. For any quaternion α, N (α) = α\bar{α} is the norm of α.
Lemma 1. Let α = a1 + a2 i + a3 j + a4 k. Then
(33)            N (α) = α\bar{α} = \bar{α}α = a1^2 + a2^2 + a3^2 + a4^2 .

Proof. If α = a1 + a2 i + a3 j + a4 k, then \bar{α} = a1 − a2 i − a3 j − a4 k and
N (α) = α\bar{α}. From the definition of multiplication,
(34) α\bar{α} = (a1 + a2 i + a3 j + a4 k)(a1 − a2 i − a3 j − a4 k) = c1 + c2 i + c3 j + c4 k,
where
          c1 = a1 a1 − a2 (−a2 ) − a3 (−a3 ) − a4 (−a4 ) = a1^2 + a2^2 + a3^2 + a4^2 ,
          c2 = a1 (−a2 ) + a2 a1 + a3 (−a4 ) − a4 (−a3 ) = 0,
(35)
          c3 = a1 (−a3 ) − a2 (−a4 ) + a3 a1 + a4 (−a2 ) = 0,
          c4 = a1 (−a4 ) + a2 (−a3 ) − a3 (−a2 ) + a4 a1 = 0.
Substituting the results of equation (35) into equation (34), we obtain
(36)                     α\bar{α} = a1^2 + a2^2 + a3^2 + a4^2 .

Similarly, we see that
(37) \bar{α}α = (a1 − a2 i − a3 j − a4 k)(a1 + a2 i + a3 j + a4 k) = c1 + c2 i + c3 j + c4 k,
where
          c1 = a1 a1 − (−a2 )a2 − (−a3 )a3 − (−a4 )a4 = a1^2 + a2^2 + a3^2 + a4^2 ,
          c2 = a1 a2 + (−a2 )a1 + (−a3 )a4 − (−a4 )a3 = 0,
(38)
          c3 = a1 a3 − (−a2 )a4 + (−a3 )a1 + (−a4 )a2 = 0,
          c4 = a1 a4 + (−a2 )a3 − (−a3 )a2 + (−a4 )a1 = 0.
After substituting the results of equation (38) into equation (37), we again
obtain
(39)                     \bar{α}α = a1^2 + a2^2 + a3^2 + a4^2 .

Therefore,
(40)            N (α) = α\bar{α} = \bar{α}α = a1^2 + a2^2 + a3^2 + a4^2 .




Remember the seemingly bizarre way we defined the set H ? The reason
for constructing the set in such a manner was so that the norm of any element
of the set would be an integer.

Lemma 2. Let α ∈ H where α = (a1 , a2 , a3 , a4 ) ≠ 0. Then N (α) ∈ N.

Proof. Since α ∈ H , there are two possibilities. Either a1 , a2 , a3 , a4 ∈ Z or
a1 , a2 , a3 , a4 ∈ (1/2)ZOdd .
     First consider the case where a1 , a2 , a3 , a4 ∈ Z. Because the integers are
closed under multiplication, we know that

(41)                       a1^2 , a2^2 , a3^2 , a4^2 ∈ Z,

and because the integers are closed under addition as well, we know

(42)                     a1^2 + a2^2 + a3^2 + a4^2 ∈ Z.

    Now consider the alternative, that a1 , a2 , a3 , a4 ∈ (1/2)ZOdd . Just as
(1/2)ZOdd means the set of all odd integers divided by 2, let (1/4)ZOdd denote
the set of all odd integers divided by 4. If m is odd, so that m/2 ∈ (1/2)ZOdd ,
then (m/2)^2 = m^2/4. Since m is odd, m^2 is odd, and m^2/4 ∈ (1/4)ZOdd . Thus,
if a1 , a2 , a3 , a4 ∈ (1/2)ZOdd , then we can see that

(43)                    a1^2 , a2^2 , a3^2 , a4^2 ∈ (1/4)ZOdd .

Moreover, the square of an odd integer is congruent to 1 modulo 4, so the four
odd numerators in equation (43) sum to a multiple of 4. Adding the four
squares therefore gives an integer, so that

(44)                     a1^2 + a2^2 + a3^2 + a4^2 ∈ Z.

Therefore, for this case as well, N (α) ∈ Z.
   In addition, by trichotomy, we know that the square of any real number is
nonnegative. Thus, N (α) ∈ N ∪ {0}. However, since α ≠ 0, we know it has at
least one nonzero coordinate. Thus a1^2 + a2^2 + a3^2 + a4^2 > 0, and N (α) ∉ {0}.
Therefore, N (α) ∈ N.
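
Lemma 2 can be illustrated with exact arithmetic. The following sketch checks a finite box of elements of H and then contrasts one element of H with a "mixed" quadruple that is not in H. The function name norm and the ranges tested are assumptions chosen for illustration; the check is not a substitute for the proof.

```python
from fractions import Fraction
from itertools import product


def norm(q):
    """N(q) = a1^2 + a2^2 + a3^2 + a4^2, computed exactly."""
    return sum(Fraction(x) ** 2 for x in q)


# All-integer coordinates in [-2, 2], and all-half-odd coordinates
# taken from {-3/2, -1/2, 1/2, 3/2}.
ints = list(product(range(-2, 3), repeat=4))
halves = list(product([Fraction(n, 2) for n in (-3, -1, 1, 3)], repeat=4))

for q in ints + halves:
    if any(q):  # skip the zero quaternion
        n = norm(q)
        assert n.denominator == 1 and n > 0

# An element of H: the norm is an integer.
print(norm((Fraction(1, 2), Fraction(3, 2), Fraction(-1, 2), Fraction(1, 2))))  # 3

# A mixed quadruple (not in H): the norm is not an integer.
print(norm((Fraction(1, 2), 0, 0, 0)))  # 1/4
```

The second example shows why the definition of H insists that the four coordinates be of the same type.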


Definition. Suppose α ∈ H . Then we say α is an integer quaternion.

   The term integer quaternion comes from the fact that the norm of a
quaternion in the set H is an integer, which was proven in Lemma 2.


Definition. Suppose α ∈ H . Then we say α is odd if N (α) is odd and even
if N (α) is even.
    In order to prove deeper results about quaternions, we need the norm to
have the property that N (αβ) = N (α)N (β). Before we can prove this fact,
however, we must first prove the following Lemma, which has some lengthy
arithmetic.
Lemma 3. Let α = a1 + a2 i + a3 j + a4 k and β = b1 + b2 i + b3 j + b4 k. Then
\bar{αβ} = \bar{β}\bar{α}.
Proof. By the definition of multiplication,

(45)                           αβ = c1 + c2 i + c3 j + c4 k,

where
                          c1 = a1 b1 − a2 b2 − a3 b3 − a4 b4 ,
                          c2 = a1 b2 + a2 b1 + a3 b4 − a4 b3 ,
(46)
                          c3 = a1 b3 − a2 b4 + a3 b1 + a4 b2 ,
                          c4 = a1 b4 + a2 b3 − a3 b2 + a4 b1 .
By definition of the conjugate,

(47)                      \bar{αβ} = c1 − c2 i − c3 j − c4 k.

    After establishing what \bar{αβ} looks like, we now turn to \bar{β}\bar{α}. By the
definition of the conjugate, we see that

(48)                          \bar{α} = a1 − a2 i − a3 j − a4 k
(49)                      and \bar{β} = b1 − b2 i − b3 j − b4 k,

and by the definition of multiplication, we have

(50)                    \bar{β}\bar{α} = c1' + c2' i + c3' j + c4' k,

where
            c1' = b1 a1 − (−b2 )(−a2 ) − (−b3 )(−a3 ) − (−b4 )(−a4 ),
            c2' = b1 (−a2 ) + (−b2 )a1 + (−b3 )(−a4 ) − (−b4 )(−a3 ),
(51)
            c3' = b1 (−a3 ) − (−b2 )(−a4 ) + (−b3 )a1 + (−b4 )(−a2 ),
            c4' = b1 (−a4 ) + (−b2 )(−a3 ) − (−b3 )(−a2 ) + (−b4 )a1 .

By simple algebraic manipulation, we can transform equation (51) into

         c1' = b1 a1 − b2 a2 − b3 a3 − b4 a4 = a1 b1 − a2 b2 − a3 b3 − a4 b4 ,
         c2' = −b1 a2 − b2 a1 + b3 a4 − b4 a3 = −a1 b2 − a2 b1 − a3 b4 + a4 b3 ,
(52)
         c3' = −b1 a3 − b2 a4 − b3 a1 + b4 a2 = −a1 b3 + a2 b4 − a3 b1 − a4 b2 ,
         c4' = −b1 a4 + b2 a3 − b3 a2 − b4 a1 = −a1 b4 − a2 b3 + a3 b2 − a4 b1 .

    Notice that this makes

(53)     c1' = c1 ,        c2' = −c2 ,        c3' = −c3 ,        and        c4' = −c4 .

Substituting equation (53) into equation (50), we find that

(54)                    \bar{β}\bar{α} = c1 − c2 i − c3 j − c4 k.

From equations (47) and (54), it is clear that

(55)                         \bar{αβ} = \bar{β}\bar{α}.



    Now our desired property of the norm, that N (αβ) = N (α)N (β), follows
easily from Lemma 3.

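Lemma 4. Let α and β be quaternions. Then N (αβ) = N (α)N (β).
Proof. Since N (β) is a real number, it commutes with every quaternion. Thus,
by Lemma 3,
N (αβ) = αβ \bar{αβ} = αβ \bar{β}\bar{α} = αN (β)\bar{α} = α\bar{α}N (β) = N (α)N (β).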
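
Lemmas 3 and 4 are easy to test on a concrete pair of quaternions. The sketch below uses tuples of integers, with helper names (qmul, conj, norm) that are our own and chosen only for illustration.

```python
def qmul(a, b):
    """Quaternion product, as in equation (3)."""
    a1, a2, a3, a4 = a
    b1, b2, b3, b4 = b
    return (
        a1 * b1 - a2 * b2 - a3 * b3 - a4 * b4,
        a1 * b2 + a2 * b1 + a3 * b4 - a4 * b3,
        a1 * b3 - a2 * b4 + a3 * b1 + a4 * b2,
        a1 * b4 + a2 * b3 - a3 * b2 + a4 * b1,
    )


def conj(a):
    """Conjugate: negate the i, j, and k coordinates."""
    return (a[0], -a[1], -a[2], -a[3])


def norm(a):
    """N(a) = a1^2 + a2^2 + a3^2 + a4^2, as in Lemma 1."""
    return sum(x * x for x in a)


alpha = (1, 2, 3, 4)
beta = (5, 6, 7, 8)
prod = qmul(alpha, beta)

print(prod)                                            # (-60, 12, 30, 24)
print(conj(prod) == qmul(conj(beta), conj(alpha)))     # True  (Lemma 3)
print(norm(prod), norm(alpha) * norm(beta))            # 5220 5220  (Lemma 4)
```

Here N (α) = 30 and N (β) = 174 are each sums of four squares, and the product 5220 = 60^2 + 12^2 + 30^2 + 24^2 is again a sum of four squares, read off from the coordinates of αβ.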


3      Quaternion Units
We now turn to units, an idea which has a place in nearly every number system
imaginable. In N, 1 is a unit; in Z, 1 and −1 are units; in Z[i], there are
four units: 1, −1, i, and −i; and in Zp , every nonzero element is a unit. Yet, despite
the existence of different units in different number systems, the definition of
a unit remains the same in each. We will take this same definition and apply
it to quaternions.

Definition. We say ε ∈ H is a unit if there is a quaternion α ∈ H so that
εα = αε = 1.


In number systems which have the concept of a norm, such as Z[i], a unit
u has the property that N (u) = 1. To show this is true for quaternions, we
must first define the inverse of a quaternion.
Definition. Suppose α is nonzero and α ∈ H . Then there exists α^{-1} , called
the inverse of α, so that

(56)                           α^{-1} = \bar{α} / N (α).

    We saw that a quaternion α ∈ H has a conjugate \bar{α} ∈ H and a
norm N (α) ∈ Z. Thus we can see that it has an inverse α^{-1} ∈ HQ . However,
if we know that N (α) = 1, then α^{-1} = \bar{α}, so α^{-1} ∈ H . This fact is the
gateway for our discussion of units.
Lemma 5. α is a unit if and only if N (α) = 1.
Proof. From equations (33) and (56), we see

(57)               αα^{-1} = α\bar{α} / N (α) = N (α) / N (α) = 1

and

(58)               α^{-1} α = \bar{α}α / N (α) = N (α) / N (α) = 1.

If N (α) = 1, then α^{-1} = \bar{α} ∈ H , so equations (57) and (58) show that α
is a unit. Conversely, suppose α is a unit, so that α^{-1} ∈ H . We can take the
norm of both sides of equation (57) or (58) to obtain

(59)                         N (αα^{-1} ) = N (1),

which by Lemma 4 we know is

(60)                         N (α)N (α^{-1} ) = 1.

Using our definitions of inverse and norm, we also see

(61)   N (α^{-1} ) = (\bar{α} / N (α)) (α / N (α)) = N (α) / N (α)^2 = 1 / N (α).

By Lemma 2, N (α) and N (α^{-1} ) are both positive integers, so from equations
(60) and (61) the only possibility is N (α) = N (α^{-1} ) = 1. Therefore, it is
indeed the case that α is a unit if and only if N (α) = 1.

Since we know that α ∈ H must have coordinates in either the set Z or
(1/2)ZOdd , it would seem reasonable to conjecture that there are only a finite
number of units. Thus we will now take a moment to examine exactly how
many quaternions are units in H , and also what they are.

Theorem 4. There are 24 units in H .

Proof. Say α = a1 + a2 i + a3 j + a4 k and α ∈ H . From Lemma 5, if α
is a unit, then N (α) = 1. Since α ∈ H , then either a1 , a2 , a3 , a4 ∈ Z or
a1 , a2 , a3 , a4 ∈ (1/2)ZOdd .
     If a1 , a2 , a3 , a4 ∈ Z, then we know a1^2 , a2^2 , a3^2 , a4^2 ∈ N ∪ {0}. Since N (α) = 1,
we can see that exactly one of a1 , a2 , a3 , and a4 must equal ±1, while the
other three must equal 0. This provides a total of 8 possible units.
     On the other hand, if a1 , a2 , a3 , a4 ∈ (1/2)ZOdd , then we can observe that
a1^2 , a2^2 , a3^2 , a4^2 ∈ (1/4)ZOdd , so each square is at least 1/4. If N (α) = 1,
then the only solution comes when a1^2 = a2^2 = a3^2 = a4^2 = 1/4. Thus,
a1 = ±1/2, a2 = ±1/2, a3 = ±1/2, and a4 = ±1/2. These values for a1 , a2 , a3 ,
and a4 present 16 possible units.
     Therefore, we have 8 units when the coordinates of the units are integers
and 16 units when the coordinates are non-integers, for a total of 24 different
units in H .
    Now knowing the coordinates of all 24 units, we immediately reach the
following corollary.

Corollary 1. The units in H are:

      ±1,        ±i,       ±j,       ±k,        and        (1/2)(±1 ± i ± j ± k).
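
The count in Theorem 4 can be confirmed by brute force. Since N (α) = 1 forces every coordinate to lie between −1 and 1, it suffices to search the coordinates −1, −1/2, 0, 1/2, 1. The helper names below (norm, is_hurwitz) are assumptions made for this sketch.

```python
from fractions import Fraction
from itertools import product


def norm(q):
    """N(q) = a1^2 + a2^2 + a3^2 + a4^2."""
    return sum(x * x for x in q)


def is_hurwitz(q):
    """True if all coordinates are integers, or all are halves of odd integers."""
    denominators = {Fraction(x).denominator for x in q}
    return denominators == {1} or denominators == {2}


# Candidate coordinates: -1, -1/2, 0, 1/2, 1.
candidates = [Fraction(n, 2) for n in range(-2, 3)]

units = [
    q for q in product(candidates, repeat=4)
    if is_hurwitz(q) and norm(q) == 1
]
integer_units = [q for q in units if all(x.denominator == 1 for x in q)]

print(len(units))          # 24
print(len(integer_units))  # 8
```

The remaining 16 units are exactly the quadruples with every coordinate equal to ±1/2, matching Corollary 1.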

   Another concept for quaternions which we will borrow from other num-
ber systems is the associate. The following definition for the associate of
a quaternion is identical to the definition for associates in number systems
such as Z[i].

Definition. Let α be a quaternion. If ε ∈ H is a unit, then εα and αε are
called associates of α. If β = εα, then β is said to be an associate of α, and
we write β ∼ α.

    We will now prove four lemmas which will be immensely helpful in many
of the more difficult proofs which lie ahead.

Lemma 6. Suppose α, β ∈ H . If α ∼ β, then N (α) = N (β).

Proof. Let ε ∈ H be a unit. From the definition of associates,

(62)                                α = εβ.

Taking the norm of equation (62), we obtain

(63)                     N (α) = N (εβ) = N (ε)N (β).

But from Lemma 5, N (ε) = 1, so equation (63) reduces to

(64)                            N (α) = N (β),

as desired.

Lemma 7. If α ∼ β and β ∼ γ, then α ∼ γ.

Proof. From the definition of associates, if α ∼ β and β ∼ γ, then there exist
units ε1 and ε2 so that

(65)          α = ε1 β                and                  β = ε2 γ.

Combining these two equations, we see that

(66)                             α = (ε1 ε2 )γ.

What is the product ε1 ε2 ? Since ε1 and ε2 are themselves units, it would
seem that ε1 ε2 is also a unit. We can verify their product is a unit as well
by taking the norm:

(67)                 N (ε1 ε2 ) = N (ε1 )N (ε2 ) = 1 · 1 = 1.

By Lemma 5, since N (ε1 ε2 ) = 1, we see that indeed ε1 ε2 is also a unit. Knowing
this, we can refer back to equation (66) and see that α ∼ γ.

Lemma 8. α ∼ β if and only if β ∼ α.

Proof. If α ∼ β, then there exists a unit ε so that

(68)                                α = εβ.


We can multiply both sides of equation (68) by ε^{-1} , the inverse of ε, so that

(69)                              ε^{-1} α = ε^{-1} εβ.

Recall from the discussion of inverses that ε^{-1} ε = 1 and that if ε is a unit,
then ε^{-1} is also a unit. Thus,

(70)                                β = ε^{-1} α,

and since ε^{-1} is a unit, we can say β ∼ α. Finally, the converse can be
proven by merely switching α and β.

Lemma 9. If α ∈ H and β ∼ α, then β ∈ H .

Proof. If β ∼ α, then there exists a unit ε ∈ H such that

(71)                                 β = εα.

From Theorem 2, we know that H is closed under multiplication, that is, the
product of two elements of H is itself an element of H . Thus, from equation
(71), we can easily see that β ∈ H .
    The following theorem has a difficult proof and, while the result is cer-
tainly true, may appear to be meaningless and irrelevant. However, this
could not be farther from the truth, as we will later see that this theorem
provides a crucial link in the proofs of what numbers can be written as a
sum of four integer squares.

Theorem 5. If α = a1 + a2 i + a3 j + a4 k ∈ H , then there is β = b1 + b2 i +
b3 j + b4 k so that β ∼ α and β ∈ HZ .

Proof. It is given that α ∈ H , so we know either a1 , a2 , a3 , a4 ∈ Z or
a1 , a2 , a3 , a4 ∈ (1/2)ZOdd .
     The simple case is when a1 , a2 , a3 , a4 ∈ Z. Let ε = 1, which is a unit.
Take

(72)                                 β = εα.

Thus, β ∼ α. Furthermore, from the definition of multiplication, we see that

(73)                                 εα = α,

so

(74)                                 β = α.

From the definition of equal quaternions, the coordinates of β are the same
as those of α, and thus β ∈ HZ and β ∼ α.
    The other case is when a1 , a2 , a3 , a4 ∈ (1/2)ZOdd . Through simple algebra,
we can manipulate the terms of α into the form δ + γ, where

(75) δ = d1 + d2 i + d3 j + d4 k, di ∈ ZEven ,   and   γ = (1/2)(±1 ± i ± j ± k),

so that

(76)                               α = δ + γ.

From Corollary 1, we know γ is a unit, as is \bar{γ}, so

(77)                                γ\bar{γ} = 1.

Because each of d1 , d2 , d3 , and d4 is even, according to the definition of
multiplication, the coordinates of δ\bar{γ} will be integers. Since \bar{γ} is a unit, by
taking β = α\bar{γ}, it is plain that β ∼ α. It follows from equation (76) that

(78)               β = α\bar{γ} = (δ + γ)\bar{γ} = δ\bar{γ} + γ\bar{γ}.

Therefore, because δ\bar{γ} has integer coordinates and γ\bar{γ} = 1, the coordinates
of δ\bar{γ} + γ\bar{γ}, and thus β, are integers. Therefore, β ∼ α and β ∈ HZ .
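
The construction in the proof of Theorem 5 can be carried out mechanically. The sketch below assumes its input already lies in H and picks the signs of γ coordinate by coordinate so that α − γ has even coordinates: a numerator congruent to 1 modulo 4 gets +1/2, and a numerator congruent to 3 modulo 4 gets −1/2. The function name integer_associate and this particular sign rule are our own illustration of the "simple algebra" mentioned in the proof.

```python
from fractions import Fraction


def qmul(a, b):
    """Quaternion product, as in equation (3)."""
    a1, a2, a3, a4 = a
    b1, b2, b3, b4 = b
    return (
        a1 * b1 - a2 * b2 - a3 * b3 - a4 * b4,
        a1 * b2 + a2 * b1 + a3 * b4 - a4 * b3,
        a1 * b3 - a2 * b4 + a3 * b1 + a4 * b2,
        a1 * b4 + a2 * b3 - a3 * b2 + a4 * b1,
    )


def conj(a):
    """Conjugate: negate the i, j, and k coordinates."""
    return (a[0], -a[1], -a[2], -a[3])


def norm(a):
    """N(a) = a1^2 + a2^2 + a3^2 + a4^2."""
    return sum(x * x for x in a)


def integer_associate(alpha):
    """Return an associate of alpha with integer coordinates (alpha assumed in H)."""
    alpha = tuple(Fraction(x) for x in alpha)
    if all(x.denominator == 1 for x in alpha):
        return alpha  # first case of the proof: beta = 1 * alpha
    # Second case: every coordinate is (odd integer) / 2.  Choose gamma so
    # that delta = alpha - gamma has even coordinates.
    gamma = tuple(
        Fraction(1, 2) if x.numerator % 4 == 1 else Fraction(-1, 2)
        for x in alpha
    )
    return qmul(alpha, conj(gamma))  # beta = alpha * conj(gamma)


alpha = (Fraction(1, 2), Fraction(3, 2), Fraction(-1, 2), Fraction(5, 2))
beta = integer_associate(alpha)
print(all(x.denominator == 1 for x in beta))  # True
print(norm(alpha) == norm(beta))              # True
```

For this α the norm is 9, and the integer associate found has the same norm, as Lemma 6 requires.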
    We have seen that norms, units, and associates of quaternions have near-
identical definitions to their counterparts living in number systems like Z[i].
In the area of divisors, however, quaternions begin to distinguish themselves
from these other number systems. This is due to the fact that, unlike for the
Gaussian integers, multiplication of quaternions is not commutative. Natu-
rally then, it makes sense that division in H differs from division in Z[i].

Definition. If α, β, γ ∈ H and γ = αβ, then we say α is a left-hand divisor
of γ and write α γ, and that β is a right-hand divisor of γ and write β γ.

    The distinction between left- and right-hand divisors is necessary because,
in general, αβ ≠ βα. As we saw earlier, multiplication is not commutative.
For the purposes of this paper, we will work with right-hand divisors, but
for consistency’s sake only. Every proof involving a right-hand divisor could
easily be modified to prove a similar result using left-hand divisors.
    (Note, however, multiplication between an integer and a quaternion is
commutative, so if α ∈ Z or β ∈ Z, then αβ = βα, and the distinction
between left- and right-hand divisors is unnecessary.)
    The following theorem is necessary in order to prove the existence of a
division algorithm in H . In order for long division to be “useful,” repeated
long division must terminate; that is, the remainder must be less than the
divisor. This theorem proves just that, if there is long division, then the
remainder term will be less than the divisor.
Theorem 6. Suppose κ ∈ H and m ∈ Z. Then there exists λ ∈ H so that
(79)                                N (κ − mλ) < m2 .
Proof. First note that because m ∈ Z, the norm of m is N(m) = m^2,
so we are equivalently proving
(80)                               N(κ − mλ) < N(m).
   If κ = (k_1, k_2, k_3, k_4) and λ = (l_1, l_2, l_3, l_4), then
(81)           κ − mλ = (k_1 − m l_1, k_2 − m l_2, k_3 − m l_3, k_4 − m l_4).
   We want N(κ − mλ) < N(m), which happens when
(82)     (k_1 − m l_1)^2 + (k_2 − m l_2)^2 + (k_3 − m l_3)^2 + (k_4 − m l_4)^2 < m^2.
Observe that if each |k_i − m l_i| ≤ |m/2|, with strict inequality for at
least one i, then each (k_i − m l_i)^2 ≤ m^2/4, at least one of these is
strictly smaller, and
(83)     (k_1 − m l_1)^2 + (k_2 − m l_2)^2 + (k_3 − m l_3)^2 + (k_4 − m l_4)^2 < 4 · (1/4) m^2 = m^2.
So we look for l_i with
(84)                                 |k_i − m l_i| ≤ |m/2|,
which, for m > 0, occurs when
(85)                     (2k_i − m)/(2m) ≤ l_i ≤ (2k_i + m)/(2m)
(for m < 0 the two endpoints are interchanged). We can see that
(2k_i + m)/(2m) − (2k_i − m)/(2m) = 1; that is, the interval has length 1.
A closed interval of length 1 contains at least one integer, so we may choose
each l_i to be such an integer.
     If inequality (84) is strict for at least one i, then (82) holds and λ has
integer coordinates. Otherwise every k_i/m lies exactly halfway between two
integers, so every k_i/m ∈ (1/2)Z_Odd; in that case take l_i = k_i/m instead,
so that κ − mλ = 0. Either way, the four coordinates l_1, l_2, l_3, and l_4 are
all integers or all elements of (1/2)Z_Odd, which ensures that
λ ∈ H and N(κ − mλ) < N(m).
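
As a minimal computational sketch of how such a λ can be found (not part of the original proof), the Python code below compares two candidates for λ: the nearest integer to each coordinate of κ/m, and the nearest element of (1/2)Z_Odd to each coordinate. Both candidates lie in H. The integer candidate fails only when every coordinate of κ/m is exactly halfway between two integers, and in that case the second candidate equals κ/m exactly. The names `choose_lambda`, `norm`, and `in_H` are illustrative assumptions, as is the use of `fractions.Fraction` for exact arithmetic.

```python
from fractions import Fraction
from math import floor

def norm(q):
    """Sum of the squares of the four coordinates."""
    return sum(c * c for c in q)

def in_H(q):
    """True if all coordinates are integers or all are odd integers over 2."""
    return (all(c.denominator == 1 for c in q)
            or all(c.denominator == 2 for c in q))

def choose_lambda(kappa, m):
    """Return lam in H with N(kappa - m*lam) < m^2, for an integer m != 0."""
    targets = [Fraction(k) / m for k in kappa]   # coordinates of kappa / m
    # Candidate 1: nearest integer to each coordinate.
    nearest_int = tuple(Fraction(floor(t + Fraction(1, 2))) for t in targets)
    # Candidate 2: nearest odd-integer-over-2 to each coordinate.
    nearest_half = tuple(floor(t) + Fraction(1, 2) for t in targets)

    def residual(lam):
        return norm([k - m * l for k, l in zip(kappa, lam)])

    # Keep whichever candidate leaves the smaller remainder.
    return min((nearest_int, nearest_half), key=residual)

kappa = (Fraction(7), Fraction(3), Fraction(-2), Fraction(5))
m = 4
lam = choose_lambda(kappa, m)
remainder = [k - m * l for k, l in zip(kappa, lam)]
print(lam, norm(remainder), norm(remainder) < m * m)
```

Taking the better of the two candidates is a design choice of this sketch; any λ ∈ H satisfying inequality (79) would serve equally well in what follows.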

Armed with Theorem 6, we now look to prove the existence of a full-
fledged division algorithm for H.

Theorem 7 (Division Algorithm). Suppose α, β ∈ H and β ≠ 0. Then
there are λ, γ ∈ H so that

(86)         α = λβ + γ,               with 0 ≤ N(γ) < N(β).

Proof. Define κ ∈ H and m ∈ N by

(87)         κ = αβ̄               and               m = N(β).

From Theorem 6, we know there exists λ ∈ H such that N(κ − mλ) < m^2.
Now, using such a λ, define γ as

(88)                             γ = α − λβ.

This satisfies α = λβ + γ from equation (86). From the definitions of κ and
m, and because the integer m commutes with λ, we see that

(89)                 (α − λβ)β̄ = αβ̄ − λββ̄ = κ − mλ,

so

(90)                   N[(α − λβ)β̄] = N(κ − mλ).

From Theorem 6, we know that N(κ − mλ) < m^2, so

(91)             N[(α − λβ)β̄] = N(α − λβ)N(β̄) < m^2.

Since N(β̄) = β̄β = m > 0, we can apply the cancellation law to see

(92)                         N(α − λβ) < m,

and from the definitions of γ and m, we can substitute to conclude

(93)                          N(γ) < N(β).
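
The construction in this proof can be sketched in Python as follows. This is an illustrative sketch under stated assumptions rather than a definitive implementation: quaternions are represented as 4-tuples, `qmul` follows the multiplication rule of equation (3), and the helper names (`qmul`, `conj`, `norm`, `sub`, `right_divide`) are hypothetical. The quotient λ is chosen by rounding the coordinates of κ/m either to integers or to odd integers over 2, whichever leaves the smaller remainder.

```python
from fractions import Fraction
from math import floor

def qmul(a, b):
    """Quaternion product, following the multiplication rule of equation (3)."""
    a1, a2, a3, a4 = a
    b1, b2, b3, b4 = b
    return (a1*b1 - a2*b2 - a3*b3 - a4*b4,
            a1*b2 + a2*b1 + a3*b4 - a4*b3,
            a1*b3 - a2*b4 + a3*b1 + a4*b2,
            a1*b4 + a2*b3 - a3*b2 + a4*b1)

def conj(a):
    """Conjugate: negate the i, j, and k coordinates."""
    return (a[0], -a[1], -a[2], -a[3])

def norm(a):
    """Sum of the squares of the four coordinates."""
    return sum(c * c for c in a)

def sub(a, b):
    """Coordinate-wise difference a - b."""
    return tuple(x - y for x, y in zip(a, b))

def right_divide(alpha, beta):
    """Return (lam, gamma) with alpha = lam*beta + gamma and N(gamma) < N(beta)."""
    kappa = qmul(alpha, conj(beta))      # kappa = alpha * conj(beta), as in (87)
    m = norm(beta)                       # m = N(beta), a positive integer
    targets = [Fraction(k) / m for k in kappa]
    candidates = (
        tuple(Fraction(floor(t + Fraction(1, 2))) for t in targets),  # integers
        tuple(floor(t) + Fraction(1, 2) for t in targets),            # odd integers over 2
    )
    # Keep the candidate quotient whose remainder has the smaller norm.
    lam = min(candidates, key=lambda l: norm(sub(alpha, qmul(l, beta))))
    return lam, sub(alpha, qmul(lam, beta))

alpha = (7, 3, -2, 5)
beta = (2, 1, 0, 1)
lam, gamma = right_divide(alpha, beta)
print(lam, gamma, norm(gamma) < norm(beta))
```

Note that the quotient is multiplied on the left of β, matching the right-hand division of equation (86).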




4      Quaternion GCDs
In addition to norms and units, we are now familiar with other important
ideas about quaternions, namely division and divisors. In other number
systems, we might want to consider when a number is the greatest common
divisor, or GCD, of two other numbers. If d is the GCD of a and b, then
d divides every linear combination of a and b. This
knowledge can be useful for many reasons, such as proving unique prime
factorization within a number system or finding solutions of linear Diophantine
equations. In other number systems, the GCD also has important connections
to primality, and such a connection exists for quaternions as well. For
this reason, an examination of GCDs in H now will aid our later study of
primes in H.
    As we have done with concepts such as the norm, units, and associates,
in order to define a GCD in H , we look to the definition of a GCD in other
number systems. However, knowing that multiplication is not commutative,
we must again be careful to distinguish between left- and right-hand divisors.

Definition. Let α, β be quaternions. Then α and β have a greatest common
right-hand divisor δ, denoted gcd_r(α, β), if
    (1) δ is a right-hand divisor of both α and β, and
    (2) every right-hand divisor of α and β is also a right-hand divisor of δ.

Theorem 8 (Greatest Common Divisor). Given α, β ∈ H, where at
least one of α and β is non-zero, α and β have a greatest common
right-hand divisor δ, which is unique up to associates and can be written in
the form

(94)                            δ = µα + νβ,

where µ, ν ∈ H .

Proof. Let Γ be the set defined as

(95)             Γ = {N(µα + νβ) | µ, ν ∈ H and µα + νβ ≠ 0}.

By Lemma 2, the norm of a non-zero element of H is a positive integer, so
Γ ⊆ N. Furthermore, taking µ = ᾱ and ν = β̄ gives µα + νβ = N(α) + N(β),
which is non-zero because α and β are not both zero. Hence
N[N(α) + N(β)] ∈ Γ, so Γ ≠ ∅. Thus, the Well Ordering
Principle applies to Γ, so we know there exists g_0 = N(µ_0 α + ν_0 β) ∈ Γ such that
g_0 ≤ g for all g ∈ Γ. Also, define δ = µ_0 α + ν_0 β. We want to show that δ is
a greatest common right-hand divisor of α and β.
   By Theorem 7, we know that given α, δ ∈ H with δ ≠ 0, we can find λ, γ ∈ H so
that

(96)                             α = λδ + γ,

where 0 ≤ N(γ) < N(δ). Since δ = µ_0 α + ν_0 β, we can substitute to show

(97)                       α = λ(µ_0 α + ν_0 β) + γ,

and then rearrange to show

(98)                   γ = (1 − λµ_0)α + (−λν_0)β.

On account of the closure of addition and multiplication, (1 − λµ_0), (−λν_0) ∈
H, and thus N(γ) ∈ Γ if N(γ) > 0. Assume momentarily that N(γ) ∈ Γ.
By Theorem 7, we know that N(γ) < N(δ); however, g_0 = N(δ) is the least
element of Γ, which is a contradiction. So N(γ) ∉ Γ, and thus N(γ) = 0 and γ = 0.
Equation (96) now becomes

(99)                               α = λδ,

and by definition, δ is a right-hand divisor of α. Similarly, δ is a right-hand
divisor of β.
    Now let κ ∈ H and assume that κ is a right-hand divisor of both α and β.
Then κ is a right-hand divisor of µ_0 α and of ν_0 β, and hence of
µ_0 α + ν_0 β. Since δ = µ_0 α + ν_0 β, κ is a right-hand divisor of δ.
    Thus, δ is a right-hand divisor of α and β, and an arbitrary common
right-hand divisor of α and β is also a right-hand divisor of δ. Therefore, by
definition, δ is a greatest common right-hand divisor.
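
The proof above is an existence argument based on the Well Ordering Principle and does not say how to compute δ. As an alternative technique that is not taken from the paper, a greatest common right-hand divisor can be computed with the Euclidean algorithm, using repeated right-hand division in the sense of Theorem 7. The sketch below assumes quaternions are 4-tuples and uses hypothetical helper names (`qmul`, `conj`, `norm`, `sub`, `right_divide`, `right_gcd`).

```python
from fractions import Fraction
from math import floor

def qmul(a, b):
    """Quaternion product, following the multiplication rule of equation (3)."""
    a1, a2, a3, a4 = a
    b1, b2, b3, b4 = b
    return (a1*b1 - a2*b2 - a3*b3 - a4*b4,
            a1*b2 + a2*b1 + a3*b4 - a4*b3,
            a1*b3 - a2*b4 + a3*b1 + a4*b2,
            a1*b4 + a2*b3 - a3*b2 + a4*b1)

def conj(a):
    return (a[0], -a[1], -a[2], -a[3])

def norm(a):
    return sum(c * c for c in a)

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def right_divide(alpha, beta):
    """Return (lam, gamma) with alpha = lam*beta + gamma and N(gamma) < N(beta)."""
    kappa = qmul(alpha, conj(beta))
    m = norm(beta)
    targets = [Fraction(k) / m for k in kappa]
    candidates = (
        tuple(Fraction(floor(t + Fraction(1, 2))) for t in targets),  # integers
        tuple(floor(t) + Fraction(1, 2) for t in targets),            # odd integers over 2
    )
    lam = min(candidates, key=lambda l: norm(sub(alpha, qmul(l, beta))))
    return lam, sub(alpha, qmul(lam, beta))

def right_gcd(alpha, beta):
    """A greatest common right-hand divisor, found by repeated right division."""
    while any(beta):                      # stop once the remainder is zero
        _, gamma = right_divide(alpha, beta)
        # Common right-hand divisors of (alpha, beta) and (beta, gamma) agree.
        alpha, beta = beta, gamma
    return alpha

alpha = (1, 0, 1, -1)    # 1 + j - k, which has norm 3
beta = (3, 0, 0, 0)      # the integer 3
delta = right_gcd(alpha, beta)
print(delta, norm(delta))
```

The loop terminates because the norm of the remainder strictly decreases at every step. The result is determined only up to associates, in line with the statement of Theorem 8.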
Theorem 9. Suppose α, β ∈ H . If gcdr (α, β) = δ and δ is not a unit, then
gcd[N (α), N (β)] = N (δ), and N (δ) > 1.
Proof. It is given that

(100)                          gcdr (α, β) = δ,

and so δ is a right-hand divisor of α and β, and there exist γ1 , γ2 ∈ H such
that

(101)                               α = γ1 δ
(102)                           and β = γ2 δ.

Taking the norm of both sides of these equations gives

(103)                     N (α) = N (γ1 δ) = N (γ1 )N (δ)
(104)                 and N (β) = N (γ2 δ) = N (γ2 )N (δ).

From Lemma 2, the norm of a quaternion is an integer, and from divisibility
for the integers, we can say that

(105)        N (δ)|N (α)              and                N (δ)|N (β).

Thus N (δ) is, at minimum, a common divisor of N (α) and N (β).
    Another consequence of equation (100) is that if some λ ∈ H is also a
right-hand divisor of both α and β, then λ must be a right-hand divisor of
δ. If such a λ exists, then there is γ ∈ H such that δ = γλ. Substituting this
into equations (101) and (102), we see that

(106)                               α = γ1 γ λ
(107)                           and β = γ2 γ λ.

Following similar reasoning as above, this implies that

(108)       N (λ)|N (α),       N (λ)|N (β),        and         N (λ)|N (δ).

    Therefore, if gcdr (α, β) = δ, we can draw two conclusions. First, that
N (δ)|N (α) and N (δ)|N (β). Second, if there exists λ such that N (λ)|N (α)
and N (λ)|N (β), then it must also be the case that N (λ)|N (δ). These are the
two criteria for N (δ) to be the greatest common divisor of N (α) and N (β),
so gcd[N(α), N(β)] = N(δ). In addition, it is given that δ is not a unit,
so N(δ) ≠ 1, which, along with Lemma 2, implies that N(δ) > 1.

Theorem 10. Suppose α ∈ H and let β = m ∈ N. Then gcdr (α, β) = 1 if
and only if gcd[N (α), m] = 1.

Proof. By Theorem 8, the following statements are equivalent:

(109)                           gcdr (α, β) = 1,

and there exist µ, ν ∈ H such that

(110)                            1 = µα + νβ.

Equation (110) can be rearranged in the form

(111)                                µα = 1 − νβ.

Substituting m for β in equation (111), using νm = mν, and taking the norm
of both sides gives

(112)             N(µα) = N(1 − mν) = (1 − mν)(1 − mν̄),

which can be expanded to

(113)                   N(µ)N(α) = 1 − mν − mν̄ + m^2 N(ν)

and then rearranged as

(114)               N(µ)N(α) + m(ν + ν̄) − m^2 N(ν) = 1.

Here ν + ν̄ is twice the first coordinate of ν, so it is an integer, and every
term in equation (114) is an integer. Let d = gcd[N(α), m]. By definition, d is
a common divisor of N(α) and m, so d|N(α) and d|m. Thus, each of the following is
true as well:

        d|N(µ)N(α),        d|m(ν + ν̄),       and   d|m^2 N(ν).

Therefore, from the properties of divisibility for the integers, we know that
d|[N(µ)N(α) + m(ν + ν̄) − m^2 N(ν)], and given equation (114), d|1. Since d is
a positive integer, d must equal 1. Therefore, gcd[N(α), m] = 1. Note that
because N(β) = m^2, the statement gcd[N(α), m] = 1 is equivalent to
gcd[N(α), N(β)] = 1.


5       Primes
We now look at what it means to be a prime quaternion. By proving several
statements about primes in the set H, and exploring how these primes relate to
primes in Z, we will eventually be able to prove that any prime number can
be written as a sum of the squares of four integers.
    Before we can prove anything about primes in H, we need to know exactly
what constitutes a prime in H. For this, we draw from the definition used
for primes in sets such as Z[i] and Z_m.
Definition. A non-zero quaternion π is prime in H if, for any α, β ∈ H
such that π = αβ, either α or β is a unit (but not both).

Now that we have established what it means to be prime in H , we can
begin to prove certain theorems connecting primes in H with primes in Z.
The following theorem is similar to a result in Z[i]: a Gaussian integer is
prime in Z[i] if its norm is prime in Z.

Theorem 11. Let π be a quaternion. If N (π) is prime in Z, then π is prime
in H .

Proof. It is given that π ∈ H and that N (π) is prime in Z. Define α, β ∈ H
such that

(115)                              π = αβ.

Taking the norm of both sides of equation (115) gives N (π) = N (αβ) =
N (α)N (β). Since N (π) is prime in Z, by the definition of prime, either N (α)
or N (β) must be a unit in Z. From Lemma 5, this implies that either α or
β is a unit in H . Given equation (115), if either α or β is a unit, then by
definition, π must be prime in H .
    Before we can continue to prove things about primes in H , we must first
state an auxiliary theorem that will be needed later.

Theorem 12. Suppose p ∈ Z is an odd prime (p ≠ ±2). Then there exist
x, y ∈ Z such that

(116)                     1 + x^2 + y^2 ≡ 0   (mod p),

where 0 ≤ x < p and 0 ≤ y < p.
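
The theorem is stated without proof, but for any particular prime it can be checked by direct search. The following is a minimal sketch, with the hypothetical function name `find_xy`, that looks for a solution of congruence (116). The search runs over 0 ≤ x, y < p; the value 0 has to be allowed, since for p = 5 the congruence has no solution in which x and y are both non-zero modulo 5.

```python
def find_xy(p):
    """Search 0 <= x, y < p for a solution of 1 + x^2 + y^2 = 0 (mod p)."""
    for x in range(p):
        for y in range(p):
            if (1 + x * x + y * y) % p == 0:
                return x, y
    return None  # not expected to happen for an odd prime p

for p in (3, 5, 7, 11, 13):
    x, y = find_xy(p)
    print(p, (x, y), 1 + x * x + y * y)
```

A brute-force search like this takes on the order of p^2 steps, which is fine for illustration but is not meant as an efficient method.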

Theorem 13. If an integer p is prime in Z, then p is not prime in H .

Proof. Let p be an integer that is prime in Z. If p = 2, then we can see that

(117)                         2 = (1 + i)(1 − i).

Of course, (1 + i) and (1 − i) are in the set H , but neither is a unit, so 2 is
not prime in H . We can therefore assume that p > 2.
   By Theorem 12, there exist r, s ∈ Z such that

(118)                     1 + r^2 + s^2 ≡ 0 (mod p),

where 0 ≤ r < p and 0 ≤ s < p. Now define α ∈ H so that

(119)                       α = 1 + 0i + sj − rk,

where r and s are obtained from Theorem 12. Thus N(α) = 1 + r^2 + s^2,
and from equation (118), N(α) ≡ 0 (mod p). It follows from the properties
of modular arithmetic that p|N(α), and it is trivial that p|p. Therefore,
gcd[N(α), p] ≥ p. By definition, p > 1, so it is easy to see
that gcd[N(α), p] ≠ 1. Using the result of Theorem 10, this implies that
gcd_r(α, p) ≠ 1. Now, we define δ such that

(120)                           δ = gcd_r(α, p),

and we know that δ is not a unit in H. Furthermore, because δ is a
common right-hand divisor of α and p, there exist λ_1, λ_2 ∈ H such that

(121)                                α = λ_1 δ
(122)                            and p = λ_2 δ.

Assume by way of contradiction that λ_2 is a unit in H. Then from equation
(122), we see that p ∼ δ. This in turn implies that p is a right-hand divisor
of α. Given equation (119), there must exist
γ = c_1 + c_2 i + c_3 j + c_4 k ∈ H such that 1 + sj − rk = γp =
pc_1 + pc_2 i + pc_3 j + pc_4 k. However, no such γ exists: comparing first
coordinates requires pc_1 = 1, and c_1 = 1/p is neither an integer nor an
element of (1/2)Z_Odd when p > 2. Thus, a contradiction arises,
and it cannot be the case that λ_2 is a unit. Hence, p = λ_2 δ, where neither
λ_2 nor δ is a unit, and therefore, p is not prime in H.
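
As a worked instance of this argument (an illustration, not part of the original text), take p = 3. One valid choice in Theorem 12 is r = s = 1, and the resulting factorization of 3 into two non-units can be checked directly from the multiplication rule of equation (3):

```latex
% Illustrative example for p = 3, assuming the choice r = s = 1.
\begin{align*}
1 + 1^2 + 1^2 &= 3 \equiv 0 \pmod{3}, \\
\alpha &= 1 + j - k, \qquad N(\alpha) = 1^2 + 0^2 + 1^2 + (-1)^2 = 3, \\
3 &= \bar{\alpha}\,\alpha = (1 - j + k)(1 + j - k), \\
N(1 - j + k) &= N(1 + j - k) = 3 \neq 1 .
\end{align*}
```

Neither factor has norm 1, so neither is a unit, and 3 is not prime in H.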
    We saw in Theorem 11 a connection between prime quaternions and
prime integers, a result which is identical to the corresponding result for
Gaussian integers. The following theorem is the converse of Theorem 11, and
its analogue is not true for primes in Z[i].

Theorem 14. Let π be a prime in H . Then N (π) is prime in Z.

Proof. It is given that π is a quaternion which is prime in H. Assume,
by way of contradiction, that the norm N(π) is not prime in Z. This
means that there exist integers a and p, neither of which is a unit, so that

(123)                             N(π) = ap.

Furthermore, we may take p to be a prime factor of N(π). Since p|N(π) by
construction and, trivially, p|p, p is a common divisor of N(π) and p, so
gcd[N(π), p] ≥ p. Because p > 1, it is also true that gcd[N(π), p] ≠ 1. From
Theorem 10, it follows that gcd_r(π, p) ≠ 1. Define δ such that
(124)                          δ = gcd_r(π, p).
Thus δ is not an associate of 1, so δ is not a unit in H. Since δ is the
greatest common right-hand divisor of π and p, it is also a common right-hand
divisor of π and p. Hence, there exist λ_1, λ_2 ∈ H such that
(125)                                π = λ_1 δ
(126)                            and p = λ_2 δ.
Since π is prime and δ is not a unit, λ_1 is a unit, and π ∼ δ. From
Lemma 6, N(π) = N(δ). Taking the norm of both sides of equation (126)
gives N(p) = p^2 = N(λ_2 δ) = N(λ_2)N(δ), and substituting N(π) for N(δ)
gives
(127)                         p^2 = N(λ_2)N(π).
By combining this with equation (123) and performing simple algebra, we
obtain
(128)                            p = aN(λ_2).
Because p was defined to be a prime, N(λ_2) must equal
either 1 or p. Assume temporarily that N(λ_2) = 1. Then from Lemma 5,
λ_2 would be a unit in H. From equation (126), it would follow that p ∼ δ,
and from Lemmas 7 and 8, p ∼ π. But then p would be prime in H, which
violates Theorem 13. This contradiction means that N(λ_2) cannot equal 1,
and instead must equal p. Substituting this result into equation (127) gives
p^2 = pN(π), which reduces to p = N(π). Therefore, since p is prime in Z
and N(π) = p, N(π) is prime in Z.
    We can now combine some of our results to form a single statement that
is stronger than previous theorems.
Theorem 15. Let π be a quaternion. Then π is prime in H if and only if
N (π) is prime in Z.
Proof. This theorem easily follows from Theorems 11 and 14.

6       Numbers that are Sums of 4 Squares
Now that we know some of the relationships between primes in H and Z, we
can easily prove that all prime numbers can be expressed as a sum of four
integer squares. This in turn will allow us to see what other numbers are
sums of four squares.
Theorem 16. Let p ∈ N be a prime number. Then there are integers n_1,
n_2, n_3, and n_4 so that p = n_1^2 + n_2^2 + n_3^2 + n_4^2.

Proof. Since it is given that p is prime in N, we know that p is also prime in
Z. By Theorem 13, we know p is not prime in H . Thus, there exist α, π ∈ H
such that
(129)                                  p = απ,
where α and π are not units in H. Taking the norm of this gives N(p) =
p^2 = N(απ) = N(α)N(π). Since p is prime in Z, the only positive factors of
p^2 in Z are 1, p, and p^2. Hence, either
(130)        N(α) = 1,         N(α) = p,              or         N(α) = p^2.
However, since α and π are not units in H, it cannot be the case that
N(α) = 1 or N(π) = 1. This rules out the possibilities N(α) = 1
and N(α) = p^2, leaving only N(α) = p. Thus, the only solution to
p^2 = N(α)N(π) is N(α) = N(π) = p.
     Thus, N(π) = p, so N(π) is prime in Z. If we write π = p_1 + p_2 i +
p_3 j + p_4 k, then either every p_i ∈ Z or every p_i ∈ (1/2)Z_Odd.
     If p_i ∈ Z, then we can define n_1, n_2, n_3, and n_4 such that
(131)      n_1 = p_1,     n_2 = p_2,       n_3 = p_3,        and        n_4 = p_4,
and so it is plain that p = N(π) = n_1^2 + n_2^2 + n_3^2 + n_4^2, where
n_1, ..., n_4 ∈ Z.
   If instead p_i ∈ (1/2)Z_Odd, then by Theorem 5, there exists π′ ∈ H such that
π′ ∼ π and π′ has integer coordinates. If we define n_1, n_2, n_3, and n_4 to be
the coordinates of π′, so
(132)                     π′ = n_1 + n_2 i + n_3 j + n_4 k,
then using Lemma 6, we see that p = N(π) = N(π′) = n_1^2 + n_2^2 + n_3^2 + n_4^2,
where n_i ∈ Z.
   Thus, for any prime p, there exist integers n_1, n_2, n_3, and n_4 so that
p = n_1^2 + n_2^2 + n_3^2 + n_4^2.


Now that we have proven that a prime number can be expressed as the
sum of four integer squares, we now turn to composite numbers to show that
they too can be expressed as such a sum.
Theorem 17. Let m ∈ N be a composite number. Then there are integers
n_1, n_2, n_3, and n_4 so that m = n_1^2 + n_2^2 + n_3^2 + n_4^2.

Proof. This will be proven by induction on the number of prime factors of m.
Because m is a composite number, there exist primes p_1, p_2, ..., p_n ∈ N
such that m = p_1 p_2 ··· p_n, where n is
the number of prime factors of m. If n = 1, then m would be prime, which
violates the initial assumption that m is composite. Therefore, assume that
n ≥ 2.
    First, take n = 2, so that
(133)                               m = p_1 p_2.
The numbers p_1 and p_2 are prime, so from Theorem 16, there exist integers,
say a_1, a_2, a_3, a_4, b_1, b_2, b_3, b_4, such that
(134)                          p_1 = a_1^2 + a_2^2 + a_3^2 + a_4^2
(135)                      and p_2 = b_1^2 + b_2^2 + b_3^2 + b_4^2.


Now, define α, β ∈ H such that
(136)                         α = a_1 + a_2 i + a_3 j + a_4 k
(137)                     and β = b_1 + b_2 i + b_3 j + b_4 k.
Observe that a consequence of this definition of α and β is that N(α) = p_1
and N(β) = p_2. If we let µ = m_1 + m_2 i + m_3 j + m_4 k = αβ, then from
Theorem 2, µ ∈ H, and thus either m_1, ..., m_4 ∈ Z or m_1, ..., m_4 ∈ (1/2)Z_Odd.
   If m_1, ..., m_4 ∈ Z, then we can simply define n_1, n_2, n_3, and n_4 so that
(138)      n_1 = m_1,      n_2 = m_2,      n_3 = m_3,        and     n_4 = m_4.
Thus, we can see that
(139)                      N(µ) = n_1^2 + n_2^2 + n_3^2 + n_4^2.
In addition, using the definition of µ and equation (133), we can see that
N(µ) = N(αβ) = N(α)N(β) = p_1 p_2 = m. Therefore, after combining this
result with equation (139), we obtain
(140)                        m = n_1^2 + n_2^2 + n_3^2 + n_4^2,


where n1 , n2 , n3 , n4 ∈ Z.
    If instead we had m_1, ..., m_4 ∈ (1/2)Z_Odd, then by Theorem 5, we know there
exists µ′ ∈ H, an associate of µ = αβ, with integer coordinates. In this case,
we can instead define n_1, n_2, n_3, and n_4 so that
(141)                     µ′ = n_1 + n_2 i + n_3 j + n_4 k.
Hence, N(µ′) = n_1^2 + n_2^2 + n_3^2 + n_4^2. On account of Lemma 6 and equation
(133), we see that N(µ′) = N(αβ) = N(α)N(β) = p_1 p_2 = m. Thus, for this
case as well,
(142)                       m = n_1^2 + n_2^2 + n_3^2 + n_4^2,
where n_1, n_2, n_3, n_4 ∈ Z.
     Now, take n ≥ 2 and assume that for a = p_1 p_2 ··· p_n, there exist integers
a_1, a_2, a_3, and a_4 such that a = a_1^2 + a_2^2 + a_3^2 + a_4^2. We must show that
for m = p_1 p_2 ··· p_n p_{n+1}, there exist integers n_1, n_2, n_3, and n_4 such that
m = n_1^2 + n_2^2 + n_3^2 + n_4^2. Given the assumptions about a, we can write m as
m = a p_{n+1}. From Theorem 16, since p_{n+1} is prime, there exist integers b_1, b_2,
b_3, and b_4 such that p_{n+1} = b_1^2 + b_2^2 + b_3^2 + b_4^2. If here we define
α, β ∈ H such that
(143)                        α = a_1 + a_2 i + a_3 j + a_4 k
(144)                    and β = b_1 + b_2 i + b_3 j + b_4 k,
we find ourselves in the same situation as when we set n = 2. Thus, in the
same manner, we can obtain a quaternion µ with integer coordinates, where either
(145)          µ = αβ                  or                   µ ∼ αβ,
and where
(146)                      µ = n_1 + n_2 i + n_3 j + n_4 k.
Thus, N(µ) = n_1^2 + n_2^2 + n_3^2 + n_4^2 and N(µ) = N(αβ) = N(α)N(β) = a p_{n+1} = m.
We can now see that for this case as well, there exist n_1, n_2, n_3, n_4 ∈ Z
such that
(147)                       m = n_1^2 + n_2^2 + n_3^2 + n_4^2,
which concludes the proof. Therefore, any composite number can be written
as the sum of the squares of four integers.
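
The key step of this proof, multiplying two quaternions and using N(αβ) = N(α)N(β), can be illustrated with a short Python sketch. It assumes quaternions are 4-tuples of integers, with `qmul` following the multiplication rule of equation (3); the names `qmul`, `norm`, and `combine` are hypothetical. When both factors have integer coordinates, the product also has integer coordinates, so the sketch does not need the step through an associate.

```python
def qmul(a, b):
    """Quaternion product, following the multiplication rule of equation (3)."""
    a1, a2, a3, a4 = a
    b1, b2, b3, b4 = b
    return (a1*b1 - a2*b2 - a3*b3 - a4*b4,
            a1*b2 + a2*b1 + a3*b4 - a4*b3,
            a1*b3 - a2*b4 + a3*b1 + a4*b2,
            a1*b4 + a2*b3 - a3*b2 + a4*b1)

def norm(a):
    """Sum of the squares of the four coordinates."""
    return sum(c * c for c in a)

def combine(rep_a, rep_b):
    """Given four-square representations of a and b, return one of a*b."""
    return qmul(rep_a, rep_b)

rep_3 = (1, 1, 1, 0)      # 3 = 1^2 + 1^2 + 1^2 + 0^2
rep_5 = (2, 1, 0, 0)      # 5 = 2^2 + 1^2 + 0^2 + 0^2
rep_15 = combine(rep_3, rep_5)
print(rep_15, norm(rep_15))
```

Applying `combine` repeatedly, once per prime factor, mirrors the induction in the proof.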

There is one last case to consider, the trivial case of when a number is 1
(recall that 1 is neither prime nor composite). The following Lemma proves
that 1 can also be expressed as a sum of the squares of four integers.

Lemma 10. There are integers n_1, n_2, n_3, and n_4 so that 1 = n_1^2 + n_2^2 + n_3^2 + n_4^2.

Proof. Take n_1 = 1 and n_2 = n_3 = n_4 = 0. Thus, 1 = 1^2 + 0^2 + 0^2 + 0^2.
   Finally, we can now formally prove that every positive integer can be
expressed as a sum of the squares of four integers, which was our goal from
the beginning. This result is known as Lagrange’s Theorem.

Theorem 18 (Lagrange’s Theorem). For every n ∈ N, there are integers
n_1, n_2, n_3, and n_4 so that

(148)                       n = n_1^2 + n_2^2 + n_3^2 + n_4^2.

Proof. Every n ∈ N is either prime, composite, or 1, so there are three cases
to consider. From Theorem 16, Theorem 17, and Lemma 10, it is clear that
if n falls under any of these cases, then there exist integers n_1, n_2, n_3, and
n_4 such that n = n_1^2 + n_2^2 + n_3^2 + n_4^2.
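
The statement of the theorem can be checked numerically for small n. The sketch below is a plain exhaustive search, not the quaternion-based argument of this paper, and the function name `four_squares` is an assumption made for illustration.

```python
from math import isqrt

def four_squares(n):
    """Return (n1, n2, n3, n4), n1 >= n2 >= n3 >= n4 >= 0, with squares summing to n."""
    for n1 in range(isqrt(n), -1, -1):
        r1 = n - n1 * n1
        for n2 in range(min(n1, isqrt(r1)), -1, -1):
            r2 = r1 - n2 * n2
            for n3 in range(min(n2, isqrt(r2)), -1, -1):
                r3 = r2 - n3 * n3
                n4 = isqrt(r3)
                # Accept only if the remaining part is a perfect square.
                if n4 <= n3 and n4 * n4 == r3:
                    return n1, n2, n3, n4
    return None  # Lagrange's Theorem says this is never reached for n >= 1

for n in (1, 7, 15, 2007):
    rep = four_squares(n)
    print(n, rep, sum(c * c for c in rep))
```

A check of this kind only confirms finitely many cases; the theorem itself is what guarantees that the search always succeeds.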




                                       28

Quaternion algebra

  • 1.
    A Proof ofLagrange’s Four Square Theorem Using Quaternion Algebras Drew Stokesbary Spring 2007 Abstract Many prime numbers can be expressed as a sum of the squares of two other numbers. This paper explores which numbers can be written as a sum of the squares of four numbers. This question is deeply related to a number system known as quaternion algebra, which will be developed in this paper to describe what numbers can be written as the sum of four squares. In 1770, Joseph Louis Lagrange proved that every positive integer can be expressed as a sum of the squares of four integers, which has henceforth been called Lagrange’s Theorem, and we will eventually prove this very result. In order to reach this conclusion, we will introduce a number system called a Quaternion Algebra, which may be roughly thought of as an extension of the complex numbers or the Gaussian integers. After introducing the fundamentals of quaternions and exploring some peculiarities of arithmetic in this number system, we will begin traveling along the road towards a proof a Lagrange’s Theorem. We will use the arithmetic of quaternions to examine the notion of a norm and of a unit in this number system. These concepts will allow us to develop a division algorithm for quaternions and prove the existence of a greatest (right-hand) common divisor. Finally, we will see what it means to be a prime quaternion. All of these properties of quaternions will eventually allow us to prove that every positive integer can be written as a sum of four squares. 1
  • 2.
    1 Quaternion Arithmetic Formally, we define the quaternions H to be the set of all ordered quadruples (a1 , a2 , a3 , a4 ), where a1 , a2 , a3 , a4 ∈ R. Addition is defined as (1) (a1 , a2 , a3 , a4 ) + (b1 , b2 , b3 , b4 ) = (a1 + b1 , a2 + b2 , a3 + b3 , a4 + b4 ), and multiplication is defined as (2) (a1 , a2 , a3 , a4 ) · (b1 , b2 , b3 , b4 ) = (c1 , c2 , c3 , c4 ), where c1 = a 1 b1 − a 2 b2 − a 3 b3 − a 4 b4 , c2 = a 1 b2 + a 2 b1 + a 3 b4 − a 4 b3 , (3) c3 = a 1 b3 − a 2 b4 + a 3 b1 + a 4 b2 , c4 = a 1 b4 + a 2 b3 − a 3 b2 + a 4 b1 . We use Greek letters to represent quaternions, and if α = (a1 , a2 , a3 , a4 ), then we say that a1 , a2 , a3 , and a4 are the coordinates of α. Definition. If α = (a1 , a2 , a3 , a4 ) and β = (b1 , b2 , b3 , b4 ), then α and β are equal if (a1 , a2 , a3 , a4 ) = (b1 , b2 , b3 , b4 ). The equation x2 = −1 has the quadruples (0, 1, 0, 0), (0, 0, 1, 0), and (0, 0, 0, 1) as solutions. We denote each solution quadruple by the symbols i, j, and k, respectively. In other words, (4) i = (0, 1, 0, 0), (5) j = (0, 0, 1, 0), (6) and k = (0, 0, 0, 1). Although (7) i2 = j 2 = k 2 = −1, we can see from the definition of equality that (8) i = j = k. 2
  • 3.
    We then definei, j, and k so that jk = i = −kj, (9) ki = j = −ik, ij = k = −ji. From this fact, we can see that multiplication of quaternions is not commuta- tive. This peculiarity about quaternions is what sets them apart from other number systems. If α = (a1 , a2 , a3 , a4 ), then we can also write α = a1 + a2 i + a3 j + a4 k. In fact, the aforementioned definition of multiplication comes from expanding the product (a1 + a2 i + a3 j + a4 k)(b1 + b2 i + b3 j + b4 k) according to the distributive property and the multiplicative definitions of i, j, and k. In addition to the set H of quaternions, which we defined as (10) H = {(a1 , a2 , a3 , a4 ) | a1 , a2 , a3 , a4 ∈ R}, we can define other sets of quaternions, such as (11) HQ = {(a1 , a2 , a3 , a4 ) | a1 , a2 , a3 , a4 ∈ Q}, (12) and HZ = {(a1 , a2 , a3 , a4 ) | a1 , a2 , a3 , a4 ∈ Z}. We will now consider the set H ⊂ H, which we define as (13) H = {(a1 , a2 , a3 , a4 ) | a1 , a2 , a3 , a4 ∈ Z or a1 , a2 , a3 , a4 ∈ 1 ZOdd }, 2 where 1 ZOdd is the set of all odd integers divided by 2. The reasoning for 2 defining this odd (no pun intended!) set of quaternions will soon become clear, but before we explore some of the deeper properties of quaternions, we will first show that the set H is closed under addition and multiplication, and that the set is a non-commutative ring. Theorem 1. H is closed under addition. Proof. Suppose α, β ∈ H and (14) α = a 1 + a2 i + a3 j + a4 k (15) and β = b1 + b2 i + b3 j + b4 k. Then, α+β = (a1 , a2 , a3 , a4 )+(b1 , b2 , b3 , b4 ) = (a1 +b1 , a2 +b2 , a3 +b3 , a4 +b4 ). 3
  • 4.
    If the coordinatesof α and β are integers, then the coordinates of α + β are also integers, and so α + β ∈ H . If the coordinates of α and β are not integers, then they are odd integers divided by 2. The sum of any two odd integers is an even integer, and an even integer divided by two is always an integer. Thus if the coordinates of α and β are not integers, then the coordinates of α + β are integers, and α+β ∈H. If the coordinates of either α or β are integers, and the coordinates of the other are not integers, then the coordinates of α + β are the sums of an integer and half of an odd integer. Each of these sums will be half of an odd integer, and thus α + β ∈ H . Theorem 2. H is closed under multiplication. Proof. First, we will define a special quaternion ξ so that 1 (16) ξ = 2 (1 + i + j + k). Thus, any integer quaternion can now be written in the manner (17) ρ = r1 ρ + r2 i + r3 j + r4 k, where r1 , r2 , r3 , r4 ∈ Z. If the coordinates of ρ are integers, then r1 will be even, and if the coordinates of ρ are non-integers, then r1 will be odd. Any quaternion written in this form will be an integer quaternion. Using the definition of multiplication, we then compute (18) ξ 2 = 1 (1 + i + j + k) 1 (1 + i + j + k) = 1 (−1 + i + j + k) = ξ − 1, 2 2 2 (19) ξi = 2 (1 + i + j + k)i = 1 (−1 + i + j − k) = −ξ + i + j 1 2 (20) iξ = i 1 (1 + i + j + k) = 1 (−1 + i − j + k) = −ξ + i + k, 2 2 (21) ξj = 1 (1 + i + j + k)j = 1 (−1 − i + j + k) = −ξ + j + k, 2 2 (22) jξ = j 1 (1 + i + j + k) = 1 (−1 + i + j − k) = −ξ + i + j, 2 2 (23) ξk = 2 (1 + i + j + k)k = 1 (−1 + i − j + k) = −ξ + i + k 1 2 (24) kξ = k 1 (1 + i + j + k) = 1 (−1 − i + j + k) = −ξ + j + k. 2 2 Each of these products is in the form of equation (17), so each is itself an integer quaternion. If we take α = (a1 , a2 , a3 , a4 ) and β = (b1 , b2 , b3 , b4 ), with α, β ∈ H , and rewrite each so that it is in the form of equation (17), then by the definition of 4
  • 5.
    multiplication and theresults of equations (18) through (24), we can see that the product αβ can also be written in the form of equation (17). Therefore, αβ is an integer quaternion, and H is indeed closed under multiplication. Since we have shown that H is closed under addition and multiplication, we will now show that H is a non-commutative ring. Theorem 3. H is a non-commutative ring. Proof. If we say that α, β, and γ are of the form (25) α = (a1 , a2 , a3 , a4 ), (26) β = (b1 , b2 , b3 , b4 ), (27) and γ = (c1 , c2 , c3 , c4 ), we can then show that each of the seven conditions for a non-commutative ring hold. For all α, β ∈ H , we see from the definition of addition that α + β = (a1 , a2 , a3 , a4 ) + (b1 , b2 , b3 , b4 ) = (a1 + b1 , a2 + b2 , a3 + b3 , a4 + b4 ) (28) = (b1 + a1 , b2 + a2 , b3 + a3 , b4 + a4 ) = (b1 , b2 , b3 , b4 ) + (a1 , a2 , a3 , a4 ) = β + α. Thus, the commutative law of addition holds. For all α, β, γ ∈ H , we see from the definition of addition that (α + β) + γ = [(a1 , a2 , a3 , a4 ) + (b1 , b2 , b3 , b4 )] + (c1 , c2 , c3 , c4 ) = (a1 + b1 , a2 + b2 , a3 + b3 , a4 + b4 ) + (c1 , c2 , c3 , c4 ) = (a1 + b1 + c1 , a2 + b2 + c2 , a3 + b3 + c3 , a4 + b4 + c4 ) (29) = ((a1 , a2 , a3 , a4 ) + (b1 + c1 , b2 + c2 , b3 + c3 , b4 + c4 ) = ((a1 , a2 , a3 , a4 ) + [(b1 , b2 , b3 , b4 ) + (c1 , c2 , c3 , c4 )] = α + (β + γ). Thus, the associative law of addition holds. It is clear that (0, 0, 0, 0) ∈ H . For all α ∈ H , (30) (0, 0, 0, 0) + α = (0 + a1 , 0 + a2 , 0 + a3 , 0 + a4 ) = (a1 , a2 , a3 , a4 ) = α. 5
  • 6.
Thus there exists an element which is an additive identity.

Since (a1, a2, a3, a4) ∈ H, then (−a1, −a2, −a3, −a4) ∈ H. We can see that (a1, a2, a3, a4) + (−a1, −a2, −a3, −a4) = (a1 − a1, a2 − a2, a3 − a3, a4 − a4) = (0, 0, 0, 0), so every α ∈ H has an additive inverse in H.

For all α, β, γ ∈ H, we can see from the definition of multiplication that (αβ)γ = α(βγ), so multiplication in H is associative.

The element (1, 0, 0, 0) is in the set H, and from the definition of multiplication, we see that for all α ∈ H,

(31)      (1, 0, 0, 0)α = (1, 0, 0, 0)(a1, a2, a3, a4) = (a1, a2, a3, a4) = α

and

(32)      α(1, 0, 0, 0) = (a1, a2, a3, a4)(1, 0, 0, 0) = (a1, a2, a3, a4) = α.

Thus H includes an element which is a multiplicative identity.

If α, β, γ ∈ H, then α(β + γ) = α(b1 + c1, b2 + c2, b3 + c3, b4 + c4). Expanding each coordinate with the definition of multiplication and regrouping gives αβ + αγ; for example, the first coordinate is a1(b1 + c1) − a2(b2 + c2) − a3(b3 + c3) − a4(b4 + c4) = (a1b1 − a2b2 − a3b3 − a4b4) + (a1c1 − a2c2 − a3c3 − a4c4). The same computation shows that (β + γ)α = βα + γα, and thus multiplication is distributive.

Therefore, each of the seven laws necessary for the set H to be a non-commutative ring holds.

Now that various arithmetic properties of quaternions have been established (namely, the definitions of addition and multiplication for quaternions and the fact that the integer quaternions are closed under addition and multiplication and form a non-commutative ring), we can begin to develop certain intermediate theorems about quaternions which will help us in our quest to prove Lagrange’s Theorem.

2      Quaternion Norms

The norm is an important concept in other number systems like Z[i]. Here we draw on many of the ideas from such number systems in order to develop the idea of a norm for quaternions.

The first idea we will borrow from other number systems is that of the conjugate. Recall that in C, the number (a + bi) has a conjugate, namely (a − bi). We define something similar for quaternions.

Definition. Let α = a1 + a2 i + a3 j + a4 k. Then ᾱ = a1 − a2 i − a3 j − a4 k is the conjugate of α.
Note that it is clear that if α ∈ H, then ᾱ ∈ H. Thus, knowing how to form the conjugate of a quaternion, we can define the norm of a quaternion in the same way as in number systems like Z[i] or C.

Definition. For any quaternion α, N(α) = αᾱ is the norm of α.

Lemma 1. Let α = a1 + a2 i + a3 j + a4 k. Then

(33)      N(α) = αᾱ = ᾱα = a1² + a2² + a3² + a4².

Proof. If α = a1 + a2 i + a3 j + a4 k, then ᾱ = a1 − a2 i − a3 j − a4 k and N(α) = αᾱ. From the definition of multiplication,

(34)      αᾱ = (a1 + a2 i + a3 j + a4 k)(a1 − a2 i − a3 j − a4 k) = c1 + c2 i + c3 j + c4 k,

where

(35)      c1 = a1a1 − a2(−a2) − a3(−a3) − a4(−a4) = a1² + a2² + a3² + a4²,
          c2 = a1(−a2) + a2a1 + a3(−a4) − a4(−a3) = 0,
          c3 = a1(−a3) − a2(−a4) + a3a1 + a4(−a2) = 0,
          c4 = a1(−a4) + a2(−a3) − a3(−a2) + a4a1 = 0.

Substituting the results of equation (35) into equation (34), we obtain

(36)      αᾱ = a1² + a2² + a3² + a4².

Similarly, we see that

(37)      ᾱα = (a1 − a2 i − a3 j − a4 k)(a1 + a2 i + a3 j + a4 k) = c1 + c2 i + c3 j + c4 k,

where

(38)      c1 = a1a1 − (−a2)a2 − (−a3)a3 − (−a4)a4 = a1² + a2² + a3² + a4²,
          c2 = a1a2 + (−a2)a1 + (−a3)a4 − (−a4)a3 = 0,
          c3 = a1a3 − (−a2)a4 + (−a3)a1 + (−a4)a2 = 0,
          c4 = a1a4 + (−a2)a3 − (−a3)a2 + (−a4)a1 = 0.

After substituting the results of equation (38) into equation (37), we again obtain

(39)      ᾱα = a1² + a2² + a3² + a4².

Therefore,

(40)      N(α) = αᾱ = ᾱα = a1² + a2² + a3² + a4².
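Lemma 1 can be spot-checked numerically. The following is a minimal sketch, assuming quaternions are stored as 4-tuples (a1, a2, a3, a4); the helper names qmul and conj are illustrative and do not come from the paper. The product uses the same coordinate formulas that appear in equations (35) and (38).

```python
def qmul(a, b):
    """Quaternion product of two 4-tuples, coordinate by coordinate."""
    a1, a2, a3, a4 = a
    b1, b2, b3, b4 = b
    return (
        a1 * b1 - a2 * b2 - a3 * b3 - a4 * b4,
        a1 * b2 + a2 * b1 + a3 * b4 - a4 * b3,
        a1 * b3 - a2 * b4 + a3 * b1 + a4 * b2,
        a1 * b4 + a2 * b3 - a3 * b2 + a4 * b1,
    )


def conj(a):
    """Conjugate: negate the i, j, and k coordinates."""
    a1, a2, a3, a4 = a
    return (a1, -a2, -a3, -a4)


alpha = (1, 2, 3, 4)
left = qmul(alpha, conj(alpha))    # alpha times its conjugate
right = qmul(conj(alpha), alpha)   # conjugate times alpha
print(left, right)                 # both are (30, 0, 0, 0), since 1 + 4 + 9 + 16 = 30
```

Both products have zero i, j, and k coordinates, which is exactly what equations (35) and (38) assert.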
Remember the seemingly bizarre way we defined the set H? The reason for constructing the set in such a manner was so that the norm of any element of the set would be an integer.

Lemma 2. Let α ∈ H where α = (a1, a2, a3, a4) ≠ 0. Then N(α) ∈ N.

Proof. Since α ∈ H, there are two possibilities. Either a1, a2, a3, a4 ∈ Z or a1, a2, a3, a4 ∈ (1/2)ZOdd.

First consider the case where a1, a2, a3, a4 ∈ Z. Because the integers are closed under multiplication, we know that

(41)      a1², a2², a3², a4² ∈ Z,

and because they are closed under addition as well, we know

(42)      a1² + a2² + a3² + a4² ∈ Z.

Now consider the alternative, that a1, a2, a3, a4 ∈ (1/2)ZOdd. If we use (1/2)ZOdd to mean the set of all odd integers divided by 2, then the set of all odd integers divided by 4 will be denoted as (1/4)ZOdd. If m/2 ∈ (1/2)ZOdd, then (m/2)² = m²/4. Since m is odd, m² is odd, and m²/4 ∈ (1/4)ZOdd. Thus, if a1, a2, a3, a4 ∈ (1/2)ZOdd, then we can see that

(43)      a1², a2², a3², a4² ∈ (1/4)ZOdd.

Every odd square is congruent to 1 modulo 4, so the four odd numerators sum to a multiple of 4. Hence when we add these four numbers from the set (1/4)ZOdd, we obtain an integer, so that

(44)      a1² + a2² + a3² + a4² ∈ Z.

Therefore, for this case as well, N(α) ∈ Z.

In addition, by trichotomy, the square of any real number is non-negative. Thus, N(α) ∈ N ∪ {0}. However, since α ≠ 0, we know it has at least one non-zero coordinate. Thus a1² + a2² + a3² + a4² > 0, and N(α) ∉ {0}. Therefore, N(α) ∈ N.

Definition. Suppose α ∈ H. Then we say α is an integer quaternion.

The term integer quaternion comes from the fact that the norm of a quaternion in the set H is an integer, which was proven in Lemma 2.
Definition. Suppose α ∈ H. Then we say α is odd if N(α) is odd and even if N(α) is even.

In order to prove deeper results about quaternions, we need the norm to have the property that N(αβ) = N(α)N(β). Before we can prove this fact, however, we must first prove the following Lemma, which involves some lengthy arithmetic. Throughout, a bar over a product denotes the conjugate of that product, written here as conj(αβ).

Lemma 3. Let α = a1 + a2 i + a3 j + a4 k and β = b1 + b2 i + b3 j + b4 k. Then conj(αβ) = β̄ᾱ.

Proof. By the definition of multiplication,

(45)      αβ = c1 + c2 i + c3 j + c4 k,

where

(46)      c1 = a1b1 − a2b2 − a3b3 − a4b4,
          c2 = a1b2 + a2b1 + a3b4 − a4b3,
          c3 = a1b3 − a2b4 + a3b1 + a4b2,
          c4 = a1b4 + a2b3 − a3b2 + a4b1.

By definition of the conjugate,

(47)      conj(αβ) = c1 − c2 i − c3 j − c4 k.

After establishing what conj(αβ) looks like, we now turn to β̄ᾱ. By the definition of the conjugate, we see that

(48)      ᾱ = a1 − a2 i − a3 j − a4 k
(49)      and β̄ = b1 − b2 i − b3 j − b4 k,

and by the definition of multiplication, we have

(50)      β̄ᾱ = c1′ + c2′ i + c3′ j + c4′ k,

where

(51)      c1′ = b1a1 − (−b2)(−a2) − (−b3)(−a3) − (−b4)(−a4),
          c2′ = b1(−a2) + (−b2)a1 + (−b3)(−a4) − (−b4)(−a3),
          c3′ = b1(−a3) − (−b2)(−a4) + (−b3)a1 + (−b4)(−a2),
          c4′ = b1(−a4) + (−b2)(−a3) − (−b3)(−a2) + (−b4)a1.
By simple algebraic manipulation, we can transform equation (51) as

(52)      c1′ = b1a1 − b2a2 − b3a3 − b4a4 = a1b1 − a2b2 − a3b3 − a4b4,
          c2′ = −b1a2 − b2a1 + b3a4 − b4a3 = −a1b2 − a2b1 − a3b4 + a4b3,
          c3′ = −b1a3 − b2a4 − b3a1 + b4a2 = −a1b3 + a2b4 − a3b1 − a4b2,
          c4′ = −b1a4 + b2a3 − b3a2 − b4a1 = −a1b4 − a2b3 + a3b2 − a4b1.

Notice that, comparing with equation (46), this makes

(53)      c1′ = c1, c2′ = −c2, c3′ = −c3, and c4′ = −c4.

Substituting equation (53) into equation (50), we find that

(54)      β̄ᾱ = c1 − c2 i − c3 j − c4 k.

From equations (47) and (54), it is clear that

(55)      conj(αβ) = β̄ᾱ.

Now our desired property of the norm, that N(αβ) = N(α)N(β), follows easily from Lemma 3.

Lemma 4. Let α and β be quaternions. Then N(αβ) = N(α)N(β).

Proof. N(αβ) = αβ conj(αβ) = αβ β̄ᾱ = αN(β)ᾱ = αᾱN(β) = N(α)N(β).

3      Quaternion Units

We now turn to units, an idea which has a place in nearly every number system imaginable. In N, 1 is a unit; in Z, 1 and −1 are units; in Z[i], there are four units: 1, −1, i, and −i; and in Zp, every non-zero element is a unit. Yet, despite the existence of different units in different number systems, the definition of a unit remains the same in each. We will take this same definition and apply it to quaternions.

Definition. We say ε ∈ H is a unit if there is a quaternion α ∈ H so that εα = αε = 1.
In number systems which have the concept of a norm, such as Z[i], a unit u has the property that N(u) = 1. To show this is true for quaternions, we must first define the inverse of a quaternion.

Definition. Suppose α is nonzero and α ∈ H. Then there exists α⁻¹, called the inverse of α, so that

(56)      α⁻¹ = ᾱ / N(α).

We saw that a quaternion α ∈ H has a conjugate ᾱ ∈ H and a norm N(α) ∈ Z. Thus we can see that it has an inverse α⁻¹ ∈ HQ. However, if we know that N(α) = 1, then α⁻¹ = ᾱ, so α⁻¹ ∈ H. This fact is the gateway for our discussion of units.

Lemma 5. α is a unit if and only if N(α) = 1.

Proof. From equations (33) and (56), we see

(57)      αα⁻¹ = αᾱ / N(α) = N(α) / N(α) = 1

and

(58)      α⁻¹α = ᾱα / N(α) = N(α) / N(α) = 1.

Suppose first that α is a unit, so that α⁻¹ ∈ H and α⁻¹ is a unit as well. We can take the norm of both sides of equation (57) or (58) to obtain

(59)      N(αα⁻¹) = N(1),

which by Lemma 4 we know is

(60)      N(α)N(α⁻¹) = 1.

By Lemma 2, both N(α) and N(α⁻¹) are natural numbers, so the only solution to equation (60) is N(α) = N(α⁻¹) = 1.

Conversely, suppose N(α) = 1. Then α⁻¹ = ᾱ ∈ H, and we can use our definition of inverse and norm to see

(61)      N(α⁻¹) = N(ᾱ / N(α)) = ᾱα / (N(α)N(α)) = N(α) / (N(α)N(α)) = 1 / N(α) = 1.

Equations (57) and (58) then show that α is a unit. Therefore, it is indeed the case that α is a unit if and only if N(α) = 1.
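The distinction between an inverse in HQ and an inverse in H can be illustrated with exact rational arithmetic. The sketch below assumes quaternions are 4-tuples and that membership in H means "all coordinates integers, or all coordinates halves of odd integers", as used in the proof of Lemma 2; the helper names (qmul, conj, norm, inverse, in_H) are illustrative only.

```python
from fractions import Fraction


def qmul(a, b):
    """Quaternion product of two 4-tuples."""
    a1, a2, a3, a4 = a
    b1, b2, b3, b4 = b
    return (
        a1 * b1 - a2 * b2 - a3 * b3 - a4 * b4,
        a1 * b2 + a2 * b1 + a3 * b4 - a4 * b3,
        a1 * b3 - a2 * b4 + a3 * b1 + a4 * b2,
        a1 * b4 + a2 * b3 - a3 * b2 + a4 * b1,
    )


def conj(a):
    a1, a2, a3, a4 = a
    return (a1, -a2, -a3, -a4)


def norm(a):
    """Sum of the squares of the coordinates, as in Lemma 1."""
    return sum(x * x for x in a)


def inverse(a):
    """Conjugate divided by the norm, as in equation (56)."""
    n = norm(a)
    return tuple(Fraction(x) / n for x in conj(a))


def in_H(a):
    """True if all coordinates are integers or all are halves of odd integers."""
    doubled = [2 * Fraction(x) for x in a]
    if any(d.denominator != 1 for d in doubled):
        return False
    return len({d.numerator % 2 for d in doubled}) == 1


half = Fraction(1, 2)
eps = (half, half, half, half)   # norm 1
alpha = (1, 1, 0, 0)             # norm 2

for q in (eps, alpha):
    inv = inverse(q)
    print(norm(q), qmul(q, inv) == (1, 0, 0, 0), in_H(inv))
```

Both quaternions have a two-sided inverse with rational coordinates, but only the one of norm 1 has its inverse inside H, in line with Lemma 5.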
Since we know that α ∈ H must have coordinates in either the set Z or the set (1/2)ZOdd, it would seem reasonable to conjecture that there are only a finite number of units. Thus we will now take a moment to examine exactly how many quaternions are units in H, and also what they are.

Theorem 4. There are 24 units in H.

Proof. Say α = a1 + a2 i + a3 j + a4 k and α ∈ H. From Lemma 5, if α is a unit, then N(α) = 1. Since α ∈ H, then either a1, a2, a3, a4 ∈ Z or a1, a2, a3, a4 ∈ (1/2)ZOdd.

If a1, a2, a3, a4 ∈ Z, then we know a1², a2², a3², a4² ∈ N ∪ {0}. Since N(α) = 1, we can see that exactly one of a1, a2, a3, and a4 must equal ±1, while the other three must equal 0. This provides a total of 8 possible units.

On the other hand, if a1, a2, a3, a4 ∈ (1/2)ZOdd, then we can observe that a1², a2², a3², a4² ∈ (1/4)ZOdd. If N(α) = 1, then the only solution in fourths of odd integers comes when a1² = a2² = a3² = a4² = 1/4. Thus, a1 = ±1/2, a2 = ±1/2, a3 = ±1/2, and a4 = ±1/2. These values for a1, a2, a3, and a4 present 16 possible units.

Therefore, we have 8 units when the coordinates of the units are integers and 16 units when the coordinates are non-integers, for a total of 24 different units in H.

Now knowing the coordinates of all 24 units, we immediately reach the following corollary.

Corollary 1. The units in H are: ±1, ±i, ±j, ±k, and (1/2)(±1 ± i ± j ± k).

Another concept for quaternions which we will borrow from other number systems is the associate. The following definition for the associate of a quaternion is identical to the definition for associates in number systems such as Z[i].

Definition. Let α be a quaternion. If ε ∈ H is a unit, then εα and αε are called associates of α. If β = εα, then it is said that β associates α, written β ∼ α.

We will now prove four lemmas which will be immensely helpful in many of the more difficult proofs which lie ahead.
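The count in Theorem 4 can be reproduced by listing the two families from Corollary 1 directly. This is a small sketch, assuming units are represented as 4-tuples of exact fractions; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

half = Fraction(1, 2)

# Family 1: exactly one coordinate is +1 or -1, the others are 0.
integer_units = []
for position in range(4):
    for sign in (1, -1):
        coords = [Fraction(0)] * 4
        coords[position] = Fraction(sign)
        integer_units.append(tuple(coords))

# Family 2: every coordinate is +1/2 or -1/2.
half_units = [tuple(s * half for s in signs)
              for signs in product((1, -1), repeat=4)]

units = integer_units + half_units

# Every listed element has norm 1, and all of them are distinct.
all_norm_one = all(sum(x * x for x in u) == 1 for u in units)
print(len(integer_units), len(half_units), len(set(units)), all_norm_one)
```

The three counts printed are 8, 16, and 24, matching the proof of Theorem 4.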
Lemma 6. Suppose α, β ∈ H. If α ∼ β, then N(α) = N(β).

Proof. From the definition of associates, there is a unit ε ∈ H such that

(62)      α = εβ.

Taking the norm of equation (62), we obtain

(63)      N(α) = N(εβ) = N(ε)N(β).

But from Lemma 5, N(ε) = 1, so equation (63) reduces to

(64)      N(α) = N(β),

as desired.

Lemma 7. If α ∼ β and β ∼ γ, then α ∼ γ.

Proof. From the definition of associates, if α ∼ β and β ∼ γ, then there exist units ε1 and ε2 so that

(65)      α = ε1β and β = ε2γ.

Combining these two equations, we see that

(66)      α = (ε1ε2)γ.

What is the product ε1ε2? Since ε1 and ε2 are themselves units, it would seem that ε1ε2 is also a unit. We can verify that their product is a unit as well by taking the norm:

(67)      N(ε1ε2) = N(ε1)N(ε2) = 1 · 1 = 1.

By Lemma 5, since N(ε1ε2) = 1, we see that indeed ε1ε2 is also a unit. Knowing this, we can refer back to equation (66) and see that α ∼ γ.

Lemma 8. α ∼ β if and only if β ∼ α.

Proof. If α ∼ β, then there exists a unit ε so that

(68)      α = εβ.
We can multiply both sides of equation (68) on the left by ε⁻¹, the inverse of ε, so that

(69)      ε⁻¹α = ε⁻¹εβ.

Recall from the discussion of inverses that ε⁻¹ε = 1 and that if ε is a unit, then ε⁻¹ is also a unit. Thus,

(70)      β = ε⁻¹α,

and since ε⁻¹ is a unit, we can say β ∼ α. Finally, the converse can be proven by merely switching α and β.

Lemma 9. If α ∈ H and β ∼ α, then β ∈ H.

Proof. If β ∼ α, then there exists a unit ε ∈ H such that

(71)      β = εα.

From Theorem 2, we know that H is closed under multiplication, that is, the product of two elements of H is itself an element of H. Thus, from equation (71), we can easily see that β ∈ H.

The following theorem has a difficult proof and, while the result is certainly true, may appear to be meaningless and irrelevant. However, this could not be farther from the truth, as we will later see that this theorem provides a crucial link in the proofs of which numbers can be written as a sum of four integer squares.

Theorem 5. If α = a1 + a2 i + a3 j + a4 k ∈ H, then there is β = b1 + b2 i + b3 j + b4 k so that β ∼ α and β ∈ HZ.

Proof. It is given that α ∈ H, so we know either a1, a2, a3, a4 ∈ Z or a1, a2, a3, a4 ∈ (1/2)ZOdd.

The simple case is when a1, a2, a3, a4 ∈ Z. Let ε be the unit ε = 1. Take

(72)      β = εα.

Thus, β ∼ α. Furthermore, from the definition of multiplication, we see that

(73)      εα = α,
so

(74)      β = α.

From the definition of equal quaternions, the coordinates of β are the same as those of α, and thus β ∈ HZ and β ∼ α.

The other case is when a1, a2, a3, a4 ∈ (1/2)ZOdd. Through simple algebra, we can manipulate the terms of α into the form δ + γ, where

(75)      δ = d1 + d2 i + d3 j + d4 k, di ∈ ZEven, and γ = (1/2)(±1 ± i ± j ± k),

so that

(76)      α = δ + γ.

From Corollary 1, we know γ is a unit, as is γ̄, so

(77)      γγ̄ = 1.

Because each of d1, d2, d3, and d4 is even, according to the definition of multiplication, the coordinates of δγ̄ will be integers. Since γ̄ is a unit, by taking β = αγ̄, it is plain that β ∼ α. It follows from equation (76) that

(78)      β = αγ̄ = (δ + γ)γ̄ = δγ̄ + γγ̄.

Therefore, because δγ̄ has integer coordinates and γγ̄ = 1, the coordinates of δγ̄ + γγ̄, and thus of β, are integers. Therefore, β ∼ α and β ∈ HZ.

We have seen that norms, units, and associates of quaternions have near-identical definitions to their counterparts living in number systems like Z[i]. In the area of divisors, however, quaternions begin to distinguish themselves from these other number systems. This is due to the fact that, unlike for the Gaussian integers, multiplication of quaternions is not commutative. Naturally then, it makes sense that division in H differs from division in Z[i].

Definition. If α, β, γ ∈ H and γ = αβ, then we say α is a left-hand divisor of γ and write α γ, and that β is a right-hand divisor of γ and write β γ.

The distinction between left- and right-hand divisors is necessary because, in general, αβ ≠ βα. As we saw earlier, multiplication is not commutative. For the purposes of this paper, we will work with right-hand divisors, but
for consistency’s sake only. Every proof involving a right-hand divisor could easily be modified to prove a similar result using left-hand divisors. (Note, however, that multiplication between an integer and a quaternion is commutative, so if α ∈ Z or β ∈ Z, then αβ = βα, and the distinction between left- and right-hand divisors is unnecessary.)

The following theorem is necessary in order to prove the existence of a division algorithm in H. In order for long division to be “useful,” repeated long division must terminate; that is, the remainder must be less than the divisor. This theorem proves just that: if there is long division, then the remainder term will be less than the divisor.

Theorem 6. Suppose κ ∈ H and m ∈ Z with m ≠ 0. Then there exists λ ∈ H so that

(79)      N(κ − mλ) < m².

Proof. First note that because m ∈ Z, the norm of m is given by N(m) = m², so it may also be said that we are proving

(80)      N(κ − mλ) < N(m).

If κ = (k1, k2, k3, k4) and λ = (l1, l2, l3, l4), then

(81)      κ − mλ = (k1 − ml1, k2 − ml2, k3 − ml3, k4 − ml4).

We want to find when N(κ − mλ) < N(m), which happens when

(82)      (k1 − ml1)² + (k2 − ml2)² + (k3 − ml3)² + (k4 − ml4)² < m².

Observe that if each |ki − mli| < |m/2|, then (ki − mli)² < m²/4, and

(83)      (k1 − ml1)² + (k2 − ml2)² + (k3 − ml3)² + (k4 − ml4)² < 4 · (m²/4) = m².

So we merely need to find each li so that

(84)      |ki − mli| < |m/2|,

which, for m > 0, occurs when

(85)      (2ki − m)/(2m) < li < (2ki + m)/(2m).

We can see that (2ki + m)/(2m) − (2ki − m)/(2m) = 1, that is, the interval between (2ki − m)/(2m) and (2ki + m)/(2m) has length 1. This ensures there exists at least one li in the interval such that li ∈ Z or li ∈ (1/2)ZOdd. Finding each coordinate of λ, namely l1, l2, l3, and l4, in this manner ensures that λ ∈ H and N(κ − mλ) < N(m).
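One concrete way to pick λ in Theorem 6 is to round κ/m to a nearby element of H. The sketch below is an illustration under stated assumptions rather than the construction in the proof: it tries two candidates (the nearest point with all integer coordinates and the nearest point with all coordinates in (1/2)ZOdd) and keeps the closer one, which keeps λ inside H. For each coordinate the two candidate distances add up to 1/2, so the two squared distances total at most 1 and the smaller is at most 1/2; hence N(κ − mλ) ≤ m²/2 < m². The function names are hypothetical.

```python
from fractions import Fraction
from math import floor


def norm(a):
    return sum(x * x for x in a)


def nearest_in_H(q):
    """Closest element of H to the rational 4-tuple q (ties go to integers)."""
    ints = tuple(Fraction(floor(x + Fraction(1, 2))) for x in q)
    halves = tuple(floor(x) + Fraction(1, 2) for x in q)

    def dist(p):
        return sum((x - y) ** 2 for x, y in zip(q, p))

    return min(ints, halves, key=dist)


def theorem6_lambda(kappa, m):
    """A lambda in H with N(kappa - m*lambda) < m^2, for a non-zero integer m."""
    return nearest_in_H(tuple(Fraction(k) / m for k in kappa))


kappa, m = (3, 1, 0, 0), 2
lam = theorem6_lambda(kappa, m)
remainder = tuple(k - m * l for k, l in zip(kappa, lam))
print(lam, remainder, norm(remainder), m * m)
```

The example κ = (3, 1, 0, 0), m = 2 is chosen deliberately: κ/m has two coordinates that are exact halves and two that are exact integers, so the coordinates cannot all be rounded strictly within 1/2 while staying in H, yet the norm bound still holds.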
Armed with Theorem 6, we now look to prove the existence of a full-fledged division algorithm for H.

Theorem 7 (Division Algorithm). Suppose α, β ∈ H and β ≠ 0. Then there are λ, γ ∈ H so that

(86)      α = λβ + γ,

with 0 ≤ N(γ) < N(β).

Proof. Define κ ∈ H and m ∈ N such that

(87)      κ = αβ̄ and m = N(β).

From Theorem 6, we know there exists λ ∈ H such that N(κ − mλ) < m². Now, using such a λ derived from Theorem 6, define γ as

(88)      γ = α − λβ.

This satisfies α = λβ + γ from equation (86). From the definitions of κ and m, we see that

(89)      (α − λβ)β̄ = αβ̄ − λββ̄ = κ − mλ,

so thus

(90)      N[(α − λβ)β̄] = N(κ − mλ).

From Theorem 6, we know that N(κ − mλ) < m², so

(91)      N[(α − λβ)β̄] = N(α − λβ)N(β̄) < m².

Since N(β̄) = β̄β = m, we can apply the cancellation law to see

(92)      N(α − λβ) < m,

and from the definitions of γ and m, we can substitute to conclude

(93)      N(γ) < N(β).
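The proof of Theorem 7 is constructive once a λ from Theorem 6 is available, so it can be turned into a short routine. The sketch below follows equations (87) and (88); the rounding helper is an assumption of this example (it picks the nearer of the all-integer and all-half-odd candidates so that λ stays in H), and all function names are illustrative.

```python
from fractions import Fraction
from math import floor


def qmul(a, b):
    a1, a2, a3, a4 = a
    b1, b2, b3, b4 = b
    return (
        a1 * b1 - a2 * b2 - a3 * b3 - a4 * b4,
        a1 * b2 + a2 * b1 + a3 * b4 - a4 * b3,
        a1 * b3 - a2 * b4 + a3 * b1 + a4 * b2,
        a1 * b4 + a2 * b3 - a3 * b2 + a4 * b1,
    )


def conj(a):
    a1, a2, a3, a4 = a
    return (a1, -a2, -a3, -a4)


def norm(a):
    return sum(x * x for x in a)


def nearest_in_H(q):
    """Closest element of H to the rational 4-tuple q."""
    ints = tuple(Fraction(floor(x + Fraction(1, 2))) for x in q)
    halves = tuple(floor(x) + Fraction(1, 2) for x in q)
    return min(ints, halves,
               key=lambda p: sum((x - y) ** 2 for x, y in zip(q, p)))


def divide(alpha, beta):
    """Return (lam, gamma) with alpha = lam*beta + gamma and N(gamma) < N(beta)."""
    m = norm(beta)                       # m = N(beta), equation (87)
    kappa = qmul(alpha, conj(beta))      # kappa = alpha * conj(beta)
    lam = nearest_in_H(tuple(Fraction(k) / m for k in kappa))
    gamma = tuple(a - p for a, p in zip(alpha, qmul(lam, beta)))  # equation (88)
    return lam, gamma


alpha = (7, 3, -2, 5)
beta = (2, 1, 0, -1)
lam, gamma = divide(alpha, beta)
print(lam, gamma, norm(gamma), norm(beta))
```

Because the quotient multiplies β on the left, this is the right-hand division used throughout the paper; a left-hand version would form conj(β)·α instead.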
4      Quaternion GCDs

In addition to norms and units, we are now familiar with other important ideas about quaternions, namely division and divisors. In other number systems, we might want to consider when a number is the greatest common divisor, or GCD, of two other numbers. If d is the GCD of a and b, then we can easily see that d divides all linear combinations of a and b. This knowledge can be useful for many reasons, such as proving unique prime factorization within a number system or finding solutions of linear Diophantine equations. Also in other number systems, the GCD has important connections to primality, and such a connection also exists for quaternions. For this reason, an examination of GCDs in H now will aid our later study of primes in H.

As we have done with concepts such as the norm, units, and associates, in order to define a GCD in H, we look to the definition of a GCD in other number systems. However, knowing that multiplication is not commutative, we must again be careful to distinguish between left- and right-hand divisors.

Definition. Let α, β be quaternions. Then α and β have a greatest common right-hand divisor δ, denoted gcdr(α, β), if (1) δ is a right-hand divisor of both α and β, and (2) every common right-hand divisor of α and β is also a right-hand divisor of δ.

Theorem 8 (Greatest Common Divisor). Given α, β ∈ H, where at least one of α and β is non-zero, α and β have a greatest common right-hand divisor δ, which is unique up to associates and can be written in the form

(94)      δ = µα + νβ,

where µ, ν ∈ H.

Proof. Let Γ be a set defined as

(95)      Γ = {N(µα + νβ) | µ, ν ∈ H and µα + νβ ≠ 0}.

By Lemma 2, every element of Γ is a natural number, so Γ ⊆ N. Furthermore, since α and β are not both zero, taking µ = ᾱ and ν = β̄ gives µα + νβ = N(α) + N(β) ≠ 0, so N(N(α) + N(β)) ∈ Γ and Γ ≠ ∅. Thus, the Well Ordering Principle applies to Γ, so we know there exists g0 = N(µ0α + ν0β) such that
g0 ≤ g for all g ∈ Γ. Also, define δ = µ0α + ν0β. We want to show that δ is a greatest common right-hand divisor of α and β.

By Theorem 7, we know that given α, δ ∈ H, we can find λ, γ ∈ H so that

(96)      α = λδ + γ,

where 0 ≤ N(γ) < N(δ). Since δ = µ0α + ν0β, we can substitute to show

(97)      α = λ(µ0α + ν0β) + γ,

and then rearrange to show

(98)      γ = (1 − λµ0)α + (−λν0)β.

On account of the closure of addition and multiplication, (1 − λµ0), (−λν0) ∈ H, and thus N(γ) ∈ Γ if N(γ) > 0. Assume momentarily that N(γ) ∈ Γ. By Theorem 7, we know that N(γ) < N(δ); however, g0 = N(δ) is the least element of Γ, so N(γ) ∉ Γ, and thus we know N(γ) = 0, and also γ = 0. Equation (96) now becomes

(99)      α = λδ,

and by definition, δ is a right-hand divisor of α. Similarly, we can see that δ is a right-hand divisor of β.

Now let κ ∈ H and assume that κ is a right-hand divisor of both α and β. Then it follows that κ is a right-hand divisor of µ0α and of ν0β, and hence of (µ0α + ν0β). Since δ = µ0α + ν0β, κ is a right-hand divisor of δ.

Thus, δ is a right-hand divisor of α and β, and an arbitrary common right-hand divisor of α and β is also a right-hand divisor of δ. Therefore, by definition, δ is a greatest common right-hand divisor.

Theorem 9. Suppose α, β ∈ H. If gcdr(α, β) = δ and δ is not a unit, then gcd[N(α), N(β)] = N(δ), and N(δ) > 1.

Proof. It is given that

(100)     gcdr(α, β) = δ,

and so δ is a right-hand divisor of α and β, and there exist γ1, γ2 ∈ H such that

(101)     α = γ1δ
(102)     and β = γ2δ.
Taking the norm of both sides of these equations gives

(103)     N(α) = N(γ1δ) = N(γ1)N(δ)
(104)     and N(β) = N(γ2δ) = N(γ2)N(δ).

From Lemma 2, the norm of a quaternion in H is an integer, and from divisibility for the integers, we can say that

(105)     N(δ) | N(α) and N(δ) | N(β).

Thus N(δ) is, at minimum, a common divisor of N(α) and N(β).

Another consequence of equation (100) is that if there exists λ ∈ H that is also a right-hand divisor of α and β, then λ must be a right-hand divisor of δ. If such a λ exists, then there is γ ∈ H such that δ = γλ. Substituting this into equations (101) and (102), we see that

(106)     α = γ1γλ
(107)     and β = γ2γλ.

Following similar reasoning as above, this implies that

(108)     N(λ) | N(α), N(λ) | N(β), and N(λ) | N(δ).

Therefore, if gcdr(α, β) = δ, we can draw two conclusions. First, N(δ) | N(α) and N(δ) | N(β). Second, if there exists λ such that N(λ) | N(α) and N(λ) | N(β), then it must also be the case that N(λ) | N(δ). These are the two criteria for N(δ) to be the greatest common divisor of N(α) and N(β), so gcd[N(α), N(β)] = N(δ). In addition, it is given that δ is not a unit, so N(δ) ≠ 1, which, along with Lemma 2, implies that N(δ) > 1.

Theorem 10. Suppose α ∈ H and let β = m ∈ N. Then gcdr(α, β) = 1 if and only if gcd[N(α), m] = 1.

Proof. By Theorem 8, the following statements are equivalent:

(109)     gcdr(α, β) = 1,

and there exist µ, ν ∈ H such that

(110)     1 = µα + νβ.
Equation (110) can be rearranged in the form

(111)     µα = 1 − νβ.

Substituting m for β in equation (111) and taking the norm of both sides gives

(112)     N(µα) = N(1 − mν) = (1 − mν)(1 − mν̄),

which can be expanded to

(113)     N(µ)N(α) = 1 − mν − mν̄ + m²N(ν)

and then rearranged as

(114)     N(µ)N(α) + mν + mν̄ − m²N(ν) = 1.

Let d be an integer such that d = gcd[N(α), m]. By definition, d is a common divisor of N(α) and m, so d | N(α) and d | m. Thus, each of the following is true as well:

          d | N(µ)N(α), d | m(ν + ν̄), and d | −m²N(ν),

where ν + ν̄ is an integer because it equals twice the first coordinate of ν. Therefore, from the properties of divisibility for the integers, we know that d | [N(µ)N(α) + mν + mν̄ − m²N(ν)], and given equation (114), d | 1. Since d ∈ Z, d must equal 1. Therefore, gcd[N(α), m] = 1.

Note that because N(β) = m², the statement gcd[N(α), m] = 1 is equivalent to gcd[N(α), N(β)] = 1.

5      Primes

We now look at what it means to be a prime quaternion. By proving several statements about primes in the set H, and exploring how these primes relate to primes in Z, we will eventually be able to prove that any prime number can be written as a sum of the squares of four integers.

Before we can prove anything about primes in H, we need to know exactly what constitutes a prime in H. For this, we draw from the definition used for primes in sets such as Z[i] and Zm.

Definition. A non-zero quaternion π is prime in H if, for any α, β ∈ H such that π = αβ, either α or β is a unit (but not both).
Now that we have established what it means to be prime in H, we can begin to prove certain theorems connecting primes in H with primes in Z. The following theorem is similar to a result in Z[i]: a Gaussian integer is prime in Z[i] if its norm is prime in Z.

Theorem 11. Let π be a quaternion. If N(π) is prime in Z, then π is prime in H.

Proof. It is given that π ∈ H and that N(π) is prime in Z. Let α, β ∈ H be such that

(115)     π = αβ.

Taking the norm of both sides of equation (115) gives N(π) = N(αβ) = N(α)N(β). Since N(π) is prime in Z, by the definition of prime, either N(α) or N(β) must be a unit in Z. From Lemma 5, this implies that either α or β is a unit in H. Given equation (115), if either α or β is a unit, then by definition, π must be prime in H.

Before we can continue to prove things about primes in H, we must first state an auxiliary theorem that will be needed later.

Theorem 12. Suppose p ∈ Z is an odd prime (p ≠ ±2). Then there exist x, y ∈ Z such that

(116)     1 + x² + y² ≡ 0 (mod p),

where 0 < x < p and 0 < y < p.

Theorem 13. If an integer p is prime in Z, then p is not prime in H.

Proof. Let p be an integer that is prime in Z. If p = 2, then we can see that

(117)     2 = (1 + i)(1 − i).

Of course, (1 + i) and (1 − i) are in the set H, but neither is a unit, so 2 is not prime in H.

We can therefore assume that p > 2. By Theorem 12, there exist r, s ∈ Z such that

(118)     1 + r² + s² ≡ 0 (mod p),
where 0 < r < p and 0 < s < p. Now define α ∈ H so that

(119)     α = 1 + 0i + sj − rk,

where r and s are obtained from Theorem 12. Thus N(α) = 1 + r² + s², and from equation (118), N(α) ≡ 0 (mod p). It follows from the properties of modular arithmetic that p | N(α), and it is trivial that p | p. Therefore, gcd[N(α), p] ≥ p. By definition, p > 1, so it is easy to see that gcd[N(α), p] ≠ 1. Using the result of Theorem 10, this implies that gcdr(α, p) ≠ 1.

Now, we define δ such that

(120)     δ = gcdr(α, p),

and we know that δ is not a unit in H. Furthermore, because δ is also a common right-hand divisor of α and p, we can say that δ is a right-hand divisor of α and of p. Thus, there exist λ1, λ2 ∈ H such that

(121)     α = λ1δ
(122)     and p = λ2δ.

Assume by way of contradiction that λ2 is a unit in H. Then from equation (122), we see that p ∼ δ. This in turn implies that p is a right-hand divisor of α. Given equation (119), there must exist γ = c1 + c2 i + c3 j + c4 k ∈ H such that (1 + sj − rk) = γp = pc1 + pc2 i + pc3 j + pc4 k. However, no such γ exists, because it is impossible to find a suitable c1 to satisfy pc1 = 1 when p > 2. Thus, a contradiction arises, and it cannot be the case that λ2 is a unit. Hence, p = λ2δ, where neither λ2 nor δ is a unit, and therefore, p is not prime in H.

We saw in Theorem 11 the connection between prime quaternions and prime integers, a result for quaternions which is identical to the one for Gaussian integers. The following theorem is the converse of Theorem 11 but is not true for primes in Z[i].

Theorem 14. Let π be a prime in H. Then N(π) is prime in Z.

Proof. It is given that π is a quaternion which is prime in H. However, assume by way of contradiction that the norm N(π) is not prime in Z. This means that there exist integers a and p, neither of which is a unit, so that

(123)     N(π) = ap.
Furthermore, assume that p is a prime factor of N(π). Since by construction p | N(π), and also trivially p | p, p is a common divisor of N(π) and p, so gcd[N(π), p] ≥ p. Because p > 1, it is also true that gcd[N(π), p] ≠ 1. From Theorem 10, it follows that gcdr(π, p) ≠ 1.

Define δ such that

(124)     δ = gcdr(π, p).

Thus δ ≁ 1, so δ is not a unit in H. Since δ is the greatest common right-hand divisor of π and p, it is also a common right-hand divisor of π and p. Hence, there exist λ1, λ2 ∈ H such that

(125)     π = λ1δ
(126)     and p = λ2δ.

Since π is prime and δ is not a unit, λ1 is a unit, and π ∼ δ. From Lemma 6, N(π) = N(δ). Taking the norm of both sides of equation (126) gives N(p) = p² = N(λ2δ) = N(λ2)N(δ), and substituting N(π) for N(δ) gives

(127)     p² = N(λ2)N(π).

By combining this with equation (123) and performing simple algebra, we obtain

(128)     p = aN(λ2).

Because p was defined to be a prime, N(λ2) must equal either 1 or p. Assume temporarily that N(λ2) = 1. Then from Lemma 5, λ2 would be a unit in H. From equation (126), it would follow that p ∼ δ, and from Lemmas 7 and 8, p ∼ π. But then p would be prime in H, which violates Theorem 13. This contradiction means that N(λ2) cannot equal 1, and instead must equal p. Substituting this result into equation (127) gives p² = pN(π), which reduces to p = N(π). Therefore, since p is prime in Z and N(π) = p, N(π) is prime in Z.

We can now combine some of our results to form a single statement that is stronger than the previous theorems.

Theorem 15. Let π be a quaternion. Then π is prime in H if and only if N(π) is prime in Z.

Proof. This theorem easily follows from Theorems 11 and 14.
6      Numbers that are Sums of 4 Squares

Now that we know some of the relationships between primes in H and Z, we can easily prove that all prime numbers can be expressed as a sum of four integer squares. This in turn will allow us to see what other numbers are sums of four squares.

Theorem 16. Let p ∈ N be a prime number. Then there are integers n1, n2, n3, and n4 so that p = n1² + n2² + n3² + n4².

Proof. Since it is given that p is prime in N, we know that p is also prime in Z. By Theorem 13, we know p is not prime in H. Thus, there exist α, π ∈ H such that

(129)     p = απ,

where α and π are not units in H. Taking the norm of this gives N(p) = p² = N(απ) = N(α)N(π). Since p is prime in Z, the only positive factors of p² in Z are 1, p, and p². Hence, either

(130)     N(α) = 1, N(α) = p, or N(α) = p².

However, since α and π are not units in H, it cannot be the case that N(α) = 1 or N(π) = 1. This rules out the possibilities that N(α) = 1 and N(α) = p², leaving only that N(α) = p. Thus, the only solution to p² = N(α)N(π) is N(α) = N(π) = p.

Thus, N(π) = p, so N(π) is prime in Z. If we say that π = p1 + p2 i + p3 j + p4 k, then we know that either pi ∈ Z or pi ∈ (1/2)ZOdd.

If pi ∈ Z, then we can define n1, n2, n3, and n4 such that

(131)     n1 = p1, n2 = p2, n3 = p3, and n4 = p4,

and so it is plain that p = N(π) = n1² + n2² + n3² + n4², where n1, n2, n3, n4 ∈ Z.

If instead pi ∈ (1/2)ZOdd, then by Theorem 5, there exists π′ ∈ H such that π′ ∼ π and π′ has integer coordinates. If we define n1, n2, n3, and n4 to be the coordinates of π′, so

(132)     π′ = n1 + n2 i + n3 j + n4 k,

then using Lemma 6, we see that p = N(π) = N(π′) = n1² + n2² + n3² + n4², where ni ∈ Z.

Thus, for any prime p, there exist integers n1, n2, n3, and n4 so that p = n1² + n2² + n3² + n4².
Now that we have proven that a prime number can be expressed as the sum of four integer squares, we turn to composite numbers to show that they too can be expressed as such a sum.

Theorem 17. Let m ∈ N be a composite number. Then there are integers n1, n2, n3, and n4 so that m = n1² + n2² + n3² + n4².

Proof. This will be proven by induction. Because m is a composite number, there exist primes p1, p2, . . . , pn ∈ N such that m = p1p2 · · · pn, where n is the number of prime factors of m. If n = 1, then m would be prime, which violates the initial assumption that m is composite. Therefore, assume that n ≥ 2.

First, take n = 2, so that

(133)     m = p1p2.

The numbers p1 and p2 are prime, so from Theorem 16, there exist integers, say a1, a2, a3, a4, b1, b2, b3, b4, such that

(134)     p1 = a1² + a2² + a3² + a4²
(135)     and p2 = b1² + b2² + b3² + b4².

Now, define α, β ∈ H such that

(136)     α = a1 + a2 i + a3 j + a4 k
(137)     and β = b1 + b2 i + b3 j + b4 k.

Observe that a consequence of this definition of α and β is that N(α) = p1 and N(β) = p2. If we let µ = m1 + m2 i + m3 j + m4 k = αβ, then from Theorem 2, µ ∈ H, and thus either m1, m2, m3, m4 ∈ Z or m1, m2, m3, m4 ∈ (1/2)ZOdd.

If m1, m2, m3, m4 ∈ Z, then we can simply define n1, n2, n3, and n4 so that

(138)     n1 = m1, n2 = m2, n3 = m3, and n4 = m4.

Thus, we can see that

(139)     N(µ) = n1² + n2² + n3² + n4².

In addition, using the definition of µ and equation (133), we can see that N(µ) = N(αβ) = N(α)N(β) = p1p2 = m. Therefore, after combining this result with equation (139), we obtain

(140)     m = n1² + n2² + n3² + n4²,
where n1, n2, n3, n4 ∈ Z.

If instead we had m1, m2, m3, m4 ∈ (1/2)ZOdd, then by Theorem 5, we know there exists µ′ ∈ H, an associate of αβ, with integer coordinates. In this case, we can instead define n1, n2, n3, and n4 so that

(141)     µ′ = n1 + n2 i + n3 j + n4 k.

Hence, N(µ′) = n1² + n2² + n3² + n4². On account of Lemma 6 and equation (133), we see that N(µ′) = N(αβ) = N(α)N(β) = p1p2 = m. Thus, for this case as well,

(142)     m = n1² + n2² + n3² + n4²,

where n1, n2, n3, n4 ∈ Z.

Now, take n ≥ 2 and assume that for a = p1p2 · · · pn, there exist integers a1, a2, a3, and a4 such that a = a1² + a2² + a3² + a4². We must show that for m = p1p2 · · · pnpn+1, there exist integers n1, n2, n3, and n4 such that m = n1² + n2² + n3² + n4². Given the assumptions about a, we can write m as m = apn+1. From Theorem 16, since pn+1 is prime, there exist integers b1, b2, b3, and b4 such that pn+1 = b1² + b2² + b3² + b4². If here we define α, β ∈ H such that

(143)     α = a1 + a2 i + a3 j + a4 k
(144)     and β = b1 + b2 i + b3 j + b4 k,

we find ourselves in the same situation as when we set n = 2. Thus, in the same manner, we can obtain a quaternion µ, where either

(145)     µ = αβ or µ ∼ αβ,

and where

(146)     µ = n1 + n2 i + n3 j + n4 k.

Thus, N(µ) = n1² + n2² + n3² + n4² and N(µ) = N(αβ) = N(α)N(β) = apn+1 = m. We can now see that for this case as well, there exist n1, n2, n3, n4 ∈ Z such that

(147)     m = n1² + n2² + n3² + n4²,

which concludes the proof. Therefore, any composite number can be written as the sum of the squares of four integers.
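The step from two primes to their product can be made concrete. The following sketch, assuming 4-tuple quaternions and an illustrative helper qmul, takes p1 = 3 = 1² + 1² + 1² + 0² and p2 = 5 = 2² + 1² + 0² + 0², forms α and β as in equations (136) and (137), and reads off a four-square representation of 15 from the coordinates of αβ.

```python
def qmul(a, b):
    """Quaternion product of two 4-tuples."""
    a1, a2, a3, a4 = a
    b1, b2, b3, b4 = b
    return (
        a1 * b1 - a2 * b2 - a3 * b3 - a4 * b4,
        a1 * b2 + a2 * b1 + a3 * b4 - a4 * b3,
        a1 * b3 - a2 * b4 + a3 * b1 + a4 * b2,
        a1 * b4 + a2 * b3 - a3 * b2 + a4 * b1,
    )


def norm(a):
    return sum(x * x for x in a)


alpha = (1, 1, 1, 0)   # N(alpha) = 3
beta = (2, 1, 0, 0)    # N(beta) = 5
mu = qmul(alpha, beta)

print(mu, norm(mu))    # (1, 3, 2, -1) and 15 = 1 + 9 + 4 + 1
```

When α and β both have integer coordinates, as they do here, every coordinate of αβ is a sum of products of integers, so the product already has integer coordinates and no passage to an associate is needed in this example.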
There is one last case to consider, the trivial case of when a number is 1 (recall that 1 is neither prime nor composite). The following Lemma proves that 1 can also be expressed as a sum of the squares of four integers.

Lemma 10. There are integers n1, n2, n3, and n4 so that 1 = n1² + n2² + n3² + n4².

Proof. Take n1 = 1 and n2 = n3 = n4 = 0. Thus, 1 = 1² + 0² + 0² + 0².

Finally, we can now formally prove that every positive integer can be expressed as a sum of the squares of four integers, which was our goal from the beginning. This result is known as Lagrange’s Theorem.

Theorem 18 (Lagrange’s Theorem). For every n ∈ N, there are integers n1, n2, n3, and n4 so that

(148)     n = n1² + n2² + n3² + n4².

Proof. Every n ∈ N is either prime, composite, or 1, so there are three cases to consider. From Theorem 16, Theorem 17, and Lemma 10, it is clear that if n falls under any of these cases, then there exist integers n1, n2, n3, and n4 such that n = n1² + n2² + n3² + n4².
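Lagrange’s Theorem can also be checked empirically for small n. The sketch below is a plain brute-force search, unrelated to the quaternion argument above, that looks for n1 ≥ n2 ≥ n3 ≥ n4 ≥ 0 with squares summing to n; the function name four_squares and the bound 300 are choices made for this illustration.

```python
from math import isqrt


def four_squares(n):
    """Return (n1, n2, n3, n4), non-increasing and non-negative, with squares summing to n."""
    for n1 in range(isqrt(n), -1, -1):
        r1 = n - n1 * n1
        for n2 in range(min(n1, isqrt(r1)), -1, -1):
            r2 = r1 - n2 * n2
            for n3 in range(min(n2, isqrt(r2)), -1, -1):
                r3 = r2 - n3 * n3
                n4 = isqrt(r3)
                # Accept only if the remainder is a perfect square no larger than n3^2.
                if n4 <= n3 and n4 * n4 == r3:
                    return (n1, n2, n3, n4)
    return None


for n in (1, 7, 15, 23):
    print(n, four_squares(n))

# Every n up to the chosen bound has a representation.
all_found = all(four_squares(n) is not None for n in range(1, 301))
print(all_found)
```

Numbers such as 7, 15, and 23 genuinely need four non-zero squares, which shows that the count of four in the theorem cannot be lowered.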