Prime numbers

Today I’m going to be talking Number Theory.  The subject of number theory is the positive integers 1, 2, 3, …, so (just for today) when I say “number”, they’re the ones I mean.

Most people know what a prime number is, even the ones who run with terror at the thought of mathematics.  A prime is simply a number that has exactly two divisors - 1 and itself.  It is important to note that this definition excludes 1 from the primes, since it has only one divisor.

You can work out by hand that the first few primes are 2, 3, 5, 7, 11, 13, 17, 19, 23, 29.  If you keep going, it becomes apparant that they start out fairly common (nearly half of the numbers from 1 to 10), and become increasingly uncommon (from 21 to 30 there are only 2).  So an obvious question presents itself: do they eventually stop appearing altogether?

The answer is no.  The classic proof appears in Euclid’s Elements (Book IX, proposition 20).  A modern version of it goes as follows:

Take any finite set of prime numbers $latex p_1, p_2, p_3, \ldots, p_r$ and consider $latex n = p_1p_2p_3\ldots p_r + 1$.  This number is not divisible by any of the primes in the set, because they all leave a remainder of 1 on division.  But all positive integers greater than 1 are divisible by at least one prime (the number itself, if it happens to be prime).  So there must be a prime that is not in our set.  And we can repeat this process, adding the new primes to the set and then finding more, indefinitely.  That means there are infinitely many primes.

That’s a perfectly fine proof, but it isn’t the one I wanted to show you.  But if you’re going to understand that proof, I’m going to have to introduce you to a little group theory.


One thing that mathematicians like to do is to notice similarities between apparantly different things, and formalise them.  Then we can use our knowledge of one thing to gain some insight into the others.  Group theory is one of those tools.  It seems a needlessly abstract on a first encounter, but all it is doing is creating a language for talking about a certain set of properties that many different mathematical objects have in common.

One example of a group is the set of integers under addition.  I’ll use it to clarify the defintion.

So what is a group?  It’s so simple you’ll wonder how the subject could be interesting.  A group is a set coupled with a binary operation, that obeys certain rules.  A binary operation is something like addition or divison: it takes two elements from the set (the order may or may not matter, depending on the operator) and gives you a third.

That’s the first rule: the set must be closed under the operator.  To make things concrete, let’s call the set $latex G$ and the operator $latex \circ$ (and, in a bit of an abuse of notation, I’ll call the group $latex G$ as well.  It’s generally clear from context whether the set or the group is meant).  Then if $latex f$ and $latex g$ are any elements of $latex G$, $latex f \circ g$ must also be an element of $latex G$.  For our example group, if we add any two integers we get another integer, so the integers are closed under addition.

The next rule is that there is a special element of $latex G$, called the identity and conventionally given the symbol $latex e$, which has the property that if $latex f$ is any element of $latex G$, then $latex e \circ f = f \circ e = f$.  Zero plays this role in our example group: $latex 0 + f = f + 0 = f$ for any integer $latex f$.

Next, each element $latex f$ must have an inverse $latex f^{-1}$ that satisfies $latex f\circ f^{-1} = f^{-1} \circ f = e$.  Although I’ve written this with the same notation as exponentiation, it’s an entirely different thing (although there is a connection, which I might get to at another time).  For the integers, the inverse of $latex f$ is $latex -f$.  -2 is the inverse of 2, and so on.

Finally, the operator must be associative.  That means if $latex f$, $latex g$, and $latex h$ are elements of the set, then $latex (f \circ g) \circ h = f \circ (g \circ h)$.  You already know that is true for addition on the integers (and, indeed, on any other numbers).  (1+2)+8 = 3+8 = 11, and 1+(2+8) = 1+10 = 11.

And that’s it.  These four rules leave enough unsaid that you can find groups everywhere, but provides just enough structure that it’s possible to say interesting things about groups in general.  Anything that we can prove from only these rules will apply to anything that is a group, and that’s a very powerful tool.

If this was a proper introduction to group theory, I’d describe a few more examples of groups; symmetries and permutations are the traditional ones, and group theory is a useful tool for analysing Rubik’s cube.  That can come another time.  Right now, there’s still a fair way to go before we’re ready for the proof.  But it would be useful to look at one more example.

Finite groups

Groups can be classified in a number of ways.  One of the most important is whether a group is finite or infinite.  This is just whether the set has a finite or infinite number of elements.  The integers under addition is an example of an infinite group.  The integers $latex 0, 1, 2, \ldots, n-1$ under addition modulo n form a finite group.  Addition modulo n is just like normal addition, only you subtract n if the result is above n-1.  Hours are a familiar example: if it’s 9 o’clock now, then in 6 hours it will be 3 o’clock: $latex 9 +_{12} 6 = 3$ (the subscript 12 on the + is to show that it is addition modulo 12).

I claimed that this is a group.  It’s easy to check:

  1. Yes, it is closed.
  2. 0 is the identity.
  3. The inverse of r is 12-r, or 0 if r = 0.
  4. Addition is still associative.

This group is called $latex \mathbb{Z}_{12}$.

A multiplicative group

One more example of a group.  If $latex p$ is a prime, then the numbers $latex 1, 2, 3, \ldots, p-1$ form a group number multiplication modulo $latex p$ (that’s just like modular addition, but you might have to subtract a multiple of $latex p$ from the result to get it back to the right range).  If you’re feeling brave, you might like to prove that this is a group.  Or you can just take my word for it.  The only tricky bit is proving that every element has an inverse.

For example, we could take $latex p = 5$.  Then the set $latex \{1,2,3,4\}$ with the operator $latex \times_{5}$ is a group.  1 is the identity, and the inverses are $latex 1^{-1} = 1$, $latex 2^{-1} = 3$, $latex 3^{-1} = 2$, $latex 4^{-1} = 4$.  This group is sometimes called $latex \mathbb{Z}^*_5$ .

The order of an element

And one more definition.   Consider the group $latex \mathbb{Z}_{12}$ again.  What happens if we repeatedly add 2 to itself?  It goes around in a circle: 2, 4, 6, 8, 10, 0, 2, … What abour 5?  That goes 5, 10, 3, 8, 1, 6, 11, 4, 9, 2, 7, 0, 5, … If the sequence reaches the identity, as both of these do, then the number of steps it takes to get there is called the order of the element.  The order of 2 is 6, and the order of 5 is 12.  If the sequence never reaches the identity, then that element is said to have infinite order.

In the group $latex \mathbb{Z}^*_5$, we find that $latex 1^1 = 1$, $latex 2^4 = 1$, $latex 3^4 = 1$, and $latex 4^2 = 1$, so the orders of 1, 2, 3, 4 are 1, 4, 4, 2 respectively (this time, $latex a^b$ means exponentiation modulo 5).

A property of the order of an element which will be useful in our proof is that every position of the identity in the sequence is a multiple of the order.

And another property, which is important enough to have been given the name Lagrange’s theorem, is that in a finite group, the order of each element is a divisor of the number of elements in the group.  You can see this in the previous examples: $latex \mathbb{Z}_{12}$ has 12 elements, and the orders we saw for 2 and 5 are 6 and 12 – both divisors of 12.  And in $latex \mathbb{Z}^*_5$, with 4 elements, the orders 1, 4, 4, 2 are all divisors of 4.

The proofs of these properties are fairly simple, and maybe I’ll present them another time.  But for now…

The proof

I think we’re finally ready.

This will be a proof by contradiction.  We will assume that there are finitely many primes, then derive a contradiction from this.  That will mean the original assumption is false, and there are infinitely many primes.

So suppose that there are finitely many primes.  Then there must be a largest prime, which we will call $latex p$.  Now, consider the number $latex 2^p – 1$.  It will be divisible by at least one prime (possibly itself).  We choose any one of them and call it $latex q$.

Since $latex 2^p-1$ is divisible by $latex q$, $latex 2^p$ must give a remainder of 1.  Now, 1 is the identify of the group $latex \mathbb{Z}^*_q$.  And since $latex p$ is a prime, that means the order of 2 in this group is $latex p$.

Now we apply Lagrange’s theorem.  $latex \mathbb{Z}^*_q$ has $latex q-1$ elements, and one of those elements has order $latex p$.  That means $latex p$ is a divisor of $latex q-1$, which means q is greater than p.

But q is a prime, and we’ve just proved that it’s larger than the largest prime.  This is the contradiction we were after.  The original assumption must be false, and we conclude that there are infinitely many primes.


Ah.  And now WordPress has decided that it doesn’t like me enough to show me a preview of this post.  Please forgive the inevitable errors.