Efficiently find primes below a given number. An often required task when solving programming puzzles and challenges. Runtime is discussed. Example algorithms are provided.
As a programmer who likes math and puzzles, I’ve spent a lot of time answering questions on Project Euler. While writing solutions to the first fifty problems, I often needed to find all primes below a given number. The golang
code for this article can be found here.
Here’s the first algorithm I wrote to find primes less than n
:


Ok, you got me, that wasn’t actually the first one, but it was the first one that worked, yet it didn’t work for long. As soon as the input numbers got large, its runtime grew in, what I later learnt was, linear time.
Linear time, in my own words, is when the runtime of an algorithm grows at the same rate as the size of the input. Let’s take a look at what kinds of numbers a linear time algorithm can handle:
n  time 

10,000  56 ms 
100,000  4.5 sec 
1,000,000  5min 40sec 
The table above shows how long it takes to generate all the primes under n
using the above algorithm.
The algorithm works for numbers up to about ten thousand; after that, we see the algorithm takes a few seconds for numbers around onehundred thousand and several minutes for numbers of size one million or larger.
My original solution didn’t run anywhere close to fast enough when questions involved numbers around one million
. Luckily, after some research, I came across the following shortcut:
To determine if a number
p
is prime, one only has to check if it has any divisors from 2 to sqrt(n) inclusive. Ifa
andb
are factors ofp
anda <= b
thena
must be less than or equal to sqrt(p).
Since it’s only necessary to check numbers below sqrt(p)
, I wrote a new algorithm that ran much faster:


I mentioned linear time previously. This new algorithm bounded by sqrt(n)
runs in sqrt(n)
time. And, if we want to use this improved algorithm to find all the primes under n
, we’ll have to call it n
times making the total runtime n * sqrt(n)
.
n  Original  Improved 

10,000  56 ms  1.3 ms 
100,000  4.5 sec  36 ms 
1,000,000  5min 40sec  680 ms 
Now, here’s where things get interesting. The above sqrt(n)
shortcut made my code fast enough that I could successfully use it whenever generating primes was required. It was fast enough that for a long time I never searched for anything faster. Yet, in form posts on the topic, people always mentioned something called the Sieve of Eratosthenese
.
The Sieve strategy for finding primes less than a given number is slightly different then what I’ve done so far in this article. Instead of looping over numbers from 0 to n
and checking for divisors from 2 to sqrt(n)
, we start at the number 2
and remove all multiples of 2
from 4 to n
. Then, we remove all multiples of 3
and so on. Once all multiples have been removed, the numbers left over are primes.
In terms of speed, I’d heard a Sieve was faster, but I never considered how much faster it really was until writing this post.
Here’s the code for a prime number Sieve:


Let’s look at the runtimes in a table compared to the Improved
algorithm from before:
n  Improved  Sieve 

10,000  56 ms  0.1 ms 
100,000  4.5 sec  1.5 ms 
1,000,000  5min 40sec  18 ms 
The runtimes show that a sieve is significantly faster. Let’s think about why that might be. My original nonsieve algorithm in the worst case scenario would have to check sqrt(n)
numbers n
times. If we say that n
is one million, that’s one billion
operations in the worst case.
Now lets think about how many operations the sieve method does. If our number n
is once again one million, we would loop over (1 000 000 / 2) multiples of 2
, (1 000 000 / 3) multiples of 3
, and so on. This forms the following sequence: 1m/2 + 1m/3 + … + 1m/1m
or 1m * (1/2 + 1/3 + … + 1/1m)
.
Here’s a python program to find that sum:
sum([1/x for x in range(2, 1000000+1)])
The answer is ~13.4 million
. That’s only 1.34%
of the operations required by the nonsieve method. Based on the measurements above the sieve method is about 30 times faster
when n
is one million and does only 1.34%
of the work.
Now, we might ask, in the time it takes the nonsieve method to generate all the primes under one million (~680ms) how many primes can the sieve method find.
Let’s use a binary search technique to find this number. In the table below, I start by asking my code how long it takes to find all the primes under 100 million. I’ll determine the next value I search based on how much time it took for the previous code to run. With a search space of 100million, it would take log2(100m) = 26 queries to find a final value.
Step  n  Time  high/low 

1  100 million  10 sec  Too high 
2  50 million  3 sec  Too high 
3  25 million  1.6 sec  Too high 
4  12.5 million  702 ms  Too high 
5  6 million  227 ms  Too low 
6  9.25 million  409 ms  Too low 
7  10.8 million  522 ms  Too low 
8  11.65  622 ms  Too low 
9  12 million  ~680 ms   
We can find all primes under 12 million while a nonsieve algorithm finds only those under one million! There’s 788060
primes under 12 million, and, if we wanted to let our algorithm run for 10 sec with n = 100 million
we could find the more than Five million primes! (specifically 5761455
primes)
A Sieve can be implemented in only a few lines of python. So, a Sieve is just as quick to implement as the other methods.


In this article, we found that a loopbased prime finding algorithm works well enough for low numbers (under one million). However, a Sieve can be equally quick to implement, runs an order of magnitude faster, and can find primes as large as five million in ~10sec.