Is 2011 a special number?

Today is the first day of the year 2011. 2011 is a prime number. Not only that, but according to numerous math bloggers and twitterers on the internet, 2011 is the sum of 11 consecutive primes — that is, 2011 = 157 + 163 + 167 + 173 + 179 + 181 + 191 + 193 + 197 + 199 + 211. Furthermore (as I already said), 2011 is prime.

Naturally I wonder, how rare are these numbers — numbers that are prime but also a sum of a bunch of consecutive primes? This seems like a problem easily solved with some programming.

Let us write P(n,k) as the number of primes less than or equal to n that can be written as a sum of at least k consecutive primes. How big is P(2011,k) compared to \pi(2011) = 305, the number of primes less than or equal 2011?

My method for computing P(n,k) is pretty simple. First we start with a list of prime numbers, then write the cumulative sums for the prime numbers (these images were taken with my laptop camera off a whiteboard):

Next take every pair from the bottom list, and find their differences:

Because of the way I arranged the numbers, we can see that the bottom diagonal are all the numbers that can be written as a sum of 1 prime (obviously, the prime numbers), then the next row are all numbers that can be written as a sum of 2 primes, and so on. If we want to compute P(n,k) we simply list enough prime numbers to complete this table, take everything above a diagonal, and filter out all the duplicates.

Here’s my hopefully correct implementation:

import java.util.*;
import java.lang.*;

class Main
{
	public static void main (String[] args) throws java.lang.Exception
	{
		int LIM = 2011;
		int NCONSEC = 3;
		boolean[] sieve = new boolean[LIM+1];
		Arrays.fill(sieve, true);
		for(int i=2; i<=LIM; i++){
			if(!sieve[i]) continue;
			for(int j=2; j*i<=LIM; j++)
				sieve[i*j] = false;
		}

		List<Integer> primes = new ArrayList<Integer>();
		for(int i=2; i<=LIM; i++)
			if(sieve[i]) primes.add(i);

		List<Integer> cuml = new ArrayList<Integer>();
		cuml.add(0);
		int cum = 0;
		for(int p : primes){
			cum += p;
			cuml.add(cum);
		}

		Set<Integer> consums = new TreeSet<Integer>();
		for(int i=1; i<cuml.size(); i++){
			for(int j=0; j<=i-NCONSEC; j++)
				consums.add(cuml.get(i) - cuml.get(j));
		}

		int p = 0;
		for(int i : consums){
			if(i > LIM) break;
			if(!sieve[i]) continue;
			//System.out.println(i);
			p++;
		}

		System.out.println(p);
	}
}

It turns out that P(2011,3) = 147 and since there are 305 primes less or equal to 2011, roughly half of all primes under 2011 can be written as a sum of at least 3 consecutive primes. Hardly a special number at all! Even if we insist on a list of 11 consecutive primes, there are still 56 primes less than or equal to 2011 that can be written as a sum of 11 consecutive primes, about 1 in every 5 primes.

Happy New Year!

The Sieve of Sundaram

The Sieve of Eratosthenes is probably the best known algorithm for generating primes. Together with wheel factorization and other optimization options, it can generate primes very quickly.

But a lesser well known algorithm for sieving primes is the Sieve of Sundaram. This algorithm was discovered in 1934 by Sundaram; like the sieve of Eratosthenes it finds all prime numbers up to a certain integer.

The algorithm

A simplified version of the algorithm, using N as the limit to which we want to find primes to:

m =: Floor of N/2
L =: List of numbers from 1 to m
For every solution (i,j) to i + j + 2ij < m:
    Remove i + j + 2ij from L

For each k remaining in L:
    2k + 1 is prime.

In practice we can find solutions to i + j + 2ij < m by using two nested for loops:

For i in 0 to m:
    For j in i to m:
        L[i + j + 2ij] =: False

Here i is always less than j, because the two are interchangeable and filtering it twice would be a waste.

We don’t actually need to loop j from 0 to m. From the inequality i + j + 2ij < m, we can solve for j: j < \frac{m-i}{2i+1}. The new algorithm:

m =: Floor of N/2
L =: Boolean array of length m
Fill L with true

For i in 0 to m:
    For j in i to (m-i)/(2i+1):
        L[i + j + 2ij] =: False

For each k remaining in L:
    2k + 1 is prime.

Why this algorithm works

In the algorithm, 2k+1 is prime where k can be written as i + j + 2ij where i and j are integers. We can rewrite this:

\begin{array}{l} 2(i+j+2ij) + 1 \\ = 2i + 2j + 4ij + 1 \\ = (2i+1) (2j+1) \end{array}

Both 2i+1 and 2j+1 are odd numbers, and any number that can be written as the product of two odd numbers are composite.

Of the odd numbers, those that cannot be written as the product of two odd numbers are prime. We’ve filtered everything that can be written as (2i+1)(2j+1) so we are left with the odd prime numbers.

This algorithm only gets the odd prime numbers, but fortunately there is only one even prime number, 2.

Benchmarks with the Sieve of Eratosthenes

Here’s an implementation of the Sieve of Sundaram:

#include <stdio.h>
#include <stdlib.h>

typedef unsigned long long ll;
int main() {
 ll n = 100000000LL;
 ll m = n/2;
 char *sieve = malloc(m);

 for(ll i=0; i<m; i++)
  sieve[i] = 1;

 for(ll i=1; i<m; i++)
  for(ll j=i; j<=(m-i)/(2*i+1); j++)
   sieve[i+j+2*i*j] = 0;

 ll s=1;
 for(ll i=1; i<m; i++)
  if(sieve[i]) s++;

 printf("%llu", s);
 return 0;
}

This code counts the number of primes below 100 million, which should be 5761455. The above code runs in 9.725 seconds.

Here’s an alternative, an implementation of the more standard Sieve of Eratosthenes:

#include <stdio.h>
#include <stdlib.h>

typedef unsigned long long ll;
int main(){
 ll lim = 100000000LL;
 char *sieve = malloc(lim);

 for(int i=0; i<lim; i++)
  sieve[i] = 1;

 int s=0;
 for(int i=2; i<lim; i++){
  if(sieve[i]){
   s++;
   for(int j=2; j<=lim/i; j++)
    sieve[i*j] = 0;
  }
 }

 printf("%d", s);
 return 0;
}

I was surprised to find that the Sieve of Eratosthenes actually ran faster. It completed in 7.289 seconds.

I expected the Sieve of Sundaram to be faster because according to Wikipedia this algorithm uses O(n \log(n)) operations, while the Sieve of Eratosthenes uses O(n \log(n) \log \log(n)).