Project Euler in R: Problem 7

Problem:  What is the 10001st prime number?

Commentary:  So it’s finally time to discuss my prime-searching algorithm, as promised back in Problem 5.  We will find primes in the following way:

  1. 2 and 3 are prime
  2. We examine the odd integers (having already discovered the only even prime) starting with i=5.
  3. First, check if i is of the form 6k+3 for some integer k>0.  If so, then i can’t be prime, because then i=6k+3=3(2k+1), and since k>0, 2k+1 > 1 (so i\neq 3).  In this case, don’t even bother proceeding; replace i with i+2 and start over.
  4. Calculate s(i)=\sqrt{i}.  Among all known primes below s(i)  (going no further for the same reason as that outlined in Problem 3), check if these primes divide into i.  If we find one that does, then there is no reason to proceed; i is not prime.  Go back to 3. and try again with i replced with i+2.  If none of the known primes below s(i) divide i, then stop; i must be prime.  Add it to the list of known primes.
  5. Repeat this process by replacing i with i+2 until we have 10001 primes.

It’s nothing fancy, but it works.  As for the implementation, believe it or not, I actually don’t construct a list of length 10001 and insert into it as I gather my primes like a good little R user.  See the post-code commentary for an explanation.

R Code:

ElapsedTime <- system.time({
##########################
primes <- c(2, 3)
i <- 3
while (length(primes) < 10001){
  flag <- 0
  i <- i+2
  if (i%%6 == 3){
    flag <- 1
  }
  if (flag == 0){
    s <- sqrt(i)+1
    for (prime in primes){
      if ((i%%prime == 0)){
        flag <- 1
        break
      }
      if (prime > s){
        break
      }
    }
    if (flag == 0){
      primes <- c(primes, i)
    }
  }
}

answer <- tail(primes, 1)
##########################
})[3]
ElapsedMins <- floor(ElapsedTime/60)
ElapsedSecs <- (ElapsedTime-ElapsedMins*60)
cat(sprintf("\nThe answer is:  %d\nTotal elapsed time:  %d minutes and %f seconds\n",
answer, ElapsedMins, ElapsedSecs))

Output:
The answer is:  104743
Total elapsed time:  0 minutes and 1.191000 seconds

Post-Code Commentary:  You may notice that I’m not pre-allocating the prime list and then storing new prime values as I go, like a good little R user.  The reason is that it’s actually slower to do it this way.  Now maybe I’m an idiot and I don’t really understand how R does things, but if I take this algorithm and only modify the prime storage part of it to include preallocating the list and then injecting into that list (replacing a 0) as I go, then the time is nearly a full order of magnitude slower.  Here’s the R code that you would produce if you were to do what I just described:

primes <- numeric(10001)
primes[1:2] <- c(2, 3)
i <- 3
prime.position <- 3

while (primes[10001] == 0){
  flag <- 0
  i <- i+2
  if (i%%6 == 3){
    flag <- 1
  }
  if (flag == 0){
    s <- sqrt(i)+1
    for (prime in primes[primes > 0]){
      if ((i%%prime == 0)){
        flag <- 1
        break
      }
      if (prime > s){
        break
      }
    }
    if (flag == 0){
      primes[prime.position] <- i
      prime.position<-prime.position+1
    }
  }
}

answer <- tail(primes, 1)

So how do the two run?  Well, on my work computer (which is a little slower than my home computer–also don’t tell my boss!), the original “bad R user” code runs in 1.852 seconds.  The “better” solution with pre-fetching the vector runs in 11.100 seconds.  That is a 9.248 second differential, which amounts to a nearly 500% increase in run time!

Of course, there are plenty of times when it’s objectively better to pre-allocate; but as it turns out, this isn’t one of them (though I have no idea why).


4 responses to “Project Euler in R: Problem 7

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s