Project Euler in R: Problem 10

Problem:  Find the sum of all the primes below two million.

Commentary:  If you’re not careful, this problem can really get you in R.  In poking around the internet, there are a few solutions to this problem, but all the ones I’ve tested are slow.  Some, even from people who are better at R than me, are VERY slow.  That linked one in particular took 6 minutes and 8.728000 seconds to complete.  My solutions runs in under 2 seconds.

So we’ll be using that old prime algorithm I’ve already discussed to death.  Other than a few tricks in there that I never see get implemented, there’s a crucial trick to this solution.  Since we only want the sum, let’s not be slow about finding the primes we want.  I only have to find all primes below \sqrt{n}, where n=2000000.  It takes basically no time to find them.  From there, I start evaluating my sum, and like a good little R user, I vectorize by getting the possible list of possible primes above \sqrt{n}.  From there, I check if any of my known primes divide into those possible primes.  This is actually much faster than generating primes and storing them.

R Code:

ElapsedTime <- system.time({
##########################
n <- 2000000

### Function to get primes below specified n; option to get only those below sqrt(n)
PrimesBelow <- function(n, below.sqrt=FALSE){
  if (below.sqrt == TRUE){
    m <- ceiling(sqrt(n))
  } else {
    m <- n
  }

  primes <- c(2,3)
  i <- 3
  while (i < m-1){
    flag <- 0
    i <- i+2
    if (i%%6 == 3){
      flag <- 1
    }
    if (flag == 0){
      s <- sqrt(i)+1
      possible.primes <- primes[primes<s]
      for (prime in possible.primes){
        if ((i%%prime == 0)){
          flag <- 1
          break
        }
      }
      if (flag == 0){
        primes <- c(primes, i)
      }
    }
  }
  primes
}
###

primes <- PrimesBelow(n, below.sqrt=TRUE)

biglist <- ceiling(sqrt(n)):n
biglist <- biglist[-which(biglist%%2 == 0)]
biglist <- biglist[-which(biglist%%6 == 3)]

for (prime in primes){
  biglist <- biglist[which(biglist%%prime != 0)]
}

answer <- sum(primes)+sum(as.numeric(biglist))
##########################
})[3]
ElapsedMins <- floor(ElapsedTime/60)
ElapsedSecs <- (ElapsedTime-ElapsedMins*60)
cat(sprintf("\nThe answer is:  %f\nTotal elapsed time:  %d minutes and %f seconds\n", answer, ElapsedMins, ElapsedSecs))

Output:
The answer is:  142913828922.000000
Total elapsed time:  0 minutes and 1.915000 seconds


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s