## Speeding up my Project Euler 10 implementation with threads

My implementation of Project Euler 10 takes almost 25 minutes to complete. It’s a very naive attempt, and there is a great deal of optimization that needs to happen – particularly in the function I use to determine if a number is prime.

I decided to ignore an elegant optimization and see if I could decrease the execution time by solving the problem in several concurrent threads.

Project Euler 10 asks you to find the sum of all prime numbers that are under 2,000,000. It’s easy to see how to parallelize this – calculate (and add together) the primes from 0 to 1,000,000 in on thread, and the primes from 1,000,001 to 2,000,000 in another thread. Add the results from the two threads together and you have the solution. I decided to do it in four threads:

Thread 1: add primes from 0 to 500,000

Thread 2: add primes from 500,001 to 1,000,000

Thread 3: add primes from 1,000,001 to 1,500,000

Thread 4: add primes from 1,500,001 to 2,000,000

I’ve never written any threaded code before, so I hit up Google and found this page. I’m surprised (but quite relieved!) that it’s actually very easy to create and synchronize threads.

Each thread is summing up primes under it’s respective range, and when the thread is finished it adds the calculated total to a variable that all threads can see. I use a mutex to prevent the threads from potentially trying to write to the variable all at the same time, but in my application this really isn’t a risk since each thread takes a considerably different time to complete.

I don’t feel like fighting the code formatting in WordPress, so here’s a Gist of my solution: https://gist.github.com/anonymous/5130987

It’s pretty straightforward and follows much of the conventions in the tutorial I found, so I won’t do a line-by-line breakdown here (unless someone asks).

The end result? Execution time is just over 8 minutes (which is how long it takes for the 4th thread to finish – the other three finish well before then).

Two quick notes. The GCC command I needed to use was: