A belated Happy Pi Day!

Honestly, is there any better way to spend a day (or two) on Pi Day (March 14 = “3.14”) than to write some code to compute the value of Pi?!

You may remember from your math coursework Newton’s Formula, which is easy to implement:

\pi = 6 \sum\limits_{n=0}^{\infty}\frac{(2n)!}{(2^{4n+1})(n!)^{2} (2n+1)}

Or, you may have seen¬†the Monte Carlo method. You could “speed this up” using a CUDA GPU implementation, e.g., http://cacs.usc.edu/education/cs596/src/cuda/pi.cu:

// Using CUDA device to calculate pi
#include <stdio.h>
#include <cuda.h>

#define NBIN 10000000  // Number of bins
#define NUM_BLOCK  30  // Number of thread blocks
#define NUM_THREAD  8  // Number of threads per block
int tid;
float pi = 0;

// Kernel that executes on the CUDA device
__global__ void cal_pi(float *sum, int nbin, float step, int nthreads, int nblocks) {
	int i;
	float x;
	int idx = blockIdx.x*blockDim.x+threadIdx.x;  // Sequential thread index across the blocks
	for (i=idx; i< nbin; i+=nthreads*nblocks) {
		x = (i+0.5)*step;
		sum[idx] += 4.0/(1.0+x*x);
	}
}

// Main routine that executes on the host
int main(void) {
	dim3 dimGrid(NUM_BLOCK,1,1);  // Grid dimensions
	dim3 dimBlock(NUM_THREAD,1,1);  // Block dimensions
	float *sumHost, *sumDev;  // Pointer to host & device arrays

	float step = 1.0/NBIN;  // Step size
	size_t size = NUM_BLOCK*NUM_THREAD*sizeof(float);  //Array memory size
	sumHost = (float *)malloc(size);  //  Allocate array on host
	cudaMalloc((void **) &sumDev, size);  // Allocate array on device
	// Initialize array in device to 0
	cudaMemset(sumDev, 0, size);
	// Do calculation on device
	cal_pi <<<dimGrid, dimBlock>>> (sumDev, NBIN, step, NUM_THREAD, NUM_BLOCK); // call CUDA kernel
	// Retrieve result from device and store it in host array
	cudaMemcpy(sumHost, sumDev, size, cudaMemcpyDeviceToHost);
	for(tid=0; tid<NUM_THREAD*NUM_BLOCK; tid++)
		pi += sumHost[tid];
	pi *= step;

	// Print results
	printf("PI = %f\n",pi);

	// Cleanup
	free(sumHost); 
	cudaFree(sumDev);

	return 0;
}

Unfortunately, while these are easy to implement, they are terrible algorithms, converging on the value of Pi very, very slowly indeed–even using the parallelism of GPUs.¬†Although I’ve read about Chudnovsky’s Formula, I never tried it, probably because I had my priorities wrong. Beside, all of these require an arbitrary precision math package, which I didn’t have.

But, it turns out there are some: MPIR (Multiple Precision Integers and Rationals), MPIR.NET, and LongCalc

So, I wish everybody a belated Pi Day with a simple C# program that uses LongCalc (Mpir.NET/MPIR).

namespace Pi
{
    using System;
    using LongCalc;

    class Pi_ChudnovskyFormula
    {
        public static bf Compute(int times)
        {
            // https://en.wikipedia.org/wiki/Pi#Rapidly_convergent_series
            bf bigK = 6;
            bf bigM = 1;
            bf bigL = 13591409;
            bf bigX = 1;
            bf bigS = 13591409;
            bf mul = new bf("-262537412640768000");
            System.Console.WriteLine("iterations " + times);
            for (int k = 1; k < times; ++k)
            {
                bf bfk = k;
                if (k % 1000 == 0) System.Console.Write(" " + k);
                bf bfk3 = bfk * bfk * bfk;
                bf bigK3 = bigK * bigK * bigK;
                bf bigK4 = (bigK * 16);
                bigM = (bigK3 - bigK4) * bigM / bfk3;
                bigL += 545140134;
                bigX = bigX.Multiply(mul);
                bf ml = bigM * bigL;
                bigS += ml / bigX;
                bigK += 12;
            }
            System.Console.WriteLine();
            bf sq = new bf(10005).Sqrt();
            bf pi = 426880 * sq / bigS;
            return pi;
        }
    }
    class Program
    {
        static void Main(string[] args)
        {
            int digits = 50000;
            LongCalc.bf.GlobalPrecision = (int)(digits * Math.Log(10, 2) * 2); // bits
            int times = LongCalc.bf.GlobalPrecision / 14; // approximate accuracy added to significand per iteration.
            bf r2 = Pi_ChudnovskyFormula.Compute(times);
            System.Console.WriteLine(r2.toString(10, false, digits));
        }
    }
}


3.1415926535897932384626433...