Ibán Cereijo

A division and n-root algorithm for big numbers

2012-11-02T00:00:00+00:00

In a previous post, we introduced a library for performing arithmetic operations with arbitrary precision, and in this update we optimized the multiplication algorithm using the Fast Fourier Transform factorization.

In this post, using the FFT-based multiplication algorithm as the cornerstone of our library, we will optimize the division algorithm and introduce some other methods for computing square and quartic roots. In a previous example, we succeeded to compute the largest known prime number at the time - the Mersenne prime $M_{43112609}$ - working with integer arithmetic. With the new ingredients, we will be able to compute some other interesting things, such as millions of decimals of $\pi$ or other irrational numbers using convergent series.

A new division algorithm

We have already optimized the multiplication method, moving from a $O(n^2)$ time complexity algorithm to a $O(n\,log_2(n))$ one. But the division is still computed with a long division algorithm, with a time complexity of $O(n^2)$ . We will try to find a faster algorithm using the Newton Raphson method to converge to the solution.

The problem is: given the numbers $a$ and $b$ , we want to compute the quotient $c = \dfrac{a}{b}$ . The following function has a zero at the desired quotient:

$f(x) = b\,x - a.$

Applying the Newton-Raphson method to it, starting with an initial guess $x_0$ , we can compute a sequence $\{x_k\},\,k=1,\,2,\,\ldots$ that converges quadratically to $\dfrac{a}{b}$ in the following way

$x_{k+1} = x_{k} - \dfrac{f(x_k)}{f'(x_{k})} = x_{k} - \dfrac{b\,x_{k} - a}{b}$

Turns out that we didn’t make any progress yet, as long as we still need to compute a division with the same divisor $b$ . But, conceiving the division $\dfrac{a}{b}$ as a multiplication by the inverse $a\,\left(b^{-1}\right)$ , and being that inverse $b^{-1}$ a zero of the following trivial function:

$f(x) = \dfrac{1}{x} - b,$

we can build a series that converges quadratically to $b^{-1}$

$x_{k+1} = x_{k} - \dfrac{f(x_k)}{f'(x_{k})} = x_{k} - \dfrac{\dfrac{1}{x_{k}} - b}{-\dfrac{1}{x_{k}^2}} = x_{k}\left(2 - b\,x_{k}\right)$

Thus, we only need multiplications and subtractions to compute every element of the sequence. Once we have converged to the inverse, $x_{k} \simeq b^{-1}$ , we only need to estimate the division as

$c \simeq = a\,x_{k}$

A quadratic convergence means that the amount of figures found, should double from one iteration to the next one.

A square root and n-root algorithm

In a similar way, we can build a sequence converging to the square root of a given number, which is a root of the following function

$f(x) = x^2 - a \qquad \Rightarrow \qquad x_{k+1} = x_{k} - \dfrac{ x_k ^ 2 - a}{2\,x_k}$

Or, if we want to avoid computing the division in each iteration, we can build a sequence converging to the inverse of the square root, being such number a zero of the function

$f(x) = \dfrac{1}{x^2} - a \qquad \Rightarrow \qquad x_{k+1} = \dfrac{x_{k}}{2}\left(3 - a\,{x_k}^2\right).$

Voilà, we only need multiplications and additions/subtractions in each iteration, and the solution will converge quadratically to $\dfrac{1}{\sqrt{a}}$ . We can even compute the desired value ${\sqrt{a}}$ at the end without having to invert the solution, only multiplying by $a$

$\sqrt{a} = a\,\dfrac{1}{\sqrt{a}} \simeq a\,x_k.$

This method can be generalized for computing ${\sqrt[n]{a}}$ (the two previous examples correspond to $n = 1$ and $n = 2$ ):

$f(x) = \dfrac{1}{x^n} - a \qquad \Rightarrow \qquad x_{k+1} = \dfrac{x_{k}}{n}\left(n + 1 - a\,{x_k}^n\right).$

Once we have the final solution $x_{k} \simeq \dfrac{1}{\sqrt[n]{a}}$ , we can compute the n-root without inverting the solution, by means of

$\sqrt[n]{a} \simeq a\,x_k^{n-1}$

Implementation details

The implementation in the project page adds the new features to the previous version:

Inverse computation (file div.cc).
Division algorithm using the inverse method (file div.cc). The long division algorithm was already implemented.
Square and quartic roots (n-root not implemented) using the direct methods or the more efficient inverse ones (file sqrt.cc).

You will find more information about how to switch the algorithms in the README file.

The plot below shows a scalability comparison of the methods implemented in the library. Notice the scalability problems of Long Division and Long Multiplication algorithms (the latest takes 10,000 times longer than the FFT multiplication algorithm to compute 1 million figures). The Newton inverse algorithm and its related algorithms (inverse division, square root and quartic root), perform well enough for our purposes (much better than the non inverse versions). The non inverse square and quartic root algorithms are based on an inverse division algorithm (they would perform much worse if based on a long division algorithm). There are many possible optimizations, but they are beyond the scope of this post.

Figure 1: Scalability of the algorithms implemented in the library.

Some other important issues we didn’t tackle yet:

Initial guesses for the iterative methods: check the methods BigNumber::fromDouble(double, int) and toDouble(double&, int&) and their usage for computing an initial guess, i.e. for a multiplicative inverse.
Convergence criterions: a typical stop criterion with a convergent sequence would be when $x_{k+1}$ and $x_{k}$ are very close each other, $x_{k+1} \simeq x_{k}$ , and that means that all (or almost all) of their digits match up. Check the source code again for further detail.
Rounding errors: using algorithms that converge to multiplicative inverses can incur rounding errors when working with finite arithmetic. These errors can be overcome, but no solution has been implemented in the code, nor discussed in this post. As a hint for the division algorithm, for example, once we get the approximate quotient $c_1 \simeq \dfrac{a}{b}$ using the inverse approach, we can compute a new quotient $c_2 \simeq \dfrac{r}{b}$ (where $r = a - b\,c_1$ is the first remainder), and estimate the final quotient as $c \simeq c_1 + c_2$ . The second division can be performed with a more accurate Long Division algorithm, not taking too long given the small remainder.

Application example

I always wanted to write my own program for computing millions of decimals of $\pi$ , and now we have all the ingredients to do it by using, for instance, a 4th order Borwein’s algorithm (disclaimer: this is just an application of this general purpose library, not a specially optimized way to compute decimals of $\pi$ ). Let us start with the following values:

$a_0 = 6 - 4\sqrt{2},\qquad y_0 = \sqrt{2} - 1,$

and then construct the sequences $\{y_k\}$ and $\{a_k\}$

$y_{k+1} = \dfrac{1-(1-y_k^4)^{1/4}}{1+(1-y_k^4)^{1/4}}, \\ \\ a_{k+1} = a_k(1+y_{k+1})^4 - 2^{2k+3} y_{k+1} (1 + y_{k+1} + y_{k+1}^2),$

the sequence $\{a_{k}\}$ converges with quartic order to $\dfrac{1}{\pi}$ . You can get this result in a few minutes with a normal computer - as of 2012 - if you compute $\pi$ with 1,038,090 decimals (the last 7 decimals are wrong due to rounding errors, and the total amount of figures involved in computations is $2^{20}$ = 1,048,576). You can find the implementation in the file pi.cc.

Modeling and simulation of planetary motion

2011-12-05T00:00:00+00:00

One of my favourite problems in classical mechanics is the planetary motion, and one of my first attempts to solve it with computer algorithms dates from my early teens. My first approach was written in BASIC, and the results were completely wrong, as I didn’t know enough about the underlying maths. Subsequent attempts produced good results once I knew some basics of differential calculus during my high-school years.

In this post I’ll rescue one of those implementations, written in C, which uses the laws of motion to compute astronomical body paths.

Physics background

Let’s suppose we have two bodies $B_1$ and $B_2$ , at positions $\vec{p}_1$ and $\vec{p}_2$ , and with masses $m_1$ and $m_2$ , respectively. Newton’s law of universal gravitation states that the force between the two bodies is proportional to the product of the two masses, and inversely proportional to the squared distance:

$F = G\,\dfrac{m_1 \cdot m_2}{r^2}.$

Where $r = \left|\vec{p}_1-\vec{p}_2\right |$ . For instance, the force with which $B_1$ is attracted towards $B_2$ , including magnitude and direction, and being $\hat{r}$ the unitary vector from $\vec{p}_1$ to $\vec{p}_2$ , is

$\vec{F}_{12} = G\,\dfrac{m_1 \cdot m_2}{\left|\vec{p}_1-\vec{p}_2\right|^2}\hat{r} = G\,\dfrac{m_1 \cdot m_2}{\left|\vec{p}_1-\vec{p}_2\right|^3}(\vec{p}_2-\vec{p}_1)$

Figure 1: Solution example when two bodies interact.

If we also use Newton’s second law of motion, which relates force and acceleration

$\vec{F} = m\,\vec{a},$

and knowing that the acceleration $\vec{a}(t)$ is the second time derivative of position $\vec{p}(t)$

$\vec{a}(t) = \dfrac{d^2}{dt^2}\vec{p}(t),$

we can express the problem with the following set of coupled non-linear differential equations:

$\begin{array}{c} \dfrac{d^2}{dt^2}\vec{p}_1(t) = G\,\dfrac{m_2}{\left|\vec{p}_1(t)-\vec{p}_2(t)\right|^3}(\vec{p}_2(t)-\vec{p}_1(t))\\ \dfrac{d^2}{dt^2}\vec{p}_2(t) = G\,\dfrac{m_1}{\left|\vec{p}_1(t)-\vec{p}_2(t)\right|^3}(\vec{p}_1(t)-\vec{p}_2(t)). \end{array}$

If we provide the initial conditions, i.e. the initial positions $\vec{p}_1(t_0)$ and $\vec{p}_2(t_0)$ , and the initial velocities $\vec{v}_1(t_0)$ and $\vec{v}_2(t_0)$ , we can obtain, from the above equations, the paths $\vec{p}_1(t)$ and $\vec{p}_2(t)$ , regardless of the number of dimensions. Particularly, for a two-dimensional problem, where

$\begin{array}{c} \vec{p}_1(t) = (x_1(t), y_1(t))\\ \vec{p}_2(t) = (x_2(t), y_2(t)), \end{array}$

the set of equations becomes

$\begin{array}{c} \dfrac{d^2}{dt^2}x_1(t) = G\,\dfrac{m_2}{\bigl((x_1(t)-x_2(t))^2+(y_1(t)-y_2(t))^2\bigr)^\frac{3}{2}}\bigl(x_2(t)-x_1(t)\bigr)\\ \dfrac{d^2}{dt^2}y_1(t) = G\,\dfrac{m_2}{\bigl((x_1(t)-x_2(t))^2+(y_1(t)-y_2(t))^2\bigr)^\frac{3}{2}}\bigl(y_2(t)-y_1(t)\bigr)\\ \dfrac{d^2}{dt^2}x_2(t) = G\,\dfrac{m_1}{\bigl((x_1(t)-x_2(t))^2+(y_1(t)-y_2(t))^2\bigr)^\frac{3}{2}}\bigl(x_1(t)-x_2(t)\bigr)\\ \dfrac{d^2}{dt^2}y_2(t) = G\,\dfrac{m_1}{\bigl((x_1(t)-x_2(t))^2+(y_1(t)-y_2(t))^2\bigr)^\frac{3}{2}}\bigl(y_1(t)-y_2(t)\bigr), \end{array}$

Figure 2: Same solution as before, but showing the relative motion. The result is an ellipse.

where we have to find $x_1(t)$ , $y_1(t)$ , $x_2(t)$ and $y_2(t)$ . The problem can be easily generalized to $N$ dimensions and $M$ bodies.

Designing an numerical method to solve the equations

If we discretize the time domain

Figure 3: 3D solution example.

$t_n = t_0 + n\,\Delta_t\,\qquad n \in \mathbb{Z}^+,$

and we apply the following first order finite differences approximations of the differential operators (this is the easiest way, and it’s equivalent to Euler method):

$\begin{array}{c} \vec{v}(t) = \dfrac{d}{dt}\vec{p}(t) \simeq \dfrac{\vec{p}(t+\Delta_t)-\vec{p}(t)}{\Delta_t}\\ \vec{a}(t) = \dfrac{d}{dt}\vec{v}(t) \simeq \dfrac{\vec{v}(t+\Delta_t)-\vec{v}(t)}{\Delta_t} \end{array}$

We can compute both the velocity and position of a body $B_i$ , given the force exerted over it, by finding $\vec{p}_i(t+\Delta_t)$ from the above equations

As simple as:

We compute the forces for each body (the forces due to all the other bodies): $\vec{F}_i = \displaystyle\sum_{j \neq i}G\,\dfrac{m_i \cdot m_j}{\left|\vec{p}_i(t_n)-\vec{p}_j(t_n)\right|^3}\bigl(\vec{p}_j(t_n)-\vec{p}_i(t_n)\bigr)$
Hence we have the acceleration over each body: $\vec{a}_i = \dfrac{\vec{F}_i}{m_i}$
We integrate it, once to obtain the velocity: $\vec{v}_i(t_n+\Delta_t) = \vec{v}_i(t_n) + \Delta_t\,\vec{a}_i$
And we integrate it again, for the position: $\vec{p}_i(t_n+\Delta_t) = \vec{p}_i(t_n) + \Delta_t\,\vec{v}_i(t_n)$

The code

In the link below you have an implementation in C for the two-dimensional case. The code is ancient and platform-dependent, as it uses some basic graphic routines. Originally this code was written for MS-DOS, using the protected mode and the mode 13h. The code was originally compiled with the DJGPP programming platform under RHIDE. As this platform is from the past, you can either run it under an emulator like DOSBox, or compile it using the SDL wrapper that I’ve written for rescuing this and other old code (only tested under GNU/Linux). The wrapper is implemented in files mode13h.h and mode13h.c (the only platform-dependent code), and in order to switch between a native MS-DOS platform and SDL you must use the DOS symbol (when present, it assumes a native MS-DOS platform, otherwise it will use SDL).

The main algorithm is implemented in file newton.c, and, as I described before, it just computes the force exerted over each body, and then velocity and position.

Figure 4: Example of execution with relative motion.

A body is modelled with the struct body_t, which considers the position and velocity. Before starting the algorithm, you must create the bodies with the createBody() function, providing initial position, initial velocity and mass. Also, you can specify whether or not the body is fixed at its position, to simulate fictitious situations, suitable to being validated against Keppler’s laws (you can visually check how the two first laws hold when only two bodies interact). Another way to do this with real cases is by using the bodyref variable: when this variable is not null, the camera will follow the selected body.

Other useful parameters are scale_factor, to fit the required space in the screen, and calculations_per_plot, that allows you to enable frame dropping, computing with enough accuracy without having to plot all the solutions.

The source code includes an example of an Earth–Moon system, where you can check that, given realistic data for the initial values and masses, some other parameters like the orbital period match the known values.

In this video you can watch an execution example of the three-body problem, under DOSBox emulator, where you can appreciate the chaotic behaviour.

A fast circle algorithm for ZX Spectrum

2011-11-01T00:00:00+00:00

My first steps in computer programming were with an old ZX Spectrum, like many other people of my generation. Following a natural order, Sinclair BASIC was the first language I dealt with, but quite often the hardware limitations forced me to move to the Z80 assembly language. A common practice was to code some “time critical” routines - like some graphic routines - in assembly to avoid many of the bottlenecks.

In this post, I’ll explain, as an example, how to outperform the built-in circle drawing routine.

Algorithm explanation

The original built-in algorithm (command CIRCLE in BASIC) drew the circumference with an angular sweep, from 0 to 360 degrees, and although it was probably implemented in assembly, it seemed rather slow.

It is possible to outperform the original implementation without using trigonometric functions nor square roots, and even without using multiplications. The following algorithm is equivalent to the now known mid-point algorithm, but with different error formulation, and it is based on the implicit equation of the circumference. Such equation is, assuming the circle centered at (0,0):

$x^2 + y^2 - r^2 = 0$

To draw the contour, we establish $(x,y) = (r,0)$ as the starting point, and draw the first octant, with angles between 0 and 45 degrees. Below 45 degrees the $y$ component goes faster than the $x$ one, so we will increment the $y$ coordinate inside a loop, and correct the path decreasing the $x$ component whenever we consider we are “far away” from the contour. To evaluate how far we are, let’s compute the squared error

$e(x,y) = x^2 + y^2 - r^2$

Figure 1: Sweep from 0 to 45º and pixel discretization.

Let $\vec{p}_{k} = (x_{k},y_{k})$ be the coordinates of a pixel at a given step k; at each step we have to choose whether to move upwards or towards the upper-left pixel:

$\begin{array}{l} \vec{p}_{1,k+1} = \vec{p}_{1,k} + (0,1)\\ \vec{p}_{2,k+1} = \vec{p}_{2,k} + (-1,1) \end{array}$

Our decision will be based on a minimum squared error criterion; for the two options above we have the following expressions for the error (in our discrete space, we measure at the middle of the pixels, so the mathematical center is at the middle of the pixel (0,0))

$\begin{array}{ll} e_{1,k+1} = & e(\vec{p}_{1,k+1}) = x_{k}^2 + (y_{k}+1)^2 - r^2 = e_{k} + 1 + 2\,y_{k}\\ \\ e_{2,k+1} = & e(\vec{p}_{2,k+1}) = (x_{k}-1)^2 + (y_{k}+1)^2 - r^2\\ &= e_{1,k+1} + 1 - 2\,x_{k} = e_{k} + 1 + 2\,y_{k} + 1 - 2\,x_{k} \end{array}$

Notice that we can know the error at a given step if we know the error previous to it, with only a few additions. Whenever $|e_{1,k+1}| < |e_{2,k+1}|$ , we choose $\vec{p}_{1,k+1}$ as the next pixel, otherwise we take $\vec{p}_{2,k+1}$ . We can simplify the comparison in the following way, knowing that $e_{2,k+1}$ is always negative and $x_{k}$ and $y_{k}$ are integers

$\begin{array}{l} |e_{1,k+1}| < |e_{2,k+1}| \,\,\,\, \Rightarrow \,\,\,\, e_{1,k+1} < -e_{2,k+1} \,\,\,\, \Rightarrow \,\,\,\, e_{1,k+1} < -e_{1,k+1} - 1 + 2\,x_{k} \\ \,\,\,\, \Rightarrow \,\,\,\, 2\,e_{1,k+1} < 2\,x_{k} - 1 \,\,\,\, \Rightarrow \,\,\,\, e_{1,k+1} < x_{k} - \frac{1}{2} \,\,\,\, \Rightarrow \,\,\,\, e_{1,k+1} < x_{k} \end{array}$

The initial value for the error is $e_{0} = 0$ , because the first pixel $\vec{p}_{0} = (r,0)$ satisfies the circle equation. Observe the evolution of the error in the figure below; the corrections produced by the decrements in the $x$ component help to keep the error around $0$ .

Figure 2: Evolution of error in the angular sweep from 0 to 45º. While incrementing the y component, we correct the x component whenever the error becomes too high, according to a minimum squared error criterion. Each x correction results in an error reduction.

The algorithm can be coded in C as follows:

int x = r;
int y = 0;
int e = 0;

for (;;) {
  drawPixel(xc + x, yc + y);
  if (x <= y) {
    break;
  }
  e += 2*y + 1;
  y++;
  if (e > x) {
    e += 1 - 2*x;
    x--;
  }
}

Once we know how to draw the first octant, the rest of the circle can be drawn applying symmetry. Instead of drawing one single pixel, we draw 8 of them:

Figure 3: Simmetry.

drawPixel(xc + x, yc + y);
drawPixel(xc + x, yc - y);
drawPixel(xc - x, yc - y);
drawPixel(xc - x, yc + y);
drawPixel(xc + y, yc + x);
drawPixel(xc + y, yc - x);
drawPixel(xc - y, yc - x);
drawPixel(xc - y, yc + x);

The code

The source code is a .asm text file that you can compile with a z80 assembler like pasmo (you can use other assemblers, but some of them ignore the ORG directives, so be careful with the relocations). With pasmo you can generate directly a .tap file ready to be loaded in an spectrum emulator.

The code contains two main functions: one for drawing pixels and another one for drawing circles. Also, the file includes an execution example placed at the address 53000, that draws a set of concentric circles growing in size.

For drawing pixels, two lookup tables are used: tabpow2, with powers of 2, and tablinidx, with the order of the 192 screen lines (remember that the ZX spectrum used an interlaced access).

You can invoke the routine by placing the point coordinates at the addresses 50998 and 50999, and jumping to the address 51000

POKE 50998, 128
POKE 50999, 88
RANDOMIZE USR 51000

To invoke the circle routine, you must place the center coordinates at 51997 and 51998, and the radius at 51999, and then jump to the address 52000.

POKE 51997, 128
POKE 51998, 88
POKE 51999, 80
RANDOMIZE USR 52000

In this video you can see the execution of the following code under Spectemu emulator, comparing the performance of the two algorithms.

10 FOR i=1 TO 20
20 CIRCLE 128, 80, i
30 NEXT i
40 RANDOMIZE USR 53000

FFT multiplication algorithm

2010-10-25T00:00:00+00:00

We introduced in a previous post a library for operating with arbitrarily big numbers, where we implemented some basic arithmetic operations, like addition, multiplication and division.

In this post we will focus on improving the efficiency of the multiplication algorithm through Fast Fourier Transforms.

The algorithm

Let’s take two random numbers, say

$\begin{array}{c} a = 143672,\\ b = 669381 \end{array}$

The multiplication result is

$a \cdot b = 96171307032$

If we see each number as a digit sequence (digits between 0 and 9), we can think of each of those digits as coefficients of a polynomial. The multiplication of two polynomials and the ordinary multiplication are closely related

$\begin{array}{c} p_a(x) = x^5+4\,x^4+3\,x^3+6\,x^2+7\,x+2\\ p_b(x) = 6\,x^5+6\,x^4+9\,x^3+3\,x^2+8\,x+1 \end{array}$ $\begin{array}{c} p_a(x) \cdot p_b(x) = 6\,x^{10} + 30\,x^9 + 91\,x^8 + 53\,x^7 + 125\,x^6 + 150\,x^5 + \\ 121\,x^4 + 90\,x^3 + 68\,x^2 + 23\,x + 2 \end{array}$

Thus, we obtain the sequence [6 30 91 53 125 150 121 90 68 23 2]. As some of these digits are greater than 9, we need to apply a carry propagation correction, obtaining the sequence [9 6 1 7 1 3 0 7 0 3 2], which is the result we were looking for. The carry correction algorithm has a time complexity of order $O(n)$ .

It is well known that a polynomial multiplication algorithm is equivalent to a discrete convolution over the coefficients, so we can also conceive our digit sequences as discrete signals.

Figure 1: Convolution of two discrete signals.

The discrete convolution can be computed as

$(a * b)[n] = \displaystyle{\sum_{m=-\infty}^{\infty}{a[m] \cdot b[n-m]}}.$

The classical algorithm for the convolution has an order of $O(n^2)$ time complexity, so there is no point at the moment. To reduce this complexity, we’ll use the convolution theorem, which tells us that the Fourier transform of a convolution is the pointwise product of the Fourier transforms.

$\mathcal{F}\{a * b\} = \mathcal{F}\{a\} \cdot \mathcal{F}\{b\}.$

That is, we can compute the DFTs of the original signals, multiply them pointwise, compute the inverse DFT and apply the decimal correction. The key point is that we can use efficient FFT algorithms to compute both the DFT and the inverse DFT.

Figure 2: Block diagram of the algorithm.

From a $O(n^2)$ order algorithm, we have moved to the computation of two FFTs, with order $O(n \cdot log_2(n))$ each one, a pointwise product, with $O(n)$ , an inverse FFT, with $O(n \cdot log_2(n))$ , and a carry propagation correction, with $O(n)$ , so putting all together, our final algorithm will have an order of $O(n \cdot log_2(n))$ , much better than the original algorithm.

As a final optimization, it is possible to avoid computing one of the two FFTs if we take into account that the original signals are both real, so if we build a complex signal which carries one of the signals in its real part and the other signal in the imaginary part

$c[n] = a[n] + i\,b[n],$

we can extract the original signals applying the real part and imaginary part operators

$\begin{array}{c} a[n] = \Re\{y[n]\} = \dfrac{c[n] + c^*[n]}{2}\ \\ b[n] = \Im\{y[n]\} = \dfrac{c[n] - c^*[n]}{2\,i} \end{array}.$

Using the following property of the Fourier transform

$a^*[n] \longrightarrow \mathcal{F}\{a\}^*[-n],$

we can compute now the DFT of the composite signal and extract the DFTs of the original signals

$\begin{array}{c} \mathcal{F}\{a\}[n] = \dfrac{\mathcal{F}\{c\}[n] + \mathcal{F}\{c\}^*[-n]}{2}\ \\ \mathcal{F}\{b\}[n] = \dfrac{\mathcal{F}\{c\}[n] - \mathcal{F}\{c\}^*[-n]}{2\,i} \end{array},$

where the extra operations have all order $O(n)$ .

Figure 3: Algorithm with only 1 FFT.

The code

Starting from the original code, some functions have been added: FFT and inverse FFT computation functions, in the files fft.h and fft.c. Also, the memory allocation is now dynamic in order to deal with really huge numbers; furthermore, the show() function has been improved in such a way that it only prints the most and less significant digits with very big numbers, while the whole computation is dumped to the file output.txt.

To test the new multiplication algorithm, the program evaluates the largest known prime number at the moment this post was written, the Mersenne prime $2^{43,112,609}-1$ , with 12,978,189 digits. You no longer need to modify BigNumber::N_DIGITS, it will automatically adapt to the selected exponent. In this case, we need $2^{24}$ = 16,777,216 figures - must be power of 2 - to compute a number with 13 million digits.

The algorithm used to obtain the Mersenne number is detailed in the source code. You will need a few minutes and 1,3 GB of RAM memory to run the program with the current configuration in a normal - as of 2010 - computer. If that’s too long for you, you can compute a cheaper Mersenne prime, e.g. the number $2^{3,021,377}-1$ (with 909,526 digits), in a few seconds.

Take a look to those humongous numbers to get an idea of what we are talking about. Find below the execution output for both of them in raw text formatted to 75 columns. Can you imagine how hard it must be to proof that they are indeed prime numbers?

It is possible to aplly some improvements with parallelization and a more dynamic memory allocation, but that’s beyond the scope of this post. In future articles we will introduce algorithms to perform more complicated mathematical operations, like square roots.

Arithmetic operations with arbitrary precision

2009-12-19T00:00:00+00:00

I decided to write this small library in 2000, after watching a rerun of “The Simpsons” episode “Treehouse of Horror VI”.

Figure 1: Homer Simpson in a three-dimensional world.

In the last story of this episode, Homer Simpson jumps into a three-dimensional world. The full story, and specially this parallel world, are full of mathematical tributes: geometrical shapes, Euler’s identity, a reference to the P versus NP problem, and so on. Among these mathematical references, a disturbing numerical formula attracted my attention:

$1782^{12} + 1841^{12} = 1922^{12}$

The previous statement is obviously a joke, since it contradicts Fermat’s Last Theorem. However, evaluating both sides of the equation with a scientific calculator, one can fall for it. Try with Google, for example:

The problem with is that a normal calculator doesn’t handle the enough amount of significant figures. Given the fact that both sides of the equality share the 9 first digits, the result is, apparently, the same. Here is the full 40-digit result:

The main trouble stems obviously from a spatial limitation of the display, unable to represent all the computed figures in scientific notation. But even if our device had a larger screen, the above formulas couldn’t be properly evaluated due to the internal floating point storage precision.

Even though there are numerous tools able to perform arithmetic operations with arbitrary big integers (like the phyton language, or GNU bc), I found interesting, as part of my learning process, to write a lightweight and generic library to deal with this and other similar problems.

In this post I will present the library code and its basic structure, and in future publications we will try to improve its efficiency and show some practical examples.

Library specification

The library works with natural binary-coded decimal (NBCD) format, with sign and module agreement, so an arbitrarily big integer can be encoded in a - long enough - BCD digit sequence.

Figure 2: BCD sequence.

In order to handle fractional numbers, the library uses fixed-point arithmetic, therefore the format - the BCD sequence - is divided into integer and fractional parts. This way, fractional numbers can be approximated with arbitrary precision by simply allocating enough memory in the fractional part. This scheme can deal with arbitrary-sized numbers, but in a static way, i.e., having previously allocated the needed memory.

Figure 3: Multiplication with fixed-point arithmetic.

Once the format has been chosen, let us describe some simple arithmetic operations:

Addition and subtraction: These operations are made by a point-wise addition or subtraction of both BCD sequences, applying a carry propagation from each digit to the next one.
Multiplication: The operation is performed by a typical long multiplication algorithm.
Division: The algorithm used is the long division algorithm.

The code

The library is coded in C++, but without exploiting many of the features of the language.

A number is coded in BigNumber class, which consists basically in a byte array and a sign flag. For simplicity and clarity, only one decimal digit is coded in each byte, instead of two per byte.

The file config.h is a configuration point, where we can set the number of digits assigned to the integer and fractional parts in the format.

The file test.cc contains a main function with an example application: it evaluates both sides of the former equation $1782^{12} + 1841^{12} = 1922^{12}$ , and verifies its falseness. At least 40 digits are needed to carry out the operation, so the default configuration, which reserves 50 digits for both the integer and fractional parts, is enough for our purposes.

The rest of the files implement several operations.

addsub.cc: addition and subtraction.
mul.cc: multiplication.
div.cc: division.
convert.cc: conversion between bignumbers and floating point numbers.

Sudoku Solver

2009-11-22T00:00:00+00:00

In this post I will propose a way to solve a sudoku puzzle with a computer program, trying to find all the puzzle solutions in an efficient way using an algorithm based on reduction of Uncertainty.

The basic idea and the source code date from 2005, when the puzzle became popular in Europe.

The algorithm basis

Figure 1: A sudoku puzzle.

A sudoku puzzle consists of 9×9 squares accepting digits between 1 and 9. A digit placed in a square cannot be repeated in its row, column, or 3×3 grid. Therefore, each square belongs to 3 uniqueness constraint groups (a row, a column and a 3×3 grid), having a total amount of 3×9 constraint groups, 9 of each type.

Each square has a state: a set of digits that can be fit without violating its 3 uniqueness constraints, and therefore the total board state will be the set of all the 81 individual states. Without any information - an empty sudoku - the initial state is the one with the maximum uncertainty: all the digits are possible in each square.

The state of a square gives us a metric of uncertainty, i.e., the number of possible digits. By adding up the uncertainty of each square, we can obtain a global uncertainty metric. Our strategy will be to reduce this uncertainty until we reach the minimum state: the solution.

We have two ways to perform this reduction:

Using deductions: given a state, we move to other state with lower uncertainty. The new state will be valid whenever so is the original one. We say a state is valid when it contains the solution, which is actually not easy to determine. Fortunately, we can at least identify those invalid states that do not respect the uniqueness constraints; those, for sure, will not contain the desired solution. Fair enough.
Using hypothesis: given a state where no deductions can be made, we make an assumption which takes us to another state with lower uncertainty. We have no guarantee about the validity of our hypothesis, and hence about the validity of our new states, but a wrong decision will result sooner or later in an invalid state, so we will have the chance to rectify.

Using only hypothesis involves a typical brute-force algorithm. But in this approach, the use of deductions will allow us to drastically reduce the space where hypothesis are made: the lower the uncertainty is, the less assumptions we will need. The key point is that every time we reduce the uncertainty of a square, we can trigger recursively new deductions in their related squares (the ones belonging to its constraint groups), resulting in a notable reduction of the overall uncertainty of the problem.

The algorithm will therefore consist of applying deduction rules when possible, and hypothesis when necessary, until we reach the solution. If we reach an invalid state, produced by a hypothesis, we rule out that hypothesis and make a new one. The algorithm flowchart is as follows:

Figure 2: Algorithm flow.

We start from the maximum uncertainty state: a void board where all the digits (1..9) are possible in all the squares.
Use the initial board information to recursively trigger the initial deductions. If the puzzle is solved, we exit. Many cases can be solved using this technique only.
If the puzzle is not solved, we make an assumption over the next square with uncertainty (the next within a given sequence). This will trigger new deductions recursively, with three possible results:
- If the puzzle is solved, we exit.
- If we arrive at an inconsistent state, the last hypothesis was incorrect and we try the next possible one over the same square (or, if we cannot make more assumptions over that square, we move to the next one in our sequence).
- Otherwise - the puzzle is not solved yet - we jump to point 3.

But, how do we exactly make hypothesis and deductions?. Well, making a hypothesis is not a big deal. It could be, say, to randomly pick a digit among the state of a given square (set of possible digits), and suppose it is indeed the correct choice. These are the kind of assumptions we will consider in this program. But we could have chosen other options, like, for instance, assuming that one or more of the possible digits are not possible in the given square. This approach - not explored in this post - could give a better performance, as it could minimize mistakes by choosing more likely options.

Well then, and what about the deductions?. Let’s start with the easy case: if we know that a square contains a given digit, that digit cannot appear in the same row, column or 3×3 grid. That digit can therefore be removed from the state of all the related squares.

Figure 3: Uncertainty reduction in the 3×3 subgrid at the middle. As far as we know the initial value of the above four squares, their digits cannot appear in the other squares of the restriction group.

Let’s say it in a different way: if a given constraint group has a square whose state contains one single digit (no uncertainty at all), that digit necessarily has to belong exclusively to that square, and the uncertainty of the other squares can be reduced.

What if we have, in a given constraint group, two squares accepting the same pair of digits and nothing else?, don’t panic: those two digits must be distributed only between those two squares, no other square in the group can accept any of them, and hence their uncertainty can be reduced by removing them from their states.

Can we generalize this rule? certainly. By simple induction, if we want to distribute a set of N digits among the squares of a constraint group, and there are exactly N squares accepting only digits from that set, those N digits necessarily have to be distributed among those N squares, and the other 9 - N squares can’t receive any of them.

Figure 4: The four numbers 2, 3, 5 and 6 can only be placed at four squares, so those four squares cannot accept any other numbers. Or, alternatively: as long as there are three squares accepting only elements from the set of three digits { 4, 7, 9 }, no other square can accept any of them.

Let’s say it again, in a more formal way this time: given a restriction group C - a set of 9 squares -, let A be the state of one of them - the set of possible digits of that square - and let B be the set of squares of C with their state contained in A. If the cardinal of A is the same as the one of B, then the squares of C which are not contained in B - i.e., C \ B - can’t contain any element of A in their states. Thus, those elements can be removed from their states, and, consequently, the uncertainty is reduced.

That’s all, this is the basic rule, we don’t need anything else, so let’s start programming.

The code

The program is written in C and has been tested under GNU Linux. More details about its usage and board format can be found in the README file.

The source code is quite clear, so I’ll just give some guidelines to help its understanding:

About the data model, the structures square_t and square_group_t respectively model a square and a restriction group. No more complex data structures are needed.
Regarding that the square state is coded as a bit field (an unsigned short where the least significant 9 bits represent the possible digits), the following bitwise operations can be done:
- An uncertainty reduction operation consists in removing a bit subset from a bit set: a &= ~b (the set b is removed from a).
- A set inclusion can be detected checking the equality: if ((a | b) == b) … (is a contained in b?)
- The number of considered digits (the number of set bits) can be computed with the GCC built-in function __builtin_popcount.
- The digit coded by a set of bits with only one set bit can be obtained with the GCC built-in function __builtin_ffs, wich returns the first set bit index.
The method uncertainty_reduction performs a deduction rule over a square (the recursion enables to trigger other reductions in the related squares), and the method make_assumption makes a hypothesis.

Ibán Cereijo

A division and n-root algorithm for big numbers

A new division algorithm

A square root and n-root algorithm

Implementation details

Application example

Links

Modeling and simulation of planetary motion

Physics background

Designing an numerical method to solve the equations

The code

Links

A fast circle algorithm for ZX Spectrum

Algorithm explanation

The code

Links

FFT multiplication algorithm

The algorithm

The code

Links

Arithmetic operations with arbitrary precision

Library specification

The code

Links

Sudoku Solver

The algorithm basis

The code

Links