Generating Random Numbers: Basics of Monte Carlo Simulations, Kai Nordlund 2006
Central to any MC simulation are the random numbers. Hence it is important to have a good
source of random numbers available for the simulations. Getting ’good’ random numbers is in fact
not quite as easy as many people think it is, so we will spend quite some time on this topic.
Random numbers can be divided into (at least) three categories: true random numbers,
pseudorandom numbers and quasirandom numbers.
• True random : no way to predict what the next random number is
• Pseudorandom: a sequence of numbers which is algorithmically produced and not really
random
– It can be fully repeated if the initial conditions and algorithm are known
• Quasirandom numbers: act as random numbers in some sort of simulations, but are
well-ordered in some other types.
More precise explanations for these categories are given later in this section.
Although these are almost never used anymore, it is good to keep in mind that there do exist
non-automated means to generate random numbers.
The most obvious one is to just write numbers down manually.
Although that has obvious problems for generating large sets of numbers, it is widely used for
selecting the seed number (see below) for electronic random number generators.
Another one, which was historically used to some extent, is to select numbers from some number
sequence, e.g. the phone book or the decimals of π .
• The former method is highly inadvisable, as there obviously can be strong non-random
features in phone numbers.
• The latter is not so bad, since the decimals of π are not supposed to have correlations
As technology evolved, mechanical devices were devised which produced random numbers.
• The machines used in lottery or roulette
• Also specifically designed machines which could generate thousands of numbers have
been built. In 1955 the RAND corporation actually published a book with 1 million
random numbers. (undoubtedly a strong contender for the title of most boring book ever
published).
Since the advent of electronic computers, such devices have become obsolete, except in applications
(such as the lotto games) where it is important that even a layman can easily check that there is
no scam involved.
Mechanical random numbers can be true random numbers, such as the ones generated in a lotto
machine or a roulette, (unless maybe if the casino is run by the mob...). Numbers generated by a
mechanical calculating device may also be pseudorandom.
It might seem an obvious idea to design microprocessors, or parts of them, to be able to generate
random numbers electronically - that is, design an electronic component which delivers a noise signal
that is translated into random 0's and 1's when converted to digital form. But this is in
practice very rarely done.
But even an electronic random number generator could have its problems; it is easy to imagine that
minute electronic disturbances from the environment could affect the results produced by it.
For a physicist, an obvious idea to obtain random numbers independent of any reasonably possible
outside disturbance is to use radioactive decay. Since nuclear decay processes are almost completely
insensitive to their surroundings, they provide a source of true randomness.
A quick google search reveals that there do indeed exist random number generators based on radioactivity,
see http://www.fourmilab.ch/hotbits/. Here is a sample of 16 truly random bytes:
unsigned char hotBits[16] = {168, 193, 241, 195, 37, 251, 190, 121, 105, 137, 173, 226, 95, 181,
239, 231};
But unfortunately downloading random numbers from the internet would be way too slow for most
MC simulations, and putting a radioactive source inside a chip is not a very good idea either.
Before we discuss the common algorithmic pseudorandom numbers, we mention something which lies
between true and pseudorandom numbers.
4.3.2.1. HAVEGE – HArdware Volatile Entropy Gathering and Expansion
[http://www.irisa.fr/caps/projects/hipsor/HAVEGE.html]
There is a serious effort going on in generating random numbers based on computer hardware. This
is not done in the chip electronics, but by software which reads chip registers.
There are several such approaches, but most are pretty slow, capable of producing only 10’s or 100’s
of bits per second.
In a fairly recent development (from 2002 on), Seznec and Sendrier have constructed a generator which uses
several advanced features of modern superscalar processors, such as caches, branch predictors,
pipelines, instruction caches, etc. Every invocation of the operating system modifies thousands of
these binary volatile states. So by reading from them, one can obtain data which will immediately
change afterwards and hence is practically impossible to reproduce.
So for practical purposes they do generate true random numbers, although formally they do not.
It is even possible to randomly personalize the generator source code during installing, and not only
with numbers but actually even by randomizing bit-level logical and shift operators in the source
code.
On a 1GHz Pentium, the routine can deliver random bits at a rate of 280 Mbit/s. On Pentium
III’s, the uncertainty is gathered from the processor instruction cache, data cache, L2 cache, as well
as the data translation buffer.
The downside here is that the implementation can not possibly be fully portable. As of the writing
of this (Jan 2006), the routine works for Sun UltraSparc’s, Pentium II, III, 4, Celeron, Athlon and
Duron, Itanium, and PowerPC G4. A nice list, but by no means complete.
For scientific simulations, as discussed in the next section, it is actually usually desirable to use
predictable random numbers, to enable repeatability. For this kind of application, HAVEGE is
clearly unsuitable.
The by far most common way of obtaining random numbers is by using some algorithm which
produces a seemingly random sequence of numbers. But the numbers are not truly random; if the
call to the routine is repeated with the same input parameters, the numbers will always be the
same. Such numbers are called pseudorandom numbers.
The most common way to implement the generators is to enable initializing them once with an
integer number, called the seed number. For a given seed number, the generator produces always
the same sequence of numbers.
The seed number could be set from the system clock, or just selected manually. Doing the latter is
actually almost always advisable, since this allows one to repeat the simulation identically, i.e. the
simulation is repeatable.
• Science should by definition be repeatable!
• Repeatability is also useful from a practical point of view: can get more details if needed
The repeat interval in the most used decent generators is of the order of the number of integers
that can be represented by a 4-byte integer, 2^32 ≈ 4 × 10^9. While this might seem large enough,
it is not necessarily so!
• A simple calculation shows it is quite easy to exceed 4 × 10^9 random numbers nowadays.
Another important quantity to know for integer generators is the maximum value the routine can
return. This is often (but not always) about the same as the maximum possible value of a (signed
or unsigned) integer or long integer variable, i.e. e.g. 2^31 − 1 or 2^32 − 1.
To then get floating point random numbers distributed evenly between 0.0 and 1.0, one simply divides the integer random number by the maximum value the generator can return (or by that maximum plus one, depending on which limits are desired; see below).
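As a minimal sketch (the routine name myrand() and its maximum value IMAX are just placeholder names here):

#define IMAX 2147483646L        /* assumed maximum value returned by the integer generator */
extern long myrand(void);       /* placeholder for some integer random number generator */

/* Uniform deviate in [0,1): dividing by IMAX+1 excludes exactly 1.0;
   dividing by IMAX instead would include it. */
double myrandf(void)
{
    return (double) myrand() / ((double) IMAX + 1.0);
}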
Finally, it is often important to know exactly what the limits of the generator are. That is, for a
floating point generator, can exactly 0 and 1 (within the numerical precision) be returned or not?
• Is the interval [0, 1], [0, 1[, ]0, 1] or ]0, 1[?
Purely mathematically this does not matter: the probability of hitting exactly the limit is exactly 0.
But on a computer it may matter a lot! Especially if we work with 4-byte floating point values,
which have a mantissa with only about 7 digit accuracy, the probability of hitting the limit is
actually about 1 in 10 million. Even for 8-byte floating point variables, with a 15-digit mantissa,
the probability of hitting the limit may not be negligible.
• It is easy to think of cases where this matters (imagine e.g. what happens if you take a
logarithm of a random number which can be 0).
• Different applications need different limits, and many times it really does not matter after
all.
I will now attempt to take a pedagogical approach to generating random numbers, presenting one
of the most common approaches step-by-step.
One of the simplest decent, and probably still most used method to generate random numbers is to
use the following equation:
I_{j+1} = (a I_j + c) mod m      (2)
where a, c and m are suitably chosen integer constants.
For the sake of example, let us take as the seed number I_0 = 4, and select the constants as
a = 7, c = 4 and m = 2^4 − 1 = 15. Plugging in the numbers gives
aI0 + c = 32
and after taking the modulus we get I_1 = 2. Continuing gives the following list of numbers:
2, 3, 10, 14, 12, 13, 5, 9, 7, 8, 0, 4, 2, 3, . . .
i.e. after 12 numbers the sequence starts repeating itself, so the period is only 12 instead of the
maximum 15; with the seed 1 the period is even shorter, only 3. Hence care should be taken in
selecting not only the random number generator but also the constants in it.
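As a small illustration, the toy generator above can be coded and its period measured in a few lines of C:

#include <stdio.h>

/* The toy linear congruential generator I_{j+1} = (7*I_j + 4) mod 15,
   started from the seed 4; the loop stops when the seed value reappears. */
int main(void)
{
    int a = 7, c = 4, m = 15, seed = 4;
    int I = seed, period = 0;
    do {
        I = (a*I + c) % m;
        printf("%d ", I);
        period++;
    } while (I != seed);
    printf("\nperiod = %d (the maximum would be %d)\n", period, m);
    return 0;
}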
For the linear congruential generator there are actually quite simple rules which guarantee that the
period reaches the maximum m, i.e. we obtain all integers in the range [0, m − 1] before
repetition. These are:
• c and m should have no common prime factors
• a − 1 is divisible by all the prime factors of m
• a − 1 is divisible by 4 if m is divisible by 4.
Since the numbers chosen above, a = 7, c = 4 and m = 15, do not fulfill the second criterion, it
is indeed to be expected that the full period is not achieved, and that the period length may depend
on the seed. This is a terrible feature for a generator.
But if we had chosen m = 17, a period of 16 (all values except one) is achieved for any seed
except 5, since (7 × 5 + 4) mod 17 = 5, i.e. 5 is a fixed point of the generator. In general,
whenever the equation aI + c ≡ I (mod m) has a solution I in the interval [0, m − 1], that
seed value is a fixed point and will fail.
A note on coding generators in practice. There are two basic approaches one can use to code these
generators. The first is to have two separate subroutines, one which sets the seed number, another
which gives the random numbers. The first obviously takes the seed number as an argument,
whereas the latter does not need any argument at all.
This is the approach taken e.g. in the C language. The system-provided random number generator
is given a seed with the function
void srand(unsigned int seed)
and the actual random numbers are generated using the function
int rand(void)
This means that the subroutine srand has to pass the seed number to rand() somehow. But this
can of course be easily implemented e.g. with an external variable, or a module in Fortran90. In C:
static unsigned long seed;            /* external variable shared by srand() and rand() */
unsigned int myrand(void)             /* a, c and div are the generator constants */
{
seed = seed*a + c;
return (unsigned int) (seed/65536) % div;
}
Another approach is to require the seed number to always be carried along in the function call, and
be treated like any other variable. In that case one simply sets the seed once, then lets the generator
update it without touching it anywhere else in the code. E.g.
seed=1;
for(i=0;i<=10000;i++) {
   x = myrand2(&seed);   /* hypothetical generator which takes the seed along and updates it */
}
Which approach is better to use will of course depend on the other aspects of the code.
• The first one is slightly simpler in large codes, as one does not have to drag along the
seed number over many places of the code.
• The latter one has the advantage that one could use different random numbers sequences
at the same time easily, by using variables seed1, seed2 and so on
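A minimal sketch of the second approach (the routine name myrand2 and the constants are just illustrative choices, not any standard generator):

/* Linear congruential generator with the state passed along explicitly, so that
   several independent sequences (seed1, seed2, ...) can be used side by side. */
unsigned int myrand2(unsigned long *seed)
{
    const unsigned long a = 1103515245UL, c = 12345UL;   /* example constants */
    *seed = *seed * a + c;          /* wraps around modulo 2^N for N-bit unsigned long */
    return (unsigned int) (*seed / 65536) % 32768;
}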
But keeping the possible problems in mind still, the simple equation 2 is often quite good enough for
many applications, especially with a careful choice of the parameters. A very widely used generator
is the “Minimal standard” generator proposed by Park and Miller. It has
a = 7^5 = 16807,   c = 0 (!),   m = 2^31 − 1
When implementing it one meets a practical problem. Since the modulus factor m is almost 2^31,
the values returned by the generator can also be close to 2^31. This means that on a computer one
needs at least 31 bits to describe them. Then if we look at the product aI, one sees that the values
can easily exceed 2^32.
Why is this a problem? Because many compilers to date can only handle integers with sizes up to
32 bits.
For instance on most 32-bit computers (including Pentiums) the sizes of the C integer variables are
16 bits (data type short) or 32 bits (data type int and long).
So doing the product above directly is not even possible on all machines.
Fortunately there exists a well-established solution by Schrage. It involves an approximate
factorization of m, which we will not go into here. An implementation of this, from Numerical
Recipes in C is:
#define IA 16807
#define IM 2147483647
#define AM (1.0/IM)
#define IQ 127773
#define IR 2836
#define MASK 123459876

float ran0(long *idum)
{
	long k;
	float ans;

	*idum ^= MASK;                      /* XOR with MASK allows use of a zero seed */
	k=(*idum)/IQ;
	*idum=IA*(*idum-k*IQ)-IR*k;         /* Schrage's method: IA*idum % IM without overflow */
	if (*idum < 0) *idum += IM;
	ans=AM*(*idum);                     /* convert to a floating point number in (0,1) */
	*idum ^= MASK;                      /* unmask before returning */
	return ans;
}
(If you want the actual integer seed number idum returned, just remove the multiplication with AM.)
When using C, it is actually possible to avoid this implementation problem using a dirty trick.
• The C standard actually specifies that if you multiply two 32-bit integers and place the
result into a 32-bit int register, the result returned is the low-order 32-bit of the real, up
to 64-bit result.
• So this is exactly the operation of taking a modulus with 2^32 for unsigned integers. Hence
one can reduce the entire linear congruential generator into a single line:
seed = seed*a + c;   /* unsigned 32-bit arithmetic performs the mod 2^32 automatically */
This is highly non-portable, since it requires that the length of “long” be 32 bits (which is not true
e.g. on many 64-bit platforms, where long is 64 bits).
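A sketch of the same trick written portably with the C99 fixed-width types; the constants are the common "quick and dirty" values quoted e.g. in Numerical Recipes, used here only as an example:

#include <stdint.h>

static uint32_t seed = 1;

/* The entire generator in one line: unsigned 32-bit overflow performs the mod 2^32. */
uint32_t quickrand(void)
{
    return seed = 1664525u*seed + 1013904223u;
}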
The problems with the simple linear congruential approach of equation (2) can be divided into two
categories: poor implementations, and fundamental problems.
4.4.2.1. Implementation problems
Although it might not seem necessary to list ways in which one can screw things up in programming,
there is one important example which should be mentioned.
In the ANSI standard of the C language, the way to implement the system functions (in stdlib.h)
rand() and srand(seed) is not specified. But they do specify an example of a generator, which
is as follows (here I have modified the code to add clarifying variable names):
static unsigned long seed;                       /* shared with ansistandardsrand() below */
unsigned int ansistandardrand(void)
{   const unsigned long a = 1103515245, c = 12345; const unsigned int div = 32768;  /* ANSI example constants */
    seed = seed*a + c;
    return (unsigned int) (seed/65536) % div;
}
void ansistandardsrand(unsigned int seed_set)
{
seed=seed_set;
}
Note that the modulus m is not given explicitly, but really is 2^32. But then the returned value
is divided by the very low number div=32768, meaning that only 32768 distinct values can be
returned, and that the smallest non-zero value which can be returned is 1/32768. Hence here the
repeat interval is much longer than the number of distinct values returned.
It is obvious that there are many, many applications where a finer resolution than 1/32768 is
needed. So this example generator should essentially not be used for anything.
Unfortunately this generator, and even worse modifications of it, is in fact implemented in many
compilers (both C and Fortran; I don't know about Java). This leads us to the important rule of
thumb: never use the compiler standard random number generator! And do not think that your
own compiler's generator is one of the good exceptions unless you have explicitly tested it.
Related to this is this simple problem: if you consider a very small value of I in the Park-Miller
generator
I_{j+1} = 16807 I_j (mod 2147483647)
say I_j = 10. Then I_{j+1} = 168070, i.e. still much less than the modulus 2147483647. So when I
is divided by the modulus to give a floating-point value, we see that we first get 4.6566 × 10^−9
and then only about 7.8 × 10^−5. I.e. a very small value will always be followed by another small value,
whereas for truly random numbers it could of course be followed by any number between 0 and 1!
There are also more subtle problems, such as so called short-range serial correlations. And they also
often can have really terrible problems in a 2D plane; this is illustrated in an exercise.
To overcome these problems, it seems like a pretty obvious idea to ’shuffle’ the numbers somehow.
This should at least solve the problem with successive small numbers and the 2D problem, and
might help with the short-range correlation as well.
This can be simply achieved by having a table which holds a number of random numbers, and
returns one of them in a random sequence. In practice, already a small table is enough to improve
on the results significantly. Numerical Recipes presents a solution which has an array of 32 elements.
#define IA 16807
#define IM 2147483647
#define AM (1.0/IM)
#define IQ 127773
#define IR 2836
#define NTAB 32
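With these definitions (plus NDIV = (1+(IM-1)/NTAB)), the routine itself is essentially the following; this closely follows Numerical Recipes' ran1, which additionally clips the returned value to stay just below 1.0:

#define NDIV (1+(IM-1)/NTAB)

float ran1(long *idum)
{
	int j;
	long k;
	static long iy=0;
	static long iv[NTAB];

	if (*idum <= 0 || !iy) {              /* initialization on the first call */
		if (-(*idum) < 1) *idum=1;        /* prevent idum == 0 */
		else *idum = -(*idum);
		for (j=NTAB+7;j>=0;j--) {         /* 8 warm-up rounds, then fill the table */
			k=(*idum)/IQ;
			*idum=IA*(*idum-k*IQ)-IR*k;
			if (*idum < 0) *idum += IM;
			if (j < NTAB) iv[j] = *idum;
		}
		iy=iv[0];
	}
	k=(*idum)/IQ;                          /* the ordinary Park-Miller step (Schrage) */
	*idum=IA*(*idum-k*IQ)-IR*k;
	if (*idum < 0) *idum += IM;
	j=iy/NDIV;                             /* the shuffle: pick a table entry ... */
	iy=iv[j];                              /* ... output it ... */
	iv[j] = *idum;                         /* ... and replace it with the new number */
	return AM*iy;
}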
How does this work? The first long if clause initializes the sequence the first time the generator is
called. It first generates 8 numbers which are thrown away, then fills in the array with 32 random
number elements.
After this the ordinary Park-Miller generator follows, except that in the middle we have the operations
j=iy/NDIV;
iy=iv[j];
iv[j] = *idum;
Here j first acquires a random value determined by iy, the random number generated in the previous
call. The number returned is then taken from slot j of the shuffle table, and that slot is refilled with
the newly generated *idum.
Proceeding in the development of the congruential generators, one can combine two single generators
to form one with a very much longer period. This can be done by generating two sequences, then
subtracting the result of one of them from the other (subtraction prevents an integer overflow). If
the answer is negative, the number is wrapped to the positive side by adding the modulus m of one
of the generators.
This forms a random-number sequence whose period can be roughly the product of the periods of the two
generators. With generators similar to the Park-Miller generator with m ∼ 2^31, one can thus reach
a period of the order of 2^62 ≈ 4.6 × 10^18. Important in selecting the moduli m1 and m2 is that
the periods they form do not share many common factors. In the following generator the periods
are
m1 − 1 = 2 × 3 × 7 × 631 × 81031 = 2147483562
and
m2 − 1 = 2 × 19 × 31 × 1019 × 1789 = 2147483398
if (*idum <= 0) {
if (-(*idum) < 1) *idum=1;
else *idum = -(*idum);
idum2=(*idum);
for (j=NTAB+7;j>=0;j--) {
k=(*idum)/IQ1;
*idum=IA1*(*idum-k*IQ1)-k*IR1;
if (*idum < 0) *idum += IM1;
if (j < NTAB) iv[j] = *idum;
}
iy=iv[0];
}
k=(*idum)/IQ1; ! A
*idum=IA1*(*idum-k*IQ1)-k*IR1; ! A
if (*idum < 0) *idum += IM1; ! A
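The listing breaks off after part A; the remainder, following Numerical Recipes' ran2 (which also defines IM2, IA2, IQ2, IR2, IMM1 = IM1−1, NDIV = 1+IMM1/NTAB, AM = 1.0/IM1 and RNMX just below 1.0, and declares a float temp), is essentially:

k=idum2/IQ2;                            /* ! B: second LCG, also using Schrage's trick */
idum2=IA2*(idum2-k*IQ2)-k*IR2;          /* ! B */
if (idum2 < 0) idum2 += IM2;            /* ! B */
j=iy/NDIV;                              /* ! S+G: index into the shuffle table */
iy=iv[j]-idum2;                         /* ! S+G: combine the two sequences */
iv[j] = *idum;                          /* ! S+G: refill the table */
if (iy < 1) iy += IMM1;                 /* ! P: wrap negative results back into range */
if ((temp=AM*iy) > RNMX) return RNMX;   /* never return exactly 1.0 */
else return temp;
}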
What is going on here? The if clause is again initialization. Part A calculates the first random
number using Schrage’s method, part B likewise the second. Part S+G handles the shuffle and
generates the combined random number, using the periodicity clause on line P.
Numerical Recipes places great trust in this algorithm, they even promise $ 1000 to anyone who
demonstrates that it fails in any known application. As far as I know, it has not been reported to fail
so far, despite being subject to quite some testing [www.cs.adelaide.edu.au/users/paulc/papers/sccs-526/sccs-526.ps.gz].
(note that RAN2 in the first edition of numerical recipes is entirely different, and this has been
reported to have problems [http://www.lysator.liu.se/c/num-recipes-in-c.html].)
Another, widely used and independent line of generators are the so called GFSR generators, which
were originally, in 1973, developed to overcome some problems in the simplest linear congruential
algorithms.
In this method, one starts with p random integer numbers ai, i = 0, 1, . . . , p − 1 generated
somehow in advance. Then the new elements k, with k ≥ p, can be generated as
a_k = a_{k−p+q} ⊕ a_{k−p}
where p and q are constants, p > q , and ⊕ is the XOR logical operation.
(The XOR logical operation on two bits returns 1 if exactly one of the bits is 1, and 0 otherwise.)
The difficult part is generating the initial numbers. In the original paper an older random
number generator, so called Kendall sequence, was used to generate a first number of binary
“1111100011011101010000100101100”, for p = 5. This number is the first number in a so called
W sequence. The second number in the W is then the same but shifted forwards by 6 bits, i.e.
1011001111100011011101010000100’. Then the next number is again shifted forwards, and so on.
This process gives the initial numbers; the last of them is
a_5 = 00001      (4)
1◦ k = 0
2◦ Set a_0, . . . , a_{p−1} to some suitable initial values
3◦ Output a_k
4◦ a_k = a_{(k+q) mod p} ⊕ a_k
5◦ k = (k + 1) mod p
6◦ Goto step 3◦
#include <stdio.h>
int main(void)
{
   int xn[5]={6,12,5,20,25};   /* some (here arbitrarily chosen) nonzero initial values */
   int p=5, q=2, k=0;
   while(1) {
      printf("%d\n",xn[k]);
      xn[k]=xn[ ((k+q)%p) ] ^ xn[k];   /* the GFSR recurrence in a circular buffer */
      k=((k+1)%p);
   }
}
If we run this, we see that the period is 31. This actually follows from the full derivation and
motivation: it can be shown that the period of the GFSR generator is
2^p − 1      (6)
From this we also observe an obvious weakness of this original formulation: to get the initial values
we actually already generated all the 31 independent numbers the method can produce.
• This is, however, not really a general problem: Lewis and Payne showed that more
generally one needs to form a p × p matrix with linearly independent rows, and gave a
recipe for how to do that.
With that approach, the method actually does have some clear advantages:
1. It is extremely fast after initialization: note above that a single loop step, without the print,
only needs three memory references, one XOR and a couple of additions and moduli. All of these
can easily be made in a single bit-level machine language operation (for instance a multiplication
is on the bit-level waaay slower than an addition or logical operation). Hence it is well suited
for situations where very fast generation of random numbers is needed with minimal demands on
hardware, e.g. in small devices without too much computing power like mobile phones.
The basic GFSR does have known weaknesses, the most obvious of which is that the schemes
to initialize the array in a good way are complicated and quite slow. But there are many
further developments of it, e.g. using several XOR operations, and non-linear versions.
[http://www.math.sci.hiroshima-u.ac.jp/∼m-mat/MT/ARTICLES/tgfsr3.pdf]. We will discuss one of those later on.
FUNCTION RAND(M,P,Q,INTSIZ)
C
C M(P)=TABLE OF P PREVIOUS RANDOM NUMBERS.
C P,Q=POLYNOMIAL PARAMETERS:X**P+X**Q+1.
C .NOT. OPERATOR IMPLEMENTED IN ARITHMETIC.
C INTSIZ=INTEGER SIZE (BITS) OF HOST MACHINE: E.G.,
C IBM 360, 31; CDC 6000, 48; SRU 1100, 35; HP 2100, 15.
C
which is pretty hard to decipher for anyone not grown up with 1960’s Fortran... Note the extremely
dirty trick of using EQUIVALENCE between logical and integer variables and the construction of
an XOR operation by a combination of AND and OR. At that time Fortran did not have built-in
bit-level logical operations, which lead to the above implementation. The authors simply assumed
that a logical AND and OR also worked as bit-level operators on the whole word size. They then
constructed the XOR they needed from combinations of these AND, OR and the arithmetically
emulated NOT operations.
Many of the most modern and promising generators are based on the idea of using generators which
have nonlinear properties.
4.4.6.1. Nonlinear congruential generators
The basic idea of the inversive congruential generators is to generate numbers using
y_{n+1} = a ȳ_n + c   (mod M)
This at first looks exactly like the equation for the ordinary linear congruential generators. But the
difference is given by the bar on y. For any number c, the bar c̄ signifies the solution of the equation
c c̄ = 1   (mod M)
where c is given and c̄ is the unknown (i.e. c̄ is the modular inverse of c). So instead of directly
calculating the new number from y_n, one first calculates the modular inverse ȳ_n and applies the
recursion to it.
Another development is simply to add a quadratic term to the linear generator to produce the
quadratic congruential generator,
y_{n+1} = a y_n² + b y_n + c   (mod M)
Yet another, and very successful, line of development modifies the GFSR recurrence into
a_k = a_{k−p+q} ⊕ a_{k−p} A
where A is a binary matrix of dimensions w × w, where w is the word size (e.g. 32 for an unsigned
4-byte integer). a_{k−p} must here be considered a row vector for the vector × matrix operation to
make sense.
This is a “twisted GFSR” or TGFSR generator [Matsumoto and Kurita, ACM Transactions on Modeling
and Computer Simulation 2 (1992) 179-194; http://www.math.sci.hiroshima-u.ac.jp/∼m-mat/MT/ARTICLES/tgfsr3.pdf]. With suitable
choices of p, q and A this generator can achieve the maximal period of
2^(pw) − 1      (9)
which is enormously more than the period 2^p − 1 of the basic GFSR: the exponent grows by a
factor of w, i.e. 32 for 4-byte integers. This is a huge improvement, of course.
An additional major advantage of the TGFSR is that A can be chosen to make the algorithmic
implementation almost as simple as that of the basic GFSR. That is, the algorithm can be identical
to the GFSR one given above except that step 4◦ changes. In the TGFSR algorithm step 4◦ becomes
4◦ a_k = a_{(k+q) mod p} ⊕ shift(a_k)          if the lowest bit of a_k is 0
    a_k = a_{(k+q) mod p} ⊕ shift(a_k) ⊕ a     if the lowest bit of a_k is 1
Here a is a fixed constant binary vector of size w and the shift means a one-bit shift operation.
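As a sketch in C (assuming the companion-matrix form of A and the bit convention of the published implementations, and using the example parameters quoted further below; initialization of the state array is not shown):

#include <stdint.h>

#define P 25
#define Q 11
#define AVEC 0xA875u                /* the constant vector a of the example below */

static uint16_t x[P];               /* state: p previous w = 16 bit words, seeded nonzero */
static int k = 0;

/* One TGFSR step in a circular buffer: a_k = a_{k+q} XOR (a_k A), where multiplying
   by A is a one-bit shift plus an XOR with a when the shifted-out bit is 1. */
uint16_t tgfsr_next(void)
{
    uint16_t out = x[k];
    uint16_t xa  = (x[k] & 1u) ? (uint16_t)((x[k] >> 1) ^ AVEC)
                               : (uint16_t)(x[k] >> 1);
    x[k] = x[(k + Q) % P] ^ xa;
    k = (k + 1) % P;
    return out;
}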
The TGFSR generator is proven to have several major advantages over the GFSR and pass quite
advanced tests. One obvious practical advantage is that the number of initialization vectors p does
not need to be very large to achieve a long period. E.g. already w = 16, p = 25, q = 11, a =
A875 (the last one in hexadecimal) gives a period of 2^400 − 1.
• Moreover, the initialization vector is much less sensitive than in the GFSR: any
initialization vector (except all zeros) will in fact do, since the algorithm in any case goes
through the full possible set of ways to obtain sequences of p integers from the set of 2^w
possible integers.
In a further development of the TGFSR (the Mersenne Twister of Matsumoto and Nishimura), the
recurrence is modified to
a_k = a_{k−p+q} ⊕ (a^u_{k−p} | a^t_{k−p+1}) A      (10)
where a^u_{k−p} denotes the upper w − r bits of the vector a_{k−p}, and a^t_{k−p+1} the lower r bits of the
vector a_{k−p+1}. That is, r is now an extra constant between 0 and w. The alert reader will notice
that if r = 0 this reduces back to the TGFSR.
• The A can still be handled in the same manner as in the TGFSR, and initialization is
equally easy.
With suitable good choices of the parameters this generator can achieve a period of
2^(pw−r) − 1      (11)
The values recommended by the authors are (w, p, q, r) = (32, 624, 397, 31) and a =
9908B0DF , which gives the period
2^19937 − 1
which should be enough for a few years to come... The name “Mersenne” comes from the choice
of w, p and r to give a period which is a so called Mersenne prime number.
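As a sketch, the core recurrence (10) with the recommended parameters can be written as follows; it differs from the TGFSR sketch above only in how the word fed to the twist is formed. Note that the real Mersenne Twister additionally applies a "tempering" transformation to the output, which is omitted here:

#include <stdint.h>

#define P 624
#define Q 397
#define MATRIX_A   0x9908B0DFu      /* the constant vector a */
#define UPPER_MASK 0x80000000u      /* the upper w - r = 1 bit */
#define LOWER_MASK 0x7FFFFFFFu      /* the lower r = 31 bits */

static uint32_t mt[P];              /* state array, to be seeded with suitable nonzero values */
static int k = 0;

/* One step of recurrence (10): mix the upper bit of mt[k] with the lower 31 bits
   of mt[k+1], apply the companion-form matrix A, and XOR with mt[k+q]. */
uint32_t mt_step(void)
{
    uint32_t y = (mt[k] & UPPER_MASK) | (mt[(k+1) % P] & LOWER_MASK);
    uint32_t twisted = (y & 1u) ? ((y >> 1) ^ MATRIX_A) : (y >> 1);
    uint32_t out = mt[k] = mt[(k+Q) % P] ^ twisted;
    k = (k+1) % P;
    return out;
}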
It is of course also possible to combine generators of different type, which when carried out well
should minimize the chances of the algorithm of one generator messing things up. One famous
generator which seems to be generally held in good respect is the RANMAR generator by G.
Marsaglia (source code with built-in documentation ranmar.f available on the course home page).
It has a period of ≈ 2^144 which also should be good enough for some time to come...
We have already seen several very basic tests of algorithms, but we will reiterate what they are.
0. The very first thing to consider is that the period of the generator is larger than the number of
random numbers needed in your simulations. Testing for this simply amounts to finding or knowing
the period.
1. Another very basic test for a generator is of course to check that it produces a uniform distribution
between 0 and 1 in 1D. But this test is so basic that no sensible bug-free generator fails it.
2. Many-dimensional smoothness tests. As an extension of the above, an important test is that the
random-number distribution is flat also in many dimensions. Here actually many generators already
start to have problems, as we shall see in the exercises.
Testing for problems 1 and 2 is simple: one just makes the plots in 1, 2 and 3D.
Zooming in on parts of the test region may be useful in case the overall region looks OK.
3. The χ² test. A quantitative version of the uniformity test is to divide the interval into M bins,
generate a large number N of random numbers, and compare the observed number of hits y_i in each
bin with the expected number E_i = N/M:
χ² = Σ_{i=1}^{M} (y_i − E_i)² / E_i
For large values of M (M > 30) there should be a 50 % probability that χ² > M − 2/3, and a
50 % probability that χ² < M − 2/3. Now we can test the generator by calculating χ² numerous
times and checking how often it falls above or below M − 2/3.
It is also possible to predict the probabilities for different percentage points, and then calculate how
often χ2 exceeds this point. For instance, for M = 50 χ2 should exceed 1.52M at most 1 % of
the time. This may actually be a much stronger test of whether the distribution is what it should
be.
(n.b. Gould-Tobochnik has an error in their equation 12.19.c : they say χ2 ≤ M , when it should
be χ2 ≈ M .)
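As a sketch of how such a test looks in practice (drand48() is used here only for brevity, notwithstanding the warnings above):

#include <stdio.h>
#include <stdlib.h>

#define M 50                 /* number of bins */
#define N 1000000            /* number of random numbers */

/* Bin N uniform deviates into M equal bins and compute chi^2 against
   the expected count E = N/M; the result should be of the order of M. */
int main(void)
{
    long y[M] = {0};
    double E = (double) N / M, chi2 = 0.0;
    long n;
    int i;

    srand48(1);
    for (n = 0; n < N; n++)
        y[(int)(drand48()*M)]++;
    for (i = 0; i < M; i++)
        chi2 += (y[i]-E)*(y[i]-E)/E;
    printf("chi^2 = %g, compare with M = %d\n", chi2, M);
    return 0;
}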
Finding correlation problems may be difficult. We described the “two small numbers in sequence”
problem above, and mentioned that others exist. In fact, the most complex correlation tests are the
empirical tests mentioned in the next section.
4. Autocorrelation tests
One way to look for short-range correlations in a sequence of random numbers x_i is to use the
autocorrelation function
C(k) = ( ⟨x_i x_{i+k}⟩ − ⟨x_i⟩² ) / ( ⟨x_i²⟩ − ⟨x_i⟩² )
which should vanish (within statistics) for all k > 0 for uncorrelated numbers.
The tests described above were of a rather general nature, and a good generator should pass them
all. However, even these tests do almost certainly not prove that a generator is good enough.
A logical continuation of testing is to use tests which are close to the system studied. For instance,
if one simulates the Ising model, a good way to test the generator is to simulate the Ising model
in a case where the accurate answer is known. These tests are called empirical, physical or
application-specific.
The research group of Vattulainen, Ala-Nissilä and Kankaala (formerly at HIP, now at HUT) have
developed several such tests. Here is a quick overview of some of them; codes to test them can be
found in http://www.physics.helsinki.fi/∼vattulai/rngs.html.
1. SN test. The test uses random walkers on a line and calculates the total number of sites visited
by the random walkers (versus the number of jumps made). In the test, N random walkers move
simultaneously without any interaction such that, at any jump attempt, they can make a jump to
the left or to the right with equal probability. After n >> 1 jumps by all random walkers, the
1/2 x
SN,n ∼ log N n
where the exponent x should be 1/2 according to theory. The value of the exponent x observed
from simulations serves as a measure of correlations.
2. Interface roughening in 1+1 dimensions. In this case the roughening of a 1-dimensional interface
is followed with time. Consider two (independent) 1D random walks, which determine the heights
h1(n) and h2(n) of two interfaces versus the number of jumps made n. The height of the interface
between the two random walkers is then
h1(n) − h2(n),
whose height-height correlation function follows a power law in distance n where the roughening
exponent should be 1/2.
3. Ising model autocorrelation test. In this test, averages of some physical quantities such as
the energy, the susceptibility, and the updated cluster size in the 2D Ising model are calculated.
Additionally, their autocorrelation functions and corresponding integrated autocorrelation values are
computed and compared with known results.
To summarize all of this discussion, I give my personal view of what generator to use when.
I can see almost no reason to ever use anything less than the Park-Miller minimal generator. And
since even this has many known problems, it should be used only in cases where the random
numbers are of secondary importance. This can be for instance when only a few hundred random
numbers are needed for e.g. selecting impact points on a 2D surface, or when initializing velocities
of atoms in an MD simulation.
In cases where random numbers are used all through a large simulation run in crucial parts of the
code, I would recommend using something better than the Park-Miller generator.
Since there are no guarantees any generator is good enough for a problem which has not been
studied before, a good strategy would be to choose a few completely different generators and repeat
some of the central simulations with all of these. If no dependence on the choice of generator is
observed, one can be reasonably confident that the random number generator is not biasing the results.
So far we have only discussed generating random numbers in a uniform distribution, but at quite
some length. There is a good reason to this - random numbers in any other distribution are almost
always generated starting from random numbers distributed uniformly between 0 and 1.
But in the natural sciences it is clear that data often comes in many other forms. E.g. the peaks in
x-ray or γ spectra have a Gaussian or Lorentzian shape, the decay activity of a radioactive sample
follows an exponential function, and so on.
There are two basic approaches to generate other distributions than the uniform. The first is the
analytical one, the other the numerical von Neumann rejection method.
We want to calculate random numbers which are distributed as some arbitrary function f (x). To
be a reasonable probability distribution, the function must have the properties
f(x) ≥ 0 for all x,   and   ∫_{−∞}^{∞} f(x) dx = 1
In the derivation, we will also need the cumulative distribution function ("kertymäfunktio")
F(x) = ∫_{−∞}^{x} f(t) dt
Let us denote our generator for uniform random numbers Pu(0, 1). We now postulate that to
generate random numbers r distributed as f (x), we should perform the following steps:
1◦ Generate a uniformly distributed number u = Pu(0, 1)
2◦ Calculate x = F^{−1}(u)
To prove this, we will show that the cumulative distribution function F′(x) of the numbers x is the
function F(x). Since each function f has a unique cumulative function, this is enough as a proof.
Consider the probability that a point x is below the given point r ,
F′(r) = P(x ≤ r)
This is the cumulative distribution function of the x, but we do not yet know what the function is.
But now, using step 2◦,
F′(r) = P(x ≤ r) = P(F^{−1}(u) ≤ r)
Applying the monotonically increasing function F to both sides of the inequality gives
F′(r) = P(F(F^{−1}(u)) ≤ F(r)) = P(u ≤ F(r))
But because u is just a uniformly distributed number between 0 and 1, we have simply P(u ≤ b) = b
for any b in [0, 1], and hence
F′(r) = P(u ≤ F(r)) = F(r)   q.e.d.
So the algorithm is very simple, but requires that it is possible to calculate the inverse function of
the integral of the function we want. Since not all functions can be integrated (and inverted) in
closed form, using this analytical approach is not always possible.
Let us illustrate how this works in practice with an example. Say we want to have random numbers
with an exponential decay above 0, i.e.
f(x) = e^{−x} for x > 0, and f(x) = 0 otherwise.
Its cumulative distribution function is
F(x) = ∫_0^x e^{−t} dt = 1 − e^{−x}
Setting u = F(x) and solving for x gives x = F^{−1}(u) = −log(1 − u).
But because u is a random number between 0 and 1, so is 1 − u, and we can reduce this to
F^{−1}(u) = −log(u)
To test whether this really works, I wrote the following small gawk script (since this is for demo
only, it is excusable to use the system random number generator):
gawk 'BEGIN {
   # generate 100000 exponentially distributed numbers and histogram them in bins of 0.1
   for(i=0;i<100000;i++) hist[int(-log(rand())/0.1)]++;
   for(j=0;j<50;j++) print j*0.1, hist[j];
}'
• However, note that if the set of points pi has a region where it is zero in the middle of the
distribution, Fj will be flat in this region. If you use interpolation schemes when doing
step 2◦, you may still create a point in the forbidden region
– may well lead to either your program crashing, or (even worse!) completely unphysical
results.
There is another way to generate random numbers, which is purely numerical and works for any
finite-valued function in a finite interval, regardless of whether it can be integrated or inverted. It
is called the (von Neumann) rejection method or hit-and-miss method
The idea is straightforward. Consider a function f (x) defined in some finite interval x ∈ [a, b]. It
has to be normalized to give probabilities. Let M be a number which is ≥ f(x) for any x in the
interval. The algorithm is then:
1◦ Generate a candidate position x = a + (b − a) Pu(0, 1)
2◦ Generate a candidate height y = M Pu(0, 1)
3◦ If y < f(x), accept x (a "hit"); otherwise reject it (a "miss") and return to step 1◦
This seems nice and easy. The downside is of course that we do some redundant work: all the
“miss” numbers were generated in vain. The probability to get a hit is
P(hit) = [∫_a^b f(x) dx] / [M(b − a)] = 1 / [M(b − a)]
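A minimal sketch of the method in C (drand48() again stands in for whatever good uniform generator one actually uses):

#include <stdlib.h>

/* Von Neumann rejection ("hit-and-miss") sampling from a density f on [a,b],
   where M >= f(x) everywhere on the interval. */
double reject_sample(double (*f)(double), double a, double b, double M)
{
    for (;;) {
        double x = a + (b - a)*drand48();   /* candidate position */
        double y = M*drand48();             /* uniform height in [0,M) */
        if (y < f(x)) return x;             /* hit: accept the candidate */
        /* miss: loop and try again */
    }
}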
For a function like that plotted in the figure above, this is not too bad. But if the function is highly
peaked, or has a long weak tail, the number of misses will be enormous.
Consider for instance an exponential function e−x, x > 0. If the problem is such that one can
neglect very small values of e−x (large values of x) one can use some cutoff value b >> 1 in the
generation of random numbers. One could then use the hit-and-miss algorithm in the interval [0, b]
to generate random numbers for the exponential function. The normalized function would be
f(x) = e^{−x} / ∫_0^b e^{−x} dx = e^{−x} / (1 − e^{−b}),   x ∈ [0, b]
P(miss) = 1 − 1/(M(b − 0)) = 1 − 1/(Mb)
The smallest allowed bound M is the maximum of f(x) on the interval, which is reached at x = 0:
M = 1/(1 − e^{−b})
and we get
P(miss) = 1 − (1 − e^{−b})/b
If for instance b = 100, we have ≈ 99 % misses, i.e. terrible efficiency. Obviously it would be
much better in this case to use the analytical approach.
Unfortunately for many functions it will not be possible to do the analytical approach. But there
may still be a way to do better than the basic hit-and-miss algorithm.
In case the shape of the function is known (which certainly almost always is the case in 1D), then
maybe we can find a function which is always larger than the one to be generated, but only slightly
To put this more precisely, say we can find a function g(x) for which a constant A exists such that
f(x) ≤ A g(x) for all x in the interval.
It is important to include the constant A here because both g(x) and f(x) are probabilities
normalized to one. For this to be useful, we further have to demand that it is possible to form the
inverse G^{−1}(x) of the cumulative function of g(x). One can then generate candidate points
distributed as g(x) with the analytical method and accept each candidate with probability
f(x)/(A g(x)); the fraction of misses is now governed by how closely A g(x) follows f(x) rather
than by a constant bound.
The most important distribution which cannot be handled with the analytical inversion method is
the Gaussian, since its cumulative function (the error function) has no closed-form inverse. But as
you probably all also remember, the definite integral from −∞ to ∞ of f(x) can be evaluated by
a trick using two dimensions. For simplicity, let's work with the Gaussian distribution in the form
centered at 0 with σ = 1,
f(x) = (1/√(2π)) e^{−x²/2}
The Box-Muller method to generate random numbers with a Gaussian distribution relies on a
similar trick.
In this method, we also consider two Gaussian distributions in 2D. Their joint distribution is
f(x, y) = (1/√(2π)) e^{−x²/2} · (1/√(2π)) e^{−y²/2} = (1/2π) e^{−(x²+y²)/2}
Switching again to polar coordinates, and remembering that the surface element transforms as
dxdy = rdrdφ we get the polar density distribution function
g(r, φ) = f(x, y) r = (1/2π) r e^{−r²/2}
where
f_φ(φ) = 1/(2π)
f_r(r) = r e^{−r²/2}
So if we can generate fφ and fr separately, we will also be able to generate the joint 2D distribution
and hence two Gaussian numbers at a time.
Generating f_φ is trivial: we just need to form a uniform number and multiply it by 2π to get an even
distribution in the range [0, 2π], which has the value 1/(2π) everywhere. f_r can also be handled, since
F_r(r) = ∫_0^r t e^{−t²/2} dt = 1 − e^{−r²/2}
which is easily inverted: if u = Pu(0, 1), then r = √(−2 log u) (using again the fact that 1 − u is
as uniform as u). The Box-Muller recipe is thus:
1◦ Generate two uniform random numbers u_1 = Pu(0, 1) and u_2 = Pu(0, 1)
2◦ φ = 2π u_1
3◦ r = √(−2 log u_2)
4◦ x = r cos φ
5◦ y = r sin φ
So this gives two numbers at a time. In practice, the subroutine is best written so that steps 1◦ -
5◦ are carried out every second step to get both x and y . The first step returns x and stores y .
The second step only returns y without any calculations at all.
It turns out that a variety of a hit-and-miss algorithm is actually almost always faster. The trick is
to avoid having to calculate the sine and cosine explicitly.
Here we consider a simple unit circle: generate a point (v_1, v_2) uniformly in the square
[−1, 1] × [−1, 1] (hit-and-miss steps 1 and 2) and accept it only if R² = v_1² + v_2² ≤ 1 (and
R² ≠ 0). For an accepted point R² is uniformly distributed in (0, 1] and the angle φ is uniformly
distributed, with cos φ = v_1/√(R²) and sin φ = v_2/√(R²). Hence the Box-Muller result can be
rewritten as
x = v_1 √(−2 log(R²)/R²),   y = v_2 √(−2 log(R²)/R²)
The advantage here is that by using the latter parts of the equation above, we do not ever have to
calculate the sine and cosine explicitly. This makes this approach faster, even though we do have
to reject
(4 − π)/4 = 1 − π/4 ≈ 21%
of all values in the hit-and-miss steps 1 and 2.
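A sketch of the whole thing in C (again with drand48() as a placeholder generator):

#include <math.h>
#include <stdlib.h>

/* Marsaglia's polar variant of Box-Muller: returns one Gaussian deviate per call,
   producing two deviates per accepted point and storing the spare one. */
double gauss_polar(void)
{
    static int have_spare = 0;
    static double spare;
    double v1, v2, s, fac;

    if (have_spare) { have_spare = 0; return spare; }
    do {
        v1 = 2.0*drand48() - 1.0;           /* random point in the square [-1,1]^2 */
        v2 = 2.0*drand48() - 1.0;
        s  = v1*v1 + v2*v2;
    } while (s >= 1.0 || s == 0.0);         /* reject the ~21 % falling outside the unit circle */
    fac = sqrt(-2.0*log(s)/s);              /* s plays the role of the uniform deviate */
    spare = v2*fac;                         /* keep the second deviate for the next call */
    have_spare = 1;
    return v1*fac;
}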
I will not discuss generating random numbers for multidimensional cases. The very basic cases are
straightforward extensions of the 1D case: an N -dimensional uniform distribution is just a vector
of N 1-dimensional deviates, and the hit-and-miss approach works in any dimension. If interested
in generating Gaussian distributions in many dimensions, a good discussion can be found in the MC
lectures of Karimäki (available on the course web page).
But there is one serious pitfall related to generating random directions in space which needs
to be mentioned here separately.
The problem is simply to select a random direction in 3 dimensions. That is, in spherical coordinates
any vector can of course be given in the unit system (r, θ, φ). To give a random direction, one then
simply has to generate θ and φ randomly. So the obvious solution would seem to be to generate
θ = π Pu(0, 1),   φ = 2π Pu(0, 1)      (WRONG)
Why is this wrong? Consider the unit sphere r = 1. To generate random numbers on the surface of it
(which is equivalent to generating a random direction), we want to have an equal number of points
per surface area everywhere on the sphere. The surface area element dσ is
dσ = sin θ dθ dφ
i.e. the area element is not uniform in θ. Hence one has to generate the θ random numbers
distributed as
sin θ
Let’s for generality consider an arbitrary angular interval [α, β] lying inside the [0, π] region. Then
the normalization factor is
f_N = ∫_α^β sin θ dθ = cos α − cos β
so the properly normalized distribution is
f(θ) = sin θ / (cos α − cos β)
and its cumulative function is
F(θ) = ∫_α^θ sin θ′ dθ′ / (cos α − cos β) = (cos α − cos θ) / (cos α − cos β)
Setting
u = (cos α − cos θ) / (cos α − cos β)
and solving for θ gives θ = cos^{−1}(cos α − u (cos α − cos β)). For the full interval [0, π]
(cos α = 1, cos β = −1) the correct recipe for a random direction is thus
θ = cos^{−1}(1 − 2u)
φ = 2π Pu(0, 1)
If you do not believe me, compare the following two plots. They show 10000 points generated on
the unit sphere, the left one using the wrong way to generate θ , the right one using the correct
way. The rotation is exactly the same in both figures, so they really are comparable.
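In code, the correct recipe is simply (drand48() again being a placeholder):

#include <math.h>
#include <stdlib.h>

/* Random direction (unit vector) in 3D: phi uniform in [0,2 pi),
   cos(theta) uniform in [-1,1]. */
void random_direction(double v[3])
{
    double phi      = 2.0*M_PI*drand48();
    double costheta = 1.0 - 2.0*drand48();   /* theta = acos(1-2u), NOT theta = pi*u */
    double sintheta = sqrt(1.0 - costheta*costheta);
    v[0] = sintheta*cos(phi);
    v[1] = sintheta*sin(phi);
    v[2] = costheta;
}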
Finally, we discuss pseudo-random numbers which are not random at all. Still, in some cases
using these may be more efficient than using real pseudorandom numbers. This can be understood
intuitively from the following two figures:
The left one shows 30000 random points in the unit box, the right one 300 points. From the left
side we see that there is nothing wrong with the distribution itself (on this scale at least): it fills
the plane quite evenly. The right side, however, shows that with only a few points there are, purely
by chance, large empty regions as well as clusters of points.
But this brings us to think that in case we want to integrate over the 2D plane, would it not be
more efficient to arrange the numbers so that it is guaranteed that even for low statistics points lie
fairly smoothly everywhere in the interval? This is the idea behind quasirandom numbers.
In fact, it has been shown that the convergence in MC integration behaves as follows:
∝ 1/√n for true and pseudo-random numbers
∝ 1/n at best for quasi-random numbers
The simplest imaginable way to achieve this is to use fully stratified sampling.
• This means that we first decide how many points we want to generate, then divide our
sample into exactly this many equal-sized boxes.
• Then we generate exactly one random number inside each box.
This way it is guaranteed that the point distribution is fairly smooth.
This is illustrated in the following two figures: one has 1024 completely random numbers in 2D, the
other 1024 points generated with fully stratified sampling.
This method has a significant practical drawback, however: one has to decide on the number of
points needed in advance, and any intermediate result is worthless since parts of the 2D space have
not been examined at all. One could introduce a mixing scheme which selects the minor boxes in
random order to overcome the latter problem, but this would not solve the former.
A somewhat more flexible solution is to use partially stratified sampling. In here, we also divide
the integration sample into boxes, but then select several points per box. This has the advantage
that we can choose a somewhat smaller number of boxes, then do several loops where in every
loop we select one point per box. This way we can stop the simulation anytime the outermost loop
finishes.
This is probably clearer in code:
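A minimal sketch, assuming a 16 × 16 grid of boxes so that every full sweep produces 256 points (drand48() is again just a placeholder):

#include <stdlib.h>

#define NBOX  16     /* 16 x 16 = 256 boxes, i.e. one sweep = 256 points */
#define NLOOP 4      /* up to 4 sweeps = 1024 points in total */

/* Partially stratified sampling: one random point per box per sweep;
   the run can be stopped after any completed sweep. */
void stratified_points(double x[], double y[])
{
    int n = 0;
    for (int loop = 0; loop < NLOOP; loop++)
        for (int i = 0; i < NBOX; i++)
            for (int j = 0; j < NBOX; j++) {
                x[n] = (i + drand48()) / NBOX;   /* random position inside box (i,j) */
                y[n] = (j + drand48()) / NBOX;
                n++;
            }
}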
So we could in this example stop the run after 256, 512 or 768 steps in case we see we already have
good enough statistics.
But even this is not very convenient. In large simulations, where the actual evaluation of the function
at each point may be expensive, one does not want to be forced to fix the total number of points
in advance.
Fortunately it is quite possible to implement a number sequence such that it fills space evenly
without having any specific “filling interval” like the stratified schemes. Such random number
sequences are called low-discrepancy numbers or quasi-random numbers. Contrary to true
random numbers, they are designed to be highly correlated, in a way such that they will fill space
fairly evenly.
We give three examples of methods to generate such sequences.
4.8.2.1. Richtmeyer-sequences
We want to have a set of vectors xi in k dimensions
x_{ij} = i √(N_j)   (mod 1)
where (mod 1) means we take only the decimal part, and N_j is the j:th prime number.
This is simple in concept, but not very efficient if k is large, since one also needs a list of prime
numbers.
4.8.2.2. Van der Corput-sequences
This method is so simple that it can be easily explained, but in the default version it works only in
1D (you can think about yourself whether you could invent an extension to 2D).
The method is as follows:
1◦ Take an integer, and write it in some base (e.g. binary)
2◦ Reverse the digits in the number
3◦ Put a decimal point in front and interpret it as a decimal
Using e.g. the binary system, we get: 1 = 1_2 → 0.1_2 = 0.5; 2 = 10_2 → 0.01_2 = 0.25;
3 = 11_2 → 0.11_2 = 0.75; 4 = 100_2 → 0.001_2 = 0.125; 5 = 101_2 → 0.101_2 = 0.625; and so on.
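A minimal sketch of the base-2 version in C:

/* Van der Corput radical inverse in base 2: reverse the binary digits of n
   around the binary point, giving a quasi-random number in [0,1). */
double van_der_corput(unsigned int n)
{
    double result = 0.0, f = 0.5;
    while (n > 0) {
        result += f*(n & 1u);   /* the lowest remaining digit goes to the next fractional place */
        n >>= 1;
        f *= 0.5;
    }
    return result;
}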
Back in the 1940's, when the very first generators were designed, von Neumann thought that "anyone
who considers arithmetical methods of producing random digits is, of course, in a state of sin".
Now, when generators tend to be at least fairly decent and are very widely used, this fortunately is
no longer true!