0% found this document useful (0 votes)
3 views12 pages

Correlation

Uploaded by

Kyaw Aunghein
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
3 views12 pages

Correlation

Uploaded by

Kyaw Aunghein
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
You are on page 1/ 12

What is the pair correlation function g(r)?

This is related to the probability of finding the


center of a particle a given distance from the
center of another particle.
For short distances, this is related to how the
particles are packed together. For example,
consider hard spheres (like marbles). The spheres
can't overlap, so the closest distance two centers
can be is equal to the diameter of the spheres.
However, several spheres can be touching one
sphere; then a few more can form a layer around
them, and so on.
Further away, these layers get more diffuse, and so
for large distances, the probability of finding two
spheres with a given separation is essentially
constant. In that case, it's related to the density --
a more dense system has more spheres, thus it's
more likely to find two of them with a given
distance.
The pair correlation function g(r) accounts for
these factors by normalizing by the density; thus at
large values of r it goes to 1, uniform probability.
The right picture above shows g(r) calculated for a
simple simulation of two-dimensional disks. The
function is calculated based on all pairs of
particles, but to make it clear I've highlighted one
reference particle (black) in the left picture. The
surrounding particles are colored based on their
distance from the black particle, and correspond to
the graph of g(r) on the right. Thus for example
about 5 particles are colored purple, and
correspond to the prominent peak at a separation
of 1 diameter. A couple particles are dark blue,
corresponding to the less-likely position about 1.5
diameters away. Several more particles are light
blue, corresponding to the 2nd nearest neighbor
peak at about 2 diameters. Further away, the
green particles form the third nearest neighbors,
and the yellow particles form the fourth nearest
neighbors.
Note of course that this is all a little ambiguous;
actually the colors shade continuously just as the
distances between particles shade continuously.
Nonetheless, there is a lot of structure here that
results in the characteristic shape of g(r). All of the
particles farther away are colored red, and you can
see that g(r) tends toward a uniform value of 1 for
large values of r. (Note that this happens even
though the number of particles you're finding gets
larger, for example, consider the yellow particles in
the picture.)
How to calculate the pair correlation function
g(r)
This explanation is for three-dimensional data. To
calculate g(r), do the following:
 Pick a value of dr
 Loop over all values of r that you care about:
1.Consider each particle you have in turn.
Count all particles that are a distance
between r and r + dr away from the
particle you're considering. You can think
of this as all particles in a spherical shell
surrounding the reference particle. The
shell has a thickness dr.
2.Divide your total count by N, the number of
reference particles you considered --
probably the total number of particles in
your data.
3.Divide this number by 4 pi r^2 dr, the
volume of the spherical shell (the surface
area 4 pi r^2, multiplied by the small
thickness dr). This accounts for the fact
that as r gets larger, for trivial reasons you
find more particles with the given
separation.
4.Divide this by the particle number density.
This ensures that g(r)=1 for data with no
structure. In other words, if you just had an
arbitrarily placed spherical shell of inner
radius r and outer radius r+dr, you'd
expect to find about rho * V particles
inside, where rho is the number density
and V is the volume of that shell.
 In 2D, follow the algorithm as above but divide
by 2 pi r dr instead of step #3 above.
One caveat: For experimental data, you often have
edges to your sample that are artificial. For
example, you take a picture of particles but at the
edges of your picture, the system extends further
outwards. Thus when calculating g(r) based on
reference particles near the edge of your image,
you have a problem. You'll have to modify step #3
above with the correct volume/area that actually
lies within the image you're looking at. The
routines I've written take care of that.
I wrote IDL routines to calculate g(r) in 2D and 3D.
These do some special tricks to deal with
experimental data, in other words, to cleverly deal
with the edge effects.

2D program: For a particle near the edge of a


rectangular image, when I'm counting particles a
distance r away from it, for many values of r the
circle of radius r extends outside of the image. I did
some math to figure out how to determine how
much angular extent of the circle lies within the
image for these cases (in the subroutine
checkquadrant). Thus the edges are correctly
accounted for.
3D program: For a particle near the edge of a
rectangular image box, when I'm counting particles
a distance r away from it, we have the same
problem described for 2D data. However,
calculating the resulting solid angle of the sphere
contained within the box is more than I could
handle mathematically, for arbitrary box
dimensions and arbitrary locations within the box.
So I did a trick. My image boxes tend to be short
and wide, that is, very narrow in Z and large in X
and Y.
My program does the following. I'm calculating g(r)
for r < rmax, with a default rmax = 10. I consider
only reference particles that are more than rmax
away from the horizontal edges of the box, so I
never have to worry about overlaps in X and Y. The
resulting formula, to worry about overlaps in Z
alone, turns out to be quite reasonable. (That is, I
calculate the surface area of a spherical
hemisphere of radius R that is cut off at a finite
height H < R. It turns out this formula is simple: 2
pi R H.)
So, an important warning: Don't choose rmax to be
more than half of the width of your data, in X and
Y. This restriction is only for the 3D program.
extra comments(မပါ)
I wrote this in response to a question I got, asking
to clarify my special tricks.
For a given particle and a given distance r, you are
trying to count all the particles that lie between r
and r+dr away. In 3D this is a spherical shell with
thickness dr. To correctly normalize g(r), you have
to then account for the volume contained within
that shell. But, if your shell lies outside your
computation box, then you have to figure out
exactly how much volume lies within your box.
Assuming dr is small, then really it's like computing
how much surface area of the sphere of radius r
lies within the box.
So, you're left with a nontrivial trigonometry
problem, especially if you have a small cubical box.
For most values of r, you'd have several pieces of
your sphere lying outside your box, and computing
the area of what's left inside would be difficult.
In 2D the math is not too hard (for a circle and a
rectangular box). My trick there was to consider
each quadrant of the circle separately. At most,
each quadrant of the circle could intersect (or lie
outside of) two edges of the rectangle. Then I could
determine how much of the quadrant was within
those two edges. Working it out took a few sheets
of paper but the resulting formulas were fairly
simple (although at this point embedded within my
program and so I don't have them easily at hand).
But in 3D it's a bit harder. I cheated by forcing the
program to stay away from the edges of the box in
X and Y, so the only intersections I have to
consider are with the faces of the box in Z. The
math then wasn't too bad (to find the surface area
of a spherical cap lying outside the box).
Anyway, if you have a cubical box, I don't know if
there is any simple way to do it. Perhaps my trick
in 2D with quadrants and two edges of the
rectangle can be extended to 3D with octants and
3 faces of the cube. But you'll have to see if you
can do the trigonometry and find a tractable way
to compute the area contained within the box.
IDL routines to calculate the pair correlation
function g(r)
2D data -- ericgr2d.pro
Slight revision July 22, 2005 -- fixed a bug when
rmax is an integer rather than a float
Slight revision June 14, 2005 -- fixed a bug in the
plotting, which did not affect the output data,
fortunately!

Usage:
IDL> gr = ericgr2d(data)
IDL> plot,gr(0,*),gr(1,*)
The default is to assume the data is in the form
(x,y, --- ,t) where the first two columns are
positions, the next few columns are arbitrary (and
could be missing), and the last column is a
timestamp. If you have tracked data:
IDL> gr = ericgr2d(data,/track)
There are other options as well. The default is to
calculate g(r) for 0 < r < 10, with a spacing of
every 0.01. (See here for brief explanation.) To
override this, you can use the following keywords:
IDL> gr =
ericgr2d(data,rmin=1.0,rmax=100.0,delta
r=0.1)
3D data -- ericgr3d.pro
It's used exactly the same as the 2D program. The
coding is quite different, though, for boring
technical reasons, thus you have to use a different
program. One warning that's only for ericgr3d:
again due to boring technical reasons, you
shouldn't set rmax to be more than half of the
spatial extent of your data in X and Y. ericgr2d
doesn't have this problem.
This program plots an update of g(r) at each step,
which you can prevent with the /noplot keyword.
Extra g(r) routines
Consider these as unsupported programs, but you
may find them useful. Not that we really provide
"support" for any of our programs, but these in
particular may be of limited interest.

grwrap.pro
(click here if that link fails)
Usage:
IDL> grwrap, 'trackfiles*', dim=2, /track
or
IDL> grwrap, 'pretrackfiles*', dim=2
This calculates g(r) for a bunch of files, and saves
the data with filenames with the prefix "gr." on the
front. If you want to change the prefix, you can use
the prefix keyword:
IDL> grwrap, 'files*', prefix='pair.'
(for example). If you want to run ericgr2d (or
ericgr3d) with extra options, you can provide the
options:
IDL> grwrap, 'pretrackfiles*', dim=2,
deltar=0.1, rmax=5.0
Note that the default is dim=3 (use ericgr3d), you
have to use dim=2 to have the routine use
ericgr2d.

grdatwrap.pro
circ.pro
(download here, if those links fail)

Usage:
IDL> grdatwrap, 'gr.*', data
This is a kludgy program that goes through a
bunch of g(r) files and tries to determine the
location of the first peak. In the variable "data" is
returned (r_peak, full width half max, height). Also
this information is printed to the screen , along
with the filename. The circ function is included as
it is called by grdatwrap. It sets the user-defined
symbol (plot symbol 8) to a circle.

Additionally, the program makes a nice plot for


each g(r) file, that indicates where the peak
position is:

The top plot shows the g(r) data (circles), curve fit
(solid line), peak position (vertical line), and half
maximum (dotted line). The bottom plot shows the
full g(r). You can see that the curve fit isn't great
far from the peak, since it's a 6th order polynomial
it certainly can do funny things. However this
seems to provide something that is reproducible
and which my eye likes best for its ability to fit the
data at the peak.

You might also like