Cursus Nucleo
Johan Nuyts
Nuclear Medicine, K.U.Leuven
U.Z. Gasthuisberg, Herestraat 49
B3000 Leuven
tel: 016/34.37.15
e-mail: Johan.Nuyts@uz.kuleuven.be
September, 2023
Contents
1 Introduction 4
2 Radionuclides 5
2.1 Radioactive decay modes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.2 Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
4 Data acquisition 15
4.1 Detecting the photon . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
4.1.1 Scintillation crystal . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
4.1.2 Photo Multiplier Tube . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
4.1.3 Two-dimensional gamma detector . . . . . . . . . . . . . . . . . . . . . . . 17
4.1.4 Resolution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
4.1.5 Storing the data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
4.1.6 Alternative designs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
4.2 Collimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
4.2.1 Mechanical collimation in the gamma camera . . . . . . . . . . . . . . . . 23
4.2.2 Electronic and mechanical collimation in the PET camera . . . . . . . . . . 26
4.3 Partial volume effect . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
4.4 Compton scatter correction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
4.4.1 Gamma camera . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
4.4.2 PET camera . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
4.5 Other corrections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
4.5.1 Linearity correction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
4.5.2 Energy correction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
4.5.3 Uniformity correction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
4.5.4 Dead time correction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
4.5.5 Random coincidence correction . . . . . . . . . . . . . . . . . . . . . . . . 43
5 Image formation 45
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
5.2 Planar imaging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
5.3 2D Emission Tomography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
5.3.1 2D filtered backprojection . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
5.3.2 Iterative Reconstruction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
5.3.3 Compton scatter, random coincidences . . . . . . . . . . . . . . . . . . . . 59
5.3.4 Regularization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
5.3.5 Convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
5.3.6 OSEM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
5.4 Fully 3D Tomography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
5.5 Time-of-flight PET . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
5.6 Reconstruction with resolution modeling . . . . . . . . . . . . . . . . . . . . . . . 68
7 System maintenance 83
7.1 Gamma camera . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
7.1.1 Planar imaging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
7.1.2 Whole body imaging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
7.1.3 SPECT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
7.2 Positron emission tomograph . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
7.2.1 Evolution of blank scan . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
7.2.2 Normalization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
7.2.3 Front end calibration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
7.2.4 Quantification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
12 Appendix 138
12.1 Poisson noise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
12.2 Convolution and PSF . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
12.3 Combining resolution effects: convolution of two Gaussians . . . . . . . . . . . . . 140
12.4 Error propagation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
12.4.1 Sum or difference of two independent variables . . . . . . . . . . . . . . . . 141
12.4.2 Product of two independent variables . . . . . . . . . . . . . . . . . . . . . 141
12.4.3 Any function of independent variables . . . . . . . . . . . . . . . . . . . . . 142
12.5 Central slice theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
12.6 The inverse Fourier transform of the ramp filter . . . . . . . . . . . . . . . . . . . . 144
12.7 The Laplace transform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
Chapter 1
Introduction
The use of radioactive isotopes for medical purposes has been investigated since 1920, and since 1940
attempts have been undertaken to image the radionuclide concentration in the human body.
In the early 1950s, Ben Cassen introduced the rectilinear scanner, a “zero-dimensional” scanner,
which (very) slowly scanned in two dimensions to produce a two-dimensional image of the radionu-
clide concentration in the body. In the late 1950s, Hal Anger developed the first “true” gamma cam-
era, introducing an approach that is still used in the design of virtually all modern cameras: the
Anger scintillation camera, a 2D planar detector that produces a 2D projection image without scanning.
The Anger camera can also be used for tomography. The projection images can then be used
to compute the original spatial distribution of the radionuclide within a slice or a volume. Already
in 1917, Radon published the mathematical method for reconstruction from projections, but the
method was only introduced in medical applications in the 1970s, first in CT and later in nuclear
medicine imaging. At the same time, iterative reconstruction methods were being investigated, but
the application of those methods had to wait for sufficient computer power until the 1980s.
The Anger camera is often called a gamma camera, because it detects gamma rays. When it is
designed for tomography, it is also called a SPECT camera. SPECT stands for Single Photon Emission
Computed Tomography and contrasts with PET, i.e. Positron Emission Tomography, which detects
photon pairs. Anger showed that two scintillation cameras could be combined to detect photon pairs
originating after positron emission. Ter-Pogossian et al. built the first dedicated PET-system in the
1970s, which was used for phantom studies. Soon afterwards, Phelps, Hoffman et al. built the first
PET-scanner (also called PET-camera) for human studies. Since its development, the PET-camera
was regarded almost exclusively as a research system. Only around 1995 did it become a true clinical
instrument.
Below, PET and SPECT will be discussed together, since they have a lot in common. However,
there are also important differences. One is the cost: PET systems are about 4 times as expensive
as gamma cameras. In addition, many PET-tracers have a very short half life (i.e. the time after which
the radioactivity decreases to 50%), so it is mandatory to have a small cyclotron, a laboratory and a
radiopharmacy expert in the immediate vicinity of the PET-center.
An excellent book on this subject is “Physics in Nuclear Medicine”, by Cherry, Sorenson and
Phelps [1]. Two useful books have been published in the Netherlands, one to provide insight [2],
the other describing procedures [3]. An excellent book for researchers in the field is the one of H.
Barrett and K. Meyers [4]. The International Atomic Energy Agency (IAEA) publishes books with
free online access, e.g. [5].
Chapter 2
Radionuclides
In nuclear medicine, a tracer molecule is administered to the patient, usually by intravenous injec-
tion. A tracer is a particular molecule, carrying an unstable isotope, a radionuclide. The molecule
is recognized by the body, and will get involved in some metabolic process. The unstable isotopes
produce gamma rays, which allow us to measure the concentration of the tracer molecule in the body
as a function of position and time. A tracer is always administered in very low amounts, such that the
natural process being studied is not affected by the tracer.
Consequently, nuclear medicine is all about measuring function or metabolism, in contrast to
many other modalities including CT, MRI and echography, which mainly perform anatomical mea-
surements. This boundary is not strict, though: CT, MRI and ultrasound imaging allow functional
measurements, while nuclear medicine instruments also provide some anatomical information. How-
ever, nuclear medicine techniques provide concentration measurements with a sensitivity that is orders
of magnitude higher than that of any other modality.
The energy of a photon is given by

E = hν = hc/λ,

where h is the Planck constant, c is the speed of light, ν the frequency and λ the wavelength of the
photon. The conversion factor between electronvolt and joule is numerically equal to the elementary
charge in coulomb: 1 eV = 1.602 · 10⁻¹⁹ J, with e = 1.602 · 10⁻¹⁹ C. Visible light has a wavelength
of about 400 to 700 nm, and therefore an energy of about 3.1 to 1.8 eV. The gamma rays used in
nuclear medicine have energies of 100 to 600 keV, corresponding to wavelengths of a few pm.
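The relation E = hc/λ can be checked numerically. A minimal Python sketch, using CODATA values for the constants; the two energies are just the illustrative cases from the text:

```python
# E = h*nu = h*c/lambda, with CODATA values for the constants.
H = 6.62607015e-34     # Planck constant, J*s
C = 2.99792458e8       # speed of light, m/s
EV = 1.602176634e-19   # 1 eV in joule (numerically the elementary charge in C)

def wavelength_m(energy_ev: float) -> float:
    """Wavelength (m) of a photon with the given energy (eV)."""
    return H * C / (energy_ev * EV)

# 140 keV (99mTc) is about 8.9 pm; 511 keV (annihilation) is about 2.4 pm.
for e_kev in (140.0, 511.0):
    print(f"{e_kev:6.1f} keV -> {wavelength_m(e_kev * 1e3) * 1e12:.2f} pm")
```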
There are many ways in which unstable isotopes can decay. Depending on the decay mode, one
or more gamma rays will be emitted in every event. Some isotopes may use different decay modes.
The probability of each mode is called the branching ratio. As an example, 18 F has a branching ratio
of 0.97 for positron decay, and a branching ratio of 0.03 for electron capture.
1. β − emission.
• In this process, a neutron is transformed into a proton and an electron (called a β − parti-
cle). Also a neutrino (an electron anti-neutrino ν̄e ) is produced and (since the decay result
is more stable) some energy is released:

n → p+ + e− + ν̄e + energy
• An orbital electron is captured, and combined with a proton to produce a neutron and an
electron neutrino:
p+ + e− → n + νe (2.5)
As a result of this, there is an orbital electron vacancy. An electron from a higher or-
bit will fill this vacancy, emitting the excess energy as a photon. Note that EC causes
transmutation towards the leftmost neighbor in Mendeleev’s table.
• If the daughter product is in an excited state, it will further decay towards a state with
lower energy by emitting additional energy as γ photons (or conversion electrons).
• Similarly, the daughter product may be metastable, decaying via isomeric transition after
a longer time.
Examples of radionuclides decaying with EC are 57 Co, 111 In, 123 I and 201 Tl. Figure 2.1 shows
the decay scheme for 111 In and (a small part of) the decay table. Conversion electrons are
denoted with “ce-X, y”, where X is the electron shell and y is the gamma ray. The table also
gives the kinetic energy of the conversion electron, which must of course be less than the energy
of the photon that produced it. The ejection of the conversion electron creates a vacancy in the
involved electron shell, which is filled by an electron from a higher shell. The excess energy is
released as an X-ray or as an Auger electron. In the decay table, X-rays are denoted with “Xa
X-ray”, where X is the electron shell that had a vacancy, and a says (in code) from which shell
the new electron comes. The energy of this X-ray exactly equals the difference between the
energies of the two electron shells. Instead of emitting an X-ray, the excess energy can be used
to emit another electron from the atom. This is called the Auger effect, and the emitted electron
is called an Auger electron. In the table an Auger electron is denoted as “Auger-XY Z”, if a
vacancy in shell X is filled by an electron from shell Y , using the excess energy to emit an
electron from shell Z. The table gives the kinetic energy of the emitted Auger electron. The
table also gives the frequency, i.e. the probability that each particular emission occurs when the
isotope decays.

[1] The boundary between excited and metastable is usually said to be about 10⁻⁹ s, which is rather
short in practice. But the half life of some isomers is in the order of days or years.
[2] There are only 5 places worldwide where 99 Mo is produced (one in Mol, Belgium, one in Petten,
the Netherlands), and 4 where it can be processed to produce 99m Tc generators (one in Petten, one
in Fleurus, Belgium).
Figure 2.1: Decay scheme of 111 In, by electron capture. The scheme is dominated by the emission of
two gamma rays (171 and 245 keV). These gamma rays can be involved in production of conversion
electrons. A more complete table is given in [1].
After a very short time the positron will hit an electron and annihilate (fig. 2.2). The
mass of the two particles is converted into energy, which is emitted as two photons. These
photons are emitted in opposite directions. Each photon has an energy of 511 keV (the
rest mass energy of an electron or positron).
• Again, the daughter nucleus may also be in an excited or a metastable state.
As a rule of thumb, light atoms tend to emit positrons, while heavy ones tend to prefer other decay
modes, but there are exceptions. The most important isotopes used for positron emission imaging are
11 C, 13 N, 15 O and 18 F. Since these atoms are very common in biological molecules, it is possible
to make a radioactive tracer which is chemically identical to the target molecule, by substitution of
a stable atom by the corresponding radioactive isotope. These isotopes are relatively short lived: 11 C: 20 min, 13 N:
10 min, 15 O: 2 min and 18 F: 2 hours. As a result, except for 18 F, they must be produced close to the
PET-system immediately before injection. To produce the isotope, a small cyclotron is required. The
isotope must be rapidly incorporated into the tracer molecule in a dedicated radiopharmacy laboratory.
In contrast, 18 F can be transported from the production site to the hospital. That makes it the most
popular isotope in PET.
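The practical consequence of these half lives can be illustrated with a short sketch. The half life values are the rounded ones quoted above (18 F taken as 2 hours = 120 min), and the one-hour transport time is a made-up example:

```python
# Half lives as quoted in the text (18F rounded to 2 hours = 120 min).
HALF_LIFE_MIN = {"C-11": 20.0, "N-13": 10.0, "O-15": 2.0, "F-18": 120.0}

def remaining_fraction(isotope: str, minutes: float) -> float:
    """A(t)/A(0) = 2**(-t / t_half): fraction of the activity left after `minutes`."""
    return 2.0 ** (-minutes / HALF_LIFE_MIN[isotope])

# After one hour of transport, 18F still has about 71% of its activity,
# while 15O is down to roughly one part in a billion:
for iso in ("F-18", "C-11", "N-13", "O-15"):
    print(f"{iso}: {remaining_fraction(iso, 60.0):.9f}")
```

This is why 18 F can be shipped from a production site to the hospital, while 15 O cannot.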
In nuclear medicine, we use radioactive isotopes that emit photons with an energy between 80 and
300 keV for single photon emission, and 511 keV for positron emission. The energy of the emitted
photons is fixed: each isotope emits photons with one or a few very sharp energy peaks. If more than
one peak is present, each one has its own probability which is a constant for that isotope. So if we
administer a tracer to a patient, we know exactly what photons we will have to measure.
2.2 Statistics
The exact moment at which an atom will decay cannot be predicted. All that is known is the proba-
bility that it will decay in the next time interval dt. This probability is αdt, where α is a constant for
each isotope. So if we have N radioactive atoms at time t0 , we expect to see a decrease dN in the
next interval dt of
dN = −N αdt. (2.7)
Integration over time yields
N (t) = N (t0 )e−α(t−t0 ) . (2.8)
This is what we expect. If we actually measure it, we may obtain a different value, since the process
is statistical. As shown below, the estimate will be better for larger dN/dt.
The amount of radioactivity is not expressed as the number of radioactive atoms present, but as
the number of radioactive atoms decaying per unit of time. From (2.7) we find that
radioactivity = −dN/dt = αN = (ln 2 / t_{1/2}) N. (2.9)
Therefore, for the same amount of radioactivity, more radioactive atoms are needed if the half life is
longer. This quantity used to be expressed in Curie (Ci), but the preferred unit is now Becquerel
(Bq) [3]. One Bq means 1 event per s. For the coming years, it is useful to know that

1 mCi = 37 MBq. (2.10)

The amounts of radioactivity used in nuclear medicine imaging are typically in the order of 10² MBq.

[3] Marie and Pierre Curie and Henri Becquerel received the Nobel prize in 1903, for their discovery
of radioactivity in 1896.
The half life of a tracer is the amount of time after which only half the amount of radioactivity is
left. It is easy to compute it from equation (2.8):
t_{1/2} = ln 2 / α. (2.11)
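Equation (2.9) can be made concrete: the number of atoms needed for a given activity is N = A t_{1/2} / ln 2. A small sketch, using standard approximate half lives (6 h for 99m Tc, 110 min for 18 F) that are not taken from this text:

```python
import math

def atoms_for_activity(activity_bq: float, half_life_s: float) -> float:
    """N = A * t_half / ln 2, rearranged from A = alpha * N (eq. 2.9)."""
    return activity_bq * half_life_s / math.log(2)

# 100 MBq of 99mTc (t_half ~ 6 h) versus 100 MBq of 18F (t_half ~ 110 min);
# half lives are standard approximate values, not taken from this text.
n_tc = atoms_for_activity(100e6, 6.0 * 3600.0)
n_f = atoms_for_activity(100e6, 110.0 * 60.0)
print(f"99mTc: {n_tc:.2e} atoms, 18F: {n_f:.2e} atoms")
```

As stated above, the longer-lived isotope needs more atoms for the same activity.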
As shown in appendix 12.1, p 138, the probability of measuring n photons, when r photons are
expected equals
p_r(n) = e^{−r} r^n / n! (2.12)
       = e^{−r} · (r/1) · (r/2) · (r/3) · · · (r/n) (2.13)
This is a Poisson distribution with r the average number of expected photons. The mean of the
distribution is r, and the standard deviation equals √r. For large r, it can be well approximated by a
Gaussian with mean r and standard deviation √r:

p_r(n) ≈ (1 / √(2πr)) exp(−(n − r)² / (2r)) (2.14)

For smaller r (less than 10 or so) the distribution becomes markedly asymmetrical, since the probability is
always 0 for negative values.
Note that the distribution is only defined for integer values of n. This is obvious, because one
cannot detect partial photons (r is a real number, because the average number of expected photons
does not have to be integer). Summing over all n values yields
Σ_{n=0}^{∞} p_r(n) = e^{−r} Σ_{n=0}^{∞} r^n / n! = e^{−r} e^r = 1. (2.15)
As with a Gaussian, r is not only the mean of the distribution, it is also the value with highest proba-
bility.
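Equations (2.12)–(2.15) are easy to verify numerically. A small sketch, evaluating the Poisson probabilities in log-space so that large n does not overflow:

```python
import math

def poisson_pmf(n: int, r: float) -> float:
    """p_r(n) = exp(-r) * r**n / n!  (eq. 2.12), evaluated in log-space."""
    return math.exp(-r + n * math.log(r) - math.lgamma(n + 1))

def gauss(n: float, r: float) -> float:
    """Gaussian approximation (eq. 2.14), mean r and standard deviation sqrt(r)."""
    return math.exp(-(n - r) ** 2 / (2 * r)) / math.sqrt(2 * math.pi * r)

r = 100.0
# The probabilities sum to 1 (eq. 2.15); the infinite sum is truncated far in the tail.
total = sum(poisson_pmf(n, r) for n in range(1000))
print(f"sum of p_r(n) = {total:.6f}")

# For large r, the Gaussian approximation is close to the exact Poisson value:
print(f"Poisson p_100(100) = {poisson_pmf(100, r):.5f}, Gaussian: {gauss(100.0, r):.5f}")
```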
When one estimates radioactivity by counting emitted photons (or other particles), the signal-to-
noise ratio (SNR) of that measurement equals
SNR = r / √r = √r, (2.16)

where r is the expectation of the number of measured photons. Hence, if we measure the amount of
radioactivity with particle detectors, the SNR becomes larger if we measure longer.
The only assumption made in the derivation in appendix 12.1 was that the probability of an event
was constant in time. It follows that “thinning” a Poisson process results in a new Poisson process.
With thinning, we mean that we randomly accept or reject events, using a fixed acceptance probability.
If we expect N photons, and we randomly accept with a probability f , then the expected number of
accepted photons is f N . Since surviving the whole procedure (original process
followed by selection) has a fixed probability, the resulting process is still Poisson.
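The thinning argument can be checked with a small simulation; the rate r = 50, acceptance probability f = 0.3 and the sample size are arbitrary choices:

```python
import math
import random
import statistics

random.seed(0)
r, f, trials = 50.0, 0.3, 20000   # arbitrary rate, acceptance probability, sample size

def poisson_sample(lam: float) -> int:
    """Draw one Poisson(lam) sample (Knuth's method; fine for moderate lam)."""
    threshold, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= random.random()
        if p <= threshold:
            return k
        k += 1

# Thin a Poisson process: accept each of the n events with probability f.
thinned = []
for _ in range(trials):
    n = poisson_sample(r)
    thinned.append(sum(1 for _ in range(n) if random.random() < f))

m = statistics.mean(thinned)
v = statistics.variance(thinned)
# For a Poisson distribution, mean and variance are equal; both should be near f*r = 15.
print(f"mean = {m:.2f}, variance = {v:.2f}")
```

The sample mean and variance both come out close to f r, as expected for a Poisson process with rate f r.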
Chapter 3
Interaction of photons with matter
In nuclear medicine, photon energy ranges roughly from 60 to 600 keV. For example, 99m Tc has an
energy of 140 keV. This corresponds to a wavelength of 9 pm and a frequency of 3.4 × 10¹⁹ Hz. At
such energies, γ photons behave like particles rather than like waves.
At these energies, the dominating interactions of photons with matter are photon-electron interac-
tions: Compton scatter and photo-electric effect. As shown in figure 3.1, the dominating interaction
is a function of energy and electron number. For typical nuclear medicine energies, Compton scatter
dominates in light materials (such as water and human tissue), and photo-electric effect dominates
in high density materials. Pair production (conversion of a photon into an electron and a positron) is
excluded for energies below 1022 keV, since each of the particles has a rest mass energy of 511 keV.
Rayleigh scatter (absorption and re-emission of all of the absorbed energy as a photon in a different
direction) has a low probability.
Figure 3.1: Dominating interaction as a function of electron number Z and photon energy.
CHAPTER 3. INTERACTION OF PHOTONS WITH MATTER 11
Figure 3.2: Compton scatter can be regarded as an elastic collision between a photon and an electron.
the binding energy. In a dense material, this photo-electron will collide with electrons from other
atoms, and will lose energy in each collision, until no kinetic energy is left.
As a result, there is now an electron vacancy in the atom, which may be filled by an electron from a
higher energy state. The difference in binding energy of the two states must be released. This can be
done by emitting a photon. Alternatively, the energy can be used by another electron with low binding
energy to escape from the atom.
In both cases, the photon is completely eliminated.
The probability of a photo-electric interaction is approximately proportional to Z 3 /E 3 , where Z
is the atomic number of the material, and E is the energy of the photon. So the photo-electric effect
is particularly important for low energy radiation and for dense materials.
scatter. For that integration, dΩ can be written as 2π sin θ dθ. The differential cross section (3.2) is
shown for a few incoming photon energies in figure 3.3. For higher energies, smaller scatter angles
become increasingly likely.
Figure 3.3: The scattering-angle distribution for a few photon energies. Figure from Wikipedia
(https://en.wikipedia.org/wiki/Klein-Nishina_formula).
The probability of a Compton interaction decreases very slowly with increasing E (see fig. 3.4) and
with increasing Z.
Because the energy of the photon after scattering is less than before, Compton scatter is sometimes
also called “inelastic scattering”.
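The energy left after scattering follows the standard Compton relation E′ = E / (1 + (E/511 keV)(1 − cos θ)); that formula is not derived in this excerpt, so the sketch below assumes it:

```python
import math

def scattered_energy_kev(e_kev: float, theta_deg: float) -> float:
    """Energy of a photon after Compton scattering over angle theta:
    E' = E / (1 + (E / 511 keV) * (1 - cos(theta)))."""
    return e_kev / (1.0 + (e_kev / 511.0) * (1.0 - math.cos(math.radians(theta_deg))))

# A 511 keV annihilation photon loses almost nothing at small angles,
# but keeps only ~170 keV when scattered backwards:
for theta in (0, 30, 90, 180):
    print(f"theta = {theta:3d} deg -> E' = {scattered_energy_kev(511.0, theta):6.1f} keV")
```

This energy loss is what makes scatter rejection by energy windowing possible, as discussed in the scatter correction sections.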
Figure 3.4: Photon attenuation in water as a function of photon energy. The photo-electric component
decreases approximately as (Z/E)³ (of course, Z is a constant here). The Compton component
varies slowly.
3.4 Attenuation
The linear attenuation coefficient µ is defined as the probability of interaction per unit length (unit:
cm⁻¹). Figure 3.4 shows the mass attenuation coefficients as a function of energy in water. Multi-
ply the mass attenuation coefficient by the mass density to obtain the linear attenuation coefficient.
When photons are traveling in a particular direction through matter, their number will gradually de-
crease, since some of them will interact with an electron and get absorbed (photo-electric effect) or
deviated (Compton scatter) into another direction. By definition, the number of photons eliminated over
distance ds equals µ(s)N(s)ds:

dN(s) = −µ(s)N(s) ds. (3.3)
If initially N (a) photons are emitted in point s = a along the s-axis, the number of photons N (d)
we expect to arrive in the detector at position s = d is obtained by integrating (3.3):
N(d) = N(a) exp(−∫_a^d µ(s) ds). (3.4)
Assume further that during a measurement, N (a) photon pairs were emitted along the s-axis (fig.3.5).
The number of detected pairs then is:
N(d1, d2) = N(a) exp(−∫_{d1}^{a} µ(s) ds) · exp(−∫_{a}^{d2} µ(s) ds) = N(a) exp(−∫_{d1}^{d2} µ(s) ds). (3.5)
Equation (3.5) shows that for PET, the effect of attenuation is independent of the position along the
line of detection.
Photon-electron interactions will be more likely if there are more electrons per unit length. So
dense materials (with lots of electrons per atom) have a high linear attenuation coefficient.
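Equations (3.4) and (3.5) can be made concrete with a short sketch. The attenuation coefficient used here (about 0.15 cm⁻¹ for water at 140 keV) is an assumed round value, not taken from the text, and is reused for the PET case purely to illustrate the position independence:

```python
import math

# Approximate linear attenuation coefficient of water at 140 keV
# (assumed round value, not taken from the text):
MU_WATER = 0.15   # cm^-1

def transmitted_fraction(mu_per_cm: float, thickness_cm: float) -> float:
    """N(d)/N(a) = exp(-mu * d) for a uniform attenuator (eq. 3.4)."""
    return math.exp(-mu_per_cm * thickness_cm)

# Through 20 cm of water, only ~5% of 140 keV photons survive:
print(f"single photon: {transmitted_fraction(MU_WATER, 20.0):.3f}")

# For PET (eq. 3.5), the survival probability of the photon *pair* depends only
# on the total path length d2 - d1, not on where the annihilation occurred:
for a in (2.0, 10.0, 18.0):   # annihilation position along a 20 cm chord
    pair = transmitted_fraction(MU_WATER, a) * transmitted_fraction(MU_WATER, 20.0 - a)
    print(f"a = {a:4.1f} cm -> pair survival = {pair:.4f}")
```

All three pair survival values are identical, illustrating why PET attenuation correction only needs the total attenuation along each line of detection.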
Chapter 4
Data acquisition
In nuclear medicine, photon detection hardware differs considerably from that used in CT. In CT, large
numbers of photons must be acquired in a very short measurement. In emission tomography, a very
small number of photons is acquired over a long time period. Consequently, emission tomography
systems are optimized for sensitivity.
Photo-electric absorption is the preferred interaction in the detector, since it results in absorption
of all the energy of the incoming photon. Therefore the detector material must have a high atomic
number (the atomic number Z is the number of electrons in the atom). Since interaction probability
decreases with increasing energy, a higher Z is needed for higher energies.
In single photon imaging, 99m Tc is the tracer that is mostly used. It has an energy of 140 keV and
the gamma camera performance is often optimized for this energy. Obviously, PET-cameras have to
be optimized for 511 keV.
electrons to a higher energy level. Each of these electrons will then produce a single scintillation
photon, always with the same color. So if we want to have an idea about the energy of the incoming
photon, we need to look at the number of scintillation photons (the intensity of the flash), and not at
their energy (the color of the flash).
Many scintillators exist, and considerable research on new scintillators is still going on. The crystals
that are mostly used today are NaI(Tl) for single photons (140 keV) in gamma cameras and SPECT, and
LSO (lutetium oxyorthosilicate) for annihilation photons (511 keV) in PET. Before LSO was found,
BGO (bismuth germanate) was used for PET, and there are still some old BGO-based PET systems
around. Some other crystals, such as GSO (gadolinium oxyorthosilicate) and LYSO (LSO with a
bit of Yttrium, very similar to LSO) are used for PET as well. The important characteristics of the
crystals are:
• Photon yield per incoming keV. More photons per keV is better, since the light flash will be
easier to see. The scintillation and its detection are statistical processes, and the associated
uncertainty will go down if the number of scintillation photons increases.
• Scintillation time. This is the average time that the electrons remain in the excited state before
releasing the scintillation photon. Shorter is better, since we want to be ready for the next
photon.
• Attenuation coefficient for the high energy photon. We want to stop the photon, so a higher
attenuation coefficient is better. Denser materials tend to stop better.
• Wavelength of the scintillation light. Some wavelengths are easier to detect than others.
• Ease of use: some crystals are very hygroscopic: a bit of water destroys them. Others can only
be produced at extremely high temperatures. And so on.
The scintillation photons are emitted to carry away the energy set free when an electron returns to
a lower energy level. Consequently, these photons contain just the right amount of energy to drive an
electron from a lower level to the higher one. In pure NaI this happens all the time, so the photons are
reabsorbed and the scintillation light never leaves the crystal. To avoid this, the NaI crystal is doped
with a bit of Tl (thallium). This disturbs the lattice and creates additional energy levels. Electrons
returning to the ground state via these levels emit photons that are not reabsorbed. NaI is hygroscopic
(similar to NaCl (salt)), so it must be protected against water. LSO has excellent characteristics, but
is difficult to process (formed at very high temperature), and it is slightly radioactive. Table 4.1 lists
some characteristics of three scintillation crystals.

Figure 4.1: Photomultiplier. Left: the electrical scheme. Right: scintillation photons from the crystal
initiate an electric current to the dynode, which is amplified in subsequent stages.
• Position: (x, y)
The x-position is computed as

x = Σi xi Si / Σi Si , (4.1)
where i is the PMT-index, xi the x-position of the PMT and Si the integral of the PMT output
over the scintillation duration. The y-position is computed similarly. Expression (4.1) is not
very accurate and needs correction for systematic errors (linearity correction). The reason is
that the response of the PMT’s does not vary nicely with the distance to the pulse. In addition,
each PMT behaves a bit differently. This will be discussed later in this chapter.
• Energy: E
The energy is computed as

E = cE Σi Si (4.2)
where cE is a coefficient converting voltage (integrated PMT output) to energy. The “constant”
cE is not really a constant: it varies slightly with the energy of the high energy photon. More-
over, it depends also on the position of the scintillation, because of the complex and individual
behavior of the PMT’s. Compensation of all these effects is called “energy correction”, and will
be discussed below.
• Time: t
In positron emission tomography, we must detect pairs of photons which have been produced
simultaneously during positron-electron annihilation. Consequently, the time of the scintillation
must be computed as accurately as possible, so that we can separate truly simultaneous events
from events that happen to occur almost simultaneously.
The scintillation has a finite duration, depending on the scintillator (see table 4.1, p 16). The
scintillation duration is characterized by the decay time τ , assuming that the number of
scintillation photons decreases as exp(−t/τ ). The decay times of the different scintillation
crystals are in the range of 40 to several hundred ns. This seems short, but since a photon travels
about 1 meter in only 3.3 ns, we want the time resolution to be in the order of a few ns (e.g.
to suppress the random coincidence rate in PET, as will be explained in section 4.2.2.3, p 29).
To assign such a precise time to a relatively slow event, the electronics typically computes
the time at which a predefined fraction of the scintillation light has been collected (constant
fraction discriminator). In current time-of-flight PET systems based on LSO, one obtains a
timing resolution of about 400 ps.
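The Anger logic of equations (4.1) and (4.2) above can be sketched in a few lines. The PMT positions, the signals and the calibration constant cE are made-up illustration values:

```python
# PMT x-positions (mm) and integrated signals S_i: made-up illustration values.
pmt_x = [-30.0, 0.0, 30.0]
signals = [0.2, 1.0, 0.4]
C_E = 140.0 / 1.6    # hypothetical calibration constant c_E (signal units -> keV)

total = sum(signals)
x = sum(xi * si for xi, si in zip(pmt_x, signals)) / total   # eq. (4.1)
energy = C_E * total                                         # eq. (4.2)
print(f"x = {x:.2f} mm, E = {energy:.1f} keV")
```

The centroid lands between the two strongest PMTs, illustrating why nearby tubes all contribute to the position estimate; as noted above, real systems still need linearity and energy corrections on top of this.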
Figure 4.2: Multi-crystal module. The outputs of four photomultipliers are used to compute in which
of the 64 crystals the scintillation occurred.
4.1.4 Resolution
The coordinates (x, y, E, t) can only be measured with limited precision. They depend on the actual
depth at which the incoming photon was stopped, on the number of electrons that were excited, on the
time the electrons remain in the excited state, on the direction in which each scintillation photon
is emitted when the electron returns to a lower energy state, and on the number of electrons that are
activated in each dynode of the PMTs. These are all random processes: if two identical high energy
photons enter the crystal at exactly the same position, all the forthcoming events will be different. We
can only describe them with probabilities.
So if the photon really enters the crystal at (x̄, ȳ, Ē, t̄), the actual measurement will produce a
random realization (xi , yi , Ei , ti ) drawn from a four-dimensional probability distribution (x, y, E, t).
We can measure that probability distribution by doing repeated measurements in a well-controlled
experimental situation. E.g., we can put a strongly collimated radioactive source in front of the crystal,
which sends photons with the same known energy into the crystal at the same known position. This
experiment would provide us the distribution for x, y and E. The time resolution can be determined
in several ways, e.g. by activating the PMT’s with light emitting diodes. The resolution on x and y is
often called the intrinsic resolution.
If we plot all the measurements in a histogram, we will see the distribution. Usually the distri-
bution is approximately Gaussian, and can be characterized by its standard deviation σ. Often one
specifies the full width at half maximum (FWHM) instead (figure 4.3). It is easy to show that for a
Gaussian, FWHM = 2√(2 ln 2) σ ≈ 2.355 σ. This leads to a useful rule of thumb: any detail smaller than the
FWHM is lost during the measurement.
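The FWHM–σ relation is easy to verify numerically; the 3.5 mm example value is just an intrinsic resolution in the range quoted below:

```python
import math

def fwhm_from_sigma(sigma: float) -> float:
    """FWHM = 2 * sqrt(2 * ln 2) * sigma (about 2.355 * sigma) for a Gaussian."""
    return 2.0 * math.sqrt(2.0 * math.log(2.0)) * sigma

def sigma_from_fwhm(fwhm: float) -> float:
    return fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))

# A 3.5 mm FWHM intrinsic resolution corresponds to sigma of about 1.49 mm:
print(f"sigma = {sigma_from_fwhm(3.5):.2f} mm")

# Check: a unit Gaussian really is at half its maximum at +/- FWHM/2.
g = lambda u: math.exp(-u * u / 2.0)
print(f"g(FWHM/2) = {g(fwhm_from_sigma(1.0) / 2.0):.3f}")
```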
Position resolution of a scintillation detector has a FWHM of 3 to 4 mm. Energy resolution is
about 10% FWHM (so 14 keV for a 140 keV tracer) in NaI(Tl) cameras, about 20% FWHM (about
130 keV at 511 keV) or worse in BGO PET-systems, and about 15% FWHM in LSO and GSO PET-
systems.
Figure 4.4: Collimation in the PET camera, the gamma camera (SPECT) and the CT camera.
Compared to the regular APD, the SiPM has the advantage of being much faster, achieving a
response time similar to that of analogue photomultiplier tubes. That makes them fast enough for
time-of-flight PET. Current timing resolution is around 400 ps, and prototypes with still better timing
are currently being developed.
4.2 Collimation
At this point, we know how the position and the energy of a photon impinging on the detector can
be determined. Now we need a way to ensure that the impinging photons will produce an image.
In photography, a lens is used for that purpose. But there are no easy-to-use lenses to focus the
high energy photons used in nuclear medicine. So we must fall back on a more primitive approach:
collimation. Collimation is the method used to make the detector “see” along straight lines. Different
tomographic systems use different collimation strategies, as shown in figure 4.4.
In the CT-camera, collimation is straightforward: there is only one transmission source, so any
photon arriving in the detector has traveled along the straight line connecting source and detector. In
the PET-camera, a similar collimation is obtained: if two photons are detected simultaneously, we
can assume they have originated from the same annihilation, which must have occurred somewhere
on the line connecting the two detectors. But for the gamma camera there is a problem: only one
CHAPTER 4. DATA ACQUISITION 23
Figure 4.5: Parallel hole, fan beam, cone beam and pin hole collimators.
photon is detected, and there is no simple way to find out where it came from. To solve that problem,
mechanical collimation is used. A mechanical collimator is essentially a thick sieve with long narrow
holes separated by thin septa. The septa are made from a material with strong attenuation (usually
lead), so photons hitting a septum will be eliminated, mostly by photo-electric interaction. Only
photons traveling along lines parallel to the collimator holes will reach the detector. So instead of
computing the trajectory of a detected photon, we eliminate all trajectories but one, so that we know
the trajectory even before the photon was detected. Obviously, this approach reduces the sensitivity
of the detector, since many photons will end up in the septa. This is why a PET-system acquires more
photons per second than a gamma camera, for the same activity in the field of view.
Most often, the parallel hole collimator is used, but for particular applications other collimators are
used as well. Figure 4.5 shows the parallel hole collimator (all lines parallel), the fan beam collimator
(lines parallel in one plane, focused in the other), the cone beam collimator (all lines focused in a
single point) and the pin hole collimator (single focus point, but in contrast with other collimators, the
focus is placed before the object).
Collimation ensures that tomographic systems collect information about lines. In CT, detected
photons provide information about the total attenuation along the line. In gamma cameras and PET
cameras, the number of detected photons provides information about the total radioactivity along the
line (but of course, this information is affected by the attenuation as well). Consequently, the acquired
two-dimensional image can be regarded as a set of line integrals of a three dimensional distribution.
Such images are usually called projections.
Figure 4.6: Focusing by a lens compared to ray selection by a parallel hole collimator
immediately under the source, as the origin of the detector plane. The length of the shadow s cast by
a septum on the detector is then

s = r T/H        (4.3)
where r is the distance to the origin. The fraction of the detector element that is actually detecting
photons is

f = (a − s)/a = 1 − rT/(aH) = 1 − r/R        (4.4)
This expression holds for r between 0 and R, where R is the position at which the shadow completely
covers the detector so that f = 0. The expression gives the fraction of the photons detected at
r, relative to the number that would be detected at the same position if there was no collimator.
This latter number is easy to compute. The source is emitting photons uniformly, so the number of
photons per solid angle is constant. Stated otherwise, if we would put a point source in the center of
a spherical uncollimated detector with radius H, the detector surface would be uniformly irradiated.
The total area of the sphere is 4πH 2 , so the sensitivity per unit area is 1/(4πH 2 ). Multiplying with
the collimator sensitivity produces the point spread function of the collimated detector at distance H:
PSF(r) = (1 − rT/(aH)) · 1/(4πH²)        (4.5)
In this expression, we have ignored the fact that for a flat detector, the distance to the point increases
with r. The approximation is fair if r is much smaller than H. Consequently, the PSF has a triangular
profile, as illustrated in figure 4.8.
To calculate (approximately) the total collimator sensitivity, expression (4.5) must be integrated
over the region where PSF(r) is non-zero. We assume circular symmetry and use integration instead
of the summation (actually required by the discrete nature of the problem). Integration assumes that
a and T are infinitely small. Since we use only the ratio, this poses no problems.
sens = 1/(4πH²) ∫_0^R (1 − rT/(aH)) 2πr dr = (1/(4πH²)) (πR²/3) = (1/12) (a/T)²        (4.6)

R = aH/T        (4.7)
Note that a similar result is obtained if one assumes that the collimator consists of a square grid of
square holes. One finds:

sens = 1/(4πH²) ∫_{−R}^{R} dx ∫_{−R}^{R} dy (1 − |x|/R)(1 − |y|/R) = (1/(4π)) (a/T)².        (4.8)
In reality, the holes are usually hexagonal. With a septa length of 2 cm and a septa spacing of 1 mm,
about 1 in 5000 photons passes through the collimator, according to equation (4.6). It is fortunate
that the sensitivity does not depend on the distance H.
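These numbers are easy to reproduce from equation (4.6); a small Python sketch (the function name and example values are ours):

```python
def collimator_sens(a, T):
    """Geometric sensitivity of a parallel hole collimator, eq. (4.6):
    (1/12) * (a/T)**2, with hole size a and septa length T (same units)."""
    return (a / T) ** 2 / 12.0

# Septa length 2 cm, septa spacing 1 mm:
print(round(1.0 / collimator_sens(a=0.1, T=2.0)))  # 4800
```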
It is easy to show that (4.7) also equals the FWHM of the PSF, as you can verify by computing the
FWHM from (4.5). So the FWHM of the spatial resolution increases linearly with the distance to
the collimator! Therefore, the collimator must always be as close as possible to the patient, and failure
to do so leads to important image degradation! Many gamma cameras have hardware to automatically
minimize the distance between patient and collimator during scanning.
To improve the resolution, the ratio a/T must be decreased. However, the sensitivity is propor-
tional to the square of a/T . As a result, the collimator must be a compromise between high sensitivity
and low FWHM.
The PSF is an important characteristic of the linear system, which allows us to predict how the
system will react to any input. See appendix 12.2, p 139 if you are not familiar with the concept of
PSF and convolution integrals.
In section 4.1.4, p 19 we have seen that the intrinsic resolution of a typical scintillation detector
has a FWHM of about 4 mm. Now, we have seen that the collimator contributes significantly to the
FWHM, but in the derivation, we have ignored the intrinsic resolution. Obviously, they have to be
combined to obtain realistic results, so we must “superimpose” the two random processes. The prob-
ability distribution of each random process is described as a PSF. To compute the overall probability
distribution, the second PSF must be “applied” to every point of the first one. Mathematically, this
means that the two PSFs must be convolved. If we make the reasonable assumption that the PSFs are
Gaussian, convolution becomes very simple, as shown in appendix 12.3, p 140: the result is again a
Gaussian, and its variance equals the sum of variances of the contributing Gaussians.
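Because variances add under convolution, FWHM values combine in quadrature. A short Python sketch (the 4 mm intrinsic FWHM and the collimator numbers are illustrative assumptions):

```python
import math

def combined_fwhm(*fwhms):
    """FWHM of a convolution of Gaussians: variances, and hence squared
    FWHM values, simply add."""
    return math.sqrt(sum(f * f for f in fwhms))

# Intrinsic 4 mm; collimator FWHM a*H/T (eq. 4.7) with a = 1 mm, T = 20 mm,
# at H = 100 mm from the collimator:
fwhm_coll = 1.0 * 100.0 / 20.0           # 5 mm
total = combined_fwhm(4.0, fwhm_coll)    # about 6.4 mm
```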
At distances larger than a few cm, the collimator PSF dominates the spatial resolution. Its FWHM
grows by roughly half a cm per 10 cm of distance, but there is a wide range: different types of
collimator exist, either focusing on resolution or on sensitivity.
In the calculations above, the thickness of the septa has been ignored. In reality, the septa have to
be thick enough to stop the photons hitting them. Septa are made from high-Z materials, like lead or
tungsten, to maximize the photon attenuation. High resolution collimators are optimized for imaging
with 99m Tc: the septa are relatively thin, and a/T is small, resulting in high resolution but not so high
Figure 4.8: Point spread function of the collimator. The number of detected photons at a point in the
detector decreases linearly with the distance r to the central point.
sensitivity. High sensitivity collimators have a larger a/T , producing poorer resolution. High energy
collimators are designed for isotopes emitting high energy photons, such as 131 I (see table 11.1, p
125). They have thicker septa to minimize the number of photons penetrating them. To account for
the septa thickness, equation (4.6) can be extended to
sens = (1/12) (a/T)² · a²/(a + s)²        (4.9)
where a is the distance between neighboring septa and s is the septa thickness. Using thicker septa
while keeping the resolution reduces the collimator sensitivity.
PSF(x = 0, r) = 2d²/(4π(R + r)²) = d²/(2π(R + r)²)        (4.10)
This is an approximation: we assume that the far detector surface coincides with the surface of a
sphere of radius R + r. It does not, because it is flat. However, because d is very small compared to
R, the approximation is very good.
We now move the point source vertically over a distance x. As long as the point is between the
dashed lines in figure 4.9, nothing changes (that is, if we ignore the small decrease in solid angle
Figure 4.9: A point source positioned in the field of view of a pair of detectors.
due to the fact that the detector surface is now slightly tilted when viewed from the point source).
However, when the point crosses the dashed line, the far detector is no longer fully irradiated. A part
of the detector surface no longer contributes: if a photon reaches that part, the other photon will miss
the other detector. The height s is computed from simple geometry (figure 4.9):
s(x) = (|x| − d|r|/(2R)) · 2R/(R − |r|)        (4.11)
The first factor is the distance between the point and the dashed line, the second factor is the magnification
due to projection of that distance onto the far detector. This equation is only valid when
d|r|/(2R) ≤ |x| ≤ d/2. Otherwise, s = 0 if |x| is smaller than d|r|/(2R), and s = d if |x| is larger than
d/2. In the center, r = 0 and we have simply that s(x) = 2|x|. Knowing the active detector area, we can
immediately compute the corresponding solid angle to obtain the detector pair sensitivity as a function
of the point source position (which can be regarded as a PSF):
PSF(x, r) = (d − s(x)) d/(2π(R + |r|)²)        (4.12)
We can move the point also in the third dimension, which we will call y. The effect is identical to that
of changing x, so we obtain:
PSF(x, y, r) = (d − s(x)) (d − s(y))/(2π(R + |r|)²)        (4.13)
From these equations it follows that the PSF is triangular in the center, rectangular close to the
detectors and trapezoidal in between. Introducing that third dimension, we obtain for the PSF in the
center and close to the detector respectively:
PSF(x, y, 0) = (d − 2|x|)(d − 2|y|)/(2πR²)        (4.14)

PSF(x, y, R) = d²/(8πR²)        (4.15)
We can compute the average sensitivity within the field of view by integrating the PSF over the detector
area x ∈ [−d/2, d/2], y ∈ [−d/2, d/2] and dividing by d². It is simple to verify that in both cases
we obtain:

sens = d²/(8πR²)        (4.16)
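This average is easy to verify numerically from equations (4.11)–(4.13). A Python sketch (function names are ours; valid for |r| < R):

```python
import math

def s_shadow(x, r, d, R):
    """Shadowed detector height s(x) of eq. (4.11), clipped to [0, d].
    Valid for |r| < R (point source between the two detectors)."""
    s = (abs(x) - d * abs(r) / (2 * R)) * 2 * R / (R - abs(r))
    return min(max(s, 0.0), d)

def psf(x, y, r, d, R):
    """Sensitivity of a detector pair for a point source at (x, y, r), eq. (4.13)."""
    return ((d - s_shadow(x, r, d, R)) * (d - s_shadow(y, r, d, R))
            / (2 * math.pi * (R + abs(r)) ** 2))

def mean_sens(r, d, R, n=100):
    """Average of the PSF over x, y in [-d/2, d/2] (midpoint rule)."""
    h = d / n
    pts = [-d / 2 + (i + 0.5) * h for i in range(n)]
    return sum(psf(x, y, r, d, R) for x in pts for y in pts) * h * h / d ** 2
```

For r = 0 this indeed reproduces d²/(8πR²) of equation (4.16).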
Table 4.2: Maximum and mean kinetic energy (k.e.), maximum and mean path length of the positron
for the most common PET isotopes.
Isotope   Max k.e. [MeV]   Mean k.e. [MeV]   Max path [mm]   Mean path [mm]
11 C 0.96 0.39 3.9 1.1
13 N 1.19 0.49 5.1 1.5
15 O 1.72 0.73 8.0 2.5
18 F 0.64 0.25 2.4 0.6
68 Ga 1.90 0.85 8.9 2.9
82 Rb 3.35 1.47 17 5.9
showing that sensitivity is independent of position in the detection plane, if the object is large com-
pared to d.
Finally, we can estimate the sensitivity of a complete PET system consisting of a detector ring
with thickness d and radius R, assuming that we have a positron emitting source near the center of the
ring. We assume that the ring is in the (y, r)-plane, so the x-axis is the symmetry axis of the detector
ring. One approach is to divide the detector area by the area of a sphere with radius R, as we have
done before. This yields:
PETsens(x = 0) = 2πdR/(4πR²) = d/(2R)        (4.17)
This is the maximum sensitivity, obtained for x = 0. The average over x = −d/2 … d/2 is half that
value, since in the center of the PET, the sensitivity varies linearly between the maximum and zero:
PETsens = d/(4R).        (4.18)
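With illustrative dimensions (our own choice, not from the text), equation (4.18) gives:

```python
def pet_ring_sens(d, R):
    """Average geometric sensitivity near the center of one detector ring,
    eq. (4.18): d / (4R)."""
    return d / (4.0 * R)

# Axial crystal size d = 2 cm in a ring of radius R = 40 cm:
print(pet_ring_sens(2.0, 40.0))  # 0.0125, i.e. 1.25% of the photon pairs
```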
Figure 4.10: The distance traveled by the positron and the deviation from 180 degrees in the angle
between the photon paths limit the best achievable resolution in PET.
Figure 4.11: Resolution loss as a result of uncertainty about the depth of interaction.
these events causes a loss of resolution that increases with the distance from the center. PET systems
have been designed and built that can measure the depth of interaction to reduce this loss of resolution.
• The diameter of the PET-camera is typically about 1 m for a whole body system, and about half
that for a brain system. Light travels about 1 m in 3 ns. Consequently, the time window should
Figure 4.12: A PET ring detector (cut in half) with a cylindrical homogeneous object, containing a
radioactive wire near the symmetry axis.
Figure 4.13: True coincidence, scattered coincidence, single event, random coincidence.
For these reasons, current systems use a time window of about 5 to 10 ns. Recently, PET systems have
been built with a time resolution better than 1 ns. With those systems, one can measure the difference
in arrival time between the two photons and deduce some position information from it. This will be
discussed in section 5.5, p 65.
Figure 4.13 shows the difference between a true coincidence, which provides valuable informa-
tion, and several other events which are disturbing or even misleading. A single event provides no
information and is harmless. But if two single events are simultaneous (random coincidence), they
are indistinguishable from a true coincidence. If one (or both) of the photons is scattered, the coinci-
dence provides wrong information: the decaying atom is not located on the line connecting the two
detectors. Finally, a single event may be simultaneous with a true coincidence. Since there is no way
to find out which of the three photons is the single one, the entire event is ignored and information is
lost.
To see how the septa dimensions influence the relative contribution of the various possible events,
some simple computations can be carried out using the geometry shown in figure 4.12. Assume that
we have a cylindrical object in the PET-camera. In the center of the cylinder, there is a radioactive
wire, with activity ρ per unit length. The cylinder has radius r and attenuation coefficient µ. This
object represents an idealized patient: there is activity, attenuation and Compton scatter in the attenu-
ating object as in a patient; the unrealistic choice of dimensions makes the computations easier. The
PET-camera diameter is D, the detector size is d and the septa have length T . The duration of the
time window is τ. Finally, we assume that the detector efficiency is ε, which means that if a photon
reaches the detector, it has a chance of ε to produce a valid scintillation, and a chance of 1 − ε to
remain undetected.
To compute the number of true coincidences per unit of time, we must keep in mind that both
photons must be detected, but that their paths are not independent. To take into account the influence
of the time window, it helps to consider first a short time interval equal to the time window τ . After
that, we convert the result to counts per unit of time by dividing by τ . Why we do this will become
clear when dealing with randoms. We proceed as follows:
• The activity in the field of view is ρd.
• Both photons can be attenuated; the attenuated activity is ρd a², with a = exp(−µr).
• Both photons must be detected if they hit the detectors, so the effective sensitivity is ε² d/(2D).
• The probability that a photon will be seen is proportional to the scan time τ.
• To compute the number of counts per time unit, we multiply with 1/τ.
We only care about the influence of the design parameters, so we ignore all constants. This leads to:
trues ∼ ρ a² ε² d²/D        (4.19)
For scatters the situation is very similar, except for two issues: (1) the probability for a scatter
event to occur depends on the attenuating material and (2), the combined photon path is not a straight
line but a broken one.
Consider a source with activity λ emitting photons at x = 0 along the x-axis, in a material
with linear attenuation coefficient µ that covers the interval x ∈ [−r, r]. Then the probability of a
photon arriving at x is proportional to λe^(−µx), and the probability that it scatters there is proportional
to λe^(−µx) µ dx. The probability that this scattered photon will not be attenuated is determined by
how much material it is propagating through, and can be roughly estimated as e^(−µ(r−x)). To obtain all
scattered photons leaving the object we integrate this from the source position to the boundary of the
material:

∫_0^r λ e^(−µx) µ e^(−µ(r−x)) dx = λ e^(−µr) µr = λaµr,        (4.20)

where the last equality is obtained by setting e^(−µr) = a.
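Since the attenuation factors in the integrand of (4.20) combine to the constant e^(−µr), the result is easy to check numerically. A Python sketch (the value of µ is an illustrative assumption, roughly that of water at 511 keV):

```python
import math

def scattered_fraction(lam, mu, r, n=1000):
    """Midpoint-rule evaluation of the integral in eq. (4.20):
    integral over [0, r] of lam * exp(-mu*x) * mu * exp(-mu*(r - x)) dx."""
    h = r / n
    total = 0.0
    for i in range(n):
        x = h * (i + 0.5)
        total += lam * math.exp(-mu * x) * mu * math.exp(-mu * (r - x))
    return total * h

lam, mu, r = 1.0, 0.095, 10.0      # activity, 1/cm (~water at 511 keV), cm
a = math.exp(-mu * r)
# the numerical value agrees with the closed form lam * a * mu * r
```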
The other issue is that the combined photon path is a broken line. This has two consequences.
First, the field of view is larger than for the trues. Second, the path of the second photon is (nearly)
independent of that of the first one.
We will assume that only one of the photons undergoes a single Compton scatter event, so the path
contains only a single break point. The corresponding field of view is indicated with the dashed line
in figure 4.12. The length of the wire in the field of view is then d(D − T )/T , but since T is much
smaller than D we approximate it as dD/T .
Each of the photons goes its own way. This will only lead to detection if both happen to end up
on a detector and are detected. For each of them, that probability is proportional to ε d/D, so for the
pair the detection probability is proportional to (ε d/D)². For the trues the factor d/D was not squared,
because detection of one photon guarantees that the other one reaches the opposite detector!
Attenuation of scattered photons is complex, since they have lower energy and travel along oblique
lines. However, if we assume that the septa are not too short and that the energy resolution is not too
bad, we can ignore all that without making too dramatic an error. (We only want to see the major
dependencies, so we can live with moderately dramatic errors.) Consequently, we have for the scatters
count rate

scatters ∼ ρ (dD/T) µr a² ε² (d/D)² = ρ µr a² ε² d³/(DT)        (4.21)
A single, non-scattered photon travels along a straight line, so the singles field of view is the same
as that of the scatters, and the contributing activity is ρdD/T. The attenuation is a. The effective
sensitivity is proportional to ε d/D. The number of singles in a time τ is proportional to τ. Finally, we
divide by τ to obtain the number of single events per unit of time. This yields:

singles ∼ ρ a ε d²/T        (4.22)
A random coincidence consists of two singles, arriving simultaneously. So if we study a short
time interval τ equal to the time window, we must simply multiply the probabilities of the two singles,
since they are entirely independent of each other. Afterwards, we multiply with 1/τ to compute the
number of counts per time unit. So we obtain:
randoms ∼ ρ² a² ε² (d⁴/T²) τ        (4.23)
Similarly, we can compute the probability of a “triple coincidence”, resulting from simultaneous
detection of a true coincidence and a single event. This produces:
true + single ∼ ρ² a³ ε³ (d⁴/(TD)) τ        (4.24)
As you can see, the trues count rate is affected neither by τ nor by T, in contrast to the unwanted
events. Consequently, we want τ to be as short as possible to reduce the randoms and triples count
rate. We have seen that for current systems this is about 10 ns or even less. Similarly, we want T to
be as large as possible to reduce all unwanted events. Obviously, we need to leave some room for the
patient.
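These proportionalities can be bundled into a toy model to explore the design trade-offs. A Python sketch (all constants are dropped, so only ratios between two settings are meaningful; the names and example values are ours):

```python
def event_rates(rho, a, eps, mu_r, d, D, T, tau):
    """Relative count rates from eqs. (4.19)-(4.24), constants dropped:
    rho activity, a attenuation factor, eps detector efficiency, mu_r = mu*r,
    d detector size, D camera diameter, T septa length, tau time window."""
    return {
        "trues":    rho * a**2 * eps**2 * d**2 / D,
        "scatters": rho * mu_r * a**2 * eps**2 * d**3 / (D * T),
        "singles":  rho * a * eps * d**2 / T,
        "randoms":  rho**2 * a**2 * eps**2 * d**4 * tau / T**2,
        "triples":  rho**2 * a**3 * eps**3 * d**4 * tau / (T * D),
    }

base = dict(rho=1.0, a=0.4, eps=0.8, mu_r=1.0, d=0.5, D=100.0, tau=1.0)
short_septa = event_rates(T=5.0, **base)
long_septa = event_rates(T=10.0, **base)
# Doubling T leaves the trues untouched, halves scatters and singles,
# and divides the randoms by four.
```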
Figure 4.14: Positron emission tomograph (cut in half). When septa are in the field of view (a),
the camera can be regarded as a series of separate 2D systems. Coincidences along oblique projec-
tion lines between neighboring rings can be treated as parallel projection lines from an intermediate
plane. This doubles axial sampling: 15 planes are reconstructed from 8 rings. Retracting the septa
(b), increases the number of projection lines and hence the sensitivity of the system, but fully 3D
reconstruction is required.
in ring i and the other one in ring i + 1. The small inclination of the projection line can be ignored,
and such coincidences are processed as if they were acquired by an imaginary detector ring located
at i + 0.5. This improves axial sampling. It turns out that the cross-planes are in fact superior to the
direct planes: near the center the axial resolution is slightly better because of the shadow cast by the
septa on the contributing detectors, and the sensitivity is nearly twice that of direct planes because of
the larger solid angle.
According to the Nyquist criterion, the sampling distance should not be longer than half the period
of the highest spatial frequency component in the acquired signal (you need at least two samples per
period to represent a sine correctly). Without the cross-planes, the criterion would be violated. So the
cross-planes are not only useful to increase sensitivity, they are needed to avoid aliasing as well.
In 3D mode, the septa are retracted, except for the first and last one. The detection is no longer
restricted to parallel planes, photon pairs traveling along oblique lines are now accepted as well. In
chapter 5 we will see that this has a strong impact on image reconstruction.
This has no effect on spatial resolution, since that is determined by the detector size. But the
impact on sensitivity is very high. In fact, we have already computed all the sensitivities in section
4.2.2.3, p 29. Those expressions are still valid for 3D PET, if we replace d (the distance between the
septa) with N d, where N is the number of neighboring detector rings. To see the improvement for 3D
mode, you have to compare to a concatenation of N independent detector rings (2D mode), which are
N times more sensitive than a single ring (at least if we wish to scan an axial range larger than that
seen by a single ring). So you can see that the sensitivity for trues increases with a factor of N when
going from 2D to 3D. However, scatters increase with N² and randoms even with N³. Because the
3D PET is more sensitive, we can decrease the injected dose by a factor of N (preserving the trues
count rate). Since randoms are quadratic in the activity, that dose reduction divides them by N², so
they still increase with N. Consequently, the price we pay for increased
sensitivity is an even larger increase of disturbing events. So scatter and randoms correction will be
more important in 3D PET than in 2D PET. It turns out that scatter in particular poses problems: it
was usually ignored in 2D PET, but this is no longer acceptable in 3D PET. Various scatter correction
algorithms for 3D PET have been proposed in the literature; the one most widely used is described below
(4.4.2, p 36).
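This scaling is easy to check by substituting d → N d in the proportionalities (4.19), (4.21) and (4.23), and comparing with N independent 2D rings. A Python sketch (the parameter values are arbitrary):

```python
def rates(d, T, rho=1.0, a=0.5, eps=0.8, mu_r=1.0, D=100.0, tau=1.0):
    """Trues, scatters, randoms proportionalities, eqs. (4.19), (4.21), (4.23)."""
    return (rho * a**2 * eps**2 * d**2 / D,
            rho * mu_r * a**2 * eps**2 * d**3 / (D * T),
            rho**2 * a**2 * eps**2 * d**4 * tau / T**2)

N, d, T = 16, 1.0, 2.0
mode2d = [N * x for x in rates(d, T)]   # N independent 2D rings
mode3d = rates(N * d, T)                # 3D mode: septa out, d -> N*d
gains = [m3 / m2 for m3, m2 in zip(mode3d, mode2d)]
# gains are N, N**2 and N**3 for trues, scatters and randoms respectively
```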
Figure 4.15: The partial volume effect caused by the finite pixel size (left), and by the finite system
PSF (right). The first row shows the true objects, the second row illustrates the partial volume effect
due to pixels and PSF, and the last row compares horizontal profiles to the true objects and the images.
Objects that are small compared to the pixel size and/or the PSF are poorly represented.
Figure 4.16: Compton scatter allows photons to reach the camera via a broken line. These photons
produce a smooth background in the acquired projection data.
Figure 4.17: The energy spectrum measured by the gamma camera, with a simulation in overlay.
The simulation allows us to compute the spectrum of non-scattered (primary) photons, single scatters
and multiple scatters. The true primary spectrum is very narrow; the measured one is widened by the
limited energy resolution.
However, if the energy loss during Compton scatter is smaller than or comparable to the energy
resolution of the gamma camera, then it may survive the energy test and get accepted by the camera.
Figure 4.17 shows the energy spectrum as measured by the gamma camera. A Monte Carlo simulation
was carried out as well, and the resulting spectrum is very similar to the measured one. The simulation
allows us to compute the contribution of unscattered (or primary) photons, photons that scattered once
and photons that suffered multiple scatter events. If the energy resolution were perfect, the primary
photon peak would be infinitely narrow and all scattered photons could be rejected. But with a realistic
energy resolution the spectra overlap, and acceptance of scattered photons is unavoidable.
Figure 4.17 shows a deviation between simulation and measurement for high energies. In the
measurement it happens that two photons arrive simultaneously. If that occurs, the gamma camera
adds their energies and averages their position, producing a mispositioned count with high energy.
Such photons must be rejected as well. The simulator assumed that each photon was measured in-
dependently. The simulator deviates also at low energies from the measured spectrum, because the
camera has not recorded these low-energy photons.
To reject as much as possible the unwanted photons, a narrow energy window is centered around
the tracer energy peak, and all photons with energy outside that window are rejected. Current gamma
cameras can use multiple energy windows simultaneously. That allows us to administer two or more
different tracers to the patient at the same time, if we make sure that each tracer emits photons at
a different energy. The gamma camera separates the photons based on their energy and produces a
different image for each tracer. For example, 201 Tl emits photons of about 70 keV (with a frequency
Figure 4.18: Triple energy window correction. The amount of scattered photons accepted in window
C1 is estimated as a fraction of the counts in windows C2 and C3.
of 0.70 per disintegration) and 80 keV (0.27 per disintegration), while 99m Tc emits photons of about
140 keV (0.9 per disintegration). 201 Tl labeled thallium chloride can be used as a tracer for myocardial
blood perfusion. There are also 99m Tc labeled perfusion tracers. These tracers are trapped in
proportion to perfusion, and their distribution depends mostly on the perfusion at the time of injection.
Protocols have been used to measure simultaneously or sequentially the perfusion of the heart under
different conditions. In one such protocol, 201 Tl is injected at rest, and a single energy scan is made.
Then the patient exercises to induce the stress condition, the 99m Tc tracer is injected at maximum
exercise, and after 30 to 60 min, a scan with energy window at 140 keV is made. If the same tracer
were used twice, the image of the stress perfusion would be contaminated by the tracer contribution
injected previously at rest.
Since not all unwanted photons can be rejected, an additional correction may be required. Figure
4.18 shows a typical correction method based on three energy windows. Window C1 is the window
centered on the energy of the primary photons. If we assume that the spectrum of the unwanted
photons varies approximately linearly over the window C1, then we can estimate that spectrum by
measuring in two additional windows C2 (just below C1) and C3 (just above C1). If all windows
had the same size, then the number of unwanted photons in C1 could be estimated as (counts in C2 +
counts in C3) / 2. Usually C2 and C3 are chosen narrower than C1, so the correction must be weighted
accordingly. In the example of figure 4.18 there are very few counts in C3. However, if we would use
a second tracer with higher energy, then C3 would receive scattered photons from that tracer.
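The weighting can be written down directly. A Python sketch of the triple-energy-window estimate (the function name and example numbers are ours; the scatter spectrum is assumed linear across C1):

```python
def tew_scatter_estimate(c2, c3, w1, w2, w3):
    """Estimate of the scatter counts inside the photopeak window C1 (width w1):
    counts c2 and c3 in narrow side windows of widths w2 and w3 sample the
    scatter spectrum at the edges of C1; assuming the spectrum is linear
    across C1, the trapezoid rule gives the total."""
    return (c2 / w2 + c3 / w3) * w1 / 2.0

# With equal-width windows this reduces to (c2 + c3) / 2, as in the text:
print(tew_scatter_estimate(c2=200, c3=40, w1=20, w2=20, w3=20))  # 120.0
```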
Figure 4.19: CT and PET images (obtained by maximum intensity projection), and the correspond-
ing raw PET projection data with their estimated scatter contribution. Data were acquired with a
Biograph 16 PET/CT system (Siemens). The regular pattern in the raw data is due to sensitivity
differences between different detector pairs (see also fig 4.23, p 41).
chapter 6, PET scanners are capable of measuring the attenuation coefficients of the patient body. This
can be used to calculate an estimate of the Compton scatter contribution, if an estimate of the tracer
distribution is available, and if the characteristics of the PET system are known. Such computations
are done with Monte Carlo simulation. The simulator software “emits” photons in a similar way as
nature does: more photons are emitted where more activity is present, and the photons are emitted in
random (pseudo-random in the software) directions. In a similar way, photon-electron interactions are
simulated, and each photon is followed until it is detected or lost for detection (e.g. because it flies in
the wrong direction). This must be repeated for a sufficiently large number of photons, in order to generate a
reasonable estimate of the scatter contribution. Various clever tricks have been invented to accelerate
the computations, such that the whole procedure can be done in a few minutes.
In practice, the method works as follows. First, a reconstruction of the tracer uptake without
scatter correction is computed. Based on this tracer distribution and on the attenuation image of the
patient, the scatter contribution to the measured data is estimated with the simulation software. This
scatter contribution can then be subtracted to produce a better reconstruction of the tracer distribution.
This procedure can be iterated to refine the scatter estimate.
Figure 4.19 gives an example of the original raw PET data from a 3D PET scan (slightly smoothed
so you would see something), together with the corresponding estimated scatter contribution (using
the same gray scales). Maximum intensity projections of the CT and PET images are shown as well.
The regular pattern in the raw PET data is due to position dependent sensitivities of the detector pairs
(see also fig. 4.23, p 41). The scatter distribution is smooth, and its amplitude is significant.
Figure 4.20: Schematic representation of a gamma camera with a single large scintillation crystal
and parallel hole collimator.
Figure 4.21: The PMTs (represented as circles) have a non-linear response as a function of position.
As a result, the image of a straight radioactive line would be distorted as in this drawing.
The performance of the PMTs is not ideal for our purposes, and in addition they show small
individual differences in their characteristics. In current gamma cameras and PET cameras the PMT
gain is computer controlled and procedures are available for automated PMT tuning. But even after
tuning, small differences in characteristics are still present. As a result, some corrections are required
to ensure that the acquired information is reliable.
x_corrected = x + ∆x(x, y)
y_corrected = y + ∆y(x, y)        (4.25)
Figure 4.22: Because the characteristics of the photomultipliers are not identical, the conversion of
energy to voltage (and therefore to a number in the computer) may be position dependent.
of the characteristics with position (e.g. crystal transparency, optical coupling between crystal and
PMT etc). Most likely, those are simply differences in sensitivity, so they do not produce deformation
errors, only multiplicative errors. Again, these sensitivity changes can be measured and stored in order
to correct them.
counts detected in dx dy = AH/(4π(H² + (x − x₀)² + (y − y₀)²)^(3/2)) = A cos³α/(4πH²)        (4.28)
Thus, if we know the size of the detector and the uniformity error we will tolerate, we can compute the
required distance H. You can verify that with H = 5D, where D is the length of the crystal diagonal,
the maximum error equals about 1.5%.
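You can check this with a few lines of Python (our own sketch; the worst-case point is the corner of the crystal, at distance D/2 from the point below the source):

```python
import math

def max_flood_error(H, D):
    """Worst-case relative deviation of a point-source flood on a flat detector.
    The flux falls off as cos(alpha)**3 (eq. 4.28); the worst point is the
    corner of the crystal, at distance D/2 from the point below the source."""
    cos_a = H / math.sqrt(H**2 + (D / 2.0)**2)
    return 1.0 - cos_a**3

print(round(max_flood_error(H=5.0, D=1.0), 4))  # 0.0148, about 1.5%
```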
The number of photons detected in every pixel is Poisson distributed. Recall from section 2.2, p
8 that the standard deviation of a Poisson number is proportional to the square root of the expected
value. So if we accept an error (a standard deviation) of 0.5%, we need to continue the measurement
until we have about 40000 counts per pixel. Computation of the uniformity correction matrix is
straightforward:
sensitivity\_corr(x, y) = \frac{mean(I)}{I(x, y)}    (4.29)
where I is the acquired image.
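Both the counts requirement and eq. (4.29) can be checked with a small simulation (a sketch with a synthetic flood image; the 64 × 64 size is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
# simulated flood acquisition: ~40000 counts expected in every pixel
I = rng.poisson(40000, size=(64, 64)).astype(float)

# relative noise: sigma / mean = sqrt(N) / N = 1 / sqrt(40000) = 0.5%
rel_noise = I.std() / I.mean()

# uniformity correction matrix, eq. (4.29)
corr = I.mean() / I
```

Applying `corr` pixel-wise to the flood image itself returns a perfectly flat image, which is exactly what the correction is designed to do.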
Consequently, if a uniformity correction is applied to a planar image, one Poisson variable is
divided by another one. The relative noise variance in the corrected image will be the sum of the
relative noise variances on the two Poisson variables (see appendix 12.4, p 141 for estimating the
error on a function of noisy variables).
Figure 4.23: Left: PET sensitivity sinogram. Center: the sinogram of a cylinder filled with a uniform
activity. Right: the same sinogram after normalization. The first row are sinograms, the second
row projections, the cross-hairs indicate the relative position of both. Because there are small gaps
between detector blocks, there is a regular pattern of zero sensitivity projection lines, which are of
course not corrected by the normalization.
Alternatively, an indirect approach can be used. A sinogram is acquired for a large phantom in the
center of the field of view. All individual detectors contribute to the measurement, and each detector
is involved in multiple projection lines intersecting the phantom. However, many other detector pairs
are not receiving photons at all. This measurement is sufficient to compute individual detector sensi-
tivities. Sensitivity of a projection line can then be computed from the individual detector sensitivity
and the known geometry. The result is a sinogram representing projection line sensitivities. This
sinogram can be used to compute a correction for every sinogram pixel:
normalization(i) = \frac{mean(sensitivity)}{sensitivity(i)}.    (4.30)
A typical sensitivity sinogram is shown in figure 4.23. The sinogram is dominated by a regular pattern
of zero sensitivity projection lines, which are due to small gaps between the detector blocks. In the
projections one can see that there is a significant axial variation in the sensitivities. After normaliza-
tion, the data from a uniform cylinder are indeed fairly uniform, except for these zero sensitivities. In
iterative reconstruction, the data in these gaps are simply not used. For filtered backprojection, these
data must first be filled with some interpolation algorithm.
Figure 4.24: Due to dead time, the measured count rate is lower than the true count rate at high
count rates. The figure is for a dead time of 700 ns; the axes are in counts per second. The measured
count rate is about 20% too low near 300000 counts per second.
therefore 1 − τ2 R2 , which gives us the relation between incoming count rate R1 and processed count
rate R2 :
R2 = (1 − R2 τ2 )R1 (4.32)
Rearranging this to obtain R2 as a function of R1 results in
R_2 = \frac{R_1}{1 + R_1 \tau_2}.    (4.33)
This function is monotonically increasing with upper limit 1/τ2 (as you would expect: this is the
number of intervals of length τ2 in one second). Consequently, the electronics is non-paralyzable.
You cannot paralyze it, but you can saturate it: if you put in too much data, it simply performs at
maximum speed ignoring all the rest.
R_2 = \frac{R_0 \, e^{-R_0 \tau_1}}{1 + R_0 \, e^{-R_0 \tau_1} \tau_2}.    (4.34)
If the count rates are small compared to the dead times (such that Rx τy is small), we can introduce an
approximation to make the expression simpler. The approximation is based on the following relations,
which are acceptable if x is small:
e^{-x} \simeq e^{-0} + x \left. \frac{d e^{-x}}{dx} \right|_{x=0} = 1 - x    (4.35)

1 - x = \frac{1+x}{1+x} - x = \frac{1 + x - x - x^2}{1+x} \simeq \frac{1}{1+x}.    (4.36)
Applying this to (4.31) and (4.33) yields:
R1 ' R0 (1 − R0 τ1 ) (4.37)
R2 ' R1 (1 − R1 τ2 ) (4.38)
So if the count rate is relatively low, there is little difference between paralyzable and non-paralyzable
dead time, and we can obtain a global dead time for the whole machine by simply adding the con-
tributing dead times. Most clinical gamma cameras and all PET systems provide dead time correction
based on some dead time model. However, for very high count rates the models are no longer valid
and the correction will not be accurate.
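The difference between the two dead time models is easy to explore numerically (a sketch; the function names and the 700 ns value, taken from figure 4.24, are illustration only):

```python
import numpy as np

def paralyzable(R0, tau=700e-9):
    # every incoming event restarts the dead time: R = R0 exp(-R0 tau)
    return R0 * np.exp(-R0 * tau)

def nonparalyzable(R1, tau=700e-9):
    # eq. (4.33): the electronics saturates at 1 / tau
    return R1 / (1.0 + R1 * tau)
```

At low rates both models agree with the first-order approximation R(1 − Rτ); at very high rates the non-paralyzable model saturates at 1/τ, while the paralyzable output rolls over and drops back toward zero.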
random coincidences. However, there is a “simple” way to measure a reliable estimate of the number
of randoms along every projection line. This is done via the delayed window technique.
The number of randoms (equation (4.23, p 32)) was obtained by squaring the probability to mea-
sure a single event during a time window τ . In the delayed window technique, an additional time
window is used, which is identical in length to the normal one but delayed over a short time interval.
The data for the second window are monitored with the same coincidence electronics and algorithms
as used for the regular coincidence detection, ignoring the delay between the windows. If a coinci-
dence is obtained between an event in the normal and an event in the delayed window, then we know
it has to be a random coincidence: because of the delay the events cannot be due to the same anni-
hilation. Consequently, with regular windowing we obtain trues + randoms1, and with the delayed
method we obtain a value randoms2. The values randoms1 and randoms2 are not identical because of
Poisson noise, but at least their expectations are identical. Subtraction will eliminate the bias, but will
increase the variance on the estimated number of trues. See appendix 12.4, p 141 for some comments
on error propagation.
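The bias/variance trade-off of the delayed-window subtraction can be illustrated with a small Poisson simulation (all the numbers below are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
trues, randoms = 200.0, 50.0                # expected counts per projection line
n = 100_000                                  # repeat the experiment many times

prompts = rng.poisson(trues + randoms, n)    # normal window: trues + randoms1
delayed = rng.poisson(randoms, n)            # delayed window: randoms2
estimate = prompts - delayed                 # unbiased estimate of the trues

# the bias is gone, but the variance is the SUM of both Poisson variances:
# var(estimate) is close to 250 + 50 = 300, larger than the 200 of the trues alone
print(estimate.mean(), estimate.var())
```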
Chapter 5
Image formation
5.1 Introduction
The tomographic systems are shown (once again) in figure 5.1. They have in common that they
perform an indirect measurement of what we want to know: we want to obtain information about the
distribution in every point, but we acquire information integrated over projection lines. So the desired
information must be computed from the measured data. In order to do that, we need a mathematical
function that describes (simulates) what happens during the acquisition: this function is an operator
that computes measurements from distributions. If we have that function, we must derive its inverse:
this will be an operator that computes distributions from measurements.
In the previous chapters we have studied the physics of the acquisition, so we are ready to compose
the acquisition model. We can decide to introduce some approximation to obtain a simple acquisition
model, hoping that deriving the inverse will not be too difficult. The disadvantage is that the inverse
operator will not be an accurate inverse of the true acquisition process, so the computed distribution
will not be identical to the true distribution. On the other hand, if we start from a very detailed
acquisition model, inverting it may become mathematically and/or numerically intractable.
For a start, we will assume that collimation is perfect and that there is no noise and no scatter. We
will only consider the presence of radioactivity and attenuating tissue.
Consider a single projection line in the CT-camera. We define the s-axis such that it coincides
Figure 5.1: Information acquired along lines in the PET camera, the gamma camera and the CT-
scanner.
with the line. The source is at s = a, the detector at s = b. A known amount of photons t0 is emitted
in a towards the detector element at b. Only t(b) photons arrive, the others have been eliminated by
attenuation. If µ(s) is the linear attenuation coefficient in the body of the patient at position s, we
have (see also eq (3.4, p 13))

t(b) = t_0 \, e^{-\int_a^b \mu(s) \, ds}.    (5.1)
The exponential gives the probability that a photon emitted in a towards b is not attenuated and arrives
safely in b.
For a gamma camera measurement, there is no transmission source. Instead, the radioactivity is
distributed in the body of the patient. We must integrate the activity along the s-axis, and attenuate ev-
ery point with its own attenuation coefficient. Assuming again that the patient is somewhere between
the points a and b on the s-axis, then the number of photons q arriving in b is:
q(b) = \int_a^b \lambda(s) \, e^{-\int_s^b \mu(\xi) \, d\xi} \, ds,    (5.2)
where λ(s) is the activity in s. For the PET camera, the situation is similar, except that it must detect
both photons. Both have a different probability of surviving attenuation. Since the detection is only
valid if both arrive, and since their fate is independent, we must multiply the survival probabilities:
q(b) = \int_a^b \lambda(s) \, e^{-\int_a^s \mu(\xi) \, d\xi} \, e^{-\int_s^b \mu(\xi) \, d\xi} \, ds    (5.3)

     = e^{-\int_a^b \mu(\xi) \, d\xi} \int_a^b \lambda(s) \, ds.    (5.4)
In CT, only the attenuation coefficients are unknown. In emission tomography, both the atten-
uation and the source distribution are unknown. In PET, the attenuation is the same for the entire
projection line, while in single photon emission, it is different for every point on the line.
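The difference between eq. (5.2) and eq. (5.4) is easy to see in a discretized sketch (the sampling step, the function names and the numbers are mine; the attenuation value is roughly that of water at 140 keV):

```python
import numpy as np

def pet_projection(lam, mu, ds):
    # eq. (5.4): one attenuation factor for the whole projection line
    return np.exp(-mu.sum() * ds) * lam.sum() * ds

def spect_projection(lam, mu, ds):
    # eq. (5.2): each emission point is attenuated only by the tissue
    # between that point and the detector (here: the end of the array)
    mu_to_detector = np.cumsum(mu[::-1])[::-1] * ds
    return (lam * np.exp(-mu_to_detector)).sum() * ds

ds = 0.1                         # cm per sample
lam = np.ones(100)               # uniform activity along the line
mu = np.full(100, 0.15)          # uniform attenuation coefficient (1/cm)
```

Without attenuation both reduce to the plain line integral; with attenuation the SPECT projection keeps more counts than the PET projection on this line, because each point is attenuated over a shorter path.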
It can be shown (and we will do so below) that it is possible to reconstruct the planar distribution,
if all line integrals (or projections) through a planar distribution are available. As you see from figure
5.1, the gamma camera and the CT camera have to be rotated if all possible projections must be
acquired. In contrast, a PET ring detector acquires all projections simultaneously (with “all” we mean
here a sufficiently dense sampling).
Figure 5.2: Raw PET data can be organized in parallel projections, similar to those of the gamma camera.
Figure 5.3: 99m Tc-MDP study acquired on a dual head gamma camera. Detector size is about 40 ×
50 cm, the whole body images are acquired with slow translation of the patient bed. MDP accumulates
in bone, allowing visualization of increased bone metabolism. Due to attenuation, the spine is better
visualized on the posterior image.
The dynamic behaviour of the tracer can be measured by acquiring a series of consecutive images.
To study the function of the kidneys, a 99m Tc labeled tracer (in this example ethylene dicysteine,
known as 99m Tc-EC) is injected, and its renal uptake is studied over time. The first two minutes after
injection, 120 images of 1 s are acquired to study the perfusion. Then, about a hundred planar images
are acquired over 40 min to study the clearance. Fig 5.4 and 5.5 show the results in a patient with one
healthy and one poorly functioning kidney.
The study duration should not be too long because of patient comfort. A scanning time of 15 min
or half an hour is reasonable, more is only acceptable if unavoidable. If that time is used to acquire a
single or a few planar images, there will be many photons per pixel resulting in good signal to noise
ratio, but there is no depth information. If the same time is used to acquire many projections for tomo-
graphic reconstruction, then there will be few counts per pixel and more noise, but we can reconstruct
a three-dimensional image of the tracer concentration. The choice depends on the application.
In contrast, planar PET studies are very uncommon, because most PET-cameras are acquiring all
projections simultaneously, so the three-dimensional image can always be reconstructed.
Figure 5.4: A few images from a renography. The numbers are the minutes after injection of the
99m Tc-labeled tracer. The left kidney (at the right in the image) is functioning normally, the other one
is not: both uptake and clearance of the tracer are slowed down.
Figure 5.5: Summing all frames of the dynamic renal study yields the image on the left, where the
kidneys can be easily delineated. The total count in these regions as a function of time (time activity
curves or TACs) are shown on the right, revealing a distinct difference in kidney function.
• The distribution has a finite support: it is zero except within a finite region. We assume this
region is circular (we can always do that, since the distribution need not be non-zero everywhere
within the support). We assume that this region coincides with the field of view of the camera. We
select the center of the field of view as the origin of a two-dimensional coordinate system.
Figure 5.6: A sinogram is obtained by storing the 1D parallel projections as the rows in a matrix (or
image). Left: the camera with a radioactive disk. Center: the corresponding sinogram. Right: the
conventions, θ is positive in clockwise direction.
Figure 5.7: The projection line as a function of a single parameter r. The line consists of the points
\vec{p} + \vec{r}, with \vec{p} = (s cos θ, s sin θ) and \vec{r} = (r sin θ, −r cos θ).
• Projections are continuous: the projection is known for every angle and for every distance from
the origin. Projections are ideal: they are unweighted line integrals (so there is no attenuation
and no scatter, the PSF is a Dirac impulse and there is no noise).
• The distribution is finite everywhere. That should not be a problem in real life.
5.3.1.1 Projection
Such an ideal projection q is then defined as follows (using the conventions of figure 5.6):
q(s, \theta) = \int_{(x,y) \in \text{projection line}} \lambda(x, y) \, dx \, dy    (5.5)

             = \int_{-\infty}^{+\infty} \lambda(s \cos\theta + r \sin\theta, \; s \sin\theta - r \cos\theta) \, dr.    (5.6)
The point (s cos θ, s sin θ) is the point on the projection line closest to the center of the field of view.
By adding (r sin θ, −r cos θ) we take a step r along the projection line (fig 5.7).
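A crude digital version of eq. (5.6) can be written by sampling λ along each line with nearest-neighbour lookup (all names below are mine, and a real implementation would interpolate):

```python
import numpy as np

def project(image, theta):
    """Sample the image along the lines (s cos t + r sin t, s sin t - r cos t)."""
    n = image.shape[0]
    c = (n - 1) / 2.0                      # put the origin in the image centre
    s_vals = np.arange(n) - c
    r_vals = np.arange(-n, n + 1, dtype=float)
    q = np.zeros(n)
    for i, s in enumerate(s_vals):
        x = s * np.cos(theta) + r_vals * np.sin(theta) + c
        y = s * np.sin(theta) - r_vals * np.cos(theta) + c
        xi, yi = np.rint(x).astype(int), np.rint(y).astype(int)
        ok = (xi >= 0) & (xi < n) & (yi >= 0) & (yi < n)
        q[i] = image[yi[ok], xi[ok]].sum()
    return q

# a sinogram is simply one such projection per angle, stored as rows
img = np.zeros((65, 65)); img[20, 40] = 1.0     # a point source
sino = np.array([project(img, th)
                 for th in np.linspace(0, np.pi, 90, endpoint=False)])
```

For a point source, each row of this sinogram contains the activity at a different position s, tracing out the sine curve of figure 5.6.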
So it immediately follows that Λ(νx , 0) = Q(νx ). Formulated for any angle θ this becomes:

\Lambda(\nu \cos\theta, \nu \sin\theta) = Q(\nu, \theta),

where Q(ν, θ) is the 1D Fourier transform of q(s, θ) with respect to s.
A more rigorous proof for any angle θ is given in appendix 12.5, p 143. In words: the 1-D Fourier
transform of the projection acquired for angle θ is identical to a central profile along the same angle
through the 2-D Fourier transform of the original distribution (fig 5.8).
The Fourier theorem directly leads to a reconstruction algorithm by simply digitizing everything:
compute the 1D FFT transform of the projections for every angle, fill the 2D FFT image by inter-
polation between the radial 1D FFT profiles, and compute the inverse 2D FFT to obtain the original
distribution.
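The Fourier theorem can be verified directly with the FFT, at least for θ = 0 (a sketch; the block phantom is arbitrary):

```python
import numpy as np

n = 64
img = np.zeros((n, n))
img[24:40, 28:36] = 1.0              # arbitrary block phantom

q = img.sum(axis=0)                  # projection at theta = 0: integrate along y
Q = np.fft.fft(q)                    # 1D Fourier transform of the projection
Lam = np.fft.fft2(img)               # 2D Fourier transform of the distribution

# the central profile at nu_y = 0 equals the 1D FT of the projection
print(np.allclose(Q, Lam[0, :]))     # True
```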
Obviously, this will only work if we have enough data. Every projection produces a central line
through the origin of the 2D frequency space. We need to have an estimate of the entire frequency
Figure 5.9: Raw PET data, organized as projections (a) or as a sinogram (b). There are typically a
few hundred projections, one for each projection angle, and several tens to a hundred sinograms, one
for each slice through the patient body.
space, so we need central lines over at least 180 degrees (more is fine but redundant). Figure 5.9 shows
clinical raw emission data acquired for tomography. They can be organized either as projections or
as sinograms. There are typically on the order of 100 (SPECT) or a few hundred (PET) projection
angles. The figure shows 9 of them. During a clinical study, the gamma camera automatically rotates
around the patient to acquire projections over 180◦ or 360◦ . Because the spatial resolution deteriorates
rapidly with distance to the collimator, the gamma camera not only rotates, it also moves radially as
close as possible to the patient to optimize the resolution.
As mentioned before, the PET camera consisting of detector rings measures all projection lines
simultaneously, no rotation is required.
5.3.1.3 Backprojection
This Fourier-based reconstruction works, but usually an alternative expression is used, called “filtered
backprojection”. To explain it, we must first define the operation backprojection:
b(x, y) = \int_0^{\pi} q(x \cos\theta + y \sin\theta, \, \theta) \, d\theta

        = \text{Backproj} \left( q(s, \theta) \right).    (5.12)
During backprojection, every projection value is uniformly distributed along its projection line.
Since there is exactly one projection line passing through (x, y) for every angle, the operation involves
integration over all angles. This operation is NOT the inverse of the projection. As we will see later
on, it is the adjoint operator of projection. Figure 5.10 shows the backprojection image using a
logarithmic gray value scale. The backprojection image has no zero pixel values!
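A digital sketch of eq. (5.12), using nearest-bin lookup (the names and sizes are mine), also reproduces the observation that the backprojection of a point source sinogram has no zero pixels:

```python
import numpy as np

def backproject(sino, thetas):
    """Spread every projection value q(s, theta) uniformly along its line."""
    n = sino.shape[1]
    c = (n - 1) / 2.0
    y, x = np.mgrid[0:n, 0:n].astype(float)
    b = np.zeros((n, n))
    for q, th in zip(sino, thetas):
        # signed distance s of each pixel to the central line at this angle
        s = (x - c) * np.cos(th) + (y - c) * np.sin(th)
        idx = np.clip(np.rint(s + c).astype(int), 0, n - 1)
        b += q[idx]
    return b * (np.pi / len(thetas))   # approximate the integral over [0, pi)

# point source sinogram: constant A = 1 on the central s-bin, zero elsewhere
n, n_th = 33, 360
sino = np.zeros((n_th, n)); sino[:, n // 2] = 1.0
b = backproject(sino, np.linspace(0, np.pi, n_th, endpoint=False))
```

The result peaks in the centre and falls off roughly as A/R, as derived below for the point source.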
Let us compute the values of the backprojection image from the projections of a point source. Be-
cause the point is in the origin, the backprojection must be radially symmetrical. The sinogram is zero
Figure 5.10: A point source, its sinogram and the backprojection of the sinogram (logarithmic gray
value scale).
everywhere except for s = 0 (fig 5.10), where it is constant, say A. So in the backprojection we only
have to consider the central projection lines, the other lines contribute nothing to the backprojection
image. Consider the line integral along a circle around the center (that is, the sum of all values on the
circle). The circle intersects every central projection line twice, so the total activity the circle receives
during backprojection is
total circle value = 2 \int_0^{\pi} q(0, \theta) \, d\theta = 2A\pi.    (5.13)
So the line integral over a circle around the origin is a constant. The value in every point of a circle
with radius R is 2Aπ/(2πR) = A/R. In conclusion, starting with an image f (x, y) which is zero
everywhere, except in the center where it equals A, we find:

\text{Backproj} \left( \text{Projection}(f) \right)(x, y) = \frac{A}{\sqrt{x^2 + y^2}}.
The ramp filter is defined as the sequence of a 1D Fourier transform, multiplication with the “ramp” |ν|,
and an inverse 1D Fourier transform. To turn it into a computer program that can be applied to a real
measurement, everything is digitized and the 1D Fourier transform is replaced with the 1D FFT. The ramp filter can also
be implemented in the spatial domain:

q_{filtered}(s, \theta) = q(s, \theta) \otimes h(s),

where ⊗ denotes convolution. This convolution filter h is obtained by taking the inverse Fourier transform
of the ramp filter, which is done in appendix 12.6, p 144. The ramp filter and the corresponding
convolution mask are shown in figure 5.11.
One can regard the ramp filter as a high pass filter which is designed to undo the blurring caused
by backprojecting the projection, where the blurring mask is the point spread function computed in
section 5.3.1.3, p 51.
Mathematically, filtered backprojection is identical to Fourier-based reconstruction, so the same
conditions hold: e.g. we need projections over 180 degrees. Note that after digitization, the algorithms
are no longer identical. The algorithms have different numerical properties and will produce slightly
different reconstructions. Similarly, implementing the ramp filter in the Fourier domain or as a spatial
convolution will produce slightly different results after digitization.
The ramp filter obviously amplifies high frequencies. If the projection data are noisy, the recon-
structed image will suffer from high frequency noise. To reduce that noise, filtered backprojection is
always applied in combination with a smoothing (low pass) filter.
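The frequency-domain implementation of the ramp filter takes only a few lines (a sketch; in practice the projections are zero-padded before filtering to reduce wrap-around of the circular convolution):

```python
import numpy as np

def ramp_filter(sino):
    """Multiply the 1D FFT of every projection row with |nu|, then invert."""
    n = sino.shape[-1]
    ramp = np.abs(np.fft.fftfreq(n))        # the "ramp" |nu|
    return np.real(np.fft.ifft(np.fft.fft(sino, axis=-1) * ramp, axis=-1))
```

As expected from the shape of |ν|, the filter removes the DC component entirely and amplifies high frequencies more than low ones, which is exactly why noisy data need an additional low pass filter.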
Figure 5.11: The ramp filter in the frequency domain (left) and its point spread function (right). The
latter is by definition also the mask needed to compute the filter with a convolution in the spatial
domain.
Attenuation cannot be ignored. E.g. at 140 keV, every 5 cm of tissue eliminates about 50% of
the photons. So in order to apply filtered backprojection, we should first correct for the effect of
attenuation. In order to do so, we have to know the effect. This knowledge can be obtained with a
measurement. Alternatively, we can in some cases obtain a reasonable estimate of that effect from the
emission data only.
In order to measure attenuation, we can use an external radioactive source rotating around the
patient, and use the SPECT or PET system as a CT-camera to do a transmission measurement. If the
external source emits N0 photons along the x-axis, the expected fraction of photons making it through
the patient is:
\frac{N(d)}{N_0} = e^{-\int_a^d \mu(x) \, dx}.    (5.23)
This is identical to the attenuation-factor in the PET-projection, equation (5.4, p 46). So we can correct
for attenuation by multiplying the emission measurement q(d) with the correction factor N0 /N (d).
For SPECT (equation (5.2, p 46)), this is not possible.
In many cases, we can assume that attenuation is approximately constant (with known attenuation
coefficient µ) within the body contour. Often, we can obtain a fair body contour by segmenting a
reconstruction obtained without attenuation correction. In that case, (5.23) can be computed from the
estimated attenuation image.
In 1979, Bellini adapted filtered backprojection for constant SPECT-like attenuation within a known
contour. The inversion of the general attenuated Radon transform (i.e. reconstruction for
SPECT with arbitrary but known attenuation) has been studied extensively, but only in 2000 did Novikov
find a solution. The solution came a bit late, in fact, because by then iterative reconstruction had
replaced analytical reconstruction as the standard approach in SPECT. And in iterative reconstruction,
non-uniform attenuation poses no particular problems.
a significant amount of noise (Poisson noise), the point spread function is not a Dirac impulse and
Compton scatter may contribute significantly to the measurement.
That is why many researchers have been studying iterative algorithms. The nice thing about these
algorithms is that they are not based on mathematical inversion. All they need is a mathematical
model of the acquisition, not its inverse. Such a forward model is much easier to derive and program.
That forward model allows the algorithm to evaluate any reconstruction, and to improve it based on
that evaluation. If well designed, iterative application of the same algorithm should lead to continuous
improvement, until the result is “good enough”.
There are many iterative algorithms, but they are rather similar, so explaining one should be suf-
ficient. In the following, the maximum-likelihood expectation-maximization (ML-EM) algorithm is
discussed, because it is currently by far the most popular one.
The algorithm is based on a Bayesian description of the problem. In addition, it assumes that both
the solution and the measurement are discrete. This is correct in the sense that both the solution and
the measurement are stored in a digital way. However, the true tracer distribution is continuous, so the
underlying assumption is that this distribution can be well described with a discrete representation.
We start by computing what we would expect to measure. We have already done that, it is the
attenuated projection from equations (5.2, p 46) and (5.4). However, we want a discrete version here:
r_i = \sum_{j=1}^{J} c_{ij} \lambda_j, \quad i = 1, \ldots, I.    (5.26)
Here, λj ∈ Λ is the regional activity present in the volume represented by pixel j (since we have
a finite number of pixels, we can identify them with a single index). The value ri is the number of
photons measured in detector position i (i combines the digitized coordinates (d, θ, z), it identifies a
single projection line). The value cij represents the sensitivity of detector i for activity in j. If we
have good collimation, cij is zero everywhere, except for the j that are intersected by projection line
i, so the matrix C, often called the system matrix, is very sparse. This notation is very general, and
allows us e.g. to take into account the finite acceptance angle of the mechanical collimator (which will
increase the fraction of non-zero cij ). If we know the attenuation coefficients, we can include them in
the cij , and so on. Consequently, this approach is valid for SPECT and PET.
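A toy example of the discrete model (5.26), for a 2 × 2 image with four projection lines (two row sums and two column sums; the numbers are arbitrary):

```python
import numpy as np

# system matrix C (I = 4 lines, J = 4 pixels); in realistic sizes it is very sparse
C = np.array([[1, 1, 0, 0],    # line 0: sum of the pixels in image row 0
              [0, 0, 1, 1],    # line 1: sum of the pixels in image row 1
              [1, 0, 1, 0],    # line 2: sum of the pixels in image column 0
              [0, 1, 0, 1]],   # line 3: sum of the pixels in image column 1
             dtype=float)

lam = np.array([3.0, 1.0, 2.0, 4.0])   # activity lambda_j per pixel
r = C @ lam                             # expected counts r_i, eq. (5.26)
print(r)                                # [4. 6. 5. 5.]
```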
We now have for every detector two values: the expected value ri and the measured value qi .
Since we assume that the data are samples from a Poisson distribution, we can compute the likelihood
of measuring qi , if ri photons were expected (see eq. (2.12, p 9)):
p(q_i | r_i) = \frac{e^{-r_i} \, r_i^{q_i}}{q_i!}.    (5.27)
The history of one photon (emission, trajectory, possible interaction with electrons, possible detection)
is independent of that of the other photons, so the overall probability is the product of the individual
ones:
p(Q | \Lambda) = \prod_i \frac{e^{-r_i} \, r_i^{q_i}}{q_i!}.    (5.28)
Obviously, this is going to be a very small number: e.g. p(qi = 15|ri = 15) = 0.1 and smaller
for any other ri . For larger qi , the maximum p-value is even smaller. In a measurement for a single
slice, we have in the order of 10000 detector positions, so the maximum likelihood value may be in
the order of 10−10000 , which is zero in practice. We are sure the solution will be wrong. However, we
hope it will be close enough to the true solution to be useful.
Maximizing (5.28) is equivalent to maximizing its logarithm, since the logarithm is monotonically
increasing. When maximizing over Λ, factors not depending on λj can be ignored, so we will drop
qi ! from the equations. The resulting log-likelihood function is
L(Q | \Lambda) = \sum_i \left( q_i \ln(r_i) - r_i \right)    (5.29)

              = \sum_i \left( q_i \ln \Big( \sum_j c_{ij} \lambda_j \Big) - \sum_j c_{ij} \lambda_j \right).    (5.30)
It turns out that the Hessian (the matrix of second derivatives) is negative definite if the matrix
cij has maximum rank. In practice, this means that the likelihood function has a single maximum,
provided that a sufficient amount of different detector positions i were used.
The maximum is found by setting the partial derivatives to zero:

\frac{\partial L}{\partial \lambda_j} = \sum_i c_{ij} \left( \frac{q_i}{\sum_k c_{ik} \lambda_k} - 1 \right) = 0, \quad \forall j = 1, \ldots, J.    (5.31)
This produces a huge set of equations, and analytical solution is not feasible.
Iterative optimization, such as a gradient ascent algorithm, is a suitable alternative. Starting with
an arbitrary image Λ, the gradient for every λj is computed, and a value proportional to that gradient
is added. Gradient ascent is robust but can be very slow, so more sophisticated algorithms have been
investigated.
A very nice and simple algorithm with guaranteed convergence is the expectation-maximization
(EM) algorithm. Although the resulting algorithm is simple, the underlying theory is not. In the
following we simply state that convergence is proved and only show what the EM algorithm does.
The interested reader is referred to appendix 13.2, p 149 for some notes on convergence.
and similar for b. So if more counts N were measured than the expected ā + b̄, the expected value of
the contributing sources is “corrected” with the same factor. Extension to more than two sources is
straightforward.
The EM algorithm prescribes a two stage procedure (and guarantees that applying it repeatedly leads
to the maximization of both Lx and L). It is an iterative algorithm, which starts from a given image,
called Λold , and computes an improved image Λnew . The two stages are:
1. Compute the function E(Lx (X, Λ)|Q, Λold ). Given Λold and Q, it is impossible to compute
Lx (X, Λ), since we don’t know the values of xij . However, we can calculate its expected value,
using the current estimate Λold . This is called the E-step.
2. Calculate a new estimate of Λ that maximizes the function derived in the first step. This is the
M-step.
The E-step
The E-step yields the following expressions:
E(L_x(X, \Lambda) | Q, \Lambda^{old}) = \sum_i \sum_j \left( n_{ij} \ln(c_{ij} \lambda_j) - c_{ij} \lambda_j \right)    (5.35)

n_{ij} = c_{ij} \lambda_j^{old} \frac{q_i}{\sum_k c_{ik} \lambda_k^{old}}    (5.36)
Equation (5.35) is identical to equation (5.34), except that the unknown x_{ij} values have been replaced
by their expected values n_{ij}. We would expect that n_{ij} equals c_{ij} λ_j^{old}. However, we also know
that the sum over j of the n_{ij} must equal the number of measured photons q_i. This situation is identical
to the problem studied in section 5.3.2.3, p 57, and equation (5.36) is the straightforward extension of
equation (5.32) for multiple sources.
The M-step
In the M-step, we maximize this expression with respect to λj , by setting the partial derivative to
zero:
\frac{\partial}{\partial \lambda_j} E(L_x(X, \Lambda) | Q, \Lambda^{old}) = \sum_i \left( \frac{n_{ij}}{\lambda_j} - c_{ij} \right) = 0    (5.37)
So we find:

\lambda_j = \frac{\sum_i n_{ij}}{\sum_i c_{ij}}    (5.38)
Substitution of equation (5.36) produces the ML-EM algorithm:
\lambda_j^{new} = \frac{\lambda_j^{old}}{\sum_i c_{ij}} \sum_i c_{ij} \frac{q_i}{\sum_k c_{ik} \lambda_k^{old}}    (5.39)
Discussion
This equation has a simple intuitive explanation:
We have seen the continuous version of the backprojection in (5.12, p 51). The digital version
clearly shows that backprojection is the transpose (or the adjoint) of projection¹: the operations
are identical, except that backprojection sums over i, while projection sums over j. Note that if
the projection takes attenuation into account, the backprojection does so too. The backprojection
does not attempt to correct for attenuation (it is not the inverse), it simply applies the same
attenuation factors as in the forward projection.
3. Finally, the backprojected image is normalized and multiplied with the current reconstruction
image. It is clear that if the measured and computed sinograms are identical, the entire oper-
ation has no effect. If the measured projection values are higher than the computed ones, the
reconstruction values tend to get increased.
If the initial image is non-negative, then the final image will be non-negative too, since all factors
are non-negative. This is in most applications a desirable feature, since the amount of radioactivity
cannot be negative.
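Equation (5.39) translates almost literally into code. Below is a toy sketch (the 4-line system is underdetermined, so ML-EM converges to *a* maximum-likelihood image rather than a unique one, and the iteration count is arbitrary):

```python
import numpy as np

def mlem(C, q, n_iter=500):
    """ML-EM iterations of eq. (5.39). C: I x J system matrix, q: measured counts."""
    lam = np.ones(C.shape[1])            # any positive start image
    sens = C.sum(axis=0)                 # sum_i c_ij (sensitivity of pixel j)
    for _ in range(n_iter):
        r = C @ lam                      # forward project the current estimate
        lam = lam / sens * (C.T @ (q / r))   # backproject the ratio, normalise
    return lam

# toy system: two row sums and two column sums of a 2x2 image
C = np.array([[1, 1, 0, 0],
              [0, 0, 1, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1]], dtype=float)
q = np.array([4.0, 6.0, 5.0, 5.0])       # noise-free "measurement"
lam = mlem(C, q)
```

Note how the update is exactly the three steps of the intuitive explanation: forward project, backproject the ratio of measured to computed data, and normalize by the pixel sensitivity; positivity of the start image guarantees positivity of the result.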
Comparing with (5.31, p 57) shows that the ML-algorithm is really a gradient ascent method. It
can be rewritten as
\lambda_j^{new} = \lambda_j + \frac{\lambda_j}{\sum_i c_{ij}} \frac{\partial L}{\partial \lambda_j}.    (5.41)
So the gradient is weighted by the current reconstruction value, which is guaranteed to be positive
(negative radioactivity is meaningless). To make it work, we can start with any non-zero positive
initial image, and iterate until the result is “good enough”. Fig. 5.13 shows the filtered backprojection
and ML-EM reconstructions from the same dataset. (The image shown is not the true maximum
likelihood image, since iterations were stopped early as discussed below).
¹ The adjoint of a matrix is its conjugate transpose. Because the projection coefficients are real values, the adjoint of the
projection matrix is its transpose.
Figure 5.13: Reconstruction obtained with filtered backprojection (top) and maximum-likelihood
expectation-maximization (34 iterations) (bottom). The streak artifacts in the filtered backprojection
image are due to the statistical noise on the measured projection (Poisson noise).
where si represents the estimate of the additive contribution from scatter and/or randoms for projection
line i. It can be shown that the ML-EM algorithm then becomes
\lambda_j^{new} = \frac{\lambda_j^{old}}{\sum_i c_{ij}} \sum_i c_{ij} \frac{q_i}{\sum_k c_{ik} \lambda_k^{old} + s_i}    (5.43)
5.3.4 Regularization
Since the levels of radioactivity must be low, the number of detected photons is also low. As a result,
the uncertainty due to Poisson noise is important: the data are often very noisy.
Note that Poisson noise is “white”: its (spatial) frequency spectrum is flat. Often
this is hard to believe: one would guess it to be high frequency noise, because we see small points, not
blobs or a shift in the mean value. That is because our eyes are very good at picking up high frequencies:
the blobs and the shift are there all right, we just don’t see them.
Filtered backprojection has not been designed to deal with Poisson noise. The noise propagates
into the reconstruction, and in that process it is affected by the ramp filter, so the noise in the recon-
struction is not white, it is truly high frequency noise. As a result, it is definitely not Poisson noise.
Figure 5.13 clearly shows that Poisson noise leads to streak artifacts when filtered backprojection is
used.
ML-EM does “know” about Poisson noise, but that does not allow it to separate the noise from
the signal. In fact, MLEM attempts to compute how many of the detected photons have been emitted
in each of the reconstruction pixels j. That must produce a noisy image, because photon emission is
a Poisson process. What we really would like to know is the tracer concentration, which is not noisy.
Because our brains are not very good at suppressing noise, we need to do it with the computer.
Many techniques exist. One can apply simple smoothing, preferably in three dimensions, such that
Figure 5.14: Simulation designed to challenge the convergence of MLEM: the point between the hot
regions converges very slowly relative to the other point. The maximum converges more slowly than
the width, so counts are not preserved.
resolution is approximately isotropic. It can be shown that for every isotropic 3D smoothing filter
applied after filtered backprojection, there is a 2D smoothing filter, to be applied to the measured data
before filtered backprojection, which has the same effect. Since 2D filtering is faster than 3D filtering,
it is common practice to smooth before filtered backprojection.
The same is not true for ML-EM: one should not smooth the measured projections, since that
would destroy the Poisson nature. Instead, the images are often smoothed afterwards. Another ap-
proach is to stop the iterations before convergence. ML-EM has the remarkable feature that low
frequencies converge faster than high ones, so stopping early has an effect similar to low-pass filter-
ing.
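This low-frequency-first behaviour can be illustrated with a 1D toy problem. In the sketch below the "projection" is simply a Gaussian blur, for which the multiplicative MLEM update reduces to Richardson-Lucy deconvolution; all sizes and values are made up:

```python
import numpy as np

def blur(x, psf):
    return np.convolve(x, psf, mode="same")

# 1D "system": a Gaussian blur plays the role of projection + detector PSF.
t = np.arange(-10, 11)
psf = np.exp(-t**2 / (2 * 3.0**2))
psf /= psf.sum()

truth = np.zeros(200)
truth[60:80] = 10.0          # broad, low-frequency structure
truth[130] = 50.0            # sharp, high-frequency spike
data = blur(truth, psf)      # noise-free measurement, for clarity

# Multiplicative MLEM update (Richardson-Lucy form for a blurring model).
x = np.ones_like(data)
snapshots = {}
for it in range(1, 101):
    ratio = data / np.maximum(blur(x, psf), 1e-12)
    x = x * blur(ratio, psf[::-1])
    if it in (5, 100):
        snapshots[it] = x.copy()

# The broad block is recovered after a few iterations; the spike (high
# frequencies) keeps growing for many more: low frequencies converge first.
print(snapshots[5][130], snapshots[100][130])
```

After 5 iterations the estimate resembles a low-pass filtered image; the spike only approaches its true amplitude after many more iterations.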
Finally, one can go back to the basics, and in particular to the Bayesian expression (5.25, p 55).
Instead of ignoring the prior, one can try and define some prior probability function that encourages
smooth solutions. This leads to the maximum-a-posteriori (MAP) algorithm. Such algorithms can be very powerful, but they also have many parameters, and tuning them is a delicate task.
Although regularized (smoothed) images look much nicer than the original ones, they do not
contain more information. In fact, they contain less information, since low-pass filtering kills high-
frequency information and adds nothing instead! Deleting high spatial frequencies results in poorer
resolution (wider point spread function), so excessive smoothing is ill advised if you hope to see small
structures in the image.
5.3.5 Convergence
Filtered backprojection is a linear algorithm with shift invariant point spread function. That means that
the effect of projection followed by filtered backprojection can be described with a single PSF, and
that the reconstruction has a predictable effect on image resolution. Similarly, smoothing is linear and
shift invariant, so the combined effect of filtered backprojection with low pass filtering has a known
effect on the resolution.
In contrast to FBP, the MLEM algorithm is non-linear and not shift invariant. Stated otherwise, if
Figure 5.15: Reconstruction of a cardiac PET scan with FBP, and with MLEM. The images obtained
after 34, 118 and 1000 iterations are shown. The image at 1000 iterations is very noisy, but a bit of
smoothing turns it into a very useful image.
an object is added to the reconstruction, then the effect of projection followed by reconstruction is
different for all objects in the image. In addition, two identical objects can be reconstructed differently because they are at different positions. In this situation it is impossible to define a PSF: projection followed by MLEM-reconstruction cannot be modeled as a convolution. There is no easy way to predict the effect of MLEM on image resolution.
It is possible to predict the resolution of the true maximum likelihood solution, but you need an
infinite number of iterations to get there. Consequently, if iterations are stopped early, convergence
may be incomplete. There is no way to tell that from the image, unless you knew a-priori what
the image was: MLEM images always look nice. So stopping iterations early in patient studies has
unpredictable consequences.
This is illustrated in a simulation experiment shown in figure 5.14. There are two point sources,
one point source is surrounded by large active objects, the other one is in a cold region. After 20
MLEM iterations, the “free” point source has nearly reached its final value, while the other one is just
beginning to appear. Even after 100 iterations there is still a significant difference. The difference in
convergence affects not only the maximum count, but also the total count in a region around the point
source.
Consequently, it is important to apply “enough” iterations, to get “sufficiently” close to the true
ML solution. In the previous section it was mentioned that stopping iterations early has a noise-
suppressing effect. This is illustrated in figure 5.15. After 34 MLEM iterations a “nice” image is
obtained. After 1000 iterations, the image is very noisy. However, this noise is highly correlated, and
drops dramatically after a bit of smoothing (here with a Gaussian convolution mask).
Figure 5.16: Illustration of OSEM: there are 160 projections in the sinogram, for an acquisition over
360◦ . OSEM used 40 subsets of 4 projections each, the first subset has projections at 0◦ , 90◦ , 180◦ and
270◦ . The second subset starts at 45◦ and so on. One OSEM iteration has 40 subiterations. The top
row shows the first few and the last subiteration of OSEM. The bottom row shows the corresponding
MLEM-iterations, where all projections are used for each iteration.
From figures 5.14 and 5.15 it follows that it is better to apply many iterations and remove the
noise afterwards with post-smoothing. In practice, “many” is at least 30 iterations for typical SPECT
and PET images, but for some applications it can mean several hundreds of iterations. One finds that
when more effects are included in the reconstruction (such as attenuation, collimator blurring, scatter,
randoms...), more iterations are needed. The number also depends on the image size: more iterations are needed for larger images. So if computer time is not a problem, it is recommended to apply really
many iterations (several hundreds) and regularize with a prior or with post-smoothing.
5.3.6 OSEM
In each iteration, MLEM must compute a projection and a backprojection, which is very time con-
suming. FBP is much faster: it uses only a single backprojection, and the additional cpu time for the ramp filtering is small. Consequently, the computation of 30 MLEM iterations takes about 60 times longer than an FBP reconstruction. MLEM was invented by Shepp and Vardi in 1982, but the required cpu time prohibited application in clinical routine. Clinical application only started after 1992, partly because computers had become more powerful, and partly because a very simple acceleration trick
was proposed by Hudson and Larkin. The trick is called “ordered subsets expectation maximisation
(OSEM)”, and the basic idea is to use only a part of the projections in every ML-iteration. This is
illustrated in fig 5.16. In this example, there are 160 projections, which are divided in 40 subsets of
4 projections per subset. The first subset contains the projections acquired at 0◦ , 90◦ , 180◦ and 270◦ .
The second subset has the projections at 45◦ , 135◦ , 225◦ and 315◦ and so on. One OSEM iteration
consists of 40 subiterations. In the first subiteration, the projection and backprojection is restricted to
the first subset, the second subiteration only uses the second subset and so on. After the last subiter-
ation, all projections have been used once. As shown in fig 5.16, 1 OSEM iteration with 40 subsets
produces an image very similar to that obtained with 40 MLEM iterations. Every MLEM iteration
uses all projections, so in this example, OSEM obtains the same results 40 times faster.
A single OSEM iteration can be written as
\[
  \lambda_j^{\mathrm{new}} = \frac{\lambda_j^{\mathrm{old}}}{\sum_{i \in S_k} c_{ij}}
  \sum_{i \in S_k} c_{ij} \frac{q_i}{\sum_{j'} c_{ij'}\, \lambda_{j'}^{\mathrm{old}}} ,
  \qquad (5.44)
\]
where Sk is subset k. Usually, one chooses the subsets such that every projection is in exactly one
subset.
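Transcribed directly, eq. (5.44) is only a few lines of code. The sketch below is purely illustrative: the tiny random system matrix, the interleaved subsets and the iteration count are made up, and the data are noise-free so that all subsets agree:

```python
import numpy as np

def osem_subiteration(lam, C, q, subset):
    """One OSEM subiteration (eq. 5.44): update the image lam using only
    the projection lines i in `subset`. C[i, j] = c_ij, q[i] = counts."""
    Cs, qs = C[subset, :], q[subset]
    sens = Cs.sum(axis=0)                        # sum_{i in Sk} c_ij
    fwd = Cs @ lam                               # sum_j c_ij lam_j
    back = Cs.T @ (qs / np.maximum(fwd, 1e-12))  # backproject the ratios
    return lam / np.maximum(sens, 1e-12) * back

# Tiny illustrative problem: 8 projection lines, 4 pixels, 2 interleaved
# subsets (the same idea as the 40 subsets of fig 5.16).
rng = np.random.default_rng(1)
C = rng.uniform(0.1, 1.0, size=(8, 4))
truth = np.array([1.0, 4.0, 2.0, 3.0])
q = C @ truth                                    # noise-free measurement

lam = np.ones(4)
subsets = [np.arange(0, 8, 2), np.arange(1, 8, 2)]
for _ in range(200):                             # 200 OSEM iterations
    for S in subsets:
        lam = osem_subiteration(lam, C, q, S)
print(lam)   # close to `truth` for these noise-free, consistent data
```

With noisy data the subsets would no longer agree and the algorithm would settle into the limit cycle discussed below.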
The characteristics of OSEM have been carefully studied, both theoretically and in practice. In the
presence of noise, it does not converge to a single solution, but to a so-called limit cycle. The reason
is that the noise in every subset is different and “incompatible” with the noise in the other subsets.
As a result, every subset asks for a (slightly) different solution, and in every subiteration, the subset
in use “pulls” the image towards its own solution. More sophisticated algorithms with better or even
guaranteed convergence have been proposed, but because of its simplicity and good performance in
most cases, OSEM is now by far the most popular ML-algorithm on SPECT and PET systems.
In every iteration, we have to compute a projection and a backprojection along every individ-
ual projection line. As a result, the computational burden may become pretty heavy for a fully 3D
configuration.
Figure 5.17: A single plane from a clinical TOF-data set, sampled at 13 TOF-bins. The first bin corresponds to the center, the subsequent bins are for increasing distances, alternately in both directions.
The non-TOF sinogram was obtained by summing the 13 TOF-sinograms.
Figure 5.18: Left: TOF-PET projection (matrix A) can be considered as a 1D smoothing along the
projection angle. Right: the TOF-PET backprojection of TOF-PET projection is equivalent to a 2D
blurring filter. The blurring is much less than in the non-TOF case (fig 5.10, p 52).
which is the same expression as for non-TOF-PET and SPECT (eq. 5.26, p 56), but of course, the
elements cij of the system matrix are different. The corresponding backprojection is then given by
\[
  b_j = \sum_i c_{ij}\, q_i . \qquad (5.46)
\]
The transpose of a symmetrical blurring operator is the same blurring operator. Consequently, every
TOF-projection image must be blurred with its own 1D blurring kernel, and then they all have to be
Figure 5.19: CT and TOF-PET image, acquired on a PET/CT system with time-of-flight resolution
of about 700 ps. The TOF-PET image reveals a small lesion not seen on the non-TOF PET image.
(Courtesy Joel Karp and Samuel Matej, University of Pennsylvania. The image is acquired on a
Philips Gemini TOF-PET system)
summed to produce the backprojection image. The PSF of projection followed by backprojection then
yields a circularly symmetrical blob with central profile
\[
  b(r) = \frac{1}{|r|}\, \mathrm{Gauss}(r, \sqrt{2}\,\sigma) , \qquad (5.47)
\]
where r is the distance to the center of the blob, σ is the standard deviation of the Gaussian blurring that represents the TOF resolution, and Gauss(r, s) is a Gaussian with standard deviation s, evaluated in r. The standard deviation in (5.47) is √2 σ and not simply σ, because the blurring has been applied twice, once during projection and once during backprojection. See appendix 13.3, p 151 for a more mathematical derivation.
For very large σ, this reduces to b(r) = 1/|r|, which is the result for backprojection in non-TOF
PET, obtained just after eq (5.13, p 52). The projection and backprojection operators are illustrated in
fig 5.18.
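The large-σ limit can be checked with a few lines of code; this sketch assumes Gauss(r, s) is the normalized Gaussian of eq. (5.47), with arbitrary sample radii:

```python
import numpy as np

def gauss(r, s):
    # Normalized Gaussian with standard deviation s, evaluated in r.
    return np.exp(-r**2 / (2 * s**2)) / (np.sqrt(2 * np.pi) * s)

def b(r, sigma):
    # Central profile of TOF projection followed by backprojection (eq. 5.47).
    return gauss(r, np.sqrt(2) * sigma) / np.abs(r)

r = np.array([0.5, 1.0, 2.0])
damping = b(r, sigma=2.0) * np.abs(r)   # good TOF: Gaussian damps larger r
flat = b(r, sigma=1e3) * np.abs(r)      # sigma -> infinity: non-TOF limit
print(flat / flat[0])                   # nearly constant: b(r) ~ 1/|r|
```

For a very large σ the Gaussian factor is essentially constant over the blob, recovering the 1/|r| behaviour of non-TOF PET; for a good TOF resolution the Gaussian suppresses the tails of the blob.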
Also for TOF-PET, filtered backprojection algorithms can be developed. Because of the data re-
dundancy, different versions of FBP can be derived. A natural choice is to use the same backprojector
as mentioned above. With this backprojector, the backprojection of the TOF-PET measurement of a
point source will yield the image of blob given by eq (5.47). Consequently, the reconstruction filter
can be obtained by computing the inverse operator of (5.47). As before, the filter can be applied be-
fore backprojection (in the sinogram) or after backprojection (in the image), because of the Fourier
theorem.
Figure 5.19 compares a TOF and non-TOF image. By discarding the TOF-information, TOF-PET
data can be converted to regular PET data. This was done here to illustrate the improvement obtained
by using TOF. There are several reasons why the TOF-PET image is better.
First, because more information is used during reconstruction, the noise propagation is reduced,
resulting in a better signal-to-noise ratio (less noise for the same resolution) in the TOF image. It
has been shown that, for the center of a uniform cylinder, improving the TOF resolution by some factor reduces the variance in the reconstructed image by that same factor, at least if these images have the same spatial resolution.
Second, it turns out that the convergence of the MLEM algorithm is considerably faster for TOF-
PET. It is known that MLEM converges faster for smaller images, because the correct reconstruction
of a pixel depends on the reconstruction of all other pixels that share some projection lines with
it. The TOF information reduces that number of information-sharing pixels considerably. Because
in clinical routine, usually a relatively low number of iterations is applied, small image detail (high
spatial frequencies) may suffer from incomplete convergence. It was found that after introducing TOF,
the spatial resolution of the clinical images was better than before. TOF has no direct effect on the
spatial resolution, but for a similar number of iterations, MLEM converges faster, producing a better
resolution. And because of the improved SNR, it can achieve this better resolution at a similar or
better noise level than non-TOF PET.
Third, an additional advantage of this reduced dependency on other pixels is a reduced sensi-
tivity to most things that may cause artifacts, such as errors in the attenuation map, patient motion,
data truncation and so on. In non-TOF PET, the resulting artifacts can propagate through the entire
image. In TOF-PET, that propagation is reduced because of the reduced dependencies between the
reconstructed pixel values.
Finally, the gain due to TOF is higher when the region of interest is surrounded by more activity
and if the active object (i.e. the patient body) is larger. The reason is that in those cases, there is
more to be gained by suppressing sensitivity to surrounding activity, since there is more surrounding
activity. As a result, TOF helps more in situations where non-TOF MLEM performs poorest. The
opposite is also true: for imaging a point source, TOF-PET does no better than non-TOF PET.
In section 4.2.1, p 23, we have seen that gamma cameras suffer from position dependent collimator
blurring. As illustrated in figure 5.20, this can be modeled by computing the projection not as a line
integral, but as the integral over a cone, using appropriate weights to model the position dependent
PSF. This position dependent blurring is then combined with the attenuation to obtain the system
matrix elements cij . The same can be done for PET: the blurring due to the detector width (and if
desired, also the positron range and/or the acolinearity) can be incorporated in the PET or TOF-PET
system matrix elements.
During the iterations, MLEM (or OSEM) uses the system matrix in every projection and back-
projection, and by doing so, it will attempt to invert all effects modeled by that system matrix. If the
system matrix models a blurring effect, then MLEM will automatically attempt to undo that blurring.
This produces images with improved resolution as illustrated for a SPECT and a PET case in figure
5.21. Interestingly, modeling the finite system resolution in MLEM not only improves the resolution
in the reconstructed image, it also suppresses the noise, as can be seen in the PET reconstruction.
Figure 5.21: Reconstructions with and without resolution modeling from a SPECT bone scan of a
child and a brain PET scan. “MIP” denotes “maximum intensity projection”. MLEM with resolution
modeling recovers image detail that was suppressed by the finite system resolution.
However, the deblurring cannot be 100% successful, because deblurring is a very ill-posed prob-
lem. Except in special cases, deblurring problems have multiple solutions, and iterative algorithms
cannot “know” which of those possible solutions they should produce. This is illustrated in figure
5.22. A noise-free PET simulation was done, accounting for finite spatial resolution. MLEM images
without and with resolution modeling were reconstructed from the noise-free sinogram. The figure
shows that MLEM with resolution modeling produced a sharper image, but that image is different
from the true image that was used in the simulation.
The problem is that the blurring during the projection not only suppresses some information, it
also erases some information. MLEM can recover image details that have been suppressed but are
still present in the data, but no algorithm can recover details that have been erased from the data. This
can be best explained in the frequency domain.
Figure 5.22: PET simulation without noise but with finite resolution modeling. MLEM with deblur-
ring produced sharper images, but they are clearly different from the ground truth.
Figure 5.23 shows a simple 1D thought experiment. In this experiment, we consider the PSF and
its frequency spectrum. The frequency spectrum of the PSF is often called the modulation transfer
function (MTF), because it tells us how the PSF of a system affects (modulates) all frequency com-
ponents of a signal which is transferred through that system. Assume we are doing measurements
on something that can be well represented by a 1D profile. The true profile has a single non-zero
element. A measurement of that profile produces a blurred version of the true profile. We assume that
the measurement device has a Gaussian PSF. The Fourier transform of a Gaussian is a Gaussian, so
the Gaussian PSF corresponds to a Gaussian MTF; denoting the Fourier transform with F, we have:
\[
  \mathcal{F}\left[ \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{x^2}{2\sigma^2}} \right]
  = e^{-2\pi^2 \sigma^2 f^2} . \qquad (5.48)
\]
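This Fourier pair is easy to check numerically. The sketch below approximates the continuous transform by a discrete one; the grid spacing and array size are arbitrary choices:

```python
import numpy as np

# Numerical check of eq. (5.48): the continuous Fourier transform of a
# normalized Gaussian with std sigma is exp(-2 pi^2 sigma^2 f^2).
sigma, dx, n = 2.0, 0.05, 8192
x = (np.arange(n) - n // 2) * dx
g = np.exp(-x**2 / (2 * sigma**2)) / (np.sqrt(2 * np.pi) * sigma)

f = np.fft.fftshift(np.fft.fftfreq(n, d=dx))
# Discrete approximation of the continuous transform (Riemann sum).
G = np.abs(np.fft.fftshift(np.fft.fft(np.fft.ifftshift(g)))) * dx

expected = np.exp(-2 * np.pi**2 * sigma**2 * f**2)
err = np.max(np.abs(G - expected))
print(err)   # tiny: the Fourier pair of eq. (5.48) holds numerically
```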
The wider the Gaussian PSF, the narrower the corresponding Gaussian MTF, and the larger the fraction of frequencies that are suppressed so much that they will be lost in the noise.
Figure 5.23: Impulse responses (top) and their corresponding frequency spectra or MTF (bottom). From left to right: the ideal PSF (a flat MTF), a Gaussian PSF (Gaussian MTF) and a sinc-shaped PSF, corresponding to a rectangular MTF.
Now imagine that we
apply some deblurring algorithm, which restores the frequency components that are still present in
the data. We also assume that it does not “invent” data, but assumes that things not seen by the
measurement should be set to zero. That will result in an approximately rectangular MTF. This is illustrated in figure 5.23, where we assumed that all frequencies below 0.16 could still be restored,
whereas the higher frequencies were filtered away by the measurement PSF. The inverse Fourier
transform of a rectangular function is a sinc function (sinc(x) = sin(x)/x), as can easily be verified:
\[
\begin{aligned}
  \mathcal{F}^{-1}\left(\mathrm{rectangle}_{f_c}(f)\right)
    &= \int_{-\infty}^{\infty} \mathrm{rectangle}_{f_c}(f)\, e^{2j\pi f x}\, df \\
    &= \int_{-f_c}^{f_c} e^{2j\pi f x}\, df \\
    &= \frac{e^{2j\pi f_c x} - e^{-2j\pi f_c x}}{2j\pi x} \\
    &= \frac{2j \sin(2\pi f_c x)}{2j\pi x} \\
    &= \frac{\sin(2\pi f_c x)}{\pi x} , \qquad (5.49)
\end{aligned}
\]
where we used e^(jx) = cos(x) + j sin(x), and j = √−1. Figure 5.23 shows a sinc function (to plot the function, make use of lim_{x→0} sin(x)/x = 1). The PSF will oscillate at the cutoff frequency, so the wider the rectangular MTF, the narrower the central peak of the PSF and the narrower the ripples surrounding it.
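This inverse transform can be verified numerically as well; the sketch below compares a discrete inverse Fourier transform of a rectangular MTF with the sinc expression of eq. (5.49) (cutoff and grid spacing are arbitrary):

```python
import numpy as np

# Numerical check of eq. (5.49): the inverse Fourier transform of a
# rectangular MTF with cutoff fc is sin(2 pi fc x) / (pi x).
fc, df, n = 0.16, 0.001, 4096
f = (np.arange(n) - n // 2) * df
rect = (np.abs(f) <= fc).astype(float)

x = np.fft.fftshift(np.fft.fftfreq(n, d=df))
psf = np.real(np.fft.fftshift(np.fft.ifft(np.fft.ifftshift(rect)))) * n * df

# np.sinc(t) = sin(pi t)/(pi t), so 2*fc*np.sinc(2*fc*x) = sin(2 pi fc x)/(pi x)
expected = 2 * fc * np.sinc(2 * fc * x)
err = np.max(np.abs(psf - expected))
print(err)   # small residual from discretization
```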
Knowing the PSF, we can compute what the deblurred measurement will look like for any other
signal, by convolving that signal with the PSF. Figure 5.24 shows the result for a block-shaped signal.
The result is still block-like, but the two edges are less steep and are accompanied by ripples creating
under- and overshoots. These ripples are often called Gibbs artefacts.
Figure 5.24: The convolution of a block-shaped signal with a sinc-shaped PSF produces a smoother
version of the block with ripples, also called Gibbs artefacts, on top of it.
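The effect of fig 5.24 is easy to reproduce numerically. A small sketch (cutoff frequency and sampling are arbitrary choices):

```python
import numpy as np

# Convolving a block signal with a sinc-shaped PSF (rectangular MTF with
# cutoff fc): the result shows Gibbs over- and undershoots at the edges.
fc, dx, n = 0.16, 0.25, 1024
x = (np.arange(n) - n // 2) * dx
block = ((x > -20) & (x < 20)).astype(float)

psf = 2 * fc * np.sinc(2 * fc * x) * dx    # sampled sin(2 pi fc x)/(pi x)
out = np.convolve(block, psf, mode="same")

overshoot = out.max() - 1.0                # Gibbs overshoot above the plateau
undershoot = out.min()                     # matching undershoot below zero
print(overshoot, undershoot)
```

The overshoot near each edge is close to the classic Gibbs value of about 9% of the step height, independent of the cutoff frequency.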
MLEM with resolution modeling combines reconstruction with deblurring for the system PSF,
which is more complicated than simple deblurring. Nevertheless, when the system PSF is modeled
during MLEM, very similar Gibbs artefacts are created, because MLEM cannot restore the higher
frequencies that were lost due to that PSF. Because now the blurring and deblurring are done in
three dimensions, the Gibbs artefacts ripple in 3D as well. The effects in different dimensions can accumulate, and as a result, the over- and undershoots can become very large. For small hot objects, for example, the overshoot can go as high as 100%. That is unfortunate: such high overshoots can cause
problems for quantitative analysis of the tracer uptake in small hot lesions, as is often done in PET for
oncology. Therefore, depending on the imaging task, it may be necessary to smooth the image a bit
to avoid quantification errors due to these Gibbs effects.
Chapter 6
The transmission scan
CHAPTER 6. THE TRANSMISSION SCAN 73
Figure 6.1: Scanning transmission source in a gamma camera. An electronic window is synchronized
with the source, improving separation of transmission and emission counts.
Figure 6.2: Rotating transmission source in PET. As a reference, a blank scan is acquired daily.
lines involving a detector close to the source) are used, which reduces the emission contamination by a factor of about 40. The remaining contamination can be estimated from an emission sinogram
acquired immediately before or after the transmission scan.
For every projection line, we also need to know the number of photons emitted by the source
(see eq. (5.22, p 53) and (5.23)). These values are measured during a blank scan. The blank scan is
identical to the transmission scan, except that there is nothing in the field of view. Blank scans can be acquired overnight, so they do not increase the study duration. Long acquisition times can be used,
to reduce the noise in the blank scan.
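Combining a transmission scan with its blank scan yields, for every projection line, the line integral of the attenuation coefficient (cf. eqs (5.22) and (5.23)). A minimal sketch, with an illustrative function name and made-up numbers:

```python
import numpy as np

def attenuation_line_integrals(blank, trans):
    """Line integral of mu along every projection line i, from blank-scan
    counts b_i and transmission counts t_i: t_i = b_i exp(-int mu), hence
    int mu = ln(b_i / t_i)."""
    blank = np.asarray(blank, dtype=float)
    trans = np.asarray(trans, dtype=float)
    return np.log(blank / np.maximum(trans, 1e-12))

# Illustrative numbers: a low-noise blank scan, and transmission through
# 0, 10 and 20 cm of water-like tissue (mu ~ 0.096/cm at 511 keV).
blank = np.array([1e5, 1e5, 1e5])
trans = blank * np.exp(-np.array([0.0, 0.96, 1.92]))
li = attenuation_line_integrals(blank, trans)
print(li)   # recovers 0, 0.96 and 1.92
```

In practice the transmission counts are noisy, which is why the logarithm amplifies the noise for strongly attenuated lines; this motivates the iterative reconstruction algorithms discussed below.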
Figure 6.3: Left: emission PET projections. Right: transmission projections of the same patient.
as well. However, there is an important difference. In emission tomography, the raw data are a
weighted sum of the unknown reconstruction values. In transmission tomography, the raw data are
proportional to the exponential of such a weighted sum. As a result of this difference, the MLEM algorithm cannot be applied directly, and several new algorithms have been presented in the literature.
Similarly as in the emission case, one can use a non-trivial prior distribution for the reconstruction
images. In that case, the reconstruction is called a maximum-a-posteriori algorithm or MAP-algorithm
(see section 5.3.2.1, p 55).
The prior p(Λ) is the probability we assign to the image Λ, independently of any measurement.
To suppress the noise, the prior probability function is designed such that it takes a higher value for
smoother images. This is typically done by computing the sum of all squared differences between
neighboring pixels, but many other functions have been invented as well. Such functions have been
successfully used, both for emission and transmission tomography. In the case of transmission to-
mography, we also have prior knowledge about the image values: the linear attenuation coefficient
of human tissues is known. Thus, it makes sense to design prior functions that return a higher value
when the pixel values are closer to the typical attenuation values encountered in the human body.
Figure 6.4 shows the reconstructions of a 10 min transmission scan. The same sinogram has been
reconstructed with filtered backprojection, ML and MAP. Because the scan time is fairly long, image
quality is reasonable for all algorithms.
Figure 6.5 shows the reconstructions of a 1 min transmission scan obtained with the same three
algorithms. In this short scan, the noise is prominent. As a result, streak artifacts show up in the
filtered backprojection image. The ML-image produces non-correlated noise with high amplitude. As
argued in section 5.3.4, p 60, this can be expected, since the true number of photons attenuated during
the experiment in every pixel is subject to statistical noise. And if that number is small, the relative
noise amplitude is large. The MAP-reconstruction is much smoother, because the prior assigns a low
probability to noisy solutions. This image is a compromise between what we know from the data and
what we (believe to) know a priori.
Transmission scanning has never become very popular in SPECT, because the physicians felt that
their diagnosis was not adversely affected by the attenuation-artifacts (i.e. artifacts caused by ignoring
the attenuation) and because it increases the cost of the gamma camera. In PET, those artifacts tend
to be more severe, and transmission scanning has been provided and used in almost all commercial
PET systems. However, since the introduction of PET/CT, the CT image has been used for attenuation
Figure 6.4: Reconstruction of a PET transmission scan of 10 min. Left: filtered backprojection;
Center: MLTR; Right: MAP (Maximum a posteriori reconstruction).
Figure 6.5: Reconstruction of a PET transmission scan of 1 min. Left: filtered backprojection;
Center: MLTR; Right: MAP (Maximum a posteriori reconstruction).
correction, and most PET/CT systems no longer have the hardware for PET transmission scanning.
where s and d are the position of the source and the detector on the projection line corresponding to
detector element i.
Figure 6.6: A CT system consists of a collimated X-ray tube and a detector mounted on a rotating
gantry. The detector has tens to hundreds of rows, each consisting of about a thousand detector
elements.
When the line integrals (6.3) are available, the attenuation image µ can be reconstructed, either
analytically (with a filtered backprojection algorithm) or iteratively (e.g. with a maximum likelihood
algorithm). The pixel values of the reconstructed image would then represent (approximately) the
attenuation of the tissues at the average energy of the X-ray photons (typically around 70 keV). To
make the image independent of the chosen energy of the X-ray tube, the image values are converted
to so-called Hounsfield units (HU) as follows:
\[
  \text{HU-value}(x, y, z) = 1000\, \frac{\mu(x, y, z) - \mu_{\mathrm{water}}}{\mu_{\mathrm{water}}} , \qquad (6.4)
\]
where (x, y, z) represents the coordinate of the considered voxel in the CT image. Consequently, the
attenuation of vacuum corresponds to -1000 HU, and the attenuation of water to 0 HU.
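Eq. (6.4) in code form: a trivial sketch that checks these two reference points; the numeric water coefficient is the approximate ~70 keV value quoted later in this chapter:

```python
def hounsfield(mu, mu_water):
    """Convert a linear attenuation coefficient to Hounsfield units (eq. 6.4)."""
    return 1000.0 * (mu - mu_water) / mu_water

MU_WATER_70 = 0.19   # water at ~70 keV, in 1/cm (value quoted in this chapter)
print(hounsfield(0.0, MU_WATER_70))          # vacuum: -1000.0 HU
print(hounsfield(MU_WATER_70, MU_WATER_70))  # water: 0.0 HU
```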
Figure 6.7: A PET/CT system consists of a PET and a CT system, sharing the same patient table.
at subsequent bed positions, where each acquisition provides an image volume extending over 10 to
15 cm. At 2 to 4 minutes per bed position this requires about 15 to 30 min, so the whole examination
would require about 20 to 35 min. Assuming that the patient does not move during the entire proce-
dure, the two images are perfectly aligned, and the physicians can see the function and anatomy
at the same time, as illustrated in fig 6.8.
It has been shown that this combined imaging improves the diagnostic procedure considerably,
in particular in oncology, and the PET/CT system has been accepted with great enthusiasm. All
PET-manufacturers offer PET/CT systems, and since 2006 it has even become virtually impossible to
buy a PET-system without CT. Also SPECT-CT systems are being offered now, although they are not
nearly as successful (yet?) as the PET/CT systems.
Figure 6.8: Images from a PET/CT system, allowing simultaneous inspection of anatomy and func-
tion.
The CT produces radiation by bombarding an anode with electrons in a vacuum tube. The elec-
trons are accelerated with an electric field, which can usually be adjusted in a range of about 80 to 140
kV, so they acquire a kinetic energy of 80 to 140 keV. They hit the anode where they are stopped. Part
of their energy is dissipated as Bremsstrahlung, which produces a continuous spectrum. In addition,
the electrons may also knock anode electrons out of their shell (e.g. the K-shell). A higher shell elec-
tron (e.g. from the L-shell) will then move to the vacant lower energy state, producing characteristic radiation with energy E_L − E_K. So if the CT voltage is set to 140 kV, the CT-tube produces radiation with a
spectrum between 0 and 140 keV (fig 6.9).
Human tissue consists mostly of very light atoms. For attenuation correction, these tissues can be
well approximated as mixtures of air or vacuum (no attenuation) and water. The attenuation of water
is about 0.096/cm at 511 keV, and about 0.19/cm at 70 keV, which is the “effective” energy of the CT-
spectrum. Assuming that everything in the human body is a mixture of water and vacuum, we only
need a single scale factor equal to 0.096 / 0.19. But of course, there is also bone. One can apply the
same trick with good approximation: we regard everything denser than water as a mixture between
water and bone. Dense bone has an attenuation of about 0.17/cm at 511 keV, and about 0.44/cm at 70
keV. This yields the piecewise linear conversion function shown in fig 6.10.
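The piecewise linear conversion can be sketched as follows, using only the attenuation coefficients quoted above; actual PET/CT software uses vendor-specific breakpoints, so this is purely illustrative:

```python
# Piecewise linear conversion of attenuation from ~70 keV (CT) to 511 keV
# (PET), cf. fig 6.10, using the coefficients quoted in the text (in 1/cm).
MU_WATER_70, MU_WATER_511 = 0.19, 0.096
MU_BONE_70, MU_BONE_511 = 0.44, 0.17

def mu_511(mu_70):
    if mu_70 <= MU_WATER_70:
        # air/water mixture: simple scaling
        return mu_70 * MU_WATER_511 / MU_WATER_70
    # water/bone mixture: interpolate between water and dense bone
    frac = (mu_70 - MU_WATER_70) / (MU_BONE_70 - MU_WATER_70)
    return MU_WATER_511 + frac * (MU_BONE_511 - MU_WATER_511)

print(mu_511(0.0), mu_511(0.19), mu_511(0.44))  # maps to 0, 0.096 and 0.17
```

The slope above the water breakpoint is smaller than below it, reflecting the fact that the photo-electric contribution of bone at 70 keV largely disappears at 511 keV.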
This scaling works reasonably well, except if a contrast agent is used during a diagnostic CT-scan.
Contrast agents are used to improve the diagnostic value of the CT-image. Often a contrast agent is
injected intravenously to obtain a better outline of well perfused regions. This is illustrated in fig 6.11,
which compares a CT image acquired with a lower radiation dose and no contrast, to an image obtained with a higher radiation dose and intravenous contrast. Other contrast agents have to be taken orally, and
are used to obtain a better visualization of the digestive tract. These contrast agents usually contain
iodine, which has a very high attenuation near 70 keV due to the photoelectric effect (which is why it is effective for CT-imaging), but its attenuation is hardly higher than that of water at 511 keV (mostly
Figure 6.9: Typical CT spectrum with continuous Bremsstrahlung and a few characteristic radiation
peaks.
Figure 6.10: Piecewise linear conversion typically used in PET/CT software to convert attenuation coefficients from 70 keV to 511 keV.
Figure 6.11: Left: CT image, acquisition with 30 mAs and no contrast. Right: CT image, acquisition
with 140 mAs and intravenously injected contrast agent.
Compton interactions). As a result, the piecewise conversion produces an overestimation of the PET-
attenuation in regions with significant contrast uptake. This matter is currently being investigated; no
standard solution exists.
A related problem is that of metal artifacts. Because the attenuation of metal prostheses or dental
fillings is extremely high at 70 keV, such objects can attenuate almost all photons. The number
of acquired photons is in the denominator of equation (5.22, p 53), so the CT-reconstruction gets
into problems when the number of acquired photons goes to zero. This numerical instability can
produce artifacts in the CT-reconstruction, which will propagate into the PET image via the CT-based
attenuation correction. (Partial) solutions exist, but currently, they are not yet included in commercial
CT-software.
formation (6 degrees of freedom) required to accurately align the images. The same transformation
is then applied to align the patient images. This transformation usually remains valid until something
changes, which is typically the case when something must be repaired (X-ray tube replacement, PET
detector replacement etc.). For such actions, the two systems are pulled apart. After combining them
again, the spatial alignment could have changed by a few mm, so a new calibration is then required.
Figure 6.12: PET/CT attenuation artifact due to breathing. The tumor is really located in the liver,
but the mismatch with the CT and the resulting attenuation correction errors make it show up in the
lung. This figure is from a paper by Sarikaya, Yeung, Erdi and Larson, Clinical Nuclear Medicine, 2003; 11: 943.
diagnosis. Such an artifact is present in the brain in fig 6.13.C, which shows a left-right asymmetry
caused by a motion artifact.
The correction of these artifacts is a current research topic; no standard solutions exist.
Figure 6.13: Coronal slice of a whole body PET/CT study reconstructed without (A) and with (C)
attenuation correction based on a whole body CT (B). PET and CT are combined in a fusion image
(D). The relative intensity of the subcutaneous metastasis (small arrow) compared to the primary
tumor (large arrow) is much higher in the non corrected image than in the corrected one, because
the activity in this peripheral lesion is much less attenuated than the activity in the primary tumor. A
striking artifact in (A) is the apparent high uptake in the skin and the lungs. Note also that regions of
homogeneous uptake, such as the heart (thick arrow), are no longer homogeneous, but show a gradient.
The uptake in the left side of the brain (dotted arrow) is apparently lower than in the contralateral
one in (C). The fusion image shows that the head did move between the acquisition of the CT and the
emission data, resulting in an apparent decrease in activity in the left side of the brain due to errors in
the attenuation correction.
Chapter 7
System maintenance
The photomultiplier tubes are analog devices, with characteristics that are influenced by the age and
the history of the device. To produce correct images, some corrections are applied (section 4.5, p
37). The most important are the linearity correction and the energy correction. In most cameras, there
is an additional correction, called the uniformity correction, taking care of remaining second order
deviations. It is important to keep an eye on the camera performance and tune it every now and then.
Failure to do so will lead to gradual deterioration of image quality.
When a camera is first installed, the client tests it extensively to verify that the system meets
all required specifications (the acceptance test). The National Electrical Manufacturers Association
(NEMA; “national” refers to the USA here) has defined standard protocols to measure the specifications. The
standards are widely accepted, such that specifications from different vendors can be compared, and
that discussions between customers and companies about acceptance test procedures can be avoided.
You can find information on http://www.nema.org.
This chapter gives an overview of tests which provide valuable information on camera perfor-
mance. Many of these tests can be done quantitatively: they provide a number. It is very useful to
store the numbers and plot them as a function of time: this helps detect problems early, since gradual
deterioration of performance is visible on the curve even before the performance deteriorates
beyond the specifications.
Quality control can sometimes be tedious, and it is not productive in the short term: it costs time,
and if you find an error, it also costs money. Most of the time, the camera is working well, so if you
assume that the camera is working fine, you are probably right and you save time. This is why a quality
control program needs continuous monitoring: if you don’t insist on quality control measurements,
the QC-program will silently die, and image quality will slowly deteriorate.
CHAPTER 7. SYSTEM MAINTENANCE 84
linearity correction is applied, the image is uniform, except for Poisson noise. Uniformity correction
produces a marginal improvement which is hardly visible.
For most properties, specifications are supplied for the UFOV (usable field of view) and the CFOV
(central field of view). The reason is that performance near the edges is always slightly inferior to
performance in the center. Near the edges, the scintillation point is not nicely symmetrically sur-
rounded by (nearly) identical PMTs. As a result, deviations are larger and much stronger corrections
are needed, unavoidably leading to some degradation. The width of the CFOV is 75% of the UFOV
width, both in x and y direction.
Quality control typically involves
The scheme can be adapted according to the strengths and weaknesses of the systems.
Figure 7.2: Uniform image acquired on a gamma camera with a dead photomultiplier.
where the regional uniformity is computed by applying (7.1) to a small line interval containing only
5 pixels, and this for all possible vertical and horizontal line intervals. The differential uniformity is
always a bit smaller than the integral uniformity, and insensitive to gradual changes in image intensity.
In an image of 64 × 64 with 10000 counts per pixel, the integral uniformity due to Poisson noise
only is about 4%; typical specifications are a bit larger, e.g. 4.5%.
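The claim that Poisson noise alone already produces about 4% integral uniformity can be checked with a short simulation. The sketch below (Python; it assumes the NEMA-style definition (max − min)/(max + min) × 100% for equation (7.1)) generates a perfectly uniform 64 × 64 flood image with 10000 counts per pixel, so that any non-uniformity is pure noise:

```python
import numpy as np

def integral_uniformity(image):
    """NEMA-style integral uniformity: (max - min) / (max + min) * 100%."""
    return 100.0 * (image.max() - image.min()) / (image.max() + image.min())

# Simulate a perfectly uniform flood image: only Poisson noise remains.
rng = np.random.default_rng(0)
flood = rng.poisson(lam=10000, size=(64, 64)).astype(float)

print(f"integral uniformity: {integral_uniformity(flood):.1f}%")  # around 4%
```

With 4096 pixels the expected extreme deviations are roughly ±4 standard deviations (±400 counts on a mean of 10000), which indeed gives an integral uniformity near 4%.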
To acquire a uniformity correction matrix, typically 45000 counts per pixel are acquired. It is
important that the camera is in good shape when a uniformity correction matrix is produced. Figure
7.3 shows a flood source image acquired on a camera with a linearity correction problem. The linearity
problem causes deformations in the image: counts are mispositioned. In a uniform image this leads to
non-uniformities, in a line source image to deformations of the line. If we now acquire a uniformity
correction matrix, the corrected flood image will of course be uniform. However, the line source
image will still be deformed, and in addition, applying the correction matrix will make the intensities
along the deformed line even worse! If uniformity is poor after linearity and energy correction, do not fix
it with uniformity correction. Instead, try to figure out what happens or call the service people from
the company.
The uniformity test is sensitive to most of the things that can go wrong with a camera, but not all.
One problem that remains undetected is a change of the pixel size, which stretches or shrinks the image
in the x or y direction.
Figure 7.3: Flood source (= uniform sheet source) image acquired on a dual head gamma camera,
with a linearity correction problem in head 1 (left)
Figure 7.4: Images of a dot phantom acquired on a dual head camera with a gain problem. Left:
image acquired on head 1. Center: image simultaneously acquired on head 2. Right: superimposed
images: head1 + mirror image of head2. The superimposed image is blurred because the heads have
a different pixel size.
Figure 7.4 shows an example where the y-amplifier gain is wrong for one of the heads of a dual
head camera (the x-gain is better but not perfect either). The error is found from this dot phantom
measurement by superimposing both images. Since the images are recorded simultaneously, the dots
from one head should fit those from the other. Because of the invalid y-pixel size in one of the heads
there is position dependent axial blurring (i.e. along the y-axis), which is worse for larger y values.
It is useful to verify if the pixel size is independent of the energy. This can be done with a
67 Ga point source, which emits photons of 93, 184 and 296 keV. Using three appropriate energy
windows, three point source images are obtained. The points should coincide when the images are
superimposed. Repeat the measurement at a few different positions (or use multiple point sources).
Usually, the PSF can be well approximated as a Gaussian curve. The LSF is then easy to compute:
LSF(x) = \int_{-\infty}^{+\infty} \frac{1}{2\pi\sigma^2} \, e^{-\frac{x^2+y^2}{2\sigma^2}} \, dy
       = \frac{1}{\sqrt{2\pi}\,\sigma} \, e^{-\frac{x^2}{2\sigma^2}}.   (7.4)
Figure 7.5: (Simulated) energy spectra of Cobalt (57 Co, 122 keV) and technetium (99m Tc, 140 keV).
In these equations, we have assumed that the line coincides with the y-axis. So it is reasonable to
assume that the FWHM of the LSF is the same as that of the PSF. On a single LSF, several independent
FWHM measurements can be done and averaged to improve accuracy. Alternatively, if the line is
positioned nicely in the x or y direction, one can first compute an average one-dimensional profile by
summing image columns or rows and use that for LSF computations. Again, measurements for x and
y must be made because the resolution can become anisotropic.
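In practice, the FWHM of a measured LSF profile can be estimated by locating the two half-maximum crossings with linear interpolation between samples. The Python sketch below is one possible implementation (the function name and the sampled Gaussian used as a check are only illustrative); for a Gaussian, the result should match the analytic relation FWHM = 2√(2 ln 2) σ ≈ 2.355 σ:

```python
import numpy as np

def fwhm(profile, pixel_size=1.0):
    """Estimate the FWHM of a peaked 1-D profile by linear interpolation
    of the two half-maximum crossings around the peak."""
    p = np.asarray(profile, dtype=float)
    half = p.max() / 2.0
    peak = p.argmax()
    # walk left from the peak to the first sample below half maximum
    i = peak
    while p[i] > half:
        i -= 1
    left = i + (half - p[i]) / (p[i + 1] - p[i])
    # walk right from the peak to the first sample below half maximum
    j = peak
    while p[j] > half:
        j += 1
    right = j - (half - p[j]) / (p[j - 1] - p[j])
    return (right - left) * pixel_size

# Check against a sampled Gaussian: FWHM = 2*sqrt(2*ln 2)*sigma ≈ 2.355*sigma
x = np.arange(-50, 51, dtype=float)
sigma = 6.0
lsf = np.exp(-x**2 / (2 * sigma**2))
print(fwhm(lsf))   # close to 2.355 * 6 ≈ 14.13
```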
7.1.1.5 Linearity
The linearity is measured with a line source phantom as shown in figure 7.6. This can be obtained in
several ways. One is to use a lead sheet with long slits of 1 mm width, positioned at regular distances
(e.g. 3 cm between the slits). Remove the collimator (to avoid interference between the slits and the
collimator septa, possibly resulting in beautiful but useless Moiré patterns), put the lead sheet on the
camera (very carefully, the fragile crystal is easily damaged and very expensive!) and irradiate with a
uniform phantom or point source at large distance.
NEMA states that integral linearity is defined as the maximum distance between the measured
line and the true line. The true line is obtained by fitting the known true lines configuration to the
image.
The NEMA definition for the differential linearity may seem a bit strange. The procedure prescribes
computing several image profiles perpendicular to the lines. From each profile, the distances between
consecutive lines are measured, and the standard deviation of these distances is computed.
Figure 7.6: Image of a line phantom acquired on a gamma camera with poor linearity correction.
The maximum standard deviation is the differential linearity. This procedure is simple, but not 100%
reproducible.
With current computers, it is not difficult to implement a better and more reproducible procedure.
However, a good standard must not only produce a useful value, it must also be simple. If not,
people may be reluctant to accept it, and if they do, they might make programming errors when
implementing it. At the time the NEMA standards were defined, the procedure described above was
a good compromise.
where the (unknown) true count rates are marked with an asterisk. If we did a good job preparing the
where we define R = (R1 + R2 )/2. A bit of work (left as an exercise to the reader) leads to
\tau = \frac{2 R_{12}}{(R_1 + R_2)^2} \, \ln \frac{R_1 + R_2}{R_{12}}.   (7.10)
Knowing τ , we can predict the dead time for any count rate, except for very high ones (which
should be avoided at all times anyway, because in such a situation limited information is obtained at
the cost of increased radiation to the patient).
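As a sketch of how (7.10) is used in practice, the Python fragment below recovers τ from the three measured count rates R1, R2 and R12. The sanity check assumes a paralyzable dead time model, R = R* exp(−R* τ), with two equally strong sources; these model assumptions and the numerical values are mine, chosen only to verify the formula:

```python
import math

def dead_time(R1, R2, R12):
    """Two-source estimate of the dead time tau, equation (7.10):
    R1, R2 are the count rates of each source alone, R12 of both together."""
    return 2.0 * R12 / (R1 + R2) ** 2 * math.log((R1 + R2) / R12)

# Sanity check with a simulated paralyzable detector, where the measured
# rate is R = R_true * exp(-R_true * tau).  (Assumed model; hypothetical
# true rate of 50 kcps per source and tau of 5 microseconds.)
tau_true = 5e-6
n = 50e3
R1 = R2 = n * math.exp(-n * tau_true)
R12 = 2 * n * math.exp(-2 * n * tau_true)
print(dead_time(R1, R2, R12))   # recovers 5e-6
```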
7.1.1.7 Sensitivity
Sensitivity is measured by recording the count rate for a known radioactive source. Because the
measurement should not be affected by attenuation, NEMA prescribes using a container with a large
(10 cm) flat bottom, holding a thin layer of water (3 mm high) to which about 10 MBq of radioactive
tracer has been added. The result depends mainly on the collimator, and will be worse if the resolution
is better.
xwb = x (7.11)
ywb = y + vt, (7.12)
where (x, y) is the normal planar coordinate, v is the table speed and t is the time. The speed v must
be expressed in pixels per s, while the motor of the bed is trying to move the bed at a predefined speed
in cm per s. Consequently, whole body image quality will only be optimal if
Figure 7.7: Simulation of center of rotation error. Left: sinogram and reconstruction in absence
of center of rotation error. Right: filtered backprojection with center of rotation error equal to the
diameter of the point.
7.1.2.2 Uniformity
If the table motion is not constant, the image of a uniform source will not be uniform. Of course, this
uniformity test will also be affected if something is wrong with planar performance. Uniformity is not
affected if the motion is wrong but constant, except possibly at the beginning and the end, depending
on implementation details.
7.1.3 SPECT
A gamma camera which survives all quality control tests for planar imaging may still fail for SPECT,
since SPECT poses additional requirements on the geometry of the system.
7.1.3.1 Center-of-rotation
In SPECT, the gamma camera is used to acquire a set of sinograms, one for each slice. Sinogram data
are two-dimensional, one coordinate is the angle, the other one the distance between the origin and the
projection line. Consequently, a sinogram can only be correctly interpreted if we know which column
in the image represents the projection line at zero distance. By definition, this is the point where the
origin, i.e. the rotation center, is projected.
For reasons of symmetry it is preferred (but not necessary) that the projection of the origin is
the central column of the sinogram, and manufacturers attempt to obtain this. However, because of
drift in the electronics, there can be a small offset. This is illustrated in figure 7.7. The leftmost
images represent the sinogram of a point source and the corresponding reconstruction if everything
works well. The rightmost images show what happens if there is a small offset between the projection
of the origin and the center of the sinogram (the position where the origin is assumed to be during
reconstruction, indicated with the arrows in the sinogram). For a point source in the center, the
sinogram becomes inconsistent: the point is always at the right, no matter the viewing angle. As a
result, artifacts result in filtered backprojection (the backprojection lines do not intersect where they
should), which become more severe with increasing center-of-rotation error.
From the sinogram, one can easily compute the offset. This is done by fitting a sine to the mass
center in each row of the sinogram, determining the values of A, ϕ and B by minimizing
\mathrm{cost} = \sum_i \Big( m(\theta_i) - \big( A \sin(\Delta_\theta \theta_i + \varphi) + B \big) \Big)^2.   (7.13)
The mass center, computed for each row θi, represents the detector position of the point source for the
projection acquired at angle ∆θ θi, where ∆θ ∈ ℝ is the angle between subsequent sinogram rows, and
θi ∈ ℕ is the row index. The offset is the fitted value for B, while A and ϕ are nuisance variables to be
determined because we don’t know exactly where the point source was positioned. Once the offset is
known, we can correct for it, either by shifting the sinogram over −B, or by telling the reconstruction
algorithm where the true rotation center is. SPECT systems software provides automated procedures
which compute the projection of the rotation axis from a point source measurement. Older cameras
suffered a lot from this problem, newer cameras seem to be far more stable.
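Because sin(x + ϕ) = sin(x) cos(ϕ) + cos(x) sin(ϕ), the fit of (7.13) is linear in the parameters (A cos ϕ, A sin ϕ, B), so the offset B can be found with ordinary least squares; no iterative minimization is needed. A minimal Python sketch (function name and the synthetic mass centers are illustrative):

```python
import numpy as np

def cor_offset(mass_centers, delta_theta):
    """Fit m(theta_i) = A*sin(delta_theta*i + phi) + B, equation (7.13),
    and return the offset B.  The model is rewritten as
    a*sin(delta_theta*i) + b*cos(delta_theta*i) + B, which is linear
    in (a, b, B), so ordinary least squares suffices."""
    i = np.arange(len(mass_centers))
    M = np.column_stack([np.sin(delta_theta * i),
                         np.cos(delta_theta * i),
                         np.ones(len(mass_centers))])
    coeffs, *_ = np.linalg.lstsq(M, np.asarray(mass_centers, float), rcond=None)
    return coeffs[2]   # B: detector column of the projected rotation center

# Synthetic mass centers for 128 angles over 360 degrees:
# A = 20 pixels, phi = 0.7, offset B = 1.5 pixels.
i = np.arange(128)
m = 20.0 * np.sin(np.pi / 64 * i + 0.7) + 1.5
print(cor_offset(m, np.pi / 64))   # 1.5
```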
7.1.3.4 Quantification
In principle, a SPECT system can produce quantitative images, just like a PET system, albeit with
poorer resolution. This has long been ignored in clinical practice, because absolute quantification
Figure 7.8: If the gamma camera is not parallel to the rotation axis, a point apparently moves up and
down the axis during rotation.
Figure 7.9: A typical “Jaszczak phantom”, named after a well-known SPECT researcher who
developed many such phantoms (figure from P. Zanzonico, J Nucl Med 2008; 49:1114-1131).
usually adds little diagnostic value. However, SPECT is increasingly being used to prepare or verify
radionuclide therapy treatments. In radionuclide therapy, a large amount of a tumor-targeting
radiopharmaceutical is administered to the patient. The radiopharmaceutical typically carries a beta-
or alpha-emitter. In contrast to gamma photons, these particles deposit their energy at short distances, and
when accumulated in the tumor lesions, they will deposit a large amount of energy in the tumors and
much less in the surrounding healthy tissues. For those applications, it is important to determine the
exact amount of radioactivity in the tumors and healthy tissues, to ensure that the tumors are destroyed
and the damage to the healthy tissues is as low as possible.
If the gamma camera is in good shape, and if the SPECT images are reconstructed with attenuation
correction and scatter correction, then the reconstructed image pixel values should be proportional to
the local activity at that location. The constant of proportionality can be obtained from a phantom
measurement. A cylinder is filled with water containing a well known uniform tracer concentration.
The cylinder is scanned and an image is reconstructed with attenuation and scatter correction. Then the
calibration factor is computed as the ratio of the true tracer concentration in Bq/ml to the reconstructed
value. The reconstructed value is typically computed as the mean value in a volume centered inside the
phantom (this volume should stay away from the phantom boundary, to avoid the influence of partial
volume effects). Multiplication of the reconstructed images with this calibration factor converts them
into quantitative images, with pixel values giving the local activity in Bq/ml.
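The calibration procedure can be summarized in a few lines of Python. The phantom size, VOI, noise level and numbers below are hypothetical; the point is only how the ratio is computed and applied:

```python
import numpy as np

def calibration_factor(recon, true_conc_bq_per_ml, voi):
    """SPECT calibration: ratio of the known phantom concentration (Bq/ml)
    to the mean reconstructed value in a VOI well inside the phantom."""
    return true_conc_bq_per_ml / recon[voi].mean()

# Toy reconstruction of a uniform phantom in arbitrary units (plus noise).
rng = np.random.default_rng(1)
recon = np.full((64, 64, 64), 250.0) + rng.normal(0, 5, (64, 64, 64))

# The VOI stays away from the boundary to avoid partial volume effects:
# here a central 16x16x16 box (hypothetical choice).
voi = (slice(24, 40),) * 3
f = calibration_factor(recon, true_conc_bq_per_ml=50e3, voi=voi)
quantitative = recon * f   # pixel values are now in Bq/ml
print(f)                   # about 200 Bq/ml per reconstructed unit
```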
The calibration factor is tracer dependent, because the sensitivity of the gamma camera to the
radionuclide activity depends on the branching ratio, on the energy (or energies) of the emitted photons
and on the collimator.
For “well-behaved” tracers which emit photons at a single energy that is not too high, such as
99m Tc, the attenuation and scatter correction is accurate. As a result, the calibration can be accurate
too. For tracers that emit photons at different energies, such as 111 In, attenuation and scatter correction
is more difficult. Scatter correction is more complicated because the Compton scattered photons
from the high energy peak may end up in the low energy window. Attenuation correction is more
complicated because the attenuation is energy dependent. For such tracers, accurate quantification is
still possible, but it is more challenging.
The calibration factor must be verified, and corrected if necessary, a few times per year, to ensure
that the SPECT system remains quantitatively accurate.
7.2.2 Normalization
Normalization is used to correct for non-uniformities in PET detector sensitivities (see section 4.5.3.2,
p 40). The manufacturers provide software to compute and apply the sensitivity correction based on a
scan of a particular phantom (e.g. a radioactive cylinder positioned in the center of the field of view).
Current PET/CT (or PET/MR) systems do not have a transmission scan. Instead of the blank scan,
they use a scan of this phantom to detect, quantify and correct changes in detector performance over
time.
• determines the energy and position correction matrices for each detector module,
The first two operations are performed with a calibrated phantom carefully positioned in the field of
view, since 511 keV photons are needed to measure spectra and construct histograms as a function of
position. The timing calibration is based on purely electronic procedures, without the phantom (since
one cannot predict when the photons will arrive).
7.2.4 Quantification
When a PET system is well tuned, the reconstructed intensity values are proportional to the activity
that was in the field of view during the PET scan. This constant of proportionality (called calibration
factor or PET scale factor or so) is determined from a phantom measurement. A cylinder is filled with
a well known and uniform tracer concentration, and a PET scan is acquired. The calibration factor
is computed as the ratio of the reconstructed value to the true tracer concentration in Bq/ml. The
reconstructed value is typically computed as the mean value in a volume centered inside the phantom
(this volume should stay away from the phantom boundary, to avoid the influence of partial volume
effects). Once the system knows this ratio, its reconstruction program(s) produce quantitative images
of the measured tracer concentration in Bq/ml. The calibration factor must be verified, and corrected
if necessary, a few times per year, to ensure that the PET system remains quantitatively accurate.
Chapter 8
Well counters, radionuclide calibrators, survey meters
On many occasions, it is very handy or even essential to have relatively simple devices to detect or
quantify ionizing radiation. Three important applications for such devices are:
• To determine the amount of radioactivity in a syringe, so that one can adjust and/or verify how
much activity is going to be administered to the patient.
• To determine the amount of radioactivity in a sample (usually blood samples) taken from the
patient. The problem is the same as the former one, but the amount of radioactivity in a small
sample is typically orders of magnitude less than the injected dose.
• To detect contaminations with radioactivity, e.g. due to accidents with a syringe or a phantom.
To use a PET or gamma camera for these tasks would obviously be overkill (no imaging required),
impractical (they are large and expensive machines) and inefficient (for these tasks, relatively simple
systems with better sensitivity can be designed). Special detectors have been designed for each of
these tasks:
• the well counter, for measuring small amounts of activity in small volumes,
• the radionuclide calibrator, for measuring large amounts of activity in small volumes,
• the survey meter for finding radioactivity in places where there should be none.
CHAPTER 8. WELL COUNTERS, RADIONUCLIDE CALIBRATORS, SURVEY METERS 96
Figure 8.1: Diagram of a well counter, showing a test tube inside the well.
a higher probability of traveling through the crystal without interaction. The effect of the crystal
thickness can be taken into account by calibrating the well counter, which is done by the vendor.
One problem of the well counter is that its sensitivity varies with the position of the source inside
the crystal: the deeper a point source is positioned inside the crystal, the smaller the chance that its
photons will escape through the entrance of the cavity. A simple approximate calculation illustrates
the effect. Suppose that the radius of the cylindrical cavity is H and that a point source is positioned
centrally in the hole at a depth D. As illustrated in fig. 8.2, we have to compute the solid angle of
the entrance, as seen from the point source, to obtain the chance that a photon escapes. A similar
computation was done near equation (4.7), p 25, but this time, we cannot ignore the curvature of the
sphere surface, because the entrance hole is fairly large:
\text{escape-chance}(D) = \frac{1}{4\pi R^2} \int_0^{\alpha} 2\pi H'(\alpha') \, R \, d\alpha'   (8.1)
where the value of H′ in the integral goes from H′(0) = 0 to H′(α) = H. A particular value of α′
defines a circular line on the sphere, and a small increment dα′ turns the line into a small strip with
thickness R dα′. Substituting H′(α′) = R sin(α′) one obtains:
\text{escape-chance}(D) = \frac{1}{4\pi R^2} \int_0^{\alpha} 2\pi R^2 \sin(\alpha') \, d\alpha'
                        = \frac{1}{2} \big( -\cos(\alpha') \big) \Big|_0^{\alpha}
                        = \frac{1}{2} \big( 1 - \cos(\alpha) \big) = \frac{1}{2} \left( 1 - \frac{D}{\sqrt{D^2 + H^2}} \right)   (8.2)
And the chance that the photon will be detected equals
\text{sensitivity}(D) = 1 - \text{escape-chance}(D) = \frac{1}{2} \left( 1 + \frac{D}{\sqrt{D^2 + H^2}} \right).   (8.3)
We find that for D = 0, the sensitivity is 0.5, which makes sense, because all photons going up will
escape and all photons going down will hit the crystal. For D going to ∞, the sensitivity goes to
unity. Note that the equation also holds for negative D, which means putting D above the crystal.
Figure 8.2: Left: diagram of a well counter, showing a test tube inside the well. Right: the geometrical
sensitivity of the well counter to activity inside the tube, as a function of the depth inside the crystal
(H = 2 cm).
That makes the sensitivity smaller than 0.5, and for D = −∞, the sensitivity becomes zero. A plot of
the sensitivity as a function of the depth D is also shown in figure 8.2, for H = 2 cm, i.e. a cylinder
diameter of 4 cm. This figure clearly shows that the sensitivity variations are not negligible. One
should obviously put objects as deep as possible inside the crystal. Suppose we have two identical
test tubes, both with exactly the same amount of activity, but dissolved in different amounts of water.
If we measure the test tubes one after the other with a well counter, putting each tube in exactly the
same position, the tube with less water will produce more counts!
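Equation (8.3) is easily evaluated numerically. The short Python sketch below reproduces the behavior described above for H = 2 cm: sensitivity 0.5 at the crystal entrance (D = 0), approaching 1 deep inside the well, and dropping below 0.5 for a source held above the crystal (D < 0):

```python
import math

def sensitivity(D, H=2.0):
    """Geometrical sensitivity of the well counter, equation (8.3):
    point source at depth D (cm) in a cylindrical cavity of radius H (cm)."""
    return 0.5 * (1.0 + D / math.sqrt(D**2 + H**2))

for D in (-2.0, 0.0, 1.0, 2.0, 4.0):
    print(f"D = {D:4.1f} cm -> sensitivity = {sensitivity(D):.3f}")
```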
Expression (8.3) is approximate in an optimistic way, because it ignores (amongst other effects)
the possibility of a photon traveling through the crystal without any interaction. Although the well
counter is well shielded, it is not impossible that there would be other sources of radioactivity in the
vicinity of the device, which could contribute a bit of radiation during the measurement. Even a
very small amount of background radiation could influence the measurement, in particular if samples
with very low activity are being measured. For that reason, the well counter can do a background
measurement (simply a measurement without a source in the detector), and subtract the recorded value
from subsequent measurements. Of course, if there would be a significant background contribution, it
would usually be much better to remove or shield it than to leave it there and correct for it.
non-zero voltage, the electrons and positive ions travel to the anode and cathode respectively, creating
a small current. If enough atoms are ionized, that current can be measured. A cartoon drawing is
shown in figure 8.3.
If the voltage between the cathode and the anode is low, the electrons and ions travel slowly,
and still have time to recombine into neutral atoms. As a result, only a fraction of them will reach
the electrodes and contribute to a current between anode and cathode. That fraction increases with
increasing voltage, until it becomes unity (i.e. all ionizations contribute to the current). Further
increases in the voltage have little effect on the measured current. This first plateau in the amplitude
vs voltage curve (fig 8.3) is the region where ionization chambers are typically operated. In this mode,
the output is proportional to the total number of ionizations, which in turn is proportional to the energy
of the particles. A single particle does not produce a measurable current, but if many particles travel
through the ionization chamber, they create a measurable current which is proportional to the total
energy deposited in the gas per unit of time.
Figure 8.3: The gas detector. Left: the detector consists of a box containing a gas. In this drawing,
the anode is a central wire, the cathode is the surrounding box. A photon or particle traveling through
the gas produces ionizations, which produce a current between anode and cathode. Right: the current
created by a particle depends on the applied voltage.
With increasing voltage, the speed of the electrons pulled towards the anode increases. If that
speed is sufficiently high, the electrons have enough energy to ionize the gas atoms too. This avalanche
effect increases the number of ionizations, resulting in a magnification of the measured current. This
is the region where proportional counters are operated. As schematically illustrated in fig 8.3, the
amplification increases with the voltage. In this mode, a single particle can create a measurable
current, and that current is proportional to the energy deposited by that particle in the gas.
With a still higher voltage, the avalanche effect is so strong that saturation is reached, and
the output becomes independent of the energy deposited by the traversing particle. The reason is that
the electrons hit the anode with such a high energy that UV photons are emitted. Some of those UV
photons travel through the gas and liberate even more electrons. In this mode, the gas detector can
be used as a Geiger-Müller counter.¹ The Geiger-Müller counter detects individual particles, but its
¹ Special tricks must actually be used to make sure that the avalanche dies out after a while: the avalanche must be
quenched. A common approach is to use “quenching gases”, which happen to have the right features to avoid the excessive
positive feedback that would maintain the ionization avalanche.
Figure 8.4: Left: loss of photons due to attenuation in the test tube, detector wall etc., probability
of interaction of the photons with the gas and the number of ionisations that could be produced with
the energy of the photon (in arbitrary units), all as a function of the photon energy. Right: the
combination of these three effects yields the sensitivity of the radionuclide calibrator as a function of
the photon energy (solid line). The dashed line shows the sensitivity when the sample is surrounded
by an additional Cu-filter of 0.2 mm thick.
Radionuclide calibrators are designed to be very accurate, but to fully exploit that accuracy, they
should be used with care (e.g. using Cu-filters when needed, positioning the syringe correctly etc.), and
their performance should be monitored with a rigorous quality control procedure.
An important quality control test is to verify the linearity of the radionuclide calibrator over the
entire range of activities that is supported, from 1 MBq to about 400 GBq.
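One common way to perform such a linearity test (not prescribed in the text above, so consider it an assumed procedure) is to follow the decay of a single, initially very active ⁹⁹ᵐTc source over a few days: every reading should then lie on the known exponential decay curve, and any deviation reveals non-linearity at that activity level. A Python sketch with hypothetical readings:

```python
import math

T_HALF_TC99M_H = 6.02   # 99mTc half-life in hours (approximate)

def expected_activity(A0, t_hours):
    """Radioactive decay: A(t) = A0 * 2^(-t / T_half)."""
    return A0 * 2.0 ** (-t_hours / T_HALF_TC99M_H)

def linearity_errors(readings):
    """readings: list of (t_hours, measured_MBq) for one decaying source.
    Returns the relative deviation of each reading from the decay curve
    anchored on the first measurement."""
    t0, A0 = readings[0]
    return [(m - expected_activity(A0, t - t0)) / expected_activity(A0, t - t0)
            for t, m in readings]

# Hypothetical readings (MBq); a reading that deviates by more than a few
# percent from the decay curve indicates non-linearity at that level.
readings = [(0.0, 40000.0), (6.02, 20000.0), (12.04, 10050.0)]
for err in linearity_errors(readings):
    print(f"{100 * err:+.1f}%")
```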
Other types of survey meters use a Geiger-Müller counter. Often the device has a loudspeaker and
makes a clicking sound for every detected particle. The Geiger-Müller counter counts the incoming
photons (or other particles), but it does not measure their energy.
Finally, still other survey meters contain a scintillation crystal with photomultiplier. These are also
called “contamination monitors”, because they can be used to measure the spectrum of the activity, to
find out which tracer has caused the contamination. This is important information: a contamination
with a long lived tracer is obviously a more serious problem than with a short lived tracer.
Most survey meters can measure both photons and beta-particles (electrons and positrons).
Remember that to measure contamination with beta-particles, the survey meter must be held at a small
distance, because in contrast to x-rays or gamma rays, beta-particles have non-negligible attenuation
in air.
Chapter 9
Image analysis
In principle, PET and SPECT have the potential to provide quantitative images, that is, pixel values can
be expressed in Bq/ml. This is only possible if the reconstruction is based on a “sufficiently accurate”
model, e.g. if attenuation is not taken into account, the images are definitely not quantitative. It is
difficult to define “sufficiently accurate”, the definition should certainly depend upon the application.
Requirements for visual inspection are different from those for quantitative analysis.
In image analysis often regions of interest (ROI’s) are used. A region of interest is a set of pixels
which are supposed to belong together. Usually, an ROI is selected such that it groups pixels with
(nearly) identical behavior, e.g. because they belong to the same part of the same organ. The pixels are
grouped because the relative noise on the mean value of the ROI is lower than that on the individual
pixels, so averaging over pixels results in strong noise suppression. ROI definition must be done
carefully, since averaging over non-homogeneous regions leads to errors and artifacts! Ideally an ROI
should be three-dimensional. However, mostly two dimensional ROI’s defined in a single plane are
used for practical reasons (manual ROI definition or two-dimensional image analysis software).
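The noise advantage of the ROI mean follows from Poisson statistics: the relative noise on the mean of N independent pixels is √N times smaller than on a single pixel. A small numerical check (the count level and ROI size below are arbitrary):

```python
import numpy as np

# A homogeneous region with a mean of 100 counts per pixel: the relative
# noise on a single pixel is 1/sqrt(100) = 10%, while the mean over an
# N-pixel ROI has relative noise 10%/sqrt(N).  Here N = 256, so 1/16.
rng = np.random.default_rng(2)
region = rng.poisson(lam=100, size=(1000, 256))   # 1000 ROIs of 256 pixels

pixel_noise = region.std() / region.mean()        # roughly 0.10
roi_means = region.mean(axis=1)
roi_noise = roi_means.std() / roi_means.mean()    # roughly 0.10 / 16
print(pixel_noise, roi_noise)
```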
In this chapter only two methods of quantitative analysis will be discussed: standard uptake values
(SUV) and the use of compartmental models. SUV’s are simple to compute, which is an important
advantage when the method has to be used routinely. In contrast, tracer kinetic analysis with com-
partmental models is often very time consuming, but it provides more quantitative information. In
nuclear medicine it is common practice to study the kinetic behavior of the tracer, and many more
analysis techniques exist. However, compartmental modeling is among the more complex ones, so
if you understand that technique, learning the other techniques should not be too difficult. The book
by Cherry, Sorenson and Phelps [1] contains an excellent chapter on kinetic modeling and discusses
several analysis techniques.
CHAPTER 9. IMAGE ANALYSIS 103
the tracer in between. Moreover, the tracer amounts are measured with different devices: the regional
tracer concentration is measured with the SPECT or PET system, the dose is measured with a radionuclide
calibrator. Therefore, the sensitivity of the tomograph must be determined. This is done by acquiring an
image of a uniform phantom filled with a known tracer concentration. From the reconstructed phantom
image we can compute a calibration factor which converts “reconstructed pixel values” into Bq/ml
(see section 7.2.4, p 94).
A SUV of 1 means that the tracer concentration in the ROI is identical to the average tracer
concentration in the entire patient body. A SUV of 4 indicates markedly increased tracer uptake. The
SUV value is intended to be robust: independent of the administered tracer amount and the mass
of the patient. However, it changes with time, since the tracer participates in a metabolic process. So
SUVs can only be compared if they correspond to the same time after injection. Even then, one can
think of many reasons why SUV may not be very reproducible, and several publications have been
written about its limitations. Nevertheless, it works well in practice and is used all the time.
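Following the definition implied above (the ROI tracer concentration divided by the injected dose spread uniformly over the body, taking a tissue density of 1 g/ml so that grams and milliliters are interchangeable), the SUV computation is a one-liner. The numbers in the example are hypothetical:

```python
def suv(conc_bq_per_ml, injected_dose_bq, body_weight_g):
    """Standard uptake value: ROI concentration divided by the injected
    dose spread uniformly over the body (assuming a density of 1 g/ml,
    so that Bq/ml and Bq/g coincide)."""
    return conc_bq_per_ml / (injected_dose_bq / body_weight_g)

# Hypothetical example: 370 MBq injected in a 70 kg patient,
# ROI concentration of 21.1 kBq/ml.
print(suv(21.1e3, 370e6, 70e3))   # about 4: markedly increased uptake
```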
The SUV is a way to quantify the tracer concentration. But that is not really what we want to know. The tracer was injected to study a metabolic process, so what we really want to quantify is the intensity of that process. The next section explains how this can be done. Many tracers have been designed to accumulate as a result of the metabolic process being studied. If the tracer is accumulated to high concentrations, many photons will be emitted, resulting in a signal with a good signal to noise ratio. In addition, although the tracer concentration is not exactly proportional to the metabolic activity, it is often an increasing function of that activity, so it still provides useful information.
Figure 9.1: Four samples from a dynamic study. Each sample shows a short axis and long axis slice
through the heart. The wall of the left ventricle is delineated. 20 s after injection, the tracer is in
the right ventricle. At 40 s it arrives in the left ventricle. At 3 min, the tracer is accumulating in the
ventricular wall. At 20 min, the tracer concentration in the wall is increased, resulting in better image
quality.
Figure 9.2: Three compartment model, and corresponding regions in a cardiac study. The first
compartment represents plasma concentration (blood pool ROI), the second and third compartment
represent the extravascular tracer in original and metabolized form (tissue ROI). For the tissue curve,
both the measured points and the fitted curve are shown.
are drawn in the image and the mean tracer concentration in the ROI as a function of time is extracted
to produce the time-activity curves. In figure 9.2 a region is drawn inside the left ventricle to obtain
the tracer concentration in the blood. A second region is drawn in the left ventricular wall to obtain
the tracer concentration in that region of the heart. The model is then applied to analyse how the
concentration in the blood influences the concentration in the tissue. As illustrated in figure 9.3, the
analysis must explain how the tissue concentration is determined by the blood concentration and the
tissue features. In contrast, the blood concentration is essentially independent of the concentration in
the region we study. It is determined by the way the tracer is injected and by the tracer exchange with
the entire body. The contribution of the small region we study to the blood concentration is negligible
compared to the contribution of the entire body.
Figure 9.3: The time-dependent tracer concentration in the blood and the rate constants in the tissue
region determine the evolution of the tissue tracer concentration. The blood interacts with a large
amount of tissue. Since the tissue region we study is small, its effect on the blood concentration is
negligible.
is

    Tracer_{P→E} = K1 CP    (9.3)
where CP is the plasma tracer concentration (units: Bq/ml). K1 is the product of two factors. The first one is the blood flow F, the amount of blood supplied per second to a gram of tissue; F has units ml/(s g). The second factor is the extraction fraction f, which specifies the fraction of the tracer passing by in P that is exchanged with the second compartment E. Since it is a fraction, f has no unit. For some tracers, f is almost 1 (e.g. for radioactive water). For other tracers it is 0, which means that the tracer stays in the blood. For FDG it is in between. Consequently, the product K1 CP = F f CP has units Bq/(s g) and tells how many Bq are extracted from the blood per second and per gram of tissue. CP is a function of time, K1 is not (or more precisely, we assume it stays constant during the scan). Both K1 and CP are also a function of position: the situation will be different in different tissue types.
The amount of tracer going from E to M equals

    Tracer_{E→M} = k3 CE    (9.4)

where CE is the total amount of tracer in compartment E per gram of tissue (units: Bq/g). The constant k3 tells which fraction of the available tracer is metabolized per second, so the unit of k3 is 1/s. The physical meaning of k2 and k4 is similar: they specify the fractional transfer per second.
Since FDG and glucose are not identical, their rate constants are not identical. The glucose rate
constants will be labeled with the letter g. The glucose and FDG amounts are definitely different:
glucose is abundantly present and is in steady state condition. FDG is present in extremely low
concentrations (pmol) and has not reached steady state since the images are acquired immediately
after injection.
The plasma concentration CPg is supposed to be constant. We can measure it by determining the
glucose concentration in the plasma from a venous blood sample. The extravascular glucose amount
CEg is supposed to be constant as well, so the input must equal the output. For glucose, k4g is very
small. Indeed, k4g corresponds to reversal of the initiated metabolization, which happens only rarely.
Setting k4g to zero we have
    dCEg/dt = 0 = K1g CPg − (k2g + k3g) CEg    (9.5)
Thus, we can compute the unknown glucose amount CEg from the known plasma concentration CPg :
    CEg = [K1g / (k2g + k3g)] CPg    (9.6)
So if we can find the values of the rate constants, we can compute the glucose metabolization rate.
As mentioned before, we cannot compute them via the tracer, since it has different rate constants.
However, it can be shown that, due to its similarity, the trapping of FDG is proportional to the glucose
metabolic rate. The constant of proportionality depends on the tissue type (difference in affinity for
both molecules), but not on the tracer concentration. The constant of proportionality is usually called
“the lumped constant”, because careful theoretical analysis shows that it is a combination of several
constants. So the lumped constant LC is:
    LC = [K1 k3 / (k2 + k3)] / [K1g k3g / (k2g + k3g)] = K̄ / K̄g    (9.9)
For the (human) brain (which gets almost all its energy from glucose) and for some other organs, sev-
eral authors have attempted to measure the lumped constant. The values obtained for brain are around
0.8. This means that the human brain has a slight preference for glucose: if the glucose and FDG concentrations in the blood were identical, the brain would accumulate only 8 FDG molecules for every 10 glucose molecules. If we had used radioactive glucose instead of deoxyglucose, the tracer
and target molecules would have had identical rate constants and the lumped constant would have
been 1. But as mentioned before, with glucose as a tracer, the radioactivity would not accumulate in
the cells, which would result in poorer PET images.
    dCE(t)/dt = K1 CP(t) − (k2 + k3) CE(t)    (9.10)

    dCM(t)/dt = k3 CE(t)    (9.11)
For a cardiac study, we can derive the tracer concentration CP (t) in the blood from the pixel values in
the center of the left ventricle or atrium. If the heart is not in the field of view, we can still determine
CP (t) by measuring the tracer concentrations in blood samples withdrawn at regular time intervals. As
with the SUV computations, this requires cross-calibration of the plasma counter to the PET camera.
The compartments E and M can only be separated with subcellular resolution, so the PET always measures the sum of both amounts, which we will call CI(t):

    CI(t) = CE(t) + CM(t)    (9.12)

Consequently, we must combine the equations (9.10) and (9.11) in order to write CI(t) as a function of CP(t) and the rate constants. This is the operational function. Since CI(t) and CP(t) are known, the only remaining unknown variables will be the rate constants, which are obtained by solving the operational function. Applying the Laplace transform to (9.10) and (9.11), with lowercase letters denoting the transformed functions, turns the differential equations into algebraic ones:

    s cE(s) = K1 cP(s) − (k2 + k3) cE(s)    (9.13)

    s cM(s) = k3 cE(s)    (9.14)

where we have assumed that at time t = 0 (time of injection) all tracer amounts are zero. From (9.13) we find cE(s) as a function of cP(s). Inserting in (9.14) produces cM(s) as a function of cP(s).
    cE(s) = [K1 / (s + k2 + k3)] cP(s)    (9.15)

    cM(s) = [K1 k3 / (s (s + k2 + k3))] cP(s)    (9.16)

    cI(s) = cE(s) + cM(s)    (9.17)
          = [ K1/(s + k2 + k3) + K1 k3/(s (s + k2 + k3)) ] cP(s)    (9.18)
Figure 9.4: The tracer amount CI (t) and its two terms when CP (t) is a step function (equation
(9.21)).
The two factors in s can be split from the denominator using the equation

    a / (x (x + b)) = (a/b) (1/x − 1/(x + b))    (9.19)

Applying this to (9.18) and rearranging a bit yields:

    cI(s) = [K1 k2 / (k2 + k3)] cP(s)/(s + k2 + k3) + [K1 k3 / (k2 + k3)] cP(s)/s    (9.20)

Applying the inverse Laplace transform is now straightforward (see appendix 12.7, p 146) and produces the operational function:

    CI(t) = [K1 k2 / (k2 + k3)] ∫_0^t CP(u) e^{−(k2+k3)(t−u)} du + [K1 k3 / (k2 + k3)] ∫_0^t CP(u) du    (9.21)
Figure 9.4 plots CI and the two terms of equation (9.21) for the case when CP (t) is a step function.
CP (t) is never a step function, but the plot provides interesting information. The first term of (9.21)
represents tracer molecules that leave the vascular space, stay a while in compartment E and then
return back to the blood. As soon as the tracer is present in the blood, this component starts to grow
until it reaches a maximum. When CP (t) becomes zero again, the component gradually decreases
towards zero. This first term follows the input, but with some delay (CI1 (t) in fig. 9.4).
The second term of (9.21) represents tracer molecules that enter compartment E and will never
leave (CI2 (t) in fig. 9.4). Eventually, they will be trapped in compartment M . Note that the first term
is not equal to but smaller than CE (t). The reason is that part of the molecules in E will not return
to the blood but end up in M . It is easy to compute which fraction of CE (t) is described by the first
term of (9.21). (The rest of CE (t) and CM (t) correspond to the second term of (9.21).) This is left as
an exercise to the reader.
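The behavior of the two terms can also be checked numerically. Below is a small sketch (not from the text; the parameter values are arbitrary) that evaluates both terms of (9.21) with a midpoint Riemann sum for a step input and compares them with the closed-form expressions for constant CP:

```python
import math

def ci_terms(K1, k2, k3, cp, t, n=20000):
    """Evaluate the two terms of the operational function (9.21) with a
    midpoint Riemann sum; cp is a function u -> C_P(u)."""
    beta = k2 + k3
    du = t / n
    us = [(i + 0.5) * du for i in range(n)]
    term1 = K1 * k2 / beta * sum(cp(u) * math.exp(-beta * (t - u)) for u in us) * du
    term2 = K1 * k3 / beta * sum(cp(u) for u in us) * du
    return term1, term2

# Step input C_P(t) = 1: the first term saturates at K1*k2/(k2+k3)^2, the
# second term grows linearly with slope K1*k3/(k2+k3) (the trapping rate).
K1, k2, k3 = 0.1, 0.05, 0.03
beta = k2 + k3
t1, t2 = ci_terms(K1, k2, k3, lambda u: 1.0, t=200.0)
print(t1, K1 * k2 / beta**2 * (1.0 - math.exp(-beta * 200.0)))  # both ~0.78
print(t2, K1 * k3 / beta * 200.0)                               # both ~7.5
```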
Impulse response
Equation (9.21) can be rewritten as

    CI(t) = [K1 / (k2 + k3)] ∫_0^t CP(u) [ k2 e^{−(k2+k3)(t−u)} + k3 ] du    (9.22)
This is a convolution of the input function with the factor in brackets, showing that this factor is the impulse response function. To illustrate this, we simply compute the response to an impulse, by replacing CP(u) with a Dirac impulse at time ξ:

    CI(t) = [K1 / (k2 + k3)] ∫_0^t δ(u − ξ) [ k2 e^{−(k2+k3)(t−u)} + k3 ] du    (9.23)
          = [K1 / (k2 + k3)] [ k2 e^{−(k2+k3)(t−ξ)} + k3 ]    (9.24)
Writing β = k2 + k3 for brevity, equation (9.10) becomes

    dCE(t)/dt = K1 CP(t) − β CE(t)    (9.25)
First, we solve the corresponding homogeneous equation, obtained by dropping the terms independent
of CE (t):
    dCE(t)/dt = −β CE(t)    (9.26)
The solution is CE(t) = A e^{−βt}. Now we assume that the solution to (9.25) is similar, except that the constant A must be replaced by a function of t: CE(t) = A(t) e^{−βt}. Inserting this in (9.25) we obtain:
    d(A(t) e^{−βt})/dt = K1 CP(t) − β A(t) e^{−βt}    (9.27)

    (dA(t)/dt) e^{−βt} − β A(t) e^{−βt} = K1 CP(t) − β A(t) e^{−βt}    (9.28)

    ⇔ (dA(t)/dt) e^{−βt} = K1 CP(t)    (9.29)

    ⇔ A(t) = K1 ∫_0^t CP(u) e^{βu} du    (9.30)

    ⇔ CE(t) = A(t) e^{−βt} = K1 ∫_0^t CP(u) e^{−β(t−u)} du    (9.31)
Having solved the equation for the first tissue compartment, we can use CE (t) to find the activity in
the second compartment. We simply insert (9.31) in the equation (9.11) for CM (t) to obtain:
    dCM(t)/dt = k3 CE(t) = K1 k3 ∫_0^t CP(u) e^{−β(t−u)} du
              = K1 k3 e^{−βt} ∫_0^t CP(u) e^{βu} du    (9.32)

and therefore

    CM(t) = K1 k3 ∫_0^t dξ e^{−βξ} ∫_0^ξ CP(u) e^{βu} du    (9.33)
This is the integral of an integral, which may seem somewhat intimidating at first. However, it can be
simplified using integration by parts. If you are unfamiliar with that trick, you can easily derive it by
noting that the integral of the derivative of the product of two functions f (t) and g(t) equals:
    f(t) g(t) − f(0) g(0) = ∫_0^t [d(f(ξ) g(ξ))/dξ] dξ = ∫_0^t (df(ξ)/dξ) g(ξ) dξ + ∫_0^t f(ξ) (dg(ξ)/dξ) dξ    (9.34)

And therefore we can write

    ∫_0^t (df(ξ)/dξ) g(ξ) dξ = f(t) g(t) − f(0) g(0) − ∫_0^t f(ξ) (dg(ξ)/dξ) dξ    (9.35)
We apply this trick to get rid of the inner integral in (9.33) as follows:
    f(ξ) = −e^{−βξ}/β    (9.36)

    df(ξ)/dξ = e^{−βξ}    (9.37)

    g(ξ) = ∫_0^ξ CP(u) e^{βu} du    (9.38)

    dg(ξ)/dξ = CP(ξ) e^{βξ}    (9.39)
With these definitions, the left hand side of (9.35) is equal to (9.33). Note that g(0) = 0 here, which
makes things slightly simpler. The trick converts (9.33) into:
    CM(t) = K1 k3 [ −(e^{−βt}/β) ∫_0^t CP(u) e^{βu} du − ∫_0^t (−e^{−βu}/β) CP(u) e^{βu} du ]
          = [K1 k3 / (k2 + k3)] [ −∫_0^t CP(u) e^{−(k2+k3)(t−u)} du + ∫_0^t CP(u) du ]    (9.40)
where we have replaced β again with k2 + k3 . Finally, to obtain CI (t) = CE (t) + CM (t) we sum
(9.31) and (9.40):
    CI(t) = (K1 − K1 k3/(k2 + k3)) ∫_0^t CP(u) e^{−(k2+k3)(t−u)} du + [K1 k3 / (k2 + k3)] ∫_0^t CP(u) du
          = [K1 k2 / (k2 + k3)] ∫_0^t CP(u) e^{−(k2+k3)(t−u)} du + [K1 k3 / (k2 + k3)] ∫_0^t CP(u) du    (9.41)
which is identical to equation (9.21).
the rate constants obtained in healthy volunteers are used as a start. With these rate constants and the
known input function Cp (t), we can compute the expected value of CI (t). The computed curve will
be different from the measured one. Based on this difference, the non-linear regression algorithm will
improve the values of the rate constants, until the sum of weighted squared differences is minimal. It is
always useful to plot both the measured and computed curves CI (t) to verify that the fit has succeeded,
since there is a small chance that the solution corresponds to an unacceptable local minimum. In that
case, the process must be repeated, starting from a different set of rate constant values. The tissue
curve in fig. 9.2, p 105 is the result of non-linear regression. The fit was successful, since the curve is
close to the measured values.
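As an illustration of the principle (a sketch, not clinical code: real implementations use a proper non-linear least squares routine such as Levenberg-Marquardt, and noisy measured data; all numbers below are made up), one can minimize the sum of squared differences over a small grid of candidate rate constants:

```python
import math

def model_ci(K1, k2, k3, cp_samples, dt):
    """C_I(t) on a regular time grid, from the operational function (9.21).
    The convolution term is updated recursively, the trapping term by a
    running sum."""
    beta = k2 + k3
    conv = 0.0   # integral of C_P(u) exp(-beta (t - u)) du
    trap = 0.0   # integral of C_P(u) du
    ci = []
    for cp in cp_samples:
        conv = conv * math.exp(-beta * dt) + cp * dt
        trap += cp * dt
        ci.append(K1 * k2 / beta * conv + K1 * k3 / beta * trap)
    return ci

dt = 2.0
times = [i * dt for i in range(60)]
cp = [math.exp(-0.02 * t) for t in times]        # made-up input function
measured = model_ci(0.1, 0.05, 0.03, cp, dt)     # noise-free "measurement"

# Brute-force stand-in for non-linear regression: pick the rate constants
# that minimize the sum of squared differences between the two curves.
best = min(((K1, k2, k3) for K1 in (0.05, 0.1, 0.15)
                         for k2 in (0.025, 0.05, 0.1)
                         for k3 in (0.015, 0.03, 0.06)),
           key=lambda p: sum((m - c) ** 2
                             for m, c in zip(measured, model_ci(*p, cp, dt))))
print(best)  # -> (0.1, 0.05, 0.03), the constants used to generate the data
```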
The glucose consumption can now be computed as
    glucose consumption = (1/LC) [K1 k3 / (k2 + k3)] CPg = (K̄ / LC) CPg    (9.42)
Non-linear regression programs often provide an estimate of the confidence intervals or standard
deviations on the fitted parameters. These can be very informative. Depending on the shape of the
curves, the noise on the data and the mathematics of the model, the accuracy of the fitted parameters
can be very poor. However, the errors on the parameters are correlated such that the accuracy on K̄ is
better than the accuracy on the individual rate constants.
be obtained with simple linear regression. For linear regression no iterations are required, there is a
closed form expression, so this solution is orders of magnitudes faster to compute than the previous
one.

The integral ∫_0^t CP(u) du / CP(t) has the unit of time. If CP(t) were constant, the integral would simply equal t. It can be regarded as a correction, required because CP(t) is not constant but slowly
varying. As shown in figure 9.4, when CP (t) is constant, CI (t) has a linear and a constant term.
Equation (9.21) confirms that the slope of the linear term is indeed K̄. A potential disadvantage is
that the values of the rate constants are not computed.
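This graphical approach (commonly attributed to Patlak) can be sketched with synthetic, noise-free data and arbitrary rate constants: at late times, plotting CI(t)/CP(t) against ∫_0^t CP(u)du / CP(t) gives a straight line whose slope is K̄, obtained here with closed-form simple linear regression:

```python
import math

# Synthetic two-tissue-compartment data with a slowly varying input function.
K1, k2, k3, dt = 0.1, 0.05, 0.03, 2.0
beta = k2 + k3
conv = trap = 0.0
x, y = [], []
for i in range(200):
    cp = math.exp(-0.002 * i * dt)           # made-up, slowly decaying input
    conv = conv * math.exp(-beta * dt) + cp * dt   # convolution term of (9.21)
    trap += cp * dt                                # running integral of C_P
    ci = K1 * k2 / beta * conv + K1 * k3 / beta * trap
    if i * dt > 100.0:                       # keep only the late samples
        x.append(trap / cp)
        y.append(ci / cp)

# Closed-form simple linear regression: slope = cov(x, y) / var(x).
n = len(x)
mx, my = sum(x) / n, sum(y) / n
slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) \
        / sum((a - mx) ** 2 for a in x)
print(slope, K1 * k3 / (k2 + k3))  # slope approaches Kbar = 0.0375
```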
where λj and rj are the image and the reference image respectively. This approach has two problems.
First, it assumes that all pixels are equally important, which is almost never true. Second, it combines
systematic (bias) and random (variance) deviations. It is better to separate the two, because they
behave very differently. This is illustrated in figure 9.5. Suppose that a block wave was measured with
two different methods A and B, producing the noisy curves shown at the top row. Measurement A is
noisier, and its sum of squared differences with the true wave is twice as large as that of measurement
B. If we know that we are measuring a block wave, we know that a bit of smoothing is probably useful.
The bottom row shows the result of smoothing. Now measurement A is clearly superior. The reason
is that the error in the measurement contains both bias and variance. Smoothing reduces the variance,
but increases the bias. In this example the true wave is smooth, so variance is strongly reduced by
smoothing, while the bias only increases near the edges of the wave. If we keep on smoothing, the
entire wave will converge to its mean value; then there is no variance but huge bias. Bias and variance
of a sample (pixel) j can be defined as follows:

    bias_j = E(λj) − rj
    variance_j = E[ (λj − E(λj))² ]

where E(x) is the expectation of x. Variance can be directly computed from repeated independent
measurements. If the true data happen to be smooth and if the measurement has good resolution,
neighboring samples can be regarded as independent measurements. Bias can only be computed if the
true value is known.
In many cases, there is the possibility to trade variance for bias by smoothing or imposing some constraints. Consequently, if images are compared, bias and variance must be separated. If an image has better bias for the same variance, it is probably a “better” image. If an image has larger bias and lower variance when compared to another image, the comparison is meaningless.
Figure 9.5: Top: a simulated true block wave and two measurements A and B with different noise
distributions. Bottom: the true block wave and the same measurements, smoothed with a simple
rectangular smoothing kernel.
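The block-wave experiment of figure 9.5 is easy to reproduce numerically. A small sketch (all numbers arbitrary) that estimates the per-sample squared bias and variance from repeated noisy measurements, before and after smoothing:

```python
import random, statistics

random.seed(0)
true = [0.0] * 20 + [1.0] * 20 + [0.0] * 20          # the block wave

def smooth(sig, w=5):
    """Rectangular smoothing kernel; the window shrinks at the edges."""
    return [statistics.mean(sig[max(0, i - w // 2): i + w // 2 + 1])
            for i in range(len(sig))]

# Many independent noisy measurements of the same true wave.
raw = [[t + random.gauss(0.0, 0.3) for t in true] for _ in range(500)]
smo = [smooth(m) for m in raw]

def bias2_and_var(measurements):
    """Per-sample squared bias and variance, averaged over the signal."""
    n = len(measurements[0])
    mean = [statistics.mean(m[j] for m in measurements) for j in range(n)]
    bias2 = statistics.mean((mean[j] - true[j]) ** 2 for j in range(n))
    var = statistics.mean(statistics.variance([m[j] for m in measurements])
                          for j in range(n))
    return bias2, var

print(bias2_and_var(raw))  # essentially no bias, variance ~0.09
print(bias2_and_var(smo))  # some bias (near the edges), much lower variance
```

As in figure 9.5, smoothing buys a large reduction in variance at the price of bias concentrated near the edges of the wave.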
This sequence is in order of increasing complexity and decreasing controllability. Tests on patient
data are required to show that the method can be used. However, if such a test fails it is usually very
difficult to find out why. To find the problem, simple and controllable data are required. Moreover,
since the true solution is often not known in the case of patient data, it is possible that failure of the
method remains undetected. Consequently, there is no gain in trying to skip one or a few stages, and
with a bit of bad luck it can have serious consequences.
Evaluation on simulation has the following important advantages:
• The truth is known, so comparing the result to the true answer is simple. This approach is also very useful for finding bugs in the algorithm or its implementation.
• Data can be generated in large numbers, sampling a predefined distribution. This enables direct
quantitative analysis of bias and variance.
• Complexity can be gradually increased by making the simulations more realistic, to analyze the
response to various physical phenomena (noise, attenuation, scatter, patient motion . . . ).
• A nice thing about emission tomography is that it is relatively easy to make realistic simulations.
In addition, many research groups are willing to share simulation code.
• It is possible to produce simulations which are sufficiently realistic to have them diagnosed by
the nuclear medicine physicians. Since the correct diagnosis is known, this allows evaluation of
the effect of the new method on the final diagnosis.
When the method survives complex simulations it is time to do phantom experiments. Phantom
experiments are useful because the true system is always different from even a very realistic simu-
lation. If the simulation phase has been done carefully, phantom experiments are not likely to cause
much trouble.
A possible intermediate stage is the use of animal experiments, which can be required for the
evaluation of very delicate image processing techniques (e.g. preparing stereotactic operations). Since
the animal can be sacrificed, it is possible, at least to some extent, to figure out what result the method
should have produced.
The final stage is the evaluation on patient data, and comparing the output of the new method to
that of the classical method. As mentioned before, not all failures will be detected since the correct
answer may not be known.
Chapter 10

Biological effects of radiation
Radiation which interacts with electrons can affect chemical bonds in DNA, and as a result, cause
damage to living tissues. The probability that an elementary particle will cause an adverse ionisation
is proportional to the energy of the particle deposited in the tissue. In addition, the damage done per
unit energy depends on the type of particle.
In diagnostic nuclear medicine, the aim is of course to obtain diagnostic images without or with
minimum damage to the patient. Consequently, the amount of activity administered to the patient
should be as low as reasonably achievable (ALARA). “Reasonably achievable” means that the radia-
tion emitted by the tracer should still be sufficient to make valuable clinical images.
The amount of radiation deposited in an object is expressed in gray (Gy). A dose of 1 gray is
defined as the deposition of 1 joule per kg absorber:
    1 Gy = 1 J/kg    (10.1)
To quantify (at least approximately) the amount of damage done to living tissue, the influence of
the particle type must be included as well. E.g. it turns out that 1 J deposited by neutrons does more
damage than the same energy deposited by photons. To take this effect into account, a quality factor
Q is introduced. Multiplication with the quality factor converts the dose into the dose equivalent.
Quality factors are given in table 10.1.
The dose equivalent is expressed in Sv (sievert). Since Q = 1 for photons, electrons and positrons,
we have 1 mSv per mGy in diagnostic nuclear medicine. The natural background radiation is about 2
mSv per year, or about 0.2 µSv per hour.
CHAPTER 10. BIOLOGICAL EFFECTS OF RADIATION 118
The older units for radiation dose and dose equivalent are rad and rem:
1 Gy = 100 rad (10.2)
1 Sv = 100 rem (10.3)
When the dose to every organ is computed, one can in addition compute an “effective dose”,
which is a weighted sum of organ doses. The weights are introduced because damage in one organ
is more dangerous than damage in another organ. The most sensitive organs are the gonads (weight
about 0.25), the breast, the bone marrow and the lungs (weight about 0.15), the thyroid and the bones
(weight about 0.05), see table 10.2. The sum of all the weights is 1. The weighted average produces a
single value in mSv. The risk of death due to tumor induction is about 5% per “effective Sv” according
to report ICRP-60 (International Commission on Radiological Protection (http://www.icrp.org)), but
it is of course very dependent upon age (e.g. 14% for children under 10). Research in this field is not
finished and the tables and weighting coefficients are adapted every now and then.
Table 10.2: Organ weight factors to compute the effective dose (ICRP 103, 2007).
organ weight total
red marrow, colon, lungs, stomach, breast 0.12 0.60
gonads 0.08 0.08
bladder, liver, esophagus, thyroid 0.04 0.16
brain, skin, salivary glands, bone surfaces 0.01 0.04
remainder (adrenals, extrathoracic region, gallbladder, heart, kidneys, lymphatic nodes, muscle, oral mucosa, pancreas, prostate, small intestine, spleen, thymus, uterus/cervix) 0.00923 0.12
total 1
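The weighted sum is trivial to compute; a sketch using the ICRP-103 weights of table 10.2 (the organ doses are made-up numbers, and the remainder organs are lumped into one entry with their combined weight of 0.12):

```python
# ICRP-103 tissue weighting factors from table 10.2 (remainder lumped).
weights = {"red marrow": 0.12, "colon": 0.12, "lungs": 0.12, "stomach": 0.12,
           "breast": 0.12, "gonads": 0.08, "bladder": 0.04, "liver": 0.04,
           "esophagus": 0.04, "thyroid": 0.04, "brain": 0.01, "skin": 0.01,
           "salivary glands": 0.01, "bone surfaces": 0.01, "remainder": 0.12}
assert abs(sum(weights.values()) - 1.0) < 1e-9   # the weights sum to 1

organ_dose_msv = {organ: 2.0 for organ in weights}   # uniform 2 mSv example
effective_msv = sum(w * organ_dose_msv[o] for o, w in weights.items())
print(effective_msv)  # a uniform organ dose yields the same effective dose
```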
To obtain the dose equivalent delivered by one organ A to another organ B (or to itself, in which
case A = B), we have to make the following computation:
    DE = Σ_i Qi × NiA × pi(A → B) × Ei / MB    (10.4)

where:
DE is the dose equivalent;
i denotes a particular emission. Often, multiple photons or particles are emitted during a single ra-
dioactive event, and we have to compute the total dose equivalent by summing all contributions;
Qi is the quality factor for emission i;
NiA is the total number of particles i emitted in organ A;
pi (A → B) is the probability that a particle i emitted in organ A will deposit (part of) its energy in
organ B;
Ei is the (average) energy that is carried by the particles i and that can be released in the tissue. For
electrons and positrons, this is their kinetic energy. For photons, it is all of their energy.
MB is the mass of organ B.
In the following paragraphs, we will discuss these factors in more detail, and illustrate the procedure
with two examples.
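The summation in (10.4) can be written down directly; a sketch with a made-up emission list (a single emission of Q = 1 photons of 140 keV, 5% of which deposit their energy in a 1.5 kg organ):

```python
def dose_equivalent_sv(emissions, mass_kg):
    """Equation (10.4): DE = sum_i Q_i * N_i^A * p_i(A->B) * E_i / M_B.
    Each emission is a tuple (Q, N, p, E_joule)."""
    return sum(Q * N * p * E for (Q, N, p, E) in emissions) / mass_kg

EV = 1.602e-19  # joule per eV
# Hypothetical numbers: 1e9 photons of 140 keV emitted in organ A,
# 5% of which deposit their energy in organ B of 1.5 kg.
de = dose_equivalent_sv([(1.0, 1e9, 0.05, 140e3 * EV)], 1.5)
print(de)  # about 7.5e-7 Sv, i.e. 0.75 microsievert
```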
    NiA = niA(0) t1/2 / ln(2)    (10.6)
Here niA (0) is the number of particles or photons emitted per s at time 0, and t1/2 is the half life. For
a source of 1 MBq at time 0, niA (0) = 106 per s, since 1 Bq is defined as 1 emission per s.
Often, the tracer is metabolized and sent to the bladder, which implies a decrease of the tracer con-
centration in most other organs. Therefore, the amount of radioactivity decreases faster than predicted
by (10.6) in these organs. One way to approximate the combined effect of metabolism and physical
decay could be to replace the physical half life with a shorter, effective half life in (10.6). In other
cases, it may be necessary to integrate the measured time activity curves numerically.
Figure 10.1: Two objects, one with a radioactive source in the center. The size of the right box is 2 ×
2 × 4 cm3 .
The half life of 123 I is 13.0 hours. The boxes are made of plastic or something like that, with an
attenuation coefficient of 0.15 cm−1 for photons with energy of 159 keV. The density of the boxes is
1 kg/liter. The boxes are hanging motionless in air, the attenuation of air is negligible. Our task is to
estimate the dose that both boxes will receive if we don’t touch this configuration for a few days.
We are going to need the mass of the boxes. For the left box we have:
    Wleft = (4/3) π R³ × 1 kg/dm³ = 0.034 kg    (10.8)
Combining all factors yields the number of photons interacting with the right box: 4.2 × 10⁷. Finally, multiplying with the energy and dividing by the mass yields:

    dose(γ, right box) = 4.2 × 10⁷ × 159 keV × 1.6 × 10⁻¹⁹ J/eV × (1 / 0.016 kg) = 0.067 mGy    (10.17)
This is much less than the dose of the left box. The right box is protected by its distance from the
source!
positron is 250 keV. Most of this energy is dissipated in the tissue before the positron annihilates. Thus, there are three contributions to the dose of the left box: the positron and the two photons. The right box can at most be hit by one of the photons. We assume that in every disintegration, exactly one positron is produced, which annihilates with an electron into two 511 keV photons. At 511 keV, the attenuation coefficient of the boxes is 0.095 cm⁻¹. The half life of 18F is 109 minutes.
The total number of disintegrations is

    N = 1 MBq × (109 min / ln(2)) × 60 s/min = 9.44 × 10⁹    (10.18)
    dose(β⁺, left box) = 9.44 × 10⁹ × 250000 eV × 1.6 × 10⁻¹⁹ J/eV × (1 / 0.034 kg) = 11 mGy    (10.19)

    dose(γ, left box) = 9.44 × 10⁹ × 511000 eV × 1.6 × 10⁻¹⁹ J/eV × (1 / 0.034 kg) × 0.17 = 3.9 mGy    (10.20)
The total dose then equals dose(β⁺, left box) + 2 × dose(γ, left box) = 18.8 mGy.
For the right box, we first estimate the number of interacting photons (remember that there are two photons per disintegration):
    dose(γ, right box) = 1.09 × 10⁷ × 511000 eV × 1.6 × 10⁻¹⁹ J/eV × (1 / 0.016 kg) = 0.056 mGy    (10.22)
Chapter 11

Radionuclide therapy and dosimetry

11.1 Introduction
In nuclear medicine, radiopharmaceuticals are mostly used for diagnostic imaging (with SPECT or PET) and for activity concentration measurements in samples (with a gamma counter). For that purpose, the radiopharmaceutical is labeled with a single photon or a positron emitting isotope. The photon energy of the single photon emitter is ideally between 100 and 200 keV, because that is a good compromise between good penetration of human tissues and good stopping by the detector crystal. This is one of the reasons why 99mTc (emitting photons of 140 keV) is the most used radionuclide for single photon emission imaging. Positron emission will always produce 511 keV photons. That is a bit higher than we would have liked; for PET, there was no choice but to design detectors with sufficient stopping power. Ideally, the isotope would emit nothing else, because other emitted particles will not contribute to imaging, but they will damage the tissues (DNA) they traverse.
In recent years, radionuclide therapy is being used increasingly in cancer treatment. Radionuclide therapy makes use of molecules or other particles that accumulate preferentially in tumors, and which are labeled with radioactive isotopes (radionuclides). The aim is that this radiation will destroy the tumor, with minimal damage to the surrounding healthy tissues. For that purpose, the radiation should have a small penetration depth, as is the case for electrons (β⁻ particles), protons and α-particles (particles consisting of two protons and two neutrons, He²⁺). Table 11.1 and figure 11.1 list the main emissions of a few such radionuclides. The decay scheme of 223Ra is complex, because several of its daughter products are radioactive as well.
Most radionuclides used for therapy not only emit β or α particles, but also photons. The drawback
of these photons is that they cause some radiation to the healthy tissues, but an advantage is that images
can be made of the distribution of the radionuclide, enabling accurate dose verification (at least if the
photon energies are suitable for imaging).
If the radiopharmaceutical has a high affinity for the tumor and is very specific, i.e. it binds to the tumor and to (almost) nothing else, then the therapy can be more effective than external beam therapy for two reasons: it will treat all metastases, even small ones that may not be revealed by imaging, and a high radiation dose can be delivered with minimal damage to the surrounding healthy tissues.
To be effective, the amount of radioactivity administered to the patient should be high enough to
destroy the tumors, but at the same time low enough to ensure that the radiation to healthy tissues
(which may also accumulate some of the radioactivity) remains below safety limits. Consequently, it
is important to estimate the expected dose distributions before treatment, and determine the delivered
CHAPTER 11. RADIONUCLIDE THERAPY AND DOSIMETRY 125
M (rT , t) is the mass of rT , which might be time dependent. The summation is over all emissions i
that may occur during a decay; pi is the branching ratio, i.e. the probability that emission i will occur
during a decay. Ei is the energy of particle i and φ(rT ← rS , Ei , t) gives the fraction of the energy
that this particle is expected to deposit in rT . Thus, φ depends on the geometry and attenuation of rT
and rS and the structures between and around them.
Since the dose rate in rT depends on all the radioactivity inside the patient’s body, we have to sum over all rS:

    Ḋ(rT, t) = Σ_j S(rT ← rSj, t) A(rSj, t)    (11.3)
TD is some interesting time duration. In many cases, one can set TD = ∞, because the effective half life (combining the decay half life and the biological dynamics) of the radionuclide will be short. Ã is the time integrated activity. To put S in front of the integral sign, we had to make the common approximation that it does not depend on time.

The time integrated activity Ã has no units, since Bq is the number of disintegrations per s, and a number is unitless. In practice, one often normalizes Ã with the administered activity A0:

    ã(rS) = Ã(rS) / A0    (11.5)
and therefore Ã(rS) = ã(rS) A0. The unit of ã is time, and for that reason it was called the residence time: if all the administered activity A0 stayed for a duration ã(rS) in the region rS and then vanished, the same Ã(rS) would be obtained. Since 2009, ã(rS) is called the time-integrated activity coefficient of rS.¹
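Numerically, ã(rS) is simply the area under the measured time-activity curve divided by A0. A sketch with the trapezoid rule (the mono-exponential organ curve is made up):

```python
import math

def tia_coefficient(times_s, activities_bq, a0_bq):
    """Time-integrated activity coefficient (11.5): the integral of the
    time-activity curve A(rS, t) (trapezoid rule) divided by A0; unit: s."""
    integral = sum(0.5 * (a1 + a2) * (t2 - t1)
                   for (t1, a1), (t2, a2) in zip(zip(times_s, activities_bq),
                                                 zip(times_s[1:], activities_bq[1:])))
    return integral / a0_bq

# Made-up organ curve: 10% of A0 = 1 MBq, clearing mono-exponentially with
# an effective half life of 1 hour, sampled every minute for 10 hours.
t_eff = 3600.0
times = [i * 60.0 for i in range(600)]
act = [0.1e6 * math.exp(-math.log(2) * t / t_eff) for t in times]
print(tia_coefficient(times, act, 1e6))  # about 0.1 * t_eff / ln 2 ~ 519 s
```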
To apply equation (11.4) for treatment planning or verification, we need

1. a segmentation of the patient’s anatomy into the relevant source and target regions rS and rT ,

2. the function φ(rT ← rS , Ei , t) for all possible region pairs (rS , rT ), all relevant particles i and all relevant energies Ei ,

3. the activity A(rSj , t) as a function of time in all these regions.
This should really be done for each individual patient (“personalized dosimetry”). Because that is
very difficult, the segmentation and calculation of φ(rT ← rS , Ei , t) have first been addressed based
on models of “average patients”. Figure 11.2 shows two such models, an early one based on simplified shapes and a more recent and realistic model using non-uniform rational B-splines (NURBS).
Similar models have been developed for children of different ages.
When the patient model is available, the S-value can be computed with equation (11.2) for every
region pair and for all radionuclides of interest. For electrons, positrons and α-particles, one can usually assume that their energy is deposited locally. That makes the calculation of φ(rT ← rS , Ei , t) very easy:

    φ(rT ← rS , Ei , t) = 1 if rT = rS , and 0 otherwise.
¹ In 2009, the MIRD committee adopted a standardization of nomenclature (Journal of Nuclear Medicine, 2009; 50: 477-484).
CHAPTER 11. RADIONUCLIDE THERAPY AND DOSIMETRY 128
Figure 11.2: Left: representation of the average man for MIRD dose computation, as used in 1969
[1]. Right: human body models based on non-uniform rational B-splines (NURBS) [6].
On the other hand, for photons the calculations are very complicated and have to be done with Monte
Carlo simulation.
Finally, to obtain a dose for a particular case, we need to determine the time-activity curves
A(rSj, t) for each organ rSj. Currently, this is done by acquiring SPECT or PET scans and manually
segmenting the organs in the activity images, or in the corresponding CT images (relying on
the spatial alignment produced by the SPECT/CT or PET/CT system). The time-activity curve is
typically obtained by acquiring a dynamic SPECT or PET scan. For tracers with long (physical and
biological) half life, some additional scans at later time points may be required. A good balance must
be found between minimizing the number of scans (and their duration) for optimal patient comfort,
and obtaining sufficient samples to reach an acceptable accuracy. In single photon emission imaging,
some of the SPECT or SPECT/CT scans may be replaced by planar imaging to improve the temporal
resolution (early after injection) or to reduce the imaging time (late after injection).
The anatomical models have been designed for assessing the typical radiation dose associated with
diagnostic procedures. Because these doses are relatively low, it is assumed that the dose estimates
obtained for such average patients are sufficiently accurate for producing clinical guidelines on the
activity to be injected for different tracers and imaging tasks. Novel tracers are often first characterized
in animal experiments. If those tests produce good results, they are evaluated in a small group of
healthy volunteers, using one or multiple SPECT/CT or PET/CT scans, manual organ delineation and
the S-values from the average patient. Because organ delineation is extremely time consuming, the
delineation is usually restricted to source organs that accumulate a significant amount of activity and
target organs that are known to be radiosensitive. The resulting radiation doses (in mGy or Gy) can
then be converted to an effective dose (in mSv or Sv) by multiplying the doses with the appropriate
quality factor Q (table 10.1, p 117) and computing the weighted sum of the organ doses (table 10.2, p 118).
Application of such models to compute the required activity for radionuclide therapy is contro-
versial, because the model may deviate strongly from the individual patient. However, because no
clinical tools are currently available for accurate personalized dosimetry, they are often used for es-
timating tumor and organ doses in radionuclide therapy as well. This is likely to change in the next
several years. It is found that “deep learning” (using multiple layers of convolutional neural networks
or CNNs) is an excellent tool to capture the prior knowledge of experts, and it is increasingly being
used for organ segmentation in clinical images. In addition, the execution time of Monte Carlo sim-
ulation can be decreased from several days to several minutes by implementing them in a clever way
on GPU. Consequently, personalized dosimetry for radionuclide therapy in clinically acceptable times
and with acceptable efforts seems possible.
dN(t)/dt = −kḊ N(t)    (11.6)
After irradiating that tissue for a finite time T, the expected number of surviving cells will be

N(T) = N(0) e^{−kḊ T}    (11.7)
If we assume that kḊ is proportional to the dose rate Ḋ,

kḊ = α Ḋ(t)    (11.8)

we can write

N(T) = N(0) exp(−α ∫₀ᵀ Ḋ(t) dt) = N(0) e^{−αD(T)} = N(0) e^{−BED}    (11.9)
where D(T ) is the total dose accumulated at time T . The exponent αD(T ) is called the Biologically
Effective Dose or BED. The value exp(−BED) is the fraction of surviving cells.
However, it has been observed that the BED depends non-linearly on the dose. This can be
explained intuitively as follows. Radiation damages the tissue mostly by ionizing DNA molecules. It
is possible that a single hit by a photon or electron produces so much damage to a DNA molecule that
the cell dies. But it is also possible (and more likely) that a single hit produces an error in the DNA
that can still be repaired. However, this sublethal damage could become lethal, if that same DNA
molecule would be hit a second time during the same treatment. Assuming a sublethal hit is more
likely than a lethal one, and that the probability of hitting the same location more than two times is
very small, the cell killing could be well approximated as

kḊ(t) = α Ḋ(t) + 2β D(t) Ḋ(t)    (11.10)

The first term corresponds to a single lethal hit at time t. The second corresponds to the lethal
combination of two sublethal hits, one which happened between 0 and t, and the second one at time t.
Integration over time produces

BED = ∫₀ᵀ kḊ(t) dt
    = α ∫₀ᵀ Ḋ(t) dt + 2β ∫₀ᵀ D(t) Ḋ(t) dt
    = α ∫₀ᵀ (dD(t)/dt) dt + 2β ∫₀ᵀ D(t) (dD(t)/dt) dt
    = α D(T) + β ∫₀ᵀ (dD²(t)/dt) dt
    = α D(T) + β D²(T)    (11.11)
This expression can be further refined to include also the effect of repair mechanisms. After
sublethal damage, the cell may be able to undo that damage, in which case it would survive a second
sublethal hit. Assume for convenience that the probability of repair is constant over time, such that a
fraction µ of the damaged cells heal themselves per unit of time. Then if M cells took a sublethal hit
at time s, only M e−µ(t−s) of them would still be damaged at time t > s. Adjusting equation (11.11)
accordingly produces
BED = α ∫₀ᵀ Ḋ(t) dt + 2β ∫₀ᵀ Ḋ(t) ( ∫₀ᵗ Ḋ(s) e^{−μ(t−s)} ds ) dt
    = α D(T) + β G D²(T)    (11.12)

with  G = (2 / D²(T)) ∫₀ᵀ Ḋ(t) ( ∫₀ᵗ Ḋ(s) e^{−μ(t−s)} ds ) dt
The G-factor is called the Lea-Catcheside factor (after the authors who derived these equations). It is
easy to check that with µ = 0, i.e. no cell repair, equation (11.12) reduces to (11.11). Fortunately,
normal cells are typically better at repairing radiation damage than tumor cells.
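As a check on these formulas (a sketch, not from the text; the constant-dose-rate closed form for G is a standard result), the double integral in (11.12) can be evaluated numerically:

```python
import math

# Numerical evaluation of the Lea-Catcheside factor G of equation (11.12),
# G = (2/D(T)^2) * int_0^T Ddot(t) int_0^t Ddot(s) exp(-mu(t-s)) ds dt.
# The inner integral is updated recursively while stepping through t.
def lea_catcheside_G(dose_rate, T, mu, n=20000):
    dt = T / n
    ts = [(i + 0.5) * dt for i in range(n)]
    D_total = sum(dose_rate(t) for t in ts) * dt       # total dose D(T)
    decay = math.exp(-mu * dt)
    inner, G = 0.0, 0.0
    for t in ts:
        inner = inner * decay + dose_rate(t) * dt      # int_0^t Ddot(s) exp(-mu(t-s)) ds
        G += dose_rate(t) * inner * dt
    return 2.0 * G / D_total**2

# For a constant dose rate over [0, T] a closed form exists:
# G = (2/(mu*T)) * (1 - (1 - exp(-mu*T))/(mu*T))
R, T, mu = 1.0, 10.0, 0.3
G_num = lea_catcheside_G(lambda t: R, T, mu)
G_ref = 2.0 / (mu * T) * (1.0 - (1.0 - math.exp(-mu * T)) / (mu * T))
print(G_num, G_ref)
# With mu = 0 (no repair) G reduces to 1, recovering equation (11.11).
```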
• pre-computed S-values based on anatomical models, such as in figure 11.2 (and listed in MIRD
pamphlet 11), for many radionuclides, and
• software for fitting kinetic models, typically a sum of exponentials, which are used to interpolate
between the time points and extrapolate to infinity in a sensible way.
The output is the dose in mGy to all organs, and the corresponding effective dose in mSv.
Some research teams share their more recent software tools for personalized internal dosimetry.
An example is the LundaDose Program from Lund University, Sweden. This program focuses on
internal dosimetry for therapy with 177Lu or 90Y labeled molecules. It takes SPECT/CT images as
input and computes the organ and tumor doses with Monte Carlo simulation. Other similar free
software packages can be found at mirdsoft.org, www.idac-dose.org and www.opendose.org.
By taking directly the images of the activity distribution as input, the need for segmenting the
source organs is eliminated. The Monte Carlo simulation can make use of the activity in each voxel,
or in other words, each voxel is treated as a source region rS . And similarly, the dose can be computed
in each voxel. This is called voxelized dosimetry. It is not only more convenient, it is also more
accurate, because it eliminates the need for assuming a uniform activity distribution within large
source and target regions. For the final analysis of the dose images, organ and lesion segmentation
is still necessary for the target regions rT, because we need to know where the tumor and the
radiosensitive organs are. However, the analysis of the dose in regions rT can now be made more
informative, by taking into account the non-uniform dose distribution in the region.
11.5 Examples
11.5.1 Annual effective dose due to natural 40 K
40K is a naturally occurring radioactive isotope of potassium. Its abundance, i.e. the fraction of 40K in
natural K, is 0.012%. Humans have approximately 2 g K per kg body weight. The half life of 40K is
1.25·10⁹ years. It decays by β− emission with a branching ratio of 90%, see figure 11.3. The mean
energy of the emitted electron equals 0.59 MeV.
To compute the radiation dose due to 40K, we assume that this dose is only due to the β− emission.
Suppose there are N₀ 40K atoms in a gram at time t = 0. Then the number of radioactive atoms at time t
equals

N(t) = N₀ e^{−(ln 2) t / T1/2}    (11.13)
implying that the activity, i.e. the number of decaying atoms per unit of time, equals

A(t) = −dN(t)/dt = (ln 2 / T1/2) N(t)    (11.14)

where we have to express T1/2 in seconds to obtain a result in Bq. Denoting the number of seconds
per year as sy, we have T1/2 = 1.25·10⁹ sy.
Avogadro’s number equals NA = 6.022·10²³ atoms/mole, and 1 mole of 40K weighs 40 g.
Consequently, the activity per kg tissue equals

A = (ln 2 / T1/2) · (NA / 40) · 2 · 1.2·10⁻⁴    (11.15)

This produces an activity per mass in Bq/kg. In 90% of these decays, an electron is emitted which
deposits on average an energy of 0.59 MeV. Thus, the energy deposited per decay equals

E = 0.9 · 0.59·10⁶ eV · 1.6·10⁻¹⁹ J/eV    (11.16)
Multiplying A and E produces the deposited energy per mass and per s in J/(kg s) = Gy/s. Finally,
we multiply this with the number of seconds per year sy and with 1000 to obtain the result in mGy
per year:

A E · 1000 (mGy/Gy) · sy (s/year)    (11.17)
 = [ln 2 / (1.25·10⁹ sy)] · [6.022·10²³ · 2 · 1.2·10⁻⁴ / 40 (1/kg)] · 0.9 · 0.59·10⁶ · 1.6·10⁻¹⁹ J · 1000 sy
 = 0.17 mGy/year    (11.18)
This calculation seems OK, since the UNSCEAR 2008 report also gives 0.17 mSv per year for the
effective dose from natural 40K irradiation.
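The arithmetic of (11.18) can be reproduced in a few lines (a sketch following the numbers above):

```python
import math

# Sketch reproducing the 0.17 mGy/year estimate for natural 40K (equation 11.18).
sy = 365.25 * 24 * 3600                    # seconds per year
T_half = 1.25e9 * sy                       # half life of 40K in seconds
NA = 6.022e23                              # Avogadro's number (atoms/mole)
atoms_per_kg = NA * 2.0 * 1.2e-4 / 40.0    # 2 g K per kg, abundance 0.012%, 40 g/mole
A = math.log(2) / T_half * atoms_per_kg    # activity in Bq per kg tissue
E = 0.9 * 0.59e6 * 1.6e-19                 # mean energy deposited per decay in J
dose = A * E * sy * 1000                   # dose rate in mGy per year
print(f"{A:.1f} Bq/kg, {dose:.2f} mGy/year")
```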
Figure 11.4: Thyroid therapy with 131 I. At different time points, activity measurements are done with
a simple detector, calibrated with a phantom. A tracer kinetic model is applied to estimate the time
activity curve from a few measurements. The figures are from [7]. (RIU(t) denotes fractional 131 I
uptake in the target tissue at time t, and kt , kB and kT are time constants of the kinetic model.)
reproducibility, such that after calibration the measurement is quantitative. For calibration, a phantom
mimicking the geometry and attenuation of the neck is used, containing 131 I at the position of the
thyroid.
With this setup, the patient is scanned a few times over a week, producing measurements of the
activity in the thyroid. Then a kinetic model is applied, to fit a smooth and realistic time-activity curve
through these points. A good model must be used, because estimating the total dose to the thyroid
involves a very strong extrapolation beyond the last time point (see figure 11.4).
11.5.3 131I-mIBG therapy for neuroblastoma
The compound mIBG (meta-iodobenzylguanidine) is taken up by cells of the sympathetic nervous
system. Neuroblastomas usually inherit this affinity for mIBG, and as a result, 131I-mIBG can be used
to selectively irradiate these tumors. To circumvent the toxic effects on the bone marrow (which
makes red blood cells), stem cells are collected before treatment and reinfused after the treatment.
This will ensure the production of blood cells after treatment. The 131 I-mIBG treatment is combined
with Topotecan, a drug that makes neuroblastoma more sensitive to radiation [8].
The aim of the 131 I-mIBG procedure, applied to treat children with metastatic neuroblastoma, is
to achieve a whole body dose of 4 Gy. It has been shown that with this whole body dose, the dose
to the neuroblastoma is sufficiently high and side effects are limited. To achieve this, the therapy is
given in two fractions. The first fraction delivers a first whole body dose well below 4 Gy. In addition,
the photons emitted by 131 I are used to determine the whole body dose. This is often done with a
simple gamma detector and appropriate calibration, similar to what is shown in figure 11.4, but this
time focusing on the entire body. Alternatively (and more accurately), planar imaging or SPECT can
be used. From that measured dose and the known amount of administered activity in the first fraction,
the amount of activity for the second fraction is computed, to obtain the target dose of 4 Gy.
Figure 11.5 shows for 8 patients the whole body doses produced by the two fractions. The dose
of the first fraction was only based on the patient weight, and produced roughly 2 Gy. This was
used to adjust the activity given in the second fraction, shown in the plot at the right hand side in
Figure 11.5: The whole body doses achieved in 8 patients (left), and the activity that was administered
to obtain this dose of approximately 4 Gy (right). The red and green blocks correspond to the first
and second fraction, respectively. For patients 6 and 7, much more activity was needed to achieve the
same whole body dose. Courtesy of Gaze et al. [9].
the figure. The plot at the left shows that this produced a total dose of approximately 4 Gy. This
treatment is effective. However, more accurate tumor dose computation would be useful, to provide
more information about the relation between the administered doses, the tumor control and the toxic
effects to healthy tissues, which would facilitate the development of even more effective, personalized
treatment.
11.5.4 90Y selective internal radiation therapy for liver tumors
The liver can be affected by a primary liver (hepatic) tumor or by tumor metastases originating from
elsewhere in the body.
One approach to treating liver tumors is radioembolization, i.e. to release small radioactive spheres
into the blood supplying the tumors. These microspheres, made of resin or glass, have a diameter of
25-35 µm, which is small enough to travel through larger blood vessels, and big enough to get stuck
in the tumor micro-vasculature. The microsphere then hinders or blocks the perfusion through the
microvessel, which is already bad for the tumor. But more importantly, these microspheres contain a
β-emitting isotope. The emitted electrons interact with the surrounding tissue, depositing their energy
within a few mm from where they are emitted. The aim is to have the microspheres accumulated in or
very close to the tumor lesions, such that the β-radiation destroys the tumor with minimal damage to
the healthy liver.
The liver is perfused not only by the hepatic artery, but also by the portal vein. Veins typically
carry blood away from organs, but the liver also takes input from a vein, because its task involves
processing venous blood from the stomach, intestines and spleen. Healthy liver cells receive about
75% of their blood from the portal vein, and only 25% from the hepatic artery. In contrast, tumors take
most of their blood from the hepatic artery. For that reason, the microspheres are released via a catheter
that is maneuvered into the hepatic artery. To further increase selectivity, the catheter is steered into
one or a few branches of the hepatic artery which are known to supply tumors. As illustrated in
figure 11.6, a better healthy tissue sparing can be achieved when the tumor is more localized in the
liver. For this reason, the treatment is called selective internal radiation therapy (SIRT). One of the
electron-emitting isotopes that is often used for SIRT is the yttrium isotope 90 Y.
Figure 11.6: Radioactive microspheres are sent into the liver through selected branches of the hepatic
artery, targeting the tumors. [9].
Because the microspheres remain trapped in the microvessels, we can assume that the microsphere
concentrations remain constant over time, and that the radioactive decay of 90Y is the only cause of
change to the activity concentration. Therefore, we only need a single PET image for dose
verification after treatment.
Since the electrons deposit their energy very close to their emission source, the radiation dose
to the tissues can be computed using the “local deposition model”. That means that with good ap-
proximation, the dose is proportional to the activity concentration achieved shortly after injection:
multiplication with a global scale factor transforms the PET activity image with voxel values in Bq/ml
to a dose image with voxel values in Gy.
The time integrated activity Ã equals

Ã = ∫₀^∞ A₀ e^{−(ln 2) t / T1/2} dt = A₀ T1/2 / ln 2    (11.19)
where A0 is the activity in Bq in a particular region of interest, and T1/2 = 64.1 hours is the half life
of 90 Y. Following equation (11.4), the dose is obtained by multiplying à with the S-factor, given by
equation (11.2). For 90 Y, the S-factor reduces to
S(r ← r) = E / M(r)    (11.20)
where M(r) is the mass of the region r and E is the average energy of the emitted electron.
Consequently, the accumulated dose in a particular region r equals

Dr = S(r ← r) Ã    (11.21)
   = (E A₀ / M(r)) · (T1/2 / ln 2)    (11.22)
   = (A₀ / M(r)) · 933.6·10³ eV · 1.6·10⁻¹⁹ J/eV · (64.1 h · 3600 s/h / 0.693)    (11.23)
   = 49.7·10⁻⁹ Js · A₀ / M(r)    (11.24)
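A sketch of this dose computation, using only the numbers above and the local deposition model:

```python
import math

# 90Y local-deposition dose (equations 11.21-11.24):
# dose in Gy for activity A0 (Bq) trapped in a region of mass M (kg).
T_half = 64.1 * 3600                         # 90Y half life in seconds
E_beta = 933.6e3 * 1.6e-19                   # mean beta energy per decay in J
def dose_90Y(A0_Bq, M_kg):
    A_tilde = A0_Bq * T_half / math.log(2)   # time integrated activity (11.19)
    return A_tilde * E_beta / M_kg           # local deposition dose in Gy

# Rule of thumb: 1 GBq spread over 1 kg of tissue gives roughly 50 Gy.
print(f"{dose_90Y(1e9, 1.0):.1f} Gy")
```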
Figure 11.7: Left: the 99m Tc-MAA image (in color) fused with the corresponding CT image (black
and white), from the pre-treatment SPECT/CT image. Right: the 90 Y-PET image fused with the corre-
sponding MR image, from the post-treatment PET/MR image.
As explained in previous chapters, PET and SPECT are quantitative, provided that
• all corrections for sensitivity, attenuation, Compton scatter and/or randoms are performed well.
• the reconstructed image values are calibrated by determining the calibration factor, which con-
verts the reconstructed voxel values into values in Bq/ml, see sections 7.1.3.4, p 91 for SPECT
and 7.2.4, p 94 for PET. The calibration factor is radionuclide dependent, and in particular in
SPECT, this dependence is so complicated that it necessitates determining the calibration factor
from phantom measurements for each tracer.
Chapter 12
Appendix
The first factor is the probability of detecting n photons in n intervals. The second factor is the
probability of detecting no photons in the remaining k − n intervals. The third factor is the number
of ways in which these n successful and k − n unsuccessful intervals can be ordered in the measurement.
As mentioned above, equation (12.1) becomes a better approximation when k is larger. To make it exact,
we simply have to let k go to infinity.

lim_{k→∞} pr(n) = (rⁿ/n!) · lim_{k→∞} (1 − r/k)^k    (12.2)
It turns out that computing the limit of the logarithm is easier, so we have

lim_{k→∞} (1 − r/k)^k = exp( lim_{k→∞} [ln(k − r) − ln k] / (1/k) )
                      = exp( lim_{k→∞} [1/(k − r) − 1/k] / (−1/k²) )
                      = exp(−r)    (12.3)
So it follows that

lim_{k→∞} pr(n) = e^{−r} rⁿ / n!    (12.4)
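The convergence of the binomial probability to the Poisson limit (12.4) can be checked numerically (a sketch; r and n are arbitrary values):

```python
import math

# Compare the binomial probability of n counts in k intervals (success
# probability r/k per interval) with its Poisson limit e^{-r} r^n / n!.
def p_binomial(n, r, k):
    return math.comb(k, n) * (r / k)**n * (1 - r / k)**(k - n)

def p_poisson(n, r):
    return math.exp(-r) * r**n / math.factorial(n)

r, n = 3.0, 2
for k in (10, 100, 10000):
    print(k, p_binomial(n, r, k), p_poisson(n, r))
```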
CHAPTER 12. APPENDIX 139
1. Assume that the system is perfect: the image of a point source is a point, located on the perpen-
dicular projection line through the point source. Mathematicians would call that “point” in the
image a “Dirac impulse”. The image of two or more point sources contains simply two or more
Dirac impulses, located at corresponding projection lines.
Let f(x, y) represent the image value at position (x, y). This image can be regarded as the sum
of an infinite number of Dirac impulses δ(x, y), one at every location (x, y):

f(x, y) = ∫∫ f(a, b) δ(x − a, y − b) da db
2. Correction of the first step: the expected image of a point is not a Dirac impulse, but a PSF.
Therefore, replace each of the Dirac impulses in the image by the corresponding PSF, centered
at the position of the Dirac impulse.
The second operation is called the convolution operation. Assume that a complex flat tracer
distribution (e.g. an inhomogeneous radioactive sheet) is put in front of the camera, parallel to
the collimator (at distance H). What image will be obtained? Regard the distribution as consisting
of an infinite number of point sources. This does not change the distribution, it only changes the way
you look at it. Project all of these sources along an ideal perpendicular projection line into the image.
You will now obtain an image consisting of an infinite number of Dirac impulses. Replace each of
these impulses with the PSF and sum the overlapping values to obtain the expected image.
If the distance to the collimator is not the same for every point source, then things get more
complicated. Indeed, the convolution operator assumes that the PSF is the same for all point sources.
Therefore, to calculate the expected image, the previous procedure has to be applied individually for
every distance, using the corresponding distance dependent PSF. Finally, all convolved images have
to be summed to yield the expected image.
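The layered procedure above can be sketched in a 1D simulation (the activities, positions and PSF widths are made-up values):

```python
import numpy as np

# Sketch of the distance dependent convolution described above, in 1D:
# convolve each depth layer with its own PSF, then sum the convolved layers.
def gaussian_psf(sigma, half=4):
    x = np.arange(-half, half + 1)
    g = np.exp(-x**2 / (2.0 * sigma**2))
    return g / g.sum()                      # normalized kernel

layers = {                                  # activity profiles at two distances
    1.0: np.array([0., 0., 5., 0., 0., 0., 0., 0., 0.]),  # close: narrow PSF
    2.5: np.array([0., 0., 0., 0., 0., 3., 0., 0., 0.]),  # far: wide PSF
}
image = np.zeros(9)
for sigma, activity in layers.items():
    image += np.convolve(activity, gaussian_psf(sigma), mode="same")
print(image.round(3))
```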
The exponentiation contains a term in u2 and a term in u. We can get rid of the u by requiring that both
terms result from squaring something of the form (u+C)2 , and rewriting everything as (u+C)2 −C 2 .
The C 2 is not a function of u, so it can be put in front of the integral sign.
(F1 ⊗ F2)(x) = AB ∫_{−∞}^{∞} exp( −(√m u − x/(b²√m))² + x²/(b⁴m) − x²/b² ) du    (12.13)
             = AB exp( x²/(b⁴m) − x²/b² ) ∫_{−∞}^{∞} exp( −(√m u − x/(b²√m))² ) du    (12.14)
The integrand is a Gaussian. The center is a function of x, but the standard deviation is not. The
integral from −∞ to ∞ of a Gaussian is a finite value, only dependent on its standard deviation.
Consequently, the integral is not a function of x. Working out the factor in front of the integral sign
and combining all constants in a new constant D, we obtain
(F1 ⊗ F2)(x) = D exp( −x² / (a² + b²) )    (12.15)
So the convolution of two Gaussians is again a Gaussian. The variance of the resulting Gaussian
is simply the sum of the input variances (by definition, the variance is the square of the standard
deviation).
The FWHM of a Gaussian is proportional to the standard deviation, so we obtain a very simple
expression to compute the FWHM resulting from the convolution of multiple PSFs:

FWHM² = FWHM₁² + FWHM₂² + …
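This quadrature rule can be verified numerically (a sketch; the grid and widths are arbitrary). For a Gaussian, FWHM = 2√(2 ln 2) σ, so checking that the variances add is equivalent:

```python
import numpy as np

# Convolve two Gaussian PSFs and verify that the variances (squared widths) add,
# which implies FWHM^2 = FWHM1^2 + FWHM2^2 since FWHM = 2*sqrt(2*ln 2)*sigma.
x = np.arange(-200, 201)
def gauss(sigma):
    g = np.exp(-x**2 / (2.0 * sigma**2))
    return g / g.sum()

conv = np.convolve(gauss(3.0), gauss(4.0), mode="same")
mean = (conv * x).sum()
sigma_conv = np.sqrt((conv * x**2).sum() - mean**2)   # discrete standard deviation
print(sigma_conv)   # expect sqrt(3^2 + 4^2) = 5
```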
The variance σa² of a distribution is defined as σa² = E[(a − ā)²] = lim_{N→∞} (1/N) Σᵢ (aᵢ − ā)²,
where E is the expectation, aᵢ is a sample from the distribution and the summation is over the entire
distribution. By definition, σa is the standard deviation. When computations are performed on samples
from a distribution (e.g. Poisson variables) the noise propagates into the result. Consequently,
one can regard the result also as a sample from some distribution which can be characterized by its
(usually unknown) mean value and its variance or standard deviation. We want to estimate the vari-
ance of that distribution to have an idea of the precision on the computed result. We will compute the
variance of the sum or difference and of the product of two independent variables. We will also show
how the variance on any function of independent samples can be easily estimated using a first order
approximation.
Keep in mind that these derivations only hold for independent samples. If some of them are
dependent, you should first eliminate them using the equations for the dependencies.
This expression is not very useful; it must be rewritten as a function of ā, b̄, σa and σb. To obtain
that, we rewrite a as ā + (a − ā):

E[a²] = E[(ā + (a − ā))²]
      = ā² + E[(a − ā)²] + E[2ā(a − ā)]
      = ā² + σa²    (12.21)
σab² = (ā² + σa²)(b̄² + σb²) − ā² b̄²
     = ā² σb² + b̄² σa² + σa² σb²
     = ā² b̄² ( σa²/ā² + σb²/b̄² + (σa²/ā²)(σb²/b̄²) )    (12.22)
     ≈ ā² b̄² ( σa²/ā² + σb²/b̄² )    (12.23)
The last line is a first order approximation which is acceptable if the relative errors are small. We
conclude that when two variables are multiplied the relative variances must be added.
The first step is a first order approximation: the expectation of a linear function is the function of
the expectations. Similarly, the second line is a Taylor expansion, assuming all higher derivatives are
zero. The third step is the application of (12.19).
With this approach you can easily verify that the variance on a product or division is obtained by
summing the relative variances.
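A Monte Carlo sketch (with made-up means and standard deviations) confirming that relative variances add for a product:

```python
import random

# Monte Carlo check of equation (12.23): for a product of two independent
# variables, the relative variances add (to first order).
random.seed(1)
a_mean, a_sig = 100.0, 3.0
b_mean, b_sig = 50.0, 2.0
prods = [random.gauss(a_mean, a_sig) * random.gauss(b_mean, b_sig)
         for _ in range(200000)]
m = sum(prods) / len(prods)
var = sum((p - m)**2 for p in prods) / len(prods)
rel_var = var / m**2
rel_pred = (a_sig / a_mean)**2 + (b_sig / b_mean)**2   # first order prediction
print(rel_var, rel_pred)
```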
A central profile through Λ along θ yields the Fourier transform of the projection q(s, θ).
Figure 12.1: The ramp filter can be computed as the difference between a rectangular and a triangu-
lar filter
Consider the rectangular filter R(ω) of figure 12.1: its value equals W for frequency ω between
−W and W , and it is zero elsewhere. Its inverse Fourier transform equals:
R(x) = ∫_{−∞}^{∞} R(ω) e^{2πjωx} dω
     = ∫_{−W}^{W} W e^{2πjωx} dω = W ∫_{−W}^{W} (cos(2πωx) + j sin(2πωx)) dω
     = W ∫_{−W}^{W} cos(2πωx) dω
     = W sin(2πW x) / (πx)    (12.32)
The third equality follows from the fact that the integral of a sine over an interval symmetrical about
zero is always zero. The inverse Fourier transform of a rectangular filter is a so-called sinc function.
The convolution of a rectangular function with itself is a triangular function. More precisely,
consider the rectangular function R2(ω) and the triangular function T(ω) defined as:

R2(ω) = 1 if |ω| ≤ W/2, and 0 elsewhere    (12.33)
T(ω) = W − |ω| if |ω| ≤ W, and 0 elsewhere    (12.34)
Figure 12.2: The inverse Fourier transform of the ramp filter, with a fine sampling (black curve). Also
shown is the sampling of the usual discretisation (red dots).
Then it is easy to verify that the convolution of R2(ω) with itself equals T(ω):

∫_{−∞}^{∞} R2(ω′) R2(ω − ω′) dω′ = ∫_{−W/2}^{W/2} R2(ω − ω′) dω′
                                 = ∫_{ω−W/2}^{ω+W/2} R2(ω′) dω′
                                 = T(ω)    (12.35)
Note that the rectangular filters R2 (ω) and R(ω) have a different width and amplitude.
Convolution in the Fourier domain corresponds to a product in the spatial domain, so the inverse
Fourier transform of the triangular filter is the square of a sinc function. Therefore, the inverse
Fourier transform of the difference between the rectangular and triangular filters, i.e. the ramp
filter, equals:

W sin(2πW x)/(πx) − ( sin(πW x)/(πx) )²    (12.36)
Figure 12.2 shows a plot of (12.36). In a discrete implementation, where one only needs the
function values at integer positions x = −N, −N + 1, . . . , N , the highest possible value for W is 0.5,
the Nyquist frequency. The red dots in the figure show these discrete function values for W = 0.5.
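In a discrete implementation, the kernel values can be computed directly from the sinc expressions above (a sketch, using W = 0.5):

```python
import math

# Discrete inverse Fourier transform of the ramp filter at integer positions,
# i.e. W*sin(2*pi*W*x)/(pi*x) - (sin(pi*W*x)/(pi*x))^2, evaluated for W = 0.5.
W = 0.5
def ramp_kernel(n):
    if n == 0:
        return W * W           # limit for x -> 0: 2*W^2 - W^2 = W^2
    x = float(n)
    return (W * math.sin(2 * math.pi * W * x) / (math.pi * x)
            - (math.sin(math.pi * W * x) / (math.pi * x))**2)

# For W = 0.5: value 1/4 at n = 0, zero at even n, -1/(pi^2 n^2) at odd n.
print([round(ramp_kernel(n), 4) for n in range(-3, 4)])
```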
The Laplace transform is very useful in computations involving differential equations, because in-
tegrals and derivatives with respect to t are transformed to very simple functions of s. Some of its
interesting features are listed below (most are easy to prove). The functions of the time are at the left,
the corresponding Laplace transforms are at the right:
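A few standard pairs, as examples of such a list (with F(s) = ∫₀^∞ f(t) e^{−st} dt; these are well-known identities, not necessarily the complete original list):

```latex
\begin{align*}
f(t) \quad &\longleftrightarrow \quad F(s) = \int_0^\infty f(t)\,e^{-st}\,dt \\
\frac{df(t)}{dt} \quad &\longleftrightarrow \quad s F(s) - f(0) \\
\int_0^t f(u)\,du \quad &\longleftrightarrow \quad F(s)/s \\
e^{-at} f(t) \quad &\longleftrightarrow \quad F(s+a) \\
1 \quad &\longleftrightarrow \quad 1/s \\
e^{-at} \quad &\longleftrightarrow \quad 1/(s+a)
\end{align*}
```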
Chapter 13
Second appendix
The denominator should be equal to 1, but if we keep it, we can apply the equation also if p is known
except for a constant factor.
The prior expectations for a and b are:

pā(a) = e^{−ā} ā^a / a!        pb̄(b) = e^{−b̄} b̄^b / b!    (13.2)
After the measurement, we know that the first detector can only have produced a counts if the other
one happened to contribute N − a counts. Dropping some constants this yields:
CHAPTER 13. SECOND APPENDIX 148
E(a | a + b = N) = ā (ā + b̄)^{N−1} N! / [ (N − 1)! (ā + b̄)^N ]    (13.12)
                 = ā N / (ā + b̄)    (13.13)
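A quick Monte Carlo check of this conditional expectation (a sketch with made-up means ā = 4, b̄ = 2):

```python
import math, random

# Monte Carlo check of (13.13): for independent Poisson variables a and b with
# means abar and bbar, the mean of a over samples with a + b = N is abar*N/(abar+bbar).
random.seed(7)
def poisson(lam):
    # Knuth's method, adequate for small lam
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= random.random()
        if p <= L:
            return k
        k += 1

abar, bbar, N = 4.0, 2.0, 9
samples = [(poisson(abar), poisson(bbar)) for _ in range(400000)]
conditional = [a for a, b in samples if a + b == N]
mean_a = sum(conditional) / len(conditional)
print(mean_a, abar * N / (abar + bbar))   # both should be close to 6.0
```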
The conditional likelihood of the complete variables f(X|Λ) can be written as:

f(X|Λ) = k(X|Q, Λ) g(Q|Λ)    (13.16)

where k(X|Q, Λ) is the conditional likelihood of the complete variables, given the reconstruction and
the measurement. These definitions immediately imply that

h(Λ′|Λ) = E[ln f(X|Λ′) | Q, Λ]    (13.17)

which means that we write the log-likelihood of the complete variables as a function of Λ′, and that we
eliminate the unknown variables X by computing the expectation based on the current reconstruction
Λ. Combining (13.16) and (13.17) results in

L(Λ′) = h(Λ′|Λ) − E[ln k(X|Q, Λ′) | Q, Λ]    (13.18)
Finally, we define a generalized EM (GEM) algorithm. This is a procedure which computes a new
reconstruction from the current one. The procedure will be denoted as M. M is a GEM-algorithm if

h(M(Λ)|Λ) ≥ h(Λ|Λ)    (13.19)

This means that we want M to increase h. We can be more demanding and require that M maximizes
h; then we have a regular EM algorithm, such as the MLEM algorithm of section 5.3.2, p 54.
Now, from equation (13.18) we can compute what happens with L if we apply a GEM-step to
increase the value of h:

L(M(Λ)) − L(Λ) = h(M(Λ)|Λ) − h(Λ|Λ) + E[ln k(X|Q, Λ)|Q, Λ] − E[ln k(X|Q, M(Λ))|Q, Λ]    (13.20)
Because M is a GEM-algorithm, we already know that h(M(Λ)|Λ) − h(Λ|Λ) is positive. If we can
also show that

E[ln k(X|Q, Λ)|Q, Λ] − E[ln k(X|Q, M(Λ))|Q, Λ] ≥ 0    (13.21)
then we have proven that every GEM-step increases the likelihood L.
By definition, we have

E[ln k(X|Q, Λ′)|Q, Λ] = ∫ k(X|Q, Λ) ln k(X|Q, Λ′) dX.    (13.22)
Consider the function ψ(t) = t ln t. It is only defined for t > 0. Here are its derivatives:

ψ′(t) = dψ(t)/dt = 1 + ln t    (13.25)
ψ″(t) = d²ψ(t)/dt² = 1/t > 0    (13.26)

The second derivative is continuous and strictly positive (t ln t is a convex function). Consequently,
we can always find a value u such that

ψ(t) = ψ(1) + (t − 1) ψ′(1) + ((t − 1)²/2) ψ″(u)  with 0 < u ≤ t.    (13.27)

Because ψ(1) = 0 and ψ′(1) = 1, this becomes:

ψ(t) = (t − 1) + ((t − 1)²/2) ψ″(u)  with 0 < u ≤ t.    (13.28)
Consider now an integral of the same form as in (13.24):

∫ f1(x) ln( f1(x)/f2(x) ) dx  with  ∫ f1(x) dx = ∫ f2(x) dx = 1.    (13.29)

We can rework it such that we can exploit the convexity of t ln t:

∫ f1(x) ln( f1(x)/f2(x) ) dx
 = ∫ ( f1(x)/f2(x) ) ln( f1(x)/f2(x) ) f2(x) dx
 = ∫ [ ( f1(x)/f2(x) − 1 ) + (1/2) ( f1(x)/f2(x) − 1 )² ψ″(u(x)) ] f2(x) dx  with 0 < u(x) < ∞
 = ∫ ( f1(x) − f2(x) ) dx + (1/2) ∫ ( f1(x)/f2(x) − 1 )² ψ″(u(x)) f2(x) dx ≥ 0
q.e.d.
Now we have proved that the GEM-step increases L. It still remains to be shown that the al-
gorithm indeed converges towards the maximum likelihood solution. If you want to know really
everything about it, you should read the paper by Dempster, Laird and Rubin, “Maximum likelihood
from incomplete data via the EM algorithm”, J R Statist Soc 1977; 39; 1-38.
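The inequality just proven (the integral in (13.29), the Kullback-Leibler divergence, is non-negative) can be illustrated numerically for discrete distributions (a sketch with made-up distributions):

```python
import math

# Numeric illustration of (13.29): sum f1*ln(f1/f2) >= 0 for any two discrete
# probability distributions, with equality when f1 = f2.
f1 = [0.1, 0.2, 0.3, 0.4]
f2 = [0.25, 0.25, 0.25, 0.25]
kl = sum(p * math.log(p / q) for p, q in zip(f1, f2))
kl_self = sum(p * math.log(p / p) for p in f1)
print(kl, kl_self)   # kl > 0, kl_self == 0
```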
which results in
b(x, y) = 1 / √(x² + y²).    (13.34)
[1] SR Cherry, JA Sorenson, ME Phelps. “Physics in Nuclear Medicine”, Saunders, 2012. ISBN
978-1-4160-5198-5.
[2] HY Oei, JAK Blokland, Nederlandse Vereniging voor Nucleaire Geneeskunde. “Aanbevelingen
Nucleaire Geneeskundige Diagnostiek”. Eburon, Delft. ISBN 90-5166-358-7.
[3] JAJ Camps, MJPG van Kroonenburgh, P van Urk, KS Wiarda. “Leerboek Nucleaire Ge-
neeskunde”, Elsevier, ISBN 90 352 2108 7.
[4] Harrison H. Barrett, Kyle J. Myers. “Foundations of Image Science”. Wiley, 2003. ISBN
0-471-15300-1.
[5] IAEA. “Nuclear Medicine Physics: A Handbook for Teachers and Students”. IAEA Reference
Number: STI/PUB/1617, released on the internet on 2015-01-14. ISBN 978-92-0-143810-2.
http://www-pub.iaea.org/books/IAEABooks/10368/Nuclear-Medicine-Physics-A-Handbook-
for-Teachers-and-Students
[6] M.G. Stabin, M.A. Emmons, M.J. Fernald, A.B. Brill, W.P. Segars. “Use of realistic anthropo-
morphic models for calculation of radiation dose in nuclear Medicine”. International congress
of the International Radiation Protection Association (IRPA), Argentina, 2008.
[7] H. Hänscheid et al. “EANM Dosimetry Committee series on standard operational procedures
for pre-therapeutic dosimetry II. Dosimetry prior to radioiodine therapy of benign thyroid
diseases.” European Journal of Nuclear Medicine and Molecular Imaging 40 (2013): 1126-1134.
[10] Martin, James E. “Physics for radiation protection: a handbook”. John Wiley & Sons, 2006.