Rohleder 2005
Table 1 Mean, maximum and minimum values and standard deviation ref of the different analyte concentrations as determined by the reference
method. The relative coefficient of variation (%CV) measures the precision of the reference methods upon re-sampling. The concentration of LDL
cholesterol was determined using the Friedewald formula.
Uric acid 5.3 2.5 9.0 1.3 Enzymatic colorimetric test 1.7
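The Friedewald estimate named in the caption is not spelled out in this report. As a minimal sketch, assuming the commonly cited form for concentrations in mg/dl (LDL = total cholesterol − HDL − triglycerides/5) and the usual caveat that it is considered unreliable for triglycerides of roughly 400 mg/dl and above, it could be computed as follows. The function name, the argument names, the validity limit, and the example donor are illustrative assumptions and are not taken from the study.

```python
def friedewald_ldl(total_chol, hdl, triglycerides, tg_limit=400.0):
    """Estimate LDL cholesterol (mg/dl) from a standard lipid panel.

    All inputs are assumed to be in mg/dl. The default validity limit for
    triglycerides is the commonly cited value, not a number from the paper.
    """
    if triglycerides >= tg_limit:
        raise ValueError("Friedewald estimate is not considered valid for high triglycerides")
    # Triglycerides / 5 approximates the VLDL cholesterol fraction in mg/dl.
    return total_chol - hdl - triglycerides / 5.0


# Hypothetical donor: 200 mg/dl total cholesterol, 50 mg/dl HDL, 150 mg/dl triglycerides.
print(friedewald_ldl(200.0, 50.0, 150.0))  # 120.0
```

Because LDL is derived from the other three lipid values in this way, it is not an independent reference measurement, which is consistent with the strong correlation between cholesterol and LDL reported in the text.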
rectly investigated by mid-infrared spectroscopy.6 Alternatively, various kinds of Raman spectroscopy may be used to provide access to the fundamental vibrational spectra of body fluids.8,9,18–20 Near-infrared Raman spectroscopy appears to be particularly favorable, since it takes advantage of the low absorption coefficient of water in the near-infrared spectral region (which, e.g., amounts to 0.35/cm at a wavelength of 1 μm) and since the fluorescent light background is strongly reduced when compared to using visible light for Raman spectroscopy.

For the research described in this report, two approaches were followed in order to investigate body fluids, namely near-infrared Raman spectroscopy of serum in its native form and mid-infrared spectroscopy of dried films of serum. While each of the two spectroscopic techniques was optimized individually, identical samples were used for both analyses, the sample throughput was required to be identical, and the final data analysis was performed with identical software routines in order to allow for a close comparison of the two approaches.

2 Materials and Methods

Blood samples were collected from 238 healthy donors and nine patients suffering from diabetes. The samples were centrifuged at 900 g for 30 min using a Heraeus Labofuge GL and the serum was isolated. Serum samples from 80 donors were set aside and glucose was added to these samples in the form of a glucose solution (Fresenius Kabi Glucosteril® 70%). Forty of these samples were spiked with 2.2 μL glucose solution per millilitre of serum to increase the glucose concentrations by approximately 150 mg/dl. Glucose concentrations in the other 40 samples were increased by approximately 300 mg/dl upon addition of 4.4 μL glucose solution per millilitre of serum. Subsequently, all of the serum samples were partitioned into multiple aliquots of 3 mL each. All samples were frozen at −80 °C for storage purposes. One of the aliquots of each donor's samples was subjected to clinical chemistry testing: The concentrations of total protein, glucose, uric acid, urea, cholesterol, triglycerides and HDL cholesterol were determined by enzymatic tests using a MODULAR® PP system.§3 The concentration of LDL cholesterol was calculated by the Friedewald formula. Note that these reference concentrations of analytes were determined after the aforementioned steps, particularly after spiking. Furthermore, the concentrations of cholesterol, triglycerides, HDL and LDL are physiologically interrelated. While the square of Pearson's correlation coefficient indicates a substantial correlation (r² = 0.86) between cholesterol and LDL, r² is less than 0.3 for all other pairs of metabolites. For example, triglycerides are not completely unrelated to cholesterol (r² = 0.19) and HDL (r² = 0.21). Qualitatively, these observations also hold true when calculating r² within the various subsets (e.g., spiked samples versus unspiked samples) individually.

§ MODULAR is a trademark of a member of the Roche group.

Minimum, mean, and maximum concentrations of each analyte are given in Table 1 together with the standard deviation of the concentrations and the reference test method. The standard measure for precision in a clinical laboratory is the coefficient of variation, which is determined by remeasuring the concentrations of analytes multiple times. The ratio between the coefficient of variation and the mean value of the concentration of the analyte under investigation, i.e., the relative coefficient of variation %CV, is listed in Table 1 for each of the individual reference methods.

Since more than 180 million people suffer from diabetes mellitus, a disorder of the glucose metabolism, the quantification of glucose has frequently served as a benchmark for the capabilities of vibrational spectroscopy in the context of clinical laboratory analysis. The quantification of glucose has therefore been emphasized throughout this manuscript.

We employed an automated pipetting system together with a BRUKER Matrix HTS-XT spectrometer for the experiments using mid-infrared spectroscopy.21 Ninety-six-well silicon sample carriers were used for mid-infrared transmission spectroscopy; 3 μL of serum were pipetted onto a sample carrier in random order and left to dry in ambient air for 30 min. After drying, the film (thickness of 2–10 μm) was subjected
to mid-infrared transmission spectroscopy. Spectroscopy was performed in transmission using a DLaTGS detector, which in contrast to mercury cadmium telluride (MCT) detectors can be operated without liquid nitrogen cooling. Each spectrum was recorded in the wave number range from 500 to 4000 cm−1 and consisted of 3629 data points. Spectra were acquired at a resolution of 4 cm−1 and averaged over 32 scans. Blackman–Harris three-term apodization was used and the zero filling factor was 4 (note that the use of a zero filling of 4 is standard practice in our laboratory but does not contribute to the accuracy of our results). To improve reproducibility, each sample was pipetted and measured on three sample carriers and the three absorbance spectra of each sample were corrected for sample carrier background and normalized. We used a proprietary algorithm22 to correct for sample carrier background and to normalize the spectra. (Note that due to the good reproducibility of our measurement method during this study a simple background subtraction and standard vector normalization will give similar results.) At each wave number the median of the three pre-processed spectra was calculated and the resulting spectrum was then subjected to further analysis. Although the actual integration time per spectrum was only 29 s, the triplicate measurement, the spectroscopic determination of the sample carrier background and the sample handling time resulted in an average processing time of 5 min per sample. The sample carriers were discarded after use.

Stability of the mid-infrared system was coarsely assessed by calculating the area under the spectra before normalizing. Variations of the area under each spectrum are caused primarily by variations of the shape and thickness of the dried film of serum, which ultimately lead to a variation in optical path length. We find that the area under the curve varied by less than ±20% among the measurements in this study. However, one sample was excluded from further analysis since the area under the mid-infrared spectra amounted to less than 50% of the expected value for all three absorbance spectra originating from this sample.

A Kaiser Optical HoloSpec f/1.8i spectrometer was used for Raman spectroscopy. Laser radiation (wavelength 785 nm) interacted with the sample within a quartz cuvette (power at the location of the sample: 200 mW) and backscattered radiation was collected using an Olympus PL4X lens. Ten quartz cuvettes were alternated and, after ten measurements, the cuvettes were cleaned in 1% Hellmanex II solution (Hellma GmbH&Co. KG, Müllheim/Baden, Germany) at 70 °C, dried and used for the next set of measurements. Spectral resolution was 8 cm−1. Spectra were acquired over 5 min during 12 acquisitions of 25 s each. The raw spectra were normalized and a fifth order polynomial background was subtracted in the region from 300 to 1870 cm−1 using an iterative algorithm.23 Further details of the Raman experiments are reported in Ref. 19.

The strategy of the comparison was to use an optimum setting for each spectroscopic method individually, but to require the working conditions from a laboratory standpoint and the data analysis to be as equivalent as possible for both methods. Differences and similarities between the parameters used for the two approaches are listed in Table 2. While many of the parameters had become an internal working standard during our prior investigations, we took particular care to require a throughput of at least 80 samples per day for both spectroscopic methods. Furthermore, liquid nitrogen cooling had to be avoided in view of possible future laboratory application. We used samples from the same study for the investigation of both spectroscopic approaches and we split the spectra into the same calibration and validation sample subsets. Finally, identical multivariate analysis algorithms were used for the quantitative analysis of the pre-processed spectra.

After the spectroscopy and the pre-processing of spectra had been completed, all data (laboratory data and spectra) of the 247 donors were divided into a teaching set of 148 donors' data and a set of 99 donors' data for independent validation. Those samples exhibiting the lowest and the highest concentrations of the different analytes were always assigned to the teaching set. The statistical equivalence of the teaching and validation sets was verified on the basis of two-sample t-tests and two-sample F-tests for the different analytes. For teaching, partial least squares regression (PLS) was performed using MathWorks' MatLab™ 6.0 Release 12 together with the SIMPLS algorithm implemented in the PLS_Toolbox 2.1 by Eigenvector Research, Inc. In order to optimize the training within the teaching data set, the root-mean-square error of calibration (RMSEC) and the root-mean-square error of leave-one-out cross validation (RMSECV) were calculated

\[
\mathrm{RMSEC} = \left[ \frac{\sum_{i=1}^{N_{\mathrm{teach}}} \left( c_{\mathrm{pred},i} - c_{\mathrm{ref},i} \right)^2}{N_{\mathrm{teach}} - \mathrm{LV} - 1} \right]^{1/2}
\]

\[
\mathrm{RMSECV} = \left[ \frac{\sum_{i=1}^{N_{\mathrm{teach}}} \left( c_{\mathrm{pred},i} - c_{\mathrm{ref},i} \right)^2}{N_{\mathrm{teach}}} \right]^{1/2} .
\]

Here c_ref,i and c_pred,i denote the concentrations of analytes in sample i as determined by the reference method and by the spectroscopic measurement, respectively. N_teach is the number of teaching samples (N_teach = 148) and LV is the number of latent variables used for the PLS calibration. The optimum LV was chosen as the value that corresponds to the minimum of RMSECV.

The validation set remained blinded until the teaching had been finalized. As a measure for the prediction accuracy of the system, the root-mean-square error of prediction (RMSEP) was calculated according to

\[
\mathrm{RMSEP} = \left[ \frac{\sum_{i=1}^{N_{\mathrm{val}}} \left( c_{\mathrm{pred},i} - c_{\mathrm{ref},i} \right)^2}{N_{\mathrm{val}}} \right]^{1/2} .
\]

N_val is the number of validation samples (N_val = 99 for Raman spectroscopy, N_val = 98 for mid-infrared spectroscopy). Relative errors (%RMSEP) are calculated as the ratio between RMSEP and the mean concentration.

3 Results

An example of the mid-infrared spectrum of a dried film of serum is given in Fig. 1. The mid-infrared spectrum is dominated by the infrared absorption of proteins such as albumin or globulins, which, after drying, constitute the major components of the serum film. Proteins exhibit characteristic vibrations of the polypeptide skeleton. The most pronounced peak
Table 2 Main characteristics of the parameters used for infrared and Raman spectroscopy. Note that aliquots of identical serum samples have been used for both approaches.

Parameter                                        FTIR                                   Raman
Process parameters
  Sample throughput                              80/day                                 80/day
  Need for liquid nitrogen cooling               No                                     No
  Sample carrier type                            Silicon plate                          Quartz cuvette
  Sample carrier reuse                           No                                     Yes
  Background measurement                         Yes                                    No
  Sample volume used                             100 μL                                 1 mL
  Sample handling                                Automated                              Manual
  Sample drying                                  Yes                                    No
  Multiplicity of measurement                    Triplicate                             Single
Spectroscopy parameters
  Light source                                   Globar                                 Semiconductor laser
  Detector type                                  DLaTGS                                 CCD
  Acquisition time for a single spectrum         30 s                                   5 min
  Detected wave number range                     500–4000 cm−1                          300–3500 cm−1
  Spectral resolution                            4 cm−1                                 8 cm−1
  Zero filling                                   4                                      ~2
Analysis parameters
  Data pre-treatment                             Background correction, normalization   Subtraction of 5th order polynomial, normalization
  Wave number range used for PLS analysis
    of proteins                                  1220–1690 cm−1                         300–1500 cm−1
  Wave number range used for PLS analysis
    of all other analytes                        500–1800 and 2500–3300 cm−1            300–1500 cm−1
  Teaching set                                   148 serum samples                      148 serum samples
  Teaching algorithm                             SIMPLS                                 SIMPLS
  Determination of optimum No. LV                Minimum of RMSECV                      Minimum of RMSECV
  Independent validation set                     99 serum samples                       99 serum samples
  Measure of quality of quantification           RMSEP                                  RMSEP
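The error measures listed in Table 2 (RMSEC, RMSECV, RMSEP) and the rule for choosing the number of latent variables can be sketched in a few lines. This is only an illustration under stated assumptions: the study used the SIMPLS implementation of the PLS_Toolbox under MatLab, whereas the sketch below is written in Python with NumPy and leaves the regression model abstract. The names `rmsec`, `rmse`, `loo_predictions`, `select_lv`, and the `fit`/`predict` callables are hypothetical and do not come from the paper.

```python
import numpy as np


def rmsec(c_pred, c_ref, n_lv):
    """Root-mean-square error of calibration, with N - LV - 1 in the denominator."""
    c_pred = np.asarray(c_pred, dtype=float)
    c_ref = np.asarray(c_ref, dtype=float)
    dof = c_ref.size - n_lv - 1
    if dof <= 0:
        raise ValueError("too many latent variables for the number of samples")
    return float(np.sqrt(np.sum((c_pred - c_ref) ** 2) / dof))


def rmse(c_pred, c_ref):
    """Plain root-mean-square error; serves as RMSECV (leave-one-out
    predictions on the teaching set) and as RMSEP (validation set)."""
    c_pred = np.asarray(c_pred, dtype=float)
    c_ref = np.asarray(c_ref, dtype=float)
    return float(np.sqrt(np.mean((c_pred - c_ref) ** 2)))


def loo_predictions(spectra, c_ref, fit, predict):
    """Predict each teaching sample from a model trained on all other samples.

    `fit(spectra, c_ref)` returns a model and `predict(model, spectra)`
    returns one prediction per row; both are placeholders for whatever
    regression (e.g., PLS with a fixed number of latent variables) is used.
    """
    spectra = np.asarray(spectra, dtype=float)
    c_ref = np.asarray(c_ref, dtype=float)
    preds = np.empty_like(c_ref)
    for i in range(c_ref.size):
        keep = np.arange(c_ref.size) != i
        model = fit(spectra[keep], c_ref[keep])
        preds[i] = predict(model, spectra[i:i + 1])[0]
    return preds


def select_lv(rmsecv_by_lv):
    """Return the number of latent variables with the smallest RMSECV."""
    return min(rmsecv_by_lv, key=rmsecv_by_lv.get)
```

In use, one would compute `rmse(loo_predictions(...), c_ref)` for each candidate number of latent variables, pass the resulting dictionary to `select_lv`, and only then evaluate `rmse` once on the blinded validation set to obtain RMSEP.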
Table 3 Results of the teaching procedure. LV_min denotes the number of latent variables for which the root-mean-square error of leave-one-out cross validation (RMSECV) becomes minimal. RMSEC is the root-mean-square error of calibration. Since RMSECV does not statistically differ from its minimum value within a range of latent variables, the range of statistically equivalent values of LV is noted in brackets.
dependent validation was performed and the root-mean-square error of prediction (RMSEP) was calculated. In retrospect we find that those values of LV which provide the minimum value for RMSEP within the independent validation set (see Table 4) are frequently very close to our estimates LV_min, which were derived from the teaching set only. Thus, we conclude that the method we used for estimating the optimum number of latent variables provides a reasonable approach to the problem of dimensionality.

Fig. 4 Concentrations of analytes in the validation samples as predicted by mid-infrared spectroscopy (c_pred) as compared to the concentrations determined by the laboratory methods (c_ref). Corresponding Raman data have been published in Ref. 19.

Table 4 Results of the independent validation. RMSEP{LV_min} denotes the root-mean-square error of prediction at that number of latent variables for which the root-mean-square error of leave-one-out cross validation (RMSECV) became minimal (see Table 3). RMSEP_min and RMSEP_max denote the minimum and maximum values of RMSEP observed when using PLS calibration models on the basis of different values of LV (in braces) within a range statistically equivalent to LV_min.

                    RMSEP{LV_min}            RMSEP_min                RMSEP_max
Analyte             Mid-infrared  Raman      Mid-infrared  Raman      Mid-infrared  Raman
Total protein       328 {3}       176 {10}   323 {4}       169 {8}    434 {20}      198 {7}
Glucose             14.7 {15}     17.1 {10}  13.4 {24}     16.9 {9}   17.6 {9}      21.1 {14}
Urea                3.3 {18}      4.4 {12}   3.3 {21}      4.4 {12}   5.6 {50}      4.9 {17}
Uric acid           1.4 {10}      1.1 {12}   1.3 {7}       1.1 {11}   1.6 {19}      1.3 {1}
Cholesterol         16.1 {12}     11.5 {12}  15.0 {11}     11.1 {11}  18.0 {24}     14.1 {29}
Triglycerides       18.1 {19}     20.7 {15}  17.5 {17}     19.8 {12}  21.4 {27}     23.9 {50}
HDL cholesterol     11.9 {15}     11.0 {10}  11.8 {14}     10.0 {12}  21.1 {44}     13.7 {25}
LDL cholesterol     19.4 {12}     15.7 {14}  18.6 {18}     14.6 {11}  25.3 {33}     19.1 {50}

In our measurement setup, Raman spectroscopy required larger sample volumes than the infrared spectroscopy. Although we envisage that the volume used here (1 mL) can be reduced to 200 μL by means of automation, it would still be twice the volume used in infrared spectroscopy. For infrared spectroscopy, the volume may even be reduced further: in fact, we designed our system such that it can operate with volumes as low as 70 μL and even lower volumes are conceivable.

Given the fact that vibrational changes in dipole moment (or polarizability in the case of Raman spectroscopy) are of a similar order of magnitude for most biomolecules, the sample-specific detection capabilities mainly depend on the concentration. The RMSEP values of the eight analytes under investigation are shown for both spectroscopic techniques as a function of mean concentration in Fig. 6. RMSEP appears to increase with analyte concentration. However, the ratio between RMSEP and mean value decreases with increasing concentration (dashed lines in Fig. 6): Uric acid exhibits the lowest concentration of all of the analytes investigated and carries a relative error of up to 26% upon quantification. In contrast, proteins constitute the molecular group with the highest concentration and they can be quantified within a relative error as low as 2.5%. This tendency holds true for both mid-infrared and Raman spectroscopy. In order to relate our findings to present-day clinical chemical analyzers it is also instructive to understand the measurement accuracy in terms of the number of molecules rather than their mass-related concentration: Considering the molar weights of the analytes investigated, vibrational-spectroscopy based quantification appears to be limited to accuracies in the 0.1 mmol/L range, regardless of the particular choice of the spectroscopic technique. This finding is also supported by prior publications of our group and of other groups as listed in Table 5 (Refs. 6, 8, 18, 19, and 24–27).

High signal-to-noise ratios are considered a fundamental strength of infrared spectroscopy when compared to Raman spectroscopy. However, we find that this advantage does not result in a superior prediction accuracy when compared to Raman spectroscopy. This result supports our prior finding that reproducibility rather than signal-to-noise ratio imposes a lower limit on the prediction errors in mid-infrared spectroscopy, even if particular attention is paid to the reproducibility by virtue of automation, triplicate measurement, standardization, and computational efforts.21 A small supplementary investigation also points to the importance of reproducibility: five randomly chosen samples from the above study were remeasured over the course of the above experiments using mid-infrared spectroscopy and the concentrations of analytes were predicted on the basis of the PLS algorithm described above. For each sample and each analyte, the predicted concentrations vary from measurement to measurement. In analogy to the clinical laboratory guidelines, the relative coefficient of variation (%CV) can be calculated as a measure for the precision of the system. We find that, on average, %CV ranges from 4% (protein) to 16% (LDL). These numbers have to be compared to 4.7% and 16.4% for the %RMSEP of protein and LDL, respectively. Thus, we find that for these, as well as most of the other analytes, the error observed upon remeasuring the sample still substantially contributes to the overall error in the case of infrared spectroscopy of serum. The challenge in reproducibility might be caused by the high susceptibility of mid-infrared spectroscopy to changes in environmental conditions (in particular water vapor and temperature) which affect both the spectroscopy and the drying process. In turn, the lower signal-to-noise ratio generally observed during Raman spectroscopy does not prevent the quantification of analytes in serum if a measurement time of 5 min per sample is acceptable.

In the light of a routine clinical laboratory application, the relative prediction errors (%RMSEP) may be compared to the standard deviations of reference concentrations, which primarily reflect the physiological variations within the population under investigation. For example, the standard deviation of the reference values amounts to only 5.4% of their mean concentration for total protein. On the other hand, the concentration of proteins can be predicted with a relative prediction error of 4.7% for mid-infrared spectroscopy. Ignoring any non-Gaussian contribution to the distribution of concentrations, it appears reasonable to conclude that the relative prediction error of the infrared spectroscopic approach is comparable to the biological variations of the concentration of total protein in our study population. Similar conclusions may be drawn for HDL and uric acid, for which the relative prediction errors exceed the biological variation among the donors of our study population by only 30% or less. In contrast, the %RMSEP values for cholesterol, triglycerides, LDL and urea are up to four times smaller than the biological spread of concentrations, showing that for those parameters mid-infrared and Raman spectroscopy might supply a valuable tool for quantification. It may be speculated that, similar to the case of glucose, where we have artificially spiked the samples to deliver concentrations of glucose outside the normal but well within the possible physiological range, the quantification accuracy of protein, HDL, and uric acid may appear more favorable in future studies using samples which originate from diseased people suffering, e.g., from dyslipidemia or gout.
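The comparison between relative prediction error and biological spread can be reproduced for uric acid from numbers given elsewhere in this report. The following is a minimal sketch, assuming that the uric acid row of Table 1 reads as a mean of 5.3 mg/dl and a standard deviation of 1.3 mg/dl, and taking the RMSEP values of this study (1.4 mg/dl for mid-infrared, 1.1 mg/dl for Raman). The helper name `percent_of_mean` is illustrative.

```python
def percent_of_mean(value, mean):
    """Express an absolute quantity as a percentage of the mean concentration."""
    return 100.0 * value / mean


# Uric acid, all values in mg/dl (Table 1 row read as mean 5.3, SD 1.3: an assumption).
mean_uric, sd_uric = 5.3, 1.3
rmsep_uric = {"mid-infrared": 1.4, "Raman": 1.1}

biological_spread = percent_of_mean(sd_uric, mean_uric)
print(f"biological spread: {biological_spread:.1f}% of the mean")

for method, rmsep in rmsep_uric.items():
    rel_error = percent_of_mean(rmsep, mean_uric)
    # Positive excess: the prediction error is larger than the biological spread.
    excess = 100.0 * (rel_error / biological_spread - 1.0)
    print(f"{method}: %RMSEP = {rel_error:.1f}%, excess over spread = {excess:+.1f}%")
```

Under these assumptions the mid-infrared %RMSEP for uric acid comes out at about 26% of the mean, which exceeds the biological spread by roughly 8% and is therefore within the "30% or less" margin stated above.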
Table 5 Results of the multivariate analysis of mid-infrared and Raman spectra. The given values are the root-mean-square errors of prediction (RMSEP) of a validation using N_val independent samples. As an exception the results reported in Refs. 6 and 8 were obtained using leave-one-out (LOO) validation and the values are therefore marked with an asterisk. For the study reported in this manuscript N_val = 99 samples were subjected to the validation process for both methods. (Note that in the case of infrared spectroscopy one sample was excluded from the evaluation due to unusually low absorbances in all three repetitions of the pipetting.) N_tot = 247 is the total number of samples used in the study. HDL and LDL denote the high and low density lipoprotein fraction of cholesterol, respectively. All concentrations are given in mg/dl. A dash marks analytes that were not reported.

Reference     N_tot   N_val    Total protein   Triglycerides   Cholesterol   HDL     LDL     Glucose   Urea    Uric acid
Mid-infrared
This study    247     99       328             18.1            16.1          11.9    19.4    14.7      3.3     1.4
Ref. 24       300     100      280             20.1            10.8          –       –       7.4       6.6     2.4
Ref. 25       300     100      310             23.6            11.2          –       –       27        7.2     –
Ref. 26       90      30       –               30.6            14.7          12.0    13.5    –         –       –
Ref. 27       122     24       –               13              15            –       –       16        –       –
Ref. 6        306     (LOO)    240*            16.6*           11.3*         –       –       9.5*      2.0*    –
Raman
This study    247     99       176             20.7            11.5          11.0    15.7    17.1      4.4     1.1
Ref. 18       60      18–24    71              –               10.4          –       –       –         –       –
Ref. 8        66      (LOO)    190*            29*             12*           –       –       26*       3.8*    –
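The text relates the prediction errors to molar rather than mass-related concentrations. As a rough sketch of that conversion, the mid-infrared RMSEP values of this study from Table 5 can be divided by molar masses. The molar masses below are standard values and are not given in the paper; treating the "Urea" column as urea itself (rather than urea nitrogen) is likewise an assumption. Total protein and the lipoprotein fractions are left out because no single molar mass applies to them.

```python
# Standard molar masses in g/mol (not taken from the paper).
MOLAR_MASS = {
    "glucose": 180.16,
    "urea": 60.06,
    "uric acid": 168.11,
    "cholesterol": 386.65,
}

# Mid-infrared RMSEP of this study in mg/dl, copied from Table 5.
RMSEP_MG_PER_DL = {
    "glucose": 14.7,
    "urea": 3.3,
    "uric acid": 1.4,
    "cholesterol": 16.1,
}


def mg_per_dl_to_mmol_per_l(conc_mg_per_dl, molar_mass):
    """Convert mg/dl to mmol/L: mg/dl * 10 gives mg/L; mg/L divided by g/mol gives mmol/L."""
    return conc_mg_per_dl * 10.0 / molar_mass


for analyte, rmsep in RMSEP_MG_PER_DL.items():
    molar = mg_per_dl_to_mmol_per_l(rmsep, MOLAR_MASS[analyte])
    print(f"{analyte}: {rmsep} mg/dl = {molar:.2f} mmol/L")
```

Under these assumptions the converted errors fall between just under 0.1 mmol/L (uric acid) and just under 1 mmol/L (glucose), so they span a much narrower range than the mass-related errors in the table.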
11. H. M. Heise and A. Bittner, "Multivariate calibration for near-infrared spectroscopic assays of blood substrates in human plasma based on variable selection using PLS-regression vector choices," Fresenius' J. Anal. Chem. 362, 141–147 (1998).
12. E. Diessel, S. Willmann, P. Kamphaus, R. Kurte, U. Damm, and H. M. Heise, "Glucose quantification in dried-down nanoliter samples using mid-infrared attenuated total reflection spectroscopy," Appl. Spectrosc. 58, 442–450 (2004).
13. G. Deleris and C. Petibois, "Application of FT-IR spectroscopy to plasma contents analysis and monitoring," Vib. Spectrosc. 32, 129–136 (2003).
14. R. Vonach, J. Buschmann, R. Falkowski, R. Schindler, B. Lendl, and R. Kellner, "Application of mid-infrared transmission spectrometry to the direct determination of glucose in whole blood," Appl. Spectrosc. 52, 820–822 (1998).
15. K. Hebestreit, T. Beyer, A. Lambrecht, R. Mischler, M. Schoemaker, and W. Petrich, "Infrared spectroscopy of glucose solutions using quantum cascade lasers," in SPIE Technical Summary Digest (BiOS 2004, 5321–31), p. 116, SPIE, Bellingham, WA (2004).
16. S. Schaden, M. Haberkorn, J. Frank, J. R. Baena, and B. Lendl, "Direct determination of carbon dioxide in aqueous solution using mid-infrared quantum cascade lasers," Appl. Spectrosc. 58, 667–670 (2004).
17. K. H. Hazen, M. A. Arnold, and G. W. Small, "Measurement of glucose and other analytes in undiluted human serum with near-infrared transmission spectroscopy," Anal. Chim. Acta 371, 255–267 (1998).
18. J. Y. Qu, B. C. Wilson, and D. Suria, "Concentration measurements of multiple analytes in human sera by near-infrared Raman spectroscopy," Appl. Opt. 38, 5491–5497 (1999).
19. D. Rohleder, W. Kiefer, and W. Petrich, "Raman spectroscopy of serum and serum ultrafiltrate," Analyst (Cambridge, U.K.) 129, 906–991 (2004).
20. C. R. Yonzon, C. L. Haynes, X. Zhang, J. T. Walsh, Jr., and R. P. Van Duyne, "A glucose biosensor based on surface-enhanced Raman scattering: improved partition layer, temporal stability, reversibility and resistance to serum protein interference," Anal. Chem. 76, 78–85 (2004).
21. J. Moecks, G. Kocherscheidt, W. Köhler, and W. Petrich, "Progress in diagnostic pattern recognition," Proc. SPIE 5321, 117–123 (2004).
22. J. Moecks, D. Rohleder, and W. Petrich, German patent application (pending).
23. C. A. Lieber and A. Mahadevan-Jansen, "Automated method for subtraction of fluorescence from biological Raman spectra," Appl. Spectrosc. 57, 1363–1367 (2003).
24. R. A. Shaw, S. Kotowich, M. Leroux, and H. H. Mantsch, "Multianalyte serum analysis using mid-infrared spectroscopy," Ann. Clin. Biochem. 35, 624–632 (1998).
25. R. A. Shaw and H. H. Mantsch, "Multianalyte serum assay from mid-infrared spectra of dry film on glass slides," Appl. Spectrosc. 54, 885–889 (2000).
26. K.-Z. Liu, R. A. Shaw, A. Man, T. C. Dembinski, and H. H. Mantsch, "Reagent-free, simultaneous determination of serum cholesterol in HDL and LDL by infrared spectroscopy," Clin. Chem. 48, 499–506 (2002).
27. W. Petrich, B. Dolenko, J. Früh, M. Ganz, H. Greger, S. Jacob, F. Keller, A. E. Nikulin, M. Otto, O. Quarder, R. L. Somorjai, A. Staib, G. Werner, and H. Wielinger, "Disease pattern recognition in infrared spectra of human sera with diabetes mellitus as an example," Appl. Opt. 39, 3372–3379 (2000).