Choosing Association Measures
Measures of
Association
How to Choose?                                                    Researchers in sonography, as well as other areas,
                                                                  often wish to measure the strength of relationship
HARRY KHAMIS, PhD                                                 or association between two variables. For exam-
                                                                  ple, one may wish to determine if, on the average,
                                                                  total cholesterol level increases as age increases
                                                                  for adult American men. However, there are a
                                                                  very large number of measures or coefficients
                                                                  (i.e., a number that indicates the strength of the
                                                                  relationship between two variables) from which
                                                                  to choose. It is not infrequent to find a researcher
                                                                  selecting an incorrect coefficient to measure a
                                                                  given association, thereby possibly rendering a
                                                                  false or misleading conclusion. The choice of the
                                                                  proper measure of association is based on, among
                                                                  other things, the characteristics of each of the two
                                                                  variables involved. This article enumerates every
                                                                  case that can be encountered by the researcher
                                                                  and provides an appropriate measure of associa-
                                                                  tion that can be used.
                                                                  Key words: coefficient, relationship, ordinal,
                                                                  nominal, continuous
To choose an appropriate measure of
association, one must identify the level of meas-         only a few possible values. Examples of discrete-
urement (defined and discussed below) of each             valued variables are gender (with values male,
variable being studied. Based on this information,        female), severity of disease (with values mild,
an appropriate measure of association can be iden-        moderate, severe), type of operation (with values
tified as outlined below. Indeed, for any given sit-      standard, modified, laparoscopic), New York bor-
uation, there may be several different measures of        ough (with values Manhattan, Queens, etc.).
association that are valid. Rather than providing an      Because of the way these variables are defined, it
exhaustive list of all possible such measures, this       is not possible to observe a value between two
article provides recommendations of just one or           given values. For example, each patient is either
two measures for each practical situation encoun-         male or female with no other possible designation.
tered by the researcher. Ample references are                In the case of discrete variables, there are two
provided, directing the researcher to sources of          subcategories of measurement scale, ordinal and
further reading. Several real-data examples are           nominal. An ordinal variable is a discrete variable
provided, illustrating the measures discussed.            having an order associated with its levels. For
    In the next section, levels of measurement are        example, severity of disease is an ordinal variable
defined and discussed. In subsequent sections, the        because the “moderate” level represents a some-
process of selecting an appropriate measure of            what more severe disease state than the “mild”
association for any given situation is outlined, and      level, and the “severe” level corresponds to a more
a summary and conclusion are provided.                    severe condition than the “moderate” level. If the
                                                          levels of a discrete variable do not have any order
Levels of Measurement                                     associated with them, then the variable is called a
                                                          discrete nominal variable. Type of operation is a
   The level of measurement (or measurement               nominal variable because the three different types
scale) of a variable is its designation as continuous     of operation do not have any order associated with
or discrete.                                              them. You might argue that indeed there is an order
   A continuous-valued variable has values that, at       associated with these three levels—for instance,
least theoretically, come from a continuum of the         one type of operation may be more expensive than
real number line. For such variables, there are, the-     another type of operation. However, this is a dif-
oretically at least, no gaps in the possible values of    ferent variable from the one defined earlier. If the
the variable. Examples of continuous-valued vari-         variable of interest is cost of operation, with levels
ables are gestational age, blood pressure, body           inexpensive, moderate, and expensive, then indeed
mass index, left ventricular ejection fraction (cal-      this would be an ordinal variable. However, type
culated as the percentage of blood expelled in a          of operation is a nominal variable.
cycle), size of rotator cuff calcification (in cm),
and coracoacromial distance (in cm). Consider             Measures of Association—How to Choose
gestational age: it is possible that there is a fetus
with a gestational age of 60.32 days, even though            Suppose you wish to study the relationship
the recorded value is 60 days. The more precise           between two variables by using a single measure
measurement, 60.32 days, is not recorded,                 or coefficient. There are many considerations that
perhaps because the measuring instrument lacks            go into selecting an optimum measure of associa-
sufficient accuracy.                                      tion; these are briefly discussed in the Summary
   A discrete-valued variable has values that are         section. For simplification, we base our selection
discrete or “separated.” For such variables, the          of an appropriate measure exclusively on what
values do not come from a continuum of the real           type of variable we are measuring (i.e., its level of
number line; rather, there are gaps between the           measurement)—namely, whether it is (1) continu-
values of the variable. Usually, such variables have      ous, (2) discrete ordinal, or (3) discrete nominal.
                                                                        MEASURES OF ASSOCIATION / Khamis            157
On the basis of this information, we choose an              (i.e., knowledge of the value of one variable in no way
appropriate measure of association with which to            improves the ability to predict the value of the other
analyze the relationship between the two vari-              variable), (2) the two variables are highly variable, or
ables. From a practical point of view, the six pos-         (3) the two variables have a nonlinear relationship.
sible combinations of variables encountered by                  There is no universal rule for interpreting a given
researchers are as follows:                                 value of r, but informal guidelines have been given. For
                                                            most studies involving medical, biomedical, biological,
1.   Continuous-continuous                                  health care, sociological, educational, and psychologi-
2.   Continuous-ordinal                                     cal data, the following guidelines are appropriate:4
3.   Continuous-nominal
4.   Ordinal-ordinal                                        r                      Interpretation of Linear Relationship
5.   Ordinal-nominal                                         0.8                            Strong positive
6.   Nominal-nominal                                         0.5                            Moderate positive
                                                             0.2                            Weak positive
   For each of these combinations of variables, one or       0.0                            No relationship
more measures of association that accurately assess         –0.2                            Weak negative
                                                            –0.5                            Moderate negative
the strength of the relationship between the two vari-      –0.8                            Strong negative
ables are discussed below. The following is not an
exhaustive list of all possible measures of association
but rather the most commonly used and practically             The above guidelines are generally in agreement
useful measures. Most of these measures are available       with Cohen’s5 recommended guidelines:
in most statistical software packages. The mathemat-
ical formulas and more extensive details for these              |r| < 0.3 → Weak relationship
measures can be found in many statistics texts; three           0.3 ≤ |r| ≤ 0.5 → Moderate relationship
                                                                |r| > 0.5 → Strong relationship
basic references for such measures are Goodman and
Kruskal,1 Liebetrau,2 and Khamis.3
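
The Pearson coefficient r and the Cohen thresholds quoted in this section can be illustrated with a minimal sketch. The function names pearson_r and cohen_label, and the small data set, are hypothetical and are not taken from the article; in practice a statistical package would normally be used.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)

def cohen_label(r):
    """Label |r| with the Cohen thresholds quoted in the text."""
    size = abs(r)
    if size < 0.3:
        return "weak"
    if size <= 0.5:
        return "moderate"
    return "strong"

# Hypothetical data (not from the article): years of experience
# and annual income in thousands of dollars.
years = [1, 3, 5, 7, 9]
income = [41, 46, 52, 55, 63]
r = pearson_r(years, income)
print(round(r, 3), cohen_label(r))

# The value reported in Example 1 falls in the "strong" band.
print(cohen_label(0.77))
```

The label depends only on the magnitude of r; the sign carries the direction of the linear relationship.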
                                                               Suppose one or both of the continuous variables
1. CONTINUOUS-CONTINUOUS
                                                            of interest have extreme values (sometimes called
                                                            outliers). Annual income may be such a variable. For
    Consider two continuous variables. In most              example, although the vast majority of workers
instances, the Pearson correlation coefficient (also        within a given organization have annual incomes
called the Pearson product-moment correlation coef-         relatively close to the average annual income, there
ficient), r, is appropriate for measuring the strength of   are a few individuals with positions at the highest
the linear relationship between them. The value of r        levels of the administrative hierarchy (CEO, presi-
lies between –1 and +1. Values close to –1 indicate a       dent, vice presidents, etc.) whose incomes are
strong negative linear relationship (as the value of        extremely high. In this case, a more appropriate
one variable increases, the value of the other variable     measure of a linear relationship is the Spearman
decreases; e.g., consider age and physical stamina for      rank correlation coefficient. This is especially true if
adult Americans—as people get older, they have less         a test of the statistical significance of the strength of
physical stamina in general). Values close to +1 indi-      the relationship is desired. Its values also range from
cate a strong positive linear relationship (as the value    –1 to +1, and it is interpreted in the same way as r.
of one variable increases, so does the value of the            For further reading on these coefficients, refer
other variable; e.g., consider years of experience and      to almost any standard statistics text—for exam-
annual income among professional sonographers).             ple, Sokal and Rohlf6 or Zar.7
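
One common way to obtain the Spearman rank correlation coefficient is to replace each value by its rank (tied values share the average rank) and then compute the Pearson coefficient on the ranks. The sketch below assumes that approach; the helper names and the income figures, including the single extreme value, are hypothetical and serve only to show how an outlier affects the two coefficients differently.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical data: years of experience and income (thousands of dollars);
# the last income is an extreme value of the kind described in the text.
experience = [2, 4, 6, 8, 10, 12]
income = [40, 45, 50, 55, 60, 900]
print("Pearson :", round(pearson_r(experience, income), 2))
print("Spearman:", round(spearman_rho(experience, income), 2))
```

Because the ranks ignore how far the extreme income lies from the rest, the Spearman coefficient reflects the monotone pattern, while the Pearson coefficient is pulled down by the single outlier.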
Values of r close to zero indicate no linear relation-
ship between the two variables. A value of r close to          Example 1. In a study to determine whether left
zero can occur (1) if the two variables are independent     atrial size, pressure, and ejection fraction are
158   JOURNAL OF DIAGNOSTIC MEDICAL SONOGRAPHY            May/June 2008   VOL. 24, NO. 3
useful in diagnosing patients with left ventricle          Center obtained data on 52 subjects. One of the
diastolic dysfunction through noninvasive means,8          questions posed in the survey was as follows:
the Pearson correlation coefficient between (1) the        “Have you had a lot of energy in the last four
left atrial pressure evaluated through pulmonary           weeks?” The possible responses were 1 = none of
wedge pressure and (2) the E/A wave velocity               the time, 2 = a little of the time, 3 = some of the
ratio is r = 0.77. This can be characterized as a          time, 4 = a good bit of the time, 5 = most of the time,
“strong” positive linear relationship between the          and 6 = all of the time. This researcher wished to
two variables.                                             assess the strength of relationship between this
                                                           ordinal variable and the age of the subject. For
2. CONTINUOUS-ORDINAL                                      these data, Kendall’s τb is 0.08. There appears to
                                                           be no strong relationship between these two vari-
   If one variable is continuous and the other is
                                                           ables. The Spearman rank correlation coefficient
ordinal, then an appropriate measure of associa-
                                                           for these data, with the coding given above, is
tion is Kendall’s coefficient of rank correlation
                                                           0.10, very close to the value of Kendall’s τb, ren-
tau-sub-b, τb. If the two variables are denoted by X
                                                           dering the same conclusion. This illustrates the
(continuous) and Y (ordinal), then consider the
                                                           closeness of Kendall’s τb with the Spearman rank
levels of Y to be numerically coded according to
                                                           correlation coefficient mentioned above.
the order of the levels (e.g., assign 1, 2, 3, . . . to
the levels). Then Kendall’s τb uses the numerical
values of X and the coded numerical values of Y to         3. CONTINUOUS-NOMINAL
render a number (coefficient) between –1 and +1
that measures the strength of relationship between            If one variable is continuous and the other is
X and Y. For further reading, see Liebetrau.2              nominal with just two categories, then use the
   If the ordinal variable, Y, has a large number of       point-biserial correlation coefficient. This coef-
levels (say, five or six or more), then one may use the    ficient ranges between –1 and +1. Values close
Spearman rank correlation coefficient to measure the       to ±1 indicate a strong positive/negative rela-
strength of association between X and Y. In doing so,      tionship, and values close to zero indicate no
one must be careful in numerically coding the levels       relationship between the two variables. If the
of Y in a practically meaningful way, keeping in mind      nominal variable has more than two levels, then
that a metric is being imposed by the coding scheme.       one can calculate the point-biserial correlation
See Chatfield9(p45) and Luce and Narens10 for further      between the continuous variable and all possible
discussion. A typical example of this treatment is if Y    pairs of levels of the nominal variable; this
represents degree of agreement, with the following         would result in k(k − 1)/2 such coefficients,
levels: 1 = very strongly agree, 2 = strongly agree,
3 = agree, 4 = neutral, 5 = disagree, 6 = strongly dis-    where k represents the number of levels of the nom-
agree, and 7 = very strongly disagree. Another exam-       inal variable. For further reading, see Tate.11,12
ple involves income ranges where each level is coded          The calculation of the point-biserial correlation
with the midpoint between the lowest and highest           coefficient is accomplished by coding the two lev-
income of the range (e.g., a range of $50,000–$75,000      els of the binary variable “0” and “1” and obtain-
would be coded 62,500).                                    ing the Pearson correlation coefficient between the
   The performance of the Spearman rank correla-           continuous variable and this coded binary variable.
tion coefficient is comparable to that of Kendall’s
τb, with the former being somewhat better for large           Example 3. The following data show the absolute
sample sizes (see Zar7(p392)).                             value of the difference in labor time from six hours
                                                           (median labor time) for expectant mothers and
    Example 2. A medical researcher who was a client       whether they receive analgesia (Y, N); see Kotz
at the Wright State University Statistical Consulting      and Johnson.13(p279)
Y 14.8 12.4 10.1 7.1 6.1 4.6         3.2   3.0 2.4 2.3 2.1      can be used to measure association. The same dis-
   0.8 0.1                                                      cussion as for the continuous-nominal case (see
N 13.8 5.8 4.3 3.5 3.3 2.8           2.8   2.5 1.7 1.7 1.5      above) applies here. See Cureton.15
   1.3 1.3 1.2 1.2 1.1 0.7           0.6   0.5 0.2 0.2
                                                                   Example 5. In the study discussed in example
   The point-biserial correlation coefficient for               3, the rank-biserial correlation coefficient is 0.43.
these data is 0.36. There is a “moderately” strong              This value is somewhat larger than the point-
association between extreme labor time (either                  biserial correlation coefficient of 0.36. Because
very short or very long) and the use of analgesia in            there may be one or more outliers in the data set
expectant mothers. Specifically, receiving analge-              (e.g., 13.8 in the “N” group appears to be an out-
sia is associated with more extreme labor time.                 lier), the rank-biserial correlation coefficient is
                                                                more appropriate. The general conclusion is the
4. ORDINAL-ORDINAL                                              same as that of example 3: there is a moderately
                                                                strong association between extreme labor time
   If both variables are ordinal, then an appropriate           and use of analgesia.
measure of association is Kendall’s τb. If both ordinal
variables have a large number of levels, then an appro-
priate numerical coding scheme can be used and the              6. NOMINAL-NOMINAL
Spearman rank correlation coefficient calculated. See
the discussion of the continuous-ordinal case above.               Consider two discrete nominal variables whose
                                                                association is of interest. Suppose that both vari-
   Example 4. The following data come from a study              ables have just two levels; the resulting data dis-
of economic voting behavior by Kuklinski and West:14            play is in the form of a two-by-two contingency
                                                                table. One common way of measuring association
                                     Expected Financial         in such a table is to use the phi coefficient, ϕ.
                                        Well-Being              Values of ϕ lie between 0 and 1. Values of ϕ close
                               Better      Same      Worse      to 0 indicate very little association, and values
                                                                close to 1 indicate nearly perfect predictability.
Present Financial Well-Being                                    Fleiss16 provides, as a rule of thumb, that any value
  Better                        70           85            15
  Same                          10          134            41
                                                                of ϕ less than 0.30 or 0.35 may be taken to indi-
  Worse                         27           60           100   cate no more than trivial association. See Fleiss16
                                                                for further discussion.
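
For cross-classified ordinal data such as the table of Example 4, Kendall’s τb can be computed directly from the cell counts. The sketch below assumes the usual tie-corrected definition: concordant minus discordant pairs, divided by the geometric mean of the numbers of pairs not tied on each variable. The function name kendall_tau_b is hypothetical, and a statistical package may handle ties slightly differently.

```python
import math

def kendall_tau_b(table):
    """Kendall's tau-b for an r x c table of counts whose rows and
    columns are both listed in the order of their levels."""
    n_rows, n_cols = len(table), len(table[0])
    concordant = discordant = 0
    for i in range(n_rows):
        for j in range(n_cols):
            for k in range(i + 1, n_rows):      # rows strictly below row i
                for l in range(n_cols):
                    if l > j:                   # below and to the right
                        concordant += table[i][j] * table[k][l]
                    elif l < j:                 # below and to the left
                        discordant += table[i][j] * table[k][l]
    n = sum(sum(row) for row in table)
    total_pairs = n * (n - 1) // 2
    # Pairs tied on the row variable and on the column variable.
    row_ties = sum(t * (t - 1) // 2 for t in (sum(row) for row in table))
    col_ties = sum(t * (t - 1) // 2 for t in (sum(col) for col in zip(*table)))
    return (concordant - discordant) / math.sqrt(
        (total_pairs - row_ties) * (total_pairs - col_ties))

# Counts from Example 4: rows are present financial well-being,
# columns are expected financial well-being (Better, Same, Worse).
example4 = [
    [70, 85, 15],
    [10, 134, 41],
    [27, 60, 100],
]
print(round(kendall_tau_b(example4), 2))  # 0.39, as reported for Example 4
```

The sign is positive because both variables are listed in the same order of levels; reversing the order of one variable only would change the sign and leave the magnitude unchanged.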
                                                                   For two nominal variables in which at least one
   The value of Kendall’s τb for these data is 0.39.
                                                                of the variables has more than two levels, a useful
This represents a “moderately” strong association
                                                                measure of association is Goodman and Kruskal’s
between these two variables: better present financial
                                                                lambda, λ. The value λ is the relative decrease in
well-being tends to correspond to better expected
                                                                the probability of error in guessing the level of one
financial well-being.
                                                                of the variables when the level of the other
   It is interesting to note that, if the levels of each
                                                                variable is known rather than unknown. The value λ lies
variable are coded 1, 2, and 3 for worse, same, and
                                                                between 0 and 1; values close to 1 correspond to a
better, respectively, then the Spearman rank correla-
                                                                strong association. For further details, see Goodman
tion coefficient is 0.42. This value is very close to
                                                                and Kruskal.1
the value for τb, once again illustrating the closeness
of these two measures.
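
Goodman and Kruskal’s λ can also be obtained from the cell counts of a contingency table. The sketch below assumes the symmetric version of λ, in which the prediction of rows from columns and of columns from rows are pooled. That choice is an assumption rather than something the article states, although it reproduces the value reported for the hair color and eye color counts of Example 6. The function name is hypothetical.

```python
def goodman_kruskal_lambda(table):
    """Symmetric Goodman-Kruskal lambda for an r x c table of counts."""
    n = sum(sum(row) for row in table)
    columns = list(zip(*table))
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in columns]
    # Correct guesses when the other variable is known:
    # guess the largest cell within each column (or each row).
    known_col = sum(max(col) for col in columns)
    known_row = sum(max(row) for row in table)
    # Correct guesses when the other variable is unknown:
    # always guess the most frequent level overall.
    unknown = max(row_totals) + max(col_totals)
    return (known_col + known_row - unknown) / (2 * n - unknown)

# Counts from Example 6: rows are eye color E1-E3, columns are hair color H1-H4.
example6 = [
    [1768, 807, 189, 47],
    [946, 1387, 746, 53],
    [115, 438, 288, 16],
]
print(round(goodman_kruskal_lambda(example6), 2))  # 0.21, as reported for Example 6
```

The numerator counts the prediction errors removed by knowing the other variable, and the denominator counts the errors made without that knowledge, which is why λ is read as a proportional reduction in error.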
                                                                   Example 6. The following contingency table
5. ORDINAL-NOMINAL
                                                                represents a cross-classification of hair color and
                                                                eye color in males (see Kendall17(p300)). Is there
   If one variable is nominal and the other is ordi-            an association between hair and eye color in
nal, then the rank-biserial correlation coefficient             males?
                              Hair Color
                                                               Example 7. For the data in example 2, where the
                                                            sample size is n = 52, the SE of the estimated
                 H1         H2             H3        H4     Kendall’s τb is 0.0991. Then, with 95% confidence,
Eye color                                                   the interval 0.08 ± 0.1982 or [–0.12, 0.28] contains
  E1            1768        807            189       47     the population coefficient value. Because this inter-
  E2             946       1387            746       53     val contains zero, we cannot be highly confident
  E3             115        438            288       16     that this coefficient is not zero. In fact, the data do
                                                            not adequately support the claim that the coefficient
   The value of λ for these data is 0.21. The reduc-        differs from zero. Consequently, we cannot con-
tion in the probability of error in predicting the level    clude that age is related to “recent energy level” for
of one factor is 0.21 by knowing the level of the           the population from which these 52 individuals
other factor, compared with not knowing the level           were selected.
of the other factor. That is, you can eliminate about
20% of your errors in predicting the level of one of           Example 8. For the data of example 4, the SE of
the factors if you know the level of the other factor.      Kendall’s τb coefficient is 0.04. The 95% confi-
                                                            dence interval is approximately 0.39 ± 0.08 or
                                                            [0.31, 0.47]. There is strong statistical evidence
Reliability of the Estimated Coefficient
                                                            that this coefficient is not zero because zero is not
   Because a given coefficient of correlation or            contained in the interval. Note that the SE is much
association is calculated from a sample, it is meas-        smaller here than for the data in example 7; that is
ured with a certain margin of error. Generally, the         because the sample size is much larger: n = 542.
smaller the sample size, the larger the margin of
error. Your statistical software may provide a stan-        Summary
dard error (SE) along with the estimate of the coef-
ficient. Very often, the margin of error can be                In this article, appropriate measures of associa-
calculated as approximately twice the standard              tion have been provided based on the types (or lev-
error. Then, it can be concluded with approxi-              els of measurement) of the variables involved. For
mately 95% confidence that the interval,                    any given situation, there may be a wide variety of
                                                            measures of association to choose from. The rec-
             estimated coefficient ± 2*SE,                  ommendations given in this article can be summa-
                                                            rized as follows:
contains the “true” or “population” coefficient value.
See Goodman and Kruskal1 for further details.                                               Variable X
   By the true or population value, we mean the
following: consider a population of subjects that is        Variable Y    Nominal          Ordinal          Continuous
under study, such as all 50- to 60-year-old                 Nominal       ϕ or λ           Rank biserial    Point biserial
Caucasian Americans. In practice, one obtains a             Ordinal       Rank biserial    τb or Spearman   τb or Spearman
random sample from this population (e.g., ran-              Continuous    Point biserial   τb or Spearman   Pearson or Spearman
domly select 100 subjects from the population),
obtains values of two variables of interest (e.g.,          ϕ = phi coefficient, λ = Goodman and Kruskal’s lambda,
age and cholesterol level), and then calculates the         τb = Kendall’s τb.
coefficient of interest (e.g., the Pearson correlation
coefficient). This value is the estimated coefficient.         For any pair of variables, say X and Y, whose
If the coefficient value had been calculated from           association is to be studied, identify the type of
the entire population instead of only 100 randomly          variable for each (namely, nominal, ordinal, or
selected subjects, then the true or population value        continuous) and choose the measure of association
of the coefficient would have been obtained.                by referring to the table above. For example, if one
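
The rule "estimated coefficient ± 2*SE" described in this section can be checked against the figures of Examples 7 and 8 with a short sketch. The function names are hypothetical, and the factor 2 is the approximation used in the text rather than an exact quantile.

```python
def approx_95_ci(estimate, se):
    """Approximate 95% interval: estimate plus or minus twice the standard error."""
    margin = 2 * se
    return estimate - margin, estimate + margin

def excludes_zero(interval):
    """True when the whole interval lies on one side of zero."""
    low, high = interval
    return low > 0 or high < 0

# Example 7: tau-b = 0.08 with SE = 0.0991 (n = 52).
ci7 = approx_95_ci(0.08, 0.0991)
# Example 8: tau-b = 0.39 with SE = 0.04 (n = 542).
ci8 = approx_95_ci(0.39, 0.04)

for name, ci in (("Example 7", ci7), ("Example 8", ci8)):
    low, high = ci
    print(name, round(low, 2), round(high, 2), "excludes zero:", excludes_zero(ci))
```

The first interval covers zero, so the data do not support a nonzero coefficient; the second lies entirely above zero, in line with the conclusions drawn in the two examples.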
variable is discrete ordinal and the other is contin-           are several different procedures for handling ties,
uous (it does not matter which variable is called X             so it is possible for a given coefficient to give rise
and which is called Y), then an appropriate measure             to slightly different values for the same data set.
for assessing the strength of association is Kendall’s          Most statistical software packages will have a
τb or the Spearman correlation coefficient—notice               default procedure for handling ties.
the intersection of the row corresponding to “ordi-
nal” and the column corresponding to “continuous”
                                                             Conclusion
or, equivalently, the row corresponding to “continu-
ous” and the column corresponding to “ordinal.”                 It is important that the measure or coefficient
For more details about the selection of the measure          used to assess the strength of association between
of association for this case, see the discussion             two variables is appropriate for the data involved. A
under “continuous-ordinal” above.                            coefficient that is designed to measure the strength
   Several cautionary notes need to be made as               of association between two continuous variables,
follows.                                                     such as the Pearson correlation coefficient, should
                                                             not be used to assess the strength of relationship
• Establishment of a strong relationship between             between two ordinal variables or between an ordi-
  two variables does not necessarily imply a                 nal variable and a nominal variable, for example.
  cause-effect relationship. Although a strong               This article provides a simple way of choosing an
  association is necessary for establishment of a            appropriate measure of association for a given pair
  cause-effect relationship, it is not sufficient.           of variables by classifying the variables according
• Establishment of a strong relationship between             to their levels of measurement. By using these rec-
  two variables does not necessarily imply agree-            ommendations, it is hoped that research in sonogra-
  ment between the two variables, nor does it nec-           phy, as well as in other areas of medical research,
  essarily imply high reliability. The notions of            will lead to more accurate and more reliable results.
  agreement and reliability are quite different from
  association. For further reading about agreement
  between two variables, see Bland and Altman18              References
  and Cohen19; for reliability, see Fleiss.20                 1. Goodman LA, Kruskal WH: Measures of Association for
• When assessing the strength of association                     Cross Classifications. New York, Springer-Verlag, 1979.
  between two variables, it is important to adjust            2. Liebetrau AM: Measures of Association. Beverly Hills,
  for the effects of important confounders. This                 CA, Sage, 1983.
  can be done with the use of partial correlation             3. Khamis HJ: Measures of association, in Armitage P,
                                                                 Colton T (eds): Encyclopedia of Biostatistics. 2nd ed.
  coefficients. See Fisher and van Belle21 for further
                                                                 New York, John Wiley, 2004.
  discussion.                                                 4. Newton RR, Rudestam KE: Your Statistical Consultant.
• If one of the two variables under study is antecedent          Thousand Oaks, CA, Sage, 1999.
  to the other, or there is an a priori cause-effect rela-    5. Cohen J: Statistical Power Analysis for the Behavioral
  tionship, or there is a dependent-independent vari-            Sciences. 2nd ed. Hillsdale, NJ, Lawrence Erlbaum, 1988.
  able structure, then the variables are handled              6. Sokal RR, Rohlf FJ: Biometry. New York, W. H.
                                                                 Freeman, 1995.
  asymmetrically. In this case, prediction is often the       7. Zar JH: Biostatistical Analysis. Upper Saddle River, NJ,
  research goal, and special measures of association             Prentice Hall, 1996.
  are recommended. See Goodman and Kruskal1 for               8. Valentine AL, Pope J, Read T: Evaluation of left atrial
  further details.                                               size and pressure as it relates to left ventricle diastolic
• When nominal or ordinal variables are involved,                measurements in patients with left ventricle hypertrophy.
                                                                 J Diagn Med Sonography 2003;19:73–79.
  then there may be several ties in the data. That is,
                                                              9. Chatfield C: Problem Solving: A Statistician’s Guide.
  several subjects may have the same X and Y val-                2nd ed. New York, Chapman & Hall, 1995.
  ues. For instance, in example 6, many subjects             10. Luce RD, Narens L: Measurement scales on the contin-
  (1768) had hair color H1 and eye color E1. There               uum. Science 1987;236:1527–1532.
11. Tate RF: Correlation between a discrete and a continu-     16. Fleiss JL: Statistical Methods for Rates and Proportions.
    ous variable, point-biserial correlation. Ann Math Stat        2nd ed. New York, John Wiley, 1981.
    1954;25:603–607.                                           17. Kendall MG: The Advanced Theory of Statistics.
12. Tate RF: The theory of correlation between two continu-        London, Charles Griffin, 1948.
    ous variables when one is dichotomized. Biometrika         18. Bland JM, Altman DG: Statistical methods for assessing
    1955;42:205–216.                                               agreement between two methods of clinical measure-
13. Kotz S, Johnson NL: Encyclopedia of Statistical                ment. Lancet 1986;1:307–310.
    Sciences. Vol. 1. New York, John Wiley, 1982.              19. Cohen J: A coefficient of agreement for nominal scales.
14. Kuklinski JH, West DM: Economic expectations and               Educ Psychol Meas 1960;20:37–46.
    voting behavior in United States House and Senate elec-    20. Fleiss JL: The Design and Analysis of Clinical
    tions. Am Polit Sci Rev 1981;75:436–447.                       Experiments. New York, John Wiley, 1986.
15. Cureton EE: Rank-biserial correlation. Psychometrika       21. Fisher LD, van Belle G: Biostatistics, a Methodology for
    1956;21:287–290.                                               the Health Sciences. New York, John Wiley, 1993.