CHAPTER 18
Using Systematic Reviews and the Appropriate Meta-analysis
                                                  Walter W. Rosser
    Nature fits all her children with something to do.
    He who would write and can't write, can surely
    review.
                                      James R. Lowell

LEARNING OBJECTIVES
On completion of this chapter, the reader should be able to
     1. Determine if the question asked by the review article is appropriate
     2. Determine if the process of producing the review was adequately rigorous to consider using the conclusions
     3. Decide if the review is applicable to the patients
     4. Determine if a meta-analysis is appropriate
     5. Assess the quality of the meta-analysis to determine if the conclusions are valid
     6. Be confident that the results of the meta-analysis are appropriate for his/her practice

     Many review articles are written by an expert such as an orthopedic surgeon giving views about low back pain gained from personal experience over a 20-year career. The most common form of review article finds the author describing his/her approach to diagnosis and management using a few selected references. From an evidence-based perspective, this style of review article has some value in providing an expert's approach to a common problem, but the methodology is not rigorous enough to ensure that the conclusions represent an objective and systematic review of the current literature. We now outline the approach to a review article that is considered rigorous enough to use as evidence to change your practice on the basis of the recommendations.

Literature Search
The sources of data used in the search, including key words, personal communications with researchers, discussions at scientific meetings, and other less formal sources of information, should be described. The transparent literature search techniques should use key words that are relevant to the question(s). Studies have been conducted that suggest that the literature tends to be biased in favor of the publication of positive results. Another documented bias is toward prominent authors whose publications are accepted in preference to those of unknown authors, despite similar rigor in design, methodology, and outcome of studies. Thus, the published literature may not always reflect all of the current knowledge about a specific question. Investigators in a research project supported by a pharmaceutical company who discover higher rates of side effects than have been reported previously may never submit their findings for publication, creating another form of publication bias.
     It is not possible for the author of a review article to overcome all of these obstacles, but identifying them does illustrate the importance of a rigorous search strategy and the deficiencies of even the most rigorous search. The search description should show that a reasonable effort was made to obtain all relevant literature in an attempt to reduce bias as much as possible.
     The author should then discuss the criteria for inclusion or exclusion of literature and demonstrate that the process of decision-making was as objective as possible. The inclusion and exclusion criteria must be stated so clearly that if other independent reviewers applied the same search criteria, they would choose the same primary articles. The author of a rigorous review should report that an independent reviewer using the same criteria and search strategy did indeed choose the same articles.
     Among the criteria for selecting an article should be the study's population, the interventions, and the outcomes that were considered for inclusion or exclusion. It is important to determine if the outlined criteria make sense in family and general practice. The
114   Information Mastery: Evidence-Based Family Medicine Second Edition
decision to include or exclude articles on the basis of study methods used is also important. The stronger the study methods demanded in the review, the greater confidence there can be in the conclusions. Most rigorous reviews include only randomized controlled trials (RCTs) in their search; however, this strategy risks not addressing important questions for family and general practice.
     When rigorous criteria are used, even if hundreds or thousands of studies have been published on a subject, usually only 5 or 10 articles will meet the criteria. Although it is rare to find a review article that is rigorous about the literature selection, all reviews that do not follow the prescribed level of rigor are at risk of bias. An example of biased paper selection can be drawn from the cholesterol debate, in which inclusion or exclusion of articles greatly influences the conclusion of a review. If the end point considered in a review is cholesterol lowering, then a number of well-controlled randomized trials demonstrated success for a number of widely prescribed drugs (disease-oriented evidence [DOE] rather than patient-oriented evidence that matters [POEM]).

Assessment of the Quality of the Literature
Once the primary studies to be included in the review have been selected, the author must then review the quality of the studies. An assessment of this process will use all of the skills acquired from understanding Chapters 14 to 20. The review should include a brief commentary on the strengths and weaknesses of each primary study. The commentary on each selected article should include the numbers and characteristics of the population, the duration of the study, and the outcomes and how they were measured. These brief descriptions give the reader a sense of the strength, quality, and relevance of the studies from which the review has drawn its conclusions. There should be evidence that the critique of the primary studies was objective. Objective and unbiased reviews are best achieved by having two or three individuals review and critique each article independently and then share their opinions. If the results of the studies chosen by the author for inclusion are inconsistent, then the possible causes of the inconsistency should be discussed. There are five components of a clinical study in which minor differences can result in different outcomes:

      1. Study design
      2. Chance (small sample size)
      3. Population used
      4. Intervention used and its duration and strength
      5. Methods used to measure outcomes

     If the review uses RCTs, cohort studies, and case-control studies, the potential for bias in the methodologically weaker studies is likely to provide some explanation for variability of outcome. If the sample sizes in the studies are small, chance may play a major role in the variability of results. The Student's t-test and measurement of confidence limits are the common ways to assess the risk of chance alone accounting for the results. The smaller the sample size, the less likelihood there is of a difference being statistically significant and the more likely it is that a difference can be explained by chance alone. Chance may explain minor differences between studies, but large differences in outcome are more likely to be explained by the population sample. Often factors such as the severity of illness of patients in two samples will explain a difference. Difference in the age or sex distribution of the study samples or racial differences in the two populations are factors that might explain differences. Study outcomes may differ because of dosage differences or different rates of compliance. Outcome measures may vary from one study to another, especially in those measuring quality of life as an outcome, in which both the instruments and the methods by which the information is gathered may vary.

Combining Data in a Review
After any rigorous review is completed, consideration should be given to combining the key articles to create a meta-analysis. If, after proper assessment, the results cannot be combined, there should be a commentary as to why and also discussion of the strengths or weaknesses of the best studies that have been found in the review. The conclusions may be weakened if the quality or variability of the results from the literature raises concerns.

Conclusions in a Review Article
Conclusions from a review should be based on findings from the review. This seems to be an obvious statement, but there are some examples of rigorously conducted reviews that draw conclusions that are not linked to the findings of the review.2
     Review articles are an important source of information for all primary care providers. Historically, most review articles have been opinion articles. In the twenty-first century, opinion is no longer a sound enough basis on which to practice medicine. Primary care providers must demand rigorously constructed review articles to trust them as sources of information for evidence-based practice. This demand must be repeatedly expressed to the editors of journals designed to assist family and general practitioners so that the quality of review articles will continue to improve.

Why Do a Meta-analysis?
After assessing a review of the literature, it may be appropriate to merge several good-quality studies
found through the review process. Merging the results of several RCTs may provide information that cannot be obtained from each study independently. Small trials, although less difficult and less expensive to carry out, are subject to Type II error (ie, false-negative results occurring by chance). Pooling of the results from several similar RCTs reduces the risk of Type II error and strengthens confidence in the conclusions. A further benefit can be derived from pooling results from analyses of subsets from larger trials. In the past few years, subsets of persons representing specific age or sex groups or a group with unique characteristics have been derived from very large trials to answer specific questions applicable to these smaller groups. The numbers in these subset analyses are small. Pooling the results strengthens the statistical power of the analysis. The risk of a Type I error (ie, false-positive results occurring by chance) also exists in small trials. Pooling data from all of the trials reduces this risk. Another advantage to pooling several small trials, compared with the results from one large trial, is that the sampling bias of the large trial is minimized by the differences in the population samples from several small studies. The meta-analysis of a number of small trials should be more generalizable to primary care practice populations than results from a single large trial.

ASSESSING THE QUALITY OF META-ANALYSIS

The Question
As in a review article, the question being asked by the meta-analysis must be clear. Trials to be included in a meta-analysis are usually determined after all of the preliminary steps in a systematic review are completed and the question of whether similar trials can have their results combined has been answered. Many meta-analyses pay little attention to the assessment of the literature as outlined in the first part of this chapter and then combine studies that should not be combined from a clinical perspective.
     Reasons for conducting a meta-analysis might include the impracticality of conducting a large enough single trial to answer the question, the presence of small trials or subset analysis and the absence of a large trial, and the importance of the question in health care delivery. Often a series of small trials have results that are inconsistent, and by combining them, a more definitive conclusion is possible. The pharmaceutical industry tends to carry out short trials with a few patients when testing out new drugs and their effects. Combining several of these studies into a meta-analysis can increase or decrease confidence in the new drug (Figure 18-1).

Merging the Results of Trials
Trials can be merged only when the outcome measures used in each trial are the same or very similar. Variations in the results of the studies should correspond to differences in the characteristics of the studies. An eyeball estimate made by the reader should correspond to an estimate of the magnitude of difference between the intervention and the control when the trials are combined.

    Meta-analysis is the biostatistician's playpen from which statistical missiles can be thrown at confused physicians.

Homogeneity
If the combined studies are comparable, variation in their outcomes should be accounted for by sampling variation or chance. The results can be demonstrated graphically by plotting the control groups along the horizontal axis and the intervention groups along the vertical axis.
     Significant variation suggests a lack of homogeneity in the studies being merged, indicating that combining their results is inappropriate (see Figure 18-1). Although a number of statistical tools are available to analyze homogeneity (eg, the Mantel and Haenszel equation), your own judgment about homogeneity from the plot is a simple way to determine if combining the studies is appropriate. A lack of homogeneity may reflect different doses of the intervention or different age or sex composition of the study populations. Table 18-1 demonstrates a homogeneous set of trials that can be appropriately combined. A good meta-analysis gives you a table like this and explains any differences.

[Figure 18-1: plot of trials A to L. Horizontal axis: event rate in those not receiving treatment. Vertical axis: event rate in those receiving treatment. Reference lines: 25% increase in event rate, no change in event rate, 25% reduction in event rate.]
Figure 18-1 An example of heterogeneity of a number of studies (not suitable for meta-analysis).

Sensitivity Analysis
Sensitivity analysis responds to the question, "Are the results of the meta-analysis sensitive to changes in the way in which the analysis is done?" An example would be to conduct a meta-analysis using only RCTs and then to add cohort studies that otherwise meet the same criteria to see if the outcome changes.
Table 18-1 Trials on the Effect of Lipid-Lowering Strategies (the Effect of Cholesterol Lowering on Cause-Specific and All-Cause Death Rates)
                                                             Drug Trials
Trial                                 Cardiac           Noncardiac              Cancer              Violence               Total
LRC                                     0.78                1.34                 1.06                 2.75                  0.96
HHS                                     0.84                1.30                 0.99                 2.48                  0.96
WHO                                     1.05                1.74                 1.66                 1.19                  1.47
UCS                                     0.59                1.60                 1.00                 5.00                  0.62
Total                                   0.88                1.54                 1.32                 1.77                  1.14
                                                             Diet Trials
LAUDS                                   0.80                1.06                 1.70                 9.04                  0.96
MCS                                     1.39                1.00                 1.34                 1.50                  1.00
Total                                   0.90                1.02                 1.55                 1.79                  1.00
Adapted from the Canadian Task Force on Periodic Health Examination, 1993 update. Lowering the total blood cholesterol to prevent coronary heart disease. Can Med Assoc J 1993;148:521-37.
     A similar outcome after combining studies using different methodologies adds confidence to the meta-analysis results. Age or sex groups might be excluded from combined data as a strategy to measure their influence in determining outcomes. The results of this type of sensitivity analysis may strengthen confidence in the meta-analysis if the results make clinical sense and confirm clinical observations.
     After reviewing a meta-analysis and determining that the analysis has been carefully conducted and meets the outlined criteria, one can be confident that the results are relevant and useful for patients. Although statisticians may lose our attention in areas within the process of meta-analysis (the actual mathematical process of pooling the data), the general principles of good meta-analysis can be judged by any clinician. Many meta-analyses in the literature do not meet the above criteria, so the mathematical wizardry becomes irrelevant.

QUESTIONS TO ASK WHEN ASSESSING THE VALUE OF A REVIEW ARTICLE AND/OR A META-ANALYSIS
        1. Is the question a POEM or a DOE?
        2. Is the question clearly stated?
        3. Is the literature search strategy described in a transparent fashion?
        4. Are there explicit inclusion and exclusion criteria and an appropriate explanation for the studies that were included?
        5. Is the homogeneity of the studies appropriately evaluated?
        6. Are appropriate statistics used? Is sensitivity analysis used?
        7. Does the pooled analysis demonstrate significant differences between the trial and control groups?
        8. Are appropriate conclusions drawn from the analysis?
        9. Would you incorporate the recommendations into your practice?

RELEVANT CAPRE TOPICS
Diagnosing deep vein thrombosis (DVT)
Cost-effective management of dyspepsia
<http://www.meds.queensu.ca/ce/capre>
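The checklist questions on homogeneity and on the pooled result can be made concrete with a short Python sketch. It is offered under stated assumptions, not as the calculation used by any study cited in this chapter: the chapter names the Mantel and Haenszel equation, whereas this sketch substitutes inverse-variance pooling of log odds ratios with Cochran's Q and the I-squared statistic, a different but widely used pair of homogeneity measures. All trial counts are invented, and the code assumes no cell of any 2 x 2 table is zero.

```python
import math

def log_odds_ratio(events_t, n_t, events_c, n_c):
    """Log odds ratio and its standard error from one trial's 2 x 2 table."""
    a, b = events_t, n_t - events_t   # treated: events, non-events
    c, d = events_c, n_c - events_c   # control: events, non-events
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return log_or, se

def pool_and_test(trials):
    """Pool trials (inverse-variance, fixed effect) and measure heterogeneity."""
    effects = [log_odds_ratio(*t) for t in trials]
    weights = [1 / se ** 2 for _, se in effects]
    total = sum(weights)
    pooled = sum(w * y for w, (y, _) in zip(weights, effects)) / total
    pooled_se = math.sqrt(1 / total)
    # Cochran's Q: weighted squared deviations of each trial from the pooled effect.
    q = sum(w * (y - pooled) ** 2 for w, (y, _) in zip(weights, effects))
    df = len(trials) - 1
    # I-squared: share of total variation attributed to heterogeneity, in percent.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return {
        "pooled_or": math.exp(pooled),
        "ci": (math.exp(pooled - 1.96 * pooled_se),
               math.exp(pooled + 1.96 * pooled_se)),
        "q": q,
        "df": df,
        "i2": i2,
    }

# Hypothetical trials: (events treated, n treated, events control, n control).
consistent_trials = [(10, 100, 20, 100), (20, 200, 40, 200)]
inconsistent_trials = [(10, 100, 20, 100), (30, 100, 15, 100)]

consistent = pool_and_test(consistent_trials)
inconsistent = pool_and_test(inconsistent_trials)

for label, r in (("consistent", consistent), ("inconsistent", inconsistent)):
    print(f"{label}: pooled OR {r['pooled_or']:.2f} "
          f"(95% CI {r['ci'][0]:.2f} to {r['ci'][1]:.2f}), "
          f"Q = {r['q']:.2f} on {r['df']} df, I2 = {r['i2']:.0f}%")
```

In the first invented set both trials have the same odds ratio, so Q is essentially zero and pooling is reasonable. In the second set the trials point in opposite directions, and Q exceeds 3.84, the 5% critical value of the chi-square distribution with 1 degree of freedom, so a single pooled figure would hide a real disagreement.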