The Levels of Evidence
Plastic and Reconstructive Surgery • July 2011 • Volume 128, Number 1
www.PRSJournal.com

From the Section of Plastic Surgery, Department of Surgery, University of Michigan Health System, and the Department of Plastic Surgery, University of Texas Southwestern Medical Center.
Received for publication February 2, 2011; accepted March 10, 2011.
Copyright ©2011 by the American Society of Plastic Surgeons
DOI: 10.1097/PRS.0b013e318219c171
Disclosure: The authors have no financial interests to declare in relation to the content of this article.

As the name suggests, evidence-based medicine is about finding evidence and using that evidence to make clinical decisions. A cornerstone of evidence-based medicine is the hierarchical system of classifying evidence. This hierarchy is known as the levels of evidence. Physicians are encouraged to find the highest level of evidence to answer clinical questions. Several articles published in plastic surgery journals concerning evidence-based medicine topics have touched on this subject.1–6 Specifically, previous articles have discussed the lack of higher level evidence in Plastic and Reconstructive Surgery and the need to improve the evidence published in the Journal. Before that can be accomplished, it is important to understand the history behind the levels and how they should be interpreted. This article focuses on the origin of levels of evidence, their relevance to the evidence-based medicine movement, and the implications for the field of plastic surgery and the everyday practice of plastic surgery.

HISTORY OF LEVELS OF EVIDENCE
The levels of evidence were originally described in a report by the Canadian Task Force on the Periodic Health Examination in 1979.7 The report's purpose was to develop recommendations on the periodic health examination and base those recommendations on evidence in the medical literature. The authors developed a system of rating evidence (Table 1) when determining the effectiveness of a particular intervention. The evidence was taken into account when grading recommendations. For example, a grade A recommendation was given if there was good evidence to support a recommendation that a condition be included in the periodic health examination. The levels of evidence were further described and expanded by Sackett8 in an article on levels of evidence for antithrombotic agents in 1989 (Table 2). Both systems place randomized controlled trials at the highest level and case series or expert opinions at the lowest level. The hierarchies rank studies according to the probability of bias. Randomized controlled trials are given the highest level because they are designed to be unbiased and have less risk of systematic errors. For example, by randomly allocating subjects to two or more treatment groups, these types of studies also randomize confounding factors that may bias results. A case series or expert opinion is often biased by the author's experience or opinions, and there is no control of confounding factors.

Table 1. Canadian Task Force on the Periodic Health Examination's Levels of Evidence*
Level   Type of Evidence
I       At least 1 RCT with proper randomization
II.1    Well-designed cohort or case-control study
II.2    Time series comparisons or dramatic results from uncontrolled studies
III     Expert opinions
RCT, randomized controlled trial.
*Adapted from Canadian Task Force on the Periodic Health Examination. The periodic health examination. Can Med Assoc J. 1979;121:1193–1254.

Table 2. Levels of Evidence from Sackett*
Level   Type of Evidence
I       Large RCTs with clear-cut results
II      Small RCTs with unclear results
III     Cohort and case-control studies
IV      Historical cohort or case-control studies
V       Case series, studies with no controls
RCTs, randomized controlled trials.
*Adapted from Sackett DL. Rules of evidence and clinical recommendations on the use of antithrombotic agents. Chest 1989;95:2S–4S.

MODIFICATION OF LEVELS
Since the introduction of levels of evidence, several other organizations and journals have adopted variations of the classification system. Diverse specialties are often asking different questions, and it was recognized that the type and level of evidence needed to be modified accordingly. Research questions are divided into the following categories: treatment, prognosis, diagnosis, and economic/decision analysis. For example, Table 3 shows the levels of evidence developed by the American Society of Plastic Surgeons for prognosis9 and Table 4 shows the levels developed by the Centre for Evidence-Based Medicine for treatment.10 The two tables highlight the types of studies that are appropriate for the question (prognosis versus treatment) and how quality of data is taken into account when assigning a level. For example, randomized controlled trials are not appropriate when looking at the prognosis of a disease. The question in this instance is, "What will happen if we do nothing at all?" Because a prognosis question does not involve comparing treatments, the highest evidence would come from a cohort study or a systematic review of cohort studies. The levels of evidence also take into account the quality of the data. For example, in the chart from the Centre for Evidence-Based Medicine, a poorly designed randomized controlled trial has the same level of evidence as a cohort study.

Table 3. Levels of Evidence for Prognostic Studies*
Level   Type of Evidence
I       High-quality prospective cohort study with adequate power or systematic review of these studies
II      Lesser quality prospective cohort, retrospective cohort study, untreated controls from an RCT, or systematic review of these studies
III     Case-control study or systematic review of these studies
IV      Case series
V       Expert opinion; case report or clinical example; or evidence based on physiology, bench research, or "first principles"
RCT, randomized controlled trial.
*Adapted from the American Society of Plastic Surgeons. Available at: http://www.plasticsurgery.org/For_Medical-Professionals/Legislation-and-Advocacy/Health-Policy-Resources/Evidence-based-GuidelinesPractice-Parameters/Description-and-Development-of-Evidence-based-Practice-Guidelines/ASPS-Evidence-Rating-Scales.html. Accessed December 17, 2010.

Table 4. Levels of Evidence for Therapeutic Studies*
Level   Type of Evidence
1a      Systematic review (with homogeneity) of RCTs
1b      Individual RCT (with narrow confidence intervals)
1c      All-or-none study
2a      Systematic review (with homogeneity) of cohort studies
2b      Individual cohort study, including low-quality RCTs (e.g., <80% follow-up)
2c      "Outcomes" research; ecological studies
3a      Systematic review (with homogeneity) of case-control studies
3b      Individual case-control study
4       Case series (and poor quality cohort and case-control study)
5       Expert opinion without explicit critical appraisal or based on physiology, bench research, or "first principles"
RCT, randomized controlled trial.
*From the Centre for Evidence-Based Medicine (Web site). Available at: http://www.cebm.net. Accessed December 17, 2010.

A grading system that provides strength of recommendations based on evidence has also changed over time. Table 5 shows the Grade Practice Recommendations developed by the American Society of Plastic Surgeons. The grading system provides an important component in evidence-based medicine and assists in clinical decision making. For example, a strong recommendation is given when there is level I evidence and consistent evidence from level II, III, and IV studies available. The grading system does not degrade lower level evidence when deciding recommendations if the results are consistent.

INTERPRETATION OF LEVELS
Many journals assign a level to the articles they publish, and authors often assign a level when submitting an abstract to conference proceedings. This allows the reader to know the level of evidence of the research, but the designated level of evidence does not always guarantee the quality of the research. It is important that readers not assume that level I evidence is always the best choice or appropriate for the research question. This concept will be important for all of us to understand as plastic surgery moves further into evidence-based medicine. By its nature, our surgical specialty will always have important articles with a lower level of evidence, because innovation and technique articles are needed to move the specialty forward.

Although randomized controlled trials are often assigned the highest level of evidence, not all randomized controlled trials are conducted properly, and the results should be scrutinized carefully. Sackett8 stressed the importance of estimating types of errors and the power of studies when interpreting results from randomized controlled trials. For example, a poorly conducted randomized controlled trial may report a negative result because of low power when in fact a real difference exists between treatment groups. Scales such as the Jadad scale have been developed to judge the quality of randomized controlled trials.11 Although physicians may not have the time or inclination to use a scale to assess quality, there are some basic items that should be taken into account. Items used for assessing randomized controlled trials include randomization, blinding, a description of the randomization and blinding process, a description of the number of subjects who withdrew or dropped out of the study, the confidence intervals around study estimates, and a description of the power analysis. For example, Bhandari et al.12 published an article assessing the quality of surgical randomized controlled trials. The authors evaluated the quality of randomized controlled trials reported in the Journal of Bone and Joint Surgery from 1988 to 2000. They identified 72 randomized controlled trials during this period, and the mean score was 68 percent. Articles with a score greater than 75 percent were deemed high quality, and 60 percent of the articles scored below 75 percent. The main reasons for the low quality scores were the lack of appropriate randomization, blinding, and a description of patient exclusion criteria. Another article found that articles in the Journal of Bone and Joint Surgery with a level 1 rating had the same quality scores as those with a level 2 rating.13 Therefore, one should not assume that level 1 studies are of higher quality than level 2 studies.

Resources for surgeons to use when appraising levels of evidence include the users' guides published in the Canadian Journal of Surgery14,15 and the Journal of Bone and Joint Surgery.16 Similar articles that are not specific to surgery have been published in the Journal of the American Medical Association.17,18

PLASTIC SURGERY AND EVIDENCE-BASED MEDICINE
The field of plastic surgery has been slow to adopt evidence-based medicine. This was demonstrated in an article examining the level of evidence of articles published in Plastic and Reconstructive Surgery.19 The authors assigned levels of evidence to articles published in Plastic and Reconstructive Surgery over a 20-year period. The majority of studies (93 percent in 1983) were level IV or V, which denotes case series and case reports. Although the results were disappointing, there was some improvement over time. By 2003, there were more level I studies (1.5 percent) and fewer level IV and V studies (87 percent). A recent analysis looked at the number of level I studies in five different plastic surgery journals from 1978 to 2009. The authors defined level I studies as randomized controlled trials and meta-analyses and restricted their search to these studies. The number of level I studies increased from one in 1978 to 32 by 2009.20 From these results, we see that the field of plastic surgery is improving the level of evidence but still has a long way to go, especially in improving the quality of studies published. For example, approximately one-third of the studies involved double blinding, but the majority did not randomize subjects, describe the randomization process, or perform a power analysis. Power analysis is another area of concern in plastic surgery.
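
The point that a trial with low power can report a negative result even when a real difference exists can be made concrete with a rough calculation. The following is a minimal sketch, assuming a two-sided comparison of two proportions with equal group sizes and a normal approximation; the function name, the complication rates of 20 and 10 percent, and the group sizes are hypothetical and are not taken from any study cited here.

```python
from math import sqrt
from statistics import NormalDist


def power_two_proportions(p1, p2, n_per_group, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two proportions.

    Uses the normal approximation with equal group sizes. All inputs are
    illustrative assumptions, not values from the article.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)

    # Standard error under the null hypothesis (pooled proportion).
    p_pooled = (p1 + p2) / 2
    se_null = sqrt(2 * p_pooled * (1 - p_pooled) / n_per_group)

    # Standard error under the alternative (true proportions differ).
    se_alt = sqrt(p1 * (1 - p1) / n_per_group + p2 * (1 - p2) / n_per_group)

    diff = abs(p1 - p2)
    threshold = z_crit * se_null  # smallest observed difference declared significant

    # Probability that the observed difference falls beyond either threshold.
    upper_tail = 1 - std_normal.cdf((threshold - diff) / se_alt)
    lower_tail = std_normal.cdf((-threshold - diff) / se_alt)
    return upper_tail + lower_tail


if __name__ == "__main__":
    # Hypothetical complication rates of 20% versus 10%.
    for n in (50, 200):
        print(n, round(power_two_proportions(0.20, 0.10, n), 2))
```

Under these assumed rates, 50 patients per group leaves the power well below the conventional 80 percent target, so a nonsignificant result from such a trial says little about whether the treatments differ; a few hundred patients per group would be needed to approach that target.
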
A review of the plastic surgery literature found that the majority of published studies have inadequate power to detect moderate to large differences between treatment groups.21 Regardless of the level of evidence for a study, if the study is underpowered, the interpretation of results is questionable.

Although the goal is to improve the overall level of evidence in plastic surgery, this does not mean that all lower level evidence should be discarded. Case series and case reports are important for hypothesis generation and can lead to more controlled studies. In addition, in the face of overwhelming evidence to support a treatment, such as the use of antibiotics for wound infections, there is no need for a randomized controlled trial.

CLINICAL EXAMPLES USING LEVELS OF EVIDENCE
To understand how the levels of evidence work and aid the reader in interpreting levels, we provide some examples from the plastic surgery literature. The examples also show the peril of medical decisions based on results from case reports.

An association was hypothesized between lymphoma and silicone breast implants based on case reports.22–27 The level of evidence for case reports, depending on the scale used, is IV or V. These case reports were used to generate the hypothesis that a possible association existed. Because of these results, several large retrospective cohort studies from the United States, Canada, Denmark, Sweden, and Finland were conducted.28–32 The level of evidence for a retrospective cohort study is II. All of these studies had many years of follow-up for a large number of patients. Some of the studies found an elevated risk and others found no risk for lymphoma. None of the studies reached statistical significance. Therefore, higher level evidence from cohort studies does not provide evidence of any risk of lymphoma. Finally, a systematic review was performed that combined the evidence from the retrospective cohorts.27 The review found an overall standardized incidence ratio of 0.89 (95 percent confidence interval, 0.67 to 1.18). Because the confidence interval includes 1, the results show no statistically significant increase in incidence. The level of evidence for the systematic review is I. Based on the best available evidence, there is no association between lymphoma and silicone implants. This example shows how studies with a low level of evidence were used to generate a hypothesis, which then led to higher level evidence that disproved the hypothesis. This example also demonstrates that randomized controlled trials are not feasible for rare events such as cancer and emphasizes the importance of observational studies for a specific study question. A case-control study is a better option and provides higher level evidence for testing the prognosis of the long-term effect of silicone breast implants.

Another example is the injection of epinephrine in fingers. Based on case reports before 1950, physicians were advised that epinephrine injection can result in finger ischemia.33 We see in this example that level IV or V evidence was accepted as fact and incorporated into medical textbooks and teaching. However, not all physicians accepted this evidence, and some were performing injections of epinephrine into the fingers with no adverse effects on the hand. Obviously, it was time for higher level evidence to resolve this issue. An in-depth review of the literature from 1880 to 2000 by Denkler33 identified 48 cases of digital infarction, of which 21 had been injected with epinephrine. Further analysis found that the addition of procaine to the epinephrine injection was the cause of the ischemia.34 The procaine used in these injections included toxic acidic batches that were recalled in 1948. In addition, several cohort studies found no complications from the use of epinephrine in the fingers and hand.35–37 The results from these cohort studies increased the level of evidence. Based on the best available evidence from these studies, the hypothesis that epinephrine injection will harm fingers was rejected. This example highlights the biases inherent in case reports. It also shows the risk when spurious evidence is handed down and integrated into medical teaching.

OBTAINING THE BEST EVIDENCE
We have established the need for randomized controlled trials to improve evidence in plastic surgery but have also acknowledged the difficulties, particularly with randomization and blinding. Although randomized controlled trials may not be appropriate for many surgical questions, well-designed and well-conducted cohort or case-control studies could boost the level of evidence. Many of the current studies tend to be descriptive and lack a control group. The way forward seems clear. Plastic surgery researchers need to consider using a cohort or case-control design whenever a randomized controlled trial is not possible. If designed properly, the level of evidence for observational studies can approach or surpass that of a randomized controlled trial. In some instances, observational studies and randomized controlled trials have yielded similar results.38 If enough cohort or case-control studies become available, the prospect of systematic reviews of these studies will increase, which will increase overall evidence levels in plastic surgery.

CONCLUSIONS
The levels of evidence are an important component of evidence-based medicine. Understanding the levels and why they are assigned to publications and abstracts helps the reader to prioritize information. This is not to say that all level IV evidence should be ignored and all level I evidence accepted as fact. The levels of evidence provide a guide, and the reader needs to be cautious when interpreting these results.

Kevin C. Chung, M.D., M.S.
Section of Plastic Surgery
Department of Surgery
University of Michigan Health System
2130 Taubman Center, SPC 5340
1500 East Medical Center Drive
Ann Arbor, Mich. 48109-5340
kecchung@umich.edu

ACKNOWLEDGMENTS
This work was supported in part by a Midcareer Investigator Award in Patient-Oriented Research (K24 AR053120) from the National Institute of Arthritis and Musculoskeletal and Skin Diseases (to K.C.C.).

REFERENCES
1. McCarthy CM, Collins ED, Pusic AL. Where do we find the best evidence? Plast Reconstr Surg. 2008;122:1942–1947; discussion 1948–1951.
2. Chung KC, Swanson JA, Schmitz D, Sullivan D, Rohrich RJ. Introducing evidence-based medicine to plastic and reconstructive surgery. Plast Reconstr Surg. 2009;123:1385–1389.
3. Chung KC, Ram AN. Evidence-based medicine: The fourth revolution in American medicine? Plast Reconstr Surg. 2009;123:389–398.
4. Rohrich RJ. So you want to be better: The role of evidence-based medicine in plastic surgery. Plast Reconstr Surg. 2010;126:1395–1398.
5. Burns PB, Chung KC. Developing good clinical questions and finding the best evidence to answer those questions. Plast Reconstr Surg. 2010;126:613–618.
6. Sprague S, McKay P, Thoma A. Study design and hierarchy of evidence for surgical decision making. Clin Plast Surg. 2008;35:195–205.
7. The periodic health examination. Canadian Task Force on the Periodic Health Examination. Can Med Assoc J. 1979;121:1193–1254.
8. Sackett DL. Rules of evidence and clinical recommendations on the use of antithrombotic agents. Chest 1989;95(2 Suppl):2S–4S.
9. American Society of Plastic Surgeons. Scales for rating levels of evidence. Available at: http://www.plasticsurgery.org/Medical_Professionals/Health_Policy_and_Advocacy/Health_Policy_Resources/Evidence-based_GuidelinesPractice_Parameters/Description_and_Development_of_Evidence-based_Practice_Guidelines/ASPS_Evidence_Rating_Scales.html. Accessed December 17, 2010.
10. Centre for Evidence Based Medicine (Web site). Available at: http://www.cebm.net. Accessed December 17, 2010.
11. Jadad AR, Moore RA, Carroll D, et al. Assessing the quality of reports of randomized clinical trials: Is blinding necessary? Control Clin Trials 1996;17:1–12.
12. Bhandari M, Richards RR, Sprague S, Schemitsch EH. The quality of reporting of randomized trials in the Journal of Bone and Joint Surgery from 1988 through 2000. J Bone Joint Surg Am. 2002;84:388–396.
13. Poolman RW, Struijs PA, Krips R, Sierevelt IN, Lutz KH, Bhandari M. Does a "Level I Evidence" rating imply high quality of reporting in orthopaedic randomised controlled trials? BMC Med Res Methodol. 2006;6:44.
14. Urschel JD, Goldsmith CH, Tandan VR, Miller JD. Users' guide to evidence-based surgery: How to use an article evaluating surgical interventions. Evidence-Based Surgery Working Group. Can J Surg. 2001;44:95–100.
15. Thoma A, Farrokhyar F, Bhandari M, Tandan V; Evidence-Based Surgery Working Group. Users' guide to the surgical literature: How to assess a randomized controlled trial in surgery. Can J Surg. 2004;47:200–208.
16. Bhandari M, Guyatt GH, Swiontkowski MF. User's guide to the orthopaedic literature: How to use an article about prognosis. J Bone Joint Surg Am. 2001;83:1555–1564.
17. Guyatt GH, Sackett DL, Cook DJ. Users' guides to the medical literature: II. How to use an article about therapy or prevention. A. Are the results of the study valid? Evidence-Based Medicine Working Group. JAMA 1993;270:2598–2601.
18. Guyatt GH, Haynes RB, Jaeschke RZ, et al. Users' Guides to the Medical Literature: XXV. Evidence-based medicine: Principles for applying the Users' Guides to patient care. Evidence-Based Medicine Working Group. JAMA 2000;284:1290–1296.
19. Loiselle F, Mahabir RC, Harrop AR. Levels of evidence in plastic surgery research over 20 years. Plast Reconstr Surg. 2008;121:207e–211e.
20. McCarthy JE, Chatterjee A, McKelvey TG, Jantzen EM, Kerrigan CL. A detailed analysis of level I evidence (randomized controlled trials and meta-analyses) in five plastic surgery journals to date: 1978 to 2009. Plast Reconstr Surg. 2010;126:1774–1778.
21. Chung KC, Kalliainen LK, Spilson SV, Walters MR, Kim HM. The prevalence of negative studies with inadequate statistical power: An analysis of the plastic surgery literature. Plast Reconstr Surg. 2002;109:1–6; discussion 7–8.
22. Newman MK, Zemmel NJ, Bandak AZ, Kaplan BJ. Primary breast lymphoma in a patient with silicone breast implants: A case report and review of the literature. J Plast Reconstr Aesthet Surg. 2008;61:822–825.
23. Gaudet G, Friedberg JW, Weng A, Pinkus GS, Freedman AS. Breast lymphoma associated with breast implants: Two case-reports and a review of the literature. Leuk Lymphoma 2002;43:115–119.
24. Sahoo S, Rosen PP, Feddersen RM, Viswanatha DS, Clark DA, Chadburn A. Anaplastic large cell lymphoma arising in a silicone breast implant capsule: A case report and review of the literature. Arch Pathol Lab Med. 2003;127:e115–e118.
25. Keech JA Jr, Creech BJ. Anaplastic T-cell lymphoma in proximity to a saline-filled breast implant. Plast Reconstr Surg. 1997;100:554–555.
26. Duvic M, Moore D, Menter A, Vonderheid EC. Cutaneous T-cell lymphoma in association with silicone breast implants. J Am Acad Dermatol. 1995;32:939–942.
27. Lipworth L, Tarone RE, McLaughlin JK. Breast implants and lymphoma risk: A review of the epidemiologic evidence through 2008. Plast Reconstr Surg. 2009;123:790–793.
28. Lipworth L, Tarone RE, Friis S, et al. Cancer among Scandinavian women with cosmetic breast implants: A pooled long-term follow-up study. Int J Cancer 2009;124:490–493.
29. Deapen DM, Hirsch EM, Brody GS. Cancer risk among Los Angeles women with cosmetic breast implants. Plast Reconstr Surg. 2007;119:1987–1992.
30. Brisson J, Holowaty EJ, Villeneuve PJ, et al. Cancer incidence in a cohort of Ontario and Quebec women having bilateral breast augmentation. Int J Cancer 2006;118:2854–2862.
31. Pukkala E, Boice JD Jr, Hovi SL, et al. Incidence of breast and other cancers among Finnish women with cosmetic breast implants, 1970-1999. J Long Term Eff Med Implants 2002;12:271–279.
32. Brinton LA, Lubin JH, Burich MC, Colton T, Brown SL, Hoover RN. Cancer risk at sites other than the breast following augmentation mammoplasty. Ann Epidemiol. 2001;11:248–256.
33. Denkler K. A comprehensive review of epinephrine in the finger: To do or not to do. Plast Reconstr Surg. 2001;108:114–124.
34. Thomson CJ, Lalonde DH, Denkler KA, Feicht AJ. A critical look at the evidence for and against elective epinephrine use in the finger. Plast Reconstr Surg. 2007;119:260–266.
35. Lalonde D, Bell M, Benoit P, Sparkes G, Denkler K, Chang P. A multicenter prospective study of 3,110 consecutive cases of elective epinephrine use in the fingers and hand: The Dalhousie Project clinical phase. J Hand Surg Am. 2005;30:1061–1067.
36. Chowdhry S, Seidenstricker L, Cooney DS, Hazani R, Wilhelmi BJ. Do not use epinephrine in digital blocks: Myth or truth? Part II: A retrospective review of 1111 cases. Plast Reconstr Surg. 2010;126:2031–2034.
37. Wilhelmi BJ, Blackwell SJ, Miller JH, et al. Do not use epinephrine in digital blocks: Myth or truth? Plast Reconstr Surg. 2001;107:393–397.
38. Concato J, Shah N, Horwitz RI. Randomized, controlled trials, observational studies, and the hierarchy of research designs. N Engl J Med. 2000;342:1887–1892.
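
As a supplementary illustration of the confidence-interval reasoning in the breast implant example (standardized incidence ratio of 0.89; 95 percent confidence interval, 0.67 to 1.18), the check can be sketched in a few lines. This is a minimal sketch, not the method used in the cited review: the function names are hypothetical, and the approximate p value assumes the interval was constructed symmetrically on the log scale, which the source does not state.

```python
from math import log
from statistics import NormalDist


def ci_excludes_null(lower, upper, null_value=1.0):
    """Return True if the confidence interval does not contain the null value."""
    return not (lower <= null_value <= upper)


def approx_p_value(estimate, lower, upper, level=0.95):
    """Approximate two-sided p value for a ratio, recovered from its interval.

    Assumes (hypothetically) a Wald-type interval that is symmetric on the
    log scale, so the standard error is the log-interval width divided by
    twice the critical value.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - (1 - level) / 2)
    se_log = (log(upper) - log(lower)) / (2 * z_crit)
    z_stat = log(estimate) / se_log
    return 2 * (1 - std_normal.cdf(abs(z_stat)))


if __name__ == "__main__":
    # Values quoted in the text for the systematic review.
    sir, lower, upper = 0.89, 0.67, 1.18
    print("Interval excludes 1:", ci_excludes_null(lower, upper))
    print("Approximate p value:", round(approx_p_value(sir, lower, upper), 2))
```

For a ratio measure such as the standardized incidence ratio, the null value is 1, so an interval that spans 1 corresponds to a result that is not statistically significant at the matching level. This is a statement about the absence of a detectable increase; it does not by itself prove that the risk is identical.
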