Journal of Midwifery & Women’s Health    www.jmwh.org
Brief Report
The ARRIVE Trial: Interpretation from an Epidemiologic
Perspective
Suzan L. Carmichael1, PhD, Jonathan M. Snowden2,3, PhD
    The findings of the ARRIVE trial (A Randomized Trial of Induction Versus Expectant Management) were recently published. This multisite
    randomized trial was designed to provide evidence regarding whether labor induction or expectant management is associated with increased
    adverse perinatal outcomes and risk of cesarean birth among healthy nulliparous women at term. The trial reported that the primary outcome, a
    composite of adverse neonatal outcomes, was not significantly different between the 2 groups; the principal secondary outcome, cesarean birth,
    was significantly more common among women whose pregnancy was expectantly managed than among women whose labor was induced at
    39 weeks. These results have the potential to change existing practice. Several aspects of the study design may influence its potential internal and
    external validity and should be considered in order to make sound causal inferences from this trial, which will in turn affect how its findings
    are translated to practice. Although chance and confounding are of minimal concern, given the sample size and randomization used in the study,
    selection bias may be a concern. Studies are vulnerable to selection bias when the sample population differs from eligible nonparticipants, including
    in randomized controlled trials. External validity is defined as the extent to which the study population and setting are representative of the larger
    source population the study intends to represent. External validity may be limited given the characteristics of the women enrolled in the ARRIVE
    trial and the practice settings where the study was conducted. This brief report provides concrete suggestions for further analyses that could help
    solidify conclusions from the trial, and for further research questions that will continue advancement toward answering this complex question of
    how best to manage labor and birth decisions at full term among low-risk women.
    J Midwifery Womens Health 2019;00:1–7 © 2019 by the American College of Nurse-Midwives.
              Keywords: causality, cesarean, induction of labor, labor onset, pragmatic randomized controlled trials, randomized controlled trials
1 Department of Pediatrics, Stanford University School of Medicine, Stanford, California
2 School of Public Health, Oregon Health and Science University–Portland State University, Portland, Oregon
3 Department of Obstetrics and Gynecology, Oregon Health and Science University, Portland, Oregon

Correspondence
Suzan L. Carmichael
Email: scarmichael@stanford.edu

ORCID
Suzan L. Carmichael https://orcid.org/0000-0001-6310-5924
Jonathan M. Snowden https://orcid.org/0000-0002-9566-3047

1526-9523/09/$36.00 doi:10.1111/jmwh.12996

INTRODUCTION
With the emergence of evidence that neonatal risk is heterogeneous across the term gestational age range,1–3 there has been increased focus on defining the timing of birth that best minimizes risks of adverse perinatal outcomes at term. To date, evidence suggests that birth is optimal for the health of the newborn and the woman, barring any clinical indications for giving birth earlier, during a relatively narrow period, from 39 weeks 0 days to 40 weeks 6 days gestation.3–5 Approximately 58% of births occur within this 2-week time frame.6

The optimal timing of birth and best clinical management within the confines of this time frame are open questions. Uncertainties exist regarding how to optimally balance the potential risks and benefits of expectantly managing pregnancy and waiting for spontaneous onset of labor, which may entail the emergence of a clinical indication for birth, versus electively inducing labor. Elective induction of labor has conveniences, such as planned scheduling, and the earlier it occurs, the less at-risk time there is for adverse events, such as preeclampsia or stillbirth, to occur. Conversely, elective labor induction is costly with respect to staffing levels and patient time, and it introduces medical intervention into labor and birth, which does not align with some women’s preferences.7,8

Prior evidence evaluating the benefits and harms of expectant management versus elective induction of labor is mixed and primarily observational.9–12 Observational studies that compare the outcomes of women and newborns when labor is electively induced with outcomes preceded by spontaneous onset of labor at the same gestational age have generally found that labor induction is associated with an increased incidence of cesarean birth.9,10 In contrast, studies that compare elective induction of labor with expectant management, which includes all later births, whether they occur spontaneously or not, have found that labor induction is associated with lower risks of cesarean and adverse perinatal outcomes.11,12

Given the conflicting findings of observational research, the recent ARRIVE trial (A Randomized Trial of Induction Versus Expectant Management) was designed to provide evidence regarding whether labor induction or expectant management is associated with more adverse perinatal outcomes among healthy nulliparous women at full term.13 In order to appropriately interpret the trial’s findings from a causal inference perspective, it is important to consider epidemiologic concepts related to study design and how the design may influence a study’s internal and external validity, which should in turn influence how the findings are translated to practice. This brief report provides concrete suggestions for further analyses that could help solidify conclusions from the trial and for further research questions that will continue advancement toward answering this complex
    ✦ The recent ARRIVE trial helped fill evidence gaps on the effects of elective induction of labor at 39 weeks’ gestation, finding
      a significantly lower risk of cesarean birth and no significant difference in composite neonatal complications after elective
      labor induction, compared with expectant management.
    ✦ Complex study design and causal considerations arise when considering the associations between labor induction, cesarean
      birth, and complications.
    ✦ Selection bias may affect studies, including randomized controlled trials, when study participants and eligible
      nonparticipants differ.
    ✦ Future research on elective induction of labor should explore the causal mechanisms in the ARRIVE trial, characterize
      this association in other populations and practice settings, and implement pragmatic clinical trial designs that may more
      closely reflect the concept of shared decision making.
question of how best to manage labor and birth decisions at full term among low-risk women.

SUMMARY OF THE DESIGN AND RESULTS OF ARRIVE

In brief, the ARRIVE trial compared perinatal outcomes among 3062 women who were assigned to undergo elective induction of labor at 39 weeks and 0 to 4 days’ gestation (referred to hereafter as the induction of labor [IOL] group) versus 3044 women assigned to expectant management. In this study, expectant management was defined as no elective labor induction before 40 weeks 5 days’ gestation and birth no later than 42 weeks 2 days’ gestation. Randomization occurred at 38 weeks 0 days’ gestation to 38 weeks 6 days’ gestation. The trial was restricted to low-risk nulliparous women whose singleton fetus was in a vertex position. The low-risk status was defined at the time of randomization. Women had to have a relatively certain date of last menstrual period, and a woman’s estimated gestational age had to match the gestational age determined by ultrasound, or, if she was uncertain, she had to have had a first trimester ultrasound that established gestational age. Women were ineligible if they had any high-risk conditions such as oligohydramnios, fetal growth restriction, hypertensive disorders, or diabetes. Participating hospitals were affiliated with the National Institutes of Health, National Institute of Child Health and Human Development Maternal-Fetal Medicine Units Network (MFMU), a network of academic health centers.13

The primary newborn outcome was a composite of perinatal death or severe neonatal complications, which was 20% lower in the IOL group; specifically, 4.3% of newborns in the IOL group and 5.4% of newborns in the expectant management group were affected, which resulted in a relative risk of 0.80 (95% CI, 0.64-1.00). This result was largely driven by lower frequency of respiratory support in the newborns of women in the IOL group. The difference was not considered statistically significant at P = .049 because significance was set at P < .046, conservatively adjusted from .05 to reflect the one interim analysis that was performed. The principal secondary outcome was cesarean birth, which was 16% lower in the IOL group, that is, 18.6% versus 22.2%, respectively (P < .01), which resulted in a relative risk of 0.84 (95% CI, 0.76-0.93). Secondary newborn outcomes such as birth weight were not significantly different between the women in the 2 study arms. Several secondary maternal outcomes were significantly different (P ≤ .01) between the 2 groups; for example, hypertensive disorders were 36% lower among women in the IOL group (9.1% vs 14.1%), and duration of stay in the labor and delivery unit was slightly longer, and postpartum hospital stay slightly shorter, among women in the IOL group.

An editorial accompanying the publication of the results of ARRIVE ended by stating, “These results . . . should reassure women that elective induction of labor at 39 weeks is a reasonable choice that is very unlikely to result in poorer obstetrical outcomes.”14 A statement published by the Society for Maternal-Fetal Medicine (SMFM) recommended that “It is reasonable to offer elective induction of labor to low-risk, nulliparous women at or beyond 39 weeks and 0 days of gestation” and that “women can be reassured that both elective IOL and expectant management are reasonable options at 39 weeks of gestation.”15 The American College of Nurse-Midwives stated that “Implementation of practice changes to offer 39 week induction of labor should proceed cautiously.”16 An epidemiologic perspective on interpretation of the trial’s findings and their applicability beyond the study’s setting may help to illuminate the clinical value of these findings for different populations.

INTERNAL VALIDITY

When evaluating a study, a usual sequence is to consider its potential internal validity, then its external validity, and then what inferences can be reasonably substantiated by the findings. Table 1 presents definitions of epidemiologic terms related to this process. Causal inference refers to the process of inferring that a cause led to an effect and if so, how. A prerequisite for causal inference is confidence that a study’s findings are internally valid (Figure 1). To assess a study’s internal validity, the findings are first evaluated to determine if they are affected by or the result of chance, confounding, other types of bias, or a combination of factors.17 The large sample size of the ARRIVE trial helps to minimize chance as an alternative explanation, at least for the primary outcome and the more common secondary outcomes.

A factor or variable that independently affects both the intervention and the outcomes is referred to as a confounder (Figure 2A). In the ARRIVE trial, the treatment was assigned
2                                                                                                                Volume 00, No. 0, xxxx 2019
 Table 1. Definition of Terms
 Term                                                                                             Definition
 Causal inference                                      The process of inferring that a cause led to an effect and if so, how. For the ARRIVE study,
                                                         defined as ensuring that observed effects are attributable to elective labor induction
                                                         rather than differences between the sample and the eligible population or between
                                                         study arms, and characterizing the mechanisms underlying the causal association.
 Bias                                                  Systematic distortion of the relationship between a treatment and the outcome. Bias can
                                                         be introduced at any step in a research study, and there are multiple forms of bias such
                                                         as selection bias and confounding.
 Selection bias                                        Bias caused by selecting or retaining a sample in which exposure effects do not reflect
                                                         exposure effects in the overall eligible population. (Thus, it could be a threat to internal
                                                         as well as external validity.)
 Confounding                                           Distortion of the association between an exposure and an outcome because of the
                                                         influence of another variable that is associated with both (Snowden et al, 201840 ).
 Internal validity                                     The degree to which a study’s finding is accurate and correctly reflects the true
                                                         association in the entire eligible population.
 External validity (also known as                      The degree to which a study’s finding generalizes to the overall population to which
    generalizability)                                    researchers wish to make inference.
 Pragmatic trial                                       A randomized trial designed to inform a policy or clinical decision by characterizing
                                                         effects of an exposure in real-world settings. This contrasts with an explanatory trial,
                                                         which aims to confirm a more narrow biological or clinical hypothesis (Ford and
                                                         Norrie, 201633 ).
 Incidence proportion (also known                      The number of new cases (ie, incident outcomes) among the study population initially at
    as cumulative incidence)                             risk, within a specified period of time. At-risk people are the denominator (Rothman et
                                                         al, 2008;17 Szklo and Nieto, 201241 ).
 Incidence density (also known as                      The number of new cases that emerge during the study population’s time at risk. At-risk
    incidence rate)                                      person-time is the denominator rather than people, accounting for differences in time
                                                         at risk between exposure groups.
 Mediation analysis                                    Analysis focusing on what specific mechanisms (ie, causal mediators) explain the
                                                         association between exposure and outcome. Mediators are temporally on the causal
                                                         pathway between an exposure and an outcome.
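As a rough illustration of the incidence proportion defined in Table 1, the sketch below recomputes the two relative risks reported for ARRIVE (0.80 and 0.84) from the published arm sizes and percentages. The event counts are back-calculated by rounding percentage times arm size, so they are an assumption rather than the trial's actual counts. The log-scale Wald interval is a standard textbook method and not necessarily the one the trial authors used. The function names are hypothetical.

```python
import math


def approx_events(percent, n):
    """Back-calculate an approximate event count from a reported percentage.

    ASSUMPTION: the true counts are not given in this report.
    """
    return round(percent / 100 * n)


def relative_risk(events_a, n_a, events_b, n_b, z=1.96):
    """Ratio of incidence proportions (arm A vs arm B) with a log-scale Wald CI."""
    risk_a = events_a / n_a  # incidence proportion in arm A
    risk_b = events_b / n_b  # incidence proportion in arm B
    rr = risk_a / risk_b
    # Standard error of log(RR) for two independent binomial samples.
    se_log = math.sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper


N_IOL, N_EM = 3062, 3044  # arm sizes reported for the trial

# Primary composite neonatal outcome: 4.3% (IOL) vs 5.4% (expectant management).
primary = relative_risk(approx_events(4.3, N_IOL), N_IOL,
                        approx_events(5.4, N_EM), N_EM)

# Cesarean birth: 18.6% (IOL) vs 22.2% (expectant management).
cesarean = relative_risk(approx_events(18.6, N_IOL), N_IOL,
                         approx_events(22.2, N_EM), N_EM)

for label, (rr, lo, hi) in (("primary", primary), ("cesarean", cesarean)):
    print(f"{label}: RR {rr:.2f} (95% CI, {lo:.2f}-{hi:.2f})")
```

Under these assumptions the results should land close to the published estimates. The upper limit for the primary outcome sits almost exactly at 1.00, which is consistent with the borderline P value reported for that outcome.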
randomly, and adherence was high at 94% in the IOL arm and 95% in the expectant management arm. These features help to ensure that the comparison groups were similar to each other and that results would not be attributable to a third unmeasured confounder. However, bias caused by unmeasured confounders introduced during implementation of the study, or caused by the unblinded nature of the trial, remains a possibility.18–20

Figure 1. Illustration of Internal and External Validity as Applied to the ARRIVE Trial

Selection bias is related to how individuals are selected into a study (Figure 2B). Selection bias refers to the extent to which results in a study sample differ from results in the study population (ie, all eligible individuals). Unfortunately, this can generally not be assessed directly, but when reviewing a study, high participation rates are reassuring.21 Any study wherein the sample population differs from eligible nonparticipants is vulnerable, including randomized controlled trials.21,22 Specifically, selection bias can be a problem if factors affecting participation in the study also affect the effectiveness of the treatment under study, thereby affecting the study outcomes.22 Selection bias is a concern with regard to the ARRIVE trial. It is not clear that the sample of women who participated is representative of the overall study population or whether associations between the treatment and outcomes
are likely to be similar for those who are eligible but not in the study.

Figure 2. Potential Violations of Internal Validity
A) Confounding bias. B) Selection bias. Dots represent a given characteristic that differs between individuals. Different color dots represent individuals with different values of that characteristic. Confounding bias is characterized by an imbalance in participant characteristics between treatment groups. Here, blue dots are more frequent in the treatment arm, and red dots are more frequent in the control arm (A). Selection bias is characterized by the study participants being different from eligible nonparticipants (see Figure 1). In this example, the study sample contains only blue and red dots, whereas eligible nonparticipants include yellow dots in addition to blue and red dots (B).

First, women had to have a relatively certain date of their last menstrual period to be eligible; 13% (n = 6606) of the screened low-risk women were ineligible based on this criterion. Second, only 27% of eligible women (n = 6106) agreed to participate. Third, the cesarean birth rate in both groups within the study was lower than in many US settings, which suggests there might be something about this cohort that is different from the general population.23 For example, approximately 20% of participants in this study had cesarean births, which is lower than the 2017 US rate of 26% among women with nulliparous, term, singleton, vertex presentations, a group with an admittedly higher risk profile than ARRIVE participants.6 Fourth, the participants were considerably more likely to be African American than the general US population of women who give birth, at 23% in the trial versus 15% in the United States, and they were also younger, with 4% being aged at least 35 years in ARRIVE, compared with the United States nationally, in which 18% of childbearing women are aged at least 35 years.24 Fifth, the incidence of hypertension may be higher than expected for low-risk women during a brief at-risk duration. Specifically, the median gestational age at randomization was 38 weeks 3 days (the range was 38 weeks and 0-6 days). At this point, women did not have hypertensive disorders, yet 8% of the IOL group and 14% of the expectant management group had a hypertensive disorder by the time they gave birth. During the entire pregnancy, only about 5% to 6% of all pregnant women (high- and low-risk women combined) would be expected to develop a hypertensive disorder (approximately 3% for preeclampsia and 2%-3% for gestational hypertension).25 It is difficult to assess the precise risk of hypertensive disorders that would be expected in a low-risk population beyond 38 weeks’ gestation in the absence of detailed week-by-week incidence data. However, considering the overall incidence of hypertensive disorders during pregnancy and the US distribution of gestational length,24 one could expect the cumulative incidence of hypertension at term to be less than 8% per week (IOL group) or 14% during 2 weeks of expectant management, especially among low-risk women. This example assumes the approximate median time between randomization and birth to have been 1 to 2 weeks for the 2 respective arms of the trial.

A thorough comparison of the participants and nonparticipants is thus critical to addressing the question of internal validity. For example, it would be very helpful to know whether sociodemographic characteristics or the prevalence of any of the ARRIVE outcomes were similar among all low-risk women who gave birth at the study hospitals or any MFMU-affiliated hospitals, in comparison with the ARRIVE participants. In addition, given the potentially high incidence of hypertension among study participants, it would be useful to determine whether the higher risk of cesarean birth and neonatal complications is also observed among women who did not develop hypertension, that is, women who were definitely at low risk. This could be assessed via stratum-specific analysis.

EXTERNAL VALIDITY

Once the internal validity of a study is deemed to be strong, one may consider a study’s external validity, which is pertinent to the generalizability of the findings (Figure 1). External validity addresses whether the intended study population and study findings are applicable to the broader population they are intended to represent. In considering the applicability of the ARRIVE study sample to the general population, similar concerns arise as described above regarding internal validity; that is, it is uncertain to what extent the study population and setting are representative of the larger source population the study intends to represent.

Another concern about generalizability relates to practice setting. To assess whether associations between labor induction and outcomes such as cesarean birth will generalize to other settings, one must consider the degree to which MFMU-affiliated hospitals are representative of most clinical settings where women give birth. Intrapartum management varies considerably across hospitals. This is illustrated, for example, by the wide range of cesarean birth rates for low-risk women, which vary 10-fold across US hospitals, from approximately 7% to 70%.26 In California, the median hospital cesarean rate after labor induction in low-risk nulliparous women is 32%, and it is as high as 50% to 60% in different hospitals across the state.27 This is in contrast to the cesarean rate of 18.6% among the ARRIVE cohort assigned to IOL. One must consider the degree to which the ARRIVE study results are applicable to hospitals that have twice this rate of cesarean birth. Evidence also suggests that childbirth care practices differ between academic health centers and their affiliates, which include the ARRIVE study sites, and hospitals that are not academically affiliated.26,28 In some instances, teaching hospitals have been demonstrated to have higher levels of evidence-based practice and lower levels of unwarranted practice variation.26,28

For example, all hospitals in the study adhered to a common definition of unsuccessful labor induction in the latent phase of labor,27,29 and once in the active phase followed
American College of Obstetricians and Gynecologists–SMFM guidelines for diagnosis of labor arrest and descent disorders.27,30 It is possible that in the absence of such practice guidelines, an increased frequency of labor induction, with concomitant longer mean labor duration, would result in a higher cesarean birth rate because of increased diagnoses of labor dystocia and labor arrest.27,31,32 More information about how these labor management considerations compare with those in most US hospitals, and how the differences could translate to variability in perinatal outcomes, will help in assessing the generalizability of the ARRIVE results. It could be that the ARRIVE participants are representative of the study population from which they were drawn and that their management resulted in particularly low rates of cesarean birth, but the question remains whether implementing induction of labor would yield similar outcomes in other settings.

FURTHER QUESTIONS THAT ARRIVE CAN ANSWER
The ARRIVE trial focused on a dichotomized question: a comparison of perinatal outcomes when labor is induced within a 4-day time window early during the 39th week of gestation with outcomes among women who do not undergo elective induction of labor at 39 weeks’ gestation. The rich ARRIVE data could be used to determine the source of the improvements in outcomes in the IOL group.
     First, the occurrence of the primary outcomes among women in the IOL group could be compared with subsets of the expectant management group defined by time frame of birth (eg, <39 weeks 5 days, 39 weeks 5 days to 40 weeks 4 days, 40 weeks 5 days to 41 weeks 4 days, and 41 weeks 5 days to 42 weeks 2 days of gestation, as sample size allows) and labor onset (ie, spontaneous onset of labor, elective induction of labor, or induction of labor or prelabor cesarean birth owing to onset of complications). Although these comparison groups cannot be randomly assigned, understanding associations with outcomes in these more specific subgroups could inform decision making and the understanding of risks and benefits. For example, this type of analysis would address the risks and benefits associated with expectant management for a relatively shorter amount of time, rather than waiting until 42 weeks 2 days’ gestation. This is particularly relevant in the context of the design of the ARRIVE study. Elective induction of labor was not an option for the women in the expectant management group until 40 weeks 5 days’ gestation. Thus, from the time of randomization through 40 weeks 4 days’ gestation, spontaneous birth, barring clinical indications for birth, was implicitly assigned for these women, and useful information could be gained from stratifying analyses by gestational age.
     Second, it would be helpful to know whether the increased occurrence of cesarean birth among women in the expectant management group was attributable to the increased incidence of hypertensive disorders. Similarly, it would be informative to determine whether the higher risk of neonatal complications, especially respiratory support, was attributable to the higher occurrence of cesarean birth. Mediation analyses could be conducted to answer these questions. Mediation analysis assesses the portion of an effect that is attributable to a mediator, that is, a factor that is on the causal path between an exposure and an outcome. In these 2 examples, hypertensive disorders could be a mediator between treatment, which in this case is labor induction or expectant management, and cesarean birth, and cesarean birth could be a mediator between treatment and neonatal complications.
     Finally, the incidence rate (ie, incidence density) of some maternal outcomes could be calculated in addition to the incidence proportion (ie, cumulative incidence), which was presented for the ARRIVE trial (see Table 1 for definitions). The 2 treatment arms had different amounts of time at risk based on the study design, and the risk of some outcomes, such as hypertension, increases with longer duration of pregnancy. Therefore, explicitly including person-time in the denominator of incidence calculations for relevant outcomes would analytically address this difference between study arms, providing information about the mechanism behind observed differences in risk.

NEXT STEPS: RECOMMENDATIONS FOR FUTURE RESEARCH
The ARRIVE trial has addressed an important question in a specific population and practice setting. Although this question is clinically relevant, many other questions need to be asked and answered about optimal timing of birth and use of elective induction of labor, especially if inherent goals include applicability to real-world practice and women’s choices. Pragmatic trials may be important tools to answer questions about the real-world effectiveness (weighing benefits and risks) of elective induction of labor as compared with expectant management.33–35 The ARRIVE study incorporated elements of a pragmatic trial (ie, not prespecifying the methods of labor induction), but this approach could be more fully applied to the design of the comparison groups. For example, if women not receiving elective labor induction at 39 weeks’ gestation were encouraged to actively participate in their care, including selection of elective labor induction at a later date if preferred, rather than ceding this choice to the trial protocol, the results might differ. Such a design might also result in higher enrollment. Such questions are pertinent to shared decision making and are likely to be of high interest to women and clinicians. This type of inquiry is becoming more common in other areas of health care, such as mental health care and chronic disease management,36,37 and will be important to guide the translation of the results to practice.
     Further analysis and discussion of the ARRIVE data and further research would be useful before substantial practice change, especially given the differences between controlled study settings and general practice.38,39 The development and implementation of practice guidelines is complex, and integrating across many sources of evidence and questions enables practice change to occur in ways that are evidence based, vigilant regarding potential unintended consequences, and conducive to shared decision making.35,38 Other approaches to support the goal of safe reduction of primary cesarean birth also merit further research, for example, doula support and manual rotation of the fetal occiput for fetal malposition.16,30
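The distinction this report draws between the incidence proportion (cumulative incidence) and the incidence rate (incidence density) can be illustrated with a minimal sketch. All numbers below are made up for illustration and are not ARRIVE results; the `Arm` class and its field names are hypothetical, and person-time is assumed to be already summed per arm.

```python
from dataclasses import dataclass


@dataclass
class Arm:
    """Aggregate follow-up data for one trial arm (hypothetical values)."""
    name: str
    n_women: int          # women at risk at randomization
    n_events: int         # women who developed the outcome
    person_weeks: float   # total weeks at risk, summed over all women


def incidence_proportion(arm: Arm) -> float:
    """Cumulative incidence: events divided by women at risk at baseline."""
    return arm.n_events / arm.n_women


def incidence_rate(arm: Arm) -> float:
    """Incidence density: events divided by person-time at risk."""
    return arm.n_events / arm.person_weeks


# Made-up numbers for illustration only; these are NOT ARRIVE results.
induction = Arm("induction", n_women=1000, n_events=90, person_weeks=1000.0)
expectant = Arm("expectant management", n_women=1000, n_events=140,
                person_weeks=1500.0)

for arm in (induction, expectant):
    print(
        f"{arm.name}: "
        f"proportion = {incidence_proportion(arm):.3f}, "
        f"rate = {incidence_rate(arm):.3f} per person-week"
    )

# Ratio of expectant management to induction under each measure.
risk_ratio = incidence_proportion(expectant) / incidence_proportion(induction)
rate_ratio = incidence_rate(expectant) / incidence_rate(induction)
print(f"risk ratio = {risk_ratio:.2f}, rate ratio = {rate_ratio:.2f}")
```

With these hypothetical inputs the risk ratio is 1.56 and the rate ratio is 1.04. The arms differ in cumulative incidence but have nearly the same rate per person-week, which is the kind of pattern that would point to longer time at risk, rather than a higher risk per unit of time, as the source of a difference.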
     There is also a need for research on the supporting factors that might help explain the associations found in the ARRIVE study and help translate these benefits to other settings, for example, whether broad adherence to modern evidence-based definitions of unsuccessful labor induction and labor dystocia might safeguard against potential unintended consequences of practice change, such as an increase in cesarean birth.27,29,30 In addition, work remains to be done to ensure that women are fully informed about their options for labor and birth and the concomitant risks and benefits. The final recommendation of the SMFM statement about ARRIVE is, “We recommend that further research be conducted to measure the impact of this practice in settings other than a clinical trial.”15 Additional analysis of the ARRIVE data and additional rigorous research will support this goal.

CONFLICT OF INTEREST
The authors have no conflicts of interest to disclose.

ACKNOWLEDGMENTS
This research was supported by the National Institutes of Health, grant NR017020.

REFERENCES
 1. Cheng YW, Nicholson JM, Nakagawa S, Bruckner TA, Washington AE, Caughey AB. Perinatal outcomes in low-risk term pregnancies: do they differ by week of gestation? Am J Obstet Gynecol. 2008;199(4):370.e371-370.e377.
 2. Reddy UM, Bettegowda VR, Dias T, Yamada-Kushnir T, Ko CW, Willinger M. Term pregnancy: a period of heterogeneous risk for infant mortality. Obstet Gynecol. 2011;117(6):1279-1287.
 3. Spong CY. Defining “term” pregnancy: recommendations from the Defining “Term” Pregnancy Workgroup. JAMA. 2013;309(23):2445-2446.
 4. ACOG (American College of Obstetricians and Gynecologists). Practice bulletin no. 146: Management of late-term and postterm pregnancies. Obstet Gynecol. 2014;124(2 pt 1):390-396.
 5. ACOG (American College of Obstetricians and Gynecologists). ACOG committee opinion no. 561: Nonmedically indicated early-term deliveries. Obstet Gynecol. 2013;121(4):911-915.
 6. Martin JA, Hamilton BE, Osterman MJK, Driscoll AK, Drake P. Births: final data for 2016. Natl Vital Stat Rep. 2018;67(1):1-55.
 7. ACOG (American College of Obstetricians and Gynecologists). Committee opinion no. 687: Approaches to limit intervention during labor and birth. Obstet Gynecol. 2017;129(2):e20-e28.
 8. American College of Nurse-Midwives; Midwives Alliance of North America; National Association of Certified Professional Midwives. Supporting healthy and normal physiologic childbirth: a consensus statement by the American College of Nurse-Midwives, Midwives Alliance of North America, and the National Association of Certified Professional Midwives. J Midwifery Womens Health. 2012;57(5):529-532.
 9. Vahratian A, Zhang J, Troendle JF, Sciscione AC, Hoffman MK. Labor progression and risk of cesarean delivery in electively induced nulliparas. Obstet Gynecol. 2005;105(4):698-704.
10. Vrouenraets FP, Roumen FJ, Dehing CJ, van den Akker ES, Aarts MJ, Scheve EJ. Bishop score and risk of cesarean delivery after induction of labor in nulliparous women. Obstet Gynecol. 2005;105(4):690-697.
11. Cheng YW, Kaimal AJ, Snowden JM, Nicholson JM, Caughey AB. Induction of labor compared to expectant management in low-risk women and associated perinatal outcomes. Am J Obstet Gynecol. 2012;207(6):502.e1-502.e8.
12. Darney BG, Snowden JM, Cheng YW, et al. Elective induction of labor at term compared with expectant management: maternal and neonatal outcomes. Obstet Gynecol. 2013;122(4):761-769.
13. Grobman WA, Rice MM, Reddy UM, et al. Labor induction versus expectant management in low-risk nulliparous women. N Engl J Med. 2018;379(6):513-523.
14. Greene MF. Choices in managing full-term pregnancy. N Engl J Med. 2018;379(6):580-581.
15. Society for Maternal-Fetal Medicine Publications Committee. SMFM statement on elective induction of labor in low-risk nulliparous women at term: the ARRIVE Trial [published online August 9, 2018]. Am J Obstet Gynecol. https://doi.org/10.1016/j.ajog.2018.08.009
16. American College of Nurse-Midwives. ARRIVE Trial: Talking Points for Members. Silver Spring, MD: American College of Nurse-Midwives; 2018.
17. Rothman KJ, Greenland S, Lash TL. Modern Epidemiology. 3rd ed. Philadelphia, PA: Lippincott Williams & Wilkins; 2008.
18. Schulz KF. Unbiased research and the human spirit: the challenges of randomized controlled trials. CMAJ. 1995;153(6):783-786.
19. Schulz KF, Chalmers I, Hayes RJ, Altman DG. Empirical evidence of bias. Dimensions of methodological quality associated with estimates of treatment effects in controlled trials. JAMA. 1995;273(5):408-412.
20. Klein MC, Kaczorowski J, Robbins JM, Gauthier RJ, Jorgensen SH, Joshi AK. Physicians’ beliefs and behaviour during a randomized controlled trial of episiotomy: consequences for women in their care. CMAJ. 1995;153(6):769-779.
21. Haneuse S. Distinguishing selection bias and confounding bias in comparative effectiveness research. Med Care. 2016;54(4):e23-e29.
22. Jadad AR, Enkin MW. Bias in randomized controlled trials. In: Randomized Controlled Trials: Questions, Answers, and Musings. 2nd ed. Malden, MA: Blackwell Publishing; 2007:29-47.
23. Kozhimannil KB, Arcaya MC, Subramanian SV. Maternal clinical diagnoses and hospital variation in the risk of cesarean delivery: analyses of a National US Hospital Discharge Database. PLoS Med. 2014;11(10):e1001745.
24. Martin JA, Hamilton BE, Osterman MJK, Driscoll AK, Drake P. Births: final data for 2017. Natl Vital Stat Rep. 2018;67(8):1-50.
25. Hutcheon JA, Lisonkova S, Joseph KS. Epidemiology of pre-eclampsia and the other hypertensive disorders of pregnancy. Best Pract Res Clin Obstet Gynaecol. 2011;25(4):391-403.
26. Kozhimannil KB, Law MR, Virnig BA. Cesarean delivery rates vary tenfold among US hospitals; reducing variation may address quality and cost issues. Health Aff (Millwood). 2013;32(3):527-535.
27. Main E; CMQCC Leadership Team. How to Apply the ARRIVE Trial to my Practice. Stanford, CA: California Maternity Quality Care Collaborative; August 17, 2018.
28. Kozhimannil KB, Karaca-Mandic P, Blauer-Peterson CJ, Shah NT, Snowden JM. Uptake and utilization of practice guidelines in hospitals in the United States: the case of routine episiotomy. Jt Comm J Qual Patient Saf. 2017;43(1):41-48.
29. Grobman WA, Bailit J, Lai Y, et al. Defining failed induction of labor. Am J Obstet Gynecol. 2018;218(1):122.e121-122.e128.
30. Caughey AB, Cahill AG, Guise JM, Rouse DJ. Obstetric care consensus no. 1: safe prevention of the primary cesarean delivery. Obstet Gynecol. 2014;123(3):693-711.
31. Harper LM, Caughey AB, Odibo AO, Roehl KA, Zhao Q, Cahill AG. Normal progress of induced labor. Obstet Gynecol. 2012;119(6):1113-1118.
32. Neal JL, Lowe NK, Schorn MN, et al. Labor dystocia: a common approach to diagnosis. J Midwifery Womens Health. 2015;60(5):499-509.
33. Ford I, Norrie J. Pragmatic trials. N Engl J Med. 2016;375(5):454-463.
34. Treweek S, Zwarenstein M. Making trials matter: pragmatic and explanatory trials and the problem of applicability. Trials. 2009;10:37.
35. Hawe P. Lessons from complex interventions to improve health. Annu Rev Public Health. 2015;36:307-323.
36. Lovell K, Bee P, Brooks H, et al. Embedding shared decision-making in the care of patients with severe and enduring mental health problems: the EQUIP pragmatic cluster randomised trial. PLoS One. 2018;13(8):e0201533.
37. Bennett GG, Warner ET, Glasgow RE, et al. Obesity treatment for socioeconomically disadvantaged patients in primary care practice. Arch Intern Med. 2012;172(7):565-574.
38. Phillippi JC, King TL. Assessing the value of the ARRIVE trial for clinical practice: sea change or just a splash? J Midwifery Womens Health. 2018;63(6):645-647.
39. Davies-Tuck M, Wallace EM, Homer CSE. Why ARRIVE should not thrive in Australia. Women Birth. 2018;31(5):339-340.
40. Snowden JM, Tilden EL, Odden MC. Formulating and answering high-impact causal questions in physiologic childbirth science: concepts and assumptions. J Midwifery Womens Health. 2018;63(6):721-730.
41. Szklo M, Nieto FJ. Epidemiology: Beyond the Basics. 3rd ed. Burlington, MA: Jones & Bartlett; 2012.