Glogit
Glogit
Paper 446-2013
        ABSTRACT
        Logistic regression is most often used for modeling simple binary response data. Two modifications extend it
        to ordinal responses that have more than two levels: using multiple response functions to model the ordered
        behavior, and considering whether covariates have common slopes across response functions.
        This paper describes how you can use the LOGISTIC procedure to model ordinal responses. Before
        SAS/STAT® 12.1, you could use cumulative logit response functions with proportional odds. In SAS/STAT
        12.1, you can fit partial proportional odds models to ordinal responses. This paper also discusses methods
        of determining which covariates have proportional odds. The reader is assumed to be familiar with using
        PROC LOGISTIC for binary logistic regression.
        INTRODUCTION
        Logistic regression modeling has applications in many areas, including clinical studies, epidemiology, data
        mining, social sciences, marketing, and engineering. It has proved to be reliable for both prospective
        analyses (such as designed experiments or clinical trials) and retrospective analyses (such as found data or
        case-control studies).
        If your response variable can take only two values (the “event” and the “non-event”), then the conditions for
        linear regression are not met; in particular, your errors are binary and not normally distributed. Binary logistic
        regression was developed to handle this case. Instead of modeling the response itself, you use logistic
        regression to model the probabilities of events.
        Suppose your response variable Yi has events coded as 0 and non-events coded as 1 for each observation
        i D 1; : : : ; n, and your p covariates are xi D .xi1 ; xi 2 ; : : : ; xip /. Write the probability of an event as
        i D Pr.Yi D 0/. Then the logistic model is written as
                                   
                                i
            logit.i / D log          D ˛ C xi ˇ D ˛ C xi1 ˇ1 C xi 2 ˇ2 C    C xip ˇp
                               1 i
        where the expected behavior of the responses is modeled by a linear combination of the covariates, the
        linear predictor ˛ C Xˇ. The quotient 1 ii is called the odds, and the log of the odds is called the logit or
        the log odds. This model is linear in the ˛ and ˇ, and the different combinations of the X covariates define
        subpopulations within which the response is independently binomially distributed. Under simple random
        sampling within the subpopulations, the data have a product binomial distribution from which you can derive
        a likelihood and all other results. When using continuous covariates, you might have only one observation
        in each combination, but this affects only certain statistical results. A general recommendation is that you
        should have at least 10 events and 10 non-events for each parameter (Peduzzi et al. 1996).
        Although binary logistic regression is most common, logistic regression is extensible to more than two
        response levels. If your response variable takes values that have no inherent ordering (such as voting
        Democratic, Green, Independent, Republican), then your response is nominal. If your response takes values
        that have an intrinsic order (good, better, best), then your response is ordinal.
        This paper first reviews how binary logistic regression extends to polytomous logistic regression—in particular,
        to a special ordinal response model, the proportional odds model combined with a cumulative logit link.
        Then the proportional odds model is relaxed by using the new UNEQUALSLOPES option in the LOGISTIC
        procedure to fit the partial proportional odds model. Methods for determining which model applies to
        your data are also described. The paper ends with suggestions for performing model selection while
        simultaneously assessing the proportional odds of the individual parameters.
                                                                1
SAS Global Forum 2013                                                                             Statistics and Data Analysis
        Link Functions
        Generalized Logit Link
           Define a link function so that each response function contrasts a lower level with the last level:
                                            
                 i1           i2           i3
           log           log          log
                 i 4          i4           i4
           This is the generalized logit link, and it ignores the order of the responses, beyond identifying the last
           one as the reference response. You use this link when you have nominal data.
        Cumulative Logit Link
           The most popular ordinal link function uses every probability in every function by contrasting the lower
           levels of Y with the higher levels of Y . Let the cumulative probability be denoted as ij D Pr.Yi  j /:
                                                                                                                        
                                      i1                               i1 C i 2                         i1 C i 2 C i 3
            logit.i1 / D log                      logit.i 2 / D log                 logit.i 3 / D log
                                i2 C i3 C i4                         i 3 C i 4                              i 4
           This is the cumulative logit link. As you move from the first logit function to the second and from the
           second logit function to the third, the numerator increases and the denominator decreases, so the
           cumulative logits are increasing.
        Linear Predictors
        General Model
           To model the variation of the probabilities, you set each of your J – 1 = 3 response functions equal to a
           different linear predictor, ˛j C X0 ˇj ; j D 1; 2; 3. Choosing this model has several consequences:
            1. There are .J 1/.1 C p/ D 3.1 C p/ parameters, which can be too many for a successful fit and
               can violate the ideal of model parsimony.
            2. If you take the difference of the jth response function between any two subpopulations, h and i, you
               get
               where g is the link function and g.ij / is the jth response function for the ith observation. The
               subscript on ˇj means that this difference depends on the specific response function, so your
               conclusions are drawn for each response function separately. If you use the generalized logit link
               function, this difference is the log odds ratio.
            3. There are no constraints on the ˛j and ˇj , so there is no guarantee that the response functions
               are ordered.
           Any of the links that are used with this model is an instance of a general (generalized) model (Agresti
           2010). In PROC LOGISTIC, specifying the LINK=GLOGIT option in the MODEL statement fits this model
           to the generalized logit.
        Proportional Odds Model
           The general model has unequal slopes for the predictors, and you need enough data to estimate a
           different coefficient for each predictor in each response function. To simplify this model, you can force an
                                                                  2
SAS Global Forum 2013                                                                      Statistics and Data Analysis
            ordering on the linear predictors by using the same slope parameters for each response function and by
            constraining the intercepts to increase (˛1 < ˛2 < ˛3 ) or decrease:
g.j / D ˛j C X0 ˇ j D 1; 2; 3
           This model has J 1 C p parameters and thus requires less data for an adequate fit than the general
           model requires. It also provides a more straightforward interpretation. Compute the difference of the jth
           response function between two subpopulations h and i to see the impact of this model:
           This difference is proportional to the distance between the explanatory variables, and the difference is
           the same no matter which response function you consider. This is the equal slopes assumption, which
           is also called the parallel lines assumption. You can apply the parallel lines assumption to any of the
           link functions, but it is most commonly used with the cumulative logit link. When you use the cumulative
           logit link, the assumption is the proportional odds assumption, the model is the proportional odds model,
           and the difference of cumulative logits (g) is the log cumulative odds ratio. By default, PROC LOGISTIC
           fits the proportional odds model combined with the cumulative logit link when you have more than two
           response levels.
        Partial Proportional Odds Model
           The third linear predictor is a hybrid of the general model and the proportional odds model. Suppose one
           set of effects X has p1 parameters that satisfy the parallel lines assumption (that is, they have equal
           slopes), but the remaining set Z has p2 parameters that do not and instead require the general model
           (that is, they have unequal slopes). This model is written as
g.j / D ˛j C X0 ˇ C Z0 j j D 1; 2; 3
            It has .J 1/ C p1 C .J 1/p2 parameters and, when used with the cumulative logit link, is called the
            partial proportional odds model (Peterson and Harrell 1990). Interpretation of the proportional odds
            parameters is independent of the response function; interpretation of the general parameters depends
            on the response function. When you fit the partial proportional odds model, you must be especially
            careful to ensure that the cumulative logits remain ordered for the data being modeled.
The next sections fit proportional odds models and partial proportional odds models to several data sets.
           data Asbestos;
              input Task Ventilation Exposure Freq @@;
              datalines;
           0 0 1 29   0 0 2 1   0 0 3 1       0 1 1 3                0 1 2 1    0 1 3 2
           1 0 1 10   1 0 2 1   1 0 3 7       1 1 1 3                1 1 2 3    1 1 3 22
           ;
        The Task variable is 0 if the worker removed asbestos tile and 1 if the worker removed asbestos insulation.
        The Ventilation variable is 0 if a negative pressure ventilation system was used and 1 if a general system
        was in place. The response variable Exposure measures the extent of the worker’s exposure to asbestos; it
        takes the values 1 for low exposure, 2 for exposure near the legal limit, or 3 for exposure above the legal
        limit. The Freq variable counts the number of workers in each grouping.
        The following code fits a proportional odds model to the Asbestos data set; the classification variables are
        entered as main effects. The CLASS statement is optional for these data, but you specify the PARAM=REF
        and REF=FIRST options in the CLASS statement to preserve the 0-1 coding of the two binary covariates
        (PARAM=REF is the incremental effects parameterization). The EFFECTPLOT statement displays a plot of
        the fit.
                                                                 3
SAS Global Forum 2013                                                                              Statistics and Data Analysis
This model has only four combinations of the explanatory variables, as shown in Table 1.
        Each subpopulation has two cumulative logits in the model. If you write the cumulative probabilities for the
        subpopulations in each logit function as ij D Pr.Exposure  j jsubpopulation D i /, the model is
           2               3 2                         3 2                 3
              logit.a1 /       ˛1                            1 0 0 0
           6 logit.b1 / 7 6 ˛1
           6               7 6             Cˇ1         7 6 1 0 1 0 7
                                                       7 6                 7
           6 logit.c1 / 7 6 ˛1                  Cˇ    7 6 1 0 0 1 72              3
           6               7 6                       2 7   6               7 ˛1
           6 logit.d1 / 7 6 ˛1            Cˇ1 Cˇ2 7   7 6 1 0 1 1 76
                                                           6               7
                                                                               ˛2 7
                           7D6                         7D6
           6               7 6
                                                                           76      7
                                                                               ˇ
           6
           6 logit.a2 / 7 6          ˛ 2              7   6  0  1  0    0 7 4   1 5
                                                       7 6 0 1 1 0 7 ˇ2
           6               7 6                         7 6                 7
           6 logit.b2 / 7 6
           6               7 6        ˛2 Cˇ1           7 6                 7
           4 logit.c2 / 5 4          ˛2         Cˇ2 5 4 0 1 0 1 5
              logit.d 2 /            ˛2 Cˇ1 Cˇ2              0 1 1 1
        The first row in the preceding equations shows you that the intercept for the first response function, ˛1 , is the
        log odds of Exposure = 1 versus 2 or 3 for Task = 0 and Ventilation = 0. The fifth row shows you that the
        intercept ˛2 is the log odds of Exposure = 1 or 2 versus 3 for Task = 0 and Ventilation = 0. The second (or
        sixth) row, compared to the first (or fifth), shows you that ˇ1 is the increment in both response functions (or
        both types of log cumulative odds) due to Task = 1. The third (or seventh) row, compared to the first (or fifth),
        shows you that ˇ2 is the increment in both types of log cumulative odds due to Ventilation = 1.
        Exponentiating the cumulative logits produces the cumulative odds, and you can solve for the cumulative
        probabilities:
                                            j
            logit.j / D ˛j C X0 ˇ !                    D exp.˛j C X0 ˇ/ ! j D exp.X0 ˇ/=.1 C exp.X0 ˇ//
                                       .1        j /
Table 2 displays these equations for each of the subpopulations and cumulative logits.
                                                                       4
SAS Global Forum 2013                                                                                Statistics and Data Analysis
        Figure 1 displays the response profiles. The note means that the first logit function compares Exposure = 1
        to Exposure = 2 or 3, and the second logit function compares Exposure = 1 or 2 to Exposure = 3; that
        is, they compare lower exposure levels to higher exposure levels. You can compare higher values to lower
        values by specifying the DESCENDING response option.
Response Profile
                                           Ordered                                  Total
                                             Value           Exposure           Frequency
                                                    1                      1            45
                                                    2                      2             6
                                                    3                      3            32
        Figure 2 displays a score test for the proportional odds assumption; the test does not reject the null hypothesis
        that the proportional odds assumption holds. The degrees of freedom are the difference between the number
        of parameters in a general model and the number of parameters in a proportional odds model; in this case,
        6 – 4 = 2. This score test actually tends to reject the null hypothesis more often than it should; Stokes,
        Davis, and Koch (2012) say that this statistic needs approximately five observations (or frequencies) for each
        outcome at each level of each main effect, because small samples might make the statistic artificially large.
        This score test is a good confirmatory test if it does not reject the null; however, if it rejects the null, then you
        need other means to justify the proportional odds assumption.
1.6130 2 0.4464
        The AGGREGATE SCALE=NONE options produce the goodness-of-fit statistics that are displayed in Figure 3.
        These tests compare the actual number of observations in each subpopulation to the number that is predicted
        by the model, and both tests suggest an adequate fit. The degrees of freedom are the number of response
        functions times the number of profiles minus the number of parameters, 4  2 – 4 = 4. The note in Figure 3,
        “Number of unique profiles: 4,” refers to the four subpopulations that are identified earlier. Stokes, Davis, and
        Koch (2012) suggest that you need at least five observations per subpopulation and response in order for
                                                                   5
SAS Global Forum 2013                                                                          Statistics and Data Analysis
        the test to be valid. If you have a continuous variable in your model, then you should not use these tests; you
        should instead compare your model to a larger model and show that your model is just as good.
        The fit statistics that are shown in Figure 4 are often used to compare nested models. The difference of the
        –2 Log L statistics forms the likelihood ratio statistic that is shown in Figure 5.
                                                                           Intercept
                                                          Intercept              and
                                       Criterion               Only       Covariates
        The three global tests that are displayed in Figure 5 evaluate the significance of all the predictors combined.
        They tell you only whether the model has some significance; they don’t say anything about the effect of
        individual predictors. There are 4 parameters – 2 intercepts = 2 degrees of freedom for these tests. All
        tests in Figure 5 reject the null hypothesis that the covariates are unrelated to Exposure; that is, the model
        explains a significant amount of variation in the data.
        The tests that are displayed in Figure 6 are Wald tests that the parameters in a given effect are jointly 0.
        Because the classification covariates are binary and are coded in the design matrix as a single column, the
        tests all have one degree of freedom.
                                                              6
SAS Global Forum 2013                                                                            Statistics and Data Analysis
Figure 6 Tests That the Parameters for a CLASS Effect Are All Zero Are Rejected
                                                                      Wald
                                   Effect                DF     Chi-Square          Pr > ChiSq
        The parameter estimates are displayed in Figure 7. There are two intercepts for the two logit functions
        modeled, but only one slope is reported. The parameter labeled “Intercept 1” is the ˛1 parameter for
        Exposure = 1 versus Exposure = 2 or 3, and “Intercept 2” is the ˛2 parameter for Exposure = 1 or 2 versus
        Exposure = 3.
                                                                Standard              Wald
                 Parameter               DF       Estimate         Error        Chi-Square         Pr > ChiSq
        Both cumulative logits (or log odds) decrease as the explanatory effects increase. Table 3 shows the fitted
        cumulative logits for the subpopulations. Recall the interpretation of the parameter estimates in terms of
        logits; this table shows you that ˛1 =2.4751 is the log odds of Exposure = 1 versus 2 or 3 for a worker with
        Task = 0 and Ventilation = 0, ˛2 is the log odds of Exposure = 1 or 2 versus 3 for a worker with Task = 0
        and Ventilation = 0, ˇ1 is the increment in both log odds for Task = 1, and ˇ2 is the increment in both log
        odds for Ventilation = 1.
                                                                 7
SAS Global Forum 2013                                                                       Statistics and Data Analysis
        Note that the interpretation of the parameter estimates in Figure 7 depends on the PARAM= option, but the
        odds ratios in Figure 8 are independent of the PARAM= option.
        The interpretation of these odds ratios is that a person who performs Task = 1 has 0.102 times the odds
        of a low exposure versus a higher exposure than a person who performs Task = 0 both for Exposure = 1
        versus 2 or 3 and for Exposure = 1 or 2 versus 3. That is, a person who removes insulation is more likely to
        have a higher exposure to asbestos than a person who removes tile. Similarly, a person who works in an
        environment that has general ventilation is more likely to have a higher exposure to asbestos than a person
        who works in an environment that has negative pressure ventilation. This summary holds for both response
        functions, because of the proportional odds assumption.
        Finally, you can generate plots of the fitted model. The EFFECTPLOT statement produces the plot in
        Figure 9, where you can easily see that the predicted probability of a low exposure is the highest for workers
        who remove tile in negative pressure ventilation, and the probability of exposure above the legal limit is
        highest for workers who remove insulation in general ventilation.
           data Coal;
              input Severity $ @@;
              do i=1 to 8; input Exposure freq @@; lnExposure=log(Exposure); output; end;
              datalines;
           Normal   5.8 98 15 51 21.5 34 27.5 35 33.5 32 39.5 23 46 12 51.5 4
           Moderate 5.8 0 15 2 21.5 6 27.5 5 33.5 10 39.5 7 46 6 51.5 2
           Severe   5.8 0 15 1 21.5 3 27.5 8 33.5 9 39.5 8 46 10 51.5 5
           ;
                                                              8
SAS Global Forum 2013                                                                         Statistics and Data Analysis
        The following program fits a proportional odds model to these data and displays plots of the effect of
        the lnExposure variable. By default, PROC LOGISTIC orders the response levels alphanumerically; the
        ORDER=DATA and DESCENDING options are specified to override the default by fitting cumulative logits of
        higher versus lower severity.
        You should not use the standard goodness-of-fit tests, because the lnExposure variable is continuous. A
        good alternative is to compare your main-effects model to a larger one, and test whether your smaller model
        is sufficient. A reasonable extension of this model is to square the continuous effect. To get this test in PROC
        LOGISTIC, you specify a forward selection and include your model. If the fit is good, no other effects are
        added to your model, and the score test is reported in the “Residual Chi-Square Test” table.
        The three EFFECTPLOT statements display the relationship of lnExposure to Severity on three different
        scales: cumulative probabilities over a range of exposure levels extended by 50%, logit of cumulative
        probabilities (because the link function here is the cumulative logit link), and individual probabilities.
                                                                     Standard           Wald
             Parameter                     DF        Estimate           Error     Chi-Square        Pr > ChiSq
                                                                 9
SAS Global Forum 2013                                                                         Statistics and Data Analysis
        The interpretation is that incrementing log(exposure) by 1 increases a miner’s odds of having more severe
        pneumoconiosis by a factor of 13.
        The EFFECTPLOT statements create several different views of the fitted model. First, Figure 12(A) shows
        the three estimated cumulative probability functions. The plot is extended beyond the data to display the
        S-shaped curve of the logistic function. Because the lnExposure values are fit toward the left half of the
        logit functions, the predicted probabilities of higher severity are generally quite low. Second, Figure 12(B)
        shows the fit on the logit scale, which has the parallel lines that you expect when making the proportional
        odds assumption. Third, Figure 12(C) shows the model-predicted probabilities of a miner having each level
        of Severity across the range of the lnExposure values. A miner who has little exposure has the highest
        probability of being normal. As exposure increases, the higher severity levels become more likely and the
        probability of a normal level decreases. When exposure exceeds 33 years, the probability of a moderate
        severity levels off. Presumably miners are moving from normal to moderate severity and from moderate to
        severe, so after this amount of exposure they mostly move into the severe range.
                                          Figure 12 Displays of the Fitted Model
           data Backache;
              input Severity Month Age @@;
              lnAge=log(Age);
              if Severity=0 then Severity=1;
              Trimester=(Month>6)+(Month>3)+1;
              datalines;
           1 0 23     3 0 36    1 0 21    1 0 30    1 0 42       1 0 34   3 3 26   1 7 18    3 6 39      1 0 25
           ;
        The variables that are extracted from this data set are Severity, which takes the values 1 = none or very
        little pain, 2 = troublesome pain, and 3 = severe pain; Trimester, which identifies in which trimester the pain
        starts; and lnAge, which is the log of the mother’s age in years.
        The following program fits a proportional odds model to the data. The third trimester is the reference level for
        the Trimester parameterization. Specifying the DESCENDING option means that the probability of higher
        severity is modeled.
                                                                 10
SAS Global Forum 2013                                                                         Statistics and Data Analysis
22.2794 3 <.0001
        As mentioned earlier, rejection tends to occur more often than it should. Now what should you do?
        You can first try to determine whether the proportional odds assumption is actually valid. There are other
        ways to assess the proportional odds structure besides relying on this test. Two graphical methods are
        discussed in the next section.
            • To confirm the ordinality of the response for a predictor, the means should be strictly increasing or
              decreasing with respect to the Y variable.
            • To assess the proportional odds assumption for a predictor, the model-expected value curve should
              closely follow the mean curve.
        The macro requires CLASS variables to be numeric; otherwise, you can use the coding from the design
        matrix data set that is provided by the OUTDESIGN= option in the PROC LOGISTIC statement.
                                                                 11
SAS Global Forum 2013                                                                        Statistics and Data Analysis
The following code applies these macros to the Backache data set and produces Figure 15:
          %EmpiricalLogitPlot(class=Trimester,cont=lnAge,data=Backache,y=Severity);
          %EXPlot(Trimester lnAge,data=Backache,y=Severity);
                                     Figure 15 Assessment Plots for Backache Data
        The plots for lnAge show no profound departure from proportionality (the mean curve in (B) is increasing),
        but the plots for Trimester certainly do show a departure. But how do you generate a model in which
        the proportional odds structure holds for one covariate but not for another? The answer is the partial
        proportional odds model, a hybrid method in which the lnAge variable has a proportional odds structure and
        the Treatment variable has a general structure. The partial proportional odds model for the Backache data
        is
        where the response functions share the lnAge parameter but each function has its own version of the
        intercept, the Trimester = 1 parameter, and the Trimester = 2 parameter. To interpret these parameters and
        their odds ratios, the lnAge parameter is treated as before—your interpretation applies to both response
        functions. However, when you discuss the Trimester = 1 parameter, you have to distinguish between the
                                                           12
SAS Global Forum 2013                                                                          Statistics and Data Analysis
        Severity = 3 versus 2 or 1 logit function and the Severity = 3 or 2 versus 1 logit function. The partial
        proportional odds model is fit in the next section.
                                                                   Wald
                                  Effect             DF      Chi-Square           Pr > ChiSq
        As in the analysis of the Coal data set, you should not use the standard goodness-of-fit tests because the
        lnAge variable is continuous. Instead compare your model to a reasonable extension: in this case, take
        the interaction and square the continuous effect. You can produce the Rao score test in Figure 17 from a
        forward selection step as shown in the following code:
                                                              13
SAS Global Forum 2013                                                                           Statistics and Data Analysis
8.8720 5 0.1143
        The score test in Figure 17 suggests the main-effects model fits the data adequately.
        Alternatively, you can use a likelihood ratio test to compare the two models, just as the likelihood ratio global
        test compares your model to an intercept-only model. The following %LRTest macro call uses a likelihood
        ratio test to compare these two models and does not reject the main-effects model (p = 0.357):
           %LRTest(data=Backache,class=Trimester            / param=ref,
                   model1=Severity(descending) =            lnAge Trimester / unequalslopes=Trimester,
                   model2=Severity(descending) =            lnAge Trimester lnAge*lnAge lnAge*Trimester
                                               /            unequalslopes=Trimester);
        Figure 18 shows that the Trimester effects for the first response function are not significantly nonzero.
                                                                       Standard           Wald
               Parameter          Severity     DF     Estimate            Error     Chi-Square       Pr > ChiSq
        Even though the Trimester effects for the first logit function are not significant, PROC LOGISTIC does not
        set the estimates to 0. Peterson and Harrell (1990) introduce a constrained partial proportional odds model,
        which enables you to multiply the Trimester parameters by a different specified constant for each response
        function—multiplying the parameter in the first response function by 0 and in the second response function
        by 1 removes the effect from the first response function. You can look for the constrained model in a future
        release of PROC LOGISTIC.
                                                               14
SAS Global Forum 2013                                                                         Statistics and Data Analysis
        You can interpret the intercepts and the lnAge parameter as before. The first intercept (–7.2266) is the log
        odds of Severity = 3 versus 2 or 1 for lnAge = 0 and Trimester = 3. The second intercept (–5.5761) is the
        log odds of Severity = 3 or 2 versus 1 for lnAge = 0 and Trimester = 3. The lnAge parameter (1.767) is the
        increment in log odds of a higher Severity for a one-unit increase in lnAge.
        The interpretation of the Trimester parameters is complicated because they depend on the response
        function. In the first response function, the Trimester = 1 parameter (–0.3949) is the increment in log odds
        for Severity = 3 versus 2 or 1 for pain onset in Trimester = 1, and the Trimester = 2 parameter (–0.3262) is
        the increment in log odds for Severity = 3 versus 2 or 1 for pain onset in Trimester = 2. Neither of these
        is significant. In the second response function, the Trimester = 1 parameter (–1.1475) is the increment in
        log odds for Severity = 3 or 2 versus 1 for pain onset in Trimester = 1, and the Trimester = 2 parameter
        (1.0891) is the increment in log odds for Severity = 3 or 2 versus 1 for pain onset in Trimester = 2.
        Similarly, the Trimester odds ratios that are displayed in Figure 19 also depend on the response functions.
        For the response function that compares Severity = 3 versus 2 or 1, every confidence interval contains 1, so
        mothers who experience pain onset in any trimester are equally likely to have higher-severity pain; that is,
        the Trimester effect does not describe any variation in the first response function. However, in the second
        response function, which compares Severity = 3 or 2 versus 1, mothers who experience pain onset in the
        first trimester are less likely to have higher-severity pain than mothers who experience pain onset in the later
        trimesters, and mothers with pain onset in the second trimester are more likely to have higher-severity pain
        than mothers with pain onset in the third trimester. A mother with a one-unit increase in lnAge has 5.853
        times the odds of having (is more likely to have) a higher severity, for both response functions.
        Up    Start small with a proportional odds model and compare it to unequal slopes on each effect in turn
              (Peterson and Harrell 1990).
        Down Start big with a general model and compare it to equal slopes on each effect in turn (Koch, Amara,
              and Singer 1985, p. 372).
        OneUp Study proportionality for a series of one-effect models, for which Up and Down are equivalent
              (based on Harrell 2001, p. 332).
        None of these types of test are perfect: Up and OneUp are apt to signal nonproportional odds too often;
        Down, which starts big, can require a lot of data and might be less able to detect proportional odds.
        You can implement these tests in three ways:
                                                               15
SAS Global Forum 2013                                                                           Statistics and Data Analysis
           • Wald tests can be computed directly by using the TEST statement in PROC LOGISTIC, but they are
             usually not as good as either likelihood ratio tests or score tests.
           • Likelihood ratio tests require only a little SAS programming to extract the values of the fitted log likelihood
             from the appropriate PROC LOGISTIC table and compare the difference of their log likelihoods to a
             chi-square distribution. The degrees of freedom are equal to the difference of the degrees of freedom
             of the two models. The %LRTest macro facilitates this procedure.
           • Score tests (Peterson and Harrell 1990) are traditionally computed in PROC LOGISTIC by using the
             model selection machinery to produce them from the residual chi-square test for adding the effect in
             question last.
        You can run the likelihood ratio test version of the Up, Down, and OneUp tests for every variable in the model
        by calling the %LRTestCycle macro as in the following code. The macro uses the UNEQUALSLOPES option
        to generate the models in the Up, Down, and OneUp tests that involve nonproportional odds.
        Table 4 reports that the Up tests reject proportional odds for both variables, but the OneUp and Down tests
        disagree for lnAge. To break the tie, compare the two competing models. Both models have Trimester with
        unequal slopes, one model has lnAge with proportional odds, and the other model has lnAge with unequal
        slopes. This is just the Down test that you already ran, so you can conclude that the partial proportional
        odds model fit in the previous section is appropriate.
                                                                16
SAS Global Forum 2013                                                                      Statistics and Data Analysis
        of the selection algorithm. The following sections discuss these two approaches by using the Docvisit data
        set (Cameron and Trivedi 1998, p. 68) from Example 54.17 of the PROC LOGISTIC documentation in the
        SAS/STAT 12.1 User’s Guide.
           data Docvisit;
              input Sex Age Agesq Income Levyplus Freepoor Freerepa
                    Illness Actdays Hscore Chcond1 Chcond2 Dvisits;
              if ( Dvisits > 2) then Dvisits = 2;
              Dvisitsp1= Dvisits+1;
           datalines;
           1 0.19 0.0361 0.55 1 0 0 1 4 1 0 0 1
        The empirical cumulative logits in Figure 20(A) suggest that Actdays and Income should have unequal
        slopes, whereas Age, Agesq, and Hscore might have unequal slopes. The E.XjY O      / plots in Figure 20(B)
        show that the non-ordinality of the Levyplus and Chcond1 mean curves gives them unequal slopes. The
        slight non-ordinality of the Age, Agesq, Freerepa, and Sex mean curves and their slight divergence from
        the expected curves suggest they might have unequal slopes. Finally, choose effects that either plot is sure
                                                                   17
SAS Global Forum 2013                                                                       Statistics and Data Analysis
        of or that both sets of plots agree on: let Actdays, Income, Levyplus, Chcond1, Age, and Agesq have
        unequal slopes. Table 5 contains these conclusions in the Graphical row.
        Second, the results from the following Down likelihood ratio tests are displayed in Table 5, and tests with
        p-values less than 0.1 are chosen to have unequal slopes.
        Method     Sex   Age Agesq Income Levyplus Freepoor Freerepa Illness Actdays Hscore Chcond1 Chcond2
        Graphical equal neq neq         neq      neq      equal    equal    equal    neq    equal     neq      equal
        LRTDown 0.298 0.113 0.053      0.003    0.899     0.598    0.158    0.354   0.011   0.827    0.405     0.804
        Iterative equal neq neq         neq     equal     equal     neq     equal    neq    equal    equal     equal
        Besides Agesq, Income, and Actdays, Table 5 shows that the methods have little agreement on which vari-
        ables should have nonproportional odds. However, you might not have enough data to make a determination
        for nonsignificant effects, so the disagreement might ultimately be unimportant.
        Perform a stepwise selection for the three competing models. The following program runs the model that is
        given in the Graphical row of Table 5, and the other models differ only in the UNEQUALSLOPES specification.
        The results are summarized in the section “Selection Results.”
                                                              18
SAS Global Forum 2013                                                                         Statistics and Data Analysis
        Selection Results
        The selection methods in the preceding sections all choose Sex, Freepoor, Illness, and Hscore as
        proportional odds effects and Agesq and Actdays as general effects. Of these methods, copying the
        variables and using the selection methods is the easiest to implement using PROC LOGISTIC. Finish the
        Docvisit analysis by fitting the selected model as follows:
        CONCLUSION
        This paper shows you how to use the LOGISTIC procedure to fit proportional odds models and how to
        interpret the results. The UNEQUALSLOPES option, available in SAS/STAT 12.1, is introduced to fit partial
        proportional odds models and general models to cumulative logits. Two graphical methods for assessing
                                                              19
SAS Global Forum 2013                                                                         Statistics and Data Analysis
        proportionality are discussed. The UNEQUALSLOPES option also enables you to determine which variables
        satisfy the proportional odds assumptions and which do not.
        Expanding the scope of the LOGISTIC procedure is an active area of SAS/STAT development. You can look
        for the following features in a future release of PROC LOGISTIC: enabling effects to have both general and
        proportional odds structures, fitting the unconstrained partial proportional odds model, fitting generalized
        logit links with proportional odds and partial proportional odds models, and providing alternative ordinal links
        such as the adjacent-category logit.
REFERENCES
        Agresti, A. (2010), Analysis of Ordinal Categorical Data, 2nd Edition, New York: John Wiley & Sons.
        Cameron, A. C. and Trivedi, P. K. (1998), Regression Analysis of Count Data, Cambridge: Cambridge
         University Press.
        Chatfield, C. (1995), Problem Solving: A Statistician’s Guide, 2nd Edition, Boca Raton, FL: Chapman &
         Hall/CRC.
        Harrell, F. E. (2001), Regression Modeling Strategies, New York: Springer-Verlag.
        Koch, G. G., Amara, I. A., and Singer, J. M. (1985), “A Two-Stage Procedure for the Analysis of Ordinal
          Categorical Data,” in P. K. Sen, ed., Biostatistics: Statistics in Biomedical, Public Health, and Environmental
          Sciences, Amsterdam: Elsevier Science.
        McCullagh, P. and Nelder, J. A. (1989), Generalized Linear Models, 2nd Edition, London: Chapman & Hall.
        Peduzzi, P., Concato, J., Kemper, E., Holford, T. R., and Feinstein, A. R. (1996), “A Simulation Study of the
          Number of Events per Variable in Logistic Regression Analysis,” Journal of Clinical Epidemiology, 49,
          1373–1379.
        Peterson, B. L. and Harrell, F. E. (1990), “Partial Proportional Odds Models for Ordinal Response Variables,”
          Journal of the Royal Statistical Society, Series B, 39, 205–217.
        SAS Institute Inc. (2012), “SAS Note 37944: Plots to Assess the Proportional Odds Assumption in an Ordinal
          Logistic Model,” http://support.sas.com/kb/37944.
        Simonoff, J. S. (2003), Analyzing Categorical Data, New York: Springer-Verlag.
        Stokes, M. E., Davis, C. S., and Koch, G. G. (2012), Categorical Data Analysis Using SAS, 3rd Edition, Cary,
          NC: SAS Institute Inc.
        ACKNOWLEDGMENTS
        The author is grateful to Maura Stokes, Randy Tobias, David Schlotzhauer, Ed Huddleston, Anne Baxter,
        and Tim Arnold of the Advanced Analytics Division at SAS Institute Inc. for their valuable assistance in the
        preparation of this paper.
        CONTACT INFORMATION
        Your comments and questions are valued and encouraged. Contact the author:
         Bob Derr
         SAS Institute Inc.
         SAS Campus Drive
         Cary, NC 27513
         bob.derr@sas.com
        SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of
        SAS Institute Inc. in the USA and other countries. ® indicates USA registration.
        Other brand and product names are trademarks of their respective companies.
20