17.12 Logistic Regression with Binomial Response

ChemTree provides logistic regression as an alternative to recursive partitioning on a binomial response. The regressors may be one or more binary, continuous, or categorical variables and/or first-order interactions between these variables.

Categorical dummy variables and interaction terms are used just as they are for linear regression (17.5).

17.12.1 Methodology

To obtain the p-value for logistic regression on a binomial response with n observations, we fit a logit model to the binary response, y, using the covariate matrix, x. We maximize the log likelihood function for the logit model by Newton's method, as outlined in Econometric Analysis, by W.H. Greene, third ed., Prentice Hall, NJ, 1997, pp. 882-886. To obtain a p-value, we test the hypothesis that all of the slope coefficients in the logit model are zero. We calculate a likelihood ratio statistic, where L0 is the restricted likelihood (all slope coefficients set to zero) and L1 is the unrestricted likelihood. Under the hypothesis, -2ln(L0∕L1) is asymptotically chi-squared with k degrees of freedom, where k is the number of slope coefficients being tested. That is,

\[ \log L_0 = n\left[ s \log s + (1 - s)\log(1 - s) \right], \]

where s is the proportion of the n responses (y1...yn) that are equal to one,

\[ \log L_1 = \sum_{i=1}^{n} \left[ y_i \log\left( \frac{1}{1 + e^{-\beta^{T} x_i}} \right) + (1 - y_i) \log\left( 1 - \frac{1}{1 + e^{-\beta^{T} x_i}} \right) \right], \]

evaluated at the maximum likelihood estimate of β, and

\[ p = aP = \mathrm{chisqr}\left( -2(\log L_0 - \log L_1),\; k \right). \]
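The calculation above can be sketched in a few lines of Python. This is only an illustrative sketch, not ChemTree's implementation: the function names (fit_logit, chi2_sf, lr_test) are hypothetical, a constant column is assumed to be added to x, and the degrees of freedom are assumed to equal the number of slope coefficients. It also assumes the responses are not all zero or all one and that the data are not perfectly separated.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function 1 / (1 + exp(-z)).
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def solve(a, b):
    # Solve the linear system a x = b by Gaussian elimination with pivoting.
    m = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(m):
        piv = max(range(c, m), key=lambda r: abs(aug[r][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        for r in range(c + 1, m):
            f = aug[r][c] / aug[c][c]
            for j in range(c, m + 1):
                aug[r][j] -= f * aug[c][j]
    x = [0.0] * m
    for r in reversed(range(m)):
        rest = sum(aug[r][j] * x[j] for j in range(r + 1, m))
        x[r] = (aug[r][m] - rest) / aug[r][r]
    return x

def log_likelihood(beta, rows, ys):
    # Sum of y*log(p) + (1 - y)*log(1 - p) with p = sigmoid(beta . x).
    total = 0.0
    for x, y in zip(rows, ys):
        p = sigmoid(sum(b * v for b, v in zip(beta, x)))
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total

def fit_logit(rows, ys, max_iter=50, tol=1e-10):
    # Newton's method: beta <- beta + H^-1 g, where g = sum (y - p) x
    # and H = sum p (1 - p) x x^T (the negative Hessian of the log likelihood).
    k = len(rows[0])
    beta = [0.0] * k
    for _ in range(max_iter):
        grad = [0.0] * k
        hess = [[0.0] * k for _ in range(k)]
        for x, y in zip(rows, ys):
            p = sigmoid(sum(b * v for b, v in zip(beta, x)))
            w = p * (1.0 - p)
            for a in range(k):
                grad[a] += (y - p) * x[a]
                for c in range(k):
                    hess[a][c] += w * x[a] * x[c]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return beta

def chi2_sf(x, df):
    # Upper-tail chi-squared probability for integer df >= 1, built up with
    # the recurrence Q(a + 1, t) = Q(a, t) + t^a e^-t / Gamma(a + 1).
    if x <= 0:
        return 1.0
    t = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-t)
    else:
        a, q = 0.5, math.erfc(math.sqrt(t))
    while a < df / 2.0 - 1e-9:
        q += math.exp(a * math.log(t) - t - math.lgamma(a + 1.0))
        a += 1.0
    return q

def lr_test(xs, ys):
    # Likelihood ratio test that all slope coefficients are zero.
    n = len(ys)
    s = sum(ys) / n
    ll0 = n * (s * math.log(s) + (1 - s) * math.log(1 - s))  # restricted
    design = [[1.0] + list(x) for x in xs]                   # add constant
    ll1 = log_likelihood(fit_logit(design, ys), design, ys)  # unrestricted
    stat = -2.0 * (ll0 - ll1)
    return stat, chi2_sf(stat, len(xs[0]))

# Hypothetical data: one binary regressor, eight observations.
xs = [[0], [0], [0], [0], [1], [1], [1], [1]]
ys = [0, 0, 0, 1, 0, 1, 1, 1]
stat, p = lr_test(xs, ys)
```

For routine work a statistics library would normally replace the hand-written solver and chi-squared tail; they are written out here only to keep the sketch self-contained.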

17.12.2 Stepwise Regression

It could be that only a few potential variables really affect the outcome. If this is suspected to be the case, then stepwise regression can be appropriate.

Starting with the null model, successive models are created, each one using one more regressor than the previous model.

To pick which regressor to use for the next model, each of the unused regressors in turn is tried out by adding it to the current model. The P-value of the trial model as a “full model” vs. the current model as a “reduced model” is found, and the model with the best (smallest) P-value found this way is used. However, if no P-value is better than the “P-value cutoff” that was specified, the stepwise method stops, and declares the current model as the end result. (Of course, the stepwise method will also stop if all possible regressors have been used up.)

To find the significance of each “full model” vs. its corresponding “reduced model”, we calculate a likelihood ratio statistic. Using L0 as the likelihood of the reduced (restricted) model and L1 as the likelihood of the full (unrestricted) model, each maximized by the Newton method described in 17.12.1, the statistic -2ln(L0∕L1) should be chi-squared with one degree of freedom.
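The forward selection loop described in this section can be sketched as follows. This is a minimal sketch under stated assumptions, not ChemTree's code: forward_stepwise and lr_pvalue_1df are hypothetical names, loglik is a hypothetical caller-supplied function that returns the maximized log likelihood of the model built from a given list of regressors, each regressor is assumed to add exactly one coefficient (so one degree of freedom applies), and a P-value exactly equal to the cutoff is treated as not better than the cutoff.

```python
import math

def lr_pvalue_1df(ll_reduced, ll_full):
    # P-value of -2(log L0 - log L1) against chi-squared with 1 degree of
    # freedom; for 1 df the upper tail equals erfc(sqrt(stat / 2)).
    stat = max(0.0, -2.0 * (ll_reduced - ll_full))
    return math.erfc(math.sqrt(stat / 2.0))

def forward_stepwise(candidates, loglik, cutoff):
    # Start from the null model and add one regressor per step.
    selected = []
    remaining = list(candidates)
    ll_current = loglik(selected)
    while remaining:
        # Try each unused regressor as the "full model" against the
        # current model as the "reduced model".
        trials = []
        for name in remaining:
            ll_trial = loglik(selected + [name])
            p = lr_pvalue_1df(ll_current, ll_trial)
            trials.append((p, name, ll_trial))
        best_p, best_name, best_ll = min(trials)
        if best_p >= cutoff:
            break  # no trial beats the P-value cutoff: stop here
        selected.append(best_name)
        remaining.remove(best_name)
        ll_current = best_ll
    return selected

# Toy stand-in for a real model fit: each regressor adds a fixed amount
# to the log likelihood (purely illustrative numbers).
gains = {"a": 5.0, "b": 2.5, "c": 0.1}

def toy_loglik(columns):
    return -20.0 + sum(gains[c] for c in columns)

result = forward_stepwise(["a", "b", "c"], toy_loglik, cutoff=0.05)
```

In practice loglik would refit the logit model for each trial set of regressors; passing it in as a function keeps the selection logic separate from the fitting method.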

The same remarks apply for permutation testing with logistic regression as for permutation testing (17.6) with linear regression.