17.15 Caveats
Under some circumstances, the iteration procedure for the logistic regression is unstable and the regression may fail, even when the matrix has sufficient rank and significant regressors are included. This typically happens when the regression tries to emulate a step function, or more generally when there are ranges of the independent values over which the dependent value is exclusively 1 or exclusively 0. In such cases the likelihood keeps improving as the coefficients grow, so the iteration has no finite solution to converge to.
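The step-function case can be illustrated with a minimal Python sketch. It is not part of Optimus RP: the function names `is_separated` and `fit_slope`, the toy data, and the use of plain gradient ascent on a one-parameter model are all assumptions made for illustration, and the product's own iteration procedure may differ. The sketch checks whether a single covariate splits the outcomes into an all-0 side and an all-1 side, and shows that on such data the fitted slope keeps growing with more iterations instead of settling.

```python
import math

def is_separated(x, y):
    """True if one threshold on x splits y into an all-0 side and an all-1 side.

    Also returns True when only one outcome class is present, which is
    equally degenerate for a logistic regression.
    """
    x0 = [xi for xi, yi in zip(x, y) if yi == 0]
    x1 = [xi for xi, yi in zip(x, y) if yi == 1]
    if not x0 or not x1:
        return True
    return max(x0) < min(x1) or max(x1) < min(x0)

def fit_slope(x, y, iterations, rate=0.5):
    """Gradient ascent on the log-likelihood of the one-parameter model
    P(y = 1) = 1 / (1 + exp(-beta * x)); returns beta after `iterations` steps.
    (Illustrative only; not the iteration used by any particular product.)
    """
    beta = 0.0
    for _ in range(iterations):
        gradient = sum(
            (yi - 1.0 / (1.0 + math.exp(-beta * xi))) * xi
            for xi, yi in zip(x, y)
        )
        beta += rate * gradient
    return beta

# Hypothetical toy data.
step_x = [-2.0, -1.0, 1.0, 2.0]
step_y = [0, 0, 1, 1]    # a step function: outcomes switch at x = 0
mixed_y = [0, 1, 0, 1]   # outcomes overlap, so no threshold separates them

print(is_separated(step_x, step_y))    # True
print(is_separated(step_x, mixed_y))   # False

# On the separated data the slope is still growing after many iterations;
# on the overlapping data it settles to a finite value.
for n in (10, 50, 200):
    print(n, fit_slope(step_x, step_y, n), fit_slope(step_x, mixed_y, n))
```

A check of this kind only covers one covariate at a time; separation by a combination of covariates would not be detected by it.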
When the regression is done stepwise, the same kind of instability may cause “paradoxical” phenomena such as:
- The final regression (used to get the statistics to show) failing, even though it “is the same as” the last model tried in the stepwise regression. (Actually, the regressors in the final model can be in a different order than in the last model tried in the stepwise regression. If the problem is highly unstable, the different order may be enough to cause the failure.)
- For some regressors, you may see Pr(>|t|) = 1. This happens when the regression fails after the regressor in question is removed. (Of course, this is only possible for a regressor other than the one most recently added.)
Techniques for remedying this situation directly are being contemplated, and may be implemented in Optimus RP in the future.
At this time, the best workaround is to filter out the data that causes such instabilities. For instance, consider splitting (doing recursive partitioning) on a covariate, and then running the regression on one or more of the resulting tree node subset spreadsheets, in either of two situations: the covariate has a coefficient above 15 or 20 in magnitude and the regressors selected by a stepwise regression fail when regressed directly, or the covariate fails to regress by itself.
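The workaround can be sketched in Python as follows. This is a rough illustration under stated assumptions, not Optimus RP functionality: the helper names `flag_large_coefficients`, `split_on_threshold` and `has_both_outcomes`, the covariate names, and the sample values are hypothetical. The default limit of 15 is the lower of the two magnitudes mentioned above, and the split threshold is chosen by hand here, whereas recursive partitioning would choose it from the data.

```python
def flag_large_coefficients(coefs, limit=15.0):
    """Return the covariates whose coefficient magnitude exceeds `limit`."""
    return [name for name, value in coefs.items() if abs(value) > limit]

def split_on_threshold(rows, covariate, threshold):
    """Partition rows (dicts) into those below the threshold and the rest."""
    low = [r for r in rows if r[covariate] < threshold]
    high = [r for r in rows if r[covariate] >= threshold]
    return low, high

def has_both_outcomes(rows, outcome):
    """True if the subset contains both 0 and 1 outcomes, so a regression
    on it is not degenerate for lack of one outcome class."""
    return {r[outcome] for r in rows} == {0, 1}

# Hypothetical fitted coefficients from an unstable regression.
coefs = {"age": 0.8, "dose": 23.4, "score": -17.1}
print(flag_large_coefficients(coefs))   # ['dose', 'score']

# Hypothetical data: every row with dose below 3 has outcome 0.
rows = [
    {"dose": 1.0, "y": 0}, {"dose": 2.0, "y": 0},
    {"dose": 3.0, "y": 1}, {"dose": 4.0, "y": 0}, {"dose": 5.0, "y": 1},
]
low, high = split_on_threshold(rows, "dose", 3.0)

# Only the subset with mixed outcomes is a candidate for regression.
print(has_both_outcomes(low, "y"))    # False
print(has_both_outcomes(high, "y"))   # True
```

In this toy case the low-dose subset contains only outcome 0 and is set aside, while the regression would be rerun on the high-dose subset alone.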