Haplotype Trend Regression (HTR) with Binomial Response

Haplotype frequencies are computed as they are for the normal response (26.6). However, instead of using multiple linear regression, we fit the binary response, y, with a logit model. The covariate matrix, x, consists of the per-patient haplotype frequencies along with a vector of all ones, which provides the intercept. We maximize the log likelihood function for the logit model using the Newton method outlined in Econometric Analysis, by W.H. Greene, third ed., Prentice Hall, NJ, 1997, pp. 882-886. To obtain a p-value, we test the hypothesis that all of the slope coefficients in the logit model are zero. We calculate the likelihood ratio statistic $-2\ln(l_0/l_1)$, where $l_0$ is the restricted likelihood (slope coefficients fixed at zero, so only the intercept is fitted) and $l_1$ is the unrestricted likelihood (the full model). Under the null hypothesis, this statistic should be chi-squared with n-1 degrees of freedom.

We simplify the notation by using capital L to mean "log likelihood": that is,

$$
L_0 = \log(l_0)
$$

and

$$
L_1 = \log(l_1),
$$

where base e logarithms are used.

Using this notation, the restricted log likelihood is

$$
L_0 = n\left[s\log(s) + (1-s)\log(1-s)\right],
$$

where s is the proportion of the n dependent observations ($y_1, \ldots, y_n$) that are equal to one, and the unrestricted log likelihood is

$$
L_1 = \sum_{i=1}^{n}\left[\, y_i \log\left(\frac{1}{1+e^{-\beta^T x_i}}\right) + (1-y_i)\log\left(1-\frac{1}{1+e^{-\beta^T x_i}}\right)\right],
$$

and

$$
p = \operatorname{chisqr}\bigl(-2(L_0 - L_1),\; n-1\bigr).
$$

Here, β is a vector consisting of the slope coefficients followed by the intercept coefficient.
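
The calculation described in this section can be sketched in a few lines of standard-library Python. This is only an illustrative sketch, not the product's implementation: the function names (`fit_logit`, `lr_test`, `chi2_sf`), the toy data, and the choice to pass the degrees of freedom in explicitly (here, the number of slope coefficients) are all assumptions made for the example. Each row of the covariate matrix holds the slope covariates followed by a trailing 1 for the intercept, matching the ordering of β above.

```python
import math

def sigmoid(z):
    """Logistic function 1 / (1 + e^(-z)), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

def probs(beta, rows):
    """Fitted probabilities 1 / (1 + exp(-beta^T x_i)) for every row x_i."""
    return [sigmoid(sum(b * v for b, v in zip(beta, r))) for r in rows]

def fit_logit(rows, y, max_iter=25, tol=1e-10):
    """Newton's method: beta <- beta + (X^T W X)^(-1) X^T (y - p), W = diag(p(1-p))."""
    k = len(rows[0])
    beta = [0.0] * k
    for _ in range(max_iter):
        p = probs(beta, rows)
        grad = [sum((yi - pi) * r[j] for yi, pi, r in zip(y, p, rows)) for j in range(k)]
        info = [[sum(pi * (1 - pi) * r[j] * r[l] for pi, r in zip(p, rows))
                 for l in range(k)] for j in range(k)]
        step = solve(info, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return beta

def chi2_sf(x, df):
    """Upper-tail chi-squared probability via the incomplete-gamma power series."""
    if x <= 0:
        return 1.0
    a, h = df / 2.0, x / 2.0
    term = total = 1.0 / a
    n = 0
    while abs(term) > 1e-15 * abs(total):
        n += 1
        term *= h / (a + n)
        total += term
    lower = total * math.exp(-h + a * math.log(h) - math.lgamma(a))
    return max(0.0, 1.0 - lower)

def lr_test(rows, y, df):
    """Return (beta, L0, L1, statistic, p-value); requires 0 < mean(y) < 1."""
    n = len(y)
    s = sum(y) / n  # proportion of responses equal to one
    L0 = n * (s * math.log(s) + (1 - s) * math.log(1 - s))  # intercept-only model
    beta = fit_logit(rows, y)
    L1 = sum(yi * math.log(pi) + (1 - yi) * math.log(1 - pi)
             for yi, pi in zip(y, probs(beta, rows)))        # full model
    stat = -2.0 * (L0 - L1)
    return beta, L0, L1, stat, chi2_sf(stat, df)

# Toy data (invented for illustration): one covariate plus the column of ones.
xs = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
y = [0, 0, 1, 0, 1, 0, 1, 1]
rows = [[x, 1.0] for x in xs]
beta, L0, L1, stat, p_value = lr_test(rows, y, df=len(rows[0]) - 1)
print(beta, L0, L1, stat, p_value)
```

The sketch does not guard against perfectly separated data or a singular information matrix, either of which would make the Newton step fail. A production fit would need to detect both cases.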