Haplotype Trend Regression
›› See also Haplotype Association Analysis
›› See also Haplotype Frequency Estimation
A haplotype is a set of closely linked genetic markers on a single chromosome that tend to be inherited together because recombination rarely separates them. Haplotype analysis can capture a larger part of the genetic variation in a candidate gene than single markers can, is often more powerful than single-marker association tests, and can capture interactions between multiple functional SNPs.
The motivation for haplotype trend regression is to provide a unified approach for testing the association of haplotype frequencies with discrete and continuous phenotypes. Haplotype trend regression, developed by Zaykin, Westfall et al. (2001), fits a model of additive haplotype effects. Under the hypothesis of no haplotype effects and the assumption of Hardy-Weinberg equilibrium, it is closely related to the previously suggested likelihood ratio test (LRT) for a binary phenotype, described in Xie and Ott (1993), Fallin and Schork (2000), and Zhao et al. (2000).
The two tests differ in the situation of interest, when haplotypes actually affect the phenotype and the assumption of Hardy-Weinberg equilibrium within a particular response category no longer holds (Nielsen, 1998). In this situation haplotype trend regression provides a more powerful test (Zaykin, Westfall et al., 2001). This is because the LRT operates on haplotype counts that add up to twice the sample size, which implicitly assumes Hardy-Weinberg equilibrium.
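The additive model referred to above can be written out explicitly. The notation below is illustrative and not taken from the original papers: $y_i$ is the response of individual $i$, $H$ is the number of distinct haplotypes, and $x_{ih}$ is the estimated probability (expected proportion) of haplotype $h$ in individual $i$, so that each individual's values sum to one. One common way to set this up is:

```latex
\begin{align}
  y_i &= \beta_0 + \sum_{h=1}^{H-1} \beta_h \, x_{ih} + \varepsilon_i,
      \qquad \sum_{h=1}^{H} x_{ih} = 1, \\
  H_0 &: \; \beta_1 = \beta_2 = \dots = \beta_{H-1} = 0 .
\end{align}
```

Because the $x_{ih}$ sum to one for every individual, one haplotype (here the $H$-th) is left out as the reference category. When phase is known, $x_{ih}$ is 1 for a homozygote, 1/2 for each of the two haplotypes of a heterozygote, and 0 otherwise.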

Figure 1. Haplotype Trend Regression Spreadsheet.
Haplotype trend regression makes it possible to associate disease or drug response with the haplotype frequencies of individuals. It takes a series of marker genotypes, computes haplotype probabilities for each observation using the composite haplotype method (CHM), and fits a linear regression of the response on those probabilities, which serve as the regression matrix, as seen in Figure 1. Instead of the CHM, one can also use the Expectation Maximization (EM) algorithm to estimate the haplotypes.
In Figure 2, instead of partitioning the data into subgroups, a node containing the regression residuals is dropped underneath the parent node. Note in Figures 2 and 3 that the standard deviation drops from 0.53 to 0.24 once the third-order haplotype of markers 27-31 has been factored out. Regression and partitioning can be mixed and matched in the same model.
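The regression step described above can be sketched in a few lines of Python with NumPy. This is a minimal illustration, not the product's implementation: it assumes the per-individual haplotype probabilities have already been estimated (by CHM or EM, neither of which is shown), the function name `haplotype_trend_regression` and the simulated data are hypothetical, and the overall F statistic is used here as one plausible way to test the null hypothesis of no haplotype effects.

```python
import numpy as np


def haplotype_trend_regression(hap_probs, phenotype):
    """Regress a phenotype on per-individual haplotype probabilities.

    hap_probs: (n_individuals, n_haplotypes) array whose rows sum to 1.
    phenotype: (n_individuals,) array, continuous or coded 0/1.
    Returns the overall F statistic, its degrees of freedom, and the residuals.
    """
    X = np.asarray(hap_probs, dtype=float)
    y = np.asarray(phenotype, dtype=float)
    n = X.shape[0]
    # Rows sum to 1, so the last haplotype is dropped as the reference
    # category to avoid collinearity with the intercept column.
    design = np.column_stack([np.ones(n), X[:, :-1]])
    beta, _, rank, _ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ beta
    rss = float(residuals @ residuals)          # residual sum of squares
    tss = float(((y - y.mean()) ** 2).sum())    # total sum of squares
    df_model = int(rank) - 1
    df_resid = n - int(rank)
    f_stat = ((tss - rss) / df_model) / (rss / df_resid)
    return f_stat, (df_model, df_resid), residuals


# Simulated example (hypothetical data): 3 haplotypes, phase assumed known,
# so each row holds 1 for a homozygote or 0.5 twice for a heterozygote.
rng = np.random.default_rng(0)
n = 200
draws = rng.choice(3, size=(n, 2), p=[0.5, 0.3, 0.2])
hap_probs = (np.eye(3)[draws[:, 0]] + np.eye(3)[draws[:, 1]]) / 2
phenotype = 1.0 * hap_probs[:, 0] + rng.normal(scale=0.5, size=n)

f_stat, df, residuals = haplotype_trend_regression(hap_probs, phenotype)
print(f"F = {f_stat:.2f} on {df[0]} and {df[1]} degrees of freedom")
```

With phase-ambiguous genotypes, the rows of `hap_probs` would hold fractional probabilities instead of 0, 0.5, and 1; the regression itself is unchanged.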

Figure 2. Haplotype Trend Regression residual node.

Figure 3. Histogram plot.
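The drop in standard deviation reported for the residual node can be reproduced in principle on simulated data. The sketch below is an assumption-laden illustration: the data, effect size, and variable names are hypothetical, and the numbers it prints will not match the 0.53 and 0.24 shown in the figures. It only shows that the residuals left after factoring out a haplotype effect have a smaller spread than the original response, and that these residuals are what a child node would carry forward.

```python
import numpy as np

# Hypothetical data: 3 haplotypes with known phase, one of which
# shifts the response.
rng = np.random.default_rng(1)
n = 300
draws = rng.choice(3, size=(n, 2), p=[0.5, 0.3, 0.2])
hap_probs = (np.eye(3)[draws[:, 0]] + np.eye(3)[draws[:, 1]]) / 2
response = 2.0 * hap_probs[:, 0] + rng.normal(scale=0.25, size=n)

# Fit the haplotype regression (last haplotype is the reference category).
design = np.column_stack([np.ones(n), hap_probs[:, :-1]])
beta, *_ = np.linalg.lstsq(design, response, rcond=None)

# The residuals become the response stored in the node below the parent.
residuals = response - design @ beta

sd_before = response.std(ddof=1)
sd_after = residuals.std(ddof=1)
print(f"SD of response:  {sd_before:.2f}")
print(f"SD of residuals: {sd_after:.2f}")
```

Further partitioning or another regression could then be applied to `residuals`, which is the sense in which the two approaches can be combined in one model.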





