Methods for Obtaining General Marker Statistics

The following subsections explain in more detail some of the methods used to compute General Marker Statistics, which can be run from a separate window (Section 6.6.2) or as a tab in the Genetic Association Test module (Chapter 18).

26.24.1 Fisher’s Exact Test for HWE P-Value

This test cycles through every possible set of genotype counts consistent with the observed allele totals. It then sums the probabilities of all sets that are as extreme as or more extreme than the observed set, that is, all sets that are equally probable or less probable. The sum is the HWE p-value.

See [Emigh 1980].
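The enumeration described above can be sketched in Python as follows. This is a minimal illustration, not the software's own implementation: it assumes the standard conditional distribution of the heterozygote count given the allele totals, and the function name `hwe_exact_p` and its arguments are hypothetical.

```python
from math import exp, lgamma, log


def _log_factorial(k):
    """Natural log of k!, computed via the log-gamma function."""
    return lgamma(k + 1)


def hwe_exact_p(n_DD, n_Dd, n_dd):
    """Exact HWE p-value from genotype counts (illustrative sketch).

    Assumes the usual conditional probability of observing `het`
    heterozygotes given the total genotype count and the allele totals.
    """
    n = n_DD + n_Dd + n_dd
    if n == 0:
        return 1.0  # assumption: no data means no evidence against HWE
    allele_D = 2 * n_DD + n_Dd  # number of D allele copies
    allele_d = 2 * n_dd + n_Dd  # number of d allele copies

    def log_prob(het):
        hom_D = (allele_D - het) // 2
        hom_d = (allele_d - het) // 2
        return (_log_factorial(n)
                - _log_factorial(hom_D)
                - _log_factorial(het)
                - _log_factorial(hom_d)
                + het * log(2.0)
                + _log_factorial(allele_D)
                + _log_factorial(allele_d)
                - _log_factorial(2 * n))

    # The heterozygote count must share the parity of the allele totals
    # and cannot exceed the rarer allele's total.
    het_values = range(allele_D % 2, min(allele_D, allele_d) + 1, 2)
    probs = {het: exp(log_prob(het)) for het in het_values}

    p_observed = probs[n_Dd]
    # Sum every configuration that is equally or less probable than the
    # observed one; the small tolerance guards against rounding error.
    total = sum(q for q in probs.values() if q <= p_observed * (1 + 1e-7))
    return min(1.0, total)
```

For example, `hwe_exact_p(1, 0, 1)` considers only two configurations (zero or two heterozygotes) and returns the probability of the less likely one, which is the observed one.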

26.24.2 Signed HWE R

We define this as

$$R_{\text{signed}} = \frac{n\,n_{DD} - n_D^2}{n\,n_D - n_D^2},$$

where

$$n_D = n_{DD} + \frac{n_{Dd}}{2},$$

$n$ is the total genotype count and $n_{DD}$ and $n_{Dd}$ are the counts for genotypes DD and Dd, respectively.

This is derived from the formula for (signed) correlation between two sets of observations, $x_i$ and $y_i$,

$$r = \frac{n\sum x_i y_i - \sum x_i \sum y_i}{\sqrt{n\sum x_i^2 - \left(\sum x_i\right)^2}\,\sqrt{n\sum y_i^2 - \left(\sum y_i\right)^2}},$$

where we take $x_i$ to be 0 if the first allele is d and 1 if the first allele is D, and $y_i$ to be 0 if the second allele is d and 1 if the second allele is D.

Because of phase ambiguity, we set each of the counts of (d,D) and (D,d) to be one-half of the (phase-ambiguous) observed count of Dd. The correlation then simplifies to the formula first given above.
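The simplification can be worked through as follows. This is a sketch of the intermediate steps, which rely only on the definitions above and on the fact that a 0/1 indicator equals its own square. With the Dd count split evenly between (D,d) and (d,D), the sums are

$$\sum x_i = \sum y_i = \sum x_i^2 = \sum y_i^2 = n_{DD} + \frac{n_{Dd}}{2} = n_D, \qquad \sum x_i y_i = n_{DD},$$

so that

$$r = \frac{n\,n_{DD} - n_D^2}{\sqrt{n\,n_D - n_D^2}\,\sqrt{n\,n_D - n_D^2}} = \frac{n\,n_{DD} - n_D^2}{n\,n_D - n_D^2} = R_{\text{signed}}.$$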

If there is a high homozygous count, $x_i$ and $y_i$ will often both be 1 or both be 0 for the same observation, so the $x_i$ and the $y_i$ will be positively correlated. Similarly, if there is a high heterozygous count, $x_i$ and $y_i$ will often take opposite values (one is 1 while the other is 0), so they will be anti-correlated and the signed R will be negative.
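The sign behavior can be checked numerically with a short Python sketch. The function names are hypothetical, and returning NaN for a monomorphic or empty marker is an assumption, since the text does not say how that case is handled. The second function evaluates the general correlation formula directly on the four allele-pair cells, splitting the Dd count evenly between (D,d) and (d,D).

```python
from math import isnan, sqrt


def signed_hwe_r(n_DD, n_Dd, n_dd):
    """Signed HWE R from genotype counts, using the simplified formula."""
    n = n_DD + n_Dd + n_dd
    n_D = n_DD + n_Dd / 2
    denominator = n * n_D - n_D ** 2
    if denominator == 0:
        # Assumption: undefined when only one allele is present (or n == 0).
        return float("nan")
    return (n * n_DD - n_D ** 2) / denominator


def correlation_from_cells(n_DD, n_Dd, n_dd):
    """General correlation formula applied to weighted (x, y) cells."""
    cells = [
        (1, 1, n_DD),      # first allele D, second allele D
        (1, 0, n_Dd / 2),  # (D, d): half of the phase-ambiguous Dd count
        (0, 1, n_Dd / 2),  # (d, D): the other half
        (0, 0, n_dd),      # first allele d, second allele d
    ]
    n = sum(w for _, _, w in cells)
    sum_x = sum(w * x for x, _, w in cells)
    sum_y = sum(w * y for _, y, w in cells)
    sum_xy = sum(w * x * y for x, y, w in cells)
    sum_xx = sum(w * x * x for x, _, w in cells)
    sum_yy = sum(w * y * y for _, y, w in cells)
    return (n * sum_xy - sum_x * sum_y) / (
        sqrt(n * sum_xx - sum_x ** 2) * sqrt(n * sum_yy - sum_y ** 2)
    )


if __name__ == "__main__":
    print(signed_hwe_r(30, 20, 50))  # homozygote excess: positive
    print(signed_hwe_r(25, 50, 25))  # exact HWE proportions: zero
    print(signed_hwe_r(10, 80, 10))  # heterozygote excess: negative
```

Under these assumptions the two functions agree for any polymorphic marker, which is a quick way to confirm that the simplified formula matches the general correlation.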