7.9 Correcting for Stratification/Batch Effects by Genomic Control

This somewhat older method, pioneered by Devlin and Roeder [Devlin and Roeder 1999], is based on the observation that when association tests are confounded by stratification, the distribution of their chi-square statistics is more “spread out” than it should be. The result is a higher median than that of a true chi-square distribution. Several models exist for how much the distribution should be spread out, depending on the test type, but the distribution will usually be spread out uniformly by a certain “inflation factor” λ.

Genomic Control estimates this “inflation factor” λ by taking the median of the chi-square statistics obtained from an actual test over a set of markers from the study in question, and dividing this median by the median of the corresponding (ideal) chi-square distribution. If the result is less than one, the distribution is considered close enough to ideal and λ is taken to be one.
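The estimate described above can be sketched in a few lines of Python. This is only an illustration, not the SVS implementation: it assumes the test statistics follow a chi-square distribution with one degree of freedom (other test types use a different ideal median), and the function name `inflation_factor` and the sample values are hypothetical.

```python
from statistics import NormalDist, median

# Median of the ideal chi-square distribution, ASSUMING 1 degree of freedom.
# A 1-df chi-square variable is a squared standard normal, so its median is
# the square of the 0.75 quantile of the standard normal (about 0.455).
CHI2_1DF_MEDIAN = NormalDist().inv_cdf(0.75) ** 2


def inflation_factor(chi_squares):
    """Estimate lambda as (observed median) / (ideal median), floored at one."""
    lam = median(chi_squares) / CHI2_1DF_MEDIAN
    # A ratio below one is treated as "close enough to ideal".
    return max(lam, 1.0)


# Hypothetical chi-square statistics, for illustration only.
observed = [0.1, 0.3, 0.6, 0.9, 1.4, 2.2, 5.0]
lam = inflation_factor(observed)
```

Here the observed median (0.9) is well above the ideal median, so the estimated λ is greater than one.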

Genomic Control then applies its correction by dividing each chi-square statistic from the actual association test by this λ. When λ is greater than one, this makes the results appropriately more conservative.

Two approaches exist for this:

  • Measure the “inflation factor” λ over a set of markers designed to indicate population stratification. Then, use this λ on the actual association test (presumably done for just a few candidate markers). (For studies over a small number of markers.)
  • Measure the “inflation factor” λ over the actual association tests being done. Then afterward, use this λ on all chi-square results so obtained. (For whole-genome scans or a large number of markers.)
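The two approaches differ only in which statistics are used to estimate λ. The following sketch shows both side by side. It assumes one-degree-of-freedom chi-square tests, and the helper names (`estimate_lambda`, `correct`) and all data values are hypothetical; it is not how SVS computes these values internally.

```python
from statistics import NormalDist, median

# Ideal median, ASSUMING chi-square tests with 1 degree of freedom.
CHI2_1DF_MEDIAN = NormalDist().inv_cdf(0.75) ** 2


def estimate_lambda(chi_squares):
    """Observed median over ideal median, never less than one."""
    return max(median(chi_squares) / CHI2_1DF_MEDIAN, 1.0)


def correct(chi_squares, lam):
    """Apply Genomic Control by dividing each statistic by lambda."""
    return [x / lam for x in chi_squares]


# Approach 1: estimate lambda on separate stratification markers,
# then apply it to a few candidate markers (hypothetical values).
stratification_marker_stats = [0.2, 0.5, 0.7, 1.1, 3.0]
candidate_stats = [6.4, 9.8]
lam_1 = estimate_lambda(stratification_marker_stats)
candidates_corrected = correct(candidate_stats, lam_1)

# Approach 2: estimate lambda on the association tests themselves,
# then apply it to all of their results (hypothetical values).
scan_stats = [0.1, 0.4, 0.9, 1.3, 2.5, 12.0]
lam_2 = estimate_lambda(scan_stats)
scan_corrected = correct(scan_stats, lam_2)
```

In practice the second approach relies on having many markers, so that the median is dominated by markers with no true association.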

Both of these approaches are available in SVS, through the Genotype Association Tests dialog in the Analysis menu.

From the Genotype Association Tests dialog, select Show Inflation Factor (Lambda), Chi-Squares, and Corrected Values to find inflation factors (λ) and the results of applying the Genomic Control technique on chi-squares, p-values, Bonferroni-adjusted p-values, and False Discovery Rates.

From the Genotype Association Tests dialog, select Correct Using This Inflation Factor (Lambda) Instead: and enter the λ value to use. This option is for an “inflation factor” that was determined from a previous association test run or from other previously analyzed data.

NOTE:

  • The inflation factor relates to the chi-square statistic. After a chi-square statistic has been corrected through Genomic Control, the normal procedure for finding the approximate p-value is still followed. If there was “inflation” (λ greater than one), the Genomic Control-corrected p-value will be pushed closer to one.
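As a rough illustration of this note, the sketch below corrects a chi-square statistic and then computes its p-value in the usual way. It assumes a chi-square test with one degree of freedom, for which the upper-tail probability can be written with the complementary error function; the function names and input values are hypothetical.

```python
import math


def chi2_1df_p_value(chi_square):
    """Upper-tail p-value of a chi-square statistic, ASSUMING 1 degree of freedom."""
    # For X ~ chi-square(1), P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi_square / 2.0))


def corrected_p_value(chi_square, lam):
    """Divide the statistic by lambda first, then find the p-value as usual."""
    return chi2_1df_p_value(chi_square / lam)


# Hypothetical statistic and inflation factor, for illustration only.
p_uncorrected = chi2_1df_p_value(10.0)
p_corrected = corrected_p_value(10.0, 1.5)
```

With λ greater than one, `p_corrected` is larger than `p_uncorrected`; with λ equal to one, the two are identical.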