7.9 Correcting for Stratification/Batch Effects by Genomic Control
This somewhat older method, pioneered by Devlin and Roeder [Devlin and Roeder 1999], notes that when association tests
are confounded by stratification, the distribution of their chi-square statistics will be more “spread out” than it should
be. The result is a higher median than the median of a true chi-square distribution. Several models exist for how much the
distribution should be spread out, depending on the test type, but the distribution will usually be scaled uniformly by a
certain “inflation factor” λ.
Genomic Control measures the “inflation factor” λ by taking the median of the chi-square statistics obtained from an
actual test done over a set of markers from the study in question, and dividing this median by the median of the
corresponding (ideal) chi-square distribution. If the result is less than one, the distribution is considered close enough
to ideal and λ is taken to be one.
Genomic Control then applies its correction by dividing each chi-square statistic from the actual association test by this λ.
When λ is greater than one, this makes the results appropriately more pessimistic.
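The measurement and correction steps can be sketched in a few lines of Python. This is an illustrative sketch only, not the SVS implementation: it assumes a test with one degree of freedom (for which the median of the ideal chi-square distribution is approximately 0.4549), and the function names and sample values are hypothetical.

```python
from statistics import median

# Median of the ideal chi-square distribution with 1 degree of freedom.
# Assumption: the association test has 1 df; other test types need a
# different reference median.
CHI2_MEDIAN_1DF = 0.4549364

def inflation_factor(chi_squares):
    """Estimate lambda as (observed median) / (ideal median), floored at 1."""
    lam = median(chi_squares) / CHI2_MEDIAN_1DF
    return max(lam, 1.0)

def genomic_control(chi_squares, lam):
    """Divide every chi-square statistic by the inflation factor."""
    return [x / lam for x in chi_squares]

# Made-up chi-square statistics, for illustration only.
observed = [0.2, 0.5, 0.91, 1.4, 6.3]
lam = inflation_factor(observed)
corrected = genomic_control(observed, lam)
```

After the correction, the median of `corrected` equals the ideal median whenever the estimated λ is greater than one; when the raw ratio falls below one, λ is set to one and the statistics are left unchanged.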
Two approaches exist for this:
- Measure the “inflation factor” λ over a set of markers designed to indicate population stratification. Then, use this λ on the actual association test (presumably done for just a few candidate markers). (For studies over a small number of markers.)
- Measure the “inflation factor” λ over the actual association tests being done. Then afterward, use this λ on all chi-square results so obtained. (For whole-genome scans or a large number of markers.)
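The difference between the two approaches is only in which set of statistics supplies λ. The following sketch contrasts them under the same assumptions as before (one degree of freedom, hypothetical names, made-up values); it is not how SVS computes its results.

```python
from statistics import median

CHI2_MEDIAN_1DF = 0.4549364  # ideal median, assuming 1-df tests

def inflation_factor(chi_squares):
    """Observed median over ideal median, floored at 1."""
    return max(median(chi_squares) / CHI2_MEDIAN_1DF, 1.0)

# Approach 1: estimate lambda from markers chosen to indicate
# stratification, then apply it to a few candidate markers.
stratification_marker_stats = [0.3, 0.6, 0.7, 1.1, 2.4]  # made-up values
candidate_stats = [9.8, 4.1]                             # made-up values
lam_markers = inflation_factor(stratification_marker_stats)
candidates_corrected = [x / lam_markers for x in candidate_stats]

# Approach 2: estimate lambda from the scan itself, then apply it
# to every chi-square result from that same scan.
scan_stats = [0.1, 0.4, 0.8, 1.3, 2.0, 15.2, 0.9]        # made-up values
lam_scan = inflation_factor(scan_stats)
scan_corrected = [x / lam_scan for x in scan_stats]
```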
Both of these approaches are available in SVS through the Genotype Association Tests dialog
(Analysis > Genotype Association Tests).
From the Genotype Association Tests dialog, select Show Inflation Factor (Lambda), Chi-Squares, and Corrected
Values to find inflation factors (λ) and the results of applying the Genomic Control technique on chi-squares, p-values,
Bonferroni-adjusted p-values, and False Discovery Rates.
From the Genotype Association Tests dialog, select Correct Using This Inflation Factor (Lambda) Instead: and
enter a λ value to use as the “inflation factor”. This value may have been determined from a previous association test
run or from other data previously analyzed.
NOTE:
- The inflation factor relates to the chi-square statistic. After a chi-square statistic has been corrected through Genomic Control, the normal procedure for finding the approximate p-value is still followed. If there had been “inflation”, the Genomic Control-corrected p-value will be pushed closer to one.
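To illustrate the note, the sketch below computes the p-value before and after correction. It assumes a chi-square statistic with one degree of freedom, for which the upper-tail probability can be written with the complementary error function; the function name and the numeric values are hypothetical.

```python
import math

def chi2_p_value_1df(chi_square):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi_square / 2.0))

chi_square = 10.0  # made-up uncorrected statistic
lam = 1.25         # made-up inflation factor

p_raw = chi2_p_value_1df(chi_square)
# The usual p-value procedure is applied to the corrected statistic.
p_gc = chi2_p_value_1df(chi_square / lam)
```

Because dividing by a λ greater than one shrinks the statistic, `p_gc` is larger than `p_raw`, that is, closer to one.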