PBAT Methodology
The cornerstone of PBAT is the unified approach to the FBAT statistic, which itself is a generalization of the Transmission Disequilibrium Test (TDT) method, in which alleles transmitted to affected offspring are compared with the expected distribution of alleles among offspring. The FBAT statistic is based on a linear combination of offspring genotypes and traits:
$$S = \sum_{i}\sum_{j} T_{ij} X_{ij}, \qquad Z = \frac{S - E[S]}{\sqrt{V}},$$
where $V = \mathrm{Var}(S)$ and $T_{ij}$ represents the coded phenotype (i.e., the phenotype adjusted for any covariates) of the j-th offspring in family i. $X_{ij}$ denotes the offspring’s coded genotype at the locus being tested, and depends on the genetic model under consideration.
The expected distribution is derived using Mendel’s law of segregation and, in PBAT, by conditioning on the sufficient statistics for any nuisance parameters under the null hypothesis. The null hypothesis is either “no linkage and no association” or “no association, in the presence of linkage”.
PBAT generalizes this test to cover different genetic models, tests of different sampling designs, tests involving different disease phenotypes, tests with missing parents and tests of different null hypotheses, all in the same framework.
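As a rough illustration of the basic statistic described above, the following Python sketch computes an FBAT-style Z score for a single biallelic marker. It is not PBAT's implementation. It assumes both parents are genotyped, an additive genetic model (genotypes coded as allele counts 0, 1 or 2), and the null hypothesis of “no linkage and no association”, so that offspring genotypes are independent given the parents. The function names `mendelian_moments` and `fbat_z` are hypothetical.

```python
import math


def mendelian_moments(father, mother):
    """Mean and variance of an offspring's allele count given parental counts.

    Under Mendel's law of segregation a parent carrying g copies (0, 1 or 2)
    transmits the allele with probability g / 2.
    """
    p_f, p_m = father / 2.0, mother / 2.0
    mean = p_f + p_m
    var = p_f * (1.0 - p_f) + p_m * (1.0 - p_m)
    return mean, var


def fbat_z(families):
    """FBAT-style Z score (assumed setting: additive model, complete trios).

    families: list of (father, mother, offspring), where offspring is a list
    of (x, t) pairs: coded genotype x and coded phenotype t.
    Returns None when no family is informative (zero variance).
    """
    u = 0.0  # accumulates S - E[S]
    v = 0.0  # accumulates Var(S) under the assumed null hypothesis
    for father, mother, offspring in families:
        mean, var = mendelian_moments(father, mother)
        for x, t in offspring:
            u += t * (x - mean)
            v += t * t * var
    if v == 0.0:
        return None
    return u / math.sqrt(v)


# Hypothetical data: four trios, each with one affected offspring (t = 1).
example = [
    (1, 1, [(2, 1.0)]),
    (1, 0, [(1, 1.0)]),
    (1, 1, [(1, 1.0)]),
    (1, 2, [(2, 1.0)]),
]
z = fbat_z(example)
```

With every coded phenotype set to 1 for affected offspring, this reduces to a TDT-like comparison of transmitted alleles with their Mendelian expectation. Families with two homozygous parents contribute nothing to either sum, which is why such families are uninformative.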
The key concept of PBAT’s screening techniques is the conditional mean model approach, for which the data space is considered to be partitioned into two independent testing sets. This approach may be described as follows:
- First, identify which combinations of markers and phenotypes (with the phenotypes taken as a group) have the highest power. At this stage the phenotypes are tested not against the actual offspring genotypes but against the genotypes predicted from the parents’ genotypes or, if those are missing, from the sufficient statistic of the marker distribution.
- Second, perform the FBAT test for the selected combinations of phenotypes and markers on the actual genotypes of the patients, both as a group and individually.
This allows one to control the type I error rate and to overcome one of the most important statistical hurdles when analyzing genome-wide association studies with thousands of markers: the multiple comparison problem. The screening methods are only minimally affected by non-causal SNPs. In addition, they are robust against the effects of population stratification and admixture, since the final decision is based on FBAT statistics, which guard against these confounding factors. Finally, the screening tools are most successful in detecting common disease susceptibility loci.
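The two-step screening idea can be sketched in Python as follows. This is a simplified illustration, not PBAT's algorithm: the screening score used here (the absolute correlation between the coded phenotype and the genotype expected from the parents) is an assumed stand-in for PBAT's power estimate, the data are restricted to complete trios under an additive model, and the names `screening_score`, `fbat_z` and `screen_then_test` are hypothetical. Because step one uses only the parental genotypes and the traits, it does not look at the transmissions that the FBAT test in step two relies on.

```python
import math


def screening_score(trios):
    """Step 1 score: |correlation| of trait with E[X | parents].

    trios: list of (father, mother, x, t). The offspring genotype x is
    deliberately ignored here.
    """
    e = [(f + m) / 2.0 for f, m, _, _ in trios]
    t = [row[3] for row in trios]
    n = len(trios)
    mean_e, mean_t = sum(e) / n, sum(t) / n
    s_ee = sum((a - mean_e) ** 2 for a in e)
    s_tt = sum((b - mean_t) ** 2 for b in t)
    s_et = sum((a - mean_e) * (b - mean_t) for a, b in zip(e, t))
    if s_ee == 0.0 or s_tt == 0.0:
        return 0.0
    return abs(s_et) / math.sqrt(s_ee * s_tt)


def fbat_z(trios):
    """Step 2 statistic: FBAT-style Z on the actual offspring genotypes."""
    u = v = 0.0
    for f, m, x, t in trios:
        p_f, p_m = f / 2.0, m / 2.0
        u += t * (x - (p_f + p_m))
        v += t * t * (p_f * (1.0 - p_f) + p_m * (1.0 - p_m))
    return u / math.sqrt(v) if v > 0.0 else None


def screen_then_test(markers, top_k, alpha=0.05):
    """Rank markers by screening score, then test only the top_k of them.

    markers: dict mapping marker name to a list of trios.
    Returns {name: (z, p_value, significant)} with a Bonferroni
    correction over the top_k tested markers only.
    """
    ranked = sorted(markers, key=lambda name: screening_score(markers[name]),
                    reverse=True)
    results = {}
    for name in ranked[:top_k]:
        z = fbat_z(markers[name])
        if z is None:
            continue  # no informative family for this marker
        p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
        results[name] = (z, p, p < alpha / top_k)
    return results


# Hypothetical data for two markers.
markers = {
    "m1": [(1, 1, 2, 1.0), (1, 1, 2, 0.8), (0, 0, 0, -1.0),
           (0, 1, 0, -0.5), (2, 1, 2, 1.2)],
    "m2": [(1, 1, 1, 1.0), (1, 1, 1, -1.0), (0, 0, 0, 0.5),
           (2, 2, 2, -0.5)],
}
results = screen_then_test(markers, top_k=1)
```

The design point is that the multiple-testing correction is applied over the small number of markers carried forward from step one, rather than over every marker in the study.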
PBAT also supports the advance planning of family-based association studies by providing power calculations for virtually any given study design or ascertainment condition. (“Ascertainment conditions” here means which phenotype(s) the lab technician considers important, which in turn influences for which patients data are obtained.)
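To give a feel for what a power estimate means in this setting, the sketch below uses plain Monte Carlo simulation. This is a swapped-in technique chosen for brevity and should not be read as the method PBAT itself uses. It assumes unascertained complete trios, Hardy-Weinberg proportions in the parents, an additive effect on a quantitative trait with standard normal noise, and the sample mean as the phenotype offset. All function and parameter names are hypothetical.

```python
import math
import random
from statistics import NormalDist


def simulate_trio(rng, allele_freq, effect):
    """Draw one trio: parental allele counts, offspring count, trait."""
    father = sum(rng.random() < allele_freq for _ in range(2))
    mother = sum(rng.random() < allele_freq for _ in range(2))
    # Each parent transmits the allele with probability count / 2.
    x = (rng.random() < father / 2.0) + (rng.random() < mother / 2.0)
    trait = effect * x + rng.gauss(0.0, 1.0)
    return father, mother, x, trait


def fbat_z(trios):
    """FBAT-style Z with the trait centered at its sample mean."""
    mean_t = sum(t for _, _, _, t in trios) / len(trios)
    u = v = 0.0
    for f, m, x, t in trios:
        p_f, p_m = f / 2.0, m / 2.0
        tc = t - mean_t
        u += tc * (x - (p_f + p_m))
        v += tc * tc * (p_f * (1.0 - p_f) + p_m * (1.0 - p_m))
    return u / math.sqrt(v) if v > 0.0 else 0.0


def estimate_power(n_trios, allele_freq, effect, alpha=0.05,
                   n_rep=200, seed=1):
    """Fraction of simulated studies in which |Z| exceeds the critical value."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    hits = 0
    for _ in range(n_rep):
        trios = [simulate_trio(rng, allele_freq, effect)
                 for _ in range(n_trios)]
        if abs(fbat_z(trios)) > z_crit:
            hits += 1
    return hits / n_rep


# Hypothetical design: 200 trios, allele frequency 0.3.
power_alt = estimate_power(200, 0.3, effect=0.5)
power_null = estimate_power(200, 0.3, effect=0.0)
```

Under the null (`effect=0.0`) the estimated rejection rate should sit near the nominal level `alpha`, which gives a quick sanity check on the simulation before comparing candidate designs.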
A new feature of PBAT is support for testing for copy-number variation (CNV) in a family-based setting. All of the test approaches that are used on genotypic data (coded genotypes) may also be performed on copy-number intensity data.