Alternative Rapid Extended Pedigree Algorithm
As of SVS version 6.3, Golden Helix PBAT includes an option when doing family-based SNP analysis to use a new Alternative Rapid Extended Pedigree Algorithm (available under the Phenotype and Haplotype Parameters tab). The greatest benefit of this new algorithm is that it's significantly faster ...
Why are there negative signed p-values in the PBAT output?
In Golden Helix SVS 6.3 there is an option under the PBAT Test Statistic and Computational tab to output "signed" p-values for SNP, haplotype and copy number analysis (Figure 1). Choosing this option will provide positive and negative p-values for the pvalue(FBAT) column in the output table. NOT ...
Interpreting the Correlation/Interaction view
From the HelixTree Manual: The Correlation Interaction View The figure above shows a matrix of the selected variables in the order they were sorted in the Multitree Model window. The numbers appearing in the black diagonal blocks represent the average proportion of observations described by the ...
EIGENSTRAT in HelixTree
The technique implemented in HelixTree to correct input data for population stratification by principal component analysis is based on the program EIGENSTRAT with a few enhancements. This technique was pioneered at the Broad Institute. The PCA correction technique is described in [Price 2006]. ...
Dominant and Recessive Genetic Association Tests and Cardinality
When running genetic association tests (population data) under the dominant and recessive model, genotypes are recoded according to the specified model and then tested for association. Suppose that the SNP we are looking at has 2 alleles, A and B. In this process, we choose the allele to model b ...
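As a rough illustration of the recoding step, here is a minimal Python sketch. It assumes genotypes are written as strings such as "A_B" and that the modeled allele is passed in by the caller; the excerpt above is cut off before it explains how HelixTree picks that allele, so the function name `recode_genotype` and its parameters are hypothetical and not part of HelixTree.

```python
def recode_genotype(genotype, model, allele, sep="_"):
    """Recode one bi-allelic genotype to 0/1 under a dominant or recessive model.

    Hypothetical helper for illustration only; `allele` is the allele being
    modeled and must be chosen by the caller.
    """
    copies = genotype.split(sep).count(allele)
    if model == "dominant":
        # One or two copies of the modeled allele count as "exposed".
        return int(copies >= 1)
    if model == "recessive":
        # Only two copies of the modeled allele count as "exposed".
        return int(copies == 2)
    raise ValueError("model must be 'dominant' or 'recessive'")


genotypes = ["A_A", "A_B", "B_B"]
dominant_codes = [recode_genotype(g, "dominant", "B") for g in genotypes]
recessive_codes = [recode_genotype(g, "recessive", "B") for g in genotypes]
```

The recoded 0/1 column would then go into whatever association test is being run; that step is not shown here.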
Does the adjusted p-value (Ap) include as possible splits those that do not contain the minimum number of observations in the nodes, as required in the tree options?
The short answer is no. For tree splits, the factors used in the adjusted p calculations were derived from simulations on random data from the appropriate distribution (binary or normal) where no constraint was placed on the minimum observations in the nodes. Thus, if you have a constraint on th ...
What is the difference between single and full scan permutations?
These permutations are accessed from the Genetic Association Tests Window. For this example, let’s assume we have 50 markers and run 100 permutations. Of course, these are user defined. With a single scan permutation: 1. For each of the 50 markers, we run 100 permutations on our dependent var ...
Hardy Weinberg Equilibrium (HWE) differences between HelixTree and PBAT
1. There are two HWE columns in PBAT. The "main" HWE column is for patients only and the other is for parents only. HelixTree calculates HWE for all samples (not including deactivated rows). 2. PBAT uses two degrees of freedom for its HWE calculations, while HelixTree uses only one.
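To see why the degrees of freedom matter, the sketch below computes a Pearson chi-square statistic for departure from HWE and converts it to a p-value under one and two degrees of freedom. This assumes both programs use a standard goodness-of-fit chi-square test, which the text above does not confirm; the function names are illustrative only.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Pearson chi-square statistic comparing observed genotype counts
    with the counts expected under Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


def chi_square_p_value(stat, df):
    """Upper-tail chi-square probability, using the closed forms for df 1 and 2."""
    if df == 1:
        return math.erfc(math.sqrt(stat / 2.0))
    if df == 2:
        return math.exp(-stat / 2.0)
    raise ValueError("only df=1 and df=2 are handled in this sketch")


stat = hwe_chi_square(30, 40, 30)       # made-up genotype counts
p_one_df = chi_square_p_value(stat, 1)  # one degree of freedom
p_two_df = chi_square_p_value(stat, 2)  # two degrees of freedom
```

For the same statistic, the two-degree-of-freedom p-value is always the larger of the two, so a marker can look further from equilibrium under one degree of freedom than under two.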
Can HelixTree handle microsatellite genotypes (or multi-allelic markers)?
Yes, HelixTree can handle multi-allelic markers. If you have a multi-allelic marker, HelixTree will still recognize the column as genetic, and you can perform LD, HWE, etc. You will not be able to test for association through the Case/Control Association window, but instead use Interactiv ...
Settings for handling missing values in HelixTree
There are three levels of options to manipulate how HelixTree handles missing values. Beginning with HelixTree's opening screen, you can click on Tools > Options for Updates and New Projects. Within these tabs, you will see a check box next to "Use missing values as predictors." If it is c ...
Is there a recommended cutoff for Hardy Weinberg Equilibrium and minor allele frequency for whole genome analysis?
We are not sure whether there are accepted rules of thumb, so the cautious answer is that it depends. There are certain population structures where large departures from HWE are legitimate, as well as regions of the genome prone to copy number deletions that could result in large departure ...
How does the random tree creation tool work in HelixTree?
The random tree creation tool builds random tree models of your data, where the split variables are randomly chosen from a list of significant effects (by default 10 or fewer). These random trees can then be analyzed in various ways. We can look and see which variables are used across all the ...
The meaning of recursive split in HelixTree
The recursive split builds a dendrogram, or tree-like model. The top node is the root node and represents the entire data set (n = number of observations, u = mean, s = standard deviation). From this root node other branches grow, forming subgroups, and so on, until a termination criterion i ...
2-Loci Genetic Plot, methodology, p-value computation
The gist of the two-loci plot is that it is implemented as a categorical split using the underlying tree splitting methodology in HelixTree. Here is how the two-loci procedure works for each pair of markers: 1. Create categorical variables corresponding to the unique PAIRS of genotypes. This is ...
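Step 1 above can be sketched in a few lines of Python. This covers only the creation of the paired-genotype categorical variable; the later steps are cut off in the excerpt and are not reproduced. The function name `pair_genotypes`, the "|" separator, and the sample data are assumptions made for illustration.

```python
def pair_genotypes(marker_1, marker_2, sep="|"):
    """Combine two genotype columns into one categorical column whose
    levels are the genotype pairs observed sample by sample."""
    if len(marker_1) != len(marker_2):
        raise ValueError("both markers must cover the same samples")
    return [f"{g1}{sep}{g2}" for g1, g2 in zip(marker_1, marker_2)]


# Made-up genotypes for five samples at two markers.
marker_1 = ["A_A", "A_B", "A_B", "B_B", "A_A"]
marker_2 = ["C_C", "C_D", "C_D", "C_C", "C_C"]

paired = pair_genotypes(marker_1, marker_2)
categories = sorted(set(paired))  # the unique genotype pairs
```

The resulting column has one level per unique genotype pair, which is the form a categorical split would take as input.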
HelixTree seems to be missing menu items and/or icons for genetic features.
Certain features such as Haplotype Frequency Estimation, Hardy Weinberg Equilibrium Plot, LD plot, etc. are only available when genetic data is in the spreadsheet. When you have genetic data in your spreadsheet and these features and corresponding buttons are missing, it usually means that genet ...
How does the manual split option work?
If you execute the manual split, HelixTree will look for the model that gives you the best p-value. HelixTree calculates the p-value for each of the following: co-dominant model: (A_A) vs. (A_B) vs. (B_B); dominant model: (A_A) vs. (A_B, B_B); recessive model: (A_A, A_B) vs. (B_B); heterozygous advantage ...
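The idea of trying each grouping and keeping the best p-value can be sketched as follows. This is only an illustration under stated assumptions: a binary (0/1) response and a Pearson chi-square test on the resulting table. The statistic HelixTree actually uses is not given in the text above, and the heterozygous advantage grouping is left out because the excerpt is cut off before defining it. All names here are hypothetical.

```python
import math

# Genotype groupings named in the text. The heterozygous advantage model
# is omitted because its grouping is truncated in the source.
MODELS = {
    "co-dominant": [{"A_A"}, {"A_B"}, {"B_B"}],
    "dominant": [{"A_A"}, {"A_B", "B_B"}],
    "recessive": [{"A_A", "A_B"}, {"B_B"}],
}


def chi_square_p_value(stat, df):
    """Upper-tail chi-square probability for df 1 or 2 (closed forms)."""
    if df == 1:
        return math.erfc(math.sqrt(stat / 2.0))
    if df == 2:
        return math.exp(-stat / 2.0)
    raise ValueError("only df=1 and df=2 are handled in this sketch")


def split_p_value(genotypes, response, groups):
    """Pearson chi-square p-value for a groups-by-2 table of 0/1 responses."""
    table = [[0, 0] for _ in groups]
    for genotype, y in zip(genotypes, response):
        for i, members in enumerate(groups):
            if genotype in members:
                table[i][y] += 1
    total = sum(sum(row) for row in table)
    col_totals = [sum(row[j] for row in table) for j in (0, 1)]
    stat = 0.0
    for row in table:
        row_total = sum(row)
        for j in (0, 1):
            expected = row_total * col_totals[j] / total
            if expected > 0:
                stat += (row[j] - expected) ** 2 / expected
    return chi_square_p_value(stat, len(groups) - 1)


def best_model(genotypes, response):
    """Return the grouping with the smallest p-value, plus all p-values."""
    p_values = {name: split_p_value(genotypes, response, groups)
                for name, groups in MODELS.items()}
    return min(p_values, key=p_values.get), p_values


# Made-up data: B_B samples are mostly cases (1), the rest mostly controls (0).
genotypes = ["A_A"] * 10 + ["A_B"] * 10 + ["B_B"] * 10
response = [0] * 8 + [1] * 2 + [0] * 8 + [1] * 2 + [0] * 2 + [1] * 8
winner, p_values = best_model(genotypes, response)
```

With this made-up data the recessive grouping gives the smallest p-value, because it isolates the B_B samples without spending an extra degree of freedom on separating A_A from A_B.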