14.3 The File Menu
| Figure 14.9: | LD Plot File Menu |
|
In addition to saving bitmaps and printing, the Linkage Disequilibrium plot offers several outputs under the File menu
(Fig. 14.9), each written as either a spreadsheet or a comma-separated-value (CSV) file. The first is a summary of the
entire plot. The next four report statistical values computed for an individual point; each relates only to the point
whose values are currently displayed in the lower-left corner of the plot window. The final menu option outputs SNP tags
for the markers shown on the bottom axis, found using the Carlson method. The File menu options are as follows:
14.3.1 Create Bitmap
| Figure 14.10: | The menu choice File->Create Bitmap opens a Save As file dialog window for navigating to a folder and saving the BMP file. |
|
Choosing File->Create Bitmap opens a Save As file dialog window in which you navigate to a folder and save the BMP
file.
| Figure 14.11: | Bitmap of the LD plot with all HelixTree controls, such as buttons, pull downs and menus, removed. |
|
14.3.2 Print Image
| Figure 14.12: | The menu choice File->Print Image opens a Print dialog window. |
|
14.3.3 Summarize Data for All Points to CSV File
| Figure 14.13: | LD data summarized in a spreadsheet |
|
Choose File->Summarize Data for All Points to export the summary data as a comma-separated-value (CSV) text file. The
CSV file has one row for every point and only a few columns, so spreadsheet software can handle it easily.
Figure 14.13 shows the data loaded into a spreadsheet.
Note that for the initial (un-zoomed) display, only the data above the diagonal is exported, because this plot is symmetric
around the diagonal. For a zoomed display, all points which are showing will be output, even if some may redundantly occur
on both sides of the diagonal.
The fields for this output are as follows:
| Field | Value |
| Marker1 | The X-coordinate marker |
| Marker2 | The Y-coordinate marker |
| Distance | The distance between the two markers |
| Chi Squared | Statistic on the correlation between marker1 and marker2 |
| p Value (Chi Sq.) | The corresponding p-value for Chi square |
| Neg Log p (Chi Sq.) | The negative logarithm base 10 of the p-value |
| LD Correlation R | The correlation measure between marker1 and marker2 |
| D Prime | This is an alternative measure of linkage disequilibrium |
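As a rough illustration of working with this export outside a spreadsheet, the sketch below reads rows laid out like the field table above and keeps the marker pairs whose squared correlation meets a threshold. The column headers are assumed to match the field names exactly, the sample rows are made up for illustration, and `strong_pairs` and `neg_log10` are hypothetical helper names, not part of HelixTree.

```python
import csv
import io
import math

# Made-up rows in the layout of the field table above; NOT real program output.
SAMPLE = """Marker1,Marker2,Distance,Chi Squared,p Value (Chi Sq.),Neg Log p (Chi Sq.),LD Correlation R,D Prime
rs1,rs2,1200,15.1,0.0001,4.0,0.62,0.91
rs1,rs3,5400,0.455,0.5,0.30103,0.08,0.15
rs2,rs3,4200,9.0,0.0027,2.568636,0.41,0.77
"""


def strong_pairs(csv_text, min_r2=0.2):
    """Return (marker1, marker2, r_squared) for rows with R^2 >= min_r2."""
    result = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        r2 = float(row["LD Correlation R"]) ** 2  # square R to get R^2
        if r2 >= min_r2:
            result.append((row["Marker1"], row["Marker2"], r2))
    return result


def neg_log10(p):
    """Negative base-10 logarithm, as described for the Neg Log p field."""
    return -math.log10(p)


pairs = strong_pairs(SAMPLE)
```

To read a real export, the same function could be given the contents of the saved file instead of `SAMPLE`, provided the header row matches.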
14.3.4 Output Genotypes for Current Point to CSV File
| Figure 14.14: | LD Genotypes output to spreadsheet |
|
Choose File->Output Genotypes for Current Point to export the genotype data as a comma-separated-value (CSV) text
file. This file contains the count of each genotype at each of the two markers for the selected point. Figure 14.14
shows the data loaded into a spreadsheet.
The fields for this output are as follows:
| Field | Value |
| Marker | The name of the X-axis marker or of the Y-axis marker for this point will appear here. |
| Genotype | The genotype in the above-named marker, in a_b format |
| Count | The count of this genotype at this marker |
14.3.5 Output Genotype Combinations for Current Point
| Figure 14.15: | LD Combined Genotypes output to spreadsheet |
|
Use the menus File->Output Genotype Combinations for Current Point to save this file. Figure 14.15 shows the data
loaded into a spreadsheet.
The fields for this output are as follows:
| Field | Value |
| Marker 1 | The X-axis marker |
| Marker 2 | The Y-axis marker |
| Genotype 1 | A genotype at marker 1 |
| Genotype 2 | A genotype at marker 2 |
| Count | The number of times Genotype 1 and Genotype 2 occur together in the same person |
14.3.6 Output Allele Counts for Current Point
| Figure 14.16: | Output of allele counts for a selected point |
|
Use the menus File->Output Allele Counts for Current Point to save this file. This saves the allele counts for a selected
point in the plotted matrix. Figure 14.16 shows the data loaded into a spreadsheet.
The fields for this output are as follows:
| Field | Value |
| Marker | This shows either the x-axis marker name or the y-axis marker name |
| Allele | An allele for the named marker |
| Count | The number of occurrences of this allele in the named marker |
| Freq | The probability for this allele at this marker |
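The relationship between the Count and Freq fields can be sketched as follows, assuming that allele counts are tallied from genotype counts in the a_b format of section 14.3.4 and that Freq is the allele count divided by the total number of alleles counted at that marker. Both assumptions, the helper name `allele_counts`, and the sample numbers are illustrative only; how the program treats missing genotypes is not covered here.

```python
from collections import Counter


def allele_counts(genotype_counts):
    """Tally alleles from {"a_b": count}; return {allele: (count, freq)}."""
    counts = Counter()
    for genotype, n in genotype_counts.items():
        # Each genotype "a_b" contributes one copy of each allele per person.
        for allele in genotype.split("_"):
            counts[allele] += n
    total = sum(counts.values())
    return {allele: (c, c / total) for allele, c in counts.items()}


# Made-up genotype counts for a single marker.
example = allele_counts({"A_A": 30, "A_G": 50, "G_G": 20})
```

With these made-up counts, allele A is seen 110 times and allele G 90 times out of 200 alleles.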
14.3.7 Output Allele Combinations for Current Point
| Figure 14.17: | Output of allele combinations for a current point |
|
Use the menus File->Output Allele Combinations for Current Point to save this file. Figure 14.17 shows the data loaded
into a spreadsheet.
The purpose of this file is to answer the question: “For every allele from the x-axis marker and every allele from the y-axis
marker, how does the actual probability of finding this combination compare with the expected probability of finding
it?”
The fields for this output are as follows:
| Field | Value |
| Marker 1 | The x-axis marker |
| Marker 2 | The y-axis marker |
| Allele 1 | An allele for the x-axis marker |
| Allele 2 | An allele for the y-axis marker |
| Combination Count | The number of occurrences of this allele combination at this point |
| Nij Over n | The probability of this combination at this point |
| Freq 1 | The probability for this x-axis allele |
| Freq 2 | The probability for this y-axis allele |
| Delta ij | The difference between the actual and expected combination probabilities |
| Chi Squared ij | Chi squared resulting from this combination |
| D Prime ij | The D Prime number resulting from this combination |
| Chi Square Contrbn | The contribution from this combination to the point’s entire Chi squared value |
| D Prime Contrbn | The contribution from this combination to the point’s entire D Prime value |
| (DAA) | Hardy-Weinberg coefficient DAA for the first allele of marker 1 (biallelic HW-corrected case only). |
| (DBB) | Hardy-Weinberg coefficient DBB for the first allele of marker 2 (biallelic HW-corrected case only). |
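The comparison of actual and expected probabilities can be illustrated with a minimal sketch. It assumes the combination counts are already known, that Freq 1 and Freq 2 are the marginal sums of Nij Over n, and that the expected probability under independence is Freq 1 times Freq 2 (the usual definition of the LD coefficient). The function name `delta_table` and the counts are hypothetical, and the sketch does not attempt the Chi Squared ij or D Prime ij columns.

```python
def delta_table(combo_counts):
    """combo_counts: {(allele1, allele2): count}. Return per-combination values."""
    n = sum(combo_counts.values())
    freq1, freq2 = {}, {}
    for (a1, a2), c in combo_counts.items():
        # Marginal allele probabilities are sums over the other marker's alleles.
        freq1[a1] = freq1.get(a1, 0.0) + c / n
        freq2[a2] = freq2.get(a2, 0.0) + c / n
    rows = {}
    for (a1, a2), c in combo_counts.items():
        observed = c / n                  # "Nij Over n"
        expected = freq1[a1] * freq2[a2]  # assumes independence of the markers
        rows[(a1, a2)] = {
            "nij_over_n": observed,
            "freq1": freq1[a1],
            "freq2": freq2[a2],
            "delta": observed - expected,  # "Delta ij"
        }
    return rows


# Made-up combination counts for alleles A/G (x-axis) and C/T (y-axis).
rows = delta_table({("A", "C"): 40, ("A", "T"): 10, ("G", "C"): 10, ("G", "T"): 40})
```

In this made-up case every allele has frequency 0.5, so each expected probability is 0.25; the A/C combination is observed at 0.40, giving a Delta ij of 0.15, and A/T is observed at 0.10, giving -0.15.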
14.3.8 View Carlson-Method SNP Tags
This option finds and shows the result of SNP-tagging on the markers currently showing on the X-axis using the Carlson
method.
This option may also be requested by clicking the “C” icon on the toolbar.
| Figure 14.18: | LD Plot Carlson SNP-Tagging Option |
|
The Carlson method (see [Carlson 2004]), which is based completely on the R^2 LD statistic, determines groupings of
markers which are in tight correlation with an individual marker or markers (tagging markers) within the grouping. Markers
whose minor-allele frequency (MAF) is not sufficiently high are not considered for designation as tagging
markers.
NOTE: This method is designed to work best where the markers are individual SNPs without missing
data.
When this option is selected, the following dialog appears. In it, you select the thresholds that you want to use for the
minor-allele frequency (MAF) and for the R^2 statistic. In addition, you can limit the separation of pairs of markers being
considered for groupings; that is, you can apply this method over a narrow window of markers, within which pairs of
markers are more likely to be in LD. To do this, check the “Check for Marker Correlation Only Within Window” checkbox and enter
the window size.
| Figure 14.19: | Carlson SNP-Tagging Options |
|
After some computation, a spreadsheet will appear.
The fields for this output are as follows:
| Field | Value |
| Name | The name of the marker |
| Minor Allele Freq | The minor-allele frequency of the marker |
| Minimum LD R^2 | The minimum R^2 LD statistic existing between this marker and any other marker in its group |
| Tag? | Whether or not this marker is usable as a tag for this group |
| Tag Group | The number of the group into which this marker has been placed, or zero if the marker is not in a group |
| Figure 14.20: | Carlson SNP Tags |
|
To view the groupings more easily, sort the spreadsheet by the Tag Group column (the grouping number).
| Figure 14.21: | Grouped Carlson SNP Tags |
|
“Group 0”, if present, represents those markers with insufficient minor-allele frequency to be grouped or to be
used as tags. Group 1 will be the largest group, since the Carlson method finds the largest possible groupings
first, then out of the remaining markers, finds the next-largest possible grouping, etc. The final groups may
consist of singleton markers, that is, those which do not have sufficient R^2 LD with any other marker to be
grouped.
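The grouping behaviour described above can be approximated by a short greedy sketch. This is one reading of the procedure, not HelixTree's actual implementation: it assumes pairwise R^2 values are supplied in a dictionary, treats a marker as a tag when it meets the R^2 threshold with every other member of its group, places low-MAF markers in group 0, and ignores the optional marker window. The function name `carlson_groups`, the default thresholds, and the sample values are all hypothetical.

```python
def carlson_groups(maf, r2, maf_min=0.1, r2_min=0.8):
    """maf: {marker: MAF}; r2: {frozenset({m1, m2}): R^2}.

    Return {marker: (group_number, is_tag)}; group 0 holds low-MAF markers.
    """
    def linked(a, b):
        # Pairs missing from r2 are treated as uncorrelated (an assumption).
        return r2.get(frozenset((a, b)), 0.0) >= r2_min

    result = {m: (0, False) for m in maf if maf[m] < maf_min}
    remaining = [m for m in maf if maf[m] >= maf_min]
    group = 0
    while remaining:
        # Greedy step: the marker linked to the most remaining markers seeds
        # the next (largest available) group; ties go to the first marker.
        best = max(remaining,
                   key=lambda m: sum(linked(m, o) for o in remaining if o != m))
        members = [best] + [o for o in remaining if o != best and linked(best, o)]
        group += 1
        for m in members:
            is_tag = all(linked(m, o) for o in members if o != m)
            result[m] = (group, is_tag)
        remaining = [m for m in remaining if m not in members]
    return result


# Made-up MAF and R^2 values for five markers.
maf = {"s1": 0.30, "s2": 0.25, "s3": 0.40, "s4": 0.20, "s5": 0.02}
r2 = {frozenset(("s1", "s2")): 0.90,
      frozenset(("s1", "s3")): 0.85,
      frozenset(("s2", "s3")): 0.50}
groups = carlson_groups(maf, r2)
```

With these made-up inputs, s1, s2 and s3 form group 1 with s1 as the only tag, s4 becomes a singleton group 2, and s5 falls into group 0 because its MAF is below the threshold.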