Welcome to the Introduction to VarSeq Tutorial!
Updated: January 16, 2021

Level: Beginner
Product: VarSeq
This tutorial covers a basic cardiomyopathy gene panel workflow with an emphasis on understanding and exploring filter chains and variant tables.
Requirements
To complete this tutorial you will need to download and unzip the following file, which includes a starter project.
Important
The majority of the workflow described in this tutorial can be done within the free VarSeq Viewer; however, some options require an active license. You can go to Discover VarSeq and request a viewer or evaluation license.
Download
Files included in the above ZIP file: Introduction to VarSeq Tutorial – Starter project containing 1 VCF file for the NextSeq 500 TruSight Cardio dataset with the NA12877-Rep1_S1 sample.
Note
VarSeq version 2.2.1 was used to create this tutorial. While every attempt will be made to keep this content relevant, it is possible that certain features or icons may change with newer releases.
Setup
The most recent version of VarSeq can be downloaded from here: VarSeq Download.
Select your operating system and download. Additional information for platform specific installation can be found in the Installing and Initializing section of the manual.
The Setup Wizard will then guide you through the setup process.
On the final page of the Setup Wizard, select Finish with the Launch VarSeq option checked.
This will bring up the introductory VarSeq page where new users can register their information. A confirmation email will then be sent to verify the email address.
Once the email has been confirmed, users can select the Login tab and enter their login email and password.
At this point, VarSeq opens in Viewer mode and is ready to use. If the user already has a license key, it can be activated by selecting Help on the title bar and then selecting Activate a VarSeq License Key.
This will bring up a dialog where the license key can be entered. Enter your license key and select Verify.
Once the license key is verified, read the license agreement, select the I accept the license agreement option, and select Verify.
Congratulations! At this point, the product license is activated and you are ready to start an example project or a tutorial!
Note
During the initial installation process, the user will be asked where to store the AppData folder. Although this location can be changed after installation, it is recommended that multiple-user organizations select a shared drive location to increase ease of project sharing and to decrease redundancy.
Overview
This tutorial will start with an existing project. Much of the functionality described in this tutorial can be performed with a “Viewer” license; for options that are not allowed, the software will display a warning instead. We recommend going through the tutorial with an active license or a demo license to experience full functionality within this project. Additionally, this project could be used as a basis for further exploration, annotation and filtering. The end result will be exported data with a candidate variant.
VarSeq software supports several variant filtering workflows such as trio, cancer, and gene panel workflows to name a few. This tutorial will focus on a Cardiomyopathy gene panel workflow that utilizes a set of filtering criteria to find a clinically relevant pathogenic variant.
The ZIP file can be downloaded and the contents can be extracted to a convenient location. Then from VarSeq go to File > Open Project to select the saved project file. Alternatively, if you would like to go through the import process, see the import instructions below.
To import the original VCF files that were used in the supplied project, download them from our Data Repository through VarSeq by opening the software and going to Tools > Manage Data Sources and selecting the NextSeq 500 TruSight Cardio file from the Example Samples > NextSeq 500 TruSight Cardio folder. Select the NA12877-Rep1_S1 and NA12878-Rep1_S1 Variant Map files. Then click Download in the lower left corner. See Figure 1-1.

Return to VarSeq and select Create New Project. At this point you can either choose to use a template or start an empty project from scratch. Provide a name for the project and click Ok. See Figure 1-2.

Next, we will import the VCF files by clicking Import Variants. Then, open the folder containing the NA12877-Rep1_S1 and NA12878-Rep1_S1 VCF files that were downloaded and click and drag the files into the import window. Alternatively, you can click Add Files, navigate to the folder with the samples, and add them this way. See Figure 1-3. Click Next. Now specify the sample relationships. Click Next twice more and click Finished to complete the import.
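Before specifying the sample relationships, it can help to confirm which sample names a VCF file contains. The following is a minimal Python sketch, independent of VarSeq, that reads the sample names from the #CHROM header line defined by the VCF format. The function name vcf_sample_names and the one-record example file are illustrative assumptions, not part of the tutorial data.

```python
import io


def vcf_sample_names(lines):
    """Return the sample names listed on the #CHROM header line of a VCF."""
    for line in lines:
        if line.startswith("#CHROM"):
            columns = line.rstrip("\n").split("\t")
            # The first nine columns (CHROM through FORMAT) are fixed;
            # every column after them is a sample name.
            return columns[9:]
        if not line.startswith("#"):
            break  # reached data rows without finding the header line
    return []


# A tiny, made-up VCF held in memory for illustration only.
example = io.StringIO(
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tNA12877-Rep1_S1\n"
    "1\t12345\t.\tA\tG\t50\tPASS\t.\tGT:DP\t0/1:120\n"
)
samples = vcf_sample_names(example)
print(samples)
```

For a file on disk, the same function can be given an open text file handle instead of the in-memory example.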

A typical hereditary workflow would begin with a genetic test for patients with inherited cardiovascular disorders, hereditary cancers, or other inherited diseases that are identified through the patient’s family history and clinical presentation. Once variants have been called by a secondary analysis pipeline, they should first be filtered on the quality of the call. It is also important to capture rare variants that are not found at high frequency across the general population; this can be achieved by utilizing variant frequency databases. The workflow wraps up by choosing variants that are associated with the phenotype of the patient and selecting those that are pathogenic or likely pathogenic.
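The order of these steps can be pictured as a chain of simple tests, each applied to whatever survived the previous one. Below is a rough Python sketch of that idea; it is not how VarSeq is implemented. The field names (dp, maf, gene, classification), every cutoff value, the three-gene panel, and the five example variants are all made up for illustration.

```python
from typing import Any, Callable, Dict, List, Tuple

Variant = Dict[str, Any]
Step = Tuple[str, Callable[[Variant], bool]]

# Hypothetical gene panel and classification labels, for illustration only.
PANEL_GENES = {"MYH7", "MYBPC3", "TTN"}
REPORTABLE = {"Pathogenic", "Likely pathogenic"}


def run_chain(variants: List[Variant], steps: List[Step]):
    """Apply each step in order and record how many variants remain."""
    remaining = list(variants)
    counts = [("Input", len(remaining))]
    for name, keep in steps:
        remaining = [v for v in remaining if keep(v)]
        counts.append((name, len(remaining)))
    return remaining, counts


steps: List[Step] = [
    ("Call quality (read depth >= 100)", lambda v: v["dp"] >= 100),
    ("Rare in population (frequency < 0.01)", lambda v: v["maf"] < 0.01),
    ("Gene matches phenotype", lambda v: v["gene"] in PANEL_GENES),
    ("Pathogenic or likely pathogenic", lambda v: v["classification"] in REPORTABLE),
]

# Made-up variants; each one after v1 is built to fail a different step.
variants = [
    {"id": "v1", "dp": 150, "maf": 0.0001, "gene": "MYH7", "classification": "Pathogenic"},
    {"id": "v2", "dp": 40, "maf": 0.0001, "gene": "MYH7", "classification": "Pathogenic"},
    {"id": "v3", "dp": 200, "maf": 0.25, "gene": "TTN", "classification": "Benign"},
    {"id": "v4", "dp": 180, "maf": 0.002, "gene": "BRCA2", "classification": "Pathogenic"},
    {"id": "v5", "dp": 300, "maf": 0.003, "gene": "MYBPC3", "classification": "Benign"},
]

remaining, counts = run_chain(variants, steps)
for name, count in counts:
    print(f"{count:>3}  {name}")
```

Because each step only sees the survivors of the step before it, the count can stay the same or shrink from one step to the next, but it can never grow.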
The second chapter of this tutorial will cover the Filter View, including the input and output of the filter chain, how to expand and collapse a card, how to move a card, how to change a card, and how to click on a card to get a filter level report.
The third chapter will cover the Table View, including how to hide and show columns, cloning a table and linking and unlinking it to a filter result, viewing intermediate results in the filter chain, and getting column reports and histograms.
The fourth chapter will cover exporting results from VarSeq into various formats.
The last chapter will wrap up the tutorial as well as include links for where to go next to learn more about VarSeq.
The Filter View Interface

1. First, we will examine the input and output number of variants in the filter chain. You should see a number at the top right of the filter chain (as noted in Figure 2-1) that indicates how many variants were imported from the VCF file for the current sample. This number should be 14,597. See also Figure 2-2.

You should also see a number at the bottom right of the filter chain (as noted in Figure 2-1) that indicates how many variants remained for the current sample after applying the current filter chain and all enabled filters. This number should be 1. See also Figure 2-3.

Note
Two filters are currently not enabled. These filters are grayed out and all of the variants pass through these cards (the card has no effect on the variants in the filter chain).
You should also notice a number on each filter card, enabled or otherwise. This is the number of variants that remain in the filter chain after applying that particular filter.
2. Next, we will examine a filter in more detail. In the current project, the filters are shown in a collapsed state. These filters can be examined by expanding the cards and showing the filter criteria. Click on the open rectangle (see Figure 2-4) for the Read Depths (DP) (Current) card.

This filter is a numeric filter, which means there is a numeric threshold box and options on how to filter based on that threshold. Currently, variants for the current sample are kept only if the read depth is greater than or equal to 100 reads. See Figure 2-5.

3. To see what happens to the filter chain when changing a numeric filter, change the value of 100 to 50. You should see that 1,077 variants now pass this filter. See Figure 2-6.

Now, change the threshold back to 100 and collapse the card by clicking on the horizontal line in the upper right corner of the card. See Figure 2-7.

4. Filter containers are a great way to group filter cards. To make a filter container, right-click inside of the filter chain window and select Add Filter Container. See Figure 2-8. Rename the filter container to ‘Quality Control Filters’ by double-clicking on the filter container title. See Figure 2-9.


To change the order of the filtering operations, click on a card and drag it to the desired location. We will demonstrate this by moving the Read Depths (DP) (Current) and the Genotype Qualities (GQ) (Current) filter cards into the Quality Control Filters filter container. Click on the Read Depths (DP) (Current) filter card, drag it, and hover inside the Quality Control Filters filter container until the black horizontal I-bar appears under the Quality Control Filters filter container. See Figure 2-10.

Repeat this process with the Genotype Qualities (GQ) (Current) card. See Figure 2-11.

Finally, drag the Quality Control Filters filter container to the top of the filter chain: click on the filter container, drag it, and hover over the top of the filter chain until the black horizontal I-bar appears over the All MAF card. See Figure 2-12.

5. To add an annotation source to the project, click on the Add Icon and select Add Variant Annotation. See Figure 2-13.

We will add an additional population frequency database. To do this, select the Public Annotations folder and then select 1kG Phase3-Variant Frequencies 5a with Genotype Counts, GHI. You can also use the Filter search bar at the top of the window to easily search for annotation sources. See Figure 2-14.

Once added, scroll all the way to the right in the variant table to see 1kG Phase3-Variant Frequencies 5a with Genotype Counts, GHI within the table and all of the associated fields. Any of the fields in the variant table can be added to the filter chain by right-clicking on the field and selecting Add to Filter Chain. Let’s do this with the Allele Frequencies field. In addition, when you click on any column header in the variant table, the Details view will update to reflect the information from the source. It will also provide a table of the values contained within the column and the proportion of values seen in each category for the input set of variants. See Figure 2-15.

Now, set the numeric filter to 0.3 for the Allele Frequencies filter card, then move the filter card under the All MAF filter card. See Figure 2-16.

6. Next, expand the All MAF filter card and click on the yellow histogram icon. This will give a histogram of the minor allele frequencies for input variants for this card. See Figure 2-17.
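To make the numeric filter card and histogram steps above more concrete, here is a small Python sketch of the same ideas outside of VarSeq. It shows a threshold card that can be disabled (so every variant passes through) and a simple fixed-width histogram of allele frequencies. The NumericCard class, the histogram function, and all sample values are hypothetical; VarSeq's own implementation may differ.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class NumericCard:
    """A filter card that keeps variants whose field is >= threshold."""
    field: str
    threshold: float
    enabled: bool = True

    def apply(self, variants: List[Dict[str, float]]) -> List[Dict[str, float]]:
        if not self.enabled:
            # A disabled card has no effect: all variants pass through.
            return list(variants)
        return [v for v in variants if v[self.field] >= self.threshold]


def histogram(values: List[float], bin_width: float, n_bins: int) -> List[int]:
    """Count values per bin; anything past the last bin is clamped into it.

    Values that sit exactly on a bin edge may land in either neighboring
    bin because of floating-point division.
    """
    counts = [0] * n_bins
    for value in values:
        index = min(int(value / bin_width), n_bins - 1)
        counts[index] += 1
    return counts


# Made-up read depths for five variants.
variants = [{"dp": depth} for depth in (30, 60, 99, 100, 250)]

card = NumericCard(field="dp", threshold=100)
kept_at_100 = card.apply(variants)

card.threshold = 50  # lowering the cutoff lets more variants through
kept_at_50 = card.apply(variants)

card.enabled = False  # a grayed-out card keeps everything
kept_when_disabled = card.apply(variants)

# Made-up minor allele frequencies, binned in steps of 0.1 from 0 to 0.5.
maf_counts = histogram([0.02, 0.05, 0.12, 0.31, 0.48, 0.5], bin_width=0.1, n_bins=5)

print(len(kept_at_100), len(kept_at_50), len(kept_when_disabled))
print(maf_counts)
```

Lowering a greater-than-or-equal threshold can only keep the same number of variants or more, which is why the count on the card rises when the read depth cutoff drops from 100 to 50.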

The Table View Interface

The table view (Figure 3-1) in VarSeq can have numerous columns from the various data sources. To make these columns or data fields easier to work with, they are grouped by source. Columns and column groups can be hidden or shown to make the fields or sources more easily accessible. First, we will show you two different ways to hide and show columns and column groups. Right-click on any column, for example Identifier, and select Hide. All column groups, with the exception of sample column groups, can be hidden in this manner as well. See Figure 3-2.

The visibility of fields for a particular column group can be quickly modified by hovering over the column group and right clicking to see all the fields available. Check the Identifier box to show it again. See Figure 3-3.

Another way to hide and show columns and column groups is to use the Hide/Show or Column Visibility dialog. This dialog allows you to quickly select which columns to see or not see as well as to show hidden columns or column groups. Click on the eye icon in the tool bar to launch the dialog. The column visibility dialog should look like Figure 3-4 with the RefSeq Genes 105v2, NCBI column group expanded.

Expand the Transcript Interactions RefSeq Genes 105v2, NCBI column group, click on Unselect All to the right of this column group name, and then click on the Transcript Interactions RefSeq Genes 105v2, NCBI box. Finally, click on the box in front of HGVS p. and then click OK. This will show the column group with only the “p.” notation column visible. See Figure 3-5.

Tables can either update when filter results are clicked on or be fixed to a particular filter result. To tell if a table is locked to a particular filter, examine the table filter icon. If the padlock icon is open, the table is unlocked. If it is closed, the table is locked to the particular result identified in the icon. See Figure 3-6.

Click on the final result of the filter chain. The table filter icon should now say “Cardiomyopathy Input”. We want to keep a copy of this table and create another table to explore intermediate results. So, click the lock icon to lock the current table, then click the clone icon in the Table View tool bar. This will lock the current table and create a copy. See Figure 3-7.


Click the table for the copy to bring it to the front and unlock the filter view, then click on the filter results for the Effect (Combined) filter card. You should see the top table update to include all 82 variants from that card; however, the bottom table locked to Cardiomyopathy does not update. See Figure 3-8.

Click around some more, expand some cards and see how the table updates. Click on the Gene Rank filter card and expand it. Now right-click on the Greater than 0.9 category and select Results Up Through This Expression. This will update the table to show all variants for the current sample that match our phenotype of interest (cardiomyopathy) from the original input set of variants. See Figure 3-9.

Notice how the table filter icon changes to indicate that a particular category was used to select the variants displayed in the table. See Figure 3-10.
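The difference between a locked and an unlocked table can be summarized in a few lines of Python. This is only a toy model of the behavior described in this chapter, not VarSeq code. The TableView class and its method names are hypothetical, and it assumes a cloned table starts with the same lock state as the original, which the unlock step above suggests.

```python
class TableView:
    """Toy model of a table that may be locked to one filter result."""

    def __init__(self, shown_result: str, locked: bool = False):
        self.shown_result = shown_result
        self.locked = locked

    def select_filter_result(self, result_name: str) -> None:
        # Only an unlocked table follows clicks in the filter chain.
        if not self.locked:
            self.shown_result = result_name

    def clone(self) -> "TableView":
        # Assumption: the copy inherits the current result and lock state.
        return TableView(self.shown_result, self.locked)


original = TableView("Cardiomyopathy Input")
original.locked = True            # lock the table to its current result

copy_table = original.clone()     # the copy starts out locked as well
copy_table.locked = False         # unlock the copy to explore other results

# Clicking a filter result is broadcast to every table.
for table in (original, copy_table):
    table.select_filter_result("Effect (Combined)")

print(original.shown_result)
print(copy_table.shown_result)
```

Only the unlocked copy follows the click; the locked original keeps showing the result it was locked to.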

Exporting Results
For this chapter, the definition of “export” will be loosely interpreted. The first “export” we are going to perform is exporting a ClinVar RCV number to the ClinVar website to examine a variant in further detail. To start, click on the locked table tab that contains our 1 clinically significant variant and unlock the table. Next, scroll over to the ClinVar* column group. See Figure 4-1.

Double-click on the RCV number for the variant and this will launch a web browser displaying results from ClinVar for this variant. See Figure 4-2.

Rows can also be copied out of the table individually. To do this, right-click anywhere in a row and select one of the options. Information from a single cell can be copied out of the table from the right-click menu as well. See Figure 4-3.

To copy information out of the Detail View, you can either click on the Copy button on the lower right corner of the view to copy all of the visible information or you can click and drag to make a selection and either use Ctrl + C or right-click and select Copy to copy the information and formatting to the clipboard. See Figure 4-4.

The variants and annotation information from the table can also be exported directly to different file formats including VCF, Microsoft Excel XLSX, and text (TXT). Click on the Export icon on the top left of the VarSeq Window and then select VCF File…. See Figure 4-5.

Select the (Variants: 14,597) Cardiomyopathy Input: NA12877-Rep1_S1 table to be exported in VCF format then click OK. See Figure 4-6.

All of the annotation information can be exported in the VCF file, which will include all samples. Just as columns and column groups can be hidden and shown, the available columns and column groups can be selected for inclusion in the VCF file. Select RefSeq Genes 105 Interim v1, NCBI, Gene and Cardiomyopathy PhoRank and then click Next. Click on Browse to select a location to save the file and set the file name to export_Cardiomyopathy_tutorial-{sample}.vcf.gz or another name of your choice. Once the options are set, click Export. See Figure 4-7.

This file can then be imported into any other program that takes VCF files or reimported into VarSeq for accounts that have an active license.
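As an example of reading such a file in another program, the Python sketch below opens a gzip-compressed VCF and parses each record's INFO column into a dictionary. It relies only on the standard VCF column layout. The INFO keys (GENE, SCORE) and the single record are invented for the demonstration; the actual keys in your exported file depend on the columns you selected, so inspect the file header to find them.

```python
import gzip
import os
import tempfile


def parse_info(info: str) -> dict:
    """Turn 'DP=120;DB' into {'DP': '120', 'DB': True}."""
    fields = {}
    if info == ".":
        return fields  # '.' means the INFO column is empty
    for item in info.split(";"):
        key, separator, value = item.partition("=")
        # Keys without '=' are flags, so store them as True.
        fields[key] = value if separator else True
    return fields


def read_vcf(path: str) -> list:
    """Read the data rows of a gzip-compressed VCF into a list of dicts."""
    records = []
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            if line.startswith("#"):
                continue  # skip meta-information and header lines
            columns = line.rstrip("\n").split("\t")
            records.append({
                "chrom": columns[0],
                "pos": int(columns[1]),
                "ref": columns[3],
                "alt": columns[4],
                "info": parse_info(columns[7]),
            })
    return records


# Write a tiny made-up file so the example runs on its own.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "example.vcf.gz")
    with gzip.open(path, "wt", encoding="utf-8") as out:
        out.write("##fileformat=VCFv4.2\n")
        out.write("#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n")
        out.write("14\t23400000\t.\tC\tT\t.\t.\tGENE=MYH7;SCORE=0.95\n")
    records = read_vcf(path)

print(records)
```

INFO values are kept as strings here; convert them to numbers only once you know which keys hold numeric data in your export.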
Conclusion
This tutorial was designed to give a taste of all the features and capabilities of VarSeq and a brief orientation to key features.
If you are interested in getting a demo license to try out additional features that require an active license, such as creating a project, adding annotation sources, and saving projects, please request a demo from: Discover VarSeq
If you have an active license, we encourage you to try out the intermediate tutorial on Cancer Gene Panels: Cancer Gene Panel Tutorial.
Additional features and capabilities are being added all the time, so if you do not see a feature you need for your workflows please do not hesitate to let us know at Golden Helix Support!