Updated: February 15, 2021
This tutorial covers a basic VSClinical ACMG Trio workflow with an emphasis on understanding and exploring the trio filter logic and VSClinical ACMG classification tools.
To complete this tutorial you will need to download and unzip the following file, which includes sample trio data to analyze in the project. This file is a multi-sample VCF file with Proband (NA12940), Mother (NA12938), and Father (NA12939).
The majority of the workflow described in this tutorial requires VarSeq with the VSClinical ACMG algorithm. You can go to Discover VarSeq and request a viewer or evaluation license.
VarSeq version 2.2.1 was used to create this tutorial. While every attempt will be made to keep this content relevant, it is possible that certain features or icons may change with newer releases.
The most recent version of VarSeq can be downloaded from here: VarSeq Download.
Select your operating system and download. Additional information for platform specific installation can be found in the Installing and Initializing section of the manual.
The Setup Wizard will then guide you through the setup process.
On the final page of the Setup Wizard, select Finish with the Launch VarSeq option checked.
This will bring up the introductory VarSeq page where new users can register their information. A confirmation email will then be sent to verify the email address.
Once the email has been confirmed, users can select the Login tab and enter their login email and password.
At this point, the VarSeq Viewer mode is accessed and can be used. If the user already has a license key, this can be activated by selecting Help on the title bar and then selecting Activate a VarSeq License Key.
This will bring up a dialog where the license key can be entered. Enter your license key and select Verify.
Once the license key is verified, read the license agreement, select I accept the license agreement, and select Verify.
Congratulations! At this point, the product license is activated and you are ready to start an example project or a tutorial!
During the initial installation process, the user will be asked where to store the AppData folder. Although this location can be changed after installation, it is recommended that multiple-user organizations select a shared drive location to increase ease of project sharing and to decrease redundancy.
This tutorial will cover a trio analysis that includes importing VCF files, annotating and filtering variants, and rendering a clinical report using the ACMG Guidelines. The sample data includes a Yoruban trio from the HapMap Project with Mother-NA12938, Father-NA12939, and Proband-NA12940. Using this pedigree structure we will identify variants of different inheritance patterns and evaluate them using VSClinical according to the ACMG Guidelines.
VarSeq VSClinical ACMG
VSClinical is a tool that provides a simple way to leverage all the available evidence for a variant and score it for its potential impact on a disorder. The available evidence can be grouped into categories, including gene function and phenotype association, variant segregation in a family, the results of prediction algorithms, presence in databases, and previous classifications you yourself have made for any variant.
This collection of evidence is then linked to 33 criteria that the user can efficiently assess in a streamlined effort. These criteria are then aggregated to provide the basis for the final classifications of pathogenic, likely pathogenic, uncertain significance, likely benign or benign. When considering the amount of available evidence and requirements for eventually classifying variants, this process can become complex and difficult to master. Fortunately, VSClinical is a solution to this complexity. VSClinical provides a means of simplifying not only the process of scoring and classifying variants, but also provides a simple yet sophisticated means of presenting all evidence and criteria visually.
This tutorial walks the user through a trio analysis workflow focused on the ACMG guidelines, starting with a template provided by Golden Helix and ending with a clinical report. We will first cover the import steps and explore the filtering logic; then we will use the ACMG Classifier to evaluate selected variants and render a clinical report. This tutorial is accompanied by three VCF samples contained in a ZIP file, which you will need to download and extract to a convenient location.
New Project Workflow
This tutorial begins by opening a new instance of VarSeq and selecting Create New Project. With the Homo sapiens (Human), GRCh37 (hg19) (Feb 2009) Genome Assembly chosen, select the template called ACMG Guidelines Trio Template and then for the Name enter ACMG Trio Tutorial. If desired, you can also change the directory in which the project is stored by clicking on the Browse icon. The interface should look like Figure 2-1.
Selecting OK brings you to the next screen that directs the user to import variants. Select Add Files, navigate to the location of the extracted ZIP file, and add the ACMG Guidelines Trio Samples.vcf. Click Next.
In the Import Variants of Family Samples dialog, you need to define sample relationships and affection status as well as sex. For the proband (NA19240) you will need to define mother (NA12938) and father (NA12939) as well as the Affection status. You can do this by clicking on the dropdown options under the Mother and Father columns and selecting the appropriate samples. Affection status can be selected by clicking once on the box. The proband is also a Female, which should be selected under the Sex column. Mother and father Sex should also be defined. The dialog should look like Figure 2-3. Once the sample relationships are defined, you also have the option to associate BAM files using the Associate BAM File icon. Although we do not have BAM files for this project, this feature will automatically detect the BAM files if they have the same sample name and are located in the same directory or sub-directory as the VCF files. Click Next.
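Before assigning relationships in the dialog, it can help to confirm which sample names a multi-sample VCF actually contains. The following is a minimal sketch, independent of VarSeq, that reads the sample columns from the `#CHROM` header line of a VCF. The function name and the placeholder sample names (PROBAND, MOTHER, FATHER) are hypothetical and used only for illustration.

```python
# The VCF format defines nine fixed columns before the per-sample columns:
# CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO, FORMAT.
FIXED_COLUMNS = 9

def sample_names_from_header(lines):
    """Return the sample names listed on the #CHROM header line, if present."""
    for line in lines:
        if line.startswith("#CHROM"):
            return line.rstrip("\n").split("\t")[FIXED_COLUMNS:]
    return []

# Hypothetical in-memory header standing in for a real file.
header = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tPROBAND\tMOTHER\tFATHER\n",
]
names = sample_names_from_header(header)
print(names)
```

For a real file, the same function can be given an open text file handle instead of the in-memory list, since it only iterates over lines.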
Selecting ACMG Catalogs
After the import process the user will be provided with a dialog titled ACMG Project Options, which will be used to define the assessment catalogs for VSClinical. The assessment catalogs include one for classified variants, one for classified CNVs, and one for gene dosage sensitivity. Detailed information for these assessment catalogs can be found in the manual here. These catalogs will serve as an internal database by storing finalized variant classifications and can be used to track variants across projects. If you have already created assessment catalogs for VSClinical, you can choose them from the dropdown menus; otherwise, select Create Missing Catalogs at the bottom of the dialog to autopopulate these fields.
Once the assessment catalogs have been created, select OK and it will bring you to the main project interface where we can start to investigate the filter logic of the template.
Investigating Filter Chains
The VCF data imported into the project using the ACMG Trio Template results in 50,691 variants for all three samples. The imported variants are automatically passed through the filter logic, which is composed of two filter containers: variant quality control (Variant QC) and Inheritance Model. For this project we will focus on variants present in the proband, but to view the other samples you can click on the dropdown icon highlighted in Figure 4-1.
The Variant QC filter logic can be expanded in the filter chain by clicking the square box icon in the upper right corner of the container. Once selected, we can investigate the specific filter criteria that are being applied.
The 50,691 variants are applied to the Variant QC container which includes Read Depth (DP) >10 and Genotype Qualities (GQ) >20. These fields are derived from the variant table and are eliminating variants that have a low read depth and genotype quality. If you want to add additional fields into the filter logic, right click on the column header in the variant table and select Add to Filter Chain. Newly added filters will appear at the bottom of the filter logic, but they can be moved using a drag and drop method into any location within the filter chain. Applying this filter logic resulted in a total of 48,521 variants. The resulting variants are then passed into the Inheritance Model filter container.
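The two Variant QC thresholds can be illustrated outside VarSeq with a short sketch. The record layout and variant identifiers below are hypothetical; only the field names DP and GQ and the cutoffs (greater than 10 and greater than 20) come from the filter cards described above.

```python
# Hypothetical variant records; only the DP and GQ thresholds come from the tutorial.
variants = [
    {"id": "var1", "DP": 35, "GQ": 99},
    {"id": "var2", "DP": 8, "GQ": 45},   # fails read depth
    {"id": "var3", "DP": 22, "GQ": 15},  # fails genotype quality
    {"id": "var4", "DP": 10, "GQ": 20},  # fails both: the comparisons are strict
]

def passes_variant_qc(record, min_dp=10, min_gq=20):
    """Return True when DP > min_dp and GQ > min_gq (strictly greater)."""
    return record["DP"] > min_dp and record["GQ"] > min_gq

qc_passed = [v["id"] for v in variants if passes_variant_qc(v)]
print(qc_passed)
```

In the tutorial data, this step removes 50,691 - 48,521 = 2,170 variants before the Inheritance Model container is evaluated.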
Next, we will deep dive into variants of Dominant de novo inheritance. For easier viewing, minimize the Variant QC filter container.
Dominant de Novo
The first filter logic in the inheritance model filter container is variants of Dominant de novo. The Dominant de novo filter logic is based on the Mendel Error, Gene Inheritance and the ACMG Auto Classification.
The Mendel Error algorithm computes the Mendelian inheritance for the child’s genotype using the pedigree information, and for this inheritance pattern, de novo allele is selected. This filter indicates that the child’s genotype shows 152 de novo mutations that are not present in either parent. This can also be understood by looking at the genotype zygosity for the proband, mother and father in the variant table. The proband has heterozygous calls whereas the parents are either reference or missing.
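The zygosity pattern described above can be sketched as a simple check. This is a simplified illustration only and not the Mendel Error algorithm itself; the zygosity labels and the function name are assumptions chosen to mirror the wording of the variant table.

```python
# Assumed zygosity labels; VarSeq's actual field values may differ.
REF_OR_MISSING = {"Reference", "Missing"}

def looks_de_novo(proband, mother, father):
    """True when the proband is heterozygous and neither parent carries the variant."""
    return (
        proband == "Heterozygous"
        and mother in REF_OR_MISSING
        and father in REF_OR_MISSING
    )

print(looks_de_novo("Heterozygous", "Reference", "Missing"))
print(looks_de_novo("Heterozygous", "Heterozygous", "Reference"))
```

The first call matches the de novo pattern, while the second does not because the mother is heterozygous.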
The next filter card is Gene Inheritance, in which Dominant is selected. This filter logic is pulled from the ACMG Sample Classifier, but is based on the known gene inheritance from OMIM. This filter results in 9 variants where the Gene Inheritance is Dominant. The last filter logic applied is from the ACMG Sample Classifier, which utilizes the available variant evidence and scores it according to the ACMG guideline criteria. Together, this filter logic identified two Pathogenic de novo mutations with a Dominant inheritance pattern. At the bottom of the Dominant de novo filter container, clicking on the 2 in the filter logic will display the variants that have passed the filtering restrictions for this inheritance pattern in the variant table.
We can also visualize the data in GenomeBrowse. To open GenomeBrowse, select the GenomeBrowse tab near the Filter Variants and ACMG Guidelines tabs. Next, click on the second variant in your variant table and it will be displayed in the interface. Here you can see that this variant is heterozygous in the proband but absent in the mother and father.
Next, we will take a look at the Dominant Inherited filter logic. Select the Filter Variants tab to return to the original view.
The Dominant Inherited filter logic is based on Mendel Error, Zygosity, Gene Inheritance, and the ACMG Sample Classifier. The unique filter card used in this filter logic is Genotype Zygosity, which is a computed algorithm. This algorithm is applied in the Either Parent is Heterozygous container, which when expanded will provide more details. Specifically, this card is filtering for variants that are heterozygous in the father and reference or missing in the mother, or heterozygous in the mother and reference or missing in the father.
The next filter cards identify transmitted variants that are Heterozygous in the proband as well as Dominant for the gene inheritance pattern. Combined with the ACMG Auto Classification, this filter logic identified 6 variants that were transferred from either the mother or the father to the proband. You can click on these dominant inherited variants in the variant table and open Genome Browse to visualize which parent transferred the variant to the proband. For example, if you click on the row that contains the T/C variant at chromosome and position 7:151935853, you will see in Genome Browse that the allele is inherited from the father.
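The Either Parent is Heterozygous logic, combined with a heterozygous proband, can be sketched as follows. This is an illustrative simplification with assumed zygosity labels and a hypothetical function name; it is not VarSeq's implementation.

```python
# Assumed zygosity labels; VarSeq's actual field values may differ.
REF_OR_MISSING = {"Reference", "Missing"}

def transmitting_parent(proband, mother, father):
    """Return 'father' or 'mother' when exactly one parent is heterozygous
    and the proband is heterozygous; otherwise return None."""
    if proband != "Heterozygous":
        return None
    if father == "Heterozygous" and mother in REF_OR_MISSING:
        return "father"
    if mother == "Heterozygous" and father in REF_OR_MISSING:
        return "mother"
    return None

print(transmitting_parent("Heterozygous", "Reference", "Heterozygous"))
```

A result of None covers both the de novo pattern (neither parent heterozygous) and the case where both parents are heterozygous, neither of which belongs to this filter.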
Once finished viewing, reopen the Filter Variants tab to investigate the next filter logic: Recessive Homozygous variants.
A recessive homozygous mutation occurs when there are two copies of the same recessive alternate allele in the proband that were transmitted by the parents. This mode of inheritance can be seen in the applied filter logic, where the parents are heterozygous and the proband is homozygous.
The next filter card is Gene Inheritance and the two options that are selected are Default (Recessive) and Recessive. Default (Recessive) is for a gene where the inheritance has not yet been identified in OMIM as recessive or dominant, whereas the Recessive option is when the gene inheritance has been confirmed. You can also see that there are 18 variants that have a Gene Inheritance of Missing. These variants likely lack genotype zygosity, are reference alleles, or are not in a known gene. Together with the Auto Classification, there are 4 variants identified as Recessive Homozygous, which can be selected and viewed in the variant table and in Genome Browse.
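The zygosity pattern behind the Recessive Homozygous filter (both parents heterozygous, proband homozygous) can be sketched in a few lines. The labels "Heterozygous" and "Homozygous Variant" and the function name are assumptions for illustration, not VarSeq field values.

```python
def looks_recessive_homozygous(proband, mother, father):
    """True when both parents are heterozygous carriers and the proband
    is homozygous for the alternate allele."""
    return (
        proband == "Homozygous Variant"
        and mother == "Heterozygous"
        and father == "Heterozygous"
    )

print(looks_recessive_homozygous("Homozygous Variant", "Heterozygous", "Heterozygous"))
```

In the template, this zygosity check is only one part of the filter; the Gene Inheritance and Auto Classification cards described above narrow the result further.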
At this point, we will now take a look at Recessive Compound Heterozygous mutations.
Recessive Compound Heterozygous
A compound heterozygous polymorphism refers to a child that has inherited two different heterozygous variants within the same gene, one from each parent. This could result in both copies of the gene being potentially affected. This type of polymorphism should also alter the amino acid sequence, or be classified as a non-synonymous variant. The filter container implements the Compound Het algorithm, which can be found by going to Add>Computed Data>Compound Het. It is important to note that this algorithm is computed on the variants in the selected filter chain and not all variants originally imported. Thus, if a filter logic above this filter card is modified, this algorithm will need to be rerun.
Using the Auto Classification, Compound Het algorithm and Gene Inheritance filter logic resulted in 19 variants. To see further specifics, click on the Compound Het Genes table, next to the variant table tab. In the left dialog you can see the Gene Names and if the gene has a compound het polymorphism, as well as the variant information in the right dialog. If you click on a gene in the Gene Name column, the variant table will display the variants that fall within that gene. For example, if you click on the Gene Name: ANKRD36, you can see that this gene has three variants, one De novo variant and two heterozygous mutations inherited from the mother. Once finished, change the display back to the variant table.
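The core idea of compound heterozygosity, two heterozygous variants in the same gene with one inherited from each parent, can be sketched by grouping variants by gene. This is a simplified illustration under stated assumptions: the gene names, the `origin` field, and the function name are hypothetical, and VarSeq's Compound Het algorithm may treat cases such as de novo variants differently.

```python
from collections import defaultdict

def compound_het_genes(variants):
    """Return genes with at least one maternally and one paternally inherited variant."""
    origins_by_gene = defaultdict(set)
    for variant in variants:
        origins_by_gene[variant["gene"]].add(variant["origin"])
    return sorted(
        gene
        for gene, origins in origins_by_gene.items()
        if {"mother", "father"} <= origins
    )

# Hypothetical heterozygous variants in the proband, already past upstream filters.
proband_variants = [
    {"gene": "GENE_A", "origin": "mother"},
    {"gene": "GENE_A", "origin": "father"},
    {"gene": "GENE_B", "origin": "mother"},
    {"gene": "GENE_B", "origin": "mother"},
]
print(compound_het_genes(proband_variants))
```

Because the function only sees the variants it is given, changing an upstream filter changes its input, which is why the algorithm needs to be rerun after the filter chain above it is modified.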
X-linked mutations occur in sex chromosomes or non-autosomal regions. The first filter segments the X-chromosome into specific regions to rule out pseudo-autosomal regions (PAR). PAR regions are homologous sequences of nucleotides on the X and Y chromosomes and any genes within them are inherited just like any autosomal genes. The next filter logic that is applied is the zygosity, which selects for hemizygous (only one copy of a gene is present) and homozygous mutations. When the Auto Classification is applied, however, there are 0 variants that are potentially pathogenic in the X chromosome.
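Excluding the pseudo-autosomal regions amounts to an interval check on X-chromosome positions. The sketch below uses the commonly cited GRCh37 coordinates for PAR1 and PAR2 on chromosome X; these values are an assumption that is not stated in this tutorial and should be verified against the assembly and annotation in use.

```python
# Commonly cited GRCh37 chrX PAR coordinates (1-based, inclusive).
# Assumption: not taken from this tutorial; verify for your assembly.
PAR_REGIONS_GRCH37_X = [
    (60_001, 2_699_520),         # PAR1
    (154_931_044, 155_260_560),  # PAR2
]

def in_par(position):
    """True when a chrX position falls inside PAR1 or PAR2."""
    return any(start <= position <= end for start, end in PAR_REGIONS_GRCH37_X)

def is_x_linked_candidate(chrom, position):
    """True for chrX variants outside the pseudo-autosomal regions."""
    return chrom == "X" and not in_par(position)

print(is_x_linked_candidate("X", 50_000_000))
print(is_x_linked_candidate("X", 1_000_000))
```

The second call returns False because the position lies within PAR1, where genes are inherited like autosomal genes.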
Next, we will take a look at Pathogenic in IF Genes.
Pathogenic in IF Genes
The Pathogenic in Incidental Finding (IF) Genes filter card is based on the ACMG recommendations for mutations in certain genes that should be reported to individuals because of their high potential medical importance. There are currently 59 ACMG IF genes, which are listed at https://www.ncbi.nlm.nih.gov/clinvar/docs/acmg/, and the list can be seen by selecting the details icon and clicking on the column in your variant table titled In ACMG IF Genes?. Only one variant passes this filter logic, and it is in the SMAD4 gene.
Now that we have explored the filter logic for a trio workflow, let's focus on using VSClinical to evaluate variants according to the ACMG guidelines.
To open VSClinical, select the ACMG Guidelines tab. For viewing purposes, let's also hide the variant table by hovering over the bottom right corner of the project and selecting the stash icon that will appear.
If a screen called Annotation Versions and Download is shown, Update and Download the most recent catalogs and annotations. The next interface displays the family association, as well as sex and affection status for the proband. The associations are correct, so select Start New Evaluation.
At this point we are now in the VSClinical ACMG interface and the first step will be to add the variants to evaluate. Under the Evaluation Summary click on the Add Variants icon, and then select Add Variants from Project in the next dialog.
This option displays all filtered variants in your project, but for this tutorial we will focus on a de novo SMAD4 variant. To select this variant, first click on Clear All, then locate and choose the SMAD4 p.I500V variant, near the bottom, and click Prepare to Add. The variant will then populate on the right-hand side, where you can select Add 1 Variants. This will load the variant into VSClinical to be evaluated.
Once the variant has been loaded, navigate to the Variants tab to begin the evaluation.
SMAD4 p.I500V Frequency
The example variant we will be evaluating using the ACMG guidelines is a missense mutation in the SMAD4 gene. The Evidence Summary will provide a comprehensive description of the variant and the recommended scoring criteria. To begin the evaluation, we will first look at the frequency of this variant in different population databases. Scroll down to Population frequency.
In the Population tab you can investigate the frequency of this variant observed in gnomAD Exomes and 1kG Phase 3. By default, this interface will display the highest observed allele frequency in the databases, which according to gnomAD is in Non Finnish European at a frequency of 1 in 113754 individuals or 0.0009%. With a low or absent frequency in the population catalogs, we can answer Yes to the first relevant scoring criterion, PM2, which states that the variant is absent from controls in population catalogs. Notice that VSClinical will provide you with the auto recommendation to score the variant as well as the reasoning for the selection.
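The percentage quoted above follows directly from the 1 in 113754 figure, as this short arithmetic check shows. The variable names are illustrative only.

```python
observed = 1
total = 113_754

frequency = observed / total   # fraction
percent = frequency * 100      # expressed as a percentage

print(f"{percent:.4f}%")  # prints 0.0009%
```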
We will next investigate the impact of this variant on the gene.
SMAD4 p.I500V Gene Impact
The first dropdown is the Gene and Transcript. This section provides relevant information including the Gene Name and NCBI description, as well as the ability to change the transcript. By clicking on the blue Transcript(s) icon in the Gene Impact dialog, different transcripts can be selected, which will be stored for all future analysis of the gene. The default transcript selected is based on clinical relevance and other heuristics that are outlined in this blog: https://blog.goldenhelix.com/whats-in-a-name-the-intricacies-of-identifying-variants/. For this gene there is only one transcript, so click Close.
The next section is the Gene Region and Mutation Profile, which provides evidence of previously classified variants in ClinVar and their associated review status. For this interface you can change the settings to look at Missense mutations that occur within the same Exon that have been classified as Pathogenic. This results in 3 variants located within 6 amino acid positions of the SMAD4 p.I500V variant, which indicates that this variant is located in a mutational hotspot.
With this evidence we can answer the criteria for PM1, as this variant is located in a mutational hotspot and there are no benign variants in the region.
Missense as Mechanism of Disease
To determine the missense mutation rate for the SMAD4 gene, we can focus on the next section, Missense as Mechanism of Disease. For this gene, the Z-score produced by the gnomAD project is high, 4.13. This indicates that there is a low rate of benign missense mutations for this gene and that missense variants are a common mechanism of disease. Together, this provides the first supporting evidence criterion, PP2.
In the Computational Evidence section, VSClinical displays Functional Prediction algorithms such as: SIFT, PolyPhen2, PhyloP and GERP++. These algorithms indicate that the variant is predicted to be damaging (SIFT and PolyPhen) and occur in a conserved region (PhyloP and GERP). Together, this evidence indicates that multiple lines of computational evidence support a deleterious effect on the gene product. This applies All Deleterious to the PP3 criteria. Answering All Deleterious brings our current classification for the SMAD4 p.I500V to a Likely Pathogenic classification. Next we will move on to the Studies tab to determine if there are any publications supporting a pathogenic classification.
SMAD4 p.I500V Clinical and Functional Studies
The Clinical and Functional Studies section allows users to search for relevant literature regarding the variant in question. This feature provides the options to search Google, Google Scholar or PubMed and will provide all relevant search terms.
Beyond these searching capabilities, detailed assessments and citations of lab submissions for this variant can be reviewed using the ClinVar Assessment for This Variant search engine. There are 25 ClinVar assessments for this variant, which show the source of interpretation, the guidelines used, and a summary that can be added to the evaluation. As an example, locate the interpretation provided by Invitae. Next, hover over the blue plus icon at the end of the summary excerpt and select Add Text to My Clinical Summary. When references are listed, you can also incorporate them into your evaluation for the variant by clicking on the blue plus icon. This information will populate on the left side and will be incorporated into your evaluation.
With the available literature supporting findings that this variant has been previously classified as pathogenic, we can then answer Yes to both PS1 and PM5, which brings the final predicted classification to Pathogenic.
Inheritance and Allelic State
The Inheritance and Allelic State section contains scoring criteria that can be applied when it is known whether the disorder is absent or present in the family. For this patient, we know that the variant of interest is a de novo variant, but we do not have evidence from identity testing that both parental samples come from the biological parents. Even though the parents have not been confirmed as the biological parents, we have the option to assume that they are. With this evidence, we will answer Yes to De novo variant and the patient has the disorder but no family history and Unconfirmed (Assumed) to Maternity and paternity confirmed, which applies the scoring criteria PM6. Since we have now answered all relevant criteria, let’s go back to the Variant Interpretation section near the top of the VSClinical interface.
Variant Interpretation for Sample
At this point all of the criteria have been accounted for, resulting in a final classification of Pathogenic following rule iii of the ACMG guidelines: 1 strong, 4 moderate, and 2 supporting criteria. This can be viewed in the ACMG Classification dropdown.
With the final classification of Pathogenic, the next step is to include this finding in the report. Scroll down to the Variant Interpretation for This Sample section. The interpretation created from answering the criteria can be automatically populated into the report by clicking on the Add to Interpretation icon. Then on the left-hand side select Primary Findings for the option next to Reporting As:. Selecting the For Disorder: box will provide a list of associated phenotypes based on the OMIM annotation source; select Myhre Syndrome. Lastly, you should see the added interpretation created from the evaluation in the Interpretation section.
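To show how the answered criteria combine into a classification, here is a minimal sketch of the pathogenic-side combining rules from the 2015 ACMG/AMP guidelines. It is not VSClinical's implementation: the function names are hypothetical, evidence strength is inferred from the standard code prefixes (PVS, PS, PM, PP), and benign criteria and conflicting evidence are deliberately left out.

```python
def count_strengths(criteria):
    """Count criteria by strength, inferred from the code prefix."""
    counts = {"PVS": 0, "PS": 0, "PM": 0, "PP": 0}
    for code in criteria:
        for prefix in ("PVS", "PS", "PM", "PP"):
            if code.startswith(prefix):
                counts[prefix] += 1
                break
    return counts

def classify(criteria):
    """Apply the pathogenic and likely pathogenic combining rules only."""
    c = count_strengths(criteria)
    vs, s, m, p = c["PVS"], c["PS"], c["PM"], c["PP"]

    pathogenic = (
        (vs >= 1 and (s >= 1 or m >= 2 or (m >= 1 and p >= 1) or p >= 2))  # rule i
        or s >= 2                                                          # rule ii
        or (s >= 1 and (m >= 3 or (m >= 2 and p >= 2) or (m >= 1 and p >= 4)))  # rule iii
    )
    if pathogenic:
        return "Pathogenic"

    likely_pathogenic = (
        (vs >= 1 and m >= 1)
        or (s >= 1 and m >= 1)
        or (s >= 1 and p >= 2)
        or m >= 3
        or (m >= 2 and p >= 2)
        or (m >= 1 and p >= 4)
    )
    if likely_pathogenic:
        return "Likely Pathogenic"
    return "Uncertain Significance"

# Criteria answered in this tutorial, in the order they were scored.
after_computational = ["PM2", "PM1", "PP2", "PP3"]
all_criteria = after_computational + ["PS1", "PM5", "PM6"]

print(classify(after_computational))
print(classify(all_criteria))
```

With the four criteria scored through the Computational Evidence section (2 moderate, 2 supporting) the sketch returns Likely Pathogenic, and with all seven criteria (1 strong, 4 moderate, 2 supporting) it returns Pathogenic, matching the progression described in this tutorial.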
Once the information has been filled in, the variant interpretation can be saved by selecting Review & Save Now. In the next dialog, select Save & Review. This will store your interpretation for this variant in your assessment catalog and allow you to render the clinical report. Selecting this option will bring you back to the main Evaluation tab.
The next step will be to generate the clinical report. To do so, click on the Report tab.
The new reporting system is based on Word, which allows easy customization to tailor the reports to your specific lab requirements. The main interface allows the user to validate that the information is correct. The information can be edited in the block or in the section used to provide the information. Once the information has been validated, select the greyed-out Word report icon on the right side.
This will then give you the option to select from three report templates offered by Golden Helix. For this tutorial we will select Mendelian Disorder Template under the Copy System Template option and name it ACMG Trio Template under New Template Name. Click Create.
This will then bring you back to the main interface, where you can click the Render icon. This will generate the report as a Word document.
This concludes the VarSeq VSClinical ACMG Guidelines Trio workflow tutorial.
This tutorial was designed to give a taste of all the features and capabilities of VarSeq and a brief orientation to key features. If you would like to learn more about the unique capabilities and functionalities of our software, we have great webinars that can be accessed using this link: https://www.goldenhelix.com/resources/webcasts/index.html.
If you are interested in getting a demo license to try out additional features that require an active license, such as creating a project, adding annotation sources, and saving projects, please request a demo from: Discover VarSeq
If you have an active license, we encourage you to try out the intermediate tutorial on Cancer Gene Panels: Cancer Gene Panel Tutorial
Additional features and capabilities are being added all the time, so if you do not see a feature you need for your workflows please do not hesitate to let us know!