Incorporating gnomAD v4 into the ACMG Auto-Classifier

February 14, 2024

We at Golden Helix are thrilled with our recent release of gnomAD v4! Curating this massive database was a huge undertaking for our development team, but with the new features it brings, we believe it was well worth the wait. This annotation track contains allele frequencies from a more diverse population than ever before, along with functional effect scores from REVEL, CADD, PhyloP, SpliceAI, and Pangolin. Many of our customers have been asking how to incorporate this new track into the ACMG auto-classifier for use in VSClinical. Have no fear; this handy guide will take you step-by-step through the process.

Before we get started, it is important to note that in VarSeq version 2.5.0, you will have to add this database to the classifier frequencies manually. In VarSeq 2.6.0 and onward, the gnomAD v4 Joint Frequencies Track will be added automatically (read more about the gnomAD v4 Joint Frequencies Track here). That said, for templates made in VarSeq 2.5.0 and prior, you will still need to manually add gnomAD v4 to your workflow. With that, let’s get started!

First, you will want to go to your Data Source Library and download the gnomAD database of your choice (Figure 1). Here we have gnomAD v4 for Exomes, Genomes, and the Joint Variant Frequencies track. Make sure you start this process well ahead of adding it to the classifier, as these files are quite large, and downloading may take some time.

Figure 1: Download the gnomAD v4 database.

After gnomAD v4 has been downloaded, go to the Gear Icon in VSClinical (Figure 2). This is where you can change your workflow options.

Figure 2: Workflow options.

Inside the Project Options menu, go to ACMG Frequency Sources (Figure 3).

Figure 3: ACMG Frequency Sources.

In the Add New Source menu, you will see a variety of databases already available for use as frequency sources (Figure 4). These include older versions of gnomAD Genomes, TOPMed, GenomeAsia, and more. Here we will add a Custom Source for gnomAD v4.

Figure 4: Adding a Custom Source.

If you want to be specific to Exomes or Genomes, you can add one of those databases. In future versions of VarSeq, the default will be the gnomAD Joint Variant Frequencies v4 (Figure 5).

Figure 5: gnomAD v4 options available for download.

Once you have selected your version of gnomAD, you will be brought to the Field Mapping screen (Figure 6). Here you will notice that not all of the Destination Fields are mapped to corresponding Source Fields. In addition to the Filter and Population Allele Number fields, the classifier needs at least the Population Alt Allele Count and Population Homozygous Count fields. Mapping these requires a computed field that brings in the source field as the correct data type. First, go to the Population Alt Allele Count field, select the Computed Field option, then click Edit.

Figure 6: Manually mapping fields through Computed Field.

In the Expression Editor, enter {AlleleCount} as the Current Expression (Figure 7). Leave the other options as they are, then click OK.

Figure 7: Editing a current expression.

Repeat the same steps for the Population Homozygous Count Destination Field, this time entering {HomozygousCount} as the expression (Figure 8). When you are done, click OK.

Figure 8: Adding the Population Homozygous Count to the Computed Field.
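
To summarize the mapping steps above, here is a minimal Python sketch of the finished field mapping. It is only an illustration, not VarSeq's own expression engine: the destination field names and the two expressions come from this guide, while the `evaluate` helper, the assumption that Filter and Population Allele Number are already mapped, the choice of an integer as the "correct data type", and the sample row values are all hypothetical.

```python
import re

# Destination fields this guide names as the minimum the classifier needs.
REQUIRED_FIELDS = {
    "Filter",
    "Population Allele Number",
    "Population Alt Allele Count",
    "Population Homozygous Count",
}

# Assumption: these two are already mapped on the Field Mapping screen.
AUTO_MAPPED = {"Filter", "Population Allele Number"}

# Computed-field expressions entered by hand in the steps above.
COMPUTED = {
    "Population Alt Allele Count": "{AlleleCount}",
    "Population Homozygous Count": "{HomozygousCount}",
}


def unmapped_fields(auto_mapped, computed):
    """Return the required destination fields that still have no source."""
    return sorted(REQUIRED_FIELDS - set(auto_mapped) - set(computed))


def evaluate(expression, source_row):
    """Toy evaluator: resolve one {FieldName} reference and cast it to int."""
    match = re.fullmatch(r"\{(\w+)\}", expression)
    if match is None:
        raise ValueError(f"unsupported expression: {expression!r}")
    return int(source_row[match.group(1)])


# Hypothetical source row; the values are made up for illustration.
source_row = {"AlleleCount": "12", "HomozygousCount": "0"}
resolved = {dest: evaluate(expr, source_row) for dest, expr in COMPUTED.items()}

print(unmapped_fields(AUTO_MAPPED, COMPUTED))  # empty list once both are mapped
print(resolved)
```

If either computed field is left out of `COMPUTED`, `unmapped_fields` reports it, which mirrors the check you would do by eye on the Field Mapping screen before clicking OK.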

You should see the new gnomAD v4 database added to the ACMG Frequency Sources menu (Figure 9). When you are ready, click OK.

Figure 9: Exploring the ACMG Frequency Sources.

When starting a new evaluation in VSClinical, there will now be a new gnomAD v4 field available under the Population Frequency menu (Figure 10). To keep these changes for future workflows, make sure to update your project template with the new gnomAD v4 database.

Figure 10: The new gnomAD v4 field under Population Frequency in VSClinical.
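
For context on why the count fields mapped earlier matter: a population allele frequency is conventionally the alternate allele count divided by the allele number, which is presumably why the classifier cannot work from the frequency source without them. The sketch below shows that arithmetic in Python. The `allele_frequency` function and the sample record are hypothetical, and this is not a description of how VSClinical computes or applies frequencies internally.

```python
from typing import Optional


def allele_frequency(alt_allele_count: int, allele_number: int) -> Optional[float]:
    """Return AC / AN, or None when no alleles were genotyped at the site."""
    if allele_number <= 0:
        return None
    return alt_allele_count / allele_number


# Hypothetical record keyed by the destination field names from this guide.
record = {
    "Population Alt Allele Count": 12,
    "Population Allele Number": 240_000,
    "Population Homozygous Count": 0,
}

freq = allele_frequency(
    record["Population Alt Allele Count"],
    record["Population Allele Number"],
)
print(f"allele frequency: {freq:.6f}")
print(f"homozygous individuals: {record['Population Homozygous Count']}")
```

Returning `None` for a zero allele number keeps "no data at this site" distinct from a true frequency of zero.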

Please let us know at support@goldenhelix.com if you have any problems or questions about this process!

About Jennifer Dankoff

Jennifer has been a Field Application Scientist (FAS) with Golden Helix since September 2021. She has a PhD in Microbiology and Immunology from Montana State University and is passionate about working with customers to fulfill their NGS analysis needs. When she isn't working with customers or writing blogs, Jennifer can be found hiking in the mountains or playing softball.
