Annotation Tracks Required for PGx Analysis in VarSeq PGx

The PGx Variant Detection and Recommendations algorithm is the driving force behind the pharmacogenomic analysis capabilities provided by VSPGx in VarSeq. This algorithm is used to identify pharmacogenomic diplotypes and annotate them against drug recommendations. In this blog post, we outline the steps involved in the process and explain the annotation tracks utilized by the algorithm at each stage of the analysis.

Diplotype Calling

The first step of this process is the calling of diplotypes for each pharmacogene. For autosomal genes, this involves assigning a combination of two haplotypes, or named alleles. These alleles are typically described using star allele notation, which identifies a combination of pharmacogenomic variants in a gene using a designated number. For example, the star allele notation for the deletion of a single base pair at coding position 775 in CYP2D6 is *3.

To identify the named alleles present in a given sample, the algorithm uses the Allele Definition track. This track provides definitions for all named alleles in the database that are defined by the presence of one or more small variants.
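To make the idea of matching a sample against allele definitions concrete, here is a minimal sketch. It is not the VSPGx implementation: the definitions table, the variant labels (`var_A`, `var_B`), and the assumption that variants are already phased onto two haplotypes are all hypothetical simplifications for illustration.

```python
from typing import Dict, FrozenSet, Set

# Hypothetical allele definitions: star allele -> defining small variants.
# The variant labels are placeholders, not real CPIC entries.
ALLELE_DEFINITIONS: Dict[str, FrozenSet[str]] = {
    "*1": frozenset(),                    # reference allele: no variants
    "*2": frozenset({"var_A"}),
    "*3": frozenset({"var_A", "var_B"}),
}


def match_named_allele(observed: Set[str]) -> str:
    """Return the named allele whose defining variants are all observed,
    preferring the definition with the most variants."""
    candidates = [
        name
        for name, variants in ALLELE_DEFINITIONS.items()
        if variants <= observed  # every defining variant is present
    ]
    # "*1" always qualifies (empty definition), so candidates is never empty.
    return max(candidates, key=lambda name: len(ALLELE_DEFINITIONS[name]))


def call_diplotype(hap1: Set[str], hap2: Set[str]) -> str:
    """Combine the named alleles of two (assumed phased) haplotypes."""
    alleles = sorted([match_named_allele(hap1), match_named_allele(hap2)])
    return "/".join(alleles)


# One haplotype carries both placeholder variants, the other carries none.
print(call_diplotype({"var_A", "var_B"}, set()))  # *1/*3
```

A real caller also has to handle unphased genotypes and ambiguous matches, which this sketch deliberately leaves out.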

For some genes, the relevant phenotype is influenced by the presence of structural variations (SVs) and copy number variations (CNVs). One such gene is CYP2D6, for which the ultrarapid metabolizer phenotype is predicted based on duplications of normal function alleles. Additionally, both the complete deletion of CYP2D6 (*5) and the fusion of CYP2D6 with CYP2D7 (*13/*68) are considered no-function alleles.

Although VSPGx does not currently detect SV and CNV alleles from NGS data, these alleles are considered by the annotation algorithm if specified in the sample manifest. The algorithm matches each sample field name against allele names defined in the Structural Variant track, using these matching fields to inform the diplotype caller of externally called CNVs and SVs. For more details on this process, see our blog post on CNV Import in PGx Genes.
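The field-name matching described above might look roughly like the following sketch. The manifest column naming (`CYP2D6*5`), the truthy values, and the `external_sv_alleles` helper are assumptions made for illustration; the CNV Import blog post describes the format VSPGx actually expects.

```python
from typing import Dict, List

# Hypothetical rows from a Structural Variant track.
STRUCTURAL_VARIANTS: List[Dict[str, str]] = [
    {"Haplotype Name": "*5", "Gene Name": "CYP2D6", "Type": "Deletion"},
    {"Haplotype Name": "*13", "Gene Name": "CYP2D6", "Type": "Fusion"},
]

# Assumed convention: a manifest value in this set means "allele present".
TRUTHY = {"true", "1", "yes"}


def external_sv_alleles(manifest_row: Dict[str, str]) -> List[Dict[str, str]]:
    """Return the SV alleles flagged as present in one sample's manifest row."""
    # Assumed convention: the manifest field name is gene name + allele name.
    sv_by_field = {
        f"{sv['Gene Name']}{sv['Haplotype Name']}": sv
        for sv in STRUCTURAL_VARIANTS
    }
    return [
        sv_by_field[field]
        for field, value in manifest_row.items()
        if field in sv_by_field and value.strip().lower() in TRUTHY
    ]


row = {"Sample": "S1", "CYP2D6*5": "true", "CYP2D6*13": "false"}
for sv in external_sv_alleles(row):
    print(sv["Gene Name"], sv["Haplotype Name"], sv["Type"])  # CYP2D6 *5 Deletion
```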

Phenotype Determination

Once diplotypes have been assigned to each gene, the algorithm searches the CPIC Diplotypes annotation track to determine the associated phenotype. This track defines the phenotype associations and activity scores for all possible diplotype pairs and provides lookup keys that can be used to match against specific drug recommendations. Each phenotype is matched to a specific drug, with drug details provided by the PGx Drugs annotation track.
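Conceptually, this step is a table lookup from (gene, diplotype) to a lookup key. The sketch below uses a placeholder gene (`GENE1`) and invented rows to show the shape of that lookup; the order-insensitive matching of the two alleles is an assumption, not a documented behavior of VSPGx.

```python
from typing import Dict, List, Optional

# Hypothetical Diplotype track rows using the required field names.
DIPLOTYPE_TRACK: List[Dict[str, str]] = [
    {"Diplotype": "*1/*1", "Gene Name": "GENE1", "Lookup Key": "Normal Metabolizer"},
    {"Diplotype": "*1/*3", "Gene Name": "GENE1", "Lookup Key": "Intermediate Metabolizer"},
    {"Diplotype": "*3/*3", "Gene Name": "GENE1", "Lookup Key": "Poor Metabolizer"},
]


def lookup_key_for(gene: str, diplotype: str) -> Optional[str]:
    """Find the lookup key for a diplotype, ignoring allele order."""
    first, second = diplotype.split("/")
    accepted = {f"{first}/{second}", f"{second}/{first}"}
    for row in DIPLOTYPE_TRACK:
        if row["Gene Name"] == gene and row["Diplotype"] in accepted:
            return row["Lookup Key"]
    return None  # diplotype not present in the track


print(lookup_key_for("GENE1", "*3/*1"))  # Intermediate Metabolizer
```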

Recommendations

Once a phenotype has been assigned for each gene and drug defined in the Gene-Drug Pair track, recommendations are provided by annotating against the Recommendation track. This track links gene phenotype combinations to specific drug recommendations using a lookup key field, which defines the associated gene phenotypes required to make the recommendation.
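The lookup-key matching can be sketched as follows. The drug and gene identifiers, the recommendation text, and the choice to store lookup keys as a per-gene dictionary are all hypothetical; the actual track stores these values in its own format.

```python
from typing import Dict, List, Optional

# Hypothetical Recommendation track rows. Each row lists the lookup key
# that every involved gene must have for the recommendation to apply.
RECOMMENDATION_TRACK: List[dict] = [
    {
        "Drug ID": "DRUG1",
        "Lookup Keys": {"GENE1": "Poor Metabolizer"},
        "Recommendation": "Placeholder text: consider an alternative drug.",
    },
    {
        "Drug ID": "DRUG2",
        "Lookup Keys": {"GENE1": "Poor Metabolizer", "GENE2": "Normal Metabolizer"},
        "Recommendation": "Placeholder text: consider a reduced dose.",
    },
]


def recommend(drug_id: str, lookup_keys: Dict[str, str]) -> Optional[str]:
    """Return the first recommendation whose per-gene keys all match."""
    for row in RECOMMENDATION_TRACK:
        if row["Drug ID"] != drug_id:
            continue
        required = row["Lookup Keys"]
        if all(lookup_keys.get(gene) == key for gene, key in required.items()):
            return row["Recommendation"]
    return None  # no recommendation for this drug and phenotype combination


sample_keys = {"GENE1": "Poor Metabolizer", "GENE2": "Normal Metabolizer"}
print(recommend("DRUG2", sample_keys))
```

Requiring every gene's key to match is what lets a single row express a multi-gene recommendation.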

Required Annotation Tracks and Fields

This process relies on six different annotation tracks, with each having a specific set of required fields. While the default allele definitions and recommendations are based on the CPIC Guidelines, custom annotations can be used, provided that these required fields are present.

The fields required for each annotation track are outlined below:

Allele Definition Track:

Provides the star allele definitions used by the diplotype caller.
Default: CPIC Variants
Required Fields:
- Ref/Alt: Reference and Alternate alleles in the format Ref/Alt(s)
- Haplotype Name: Variant haplotype name
- Gene Name: Name of the gene
- Entrez Gene ID: Entrez gene identifier
- Core Variant: True if the variant is a core variant of the allele
- Reference Allele: True if all variants in the allele match the reference sequence

Structural Variant Track:

Defines structural variants which may be specified in the sample manifest.
Default: CPIC Structural Variants
Required Fields:
- Haplotype Name: Variant haplotype name
- Gene Name: Name of gene
- Entrez Gene ID: Entrez gene identifier
- Type: Type of structural variant (e.g. Deletion, Fusion, Possible Duplication)

Gene-Drug Pair Track:

Provides the list of gene-drug associations.
Default: CPIC Gene Drug Pairs
Required Fields:
- Gene Name: Name of gene
- Entrez Gene ID: Entrez gene identifier
- Drug ID: Drug identifier

Diplotype Track:

Defines the list of diplotypes along with their drug phenotype associations.
Default: CPIC Diplotypes
Required Fields:
- Diplotype: Name of diplotype
- Gene Name: Name of gene
- Entrez Gene ID: Entrez gene identifier
- Lookup Key: Key used to match diplotypes to recommendations, typically associated phenotype or activity score

Recommendation Track:

Provides the drug recommendations associated with specific gene phenotype combinations.
Default: CPIC Recommendations
Required Fields:
- Drug ID: Drug identifier
- Gene Names: Names of the genes
- Entrez Gene IDs: Entrez gene identifiers
- Lookup Keys: Keys for each gene to match recommendations to diplotypes
- Recommendation: Recommendation text

Drug Track:

Provides details on all drugs with associated recommendations
Default: PGx Drugs
Required Fields:
- Drug ID: Identifier for the drug
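When preparing a custom annotation source, it can help to check its fields against the lists above before using it. The sketch below encodes those required fields; the `missing_fields` helper and the way field names are passed in are hypothetical and not part of VarSeq.

```python
from typing import Dict, Iterable, List

# Required fields per track type, as listed in this post.
REQUIRED_FIELDS: Dict[str, List[str]] = {
    "Allele Definition": [
        "Ref/Alt", "Haplotype Name", "Gene Name",
        "Entrez Gene ID", "Core Variant", "Reference Allele",
    ],
    "Structural Variant": ["Haplotype Name", "Gene Name", "Entrez Gene ID", "Type"],
    "Gene-Drug Pair": ["Gene Name", "Entrez Gene ID", "Drug ID"],
    "Diplotype": ["Diplotype", "Gene Name", "Entrez Gene ID", "Lookup Key"],
    "Recommendation": [
        "Drug ID", "Gene Names", "Entrez Gene IDs",
        "Lookup Keys", "Recommendation",
    ],
    "Drug": ["Drug ID"],
}


def missing_fields(track_type: str, available: Iterable[str]) -> List[str]:
    """List required fields absent from a custom track's field names."""
    present = set(available)
    return [name for name in REQUIRED_FIELDS[track_type] if name not in present]


# A custom diplotype source that lacks the lookup key column.
print(missing_fields("Diplotype", ["Diplotype", "Gene Name", "Entrez Gene ID"]))
# ['Lookup Key']
```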

Conclusion

I hope this blog post has provided a better understanding of the annotations used by the PGx Variant Detection and Recommendations algorithm. To see VSPGx in action, watch our webcast where we provide a complete demonstration of the product’s features, such as identifying actionable pharmacogenomic diplotypes and generating clinical reports.

If you’re searching for a PGx analysis solution and would like to try the software, contact us at [email protected] or visit the VSPGx webpage here.

About Nathan Fortier
