Leveraging ClinVar Curated Databases in VarSeq

May 13, 2025

ClinVar is a global, publicly accessible database maintained by the National Center for Biotechnology Information (NCBI) that archives interpretations of human genetic variants and their clinical relevance. Built from contributions by clinical labs, expert panels, and research groups, it serves as a community-powered knowledge base that is foundational for clinical genetics. Unsurprisingly, it is a favorite among NGS analysts.

At Golden Helix, we’ve tapped into the power of ClinVar by curating several ClinVar datasets that integrate seamlessly into the VarSeq platform. These are available as annotations and feed into our algorithms, enabling scalable NGS workflow automation, supporting thorough variant interpretation, and providing in-depth, clinically relevant insights for clinical genetic analyses.

A Structured Resource for Variant Analysis and Automation

ClinVar has grown to include over 5 million submitted assertions, consolidated into over 3.2 million unique variant classifications. This scale allows for widespread automation in clinical workflows. In VarSeq, structured ClinVar data feeds directly into our germline and cancer variant classification pipelines, enabling real-time decision support and the ability to auto-classify known pathogenic/oncogenic or benign variants, dramatically reducing manual review time.
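To make the auto-classification idea concrete, here is a minimal Python sketch of a triage rule. It is not VarSeq's algorithm: the function name `triage`, the `min_stars` threshold, and the returned labels are hypothetical, and only germline classification terms are covered. The star values follow ClinVar's published review-status scale.

```python
# Hypothetical triage rule, not VarSeq's implementation.
# Star values follow ClinVar's published review-status scale.
REVIEW_STARS = {
    "practice guideline": 4,
    "reviewed by expert panel": 3,
    "criteria provided, multiple submitters, no conflicts": 2,
    "criteria provided, single submitter": 1,
    "criteria provided, conflicting classifications": 1,
    "no assertion criteria provided": 0,
}

PATHOGENIC = {"Pathogenic", "Likely pathogenic", "Pathogenic/Likely pathogenic"}
BENIGN = {"Benign", "Likely benign", "Benign/Likely benign"}


def triage(classification, review_status, min_stars=2):
    """Return an auto-classification label, or flag the variant for manual review.

    min_stars is an assumed, lab-configurable threshold; unknown review
    statuses are treated as 0 stars so they always go to manual review.
    """
    stars = REVIEW_STARS.get(review_status, 0)
    if stars < min_stars:
        return "manual review"
    if classification in PATHOGENIC:
        return "auto-classify: pathogenic"
    if classification in BENIGN:
        return "auto-classify: benign"
    # Uncertain or conflicting classifications are never auto-classified.
    return "manual review"


print(triage("Pathogenic", "reviewed by expert panel"))
print(triage("Benign", "criteria provided, single submitter"))
```

In this sketch, raising `min_stars` trades automation for caution: fewer variants are auto-classified, but each one rests on stronger review evidence.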

Rich Interpretation Context with Expert Review

Golden Helix curates multiple ClinVar sources, each designed to support a different aspect of clinical analysis:

  • ClinVar Variants: A comprehensive track of all small variants in the ClinVar repository, their classifications, review statuses, and associated disorders.
  • ClinVar Assessments: The assessments track goes a step further, providing detailed interpretation text, supporting citations, and metadata on the submitting organization, but is limited to variants with detailed interpretations.
  • ClinVar CNVs and Large Variants: Provides curated classifications of structural variants, such as copy number variants and gene fusions.
  • ClinVar Transcript Counts: This track tallies variants with a ≥1 Star review status per overlapping RefSeq mRNA transcript and thus enables rapid assessment of variant effect and clinical significance patterns across genes and transcripts.
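The per-transcript tally described for the Transcript Counts track can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the way Golden Helix builds the track: the record layout (`classification`, `review_status`, `transcripts`), the function `tally_by_transcript`, and the `NM_EXAMPLE_*` identifiers are all hypothetical toy values. The star mapping follows ClinVar's published review-status scale.

```python
from collections import Counter, defaultdict

# Star values follow ClinVar's published review-status scale.
REVIEW_STARS = {
    "practice guideline": 4,
    "reviewed by expert panel": 3,
    "criteria provided, multiple submitters, no conflicts": 2,
    "criteria provided, single submitter": 1,
    "criteria provided, conflicting classifications": 1,
    "no assertion criteria provided": 0,
}


def tally_by_transcript(records, min_stars=1):
    """Count classifications per overlapping transcript, keeping only
    records whose review status is worth at least min_stars."""
    counts = defaultdict(Counter)
    for rec in records:
        stars = REVIEW_STARS.get(rec["review_status"], 0)
        if stars < min_stars:
            continue  # skip variants below the review-status threshold
        for transcript in rec["transcripts"]:
            counts[transcript][rec["classification"]] += 1
    return counts


# Toy records with placeholder transcript IDs (not real ClinVar data).
records = [
    {"classification": "Pathogenic",
     "review_status": "criteria provided, multiple submitters, no conflicts",
     "transcripts": ["NM_EXAMPLE_1", "NM_EXAMPLE_2"]},
    {"classification": "Benign",
     "review_status": "criteria provided, single submitter",
     "transcripts": ["NM_EXAMPLE_1"]},
    {"classification": "Pathogenic",
     "review_status": "no assertion criteria provided",  # 0 stars, excluded
     "transcripts": ["NM_EXAMPLE_1"]},
]

counts = tally_by_transcript(records)
for transcript, tally in sorted(counts.items()):
    print(transcript, dict(tally))
```

A tally of this shape shows at a glance whether a transcript carries mostly pathogenic or mostly benign reviewed variants, which is the kind of pattern the track is meant to expose.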

Informing Novel Interpretations

One of the greatest values of ClinVar is its ability to inform the classification of novel variants. VarSeq leverages ClinVar citations and historical assessments to guide the classification and interpretation of novel or uncertain variants, providing reliable evidence for applying certain ACMG criteria.

Let’s consider a few examples leveraging our ClinVar Assessments track:

  • HFE p.C282Y (Stanford Medicine): The HFE p.Cys282Tyr variant is a classic example of a pathogenic variant with complex penetrance patterns. While this variant is highly prevalent, identified in a homozygous state in up to 90% of individuals of European ancestry with hereditary hemochromatosis, it shows variable clinical expression. Studies report biochemical evidence of iron overload in most homozygotes, but clinical disease manifests in only a subset, with penetrance estimates ranging from 2% to 33%. This nuanced penetrance is supported by extensive literature citations and population frequency data curated within ClinVar. Despite its relatively high frequency in the general population, the variant remains classified as pathogenic for autosomal recessive HFE hemochromatosis. This illustrates how the detailed context from multiple submitters bolsters the interpretation of this variant based on ACMG criteria PS4 and PP3.
  • PHOX2B p.A241[33] (Ambry Genetics): This variant represents an expansion mutation in the polyalanine repeat region of exon 3 of PHOX2B. It is strongly associated with congenital central hypoventilation syndrome, a rare but severe disorder affecting autonomic control of breathing. The pathogenic classification is underpinned by robust clinical and functional studies cited in ClinVar, reflecting consensus from expert panels and diagnostic labs. This example highlights how clinical databases capture and interpret complex structural changes like repeat expansions to support confident variant assessment.
  • HBB p.E7V (Prevention Genetics): The HBB c.20A>T variant is predicted to result in the p.Glu7Val missense mutation in the HBB gene, which is a well-characterized cause of sickle cell anemia in the homozygous state. ClinVar entries compile extensive clinical data demonstrating pathogenicity for sickle cell disease and sickle cell trait in heterozygotes, who may be asymptomatic but carry risk for complications. This variant is reported at 4.5% in individuals of African descent in gnomAD. The population frequency data and functional evidence supported by multiple expert-reviewed clinical submissions provide a powerful and comprehensive clinical context that empowers VarSeq users to interpret such variants with confidence.
  • 21q22.11q22.3 Duplication (Quest Diagnostics): This large copy number duplication encompasses numerous genes within the Down syndrome critical region. Phenotypic overlap between Down syndrome and partial trisomy 21q has been reported, supporting the pathogenic classification of this CNV. Importantly, the absence of similar duplications in population CNV databases like the Database of Genomic Variants further corroborates its rarity and clinical significance. ClinVar’s inclusion of such structural variants, with detailed phenotypic and literature support, enables clinical users to interpret complex CNVs effectively in VSClinical.

By integrating curated ClinVar content, VarSeq empowers labs to leverage this clinical community knowledge base for expert-level clinical genetic analyses. This review highlights ways our platform consistently provides the tools and content necessary to meet the highest standards of clinical interpretation.

Thank you for reading! If you’d like to learn more about using VarSeq to leverage ClinVar and other curated datasets to streamline your variant interpretation, contact us at [email protected].

About Rana Smalling

Rana Smalling, PhD joined our team as a Field Application Scientist in September of 2021. Rana is a Jamaican native who is passionate about using biomedical research and science communication to bring about better healthcare solutions. She earned a Bachelor’s degree in Biological Sciences from the University of Chicago, a PhD in Biochemistry from the University of Utah and completed postdoctoral research at Vanderbilt University Medical Center. She has used both lab bench and bioinformatics approaches to identify novel regulators and potential biomarkers in cancer and metabolic diseases. Rana enjoys providing support and training to Golden Helix customers. When she is not working, she likes to learn about the medicinal uses of plants, fungi and microbes, and she enjoys road trips, singing and listening to music.
