Harnessing the Power of VarSeq: Recent Publications

January 14, 2025

Next-generation sequencing (NGS) has transformed our ability to detect the genetic causes of disease. VarSeq is a key tool in this process because it helps users sift through large numbers of genetic variants to find those most likely to cause or contribute to disease. The publications below used VarSeq in studies that identified new gene variants related to neurodevelopmental disorders, coenzyme Q deficiency, and stroke in sickle cell disease. These findings highlight VarSeq’s value in clinical research and its potential to improve patient care.

EHMT2 as a Candidate Gene for an Autosomal Recessive Neurodevelopmental Syndrome

Neurodevelopmental disorders (NDDs) comprise clinical conditions with high genetic heterogeneity and a notable enrichment of genes involved in regulating chromatin structure and function. The EHMT1/2 epigenetic complex plays a crucial role in repression of gene transcription in a highly tissue- and temporal-specific manner. Mutations resulting in heterozygous loss-of-function (LoF) of EHMT1 are implicated in Kleefstra syndrome 1 (KS1). EHMT2 is a gene acting in epigenetic regulation; however, the involvement of mutations in this gene in the etiology of NDDs has not been established thus far. A homozygous EHMT2 LoF variant [(NM_006709.5):c.328+2T>G] was identified by exome sequencing in an adult female patient with a phenotype resembling KS1, presenting with intellectual disability, aggressive behavior, facial dysmorphisms, fused C2-C3 vertebrae, ventricular septal defect, supernumerary nipple, umbilical hernia, and finger and toe abnormalities. The absence of homozygous LoF EHMT2 variants in population databases underscores the significant negative selection pressure exerted on these variants. In silico evaluation of the effect of the EHMT2(NM_006709.5):c.328+2T>G variant predicted the abolishment of the intron 3 splice donor site. However, manual inspection revealed potential cryptic donor splice sites at this EHMT2 region. To directly assess the impact of this splice site variant, RNAseq analysis was employed and disclosed the usage of two cryptic donor sites within exon 3 in the patient’s blood, which are predicted to result in either an out-of-frame or in-frame effect on the protein. Methylation analysis was conducted on DNA from blood samples using the clinically validated EpiSign assay, which revealed that the patient with the homozygous EHMT2(NM_006709.5):c.328+2T>G splice site variant is conclusively positive for the KS1 episignature.
Taken together, clinical, genetic, and epigenetic data pointed to a LoF mechanism for the EHMT2 splice variant and support this gene as a novel candidate for an autosomal recessive Kleefstra-like syndrome. The identification of additional cases with deleterious EHMT2 variants, alongside further functional validation studies, is required to substantiate EHMT2 as a novel NDD gene.

… The VarSeq software (Golden Helix) was used for analysis of SNVs and indels, and variant prioritization was performed with the following parameters: read depths ≥ 10, genotype qualities ≥ 17; low frequency on 1 KG Phase3 [15], gnomAD [16] …
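The thresholds quoted in this excerpt can be illustrated with a minimal Python sketch. This is not VarSeq's interface or the authors' pipeline: the `Variant` record, its field names, and the example variants are hypothetical, and the 1% allele-frequency cutoff is an assumption, since the excerpt only says "low frequency" without giving a number.

```python
from dataclasses import dataclass

# Thresholds quoted in the excerpt above.
MIN_READ_DEPTH = 10
MIN_GENOTYPE_QUALITY = 17
# Assumption: the excerpt says "low frequency" without a cutoff;
# 1% is a placeholder chosen for illustration only.
MAX_POPULATION_AF = 0.01


@dataclass
class Variant:
    """Hypothetical record; field names are not VarSeq identifiers."""
    name: str
    read_depth: int
    genotype_quality: int
    kg_phase3_af: float  # allele frequency in 1KG Phase 3
    gnomad_af: float     # allele frequency in gnomAD


def passes_filters(v: Variant) -> bool:
    """Return True if the variant meets every quality and frequency threshold."""
    return (
        v.read_depth >= MIN_READ_DEPTH
        and v.genotype_quality >= MIN_GENOTYPE_QUALITY
        and v.kg_phase3_af <= MAX_POPULATION_AF
        and v.gnomad_af <= MAX_POPULATION_AF
    )


def prioritize(variants):
    """Keep only the variants that pass all filters, preserving input order."""
    return [v for v in variants if passes_filters(v)]


# Made-up variants, one per failure mode plus one that passes.
example_variants = [
    Variant("rare_well_covered", 42, 60, 0.0, 0.0001),
    Variant("low_depth", 6, 60, 0.0, 0.0),
    Variant("low_quality", 30, 12, 0.0, 0.0),
    Variant("common", 55, 99, 0.12, 0.15),
]
kept_names = [v.name for v in prioritize(example_variants)]
print(kept_names)
```

Chaining the conditions with `and` mirrors a filter cascade in which a variant must survive every step. A real analysis would also apply the remaining criteria that the excerpt elides.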

Carvalho, L.M.L., Rzasa, J., Kerkhof, J. et al. EHMT2 as a Candidate Gene for an Autosomal Recessive Neurodevelopmental Syndrome. Mol Neurobiol (2024). https://doi.org/10.1007/s12035-024-04655-x

Identification of a new COQ4 spliceogenic variant causing severe primary coenzyme Q deficiency

Background and aims

Primary Coenzyme Q (CoQ) deficiency caused by COQ4 defects is a clinically heterogeneous mitochondrial condition characterized by reduced levels of CoQ10 in tissues. Next-generation sequencing has lately boosted the genetic diagnosis of an increasing number of patients. Still, functional validation of new variants of uncertain significance is essential for an adequate diagnosis, proper clinical management, treatment, and genetic counseling.

Materials and methods

Both fibroblasts from a proband with COQ4 deficiency and a COQ4 knockout cell model have been characterized by a combination of biochemical and genetic analyses (HPLC lipid analysis, oxygen consumption, minigene analysis, and RNAseq, among others).

Results

Here, we report the case of a subject harboring a new variant of the COQ4 gene in compound heterozygosis, which shows severe clinical manifestations. We present the molecular characterization of this new pathogenic variant affecting the splicing of COQ4.

Conclusion

Our results highlight the importance of expanding the genetic analysis beyond the coding sequence to reduce the misdiagnosis of primary CoQ deficiency patients.

WES was conducted using DNA isolated from blood with the Ion AmpliSeq™ exome RDY kit (Thermo Fisher, Waltham, MA, USA). Library preparation was carried out with the IonChef, followed by sequencing using the IonProton system (Thermo Fisher). Base calling, read pre-processing, short read alignment, and variant calling were completed using the Torrent Suite, including the Torrent Variant Caller (Version 4.4–5.0) (Thermo Fisher). Genomic variants were annotated and filtered using VarSeq (GoldenHelix, Bozeman, MT, USA). NM_016035.5 was used as the reference sequence for the COQ4 transcript, and NP_057119.3 for the COQ4 protein for variant nomenclature.

María Alcázar-Fabra, Elsebet Østergaard, Daniel J.M. Fernández-Ayala, María Andrea Desbats, Valeria Morbidoni, Laura Tomás-Gallardo, Laura García-Corzo, María del Mar Blanquer-Roselló, Abigail K. Bartlett, Ana Sánchez-Cuesta, Lucía Sena, Ana Cortés-Rodríguez, María Victoria Cascajo-Almenara, David J. Pagliarini, Eva Trevisson, Sabine W. Gronborg, Gloria Brea-Calvo, Identification of a new COQ4 spliceogenic variant causing severe primary coenzyme Q deficiency, Molecular Genetics and Metabolism Reports, Volume 42, 2025, 101176, ISSN 2214-4269, https://doi.org/10.1016/j.ymgmr.2024.101176

Genetic and Clinical Determinants of Stroke in Sickle Cell Patients: Insights from a Nigerian Cohort

Introduction: Sickle cell disease (SCD) is prevalent in sub-Saharan Africa. Phenotypic diversity in patients with similar SCD genotypes may result from genetic and environmental factors. Nigeria has the highest SCD prevalence, with homozygous SCD (HbSS) being the most severe form. Stroke affects about 11% of children under 20 years with SCD (Ohene-Frempong, Weiner, et al. 1998); it is primarily ischemic and involves medium to large intracranial arteries. Specific single nucleotide polymorphisms have been linked to ischemic stroke risk in some populations (Earley, Kelly et al. 2023). This study aimed to identify genetic factors associated with early-age large-vessel stroke in SCD using whole-exome sequencing (WES) in a Nigerian cohort.

Methods: The protocol was approved by the IRBs of TJU, Nemours, and 3 Nigerian sites. Subjects with SCD who experienced large vessel stroke early in life (16 years or less) were enrolled, along with controls (SCD patients without stroke). Subjects were recruited from 3 SCD clinics in Kano Metropolis, Nigeria. Inclusion criteria were confirmed SCD diagnosis (SS and Sβ0 thalassemia genotypes), abnormal transcranial Doppler (TCD) measurements, and clinical evidence of stroke confirmed by MRI by age 16. Controls included SCD patients (SS and Sβ0) aged 16-26 years without clinical stroke and at least one normal TCD. All cases underwent Magnetic Resonance Angiography (MRA) to establish arteriopathy. Immediate family members of subjects >16 years (siblings and parents) were used per best practice trio analysis to annotate variants with appropriate Mendelian inheritance patterns. WES was performed using SureSelect XT HS (Agilent) and Human All Exon V7 Rev B capture probes for library preparation on all samples. Libraries were sequenced on a NextSeq 550 to 50M paired-end 150bp reads. Variants were called using the GATK suite of algorithms following best practice guidelines and annotated using the VarSeq algorithm, including AlphaMissense annotations.

Results: The control group had a median age of 21 years, and the stroke group had a median age of 9 years. The stroke group had a higher proportion of males (60%) compared to controls (30%) (p < 0.05). Fifty percent of the stroke group and 73% of the controls had a history of acute chest syndrome. History of avascular necrosis (7.5%), heart disease (2.5%), and splenic sequestration (1.3%) were present only in controls. Malaria (1.3%), sepsis (2.5%), and hand-foot syndrome (59%) affected the stroke group. History of priapism (6.3%) and leg ulcers (10%) had similar rates in both groups. Using AlphaMissense annotations, nine genes were identified to have more damaging mutations in the stroke cohort compared to the control cohort (odds ratio > 1 and p < 0.05). Pathway enrichment analysis utilizing genes prioritized with an odds ratio > 1 identified cadherin signaling, cellular junction/adhesion, and WNT signaling as significant (p < 0.05). The role of cellular adhesion has been previously described in SCD stroke subjects (Oni, Brito et al. 2024). The genes that we found that showed significance in our analysis include ADAMTSL2, ADCY9, OR3A1, OR3A2, XRRA1, ABTB2, CLDN18, CPNE7, EIF3A, SLC7A6, UBA7, PDE6A, GLOD5, and PCDHA1.

Conclusion: Genes in the ADCY9 and ADAMTSL2 families have been reported in previous studies to be associated with ischemic stroke. Though the small sample size limits this study, it offers preliminary insights into the genetic and clinical determinants of stroke in SCD patients in Northern Nigeria. The identified candidate genes potentially hold biological significance for the stroke phenotype; however, additional research with a larger sample size is required to validate these associations. These findings underscore the need for further whole-exome sequencing studies in Nigerian patients with the stroke/arteriopathy phenotype, which may augment our understanding of this severe complication. An in-depth examination of the genetic modifiers associated with stroke in SCD could lead to enhanced predictive models and targeted preventive interventions, reducing the burden of stroke in affected individuals. Our team has identified additional datasets in the United States to help validate molecular signatures of interest, inspiring us to continue our work in this important field.

WES was performed using SureSelect XT HS (Agilent) and Human All Exon V7 Rev B capture probes for library preparation on all samples. Libraries were sequenced on a NextSeq 550 to 50M paired-end 150bp reads. Variants were called using the GATK suite of algorithms following best practice guidelines and annotated using the VarSeq algorithm, including AlphaMissense annotations.
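The gene-level comparison reported in the Results (odds ratio > 1 and p < 0.05 for damaging variants in stroke cases versus controls) can be illustrated with a small sketch. The abstract does not name the statistical test that was used, so the two-sided Fisher's exact test below is an assumption chosen because it is a common option for small 2x2 tables. The function names and the carrier counts are made up for illustration and are not taken from the study.

```python
from math import comb


def odds_ratio(a, b, c, d):
    """Sample odds ratio for the 2x2 table [[a, b], [c, d]].

    Rows: stroke cases, controls. Columns: carriers, non-carriers.
    Returns infinity when b * c is zero.
    """
    if b * c == 0:
        return float("inf")
    return (a * d) / (b * c)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)  # tolerance for float ties
    )
    return min(1.0, p_value)


# Made-up counts for one hypothetical gene:
# 8 of 10 stroke cases and 1 of 10 controls carry a damaging variant.
cases_carriers, cases_noncarriers = 8, 2
controls_carriers, controls_noncarriers = 1, 9

gene_or = odds_ratio(cases_carriers, cases_noncarriers,
                     controls_carriers, controls_noncarriers)
gene_p = fisher_exact_two_sided(cases_carriers, cases_noncarriers,
                                controls_carriers, controls_noncarriers)
is_candidate = gene_or > 1 and gene_p < 0.05
print(gene_or, gene_p, is_candidate)
```

In a real analysis this test would be repeated for each gene, and with many genes a multiple-testing correction would normally be considered. The abstract does not say whether one was applied.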

Rasaq Olaosebikan, Erin L. Crowgey, Shehu Umar Abdullahi, Binta Jibir Wudil, Safiya Gambo, Raghu H Ramakrishnaiah, Muhammad Bello, Abubakar Shehu Gezawa, Bernard Ebruke, Karl R Franke, Stephen Obaro, Anders Kolb, Walter Kraft, Robin E Miller, Genetic and Clinical Determinants of Stroke in Sickle Cell Patients: Insights from a Nigerian Cohort, Blood, Volume 144, Supplement 1, 2024, Page 5299, ISSN 0006-4971, https://doi.org/10.1182/blood-2024-205366

VarSeq makes it easier to pinpoint disease-causing variants across a wide range of conditions. Its efficient workflow—from data analysis to variant interpretation—allows researchers and clinicians to connect genetic changes to specific diseases, guiding them toward accurate diagnoses and personalized treatment options. As sequencing technologies advance, VarSeq remains a crucial tool for uncovering genetic insights that can shape better healthcare outcomes.
