Category Archives: Bioinformatic Support

Using rsID lookup in a VarSeq Workflow

October 4, 2022

There are many reasons a user may wish to focus on specific variants as part of a variant annotation and filtration workflow. You may be looking for the occurrence of specific SNPs in a cohort or perhaps looking for variants known to be associated with specific forms of cancer. For both of these use cases, VarSeq provides a Match String… Read more »

Streamlining Variant Analysis for Large Genetic Cohorts: Part 1

September 27, 2022

Large genetic cohorts require substantial effort to analyze. Genetic researchers are increasingly turning to whole exome and whole genome sequencing analyses for their clinical diagnostics and research. However, with that approach comes the challenge of making sense of these massive datasets. This is especially challenging when looking for tools that can streamline variant analysis for large genetic cohorts and include… Read more »

Using ACMG Secondary Findings v3.0 List in VarSeq and VSClinical

December 21, 2021

As the number of genes on a gene panel increases, there is the possibility of picking up variants of medical significance that are not related to the primary indication for the test. Especially with large gene panels, exomes, and genomes, it is medically and ethically important to report variants that may be actionable to the patient. These include variants implicating… Read more »

Consideration when using CADD in your NGS Workflow

December 15, 2021

VarSeq serves as a streamlined approach to handle the rare variant analysis typically carried out for your next-generation sequencing data. Our team at Golden Helix seeks to simplify this process by automating the curation of staple databases needed for filtering and evaluation of clinically relevant variants. The goal we seek to accomplish with our software is to provide ease in… Read more »

Alternate Allele Frequency and VCF File Format

December 7, 2021

The VCF file format comes with a lot of interesting quality assurance and statistics fields that can be used for filtering in VarSeq. Open your files in a text editor to see all the fields that are available; each field has a header line with a description of its content. See the VCF Specifications to help with… Read more »

Selecting Clinically Relevant Transcripts in VarSeq

June 28, 2021

One of the many tricks of encoding so much functionality into so little space in eukaryotic genomes is the ability to produce multiple distinct mRNAs (transcripts) from a single gene. While one transcript is often the dominant one for a given tissue or cell type, there are, of course, exceptions in the messy reality of biology. It doesn’t take many… Read more »

A Simple Guide to Curating Genome Assemblies

June 8, 2021

A common feature request from Golden Helix customers is to curate and make available genome assemblies for different plant and animal species. These requests commonly come from SVS users as many research projects are being carried out, and having the genome assembly available for analysis is essential. That being said, Golden Helix has an SVS Tutorial available that walks users… Read more »

Importing CNVs using VSPipeline

June 2, 2021

VSPipeline is a command-line interface that gives high-throughput environments the ability to tap the full power of VarSeq’s algorithms and flexible project template system from any command-line context, including an existing bioinformatics pipeline. This feature is a great resource for analyzing large sample volumes as it automates importing and annotating your data, which can help streamline your analysis… Read more »

Using Gene Preferences in VarSeq and VSClinical

April 20, 2021

With the latest release of VarSeq, we have made significant updates to our handling of the interaction of variants and genes. This includes the support for non-coding transcripts, improved splice site predictions, and updates to gene and transcript annotations. We received several questions regarding how decisions are made in the software regarding genes and transcripts with these gene-related changes. This… Read more »

Implementation of ClinGen Dosage Sensitivity in VSClinical

March 4, 2021

A collaboration between the Clinical Genome Resource (ClinGen) consortium and the American College of Medical Genetics (ACMG) recently published guidelines for the interpretation of CNVs called on next-generation sequencing data. These new guidelines are the first to provide a robust set of rules for the interpretation of small intragenic deletions and duplications and are now automated in VSClinical. … Read more »

Shared Data in VarSeq

July 22, 2019

When using VarSeq, annotations, application settings, and assessment catalogs are all stored locally. Sometimes these resources can grow into large, space-grabbing directories, forcing you either to purchase additional storage devices or to get rid of previously downloaded resources you might need down the road. But there’s hope! You can set where you want all of your data stored to be… Read more »

Sentieon’s Secondary Analysis Tools Explained

September 12, 2017

We find ourselves talking about our partnership with Sentieon a lot! More specifically, we are always extolling the powerful, comprehensive genomic data analysis solution we are able to offer our clients. Sentieon’s Secondary Analysis Suite made a significant improvement in runtime over BWA-MEM, GATK, MuTect, and MuTect2 while providing deterministic and identical results. Here are two fantastic white papers that… Read more »

Annotating with gnomAD: Frequencies from 123,136 Exomes and 15,496 Genomes

May 16, 2017

When the Broad Institute team led by Dan MacArthur announced at ASHG 2016 that the successor to the popular ExAC project (frequencies of 61,486 exomes) was live at http://gnomad.broadinstitute.org/, I thought their servers would have a melt-down as everyone immediately jumped on and started looking up their favorite genes and… Read more »

Massive Variant Boost to ClinVar & PubMed Citation Fields

January 24, 2017

It may have been easy to miss in the drumbeat of monthly annotation updates we do here at Golden Helix, but there are a couple of things that are very special about the January update to the ClinVar database: We added new fields, including HGVS names of variants and PubMed citations for variants. ClinVar nearly doubled in size by… Read more »

Using Assessment Catalogs in your VarSeq workflow

December 15, 2016

Variant interpretation is an integral part of any workflow that results in decisions about the validity and suspected functional impact of a variant in a given sample and its presenting phenotypes. The VarSeq Assessment Catalog functionality is designed to assist the VarSeq user in streamlining this process. To include this functionality in your workflow, you will first… Read more »

ExAC CNVs: The First Large Scale Public Exome CNV Variant Set

December 8, 2016

ExAC CNVs were released publicly with a recent publication, providing the full set of rare CNVs called on ~60K human exomes. While there are many public CNV databases out there, this is the first one that was derived from exome data, and thus includes both extremely rare and very small CNV events. With the recent release of Golden Helix’s CNV calling… Read more »

Annotating Cancer Mutations with CIViC

November 15, 2016

While clinical assessments of germline mutations have been collected in ClinVar under the stewardship of the NCBI and the collaborative effort of many testing labs, the same type of resource has been missing for mutations that could inform clinical care in cancer. Or at least, that is what I thought until I started to work with CIViC. With the stewardship of… Read more »

Genotype Imputation and Phasing now in SNP & Variation Suite

November 8, 2016

One of the tools at the top of the toolbox for researchers working with microarray data is genotype imputation. Genotype imputation is the process of inferring the genotype of one or more markers based on the correlation pattern (aka linkage disequilibrium or LD) of the surrounding markers for which genotypes are known. We have now integrated a natively ported version of BEAGLE into Golden… Read more »

Determining the best LD Pruning options

September 13, 2016

Pruning your data based on Linkage Disequilibrium (LD) values is an important quality assurance step for GWAS analysis. In particular, some tests such as Identity by Descent Estimation (IBD), Inbreeding Coefficient Estimation (f) and Principal Component Analysis (PCA) will obtain better results if the markers used are not in linkage disequilibrium with each other. Therefore, Golden Helix’s SVS provides the… Read more »

Variant Normalization: Underappreciated Critical Infrastructure

July 7, 2016

It may surprise you to learn that every variant in the human genome has an infinite number of representations! Of course, although true, I’m being a bit hyperbolic to prove a point. Even seemingly simple mutations like single letter substitutions are legitimately represented differently in the local context of other mutations that can be described… Read more »

The Golden Helix Blog

OUR 2 SNPS…