Using the GRCh38 Reference Assembly for Clinical Interpretation in VSClinical: Webcast Q&A

Gabe Rudy
https://www.youtube.com/watch?v=2bQKCN8Qnts

This webcast generated some great questions! If you have any other questions for me that are not answered below, please feel free to ask those by emailing [email protected].

Does VSClinical come with support for the new reference genome?

Yes! We worked hard to make everything work in VSClinical regardless of your choice of reference genome. The only caveat is the CADD annotation, which has yet to be ported to GRCh38; it is not used directly as a component of the recommendation engine, though. Everything else is available on the GRCh38 assembly.

Are there cancer annotations for the new reference genome?

This is a great question. We have CIViC coming in on a monthly basis, and it is a fantastic clinical annotation for cancer. The annotations are natively on 37, but we lift them over to 38, so we present them on both sets of coordinates. We are also in the process of QC’ing an updated COSMIC! COSMIC is now natively on 38, but we will bring it back to 37 as well. We have a lot of exciting product ideas and announcements coming that focus on VSClinical-style interpretation of cancer, and we plan to support both 37 and 38 as equal partners. Moreover, as you can see in the webcast, it’s easy to move back and forth between the two.

Can you elaborate on the issue of mapping RefSeq transcripts obtained from UCSC? What kind of errors have you encountered?

Note: I was contacted by the bioinformatics folks at UCSC, who informed me that they now additionally provide RefSeq with NCBI alignments in their “ncbiRefSeq” track (versus the “refGene” table/track, which uses BLAT as described below).

Essentially, they use BLAT, which is their own fast-alignment algorithm, to place the canonical RefSeq transcript coding sequence onto the genome. It’s not that their alignments are necessarily always worse; in most cases that I’ve seen, they are at most less than ideal. It’s just that if you’re going to have differences, you might as well go with the NCBI alignments. However, there are examples where the NCBI and UCSC mappings are different, in some cases radically so.

Example where the UCSC and NCBI alignments of the same transcript sequence NM_052814.3 differ dramatically in their placement of an exon.

Moreover, in the other cases where I’ve dug in, I’ve generally been happier with the NCBI mappings that we got in the recent interim release that they gave us.

I did have a previous webcast about transcripts and genome annotations – it’s an old one, but I cover all of this in more detail if you’re interested in checking it out.
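To make the kind of discrepancy described above concrete, here is a minimal Python sketch that compares the exon placements of one transcript as reported by two alignment sources. The coordinates are invented for illustration and are not the real placements of NM_052814.3, and `differing_exons` is a hypothetical helper, not part of VarSeq or any UCSC or NCBI tool.

```python
from itertools import zip_longest

# Hypothetical exon coordinates (start, end) for a single transcript as placed
# by two alignment sources. These numbers are invented for illustration only.
ncbi_exons = [(1000, 1200), (3000, 3150), (8000, 8400)]
ucsc_exons = [(1000, 1200), (5200, 5350), (8000, 8400)]


def differing_exons(exons_a, exons_b):
    """Return (exon_number, placement_a, placement_b) for each exon that differs.

    Exons are compared by position in the list; an exon missing from one
    source is reported with None for that source.
    """
    diffs = []
    for number, (a, b) in enumerate(zip_longest(exons_a, exons_b), start=1):
        if a != b:
            diffs.append((number, a, b))
    return diffs


for number, a, b in differing_exons(ncbi_exons, ucsc_exons):
    print(f"exon {number}: source A places it at {a}, source B at {b}")
```

A check like this only flags that two placements disagree. Deciding which one is right still means looking at the underlying alignments.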

Other than CADD scores, are there other annotations that have yet to be ported to GRCh38?

That’s it! It’s a lot of effort to maintain annotations on both genome assemblies, but as a Golden Helix customer, it’s part of the package. Our annotation team uses the same tools available to you in our Convert Wizard to move data to whichever genome it is not natively on. So, you get to see the same annotation on both references! When you are in VarSeq and you go to add annotations, the list is automatically filtered down to the annotations for the given genome assembly, and you should be able to find everything you need there.

Do I have to upload my data to use the LiftOver capabilities?

No. Like everything in VarSeq, it runs locally on your computer without any transfer of your data. We have implemented the UCSC LiftOver algorithm strategy right into our core annotation and import pipelines, so it integrates seamlessly with an analysis workflow that you control on your local compute!
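For readers curious about what a liftover does conceptually, here is a minimal Python sketch of the core idea: each chromosome is described by aligned blocks that pair a source-assembly interval with a target-assembly offset, and a position is converted by finding the block that contains it. This is a simplification under stated assumptions, not Golden Helix’s implementation. The `Block` type, the `BLOCKS` values, and `lift_position` are hypothetical. Real UCSC chain files also carry strand, score, and multiple chains per region, all of which are ignored here.

```python
from bisect import bisect_right
from typing import List, NamedTuple, Optional


class Block(NamedTuple):
    src_start: int  # 0-based inclusive start on the source assembly
    src_end: int    # exclusive end on the source assembly
    dst_start: int  # start of the matching block on the target assembly


# Hypothetical aligned blocks for one chromosome, sorted by src_start.
# Source positions 5000-5499 have no equivalent on the target assembly.
BLOCKS = [
    Block(0, 1000, 0),
    Block(1000, 5000, 1200),
    Block(5500, 9000, 5700),
]


def lift_position(pos: int, blocks: List[Block]) -> Optional[int]:
    """Map a source position to the target assembly, or None if unmapped."""
    starts = [b.src_start for b in blocks]
    i = bisect_right(starts, pos) - 1  # last block starting at or before pos
    if i < 0:
        return None
    block = blocks[i]
    if pos >= block.src_end:  # pos falls in a gap between aligned blocks
        return None
    return block.dst_start + (pos - block.src_start)


print(lift_position(1000, BLOCKS))
print(lift_position(5200, BLOCKS))
```

Positions that fall in a gap come back as `None`, which mirrors how liftover tools report records they cannot map rather than guessing a coordinate.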

About Gabe Rudy

Gabe Rudy is the Vice President of Product and Engineering at Golden Helix, where for over two decades he has led the development of clinically validated software solutions that power precision medicine worldwide. Under his leadership, Golden Helix has delivered a suite of best-in-class tools for genomic analysis, including CNV calling, pharmacogenomics, carrier screening, and somatic variant interpretation. These solutions are designed for flexible deployment across on-premises, private cloud, and managed cloud environments, and are used by organizations ranging from small diagnostic teams to large clinical laboratories and even national-scale genomic initiatives. With a background in Computer Science and graduate work in compiler optimization and high-performance computing, Gabe brings a unique blend of software architecture expertise and deep domain knowledge in genomics. Since 2006, he has directed product strategy and engineering at Golden Helix, ensuring the company stays at the forefront of innovation while maintaining the highest standards of usability, scalability, and quality. Gabe is an active participant in the genomics community, regularly presenting on topics such as NGS best practices, variant interpretation workflows, and the integration of AI into clinical diagnostics. His work has supported thousands of labs across the globe in the adoption of robust, intuitive, and clinically actionable bioinformatics workflows. Based in Bozeman, Montana, Gabe balances his passion for advancing precision medicine with family life and a love for the outdoors.
