About this webinar
Recorded On: Wednesday, January 13, 2021
Presented By: Golden Helix Field Application Scientist Team: Eli Sward, Ph.D., Darby Kammeraad, and Julia Love
Clinical variant analysis involves many steps and potentially requires the expertise and input from multiple individuals. For many, this process can be rather complex as it entails moving the data from user to user and possibly multiple platforms or internal bioinformatic pipelines. In general, this workflow can be broken down into three stages:
- Quality control (QC) and processing of the data. This step involves importing the data into a tertiary analysis platform and validating that the variants and samples are high quality.
- Variant evaluation using different genomic databases and annotation sources, which is then incorporated into a draft report.
- Assessment and assurance that the previous steps were performed correctly and a sign-off on the report.
Golden Helix offers VSClinical, a complete clinical workflow solution that simplifies and streamlines the clinical analysis process outlined above. VSClinical is a single testing paradigm that consolidates and automates the tertiary analysis workflow. With this software, users can perform variant and sample QC, create a variant evaluation using a plethora of public and licensed annotation sources, and evaluate variants with the automated ACMG and AMP guidelines for SNVs, indels, and CNVs. All this information can then be used to create a clinical report with our new Word-based report templates.
In this webcast, we plan to demonstrate the full clinical analysis workflow within VSClinical, focusing on the topics below:
- How easy it is to perform the different stages of a tertiary analysis in VarSeq and VSClinical
- The fluidity in transferring data between different users
- Evaluating germline and somatic variants according to the ACMG and AMP guidelines
- Creating and signing off on a clinical report with the new Word-based templates
Watch on demand
Please enjoy this webcast recording. Should you have any questions about the content covered, please reach out to our team here.
Hello, everyone, Happy New Year. Thank you for taking the time to attend today's webcast presentation. We are all so excited to have you here with us today to demonstrate how VSClinical can simplify and streamline your clinical analysis. My name is Delaina Hawkins, Director of Marketing and Operations here at Golden Helix, and I am honored to be the moderator for today's webcast. Joining us today is our field application scientist team who will be giving us this demonstration. I'd like to welcome Dr. Eli Sward, Darby Kammeraad, and Julia Love. Dr. Sward and team, thank you so much for joining us today. I hope you all are doing well.
Yeah. Thanks for having us on, and we are happy to give this demonstration.
Great, as always, before I pass things over to you, I just want to remind our attendees that we will be having a live Q&A session at the end of this presentation. So, if there's a question that comes up, you can enter those directly into the questions pane of your GoTo webinar panel. On the screen, you can see where we're referring to. In case there are any concerns, these will be asked anonymously. So, I'll be back later to do the Q&A and announce some exciting events happening here at Golden Helix. But for now, I will pass it over to you guys.
All right, I guess I'll take the lead on some of the company background info, so hi everybody, thank you for joining us today. So, as Delaina had mentioned, not only are we going to talk about some of the utilization and value of our VSClinical tool, but also just kind of the overarching strategy of an entire workflow from beginning to end, using some tools for automation of project creation for predefined workflows, all the way down to a validated clinical report and what we can leverage from cohort data stored in our genomic repository. But before we go into the deep dive into the subject, I did want to show our appreciation for our grant funding received from the NIH. The research reported in this publication was supported by the National Institute of General Medical Sciences of the National Institutes of Health under these listed awards that you see here. That, of course, is in addition to local grant funding that we got from the state of Montana, which we're also appreciative of. So, our PI is Dr. Andreas Scherer, who is also the CEO at Golden Helix. The content described today is solely the responsibility of us, as the authors, and does not necessarily officially represent the views of the NIH. So, again, we're really grateful for these kinds of grants. It really supports the development of the high-quality software that you're going to be exposed to today. So, before we go into the workflow, though, let's take a few minutes and just talk about some of the background info from our company. So Golden Helix was originally founded in 1998, and we are actually one of the few bioinformatics companies that can say we have 20+ years of experience in building these bioinformatics solutions in the research and clinical space.
Starting on the right, we have a research platform, SNP & Variation Suite, or SVS for short, which is really built on a foundation of genome-wide association capability, but over the years has collected an extensive list of additional methods tailored for large sample studies. So, even though SVS also contains the capability for DNA-Seq workflows and other modern NGS strategies like CNV detection, for example, there was definitely a need to develop a clinically dedicated platform, which is VarSeq. Now, VarSeq is the tertiary realm, which handles not only the fundamentals of importing the data, annotating your variants, and filtering through those variants, but also allows users to process their germline and somatic SNVs, indels, and CNVs through the automated ACMG and AMP guidelines. Post variant analysis, users can easily render a custom clinical report, with the whole workflow defined and controlled by director-level management. Additionally, our VSWarehouse genomic repository is going to sit securely in your private network or your private server, for example, linking directly to VarSeq to not only store all that project, report, and catalog data but also utilize it in a VarSeq workflow, which you can access directly through a data browser. We even have some ability to track and query through all the cohort data to keep track of those variants. So, fortunately, this extensive list of tools has been very well received by the industry, and we have been cited in thousands of peer-reviewed publications and popular journals like Science and Nature. We actually go through and summarize these recent publications on our blog. So, I definitely encourage you guys to go out to our company's site and read up on those blogs and see what kind of recent publications have come out. In addition to the publications, another reputation topic, of course, would be the customer base themselves, right.
So, we serve over 400 customers globally with thousands of installs and 20,000 unique users. Our customers really span different countries throughout the globe, but also different industries. Academic institutions, research hospitals, commercial testing labs, and government institutions. So why does this all matter to you? Well, over the years of development and with the large customer base, we've had fantastic success in creating incredibly efficient but robust software tools. We're always collecting and incorporating the feedback from our customers to really meet their diverse set of needs to make sure that that software stays as relevant as possible. We always kind of want to feed the front lines of the development by being integrated with the scientific community. We provide content in the form of e-books and webinars like today's, but we also, maybe not so much recently due to Covid, but attend conferences regularly to meet our customers face to face. We also have a customer-focused business model where we provide our software on a simple annual subscription instead of charging you guys per sample. So, there is definitely some cost savings there. Additionally, with the license, you have unlimited access to our support staff. So, we're going to keep the subject today of the overall workflow pretty general to make sure that we kind of cover all bases. But any one of these individual components, we can always go deep dive into more detail on a one-on-one call if you guys would like to set up a demo or training. So, you have full access to support interaction like that. So, now that we've covered all the background topics, let's start by discussing essentially a complete analysis workflow based on the VarSeq tools. So, VarSeq overall is the optimized tertiary analysis tool where the user imports their sample VCF data. 
It is going to perform all the filtering and isolation of the clinically relevant variants, which are evaluated under those true-to-form ACMG and AMP automated guidelines, and then finally produce that clinical report. We, of course, want to simplify the complexities of the entire workflow process, so we segment the workflow into the appropriate steps. Step 1: import and filter your variants. This filtering process is comprised of a number of fields, which includes variant quality data from the VCF, as well as an extensive list of curated annotations that we host and update for our users, and a long list of helpful algorithms to advance filtering even more. These algorithms include not only the ACMG auto-classifier itself but also gene panel designation or phenotypic prioritization for the sample variants, just to list a couple of examples. Once the filter does its work, any remaining variants are going to be carried into step 2, which is the interpretation through the VSClinical hub, following those appropriate guidelines, and then the final classifications will be decided on and stored along with the comprehensive variant interpretation. Then we would move into step 3 for all of that information to go into the clinical report. So what we're going to cover in today's webcast is an overview of this entire workflow, but we will also highlight some topics for workflow optimization. Here is a snapshot of how we're going to break it down and what sections each FAS will cover. I'm going to start things off with the discussion on the need to build a project template, or the filtering workflow, in VarSeq and then lock that down and routinely use it essentially through VSPipeline for automated project creation. Once the project is created and the variant quality and filtering results are confirmed, I'm going to pass things off to Julia, who will work through the VSClinical variant classification and interpretation process for a few example variants.
She's also going to show us how to use the report rendering function with some draft reports. Then lastly, all of that work is going to be validated and approved by our lab director, Eli, who will assess the final report interpretation, as well as guide us through some helpful applications that we can use in VSWarehouse, as that genomic repository where all of this data gets stored for overall project management, as well as some ongoing variant tracking. So, we thought we would switch gears just for a second to get some insight from you, as our viewers. So, before we deep dive into the project creation, we figured we would do a quick poll, and Delaina, if you want to kind of guide us through this, basically, you know, what we're talking about is the general workflow overall, but there's obviously some key components through that. Some of you might be users that are already using the tools, but we want to get some insight. Maybe there are other tools you're interested in or if you're not generally interested in the complete pipeline. So, I appreciate any kind of input on that.
Perfect. And you guys should all be seeing this on your screen right now so you can go ahead and input your answers. We'll give you a second to collect the responses. All right, it looks like that was enough time. Go ahead and share all your results.
And then just let me know when I should move forward. I've seen the results on my side.
Looks like a majority of us said VSClinical ACMG, and then VSPipeline, pretty much decent interest to all across the board. Perfect.
Thank you. All right. OK, so, yeah, that is a good framework, not only for the discussion today, but we can also do any kind of follow-up meetings or schedule calls if you want to go through any of the finer detail for any of the individual topics. So, thanks for that feedback. So, when first using VarSeq, users are going to work to develop the filtering strategy based on all of the raw VCF, annotation, and algorithm data available. Although we ship example templates with the software as a starting point, I always like to advertise that the filter is really like a blank canvas and the users can develop as simple or sophisticated of workflows as they wish. However, overall, it's been obvious over the years that there are some default filtering logics that seem consistent across most workflows. A simple example would be this container that I have here at the top for variant quality, basically just making sure that for any of the imported data from the sample we're filtering out any low-quality variants, and that could be done with low read-depth or genotype quality, for example. Another typical filtering strategy may be to remove common variants present among multiple population frequency catalogs such as gnomAD, 1kG Phase 3, or NHLBI. Another example could be to implement gene panel designations for prioritizing variants in targeted genes, or even phenotypic filters to home in on variants most relevant to the patient's disorder. Then lastly, the ACMG classifier itself is a good example of helping the user immensely in getting a framework for the final classification of a variant's impact. But essentially, once this filter is developed, it can be locked down, so no user can accidentally tweak any of the thresholds, and it really maintains the integrity of the validated workflow. Moreover, it doesn't take long for users to grow accustomed to their filtering strategy once it's all locked down.
Then their next goal is going to be able to speed up the project creation process. This is essentially where the VSPipeline comes in. So, once the template is constructed and validated, our batch command-line tool VSPipeline can handle the entire project creation process. VSPipeline really provides a full list of commands, giving users full granular control of project design. Utilization of VSPipeline can be as simple as running a single batch script, much like the screenshot here on the right. So, my goal is to not overwhelm you with scripts. So, I just highlighted the key focal points here, which is line 6 is the first designation of where that project is going to be created and which project template we're going to use. The next line would be like line 17 to designate which samples are going to be imported, and if you are going to include a sample manifest for any of the necessary sample data that would either go into the reports or the algorithm as part of that project template. So, when discussing VSPipeline, what we're really discussing is speed and efficiency. For a simple example, if a batch of new sample VCFs comes in at the end of the day, VSPipeline could be run during the non-work hours to have the completed projects ready the next day, so users aren't having to manually go through and construct them in the morning. Today's demonstration is going to begin with the framework that our workflow has already been pre-validated. And I will show you not only where to access VSPipeline, but in a simple batch script scenario to show you how it will run. And then I will open the finished project, confirm that the filtering is working as expected, confirm things like variant quality, and then, in that case, I will hand things off to Julia, who will handle the evaluation and stage 2. So, Julia, if you wouldn't mind giving us a breakdown of what to expect.
Sure, thanks, Darby. So, you've seen the first steps in automating your NGS pipeline using the VarSeq product suite, which Darby just discussed, but now I want to talk about the second stage of variant evaluation, which is applying the ACMG and AMP guidelines to clinically relevant variants. Now the ACGS guidelines can be used in the variant evaluation as well, instead of the ACMG guidelines. All these guidelines are fully automated within VarSeq's VSClinical interface, wherein you are walked through variant interpretations and the recommended scoring criteria, and all the corresponding explanations for that scoring logic. All of that is provided for you. There are several value points associated with automating variant interpretation, according to these guidelines. First of all, automating this process maintains consistency in results. And this applies whether it is one person or multiple people carrying out the variant interpretation. On a related note, even if a user is new, they can easily jump into the variant evaluation as the interpretation hub serves as a great educational interface and gets these new users up to speed quickly applying these guidelines and scoring the clinically relevant variants. Perhaps the most obvious but most important value of automating this process is that labs can generate results more quickly and scale up the lab's production overall. And since we are discussing the automation of these guidelines, I want to mention that a few updates have been made to the guidelines in scoring criteria and that these changes have been integrated into the VSClinical workflow. The research papers and publications are available within VSClinical and in the VarSeq manual, so you can explore these changes in detail. Additionally, we have discussed these updates in our recent previous webcast, but there were essentially three major updates. The first is something I mentioned earlier, that the ACGS guidelines are now available for evaluating variants. 
This work was done by Ellard and colleagues, and the difference between the ACGS and ACMG guidelines is really in applying strong criteria for pathogenic and likely pathogenic variants. In the ACGS guidelines, 2 strong criteria lead to only a likely pathogenic classification, and in the ACMG guidelines, 2 strong criteria will classify the variant as pathogenic. The second update comes from work done by the ClinGen working group, and these changes have been made to adjust the strength level of scoring PVS1 loss of function variants. So, now the variant can be scored as PVS1 strong, moderate, or supporting. And ultimately, this will lead to more loss of function variants being scored as pathogenic. The third update is that the ACMG guidelines can now be applied to copy number variants in addition to the single nucleotide variants, and this was made possible through a collaboration with ClinGen. But with that said, I will now pass things over to Eli and he will talk about stage 3 of the workflow, validating reports, and project management.
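The ACMG versus ACGS difference Julia describes can be written down in a few lines. This is only a minimal sketch, assuming nothing beyond the single rule stated in the talk (what two strong pathogenic criteria lead to under each guideline). The function name and the guideline labels are hypothetical and are not part of VSClinical, and the real rule sets contain many more criteria combinations.

```python
def classify_two_strong(guideline: str) -> str:
    """Classification reached with exactly two strong pathogenic criteria.

    Encodes only the one rule described in the talk; it is not a full
    implementation of either guideline.
    """
    outcomes = {
        "ACMG": "Pathogenic",         # two strong criteria are sufficient
        "ACGS": "Likely Pathogenic",  # two strong criteria stop one tier lower
    }
    key = guideline.upper()
    if key not in outcomes:
        raise ValueError(f"unknown guideline: {guideline!r}")
    return outcomes[key]


if __name__ == "__main__":
    for name in ("ACMG", "ACGS"):
        print(name, "->", classify_two_strong(name))
```

Keeping the outcome in a lookup table rather than in branching logic makes it easy to see that the two guidelines differ only in the tier assigned, not in which criteria are counted.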
Thank you, Julia. Now, the next step in a clinical workflow, once an evaluation has been made, is to validate the results and oversee project management, which in this scenario will be coordinated by the lab director. So, our solutions provided by Golden Helix allow multiple interfaces to assess the previous work of colleagues. As we will show in the demonstration, directors have the ability to open existing evaluations to observe any comments that are made as well as make changes that will document when the change was made, and by which user. Another point of interest is report generation. As we have demonstrated in previous webcasts, we now support Word-based report templates that are entirely customizable. It is easy to tailor these reports to your specific lab designs as well as convert the reports into a PDF format. Furthermore, as a lab director, you have the ability to sign off on the report, using your electronic signature. The final stage of project management is ensuring that this data is stored properly and adding this information to an internal knowledge base for future reference, which is achievable using VSWarehouse. So, the last feature I would like to discuss is storing and accessing genomic data in VSWarehouse. As many in the NGS workspaces know, it is a challenge to thoroughly identify and capture all levels of relevant evidence for the determination of an impact and classification. Once captured, sharing this content internally with all other technicians is crucial for entire workplace efficiency and maintaining accuracy with interpretations. So, this is where a genomic repository such as VSWarehouse can be utilized. With Warehouse, users can store all variants seen across their entire cohort of samples, which can then serve as a frequency filter to prioritize where variants are seen on the pipeline. 
Content to be stored in Warehouse includes project data for all variants, assessment catalogs for captured clinical variants, clinical reports, and sample data, which can also be integrated into a hospital LIMS system. Once stored, data can be managed with permissions down to the individual field level. Information can also be rapidly created as well as audited against clinical databases like ClinVar, to assess how variant classifications change over time for your stored variants. Additionally, the stored variants in Warehouse can be leveraged in VarSeq as a filterable field to eliminate variants commonly seen among your cohort. So together, storing the data in Warehouse will be the final step in a clinical workspace. And now that we have covered the basics of this feature, let's discuss the product we will be demonstrating today. So, for the product demonstration, we'll start with Darby, the bioinformatician, who will show how to create the project workflow and use VSPipeline as well as assess variant quality. The filtered set of variants will then be saved at the project level and transferred to Julia, the geneticist. Julia will evaluate the three variants from a targeted gene panel and will show the application of the ACMG and AMP guidelines. For the ACMG workflow, we will evaluate a PTEN germline mutation and see the application of the new PVS1 scoring criteria, as well as a copy number variant in the BRCA2 gene. On the AMP side, we will take a look at the BRAF V600E mutation that was added to our project. Once these evaluations are made, I will show a workflow from a director's perspective, including signing off on a report and utilizing our Warehouse solution. So, with that said, let's begin the project demonstration, which I will hand over to Darby.
Sounds good. So, let's go ahead and discuss project creation. So, for any of you seeing VarSeq for the first time, we're going to spend some time orienting you with what the software looks like when we crack it open. But before we open up the project to start looking at the results, I did want to show you where the accessibility for all of these features is. So, for example, if we go to create a new project. This is going to be the window where you're going to see all of the templates that you construct and so the template that you would design would reside here, and once that's built, this is where we're going to segue into standardizing this whole process with VSPipeline. So, you can see my familial breast carcinoma panel that I've constructed, and this is all through the review and evaluation with my colleagues and locked down so that this was a standardized workflow that can't be modified at all. So, I'm going to go ahead and access VSPipeline now to show where we would essentially use this template automatically. I click cancel here and go to Tools. Luckily the VSPipeline tool is actually embedded in the installation folder with VarSeq. So, if we just go to the program folder, you're going to see all the necessary libraries and executables. But if we scroll down this list, this is where you would find the VSPipeline terminal. So, the nice thing about the VSPipeline terminal, as I mentioned before, users can always access all the individual tools in VSPipeline to go through all the granular control of the project creation, but we're actually going to keep things simple here and just use the batch command. So, to use that, I'm actually going to reference my essentially VSPipeline kind of working or operational directory that I have here where I've got my batch script set up to say, run VSPipeline. 
So, if I open that up, you can see the batch script, very similar to what we were looking at in the slides, with those rows that are kind of crucial for this project to create, designated here with line 6 and 17. So, line 6, where am I going to create this project? And that destination can be fully managed by the admin pipeline runner or bioinformatician to decide which user if you had multiple users, gets each project and they could open up those projects and work on their specific set of samples. So, you have full control over what that file management would be, or path management would be. Additionally, you can see here we selected that specific template that we're going to use. In this case, for sample 15. So, line 17 is designating which data we're going to import, and then also the sample manifest that has all of that crucial sample data, not only set up for the algorithms in a case like BAM files for coverage statistics or CNVs but also all the important reporting data that we're going to pull into the report and VSClinical. So, to run this file, essentially just highlight the path to that batch script. Copy its path and then type in the command batch file equals and then the path. Press enter, you would see that whole process carry out, so it's really that easy. I wanted to make sure that anybody who would be intimidated by command line is not suffering from that because the reality is it's actually pretty simple to use. But in this case, instead of sitting here waiting for the project to create, I've actually got it set up here. So, this would be the view of the VarSeq project once I open it up for the scope of this given template. So, some of the things that we have laid out by default are we do have some visualization in Genome Browse, which I will get to here in a minute, but also the workflow structure itself. And you can see that little lock icon that's locked down so that none of these workflow components can be modified. 
But essentially this filter that we have here on the left is taking our single variants, 88 variants that are part of the small gene panel, and they're passing through all of these criteria that are actually comprised of all of the data fields from the variant table itself. So, in our discussion earlier in the PowerPoint, we were talking about variant quality. A lot of that comes from the standard VCF format fields, Filter, Read Depth, Genotype Quality, and other fields like Zygosity, if you wanted to leverage that, or variant allele frequency computed from allelic depth. But we're not limited to that data, obviously. We have all of the algorithms like the ACMG sample classifier as well as tracks like RefSeq, and if I just click on this eye-shaped icon, you can see the full list of all of those fields that are present that you can use in your filter chain. So, let's talk about just the basics of how we built this workflow. So, in the first section, I'm looking at variant quality, just to say that if there are any low-quality variants in the filter field, I'm removing them, and that's what that orange exclamation point is, to invert that logic and actually scrub those low-quality variants out. Beyond that, I'm looking for any variants that have sufficient read-depth from this field here, to say that it has to be at least 20. And then zygosity, in this case, inferred from the genotype field, to say I want to look at anything that would be designated heterozygous variant or heterozygous alternate or homozygous variant. Beyond that, this is where we start to use some of our annotations, and in this case, I've got gnomAD exomes, gnomAD genomes, 1kG Phase 3, and NHLBI all as population frequency catalogs to set thresholds for a low frequency, to say, I only want to look for variants that are novel or rare, and that already gets us down to 32.
And in this case, I'm also getting rid of any variants that are well known and validated to be benign. So, in this case, I'm using ClinVar, scrubbing out anything that would be benign or likely benign, but very strict criteria here to say that they have to have 3 or 4 stars in agreement to say that those variants would be benign. So, we do scrub out a few of those benign variants. Beyond that, I want to make sure that if I'm looking at any variants for their impact on the gene, that I'm not missing anything. So, one of the default strategies, would be to look for loss of function or missense variants, which would include things like inframe insertions or deletions, but also making sure that any kind of synonymous variant potentially or any variant that would affect a splice site. I'm looking 20 bases deep into the intron off the edge of that exon to make sure I don't miss anything that would be potentially affecting a splice site. And then this is where things get a little bit more specific in terms of the panel. You can see if I go to the samples table, I've got the genes listed here that I've imported with the specific disorder for that familial breast carcinoma or cancer, and in that case, the 14, high quality, rare, non-benign variants that we're looking at. Nine of those fall within these genes for this panel. And then the last thing, of course, would be the ACMG classifier to say, are these either likely pathogenic or pathogenic? And then the last thing you might check, too, is just to confirm that the quality of this variant checks out. And you can see this insertion for T that we have right here where I could look at the depth of what's present in the BAM file, for example. So, all of that looks good and the other thing, right before I hand this over to Julia, this is obviously the goal to do this, not only for the VCF level data but also for CNVs. 
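The filter chain Darby walks through can be sketched as an ordered list of keep/discard checks, with a running count after each card. This is an illustration only and not VarSeq's implementation or API: the `Variant` fields, the effect labels, the `PASS`/`.` quality convention, and the 1% population frequency cutoff are all assumptions, and only the read depth of 20 and the 3 or 4 star ClinVar rule come from the talk.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Set, Tuple

MIN_READ_DEPTH = 20                       # stated in the talk
MAX_POP_FREQ = 0.01                       # assumed cutoff for "novel or rare"
BENIGN = {"Benign", "Likely Benign"}
KEPT_EFFECTS = {"LoF", "Missense", "Splice"}      # hypothetical labels
KEPT_ACMG = {"Pathogenic", "Likely Pathogenic"}


@dataclass
class Variant:
    gene: str
    acmg_class: str
    effect: str = "Missense"
    filter_flag: str = "PASS"             # VCF FILTER value
    read_depth: int = 100
    zygosity: str = "Heterozygous"
    pop_freqs: Dict[str, float] = field(default_factory=dict)  # empty = novel
    clinvar_class: Optional[str] = None
    clinvar_stars: int = 0


Step = Tuple[str, Callable[[Variant], bool]]


def build_chain(panel_genes: Set[str]) -> List[Step]:
    """Each step returns True for variants that stay in the chain."""
    return [
        ("quality flag", lambda v: v.filter_flag in ("PASS", ".")),
        ("read depth", lambda v: v.read_depth >= MIN_READ_DEPTH),
        ("zygosity", lambda v: v.zygosity != "Reference"),
        ("rare or novel",
         lambda v: all(f <= MAX_POP_FREQ for f in v.pop_freqs.values())),
        ("remove benigns",
         lambda v: not (v.clinvar_class in BENIGN and v.clinvar_stars >= 3)),
        ("effect", lambda v: v.effect in KEPT_EFFECTS),
        ("gene panel", lambda v: v.gene in panel_genes),
        ("ACMG classifier", lambda v: v.acmg_class in KEPT_ACMG),
    ]


def run_chain(variants: List[Variant], chain: List[Step]):
    """Apply the steps in order and record how many variants survive each."""
    remaining = list(variants)
    counts = []
    for name, keep in chain:
        remaining = [v for v in remaining if keep(v)]
        counts.append((name, len(remaining)))
    return remaining, counts
```

Recording the count after every step mirrors the numbers shown on each filter card in the demo, which is what lets a reviewer see where variants drop out without changing the locked chain.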
So very similarly, for my CNV results, I'm looking at any kind of CNV that would be called for the sample as duplication, het deletion, or deletion, and making sure that I'm only looking at high-quality events by selecting that the flag is missing and that I have high confidence in the call, with that low p-value threshold of anything less than 0.001. So, we have a good high-quality CNV here that actually falls within those genes from our panel. And the CNV ACMG classifier is actually listing this as likely pathogenic, too. And again, looking at the quality of any of these events, if I click on the CNV and we browse to it in GenomeBrowse, not only do I look at the signal for the call here spanning these exons in BRCA2, but you can also see the metrics that we use to reinforce the call, with the ratio and the Z score. So, I know a lot of you might have questions about CNVs. We do have a lot of webcasts and tutorials to help kind of guide you through the setup with this, but we'd be more than happy to hop on a call and guide you through that too. So, now that I've reviewed all the quality and confirmed the filtering strategy works, I'm going to hand things over to Julia now so she can handle her review through VSClinical.
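The CNV quality filter described above can be sketched the same way. Again, this is an assumption-laden illustration rather than VarSeq code: the `CnvCall` record and its field names are hypothetical, while the three called states, the "flag is missing" rule, and the p-value threshold of 0.001 are taken from the talk.

```python
from typing import Iterable, List, NamedTuple, Optional, Set

CALLED_STATES = {"Duplication", "Het Deletion", "Deletion"}
MAX_P_VALUE = 0.001


class CnvCall(NamedTuple):
    gene: str
    state: str             # e.g. "Het Deletion" or "Diploid"
    flags: Optional[str]   # None means no quality flag was raised
    p_value: float


def high_quality_cnvs(calls: Iterable[CnvCall],
                      panel_genes: Set[str]) -> List[CnvCall]:
    """Keep unflagged, confident CNV events that fall in the panel genes."""
    return [
        c for c in calls
        if c.state in CALLED_STATES
        and not c.flags                 # flag field must be missing
        and c.p_value < MAX_P_VALUE
        and c.gene in panel_genes
    ]
```

For example, an unflagged BRCA2 het deletion with a p-value of 1e-6 would be kept, while the same event carrying any quality flag would be dropped.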
All right, great, thanks, Darby. So, I've opened up the project from the shared network location and I can see the predefined filtering logic that Darby set up, and that we end up with 1 variant here, our PTEN variant, and also 1 CNV, our BRCA2 het deletion here, that we'll want to take forward and evaluate with the ACMG guidelines. But if I take a quick look at this filter chain, I can notice that it is locked. So, I can't manipulate the filter chain, but I can still look at each filter and understand the logic that landed us on the PTEN variant. So, for example, if I go ahead and click on this "remove benigns" filter card, I can see that the benign variants in ClinVar with a 3 or 4-star status have been excluded. So, after reviewing the filter chain here, I can see that everything checks out and I'm ready to continue the evaluation with the ACMG guidelines, and actually, I want to go ahead and end with our germline variant analysis. So, let's actually go ahead and start with the AMP workflow. Darby had this project ready for me yesterday, so I have gone through and evaluated these variants already, but I do want to review those evaluations before notifying the lab director. So, I'll open up my evaluation from yesterday. And the first thing I can do is review all of the sample-level and patient-level information, and make sure that that's all filled in. Naturally, this information will fill into the report as well. I went ahead and added this BRAF V600E variant. It's a very well-studied variant, so it does a good job of showing the full-scope capability of the AMP workflow. But if I go ahead and start with the genes tab, I have the opportunity to take a look at the NGS coverage summary for my sample, and additionally, I can check to see if there are any failed target regions that we would need to include in the clinical report. I can see here that we are in the green, and so that means all of our genes and targets meet our lab-set minimum of 20X coverage.
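The failed-target check behind that coverage summary can be illustrated with a short sketch. The talk only states a lab-set minimum of 20X; the use of mean depth per target as the metric, the function name, and the target names in the example are assumptions made for illustration.

```python
from typing import Dict, List

MIN_COVERAGE = 20  # lab-set minimum from the talk


def failed_targets(mean_depth_by_target: Dict[str, float],
                   min_coverage: float = MIN_COVERAGE) -> List[str]:
    """Return target names whose mean depth falls below the minimum."""
    return sorted(
        name for name, depth in mean_depth_by_target.items()
        if depth < min_coverage
    )


if __name__ == "__main__":
    # Hypothetical per-target depths; an empty result means "in the green".
    depths = {"BRAF exon 15": 412.0, "PTEN exon 5": 187.5}
    print(failed_targets(depths))
```

An empty list corresponds to the situation in the demo, where there are no failed targets to carry into the report.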
So, I don't have any failed targets to report, but I may still include this information in the report just to kind of show that. Moving on to the variants tab, I can review the oncogenicity scoring for the BRAF V600E variant, and I can see each of the scoring criteria that were applied that result in this +10 score, ultimately an oncogenic classification. All the score criteria and explanations are compiled here in the variant interpretation section. Much of this information is automatically filled in from the oncogenicity scoring criteria. But below are each of the sections that you can go through and really explore the evidence for the recommended criteria. But, for the sake of time, we're going to go ahead and move on, and the next thing that we would want to do is review the potential treatment options. That's where the biomarkers tab comes into play. We can review gene-level information, as well as information on the biomarker itself, and I can see here that the Golden Helix CancerKB is an expert-curated database and it houses interpretations on a gene level, but also on a biomarker level and really facilitates in speeding up the interpretation process. So, as I continue to scroll down, the biomarker analysis wraps up with reviewing potential treatment options for drug sensitivity resistance, prognostic and diagnostic information, really encompassing what we would want to include in our report. Then related is the drugs and trials section. So, if there are any relevant clinical trials for our patient based on these inclusion and exclusion criteria, this can be compiled into the report as well and included for that. So, that's really the last step before we would go on to creating a report. But at this point, I want to go ahead and transition over to the ACMG guidelines, so close that out, and open this up. 
And the first thing that I want to double-check is that my interpretations are being saved to the data warehouse, not only for just my variants but also my CNVs. We can see that these are all my warehouse-based catalogs. So, we are good to go. I'll go ahead and open my evaluation, and so, again, we're presented with the patient information here, but if we scroll down, we can see that I've added the PTEN and the BRCA2 CNV from the project. And so just like in the AMP workflow, we can click on the genes tab and evaluate the coverage for our sample and report on any failed targets, but as we know for this sample, we don't have any failed targets. So, that allows us to go ahead and move on to the variants tab. We can start by taking a look at our PTEN variant. I can see all of the criteria that I've selected. I've selected to report this variant actually as an uncertain significance and that all the interpretation has been collected here. But something I want to mention is I made a few comments into the interpretation that I want to go over before we move on to our CNVs. If we see here for this PM2 recommendation, I can see that we're getting an inbreeding coefficient flag. So, that's something I want to check out. We also have this potential PVS1 recommendation, that's also something I want to look into a bit more as well. But let's start with this PM2 flag here. So, we'll just pop over into that section. According to gnomAD exomes, we can see we're getting that flag here, but if we go ahead and take a look at the 1kgPhase3, we can see that this variant is actually novel, as opposed to it being fairly common here, according to gnomAD, but it may be a false positive, so to capture this and let my lab director know, I've added a comment here for variant quality control that the inbreeding coefficient is < -0.3. 
Moving on to our potential PVS1 strong recommendation, I can see here that this mutation is predicted to disrupt an acceptor splice site, and that there are some nearby pathogenic variants in the same region. However, if I take a closer look at our alignment here, we can see that our alternate T is just one base pair before that canonical splice site. So, really what this means is, to assign this PVS1 strong, I would want this to actually be affecting that canonical site. We can see that it does affect a splice site: all four of our splice prediction algorithms predict that the alternate T disrupts the splice acceptor site. So, instead of adding this PVS1 strong criterion, I will include the supporting criterion PP3. And so, I've made a note here that we're not on that canonical splice site and that RNA validation would be required to confirm that that is the case. And so even still, I do want to track this variant in the event that functional studies are done, which would cause this classification to change. Since everything looks good with our PTEN variant, let's go ahead and take a look at our CNV. I'm sure many of you are familiar with using the ACMG guidelines for SNVs, but the application of the ACMG guidelines to CNVs is a relatively new feature and we have covered the scoring process in-depth in previous webcasts. But let's briefly go through this now. So, again, we have a summary of our CNV here at the top and some information also about the BRCA2 gene. We can see here that I've also selected this CNV to be a primary finding in the report. It is a heterozygous deletion. It was maternally inherited, and I have ended up classifying this CNV as pathogenic. So, all of this CNV-level interpretation is collected here, but we can take a closer look at the scoring and also look at the gene-level interpretation as well.
So, as I'm scrolling down here, we can see the interpretation was collected for the BRCA2 gene, and we can explore this information graphically. But we can also see in the scoring here for Section 2 that this CNV is in a region that has evidence of haploinsufficiency, according to ClinGen. And thus, I did not need to score Section 4. I did score Section 5 as it is a sample-specific section, and so for the sample specifically, I applied a 5D and we can take a quick look at what this means. So, I answered that this CNV is inherited and that the CNV segregates with a consistent phenotype, and I assigned two segregations here as the patient's mother and aunt are also affected with the same phenotype. So that kind of summarizes the 5D score that I assigned there. The last step before we go into generating the report is the phenotypes tab, and so here we can include any clinical notes or case studies or any phenotypes and disorders specific to the patient. But since Darby imported the phenotypes from the sample text file, we can actually see all of this information here on the right, already filled in for our patient. Now that I've completed the evaluation, I can move on to the report. I've generated a draft that can be reviewed by the lab director, so I'll go ahead and close my evaluation and save my project and pass things over to Eli.
All right, thank you, Julia. So, in this particular situation, at the project manager level, I can see that Julia has created an interpretation with one variant and one CNV, so let's just deep dive into the interpretation that was just created. So, in the evaluation section, we can see the sample patient's phenotypic information, which has been entered in a draft report, which we can also see indicated here. We also see that the status of the variant and the CNV is displayed. If we scroll down, we can also see some additional information regarding the variants that were incorporated into the project includes the PTEN variant, which we also have a flag indicated by Julia, and then we have our CNV. Also, from a project management perspective, there's a changelog history that creates timestamps for processes that occur in VarSeq. So, with that said, let's focus on the variants and CNV interpretations that were created for the sample. So, in the variant tab, again, we are looking at the PTEN variant that has evidence for being classified as benign and pathogenic. So, let's briefly start with the pathogenic recommendations. So, in the PM2, we can see that the answer was yes, and that the comment provided by Julia was that the inbreeding coefficient was low, which likely indicates that it may not represent the true population frequency, according to gnomAD. Furthermore, this variant is novel in 1kgPhase3, so we can ultimately agree on these criteria. Next, if we go into the PVS1, the evidence is conflicting in this situation based on the reasonings for both yes and no. So, let's see if there were comments provided by the geneticist. Indeed, we can see that Julia highlighted that the variant does not occur in a canonical splice site and requires RNA validation before answering. So, at this point, we might look to see what the benign criteria have been suggested, or if there are any functional studies supporting pathogenic evidence. 
So, looking at the BP6, we can see that this variant has been previously classified as benign, according to ClinVar, and there is a 2-star review status indicating that there is an agreement between multiple labs. Furthermore, if we wanted to take a look briefly at the functional studies, we can see that there are three assessments, according to ClinVar, all classified as benign. So, ultimately, since the variant evidence supports both pathogenic and benign criteria, the variant should be classified as uncertain significance, which we can see is the interpretation provided by Julia. Additionally, we can also see that this variant has been saved to our warehouse catalog by this interpretation that we're seeing here. So, if the classification changes based on maybe functional studies or ClinVar or even internal information, we will be notified via the warehouse interface. So, now let's take a look at the CNV interpretation that was created. So, looking at the interpretation section, we can see that we have a pathogenic event that is incorporated into the report as a primary finding, with an interpretation created by my colleague, Julia, which again is stored in our warehouse catalog. On the right-hand side, we can also see that the impact of the CNV leads to protein truncation by deleting exons 22 to 24 and there is sufficient evidence for dosage pathogenicity. There is also documentation in the previous CNV tabs that will show previous CNVs for this gene, which we can use as a reference to see if the classification is correct, which in this case it is. If we were to scroll down here, we can also see that there are no flags or comments, which ultimately indicates that this event can be rendered into a clinical report.
So, if we go into the report tab, now that the information has been validated from the lab director's perspective, we can quickly validate that the information in the report is correct by visualizing the PDF that was created by Julia. So, indeed, the PDF has the correct lab, patient, and sample information, which is documented at the top of the report, and we also have the BRCA2 gene being impacted by our CNV event, as well as that individual variant interpretation that was created by Julia. So, this information is rendered automatically from all of the criteria that are answered within VSClinical. So, you can see the interpretation that was created for our BRCA2 event, as well as the interpretation. If we scroll down, we have the Relevance to Patient Phenotype notes on gene evidence for haploinsufficiency. Additionally, we have our PTEN variant that was classified as unknown significance, which is also included in the report. And if we scroll down, we even have coverage statistics regarding our particular sample as well as on a per gene level. We can see for our BRCA2 gene, we have adequate coverage of around 90% for all of the regions over our BRCA2. As we scroll down, we also have the classification system and frequency thresholds that were implemented for classifying our variants, as well as the annotation sources that were used for this particular individual. So, as I mentioned, we do have the ability to, since this information is correct, sign out and finalize the report, which can be done just by the click of a button. It will add my electronic signature to the report, and it will finalize and close this evaluation. So, now that we have a finalized report, the last step is to ensure that the information for the sample is stored in VSWarehouse, that can be integrated into an existing LIMS system. Fortunately, VarSeq and Warehouse communicate seamlessly with a one-click option that connects you to your warehouse server. 
So, here you have the ability to add samples into specific projects as well as look at your previous catalogs, as well as the annotations that can be brought into your VarSeq project for filtering and annotation, but you can also maintain and view this information in a web browser interface. So, here you can get a snapshot of the collective variants that have been added to your database. In this case, we have close to 80 million variants that can be easily queried. You can also regulate who has access to this information by managing permissions. So, in addition to storing your cohort of variants, you can store your projects, reports, and catalogs, which you can see if we scroll down: we have project reports and assessment catalogs. You also have indications for ClinVar changes. So, in this feature, users can identify new variants that have been added to ClinVar with details as to what project the variant is present in. Similarly, this feature will also indicate when a variant has changed classification, which will be valuable for revisiting old sample interpretations. So, in this example, this feature would be valuable, as it would indicate if additional functional studies were added that changed the classification for our PTEN variant. This would also show us the project and catalog for the variant and allow us to properly re-analyze that particular variant. Together, this webcast was meant to provide an illustration of a complete clinical workflow solution that can be implemented with solutions provided by Golden Helix. We realize this demonstration incorporated many functionalities, and we always like to point out that if you have any questions, support is always included with the license, and we are more than happy to follow up in a virtual meeting or point you to the right resources. So, with that said, let's jump back into the PowerPoint.
So, again, we want to mention how grateful we are for grants such as these, which provide huge momentum in developing our software. At this point, I'll turn things back over to Delaina and she will talk about some Golden Helix updates, and then we will go into the question-and-answer period for this project demonstration.
Great, well, thank you to all three of you for preparing that great demonstration and sharing it all with us. And by looking at the questions panel, I know I speak for everyone, so thank you very much. We will give all of you attendees a few more minutes to enter those questions into the system. You can do so in the questions pane of your GoTo Webinar panel. And while we wait, I will cover some exciting events happening here at Golden Helix. So, to start, this webcast is one of many resources our team has made available to the community to learn about our software and also best practices in next-gen sequencing. These come in many forms such as this webcast and then also our eBook library, which has nearly a dozen different topics. Next week, we'll be releasing a new version of our Precision Medicine eBook written by Golden Helix's president and CEO, Dr. Andreas Scherer. We encourage all of you, whether you have downloaded it before or not, to request a copy of this. And if you would like to, you can add a note in the questions pane or the chat panel and request a copy and our team will personally reach out to you with a free copy. And then secondly, I would like to feature our End of Year bundles. As some of you may not know, these were extended into 2021 due to COVID-19 restricting the budget of many labs last year. So, this is a wonderful time to implement a Golden Helix solution if it's something you're considering. Listed here are the software packages that we are offering. And then the number on the left represents how many bundles remain of each of those. A few were snagged up last year, but we do have some remaining in all different categories. So, if you're interested, I highly advise reaching out to our team as soon as possible. These will be going through February 15th. So, let us know if you're interested. We'll put your name down on one of those and we will follow up with you regarding the details.
And just for ease, I will go ahead and throw a link to these bundles in the chat pane. You should have seen those just pop up, so you can go take a look and learn more details. We’d also like to remind our community of the Covid-19 Remote Assistance Program. So, this program allows our customers to receive an additional licensed installation for those that are still working remotely. So, you can use your analysis wherever you may be. If this is something you would like to request, you can simply enter that into the questions pane as well, and we'll follow up with you privately. All right. Then moving on next is our Abstract Competition. It's currently underway. We're looking for everyone involved in the NGS space to share how you are, or plan to use, Golden Helix software for the chance to win some great prizes. Your abstract does not need to be published- just looking for a summary of your involvement in the space and how our tools would be helpful for you. There is a chance to win a free VarSeq or SVS license, a Dell Latitude 5000 series laptop, and the opportunity to present your work to Golden Helix's community through a webcast like today's and a blog post. And we encourage all of you to participate. Like we said, if you are currently using our software or plan to, just hearing the story there, this competition will be ending on March 9th. So, go ahead and put together an abstract and send that to email@example.com, or anyone on the team would be happy to enter you in that competition. And lastly, if you're new to our Webcasts or haven't joined us for a while, our team has used our research platform to publish a number of Covid-19 related publications. Starting on the right, we're very excited to share with you the pre-release of a new article currently in the pre-release status. 
Over the course of 2020, we worked with a clinical lab in Germany to do the analysis of 46,000 samples breaking down the population structure for the genomic variability among those samples. So, we're very excited to be able to share this upcoming publication with you. Again, I will pop one more link into the chat panel for anyone who might be interested in reading this. There you go. Then on the left are interviews and articles by Golden Helix president and CEO, Dr. Andreas Scherer, just discussing the use case of our platform in the space, and then on the bottom, we have also published our work in two journals already using some of the statistical analysis features available in our SVS platform. So, if you're doing work in this space and would like to discuss these capabilities, you can reach out to our team and we'd love to show you on a more individualized demo. But again, very excited to be able to share that pre-release. All right. So, lots of exciting stuff happening. Lots of things to check out and get a hold of our team if you're interested. But now we'll move back to the Q&A if our team is ready.
Definitely. All right.
So, question 1: are all the features demonstrated in this webcast sold together or separately?
So, I think I can answer that one, so, yeah, essentially there is an option to sell these- each product item- sort of individually, but it really depends on the license. So, obviously, you get the most bang for your buck if you do the full stack, which would be VarSeq VSClinical. So, you have the ACMG and AMP guidelines. You also have VarSeq CNV for CNV analysis, and then of course, VSPipeline and VSWarehouse as well. That would kind of be the most financial benefits there for you. But you can get any of these individually as well.
Thank you. Next question: is the VarSeq warehouse software stack available in an offline configuration?
So, Golden Helix realizes that confidentiality is key. In addition to VarSeq being a locally downloaded application, we do support offline activation for all products, including VSWarehouse. If this is of further interest, we'd be more than happy to discuss how this capability has been implemented with some of our other customers, and you can always reach out to us in an email, or just contact your sales representative. But thanks for your question.
Great, thank you, Eli, and then Darby maybe we'll throw one your way: can we leverage our cohort data and warehouse to manage and eliminate false positive variants?
Oh, yeah, I know that was something that we briefly talked about, but we probably didn't show it as much, I can answer that, and I think this might answer another question on the list, too, asking about the scale of how many variants that we loaded up into Warehouse and how did we get that amount if we're only doing what sounds like a couple of variants with the catalog? Well, essentially, Warehouse itself is kind of storing a lot of different types of data, and one of those types of data, and I can show my screen here. You guys can see that okay? Hopefully? All the different types of data that we have in the projects, or up in the Warehouse server, are comprised of not only the individual variant submissions that you would get from, like the ACMG review, but also the entire cohort data itself. And so, you can basically take the raw set of all variants seen across all samples, which scales from gene panel up to whole genome, obviously, to the 80 million scale. But the idea is that that final resting place for all that data, you can still take either the high-quality variants or all of the variants. If you kind of want to manage artifacts, for example and leverage it back in your workflow. So, if you wanted to add any of these as an annotation, I could take what would be essentially some of my project data, add as an annotation, and you would see those fields come into the project. Let me find the specific field. Oh, I'm looking at CNVs, that's why. Let's go back to the variant table. You would see all of that cohort allele frequency data come into the project, and I could utilize any of these in the filter chain as well. So, for example, if I were to go back to my variant filter and I were to go to remove common, one of the additional filters I might add to this workflow could be my own cohort allele frequencies. If I go to add filter here, I can grab on my own internal allele frequency and set my thresholds in place there, so maybe less than 1%, for example. 
So yeah, absolutely. And if this is something you want to talk in more detail about the scale and the process of uploading that data, we can always do it on a one-on-one session. So, I hope that answers your question, but let me know if there's anything else.
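As a rough sketch of the internal allele frequency filter Darby describes, the logic looks something like the following. The function name, the 1% default, and the choice to keep variants that have never been seen in the cohort are assumptions for illustration; this is not how VarSeq or VSWarehouse expose the filter.

```python
from typing import Optional


def passes_cohort_filter(cohort_af: Optional[float], max_af: float = 0.01) -> bool:
    """Return True if a variant is rare enough in the lab's own cohort to keep.

    cohort_af is the internal cohort allele frequency, or None when the
    variant has never been observed in the cohort (assumed here to be kept).
    """
    if cohort_af is None:
        return True
    return cohort_af < max_af


# Made-up frequencies: a recurring artifact, a rare variant, and a novel variant
examples = {"recurrent_artifact": 0.35, "rare_variant": 0.002, "novel_variant": None}
kept = sorted(name for name, af in examples.items() if passes_cohort_filter(af))
print(kept)
```

A variant seen in a large fraction of unrelated samples in your own cohort is more likely to be a sequencing or alignment artifact than a rare disease variant, which is why a cohort frequency cutoff can help manage false positives.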
Alright, Thank you, Darby. Does Warehouse integrate with existing LIMS systems?
I can take that question. Warehouse is a fully customizable feature, which ultimately means it can be integrated with existing LIMS systems. So, this allows users to submit information seamlessly with electronic health record systems or those that are similar, and it prevents the need to outsource this functionality. And so, again, if this is of interest to you, our development team is composed of a lot of experts that can easily accommodate your particular lab needs.
Thank you, and then we have one more question before we wrap it up: someone's asking the difference between VarSeq and VSClinical.
That might be in regard to pricing, maybe, perhaps? I'm not sure if that was the basis of the question.
Well, I mean, essentially the difference between VarSeq and VSClinical is the access to the ACMG and AMP guidelines. So, in just VarSeq, without VSClinical, you can access and filter variants with a number of publicly available annotation sources. But really, if you want to evaluate variants clinically and apply those ACMG and AMP guidelines, you'll want to go ahead and add the VSClinical portion to the license as well.
And the Area Directors would be more than happy to give you a breakdown of what the differences are if the cost is an issue, and then we can always hop on a call and demo any of those capabilities one on one.
Great. Well, thank you, everyone. Unfortunately, we have run out of time for today's call, but we still have a lot of great questions to answer. We will follow up with all of you who went unanswered today, and we'll also be sending out a link to today's webcast with a recap of those questions in the OnDemand recording. So, in just a moment, you will be shown a short, 2 question survey about today's webcast that we would very much appreciate all of your feedback on. And at this time, I would like to thank you, Eli, Darby and Julia, for this great presentation. It was great to have you on here today.
Thanks to all of you for attending. Hope to see you again on our next webcast in February. Have a good day.