*Please note that you may experience errors in the below transcript, therefore, we recommend watching the video above for full context.
This webcast will be introducing our brand-new product release, VSClinical's AMP Guidelines: a comprehensive workflow for NGS testing of cancer. This will build on VSClinical’s capabilities that are already in VarSeq, but really expand it to the space of gene panel testing in oncology and labs that are oriented around providing services there - it's a big topic.
Before I give a little more background, let me make sure to acknowledge that some of this work will be funded by the NIH grant awards that you see here. We're extremely grateful to the NIH for supporting us through these SBIR awards. The research reported in this publication was supported by the National Institute of General Medical Sciences of the National Institutes of Health. Our PI is Dr. Andreas Scherer, who is our CEO. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. We realize that we would be in a great position receiving just one of these awards, so to receive as many as we have, we feel extremely privileged, and we express gratitude to the NIH for that continued support.
Now, who are we? Let me give you a quick background in case you haven't heard of us before. Golden Helix is a bioinformatics company that was founded in 1998. We have a number of products spanning the clinical space, which we'll be talking about today, where we're providing tools for filtering and annotating NGS data as well as supporting advanced workflows and different types of clinical reporting, like we'll talk about on the cancer side. We also have the ACMG guidelines workflow, oriented around reporting hereditary gene panel type results. And we have a place to store all this data over time, and that's our VSWarehouse. This is a product we won't talk about too much today, but it is the natural place to store all of the raw sample data, as well as all called variants, and even the analysis results and the interpretation results, so that you have a single centralized and shareable location, as well as an integration point for other types of lab connections. As a company, we've been around since 1998, and for a lot of that time we have been serving the research market as it has evolved from doing things like genome-wide association testing to working even outside of humans, in the plant and animal space, and doing the large statistical test work there. So there are a number of capabilities in our SNP & Variation Suite supporting robust statistical analysis of genomic data, but the shared theme here is that we are a company that really does work on supporting all aspects of analyzing genomic data. All of that has been validated throughout the years by having our customers cite us in thousands of peer-reviewed publications, including in a number of these prominent journals, and those customers span the whole gamut from industry to government to academia. So we have folks in the pharmaceutical industry and commercialized genetic testing companies, anything from small labs to larger labs and even up to national institutes.
So as a company, what we do is provide more than just software. We have over 20,000 users and 400 organizations, and that gives us a large base to pull feedback from and integrate it into the software development process. This means that we also give back in terms of publishing the best practices that we have learned and aggregated across that wide set of use cases, and giving webinars like we're doing today. And then we have a very simple business model: instead of charging you on a per-sample basis, we work on the model of just setting up per-user subscriptions, and then you can run an unlimited number of samples and have unlimited training and upgrades, and all of the support involved in making sure that you have success in your ventures. And then, of course, this has been a very highly validated solution with thousands of publications and all of those use cases being supported.
All right, so that's about the background. Now, let's talk a little about the context of the solution we'll be presenting today. When you provide an NGS gene panel clinical test, the output of the sequencer will be some raw data that will go through a secondary analysis process to align and call small variants. So that will be the BAM and VCF files that we see up top. Now, that BAM file can also be used to call small CNVs up to larger CNVs directly off of the target capture regions that are part of that gene panel kit. VS-CNV is a very advanced method by which those small CNVs, down to a single exon level, can be detected with very high accuracy and with a lot of supporting data, to give you the confidence to put those into clinical reports alongside the VCF variants. So now those VCF variants, which are small SNPs and indels, along with the CNVs, can be integrated into a variant annotation and prioritization workflow to get down to the candidate variants: the high-quality CNVs and the high-quality, potentially damaging variants that are not super common in the population or have some likelihood of impacting the outcome of the patient. But there's still going to be an evaluation process that needs to be used, whether it's to assess the pathogenicity of a germline variant or the impact of a somatic variant that has the potential to cause cancer or be a driver. And to do that, we incorporate best practice workflows from places like the ACMG guidelines and the AMP guidelines and build very specialized workflows that integrate all of the scoring, criteria, and methodology from those published guidelines into a process that allows for highly repeatable and accurate following of those fairly complex workflows, all the way to the point of taking a variant from a candidate to a well-described, annotated, interpreted, and reportable variant and then putting that into a clinical report context. The output of that can be a Word report, which can also be converted to a PDF report.
And, of course, all of this rich annotated data can be looked at in tabular form in the Excel file. So that's a little bit about where VSClinical fits in this pipeline. We're generally starting with some level of filtered variants, and we need to follow these specific guidelines. Last year, we released the ACMG guidelines workflow, and what we're talking about today is supporting a cancer-oriented workflow. Now, to understand why we go through the steps we go through and what the final outcome is, let's talk about NGS testing for cancer and the outcomes and considerations of a cancer-oriented gene test.
Well, the reality is that cancer is understood at a patient level by understanding the underlying biology. That biology is going to be driven by genetics because cancer is a disease of the genome. To best understand the biology of cancer, this seminal 2011 paper here, Hallmarks of Cancer, really dug into the different cellular mechanisms and drivers of tumorigenesis and the proliferation of metastasis. These have, I think, now been expanded in this updated catalog from COSMIC to these 10 Hallmarks here. These are the names of the Hallmarks, and this is an example of a scorecard for a given gene, where they describe these 10 Hallmarks for PTEN; P stands for Promoting and S stands for Suppressing. You can see that a given gene, when mutated in the context of cancer, can promote one of these Hallmarks. And, generally, more than one of these Hallmarks is involved in an individual patient's tumor. Now, understanding that gene mutation and understanding those specific drivers for that cancer in that patient opens up new opportunities, given the ability to use targeted molecular therapy and other more precise ways of treating that cancer than just a blanket standard chemotherapy. And so that's where NGS testing really allows us to provide personalized medicine that can address these Hallmarks. Now, generally what that means is that for an oncogene, which is a gene that promotes one of these Hallmarks (turning on proliferation signaling, for example, might be an oncogene Hallmark), you are looking for a gain of function of that gene. This can often be detected in different biomarker assays, and we'll talk about those in a second. That is in contrast to a tumor suppressor gene such as TP53, which performs a function that prevents cancer, such as guarding against genome instability.
So, basically, p53 is constantly looking to repair the genome, and once it has been removed, or there is a loss of function of TP53 as a gene or p53 as a protein, you are going to incur lots more mutations, which will eventually lead to the types of mutations that drive other Hallmarks of cancer. So, in this case, these different types of specific understandings allow you to use much more targeted therapy, as well as have a better understanding of the prognosis of that patient, and potentially even diagnose types of tumors that are less easy to discern.
So all this can be done by essentially using lab biomarkers, and quite a few of those biomarkers can be tested using an NGS gene panel. That's why we start to see the proliferation of these kits coming from Illumina and Ion Torrent, oriented around the 50 genes or 150 genes that are the most commonly mutated oncogenes and tumor suppressor genes, which allow us to assess whether these specific biomarkers are present and what their impact is for the care of the patient. In general, a biomarker is just any sort of state or measurement that can provide an indication for treatment or for prognostic or diagnostic outcomes. This can range from looking at just the presence or absence of specific proteins, to looking at antigens, which are then responding to antibodies in the system, to specific genomic states or attributes. So here are a couple of really commonly tested biomarkers.
So you look for HER2+ in lung cancer, and actually most commonly in breast cancer, and you're looking for that HER2 receptor protein here. Some of these are going to be coming through just standard immunohistochemistry tests; other ones, like MSI status, are often going to be done with just a PCR. But most of these other ones are going to be done with genomic testing: for example, looking for the presence of V600E in BRAF, or the CNV amplification of ERBB2, or a BCR-ABL1 fusion. Now, some of those can also still be done by PCR testing, but overall, if we can simplify our workflow and have one test to cover the vast majority of these, that's going to be both economical and give us the most precision for describing the state of the genome in the context of a given patient. One thing to keep in mind is that even just the absence of mutations can be a biomarker.
For example, a TP53 wild type may be a significant biomarker for specific types of cancer in which a TP53 mutation is the most common form of tumorigenesis. So if TP53 is always mutated in, say, colorectal cancer, and then it's not mutated in the context of a given patient, that means some other pathway is being activated, which may mean the most standard treatment might not apply and some other treatment might be considered. So if we have all these biomarkers coming in from our NGS test, which ones are reportable, and what's the threshold for both reporting one and describing the confidence of the evidence for that biomarker? Well, here is where we have the help of the AMP guidelines. This paper, Standards and Guidelines for the Interpretation and Reporting of Sequence Variants in Cancer, is a vast survey of best practices that have already been established in the industry on what to report and how to report it, as well as an evidence tier system for reporting the quality of evidence for the biomarker that you're considering. Now, that quality of evidence is in the context of a given patient's tumor type.
So while there might be strong evidence for, say, BRAF V600E in melanoma, there may not be as much evidence on how to report that in, say, non-small cell lung cancer, or it might simply be a different tier of evidence, because the applicability and specificity of the studies, as well as the clinical trials and the FDA-approved drugs, differ in the context of each tumor type. So we can detect our variants using NGS testing. We can classify those variants as activating mutations for oncogenes. These will often be missense mutations, things like BRAF V600E, where a single amino acid change results in the activation or gain of function of a protein; they can also be things like CNV gains and fusions. Fusions are often one oncogene being turned on by being essentially spliced together with another gene, often through genome translocations.
We can also have the loss of a tumor suppressor gene. That might be from a loss-of-function variant, like a nonsense or frameshift that truncates the protein and essentially disables one copy of it. It might also be missense changes that are extremely damaging, or CNV losses. Missense changes are often a little less easy to evaluate here. You can imagine, for example, that BRCA1 and BRCA2 have a lot of missense mutations in the population; some of those damage that tumor suppressor gene and some of them do not, so you need things like a scoring system to dig into the details of whether a missense mutation really does damage a tumor suppressor gene. Once we have those high-confidence variants that we are confident are activating mutations or inactivating mutations, then we need to decide how to report those and go through this tier-level evidence rating for all potential clinical evidence. This is things like active clinical trials, previously published papers, and FDA labels for drugs that mention an indication for use specific to a genomic biomarker, and there are now a number of drugs with that level of specificity. Here's a quick review of those evidence tiers and how they're broken down as evidence levels A, B, C, or D, which get you into Tier 1, strong clinical significance, or Tier 2, potential clinical significance. Outside of that, we basically have an absence of evidence, which will get us into the variant of unknown significance category. And if you have high frequency in germline population catalogs and no evidence, we get into the benign category. In this Tier 1, we're basically looking for extremely well-validated evidence: things like FDA-approved therapies that, as I mentioned, mention a genomic biomarker in their indication for use, or professional guidelines, like the NCCN, that specifically mention a course of action based on a genomic biomarker. That's for therapeutic response.
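To make the tier breakdown concrete, here is a minimal Python sketch of how evidence levels could roll up into a tier. This is an illustration of the published AMP tier structure, not VSClinical's implementation; the function name `amp_tier` and the `common_in_population` flag are hypothetical, and a real evaluation involves manual review of each piece of evidence.

```python
# Hypothetical sketch: roll AMP evidence levels up into a reporting tier.
# Levels A and B map to Tier I, levels C and D map to Tier II.
TIER_BY_LEVEL = {
    "A": "Tier I (strong clinical significance)",
    "B": "Tier I (strong clinical significance)",
    "C": "Tier II (potential clinical significance)",
    "D": "Tier II (potential clinical significance)",
}


def amp_tier(evidence_levels, common_in_population=False):
    """Return the tier for a biomarker given its evidence levels.

    evidence_levels: iterable of "A".."D" letters, one per piece of
        clinical evidence, already assessed for the patient's tumor type.
    common_in_population: True if the variant is frequent in germline
        population catalogs (assumed to be decided upstream).
    """
    levels = list(evidence_levels)
    unknown = [lvl for lvl in levels if lvl not in TIER_BY_LEVEL]
    if unknown:
        raise ValueError(f"unknown evidence level(s): {unknown}")
    if levels:
        # "A" sorts before "D", so min() picks the strongest evidence.
        return TIER_BY_LEVEL[min(levels)]
    if common_in_population:
        return "Tier IV (benign or likely benign)"
    return "Tier III (unknown clinical significance)"
```

For example, `amp_tier(["C", "A"])` is driven by the level A evidence, while a variant with no evidence at all falls into Tier III or Tier IV depending on its population frequency.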
There are also ways in which prognostic or diagnostic evidence can reach level A or level B as well. Level C starts to get into the category of, well, there are FDA-approved therapies, but they are for a different tumor type, or maybe there's just a number of small published papers, but not enough to meet level B or A. So this is essentially a manual process of evaluating all this information, but we can automate a lot of the work to get to this point, as well as curate a lot of the data sources used to make this evaluation. That's where the VSClinical AMP workflow comes in. So the starting point is: which variants are you going to report, and how are you going to report them?
So your results can go in as a reportable biomarker, but if you detect a germline variant in your NGS assay, you may consider reporting that as a secondary germline finding. And if a variant doesn't meet the criteria for being reported, but there is some level of evidence for it and it's not in the benign category, it might go into the variants of uncertain or unknown significance in your report. The other information that you might want to include in your report is summary information about the coverage and quality of the assay: essentially the average coverage, as well as regions that are not meeting the target threshold. And finally, any references that were used in the description and interpretation of these different biomarkers and secondary germline findings. So constructing that report involves evaluating which variants to place in there. For the variants that get reported as biomarkers, we're classifying the evidence following those AMP guidelines to get into A, B, C, or D and Tier 1 or 2, and then, of course, the work to do that interpretation should be saved so that it can be reused. And the saving is not just about saving it at a variant level, but potentially at a gene level or an exon level. So, for example, if you are describing a damaging variant in TP53, a truncating variant in TP53, it doesn't matter exactly which specific nonsense or frameshift variant it is. Once you have evaluated that a variant is damaging, the same interpretation can be reused, and we allow you to set the scope of the saved interpretations going into your interpretation knowledge base to be, for example, all loss-of-function variants in TP53. So the next time you see any such variant, you can reuse that same interpretation, including the clinical evidence, the drug and therapy evidence, et cetera. Now, just adding variants in and of themselves, for example with BRCA1 and BRCA2, can be complex, because you might not be sure about the impact of that variant.
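The idea of scoped, reusable interpretations can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Interpretation` record, the scope names, and the rule that the most specific scope wins are all hypothetical and are not taken from VSClinical itself.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Interpretation:
    gene: str
    scope: str               # "variant", "loss_of_function", or "gene" (assumed names)
    variant: Optional[str]   # only set when scope == "variant"
    text: str


def find_interpretation(kb: List[Interpretation], gene: str,
                        variant: str, is_lof: bool) -> Optional[Interpretation]:
    """Return a saved interpretation that applies to this variant.

    Assumption: scopes are tried from most to least specific, so an exact
    variant match beats a loss-of-function match, which beats a gene match.
    """
    for scope in ("variant", "loss_of_function", "gene"):
        for entry in kb:
            if entry.gene != gene or entry.scope != scope:
                continue
            if scope == "variant" and entry.variant != variant:
                continue
            if scope == "loss_of_function" and not is_lof:
                continue
            return entry
    return None


# One interpretation saved with the scope "all loss-of-function variants in TP53".
kb = [Interpretation("TP53", "loss_of_function", None, "Truncating TP53 variant ...")]
```

With that single saved entry, any later nonsense or frameshift variant in TP53 (anything passed with `is_lof=True`) picks up the same interpretation, while a TP53 missense variant does not.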
So we have autoscoring built in for both the somatic and the germline context. These use different scoring systems: in the germline context we're using the ACMG guidelines, and in the somatic context, we worked with a number of different stakeholders and, in contact with the VICC GA4GH group, developed our own oncogenicity scoring system, which we covered in another webcast. Then we want to integrate all this into a report that can be comprehensive, consistent, and very efficient to build. So there's one more thing to display before I give you a demonstration of all of this fantastic workflow, and that is how many different pieces of data we had to curate to bring into this analysis. It is by far the vastest integration of data that we have accomplished here at Golden Helix, and we have been known to be very comprehensive in our data curation and are often picked as a solution because of the breadth of that curation. So with your VSClinical AMP workflow, you will be getting these other interpreted data sources, and some of these are commercially licensed and specific to this workflow. COSMIC is a fantastic catalog of somatic variants with publications associated with each variant, or most variants. They've also included genome-wide somatic catalogs as well. But they also have a number of other data sources that we're integrating, including a specific survey of fusions associated with cancer; the gene census, which provides a lot of high-level descriptions about a gene, including whether it's an oncogene or tumor suppressor gene, and we display that information; and their Hallmarks of Cancer survey, which is an ongoing project in which they are annotating each gene involved in cancer and describing how those ten different Hallmarks are being suppressed or promoted by that specific gene.
And this is fantastic background information when you're looking at a gene to understand what you expect to happen if a mutation is an activating mutation in that gene or a loss-of-function mutation in a tumor suppressor gene.
Outside of that, we have a number of sources to provide that critical background information to make those tier assessments about the clinical evidence for therapeutic, diagnostic, and prognostic outcomes. One that is specific to the AMP workflow is DrugBank. This is commercially licensed information from a fantastic resource that describes in a structured way the function of a drug as well as its indications for use and its label information from the FDA and other labels, as well as the specific genomic targets and gene targets for that drug. So we have that information integrated and used as a way of giving you high-quality information on which drugs are FDA approved for use in a given gene. We also integrate CIViC at a number of different levels. There are annotations in CIViC down to the variant level, but there are also annotations at the gene level or the exon level describing, with star ratings, the strength of the evidence, often for things like publications but also for things like FDA-approved drug indications. And we integrate PMKB as well, which is a fantastic resource of interpretations at a fairly high level, often at a gene level but also for specific biomarkers, often describing therapeutic evidence.
Now, along with these sources, there's something new here that we are introducing, and that is Golden Helix CancerKB. As we were doing this, we realized that quite a lot of the interpretations would be reused, and there are the most common cancer genes that you will want to have some background interpretation for all the time. So we took it upon ourselves to gather a team of expert curators and professionals in the space, working in a clinical context, to aggregate a lot of this source material and write these interpretations for the most common genes and most common biomarkers in the most common cancers. We consider this an ongoing process, but we are releasing the beta version of that, and it will be integrated right into here. What this will mean is that you will add a variant to your report, and if it is in CancerKB, even if you haven't interpreted it before, you will get a jump start on your interpretations, often with gene-level and sometimes biomarker-level and even targeted therapy information about that gene and biomarker. And if you would like, you can contribute your own interpretations back to us anonymously, and our professional team will integrate all interpretations on a regular basis back into the knowledge base. So it is an ever-improving resource. Part of our interpretations, as you will see, is detailed citations to all of the papers. We make it extremely easy to reference citations, and those get integrated directly back into your clinical report, and I'll demonstrate what that looks like. Now, I could spend a whole lot more time describing all of the genomic annotation parts here, but I just wanted to reference the fact that we are bringing in COSMIC, ICGC, MSK-IMPACT, and TCGA to give you these variant-level views that show you how often those variants occur in these different somatic catalogs.
We're also bringing in these other resources, protein domains and gene-level resources, to help you write those interpretations about a given gene in the context of cancer. All this will be very richly displayed inside the user interface, so there's no more point in describing it. Let's go ahead and show it, and let's start by going into a VarSeq project, getting oriented, and going through the process of reporting a given biomarker in a given patient.
So what you see here is VarSeq 2.1.1. We just released this yesterday, so you're welcome to download it and evaluate it for this workflow. Let's go ahead and open this project for a TruSight Myeloid panel, and you'll have to forgive the fact that these are essentially reference samples. These are samples that intentionally have a whole lot of mutations in them, so they don't represent clinical samples, but they're great for testing the filtering and interpretation workflow. What we have is a standard kind of filtering process here, where I don't go through a whole lot of, let's say, hard filters. I just want to get down to a list of variants that are potentially impacting the gene and not clearly benign. So what I did is I looked at where the variant allele frequency, which is computed automatically from the depth information in the VCF file, is greater than 1%. I'm also using my gene annotation to include all loss-of-function or missense variants. We also have this other bucket of variants that are intronic or exonic or UTR, et cetera, and some of those might be of interest. To keep the ones of interest there, I'm including any variants that potentially disrupt splicing, so some of those might be different from the ones that are just clearly missense or loss of function. So that's basically a "do you impact the gene" filter. And then I'm also removing variants that are greater than 1% in my gnomAD track; that is, I want to keep variants that are less than 1% there. You can see I'm using the updated gnomAD 2.1.1 here.
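The filter chain just described can be written out as a short Python sketch. This is not how VarSeq implements its filters; the `Variant` record and its field names are hypothetical stand-ins for the annotations mentioned above, and the 1% cutoffs are the ones used in this demonstration.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    gene: str
    effect: str              # e.g. "LoF", "missense", "intronic", "UTR" (assumed labels)
    ref_depth: int           # reads supporting the reference allele
    alt_depth: int           # reads supporting the alternate allele
    disrupts_splicing: bool  # from a splice-site prediction annotation
    gnomad_af: float         # population allele frequency, 0.0 if absent


def vaf(v: Variant) -> float:
    """Variant allele frequency computed from the depth information."""
    total = v.ref_depth + v.alt_depth
    return v.alt_depth / total if total else 0.0


def passes_filters(v: Variant, min_vaf: float = 0.01,
                   max_pop_af: float = 0.01) -> bool:
    # "Do you impact the gene": LoF or missense, or anything else
    # (intronic, UTR, ...) that potentially disrupts splicing.
    impacts_gene = v.effect in {"LoF", "missense"} or v.disrupts_splicing
    # Keep variants above the VAF cutoff and below 1% in the population.
    return vaf(v) > min_vaf and impacts_gene and v.gnomad_af < max_pop_af
```

Usage would be along the lines of `candidates = [v for v in variants if passes_filters(v)]`, leaving a short list of variants that potentially impact a gene and are not clearly benign.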
Just for the sake of looking at some interesting variants, I'm filtering down to ones that are in COSMIC, and these are just the three. Again, there are a lot of variants in here intentionally, but these are the three with a very high sample count, so the most common variants in COSMIC. So that's the process, and if I click on a given variant, I can view it in GenomeBrowse. This is that fantastic genomic visualization where I can see the coverage information and where the target is for this gene, as well as any other QC information one might want to use to evaluate those variants. Of course, I can visualize this coverage information here, but I will actually integrate that coverage information into the report and into the clinical workflow. We do that by essentially running a coverage algorithm as part of my project, which describes the targets for this panel, and you can have all this information in table form as well as in summary form at the sample level. So I can see all this information at the sample level, but I actually prefer to look at this now in our new cancer workflow. So let me describe how our cancer workflow integrates this.
The first step of any clinical workflow is just deciding whether the sample meets the thresholds designed by that test to be reportable, and often if it doesn't meet those thresholds, you just rerun the sample, or you try to get a better extraction of DNA. So in my cancer AMP guideline workflow here, which is distinct from the ACMG guidelines workflow that we've already had, which is a germline workflow, I have some information about the sample that has been entered, and I also have some information about the current tumor diagnosis, and I have that set to melanoma for this sample. Then we can see the summary information visualized in a more graphical form. This gives us a sense of the coverage that the sample was run at, as well as the number of targets and hotspots that are passing our thresholds, as well as information about the allele frequency of the variants detected. A lot of these are somatic variants, and so they have an allele frequency of less than 30%. Most of those are SNPs. We have some insertions and deletions as well. Along with the summary information about the test and the base coverage statistics, we can also display information about the target regions. So in this case, I can see that there are target regions across the different genes of this panel, and we can see those genes include things like BRAF as well as a number of other cancer genes, and I can set the desired average target coverage here. I want to set that to 10,000 in this case. Let me just do 1,000; that's a little bit too high. And so I can see that I have a number of targets that failed that, 38 targets that failed that desired target coverage, and I can select those targets to be included in my clinical report. I can also define different hotspots. These are essentially regions smaller than a target that we must be able to call, so we must ensure that we have coverage over them.
So, for example, for melanoma, I want to make sure that the 600 position is covered, and I'd like to see that it's covered at very, very high read depth here. And if I zoom into the genes, I can actually start to see per-target information: which exons are covered at which level and which exons are failing that coverage, as well as our individual hotspots. My hotspots for melanoma are covered sufficiently, but you can define different hotspots for different cancer types. So if you had a different cancer type here, you could have a different hotspot that you wanted to ensure was covered.
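As a rough illustration of this coverage QC step, the Python sketch below flags target regions and tumor-specific hotspots whose mean depth falls under a chosen threshold. The function names, the region labels, and the idea of passing depths in as plain dictionaries are assumptions made for the example; in VarSeq these values come from the coverage algorithm run in the project.

```python
from typing import Dict, Iterable, List


def failing(depths: Dict[str, float], threshold: float) -> List[str]:
    """Names of regions whose mean depth is below the threshold."""
    return sorted(name for name, depth in depths.items() if depth < threshold)


def coverage_qc(target_depths: Dict[str, float],
                hotspot_depths: Dict[str, float],
                tumor_hotspots: Iterable[str],
                min_target_depth: float,
                min_hotspot_depth: float) -> Dict[str, List[str]]:
    """Summarize failed targets and failed hotspots for one sample.

    tumor_hotspots lists only the hotspots relevant to the patient's
    tumor type; a hotspot with no depth recorded is treated as uncovered.
    """
    relevant = {h: hotspot_depths.get(h, 0.0) for h in tumor_hotspots}
    return {
        "failed_targets": failing(target_depths, min_target_depth),
        "failed_hotspots": failing(relevant, min_hotspot_depth),
    }


# Illustrative values only: a 1,000x target threshold as in the demo.
report = coverage_qc(
    target_depths={"BRAF_exon15": 4200.0, "TP53_exon4": 640.0},
    hotspot_depths={"BRAF_V600": 5100.0},
    tumor_hotspots=["BRAF_V600"],
    min_target_depth=1000.0,
    min_hotspot_depth=1000.0,
)
```

The failed target list is what would be carried into the coverage section of the clinical report, and swapping `tumor_hotspots` for a different cancer type changes which positions are checked.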
So the next tab in my workflow here is the mutation profile tab, and this is the place where you add in the different types of variants (small variants, CNVs, fusions, and wild types) that you want to evaluate for your clinical report. Not every variant has to go into the clinical report, and there are different sections of the report that a given variant can go in. You can see I've already added a BRCA1 loss-of-function variant that I suspected was germline because of its allele frequency, and it was in a heterozygous state, and that's already been classified as pathogenic using the ACMG classification system. So I want to report that variant in the secondary germline section of my report. That's different from the biomarker section, where somatic variants will generally go. If it didn't meet a pathogenic classification, I might report it as a VUS, or if it met a benign classification after evaluating it, I might just not report that variant but still include it in my evaluation here, because it was maybe coming through my filters.
So let's go ahead and add another variant to the sample. We're going to add this BRAF V600E variant, and we suspect this is somatic; you can see the variant allele frequency here as I add it. Now, given it's a somatic variant, it's following a different classification system, the oncogenicity classification system, and it's been automatically scored as oncogenic. There's a lot of detail behind that system, and I'm not going to have time to go through all of that today. We might just get a chance to get a preview of it. But that is an automatic step that can be done for you, and you are probably already very certain at this point that we want to report it as a biomarker, and we can actually see that it already hit a couple of our annotation sources, showing up in ClinVar and CIViC and dbSNP as well. So when I add this variant as a biomarker, I have a row in my variant list, but now I also have an entry in my biomarker list. This biomarker list can include variants, but it might also include CNVs or fusions or wild types, and these are now describing potentially different scopes of things being evaluated. In this case we're going to keep V600E as the biomarker scope. But a TP53 loss-of-function variant might be changed to be described as just a TP53 loss-of-function variant, as opposed to the very specific variant. The evidence here, though, is being filled in from my interpretation database. When you set up VSClinical for the first time, you define an interpretation database, and that is the place where all of the work that you do while evaluating the significance of a biomarker gets saved and reused. Now, you have essentially two interpretation databases: your internal one, which will always take precedence, and Golden Helix CancerKB, which will be filled in if you don't have your own internal information about a given gene or biomarker.
And in this case, BRAF V600E is in CancerKB, and its drug sensitivity clinical evidence was rated at Tier 1A, so we essentially already have great detail about this specific biomarker.
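The precedence between the two interpretation databases can be pictured with a small Python sketch. The dictionary-based stores, the `(gene, biomarker)` key, and the function name `lookup_interpretation` are hypothetical simplifications for illustration; they are not the actual storage format of either database.

```python
from typing import Dict, Optional, Tuple

Key = Tuple[str, str]  # (gene, biomarker), an assumed key shape


def lookup_interpretation(key: Key,
                          internal_kb: Dict[Key, str],
                          cancer_kb: Dict[Key, str]
                          ) -> Tuple[Optional[str], Optional[str]]:
    """Return (interpretation, source), preferring the lab's own database.

    The internal database always takes precedence; CancerKB only fills
    in when the lab has nothing saved for this gene or biomarker.
    """
    if key in internal_kb:
        return internal_kb[key], "internal"
    if key in cancer_kb:
        return cancer_kb[key], "CancerKB"
    return None, None


cancer_kb = {("BRAF", "V600E"): "Drug sensitivity evidence, Tier 1A ..."}
internal_kb: Dict[Key, str] = {}

text, source = lookup_interpretation(("BRAF", "V600E"), internal_kb, cancer_kb)
```

On a first encounter with BRAF V600E the text comes from CancerKB; once an edited version is saved into `internal_kb`, the same lookup returns the lab's own revision instead.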
Now, besides just the very specific evidence about drug sensitivity or drug resistance, as well as prognostic and diagnostic clinical evidence, we also want to give a little background about this gene. This might be the first time someone on the receiving end of this report has really seen anything about BRAF in the context of melanoma. So what's the clinical significance of BRAF? Well, we have these sections here that you can optionally include in the report, describing the gene summary, the outcomes and frequencies of that gene in the context of this specific tumor type, and then something about the biomarker, just describing the functional impact. Is it an activating mutation in an oncogene? How often does it occur? That type of thing. Not describing these specific outcomes particularly, but just giving a little bit of background information. Now, I want to dive into this, and I can by clicking on any of these, but before going to that information in the biomarkers tab, I want to show you how this all comes together in the clinical report.
So the report tab is essentially a summary of all the information from these different tabs: the patient information and our reportable biomarkers. We can see BRAF V600E here, a description of it, and a number of annotations that are critical to the report. And these are the different background descriptions. Now, these are all coming from Golden Helix CancerKB, but if I were to edit this and make any changes I want, I can save my revisions to my own knowledge base, and those will then be included the next time I see this gene. So you can just use this as a starting point. You're free to make your own changes. Notice these inline citations: every one of those gets picked up here, you get to review them and even look at the papers, and they automatically get appended to the bottom of the report. So I have a little bit describing the clinical significance of BRAF, then the outcomes and frequency of BRAF in melanoma, describing how often BRAF mutations occur in melanoma, a little bit about V600E as a biomarker, and then this is the critical information: drug sensitivity for these drug combinations, with a lot of detail going into the FDA approvals, the NCCN guidelines, etc., as well as a little bit of prognostic evidence that is really applicable to all cancers. So you can change the scope not only of the interpretation biomarker but also of which types of cancer that evidence is for. In our secondary germline findings, we have this BRCA stop gain, and we also have the output of our ACMG classification. Now, all of this is obviously editable. You can click on these things and edit them here, but we'll probably go back and work our way through the biomarkers tab and variants tab to show you how you can edit those with all of the supporting details.
In this case, I don't have any variants of unknown significance, but I do have information about my coverage statistics and the failed target regions that fell below my threshold of interest here. I didn't report any failed hot spots.
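The idea of a failed target region, one whose coverage falls below a threshold of interest, can be sketched as follows. This is a minimal sketch under stated assumptions: the transcript does not say which coverage metric or threshold VSClinical uses, so the mean-depth metric, the `TargetRegion` type, the `failed_regions` function, and all values below are hypothetical.

```python
from typing import List, NamedTuple


class TargetRegion(NamedTuple):
    name: str          # label for the targeted region (made-up examples below)
    mean_depth: float  # assumed metric: mean read depth across the region


def failed_regions(regions: List[TargetRegion], min_depth: float) -> List[str]:
    """Return names of regions whose mean depth is strictly below min_depth."""
    return [r.name for r in regions if r.mean_depth < min_depth]


if __name__ == "__main__":
    # Illustrative numbers only, not real assay data.
    panel = [
        TargetRegion("BRAF exon 15", 812.0),
        TargetRegion("TP53 exon 5", 64.5),
        TargetRegion("ERBB2 exon 20", 100.0),
    ]
    print(failed_regions(panel, min_depth=100.0))
```

A region sitting exactly at the threshold passes in this sketch; whether a real assay treats the boundary that way is a lab policy decision.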
And then again, these are all the citations coming in from each of the different sections of the report, in the order that they occurred, so we can have those in our report. Now, what does this all look like when it comes together? You can think of this as essentially a preview of all the content of your report, and you can choose to format it in any way using our new advanced Microsoft Word-based report template. One thing I forgot to do was review how to set a final summary. We give you a little blurb here, for example, to describe that you saw a genomic alteration with an FDA-approved drug and that you also saw a secondary germline finding, so I have that information ready to put in my clinical report. When I'm done with all of my changes, I can sign out this report, which will essentially put it in read-only mode and capture the current date and time and the user that signed it out. Before doing that, let's take a look at what this report looks like. If I click on this Word output, which is one of the different output formats here, I can see that I've made changes since I last rendered it.
Clicking the render button takes all of that content and applies it to this report template that I designed here. And this is really easy to edit. We're going to use a different webcast to go into more detail, but I'm going to open up this report template just to give you an idea of what it looks like. Now, this is a Word file. You can see I have my somatic detections and my germline detected variants, and all of these descriptions in a nicely laid out format. So I have the information about the drugs and those summaries, as well as the technical data here. Why do we include this information? If you look at the AMP guidelines, one of the things they do right at the beginning of the paper is describe how to report a genomic variant. What information do you provide? And they suggested, essentially by surveying the labs, that you include the representations of the variant in coding space and in genomic space and its variant allele frequency. Critically, what we're also displaying here is the presence of this variant in the catalogs at the time the report was generated.
So here's our first cancer variant, and we have a germline variant with our ACMG classification and our ACMG interpretation, our coverage statistics, and our failed target regions. Again, this is really the same information that we were looking at before, but you can set up your own text here based on your own report template and have your general references. Then these are the specific citations from PubMed, and we could even have citations coming in from clinicaltrials.gov as well. Now, that's the Word version of this report. I can go ahead and save this as a PDF file. The default location saves it inside the project, which is where this information goes, and it will update this PDF file here, which we can preview, and that's the same information. So now I have a PDF version of this report as well. I can open up the folder to look at it and copy it out or save it out, but it is essentially the output, and it's saved in your project. Now, if you don't have a Word program around, for example because you're running this on a Linux computer, you can use our cloud convert feature, which will securely upload this Word file to our cloud, convert it directly to a PDF, and return it. That's all done in a secure fashion. We're not saving a copy of that file. It's all done in memory. We're not even logging your interaction. But you're free to use that or to just make your local copy of a PDF using the Word file. So that's how it works in terms of getting all of this information into a reportable form. Let's take a quick look at this biomarkers tab. This is the fantastic detail that allows you to write these interpretations from scratch or to modify the interpretations filled in by CancerKB. We can see on the right a little bit of a review of the current biomarker, as well as these interpretation sections.
We have a little bit of description about BRAF: hallmarks of cancer, which transcript is selected, the fusions that have been reported by COSMIC, etc., as well as genomic pathways in our gene description.
We have a number of different sources here. Again, these are the hallmarks of cancer coming from COSMIC describing this gene. If I wanted to, I could simply copy some of this text and add it to the interpretation here, and you can see that it notices that I've modified this from the interpretation filled in from CancerKB. If I review and save that modification, I can see that difference and update my own internal copy of this interpretation, and that internal copy will be used the next time I do any work in here. So we have a number of different ways of capturing that. These inline references are automatically picked up and displayed here in this citation list. So this is information from COSMIC. We also have fantastic gene descriptions coming from CIViC. The Genetics Home Reference describes, in more physician-friendly terms, the implications of a given gene in disease, and it has these little call-outs, to cancer for example. These are sometimes just great little blurbs for helping you write up the significance of this gene. Now again, this is a reusable blurb that will show up any time you see any mutation in BRAF for any type of cancer. As we go down here, we also have this outcomes and frequency section, where we're describing the impact of BRAF in melanoma, how often it occurs, and what that impact is. To support that, we also have the mutation rate of BRAF and the types of primary tumor in which BRAF mutations occur, from COSMIC and MSK. Now notice, we also have this related interpretation. This will be information from your own internal catalog or from CancerKB for when you've written a description, not for the specific type of cancer that this patient has, but maybe for non-small cell lung cancer, and that would be useful to reference while you're writing your own interpretation here.
So you can minimize this, keep it around, and look at any other descriptions. In this case, CancerKB includes outcomes and frequencies descriptions for melanoma and for non-small cell lung cancer. But as you fill this out, you'll find you have other information here. Next, we're looking at describing V600E. We don't get into a lot of detail here, but we do describe its mutation rate. To support this, we have a whole lot of information from our oncogenicity scoring system. We don't get a chance to go into this in more detail here, but we will have another webinar where we describe how we get to this classification of oncogenic. We do a lot of this bioinformatic analysis, and in fact we do it all automatically, so you have these nice descriptions about this variant in cancer that you might want to use. We also have a description of the different catalogs here with these little call-outs. You can see that this is very commonly mutated in thyroid and skin for TCGA, for example. We have in silico predictions just a click away, including splice site analysis.
We have literature references, with the ability to search Google by clicking a single button and pulling up the different representations of this variant. We also pull information from dbSNP, ClinVar, PMKB, and COSMIC and bring it all in here so you can evaluate it. In this case, this variant has an abundance of publications; many variants will not. And of course, if there are specific assessments of this variant that have been previously interpreted by CIViC, ClinVar, or PMKB, this is a fantastic place to see that information, because it might be very relevant for writing a high-level description of this variant. So we pack a lot of information in here for you to evaluate. And of course, you might have seen a different mutation at the same amino acid, or the same variant in another tumor type, and those interpretations can be referenced side by side as well.
If you haven't seen the pattern here, it's basically to help you reuse as much of the work you've done before whenever you do something new. Finally, let's look at this important table here, which is essentially the aggregation of three annotation sources, DrugBank, PMKB, and CIViC, to give you the clinical evidence for drug sensitivity, drug resistance, prognostic, and diagnostic. This is really what makes this biomarker reportable. Everything else is just describing the background information. This is what gets this to Tier 1, Level A evidence of drug sensitivity. And for each of these rows, we give you a deep dive into the evidence down here. So as I select this row, I can see from DrugBank, for example, that vemurafenib was approved for use in melanoma. This is essentially the indication section from DrugBank, but there's extra information, including a link out to the FDA label, different product names, synonyms, general drug descriptions from DrugBank, and pharmacodynamic information. All this information, including the citable references, can be copied at any time and brought into your description here. If you want to include vemurafenib as a drug, you just click this plus button, and it adds it over here. Now, we found that there is in fact better evidence for using these specific combination therapies, so we describe the use of these combination therapies for melanoma in our own CancerKB interpretation. So that's how we get to this Tier 1, Level A. If you're not comfortable with the terms here, we have little descriptions from the AMP guidelines describing Levels A, B, C, and D and what level of evidence you should be finding.
So for example, at Level C or D, you might just be at the level of clinical trials. Although this DrugBank information is FDA-approved information, as I go down you can start to see that there are other pages of information here, including things that are just in the state of clinical trials for combination therapies. You might include these clinical trial IDs, and as you include these types of things, these interpretations do get filled in.
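For readers less familiar with how the evidence levels relate to the tiers, here is a small sketch of the mapping as laid out in the published AMP/ASCO/CAP guideline: Levels A and B support Tier I (strong clinical significance), and Levels C and D support Tier II (potential clinical significance). The function name `amp_tier` is made up for this example, and this is not how VSClinical computes tiers internally; it only restates the guideline's table.

```python
def amp_tier(evidence_level: str) -> str:
    """Map an AMP evidence level (A-D) to the tier that level supports."""
    level = evidence_level.strip().upper()
    if level in ("A", "B"):
        return "Tier I"   # strong clinical significance
    if level in ("C", "D"):
        return "Tier II"  # potential clinical significance
    raise ValueError(f"unknown evidence level: {evidence_level!r}")


if __name__ == "__main__":
    for lvl in "ABCD":
        print(lvl, "->", amp_tier(lvl))
```

Tier III (unknown significance) and Tier IV (benign or likely benign) are left out because they are assigned when this kind of clinical evidence is absent, rather than from an evidence level.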
So for example, if I were to copy this line of text out here and pull it into my drug association, I'm just going to put it in here, that clinical trial ID will be looked up and brought into the list of inline citations, including the background information of that trial, and that will be brought into the clinical report as well. So we do a lot of this for you automatically. So that's a little bit about including a biomarker like BRAF V600E.
Let's step back a little bit, go back to our patient screen, and go to a different patient. In this one I included a CNV, and I already have it included in my mutation profile. There's not much different. We go to our mutation profile, and we can go to the CNVs table. We can add CNVs that are called in my project using VS-CNV, which is the built-in algorithm for detecting CNVs based on coverage statistics, or we can add CNVs manually. You can see I already added ERBB2. This also has Tier 1, Level A evidence coming in from CancerKB. I can click on the biomarker details to pull up the information about it, including all these descriptions and, finally, the drug sensitivity details. This brings in FDA-approved information about amplification in breast cancer, and this sample has invasive breast cancer as the tumor type. If I go to the report section, the reportable evidence for biomarkers that are CNVs is very similar. I can preview that report and look at the CNV results showing up in it. Now, of course, you might have a CNV plus other types of biomarkers, and all these things can be integrated into one report.
Let's finally go to one more patient tab. In this one, I did a fusion kit using a BCR-ABL1 PCR. If I go to my mutation profile, I can see that what I started out with here is one variant of uncertain significance from a missense change in TP53. Let's go ahead and add in that BCR-ABL1 fusion by hand.
So I go to the fusion dialog, and I can just type in the name of the fusion, BCR-ABL1. It could be any combination of genes. Then this is going to be reported as a biomarker. Again, the interpretations here can be saved, and they'll be based on the primary gene of this fusion, which is the ABL1 oncogene, and then we'll pull in the interpretations for it. In this case, I pulled in interpretations from my internal interpretation database, so all of that is already done. I can go to my report and again see the reportable information on this fusion, and go to the PDF preview to see the results there. Just like any other biomarker, we can describe information about the gene, its outcomes and frequencies, the description of the biomarker (BCR fusions), and the drug sensitivity and drug resistance implications. And all the way down, we still have all of our citations coming in.
So that's a little bit about the fact that this can be carried through various types of biomarkers, including wild types as well, which I didn't have time to get into. But we will show you, for example, that if there is a specific wild type that is applicable for the type of cancer you're reporting, you can include that in your report. So that was a very quick preview of all of that. Let's go back to the slides here and continue. Obviously, I wasn't able to get into all of the detail I would like to, so we decided that this really warrants a number of follow-up webcasts. We're going to have one in July describing the oncogenicity scoring system, and that's going to be presented by our Director of Research, Dr. Nathan Fortier. That describes how we do scoring on the cancer side, as opposed to the ACMG scoring system. In August, we will give you more of a deep dive into the fact that you can bring both guidelines into the same workflow. I didn't really show it, but the entire ACMG guidelines can be run directly inside of this workflow. You'll just be deep diving into that variants tab, so we'll describe how to do that and get to those secondary germline findings like you saw in my report. We'll also show you how to customize that report. It's as easy as opening up the template file in Word: you can change the logo, you can change the stock copy, and you can save it for every one of your specific tests. And if you don't want to report a given section, or you don't want as much detail, just take it out; it's very easy to delete things from there. That will be presented by our Field Application Scientist, Eli Sward, Ph.D. And finally, I will be following up with more detail on Golden Helix CancerKB: how you can set up your own lab knowledge base, how you can reuse interpretations, how that gets filled in, and how you deal with things like conflicts, where you write one thing and somebody else writes another.
All of these details really just allow you to start with most of the report written every time you open up and use the VSClinical AMP workflow, and I'll be giving that in September. With that, I just want to close with another acknowledgment of our grant funding from the NIH. We really appreciate all of the grant awards and the research they support. We will start to take questions here. As I look through these questions, I'm going to pass this off to Delaina.
Okay. Yeah, thank you so much, Gabe. As he just mentioned, we will soon go into the Q&A. It doesn't look like we have too many questions right now, so I will give you all some time to enter those into the questions pane of your GoToWebinar panel while I cover some exciting news happening here at Golden Helix. To start, on your screen you will see we are celebrating this exciting new workflow Gabe just covered with a limited-time offer. We will be upgrading all license purchases of VSClinical with this new workflow to a 15-month license. This will be offered for the entire month of June, ending on the 30th. So if you're interested, please reach out to us, whether in the questions pane right now or through your Area Director, or you can email us at info at goldenhelix.com. Just let us know before that date so we can get that going.
And then the next topic: Golden Helix President & CEO Andreas Scherer, Ph.D., has released two new eBooks recently. The first is “Clinical Variant Analysis for Cancer”, which explains today's topic, and the second is a new edition of our “Genetic Testing for Cancer” eBook. These are both great resources for anyone interested in the cancer industry. And if you would like to receive a free copy, please send us a request here.
And finally, Andreas Scherer, Ph.D., and Gabe will be heading to ESHG 2019 in Gothenburg, Sweden, starting this Saturday. If you will be attending this conference, we hope you will stop by our booth, number 368, to say hello, receive an in-person demo of any of our solutions, and of course pick up a t-shirt.
With that, before we start the Q&A, I would just like to ask all of our attendees to take a brief moment after the webcast to complete our exit survey. And now we will start with some questions.
So one of the questions is how do you determine the oncogenicity of a missense variant in the tumor suppressor gene?
So this gets into the complexities of understanding the impact of a missense change on a protein, which is less clear than, say, a loss-of-function variant, and it really requires a lot of bioinformatics support. You have to look at the nearby pathogenic variants, and some of those techniques are very similar to the ACMG guidelines, which have the concept of being in a mutation hotspot or inside a nearby region. We have a number of those types of analysis in our oncogenicity scoring algorithm, including hot spot analysis and being inside commonly recurrent cancer regions, regions where somatic variants occur frequently. But it really doesn't do it justice to describe that here, so I suggest you come to the webcast describing the oncogenicity scoring algorithm. (See recording here)
Another question is, how do you describe or how do you annotate fusions?
So in the case of fusions, you essentially have two genes. What we're really doing is focusing the interpretation on the driver gene, essentially the gene that informs the function and impacts the hallmarks of cancer, the treatment strategy, etc. In the case of BCR-ABL1, ABL1 is actually the oncogene, and BCR is its fusion partner that enables or activates it. Just as a CNV amplifies the signal of a given protein, in fusion genes the fusion with another gene amplifies the activity of the primary gene. So we have a way of detecting and emphasizing those primary genes for you as you type them in, so that you can focus the interpretation and the annotations around that gene. In this case, ABL1.
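One simple way to picture "emphasizing the primary gene" of a typed-in fusion is to check each partner against a list of known driver oncogenes. The sketch below is an assumption for illustration only: the transcript does not describe how VSClinical actually detects the primary gene, and the `KNOWN_ONCOGENES` set and `primary_gene` function are hypothetical.

```python
from typing import Optional, Set

# Tiny illustrative list; a real catalog would be far larger and curated.
KNOWN_ONCOGENES: Set[str] = {"ABL1", "ALK", "RET", "ROS1"}


def primary_gene(fusion: str, oncogenes: Set[str] = KNOWN_ONCOGENES) -> Optional[str]:
    """Return the first fusion partner found in the oncogene list, if any.

    Assumes the fusion is written as 'GENE1-GENE2'; gene symbols that
    themselves contain a hyphen are not handled in this sketch.
    """
    for partner in fusion.strip().upper().split("-"):
        if partner in oncogenes:
            return partner
    return None  # no partner recognized; leave the choice to the analyst


if __name__ == "__main__":
    print(primary_gene("BCR-ABL1"))
    print(primary_gene("EML4-ALK"))
```

Returning `None` when neither partner is recognized keeps the sketch from guessing; the interpretation would then need to be anchored by hand.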
I guess just one more question, and then we'll close, and the question is, can this be automated on the command line essentially?
So a lot of this can be automated on the command line. VSPipeline is our way of running high-throughput versions of this type of workflow, where you can start with raw VCFs that have no annotations and get to a set of candidate variants that are fully evaluated. Obviously, at some point, someone has to go into an interface to work through the remaining interpretation work and apply their expert judgment to include things in a report and sign them off. But with VSPipeline, you can get to essentially that point of having a fully automated project in which you start the final interpretation work. And of course, a lot of the process that you just saw was automated. I can type in something like BRAF V600E, or bring it in from the project, and we automate the scoring using the appropriate scoring system, the ACMG guidelines or the oncogenicity scoring system. We automate the application of all of the interpretations coming from your own interpretation knowledge base as well as Golden Helix CancerKB. And of course, we automate a lot of the things that you saw in the QC screen. Your coverage statistics can be computed as part of VSPipeline, those hot spot regions can be computed, and all of that can be easily reviewed in the context of composing your clinical report.
So, I think that's it for questions.
Great, thank you so much, Gabe, and thank you everyone for attending this webcast. We hope to see you at ESHG or on our next webcast, which Gabe mentioned earlier. And with that, thanks very much again for joining us. Have a great day.