*Please note that you may experience errors in the below transcript, therefore, we recommend watching the video above for full context.
Rudy Parker: Well, hello, everyone. Thank you very much for joining us today for our special webinar for the Asia-Pacific region hosted by Golden Helix and Virtus Diagnostics. We're going to be presenting today on clinical variant analysis with VSClinical. My name is Rudy Parker and I am the Area Director at Golden Helix for the Asia-Pacific region. Especially, I'd like to extend a very warm welcome today to our co-presenters from Virtus Diagnostics in Brisbane, Australia. We are joined today by Peter Field and Val Hyland, who are Molecular Geneticists at Virtus Diagnostics. Hi, Peter. Hi, Val.
Val Hyland: Good day, Rudy. How are you?
Rudy Parker: Thanks very much for being with us and spending some time with us in your busy schedule. Both Peter and Val will be discussing a poster, "Preconception genetic carrier screening in an Australian fertility clinic, the first 1000 patients", which was first presented earlier this year at the Human Genetics Society of Australasia Annual Meeting in Wellington. And at this conference, Peter also did an oral presentation on the same poster at the special interest group discussion of the Australasian Society of Diagnostic Genomics. And so we are very grateful for Peter and Val to spend some time with us. Also on the agenda is my colleague and coworker at Golden Helix, Gabe Rudy. Hi, Gabe.
Gabe Rudy: Hi, how are you doing, Rudy?
Rudy Parker: Thanks. So Gabe is our Vice President of Product and Engineering and he will be reviewing our VSClinical software, discussing the ACMG and AMP guidelines for scoring germline and somatic mutations. But before we get started with our hosts and our guests, just a minor detail here. If you have questions during this presentation, feel free to type them into the GoToWebinar panel and we'll try to answer those questions as they come in. Now, before we get started with our presenters, I just want to give you a little background on who we are as a company, Golden Helix, and the software products that we provide. Our biggest asset is our ability to create bioinformatics software tools for genetic data analysis. And we've been doing this now for over 20 years. And what you see on this slide is basically the software products that we provide today. We have our SNP and Variation Suite, VarSeq, and VSWarehouse. Now, back in 1998, when we got started, our focus was exclusively on the research market. Our customers were at the very beginning of this genetics revolution and their main goal was, you know, trying to understand the human genome, and so we developed tools to support that. And out of that effort came SNP and Variation Suite. And that tool is still very popular today for doing statistical analysis, working with phenotypes and genotypes, mostly common SNPs, especially in the early days. And as the NGS field started to develop, we really felt the need for better tools to move from common SNPs to the rare variant analysis workflow at the clinical level. More and more customers were applying what they had learned about the human genome at the clinical level, and we just needed better tools to support that. So VarSeq was released in 2014 to support the filtering and annotation of rare variants and also, of course, to help with the interpretation process at the clinical level.
And so now as you finish analyzing the various samples through VarSeq, you, of course, need a tool to store and aggregate all this data in a useful form. And for that, we built the VSWarehouse tool, which is basically enterprise software that typically sits on a server, you know, connecting all the VarSeq users within an organization to a central data repository. And so basically what it does is track all the variants that you have ever seen in every sample that you have analyzed. But it also stores your interpretation and the work that your lab may do to annotate and report on each variant. Now, looking specifically at our VarSeq technology stack, this is what that stack looks like from the 30,000-foot level. Golden Helix can provide an end-to-end NGS solution for clinical testing and research labs, starting with your FASTQ file if needed. And then we, of course, can build pipelines to process all the VCF and BAM files that typically come out of that process for gene panels, for exomes, for whole-genome data, you know, whether you're looking at germline or somatic mutations, from calling CNVs and SNVs to creating clinical reports and, you know, eventually storing all that data in the VSWarehouse solution. And this is what we can do for you. So we offer value to our customers basically by reducing the time it takes to process your samples, by reducing the complexity of the analysis process, and also by improving your clinical yield so you can find the clinical variants that are important to you. And so the components that you see here on the left-hand side can be licensed individually or as a package. And of course, with VarSeq, you always remain in control of your data. You never have to share your data with anyone. So you choose where you want to run the software and store your data, you know, whether that is on your server, your desktop or your laptop. We even support putting it on the cloud if you have one. And further, there are also no per-sample fees.
We charge a flat annual licensing fee regardless of how many samples you process. And all of this, you know, puts Golden Helix in a very unique position to help you realize your precision medicine goals. Now, one of the things we also like to do at Golden Helix is help our customers make new discoveries in the field of genetics. And so we have over fourteen hundred articles in peer-reviewed journals in which our software has been cited, and more are added on a weekly basis. And the nice thing is that you don't really have to be a bioinformatician to extract value from this tremendous amount of genetics data that is being generated. We provide very easy-to-use graphical user interfaces. So, you know, you can focus on making that next discovery, whether it's on the research or on the clinical side. Now, we do business with over 400 client organizations worldwide, including in Australia, India, and China. There are at least 20,000 software installations across all the major applications we provide. And all top 10 medical schools in the United States use our software tools to conduct genetic research and also clinical research. And so we do business with many clinical labs, including Virtus Diagnostics, from whom you will be hearing shortly. And I'd say clinical diagnostics is one of the fastest-growing parts of our business right now. And so we're very proud of that. Now, with over 20,000 installations, there are, of course, a lot of eyeballs looking at our software. The annotation sources provided with VarSeq are really a very sought-after resource in our industry. We have companies that license just our annotation tracks for use in their own pipelines, so quality is really our number one priority. And we spend a tremendous amount of time and resources just to make sure that our annotations are up to date and correct. Now, our business model is very simple. We work with you to support your work, whether it is clinical or research-oriented.
We do that by, you know, like I said, charging a simple annual licensing fee, which includes unlimited support and webinar-based training as needed and also all the software updates that are released during the year. So, you know, our customers are always pushing boundaries, and Gabe and his team are always working to adapt VarSeq to be the best in its class. And so in order to accomplish that, we're very active in the community. We always have a finger on the pulse of what's going on there. We have a tremendous amount of useful blog posts, we organize webinars like this one, and we attend conferences, obviously. Next week we'll be at AMP in Baltimore. So if you want to come and see us, please visit our booth. So these are all the things I just wanted to share with you briefly. Thank you very much for your attention. And what I'll do now is turn over the presentation to our first guest speaker, Peter Field from Virtus Diagnostics. Peter, it's your turn.
Peter Field: Thanks very much for that Rudy, and thank you for inviting us along.
Rudy Parker: You're welcome.
Peter Field: So here's the screen.
Rudy Parker: Yes, I can see it.
Peter Field: Okay. So this was part of a poster and a talk that I presented at the HGSA conference in Wellington, New Zealand this year.
Peter Field: So it was a combined oral presentation at the ASDG Special Interest Group and then a poster in the main HGSA conference itself.
Peter Field: And so this is from a body of work for the first 1000 patients that we've screened. And so we thought that we would present this data to you today. So all of the data is from a preconception carrier screening perspective, so the majority of the patients coming into the clinic are either having difficulties getting pregnant or have yet to get pregnant for the first time or the second time. And so what we're actually offering is true preconception carrier screening as opposed to either mid-pregnancy or newborn screening. Samples from within Australia are from a very ethnically diverse population, as there's been a lot of immigration into Australia, and so it can be challenging analyzing samples. And in Australia, for molecular genetic testing, all the patients are charged an out-of-pocket fee. There is no government nor any insurance coverage for these charges. And for these screens, we've been using the Illumina inherited disease panel. So it covers 552 genes, all of which are for autosomal or recessive conditions. There's no predictive testing and no cancer testing in this panel. So we've been evaluating the Golden Helix software package, specifically VarSeq and then the Sentieon add-on for the alignment. And one of the difficulties that we've had with our screen is that, of course, we have no phenotype other than the patients coming into the infertility clinic. They are all relatively normal and healthy and therefore there's no disease phenotype to follow up. So all of our filtering has to be spot on so that we're not actually losing important variants. But also, there are a lot more variants to look at. And one of the main points that I'll touch on is that a pathogenic variant is not always pathogenic, and that will make more sense by the time I finish talking.
So just to give you a quick overview of some of the data that we've got from the screen, these are the top 20 genes that we've identified from those 1000 patients. And you can see at the top of the table there, cystic fibrosis, the CFTR gene, we reported that 72 times. And in those reports, there's a total of 35 different variants. So within our screening population, we've got a carrier rate of one in 15. So it's higher than the calculated population rate. But there's also an association between CFTR mutations and infertility, so the increased rate in our population fits with that theory. And then as you work your way down the list, there are other common genes, and there's GJB2, autosomal recessive deafness. So there's nothing necessarily out of the ordinary in what we've discovered, and a lot of the carrier rates in our screen match the published data. So I'll focus in on one of those genes, the CBS gene. We've reported variants in the CBS gene 31 times, and we've reported six different variants. So the variant we're going to look at in detail today is this c.833T>C change, which is the most common variant causing homocystinuria. And in this panel on the left here, you can see that the variant is annotated here with the A to G change. There's good coverage across the region and good read depth to make that call. And that would actually gain a pathogenic classification. On the right-hand side of the panel here, you can see a different patient. Again, you can see the variant, located here as the A to G change. And I've put a box around the area to look at. And for those who have already noticed where these reads end, there's actually a 68 base-pair insertion at this point. So we would actually then report this as a benign classification. And that's because where the insertion goes into the gene, it actually creates a splice site.
So this isn't necessarily new information, but it's something that you won't necessarily see unless you're actually looking at the sequence, or you won't necessarily see it if your sequence isn't long enough. So we're running a hundred and fifty-five cycles on the MiSeq to get this data. So we're actually able to sequence the DNA before, during and after the insertion. You can see where the two red arrows are on this diagram that the C to T change got picked up here, and you have the splice variant coming in where the green arrow is. So you have the insertion creating the splice site, and that actively splices out the pathogenic variant. Just to take that a step further, the panel on the left is another patient, and this has been run with the MiSeq onboard alignment software. And again, this has the 68 base-pair insertion. And again, that would gain a benign classification. When we rerun the same patient with a Sentieon alignment, the Sentieon alignment is actually taking this a step further. And instead of just highlighting that there's an A to G change, as it has done on the left here, it's actually highlighting the insertion itself. So you're actually better warned that there's an insertion there. So if you're not visually looking at the data, at least it's then reporting that there's an insertion present. And again, both of these would obviously attract a benign classification from this point of view. So even though there is a pathogenic variant present in the patient, the insertion creates a splice variant and actually negates that pathogenic status. When you actually follow through on the data, you can go and have a look at this insertion in gnomAD, and you can then obviously link through to other databases, including ClinVar.
When you click on that link, it actually takes you to a pathogenic entry for that variant, which is actually incorrect because we know that the insertion itself is actually benign.
Peter Field: On the right-hand side of the panel here, this is an image from the Alamut software across the CBS gene. There are lots of different variants in here, and also several insertions around this area. And most of those are quite rare. It just so happens that this insertion that we're looking at here is actually quite common within the population globally. So I think that's a good example of why, when we're looking at next-gen sequencing data, it's quite important to actually visualize the DNA or visualize the sequence as you're analyzing it.
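The CBS example Peter describes, where a known pathogenic change loses its effect when a nearby insertion is also present, can be sketched as a simple review rule. This is only an illustration, not how VarSeq or Sentieon handle it: the `Call` type, the lookup table, and the label `68bp_insertion` are hypothetical names, and `c.833T>C` is our reading of the variant name as spoken. Co-occurrence in one sample does not prove that both changes sit on the same allele, so the sketch only flags the call for a look at the reads.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Call:
    gene: str
    name: str            # variant label, e.g. "c.833T>C" (hypothetical data model)
    classification: str  # classification before manual review


# Hypothetical lookup: (gene, pathogenic variant) -> partner variant that,
# according to the talk, creates a splice site removing the pathogenic change.
NEUTRALISING_PARTNERS = {
    ("CBS", "c.833T>C"): "68bp_insertion",
}


def review_calls(calls):
    """Return (call, outcome) pairs, flagging calls whose partner is present."""
    present = {(c.gene, c.name) for c in calls}
    results = []
    for c in calls:
        partner = NEUTRALISING_PARTNERS.get((c.gene, c.name))
        if partner is not None and (c.gene, partner) in present:
            # Both changes are in the sample: a person should inspect the
            # alignment before the pathogenic label is reported.
            results.append((c, "Review: partner insertion present, inspect reads"))
        else:
            results.append((c, c.classification))
    return results


if __name__ == "__main__":
    sample = [
        Call("CBS", "c.833T>C", "Pathogenic"),
        Call("CBS", "68bp_insertion", "Benign"),
    ]
    for call, outcome in review_calls(sample):
        print(call.gene, call.name, "->", outcome)
```

A rule like this complements, rather than replaces, the visual check of the reads that Peter recommends.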
Peter Field: And that works quite well within the VarSeq software. So just to go through some of the other data and to finish my section, these are the carrier couples that we've picked up from the first 1000 patients. You can see here that there are 13 couples that we've actually confirmed as carrier couples. And again, it's no surprise that there are several CFTR couples in here who are at risk of having a CF-affected baby in future generations. And then one of the other variants that we have in here is for galactosemia, the GALT variant. Val will talk in more detail about that. So I'll hand over to Val and he can carry on. Thanks very much.
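Identifying a carrier couple comes down to finding genes in which both partners carry a reportable variant. A minimal sketch follows, assuming each partner's results are held as a hypothetical mapping from gene to reported variants (the variant labels are placeholders, not real findings), and assuming autosomal recessive inheritance with both partners heterozygous, which gives the standard one-in-four risk per pregnancy.

```python
from fractions import Fraction


def at_risk_genes(partner_a, partner_b):
    """Genes in which both partners carry at least one reportable variant."""
    shared = set(partner_a) & set(partner_b)
    return sorted(g for g in shared if partner_a[g] and partner_b[g])


def affected_child_risk():
    """Risk per pregnancy when both parents are heterozygous carriers of an
    autosomal recessive condition: 1/2 * 1/2 = 1/4."""
    return Fraction(1, 2) * Fraction(1, 2)


if __name__ == "__main__":
    # Placeholder data for illustration only.
    partner_a = {"CFTR": ["variant_1"], "GALT": ["variant_2"]}
    partner_b = {"GALT": ["variant_2"], "CBS": ["variant_3"]}
    print(at_risk_genes(partner_a, partner_b), affected_child_risk())
```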
Val Hyland: Thank you. Peter. So can you see my screen?
Val Hyland: So what I've prepared for today is one of these carrier couples, and they have a quite common variant in the GALT gene, and I thought I would show you a real-time presentation using the VarSeq project. But if I can say, I'm new to Virtus Diagnostics. I'd previously worked in another Australian laboratory and had used the VarSeq software, and when I moved, it was the 20th anniversary of Golden Helix and we certainly got a very good package deal. So we had a 20th anniversary special and we bought the full package for genomic germline analysis. And so we managed to get a 20-month license instead of 12 months. We started off with three users: myself, Peter and another user in the laboratory. And more recently, we've added another license for a trainee. So what appeals to me about the VarSeq software is that I like to see all the data together, particularly in a clinical setting. I like that each user has a log so we know exactly what each person has done and when they've done it. Normally, when we're doing our analysis, we will have two independent people scoring the data. So one person will do the analysis first and then another person will come in and check.
Val Hyland: So over on my left-hand side, we have our filter and we've added in our BAM file and our VCF.
Val Hyland: And I've put in a filter here so that we're only looking at the GALT gene and we're just looking at a single pathogenic variant, there. You can see in our screen here that we've detected this variant a number of times and we've been collecting data in here for at least 12 months.
Val Hyland: Here are the two patients, and the same variant is in both. This is the coverage plot and the BAM data. And here we're showing paired-end reads. And I think one of the things you may notice here is that sometimes there is some trimming, so the sequence is ending before the variant. Normally when I'm looking at this, I have two screens, I use 30-inch screens, and I've got one set of data on one screen and then the rest of the data on another. But for this presentation, everything is together. But conveniently we can hide this filter and then hide the genome browser and then bring up the ACMG guidelines, and it will help us go through the classification. And I'm just going to skip over what I see as the final part, and I'll come to that at the end. And one of the reasons I selected this variant is that often it's quite a challenge to classify a missense variant. But it also demonstrates perhaps virtually all of the fields that are in here for classification. And if we just look at the first screen here, it gives us a description of the clinical phenotype and the molecular genetics. And this particular variant is one of the more common variants that is seen in the Caucasian population. So from the point of view of population frequency, we can look either at numbers or at percentages. We can look at the exome or we can follow this link here and look at the whole genome data, but from the point of view of the data that's presented, the allele frequency is greater than expected for this disorder, and it is not absent in the population catalogs. So we're not getting any score for these. If we move to the impact, we've got some more information, so we've got the location of the variant in exon six of 11.
Val Hyland: We've got the profile of what kind of variants are found in the GALT gene. And we've got some data from ClinVar and this one here at the end is the actual variant that is present. It is present in ClinVar and we have it in 13 of our samples that we've screened so far. So this database that we have is the VSWarehouse. We can use one of the other criteria to assign a PM1 so we know it's in a hotspot region and we know there are no benigns present in that region. And again, we've got prompts from the software to help us with that.
Val Hyland: We can go down and look at the mechanism of this disease, missense as a mechanism of disease, and we see that there is a low rate of benign missense variants, so we can select this value, and we can see that missense variants are a common mechanism of disease. As for the missense badness and MPC scores for the variant, to some extent they're predicting that it can be tolerated as a missense variant. But we're not really selecting this or using this data, and I find that when we're looking at a missense variant where there is little information, this data can come into its own and prove useful. So we're scoring PP2, so we've got PM1 and PP2. And then as we move down, we're looking at some more in silico computational evidence, and we're selecting that all of the computational evidence is pointing towards a deleterious effect. Now, this splicing data is of interest, and if I flick back to the genome browser quickly, you may notice that the variant is quite close to the end of the exon, but there is no prediction of any splicing abnormality.
Val Hyland: So we add our score, our PP3 score here.
Val Hyland: Then coming to our studies.
Val Hyland: We're bringing up some of the data from ClinVar, and ClinVar has a two-star rating for pathogenicity.
Val Hyland: So we're using the moderate value for this PP5. But I think, again, where VarSeq is coming into its own is we are collecting information from clinical studies and from functional studies, and we're able to include this in with our other data. And we're able to make some summaries that can be used for reporting. We tend to accept and score this information for ClinVar, perhaps above what might be this PS1 scoring because we certainly feel for PS1. We're talking about another nucleotide change that may lead to the same amino acid, but I think some people could argue that we should be selecting PS1 instead of PP5_M. Or perhaps given the amount of published information, we could perhaps increase this score to strong instead of moderate.
Val Hyland: But we feel that, you know, especially if we have some functional data, we don't want to be adding the same kind of data to two different things. And we feel that, you know, with the well-established functional studies, we're giving that another strong value. So to some extent, we've got a clear pathogenic call for this variant. And that's to be expected, you know, because this variant has been widely reported. Now, because these are two patients, we don't have any information about an affected child. But there certainly is plenty of information out there on other couples who do have affected children. I have certainly worked with the neonatal unit in my previous position, and they have diagnosed many, many young patients at the neonatal stage with these two variants. So it certainly could be argued that we could add in some of these values. But coming back to the classification, the software itself has brought all this information together and we've copied that over here. And perhaps we've just reported the variants more in the Human Genome Variation Society (HGVS) nomenclature, where we've given a reference with the c. notation and then the actual amino acid change, and we would use some of this information in reporting. So at the moment, we certainly have the complete package and we've been using VarSeq widely and comparing it with other software that we have for classifying variants. We are running Sentieon and we are working on our pipeline so that it is completely automated and less manual. We haven't been using the reporting, but it's just a matter of ensuring that we can link VarSeq and the VSReporting to our LIMS, our laboratory information management system. And I think that's only a matter of time. There are perhaps a few other things that I would like to say about looking at what might be some common variants. And for us, obviously, CF is a gene that we're looking at, you know, in our patients.
It's useful to look at something like the poly-T and the poly-TG tracts, because that gives you information about what a homopolymer or dinucleotide repeat looks like when there are deletions of one or two or more nucleotides. We look at the SMN1 gene for spinal muscular atrophy, and we do our alignment to just one of the genes, to SMN1, and we've been looking at variant allele frequency and how that is linked to copy number using MLPA.
Val Hyland: The other thing I would like to say is that I certainly value when I hear Gabe speak and I always think that Gabe is giving a masterclass and I think why can't I be there in person? And one of the things I've suggested to Gabe and the other people at Golden Helix is, is it possible to set up a user group?
Val Hyland: And the idea of this perhaps would be for those people that don't actually make it to the conferences where Golden Helix is actually there and you can ask them questions. And we might have a way of catching up separately when we attend other meetings. Or we might have a way to interact on FaceTime or Viber or Zoom and in the digital environment. So I just, thanks, Gabe and Rudy, for the opportunity to say what I've said.
Gabe Rudy: Thank you. That sounds fantastic.
Val Hyland: Yeah
Gabe Rudy: All right, so I'm going to go ahead and take over from here. And that was just such a fantastic demonstration. I'm going to try to provide a little more high-level context for what you just saw for the ACMG classification, as well as discuss the similar type of guided workflow experience you can have if you are performing molecular diagnostic tests in the cancer space. So if you are doing targeted-therapy screenings for biomarkers in specific tumor types and want to know how to report those out, we also have support for the AMP guidelines. And this is an all-in-one suite that includes the type of single nucleotide variant analysis you just saw. Of course, it also looks at insertions and deletions, and I might go through an example of one of those just as a different type of variant and the different type of evidence that you might evaluate there. But we also support incorporating copy number variants and gene fusions into that clinical workflow. The whole point here is that through this guided workflow that you just saw, we get consistency in reaching the same results no matter who is doing the work. You also have the full complexity of the guidelines encoded for you. So while there are different rules and codes and all those things, this workflow will have those pieces of information there. It will be structured to be able to answer those questions with lots of recommendations and prompts. There's also built-in documentation and help that describes each of those scoring criteria, like Val was showing with PM5 or PP5 and PS1, etc. So all that can be very helpful in following these guidelines even if you haven't had a lot of experience with these specific codes.
Gabe Rudy: So the next thing is just to talk about the end product here, which can be essentially the scored variants themselves, but they can also be represented in different formats, and the result of the test itself can be formatted and produced inside of our VSClinical system. The example here on the right is a clinical report for a somatic workflow where we're going to be reporting out a BRAF mutation. But this also allows all this to be automated, standardized and put into the same workflow. You get to the same result no matter who's doing it. And you can also make sure that the same text is going to be produced from one day to another. So we're saving and reusing your interpretations at different levels. And this cancer workflow comes out as a Word report, which you can also save out as a PDF, and it can be very customizable. So the next thing is just to discuss a little bit about the overall workflow. So Val jumped into showing you some filtered variants and went into one example. The starting point often is your unannotated, unfiltered set of variants coming off of your NGS test. And the first step that you have in VarSeq when you bring those in is that you can customize, to meet your specific lab's criteria for hard filtering for quality, etc., which variants are going to be prioritized and brought into this clinical workflow. So we have an enormous number of annotations and a lot of flexibility to define your own criteria, whether you're filtering on read depth or genotype quality or other attributes of the variant in that sample. And then again, you run this whole process for one sample, but you can import a whole batch of samples at a time. So that might reflect the batch that came off the sequencer or the batch that's assigned to a given user.
And then one user can go through and follow that process and another user could either go through, use the same project or a different project and then check in when they get to the final results that they've matched the same process. And then this sort of interpretation hub, like you were seeing there, allows you to follow things like the ACMG guidelines and where you're scoring all these different criteria for all the evidence that we have and all this extra detail will be in there. And I think I'll go through and give you a quick demonstration of what that looks like.
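The hard-filtering step mentioned above, where each lab sets its own cut-offs on attributes such as read depth and genotype quality, can be sketched in a few lines. This is a generic illustration rather than VarSeq's filter chain: the `SampleVariant` type and the thresholds of 20 are hypothetical values chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SampleVariant:
    gene: str
    read_depth: int        # reads covering the site (DP in VCF terms)
    genotype_quality: int  # Phred-scaled genotype confidence (GQ in VCF terms)


def passes_quality(v, min_depth=20, min_gq=20):
    """Hard filter: keep a variant only if it meets both lab-defined cut-offs.
    The default thresholds are placeholders, not recommended values."""
    return v.read_depth >= min_depth and v.genotype_quality >= min_gq


def prioritise(variants, **thresholds):
    """Return the variants that would move on to clinical interpretation."""
    return [v for v in variants if passes_quality(v, **thresholds)]


if __name__ == "__main__":
    batch = [
        SampleVariant("GALT", read_depth=150, genotype_quality=99),
        SampleVariant("CFTR", read_depth=8, genotype_quality=45),
        SampleVariant("CBS", read_depth=60, genotype_quality=12),
    ]
    print([v.gene for v in prioritise(batch)])
```

Keeping the thresholds as parameters mirrors the point made in the talk that the criteria are defined by each lab rather than fixed by the software.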
Gabe Rudy: Again, there is quite a lot of complexity in getting to these final classification states under the ACMG rules. But this is done for you in the sense that not only do we auto-score a lot of these criteria, we don't answer the questions, but we circle the answer that we have strong evidence to support. And we give you reasons why we are suggesting those answers for a given score. And as you answer the scoring questions, yes or no for each scored question, you will get your classification updated to match the final classification state, likely pathogenic or pathogenic, based on the ACMG rules. And then if we do hit a rule like we do here, we're telling you which of these little sub-rules coming from the paper applies. These are pictures coming from the ACMG paper. Those are all embedded right in the software. So you know how you got to that state and how we're following those ACMG criteria.
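The combining rules Gabe refers to come from the 2015 ACMG/AMP guidelines, and they can be written down compactly once the scored criteria are counted by strength. The sketch below is our own encoding of those published rules, not VSClinical's implementation; the function name and argument names are hypothetical. The usage example takes Val's GALT scoring as PM1, PP2, PP3, PP5 applied at moderate strength, and one strong functional criterion, which we assume here to be PS3.

```python
def classify(pvs=0, ps=0, pm=0, pp=0, ba=0, bs=0, bp=0):
    """Combine counts of scored criteria (by strength) into a classification,
    following the combining rules of the 2015 ACMG/AMP guidelines."""
    pathogenic = (
        (pvs >= 1 and (ps >= 1 or pm >= 2 or (pm >= 1 and pp >= 1) or pp >= 2))
        or ps >= 2
        or (ps >= 1 and (pm >= 3 or (pm >= 2 and pp >= 2) or (pm >= 1 and pp >= 4)))
    )
    likely_pathogenic = (
        (pvs >= 1 and pm >= 1)
        or (ps >= 1 and pm >= 1)
        or (ps >= 1 and pp >= 2)
        or pm >= 3
        or (pm >= 2 and pp >= 2)
        or (pm >= 1 and pp >= 4)
    )
    benign = ba >= 1 or bs >= 2
    likely_benign = (bs >= 1 and bp >= 1) or bp >= 2

    # Contradictory pathogenic and benign evidence defaults to uncertain.
    if (pathogenic or likely_pathogenic) and (benign or likely_benign):
        return "Uncertain significance"
    if pathogenic:
        return "Pathogenic"
    if likely_pathogenic:
        return "Likely pathogenic"
    if benign:
        return "Benign"
    if likely_benign:
        return "Likely benign"
    return "Uncertain significance"


if __name__ == "__main__":
    # Val's GALT example as we read it: 1 strong, 2 moderate, 2 supporting.
    print(classify(ps=1, pm=2, pp=2))
    # A stand-alone benign criterion (BA1) on its own.
    print(classify(ba=1))
```

With one strong, two moderate and two supporting criteria the sketch returns a pathogenic call, consistent with the outcome Val reports; moving PP5 back to supporting strength would change the counts and is worth trying as an exercise.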
Gabe Rudy: So with that, I am going to just jump on into a quick demonstration. I just want to show you a quick example of a different type of variant. So in this case, I've done a little bit of filtering on a project as well.
Gabe Rudy: Again, this is a very straightforward filtering process. It may be more complex depending on the needs of your lab. What I am filtering on is essentially our auto-classification, where we are running this for every variant in your project, following as many of the ACMG rules as possible. And I'm basically saying give me any interesting variant that meets this auto-classification of likely pathogenic or pathogenic, but also variants of uncertain significance that are leaning towards pathogenicity, as in there are some, but maybe not sufficient, criteria scored for pathogenicity. That can get me down to quite a small number of variants and I could filter my tables on that. I can see here my three variants in this table and see which genes they're in. In this case, this panel also has CNV calls. And I'm not going to have time to go into a lot of details, but we do call CNVs off NGS data. We also have our coverage statistics for that, and then we go into our guidelines. And so I'm going to pop into the guidelines here. And just to give you a different flavor, we're going to look quickly at an insertion. As you open up and go to a given variant, you can see the details of that variant showing up here on the right. It's an insertion of a TCGA or TTGA. You can also see that same variant in GenomeBrowse. I'm just using my mouse wheel to scroll around here so you can get a good context of where we are in the gene for the current sample, the mutations in that gene, and the coverage, which is fantastic. You can do things like plot your own internal catalogs like Val was doing, or have other contexts that might be useful. And again, that genomic context can be very useful, because sometimes it's not just what's going on with this variant, but what's going on nearby in the same haplotype or in the same area that can be very helpful. So with that, I'm just going to go ahead and hide GenomeBrowse like Val did, and we'll just look at this insertion real quick.
Now, what's interesting about this insertion, if we look at the population information, let's look at the allele frequency, is that at a percentage level, it's actually kind of a little bit low. It might actually pass your below-1-percent filter for all populations. But it's actually very common within one subpopulation with any stations. So we sort of pull that out, and that actually helps us inform that this is a tolerated mutation at a very high allele frequency amongst a very large population group. There's probably a founder mutation in here that got passed on. And in fact, it shows up in a homozygous state in eleven individuals. That in and of itself really helps classify this variant as benign. So I will say, if I answer this, we actually give you the reasons why we're recommending or circling this answer. Answer this, and you'll notice this little bar on the right that adjusts as you answer, and it immediately slams to the left as we're in the benign state. But if we weren't certain, or if the population information wasn't all that informative, we would go into the gene impact, where we would see that we're a frameshift mutation, but at the very end of the gene. We're almost the very last codon, for goodness sakes. And if we go to this coding change and repeat section, and Val showed you a couple of these other sections, so we're not going to go into those, we can actually see the description of this variant and how it changes the amino acid.
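Editor's note: the point about overall versus subpopulation frequency can be shown with a small Python sketch. The function name, the population labels, and the cutoffs are illustrative assumptions (a 1 percent rare-variant filter, as mentioned in the talk, and a 5 percent "common" cutoff supplied as a default parameter); they are not taken from the software.

```python
def frequency_flags(overall_af, subpop_afs, rare_cutoff=0.01, common_cutoff=0.05):
    """Compare the overall allele frequency with the highest subpopulation one.

    overall_af  -- allele frequency across all populations (0..1)
    subpop_afs  -- dict mapping a subpopulation label to its allele frequency
    """
    # Find the subpopulation in which the allele is most frequent.
    top_name = max(subpop_afs, key=subpop_afs.get)
    top_af = subpop_afs[top_name]
    return {
        # Would the variant survive a simple "overall AF below cutoff" filter?
        "passes_rare_filter": overall_af < rare_cutoff,
        # Is it nevertheless common in at least one subpopulation?
        "common_in_subpop": top_af >= common_cutoff,
        "top_subpop": top_name,
        "top_subpop_af": top_af,
    }
```

A variant for which both `passes_rare_filter` and `common_in_subpop` are true is the situation described here: it slips through a global rarity filter even though one population tolerates it at high frequency.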
Gabe Rudy: And actually, although it's a frameshift, it introduces a stop very quickly. So essentially, the net change is not very much, maybe a matter of four amino acids being truncated or changed. So that's probably why it seems to be a tolerated frameshift. So this just gives you a sense of how you can take what may look like an interesting loss-of-function frameshift variant and, using VSClinical, look at some of our supporting detail, deep-dive evidence, which is variant specific, so this information is specific to the fact that it's a loss-of-function variant. And we also have other information about where it is in the transcript and whether loss-of-function variants are tolerated.
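Editor's note: the "net change is not very much" argument is simple arithmetic, sketched below in Python. The function names and the example numbers are hypothetical, and the sketch assumes the altered protein ends near the original stop, which is a simplification.

```python
def residues_affected(protein_length: int, first_changed_codon: int) -> int:
    """Number of original residues at or after the first changed codon."""
    if not 1 <= first_changed_codon <= protein_length:
        raise ValueError("first_changed_codon must lie within the protein")
    return protein_length - first_changed_codon + 1


def fraction_affected(protein_length: int, first_changed_codon: int) -> float:
    """Share of the protein that is truncated or changed by the frameshift."""
    return residues_affected(protein_length, first_changed_codon) / protein_length
```

For an illustrative 500-residue protein with the frameshift starting at codon 497, only the last four residues are affected, which is under 1 percent of the protein.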
Gabe Rudy: So finally, as I go through this, I will classify this variant. I can score at various levels. I can get to a final classification, and I can put this into my database, and I can do that for this variant and then move on to the next and the next, etc. And the next one here, it turns out, is a pathogenic variant, et cetera. As I score it, I can say I want to put this into my report. And what that means is, if you are selecting these variants for reporting, I can pull up my report tab and fill in my phenotype information as well as select all the variants, including this RAF1 and potentially even secondary findings and CNVs. I just had a couple selected here. And then we can choose to render this report, and that report will contain all this information: my primary findings, my secondary findings, my CNVs, copy number gains, all laid out in a very customizable format. So that's a quick, very quick, of course, review of the ACMG guidelines from the beginning to the reporting. I'm just going to give you also a little quick overview of our cancer guidelines and then we'll take some questions. And if we don't cover everything that you're interested in right now, feel free to contact Rudy. And we can always set up a demonstration to go into more detail on any subject of interest.
So similar to the ACMG guidelines, there are, again, guidelines coming from the Association for Molecular Pathology (AMP) that are focused on how you weight evidence for the potential therapeutic, diagnostic, or prognostic effect of biomarkers in the context of given cancers. The clinical evidence that they have, they sort of rate into these four levels, and then they group those into Tier 1 and Tier 2, essentially, where Tier 1 is clinical evidence that is for FDA-approved drugs in the exact same cancer type that matches the tumor type that's being tested, versus if it's maybe FDA-approved drugs but in a different tumor type, then it fits into this level 2, or things that are in like a preclinical trial phase that don't quite reach the same level of evidence. So these are kind of equivalent to your likely pathogenic or pathogenic. But it's a very different scoring system because it has a very different context, which is essentially how relevant and how strong is the clinical evidence for this biomarker in this tumor type, so that it can be applied to the context of a given patient and put in the report with that context.
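Editor's note: the tiering logic as summarized in this talk can be sketched as a small Python function. It follows only the simplified description given here (same-tumor FDA approval is Tier 1; FDA approval in another tumor type, or preclinical evidence, is Tier 2) and is not the full published guideline. The `Evidence` record and the "Unclassified" fallback label are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    fda_approved: bool         # evidence concerns an FDA-approved drug
    tumor_type: str            # tumor type the evidence was established in
    preclinical: bool = False  # evidence comes from a preclinical phase


def tier_for(evidence: Evidence, patient_tumor_type: str) -> str:
    """Assign a tier following the simplified rules described in the talk."""
    same_tumor = evidence.tumor_type.lower() == patient_tumor_type.lower()
    if evidence.fda_approved and same_tumor:
        return "Tier 1"
    if evidence.fda_approved or evidence.preclinical:
        return "Tier 2"
    # Anything weaker falls outside Tier 1 and Tier 2 in this sketch.
    return "Unclassified"
```

For example, `tier_for(Evidence(True, "melanoma"), "Melanoma")` returns `"Tier 1"`, while the same drug evidence applied to a different tumor type drops to `"Tier 2"`.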
Gabe Rudy: So with that, you also do have a category that kind of doesn't meet that, where you have some clinical evidence, but not enough, and that goes into this variants of unknown clinical significance category. And you also similarly have a benign and likely benign category. Now, there are a lot. These are all tables coming from the paper, which discusses best practices to gather all of these databases, and VSClinical pulls in pretty much every single one of these. We have all these splice finder predictions. We have conservation scores, we have functional prediction scores. We have clinical databases, population databases, sequence repositories, our cancer-specific databases, et cetera. And all those can be pulled in to both automatically score a variant in a very similar way, but now in a cancer-oriented workflow, as well as provide a lot of details for the manual interpretation process, where you are evaluating that evidence to reach the conclusion of whether it meets the Tier 1 or Tier 2 evidence level. Now the nice thing is, we actually give you a bit of a starting point there, where we have a built-in knowledge base called CancerKB that will have a lot of interpretations for the most common biomarkers and the most common tumor types. So that also gives you a starting place to build your own internal database, which will be reusable at different levels. You can have an interpretation for a gene, so every time you see that gene, you get to use the same interpretation that you've already written. Also for a specific tumor type. Also for a biomarker, et cetera, et cetera. You just sort of set up those levels of reusability, and that also helps you to be consistent. So, you know, you're getting the exact same report for the exact same input every time and you're not redoing any work.
So we're going to jump into an example here, and in this case we're looking to have a context of a tumor type of melanoma, and then we'll be looking at a mutation such as BRAF V600E, which is an indication for use of vemurafenib, which has this fantastic response rate when BRAF V600E mutations are driving melanoma. So very quickly, we'll look at that and then we'll jump into a Q and A session. So I've actually just popped up this project. This is now the cancer workflow; it has a different set of tabs here. And it has a built-in sort of reporting tab. We're going to start here and kind of work backwards just a little bit. But you can see in my reporting tab in this case now, I can see my total results here. I can look at my kind of final interpretation state if I would like to. I also have information about the sample, and then I have the information about the biomarkers that I have written. Now, this is that BRAF V600E mutation. We also have a lot of technical details about it, its state in different databases. And we have this interpretation at a gene level, at an outcomes and frequencies level, at a biomarker level. And then specifically, at a drug sensitivity level, which provides us with the tier level information, as well as that this is Tier 1, Level A evidence for these drugs, as well as citations, including in-line references being pulled out that will actually be added to our report. So that's our main biomarker in this case. We also have a secondary germline finding. And so it's important to note that we actually can embed the entire ACMG guidelines, including following the ACMG guidelines to get to a classification for a germline variant that you might want to report out for the use of the family, etc., in the case that you are testing and you found a relevant germline finding.
We could also have sections in our report that include variants of unknown significance, coverage statistics, regions that you might want to flag as potentially being failed, etc., and then of course all those in-line references. So this is sort of like the interactive summary of all the work you've done for this sample, and I could also create and render this whole thing as a PDF. So this exact same data gets rendered as a Word file as well as a PDF file. And again, it's the exact same information, just beautifully formatted and customizable to your lab's needs. You can set up your own footers, you can set up your own information here. We just give you a starting point and it's all customizable by using a Word templating system. So you can basically edit to your heart's content what you want this to look like in Word. And then we just fill in all the data. So that's a little bit about how to create the report. And I just wanted to show you quickly on the biomarkers tab what that BRAF V600E interpretation looks like. We have the information about the biomarker on the right. And then here is where you sort of fill in those reusable chunks of interpretation. Here's our clinical significance for melanoma. Again, we have these nice citations, which are all very interactive. If you click on these, you can deep dive into these citations. And these are only coming from the fact that this PubMed ID was just referenced here. So if I look at COSMIC and how much it brings in, it also has a lot of great descriptions about this gene and its impact on cancer. If I wanted to, I could just add part of those descriptions in here.
Gabe Rudy: You would pick up this extra PubMed reference. Add it to my list. And now it says, I have modified this and I might want to review my changes compared to the previous one. And I could even look at like the diff of those, etc. Or I could discard those changes if I want to go back to my last saved version.
Gabe Rudy: So this is how you can kind of manage your little interpretation bits over time. We have information about the gene and its frequencies in different cancers. Information about the biomarker itself. And this is using our oncogenicity scoring system to give it a very high score, basically very similar to the ACMG scoring, but just adapted to focus on cancer sources, including things like COSMIC and CIViC, etc. And then finally, our interpretations, including a bunch of sources from things like PMKB and DrugBank and CIViC. And each one of these sources, as I click on it, I get the details of that source down here. I could read this information about this declaration of evidence between these drugs and this mutation for this type of cancer. And with this level of description, I could copy and bring those into my own interpretation. Again, all the interpretations you see here are actually built into the software because they're part of our shipped CancerKB knowledge base that were curated by our group of expert panels. So this one, just being a very commonly mutated gene with a very common and highly mutated mutation, is already written for you. Others may have pieces of it written for you, and you would be able to start with this table, which is an aggregation of many cancer sources, to help sort and organize the interpretation process. So with that, why don't we switch on over to take some questions? I do want to also acknowledge the fact that this work was partially funded by the National Institutes of Health under these award numbers. And our PI is Dr. Andreas Scherer. As well as a Montana grant. The content is solely the responsibility of the authors and does not necessarily reflect the official views of the NIH, and of course, we are very grateful for the support of the government in this work.
Gabe Rudy: So with that, we can take any questions if we want to or have any other discussions. Rudy, did you want to make any final points? Rudy, you're still muted, I don't know if you wanted to add any more to this at this time. If not, we can also just wrap this up because we're getting close to the end of the hour here.
Val Hyland: So maybe if you have a little bit of time, I'd probably like to add something more.
Gabe Rudy: Sure. Absolutely.
Val Hyland: One of the things we've asked you, and you've certainly sorted a lot of problems for us, as we are collecting more and more variants now and we're looking at how we can use a lot of the public sources: how can we upload our variants to ClinVar, and particularly without any middleman, if we could go direct from our warehouse to the submission process?
Gabe Rudy: Absolutely. That's a great question and actually very, very topical. At this last ASHG meeting, the ClinVar and ClinGen group discussed lab participation in moving or contributing back interpretations. And they're going to use both carrots and sticks to incentivize U.S. labs and definitely encourage and make easier, standardized, rather, the way that they want to see submissions across the world. So we've actually been working closely with ClinVar for a number of years and they basically have a certification program they're going to put in place, and we're going to try to be the first out the door to match all their requirements. So we're actually right now going to be starting the development efforts to have a certified export process that would allow you to select your variants of interest, or your interpretations, rather, with your variants, that you want to submit to ClinVar, and it will create basically an Excel file or a file containing all the exact fields that they would like to have exported, and it will be guaranteed to match what they expect for their submissions. So that will be coming shortly and we'll have some more announcements about that as soon as it's available.
Val Hyland: OK, thank you.
Rudy Parker: Thank you very much, Gabe, and thank you, Val, Peter. Of course, there's a lot more to discuss. And so I would like to suggest that if you want to look at our recorded webinars, there are so many of them on so many topics, take a look at those, about CNV, about Warehouse. Val mentioned a VarSeq Warehouse solution, and that is certainly something that we recommend for big labs. So, yeah, I've not much more to add, but thank you very much for you guys' time and effort.
Val Hyland: And thank you for staying up so that we could have a seminar in our working day.
Gabe Rudy: Absolutely.
Rudy Parker: Thank you. So yeah, without much ado, I'd like to then say, well, good day to you all. Thanks for joining us. For those who were on the webinar, if there are any questions and you want to quickly ask them, type them in, or else we will just come to a close for this meeting.
Val Hyland: Good day mate, as they say here.
Rudy Parker: All right. Thank you and good night. Good night, everybody.