Sentieon TNscope is a platform dedicated to the identification of tumor-normal somatic variants, including single nucleotide variants (SNVs), indels, and structural variants (SVs). In our recent webcast, Dr. Donald Freed, Bioinformatics Scientist at Sentieon, gave viewers an exclusive overview of the platform.
The webcast generated a lot of great questions which I would like to share with you today. If there is a question you have that is not covered in this blog post, please leave a comment and we will be happy to answer!
Q: Structural Variants of what size?
TNscope does call structural variants, although we didn’t present that data in the webcast. The indels we’re talking about typically have a maximum size of about 30-50 base pairs. Above 50 base pairs, events are typically picked up by our structural variant caller, which is able to call breakends of essentially arbitrary size. So any breakends that are larger than 50 base pairs could potentially be picked up by our structural variant caller.
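To make the size split in the answer above concrete, here is a minimal Python sketch that labels a variant by the length difference between its REF and ALT alleles. It is only an illustration: the function name `classify_variant` and the hard 50 bp cutoff are assumptions taken from the approximate figure quoted in the answer, not TNscope's actual internal logic.

```python
# Hypothetical helper, not part of Sentieon's software.
# Assumption: ~50 bp is the rough boundary between indels and SV calls,
# as described in the answer above.
SV_SIZE_THRESHOLD = 50


def classify_variant(ref: str, alt: str, threshold: int = SV_SIZE_THRESHOLD) -> str:
    """Label a REF/ALT pair as 'SNV', 'MNV', 'indel', or 'SV candidate'."""
    # Symbolic alleles (<DEL>, <DUP>, ...) and breakend notation (brackets)
    # are how VCF represents structural variants.
    if alt.startswith("<") or "[" in alt or "]" in alt:
        return "SV candidate"
    if len(ref) == 1 and len(alt) == 1:
        return "SNV"
    size = abs(len(ref) - len(alt))
    if size == 0:
        return "MNV"  # same-length, multi-base substitution
    return "indel" if size <= threshold else "SV candidate"


if __name__ == "__main__":
    examples = [("A", "G"), ("A", "ATTT"), ("A", "A" + "T" * 60), ("A", "<DEL>")]
    for ref, alt in examples:
        print(ref, alt[:10], classify_variant(ref, alt))
```

In practice the boundary is soft (the answer gives 30-50 base pairs as the typical upper range for indels), so the threshold is left as a parameter.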
Q: Does MuTect detect only single nucleotide variants, or can it also detect structural variants?
We have a variant caller called TNsnv, which matches MuTect. Both MuTect and TNsnv call only SNVs, that is, single base pair changes. MuTect2, because it’s a haplotype-based variant caller, is able to call indels as well as these single base changes. And finally, TNscope, which we believe has accuracy improvements over both MuTect and MuTect2, is able to call a wide range of events: SNVs, indels and structural variants.
Q: Have you compared variant calling results from Illumina data on the HiSeq vs X vs NovaSeq?
We have not performed benchmarking on the HiSeq X versus the NovaSeq. The data presented in the webcast is from the Genome in a Bottle Consortium and was collected on some version of a HiSeq (I’m not sure whether it was a HiSeq X or a HiSeq 2500).
Q: Have you looked at WGS vs exome calls from the same set of samples?
No, we have not looked at whole exome calls from these samples. For these Genome in a Bottle samples, whole genome data is typically plentiful, whereas there’s not much whole exome data. To get around this issue, we use a wide range of depths in our experiments. So we’re not benchmarking only at 30x depth; we’re benchmarking at 100x depth as well, to more closely approximate the depths that you would see with whole exome data.
Q: Can TNscope accommodate long-read sequence data?
It potentially could; that is an area where we could do some work. But currently, TNscope is targeted towards short-read Illumina data. Potentially, you could use TNscope in combination with a barcoded library preparation technology, such as 10x Genomics, to run TNscope on short-read data that carries long-range information.
Q: Can it call somatic variants from RNA-seq data?
No, currently TNscope is geared only towards DNA sequence data. We do have algorithms for calling variants from RNA-seq data, but they aren’t necessarily geared towards calling somatic variants. I’ve mentioned our pipelines, which match the results of the GATK best practices pipelines. We do have a pipeline that matches GATK’s best practices pipeline for RNA variant calling, but I’m not sure how somatic variants would fit in there.
Q: In a collection of samples that may be a mix of normal vs tumor, for example, a mouse model that may have a mixture of normal and tumor samples, could the Sentieon platform sort out potentially mixed normal and tumor samples in primary tumor samples or biopsy?
You can run Sentieon TNscope on tumor-only data without a matched normal. So if you have a sample which comes from two different sources, and these two sources can’t be distinguished from each other, you can run TNscope on that data and it will produce variant calls, but it will not easily be able to separate out what’s coming from each of your two sources.
*Note: TNscope supports tumor-only variant calling. Similar to MuTect2 in tumor-only mode, TNscope accepts a dbSNP file, a COSMIC file, and a panel-of-normals (PON) file to improve accuracy for tumor-only somatic variant calls.
Q: Would the sensitivity be better in, for example, a tumor model in inbred mice where there is ‘no’ normal background variation?
I’m not sure the sensitivity would be better – we would expect TNscope to have high sensitivity in both cases. Sensitivity of TNscope shouldn’t be that limited by the background variation, so sensitivity should be high in either case.
Don’t forget, we give free trials of all our software products should you be interested in giving TNscope a try! If you would like to request a free trial of Sentieon, let us know!