Minor Release
VarSeq 2.2.5 is a minor release, with no new products or features introduced. It reflects an aggregation of incremental improvements, polish items, performance enhancements and issue resolutions based on feedback from active users.
VSClinical and Reporting
- gnomAD Genomes v3.1.2 is now supported as a frequency source in the VSClinical Options dialog and for use with the ACMG Classifier algorithm.
- Zip code and country can now auto-populate from a sample manifest file into the clinical trials search bar in the Drugs and Trials tab.
- In the VSClinical Options dialog, when adding gnomAD Genomes v3.1.2 as a custom frequency source, the quality flags were not displaying for variants within an evaluation. This has been fixed: if quality flags exist for a variant within the gnomAD Genomes v3.1.2 database, they will now display in the VSClinical interface.
- On Linux, VSClinical reports could not be rendered when projects were opened from symlinked paths. This has been fixed, allowing reports to render in more contexts.
- Due to updates to the system gene preferences file, it was possible to get errors in VSClinical when saving or loading gene preferences such as mode of inheritance or disorder for genes that were recently removed from the system gene preferences file. This issue has been resolved.
- Previously, in the AMP workflow, only drugs associated with a biomarker could be used to select clinical trials in the Drugs and Trials tab. Now, new drugs or drugs from a clinical trial card can be included in a clinical report without requiring a biomarker association.
- When a variant was added to the AMP workflow as a Germline Suspected variant, and the variant was set to be reported as a biomarker, the variant was being listed as a biomarker and not a germline origin biomarker in the clinical report. This has been corrected.
- In the AMP workflow, when a variant was set to report as “variant of uncertain significance”, the variant could not be distinguished as a somatic or germline variant in the clinical report. Now, additional details are provided in the report data that can be used to designate the variant as germline or somatic.
- When both drug sensitivity and resistance information were being reported for the same variant, the JSON file that was produced after rendering the report would have duplicated entries for the same drug, once for drug sensitivity and again for drug resistance. This has been improved to have one entry for the drug even when both resistance and sensitivity information are being reported with identical information.
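The consolidation of duplicated drug entries described in the last item above can be pictured with a small sketch. The field names (`drug`, `effect`, `details`) and the sample data are hypothetical and do not reflect the actual VSClinical report JSON schema; the sketch only illustrates collapsing entries that share a drug and otherwise identical information into a single entry.

```python
import json


def merge_drug_entries(entries):
    """Collapse entries that share a drug name and identical details.

    Hypothetical schema: each entry is a dict with "drug", "effect"
    ("sensitivity" or "resistance") and "details".
    """
    merged = {}
    for entry in entries:
        # Entries are considered duplicates when the drug and details match.
        key = (entry["drug"], json.dumps(entry["details"], sort_keys=True))
        if key not in merged:
            merged[key] = {
                "drug": entry["drug"],
                "effects": [],
                "details": entry["details"],
            }
        if entry["effect"] not in merged[key]["effects"]:
            merged[key]["effects"].append(entry["effect"])
    return list(merged.values())


# Made-up report data for illustration only.
report = [
    {"drug": "DrugA", "effect": "sensitivity", "details": {"tier": "I"}},
    {"drug": "DrugA", "effect": "resistance", "details": {"tier": "I"}},
    {"drug": "DrugB", "effect": "sensitivity", "details": {"tier": "II"}},
]
merged_report = merge_drug_entries(report)
print(json.dumps(merged_report, indent=2))
```

In this sketch, entries for the same drug with differing details are kept separate, which matches the wording "with identical information" above.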
VarSeq Annotations and Algorithms
- The ACMG Classifier algorithm would hold a large portion of the human reference sequence in RAM for large genes. On some computers with limited RAM this would prevent the algorithm from running to completion on whole exome or genome projects. The ACMG Classifier memory profile was improved through various optimizations, requiring less than half as much memory in most cases and even less in regions with large genes.
- The Human Phenotype Ontology annotation source that is shipped with VarSeq has been updated to the 02-14-2022 released version.
- The OK button for the Match String List algorithm was always disabled even when valid string fields were being used as input for the algorithm. This has now been fixed.
- In family analysis projects that had included the Compound Het Algorithm, the algorithm would cause VarSeq to crash when sample fields were edited in specific ways, such as parental assignment. The algorithm has been fixed to allow sample fields to change without the algorithm causing VarSeq to crash.
- When computing the reference sample match score, VS-CNV excludes any targets that are in an LoH event. However, for male samples the entire chromosome X may be called as an LoH. In the case where a sample does not have targets in the Y chromosome, and an LoH state was detected spanning all targets in the X chromosome, there would not be any allosomal targets over which to compute the match score for male samples. Previously, this resulted in an algorithm error for that sample (reported in the sample status) and no CNVs being called. The algorithm is now updated to not mask allosomal targets in an LoH state during normalization, allowing these samples to have CNV calls.
- Lifting over CNVs in VSClinical from GRCh37 to GRCh38 now lifts over both the start and stop positions of CNVs whose size exceeds 50 bp. Previously, the size of the CNV was assumed to be preserved during liftover.
- When a CNV region has a mix of homozygous and heterozygous deletions detected in the first phase of the CNV calling algorithm, the later phases that harmonize and merge potential calls into final CNV events sometimes would not merge these events and in some cases would not call any CNV. We have resolved this issue and improved harmonizing CNV calls in these mixed regions. In our test data, we generally see no changes in the CNV caller behavior on smaller gene panel datasets. Exomes and large gene panels generally have a few changed CNV call events per sample.
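To illustrate why the CNV liftover item above matters, the following sketch compares a size-preserving liftover with one that maps both endpoints. The alignment blocks, coordinates and function names are invented for illustration; they are not real GRCh37 to GRCh38 chain data and not VarSeq's implementation.

```python
from bisect import bisect_right

# Toy alignment blocks: (source_start, source_end_exclusive, target_start).
# Made-up coordinates: the second block sits after 100 extra target bases.
BLOCKS = [
    (1000, 2000, 1500),
    (2000, 3000, 2600),
]
BLOCK_STARTS = [block[0] for block in BLOCKS]


def lift_position(pos):
    """Map one source position to the target, or None if it is unmapped."""
    i = bisect_right(BLOCK_STARTS, pos) - 1
    if i < 0:
        return None
    src_start, src_end, dst_start = BLOCKS[i]
    if pos >= src_end:
        return None
    return dst_start + (pos - src_start)


def lift_preserving_size(start, stop):
    """Lift only the start and reuse the source length for the stop."""
    new_start = lift_position(start)
    if new_start is None:
        return None
    return new_start, new_start + (stop - start)


def lift_both_ends(start, stop):
    """Lift the start and stop positions independently."""
    new_start, new_stop = lift_position(start), lift_position(stop)
    if new_start is None or new_stop is None:
        return None
    return new_start, new_stop


cnv = (1800, 2400)
print(lift_preserving_size(*cnv))  # (2300, 2900)
print(lift_both_ends(*cnv))        # (2300, 3000)
```

In this toy example the event spans a region where the target assembly has extra sequence, so the size-preserving result ends 100 bp short of where the stop position actually maps.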
VarSeq Projects and General
- The import of multiple VCF files requires the normalization of variant representations, especially for complex and multi-allelic variants. Improvements have been made to the ordering, deduplication, normalization, merging and filling of variant and sample level data across the import pipeline to address rare edge cases where the inheritance of variants was not tracked across family relationships due to multiple representations of complex variants across input VCF files.
- The HGVS notation for insertions of a sequence that includes a stop codon is different from the general HGVS p. notation for large deletion/insertions. The heuristic to detect when to use this notation did not previously handle changes spanning multiple amino acids such as NM_000455.5:c.387_388delinsAT. Previously the nomenclature in VarSeq was p.Met129_Gln433delinsIle. The nomenclature has been corrected to p.Met129_Glu130delinsIle*.
- Some annotation sources were not exporting all of their data to an Excel file due to invalid XML 1.0 characters. The Excel exporter was improved to prevent writing invalid XML characters.
- For some VarSeq Windows users, when attempting to update VarSeq 2.2.3 to VarSeq 2.2.4 or after installing 2.2.4 on a new machine, an error informed the user that the program could not start or update because of a missing Microsoft DLL file. This new DLL dependency is installed on most machines, but is now bundled with the product going forward.
- VarSeq was crashing when the project template gene annotation track was not downloaded and there was a match gene panel algorithm in the project. The algorithm was attempting to look up gene names based on gene IDs but could not perform this task on remote annotation sources.
- The VSWarehouse connection would time out in the last step of uploading a large file, as the VarSeq API request to VSWarehouse would only wait 30 seconds for the API call to resolve. As a resolution, the timeout for the VarSeq API request to VSWarehouse has been increased to match the 900 second server-side timeout.
- When computing new reference samples in the Manage Reference Samples manager on machines with 64 or more CPU cores, VarSeq would allocate too many active threads and crash. The thread management strategy has been updated to be more efficient and to handle high-core count configurations.
- When running multiple VSPipeline instances simultaneously, the instances were competing to write to the vsprops preferences file. In some cases, this resulted in the loss of custom file paths for annotations. VSPipeline will no longer write to this file by default, as the command line context does not need to change user preferences in most cases. If needed, a new -u/--update flag has been added to the VSPipeline command to enable updating the user preferences file. For example, to stay logged in on a fresh install, run:
vspipeline -u -c login <email> <password>
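Regarding the Excel export item in the list above: .xlsx files store their contents as XML, and XML 1.0 only permits characters matching its `Char` production. The sketch below shows one way such characters could be filtered before writing. It is not VarSeq's exporter code, the function name is hypothetical, and whether the exporter drops or substitutes such characters is not stated in these notes; dropping them is an assumption made here.

```python
import re

# Anything outside the XML 1.0 "Char" production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_INVALID_XML10 = re.compile(
    r"[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_invalid_xml_chars(text: str) -> str:
    """Return text with characters not allowed in XML 1.0 removed (assumed policy)."""
    return _INVALID_XML10.sub("", text)


# NUL and vertical tab are not valid in XML 1.0; tab is valid and is kept.
cleaned = strip_invalid_xml_chars("BRCA1\x00 note\x0b ok\ttab")
print(repr(cleaned))
```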