Configuring a bioinformatic pipeline to reliably process genomic data is no small task. Doing so in an efficient, consistent way is an even grander challenge. Luckily, the VarSeq software suite provides a comprehensive toolbox for automation and integration. One of the first questions a new VarSeq user might ask is where processed data needs to end up. The versatility of VarSeq means that there really isn’t a wrong answer to that question. With a bit of custom scripting, VarSeq can integrate with any LIMS that hosts an API and produce virtually any file type. Today, we’ll go through the basics of interfacing with VarSeq’s custom scripting options and present some examples and starting points for exporting different file types.
While there are options to import, export, and manipulate data within the basic VarSeq GUI, VSClinical is the most user-friendly and customizable starting point for diverse data exports. In particular, for users looking to manipulate and export data that already exists within VSClinical, the VSClinical report tab is the best place to start. Navigating here and selecting “New Custom Script” from the “Report Exports” sidebar allows us to reference existing custom export scripts and create our own (Figure 1).
Let’s investigate the anatomy of this custom report script a little further. Selecting the ellipsis to the right of the vcf_example rectangle allows us to open the location of this script. We can then right-click and open the directory with VSCode. In this case, we’re looking at a VSClinical AMP project, so we can peruse the report_cancer.js file (Figure 4). While the file itself is relatively short, it may still be a little overwhelming at first, so let’s work through it piece by piece.
First, let’s look at the components that should be constant across all custom scripts. In general, we’ll have a JSON structure that contains a description, the input and output files, and the code that’s executed. While this may not be the most robust snippet of code ever written, the way errors are logged to VarSeq means that wrapping everything in a try-catch block is good practice in this context. Any errors encountered during execution outside of the debugger will be printed with the full stack trace to the VSClinical console, so users can quickly debug simple problems.
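As a rough illustration of the try-catch pattern described above, here is a minimal sketch. It assumes a Node.js-style runtime where `fs` is available. The names `runExport` and `buildOutput`, along with the `variants` and `gene` fields, are hypothetical and chosen only for illustration; they are not taken from VarSeq’s shipped scripts or from a real VSClinical input JSON.

```javascript
// Hypothetical skeleton: function and field names are illustrative only
// and are not part of any VarSeq or VSClinical API.
const fs = require("fs");

// Turn the parsed evaluation into output lines.
// Assumes a top-level "variants" array whose items have a "gene" field.
function buildOutput(evaluation) {
  const variants = evaluation.variants || [];
  return variants.map((variant) => String(variant.gene));
}

// Read the input JSON, build the output, and write it to disk.
// Everything sits inside one try-catch so that any failure is logged
// with its stack trace instead of silently stopping the export.
function runExport(inputPath, outputPath) {
  try {
    const evaluation = JSON.parse(fs.readFileSync(inputPath, "utf8"));
    const lines = buildOutput(evaluation);
    fs.writeFileSync(outputPath, lines.join("\n") + "\n");
    return true;
  } catch (err) {
    // Print the full stack trace when one is available.
    console.error(err && err.stack ? err.stack : String(err));
    return false;
  }
}
```

With this sketch, calling `runExport` with an input JSON path and an output path writes one gene per line and returns `true`; a missing or malformed input file is logged and returns `false` rather than throwing.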
For more complex errors and development of new scripts, let’s take a look at the debugger and the JSON file we use to pull data from VSClinical. Clicking “Run Script” with this example will automatically generate an updated JSON, which we can open with VSCode by selecting the vertical ellipsis in the JSON’s rectangle. Next, within VSCode we can attach the debugger by clicking “Run With Debugger” next to our script and navigating to our VSCode window where we have our report_cancer.js file open (Figure 5).
Using either the debugger or the input JSON file itself, we can explore all of the fields available from our VSClinical evaluation. We can then manipulate that data and export it as whatever file type we’d like, either by writing to a local file or by making an API call. With these tools at your fingertips, creating an integrated, consistent genomic analysis pipeline is far more attainable.
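To make the explore-then-export idea a little more concrete, here is a small sketch in plain JavaScript. It lists every field path in a parsed JSON object and then flattens a list of records into tab-separated text. The sample object and its field names (`sample`, `variants`, `gene`, `hgvs`, `classification`) are placeholders invented for this example, not the actual layout of a VSClinical input JSON, so the column names would need to be swapped for the fields your own evaluation exposes.

```javascript
// Illustrative only: the object below is placeholder data, and its field
// names are hypothetical rather than taken from a real VSClinical JSON.

// Recursively collect dotted paths to every leaf value, so the available
// fields can be listed without stepping through them one by one.
function listFieldPaths(value, prefix = "") {
  if (value === null || typeof value !== "object") {
    return [prefix];
  }
  const paths = [];
  for (const [key, child] of Object.entries(value)) {
    const next = prefix ? `${prefix}.${key}` : key;
    paths.push(...listFieldPaths(child, next));
  }
  return paths;
}

// Serialize records as tab-separated text with a header row.
// Missing fields become empty cells; tabs and newlines inside values
// are replaced with spaces so each record stays on one line.
function toTsv(rows, columns) {
  const clean = (v) => String(v ?? "").replace(/[\t\r\n]+/g, " ");
  const header = columns.join("\t");
  const body = rows.map((row) => columns.map((c) => clean(row[c])).join("\t"));
  return [header, ...body].join("\n") + "\n";
}

const evaluation = {
  sample: { name: "Sample01" },
  variants: [
    { gene: "GENE_A", hgvs: "p.Example1", classification: "Tier I" },
    { gene: "GENE_B", hgvs: "p.Example2" },
  ],
};

const fieldPaths = listFieldPaths(evaluation);
const tsv = toTsv(evaluation.variants, ["gene", "hgvs", "classification"]);
```

The resulting `tsv` string could then be written to disk with `fs.writeFileSync` or sent as the body of an HTTP request to a LIMS endpoint; which of the two fits depends on where the processed data needs to end up.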