CNV Analysis

SNP & Variation Suite offers a complete set of tools for processing raw intensity data, identifying regions of copy number variation (CNV), visualizing copy number data, and performing association analyses on a variety of copy number covariates. From cytogenetic research to genome-wide copy number association, SVS delivers the most powerful toolset for correlating common and rare chromosomal aberrations with disease.

Golden Helix is committed to helping you succeed.

SVS comes with our bioinformatic experts at your fingertips. You can rest assured that you have someone to call when you get stuck or have a question.


Data Processing

SVS offers direct import of log ratio data from a number of providers including Affymetrix, Agilent, NimbleGen, and Illumina. For Affymetrix CEL files (500K, 5.0, and 6.0), a powerful processing tool enables you to run quantile normalization on the A and B probe intensities, including virtual array generation to merge CN and SNP probes or multiple arrays (e.g. NSP and STY). This process scales to thousands of samples and can use any sample set as a reference.
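
To make the normalization step concrete, here is a minimal sketch of generic quantile normalization in plain Python. It is not the SVS implementation: the function name and toy intensities are hypothetical, every sample is used as the reference, and ties are broken by probe order for simplicity.

```python
def quantile_normalize(samples):
    """Quantile-normalize equal-length lists of probe intensities.

    Each value is replaced by the mean, across samples, of the values
    that hold the same rank. Ties are broken by original probe order.
    """
    n_probes = len(samples[0])
    if any(len(s) != n_probes for s in samples):
        raise ValueError("all samples must have the same number of probes")

    # Sort each sample, then average across samples rank by rank.
    sorted_samples = [sorted(s) for s in samples]
    rank_means = [
        sum(s[r] for s in sorted_samples) / len(samples)
        for r in range(n_probes)
    ]

    normalized = []
    for s in samples:
        # order[r] is the index of the probe holding rank r in this sample.
        order = sorted(range(n_probes), key=lambda i: s[i])
        out = [0.0] * n_probes
        for rank, probe_idx in enumerate(order):
            out[probe_idx] = rank_means[rank]
        normalized.append(out)
    return normalized


# Toy data: three samples, four probes each (values are made up).
intensities = [
    [5.0, 2.0, 3.0, 4.0],
    [4.0, 1.0, 4.5, 2.0],
    [3.0, 4.0, 6.0, 8.0],
]
normalized = quantile_normalize(intensities)
```

After normalization every sample shares the same set of values, so differences in overall intensity distribution between arrays no longer masquerade as copy number signal.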

CNV Association Testing

A number of covariate generation procedures enable you to perform association testing on raw or PCA-corrected log ratios, CNV segment means, and discretized values based on three- and two-state models representing loss, neutral, and gain. Perform numeric association tests or advanced linear and logistic regression with CNV covariates alone or in combination with other genetic markers and phenotypic variables.
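
As an illustration of discretized covariates, the sketch below maps segment means to a three-state code, collapses it to a two-state code, and computes a Pearson chi-square statistic against case/control status. The cutoffs, function names, and data are assumptions made for this example, and the two-state collapse shown (changed versus neutral) is only one possible reading; none of it is taken from SVS.

```python
from collections import Counter

# Hypothetical log2-ratio cutoffs; suitable values depend on platform and noise.
LOSS_CUTOFF = -0.3
GAIN_CUTOFF = 0.3


def three_state(segment_mean):
    """Map a segment mean to -1 (loss), 0 (neutral), or 1 (gain)."""
    if segment_mean <= LOSS_CUTOFF:
        return -1
    if segment_mean >= GAIN_CUTOFF:
        return 1
    return 0


def two_state(segment_mean):
    """Collapse to 0 (neutral) or 1 (any copy number change)."""
    return int(three_state(segment_mean) != 0)


def chi_square_statistic(states, is_case):
    """Pearson chi-square statistic for copy number state vs. case status."""
    table = Counter(zip(states, is_case))
    row_totals = Counter(states)
    col_totals = Counter(is_case)
    n = len(states)
    stat = 0.0
    for state in row_totals:
        for status in col_totals:
            expected = row_totals[state] * col_totals[status] / n
            observed = table[(state, status)]
            stat += (observed - expected) ** 2 / expected
    return stat


# Made-up segment means for eight samples at one region.
segment_means = [-0.52, -0.41, 0.02, 0.05, -0.03, 0.44, -0.48, 0.01]
is_case = [True, True, False, False, False, True, True, False]

states = [three_state(m) for m in segment_means]
binary = [two_state(m) for m in segment_means]
stat = chi_square_statistic(binary, is_case)
```

In practice the statistic would be converted to a p-value and corrected for multiple testing across regions; that step is omitted here.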

Univariate optimal segmenting results.

Copy Number Detection with Optimal Segmenting

SVS employs a powerful optimal segmenting algorithm called Copy Number Analysis Method (CNAM) using dynamic programming to detect inherited and de novo CNVs on a per-sample (univariate) and multi-sample (multivariate) basis. Unlike Hidden Markov Models, which assume the means of different copy number states are consistent, optimal segmenting properly delineates CNV boundaries in the presence of mosaicism, even at a single probe level, and with controllable sensitivity and false discovery rate.
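
The idea behind segmenting with dynamic programming can be sketched as follows. This is a textbook penalized least-squares segmentation written for illustration, not CNAM itself: the function name, the penalty parameter, and the toy log ratios are all assumptions.

```python
def segment(values, penalty):
    """Split values into contiguous segments that minimize
    (within-segment squared error) + penalty * (number of segments).

    Returns a list of (start, end) index pairs, end exclusive.
    """
    n = len(values)
    # Prefix sums let us compute any segment's squared error in O(1).
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for i, v in enumerate(values):
        s1[i + 1] = s1[i] + v
        s2[i + 1] = s2[i] + v * v

    def cost(i, j):
        # Sum of squared deviations of values[i:j] from their own mean.
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    best = [0.0] + [float("inf")] * n  # best[j]: optimal score for values[:j]
    prev = [0] * (n + 1)               # prev[j]: start of the last segment
    for j in range(1, n + 1):
        for i in range(j):
            score = best[i] + cost(i, j) + penalty
            if score < best[j]:
                best[j], prev[j] = score, i

    # Walk the back-pointers to recover the segment boundaries.
    bounds = []
    j = n
    while j > 0:
        bounds.append((prev[j], j))
        j = prev[j]
    return bounds[::-1]


# Made-up log ratios with a three-probe loss in the middle.
log_ratios = [0.02, -0.01, 0.03, 0.00, -0.55, -0.60, -0.58, 0.01, -0.02, 0.02]
segments = segment(log_ratios, penalty=0.1)
```

Because each segment's mean is estimated from the data rather than fixed to a preset copy number state, intermediate values such as those produced by mosaicism can still be segmented. The penalty plays the role of a sensitivity control in this sketch.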

Optimal segmenting incorporates a parallelized, unbiased permutation procedure that uses all available cores on your computer. It replaces a naïve, potentially biased randomization procedure with the unbiased Fisher-Yates shuffle (also known as the Knuth shuffle). An added option allows you to further refine your segments by efficiently removing univariate outliers during the segmentation process.
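
For reference, the shuffle named above fits in a few lines. The sketch below shows only the shuffle itself with hypothetical names and data; how SVS applies it within segmentation and how the work is spread across cores are not shown.

```python
import random


def fisher_yates_shuffle(items, rng=random):
    """Return a uniformly random permutation of items (input is not modified)."""
    shuffled = list(items)
    for i in range(len(shuffled) - 1, 0, -1):
        # Draw j from 0..i inclusive. Restricting the draw to positions that
        # are not yet fixed is what makes every permutation equally likely.
        j = rng.randint(0, i)
        shuffled[i], shuffled[j] = shuffled[j], shuffled[i]
    return shuffled


rng = random.Random(0)  # seeded only so the example is repeatable
log_ratios = [0.02, -0.01, -0.55, -0.60, 0.03]
permuted = fisher_yates_shuffle(log_ratios, rng)
```

A common biased alternative swaps each position with an index drawn from the whole list, which produces n^n equally likely swap sequences that cannot map evenly onto the n! possible orderings.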

SVS comes with an array of resources for learning and utilizing the software to get the most out of your data including tutorials, add-on scripts, example data and projects, and much more.


Detecting and Correcting for Plate/Batch Effects, Genomic Waves, and Other Quality Issues

For both microarray and aCGH data, significant bias can be introduced by batch effects (plate, machine, and site variation), genomic waves, and population stratification. Other sources of variation include sample extraction and preparation procedures, cell types, temperature fluctuation, and even ambient ozone levels in a lab. These can lead to complications ranging from poorly defined segments to false and non-replicable findings. SVS offers a number of tools not only to detect these data quality problems but also to correct for them. These include:

  • Derivative log ratio spread (DLRS) to measure signal-to-noise ratios in log ratio data.
  • Extreme value distribution to detect samples with an excess of large positive and negative copy number regions.
  • Gender concordance to detect misreported samples.
  • Chromosomal means to detect cell line artifacts.
  • SNP call rates to filter poorly called markers when SNP probes are available.
  • Principal component analysis (PCA) designed to simultaneously correct for multiple quality issues (in addition to population stratification), while significantly improving signal-to-noise ratios.
  • Genomic wave detection and correction from Diskin et al. to control for wave effects.

CNV data in GenomeBrowse
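
As an example of the first measure in the list, DLRS can be sketched as the standard deviation of differences between adjacent probes, divided by the square root of two. This is one common formulation and may differ from the exact calculation in SVS, for instance if a robust spread estimate is used; the sample names, values, and flagging cutoff are hypothetical.

```python
import math
import statistics


def dlrs(log_ratios):
    """Derivative log ratio spread for probes ordered by genomic position.

    Computed as the sample standard deviation of adjacent-probe
    differences, divided by sqrt(2).
    """
    diffs = [b - a for a, b in zip(log_ratios, log_ratios[1:])]
    return statistics.stdev(diffs) / math.sqrt(2)


# Made-up log ratios for two samples.
samples = {
    "quiet": [0.0, 0.1, 0.0, 0.1, 0.0, 0.1],
    "noisy": [0.0, 0.4, -0.3, 0.5, -0.4, 0.3],
}

DLRS_CUTOFF = 0.2  # hypothetical threshold for flagging a sample
scores = {name: dlrs(values) for name, values in samples.items()}
flagged = [name for name, score in scores.items() if score > DLRS_CUTOFF]
```

Working on differences between neighboring probes keeps genuine copy number segments from inflating the noise estimate, since a long gain or loss contributes only at its two breakpoints.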

CNV Data Visualization

"Seeing is believing" with richly interactive data visualization that provides unprecedented whole genome views and easy navigation of your data. Visually detect CNVs across many samples or confirm optimal segmenting results with the integration of GenomeBrowse as SVS's visualization engine. Generate cluster plots of allele intensities to filter poor quality markers. Visualize CNV association p-values alongside SNP p-values. Once you finalize your views, you can save them in a number of publication-quality formats, including Scalable Vector Graphics (SVG).

Core Features of SVS

Data Management

The core architecture of SVS has been designed to efficiently handle datasets of virtually any size and type on a desktop computer. SVS natively supports over 70 different file formats and over 40 export formats to streamline data management, ensuring you spend most of your time on the more important aspects of analysis.

Real-time spreadsheet manipulation, data editing, and enrichment help eliminate the hassles of working with large-scale, complex data. Easily combine multiple sample sets and data of different types, from different arrays, or even platforms.

Further, an integrated spreadsheet editor facilitates data editing and transformations on a grand scale.

Marker mapped spreadsheet


Genomic Build, Marker Map, and Annotation Management

SVS provides a robust set of tools for working with and managing genomic information. Easily switch among a wide variety of supported species and genomic builds, and apply genetic marker maps so that all analyses and visualizations are based on correct genomic coordinates.

Further, genomic annotations can be displayed alongside your data in the genome browser to give analyses greater genomic context. SVS provides real-time network access to an expanding list of genomic annotations. You can also use your own custom annotations from private sources or public databases, such as UCSC, RefSeq, and Ensembl.

Data Source Library


Python Scripting

Python is a clear and powerful object-oriented programming language, comparable to Perl, Ruby, Scheme, or Java. SVS gives you programmatic access to most of its functionality through a Python scripting interface, enabling you to automate workflows, interoperate with other programs, and develop more robust data management and manipulation routines. Also included are the mature statistical and numeric packages NumPy and SciPy, which give you a broad base of standardized test statistics on which to build your own methods, as well as the 2D plotting library matplotlib for generating a nearly limitless number of publication-quality plots and other visualizations.

Python interface