Big Data at Golden Helix: Scaling to Meet the Demand of Clinical and Research Genomics

Date: Wednesday, September 21st, 2016

Presenter: Gabe Rudy, VP of Product and Engineering

Duration: 60 Minutes


Big Data at Golden Helix


Every day, the trove of genomic data is growing.

Clinics are sequencing targeted genes at high read depths to report out genetic tests. Research groups are adding new exomes and genomes to their disease-specific cohorts. Agricultural breeders are genotyping thousands upon thousands of animals in their herds and flocks.

The conventional attitude to big data is to give up using your existing workstations and servers and fully commit to an alternative universe of computation: clusters of computers run by complex management systems, opaque distributed file systems accessible only through specialized tools, and software completely rewritten for this specific and often proprietary platform. At Golden Helix, we believe there is an alternative approach.

Modern workstations and servers have powerful multi-core CPUs and plenty of RAM.

With a focus on scalable architecture and optimized native code that fully utilizes the available CPU and RAM, we can scale genomic analysis on a single host to sizes conventionally considered big data. In this webcast, we demonstrate recent innovations and features in Golden Helix solutions that enable the analysis of big data on your own terms:

  • Be prepared to switch from exomes to whole genomes in your clinical or research use cases. We will demonstrate the latest VarSeq version with a completely rewritten and optimized interface for analyzing whole genomes that can effortlessly work with tens of millions of variants.
  • We will review the challenges facing clinical labs and research consortiums that want to aggregate their growing number of sequenced samples, and how VSWarehouse was built with a scalable data store and tight integration with VarSeq to meet those challenges.
  • Breeding programs need to run prediction algorithms that would seemingly be limited by the size of the matrices a single machine can operate on. We present our completely novel approach, available in our recent SNP & Variation Suite release, that makes it possible to analyze sample sets in the tens and hundreds of thousands.

Golden Helix is constantly pushing what is possible, and this webcast will help you rethink what you can do with your own genomic data.

About the Presenter

Gabe Rudy

Gabe Rudy is GHI's Vice President of Product Development and has been a team member since 2002. Gabe thrives in the dynamic and fast-changing field of bioinformatics and genetic analysis. Leading a killer team of computer scientists and statisticians in building powerful products and providing world-class support, Gabe puts his passion into enabling Golden Helix's customers to accelerate their research. When not reading or blogging, Gabe enjoys the outdoor Montana lifestyle. Most importantly, Gabe truly loves spending time with his sons and wife. Follow Gabe on Twitter @gabeinformatics.