Introducing Phenotype Gene Ranking in VarSeq

Personal genome sequencing is rapidly changing the landscape of clinical genetics. With this development also comes a new set of challenges. For example, every sequenced exome presents the clinical geneticist with thousands of variants. The job at hand is to find out which one might be responsible for the person’s illness.

In order to reduce the search space, clinicians use various methods to filter out noise. Case-cohort analysis or sequencing additional family members can also improve diagnostic accuracy by eliminating variants that are present in unaffected individuals as well as in the cases. A vast number of algorithms and filters have been developed for those scenarios.

Unfortunately, clinicians mostly deal with single affected individuals or small families. Conventional whole-genome and whole-exome search and variant-prioritization tools are underpowered in these situations, potentially limiting the number of successful diagnoses. To further reduce the search space and focus the analysis on candidate variants that are highly likely to contribute to the observed symptoms, we have implemented a phenotype-driven, ontology-based variant re-ranking tool in VarSeq called PhoRank.

PhoRank is modeled on the Phevor algorithm [1], published by Mark Yandell’s group in 2014. It works by leveraging the knowledge resident in diverse biomedical ontologies, such as the Human Phenotype Ontology (HPO) and the Gene Ontology (GO). These ontologies contain the critical links between genes and diseases. They are organized as directed acyclic graphs, so a traversal can start at one input term and follow its connections to other term nodes and, in turn, their connections.
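To make the graph structure concrete, here is a minimal Python sketch of an ontology stored as a directed acyclic graph. The term names, gene names, and the helpers `ancestors` and `genes_near` are hypothetical and chosen only for illustration; they are not real HPO or GO identifiers, and this is not VarSeq's internal representation.

```python
from collections import deque

# Hypothetical toy ontology: each term maps to its parent terms.
# A term may have more than one parent, which is why the structure
# is a directed acyclic graph rather than a tree.
PARENTS = {
    "cleft_palate": ["abnormal_palate", "orofacial_cleft"],
    "abnormal_palate": ["abnormal_oral_cavity"],
    "orofacial_cleft": ["abnormal_head_or_neck"],
    "abnormal_oral_cavity": ["abnormal_head_or_neck"],
}

# Hypothetical gene annotations attached to individual terms.
TERM_GENES = {
    "abnormal_palate": {"GENE_A"},
    "orofacial_cleft": {"GENE_B"},
}


def ancestors(term, parents=PARENTS):
    """Return every term reachable by following parent links upward."""
    seen = set()
    queue = deque([term])
    while queue:
        current = queue.popleft()
        for parent in parents.get(current, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


def genes_near(term, parents=PARENTS, term_genes=TERM_GENES):
    """Collect genes annotated to the term itself or to any ancestor."""
    genes = set()
    for node in {term} | ancestors(term, parents):
        genes |= term_genes.get(node, set())
    return genes
```

Starting from the single input term `cleft_palate`, the traversal reaches both parent branches and picks up the genes annotated along the way.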

PhoRank and Phevor start by mapping an individual’s phenotypes to well-specified terms in the Human Phenotype Ontology. The algorithm assigns initial scores to these nodes in the ontology graph. Next, it propagates this score information through the various ontologies, with the score decaying the farther a node is from the original nodes. When finished, genes with high scores are more closely related to the specified phenotypes, while genes with low scores have little or no relation to them.
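The propagation idea can be illustrated with a deliberately simplified Python sketch. It is not the Phevor or PhoRank algorithm: the geometric decay factor, the breadth-first distance measure, the summing of term scores per gene, and every term and gene name below are assumptions made for the example.

```python
from collections import defaultdict, deque


def propagate(edges, seeds, decay=0.5):
    """Spread a score of 1.0 from each seed term across the term graph.

    Assumption for this sketch: a node at graph distance d from a seed
    receives decay ** d, and contributions from several seeds are summed.
    """
    neighbors = defaultdict(set)
    for child, parent in edges:
        neighbors[child].add(parent)
        neighbors[parent].add(child)

    scores = defaultdict(float)
    for seed in seeds:
        distance = {seed: 0}
        queue = deque([seed])
        while queue:
            node = queue.popleft()
            scores[node] += decay ** distance[node]
            for nxt in neighbors[node]:
                if nxt not in distance:
                    distance[nxt] = distance[node] + 1
                    queue.append(nxt)
    return dict(scores)


def score_genes(term_scores, gene_terms):
    """Score each gene by summing the scores of the terms it is annotated to."""
    return {
        gene: sum(term_scores.get(term, 0.0) for term in terms)
        for gene, terms in gene_terms.items()
    }


# Hypothetical toy data, not real HPO terms or gene symbols.
edges = [
    ("cleft_palate", "abnormal_palate"),
    ("abnormal_palate", "abnormal_oral_cavity"),
]
term_scores = propagate(edges, seeds=["cleft_palate"])
gene_scores = score_genes(
    term_scores,
    {
        "GENE_A": ["abnormal_palate"],
        "GENE_B": ["abnormal_oral_cavity"],
        "GENE_C": ["unrelated_term"],
    },
)
```

In this toy run the gene annotated one step from the input phenotype scores higher than the gene two steps away, and the gene attached to an unconnected term scores zero, which mirrors the high-score and low-score behavior described above.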

VarSeq then takes these ranked genes and joins them to the genes observed in the imported and annotated variants from your project. The output contains the gene score, a percentile ranking of that gene among all other observed genes, and an informative path between the gene and the closest input phenotype term.
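As a rough illustration of the join and the percentile ranking, the Python sketch below attaches gene scores to a list of variants. The record layout, the function names, and the percentile definition (the share of observed genes scoring at or below a given gene) are assumptions for this example and may differ from what VarSeq computes; the path output is not shown.

```python
from bisect import bisect_right


def percentile_ranks(gene_scores):
    """Percent of observed genes whose score is at or below each gene's score."""
    values = sorted(gene_scores.values())
    total = len(values)
    return {
        gene: 100.0 * bisect_right(values, score) / total
        for gene, score in gene_scores.items()
    }


def annotate_variants(variants, gene_scores):
    """Join gene scores onto variants and sort by score, highest first.

    Genes that have no phenotype score are assumed to score 0.0.
    """
    observed = {
        gene: gene_scores.get(gene, 0.0)
        for gene in {variant["gene"] for variant in variants}
    }
    percentiles = percentile_ranks(observed)
    annotated = [
        dict(
            variant,
            gene_score=observed[variant["gene"]],
            percentile=percentiles[variant["gene"]],
        )
        for variant in variants
    ]
    return sorted(annotated, key=lambda v: v["gene_score"], reverse=True)


# Hypothetical variants and scores for illustration only.
variants = [
    {"id": "var1", "gene": "GENE_A"},
    {"id": "var2", "gene": "GENE_B"},
    {"id": "var3", "gene": "GENE_C"},
    {"id": "var4", "gene": "GENE_A"},
]
ranked = annotate_variants(variants, {"GENE_A": 0.5, "GENE_B": 0.25})
```

Because the ranking is joined onto whatever variants remain in the project, the same function can be applied after any filter chain, so filtering narrows the list and the score orders what is left.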

The result is that VarSeq can harmonize ranking and filtering strategies, allowing some filtering to narrow the search space while prioritizing the resulting variants by their gene’s relevance to an individual’s phenotypes.


PhoRank results showing prioritized candidate genes with de novo mutations in a proband with global developmental delay, cleft palate, and a few other phenotypes.

PhoRank is especially useful for single-exome and family-trio-based diagnostic analyses, the most commonly occurring clinical scenarios. Please check out our newest release of VarSeq for more information. Our team of experts is happy to demonstrate how you can use this latest addition to VarSeq’s capabilities to conduct a whole-exome analysis more effectively.

Literature

[1] Phevor Combines Multiple Biomedical Ontologies for Accurate Identification of Disease-Causing Alleles in Single Individuals and Small Nuclear Families. Am J Hum Genet. 2014 Apr 3;94(4):599-610


Andreas Scherer

About Andreas Scherer

Dr. Andreas Scherer is CEO of Golden Helix. The company has been delivering industry-leading bioinformatics solutions for the advancement of life science research and translational medicine for over a decade. Its innovative technologies and analytic services empower scientists and healthcare professionals at all levels to derive meaning from the rapidly increasing volumes of genomic data produced from next-generation sequencing. With its solutions, hundreds of the world’s hospitals and testing labs are able to harness the full potential of genomics to identify the cause of disease, develop genomic diagnostics, and advance the quest for personalized medicine. Golden Helix products and services have been cited in thousands of peer-reviewed publications. Golden Helix is also on the Inc 5000 list of the fastest-growing private companies in the US. Dr. Scherer is also Managing Partner of Salto Partners, Inc, a management consulting firm headquartered in Nevada. He has extensive experience successfully managing growth as well as orchestrating complex turnaround situations. Salto Partners advises on business strategy, financing, sales, and operations for clients in the high-tech and life sciences space. Dr. Scherer holds a Ph.D. in computer science from the University of Hagen, Germany, and a Master of Computer Science from the University of Dortmund, Germany. He is author and co-author of over 20 international publications and has written books on project management, the Internet, and artificial intelligence. His latest book, “Be Fast Or Be Gone”, is a prizewinner in the 2012 Eric Hoffer Book Awards competition and has been named a finalist in the 2012 Next Generation Indie Book Awards.
