The GenomeAsia 100K Variant Frequency database is a pilot annotation source now available to our users. This database offers a deep characterization of specific populations in Asia that can be used to drive genetic studies. GenomeAsia comprises whole-genome sequencing data from over 1,000 individuals in 219 populations across Asia. Using it as an annotation source, users can analyze population structure and history, and improve variant filtering when searching for genes associated with rare diseases.
This annotation source can be accessed in the public annotations directory. It contains over 66 million features and catalogs genetic variation, population structure, disease associations, and founder effects.
The primary value of this database is that it provides information for genetic studies on non-European individuals, who are typically underrepresented in reference genome datasets. A recent publication highlights some additional benefits of GenomeAsia (1):
- Improves the ability to filter out low-probability candidates for highly penetrant disorders.
- Identifies putatively pathogenic variants that are found at high frequency in particular populations.
- Improves the ability to infer pathogenicity of identified variants.
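To make the filtering idea above concrete, here is a minimal Python sketch of frequency-based candidate filtering. It is an illustration under stated assumptions, not the method used by the publication or by our software: the `population_af` field, the population names, the variant IDs, and the 1% threshold are all hypothetical and would need to be mapped to the fields your annotated data actually exposes.

```python
from typing import Dict, List

# Hypothetical threshold: a variant more common than this in any population
# is treated as an unlikely cause of a rare, highly penetrant disorder.
MAX_POPULATION_AF = 0.01


def filter_rare_candidates(
    variants: List[Dict],
    max_af: float = MAX_POPULATION_AF,
) -> List[Dict]:
    """Keep variants whose highest population allele frequency is <= max_af.

    Each variant is a dict with a hypothetical "population_af" mapping of
    population name -> allele frequency. Variants with no frequency data
    are kept, since absence from the catalog is consistent with being rare.
    """
    kept = []
    for variant in variants:
        frequencies = variant.get("population_af") or {}
        highest = max(frequencies.values(), default=0.0)
        if highest <= max_af:
            kept.append(variant)
    return kept


# Made-up example records, for illustration only.
candidates = [
    {"id": "var_a", "population_af": {"pop_1": 0.0002, "pop_2": 0.0}},
    {"id": "var_b", "population_af": {"pop_1": 0.15, "pop_2": 0.001}},
    {"id": "var_c", "population_af": {}},
]

rare = filter_rare_candidates(candidates)
print([v["id"] for v in rare])  # prints ['var_a', 'var_c']
```

Taking the maximum across populations, instead of a pooled average, is a deliberate choice in this sketch: a variant that is common in even one population (like `var_b` here) is flagged, which is the kind of population-specific signal a pooled frequency could hide.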
In summary, GenomeAsia can expand genetic studies by incorporating information from over 200 populations across Asia. We hope you find value in using it in your internal pipeline. Please let us know if there are other annotation sources that you would like us to implement. If you have any questions about this database or our software, please do not hesitate to reach out to [email protected]. Feel free to also check out our other blog posts, which contain useful news and updates for the next-gen sequencing community.
Works Cited:
- Wall, J., Stawiski, E., Ratan, A., Kim, H., et al. (GenomeAsia100K Consortium). The GenomeAsia 100K Project Enables Genetic Discoveries Across Asia. Nature. 2019.