SG10K_Pilot - Large-scale whole-genome sequencing of three diverse Asian populations in Singapore

Underrepresentation of Asian genomes has hindered population and medical genetics research on Asians, leading to population disparities in precision medicine. By whole-genome sequencing of 4,810 Singapore Chinese, Malays, and Indians, we found 98.3 million SNPs and small insertions/deletions, over half of which are novel. Population structure analysis demonstrated great representation of Asian genetic diversity by three ethnicities in Singapore, and revealed a Malay-related novel ancestry component. Furthermore, demographic inference suggested that Malays split from Chinese ~24,800 years ago, and experienced significant admixture with East Asians ~1,700 years ago, coinciding with the Austronesian expansion. Additionally, we identified 20 candidate loci for natural selection, among which 14 harbored robust associations with complex traits and diseases. Finally, we showed that our data can substantially improve genotype imputation in diverse Asian and Oceanian populations. These results highlight the value of our data as a resource to empower human genetics discovery across broad geographic regions.

Click on a Dataset ID in the table below to learn more and to find out whom to contact about access to these data.

Dataset ID       Description  Technology  Samples
EGAD00001005337  -            -           4810
Publications

Prevalence and spectrum of DNA mismatch repair gene variation in the general Chinese population. J Med Genet 59: 652-661 (2022).

Variant landscape of the RYR1 gene based on whole genome sequencing of the Singaporean population. Sci Rep 12: 5429 (2022).

A robust pipeline for ranking carrier frequencies of autosomal recessive and X-linked Mendelian disorders. NPJ Genom Med 7: 72 (2022).

GBC: a parallel toolkit based on highly addressable byte-encoding blocks for extremely large-scale genotypes of species. Genome Biol 24: 76 (2023).