Imputation performance in Latin American populations: improving rare variant estimations with the inclusion of Native American genomes.

Study ID: EGAS00001005797
Type: Other

Study Description

Current genome-wide association studies (GWAS) rely on genotype imputation to increase statistical power, improve fine-mapping of association signals, and facilitate meta-analyses. Because of the complex demographic history of Latin America and the lack of balanced representation of Native American genomes in current imputation panels, locally relevant disease variants are likely to be missed, limiting the scope and impact of biomedical research in these populations. Better representation of diversity in genomic databases is therefore a scientific imperative. Here, we expand the 1000 Genomes reference panel (1KGP) with 134 Native American genomes (84 publicly available and 50 newly sequenced) to assess imputation performance in Latin American individuals of mixed ancestry. Our expanded panel increased the number of SNPs above the GWAS quality threshold, thus improving statistical power for association studies in the region. It also increased imputation accuracy, particularly for low-frequency variants segregating in Native American ancestry tracts.

Study Datasets 1 dataset.

Description: 50 whole genome sequences from 50 Mexican individuals with a high proportion of Native American ancestry.
Technology: HiSeq X Ten
Samples: 50
