The UK10K project proposes a series of complementary genetic approaches to find new low-frequency and rare variants contributing to disease phenotypes. These will be based on obtaining the genome-wide sequence of 4000 samples from the TwinsUK and ALSPAC cohorts (at 6x sequence coverage), and the exome sequence (protein-coding regions and related conserved sequence) of 6000 samples selected for extreme phenotypes. Our studies will focus primarily on cardiovascular-related quantitative traits, obesity and related metabolic traits, neurodevelopmental disorders, and a limited number of extreme clinical phenotypes that will provide proof-of-concept for future familial trait sequencing. We will directly analyse quantitative traits in the cohorts and the selected traits in the extreme samples, and also use imputation down to 0.1% allele frequency to extend the analyses to further sample sets with genome-wide genotype data. In each case we will investigate indels and larger structural variants as well as SNPs, and use statistical methods that combine rare variants in a locus or pathway as well as single-variant approaches. The TwinsUK samples will be part of the cohort study and will undergo whole-genome sequencing. For further information about this cohort, please contact Brent Richards or Nicole Soranzo.

Click on a Dataset ID in the table below to learn more, and to find out who to contact about access to these data

Dataset ID       Description  Technology                                        Samples
EGAD00001000194               Illumina Genome Analyzer II, Illumina HiSeq 2000  1713
EGAD00001000741               Illumina Genome Analyzer II, Illumina HiSeq 2000  1854
EGAD00001000776               Illumina Genome Analyzer II, Illumina HiSeq 2000  3781
EGAD00001000790                                                                 1854
