NIDDK IBD Genetics Consortium Ulcerative Colitis Genome-Wide Association Study

This dataset contains data for 1,028 white, non-Hispanic, European ancestry individuals with ulcerative colitis who were included in a genome-wide association study published by Silverberg et al. (2009). These individuals were ascertained in North America and selected to have either left-sided or extensive disease (i.e., individuals with proctitis only were excluded). Genotyping was performed using the Illumina HumanHap300v2 (n = 540) and HumanHap550v3 (n = 488) Genotyping BeadChips at the Feinstein Institute for Medical Research. Control data (not included) were obtained from the NIDDK IBD Genetics Consortium's Crohn's Disease GWAS (available from dbGaP) and from studies 64 and 65 deposited in the Illumina iControlDB.

Seven hundred eighty individuals in this dataset were taken from the NIDDK IBD Genetics Consortium cell line repository. These individuals are identified in the file dbGaP_SubjectDS.txt. The subject IDs for these individuals may be used to request corresponding samples for follow-up research through the repository. In addition, complete phenotype data for these individuals are included, collected using the Consortium's forms and phenotyping manual (both included). The remaining 248 individuals were identified from pre-existing collections ascertained by members of the Consortium or their collaborators. For these samples, several of the items in the phenotype file are incomplete.

Those who wish to replicate the results in Silverberg et al. should note that six individuals with missing genotype rates greater than 0.07 were excluded from that analysis (leaving 1,022 affected samples in total). In addition, the minor allele frequencies (MAFs) reported in the publication were calculated using only those individuals who were included in the allelic association tests (n = 977 for SNPs included in the HumanHap300 and n = 476 for SNPs included only in the HumanHap550). These tests were performed using conditional logistic regression on gender-ancestry strata; individuals who were not placed in a stratum (using the procedure described in the supplementary information for Silverberg et al.) were excluded. The indicator variables hh300 and hh550 in the file dbGaP_PhenotypeDS.txt identify the samples included in the allelic association tests, and may be used to replicate the published MAFs among affected individuals.
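
As a rough illustration of the replication steps described above, the following Python sketch selects individuals by an indicator variable and computes a MAF among them. It is not the procedure used in the publication. It assumes that dbGaP_PhenotypeDS.txt is tab-delimited, that the subject identifier column is named SUBJECT_ID, that hh300 and hh550 are coded as 1 for included samples, and that genotypes have already been converted to counts (0, 1, or 2) of one allele, with None marking a missing call. The subject IDs and genotype values shown are made up for the example.

```python
import csv
import io


def read_included_ids(phenotype_file, indicator, id_column="SUBJECT_ID"):
    """Return IDs of subjects whose indicator column equals 1.

    The tab delimiter, the id_column name, and the 1/0 coding are
    assumptions; check them against the data dictionary before use.
    """
    reader = csv.DictReader(phenotype_file, delimiter="\t")
    return {row[id_column] for row in reader if row[indicator].strip() == "1"}


def missing_rate(genotypes):
    """Fraction of missing calls (None) in one individual's genotypes."""
    return sum(g is None for g in genotypes) / len(genotypes)


def minor_allele_frequency(genotypes):
    """MAF for one SNP from allele counts (0, 1, 2); None calls are skipped."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return None
    freq = sum(called) / (2 * len(called))
    return min(freq, 1 - freq)


def maf_among(genotype_by_subject, included_ids):
    """MAF for one SNP, restricted to the subjects in included_ids."""
    genotypes = [g for sid, g in genotype_by_subject.items() if sid in included_ids]
    return minor_allele_frequency(genotypes)


# Hypothetical stand-in for dbGaP_PhenotypeDS.txt (made-up subjects).
PHENO = (
    "SUBJECT_ID\thh300\thh550\n"
    "S1\t1\t0\n"
    "S2\t1\t1\n"
    "S3\t0\t0\n"
    "S4\t1\t1\n"
)

# Hypothetical genotypes for a single SNP, keyed by subject ID.
SNP = {"S1": 0, "S2": 1, "S3": 2, "S4": None}

ids_hh300 = read_included_ids(io.StringIO(PHENO), "hh300")
ids_hh550 = read_included_ids(io.StringIO(PHENO), "hh550")

maf_hh300 = maf_among(SNP, ids_hh300)  # for a SNP on the HumanHap300
maf_hh550 = maf_among(SNP, ids_hh550)  # for a SNP only on the HumanHap550

print(maf_hh300, maf_hh550)
```

With real data, open(...) on the phenotype file would replace io.StringIO(PHENO). The missing_rate helper can be used to flag individuals whose rate exceeds 0.07 before any frequencies are computed.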