Population Architecture Using Genomics and Epidemiology (PAGE): Epidemiologic Architecture for Genes Linked to Environment (EAGLE) - BioVU Cancer Project

As part of the Population Architecture using Genomics and Epidemiology (PAGE) study (Phase I), the Epidemiologic Architecture for Genes Linked to Environment (EAGLE I) project accessed both epidemiologic- and clinic-based collections. The epidemiologic collection of EAGLE I comprised the National Health and Nutrition Examination Surveys (NHANES) ascertained in 1991-1994 (NHANES III), 1999-2002, and 2007-2008. NHANES is a population-based cross-sectional survey, now conducted every year in the United States, that assesses the health status of Americans at the time of ascertainment and tracks trends across survey years. Genetic NHANES consists of 19,613 DNA samples linked to thousands of variables, including demographics, health and lifestyle variables, physical examination variables, laboratory variables, and exposures. NHANES is diverse, with almost one-half of the samples (46.4%) coming from self-reported Mexican Americans and non-Hispanic blacks. In contrast to NHANES, BioVU is a clinic-based collection of >150,000 DNA samples from Vanderbilt University Medical Center linked to de-identified electronic medical records (EMRs). Approximately 12% of BioVU's overall DNA sample collection is from African American, Hispanic, and Asian patients.

The overall goals of PAGE I and EAGLE I were broad:

  1. Replicate genome-wide association study (GWAS)-identified variants in European Americans;
  2. Identify population-specific and trans-population genotype-phenotype associations;
  3. Identify genetic and environmental modifiers of these associations.

NHANES is an excellent resource for the study of quantitative traits associated with common human diseases. However, because the age range of NHANES spans childhood to late adulthood and not all diseases are surveyed, NHANES is less useful for the study of adult-onset diseases such as major cancers. Therefore, under American Recovery and Reinvestment Act (ARRA) funding, EAGLE, as part of PAGE I, defined eight major cancer sites for genetic analysis in BioVU, Vanderbilt's biorepository linked to de-identified EMRs. The eight cancers defined for this study were melanoma, breast, ovarian, prostate, colorectal, lung, endometrial, and non-Hodgkin's lymphoma (NHL). Cancer cases were defined using a combination of ICD-9 codes and tumor registry entries. Controls were BioVU participants without cancer, selected to encompass the age and gender distributions of the cancer cases. Targeted genotyping of GWAS-identified variants for these diseases (124 SNPs) and of ancestry informative markers (128 AIMs) was performed by the Center for Human Genetics Research Vanderbilt DNA Resources Core. After quality control, a total of 116 cancer-associated SNPs and 122 AIMs were available for downstream analyses.
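
The quality-control figures above imply how many markers were dropped from each panel. The short Python sketch below works through that arithmetic using only the counts stated in this description; the function name `qc_summary` and the panel labels are illustrative, not part of the study's own pipeline, and the specific quality-control criteria applied are not described here.

```python
# Marker counts as reported in the study description.
genotyped = {"cancer SNPs": 124, "AIMs": 128}
passed_qc = {"cancer SNPs": 116, "AIMs": 122}


def qc_summary(genotyped, passed_qc):
    """Return {panel: (markers_dropped, percent_retained)} for each panel.

    Illustrative helper (hypothetical name); it only restates the
    reported counts as differences and percentages.
    """
    summary = {}
    for panel, n_typed in genotyped.items():
        n_pass = passed_qc[panel]
        dropped = n_typed - n_pass
        pct_retained = round(100 * n_pass / n_typed, 1)
        summary[panel] = (dropped, pct_retained)
    return summary


summary = qc_summary(genotyped, passed_qc)
for panel, (dropped, pct) in summary.items():
    print(f"{panel}: {dropped} dropped, {pct}% retained")
```

Under these counts, 8 of the 124 cancer-associated SNPs and 6 of the 128 AIMs did not pass quality control, leaving roughly 93.5% and 95.3% of each panel for downstream analyses.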