The Resource for Genetic Epidemiology Research on Aging (GERA) Cohort was created by an RC2 "Grand Opportunity" grant that was awarded to the Kaiser Permanente Research Program on Genes, Environment, and Health (RPGEH) and the UCSF Institute for Human Genetics (AG036607; Schaefer/Risch, PIs). The RC2 project enabled genome-wide SNP genotyping (GWAS) to be conducted on a cohort of over 100,000 adults who are members of the Kaiser Permanente Medical Care Plan, Northern California Region (KPNC), and participating in its RPGEH. The purpose of the RPGEH is to facilitate research on the genetic and environmental factors that affect health and disease by linking clinical data from electronic health records, survey data on demographic and behavioral factors, and environmental data from various sources with genetic data derived from biospecimens collected from participants. At the time of the award of the RC2 project in late 2009, the RPGEH had established a cohort of about 140,000 individuals who had answered a detailed survey, provided saliva samples for extraction of DNA, and given broad consent for the use of their data in studies of health and disease. To maximize the diversity of the resulting sample, the GERA cohort was formed by including all racial and ethnic minority participants with saliva samples (N = 20,925; 19%); the remaining participants were drawn sequentially and randomly from white non-Hispanic participants (N = 89,341; 81%). A total of 110,266 participant samples were included to ensure that at least 100,000 were successfully assayed. The resulting GERA cohort is 42% male and 58% female, and ranges in age from 18 to over 100 years, with an average age of 63 years at the time of the RPGEH survey (2007). The sample is ethnically diverse and generally well educated, with above-average income. Approximately 69% of the participants are married or living with a partner. Length of membership in KPNC averages 23.5 years.
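The sampling design described above (include every minority participant, then fill the remainder of the target with a random draw of white non-Hispanic participants) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the procedure the RPGEH actually ran: the function name `select_gera_sample`, the record keys `id` and `race_ethnicity`, the label `white_non_hispanic`, and the fixed seed are all hypothetical.

```python
import random


def select_gera_sample(participants, target_size, seed=0):
    """Sketch of the oversampling design (all names here are hypothetical).

    participants: list of dicts with keys 'id' and 'race_ethnicity'.
    Every participant not labeled 'white_non_hispanic' is included; the
    rest of the target is filled by a simple random draw, without
    replacement, from the white non-Hispanic participants.
    """
    minority = [p for p in participants
                if p["race_ethnicity"] != "white_non_hispanic"]
    white = [p for p in participants
             if p["race_ethnicity"] == "white_non_hispanic"]

    # Number of white non-Hispanic participants still needed.
    remaining = max(target_size - len(minority), 0)

    rng = random.Random(seed)  # seeded only so the sketch is reproducible
    drawn = rng.sample(white, min(remaining, len(white)))
    return minority + drawn
```

With the counts reported above, a target of 110,266 and 20,925 minority participants would leave 89,341 to be drawn from the white non-Hispanic group.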
UCSF and RPGEH investigators worked with the genomics company Affymetrix to design four custom microarrays, one for each of the four major race-ethnicity groups included in the GERA Cohort, described in detail in Hoffmann et al., 2011a and 2011b. Following genotyping and quality control procedures, and after removal of invalid, discordant, or withdrawn samples, about 103,000 participants were successfully genotyped. The resulting genotypic data were linked to survey data and data abstracted from the electronic medical records. As described below, all RPGEH participants were mailed new consent forms with explicit discussion of the placement of data in the NIH-maintained dbGaP. About 77% of participants returned completed consent forms, resulting in a final sample size of 78,486 participants in the GERA Cohort with data for deposit into dbGaP.

Origins of the RPGEH GERA Cohort

The goal in creating the RPGEH GERA cohort was to create a large, multiethnic, and comprehensive population-based resource for research into the genetic and environmental basis of common age-related diseases and their treatment, and factors influencing healthy aging and longevity. The GERA Cohort consists of a diverse cohort of more than 100,000 adults who are members of the Kaiser Permanente Medical Care Plan, Northern California Region (KPNC), and participating in its Research Program on Genes, Environment and Health (RPGEH). KPNC is an integrated health care delivery system with a population of about 3.3 million people in northern California. The membership of KPNC is representative of the general population in the 14-county area in which facilities are located, although the extremes of income at both ends of the spectrum are underrepresented.
The RPGEH uses the longitudinal electronic health records (EHR) of KPNC to obtain clinical, laboratory, imaging, and pharmacy information on all cohort members, to which personal demographic, behavioral, and health characteristics have been added through member surveys. The GERA Cohort comprises a subsample of the RPGEH participant cohort, and was created through the RC2 award from the NIA, NIMH, and NIH Common Fund as described above.

GERA Study Design

The GERA Cohort is a subsample, as described above, of the longitudinal cohort enrolled in the Kaiser Permanente RPGEH. The RPGEH cohort includes about 400,000 survey participants, of whom about 200,000 have provided broad consent and a sample of saliva or blood for use in studies of genetic and environmental factors in health and disease. The GERA Cohort was developed from a mailed survey sent to all adult members of KPNC who had been members for two years or more in 2007. All survey respondents were contacted and asked to complete a consent form; those who completed consent forms were asked to provide a saliva sample. Additional male participants were added to the RPGEH through inclusion of the Northern California sample of the California Men's Health Study (CMHS) cohort of about 40,000 men from KPNC, aged 45 to 69 years at the time of the CMHS survey in 2002-2003. The CMHS participants contributed about 15,400 saliva samples to the RPGEH and were eligible for inclusion in the GERA Cohort. CMHS participants were included according to the same sampling design as for the RPGEH cohort as a whole. Specifically, all minority participants were selected for inclusion in order to maximize representation of minorities in the GERA Cohort, and Non-Hispanic White participants were selected at random to complete the sample of 110,266 GERA Cohort participants.

GERA Genotypic Data

High-density genotyping was conducted at UCSF using custom-designed Affymetrix Axiom arrays, as described in Hoffmann et al. (2011a; 2011b).
To maximize genome-wide coverage of common and less common variants, four specific arrays were designed for individuals of Non-Hispanic White (EUR), East Asian (EAS), African-American (AFR), and Latino (LAT) race/ethnicity. There was broad overlap among the SNPs on the arrays, which were designed using a hybrid greedy imputation algorithm (Hoffmann et al., 2011b) applied to genotype information validated by Affymetrix from the 1000 Genomes Project. However, in order to capture low-frequency variants specific to particular race-ethnicity groups, SNP content varies between arrays. A more detailed description of the process of genotyping and results is included in Genotyping of DNA Samples. Description of the analyses of population structure and development of principal components for adjustment of population structure is included in Population Structure Analysis.

GERA Phenotypic Data

RPGEH and CMHS Survey Data. The sources of data on demographic and behavioral factors deposited in dbGaP for the GERA Cohort are the RPGEH and CMHS surveys. Data on common demographic factors such as gender, race/ethnicity, marital status, and education, and on behavioral factors such as smoking, alcohol consumption, and body mass index, have been cleaned, edited, reconciled between the two surveys, and compiled into summary indices, where appropriate, for deposition into dbGaP. A more complete description of the survey variables is included in Survey Variables Documentation. Please note that the terms of use of the GERA Cohort Data, as specified in the Data Use Certification (DUC), prohibit the use of survey variables as outcomes in analyses. For example, a genome-wide association study (GWAS) of education or smoking is not permitted under the DUC. Only health conditions can be used as outcome variables in analyses.

Health Conditions derived from Kaiser Permanente Electronic Medical Records.
Data on the occurrence of health conditions in participants in the GERA Cohort have been derived by summarizing ICD-9 coded diagnoses in Kaiser Permanente's electronic medical records. An algorithm that aggregates specific ICD-9 codes into appropriate diagnostic groups for selected conditions is applied to outpatient and inpatient databases; see Disease and Conditions Definitions Documentation for details. The criterion for including a condition as "present" for a participant is the occurrence of two or more diagnoses within a diagnostic category on separate days. Two or more is used as the criterion in order to reduce false positives due to mistakes or rule-out diagnoses. When compared with validated disease registries, the criterion of 2+ diagnoses yields high specificity and good sensitivity. ICD-9 codes in the electronic records are specified in several ways. For outpatient visits occurring during the period 1995 to 2006, diagnoses were assigned by the treating physician, who endorsed specific diagnoses on an optically scanned list that varied by specialty. Beginning in 2006, with the advent of an integrated, fully electronic medical record, outpatient diagnoses are made by physicians/providers using a pull-down menu. Discharge diagnoses from inpatient stays are specified by physicians and coded by specially trained coders. Databases of ICD-9 codes for diagnoses assigned at outpatient visits, or as one of the discharge diagnoses following inpatient stays, are complete and available for all KPNC members dating back to 1995. Although the average length of KPNC membership among GERA cohort members was 23.5 years as of 2007, not all have been members since 1995, so the history for some conditions, such as those that are not chronic or recurrent, may not be complete for all cohort members.
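The "two or more diagnoses on separate days" criterion described above can be sketched as follows. This is a minimal illustration, not the RPGEH algorithm itself: the function name `conditions_present`, the input layout (participant ID, ICD-9 code, visit date), and the code-to-group mapping are assumptions made for this example. The actual diagnostic groupings are those given in Disease and Conditions Definitions Documentation.

```python
from collections import defaultdict


def conditions_present(diagnoses, code_to_group, min_days=2):
    """Flag a condition as present when a participant has diagnoses in the
    same diagnostic group on at least `min_days` distinct days.

    diagnoses: iterable of (participant_id, icd9_code, date_string) tuples.
    code_to_group: dict mapping ICD-9 codes to a diagnostic group label
        (an illustrative mapping, not the study's definitions).
    Returns a dict of participant_id -> set of groups flagged as present.
    """
    # Collect the distinct diagnosis days per participant and group.
    days = defaultdict(set)
    for participant_id, code, day in diagnoses:
        group = code_to_group.get(code)
        if group is not None:  # ignore codes outside the selected conditions
            days[(participant_id, group)].add(day)

    present = defaultdict(set)
    for (participant_id, group), distinct_days in days.items():
        if len(distinct_days) >= min_days:
            present[participant_id].add(group)
    return dict(present)


# Illustrative mapping only: ICD-9 250.xx denotes diabetes mellitus and
# 401.9 denotes unspecified essential hypertension.
example_groups = {"250.00": "diabetes", "250.02": "diabetes",
                  "401.9": "hypertension"}
example_diagnoses = [
    ("A", "250.00", "2005-01-10"),
    ("A", "250.02", "2005-03-02"),  # second diabetes code, different day
    ("B", "250.00", "2006-07-01"),
    ("B", "250.00", "2006-07-01"),  # same day, so counted once
    ("C", "401.9", "2004-11-20"),   # single diagnosis only
]
flags = conditions_present(example_diagnoses, example_groups)
```

In this example only participant A meets the criterion. Counting distinct days rather than distinct records is what keeps repeated entries from a single visit from qualifying.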
The year of first membership in KPNC is included as a variable in the list of survey variables, enabling investigators to estimate the number of years of observation of each Cohort member.

RPGEH Access and Collaborations Website and Procedures

The RPGEH maintains a web portal for inquiries and applications for collaboration and access to data. The URL is: https://rpgehportal.kaiser.org/. RPGEH has an application process and an Access Review Committee that reviews applications for collaboration and use. For more details, please contact RPGEH through the website.
The focus of this project is to identify genetic variants that are associated with orofacial clefts in populations in sub-Saharan Africa. Most genetic studies of CL/P (including the vast majority of GWAS) have been conducted in populations of European origin, with only a few focused on Asian or African populations. We chose to study the genetics of these complex traits in African populations because African populations have the greatest genetic variation among the world's populations by virtue of being the primary ancestral population to modern humans (Cavalli-Sforza and Feldman, 2003; Ramsay et al., 2011). Therefore, the potential for finding novel loci for CL/P is quite high. To date, six genome-wide association studies (GWAS) for cleft lip with or without cleft palate (CL/P) have been conducted and 18 risk loci identified (Birnbaum et al., 2009; Grant et al., 2009; Beaty et al., 2010; Mangold et al., 2010; Ludwig et al., 2012; Sun et al., 2015). All of these studies were conducted in European populations, Asian populations, or both. There is currently no published GWAS for clefts in African populations. African populations represent a novel and richly productive population for genetic and environmental exposure studies of CL/P. Investigating the presence of genetic variants in diverse population groups can identify novel variants and candidate genes that are population specific. Environmental factors may also increase the risks in certain population groups due to genetic susceptibility and/or specific exposures. Understanding the role these susceptibility genes play in the effects of environmental risk factors can inform strategies aimed at reducing the occurrence of these complex traits, e.g., through the modification of environmental influences. The study population comprises 3,205 individuals from Africa (Ghana, Ethiopia, and Nigeria).
The study includes cases, case triads (nuclear families), and controls with no history of orofacial clefts (OFC) or other developmental defects.
Alzheimer disease is the most common neurodegenerative disorder of the elderly, affecting an estimated five million Americans. Genetic factors contribute to the risk for disease, with heritability estimates ranging from 57% to 79%. More than a decade ago, the ε4 variant of APOE was identified, and it remains the most consistently replicated genetic variant influencing the risk of late-onset Alzheimer disease. A segregation analysis suggests there may be four additional genes influencing the age at onset of Alzheimer disease. In 2007, 968 association studies in 398 candidate genes were reported, but none replicated consistently. There are many reasons for the lack of consistency, but one important reason for the lack of progress is the shortage of well-characterized families and patients available to the entire scientific community. The extensive effort and expense required to ascertain such a population have been addressed by the NIA-LOAD Family Study. Its goal is to identify and recruit families with two or more siblings with the late-onset form of Alzheimer's disease and a cohort of unrelated, non-demented controls similar in age and ethnic background, and to make the samples, the clinical and genotyping data, and preliminary analyses available to qualified investigators worldwide. Genotyping by the Center for Inherited Disease Research (CIDR) was performed using the Illumina Infinium II assay protocol with hybridization to Illumina Human 610Quadv1_B Beadchips. This genotyping represents the largest collection of families with Alzheimer's disease ever assembled, combining the NIA-LOAD Genetics Initiative Multiplex Family Study and the National Cell Repository for Alzheimer's Disease (NCRAD), with additional controls from the University of Kentucky. These genotyping results will serve as a focal point for future research to identify the remaining genetic variants in Alzheimer's disease.
The Centers for Mendelian Genomics project uses next-generation sequencing and computational approaches to discover the genes and variants that underlie Mendelian conditions. By discovering genes that cause Mendelian conditions, we will expand our understanding of their biology to facilitate diagnosis and new treatments.
This study aims to identify effective drugs for clear cell ovarian cancer (CCC) and high-grade serous ovarian cancer (HGSC) through high-throughput drug screening using ovarian cancer organoids, and to identify novel therapeutic targets based on the biological characteristics of CCC and HGSC through omics analysis.
We collected 187 samples of normal and primary tumor tissue from 69 cases of endometrial carcinoma. The purpose of this study is to develop actionable molecular targets and/or biomarkers for prognostication and patient stratification.
WTCCC genome-wide case-control association study for Bipolar Disorder (BD) using six disease collections together with the 1958 British Birth Cohort and the UK National Blood Service collections as controls.
To identify biomarkers that distinguish between different chronic rhinosinusitis patient groups and disease controls (n = 20 each for DC, CRSsNP, CRSwNP, and N-ERD), we use high-throughput targeted proteomics (Olink Multiplex platform) on nasal secretions and serum samples.