Esophageal cancer is one of the most aggressive cancers and the sixth leading cause of cancer death worldwide [1]. Approximately 70% of esophageal cancers worldwide occur in China, and esophageal squamous cell carcinoma (ESCC) is the histopathological form in over 90% of cases [2,3]. Clinical approaches for the early diagnosis and treatment of ESCC are currently limited, resulting in a 5-year survival rate of 10% for patients. Meanwhile, the full repertoire of genomic events leading to the pathogenesis of ESCC remains unclear. Here we present a comprehensive genomic analysis of 158 ESCC cases, as part of the International Cancer Genome Consortium (ICGC) Research Projects (http://icgc.org/icgc/cgp/72/371/1001734). We conducted whole-genome sequencing in 14 ESCC cases and whole-exome sequencing in 90 cases.
EUROBATS consists of RNA-seq data for female twins from lymphoblastoid cell lines (LCL), abdominal adipose tissue (Fat), abdominal skin tissue (Skin) and whole blood (Blood). The samples are a combination of MZ twins, DZ twins and singletons. The goal of the project is to find biomarkers related to the ageing process and cellular senescence.
Open-label study to evaluate the safety of Lactobacillus rhamnosus GG ATCC 53103 (LGG) in elderly subjects. Fifteen healthy elderly volunteers, aged 65-80, were enrolled in a study in which they received LGG capsules containing 1 x 10^10 CFU twice daily for 28 days and were followed through day 56. The study subjects completed a daily diary, a telephone call on study days 3, 7 and 14, and study visits at the Massachusetts General Hospital Clinical Research Center at baseline, day 28 and day 56. During each visit, the subject diary, interim history, potential adverse effects and concomitant medications were reviewed, and vital signs and a physical examination were performed. Routine blood tests were obtained to monitor for safety during visits, and nasopharyngeal and stool samples were collected for microbiome analysis. Volunteers interested in participating in the substudy on the effect of LGG probiotic on the human whole blood transcriptome also had blood drawn for DNA and RNA extraction, after signing the substudy consent form. The main objective of the study was to assess the safety and tolerability of 2 x 10^10 CFU LGG administered orally to elderly subjects for 28 days. Secondary objectives were to evaluate the richness and microbial diversity in nasopharyngeal and stool specimens using pyrosequencing, and to compare cytokine production in response to bacterial stimulation by following the kinetics of mRNA expression of pro- and anti-inflammatory genes and different signaling pathways, in relation to changes in stool Bifidobacterium and Lactobacillus spp. The study was reviewed and approved by the Partners Human Research Committee (IRB # 2010P001695) and was registered at ClinicalTrials.gov (NCT01274598). The main results regarding the clinical signs and safety of the intervention are published in PLoS One. 2014 Dec 1;9(12):e113456. doi: 10.1371/journal.pone.0113456. eCollection 2014. PMID: 25438151.
RNA sequencing for transcriptome analysis was done in only 11 of the 15 subjects, because a subject was included only if RNA integrity and quantity were optimal at all three time points at which blood was collected. The microbiome data associated with this study are presented in a separate paper (PMID: 25873374) and are available under dbGaP study accession phs000896.
Preeclampsia (PE) is a leading cause of peripartal morbidity, especially when it develops early in gestation. To enable prophylaxis against PE, it is essential that pregnancies at risk of PE are identified early, in the first trimester. To identify patients at risk, we profiled methylomes of plasma-derived cell-free DNA (cfDNA) from pregnant women. We detected DNA methylation differences between control and PE pregnancies that enable risk stratification of patients, both at PE diagnosis and presymptomatically, around 12 weeks of gestation.
Non-invasive prenatal testing (NIPT) is a powerful screening method for fetal aneuploidy detection, relying on laboratory and computational analysis of cell-free DNA. Although several computational NIPT analysis tools have been published, no comprehensive and direct evaluation of their accuracy has been published. Here, we evaluate and determine the precision of five commonly used computational NIPT aneuploidy analysis tools, considering diverse sequencing depth (coverage), arbitrary sequencing read placement, and fetal DNA fraction on clinically validated NIPT samples.
The dataset contains raw fastq files (fastq.gz) for Chromium Single Cell 5’ gene expression (GEX), human B cell VDJ and feature barcode (CSP) sequencing from transglutaminase 2-specific and other small intestinal plasma cells isolated from four untreated celiac disease patients. Single cell 5’ gene expression, V(D)J-enriched and cell surface protein libraries were generated using Chromium single cell kits, and barcoded cDNA from a total of 5,000-10,000 cells per sample was generated using the 10x Genomics Chromium Controller. The libraries were pooled prior to sequencing on a NovaSeq 6000 instrument (Illumina) using the following configuration: read 1: 26 cycles, read 2: 89 cycles, index read 1: 8 cycles.
The Mutographs project aims to advance our understanding of the causes of cancer through studies of mutational signatures. Led by Mike Stratton, together with Paul Brennan, Ludmil Alexandrov, Allan Balmain, David Phillips and Peter Campbell, this large-scale international research endeavour was awarded a Cancer Research UK Grand Challenge. Different patterns of somatic mutation are generated by the different environmental, lifestyle and genetic factors that cause cancer, many of which are still unknown. Within Mutographs, the International Agency for Research on Cancer is coordinating the recruitment of 5000 individuals with cancer (colorectal, renal, pancreatic, oesophageal adenocarcinoma or oesophageal squamous cancers) across 5 continents to explore whether different mutational signatures explain marked variation in incidence. In brief, through an international network of collaborators, biological materials are collected, along with demographic, histological, clinical and questionnaire data. Whole genome sequences of tumour-germline DNA pairs are generated at the Wellcome Trust Sanger Institute (Illumina HiSeqX, 40X and 20X depth respectively). Somatic mutational signatures are subsequently extracted by non-negative matrix factorisation methods and correlated with risk factor data. Through an enhanced understanding of cancer aetiology, Mutographs' unprecedented effort is anticipated to outline modifiable risk factors, lead to new approaches to prevent cancer, and provide opportunities to empower early detection, refine high-risk groups and contribute to further therapeutic development.
The Mutographs project aims to advance our understanding of the causes of cancer through studies of mutational signatures. Led by Mike Stratton, together with Paul Brennan, Ludmil Alexandrov, Allan Balmain, David Phillips and Peter Campbell, this large-scale international research endeavour was awarded a Cancer Research UK Grand Challenge. Different patterns of somatic mutation are generated by the different environmental, lifestyle and genetic factors that cause cancer, many of which are still unknown. Within Mutographs, the International Agency for Research on Cancer is coordinating the recruitment of 5000 individuals with cancer (colorectal, renal, pancreatic, oesophageal adenocarcinoma or oesophageal squamous cancers) across 5 continents to explore whether different mutational signatures explain marked variation in incidence. In brief, through an international network of collaborators, biological materials are collected, along with demographic, histological, clinical and questionnaire data. Whole genome sequences of tumour-germline DNA pairs are generated at the Wellcome Trust Sanger Institute (40X and 20X depth respectively). Somatic mutational signatures are subsequently extracted by non-negative matrix factorisation methods and correlated with risk factor data. Through an enhanced understanding of cancer aetiology, Mutographs' unprecedented effort is anticipated to outline modifiable risk factors, lead to new approaches to prevent cancer, and provide opportunities to empower early detection, refine high-risk groups and contribute to further therapeutic development.
Autism Spectrum Disorders (ASD) are early-onset neurodevelopmental syndromes characterized by impairments in reciprocal social interaction and communication, accompanied by restricted and repetitive behaviors. ASDs afflict ~1% of the human population and represent a major public health burden. Evidence for the etiology of ASD has consistently pointed to a strong genetic component, though the genetic architecture is indisputably complex. This dbGaP Collection consists of studies that include GWAS, sequencing (targeted, exome, whole genome), transcriptomics, etc., across many different cohorts. Many of the datasets have been curated and harmonized with NDAR prior to submission to dbGaP. Researchers who request and are approved to access this collection will be granted access to all related substudies in dbGaP as well as all NDAR data. Individual-level genomics data will only be available for download through dbGaP, along with minimal phenotype and pedigree information. Detailed phenotype and available imaging data related to these same research subjects are available for query and download within NDAR. The NDAR GUID will allow individual genomic data to be associated with all NDAR data collected for those subjects.
The San Antonio Family Heart Study (SAFHS) is a complex pedigree-based mixed longitudinal study designed to identify low frequency or rare variants influencing susceptibility to cardiovascular disease (CVD), using whole genome sequence (WGS) information from 2,590 individuals in large Mexican American pedigrees from San Antonio, Texas. The major objectives of this study are to identify low frequency or rare variants in and around known common variant signals for CVD, as well as to find novel low frequency or rare variants influencing susceptibility to CVD. WGS of the SAFHS cohort has been obtained through three efforts. Approximately 540 WGS were performed commercially at 50X by Complete Genomics, Inc (CGI) as part of the large T2D-GENES Project. The phenotype and genotype data for this group are available at dbGaP under accession number phs000462. An additional ~900 WGS at 30X were obtained through Illumina as part of the R01HL113322 "Whole Genome Sequencing to Identify Causal Genetic Variants Influencing CVD Risk" project. Finally, ~1,150 WGS at 30X were obtained through Illumina, funded by a supplement as part of the NHLBI's TOPMed program. Extensive phenotype data are provided for sequenced individuals, primarily obtained from the P01HL45522 "Genetics of Atherosclerosis in Mexican Americans" for adults and R01HD049051 for children in these same families. Phenotype information was collected between 1991 and 2016. For this dataset, the SAFHS appellation represents an amalgamation of the original SAFHS participants and an expansion that reexamined families previously recruited for the San Antonio Family Diabetes Study (R01DK042273) and the San Antonio Family Gall Bladder Study (R01DK053889). Due to this substantial examination history, participants may have information from up to five visits.
The clinical variables reported are coordinated with TOPMed and include major adverse cardiac events (MACE), T2D status and age at diagnosis, glycemic traits (fasting glucose and insulin), blood pressure, and blood lipids (total cholesterol, HDL cholesterol, calculated LDL cholesterol and triglycerides). Additional phenotype data include the medication status at each visit, classified in four categories as any current use of diabetes, hypertension or lipid-lowering medications, and, for females, current use of female hormones. Anthropometric measurements include age, sex, height, weight, hip circumference, waist circumference and derived ratios. PBMC-derived gene expression assays for a subset of ~1,060 individuals, obtained using the Illumina Sentrix-6 chip, are also available from the baseline examination. The WGS data have been jointly called and are available in the current TOPMed accession (phs001215).
Genome-wide DNA Methylation Data from Illumina HumanMethylationEPIC arrays for whole blood samples from 403 healthy individuals. Additional associated phenotype information is available for all individuals included in this study directly from CIBMTR. Data are available under controlled access release upon reasonable request and execution of a data use agreement. Requests should be submitted to CIBMTR at info-request@mcw.edu and include the study reference IB17-04.
Hypodiploid acute lymphoblastic leukemia (ALL) is an aggressive leukemia characterized by aneuploidy and poor outcome. The genetic basis of hypodiploid ALL is unknown. Here, using complementary genome-wide profiling approaches, we show that hypodiploid ALL comprises two subtypes that differ in the severity of aneuploidy, transcriptional profile and submicroscopic genetic alterations. Near haploid cases with 24-31 chromosomes frequently harbor alterations targeting receptor tyrosine kinase- and Ras signaling (71%) and IKZF3 (AIOLOS; 13%). In contrast, low hypodiploid ALL cases with 32-39 chromosomes are characterized by TP53 alterations, almost half of which are present in non-tumor cells, and have alterations of IKZF2 (HELIOS; 53%) and RB1 (41%). Both near haploid and low hypodiploid tumors exhibit activation of Ras and PI3K signaling pathways, and are sensitive to PI3K inhibition, indicating that these drugs should be explored as a new therapeutic strategy for this frequently lethal form of leukemia.
Compared to Caucasians residing in the same community, the incidence rate of Alzheimer's disease is approximately twice as high in Caribbean Hispanics. Moreover, Caribbean Hispanics represent a homogeneous population with only a few founders. Replication of genetic associations in other ethnic groups provides supporting evidence that a putative gene is involved in the disease pathogenesis, and when different allelic variants within the same genes are associated with disease it can help to localize the pathogenic variant. SORL1 is an example of a candidate gene that was first identified in Hispanics and then confirmed in multiple ethnic and racial groups in a meta-analysis. For this project, we are genotyping 704 individuals in families multiply affected by Alzheimer's disease (166 families). These families were recruited in the United States, Puerto Rico and the Dominican Republic. In addition, we are genotyping 2491 individuals (960 patients with sporadic Alzheimer's disease and 1531 unrelated controls) phenotyped in a similar fashion, for a total of 3195 individuals. Both cohorts are followed at regular intervals of 18 to 24 months, and potential phenotypes available other than those related to Alzheimer's disease include body mass index, measured blood pressure, neurological history and examinations. More importantly, both studies are funded through 2014 and additional phenotypes could be added in successive waves. We propose a genome-wide association (GWA) study of Alzheimer's disease and longitudinal changes in cognition and other age-related neurological and medical phenotypes. The goal will be to identify the chromosomal locations of genes underlying this disease and its related endophenotypes.
Original description of the study: From ELLIPSE (linked to the PRACTICAL consortium), we contributed ~78,000 SNPs to the OncoArray. A large fraction of the content was derived from the GWAS meta-analyses in European ancestry populations (overall and aggressive disease; ~27K SNPs). We also selected just over 10,000 SNPs from the meta-analyses in the non-European populations, with a majority of these SNPs coming from the analysis of overall prostate cancer in African ancestry populations as well as from the multiethnic meta-analysis. A substantial fraction of SNPs (~28,000) were also selected for fine-mapping of 53 loci not included in the common fine-mapping regions (tagging at r^2 > 0.9 across ±500kb regions). We also selected a few thousand SNPs associated with PSA levels and/or disease survival, SNPs from candidate lists provided by study collaborators, and SNPs from meta-analyses of exome SNP chip data from the Multiethnic Cohort and UK studies. The Contributing Studies: Aarhus: Hospital-based, Retrospective, Observational. Source of cases: Patients treated for prostate adenocarcinoma at Department of Urology, Aarhus University Hospital, Skejby (Aarhus, Denmark). Source of controls: Age-matched males treated for myocardial infarction or undergoing coronary angioplasty, but with no prostate cancer diagnosis based on information retrieved from the Danish Cancer Register and the Danish Cause of Death Register. AHS: Nested case-control study within prospective cohort. Source of cases: linkage to cancer registries in study states. Source of controls: matched controls from cohort. ATBC: Prospective, nested case-control. Source of cases: Finnish male smokers aged 50-69 years at baseline. Source of controls: Finnish male smokers aged 50-69 years at baseline. BioVU: Cases identified in a biobank linked to electronic health records.
Source of cases: A total of 214 cases were identified in the VUMC de-identified electronic health records database (the Synthetic Derivative) and shipped to USC for genotyping in April 2014. The following criteria were used to identify cases: Age 18 or greater; male; African Americans (Black) only. Note that African ancestry is not self-identified; it is administratively or third-party assigned (which has been shown to be highly correlated with genetic ancestry for African Americans in BioVU; see references). Source of controls: Controls were identified in the de-identified electronic health record. Unfortunately, they were not age matched to the cases, and therefore cannot be used for this study. Canary PASS: Prospective, Multi-site, Observational Active Surveillance Study. Source of cases: clinic based from Beth Israel Deaconess Medical Center, Eastern Virginia Medical School, University of California at San Francisco, University of Texas Health Sciences Center San Antonio, University of Washington, VA Puget Sound. Source of controls: N/A. CCI: Case series, Hospital-based. Source of cases: Cases identified through clinics at the Cross Cancer Institute. Source of controls: N/A. CerePP French Prostate Cancer Case-Control Study (ProGene): Case-Control, Prospective, Observational, Hospital-based. Source of cases: Patients, treated in French departments of Urology, who had histologically confirmed prostate cancer. Source of controls: Controls were recruited as participating in a systematic health screening program and found unaffected (normal digital rectal examination and total PSA < 4 ng/ml, or negative biopsy if PSA > 4 ng/ml). COH: hospital-based cases and controls from outside. Source of cases: Consented prostate cancer cases at City of Hope. Source of controls: Consented unaffected males that were part of other studies where they consented to have their DNA used for other research studies. COSM: Population-based cohort. Source of cases: General population.
Source of controls: General population. CPCS1: Case-control - Denmark. Source of cases: Hospital referrals. Source of controls: Copenhagen General Population Study. CPCS2: Source of cases: Hospital referrals. Source of controls: Copenhagen General Population Study. CPDR: Retrospective cohort. Source of cases: Walter Reed National Military Medical Center. Source of controls: Walter Reed National Military Medical Center. ACS_CPS-II: Nested case-control derived from a prospective cohort study. Source of cases: Identified through self-report on follow-up questionnaires and verified through medical records or cancer registries, identified through cancer registries or the National Death Index (with prostate cancer as the primary cause of death). Source of controls: Cohort participants who were cancer-free at the time of diagnosis of the matched case, also matched on age (±6 mo) and date of biospecimen donation (±6 mo). EPIC: Case-control - Germany, Greece, Italy, Netherlands, Spain, Sweden, UK. Source of cases: Identified through record linkage with population-based cancer registries in Italy, the Netherlands, Spain, Sweden and UK. In Germany and Greece, follow-up is active and achieved through checks of insurance records and cancer and pathology registries as well as via self-reported questionnaires; self-reported incident cancers are verified through medical records. Source of controls: Cohort participants without a diagnosis of cancer. EPICAP: Case-control, Population-based, ages less than 75 years at diagnosis, Hérault, France. Source of cases: Prostate cancer cases in all public hospitals and private urology clinics of the département of Hérault in France. Case validation by the Hérault Cancer Registry. Source of controls: Population-based controls, frequency age matched (5-year groups). Quotas by socio-economic status (SES) in order to obtain a distribution by SES among controls identical to the SES distribution among general population men, conditional on age.
ERSPC: Population-based randomized trial. Source of cases: Men with PrCa from screening arm ERSPC Rotterdam. Source of controls: Men without PrCa from screening arm ERSPC Rotterdam. ESTHER: Case-control, Prospective, Observational, Population-based. Source of cases: Prostate cancer cases in all hospitals in the state of Saarland, from 2001-2003. Source of controls: Random sample of participants from routine health check-up in Saarland, in 2000-2002. FHCRC: Population-based, case-control, ages 35-74 years at diagnosis, King County, WA, USA. Source of cases: Identified through the Seattle-Puget Sound SEER cancer registry. Source of controls: Randomly selected, age-frequency matched residents from the same county as cases. Gene-PARE: Hospital-based. Source of cases: Patients that received radiotherapy for treatment of prostate cancer. Source of controls: N/A. Hamburg-Zagreb: Hospital-based, Prospective. Source of cases: Prostate cancer cases seen at the Department of Oncology, University Hospital Center Zagreb, Croatia. Source of controls: Population-based (Croatia), healthy men, older than 50, with no medical record of cancer, and no family history of cancer (1st & 2nd degree relatives). HPFS: Nested case-control. Source of cases: Participants of the HPFS cohort. Source of controls: Participants of the HPFS cohort. IMPACT: Observational. Source of cases: Carriers and non-carriers (with a known mutation in the family) of the BRCA1 and BRCA2 genes, aged between 40 and 69, who are undergoing prostate screening with annual PSA testing. This cohort has been diagnosed with prostate cancer during the study. Source of controls: Carriers and non-carriers (with a known mutation in the family) of the BRCA1 and BRCA2 genes, aged between 40 and 69, who are undergoing prostate screening with annual PSA testing. This cohort has not been diagnosed with prostate cancer during the study. IPO-Porto: Hospital-based. Source of cases: Early onset and/or familial prostate cancer.
Source of controls: Blood donors. Karuprostate: Case-control, Retrospective, Population-based. Source of cases: From FWI (Guadeloupe): 237 consecutive incident patients with histologically confirmed prostate cancer attending public and private urology clinics; From Democratic Republic of Congo: 148 consecutive incident patients with histologically confirmed prostate cancer attending the University Clinic of Kinshasa. Source of controls: From FWI (Guadeloupe): 277 controls recruited from men participating in a free systematic health screening program open to the general population; From Democratic Republic of Congo: 134 controls recruited from subjects attending the University Clinic of Kinshasa. KULEUVEN: Hospital-based, Prospective, Observational. Source of cases: Prostate cancer cases recruited at the University Hospital Leuven. Source of controls: Healthy males with no history of prostate cancer recruited at the University Hospitals, Leuven. LAAPC: Subjects were participants in a population-based case-control study of aggressive prostate cancer conducted in Los Angeles County. Cases were identified through the Los Angeles County Cancer Surveillance Program rapid case ascertainment system. Eligible cases included African American, Hispanic, and non-Hispanic White men diagnosed with a first primary prostate cancer between January 1, 1999 and December 31, 2003. Eligible cases also had (a) prostatectomy with documented tumor extension outside the prostate, (b) metastatic prostate cancer in sites other than prostate, (c) needle biopsy of the prostate with Gleason grade ≥8, or (d) needle biopsy with Gleason grade 7 and tumor in more than two thirds of the biopsy cores. Eligible controls were men never diagnosed with prostate cancer, living in the same neighborhood as a case, and were frequency matched to cases on age (± 5 y) and race/ethnicity.
Controls were identified by a neighborhood walk algorithm, which proceeds through an obligatory sequence of adjacent houses or residential units beginning at a specific residence that has a specific geographic relationship to the residence where the case lived at diagnosis. Malaysia: Case-control. Source of cases: Patients attending the outpatient urology or uro-onco clinic at University Malaya Medical Center. Source of controls: Population-based, age matched (5-year groups), ascertained through electoral register, Subang Jaya, Selangor, Malaysia. MCC-Spain: Case-control. Source of cases: Identified through the urology departments of the participating hospitals. Source of controls: Population-based, frequency age and region matched, ascertained through the rosters of the primary health care centers. MCCS: Nested case-control, Melbourne, Victoria. Source of cases: Identified by linkage to the Victorian Cancer Registry. Source of controls: Cohort participants without a diagnosis of cancer. MD Anderson: Participants in this study were identified from epidemiological prostate cancer studies conducted at the University of Texas MD Anderson Cancer Center in the Houston Metropolitan area. Cases were accrued in the Houston Medical Center and were not restricted with respect to Gleason score, stage or PSA. Controls were identified via random-digit-dialing or among hospital visitors and they were frequency matched to cases on age and race. Lifestyle, demographic, and family history data were collected using a standardized questionnaire. MDACC_AS: A prospective cohort study. Source of cases: Men with clinically organ-confined prostate cancer meeting eligibility criteria for a prospective cohort study of active surveillance at MD Anderson Cancer Center. Source of controls: N/A. MEC: The Multiethnic Cohort (MEC) is comprised of over 215,000 men and women recruited from Hawaii and the Los Angeles area between 1993 and 1996.
Between 1995 and 2006, over 65,000 blood samples were collected from participants for genetic analyses. To identify incident cancer cases, the MEC was cross-linked with the population-based Surveillance, Epidemiology and End Results (SEER) registries in California and Hawaii, and unaffected cohort participants with blood samples were selected as controls. MIAMI (WFPCS): Prostate cancer cases and controls were recruited from the Departments of Urology and Internal Medicine of the Wake Forest University School of Medicine using sequential patient populations as described previously (PMID:15342424). All study subjects received a detailed description of the study protocol and signed their informed consent, as approved by the medical center's Institutional Review Board. The general eligibility criteria were (i) able to comprehend informed consent and (ii) without previously diagnosed cancer. The exclusion criteria were (i) clinical diagnosis of autoimmune diseases; (ii) chronic inflammatory conditions; and (iii) infections within the past 6 weeks. Blood samples were collected from all subjects. MOFFITT: Hospital-based. Source of cases: clinic based from Moffitt Cancer Center. Source of controls: Moffitt Cancer Center affiliated Lifetime cancer screening center. NMHS: Case-control, clinic based, Nashville TN. Source of cases: All urology clinics in Nashville, TN. Source of controls: Men without prostate cancer at prostate biopsy. PCaP: The North Carolina-Louisiana Prostate Cancer Project (PCaP) is a multidisciplinary population-based case-only study designed to address racial differences in prostate cancer through a comprehensive evaluation of social, individual and tumor level influences on prostate cancer aggressiveness. PCaP enrolled approximately equal numbers of African Americans and Caucasian Americans with newly-diagnosed prostate cancer from North Carolina (42 counties) and Louisiana (30 parishes) identified through state tumor registries.
African American PCaP subjects with DNA, who agreed to future use of specimens for research, participated in OncoArray analysis. PCMUS: Case-control - Sofia, Bulgaria. Source of cases: Patients of Clinic of Urology, Alexandrovska University Hospital, Sofia, Bulgaria, PrCa histopathologically confirmed. Source of controls: 72 patients with verified BPH and PSA < 3.5; 78 healthy controls from the MMC Biobank, no history of PrCa. PHS: Nested case-control. Source of cases: Participants of the PHS1 trial/cohort. Source of controls: Participants of the PHS1 trial/cohort. PLCO: Nested case-control. Source of cases: Men with a confirmed diagnosis of prostate cancer from the PLCO Cancer Screening Trial. Source of controls: Controls were men enrolled in the PLCO Cancer Screening Trial without a diagnosis of cancer at the time of case ascertainment. Poland: Case-control. Source of cases: men with unselected prostate cancer, diagnosed in north-western Poland at the University Hospital in Szczecin. Source of controls: cancer-free men from the same population, taken from the healthy adult patients of family doctors in the Szczecin region. PROCAP: Population-based, Retrospective, Observational. Source of cases: Cases were ascertained from the National Prostate Cancer Register of Sweden Follow-Up Study, a retrospective nationwide cohort study of patients with localized prostate cancer. Source of controls: Controls were selected among men referred for PSA testing in laboratories in Stockholm County, Sweden, between 2010 and 2012. PROGReSS: Hospital-based, Prospective, Observational. Source of cases: Prostate cancer cases from the Hospital Clínico Universitario de Santiago de Compostela, Galicia, Spain. Source of controls: Cancer-free men from the same population. ProMPT: A study to collect samples and data from subjects with and without prostate cancer. Retrospective, Experimental. Source of cases: Subjects attending outpatient clinics in hospitals.
Source of controls: Subjects attending outpatient clinics in hospitals. ProtecT: Trial of treatment. Samples taken from subjects invited for PSA testing from the community at nine centers across the United Kingdom. Source of cases: Subjects who have a proven diagnosis of prostate cancer following testing. Source of controls: Identified through invitation of subjects in the community. PROtEuS: Case-control, population-based. Source of cases: All new histologically-confirmed cases, aged 75 years or younger, diagnosed between 2005 and 2009, actively ascertained across Montreal French hospitals. Source of controls: Randomly selected from the Provincial electoral list of French-speaking men between 2005 and 2009, from the same area of residence as cases and frequency-matched on age. QLD: Case-control. Source of cases: A longitudinal cohort study (Prostate Cancer Supportive Care and Patient Outcomes Project: ProsCan) conducted in Queensland, through which men newly diagnosed with prostate cancer from 26 private practices and 10 public hospitals were directly referred to ProsCan at the time of diagnosis by their treating clinician (age range 43-88 years). All cases had histopathologically confirmed prostate cancer, following presentation with an abnormal serum PSA and/or lower urinary tract symptoms. Source of controls: Controls comprised healthy male blood donors with no personal history of prostate cancer, recruited through (i) the Australian Red Cross Blood Services in Brisbane (age range 19-76 years) and (ii) the Australian Electoral Commission (AEC) (age and post-code/area matched to ProsCan, age range 54-90 years). RAPPER: Multi-centre, hospital-based blood sample collection study in patients enrolled in clinical trials with prospective collection of radiotherapy toxicity data. Source of cases: Prostate cancer patients enrolled in radiotherapy trials: CHHiP, RT01, Dose Escalation, RADICALS, Pelvic IMRT, PIVOTAL.
Source of controls: N/A SABOR: Prostate Cancer Screening Cohort. Source of cases: Men >45 yrs of age participating in annual PSA screening. Source of controls: Males participating in annual PSA prostate cancer risk evaluations (funded by NCI biomarkers discovery and validation grant), recruited through University of Texas Health Science Center at San Antonio and affiliated sites or through study advertisements, enrolment open to the community SCCS: Case-control in cohort, Southeastern USA. Prospective, Observational, Population-based. Source of cases: SCCS entry population. Source of controls: SCCS entry population SCPCS: Population-based, Retrospective, Observational. Source of cases: South Carolina Central Cancer Registry. Source of controls: Health Care Financing Administration beneficiary file SEARCH: Case-control - East Anglia, UK. Source of cases: Men < 70 years of age registered with prostate cancer at the population-based cancer registry, Eastern Cancer Registration and Information Centre, East Anglia, UK. Source of controls: Men attending general practice in East Anglia with no known prostate cancer diagnosis, frequency matched to cases by age and geographic region SNP_Prostate_Ghent: Hospital-based, Retrospective, Observational. Source of cases: Men treated with IMRT as primary or postoperative treatment for prostate cancer at the Ghent University Hospital between 2000 and 2010. Source of controls: Employees of the University hospital and members of social activity clubs, without a history of any cancer. SPAG: Hospital-based, Retrospective, Observational. Source of cases: Guernsey. Source of controls: Guernsey STHM2: Population-based, Retrospective, Observational. Source of cases: Cases were selected among men referred for PSA testing in laboratories in Stockholm County, Sweden, between 2010 and 2012. Source of controls: Controls were selected among men referred for PSA testing in laboratories in Stockholm County, Sweden, between 2010 and 2012. 
PCPT: Case-control from a randomized clinical trial. Source of cases: Randomized clinical trial. Source of controls: Randomized clinical trial SELECT: Case-cohort from a randomized clinical trial. Source of cases: Randomized clinical trial. Source of controls: Randomized clinical trial TAMPERE: Case-control - Finland, Retrospective, Observational, Population-based. Source of cases: Identified through linkage to the Finnish Cancer Registry and patient records; and the Finnish arm of the ERSPC study. Source of controls: Cohort participants without a diagnosis of cancer UGANDA: Uganda Prostate Cancer Study: Uganda is a case-control study of prostate cancer in Kampala Uganda that was initiated in 2011. Men with prostate cancer were enrolled from the Urology unit at Mulago Hospital and men without prostate cancer (i.e. controls) were enrolled from other clinics (i.e. surgery) at the hospital. UKGPCS: ICR, UK. Source of cases: Cases identified through clinics at the Royal Marsden hospital and nationwide NCRN hospitals. Source of controls: Ken Muir's control- 2000 ULM: Case-control - Germany. Source of cases: familial cases (n=162): identified through questionnaires for family history by collaborating urologists all over Germany; sporadic cases (n=308): prostatectomy series performed in the Clinic of Urology Ulm between 2012 and 2014. Source of controls: age-matched controls (n=188): age-matched men without prostate cancer and negative family history collected in hospitals of Ulm WUGS/WUPCS: Cases Series, USA. Source of cases: Identified through clinics at Washington University in St. Louis. Source of controls: Men diagnosed and managed with prostate cancer in University based clinic. Acknowledgement Statements: Aarhus: This study was supported by the Danish Strategic Research Council (now Innovation Fund Denmark) and the Danish Cancer Society. The Danish Cancer Biobank (DCB) is acknowledged for biological material. 
AHS: This work was supported by the Intramural Research Program of the NIH, National Cancer Institute, Division of Cancer Epidemiology and Genetics (Z01CP010119). ATBC: This research was supported in part by the Intramural Research Program of the NIH and the National Cancer Institute. Additionally, this research was supported by U.S. Public Health Service contracts N01-CN-45165, N01-RC-45035, N01-RC-37004, HHSN261201000006C, and HHSN261201500005C from the National Cancer Institute, Department of Health and Human Services. BioVu: The dataset(s) used for the analyses described were obtained from Vanderbilt University Medical Center's BioVU which is supported by institutional funding and by the National Center for Research Resources, Grant UL1 RR024975-01 (which is now at the National Center for Advancing Translational Sciences, Grant 2 UL1 TR000445-06). Canary PASS: PASS was supported by Canary Foundation and the National Cancer Institute's Early Detection Research Network (U01 CA086402) CCI: This work was awarded by Prostate Cancer Canada and is proudly funded by the Movember Foundation - Grant # D2013-36. The CCI group would like to thank David Murray, Razmik Mirzayans, and April Scott for their contribution to this work. CerePP French Prostate Cancer Case-Control Study (ProGene): None reported COH: SLN is partially supported by the Morris and Horowitz Families Endowed Professorship COSM: The Swedish Research Council, the Swedish Cancer Foundation CPCS1 & CPCS2: Department of Clinical Biochemistry, Herlev and Gentofte Hospital, Copenhagen University Hospital, Herlev Ringvej 75, DK-2730 Herlev, Denmark. CPCS1 would like to thank the participants and staff of the Copenhagen General Population Study for their important contributions. CPDR: Uniformed Services University for the Health Sciences HU0001-10-2-0002 (PI: David G. McLeod, MD) CPS-II: The American Cancer Society funds the creation, maintenance, and updating of the Cancer Prevention Study II cohort. 
CPS-II thanks the participants and Study Management Group for their invaluable contributions to this research. We would also like to acknowledge the contribution to this study from central cancer registries supported through the Centers for Disease Control and Prevention National Program of Cancer Registries, and cancer registries supported by the National Cancer Institute Surveillance Epidemiology and End Results program. EPIC: The coordination of EPIC is financially supported by the European Commission (DG-SANCO) and the International Agency for Research on Cancer. The national cohorts are supported by the Danish Cancer Society (Denmark); the Deutsche Krebshilfe, Deutsches Krebsforschungszentrum and Federal Ministry of Education and Research (Germany); the Hellenic Health Foundation, Greek Ministry of Health; Greek Ministry of Education (Greece); the Italian Association for Research on Cancer (AIRC) and National Research Council (Italy); the Dutch Ministry of Public Health, Welfare and Sports (VWS), Netherlands Cancer Registry (NKR), LK Research Funds, Dutch Prevention Funds, Dutch ZON (Zorg Onderzoek Nederland), World Cancer Research Fund (WCRF); the Statistics Netherlands (The Netherlands); the Health Research Fund (FIS), Regional Governments of Andalucía, Asturias, Basque Country, Murcia and Navarra, Spanish Ministry of Health ISCIII RETIC (RD06/0020), Red de Centros RCESP, C03/09 (Spain); the Swedish Cancer Society, Swedish Scientific Council and Regional Government of Skåne and Västerbotten, Fundacion Federico SA (Sweden); the Cancer Research UK, Medical Research Council (United Kingdom). EPICAP: The EPICAP study was supported by grants from Ligue Nationale Contre le Cancer, Ligue départementale du Val de Marne; Fondation de France; Agence Nationale de sécurité sanitaire de l'alimentation, de l'environnement et du travail (ANSES). 
The EPICAP study group would like to thank all urologists, Antoinette Anger and Hasina Randrianasolo (study monitors), Anne-Laure Astolfi, Coline Bernard, Oriane Noyer, Marie-Hélène De Campo, Sandrine Margaroline, Louise N'Diaye, and Sabine Perrier-Bonnet (Clinical Research nurses). ERSPC: This study was supported by the Dutch Cancer Society (KWF 94-869, 98-1657, 2002-277, 2006-3518, 2010-4800), The Netherlands Organisation for Health Research and Development (ZonMW-002822820, 22000106, 50-50110-98-311, 62300035), The Dutch Cancer Research Foundation (SWOP), and an unconditional grant from Beckman-Coulter-Hybritech Inc. ESTHER: The ESTHER study was supported by a grant from the Baden-Württemberg Ministry of Science, Research and Arts. The ESTHER group would like to thank Hartwig Ziegler, Sonja Wolf, Volker Hermann, Heiko Müller, Karina Dieffenbach, Katja Butterbach for valuable contributions to the study. FHCRC: The FHCRC studies were supported by grants R01-CA056678, R01-CA082664, and R01-CA092579 from the US National Cancer Institute, National Institutes of Health, with additional support from the Fred Hutchinson Cancer Research Center. FHCRC would like to thank all the men who participated in these studies. Gene-PARE: The Gene-PARE study was supported by grants 1R01CA134444 from the U.S. National Institutes of Health, PC074201 and W81XWH-15-1-0680 from the Prostate Cancer Research Program of the Department of Defense and RSGT-05-200-01-CCE from the American Cancer Society. Hamburg-Zagreb: None reported HPFS: The Health Professionals Follow-up Study was supported by grants UM1CA167552, CA133891, CA141298, and P01CA055075. 
HPFS are grateful to the participants and staff of the Physicians' Health Study and Health Professionals Follow-Up Study for their valuable contributions, as well as the following state cancer registries for their help: AL, AZ, AR, CA, CO, CT, DE, FL, GA, ID, IL, IN, IA, KY, LA, ME, MD, MA, MI, NE, NH, NJ, NY, NC, ND, OH, OK, OR, PA, RI, SC, TN, TX, VA, WA, and WY. IMPACT: The IMPACT study was funded by The Ronald and Rita McAulay Foundation, CR-UK Project grant (C5047/A1232), Cancer Australia, AICR Netherlands A10-0227, Cancer Australia and Cancer Council Tasmania, NIHR, EU Framework 6, Cancer Councils of Victoria and South Australia, and Philanthropic donation to Northshore University Health System. We acknowledge support from the National Institute for Health Research (NIHR) to the Biomedical Research Centre at The Institute of Cancer Research and Royal Marsden Foundation NHS Trust. IMPACT acknowledges the IMPACT study steering committee, collaborating centres, and participants. IPO-Porto: The IPO-Porto study was funded by Fundação para a Ciência e a Tecnologia (FCT; UID/DTP/00776/2013 and PTDC/DTP-PIC/1308/2014) and by IPO-Porto Research Center (CI-IPOP-16-2012 and CI-IPOP-24-2015). MC and MPS are research fellows from Liga Portuguesa Contra o Cancro, Núcleo Regional do Norte. SM is a research fellow from FCT (SFRH/BD/71397/2010). IPO-Porto would like to express our gratitude to all patients and families who have participated in this study. Karuprostate: The Karuprostate study was supported by the French National Health Directorate and by the Association pour la Recherche sur les Tumeurs de la Prostate. Karuprostate thanks Séverine Ferdinand. KULEUVEN: F.C. and S.J. are holders of grants from FWO Vlaanderen (G.0684.12N and G.0830.13N), the Belgian federal government (National Cancer Plan KPC_29_023), and a Concerted Research Action of the KU Leuven (GOA/15/017). TVDB is holder of a doctoral fellowship of the FWO. 
LAAPC: This study was funded by grant R01CA84979 (to S.A. Ingles) from the National Cancer Institute, National Institutes of Health. Malaysia: The study was funded by the University Malaya High Impact Research Grant (HIR/MOHE/MED/35). Malaysia thanks all associates in the Urology Unit, University of Malaya, Cancer Research Initiatives Foundation (CARIF) and the Malaysian Men's Health Initiative (MMHI). MCCS: MCCS cohort recruitment was funded by VicHealth and Cancer Council Victoria. The MCCS was further supported by Australian NHMRC grants 209057, 251553, and 504711, and by infrastructure provided by Cancer Council Victoria. Cases and their vital status were ascertained through the Victorian Cancer Registry (VCR) and the Australian Institute of Health and Welfare (AIHW), including the National Death Index and the Australian Cancer Database. MCC-Spain: The study was partially funded by the Accion Transversal del Cancer, approved on the Spanish Ministry Council on the 11th October 2007, by the Instituto de Salud Carlos III-FEDER (PI08/1770, PI09/00773-Cantabria, PI11/01889-FEDER, PI12/00265, PI12/01270, and PI12/00715), by the Fundación Marqués de Valdecilla (API 10/09), by the Spanish Association Against Cancer (AECC) Scientific Foundation and by the Catalan Government DURSI grant 2009SGR1489. Samples: Biological samples were stored at the Parc de Salut MAR Biobank (MARBiobanc; Barcelona) which is supported by Instituto de Salud Carlos III FEDER (RD09/0076/00036). Also sample collection was supported by the Xarxa de Bancs de Tumors de Catalunya sponsored by Pla Director d'Oncologia de Catalunya (XBTC). MCC-Spain acknowledges the contribution from Esther Gracia-Lavedan in preparing the data. We thank all the subjects who participated in the study and all MCC-Spain collaborators. MD Anderson: Prostate Cancer Case-Control Studies at MD Anderson (MDA) supported by grants CA68578, ES007784, DAMD W81XWH-07-1-0645, and CA140388. 
MDACC_AS: None reported MEC: Funding provided by NIH grant U19CA148537 and grant U01CA164973. MIAMI (WFPCS): ACS MOFFITT: The Moffitt group was supported by the US National Cancer Institute (R01CA128813, PI: J.Y. Park). NMHS: Funding for the Nashville Men's Health Study (NMHS) was provided by the National Institutes of Health Grant numbers: RO1CA121060. PCaP only data: The North Carolina - Louisiana Prostate Cancer Project (PCaP) is carried out as a collaborative study supported by the Department of Defense contract DAMD 17-03-2-0052. For HCaP-NC follow-up data: The Health Care Access and Prostate Cancer Treatment in North Carolina (HCaP-NC) study is carried out as a collaborative study supported by the American Cancer Society award RSGT-08-008-01-CPHPS. For studies using both PCaP and HCaP-NC follow-up data please use: The North Carolina - Louisiana Prostate Cancer Project (PCaP) and the Health Care Access and Prostate Cancer Treatment in North Carolina (HCaP-NC) study are carried out as collaborative studies supported by the Department of Defense contract DAMD 17-03-2-0052 and the American Cancer Society award RSGT-08-008-01-CPHPS, respectively. For any PCaP data, please include: The authors thank the staff, advisory committees and research subjects participating in the PCaP study for their important contributions. For studies using PCaP DNA/genotyping data, please include: We would like to acknowledge the UNC BioSpecimen Facility and LSUHSC Pathology Lab for our DNA extractions, blood processing, storage and sample disbursement (https://genome.unc.edu/bsp). For studies using PCaP tissue, please include: We would like to acknowledge the RPCI Department of Urology Tissue Microarray and Immunoanalysis Core for our tissue processing, storage and sample disbursement. 
For studies using HCaP-NC follow-up data, please use: The Health Care Access and Prostate Cancer Treatment in North Carolina (HCaP-NC) study is carried out as a collaborative study supported by the American Cancer Society award RSGT-08-008-01-CPHPS. The authors thank the staff, advisory committees and research subjects participating in the HCaP-NC study for their important contributions. For studies that use both PCaP and HCaP-NC, please use: The authors thank the staff, advisory committees and research subjects participating in the PCaP and HCaP-NC studies for their important contributions. PCMUS: The PCMUS study was supported by the Bulgarian National Science Fund, Ministry of Education and Science (contract DOO-119/2009; DUNK01/2-2009; DFNI-B01/28/2012) with additional support from the Science Fund of Medical University - Sofia (contract 51/2009; 8I/2009; 28/2010). PHS: The Physicians' Health Study was supported by grants CA34944, CA40360, CA097193, HL26490, and HL34595. PHS members are grateful to the participants and staff of the Physicians' Health Study and Health Professionals Follow-Up Study for their valuable contributions, as well as the following state cancer registries for their help: AL, AZ, AR, CA, CO, CT, DE, FL, GA, ID, IL, IN, IA, KY, LA, ME, MD, MA, MI, NE, NH, NJ, NY, NC, ND, OH, OK, OR, PA, RI, SC, TN, TX, VA, WA, and WY. PLCO: This PLCO study was supported by the Intramural Research Program of the Division of Cancer Epidemiology and Genetics, National Cancer Institute, NIH. PLCO thanks Drs. Christine Berg and Philip Prorok, Division of Cancer Prevention at the National Cancer Institute, the screening center investigators and staff of the PLCO Cancer Screening Trial for their contributions to the PLCO Cancer Screening Trial. We thank Mr. Thomas Riley, Mr. Craig Williams, Mr. Matthew Moore, and Ms. Shannon Merkle at Information Management Services, Inc., for their management of the data and Ms. Barbara O'Brien and staff at Westat, Inc. 
for their contributions to the PLCO Cancer Screening Trial. We also thank the PLCO study participants for their contributions to making this study possible. Poland: None reported PROCAP: PROCAP was supported by the Swedish Cancer Foundation (08-708, 09-0677). PROCAP thanks and acknowledges all of the participants in the PROCAP study. We thank Carin Cavalli-Björkman and Ami Rönnberg Karlsson for their dedicated work in the collection of data. Michael Broms is acknowledged for his skilful work with the databases. KI Biobank is acknowledged for handling the samples and for DNA extraction. We acknowledge The NPCR steering group: Pär Stattin (chair), Anders Widmark, Stefan Karlsson, Magnus Törnblom, Jan Adolfsson, Anna Bill-Axelson, Ove Andrén, David Robinson, Bill Pettersson, Jonas Hugosson, Jan-Erik Damber, Ola Bratt, Göran Ahlgren, Lars Egevad, and Roy Ehrnström. PROGReSS: The PROGReSS study is funded by grants from the Spanish Ministry of Health (INT15/00070; INT16/00154; FIS PI10/00164, FIS PI13/02030; FIS PI16/00046); the Spanish Ministry of Economy and Competitiveness (PTA2014-10228-I), and Fondo Europeo de Desarrollo Regional (FEDER 2007-2013). ProMPT: Funded by CRUK, NIHR, MRC, Cambridge Biomedical Research Centre. ProtecT: Funded by NIHR. ProtecT and ProMPT would like to acknowledge the support of The University of Cambridge, Cancer Research UK. Cancer Research UK grants (C8197/A10123) and (C8197/A10865) supported the genotyping team. We would also like to acknowledge the support of the National Institute for Health Research which funds the Cambridge Bio-medical Research Centre, Cambridge, UK. We would also like to acknowledge the support of the National Cancer Research Prostate Cancer: Mechanisms of Progression and Treatment (PROMPT) collaborative (grant code G0500966/75466) which has funded tissue and urine collections in Cambridge. 
We are grateful to staff at the Wellcome Trust Clinical Research Facility, Addenbrooke's Clinical Research Centre, Cambridge, UK for their help in conducting the ProtecT study. We also acknowledge the support of the NIHR Cambridge Biomedical Research Centre, the DOH HTA (ProtecT grant), and the NCRI/MRC (ProMPT grant) for help with the bio-repository. The UK Department of Health funded the ProtecT study through the NIHR Health Technology Assessment Programme (projects 96/20/06, 96/20/99). The ProtecT trial and its linked ProMPT and CAP (Comparison Arm for ProtecT) studies are supported by Department of Health, England; Cancer Research UK grant number C522/A8649, Medical Research Council of England grant number G0500966, ID 75466, and The NCRI, UK. The epidemiological data for ProtecT were generated through funding from the Southwest National Health Service Research and Development. DNA extraction in ProtecT was supported by USA Dept of Defense award W81XWH-04-1-0280, Yorkshire Cancer Research and Cancer Research UK. The authors would like to acknowledge the contribution of all members of the ProtecT study research group. The views and opinions expressed therein are those of the authors and do not necessarily reflect those of the Department of Health of England. The bio-repository from ProtecT is supported by the NCRI (ProMPT) Prostate Cancer Collaborative and the Cambridge BMRC grant from NIHR. We thank the National Institute for Health Research, Hutchison Whampoa Limited, the Human Research Tissue Bank (Addenbrooke's Hospital), and Cancer Research UK. 
PROtEuS: PROtEuS was supported financially through grants from the Canadian Cancer Society (13149, 19500, 19864, 19865) and the Cancer Research Society, in partnership with the Ministère de l'enseignement supérieur, de la recherche, de la science et de la technologie du Québec, and the Fonds de la recherche du Québec - Santé. PROtEuS would like to thank its collaborators and research personnel, and the urologists involved in subject recruitment. We also wish to acknowledge the special contribution made by Ann Hsing and Anand Chokkalingam to the conception of the genetic component of PROtEuS. QLD: The QLD research is supported by The National Health and Medical Research Council (NHMRC) Australia Project Grants (390130, 1009458) and NHMRC Career Development Fellowship and Cancer Australia PdCCRS funding to J Batra. The QLD team would like to acknowledge and sincerely thank the urologists, pathologists, data managers and patient participants who have generously and altruistically supported the QLD cohort. RAPPER: RAPPER is funded by Cancer Research UK (C1094/A11728; C1094/A18504) and Experimental Cancer Medicine Centre funding (C1467/A7286). The RAPPER group thank Rebecca Elliott for project management. SABOR: The SABOR research is supported by NIH/NCI Early Detection Research Network, grant U01 CA0866402-12. Also supported by the Cancer Center Support Grant to the Cancer Therapy and Research Center from the National Cancer Institute (US) P30 CA054174. SCCS: SCCS is funded by NIH grant R01 CA092447, and SCCS sample preparation was conducted at the Epidemiology Biospecimen Core Lab that is supported in part by the Vanderbilt-Ingram Cancer Center (P30 CA68485). 
Data on SCCS cancer cases used in this publication were provided by the Alabama Statewide Cancer Registry; Kentucky Cancer Registry, Lexington, KY; Tennessee Department of Health, Office of Cancer Surveillance; Florida Cancer Data System; North Carolina Central Cancer Registry, North Carolina Division of Public Health; Georgia Comprehensive Cancer Registry; Louisiana Tumor Registry; Mississippi Cancer Registry; South Carolina Central Cancer Registry; Virginia Department of Health, Virginia Cancer Registry; Arkansas Department of Health, Cancer Registry, 4815 W. Markham, Little Rock, AR 72205. The Arkansas Central Cancer Registry is fully funded by a grant from National Program of Cancer Registries, Centers for Disease Control and Prevention (CDC). Data on SCCS cancer cases from Mississippi were collected by the Mississippi Cancer Registry which participates in the National Program of Cancer Registries (NPCR) of the Centers for Disease Control and Prevention (CDC). The contents of this publication are solely the responsibility of the authors and do not necessarily represent the official views of the CDC or the Mississippi Cancer Registry. SCPCS: SCPCS is funded by CDC grant S1135-19/19, and SCPCS sample preparation was conducted at the Epidemiology Biospecimen Core Lab that is supported in part by the Vanderbilt-Ingram Cancer Center (P30 CA68485). SEARCH: SEARCH is funded by a program grant from Cancer Research UK (C490/A10124) and supported by the UK National Institute for Health Research Biomedical Research Centre at the University of Cambridge. SNP_Prostate_Ghent: The study was supported by the National Cancer Plan, financed by the Federal Office of Health and Social Affairs, Belgium. 
SPAG: Wessex Medical Research, Hope for Guernsey, MUG, HSSD, MSG, Roger Allsopp STHM2: STHM2 was supported by grants from The Strategic Research Programme on Cancer (StratCan), Karolinska Institutet; the Linné Centre for Breast and Prostate Cancer (CRISP, number 70867901), Karolinska Institutet; The Swedish Research Council (number K2010-70X-20430-04-3) and The Swedish Cancer Society (numbers 11-0287 and 11-0624); Stiftelsen Johanna Hagstrand och Sigfrid Linnérs minne; Swedish Council for Working Life and Social Research (FAS), number 2012-0073. STHM2 acknowledges the Karolinska University Laboratory, Aleris Medilab, Unilabs and the Regional Prostate Cancer Registry for performing analyses and help to retrieve data. We thank Carin Cavalli-Björkman and Britt-Marie Hune for their enthusiastic work as research nurses and Astrid Björklund for skilful data management. We wish to thank the BBMRI.se biobank facility at Karolinska Institutet for biobank services. PCPT & SELECT are funded by Public Health Service grants U10CA37429 and 5UM1CA182883 from the National Cancer Institute. SWOG and SELECT thank the site investigators and staff and, most importantly, the participants who donated their time to this trial. TAMPERE: The Tampere (Finland) study was supported by the Academy of Finland (251074), The Finnish Cancer Organisations, Sigrid Juselius Foundation, and the Competitive Research Funding of the Tampere University Hospital (X51003). The PSA screening samples were collected by the Finnish part of ERSPC (European Study of Screening for Prostate Cancer). TAMPERE would like to thank Riina Liikanen, Liisa Maeaettaenen and Kirsi Talala for their work on samples and databases. 
UGANDA: None reported UKGPCS: UKGPCS would also like to thank the following for funding support: The Institute of Cancer Research and The Everyman Campaign, The Prostate Cancer Research Foundation, Prostate Research Campaign UK (now Prostate Action), The Orchid Cancer Appeal, The National Cancer Research Network UK, The National Cancer Research Institute (NCRI) UK. We are grateful for support of NIHR funding to the NIHR Biomedical Research Centre at The Institute of Cancer Research and The Royal Marsden NHS Foundation Trust. UKGPCS would also like to acknowledge the NCRN nurses, data managers, and consultants for their work in the UKGPCS study. UKGPCS would like to thank all urologists and other persons involved in the planning, coordination, and data collection of the study. ULM: The Ulm group received funds from the German Cancer Aid (Deutsche Krebshilfe). WUGS/WUPCS: WUGS would like to thank the following for funding support: The Anthony DeNovi Fund, the Donald C. McGraw Foundation, and the St. Louis Men's Group Against Cancer.
We aim to identify mutations in order to validate that the lines we have classified as derived from ataxia patients contain the disease-relevant mutations. This will allow us to publish on the existence of these lines, which are now commercially available. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/
Single-cell genotyping data for bone marrow samples from 9 cases with clonal hematopoiesis and 1 control sample. The TARGET-seq+ protocol was used to generate plate-based 3' transcriptome data. For details on cell sorting and the TARGET-seq+ protocol see the methods section of the manuscript. One FASTQ file is provided per cell. Cells are named with their plate and well IDs and the subject ID. Empty wells (no-cell controls) are named "blank". Corresponding transcriptome files use the same naming with the "_transcriptome" suffix.
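The naming convention described above can be illustrated with a minimal Python sketch that pairs each cell's genotyping file with its transcriptome file and skips no-cell controls. The example file names, the `.fastq`/`.fastq.gz` extensions, the underscore-separated layout, and the `pair_files` helper are assumptions made for illustration; only the "blank" label and the "_transcriptome" suffix come from the description.

```python
SUFFIX = "_transcriptome"  # suffix stated in the dataset description

def pair_files(names):
    """Group file names per cell; skip empty wells labelled 'blank'.

    Hypothetical helper: assumes each name is <cell>[_transcriptome].fastq[.gz].
    """
    pairs = {}
    for name in names:
        stem = name
        # Strip an assumed FASTQ extension, if present.
        for ext in (".fastq.gz", ".fastq"):
            if stem.endswith(ext):
                stem = stem[: -len(ext)]
                break
        is_transcriptome = stem.endswith(SUFFIX)
        cell = stem[: -len(SUFFIX)] if is_transcriptome else stem
        # No-cell controls are named "blank" in the description.
        if "blank" in cell.lower():
            continue
        entry = pairs.setdefault(cell, {"genotyping": None, "transcriptome": None})
        entry["transcriptome" if is_transcriptome else "genotyping"] = name
    return pairs
```

Cells for which either entry remains `None` after pairing would be worth checking against the manuscript's methods before analysis.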
1 sample is pure plasmid DNA and 10 samples are cell pellets for genomic DNA extraction. CRISPR PCR1 and PCR2 indexing - Please use standard Kozuke primers.
1 sample is pure plasmid DNA and 8 samples are cell pellets for genomic DNA extraction. CRISPR PCR1 and PCR2 indexing - Please use standard Kozuke primers.
Resistant hypertension is defined as blood pressure that remains above goal in spite of the concurrent use of 3 antihypertensive agents of different classes or the concurrent use of 4 or more antihypertensive agents regardless of control. Its diagnosis is important for the identification of patients who are at high risk of having reversible causes of hypertension and/or patients who, because of persistently high blood pressure levels, may benefit from special diagnostic and therapeutic considerations. Resistant hypertension represents an extreme phenotype; thus, it has been predicted that genetic factors could play a larger role than for the general hypertensive population. Genetic assessments of patients with resistant hypertension have been limited. The current study assayed the exome of 91 African American patients with treatment-resistant hypertension.
The majority of cases of lung cancer are the culmination of a dynamic process that begins with smoking initiation, proceeds through dependency and smoking persistence, continues with lung cancer development and ends with progression to disseminated disease or response to therapy and survival. We are conducting a whole genome study of lung cancer and smoking to examine critical steps in lung cancer progression. This study is a genome-wide association study (GWAS) to investigate the genetic determinants of lung cancer risk. The study design efficiently allows identification of genes that also contribute to smoking persistence and outcome from lung cancer using a single GWAS of 5,900 subjects using the primary GENEVA dataset, derived from two studies. The first is the Environment and Genetics in Lung Cancer Etiology Study (EAGLE), a population-based, biologically intensive, case-control study from the Lombardy region of Italy including ~2000 newly diagnosed lung cancer cases and ~2000 age-, gender- and region-matched controls. The second is the Prostate, Lung, Colorectal and Ovarian (PLCO) Cancer Screening Trial from which we have selected ~850 lung cancer cases and ~850 controls, also matched on age and gender. Understanding the basis for the well-established hereditary component of lung cancer and smoking persistence could provide new insights into etiology, prevention, and treatment, and have an enormous impact on public health. The same GWAS genotyping data in the two studies will be used to investigate the genetic determinants of smoking persistence. Specifically, we will analyze current smokers and former smokers from EAGLE and PLCO for diverse smoking phenotypes, including persistence of smoking as well as ever/never smoking comparisons, quitting attempts, and the Fagerström index of tobacco addiction. PLCO participants are all European-Americans and EAGLE involves subjects from Italy. EAGLE is a case-control study and contains 3937 phenotyped subjects. 
PLCO is a screening trial with a cohort design and contains 1651 phenotyped subjects. This study is part of the Gene Environment Association Studies initiative (GENEVA, http://www.genevastudy.org) funded by the trans-NIH Genes, Environment, and Health Initiative (GEI). The overarching goal is to identify novel genetic factors that contribute to lung cancer and smoking through large-scale genome-wide association studies of population-based samples of lung cancer cases and controls. Genotyping was performed at the Johns Hopkins University Center for Inherited Disease Research (CIDR). Data cleaning and harmonization were done at the GEI-funded GENEVA Coordinating Center at the University of Washington.
The Resource for Genetic Epidemiology Research on Aging (GERA) Cohort was created by an RC2 "Grand Opportunity" grant that was awarded to the Kaiser Permanente Research Program on Genes, Environment, and Health (RPGEH) and the UCSF Institute for Human Genetics (AG036607; Schaefer/Risch, PIs). The RC2 project enabled genome-wide SNP genotyping (GWAS) to be conducted on a cohort of over 100,000 adults who are members of the Kaiser Permanente Medical Care Plan, Northern California Region (KPNC), and participating in its RPGEH. The purpose of the RPGEH is to facilitate research on the genetic and environmental factors that affect health and disease by linking together clinical data from electronic health records, survey data on demographic and behavioral factors, and environmental data from various sources, with genetic data derived from biospecimens collected from participants. At the time of the award of the RC2 project in late 2009, the RPGEH had established a cohort of about 140,000 individuals who had answered a detailed survey, provided saliva samples for extraction of DNA, and given broad consent for the use of their data in studies of health and disease. To maximize the diversity of the resulting sample, the GERA cohort was formed by including all racial and ethnic minority participants with saliva samples (N = 20,925; 19%); the remaining participants were drawn sequentially and randomly from white non-Hispanic participants (N = 89,341; 81%). A total of 110,266 participant samples were included to ensure that at least 100,000 were successfully assayed. The resulting GERA cohort is 42% male and 58% female, and ranges in age from 18 to over 100 years old with an average age of 63 years at the time of the RPGEH survey (2007). The sample is ethnically diverse and generally well-educated, with above-average income. Approximately 69% of the participants are married or living with a partner. Length of membership in KPNC averages 23.5 years. 
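The sampling design described above (include every minority participant, then fill the remainder of the target from white non-Hispanic participants) can be sketched in a few lines of Python. This is an illustrative sketch only: the `select_cohort` function, its input format, and the use of simple random sampling with a fixed seed are assumptions, and the text's "sequentially and randomly" procedure may have differed in detail. The counts checked at the end are the ones reported in the description.

```python
import random

def select_cohort(participants, target, seed=0):
    """Hypothetical sketch of the inclusion rule.

    participants: list of (participant_id, is_minority) tuples.
    Returns (all minority ids, randomly drawn non-minority ids) so that
    the two lists together reach `target` where possible.
    """
    minority = [pid for pid, is_min in participants if is_min]
    others = [pid for pid, is_min in participants if not is_min]
    n_fill = max(0, target - len(minority))
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    drawn = rng.sample(others, min(n_fill, len(others)))
    return minority, drawn

# Counts reported in the description.
n_minority, n_white, n_total = 20_925, 89_341, 110_266
assert n_minority + n_white == n_total
pct_minority = round(100 * n_minority / n_total)  # 19
pct_white = round(100 * n_white / n_total)        # 81
```

The two groups sum to the stated total of 110,266, and the rounded shares match the 19% and 81% given in the text.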
UCSF and RPGEH investigators worked with the genomics company Affymetrix to design four custom microarrays for genotyping each of the four major race-ethnicity groups included in the GERA Cohort, described in detail in Hoffmann et al., 2011a and 2011b. Following genotyping and quality control procedures, and after removal of invalid, discordant, or withdrawn samples, about 103,000 participants were successfully genotyped. The resulting genotypic data were linked to survey data and data abstracted from the electronic medical records. As described below, all RPGEH participants were mailed new consent forms with explicit discussion of the placement of data in the NIH-maintained dbGaP. About 77% of participants returned completed consent forms, resulting in a final sample size of 78,486 participants in the GERA Cohort with data for deposit into dbGaP.
Origins of the RPGEH GERA Cohort
The goal in creating the RPGEH GERA cohort was to create a large, multiethnic, and comprehensive population-based resource for research into the genetic and environmental basis of common age-related diseases and their treatment, and factors influencing healthy aging and longevity. The GERA Cohort consists of a diverse cohort of more than 100,000 adults who are members of the Kaiser Permanente Medical Care Plan, Northern California Region (KPNC), and participating in its Research Program on Genes, Environment and Health (RPGEH). KPNC is an integrated health care delivery system with a population of about 3.3 million people in northern California. The membership of KPNC is representative of the general population in the 14-county area in which facilities are located, although the extremes of income at both ends of the spectrum are underrepresented.
The RPGEH utilizes the longitudinal electronic health records (EHR) of KPNC to obtain clinical, laboratory, imaging and pharmacy information on all cohort members, to which personal demographic, behavioral and health characteristics have been added through member surveys. The GERA Cohort comprises a subsample of the RPGEH participant cohort, and was created through the RC2 award from the NIA, NIMH, and NIH Common Fund as described above.
GERA Study Design
The GERA Cohort is a subsample, as described above, of the longitudinal cohort enrolled in the Kaiser Permanente RPGEH. The RPGEH cohort includes about 400,000 survey participants of whom about 200,000 have provided broad consent and a sample of saliva or blood for use in studies of genetic and environmental factors in health and disease. The GERA Cohort was developed from a mailed survey sent to all adult members of KPNC who had been members for two years or more in 2007. All survey respondents were contacted and asked to complete a consent form; those who completed consent forms were asked to provide a saliva sample. Additional male participants were added to the RPGEH through inclusion of the Northern California sample of the California Men's Health Study (CMHS) cohort of about 40,000 men from KPNC, ages 45-69 years old at the time of the CMHS survey in 2002-2003. The CMHS participants contributed about 15,400 saliva samples to the RPGEH and were eligible for inclusion in the GERA Cohort. CMHS participants were included according to the same sampling design as for the RPGEH cohort as a whole. Specifically, all minority participants were selected for inclusion in order to maximize representation of minorities in the GERA Cohort, and Non-Hispanic White participants were selected at random to complete the sample of 110,266 GERA Cohort participants.
GERA Genotypic Data
High-density genotyping was conducted at UCSF using custom designed Affymetrix Axiom arrays, as described in Hoffmann et al. (2011a; 2011b).
To maximize genome-wide coverage of common and less common variants, four specific arrays were designed for individuals of Non-Hispanic White (EUR), East Asian (EAS), African-American (AFR), and Latino (LAT) race/ethnicity. There was broad overlap among the SNPs on the arrays, which were designed using a hybrid greedy imputation algorithm (Hoffmann et al., 2011b) applied to genotype information validated by Affymetrix from the 1000 Genomes Project. However, in order to capture low frequency variants specific to particular race-ethnicity groups, SNP content varies between arrays. A more detailed description of the process of genotyping and results is included in Genotyping of DNA Samples. Description of the analyses of population structure and development of principal components for adjustment of population structure is included in Population Structure Analysis.
GERA Phenotypic Data
RPGEH and CMHS Survey Data.
The sources of data on demographic and behavioral factors deposited in dbGaP for the GERA Cohort are the RPGEH and CMHS surveys. Data on common demographic factors such as gender, race/ethnicity, marital status, and education and on behavioral factors such as smoking, alcohol consumption, and body mass index, have been cleaned, edited, reconciled between the two surveys, and compiled into summary indices, where appropriate, for deposition into dbGaP. A more complete description of the survey variables is included in Survey Variables Documentation. Please note that the terms of use of the GERA Cohort Data, as specified in the Data Use Certification (DUC), prohibit the use of survey variables as outcomes in analyses. For example, a genome-wide association study (GWAS) of education or smoking is not permitted as specified by the DUC. Only health conditions can be used as outcome variables in analyses.
Health Conditions derived from Kaiser Permanente Electronic Medical Records.
Data on the occurrence of health conditions in participants in the GERA Cohort have been derived by summarizing ICD-9 coded diagnoses in Kaiser Permanente's electronic medical records. An algorithm that aggregates specific ICD-9 codes into appropriate diagnostic groups for selected conditions is applied to outpatient and inpatient databases; see Disease and Conditions Definitions Documentation for details. The criterion for including a condition as "present" for a participant is the occurrence of two or more diagnoses within a diagnostic category on separate days. A threshold of two or more is used to reduce false positives due to mistakes or rule-out diagnoses. When compared with validated disease registries, the criterion of 2+ diagnoses yields high specificity and good sensitivity. ICD-9 codes in the electronic records are specified in several ways. For outpatient visits occurring during the period 1995 to 2006, diagnoses were assigned by the treating physician, who endorsed specific diagnoses on an optically scanned list that varied by specialty. Beginning in 2006, with the advent of an integrated, fully electronic medical record, outpatient diagnoses are made by physicians/providers using a pull-down menu. Discharge diagnoses from inpatient stays are specified by physicians and coded by specially trained coders. Databases of ICD-9 codes for diagnoses assigned at outpatient visits, or as one of the discharge diagnoses following inpatient stays, are complete and available for all KPNC members dating back to 1995. Although the average length of KPNC membership among GERA cohort members was 23.5 years as of 2007, not all have been members since 1995, so the history for some conditions, such as those that are not chronic or recurrent, may not be complete for all cohort members.
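The "two or more diagnoses on separate days" criterion can be illustrated with a short Python sketch. It assumes the ICD-9 codes have already been mapped to diagnostic groups, and it is not the RPGEH's actual algorithm: the function name, the record layout, and the example group label are hypothetical.

```python
from collections import defaultdict
from datetime import date


def conditions_present(diagnoses, min_distinct_days=2):
    """Return the set of (participant_id, diagnostic_group) pairs that have
    diagnoses recorded on at least `min_distinct_days` separate days.

    `diagnoses` is an iterable of (participant_id, diagnostic_group, date)
    tuples; this layout is an assumption made for illustration.
    """
    days_seen = defaultdict(set)
    for participant_id, group, day in diagnoses:
        # A set keeps one entry per calendar day, so repeated codes
        # recorded on the same day count only once.
        days_seen[(participant_id, group)].add(day)
    return {key for key, days in days_seen.items()
            if len(days) >= min_distinct_days}


if __name__ == "__main__":
    records = [
        ("A", "asthma", date(2001, 3, 1)),
        ("A", "asthma", date(2001, 3, 1)),  # same day: not a second occurrence
        ("A", "asthma", date(2002, 6, 9)),
        ("B", "asthma", date(2003, 1, 5)),  # single diagnosis only
    ]
    print(conditions_present(records))  # {('A', 'asthma')}
```

Counting distinct days rather than raw records is what screens out a single visit that generated several codes, in keeping with the stated aim of reducing false positives.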
The year of first membership in KPNC is included as a variable in the list of survey variables, enabling investigators to estimate the number of years of observation of each Cohort member.
RPGEH Access and Collaborations Website and Procedures
The RPGEH maintains a web portal for inquiries and applications for collaboration and access to data. The URL is https://rpgehportal.kaiser.org/. RPGEH has an application process and an Access Review Committee that reviews applications for collaboration and use. For more details, please contact RPGEH through the website.
Cohort Description
In 1948, the researchers recruited 5,209 men and women between the ages of 30 and 62 from the town of Framingham, Massachusetts, and began the first round of extensive physical examinations and lifestyle interviews that they would later analyze for common patterns related to CVD development. Since 1948, the subjects have returned to the study every two years for an examination consisting of a detailed medical history, physical examination, and laboratory tests. In 1971, the study enrolled a second-generation cohort (5,124 of the original participants' adult children and their spouses) to participate in similar examinations. The second examination of the Offspring cohort occurred eight years after the first examination, and subsequent examinations have occurred approximately every four years thereafter. In 1994, the need to establish a new study reflecting a more diverse community of Framingham was recognized, and the first Omni cohort of the Framingham Heart Study, consisting of 506 participants, was enrolled. In April 2002, 4,095 third-generation participants, the grandchildren of the original cohort, were added. In 2003, 103 spouses of Offspring cohort members (NOS) and a second group of 410 Omni participants were enrolled. Through 2019, the original cohort has completed a total of 32 exams, the Offspring cohort 9 exams, the OMNI1 cohort 4 exams, and the GEN3, NOS and OMNI2 cohorts have each completed 3 exams. The FHS is a joint project of the National Heart, Lung and Blood Institute and Boston University.
Data Being Submitted
Wave 1 questionnaire data includes 3967 variables for up to 3112 FHS participants in C4R.
Wave 2 questionnaire data includes 448 variables for up to 2337 FHS participants in C4R.
Dried Blood Spot/Serosurvey data includes 7 variables for up to 2189 FHS participants in C4R.
Derived data includes 43 variables for up to 3151 FHS participants in C4R.
Phenotype data includes 113 variables for up to 3151 FHS participants in C4R.
Desmoplastic Small Round Cell Tumor (DSRCT) is an aggressive mesenchymal tumor driven by fusions between the disordered domain of the Ewing sarcoma RNA binding protein 1 (EWSR1) and the developmental transcription factor Wilms tumor 1 (WT1). We used genome-wide chromatin profiling to identify EWSR1-WT1-dependent gene regulatory networks and target genes. Our studies show that EWS-WT1 is a powerful activator of distal regulatory elements and controls an oncogenic gene expression program that characterizes primary DSRCTs. ChIP-seq profiles for histone marks in primary DSRCT samples are available through dbGaP.