At the dawn of the second millennium, the expansion of the Indian Ocean trading network aligned with the emergence of an outward-oriented community along the East African coast to create a cosmopolitan cultural and trading zone known as the Swahili Corridor. Based on analyses of genome-wide genotyping data from 140 individuals from coastal Kenya and the Comoros islands, along with 3,477 individuals from the Indian Ocean rim, we reconstruct historical population dynamics showing that the Swahili Corridor is largely an East Bantu genetic continuum. Within this continuity, significant gene flows from the Middle East can be seen in the Swahili and Comorians at dates corresponding to the Islamization of East Africa. However, the main external gene flow into insular populations of the Swahili Corridor, such as Comorian and Malagasy groups, came from Island Southeast Asia as early as the 10th century. Remarkably, our results agree with archaeological and linguistic data in suggesting that the Comoros archipelago is the oldest place of contact between Austronesian and African populations in the Swahili Corridor.
The emergence of agriculture in West-Central Africa, ~5,000 years ago, profoundly modified the cultural landscape and mode of subsistence of most sub-Saharan populations. How this major innovation has impacted the genetic history of rainforest hunter-gatherers — historically referred to as “pygmies” — and agriculturalists, however, remains poorly understood. Here, we report genome-wide SNP data from eight of these populations located west-to-east of the equatorial rainforest. We find that hunter-gathering populations present up to 50% of farmer genomic ancestry, and that substantial admixture began only within the last 1,000 years. Furthermore, we show that the historical population sizes characterising these communities already differed before the introduction of agriculture. Our results suggest that the first socio-economic interactions between rainforest hunter-gatherers and farmers introduced by the spread of farming were not accompanied by immediate, extensive genetic exchanges and occurred on a backdrop of two groups already differentiated by their specialisation in two ecotopes with differing carrying capacities.
The National Institute on Aging (NIA) Alzheimer's Disease Centers (ADCs) cohort includes subjects ascertained and evaluated by the clinical and neuropathology cores of the 29 NIA-funded ADCs. Data collection is coordinated by the National Alzheimer's Coordinating Center (NACC). NACC coordinates collection of phenotype data from the 29 ADCs, cleans all data, coordinates implementation of definitions of AD cases and controls, and coordinates collection of samples. The ADC cohort consists of autopsy-confirmed and clinically-confirmed AD cases, and cognitively normal elders (CNEs) with complete neuropathology data who were older than 60 years at age of death, and living CNEs evaluated using the Uniform dataset (UDS) protocol who were documented to not have mild cognitive impairment (MCI) and were between 60 and 100 years of age at assessment. ADCs sent frozen tissue from autopsied subjects and DNA samples from some autopsied subjects and from living subjects to the National Cell Repository for Alzheimer's Disease (NCRAD). DNA was prepared by NCRAD for genotyping and sent to the genotyping site at Children's Hospital of Philadelphia. ADC samples were genotyped and analyzed in separate batches. [Reprinted from AC Naj et al. Common variants at MS4A4/MS4A6E, CD2AP, CD33 and EPHA1 are associated with late-onset Alzheimer's disease. Nature Genetics 43, 436-441 (2011). doi:10.1038/ng.801. PMID: 21460841.]
Peripheral T-cell lymphomas not otherwise specified (PTCL-NOS) represent a heterogeneous group of nodal and extra-nodal mature T-cell lymphomas, with a low prevalence in Western countries. PTCL-NOSs account for about 25% of all PTCLs and are currently diagnosed based on exclusion criteria, as these lymphomas lack unifying morphological, phenotypic and genomic features. Cytogenetic and FISH analyses of PTCL-NOS samples have not revealed recurrent pathogenetic abnormalities, while gene expression profiling has shown only partial ability to segregate cases representing homogeneous clinico-pathological entities. This underscores the need to look at PTCL-NOS with innovative and high-throughput approaches to identify recurrent genetic lesions that could further our understanding of the biology of this heterogeneous group of diseases, provide better diagnostic tools and perhaps new targets for innovative treatments. Our aim is to study ~15 patients affected by PTCL-NOS. Our study will be funded by a private, non-profit Italian cancer research fund (Associazione Italiana per la Ricerca sul Cancro, www.airc.it) based on a grant held by Anna Dodero and Cristiana Carniti, hematologists at INT. Samples will be analysed by whole genome sequencing using Illumina X10 machines, on a 150bp-PE protocol. Data will be analysed using the pipeline available in Team 78, under the supervision of Peter Campbell, the WTSI faculty member who will oversee the project, and by Francesco Maura, visiting scientist at the WTSI.
Summary: Single-cell transcriptome analysis of skin and blood samples from individuals with Kaposi's sarcoma.
Aims: Elucidate the cellular composition and gene expression profiles of Kaposi sarcoma herpesvirus (KSHV) infected cells and non-infected tumor infiltrating cells, and of peripheral blood samples from the same individuals, with or without specific therapies, such as antiretroviral therapy for HIV, immunotherapies (such as immune checkpoint inhibitors and immune modulator drugs like pomalidomide), or angiogenesis inhibitory agents.
Population: Age 18 and older, both genders, including individuals from North America and other areas of the world (including Africa). Individuals with or without HIV are included.
Molecular Technologies: Single-cell RNA-seq using 10x Genomics.
Principal Findings: Two distinct populations of endothelial cells were found infected with KSHV, one CD34+ and one CD34-. Both clusters include lytic and latent KSHV gene expression. The KSHV-infected cell clusters differ in expression of housekeeping genes, proliferative markers, and ribosomal and secretory vesicle genes. Biomarkers of KSHV-infected cells were identified, including the sodium channel SCN9A. Changes in cell composition with therapy were noted. Clonal amplification of TCR+ T-cells was noted in the skin and blood.
Data Available through dbGaP: scRNA-seq data
The distribution of deleterious genetic variation across human populations is a key issue in evolutionary biology and medical genetics. However, the impact of different modes of subsistence on recent changes in population size, patterns of gene flow, and deleterious mutational load remains unclear. Here, we report high-coverage exomes from various populations of rainforest hunter-gatherers and farmers from central Africa. We find that the recent demographic histories of hunter-gatherers and farmers differed considerably, with population collapses for hunter-gatherers and expansions for farmers, accompanied by increased gene flow. We show that purifying selection against newly arising deleterious alleles is of similar efficiency across African populations, in contrast with Europeans where we detect weaker purifying selection. Furthermore, the per-individual mutation load of rainforest hunter-gatherers is similar to that of farmers, under both additive and recessive models. Our results indicate that differences in the cultural practices and demographic regimes of African populations have not resulted in large differences in mutational burden, and highlight the beneficial role of gene flow in reshaping the distribution of deleterious genetic variation across human populations.
The southern African indigenous Khoe-San populations harbor the most divergent lineages of all living peoples. Exploring their genomes is key to understanding deep human history. We sequenced 25 full genomes from five Khoe-San populations, revealing many novel variants and showing that 25% of variants are unique to the Khoe-San and that the Khoe-San group harbors the greatest level of diversity across the globe. In line with previous studies, we found several gene-regions with extreme values in genome-wide distributions, potentially caused by natural selection early in the modern human lineage and more recently. These gene-regions included immunity-, sperm-, brain-, diet- and muscle-related genes. When accounting for recent admixture, all Khoe-San groups display genetic diversity approaching the levels in other African groups and a reduction in effective population size starting around 100,000 years ago. Hence, all human groups show a reduction in effective population size commencing around the time of the Out-of-Africa migrations, which coincides with changes in the paleoclimate records, changes that potentially impacted all humans at the time.
The substantial reproductive impact of schizophrenia, for which affected individuals have fewer than half as many offspring as unaffected individuals do, implies that mutations of largest effect will frequently be de novo mutations. Ascertaining exome sequence variation in father-mother-offspring trios allows such mutations to be identified and distinguished from the far larger amount of rare variation that is inherited by each individual. The pursuit of this approach in a large, well-powered cohort of trios can also provide lessons that inform the development of such gene discovery strategies more generally in human genetics. Schizophrenia trios from the Taiwanese population are being collected by Dr. Ming Tsuang (PI, UC San Diego, California) and investigators in Taiwan (PI, Dr. Hai Gwo Hwu; both funded by NIMH grant 1R01MH085560; Expanding Rapid Ascertainment Networks of Schizophrenia Families in Taiwan). A total of 3800 trios are anticipated to be collected by May 2013. This represents a highly homogeneous national sample from the same ancestral population. DNA samples will be obtained from the NIMH Repository, Rutgers University Cell and DNA Repository (described below) and stored at the Broad Institute. Genetic and data analyses will be performed at the Broad Institute. We propose to sequence the whole exome of trios by hybrid capture and Illumina next-generation sequencing and to perform targeted genotyping and validation of variants (SNPs, indels and CNVs) using several molecular methods, including emulsion-based PCR and Sanger sequencing.
Hepatocellular Carcinoma (HCC) is a leading cause of cancer-related death and can be considered a prototype of inflammation-derived cancer arising from chronic liver injury. The cell composition of the HCC tumor immune microenvironment (TiME) has a major impact on cancer biology, as the TiME can have divergent effects on tumor initiation, progression, and response to therapy. Recent developments in multi-omics and single-cell technologies help us to comprehensively quantify the cellular heterogeneity and spatial organization of the TiME and to further our understanding of antitumor immunity. We investigated the cellular composition of liver cancer patient samples (n=8) using single-cell RNA-seq. For this purpose, mononuclear cells were isolated from human liver cancer samples. Three locations within the tumor-bearing liver were used: adjacent liver, rim and tumor core. Then, cells were FACS sorted (either CD45+ cells or MAIT cells) and subjected to 10X Chromium-based scRNA-seq. We investigated immune cell changes and the cellular heterogeneity of the HCC TiME with a focus on MAIT cells. Annotated H5 files are provided. Clinical metadata including TNM stage, sex, gender, ethnicity, pretreatment, and histopathological reports are available for all patient samples. Further details on the study can be obtained in our paper once it is published. A complementary dataset derived from ultrahighplex CO-Detection by indEXing (CODEX) from paired patient samples can be accessed through The Cancer Imaging Archive (TCIA) under DOI: https://doi.org/10.7937/bh0r-y074.
This project follows a cohort of 78 Very Low Birth Weight (VLBW) infants previously enrolled under an R21 grant, plus 25 additional infants, from their Neonatal Intensive Care Unit (NICU) stay until they reach the age of 4 years. The data, gathered over 6 weeks of the NICU stay, include multiple factors, such as prenatal and postnatal events and illnesses, amount of human milk received, weekly means of cytokines, chemokines, growth factors, and secretory Immunoglobulin A in the milk, and weekly levels of fecal calprotectin. These factors could potentially alter the gut microbiome. Microbiome species and diversities will be measured in the laboratory of Dr. Jack Gilbert at Argonne National Laboratory using state-of-the-science deep sequencing and amplification of microbial sRNA genes. The microbiome will again be measured in stool samples from those children at the ages of 2 and 4 years. Relationships between the prenatal and postnatal factors, human milk volume and immunobiology, fecal calprotectin levels, and the very early microbiome will be analyzed. The predictive power of the VLBW infant gut microbiome for determining later childhood microbiomes will be analyzed prospectively. The relationships between microbiomes across time and later growth, development and health will be determined. VLBW infants are at risk for both early and later health effects, and the role of the microbiome in these effects will be measured in this prospective study.
Pediatric low-grade gliomas (PLGGs) are among the most common solid tumors in children but, apart from mutations or duplications in the BRAF kinase in specific subclasses, few genetic driver events are known. Diffuse PLGGs compose a set of uncommon subtypes that exhibit invasive growth and are therefore especially challenging clinically. These tumors are particularly poorly understood. We performed high-resolution copy-number analysis of 44 diffuse PLGGs to identify recurrent alterations. Diffuse PLGGs exhibited fewer such alterations than adult low-grade gliomas, but we identified several significantly recurrent events. The most significant event, 8q13.1 gains, was observed in 28% of diffuse astrocytoma WHO grade II (DA2) and resulted in partial duplication of the transcription factor MYBL1 with truncation of its C-terminal negative-regulatory domain. A similar recurrent deletion-truncation breakpoint was identified in two angiocentric gliomas in the related gene MYB on 6q23.3. Whole genome sequencing of a MYBL1-rearranged diffuse astrocytoma grade II demonstrated MYBL1 tandem duplication and few other events. Two truncated MYBL1 transcripts identified in this tumor induced anchorage-independent growth when expressed in 3T3 cells and tumor formation in nude mice. Truncated transcripts were also expressed in two additional tumors with MYBL1 partial duplication. Our results define clinically relevant molecular subclasses of diffuse PLGGs and highlight a potential role for the MYB family in the biology of low-grade gliomas. "Reprinted from www.pnas.org/cgi/doi/10.1073/pnas.1300252110 with permission from PNAS."
Despite the potential of whole-genome sequencing (WGS) to improve patient diagnosis and care, the empirical value of WGS in the cancer genetics clinic is unknown. We performed WGS on members of two cohorts of cancer genetics patients: those with BRCA1/2 mutations (n = 176) and those without (n = 82). Initial analysis of potentially pathogenic variants (PPVs, defined as nonsynonymous variants with allele frequency < 1% in ESP6500) in 163 clinically-relevant genes suggested that WGS will provide useful clinical results. This is despite the fact that a majority of PPVs were novel missense variants likely to be classified as variants of unknown significance (VUS). Furthermore, previously reported pathogenic missense variants did not always associate with their predicted diseases in our patients. This suggests that the clinical use of WGS will require large-scale efforts to consolidate WGS and patient data to improve accuracy of interpretation of rare variants. While loss-of-function (LoF) variants represented only a small fraction of PPVs, WGS identified additional cancer risk LoF PPVs in patients with known BRCA1/2 mutations and led to cancer risk diagnoses in 21% of non-BRCA cancer genetics patients after expanding our analysis to 3209 ClinVar genes. These data illustrate how WGS can be used to improve our ability to discover patients' cancer genetic risks. "Reprinted from doi:10.1016/j.ebiom.2014.12.003, with permission from EBioMedicine."
This dataset contains three sets of samples. The first sample set contains euploid-fetus pregnancies, as reported by the NIPTIFY screening test and postnatal evaluation. The dataset was processed similarly to previously published guidelines from KU Leuven, with modifications [1]. Briefly, peripheral blood samples were collected in cell-free DNA BCT tubes (Streck, USA), and plasma was separated with standard dual centrifugation. Cell-free DNA was extracted from 3 ml plasma using the MagMAX Cell-Free DNA Isolation Kit (ThermoFisher Scientific). Whole-genome libraries were prepared using the FOCUS (Fragmented DNA Compact Sequencing Assay, Competence Centre on Health Technologies, Estonia) NIPT method protocol with 12 cycles for the final PCR enrichment step. Following quantification, equal amounts of 36 samples were pooled, and the quality and quantity of the pool were assessed on an Agilent 2200 TapeStation (Agilent Technologies, USA). Whole genome sequencing was performed on the NextSeq 550 instrument (Illumina Inc.) with an average coverage of 0.32× (minimum 0.08 and maximum 0.42), producing 85 bp single-end reads. The second sample set contains a single NIPT sample postnatally diagnosed with Prader-Willi syndrome. The sample was sequenced on the Illumina NextSeq 500 platform, producing 85 bp single-end reads with an average per-sample coverage of 0.32×, at the University of Tartu, Institute of Genomics Core Facility, according to the manufacturer’s standard protocols, as described previously [2]. The third sample set contains samples SC005 (SeraCare Life Sciences Inc lot #10446565), SC0042 (#10571706), and SC016 (#10560229). These are SeraCare Life Sciences Inc circulating cell-free DNA (ccfDNA)-like mixtures of human genomic DNA consisting of matched maternal and fetal DNA. SC005 and SC0042 consist of matched maternal DNA and DNA from a fetus with DiGeorge syndrome.
SC016 is a custom-ordered DNA mix in which the fetal DNA carries a pathogenic loss of the terminal region of 20p13 and a pathogenic 3q29 duplication. SC016 was processed in the same way as the first sample set, and SC0042 in the same way as the second sample set. Sample SC005 was processed twice, once as for sample set 1 and once as for sample set 2. This study was performed with the approval of the Research Ethics Committee of the University of Tartu (#352/M-12).
1. Bayindir B, Dehaspe L, Brison N, Brady P, Ardui S, Kammoun M, et al. Noninvasive prenatal testing using a novel analysis pipeline to screen for all autosomal fetal aneuploidies improves pregnancy management. Eur J Hum Genet. 2015;23: 1286–1293. doi:10.1038/ejhg.2014.282
2. Žilina O, Rekker K, Kaplinski L, Sauk M, Paluoja P, Teder H, et al. Creating basis for introducing noninvasive prenatal testing in the Estonian public health setting. Prenat Diagn. 2019;39: 1262–1268. doi:10.1002/pd.5578
Original description of the study: From ELLIPSE (linked to the PRACTICAL consortium), we contributed ~78,000 SNPs to the OncoArray. A large fraction of the content was derived from the GWAS meta-analyses in European ancestry populations (overall and aggressive disease; ~27K SNPs). We also selected just over 10,000 SNPs from the meta-analyses in the non-European populations, with a majority of these SNPs coming from the analysis of overall prostate cancer in African ancestry populations as well as from the multiethnic meta-analysis. A substantial fraction of SNPs (~28,000) were also selected for fine-mapping of 53 loci not included in the common fine-mapping regions (tagging at r2>0.9 across ±500kb regions). We also selected a few thousand SNPs associated with PSA levels and/or disease survival, SNPs from candidate lists provided by study collaborators, and SNPs from meta-analyses of exome SNP chip data from the Multiethnic Cohort and UK studies. The Contributing Studies: Aarhus: Hospital-based, Retrospective, Observational. Source of cases: Patients treated for prostate adenocarcinoma at Department of Urology, Aarhus University Hospital, Skejby (Aarhus, Denmark). Source of controls: Age-matched males treated for myocardial infarction or undergoing coronary angioplasty, but with no prostate cancer diagnosis based on information retrieved from the Danish Cancer Register and the Danish Cause of Death Register. AHS: Nested case-control study within prospective cohort. Source of cases: linkage to cancer registries in study states. Source of controls: matched controls from cohort. ATBC: Prospective, nested case-control. Source of cases: Finnish male smokers aged 50-69 years at baseline. Source of controls: Finnish male smokers aged 50-69 years at baseline. BioVU: Cases identified in a biobank linked to electronic health records.
Source of cases: A total of 214 cases were identified in the VUMC de-identified electronic health records database (the Synthetic Derivative) and shipped to USC for genotyping in April 2014. The following criteria were used to identify cases: Age 18 or greater; male; African Americans (Black) only. Note that African ancestry is not self-identified; it is administratively or third-party assigned (which has been shown to be highly correlated with genetic ancestry for African Americans in BioVU; see references). Source of controls: Controls were identified in the de-identified electronic health record. Unfortunately, they were not age matched to the cases, and therefore cannot be used for this study. Canary PASS: Prospective, Multi-site, Observational Active Surveillance Study. Source of cases: clinic based from Beth Israel Deaconess Medical Center, Eastern Virginia Medical School, University of California at San Francisco, University of Texas Health Sciences Center San Antonio, University of Washington, VA Puget Sound. Source of controls: N/A. CCI: Case series, Hospital-based. Source of cases: Cases identified through clinics at the Cross Cancer Institute. Source of controls: N/A. CerePP French Prostate Cancer Case-Control Study (ProGene): Case-Control, Prospective, Observational, Hospital-based. Source of cases: Patients, treated in French departments of Urology, who had histologically confirmed prostate cancer. Source of controls: Controls were recruited as participating in a systematic health screening program and found unaffected (normal digital rectal examination and total PSA < 4 ng/ml, or negative biopsy if PSA > 4 ng/ml). COH: hospital-based cases and controls from outside. Source of cases: Consented prostate cancer cases at City of Hope. Source of controls: Consented unaffected males that were part of other studies where they consented to have their DNA used for other research studies. COSM: Population-based cohort. Source of cases: General population.
Source of controls: General population. CPCS1: Case-control - Denmark. Source of cases: Hospital referrals. Source of controls: Copenhagen General Population Study. CPCS2: Source of cases: Hospital referrals. Source of controls: Copenhagen General Population Study. CPDR: Retrospective cohort. Source of cases: Walter Reed National Military Medical Center. Source of controls: Walter Reed National Military Medical Center. ACS_CPS-II: Nested case-control derived from a prospective cohort study. Source of cases: Identified through self-report on follow-up questionnaires and verified through medical records or cancer registries, identified through cancer registries or the National Death Index (with prostate cancer as the primary cause of death). Source of controls: Cohort participants who were cancer-free at the time of diagnosis of the matched case, also matched on age (±6 mo) and date of biospecimen donation (±6 mo). EPIC: Case-control - Germany, Greece, Italy, Netherlands, Spain, Sweden, UK. Source of cases: Identified through record linkage with population-based cancer registries in Italy, the Netherlands, Spain, Sweden and UK. In Germany and Greece, follow-up is active and achieved through checks of insurance records and cancer and pathology registries as well as via self-reported questionnaires; self-reported incident cancers are verified through medical records. Source of controls: Cohort participants without a diagnosis of cancer. EPICAP: Case-control, Population-based, ages less than 75 years at diagnosis, Hérault, France. Source of cases: Prostate cancer cases in all public hospitals and private urology clinics of the département of Hérault in France. Case validation by the Hérault Cancer Registry. Source of controls: Population-based controls, frequency age matched (5-year groups). Quotas by socio-economic status (SES) in order to obtain a distribution by SES among controls identical to the SES distribution among general population men, conditional on age.
ERSPC: Population-based randomized trial. Source of cases: Men with PrCa from screening arm ERSPC Rotterdam. Source of controls: Men without PrCa from screening arm ERSPC Rotterdam. ESTHER: Case-control, Prospective, Observational, Population-based. Source of cases: Prostate cancer cases in all hospitals in the state of Saarland, from 2001-2003. Source of controls: Random sample of participants from routine health check-up in Saarland, in 2000-2002. FHCRC: Population-based, case-control, ages 35-74 years at diagnosis, King County, WA, USA. Source of cases: Identified through the Seattle-Puget Sound SEER cancer registry. Source of controls: Randomly selected, age-frequency matched residents from the same county as cases. Gene-PARE: Hospital-based. Source of cases: Patients who received radiotherapy for treatment of prostate cancer. Source of controls: N/A. Hamburg-Zagreb: Hospital-based, Prospective. Source of cases: Prostate cancer cases seen at the Department of Oncology, University Hospital Center Zagreb, Croatia. Source of controls: Population-based (Croatia), healthy men, older than 50, with no medical record of cancer, and no family history of cancer (1st & 2nd degree relatives). HPFS: Nested case-control. Source of cases: Participants of the HPFS cohort. Source of controls: Participants of the HPFS cohort. IMPACT: Observational. Source of cases: Carriers and non-carriers (with a known mutation in the family) of the BRCA1 and BRCA2 genes, aged between 40 and 69, who are undergoing prostate screening with annual PSA testing. This cohort has been diagnosed with prostate cancer during the study. Source of controls: Carriers and non-carriers (with a known mutation in the family) of the BRCA1 and BRCA2 genes, aged between 40 and 69, who are undergoing prostate screening with annual PSA testing. This cohort has not been diagnosed with prostate cancer during the study. IPO-Porto: Hospital-based. Source of cases: Early onset and/or familial prostate cancer.
Source of controls: Blood donors. Karuprostate: Case-control, Retrospective, Population-based. Source of cases: From FWI (Guadeloupe): 237 consecutive incident patients with histologically confirmed prostate cancer attending public and private urology clinics; From Democratic Republic of Congo: 148 consecutive incident patients with histologically confirmed prostate cancer attending the University Clinic of Kinshasa. Source of controls: From FWI (Guadeloupe): 277 controls recruited from men participating in a free systematic health screening program open to the general population; From Democratic Republic of Congo: 134 controls recruited from subjects attending the University Clinic of Kinshasa. KULEUVEN: Hospital-based, Prospective, Observational. Source of cases: Prostate cancer cases recruited at the University Hospital Leuven. Source of controls: Healthy males with no history of prostate cancer recruited at the University Hospitals, Leuven. LAAPC: Subjects were participants in a population-based case-control study of aggressive prostate cancer conducted in Los Angeles County. Cases were identified through the Los Angeles County Cancer Surveillance Program rapid case ascertainment system. Eligible cases included African American, Hispanic, and non-Hispanic White men diagnosed with a first primary prostate cancer between January 1, 1999 and December 31, 2003. Eligible cases also had (a) prostatectomy with documented tumor extension outside the prostate, (b) metastatic prostate cancer in sites other than prostate, (c) needle biopsy of the prostate with Gleason grade ≥8, or (d) needle biopsy with Gleason grade 7 and tumor in more than two thirds of the biopsy cores. Eligible controls were men never diagnosed with prostate cancer, living in the same neighborhood as a case, and were frequency matched to cases on age (± 5 y) and race/ethnicity.
Controls were identified by a neighborhood walk algorithm, which proceeds through an obligatory sequence of adjacent houses or residential units beginning at a specific residence that has a specific geographic relationship to the residence where the case lived at diagnosis. Malaysia: Case-control. Source of cases: Patients who attended the outpatient urology or uro-onco clinic at University Malaya Medical Center. Source of controls: Population-based, age matched (5-year groups), ascertained through electoral register, Subang Jaya, Selangor, Malaysia. MCC-Spain: Case-control. Source of cases: Identified through the urology departments of the participating hospitals. Source of controls: Population-based, frequency age and region matched, ascertained through the rosters of the primary health care centers. MCCS: Nested case-control, Melbourne, Victoria. Source of cases: Identified by linkage to the Victorian Cancer Registry. Source of controls: Cohort participants without a diagnosis of cancer. MD Anderson: Participants in this study were identified from epidemiological prostate cancer studies conducted at the University of Texas MD Anderson Cancer Center in the Houston Metropolitan area. Cases were accrued in the Houston Medical Center and were not restricted with respect to Gleason score, stage or PSA. Controls were identified via random-digit-dialing or among hospital visitors and they were frequency matched to cases on age and race. Lifestyle, demographic, and family history data were collected using a standardized questionnaire. MDACC_AS: A prospective cohort study. Source of cases: Men with clinically organ-confined prostate cancer meeting eligibility criteria for a prospective cohort study of active surveillance at MD Anderson Cancer Center. Source of controls: N/A. MEC: The Multiethnic Cohort (MEC) comprises over 215,000 men and women recruited from Hawaii and the Los Angeles area between 1993 and 1996.
Between 1995 and 2006, over 65,000 blood samples were collected from participants for genetic analyses. To identify incident cancer cases, the MEC was cross-linked with the population-based Surveillance, Epidemiology and End Results (SEER) registries in California and Hawaii, and unaffected cohort participants with blood samples were selected as controls. MIAMI (WFPCS): Prostate cancer cases and controls were recruited from the Departments of Urology and Internal Medicine of the Wake Forest University School of Medicine using sequential patient populations as described previously (PMID:15342424). All study subjects received a detailed description of the study protocol and signed their informed consent, as approved by the medical center's Institutional Review Board. The general eligibility criteria were (i) able to comprehend informed consent and (ii) without previously diagnosed cancer. The exclusion criteria were (i) clinical diagnosis of autoimmune diseases; (ii) chronic inflammatory conditions; and (iii) infections within the past 6 weeks. Blood samples were collected from all subjects. MOFFITT: Hospital-based. Source of cases: clinic based from Moffitt Cancer Center. Source of controls: Moffitt Cancer Center affiliated Lifetime cancer screening center. NMHS: Case-control, clinic-based, Nashville TN. Source of cases: All urology clinics in Nashville, TN. Source of controls: Men without prostate cancer at prostate biopsy. PCaP: The North Carolina-Louisiana Prostate Cancer Project (PCaP) is a multidisciplinary population-based case-only study designed to address racial differences in prostate cancer through a comprehensive evaluation of social, individual and tumor level influences on prostate cancer aggressiveness. PCaP enrolled approximately equal numbers of African Americans and Caucasian Americans with newly-diagnosed prostate cancer from North Carolina (42 counties) and Louisiana (30 parishes) identified through state tumor registries.
African American PCaP subjects with DNA, who agreed to future use of specimens for research, participated in OncoArray analysis. PCMUS: Case-control - Sofia, Bulgaria. Source of cases: Patients of Clinic of Urology, Alexandrovska University Hospital, Sofia, Bulgaria, PrCa histopathologically confirmed. Source of controls: 72 patients with verified BPH and PSA < 3.5; 78 healthy controls from the MMC Biobank, no history of PrCa. PHS: Nested case-control. Source of cases: Participants of the PHS1 trial/cohort. Source of controls: Participants of the PHS1 trial/cohort. PLCO: Nested case-control. Source of cases: Men with a confirmed diagnosis of prostate cancer from the PLCO Cancer Screening Trial. Source of controls: Controls were men enrolled in the PLCO Cancer Screening Trial without a diagnosis of cancer at the time of case ascertainment. Poland: Case-control. Source of cases: men with unselected prostate cancer, diagnosed in north-western Poland at the University Hospital in Szczecin. Source of controls: cancer-free men from the same population, taken from the healthy adult patients of family doctors in the Szczecin region. PROCAP: Population-based, Retrospective, Observational. Source of cases: Cases were ascertained from the National Prostate Cancer Register of Sweden Follow-Up Study, a retrospective nationwide cohort study of patients with localized prostate cancer. Source of controls: Controls were selected among men referred for PSA testing in laboratories in Stockholm County, Sweden, between 2010 and 2012. PROGReSS: Hospital-based, Prospective, Observational. Source of cases: Prostate cancer cases from the Hospital Clínico Universitario de Santiago de Compostela, Galicia, Spain. Source of controls: Cancer-free men from the same population. ProMPT: A study to collect samples and data from subjects with and without prostate cancer. Retrospective, Experimental. Source of cases: Subjects attending outpatient clinics in hospitals. 
Source of controls: Subjects attending outpatient clinics in hospitals. ProtecT: Trial of treatment. Samples taken from subjects invited for PSA testing from the community at nine centers across United Kingdom. Source of cases: Subjects who have a proven diagnosis of prostate cancer following testing. Source of controls: Identified through invitation of subjects in the community. PROtEuS: Case-control, population-based. Source of cases: All new histologically-confirmed cases, aged 75 years or younger, diagnosed between 2005 and 2009, actively ascertained across Montreal French hospitals. Source of controls: Randomly selected from the Provincial electoral list of French-speaking men between 2005 and 2009, from the same area of residence as cases and frequency-matched on age. QLD: Case-control. Source of cases: A longitudinal cohort study (Prostate Cancer Supportive Care and Patient Outcomes Project: ProsCan) conducted in Queensland, through which men newly diagnosed with prostate cancer from 26 private practices and 10 public hospitals were directly referred to ProsCan at the time of diagnosis by their treating clinician (age range 43-88 years). All cases had histopathologically confirmed prostate cancer, following presentation with an abnormal serum PSA and/or lower urinary tract symptoms. Source of controls: Controls comprised healthy male blood donors with no personal history of prostate cancer, recruited through (i) the Australian Red Cross Blood Services in Brisbane (age range 19-76 years) and (ii) the Australian Electoral Commission (AEC) (age and post-code/area matched to ProsCan, age range 54-90 years). RAPPER: Multi-centre, hospital based blood sample collection study in patients enrolled in clinical trials with prospective collection of radiotherapy toxicity data. Source of cases: Prostate cancer patients enrolled in radiotherapy trials: CHHiP, RT01, Dose Escalation, RADICALS, Pelvic IMRT, PIVOTAL. 
Source of controls: N/A. SABOR: Prostate Cancer Screening Cohort. Source of cases: Men >45 yrs of age participating in annual PSA screening. Source of controls: Males participating in annual PSA prostate cancer risk evaluations (funded by NCI biomarkers discovery and validation grant), recruited through University of Texas Health Science Center at San Antonio and affiliated sites or through study advertisements, enrolment open to the community. SCCS: Case-control in cohort, Southeastern USA. Prospective, Observational, Population-based. Source of cases: SCCS entry population. Source of controls: SCCS entry population. SCPCS: Population-based, Retrospective, Observational. Source of cases: South Carolina Central Cancer Registry. Source of controls: Health Care Financing Administration beneficiary file. SEARCH: Case-control - East Anglia, UK. Source of cases: Men < 70 years of age registered with prostate cancer at the population-based cancer registry, Eastern Cancer Registration and Information Centre, East Anglia, UK. Source of controls: Men attending general practice in East Anglia with no known prostate cancer diagnosis, frequency matched to cases by age and geographic region. SNP_Prostate_Ghent: Hospital-based, Retrospective, Observational. Source of cases: Men treated with IMRT as primary or postoperative treatment for prostate cancer at the Ghent University Hospital between 2000 and 2010. Source of controls: Employees of the University hospital and members of social activity clubs, without a history of any cancer. SPAG: Hospital-based, Retrospective, Observational. Source of cases: Guernsey. Source of controls: Guernsey. STHM2: Population-based, Retrospective, Observational. Source of cases: Cases were selected among men referred for PSA testing in laboratories in Stockholm County, Sweden, between 2010 and 2012. Source of controls: Controls were selected among men referred for PSA testing in laboratories in Stockholm County, Sweden, between 2010 and 2012. 
PCPT: Case-control from a randomized clinical trial. Source of cases: Randomized clinical trial. Source of controls: Randomized clinical trial. SELECT: Case-cohort from a randomized clinical trial. Source of cases: Randomized clinical trial. Source of controls: Randomized clinical trial. TAMPERE: Case-control - Finland, Retrospective, Observational, Population-based. Source of cases: Identified through linkage to the Finnish Cancer Registry and patient records; and the Finnish arm of the ERSPC study. Source of controls: Cohort participants without a diagnosis of cancer. UGANDA: Uganda Prostate Cancer Study: Uganda is a case-control study of prostate cancer in Kampala, Uganda, that was initiated in 2011. Men with prostate cancer were enrolled from the Urology unit at Mulago Hospital and men without prostate cancer (i.e. controls) were enrolled from other clinics (i.e. surgery) at the hospital. UKGPCS: ICR, UK. Source of cases: Cases identified through clinics at the Royal Marsden hospital and nationwide NCRN hospitals. Source of controls: Ken Muir's control- 2000. ULM: Case-control - Germany. Source of cases: familial cases (n=162): identified through questionnaires for family history by collaborating urologists all over Germany; sporadic cases (n=308): prostatectomy series performed in the Clinic of Urology Ulm between 2012 and 2014. Source of controls: age-matched controls (n=188): age-matched men without prostate cancer and negative family history collected in hospitals of Ulm. WUGS/WUPCS: Case series, USA. Source of cases: Identified through clinics at Washington University in St. Louis. Source of controls: Men diagnosed and managed with prostate cancer in University based clinic. Acknowledgement Statements: Aarhus: This study was supported by the Danish Strategic Research Council (now Innovation Fund Denmark) and the Danish Cancer Society. The Danish Cancer Biobank (DCB) is acknowledged for biological material. 
AHS: This work was supported by the Intramural Research Program of the NIH, National Cancer Institute, Division of Cancer Epidemiology and Genetics (Z01CP010119). ATBC: This research was supported in part by the Intramural Research Program of the NIH and the National Cancer Institute. Additionally, this research was supported by U.S. Public Health Service contracts N01-CN-45165, N01-RC-45035, N01-RC-37004, HHSN261201000006C, and HHSN261201500005C from the National Cancer Institute, Department of Health and Human Services. BioVu: The dataset(s) used for the analyses described were obtained from Vanderbilt University Medical Center's BioVU which is supported by institutional funding and by the National Center for Research Resources, Grant UL1 RR024975-01 (which is now at the National Center for Advancing Translational Sciences, Grant 2 UL1 TR000445-06). Canary PASS: PASS was supported by Canary Foundation and the National Cancer Institute's Early Detection Research Network (U01 CA086402). CCI: This work was awarded by Prostate Cancer Canada and is proudly funded by the Movember Foundation - Grant # D2013-36. The CCI group would like to thank David Murray, Razmik Mirzayans, and April Scott for their contribution to this work. CerePP French Prostate Cancer Case-Control Study (ProGene): None reported. COH: SLN is partially supported by the Morris and Horowitz Families Endowed Professorship. COSM: The Swedish Research Council, the Swedish Cancer Foundation. CPCS1 & CPCS2: Department of Clinical Biochemistry, Herlev and Gentofte Hospital, Copenhagen University Hospital, Herlev Ringvej 75, DK-2730 Herlev, Denmark. CPCS1 would like to thank the participants and staff of the Copenhagen General Population Study for their important contributions. CPDR: Uniformed Services University for the Health Sciences HU0001-10-2-0002 (PI: David G. McLeod, MD). CPS-II: The American Cancer Society funds the creation, maintenance, and updating of the Cancer Prevention Study II cohort. 
CPS-II thanks the participants and Study Management Group for their invaluable contributions to this research. We would also like to acknowledge the contribution to this study from central cancer registries supported through the Centers for Disease Control and Prevention National Program of Cancer Registries, and cancer registries supported by the National Cancer Institute Surveillance Epidemiology and End Results program. EPIC: The coordination of EPIC is financially supported by the European Commission (DG-SANCO) and the International Agency for Research on Cancer. The national cohorts are supported by the Danish Cancer Society (Denmark); the Deutsche Krebshilfe, Deutsches Krebsforschungszentrum and Federal Ministry of Education and Research (Germany); the Hellenic Health Foundation, Greek Ministry of Health; Greek Ministry of Education (Greece); the Italian Association for Research on Cancer (AIRC) and National Research Council (Italy); the Dutch Ministry of Public Health, Welfare and Sports (VWS), Netherlands Cancer Registry (NKR), LK Research Funds, Dutch Prevention Funds, Dutch ZON (Zorg Onderzoek Nederland), World Cancer Research Fund (WCRF); the Statistics Netherlands (The Netherlands); the Health Research Fund (FIS), Regional Governments of Andalucía, Asturias, Basque Country, Murcia and Navarra, Spanish Ministry of Health ISCIII RETIC (RD06/0020), Red de Centros RCESP, C03/09 (Spain); the Swedish Cancer Society, Swedish Scientific Council and Regional Government of Skåne and Västerbotten, Fundacion Federico SA (Sweden); the Cancer Research UK, Medical Research Council (United Kingdom). EPICAP: The EPICAP study was supported by grants from Ligue Nationale Contre le Cancer, Ligue départementale du Val de Marne; Fondation de France; Agence Nationale de sécurité sanitaire de l'alimentation, de l'environnement et du travail (ANSES). 
The EPICAP study group would like to thank all urologists, Antoinette Anger and Hasina Randrianasolo (study monitors), Anne-Laure Astolfi, Coline Bernard, Oriane Noyer, Marie-Hélène De Campo, Sandrine Margaroline, Louise N'Diaye, and Sabine Perrier-Bonnet (Clinical Research nurses). ERSPC: This study was supported by the Dutch Cancer Society (KWF 94-869, 98-1657, 2002-277, 2006-3518, 2010-4800), The Netherlands Organisation for Health Research and Development (ZonMW-002822820, 22000106, 50-50110-98-311, 62300035), The Dutch Cancer Research Foundation (SWOP), and an unconditional grant from Beckman-Coulter-Hybritech Inc. ESTHER: The ESTHER study was supported by a grant from the Baden Württemberg Ministry of Science, Research and Arts. The ESTHER group would like to thank Hartwig Ziegler, Sonja Wolf, Volker Hermann, Heiko Müller, Karina Dieffenbach, Katja Butterbach for valuable contributions to the study. FHCRC: The FHCRC studies were supported by grants R01-CA056678, R01-CA082664, and R01-CA092579 from the US National Cancer Institute, National Institutes of Health, with additional support from the Fred Hutchinson Cancer Research Center. FHCRC would like to thank all the men who participated in these studies. Gene-PARE: The Gene-PARE study was supported by grants 1R01CA134444 from the U.S. National Institutes of Health, PC074201 and W81XWH-15-1-0680 from the Prostate Cancer Research Program of the Department of Defense and RSGT-05-200-01-CCE from the American Cancer Society. Hamburg-Zagreb: None reported. HPFS: The Health Professionals Follow-up Study was supported by grants UM1CA167552, CA133891, CA141298, and P01CA055075. 
HPFS are grateful to the participants and staff of the Physicians' Health Study and Health Professionals Follow-Up Study for their valuable contributions, as well as the following state cancer registries for their help: AL, AZ, AR, CA, CO, CT, DE, FL, GA, ID, IL, IN, IA, KY, LA, ME, MD, MA, MI, NE, NH, NJ, NY, NC, ND, OH, OK, OR, PA, RI, SC, TN, TX, VA, WA, and WY. IMPACT: The IMPACT study was funded by The Ronald and Rita McAulay Foundation, CR-UK Project grant (C5047/A1232), Cancer Australia, AICR Netherlands A10-0227, Cancer Australia and Cancer Council Tasmania, NIHR, EU Framework 6, Cancer Councils of Victoria and South Australia, and Philanthropic donation to Northshore University Health System. We acknowledge support from the National Institute for Health Research (NIHR) to the Biomedical Research Centre at The Institute of Cancer Research and Royal Marsden Foundation NHS Trust. IMPACT acknowledges the IMPACT study steering committee, collaborating centres, and participants. IPO-Porto: The IPO-Porto study was funded by Fundação para a Ciência e a Tecnologia (FCT; UID/DTP/00776/2013 and PTDC/DTP-PIC/1308/2014) and by IPO-Porto Research Center (CI-IPOP-16-2012 and CI-IPOP-24-2015). MC and MPS are research fellows from Liga Portuguesa Contra o Cancro, Núcleo Regional do Norte. SM is a research fellow from FCT (SFRH/BD/71397/2010). IPO-Porto would like to express our gratitude to all patients and families who have participated in this study. Karuprostate: The Karuprostate study was supported by the French National Health Directorate and by the Association pour la Recherche sur les Tumeurs de la Prostate. Karuprostate thanks Séverine Ferdinand. KULEUVEN: F.C. and S.J. are holders of grants from FWO Vlaanderen (G.0684.12N and G.0830.13N), the Belgian federal government (National Cancer Plan KPC_29_023), and a Concerted Research Action of the KU Leuven (GOA/15/017). TVDB is holder of a doctoral fellowship of the FWO. 
LAAPC: This study was funded by grant R01CA84979 (to S.A. Ingles) from the National Cancer Institute, National Institutes of Health. Malaysia: The study was funded by the University Malaya High Impact Research Grant (HIR/MOHE/MED/35). Malaysia thanks all associates in the Urology Unit, University of Malaya, Cancer Research Initiatives Foundation (CARIF) and the Malaysian Men's Health Initiative (MMHI). MCCS: MCCS cohort recruitment was funded by VicHealth and Cancer Council Victoria. The MCCS was further supported by Australian NHMRC grants 209057, 251553, and 504711, and by infrastructure provided by Cancer Council Victoria. Cases and their vital status were ascertained through the Victorian Cancer Registry (VCR) and the Australian Institute of Health and Welfare (AIHW), including the National Death Index and the Australian Cancer Database. MCC-Spain: The study was partially funded by the Accion Transversal del Cancer, approved on the Spanish Ministry Council on the 11th October 2007, by the Instituto de Salud Carlos III-FEDER (PI08/1770, PI09/00773-Cantabria, PI11/01889-FEDER, PI12/00265, PI12/01270, and PI12/00715), by the Fundación Marqués de Valdecilla (API 10/09), by the Spanish Association Against Cancer (AECC) Scientific Foundation and by the Catalan Government DURSI grant 2009SGR1489. Samples: Biological samples were stored at the Parc de Salut MAR Biobank (MARBiobanc; Barcelona) which is supported by Instituto de Salud Carlos III FEDER (RD09/0076/00036). Also sample collection was supported by the Xarxa de Bancs de Tumors de Catalunya sponsored by Pla Director d'Oncologia de Catalunya (XBTC). MCC-Spain acknowledges the contribution from Esther Gracia-Lavedan in preparing the data. We thank all the subjects who participated in the study and all MCC-Spain collaborators. MD Anderson: Prostate Cancer Case-Control Studies at MD Anderson (MDA) supported by grants CA68578, ES007784, DAMD W81XWH-07-1-0645, and CA140388. 
MDACC_AS: None reported. MEC: Funding provided by NIH grant U19CA148537 and grant U01CA164973. MIAMI (WFPCS): ACS. MOFFITT: The Moffitt group was supported by the US National Cancer Institute (R01CA128813, PI: J.Y. Park). NMHS: Funding for the Nashville Men's Health Study (NMHS) was provided by the National Institutes of Health, grant number R01CA121060. PCaP only data: The North Carolina - Louisiana Prostate Cancer Project (PCaP) is carried out as a collaborative study supported by the Department of Defense contract DAMD 17-03-2-0052. For HCaP-NC follow-up data: The Health Care Access and Prostate Cancer Treatment in North Carolina (HCaP-NC) study is carried out as a collaborative study supported by the American Cancer Society award RSGT-08-008-01-CPHPS. For studies using both PCaP and HCaP-NC follow-up data please use: The North Carolina - Louisiana Prostate Cancer Project (PCaP) and the Health Care Access and Prostate Cancer Treatment in North Carolina (HCaP-NC) study are carried out as collaborative studies supported by the Department of Defense contract DAMD 17-03-2-0052 and the American Cancer Society award RSGT-08-008-01-CPHPS, respectively. For any PCaP data, please include: The authors thank the staff, advisory committees and research subjects participating in the PCaP study for their important contributions. For studies using PCaP DNA/genotyping data, please include: We would like to acknowledge the UNC BioSpecimen Facility and LSUHSC Pathology Lab for our DNA extractions, blood processing, storage and sample disbursement (https://genome.unc.edu/bsp). For studies using PCaP tissue, please include: We would like to acknowledge the RPCI Department of Urology Tissue Microarray and Immunoanalysis Core for our tissue processing, storage and sample disbursement. 
For studies using HCaP-NC follow-up data, please use: The Health Care Access and Prostate Cancer Treatment in North Carolina (HCaP-NC) study is carried out as a collaborative study supported by the American Cancer Society award RSGT-08-008-01-CPHPS. The authors thank the staff, advisory committees and research subjects participating in the HCaP-NC study for their important contributions. For studies that use both PCaP and HCaP-NC, please use: The authors thank the staff, advisory committees and research subjects participating in the PCaP and HCaP-NC studies for their important contributions. PCMUS: The PCMUS study was supported by the Bulgarian National Science Fund, Ministry of Education and Science (contract DOO-119/2009; DUNK01/2-2009; DFNI-B01/28/2012) with additional support from the Science Fund of Medical University - Sofia (contract 51/2009; 8I/2009; 28/2010). PHS: The Physicians' Health Study was supported by grants CA34944, CA40360, CA097193, HL26490, and HL34595. PHS members are grateful to the participants and staff of the Physicians' Health Study and Health Professionals Follow-Up Study for their valuable contributions, as well as the following state cancer registries for their help: AL, AZ, AR, CA, CO, CT, DE, FL, GA, ID, IL, IN, IA, KY, LA, ME, MD, MA, MI, NE, NH, NJ, NY, NC, ND, OH, OK, OR, PA, RI, SC, TN, TX, VA, WA, and WY. PLCO: This PLCO study was supported by the Intramural Research Program of the Division of Cancer Epidemiology and Genetics, National Cancer Institute, NIH. PLCO thanks Drs. Christine Berg and Philip Prorok, Division of Cancer Prevention at the National Cancer Institute, the screening center investigators and staff of the PLCO Cancer Screening Trial for their contributions to the PLCO Cancer Screening Trial. We thank Mr. Thomas Riley, Mr. Craig Williams, Mr. Matthew Moore, and Ms. Shannon Merkle at Information Management Services, Inc., for their management of the data and Ms. Barbara O'Brien and staff at Westat, Inc. 
for their contributions to the PLCO Cancer Screening Trial. We also thank the PLCO study participants for their contributions to making this study possible. Poland: None reported. PROCAP: PROCAP was supported by the Swedish Cancer Foundation (08-708, 09-0677). PROCAP thanks and acknowledges all of the participants in the PROCAP study. We thank Carin Cavalli-Björkman and Ami Rönnberg Karlsson for their dedicated work in the collection of data. Michael Broms is acknowledged for his skilful work with the databases. KI Biobank is acknowledged for handling the samples and for DNA extraction. We acknowledge The NPCR steering group: Pär Stattin (chair), Anders Widmark, Stefan Karlsson, Magnus Törnblom, Jan Adolfsson, Anna Bill-Axelson, Ove Andrén, David Robinson, Bill Pettersson, Jonas Hugosson, Jan-Erik Damber, Ola Bratt, Göran Ahlgren, Lars Egevad, and Roy Ehrnström. PROGReSS: The PROGReSS study is funded by grants from the Spanish Ministry of Health (INT15/00070; INT16/00154; FIS PI10/00164, FIS PI13/02030; FIS PI16/00046); the Spanish Ministry of Economy and Competitiveness (PTA2014-10228-I), and Fondo Europeo de Desarrollo Regional (FEDER 2007-2013). ProMPT: Funded by CRUK, NIHR, MRC, Cambridge Biomedical Research Centre. ProtecT: Funded by NIHR. ProtecT and ProMPT would like to acknowledge the support of The University of Cambridge, Cancer Research UK. Cancer Research UK grants (C8197/A10123) and (C8197/A10865) supported the genotyping team. We would also like to acknowledge the support of the National Institute for Health Research which funds the Cambridge Bio-medical Research Centre, Cambridge, UK. We would also like to acknowledge the support of the National Cancer Research Prostate Cancer: Mechanisms of Progression and Treatment (PROMPT) collaborative (grant code G0500966/75466) which has funded tissue and urine collections in Cambridge. 
We are grateful to staff at the Wellcome Trust Clinical Research Facility, Addenbrooke's Clinical Research Centre, Cambridge, UK for their help in conducting the ProtecT study. We also acknowledge the support of the NIHR Cambridge Biomedical Research Centre, the DOH HTA (ProtecT grant), and the NCRI/MRC (ProMPT grant) for help with the bio-repository. The UK Department of Health funded the ProtecT study through the NIHR Health Technology Assessment Programme (projects 96/20/06, 96/20/99). The ProtecT trial and its linked ProMPT and CAP (Comparison Arm for ProtecT) studies are supported by Department of Health, England; Cancer Research UK grant number C522/A8649, Medical Research Council of England grant number G0500966, ID 75466, and The NCRI, UK. The epidemiological data for ProtecT were generated through funding from the Southwest National Health Service Research and Development. DNA extraction in ProtecT was supported by USA Dept of Defense award W81XWH-04-1-0280, Yorkshire Cancer Research and Cancer Research UK. The authors would like to acknowledge the contribution of all members of the ProtecT study research group. The views and opinions expressed therein are those of the authors and do not necessarily reflect those of the Department of Health of England. The bio-repository from ProtecT is supported by the NCRI (ProMPT) Prostate Cancer Collaborative and the Cambridge BMRC grant from NIHR. We thank the National Institute for Health Research, Hutchison Whampoa Limited, the Human Research Tissue Bank (Addenbrooke's Hospital), and Cancer Research UK. 
PROtEuS: PROtEuS was supported financially through grants from the Canadian Cancer Society (13149, 19500, 19864, 19865) and the Cancer Research Society, in partnership with the Ministère de l'enseignement supérieur, de la recherche, de la science et de la technologie du Québec, and the Fonds de la recherche du Québec - Santé. PROtEuS would like to thank its collaborators and research personnel, and the urologists involved in subject recruitment. We also wish to acknowledge the special contribution made by Ann Hsing and Anand Chokkalingam to the conception of the genetic component of PROtEuS. QLD: The QLD research is supported by The National Health and Medical Research Council (NHMRC) Australia Project Grants (390130, 1009458) and NHMRC Career Development Fellowship and Cancer Australia PdCCRS funding to J Batra. The QLD team would like to acknowledge and sincerely thank the urologists, pathologists, data managers and patient participants who have generously and altruistically supported the QLD cohort. RAPPER: RAPPER is funded by Cancer Research UK (C1094/A11728; C1094/A18504) and Experimental Cancer Medicine Centre funding (C1467/A7286). The RAPPER group thank Rebecca Elliott for project management. SABOR: The SABOR research is supported by NIH/NCI Early Detection Research Network, grant U01 CA0866402-12. Also supported by the Cancer Center Support Grant to the Cancer Therapy and Research Center from the National Cancer Institute (US) P30 CA054174. SCCS: SCCS is funded by NIH grant R01 CA092447, and SCCS sample preparation was conducted at the Epidemiology Biospecimen Core Lab that is supported in part by the Vanderbilt-Ingram Cancer Center (P30 CA68485). 
Data on SCCS cancer cases used in this publication were provided by the Alabama Statewide Cancer Registry; Kentucky Cancer Registry, Lexington, KY; Tennessee Department of Health, Office of Cancer Surveillance; Florida Cancer Data System; North Carolina Central Cancer Registry, North Carolina Division of Public Health; Georgia Comprehensive Cancer Registry; Louisiana Tumor Registry; Mississippi Cancer Registry; South Carolina Central Cancer Registry; Virginia Department of Health, Virginia Cancer Registry; Arkansas Department of Health, Cancer Registry, 4815 W. Markham, Little Rock, AR 72205. The Arkansas Central Cancer Registry is fully funded by a grant from National Program of Cancer Registries, Centers for Disease Control and Prevention (CDC). Data on SCCS cancer cases from Mississippi were collected by the Mississippi Cancer Registry which participates in the National Program of Cancer Registries (NPCR) of the Centers for Disease Control and Prevention (CDC). The contents of this publication are solely the responsibility of the authors and do not necessarily represent the official views of the CDC or the Mississippi Cancer Registry. SCPCS: SCPCS is funded by CDC grant S1135-19/19, and SCPCS sample preparation was conducted at the Epidemiology Biospecimen Core Lab that is supported in part by the Vanderbilt-Ingram Cancer Center (P30 CA68485). SEARCH: SEARCH is funded by a program grant from Cancer Research UK (C490/A10124) and supported by the UK National Institute for Health Research Biomedical Research Centre at the University of Cambridge. SNP_Prostate_Ghent: The study was supported by the National Cancer Plan, financed by the Federal Office of Health and Social Affairs, Belgium. 
SPAG: Wessex Medical Research, Hope for Guernsey, MUG, HSSD, MSG, Roger Allsopp. STHM2: STHM2 was supported by grants from The Strategic Research Programme on Cancer (StratCan), Karolinska Institutet; the Linné Centre for Breast and Prostate Cancer (CRISP, number 70867901), Karolinska Institutet; The Swedish Research Council (number K2010-70X-20430-04-3) and The Swedish Cancer Society (numbers 11-0287 and 11-0624); Stiftelsen Johanna Hagstrand och Sigfrid Linnérs minne; Swedish Council for Working Life and Social Research (FAS), number 2012-0073. STHM2 acknowledges the Karolinska University Laboratory, Aleris Medilab, Unilabs and the Regional Prostate Cancer Registry for performing analyses and help to retrieve data. We thank Carin Cavalli-Björkman and Britt-Marie Hune for their enthusiastic work as research nurses, and Astrid Björklund for skilful data management. We wish to thank the BBMRI.se biobank facility at Karolinska Institutet for biobank services. PCPT & SELECT are funded by Public Health Service grants U10CA37429 and 5UM1CA182883 from the National Cancer Institute. SWOG and SELECT thank the site investigators and staff and, most importantly, the participants who donated their time to this trial. TAMPERE: The Tampere (Finland) study was supported by the Academy of Finland (251074), The Finnish Cancer Organisations, Sigrid Juselius Foundation, and the Competitive Research Funding of the Tampere University Hospital (X51003). The PSA screening samples were collected by the Finnish part of ERSPC (European Study of Screening for Prostate Cancer). TAMPERE would like to thank Riina Liikanen, Liisa Maeaettaenen and Kirsi Talala for their work on samples and databases. 
UGANDA: None reported. UKGPCS: UKGPCS would also like to thank the following for funding support: The Institute of Cancer Research and The Everyman Campaign, The Prostate Cancer Research Foundation, Prostate Research Campaign UK (now Prostate Action), The Orchid Cancer Appeal, The National Cancer Research Network UK, The National Cancer Research Institute (NCRI) UK. We are grateful for support of NIHR funding to the NIHR Biomedical Research Centre at The Institute of Cancer Research and The Royal Marsden NHS Foundation Trust. UKGPCS would also like to acknowledge the NCRN nurses, data managers, and consultants for their work in the UKGPCS study. UKGPCS would like to thank all urologists and other persons involved in the planning, coordination, and data collection of the study. ULM: The Ulm group received funds from the German Cancer Aid (Deutsche Krebshilfe). WUGS/WUPCS: WUGS would like to thank the following for funding support: The Anthony DeNovi Fund, the Donald C. McGraw Foundation, and the St. Louis Men's Group Against Cancer.
Differences in breast cancer incidence and mortality rates between North American Caucasian and African American women are well-described and transcend socioeconomic issues. Black women are diagnosed with breast cancer at a younger median age; have more clinically aggressive disease; and, stage-for-stage, have higher mortality rates than age-matched Caucasian women. Black women in West Africa, the origin of the slave trade in the US in the 19th century and thus the founder population for most African Americans, have even higher rates of early-onset, poor-prognosis breast cancer than African American women. Racial differences in the distribution of intrinsic molecular subtypes have been well characterized in the US and throughout the African Diaspora as well. Despite extensive efforts to characterize racial/ethnic differences, however, the reasons women of African ancestry are disproportionately affected by breast cancer incidence and mortality remain poorly understood, largely due to a paucity of data on inherent genomic differences that contribute to the disparities in incidence and progression of breast cancer across populations. The West Africa Breast Cancer Study (WABCS) is an initiative that aims to comprehensively understand the genetic architecture of breast cancer in West Africans, the founder population of a large proportion of black women in the United States. The objective of the study was to provide a better understanding of the molecular genetic factors that influence prognosis in Nigerian breast cancer patients, and determine which of these alterations may be amenable to available therapy. To that end, we examined the molecular features of breast cancers of indigenous African women using a combination of whole-genome, whole-exome, and transcriptome sequencing (WGS, WES, and RNA-seq) on 194 tumors from Nigerian patients. 
The goal of this project was to obtain answers to two related research questions using an unscreened population without genetic admixture in Nigeria: 1) why are women of African ancestry more likely to develop aggressive, young-onset breast cancer? 2) what are the associated genomic and non-genomic risk factors? We hypothesize that the genomic determinants of breast cancer molecular subtypes in women of African ancestry are also molecular drivers of tumor progression and represent targets for interventions to improve clinical outcomes and close the mortality gap. By identifying causal links between genetic variants that promote aggressive tumor progression in Nigerian women in comparison to women from different populations found in TCGA and ICGC, the present dataset will have significant public health impact on millions of women in the African Diaspora. The potential to identify novel pathways for interventions to reduce the increasing mortality gap between women of African and European ancestry is huge.
The dataset includes spatially-resolved and single-cell antigen receptor data, as well as gene expression data, from two different HER2+ breast cancer patients. The tumor piece obtained during surgery from each patient was divided into several regions and tissue sections were used for spatial transcriptomics (Visium, 10x Genomics). As indicated, some tissue sections were analyzed by a new method (Spatial VDJ) to spatially resolve antigen receptor sequences (target capture), which was developed in our publication. In parallel, tissue pieces from the same tumor were dissociated for single-cell gene expression analysis (10x Genomics GEX, VDJ, and feature barcoding/Hash Tag Oligonucleotide). The deposited data are in the form of fastq files. All processed data, metadata, micrographs of the tissue sections (of those used for spatial transcriptomics), and scripts used for the analysis are publicly available at Zenodo (DOI: 10.5281/zenodo.7961605). Final libraries were sequenced on NextSeq2000 (Illumina) or NovaSeq6000 (Illumina) and analyzed with Cell Ranger, Seurat, Space Ranger, and STutility pipelines.
The dataset includes spatially-resolved gene expression and antigen receptor data from two tonsil samples (Tonsil 1 and Tonsil 2). Tissue sections from the tonsil samples were used for spatial transcriptomics (Visium, 10x Genomics). Tonsil 2 tissue sections were analyzed by a new method (Spatial VDJ) to spatially resolve antigen receptor sequences (target capture), which was developed in our publication. Nearby or adjacent tissue sections (from Tonsil 2) were also analyzed by a bulk antigen receptor sequencing approach (amplicon sequencing), using a method (Bulk SS3 VDJ) that we also developed in the same publication. For Visium, the data were anonymized (all SNPs removed) using Bamboozle (Ziegenhain and Sandberg, Nature Communications 2021). The deposited data are in the form of fastq files. All remaining data, metadata, micrographs of the tissue sections (of those used for spatial transcriptomics), and scripts used for the analysis are available at Zenodo (DOI: 10.5281/zenodo.7961605). Final libraries were sequenced on NextSeq2000 (Illumina) or NovaSeq6000 (Illumina) and analyzed with Seurat, Space Ranger, and STutility pipelines.
The genetic architecture and polygenicity of skin pigmentation are explored in KhoeSan communities, including Nama individuals whose phenotype and genotype data are provided in this dataset. By pairing genotypes and quantitative spectrophotometric skin pigmentation phenotypes, we show that skin pigmentation is highly heritable in the KhoeSan, yet known pigmentation loci explain only a small fraction of the phenotypic variance. Using genome-wide association analyses, we identified both canonical and non-canonical skin pigmentation loci. We show that this phenotype is more polygenic and complex than previously characterized. (Martin et al., Cell 2017. PMID: 29195075) Following up on the top associated signal with large effect size in the gene SLC24A5, we demonstrate that the canonical Eurasian nonsynonymous allele was introduced into the KhoeSan via a recent migration ~2 kya and was under extremely strong selection. The derived allele was present at high frequency despite controlling for gene flow. With high-throughput sequences in the captured SLC24A5 region, we show that the most common derived haplotype is identical amongst Europeans, eastern Africans and KhoeSan. Using 4-population demographic simulations with selection, we show that the allele was introduced into the KhoeSan only 2,000 ya via a back-to-Africa migration and then experienced a selective sweep.
Glycophorin A and glycophorin B are red blood cell surface proteins that are both receptors for the parasite Plasmodium falciparum, which is the principal cause of malaria in sub-Saharan Africa. DUP4 is a complex structural genomic variant that carries extra copies of a glycophorin A - glycophorin B fusion gene, and has a dramatic effect on malaria risk by reducing the risk of severe malaria by up to 40%. Using fiber-FISH and Illumina sequencing, we validate the structural arrangement of the glycophorin locus in the DUP4 variant, and reveal somatic variation in copy number of the glycophorin A - glycophorin B fusion gene. By developing a simple, specific, PCR-based assay for DUP4, we show that the DUP4 variant reaches a frequency of 13% in a malaria-endemic village in south-eastern Tanzania. We genotype a substantial proportion of that village and demonstrate an association of DUP4 genotype with hemoglobin levels, a phenotype related to malaria, using a family-based association test. Taken together, we show that DUP4 is a complex structural variant that may be susceptible to somatic variation, and show that it is associated with a malaria-related phenotype in a non-hospitalized, longitudinally followed population.
(Excerpted/paraphrased from original grant application): FOCI seeks to expand our understanding of epithelial ovarian cancer through a coordinated and comprehensive approach. Project 1 will focus on discovery, expansion, and replication. By pooling GWAS, we expect to identify new associations and achieve independent replication, explore whether there are risk variants specific for histologic subtypes, and evaluate structural polymorphisms - copy number variants - as risk factors. Finally, Project 1 will leverage the GWAS data to correlate DNA variants with a new endpoint - survival. Project 2 will focus on biological studies designed to help inform interpretation of findings from Project 1. This will include efforts to identify the functional consequences of variants and improve understanding of biological mechanisms. Project 3 will include epidemiologic studies of gene by gene interaction, gene by environment interaction, and development of risk prediction models. The collective effort builds upon the strengths and history of collaboration inherent in the Ovarian Cancer Association Consortium (OCAC), a multidisciplinary group composed of epidemiologists, genetic epidemiologists, statistical geneticists, molecular and cell biologists and clinicians that was formed in 2005. The FOCI Cohort (top-level study phs001133) is utilized in the following dbGaP sub-studies, which hold the genotypes, other molecular data, and derived variables: phs001131 Affymetrix Exome Chip; phs001132 GWAS Meta Analysis; phs001142 Mayo Omni Express; phs001150 Mayo 2 5M.
Preterm birth (PTB, birth before 37 weeks of gestation) is a leading cause of neonatal mortality and post-natal morbidity. PTB affects one in nine of all live births in the U.S. Notably, the highest rate of PTB occurs among African Americans (one in six). PTB is a complex trait, likely determined by multiple environmental and genetic factors and their interactions. We demonstrated strong familial aggregation of preterm birth and low birthweight in US Blacks and Whites (Wang et al, NEJM, 1995) and conducted the largest candidate gene study of preterm birth at that time (Hao et al, HMG, 2004). We showed that a subset of mothers with certain metabolic gene variants are particularly vulnerable to the adverse effects of cigarette smoking on low birthweight and preterm births (Wang et al, JAMA, 2002). We also published a number of papers that examined the effect of maternal pre-pregnancy BMI, micronutrient status, stress and environmental toxins on the risk of preterm birth and related conditions. This project, supported by a grant from the NICHD (2R01HD41702, PI, Xiaobin Wang), aimed to conduct a genome-wide association study (GWAS) and apply advanced statistical methods to identify susceptibility loci of PTB in a predominantly urban low-income African American sample, a subset of the Boston Birth Cohort. PUBLIC HEALTH RELEVANCE: We anticipate that this study will lead to the identification of novel genetic loci of PTB and gene-environment interactions. Such findings not only will provide important insights into mechanisms leading to PTB, but also may help identify women at high risk of PTB, which in turn may lead to the development of early and targeted interventions that can prevent PTB or mitigate the severity and consequences of PTB.
The biomarker development study consisted of two parts: discovery and validation. The first part was the discovery and verification phase of biomarkers using two different platforms: transcriptomic and miRNA. The salivary transcriptomes of 63 GC samples and 31 non-GC controls were profiled using Affymetrix HG U133+2.0 microarrays (Affymetrix, Santa Clara, CA). The identified exRNA candidates were verified by quantitative real-time PCR (RT-qPCR) using all 94 of the original samples. In the discovery phase for the miRNA biomarkers, 10 early-stage GC samples and 10 non-GC controls were selected. The salivary miRNAs of these samples (n=20) were profiled using the TaqMan MicroRNA Array (Applied Biosystems, Foster City, CA). MicroRNA candidates were verified using TaqMan miRNA Assay (Thermo Scientific, Grand Island, NY). The second part of the study was to validate these verified exRNA biomarker candidates with exRNA samples extracted from an independent cohort of 100 GC and 100 non-GC saliva samples. The cohort was not balanced for demographics on gender and smoking history but more accurately reflected the diagnostic setting where our proposed final model could be implemented. Reprinted from "Li F, Yoshizawa MJ, Kim K, Kanjanapangka J, Grogan T, Wang X, Elashoff D, Ishikawa S, Chia D, Liao W, Akin D, Yan X, Lee M, Choi R, Kim S, Kang S, Bae J, Sohn T, Lee J, Choi M, Min B, Lee J, Kim J, Kim Y, Kim S, Wong D. (2018) Development and Validation of Salivary Extracellular RNA Biomarkers for Noninvasive Detection of Gastric Cancer. Clin Chem. PMID: 30097497 DOI: 10.1373/clinchem.2018.290569", with permission from American Association for Clinical Chemistry (United States).
The Tourette International Collaborative Genetics (TIC Genetics) Study is an international collaboration of scientists and clinicians specialized in Tourette Disorder (TD) from more than 20 sites across the United States, Europe, and South Korea. The study was established to further our understanding of the genetic architecture of tic disorders by developing a large sample of genotypically and phenotypically well-characterized affected probands and their relatives. We employ state-of-the-art genetic technologies to identify major genetic variants contributing to TD and the most commonly comorbid disorders, such as Obsessive-Compulsive Disorder (OCD) and Attention-Deficit/Hyperactivity Disorder (ADHD). TIC Genetics is a direct result of work of the New Jersey Center for Tourette Syndrome (NJCTS) Sharing Repository (Heiman et al., 2008; PMID: 19036136), funded by a grant from NJCTS Center of Excellence. Established in 2011 (Dietrich et al., 2015; PMID: 24771252), the TIC Genetics study focuses both on familial genetic variants with large effects within multiplex affected pedigrees and on de novo mutations ascertained through the analysis of apparently simplex parent-child trios with non-familial tics. In May 2017, we published a whole-exome sequencing study on 311 apparently simplex parent-child trios (Willsey et al., 2017; PMID: 28472652). These data, both phenotypes and sequencing data, are available through dbGaP. There were 120 subject samples included in the publication that did not have consent for sharing. These are excluded from dbGaP. In November 2021, we published a whole-exome sequencing study on 13 multiplex TD families (Cao et al., 2021). These data, both phenotypes and sequencing data, are available through dbGaP.
The focus of this study is to identify and test both common and rare genetic variants that elevate risk for CL/P, and to identify genetic variants associated with specific orofacial cleft (OFC) phenotypes in the population that has accumulated the greatest genetic variation in the human race. We hypothesize that bilateral complete cleft lip and palate (BCLP), the most clinically severe form of OFC, is associated with a higher mutation load than less severe forms (cleft lip only and unilateral cleft lip and palate), and that focusing on BCLP will facilitate the discovery of novel risk variants.