This dataset contains the RNA and ChIP sequencing data from the study Kalirin-RAC controls nucleokinetic migration in ADRN-type neuroblastoma. The data are organized into 7 experiments, divided by sequencing technology and by the application of siRNA or drug interventions (or lack thereof) to neuroblastoma cell lines. The experiment names and file names in each experiment were chosen to help future users of the data replicate the analyses in the manuscript.
We analyzed multiple myeloma samples from two patients included in the observational prospective cohort MYRACLE before talquetamab treatment and after relapse. Five other myelomas from the same cohort were included for comparison. Normal plasma cells were also retrieved. All samples were analyzed by whole genome sequencing and single-nucleus Multiome, except one that could only be analyzed by bulk RNA sequencing.
64 left atrial appendages from patients without atrial fibrillation (AF) undergoing cardiac surgery, patients with paroxysmal AF, and patients with persistent AF (~20 per group). TRIzol RNA isolation, rRNA depletion, and paired transcriptome sequencing on an Illumina NovaSeq 6000. FASTQ and BAM files are provided. Additional data (e.g., clinical characteristics, RIN values) can be provided upon reasonable request. The same RNA samples (62 out of 64) were used for miRNA sequencing. Results from miRNA sequencing are stored in the EGA database managed by the same DAC.
South African breast cancer GWAS genotype data for 2823 female African breast cancer cases. The data was generated using the H3Africa Custom microarray and genotyped on the Illumina HiScan instrument. The dataset is in VCF format.
We propose to definitively characterise the somatic genetics of breast cancer through generation of comprehensive catalogues of somatic mutations in breast cancer cases by high coverage genome sequencing coupled with integrated transcriptomic and methylation analyses.
snRNA-seq performed on patient tumours (n = 6 patients, one biopsy sample each sequenced) using the 10x technology. Single nuclei were acquired from six frozen mCRPC biopsies (4 lymph node, 2 liver metastases) obtained from consenting patients treated at the Royal Marsden Hospital between 2018 and 2023 under an institutional review board approved research protocol (Research Ethics Committee approval number: 04/Q0801/60). All six patients had previously received androgen-receptor signalling inhibitor(s) and five of six patients had previously received taxane(s).
Raw and processed ATAC-seq data from the VOA1066 undifferentiated endometrial carcinoma cell line (10 samples). Samples were treated with DMSO, dBRD9, or compound 12, without doxycycline; some samples instead received doxycycline-induced ARID1A treatment, or doxycycline alone. Samples were sequenced on an Illumina NextSeq 500 using 37 bp paired-end sequencing parameters. Raw FASTQ files, bigWig files giving normalized genomic coverage per region, and called broad peaks are included for each sample.
This dataset contains snRNA-seq data from 11 regionally sampled GBM tissue samples (peritumoral region, tumor edge, and tumor core). Regionally sampled GBM patient tissue was dissociated and nuclei were processed in an unbiased manner without any sorting procedure. Nuclei were dissociated from frozen tissue using the Chromium Nuclei Isolation Kit. Nuclei barcoding, cDNA preparation, and library construction were performed following the Evercode WT or WT mini User Manual, using combinatorial barcoding to assign a unique barcode to each cell.
Nonsyndromic cleft lip and palate (NSCL/P) is a complex disorder caused by both genetic and environmental factors and has been the focus of an extensive effort to identify genetic risk factors. A number of candidate gene studies have been performed but have not been widely replicated. To date, four independent genome-wide association studies have been performed as well as a meta-analysis. Together these studies have identified many loci associated with NSCL/P. The goal of this project is to use targeted sequencing to further characterize these regions and to progress from the association signals identified by GWAS to the identification of causative genes and/or variants. This study is part of the GWASeq project, a collaboration of five disease studies, which will sequence genomic regions from GWAS to characterize the genetic variation underlying these diseases and to compare study design and methods for the follow-up of GWAS by sequencing. The goal of this study is to sequence 1000+ NSCL/P case-parent trios from China and the Philippines and 400 trios of European ancestry. Targeted sequencing was performed on intervals ranging from 60 kb to 1 Mb surrounding 13 genes/loci previously associated with NSCL/P, including: IRF6, MAFB, ARHGAP29, 8q24, PAX7, VAX1, NTN1, NOG, FOXE1, MSX1, BMP4, FGFR2, PTCH1.
Human cancer cell lines are widely used in the search for new antineoplastic agents. However, due to the artifacts of long-term culture, cell lines do not always represent realistic tumor cell behavior. This has motivated the development of models that better mimic the tumor tissue, among them the establishment of primary cell cultures. In this work, we established and characterized a low-passage cervical cancer cell line from a Brazilian patient with squamous cell carcinoma. The phenotype confirms the epithelial and tumor origin, through cytokeratin, EpCAM, and p16 staining. Whole exome sequencing showed relevant somatic mutations in several genes, including BRCA2, TGFBR1, and IRX2. CNV analysis by NanoString and WGS revealed amplification of genes mainly related to kinase proteins involved in proliferation, migration, and cell differentiation, such as EGFR, PIK3CA, and MAPK7. Overexpression of EGFR was confirmed by phospho-RTK array and western blot analysis. Furthermore, the cell line was sensitive to cisplatin, with an IC50 13 times lower than that of the SiHa cell line. In conclusion, this cervical cancer cell line presents molecular alterations that make it an important tool for pre-clinical studies of new drugs that target one or more of the altered pathways.
This is a study to determine the efficacy of androgen receptor (AR) inhibitors in LAR (luminal androgen receptor)-enriched triple-negative breast cancer (TNBC) in the neoadjuvant setting. Twenty-four patients were treated with the neoadjuvant AR inhibitor enzalutamide and paclitaxel for 12 weeks. Whole exome sequencing and RNA sequencing were performed prior to treatment. The data for only two patients are consented for release through dbGaP. The remaining data are available under a Materials Transfer Agreement with the University of Texas MD Anderson Cancer Center.
The SardiNIA Medical Sequencing Discovery Project studies the genetics of blood lipid levels and personality in a Sardinian population cohort. The project has generated draft genome sequences for approximately 2,000 individuals using whole genome shotgun sequencing. The draft sequences will allow investigators to evaluate the contribution of common and rare single nucleotide polymorphisms, short insertions and deletions, large copy number polymorphisms and other structural variants to blood levels of low density lipoprotein cholesterol (LDL-c), high density lipoprotein cholesterol (HDL-c) and triglycerides (TG), all of which are key risk factors for cardiovascular disease, and to the 5 domains of personality as assessed by the NEO-PI-R questionnaire. The two traits represent different ends of the spectrum of medically interesting complex traits. Blood lipid levels are a risk factor for cardiovascular disease for which genetic studies have been very successful. In contrast, personality traits and other behavioral phenotypes represent a set of phenotypes that have proven more challenging to dissect using standard genetic tools. In both cases, we expect whole genome sequencing to improve our understanding of the underlying biology. The isolated Sardinian population is ideal for this type of study for several reasons, in particular because: (i) the bottleneck that occurred after colonization of the island attenuated natural selection against alleles with phenotypic consequences, increasing the odds that functional alleles will reach modest frequencies (0.5 - 5.0%) and will be detected in the present study and (ii) sharing of long haplotype stretches surrounding rare variants will facilitate imputation based analyses of shotgun sequence data, which improve the accuracy of individual genotype calls and thus increase power. 
This research helps advance NIH's mission by furthering our understanding of the genetic factors contributing to blood lipid levels and coronary heart disease and to personality, behavior and mental health. In addition, these data should facilitate development of analysis tools and strategies that can be used to study the genomes of hundreds to thousands of individuals and further our understanding of the genetics and biology of many different traits and conditions.
Whole exome sequencing (WES) and copy number variation (CNV) analysis were performed on breast samples collected from non-invasive breast cancer patients in order to analyze somatic mutations related to breast tumor phenotypes.
Cancers are often defined by the dysregulation of specific transcriptional programs; however, the importance of global transcriptional changes is less understood. Hypertranscription is the genome-wide increase in RNA output. Hypertranscription’s prevalence, underlying drivers and prognostic significance are undefined in primary human cancer. This is due in part to limitations of expression profiling methods, which assume equal RNA output between samples. Here, we developed a computational method to directly measure hypertranscription in 7,494 human tumors, spanning 31 cancer types. Hypertranscription is ubiquitous across cancer, especially in aggressive disease. It defines patient subgroups with worse survival, even within well-established subtypes. Our data suggest that loss of transcriptional suppression underpins the hypertranscriptional phenotype. Single-cell analysis reveals hypertranscriptional clones, which dominate transcript production regardless of their size. Finally, patients with hypertranscribed mutations have improved response to immune checkpoint therapy. Our results provide fundamental insights into gene dysregulation across human cancers and may prove useful in identifying patients that would benefit from novel therapies.
TRACERx (TRAcking Cancer Evolution through therapy (Rx)) is a prospective cohort study designed to investigate intratumor heterogeneity (ITH) in relation to clinical outcome, and to determine the clonal nature of driver events and evolutionary processes in early-stage non-small cell lung cancer (NSCLC). This study covers the multi-region RNAseq data from those members of the TRACERx100 cohort for whom RNA of sufficient quality was available. There is RNAseq data from 164 regions (64 patients).
This study is a longitudinal multidisciplinary investigation on the natural history, morbidity and mortality of Angelman Syndrome (AS). We will collect detailed longitudinal data on a cohort of AS individuals to gain a better understanding of the disease progression, and to follow the natural history of the clinical features of this cohort including assessment of quality of life and longevity. The participants to be recruited for the study will include 1) patients who have a documented molecular diagnosis of AS and 2) patients with a clear clinical diagnosis of Angelman Syndrome as determined by the Principal Investigator (PI) and the Co-investigators in this study but who do not have a known molecular defect. One of the goals of the natural history study will be to characterize the phenotypic differences between patients with Class I deletions and those with Class II deletions, particularly with respect to the issue of autism. A blood sample may be collected on the participants in order to create a DNA repository, and in some cases, to establish cell lines if further material is required for molecular studies. Alternatively, DNA may be obtained by buccal mucosa swabs/brushing. In those AS patients with known deletions involving the 15q11-q13 regions, a blood sample will be collected to perform comparative genome hybridization (CGH) microarray studies to characterize the extent of the deletion. No drugs or treatments will be administered through this protocol. In rare instances, a skin biopsy to establish a fibroblast cell line may be requested (using separate consent).
The primary determinant of disease severity in patients with severe Duchenne muscular dystrophy (DMD) or milder Becker muscular dystrophy (BMD) is whether their dystrophin gene (DMD) mutation disrupts the mRNA reading frame or permits the expression of a partially functional protein. However, even in the complete absence of dystrophin, variability in disease severity is observed, with candidate gene studies implicating several genes as potential DMD modifiers. This study undertakes a comprehensive genome-wide search for modifier loci influencing disease severity in DMD patients. The availability of subjects for such studies remains limited, resulting in modest sample sizes that challenge the GWAS design. To address this, we have implemented measures to minimize heterogeneity within the dataset at the dystrophin (DMD) gene itself, adopting a conservative approach to DMD mutation classification to limit the possibility of residual dystrophin expression. Additionally, the study employed statistical methods that are well-suited to smaller sample sizes, including the use of a novel linear regression-like residual for time to ambulatory loss and the application of evidential statistics, specifically the Posterior Probability of Linkage Disequilibrium (PPLD), to assess trait-SNP associations in the GWAS framework. With a sample size of 419 patients, this study has identified multiple candidate genetic modifier loci. The molecular data available in dbGaP includes 2,562,265 directly genotyped variants in 419 people using the Illumina Infinium Omni2.5Exome-8 Beadchip assay, achieving a genotyping rate of 0.998. The phenotypic data comprises age at ambulatory loss, steroid status, DMD gene mutation, and the PPLD value from the time-to-event (TE) phenotype.
Group 3 (G3) medulloblastoma (MB) is one of the deadliest forms of the disease, for which novel treatment is desperately needed. Here we evaluate ribociclib, a highly selective CDK4/6 inhibitor, with gemcitabine in mouse and human G3 MBs. Ribociclib CNS penetration was assessed by in vivo microdialysis and by immunohistochemistry and gene expression studies. Survival studies to determine the efficacy of the ribociclib and gemcitabine combination were performed on mice orthotopically implanted with luciferase-labelled mouse and human G3 MB. Pharmacokinetic-pharmacodynamic outcomes and univariable survival models were analyzed to estimate survival. Gene activity inference using NetBID and tumor differentiation analysis investigated the effects of the combination after short- and long-term treatments. Tumors from mice treated with oral ribociclib displayed inhibited RB phosphorylation, downregulated E2F target genes, and decreased proliferation. Treatment of mice with the combination of ribociclib and gemcitabine was well tolerated, slowed tumor progression and metastatic spread, and increased survival. Molecular analysis of treated versus untreated tumors showed a significant decrease in the activity and expression of genes involved in cell cycle progression and DNA damage response, and an increase in the activity and expression of genes implicated in neuronal identity and neuronal differentiation. Ribociclib is CNS-penetrant. When combined with gemcitabine in orthotopic G3 MB models, it improved survival. Our findings, with both mouse and human patient-derived orthotopic xenograft models, suggest that this combination therapy has promise for children with G3 MB and may represent an effective treatment strategy for other CNS malignancies.
Naïve (CD27-IgD+) B cells were isolated from buffy coat preparations of healthy donors using CD19 magnetic beads, followed by release of the CD19 beads and incubation with IgD-biotin and anti-biotin magnetic beads. B cells were infected with EBV by spinoculation or stimulated with heat-inactivated EBV, and control cells were left uninfected. RNA was extracted from uninfected B cells immediately after isolation. From EBV-infected B cells and B cells stimulated with heat-inactivated virus, RNA was extracted 24 and 96 hours after infection or stimulation.
Epstein-Barr virus-transformed B-lymphoblastoid cell lines from six individuals (HG00114, HG00282, NA12005, NA12044, NA12717 and NA12751) of the GEUVADIS Project were selected for ribosomal profiling. For each EBV-LCL, 20 million cells were treated with 2 μg/ml harringtonine and 100 μg/ml cycloheximide. After lysing cells in lysis buffer supplemented with 100 μg/ml cycloheximide, ribosome complexes were purified by density purification. Small RNA molecules were isolated using the NucleoSpin miRNA Kit (Bioke, Leiden, Netherlands), followed by RNA PAGE gel separation. Universal linkers were added after dephosphorylation. After reverse transcription, the cDNA was circularized and, after ribosomal RNA depletion, barcoded and sequenced on Illumina NextSeq 500.
We discovered genomic and epigenomic dysregulation originating at sites of human papillomavirus (HPV) integration following the Oxford Nanopore long-read analysis of 72 cervical cancer genomes. The integration events had allele-specific effects on the genome, methylome, and transcriptome, which sometimes resulted in the activation of cancer genes. We also examined the 72 samples using Illumina short-read analysis in phs000528.
Agilent whole exome hybridisation capture was performed on genomic DNA derived from chondrosarcoma and on matched normal DNA from the same patients. Next-generation sequencing was performed on the resulting exome libraries, and reads were mapped to build 37 of the human reference genome to facilitate the identification of novel cancer genes. We now aim to re-find and validate the findings from those exome libraries using bespoke pulldown methods and sequencing the products.
Both control and vascularized organoids were processed using the 10x Chromium 3' RNA method. Sequencing reads were aligned with STARsolo.
The FAA Functional Genomics Team examined the feasibility of analyzing gene expression profiles using post-mortem brain, lung, muscle, and blood samples collected from aviation accident fatalities, with the goal of validating post-mortem extracted RNA for use in RNA sequencing. We compared the RNA sequencing results obtained from THC-positive and THC-negative samples to determine whether RNA sequencing on samples collected from aviation accident fatalities was possible and whether any differences in gene expression associated with THC consumption were detectable post-mortem. The study population included 57 males of indeterminate age who perished in general aviation accidents and for whom post-mortem toxicological testing identified delta-9-THC or its primary or secondary metabolites, as well as negative controls. We determined that RNA sequencing on post-mortem aviation accident fatality samples is possible, and further identified genes showing differential expression in lung and muscle between THC-positive and THC-negative samples. We conclude that post-mortem tissue samples collected from aviation accident fatalities are suitable for gene expression profiling, but caution researchers that such samples have low RNA integrity numbers and that some degree of microbial contamination is to be expected. Raw sequencing files (.fastq) from this study are available in dbGaP.
Experiments using targeted pulldown methods will be sequenced to validate findings in the exomes of patients with Myeloproliferative Neoplasms (MPN).
Dataset includes 160 double-stranded RNAseq libraries collected from 16 patients with adult diffuse glioma. The majority of these samples were spatially-mapped during sample collection, enabling the genomic information derived from them to be mapped in 3D space.
This DAC oversees controlled access to human RNA-seq data generated as part of the project “ABHD11 inhibition and T cell function in autoimmunity” at Swansea University. The DAC will evaluate and approve access requests based on scientific merit and alignment with participant consent. Data requests will be considered from academic researchers and healthcare professionals working on immunology or autoimmunity.
We collected fresh tissue from an untreated GBM (SF10345) directly from the operating room and subjected the biopsy to single-cell RNA-seq with the Fluidigm C1 machine, resulting in sequencing libraries from 96 individual cells.
We collected fresh tissue from an untreated GBM (SF10282) directly from the operating room and subjected the biopsy to single-cell RNA-seq with the Fluidigm C1 machine, resulting in sequencing libraries from 96 individual cells.
We collected fresh tissue from an untreated GBM (SF10360) directly from the operating room and subjected the biopsy to single-cell RNA-seq with the Fluidigm C1 machine, resulting in sequencing libraries from 96 individual cells.
We collected fresh tissue from an untreated GBM (SF10679) directly from the operating room and subjected the biopsy to single-cell RNA-seq with the Fluidigm C1 machine, resulting in sequencing libraries from 96 individual cells.
We collected fresh tissue from an untreated GBM (SF10281) directly from the operating room and subjected the biopsy to single-cell RNA-seq with the Fluidigm C1 machine, resulting in sequencing libraries from 96 individual cells.
We collected fresh tissue from an untreated GBM (SF10592) directly from the operating room and subjected the biopsy to single-cell RNA-seq with the Fluidigm C1 machine, resulting in sequencing libraries from 96 individual cells.
This submission includes raw FASTQ files (for bulk RNA-seq and 10X joint snATAC+snRNA multiome profiling experiments), sample phenotype files, and genotypes for the data included in the manuscript.
This dataset consists of RNA-seq data from human monocytes, monocyte-derived dendritic cells or monocyte-derived macrophages as well as monocyte-derived cells that were subjected to siRNA treatment targeting CTCF or RAD21. In total, it includes 63 samples.
To generate an RNA-Seq dataset for organoids apically stimulated with Salmonella Typhimurium. These data are part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/
To identify dysfunctional neuronal subtypes underlying seizure activity in the human brain, we have performed single-nucleus transcriptomics analysis of >110,000 neuronal transcriptomes derived from temporal cortex samples of multiple temporal lobe epilepsy and non-epileptic subjects.
Pancreatic cancers arise from two different precursors: intraductal papillary mucinous neoplasms (IPMN) and pancreatic intraepithelial neoplasia (PanIN). However, the biological differences between the cancers originating from them remain obscure. Here, we analyzed their genomic and transcriptomic landscapes using patient-derived organoids.
Using a novel sorting strategy, we performed ultra-low-input RNAseq on FACS-sorted populations from diagnostic DNMT3Amut and NPM1mut AML patients. Primary samples were retrospectively collected based on their mutational profile. Samples were thawed, stained, and FACS-sorted using a combination of lineage markers, CD34, GPR56, and NKG2D ligands. RNA was extracted and libraries were prepared from 13 samples.
DNA methylation is an important epigenetic feature in health and disease. Two cost-efficient genome-scale methodologies to assess DNA methylation are MethylCap-seq and Illumina's Infinium HumanMethylation450 BeadChips (HM450). However, objective information regarding the best-suited methodology for a specific research question is scant. Therefore, we performed a large-scale evaluation on a set of 70 brain tissue samples obtained from 65 glioblastoma and 5 non-tumoral brain tissues, using a gold-standard-free Bayesian modeling procedure. While conditional specificity was adequate for both approaches, conditional sensitivity was systematically higher for HM450. The genome-wide characteristics were also compared, revealing that the HM450 probes assess less than 10% of the regions identified as methylated by MethylCap-seq. Hence the latter method may detect more potentially relevant DNA methylation, defined by either functional location or previously reported differentially methylated candidate regions. Our results therefore indicate that, at least for the tissue under study, both methodologies are complementary, with a higher sensitivity for HM450 but a far larger genome-wide coverage for MethylCap-seq. Note that only the relevant MethylCap-seq data are deposited here; for the HM450 data we refer to GEO (GSE60274).
Original description of the study: From ELLIPSE (linked to the PRACTICAL consortium), we contributed ~78,000 SNPs to the OncoArray. A large fraction of the content was derived from the GWAS meta-analyses in European ancestry populations (overall and aggressive disease; ~27K SNPs). We also selected just over 10,000 SNPs from the meta-analyses in the non-European populations, with a majority of these SNPs coming from the analysis of overall prostate cancer in African ancestry populations as well as from the multiethnic meta-analysis. A substantial fraction of SNPs (~28,000) were also selected for fine-mapping of 53 loci not included in the common fine-mapping regions (tagging at r2>0.9 across ±500kb regions). We also selected a few thousand SNPs associated with PSA levels and/or disease survival, SNPs from candidate lists provided by study collaborators, and SNPs from meta-analyses of exome SNP chip data from the Multiethnic Cohort and UK studies. The Contributing Studies: Aarhus: Hospital-based, Retrospective, Observational. Source of cases: Patients treated for prostate adenocarcinoma at Department of Urology, Aarhus University Hospital, Skejby (Aarhus, Denmark). Source of controls: Age-matched males treated for myocardial infarction or undergoing coronary angioplasty, but with no prostate cancer diagnosis based on information retrieved from the Danish Cancer Register and the Danish Cause of Death Register. AHS: Nested case-control study within prospective cohort. Source of cases: linkage to cancer registries in study states. Source of controls: matched controls from cohort. ATBC: Prospective, nested case-control. Source of cases: Finnish male smokers aged 50-69 years at baseline. Source of controls: Finnish male smokers aged 50-69 years at baseline. BioVU: Cases identified in a biobank linked to electronic health records.
Source of cases: A total of 214 cases were identified in the VUMC de-identified electronic health records database (the Synthetic Derivative) and shipped to USC for genotyping in April 2014. The following criteria were used to identify cases: Age 18 or greater; male; African Americans (Black) only. Note that African ancestry is not self-identified; it is administratively or third-party assigned (which has been shown to be highly correlated with genetic ancestry for African Americans in BioVU; see references). Source of controls: Controls were identified in the de-identified electronic health record. Unfortunately, they were not age matched to the cases, and therefore cannot be used for this study. Canary PASS: Prospective, Multi-site, Observational Active Surveillance Study. Source of cases: clinic based from Beth Israel Deaconess Medical Center, Eastern Virginia Medical School, University of California at San Francisco, University of Texas Health Sciences Center San Antonio, University of Washington, VA Puget Sound. Source of controls: N/A. CCI: Case series, Hospital-based. Source of cases: Cases identified through clinics at the Cross Cancer Institute. Source of controls: N/A. CerePP French Prostate Cancer Case-Control Study (ProGene): Case-Control, Prospective, Observational, Hospital-based. Source of cases: Patients, treated in French departments of Urology, who had histologically confirmed prostate cancer. Source of controls: Controls were recruited as participating in a systematic health screening program and found unaffected (normal digital rectal examination and total PSA < 4 ng/ml, or negative biopsy if PSA > 4 ng/ml). COH: hospital-based cases and controls from outside. Source of cases: Consented prostate cancer cases at City of Hope. Source of controls: Consented unaffected males that were part of other studies where they consented to have their DNA used for other research studies. COSM: Population-based cohort. Source of cases: General population.
Source of controls: General population. CPCS1: Case-control - Denmark. Source of cases: Hospital referrals. Source of controls: Copenhagen General Population Study. CPCS2: Source of cases: Hospital referrals. Source of controls: Copenhagen General Population Study. CPDR: Retrospective cohort. Source of cases: Walter Reed National Military Medical Center. Source of controls: Walter Reed National Military Medical Center. ACS_CPS-II: Nested case-control derived from a prospective cohort study. Source of cases: Identified through self-report on follow-up questionnaires and verified through medical records or cancer registries, identified through cancer registries or the National Death Index (with prostate cancer as the primary cause of death). Source of controls: Cohort participants who were cancer-free at the time of diagnosis of the matched case, also matched on age (±6 mo) and date of biospecimen donation (±6 mo). EPIC: Case-control - Germany, Greece, Italy, Netherlands, Spain, Sweden, UK. Source of cases: Identified through record linkage with population-based cancer registries in Italy, the Netherlands, Spain, Sweden and UK. In Germany and Greece, follow-up is active and achieved through checks of insurance records and cancer and pathology registries as well as via self-reported questionnaires; self-reported incident cancers are verified through medical records. Source of controls: Cohort participants without a diagnosis of cancer. EPICAP: Case-control, Population-based, ages less than 75 years at diagnosis, Hérault, France. Source of cases: Prostate cancer cases in all public hospitals and private urology clinics of the département of Hérault in France. Case validation by the Hérault Cancer Registry. Source of controls: Population-based controls, frequency age matched (5-year groups). Quotas by socio-economic status (SES) in order to obtain a distribution by SES among controls identical to the SES distribution among general population men, conditional on age.
ERSPC: Population-based randomized trial. Source of cases: Men with PrCa from screening arm ERSPC Rotterdam. Source of controls: Men without PrCa from screening arm ERSPC Rotterdam. ESTHER: Case-control, Prospective, Observational, Population-based. Source of cases: Prostate cancer cases in all hospitals in the state of Saarland, from 2001-2003. Source of controls: Random sample of participants from routine health check-up in Saarland, in 2000-2002. FHCRC: Population-based, case-control, ages 35-74 years at diagnosis, King County, WA, USA. Source of cases: Identified through the Seattle-Puget Sound SEER cancer registry. Source of controls: Randomly selected, age-frequency matched residents from the same county as cases. Gene-PARE: Hospital-based. Source of cases: Patients that received radiotherapy for treatment of prostate cancer. Source of controls: n/a. Hamburg-Zagreb: Hospital-based, Prospective. Source of cases: Prostate cancer cases seen at the Department of Oncology, University Hospital Center Zagreb, Croatia. Source of controls: Population-based (Croatia), healthy men, older than 50, with no medical record of cancer, and no family history of cancer (1st & 2nd degree relatives). HPFS: Nested case-control. Source of cases: Participants of the HPFS cohort. Source of controls: Participants of the HPFS cohort. IMPACT: Observational. Source of cases: Carriers and non-carriers (with a known mutation in the family) of the BRCA1 and BRCA2 genes, aged between 40 and 69, who are undergoing prostate screening with annual PSA testing. This cohort has been diagnosed with prostate cancer during the study. Source of controls: Carriers and non-carriers (with a known mutation in the family) of the BRCA1 and BRCA2 genes, aged between 40 and 69, who are undergoing prostate screening with annual PSA testing. This cohort has not been diagnosed with prostate cancer during the study. IPO-Porto: Hospital-based. Source of cases: Early onset and/or familial prostate cancer.
Source of controls: Blood donors Karuprostate: Case-control, Retrospective, Population-based. Source of cases: From FWI (Guadeloupe): 237 consecutive incident patients with histologically confirmed prostate cancer attending public and private urology clinics; From Democratic Republic of Congo: 148 consecutive incident patients with histologically confirmed prostate cancer attending the University Clinic of Kinshasa. Source of controls: From FWI (Guadeloupe): 277 controls recruited from men participating in a free systematic health screening program open to the general population; From Democratic Republic of Congo: 134 controls recruited from subjects attending the University Clinic of Kinshasa KULEUVEN: Hospital-based, Prospective, Observational. Source of cases: Prostate cancer cases recruited at the University Hospital Leuven. Source of controls: Healthy males with no history of prostate cancer recruited at the University Hospitals, Leuven. LAAPC: Subjects were participants in a population-based case-control study of aggressive prostate cancer conducted in Los Angeles County. Cases were identified through the Los Angeles County Cancer Surveillance Program rapid case ascertainment system. Eligible cases included African American, Hispanic, and non-Hispanic White men diagnosed with a first primary prostate cancer between January 1, 1999 and December 31, 2003. Eligible cases also had (a) prostatectomy with documented tumor extension outside the prostate, (b) metastatic prostate cancer in sites other than prostate, (c) needle biopsy of the prostate with Gleason grade ≥8, or (d) needle biopsy with Gleason grade 7 and tumor in more than two thirds of the biopsy cores. Eligible controls were men never diagnosed with prostate cancer, living in the same neighborhood as a case, and were frequency matched to cases on age (± 5 y) and race/ethnicity. 
Controls were identified by a neighborhood walk algorithm, which proceeds through an obligatory sequence of adjacent houses or residential units beginning at a specific residence that has a specific geographic relationship to the residence where the case lived at diagnosis. Malaysia: Case-control. Source of cases: Patients attended the outpatient urology or uro-onco clinic at University Malaya Medical Center. Source of controls: Population-based, age matched (5-year groups), ascertained through electoral register, Subang Jaya, Selangor, Malaysia MCC-Spain: Case-control. Source of cases: Identified through the urology departments of the participating hospitals. Source of controls: Population-based, frequency age and region matched, ascertained through the rosters of the primary health care centers MCCS: Nested case-control, Melbourne, Victoria. Source of cases: Identified by linkage to the Victorian Cancer Registry. Source of controls: Cohort participants without a diagnosis of cancer MD Anderson: Participants in this study were identified from epidemiological prostate cancer studies conducted at the University of Texas MD Anderson Cancer Center in the Houston Metropolitan area. Cases were accrued in the Houston Medical Center and were not restricted with respect to Gleason score, stage or PSA. Controls were identified via random-digit-dialing or among hospital visitors and they were frequency matched to cases on age and race. Lifestyle, demographic, and family history data were collected using a standardized questionnaire. MDACC_AS: A prospective cohort study. Source of cases: Men with clinically organ-confined prostate cancer meeting eligibility criteria for a prospective cohort study of active surveillance at MD Anderson Cancer Center. Source of controls: N/A MEC: The Multiethnic Cohort (MEC) is comprised of over 215,000 men and women recruited from Hawaii and the Los Angeles area between 1993 and 1996. 
Between 1995 and 2006, over 65,000 blood samples were collected from participants for genetic analyses. To identify incident cancer cases, the MEC was cross-linked with the population-based Surveillance, Epidemiology and End Results (SEER) registries in California and Hawaii, and unaffected cohort participants with blood samples were selected as controls MIAMI (WFPCS): Prostate cancer cases and controls were recruited from the Departments of Urology and Internal Medicine of the Wake Forest University School of Medicine using sequential patient populations as described previously (PMID:15342424). All study subjects received a detailed description of the study protocol and signed their informed consent, as approved by the medical center's Institutional Review Board. The general eligibility criteria were (i) able to comprehend informed consent and (ii) without previously diagnosed cancer. The exclusion criteria were (i) clinical diagnosis of autoimmune diseases; (ii) chronic inflammatory conditions; and (iii) infections within the past 6 weeks. Blood samples were collected from all subjects. MOFFITT: Hospital-based. Source of cases: clinic based from Moffitt Cancer Center. Source of controls: Moffitt Cancer Center affiliated Lifetime cancer screening center NMHS: Case-control, clinic based, Nashville TN. Source of cases: All urology clinics in Nashville, TN. Source of controls: Men without prostate cancer at prostate biopsy. PCaP: The North Carolina-Louisiana Prostate Cancer Project (PCaP) is a multidisciplinary population-based case-only study designed to address racial differences in prostate cancer through a comprehensive evaluation of social, individual and tumor level influences on prostate cancer aggressiveness. PCaP enrolled approximately equal numbers of African Americans and Caucasian Americans with newly-diagnosed prostate cancer from North Carolina (42 counties) and Louisiana (30 parishes) identified through state tumor registries. 
African American PCaP subjects with DNA, who agreed to future use of specimens for research, participated in OncoArray analysis. PCMUS: Case-control - Sofia, Bulgaria. Source of cases: Patients of Clinic of Urology, Alexandrovska University Hospital, Sofia, Bulgaria, PrCa histopathologically confirmed. Source of controls: 72 patients with verified BPH and PSA<3,5; 78 healthy controls from the MMC Biobank, no history of PrCa PHS: Nested case-control. Source of cases: Participants of the PHS1 trial/cohort. Source of controls: Participants of the PHS1 trial/cohort PLCO: Nested case-control. Source of cases: Men with a confirmed diagnosis of prostate cancer from the PLCO Cancer Screening Trial. Source of controls: Controls were men enrolled in the PLCO Cancer Screening Trial without a diagnosis of cancer at the time of case ascertainment. Poland: Case-control. Source of cases: men with unselected prostate cancer, diagnosed in north-western Poland at the University Hospital in Szczecin. Source of controls: cancer-free men from the same population, taken from the healthy adult patients of family doctors in the Szczecin region PROCAP: Population-based, Retrospective, Observational. Source of cases: Cases were ascertained from the National Prostate Cancer Register of Sweden Follow-Up Study, a retrospective nationwide cohort study of patients with localized prostate cancer. Source of controls: Controls were selected among men referred for PSA testing in laboratories in Stockholm County, Sweden, between 2010 and 2012. PROGReSS: Hospital-based, Prospective, Observational. Source of cases: Prostate cancer cases from the Hospital Clínico Universitario de Santiago de Compostela, Galicia, Spain. Source of controls: Cancer-free men from the same population ProMPT: A study to collect samples and data from subjects with and without prostate cancer. Retrospective, Experimental. Source of cases: Subjects attending outpatient clinics in hospitals. 
Source of controls: Subjects attending outpatient clinics in hospitals ProtecT: Trial of treatment. Samples taken from subjects invited for PSA testing from the community at nine centers across United Kingdom. Source of cases: Subjects who have a proven diagnosis of prostate cancer following testing. Source of controls: Identified through invitation of subjects in the community. PROtEuS: Case-control, population-based. Source of cases: All new histologically-confirmed cases, aged less or equal to 75 years, diagnosed between 2005 and 2009, actively ascertained across Montreal French hospitals. Source of controls: Randomly selected from the Provincial electoral list of French-speaking men between 2005 and 2009, from the same area of residence as cases and frequency-matched on age. QLD: Case-control. Source of cases: A longitudinal cohort study (Prostate Cancer Supportive Care and Patient Outcomes Project: ProsCan) conducted in Queensland, through which men newly diagnosed with prostate cancer from 26 private practices and 10 public hospitals were directly referred to ProsCan at the time of diagnosis by their treating clinician (age range 43-88 years). All cases had histopathologically confirmed prostate cancer, following presentation with an abnormal serum PSA and/or lower urinary tract symptoms. Source of controls: Controls comprised healthy male blood donors with no personal history of prostate cancer, recruited through (i) the Australian Red Cross Blood Services in Brisbane (age range 19-76 years) and (ii) the Australian Electoral Commission (AEC) (age and post-code/ area matched to ProsCan, age range 54-90 years). RAPPER: Multi-centre, hospital based blood sample collection study in patients enrolled in clinical trials with prospective collection of radiotherapy toxicity data. Source of cases: Prostate cancer patients enrolled in radiotherapy trials: CHHiP, RT01, Dose Escalation, RADICALS, Pelvic IMRT, PIVOTAL. 
Source of controls: N/A SABOR: Prostate Cancer Screening Cohort. Source of cases: Men >45 yrs of age participating in annual PSA screening. Source of controls: Males participating in annual PSA prostate cancer risk evaluations (funded by NCI biomarkers discovery and validation grant), recruited through University of Texas Health Science Center at San Antonio and affiliated sites or through study advertisements, enrolment open to the community SCCS: Case-control in cohort, Southeastern USA. Prospective, Observational, Population-based. Source of cases: SCCS entry population. Source of controls: SCCS entry population SCPCS: Population-based, Retrospective, Observational. Source of cases: South Carolina Central Cancer Registry. Source of controls: Health Care Financing Administration beneficiary file SEARCH: Case-control - East Anglia, UK. Source of cases: Men < 70 years of age registered with prostate cancer at the population-based cancer registry, Eastern Cancer Registration and Information Centre, East Anglia, UK. Source of controls: Men attending general practice in East Anglia with no known prostate cancer diagnosis, frequency matched to cases by age and geographic region SNP_Prostate_Ghent: Hospital-based, Retrospective, Observational. Source of cases: Men treated with IMRT as primary or postoperative treatment for prostate cancer at the Ghent University Hospital between 2000 and 2010. Source of controls: Employees of the University hospital and members of social activity clubs, without a history of any cancer. SPAG: Hospital-based, Retrospective, Observational. Source of cases: Guernsey. Source of controls: Guernsey STHM2: Population-based, Retrospective, Observational. Source of cases: Cases were selected among men referred for PSA testing in laboratories in Stockholm County, Sweden, between 2010 and 2012. Source of controls: Controls were selected among men referred for PSA testing in laboratories in Stockholm County, Sweden, between 2010 and 2012. 
PCPT: Case-control from a randomized clinical trial. Source of cases: Randomized clinical trial. Source of controls: Randomized clinical trial SELECT: Case-cohort from a randomized clinical trial. Source of cases: Randomized clinical trial. Source of controls: Randomized clinical trial TAMPERE: Case-control - Finland, Retrospective, Observational, Population-based. Source of cases: Identified through linkage to the Finnish Cancer Registry and patient records; and the Finnish arm of the ERSPC study. Source of controls: Cohort participants without a diagnosis of cancer UGANDA: Uganda Prostate Cancer Study: Uganda is a case-control study of prostate cancer in Kampala Uganda that was initiated in 2011. Men with prostate cancer were enrolled from the Urology unit at Mulago Hospital and men without prostate cancer (i.e. controls) were enrolled from other clinics (i.e. surgery) at the hospital. UKGPCS: ICR, UK. Source of cases: Cases identified through clinics at the Royal Marsden hospital and nationwide NCRN hospitals. Source of controls: Ken Muir's control- 2000 ULM: Case-control - Germany. Source of cases: familial cases (n=162): identified through questionnaires for family history by collaborating urologists all over Germany; sporadic cases (n=308): prostatectomy series performed in the Clinic of Urology Ulm between 2012 and 2014. Source of controls: age-matched controls (n=188): age-matched men without prostate cancer and negative family history collected in hospitals of Ulm WUGS/WUPCS: Cases Series, USA. Source of cases: Identified through clinics at Washington University in St. Louis. Source of controls: Men diagnosed and managed with prostate cancer in University based clinic. Acknowledgement Statements: Aarhus: This study was supported by the Danish Strategic Research Council (now Innovation Fund Denmark) and the Danish Cancer Society. The Danish Cancer Biobank (DCB) is acknowledged for biological material. 
AHS: This work was supported by the Intramural Research Program of the NIH, National Cancer Institute, Division of Cancer Epidemiology and Genetics (Z01CP010119). ATBC: This research was supported in part by the Intramural Research Program of the NIH and the National Cancer Institute. Additionally, this research was supported by U.S. Public Health Service contracts N01-CN-45165, N01-RC-45035, N01-RC-37004, HHSN261201000006C, and HHSN261201500005C from the National Cancer Institute, Department of Health and Human Services. BioVu: The dataset(s) used for the analyses described were obtained from Vanderbilt University Medical Center's BioVU which is supported by institutional funding and by the National Center for Research Resources, Grant UL1 RR024975-01 (which is now at the National Center for Advancing Translational Sciences, Grant 2 UL1 TR000445-06). Canary PASS: PASS was supported by Canary Foundation and the National Cancer Institute's Early Detection Research Network (U01 CA086402). CCI: This work was awarded by Prostate Cancer Canada and is proudly funded by the Movember Foundation - Grant # D2013-36. The CCI group would like to thank David Murray, Razmik Mirzayans, and April Scott for their contribution to this work. CerePP French Prostate Cancer Case-Control Study (ProGene): None reported. COH: SLN is partially supported by the Morris and Horowitz Families Endowed Professorship. COSM: The Swedish Research Council, the Swedish Cancer Foundation. CPCS1 & CPCS2: Department of Clinical Biochemistry, Herlev and Gentofte Hospital, Copenhagen University Hospital, Herlev Ringvej 75, DK-2730 Herlev, Denmark. CPCS1 would like to thank the participants and staff of the Copenhagen General Population Study for their important contributions. CPDR: Uniformed Services University for the Health Sciences HU0001-10-2-0002 (PI: David G. McLeod, MD). CPS-II: The American Cancer Society funds the creation, maintenance, and updating of the Cancer Prevention Study II cohort. 
CPS-II thanks the participants and Study Management Group for their invaluable contributions to this research. We would also like to acknowledge the contribution to this study from central cancer registries supported through the Centers for Disease Control and Prevention National Program of Cancer Registries, and cancer registries supported by the National Cancer Institute Surveillance Epidemiology and End Results program. EPIC: The coordination of EPIC is financially supported by the European Commission (DG-SANCO) and the International Agency for Research on Cancer. The national cohorts are supported by the Danish Cancer Society (Denmark); the Deutsche Krebshilfe, Deutsches Krebsforschungszentrum and Federal Ministry of Education and Research (Germany); the Hellenic Health Foundation, Greek Ministry of Health; Greek Ministry of Education (Greece); the Italian Association for Research on Cancer (AIRC) and National Research Council (Italy); the Dutch Ministry of Public Health, Welfare and Sports (VWS), Netherlands Cancer Registry (NKR), LK Research Funds, Dutch Prevention Funds, Dutch ZON (Zorg Onderzoek Nederland), World Cancer Research Fund (WCRF); the Statistics Netherlands (The Netherlands); the Health Research Fund (FIS), Regional Governments of Andalucía, Asturias, Basque Country, Murcia and Navarra, Spanish Ministry of Health ISCIII RETIC (RD06/0020), Red de Centros RCESP, C03/09 (Spain); the Swedish Cancer Society, Swedish Scientific Council and Regional Government of Skåne and Västerbotten, Fundacion Federico SA (Sweden); the Cancer Research UK, Medical Research Council (United Kingdom). EPICAP: The EPICAP study was supported by grants from Ligue Nationale Contre le Cancer, Ligue départementale du Val de Marne; Fondation de France; Agence Nationale de sécurité sanitaire de l'alimentation, de l'environnement et du travail (ANSES). 
The EPICAP study group would like to thank all urologists, Antoinette Anger and Hasina Randrianasolo (study monitors), Anne-Laure Astolfi, Coline Bernard, Oriane Noyer, Marie-Hélène De Campo, Sandrine Margaroline, Louise N'Diaye, and Sabine Perrier-Bonnet (Clinical Research nurses). ERSPC: This study was supported by the Dutch Cancer Society (KWF 94-869, 98-1657, 2002-277, 2006-3518, 2010-4800), The Netherlands Organisation for Health Research and Development (ZonMW-002822820, 22000106, 50-50110-98-311, 62300035), The Dutch Cancer Research Foundation (SWOP), and an unconditional grant from Beckman-Coulter-Hybritech Inc. ESTHER: The ESTHER study was supported by a grant from the Baden Württemberg Ministry of Science, Research and Arts. The ESTHER group would like to thank Hartwig Ziegler, Sonja Wolf, Volker Hermann, Heiko Müller, Karina Dieffenbach, Katja Butterbach for valuable contributions to the study. FHCRC: The FHCRC studies were supported by grants R01-CA056678, R01-CA082664, and R01-CA092579 from the US National Cancer Institute, National Institutes of Health, with additional support from the Fred Hutchinson Cancer Research Center. FHCRC would like to thank all the men who participated in these studies. Gene-PARE: The Gene-PARE study was supported by grants 1R01CA134444 from the U.S. National Institutes of Health, PC074201 and W81XWH-15-1-0680 from the Prostate Cancer Research Program of the Department of Defense and RSGT-05-200-01-CCE from the American Cancer Society. Hamburg-Zagreb: None reported. HPFS: The Health Professionals Follow-up Study was supported by grants UM1CA167552, CA133891, CA141298, and P01CA055075. 
HPFS are grateful to the participants and staff of the Physicians' Health Study and Health Professionals Follow-Up Study for their valuable contributions, as well as the following state cancer registries for their help: AL, AZ, AR, CA, CO, CT, DE, FL, GA, ID, IL, IN, IA, KY, LA, ME, MD, MA, MI, NE, NH, NJ, NY, NC, ND, OH, OK, OR, PA, RI, SC, TN, TX, VA, WA, and WY. IMPACT: The IMPACT study was funded by The Ronald and Rita McAulay Foundation, CR-UK Project grant (C5047/A1232), Cancer Australia, AICR Netherlands A10-0227, Cancer Australia and Cancer Council Tasmania, NIHR, EU Framework 6, Cancer Councils of Victoria and South Australia, and Philanthropic donation to Northshore University Health System. We acknowledge support from the National Institute for Health Research (NIHR) to the Biomedical Research Centre at The Institute of Cancer Research and Royal Marsden Foundation NHS Trust. IMPACT acknowledges the IMPACT study steering committee, collaborating centres, and participants. IPO-Porto: The IPO-Porto study was funded by Fundação para a Ciência e a Tecnologia (FCT; UID/DTP/00776/2013 and PTDC/DTP-PIC/1308/2014) and by IPO-Porto Research Center (CI-IPOP-16-2012 and CI-IPOP-24-2015). MC and MPS are research fellows from Liga Portuguesa Contra o Cancro, Núcleo Regional do Norte. SM is a research fellow from FCT (SFRH/BD/71397/2010). IPO-Porto would like to express our gratitude to all patients and families who have participated in this study. Karuprostate: The Karuprostate study was supported by the French National Health Directorate and by the Association pour la Recherche sur les Tumeurs de la Prostate. Karuprostate thanks Séverine Ferdinand. KULEUVEN: F.C. and S.J. are holders of grants from FWO Vlaanderen (G.0684.12N and G.0830.13N), the Belgian federal government (National Cancer Plan KPC_29_023), and a Concerted Research Action of the KU Leuven (GOA/15/017). TVDB is holder of a doctoral fellowship of the FWO. 
LAAPC: This study was funded by grant R01CA84979 (to S.A. Ingles) from the National Cancer Institute, National Institutes of Health. Malaysia: The study was funded by the University Malaya High Impact Research Grant (HIR/MOHE/MED/35). Malaysia thanks all associates in the Urology Unit, University of Malaya, Cancer Research Initiatives Foundation (CARIF) and the Malaysian Men's Health Initiative (MMHI). MCCS: MCCS cohort recruitment was funded by VicHealth and Cancer Council Victoria. The MCCS was further supported by Australian NHMRC grants 209057, 251553, and 504711, and by infrastructure provided by Cancer Council Victoria. Cases and their vital status were ascertained through the Victorian Cancer Registry (VCR) and the Australian Institute of Health and Welfare (AIHW), including the National Death Index and the Australian Cancer Database. MCC-Spain: The study was partially funded by the Accion Transversal del Cancer, approved on the Spanish Ministry Council on the 11th October 2007, by the Instituto de Salud Carlos III-FEDER (PI08/1770, PI09/00773-Cantabria, PI11/01889-FEDER, PI12/00265, PI12/01270, and PI12/00715), by the Fundación Marqués de Valdecilla (API 10/09), by the Spanish Association Against Cancer (AECC) Scientific Foundation and by the Catalan Government DURSI grant 2009SGR1489. Samples: Biological samples were stored at the Parc de Salut MAR Biobank (MARBiobanc; Barcelona) which is supported by Instituto de Salud Carlos III FEDER (RD09/0076/00036). Also sample collection was supported by the Xarxa de Bancs de Tumors de Catalunya sponsored by Pla Director d'Oncologia de Catalunya (XBTC). MCC-Spain acknowledges the contribution from Esther Gracia-Lavedan in preparing the data. We thank all the subjects who participated in the study and all MCC-Spain collaborators. MD Anderson: Prostate Cancer Case-Control Studies at MD Anderson (MDA) supported by grants CA68578, ES007784, DAMD W81XWH-07-1-0645, and CA140388. 
MDACC_AS: None reported MEC: Funding provided by NIH grant U19CA148537 and grant U01CA164973. MIAMI (WFPCS): ACS MOFFITT: The Moffitt group was supported by the US National Cancer Institute (R01CA128813, PI: J.Y. Park). NMHS: Funding for the Nashville Men's Health Study (NMHS) was provided by the National Institutes of Health Grant numbers: RO1CA121060. PCaP only data: The North Carolina - Louisiana Prostate Cancer Project (PCaP) is carried out as a collaborative study supported by the Department of Defense contract DAMD 17-03-2-0052. For HCaP-NC follow-up data: The Health Care Access and Prostate Cancer Treatment in North Carolina (HCaP-NC) study is carried out as a collaborative study supported by the American Cancer Society award RSGT-08-008-01-CPHPS. For studies using both PCaP and HCaP-NC follow-up data please use: The North Carolina - Louisiana Prostate Cancer Project (PCaP) and the Health Care Access and Prostate Cancer Treatment in North Carolina (HCaP-NC) study are carried out as collaborative studies supported by the Department of Defense contract DAMD 17-03-2-0052 and the American Cancer Society award RSGT-08-008-01-CPHPS, respectively. For any PCaP data, please include: The authors thank the staff, advisory committees and research subjects participating in the PCaP study for their important contributions. For studies using PCaP DNA/genotyping data, please include: We would like to acknowledge the UNC BioSpecimen Facility and LSUHSC Pathology Lab for our DNA extractions, blood processing, storage and sample disbursement (https://genome.unc.edu/bsp). For studies using PCaP tissue, please include: We would like to acknowledge the RPCI Department of Urology Tissue Microarray and Immunoanalysis Core for our tissue processing, storage and sample disbursement. 
For studies using HCaP-NC follow-up data, please use: The Health Care Access and Prostate Cancer Treatment in North Carolina (HCaP-NC) study is carried out as a collaborative study supported by the American Cancer Society award RSGT-08-008-01-CPHPS. The authors thank the staff, advisory committees and research subjects participating in the HCaP-NC study for their important contributions. For studies that use both PCaP and HCaP-NC, please use: The authors thank the staff, advisory committees and research subjects participating in the PCaP and HCaP-NC studies for their important contributions. PCMUS: The PCMUS study was supported by the Bulgarian National Science Fund, Ministry of Education and Science (contract DOO-119/2009; DUNK01/2-2009; DFNI-B01/28/2012) with additional support from the Science Fund of Medical University - Sofia (contract 51/2009; 8I/2009; 28/2010). PHS: The Physicians' Health Study was supported by grants CA34944, CA40360, CA097193, HL26490, and HL34595. PHS members are grateful to the participants and staff of the Physicians' Health Study and Health Professionals Follow-Up Study for their valuable contributions, as well as the following state cancer registries for their help: AL, AZ, AR, CA, CO, CT, DE, FL, GA, ID, IL, IN, IA, KY, LA, ME, MD, MA, MI, NE, NH, NJ, NY, NC, ND, OH, OK, OR, PA, RI, SC, TN, TX, VA, WA, and WY. PLCO: This PLCO study was supported by the Intramural Research Program of the Division of Cancer Epidemiology and Genetics, National Cancer Institute, NIH. PLCO thanks Drs. Christine Berg and Philip Prorok, Division of Cancer Prevention at the National Cancer Institute, the screening center investigators and staff of the PLCO Cancer Screening Trial for their contributions to the PLCO Cancer Screening Trial. We thank Mr. Thomas Riley, Mr. Craig Williams, Mr. Matthew Moore, and Ms. Shannon Merkle at Information Management Services, Inc., for their management of the data and Ms. Barbara O'Brien and staff at Westat, Inc. 
for their contributions to the PLCO Cancer Screening Trial. We also thank the PLCO study participants for their contributions to making this study possible. Poland: None reported. PROCAP: PROCAP was supported by the Swedish Cancer Foundation (08-708, 09-0677). PROCAP thanks and acknowledges all of the participants in the PROCAP study. We thank Carin Cavalli-Björkman and Ami Rönnberg Karlsson for their dedicated work in the collection of data. Michael Broms is acknowledged for his skilful work with the databases. KI Biobank is acknowledged for handling the samples and for DNA extraction. We acknowledge The NPCR steering group: Pär Stattin (chair), Anders Widmark, Stefan Karlsson, Magnus Törnblom, Jan Adolfsson, Anna Bill-Axelson, Ove Andrén, David Robinson, Bill Pettersson, Jonas Hugosson, Jan-Erik Damber, Ola Bratt, Göran Ahlgren, Lars Egevad, and Roy Ehrnström. PROGReSS: The PROGReSS study is funded by grants from the Spanish Ministry of Health (INT15/00070; INT16/00154; FIS PI10/00164, FIS PI13/02030; FIS PI16/00046); the Spanish Ministry of Economy and Competitiveness (PTA2014-10228-I), and Fondo Europeo de Desarrollo Regional (FEDER 2007-2013). ProMPT: Funded by CRUK, NIHR, MRC, Cambridge Biomedical Research Centre. ProtecT: Funded by NIHR. ProtecT and ProMPT would like to acknowledge the support of The University of Cambridge, Cancer Research UK. Cancer Research UK grants (C8197/A10123) and (C8197/A10865) supported the genotyping team. We would also like to acknowledge the support of the National Institute for Health Research which funds the Cambridge Bio-medical Research Centre, Cambridge, UK. We would also like to acknowledge the support of the National Cancer Research Prostate Cancer: Mechanisms of Progression and Treatment (PROMPT) collaborative (grant code G0500966/75466) which has funded tissue and urine collections in Cambridge. 
We are grateful to staff at the Wellcome Trust Clinical Research Facility, Addenbrooke's Clinical Research Centre, Cambridge, UK for their help in conducting the ProtecT study. We also acknowledge the support of the NIHR Cambridge Biomedical Research Centre, the DOH HTA (ProtecT grant), and the NCRI/MRC (ProMPT grant) for help with the bio-repository. The UK Department of Health funded the ProtecT study through the NIHR Health Technology Assessment Programme (projects 96/20/06, 96/20/99). The ProtecT trial and its linked ProMPT and CAP (Comparison Arm for ProtecT) studies are supported by Department of Health, England; Cancer Research UK grant number C522/A8649, Medical Research Council of England grant number G0500966, ID 75466, and The NCRI, UK. The epidemiological data for ProtecT were generated through funding from the Southwest National Health Service Research and Development. DNA extraction in ProtecT was supported by USA Dept of Defense award W81XWH-04-1-0280, Yorkshire Cancer Research and Cancer Research UK. The authors would like to acknowledge the contribution of all members of the ProtecT study research group. The views and opinions expressed therein are those of the authors and do not necessarily reflect those of the Department of Health of England. The bio-repository from ProtecT is supported by the NCRI (ProMPT) Prostate Cancer Collaborative and the Cambridge BMRC grant from NIHR. We thank the National Institute for Health Research, Hutchison Whampoa Limited, the Human Research Tissue Bank (Addenbrooke's Hospital), and Cancer Research UK. 
PROtEuS: PROtEuS was supported financially through grants from the Canadian Cancer Society (13149, 19500, 19864, 19865) and the Cancer Research Society, in partnership with the Ministère de l'enseignement supérieur, de la recherche, de la science et de la technologie du Québec, and the Fonds de la recherche du Québec - Santé. PROtEuS would like to thank its collaborators and research personnel, and the urologists involved in subject recruitment. We also wish to acknowledge the special contribution made by Ann Hsing and Anand Chokkalingam to the conception of the genetic component of PROtEuS. QLD: The QLD research is supported by The National Health and Medical Research Council (NHMRC) Australia Project Grants (390130, 1009458) and NHMRC Career Development Fellowship and Cancer Australia PdCCRS funding to J Batra. The QLD team would like to acknowledge and sincerely thank the urologists, pathologists, data managers and patient participants who have generously and altruistically supported the QLD cohort. RAPPER: RAPPER is funded by Cancer Research UK (C1094/A11728; C1094/A18504) and Experimental Cancer Medicine Centre funding (C1467/A7286). The RAPPER group thank Rebecca Elliott for project management. SABOR: The SABOR research is supported by NIH/NCI Early Detection Research Network, grant U01 CA0866402-12. Also supported by the Cancer Center Support Grant to the Cancer Therapy and Research Center from the National Cancer Institute (US) P30 CA054174. SCCS: SCCS is funded by NIH grant R01 CA092447, and SCCS sample preparation was conducted at the Epidemiology Biospecimen Core Lab that is supported in part by the Vanderbilt-Ingram Cancer Center (P30 CA68485). 
Data on SCCS cancer cases used in this publication were provided by the Alabama Statewide Cancer Registry; Kentucky Cancer Registry, Lexington, KY; Tennessee Department of Health, Office of Cancer Surveillance; Florida Cancer Data System; North Carolina Central Cancer Registry, North Carolina Division of Public Health; Georgia Comprehensive Cancer Registry; Louisiana Tumor Registry; Mississippi Cancer Registry; South Carolina Central Cancer Registry; Virginia Department of Health, Virginia Cancer Registry; Arkansas Department of Health, Cancer Registry, 4815 W. Markham, Little Rock, AR 72205. The Arkansas Central Cancer Registry is fully funded by a grant from National Program of Cancer Registries, Centers for Disease Control and Prevention (CDC). Data on SCCS cancer cases from Mississippi were collected by the Mississippi Cancer Registry which participates in the National Program of Cancer Registries (NPCR) of the Centers for Disease Control and Prevention (CDC). The contents of this publication are solely the responsibility of the authors and do not necessarily represent the official views of the CDC or the Mississippi Cancer Registry. SCPCS: SCPCS is funded by CDC grant S1135-19/19, and SCPCS sample preparation was conducted at the Epidemiology Biospecimen Core Lab that is supported in part by the Vanderbilt-Ingram Cancer Center (P30 CA68485). SEARCH: SEARCH is funded by a program grant from Cancer Research UK (C490/A10124) and supported by the UK National Institute for Health Research Biomedical Research Centre at the University of Cambridge. SNP_Prostate_Ghent: The study was supported by the National Cancer Plan, financed by the Federal Office of Health and Social Affairs, Belgium. 
SPAG: Wessex Medical Research, Hope for Guernsey, MUG, HSSD, MSG, Roger Allsopp. STHM2: STHM2 was supported by grants from The Strategic Research Programme on Cancer (StratCan), Karolinska Institutet; the Linné Centre for Breast and Prostate Cancer (CRISP, number 70867901), Karolinska Institutet; The Swedish Research Council (number K2010-70X-20430-04-3) and The Swedish Cancer Society (numbers 11-0287 and 11-0624); Stiftelsen Johanna Hagstrand och Sigfrid Linnérs minne; Swedish Council for Working Life and Social Research (FAS), number 2012-0073. STHM2 acknowledges the Karolinska University Laboratory, Aleris Medilab, Unilabs and the Regional Prostate Cancer Registry for performing analyses and helping to retrieve data. We thank Carin Cavalli-Björkman and Britt-Marie Hune for their enthusiastic work as research nurses and Astrid Björklund for skilful data management. We wish to thank the BBMRI.se biobank facility at Karolinska Institutet for biobank services. PCPT & SELECT are funded by Public Health Service grants U10CA37429 and 5UM1CA182883 from the National Cancer Institute. SWOG and SELECT thank the site investigators and staff and, most importantly, the participants who donated their time to this trial. TAMPERE: The Tampere (Finland) study was supported by the Academy of Finland (251074), The Finnish Cancer Organisations, Sigrid Juselius Foundation, and the Competitive Research Funding of the Tampere University Hospital (X51003). The PSA screening samples were collected by the Finnish part of ERSPC (European Study of Screening for Prostate Cancer). TAMPERE would like to thank Riina Liikanen, Liisa Maeaettaenen and Kirsi Talala for their work on samples and databases.
UGANDA: None reported. UKGPCS: UKGPCS would also like to thank the following for funding support: The Institute of Cancer Research and The Everyman Campaign, The Prostate Cancer Research Foundation, Prostate Research Campaign UK (now Prostate Action), The Orchid Cancer Appeal, The National Cancer Research Network UK, The National Cancer Research Institute (NCRI) UK. We are grateful for support of NIHR funding to the NIHR Biomedical Research Centre at The Institute of Cancer Research and The Royal Marsden NHS Foundation Trust. UKGPCS would also like to acknowledge the NCRN nurses, data managers, and consultants for their work in the UKGPCS study. UKGPCS would like to thank all urologists and other persons involved in the planning, coordination, and data collection of the study. ULM: The Ulm group received funds from the German Cancer Aid (Deutsche Krebshilfe). WUGS/WUPCS: WUGS would like to thank the following for funding support: The Anthony DeNovi Fund, the Donald C. McGraw Foundation, and the St. Louis Men's Group Against Cancer.
Single-cell profiling of sero-negative and sero-positive humans that were inoculated with SARS-CoV-2. The cellular response to SARS-CoV-2 is profiled using single-cell transcriptomics, CITE-seq and single-cell immune profiling, by sampling PBMCs and nasal swabs before and at multiple time points during SARS-CoV-2 infection. This one-of-a-kind cellular map will give unique temporal resolution of how nasal and immune cells respond to SARS-CoV-2 exposure and infection.
The underrepresentation of non-European individuals in human genetic studies so far has limited the diversity of individuals in genomic datasets and led to reduced medical relevance for a large proportion of the world’s population. Population-specific reference genome datasets as well as genome-wide association studies in diverse populations are needed to address this issue. Here we describe the pilot phase of the GenomeAsia 100K Project. This includes a whole-genome sequencing reference dataset from 1,739 individuals of 219 population groups and 64 countries across Asia. We catalogue genetic variation, population structure, disease associations and founder effects. We also explore the use of this dataset in imputation, to facilitate genetic studies in populations across Asia and worldwide.
A better understanding of the molecular landscape of non-muscle-invasive bladder cancer (NMIBC) is essential to improve risk assessment and identify potential therapeutic targets. Here, we perform a comprehensive genomic analysis of patients diagnosed with NMIBC based on whole-exome- (n=438), shallow whole-genome- (n=362), and total RNA-sequencing (n=414). This dataset contains 414 BAM files corresponding to the full total RNA-sequencing dataset.
This dataset includes sequence files from whole exome sequencing and bulk RNA sequencing of tissue and blood biospecimens from a phase II clinical trial investigating epigenetic priming followed by immune checkpoint blockade in non-small cell lung cancer (NSCLC; NCT01928576). There are 78 whole exome sequencing files and 36 bulk RNA sequencing files included in this dataset.
This study measured the expression of a panel of over 300 cancer immunotherapy-relevant genes in tumors after neoadjuvant chemoimmunotherapy. It was performed on RNA from FFPE tumor tissue. Differential gene expression analysis was performed between patients achieving or not achieving an event-free survival of 12 months.
The binding and contribution of transcription factors (TF) to cell specific gene expression is often deduced from open-chromatin measurements to avoid cost and labour intensive TF ChIP-seq assays. It is important to develop reliable and fast computational methods for accurate TF binding prediction in open-chromatin regions (OCRs). Here, we report a novel segmentation-based method, TEPIC, to predict TF binding by combining sets of OCRs with position weight matrices. TEPIC can be applied to various open-chromatin data, e.g. DNaseI-seq and NOMe-seq, using either peaks or footprints as input. In addition to open-chromatin data, also Histone-Marks (HMs) can be used in TEPIC to identify candidate TF binding sites. TEPIC computes TF affinities and uses open-chromatin/HM signal intensity as quantitative measures of TF binding strength. Using machine learning techniques, we show that incorporating low affinity binding sites improves our ability to explain gene expression variability compared to the standard presence/absence classification of binding sites. Further, we show that both footprints and peaks capture essential TF binding events and lead to a good prediction performance. In our application, gene-based scores computed by TEPIC with one open-chromatin assay nearly reach the quality of several TF ChIP-seq datasets. Finally, we show that these scores correctly predict known transcriptional regulators as illustrated by the application to novel DNaseI-seq and NOMe-seq data for primary human hepatocytes and CD4+ T-cells, respectively.
Background: Massively parallel sequencing technology has transformed cancer genomics. It is now feasible, in a clinically relevant time-frame, for a clinically manageable cost, to screen DNA from patient tumours for mutations essentially genome-wide. The challenge for personalised medicine will be to increase the sample size to thousands or tens of thousands of well-characterised cases in order to attain sufficient statistical power to stratify patients accurately across the complexity and genomic heterogeneity expected for most of the common tumour types. Currently, whole genome sequencing on this scale is not feasible, and targeted sequencing of relevant portions of the genome will be required. Pilot data: We have developed protocols for large-scale, multiplexed sequencing of 100-200 genes in thousands of samples. Essentially, using robotic technology, genomic DNA from the cancer specimen is processed into sequencing libraries with unique DNA barcodes, thereby allowing sequencing reads to be attributed to the sample they derive from. Currently, these sequencing libraries can be generated in a 96-well format using fully automated protocols, and we are exploring methods to expand this to a 384-well format. The sequencing libraries are pooled and hybridized to custom sets of RNA baits representing the genomic regions of interest. Sequencing of the pulled-down libraries is done in pools of 48-96 samples per lane of an Illumina HiSeq. This protocol is already implemented at the Sanger Institute. We have published proof that somatic mutations in novel cancer genes can be identified from exome-wide sequencing. In unpublished pilot data, we have established the feasibility of robotic library production, custom pull-down, and multiplexed sequencing of barcoded libraries for 100 known myeloid cancer genes across 760 myelodysplasia samples.
Highlights of the data thus far analysed reveal that the coverage is remarkably even between samples; when 96 samples are run, average coverage per lane of sequencing is ~250, with 90-95% of targeted exons covered by >25 reads; known mutations can be discovered in the data set; and the protocol is amenable to whole genome amplified DNA. The bioinformatic algorithms for identification of substitutions and indels in pull-down data are well-established; we have pilot data proving that copy number changes, LOH and genomic rearrangements in specific regions of interest can also be identified by tiling of baits across the relevant loci. Proposal: We propose to apply this methodology to 10,000 samples from patients with AML enrolled in clinical trials over the last 10-20 years. Oncogenic point mutations and potentially genomic rearrangements will be identified, and linked to clinical outcome data, with a view to undertaking the following sorts of analyses: (i) identification of co-occurrence, mutual exclusivity and clusters of driver mutations; (ii) correlation of prognosis with driver mutations and potentially gene-gene interactions; (iii) exploration of genomic markers of drug response. Ultimately, we would like to be in a position to release the mutation data together with matched clinical outcome data to genuine medical researchers via a controlled access approach, possibly within the COSMIC framework (www.sanger.ac.uk/genetics/CGP/cosmic/). The vision here is to generate a portal whereby a clinician faced with an AML patient and his/her mutational profile can obtain a 'personalised' prediction of outcome, together with a fair assessment of the uncertainty of the estimate. With a sufficient sample size, there would also be the potential to develop decision support algorithms for therapeutic choices based on such data.