Concerted efforts in genomic studies have revealed profound insights into prognostic ovarian cancer subtypes. On the other hand, abundant histology slides have been generated to date, yet their uses remain very limited and largely qualitative. Our goal is to develop automated histology analysis as an alternative subtyping technology for ovarian cancer that is cost-efficient and does not rely on DNA quality. We developed an automated system for scoring hematoxylin and eosin-stained (H&E) primary tumour sections of 91 late-stage ovarian cancers to identify single cells including cancer and stromal cells. We demonstrated high accuracy of our system based on expert pathologists’ scores (cancer=97.1%, stromal=89.1%) as well as compared to immunohistochemistry scoring (correlation=0.87). Quantitative stromal cell ratio is significantly associated with poor overall survival after controlling for clinical parameters including debulking status and age (multivariate analysis p=0.0021, HR=2.54, CI=1.40–4.60) and progression-free survival (multivariate analysis p=0.022, HR=1.75, CI=1.09–2.82). We demonstrate how automated image analysis enables objective quantification of microenvironmental composition of ovarian tumours. Our analysis reveals a strong effect of the tumour microenvironment on ovarian cancer progression and highlights the potential of therapeutic interventions that target the stromal compartment or cancer-stroma signalling in the stroma-high, late-stage ovarian cancer subset.
Recent GWAS studies have made extensive use of large eQTL data sets to functionally annotate index SNPs. With a large number of association signals located outside coding regions there has been an intense search among sequence variants affecting gene expression at the transcriptional level. However, little progress has been made in mapping regulatory variants that affect protein levels at the translational or post-translational level. It is now possible to undertake a protein QTL scan for focused sets of e.g. oxidized proteins by mass spectrometry. We have established a collaboration with a longitudinal, family-based study in France, the Stanislas cohort, which comprises circa 1000 nuclear families (4,295 individuals) and has follow up data for 10 years (three visits). We have undertaken a pilot study in a focus set of 257 subjects from 79 families with the aim to integrate GWAS, transcriptomic and DNA methylation data with proteomic data on a set of 100 proteins measured in PBMCs. We have already generated GWAS data using Illumina's core-exome chip as well as DNA methylation profiles with the 450K array. We propose to use RNA seq to generate transcriptomic data of the corresponding PBMCs. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/
Non-syndromic cases of congenital heart defects (CHD) exhibit variable modes of inheritance (Mendelian and non-Mendelian). Several studies have identified strong candidates in humans by taking a candidate gene approach as well as by using whole exome next generation sequencing (NGS). So far these studies could only explain a minor fraction of the observed phenotype in humans, most of them in syndromic cases, and no single study has focused on the subset of cases with left ventricular outflow tract obstruction (LVOTO). To discover novel disease-causing genes, a large cohort of patients with LVOTO (approximately 100 cases, 25 families and 100 trios) has been exome sequenced. This study based on NGS sequencing data yielded several compelling known candidate genes, such as MYH6, NR2F2 and MYH11, as well as novel ones, such as ITGB4. To evaluate the significance of our findings in a replication cohort we assembled another 1614 cases with an LVOTO phenotype from our collaborators in Toronto, Berlin and Amsterdam. Targeted resequencing in this additional cohort will help to find additional cases with mutations in the identified candidate genes to strengthen genotype-phenotype association. We will use control data from the INTERVAL project for case/control analyses. The pulldowns will be performed as 24-plex ISC with 192 or greater indexes, and the sequencing will be performed with 192 samples per lane, requiring 9 lanes of sequencing.
Although the intricate and prolonged development of the human brain critically distinguishes it from other mammals, our current understanding of neurodevelopmental diseases is largely based on work using animal models. Recent studies revealed that neural progenitors in the human brain are profoundly different from those found in rodent animal models. Moreover, post-mortem studies revealed extensive migration of interneurons into the late-gestational and post-natal human prefrontal cortex that does not occur in rodents. Here, we use cerebral organoids to show that overproduction of mid-gestational human interneurons causes Tuberous Sclerosis Complex (TSC), a severe neuro-developmental disorder associated with mutations in TSC1 and TSC2. We identify a previously uncharacterized population of caudal late interneuron progenitors, the CLIP-cells. In organoids derived from patients carrying heterozygous TSC2 mutations, dysregulation of mTOR signaling leads to CLIP-cell over-proliferation and formation of cortical tubers and subependymal tumors. Surprisingly, second-hit events resulting from copy-neutral loss-of-heterozygosity (cnLOH) are not causative for but occur during the progression of tumor lesions. Instead, EGFR signaling is required for tumor proliferation, opening up a promising approach to treat TSC lesions. Our study demonstrates that the analysis of developmental disorders in organoid models can lead to fundamental insights into human brain development and neuropsychiatric disorders.
Peripheral T-cell lymphomas not otherwise specified (PTCL-NOS) represent a heterogeneous group of nodal and extra-nodal mature T-cell lymphomas, with a low prevalence in Western countries. PTCL-NOSs account for about 25% of all PTCLs and are currently diagnosed based on exclusion criteria, as these lymphomas lack unifying morphological, phenotypic and genomic features. Cytogenetic and FISH analyses of PTCL-NOS samples have not revealed recurrent pathogenetic abnormalities, while gene expression profiling has shown only partial ability to segregate cases representing homogeneous clinic-pathological entities. This underscores the need to look at PTCL-NOS with innovative and high-throughput approaches to identify recurrent genetic lesions that could further our understanding of the biology of this heterogeneous group of diseases, provide better diagnostic tools and perhaps new targets for innovative treatments. Our aim is to study ~15 patients affected by PTCL-NOS. Our study will be funded by a private, non-profit Italian cancer research fund (Associazione Italiana per la Ricerca sul Cancro, www.airc.it) based on a grant owned by Anna Dodero and Cristiana Carniti, hematologists at INT. Samples will be analysed by whole genome sequencing using Illumina X10 machines, on a 150bp-PE protocol. Data will be analysed using the pipeline available in Team 78, under the supervision of Peter Campbell, the WTSI faculty who will oversee the project, and by Francesco Maura, visiting scientist at the WTSI.
Our understanding of genomic heterogeneity in lung cancer is largely based on the analysis of early-stage surgical specimens. Here we used endoscopic sampling of paired primary and intrathoracic metastatic tumors from 11 lung cancer patients to map genomic heterogeneity in inoperable lung cancer with deep whole-genome sequencing. Intra-patient heterogeneity in driver or targetable mutations was predominantly in the form of copy number gain. Private mutation signatures, including patterns consistent with defects in homologous recombination, were highly variable both within and between patients. Irrespective of histotype, we observed a smaller than expected number of private mutations, suggesting that ancestral clones accumulated large mutation burdens immediately prior to metastasis. Single-region whole-genome sequencing of tumors from 20 patients showed that tumors in ever-smokers with the strongest tobacco signatures were associated with germline variants in genes implicated in the repair of cigarette-induced DNA damage. Our results suggest that lung cancer precursors in ever-smokers accumulate large numbers of mutations prior to the formation of frank malignancy followed by rapid metastatic spread. In advanced lung cancer, germline variants in DNA repair genes may interact with the airway environment to influence the pattern of founder mutations, whereas similar interactions with the tumor microenvironment may play a role in the acquisition of mutations following metastasis.
We report the case of a 74-year-old man with a very rare subtype of hepatocellular carcinoma with neuroendocrine differentiation (HCC-NED). The patient presented with two independent tumors, a gastrointestinal stromal tumor in the stomach and a hepatocellular carcinoma in the liver. Both tumors were surgically removed with curative intent. Histopathological work-up of the liver tumor revealed poorly differentiated hepatocellular carcinoma (Edmondson-Steiner grade IV) with diffuse expression of the neuroendocrine markers synaptophysin (SYP) and chromogranin (CHGA). Three months after resection, multifocal recurrence of the HCC-NED was observed. In the meantime, tumor organoids had been generated from the resected HCC-NED and extensively characterized. Sensitivity to a number of drugs approved for the treatment of HCC or neuroendocrine carcinomas was tested in vitro. Based on their in vitro efficacy, etoposide and carboplatin were used as first line palliative combination treatment. Because genomic analysis revealed an NTRK1 mutation (kinase domain) and tumor organoids were sensitive to entrectinib, a pan-TRK inhibitor, the patient received entrectinib as second line therapy. After only two weeks, treatment had to be discontinued due to deterioration of the patient’s general condition. In conclusion, we demonstrate for the first time that preclinical drug testing using organoids is feasible in selected HCC cases.
Reproductive longevity is critical for fertility and impacts healthy ageing in women, yet insights into the underlying biological mechanisms and treatments to preserve it are limited. Here, we identify 290 genetic determinants of ovarian ageing, assessed using normal variation in age at natural menopause (ANM) in ~200,000 women of European ancestry. These common alleles influence clinical extremes of ANM; women in the top 1% of genetic susceptibility have an equivalent risk of premature ovarian insufficiency to those carrying monogenic FMR1 premutations. Identified loci implicate a broad range of DNA damage response (DDR) processes and include loss-of-function variants in key DDR genes. Integration with experimental models demonstrates that these DDR processes act across the life-course to shape the ovarian reserve and its rate of depletion. Furthermore, we demonstrate that experimental manipulation of DDR pathways highlighted by human genetics increases fertility and extends reproductive life in mice. Causal inference analyses using the identified genetic variants indicate that extending reproductive life in women improves bone health and reduces risk of type 2 diabetes, but increases risks of hormone-sensitive cancers. These findings provide insight into the mechanisms governing ovarian ageing, when they act across the life-course, and how they might be targeted by therapeutic approaches to extend fertility and prevent disease.
We recently described a 16-gene expression signature for improved risk stratification of acute myeloid leukemia (AML) patients called the AML Prognostic Score (APS). A subset of APS high-risk AML patients showed increased levels of focal adhesion kinase (FAK), encoded by the Protein Tyrosine Kinase 2 (PTK2) gene, which was correlated with RUNX1 mutations. RUNX1 mutant cells are more sensitive to PTK2 inhibitors. As we were not able to detect RUNX1 binding sites in the PTK2 promoter, we hypothesized that RUNX1 might regulate micro(mi)RNAs that repress PTK2, such that loss-of-function RUNX1 mutations would result in reduced miRNA expression and derepression of PTK2. Examination of paired RNA-seq and miRNA-seq data from 301 AML cases revealed two miRNAs that positively correlated with RUNX1 expression, contained RUNX1 binding sites in their promoters and were predicted to target PTK2. We show that the hsa-let7a-2-3p and hsa-miR-135a-5p promoters are regulated by RUNX1, and that PTK2 is a direct target of both miRNAs. Even in the absence of RUNX1 mutations, hsa-let7a-2-3p and hsa-miR-135a-5p regulate PTK2 expression, and reduced expression of these two miRNAs sensitizes AML cells to PTK2 inhibition. These data explain how RUNX1 regulates PTK2, and identify potential miRNA biomarkers for targeting AML with PTK2 inhibitors.
Homologous recombination (HR) deficiency causes DNA breaks and cytogenetic aberrations. Paradoxically, the types of DNA rearrangements specifically associated with HR-deficient cancers only minimally impact chromosomal structure. Addressing this, we combined a genome graph analysis of short-read whole genome sequencing (WGS) profiles across thousands of tumors with deep linked-read (LR) WGS of 46 BRCA1 or BRCA2 mutant breast cancers to discover a distinct class of HR deficiency-enriched rearrangements called reciprocal pairs. LR WGS showed that reciprocal pairs with identical rearrangement orientations gave rise to one of two distinct chromosomal outcomes, distinguishable only with long molecule data. While one (cis) outcome corresponded to the copy and pasting of a small segment to a distant site, a second (trans) outcome was a quasi-balanced translocation or multi-megabase inversion with substantial (10kb) duplications at each junction. The full spectrum of reciprocal pair outcomes could be explained by an HR-independent replication restart repair mechanism. LR WGS additionally identified single-strand annealing (SSA) as a BRCA2-deficiency specific repair pathway in human cancers. Replication restart- and SSA-associated SVs improved BRCA1- vs. BRCA2- deficiency classification and identified metastatic cancer cases with favorable chemotherapy responses. Our data reveal classes of BRCA1- and BRCA2-deficiency specific rearrangements as drivers of cytogenetic aberrations in HR deficient cells.
Prostate cancer (PCa) is a heterogeneous disease, impeding early detection and risk stratification. Liquid biopsies (LBx) enable minimally invasive tumor profiling, but circulating tumor-derived DNA (ctDNA) detection remains difficult, particularly in early-stage PCa. We developed a multimodal LBx approach combining genomic and epigenomic cell-free DNA (cfDNA) features in plasma and urine from newly diagnosed PCa patients to improve early characterization of PCa and risk stratification of aggressive disease. Plasma and urine samples from 55 localized PCa (lPCa) patients, 18 advanced PCa (aPCa) patients, and 36 cancer-free controls were subjected to low-coverage whole-genome sequencing and methylated DNA immunoprecipitation sequencing to assess fragmentation, chromosomal instability, and methylation in cfDNA. This complementary approach yielded a 45% ctDNA detection rate in newly diagnosed PCa. Major differences were observed between aPCa and controls, reflecting increasing signals with tumor progression. Epigenomic cfDNA features differentiated lPCa from aPCa, and ctDNA was detected in 46% of PCa patients with prostate-specific antigen <10 ng/ml, suggesting potential for risk stratification. This study highlights the value of multimodal LBx approaches for early characterization of primary PCa and identification of aggressive disease at initial diagnosis. Integration into clinical workflows could complement diagnostics and support personalized decision-making tailored to patients’ PCa risk profiles.
Objectives We are sharing a database of dynamic magnetic resonance imaging (dMRI) scans of normal children, which can serve as a reference standard to quantify regional respiratory abnormalities in young patients with various respiratory conditions and facilitate treatment planning and response assessment. The database can also be useful to advance future AI-based research on image-based object segmentation and analysis. Background In pediatric patients with respiratory abnormalities, it is important to understand the alterations in regional dynamics of the lungs and other thoracoabdominal components, which in turn requires a quantitative understanding of what is considered as normal in healthy children. Currently, such a normative database of regional respiratory structure and function in healthy children does not exist. Participants 200 normal children (ages 6-18 years) participated in our research study related to this dataset. Design The shared open-source normative database is from our ongoing virtual growing child (VGC) project, which includes 4D dMRI images representing one breathing cycle for each normal child and also segmentations of 10 objects at end expiration (EE) and end inspiration (EI) phases of the respiratory cycle in the 4D image. The lung volumes at EE and EI as well as the excursion volumes of chest wall and diaphragm from EE to EI, left and right sides separately, are also reported. The database has thus 4,000 3D segmentations from 200 normal children in total. The database is unique and provides dMRI images, object segmentations, and quantitative regional respiratory measurement parameters of volumes for normal children. All dMRI scans are acquired from normal children during free-breathing. The dMRI acquisition protocol was as follows: 3T MRI scanner (Verio, Siemens, Erlangen, Germany), true-FISP bright-blood sequence, TR=3.82 ms, TE=1.91 ms, voxel size ~1×1×6 mm³, 320×320 matrix, bandwidth 258 Hz, and flip angle 76°.
With recent advances, for each sagittal location across the thorax and abdomen, we acquired 40 2D slices over several tidal breathing cycles at ~480 ms/slice. On average, 35 sagittal locations are imaged, yielding a total of ~1400 2D MRI slices, with a resulting total scan time of 11-13 minutes for any particular study participant. The collected dMRI scan data then went through the procedure of 4D image construction, image processing, object segmentation, and volumetric measurements from segmentations. 4D image construction: For the acquired dMRI scans, we utilized an automated 4D image construction approach to form one 4D image over one breathing cycle (consisting of typically 5-8 respiratory phases) from each acquired dMRI scan to represent the whole dynamic thoraco-abdominal body region. The algorithm selects 175-280 slices (35 sagittal locations × 5-8 respiratory phases) from the 1400 acquired slices in an optimal manner using an optical flux method. Image processing: Intensity standardization is performed on every time point/3D volume of the 4D image so that image values have the same tissue-specific meaning across all subjects. Object segmentation: For each subject, there are 10 objects segmented at both EE and EI time points in this database. They include the thoracoabdominal skin outer boundary, left and right lungs, liver, spleen, left and right kidneys, diaphragm, and left and right hemi-diaphragms. All dMRI scans utilize large field of view images, which include the full thorax and abdomen to the inferior aspect of the kidneys in the sagittal plane. We used a pretrained U-Net based deep learning network to first segment all objects, and then all auto-segmentation results were visually checked and manually refined as needed, under the supervision of a radiologist with over 25 years of expertise in MRI and thoracoabdominal radiology. Manual segmentations have been performed for all objects in all datasets.
Volumetric measurements based on object segmentations for lung volumes (left and right separately) at EE and EI, as well as for chest wall and diaphragm excursion volumes (left and right separately) are reported. Conclusions The provided database is unique and provides dMRI images, object segmentations, and quantitative regional respiratory measurement parameters of volumes for normal children. The database has 4,000 3D segmentations from 200 normal children, which to our knowledge is the largest and only such dMRI dataset to date. All images and object segmentations are saved in DICOM. All DICOM files (176,574 in total) have been anonymized, and PHI has been removed. The database can be used as a reference standard to quantify regional respiratory abnormalities in young patients with various respiratory conditions and facilitate treatment planning and response assessment. The large amount of object segmentations can potentially benefit AI-based research on image-based object segmentation and analysis.
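As a quick consistency check on the acquisition figures quoted in this description, the following Python sketch reproduces the slice counts, the approximate acquisition time, and the segmentation total from the stated parameters. The variable names are illustrative and not part of the dataset; the time estimate assumes exactly 480 ms per slice and ignores any overhead between slices, which may explain why it falls at the lower end of the reported 11-13 minutes.

```python
# Worked arithmetic for the figures quoted in the dataset description.
# All constants come from the text; the names are hypothetical.

SAGITTAL_LOCATIONS = 35      # average number of sagittal locations imaged
SLICES_PER_LOCATION = 40     # 2D slices acquired at each location
SECONDS_PER_SLICE = 0.480    # approximately 480 ms per slice (assumed exact here)
PHASES_RANGE = (5, 8)        # typical respiratory phases in one 4D image
CHILDREN = 200               # participants
OBJECTS = 10                 # segmented objects per time point
TIME_POINTS = 2              # end expiration (EE) and end inspiration (EI)

# Total 2D slices acquired per participant.
acquired_slices = SAGITTAL_LOCATIONS * SLICES_PER_LOCATION

# Pure acquisition time in minutes, without inter-slice overhead.
acquisition_minutes = acquired_slices * SECONDS_PER_SLICE / 60

# Slices retained in the constructed 4D image (lower and upper bound).
selected_slices = tuple(SAGITTAL_LOCATIONS * p for p in PHASES_RANGE)

# Total 3D segmentations in the database.
total_segmentations = CHILDREN * OBJECTS * TIME_POINTS

print(acquired_slices)                 # 1400
print(round(acquisition_minutes, 1))   # 11.2
print(selected_slices)                 # (175, 280)
print(total_segmentations)             # 4000
```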
Purpose: To inform prognosis, treatment response, disease biology, and KRAS G12C mutation heterogeneity, we conducted exploratory circulating tumor DNA (ctDNA) profiling on 134 patients with solid tumors harboring a KRAS G12C mutation treated with single-agent divarasib (GDC-6036) in a phase 1 study. Experimental design: Plasma samples were collected for serial ctDNA profiling at baseline (Cycle 1 Day 1 prior to treatment) and multiple on-treatment time points (Cycle 1 Day 15 and Cycle 3 Day 1). Results: KRAS G12C ctDNA was detectable from plasma samples in 72.9% (43/59) and 92.6% (50/54) of patients with non-small cell lung cancer and colorectal cancer, respectively, the majority of whom were eligible for study participation based on a local test detecting the KRAS G12C mutation in tumor tissue. Baseline ctDNA tumor fraction was associated with tumor type, disease burden, and metastatic sites. A decline in ctDNA level was observed as early as Cycle 1 Day 15. Serial assessment showed a decline in ctDNA tumor fraction associated with response and progression-free survival. Except for a few cases of KRAS G12C sub-clonality, on-treatment changes in KRAS G12C variant allele frequency mirrored changes in the overall ctDNA tumor fraction. Conclusion: Across tumor types, the KRAS G12C mutation likely represents a truncal mutation in the majority of patients. Rapid and deep decline in ctDNA tumor fraction was observed in patients responding to divarasib treatment. Early on-treatment dynamics of ctDNA were associated with patient outcomes and tumor response to divarasib treatment.
Rectal cancer poses challenges in preoperative treatment response, with up to 30% of patients achieving a complete response (CR). Personalized treatment relies on accurate identification of responders at diagnosis. This study aimed to unravel determinants of CR, overall survival (OS), and time to recurrence (TTR) using clinical and targeted sequencing data. In an analysis of 402 patients undergoing preoperative treatment, tumor stage, size, and treatment emerged as robust response predictors. CR rates were higher in smaller, early-stage, and intensively treated tumors. Targeted sequencing analyzed 216 cases, while 120 patients provided hotspot mutation data. KRAS mutation reduced the odds of CR by more than 50% (odds ratio [OR]=0.3 in the targeted sequencing cohort and OR=0.4 in the hotspot cohort). In contrast, SMAD4 and SYNE1 mutations were associated with higher CR rates (OR=6.0 and 6.8, respectively). Favorable OS was linked to younger age, CR, and low baseline carcinoembryonic antigen levels. Notably, CR and an APC mutation increased TTR, while a BRAF mutation negatively affected TTR. Beyond tumor burden, SMAD4 and SYNE1 mutations significantly influenced CR. KRAS mutations independently correlated with radiotherapy resistance, and BRAF mutations heightened recurrence risk. Intriguingly, nonresponding tumors with initially small sizes carried a higher risk of recurrence. These findings offer insights into rectal cancer treatment response, guiding personalized therapeutic strategies. By uncovering factors impacting CR, OS, and TTR, this study underscores the importance of tailored approaches for rectal cancer patients. These findings, based on extensive analysis and mutation data, pave the way for personalized interventions, optimizing outcomes in the challenges of rectal cancer preoperative treatment.
Please note: This synthetic data set (with cohort “participants” / “subjects” marked with FAKE) has no identifiable data and cannot be used to make any inference about cohort data or results. The purpose of this dataset is to aid development of technical implementations for cohort data discovery, harmonization, access, and federated analysis. In support of FAIRness in data sharing, this dataset is made freely available under the Creative Commons Licence (CC-BY). Please ensure this preamble is included with this dataset and that the CINECA project (funding: EC H2020 grant 825775) is acknowledged. For any questions please contact isuru@ebi.ac.uk or cthomas@ebi.ac.uk. This dataset (CINECA_synthetic_cohort_EUROPE_UK1) consists of 2521 samples which have genetic data based on 1000 Genomes data (https://www.nature.com/articles/nature15393), and synthetic subject attributes and phenotypic data derived from UKBiobank (https://journals.plos.org/plosmedicine/article?id=10.1371/journal.pmed.1001779). These data were initially derived using the TOFU tool (https://github.com/spiros/tofu), which generates random values based on the UKBiobank data dictionary. Categorical values were randomly generated based on the data dictionary, continuous variables were generated based on the distribution of values reported by the UK Biobank showcase, and date / time values were random. Additionally, we split the phenotypes and attributes into 4 main classes: general, cancer, diabetes mellitus, and cardiac. We assigned the general attributes to all the samples, and the cardiac / diabetes mellitus / cancer attributes to a proportion of the total samples. Once the initial set of phenotypes and attributes was generated, the data were checked for consistency and, where possible, dependent attributes were calculated from the independent variables generated by TOFU. For example, BMI was calculated from height and weight data, and age at death was calculated from date of death and date of birth.
These data were then loaded to the development instance of Biosamples (https://www.ebi.ac.uk/biosamples/) which accessioned each of the samples. The genetic data are derived from the 1000 Genomes Phase 3 release (https://www.internationalgenome.org/category/phase-3/). The genotype data consist of a single joint-call VCF file with called genotypes for all 2504 samples, plus bed, bim, fam, and nosex files generated via plink for these samples and genotypes. The genotype data have had a variety of errors introduced to mimic real data and as a test for quality control pipelines. These include gender mismatches, ethnic background mislabelling and low call rates for a randomly chosen subset of sample data as well as deviations from Hardy Weinberg equilibrium and low call rates for a random selection of variants. Additionally, 40 samples have raw genetic data available in the form of both bam and cram files, including unmapped data. The gender of the samples in the 1000 Genomes data has been matched to the synthetic phenotypic data generated for these samples. The genetic data were then linked to the synthetic data in BioSamples, and submitted to EGA.
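The derived-attribute step mentioned in this description (BMI from height and weight, age at death from date of birth and date of death) could look like the following minimal Python sketch. The function names, the units (kilograms and centimetres), the use of completed years, and the example values are assumptions for illustration only; the description does not say how the actual calculation was implemented.

```python
from datetime import date


def bmi(weight_kg: float, height_cm: float) -> float:
    """Body mass index in kg/m^2, assuming weight in kg and height in cm."""
    if height_cm <= 0:
        raise ValueError("height must be positive")
    height_m = height_cm / 100.0
    return weight_kg / (height_m ** 2)


def age_at_death(date_of_birth: date, date_of_death: date) -> int:
    """Completed years between birth and death (an assumed convention)."""
    if date_of_death < date_of_birth:
        raise ValueError("date of death precedes date of birth")
    years = date_of_death.year - date_of_birth.year
    # Subtract one year if the birthday had not yet occurred in the year of death.
    if (date_of_death.month, date_of_death.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years


# Hypothetical synthetic record, not taken from the dataset.
example_bmi = bmi(weight_kg=70.0, height_cm=175.0)
example_age = age_at_death(date(1950, 6, 15), date(2020, 3, 1))
print(round(example_bmi, 2))  # 22.86
print(example_age)            # 69
```

Raising an error when the date of death precedes the date of birth mirrors the consistency check the description mentions, since independently randomized dates can easily contradict each other.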
The Genetics and Epidemiology of Colorectal Cancer Consortium (GECCO) is a collaborative effort comprised of a coordinating center and scientific researchers from well-characterized cohort and case-control studies conducted in North America and Europe. This international consortium aims to accelerate the discovery of common and rare genetic risk variants for colorectal cancer by conducting large-scale meta-analyses of existing and newly generated genome-wide association study (GWAS) data, replicating and fine-mapping GWAS discoveries, and investigating how genetic risk variants are modified by environmental risk factors. To expand these efforts, we assembled case-control sets or nested case-control sets from 20 different North American or European studies. Summary descriptions and study participant inclusion/exclusion criteria for each of these studies are detailed below. The Black Women's Health Study (BWHS): The BWHS is the largest follow-up study of the health of African-American women (Cozier et al., 2004; Rosenberg et al., 1995) [PMID: 15018884; PMID: 7722208]. The purpose is to identify and evaluate causes and preventives of cancers and other serious illnesses in African-American women. Among the diseases being studied are breast cancer, colorectal cancer, type 2 diabetes, uterine fibroids, systemic lupus erythematosus, and cardiovascular disease. The study began in 1995, when 59,000 black women from all parts of the United States enrolled through postal questionnaires. The women provided demographic and health data on the 1995 baseline questionnaire, including information on weight, height, smoking, drinking, contraceptive use, use of other selected medications, illnesses, reproductive history, physical activity, diet, use of health care, and other factors. The participants are followed through biennial questionnaires to determine the occurrence of cancers and other illnesses and to update information on risk factors.
Self-reports of cancer are confirmed through medical records and state cancer registry records. Mouthwash-swish samples, as a source of DNA, were obtained from ~26,000 BWHS participants in 2002-2007. DNA was isolated from the mouthwash-swish samples at the Boston University Molecular Core Genetics Laboratory using the QIAamp DNA Mini Kit (Qiagen). All incident colorectal cancer cases with a DNA sample were included in the present analysis. Two controls per case, selected from among BWHS participants free of colorectal cancer at end of follow-up, were matched to cases on year of birth (+/- 2 years) and geographical region of residence (Northeast, South, Midwest, and West). A total of 209 colorectal cancer cases and 423 controls were sent for genotyping. Campaign Against Cancer and Heart Disease (CLUE II): The Campaign Against Cancer and Heart Disease is a prospective cohort designed to identify biomarkers and other factors associated with risk of cancer, heart disease, and other conditions (Kakourou et al., 2015) [PMID: 26220152]. 32,894 participants were recruited from May through October 1989 from Washington County, Maryland and surrounding communities. Colorectal cancer cases (n = 297) and matched controls (n = 296) were identified between 1989 and 2000 among participants in the CLUE II cohort of Washington County, Maryland. Colorectal Cancer Study of Austria (CORSA): In the ongoing colorectal cancer study of Austria (CORSA), more than 13,000 Caucasian participants have been recruited within the province-wide screening project "Burgenland Prevention Trial of Colorectal Disease with Immunological Testing" (B-PREDICT) since 2003 (Hofer et al., 2011) [PMID: 21422235]. All inhabitants of the Austrian province Burgenland aged between 40 and 80 years are annually invited to participate in fecal immunochemical testing, and haemoccult positive screening participants are invited for colonoscopy.
CORSA includes genomic DNA and plasma of colorectal cancer cases, low-risk and high-risk adenomas, and colonoscopy-negative controls. Controls received a complete colonoscopy and were free of colorectal cancer or polyps. CORSA participants have been recruited in the four KRAGES hospitals in Burgenland, Austria, and additionally, at the Medical University of Vienna (Department of Surgery), the Viennese hospitals "Rudolfstiftung" and the "Sozialmedizinisches Zentrum Sud", and at the Medical University of Graz (Department of Internal Medicine). 1403 colorectal cancer and advanced colorectal adenoma cases, and 1404 matched controls were selected for the study. The distributions of sex and age (5-year strata) were evenly matched between cases and controls. Cancer Prevention Study II (CPS II): The CPS II Nutrition cohort is a prospective study of cancer incidence and mortality in the United States, established in 1992 and described in detail elsewhere (Calle et al., 2002; Campbell et al., 2014) [PMID: 12015775; PMID: 25472679]. At enrollment, participants completed a mailed self-administered questionnaire including information on demographic, medical, diet, and lifestyle factors. Follow-up questionnaires to update exposure information and to ascertain newly diagnosed cancers were sent biennially starting in 1997. Reported cancers were verified through medical records, state cancer registry linkage, or death certificates. The Emory University Institutional Review Board approves all aspects of the CPS II Nutrition Cohort. A total of 360 cases and 359 controls were selected for this study. Czech Republic Colorectal Cancer Study (Czech Republic CCS): Cases with positive colonoscopy results for malignancy, confirmed by histology as colon or rectal carcinomas, were recruited between September 2003 and May 2012 in several oncological departments in the Czech Republic (Prague, Pilsen, Benesov, Brno, Liberec, Ples, Pribram, Usti nad Labem, and Zlin).
Two control groups, sampled during the same period as case recruitment, were included in the study. The first group consisted of hospital-based individuals with a negative colonoscopy result for malignancy or idiopathic bowel diseases. The reasons for the colonoscopy were: i) positive fecal occult blood test, ii) hemorrhoids, iii) abdominal pain of unknown origin, and iv) macroscopic bleeding. The second control group consisted of healthy blood donor volunteers from a blood donor center in Prague. All individuals were subjected to standard examinations to verify their health status for blood donation and were cancer-free at the time of sampling. Details of CRC cases and controls have been reported previously (Vymetalkova et al., 2014; Naccarati et al., 2016; Vymetalkova et al., 2016) [PMID: 24755277; PMID: 26735576; PMID: 27803053]. All subjects were informed and provided written consent to participate in the study. They approved the use of their biological samples for genetic analyses, according to the Declaration of Helsinki. The design of the study was approved by the Ethics Committee of the Institute of Experimental Medicine, Prague, Czech Republic. All subjects included in the study were Caucasians and comprised 1792 cases and 1764 matched controls. Controls were matched to CRC cases in a 1:1 ratio. Matching was done on age and sex. Age was matched within ±5 years, whereas sex was matched exactly. For the cases without matched controls, matching was done only on sex. Early Detection Research Network (EDRN): The aim of the EDRN initiative is to develop and sustain a biorepository for support of translational research (Amin et al., 2010) [PMID: 21031013]. High-quality biospecimens were accrued and annotated with pertinent clinical, epidemiologic, molecular and genomic information. A user-friendly annotation tool and query tool were developed for this purpose. 
The various components of this annotation tool include common data elements (CDEs), which were developed from the College of American Pathologists (CAP) Cancer Checklists and North American Association of Central Cancer Registries (NAACCR) standards. The CDEs provide semantic and syntactic interoperability of the data sets by describing them in the form of metadata or data descriptors. A total of 352 colorectal case samples and 399 controls were selected for this study. Controls were matched to CRC cases based on age and sex. The EPICOLON Consortium (EPICOLON): The EPICOLON Consortium comprises a prospective, multicentre and population-based epidemiology survey of the incidence and features of CRC in the Spanish population (Fernandez-Rozadilla et al., 2013) [PMID: 23350875]. Cases were selected as patients with de novo histologically confirmed diagnosis of colorectal adenocarcinoma. Patients with familial adenomatous polyposis, Lynch syndrome or inflammatory bowel disease-related CRC, and cases where patients or family refused to participate in the study were excluded. Hospital-based controls were recruited through the blood collection unit of each hospital, together with cases. All of the controls were confirmed to have no history of cancer or other neoplasm and no reported family history of CRC. Controls were randomly selected and matched with cases for hospital, sex and age (± 5 years). A total of 370 cases and 370 controls were selected for genotyping. Hawaii Adenoma Study: For this adenoma study, two flexible-sigmoidoscopy screening clinics were first used to recruit participants on Oahu, Hawaii. Adenoma cases were identified either from the baseline examination at the Hawaii site of the Prostate, Lung, Colorectal and Ovarian Cancer Screening Trial during 1996-2000 or at the Kaiser Permanente Hawaii's Gastroenterology Screening Clinic during 1995-2007. 
In addition, starting in 2002 and up to 2007, we also approached for recruitment all eligible patients who underwent a colonoscopy in the Kaiser Permanente Hawaii Gastroenterology Department. Cases were patients with histologically confirmed first-time adenoma(s) of the colorectum and were of Japanese, Caucasian or Hawaiian race/ethnicity. Controls were selected among patients with a normal colorectum and were individually matched to the cases on age at exam, sex, race/ethnicity, screening date (±3 months) and clinic and type of examination (colonoscopy or flexible sigmoidoscopy). We recruited 1016 adenoma cases (67.8% of all eligible) and 1355 controls (69.2% of all eligible); 889 cases and 1169 controls agreed to give a blood sample, and 29 cases and 34 controls a mouthwash sample. A total of 989 cases and 1185 controls were genotyped for this study. Columbus-area HNPCC Study (HNPCC, OSUMC): Patients with colorectal adenocarcinoma diagnosed at six participating hospitals were eligible for this study, regardless of age at diagnosis or family history of cancer. Patients with a clinical diagnosis of familial adenomatous polyposis were not eligible for this study. These six hospitals perform the vast majority of all operations for CRC in the Columbus metropolitan area (population 1.7 million). The institutional review board at all participating hospitals approved the research protocol and consent form in accordance with assurances filed with and approved by the United States Department of Health and Human Services. Briefly, during the period of January 1999 through August 2004, 1,566 eligible patients with CRC were accrued to the study (Hampel et al., 2008) [PMID: 18809606]. A total of 1472 colorectal cancer samples had enough blood DNA remaining to be sent for genotyping. Control samples were provided by the Ohio State University Medical Center's (OSUMC) Human Genetics Sample Bank. 
The Columbus Area Controls Sample Bank is a collection of control samples for use in human genetics research that includes both donors' anonymized biological specimens and linked phenotypic data. The data and samples are collected under the protocol "Collection and Storage of Controls for Genetics Research Studies", which is approved by the Biomedical Sciences Institutional Review Board at OSUMC. Recruitment takes place in OSUMC primary care and internal medicine clinics. If individuals agree to participate, they provide written informed consent, complete a questionnaire that includes demographic, medical and family history information, and donate a blood sample. Between 4 and 7 ml of blood is drawn into each of 3 ACD Solution A tubes and is used for genomic DNA extraction and the establishment of an EBV-transformed lymphoblastoid cell culture, cell pellet in Trizol, and plasma. Controls were matched to CRC cases in a 1:1 ratio. Matching was done on age at reference time (age_ref), race, and sex. Age_ref was matched within ±5 years. Sex and race were matched exactly. For the cases without matched controls, matching was done only on sex and race in a 1:1 ratio. Because there were fewer controls than cases, each control was matched to at most 2 cases. Health Professionals Follow-up Study (HPFS): A parallel prospective study to the NHS (Nurses' Health Study). The HPFS cohort comprised 51,529 men aged 40-75 who, in 1986, responded to a mailed questionnaire (Rimm et al., 1990) [PMID: 2090285]. Participants provided information on health-related exposures, including current and past smoking history, age, weight, height, diet, physical activity, aspirin use, and family history of colorectal cancer. Colorectal cancer and other outcomes were reported by participants or next-of-kin and were followed up through review of the medical and pathology record by physicians. Overall, more than 97% of self-reported colorectal cancers were confirmed by medical record review. 
Information was abstracted on histology and primary location. Follow-up evaluation has been excellent, with 94% of the men responding to date. Colorectal cancer cases were ascertained through January 1, 2008. In 1993-1995, 18,825 men in the HPFS mailed blood samples by overnight courier, which were aliquoted into buffy coat and stored in liquid nitrogen. In 2001-2004, 13,956 men in the HPFS who had not provided a blood sample previously mailed in a swish-and-spit sample of buccal cells. Incident cases were defined as those occurring after the subject provided a blood or buccal sample. Prevalent cases were defined as those occurring after enrollment in the study in 1986, but before the subject provided either a blood or buccal sample. After excluding participants with histories of cancer (except nonmelanoma skin cancer), ulcerative colitis, or familial polyposis, case-control sets were previously constructed. In addition to colorectal cancer cases and controls, a set of adenoma cases and matched controls with available DNA from buffy coat were selected for genotyping. Over the follow-up period, data were collected on endoscopic screening practices and, if individuals had been diagnosed with a polyp, the polyps were confirmed to be adenomatous by medical record review. Adenoma cases were ascertained through January 1, 2008. A separate case-control set was constructed of participants diagnosed with advanced adenoma matched to control participants who underwent a lower endoscopy in the same time period and did not have an adenoma. Advanced adenoma was defined as an adenoma 1 cm or larger in diameter and/or with tubulovillous, villous, or high-grade dysplasia/carcinoma-in-situ histology. 
Matching criteria included year of birth (within 1 year) and month/year of blood sampling (within 6 months), the reason for their lower endoscopy (screening, family history, or symptoms), and the time period of any prior endoscopy (within 2 years). Controls matched to cases with a distal adenoma either had a negative sigmoidoscopy or colonoscopy examination, and controls matched to cases with proximal adenoma all had a negative colonoscopy. In total, 159 advanced adenoma cases and 109 controls were selected for genotyping. Leeds Colorectal Cancer Study (LCCS): Following local ethical approval, colorectal cancer cases were recruited from 1997 until 2012 in Leeds, UK through surgical clinics. Initially, funding was provided by the UK Ministry of Agriculture, Farming and Fisheries (subsequently the Food Standards Agency) and Imperial Cancer Research Fund (subsequently Cancer Research UK). Recruitment also occurred similarly in Dundee, Perth and York between 1997 and 2001 using the same protocol, and the data and samples were combined. Pathologically confirmed cases were consented at outpatient clinics, providing information on known and postulated risk factors for colorectal cancer (diet, lifestyle and family history) as well as providing a blood sample for DNA. Exclusion criteria included pre-existing diverticular disease and an inability to complete the questionnaire. The General Practitioners (GPs) of cases (all UK residents have a nominated General Practitioner to whom they refer initial medical queries) were asked to send letters to other persons on their patient list of the same gender and born within 5 years of the case. Subsequently, to increase the number of controls, we systematically invited patients from selected GP practices. Diet was assessed in cases and controls using an extensive dietary and lifestyle questionnaire modified from that produced by the European Prospective Investigation in Cancer (EPIC). 
The frequency with which each specific food item was eaten was recorded, and we also obtained average fruit and vegetable consumption as a cross-check. In total, 1591 cases and 739 controls provided a DNA sample. The North Carolina Colon Cancer Studies (NCCCS I/II): The North Carolina Colon Cancer Studies (NCCCS I-colon and NCCCS II-rectal) were population-based case-control studies conducted in 33 counties of North Carolina. Cases were identified using the rapid case ascertainment system of the North Carolina Central Cancer Registry. Patients with a first diagnosis of histologically confirmed invasive adenocarcinoma of the colon (cecum through sigmoid colon) between October 1996 and September 2000 were classified as potential cases in the NCCCS I. The NCCCS II included patients with a first diagnosis of histologically confirmed invasive adenocarcinoma of the sigmoid colon, rectosigmoid, or rectum (hereafter collectively referred to as rectal cancer) between May 2001 and September 2006. Additional eligibility requirements were: age 40-80 years, residence in one of the 33 counties, ability to give informed consent and complete an interview, possession of a driver's license or identification card issued by the North Carolina Department of Motor Vehicles (if under the age of 65), and no objections from the primary physician to contacting the individual. Controls, identified and sampled during the respective study dates, were selected from two sources. Potential controls under the age of 65 were identified using the North Carolina Department of Motor Vehicles records. For those 65 years and older, records from the Center for Medicare and Medicaid Services were used. Controls were matched to cases using randomized recruitment strategies. Recruitment probabilities were determined using strata of 5-year age, sex, and race groups. 
Dietary information was collected using a modified version of the semiquantitative food frequency questionnaire developed at the National Cancer Institute. In addition, participants were asked about vitamin and mineral supplementation, special diets, restaurant eating, sodium use, and fats used in cooking. In NCCCS I, 515 colorectal cases and 687 matched controls were sent for genotyping. In NCCCS II, 796 colorectal cases and 823 controls were sent for genotyping. Controls were matched to CRC cases in a 1:1 ratio. Matching was done on age, race, and sex. Age was matched within ±5 years. Race and sex were matched exactly. For the cases without matched controls, matching was done only on sex and race. Nurses' Health Study (NHS): The NHS cohort began in 1976 when 121,700 married female registered nurses age 30-55 years returned the initial questionnaire that ascertained a variety of important health-related exposures (Belanger et al., 1978) [PMID: 248266]. Since 1976, follow-up questionnaires have been mailed every 2 years. Colorectal cancer and other outcomes were reported by participants or next-of-kin and followed up through review of the medical and pathology record by physicians. Overall, more than 97% of self-reported colorectal cancers were confirmed by medical-record review. Information was abstracted on histology and primary location. The rate of follow-up evaluation has been high: as a proportion of the total possible follow-up time, follow-up evaluation has been more than 92%. Colorectal cancer cases were ascertained through June 1, 2008. In 1989-1990, 32,826 women in NHS I mailed blood samples by overnight courier, which were aliquoted into buffy coat and stored in liquid nitrogen. In 2001-2004, 29,684 women in NHS I who did not previously provide a blood sample mailed a swish-and-spit sample of buccal cells. Incident cases were defined as those occurring after the subject provided a blood or buccal sample. 
Prevalent cases were defined as those occurring after enrollment in the study in 1976 but before the subject provided either a blood or buccal sample. After excluding participants with histories of cancer (except nonmelanoma skin cancer), ulcerative colitis, or familial polyposis, case-control sets were previously constructed from which DNA was isolated from either buffy coat or buccal cells for genotyping. In addition to colorectal cancer cases and controls, a set of advanced adenoma cases and matched controls with available DNA from buffy coat were selected for genotyping. Over the follow-up period, data were collected on endoscopic screening practices and, if individuals had been diagnosed with a polyp, the polyps were confirmed to be adenomatous by medical record review. Adenoma cases were ascertained through June 1, 2011. A separate case-control set was constructed of participants diagnosed with advanced adenoma matched to control participants who underwent a lower endoscopy in the same time period and did not have an adenoma. Advanced adenoma was defined as an adenoma more than 1 cm in diameter and/or with tubulovillous, villous, or high-grade dysplasia/carcinoma-in-situ histology. Matching criteria included year of birth (within 1 year) and month/year of blood sampling (within 6 months), the reason for their lower endoscopy (screening, family history, or symptoms), and the time period of any prior endoscopy (within 2 years). Controls matched to cases with a distal adenoma either had a negative sigmoidoscopy or colonoscopy examination, and controls matched to cases with proximal adenoma all had a negative colonoscopy. A total of 272 cases and 236 matched controls were sent to CIDR for the advanced adenoma case-control set. 
Northern Swedish Health and Disease Study (NSHDS): Comprises over 110,000 participants, including approximately one third with repeated sampling occasions, from three population-based cohorts (Dahlin et al., 2010; Myte et al., 2016) [PMID: 20197478; PMID: 27367522]. The largest is the ongoing Vasterbotten Intervention Programme, in which all residents of Vasterbotten County are invited to a health examination upon turning 30 (some years), 40, 50 and 60 years of age. Extensive measured and self-reported health and lifestyle data, as well as blood samples for central biobanking in Umea, Sweden, are collected at the health exam. Leucocyte DNA samples for 1:1-matched CRC case-control sets from the NSHDS, of which 878 samples are included in this study, have been selected for genotyping. This is in addition to 354 samples from the NSHDS previously analyzed as part of the multicenter EPIC cohort. Cancer-specific and overall survival data are available for all patients. For at least 425 patients, archival tumor tissue has been analyzed for the BRAF V600E mutation and by sequencing codon 12 and 13 for KRAS mutations, as well as for MSI screening status by immunohistochemistry and for an eight-gene CIMP panel using quantitative real-time PCR (MethyLight). Ohio Colorectal Cancer Prevention Initiative (OCCPI, OSUMC): OCCPI (ClinicalTrials.gov identifier: NCT01850654) is a population-based study of colorectal cancer patients diagnosed in one of 51 hospitals throughout the state of Ohio from January 1, 2013 through December 31, 2016. The OCCPI was created to decrease CRC incidence in Ohio by identifying patients with hereditary predisposition (statewide universal tumor screening for newly diagnosed CRC patients), increase colonoscopy compliance for first-degree relatives of CRC patients, and encourage future research through the creation of a biorepository. 
The 51 Ohio hospitals participating in the OCCPI were selected to represent a cross-section of clinical centers in the state based on high reported volume of CRC patients, affiliation with a high-volume hospital, or interest in participation. Institutional Review Board (IRB) approval was obtained by the individual hospitals, Community Oncology Programs, or by ceding review to the OSU IRB. Written informed consent was obtained. A total of 2139 colorectal cases were genotyped. Patients were considered eligible for this study if they were age 18 or older at the time of enrollment and if they had a surgical resection (or biopsy if unresectable) in the state of Ohio demonstrating an adenocarcinoma of the colorectum between January 1, 2013 and December 31, 2016. Matched control samples were selected from the Ohio State University Medical Center's (OSUMC) Human Genetics Sample Bank in an identical way to the selection for the Columbus-area HNPCC Study (please refer to the description for the Columbus-area HNPCC Study). Prostate, Lung, Colorectal and Ovarian Cancer Screening Trial (PLCO): PLCO enrolled 154,934 participants (men and women, aged between 55 and 74 years) at ten centers into a large, randomized, two-arm trial to determine the effectiveness of screening to reduce cancer mortality. Sequential blood samples were collected from participants assigned to the screening arm. Participation was 93% at the baseline blood draw. In the observational (control) arm, buccal cells were collected via mail using the "swish-and-spit" protocol and participation rate was 65%. Details of this study have been previously described (Huang et al., 2016) [PMID: 27673363] and are available online (http://dcp.cancer.gov/plco). For this study, 1651 advanced adenoma cases and 1392 controls were selected for genotyping. 
Selenium and Vitamin E Prevention Trial (SELECT): The Selenium and Vitamin E Cancer Prevention Trial (SELECT) was a double-blind, placebo controlled clinical trial which explored using selenium and vitamin E alone and in combination to prevent prostate cancer in healthy men (Lippman et al., 2009) [PMID: 19066370]. Secondary endpoints included the prevention of colorectal and lung cancers. SELECT was conducted at 427 sites and centers in the United States, Canada and Puerto Rico; 35,533 men 55 years and older (50 or older if African American) were randomized beginning August 22, 2001. Supplementation was discontinued on October 23, 2008 due to futility. 308 colorectal cancer cases and 308 matched controls were selected from the SELECT population and sent for genotyping. Screening Markers For Colorectal Disease Study and Colonoscopy and Health Study (SMS-REACH): Details on this study population were previously reported (Burnett-Hartman et al., 2014) [PMID: 24875374]. Participants were enrollees in an integrated health-care delivery system in western Washington State (Group Health Cooperative, Seattle, Washington) aged 24-79 years who underwent an index colonoscopy for any indication between 1998 and 2007 and donated a buccal-cell or blood sample for genotyping analysis. Study recruitment took place in 2 phases, with phase 1 occurring in 1998-2003 and phase 2 occurring in 2004-2007. Persons who had undergone a colonoscopy less than 1 year prior to the index colonoscopy, persons with inadequate bowel preparation for the index colonoscopy, and persons with a prior or new diagnosis of colorectal cancer, a familial colorectal cancer syndrome (such as familial adenomatous polyposis), or another colorectal disease were ineligible. Patients diagnosed with adenomas or serrated polyps and persons who were polyp-free at the index colonoscopy (controls) were systematically recruited during both phases of recruitment. 
Approximately 75% agreed to participate and provided written informed consent. Based on medical records, persons who agreed to participate and those who refused study participation were similar with respect to age, sex, and colorectal polyp status. Study protocols were approved by the institutional review boards of the Group Health Cooperative and the Fred Hutchinson Cancer Research Center (Seattle, Washington). A total of 575 cases and 508 matched controls were selected for the study. Controls were matched to cases in a 1:1 ratio. Matching was done on age_ref, race, and sex. Age_ref was matched within ±5 years. The Women's Health Initiative (WHI): WHI is a long-term national health study that has focused on strategies for preventing heart disease, breast and colorectal cancer, and osteoporotic fractures in postmenopausal women. The original WHI study included 161,808 postmenopausal women enrolled between 1993 and 1998. The Fred Hutchinson Cancer Research Center in Seattle, WA serves as the WHI Clinical Coordinating Center for data collection, management, and analysis of the WHI. The WHI has two major parts: a partial factorial randomized Clinical Trial (CT) and an Observational Study (OS); both were conducted at 40 Clinical Centers nationwide. The CT enrolled 68,132 postmenopausal women between the ages of 50-79 into trials testing three prevention strategies. If eligible, women could choose to enroll in one, two, or all three of the trial components. The components are: Hormone Therapy Trials (HT): This double-blind component examined the effects of combined hormones or estrogen alone on the prevention of coronary heart disease and osteoporotic fractures, and associated risk for breast cancer. Women participating in this component with an intact uterus were randomized to estrogen plus progestin (conjugated equine estrogens [CEE], 0.625 mg/d plus medroxyprogesterone acetate [MPA] 2.5 mg/d) or a matching placebo. 
Women with prior hysterectomy were randomized to CEE or placebo. Both trials were stopped early, in July 2002 and March 2004, respectively, based on adverse effects. All HT participants continued to be followed without intervention until close-out. Dietary Modification Trial (DM): The Dietary Modification component evaluated the effect of a low-fat and high fruit, vegetable and grain diet on the prevention of breast and colorectal cancers and coronary heart disease. Study participants were randomized to either their usual eating pattern or a low-fat dietary pattern. Calcium/Vitamin D Trial (CaD): This double-blind component began 1 to 2 years after a woman joined one or both of the other clinical trial components. It evaluated the effect of calcium and vitamin D supplementation on the prevention of osteoporotic fractures and colorectal cancer. Women in this component were randomized to calcium (1000 mg/d) and vitamin D (400 IU/d) supplements or a matching placebo. The Observational Study (OS) examines the relationship between lifestyle, environmental, medical and molecular risk factors and specific measures of health or disease outcomes. This component involves tracking the medical history and health habits of 93,676 women not participating in the CT. Recruitment for the observational study was completed in 1998 and participants were followed annually for 8 to 12 years. All centrally confirmed cases of invasive colorectal cancer, or deaths from colorectal cancer, were selected as potential cases from the September 30, 2015 database. Controls were participants free of colorectal cancer (invasive or in situ) as of September 30, 2015. Potential cases and controls were excluded if they (1) were non-White; (2) had a history of colorectal cancer at baseline; (3) were lost to follow-up after enrollment; (4) were dbGaP ineligible; (5) had <1.25 µg of DNA; (6) were selected for WHI study M26 Phase I or II; or (7) were selected for WHI study AS224 and also included in the imputation project. 
A total of 578 cases and 104,429 controls met the eligibility criteria. Each case was matched with 1 control (1:1) that exactly met the following matching criteria: age (±5 years), randomization center (40 centers; exact), WHI date (±3 years), CaD date (±3 years), OS flag (exact), HRT assignments (exact), DM assignments (exact), and CaD assignments (exact). Control selection was done in a time-forward manner, selecting one control for each case from the risk set at the time of the case's event. The matching algorithm was allowed to select the closest match based on a criterion that minimizes an overall distance measure (Bergstralh EJ, Kosanke JL. Computerized matching of cases to controls. Technical Report #56, Department of Health Sciences Research, Mayo Clinic, Rochester MN. April 1995). Each matching factor was given the same weight. When exact matches could not be found, the matching criteria were gradually relaxed among unmatched cases and controls until all cases had found matched controls. Using the matching criteria specified above, 559 of the 578 eligible cases found exact matches. The matching criteria were then relaxed to: age ±5 years, randomization center, WHI date ±3 years, CaD date ±3 years, OS flag, HRT flag, DM flag, CaD flag. Of the remaining 19 unmatched cases, 17 found matched controls. By matching on age ±5 years, randomization center, WHI date ±3 years, CaD date ±3 years, OS flag, and HRT flag, the remaining 2 unmatched cases found their matches.
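The tiered 1:1 matching described above for the WHI can be illustrated with a minimal Python sketch. It is not the Mayo Clinic matching program cited in the text: the Subject fields, the tier definitions, and the distance (an equally weighted sum of absolute differences in age and enrollment year) are assumptions made for illustration, and time-forward risk-set sampling is not modelled. Each tier lists the factors that must agree exactly; later tiers drop factors, mirroring the gradual relaxation of criteria for cases that remain unmatched.

```python
from dataclasses import dataclass


@dataclass
class Subject:
    sid: str            # subject identifier (hypothetical field name)
    age: float          # age in years
    enroll_year: float  # stand-in for a date-type matching factor
    exact: dict         # categorical factors, e.g. {"center": 1, "hrt": "E+P"}


def distance(case, control):
    # Equal weight on each continuous factor (an assumed metric).
    return abs(case.age - control.age) + abs(case.enroll_year - control.enroll_year)


def eligible(case, control, keys, age_caliper=5, year_caliper=3):
    # A control qualifies if it is within both calipers and agrees
    # exactly with the case on every categorical key in this tier.
    return (abs(case.age - control.age) <= age_caliper
            and abs(case.enroll_year - control.enroll_year) <= year_caliper
            and all(case.exact[k] == control.exact[k] for k in keys))


def tiered_match(cases, controls, tiers):
    """Greedy 1:1 matching without replacement.

    tiers is a list of key lists, strictest first. Returns a dict
    mapping case id -> (control id, tier number that produced the match).
    """
    matches = {}
    available = {c.sid: c for c in controls}
    for tier_no, keys in enumerate(tiers, start=1):
        for case in cases:
            if case.sid in matches:
                continue  # already matched in a stricter tier
            pool = [c for c in available.values() if eligible(case, c, keys)]
            if not pool:
                continue  # retry this case in the next, looser tier
            # Closest control wins; ties are broken by identifier.
            best = min(pool, key=lambda c: (distance(case, c), c.sid))
            matches[case.sid] = (best.sid, tier_no)
            del available[best.sid]  # each control is used at most once
    return matches
```

Calling tiered_match with tiers such as [["center", "hrt"], ["center"]] first requires agreement on both factors and then, for cases still unmatched, on center only. Because the matching is greedy, the result can depend on the order in which cases are processed; an optimal-assignment method would avoid this at the cost of more computation.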
This study involves sequencing of patients with a diagnosis of sickle cell disease from Brazil. No exclusionary criteria were employed and any eligible patients that consented to this study were recruited.
Synovial tissue was collected at the time of arthroplasty from patients with rheumatoid arthritis and osteoarthritis. Tissues were left intact or disaggregated with different methods to evaluate the effect on the transcriptome.
Data for 1) hiPSC-derived macrophages across a polarization time course from the naive M0-state to the M1- and M2-states, 2) primary T-cells differentiated to helper subtypes, and 3) spontaneous differentiation of hiPSCs upon perturbation of GATA2, NR4A2 or SOX17.
We performed whole exome sequencing of bone marrow mononuclear cells derived from a cold agglutinin disease patient. The aim of our study is to elucidate the pathogenesis of hemolytic anemia.
To elucidate the genetic basis of the related clinical manifestations, whole exome sequencing analysis was performed in 163 of the patients with TSC, and the results were compared with the severity of the symptoms.
To identify biomarkers of the antitumor efficacy of molecular targeted therapies, patient-derived xenograft (PDX) mouse models established from 52 patients with solid tumors were treated.
MVP is an ongoing prospective cohort study and mega-biobank in the Department of Veterans Affairs Healthcare System designed to study genetic influences on health and disease among veterans.
In this study, we performed whole genome sequencing (WGS) of primary tumors from adult T-cell leukemia/lymphoma (ATL) to identify recurrent genetic mutations and structural variants.