The overall goal of the project is to identify genetic risk factors associated with chromosome 21 nondisjunction in the oocyte. The dataset derives from multi-site collection of live birth probands with Down syndrome (DS) due to standard trisomy 21 (T21) and their biological parents. The parental origin of the nondisjunction (NDJ) error (maternal or paternal) has been determined to be maternal in all cases, based on the chromosome 21 variants contributed from parent to proband. The Center of Inherited Disease, with support from NICHD, has conducted genome-wide genotyping using the Illumina Human OmniExpress Plus Exome array on approximately 800 women who have been identified through their offspring with DS and have been characterized as having a maternal meiosis I (MI) or meiosis II (MII) nondisjunction error. Genotypes from biological fathers of the offspring with DS can be used with the data on mothers to better define the type of nondisjunction error (MI or MII) and to refine the chromosome 21 recombination profile. We provide the type of nondisjunction error, knowing that this will be updated based on the new panel. We do not provide the recombination profile, as this can be best defined using the new comprehensive set of SNPs in the OmniExpress panel.
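As a rough illustration of how trio genotypes could inform the MI/MII call, the sketch below uses a common convention that the description above does not spell out, so it is an assumption here: at markers near the centromere where the mother is heterozygous, a proband that retains both maternal alleles suggests an MI error, while a proband carrying two copies of one maternal allele suggests an MII error. The function names, the three-allele genotype encoding, and the majority vote are all hypothetical simplifications, not the study's pipeline.

```python
from collections import Counter

def maternal_alleles(proband, father):
    """Return the two maternally derived alleles of a trisomic proband.

    Assumption: only markers where the father is homozygous are used, so
    the single paternal allele can be removed unambiguously.
    """
    if father[0] != father[1] or father[0] not in proband:
        return None
    remaining = list(proband)
    remaining.remove(father[0])  # drop one copy of the paternal allele
    return tuple(remaining)

def classify_marker(proband, mother, father):
    """Classify one pericentromeric marker as 'MI', 'MII', or None."""
    if mother[0] == mother[1]:
        return None  # mother homozygous: marker is uninformative
    maternal = maternal_alleles(proband, father)
    if maternal is None:
        return None
    if sorted(maternal) == sorted(mother):
        return "MI"   # maternal heterozygosity retained
    if maternal[0] == maternal[1] and maternal[0] in mother:
        return "MII"  # maternal heterozygosity reduced to homozygosity
    return None

def classify_case(markers):
    """Majority vote over informative markers; None if there are none or a tie."""
    votes = Counter(c for c in (classify_marker(*m) for m in markers) if c)
    if not votes or votes["MI"] == votes["MII"]:
        return None
    return votes.most_common(1)[0][0]

# Hypothetical markers: (proband alleles, mother genotype, father genotype)
example_markers = [
    (("A", "A", "B"), ("A", "B"), ("A", "A")),
    (("C", "C", "T"), ("C", "T"), ("C", "C")),
    (("G", "G", "G"), ("G", "G"), ("G", "G")),
]
case_call = classify_case(example_markers)
```

In this toy input the first two markers are informative and both retain maternal heterozygosity, while the third is skipped because the mother is homozygous.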
Genome-wide association studies (GWAS) of colorectal cancer (CRC) have been instrumental in identifying a number of common susceptibility loci in Non Hispanic (NH)-White populations, and an NCI priority is to extend GWAS findings to other populations to address racial/ethnic disparities in cancer susceptibility. Currently, GWA studies of CRC in NH-Whites, Japanese and African-Americans are ongoing. We propose a complementary study to address this critical research area in Hispanics. Hispanics represent the fastest growing ethnic population in the U.S. and have been largely understudied in terms of genetic susceptibility to cancer. There are noted differences in incidence, survival and mortality in CRC by ethnic/racial groups. Hispanics often present with CRC at a younger age and have a significantly greater incidence of stage IV tumors or metastatic disease compared to NH-Whites. We propose to conduct a large, cost-efficient, population-based GWAS in Hispanics by building upon existing NIH-funded resources, the Colon Cancer Family Registry (Colon CFR) and the Multiethnic Cohort Study (MEC). We plan to recruit 2,500 Hispanic men and women diagnosed with CRC from 01/2008 to present using cancer registries in California, physician referrals and familial referrals. Risk factor/diet questionnaires, pathology reports, Oragene saliva samples (for genotyping), optional blood samples (for genotyping and biometric analysis) and tumor blocks (for MSI testing) will be collected using methodologies developed in the Colon CFR/MEC. Cases of CRC in the MEC (currently 473; anticipated 600 at end) will also be included. Population-based Hispanic individuals without a diagnosis of CRC participating in other GWA studies in the MEC (n=3,900, U01HG004726, Haiman) will be used as controls. We will genotype all 3,100 cases using the Illumina 1M array and use available genotype and epidemiologic data collected on 3,900 controls.
Our statistical analyses will include: single-SNP and haplotype effects, gene-environment interactions and heterogeneity by MSI, tumor subtype and family history of CRC. We will replicate findings in a second stage using CRC cases and controls from Mexico (1,000 cases and 1,000 controls, EU FP7 funding, CHIBCHA, Carvajal-Carmona/Tomlinson). We will also examine heterogeneity of the risk estimates by ethnicity/race by leveraging GWA data on NH-Whites (2,142 cases, 1,909 controls, U01 CA122839, Casey), (4,000 cases, 6,000 NH-White controls, UK-CHIBCHA, Tomlinson), Colombians (2,000 cases and 2,000 controls, CHIBCHA), Japanese (1,000 cases and 1,000 controls) and African-Americans (1,500 cases and 1,500 controls, R01CA126895, Le Marchand). We will genotype replicated significant SNPs in our main and combined analysis in several Hispanic populations (note: studies funded by EU or NIH for data collection but not GWAS), including 800 Puerto Ricans, 2,000 Brazilians, 2,000 Argentineans and 3,000 Spanish/Portuguese, to assess generalizability of findings. We will examine the differences in inflammatory gene transcription dynamics in leukocytes (from blood sample collection) by fatigue level (as assessed from study questionnaire data). This study will have a high impact by addressing the key question of racial/ethnic disparities related to genetic susceptibility to CRC, will provide translational guidelines on biological mechanisms during the cancer survivorship period to increase quality of life among cancer survivors, and will enable further growth and investment into research among Hispanics by providing a resource of genetic data and biospecimens, which is lacking.
Improvement of variant calling in next-generation sequence data requires a comprehensive, genome-wide catalogue of high-confidence variants called in a set of genomes for use as a benchmark. We generated deep, whole-genome sequence data of seventeen individuals in a three-generation pedigree and called variants in each genome using a range of currently available algorithms. We used haplotype transmission information to create a phased "platinum" variant catalogue of 4.7 million single nucleotide variants (SNVs) plus 0.7 million small (1-50bp) insertions and deletions (indels) that are consistent with the pattern of inheritance in the parents and eleven children of this pedigree. Platinum genotypes are highly concordant with the current catalogue of the National Institute of Standards and Technology for both SNVs (>99.99%) and indels (99.92%), and add a validated truth catalogue that has 26% more SNVs and 45% more indels. Analysis of 334,652 SNVs that were consistent between informatics pipelines yet inconsistent with haplotype transmission ("non-platinum") revealed that the majority of these variants are de novo and cell-line mutations or reside within previously unidentified duplications and deletions. The reference materials from this study are a resource for objective assessment of the accuracy of variant calls throughout genomes.
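To make the idea of inheritance-consistent calls concrete, here is a minimal sketch of a per-site Mendelian consistency filter. It is deliberately simpler than the haplotype-transmission approach described above: it checks each site independently and ignores phase. The function names, genotype encoding, and example sites are hypothetical.

```python
from itertools import product

def is_mendelian_consistent(child, mother, father):
    """True if the child's unphased genotype can be formed from one
    maternal allele and one paternal allele."""
    possible = {tuple(sorted(pair)) for pair in product(mother, father)}
    return tuple(sorted(child)) in possible

def consistent_sites(calls):
    """Return the sites at which every child is consistent with both parents."""
    kept = []
    for site, genotypes in calls.items():
        mother, father = genotypes["mother"], genotypes["father"]
        if all(is_mendelian_consistent(child, mother, father)
               for child in genotypes["children"]):
            kept.append(site)
    return kept

# Hypothetical calls for two sites in one nuclear family.
example_calls = {
    "chr1:1000": {"mother": ("A", "G"), "father": ("A", "A"),
                  "children": [("A", "A"), ("A", "G")]},
    "chr1:2000": {"mother": ("C", "C"), "father": ("C", "C"),
                  "children": [("C", "T")]},
}
kept_sites = consistent_sites(example_calls)
```

Here the second site is dropped because a C/T child cannot arise from two C/C parents under Mendelian inheritance; in real data such a site could reflect a calling error, a de novo mutation, or a cell-line mutation.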
The identification of recurrent 8p11.23 amplifications including FGFR1 raised the hope of a treatable target in squamous cell lung cancer (SQLC). However, only a minority of patients with FGFR1-amplified tumors respond to single-agent inhibitor therapy targeting FGFR. To understand the underlying mechanism of FGFR1 dependency, we performed whole genome and transcriptome sequencing of 25 FGFR1-amplified primary tumors with unknown response to FGFR inhibition. In addition, we performed deep sequencing of 26 FGFR1-amplified samples, for 25 of which the response to FGFR inhibition was known. In both cohorts we identified intra-chromosomal tail-to-tail breaks close to the FGFR1 transcription start site that are responsible for focal amplification of FGFR1. These specific breaks are caused by a Breakage-Fusion-Bridge-like (BFB-like) mechanism. Here, we associate these breaks with FGFR inhibitor sensitivity. Moreover, in some cases these breaks are located within the open reading frame of FGFR1, which leads to the expression of a ΔEC-FGFR1 transcript that lacks the ectodomain. Overexpression of ΔEC-FGFR1 transforms Baf3 cells and leads to an FGFR1-dependent phenotype. Our results demonstrate that truncation of the FGFR1 ectodomain, caused by tail-to-tail breaks, is a frequent event in 8p11.23-amplified squamous cell lung cancer. These breaks might be used as a predictive therapeutic marker to stratify patients for FGFR-inhibitor therapy.
RNA from snap-frozen breast tissue biopsies was purified after lysis with a TissueLyser (Qiagen) using the RNA Purification Plus Kit (Norgen biotek CORP, 47700) with additional on-column DNase-I treatment (Qiagen, 79254) at 27 °C. RNA purity and integrity (RIN) were quantified using the RNA 6000 Nano kit (Agilent Technologies, 5067-1511) on the 4200 TapeStation (Agilent, Santa Clara, USA). Library preparation from 160 ng total RNA input and 2x75 bp paired-end sequencing were performed using the Illumina Stranded Total RNA Prep Ligation kit and the Illumina HiSeq4000 system (Illumina, San Diego, CA, USA). RNA sequencing data from the HiSeq4000 were quality checked, aligned to the GRCh38 (GCA_000001405.15) reference genome using HISAT2 2.0.5, and passed to subread v.1.5.2 for feature count calculation. Finally, 36 paired biopsy samples (metformin n=26 and placebo n=12) were sequenced and included in the final analyses.
Our study aimed to define the extent of heterogeneity between anatomically separated tumor manifestations in follicular lymphoma at single-cell resolution. We subjected single-cell suspensions derived from nodal, synchronously-acquired fine needle aspirations from two distinct tumor sites to high-throughput microfluidics-based single-cell RNA sequencing. By comparing the relative composition of the tumor subpopulations between the two tumor sites, we found that some patients can exhibit site-to-site differences. While the overall composition of the tumor microenvironment did not differ significantly between sites, we did detect a specific correlation between site-to-site tumor heterogeneity and T follicular helper cell abundance. Our study demonstrates the significant limitations of a solitary biopsy in defining the full scope of a patient's disease.
DAC for Study assessing the efficacy and safety of durvalumab plus olaparib plus fulvestrant in selected metastatic or locally advanced ER-positive, HER2-negative breast cancer patients.
Tissue site for RNA sequencing data. Tissue site is associated with the clinical biomarker data and can be linked to the biomarker data using the SAMPLE and PAT identifiers.
Ultra high resolution Micro-C data from multiple colorectal cancer cell lines
Genome-wide cell-free DNA mutational integration enables ultra-sensitive cancer monitoring