77 samples collected from 35 multiple myeloma patients. Each patient provided one healthy sample and one primary tumor sample and, in some cases, also samples collected after the progression of the disease.
Illumina NovaSeq paired-end single-cell RNA sequencing of samples prepared using the 10X Genomics platform. 3 ovarian cancer patients, 5 omentum biopsy samples: 3 HGSOC metastasis samples (one per patient) and 2 healthy samples (one per patient except Patient2). 4 paired FASTQ files per sample.
The HTAN-MCL Pre-Cancer Atlas Pilot Project (PCAPP) is the result of a collaboration between the seven members of the MCL consortium. Across four organ types, PCAPP's goal is to collect and profile pre-malignant lesions (PML) for gene expression, DNA mutations, single-cell gene expression and immune environment. Most PML are small and only available from formalin-fixed paraffin-embedded (FFPE) archived tissue. The primary goals of PCAPP are to 1) understand the logistical challenges of PML specimen collection, 2) document technical limitations of the assays that are specific to PML and 3) overcome them to support the generation of a more comprehensive Pre-Cancer Atlas in the future. The current upload provides RNA and DNA sequencing from participants with DCIS who were studied at the University of San Diego and the University of Vermont. Description of the overall study:
A. Background/Significance
One of the critical barriers to developing new approaches for cancer detection and prevention is the lack of understanding of the key molecular and cellular changes that cause cancer initiation and progression. Unlike the extensive work that has been done profiling advanced stage tumors, few studies have comprehensively profiled the molecular alterations found in precancerous tissues. Premalignant lesions are currently characterized by histologic changes that precede the development of invasive carcinoma [1,2]. These lesions can often be identified in regions surrounding an invasive tumor or in biopsies taken from patients undergoing diagnostic evaluation for suspicion of cancer. Currently, limited metrics exist to distinguish lesions that will likely progress to carcinoma and require intervention from those that will naturally regress or remain stable [3,4]. 
Characterization of the molecular alterations in premalignant lesions and the corresponding changes in the microenvironment would hasten the development of biomarkers for early detection and risk stratification as well as suggest preventive interventions to reverse or delay the development of cancer. Our pilot study will establish the feasibility of transcriptomic, genomic and immune profiling of FFPE premalignant lesions from multiple organ sites, collected and profiled with uniform SOPs across multiple institutions within the MCL consortium. We will characterize the molecular alterations in precancerous lesions and the corresponding microenvironment in four major organ sites, in order to uncover the molecular and cellular determinants of premalignancy, and establish standardized sequencing and immunohistochemistry protocols on FFPE precancerous tissue. We will also evaluate the technical feasibility of single nuclei sequencing of small FFPE pre-cancer lesions. Successful completion of the proposed pilot study will set the stage for expansion and development of a comprehensive Pre-Cancer Atlas (PCA) as part of the NCI's moonshot.
B. Specific Aims
Aim 1: Collect premalignant lesions (PML) and their associated microenvironment via LCM from FFPE tissue across four organ sites (breast, lung, pancreas & prostate).
Aim 2: Perform bulk RNA and DNA seq on premalignant FFPE samples (and flash frozen tissue where available) and compare the genomic/transcriptomic alterations within and across organ sites.
C. Approach
Aim 1: Collect premalignant lesions (PML) and their associated microenvironment via LCM from FFPE tissue across four organ sites (breast, lung, pancreas & prostate).
Methods
I. Patient Population/Sample Collection: An overview of the sites collecting PML tissue from the respective organs is provided in Table 1, and the biospecimens to be obtained are described in detail for each organ type below.
Table 1. 
Breakdown of cohort by tissue type and collection site.
Organ site | Breast | Lung | Pancreas | Prostate
Type of PML | DCIS | AAH, Squamous Dysplasia/CIS | IPMNs | PIN
Collection of Patients | UCSF/UCSD; UVM | BU*/UCLA; Vanderbilt/Moffitt | MDACC* | JHU; Stanford*
# of Patients | 20; 19 | 20 (10 of each type); 20 (10 of each type) | 24 | 20; 20
Total patients per Organ | 39 | 40 | 24 | 40
Note: single nuclei/cell RNA-Seq will be performed on 4-5 FFPE samples from each of the organ types.
1. DCIS lesions from breast tissue: DCIS lesions will be collected from 39 patients (20 from UCSF/UCSD & 19 from UVM) with primary low or high-grade DCIS diagnosed from a breast core biopsy. Subsequent resected lumpectomy or mastectomy tissues will be prospectively sampled in the vicinity of the prior biopsy site using multiple approaches: 1) Live cells (heterogeneous mix) will be obtained as a cell scrape slurry from the lesion surface or by fine needle aspirate (FNA); 2) For a subset of specimens where size is sufficient, a block of breast tissue with DCIS will be fresh-frozen; 3) The remainder of the specimen will be taken for routine formalin-fixation and paraffin-embedding (FFPE). The FFPE sample will be annotated to identify the matched FFPE tissue block adjacent to the fresh-frozen sample and will be sectioned for use in bulk and single nuclei sequencing. We will dissect DCIS, adjacent normal tissue and, when available, associated carcinoma. In addition, when possible, normal tissue will be collected from a tissue block lacking lesions, and blood will be collected. Samples from a subset of patients (n = 5; FFPE, flash frozen and fresh) will be sent to the Broad Institute for single nuclei/cell sequencing.
2. AAH and squamous dysplastic/CIS lesions from airway and lung tissue: For squamous cell lung cancer, we will collect endobronchial biopsies from abnormal airway regions identified on autofluorescence bronchoscopy or identify PMLs in the margins of resected lung tissue. 
We will study 20 patients (5 each from BU/UCLA/Vanderbilt/Moffitt) with pre-invasive squamous lesions (moderate-severe dysplasia or carcinoma in situ (CIS)) identified on pathologic examination. LCM will be performed on the premalignant region and adjacent normal epithelium, as well as on the invasive tumor for lesions collected from the resection margin (n=5 from UCLA). On a subset of lesions collected at bronchoscopy (n=5), we will collect additional biopsies that will be flash frozen and fresh for single nuclei and cell sequencing, respectively, performed at the Broad Institute. In parallel to the work at the Broad, BU will perform single cell RNA-seq on these freshly cell sorted tissues (n = 5). Blood will be collected on all patients for genomic studies. For lung adenocarcinoma, we will collect resected FFPE lung tissues from 20 patients (10 from UCLA and 10 from Vanderbilt/Moffitt) with early stage lung adenocarcinoma that harbor atypical adenomatous hyperplasia (AAH) premalignant lesions in the resection margin. We will LCM multiple AAH regions (3-5 per patient) as well as adjacent regions of normal epithelium and invasive adenocarcinoma. In addition, blood will be collected on all patients for genomic studies.
3. IPMNs from pancreatic tissue: For pancreatic cancer PML, we will collect low and high grade lesions from 24 patients representing macroscopic Intraductal Papillary Mucinous Neoplasms (IPMN) (n=24) from surgically resected specimens along with blood samples. Archival FFPE specimens of microscopic PanIN lesions, occurring multi-focally adjacent to invasive PDAC, and archival IPMN lesions (with or without associated invasive cancer), along with the adjacent normal tissue, will undergo LCM and be used for bulk DNA and RNA sequencing. If matched frozen tissues are available for a subset of these FFPE samples, we will bank them for comparison of profiles. 
Because IPMNs are macroscopic lesions, they can be obtained fresh and can therefore be used for single cell sequencing (in contrast to PanINs). Therefore, 5 freshly obtained IPMNs will be used for the single cell RNA sequencing studies performed at both the Broad Institute and MDACC, and the matched FFPE and/or frozen sections from these lesions (obtained from the adjacent PML) will be sent to the Broad Institute as a pilot to assess "single nuclei" RNA sequencing.
4. PINs from prostate tissue: For prostate cancer PML, 40 samples of Prostatic Intraepithelial Neoplasia (PIN) will be collected between the Stanford and JHU sites (20 cases per site). At the Stanford site, the final cohort will consist of 20 prostate specimens from patients detected by PSA screening who have undergone or will undergo surgery (radical prostatectomy) for clinically localized disease. The age range of the participants will be 40-75, and we anticipate that 18 will be Caucasian, 1 Asian and 1 Latino or African American, based on the practice demographics at Stanford. Clinical and MRI data will also be collected for these samples. We will collect low grade (e.g. Gleason score of 6/Grade group 1; n=10) and high grade (Gleason score 4+3=7 or higher/Grade group 3 or higher; n=10) PINs from FFPE samples that have prostate carcinoma. In addition to obtaining LCM archival samples of low and high grade PIN, we will also obtain normal prostatic epithelium from the peripheral, central and transition zones as well as multiple samples of prostate carcinoma in order to obtain the spectrum of Gleason grades in the carcinoma as needed. LCM samples will be used for bulk DNA and RNA sequencing. In addition, single cells will be dissected from FFPE samples to prepare single cell RNA seq libraries using techniques developed at Stanford, and FFPE tissue will be sent to the Broad for single nuclei sequencing. 
When available, flash frozen and fresh samples from these prostates will be archived and prepared for single nuclei and cell sequencing, respectively, at the Broad Institute and at Stanford (single cell only). JHU will also capture 10 cases (5 grade group 1 and 5 grade group 2) of high grade PIN, normal and invasive adenocarcinoma using frozen sections from fresh frozen tissues. When possible these will be from the same patients as the FFPE samples. Since it can be quite challenging to distinguish high grade PIN from normal epithelium morphologically on frozen sections, we will perform a number of additional tissue-based characterizations for these samples. These will include a multicolor stain combining basal cell markers (p63 and CK903) with a PIN/carcinoma marker (AMACR), a cocktail referred to as "PIN4"; c-MYC (referred to as MYC) protein by IHC [5] and mRNA by in situ hybridization (AM De Marzo, Q Zheng unpublished observations); telomere length by in situ hybridization [6]; and the 5'ETS/45S rRNA [7]. For these slides, whole slides will be scanned with a Hamamatsu NanoZoomer with a 40x objective and regions of interest will be annotated as a guide for LCM.
II. Laser-capture microdissection (LCM): FFPE tissue blocks will be sectioned at 7 μm thickness and serial sections will be stained with H&E. LCM will be performed using standard LCM systems, such as the Leica LMD7000 and ArcturusXT, at each site. Regions of premalignancy will be dissected and RNA/DNA will be extracted from microdissected cells using the Qiagen AllPrep DNA/RNA FFPE Kit.
Aim 2: Perform bulk RNA and DNA seq on premalignant FFPE samples and compare the genomic/transcriptomic alterations within and across organ sites.
Rationale: There have been limited studies characterizing the genomic and transcriptomic landscape of premalignant lesions associated with breast, pancreatic, lung or prostate cancers. 
Characterizing the molecular determinants of premalignant disease that are unique and shared across multiple organs will enable the discovery of new candidate biomarkers for early detection and novel therapeutic strategies for early intervention.
Methods
Bulk RNA-seq of LCM FFPE tissue: All participating sites will perform bulk RNA-seq in accordance with SOPs developed at BU. In brief, total RNA will be isolated from LCM'd lesion and associated microenvironment tissue using the Qiagen AllPrep DNA/RNA FFPE Kit and quality will be assessed with the Agilent Bioanalyzer. Libraries will be generated with the Illumina TruSeq Access kit (for FFPE samples) and sequenced on the Illumina HiSeq 2500 with 75 base-pair paired-end reads. Quality of FASTQ files will be assessed with FastQC. Reads will be aligned to the human genome with STAR, and gene-level and isoform-level expression will be quantified with RSEM. Splice junction saturation, transcript integrity, and biotype distributions will be calculated for each sample with RSeQC. DESeq2 and edgeR will be used to identify associations between gene expression profiles and clinical variables while controlling for confounding covariates. BU will serve as an RNA-seq Core to assess reproducibility of FFPE RNA-seq methods across sites. We will perform RNA-seq according to the SOP listed above on a subset of samples for each organ type (total n ~ 20).
Bulk seq of DNA from FFPE tissue: All participating sites will perform targeted or whole exome-seq (WES) in accordance with SOPs. In brief, DNA from laser captured material will be isolated using the Qiagen AllPrep DNA/RNA FFPE Kit and undergo stringent quality control to ensure high quality input material for genomic profiling. Purified DNA (ideally 100-200 ng) will be used for library preparation and amplification, followed by next generation sequencing using standard protocols distributed by CDMG. 
Exome-seq methods are considered standardized, so we will not need a DNA-seq Core to assess reproducibility across sites. We anticipate that local centers will use Illumina paired-end reads and the following general approach. 1) DNA library preparation: Paired-end libraries will be prepared following the manufacturers' protocols (Illumina and Agilent), with DNA fragmented to 150-200 bp. 2) Capture of targeted exome: Whole exome capture will be carried out using the protocol for Agilent's SureSelect Human All Exon kit. Purified capture products will be amplified using the SureSelect GA PCR primers (Agilent) for 12 cycles. 3) Sequencing: The captured libraries will be sequenced using at least 100 bp paired-end reads. To achieve high sensitivity and accuracy for detecting mutations across the whole exome, each sample will be sequenced at 200X mean depth. 4) Read mapping, alignment and variant analysis: Short sequence reads will be aligned to a reference genome (NCBI human genome assembly build 38) using BWA-MEM. Local realignment of aligned reads will be performed using the Genome Analysis Toolkit (GATK).
Data QC: To ensure scientific rigor and consistency among sites in RNA and DNA processing, we will include a preliminary analysis of the processing and analysis steps. Protocols for extraction of high quality RNA and DNA from formalin-fixed paraffin-embedded (FFPE) tissues, which will be used extensively in these studies, continue to improve and may be implemented differently among the sites participating in this study. To evaluate consistency of preliminary steps in processing and downstream analyses, we will initially distribute slides from one large FFPE cancer specimen for each organ of origin (prostate, breast, lung and pancreas). Analysis of these samples will allow us to review the DNA and RNA characteristics (yield, purity and strand length) among sites. 
Downstream analysis of these same samples will also allow us to compare the consistency of variant calls among centers. We will be able to identify whether some types of calls (such as small insertions and deletions) are more variable among centers than other types of calls (such as relative gene expression or single base pair substitutions) that we expect to be less variable, and to characterize the reliability of findings across sites. We are also including a 5% blind duplicate analysis of RNA sequencing. Samples will be analysed by the participating genomics cores without knowledge of the phenotype. RNA seq and CNA analyses are normalized for batch effects. As another check for processing accuracy and sample management, we will also compare the sex inferred from RNA profiles and exome sequencing of X chromosome genes to the self-reported sex.
D. References
1. Wacholder, S. Precursors in Cancer Epidemiology: Aligning Definition and Function. Cancer Epidemiol. Prev. Biomark. 22, 521-527 (2013). PMID: 23549395.
2. Berman, J. J. Precancer: The Beginning and the End of Cancer. (Jones & Bartlett Learning, 2011).
3. Nasiell, K., Nasiell, M. & Vaćlavinková, V. Behavior of moderate cervical dysplasia during long-term follow-up. Obstet. Gynecol. 61, 609-614 (1983). PMID: 6835614.
4. Merrick, D. T. et al. Persistence of Bronchial Dysplasia Is Associated with Development of Invasive Squamous Cell Carcinoma. Cancer Prev. Res. (Phila. Pa.) 9, 96-104 (2016). PMID: 26542061.
5. Gurel, B. et al. Nuclear MYC protein overexpression is an early alteration in human prostate carcinogenesis. Mod. Pathol. Off. J. U. S. Can. Acad. Pathol. Inc 21, 1156-1167 (2008). PMID: 18567993.
6. Meeker, A. K. et al. Telomere shortening is an early somatic DNA alteration in human prostate tumorigenesis. Cancer Res. 62, 6405-6409 (2002). PMID: 12438224.
7. Guner, G. et al. Novel Assay to Detect RNA Polymerase I Activity In Vivo. Mol. Cancer Res. MCR 15, 577-584 (2017). PMID: 28119429.
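As a rough sanity check on the exome sequencing plan described in the Approach (at least 100 bp paired-end reads, 200X mean depth), the number of read pairs per sample can be estimated from the capture target size. The sketch below is only illustrative: the 50 Mb target size, the 70% on-target fraction and the 10% duplicate rate are assumed values that do not come from the study description, and `read_pairs_needed` is a hypothetical helper name.

```python
import math


def read_pairs_needed(target_bp, mean_depth, read_len, on_target=1.0, dup_rate=0.0):
    """Estimate the paired-end read pairs needed to reach a mean depth over a target.

    Each pair contributes 2 * read_len bases; only the on-target,
    non-duplicate share of those bases counts toward usable coverage.
    """
    if not 0 < on_target <= 1 or not 0 <= dup_rate < 1:
        raise ValueError("on_target must be in (0, 1] and dup_rate in [0, 1)")
    usable_bases_per_pair = 2 * read_len * on_target * (1.0 - dup_rate)
    return math.ceil(target_bp * mean_depth / usable_bases_per_pair)


# ASSUMPTION: 50 Mb capture target (not stated in the study description).
TARGET_BP = 50_000_000

# Idealized case: every sequenced base is on target and unique.
ideal = read_pairs_needed(TARGET_BP, mean_depth=200, read_len=100)

# ASSUMPTION: 70% on-target fraction and 10% duplicates (illustrative only).
realistic = read_pairs_needed(TARGET_BP, mean_depth=200, read_len=100,
                              on_target=0.7, dup_rate=0.1)

print(ideal)      # 50000000
print(realistic)  # larger than the idealized estimate
```

Under these assumed values, the idealized requirement is 50 million read pairs per sample, and any loss to off-target or duplicate reads raises it.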
We used targeted capture and massively parallel sequencing of exomes of CD138 purified plasma cells and matched somatic DNA from 17 patients with Multiple Myeloma. For each patient, an early tumor sample (at diagnosis) and a late one (at relapse) were available. For a few of them, an additional late sample (2nd progression) was present. In total, the study has 52 samples. 1044 variants were validated by 454 sequencing. The present study will validate an additional 4630 variants by targeted pull-down and sequencing. All 52 samples will be indexed and then pre-pooled in groups of 5 or 6. The sequence capture will use 10 reactions. All samples will then be pooled and go through one lane of HiSeq.
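The pooling arithmetic above (52 indexed samples, pre-pools of 5 or 6, 10 capture reactions) can be checked with a short sketch. It assumes that samples are spread as evenly as possible across the reactions, which the description implies but does not state. `pool_sizes` is a hypothetical helper name.

```python
def pool_sizes(n_samples: int, n_pools: int) -> list[int]:
    """Split n_samples into n_pools groups whose sizes differ by at most one."""
    base, extra = divmod(n_samples, n_pools)
    # The first `extra` pools each take one additional sample.
    return [base + 1 if i < extra else base for i in range(n_pools)]


sizes = pool_sizes(52, 10)
print(sizes)       # [6, 6, 5, 5, 5, 5, 5, 5, 5, 5]
print(sum(sizes))  # 52
```

Under this assumption, two capture reactions hold 6 samples and eight hold 5, which matches the stated groups of 5 or 6.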
By whole exome sequencing (WES), our group observed the accumulation of mutations in receptor tyrosine kinases (RTKs), adhesion molecules and their effectors, and developed a signaling network that was affected by at least one mutation in almost 100% of MM patients and by more than one mutation in around 50% of MM patients, a pattern we decided to call inter- and intra-individual pathway redundancy. Interestingly, the extension of our WES dataset to 67 primary MM samples with corresponding normal tissue, collected at the Medizinische Klinik and Poliklinik II in Würzburg within the frame of the CRU216, has tentatively led to the designation of three molecular subgroups based on their mutation profile: "adhesion only", "adhesion & downstream" and "RTK & adhesion & downstream".
Exome sequencing from 1,000 UK population samples (the ICR1000 Exome series), using samples from the 1958 Birth Cohort Collection, a population-based collection of all individuals born in the UK in one week in 1958 (http://www.cls.ioe.ac.uk). DNA libraries were prepared from genomic DNA using the Illumina TruSeq sample preparation kit. DNA was fragmented using Covaris technology and the libraries were prepared without gel size selection. Target enrichment was performed in pools of six libraries using the Illumina TruSeq Exome Enrichment kit. The captured DNA libraries were PCR amplified using the supplied paired-end PCR primers. Sequencing was performed with an Illumina HiSeq2000 (v3 flow cell, one pool per lane) generating 2x100-bp reads.
To investigate the relationship between somatic mutations and phenotypic hematologic differentiation at the single-cell level, we used a droplet-based multi-omics single-cell platform on eleven NPM1-mutated AML diagnostic samples. We developed a bioinformatics framework to perform phylogeny-driven genotype correction that allowed us to study 52,103 single cells. Intra-leukemic genetic heterogeneity was detectable in all patients, including a branched architecture in nine of them, always owing to co-occurring signaling mutations. We identified two groups of NPM1-mutated AML, one characterized by a progenitor immunophenotype and a second one with a predominant monocytic differentiation. We also identified various degrees of intra-leukemic immunophenotype heterogeneity, sometimes associated with strong genetic/phenotype correlations.
To investigate the influence of lifelong exercise training on the response of skeletal muscle to a bout of acute exercise, we generated global transcriptomic data from long-term endurance-trained (8 men, 8 women) and strength-trained (8 men, 8 women) individuals and healthy age-matched untrained controls (8 men, 8 women). Skeletal muscle biopsies were taken from M. vastus lateralis before, directly after, and 1 h and 3 h after acute exercise. All subjects completed one bout of acute endurance exercise and one bout of acute resistance exercise, separated by 4-8 weeks. All 384 samples were multiplexed in 4 lanes and sequenced (2x250bp paired end) on the Illumina NovaSeq 6000.
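The total of 384 samples follows from the design, which the sketch below enumerates. It assumes that every subject contributed a biopsy at all four time points in both bouts and that samples were split evenly across the 4 lanes. Neither is stated explicitly, and the label strings are placeholders.

```python
from itertools import product

# Design factors taken from the description; label strings are placeholders.
groups = ["endurance_trained", "strength_trained", "untrained"]
sexes = ["men", "women"]
subjects_per_group_and_sex = 8
bouts = ["endurance_bout", "resistance_bout"]
timepoints = ["pre", "post_0h", "post_1h", "post_3h"]

n_subjects = len(groups) * len(sexes) * subjects_per_group_and_sex

# ASSUMPTION: one biopsy per subject, bout and time point (no missing samples).
biopsies = list(product(range(n_subjects), bouts, timepoints))

# ASSUMPTION: samples were distributed evenly across the 4 lanes.
n_lanes = 4
samples_per_lane = len(biopsies) // n_lanes

print(n_subjects, len(biopsies), samples_per_lane)  # 48 384 96
```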
This project aims to conduct genome-wide association studies (GWAS) in two anatomically different upper gastrointestinal (UGI) cancer sites in two populations with distinctly different disease rates and genetic profiles. One population has very high rates for both esophageal squamous cell carcinomas (ESCCs) and gastric cancers (GCs) and is comprised of Asians (the "Asian UGI GWAS"), while the other population has low rates of ESCC and GC and includes non-Asians from the Americas, Europe, and Australia (the "Non-Asian UGI GWAS"). Study participants for the Asian UGI GWAS reported here (Illumina 660W Quad chip) were drawn from 2 studies, the Shanxi Upper Gastrointestinal Cancer Genetics Project (Shanxi) and the Linxian Nutrition Intervention Trial (NIT), a prospective cohort, and included a total of 1898 ESCCs, 1625 GCs, and 2100 controls. For the 2nd phase (8 TaqMan SNPs), additional subjects from Shanxi and NIT as well as subjects from the Shanghai Men's Health Study (SMHS), the Shanghai Women's Health Study (SWHS), and the Singapore Chinese Health Study (SCHS) were also included (217 ESCCs, 615 GCs, 1202 controls). Altogether 2115 ESCCs, 2240 GCs, and 3302 controls were genotyped in this study.
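The overall genotyping totals can be reproduced from the two phases reported above. This is a minimal consistency check, and the dictionary names are illustrative.

```python
# Counts as reported for the Illumina 660W Quad chip phase and the TaqMan phase.
phase1 = {"ESCC": 1898, "GC": 1625, "controls": 2100}
phase2 = {"ESCC": 217, "GC": 615, "controls": 1202}
reported_total = {"ESCC": 2115, "GC": 2240, "controls": 3302}

# Sum the two phases per category and compare with the stated totals.
computed_total = {k: phase1[k] + phase2[k] for k in phase1}

print(computed_total)                    # {'ESCC': 2115, 'GC': 2240, 'controls': 3302}
print(computed_total == reported_total)  # True
```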
We performed low input RNA-seq and single-cell RNA-seq on 7 lymphocyte populations with the objective of finding the common transcriptional programs shared among innate T cells. In brief, we drew blood from healthy donors and isolated CD4+ T cells, CD8+ T cells, natural killer (NK) cells, invariant natural killer T (iNKT) cells, mucosal-associated invariant T (MAIT) cells, and two γδ T cell populations: Vδ1 and Vδ2. We performed low input RNA-seq on samples with 1000 cells each, for 6 healthy individuals, in duplicate. We performed single-cell RNA-seq with DNA-barcoded hashing antibodies to identify the cell type of origin for each cell. We found a common transcriptional program across lymphocytes that is gradually turned on as lymphocytes acquire a more innate-like state ("innateness"), promoting effector functions and sacrificing proliferative capacity. Additionally, we performed low input RNA-seq for Vδ3-expressing γδ T cells and δ/αβ T cells for one healthy individual in duplicate (1000 cells per sample). This revealed that δ/αβ T cells are transcriptionally more adaptive-like and Vδ3 cells more innate-like.