The HTAN-MCL Pre-Cancer Atlas Pilot Project (PCAPP) is the result of a collaboration between the seven members of the MCL consortium. PCAPP aims to collect pre-malignant lesions (PML) across four organ types and profile them for gene expression, DNA mutations, single-cell gene expression and immune environment. Most PML are small and are available only as archived formalin-fixed paraffin-embedded (FFPE) tissue. The primary goals of PCAPP are to 1) understand the logistical challenges of PML specimen collection, 2) document technical limitations of the assays that are specific to PML and 3) overcome them to support the generation of a more comprehensive Pre-Cancer Atlas in the future. The current upload provides RNA and DNA sequencing from participants with DCIS who were studied at the University of San Diego and the University of Vermont.
Description of the overall study:
A. Background/Significance
One of the critical barriers to developing new approaches for cancer detection and prevention is the lack of understanding of the key molecular and cellular changes that cause cancer initiation and progression. Unlike the extensive work that has been done profiling advanced stage tumors, few studies have comprehensively profiled the molecular alterations found in precancerous tissues. Premalignant lesions are currently characterized by histologic changes that precede the development of invasive carcinoma [1,2]. These lesions can often be identified in regions surrounding an invasive tumor or in biopsies taken from patients undergoing diagnostic evaluation for suspicion of cancer. Currently, limited metrics exist to distinguish lesions that will likely progress to carcinoma and require intervention from those that will naturally regress or remain stable [3,4].
Characterization of the molecular alterations in premalignant lesions and the corresponding changes in the microenvironment would hasten the development of biomarkers for early detection and risk stratification as well as suggest preventive interventions to reverse or delay the development of cancer. Our pilot study will establish the feasibility of transcriptomic, genomic and immune profiling of FFPE premalignant lesions from multiple organ sites, collected and profiled with uniform SOPs across multiple institutions within the MCL consortium. We will characterize the molecular alterations in precancerous lesions and the corresponding microenvironment in four major organ sites, in order to uncover the molecular and cellular determinants of premalignancy, and establish standardized sequencing and immunohistochemistry protocols on FFPE precancerous tissue. We will also evaluate the technical feasibility of single nuclei sequencing of small FFPE pre-cancer lesions. Successful completion of the proposed pilot study will set the stage for expansion and development of a comprehensive Pre-Cancer Atlas (PCA) as part of the NCI's moonshot.
B. Specific Aims
Aim 1: Collect premalignant lesions (PML) and their associated microenvironment via LCM from FFPE tissue across four organ sites (breast, lung, pancreas & prostate).
Aim 2: Perform bulk RNA and DNA seq on premalignant FFPE samples (and flash frozen tissue where available) and compare the genomic/transcriptomic alterations within and across organ sites.
C. Approach
Aim 1: Collect premalignant lesions (PML) and their associated microenvironment via LCM from FFPE tissue across four organ sites (breast, lung, pancreas & prostate).
Methods
I. Patient Population/Sample Collection: An overview of the sites collecting PML tissue from the respective organs is provided in Table 1, and the biospecimens to be obtained are described in detail for each organ type below.
Table 1.
Breakdown of cohort by tissue type and collection site.
Organ site | Breast | Lung | Pancreas | Prostate
Type of PML | DCIS | AAH, Squamous Dysplasia/CIS | IPMNs | PIN
Collection of patients | UCSF/UCSD; UVM | BU*/UCLA; Vanderbilt/Moffitt | MDACC* | JHU; Stanford*
# of patients | 20; 19 | 20 (10 of each type); 20 (10 of each type) | 24 | 20; 20
Total patients per organ | 39 | 40 | 24 | 40
Note: single nuclei/cell RNA-Seq will be performed on 4-5 FFPE samples from each of the organ types.
1. DCIS lesions from breast tissue: DCIS lesions will be collected from 39 patients (20 from UCSF/UCSD & 19 from UVM) with primary low or high-grade DCIS diagnosed from a breast core biopsy. Subsequent resected lumpectomy or mastectomy tissues will be prospectively sampled in the vicinity of the prior biopsy site using multiple approaches: 1) Live cells (heterogeneous mix) will be obtained as a cell scrape slurry from the lesion surface or by fine needle aspirate (FNA); 2) For a subset of specimens where size is sufficient, a block of breast tissue with DCIS will be fresh-frozen; 3) The remainder of the specimen will be taken for routine formalin-fixation and paraffin-embedding (FFPE). The FFPE sample will be annotated to identify the matched FFPE tissue block adjacent to the fresh-frozen sample and will be sectioned for use in bulk and single nuclei sequencing. We will dissect DCIS, adjacent normal tissue and, when available, associated carcinoma. In addition, when possible, normal tissue will be collected from a tissue block lacking lesions, and blood will be collected. Samples from a subset of patients (n = 5; FFPE, flash frozen and fresh) will be sent to the Broad Institute for single nuclei/cell sequencing.
2. AAH and squamous dysplastic/CIS lesions from airway and lung tissue: For squamous cell lung cancer, we will collect endobronchial biopsies from abnormal airway regions identified on autofluorescence bronchoscopy or identify PMLs in the margins of resected lung tissue.
We will study 20 patients (5 each from BU/UCLA/Vanderbilt/Moffitt) with pre-invasive squamous lesions (moderate-severe dysplasia or carcinoma in situ (CIS)) identified on pathologic examination. LCM will be performed on the premalignant region and adjacent normal epithelium, as well as on the invasive tumor for lesions collected from the resection margin (n=5 from UCLA). On a subset of lesions collected at bronchoscopy (n=5), we will collect additional flash frozen and fresh biopsies for single nuclei and cell sequencing, respectively, performed at the Broad Institute. In parallel to the work at the Broad, BU will perform single cell RNA-seq on these freshly cell sorted tissues (n = 5). Blood will be collected from all patients for genomic studies. For lung adenocarcinoma, we will collect resected FFPE lung tissues from 20 patients (10 from UCLA and 10 from Vanderbilt/Moffitt) with early stage lung adenocarcinoma that harbor atypical adenomatous hyperplasia (AAH) premalignant lesions in the resection margin. We will perform LCM on multiple AAH regions (3-5 per patient) as well as adjacent regions of normal epithelium and invasive adenocarcinoma. In addition, blood will be collected from all patients for genomic studies.
3. IPMNs from pancreatic tissue: For pancreatic cancer PML, we will collect low and high grade lesions from 24 patients representing macroscopic Intraductal Papillary Mucinous Neoplasms (IPMN) (n=24) from surgically resected specimens along with blood samples. Archival FFPE specimens of microscopic PanIN lesions, occurring multi-focally adjacent to invasive PDAC, and archival IPMN lesions (with or without associated invasive cancer), along with the adjacent normal tissue, will undergo LCM and be utilized for bulk DNA and RNA sequencing. If matched frozen tissues are available for a subset of these FFPE samples, we will bank them for comparison of profiles.
Because IPMNs are macroscopic lesions, they provide an opportunity for obtaining the samples fresh and therefore can be used for single cell sequencing (in contrast to PanINs). Therefore, 5 freshly obtained IPMNs will be used for the single cell RNA sequencing studies performed at both the Broad Institute and MDACC, and the matched FFPE and/or frozen sections from these lesions (obtained from the adjacent PML) will be sent to the Broad Institute as a pilot to assess "single nuclei" RNA sequencing.
4. PINs from prostate tissue: For prostate cancer PML, 40 samples of Prostatic Intraepithelial Neoplasia (PIN) will be collected between the Stanford and JHU sites (20 cases per site). At the Stanford site, the final cohort will consist of 20 prostate specimens from patients detected by PSA screening who have undergone or will undergo surgery (radical prostatectomy) for clinically localized disease. The age range of the participants will be 40-75, and we anticipate that 18 will be Caucasian, 1 Asian and 1 Latino or African American based on the practice demographics at Stanford. Clinical and MRI data will also be collected for these samples. We will collect low grade (e.g. Gleason score of 6/Grade group 1; n=10) and high grade (Gleason score 4+3=7 or higher/Grade group 3 or higher; n=10) PINs from FFPE samples that have prostate carcinoma. In addition to obtaining LCM archival samples of low and high grade PIN, we will also obtain normal prostatic epithelium from the peripheral, central and transition zones as well as multiple samples of prostate carcinoma in order to obtain the spectrum of Gleason grades in the carcinoma as needed. LCM samples will be used for bulk DNA and RNA sequencing. In addition, single cells will be dissected from FFPE samples to prepare single cell RNA seq libraries using techniques developed at Stanford, and FFPE tissue will be sent to the Broad for single nuclei sequencing.
When available, flash frozen and fresh samples from these prostates will be archived and prepared for single nuclei and cell sequencing, respectively, at the Broad Institute and at Stanford (single cell only). JHU will also capture 10 cases (5 grade group 1 and 5 grade group 2) of high grade PIN, normal and invasive adenocarcinoma using frozen sections from fresh frozen tissues. When possible these will be from the same patients as the FFPE samples. Since it can be quite challenging to distinguish high grade PIN from normal epithelium morphologically on frozen sections, for these samples we will perform a number of additional tissue-based characterizations. These will include a multicolor stain combining basal cell markers (p63 and CK903) and a PIN/carcinoma marker (AMACR), referred to as the "PIN4" cocktail; c-MYC (referred to as MYC) protein [5] by IHC and mRNA by in situ hybridization (AM De Marzo, Q Zheng unpublished observations); telomere length by in situ hybridization [6]; and the 5'ETS/45S rRNA [7]. For these slides, the whole slides will be scanned with a Hamamatsu NanoZoomer with a 40x objective and regions of interest will be annotated as a guide for LCM.
II. Laser-capture microdissection (LCM): FFPE tissue blocks will be sectioned at 7 μm thickness and serial sections will be stained with H&E. LCM will be performed at each site using standard LCM systems, such as the Leica LMD7000 and ArcturusXT. Regions of premalignancy will be dissected and RNA/DNA will be extracted from microdissected cells using the Qiagen AllPrep DNA/RNA FFPE Kit.
Aim 2: Perform bulk RNA and DNA seq on premalignant FFPE samples and compare the genomic/transcriptomic alterations within and across organ sites.
Rationale: There have been limited studies characterizing the genomic and transcriptomic landscape of premalignant lesions associated with breast, pancreatic, lung or prostate cancers.
Characterizing the molecular determinants of premalignant disease that are unique and shared across multiple organs will enable new candidate biomarkers for early detection and novel therapeutic strategies for early intervention.
Methods
Bulk RNA-seq of LCM FFPE tissue: All participating sites will perform bulk RNA-seq in accordance with SOPs developed at BU. In brief, total RNA will be isolated from LCM'd lesion and associated microenvironment tissue using the Qiagen AllPrep DNA/RNA FFPE Kit and quality will be assessed with the Agilent Bioanalyzer. Libraries will be generated with the Illumina TruSeq Access kit (for FFPE samples). They will be sequenced on the Illumina HiSeq 2500 with 75 base-pair paired-end reads. Quality of FASTQ files will be assessed with FastQC. Reads will be aligned to the human genome with STAR, and gene-level and isoform-level expression will be quantified with RSEM. Splice junction saturation, transcript integrity, and biotype distributions will be calculated for each sample with RSeQC. DESeq2 and edgeR will be used to identify associations between gene expression profiles and clinical variables while controlling for confounding covariates. BU will serve as an RNA-seq Core to assess reproducibility of FFPE RNA-seq methods across sites. We will perform RNA-seq according to the SOP listed above on a subset of samples for each organ type (total n ~ 20).
Bulk seq of DNA from FFPE tissue: All participating sites will perform targeted or whole exome-seq (WES) in accordance with SOPs. In brief, DNA from laser captured material will be isolated using the Qiagen AllPrep DNA/RNA FFPE Kit and undergo stringent quality control to ensure high quality input material for genomic profiling. Purified DNA (ideally 100-200 ng) will be used for library preparation and amplification, followed by next generation sequencing using standard protocols distributed by CDMG.
Exome-seq methods are considered standardized, thus we will not need a DNA-seq Core to assess reproducibility across sites. We anticipate that local centers will use Illumina paired-end reads and follow this general approach.
1) DNA library preparation: Paired-end libraries will be prepared following the manufacturer's protocols (Illumina and Agilent), with DNA fragmented to 150-200 bp.
2) Capture of targeted exome: Whole exome capture will be carried out using the protocol for Agilent's SureSelect Human All Exon kit. Purified capture products will be amplified using the SureSelect GA PCR primers (Agilent) for 12 cycles.
3) Sequencing: Captured libraries will be sequenced using at least 100 bp paired-end reads. To achieve high sensitivity and accuracy for detecting mutations across the whole exome, each sample will be sequenced at 200X mean depth.
4) Read mapping, alignment and variant analysis: Short sequence reads will be aligned to a reference genome (NCBI human genome assembly build 38) using BWA-MEM. Local realignment of aligned reads will be performed using the Genome Analysis Toolkit (GATK).
Data QC: To ensure scientific rigor and consistency among sites in RNA and DNA processing, we will include a preliminary analysis of the processing and analysis steps. Protocols for extraction of high quality RNA and DNA from formalin fixed paraffin embedded (FFPE) tissues, which will be used extensively in these studies, continue to improve and may be implemented differently among the sites participating in this study. To evaluate consistency of preliminary processing steps and downstream analyses, we will initially distribute slides from one large FFPE cancer specimen from each organ of origin (prostate, breast, lung and pancreas). Analysis of these samples will allow us to review the DNA and RNA characteristics (yield, purity and strand length) among sites.
Downstream analysis of these same samples will also allow us to compare the consistency of variant calls among centers. We will be able to identify whether some types of calls (such as small insertions/deletions) are more variable among centers than other types of calls (such as relative gene expression or single base pair substitutions) that we expect to be less variable, and to characterize the reliability of findings across sites. We are also including a 5% blind duplicate analysis of RNA sequencing. Samples will be analyzed by the participating genomics cores without knowledge of the phenotype. RNA seq and CNA analyses are normalized for batch effects. As another check for processing accuracy and sample management, we will also compare the sex observed from RNA profiles and exome sequencing of X chromosome genes to the self-reported sex.
D. References
1. Wacholder, S. Precursors in Cancer Epidemiology: Aligning Definition and Function. Cancer Epidemiol. Prev. Biomark. 22, 521-527 (2013). PMID: 23549395.
2. Berman, J. J. Precancer: The Beginning and the End of Cancer. (Jones & Bartlett Learning, 2011).
3. Nasiell, K., Nasiell, M. & Vaćlavinková, V. Behavior of moderate cervical dysplasia during long-term follow-up. Obstet. Gynecol. 61, 609-614 (1983). PMID: 6835614.
4. Merrick, D. T. et al. Persistence of Bronchial Dysplasia Is Associated with Development of Invasive Squamous Cell Carcinoma. Cancer Prev. Res. (Phila. Pa.) 9, 96-104 (2016). PMID: 26542061.
5. Gurel, B. et al. Nuclear MYC protein overexpression is an early alteration in human prostate carcinogenesis. Mod. Pathol. Off. J. U. S. Can. Acad. Pathol. Inc 21, 1156-1167 (2008). PMID: 18567993.
6. Meeker, A. K. et al. Telomere shortening is an early somatic DNA alteration in human prostate tumorigenesis. Cancer Res. 62, 6405-6409 (2002). PMID: 12438224.
7. Guner, G. et al. Novel Assay to Detect RNA Polymerase I Activity In Vivo. Mol. Cancer Res. MCR 15, 577-584 (2017). PMID: 28119429.
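Illustrative sketch (not part of the study protocol): the cross-site comparison of variant calls described under Data QC could be summarized per variant type. The Python code below assumes each site's calls on the shared reference specimen are available as (chromosome, position, ref, alt) tuples and uses the Jaccard index as one possible consistency measure. The function names and the example calls are hypothetical.

```python
from collections import defaultdict


def variant_type(ref, alt):
    """Classify a call by the lengths of its ref and alt alleles."""
    if len(ref) == 1 and len(alt) == 1:
        return "SNV"      # single base pair substitution
    if len(ref) != len(alt):
        return "indel"    # insertion or deletion
    return "MNV"          # multi-base substitution of equal length


def concordance_by_type(calls_a, calls_b):
    """Jaccard concordance (shared calls / all calls) per variant type.

    calls_a and calls_b are iterables of (chrom, pos, ref, alt) tuples
    from two sites that sequenced the same specimen.
    """
    grouped = defaultdict(lambda: (set(), set()))
    for site_index, calls in enumerate((calls_a, calls_b)):
        for chrom, pos, ref, alt in calls:
            grouped[variant_type(ref, alt)][site_index].add((chrom, pos, ref, alt))
    return {
        vtype: len(a & b) / len(a | b)
        for vtype, (a, b) in grouped.items()
    }


# Hypothetical calls from two sites on the same reference specimen.
site_a = [("chr1", 100, "A", "G"), ("chr1", 200, "C", "T"), ("chr2", 300, "AT", "A")]
site_b = [("chr1", 100, "A", "G"), ("chr1", 200, "C", "T"), ("chr2", 305, "G", "GA")]

result = concordance_by_type(site_a, site_b)
for vtype, score in sorted(result.items()):
    print(f"{vtype}: {score:.2f}")
```

In this toy input the substitutions agree between sites while the indels do not, which is the kind of type-specific variability the QC step is meant to surface. A real comparison would first need normalized variant representations (for example, left-aligned indels), which this sketch does not attempt.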
Differences in rates of diseases between different populations with similar environments are presumed to be secondary to variability in population frequencies of disease-causing alleles. Recently admixed populations such as African Americans (AAs) provide a natural opportunity to identify disease-causing variants by examining the ancestry of long chromosomal haplotypes. An admixture mapping approach can be successful in locating genes for diseases such as EAC whose rates vary markedly between the ancestral populations. This is a case-only study of 54 African American cases, with additional analysis of a subset of 28 cases with high genotyping quality. We seek to identify chromosome regions that have excess European ancestry and to contrast them with regions of excess African ancestry.
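Illustrative sketch (an assumption, not the study's actual method): in a case-only design, one simple way to flag a region with excess ancestry is to compare each case's local European ancestry at a locus with that case's genome-wide European ancestry and test whether the mean deviation differs from zero. The Python code below computes a one-sample z-like statistic on these deviations. The function name and all ancestry values are hypothetical.

```python
import math


def case_only_admixture_z(local_ancestry, global_ancestry):
    """Statistic for excess ancestry at one locus among cases.

    local_ancestry[i]:  European ancestry proportion at the locus for case i
                        (0, 0.5 or 1 for 0, 1 or 2 European-derived copies)
    global_ancestry[i]: genome-wide European ancestry proportion for case i

    Positive values suggest excess European ancestry at the locus,
    negative values suggest excess African ancestry.
    """
    n = len(local_ancestry)
    if n < 2 or n != len(global_ancestry):
        raise ValueError("need matched ancestry values for at least two cases")
    deviations = [loc - glob for loc, glob in zip(local_ancestry, global_ancestry)]
    mean_dev = sum(deviations) / n
    variance = sum((d - mean_dev) ** 2 for d in deviations) / (n - 1)
    std_err = math.sqrt(variance / n)
    if std_err == 0:
        raise ValueError("deviations have zero variance")
    return mean_dev / std_err


# Hypothetical ancestry estimates for six cases at a single locus.
local_eur = [1.0, 0.5, 0.5, 1.0, 0.5, 0.0]
global_eur = [0.20, 0.25, 0.15, 0.30, 0.20, 0.10]

z = case_only_admixture_z(local_eur, global_eur)
print(f"locus statistic: {z:.2f}")
```

In practice the statistic would be computed at every locus along the genome and corrected for multiple testing; dedicated admixture mapping software also models uncertainty in the local ancestry estimates, which this sketch ignores.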
Melorheostosis is a rare osteosclerotic disease resulting in exuberant bone growth with a characteristic radiographic appearance often described as "dripping candle wax". As a result of these bony formations, patients report mild to moderate pain that interferes with their routine activities. It is usually diagnosed on radiographs, but bone biopsy may be performed to exclude other osteosclerotic diseases and/or osteosarcoma. Deformities, limb-length discrepancy, muscle atrophy and neurological deficit have been reported as complications. A subset of patients have somatic mutations in MAP2K1. The cause of this disease is not known in all patients, the natural history is poorly described, and there is no clearly defined systemic therapy. We propose a prospective observational study to investigate the natural history and pathogenesis of the disease. Subjects will undergo standardized initial evaluation and medically indicated testing. Skin biopsies may be performed to test for known mutations related to melorheostosis, and if negative, affected bone and/or skin may be sent for genetic testing for acquired somatic mutations in genes that control bone homeostasis. Enrolled subjects will be followed every two to three years for assessment of disease progression and will receive clinically indicated testing and treatment. The study of this rare bone disease offers the potential to generate new insights, provide answers and raise new questions about the biology of skeletal and mineral metabolism.
Incidence rates of renal cell carcinoma (RCC) are rising and the latest estimates show that it accounts for over 300,000 cases and 120,000 deaths worldwide each year. Mechanisms underlying RCC occurrence are not fully understood and a large part of the disease heritability remains unexplained. The study aimed at augmenting the size of available RCC genome-wide association studies to increase the statistical power to detect genetic variants associated with the disease. The study includes genome-wide genotyping data from RCC cases (n=2,781) and controls (n=2,526) recruited in Western Europe, Central and Eastern Europe, and Australia.
This study (10-C-0086) collected and analyzed tumor and circulating tumor omics of pediatric and young adult patients with relapsed rhabdomyosarcoma who were co-enrolled in and receiving treatment on the interventional study 17-C-0049. We sought to describe the summary genomic findings of tumors, and report on the detection of circulating tumor DNA (ctDNA) in serial samples. Tumor tissue and blood were collected and analyzed. Tumor samples were subjected to whole exome and/or whole genome sequencing paired with RNA-Seq. Cell free DNA (cfDNA) was extracted from serially collected blood samples and sequenced. Fusion status, mutations, and IGF1R and YES1 expression from tumor samples revealed the presence of a PAX3 fusion in most samples, a number of common mutations, and universal but heterogeneous expression of IGF-1R and YES1. ctDNA was detected above the 3% threshold in a majority of patients and analysis of cfDNA demonstrated an ability to monitor tumor clonal evolution.
Whole exome sequencing showed that HERC5 deletion and under-expression are associated with shorter time to tumor recurrence, progression-free survival and overall survival in hepatocellular carcinoma (HCC) in two independent studies (p1=0.004, HR1=1.80; p2=0.018, HR2=3.31) totaling 286 HCC patients. Matched primary and recurrent tumors indicated a clonal selection advantage in somatic single nucleotide variants and copy number variants (CNVs) in recurrent compared to primary tumors of Chinese HCC patients, with Wnt signaling most activated.
Neuroblastoma (NB) is the most common extra-cranial solid tumor in children. It represents 8% to 10% of all childhood cancers. Stage 4 neuroblastoma is characterized by its clinically heterogeneous outcome. Tumors in the special category, stage 4S (2-5% of all NB), are chemo-sensitive, and the patients show spontaneous regression. On the other hand, MYCN amplification (25-30% of all NB) is associated with poor outcome, thus we further categorize stage 4 neuroblastoma into MYCN non-amplified and MYCN amplified groups. Here we use transcriptome sequencing to characterize the transcriptome in 29 stage 4 neuroblastoma samples.
Background. Clear cell renal cell carcinoma (ccRCC) is the most common histologically defined renal cancer. However, it is not a uniform disease and includes several genetic subtypes with different prognoses. ccRCC is also characterized by distinctive metabolic reprogramming. Tobacco smoking (TS) is an established risk factor for ccRCC with unknown effects on tumor pathobiology. Methods. We investigated the landscape of ccRCCs and paired normal kidney tissues (NKTs) using integrated transcriptomic, metabolomic and metallomic approaches in a cohort of Caucasian male never smokers (NS) and long-term current smokers (LTS). Results. All three omics domains consistently identified a distinct metabolic subtype of ccRCCs in LTS, characterized by activation of oxidative phosphorylation (OxPhos) coupled with reprogramming of the malate-aspartate shuttle and metabolism of aspartate, glutamate, glutamine and histidine. Cadmium, copper and inorganic arsenic accumulated in LTS tumors, showing redistribution among intracellular pools, including relocation of copper into the cytochrome c oxidase complex. A gene expression signature based on the LTS metabolic subtype provided prognostic stratification of The Cancer Genome Atlas (TCGA) ccRCC tumors that was independent of genomic alterations. Conclusions. The work identified the TS-related metabolic subtype of ccRCC with vulnerabilities that can be exploited for precision medicine approaches targeting metabolic pathways. The results provided a rationale for the development of metabolic biomarkers with diagnostic and prognostic applications using evaluation of OxPhos status. The metallomic analysis revealed the role of disrupted metal homeostasis in ccRCC, highlighting the importance of studying the effects of metals from e-cigarettes and environmental exposures.
This study assesses glioma patient blood methylation longitudinally, beginning pre-surgery and continuing through other clinically important timepoints. Progression and survival are studied in relation to methylation-derived immunologic profiles and other characteristics. For this study, patients scheduled for surgery at University of California San Francisco (UCSF) for a possible newly diagnosed glioma (grades 2-4) or recurrent grade 2/3 glioma were consented and enrolled. Patients had blood collected pre-surgery and, if the pathology results confirmed a glioma, every few weeks or months during their treatment (up to 30 total timepoints). Blood immunomethylomic profiles were measured at these timepoints to assess associations with glioma patient outcomes, clinical measures, other known prognostic markers, and imaging characteristics.
In this study, we performed whole-exome sequencing of 41 patients with Hurthle cell carcinoma, a thyroid cancer variant with abundant mitochondria. We were able to identify recurrent somatic mutations in both the nuclear and mitochondrial genomes of Hurthle cell carcinoma. Our study identifies mutations in DAXX (GeneID: 1616), TP53 (GeneID: 7157), NF1 (GeneID: 4763), NRAS (GeneID: 4893), CDKN1A (GeneID: 1026), ARHGAP35 (GeneID: 2909), the TERT (GeneID: 7015) promoter, and mitochondrial-encoded complex I genes to be candidate driver events in Hurthle cell carcinoma. Furthermore, copy number analysis revealed widespread chromosomal losses, resulting in loss-of-heterozygosity and a near-haploid state, to be a defining feature of these tumors. Finally, by analyzing locoregional recurrences and metastases, we were able to identify that widespread chromosomal losses and mitochondrial complex I mutations appear to be early driver events that are selected during clonal evolution.