We performed a comprehensive multi-omics analysis of 786 trace-tumor-samples from 154 esophageal squamous cell carcinoma (ESCC) patients, covering 9 histopathological stages in 3 phases: the nontumor phase (NT phase), the intraepithelial neoplasia phase (IEN phase), and the ESCC phase. Proteogenomics elucidated the stage-specific molecular characterization and defined the cancer-driving waves along with the mutation accumulation in EC progression. The integrated multi-omics analysis uncovered that chromosome 3q gain was the key event in the transition from the NT to the IEN phase, disclosed that the top mutation, in TP53, enhanced cell cycle and DNA replication in the IEN phase, and revealed that the ESCC phase mutations of AKAP9 and MCAF1 elevated glycolysis and Wnt signaling, respectively. Furthermore, the trajectory analysis identified 6 major tracks related to different clinical features during ESCC progression. Progressively increased and hyperphosphorylated phosphoglycerate kinase 1 (PGK1, S203) was detected and considered as a drug target in ESCC progression. Collectively, this study provides insight into the molecular mechanism of ESCC and a valuable resource for the development of therapeutic targets.
The human papillomavirus (HPV) genome is integrated into host DNA in most HPV-positive cancers, but the consequences for chromosomal integrity are unknown. Continuous long-read sequencing of oropharyngeal cancers and cancer cell lines revealed a unique form of structural variation, termed heterocateny here, characterized by heterogeneous, interrelated, and repetitive patterns of concatemerized virus and host DNA segments. Evidence of heterocateny was detected in extrachromosomal and/or intrachromosomal DNA in all cases. Unique breakpoint sequences shared across structurally heterogeneous virus-host concatemers within each cancer facilitated stepwise reconstruction of their evolution from a common molecular ancestor. This analysis revealed that unstable virus and virus-host concatemers in ecDNA or integrated form mediate insertion into and excision from chromosomes, capture, rearrangement, and rolling-circle amplification of host DNA, and chromosomal rearrangements. The data indicate that heterocateny is driven by the dynamic, aberrant replication and recombination of an oncogenic DNA virus, thereby extending known consequences of HPV integration to include promotion of intra-tumoral heterogeneity and clonal evolution.
Glioma intratumoral heterogeneity enables adaptation to challenging microenvironments and contributes to therapeutic resistance. We integrated 914 single-cell DNA methylomes, 55,284 single-cell transcriptomes, and bulk multi-omic profiles across 11 adult IDH-mutant or IDH-wild-type gliomas to delineate sources of intratumoral heterogeneity. We show that local DNA methylation disorder associates with cell-to-cell DNA methylation differences, is elevated in more aggressive tumors, links with transcriptional disruption, and is altered in environmental stress response. Glioma cells under in vitro hypoxic and irradiation stress increased local DNA methylation disorder and shifted cell states. We identified a positive association between genetic and epigenetic instability that was supported in bulk longitudinally collected DNA methylation data. Increased DNA methylation disorder associated with accelerated disease progression, and recurrently selected DNA methylation changes were enriched for environmental stress response pathways. Our work identifies an epigenetically facilitated adaptive stress response process and highlights the importance of epigenetic heterogeneity in shaping therapeutic outcomes.
Study 1 2R01-NS050375 (PI: DOBYNS, William B.) The genetic basis of mid-hindbrain malformations Our general goal for this project is to advance our understanding of human developmental disorders that involve the brainstem and cerebellum - brain structures derived from the embryonic midbrain and hindbrain - that affect a minimum of 2.4 per 1000 resident births based on data from the CDC. Importantly, this large class of disorders co-occurs with more common developmental disorders such as autism, mental retardation and some forms of infantile epilepsy, and shares some of the same causes. With this renewal, we propose to expand the scope of our work beyond single phenotypes and genes to focus on delineating the critical phenotype spectra to which the most common MHM belong, and defining the underlying biological networks that are disrupted. To pursue these goals, we will use our large and growing cohort of human subjects to map additional MHM loci using SNP microarrays that provide both high-resolution autozygosity and linkage data in informative families as well as detect critical copy number variants in sporadic subjects. The causative genes will be identified using traditional Sanger or new high-throughput sequencing methods as appropriate based on the size of the critical region. We will use these and other known MHM causative genes to construct and revise model biological networks of genes and proteins, and test these genes and networks in additional patients as a candidate gene or, more accurately, a candidate network approach. These approaches need to be supported by ongoing active subject recruitment, as studies of comparable disorders such as mental retardation and autism have benefited from even larger numbers of subjects than we have so far collected. We need to use new high-throughput sequencing methods to more efficiently test larger critical regions, and to test entire gene networks rather than individual genes in matched cohorts of subjects.
At every step (phenotype analysis, CNV analysis, model network construction and high-throughput sequencing), we will need expanded bioinformatics capabilities. Finally, we need to test the biological function of new genes and networks to support our gene identification studies. We expect that these studies will contribute immediately to more accurate diagnosis and counseling, and over time will lead to development of specific treatments for a subset of these disorders. We further expect that studies of mid-hindbrain development will have broad significance for human developmental disorders generally, providing compelling evidence for a connection between cerebellar development and other classes of developmental disorders such as autism, mental retardation and epilepsy. Study 2 R01-NS058721 (PI: DOBYNS, William B.) De novo copy number variation and gene discovery in human brain malformations The number of recognized brain malformations and syndromes has grown rapidly during the past several decades, yet relatively few causative genes have been identified, especially for three common malformations that have been associated with numerous cytogenetically visible chromosome deletions and duplications, and that often occur together: agenesis of the corpus callosum (ACC), cerebellar vermis hypoplasia (CVH) including Dandy-Walker malformation (DWM), and polymicrogyria (PMG). We propose to perform high-resolution array comparative genome hybridization (aCGH), an emerging technology able to detect small copy number variants (CNV), in 700 probands with one or more of these three malformations. Our central hypothesis states that more than 10% of patients with ACC, CVH or PMG will have de novo CNV below the resolution of routine cytogenetic analysis, but detectable by current array platforms. We therefore expect to identify 70-100 patients with small CNV.
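The expected yield quoted for Study 2 can be checked with simple arithmetic. The sketch below is illustrative only: the variable names are hypothetical, and the detection rate implied by the upper bound of 100 patients is derived here rather than stated in the abstract.

```python
# Figures taken from the Study 2 abstract.
probands = 700            # probands to be tested by aCGH
min_rate_percent = 10     # hypothesis: more than 10% carry a small de novo CNV

# Lower bound of the expected yield, using integer arithmetic.
expected_min = probands * min_rate_percent // 100

# Assumption: the upper bound of 100 patients is read as an implied
# detection rate; the abstract does not state this rate explicitly.
expected_max = 100
implied_max_rate = expected_max / probands

print(f"expected minimum: {expected_min} patients")
print(f"implied upper detection rate: {implied_max_rate:.1%}")
```

Under these figures, the 70-100 range corresponds to a detection rate of roughly 10% to 14%.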
We will distinguish CNV found in normal individuals from potentially disease-associated changes, and will confirm CNV using fluorescence in situ hybridization (FISH) and microsatellite (STRP) analysis. We will give highest priority to CNV that are de novo and involve 2 or more BACs, and secondary priority to familial and smaller CNV excluding known polymorphisms. After that, we will evaluate and rank candidate genes in the critical regions using information from public databases and our own expression studies, and perform mutation analysis of the best candidate genes from well-defined critical regions by sequencing in a large panel of subjects with phenotypes that match the phenotypes of the patients whose CNV define the critical regions. Here, we will use more refined criteria to supplement our clinical classification, such as the developmental level and presence of epilepsy or other birth defects. Any abnormalities found will be analyzed using existing data regarding polymorphisms (e.g. dbSNP), cross-species comparisons, and functional assays appropriate for the specific sequence change. Study 2A In 1995, we described a novel multiple congenital anomaly syndrome associated with facial dysmorphism (congenital ptosis, high arched eyebrows, shallow orbits, trigonocephaly), colobomas of the eyes, neuronal migration malformation (frontal predominant lissencephaly) and variable hearing loss. We hypothesized that the syndrome results from de novo mutations and used trio-based exome sequencing to identify de novo mutations in the ACTB and ACTG1 genes. Study 2B In 1997 and 2004, we and others defined two novel developmental syndromes associated with markedly enlarged brain size, or megalencephaly, and other highly recognizable features.
The megalencephaly-capillary malformation syndrome (MCAP) consists of megalencephaly and associated growth dysregulation with variable asymmetry, developmental vascular anomalies, distal limb malformations, variable cortical malformation, and a mild connective tissue dysplasia. The megalencephaly-polymicrogyria-polydactyly-hydrocephalus syndrome (MPPH) resembles MCAP but lacks vascular malformations and syndactyly. We hypothesized that MCAP and MPPH result from mutations - including postzygotic events - in the same pathway, and studied them together. Using a combination of exome sequencing, Sanger sequencing, restriction-enzyme assays, and targeted ultra-deep sequencing in 50 families with MCAP or MPPH, we identified de novo germline or postzygotic mutations in three core components of the phosphatidylinositol-3-kinase/AKT pathway. These include two mutations in AKT3, a recurrent mutation in PIK3R2, and multiple mostly postzygotic mutations in PIK3CA (Rivière JB, Mirzaa GM, O'Roak BJ, Beddaoui M, Alcantara D, Conway RL, St-Onge J, Schwartzentruber JA, Gripp KW, Nikkel SM, Worthylake T, Sullivan CT, Ward TR, Butler HE, Kramer NA, Albrecht B, Armour CM, Armstrong L, Caluseriu O, Cytrynbaum C, Drolet BA, Innes AM, Lauzon JL, Lin AE, Mancini GMS, Meschino WS, Reggin JD, Saggar AK, Lerman-Sagie T, Uyanik G, Weksberg R, Zirn B, Beaulieu CL, FORGE Canada Consortium, Majewski J, Bulman DE, O'Driscoll M, Shendure J, Graham Jr. JM, Boycott KM, Dobyns WB. De novo germline and postzygotic mutations in AKT3, PIK3R2 and PIK3CA cause a spectrum of related megalencephaly syndromes. Nat. Genet. In press). Study 3 2R01-NS046616 (PI: GOLDEN, Jeffrey A) The role of ARX in normal and abnormal brain development This subcontract from the Children's Hospital of Philadelphia to the University of Chicago (UC) is intended to support research studies of the ARX and functionally related genes in human subjects with any one of several specific developmental disorders. 
The Co-investigator at UC (W.B. Dobyns) will identify a series of patients with mental retardation and severe infantile epilepsy, some of whom will have specific brain malformations and others who will have normal brain structure by brain imaging studies, and collect research samples from these subjects with informed consent. The studies to be performed will include mutation analysis of ARX, mutation analysis of specific downstream target genes, X inactivation studies in humans and X inactivation studies in mutant mice. The results will be analyzed to determine the significance of any changes found in the gene.
Chondrosarcoma (CHS) is a heterogeneous collection of malignant bone tumours and is the second most common primary malignancy of bone after osteosarcoma. Recent work has identified frequent, recurrent mutations in IDH1/2 in nearly half of central CHS. However, there has been little systematic genomic analysis of this tumour type and thus the contribution of other genes is unclear. Here we report comprehensive genomic analyses of 49 cases of CHS. We identified hypermutability of the major cartilage collagen COL2A1 with insertions, deletions and rearrangements identified in 37% of cases. The patterns of mutation were consistent with selection for variants likely to impair normal collagen biosynthesis. In addition we identified mutations in IDH1/2 (59%), TP53 (20%), the RB1 pathway (27%) and hedgehog signaling (22%).
Tumor samples were collected from a patient with synovial sarcoma, which acquired resistance to ACT targeting NY-ESO-1. Biopsies (n=3; primary, metastasis, and recurrence) were subjected to bulk tumor DNA and RNA sequencing, as well as high-dimensional spatial profiling of RNA and protein targets. Bulk tumor whole exome and RNA sequencing corresponding to all three tumor specimens and whole exome sequencing from patient-matched normal blood are made available through this accession.
In this study, sequencing data (WES, WGS, linked-read WGS) was used to identify candidate causal germline variants in a family with inherited cholangiocarcinoma. Candidate causal SNVs and indels were identified from the germline WES data of eight siblings (four affected, four unaffected), then somatic second hits were identified from matched tumor/normal pairs. Second hits were verified using haplotype information derived from linked-read WGS of the tumor data.
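The abstract does not state how candidate germline variants were selected from the eight siblings. One common approach is a segregation filter that keeps variants carried by every affected sibling and by no unaffected sibling. The sketch below illustrates that idea under a strong assumption of complete penetrance. The function name, sibling IDs and variant IDs are all hypothetical, and a real analysis would typically add frequency, quality and functional-impact filters.

```python
from typing import Dict, Iterable, Set


def segregating_variants(
    carriers: Dict[str, Set[str]],
    affected: Iterable[str],
    unaffected: Iterable[str],
) -> Set[str]:
    """Return variants carried by all affected and no unaffected siblings."""
    affected = list(affected)
    unaffected = list(unaffected)
    if not affected:
        return set()
    # Variants shared by every affected sibling.
    shared = set.intersection(*(carriers[s] for s in affected))
    # Variants observed in at least one unaffected sibling.
    seen_in_unaffected = set().union(*(carriers[s] for s in unaffected))
    return shared - seen_in_unaffected


# Hypothetical toy data: sibling ID -> set of variant IDs carried.
carriers = {
    "aff1": {"varA", "varB"},
    "aff2": {"varA", "varB", "varC"},
    "aff3": {"varA", "varB"},
    "aff4": {"varA", "varB"},
    "unaff1": {"varB"},
    "unaff2": {"varC"},
    "unaff3": set(),
    "unaff4": {"varB", "varD"},
}
affected = ["aff1", "aff2", "aff3", "aff4"]
unaffected = ["unaff1", "unaff2", "unaff3", "unaff4"]

candidates = segregating_variants(carriers, affected, unaffected)
print(sorted(candidates))
```

In this toy example only `varA` survives, because `varB` also appears in unaffected siblings. Relaxing the filter, for example by tolerating one unaffected carrier, would be one way to allow for incomplete penetrance.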
We have developed FusionSeq to identify fusion transcripts from paired-end RNA-sequencing. FusionSeq includes filters to remove spurious candidate fusions with artifacts such as misalignments or random pairing of transcript fragments and it ranks candidates according to several statistics. It also has a module to identify exact sequences at breakpoint junctions. FusionSeq detected known and novel fusions in a specially sequenced calibration data set, including 8 cancers with and without known rearrangements.
The current data pertain to RNA-sequencing reads obtained from thyroid samples acquired from fetuses with Down syndrome and fetuses with no genetic/developmental abnormality. Total RNA was isolated from the left lobe of thyroid samples using a hand-held homogenizer and the Promega ReliaPrep RNA Miniprep System (Thermo Fisher Scientific). RNA yield was determined with the NanoDrop Microvolume Spectrophotometer (Thermo Fisher Scientific). Fragmentation and mRNA library preparation were performed using the Kapa mRNA Hyperprep Kit (Roche, Basel, Switzerland). Libraries were pooled in equimolar amounts and quality was checked on a TapeStation system using the DNA1000 ScreenTape (Agilent Technologies, Santa Clara, CA, USA). Libraries were sequenced with poly(A) selection to sequence all messenger RNA for gene expression analysis on the NovaSeq6000 PE150 (Illumina, San Diego, CA, USA), producing at least 40M 150-bp paired-end reads per library.
To investigate the influence of lifelong exercise training on the response of skeletal muscle to a bout of acute exercise, we generated targeted epigenomic data from long-term endurance (8 men) and strength (8 men) trained individuals and healthy age-matched untrained controls (8 men). Skeletal muscle biopsies were taken from M. vastus lateralis before, directly after, and 3 h following acute exercise. Control subjects completed one bout of acute endurance exercise and one bout of acute resistance exercise, separated by 4-8 weeks; athletes completed one bout in their respective form of sports. All 96 samples were used for DNA extraction and targeted library construction using a custom Twist Biosciences panel and, following EM-methylation transformation, were sequenced (2x150 bp paired end) on the Illumina NovaSeq 6000.
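The total of 96 samples follows from the design as described. The short sketch below reproduces the count, assuming every subject contributed a biopsy at all three time points of each bout, with no dropouts. The group labels and the helper function are hypothetical names chosen for illustration.

```python
# Time points per exercise bout, as described in the abstract.
TIMEPOINTS = ("before", "directly_after", "3h_after")


def count_biopsies(n_subjects: int, n_bouts: int) -> int:
    """Biopsies for one group: subjects x bouts x time points."""
    return n_subjects * n_bouts * len(TIMEPOINTS)


# Group sizes and bouts per subject taken from the abstract:
# athletes completed one bout, controls completed two.
design = {
    "endurance_trained": {"n_subjects": 8, "n_bouts": 1},
    "strength_trained": {"n_subjects": 8, "n_bouts": 1},
    "untrained_controls": {"n_subjects": 8, "n_bouts": 2},
}

per_group = {group: count_biopsies(**spec) for group, spec in design.items()}
total = sum(per_group.values())

print(per_group)
print("total samples:", total)
```

Under these assumptions the athletes contribute 24 samples per group and the controls 48, which matches the stated total of 96.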