To elucidate the timing and mechanism of the clonal expansion of somatic mutations in cancer-associated genes in the normal endometrium, we conducted target sequencing of 112 genes for 1,298 endometrial glands and matched blood samples from 36 women. By collecting endometrial glands from different parts of the endometrium, we showed that multiple glands with the same somatic mutations occupied substantial areas of the endometrium. The 112 genes are as follows: ABCC1, ACRC, ANK3, ARHGAP35, ARID1A, ARID5B, ATCAY, ATM, ATR, BARD1, BCOR, BRCA1, BRCA2, BRD4, BRIP1, CAMTA1, CDC23, CDYL, CFAP54, CHD4, CHEK1, CHEK2, CTCF, CTNNB1, CUX1, DGKA, DISP2, DYNC2H1, EMSY, FAAP24, FAM135B, FAM175A, FAM65C, FANCA, FANCB, FANCC, FANCD2, FANCE, FANCF, FANCG, FANCI, FANCL, FANCM, FAT1, FAT3, FBN2, FBXW7, FGFR2, FRG1, GPR50, HEATR1, HIST1H4B, HNRNPCL1, HOOK3, KIAA1109, KIF26A, KMT2B, KMT2C, KRAS, LAMA2, LRP1B, MLH1, MON2, MRE11A, MSH2, MSH6, MTOR, NBN, PALB2, PHEX, PIK3CA, PIK3R1, PLXNB2, PLXND1, PMS2, POLE, POLR3B, PPP2R1A, PTEN, PTPN13, RAD50, RAD51, RAD51B, RAD51C, RAD51D, RAD52, RAD54B, RAD54L, RICTOR, SACS, SIGLEC9, SLC19A1, SLX4, SPEG, STT3A, TAF1, TAF2, TAS2R31, TFAP2C, TNC, TONSL, TP53, TTC6, UBA7, VNN1, WT1, XIRP2, ZBED6, ZC3H13, ZFHX3, ZFHX4, ZMYM4.
Helios, encoded by IKZF2, is a member of the Ikaros family of transcription factors with pivotal roles in T-follicular helper, NK and T-regulatory cell physiology. Somatic IKZF2 mutations are frequently found in lymphoid malignancies. Although germline mutations in IKZF1 and IKZF3, encoding Ikaros and Aiolos, have recently been identified in patients with phenotypically similar immunodeficiency syndromes, the effect of germline mutations in IKZF2 on human hematopoiesis and immunity remains enigmatic. We identified germline IKZF2 mutations (one nonsense variant, p.R291X, and four distinct missense variants) in six patients with systemic lupus erythematosus, immune thrombocytopenia or EBV-associated hemophagocytic lymphohistiocytosis. Patients exhibited hypogammaglobulinemia and decreased numbers of T-follicular helper and NK cells. Single-cell RNA sequencing of PBMCs from the patient carrying the R291X variant revealed upregulation of pro-inflammatory genes associated with T-cell receptor activation and T-cell exhaustion. Functional assays revealed the inability of HeliosR291X to homodimerize and bind target DNA as dimers. Moreover, proteomic analysis by proximity-dependent biotin identification (BioID) revealed aberrant interaction of 3/5 Helios mutants with core components of the NuRD complex, conveying Helios-mediated epigenetic and transcriptional dysregulation.
The extensive primary and secondary drug resistance in many cancer types requires rational approaches to design personalized and selective combinatorial therapies that not only show a synergistic effect in overall cancer cell killing but also result in minimal toxic side effects on non-malignant cells. To address the combinatorial explosion in the number of relevant combinations, we implemented a machine learning approach that prioritizes patient-customized drug combinations with a desired synergy-efficacy-toxicity balance by combining single-cell RNA-sequencing with ex vivo single-agent testing in scarce patient-derived primary cells. When applied to two diagnostic and two refractory AML patient cases, each with a different genetic background, our integrated approach accurately predicted patient-specific combinations that not only resulted in synergistic cancer cell co-inhibition but were also capable of targeting specific AML cell subpopulations that emerge at differing stages of disease pathogenesis or treatment regimens. Our data-driven approach provides an unbiased means for systematic identification of personalized combinatorial regimens that selectively co-inhibit leukemic cells while avoiding inhibition of non-malignant cells, and highlights the relevance of considering cell heterogeneity for personalized cancer therapy.
Transcriptional deregulation is a central event in the development of acute myeloid leukemia (AML). To identify potential disturbances in gene regulation, we conducted an unbiased screen of allele-specific expression (ASE) in 209 AML cases. The gene encoding GATA binding protein 2 (GATA2) displayed ASE more often than any other myeloid or cancer-related gene. GATA2 ASE was strongly associated with CEBPA double mutations (CEBPA DM), with 95% of cases presenting GATA2 ASE. In CEBPA DM AML with GATA2 mutations, the mutated allele was preferentially expressed. We found that GATA2 ASE is a somatic event lost in complete remission, supporting the notion that it plays a role in CEBPA DM AML. Acquisition of GATA2 ASE involved silencing of one allele via promoter methylation, compensated by overactivation of the other allele, thereby preserving expression levels. Notably, promoter methylation was also lost in remission together with GATA2 ASE. In summary, we propose that GATA2 ASE is acquired by epigenetic mechanisms and is a prerequisite for the development of AML with CEBPA DM. This finding constitutes a novel example of an epigenetic hit cooperating with a genetic hit in the pathogenesis of AML.
To perform a comprehensive genomic characterization of 70 patients suffering from cancer of unknown primary (CUP), we used whole-exome, whole-genome, transcriptome and methylome analysis. We detected substantial mutational heterogeneity, with the genes most commonly affected by SNVs, indels and fusions being TP53, TTN, MUC16, ABCA13, COL6A3, KRAS, LRP1B, XIRP2 and CSMD3. The most common fusion involved FGFR2, and the most common focal deletion affected CDKN2A. A molecular tumor board recommended genomics-based therapies in 56/70 (80%) patients, and these were applied in 20/56 (35.7%) cases. Entity predictions based on transcriptome and methylome data could be made in up to 62/70 (88.6%) cases but were conclusive in only 16/48 (33.3%) cases. Germline analysis revealed 6 (likely) pathogenic mutations in 5 patients. Recommended therapies translated into a mean PFS2/1 ratio of 3.61 (median=2.25), with a median PFS1 of 89 days (n=17) compared to a median PFS2 of 182.5 days (n=20). Our data emphasize the clinical benefit of comprehensive genetic approaches in diagnostic and therapeutic management and underline the need for innovative, mechanism-based clinical trials in this heterogeneous group of diseases.
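The PFS2/1 ratio above can be illustrated with a minimal sketch, assuming that PFS1 is the progression-free survival on the preceding therapy line and PFS2 that on the recommended therapy, both in days. The function name and all values below are hypothetical and are not study data. Because the ratio is computed per patient, its mean or median need not equal the ratio of the two group medians.

```python
from statistics import mean, median


def pfs_ratios(pairs):
    """Return per-patient PFS2/PFS1 ratios for (pfs1_days, pfs2_days) pairs.

    Pairs with a non-positive PFS1 are skipped because the ratio is undefined.
    """
    return [pfs2 / pfs1 for pfs1, pfs2 in pairs if pfs1 > 0]


# Hypothetical (PFS1, PFS2) values in days, for illustration only.
example_pairs = [(60, 240), (90, 180), (120, 150), (45, 45)]

ratios = pfs_ratios(example_pairs)
summary = {"mean": mean(ratios), "median": median(ratios)}
print(ratios, summary)
```

A ratio above 1 means the patient stayed progression-free longer on the recommended therapy than on the preceding one.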
Fusion genes arising from cancer-associated somatic mutations are a potentially rich source of highly immunogenic neo-antigens. However, their exploitation as targets for personalized cancer immunotherapy is currently limited by the lack of computational tools allowing transcriptome-wide identification of unique fusion genes in an accurate and sensitive manner. Here, we present EasyFuse, a computational pipeline to detect individual and cancer-specific fusion genes in next-generation-sequencing transcriptome data obtained from human cancer samples. Using machine learning, EasyFuse predicts personal fusion genes with high precision and sensitivity and outperforms previously described approaches, as assessed against an unprecedented ground-truth dataset of >1500 verification experiments in relevant patient samples. By testing immunogenicity with autologous blood lymphocytes from cancer patients, we detected pre-established CD4+ and CD8+ T cell responses for 10 of 21 (48%) and 1 of 30 (3%) identified fusion genes, respectively. In conclusion, we demonstrate accurate detection of cancer-specific fusion genes. The high frequency of T cell responses detected in cancer patients supports the relevance of private fusion genes as neo-antigens for personalized immunotherapies, especially for tumors with low point mutation burdens.
Genome-wide analysis of cell-free DNA (cfDNA) methylation profiles has been recognized as a promising approach for sensitive and specific detection of many cancers. However, scaling such genome-wide assays for clinical translation is impractical due to the high cost of whole genome bisulfite sequencing. We have shown that the small GC-rich fraction of the genome is highly enriched in CpG sites and harbors a disproportionate share of cancer-specific methylation signatures. Here, we report on the simple but effective Heat enrichment of CpG-rich regions for Bisulfite Sequencing (Heatrich-BS) platform that allows for focused methylation profiling in these highly informative regions. Our novel method and bioinformatics algorithm enable accurate tumor burden estimation with high sensitivity and quantitative tracking of colorectal cancer patients’ response to treatment, at a much reduced sequencing cost suitable for frequent monitoring. We also show, for the first time, tumor epigenetic subtyping from cfDNA using Heatrich-BS, which could enable patient stratification from non-invasive liquid biopsy. As such, Heatrich-BS holds great potential for highly scalable screening and regular monitoring of cancer using liquid biopsy.
Chromosomal instability is a major challenge to patient stratification and targeted drug development for high-grade serous ovarian carcinoma (HGSOC). Here we show that somatic copy number alterations (SCNAs) in frequently amplified HGSOC cancer genes significantly correlate with gene expression and methylation status. We identified five prevalent clonal driver SCNAs (chromosomal amplifications encompassing MYC, PIK3CA, CCNE1, KRAS and TERT) from multi-regional HGSOC data and reasoned that their strong selection should prioritise them as key biomarkers for targeted therapies. We used primary HGSOC spheroid models to test interactions between in vitro targeted therapy and SCNAs. MYC chromosomal copy number was associated with in vitro and clinical response to paclitaxel and with in vitro response to mTORC1/2 inhibition. Activation of the mTOR survival pathway in the context of MYC-amplified HGSOC was statistically associated with increased prevalence of SCNAs in genes from the PI3K pathway. Co-occurrence of amplifications in MYC and genes from the PI3K pathway was independently observed in squamous lung cancer and triple negative breast cancer. These results suggest that identifying co-occurrence of clonal driver SCNA genes could be used to tailor therapeutics for precision medicine.
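One way to test whether two amplifications co-occur more often than expected by chance is a one-sided Fisher's exact test on a 2x2 table of sample counts. The abstract does not name the statistical test that was used, so the choice of test, the function name and the counts below are assumptions made for illustration only.

```python
from math import comb


def fisher_exact_greater(both, only_a, only_b, neither):
    """One-sided Fisher's exact p-value for enrichment of co-occurrence.

    The 2x2 table counts samples with both amplifications, only
    amplification A, only amplification B, or neither. The p-value is the
    hypergeometric probability of seeing at least `both` co-amplified
    samples when the row and column totals are held fixed.
    """
    with_a = both + only_a
    with_b = both + only_b
    total = both + only_a + only_b + neither
    upper = min(with_a, with_b)
    favourable = sum(
        comb(with_a, k) * comb(total - with_a, with_b - k)
        for k in range(both, upper + 1)
    )
    return favourable / comb(total, with_b)


# Hypothetical counts: 12 samples with both amplifications, 3 with only the
# first, 4 with only the second, 21 with neither.
p_value = fisher_exact_greater(12, 3, 4, 21)
print(f"one-sided p = {p_value:.4g}")
```

A small p-value would indicate that the two amplifications appear together more often than independent events would. With many gene pairs, a multiple-testing correction would also be needed.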
The Mutographs project aims to advance our understanding of the causes of cancer through studies of mutational signatures. Led by Mike Stratton, together with Paul Brennan, Ludmil Alexandrov, Allan Balmain, David Phillips and Peter Campbell, this large-scale international research endeavour was awarded a Cancer Research UK Grand Challenge. Different patterns of somatic mutation are generated by the different environmental, lifestyle and genetic factors that cause cancer, many of which are still unknown. Within Mutographs, King's College London will characterise the mutational signatures induced by putative human carcinogens in order to identify the origins of mutational signatures found in human cancers. To achieve this, human organoid cell cultures will be exposed to a representative catalogue of known or suspected human carcinogens and mutagens and, using whole genome sequencing, the patterns of mutations induced by them will be determined. Somatic mutational signatures will subsequently be extracted by non-negative matrix factorisation methods and correlated with exposure data. Through an enhanced understanding of cancer aetiology, the unprecedented Mutographs effort is anticipated to outline modifiable risk factors, lead to new approaches to prevent cancer, and provide opportunities to empower early detection, refine high-risk groups and contribute to further therapeutic development.
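The signature extraction step can be sketched as a non-negative matrix factorisation of a mutation count matrix V (mutation channels by samples) into signatures W and exposures H, so that V is approximately W times H. The project description does not specify which factorisation algorithm is used; the sketch below assumes the classic multiplicative-update rule for squared error, and the toy matrix, function names and parameters are hypothetical.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def frobenius_error(v, w, h):
    """Squared reconstruction error between V and the product W H."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factorise non-negative V into W (channels x rank) and H (rank x samples)."""
    rng = random.Random(seed)
    n_rows, n_cols = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n_rows)]
    h = [[rng.random() + eps for _ in range(n_cols)] for _ in range(rank)]
    for _ in range(iterations):
        # Update exposures: H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(n_cols)]
             for k in range(rank)]
        # Update signatures: W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n_rows)]
    return w, h


# Toy count matrix: 5 mutation channels x 4 samples, built from two
# hypothetical signatures (real catalogues typically use 96 channels).
toy_counts = [
    [10, 0, 5, 20],
    [20, 0, 10, 40],
    [0, 3, 3, 0],
    [0, 9, 9, 0],
    [10, 6, 11, 20],
]

signatures, exposures = nmf(toy_counts, rank=2)
print(frobenius_error(toy_counts, signatures, exposures))
```

In this sketch each column of `signatures` is one extracted pattern and each column of `exposures` gives the contribution of every pattern to one sample. Correlating those exposures with the known treatment of each culture corresponds to the final step described above.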
We designed a comprehensive multiple myeloma (MM) targeted sequencing panel to identify common genomic abnormalities in a single assay and validated it against known standards. The panel comprised 228 genes/exons for mutations, 6 regions for translocations, and 56 regions for copy number abnormalities (CNAs). Toward panel validation, targeted sequencing was conducted on 233 patient samples and further validated using clinical fluorescence in situ hybridization (FISH) (translocations), multiplex ligation-dependent probe amplification (MLPA) (CNAs), whole genome sequencing (WGS) (CNAs, mutations, translocations) or droplet digital PCR (ddPCR) of known standards (mutations). Canonical IgH translocations were detected in 43.2% of patients by sequencing, and the results agreed with FISH in all but one patient. CNAs determined by sequencing and MLPA for 22 regions were comparable in 103 samples, and concordance between platforms was R²=0.969. VAFs for 74 mutations were compared between sequencing and ddPCR, with a concordance of R²=0.9849. In summary, we have developed a targeted sequencing panel that is as robust as, or superior to, FISH and WGS. This molecular panel is cost effective, comprehensive, clinically actionable and can be routinely deployed to assist risk stratification at diagnosis or post-treatment to guide sequencing of therapies.
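The concordance values above can be illustrated with a short sketch, assuming they are squared Pearson correlation coefficients between paired measurements from two platforms; the abstract does not state how they were computed. The function name and the variant allele fractions below are hypothetical.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


# Hypothetical variant allele fractions for the same mutations measured by
# targeted sequencing and by ddPCR (illustrative values only).
vaf_sequencing = [0.05, 0.12, 0.25, 0.40, 0.48]
vaf_ddpcr = [0.06, 0.11, 0.27, 0.38, 0.50]

concordance = r_squared(vaf_sequencing, vaf_ddpcr)
print(f"R^2 = {concordance:.4f}")
```

A high value shows only that the two platforms are linearly related. It does not rule out a systematic offset between them, which would require comparing the paired differences directly.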