DAC

WTSI CGP Data access committee

Dac ID Contact Person Email Access Information
EGAC00001000000 Data Sharing datasharing [at] sanger [dot] ac [dot] uk https://edam.sanger.ac.uk/

This DAC controls 395 datasets:

Dataset ID Description Technology Samples
EGAD00000000051 Sequencing data from matching Renal Carcinoma samples Illumina Genome Analyzer II 25
EGAD00000000052 Sequencing data from matching Pancreatic Carcinoma samples Illumina Genome Analyzer II 25
EGAD00000000053 Sequencing data from Breast Cancer samples Illumina Genome Analyzer II 1
EGAD00000000054 NCI-H209 is an immortal cell line derived from a bone marrow metastasis of a patient with small cell lung cancer, taken before chemotherapy. The specimen showed histologically typical small cells with classic neuroendocrine features. NCI-BL209 is an EBV-transformed B-cell line derived from the same patient as the small cell lung cancer cell line, NCI-H209 Life Tech - Solid 1
EGAD00000000055 COLO-829 is a publicly available immortal cancer cell line and COLO-829BL is a lymphoblastoid cell line derived from the same patient Illumina Genome Analyzer II 2
EGAD00001000001 Exome sequencing identifies frequent mutation of the SWI/SNF complex gene PBRM1 in renal carcinoma Illumina Genome Analyzer II 18
EGAD00001000002 Massive genomic rearrangement acquired in a single catastrophic event during cancer development Illumina Genome Analyzer,Illumina Genome Analyzer II 11
EGAD00001000004 CLL cancer Sample Sequencing Illumina Genome Analyzer,Illumina Genome Analyzer II 5
EGAD00001000005 Various Cancer Fusion Gene Sequencing Illumina Genome Analyzer II 14
EGAD00001000007 Osteosarcoma Sequencing Illumina Genome Analyzer II 43
EGAD00001000013 CLL Cancer Whole Genome Sequencing Illumina Genome Analyzer II 19
EGAD00001000014 Agilent whole exome hybridisation capture will be performed on genomic DNA derived from 25 renal cancers and matched normal DNA from the same patients. Three lanes of Illumina GA sequencing will be performed on the resulting 50 exome libraries and mapped to build 37 of the human reference genome to facilitate the identification of novel cancer genes. Illumina Genome Analyzer II 54
EGAD00001000050 Tandem duplication of chromosomal segments is common in ovarian and breast cancer genomes Illumina Genome Analyzer II 13
EGAD00001000062 ADCC Rearrangement Screen Illumina Genome Analyzer II,Illumina HiSeq 2000 14
EGAD00001000064 Cell Line Sub Clone Rearrangement Screen Illumina Genome Analyzer II 6
EGAD00001000065 Mixed Leukemia Rearrangement Screen Illumina Genome Analyzer II 5
EGAD00001000066 Breast Cancer Follow Up Series Illumina Genome Analyzer II 288
EGAD00001000067 Cancer Single Cell Sequencing Illumina HiSeq 2000 16
EGAD00001000068 Multifocal Breast Project Illumina Genome Analyzer II,Illumina HiSeq 2000 22
EGAD00001000069 Lung Rearrangement Study Illumina HiSeq 2000 48
EGAD00001000070 TMD_AMLK Exome Study Illumina HiSeq 2000 50
EGAD00001000071 Kaposi sarcoma exome Illumina HiSeq 2000 20
EGAD00001000072 Fanconi Anemia transformation to AML Illumina HiSeq 2000 6
EGAD00001000073 MDSMPN Rearrangement Screen Illumina HiSeq 2000 11
EGAD00001000074 Integrative Oncogenomics of Multiple Myeloma Illumina Genome Analyzer II,Illumina HiSeq 2000 174
EGAD00001000075 Gastric and Esophageal tumour rearrangement screen Illumina HiSeq 2000 32
EGAD00001000076 CRLF2 sequencing project Illumina HiSeq 2000 13
EGAD00001000077 CRLF2 sequencing project Exomes Illumina HiSeq 2000 26
EGAD00001000078 ALK inhibitors in the context of ALK-dependent cancer cell lines Illumina HiSeq 2000 16
EGAD00001000079 PREDICT Illumina HiSeq 2000 186
EGAD00001000080 Genomics of Colorectal Cancer Metastases - Massively Parallel Sequencing of Matched Primary and Metastatic tumours to Identify a Metastatic Signature of Somatic Mutations (MOSAIC) Illumina HiSeq 2000 351
EGAD00001000081 Splenic Marginal Zone Lymphoma with villous lymphocytes exome sequencing Illumina HiSeq 2000 1
EGAD00001000082 20 Matched Pair Breast Cancer Genomes Illumina Genome Analyzer II,Illumina HiSeq 2000 42
EGAD00001000084 Matched Ovarian Cancer Sequencing Illumina Genome Analyzer II 23
EGAD00001000089 Acute Lymphoblastic Leukemia Exome sequencing Illumina Genome Analyzer II 20
EGAD00001000090 Glioma cell lines rearrangement screen Illumina Genome Analyzer II 3
EGAD00001000091 Non Tumour Renal Cell Line Sequencing Illumina Genome Analyzer II 1
EGAD00001000092 Cancer Exome Resequencing Illumina Genome Analyzer II 58
EGAD00001000093 Breast Cancer Exome Resequencing Illumina Genome Analyzer II 21
EGAD00001000094 Cancer Genome Libraries Tests Illumina Genome Analyzer II 16
EGAD00001000095 Acute Myeloid Leukemia Sequencing Illumina Genome Analyzer II,Illumina HiSeq 2000 9
EGAD00001000097 Matched breast cancer fusion gene study Illumina Genome Analyzer II,Illumina HiSeq 2000 46
EGAD00001000098 FRCC Exome sequencing Illumina Genome Analyzer II 16
EGAD00001000099 Meningioma Exome Illumina Genome Analyzer II 26
EGAD00001000100 Renal Matched Pair Cell Line Exome Sequencing Illumina Genome Analyzer II 10
EGAD00001000101 ADCC Exome Sequencing Illumina Genome Analyzer II,Illumina HiSeq 2000 125
EGAD00001000104 Acute Lymphoblastic Leukemia Exome sequencing 2 Illumina Genome Analyzer II 97
EGAD00001000111 CML Discovery Project Illumina Genome Analyzer II 6
EGAD00001000112 Identifying Novel Fusion Genes in Myeloma Illumina Genome Analyzer II 6
EGAD00001000116 Acute Lymphoblastic Leukemia Sequencing Illumina Genome Analyzer II,Illumina HiSeq 2000 61
EGAD00001000119 Chordoma Exome Sequencing Illumina Genome Analyzer II,Illumina HiSeq 2000 50
EGAD00001000121 Breast Cancer Whole Genome Sequencing Illumina HiSeq 2000 6
EGAD00001000124 Sequencing Acute Myeloid Leukaemia Illumina HiSeq 2000 4
EGAD00001000125 Chondrosarcoma Exome Illumina HiSeq 2000 104
EGAD00001000127 Burden of Disease in Sarcoma Illumina HiSeq 2000 220
EGAD00001000128 Familial Thrombocytosis germline exome sequencing Illumina HiSeq 2000 4
EGAD00001000130 Breast Cancer Matched Pair Cell Line Whole Genomes Illumina HiSeq 2000 22
EGAD00001000142 Renal Follow Up Series Illumina HiSeq 2000 637
EGAD00001000143 Xenograft Sequencing Illumina HiSeq 2000 16
EGAD00001000144 Lung Cancer Whole Genomes Illumina HiSeq 2000 18
EGAD00001000145 Matched Pair Cancer Cell line Whole Genomes Illumina HiSeq 2000 58
EGAD00001000149 A Comprehensive Catalogue of Somatic Mutations from a Human Cancer Genome Illumina HiSeq 2000 2
EGAD00001000154 Single-cell genome sequencing reveals DNA-mutation per cell cycle Illumina Genome Analyzer II,Illumina HiSeq 2000 12
EGAD00001000175 Identification of SPEN as a novel cancer gene and FGFR2 as a potential therapeutic target in adenoid cystic carcinoma Illumina Genome Analyzer II 48
EGAD00001000205 BRAF and MEK resistant cell line clones Illumina HiSeq 2000 3
EGAD00001000226 Chordoma is a rare malignant bone tumor that expresses the transcription factor T. We conducted an association study of 40 patients with chordoma and 358 ancestry-matched, unaffected individuals with replication in an independent cohort. Whole-exome and Sanger sequencing of T exons reveals a strong risk association (allelic odds ratio (OR) = 4.9, P = 3.3 x 10^-11, CI = 2.9-8.1) with the common (minor allelic frequency >5%) non-synonymous SNP rs2305089 in chordoma, which is exceptional in cancer genetics. Illumina Genome Analyzer II,Illumina HiSeq 2000 18
EGAD00001000243 Melanoma-TIL Study Exomes Illumina HiSeq 2000 43
EGAD00001000245 Pulldown cytosine deaminases Illumina HiSeq 2000 20
EGAD00001000246 Integrative Oncogenomics of multiple myeloma Illumina HiSeq 2000 106
EGAD00001000247 Integrative Oncogenomics of multiple myeloma Illumina HiSeq 2000 51
EGAD00001000248 RNAseq Pulldown Illumina HiSeq 2000 6
EGAD00001000252 Evaluation of PCR library method on whole genome samples Illumina HiSeq 2000 12
EGAD00001000253 AML targeted resequencing study Illumina HiSeq 2000 0
EGAD00001000255 Testing the feasibility of genome scale sequencing in routinely collected FFPE cancer specimens versus matched fresh frozen samples Illumina HiSeq 2000 32
EGAD00001000264 Resistance towards chemotherapy is one of the main causes of treatment failure and death among breast cancer patients. The main objective of this project is to identify genetic mechanisms causing some breast cancer patients not to respond to a particular type of chemotherapy (epirubicin) while other patients respond very well to the same treatment. In the project we will perform genome / exome sequencing of a selection of breast cancer patients (n=30). These patients are drawn from a cohort where all patients have received treatment with epirubicin monotherapy before surgical removal of a locally advanced breast tumour, and where all patients have been subjected to objective evaluation of the response to the therapy. Subsequent to sequencing, we will analyse the data and compare with the clinical data for each patient (objective response to therapy). The main aim is to identify mutations that are associated with resistance to epirubicin. Identification of mutations with strong predictive value may have a direct impact on cancer treatment, since it opens the possibility for genetic testing of a tumour, and a decision on which drug is likely to work best, prior to treatment start. Illumina HiSeq 2000 29
EGAD00001000265 This Study uses a focused bespoke bait pull down library method to target findings of Chondrosarcoma whole genome and whole exome sequencing studies in order to validate findings. This method will also be used on a larger set of tumour only samples in order to find precedence of these findings in a larger set of patient samples. Illumina HiSeq 2000 0
EGAD00001000266 This Study uses a focused bespoke bait pull down library method to target findings of Osteosarcoma whole genome and whole exome sequencing studies in order to validate findings. This method will also be used on a larger set of tumour only samples in order to find precedence of these findings in a larger set of patient samples. Illumina HiSeq 2000 110
EGAD00001000267 This Study uses a focused bespoke bait pull down library method to target findings of Chordoma whole genome and whole exome sequencing studies in order to validate findings. This method will also be used on a larger set of tumour only samples in order to find precedence of these findings in a larger set of patient samples. Illumina HiSeq 2000 46
EGAD00001000273 This Study uses a focused bespoke bait pull down library method to target findings of Meningioma whole genome and whole exome sequencing studies in order to validate findings. This method will also be used on a larger set of tumour only samples in order to find precedence of these findings in a larger set of patient samples. Illumina HiSeq 2000 147
EGAD00001000287 Agilent whole exome hybridisation capture will be performed on genomic DNA derived from 25 renal cancers and matched normal DNA from the same patients. Three lanes of Illumina GA sequencing will be performed on the resulting 50 exome libraries and mapped to build 37 of the human reference genome to facilitate the identification of novel cancer genes. Illumina Genome Analyzer II 54
EGAD00001000288 Invasive lobular carcinoma (ILC) is the second most common histological subtype of breast cancer, accounting for 10-15% of cases. ILC differs from invasive ductal carcinoma (IDC) with respect to epidemiology, histology, and clinical presentation. Moreover, ILC is less sensitive to chemotherapy, more frequently bilateral, and more prone to form gastrointestinal, peritoneal, and ovarian metastases than IDC. In contrast to IDC, the prognostic value of histological grade (HG) in ILC is controversial. One of the three major components of histological grading (tubule formation) is missing in ILC, which hinders the process of grading in this histological subtype and results in the classification of approximately two thirds of ILC as HG 2. Over the last decade, a number of gene expression signatures have shed light onto breast cancer classification, allowing breast cancer care to become more personalized. With respect to the management of estrogen receptor (ER)-positive breast cancer, several gene expression signatures provide prognostic and/or predictive information beyond what is possible with current classical clinico-pathological parameters alone. Nevertheless, most studies using gene expression signatures have not considered different histologic subtypes separately. Recently, a comprehensive research program has elucidated some of the biological underpinnings of invasive lobular carcinoma. Genetic material extracted from 200 ILC tumor samples was studied using gene expression profiling, which identified ILC molecular subtypes. These proliferation-driven gene signatures of ILC appear to have prognostic significance. In particular, the Genomic Grade (GG) gene signature improved upon HG in ILC and added prognostic value to classic clinico-pathologic factors. In addition, this study demonstrated that most ILC are molecularly characterized as luminal-A (~75%), followed by luminal-B (~20%) and HER2-positive tumors (~5%). 
Moreover, we investigated the prognostic value of known gene signatures / gene modules in the same cohort of ILC. As a second step within the scope of this project, we aim to investigate the interactions between somatic ILC tumor mutations and observed transcriptome findings. To this end, we aim to perform somatic mutation analysis for the ILC tumors for which Affymetrix gene expression profiling is available. For this, we will use a gene screen assay, which specifically interrogates the mutational status of a few hundred cancer genes. We believe that this pioneering effort will be fundamental for a tailored treatment of ILC with improvement in patients' outcome. Illumina HiSeq 2000 1130
EGAD00001000289 Agilent whole exome hybridisation capture was performed on genomic DNA derived from cancer and matched normal DNA from the same patients. Next-generation sequencing was performed on the resulting exome libraries and the reads were mapped to build 37 of the human reference genome to facilitate the identification of novel cancer genes. Now we aim to re-find and validate the findings of those exome libraries using bespoke pulldown methods and sequencing the products. Illumina HiSeq 2000 12
EGAD00001000301 A couple of previously characterized and sequenced libraries will be repeated using a couple of differing size selection criteria and skim sequenced using an Illumina HiSeq. The resulting sequence will be analyzed to determine the optimal DNA library size for our specific downstream analysis. Illumina HiSeq 2000 1
EGAD00001000302 This experiment is looking at the mutational signatures generated by engineered HRAS mutations by using whole genome sequence generated on massively parallel next generation sequencers. Illumina HiSeq 2000 6
EGAD00001000324 We will sequence the RNA of lymphoblast samples, transformed with EBV, which have poikiloderma syndrome with mutations in c16orf57. The aim of the experiment is to characterise RNA structural effects in this disease. Illumina HiSeq 2000 4
EGAD00001000325 In this study, mutations present in a series of human melanomas (stage IV disease) will be determined, using autologous blood cells to obtain a reference genome. From each of the samples that are analyzed, tumour-infiltrating T lymphocytes have also been isolated. This offers a unique opportunity to determine which (fraction of) mutations in human cancer leads to epitopes that are recognized by T cells. The resulting information is likely to be of value to understand how T cell activating drugs exert their action. Illumina HiSeq 2000 22
EGAD00001000333 Cancer is driven by mutations in the genome. We will uncover the mutations that give rise to Ewing's sarcoma, a bone tumour that largely affects children. We will use second generation Illumina massively parallel sequencing, and bespoke software, to characterise the genomes and transcriptomes of Ewing's sarcoma tumours. Illumina HiSeq 2000 58
EGAD00001000337 Illumina RNA-Seq will be performed on four Ewing's sarcoma cell lines and two control cell lines. RNA was extracted from all the lines using a basic Trizol extraction protocol. Illumina HiSeq 2000 12
EGAD00001000338 We propose to definitively characterise the somatic genetics of ER+ve, HER2-ve breast cancer through generation of comprehensive catalogues of somatic mutations in breast cancer cases by high coverage genome sequencing coupled with integrated transcriptomic and methylation analyses. Illumina HiSeq 2000 3
EGAD00001000339 Multiple myeloma is an incurable plasma cell malignancy whose molecular pathogenesis is incompletely understood. We used whole exome sequencing, copy number profiling and cytogenetics to analyse 84 samples from 67 patients with myeloma. In addition to known myeloma genes, we identify new candidate genes, including truncations of SP140, ROBO1 and FAT3 and clustered missense mutations in EGR1. We find oncogenic mutations in cancer genes not previously implicated in myeloma, including SF3B1, PIK3CA and PTEN. We define diverse processes contributing to the mutational repertoire, including kataegis and somatic hypermutation. Most cases have at least one cluster of subclonal variants, including subclonal driver mutations, implying on-going tumor evolution. Serial samples revealed diverse patterns of clonal evolution, including linear evolution, differential clonal response and branching evolution. Our findings reveal the myeloma genome to be heterogeneous across patients and, within individual patients, to exhibit diversity in clonal admixture and dynamics in response to therapy. Illumina Genome Analyzer II,Illumina HiSeq 2000 154
EGAD00001000349 These samples are from locally advanced breast cancers that have been treated with epirubicin monotherapy before surgery. We will sequence some samples from patients with good response to the therapy and some with poor response to the therapy. Illumina HiSeq 2000 33
EGAD00001000350 We propose to definitively characterise the somatic genetics of a number of pediatric malignant tumours including ependymoma, high grade glioma and central nervous system primitive neuroectodermal tumours through generation of comprehensive catalogues of somatic mutations by high coverage genome sequencing. Illumina HiSeq 2000 17
EGAD00001000354 Testing the feasibility of genome-scale sequencing in routinely collected formalin-fixed paraffin-embedded (FFPE) cancer specimens versus matched fresh-frozen samples using targeted pulldown capture prior to Illumina sequencing. Illumina HiSeq 2000 81
EGAD00001000359 In this study we will sequence the transcriptome of Verified Cancer Cell lines. This will be married up to whole exome and whole genome sequencing data to establish a full catalog of the variations and mutations found. Illumina HiSeq 2000 2
EGAD00001000360 The genome-wide landscape of somatically acquired mutations in mesothelioma has not been deeply characterised to date, but advances in DNA sequencing technology now allow this to be addressed comprehensively. Harnessing massively parallel DNA sequencing platforms, we will identify somatically acquired point mutations in all coding regions of the genome from patients with mesothelioma. In addition, using paired-end sequencing, we will map copy number changes and genomic rearrangements from the same patients. Illumina HiSeq 2000 232
EGAD00001000361 This is a small pilot data set to test the feasibility of cDNA exomes across a panel of 1200 cancer cell lines. cDNA exomes, or Fus-seq, are further explained in this study's abstract. Illumina HiSeq 2000 3
EGAD00001000367 Genomic libraries (500 bps) will be generated from total genomic DNA derived from lung cancer patients and subjected to short paired end sequencing on the Illumina platform. Paired reads will be mapped to build 37 of the human reference genome to facilitate the generation of genome wide copy number information, and the identification of novel rearranged cancer genes and gene fusions. Illumina HiSeq 2000 5
EGAD00001000369 We propose to definitively characterise the somatic genetics of a number of pediatric malignant tumours including ependymoma, high grade glioma and central nervous system primitive neuroectodermal tumours through generation of comprehensive catalogues of somatic mutations by high coverage genome sequencing. Illumina HiSeq 2000 3
EGAD00001000388 Genomic libraries (500 bps) will be generated from total genomic DNA derived from lung cancer patients and subjected to short paired end sequencing on the Illumina platform. Paired reads will be mapped to build 37 of the human reference genome to facilitate the generation of genome wide copy number information, and the identification of novel rearranged cancer genes and gene fusions. Illumina HiSeq 2000 15
EGAD00001000389 Cancer is driven by mutations in the genome. We will uncover the mutations that give rise to Ewing's sarcoma, a bone tumour that largely affects children. We will use second generation Illumina massively parallel sequencing, and bespoke software, to characterise the genomes and transcriptomes of Ewing's sarcoma tumours. Illumina HiSeq 2000 20
EGAD00001000392 Agilent whole exome hybridisation capture was performed on genomic DNA derived from Chondrosarcoma cancer and matched normal DNA from the same patients. Next-generation sequencing was performed on the resulting exome libraries and the reads were mapped to build 37 of the human reference genome to facilitate the identification of novel cancer genes. Now we aim to re-find and validate the findings of those exome libraries using bespoke pulldown methods and sequencing the products. Illumina MiSeq 60
EGAD00001000444 Cancer is driven by mutations in the genome. We will uncover the mutations that give rise to Ewing's sarcoma, a bone tumour that largely affects children. We will use second generation Illumina massively parallel sequencing, and bespoke software, to characterise the genomes and transcriptomes of Ewing's sarcoma tumours. Illumina HiSeq 2000 3
EGAD00001000606 Background: Massively parallel sequencing technology has transformed cancer genomics. It is now feasible, in a clinically relevant time-frame, for a clinically manageable cost, to screen DNA from patient tumours for mutations essentially genome-wide. The challenge for personalised medicine will be to increase the sample size to thousands or tens of thousands of well-characterised cases in order to attain sufficient statistical power to stratify patients accurately across the complexity and genomic heterogeneity expected for most of the common tumour types. Currently, whole genome sequencing on this scale is not feasible, and targeted sequencing of relevant portions of the genome will be required. Pilot data: We have developed protocols for large-scale, multiplexed sequencing of 100-200 genes in thousands of samples. Essentially, using robotic technology, genomic DNA from the cancer specimen is processed into sequencing libraries with unique DNA barcodes, thereby allowing sequencing reads to be attributed to the sample they derive from. Currently, these sequencing libraries can be generated in a 96-well format using fully automated protocols, and we are exploring methods to expand this to a 384-well format. The sequencing libraries are pooled and hybridized to custom sets of RNA baits representing the genomic regions of interest. Sequencing of the pulled-down libraries is done in pools of 48-96 samples per lane of an Illumina HiSeq. This protocol is already implemented at the Sanger Institute. We have published proof that somatic mutations in novel cancer genes can be identified from exome-wide sequencing. In unpublished pilot data, we have established the feasibility of robotic library production, custom pull-down, and multiplexed sequencing of barcoded libraries for 100 known myeloid cancer genes across 760 myelodysplasia samples. 
Highlights of the data thus far analysed reveal that the coverage is remarkably even between samples; when 96 samples are run, average coverage per lane of sequencing is ~250, with 90-95% of targeted exons covered by >25 reads; known mutations can be discovered in the data set; and the protocol is amenable to whole genome amplified DNA. The bioinformatic algorithms for identification of substitutions and indels in pull-down data are well-established; we have pilot data proving that copy number changes, LOH and genomic rearrangements in specific regions of interest can also be identified by tiling of baits across the relevant loci. Proposal: We propose to apply this methodology to 10000 samples from patients with AML enrolled in clinical trials over the last 10-20 years. Oncogenic point mutations and potentially genomic rearrangements will be identified, and linked to clinical outcome data, with a view to undertaking the following sorts of analyses: identification of co-occurrence, mutual exclusivity and clusters of driver mutations; correlation of prognosis with driver mutations and potentially gene-gene interactions; and exploration of genomic markers of drug response. Ultimately, we would like to be in a position to release the mutation data together with matched clinical outcome data to genuine medical researchers via a controlled access approach, possibly within the COSMIC framework (www.sanger.ac.uk/genetics/CGP/cosmic/). The vision here is to generate a portal whereby a clinician faced with an AML patient and his / her mutational profile can obtain a "personalised" prediction of outcome, together with a fair assessment of the uncertainty of the estimate. With a sufficient sample size, there would also be the potential to develop decision support algorithms for therapeutic choices based on such data. Illumina MiSeq 38
EGAD00001000624 Multifocality or multicentricity in breast cancer may be defined as the presence of two or more tumor foci within a single quadrant of the breast or within different quadrants of the same breast, respectively. This original classification of the breast cancer as multicentric or multifocal was based on the assumption that cancers arising in the same quadrant were more likely to arise from the same ductal structures than those occurring in separate areas of the breast. The problem with these definitions is that the "quadrants" of the breast are arbitrary external designations, as no internal boundaries exist. This project will therefore focus both on synchronous multifocal and multicentric tumors. The incidence of multifocal and multicentric breast cancers was reported to be between 13 and 75% depending on the definition used, the extent of the pathologic sampling of the breast and whether in situ disease is considered evidence of multicentricity (1). Although this incidence is variable, those figures show that it is a frequent phenomenon. Multiple (multifocal/multicentric) breast carcinomas, especially when occurring in the same breast, represent a real challenge for both pathologists and clinicians in terms of identifying the cellular origin and the best therapeutic management of the cancer. Multifocality or multicentricity has been associated with a number of more aggressive features including an increased rate of regional lymph node metastases and adverse patient outcome when compared with unifocal tumors (2-3), and a possible increased risk of local recurrence following breast conserving surgery (4). For the moment, the literature is divided on whether there is a corresponding impact on survival outcomes. 
The current convention for staging and treating multifocal and multicentric tumors follows the classical tumor-node-metastasis (TNM) staging guidelines, in which tumor size is assessed by the largest tumor focus without taking other foci of disease into consideration. While some papers, such as the recent one from Lynch and colleagues, support the current staging convention (3), others, such as Boyages et al., suggest that aggregate size and not the size of the largest lesion should be considered in order to refine the prognostic assessment of those tumors (5). In addition, the question of whether multifocal/multicentric carcinomas are due to the spread of a single carcinoma throughout the breast or to multiple carcinomas arising simultaneously has been a matter of debate. Some studies suggested that multifocal breast cancer may result from either intramammary spread from a single primary tumor or multiple synchronous primary tumors, whereas others suggest that multiple breast carcinomas always arise from the same clone (6-8). Recently, Pietri and colleagues analyzed the biological characteristics of a series of 113 multifocal/multicentric breast cancers (8) which were diagnosed over a 5-year period. The expression of estrogen (ER) and progesterone (PgR) receptors, Ki-67 proliferative index, expression of HER2 and tumor grading were prospectively determined in each tumor focus, and mismatches among foci were recorded. Mismatches in ER status were present in 5 (4.4%) cases and PgR in 18 (15.9%) cases. Mismatches in tumor grading were present in 21 cases (18.6%), proliferative index (Ki-67) in 17 (15%) cases and HER2 status in 11 (9.7%) cases. Interestingly, this heterogeneity among foci has led to 14 (12.4%) patients receiving different adjuvant treatments compared with what would have been indicated if we had only taken into account the biologic status of the primary tumor. 
This study therefore showed that differences in biological characteristics of multifocal/multicentric lesions play a crucial role in the adjuvant treatment decision making process. In this study, we will concentrate on a larger series of patients with multifocal invasive ductal breast cancer lesions. We aim at: 1. Evaluating the incidence of multifocality according to the different breast cancer molecular subtypes (ER-/HER2-, HER2+, ER+/HER2-). 2. Evaluating the incidence of multifocality in patients with hereditary breast cancer disease (presence of germline BRCA1 or BRCA2 mutations). Moreover, we would like to investigate if multifocal lesions with BRCA1 or BRCA2 mutations exhibit a characteristic combination of substitution mutation signatures and a distinctive profile of deletions as demonstrated recently by Nik-Zainal and colleagues (9). 3. Correlating multifocality with clinical information in order to define its influence on patients' survival (DFS and OS). 4. Carrying out high-coverage targeted gene sequencing of driver cancer genes and genes whose mutation is of therapeutic importance in order to compare clinically-relevant genetic differences between several multifocal breast cancer lesions. 5. Evaluating the impact of the distance between the different lesions on the clinical outcome but also on the genetic differences. 6. Comparing gene expression patterns between several multifocal breast cancer lesions and correlating them with the results of the targeted genes screen. 7. Characterizing the genomic and transcriptomic status of cancer related genes in metastatic lesions (local recurrence, positive lymph node or distant metastatic sites) from the same multifocal invasive ductal breast cancer patients in order to evaluate the consequence of genomic and transcriptomic heterogeneity of multifocal lesions on metastatic lesions. 
Multiple (multifocal/multicentric) breast carcinomas, especially when occurring in the same breast, represent a real challenge for both pathologists and clinicians in terms of identifying the cellular origin and the best therapeutic choice. This project has the potential to identify genetic/transcriptomic differences existing between several lesions constituting multifocal breast cancers, which in the routine clinical practice are usually considered to be homogeneous among them. We foresee validating significant results in a larger series of patients and this, in turn, could have a remarkable impact on the treatment and clinical management of multifocal breast cancers. Indeed, we hope to provide some evidence whether or not each focus matters in multifocal and multicentric breast cancer to define the adequate therapeutic approach, especially in the context of targeted therapies. The work to be done at Sanger will be target gene screen pooling of 1400 samples. Illumina HiSeq 2000 908
EGAD00001000630 In this study we will sequence the transcriptome of Verified Matched Pair Cancer Cell line tumour samples. This will be married up to whole exome and whole genome sequencing data to establish a full catalog of the variations and mutations found. Illumina HiSeq 2000 7
EGAD00001000634 The ETV6-RUNX1 fusion gene, found in 25% of childhood acute lymphoblastic leukemia (ALL), is acquired in utero but requires additional somatic mutations for overt leukemia. We used exome and low-coverage whole-genome sequencing to characterize the critical secondary events associated with leukemic transformation. RAG-mediated deletions emerge as the dominant mutational process, accounting for at least 43% of genomic rearrangements and characterized by the presence of recombination signal sequence motifs near the breakpoints; incorporation of non-templated sequence at the junction and a ten-fold enrichment at promoters and enhancers of genes actively transcribed in early B-lineage development. Single-cell tracking shows that this mechanism is not restricted to one founder cell but is rather active throughout leukemic evolution. Integration of point mutation and rearrangement data identifies recurrent inactivation of ATF7IP and MGA as two new tumor suppressor genes. Thus, a remarkably parsimonious mutational process transforms ETV6-RUNX1 lymphoblasts, striking promoters and enhancers of the genes that normally control B-cell differentiation. Illumina HiSeq 2000 2
EGAD00001000635 The ETV6-RUNX1 fusion gene, found in 25% of childhood acute lymphoblastic leukemia (ALL), is acquired in utero but requires additional somatic mutations for overt leukemia. We used exome and low-coverage whole-genome sequencing to characterize the critical secondary events associated with leukemic transformation. RAG-mediated deletions emerge as the dominant mutational process, accounting for at least 43% of genomic rearrangements and characterized by the presence of recombination signal sequence motifs near the breakpoints; incorporation of non-templated sequence at the junction and a ten-fold enrichment at promoters and enhancers of genes actively transcribed in early B-lineage development. Single-cell tracking shows that this mechanism is not restricted to one founder cell but is rather active throughout leukemic evolution. Integration of point mutation and rearrangement data identifies recurrent inactivation of ATF7IP and MGA as two new tumor suppressor genes. Thus, a remarkably parsimonious mutational process transforms ETV6-RUNX1 lymphoblasts, striking promoters and enhancers of the genes that normally control B-cell differentiation. Illumina Genome Analyzer II,Illumina HiSeq 2000 50
EGAD00001000636 The ETV6-RUNX1 fusion gene, found in 25% of childhood acute lymphoblastic leukemia (ALL), is acquired in utero but requires additional somatic mutations for overt leukemia. We used exome and low-coverage whole-genome sequencing to characterize the critical secondary events associated with leukemic transformation. RAG-mediated deletions emerge as the dominant mutational process, accounting for at least 43% of genomic rearrangements and characterized by the presence of recombination signal sequence motifs near the breakpoints; incorporation of non-templated sequence at the junction and a ten-fold enrichment at promoters and enhancers of genes actively transcribed in early B-lineage development. Single-cell tracking shows that this mechanism is not restricted to one founder cell but is rather active throughout leukemic evolution. Integration of point mutation and rearrangement data identifies recurrent inactivation of ATF7IP and MGA as two new tumor suppressor genes. Thus, a remarkably parsimonious mutational process transforms ETV6-RUNX1 lymphoblasts, striking promoters and enhancers of the genes that normally control B-cell differentiation. Illumina Genome Analyzer II 117
EGAD00001000637 Insertion of processed pseudogenes is known to occur in the germline but has not previously been observed in somatic cells. Formation of pseudogenes could represent a new class of mutation in cancers and a new source of potential driver events. Illumina Genome Analyzer II,Illumina HiSeq 2000 4
EGAD00001000638 Insertion of processed pseudogenes is known to occur in the germline but has not previously been observed in somatic cells. Formation of pseudogenes could represent a new class of mutation in cancers and a new source of potential driver events. Illumina HiSeq 2000 20
EGAD00001000639 Insertion of processed pseudogenes is known to occur in the germline but has not previously been observed in somatic cells. Formation of pseudogenes could represent a new class of mutation in cancers and a new source of potential driver events. Illumina HiSeq 2000 3
EGAD00001000652 Pulldown experiments will be performed on a number of patients with Myeloproliferative Neoplasms (MPN). The pulldown will be a bespoke design targeting known mutations; it will be sequenced and analysed to inform the prevalence of mutations and the possibility of use as a diagnostic tool. Illumina HiSeq 2000 1036
EGAD00001000658 Changes in gene dosage are a major driver of cancer [1], engineered from a finite, but increasingly well annotated, repertoire of mutational mechanisms [2-6]. These processes operate over levels ranging from individual exons to whole chromosomes, often generating correlated copy number alterations across hundreds of linked genes. An example of the latter is the 2% of childhood acute lymphoblastic leukemia (ALL) characterized by recurrent intrachromosomal amplification of megabase regions of chromosome 21 (iAMP21) [7,8]. To dissect the interplay between mutational processes and selection on this scale, we used genomic, cytogenetic and transcriptional analysis, coupled with novel bioinformatic approaches, to reconstruct the evolution of iAMP21 ALL. We find that individuals born with the rare constitutional Robertsonian translocation between chromosomes 15 and 21, rob(15;21)(q10;q10)c, have ~2700-fold increased risk of developing iAMP21 ALL compared to the general population. In such cases, amplification is initiated by chromothripsis involving both sister chromatids of the dicentric Robertsonian chromosome. In contrast, sporadic iAMP21 is typically initiated by breakage-fusion-bridge (BFB) events, often followed by chromothripsis or other rearrangements. In both sporadic and iAMP21 in rob(15;21)c individuals, the final stages of amplification frequently involve large-scale duplications of the abnormal chromosome. The end-product is a derivative chromosome 21 or a derivative originating from the rob(15;21)c chromosome, der(15;21), respectively, with gene dosage optimised for leukemic potential, showing constrained copy number levels over multiple linked genes. In summary, the constitutional translocation, rob(15;21)c, predisposes to leukemia through a novel mechanism, namely a propensity to undergo chromothripsis, likely related to its dicentric nature. 
More generally, our data illustrate that several cancer-specific mutational processes, applied sequentially, can co-ordinate to fashion copy number profiles over large genomic scales, incrementally refining the fitness benefits of aggregated gene dosage changes. Illumina Genome Analyzer II,Illumina HiSeq 2000 9
EGAD00001000663 This study aims to re-sequence findings from whole genome studies using a bespoke pulldown method to validate mutations in those genomes sequenced. Illumina HiSeq 2000 47
EGAD00001000678 FFPE CPA accreditation of genome-scale sequencing in routinely collected formalin-fixed paraffin-embedded (FFPE) cancer specimens versus matched fresh-frozen samples using targeted pulldown capture prior to Illumina sequencing. Illumina HiSeq 2000 341
EGAD00001000707 Discovery of resistance mechanisms to the BRAF inhibitor vemurafenib in metastatic BRAF mutant melanoma by massively-parallel sequencing of tumour samples. Comparison of genomic characteristics of pretreatment 'sensitive' to recurrence 'resistant' tumours to identify the genetics of drug resistance. Illumina HiSeq 2000 57
EGAD00001000732 RNA sequencing to validate findings of somatic pseudogenes acquired during cancer development Illumina HiSeq 2000 3
EGAD00001000747 Genomic libraries will be generated from total genomic DNA derived from 4000 samples with Acute Myeloid Leukaemia. Libraries will be enriched for a selected panel of genes using a bespoke pulldown protocol. 64 samples will be individually barcoded and subjected to up to one lane of Illumina HiSeq. Paired reads will be mapped to build 37 of the human reference genome to facilitate the characterisation of known gene mutations in cancer as well as the validation of potentially novel variants identified by prior exome sequencing. Illumina HiSeq 2000 2734
EGAD00001000812 Sequencing of 350 cancer genes in BC samples from patients treated with either Epirubicin or Paclitaxel monotherapy in the neoadjuvant setting. Illumina HiSeq 2000 364
EGAD00001000824 RNA sequencing will be undertaken to reconstruct rearrangements at the level of transcription to determine pathognomonic genomic events in chondromyxoid fibroma. Illumina HiSeq 2000 1
EGAD00001000825 This study aims to define the landscape of somatic mutations in sun exposed human skin by deep sequencing, analyse their frequency and use the data to infer the effect of mutations on proliferating cell behaviour. The frequency of each mutation will reflect the size of the clone of cells in the tissue sample. By analyzing small samples, clones with as few as 100 cells will be detectable. Allele frequency distributions for each mutation will be used to infer cell fate using published methods (Klein et al. 2010). This study will shed unprecedented light on the early clonal events that lead to the emergence of cancer. Illumina HiSeq 2000 454
EGAD00001000847 Shwachman-Diamond syndrome (SDS) is a rare autosomal recessive disorder characterized by exocrine pancreatic insufficiency, bone marrow dysfunction, leukemia predisposition, and skeletal abnormalities. We aim to characterise the structural effects of SDS in patients with this disorder by exome sequencing. Illumina HiSeq 2000 2
EGAD00001000848 To evaluate the presence of mutations in frequently mutated genes in MPN by performing targeted resequencing of a selected gene panel comprising 111 genes across 40 samples with MPN. Illumina MiSeq 48
EGAD00001000868 FFPE CPA accreditation of genome-scale sequencing in routinely collected formalin-fixed paraffin-embedded (FFPE) cancer specimens versus matched fresh-frozen samples using targeted pulldown capture prior to Illumina sequencing. Illumina HiSeq 2000,Illumina HiSeq 2500 60
EGAD00001000869 It is the ambition of the team formed by members of the Netherlands Cancer Institute (NKI) and the Cancer Genome Project at the Wellcome Trust Sanger Institute (WTSI) to unravel the genomic and phenotypic complexity of human cancers in order to identify optimal drug combinations for personalized cancer therapy. Our integrated approach will entail (i) deep sequencing of human tumours and cognate mouse tumours; (ii) drug screens in a 1000+ fully characterized tumour cell line panel; (iii) high-throughput in vitro and in vivo shRNA and cDNA drug resistance and enhancement screens; (iv) computational analysis of the acquired data, leading to significant response predictions; (v) rigorous validation of these predictions in genetically engineered mouse models and patient-derived xenografts. This integrated effort is expected to yield a number of combination therapies and companion-diagnostics biomarkers that will be further explored in our existing clinical trial networks. Illumina HiSeq 2000 62
EGAD00001000870 Testing logistics and infrastructure of molecular screening program. Core biopsies taken from invasive recurrent or metastatic breast cancer to evaluate and identify molecular traits rendering them suitable for clinical trials Illumina HiSeq 2500 52
EGAD00001000871 The purpose of this study is to sequence 500 known cancer genes in 960 newly diagnosed high risk breast cancer patients treated with current standard of care therapies and trastuzumab, for somatic alteration and copy number changes. We will be using next gen sequencing technology to determine the prognostic relevance of these somatic genetic alterations and of the low frequency events to determine if they are associated with trastuzumab benefit or HER2 positive breast cancer, i.e. treatment interaction. The samples will be analysed and correlated with clinical variables including outcome. Illumina HiSeq 2000 993
EGAD00001000872 These samples are to be analysed with the CGP-developed cancer panel and the results will be compared with WGS data from 4 different commercial providers. Illumina HiSeq 2500 8
EGAD00001000875 The CRO7 clinical trial recruited patients with clinically operable rectal adenocarcinoma. Patients were randomized to either pre-operative short course surgery followed by chemo-radiotherapy only in those patients at high risk of local relapse. Patients in both arms then received standard 5-FU based adjuvant chemotherapy as per local policy. We intend to use FFPE derived DNA from the primary tumours to identify patterns of mutations or copy number alterations that are predictive of local or distant relapse. Illumina HiSeq 2000 330
EGAD00001000888 NSCLC WGS. AB 5500 Genetic Analyzer 4
EGAD00001000889 NSCLC targeted. Ion Torrent PGM 4
EGAD00001000894 SPECTA comprises a network of participating European clinical sites and NGS screening platforms that can screen individual patients for multiple molecular targets and potentially allow the design of trials that will match the specific biology of the diseases affecting specific patients with cancer. Illumina HiSeq 2500 64
EGAD00001000898 Cancers are ecosystems of genetically related clones, competing across space and time for limited resources. To understand the clonal structure of primary breast cancer, we applied genome and targeted sequencing to 295 samples from 49 patients’ tumors. The extent of subclonal diversification varied considerably among patients and encompassed many spatial patterns, including local growth, intraductal dissemination and clonal intermixture. Landmarks of disease progression, such as acquiring invasive or metastatic potential, arose within detectable subclones of antecedent lesions, suggesting that subclonal mutations could be relevant if actionable. No defined temporal order of mutation was evident, with the commonest genes, including PIK3CA, TP53, BRCA2, PTEN and MYC, mutated early in some, late in others, often exhibiting parallel evolution across subclones. Signatures of homologous recombination deficiency correlated with response to neoadjuvant chemotherapy. Thus, the interplay of mutation, growth and competition drives clonal structures of breast cancer that are complex, variable across patients and clinically relevant. Illumina HiSeq 2000 42
EGAD00001000899 We propose to definitively characterise the somatic genetics of Metastatic breast cancer through generation of comprehensive catalogues of somatic mutations in Metastatic breast cancer cases by high coverage genome sequencing coupled with integrated transcriptomic and methylation analyses. Illumina HiSeq 2000 41
EGAD00001000947 Genomic libraries (500 bps) will be generated from total genomic DNA derived from Colorectal cancer patients and subjected to short paired end sequencing on the Illumina platform. Paired reads will be mapped to build 37 of the human reference genome to facilitate the generation of genome wide copy number information, and the identification of novel rearranged cancer genes and gene fusions. Illumina HiSeq 2000 45
EGAD00001000948 A comparison of the somatic variation present in a primary colorectal tumour and three different liver metastases from the same patient. Illumina HiSeq 2000 6
EGAD00001000965 Cancers are ecosystems of genetically related clones, competing across space and time for limited resources. To understand the clonal structure of primary breast cancer, we applied genome and targeted sequencing to 295 samples from 49 patients’ tumors. The extent of subclonal diversification varied considerably among patients and encompassed many spatial patterns, including local growth, intraductal dissemination and clonal intermixture. Landmarks of disease progression, such as acquiring invasive or metastatic potential, arose within detectable subclones of antecedent lesions, suggesting that subclonal mutations could be relevant if actionable. No defined temporal order of mutation was evident, with the commonest genes, including PIK3CA, TP53, BRCA2, PTEN and MYC, mutated early in some, late in others, often exhibiting parallel evolution across subclones. Signatures of homologous recombination deficiency correlated with response to neoadjuvant chemotherapy. Thus, the interplay of mutation, growth and competition drives clonal structures of breast cancer that are complex, variable across patients and clinically relevant. Illumina HiSeq 2000 331
EGAD00001000980 This study involves a forward genetic screen to identify common insertion sites in drug resistant clones. We will be utilising piggybac transposon systems in order to generate multiple drug resistant clones in a range of human cancer cell lines. Illumina MiSeq 144
EGAD00001000990 mRNA-Seq on total RNA from primary osteoblastomas and phosphaturic mesenchymal tumours, focussing on fusion transcript expression Illumina HiSeq 2000 11
EGAD00001000998 Targeted capture of exonic and intronic regions of interest for the study of genomic alterations in multiple myeloma. Illumina HiSeq 2000 24
EGAD00001001014 NA Illumina HiSeq 2000 2597
EGAD00001001015 NA Illumina HiSeq 2000 76
EGAD00001001017 DNA extracted from multiple biopsies taken from different areas of primary lung tumours will be subjected to targeted re-sequencing and analysed in order to assess intra-tumour heterogeneity with respect to mutations in a selection of cancer related genes. Illumina HiSeq 2000 31
EGAD00001001018 The samples will be sequenced for a targeted panel of cancer relevant genes (n ~ 370) and analysed for somatic mutations. This dataset contains all the data available for this study on 2014-09-24 Illumina HiSeq 2000 374
EGAD00001001028 DNA belonging to 16 tumour/normal samples was treated with bisulfite, then up to 5 different bisulfite PCRs were performed in each one of the samples. Amplicons from the same sample were pooled and submitted to sequencing on a MiSeq platform. Illumina MiSeq 18
EGAD00001001039 Genomic characterisation of a large series of cancer cell lines. Illumina HiSeq 2000 1072
EGAD00001001041 Comparison of genomic rearrangements and DNA methylation patterns between different foci of multiple synchronous (multifocal and multicentric) invasive breast cancers. Illumina Genome Analyzer II,Illumina HiSeq 2000 305
EGAD00001001046 We propose to biopsy 20 consented BRAF mutant melanoma patients at Addenbrooke's Hospital pre-treatment with vemurafenib and also upon the development of resistant disease, with the aim of using exome sequence and SNP6 data to identify novel sequence variants and copy number alterations that can be used to validate observed resistance mechanisms in our cell line models and also to use these models to inform as to likely candidate small molecule inhibitors to overcome resistance and that could be tested in the clinical trial setting. Illumina HiSeq 2000 33
EGAD00001001050 We propose to biopsy 20 consented BRAF mutant melanoma patients at Addenbrooke's Hospital pre-treatment with vemurafenib and also upon the development of resistant disease, with the aim of using exome sequence and SNP6 data to identify novel sequence variants and copy number alterations that can be used to validate observed resistance mechanisms in our cell line models and also to use these models to inform as to likely candidate small molecule inhibitors to overcome resistance and that could be tested in the clinical trial setting. Illumina HiSeq 2000 8
EGAD00001001061 This experiment is to inform us of the validity of using pre-made library material to perform a bespoke pulldown experiment to validate the mutations found between the whole genome sequencing of the DNA from the same individual's cancer and normal material. This is to identify the valid and informative mutations in cancer genomes. Illumina MiSeq 4
EGAD00001001062 Patient (who has had multiple malignancies) has previously been found to harbour a pathogenic p53 variant which is probably mosaic. This finding is based on exome sequencing performed elsewhere. In this study we will resequence the locus in question to ascertain whether the variant is indeed mosaic. Illumina MiSeq 4
EGAD00001001063 Chondromyxoid fibroma is a benign tumour of bone with unknown underlying pathogenesis. To determine pathognomonic genomic events in chondromyxoid fibroma, whole genome sequencing will be undertaken to reconstruct rearrangements and find underlying mutations. Illumina HiSeq 2000 2
EGAD00001001090 This study aims to define the landscape of somatic mutations in sun exposed human skin by deep sequencing, analyse their frequency and use the data to infer the effect of mutations on proliferating cell behaviour. The frequency of each mutation will reflect the size of the clone of cells in the tissue sample. By analyzing small samples, clones with as few as 100 cells will be detectable. Allele frequency distributions for each mutation will be used to infer cell fate using published methods (Klein et al. 2010). This study will shed unprecedented light on the early clonal events that lead to the emergence of cancer. Illumina HiSeq 2000 166
EGAD00001001122 FFPE normal panel generation for use with V3 cancer panel 0618521 Illumina HiSeq 2000 94
EGAD00001001123 Deep sequencing of two skin biopsies to study the landscape of somatic mutations in human adult tissues. Illumina HiSeq 2000 2
EGAD00001001208 Targeted capture of cancer gene panel bait set in single cell derived organoids from colon tissue and colorectal cancer from 1 patient. Illumina HiSeq 2000,Illumina HiSeq 2500 105
EGAD00001001215 Targeted sequencing follow-up of genomic lesions in multiple myeloma. Illumina HiSeq 2000 424
EGAD00001001236 Targeted capture and resequencing of 94 known myeloid genes across MPN trials (PT1 and Voriconazole study) and other MPN samples. Illumina HiSeq 2000 1860
EGAD00001001237 This is a pilot project to determine whether the TAPG FFPE DNAs are suitable for deep sequencing. If successful an investigation of SNP distribution in a larger cohort will follow. Illumina HiSeq 2000 15
EGAD00001001242 Pilot study to set up sequencing protocols for targeted pulldown methylation profiling Illumina MiSeq 2
EGAD00001001265 The parent study on the genomic architecture of mesothelioma is project 925. This project is set up in parallel to project 925 in order to whole-genome sequence ten of the 59 tumours in that project. HiSeq X Ten 18
EGAD00001001266 Whole genome sequencing of primary angiosarcoma HiSeq X Ten 12
EGAD00001001267 Anaplastic meningiomas are a rare, malignant variant of meningioma. At present there is no effective treatment for this cancer. The aim of the study is to identify somatic mutations in anaplastic meningiomas. We plan to sequence a set of 500 known cancer genes in 50 anaplastic meningioma and corresponding peripheral blood DNA samples. Bioinformatics will be used to analyse the results to assess the probability of these mutations being causal and so likely of critical importance for the tumour growth. Identification of these mutations will guide selection of appropriate compounds to effectively treat the disease. HiSeq X Ten 60
EGAD00001001271 Around 50 samples of pre-invasive lung cancer lesions showing subsequent clinical and pathological progression or regression HiSeq X Ten 50
EGAD00001001330 In this experiment we have sequenced tumour normal pairs from patients presenting with CRC who have a prior history of inflammatory bowel disease. The idea is to identify driver mutations, new genes and novel pathways associated with the development of these malignancies. Illumina HiSeq 2000 70
EGAD00001001357 Genomic characterisation of a large series of cancer cell lines. Illumina HiSeq 2000 462
EGAD00001001375 Samples will be from the BRF113683 (BREAK-3) study which is a Phase III Randomized, Open-label Study Comparing GSK2118436 to Dacarbazine (DTIC) in Previously Untreated Subjects With BRAF Mutation Positive Advanced (Stage III) or Metastatic (Stage IV) Melanoma (n=250 enrolled). * NGS [Agilent capture (Sanger V2 panel): 360 genes and 20 gene fusions; Illumina HiSeq sequencing] * CNV: [via NGS or Affy SNP 6.0 or Illumina Omni (TBD)] Bioinformatics: Analysis will be performed using core Sanger informatics pipelines similar to those previously described (Papaemmanuil E et al. (2013) Blood. 22:3616-3627). Briefly, copy number analysis will be performed using the ASCAT algorithm, and base substitutions, small insertions and deletions using the CAVEMAN and Pindel algorithms, respectively. Statistical approaches including generalized linear models will be used to predict clinical variables such as maximum clinical response and duration of response using genetic data. Sanger and EBI to conduct analysis; raw data and correlation with clinical endpoints to be analyzed by both EBI/Sanger and GSK (unique pipeline analyses to increase call confidence). Illumina HiSeq 2500 169
EGAD00001001389 A genome-wide CRISPR screen was performed to find resistance to targeted drugs for melanoma and lung cancer. Illumina HiSeq 2500 15
EGAD00001001395 Background: Invasive lobular breast cancer (ILBC) is the second most common histological subtype after ductal breast cancer (IDBC). In spite of significant clinical and pathological differences, ILBC is still treated as IDBC. Here, we aimed at identifying recurrent genomic alterations in ILBC with potential clinical implications. Methods: Starting from 630 ILBC primary tumors with a median follow up of 10 years, we interrogated oncogenic substitutions and indels of 360 cancer genes and genome-wide copy number alterations in 413 and 170 ILBC samples, respectively, and correlated those findings with clinical, pathological, and outcome features. The Cancer Genome Atlas database was used for comparison of frequency estimates. Results: Besides the high mutation frequency of CDH1 in 65% of the tumors, alterations in one of the three key genes of the PI3K pathway, PIK3CA, PTEN and AKT1, were present in more than half of the cases. ERBB2 and ERBB3 were mutated in 5.1 and 3.6% of the tumors. FOXA1 mutations and ESR1 copy number gains were detected in 9% and 25% of the samples. All these alterations were more frequent in ILBC than IDBC. The histological diversity of ILBC was associated with specific genomic alterations, such as enrichment for ERBB2 mutations in the mixed, non-classic subtype, and for ARID1A mutations and ESR1 gains in the solid subtype. Finally, ERBB2 and AKT1 mutations were associated with short-term risk of relapse, and chromosome 1q and 11p gain with increased and decreased breast cancer free survival, respectively. Conclusion: ERBB2, ERBB3 and AKT1 mutations represent high prevalence therapeutic targets in ILBC. FOXA1 mutations and ESR1 gains urgently deserve dedicated clinical investigation, especially in the context of endocrine treatment. Illumina HiSeq 2000 541
EGAD00001001426 Systematic next generation sequencing efforts are beginning to define the genomic landscape across a range of primary tumours, but we know very little of the mutational evolution that contributes to disease progression. We therefore propose to obtain a comprehensive description of genomic, transcriptomic and epigenomic changes in a cohort of matched primary and metastatic colorectal cancers, and additionally to explore the extent to which those mutations identified as recurrent in the metastatic setting are able to subvert normal biological processes using both genetically engineered mouse models and established cancer cell lines. This study will enable us to define to what extent primary tumour profiling can capture the biological processes operative in matched metastases as well as the significance of intratumoural heterogeneity. This dataset contains all the data available for this study on 2015-07-02. Illumina HiSeq 2000 446
EGAD00001001427 Targeted cancer gene sequencing of samples enrolled in the SSGXVIII trial from Finland. Illumina HiSeq 2000 312
EGAD00001001428 Identification of human deubiquitylating enzymes whose knockout results in hypersensitivity to DNA damaging agents, by comparing the sequence reads of 'barcode region' from mixed cell culture. Illumina HiSeq 2000 6
EGAD00001001429 Profiling subclonal architecture and phylogeny in tumors by whole-genome sequence data mining and single-cell genome sequencing HiSeq X Ten 2
EGAD00001001430 Investigation into causal genes underlying anaplastic meningioma Illumina HiSeq 2000 73
EGAD00001001445 Deep sequencing of melanoma for driver mutations Illumina MiSeq 3
EGAD00001001446 Genomic and transcriptomic characterization of drug-resistant colon cancer stem cell lines. Illumina HiSeq 2000 4
EGAD00001001447 Whole genome sequencing of single cell derived organoids from normal colon tissue and colorectal cancer. HiSeq X Ten 73
EGAD00001001448 Testing the feasibility of genome-scale sequencing in routinely collected formalin-fixed paraffin-embedded (FFPE) cancer specimens versus matched fresh-frozen samples using targeted pulldown capture prior to Illumina sequencing. Illumina MiSeq 11
EGAD00001001450 This study is to ascertain whether it is feasible to extract a single cell from a tumour, perform amplification, generate a library and sequence a targeted pulldown. Illumina HiSeq 2000 3
EGAD00001001453 The project is to evaluate the genomic binding sites of the histone demethylase JARID1C. This gene was recently identified in CGP as a novel recessive cancer gene in human renal cell carcinoma. Illumina Genome Analyzer II 4
EGAD00001001458 Whole genome sequencing of EBV-transformed B cells in order to determine whether EBV induction of activation-induced cytidine deaminase (AID) produces genome-wide mutations and/or chromosomal rearrangements. HiSeq X Ten 12
EGAD00001001459 Transcriptome sequencing of tumour tissue, adjacent normal tissue and derived organoids/tumoroids from colorectal cancer. This dataset contains all the data available for this study on 2015-08-05. Illumina HiSeq 2000 76
EGAD00001001600 PCR and MiSeq validation for early embryonic substitution candidates from 400 Breast cancer patients. This dataset contains all the data available for this study on 2015-09-03. Illumina MiSeq 2
EGAD00001001629 Whole-genome somatic rearrangement and point mutation analysis in cell lines with induced telomere fusions. HiSeq X Ten 20
EGAD00001001845 Leeds Melanoma Cohort Illumina HiSeq 2000 16
EGAD00001001846 Two BRAFV600E cell lines that have been made resistant to (1) the BRAF inhibitor PLX4720 and (2) the combination therapy of dabrafenib and trametinib seem to have an internal duplication in the kinase domain. We would like to know if this is caused by a translocation. HiSeq X Ten 4
EGAD00001001872 Targeted exome sequencing of patient derived xenografts from primary colorectal tumours and liver metastases. This dataset contains all the data available for this study on 2016-01-06. Illumina HiSeq 2000 333
EGAD00001001873 AML emerges as a consequence of accumulating independent genetic aberrations that direct regulation and/or dysfunction of genes resulting in aberrant activation of signalling pathways, resistance to apoptosis and uncontrolled proliferation. Given the significant heterogeneity of AML genomes, AML patients demonstrate a highly variable response rate and poor median survival in response to current chemotherapy regimens. For the past 4 years we have conducted gene expression profiling on purified bone marrow populations equating to normal haematopoietic stem and progenitor cells from healthy subjects and patients with de novo AML in order to identify AML signatures of aberrantly expressed genes in cancer versus normal. We are now applying a series of bioinformatic methodologies combined with clinical and conventional diagnostic data to establish novel genomics strategies for improved prognostication of AML. Additionally, we use our AML signatures to unravel oncogenic signalling pathway activities in AML patients and test inhibitory drugs for these pathways inn preclinical therapeutic programmes. We consider that superimposing GEP and clinical data for our AML patient cohort with additional data on their mutational status will significantly improve the prognostic power of the study as well as unravel yet unknown mutations associated with aberrant signalling activities of oncogenic pathways. Illumina HiSeq 2000 215
EGAD00001001879 A pilot to establish the feasibility of using a custom Agilent targeted pulldown of 110 genes implicated in colorectal tumourigenesis to sequence for driver mutations in a set of 30 FFPE colorectal adenomas. If successful, we propose to sequence an additional 350 adenomas as part of an MRC research study in order to define the pattern of driver mutations across the spectrum of pathological subtypes including conventional adenomas, serrated adenomas and hyperplastic polyps. Illumina HiSeq 2000 30
EGAD00001001889 ***THIS DATA CAN ONLY BE USED FOR NON-COMMERCIAL CANCER RESEARCH*** Sequencing of organoid cell lines derived from oesophageal tumour sections taken from patients diagnosed with primary oesophageal cancer who underwent tumour resection surgery. HiSeq X Ten 9
EGAD00001001898 The study will investigate serial samples from the same patient taken at the time of MGUS or SMM diagnosis, and later at the time of evolution towards MM. Samples will be sequenced by whole genome along with a matched normal to obtain the highest possible amount of information to investigate genomic changes at disease evolution. This dataset contains all the data available for this study on 2016-01-27. HiSeq X Ten 131
EGAD00001001947 Cetuximab is a targeted monoclonal antibody against the epidermal growth factor receptor (EGFR) which is used therapeutically for the treatment of KRAS wild-type colorectal cancer (CRC). The Cetuximab sensitive KRAS wild-type CRC cell line NCI-H508 has been treated with a fixed concentration of ENU for 24 hours and then selected with Cetuximab until drug resistant clones were ready to be picked and grown up as sub-clones of the parental cell line. These will have genes causally implicated in cancer sequenced to identify common point mutations in multiple independently derived drug resistant clones as a forward genetic screen for mechanisms of resistance to Cetuximab in CRC. Illumina HiSeq 2000 16
EGAD00001001948 Cetuximab is a targeted monoclonal antibody against the epidermal growth factor receptor (EGFR) which is used therapeutically for the treatment of KRAS wild-type colorectal cancer (CRC). The Cetuximab sensitive KRAS wild-type CRC cell line NCI-H508 has been treated with a fixed concentration of ENU for 24 hours and then selected with Cetuximab until drug resistant clones were ready to be picked and grown up as sub-clones of the parental cell line. These will have genes causally implicated in cancer sequenced to identify common point mutations in multiple independently derived drug resistant clones as a forward genetic screen for mechanisms of resistance to Cetuximab in CRC Illumina HiSeq 2000 16
EGAD00001002007 To determine the clinical and genetic landscape of CRLF2 deregulated acute lymphoblastic leukaemia (CRLF2-d ALL). We identified 172 patients with a CRLF2 rearrangement treated on either the UKALL2003 trial for children and adolescents (1-24 years) or the UKALLXII trial for adolescents and adults (15-59 years). Genomic technologies from conventional karyotyping, and FISH through to whole genome and exome sequencing were used to characterise the genomes of patients with CRLF2-d ALL. This is the largest study to date to investigate the genomic landscape of CRLF2-d ALL and define CRLF2-d as a unique subgroup of B-other ALL. We have confirmed the high incidence of CRLF2-d in Down syndrome-ALL and demonstrated the co-existence of CRLF2-d with other primary chromosomal rearrangements, suggesting that in these patients CRLF2-d can be a secondary genetic abnormality. Other defining features included enrichment of IKZF1, BTG1 and ADD3 deletions in IGH-CRLF2 patients and specific chromosomal gains seen at much higher frequencies than B-other ALL. We report recurrent established and new co-operating abnormalities and the novel involvement of USP9X and DDX3X in CRLF2-d ALL. It is clear from these data that CRLF2-d ALL is heterogeneous, requiring a combination of genetic abnormalities in functionally relevant genes, to work alongside the deregulated expression of CRLF2 in order to initiate and drive leukaemogenesis in this subtype. Although the functional relevance of many of the abnormalities presented here is currently unknown, many are likely to activate alternate pathways or sensitize patients to current therapies. Illumina HiSeq 2000 11
EGAD00001002008 To determine the clinical and genetic landscape of CRLF2 deregulated acute lymphoblastic leukaemia (CRLF2-d ALL). We identified 172 patients with a CRLF2 rearrangement treated on either the UKALL2003 trial for children and adolescents (1-24 years) or the UKALLXII trial for adolescents and adults (15-59 years). Genomic technologies from conventional karyotyping, and FISH through to whole genome and exome sequencing were used to characterise the genomes of patients with CRLF2-d ALL. This is the largest study to date to investigate the genomic landscape of CRLF2-d ALL and define CRLF2-d as a unique subgroup of B-other ALL. We have confirmed the high incidence of CRLF2-d in Down syndrome-ALL and demonstrated the co-existence of CRLF2-d with other primary chromosomal rearrangements, suggesting that in these patients CRLF2-d can be a secondary genetic abnormality. Other defining features included enrichment of IKZF1, BTG1 and ADD3 deletions in IGH-CRLF2 patients and specific chromosomal gains seen at much higher frequencies than B-other ALL. We report recurrent established and new co-operating abnormalities and the novel involvement of USP9X and DDX3X in CRLF2-d ALL. It is clear from these data that CRLF2-d ALL is heterogeneous, requiring a combination of genetic abnormalities in functionally relevant genes, to work alongside the deregulated expression of CRLF2 in order to initiate and drive leukaemogenesis in this subtype. Although the functional relevance of many of the abnormalities presented here is currently unknown, many are likely to activate alternate pathways or sensitize patients to current therapies. Illumina HiSeq 2000 22
EGAD00001002015 The use of reference DNA standards generated from cancer cell lines sequenced in the Cancer Genome Project to establish the sensitivity, specificity, accuracy and reproducibility of the WTSI GCLP sequencing pipeline Illumina HiSeq 2000 57
EGAD00001002051 BRAF V600E colorectal cancers do not respond to the only currently FDA approved targeted therapy for CRC. There is currently a trial underway in the UK recruiting V600E CRC patients for treatment with a triple therapy combination of Cetuximab, Trametinib and Dabrafenib. We have mutagenized a pool of V600E CRC cell lines and treated with this triple therapy to select out drug resistant clones. We will now sequence these drug resistant clones with the aim of identifying common point mutations engendering resistance to this new therapy. Illumina HiSeq 2500 20
EGAD00001002065 Cetuximab is a targeted monoclonal antibody against the epidermal growth factor receptor (EGFR) which is used therapeutically for the treatment of KRAS wild-type colorectal cancer (CRC). The Cetuximab sensitive KRAS wild-type CRC cell line NCI-H508 has been treated with a fixed concentration of ENU for 24 hours and then selected with Cetuximab until drug resistant clones were ready to be picked and grown up as sub-clones of the parental cell line. These will have genes causally implicated in cancer sequenced to identify common point mutations in multiple independently derived drug resistant clones as a forward genetic screen for mechanisms of resistance to Cetuximab in CRC Illumina HiSeq 2500 50
EGAD00001002066 KRAS mutant CRC is currently in clinical trial with a combination of a MEK and Akt inhibitor. These patients will likely develop resistance to this combination. We aim to identify the mechanisms of resistance via ENU mutagenesis, with a view to identifying additional therapeutics which have the ability to overcome this resistance. Illumina HiSeq 2500 86
EGAD00001002229 Detection of BAP1 mutations in DNA from uveal melanoma and mesothelioma samples. Illumina HiSeq 2000 22
EGAD00001002232 Mapping genetic evolution of pancreatic cancer precursor lesions such as IPMNs and PanINs. Illumina HiSeq 2000 20
EGAD00001002234 This study involves mutagenizing C32, a melanoma cell line, with ENU to identify those mutations which engender resistance to a targeted treatment. Illumina HiSeq 2000 84
EGAD00001002236 The disordered transcriptomes of cancer encompass direct effects of somatic mutation on transcription; co-ordinated secondary alterations in transcriptional pathways; and increased transcriptional noise. To catalogue the rules governing how somatic mutation Overall, 59% of 6980 exonic substitutions were expressed. Compared to other classes, nonsense mutations showed lower expression levels than expected with patterns characteristic of nonsense-mediated decay. 14% of 4234 genomic rearrangements caused transcriptional abnormalities, including exon skips, exon reusage, fusion transcripts and premature poly-adenylation. We found productive, stable transcription from sense-to-antisense gene fusions and gene-to-intergenic rearrangements, suggesting that these mutation classes may drive more transcriptional disruption than previously suspected. Systematic integration of transcriptome with genome data therefore reveals the rules by which transcriptional machinery interprets somatic mutation. Illumina Genome Analyzer II,Illumina HiSeq 2000 32
EGAD00001002237 The disordered transcriptomes of cancer encompass direct effects of somatic mutation on transcription; co-ordinated secondary alterations in transcriptional pathways; and increased transcriptional noise. To catalogue the rules governing how somatic mutation Overall, 59% of 6980 exonic substitutions were expressed. Compared to other classes, nonsense mutations showed lower expression levels than expected with patterns characteristic of nonsense-mediated decay. 14% of 4234 genomic rearrangements caused transcriptional abnormalities, including exon skips, exon reusage, fusion transcripts and premature poly-adenylation. We found productive, stable transcription from sense-to-antisense gene fusions and gene-to-intergenic rearrangements, suggesting that these mutation classes may drive more transcriptional disruption than previously suspected. Systematic integration of transcriptome with genome data therefore reveals the rules by which transcriptional machinery interprets somatic mutation. Illumina Genome Analyzer II,Illumina HiSeq 2000 59
EGAD00001002696 Recurrent breast cancer is almost universally fatal. We characterize 170 patients' locally relapsed or distant metastatic cancers using massively parallel sequencing. We identify that the relapse-seeding clone disseminates late from the primary tumor. TP53 and AKT1 appear to be enriched in ER-positive cancers predisposed to relapse. Mutation acquisition continues at relapse as the same mutation signatures continue to operate and new signatures, such as that caused by radiotherapy, appear de novo. In 49% of cases we identify driver mutations private to the relapse and these are sampled from a wider range of cancer genes, including SWI-SNF complex and JAK-STAT signaling. HiSeq X Ten,Illumina HiSeq 2000 58
EGAD00001002698 Recurrent breast cancer is almost universally fatal. We characterize 170 patients' locally relapsed or distant metastatic cancers using massively parallel sequencing. We identify that the relapse-seeding clone disseminates late from the primary tumor. TP53 and AKT1 appear to be enriched in ER-positive cancers predisposed to relapse. Mutation acquisition continues at relapse as the same mutation signatures continue to operate and new signatures, such as that caused by radiotherapy, appear de novo. In 49% of cases we identify driver mutations private to the relapse and these are sampled from a wider range of cancer genes, including SWI-SNF complex and JAK-STAT signaling. Illumina HiSeq 2000,Illumina HiSeq 2500,Illumina MiSeq 387
EGAD00001003216 Whole genome sequencing of tumour normal pairs of human undifferentiated sarcomas. HiSeq X Ten 98
EGAD00001003217 Targeted resequencing at high depth (21 genes, 9 chromosomal regions): at least 4 FFPE samples per case and matched germline DNA: * 100 cases with detailed outcome data, including 15 cases with tumour relapse (515 samples) * 40 cases with matched pre-chemotherapy biopsies (240 samples) * 50 nephrogenic rests matched to above cases (50 samples) We expect a proportion (possibly 10%) of cases to be mutationally silent on the above studies, and propose to subsequently carry out integrated whole-genome, methylome and transcriptome studies on matched frozen tissue from these cases Illumina HiSeq 2500 35
EGAD00001003239 This study involves mutagenizing C32, a melanoma cell line, with ENU to identify those mutations which engender resistance to a targeted treatment. Illumina HiSeq 2000 80
EGAD00001003240 Study of cell lineage and embryogenesis using biopsy samples from sites across the whole body (post mortem). Sample donors are recruited sensitively through the Phoenix study and consent to samples being taken after their death for both the Phoenix study and this WTSI study. HiSeq X Ten 33
EGAD00001003242 This study comprises three different datasets: 1) 57 samples from the 1243 canapps cell line study, 2) 91 FFPE normal samples and 3) 87 samples from the SCORT WS2 dataset. The aim is to sequence these 235 samples in order to test the new V2 Colorectal bait design. Illumina HiSeq 2000 92
EGAD00001003248 A BRAF V600E colorectal organoid which is sensitive to MAP kinase inhibition was mutagenised with the chemical mutagen ENU and then drug selected using a combination of Trametinib, Dabrafenib and Cetuximab. Single cell derived organoids were then manually picked and expanded in drug. Resistance was confirmed in a 14 day assay and DNA was collected. These then underwent targeted amplicon-based sequencing to confirm candidate resistance effectors from a screen in two 2D BRAF V600E colorectal cell lines. Pools of resistant clones were also sequenced. Illumina MiSeq 36
EGAD00001003252 Sequencing of drug resistant organoids Illumina HiSeq 2000 36
EGAD00001003253 Targeted gene screen of cell line tumour samples for testing the new V2 Colorectal gene panel. Illumina HiSeq 2000 57
EGAD00001003254 R&D project to develop low input library construction methods. Illumina HiSeq 2500 12
EGAD00001003255 Transcriptome of anaplastic meningiomas Illumina HiSeq 2500 34
EGAD00001003309 The study will investigate serial samples from the same patient taken at the time of MGUS or SMM diagnosis, and later at the time of evolution towards MM. Samples will be sequenced by whole genome along with a matched normal to obtain the highest possible amount of information to investigate genomic changes at disease evolution. This dataset contains all the data available for this study on 2017-04-27. HiSeq X Ten 139
EGAD00001003320 Transcriptome sequencing of tumour tissue, adjacent normal tissue and derived organoids/tumoroids from colorectal cancer. This dataset contains all the data available for this study on 2017-05-04. Illumina HiSeq 2000,Illumina HiSeq 2500 106
EGAD00001003321 This dataset contains all the data available for this study on 2017-05-04. Illumina HiSeq 2000 523
EGAD00001003330 The samples will be sequenced for a targeted panel of cancer relevant genes (n ~ 370) and analysed for somatic mutations. This dataset contains all the data available for this study on 2017-05-11. Illumina HiSeq 2000 416
EGAD00001003332 PCR and MiSeq validation for early embryonic substitution candidates from 400 breast cancer patients. This dataset contains all the data available for this study on 2017-05-11. Illumina MiSeq 4
EGAD00001003334 Targeted exome sequencing of patient derived xenografts from primary colorectal tumours and liver metastases. This dataset contains all the data available for this study on 2017-05-11. Illumina HiSeq 2000 573
EGAD00001003425 An EGFR mutant NSCLC cell line which is sensitive to AZD9291 inhibition was mutagenised with the chemical mutagen ENU and then drug selected using AZD9291. Single cell derived colonies were then manually picked and expanded in drug. Resistance was confirmed in a 14 day assay and DNA was collected. These then underwent targeted amplicon-based sequencing to confirm candidate resistance effectors hypothesised from currently available literature. This dataset contains all the data available for this study on 2017-07-05. Illumina MiSeq 177
EGAD00001003445 Clear cell renal cancer is characterized by near-universal loss of the short arm of chromosome 3 (3p). This event arises through unknown mechanisms, but critically results in the loss of several tumor suppressor genes. We analyzed whole genomes from 95 biopsies across 33 patients with clear cell renal cancer (ccRCC) recruited into the Renal TRACERx study. We find novel hotspots of point mutations in the 5'-UTR of TERT, targeting a MYC-MAX repressor, that result in telomere lengthening. The most common structural abnormality generates simultaneous 3p loss and 5q gain (36% patients), typically through chromothripsis. Using molecular clocks, we estimate this occurs in childhood or adolescence, generally preceding emergence of the most recent common ancestor by years to decades. Similar genomic changes are seen in inherited kidney cancers. Modeling differences in age-incidence between inherited and sporadic cancers suggests that the number of cells with 3p loss capable of initiating sporadic tumors is no more than a few hundred. Targeting essential genes in deleted regions of chromosome 3p could represent a potential preventative strategy for renal cancer. HiSeq X Ten 164
EGAD00001003703 The incidence of acute myeloid leukemia (AML) increases with age and mortality exceeds 90% when diagnosed after age 60. Only 10-15% of cases evolve from a pre-existing myeloproliferative or myelodysplastic disorder; the remaining cases arise de novo without a detectable prodrome and are diagnosed upon development of bone marrow failure. Analysis of diagnostic blood samples has demonstrated that de novo AML is preceded by the accumulation of somatic mutations in pre-leukemic hematopoietic stem and progenitor cells (preL-HSPCs) that subsequently undergo clonal expansion. If individuals in this pre-leukemic phase could be identified, methods for determination of risk and monitoring for progression to overt AML could be developed. However, recurrent AML mutations also accumulate during aging in healthy individuals who never develop AML, referred to as age related clonal hematopoiesis (ARCH). To distinguish individuals with preL-HSPCs at high risk of developing AML from those with ARCH, we undertook deep targeted sequencing of genes recurrently mutated in AML in blood samples from 133 individuals in the European Prospective Investigation into Cancer and Nutrition (EPIC) study taken on average 6 years before they developed AML (pre-AML group), together with 683 matched healthy individuals (Control group). Pre-AML cases displayed accelerated age-correlated accumulation of somatic mutations. The identity, number and variant allele frequency (VAF) of mutations differed between the two groups, and were incorporated into a computational model of AML risk prediction that accurately distinguished pre-AML cases from controls on average 7 years prior to AML development. Our findings provide proof of concept that early prediction of AML development is feasible in high-risk populations, paving the way for early disease detection, monitoring, and potentially prevention. Illumina HiSeq 2000,Illumina HiSeq 2500 628
EGAD00001003811 Our project will examine the role of PIK3CA mutations and their sensitivity to endocrine therapies and its role, with the addition of complete ovarian suppression. We plan to test our hypotheses using tumour samples collected from patients enrolled in the SOFT/IBCSG24-02 clinical study (Suppression of Ovarian Function Trial, NCT00066690). SOFT is a phase III trial that randomised 3066 premenopausal women to evaluate if adding ovarian suppression to adjuvant endocrine therapy will improve clinical outcomes. This dataset contains all the data available for this study on 2017-11-22. Illumina HiSeq 2500 81
EGAD00001003883 Background: Lung carcinoma-in-situ (CIS) lesions are the pre-invasive precursor to lung squamous cell carcinoma. However, only half progress to invasive cancer in three years, while a third spontaneously regress. Whether modern molecular profiling techniques can identify those pre-invasive lesions that will subsequently progress and distinguish them from those that will regress is unknown. Methods: Progressive and regressive CIS lesions were laser-captured and their genome, epigenome and transcriptome interrogated. We analysed 83 progressive lesions, 41 regressive and 33 normal epithelial control samples. DNA methylation and gene expression profiles were further validated using publicly available lung cancer data. Results: Somatic mutation burden was higher in progressive lesions than regressive CIS lesions, across base substitutions, rearrangements, and copy number changes. Driver mutations were present in both progressive and regressive CIS lesions, but were more numerous in progressive cases. Progressive and regressive CIS lesions had distinct epigenomic and transcriptional profiles, with a strong chromosomal instability signature. Gene expression, methylation and copy number profiles can all predict accurately which CIS lesions will progress to lung cancer. Conclusion: Pre-invasive CIS lesions that will subsequently progress to invasive lung cancer can be distinguished from those that will regress using molecular profiling. Progression is associated with a strong chromosomal instability signature. These findings inform the development of novel therapeutic targets. HiSeq X Ten 69
EGAD00001003884 The genetic basis of many rare childhood cancers remains unknown. These include a spectrum of infant soft tissue tumors without canonical gene fusions, encompassing congenital mesoblastic nephroma (CMN) of the kidney and infantile fibrosarcoma (IFS). Here, we integrated whole genome and transcriptome sequencing and identified diagnostic markers and novel therapeutic strategies. HiSeq X Ten 37
EGAD00001003885 The genetic basis of many rare childhood cancers remains unknown. These include a spectrum of infant soft tissue tumors without canonical gene fusions, encompassing congenital mesoblastic nephroma (CMN) of the kidney and infantile fibrosarcoma (IFS). Here, we integrated whole genome and transcriptome sequencing and identified diagnostic markers and novel therapeutic strategies. Illumina HiSeq 2500 19
EGAD00001003923 The discovery of the BRAF V600E mutation in almost all cases of hairy-cell leukemia has led to the widespread adoption of the BRAF inhibitor vemurafenib for treatment of chemotherapy-resistant cases. Impressive responses are reported; however, acquired resistance is common. Whilst diverse mechanisms of vemurafenib resistance have been elucidated in melanoma, the basis of resistance in HCL is unclear. Here we apply whole genome and deep targeted sequencing to investigate resistance mechanisms and potential therapeutic strategies in a patient with acquired resistance to vemurafenib. Illumina HiSeq 2500 15
EGAD00001003924 The discovery of the BRAF V600E mutation in almost all cases of hairy-cell leukemia has led to the widespread adoption of the BRAF inhibitor vemurafenib for treatment of chemotherapy-resistant cases. Impressive responses are reported; however, acquired resistance is common. Whilst diverse mechanisms of vemurafenib resistance have been elucidated in melanoma, the basis of resistance in HCL is unclear. Here we apply whole genome and deep targeted sequencing to investigate resistance mechanisms and potential therapeutic strategies in a patient with acquired resistance to vemurafenib. HiSeq X Ten 3
EGAD00001004000 Targeted gene screen of cell line tumours for testing the new V4 Colorectal gene panel. This dataset contains all the data available for this study on 2018-03-07. Illumina HiSeq 2500 53
EGAD00001004001 Targeted gene screen of FFPEs, cell lines and primary CRC tumours for testing the new V4 Colorectal gene panel. This dataset contains all the data available for this study on 2018-03-07. Illumina HiSeq 2500 92
EGAD00001004086 We will take a bone marrow aspirate and peripheral blood samples from a healthy patient aged around 60, and use flow cytometry to isolate 100 HSCs, 50 MEPs, and 50 GMPs. We will grow these up into colonies, then whole genome sequence each colony. Somatic mutations will act as a unique barcode for each clone. We will then design a panel for targeted resequencing of the mutations that we find. It will then be possible to look for these mutations in the peripheral blood over several years, to see the dynamics of how HSCs contribute to the peripheral blood in health. This dataset contains all the data available for this study on 2018-04-19. HiSeq X Ten,Illumina HiSeq 2500 207
EGAD00001004087 We took a bone marrow aspirate and peripheral blood samples from a healthy patient aged around 60, and used flow cytometry to isolate 100 HSCs, 50 MEPs, and 50 GMPs. We grew these up into colonies, then whole genome sequenced each colony. Somatic mutations act as a unique barcode for each clone. We have designed a panel for targeted resequencing of the mutations that we found. We are now looking for these mutations in the peripheral blood, to see the dynamics of how HSCs contribute to the peripheral blood in health. This dataset contains all the data available for this study on 2018-04-19. Illumina HiSeq 2500 48
EGAD00001004124 CRISPR-Cas9 genome editing is widely used to study gene function, from basic biology to biomedical research. Structural rearrangements are a ubiquitous feature of cancer cells and their impact on the functional consequences of CRISPR-Cas9 gene-editing has not yet been assessed. Utilizing CRISPR-Cas9 knockout screens for 250 cancer cell lines, we demonstrate that targeting structurally rearranged regions, in particular tandem or interspersed amplifications, is highly detrimental to cellular fitness in a gene independent manner. In contrast, amplifications caused by whole chromosomal duplications have little to no impact on fitness. This effect is cell line specific and dependent on the ploidy status. We devise a copy-number ratio metric that substantially improves the detection of gene-independent cell fitness effects in CRISPR-Cas9 screens. Furthermore, we develop a computational tool, called Crispy, to account for these effects on a single sample basis and provide corrected gene fitness effects. Our analysis demonstrates the importance of structural rearrangements in mediating the effect of CRISPR-Cas9-induced DNA damage, with implications for the use of CRISPR-Cas9 gene-editing in cancer cells. Illumina HiSeq 2000 12
EGAD00001004152 Targeted pulldown of approx 60 FFPE normal samples to use as normal controls. This dataset contains all the data available for this study on 2018-06-06. Illumina HiSeq 2500 80
EGAD00001004158 The extent to which cells in normal tissues accumulate mutations during life is poorly understood. Some mutant cells expand into clones that can be detected by genome sequencing. We mapped mutant clones in normal esophageal epithelium from nine donors aged 20-75. Somatic mutations accumulate with age and are mainly caused by intrinsic mutational processes. We found strong Darwinian selection of clones carrying mutations in 14 cancer genes, with tens to hundreds of such clones per square centimeter. By middle age, clones with cancer-associated mutations cover most of the epithelium, with NOTCH1 and TP53 mutations affecting 40% and 10% of all cells, respectively. Remarkably, the prevalence of NOTCH1 mutations in normal esophagus is several times higher than in esophageal cancers. The esophagus emerges as an evolving patchwork of mutant clones that colonize the majority of the epithelium, with implications for our understanding of cancer and ageing. Illumina HiSeq 2500 0
EGAD00001004159 The extent to which cells in normal tissues accumulate mutations during life is poorly understood. Some mutant cells expand into clones that can be detected by genome sequencing. We mapped mutant clones in normal esophageal epithelium from nine donors aged 20-75. Somatic mutations accumulate with age and are mainly caused by intrinsic mutational processes. We found strong Darwinian selection of clones carrying mutations in 14 cancer genes, with tens to hundreds of such clones per square centimeter. By middle age, clones with cancer-associated mutations cover most of the epithelium, with NOTCH1 and TP53 mutations affecting 40% and 10% of all cells, respectively. Remarkably, the prevalence of NOTCH1 mutations in normal esophagus is several times higher than in esophageal cancers. The esophagus emerges as an evolving patchwork of mutant clones that colonize the majority of the epithelium, with implications for our understanding of cancer and ageing. HiSeq X Ten 25
EGAD00001004162 Undifferentiated sarcomas (USARC) of adults are diverse, rare and aggressive soft tissue cancers. Recent efforts have confirmed that USARC exhibit one of the highest burdens of structural aberrations across human cancer. Here, we sought to unravel the genomic basis of this structural complexity by integrating whole genome sequencing, ploidy analysis and methylation profiling of 53 USARC. We identified whole genome doubling as a prevalent and pernicious force in USARC tumourigenesis. Deconvolution of the complex copy number and rearrangement landscapes show distinct signatures associated with chromothripsis, early-haploidy, and successive whole-genome-doubling events, suggesting four divergent models of sarcoma development. We show similar distinct evolutionary tumourigenic pathways in different sarcoma subtypes from the Cancer Genome Atlas. Thirteen percent of tumours exhibited a hypermutator phenotype, opening new avenues for clinical management such as immunotherapy, whilst the period prior to and between genome doubling events may represent clinically relevant interventional points in USARC. HiSeq X Ten 56
EGAD00001004163 Cancer genomes are frequently characterized by numerical and structural karyotypic abnormalities. Here we combined an inducible centromere-specific inactivation approach with selection for a conditionally essential gene, a strategy we term ‘CEN-SELECT’, and show that single-chromosome missegregation during cell division can directly drive a broad spectrum of structural rearrangement types. Cytogenetic profiling revealed that missegregated chromosomes are 120-fold more susceptible to developing seven major categories of structural variants, including translocations, insertions, deletions, and reassembly into chromothriptically rearranged chromosomes. Whole-genome sequencing of clones with genetically propagatable derivative chromosomes identified complex rearrangements and copy-number alterations that can result in gene inactivation or extrachromosomal gene amplification. We conclude that chromosome segregation errors are sufficient to drive extensive structural variation that recapitulates those commonly associated with human cancers. HiSeq X Ten,Illumina HiSeq 2000 22
EGAD00001004192 The colorectal adenoma-carcinoma sequence has provided a paradigmatic framework for understanding the successive somatic genetic events and consequent clonal expansions leading to cancer. As for most cancer types, however, understanding of the earliest phases of colorectal neoplastic change, which may occur in morphologically normal tissue, is comparatively limited because of the difficulty of detecting somatic mutations in normal cells. Each colorectal crypt is a small clone of cells derived from a single recently-existing stem cell. Here, we sequenced hundreds of normal crypts from 42 individuals. Signatures of multiple mutational processes were revealed, some ubiquitous and continuous, others only found in some individuals, in some crypts or during some phases of the cell lineage from zygote to adult cell. Likely driver mutations were present in ~1% of normal colorectal crypts in middle-aged individuals, indicating that adenomas and carcinomas are rare outcomes of a pervasive process of neoplastic change across morphologically normal colorectal epithelium. HiSeq X Ten 578
EGAD00001004193 The colorectal adenoma-carcinoma sequence has provided a paradigmatic framework for understanding the successive somatic genetic events and consequent clonal expansions leading to cancer. As for most cancer types, however, understanding of the earliest phases of colorectal neoplastic change, which may occur in morphologically normal tissue, is comparatively limited because of the difficulty of detecting somatic mutations in normal cells. Each colorectal crypt is a small clone of cells derived from a single recently-existing stem cell. Here, we sequenced hundreds of normal crypts from 42 individuals. Signatures of multiple mutational processes were revealed, some ubiquitous and continuous, others only found in some individuals, in some crypts or during some phases of the cell lineage from zygote to adult cell. Likely driver mutations were present in ~1% of normal colorectal crypts in middle-aged individuals, indicating that adenomas and carcinomas are rare outcomes of a pervasive process of neoplastic change across morphologically normal colorectal epithelium. Illumina HiSeq 2500 1632
EGAD00001004201 Multiple signatures of somatic mutations have been identified in human cancer genomes. To investigate whether mutational signatures continue to be generated, and if so their temporal patterns of activity, subsets of cell lines were cultured in vitro for extended periods and subjected to single cell cloning and whole genome or exome sequencing or directly to single cell whole genome sequencing. As expected, signatures of past exogenous exposures, such as tobacco smoke and ultraviolet light, were not generated in vitro. In contrast, signatures of normal and defective DNA repair and replication continued to be generated at essentially constant mutation rates. Signatures of APOBEC cytidine deaminase DNA-editing activity exhibited a distinctive pattern with substantial fluctuations in mutation rate over time and episodic bursts of mutations. The initiating factors for these bursts are unclear although retrotransposon mobilisation may play a role. This cell line set now constitutes a comprehensive resource of live experimental models of mutational processes of both known and unknown aetiologies potentially retaining the patterns of activity and regulatory influences operative in human cells in vivo. Illumina HiSeq 2000,Illumina HiSeq 2500 75
EGAD00001004202 Multiple signatures of somatic mutations have been identified in human cancer genomes. To investigate whether mutational signatures continue to be generated, and if so their temporal patterns of activity, subsets of cell lines were cultured in vitro for extended periods and subjected to single cell cloning and whole genome or exome sequencing or directly to single cell whole genome sequencing. As expected, signatures of past exogenous exposures, such as tobacco smoke and ultraviolet light, were not generated in vitro. In contrast, signatures of normal and defective DNA repair and replication continued to be generated at essentially constant mutation rates. Signatures of APOBEC cytidine deaminase DNA-editing activity exhibited a distinctive pattern with substantial fluctuations in mutation rate over time and episodic bursts of mutations. The initiating factors for these bursts are unclear although retrotransposon mobilisation may play a role. This cell line set now constitutes a comprehensive resource of live experimental models of mutational processes of both known and unknown aetiologies potentially retaining the patterns of activity and regulatory influences operative in human cells in vivo. Illumina HiSeq 2500 26
EGAD00001004203 Multiple signatures of somatic mutations have been identified in human cancer genomes. To investigate whether mutational signatures continue to be generated, and if so their temporal patterns of activity, subsets of cell lines were cultured in vitro for extended periods and subjected to single cell cloning and whole genome or exome sequencing or directly to single cell whole genome sequencing. As expected, signatures of past exogenous exposures, such as tobacco smoke and ultraviolet light, were not generated in vitro. In contrast, signatures of normal and defective DNA repair and replication continued to be generated at essentially constant mutation rates. Signatures of APOBEC cytidine deaminase DNA-editing activity exhibited a distinctive pattern with substantial fluctuations in mutation rate over time and episodic bursts of mutations. The initiating factors for these bursts are unclear although retrotransposon mobilisation may play a role. This cell line set now constitutes a comprehensive resource of live experimental models of mutational processes of both known and unknown aetiologies potentially retaining the patterns of activity and regulatory influences operative in human cells in vivo. HiSeq X Ten 192
EGAD00001004292 Targeted capture of cancer gene panel bait set in single cell derived organoids from colon tissue and colorectal cancer from 1 patient. This dataset contains all the data available for this study on 2018-08-13. Illumina HiSeq 2000,Illumina HiSeq 2500 112
EGAD00001004346 This is a bulk DNA and RNA sequencing study of human renal tumours. This dataset contains all the data available for this study on 2018-09-19. HiSeq X Ten 37
EGAD00001004368 Targeted gene sequencing of cancer driver genes to determine the driver mutations present in newly-derived cancer organoid models Illumina HiSeq 4000 17
EGAD00001004428 Peripheral T-cell lymphomas not otherwise specified (PTCL-NOS) represent a heterogeneous group of nodal and extra-nodal mature T-cell lymphomas, with a low prevalence in Western countries. PTCL-NOSs account for about 25% of all PTCLs and are currently diagnosed based on exclusion criteria, as these lymphomas lack unifying morphological, phenotypic and genomic features. Cytogenetic and FISH analyses of PTCL-NOS samples have not revealed recurrent pathogenetic abnormalities, while gene expression profiling has shown only partial ability to segregate cases representing homogeneous clinico-pathological entities. This underscores the need to look at PTCL-NOS with innovative and high-throughput approaches to identify recurrent genetic lesions that could further our understanding of the biology of this heterogeneous group of diseases, provide better diagnostic tools and perhaps new targets for innovative treatments. Our aim is to study ~15 patients affected by PTCL-NOS. Our study will be funded by a private, non-profit Italian cancer research fund (Associazione Italiana per la Ricerca sul Cancro, www.airc.it) based on a grant owned by Anna Dodero and Cristiana Carniti, hematologists at INT. Samples will be analysed by whole genome sequencing using Illumina X10 machines, on a 150bp-PE protocol. Data will be analysed using the pipeline available in Team 78, under the supervision of Peter Campbell, the WTSI faculty who will oversee the project, and by Francesco Maura, visiting scientist at the WTSI. This dataset contains all the data available for this study on 2018-10-30. HiSeq X Ten 27
EGAD00001004547 All normal somatic cells are thought to acquire mutations. However, characterisation of the patterns and consequences of somatic mutation in normal tissues is limited. Uterine endometrium is a dynamic tissue undergoing cyclical shedding and reconstitution lined by a gland-forming epithelium. Whole genome sequencing of normal endometrial glands showed that most are clonal cell populations derived from a recent common ancestor, with mutation burdens differing from other normal cell types and many fold lower than endometrial cancers. Mutational signatures found ubiquitously account for most mutations. Many, in some women all, endometrial glands are colonised by cell clones carrying driver mutations in cancer genes, often with multiple drivers. Total and driver mutation burdens increase with age, but are also influenced by other factors, including body mass index and parity, and clones with drivers often originate during early decades of life. The somatic mutational landscapes of normal cells differ between cell types and are revealing the procession of neoplastic change leading to cancer. HiSeq X Ten 6
EGAD00001004578 Chronic liver injury predisposes to cirrhosis and hepatocellular carcinoma, but how somatic mutations accumulate in liver disease is unexplored. We sequenced whole genomes of 400 microdissections of 100-500 hepatocytes from 5 normal and 6 cirrhotic livers. Compared to normal liver, cirrhotic liver had higher mutation burden, especially structural variants, including chromothripsis. Cirrhotic nodules were oligoclonal; sometimes entirely derived from a single, recent common ancestor. Clonal expansions millimeters in diameter occurred in cirrhosis in the absence of known driver mutations. Endogenous mutational processes predominated, although signatures of polycyclic aromatic hydrocarbon and aristolochic acid exposure occurred in some samples. Up to 10-fold within-patient variation in activity of exogenous signatures existed between adjacent cirrhotic nodules, with both clone-specific and microenvironmental forces shaping this heterogeneity. Synchronous hepatocellular carcinomas drew from the same repertoire of mutational signatures as background cirrhotic liver, but with higher burden. Somatic mutations chronicle the exposures, toxicity, regeneration and clonal structure of liver tissue as it progresses from health to disease. HiSeq X Ten 577
EGAD00001004593 Precision medicine trials in glioblastoma should be conducted at tumor recurrence. However, second surgery for recurrent GBMs is not routinely performed and therefore molecular data is predominantly derived from primary samples. This study aims to establish the frequency of driver changes at tumor recurrence. Illumina HiSeq 2500 377
EGAD00001004774 We investigated the somatic genetic basis of Wilms’ tumour and found complex phylogenetic relations between tumours. HiSeq X Ten 203
EGAD00001004867 This dataset contains all the data available for this study on 2019-03-26. HiSeq X Ten 60
EGAD00001004876 In this project we have sequenced the exome of skin moles (melanocytic naevi) and also normal skin from young and old people. We are interested in looking at the clonality of these lesions and the burden of UV mutations. This dataset contains all the data available for this study on 2019-04-01. Illumina HiSeq 2000 14
EGAD00001004877 Targeted analysis of chondrosarcoma cancer genes. This dataset contains all the data available for this study on 2019-04-01. Illumina HiSeq 2500 445
EGAD00001004878 R&D project to develop low input library construction methods. This dataset contains all the data available for this study on 2019-04-01. HiSeq X Ten,Illumina HiSeq 2500 0
EGAD00001004879 Evolution of the cancer epigenome in myeloproliferative neoplasms. This dataset contains all the data available for this study on 2019-04-01. HiSeq X Ten 17
EGAD00001004889 Mutational signatures have been shown to be attributable to specific genetic contexts, such as mutations in DNA repair genes. DNMT3A is a DNA methyltransferase that helps maintain the DNA methylation pattern in a site-specific manner and may participate in DNA repair or the stress response. We have identified an adult individual who is a germline mosaic for a DNMT3A mutation. We have obtained clonal lymphoblastoid cells (LCLs) from the subject representing both WT and mutant lines grown in the same individual for >50 years. These clones represent a unique opportunity to examine the mutational impact of the DNMT3A mutation in a well-controlled setting. Our goal is to perform WGS on whole blood, representing the pool, as well as several WT and several mutant clones, in order to investigate the contribution of DNMT3A to mutation rates and signatures. This dataset contains all the data available for this study on 2019-04-03. HiSeq X Ten 9
EGAD00001004891 Drug-resistant populations of PC9 (human non-small cell lung cancer) or A375 (human melanoma) cell lines were used for this study. By exome sequencing, we will analyse mutations of cells in the drug-tolerant state and after a drug holiday. This dataset contains all the data available for this study on 2019-04-03. Illumina HiSeq 2500 18
EGAD00001004893 The Mutographs project aims to advance our understanding of the causes of cancer through studies of mutational signatures. Led by Mike Stratton, together with Paul Brennan, Ludmil Alexandrov, Allan Balmain, David Phillips and Peter Campbell, this large-scale international research endeavour was awarded a Cancer Research UK Grand Challenge. Different patterns of somatic mutation are generated by the different environmental, lifestyle and genetic factors that cause cancer, many of which are still unknown. Within Mutographs, the International Agency for Research on Cancer is coordinating the recruitment of 5000 individuals with cancer (colorectal, renal, pancreatic, oesophageal adenocarcinoma or oesophageal squamous cancers) across 5 continents to explore whether different mutational signatures explain marked variation in incidence. In brief, through an international network of collaborators around the world, biological materials are collected, along with demographic, histological, clinical and questionnaire data. Whole genome sequences of tumour-germline DNA pairs are generated at the Wellcome Trust Sanger Institute (Illumina HiSeqX, 40X and 20X depth respectively). Somatic mutational signatures are subsequently extracted by non-negative matrix factorisation methods and correlated with risk factors data. Through an enhanced understanding of cancer aetiology, Mutographs' unprecedented effort is anticipated to outline modifiable risk factors, lead to new approaches to prevent cancer, and provide opportunities to empower early detection, refine high-risk groups and contribute to further therapeutic development. This dataset contains all the data available for this study on 2019-04-03. HiSeq X Ten 36
EGAD00001004895 Recent advances in genomics have demonstrated that clonal haemopoiesis driven by leukaemia associated somatic mutations is a relatively common phenomenon that increases in frequency with advancing age. Whilst individuals with clonal haemopoiesis have an increased risk of developing haematological malignancies, they also have an increased mortality from other causes. Additionally, certain mutations are almost exclusively seen in individuals aged 70 years or older, whilst others are seen in individuals with non-haematological cancers including breast and ovarian. Recently, clonal haemopoiesis was found to be associated with a significantly increased risk of atherosclerotic cardiovascular disease. This association is thought to be causative with clonally-derived macrophages showing elevated expression of several chemokine and cytokine genes that contribute to atherosclerosis. Another vascular pathology, abdominal aortic aneurysm (AAA), increases with age and shares risk factors with atherosclerosis (including smoking, male sex, high cholesterol). However, the impact of these risk factors and the overlap between AAA and atherosclerosis is poorly understood. To investigate a possible link between clonal haemopoiesis and AAA, we will study DNA samples from 300 patients with AAA and up to 200 controls for evidence of clonal haemopoiesis. This will be done using target DNA enrichment with biotinylated RNA baits followed by high throughput sequencing. This dataset contains all the data available for this study on 2019-04-03. Illumina HiSeq 2500 472
EGAD00001004941 Recent work in the Campbell group has revealed somatic mutations present in normal, non-cancerous human skin. A subset of the mutations conferred selective advantages to the host cells, leading to clonal expansions and raising the risk for future cancer development. Capturing such somatic mutations in normal tissue is important to advance our understanding about carcinogenesis and could provide prospective medical insights. In this project, our goal is to detect somatic mutations in normal (pre-cancerous) liver tissue. Using Laser Microdissection technology, we will dissect individual liver lobules from patient samples and submit these to sequencing. For each patient sample, we aim to sequence multiple lobules to characterise the mutagenic burden. Samples will be taken from patients with different liver disease aetiologies, including alcoholism and obesity, with a view to distinguishing the prevalent mutation types occurring in each disease context. We will perform targeted sequencing, initially using the WTSI cancer panel. Later we aim to use a novel bait set that captures both cancer genes and genes relevant to the non-cancerous samples (i.e. genes implicated in hereditary disorders, immune sequences). This dataset contains all the data available for this study on 2019-04-08. Illumina HiSeq 2500 63
EGAD00001004953 Single cell + bulk genomics study for immune and hematopoietic organs during human fetal development. This dataset contains all the data available for this study on 2019-04-11. HiSeq X Ten 5
EGAD00001004954 We aim to describe the transcriptomic landscape of infant spindle cell tumours. This dataset contains all the data available for this study on 2019-04-11. Illumina HiSeq 4000 38
EGAD00001004999 Sequencing of tissue samples and their derived organoids from oesophageal, pancreatic and colorectal cancer patients. Illumina HiSeq 4000 97
EGAD00001005000 Sequencing of tissue samples and their derived organoids from oesophageal, pancreatic and colorectal cancer patients. HiSeq X Ten 97
EGAD00001005001 Sequencing of tissue samples and their derived organoids from oesophageal, pancreatic and colorectal cancer patients. Illumina HiSeq 4000 48
EGAD00001005028 Analysis of mutational signatures is becoming routine in cancer genomics, with implications for pathogenesis, classification, prognosis, and even treatment decisions. However, the field lacks a consensus on analysis and result interpretation. Using whole-genome sequencing of multiple myeloma (MM), chronic lymphocytic leukemia (CLL) and acute myeloid leukemia, we compare the performance of public signature analysis tools. We describe caveats and pitfalls of de novo signature extraction and fitting approaches, reporting on common inaccuracies: erroneous signature assignment, identification of localized hyper-mutational processes, overcalling of signatures. We provide reproducible solutions to solve these issues and use orthogonal approaches to validate our results. We show how a comprehensive mutational signature analysis may provide relevant biological insights, reporting evidence of c-AID activity among unmutated CLL cases or the absence of BRCA1/BRCA2-mediated homologous recombination deficiency in a MM cohort. Finally, we propose a general analysis framework to ensure production of accurate and reproducible mutational signature data. HiSeq X Ten 5
EGAD00001005079 We want to investigate mosaic mutations as a cause of childhood IBD. This dataset contains all the data available for this study on 2019-06-10. Illumina HiSeq 4000 28
EGAD00001005080 This study involves mutagenizing a range of different cell lines with ENU to identify those mutations which engender resistance to targeted treatment. This dataset contains all the data available for this study on 2019-06-10. Illumina HiSeq 2500 16
EGAD00001005081 This study involves mutagenizing 11-18 with ENU to identify those mutations which engender resistance to targeted treatment. This dataset contains all the data available for this study on 2019-06-10. Illumina HiSeq 2500 120
EGAD00001005082 Exome Sequencing in a set of Asian Head and Neck cancer cell lines, to identify mutations that can be used to genomically classify the cell lines. This dataset contains all the data available for this study on 2019-06-10. Illumina HiSeq 2500 21
EGAD00001005134 We investigated the somatic genetic basis of Wilms' tumour and found complex phylogenetic relations between tumours HiSeq X Ten 20
EGAD00001005135 We investigated the somatic genetic basis of Wilms' tumour and found complex phylogenetic relations between tumours Illumina HiSeq 2500 59
EGAD00001005136 We investigated the somatic genetic basis of Wilms' tumour and found complex phylogenetic relations between tumours Illumina HiSeq 4000 15
EGAD00001005193 That tobacco smoking causes lung cancer is well-established, but we lack quantitative understanding of its effects on genomes of normal bronchial epithelium. We sequenced whole genomes of 632 colonies derived from single bronchial epithelial cells across 16 subjects. Tobacco smoking is the major influence on mutation burden, adding 1000-10,000+ mutations/cell, massively increasing both between-subject and within-subject variance, and generating several distinct signatures of substitutions and indels. A population of cells in subjects with smoking history had mutation burdens equivalent to that expected for never-smokers: these cells lacked tobacco-specific mutational signatures, were four-fold more frequent in ex-smokers than current smokers, and had significantly longer telomeres than their more mutated counterparts. Driver mutations increased in frequency with age, affecting 4-14% of cells in middle-aged never-smokers. In current smokers, ≥25% of cells carried driver mutations and 0-6% cells had 2 or even 3 drivers. Thus, tobacco smoking increases mutation burden, cell-to-cell heterogeneity and driver mutations, but quitting promotes replenishment of bronchial epithelium from mitotically quiescent cells that have avoided tobacco mutagenesis. HiSeq X Ten 644
EGAD00001005214 All normal somatic cells are thought to acquire mutations but understanding of the rates, patterns, causes and consequences of somatic mutation in normal cells is limited. Uterine endometrium adopts multiple physiological states over a lifetime and is lined by a gland-forming epithelium. Whole genome sequencing of normal endometrial glands from women aged 19 to 81 years showed them to be clonal cell populations derived from recent common ancestors, with total mutation burdens that increase with age at ~29 base substitutions/year and which are many-fold lower than endometrial cancers. Normal endometrial glands frequently carry driver mutations in cancer genes. Driver mutation burdens increase with age and correlate negatively with parity. Phylogenetic trees of normal endometrial glands constructed using whole genome sequences indicated that clones with drivers often originate during the first decades and spread to colonise the endometrial epithelial lining. The results show that driver mutation landscapes differ between normal cell types, perhaps shaped by differences in normal tissue physiology, and suggest that the procession of neoplastic changes leading to endometrial cancer is initiated early in life. HiSeq X Ten 0
EGAD00001005232 Whole genome sequencing of immune cells from patients diagnosed with psoriatic arthritis. This dataset contains all the data available for this study on 2019-08-07. HiSeq X Ten 8
EGAD00001005233 This study is a benchmarking exercise to explore potential sources of variation between different CRISPR drop out libraries. This dataset contains all the data available for this study on 2019-08-07. Illumina HiSeq 2500 37
EGAD00001005234 The Mutographs project aims to advance our understanding of the causes of cancer through studies of mutational signatures. Led by Mike Stratton, together with Paul Brennan, Ludmil Alexandrov, Allan Balmain, David Phillips and Peter Campbell, this large-scale international research endeavour was awarded a Cancer Research UK Grand Challenge. Different patterns of somatic mutation are generated by the different environmental, lifestyle and genetic factors that cause cancer, many of which are still unknown. Within Mutographs, King's College London will characterise the mutational signatures induced by putative human carcinogens in order to identify the origins of mutational signatures found in human cancers. To achieve this, human organoid cell cultures will be exposed to a representative catalogue of known or suspected human carcinogens and mutagens and, using whole genome sequencing, the patterns of mutations induced by them will be determined. Somatic mutational signatures will be subsequently extracted by non-negative matrix factorisation methods and correlated with exposure data. Through an enhanced understanding of cancer aetiology, Mutographs' unprecedented effort is anticipated to outline modifiable risk factors, lead to new approaches to prevent cancer, and provide opportunities to empower early detection, refine high-risk groups and contribute to further therapeutic development. This dataset contains all the data available for this study on 2019-08-07. HiSeq X Ten 12
EGAD00001005252 Immortalised HaCaT keratinocytes were transduced with Cas9 and the CRISPR-KO v1.1 genome-wide gRNA library. The gRNA library was prepared from genomic DNA isolated 14 days post library transduction. gRNA representation will be compared to the original CRISPR-KO v1.1 library to reveal genes essential for HaCaT survival and growth. This dataset contains all the data available for this study on 2019-08-14. Illumina HiSeq 2500 19
EGAD00001005296 A genome-wide CRISPR screen was performed to find resistance to targeted drugs for melanoma and lung. This dataset contains all the data available for this study on 2019-08-28. Illumina HiSeq 2500 237
EGAD00001005297 A targeted gene screen of 365 known cancer genes in luminal breast cancer samples pre-chemotherapy and at resection post-chemotherapy to evaluate clonal expansion of chemotherapy cancer cells. This dataset contains all the data available for this study on 2019-08-28. Illumina HiSeq 2000,Illumina HiSeq 2500 133
EGAD00001005312 This study investigates the genomic and transcriptomic characteristics of Wilms' tumour organoids. This dataset contains all the data available for this study on 2019-09-05. HiSeq X Ten 0
EGAD00001005313 Swift kit whole genome bisulphite of MPN colonies. This dataset contains all the data available for this study on 2019-09-05. HiSeq X Ten 16
EGAD00001005372 12 tissues from the warm autopsy are selected for this project. Using 10X Chromium technology we will generate ~1000 single cell/nuclei genomic libraries per tissue. Each tissue will be whole genome sequenced (~2 lanes per 1000 cells) on HiSeq X10. For each single cell we will generate a CNV profile and investigate the level of genomic heterogeneity within tissues and across different tissues. This dataset contains all the data available for this study on 2019-10-02. HiSeq X Ten 6
EGAD00001005495 The genomic hallmark of clear cell renal cell carcinoma is the loss of the short arm of chromosome 3. This appears to be the earliest genomic event in the formation of these cancers. Often chromosome 3 is lost at the same time as part of chromosome 5 is duplicated via an unbalanced translocation, often with features consistent with focal chromothripsis. In this study, we sought to reconstruct the chromothriptic event that underlies the initiation of kidney cancer. We used long read sequencing (PromethION, Oxford Nanopore Technologies) of patient tumour-derived DNA to elucidate how a single cell division error can generate cancer genome complexity. PromethION 2
EGAD00001005751 In this study we aim to characterise the landscape of mutation and clonal selection in the human pancreas. The study combines targeted sequencing and whole-genome sequencing of microbiopsies from the pancreas. The range of patients studied will include healthy individuals, both smokers and non-smokers, and patients with pancreatic ductal adenocarcinoma. This dataset contains all the data available for this study on 2019-12-17. HiSeq X Ten 136
EGAD00001005770 The aim of this study is to reconstruct the phylogenetic development of childhood tumours HiSeq X Ten 8
EGAD00001005784 CRISPR/Cas9 lethality screens in a set of Asian head and neck cancer cell lines to identify novel targets. This dataset contains all the data available for this study on 2020-01-15. Illumina HiSeq 2500 100
EGAD00001005785 The aim of this study is to describe the transcriptome of single arthritic cells. This dataset contains all the data available for this study on 2020-01-15. HiSeq X Ten,Illumina HiSeq 4000 510
EGAD00001005786 Cancer is a genetic disease caused by an accumulation of mutations; however, many of these mutations have been identified in pathologically normal tissue. We aim to use laser-capture microscopy (LCM) to sample individual clones from the lung tissue of individuals with a variety of lung diseases (COPD, UIP, IPF, Emphysema, pulmonary hypertension). This will allow us to identify whether cancer-associated mutations appear in this normal tissue, assess the mutational burden present, and identify the mutational processes causing these mutations. Smoking is a large risk factor for developing many of these lung diseases, so we are particularly keen to determine whether there is evidence of a smoking signature in these patients. This dataset contains all the data available for this study on 2020-01-15. HiSeq X Ten 190
EGAD00001005787 Cancer is a genetic disease caused by an accumulation of mutations; however, many of these mutations have been identified in pathologically normal tissue. We aim to use laser-capture microscopy (LCM) to sample individual clones from breast tissue to identify whether cancer-associated mutations appear in this normal tissue, assess the mutational burden present, and identify the mutational processes causing these mutations. We will sample from a wide age range of individuals (<20 to >70 years old) to determine whether these processes differ in pre- and post-menopausal women. We will also be comparing the tissue from healthy individuals (samples from breast reduction surgery) to those at elevated risk of breast cancer (mastectomy from BRCA1/2 patients) and those who have breast cancer (adjacent normal, distal normal, and tumour tissue from mastectomy). This will allow us to determine how these processes are different between these groups of individuals, and gain insight into the earliest stages of tumour development. This dataset contains all the data available for this study on 2020-01-15. Illumina HiSeq 4000 689
EGAD00001005789 Samples prepared by LCM - 5 cases for pilot study. Bulk DNA not available. This dataset contains all the data available for this study on 2020-01-15. HiSeq X Ten 16
EGAD00001005919 We will be using the G&T method to sequence single cell genomes and transcriptomes derived from the FS13B iPSCs cell line. The cell cycle state of each of the single cells is known. Hence, we will be analysing the genome and transcriptome of single cells from each of the cell cycle states to generate a copy number profile and transcriptome profile per given cell cycle stage: G1, S, G2, S. This dataset contains all the data available for this study on 2020-01-29. Illumina HiSeq 4000 192
EGAD00001005920 Sequencing of LCM-derived microbiopsies from 20 women who underwent risk-reducing reduction mastectomies due to germline BRCA1/2. Goal to assess the mutational burden, spectrum, and clonal dynamics within the tissue. Targeted data will be used as a driver and clonality screen; highly clonal or driver-containing samples will subsequently be sent for whole-genome sequencing. Results from this portion of the study will be compared to women who had cosmetic breast reduction surgeries and those with cancer. This dataset contains all the data available for this study on 2020-01-29. Illumina HiSeq 4000 49
EGAD00001005921 Sequencing of LCM-derived microbiopsies from 20 women who underwent risk-reducing reduction mastectomies due to germline BRCA1/2. Goal to assess the mutational burden, spectrum, and clonal dynamics within the tissue. Exome data will be used as a driver and clonality screen; highly clonal or driver-containing samples will subsequently be sent for whole-genome sequencing. Results from this portion of the study will be compared to women who had cosmetic breast reduction surgeries and those with cancer. This dataset contains all the data available for this study on 2020-01-29. Illumina HiSeq 4000 8
EGAD00001005922 Sequencing of LCM-derived microbiopsies from 40 women who underwent mastectomies due to breast cancer. LCM and sequencing will be conducted on both normal, unaffected breast, and, where possible, tumour tissue. Goal to assess the mutational burden, spectrum, and clonal dynamics within the tissue, and compare findings between the normal and associated cancer tissues. Whole-genome sequencing will be conducted on samples identified as promising from the initial targeted data. Results from this portion of the study will be compared to women who had cosmetic breast reduction surgeries and those who are BRCA carriers. . This dataset contains all the data available for this study on 2020-01-29. HiSeq X Ten 46
EGAD00001005923 Sequencing of LCM-derived microbiopsies from 30 women who underwent mastectomies due to breast cancer. Goal to assess the mutational burden, spectrum, and clonal dynamics within the tissue. Targeted data will be used as a driver and clonality screen; highly clonal or driver-containing samples will subsequently be sent for whole-genome sequencing. Results from this portion of the study will be compared to women who had cosmetic breast reduction surgeries and those with germline BRCA 1/2 mutations. This dataset contains all the data available for this study on 2020-01-29. Illumina HiSeq 4000 29
EGAD00001005924 Sequencing of LCM-derived microbiopsies from 30 women who underwent mastectomies due to a breast cancer diagnosis. Goal to assess the mutational burden, spectrum, and clonal dynamics within the tissue. Exome data will be used as a driver and clonality screen, highly clonal or driver-containing samples will subsequently be sent for whole-genome sequencing. Results from this portion of the study will be compared to women who had cosmetic breast reduction surgeries and those with germline BRCA 1/2 mutations. . This dataset contains all the data available for this study on 2020-01-29. Illumina HiSeq 4000 2
EGAD00001005925 Sequencing of LCM-derived microbiopsies from explanted lung from pulmonary fibrosis patient. Goal to assess the mutational burden, spectrum, and clonal dynamics within the tissue. Deep sampling throughout multiple regions of the lung will determine whether there are differences in smoking-related mutation burden in different portions of the lung. Targeted sequencing will be conducted on samples to identify drivers of interest and clonality of the samples, well-performing samples will be sent for subsequent whole-genome sequencing. Results from this portion of the study will be compared to other individuals with smoking-related diseases (COPD, pulmonary fibrosis, lung cancer), and normal, non-smoking lungs. . This dataset contains all the data available for this study on 2020-01-29. Illumina HiSeq 4000 27
EGAD00001005955 Sequencing of tissue samples and their derived organoids from oesophageal, pancreatic and colorectal cancer patients. HiSeq X Ten 149
EGAD00001005958 Sequencing of tissue samples and their derived organoids from oesophageal, pancreatic and colorectal cancer patients. Illumina HiSeq 4000 359
EGAD00001005990 Sequencing of LCM-derived microbiopsies from explanted lung from COPD patient. Goal to assess the mutational burden, spectrum, and clonal dynamics within the tissue. Deep sampling throughout multiple regions of the lung will determine whether there are differences in smoking-related mutation burden in different portions of the lung. Whole-genome sequencing will be conducted on samples identified as promising from the initial targeted data. Results from this portion of the study will be compared to other individuals with smoking-related diseases (COPD, pulmonary fibrosis, lung cancer), and normal, non-smoking lungs. . This dataset contains all the data available for this study on 2020-02-20. HiSeq X Ten 12
EGAD00001005991 Sequencing of LCM-derived microbiopsies from explanted lung from pulmonary fibrosis patient. Goal to assess the mutational burden, spectrum, and clonal dynamics within the tissue. Deep sampling throughout multiple regions of the lung will determine whether there are differences in smoking-related mutation burden in different portions of the lung. Whole-genome sequencing will be conducted on samples identified as promising from the initial targeted data. Results from this portion of the study will be compared to other individuals with smoking-related diseases (COPD, pulmonary fibrosis, lung cancer), and normal, non-smoking lungs. . This dataset contains all the data available for this study on 2020-02-20. HiSeq X Ten 18
EGAD00001005992 Using whole genome sequencing of lymphocytes excised from human tissue using laser capture microscopy (LCM), we identify the mutations arising in these microenvironments. This work will contribute towards developing a catalogue of mutations present in tissue resident lymphocytes across a range of tissues, and will characterize the mutational signatures that result from each microenvironment. . This dataset contains all the data available for this study on 2020-02-20. HiSeq X Ten 9
EGAD00001005994 The Mutographs project aims to advance our understanding of the causes of cancer through studies of mutational signatures. Led by Mike Stratton, together with Paul Brennan, Ludmil Alexandrov, Allan Balmain, David Phillips and Peter Campbell, this large-scale international research endeavour was awarded a Cancer Research UK Grand Challenge. Different patterns of somatic mutation are generated by the different environmental, lifestyle and genetic factors that cause cancer, many of which are still unknown. Within Mutographs, the International Agency for Research on Cancer is coordinating the recruitment of 5000 individuals with cancer (colorectal, renal, pancreatic, oesophageal adenocarcinoma or oesophageal squamous cancers) across 5 continents to explore whether different mutational signatures explain marked variation in incidence. In brief, through an international network of collaborators around the world, biological materials are collected, along with demographic, histological, clinical and questionnaire data. Whole genome sequences of tumour-germline DNA pairs are generated at the Wellcome Trust Sanger Institute. Somatic mutational signatures are subsequently extracted by non-negative matrix factorisation methods and correlated with risk factor data. Through an enhanced understanding of cancer aetiology, Mutographs' unprecedented effort is anticipated to outline modifiable risk factors, lead to new approaches to prevent cancer, and provide opportunities to empower early detection, refine high-risk groups and contribute to further therapeutic development. This dataset contains all the data available for this study on 2020-02-20. HiSeq X Ten 4
EGAD00001005995 The study will use WGS to aid in benchmarking different culture conditions in a set of genetically annotated human organoid lines. The data will be used to assess whether any clonal differences are introduced when culturing these lines in different conditions. This dataset contains all the data available for this study on 2020-02-20. HiSeq X Ten 30
EGAD00001006056 The aim of this project is to differentiate human embryonic stem cells to an extra-embryonic fate, specifically the hypoblast. This is of utmost importance given the current lack of human hypoblast stem cells. We hypothesized that the pluripotent characteristics of the starting human embryonic stem cell population may dictate the competency for extra-embryonic cell fate specification. Based on this hypothesis and using human embryonic stem cells maintained in different naïve-like culture regimes, we have now developed conditions that allow the differentiation of human embryonic stem cells to a stable GATA6+ SOX2- population. This suggests that these cells may be putative human hypoblast stem cells. To validate this finding, we propose to perform RNA sequencing experiments of the differentiated human embryonic stem cells. By comparing their RNA expression profile to the single cell sequencing data of the human embryo that we are currently generating, we will be able to determine the identity of our GATA6+ SOX2- cells, and establish whether they represent the in vivo human hypoblast. This dataset contains all the data available for this study on 2020-04-20. Illumina HiSeq 4000 7
EGAD00001006083 The Mutographs project aims to advance our understanding of the causes of cancer through studies of mutational signatures. Led by Mike Stratton, together with Paul Brennan, Ludmil Alexandrov, Allan Balmain, David Phillips and Peter Campbell, this large-scale international research endeavour was awarded a Cancer Research UK Grand Challenge. Different patterns of somatic mutation are generated by the different environmental, lifestyle and genetic factors that cause cancer, many of which are still unknown. Within Mutographs, the International Agency for Research on Cancer is coordinating the recruitment of 5000 individuals with cancer (colorectal, renal, pancreatic, oesophageal adenocarcinoma or oesophageal squamous cancers) across 5 continents to explore whether different mutational signatures explain marked variation in incidence. In brief, through an international network of collaborators around the world, biological materials are collected, along with demographic, histological, clinical and questionnaire data. Whole genome sequences of tumour-germline DNA pairs are generated at the Wellcome Trust Sanger Institute (40X and 20X depth respectively). Somatic mutational signatures are subsequently extracted by non-negative matrix factorisation methods and correlated with risk factor data. Through an enhanced understanding of cancer aetiology, Mutographs' unprecedented effort is anticipated to outline modifiable risk factors, lead to new approaches to prevent cancer, and provide opportunities to empower early detection, refine high-risk groups and contribute to further therapeutic development. This dataset contains all the data available for this study on 2021-09-27. Illumina NovaSeq 6000 520
EGAD00001006088 Somatic mutations accumulate in healthy tissues as we age, giving rise to cancer and potentially contributing to ageing. To study somatic mutations in non-neoplastic tissues, we developed a series of protocols to sequence the genomes of small populations of cells isolated from histological sections. Here, we describe a complete workflow that combines laser-capture microdissection (LCM) with low-input genome sequencing, whilst circumventing the use of whole-genome amplification (WGA). The protocol is subdivided broadly into 4 steps: tissue processing, LCM, low-input library generation and mutation calling and filtering. The tissue processing and LCM steps are provided as general guidelines which may require tailoring based on the specific requirements of the study at hand. Our protocol for low-input library generation utilises enzymatic rather than acoustic fragmentation to generate WGA-free whole-genome libraries. Finally, the mutation calling and filtering strategy has been adapted from previously published protocols to account for artefacts introduced via library creation. To date, we have used this workflow to perform targeted and whole-genome sequencing of small populations of cells (typically 100-1,000 cells) in thousands of microbiopsies from a wide range of human tissues. The low-input DNA protocol is designed to be compatible with liquid handling platforms and make use of equipment and expertise standard to any core sequencing facility. However, obtaining low-input DNA material via LCM requires specialized equipment and expertise. The entire protocol from tissue reception through whole-genome library generation can be accomplished in as little as a week, though 2-3 weeks would be a more typical turnaround time. HiSeq X Ten,Illumina NovaSeq 6000 18
EGAD00001006113 In this study, we aim to characterise the landscape of mutation and clonal selection in the human bladder. The data in this study will be generated by whole-genome sequencing of laser-dissected microbiopsies from the bladder. The samples utilised in this study will include urothelium from transplant donors with no history of bladder cancer and cystectomy specimens from patients with bladder cancer. . This dataset contains all the data available for this study on 2020-05-05. HiSeq X Ten 84
EGAD00001006114 In this study, we aim to characterise the landscape of mutation and clonal selection in the human bladder. The study includes targeted sequencing of laser-dissected microbiopsies from the bladder. The samples utilised in this study will include urothelium from transplant donors with no history of bladder cancer and cystectomy specimens from patients with bladder cancer. . This dataset contains all the data available for this study on 2020-05-05. Illumina HiSeq 4000 1916
EGAD00001006115 In this study, we aim to characterise the landscape of mutation and clonal selection in the human bladder. The data in this study will be generated by whole-exome sequencing of laser-dissected microbiopsies from the bladder. The samples utilised in this study will include urothelium from transplant donors with no history of bladder cancer and cystectomy specimens from patients with bladder cancer. . This dataset contains all the data available for this study on 2020-05-05. Illumina HiSeq 4000 103
EGAD00001006116 In this study, we aim to characterise the landscape of mutation and clonal selection in the human bladder. The data in this study will be generated by whole-genome sequencing of laser-dissected microbiopsies from the bladder. The samples utilised in this study will include urothelium from transplant donors with no history of bladder cancer and cystectomy specimens from patients with bladder cancer. . This dataset contains all the data available for this study on 2020-05-05. Illumina NovaSeq 6000 24
EGAD00001006117 In this study, we aim to characterise the landscape of mutation and clonal selection in the human bladder. The study includes targeted sequencing of laser-dissected microbiopsies from the bladder. The samples utilised in this study will include urothelium from transplant donors with no history of bladder cancer and cystectomy specimens from patients with bladder cancer. This dataset contains all the data available for this study on 2020-05-05. Illumina NovaSeq 6000 575
EGAD00001006162 In this study we will perform whole genome sequencing on in vitro colonies. HiSeq X Ten,Illumina HiSeq 4000,Illumina NovaSeq 6000 616
EGAD00001006194 The risk of getting non-melanoma skin cancer varies over 40-fold across the body. Here we map mutations in normal skin in high and low risk sites in normal donors and those with an increased risk of skin cancer. The density of mutations varied widely, with evidence of positive and negative genetic selection. Regional differences in mutational signatures in high and low cancer risk sites and preferential selection of mutants of TP53 in high risk skin and FAT1 in lower risk skin were observed. 10% of clones had copy number changes in cancer associated genes and the largest had multiple driver mutations with loss of heterozygosity. In hair follicles, a proposed site of origin of skin cancers, mutations in the upper follicle resembled adjacent skin, but the lower follicle was sparsely mutated. We conclude cancer risk reflects the efficiency of transformation of oncogenic mutants rather than the density of mutant clones. HiSeq X Ten,Illumina HiSeq 2500,Illumina NovaSeq 6000 805
EGAD00001006212 Mutation accumulation over time in normal somatic cells contributes to cancer development and is proposed as a cause of ageing. DNA polymerases POLE and POLD1 replicate DNA with high fidelity during normal cell divisions. However, in some cancers defective proofreading due to acquired mutations in the exonuclease domains of POLE or POLD1 causes markedly elevated somatic mutation burdens with distinctive mutational signatures. POLE and POLD1 exonuclease domain mutations also cause familial cancer predisposition when inherited through the germline. Here, we sequenced normal tissue DNA from individuals with germline POLE or POLD1 exonuclease domain mutations. Increased mutation burdens with characteristic mutational signatures were found to varying extents in all normal adult somatic cell types examined, during early embryogenesis and in sperm. Mutation burdens were further markedly elevated in neoplasms from these individuals. Thus human physiology is able to tolerate ubiquitously elevated mutation burdens. Indeed, with the exception of early onset cancer, individuals with germline POLE and POLD1 exonuclease domain mutations are not reported to show abnormal phenotypic features, including those of premature ageing. The results, therefore, do not support a simple model in which all features of ageing are attributable to widespread cell malfunction directly resulting from somatic mutation burdens accrued during life. Illumina HiSeq 4000,Illumina NovaSeq 6000 211
EGAD00001006255 Chronic liver disease is associated with metabolic dysregulation, liver failure and hepatocellular carcinoma. We analysed somatic mutations from 1202 genomes across 32 liver samples, including normal controls, alcohol-related and non-alcoholic fatty liver disease. Five of 27 patients with liver disease carried hotspot driver mutations in FOXO1, the major transcription factor downstream of insulin signalling. FOXO1 mutations were independently acquired by up to 5 distinct clones within the same patient’s sample, and impaired insulin-mediated nuclear export of FOXO1. GPAM, which produces storage triacylglycerol from dietary calories, also had significant excess of mutations, similarly exhibiting convergent evolution within biopsies. Telomeres were shorter in diseased than normal liver, with attrition more pronounced in larger clones. Multiple independent acquisitions of drivers within one small liver sample imply that such mutations could affect hundreds of grams of tissue across the whole organ, potentially contributing to systemic metabolic dysfunction. HiSeq X Ten,Illumina NovaSeq 6000 1111
EGAD00001006296 Like many childhood cancers, malignant rhabdoid tumours (MRT) are thought to arise from aberrant foetal development. Although MRT predominantly exhibit a mesenchymal phenotype, it has been suggested that the foetal root of MRT lies in neural crest development. Here, we combine phylogenetic analyses of MRT, single cell mRNA assays, and functional experiments in patient-derived MRT organoids, to define the embryological origin of MRT and explore therapeutic avenues that may drive MRT differentiation. Phylogenetic analyses from the distribution of somatic mutations revealed that MRT were related to neural crest-derived, but not to mesodermal tissues, providing direct evidence of the neural crest origin of MRT in humans. In MRT organoids, reversal of the principal driver event underpinning MRT, SMARCB1 loss, induced differentiation along mesenchymal pathways. Together, these findings placed MRT cells on a developmental trajectory of neural crest to mesenchyme conversion, and defined the transcriptional changes underpinning MRT differentiation. Searching perturbation databases for agents that mimic these mRNA changes, we identified HDAC and mTOR inhibition as potential differentiation agents. Treatment of MRT organoids with this drug combination induced proliferation arrest with transcriptional changes akin to SMARCB1 re-expression. Our study defines the embryological root of MRT and proposes a differentiation treatment for this often fatal childhood cancer. HiSeq X Ten,Illumina HiSeq 4000,Illumina NovaSeq 6000,NextSeq 500 30
EGAD00001006337 The human placenta harbours chromosomal aberrations that are absent from the fetus in one to two percent of pregnancies. This confined mosaicism suggests that embryonic genetic bottlenecks exist, which phylogenetically segregate placental tissue. Here, we studied the somatic genetic landscape of human placentas by whole genome sequencing of 86 placental biopsies and of 106 microdissections. HiSeq X Ten,Illumina NovaSeq 6000 278
EGAD00001006342 This dataset was used to characterise T cell gene expression and clonality at sites of active inflammation within the joints of psoriatic arthritis (PsA) patients, and to compare these results with T cells from the peripheral blood of those same patients. Freshly sorted CD45RA negative CD3+CD4+ and CD3+CD8+ single cells from four patients were individually flow sorted into 96-well full-skirted plates (Eppendorf) containing 10µL of a 2% Dithiothreitol (DTT, 2M Sigma-Aldrich), RTL lysis buffer (Qiagen) solution. Cell lysates were sealed, mixed and spun down before storing at -80 ºC. Paired-end multiplexed sequencing libraries were prepared following the Smart-seq 2 protocol using the Nextera XT DNA library prep kit (Illumina). A pool of barcoded libraries from four different plates were sequenced across two lanes on the Illumina HiSeq 2500. Illumina HiSeq 2500 4703
EGAD00001006363 The hematological malignancy multiple myeloma (MM), also called Kahler's disease or plasma cell (PC) myeloma, is characterized by a clonal expansion of PCs originating in the bone marrow (BM). The expansion of these cells leads to an overproduction of antibodies and results in typical symptoms such as anemia, renal failure and bone lesions. All cases of MM are preceded by the asymptomatic, non-malignant pre-stage monoclonal gammopathy of undetermined significance (MGUS). Of all MGUS patients, only 1% per year will progress to MM. Despite efforts to elucidate the molecular mechanisms underlying the MGUS-to-MM progression, its pathogenesis remains largely unknown. Additionally, the genetic profiles of MGUS patients have been investigated only to a limited extent, because MGUS is usually an incidental finding and because of the difficulties in BM sampling and in isolating a sufficient number of aberrant PCs from the BM aspirates of MGUS patients. Consequently, reliable biomarkers to predict which individual MGUS patients will progress to MM are lacking. It is therefore essential to study the molecular pathogenesis of MGUS and the role of genetic events in relation to the malignant transformation to MM. 42
EGAD00001006392 An investigation of clonal haematopoiesis in patients with neurodegenerative disease. Illumina HiSeq 2500 181
EGAD00001006423 Leukaemia and related blood cancers occur due to genetic changes that typically accumulate over many years. This study will employ targeted next-generation sequencing to retrace the preclinical evolution of several types of haematological malignancy. Investigating the progression of the earliest pre-malignant ancestral clones promises to offer valuable insights into early leukaemia evolution and therapeutic vulnerabilities of leukaemia stem cells. HiSeq X Ten,Illumina NovaSeq 6000 137
EGAD00001006424 Leukaemia and related blood cancers occur due to genetic changes that typically accumulate over many years. This study will employ targeted next-generation sequencing to retrace the preclinical evolution of several types of haematological malignancy. Investigating the progression of the earliest pre-malignant ancestral clones promises to offer valuable insights into early leukaemia evolution and therapeutic vulnerabilities of leukaemia stem cells. Illumina HiSeq 2500,Illumina HiSeq 4000 48
EGAD00001006427 The Mutographs project aims to advance our understanding of the causes of cancer through studies of mutational signatures. Led by Mike Stratton, together with Paul Brennan, Ludmil Alexandrov, Allan Balmain, David Phillips and Peter Campbell, this large-scale international research endeavour was awarded a Cancer Research UK Grand Challenge. Different patterns of somatic mutation are generated by the different environmental, lifestyle and genetic factors that cause cancer, many of which are still unknown. Within Mutographs, the International Agency for Research on Cancer is coordinating the recruitment of 5000 individuals with cancer (colorectal, renal, pancreatic, oesophageal adenocarcinoma or oesophageal squamous cancers) across 5 continents to explore whether different mutational signatures explain marked variation in incidence. In brief, through an international network of collaborators around the world, biological materials are collected, along with demographic, histological, clinical and questionnaire data. Whole genome sequences of tumour-germline DNA pairs are generated at the Wellcome Trust Sanger Institute. Somatic mutational signatures are subsequently extracted by non-negative matrix factorisation methods and correlated with risk factor data. Through an enhanced understanding of cancer aetiology, Mutographs' unprecedented effort is anticipated to outline modifiable risk factors, lead to new approaches to prevent cancer, and provide opportunities to empower early detection, refine high-risk groups and contribute to further therapeutic development. Illumina NovaSeq 6000 182
EGAD00001006459 Bottleneck sequencing of human tissue including neurons, cord blood, sperm. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . This dataset contains all the data available for this study on 2020-10-20. HiSeq X Ten,Illumina HiSeq 2500,Illumina HiSeq 4000,Illumina NovaSeq 6000 192
EGAD00001006595 This dataset contains 160 single-cell derived blood colonies from two neonates and 6 adults. It also contains 18 samples that were used as matched normals to call mutations in NanoSeq data (dataset EGAD00001006459). HiSeq X Ten,Illumina NovaSeq 6000 13
EGAD00001006641 During the course of a lifetime normal human cells accumulate mutations. Here, using multiple samples from the same individuals we compared the mutational landscape in 29 anatomical structures from soma and the germline. Two ubiquitous mutational signatures, SBS1 and SBS5/40, accounted for the majority of acquired mutations in most cell types but their absolute and relative contributions varied substantially. SBS18, potentially reflecting oxidative damage, and several additional signatures attributed to exogenous and endogenous exposures contributed mutations to subsets of cell types. The mutation rate was lowest in spermatogonia, the stem cell from which sperm are generated and from which most genetic variation in the human population is thought to originate. This was due to low rates of ubiquitous mutation processes and may be partially attributable to a low cell division rate of basal spermatogonia. The results provide important insights into how mutational processes affect the soma and germline. HiSeq X Ten,Illumina NovaSeq 6000 1
EGAD00001006642 During the course of a lifetime normal human cells accumulate mutations. Here, using multiple samples from the same individuals we compared the mutational landscape in 29 anatomical structures from soma and the germline. Two ubiquitous mutational signatures, SBS1 and SBS5/40, accounted for the majority of acquired mutations in most cell types but their absolute and relative contributions varied substantially. SBS18, potentially reflecting oxidative damage, and several additional signatures attributed to exogenous and endogenous exposures contributed mutations to subsets of cell types. The mutation rate was lowest in spermatogonia, the stem cell from which sperm are generated and from which most genetic variation in the human population is thought to originate. This was due to low rates of ubiquitous mutation processes and may be partially attributable to a low cell division rate of basal spermatogonia. The results provide important insights into how mutational processes affect the soma and germline. Illumina HiSeq 4000,Illumina NovaSeq 6000 0
EGAD00001006643 During the course of a lifetime normal human cells accumulate mutations. Here, using multiple samples from the same individuals we compared the mutational landscape in 29 anatomical structures from soma and the germline. Two ubiquitous mutational signatures, SBS1 and SBS5/40, accounted for the majority of acquired mutations in most cell types but their absolute and relative contributions varied substantially. SBS18, potentially reflecting oxidative damage, and several additional signatures attributed to exogenous and endogenous exposures contributed mutations to subsets of cell types. The mutation rate was lowest in spermatogonia, the stem cell from which sperm are generated and from which most genetic variation in the human population is thought to originate. This was due to low rates of ubiquitous mutation processes and may be partially attributable to a low cell division rate of basal spermatogonia. The results provide important insights into how mutational processes affect the soma and germline. Illumina HiSeq 4000 85
EGAD00001006732 Mutational signatures in esophageal squamous cell carcinoma from eight countries of varying incidence – patient metadata (Mutographs) 552
EGAD00001006859 Osteosarcoma, the most common primary malignant tumour of bone, affects children and adults alike. No fundamental biological differences between paediatric and adult osteosarcoma are known. Here, we apply multi-region whole genome sequencing to an index case of a four-year old child whose aggressive tumour harboured high level, focal amplifications of MYC and CCNE1 connected by translocations. We re-analysed copy number readouts of 258 cases of high-grade osteosarcoma from three different cohorts and identified an additional three cases with MYC and CCNE1 co-amplification, confined to children and associated with aggressive disease. Examining the age distribution of MYC and CCNE1 amplicons across all cases revealed a significant enrichment of focal MYC amplification in children, whereas CCNE1 amplification is not strictly restricted to children. Our findings indicate that amplification of the MYC oncogene, known to be associated with a poor outcome, delineates a variant of osteosarcoma specific to childhood. When co-amplified with CCNE1, it may herald an aggressive disease course. HiSeq X Ten 8
EGAD00001006868 Mutational signatures in esophageal squamous cell carcinoma from eight countries of varying incidence – sequence data (Mutographs) Illumina NovaSeq 6000 1145
EGAD00001006929 Sequencing of LCM-derived microbiopsies from explanted lung from COPD patient. Goal to assess the mutational burden, spectrum, and clonal dynamics within the tissue. Deep sampling throughout multiple regions of the lung will determine whether there are differences in smoking-related mutation burden in different portions of the lung. Targeted sequencing will be conducted on samples to identify drivers of interest and clonality of the samples, well-performing samples will be sent for subsequent whole-genome sequencing. Results from this portion of the study will be compared to other individuals with smoking-related diseases (COPD, pulmonary fibrosis, lung cancer), and normal, non-smoking lungs. . This dataset contains all the data available for this study on 2021-02-02. HiSeq X Ten,Illumina HiSeq 4000 30
EGAD00001006930 Sequencing of LCM-derived microbiopsies from explanted lung from COPD patient. Goal to assess the mutational burden, spectrum, and clonal dynamics within the tissue. Deep sampling throughout multiple regions of the lung will determine whether there are differences in smoking-related mutation burden in different portions of the lung. Whole-genome sequencing will be conducted on samples identified as promising from the initial targeted data. Results from this portion of the study will be compared to other individuals with smoking-related diseases (COPD, pulmonary fibrosis, lung cancer), and normal, non-smoking lungs. . This dataset contains all the data available for this study on 2021-02-02. Illumina NovaSeq 6000 24
EGAD00001006932 Sequencing of LCM-derived microbiopsies from explanted lung from a COPD patient. The goal is to assess the mutational burden, spectrum, and clonal dynamics within the tissue. Deep sampling throughout multiple regions of the lung will determine whether there are differences in smoking-related mutation burden in different portions of the lung. Whole-genome sequencing will be conducted on samples identified as promising from the initial targeted data. Results from this portion of the study will be compared to other individuals with smoking-related diseases (COPD, pulmonary fibrosis, lung cancer), and normal, non-smoking lungs. This dataset contains all the data available for this study on 2021-02-02. Illumina NovaSeq 6000 20
EGAD00001006933 The Mutographs project aims to advance our understanding of the causes of cancer through studies of mutational signatures. Led by Mike Stratton, together with Paul Brennan, Ludmil Alexandrov, Allan Balmain, David Phillips and Peter Campbell, this large-scale international research endeavour was awarded a Cancer Research UK Grand Challenge. Different patterns of somatic mutation are generated by the different environmental, lifestyle and genetic factors that cause cancer, many of which are still unknown. Within Mutographs, King's College London will characterise the mutational signatures induced by putative human carcinogens in order to identify the origins of mutational signatures found in human cancers. To achieve this, human organoid cell cultures will be exposed to a representative catalogue of known or suspected human carcinogens and mutagens and, using whole genome sequencing, the patterns of mutations induced by them will be determined. Somatic mutational signatures will subsequently be extracted by non-negative matrix factorisation methods and correlated with exposure data. Through an enhanced understanding of cancer aetiology, Mutographs' unprecedented effort is anticipated to outline modifiable risk factors, lead to new approaches to prevent cancer, and provide opportunities to empower early detection, refine high-risk groups and contribute to further therapeutic development. This dataset contains all the data available for this study on 2021-02-02. HiSeq X Ten 6
EGAD00001006934 We study lymphocyte somatic evolution through the sequencing of normal healthy lymphocytes. We perform whole-genome sequencing of single-cell derived T and B cell colonies to identify somatic mutations, and perform targeted deep-sequencing of these mutations. The lineages of T and B cells and the frequencies of these mutations reveal the neutral and non-neutral evolutionary processes underlying lymphocyte growth and function. This dataset contains all the data available for this study on 2021-02-02. HiSeq X Ten 20
EGAD00001006935 We study lymphocyte somatic evolution through the sequencing of normal healthy lymphocytes. We perform whole-genome sequencing of single-cell derived T and B cell colonies to identify somatic mutations, and perform targeted deep-sequencing of these mutations. The lineages of T and B cells and the frequencies of these mutations reveal the neutral and non-neutral evolutionary processes underlying lymphocyte growth and function. This dataset contains all the data available for this study on 2021-02-02. HiSeq X Ten 9
EGAD00001006969 NOTCH1 mutant clones occupy the majority of normal human esophagus by middle age, but are comparatively rare in esophageal cancers, suggesting NOTCH1 mutations may promote clonal expansion but impede carcinogenesis. Here we test this hypothesis. Visualizing and sequencing NOTCH1 mutant clones in aging normal human esophagus reveals frequent biallelic mutations that block NOTCH1 signaling. In mouse esophagus, heterozygous Notch1 mutation confers a competitive advantage over wild type cells, an effect enhanced by loss of the second allele. Notch1 loss alters transcription but has minimal effects on epithelial structure and cell dynamics. In a carcinogenesis model, Notch1 mutations were less prevalent in tumors than normal epithelium. Deletion of Notch1 reduced tumor growth, an effect recapitulated by anti-NOTCH1 antibody treatment. We conclude that Notch1 mutations in normal epithelium are beneficial as wild type Notch1 promotes tumor expansion. NOTCH1 blockade has therapeutic potential in esophageal squamous tumors. Illumina HiSeq 2500 0
EGAD00001007037 Germ cell tumours (GCTs) are a collection of benign and malignant neoplasms derived from primordial germ cells (PGCs). They are uniquely able to generate embryonic and extraembryonic tissues, which in malignant GCTs carries prognostic and therapeutic significance. The developmental pathways underpinning GCT initiation and histogenesis are incompletely understood. Here, we studied the phylogenetic and transcriptional diversity of 15 malignant gonadal GCTs and four normal testis biopsies by sequencing 131 whole genomes and 416 transcriptomes from 14 gonadal histologies, excised by laser capture microdissection. Our findings demonstrate that tumours were initiated by whole genome duplication likely in embryogenesis, within ~5-8 cell divisions post-PGC specification, followed by chromosome 12p gains associated with invasive disease. Of note, 12p imbalances were not only generated through GCT-typical isochromosomes, but also through non-isochromosomic configurations. Whilst tumours developed along homogenous phylogenetic pathways, they spawned manifold tissues independent of genetic subclonal diversification. A key feature of GCT tissues was the expression of fetal-specific genes. The transcriptional diversity notwithstanding, we found universal transcriptional elements correlated with hallmark 12p gains. Overall, our study reveals stereotyped phylogenies and transcriptomes underpinning the development of GCT that originate in fetal life and may lend themselves to therapeutic manipulation. 416
EGAD00001007038 Germ cell tumours (GCTs) are a collection of benign and malignant neoplasms derived from primordial germ cells (PGCs). They are uniquely able to generate embryonic and extraembryonic tissues, which in malignant GCTs carries prognostic and therapeutic significance. The developmental pathways underpinning GCT initiation and histogenesis are incompletely understood. Here, we studied the phylogenetic and transcriptional diversity of 15 malignant gonadal GCTs and four normal testis biopsies by sequencing 131 whole genomes and 416 transcriptomes from 14 gonadal histologies, excised by laser capture microdissection. Our findings demonstrate that tumours were initiated by whole genome duplication likely in embryogenesis, within ~5-8 cell divisions post-PGC specification, followed by chromosome 12p gains associated with invasive disease. Of note, 12p imbalances were not only generated through GCT-typical isochromosomes, but also through non-isochromosomic configurations. Whilst tumours developed along homogenous phylogenetic pathways, they spawned manifold tissues independent of genetic subclonal diversification. A key feature of GCT tissues was the expression of fetal-specific genes. The transcriptional diversity notwithstanding, we found universal transcriptional elements correlated with hallmark 12p gains. Overall, our study reveals stereotyped phylogenies and transcriptomes underpinning the development of GCT that originate in fetal life and may lend themselves to therapeutic manipulation. HiSeq X Ten,Illumina NovaSeq 6000 0
EGAD00001007503 Multiple metastatic sites were sampled at autopsy from four patients diagnosed with metastatic colorectal cancer and subjected to whole-genome sequencing using the Illumina HiSeq X Ten platform to identify somatic variants, structural rearrangements and mutational signatures. The number of tumour samples per patient ranged from 6 to 66. HiSeq X Ten 88
EGAD00001007510 The Mutographs project aims to advance our understanding of the causes of cancer through studies of mutational signatures. Led by Mike Stratton, together with Paul Brennan, Ludmil Alexandrov, Allan Balmain, David Phillips and Peter Campbell, this large-scale international research endeavour was awarded a Cancer Research UK Grand Challenge. Different patterns of somatic mutation are generated by the different environmental, lifestyle and genetic factors that cause cancer, many of which are still unknown. Within Mutographs, the International Agency for Research on Cancer is coordinating the recruitment of 5000 individuals with cancer (colorectal, renal, pancreatic, oesophageal adenocarcinoma or oesophageal squamous cancers) across 5 continents to explore whether different mutational signatures explain marked variation in incidence. In brief, through an international network of collaborators, biological materials are collected, along with demographic, histological, clinical and questionnaire data. Whole genome sequences of tumour-germline DNA pairs are generated at the Wellcome Trust Sanger Institute. Somatic mutational signatures are subsequently extracted by non-negative matrix factorisation methods and correlated with risk factor data. Through an enhanced understanding of cancer aetiology, Mutographs' unprecedented effort is anticipated to outline modifiable risk factors, lead to new approaches to prevent cancer, and provide opportunities to empower early detection, refine high-risk groups and contribute to further therapeutic development. HiSeq X Ten,Illumina NovaSeq 6000 312
EGAD00001007682 Deep targeted sequencing of 56 genes associated with clonal haematopoiesis and haematological malignancy in peripheral blood-derived DNA from 385 older adults, each sampled 2-5 times over ~13 years. Illumina HiSeq 2500,Illumina MiSeq 1269
EGAD00001007683 Deep targeted sequencing of 56 genes associated with clonal haematopoiesis and haematological malignancy in peripheral blood-derived DNA from 11 older adults, each previously sampled 2-5 times over the preceding ~13 years. Illumina HiSeq 2500 1
EGAD00001007684 Whole-genome sequencing of 288 single-cell-derived blood colonies from 3 elderly individuals with clonal haematopoiesis. Illumina NovaSeq 6000 1
EGAD00001007714 Mutations in cancer-associated genes drive tumour outgrowth. However, the timing of driver mutations and dynamics of clonal expansion that lead to human cancers are largely unknown. We used 580,133 somatic mutations from whole-genome sequencing of 1013 clonal haematopoietic colonies to reconstruct the phylogeny of haematopoiesis, from embryogenesis to clinical disease, in 12 patients with myeloproliferative neoplasms, which are blood cancers more common in older age. JAK2V617F, the pathognomonic mutation driving the majority of these cancers, was acquired in utero or childhood, with upper estimates of age of acquisition from 33 weeks gestation to 10.8 years, in all 5 patients in whom JAK2V617F was either the only or the first driver event. Driver mutations associated with age-related clonal haematopoiesis occurred prior to or following JAK2V617F, as independent clonal expansions in JAK2V617F-mutated patients, and as large clonal expansions in JAK2V617F-unmutated patients. These mutations were also acquired in utero or childhood, with DNMT3A mutations occurring by 8 weeks of gestation to 7.6 years across 4 patients, and PPM1D mutation occurring by age 5.8 years in a patient with MPN lacking phenotypic driver mutations. Sequential driver mutation acquisition was common, separated by decades across life, and often outcompeted ancestral clones. The mean latency between JAK2V617F acquisition and clinical presentation was 30 years (range 11-54 years). Rates of clonal expansion were inferred from phylogenetic trees and varied substantially (3% to 190% expansion/year), were affected by additional driver mutations, and were predictive of latency to clinical presentation. Driver mutations and rates of expansion would have been detectable in blood one to four decades before clinical presentation. This study reveals how driver mutation acquisition early in life with life-long growth and evolution underlie adult myeloproliferative neoplasms, providing opportunities for early detection and intervention and a new paradigm for cancer development. HiSeq X Ten,Illumina HiSeq 2000 1029
EGAD00001007715 Mutations in cancer-associated genes drive tumour outgrowth. However, the timing of driver mutations and dynamics of clonal expansion that lead to human cancers are largely unknown. We used 580,133 somatic mutations from whole-genome sequencing of 1013 clonal haematopoietic colonies to reconstruct the phylogeny of haematopoiesis, from embryogenesis to clinical disease, in 12 patients with myeloproliferative neoplasms, which are blood cancers more common in older age. JAK2V617F, the pathognomonic mutation driving the majority of these cancers, was acquired in utero or childhood, with upper estimates of age of acquisition from 33 weeks gestation to 10.8 years, in all 5 patients in whom JAK2V617F was either the only or the first driver event. Driver mutations associated with age-related clonal haematopoiesis occurred prior to or following JAK2V617F, as independent clonal expansions in JAK2V617F-mutated patients, and as large clonal expansions in JAK2V617F-unmutated patients. These mutations were also acquired in utero or childhood, with DNMT3A mutations occurring by 8 weeks of gestation to 7.6 years across 4 patients, and PPM1D mutation occurring by age 5.8 years in a patient with MPN lacking phenotypic driver mutations. Sequential driver mutation acquisition was common, separated by decades across life, and often outcompeted ancestral clones. The mean latency between JAK2V617F acquisition and clinical presentation was 30 years (range 11-54 years). Rates of clonal expansion were inferred from phylogenetic trees and varied substantially (3% to 190% expansion/year), were affected by additional driver mutations, and were predictive of latency to clinical presentation. Driver mutations and rates of expansion would have been detectable in blood one to four decades before clinical presentation. This study reveals how driver mutation acquisition early in life with life-long growth and evolution underlie adult myeloproliferative neoplasms, providing opportunities for early detection and intervention and a new paradigm for cancer development. Illumina NovaSeq 6000 57
EGAD00001007851 Age-related loss of function in the human haematopoietic system is well documented, manifesting as reduced regenerative capacity, age-related cytopenias and immune dysfunction. However, the cellular and population level changes that underpin both this functional decline and the increased risk of clonal haematopoiesis and blood cancer in the elderly remain elusive. Here we performed whole genome sequencing on >3350 single haematopoietic stem cell / multipotent progenitors (HSC/MPP) derived colonies across 10 haematologically normal subjects aged 0 to 81. We found that HSC/MPPs accumulated 17 single nucleotide variants per year post birth and had a reduction in telomere length of 50bp per year throughout young adult life. We reconstructed phylogenies of the sampled HSC/MPPs to interrogate changes in clonal dynamics through life. Haematopoiesis in adults aged less than 65 was predominantly polyclonal, with few known driver mutations. In contrast, individuals aged over 75 displayed a profound change in clonal structure, with frequent clonal expansions, many unexplained by known driver mutations. The ratio of non-synonymous to synonymous mutations revealed widespread positive selection, estimating around 1000 driver mutations in the dataset (10-fold more than the number of known drivers). We identified novel genes ZNF318 and HIST2H3D as being under positive selection, despite not being enriched in myeloid malignancies. Our data show that HSC clonal dynamics is more complex than previously thought. One implication is that by old age, the majority of HSCs carry at least one of a number of largely undescribed driver mutations, which may underlie aspects of their functional decline. HiSeq X Ten,Illumina NovaSeq 6000 3601
EGAD00001007958 Cellular DNA damage caused by reactive oxygen species is repaired by the base excision repair (BER) pathway which includes the DNA glycosylase MUTYH. Inherited biallelic MUTYH mutations cause predisposition to colorectal adenomas and carcinoma. However, the mechanistic progression from germline MUTYH mutations to MUTYH-Associated Polyposis (MAP) is incompletely understood. Here, we sequenced normal cell DNAs from 10 individuals with MAP and study the somatic mutation burden and mutational signatures. Illumina NovaSeq 6000 210
EGAD00001007997 Cellular DNA damage caused by reactive oxygen species is repaired by the base excision repair (BER) pathway which includes the DNA glycosylase MUTYH. Inherited biallelic MUTYH mutations cause predisposition to colorectal adenomas and carcinoma. However, the mechanistic progression from germline MUTYH mutations to MUTYH-Associated Polyposis (MAP) is incompletely understood. Here, we sequenced normal cell DNAs from 10 individuals with MAP and study the somatic mutation burden and mutational signatures. Illumina NovaSeq 6000 31
EGAD00001008029 The dataset comprises whole exome sequences from laser capture micro-dissected biopsies of 10 patients diagnosed with clear cell renal cell carcinoma. In total over 100 regions are sampled to allow 'focally exhaustive' sequencing and explore the limits of intra-tumoural heterogeneity. Illumina HiSeq 4000 117
EGAD00001008030 The dataset comprises 5' single cell RNA sequencing with TCR enrichment, using 10x Genomics' Chromium technology, of multiregional biopsies of human renal cell carcinomas. Biopsies from different tumour regions, the tumour-normal interface, normal kidney, normal adrenal, metastatic regions, peri-nephric fat, and peripheral blood were sequenced from 12 patients with kidney tumours. Illumina HiSeq 4000,Illumina NovaSeq 6000 153
EGAD00001008032 The rates and patterns of somatic mutation in normal tissues are largely unknown outside of humans. Comparative analyses can shed light on the diversity of mutagenesis across species and on long-standing hypotheses regarding the evolution of somatic mutation rates and their role in cancer and ageing. Here, we used whole-genome sequencing of 208 intestinal crypts from 56 individuals to study the landscape of somatic mutation across 16 mammalian species. We found somatic mutagenesis to be dominated by seemingly endogenous mutational processes in all species, including 5-methylcytosine deamination and oxidative damage. With some differences, mutational signatures in other species resembled those described in humans, although the relative contribution of each signature varied across species. Remarkably, the somatic mutation rate per year varied greatly across species and exhibited a strong inverse relationship with species lifespan, with no other life-history trait studied displaying a comparable association. Despite widely different life histories among the species surveyed, including ~30-fold variation in lifespan and ~40,000-fold variation in body mass, the somatic mutation burden at the end of lifespan varied only by a factor of ~3. These data unveil common mutational processes across mammals and suggest that somatic mutation rates are evolutionarily constrained and may be a determinant of lifespan. HiSeq X Ten 36
EGAD00001008092 Lynch Syndrome (LS) is an autosomal dominant disease conferring a high risk of colorectal cancer due to germline heterozygous mutations in a DNA mismatch repair (MMR) gene. Although cancers in LS patients show elevated somatic mutation burdens, information on mutation rates in normal tissues and understanding of the trajectory from normal to cancer cell is limited. Here we whole-genome sequenced 152 crypts from normal and neoplastic epithelial tissues from LS patients. In normal tissues the repertoire of mutational processes and mutation rates were similar to those found in wild type individuals. A morphologically normal colonic crypt with an increased mutation burden and mutational signatures consistent with MMR deficiency was identified, which may represent a very early stage of LS pathogenesis. Phylogenetic trees of tumour crypts indicated that the most recent ancestor cell of each tumour was already MMR deficient and had experienced multiple clonal evolution cycles. This study demonstrates the genomic stability of epithelial cells with heterozygous germline MMR gene mutations and highlights important differences in the pathogenesis of LS from other colorectal cancer predisposition syndromes. Illumina NovaSeq 6000 161
EGAD00001008107 A lymphocyte suffers many threats to its genome, including programmed mutation during differentiation, antigen-driven proliferation and residency in diverse microenvironments. After developing protocols for single-cell lymphocyte expansions, we sequenced whole genomes from 717 normal naive and memory B and T lymphocytes and hematopoietic stem cells. All lymphocyte subsets carried more point mutations and structural variants than haematopoietic stem cells – the extra mutations were mostly acquired during differentiation, with burdens higher in memory than naive lymphocytes, although T cells also had a higher rate of mutation accumulation throughout life. Off-target effects of immunological diversification accounted for most of the additional differentiation-associated mutations in lymphocytes. Memory B cells acquired, on average, 18 off-target mutations genome-wide for every one on-target IGV mutation during the germinal centre reaction. Structural variation was 16-fold higher in lymphocytes than stem cells, with ~15% of deletions being attributable to off-target RAG activity. Mutational processes associated with ultraviolet light exposure and other sporadic mutational processes generated hundreds to thousands of mutations in some memory lymphocytes. The mutation burden and signatures of normal B lymphocytes were broadly comparable to those seen in many B-cell cancers, suggesting that malignant transformation of lymphocytes arises from the same mutational processes active across normal ontogeny. The mutational landscape of normal lymphocytes chronicles the off-target effects of programmed genome engineering during immunological diversification and the consequences of differentiation, proliferation and residency in diverse microenvironments. HiSeq X Ten 717
EGAD00001008339 Mutational signatures in esophageal squamous cell carcinoma from eight countries of varying incidence – filtered vcf files 551
EGAD00001008469 SDH deficient renal cell carcinomas are a rare and recently defined subtype of kidney cancer, often associated with an inherited mutation in one of the SDH gene subunits. This dataset sought to understand the genomic events that underpin tumour formation, from putative cell of origin, characterisation of the tumour microenvironment, to the genomic evolution of these rare tumours. We performed whole genome and RNA sequencing of 4 patients with SDH deficient renal cell carcinomas, including one patient who had an additional paraganglioma. An additional patient in this cohort had the initial diagnosis revised to a clear cell renal cell carcinoma. Illumina NovaSeq 6000 0
EGAD00001008470 SDH deficient renal cell carcinomas are a rare and recently defined subtype of kidney cancer, often associated with an inherited mutation in one of the SDH gene subunits. This dataset sought to understand the genomic events that underpin tumour formation, from putative cell of origin, characterisation of the tumour microenvironment, to the genomic evolution of these rare tumours. We performed whole genome and RNA sequencing of 4 patients with SDH deficient renal cell carcinomas, including one patient who had an additional paraganglioma. An additional patient in this cohort had the initial diagnosis revised to a clear cell renal cell carcinoma. Illumina HiSeq 4000 10
EGAD00001008764 The single base substitution mutational signatures SBS2 and SBS13, likely caused by APOBEC cytosine deaminases, are common in many human cancer types. However, the stimulus activating APOBEC mutagenesis is unknown and understanding of when it occurs in the progression from normal to cancer cell is limited. Here, as part of a wider survey of human tissues, we whole genome sequenced 342 microdissected normal epithelial crypts from the small intestines of 39 individuals. SBS2/13 mutations were present in 17% of normal small intestine crypts and were likely due to APOBEC3A activity. Localised clusters of SBS2/13 mutations (kataegis) were also commonly found. APOBEC mutation burdens were variable between individuals and between crypts from the same individual. Crypts with SBS2/13 often had immediate crypt neighbours without SBS2/13, suggesting that the underlying cause of SBS2/13 is cell-intrinsic rather than a widely distributed microenvironmental exposure, or needs to be permitted by cell-intrinsic conditions. APOBEC mutagenesis occurred throughout the human lifespan, including in young children, and was episodic with a small number of episodes occurring during the life history of a single cell. The results indicate that APOBEC mutagenesis is more common in the small intestine epithelium than in many other cell types, and is an episodic process in vivo initiated or permitted by cell intrinsic factors. HiSeq X Ten,Illumina NovaSeq 6000 408
EGAD00001008781 The dataset comprises transcriptomes of tissue sections derived from either the tumour-normal interface or the tumour core of clear cell renal cell carcinomas. 16 sections are sampled in total using 10x Genomics' Visium technology. Illumina NovaSeq 6000 16
EGAD00001009061 Clonal tracking of stem cells and their progeny by whole genome sequencing permits exploration of evolutionary genetics in human disease. In this study, we performed phylogenetic reconstruction of haematopoiesis using somatically acquired mutations in 323 single haematopoietic stem and progenitor cell-derived colonies from 10 individuals with an inherited disorder of ribosome assembly, Shwachman-Diamond syndrome. We observed numerous clonal expansions, with recurrent acquisition of mutually exclusive mutations (EIF6, TP53, RPL5, RPL22, PRPF8, chromosomes 7 and 15) in multiple different clones in utero or early childhood converging on the p53-dependent nucleolar surveillance pathway that monitors ribosome integrity. In contrast to clones carrying biallelic TP53 mutations, genomes derived from colonies carrying mono-allelic TP53 mutations displayed no increase in mutation burden or specific mutational signatures. Our study highlights striking loss of clonal diversity with convergent somatic evolution on the p53-dependent nucleolar surveillance pathway from early life to offset the deleterious effects of a germline mutation in a Mendelian haematopoietic disorder. HiSeq X Ten 323
EGAD00001009641 Mesothelioma is an aggressive cancer associated with previous exposure to asbestos and dismal prognosis. Since a pemetrexed/cisplatin combination was introduced for treatment of mesothelioma, no new first- or second-line therapies have been discovered. Thus, to better understand what drives mesothelioma carcinogenesis and to identify potential targets for therapy, in this project we aim to perform WGS analysis of a panel of mesothelioma cell lines. Illumina NovaSeq 6000 21
EGAD00001009642 Mesothelioma is an aggressive cancer associated with previous exposure to asbestos and dismal prognosis. Since a pemetrexed/cisplatin combination was introduced for treatment of mesothelioma, no new first- or second-line therapies have been discovered. Thus, to better understand what drives mesothelioma carcinogenesis and to identify potential targets for therapy, in this project we aim to perform RNAseq analysis of a panel of mesothelioma cell lines. Illumina HiSeq 4000 21
EGAD00001009666 The incidence of non-melanoma skin cancer is 17-fold lower in Singapore compared to the UK, despite Singapore receiving 2-3 times more year-round ultraviolet radiation (UV). The ageing epidermis of the skin comprises competing somatic mutant clones, from which such cancers develop. We question if differences in keratinocyte skin cancer incidence are reflected in the mutational landscape by comparing ageing facial epidermis from donors of Singapore and the UK. We find UK skin to be a highly competitive, densely mutated landscape with 4-fold greater mutation burden compared to Singaporean skin and differences in clonal selection by country. We disproportionately observe multiple features common to keratinocyte skin cancers in UK skin, such as UV mutagenesis, copy number aberration and hotspot mutations (in particular TP53 R248W). We conclude that keratinocyte skin cancer incidence is reflected in the somatic clones of non-cancerous epidermis. Finally, we re-analyse squamous cell carcinoma exomes from Korea to show that, even in low incidence populations, carcinogenesis is driven by UV damage. Illumina HiSeq 2500 191
EGAD00001009760 Colorectal cancer samples will be submitted for Illumina sequencing using a custom capture of 116 genes implicated in colorectal tumourigenesis. Driver mutations will be detected and ultimately correlated with phenotypic data. Illumina HiSeq 2000,Illumina HiSeq 2500 2229
EGAD00001009812 Cancers of adults typically arise through progressive rounds of clonal diversification and intratumoral selective sweeps which generate a long mutational trunk with shorter subclonal branches. Here, we investigated whether tumors of young children exhibit the same phylogenetic configuration. We studied three infants, including two newborns, with the childhood kidney cancer, Wilms tumour, through whole genome sequencing of bulk tissues, of single cell derived organoids, and of microdissections. All three cancers exhibited unusual driver events, with tumours of newborns harbouring FOXR2 rearrangements, delineating a distinct variant of Wilms tumour. Phylogenetic analyses suggest that tumors were seeded in an early, possibly confined window of development. Unusually, following seeding there was extensive polyclonal diversification with little evidence of clonal sweeps, leading to a distinct phylogenetic configuration more reminiscent of normal tissues rather than of adult cancers. These findings indicate that some childhood cancers may diversify via unorthodox phylogenetic pathways. HiSeq X Ten,Illumina NovaSeq 6000 0
EGAD00001009813 Cancers of adults typically arise through progressive rounds of clonal diversification and intratumoral selective sweeps which generate a long mutational trunk with shorter subclonal branches. Here, we investigated whether tumors of young children exhibit the same phylogenetic configuration. We studied three infants, including two newborns, with the childhood kidney cancer, Wilms tumour, through whole genome sequencing of bulk tissues, of single cell derived organoids, and of microdissections. All three cancers exhibited unusual driver events, with tumours of newborns harbouring FOXR2 rearrangements, delineating a distinct variant of Wilms tumour. Phylogenetic analyses suggest that tumors were seeded in an early, possibly confined window of development. Unusually, following seeding there was extensive polyclonal diversification with little evidence of clonal sweeps, leading to a distinct phylogenetic configuration more reminiscent of normal tissues rather than of adult cancers. These findings indicate that some childhood cancers may diversify via unorthodox phylogenetic pathways. Illumina HiSeq 4000 1
EGAD00001009848 Pathogenic germline variants in the protection of telomeres 1 gene (POT1) have been associated with predisposition to a range of tumor types, including melanoma, glioma, leukemia and cardioangiosarcoma. We sequenced all coding exons of the POT1 gene in 2,929 European-descent melanoma cases and 3,298 controls, identifying 43 protein-changing genetic variants. We performed functional studies on each of these variants and explored their possible contribution to disease risk. Illumina MiSeq 6226
EGAD00001010109 Sequencing of LCM-derived microbiopsies from 10 women who underwent reduction mammoplasty. Goal to assess the mutational burden, spectrum, and clonal dynamics within the tissue. Whole-genome sequencing will be conducted on samples identified as promising from the initial targeted data. Results from this portion of the study will be compared to women who had breast cancer and those who are BRCA1/2 carriers. This dataset contains all the data available for this study on 2023-03-08. HiSeq X Ten 48
EGAD00001010110 Sequencing of LCM-derived microbiopsies from 10 women who underwent reduction mammoplasty. Goal to assess the mutational burden, spectrum, and clonal dynamics within the tissue. Exome data will be used as a driver and clonality screen; highly clonal or driver-containing samples will subsequently be sent for whole-genome sequencing. Results from this portion of the study will be compared to women who are BRCA1/2 germline carriers and those with cancer. This dataset contains all the data available for this study on 2023-03-08. HiSeq X Ten,Illumina HiSeq 4000 92
EGAD00001010111 Sequencing of LCM-derived microbiopsies from explanted lung from a COPD patient. Goal to assess the mutational burden, spectrum, and clonal dynamics within the tissue. Deep sampling throughout multiple regions of the lung will determine whether there are differences in smoking-related mutation burden in different portions of the lung. Whole-genome sequencing will be conducted on samples identified as promising from the initial targeted data. Results from this portion of the study will be compared to other individuals with smoking-related diseases (COPD, pulmonary fibrosis, lung cancer), and normal, non-smoking lungs. This dataset contains all the data available for this study on 2023-03-08. Illumina NovaSeq 6000 25
EGAD00001010112 Sequencing of LCM-derived microbiopsies from 20 women who underwent risk-reducing reduction mastectomies due to germline BRCA1/2. Goal to assess the mutational burden, spectrum, and clonal dynamics within the tissue. Exome data will be used as a driver and clonality screen; highly clonal or driver-containing samples will subsequently be sent for whole-genome sequencing. Results from this portion of the study will be compared to women who had cosmetic breast reduction surgeries and those with cancer. This dataset contains all the data available for this study on 2023-03-08. Illumina NovaSeq 6000 67
EGAD00001010113 Sequencing of LCM-derived microbiopsies from 10 women who underwent reduction mammoplasty. Goal to assess the mutational burden, spectrum, and clonal dynamics within the tissue. Exome data will be used as a driver and clonality screen; highly clonal or driver-containing samples will subsequently be sent for whole-genome sequencing. Results from this portion of the study will be compared to women who are BRCA1/2 germline carriers and those with cancer. This dataset contains all the data available for this study on 2023-03-08. Illumina NovaSeq 6000 48
EGAD00001010114 Sequencing of LCM-derived microbiopsies from 40 women who underwent mastectomies due to breast cancer. LCM and sequencing will be conducted on both normal, unaffected breast, and, where possible, tumour tissue. Goal to assess the mutational burden, spectrum, and clonal dynamics within the tissue, and compare findings between the normal and associated cancer tissues. Whole-genome sequencing will be conducted on samples identified as promising from the initial targeted data. Results from this portion of the study will be compared to women who had cosmetic breast reduction surgeries and those who are BRCA carriers. This dataset contains all the data available for this study on 2023-03-08. HiSeq X Ten,Illumina NovaSeq 6000 251
EGAD00001010115 Sequencing of LCM-derived microbiopsies from 20 women who underwent risk-reducing reduction mastectomies due to germline BRCA1/2. Goal to assess the mutational burden, spectrum, and clonal dynamics within the tissue. Exome data will be used as a driver and clonality screen; highly clonal or driver-containing samples will subsequently be sent for whole-genome sequencing. Results from this portion of the study will be compared to women who had cosmetic breast reduction surgeries and those with cancer. This dataset contains all the data available for this study on 2023-03-08. Illumina NovaSeq 6000 315
EGAD00001010116 Sequencing of LCM-derived microbiopsies from 10 women who underwent reduction mammoplasty. Goal to assess the mutational burden, spectrum, and clonal dynamics within the tissue. Exome data will be used as a driver and clonality screen; highly clonal or driver-containing samples will subsequently be sent for whole-genome sequencing. Results from this portion of the study will be compared to women who are BRCA1/2 germline carriers and those with cancer. This dataset contains all the data available for this study on 2023-03-08. Illumina NovaSeq 6000 199
EGAD00001010117 Sequencing of LCM-derived microbiopsies from 40 women who underwent mastectomies due to breast cancer. LCM and sequencing will be conducted on both normal, unaffected breast, and, where possible, tumour tissue. Goal to assess the mutational burden, spectrum, and clonal dynamics within the tissue, and compare findings between the normal and associated cancer tissues. Whole-genome sequencing will be conducted on samples identified as promising from the initial targeted data. Results from this portion of the study will be compared to women who had cosmetic breast reduction surgeries and those who are BRCA carriers. This dataset contains all the data available for this study on 2023-03-08. Illumina HiSeq 4000,Illumina NovaSeq 6000 480
EGAD00001010122 In this study we aim to characterise the landscape of mutation and clonal selection in normal lung and premalignant lung disease. The study combines targeted sequencing and whole-genome sequencing of microbiopsies of lung and bronchial epithelium. The range of patients studied will include healthy individuals, both smokers and non-smokers, and patients with premalignant lung disease. This dataset contains all the data available for this study on 2023-03-09. Illumina HiSeq 2500,Illumina HiSeq 4000,Illumina MiSeq 1
EGAD00001010123 Cancer is a genetic disease caused by an accumulation of mutations; however, many of these mutations have been identified in pathologically normal tissue. We aim to use laser-capture microscopy (LCM) to sample individual clones from breast tissue to identify whether cancer-associated mutations appear in this normal tissue, assess the mutational burden present, and identify the mutational processes causing these mutations. We will sample from a wide age range of individuals (<20 to >70 years old) to determine whether these processes differ in pre- and post-menopausal women. We will also be comparing the tissue from healthy individuals (samples from breast reduction surgery) to those at elevated risk of breast cancer (mastectomy from BRCA1/2 patients) and those who have breast cancer (adjacent normal, distal normal, and tumour tissue from mastectomy). This will allow us to determine how these processes are different between these groups of individuals, and gain insight into the earliest stages of tumour development. This dataset contains all the data available for this study on 2023-03-09. HiSeq X Ten 1
EGAD00001010124 Whole genome sequencing to identify subclonal variants for subsequent mapping back to fixed tissue specimens. This dataset contains all the data available for this study on 2023-03-09. HiSeq X Ten 1
EGAD00001010125 This project is correlating the molecular profiling of renal tumours with multiparametric and 13C-MRI including by 13C-MRSI. This dataset contains all the data available for this study on 2023-03-09. Illumina HiSeq 4000 1
EGAD00001010871 Genomic and epigenomic sequencing of 5 oesophageal adenocarcinomas with evidence of chromothripsis. Genomic sequencing includes: PacBio circular consensus sequencing, PacBio continuous long read sequencing, 10X linked read and Illumina HiSeq X Ten sequencing. Epigenomic sequencing includes: Hi-C chromosome capture, ATAC-seq, ChIP-seq (for H3K27ac, H3K4me3, H3K27me3 and CTCF) and long read RNA sequencing. All data types have the bam files which have not undergone haplotype resolution (demarcated as unresolved) and some data types also have haplotype resolved reads (demarcated as resolved). 0
EGAD00010000395 Myeloma case sample genotype using Affymetrix SNP6.0 Affymetrix_SNP6 19
EGAD00010000452 Chondrosarcoma case sample genotype using Affymetrix SNP6.0 Affymetrix_SNP6 36
EGAD00010000488 Chondroblastoma case sample genotype using Affymetrix SNP6.0 Affymetrix_SNP6 7
EGAD00010000644 Affymetrix SNP6.0 cancer cell line exome sequencing data 1022
EGAD00010001629 Methylation of anaplastic meningioma samples Illumina Infinium HumanMethylationEPIC BeadChip array 26
EGAD00010001911 Fresh frozen breast cancer H&E tissue images collected and annotated by the International Cancer Genome Consortium (ICGC), that included the BASIS collaboration. Associated with whole genome sequence data as originally described by Nik-Zainal et al, Nature, 2016 (DOI: 10.1038/nature17676) and deposited with ID EGAS00001001178 H and E image 151