Targeted sequencing of genes recurrently mutated in AML

Background

Massively parallel sequencing technology has transformed cancer genomics. It is now feasible, within a clinically relevant time-frame and at a clinically manageable cost, to screen DNA from patient tumours for mutations essentially genome-wide. The challenge for personalised medicine will be to increase the sample size to thousands or tens of thousands of well-characterised cases in order to attain sufficient statistical power to stratify patients accurately across the complexity and genomic heterogeneity expected for most of the common tumour types. Currently, whole genome sequencing on this scale is not feasible, and targeted sequencing of relevant portions of the genome will be required.

Pilot data

We have developed protocols for large-scale, multiplexed sequencing of 100-200 genes in thousands of samples. Using robotic technology, genomic DNA from the cancer specimen is processed into sequencing libraries with unique DNA barcodes, allowing each sequencing read to be attributed to the sample it derives from. Currently, these sequencing libraries can be generated in a 96-well format using fully automated protocols, and we are exploring methods to expand this to a 384-well format. The sequencing libraries are pooled and hybridized to custom sets of RNA baits representing the genomic regions of interest. The pulled-down libraries are sequenced in pools of 48-96 samples per lane of an Illumina HiSeq. This protocol is already implemented at the Sanger Institute. We have published proof that somatic mutations in novel cancer genes can be identified from exome-wide sequencing. In unpublished pilot data, we have established the feasibility of robotic library production, custom pull-down, and multiplexed sequencing of barcoded libraries for 100 known myeloid cancer genes across 760 myelodysplasia samples.
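
The barcode step described above, attributing each read to the sample it derives from, can be illustrated with a minimal sketch. It is not the pipeline used for this dataset: the barcode sequences, their length, the sample names and the one-mismatch tolerance are all assumptions made for illustration, and the function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical barcode-to-sample map; the real barcodes are not given in the text.
BARCODES = {"ACGTACGT": "sample_01", "TGCATGCA": "sample_02"}


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def assign_sample(index_read, barcodes=BARCODES, max_mismatches=1):
    """Return the sample if exactly one barcode lies within max_mismatches, else None."""
    hits = [sample for barcode, sample in barcodes.items()
            if len(barcode) == len(index_read)
            and hamming(barcode, index_read) <= max_mismatches]
    return hits[0] if len(hits) == 1 else None


def demultiplex(reads, barcodes=BARCODES, max_mismatches=1):
    """Group (index_read, sequence) pairs by sample; unassigned reads go under None."""
    by_sample = defaultdict(list)
    for index_read, sequence in reads:
        sample = assign_sample(index_read, barcodes, max_mismatches)
        by_sample[sample].append(sequence)
    return dict(by_sample)
```

Reads whose index matches no barcode, or matches more than one within the tolerance, are kept under `None` rather than guessed, so they can be counted and inspected separately.
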
Highlights of the data analysed thus far are that coverage is remarkably even between samples; when 96 samples are run per lane, average coverage is ~250x, with 90-95% of targeted exons covered by >25 reads; known mutations can be discovered in the data set; and the protocol is amenable to whole genome amplified DNA. The bioinformatic algorithms for identification of substitutions and indels in pull-down data are well-established; our pilot data show that copy number changes, LOH and genomic rearrangements in specific regions of interest can also be identified by tiling baits across the relevant loci.

Proposal

We propose to apply this methodology to 10,000 samples from patients with AML enrolled in clinical trials over the last 10-20 years. Oncogenic point mutations, and potentially genomic rearrangements, will be identified and linked to clinical outcome data, with a view to undertaking the following sorts of analyses:

- Identification of co-occurrence, mutual exclusivity and clusters of driver mutations
- Correlation of prognosis with driver mutations and potentially gene-gene interactions
- Exploration of genomic markers of drug response

Ultimately, we would like to be in a position to release the mutation data together with matched clinical outcome data to genuine medical researchers via a controlled access approach, possibly within the COSMIC framework. The vision here is to generate a portal whereby a clinician faced with an AML patient and his or her mutational profile can obtain a "personalised" prediction of outcome, together with a fair assessment of the uncertainty of the estimate. With a sufficient sample size, there would also be the potential to develop decision support algorithms for therapeutic choices based on such data.
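
As a rough illustration of the first proposed analysis, pairwise co-occurrence and mutual exclusivity of two driver mutations can be tested on a 2x2 table of sample counts. The description does not say which statistical method will be used; the sketch below assumes a one-sided Fisher's exact test, a common choice, and its function names and inputs are hypothetical.

```python
from math import comb


def contingency(samples_a, samples_b, all_samples):
    """2x2 counts (both, a_only, b_only, neither) for two sets of mutated samples."""
    a, b, everyone = set(samples_a), set(samples_b), set(all_samples)
    both = len(a & b)
    a_only = len(a - b)
    b_only = len(b - a)
    neither = len(everyone - a - b)
    return both, a_only, b_only, neither


def fisher_one_sided(both, a_only, b_only, neither):
    """Hypergeometric tail probabilities for the 'both mutated' cell.

    Returns (p_cooccurrence, p_exclusivity): the probability of seeing at
    least, or at most, `both` double-mutant samples given the table margins.
    """
    n = both + a_only + b_only + neither
    row = both + a_only  # samples mutated in gene A
    col = both + b_only  # samples mutated in gene B
    total = comb(n, col)

    def pmf(k):
        return comb(row, k) * comb(n - row, col - k) / total

    lowest, highest = max(0, row + col - n), min(row, col)
    p_co = sum(pmf(k) for k in range(both, highest + 1))
    p_ex = sum(pmf(k) for k in range(lowest, both + 1))
    return p_co, p_ex
```

A small `p_exclusivity` suggests the two mutations are found together less often than chance would predict, and a small `p_cooccurrence` the opposite. With 100-200 genes there are thousands of gene pairs, so any real analysis would also need a multiple-testing correction.
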

Studies are experimental investigations of a particular phenomenon, e.g. case-control studies on a particular trait, or cancer research projects reporting matched cancer and normal genomes from patients.

Study ID         Study Title  Study Type
EGAS00001000408               Cancer Genomics

This table displays only public information pertaining to the files in the dataset. If you wish to access this dataset, please submit a request. If you already have access to these data files, please consult the download documentation.

ID File Type Size Located in
EGAF00000224486 bam 96.6 MB
EGAF00000224487 bam 94.9 MB
EGAF00000224488 bam 90.0 MB
EGAF00000224489 bam 98.6 MB
EGAF00000224490 bam 82.4 MB
EGAF00000224491 bam 94.6 MB
EGAF00000224492 bam 90.9 MB
EGAF00000224493 bam 97.8 MB
EGAF00000224494 bam 95.6 MB
EGAF00000224495 bam 83.7 MB
EGAF00000224496 bam 85.0 MB
EGAF00000224497 bam 92.5 MB
EGAF00000224498 bam 96.4 MB
EGAF00000224499 bam 88.8 MB
EGAF00000224500 bam 86.3 MB
EGAF00000224501 bam 94.2 MB
EGAF00000224502 bam 226.9 MB
EGAF00000224503 bam 248.3 MB
EGAF00000224504 bam 258.4 MB
EGAF00000224505 bam 199.1 MB
EGAF00000224506 bam 209.2 MB
EGAF00000224507 bam 211.0 MB
EGAF00000224508 bam 77.9 MB
EGAF00000224509 bam 84.9 MB
EGAF00000224510 bam 69.9 MB
EGAF00000224511 bam 73.7 MB
EGAF00000224512 bam 79.8 MB
EGAF00000224513 bam 83.5 MB
EGAF00000224514 bam 72.3 MB
EGAF00000224515 bam 72.5 MB
EGAF00000224516 bam 87.8 MB
EGAF00000224517 bam 86.1 MB
EGAF00000224518 bam 89.8 MB
EGAF00000224519 bam 86.4 MB
EGAF00000224520 bam 79.3 MB
EGAF00000224521 bam 94.5 MB
EGAF00000224522 bam 62.7 MB
EGAF00000224523 bam 73.1 MB
EGAF00000224524 bam 95.6 MB
EGAF00000224525 bam 94.3 MB
EGAF00000224526 bam 90.5 MB
EGAF00000224527 bam 97.1 MB
EGAF00000224528 bam 81.6 MB
EGAF00000224529 bam 93.7 MB
EGAF00000224530 bam 90.7 MB
EGAF00000224531 bam 96.6 MB
EGAF00000224532 bam 94.6 MB
EGAF00000224533 bam 83.6 MB
EGAF00000224534 bam 84.7 MB
EGAF00000224535 bam 91.4 MB
EGAF00000224536 bam 95.1 MB
EGAF00000224537 bam 88.7 MB
EGAF00000224538 bam 85.6 MB
EGAF00000224539 bam 93.4 MB
EGAF00000224540 bam 224.7 MB
EGAF00000224541 bam 246.3 MB
EGAF00000224542 bam 256.6 MB
EGAF00000224543 bam 198.3 MB
EGAF00000224544 bam 209.1 MB
EGAF00000224545 bam 208.6 MB
EGAF00000224546 bam 76.9 MB
EGAF00000224547 bam 83.1 MB
EGAF00000224548 bam 68.8 MB
EGAF00000224549 bam 72.5 MB
EGAF00000224550 bam 78.6 MB
EGAF00000224551 bam 82.6 MB
EGAF00000224552 bam 71.8 MB
EGAF00000224553 bam 71.9 MB
EGAF00000224554 bam 86.6 MB
EGAF00000224555 bam 85.5 MB
EGAF00000224556 bam 88.6 MB
EGAF00000224557 bam 85.7 MB
EGAF00000224558 bam 77.6 MB
EGAF00000224559 bam 93.5 MB
EGAF00000224560 bam 61.6 MB
EGAF00000224561 bam 72.3 MB
76 Files (8.2 GB)