HPV-associated Cancer Consortium Data Access Committee
Exome sequencing for individualized cancer interpretation
Pancreatic cancer is an aggressive malignancy with a five-year mortality of 97–98%, usually due to widespread metastatic disease. Previous studies indicate that this disease has a complex genomic landscape, with frequent copy number changes and point mutations, but genomic rearrangements have not been characterized in detail. Despite the clinical importance of metastasis, there remain fundamental questions about the clonal structures of metastatic tumours, including phylogenetic relationships among metastases, the scale of ongoing parallel evolution in metastatic and primary sites, and how the tumour disseminates. Here we harness advances in DNA sequencing to annotate genomic rearrangements in 13 patients with pancreatic cancer and explore clonal relationships among metastases. We find that pancreatic cancer acquires rearrangements indicative of telomere dysfunction and abnormal cell-cycle control, namely dysregulated G1-to-S-phase transition with intact G2–M checkpoint. These initiate amplification of cancer genes and occur predominantly in early cancer development rather than the later stages of the disease. Genomic instability frequently persists after cancer dissemination, resulting in ongoing, parallel and even convergent evolution among different metastases. We find evidence that there is genetic heterogeneity among metastasis-initiating cells, that seeding metastasis may require driver mutations beyond those required for primary tumours, and that phylogenetic trees across metastases show organ-specific branches. These data attest to the richness of genetic variation in cancer, brought about by the tandem forces of genomic instability and evolutionary selection.
We previously described an approach called RealSeqS to evaluate aneuploidy in plasma cell-free DNA (cfDNA) through the amplification of ~350,000 repeated elements with a single primer. We hypothesized that an unbiased evaluation of the large amount of sequencing data obtained with RealSeqS might reveal other differences between plasma samples from patients with and without cancer. This hypothesis was tested through the development of a novel machine-learning approach called Alu Profile Learning Using Sequencing (A-PLUS) and its application to samples from 5108 individuals, 2037 with cancer and the remainder without cancer. Samples from cancer patients and controls were pre-specified into four cohorts used for: 1) model training, 2) analyte integration and threshold determination, 3) validation, and 4) reproducibility. A-PLUS alone provided a sensitivity of 40.5% across 11 different cancer types in the Validation Cohort, at a specificity of 98.5%. Combining A-PLUS with aneuploidy and 8 common protein biomarkers detected 51% of 1167 cancers at 98.9% specificity. We found that part of the power of A-PLUS could be ascribed to a single feature – the global reduction of AluS sub-family elements in the circulating DNA of cancer patients. We confirmed this reduction through the analysis of another independent dataset obtained with a very different approach (whole genome sequencing). The evaluation of Alu elements therefore has the potential to enhance the performance of several methods designed for the earlier detection of cancer.
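The reporting pattern in this abstract, where a score threshold is fixed on one cohort and sensitivity is then measured on a separate validation cohort at the resulting specificity, can be illustrated with a minimal sketch. This is not the A-PLUS implementation. The function names, the convention that a higher score means "more cancer-like", and the toy scores are all assumptions made for illustration.

```python
import math
from typing import Sequence


def threshold_at_specificity(control_scores: Sequence[float],
                             target_specificity: float) -> float:
    """Pick the smallest observed control score t such that calling
    `score > t` positive keeps specificity >= target on these controls.

    Hypothetical helper: assumes higher scores are more cancer-like.
    """
    if not control_scores:
        raise ValueError("need at least one control score")
    if not 0.0 <= target_specificity <= 1.0:
        raise ValueError("target_specificity must be in [0, 1]")
    ordered = sorted(control_scores)
    # Number of controls that must fall at or below the threshold.
    # The small tolerance guards against floating-point overshoot.
    needed = math.ceil(target_specificity * len(ordered) - 1e-9)
    return ordered[max(needed, 1) - 1]


def specificity(control_scores: Sequence[float], threshold: float) -> float:
    """Fraction of controls correctly called negative (score <= threshold)."""
    return sum(s <= threshold for s in control_scores) / len(control_scores)


def sensitivity(case_scores: Sequence[float], threshold: float) -> float:
    """Fraction of cancer cases called positive (score > threshold)."""
    return sum(s > threshold for s in case_scores) / len(case_scores)


if __name__ == "__main__":
    # Toy, made-up scores; not data from the study.
    threshold_cohort_controls = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
    validation_cases = [0.95, 0.5, 1.2, 0.91]

    t = threshold_at_specificity(threshold_cohort_controls, 0.9)
    print("threshold:", t)
    print("specificity on threshold cohort:",
          specificity(threshold_cohort_controls, t))
    print("sensitivity on validation cases:", sensitivity(validation_cases, t))
```

Fixing the threshold on one cohort and evaluating on another, as the abstract describes, avoids tuning the cut-off on the same samples used to report performance.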
WTCCC genome-wide case-control association study of breast cancer (BC), using the 1958 British Birth Cohort collection as controls.