Recent progress in the analysis of cell-free DNA fragments (cell-free circulating tumor DNA, ctDNA) now allows monitoring of tumor genomes by non-invasive means. However, previous studies with plasma DNA from patients with cancer demonstrated highly variable allele frequencies of ctDNA. The comprehensive analysis of tumor genomes is greatly facilitated when plasma DNA has increased amounts of ctDNA. Therefore, a fast and cost-effective pre-screening method to identify such plasma samples without previous knowledge about alterations in the respective tumor genome could assist in the selection of samples suitable for further extensive qualitative analysis. To address this, we adapted the recently described FAST-SeqS method, which was originally established as a simple and effective, non-invasive screening method for fetal aneuploidy from maternal blood. We show that our modified FAST-SeqS method (mFAST-SeqS) can be used as a pre-screening tool for an estimation of the ctDNA percentage. Using a combined evaluation of genome-wide and chromosome-arm specific z-scores from dilution series with cell line DNA and by comparisons of plasma-Seq profiles with data from mFAST-SeqS, we established a detection limit of 10% or more of mutant alleles. Plasma samples with an mFAST-SeqS z-score above 5 showed highly concordant results compared to copy number profiles obtained from our previously described plasma-Seq approach.
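As a rough illustration of the chromosome-arm z-scores mentioned above, the following Python sketch normalizes per-arm read counts to fractions and compares a sample against a set of control samples. The function names, the input layout (a dictionary of read counts per arm) and the genome-wide summary (sum of squared arm z-scores) are assumptions made for this example; the description above does not specify the exact formulas.

```python
from statistics import mean, stdev

def arm_fractions(counts):
    """Convert raw read counts per chromosome arm into fractions of all reads."""
    total = sum(counts.values())
    return {arm: n / total for arm, n in counts.items()}

def arm_z_scores(sample_counts, control_counts_list):
    """Z-score of each arm's read fraction relative to the control samples."""
    sample = arm_fractions(sample_counts)
    controls = [arm_fractions(c) for c in control_counts_list]
    z = {}
    for arm, frac in sample.items():
        ref = [c[arm] for c in controls]
        z[arm] = (frac - mean(ref)) / stdev(ref)
    return z

def genome_wide_score(z):
    """Assumed genome-wide summary: sum of squared arm z-scores."""
    return sum(v * v for v in z.values())

# Hypothetical read counts for three arms in three controls and one sample.
controls = [
    {"1p": 1000, "1q": 1000, "8q": 1000},
    {"1p": 1010, "1q": 990, "8q": 1000},
    {"1p": 990, "1q": 1010, "8q": 1005},
]
sample = {"1p": 1000, "1q": 1000, "8q": 1500}
z = arm_z_scores(sample, controls)
```

In this toy input the over-represented arm 8q receives a large positive z-score, while the other arms are pushed below the control mean because the fractions sum to one.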
DNA from 16 tumour/normal samples was treated with bisulfite, and up to 5 different bisulfite PCRs were then performed on each sample. Amplicons from the same sample were pooled and sequenced on a MiSeq platform.
Profiling subclonal architecture and phylogeny in tumors by whole-genome sequence data mining and single-cell genome sequencing
Mitochondrial genomes are separated from the nuclear genome for most of the cell cycle by the nuclear double membrane, intervening cytoplasm and the mitochondrial double membrane. Despite these physical barriers, we show that somatically acquired mitochondrial-nuclear genome fusion sequences are present in cancer cells. Most occur in conjunction with intranuclear genomic rearrangements, and the features of the fusion fragments indicate that non-homologous end joining and/or replication-dependent DNA double-strand break repair are the dominant mechanisms involved. This study includes 12 pairs of whole-genome sequences (tumour and paired normal) that present somatic mitochondrial DNA integrations in tumour genomes. Reference: Young Seok Ju et al., Frequent somatic transfer of mitochondrial DNA into the nuclear genome of human cancer cells, Genome Research (2015).
Reactivation of telomerase reverse transcriptase (TERT) expression enables cells to overcome replicative senescence and escape apoptosis, fundamental steps in the initiation of human cancer. Multiple cancer types, including up to 83% of glioblastomas (GBM), harbor highly recurrent TERT promoter mutations of unknown function but specific to two nucleotide positions. We identify the functional consequence of these mutations in GBM to be recruitment of the multimeric GABP transcription factor specifically to the mutant promoter. Allelic recruitment of GABP is consistently observed across four cancer types, highlighting a shared mechanism underlying TERT reactivation. Tandem flanking native ETS motifs critically cooperate with these mutations to activate TERT, likely by facilitating GABP heterotetramer binding. GABP thus directly links TERT promoter mutations to aberrant expression in multiple cancers.
Targeted gene screen of cell line tumours for testing the new V4 Colorectal gene panel.
The 2014 AML analysis was conducted with samples collected from Chunnam University. Sixty-seven paired samples belong to this project. The study was designed to examine the molecular abnormalities of leukemic patients at initial diagnosis in comparison with the corresponding germline control (saliva samples). The WXS results were analyzed with MuTect to rank cancer variants and create a mutational matrix.
The genetic mechanisms underlying the poor prognosis of esophageal squamous cell carcinoma (ESCC) are not well understood. In this study, we comprehensively characterized somatic mutations, somatic copy number alterations (SCNAs) and structural variants (SVs) in ESCC by sequencing 10 whole-genome and 57 whole-exome matched tumor-normal pairs. We identified multiple somatic mutations seen previously in known cancer pathways and identified candidate genes for ESCC, including VANGL1 and MIR4707. A survival analysis based on the expression profiles of 321 ESCC individuals indicated that the somatically altered genes we found were significantly associated with poorer survival in ESCC. Subsequently, we performed functional studies to validate the roles of the altered genes in tumor proliferation and metastasis.
The use of reference DNA standards generated from cancer cell lines sequenced in the Cancer Genome Project to establish the sensitivity, specificity, accuracy and reproducibility of the WTSI GCLP sequencing pipeline
Neuroblastoma, a clinically heterogeneous pediatric cancer, is characterized by distinct genomic profiles but few recurrent mutations. As neuroblastoma is expected to have a high degree of genetic heterogeneity, studying its clonal evolution by deep-coverage whole-genome sequencing (WGS) of diagnosis and relapse samples will lead to a better understanding of the molecular events associated with relapse. Samples were included in this study if sufficient DNA from constitutional, diagnosis and relapse samples was available for WGS. WGS was performed on trios (constitutional, diagnosis and relapse DNA) from eight patients using the Illumina HiSeq 2500, yielding paired-end (PE) reads of 90x90 for six patients and 100x100 for two. Expected coverage for sample NB0175 (PE 100x100 bp) was 30X for tumor and constitutional samples. For the seven other patients, expected coverage was 80X for tumor samples with PE 100x100, 100X for the other tumor samples and 50X for all constitutional samples (see table 1). Following alignment with BWA (Li et al., Oxford J, 2009 Jul) allowing up to 4% mismatches, BAM files were cleaned up according to the Genome Analysis Toolkit (GATK) recommendations (Van der Auwera et al., Current Protocols in Bioinformatics, 2013; picard-1.45, GenomeAnalysisTK-2.2-16). Variant calling was performed in parallel using three variant callers: GenomeAnalysisTK-2.2-16, Samtools-0.1.18 and MuTect-1.1.4 (McKenna et al., Genome Res, 2010; Li et al., Oxford J, 2009 Aug; Cibulskis et al., Nature, 2013). Annovar-v2012-10-23 with cosmic-v64 and dbsnp-v137 was used for the annotation and RefSeq for the structural annotation. For GATK and Samtools, single nucleotide variants (SNVs) with a quality under 30, a depth of coverage under 6 or fewer than 2 reads supporting the variant were filtered out. MuTect was run with parameters matching the GATK and Samtools thresholds to filter out irrelevant variants. 
SNVs within and around exons of coding genes, including those overlapping splice sites, were retained. Then, variants reported in more than 1% of the population in the 1000 Genomes Project (1000gAprl_2012) or the Exome Sequencing Project (ESP6500) were discarded in order to filter out polymorphisms. Finally, synonymous variants were filtered out. MuTect focuses on somatic variants by filtering against the constitutional sample. An mpileup comparison between constitutional and tumor DNA allowed us to focus on tumor-specific SNVs with GATK and Samtools as well. Finally, every SNV called by our pipeline that was also supported in any constitutional sample was filtered out, to guard against insufficient coverage of the matched constitutional DNA. We then analyzed copy number variants (CNVs) with HMMcopy-v0.1.1 (Ha et al., Genome Res, 2012) and Control-FREEC-v6.7 (Boeva et al., Bioinformatics 2011), with window sizes of 2000 bp and 1000 bp respectively, and with auto-correction of normal contamination of tumor samples for Control-FREEC. Finally, we explored structural variants (SVs), including deletions, inversions, tandem duplications and translocations, using DELLY-v0.5.5 with standard parameters (Rausch et al., Oxford J, 2012). In tumors, at least 10 supporting reads were required to make a call, and 5 supporting reads for sample NB0175, which had a coverage of only 40X (see table 2). To predict SVs in constitutional samples for subsequent somatic filtering, only 2 supporting reads were required so as not to miss any. To identify somatic events, all SVs in each normal sample were first flanked by 500 bp in both directions, and any SV called in a tumor sample that fell within the combined flanked regions of the respective normal sample was removed (see graph 1). Deletions affecting more than 5 genes or larger than 1 Mb, and inversions or tandem duplications covering more than 4 genes, were removed. We focused on exonic and splicing events for deletions, inversions and tandem duplications. 
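The somatic SV filter described above (flanking normal SVs by 500 bp, combining the flanked regions, and removing tumor SVs that fall inside them) could be sketched in Python as follows. This is a simplified illustration, not the pipeline's actual code. It assumes SVs are given as hypothetical (chromosome, start, end) tuples, ignores SV type and two-chromosome translocations, and interprets "in the combined flanked regions" as the tumor SV lying entirely inside a merged flanked region.

```python
def merge_flanked(normal_svs, flank=500):
    """Flank each normal SV by `flank` bp and merge overlaps per chromosome."""
    by_chrom = {}
    for chrom, start, end in normal_svs:
        by_chrom.setdefault(chrom, []).append((max(0, start - flank), end + flank))
    merged = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        out = [intervals[0]]
        for start, end in intervals[1:]:
            if start <= out[-1][1]:
                # Overlaps the previous region: extend it.
                out[-1] = (out[-1][0], max(out[-1][1], end))
            else:
                out.append((start, end))
        merged[chrom] = out
    return merged

def somatic_svs(tumor_svs, normal_svs, flank=500):
    """Keep tumor SVs that are not contained in any flanked normal region."""
    regions = merge_flanked(normal_svs, flank)
    kept = []
    for chrom, start, end in tumor_svs:
        contained = any(r_start <= start and end <= r_end
                        for r_start, r_end in regions.get(chrom, []))
        if not contained:
            kept.append((chrom, start, end))
    return kept

# Hypothetical calls for one tumor/normal pair.
normal = [("chr1", 1000, 2000), ("chr1", 2400, 3000)]
tumor = [("chr1", 1200, 3200), ("chr1", 5000, 6000), ("chr2", 1200, 1800)]
result = somatic_svs(tumor, normal)
```

Here the two normal SVs on chr1 merge into a single flanked region, so the first tumor SV is removed as likely germline while the other two are kept.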
For translocations, we kept all SVs that occurred in intronic, exonic, 5'UTR, upstream or splicing regions. Bioinformatic detection of variants with the deep sequencing approach: once PE reads had been merged and adaptors trimmed by SeqPrep with default parameters, merged reads were aligned with BWA (Li H. and Durbin R. 2009 PMID 19451168), allowing up to 1 difference in the 22-base-long seeds and reporting only unique alignments. Only reads with a mapping quality of 20 or more were analysed further. Variant calling software was not used, since we aimed to detect variants at low frequencies, observed in less than 1% of reads, and such variants require a custom approach. Using the DepthOfCoverage function of the Genome Analysis Toolkit (GATK) v2.13.2 (McKenna A, et al., 2010 Genome Research PMID: 20644199), we measured the high-quality coverage of the bases A, C, G and T at the targeted variant position. To restrict the analysis to high-quality data, only bases with a mapping quality higher than 20 and a base quality higher than 10 were counted. To determine the background level of variability in the studied regions, 10 control samples were included in the analysis, and the same approach and filtering criteria were applied over the entire amplicons. To highlight variants, the frequency of each base at each amplicon position in each sample was then compared to that observed in the set of controls. Statistical analyses were performed with the R statistical software (http://www.R-project.org). Two-sided Fisher’s exact tests with a Bonferroni correction were performed to compare the percentages of bases between the data sets, i.e. for a given base between a case and the controls. 
Finally, variants were retained when both (i) a significant increase in the percentage of a variant base and (ii) a significant decrease in the percentage of its reference base were observed according to our p-value criterion (p < 0.05).
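The decision rule above can be illustrated with a small Python sketch. The original analysis was performed in R; this version is only an approximation of the idea. It implements a two-sided Fisher's exact test using the standard library, applies a Bonferroni correction by multiplying each p-value by an assumed number of tests, and keeps a position only if the variant base is significantly increased and the reference base significantly decreased relative to the pooled controls. All function names, arguments and read counts are hypothetical.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)

def is_low_frequency_variant(case_var, case_ref, case_depth,
                             ctrl_var, ctrl_ref, ctrl_depth,
                             n_tests, alpha=0.05):
    """Apply criteria (i) and (ii) with Bonferroni-corrected p-values."""
    p_var = min(1.0, n_tests * fisher_exact_two_sided(
        case_var, case_depth - case_var, ctrl_var, ctrl_depth - ctrl_var))
    p_ref = min(1.0, n_tests * fisher_exact_two_sided(
        case_ref, case_depth - case_ref, ctrl_ref, ctrl_depth - ctrl_ref))
    var_increased = case_var / case_depth > ctrl_var / ctrl_depth
    ref_decreased = case_ref / case_depth < ctrl_ref / ctrl_depth
    return var_increased and p_var < alpha and ref_decreased and p_ref < alpha

# Hypothetical counts: 0.5% variant reads in the case, 0.005% in pooled controls.
called = is_low_frequency_variant(50, 9950, 10000, 5, 99990, 100000, n_tests=400)
```

Because the test is two-sided, the direction of the change is checked separately by comparing the base fractions in the case and in the controls.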