A comprehensive assessment of somatic mutation detection in cancer using whole genome sequencing

As whole genome sequencing for cancer genome analysis becomes a clinical tool, a full understanding of the variables affecting sequencing analysis output is required. Using tumor-normal sample pairs from two different types of cancer, chronic lymphocytic leukemia and medulloblastoma, we conducted a benchmarking exercise within the context of the International Cancer Genome Consortium. We compare sequencing methods, analysis pipelines, and even validation methods. Here we show that using PCR-free methods and increasing sequencing depth to ~100x are both beneficial, as long as the tumor:control coverage ratio remains balanced. We observe widely varying mutation call rates and low concordance among analysis pipelines, reflecting the artifact-prone nature of the raw data and the lack of standards for dealing with the artifacts. However, armed with the benchmark mutation set we have created, we show that many issues are in fact easy to remedy and that fixing them has an immediate positive impact on mutation detection accuracy.

Click on a Dataset ID in the table below to learn more, and to find out who to contact about access to these data

Dataset ID       Description  Technology           Samples
EGAD00001001858  -            Illumina HiSeq 2500  2
EGAD00001001859  -            Illumina HiSeq 2500  2
Publications (with citation counts)

A comprehensive assessment of somatic mutation detection in cancer using whole-genome sequencing. Nat Commun 6:10001 (2015). Citations: 174
Genome-wide somatic variant calling using localized colored de Bruijn graphs. Commun Biol 1:20 (2018). Citations: 70
GeDi: applying suffix arrays to increase the repertoire of detectable SNVs in tumour genomes. BMC Bioinformatics 21:45 (2020). Citations: 0
Sarek: A portable workflow for whole-genome sequencing analysis of germline and somatic variants. F1000Res 9:63 (2020). Citations: 90
SomaticCombiner: improving the performance of somatic variant calling based on evaluation tests and a consensus approach. Sci Rep 10:12898 (2020). Citations: 19
FiNGS: high quality somatic mutations using filters for next generation sequencing. BMC Bioinformatics 22:77 (2021). Citations: 6
Accurate somatic variant detection using weakly supervised deep learning. Nat Commun 13:4248 (2022). Citations: 6