Dataset

A practical guide for mutational signature analysis in hematological malignancies

Dataset ID       Technology   Samples
EGAD00001005028  HiSeq X Ten  5

Dataset Description

Analysis of mutational signatures is becoming routine in cancer genomics, with implications for pathogenesis, classification, prognosis, and even treatment decisions. However, the field lacks a consensus on how to perform the analysis and interpret its results. Using whole-genome sequencing of multiple myeloma (MM), chronic lymphocytic leukemia (CLL), and acute myeloid leukemia, we compare the performance of public signature analysis tools. We describe caveats and pitfalls of de novo signature extraction and fitting approaches, reporting on common inaccuracies: erroneous signature assignment, identification of localized hyper-mutational processes, and overcalling of signatures. We provide reproducible solutions to these issues and use orthogonal approaches to validate our results. We show how a comprehensive mutational signature analysis may provide relevant biological insights, reporting evidence of c-AID activity among unmutated CLL cases and the absence of BRCA1/BRCA2-mediated homologous recombination deficiency in an MM cohort. Finally, we propose a general analysis framework to ensure production of accurate and reproducible mutational signature data.

Data Use Conditions



Label                             Code         Version     Modifier
general research use              DUO:0000042  2021-02-23
institution specific restriction  DUO:0000028  2021-02-23
publication required              DUO:0000019  2021-02-23
user specific restriction         DUO:0000026  2021-02-23