Exploration of coding and non-coding variants in cancer using GenomePaint.
GenomePaint (https://proteinpaint.stjude.org/genomepaint) is a dynamic visualization platform for whole-genome, whole-exome, transcriptome, and epigenomic data, featuring a novel design that captures the inter-relatedness between DNA variations and RNA expression. Regulatory non-coding variants can be inspected and discovered along with coding variants, and their functional impact further explored by examining 3D genome and/or ChIP-seq data generated from cancer cell lines. Further, GenomePaint correlates mutation and expression patterns with patient outcomes, and can display external data such as adult cancer datasets and user-provided custom tracks. We used GenomePaint to analyze multi-omics data from 3,652 pediatric cancers representing 16 histotypes, and demonstrate the visualization features through examples, including two that led to new insights into oncogenic mechanisms in pediatric cancer. The first is the discovery of a new class of pathogenic recurrent variants that cause aberrant splicing, disrupting the RING domain of CREBBP, a driver gene frequently mutated in relapsed pediatric leukemia. The second is the cis-activation of the MYC oncogene in a subset of B-lineage acute lymphoblastic leukemia (B-ALL) via duplication of the NOTCH1-MYC enhancer (N-ME), previously discovered only in T-lineage ALL. The regulatory impact of N-ME enhancer amplification was initially confirmed by allelic imbalance in published gene expression and ChIP-seq data and verified by additional Capture-C and fluorescence in situ hybridization data generated by follow-up experiments. These examples demonstrate the power of GenomePaint in enabling not only data visualization but also integrative genomic analysis that can lead to novel biological insight for follow-up experimental validation.
- Type: Other
- Archiver: European Genome-Phenome Archive (EGA)
Click on a Dataset ID in the table below to learn more and to find out whom to contact about access to these data.
Dataset ID | Description | Technology | Samples |
---|---|---|---|
EGAD00001006678 | | Illumina HiSeq 2000 | 8 |
EGAD00001006679 | | Illumina HiSeq 2000 | 1 |
EGAD00001006680 | | Illumina HiSeq 2000 | 1 |
Publications | Citations |
---|---|
Exploration of Coding and Non-coding Variants in Cancer Using GenomePaint. Cancer Cell 39: 2021 83-95.e4 | 22 |