Click on a Dataset ID in the table below to learn more and to find out who to contact about access to these data.
| Dataset ID | Description | Technology | Samples |
| --- | --- | --- | --- |
| EGAD50000000298 | The dataset comprises 58 DNA samples from 16 male and 12 female pediatric patients affected by embryonal central nervous system tumors. The samples were subjected to either whole genome sequencing (WGS; 48 samples, representing 12 male and 11 female individuals) or whole exome sequencing (WES; 10 samples, representing 4 male and 1 female individuals). One tumor tissue sample and one peripheral blood sample were analyzed from each of 26 patients, whereas two tumor tissue samples and one peripheral blood sample were analyzed from each of two patients. The WGS samples were sequenced 2x150 bp paired-end on an Illumina HiSeq X v2.5 instrument, and the WES samples were sequenced 2x100 bp paired-end on an Illumina HiSeq 2500 instrument. The resulting FASTQ files were aligned to the human reference genome GRCh38/hg38 using bwa-mem with the ALT-aware option turned on. Reads were sorted and PCR duplicates were marked with GATK. Base quality score recalibration and joint realignment of reads around insertions and deletions (indels) were also performed with GATK tools. The dataset consists of 58 files in CRAM format (lossless compression) with a total size of ~8.8 TB. All CRAM files but one are derived from one sequencing run and one sample. The exception, P4551_227N_P4552_112N, is a CRAM file in which two sequencing runs (P4551_227N and P4552_112N) of peripheral blood samples from the same individual, P019, were aligned into a single file. Additional genomic and molecular data (FASTQ, BAM, IDAT, and VCF files) and limited clinical data can be requested by ethically approved projects conducting research in the field of pediatric cancer. | HiSeq X Ten, Illumina HiSeq 2500 | 58 |
| EGAD50000000299 | The dataset comprises 18 DNA samples from 6 male and 3 female pediatric patients affected by central or peripheral nervous system tumors not classified as embryonal central nervous system tumors, gliomas, glioneuronal tumors, or neuronal tumors. One tumor tissue sample and one peripheral blood sample from each patient were subjected to whole genome sequencing (WGS), sequenced 2x150 bp paired-end on an Illumina HiSeq X v2.5 instrument. The resulting FASTQ files were aligned to the human reference genome GRCh38/hg38 using bwa-mem with the ALT-aware option turned on. Reads were sorted and PCR duplicates were marked with GATK. Base quality score recalibration and joint realignment of reads around insertions and deletions (indels) were also performed with GATK tools. The dataset consists of 18 files in CRAM format (lossless compression) with a total size of ~3.4 TB. Additional genomic and molecular data (FASTQ, BAM, IDAT, and VCF files) and limited clinical data can be requested by ethically approved projects conducting research in the field of pediatric cancer. | HiSeq X Ten | 18 |
| EGAD50000000300 | The dataset comprises 85 DNA samples from 22 male and 20 female pediatric patients affected by gliomas, glioneuronal tumors, and neuronal tumors. The samples were subjected to either whole genome sequencing (WGS; 71 samples, representing 18 male and 17 female individuals) or whole exome sequencing (WES; 14 samples, representing 4 male and 3 female individuals). One tumor tissue sample and one peripheral blood sample were analyzed from each of 41 patients, whereas two tumor tissue samples and one peripheral blood sample were analyzed from one patient. The WGS samples were sequenced 2x150 bp paired-end on an Illumina HiSeq X v2.5 instrument, and the WES samples were sequenced 2x100 bp paired-end on an Illumina HiSeq 2500 instrument. The resulting FASTQ files were aligned to the human reference genome GRCh38/hg38 using bwa-mem with the ALT-aware option turned on. Reads were sorted and PCR duplicates were marked with GATK. Base quality score recalibration and joint realignment of reads around insertions and deletions (indels) were also performed with GATK tools. The dataset consists of 85 files in CRAM format (lossless compression) with a total size of ~13.3 TB. Additional genomic and molecular data (FASTQ, BAM, IDAT, and VCF files) and limited clinical data can be requested by ethically approved projects conducting research in the field of pediatric cancer. | HiSeq X Ten, Illumina HiSeq 2500 | 85 |
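
The sample totals listed above can be cross-checked against the patient counts in each description: every patient contributed one tumor tissue sample and one peripheral blood sample, and a few contributed a second tumor tissue sample. The following is a minimal sketch of that arithmetic. The function name `expected_samples` and the dictionary layout are illustrative assumptions and are not part of the archive's metadata.

```python
# Hypothetical helper: recompute each dataset's sample total from the
# patient counts stated in the dataset descriptions.

def expected_samples(male, female, patients_with_extra_tumor=0):
    """Each patient has 1 tumor + 1 blood sample; some have a second tumor sample."""
    patients = male + female
    return 2 * patients + patients_with_extra_tumor

# Patient counts and reported totals copied from the descriptions.
datasets = {
    "EGAD50000000298": {"male": 16, "female": 12, "extra": 2, "reported": 58},
    "EGAD50000000299": {"male": 6, "female": 3, "extra": 0, "reported": 18},
    "EGAD50000000300": {"male": 22, "female": 20, "extra": 1, "reported": 85},
}

for dataset_id, d in datasets.items():
    total = expected_samples(d["male"], d["female"], d["extra"])
    status = "ok" if total == d["reported"] else "mismatch"
    print(f"{dataset_id}: expected {total}, reported {d['reported']} ({status})")
```

Under these assumptions, the recomputed totals (58, 18, and 85) match the reported sample counts for all three datasets.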