Shallow targeted sequencing (462 mRNA targets and 97 antibodies) of bone marrow mononuclear cells from AML patients, obtained by iliac crest aspiration. Please note that raw and integrated gene expression data, cell type annotation, metadata and dimensionality reduction are available as Seurat v3 objects through figshare (https://doi.org/10.6084/m9.figshare.14780127.v1). Samples (sample tag, patient ID, sex): AMLQ4_SMK1, AML314, male; AMLQ1_SMK2, AML116, female; AMLQ3_SMK3, AML127, female; AMLQ6_SMK4, AML183, male; AMLQ2_SMK5, AML327, female; AMLQ5_SMK6, AML334, male; APLQ5_SMK7, APL124, male; APLQ3_SMK8, APL142, male; APLQ6_SMK9, APL218, female; APLQ4_SMK10, APL147, male; APLQ2_SMK11, APL223, female; APLQ1_SMK12, APL224, female.
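For downstream use, the sample listing in this description can be turned into structured records. The following is a minimal Python sketch, not part of the dataset itself. It assumes the three columns are sample tag, patient ID and sex, and that the first three letters of the patient ID (AML or APL) indicate the disease group. Both readings are inferred from the listing rather than stated by the submitters.

```python
from collections import Counter

# Sample listing copied from the dataset description.
# Assumption: the three columns are sample tag, patient ID and sex.
SAMPLE_SHEET = """\
AMLQ4_SMK1 AML314 male
AMLQ1_SMK2 AML116 female
AMLQ3_SMK3 AML127 female
AMLQ6_SMK4 AML183 male
AMLQ2_SMK5 AML327 female
AMLQ5_SMK6 AML334 male
APLQ5_SMK7 APL124 male
APLQ3_SMK8 APL142 male
APLQ6_SMK9 APL218 female
APLQ4_SMK10 APL147 male
APLQ2_SMK11 APL223 female
APLQ1_SMK12 APL224 female
"""


def parse_samples(text):
    """Split each whitespace-separated row into a small record."""
    records = []
    for line in text.strip().splitlines():
        sample_tag, patient_id, sex = line.split()
        records.append(
            {
                "sample_tag": sample_tag,
                "patient_id": patient_id,
                # Assumption: the ID prefix (AML/APL) encodes the disease group.
                "group": patient_id[:3],
                "sex": sex,
            }
        )
    return records


samples = parse_samples(SAMPLE_SHEET)

# Tally samples per (group, sex) pair as a quick sanity check of the cohort.
sex_by_group = Counter((s["group"], s["sex"]) for s in samples)

for (group, sex), count in sorted(sex_by_group.items()):
    print(group, sex, count)
```

The resulting records can be joined to the metadata of the Seurat objects by sample tag, provided the objects use the same identifiers.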
Paired tumor and normal WGS of primary neuroblastomas. This is an update of the "Berlin Neuroblastoma Dataset" (EGAS00001004022). This data was used for the analysis of circular RNA expression and regulation in neuroblastoma.
This dataset contains 16S rRNA gene sequencing data from mucosal biopsies.
RRBS data from TRACERx non-small cell lung cancer (NSCLC) tumours and matched normal adjacent tissue. TRACERx (TRAcking Cancer Evolution through therapy (Rx)) is a prospective cohort study designed to investigate intratumor heterogeneity (ITH) in relation to clinical outcome, and to determine the clonal nature of driver events and evolutionary processes in early-stage NSCLC.
This dataset consists of 39 noncancerous donor and 62 cancer patient plasma samples that were analyzed with the PGDx elio plasma resolve assay. The cancer samples span a total of 13 tumor types and include 29 patients with CRC. The PGDx elio plasma resolve assay is a hybrid capture approach targeting 33 genes, with sequencing performed on the Illumina NextSeq using 150 bp paired-end reads. The BAM files provided have been adapter masked and contain duplicate reads.
Clinical & biomarker data from IMagyn050: treatment arm, treatment approach, outcome of surgery, ECOG PS, PD-L1 status, race, age, disease stage, progression free survival (investigator assessed), overall survival, histology, tumor mutation burden and status, genomic loss of heterozygosity, microsatellite status, BRCA1/2 mutation status, tissue of origin. Mutation status based on FoundationOne NGS is also provided for the following genes: TP53, BRCA1, CCNE1, MYC, NF1, PIK3CA, RAD21, TERC, PRKCI, KRAS, RB1, BRCA2, ARID1A, AKT2, PTEN, KDM5A, NOTCH3, FGF12, ERBB2, CDK12, EMSY, WHSC1L1, BCL2L1, CDKN2A, GNAS, ARFRP1, ZNF217, SOX2, CCND2, FGF6, FGF23, LYN, MUTYH, AURKA, FGFR1, MCL1, MLL2, MYCL1, ZNF703, BRAF, MAP2K4, CREBBP, TSC2.
Single-cell whole transcriptome sequencing data for bone marrow samples from 9 cases with clonal hematopoiesis and 4 control samples. The TARGET-seq+ protocol was used to generate plate-based 3' transcriptome data. For details on cell sorting and the TARGET-seq+ protocol see the methods section of the manuscript. One FASTQ file is provided per cell. Cells are named with their plate and well IDs and the subject ID. Empty wells (no-cell controls) are named "blank". Corresponding genotyping files use the same naming without the "_transcriptome" suffix.
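The naming convention described here lends itself to a small helper that pairs each transcriptome FASTQ with its genotyping file and flags no-cell controls. The following is a minimal Python sketch under stated assumptions. The file names in the usage lines are hypothetical, the order of plate, well and subject IDs within a name is not specified in this description, and the extension of the genotyping files is unknown, so only the base name is derived. The sketch also assumes that "blank" appears somewhere in the name of an empty well and that IDs contain no dots.

```python
from pathlib import Path

# Suffix that distinguishes transcriptome files from genotyping files,
# as stated in the dataset description.
TRANSCRIPTOME_SUFFIX = "_transcriptome"


def base_name(file_name):
    """Return the file name without directories or extensions (e.g. .fastq.gz).

    Assumption: IDs contain no dots, so everything after the first dot
    is treated as the extension.
    """
    return Path(file_name).name.partition(".")[0]


def genotyping_base_name(transcriptome_file):
    """Derive the matching genotyping base name by dropping the suffix."""
    stem = base_name(transcriptome_file)
    if not stem.endswith(TRANSCRIPTOME_SUFFIX):
        raise ValueError(f"not a transcriptome file: {transcriptome_file}")
    return stem[: -len(TRANSCRIPTOME_SUFFIX)]


def is_no_cell_control(file_name):
    """Empty wells are named "blank"; a substring match is an assumption."""
    return "blank" in base_name(file_name).lower()


# Hypothetical file names, for illustration only.
print(genotyping_base_name("PL01_A01_SUBJ1_transcriptome.fastq.gz"))
print(is_no_cell_control("PL01_H12_blank_transcriptome.fastq.gz"))
```

Filtering out the no-cell controls before pairing avoids looking up genotyping files for wells that never contained a cell.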
Cancer cells display heterogeneous and dynamic states in glioblastoma, but how these malignant states arise and whether they follow a tractable cellular trajectory across tumours is poorly understood. Here, we generate a deep single-cell and spatial multi-region atlas of 12 isocitrate dehydrogenase wild-type (IDH-wt) primary glioblastomas that integrates transcriptomic, epigenomic and genomic analysis to comprehensively characterise their tumour heterogeneity. The datasets in this study include sequencing data from Visium spatial transcriptomic (10x Genomics) profiling of these tumours. Note: 2 new samples were added to the dataset on 2026-05-19.
This dataset contains single-cell RNA sequencing and T-cell receptor (TCR) sequencing data generated from cerebrospinal fluid (CSF) cells and peripheral blood mononuclear cells (PBMCs). Samples were collected from 129 patients, including individuals with multiple sclerosis and individuals with other inflammatory neurological diseases used as controls. The dataset is provided in FASTQ file format and was generated using 10x Genomics single-cell technology. These data enable the identification and characterization of T cells carrying TCRs shared across individuals with multiple sclerosis and enriched in the CSF compartment.
Genetic analysis of patients with Inherited Retinal Dystrophies (IRDs) was carried out by Whole Genome Sequencing (WGS). The main purpose of this study is to identify the simple and complex mutations responsible for IRD in patients. WGS was performed on selected affected and unaffected individuals using the Illumina HiSeq X Ten. The reads were aligned to the human reference genome hg19, and variant calling was performed using the Genome Analysis Toolkit (GATK). The genotyping quality of single nucleotide variants (SNVs) and indels was assessed using the variant quality score recalibration approach implemented in GATK. Copy number variations (CNVs) were called using Genome STRiP and SpeedSeq. This large set of whole genome sequencing data from individuals of different ethnicities can be stored and shared through dbGaP. The data could serve as a resource for checking variant frequencies or the pathogenicity of selected variants in different ethnicities.